[Feedback] Site Feedback: 1-base shift in release-59/gff3/zea_mays/gene_id_mapping_v3_to_v4/V3gene.V4_coordinates.gff3
feedback at gramene.org
Thu Jan 10 11:19:27 EST 2019
URL : http://ensembl.gramene.org/Zea_mays/Info/Index
Subject : 1-base shift in release-59/gff3/zea_mays/gene_id_mapping_v3_to_v4/V3gene.V4_coordinates.gff3
Name : Wen-Dar Lin
Email : wdlin at gate.sinica.edu.tw
Organization: IPMB, Academia Sinica
Comments : I wanted to work with the v3 annotation on the v4 genome, which is how I found the V3gene.V4_coordinates.gff3 file. One interesting thing I noticed is that the distribution of donor/acceptor dinucleotide combinations in this GFF3 file does not look like that of the v3 annotation on the v3 genome. I think GTAG should be the most common donor/acceptor combination.
V3gene.V4_coordinates:
GGCA 56879
TAGG 32173
GTAG 24957
GGTA 15543
TAGA 13441
TGGG 10362
TAGC 7365
TAGT 6850
TGCA 6467
AGCA 5934
TTGG 5765
TGGA 4369
TCGG 3398
GGAA 3394
CGCA 2940
TTGA 2506
TGGC 2455
TGGT 2339
TGTA 1894
AGTA 1732
v3 annotation on v3 genome:
GTAG 259227
GCAG 4052
ATAC 189
CTAC 165
ATAG 150
GTAT 125
GTTG 121
GAAG 117
GTAA 115
GGAG 99
TTAG 95
GTGG 95
GTAC 93
CTAG 90
GTCG 75
GCCG 69
CCCG 61
ATAT 43
CTGC 39
CTGG 35
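For readers who want to reproduce counts like the two lists above, here is a minimal Python sketch of one way to tally donor/acceptor combinations. It is not the script used for this report: the function names and the toy sequence are hypothetical, and it assumes exon coordinates are 1-based and inclusive, as in GFF3.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def splice_site_pairs(chrom_seq, exons, strand="+"):
    """Count donor+acceptor 4-mers for the introns implied by `exons`.

    exons: list of (start, end) pairs, 1-based inclusive (assumption).
    """
    counts = Counter()
    exons = sorted(exons)
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        # The intron covers left_end+1 .. right_start-1 (1-based, inclusive).
        intron_start = left_end + 1
        intron_end = right_start - 1
        if intron_end - intron_start + 1 < 4:
            continue  # too short to hold two distinct dinucleotides
        first2 = chrom_seq[intron_start - 1:intron_start + 1]
        last2 = chrom_seq[intron_end - 2:intron_end]
        if strand == "+":
            pair = first2 + last2
        else:
            # On the minus strand the donor sits at the higher coordinate.
            pair = reverse_complement(last2) + reverse_complement(first2)
        counts[pair.upper()] += 1
    return counts


# Toy sequence (hypothetical): exon 1-5, intron 6-13 (GT...AG), exon 14-18.
toy = "AAAACGTCCCCAGGAAAA"
print(splice_site_pairs(toy, [(1, 5), (14, 18)]))   # Counter({'GTAG': 1})
# The same exons shifted down by one base no longer give GTAG.
print(splice_site_pairs(toy, [(0, 4), (13, 17)]))   # Counter({'CGCA': 1})
```

In this toy case a uniform shift of one base turns the canonical GTAG into a combination whose second and fourth letters are G and A, the same pattern as the most frequent entry (GGCA) in the first list.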
I built my own version of the v3-to-v4 GFF3 and I think I found the reason. As an example, here are the coordinates of one transcript from V3gene.V4_coordinates.gff3:
>GRMZM2G059865_T01
50882 51216
51370 51435
51885 52003
52136 52293
52390 52545
52667 52825
52946 53148
53622 53931
55221 55680
and mine:
GRMZM2G059865_T01
50883 51217
51371 51436
51886 52004
52137 52294
52391 52546
52668 52826
52947 53149
53623 53932
55222 55681
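The two coordinate lists can be compared mechanically. The short Python sketch below (the helper name `exon_offsets` is hypothetical) collects the distinct (start, end) offsets between the exons of GRMZM2G059865_T01 in the released file and in the rebuilt one, using only the numbers quoted in this message.

```python
def exon_offsets(reference, other):
    """Return the set of (start_delta, end_delta) pairs between two exon lists."""
    if len(reference) != len(other):
        raise ValueError("exon lists differ in length")
    return {
        (o_start - r_start, o_end - r_end)
        for (r_start, r_end), (o_start, o_end) in zip(reference, other)
    }


# Exons of GRMZM2G059865_T01 as given in V3gene.V4_coordinates.gff3.
released = [
    (50882, 51216), (51370, 51435), (51885, 52003),
    (52136, 52293), (52390, 52545), (52667, 52825),
    (52946, 53148), (53622, 53931), (55221, 55680),
]
# Exons of the same transcript in the rebuilt mapping.
rebuilt = [
    (50883, 51217), (51371, 51436), (51886, 52004),
    (52137, 52294), (52391, 52546), (52668, 52826),
    (52947, 53149), (53623, 53932), (55222, 55681),
]

print(exon_offsets(released, rebuilt))  # {(1, 1)}
```

A single offset pair of (1, 1) means every exon start and end differs by exactly one base, which is what a uniform coordinate shift would produce.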
I understand that this might simply be because tools like MUMmer report 0-based coordinates, but people are used to treating the coordinates in a GFF3 file as 1-based.
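If the shift really is uniform, one possible workaround is to add one to the start and end columns (columns 4 and 5 of GFF3, which are 1-based and inclusive). The Python sketch below is only an illustration under that assumption: the helper name `shift_gff3_line` is hypothetical, the seqid, source and strand in the example line are placeholders rather than values from the real file, and a blanket shift may not be appropriate for every feature.

```python
def shift_gff3_line(line, offset=1):
    """Shift the start/end columns of one GFF3 feature line by `offset`.

    Comment lines, blank lines and lines without nine tab-separated
    columns are returned unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return line
    fields[3] = str(int(fields[3]) + offset)  # column 4: start
    fields[4] = str(int(fields[4]) + offset)  # column 5: end
    return "\t".join(fields) + newline


# Placeholder feature line; only the coordinates and transcript ID
# come from the example in this message.
example = "chrN\tsource\texon\t50882\t51216\t.\t+\t.\tParent=GRMZM2G059865_T01\n"
print(shift_gff3_line(example), end="")
```

Re-running a donor/acceptor tally on the shifted file would be a quick way to check whether GTAG becomes the dominant combination again.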
http://www.warelab.org/bugs/view.php?id=5784