From First Base: The Sequence of the Tip of the X Chromosome of Drosophila melanogaster, a Comparison of Two Sequencing Strategies

Table 1.

Genes Identified or Predicted in the 1A–3C Interval

Cytology Gene symbol Gene HMMER EST Matching gene(s) EDGP vs. joint sequence
EG:23E12.1 PF01019: G_glu_transpept GH10105 CG17636 0
EG:23E12.5 GH15984 CG17617 0
EG:23E12.2 PF00169: PH LD22360 CG17960 B−
PF00620: RhoGAP
PF00621: RhoGEF
EG:23E12.3 CG17707 0
EG:BACR37P7.1 PF01762: Galactosyl_T CK01556 CG3038 0
EG:BACR37P7.2 PF00856: SET LD10743 CG2995 0
PF00023: ank
1A8 EG:BACR37P7.3 cin PF00994: MoCF_biosynth GH09380 CG2945 0
EG:BACR37P7.9 PF00106: adh_short CG13377 A−
1A8 EG:BACR37P7.7 ewg AF171732 CG3114 B+
EG:BACR37P7.8 PF00071: ras CG13375 B−
EG:BACR37P7.5 bs28b06 CG12470 0
EG:125H10.1 LP06894 CG3777 0
1B1 EG:125H10.2 y CG3757 0
1B1 EG:125H10.3 ac PF00010: HLH CG3796 0
1B2 EG:198A6.1 sc PF00010: HLH CG3827 0
1B3 EG:198A6.2 l(1)sc PF00010: HLH CG3839 0
1B4 EG:EG0001.1 pcl PF00026: asp CG13374 0
1B4 EG:165H7.2 ase PF00010: HLH CG3258 0
EG:165H7.1 Cyp4g1 PF00067: p450 GH20504 CG3972 0
1B4 EG:165H7.3 l(1)Bb LD14543 CG3923 D+*
EG:171D11.6 LD04586 CG13372 D*
CG18166
CG18273
EG:171D11.2 PF00664: ABC_membrane LD18126 CG3156 A−
PF00005: ABC_tran
EG:171D11.1 PF00171: aldedh GM07535 CG17896 0
EG:171D11.5 CG17778 0
1B7 EG:171D11.3 svr PF00246: Zn_carbOpept LD28490 CG4122 D*
CG18503
EG:171D11.4; EG:65F1.3 arginase PF00491: arginase GH02581 CG18104 C+
1B8 EG:65F1.2 elav PF00076: rrm HL03451 CG4262 0
EG:65F1.1 GH24496 CG4293 0
1B8 EG:65F1.5 Appl PF02177: A4_EXTRA HL03850 CG7727 A+
1B9 EG:118B3.1 vnd PF00046: homeobox CG6172 0
EG:118B3.2 PF00307: CH GH04661 CG13366 0
1B13 EG:115C2.5 mod(r) LP01383 CG17828 0
EG:115C2.1 PF00294: pfkB LP11157 CG13369 0
EG:115C2.12 LP11709 CG18451 0
EG:115C2.6 PF00096: zf-C2H2 LD23988 CG17829 0
1B13 EG:115C2.7 RpL36 PF01158: Ribosomal_L36e LD01128 CG7622 0
1B13 EG:115C2.2 l(1)1Bi LD09823 CG6189 0
1B13 EG:115C2.9 Dredd PF00655: ICE_p10 LD14339 CG7486 B−
PF00656: ICE_p20
1B13 EG:115C2.3 su(s) LD06838 CG6222 0
EG:115C2.8 GH16756 CG13367 A−
EG:115C2.11 GH22310 CG16982 B+
EG:115C2.10 PF00856: SET LD03312 CG13363 C−
1B13 EG:115C2.4 skpA PF01466: Skp1 LD03188 CG16983 0
1B14-C1 EG:BACR19J1.1 sdk PF00041: fn3 GM02010 CG5227 B+
PF00047: ig
EG:BACR19J1.2 PF00153: mito_carr LD09021 CG5254 0
EG:BACR19J1.3 GH28702 CG5273 0
EG:BACR19J1.4 RpL22 PF01776: Ribosomal_L22e LP05628 CG7434 0
EG:34F3.2 PF00784: MyTH4 LD11354 CG12467 D
PF00169: PH
EG:34F3.1 LD26268 CG12467 D
EG:34F3.8 PF00957: synaptobrevin LD05791 CG7359 0
EG:34F3.10 CG13358 A+
EG:34F3.9 CG13359 B−
1B14-C1 EG:34F3.3 Rbf PF01858: RB_A LP07395 CG7413 0
PF01857: RB_B
EG:34F3.4 LD26306 CG16989 0
EG:34F3.5 LP04844 CG13360 0
EG:34F3.7 PF02366: PMT LP01681 CG12311 A− C−
EG:34F3.6 fz3 PF01534: Frizzled CG16785 A−
PF01392: Fz
EG:BACR7A4.2 bs33b10 CG3713 0
EG:BACR7A4.3 PF00089: trypsin CG11664 0
EG:BACR7A4.19 PF00651: BTB LP01394 CG3711 0
PF01344: Kelch
EG:BACR7A4.6 1.82 CG3034 0
EG:BACR7A4.18 PF00956: NAP_family GH17085 CG3708 A+
EG:BACR7A4.20 LP11534 CG3706 0
EG:BACR7A4.5 LP07093 CG11642 0
EG:BACR7A4.17 LD33276 CG3704 0
EG:BACR7A4.16 LD03548 CG3026 0
EG:BACR7A4.15 GH12139 CG3703 A−
EG:BACR7A4.14 PF00106: adh_short LP06734 CG3699 D−*
PF00678: adh_short_C2
EG:BACR7A4.13 PF00083: sugar_tr GH13765 CG3690 0
EG:BACR7A4.7 PF02268: TFIIA_gamma GM03032 CG11639 0
EG:BACR7A4.12 PF00036: efhand bs03d05 CG11638 A++
1E1-4 EG:BACR7A4.8 anon-1Ed LD29918 CG3021 C−
EG:BACR7A4.11 CDC45L LD08729 CG3658 0
1E1-4 EG:BACR7A4.9 anon-1Eb GH11273 CG14630 A−
1E1-4 EG:BACR7A4.10 su(w[a]) PF01805: Surp SD01276 CG3019 B−
EG:103E12.2 GH24974 CG14629 0
EG:103E12.3 LD08339 CG3655 0
EG:BACR42I17.12 PF00076: rrm CG14628 0
EG:BACR42I17.1 PF01652: IF4E bs10b09 CG11392 B+
EG:BACR42I17.2 LP03214 CG11378 0
EG:BACR42I17.3 CG11384 0
EG:BACR42I17.4 bs31h12 CG11379 0
EG:BACR42I17.5 CG14627 A++
EG:BACR42I17.6 CG14626 A−
EG:BACR42I17.7 CG11380 A++
EG:BACR42I17.8 CG14625 A+
EG:BACR42I17.9 CG11381 D
CG14624
EG:BACR42I17.10 LP08751 CG11382 0
EG:BACR42I17.11 PF00096: zf-C2H2 CG11398 0
EG:33C11.3 LP06890 CG3638 A+ C−−
EG:33C11.2 GM08856 CG11403 0
EG:33C11.1 A3-3 PF00170: bZIP GH24653 CG11405 0
EG:114D9.1 PF00036: efhand CG11408 0
EG:114D9.2 PF02181: FH2 LD26058 CG14622 A−−
EG:190E7.1 CG18091 0
EG:8D8.1 GM13066 CG11411 0
EG:8D8.2 LD34263 CG11409 0
EG:8D8.6 PF00583: Acetyltransf LD06467 CG11412 B+
EG:8D8.8 GM12784 CG11418 0
EG:8D8.7 PF00335: transmembrane4 LP04678 CG11415 0
EG:8D8.3 PF00324: aa_permeases LD15480 CG12773 0
EG:8D8.4 LD08351 CG11417 0
EG:8D8.5 png PF00069: pkinase CG11420 0
EG:132E8.1 PF00076: rrm LD09340 CG3056 0
1F EG:132E8.2 SNF1A PF00069: pkinase GH05909 CG3051 0
EG:132E8.3 PF00085: thiored LD03613 CG3719 0
EG:132E8.4 CG11448 0
EG:49E4.1 futsch GH21135 CG3064 D*
EG:BACN32G11.1 CG18531 0
EG:BACN32G11.2 GH10964 CG14785 0
EG:BACN32G11.3 PF01535: PPR LD01992 CG14786 0
EG:BACN32G11.4 PF00378: ECH LP07530 CG14787 A−
EG:BACN32G11.5 PF01926: MMR_HSR1 HL05876 CG14788 0
EG:BACN32G11.6 GH07929 CG14789 A−
EG:80H7.10 GH22272 CG14777 0
EG:80H7.1 PF00089: trypsin
EG:80H7.2 LD18706 CG14779 0
EG:80H7.3 PF00089: trypsin CG14780 0
EG:80H7.4 PF00071: ras GM10914 CG14791 B−
EG:80H7.11 LD02045 CG14781 B+
EG:80H7.5 PF01363: FYVE GM03532 CG14782 0
PF00169: PH
2B1-2 EG:80H7.6 sta PF00318: Ribosomal_S2 LD27557 CG14792 A− B C+
EG:80H7.7 PF00060: lig_chan CG14793 D*
EG:196F3.1
EG:196F3.3 CG14795 A+
EG:196F3.2 PF02214: K_tetra LD05656 CG14783 C+
EG:56G7.1 PF01607: Chitin_bind_2 CG14796 0
2B5 EG:123F11.1; EG:17A9.1; EG:25D2.1 br PF00651: BTB PF00096: zf-C2H2 LP05017 CG11491 0
2B6 EG:171E4.1 dor LD12589 CG3093 0
EG:171E4.4 CK00326 CG3740 D*
EG:171E4.2 PF00560: LRR CG3095 A+ C+
EG:171E4.3 CG3737 0
EG:73D1.1 LD24507 CG3791 0
2B6-7 EG:9D2.1 b6 HL05401 CG3100 0
EG:9D2.2 GH23439 CG3783 D*
2B6-8 EG:9D2.3 a6 LD13641 CG3771 C−
EG:9D2.4 PF00089: trypsin CG3795 0
EG:4F1.1 GH21860 CG14808 0
EG:BACN35H14.1 Adar PF02137: A_deamin LD31451 CG12598 A+
PF00035: dsrm
EG:137E7.1 LD19625 CG17968 0
EG:131F2.2 PF00929: Exonuclease CG14801 A−
EG:131F2.3 LP07325 CG14812 0
EG:63B12.10 δCOP LD30910 CG14813 0
EG:63B12.6 GM12676 CG14814 A−
EG:63B12.13 GH20211 CG14802 0
EG:63B12.5 PF00515: TPR GH08708 CG14815 0
EG:63B12.9 LD13889 CG14803 B+
EG:63B12.4 PF00300: PGAM LD30851 CG14816 0
EG:63B12.8 LD10891 CG14804 0
EG:63B12.11 GH01621 CG14817 0
EG:63B12.7 PF00400: WD40 LD02447 CG14805 B+
EG:63B12.12 LP05103 CG14818 0
2B15 EG:63B12.3 trr PF00856: SET GM10003 CG3848 B++
2B15 EG:63B12.2 anon-2Bd PF00252: Ribosomal_L16 GH05976 CG3109 B+
2B15 EG:86E4.6 arm PF00514: Armadillo_seg LD10209 CG11579 A+
EG:86E4.2 PF01532: Glyco_hydro_47 LD21416 CG3810 C+
EG:86E4.3 PF00400: WD40 CG17766 A−
EG:86E4.4 LD27573 CG3480 0
2B15 EG:86E4.1 eIF-2bε PF02020: W2 LD26247 CG3806 0
PF00132: hexapep
EG:86E4.5 PF00783: IPPc GH18456 CG3573 0
EG:39E1.1 LD22420 CG11596 0
EG:39E1.3 LP09039 CG3857 0
EG:39E1.2 LD09945 CG3587 0
EG:BACH61I5.1 CG3600 0
EG:133E12.2 PF00104: hormone_rec CG16902 D*
PF00105: zf-C4
EG:133E12.3 PF01650: Peptidase_C13 CG4406 A+
EG:133E12.4 east LD33602 CG4399 0
2C3 EG:133E12.1 Actn PF00307: CH HL01581 CG4376 0
PF00036: efhand
PF00435: spectrin
2C3 EG:22E5.1 usp PF00104: hormone_rec LD09973 CG4380 0
PF00105: zf-C4
EG:22E5.12 PF00097: zf-C3HC4 CG4325 0
EG:22E5.11 PF00001: 7tm_1 CG4322 C+
EG:22E5.10 PF00001: 7tm_1 GM02327 CG4313 0
EG:22E5.8 PF00069: pkinase GH06888 CG4290 0
EG:22E5.7 LD08665 CG4281 D*
EG:22E5.5 PF00355: Rieske GH11732 CG4199 A+
PF00070: pyr_redox
EG:22E5.6 LD31238 CG4194 0
EG:22E5.3 PF01137: RCT GH07716 CG4061 0
EG:22E5.4 PF02390: Methyltransf_4 GM01339 CG4045 C+
EG:22E5.9 LP10820 CG4025 0
EG:67A9.2 LD01561 CG16903 C−−
EG:67A9.1 CK00561 CG3981 A−
2D3 EG:BACN25G24.2 csw PF00017: SH2 HL03192 CG3954 0
PF00102: Y_phosphatase
2D3 EG:BACN25G24.3 ph-d PF00536: SAM GH08934 CG3895 A−− B+ C+
2D3 EG:87B1.5 ph-p PF00536: SAM GH19743 D*
EG:87B1.3 PF01565: FAD_binding_4 GH17284 CG3835 0
2D6 EG:87B1.4 Pgd PF00393: 6PGD GH13486 CG3724 0
2D6 EG:87B1.6 bcn92 CG3717 0
2D6 EG:87B1.2 wapl LD29979 CG3707 A+
2D6 EG:87B1.1 Cyp4d1 PF00067: p450 GH01333 CG3656 0
EG:152A3.3 HL02445 CG3630 0
EG:152A3.7 anon-2Db CG3621 0
EG:152A3.2 Cyp4d14 PF00067: p450 HL05508 CG3540 0
2E1 EG:152A3.4 Cyp4d2 PF00067: p450 GH09810 CG3466 A−
2E1 EG:152A3.6 Cyp4ae1 PF00067: p450 GH24265 CG10755 0
2E1 EG:152A3.5 pn GM10090 CG3461 0
2E3 EG:152A3.1 Nmd3 LD13746 CG3460 0
EG:17E2.1 LD17911 CG3457 B−
2E3 EG:103B4.3 Mct1 PF01587: MCT LP01643 CG3456 A−
EG:103B4.2 LP02712 CG18031 D
2E3 EG:103B4.4 msta GH20239 CG18033 0
2E3 EG:103B4.1 Vinc PF01044: Vinculin LD16157 CG3299 0
2E3 EG:30B8.4 pcx LD27929 CG3443 B−−
2F1 EG:30B8.2 kz GH21962 CG3228 0
2F1 EG:30B8.5 fs(1)K10 LD08992 CG3218 0
2F1 EG:30B8.7 Or2a CG3206 C
2F1 EG:30B8.1 crn PF02184: HAT LP05055 CG3193 0
EG:30B8.3 PF00650: CRAL_TRIO GM01086 CG3191 0
EG:30B8.6 GH06335 CG3078 D
EG:25E8.3 PF00400: WD40 LD29959 CG3071 B+
EG:25E8.2 PF00179: UQ_con LD09991 CG2924 A+ C−
EG:25E8.1 PF00012: HSP70 GH11566 CG2918 0
EG:25E8.6 CG2879 D
EG:25E8.4 GH04956 CG2865 0
EG:BACH48C10.1 CG14050 0
EG:BACH48C10.2 GH19593 CG2854 C−
2F6 EG:BACH48C10.3 phl PF00130: DAG_PE-bind GH03557 CG2845 B+
PF02196: RBD
PF00069: pkinase
EG:BACH48C10.6 CG14048 0
2F6 EG:BACH48C10.5 ptr GH02860 CG2841 A+
EG:BACH48C10.4 GH27724 CG14047 D
EG:BACH7M4.1 SD05785 CG14045 A−−
EG:BACH7M4.2 PF00168: C2 CK01827 CG14045 A− C−
PF00505: PDZ
EG:BACH7M4.4 CG12496 C−
3A2 EG:BACH7M4.5 gt CG7952 0
3A3 EG:BACH59J11.1 tko PF00164: Ribosomal_S12 GM03810 CG7925 0
EG:BACH59J11.2 PF00041: fn3 SD01373 CG13756 B+
3A3 EG:BACH59J11.3 z CG7803 0
EG:BACR25B3.11 pcan PF00008: EGF GM03359 CG7981 D*
PF00047: ig
PF00054: laminin_G
PF00057: ldl_recept_a
EG:BACR25B3.10 PF00047: ig GM02481 CG7981 D*
EG:BACR25B3.1 PF00047: ig GM06086 CG7981 A++ C−
PF00052: laminin_B
PF00053: laminin_EGF
PF00057: ldl_recept_a
EG:BACR25B3.2 PF00057: ldl_recept_a CG12497 A+ B+
EG:BACR25B3.3 PF00002: 7tm_2 CG13758 D
EG:BACR25B3.4 PF01813: ATP-synt_D GH28048 CG8310 D
EG:BACR25B3.5 GH02552 CG13759 B+
EG:BACR25B3.6 LD41675 CG13760 A−−
EG:BACR25B3.7 wds PF00400: WD40 LD30385 CG17437 0
3A8 EG:BACR25B3.8 egh CG9659 0
3A8 EG:BACR25B3.9 Klp3A PF00225: kinesin 14 LD21815 CG8590 0
3A9 EG:BACR7C10.3 mit(1)15 LD31038 CG9900 0
EG:BACR7C10.4 Bzd PF01753: zf-MYND CG13761 C+
EG:BACR7C10.6 PF00335: transmembrane4 GH15125 CG10742 0
EG:BACR7C10.1 LD08769 CG9904 0
EG:BACR7C10.7 CG13762 B−
EG:BACR7C10.2 PF00613: PI3Ka GH26308 CG10260 D
PF00454: PI3_PI4_kinase
3B1 EG:155E2.3 sgg PF00069: pkinase GM02018 CG2621 A+
3B2 EG:155E2.2 HLH3B PF00010: HLH CG2655 0
EG:155E2.5 GH07966 CG2652 0
3B2 EG:155E2.4 per PF00989: PAS GH01975 CG2647 A− B+
3B2 EG:155E2.1 anon-3B1.2 CG2650 B−
EG:100G10.7 anon-3Ba PF00004: AAA GH01006 CG2658 0
PF01434: Peptidase_M41
EG:100G10.6 PF00628: PHD HL01595 CG2662 0
EG:100G10.5 anon-3Bb LD37122 CG2675 A+
EG:100G10.3 PF01008: IF-2B CG2677 0
EG:100G10.4 GH11163 CG2680 B+
EG:100G10.2 GH02982 CG2681 B−
EG:100G10.1 LD25954 CG2685 0
EG:100G10.8; EG:95B7.10 LD34251 CG2695 0
3B4 EG:95B7.9 anon-3Bd GH08386 CG2701 0
3C1 EG:95B7.8 fs(1)Yb CG2706 0
3C1 EG:95B7.4 fs(1)Ya LD47547 CG2707 A−
EG:95B7.5 CG2709 0
3C1 EG:95B7.6 dwg PF00096: zf-C2H2 LD08032 CG2711 0
EG:95B7.3 LD05179 CG2713 0
EG:95B7.7 anon-3Be PF00096: zf-C2H2 LD39664 CG2712 0
3C2 EG:95B7.2 crm PF00249: myb_DNA-binding LD09365 CG2714 0
EG:95B7.1 PF00804: Syntaxin CG2715 0
EG:BACN33B1.2 HL08104 CG2766 D*
CG2716
3C2 EG:BACN33B1.1 w PF00005: ABC_tran GH06126 CG2759 0
EG:BACR43E12.1 CG12498 0
EG:BACR43E12.7 GM07661 CG14416 0
EG:BACR43E12.6 CG14417 0
EG:BACR43E12.5 CG14417 0
EG:BACR43E12.4 PF00569: ZZ GH01442 CG3526 A++
EG:100G7.6 CG3588 A−− C+
EG:100G7.5 CG14424 0
3C5 EG:100G7.1 anon-3Ca CG18089 0
3C5 EG:100G7.2 anon-3Cb CG3591 0
EG:100G7.3 CG3598 0
  • All known or predicted genes have a symbol in the form EG:#, where # is the name of the clone on which the gene was first discovered, followed by a dot and an integer. Previously known genes are also shown with their FlyBase symbols and, if determined, cytological locations. The EST column indicates a matching EST sequence from either the BDGP collection or B. Oliver's testes-derived EST collection (as submitted to GenBank; see Andrews et al. 2000). Only one cDNA clone name is listed for each gene. The column headed “Matching gene(s)” indicates the matching gene from the Joint Sequence. The column headed “EDGP vs. joint sequence” indicates the result of comparing the EDGP and Joint Sequence at the predicted protein level. In this column, 0 indicates identity or <1% difference in sequence; A, that the sequences differ in their predicted start sites; B, that they differ in their predicted termination sites; and C, that they differ by a predicted exon or intron. A ‘D’ indicates that the gene models predicted by us and by the Joint Sequence differ very markedly; an accompanying asterisk indicates that we have evidence that the EDGP model is the more correct (see text). A plus sign indicates that the EDGP sequence is longer than the CG sequence; a minus sign indicates that it is shorter. For more details see the supplementary data. Only positive hits of known or predicted proteins to PFAM are shown (see text). A dagger before a gene symbol indicates a gene with alternatively spliced messages.
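The coding scheme described in the legend can be read mechanically. The following is a minimal sketch, not part of the original analysis, of how one cell of the “EDGP vs. joint sequence” column might be decoded. The function name `decode_comparison` and the dictionary keys are hypothetical. The legend does not define doubled signs (such as A++), so the sketch only records how many signs were present and does not interpret them.

```python
import re

# Letter codes as defined in the Table 1 legend.
MEANINGS = {
    "0": "identical or <1% difference",
    "A": "predicted start sites differ",
    "B": "predicted termination sites differ",
    "C": "differ by a predicted exon or intron",
    "D": "gene models differ very markedly",
}

# One token: a letter code, an optional run of plus OR minus signs
# (ASCII "-" or the typographic minus U+2212), and an optional asterisk.
TOKEN = re.compile(r"([0ABCD])(\++|[-\u2212]+)?(\*)?")


def decode_comparison(cell):
    """Decode one cell such as 'A- C+' or 'D*' into a list of dicts."""
    decoded = []
    for token in cell.split():
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised comparison code: {token!r}")
        letter, signs, star = match.groups()
        signs = signs or ""
        if signs.startswith("+"):
            edgp_length = "longer"   # EDGP sequence longer than the CG sequence
        elif signs:
            edgp_length = "shorter"  # EDGP sequence shorter than the CG sequence
        else:
            edgp_length = None       # no length difference stated
        decoded.append({
            "code": letter,
            "meaning": MEANINGS[letter],
            "edgp_length": edgp_length,
            # Assumption: doubled signs are not defined in the legend,
            # so only the number of signs is kept.
            "sign_count": len(signs),
            # An asterisk marks evidence that the EDGP model is the more correct.
            "edgp_model_preferred": star is not None,
        })
    return decoded
```

For example, `decode_comparison("A- B C+")` (the entry for sta, EG:80H7.6) yields three records, one per code, with the A difference marked as shorter in EDGP and the C difference as longer.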

This Article

  1. Genome Res. 11: 710-730
