The Journal of Heredity 2001:92(1)
© 2001 The American Genetic Association 92:1-8
Expressed Sequence Tags for the Chicken Genome from a Normalized 10-Day-Old White Leghorn Whole Embryo cDNA Library: 1. DNA Sequence Characterization and Linkage Analysis
From the Comparative Genomics Laboratory, College of Agricultural, Environmental, and Natural Sciences, Tuskegee University, Tuskegee, Alabama.
| Abstract |
|---|
Expressed sequence tags (ESTs) provide a rapid and reliable method for gene discovery as well as a resource for the large-scale analysis of gene expression of known and unknown genes. Here we describe a normalized cDNA library developed from a 10-day-old White Leghorn chicken whole embryo. The utility of the library was evaluated by partial sequencing of 99 randomly selected insert-containing clones and the analysis of EST-targeted genomic regions for single nucleotide polymorphisms (SNPs) in the East Lansing chicken reference DNA mapping panel. Using stringent match criteria of percent identity of 80 or higher across a length of 50 or more bases, 46 ESTs matched database sequences including previously reported Gallus gallus genes. Thirty-seven of the 50 primer pairs developed from 50 unique ESTs amplified a single fragment. The size of the 37 amplicons ranged from 276 to 693 bp for a total of 17,508 and an average of 473. About 70% of the SNPs detected were either G↔A or C↔T transitions. The number of SNPs detected within the amplicons from EST-targeted genomic regions ranged from 0 to 4 for a total of 65 and a frequency of about 1 every 470 bases. About 35% of the amplicons contained only 1 SNP, while 19% had 4 SNPs. Using the SNPs that were informative in the East Lansing reference panel, 17 ESTs were mapped on the East Lansing chicken genetic map. The ESTs described, as well as the nucleotide variants identified within the EST-targeted genomic regions, represent significant resources for genome analysis in the chicken.
| Introduction |
|---|
Progress in livestock genomics, especially in the creation of low-density genetic maps, has been aided significantly by the development of in vitro amplification procedures such as the polymerase chain reaction (PCR) and the discovery of highly informative genomewide polymorphic markers such as microsatellites (Georges and Andersson 1996). The genetic maps have provided resources for the identification of markers in linkage disequilibrium with genes that influence economic traits (Andersson et al. 1994; Spelman et al. 1996; Vallejo et al. 1998). Despite these advances, the identification and isolation of quantitative trait loci (QTL) and of genes with low penetrance and modest effects on economic traits has remained limited. To overcome this limitation, the development of high-density genetic maps containing 10- to 20-fold more markers than current levels has been recommended (Kruglyak 1997, 1999).
To generate genetic maps with sufficient density to map QTLs, recent efforts in human genomics have focused on developing single nucleotide polymorphisms (SNPs), the most common type of variation in eukaryotic genomes (Cooper et al. 1985; Li and Sadler 1991). Estimates suggest that SNPs occur at a rate of about 1 in 500-1000 bp when any two chromosomes are compared in the human genome (Harding et al. 1997; Nickerson et al. 1998; Wang et al. 1998). Though the information is limited, estimates using a randomly mating chicken population suggest a higher frequency of 1 in 200-500 bp (Smith et al. 2000). The identification of SNPs therefore provides a resource for building a high-density genetic map using millions of potentially informative genetic markers.
The value of SNPs in genomics has resulted in several initiatives to develop these resources. Methods for discovering SNPs have involved direct sequencing (Rieder et al. 1999) as well as the search of public databases for redundant sequences (Picoult-Newberg et al. 1999). Once SNPs are discovered, automated methods are available for their direct and rapid screening and genotyping. However, the utility of SNPs, despite their ubiquity and abundance in the genome, depends on whether they are in expressed or nonexpressed DNA sequences (Kwok et al. 1996). Gene-based SNPs, also referred to as coding SNPs (cSNPs), are considered valuable for association studies between complex traits and quantitative trait loci (Halushka et al. 1999). The discovery of cSNPs, however, requires the identification of genes as a first step, and most livestock genome maps, including that of the chicken, contain very few genes. Single-pass sequencing of cDNA clones has been suggested as a rapid method of identifying genes, the mapping of which will enhance genetic maps (Wang et al. 1991). These sequences, generally known as expressed sequence tags (ESTs), have been developed for several organisms including humans (Adams et al. 1991), chickens (Li et al. 1998; Ruyter-Spira et al. 1996, 1998), and turkeys (Smith et al. 2000).
The direct tagging of expressed genes also makes EST discovery useful for monitoring and understanding gene expression patterns at specific stages of development and in specific tissues. In addition, ESTs can facilitate gene discovery, especially using the EST database (dbEST). More recently, the identification of nucleotide variants in ESTs has provided a resource for the use of microarrays in genetic screening and monitoring of gene expression (Hacia et al. 1998).
The chicken genetic map currently consists of about 235 gene-based markers, or about 12% of the total number of DNA markers (Groenen et al. 2000). Expressed sequence tags provide a unique opportunity for increasing the number of gene-based or type I markers for the chicken genome map as well as for comparative mapping with densely mapped species like humans and mice (Adams et al. 1991; Bogulski et al. 1993). As a special class of sequence tagged sites used for physical mapping (Olson et al. 1989), ESTs provide an additional advantage of directly tagging expressed genes. Several ESTs have been reported for chickens and deposited in the international databases, and other laboratories have reported a limited number that were characterized and mapped (Bumstead et al. 1994; Ruyter-Spira et al. 1998; Spike et al. 1996). Here we describe ESTs developed from a White Leghorn chicken embryo cDNA library that were characterized by sequence identity with database matches and linkage analysis in the East Lansing (EL) reference DNA panel (Crittenden et al. 1993).
| Materials and Methods |
|---|
cDNA Library
The library, a collaborative effort with Dr. Bob Zahorchak (Research Genetics, Inc.), was constructed using the standard primary library → single-strand DNA library → cDNA library normalization procedure of Soares et al. (1994). Briefly, poly(A)+ RNA isolated using a standard protocol from a 10-day-old White Leghorn whole embryo was used to construct a directional cDNA library in the vector λZAP (Stratagene) between the T3 and T7 promoters (Sambrook et al. 1989). The polylinker used was streamlined in order to facilitate normalization, as suggested by Bonaldo et al. (1996). The normalization procedure, involving partial reassociation kinetics and hydroxyapatite chromatography to remove clones with many copies, was done according to Soares et al. (1994). Three hundred eighty-four clones were randomly picked into a 384-well microtiter plate, from which inserts were obtained using a modification of the Gussow and Clackson (1989) direct clone characterization procedure described previously (Smith et al. 2000).
DNA Sequencing and Analysis
The PCR-amplified fragments were purified by resin-based precipitation as follows: PCR products were added to 300 µl of a well-mixed slurry of Sephacryl S-300-HR (Sigma Chemical Co., St. Louis, MO) in a 96-well Silent Monitor microtiter plate (Fisher Scientific, Suwanee, GA) and spun at 3500 rpm for 5 min. The purified fragments within the filtrate were used as templates for BigDye terminator sequencing as previously described (Smith et al. 2000). The partial sequences were edited to remove vector sequences from the 5' ends as well as unreliable and unresolved nucleotides from the 3' ends. The sequences were compared against National Center for Biotechnology Information (NCBI) nonredundant and dbEST databases using BLAST (Altschul et al. 1990). At the nucleotide level, sequence identity to a database entry was considered significant if the aligned region on which the identity was based exceeded 50 bp and the identity was 80% or greater. The ESTs were also analyzed for redundancy using Phred/Phrap/Polyphred and Consed as previously described (Gordon et al. 1998). This latter comparison was especially necessary for ESTs with no database matches.
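The match criterion described above can be expressed as a simple filter. The following is a minimal sketch in Python, not the pipeline used in this study: the `Hit` record and its field names are hypothetical stand-ins for values read from a BLAST report, and only the two thresholds (aligned length exceeding 50 bp, identity of 80% or greater) come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else here is illustrative.
MIN_PERCENT_IDENTITY = 80.0
MIN_ALIGNED_LENGTH_BP = 50


@dataclass
class Hit:
    """Hypothetical container for one database match of one EST."""
    est_id: str
    subject: str
    percent_identity: float
    aligned_length: int


def is_significant(hit: Hit) -> bool:
    """Return True if the hit meets both the length and identity criteria."""
    return (hit.aligned_length > MIN_ALIGNED_LENGTH_BP
            and hit.percent_identity >= MIN_PERCENT_IDENTITY)


def significant_hits(hits):
    """Keep only the hits that pass the match criteria."""
    return [hit for hit in hits if is_significant(hit)]
```

Applying both thresholds together matters because a short alignment can reach a high percent identity by chance, while a long alignment with low identity says little about whether two sequences represent the same gene.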
Polymorphism and Linkage Analysis
The male and female parental samples of the East Lansing reference panel, described previously by Crittenden et al. (1993), were used to screen 50 of the nonredundant ESTs for SNPs in the EST-targeted genomic regions. Primers for PCR were developed from each EST using the web-based program Primer 3 (Rozen and Skaletsky 1997), and amplification was performed at an annealing temperature of between 53°C and 65°C as previously described (Smith 1998). The amplified products were purified and sequenced as described above. The sequences were also analyzed for SNPs using Phred/Phrap/Polyphred and Consed (Gordon et al. 1998). Multipoint linkage analyses of ESTs polymorphic in the parental samples were performed according to Cheng et al. (1995). Candidate SNPs that were initially identified as heterozygous in one or both parental samples were validated by sequencing a second PCR product as well as resequencing using the reverse primer (Nickerson et al. 1998).
| Results and Discussion |
|---|
A normalized cDNA library from a 10-day-old whole embryo was developed and partially characterized. A total of 301 colonies yielded amplified products containing single or multiple bands. Out of 144 clones that produced a single band, PCR-generated templates from 99 insert-containing clones were used to develop single-pass DNA sequence information. The edited sequences of the 99 ESTs ranged in size from 127 to 884 bp (Table 1). The ESTs have been submitted to GenBank (NCBI) and assigned accession numbers. Additional sequences and unpublished data related to the ESTs from this library are available by ftp at apsc26.apsc.vt.edu.
Fifty ESTs, or about 51% of the sequences, matched database sequences. Nineteen ESTs matched previously reported chicken genes (Table 1). Among these, the complete sequence of eight ESTs (TUCEST6, TUCEST9, TUCEST10, TUCEST19, TUCEST76, TUCEST81, TUCEST180, TUCEST190) matched database chicken DNA sequences with an identity of 96% or more. Among those ESTs that matched nonavian database sequences, the percent identity as well as the length of the matched region was much lower at 84% and 185 bp, respectively.
Among the 50 primer pairs developed and tested from 50 ESTs, only 37 amplified a single product (amplicon), each of which was sequenced and scanned for SNPs (Table 2). The sequences of each of the 37 amplicons showed complete homology with the respective reference EST sequence. The total length of the sequences developed from the 37 amplicons was 17,508 bp (Table 2). Within these, 65 SNPs were detected and validated (Figure 1). The number of SNPs detected within amplicons ranged from 0 to 4: 16% of the EST-targeted genomic regions contained no SNPs and 19% contained four. Thirty-five percent of the amplicons from EST-targeted genomic regions contained only one SNP. All the SNPs detected and validated were substitutions. Of the 65 SNPs detected and validated, 41 or 64% were either G↔A (n = 19) or C↔T (n = 22) transitions (Figure 2).
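The distinction between transitions and transversions used here can be illustrated with a short sketch. The Python below is an illustration only, not the Polyphred/Consed workflow used in the study, and the function name `classify_substitution` is hypothetical. A substitution is a transition when both alleles are purines (A, G) or both are pyrimidines (C, T); otherwise it is a transversion.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(allele_1: str, allele_2: str) -> str:
    """Classify a biallelic substitution as a transition or a transversion."""
    pair = {allele_1.upper(), allele_2.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # Both alleles in the same chemical class: transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    # One purine and one pyrimidine: transversion.
    return "transversion"


# Counts reported in the text: 19 G<->A and 22 C<->T among 65 SNPs.
transitions = 19 + 22
transversions = 65 - transitions
```

With the reported counts, the 41 transitions leave 24 of the 65 substitutions as transversions.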
Seventeen ESTs were mapped in the East Lansing reference panel as markers identified with the prefix TUS- (Table 3). About 65% of the mapped ESTs were assigned to linkage groups previously anchored to macrochromosomes (Table 3). The high logarithm of odds (LOD) scores for these as well as the other loci provide strong support for the linkage assignments. Three of the markers appear to be telomeric. Most of the markers, with three exceptions, were in regions of higher marker density. Several of these cosegregate with previously described markers (Groenen et al. 2000). Because the sequence context of the SNPs is provided, others can now use automated approaches such as genetic bit analysis and hybridization arrays to screen and genotype populations of interest.
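As background on what a LOD score measures, the sketch below computes a two-point LOD for phase-known backcross data. This is a simplified illustration, not the multipoint analysis of Cheng et al. (1995) used for the assignments in Table 3; the function names and any counts passed to them are hypothetical and are not data from this study. For n informative meioses with r recombinants, the score compares the likelihood at recombination fraction θ against free recombination (θ = 0.5).

```python
import math


def _k_log10(k: int, p: float) -> float:
    """Return k * log10(p), treating 0 * log10(0) as 0."""
    if k == 0:
        return 0.0
    if p == 0.0:
        return float("-inf")
    return k * math.log10(p)


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10[ theta^r * (1 - theta)^(n - r) / 0.5^n ]."""
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need meioses > 0 and 0 <= recombinants <= meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    linked = (_k_log10(recombinants, theta)
              + _k_log10(meioses - recombinants, 1.0 - theta))
    unlinked = meioses * math.log10(0.5)
    return linked - unlinked


def max_lod(recombinants: int, meioses: int):
    """Return the maximum-likelihood theta (capped at 0.5) and its LOD."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, two_point_lod(recombinants, meioses, theta_hat)
```

For example, `max_lod(2, 20)` evaluates the score at θ = 0.1 for a hypothetical marker pair with 2 recombinants in 20 meioses. A LOD of 3 or more, meaning odds of at least 1000:1 in favor of linkage, is a commonly used threshold for declaring linkage.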
We have developed a cDNA library that represents a significant resource for the development of a chicken genome map containing more gene-based markers and one with a density high enough to improve the chances of QTL identification by linkage disequilibrium. Currently there are two publicly available chicken EST databases (http://udgenome.ags.udel.edu/chickest/chick.htm; http://genetics.hpi.uni-hamburg.de). Because only 35% of the ESTs described here matched sequences in these two databases, the library appears to be a useful resource for gene discovery in chickens. Though by themselves ESTs represent a powerful tool for genetic analysis (Faranda et al. 1996), their utility in genomics can be greatly enhanced by mutation and linkage analyses (Kwok et al. 1996). Here we have reported ESTs for the chicken and conducted additional analysis to increase their utility for genetic studies by scanning them for SNPs and mapping some in the East Lansing reference panel. The mapping of ESTs technically represents transcript mapping, which will facilitate the cloning of genes for economic traits by investigating candidate ESTs for evidence of mutation in phenotypically divergent individuals (Jones et al. 1998). Knowledge of the chromosomal location of the ESTs as well as the SNPs makes the data provided here useful for genetic linkage studies, the integration of the chicken genetic and physical maps, and comparative genome analysis.
| Acknowledgments |
|---|
Contribution number 312 of the George Washington Carver Agricultural Experiment Station, Tuskegee University. We are grateful to Tom Savage, Oregon State University, M. Loretan, and M. Egnin, Tuskegee University, for editorial suggestions, and to Hans Cheng, Avian Disease and Oncology Laboratory, USDA/ARS, for the DNA samples from the East Lansing reference panel and the multipoint linkage analysis. This research was sponsored, in part, by the National Human Genome Research Institute as a supplement to the Tuskegee University RCMI program, and the USDA SCD grant 58-3148-9-028.
| Footnotes |
|---|
Address correspondence to E. J. Smith, 3130 Litton Reaves Hall, Virginia Tech, Blacksburg, VA 24061, or e-mail: esmith{at}vt.edu.
Corresponding Editor: Susan J. Lamont
Received March 6, 2000
Accepted August 30, 2000
| References |
|---|
Adams MD, Kelley JM, Gocayne GD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, and Moreno RF, 1991. Complementary DNA sequencing: expressed sequence tags and Human Genome Project. Science 252:1651-1656.
Altschul SF, Gish W, Miller W, Myers E, and Lipman DJ, 1990. Basic local alignment search tool. J Mol Biol 215:403-410.
Andersson L, Haley CS, Ellegren H, Knott SA, Johansson M, Andersson K, Andersson-Eklund L, Edfors-Lilja I, Fredholm M, Hansson I, Håkansson J, and Lundstrom K, 1994. Genetic mapping of quantitative trait loci for growth and fatness in pigs. Science 263:1771-1774.
Bogulski MS, Lowe TM, and Tolstoshev CM, 1993. DbEST-database for expressed sequence tags. Nat Genet 4:332-333.
Bonaldo MF, Lennon G, and Soares MB, 1996. Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 6:791-806.
Bumstead N, Young RR, Tregaskes C, Palyga P, and Dunn PP, 1994. Linkage mapping and partial sequencing of 10 cDNA loci in the chicken. Anim Genet 25:337-341.
Cheng HH, Levin I, Vallejo RL, Khatib H, Dodgson JB, Crittenden LB, and Hillel J, 1995. Development of a genetic map of the chicken with markers of high utility. Poult Sci 74:1855-1874.
Cooper DN, Smith BA, Cooke HJ, Nieman S, and Schmidtke J, 1985. An estimate of unique DNA sequence heterozygosity in the human genome. Hum Genet 60:201-215.
Crittenden LB, Provencher L, Santangelo L, Levin I, Abplanalp H, Briles R, Briles W, and Dodgson JB, 1993. Characterization of a red jungle fowl by white leghorn reference population for a molecular mapping of the chicken genome. Poult Sci 72:334-348.
Crittenden L, Bitgood J, and Burt D, 1995. Genetic nomenclature guide. Chick Trends Genet March:33-34.
Faranda S, Frattini A, Zucchi L, Patrosso C, Milanesi L, Montagna C, and Vezzoni P, 1996. Characterization and fine localization of two new genes in Xq28 using the genomic sequence/EST database screening approach. Genomics 34:323-327.
Georges M and Andersson L, 1996. Livestock genomics comes of age. Genome Res 6:907-921.
Gordon D, Abajian C, and Green P, 1998. Consed: a graphical tool for sequence finishing. Genome Res 8:195-202.
Groenen MAM, Crooijmans RPM, Veenendaal A, Cheng HH, Siwek M, and Van der Poel JJ, 1998. A comprehensive microsatellite linkage map of the chicken genome. Genomics 49:265-274.
Gussow D and Clackson T, 1989. Direct clone characterization from plaques and colonies by the polymerase chain reaction. Nucleic Acids Res 17:4000.
Hacia JG, Makalowski W, Edgemon K, Erdos MR, Robbins CM, Fodor SP, Brody LC, and Collins FS, 1998. Evolutionary sequence comparisons using high-density oligonucleotide arrays. Nat Genet 18:155-158.
Halushka MK, Fan JB, Bentley K, Hsie L, Shen N, Weder A, Cooper R, Lipshutz R, and Chakravarti A, 1999. Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat Genet 22:239-247.
Harding RM, Fullerton SM, Griffiths RC, Bond J, Cox MJ, Schneider JA, Moulin DS, and Clegg JB, 1997. Archaic African and Asian lineages in the genetic ancestry of modern humans. Am J Hum Genet 60:772-789.
Jones MH, Tirosvoutis KN, Bowgen C, Davey P, Moore S, Naylor S, and Affara NA, 1998. Regional assignment and expression analysis of 29 expressed sequence tags mapped to chromosome 3. Genomics 533:400-405.
Kruglyak L, 1997. The use of a genetic map of biallelic markers in linkage studies. Nat Genet 17:21-24.
Kruglyak L, 1999. Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet 22:139-144.
Kwok PY, Deng Q, Zakeri H, Taylor SL, and Nickerson DA, 1996. Increasing the information content of STS-based genome maps: identifying polymorphisms in mapped STSs. Genomics 31:123-126.
Li WH and Sadler LA, 1991. Low nucleotide diversity in man. Genetics 129:513-523.
Li S, Liu N, Zadworny D, and Kuhnlein U, 1998. Genetic variability in white leghorns revealed by chicken liver expressed sequence tags. Poult Sci 77:134-139.
Nickerson DA, Taylor SL, Weiss KM, Clark AG, Hutchinson RG, Stengard J, Salomaa V, Vartiainen E, Boerwinkle E, and Sing CF, 1998. DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene. Nat Genet 19:233-240.
Olson M, Hood L, Cantor C, and Botstein D, 1989. A common language for physical mapping of the human genome. Science 245:1434-1435.
Picoult-Newberg L, Ideker TE, Pohl MG, Taylor SL, Donaldson MA, Nickerson DA, and Boyce-Jacino M, 1999. Mining SNPs from EST databases. Genome Res 9:167-174.
Rieder MJ, Taylor SL, Clark AG, and Nickerson DA, 1999. Sequence variation in the human angiotensin converting enzyme. Nat Genet 22:59-62.
Rozen S and Skaletsky HJ, 1997. Primer3. Code available at http://www-genome.wi.mit.edu/genome_software/other/primer3.html.
Ruyter-Spira CP, Crooijmans RP, Dijkhof RJ, van Oers PA, Strijk JA, van der Poel JJ, and Groenen MAM, 1996. Development and mapping of polymorphic microsatellite markers derived from a chicken brain cDNA library. Anim Genet 27:229-234.
Ruyter-Spira CP, de Koning DJ, van der Poel JJ, Crooijmans RP, Dijkhof PA, and Groenen MAM, 1998. Developing microsatellite marker from cDNA: a tool for adding expressed sequence tags to the genetic linkage map of the chicken. Anim Genet 29:85-90.
Sambrook J, Fritsch EF, and Maniatis T, 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Smith E, 1998. A sequence tagged site in the chicken based on primers specific for the mouse transcription factor core-binding factor alpha 1 CBFA1. Anim Genet 29:236-243.
Smith E, Shi L, Drummond P, Rodriguez L, Hamilton R, Powell E, Nahashon S, Ramlal S, and Foster J, 2000. Development and characterization of expressed sequence tags for the turkey (Meleagris gallopavo) genome and comparative sequence analysis with other birds. Anim Genet 31:62-67.
Soares M, Bonaldo M, Jelene P, Su L, Lawton L, and Efstratiadis A, 1994. Construction and characterization of a normalized cDNA library. Proc Natl Acad Sci USA 91:9228-9232.
Spelman RJ, Coppieters W, van Arendonk JA, and Bovenhuis H, 1996. Quantitative trait loci analysis for five milk production traits on chromosome six in the Dutch Holstein-Friesian population. Genetics 144:1799-1808.
Spike CA, Bumstead N, Crittenden LB, and Lamont SJ, 1996. RFLP mapping of expressed sequence tags in the chicken. J Hered 87:6-9.
Vallejo RL, Bacon LD, Liu HC, Witter RL, Groenen MAM, Hillel J, and Cheng HH, 1998. Genetic mapping of quantitative trait loci affecting susceptibility to Marek's disease virus induced tumors in F2 intercross chickens. Genetics 148:349-360.
Wang DG, Fan JB, Siao CJ, Berno A, Young P, Sapolsky R, Ghandour G, Perkins N, Winchester E, Spencer J, Kruglyak L, Stein L, Hsie L, Topaloglou T, Hubbell E, Robinson E, Mittmann M, Morris M, Shen N, Kilburn D, Rioux J, Nusbaum C, Rozen S, Hudson TJ, Lander ES, et al., 1998. Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science 280:1077-1082.