Journal of Heredity Advance Access originally published online on December 14, 2004
Journal of Heredity 2005 96(1):80-82; doi:10.1093/jhered/esi015

© 2005 The American Genetic Association

Computer Note

Expeditor: A Pipeline for Designing Primers Using Human Gene Structure and Livestock Animal EST Information

Z.-L. Hu, K. Glenn, A. M. Ramos, C. J. Otieno, J. M. Reecy, and M. F. Rothschild

From the Department of Animal Science, 2255 Kildee Hall, Center for Integrated Animal Genomics, Iowa State University, Ames, IA 50011

Address correspondence to Zhi-Liang Hu and Max F. Rothschild at the address above, or e-mail: zhu{at}iastate.edu or mfrothsc{at}iastate.edu, respectively.


    Abstract
 
We have developed software, called Expeditor, that combines known gene structure information from human with coding sequence information from farm animal species for streamlined primer design in the target farm animal species. The software has many uses, including PCR-based SNP discovery for identifying genes/markers associated with economically important traits in farm animals, comparative mapping analysis, and evolution studies. Using this software minimizes tedious manual operations and reduces the chance of errors introduced by more conventional approaches.


The near completion of the human genome sequencing project has provided a useful resource for animal genome research because of the considerable conservation of genomic organization, including intron/exon structures, in vertebrates (O'Brien et al. 1997; Zhu et al. 2003). The recent generation of large numbers of expressed sequence tags (ESTs) in livestock species and the construction of gene indices and UniGene for cattle, chickens, pigs, and sheep, along with other species (Quackenbush et al. 2000; NCBI dbEST 2003; NCBI UniGene 2003), have also facilitated comparative genome and candidate gene analysis in animals. To support a polymerase chain reaction (PCR)-based approach for amplifying genomic segments of genes in animals using human gene structure information, we have developed software, Expeditor, which combines human gene structure information and animal coding sequence information for primer design. This tool is useful for identifying mRNA splice sites in the TIGR Indices or UniGene sequences so that PCR primers can be designed correctly. Compared to molecular biology methods (Iwahana et al. 1994; Siebert et al. 1995; Zhang et al. 2000) and computational approaches (Pertea et al. 2001; Reese et al. 1997; Thanaraj et al. 2000) for determining intron/exon boundaries, our approach has proven more efficient and practical to use. This tool enables a rapid PCR-based approach to study functional or positional candidate genes of interest in farm animals, among other uses such as comparative mapping.


    Program Description
 
This program uses human gene exon sequences, together with intron/exon structural information for a given gene, to form a "genomic sequence" template in which exon sequences may be replaced with available animal equivalents upon a satisfactory blast match. When no good animal equivalent sequence is available for a given exon, users have the option of substituting the exon sequence with a consensus sequence generated from multiple other animal species. A consensus tool is provided within Expeditor for this purpose. The existing "Primer3" program is bundled for streamlined primer design on the template sequence with user-defined stringency parameters.
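As a rough illustration of the consensus idea, the sketch below takes a column-wise majority vote over sequences that are assumed to be already aligned and of equal length. This is a simplified stand-in, not the method used by Expeditor's consensus tool; the function name `majority_consensus`, the gap character, and the use of "N" for ties are all assumptions made for the example.

```python
from collections import Counter


def majority_consensus(aligned, gap="-"):
    """Column-wise majority vote over pre-aligned, equal-length sequences.

    Hypothetical helper: columns made only of gaps are dropped, and a tie
    between the two most frequent bases is reported as the ambiguity code N.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned):
        counts = Counter(base.upper() for base in column if base != gap)
        if not counts:
            continue  # every sequence has a gap at this position
        ranked = counts.most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            consensus.append("N")  # no clear majority
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)


# Toy aligned fragments standing in for sequences from three species.
print(majority_consensus(["ATGGCT", "ATGGCC", "ATGACT"]))
```

An ambiguous "N" inside an exon would simply make that position unattractive for primer placement, which is a conservative outcome for this kind of template.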

Platform and Prerequisites
The Expeditor tool was developed as a CGI program with Perl 5.8 in a Unix environment. The software calls three external programs: NCBI Blast (version 2.0.14; Altschul et al. 1997), CAP3 (Huang and Madan 1999), and Primer3 (version 0.6; Rozen and Skaletsky 1996, 1997). We built multiple coding sequence databases for the blast search using TIGR Indices and NCBI UniGene for cattle, chickens, pigs, and sheep. All software and sequence databases are installed on the same server.

Input
Expeditor is designed to take the Ensembl (Clamp et al. 2003) ExonView sequences as its standard input. We designed a Web form for input of raw data as well as user-defined parameter values (see our Web site, http://www.genome.iastate.edu/~hu/expeditor). With this input form, users have options to determine each of the eight blast stringency thresholds for exon replacement before each run. In addition to the six standard blast parameters, "minimum matched length" and acceptable "score" values were added to filter for acceptable blast results. Ten major parameters for primer design are also passed on to the Primer3 program. The input for the consensus tool can be multiple similar sequences in fasta format, or multiple blast alignments directly from the NCBI blast output (see examples on our Web site).
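The two added filters, "minimum matched length" and "score", can be pictured as a simple post-filter on the blast hits for one exon. The sketch below is only an illustration under stated assumptions: the `BlastHit` record, its field names, the inclusive thresholds, and the choice of the highest-scoring survivor are hypothetical and are not taken from the Expeditor code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BlastHit:
    """Hypothetical summary of one blast hit for a human exon query."""
    subject_id: str       # identifier of the matched animal sequence
    matched_length: int   # number of aligned bases
    score: float          # blast score of the alignment


def best_acceptable_hit(hits: List[BlastHit],
                        min_matched_length: int,
                        min_score: float) -> Optional[BlastHit]:
    """Return the highest-scoring hit that passes both thresholds, or None."""
    acceptable = [
        hit for hit in hits
        if hit.matched_length >= min_matched_length and hit.score >= min_score
    ]
    if not acceptable:
        return None  # the exon would keep its human sequence
    return max(acceptable, key=lambda hit: hit.score)


# Made-up hits for a single exon.
hits = [
    BlastHit("TC1", matched_length=120, score=180.0),
    BlastHit("TC2", matched_length=40, score=250.0),   # too short
    BlastHit("TC3", matched_length=150, score=90.0),   # score too low
]
chosen = best_acceptable_hit(hits, min_matched_length=100, min_score=100.0)
print(chosen.subject_id if chosen else "no replacement")
```

Raising either threshold trades fewer replaced exons for higher confidence in each replacement.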

Output
On each successful run, users can retrieve results via Web links. These include: (1) Blast results for each exon. The ratio between the length of the matched animal sequence and the length of the human exon is calculated for each exon evaluated, which is useful when users wish to judge how good the blast output is for an exon replacement. (2) A virtual genomic sequence template formed by Expeditor, with or without the exon sequences being replaced depending on the availability of similar animal sequences, the blast thresholds, and possible consensus sequences from other species. (3) PCR primers designed from the virtual template sequence. Expeditor also has options to show primer locations in color.
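The per-exon ratio in item (1) is a plain length ratio. A minimal sketch, assuming the matched length and the human exon length are both counted in bases; the function name `exon_coverage` and the example lengths are hypothetical.

```python
def exon_coverage(matched_length: int, human_exon_length: int) -> float:
    """Fraction of the human exon length covered by the matched animal sequence."""
    if human_exon_length <= 0:
        raise ValueError("human exon length must be positive")
    return matched_length / human_exon_length


# Made-up lengths: a 200-base exon with 172 matched bases.
print(f"{exon_coverage(172, 200):.0%}")
```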

An example of a virtual genomic sequence template for "3-hydroxy-3-methylglutaryl-coenzyme A reductase" (HMG-CoA reductase) produced by Expeditor, with exons 7–9 of the gene from the Ensembl ExonView as the input sequence and the TIGR Pig Indices as the blast database, is shown in Figure 1. In the virtual sequence template, exons replaced with pig sequences are printed in small letters (replacement was successful for the first two exons), and exons retaining human sequences are printed in capital letters (replacement was unsuccessful for the last exon). This template sequence, together with each blast result, may help users judge how suitable a template is for use. Stretches of "Ns" in the template represent the sizes and locations of intron sequences.
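The template format described above can be sketched in a few lines. This is an illustration of the output convention only, not the Expeditor implementation; the `Exon` record, its field names, and the toy sequences are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Exon:
    """Hypothetical record for one exon of the human gene."""
    human_seq: str              # exon sequence from the human gene
    animal_seq: Optional[str]   # accepted animal replacement, or None
    intron_after: int = 0       # size of the following intron (0 after the last exon)


def build_template(exons: List[Exon]) -> str:
    """Assemble a virtual genomic template.

    Replaced exons are written in small letters, exons kept as human sequence
    in capital letters, and each intron as a run of Ns of the intron's size.
    """
    parts = []
    for exon in exons:
        if exon.animal_seq is not None:
            parts.append(exon.animal_seq.lower())
        else:
            parts.append(exon.human_seq.upper())
        parts.append("N" * exon.intron_after)
    return "".join(parts)


# Toy example: first exon replaced, second exon kept as human sequence.
exons = [
    Exon(human_seq="ATGGCC", animal_seq="ATGGCT", intron_after=5),
    Exon(human_seq="ggttaa", animal_seq=None),
]
print(build_template(exons))  # atggctNNNNNGGTTAA
```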



Figure 1. An Expeditor output showing the virtual genomic sequence template formed. The first two exons, printed in small letters, were replaced with pig coding sequences upon satisfactory blast matches (coverage of the human exon lengths was 100% and 86%, respectively). The last exon is printed with the original human sequence in capital letters because of an unsatisfactory pig sequence match (10% only). The stretches of "Ns" represent the sizes and locations of intron sequences.

 
For the example shown in Figure 1, the program took only 5 s on a Tru64 computer with 512 MB RAM and a 500 MHz CPU. On average, each run like this one may save about 2–3 h of a researcher's time.

Availability and Usage
Expeditor is available online for free use by not-for-profit or academic users (http://www.genome.iastate.edu/~hu/expeditor). Usage instructions are provided on the Web. The code is also available on request.


    Discussion
 
One challenge in efficiently searching for useful functional or positional candidate genes of interest in farm animals is identifying the mRNA splice sites in the coding sequences of the target species, so that PCR primers for amplifying genomic segments of interest can be designed correctly from EST or mRNA information alone. Our approach to this problem is more efficient than laboratory or computational approaches.

When human homologous gene structure information is known, one normally has to manually align the animal and human sequences to find where in the animal coding sequence an intron may be inserted and how big this insertion may be. The same manual process has to be repeated for every possible intron insertion to create a useful template sequence for primer design. To design primers on a sequence without gene structural information using graphical primer design software like Oligo 6, one needs to determine where a primer may be located and whether it is possible for an amplicon to span an intron. Often this effort is combined with test/fail/test cycles before a cross-intron fragment may be successfully amplified when its size is within the range of PCR capability. The major advantages of Expeditor are that it relieves researchers of laborious trial and error and reduces errors that might be introduced by hand.

Expeditor is designed for PCR amplification of genomic sequences with primer sites anchored in exons. Primer3 is used to optimize the locations of the primers with the user-set primer design parameters. Representing introns with stretches of Ns effectively prevents Primer3 from placing any primer on intron locations, which is desirable because only the coding sequences share good homology between species. The amplicon may or may not span introns, depending on the intron/exon sizes, but will not span 5' or 3' untranslated regions.
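To make the role of the N stretches concrete, the sketch below checks a candidate primer pair against a virtual template: both primer sites must be free of intron Ns, and the predicted amplicon size includes any intron Ns between them. It is an illustration only and does not call Primer3; the function name, the 0-based coordinates, and the toy primer length of 4 are assumptions. Only capital "N" is counted as intron, on the assumption that exon sequences contain no capital N.

```python
def describe_amplicon(template: str, fwd_start: int, fwd_len: int,
                      rev_end: int, rev_len: int) -> dict:
    """Summarize a candidate amplicon on a virtual genomic template.

    fwd_start: 0-based index of the first base of the forward primer site.
    rev_end:   0-based index of the last base of the reverse primer site.
    """
    fwd_site = template[fwd_start:fwd_start + fwd_len]
    rev_site = template[rev_end - rev_len + 1:rev_end + 1]
    amplicon = template[fwd_start:rev_end + 1]
    return {
        # Primers must sit entirely in exon sequence (no intron placeholder).
        "primers_in_exons": "N" not in fwd_site and "N" not in rev_site,
        # Predicted size counts exon bases plus the estimated intron size.
        "amplicon_size": len(amplicon),
        "intron_bases": amplicon.count("N"),
    }


# Toy template: pig exon (small letters), 5-base intron, human exon (capitals).
template = "atggctNNNNNGGTTAA"
print(describe_amplicon(template, fwd_start=0, fwd_len=4, rev_end=16, rev_len=4))
print(describe_amplicon(template, fwd_start=4, fwd_len=4, rev_end=16, rev_len=4))
```

In the second call the forward site overlaps the intron placeholder, so the pair is flagged as not anchored in exons.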

There are several factors affecting the success rate of the output from Expeditor: (1) availability of complete coding sequence of the gene from the target species; (2) blast stringency; (3) presence of alternative splicing site(s); and (4) actual degree of intron/exon structural homology for a given gene between human and animal species. The design of Expeditor was based on the assumption that the human and livestock animals (e.g., pig) share a high degree of homology in both coding sequence and genomic organization surrounding a gene. However, due to the fact that alternative splicing exists, not all intron locations for a gene may be correctly predicted. In addition, because of the high frequency of tandem repeats within introns, the size of an intron may not be precisely determined. Subsequently, the actual amplified DNA fragments may or may not be exactly like what was anticipated based on Expeditor prediction. In other words, the template sequence made by Expeditor is only an approximation. Therefore, users are strongly advised to carefully evaluate the template produced before carrying out actual PCR experiments. The default blast and primer design parameters were set based on our empirical data. Users are always encouraged to properly adjust them and find appropriate combinations for a given situation.

Compared with other primer design tools using a similar approach (for example, ICCARE; Muller et al. 2004), Expeditor is unique in that it takes inputs directly from known human genes/sequences, takes into account the intron sizes in estimating amplicon sizes, and allows consensus sequences from multiple animal species to be used as part of the template. In our laboratory the use of this tool has been quite successful in searching for SNPs in pigs.

Besides its utility in individual gene searches, Expeditor can also be used in comparative mapping and evolution analysis of genes/genomes between species. As evidence for conservation of intron/exon structure continues to accumulate among animals and even plants (Fedorov et al. 2002), this tool may have wider utility in situations where information on gene structure is available in one species and coding sequences are available in another species.


    Acknowledgments
 
This work was supported in part by funding from the USDA-CSREES Pig Genome and Database Coordination programs, the Iowa Agriculture and Home Economics Experiment Station, and by Hatch Act and State of Iowa funds. The authors thank Dr. Kwan-Suk Kim, Benny Mote, Renata Hernandes, James Koltes, and Dr. Laura Grapes for serving as the beta testers of the software and many useful discussions through the course of this work.


    Footnotes
 
Corresponding Editor: Leif Andersson

Received November 25, 2003
Accepted July 23, 2004


    References
 

    Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, and Lipman DJ, 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.

    Clamp M, Andrews D, Barker D, Bevan P, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, et al., 2003. Ensembl 2002: accommodating comparative genomics. Nucleic Acids Res 31:38–42.

    Fedorov A, Merican AF, and Gilbert W, 2002. Large-scale comparison of intron positions among animal, plant, and fungal genes. Proc Natl Acad Sci USA 99(25):16128–16133.

    Huang X, and Madan A, 1999. CAP3: a DNA sequence assembly program. Genome Res 9:868–877.

    Iwahana H, Tsujisawa T, Katashima R, Yoshimoto K, and Itakura M, 1994. PCR with end trimming and cassette ligation: a rapid method to clone exon-intron boundaries and a 5'-upstream sequence of genomic DNA based on a cDNA sequence. PCR Methods Appl 4(1):19–25.

    Muller C, Denis M, Gentzbittel L, and Faraut T, 2004. The Iccare web server: an attempt to merge sequence and mapping information for plant and animal species. Nucleic Acids Res 32:W429–W434.

    NCBI dbEST: database of expressed sequence tags. Last modified: July 28, 2003. http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html.

    NCBI UniGene: pigs. Last modified: August 3, 2003. http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?ORG=Ssc.

    O'Brien SJ, Wienberg J, and Lyons LA, 1997. Comparative genomics: lessons from cats. Trends Genet 13(10):393–399.

    Pertea M, Lin X, and Salzberg SL, 2001. GeneSplicer: a new computational method for splice site prediction. Nucleic Acids Res 29(5):1185–1190.

    Quackenbush J, Liang F, Holt I, Pertea G, and Upton J, 2000. The TIGR gene indices: reconstruction and representation of expressed gene sequences. Nucleic Acids Res 28(1):141–145.

    Reese MG, Eeckman FH, Kulp D, and Haussler D, 1997. Improved splice site detection in Genie. J Comput Biol 4(3):311–323.

    Rozen S, and Skaletsky HJ, 1996, 1997. Primer3. Code available at http://www-genome.wi.mit.edu/genome_software/other/primer3.html.

    Siebert PD, Chenchik A, Kellogg DE, Lukyanov KA, and Lukyanov SA, 1995. An improved PCR method for walking in uncloned genomic DNA. Nucleic Acids Res 23(6):1087–1088.

    Thanaraj TA, and Robinson AJ, 2000. Prediction of exact boundaries of exons. Brief Bioinform 1(4):343–356.

    Zhang Z-G, and Gurr SJ, 2000. Walking into the unknown: a "step down" PCR-based technique leading to the direct sequence analysis of flanking genomic DNA. Gene 253(2):145–150.

    Zhu L, Swergold GD, and Seldin MF, 2003. Examination of sequence homology between human chromosome 20 and the mouse genome: intense conservation of many genomic elements. Hum Genet 113(1):60–70.

