Journal of Heredity Advance Access originally published online on May 22, 2008
Journal of Heredity 2008 99(5):500-511; doi:10.1093/jhered/esn029

© The American Genetic Association. 2008. All rights reserved. For permissions, please email: journals.permissions@oxfordjournals.org.

Concerted Evolution of Vertebrate CCR2 and CCR5 Genes and the Origin of a Recombinant Equine CCR5/2 Gene

Andrey A. Perelygin, Andrey A. Zharkikh, Natalia M. Astakhova, Teri L. Lear, and Margo A. Brinton

From the Department of Biology, Georgia State University, PO Box 4010, Atlanta, GA 30302 (Perelygin, Astakhova, and Brinton); the Bioinformatics Department, Myriad Genetics, Inc., 320 Wakara Way, Salt Lake City, UT 84108 (Zharkikh); and the Department of Veterinary Science, University of Kentucky, 108 Maxwell H. Gluck Equine Research Center, Lexington, KY 40546 (Lear)

Address correspondence to Andrey A. Perelygin at the address above, or e-mail: aperelygin{at}gsu.edu.

Chemokine receptors (CCRs) play an essential role in the initiation of an innate immune host response. Several of these receptors have been shown to modulate the outcome of viral infections. The recent availability of complete genome sequences from a number of species provides a unique opportunity to analyze the evolution of the CCR genes. A phylogenetic analysis revealed that the CCR2 gene evolved in concert with the paralogous CCR5 gene, but not with another paralogous gene, CCR3, in the opossum, platypus, rabbit, guinea pig, cat, and rodent lineages. In addition, evidence of concerted evolution of the CCR2 and CCR5 genes was observed in chicken and lizard genomes. A unique CCR5/2 gene that originated by unequal crossing over between the CCR2 and CCR5 genes was detected in the domestic horse. The CCR2, CCR5, and CCR5/2 genes were mapped to ECA16q21 using fluorescent in situ hybridization (FISH). Single-nucleotide polymorphisms identified in the equine CCR5 gene and characterized within 5 horse breeds provide haplotype markers for future case/control studies investigating the genetic bases of horse susceptibility to infectious diseases.


The chemokine family includes a group of secreted proteins that activate the immune system during inflammation in response to microbial infection and during autoimmune or allergic reactions (Howard et al. 1996). Chemokines act on immune cells by binding to chemokine receptors (CCRs). These receptors are members of the G protein–coupled receptor family and are composed of 7 hydrophobic transmembrane (TM) domains as well as an extracellular N-terminal region and a cytoplasmic C-terminal tail that contains important structural and functional motifs. In addition to binding chemokine ligands, particular CCRs have previously been reported to facilitate immunodeficiency virus entry (Deng et al. 1996; Dragic et al. 1996; He et al. 1997; Lim et al. 2006).

Subcutaneous inoculation of West Nile virus (WNV) consistently led to a fatal outcome in CCR5 knockout mice (Glass et al. 2005). The data obtained with this animal model suggested the possibility that the CCR5Δ32 deletion mutation, which provides protection against human immunodeficiency virus 1, might increase the risk of fatal WNV infection in humans. Cohort studies using clinical samples from WNV-infected and control individuals from Arizona and Colorado showed that 1% of the controls but 4–8% of individuals with laboratory-confirmed, WNV-induced disease were homozygous for the CCR5Δ32 allele. However, this difference was not statistically significant. An increased frequency (P = 0.04) of CCR5Δ32 homozygotes among fatal cases in the Arizona cohort was reported to be statistically significant, but only a few fatal cases were studied (Glass et al. 2006).

WNV is a mosquito-borne, single-stranded, positive-sense RNA virus that has recently become endemic in the Americas. WNV is maintained in a mosquito–bird–mosquito transmission cycle but humans and other mammals are occasionally infected. WNV infection does not cause severe disease and death in domestic animals, such as cattle (Bos taurus), pigs (Sus scrofa), rabbits (Oryctolagus cuniculus), cats (Felis catus), or dogs (Canis lupus familiaris). In contrast, horses (Equus caballus) and other equids that develop clinical WNV-induced disease show mortality rates up to 30% (Salazar et al. 2004; Schuler et al. 2004; Ward et al. 2006).

The mRNA sequences and exon/intron structures of the equine CCR2 and CCR5 genes and a unique CCR5/2 gene were characterized in this study. These genes were FISH mapped to ECA16q21. Single-nucleotide polymorphisms (SNPs) were identified in the coding regions of the CCR5 and CCR5/2 genes. These SNPs will be utilized in future cohort studies to analyze associations between innate immunity genes and susceptibility to infectious disease in horses. Sequences of orthologous elephant and chicken CCR5 genes were also determined. These sequences as well as CCR2, CCR3, and CCR5 gene sequences from other species detected in the GenBank database were used for phylogenetic analysis. The results of this analysis provided evidence for concerted evolution of the vertebrate CCR2 and CCR5 genes.


    Materials and Methods
RNA and DNA Sources
Leghorn chicken intestine full-length cDNA was purchased from Seegene (Rockville, MD). Both equine total RNA and genomic DNA were isolated from Quarter horse peripheral white blood cells (PWBC) kindly provided by Dr T. Thompson (Equine Medicine and Surgery, Douglasville, GA). This equine PWBC RNA as well as spleen total RNA that was isolated from a horse of an unknown breed (Zyagen Laboratories, San Diego, CA) were converted into single-stranded cDNA (sscDNA) as described previously (Perelygin et al. 2005). High molecular weight genomic DNA isolated from an African savannah elephant was kindly provided by Drs A. Roca and S. O'Brien (National Cancer Institute, Frederick, MD).

Extension of Partial Gene Sequences
The DNA Walking (DW) SpeedUp kit (Seegene) was used to extend partial gene sequences. This kit utilized both annealing control primer (ACP) and dual priming oligonucleotide technologies, which maximize the ability of commercial DW-ACP primers and target-specific primers to capture unknown target sites and optimize polymerase chain reaction (PCR) conditions. Three gene-specific primers are required for each walking step, which significantly increases the specificity of each of the partial sequence extensions. In contrast to the rapid amplification of cDNA ends method, specific adapter ligation is not required and both genomic DNA and cDNA templates can be used with the DNA walking SpeedUp kit.

Detection of Polymorphisms in the Equine CCR5 and CCR5/2 Genes
To identify SNPs in the equine CCR5 and CCR5/2 genes, genomic DNA samples from 1 Arabian, 1 Hanoverian, 1 Paint, 4 Quarter, 3 Thoroughbred, and 6 mixed-breed horses were isolated as described previously (Perelygin et al. 2006a) and amplified by PCR using the primers listed in Table 1. Because of the high similarity between CCR2 and CCR5 sequences, special precautions were taken to ensure that only the correct gene was amplified. First, long primers (usually 24-mers) with high annealing temperatures (68–72 °C) were used to amplify specific PCR products. Second, each primer pair was designed from gene-specific sequences so that each amplified only 1 CCR gene. Third, all PCR amplifications were performed using the Advantage 2 polymerase mix (Clontech, Mountain View, CA), which contains an antibody against Taq polymerase that provides the high specificity of "hot-start" PCR. PCR products were individually purified using a QIAquick PCR Purification Kit (Qiagen, Valencia, CA) and directly sequenced using the BigDye Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA) in the DNA/Protein Core facility at Georgia State University.


Table 1. PCR primers used to amplify vertebrate CCR2, CCR5, and CCR5/2 genes

 
SNP Genotyping within the Equine CCR5 Gene
Either the tetraprimer PCR amplification of refractory mutation system (tetraprimer ARMS–PCR) technique (Ye et al. 2001) or the restriction fragment length polymorphism (RFLP) technique was used to genotype the SNPs identified within the equine CCR5 gene in 5 horse breeds: American Saddlebreds, Arabians, Quarter horses, Standardbreds, and Thoroughbreds. The PCR primer pairs used for genotyping are listed in Table 2.


Table 2. PCR primers and restriction enzymes used to genotype SNPs within the ORF of the equine CCR5 gene in DNA samples from 5 breeds of horses

 
Phylogenetic Analysis
CCR2 and CCR5 phylogenetic trees were built using the njtree program with 2-parameter (Kimura 1980) or codon-based (Li 1993) distances as described previously (Perelygin et al. 2006b). This program is available on request from A.A.Z. (zharkikh{at}myriad.com).
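The 2-parameter distance mentioned above can be illustrated with a short sketch. This is not the njtree program, whose internals are not described here; it is a minimal stand-alone implementation of the standard Kimura (1980) formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. The function name and the handling of gaps (skipped) are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases are skipped, which is an assumption of this sketch.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For example, two 10-bp sequences that differ by a single A/G transition give P = 0.1 and Q = 0, so the distance is -0.5 ln(0.8), slightly larger than the raw proportion of differences.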

FISH Mapping of Equine CCR5
The horse CHORI-241 BAC library was searched with a probe derived from the partial equine CCR5 cDNA that was obtained as part of this study using the degenerate primers EcCCR5degF and EcCCR5degR (Table 1). DNA from equine BAC clones was labeled with Spectrum Orange-dUTP (Vysis, Downers Grove, IL) following the manufacturer's directions. These DNAs were FISH mapped on equine metaphase spreads as described previously (Perelygin et al. 2006a).


    Results and Discussion
Assembling the Horse CCR2, CCR5, and CCR5/2 Genes
Cat (AB022910), cattle (AY834252), human (NM_000579), pig (AB119272), and rabbit (DQ444458) CCR5 mRNA sequences were downloaded from GenBank and aligned to design degenerate primers (EcCCR5degF and EcCCR5degR; Table 1) located in evolutionarily conserved regions. Using these primers, 3 fragments were amplified from equine PWBC sscDNA by PCR. The sequence of one of these fragments was similar to those of other mammalian CCR5 genes. This cDNA sequence was extended to full length by DNA walking and submitted to GenBank under accession number DQ629590. Alignment of this sequence with the horse whole-genome sequence assembly from the UCSC genome browser (www.genome.ucsc.edu) revealed 2 exons: the noncoding exon 1 of 285 bp and exon 2, which includes a 5' NCR of 11 bp, an open reading frame (ORF) of 1059 bp, and a 3' NCR of 679 bp (Figure 1). The exon 1 (5' NCR) and exon 2 (3' NCR) sequences may not be complete due to limitations of the DNA walking method. The ORF and 3' NCR of the DQ629590 sequence showed 99–100% similarity to several sequences (EF507698 through EF507703) derived from Portuguese horse genomic DNAs and recently submitted to GenBank by van der Loo and coworkers, Research Center in Biodiversity and Genetic Resources, Vairao, Portugal. No alternatively spliced variants of the horse CCR5 mRNA were detected after amplification of either equine PWBC or spleen sscDNA with horse CCR5 gene–specific primers (EcCCR5rnaF and EcCCR5rnaR; Table 1). Although only one version of the horse CCR5 mRNA sequence (DQ629590) has been detected to date (Figure 1), 2 alternatively spliced variants of the human CCR5 mRNA were previously reported (Mummidi et al. 1997).
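The idea of deriving degenerate primers from an alignment of orthologous sequences can be sketched as follows. This is only an illustration of how variable columns collapse to IUPAC ambiguity codes; it is not the procedure used to design EcCCR5degF and EcCCR5degR, and the function name and toy sequences are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_consensus(aligned_seqs):
    """Collapse each column of a gap-free alignment to one IUPAC code.

    Hypothetical helper: columns containing gaps or non-ACGT symbols
    raise KeyError, because primer sites are assumed to be gap-free.
    """
    columns = zip(*(s.upper() for s in aligned_seqs))
    return "".join(IUPAC[frozenset(col)] for col in columns)


# Toy 6-bp alignment (not real CCR5 sequence)
print(degenerate_consensus(["ATGGAT", "ATGGAC", "ATGAAT"]))  # prints ATGRAY
```

Conserved regions are preferred for such primers because every ambiguity code multiplies the number of distinct oligonucleotides in the primer pool.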


Figure 1. Splice variants of equine CCR5, CCR5/2, and CCR2 transcripts. Asterisks indicate cryptic translation initiation or termination codons. The 3' NCRs were not included for any of the genes.

 
The sequence of another of the PCR fragments was similar to both the horse expressed sequence tag (EST) CD466909 and the whole-genome contig AAWR01020520. This sequence represented a middle portion of the equine CCR2 gene. A similar genomic sequence consisting of the 3' end of intron 1 and the ORF of the horse CCR2 gene was recently submitted to GenBank by van der Loo and coworkers (DQ473316). Alignment of EST CD466909 with the DQ473316 and AAWR01020520 sequences (data not shown) revealed a 660-bp gap in the middle of the CD466909 sequence that suggested the existence of an alternatively spliced version of the equine CCR2 mRNA. To further analyze this possibility, equine spleen sscDNA was amplified using CCR2-specific primers (EcCCR2rnaF and EcCCR2rnaR; Table 1). Four alternatively spliced transcripts of various lengths were detected. Their sequences were determined and submitted to GenBank under accession numbers EF649735 through EF649738. Alignments of each of these cDNA sequences with the sequence of the genomic contig AAWR01020520 revealed 3 alternatively spliced exons in the equine CCR2 gene (Figure 1). A GenBank search using the BLASTN program indicated that the 5' portion (328 bp) of the largest equine CCR2 transcript (EF649735) is similar to the 5' noncoding exon of the recently reported porcine ortholog NM_001001619 (Shinkai et al. 2005). The EF649736 sequence, which is 99% identical to the Portuguese DQ473316 sequence, was 192 bp shorter than the EF649735 sequence, suggesting the presence of 2 noncoding exons at the 5' end of the equine CCR2 gene (Figure 1). Two noncoding 5'-terminal exons were also detected in the murine ortholog by searching the UCSC genome browser. The EF649737 version of equine CCR2 mRNA differed from EF649735 by the lack of 660 bp at the 5' end of the single 3'-terminal exon due to alternative splicing. The splicing of both introns as well as of the 660-bp region at the 3'-terminal exon generated the EF649738 version of equine CCR2 mRNA (Figure 1). Both EF649735 and EF649736 mRNAs encode the full-length equine CCR2 protein, whereas both EF649737 and EF649738 mRNAs encode a truncated protein. Alternative splicing was previously reported to occur at the 3' end of human CCR2 pre-mRNA (Charo et al. 1994).

A GenBank search using the sequence of the third amplified equine PCR fragment revealed that its 5' portion was similar to other mammalian CCR5 gene sequences, whereas its 3' portion was similar to orthologous CCR2 gene sequences as well as to the equine genomic contig AAWR01020518, suggesting that a unique CCR5/2 gene had originated in the horse genome by unequal crossing over between the CCR5 and CCR2 genes (Figure 2A). When the equine PWBC sscDNA was amplified using CCR5/2-specific primers (EcCCR5/2rnaF and EcCCR5/2rnaR; Table 1), 3 cDNAs of various lengths were detected and sequenced. The sequences of these cDNAs were submitted to GenBank under accession numbers EF583878 through EF583880. Alignment of these 3 cDNA sequences with the AAWR01020518 genomic contig revealed 3 introns in the equine CCR5/2 gene (Figure 1). One, 2, or all 3 of these introns were spliced out in the EF583878, EF583879, and EF583880 CCR5/2 mRNAs, respectively. A different protein would be translated from each of the 3 alternatively spliced CCR5/2 mRNAs. Sequences of orthologous CCR5/2 genes from donkey (Equus asinus) and Grevy's zebra (Equus grevyi) were recently submitted to GenBank by van der Loo and coworkers (EF515830 and EF515831, respectively). These 2 sequences showed 98–99% similarity to the orthologous EF583880 CCR5/2 horse sequence but only about 90% similarity to either the DQ629590 CCR5 or the EF649736 CCR2 paralogous horse sequences.


Figure 2. The origin of the horse CCR5/2 gene. (A) Model of unequal recombination between the horse CCR5–CCR2 chromosomal regions. Arrows indicate direction of RNA transcription. (B) Phase diagram of the cumulative distributions of differences between the horse CCR genes: the horizontal axis corresponds to the cumulative difference between the CCR2 and CCR5 genes; the vertical axis corresponds to the cumulative difference between the CCR5/2 and CCR2 (black line) or between CCR5/2 and CCR5 (gray line) genes. Identical distributions would follow the diagonal line. The vertical lines indicate the interval in which the recombination event occurred.

 
Mapping of Horse CCR2, CCR5, and CCR5/2 Genes
Two BAC clones, 47I4 and 85J10, were found to contain the equine CCR2, CCR5, and CCR5/2 genes. The coding exons of the CCR5 and CCR5/2 genes as well as the 3' portion of the CCR2-coding exon were PCR amplified from each of these 2 BAC clones and sequenced. The sequences obtained showed complete identity with the corresponding cDNA sequences described above. FISH mapping with these 2 BAC clones assigned the CCR2, CCR5, and CCR5/2 genes to the horse chromosomal location ECA16q21 (data not shown). This map position was in agreement with the positions of the CCR2 and CCR5 genes on the horse genome sequence assembly (UCSC genome browser).

SNPs in Coding Regions of the Horse CCR5 and CCR5/2 Genes
Amplification and direct sequencing of DNA from 16 unrelated horses of various breeds using CCR5-specific primers (EcCCR5dnaF and EcCCR5dnaR; Table 1) identified 4 variable positions in the equine CCR5 gene. All of these SNPs (A63G, C108G, C267T, and C399T; nucleotide numbering started from the translation initiation site in the cDNA sequence) were synonymous. The EcCCR5/2dnaF and EcCCR5/2dnaR primers (Table 1) were used to identify CCR5/2 mutations in the same 16 horse DNA samples. Thirteen SNPs were found. Seven of these SNPs changed the amino acid residue and one created an insertion/deletion (C/–) that generated a frameshift and premature termination of translation. Alignment of the equine CCR5 and CCR5/2 sequences demonstrated that all of the CCR5 SNP positions differed from those of the CCR5/2 gene SNPs (Figure 3). This observation indicated that the individual SNPs could be genotyped within the horse CCR5 gene using simple high-throughput assays. Each of the CCR5 substitutions detected was subsequently genotyped by either RFLP or tetraprimer ARMS–PCR (Table 2). The frequencies of the identified alleles were estimated using DNA samples from 5 horse breeds (Table 3). Arabian horses were found to be polymorphic only for the C399T SNP, whereas both American Saddlebreds and Thoroughbreds demonstrated a high rate of variability in 3 of the 4 CCR5 SNPs analyzed. Haplotype prediction using the 2SNP software (http://alla.cs.gsu.edu/~software/2SNP/) revealed 3 common SNP haplotypes and a single rare (<1%) SNP haplotype spanning most of the horse CCR5 gene (data not shown). Association analysis of haplotype markers rather than individual SNPs will facilitate future case/control studies investigating the genetic bases of horse susceptibility to infectious diseases.
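As a small illustration of how a major allele frequency such as those in Table 3 follows from genotype counts, consider the sketch below. The function name and the genotype counts are hypothetical and are not data from this study; the calculation assumes a biallelic, diploid locus with genotypes written as two-letter strings.

```python
def major_allele_frequency(genotype_counts):
    """Return (major_allele, frequency) from diploid genotype counts.

    genotype_counts maps two-letter genotype strings (e.g. 'CC', 'CT', 'TT')
    to the number of animals carrying that genotype. Hypothetical helper.
    """
    allele_counts = {}
    for genotype, n in genotype_counts.items():
        for allele in genotype:  # each animal contributes two alleles
            allele_counts[allele] = allele_counts.get(allele, 0) + n
    total = sum(allele_counts.values())
    allele, count = max(allele_counts.items(), key=lambda kv: kv[1])
    return allele, count / total


# Invented counts for a C/T SNP in 10 animals: 6 CC, 3 CT, 1 TT
print(major_allele_frequency({"CC": 6, "CT": 3, "TT": 1}))  # prints ('C', 0.75)
```

With 10 animals there are 20 alleles; C occurs 2 x 6 + 3 = 15 times, giving 0.75.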


Table 3. Major allele frequencies of synonymous SNPs within the coding region of the equine CCR5 gene

 


Figure 3. Variable nucleotides (in bold) detected in the equine CCR5 or CCR5/2 genes. The predicted junction region between the CCR5-like and CCR2-like portions in the CCR5/2 gene is underlined.

 
Determination of Elephant and Chicken CCR5 Gene Sequences
To obtain additional insights about CCR5 and CCR2 gene evolution, the sequences of these genes from 2 additional vertebrate species, elephant and chicken, were determined. Cattle, horse, and pig CCR5 sequences were used to search the GenBank African savanna elephant (Loxodonta africana) genome trace archive database using the discontiguous Mega BLAST program. The genomic sequence 507006073 (G720P622497RG11.T0) was found to contain the 3' end of the elephant CCR5 gene. The 5' end of this partial sequence was extended by genomic DNA walking to the ATG start codon. Subsequently, elephant genomic DNA and 2 CCR5 specific primers (LaCCR5dnaF and LaCCR5dnaR; Table 1) were used to amplify and sequence the coding region of the elephant CCR5 gene. The resulting sequence, which contained a 1062-bp ORF, was submitted to GenBank under accession number EF524204.

Comparison of the CCR5 sequences from several species (Table 4) revealed a variation in ORF length ranging from 1056 bp (platypus; Ornithorhynchus anatinus) to 1071 bp (tenrec; Echinops telfairi). The length of a chicken (Gallus gallus) CCR5 ORF in the GenBank database (NM_001045834) was substantially shorter (867 bp) than those of the orthologous mammalian genes. The NM_001045834 sequence was predicted from a genomic sequence of the chicken chemokine receptor cluster. To explore the possibility that this prediction was not accurate and that the length of the chicken CCR5 ORF exceeded 867 bp, a search of the chicken GenBank EST database was performed using NM_001045834. Two ESTs that did not overlap each other were initially identified. The 3' part of the NM_001045834 sequence was found to be identical to the 5' part of EST AW239706 (reverse complement). This EST has a stop codon that is in frame with the NM_001045834 ORF as well as a 3' NCR of 292 bp. Another EST, BI393893, contained the 5' portion of the chicken CCR5 mRNA, which consisted of a 257-bp 5' NCR and an additional in-frame translation initiation codon located 201 bp upstream of the translation initiation codon predicted in the NM_001045834 sequence. When the chicken GenBank database was searched with the BI393893 sequence, 2 additional short ESTs, CK613374 and DT658610, were detected. Neither CK613374 nor DT658610 overlapped the NM_001045834 sequence, but both had translation initiation codons in positions that were similar to that in the BI393893 sequence. Of the three 5'-terminal chicken CCR5 ESTs detected in GenBank, the DT658610 sequence had the longest 5' NCR (314 bp). Two primers, GgCCR5rnaF and GgCCR5rnaR (Table 1), designed based on the DT658610 and AW239706 sequences, respectively, were used to amplify and sequence the CCR5 transcript from a chicken cDNA (intestine). The resulting full-length mRNA sequence was submitted to GenBank under accession number EF524205. This sequence consisted of a 5' NCR of 247 bp, an ORF of 1068 bp, and a 3' NCR of 118 bp. It is expected that neither the 5' nor the 3' NCR sequence in the EF524205 sequence is complete.
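The ORF lengths quoted above are mutually consistent: moving the start codon 201 bp upstream of the predicted 867-bp ORF gives 867 + 201 = 1068 bp, and because 201 is a multiple of 3 the upstream ATG is in frame. A minimal sketch of this check follows; the function name is hypothetical, and the codon count assumes that the quoted ORF lengths include the stop codon.

```python
def extended_orf_length(predicted_orf_bp, upstream_start_offset_bp):
    """Length of an ORF after moving its start codon upstream.

    Hypothetical helper: raises ValueError if the upstream start codon
    would not be in frame with the predicted ORF.
    """
    if upstream_start_offset_bp % 3 != 0:
        raise ValueError("upstream start codon is not in frame")
    return predicted_orf_bp + upstream_start_offset_bp


length = extended_orf_length(867, 201)
print(length)       # prints 1068
print(length // 3)  # prints 356 (codons, counting the stop codon under the assumption above)
```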


Table 4. Lengths of ORFs and GenBank accession numbers of selected vertebrate CCR5, CCR2, and CCR3 genes

 
Synteny and Evolution of the Vertebrate CCR2 and CCR5 Genes
The CCR2 and CCR5 genes are members of a cluster of 8 chemokine receptor genes that are highly conserved among mammals. In the human genome, this cluster is located in a 500-kb region of chromosome 3 between the LZTFL1 and LTF genes (Figure 4). A search of the UCSC genome browser revealed that the gene order in the corresponding region of dog chromosome 20 is identical to that in humans. In addition to the CCR genes that were identified in the human and dog clusters, a unique equid CCR5/2 gene was found in the horse (chromosome 16), donkey, and zebra genomes (see above). Mouse and rat have identical CCR gene orders but the rodent CCRL2 gene has been translocated to a different region of the same chromosome (chromosome 8 and 9 in rat and mouse, respectively). In opossum (Monodelphis domestica) and chicken, the CCR cluster is located in chromosome 5 and 2, respectively, but the CCRL2 gene is absent. Also, there is a single ortholog of the CCR1 and CCR3 genes in both the chicken and lizard (Anolis carolinensis) genomes (Figure 4). In the frog (Xenopus tropicalis), only one ortholog of the mammalian CCR5 and CCR2 genes was found within genomic scaffold_36 (data not shown), suggesting that tandem duplication of these 2 genes occurred after divergence of amphibians but before the split of the reptilian, avian, and mammalian lineages from a common ancestor (DeVries et al. 2006).


Figure 4. Maps of vertebrate CCR gene clusters compiled from the whole-genome sequencing projects represented at the UCSC web site (http://genome.ucsc.edu). Arrows indicate direction of RNA transcription. The CCR2 genes are indicated by light gray, the CCR5 genes by white, and the other CCR genes by black arrows. The genes flanking the CCR cluster are indicated in dark gray. The diagrams shown are not to scale.

 
To analyze the relationships of the CCR2 and CCR5 genes, a phylogenetic tree was constructed using the closest paralog, the CCR3 gene for each species, as an outgroup. For several mammalian species, the orthologous CCR5 genes grouped in a separate subtree (Figure 5). However, the lizard, chicken, opossum, platypus, rabbit, guinea pig (Cavia porcellus), cat, and rodent CCR2 and CCR5 genes, but not the paralogous mammalian CCR3 genes, were present in species-specific clusters in the tree. This clustering contradicted the paralogous nature of these genes as inferred from their chromosomal location and suggested intensive concerted evolution of these genes by gene conversion. Evidence of concerted evolution of the CCR2 and CCR5 genes was previously reported in mice (Shields 2000), leporids (Carmo et al. 2006) and felids (Esteves et al. 2007; Vazquez-Salat et al. 2007).


Figure 5. Phylogenetic tree of selected vertebrate CCR2, CCR5, and equid CCR5/2 genes. The sequences of mammalian CCR3 genes were used as the outgroups. The numbers at the tree nodes indicate the corresponding bootstrap values estimated by 1000 bootstrap replications.

 
To compare the evolution rates in both paralogous and orthologous genes, the distribution of divergent and convergent substitutions among various lineages was analyzed as previously described (Perelygin et al. 2006b). The goal of this analysis was to estimate the background level of convergent substitution between evolutionarily unrelated lineages (different genes in different species) and to compare it to the corresponding occurrence of convergent substitutions in paralogous genes (different genes in the same species). If the paralogous genes evolved independently from each other, their convergent substitution rates would be equal to those within orthologous genes. The results of this analysis (Table 5) demonstrated that the rate of convergent substitution was approximately the same between paralogous and orthologous CCR2 and CCR3 genes, indicating the absence of conversion between these 2 genes. However, the convergent substitution rate between the paralogous CCR2 and CCR5 genes was significantly higher than the corresponding rate between the orthologous genes. For the human–mouse and human–opossum comparisons, the number of convergent substitutions was close to or exceeded the number of divergent substitutions, suggesting that the duplication and divergence of CCR2 and CCR5 genes occurred prior to mammalian speciation. When gene divergence occurs prior to speciation and convergent substitutions between the paralogs then match or outnumber the divergent ones, phylogenetic analysis cannot correctly infer the true relationship between genes and species. This was the major reason that species-specific clustering of the rodent, opossum, and many other paralogous genes was observed in the tree.


Table 5. Divergent and convergent substitution rates in paralogous and orthologous lineages of CCR2/CCR5 and CCR2/CCR3 gene pairs

 
The hypothesis that CCR2 and CCR5 gene duplication occurred prior to speciation is supported by the increased similarity of the 5' ends of the orthologous gene sequences in the alignment that was used to build the phylogenetic tree. The majority of the divergent gaps are located at the beginning of paralogous genes (data not shown). The alternative hypothesis that species divergence occurred prior to gene duplication would assume an independent origin of each of the gaps in several lineages. Because gap generation is a relatively rare evolutionary event and gene duplication would have had to occur independently many times in the lizard, chicken, platypus, opossum, rodent, and primate lineages, this alternative hypothesis seems improbable.

Region-specific patterns of gene conversion have been reported in several other gene families (Higgs et al. 1984; Zhao et al. 1998; Lazzaro and Clark 2001; Archibald and Roger 2002a, 2002b; Bettencourt and Feder 2002; Desjardins et al. 2002; Drouin 2002; Israel et al. 2002; Nagawa et al. 2002; Annilo et al. 2003; Perelygin et al. 2006b; Kruithof et al. 2007). To study the distribution of gene conversion events along the CCR5 and CCR2 genes, we applied the Kolmogorov–Smirnov test as described previously (Perelygin et al. 2006b). In this approach, the distribution of nucleotide substitutions between a pair of paralogous sequences is compared with that in a pair of orthologous sequences. A gene conversion event results in a high similarity between affected parts of "donor" and "acceptor" genes and a relatively high divergence between the remaining parts of the 2 genes. Therefore, in the case of a recent gene conversion in a particular species, the distributions between paralogous genes will deviate from the distributions between orthologous genes, which can be detected statistically by the nonparametric Kolmogorov–Smirnov test. The most significant deviation was observed for each of the species that form clusters of paralogous CCR2 and CCR5 genes in the tree (Figure 5): lizard (P = 10^-47), chicken (P = 10^-38), guinea pig (P = 10^-28), platypus (P = 10^-12), cat and rodents (P = 10^-9), opossum, rabbit, and rhesus monkey (P = 10^-6). In all these cases, the observed deviations can be explained by one or several recent gene conversion events affecting large portions of the genes. Actually, all regions of these genes with the exception of their 5' ends were involved in gene conversion in different species. The 5' ends differ significantly between CCR2 and CCR5 genes and can serve as reliable markers in combination with gene order for establishing orthologous relationships between these genes in various species.
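The comparison described above can be sketched in a few lines. The code below is not the implementation from Perelygin et al. (2006b); it is a minimal illustration, under the assumption that the test is applied to the alignment positions at which each pair of sequences differs. It computes only the two-sample Kolmogorov–Smirnov statistic D (the largest gap between the two empirical cumulative distributions); converting D to a P value is left to a statistics package. All names are hypothetical.

```python
def substitution_positions(seq_a, seq_b):
    """Alignment positions (0-based) where two aligned sequences differ.

    Positions with a gap ('-') in either sequence are ignored.
    """
    return [
        i
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b and a != "-" and b != "-"
    ]


def ks_two_sample(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic D for two lists of positions."""
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(xs[i], ys[j])
        # advance both samples past the current value so ties are handled together
        while i < n and xs[i] == x:
            i += 1
        while j < m and ys[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


# Toy example: paralog differences confined to the 5' end, ortholog differences spread out
print(ks_two_sample([0, 1, 2, 3], [0, 10, 20, 30]))  # prints 0.75
```

A large D for the paralogous pair relative to the orthologous pair is what would be expected when a conversion tract has erased differences over part of the gene.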

For most of the other placental mammals, deviations in the distributions of nucleotide substitutions were not as significant (P > 10^-6) as the deviations reported above. Therefore, the correct topology of the phylogenetic tree would not be substantially affected for these mammals. However, comparison of the numbers of convergent substitutions between paralogous and orthologous genes (Table 5) revealed significant gene conversion in these species, suggesting that gene conversion is an ongoing process that randomly affects small portions of the CCR2 and CCR5 genes in a number of mammalian species.

Origin and Evolution of the Equid CCR5/2 Gene
Among the species for which genomes have been sequenced to date, the equids are the only ones with an additional CCR5/2 gene. The CCR5/2 gene likely originated from an unequal crossing over event between the CCR2 and CCR5 genes (Figure 2A) and would be expected to have a CCR5-like 5' end and a CCR2-like 3' end. To confirm this assumption and to identify the most probable recombination site, the cumulative distributions of nucleotide differences were calculated between 3 pairs of genes: CCR2 and CCR5 (S25), CCR2 and CCR5/2 (S2x), and CCR5 and CCR5/2 (S5x). The recombination site was located using break points in the lines of a phase diagram of the distributions for S2x or S5x plotted versus the distribution for S25 (Figure 2B). The break point was estimated to be located between positions 394 and 402 bp in the CCR5/2 gene. This region encodes the inner loop located between the TM3 and TM4 domains. The CCR5-like portion of the chimeric protein contains the N-terminal sequence and the first 3 TM domains. The CCR2-like portion includes the remaining 4 TM domains and the C-terminal domain. Because the CCR2-like portion is larger, the equid CCR5/2 gene clustered with the horse CCR2 gene in the tree (Figure 5). The presence of the divergent CCR5-like portion increased the length of the CCR5/2 branch of the tree.
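
The idea of locating a recombination site from cumulative difference counts can be sketched in Python as follows. This is a simplified stand-in for the phase-diagram reading described above, not the authors' procedure: it scans every candidate site and keeps those that minimize the number of mismatches between the chimera and CCR5 upstream of the site plus the mismatches between the chimera and CCR2 downstream of it. The function names and the 12-bp toy sequences are hypothetical.

```python
from itertools import accumulate


def cumulative_differences(seq_a, seq_b):
    """Element k is the number of differing sites among the first k aligned sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [0] + list(accumulate(int(a != b) for a, b in zip(seq_a, seq_b)))


def estimate_breakpoint(ccr2, ccr5, chimera):
    """Return the (first, last) candidate break sites for a 5'-CCR5 / 3'-CCR2 chimera.

    A break site k means that the first k sites derive from CCR5 and the rest from CCR2.
    """
    s2x = cumulative_differences(ccr2, chimera)  # analogous to S2x in the text
    s5x = cumulative_differences(ccr5, chimera)  # analogous to S5x in the text
    n = len(chimera)
    # Mismatches to CCR5 before site k plus mismatches to CCR2 from site k onward.
    costs = [s5x[k] + (s2x[n] - s2x[k]) for k in range(n + 1)]
    best = min(costs)
    candidates = [k for k, cost in enumerate(costs) if cost == best]
    return candidates[0], candidates[-1]


# Toy data (hypothetical): the two parental genes differ at sites 0, 3, 6, and 9,
# and the chimera takes its first 5 sites from CCR5 and the rest from CCR2.
ccr2 = "ACGTACGTACGT"
ccr5 = "TCGAACCTAGGT"
chimera = ccr5[:5] + ccr2[5:]

print(estimate_breakpoint(ccr2, ccr5, chimera))  # (4, 6)
```

The result is an interval rather than a single site because the break can only be placed between two sites that distinguish the parental genes, which is consistent with the break point in the text being reported as a range of positions.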

As an additional means of characterizing CCR5/2 gene evolution, the numbers of synonymous and nonsynonymous substitutions were counted separately in the CCR5 and CCR2 portions of the chimeric gene (Figure 6). Although the number of synonymous substitutions in the CCR5/2 gene was close to that in the corresponding parts of the CCR5 and CCR2 genes, the number of nonsynonymous substitutions in the chimeric gene was about 3–4 times higher than those in the original genes. The CCR5/2 gene may have evolved without any functional restrictions or under selection that supports nonsynonymous substitutions in functional domains. The distribution of the nonsynonymous substitutions along the sequence of the CCR5/2 gene was uniform and did not appear to be associated with any known functional sites, suggesting that there are no functional restrictions on the evolution of the CCR5/2 gene. It is likely that the horse CCR5/2 protein currently has no specific functional activity and that both the CCR2 and CCR5 proteins perform their regular cellular functions.
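
To make the distinction between the two substitution classes concrete, the Python sketch below classifies codon differences between two aligned coding sequences using the standard genetic code. It is a deliberately naive count, assumed here only for illustration: codons that differ at more than one position are set aside rather than resolved, and no correction for multiple hits is applied, so it is not the estimation method used to produce Figure 6. The function name and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Return (synonymous, nonsynonymous, skipped) codon differences.

    Only codons that differ at exactly one position are classified; codons
    with two or three differences are counted as skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    synonymous = nonsynonymous = skipped = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        n_diff = sum(a != b for a, b in zip(codon_a, codon_b))
        if n_diff == 0:
            continue
        if n_diff > 1:
            skipped += 1
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous, skipped


# Toy data (hypothetical): CTG -> CTA keeps leucine, AAA -> AGA changes lysine to arginine.
print(count_substitutions("ATGCTGAAAGGC", "ATGCTAAGAGGC"))  # (1, 1, 0)
```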


Figure 6. Ratios of nonsynonymous to synonymous substitutions in the horse CCR5/2 gene compared with the human and horse CCR5 and CCR2 genes. (A) Tree corresponding to the CCR5-like portion (positions from 1 to 396 bp). (B) Tree corresponding to the CCR2-like portion (positions from 397 to 1068 bp) of the CCR5/2 gene.

 
Similar numbers of accumulated substitutions were found in the human and horse lineages: 75.4 ± 8.7 and 71.0 ± 8.4, respectively, in the CCR5 genes and 84.3 ± 9.2 and 89.7 ± 9.5, respectively, in the CCR2 genes. These calculations suggested similar substitution rates in ungulates and primates after their divergence from a common ancestor and made it possible to estimate the time when recombination between the horse CCR2 and CCR5 genes occurred. According to the data shown in Figure 6, the total number of synonymous and nonsynonymous substitutions between the horse CCR2 and CCR5 genes prior to origination of the CCR5/2 gene was 46 (8 + 10 + 12 + 16). After the unequal crossing over event between the CCR2 and CCR5 genes, 32 (7 + 6 + 11 + 8) substitutions accumulated. The divergence between primates and ungulates was estimated to have occurred about 95 million years ago (Ma) (Soligo et al. 2007). Therefore, assuming a constant substitution rate, the age of the horse CCR5/2 gene is estimated to be about 95 × 32/(32 + 46) ≈ 39 Ma.
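
The arithmetic behind this estimate can be reproduced with a few lines of Python. The substitution counts and the 95 Ma divergence time are the values quoted in the paragraph above; the function name `chimera_age` is hypothetical, and the calculation assumes that substitutions accumulated at a constant rate before and after the crossing over event.

```python
def chimera_age(divergence_time_ma, subs_before, subs_after):
    """Scale the divergence time by the share of substitutions that arose after the event."""
    total_before = sum(subs_before)
    total_after = sum(subs_after)
    return divergence_time_ma * total_after / (total_after + total_before)


# Counts quoted in the text: 46 substitutions before and 32 after the recombination.
age = chimera_age(95, subs_before=[8, 10, 12, 16], subs_after=[7, 6, 11, 8])
print(round(age))  # 39
```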


    Conclusions
Evidence for concerted evolution described in this study provides an explanation for the high sequence similarity between the vertebrate CCR2 and CCR5 genes. Special precautions are therefore required to discriminate the sequences of these paralogous genes in DNA samples, and it is also likely that antibodies made to either CCR2 or CCR5 proteins would cross-react. This would also be the case for the CCR5/2 gene and gene product in equids. Although not yet experimentally confirmed, the biological properties (e.g., antiviral activities) of individual CCR proteins previously observed in particular vertebrate lineages (e.g., humans and mice) may also be conserved in other species (e.g., equids). The resequencing of the CCR5 genes in a limited number of control horses revealed synonymous but no nonsynonymous or missense mutations. Sequence analysis of much larger numbers of DNA samples, especially those from WNV-infected horses that develop CNS disease, may reveal a low-frequency mutation similar to the rare human CCR5Δ32 mutation that inactivates the equine CCR5 protein function and increases susceptibility to infectious disease. The SNPs identified will be useful for developing a horse haplotype map.


    Funding
National Center for Infectious Diseases, Centers for Disease Control and Prevention (CI000216).


    Acknowledgments
 
The authors would like to thank J. Lundquist for FISH technical support and all of the researchers who supplied animal samples. This work was done in connection with project 07-14-116 at the University of Kentucky Agricultural Experiment Station.


    Footnotes
 
Corresponding Editor: Ernest Bailey

Received October 26, 2007
Accepted April 10, 2008


    References

    Annilo T, Chen ZQ, Shulenin S, Dean M. Evolutionary analysis of a cluster of ATP-binding cassette (ABC) genes. Mamm Genome (2003) 14:7–20.

    Archibald JM, Roger AJ. Gene conversion and the evolution of euryarchaeal chaperonins: a maximum likelihood-based method for detecting conflicting phylogenetic signals. J Mol Evol (2002a) 55:232–245.

    Archibald JM, Roger AJ. Gene duplication and gene conversion shape the evolution of archaeal chaperonins. J Mol Biol (2002b) 316:1041–1050.

    Bettencourt BR, Feder ME. Rapid concerted evolution via gene conversion at the Drosophila hsp70 genes. J Mol Evol (2002) 54:569–586.

    Carmo CR, Esteves PJ, Ferrand N, van der Loo W. Genetic variation at chemokine receptor CCR5 in leporids: alteration at the 2nd extracellular domain by gene conversion with CCR2 in Oryctolagus, but not in Sylvilagus and Lepus species. Immunogenetics (2006) 58:494–501.

    Charo IF, Myers SJ, Herman A, Franci C, Connolly AJ, Coughlin SR. Molecular cloning and functional expression of two monocyte chemoattractant protein 1 receptors reveals alternative splicing of the carboxyl-terminal tails. Proc Natl Acad Sci USA (1994) 91:2752–2756.

    Deng H, Liu R, Ellmeier W, Choe S, Unutmaz D, Burkhart M, Di Marzio P, Marmon S, Sutton RE, Hill CM, et al. Identification of a major co-receptor for primary isolates of HIV-1. Nature (1996) 381:661–666.

    Desjardins PR, Burkman JM, Shrager JB, Allmond LA, Stedman HH. Evolutionary implications of three novel members of the human sarcomeric myosin heavy chain gene family. Mol Biol Evol (2002) 19:375–393.

    DeVries ME, Kelvin AA, Xu L, Ran L, Robinson J, Kelvin DJ. Defining the origins and evolution of the chemokine/chemokine receptor system. J Immunol (2006) 176:401–415.

    Dragic T, Litwin V, Allaway GP, Martin SR, Huang Y, Nagashima KA, Cayanan C, Maddon PJ, Koup RA, Moore JP, et al. HIV-1 entry into CD4+ cells is mediated by the chemokine receptor CC-CKR-5. Nature (1996) 381:667–673.

    Drouin G. Characterization of the gene conversions between the multigene family members of the yeast genome. J Mol Evol (2002) 55:14–23.

    Esteves PJ, Abrantes J, van der Loo W. Extensive gene conversion between CCR2 and CCR5 in domestic cat (Felis catus). Int J Immunogenet (2007) 34:321–324.

    Glass WG, Lim JK, Cholera R, Pletnev AG, Gao JL, Murphy PM. Chemokine receptor CCR5 promotes leukocyte trafficking to the brain and survival in West Nile virus infection. J Exp Med (2005) 202:1087–1098.

    Glass WG, McDermott DH, Lim JK, Lekhong S, Yu SF, Frank WA, Pape J, Cheshier RC, Murphy PM. CCR5 deficiency increases risk of symptomatic West Nile virus infection. J Exp Med (2006) 203:35–40.

    He J, Chen Y, Farzan M, Choe H, Ohagen A, Gartner S, Busciglio J, Yang X, Hofmann W, Newman W, et al. CCR3 and CCR5 are co-receptors for HIV-1 infection of microglia. Nature (1997) 385:645–649.

    Higgs DR, Hill AV, Bowden DK, Weatherall DJ, Clegg JB. Independent recombination events between the duplicated human alpha globin genes; implications for their concerted evolution. Nucleic Acids Res (1984) 12:6965–6977.

    Howard OM, Ben-Baruch A, Oppenheim JJ. Chemokines: progress toward identifying molecular targets for therapeutic agents. Trends Biotechnol (1996) 14:46–51.

    Israel RL, Kosakovsky Pond SL, Muse SV, Katz LA. Evolution of duplicated alpha-tubulin genes in ciliates. Evolution Int J Org Evolution (2002) 56:1110–1122.

    Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol (1980) 16:111–120.

    Kruithof EK, Satta N, Liu JW, Dunoyer-Geindre S, Fish RJ. Gene conversion limits divergence of mammalian TLR1 and TLR6. BMC Evol Biol (2007) 7:148.

    Lazzaro BP, Clark AG. Evidence for recurrent paralogous gene conversion and exceptional allelic divergence in the Attacin genes of Drosophila melanogaster. Genetics (2001) 159:659–671.

    Li WH. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J Mol Evol (1993) 36:96–99.

    Lim JK, Glass WG, McDermott DH, Murphy PM. CCR5: no longer a "good for nothing" gene—chemokine control of West Nile virus infection. Trends Immunol (2006) 27:308–312.

    Mummidi S, Ahuja SS, McDaniel BL, Ahuja SK. The human CC chemokine receptor 5 (CCR5) gene. Multiple transcripts with 5'-end heterogeneity, dual promoter usage, and evidence for polymorphisms within the regulatory regions and noncoding exons. J Biol Chem (1997) 272:30662–30671.

    Nagawa F, Yoshihara S, Tsuboi A, Serizawa S, Itoh K, Sakano H. Genomic analysis of the murine odorant receptor MOR28 cluster: a possible role of gene conversion in maintaining the olfactory map. Gene (2002) 292:73–80.

    Perelygin AA, Lear TL, Zharkikh AA, Brinton MA. Structure of equine 2'-5'oligoadenylate synthetase (OAS) gene family and FISH mapping of OAS genes to ECA8p15–>p14 and BTA17q24–>q25. Cytogenet Genome Res (2005) 111:51–56.

    Perelygin AA, Lear TL, Zharkikh AA, Brinton MA. Comparative analysis of vertebrate EIF2AK2 (PKR) genes and assignment of the equine gene to ECA15q24-q25 and the bovine gene to BTA11q12-q15. Genet Sel Evol (2006a) 38:551–563.

    Perelygin AA, Zharkikh AA, Scherbik SV, Brinton MA. The mammalian 2'-5' oligoadenylate synthetase gene family: evidence for concerted evolution of paralogous Oas1 genes in Rodentia and Artiodactyla. J Mol Evol (2006b) 63:562–576.

    Salazar P, Traub-Dargatz JL, Morley PS, Wilmot DD, Steffen DJ, Cunningham WE, Salman MD. Outcome of equids with clinical signs of West Nile virus infection and factors associated with death. J Am Vet Med Assoc (2004) 225:267–274.

    Schuler LA, Khaitsa ML, Dyer NW, Stoltenow CL. Evaluation of an outbreak of West Nile virus infection in horses: 569 cases (2002). J Am Vet Med Assoc (2004) 225:1084–1089.

    Shields DC. Gene conversion among chemokine receptors. Gene (2000) 246:239–245.

    Shinkai H, Morozumi T, Toki D, Eguchi-Ogawa T, Muneta Y, Awata T, Uenishi H. Genomic structure of eight porcine chemokine receptors and intergene sharing of an exon between CCR1 and XCR1. Gene (2005) 349:55–66.

    Soligo C, Will OA, Tavaré S, Marshall CR, Martin RD. New light on the dates of primate origins and divergence. In: Primate origins: adaptations and evolution—Ravosa MJ, Dagosto M, eds. (2007) New York: Springer. 29–49.

    Vazquez-Salat N, Yuhki N, Beck T, O'Brien SJ, Murphy WJ. Gene conversion between mammalian CCR2 and CCR5 chemokine receptor genes: a potential mechanism for receptor dimerization. Genomics (2007) 90:213–224.

    Ward MP, Schuermann JA, Highfield LD, Murray KO. Characteristics of an outbreak of West Nile virus encephalomyelitis in a previously uninfected population of horses. Vet Microbiol (2006) 118:255–259.

    Ye S, Dhillon S, Ke X, Collins AR, Day IN. An efficient procedure for genotyping single nucleotide polymorphisms. Nucleic Acids Res (2001) 29:E88.

    Zhao Z, Hewett-Emmett D, Li WH. Frequent gene conversion between human red and green opsin genes. J Mol Evol (1998) 46:494–496.

