The Journal of Heredity 2002:93(2)
© 2002 The American Genetic Association 93:153-154
Computer Note
IBD (Isolation by Distance): A Program for Analyses of Isolation by Distance
From the Department of Biology, San Diego State University, San Diego, CA 92182-4614.
Address correspondence to A. J. Bohonak at the address above or e-mail: bohonak{at}sciences.sdsu.edu.
| Introduction |
|---|
The genetic similarity among individuals or populations can be ascertained using a number of statistical techniques (reviewed by Bohonak 1999; Neigel 1997; Roderick 1996; Slatkin 1985). When populations can be defined a priori, one option is to analyze genetic "isolation by distance" (sensu Wright 1943) by plotting the genetic similarity (or distance) among population pairs as a function of the geographic distance between those pairs. Slatkin (1993) suggested $\hat{M} = (1/F_{ST} - 1)/4$ as an appropriate similarity measure, although other approaches are possible (e.g., Epperson and Li 1996). Qualitative and statistical analyses of isolation by distance can reveal much about population genetic structure. The primary use for plots of (genetic) isolation by (geographic) distance is to assess whether more distant population pairs are more different genetically. However, these plots can also be used to test the validity of simpler models of population structure (e.g., island or hierarchical island models). Isolation by distance analyses may help separate the effects of population history from ongoing gene flow, and test the explanatory power of alternative dispersal pathways (Slatkin 1994). For example, one might assess whether the distance along a river or a topographic isocline is more biologically relevant than distance "as the crow flies." The influence of geographic features or specific life-history traits on population differentiation can also be tested. Peterson and Denno (1998) contrasted isolation by distance slopes and intercepts in species with different dispersal abilities.
IBD version 1.1 is a program written in C and compiled for Macintosh and Windows that can be used for analyses of isolation by distance. This program provides a number of unique features: isolation by distance slopes and intercepts are calculated using reduced major axis (RMA) regression, confidence intervals are generated based on several different assumptions regarding data structure, and statistical significance is determined using Mantel tests. The program is freely available at http://www.bio.sdsu.edu/pub/andy/IBD.html.
| Rationale |
|---|
Studies of isolation by distance typically seek to ascertain (1) whether there is a statistically significant relationship between genetic distance (or similarity) and geographic distance, and (2) the strength of this relationship. Significance is usually assessed by asking whether the pairwise genetic distance matrix is correlated with the pairwise geographic distance matrix using a Mantel test (see Manly 1994). For the genetic distance matrix A and the geographic distance matrix B, the test statistic is calculated as $Z = \sum_{i,j} A_{ij} B_{ij}$. IBD also reports an alternative statistic, r, which provides a standardized Z that ranges from -1 to 1 (Manly 1994). Significance is assessed by comparing $Z_{\mathrm{actual}}$ to a distribution of Z scores obtained by randomizing rows/columns of the B matrix and holding A constant. The IBD application provides one-tailed P values for this distribution. Because matrix rows (populations) are treated as a single unit, the Mantel test is a more appropriate way to assess significance than alternatives which assume that each population pair is independent.
A logical way to quantify the strength of the isolation by distance relationship is to calculate the slope and intercept of genetic similarity or distance against geographic distance. Based on simulations, Hellberg (1994) suggested that RMA regression is more appropriate for this purpose than standard ordinary least squares (OLS) regression. (In general, RMA is less biased when the independent variable is measured with error.) McArdle (1988) suggests that RMA be used when the error rate in x exceeds one-third of the error rate in y. IBD calculates the RMA slope and intercept using the formulas provided by Sokal and Rohlf (1981).
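The Mantel procedure described above can be illustrated with a minimal Python sketch. It is not IBD's C source: the function names are hypothetical, Z is summed once per population pair (which differs from a sum over all i,j only by a constant factor for symmetric matrices), and counting the observed arrangement as one of the permutations is an assumption of this sketch rather than something the text specifies.

```python
import math
import random


def upper_triangle(m):
    """Return the off-diagonal elements m[i][j] with i < j as a flat list."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def mantel_z(a, b):
    """Z = sum of A_ij * B_ij over population pairs (each pair counted once)."""
    return sum(x * y for x, y in zip(upper_triangle(a), upper_triangle(b)))


def mantel_r(a, b):
    """Pearson correlation of the paired matrix elements (ranges from -1 to 1)."""
    xs, ys = upper_triangle(a), upper_triangle(b)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(a, b, n_perm=1000, seed=0):
    """One-tailed permutation P value for Z: permute B, hold A constant."""
    rng = random.Random(seed)
    n = len(a)
    z_actual = mantel_z(a, b)
    at_least = 1  # assumption: the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        # Reorder rows and columns of B together so each population stays a unit.
        b_perm = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if mantel_z(a, b_perm) >= z_actual:
            at_least += 1
    return z_actual, mantel_r(a, b), at_least / (n_perm + 1)
```

Permuting rows and columns with the same ordering is what keeps each population intact as a unit, which is the property the text relies on when it prefers the Mantel test over tests that treat population pairs as independent.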
| Input File Format |
|---|
IBD reads generic ASCII (text) files. The input file must be a text file named "IBD.#" (where # is replaced by a number) and must be in the same folder as the application. Two file formats are recognized; each can be generated by saving a text-only file from a spreadsheet application. For the pairwise distance format, genetic distances are entered one population pair per line as "population A", "population B", "genetic distance", where "population A/B" is replaced by a number. Pairwise geographic distances are then entered line by line in the same manner.
For diploid, codominant markers (e.g., allozymes, microsatellites), genotypes may also be entered in a raw data format. Each line of the input file lists the genotypes at all loci for a single individual, beginning with the population number. Diploid genotypes at each locus are designated with two numbers separated by a comma (e.g., 1,1 2,3 4,6). Missing genotypes at a locus are coded 0,0. The end of the genetic data is indicated by a population number of 0, followed by genotypes of 0,0. The geographic distances between all population pairs are then entered as described for the pairwise distance format.
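To make the raw data format concrete, the following Python sketch parses genotype lines of the kind described above. It is an illustration only: the function names are hypothetical, and it assumes that fields are separated by whitespace, which the text does not state (the IBD manual is the authority on delimiters).

```python
def parse_genotype_line(line):
    """Parse one individual: a population number, then one 'a,b' genotype per locus.

    Missing genotypes (coded 0,0) are returned as None.
    """
    fields = line.split()  # assumption: whitespace-delimited fields
    population = int(fields[0])
    genotypes = []
    for field in fields[1:]:
        allele_1, allele_2 = (int(x) for x in field.split(","))
        if (allele_1, allele_2) == (0, 0):
            genotypes.append(None)
        else:
            genotypes.append((allele_1, allele_2))
    return population, genotypes


def read_genotypes(lines):
    """Group individuals by population, stopping at the population-0 terminator."""
    populations = {}
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines (a convenience of this sketch)
        population, genotypes = parse_genotype_line(line)
        if population == 0:
            break  # end of genetic data; geographic distances would follow
        populations.setdefault(population, []).append(genotypes)
    return populations
```

For example, `read_genotypes(["1 1,1 2,3 4,6", "0 0,0 0,0 0,0"])` yields a single individual in population 1 with three scored loci.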
Program maxima for IBD consist of 100 populations, 30 loci, 20 alleles per locus, 1000 individuals per population, and $1 \times 10^5$ randomizations/bootstraps (see below). The IBD application folder contains example files and a manual with more detailed information on input file formats.
| Output |
|---|
For the raw data format only, IBD provides (1) allele counts and heterozygosity for each population and locus, (2) locus-specific and overall $F_{ST}$ for each population pair, estimated using the methods of Weir (1990), and (3) Slatkin's (1993) similarity measure $\hat{M} = (1/F_{ST} - 1)/4$ for each population pair.
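As a small worked illustration, Slatkin's (1993) similarity measure, (1/FST - 1)/4, can be computed from a pairwise FST estimate as in the Python sketch below. The function name is hypothetical, and rejecting FST values outside (0, 1] is a choice made for this sketch; the text does not say how IBD treats zero or negative estimates.

```python
def slatkin_similarity(fst):
    """Return (1/FST - 1)/4 for a pairwise FST estimate."""
    if not 0 < fst <= 1:
        # Assumption of this sketch: the measure is treated as undefined here.
        raise ValueError("FST must lie in (0, 1] for this measure")
    return (1.0 / fst - 1.0) / 4.0
```

The measure falls as differentiation rises: an FST of 0.5 gives 0.25, and an FST of 1 gives 0.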
If geographic distances are available from either input file format, IBD will perform a Mantel test as described above, using the number of matrix randomizations requested. The slope and intercept from RMA regressions are calculated following Sokal and Rohlf (1981). When raw genotypic data are entered, the dependent variable is $\hat{M}$; otherwise the genetic distances provided in the input file are used. Error estimation for RMA regression is considered using five methods:
- Standard linear model formulas (Sokal and Rohlf 1981).
- Jackknife over population pairs (i.e., each point on the graph): one-delete jackknife estimates of the slope, intercept, and associated standard errors are calculated following Weir (1990). The 95% and 99% confidence intervals are provided for each.
- One-delete jackknife over populations.
- Bootstrapping over population pairs: confidence intervals are calculated by creating new "pseudoreplicate" datasets, each with the same number of population pairs, by random sampling with replacement. The middle 95% and 99% of the bootstrap pseudoreplicates constitute the confidence intervals.
- Bootstrapping over independent population pairs: random datasets are created by sampling completely independent population pairs. For p populations, each dataset will contain p/2 population pairs if p is even, or (p - 1)/2 pairs if p is odd. For example, if p = 6 populations, then one pseudoreplicate might consist of {(1,4), (3,6), (2,5)}.
Finally, the genetic and geographic distances are log-transformed (following Slatkin 1993) and the RMA analyses are repeated.
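The RMA fit and the one-delete jackknife over population pairs can be sketched in Python as follows. This is an illustration under stated assumptions, not IBD's implementation: the function names are hypothetical, the slope is taken as the ratio of standard deviations carrying the sign of the covariance (the usual RMA definition), and the jackknife uses the standard pseudo-value formulation, which is assumed here to correspond to the procedure the text attributes to Weir (1990).

```python
import math


def rma(xs, ys):
    """Reduced major axis slope and intercept for paired observations.

    Requires some variation in xs; otherwise the slope is undefined.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope, my - slope * mx


def jackknife_rma(xs, ys):
    """One-delete jackknife over points: (estimate, standard error) per parameter."""
    n = len(xs)
    full = rma(xs, ys)
    pseudo_values = []
    for i in range(n):
        reduced = rma(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        # Pseudo-value: n * (full estimate) - (n - 1) * (leave-one-out estimate)
        pseudo_values.append(
            tuple(n * f - (n - 1) * r for f, r in zip(full, reduced))
        )
    summary = []
    for k in range(2):  # k = 0 is the slope, k = 1 is the intercept
        values = [p[k] for p in pseudo_values]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / (n - 1)
        summary.append((mean, math.sqrt(variance / n)))
    return summary
```

Here `xs` and `ys` are lists of geographic and genetic distances, one entry per population pair. Jackknifing over populations instead would delete every pair involving a given population at each step rather than a single pair.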
| Statistical Considerations |
|---|
As noted by numerous authors, pairwise contrasts in the isolation by distance relationship are not independent; a single population will be involved in multiple contrasts. The Mantel test is expected to provide an appropriate test of significance for isolation by distance because it treats the population (and not the pairwise contrast) as the unit of replication. Similarly, to generate confidence limits for the RMA slope or intercept, bootstrapping over independent population pairs would seem to be the most conservative approach (see above). For a small number of populations, jackknifing over populations provides the next best alternative. IBD is the only currently available software package that provides confidence limits for isolation by distance slopes and intercepts using the population as the unit of replication.
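Bootstrapping over independent population pairs can be sketched in Python as below. The sketch rests on several assumptions not found in the text: all names are hypothetical, distances are held in dictionaries keyed by sorted population pairs, the interval is taken by a simple index rule on the sorted estimates, and pseudoreplicates whose geographic distances are all identical (so that no slope exists) are skipped.

```python
import math
import random


def independent_pairs(populations, rng):
    """Randomly pair populations so that no population appears in two pairs."""
    shuffled = list(populations)
    rng.shuffle(shuffled)
    return [tuple(sorted(shuffled[k:k + 2])) for k in range(0, len(shuffled) - 1, 2)]


def rma_slope(xs, ys):
    """RMA slope, or None when xs has no variation (a choice of this sketch)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return None
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return math.copysign(math.sqrt(syy / sxx), sxy)


def bootstrap_slope_interval(populations, genetic, geographic,
                             n_boot=1000, level=0.95, seed=0):
    """Middle `level` fraction of slopes from independent-pair pseudoreplicates."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        pairs = independent_pairs(populations, rng)
        slope = rma_slope([geographic[p] for p in pairs],
                          [genetic[p] for p in pairs])
        if slope is not None:
            slopes.append(slope)
    slopes.sort()
    tail = (1.0 - level) / 2.0
    low = slopes[int(tail * len(slopes))]
    high = slopes[max(int((1.0 - tail) * len(slopes)) - 1, 0)]
    return low, high
```

With p populations each pseudoreplicate holds p/2 pairs (or (p - 1)/2 when p is odd), so every replicate is built from far fewer points than the full set of pairwise contrasts. That loss of information is why the resulting limits tend to be wide, consistent with the text's description of this approach as the most conservative.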
| Acknowledgments |
|---|
I thank Neil Davies, Francis X. Villablanca, and especially George Roderick for constructive commentary and support. IBD is written in C, compiled using CodeWarrior for Macintosh, and output is via CodeWarrior's SIOUX module. IBD can be downloaded from http://www.bio.sdsu.edu/pub/andy/IBD.html. Source code will be made available upon request.
| Footnotes |
|---|
Corresponding Editor: Stephen J. O'Brien
Received June 21, 2001
Accepted December 31, 2001
| References |
|---|
Bohonak AJ, 1999. Dispersal, gene flow and population structure. Q Rev Biol 74:21-45.
Epperson BK and Li T, 1996. Measurement of genetic structure within populations using Moran's spatial autocorrelation statistics. Proc Natl Acad Sci USA 93:10528-10532.
Hellberg ME, 1994. Relationships between inferred levels of gene flow and geographic distance in a philopatric coral, Balanophyllia elegans. Evolution 48:1829-1854.
Manly BFJ, 1994. Multivariate statistical methods: a primer, 2nd ed. New York: Chapman & Hall.
McArdle BH, 1988. The structural relationship: regression in biology. Can J Zool 66:2329-2339.
Neigel JE, 1997. A comparison of alternative strategies for estimating gene flow from genetic markers. Annu Rev Ecol Syst 28:105-128.
Peterson MA and Denno RF, 1998. The influence of dispersal and diet breadth on patterns of genetic isolation by distance in phytophagous insects. Am Nat 152:428-446.
Roderick GK, 1996. Geographic structure of insect populations: gene flow, phylogeography, and their uses. Annu Rev Entomol 41:325-352.
Slatkin M, 1985. Gene flow in natural populations. Annu Rev Ecol Syst 16:393-430.
Slatkin M, 1993. Isolation by distance in equilibrium and non-equilibrium populations. Evolution 47:264-279.
Slatkin M, 1994. Gene flow and population structure. In: Ecological genetics (Real LA, ed). Princeton, NJ: Princeton University Press; 3-17.
Sokal RR and Rohlf FJ, 1981. Biometry, 2nd ed. New York: WH Freeman.
Weir BS, 1990. Genetic data analysis: methods for discrete population analysis. Sunderland, MA: Sinauer.
Wright S, 1943. Isolation by distance. Genetics 28:114-138.