Journal of Heredity Advance Access originally published online on September 8, 2005
Journal of Heredity 2005 96(6):704-712; doi:10.1093/jhered/esi103
Combining Inter- and Intrapopulation Information with the Weitzman Approach to Diversity Conservation
From the Laboratorio de Genética, Facultad de Veterinaria, Universidad Complutense de Madrid, Spain (García and Cañón); and Departamento de Estadística, Investigación Operativa y D.M., Universidad de Oviedo, Spain (Corral)
Address correspondence to Javier Cañón, Laboratorio de Genética, Facultad de Veterinaria, Universidad Complutense de Madrid, Avda. Puerta de Hierro s/n, 28040 Madrid, Spain, or e-mail: jcanon{at}vet.ucm.es.
| Abstract |
|---|
This article introduces a new perspective on Weitzman's methodology for assessing the distribution of resources in genetic diversity conservation programs. Intrapopulation information is added to the procedure by diffusion process formulas to calculate genetic extinction probabilities, and therefore the marginal diversities and elasticities of diversity. The method was tested with a set of European cattle breeds from Spain and France and provided satisfactory results.
The efficient distribution of economic resources in biodiversity conservation programs is difficult since many economic, productive, morphological, genetic, social, and even affective and aesthetic factors are involved in decision making. In recent years, the preservation of biological diversity has been a headline topic of discussion [see Amos and Balmford (2001) for a short review]. In particular, the conservation of genetic diversity in livestock has received a great deal of attention (Barker 1999; Eding and Meuwissen 2001; Oldenbroek 1999; Ruane 2000). When developing a program for the conservation of genetic resources, special attention must be paid to how genetic diversity is measured, as it is well known that the rate of evolution in natural populations is limited by the degree of genetic variability (Fisher 1930). For small and endangered populations, it is important to avoid homozygosity as much as possible, because any increase leads to less genetic variation, and molecular estimates of inbreeding would suffer a parallel increase. This determines not only survival, but also adaptation to changing environments, including changes in consumer preferences and demands for animal-derived products (Frankham 1995; Hedrick and Kalinowski 2000; Lacy 1997). Considerable effort has been expended to minimize the long-term rate of inbreeding (Sonesson and Meuwissen 2001; Wang 1997; Wang and Hill 2000). Another point of particular interest is how to make decisions on the investment policies to be applied to the different breeds or populations in order to maximize benefits in terms of diversity conservation.
Weitzman (1992) proposed a set of properties to be verified for the proper measurement of genetic diversity (see also Eding and Meuwissen 2001; Thaon d'Arnoldi et al. 1998). These properties make sense both intuitively and algebraically, but none of the usual measures of diversity verify them (Weitzman 1992). However, Weitzman developed a diversity function in which they all hold. This is ultimately based on pairwise distances between operational taxonomic units (OTUs), and it is easily implementable through a recursive algorithm. Since this is indirectly of the clustering type, it provides a graphic representation that has (in the strict sense) erroneously been called a "maximum-likelihood evolutionary tree." However, it supplies much useful information. Examples of its use can be found in Cañón et al. (2001), Laval et al. (2000), and Thaon d'Arnoldi et al. (1998). With this procedure, the contribution of each OTU, or groups of OTUs, to total diversity can easily be calculated, allowing conclusions on conservation policies to be drawn. However, as with any other methodology, there are some drawbacks (Caballero and Toro 2002; Eding and Meuwissen 2001). The most frequently mentioned is that it ignores within-population information. That assertion is not entirely true, since the method can be applied at any level (species, breed, population, and even individual), and therefore a within-population diversity can be computed; the fact remains, however, that the algorithm is a between-units one. A way of incorporating within-population information into the algorithm itself, involving computing Weitzman diversity within each population under study, is currently under development. It is also commonly argued that the method does not take into account different population sizes. The present work fills these gaps by studying the development of gene frequencies within populations in order to compute extinction probabilities.
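The recursive algorithm mentioned above can be sketched as follows. This is a minimal illustration, not the FORTRAN program used in this article: it assumes the recursion given in Weitzman (1992), in which the diversity of a set is the maximum, over its members, of the diversity of the set without that member plus the distance from the member to its nearest remaining neighbor. The function name `weitzman_diversity` and the toy distance matrix are hypothetical.

```python
from functools import lru_cache


def weitzman_diversity(dist, units=None):
    """Weitzman diversity of a set of OTUs (assumed recursion, Weitzman 1992).

    dist  : symmetric matrix (list of lists) of pairwise distances
    units : iterable of indices into dist (default: all rows)
    """
    all_units = frozenset(range(len(dist)) if units is None else units)

    @lru_cache(maxsize=None)
    def v(subset):
        # A single OTU (or the empty set) carries no diversity here.
        if len(subset) <= 1:
            return 0.0
        best = 0.0
        for i in subset:
            rest = subset - {i}
            # Distance from i to its closest neighbor among the remaining OTUs.
            link = min(dist[i][j] for j in rest)
            best = max(best, v(rest) + link)
        return best

    return v(all_units)


# Toy example with three hypothetical OTUs.
toy = [
    [0, 1, 4],
    [1, 0, 5],
    [4, 5, 0],
]
print(weitzman_diversity(toy))  # 6.0
```

Memoizing over subsets keeps the cost manageable for a handful of OTUs, but the number of subsets still grows exponentially with the size of the set, so this sketch is only practical for small examples such as the eight breeds analyzed later.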
| Weitzman's Approach |
|---|
Weitzman (1993) devised a natural way of studying the evolution of diversity over different generations and of weighing the contributions of each OTU more precisely in order to assess the distribution of resources for conservation of genetic diversity programs. Basically it consists of calculating the expected value of a certain diversity measure (the method does not depend on which is chosen) by assigning probabilities to all the possible values of diversity at a particular time. Obviously there are as many values as subsets of the whole set of OTUs, since at a particular time (t) some OTUs may have become extinct and only the remainders count toward the final diversity. These values of probability depend on the probabilities of extinction of each particular OTU. Values are then obtained that measure how much the expected diversity at time t is affected by changes in each probability of extinction. Therefore a very logical and coherent, but conceptually simple way of weighing the OTUs with respect to their influence on expected diversity is constructed. This method has already been applied by Reist-Marti et al. (2003), but they do not escape the fact, already noted in Weitzman (1993), that the choice of these extinction probabilities is strongly subjective, and that is an important drawback of this approach. To circumvent this problem, the present investigation substitutes the concept of extinction with that of genetic extinction, in such a way that these probabilities can be calculated for any generation by using allelic frequencies and the theory of diffusion processes. In addition, this allows within-population diversity information to be brought into the analysis. The original procedure lacked this feature.
This method is a powerful tool for performing combined analyses integrating between- and within-population information. The former comes from pairwise distances, which can be genetic, and is studied by Weitzman's diversity measure. The latter comes from allelic frequencies, and is studied in a second step using Weitzman's marginal diversity and elasticity of diversity (Weitzman 1993). These concepts are first introduced and later the theory of diffusion processes is applied to show how it accommodates Weitzman's model. Finally, an example of its use with a dataset from eight European cattle breeds is shown. All calculations were made using a FORTRAN program that implements the computation of Weitzman's diversity, including all the links, representatives, and percentages of diversity contributed by each OTU and each node of the tree. It also provides the "as if" ultrametric distance matrix, which, when used as an input in any tree-generating software, gives the unique hierarchical tree resulting from Weitzman's algorithm. Finally, for a given generation, the probabilities of genetic extinction, the marginal diversity, and the elasticity of the diversity of each OTU are derived.
| Expected Diversity, Marginal Diversity, and Elasticity of Diversity |
|---|
When establishing a policy of conservation of genetic resources, it is important not only to be able to evaluate current diversity, but to broaden the time horizon and infer the behavior of that diversity a number of generations ahead, and to determine which breeds or species are more influential upon it.
Assuming a trustworthy measure of diversity already exists (whether it is Weitzman's or another), it is of interest to know how the diversity of a set of populations will change, and which of them are more important for the conservation of genetic diversity in the context of biodiversity management programs. It is with this in mind that the concepts of expected diversity, marginal diversity, and elasticity of diversity are proposed.
Let Q be a set of $N_P$ OTUs. Let also $D: \mathcal{P}(Q) \to \mathbb{R}$ be a certain diversity function, and $\{P_i\}_{i=1,\dots,N_P}$ be the probabilities of extinction in one generation for each of them. The theoretical development of the procedure that will be described does not depend on D. It is applicable to any diversity function, as long as for any subset $S \subseteq Q$, the value of D(S) exists and is well defined. To exemplify the use of this methodology, the Weitzman diversity measure has been chosen. Its explicit definition will not be given here, since it can already be found and discussed in Weitzman (1992) and Thaon d'Arnoldi et al. (1998). If, for each subset of OTUs, $S \subseteq Q$, the probability $P_t(S)$ of the OTUs in S having survived after t generations and those in $Q \setminus S$ having become extinct can be found, then D at time t, $D_t$, can be thought of as a random variable whose expected value can easily be calculated. The expected diversity after t generations can therefore be defined as

$$ED_t = E(D_t) = \sum_{S \subseteq Q} P_t(S)\, D_t(S) \tag{1}$$
This is not exactly the expression provided by Weitzman, since he defines the expected diversity as the cumulative sum of generations 1 to t, weighted by a "discount factor." The discount factor has been ignored here because, as shown below, the expected diversity is to be partially differentiated with respect to the probabilities of extinction. Therefore the discount factor provides no information for comparative purposes, and no advantage is to be found in accumulating the sum. Besides, equation 1 is more coherent with the statistical concept of expected value.
Another important point that arises from the choice of Weitzman's diversity as the one to operate with is its very nature as a measure that uses pairwise distances as the input to calculate its value. If these distances do not vary with time, then Dt(S) = D(S), the diversity at the present time. But Dt(S) might be different from D(S) if there is variation in the distances with time, as happens with most genetic distances. It is important therefore to assess the choice of the distance used to calculate the diversity, so that the effect of the chosen evolutionary model can be taken into account.
To calculate equation 1, it is assumed (Weitzman 1993) that the "lifetime" of population i follows an exponential distribution with parameter Pi. Thus Pt(S) is given by
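$$P_t(S) = \prod_{i \in S} e^{-P_i t} \prod_{i \in Q \setminus S} \left(1 - e^{-P_i t}\right) \tag{2}$$

and the marginal diversity of population i at generation t is

$$MD_t(i) = -\frac{\partial ED_t}{\partial P_i} \tag{3}$$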
With respect to the meaning of these expressions, the situation is as follows: a set of species, populations, or OTUs in general are under consideration; a measure of diversity has already been established for them; and the importance of each to conservation purposes can now be discussed. The future needs to be examined to see how that diversity might behave and how modifications to each population might affect that expected behavior. One of two things can happen to every single population: it may survive or it may become extinct. Therefore the diversity value at any future point in time will be that of the set of populations that have survived. Given the one-generation probabilities of extinction $\{P_i\}_{i=1,\dots,N_P}$, the probability of survival at time t of any possible subset of the original can easily be calculated, as already shown (equation 2). Now, for a set Q of $N_P$ populations, there are $2^{N_P}$ possible subsets of Q, that is, there are $2^{N_P}$ possible patterns of survival/extinction, and hence $2^{N_P}$ different values of diversity. Each of these values can be calculated along with its corresponding probability, so it makes sense to consider diversity at time t as a random variable and obtain its expected value. This is precisely the expected diversity.
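The enumeration of survival/extinction patterns described above can be sketched in a few lines. This is only an illustration under stated assumptions: populations are taken to become extinct independently, lifetimes are exponential with rate $P_i$ (so the survival probability after t generations is $e^{-P_i t}$), and the diversity of a subset does not change with time. The function names and the toy diversity function are hypothetical.

```python
import math
from itertools import combinations


def survival_pattern_probability(subset, units, p_ext, t):
    """P_t(S): units in `subset` survive t generations, all others go extinct.

    Assumes independent exponential lifetimes with one-generation rates p_ext[i].
    """
    prob = 1.0
    for i in units:
        alive = math.exp(-p_ext[i] * t)
        prob *= alive if i in subset else 1.0 - alive
    return prob


def expected_diversity(units, p_ext, t, diversity):
    """Sum of P_t(S) * D(S) over every subset S of `units`."""
    units = list(units)
    total = 0.0
    for k in range(len(units) + 1):
        for combo in combinations(units, k):
            s = frozenset(combo)
            total += survival_pattern_probability(s, units, p_ext, t) * diversity(s)
    return total


# Toy example: three hypothetical populations, diversity = number of survivors.
rates = {0: 0.01, 1: 0.02, 2: 0.05}
print(expected_diversity(rates.keys(), rates, t=10, diversity=len))
```

With the toy diversity function `len`, the result reduces to the sum of the individual survival probabilities, which gives a simple check on the enumeration. Any other function of a frozenset of units, such as a Weitzman diversity routine, can be passed in its place.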
As just shown, this expected diversity is ultimately a function of {Pi}i=1,...,NP, so studies can be made of how variations in these probabilities affect the expected value of diversity. The more the diversity is affected by changes in a population's probability of extinction, the more that population should be considered a priority for action. But the concept of variation in this case is just that of a derivative, or more precisely, that of a partial derivative, with respect to each population's probability of extinction. Note that the value of these derivatives must be negative, since the more a population is likely to become extinct, the lower the expected diversity value will be. The marginal diversity is therefore defined as the negative of the value of the partial derivative to handle positive numbers.
Since a reduction in the probability of extinction provokes an increase in the expected diversity, and this increase is greater for higher marginal diversity values, it is on those populations with the highest values that attention should be focused. This must be directed toward lowering their extinction probability.
Another interesting indicator is the elasticity of diversity or the conservation potential of the population i, i = 1, ..., NP, in generation t. Elasticity is a very popular and well-known concept in economics. In most cases it compares the relative variation in the quantity demanded with respect to the relative variation in prices, thereby obtaining a measure of the responsiveness, or sensitivity, of the demand to changes in price. If, for example, the price of a particular good increases by 2% and the demand for that good decreases by 5%, then the elasticity of the demand is −2.5. Usually the elasticity of demand is negative, since an increase in prices has, as a consequence, a decrease in the quantity demanded. If the elasticity is less than −1, then the demand is said to be elastic with respect to prices, while if it is between −1 and 0, the demand is called inelastic. If the elasticity equals zero, the demand is completely inelastic, which means that the seller can raise the price of the product as much as he wants and the demand will not decrease at all. The other extreme is a perfectly elastic demand. In this situation, elasticity approaches −∞, and the seller can sell virtually as much as he wants at the current market price. Raising the price would imply that demand would nearly disappear, and lowering it would be unprofitable, since demand is practically unlimited at the current price. Most real-life situations are, however, in the middle of these extremes, and sellers have to set adequate prices in order to optimize benefits.
The general formula for the price elasticity of demand is $E = (\Delta Q_D / Q_D)/(\Delta P / P)$, where P is the current price of a certain good, $Q_D$ is the quantity demanded at that price, $\Delta P$ is a small change in the current price, and $\Delta Q_D$ a small change in the quantity demanded. Note that this expression can be rearranged as $E = (\Delta Q_D / \Delta P)(P / Q_D)$. When the changes are not measured in a short time period, averages of the initial and final values are taken on $Q_D$ and P. To approximate the elasticity at a particular point, short intervals can be taken around quantity and price values at that point to calculate differences. However, when differentiation is possible, exact computation is available, as will be shown in equation 4.
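The two discrete versions of the formula can be illustrated numerically. The sketch below is only an example: the function names are hypothetical, and the figures reproduce the 2% price increase and 5% fall in demand used in the text.

```python
def point_elasticity(q0, q1, p0, p1):
    """Relative change in quantity over relative change in price,
    both measured from the initial values (q0, p0)."""
    return ((q1 - q0) / q0) / ((p1 - p0) / p0)


def arc_elasticity(q0, q1, p0, p1):
    """Same ratio, but relative changes are taken with respect to the
    averages of the initial and final values."""
    dq = (q1 - q0) / ((q0 + q1) / 2)
    dp = (p1 - p0) / ((p0 + p1) / 2)
    return dq / dp


# Price rises by 2% (10 -> 10.2) and demand falls by 5% (100 -> 95).
print(point_elasticity(100, 95, 10, 10.2))
print(arc_elasticity(100, 95, 10, 10.2))
```

The first call returns a value of about −2.5, that is, an elastic demand; the second differs slightly because the averaged values are used as the reference.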
The concept of elasticity can be applied to many other economic variables, such as supply instead of demand, or in noneconomic contexts, as happens in this case. An analogy can be established here between quantity of demand and expected diversity value and between price and extinction probability, so the elasticity of diversity or conservation potential, as renamed in Weitzman (1993), can be defined as
$$CP_t(i) = -\frac{\partial ED_t}{\partial P_i}\,\frac{P_i}{ED_t} = MD_t(i)\,\frac{P_i}{ED_t} \tag{4}$$
Elasticity is particularly relevant when the cost of reducing the probability of extinction of a population is directly proportional to the probability itself [see Weitzman (1993) for further discussion on the singularities of equations 3 and 4].
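As an illustration of how marginal diversity and conservation potential respond to one-generation extinction probabilities, the sketch below approximates the partial derivative by a central finite difference. It is not the procedure used in this article: it assumes independent exponential lifetimes with rates $P_i$, a time-invariant diversity function, and a sign convention in which both indicators are positive. All function names are hypothetical.

```python
import math
from itertools import combinations


def expected_diversity(p_ext, t, diversity):
    """Expected diversity after t generations; p_ext maps unit -> rate P_i."""
    units = sorted(p_ext)
    alive = {i: math.exp(-p_ext[i] * t) for i in units}
    total = 0.0
    for k in range(len(units) + 1):
        for combo in combinations(units, k):
            s = frozenset(combo)
            prob = 1.0
            for i in units:
                prob *= alive[i] if i in s else 1.0 - alive[i]
            total += prob * diversity(s)
    return total


def marginal_diversity(i, p_ext, t, diversity, h=1e-6):
    """Negative partial derivative of the expected diversity with respect
    to P_i, approximated by a central finite difference of step h."""
    up = dict(p_ext)
    up[i] += h
    down = dict(p_ext)
    down[i] -= h
    slope = (expected_diversity(up, t, diversity)
             - expected_diversity(down, t, diversity)) / (2 * h)
    return -slope


def conservation_potential(i, p_ext, t, diversity):
    """Elasticity of diversity: marginal diversity scaled by P_i / ED_t."""
    md = marginal_diversity(i, p_ext, t, diversity)
    return md * p_ext[i] / expected_diversity(p_ext, t, diversity)


# Toy example: two hypothetical populations, diversity = number of survivors.
rates = {0: 0.01, 1: 0.02}
print(marginal_diversity(0, rates, t=10, diversity=len))
print(conservation_potential(0, rates, t=10, diversity=len))
```

With the toy diversity function the derivative is known in closed form ($t\,e^{-P_i t}$), which makes the finite-difference approximation easy to verify.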
| Genetic Extinction |
|---|
All the previous definitions are based on one-period probabilities of extinction of each population, which ought to be known. However, this is not as easy as it sounds. There is no objective way of determining this for a common species, population, or breed. Therefore the information obtained depends on the subjectiveness of the probability values, which is not very desirable. Scientific, rigorous, and objective probabilities are required if the above methodology is to be correctly applied. Instead of the concept of physical extinction, it is proposed here that genetic extinction be used. The concept is even more appropriate if it is taken into account that it is genetic diversity that is being referred to in a framework of genetic resources conservation programs. In addition, this introduces within-breed information from allelic frequencies into the analysis.
The concept of genetic extinction is closely related to that of homozygosity, since the genetic extinction of a population is equivalent to a homozygosity rate of one. The more homozygous a population is, the more genetically endangered. But homozygosity is, in practice, nothing more than an average of a number of homozygous individuals taken across a series of representative loci. Therefore, by analogy, an average can be calculated for the allelic situation of the loci under study for each breed. There are two alternatives. One is to average the probability of fixation of the loci at time t and use that to calculate the marginal diversity and conservation potential. The other is to perform the analysis on each locus with its corresponding fixation probability, obtain a different value of marginal diversity and conservation potential for each, and average those values across the loci. The extinction probabilities can be loosely extended to the point in which allelic frequencies exceed a certain threshold value $\alpha$ instead of just equaling one.
For ease of notation, index i will be dropped in the following formulas, which refer to a generic population, but all the calculations will subsequently be applied to each of the NP populations to obtain the corresponding extinction probabilities.
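Consider a set of L loci, and for locus l, let $A^l_1, \dots, A^l_{n_l}$ be the alleles, l = 1, ..., L. For each m = 1, ..., $n_l$ and l = 1, ..., L, let $x^l_m(t)$ be the frequency of the allele $A^l_m$ in generation t in a certain population. For the first option, $P_{ext}(t)$ is therefore calculated as

$$P_{ext}(t) = \frac{1}{L}\sum_{l=1}^{L}\sum_{m=1}^{n_l} P\left(x^l_m(t) > \alpha\right) \tag{5}$$

(note that if one allele has a frequency greater than $\alpha$, for $\alpha$ > 0.5, no other allele can do the same). For the second option, a set of L probabilities $\{P_{ext}(t,l)\}_{l=1,\dots,L}$ can be calculated:

$$P_{ext}(t,l) = \sum_{m=1}^{n_l} P\left(x^l_m(t) > \alpha\right) \tag{6}$$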
The change in gene frequencies over the generations is an intricate process that depends on very different factors. However, diffusion processes allow explicit expressions for these probabilities in many cases. To illustrate the approach introduced in this article, the particular situation of populations solely under the effect of genetic drift will be explained. In this case, it can be proven that
![]() | (7) |
![]() | (8) |
are the present-time allelic frequencies of loci 1 to L, and F(·,·,·,·) is the hypergeometric function.
Equations 7 and 8 are calculated for a single population and require the allelic frequencies of all the marker loci involved in the study. For a whole set of populations, a set of probabilities exists, $\{P^i_{ext}(t)\}_{i=1,\dots,N_P}$ (or $\{P^i_{ext}(t,l)\}_{i=1,\dots,N_P}$), each of them calculated according to equation 7 or 8. Observe that these probabilities are not conceptually the same as those in $\{P_i\}_{i=1,\dots,N_P}$, since the latter are one-generation extinction probabilities, while the former are the direct probabilities of extinction at generation t.
Equations 7 and 8 provide what is needed: a series of objective probability values based on genetic data. Furthermore, with this approach, within-population information as well as information on population sizes is introduced into the analysis.
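For readers without access to the closed-form diffusion solution, the genetic extinction probability can also be approximated by simulation. The sketch below is a swapped-in technique, not the method of this article: it replaces the diffusion formulas with a Monte Carlo Wright-Fisher model of pure drift in a diploid population, in which each generation is formed by sampling $2N_e$ gene copies from the current allelic frequencies. It estimates the probability that some allele exceeds the threshold, which coincides with the sum over alleles only when the threshold is above 0.5. Function names, parameter names, and the toy frequencies are hypothetical.

```python
import random


def exceedance_probability(freqs, n_e, t, alpha, reps=2000, seed=0):
    """Monte Carlo estimate, for one locus, of the probability that some
    allele has frequency > alpha after t generations of pure drift.

    freqs : present-time allelic frequencies of the locus
    n_e   : effective size (2 * n_e gene copies are sampled per generation)
    """
    rng = random.Random(seed)
    n_copies = 2 * n_e
    alleles = range(len(freqs))
    hits = 0
    for _ in range(reps):
        current = list(freqs)
        for _ in range(t):
            # Wright-Fisher step: resample gene copies with replacement.
            sample = rng.choices(alleles, weights=current, k=n_copies)
            current = [sample.count(a) / n_copies for a in alleles]
        if max(current) > alpha:
            hits += 1
    return hits / reps


def genetic_extinction_probability(loci, n_e, t, alpha, reps=2000, seed=0):
    """Average of the per-locus probabilities, plus the per-locus values."""
    per_locus = [exceedance_probability(f, n_e, t, alpha, reps, seed + k)
                 for k, f in enumerate(loci)]
    return sum(per_locus) / len(per_locus), per_locus


# Toy example: two hypothetical loci in a population of effective size 25.
loci = [[0.6, 0.3, 0.1], [0.5, 0.5]]
average, by_locus = genetic_extinction_probability(loci, n_e=25, t=25,
                                                   alpha=0.90, reps=200)
print(average, by_locus)
```

The averaged value plays the role of the first option described above, while the per-locus list corresponds to the second. Being a simulation, the estimates carry sampling error that shrinks as `reps` grows.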
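For ease of notation, only the first case, that of probability values averaged across loci, will be analyzed, so the probabilities $\{P^i_{ext}(t)\}_{i=1,\dots,N_P}$, obtained for each population according to equation 7, will be used. Thus, for a subset S of the whole set of populations Q, equation 2 is substituted by

$$P_t(S) = \prod_{i \in S}\left(1 - P^i_{ext}(t)\right)\prod_{i \in Q \setminus S} P^i_{ext}(t) \tag{9}$$

and the partial derivatives in equations 3 and 4 are taken with respect to $P^i_{ext}(t)$ instead of $P_i$.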
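Note that with this model, $ED_t$ is a linear function of each $P^i_{ext}(t)$ separately. That is, if for a given i, we consider $P^i_{ext}(t)$ as variable and the rest, $P^j_{ext}(t)$, $i \neq j \in \{1, \dots, N_P\}$, as fixed, then $\partial ED_t / \partial P^i_{ext}(t)$ is a constant value for each i = 1, ..., $N_P$. In fact,

$$MD_t(i) = -\frac{\partial ED_t}{\partial P^i_{ext}(t)} = \sum_{S \subseteq Q}\left(2\,\mathbb{1}_S(i) - 1\right) P^{-i}_t(S \setminus i)\, D(S) \tag{12}$$

where $\mathbb{1}_S(i)$ equals one when $i \in S$ and zero when $i \notin S$, and $P^{-i}_t$ is defined over $\mathcal{P}(Q \setminus i)$ in the same way that $P_t$ is over $\mathcal{P}(Q)$, and consequently $MD_t(i)$ does not actually depend on $P^i_{ext}(t)$. Had one-period extinction probabilities been employed, such as in Weitzman (1993), there would be a nonlinear dependence of $ED_t$ on each $P_i$, and $MD_t(i)$ would indeed depend on $P_i$. Now, from equation 12,

$$MD_t(i) = \sum_{S \subseteq Q,\ i \in S} P^{-i}_t(S \setminus i)\, D(S) - \sum_{S \subseteq Q,\ i \notin S} P^{-i}_t(S)\, D(S)$$

and, finally, equation 12 can be rewritten as

$$MD_t(i) = \sum_{S \subseteq Q \setminus i} P^{-i}_t(S)\left[D(S \cup i) - D(S)\right] \tag{13}$$

Thus, if $C_i: \mathcal{P}(Q \setminus i) \to \mathbb{R}$ is defined such that for every $S \subseteq Q \setminus i$, $C_i(S) := D(S \cup i) - D(S)$ with probability $P^{-i}_t(S)$, then the marginal diversity is the expected value of $C_i$ at generation t, and, divided by the expected diversity at generation t, can be understood as an expected partial contribution at generation t.
| Application and Results |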
|---|
This procedure was tested with an example set of local French and Spanish cattle breeds from the European contract FAIR1 CT95 0702 project [see Cañón et al. (2001) for details]. Table 1 shows the names, origins, and effective sizes of the breeds.
Allelic frequencies from 16 microsatellite-type marker loci, obtained from a total of 50 animals per breed (Cañón et al. 2001), were used to calculate marginal diversities and elasticities. Goldstein et al. (1995) proposed a distance measure, the average squared distance, especially devised for markers with a high degree of polymorphism, such as microsatellites. Some of its properties include the fact that its expected value does not change over time under a genetic drift model, which allows for Dt(S) = D(S). Although it is not a distance in the strict sense, since when measured from one population to itself it does not equal zero, the procedure can be applied without loss of generality. Thus this distance was applied to calculate the Weitzman diversity, and partial contributions of the breeds to the total current diversity, {PCi}i=1,...,NP, were obtained. Note that this distance is used just as an example, since the method presented here can be implemented with any distance, as long as its variation with time can be assessed effectively enough. Actually the distance measure does not necessarily have to be a genetic distance. It may involve morphological or other kinds of information, and it is up to the person in charge of putting the method into practice to decide which particular distance suits the situation under study.
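To make the distance concrete, the sketch below computes an average squared distance from allele-size frequencies. It assumes the usual reading of the measure of Goldstein et al. (1995): for each locus, the expected squared difference in allele size between one allele drawn from each population, averaged over loci. The data layout, the function name, and the toy frequencies are hypothetical.

```python
def average_squared_distance(pop_a, pop_b):
    """Average squared distance between two populations.

    pop_a, pop_b : lists with one dict per locus, mapping allele size
                   (e.g. number of repeats) to its frequency.
    """
    per_locus = []
    for freqs_a, freqs_b in zip(pop_a, pop_b):
        # Expected squared size difference between one allele drawn
        # from each population at this locus.
        asd = sum(xa * xb * (size_a - size_b) ** 2
                  for size_a, xa in freqs_a.items()
                  for size_b, xb in freqs_b.items())
        per_locus.append(asd)
    return sum(per_locus) / len(per_locus)


# Toy example with a single locus.
breed_a = [{10: 0.5, 12: 0.5}]
breed_b = [{10: 1.0}]
print(average_squared_distance(breed_a, breed_b))
print(average_squared_distance(breed_a, breed_a))
```

The second call shows the property noted in the text: the value obtained when a polymorphic population is compared with itself is not zero.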
Table 2 shows the partial contributions and the marginal diversities and conservation potentials obtained for 25, 35, and 50 generations, and an $\alpha$ threshold of 0.90. Values were rescaled to 100. Other within-population variation statistics are included, such as mean observed and expected homozygosity and mean effective number of alleles. The results from the two methods described above (average extinction probabilities, and average marginal diversities and conservation potentials) are extremely similar (data not shown). Suffice it to say, the mean coefficient of variation of the marginal diversities and the conservation potentials were 0.004% and 0.09%, respectively. Therefore, for the sake of simplicity, only values for the first method (average extinction probability across loci) are shown. Table 2 also shows that the differences in the figures among the breeds for other traditional within-variation measures are too small to make clear distinctions.
According to the partial contributions, Gasconne and Salers are the two breeds that contribute most, followed by Aubrac and, at some distance, by Sayaguesa. Alistana and Asturiana de Valles are the breeds with the lowest partial contributions.
Marginal diversities are almost identical to partial contributions. The ultimate reason is the very nature of the marginal diversity. As seen in equation 13, marginal diversity is a sort of expected partial contribution at the given generation. Now, in the weighted sum, the set $Q \setminus i \in \mathcal{P}(Q \setminus i)$ has a higher weight, since the probability of survival of all the populations in $Q \setminus i$ is noticeably higher than the probability of any other combination of survival/extinction within the set of populations $Q \setminus i$. This happens especially in short to medium time intervals, because the fewer the number of generations, the lower the fixation probability for any allele and therefore the lower the extinction probability, making the product of the probabilities used to calculate each term of equation 13 (except that for $Q \setminus i$) much smaller. This fact may lead one to consider marginal diversity as a superfluous indicator once partial contributions are available, but this holds only in the short term, because as generations pass, the differences between partial contributions and marginal diversities increase. This increase is already suggested in Table 2, although it would be much more evident for a larger number of generations.
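The relationship between marginal diversity and partial contribution can be checked numerically. The sketch below evaluates the weighted sum described by equation 13 directly, taking the probabilities of genetic extinction at generation t as given, and compares it with the diversity lost when a population is removed from the full set. It is a toy illustration under stated assumptions: populations become extinct independently, the diversity function is an arbitrary stand-in, and all names and figures are hypothetical.

```python
from itertools import combinations


def subsets(units):
    """Yield every subset of `units` as a frozenset."""
    units = list(units)
    for k in range(len(units) + 1):
        for combo in combinations(units, k):
            yield frozenset(combo)


def marginal_diversity(i, p_ext, diversity):
    """Expected gain in diversity from unit i, weighted over the
    survival/extinction patterns of all the other units.

    p_ext maps unit -> probability of extinction at the generation studied.
    """
    others = [j for j in p_ext if j != i]
    total = 0.0
    for s in subsets(others):
        weight = 1.0
        for j in others:
            weight *= (1.0 - p_ext[j]) if j in s else p_ext[j]
        total += weight * (diversity(s | {i}) - diversity(s))
    return total


def raw_contribution(i, units, diversity):
    """Diversity lost when unit i is removed from the full set."""
    q = frozenset(units)
    return diversity(q) - diversity(q - {i})


def toy_diversity(s):
    # Stand-in diversity with diminishing returns; not Weitzman's measure.
    return len(s) ** 0.5


short_term = {0: 0.01, 1: 0.02, 2: 0.01}   # few generations: low probabilities
long_term = {0: 0.30, 1: 0.45, 2: 0.25}    # many generations: higher ones
print(raw_contribution(0, short_term, toy_diversity))
print(marginal_diversity(0, short_term, toy_diversity))
print(marginal_diversity(0, long_term, toy_diversity))
```

When every extinction probability is low, the term for the full set of remaining populations dominates and the marginal diversity stays close to the raw contribution; with higher probabilities the two values drift apart. Note also that the probability of unit i itself never enters the computation, in line with the linearity discussed earlier.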
Conservation potentials offer a different prioritization scheme than marginal diversities. Because of its low effective size, Sayaguesa is now the highest scored breed, followed by Salers, due to its strong influence in the between-population variation. Salers and Aubrac, highly prioritized with the marginal diversity criterion, are also important with respect to the conservation potential. Excluding Sayaguesa, Salers and Aubrac would join Alistana, Asturiana de Montaña, and Tudanca in a middle group. Differences within this group tend to diminish notably as the time horizon increases. The figures for Gasconne are particularly noteworthy. Marginal diversity sets this breed as the first in the prioritization ranking, while its conservation potentials are remarkably low; only Asturiana de Valles shows lower values. This may be due to the fact that Gasconne is uniformly the most distant breed, that is, for any given breed, Gasconne has a higher pairwise distance with that breed than any other one (data not shown), so it scores high in partial contribution and marginal diversity, where between-breed information seems to have a greater weight. On the other hand, it has the lowest extinction probability of all the breeds (up to six times lower than that of Salers and Aubrac, all three having the same effective size), and this pulls down its ranking in the conservation potential. Only Asturiana de Valles is below Gasconne. This might be due to the combination of having a relatively low extinction probability and being, as opposed to Gasconne, uniformly the closest breed, meaning that for any given breed, Asturiana de Valles has a lower pairwise distance with that breed than any other one (data not shown). So, on one side, within-breed information says that its extinction probability is low, while on the other, between-breed information says that it is quite close to the rest of the breeds. Therefore it has the lowest conservation potential value.
| Discussion |
|---|
Maintaining genetic variability is an important goal in the conservation of animal populations and Weitzman's procedure can be applied to achieve this. However, it has been criticized (Caballero and Toro 2002; Eding and Meuwissen 2001; Thaon d'Arnoldi et al. 1998) for not considering within-population variation. Only Ollivier and Foulley (2002) suggest the otherwise obvious, but always ignored, possibility of applying Weitzman's algorithm to obtain a within-diversity measure. To do this, the coefficient of molecular coancestry (Fabuel et al. 2004) between individuals, which applies Malécot's (1948) definition, but uses the concept of identity in state instead of identity by descent, could be used to build a matrix of genetic distances (actually 1 − Malécot's coefficient) between individuals within one population to compute the Weitzman diversity measure within each population. However, this way of calculating within-breed diversity is actually highly correlated (r = 0.985 in the example presented in this article; data not shown) with within-breed heterozygosity. Other criteria have been proposed to estimate within-breed variation, such as allelic richness (Marshall and Brown 1975). The efficiency of heterozygosity or allelic richness depends on whether the interest lies in the short- or long-term selection response.
If no extinction probabilities, be they genetic based or not, and no future behavior of the diversity and each population's contributions to it are to be studied, other solutions to incorporate Weitzman's approach to a joint between- and within-population analysis have been suggested. Ollivier and Foulley (2002) proposed a linear combination of between-population diversity, calculated via the Weitzman approach, and within-population diversity, measured with the heterozygosity, as an aggregate diversity, similar in concept to the traditional aggregate genotype in animal breeding. Depending on the emphasis given to the different possible breeding objectives, for example, selection or crossbreeding, different weights can be justified. It can be shown (Ollivier and Foulley 2004) that when the weights of between-breed and within-breed contributions are proportional to FST (between-population differentiation index) and (1 − FST), breeds rank in a similar way when applying the method proposed by Petit et al. (1998) and Caballero and Toro (2002).
It is well known that more than 80% of the total variability in a domestic animal species is a consequence of the genetic differences among individuals within subpopulations (e.g., breeds), so giving relatively greater emphasis to this within-breed component means promoting the possibility of a selection response within breeds. However, crossbreeding, used to exploit heterosis or complementarity, is also an important genetic improvement strategy in animal production, and the magnitudes of these effects are proportional to the genetic distances among breeds (Falconer and Mackay 1996:255). This should justify giving more emphasis to the between-breed component, as illustrated by Chaiwong and Kinghorn (1999).
Using extinction probabilities allows a different way of using Weitzman's methodology. Between- and within-genetic information are both included in the analysis, and the behavior of the diversity in future generations can be assessed. The two alternate definitions of extinction probabilities are computable without difficulty and the rationale behind them agrees with that of popular measures like homozygosity. The results from both procedures do not differ too much and allow clear differentiations among the breeds, something not possible with more classical measures of within-population variation, as can be deduced from Table 2. The close relationship between the rate of inbreeding and the probability of extinction defined here allows the application of Weitzman's methodology to classify populations by the effect on expected diversity when efforts (Sonesson and Meuwissen 2001; Wang 1997; Wang and Hill 2000) are made to reduce their homozygosity increase per generation.
As discussed above, marginal diversities were similar to partial contributions in all breeds, which supports the concept of expected partial contributions. Nevertheless, there are upward trends in some of the breeds and downward trends in others, which reflects that although the most influential factor for the marginal diversity seems to be the weight of the breed in terms of diversity, extinction probabilities tend to force constant patterns of increase/decrease of the marginals. This perhaps excessive influence of the phylogenetic scene on marginal diversities might be overcome by introducing one-generation genetic extinction probabilities, such as in Weitzman's work (1993), and basing all the analysis at any generation on them. However, this is beyond the scope of this article.
Neither is it the aim of this article to decide or advise on which of the many available distance measures should be used to perform an analysis. On the contrary, the methodology presented here is introduced as a general procedure in which a distance measure is required, but its choice is not inherent to the procedure. However, the particulars of each individual application should be taken into account when choosing the distance to use, since different distances can lead to quite different conclusions. For example, since the effective size already has an important weight on the extinction probability, the more weight a distance gives to drift, the more the values for marginals and elasticities will correlate with drift-based measures such as heterozygosity. Different distances should be used depending on the kind of OTUs or taxa involved. Some distances are better suited for species comparisons and others for the analysis of closely related breeds, depending on the time since divergence and the drift-mutation model. The same applies to the solution of the Kolmogorov (1931) equation, used to obtain equations 7 and 8, which can be found for a number of different scenarios.
The example shows that the results for the partial contributions and marginal diversities in short- or medium-term applications are extremely similar. Since no within-population information is taken into account to obtain the partial contributions, it seems that the marginal diversity is not a very good indicator of both between- and within-population information in short- or medium-term applications. In the long term, the marginal diversity seems to depart from the value of the partial contribution, so it would then be worth considering.
Regarding conservation potentials, the figures show that effective size is an important factor, as shown by Sayaguesa's high values, but also that other important information is accounted for in the model, so the final result is a combination of all the genetic properties of the populations.
Also, as noted by Weitzman (1993), elasticity, or conservation potential, is the most trustworthy indicator when the mechanism of investment is directly related to extinction probability, since the marginal diversity is weighted by the extinction probability itself. Ideally an optimal study should include a functional relationship between costs and effects on extinction probability, so the problem could be examined in terms of function optimization subject to a series of constraints. In the absence of this kind of information, a global view of the results must be taken to arrive at balanced advice. In addition, it must not be forgotten that many factors other than genetics are usually and necessarily taken into account when addressing the conservation of biological diversity, and the information provided by genetic diversity studies must not be considered the sole source of information for decision making. Nevertheless, the methodology introduced here gathers together different types of genetic data (between breed, within breed, and population size) and is a useful tool in diversity conservation assessment.
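The relationship between expected diversity, marginal diversity, and conservation potential can be illustrated with a small numerical sketch. It is not the code used in this study: the function names are hypothetical, extinctions are assumed to be independent, the diversity of a single population is taken as zero, and the marginal diversity of a population is computed as the expected diversity when it is certainly saved minus the expected diversity when it is certainly lost. The conservation potential is then that marginal diversity weighted by the extinction probability. The full enumeration of survival patterns is only feasible for a handful of populations.

```python
from itertools import combinations, product


def diversity(dist, members):
    """Exact Weitzman diversity of the populations indexed by `members`."""
    members = tuple(members)
    if len(members) < 2:
        return 0.0  # assumption: a single population carries no diversity
    # Closest pair among the surviving populations.
    d_min, (i, j) = min((dist[a][b], (a, b)) for a, b in combinations(members, 2))
    if len(members) == 2:
        return float(d_min)
    # Remove one member of the closest pair at a time and keep the best branch.
    return d_min + max(
        diversity(dist, [k for k in members if k != i]),
        diversity(dist, [k for k in members if k != j]),
    )


def expected_diversity(dist, ext_prob):
    """Expected diversity over all survival patterns (independent extinctions)."""
    total = 0.0
    for pattern in product((True, False), repeat=len(dist)):
        prob = 1.0
        for alive, p in zip(pattern, ext_prob):
            prob *= (1.0 - p) if alive else p
        survivors = [i for i, alive in enumerate(pattern) if alive]
        total += prob * diversity(dist, survivors)
    return total


def marginal_diversity(dist, ext_prob, i):
    """Expected diversity with population i saved minus with population i lost."""
    saved, lost = list(ext_prob), list(ext_prob)
    saved[i], lost[i] = 0.0, 1.0
    return expected_diversity(dist, saved) - expected_diversity(dist, lost)


def conservation_potential(dist, ext_prob, i):
    """Marginal diversity weighted by the extinction probability itself."""
    return ext_prob[i] * marginal_diversity(dist, ext_prob, i)
```

With two populations at distance 4 and illustrative extinction probabilities of 0.5 and 0.2, the first population has the larger marginal diversity and, because it is also the more endangered, the larger conservation potential.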
The kind of analysis presented here is recommended when mid- or long-term diversity studies are proposed, since genes, on average, are not expected to face a noticeable risk of fixation over short time periods, unless effective sizes are particularly small, in which case a smaller number of generations can be considered.
As a final consideration, the appendix details a new approximation of the Weitzman algorithm that allows for a greater number of populations to be introduced in the analysis and improves existing approximations.
| Appendix |
|---|
The Weitzman algorithm (Weitzman 1992) is very demanding in computing time, which grows exponentially with the number of populations included in the analysis: a log regression of the experimental computing time on the number of populations provides an r² of 0.998 (data not shown). For example, with an AMD processor at 2.4 GHz, computing only the diversity of 33 populations, without partial contributions or marginals, took almost 11 h, and time increases by an estimated factor of 2.14 with each additional population (data not shown), so handling larger numbers of populations becomes infeasible. Therefore, some kind of approximation of the exact algorithm must be used if the number of populations becomes larger than a few dozen.
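To give an idea of what this growth implies, the following sketch extrapolates the running time of the exact algorithm from the figures quoted above. It rests on two assumptions that are not stated in this form in the text: that the factor of 2.14 applies per additional population, and that "almost 11 h" can be rounded to 11 h. The names are hypothetical and the result is a rough projection, not a measurement.

```python
# Assumptions: growth factor of 2.14 per additional population,
# and a baseline of 11 h (rounded from "almost 11 h") at 33 populations.
BASE_POPULATIONS = 33
BASE_HOURS = 11.0
GROWTH_FACTOR = 2.14


def projected_hours(n_populations):
    """Extrapolated running time of the exact algorithm, in hours."""
    return BASE_HOURS * GROWTH_FACTOR ** (n_populations - BASE_POPULATIONS)


if __name__ == "__main__":
    for n in (33, 35, 40, 50):
        print(f"{n} populations: about {projected_hours(n):,.0f} h")
```

Under these assumptions, 35 populations would need about 50 h, and 50 populations would need millions of hours.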
Thaon d'Arnoldi et al. (1998) proposed an approximation that consists of randomly sampling trees among the 2^(n-1) that the algorithm examines and taking the maximum of the values obtained from the sampled trees as the approximate value of diversity. This procedure works well with a relatively small number of populations (they tested it with up to 29), but as the number of populations increases, so must the sample size. The problem is that, while the exact algorithm can be implemented recursively, the approximate one cannot, and there comes a point where, for reasonable precision in the estimation, the approximate algorithm actually takes more time than the exact one.
We propose a different approximation, and preliminary results indicate that it behaves satisfactorily. In each step, the algorithm splits into two symmetrical recursive problems, each of which deals with a submatrix one dimension smaller than the input matrix. Before passing these submatrices to the next step, we propose checking their dispersion. If all the values in a submatrix are close enough to their mean, this mean, multiplied by the remaining number of steps, is used as the diversity value for that submatrix; otherwise, the exact computation proceeds in the usual way. "Closeness" is checked by setting a threshold on the coefficient of variation (i.e., the standard deviation divided by the mean) of the values in the matrix. If this coefficient is lower than the threshold, the value is approximated; if it is larger, the exact algorithm continues, although it might still be truncated in some later step if the condition is met.
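The procedure just described can be sketched as follows. This is an illustrative reading, not the implementation used to obtain the results reported here. All names are hypothetical. The exact step uses the closest-pair form of the recursion; the "remaining number of steps" for a submatrix of m populations is taken to be m - 1; and the population standard deviation of the off-diagonal distances is used in the coefficient of variation. These three choices are assumptions.

```python
from itertools import combinations
from statistics import mean, pstdev


def _pair_values(dist, members):
    """Off-diagonal distances among `members`, each with its index pair."""
    return [(dist[a][b], (a, b)) for a, b in combinations(members, 2)]


def _submatrix_value(dist, members, cv_threshold):
    """Approximate a submatrix if its distances are homogeneous, else recurse."""
    if cv_threshold is not None:
        values = [d for d, _ in _pair_values(dist, members)]
        avg = mean(values)
        # Coefficient of variation = standard deviation / mean (assumed pstdev).
        if avg > 0 and pstdev(values) / avg < cv_threshold:
            return avg * (len(members) - 1)  # mean times remaining steps
    return weitzman_diversity(dist, cv_threshold, members)


def weitzman_diversity(dist, cv_threshold=None, members=None):
    """Weitzman diversity; exact if cv_threshold is None, else approximated."""
    if members is None:
        members = tuple(range(len(dist)))
    if len(members) < 2:
        return 0.0
    d_min, (i, j) = min(_pair_values(dist, members))
    if len(members) == 2:
        return float(d_min)
    # Two symmetrical subproblems: drop one member of the closest pair each time.
    branches = []
    for dropped in (i, j):
        sub = tuple(k for k in members if k != dropped)
        branches.append(_submatrix_value(dist, sub, cv_threshold))
    return d_min + max(branches)


if __name__ == "__main__":
    # Toy distance matrix for four populations (illustrative values only).
    D = [[0, 2, 6, 10], [2, 0, 6, 10], [6, 6, 0, 10], [10, 10, 10, 0]]
    exact = weitzman_diversity(D)
    approx = weitzman_diversity(D, cv_threshold=0.25)
    print(exact, approx, (approx - exact) / exact)
```

Raising `cv_threshold` truncates the recursion earlier, which saves time at the cost of a larger relative bias. With such a small toy matrix the bias is not representative of the values observed with real data.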
Figure 1 shows the bias we observed between the exact and the approximated values of the diversity for increasing numbers of populations, from 13 to 33; the bias never exceeds 3% in absolute value. Calculations were made with a threshold of 0.25 on the coefficient of variation. Of course, the higher the threshold, the faster the computation, but the higher the bias, and vice versa. The behavior of the bias shown in Figure 1 is irregular because the approximation depends heavily on the data structure (the distance matrix in this case), but it seems to provide good results. For 33 populations, the exact algorithm took, as mentioned above, almost 11 h to complete, while on the same computer the approximation took 2 min 40 s, with a bias of 2.7%. For a different dataset with 35 populations, the exact algorithm took 5 min short of 2 days to complete, whereas the approximation took less than 2 s with a threshold of 0.4, with a bias of less than 2%.
As promising as these results may seem, this procedure should be tested exhaustively before being applied to large datasets on the order of 50 or more populations, but that task is beyond the scope of this article.
| Acknowledgments |
|---|
We thank A. Burton for providing language advice and the reviewers for their helpful comments on this article. This study received the financial support of FEDER project 2FD1997-1191 and European project CT-98-118.
| Footnotes |
|---|
Corresponding Editor: Leif Andersson
Received June 23, 2004
Accepted June 23, 2005
| References |
|---|
Amos W and Balmford A, 2001. When does conservation genetics matter? Heredity 87:257-265.
Barker JSF, 1999. Conservation of livestock breed diversity. Anim Genet Resource Inform 25:33-43.
Caballero A and Toro M, 2002. Analysis of genetic diversity for the management of conserved subdivided populations. Conserv Genet 3:289-299.
Cañón J, Alexandrino P, Bessa I, Carleos C, Carretero Y, Dunner S, Ferran N, Garcia D, Jordana J, Laloë D, Pereira A, Sanchez A, and Moazami-Goudarzi K, 2001. Genetic diversity measures of local European beef cattle breeds for conservation purposes. Genet Sel Evol 33:311-332.
Chaiwong N and Kinghorn BP, 1999. Use of genetic markers to aid conservation decisions for groups of rare domestic breeds. Proc Assoc Adv Anim Breed Genet 13:365-368.
Eding H and Meuwissen T, 2001. Marker-based estimates of between and within population kinships for the conservation of genetic diversity. J Anim Breed Genet 118:141-159.
Fabuel E, Barragán C, Silió L, Rodríguez MC, and Toro MA, 2004. Analysis of genetic diversity and conservation priorities in Iberian pigs based on microsatellite markers. Heredity 93:104-113.
Falconer DS and Mackay TFC, 1996. An introduction to quantitative genetics, 4th ed. Harlow, UK: Longman.
Fisher RA, 1930. The genetical theory of natural selection. Oxford: Clarendon Press.
Frankham R, 1995. Inbreeding and conservation: a threshold effect. Conserv Biol 9:792-799.
Goldstein DB, Ruiz Linares A, Cavalli-Sforza LL, and Feldman MW, 1995. An evaluation of genetic distances for use with microsatellite loci. Genetics 139:463-471.
Hedrick PW and Kalinowski ST, 2000. Inbreeding depression in conservation biology. Annu Rev Ecol Syst 31:139-162.
Kolmogorov A, 1931. Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Math Ann 104:415-458.
Lacy RC, 1997. Importance of genetic variation to the viability of mammalian populations. J Mamm 78:320-335.
Laval G, Iannuccelli N, Legault C, Milan D, Groenen MAM, Giuffra E, Andersson L, Nissen PH, Jorgensen CB, Beeckmann P, Geldermann H, Foulley JL, Chevalet C, and Ollivier L, 2000. Genetic diversity of eleven European pig breeds. Genet Sel Evol 32:187-203.
Malécot G, 1948. Les Mathématiques de l'Hérédité. Paris: Masson et Cie.
Marshall DR and Brown AHD, 1975. Optimum sampling strategies in genetic conservation. In: Crop genetic resources for today and tomorrow (Frankel OH and Hawkes JG, eds). Cambridge: Cambridge University Press; 53-80.
Oldenbroek JK, 1999. Genebanks and the conservation of farm animal genetic resources. Lelystad, The Netherlands: DLO Institute for Animal Science.
Ollivier L and Foulley JL, 2002. Some suggestions on how to preserve both within- and between-breed genetic diversity. In: Book of Abstracts of the 53rd Annual Meeting of the European Association for Animal Production, Cairo, Egypt, 1-4 September 2002 (van der Honing Y, ed). Wageningen, The Netherlands: Wageningen Academic Publishers.
Ollivier L and Foulley JL, 2004. Objectives in livestock diversity preservation: the European pig example. In: Wissenschaftliches Kolloquium "Nutztierzüchtung im Wandel der Zeit." Göttingen, Germany: Cuvillier Verlag; 87-106.
Petit RJ, El Mousadik A, and Pons O, 1998. Identifying populations for conservation on the basis of genetic markers. Conserv Biol 12:844-855.
Reist-Marti SB, Gibson J, Rege JEO, Simianer H, and Hanotte O, 2003. Weitzman's approach and conservation of breed diversity: an application to African cattle breeds. Conserv Biol 17:1299-1311.
Ruane J, 2000. A framework for prioritizing domestic animal breeds for conservation purposes at the national level: a Norwegian case study. Conserv Biol 14:1385-1393.
Sonesson AK and Meuwissen THE, 2001. Minimization of rate of inbreeding for small populations with overlapping generations. Genet Res 77:285-292.
Thaon d'Arnoldi C, Foulley JL, and Ollivier L, 1998. An overview of the Weitzman approach to diversity. Genet Sel Evol 30:149-161.
Wang J, 1997. Effective size and F-statistics of subdivided populations. 2. Dioecious species. Genetics 146:1465-1474.
Wang J and Hill WG, 2000. Marker assisted selection to increase effective population size by reducing Mendelian segregation variance. Genetics 154:475-489.
Weitzman ML, 1992. On diversity. Q J Econ 107:363-405.
Weitzman M, 1993. What to preserve? An application of diversity theory to crane conservation. Q J Econ 108:157-183.











marginal diversities (MDi(t)), and conservation potentials (CPi(t)) from averaged probabilities for t = 25, 35, and 50, using the average squared distance
