Thursday 19 July 2007

genetics - Linkage and LD: quantitative or qualitative?

Linkage disequilibrium (LD) occurs when there is a non-random association between alleles at different loci, so that the genotypes at those loci are correlated. Note I used the word correlation; LD is a quantitative measure rather than a yes/no property.



Some genotypes may well correlate perfectly (R=1), i.e. they are always inherited together. Others may not be in 'perfect' LD (e.g. R=0.9), but are still considered to be in LD because there is a strong correlation between the genotypes (I think R^2 >= 0.8 is the 'cut-off' generally seen in published papers).



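The correlation coefficient is derived from D, which (simply) is the difference between the observed frequency of a haplotype and the frequency expected if the two loci were independent. D' is that same D rescaled by the largest value it could take given the allele frequencies. The two measures are related but not interchangeable: D' can equal 1 while R^2 is well below 1. Either value can be reported, but I think it is more common (and more interpretable) to express the squared correlation coefficient (or to be more precise, the coefficient of determination, R^2).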
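
To make the difference between the measures concrete, here is a minimal sketch in Python that computes D, D' and R^2 for two biallelic loci from haplotype and allele frequencies. The function name `ld_stats` and the example frequencies are my own, chosen for illustration; they are not taken from any real dataset or tool.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', R^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    (All three inputs are illustrative assumptions.)
    """
    # D: observed haplotype frequency minus the frequency expected
    # if the two loci were independent.
    d = p_ab - p_a * p_b

    # D_max: the largest magnitude D could reach given the allele
    # frequencies; which bound applies depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # R^2: squared correlation between the alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Made-up case 1: the two alleles always travel together.
print(ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5))

# Made-up case 2: allele B only ever occurs with allele A, but A is
# much more common than B, so D' is 1 while R^2 is only about 0.25.
print(ld_stats(p_ab=0.2, p_a=0.5, p_b=0.2))
```

The second case is the reason I prefer to quote R^2: D' says only that one of the four possible haplotypes is missing, whereas R^2 says how well one genotype predicts the other.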



If you'd like an example: I might be interested to know if any SNPs are in LD with rs10757278 (located on 9p21, associated with heart disease). Using SNAP (from the Broad Institute) I can enter the SNP, choose my options (e.g. use 1000 Genomes data, and an R^2 cut-off of 0.8) and search. ~5 SNPs are found to be in 'perfect' LD (R^2=1), but a further ~40 are still considered to be in LD with the input SNP because their R^2 values are above 0.8.
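
The same sorting into 'perfect' and 'strong' LD can be sketched in a few lines of Python. This is only an illustration of applying a cut-off: the function `classify_ld`, the SNP labels and the R^2 values are all made up, and it does not reproduce SNAP's actual output or file format.

```python
def classify_ld(r2_by_snp, cutoff=0.8):
    """Split SNPs into perfect LD (R^2 == 1) and strong LD (cutoff <= R^2 < 1).

    r2_by_snp: dict mapping a SNP label to its R^2 with the query SNP.
    SNPs below the cutoff are dropped, as a cut-off search would do.
    """
    perfect = sorted(s for s, r2 in r2_by_snp.items() if r2 == 1.0)
    strong = sorted(s for s, r2 in r2_by_snp.items() if cutoff <= r2 < 1.0)
    return perfect, strong


# Hypothetical R^2 values against a query SNP (not real data).
example = {"snp_a": 1.0, "snp_b": 0.93, "snp_c": 0.81, "snp_d": 0.42}

perfect, strong = classify_ld(example)
print("perfect LD:", perfect)
print("strong LD:", strong)
```

Reporting the R^2 value alongside each SNP, rather than only the yes/no result of the cut-off, keeps the quantitative information.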



So in summary, both statements can be used correctly, but it is always more informative to state the degree of LD (otherwise it might be assumed the SNPs are in perfect correlation).
