The genetic component of forensic genetic genealogy (FGG) is an estimate of kinship, often conducted at genome scale between a great number of individuals. The promise of FGG is substantial: in concert with genealogical records, it can identify a person of interest without a direct reference sample to compare against. However, at present there are no tools that allow FGG investigations when the sample is a DNA mixture. At ISHI, August Woerner will present Demixtify (v2), a free, open-source tool that deconvolves two-person mixtures from shotgun whole genome sequencing data. Demixtify first estimates the mixture fraction (mf) using a broad sampling of intermediate-frequency, low-FST markers. Next, it estimates the joint genotype likelihood of both contributors (given the mf), and it extracts the marginal (deconvolved) likelihoods and genotypes into a standard format for genomic information (VCF). Simple tools are also provided to convert these data into a GEDmatch-compliant file format. He and his colleagues evaluate Demixtify on both in silico and in vitro DNA mixtures, and contrast its performance when a single contributor’s genotypes are known against its limits when there are two unknown contributors.
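To make the joint-then-marginal step above concrete, here is a minimal sketch in Python. It is not Demixtify's code: it assumes a single biallelic SNP, a simple binomial read-count model, a fixed sequencing error rate, and a flat genotype prior, and the function names are hypothetical.

```python
from math import comb

GENOTYPES = (0, 1, 2)  # number of alternate alleles carried by a contributor


def joint_likelihoods(ref_count, alt_count, mf, error=0.01):
    """Likelihood of the read counts for every (major, minor) genotype pair.

    Assumption: mf is the fraction of reads from the minor contributor,
    and reads are binomial draws with a symmetric error rate.
    """
    depth = ref_count + alt_count
    table = {}
    for g_major in GENOTYPES:
        for g_minor in GENOTYPES:
            # Expected alternate-allele fraction in the mixture.
            p_alt = (1 - mf) * g_major / 2 + mf * g_minor / 2
            # Allow for miscalled bases in either direction.
            p_obs = p_alt * (1 - error) + (1 - p_alt) * error
            table[(g_major, g_minor)] = (
                comb(depth, alt_count)
                * p_obs ** alt_count
                * (1 - p_obs) ** ref_count
            )
    return table


def marginal_likelihoods(table, prior=(1 / 3, 1 / 3, 1 / 3)):
    """Sum out the other contributor to get per-contributor likelihoods."""
    major = [sum(table[(g, h)] * prior[h] for h in GENOTYPES) for g in GENOTYPES]
    minor = [sum(table[(g, h)] * prior[g] for g in GENOTYPES) for h in GENOTYPES]
    return major, minor


# Illustrative counts: 32 reference reads, 8 alternate reads, mf = 0.2.
table = joint_likelihoods(ref_count=32, alt_count=8, mf=0.2)
major, minor = marginal_likelihoods(table)
best_pair = max(table, key=table.get)
```

In a real workflow the flat prior would likely be replaced by population allele frequencies, and the marginal likelihoods (rather than hard calls) would be written to the VCF.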
Can you give us a sneak peek into the main theme of your presentation and why it’s important for our audience?
Genetic genealogy is a very powerful emergent technique in forensic genetics. However, if your sample is a DNA mixture, you cannot (or perhaps should not) upload these data to services such as GEDmatch. Mixture deconvolution is one solution to this problem. We wrote a simple tool that draws from genotyping techniques often applied to plants (e.g., tetraploid taxa, which carry four copies of each chromosome), and show that it can deconvolve balanced (requiring a known contributor) and imbalanced two-person mixtures. What's more, it provides estimates of genotype uncertainty, which suggests that the tool is also compatible with genotype imputation software.
What inspired you to explore the topic of your presentation?
The kinds of samples that we're talking about likely correspond to some of the most egregious crimes there are. For most forensic problems, and this shames me to admit, I like the puzzle in it; that it happens to apply to science is just an additional benefit. This work, however, is an exception. True, I like the puzzle of how to solve the problem well, but the application (e.g., unsolved serial cases) is heinous enough that I often burn my Saturday and Sunday mornings (and other time too!) to help solve it.
What’s one common misconception about your area of expertise you’d like to clarify?
Genotype imputation is made out to be the boogeyman. It’s not. At a high level, modern imputation methods are just a kind of probabilistic genotyping that works quite well (and quite well in many diverse population groups). There are, of course, ways to do imputation poorly, but that is a separate problem!
How do you hope your presentation will impact the audience or industry?
Mixture deconvolution is not an ideal answer to this problem; tools such as GEDmatch require single-source profiles, which forces deconvolution. As a mixed sample becomes more balanced, deconvolution becomes ill-suited to the problem as (in truth) there are many potential deconvolutions, and it would be better to consider all (or all probable) deconvolutions instead. A better answer, and an answer I hope is adopted by industry, is to estimate kinship directly on the mixture.
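A small illustration of why balance hurts, as a sketch under a deliberately simple assumption (one biallelic SNP, no sequencing error, hypothetical function name): genotype pairs that predict the same alternate-allele fraction cannot be told apart from read counts alone, and a 50/50 mixture produces more such ties than an imbalanced one.

```python
from collections import defaultdict
from fractions import Fraction


def pairs_by_expected_fraction(mf):
    """Group (contributor 1, contributor 2) genotype pairs by the
    alternate-allele fraction they predict for mixture fraction mf."""
    groups = defaultdict(list)
    for g1 in (0, 1, 2):
        for g2 in (0, 1, 2):
            # Exact arithmetic avoids floating-point near-ties.
            p_alt = (1 - mf) * Fraction(g1, 2) + mf * Fraction(g2, 2)
            groups[p_alt].append((g1, g2))
    return dict(groups)


balanced = pairs_by_expected_fraction(Fraction(1, 2))
imbalanced = pairs_by_expected_fraction(Fraction(1, 5))

# Pairs that all predict half the reads carrying the alternate allele
# in a balanced mixture.
tied_at_half = balanced[Fraction(1, 2)]
```

Under these assumptions the balanced mixture collapses the nine genotype pairs into fewer distinguishable signals, which is one way to see why a known contributor (or a method that weighs all probable deconvolutions) is needed in that case.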
Are there any resources or tools you recommend for those interested in learning more about your presentation topic?
There is a rich literature on interpreting STR mixtures (for an excellent review, see https://doi.org/10.1016/j.fsigen.2018.11.009). The background for this talk is really how to “detect” mixtures and estimate the mixture proportions (https://doi.org/10.1016/j.fsigen.2023.102980), and how to estimate the genotypes of the major and minor contributor if you know their proportions (https://doi.org/10.1016/j.fsigen.2022.102776). Both approaches are designed for whole genome sequencing; however, also see https://doi.org/10.20944/preprints202407.1705.v1 for a sneak peek at how these same ideas can be applied to other forensic genetic genealogy workflows.
When you’re not working, what’s your favorite way to unwind or relax?
I am a geek, through and through. I love games—I skipped a lot of high school to play bridge at a nearby coffee shop—and I still love gaming (board, card or computer!). I also like biking and playing soccer with my two (very energetic) children.
What’s the best piece of advice you’ve ever received?
This is for the academics out there—academia is a pie eating contest where the reward for success is more pie.
If you could only eat one dish for the rest of your life, what would it be?
Ice cream. I wouldn’t live long, but at least I’d be happy!
What were you doing in 1989, when ISHI first started?
That would put me in 5th grade—honestly, a pretty good year as it was before the awfulness that is middle school!
Can you share a memorable moment from ISHI that has stayed with you?
Oh, there are memorable moments, although the good ones shouldn’t be shared here!