Single-cell investigative genetics: Single-cell data produces genotype distributions concentrated at the true genotype across all mixture complexities

Catherine M. Grgicak*, Qhawe Bhembe, Klaas Slooten, Nidhi C. Sheth, Ken R. Duffy, Desmond S. Lun

*Corresponding author for this work

Research output: Contribution to journal; Article; Academic; peer-review

Abstract

In the absence of a suspect, the forensic aim is investigative, and the focus is one of discerning what genotypes best explain the evidence. In traditional systems, the list of candidate genotypes may become vast if the sample contains DNA from many donors or the information from a minor contributor is swamped by that of major contributors, leading to lower evidential value for a true donor's contribution and, as a result, possibly overlooked or inefficient investigative leads. Recent developments in single-cell analysis offer a way forward, by producing data capable of discriminating genotypes. This is accomplished by first clustering single-cell data by similarity without reference to a known genotype. With good clustering it is reasonable to assume that the single-cell electropherograms (scEPGs) in a cluster are of a single contributor. With that assumption we determine the probability of a cluster's content given each possible genotype at each locus, which is then used to determine the posterior probability mass distribution for all genotypes by application of Bayes’ rule. A decision criterion is then applied such that the sum of the ranked probabilities of all genotypes falling in the set is at least 1−α. This is the credible genotype set and is used to inform database search criteria. Within this work we demonstrate the salience of single-cell analysis by performance testing a set of 630 previously constructed admixtures containing up to 5 donors of balanced and unbalanced contributions. We use scEPGs that were generated by isolating single cells, employing a direct-to-PCR extraction treatment, amplifying STRs that are compliant with existing national databases and applying post-PCR treatments that elicit a detection limit of one DNA copy. We determined that, for these test data, 99.3% of the true genotypes are included in the 99.8% credible set, regardless of the number of donors that comprised the mixture. We also determined that the most probable genotype was the true genotype for 97% of the loci when the number of cells in a cluster was at least two. Since efficient investigative leads will be borne by posterior mass distributions that are narrow and concentrated at the true genotype, we report that, for this test set, 47,900 (86%) loci returned only one credible genotype and of these 47,551 (99%) were the true genotype. When determining the LR for true contributors, 91% of the clusters rendered LR > 10^18, showing the potential of single-cell data to positively affect investigative reporting.
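The credible genotype set described in the abstract can be illustrated with a minimal Python sketch. It assumes that the per-genotype likelihoods of a cluster's content at one locus are already available; the abstract does not specify the likelihood model, so the likelihood and prior values below are invented purely for illustration, and the function names `posterior` and `credible_set` are hypothetical. The value α = 0.002 corresponds to the 99.8% credible set mentioned above.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]  # one locus, two alleles, e.g. ("12", "14")


def posterior(likelihoods: Dict[Genotype, float],
              priors: Dict[Genotype, float]) -> Dict[Genotype, float]:
    """Bayes' rule: posterior is proportional to likelihood times prior."""
    unnormalised = {g: likelihoods[g] * priors[g] for g in likelihoods}
    total = sum(unnormalised.values())
    if total <= 0:
        raise ValueError("no genotype has positive posterior mass")
    return {g: v / total for g, v in unnormalised.items()}


def credible_set(post: Dict[Genotype, float], alpha: float) -> List[Genotype]:
    """Smallest set of top-ranked genotypes whose mass is at least 1 - alpha."""
    ranked = sorted(post.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[Genotype] = []
    mass = 0.0
    for genotype, prob in ranked:
        chosen.append(genotype)
        mass += prob
        if mass >= 1 - alpha:
            break
    return chosen


# Illustrative (made-up) values for a single locus in one cluster.
priors = {("12", "12"): 0.25, ("12", "14"): 0.50, ("14", "14"): 0.25}
likelihoods = {("12", "12"): 0.001, ("12", "14"): 0.9, ("14", "14"): 0.002}

post = posterior(likelihoods, priors)
set_998 = credible_set(post, alpha=0.002)  # 99.8% credible set
set_999 = credible_set(post, alpha=0.001)  # stricter 99.9% credible set
print(set_998, set_999)
```

With these made-up numbers the posterior is concentrated on one genotype, so the 99.8% set contains a single candidate, while the stricter 99.9% set needs a second genotype to reach the required mass.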

Original language: English
Article number: 103000
Journal: Forensic Science International: Genetics
Volume: 69
Early online date: 19 Dec 2023
Publication status: Published - Mar 2024

Bibliographical note

Publisher Copyright:
© 2024 Elsevier B.V.

Funding

This work was partially supported by grants NIJ2014-DN-BX-K026, NIJ2018-DU-BX-0185, and NIJ2020-R2-CX-0032, awarded by the National Institute of Justice, Office of Justice Programs, U.S. Department of Justice. The opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not reflect those of the Department of Justice.

Funders
U.S. Department of Justice
National Institute of Justice
Office of Justice Programs

    Keywords

    • Database searching
    • Forensic DNA
    • Investigative forensics
    • Mixture interpretation
    • Probabilistic genotyping
    • Single-cell forensics
    • Single-cell genetics
