Abstract
Numerous ligand-based drug discovery projects rely on structure-activity relationship (SAR) analysis, such as Free-Wilson (FW) or matched molecular pair (MMP) analysis. These techniques intrinsically assume linearity and additivity of substituent contributions. They are challenged by nonadditivity (NA) in protein–ligand binding, where changing two functional groups in one molecule results in much higher or lower activity than expected from the respective single changes. Identifying nonlinear cases and possible underlying explanations is crucial for a drug design project, since it might influence which lead to follow. By systematically analyzing all AstraZeneca (AZ) inhouse compound data and publicly available ChEMBL25 bioactivity data, we show significant NA events in almost every second assay among the inhouse data sets and in every third assay among the public data sets. Furthermore, 9.4% of all compounds in the AZ database and 5.1% of those from public sources display significant additivity shifts, indicating important SAR features or fundamental measurement errors. Combining NA data with machine learning showed that nonadditive data is challenging to predict, and even adding nonadditive data to the training set did not increase predictivity. Overall, NA analysis should be applied on a regular basis in many areas of computational chemistry and can further improve rational drug design.
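As an illustration of the additivity assumption described in the abstract, the following minimal sketch computes the nonadditivity of a single double-transformation cycle of four compounds: a parent, two single-change analogs, and the analog carrying both changes. It assumes activities on a logarithmic scale (for example pIC50) and uses the common double-mutant-cycle definition; the function name and the example values are hypothetical and are not taken from the paper.

```python
def nonadditivity(p_parent, p_single_a, p_single_b, p_double):
    """Return observed minus expected log-scale activity of the double-change analog.

    Under strict additivity the two single-change effects sum, so the
    result is 0. Assumes log-scale activities such as pIC50.
    """
    effect_a = p_single_a - p_parent   # contribution of change A alone
    effect_b = p_single_b - p_parent   # contribution of change B alone
    expected_double = p_parent + effect_a + effect_b
    return p_double - expected_double


# Hypothetical pIC50 values, for illustration only.
additive_cycle = nonadditivity(6.0, 7.0, 6.5, 7.5)     # double analog matches expectation
nonadditive_cycle = nonadditivity(6.0, 7.0, 6.5, 9.0)  # double analog far more active

print(additive_cycle, nonadditive_cycle)
```

Whether a nonzero value counts as significant depends on the experimental uncertainty of the assay, which this sketch does not model.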
Original language | English |
---|---|
Article number | 47 |
Pages (from-to) | 1-18 |
Number of pages | 18 |
Journal | Journal of Cheminformatics |
Volume | 13 |
Issue number | 1 |
Early online date | 2 Jul 2021 |
Publication status | Published - Dec 2021 |
Bibliographical note
Funding Information: Gratitude towards Uppsala University, Dr. Lena Åslund and the colleagues from the IMIM program for supporting the master thesis of DG.
Funding Information: DG is supported financially by an Erasmus Mundus Joint Master Degree scholarship 2018-2020 and the AstraZeneca Master Student program.
Publisher Copyright:
© 2021, The Author(s).
Keywords
- Experimental uncertainty
- Machine learning
- Matched molecular pair analysis
- Nonadditivity analysis
- Random forest
- Structure-activity relationship
- Support vector machine
Datasets
- Additional file 2 of Nonadditivity in public and inhouse data: implications for drug design
  Nittinger, E. (Contributor), Tyrchan, C. (Contributor), Gogishvili, D. (Contributor) & Margreitter, C. (Contributor), figshare Academic Research System, 1 Jan 2021
  DOI: 10.6084/m9.figshare.14904126.v1
  Dataset / Software: Dataset
- Additional file 3 of Nonadditivity in public and inhouse data: implications for drug design
  Nittinger, E. (Contributor), Margreitter, C. (Contributor), Gogishvili, D. (Contributor) & Tyrchan, C. (Contributor), figshare Academic Research System, 1 Jan 2021
  DOI: 10.6084/m9.figshare.14904129.v1
  Dataset / Software: Dataset
- Additional file 4 of Nonadditivity in public and inhouse data: implications for drug design
  Tyrchan, C. (Contributor), Margreitter, C. (Contributor), Gogishvili, D. (Contributor) & Nittinger, E. (Contributor), figshare Academic Research System, 1 Jan 2021
  DOI: 10.6084/m9.figshare.14904132.v1
  Dataset / Software: Dataset