New clues on carcinogenicity-related substructures derived from mining two large datasets of chemical compounds

Abstract : In this study, new molecular fragments associated with genotoxic and nongenotoxic carcinogens are introduced to estimate the carcinogenic potential of compounds. Two rule-based carcinogenesis models were developed with the aid of SARpy: model R (from rodents' experimental data) and model E (from human carcinogenicity data). SARpy extracts structural alerts in a completely automated and unbiased manner, with statistical significance. The carcinogenicity models developed in this study are collections of potentially carcinogenic fragments extracted from two carcinogenicity databases: the ANTARES carcinogenicity dataset, with information from bioassays on rats, and the combination of the ISSCAN and CGX datasets, which takes into account human-based assessment. The performance of the two models was evaluated by cross-validation and by external validation on a 258-compound case study dataset. Combining the R and E predictions, and scoring a positive or negative result when both models are concordant on a prediction, increased accuracy to 72% and specificity to 79% on the external test set. The carcinogenic fragments present in the two models were compared and analyzed by chemical class. The results of this study show that the developed rule sets can be a useful tool for identifying new structural alerts of carcinogenicity and provide effective information on the molecular structures of carcinogenic chemicals.
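The concordance scoring mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy predictions, and the choice to leave discordant compounds unscored are all assumptions made for illustration, and the numbers below are unrelated to the 72% and 79% reported in the abstract.

```python
from typing import List, Optional, Sequence, Tuple


def consensus(pred_r: bool, pred_e: bool) -> Optional[bool]:
    """Return the shared call when both models agree, otherwise None.

    Assumption: a discordant pair yields no prediction at all.
    """
    return pred_r if pred_r == pred_e else None


def accuracy_and_specificity(
    calls: Sequence[Optional[bool]], truths: Sequence[bool]
) -> Tuple[float, float]:
    """Compute accuracy and specificity over compounds that received a call."""
    tp = tn = fp = fn = 0
    for call, truth in zip(calls, truths):
        if call is None:  # discordant models: compound is skipped
            continue
        if call and truth:
            tp += 1
        elif call and not truth:
            fp += 1
        elif not call and truth:
            fn += 1
        else:
            tn += 1
    scored = tp + tn + fp + fn
    accuracy = (tp + tn) / scored if scored else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, specificity


# Hypothetical toy data (True = predicted or known carcinogen).
model_r = [True, True, False, False, True, False]
model_e = [True, False, False, True, True, False]
truth = [True, True, False, False, False, False]

calls: List[Optional[bool]] = [consensus(r, e) for r, e in zip(model_r, model_e)]
acc, spec = accuracy_and_specificity(calls, truth)
print(calls, acc, spec)
```

Under this assumption, requiring agreement trades coverage for reliability: compounds on which the two models disagree receive no call and are excluded from both metrics.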
Document type :
Journal articles
https://hal-ineris.archives-ouvertes.fr/ineris-01863016
Contributor : Gestionnaire Civs
Submitted on : Tuesday, August 28, 2018 - 10:06:17 AM
Last modification on : Tuesday, August 28, 2018 - 10:06:18 AM
Long-term archiving on : Thursday, November 29, 2018 - 1:09:23 PM

File

2016-174_post-print.pdf
Files produced by the author(s)

Citation

Azadi Golbamaki, Emilio Benfenati, Nazanin Golbamaki, Alberto Manganaro, Erinc Merdivan, et al. New clues on carcinogenicity-related substructures derived from mining two large datasets of chemical compounds. Journal of Environmental Science and Health, Part C, Taylor & Francis: STM, Behavioural Science and Public Health Titles, 2016, 34 (2), pp.97-113. ⟨10.1080/10590501.2016.1166879⟩. ⟨ineris-01863016⟩
