Journal articles

New clues on carcinogenicity-related substructures derived from mining two large datasets of chemical compounds

Abstract : In this study, new molecular fragments associated with genotoxic and nongenotoxic carcinogens are introduced to estimate the carcinogenic potential of compounds. Two rule-based carcinogenesis models were developed with the aid of SARpy: model R (from rodents' experimental data) and model E (from human carcinogenicity data). SARpy extracts structural alerts in a fully automated and unbiased manner, on the basis of statistical significance. The carcinogenicity models developed in this study are collections of potentially carcinogenic fragments extracted from two carcinogenicity databases: the ANTARES carcinogenicity dataset, with information from bioassays on rats, and the combination of the ISSCAN and CGX datasets, which takes into account human-based assessment. The performance of the two models was evaluated by cross-validation and by external validation on a 258-compound case-study dataset. Combining the R and E predictions, and scoring a positive or negative result when both models are concordant, increased accuracy to 72% and specificity to 79% on the external test set. The carcinogenic fragments present in the two models were compared and analyzed from the point of view of chemical class. The results of this study show that the developed rule sets will be a useful tool to identify some new structural alerts of carcinogenicity and provide effective information on the molecular structures of carcinogenic chemicals.
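The concordance rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `consensus` and `accuracy_and_specificity` are hypothetical, the toy predictions are invented for illustration, and it assumes that compounds on which the two models disagree are left unscored.

```python
from typing import Optional, Sequence, Tuple


def consensus(pred_r: bool, pred_e: bool) -> Optional[bool]:
    """Return the shared call when both models agree, else None (no call).

    Assumption: discordant compounds are excluded from scoring.
    """
    return pred_r if pred_r == pred_e else None


def accuracy_and_specificity(
    calls: Sequence[Optional[bool]], truths: Sequence[bool]
) -> Tuple[float, float]:
    """Compute accuracy and specificity over compounds that received a call."""
    scored = [(c, t) for c, t in zip(calls, truths) if c is not None]
    correct = sum(c == t for c, t in scored)
    # Specificity = true negatives / all experimentally negative compounds.
    negatives = [c for c, t in scored if not t]
    true_neg = sum(not c for c in negatives)
    accuracy = correct / len(scored) if scored else float("nan")
    specificity = true_neg / len(negatives) if negatives else float("nan")
    return accuracy, specificity


# Invented toy data (True = predicted or observed carcinogen).
model_r = [True, True, False, False, True]
model_e = [True, False, False, False, True]
observed = [True, True, False, True, False]

calls = [consensus(r, e) for r, e in zip(model_r, model_e)]
accuracy, specificity = accuracy_and_specificity(calls, observed)
```

Restricting the score to concordant predictions trades coverage for reliability: fewer compounds receive a call, but each call is backed by both rule sets.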
Document type :
Journal articles
Contributor : Gestionnaire Civs
Submitted on : Tuesday, August 28, 2018 - 10:06:17 AM
Last modification on : Tuesday, August 28, 2018 - 10:06:18 AM
Long-term archiving on : Thursday, November 29, 2018 - 1:09:23 PM


Azadi Golbamaki, Emilio Benfenati, Nazanin Golbamaki, Alberto Manganaro, Erinc Merdivan, et al. New clues on carcinogenicity-related substructures derived from mining two large datasets of chemical compounds. Journal of Environmental Science and Health, Part C, 2016, 34 (2), pp.97-113. ⟨10.1080/10590501.2016.1166879⟩. ⟨ineris-01863016⟩


