
dc.contributor.author: Herrera Ibatá, Diana María
dc.contributor.author: Pazos, A.
dc.contributor.author: Orbegozo-Medina, Ricardo Alfredo
dc.contributor.author: Romero-Durán, Francisco Javier
dc.contributor.author: González-Díaz, Humberto
dc.date.accessioned: 2016-10-19T12:04:00Z
dc.date.available: 2016-10-19T12:04:00Z
dc.date.issued: 2015-04-24
dc.identifier.citation: Herrera-Ibatá DM, Pazos A, Orbegozo-Medina RA, Romero-Durán FJ, González-Díaz H. Mapping chemical structure-activity information of HAART-drug cocktails over complex networks of AIDS epidemiology and socioeconomic data of U.S. counties. Biosystems. 2015;132-133:20-34 [es_ES]
dc.identifier.uri: http://hdl.handle.net/2183/17471
dc.description.abstract: [Abstract] Using computational algorithms to design tailored drug cocktails for highly active antiretroviral therapy (HAART) on specific populations is a goal of major importance for both the pharmaceutical industry and public health policy institutions. New combinations of compounds need to be predicted in order to design HAART cocktails. On the one hand, there are the biomolecular factors related to the drugs in the cocktail (experimental measure, chemical structure, drug target, assay organisms, etc.); on the other hand, there are the socioeconomic factors of the specific population (income inequalities, employment levels, fiscal pressure, education, migration, population structure, etc.), needed to study the relationship between socioeconomic status and the disease. In this context, machine learning algorithms that can seek models for problems with multi-source data have to be used. In this work, the first artificial neural network (ANN) model is proposed for the prediction of HAART cocktails to halt AIDS on epidemic networks of U.S. counties, using information indices that codify both biomolecular and several socioeconomic factors. The data were obtained from at least three major sources. The first dataset included assays of anti-HIV chemical compounds released to ChEMBL. The second dataset was the AIDSVu database of Emory University, which compiled AIDS prevalence for >2300 U.S. counties. The third dataset included socioeconomic data from the U.S. Census Bureau. Three scales or levels were employed to group the counties according to the location or population structure codes: state, rural-urban continuum code (RUCC), and urban influence code (UIC). An analysis of >130,000 pairs (network links) was performed, corresponding to AIDS prevalence in 2310 counties in the U.S. vs. drug cocktails made up of combinations of ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4856 protocols, and 10 possible experimental measures.
The best model found with the original data was a linear neural network (LNN) with AUROC > 0.80 and accuracy, specificity, and sensitivity ≈ 77% in training and external validation series. The change of the spatial and population structure scale (State, UIC, or RUCC codes) does not affect the quality of the model. Imbalance was detected in all the models found when comparing positive/negative cases and linear/non-linear model accuracy ratios. Using the synthetic minority over-sampling technique (SMOTE), data pre-processing, and machine-learning algorithms implemented in the WEKA software, more balanced models were found. In particular, a multilayer perceptron (MLP) with AUROC = 97.4% and precision, recall, and F-measure >90% was found. [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Elsevier [es_ES]
dc.relation.uri: http://dx.doi.org/10.1016/j.biosystems.2015.04.007 [es_ES]
dc.rights: Atribución-NoComercial-SinDerivadas 3.0 España [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.subject: Urban influence code [es_ES]
dc.subject: AIDS epidemiology [es_ES]
dc.subject: Box–Jenkins operators [es_ES]
dc.subject: Shannon entropy [es_ES]
dc.subject: Information theory [es_ES]
dc.title: Mapping chemical structure-activity information of HAART-drug cocktails over complex networks of AIDS epidemiology and socioeconomic data of U.S. counties [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.rights.access: info:eu-repo/semantics/openAccess [es_ES]
UDC.journalTitle: Biosystems [es_ES]
UDC.volume: 132-133 [es_ES]
UDC.startPage: 20 [es_ES]
UDC.endPage: 34 [es_ES]

