Sabiia Seb
Full record
Data provider:  17
Country:  United Kingdom
Title:  Developing a Text Mining Prototype for the Comparative Toxicogenomics Database Biocuration Process
Authors:  Thomas C. Wiegers
Date:  2009-04-22
Year:  2009
Keywords:  Genetics & Genomics; Pharmacology; Bioinformatics
Abstract:  Understanding interactions between environmental chemicals and genes provides insights into the mechanisms of chemical action, disease susceptibility, therapeutic drug interactions, and toxicity. The Comparative Toxicogenomics Database (CTD; http://ctd.mdibl.org) is a web-based resource that integrates diverse information for the cross-species analysis of chemical, gene, and disease relationships. Much of the data in CTD is gathered manually: biocurators use controlled vocabularies to curate chemical-gene and chemical/gene-disease interactions from the scientific literature, and CTD integrates data curated from over 10,000 scientific documents. Unfortunately, far more scientific documents are available than CTD staff can curate; consequently, selecting the best documents for curation is very important.

To improve the efficacy of the CTD biocuration process, a computational text mining prototype was developed to score and rank PubMed abstracts by their desirability for curation. The prototype identifies:
•	chemical, gene, and disease actors,
•	specific action terms used to define interaction activity, and
•	other key factors that contribute to a document’s overall relevance to CTD.

The prototype was then tested against data manually curated by CTD, which served as the control group, to determine its overall effectiveness; mean average precision was the evaluation metric. How was the prototype designed and architected? What third-party tools were integrated into it? How was it tested? Were the tools able to identify the same actors as the curators? How were the documents scored and ranked, and how effective was the ranking process? What major problems were encountered? How will the prototype ultimately be integrated into the CTD biocuration process? The answers to these and other questions will be discussed during the workshop.
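
The abstract names mean average precision (MAP) as the evaluation metric but does not say how the prototype computed it. As a rough illustration only, the standard definition can be sketched as follows: for each ranked list, precision is taken at every rank where a relevant document appears, those values are averaged over the number of relevant documents, and the result is then averaged across lists. The function names and document IDs below are hypothetical and do not come from the CTD prototype.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant IDs."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            # Precision at this rank: relevant documents seen so far / rank.
            precision_sum += hits / rank
    # Relevant documents that never appear in the ranking contribute zero.
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average_precision over several (ranking, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical data: two rankings, each with the IDs a curator judged relevant.
example_runs = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y", "z"], {"z"}),
]
map_score = mean_average_precision(example_runs)
```

In a setting like the one described, each ranked list would be the prototype's ordering of abstracts and each relevant set would be the documents the curators actually selected; a score closer to 1.0 means the curated documents sit nearer the top of the ranking.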
Type:  Presentation
Identifier:  http://precedings.nature.com/documents/3142/version/1
oai:nature.com:10.1038/npre.2009.3142.1
http://dx.doi.org/10.1038/npre.2009.3142.1
Source:  Nature Precedings
Rights:  Creative Commons Attribution 3.0 License