Resumo: |
Curating biological information from the published literature can be time- and labor-intensive, especially without automated tools. WormBase1 has adopted several curation interfaces and tools, most of which were built in-house, to help curators recognize and extract data from the literature more efficiently. These tools range from simple computer interfaces for data entry to scripts that use complex text extraction algorithms to automatically identify specific objects in a paper and present them to the curator for curation. Using in-house tools also lets us tailor each tool to the individual needs and preferences of the curator. For example, Gene Ontology Cellular Component and gene-gene interaction curators employ the text mining software Textpresso2 to identify, retrieve, and extract relevant sentences from the full text of an article. The curators then use a web-based curation form to enter the data into our local database. For transgene and antibody curation, curators use the publicly available Phenote ontology annotation curation interface (developed by the Berkeley Bioinformatics Open-Source Projects (BBOP)), which we have adapted with datatype-specific configurations. Phenote has also served as the basis for our own Ontology Annotator tool, which is used by our phenotype and gene ontology curators. For RNAi curation, we created web-based submission forms that allow the curator to efficiently capture all relevant information. In all cases, the data undergoes a final scripted data dump step to ensure that all the information conforms to a file format readable by our object-oriented database.
|