Background
Integrating the scientific literature into a biomedical research infrastructure requires processing the literature, identifying the named entities (NEs) and concepts it contains, and representing the content in a standardised way. Little effort has been spent on integrating content from the literature text into RDF Triple Stores.
The CALBC project partners (PPs) have produced the first version of the Silver Standard Corpus I (SSC-I), a large-scale annotated biomedical corpus covering four semantic groups, by harmonising annotations from automatic text mining solutions. The four semantic groups were chemical entities and drugs (CHED), genes and proteins (PRGE), diseases...