Records retrieved: 12 | |
|
|
Heather A. Piwowar; Douglas B. Fridsma. |
Background
 Many initiatives and repositories exist to encourage the sharing of research data, and thousands of microarray gene expression datasets are publicly available. Many studies reuse this data, but it is not well understood which datasets are reused and for what purpose.

 Materials and Methods
 We trained a machine-learning algorithm to automatically classify full-text gene expression microarray studies into two classes: those that generated original microarray data (n=900) and those that only reused data (n=250). We then compared the Medical Subject Headings (MeSH) terms of the two classes to identify MeSH topics which were over- or under-represented by publications with... |
Type: Poster |
Keywords: Bioinformatics. |
Year: 2007 |
URL: http://precedings.nature.com/documents/425/version/1 |
| |
|
|
Alexander Garnett; Heather A. Piwowar; Edie M. Rasmussen; Judy Illes. |
Bibliographic search engines allow endless possibilities for building queries based on specific words or phrases in article titles and abstracts, indexing terms, and other attributes. Unfortunately, deciding which attributes to use in a methodologically sound query is a non-trivial process. In this paper, we describe a system to help with this task, given an example set of PubMed articles to retrieve and a corresponding set of articles to exclude. The system provides the users with unigram and bigram features from the title, abstract, MeSH terms, and MeSH qualifier terms in decreasing order of precision, given a recall threshold. From this information and their knowledge of the domain, users can formulate a query and evaluate its performance. We apply... |
Type: Manuscript |
Keywords: Bioinformatics. |
Year: 2010 |
URL: http://precedings.nature.com/documents/4270/version/2 |
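
The feature ranking described in the abstract above can be illustrated with a minimal sketch. It is not the authors' system: the function names (`ngrams`, `features`, `rank_features`), the whitespace tokenization, and the definitions of precision and recall over document counts are assumptions made for illustration, and only title or abstract text is handled (MeSH terms and qualifiers are left out).

```python
from collections import Counter


def ngrams(text, n):
    """Return the set of word n-grams in a text (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def features(text):
    """Unigram and bigram features of one document."""
    return ngrams(text, 1) | ngrams(text, 2)


def rank_features(include_docs, exclude_docs, min_recall):
    """Rank features by precision, keeping only those that meet a recall threshold.

    recall    = share of documents to retrieve that contain the feature
    precision = share of all documents containing the feature that should be retrieved
    """
    pos = Counter()
    neg = Counter()
    for doc in include_docs:
        pos.update(features(doc))
    for doc in exclude_docs:
        neg.update(features(doc))

    ranked = []
    for feat, true_pos in pos.items():
        recall = true_pos / len(include_docs)
        if recall < min_recall:
            continue
        precision = true_pos / (true_pos + neg[feat])
        ranked.append((feat, precision, recall))

    # Decreasing precision, then decreasing recall, then alphabetical for stable output.
    ranked.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return ranked
```

A user would pass the texts of the articles to retrieve and the articles to exclude, read the top-ranked features, and combine them into a query by hand; lowering `min_recall` admits rarer but potentially more precise features.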
| |
|
|
Heather A. Piwowar. |
Presentation based on the following publication:
Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308

Sharing research data provides benefit to the general scientific community, but the benefit is less obvious for the investigator who makes his or her data available.

We examined the citation history of 85 cancer microarray clinical trial publications with respect to the availability of their data. The 48% of trials with publicly available microarray data received 85% of the aggregate citations. Publicly available data was significantly (p = 0.006) associated with... |
Type: Presentation |
Keywords: Bioinformatics. |
Year: 2007 |
URL: http://precedings.nature.com/documents/361/version/1 |
| |
|
|
Heather A. Piwowar; Douglas B. Fridsma. |
Does your research area re-use shared datasets?
* Re-using data has many benefits, including research synergy and efficient resource use
* Some research areas have tools, communities, and practices which facilitate re-use
* Identifying these areas will allow us to learn from them, and apply the lessons to areas which underutilize the sharing and re-purposing of scientific data between investigators

 Which datasets?
This preliminary analysis examines the re-use of microarray gene expression datasets. Thousands of microarray gene expression datasets have been deposited in publicly available databases. 
Many studies reuse this data,... |
Type: Poster |
Keywords: Bioinformatics. |
Year: 2007 |
URL: http://precedings.nature.com/documents/425/version/3 |
| |
|
|
Heather A. Piwowar; Douglas B. Fridsma. |
Background
 Many initiatives and repositories exist to encourage the sharing of research data, and thousands of microarray gene expression datasets are publicly available. Many studies reuse this data, but it is not well understood which datasets are reused and for what purpose.

 Materials and Methods
 We trained a machine-learning algorithm to automatically classify full-text gene expression microarray studies into two classes: those that generated original microarray data (n=900) and those that only reused data (n=250). We then compared the Medical Subject Headings (MeSH) terms of the two classes to identify MeSH topics which were over- or under-represented by publications with... |
Type: Poster |
Keywords: Bioinformatics. |
Year: 2007 |
URL: http://precedings.nature.com/documents/425/version/2 |
| |
|
|
Heather A. Piwowar; Wendy W. Chapman. |
*Background:* Sharing data is a tenet of science, yet commonplace in only a few subdisciplines. Recognizing that a data sharing culture is unlikely to be achieved without policy guidance, some funders and journals have begun to request and require that investigators share their primary datasets with other researchers. The purpose of this study is to understand the current state of data sharing policies within journals, the features of journals which are associated with the strength of their data sharing policies, and whether the strength of data sharing policies affects the observed prevalence of data sharing.

*Methods:* We investigated these relationships with respect to gene expression microarray data in the... |
Type: Manuscript |
Keywords: Bioinformatics. |
Year: 2008 |
URL: http://precedings.nature.com/documents/1700/version/1 |
| |
|
|
Heather A. Piwowar; Wendy W. Chapman. |
Sharing research data is a cornerstone of science. Although many tools and policies exist to encourage data sharing, the prevalence with which datasets are shared is not well understood. We report our preliminary results on patterns of sharing microarray data in public databases.

The most comprehensive method for measuring occurrences of public data sharing is manual curation of research reports, since data sharing plans are usually communicated in free text within the body of an article. Our early findings from manual curation of 100 papers suggest that 30% of investigators publicly share their full microarray datasets. Of these, 70% of the datasets are deposited at NCBI's Gene Expression Omnibus (GEO) database,... |
Type: Poster |
Keywords: Bioinformatics. |
Year: 2008 |
URL: http://precedings.nature.com/documents/1701/version/1 |
| |
|
|
Alexander Garnett; Heather A. Piwowar; Edie M. Rasmussen; Judy Illes. |
Bibliographic search engines allow endless possibilities for building queries based on specific words or phrases in article titles and abstracts, indexing terms, and other attributes. Unfortunately, deciding which attributes to use in a methodologically sound query is a non-trivial process. In this paper, we describe a system to help with this task, given an example set of PubMed articles to retrieve and a corresponding set of articles to exclude. The system provides the users with unigram and bigram features from the title, abstract, MeSH terms, and MeSH qualifier terms in decreasing order of precision, given a recall threshold. From this information and their knowledge of the domain, users can formulate a query and evaluate its performance. We apply... |
Type: Manuscript |
Keywords: Bioinformatics. |
Year: 2010 |
URL: http://precedings.nature.com/documents/4270/version/1 |
| |
|
|
Heather A. Piwowar; Wendy W. Chapman. |
*Background*
Much scientific knowledge is contained in the details of the full-text biomedical literature. Most research in automated retrieval presupposes that the target literature can be downloaded and preprocessed prior to query. Unfortunately, this is not a practical or maintainable option for most users due to licensing restrictions, website terms of use, and sheer volume. The full text of scientific articles is increasingly queryable through portals such as PubMed Central, HighWire Press, Scirus, and Google Scholar. However, because these portals only support very basic Boolean queries and full text is so expressive, formulating an effective query is a difficult task for users. We propose improving the formulation of full-text queries... |
Type: Manuscript |
Keywords: Bioinformatics. |
Year: 2010 |
URL: http://precedings.nature.com/documents/4267/version/1 |
| |
|
|
Valerie Enriquez; Sarah Walker Judson; Nicholas M. Walker; Suzie Allard; Robert B. Cook; Heather A. Piwowar; Robert J. Sandusky; Todd J. Vision; Bruce E. Wilson. |
Consistent attribution of research data upon reuse is necessary to reward the original data-producing investigators, reconstruct provenance, and inform data sharing policies, tool requirements, and funding decisions. Unfortunately, norms for data attribution are varied and often weak. As part of the DataONE 2010 summer internship program, three interns studied the policies, practice, and implications of current data attribution behavior in the environmental sciences. We found that few policies recommend robust data citation practices: in our preliminary evaluation, only one-third of repositories (n=26), 6% of journals (n=307), and 1 of 53 funders suggested a best practice for data citation. We manually reviewed 500 papers published between 2000 and... |
Type: Poster |
Keywords: Ecology; Earth & Environment; Evolutionary Biology. |
Year: 2010 |
URL: http://precedings.nature.com/documents/5452/version/1 |
| |
|
|
Heather A. Piwowar; Wendy W. Chapman. |
Repurposing research data holds many benefits for the advancement of biomedicine, yet is very difficult to measure and evaluate. We propose a data reuse registry to maintain links between primary research datasets and studies that reuse this data. Such a resource could help recognize investigators whose work is reused, illuminate aspects of reusability, and evaluate policies designed to encourage data sharing and reuse. |
Type: Poster |
Keywords: Bioinformatics. |
Year: 2008 |
URL: http://precedings.nature.com/documents/2152/version/1 |
| |
|
|
Heather A. Piwowar; Wendy W. Chapman. |
*Background* 
Much scientific knowledge is contained in the details of the full-text biomedical literature. Most research in automated retrieval presupposes that the target literature can be downloaded and preprocessed prior to query. Unfortunately, this is not a practical or maintainable option for most users due to licensing restrictions, website terms of use, and sheer volume. The full text of scientific articles is increasingly queryable through portals such as PubMed Central, HighWire Press, Scirus, and Google Scholar. However, because these portals only support very basic Boolean queries and full text is so expressive, formulating an effective query is a difficult task for users. We propose improving the formulation of full-text queries... |
Type: Manuscript |
Keywords: Bioinformatics. |
Year: 2010 |
URL: http://precedings.nature.com/documents/4267/version/2 |
| |