Micropublications: A Semantic Model For Claims, Evidence, Arguments And Annotations In Biomedical Communications http://www.jbiomedsem.com/content/5/1/28
In total we have 13 quotes from this source:
There is no suggestion in the current specification that nanopublications may be applied directly to ordinary scientific articles, nor that they are designed to present primary scientific evidence – although more expansive claims have been made elsewhere in the literature [22,23].
Such a common metadata representation of scientific claims, argument, evidence and annotation in biomedicine should serve as an integrating point for the original publication, subsequent annotations, and all other computational methods, supporting a single framework for activities in the nine point cycle of authoring-publishing-consumption-reuse we discuss in the section on Use Cases. This cycle can be thought of as an information value chain in science. This means that each set of disparately motivated and rewarded activities, carried out by various actors, creates and passes along value to the next, which consumes this value-added product as an input.
The micropublications model is adapted to the Web, and designed for (a) representing the key arguments and evidence in scientific articles, and (b) supporting the “layering” of annotations and various useful formalizations upon the full text paper.
The micropublication approach goes beyond statements and their provenance, proposing a richer model in order to account for a more complete and broadly useful view of scientific argument and evidence, beyond that of simple assertions, or assertions supported only by literature references. It is also designed to be readily compatible with assertions coded in BEL or as nanopublications, as these models are considered useful in certain applications and will need to be integrated.
Our model is based on understanding scientific publications as arguments, which present a narrative of experiments or observations, the data obtained, and a reasoned interpretation (“finding”) of the data’s meaning [38]. Such arguments present a line of reasoning, to a “best” explanation of the data (“abductive reasoning”, “inference to the best explanation”, “ampliative inference”) [39-42] taken in context with the published findings of others in the field. The determination of whether or not a finding is correct is made over time by the community of the researcher’s peers. What claims are considered true may evolve over time, based on re-examination of evidence and development of new evidence. Assertions may be criticised and refuted. Thus scientific reasoning is defeasible [43].
Empirical evidence is required in scientific publications so that the scientific community may make and debate judgments based on the “interpretation of nature” rather than interpretation of texts [28]. The process of establishing new “facts” and either supplementing or overthrowing old ones is the central work of biomedical research.
Recent results spotlight concerns with the communication of evidence and its citation. Begley and Ellis recently found that only 11% of research findings they examined from the academic literature could be reproduced in a biopharmaceutical laboratory [8]. Fang et al. reviewed all retractions indexed in PubMed, finding that over two thirds were due to misconduct [7]. Retractions themselves are an increasingly common event [6]. Greenberg conducted a citation network analysis of over 300 publications on a single neuromuscular disorder, and found extensive progressive distortion of citations, to the extent that reviews in reputable journals presented statements as “facts”, which were ultimately based on no evidence at all [3,4]. Simkin and Roychowdhury showed that, in the sample of publications they studied, a majority of scientific citations were merely copied from the reference lists in other publications [32,33]. The increasing interest in direct data citation of datasets, deposited in robust repositories, is another result of this growing concern with the evidence behind assertions in the literature [34].
Our model provides a framework to support extensively qualified claims in natural language, as generally presented by researchers in their primary publications. Most fundamentally, it adds support relations to claims in the literature, to assist in resolving primary scientific evidence within a “lineage” of assertions. Support relations, structured as graphs, back up assertions with the data, context and methodological evidence which validates them.
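The idea of resolving a claim's evidence "lineage" through a graph of support relations can be sketched as a simple traversal. The node names and the adjacency-list encoding below are illustrative assumptions, not the paper's actual RDF representation:

```python
# Hypothetical support graph: each assertion maps to the statements or
# evidence that back it. Terminal nodes represent primary evidence.
support_graph = {
    "review claim":      ["primary claim"],       # a review citing a primary article
    "primary claim":     ["western blot data"],   # a claim backed by experimental data
    "western blot data": [],                      # primary evidence: no further support
}

def evidence_lineage(node, graph):
    """Walk support relations depth-first from an assertion down to the
    primary evidence (terminal nodes) that ultimately backs it."""
    leaves = []
    stack = [node]
    while stack:
        current = stack.pop()
        supports = graph.get(current, [])
        if not supports:
            leaves.append(current)   # no further support: primary evidence
        else:
            stack.extend(supports)
    return leaves

print(evidence_lineage("review claim", support_graph))  # ['western blot data']
```

A traversal like this is what makes the citation-distortion problem described by Greenberg detectable: if a chain of support relations never terminates in primary evidence, the "fact" rests on citation alone.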
Micropublications permit scientific claims to be formulated minimally as any statement with an attribution (basic provenance), and maximally as entire knowledge-bases with extensive evidence graphs. Thus micropublications in their minimal form subsume or encompass statement-based models, while allowing presentation of evidential support for statements and natural language assertions as backing for formalisms. This has significant applicability across the lifecycle of biomedical communications.
Micropublications represent scientific arguments. The goal of an argument is to induce belief [44]. An argument (therefore a micropublication) argues a principal claim, with statements and/or evidence deployed to support it. Its support may also include contrary statements or evidence; and/or the claim may dispute claims made by other arguments. These are called challenges in our model, rebuttal by Toulmin [44-46], and attacks in the artificial intelligence literature on argumentation frameworks (see e.g. [47,49,54,94]). The minimal form of an argument in our model is a statement supported by its attribution. If the source of the statement is trusted, that may be enough to induce belief. Aristotle called this aspect of rhetoric ethos, the character and reputation of the speaker [95]. Figure 3 shows this minimal form of micropublication.
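The argument structure described above (a principal claim, its supports, its challenges, and attribution as basic provenance) can be sketched with Python dataclasses. The class and field names here are assumptions for illustration; the actual model is defined as an OWL/RDF vocabulary, not Python:

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Attribution:
    """Basic provenance: who made the statement (Aristotle's ethos)."""
    agent: str

@dataclass
class Statement:
    text: str
    attribution: Attribution

@dataclass
class Micropublication:
    """An argument: a principal claim plus its supports and challenges."""
    claim: Statement
    supports: List[Union[Statement, str]] = field(default_factory=list)  # statements or evidence
    challenges: List[Statement] = field(default_factory=list)            # rebuttals / attacks

    def is_minimal(self) -> bool:
        # Minimal form: a statement supported only by its attribution.
        return not self.supports and not self.challenges

# The minimal micropublication of Figure 3: a claim with an attribution.
claim = Statement("Drug X inhibits kinase Y", Attribution("Smith et al. 2013"))
mp = Micropublication(claim)
print(mp.is_minimal())  # True
```

Adding any item to `supports` or `challenges` moves the micropublication beyond the minimal form, toward the full evidence graphs described earlier.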
Statement-based models have been proposed as mechanisms for publishing key facts asserted in the scientific literature or in curated databases in a machine processable form. Examples include: Biological Expression Language (BEL) statements [18]; SWAN, a model for claims and hypotheses in natural language developed for the annotation of scientific hypotheses in Alzheimer's Disease (AD) research [14,19-21]; and nanopublications [22-26], which contribute to the Open PHACTS linked data warehouse of pharmacological data [26]. What we mean by “statement-based” is that they confine themselves to modeling statements found in scientific papers or databases, with limited or no presentation of the backing evidence for these statements. Some offer statement backing in the form of other statements in the scientific literature, but none actually has a complete representation of scientific argument including empirical evidence and methods.
Nanopublications distill content as a graph of assertions associated with (a) provenance of the article or dataset from whence they came; and (b) a set of terms for indexing and filtering in order to identify auxiliary information in large data sets. Although this last point is represented by a named graph called “Support”, this is not intended to represent argumentative support or evidence, but rather descriptive information (cell type, species, etc.) “to enable first pass filtering over large nanopublication sets” [25].
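The distinction the quote draws, that a nanopublication's "Support" graph holds descriptive filtering terms rather than argumentative evidence, can be made concrete with a schematic sketch. The structure and field names below are an illustrative simplification of the nanopublication named-graph layout, not its actual RDF serialization:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Nanopublication:
    """Schematic nanopublication: an assertion graph plus auxiliary graphs."""
    assertions: List[Tuple[str, str, str]]  # (subject, predicate, object) triples
    provenance: Dict[str, str]              # where the assertion came from
    support: Dict[str, str]                 # descriptive terms (cell type, species),
                                            # for filtering -- NOT argumentative evidence

def first_pass_filter(nanopubs, **criteria):
    """First-pass filtering over a nanopublication set by 'Support' terms."""
    return [n for n in nanopubs
            if all(n.support.get(k) == v for k, v in criteria.items())]

np1 = Nanopublication(
    assertions=[("GeneA", "upregulates", "GeneB")],
    provenance={"derived_from": "PMID:12345678"},
    support={"species": "Homo sapiens", "cell_type": "hepatocyte"},
)
np2 = Nanopublication(
    assertions=[("GeneA", "upregulates", "GeneC")],
    provenance={"derived_from": "PMID:87654321"},
    support={"species": "Mus musculus", "cell_type": "neuron"},
)

hits = first_pass_filter([np1, np2], species="Homo sapiens")
print(len(hits))  # 1
```

Note that nothing in this structure says *why* the assertion should be believed; that gap between descriptive context and evidential backing is exactly what the micropublication model's support relations are meant to fill.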
During the past two decades the ecosystem of biomedical publications has moved from a print-based to a mainly Web-based model. However, this transition brings with it many new problems, in the context of an exponentially increasing, intractable volume of publications [1,2]; of systemic problems relating to valid (or invalid) citation of scientific evidence [3,4]; rising levels of article retractions [5,6] and scientific misconduct [7]; of uncertain reproducibility and re-usability of results in therapeutic development [8], and lack of transparency in research publication [9]. While we now have rapid access to much of the world’s biomedical literature, our methods to organize, verify, assess, combine and absorb this information in a comprehensive way, and to move discussion and annotation activities through the ecosystem efficiently, remain disappointing. [...] Computational methods previously proposed as solutions include ontologies [10]; text mining [2,11,12]; databases [13]; knowledgebases [14]; visualization [15]; new forms of publishing [16]; digital abstracting [1]; semantic annotating [17]; and combinations of these approaches. However, we lack a comprehensive means to orchestrate these methods. We propose to accomplish this with a layered metadata model of scientific argumentation and evidence.