CORE: Three Access Levels To Underpin Open Access http://www.dlib.org/dlib/november12/knoth/11knoth.html

In total we have 10 quotes from this source:

 3 types of access to open access content

There are three essential types of access to this content, which we will call access levels. We argue that these access levels must be supported by services in order to create an environment in which OA content can be fully exploited. They are:
- Access at the granularity of papers.
- Analytical access at the granularity of collections.
- Access to raw data.

[...]

While developers are interested in accessing the raw data, for example through an API, academics will primarily require access to the content at the level of individual items or relatively small sets of items, mostly expecting to communicate with a digital library (DL) using a set of search and exploration tools. A relatively specific group of users are eResearchers, whose work is largely motivated by information communicated at the transaction and analytical levels, but whose actual work mostly depends on raw data access, typically realised using APIs and downloadable datasets.

#digital-libraries  #raw-data  #access 
 Apart from the growth of...

Apart from the growth of OA journals, providing so-called Gold OA, a cornerstone of the OA movement has been a process called self-archiving (Green OA). Self-archiving refers to the deposit of a preprint or postprint of a journal or conference article in an institutional repository or archive. According to the Directory of Open Access Repositories (OpenDOAR), more than 65% of publishers endorse self-archiving. They include major players, such as Springer Verlag. It is largely due to this policy that millions of research papers are available online without the necessity to pay a subscription. According to the study of Laakso & Björk (2012), 17% of currently published research articles are available as gold OA. The recent study of Gargouri, et al. (2012) shows that about 20% of articles are made available as green OA. Countries with more and stronger (institutional or funder) green OA mandates have even higher proportions of articles available as green OA. For example, almost 40% of research outputs in the UK are believed to be available through the green route. ... There are still, of course, a number of reasons that hinder the adoption of Open Access (Björk, 2003), including the often discussed legal barriers as well as issues related to evidence of scientific recognition. In this paper, we discuss a very important yet rarely debated one: the lack of a mature technical infrastructure.

#self  #article  #journals 
 ...going "beyond search and access"...

...going "beyond search and access" while not ignoring these functions has already been explored in Lagoze, et al. (2005). The authors argue that digital libraries need to add value to web resources by extending current metadata models to allow establishing and representing the context of resources, enriching them with new information and relationships, and encouraging collaborative use. While the value of such services is apparent, and their realisation is often entirely plausible, there is a high barrier to entering the market. This barrier is the difficulty of accessing and working with the raw data needed to realise these services.

#services  #metadata-model  #digital-libraries  #resources  #raw-data 
 Open access aggregation systems vs academic search engines

The majority of repository aggregation systems focus on the essential problem of aggregating resources for the purposes of providing cross-repository metadata search. While search is an essential component of an OA infrastructure, connecting and tying OA repositories together offers far more possibilities. Aggregations should not become just large searchable metadata silos; they should offer (or enable others to offer) a wide range of value-added services targeting all the different types of users participating in the research process, i.e. not just users searching for individual publications, but, for example, those who need statistical information about collections of publications and their dynamics, or those who need access to raw data for the purposes of research or applications development. These characteristics should distinguish OA aggregation systems from major academic search engines, such as Google Scholar or Microsoft Academic Search. [...] ...these systems provide only very limited support for those wanting to build new tools on top of them, for those who need flexible access to the indexed content, and consequently also for those who need to use the content for analytical purposes. In addition, they do not distinguish between Open Access and subscription-based content, which makes them unsuitable for realising the above-mentioned vision of connected OARs.

#users  #search  #access 
 Classification of content in OA repositories

...content stored in OA repositories and journals reflects the diversity of research disciplines. For example, information about the specific subject of a paper (e.g. computer science, humanities) can be used to narrow down search, to monitor trends and to estimate content growth in specific disciplines. Only about 1.4% of items in OA repositories have been classified (Pieper and Summann, 2006b), and manual classification is costly. We recently carried out a series of experiments with text classification of full-text articles into 18 top-level classes of the DOAJ classification using a multiclass SVM. The experiments were carried out on a large balanced dataset of 1,000 documents, articles randomly selected from DOAJ. The system produced encouraging results, achieving about 93% accuracy.

#repository  #classification 
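The experiment above relies on a supervised full-text classifier. As an illustrative sketch only: the quote names a multiclass SVM, but here a much simpler tf-idf nearest-centroid classifier stands in (all class names and training data are invented toy examples, not DOAJ data):

```python
import math
from collections import Counter, defaultdict

class CentroidTextClassifier:
    """Tiny tf-idf nearest-centroid classifier (a stand-in for the SVM)."""

    def fit(self, docs, labels):
        # docs: list of token lists; labels: one class label per document.
        self.n = len(docs)
        self.df = Counter()                      # document frequency per term
        for doc in docs:
            self.df.update(set(doc))
        counts = Counter(labels)
        self.centroids = defaultdict(Counter)    # class -> mean tf-idf vector
        for doc, label in zip(docs, labels):
            for t, w in self._vec(doc).items():
                self.centroids[label][t] += w / counts[label]
        return self

    def _vec(self, doc):
        # Sparse tf-idf vector with smoothed idf.
        tf = Counter(doc)
        return {t: (c / len(doc)) * math.log((1 + self.n) / (1 + self.df[t]))
                for t, c in tf.items()}

    @staticmethod
    def _cos(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def predict(self, doc):
        # Assign the class whose centroid is most similar to the document.
        vec = self._vec(doc)
        return max(self.centroids, key=lambda lab: self._cos(vec, self.centroids[lab]))

# Toy demo with invented subject labels:
docs = [["algorithm", "complexity", "graph"], ["poetry", "novel", "author"],
        ["compiler", "algorithm", "code"], ["painting", "novel", "poetry"]]
clf = CentroidTextClassifier().fit(docs, ["cs", "hum", "cs", "hum"])
print(clf.predict(["graph", "algorithm"]))   # expected: cs
```

A centroid classifier is far weaker than an SVM on real data, but it shows the shared pipeline: tokenised full text to tf-idf vectors to a decision over the 18 subject classes.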
 Discovery of semantically related content

...information about the semantic relatedness of content can be used for a number of purposes, such as recommendation, navigation, and duplicate or plagiarism detection. CORE estimates semantic relatedness between two textual fragments using the cosine similarity measure calculated on term frequency-inverse document frequency (tf-idf) vectors. Details of the similarity metric are provided in (Knoth, et al., 2010). Due to the size of the CORE dataset, and consequently the high number of combinations, semantic similarity cannot be calculated for all document pairs in reasonable time. To make the solution scalable, CORE uses a number of heuristics to decide which document pairs are unlikely to be similar and can therefore be discarded. This allows CORE to cut down the number of combinations and to scale up the calculation to millions of documents. CORE supports the discovery of semantic relatedness between two texts held in the CORE aggregator. In addition, the system supports the recommendation of full-text documents related to a metadata record and the recommendation of a semantically related item held in the aggregator for an arbitrary resource on the web.

#documents 
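The pipeline described above (tf-idf vectors, cosine similarity, heuristics that discard unlikely pairs) can be sketched in miniature. This is not CORE's implementation: the article does not specify its heuristics, so the pruning rule here (only score pairs that share at least one reasonably rare term, found via an inverted index) is one common stand-in:

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def tfidf(docs):
    """tf-idf vectors (sparse dicts) for tokenised documents, plus doc freqs."""
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d))
    vecs = [{t: (c / len(d)) * math.log((1 + n) / (1 + df[t]))
             for t, c in Counter(d).items()} for d in docs]
    return vecs, df

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def similar_pairs(docs, threshold=0.1, max_df_ratio=0.8):
    """Score only candidate pairs sharing a sufficiently rare term,
    instead of all O(n^2) pairs; keep those above the threshold."""
    vecs, df = tfidf(docs)
    n = len(docs)
    index = defaultdict(set)          # rare term -> ids of docs containing it
    for i, d in enumerate(docs):
        for t in set(d):
            if df[t] / n <= max_df_ratio:
                index[t].add(i)
    candidates = set()
    for ids in index.values():
        candidates.update(combinations(sorted(ids), 2))
    return {(i, j): s for i, j in candidates
            if (s := cosine(vecs[i], vecs[j])) >= threshold}
```

With millions of documents the inverted index keeps the candidate set near-linear in practice, which is the point of the pruning step: documents with no discriminative term in common never reach the cosine calculation at all.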
 The Budapest Open Access Initiative...

The Budapest Open Access Initiative clearly identifies, in its original definition of Open Access from 2001, that OA is not only about making research outputs freely available for download and reading. The aspect of reuse, which includes being able to index and pass OA content to software, is firmly embedded in the definition, opening new possibilities for the development of innovative OA services. However, while the growth of OA content has been used in the last decade as a benchmark of success of the OA movement, the successes in terms of finding and reusing OA content are much less documented. We believe that in order to fully exploit the reuse potential of OA, it is vital to improve the current OA technical infrastructure and facilitate the creation of novel (possibly unforeseen) services utilising the OA content. According to Figure 1 below, this OA content consists of OA papers, OA research data and all possibly inferred (or extracted) knowledge from these materials. The services that can access and manipulate this content can be tailored to different audiences and serve different purposes. [...] The Confederation of Open Access Repositories states: Each individual repository is of limited value for research: the real power of Open Access lies in the possibility of connecting and tying together repositories, which is why we need interoperability. In order to create a seamless layer of content through connected repositories from around the world, Open Access relies on interoperability, the ability for systems to communicate with each other and pass information back and forth in a usable format.

#repository  #interoperability  #services  #open-access 
 Open access will facilitate text mining and semantic enrichment

The huge amount of content openly available online offers many opportunities for semantic enrichment realised through text mining, crowdsourcing, etc. Exploiting this content might completely redefine the way research is carried out. We may be standing at the brink of a research revolution, where semantic enrichment or enhancement and Open Access will take the lead role. However, there are two frequently discussed issues slowing down or even preventing this from happening: legal issues and the issue of scientific esteem and recognition. In a recent study commissioned by JISC (McDonald and Kelly, 2012), it was reported that copyright law and other barriers are limiting the use of semantic enrichment technologies, namely text mining. In our view, this creates a strong argument for the wide adoption of Open Access in research. If semantic enrichment technologies are applied as part of an OA technical infrastructure in a way that provides significant benefits to users, users will prefer OA resources and this will create pressure on commercial publishers. To fully exploit the OA reuse potential, it is therefore important to better inform the Open Access community about both the benefits and commitments resulting from OA publishing.

#semantic-enrichment  #users  #mining  #open-access  #issues  #technology 
 The second often discussed issue...

The second often discussed issue is that of building confidence in Open Access. Apart from a few successful OA journals, such as those maintained by PLoS or BioMed Central, it is still (possibly wrongly) believed that OA journals today typically do not compare in terms of impact factor with their commercial counterparts. However, the traditional impact measures based purely on citations are inappropriate for use in the 21st century (Curry, 2012), where scientific debate often happens outside of publications, such as on blogs or social websites, and where scientific results do not always materialise as publications, but also as datasets or software. Instead of trying to achieve high impact factors by establishing government policies that would require researchers to deposit their results as Open Access, we need to develop a technical infrastructure that will be completely transparent and will enable us to establish new measures of scientific importance. At the same time, we need methods and tools that will provide analytical information, including trends, about the OA content. This will strengthen the argument for both academics and publishers to adopt Open Access as a default policy.

#open-access  #policy  #access  #debate 
 CORE uses the ParsCit system...

CORE uses the ParsCit system to extract citation information from the publications' full text. This information is used in turn to check whether the cited target documents are also present in the CORE aggregation, so that a link between the citing and cited publications can be created. Within the currently running DiggiCORE project, we aim to extract and release these citation networks for research purposes.

#publications  #information  #purpose  #aggregation  #link
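The linking step described above, matching parsed references back to records already in the aggregation, can be sketched as a normalised-title lookup. All names, the data shapes, and the exact-match rule are hypothetical simplifications; ParsCit itself emits richer structured reference data, and real systems match fuzzily:

```python
import re

def normalise(title):
    """Lowercase and collapse punctuation/whitespace for title matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def link_citations(extracted_refs, aggregated_records):
    """extracted_refs: (citing_id, raw_cited_title) pairs, e.g. from a
    reference parser. aggregated_records: record id -> title of harvested
    papers. Returns (citing_id, cited_id) links for references whose
    target paper is also present in the aggregation."""
    by_title = {normalise(t): doc_id for doc_id, t in aggregated_records.items()}
    links = []
    for citing_id, raw_title in extracted_refs:
        cited_id = by_title.get(normalise(raw_title))
        if cited_id is not None and cited_id != citing_id:
            links.append((citing_id, cited_id))
    return links
```

Accumulated over an entire aggregation, these links form the citation network the quote refers to; references whose targets fall outside the aggregation simply produce no edge.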