Reprints Desk | The Content Workflow Company

Article Galaxy Blog

To Archive or Not to Archive

Posted by Ian Palmer on July 10, 2012

Many Reprints Desk customers – especially in the corporate sector – have recently inquired about options to leverage investments in scientific, technical, and medical (STM) papers purchased through Reprints Desk.

We thought you and others might find it useful to have an overview of a few of the copyright-compliant options that exist, so that you can find the offering that’s best for you.

So we’ve published this blog article that covers:

  • What is an article archive or repository?
  • What is the financial value? How much is perception?
  • How do archives work and how should they work?
  • Market options that your organization may want to consider


So let's get started...

What is an article archive or repository?
An article archive or repository is a database containing (usually copyrighted) PDFs of full-text papers. Archives are frequently composed of articles acquired through pay-per-use access such as document delivery, publisher tokens, or a publisher’s website.

However, we’re also aware of archives that have been built from more than pay-per-use channels alone. For example, Medical Affairs groups often maintain product literature databases for reactive re-use when responding to a consumer or healthcare professional (HCP) inquiry about a drug or device. Another example is in the realm of Regulatory Affairs, where articles may be stored as part of the pharmacovigilance (drug safety monitoring) process or in preparation for making regulatory submissions to agencies such as the FDA.

Because of the diversity in archive use cases, there are often one or more centralized archives within an organization and many “unofficial” personal or workgroup archives. There are also many archives that exist outside of the organization, especially within the academic community. These archives sometimes exist in reference management systems that are more socially “open” than solutions like Bibliogo. Inside or outside the organization, we’ve seen the number of stored papers range from thousands to tens of thousands.

What is the financial value? How much is perception?
Information aggregators like Reprints Desk and publishers with whom we work always reinforce to customers that the value of an archive is for legal sharing of legally acquired STM content within internal workgroups. As much as we stand on our soapbox with this message, there are actually a few popular reasons that customers tell us they’re interested in an article archive:

  • Enterprise-wide cost savings – essentially to eliminate unnecessary duplicate purchases of content they’ve already paid for.
  • Library collection development – for many reasons, but primarily easier access for end users
  • Text mining – represents R&D nirvana for many in biomedical research, since many search and discovery aspects can be automated or semi-automated

We’ve found that customer perception usually outweighs reality for the “cost savings” value of an archive. The analyses we’ve performed for customers show only a minor percentage of document orders are actually duplicate purchases. Furthermore, there are just a select number of articles responsible for the majority of re-purchases.
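If you want to run a similar check on your own order history, the sketch below shows one way to measure it. It is only an illustration and not the analysis Reprints Desk performs: it assumes you can export your orders as a list of article identifiers (DOIs here), and the function name and sample data are hypothetical.

```python
from collections import Counter


def duplicate_purchase_stats(article_ids):
    """Summarize repeat purchases in a list of ordered article identifiers."""
    counts = Counter(article_ids)
    total_orders = len(article_ids)
    # Every order beyond the first one for an article counts as a duplicate.
    duplicate_orders = sum(n - 1 for n in counts.values())
    # Articles bought more than once, most re-purchased first.
    repeat_articles = sorted(
        ((article_id, n - 1) for article_id, n in counts.items() if n > 1),
        key=lambda pair: pair[1],
        reverse=True,
    )
    duplicate_rate = duplicate_orders / total_orders if total_orders else 0.0
    return {
        "total_orders": total_orders,
        "duplicate_orders": duplicate_orders,
        "duplicate_rate": duplicate_rate,
        "repeat_articles": repeat_articles,
    }


# Hypothetical order log: one identifier per document delivery order.
orders = [
    "10.1000/a", "10.1000/b", "10.1000/a", "10.1000/c", "10.1000/a",
    "10.1000/d", "10.1000/e", "10.1000/b", "10.1000/f", "10.1000/g",
]
stats = duplicate_purchase_stats(orders)
print(f"{stats['duplicate_orders']} of {stats['total_orders']} orders were duplicates")
print("Most re-purchased:", stats["repeat_articles"])
```

The list of repeat articles is worth as much as the overall rate, because it shows whether a handful of titles accounts for most of the re-purchases.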

As for information access, building a library-based archive is often a tactic to provide end users with yet another convenient method of access. The results of our May 2012 survey question “What is the first action you PREFER to take when you need to obtain a scientific paper” indicate that users prefer to start with their organization’s collection when they know which article they want. This helps to prevent duplicate ordering before a document delivery request is ever placed, rather than catching a request that has already been submitted.

Growing an archive collection is also one of the ways to prepare for full-text data mining. Scientific papers acquired through document delivery represent just one of the potential content types for data mining; gene and protein expression data are another example. An internal archive is also only one possible mining target. Publisher servers are another, when access is licensed.

Regardless of the rhyme or reason, here at Reprints Desk we balance the customer imperative with copyright compliance. Our mantra is simple: Do the right thing.

We believe the only way for an organization to do this effectively and efficiently is to collaborate closely with information aggregators like Reprints Desk and with trusted copyright licensing organizations like the Copyright Clearance Center (CCC) and Copyright Licensing Agency (CLA).

As a content buyer, your decision about whether an archive is worth it financially may ultimately depend on your relationship with a reproduction rights organization like CCC or CLA (for example, whether you already have a license) and on your appetite for risk and steep financial penalties.

A few often-cited examples from case law include:

  • Lowry's Reports, Inc. v. Legg Mason, Inc. (jury verdict of nearly $20 million)
  • American Geophysical Union v. Texaco Inc. (seven-figure settlement)
  • Basic Books, Inc. v. Kinko's Graphics Corporation ($2 million of damages, fees and other costs)

Adopting a license with a reproduction rights organization is certainly one tactic to manage risk, as is ensuring that copyright compliance is part of your organization’s compliance DNA.


How do archives work and how should they work?
Article archives work in multiple ways. They can be managed manually, either for personal use or for a group through a mediated process. In that case, article PDFs are manually inserted into the archive, manually referenced, and manually pulled for self-viewing or to fulfill a request from a colleague internally. Archives can also be managed in a more automated fashion, wherein processes such as document delivery automatically store, check, and pull licensed articles for delivery.
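To make the automated store, check, and pull cycle concrete, here is a minimal sketch. It does not represent how Article Shelf or any other product is implemented. The `ArticleArchive` class, the `fulfill_request` function, and the rights and purchase callbacks are all hypothetical stand-ins, and the rights check is reduced to a simple yes or no answer per article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class ArticleArchive:
    """In-memory stand-in for an article archive, keyed by article identifier."""
    pdfs: Dict[str, bytes] = field(default_factory=dict)

    def has(self, article_id: str) -> bool:
        return article_id in self.pdfs

    def store(self, article_id: str, pdf: bytes) -> None:
        self.pdfs[article_id] = pdf

    def pull(self, article_id: str) -> bytes:
        return self.pdfs[article_id]


def fulfill_request(
    article_id: str,
    archive: ArticleArchive,
    may_store_and_reuse: Callable[[str], bool],  # hypothetical rights lookup
    purchase: Callable[[str], bytes],            # hypothetical document delivery call
) -> Tuple[bytes, str]:
    """Return (pdf, source), where source is "archive" or "purchase"."""
    # Rights are looked up on every request rather than cached.
    allowed = may_store_and_reuse(article_id)
    if allowed and archive.has(article_id):
        return archive.pull(article_id), "archive"
    pdf = purchase(article_id)
    if allowed:
        # Only keep a copy when the rights lookup permits storage and re-use.
        archive.store(article_id, pdf)
    return pdf, "purchase"


# Hypothetical usage with fake rights data and a fake purchase function.
purchases = []

def fake_purchase(article_id: str) -> bytes:
    purchases.append(article_id)
    return b"%PDF-" + article_id.encode()

licensed = {"10.1000/a"}
archive = ArticleArchive()
has_rights = lambda article_id: article_id in licensed

first = fulfill_request("10.1000/a", archive, has_rights, fake_purchase)
second = fulfill_request("10.1000/a", archive, has_rights, fake_purchase)
third = fulfill_request("10.1000/z", archive, has_rights, fake_purchase)
print(first[1], second[1], third[1])  # purchase archive purchase
```

In this sketch the second request for the licensed article is served from the archive, while the unlicensed article is purchased but never stored.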

At Reprints Desk, we believe automation is the most efficient, and the only viable long-term, approach to supporting an internal archive. Furthermore, we believe it is the best way to systematically ensure copyright compliance.

In addition, we believe that the source of the rights you reference matters. Copyright licensing organizations and the rights holders themselves are the only authoritative sources for accurate, real-time rights information in support of true copyright compliance. Rights for storing and re-using PDFs in specified ways are based on a set of data that is both granular and dynamic. Some major publishers do not permit any unlicensed sharing of article PDFs from their journals, even for copyright license holders. This makes a trustworthy rights management component an essential part of any archiving strategy.

Before you adopt an article archive or repository solution, we highly recommend that you ask the organizations you do business with, directly or indirectly (e.g., CCC, Elsevier, and others), about their policies and for their input on the providers who offer such solutions.


Market options that your organization may want to consider
In addition to shared drives and solutions like SharePoint and EndNote that we know are often used for article archiving, below are just a few of the other options that we think you may want to evaluate if you are interested in adopting, switching, or consolidating article archives at your organization.

  • Article Shelf from Reprints Desk – We integrate with customer-hosted repositories and also offer a Reprints Desk-managed version that is available to customers with RightSphere Plus or Premium from CCC. Neither solution requires any IT involvement or user training from your organization, since the solution works “behind the scenes.” There is no additional cost for integration with your own hosted repository, while the Reprints Desk-managed version can be activated for a minor document delivery service fee increase or an annual fee, whichever you prefer.
  • Bibliogo from Reprints Desk – Bibliogo is our award-winning journal article web app that combines current awareness, reference management, document delivery, and secure social collaboration at the citation level. Article Shelf and RightSphere can both be activated for Bibliogo premium customers, with articles delivered directly into a user’s account and attached to citations of interest. Bibliogo also offers users a copyright-compliant alternative to attaching a PDF to an email to alert a colleague or external contact about the existence of an article. There are account options that can be activated so users can email citations as well as tag and comment on citations.
  • QUOSA DocFlow from Elsevier – QUOSA provides literature management and archiving solutions that empower enterprises to share full-text scientific information. QUOSA DocFlow, one of many Reprints Desk integrations, quickly puts full-text PDFs of scientific literature on users’ screens and includes tools to reduce duplicate article purchases.
  • Pubget PaperStream from CCC – Pubget PaperStream and Collaboration services deliver PDFs instantaneously in a single integrated environment, from your internal folders and servers, your journal subscriptions, open access documents, and link resolvers.

We hope you find this update useful and encourage you to share your feedback and to contact us if you’d like any help or additional information. We also highly recommend tapping into trusted industry resources such as Outsell, FreePint and, of course, your peers!

Topics: document delivery, product literature database, Pharmacovigilance, copyright compliance, article archive, article repository, drug safety review, sharepoint
