Cost to libraries: OA vs TA

Note: important update/correction.
In 2004, Philip Davis carried out a study of library costs in which he estimated the average subscription cost/article for a subset of ARL libraries and compared this with a range of estimated author-side fees for Gold OA, in order to determine whether libraries might pay more or less if all journals switched to OA. Here I’ve tried to update that study using information that wasn’t available back then.
Davis set the spreadsheet up to make it easy to update his assumptions and recalculate (kudos!), and Peter Suber (among others) pointed out that at least the following assumptions should be updated:

  1. all OA journals charge author-side fees
  2. the full cost of OA fees will be borne by libraries
  3. TA journals charge no author-side fees

We now have five different studies (one recently confirmed, improved and updated) showing that in fact the majority of OA journals do not charge author-side fees. The highest proportion of no-fee journals is in the DOAJ psychology subset (90%) and the lowest is in the chemistry subset (49-58%); the most recent analysis of the entire DOAJ showed 70% no-fee.
We also know that research funders are increasingly willing to foot the bill for OA. For example, HHMI has institutional agreements/memberships with BMC, Springer and Elsevier, and BMC’s page of funder policies shows that a majority of UK funders either make additional funds available or allow publication charges to be treated as an indirect cost. A recent RCUK report showed that 45% of authors publishing in fee-based OA journals had their costs covered by their research funders.
Rather than pick a single number for either of these updates, I’ve plotted the fraction of the OA cost borne by libraries against the number of institutions at which OA is predicted to cost more than, the same as, or less than the TA model. The fractional cost borne by libraries is the product of two proportions: (1 – fraction of fees covered by funders) × (fraction of OA journals charging fees). (See Figs 1 and 2 below.)
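As a rough sketch of that product (the function name and the example inputs are illustrative assumptions, not part of Davis’ spreadsheet), the calculation looks like this in Python:

```python
def fractional_cost(pct_funder_covered: float, pct_journals_charging: float) -> float:
    """Fraction of the nominal OA fee bill that falls on libraries.

    Both arguments are percentages in the range 0-100.
    """
    for pct in (pct_funder_covered, pct_journals_charging):
        if not 0 <= pct <= 100:
            raise ValueError("percentages must lie between 0 and 100")
    # (share NOT covered by funders) x (share of OA journals that charge fees)
    return (100 - pct_funder_covered) / 100 * (pct_journals_charging / 100)


# Illustrative inputs only: 45% funder coverage (treating the RCUK author
# figure as a cost share, which is an assumption) and 30% of OA journals
# charging fees (the complement of the 70% no-fee DOAJ figure).
example = fractional_cost(45, 30)
print(round(example, 3))  # 0.165
```

With those inputs libraries would bear roughly a sixth of the nominal fee bill; other combinations can be tried by changing the two arguments.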

We don’t know much about author-side fees at toll-access journals, but we do have some information. Firstly, the 2005 Kaufman-Wills report showed that more than 75% of the 247 toll-access journals in their sample charged author-side fees in addition to subscriptions. Secondly, I just had a rough-and-ready look at a small number of TA journals and found average author-side fees ranging from $400 to almost $3000. Finally, the NIH estimates (scroll to section L) that it spends over $30 million/year in author-side fees and funds the production of around 60,000 manuscripts. This means that the NIH is paying, on average, about $500/article in page charges. Since this is the largest sample we have, I’ve used this figure to update the spreadsheet. I added $500/article to the calculated serials expenditure/article and compared this adjusted TA cost/article to the OA costs.
Update: this was a mistake! The point of the exercise was to compare existing library subscription costs with predicted OA costs, and libraries are not currently paying the TA author-side fees. See this post for the correctly updated version of the Davis study. (Much later update: see this post; I wouldn’t put too much weight on those NIH figures given the nature of the sources.)
I’ve updated two further aspects of Davis’ spreadsheet. First, we now have better information about the actual range of author-side fees charged by those OA journals that do charge them. Rather than Davis’ $2500 – $5000 range, I’ve used $1300 (PLoS ONE) to $3000 (most of the high-profile hybrid programs). If the adjusted TA cost/article falls within this range, the prediction is that the OA and TA models cost about the same from a library point of view.
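The comparison for a single library can be sketched as follows. This is a guess at the logic rather than a copy of the spreadsheet: in particular, scaling the fee range by the fraction of the cost borne by libraries is my assumption, and the $2000 library below is hypothetical.

```python
OA_FEE_LOW = 1300   # USD, the PLoS ONE fee quoted in the post
OA_FEE_HIGH = 3000  # USD, the hybrid-program fee quoted in the post


def compare_models(ta_cost_per_article: float, fraction_borne: float = 1.0) -> str:
    """Classify one library's predicted OA cost relative to its TA cost.

    Assumption (not confirmed by the spreadsheet): the OA fee range is
    scaled by the fraction of the cost that libraries actually bear.
    """
    low = OA_FEE_LOW * fraction_borne
    high = OA_FEE_HIGH * fraction_borne
    if ta_cost_per_article < low:
        return "OA costs more"
    if ta_cost_per_article > high:
        return "OA costs less"
    return "about the same"


# Hypothetical library paying $2000 per article under the TA model.
print(compare_models(2000))        # about the same
print(compare_models(2000, 0.3))   # OA costs less
```

At a fraction of 0.3 the scaled range is $390 to $900, so the hypothetical $2000 library falls above it and is counted as saving money under OA.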
Second, Davis assumed that the scholarly literature made up 50% of library serials expenditures. I don’t know where this figure came from (the spreadsheet refers to a report which does not give any further information), but I think the real value is closer to 90%. My reasoning is based on my observation (see Table 2) that the average unit cost of a curated list of scholarly journals from UCOSC is about ten times the average unit cost of “all serials” from ACRL, ARL and NCES datasets. If that result is broadly representative, it means that scholarly journals must contribute either a small fraction or the vast majority of the cost. Here’s a simple explanation: suppose a library holds 1000 items at an average cost of $10, for a total of about $10,000; then the average cost of the scholarly items must be about $100 if the “10 x all serials” rule is accurate. So you can either have 90 scholarly items at about $100 and 910 non-scholarly items at about $1, or you can have one scholarly item at about $100 and 999 non-scholarly items at about $10. What you can’t have, for the averages to work out according to the “10 x” rule, is any ratio close to 50% scholarly/50% non-scholarly: 500 scholarly items at about $100 would alone cost $50,000, five times the total.
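The arithmetic of that example is easy to check. The sketch below only reproduces the two scenarios from the paragraph above, using its round numbers (1000 items, $10 overall average, $100 scholarly average); the function names are mine.

```python
def overall_average(n_scholarly, scholarly_price, n_other, other_price):
    """Average unit cost across all serials."""
    total = n_scholarly * scholarly_price + n_other * other_price
    return total / (n_scholarly + n_other)


def scholarly_share(n_scholarly, scholarly_price, n_other, other_price):
    """Share of total expenditure that goes to scholarly items."""
    scholarly_cost = n_scholarly * scholarly_price
    return scholarly_cost / (scholarly_cost + n_other * other_price)


def required_other_price(n_scholarly, total_items=1000, overall_avg=10,
                         scholarly_price=100):
    """Price the non-scholarly items would need for the averages to hold."""
    remaining_budget = total_items * overall_avg - n_scholarly * scholarly_price
    return remaining_budget / (total_items - n_scholarly)


# Scenario 1: 90 scholarly items at $100, 910 others at $1.
print(overall_average(90, 100, 910, 1))            # 9.91
print(round(scholarly_share(90, 100, 910, 1), 3))  # 0.908

# Scenario 2: one scholarly item at $100, 999 others at $10.
print(overall_average(1, 100, 999, 10))            # 10.09
print(round(scholarly_share(1, 100, 999, 10), 3))  # 0.01

# A 50/50 split by item count would need a negative price for the rest.
print(required_other_price(500))                   # -80.0
```

Both scenarios land close to the $10 overall average, while an even split by item count cannot, since the scholarly half alone would exceed the whole budget.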
Summary of updates:

  1. plot fractional cost borne by libraries to account for %OA journals that don’t charge fees and % OA costs borne by research funders (or other bodies)
  2. add $500/article to TA model costs to account for author-side fees charged in addition to subscriptions
  3. predicted OA fee range = $1300 to $3000
  4. assume scholarly literature makes up 90% of serials expenditure

The updated spreadsheet is here, and the end result is this:


At a fractional cost of 0.8, there are no libraries at which OA is predicted to cost more than the TA model, and at a fractional cost of 0.3 the OA model is predicted to cost less than the TA model at all 113 libraries.
To see how the %fee and %funder proportions affect the fractional cost borne by libraries, I constructed a simple matrix and highlighted the two cutoff points shown on the graph above:


As you can see, there are a number of perfectly reasonable combinations which result in a fractional cost of 0.3 or less, at which all the libraries in the sample would save money under the OA model. (This, by the way, is exactly what Peter Suber predicted.)
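For anyone who wants to rebuild that kind of matrix, here is a minimal sketch. The 10-point grid steps are my assumption (the original matrix may have used different ones); the 0.3 cutoff is the one quoted above.

```python
def fraction_matrix(fee_pcts, funder_pcts):
    """Map (%journals charging fees, %covered by funders) to the
    fractional cost borne by libraries."""
    return {
        (fee, funder): fee * (100 - funder) / 10_000
        for fee in fee_pcts
        for funder in funder_pcts
    }


steps = range(0, 101, 10)  # assumed grid: 0%, 10%, ..., 100%
matrix = fraction_matrix(steps, steps)

# Print the grid: rows are % of OA journals charging fees,
# columns are % of fee costs covered by funders.
print("fee\\funder " + " ".join(f"{f:>5}" for f in steps))
for fee in steps:
    print(f"{fee:>10} " + " ".join(f"{matrix[(fee, f)]:>5.2f}" for f in steps))

# Combinations at or below the 0.3 cutoff, where every library in the
# sample is predicted to save money under the OA model.
all_save = sorted(key for key, value in matrix.items() if value <= 0.3)
print(len(all_save), "of", len(matrix), "combinations are at or below 0.3")
```

For example, 50% of journals charging fees combined with 40% funder coverage gives exactly 0.3, so it sits on the cutoff.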
Update/correction: see this post.

One thought on "Cost to libraries: OA vs TA"

    #1: The vast majority of current (peer-reviewed) journal articles are not OA (Open Access) (neither Green OA nor Gold OA).
    #2: The vast majority of journals are not Gold OA.
    #3: The vast majority of journals are Green OA.
    #4: The vast majority of citations are to the top minority of articles (the Pareto/Seglen 90/10 rule).
    #5: The vast majority of journals (or journal articles) are not among the top minority of journals (or journal articles).
    #6: The vast majority of the top journals are not Gold OA.
    #7: The vast majority of the top journals are Green OA.
    #8: The vast majority of article authors would comply willingly with a Green OA mandate from their institutions and/or funders.
    #9: The vast majority of institutions and funders do not yet mandate Green OA.
    #10: The vast majority of Gold OA journals are not paid-publication journals.
    #11: The vast majority of the top Gold OA journals are paid-publication journals.
    #12: The vast majority of institutions do not have the funds to subscribe to all the journals their users need.
    CONCLUSION I: The fact that the vast majority of Gold OA journals are not paid-publication journals is not relevant if we are concerned about providing OA to the articles in the top journals.
    CONCLUSION II: Green OA, mandated by institutions and funders, is the vastly underutilized means of providing OA.
    CONCLUSION III: It is vastly more productive (of OA) for universities and funders to mandate Green OA than to fund Gold OA.
    Stevan Harnad
    American Scientist Open Access Forum

