Update and correction re: cost to libraries and author-side fees

In comments below, Peter Suber points out that the NIH has amended its estimates to $100 million/yr spent on author-side charges and 80,000 manuscripts funded — which brings the estimated average author-side fee to $1250, well in line with the individual journal estimates I made and the published figures I found. This is an important number because it is derived from a very large sample of the scholarly literature and casts a very different light on OA author-side fees than the one that TA publishers are wont to shine on their competitors. Compare, for instance, PLoS ONE at $1300, or the standard BMC charge of $1470 — for a couple hundred dollars more than the average cost of a TA publication, you can make your work free for all users to access, immediately and permanently. (It would be interesting to know what proportion of the $100 million is going to OA fees, though I doubt it would be large enough to make a significant dent in the average TA charge. Edit: according to Björk et al., less than 5% of all articles are available via no-embargo Gold OA; taking this into account, and assuming that the average Gold OA fee is triple the average TA fee, gives an average of $1136/article.) Much later update: see this post; I wouldn’t put too much weight on those NIH figures given the nature of the sources.
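For anyone who wants to check the arithmetic, here it is as a quick Python sketch; nothing below is new data, just the estimates quoted above.

```python
# Amended NIH estimates quoted above
nih_spend = 100_000_000     # author-side spending, $/yr
manuscripts = 80_000        # manuscripts funded per year

avg_fee = nih_spend / manuscripts
assert round(avg_fee) == 1250

# Bjork et al.: ~5% of articles are no-embargo Gold OA; assume the average
# Gold OA fee is triple the average TA fee. The blended average is then
#   0.95*ta + 0.05*(3*ta) = 1.1*ta, so:
ta_avg = avg_fee / (0.95 + 0.05 * 3)
print(round(ta_avg))        # -> 1136
```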
But! However! There is a flaw in my reasoning!
The problem is not with the estimate of author-side charges, but in my use of that estimate to update Philip Davis’ library costs study. The point of that study was to look at what libraries would pay in an all-OA model, which is why I used the fractional cost matrix1 and graph in the first place. See the problem? Libraries don’t pay the toll-access author-side charges, the NIH does! This makes the model a little artificial, perhaps, since *someone* has to pay those charges regardless of which journal levies them; nonetheless, the idea was to estimate practical library costs, so the TA author-side fees should not be included.
Here’s what the updated situation looks like with the subscription/article estimate NOT adjusted for TA author-side fees (see my earlier post for details of the calculations):


The fractional cost has to drop to 0.4 before there are no libraries predicted to pay more in the OA model — as I pointed out in the original post, there are numerous realistic combinations that will result in a fractional cost of 0.4 or lower:


The new figures also show that the fractional cost has to drop below 0.2 before all 113 libraries are predicted to save money in an OA model. That still seems to me to fall within a realistic range: 70% of journals in the DOAJ don’t charge author-side fees, and 45% of researchers in a recent RCUK study had their OA fees covered by their research funders, which gives a fractional cost of (1 – 0.45) × 0.30 = 0.165.
Nonetheless, it’s worth taking a quick look at the libraries which are predicted to pay about the same in the OA and TA models. At a fractional cost of 0.4, they are: UC Davis, UCLA and UC San Diego, Univ Colorado, Cornell, Harvard, Johns Hopkins, McGill, Univ Massachusetts, Univ Maryland, MIT, Univ Toronto, Univ Washington and Univ Wisconsin. At a fractional cost of 0.3, only UC Davis, UCLA, Harvard, McGill, Maryland and Washington remain in the “pay about the same” category.
It’s easy enough to guess what these universities have in common, and a simple analysis confirms it:


Shading the top six yellow and the next eight blue for visibility, and ranking the libraries according to FTE, serials expenditure and “estimated scholarly articles published”, reveals that the 14 “pay-same” libraries have only a slight tendency to be among the larger schools, but cluster very strongly at the high end of the “scholarly articles published” ranking. In other words, research-intensive schools that publish a lot may put more pressure on their libraries in the OA world (to the extent that libraries are likely to be asked to repurpose serials costs for OA charges).
It was partly in order to examine this particular concern in detail that Davis carried out his original study, and for the same reason I have updated it here with more recent estimates and assumptions. The newer numbers show that a realistic worst-case scenario is that the libraries in question (14 out of 113 total) don’t save any money in the OA model.
1 I neglected to mention in earlier posts that I got the %fee x %funded matrix idea (of which the fractional cost graph is an obvious extension) from Peter Suber. My apologies to Peter; I’m usually more careful about crediting sources.

Cost to libraries: OA vs TA

Note: important update/correction.
In 2004, Philip Davis carried out a study of library costs in which he estimated the average subscription cost/article for a subset of ARL libraries and compared this with a range of estimated author-side fees for Gold OA, in order to determine whether libraries might pay more or less if all journals switched to OA. Here I’ve tried to update that study using information that wasn’t available back then.
Davis set the spreadsheet up to make it easy to update his assumptions and recalculate (kudos!), and Peter Suber (among others) pointed out that at least the following assumptions should be updated:

  1. all OA journals charge author-side fees
  2. the full cost of OA fees will be borne by libraries
  3. TA journals charge no author-side fees

We now have five different studies (one recently confirmed, improved and updated) showing that in fact the majority of OA journals do not charge author-side fees. The highest proportion of no-fee journals is in the DOAJ psychology subset (90%) and the lowest is in the chemistry subset (49-58%); the most recent analysis of the entire DOAJ showed 70% no-fee.
We also know that research funders are increasingly willing to foot the bill for OA. For example, HHMI has institutional agreements/memberships with BMC, Springer and Elsevier, and BMC’s page of funder policies shows that a majority of UK funders either make additional funds available or allow publication charges to be treated as an indirect cost. A recent RCUK report showed that 45% of authors publishing in fee-based OA journals had their costs covered by their research funders.
Rather than pick a single number for either of these updates, I’ve plotted the fraction of the OA cost borne by libraries against the number of institutions at which OA is predicted to cost more than, the same as, or less than the TA model. The fractional cost borne by libraries is the product of (100 – %covered by funders)(%OA journals charging fees). (See Figs 1 and 2 below.)
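The fractional-cost calculation is simple enough to sketch in a few lines of Python; the function and the example inputs below just restate the definition above (30% of OA journals charging fees and 45% funder coverage are the DOAJ and RCUK figures discussed elsewhere in this post).

```python
def fractional_cost(frac_fee, frac_funded):
    """Fraction of the OA cost that falls on the library:
    (fraction of OA journals charging fees) x (fraction NOT covered by funders)."""
    return frac_fee * (1 - frac_funded)

# e.g. 30% of OA journals charge fees and funders cover 45% of the charges:
print(round(fractional_cost(0.30, 0.45), 3))   # -> 0.165

# The %fee x %funded matrix is just this function evaluated over a grid:
for fee in (0.2, 0.3, 0.5, 1.0):
    row = [round(fractional_cost(fee, f), 2) for f in (0.0, 0.25, 0.5, 0.75)]
    print(fee, row)
```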

We don’t know much about author-side fees at toll-access journals, but we do have some information. Firstly, the 2005 Kaufman-Wills report showed that more than 75% of the 247 toll-access journals in their sample charged author-side fees in addition to subscriptions. Secondly, I just had a rough-and-ready look at a small number of TA journals and found average author-side fees ranging from $400 to almost $3000. Finally, the NIH estimates (scroll to section L) that it spends over $30 million/year in author-side fees and funds the production of around 60,000 manuscripts. This means that the NIH is paying, on average, about $500/article in page charges. Since this is the largest sample we have, I’ve used this figure to update the spreadsheet. I added $500/article to the calculated serials expenditure/article and compared this adjusted TA cost/article to the OA costs.
Update: this was a mistake! The point of the exercise was to compare existing library subscription costs with predicted OA costs, and libraries are not currently paying the TA author-side fees. See this post for the correctly updated version of the Davis study. (Much later update: see this post; I wouldn’t put too much weight on those NIH figures given the nature of the sources.)
I’ve updated two further aspects of Davis’ spreadsheet. First, we now have better information about the actual range of author-side fees charged by those OA journals that do charge them. Rather than Davis’ $2500 – $5000 range, I’ve used $1300 (PLoS ONE) to $3000 (most of the high-profile hybrid programs). If the adjusted TA cost/article falls within this range, the prediction is that the OA and TA models cost about the same from a library point of view.
Second, Davis assumed that the scholarly literature made up 50% of library serials expenditures. I don’t know where this figure came from (the spreadsheet refers to a report which does not give any further information), but I think the real value is closer to 90%. My reasoning is based on my observation (see Table 2) that the average unit cost of a curated list of scholarly journals from UCOSC is about ten times the average unit cost of “all serials” from ACRL, ARL and NCES datasets. If that result is broadly representative it means that scholarly journals must contribute either a small fraction or the vast majority of the cost. Here’s a simple explanation: suppose 1000 items at an average cost of $10; then average cost of the scholarly items must be about $100 if the “10 x all serials” rule is accurate. So you can either have 90 scholarly items and 910 non-scholarly items at about $1, or you can have one scholarly item and 999 non-scholarly items at about $10. What you can’t have, for the averages to work out according to the “10 x” rule, is any ratio close to 50% scholarly/50% non-scholarly.
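Here’s the same argument as a toy computation: 1000 serials with an all-serials average of $10/item, and scholarly titles averaging ten times that ($100/item). The three scholarly-item counts tried below are illustrative, not data.

```python
n_items, avg_all, avg_sch = 1000, 10.0, 100.0
total = n_items * avg_all                     # $10,000 overall

results = {}
for n_sch in (1, 90, 500):
    sch_cost = n_sch * avg_sch                # scholarly share of the total
    # average the remaining items would have to cost for the books to balance:
    avg_rest = (total - sch_cost) / (n_items - n_sch)
    results[n_sch] = (avg_rest, sch_cost / total)

for n_sch, (avg_rest, share) in results.items():
    print(n_sch, round(avg_rest, 2), f"{share:.0%} of expenditure")
# 90 scholarly items -> the rest average ~$1.10, scholarly = 90% of cost;
# 500 scholarly items -> avg_rest goes negative: a 50/50 split is impossible
```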
Summary of updates:

  1. plot fractional cost borne by libraries to account for %OA journals that don’t charge fees and % OA costs borne by research funders (or other bodies)
  2. add $500/article to TA model costs to account for author-side fees charged in addition to subscriptions
  3. predicted OA fee range = $1300 to $3000
  4. assume scholarly literature makes up 90% of serials expenditure

The updated spreadsheet is here, and the end result is this:


At a fractional cost of 0.8, there are no libraries at which OA is predicted to cost more than the TA model, and at a fractional cost of 0.3 the OA model is predicted to cost less than the TA model at all 113 libraries.
To see how the %fee and %funder proportions affect the fractional cost borne by libraries, I constructed a simple matrix and highlighted the two cutoff points shown on the graph above:


As you can see, there are a number of perfectly reasonable combinations which result in a fractional cost of 0.3 or less, at which all the libraries in the sample would save money under the OA model. (This, by the way, is exactly what Peter Suber predicted.)
Update/correction: see this post.

Author-side fee comparison: OA vs TA.

I’ve posted a couple of times about the misconception that all OA journals charge author-side fees, and each time I’ve mentioned the Kaufman-Wills study which found that 75% of the toll-access journals they examined charged author-side fees in addition to subscription charges. I thought it would be useful to compare author-side fees charged by OA and TA journals.
It’s easy to work out what OA and hybrid journals charge; BMC maintains a detailed list of publisher article processing charges.  Here are some examples:

PLoS journals charge in three tiers:

PLoS ONE, $1300
PLoS Pathogens, NTDs, Genetics and Comp Biol, $2200
PLoS Biology and Medicine, $2850

BMC charges between $1105 and $2095 for most journals, and their standard charge is $1470
Hindawi charges between $275 and $850 for most of their journals, with a few titles up to $1400
Springer Open Choice, Wiley Funded Access and Elsevier’s Sponsored Articles all cost $3000. (*cough*)

What is much more difficult to determine is how much the average author is paying in author-side fees at toll-access journals, because the charge for a given article depends on number of pages and/or color figures, and in some cases also on whether supplementary information is included.
Below are a few examples; in each case for which I calculated a figure, I extracted the page and figure counts manually from a single issue. This is far too small a sample to be representative, but I’m just trying to get some kind of feel for the numbers. Further, the published figures I managed to find (indicated by footnotes) are consistent with my “calculated guesses”. Also, the NIH estimates (scroll to section L) that it spends “over $30 million annually in direct costs for publication and other page charges” and produces “roughly 50,000 – 70,000 manuscripts”, which means that the NIH is paying, on average, about $500/article in page charges. If around 8% of all new articles are Gold OA, that number goes up to about $543/article. If the Kaufman-Wills 75% figure is representative, then the average author-side fee being charged is $666/article, or $724/article if the %OA is taken into account. (Note that the %OA adjustment might be spurious and the estimated average slightly off, because we don’t know how much of the estimated $30 million is going to Gold OA fees.) Edit: according to Björk et al., only about 5% of all articles are available through Gold OA without an embargo period. Taking this into account, and assuming that the average Gold OA fee is triple the average TA fee, gives an average of $454/article, or $606/article on the Kaufman-Wills estimate.
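The chain of estimates in that paragraph is easier to follow as code; every input below is a figure quoted in the text (the NIH totals, the Kaufman-Wills 75%, Björk et al.’s ~5% Gold OA), and I’ve truncated to whole dollars the way the text does.

```python
spend, manuscripts = 30_000_000, 60_000   # NIH: >$30M/yr, midpoint of 50-70k papers

per_article = spend / manuscripts         # blended average page charge
assert round(per_article) == 500

# If ~8% of new articles are Gold OA (and so pay no TA page charges),
# the $30M is spread over the remaining 92%:
ta_only = int(spend / (manuscripts * 0.92))        # -> 543

# If only 75% of TA journals levy author-side fees (Kaufman-Wills),
# the journals that do charge must be averaging:
kw_avg = int(per_article / 0.75)                   # -> 666

# Bjork et al. variant: 5% Gold OA at triple the TA fee means
#   0.95*ta + 0.05*(3*ta) = 1.1*ta = $500, so:
ta_bjork = int(per_article / 1.1)                  # -> 454
print(ta_only, kw_avg, ta_bjork)
```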
Update: In comments, Peter Suber points out that the NIH has amended its estimates to $100 million/yr spent on author-side charges and 80,000 manuscripts funded — which brings the estimated average author-side fee to $1250, or $1136 after the Gold OA adjustment described above; if only 75% of TA journals are charging such fees, then those journals are charging on average $1515. Much later update: see this post; I wouldn’t put too much weight on those NIH figures given the nature of the sources.
This section became way too cluttered, so I’ve put a summary here and the details are below:
journal ……………………………… average author side fee
PNAS ……………………………………….. $1446
Science …………………………………….. $1019
Nature ……………………………………… $1669
Cell ……………………………………….. $2031
Cell Cycle ………………………………….. $756
EMBO J ……………………………………… $2974
Mol Biol Cell ……………………………….. $1829 1
American Physiological Society (14 journals) ……. $1000 2
Journal of Nutrition …………………………. $456
J Neuroscience ………………………………. $850 + color charges 2
Molecular Biology and Evolution ……………….. $922 3
Molecular Plant-Microbe Interactions …………… $1275 4
J Natural Res & Life Sci Education …………….. $400

1 official figures, 2006
2 official figures, current
3 official figures, 2008
4 official figures, 2000
The selection of journals is fairly random, just the first few that came to mind then whatever turned up when I was searching for things like “average page color charges”. They range from prestige to niche, and even the cheapest charge fees that amount to a significant fraction of Gold OA author-side fees.
It would be very interesting to extend this half-baked pilot study, but I think it would also be unavoidably labor intensive. Except for rare cases where publishers provide the numbers, there’s really no way to calculate average author-side fees based on page and figure counts except by doing those counts for a representative sample of issues in each journal. (Perhaps a passing statistician could help me figure out what would constitute a representative sample — perhaps sqrt(issues/year)?) Then you have to select which journals to investigate — perhaps high, middle and low ranked journals in a handful of broad categories? Finally, it’s pretty slow going, so I don’t think Mechanical Turk would be cost effective for this job — even if you could solve the problem of giving Turkers access to the journals. In the end I think you’d have to inflict the counting task on some hapless grad student or intern, who would probably find it easiest to sit in a library with a stack of journals and a spreadsheet.

—————————————-details of “calculated guesses” and official figures—————————————-
PNAS: $70/page, $250 for supplementary information, $300 per color figure or table

March 17 2009 vol 106 issue 11: 88 papers, pp 4079 to 4570; mean = 5.6 pages
5.6 pages = $392
10 of the 88 papers had no supplementary info, so the mean SI charge = (78/88) × $250 = $221
approx every 5-6th paper examined, 18 in total:
5 color figures ($1500) ii
4 color figures ($1200) iiiii i
3 color figures ($900)  ii
2 color figures ($600)  iiii
1 color figure  ($300)  ii
0 color figures ii
mean color cost = $833; mean total cost/article = $1446
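The same “calculated guess” as a quick computation (the fee schedule and tallies are the ones above; note the total comes out a dollar higher because the text truncates each component before summing):

```python
PAGE_RATE, SI_FEE, COLOR_FEE = 70, 250, 300   # PNAS charges quoted above

page_cost = 5.6 * PAGE_RATE                   # mean 5.6 pages -> $392
si_cost = (78 / 88) * SI_FEE                  # 78 of 88 papers had SI -> ~$221
tally = {5: 2, 4: 6, 3: 2, 2: 4, 1: 2, 0: 2}  # color figures: papers (18 sampled)
n = sum(tally.values())
color_cost = sum(figs * COLOR_FEE * k for figs, k in tally.items()) / n  # ~$833

total = page_cost + si_cost + color_cost
print(round(total))   # -> 1447 (~$1446 in the text)
```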

In 2004 Cozzarelli et al. suggested that around $2000/article would be needed to cover PNAS’  costs without subscription income.
Science: $650 for the first color figure, $450/color figure thereafter

March 20 2009 vol 323 issue 5921: 2 research articles, 11 reports:
4 color figures ($2000) iii
3 color figures ($1550) i
2 color figures ($1100) iiii
1 color figure ($650) ii
0 color figures iii
mean color cost = mean cost/article = $1019

Nature: £735 ($1072) for the first colour figure and £262.50 ($383) for each additional figure (note: “Inability to pay this charge will not prevent publication of colour figures judged essential by the editors”)

March 19 2009 vol 458 number 7236: 2 articles, 13 letters:
5 color figures ($2604) ii
4 color figures ($2221) iiii
3 color figures ($1838) iii
2 color figures ($1455) iii
1 color figure ($1072) i
0 color figures ii
mean color cost = mean cost/article = $1669

Cell: $1000 for the first color figure and $275 for each additional color figure.

March 20 2009 vol 135 number 6: 12 articles:
7 color figures ($2650) iii
6 color figures ($2375) iii
5 color figures ($2100) ii
4 color figures ($1825)
3 color figures ($1550) ii
2 color figures ($1275)
1 color figure  ($1000) ii
0 color figures
mean color cost = mean cost/article = $2031

J Neurosci: $850 for regular manuscripts, $450 for brief communications, color figures are free “when color is judged essential by the editors and when the first and last authors are members of the Society for Neuroscience”, otherwise $1,000 each.

March 18 2009 vol 29 issue 11: 28 articles; looked at 4 random articles, no. of color figs = 6, 8, 5, 1. Regular SfN membership is $160. I’m guessing most authors are members, but it’s still impossible to tell how much each paper is being charged for color.

Landes Bioscience (all journals): four pages free, then $80/page; $340 for the first color page and $150 for each additional color page (in print — color is free online)

Cell Cycle March 15 2009 vol 8 issue 6: 10 research reports, pp 870 – 949
pages = 5,12,6,5,6,6,8,5,8,9
pages charged = 1,8,2,1,2,2,4,1,4,5; total = 30, mean = 3 = $240
7 color figures ($1240)
6 color figures ($1090)
5 color figures ($940)
4 color figures ($790) iiii
3 color figures ($640) i
2 color figures ($490)
1 color figure  ($340) iiii
0 color figures i
mean color cost = $516; mean total cost/article = $756

EMBO J: $250/page over 6 pages, plus color charges: $650/figure for the first three figures, $432/figure for the next two, $2928 for six figures and $326 per additional figure thereafter.

March 18 2009 vol 28 number 6: 15 articles
pages = 10,8,10,10,13,8,10,13,13,10,8,9,10,12,12
pages charged = 4,2,4,4,7,2,4,7,7,4,2,3,4,6,6; total = 66, mean = 4.4 = $1100
9 color figures ($3906) i
7 color figures ($3254) ii
6 color figures ($2928) ii
5 color figures ($2814) ii
4 color figures ($2382) i
3 color figures ($1950)
2 color figures ($1300) ii
1 color figure  ($650) ii
0 color figures iii
mean color cost = $1874; mean total cost/article = $2974

Molecular Biology of the Cell: according to the Am Soc Cell Biol, in their 2006 publication “MBC and the Economics of Scientific Publishing” (available as a pdf from the linked page):

The average article published in MBC in 2006 was 11.7 pages long and included 2.9 color figures. With the 20% discount on page and color charges now offered to ASCB members, publishing such an article would cost the author $1,829.

(Regular ASCB membership is $130.) Interestingly, the same publication gives the following details of budgeted (projected?) journal revenue for 2008:


I don’t know how similar that breakdown would be for other journals, but it’s interesting that subscription revenue is roughly equal to page OR color charges — meaning that the average author would pay about 50% more if the journal switched to full cost recovery from author-side fees. This would put MBC’s author-side fees roughly on par with those charged by the top two PLoS tiers.
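The “about 50% more” claim is just this bit of algebra — if page charges (P), color charges (C) and subscriptions (S) each bring in roughly equal revenue, authors currently pay P + C, and full cost recovery means paying P + C + S:

```python
P = C = S = 1.0       # roughly equal revenue shares; the units don't matter
increase = (P + C + S) / (P + C)
print(increase)       # -> 1.5, i.e. about 50% more
```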
The American Physiological Society’s Author Choice (hybrid OA) fee is $3000 for review articles and $2000 for research articles; according to their FAQ this is because:

For research articles, the Author Choice fee was determined by calculating the real average cost ($3,000) of publishing an article in an APS journal, and subtracting the actual average amount already paid by authors in author fees (page charges and color fees). The Author Choice fee for review articles is $3,000, because there are no other fees paid by authors of review articles. The Author Choice fee was designed to completely cover the cost of publishing an article.

which indicates that the average author-side fee for the 14 journals published by the APS is $1000.
Journal of Nutrition: in this editorial, AC Ross gave some figures regarding costs:

On average, each published page costs about $465, and pages with color, $1300! Each published manuscript costs, on average, $3233. Page charges (starting at $70) and color charges to authors ($400 per figure) are only a fraction of the actual costs of publication. Institutional subscriptions remain a key factor in the financial success of professional society journals like JN.

Page charges are currently $75/page for the first 7 pages and $120/page thereafter, and color charges are still $400/figure.

March 2009 volume 139 issue 3: 29 articles
pages = 5,4,7,4,8,5,6,7,5,6,6,4,6,7,5,4,6,6,7,5,6,7,5,5,5,3,4,7,5
mean page charge = $415
1 color figure  ($400) iii
0 color figures iiiii iiiii iiiii iiiii iiiii i
mean color charge = $41; mean total cost/article = $456
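Since the J Nutrition schedule is piecewise (a higher rate kicks in after seven pages), here’s the computation spelled out; the rates and page counts are the ones quoted above, and as with PNAS the total lands a dollar off the text’s figure because of rounding order.

```python
def page_charge(pages, base=75, extra=120, threshold=7):
    """$75/page for the first 7 pages, $120/page thereafter."""
    return base * min(pages, threshold) + extra * max(pages - threshold, 0)

pages = [5,4,7,4,8,5,6,7,5,6,6,4,6,7,5,4,6,6,7,5,6,7,5,5,5,3,4,7,5]
mean_page = sum(page_charge(p) for p in pages) / len(pages)
mean_color = 3 * 400 / len(pages)   # 3 of the 29 articles had one $400 color figure

print(round(mean_page))               # -> 415
print(round(mean_page + mean_color))  # -> 457 (~$456 in the text)
```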

Molecular Biology and Evolution: in the 2008 Editor’s Report (pdf available here) the Society for Mol Biol and Evolution provided the following figures for MBE in 2008:

average article length: 10.1 pages
average number of color figures per article: 0.927

Current charges are $50/page plus $450 per color figure, giving an average cost/article of $922.
Phytopathology and Plant Disease: $50 per printed page for the first six pages and $80 per printed page for each additional page for members of The American Phytopathological Society and $130 per printed page for nonmembers. In addition, there is a $20 fee charged for each black-and-white figure or line drawing. Color charges are $500 for the first illustration, $500 for the second illustration, and $250 for the third and each subsequent color illustration in one article.
Molecular Plant-Microbe Interactions: $150 for the first 6 pages, $150/page or fraction thereof thereafter; color charges are $500 for the first illustration, $500 for the second, and $250 for the third and each subsequent color illustration in one article. In addition, there is a $20 fee charged for each black and white figure or line drawing.

The Society’s Reports of Publications from 2000 gives the following figures:
Phytopathology: average article = 7.3 pages, average color figs/article = ?
Plant Disease: average article = 5.4 pages, average color figs/article = ?
MPMI: average article = 9.4 pages, average color figs/article = 1.05; mean cost/article = $1275

(Regular membership in Am Phytopath Soc is $76.)
Journal of Natural Resources and Life Sciences Education: $350/article, $10 per table and $10 per figure plus $100/color page (print only; color is free online).

Vol 36, 2007: 17 articles, number of figs/tables = 1,3,6,7,12,4,5,4,8,8,5,5,9,1,2,2,4
only a couple had color figures; mean additional charge = $50, mean cost/article = $400



On FriendFeed, items move back up the temporal sequence when they get “likes” and comments, giving them extra chances to be noticed. In addition, a “like” or comment from one of your friends will bring an item into view even if posted by someone whose stream you don’t follow. The emerging mores of the system include leaving a one-word comment, bump, to indicate that one feels a particular item is worthy of wider attention — “bumping” the item up the queue, as it were.
That’s what I’m doing with this post. Richard Poynder is trying to put together a list of institutions and funding bodies which have established funds to pay for Gold Open Access:

I am trying to establish how many research institutions and funders have created Gold Open Access (Gold OA) authors funds, and would be grateful for input from others.
I am aware that the Wellcome Trust announced a scheme for paying OA publication fees for its grantees in 2006. But what other funders have introduced such schemes?
So far as research institutions are concerned, Peter Suber kindly provided me with the following list of those he knows have created Gold OA funds:
University of Amsterdam
University of Calgary
University of California, Berkeley
Delft University of Technology
ETH Zurich
Griffith University
University of Helsinki
Institute of Social Studies (Netherlands)
Lund University
University of North Carolina, Chapel Hill
University of Nottingham
University of Tennessee, Knoxville
Texas A&M University
Tilburg University
Wageningen University and Research Center
University of Wisconsin
However, I do not think this list is complete.

Richard also points out that it is probably useful to keep track of which Gold funds are complemented by a Green mandate, and makes the (imo excellent) suggestion of establishing a Gold Fund equivalent to ROARMAP, which tracks Green Mandates.
So — *bump* — please go read Richard’s post, and help him out if you can.
Update: Peter Suber has created and pre-populated the Open Access Directory list of journal OA funds, so if you have information please add it there.

That’s the way you do it!

Via Peter Suber, I am delighted to find that Stuart Shieber has started a weblog, and even more delighted that in one of his first entries he has turned my long-ago author-side fees DOAJ hack into an actual, readily reproducible study:

Here are the results computed by my software, as of May 26, 2009:

Charges.......................951  (23.14%)
No charges....................2889 (70.29%)
Information missing...........270  (6.57%)
Hybrid........................1519 (26.99%)

The numbers are consistent with those of Hooker’s study some 16 months earlier.

It’s great to have the numbers confirmed, and even better to be able to make regular updates and construct time series. Thanks to Stuart for doing it right, and for making the code freely available.
(Note, had to reformat the quoted table into ugly text, because I still can’t get MT to play nice. Grrr.)

What use are research patents?

DrugMonkey has a conversation going about the ongoing kerfuffle over (micro)blogging of conference presentations (see also the FriendFeed discussion). I want to go off on a tangent from something that came up in his comment thread, so rather than derail it I thought I’d post here.
In his first comment in the thread, David Crotty made the following claim:

Lots of researchers support their families and labs through money generated by patents, and most universities are heavily dependent upon their patent portfolios for funding.

That doesn’t accord with my (limited!) experience — I know a few researchers who hold multiple patents, and none of them ever made any money that way — and my general impression is that the return on investment for tech transfer offices and the like is fairly dismal.
This seems like the sort of beans that beancounters everywhere should be counting, so I asked on FriendFeed whether anyone knew of any data to address the question of whether universities really make much money from patents. Christina Pikas pointed me to the Association of University Technology Managers, whose 2007 Licensing Activity Survey is now available.
I extracted data for 154 universities and 27 hospitals and research institutions. Between them, in 2007, these institutions filed 11,116 patent applications, were awarded 3,512 patents, and gave rise to 538 start-up companies. I calculated licensing income as a percentage of research expenditure:


Apart from New York University (I wonder what they own that’s so profitable?), it’s clear that none of these universities are “heavily dependent upon their patent portfolios for funding”. In fact, more than half of them (78/154) made less than 1% of their research expenditure back in licensing income, and the great majority (144/154) made less than 10%.
Licensing income for Massachusetts General Hospital and “City of Hope National Medical Ctr. & Beckman Research” (whoever they are) amounted to 65-70% of research expenditure, but none of the other hospitals or research institutions made more than 20%. More than half of this group (15/27) made less than 2%, and most of them (23/27) made less than 10%.
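For anyone wanting to redo this from the AUTM spreadsheet, the calculation is nothing fancier than the sketch below — the institution names and dollar figures here are entirely made up for illustration, since the survey data isn’t reproduced in this post.

```python
# (name, licensing income, research expenditure) -- hypothetical records
records = [
    ("Example U", 2_500_000, 350_000_000),
    ("Sample Tech", 40_000_000, 300_000_000),
]

shares = {name: 100 * income / spend for name, income, spend in records}
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% of research expenditure recouped")

# counting institutions under a threshold, as in the text:
under_1pct = sum(1 for _, income, spend in records if income / spend < 0.01)
```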
The distribution looks just about as you would expect:


I also wondered whether there was any evidence that greater numbers of patents awarded, or more money spent per patent, resulted in higher licensing income. As you can see, the answer is no (insets show the same plots with the circled outliers removed):


I don’t know how representative this dataset is; there are several thousand universities and colleges in the US, and surely even more hospitals and research institutions, so the sample size is relatively small. It does include some big names, though – Harvard, Johns Hopkins, MIT, Stanford, U of California — and I would expect a list of schools answering the AUTM survey to be weighted towards those schools with an emphasis on tech transfer.
In any case, I’m not buying David’s assertion that “most universities”, or most hospitals or research institutes for that matter, rely heavily on licensing income. And that being so, I am also somewhat skeptical about the number of researchers’ families being supported by patents.
What’s the Open Science connection? Well, if you’re interested in patenting the results of your research, there are a lot of restrictions on how you can disseminate your results. You can’t keep an Open Notebook, or upload unprotected work to a preprint server or publicly-searchable repository, or even in many cases talk about the IP-related parts of your work at conferences. It seems from the data above that most universities would not be losing much if they gave up chasing patents entirely; nor would they be risking much future income, since so few seem to get significant funds from licensing. My own feeling is that any real or potential losses would be much more than offset by the gains in opportunities for collaboration and full exploitation of research data that come with an Open approach.
1. Christina left a comment pointing out that patents may be required for more than simply making money from licensing:

…an extremely important reason universities patent [is] to protect their work so that they may exploit it for future research… it turns out that universities have to patent in life sciences – even if they don’t actively market and license these patents – to be able to attract new research money from industry.

There are two distinct points here: first, that if you don’t patent you may not attract industry partners, and second, that if you don’t patent you may end up licensing your own tech back from someone else (I note that most tech licenses I know of are cheap or free “for research purposes” so the latter factor might not weigh so heavily). According to the 2007 AUTM data, industry investment in academic research amounted to about 7% of research expenditure and was up 15% over 2006.
2. David responded on DM’s thread with some counter evidence, on reading which I realise that the data above may (likely?) only show what the university received and not any money that went to the labs or researchers involved. Tech transfer may not be financially worth it for the university, except that it might still be doing good things for individual labs and PIs, and so would constitute a support service the university offers its research community. It also strikes me that my experience, such as it is, is mainly with Australian researchers, whereas David’s is in the US, so cultural differences may also apply.
3. More from Christina at her own place, here.
If you want the data, the spreadsheet I used is here.

What happened to serials prices in 1986-87? (Update: probably nothing.)

This could be nothing but an artifact (e.g. of the way the data were collected), but if you look at Fig 1 from this post, there’s a clear break in the serials expenses (EXPSER) curve that’s not evident in any of the others. Here’s the same plot reworked to emphasize what I’m talking about:


If you squint just right you can imagine a similar but much weaker effect, beginning a year or two later, in the total expenditures (TOTEXP) curve; and the salaries (TOTSAL) curve seems to start a similar upward trend at about the same time but then levels off after 1991 or so. I wouldn’t put any weight on either of those observations though — I’d never have noticed either if I hadn’t been comparing carefully with the EXPSER curve.
I’ve added linear regression lines for the 1976-1986 and 1987-2003 sections of the EXPSER data, just to emphasize the change in rate of increase. For those of you who will twitch until you know: the regression coefficients of the two lines are 0.99 and 0.98 respectively. If you extrapolate from just the 1976-86 section, TOTEXP exceeds the forecast for EXPSER after about 2000.
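The segment-fitting and extrapolation can be sketched as follows — note the series here is invented for illustration (a slope that steps up in 1987), not the actual ARL medians:

```python
import numpy as np

# Illustrative data: years and a hypothetical expenditure series whose
# growth rate steps up after 1986 (values are made up, not the ARL data).
years = np.arange(1976, 2004)
expser = np.where(years <= 1986,
                  100 + 8 * (years - 1976),    # slower early growth
                  180 + 20 * (years - 1987))   # faster later growth

# Fit separate trend lines to the 1976-1986 and 1987-2003 segments.
early = years <= 1986
m1, b1 = np.polyfit(years[early], expser[early], 1)
m2, b2 = np.polyfit(years[~early], expser[~early], 1)

# Extrapolate the early trend forward to see what the pre-break rate
# of increase would have forecast for a later year.
forecast_2000 = m1 * 2000 + b1
print(m1, m2, forecast_2000)
```

With real data you would also check each segment's fit (e.g. with `np.corrcoef`) before reading much into the break point.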
I have no idea if this means anything, but it is tempting to speculate. For instance: when did the big mergers begin in Big Publishing, and when did the big publishing companies start the odious practice of “bundling”, that is, selling their subscriptions in packages so that libraries are forced to subscribe to journals they don’t want just to get the ones they do?

Update: it’s probably nothing; the curve simply shows an increasing rate of increase, and you can break it up into at least five reasonably convincing-looking segments, with breaks including 86-87 and 94-95. It’s possible there were two “pricing events” around those times, but I think this is most likely just an illustration of what can happen when you look a little too hard for patterns in your data!
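The update’s point is easy to demonstrate: a series growing at a constant percentage rate, with no breaks in it at all, still carves up into line segments that each fit almost perfectly. A toy sketch (the constant 8%/yr growth rate and the break years are arbitrary choices for illustration):

```python
import numpy as np

# A smoothly accelerating series: constant 8%/yr growth, no breaks.
years = np.arange(1976, 2004)
series = 100 * 1.08 ** (years - 1976)

# Fit arbitrary segments and check how linear each one looks.
for lo, hi in [(1976, 1986), (1987, 1994), (1995, 2003)]:
    seg = (years >= lo) & (years <= hi)
    r = np.corrcoef(years[seg], series[seg])[0, 1]
    print(lo, hi, round(r, 3))  # every segment looks convincingly linear
```

Each segment’s correlation coefficient comes out above 0.97, even though the underlying process has no “events” in it — which is exactly the trap the update describes.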


Every little bit counts.

There are so many good causes, and so many of them are not just good but urgent — even assuming you have some money to spare, where are you to donate it? Everyone has their own solution to this problem. Mine is to try to hedge my bets: donate roughly equally to long- and short-term, local and global, human and environmental. I’m out of work and thoroughly skint right now, but I try to remember that by world standards I’m still living like a king; my budget includes some “don’t go insane” funds for occasional movies or dinners out or whatever, and I can always skip one of those in order to give just a little to some good cause.
One such is the Open Knowledge Foundation, which is turning five and asking for support:

This month the Open Knowledge Foundation is five years old.

Over those last five years we’ve done much to promote open access to information — from sonnets to stats, genes to geodata — not only in the form of specific projects like Open Shakespeare and Public Domain Works but also in the creation of tools such as KnowledgeForge and the Comprehensive Knowledge Archive Network, standards such as the Open Knowledge Definition, and events such as OKCon, designed to benefit the wider open knowledge community. (More about what we’ve been up to over just the last year can be found in our latest annual report).

While we have achieved a lot, we believe we can do much, much more. We are therefore reaching out to our community and asking you to help us take our vision further.

Our aim: at least 100 supporters committed to making regular, ongoing donations of £5 (EUR 6, $7.50) or more a month.

These funds will be essential in expanding and sustaining our work by allowing us to invest in infrastructure and employ modest central support. To pledge yourself as one of those supporters all you need to do is take 30 seconds to sign up to our “100 supporters” pledge at:


And if you want to act on the pledge right now (or make any other kind of donation), please visit: http://www.okfn.org/support/

We are and will remain a not-for-profit organization, built on the work of passionate volunteers, but these additional funds are essential in maintaining and extending our effort. Become a supporter and help us take our work forward!

I’m in no position to make a regular commitment, but I skipped a movie and sent ’em ten quid. It’s not much but it’s my hope that small donations can be a powerful force in the internet age. The other thing I can donate is publicity, which is what this post is for.
Why donate to OKF? My belief is that openness is not only our best weapon in the unending battle against bad actors and free riders, it is the key to a radically more efficient scientific process, which in turn is the key to all material progress in human quality of life.
The OKF not only builds tools and standards for open exchange of information, but they are also part of the front line effort to make openness and transparency into a constant, widely adopted habit of mind and of behaviour. To choose a topical example, we won’t have appropriate access to information about the spending habits of our elected officials until we are so in the habit of openness that it is a surprise and an affront to the average citizen to realise that such information is being kept secret. To choose my own bête noire as another example, we won’t be free of “data not shown” in the scientific literature until the majority of scientists respond to that phrase with an immediate and indignant “why the hell not?”.
So, support for the OKF is one of my long-term choices: an investment in a better future for everybody. If you have a couple of dollars to spare, please consider investing with me.

Pick an index, any index.

Over at The Scholarly Kitchen, Philip Davis takes the ARL to task for comparing their serials expenditures with the Consumer Price Index:

By adopting the CPI as a general frame of reference, almost any industry that requires huge professional worker input will look like it is spiraling out of control. Perhaps this is the reason the ARL uses the Consumer Price Index as a reference for journal prices when it could have used the Higher Education Price Index, the Producer Price Index, or an index which more closely resembles professional knowledge production.
The CPI is an excellent tool for collective salary bargaining, for estimating who should be eligible for food stamps or free school lunches. It is a very bad tool for measuring the purchasing power of libraries or justifying a reinvention of the journal publication system.

Since I’ve just played around with updating the famous graph to which Davis takes exception, I thought I’d better take a closer look at the alternative indices he suggests.
From the Commonfund 2008 HEPI Report (pdf; linked from here) I extracted historical HEPI and CPI data from 1976 to 2003, and from the ARL stats interface at U Virginia I extracted the median values for serials expenditures (EXPSER), total salaries expenditures (TOTSAL) and total expenditures (TOTEXP) for the same period (it was limitations in the ARL data range that dictated the time period). I also extracted Producer Price Index data for “all commodities” (PPI ALL) over the same period from the Bureau of Labor Statistics. There are lots of choices for PPI data, but most of them don’t go back as far as 1976. (I did try a couple of industries that I thought required “huge professional worker input”, such as hospitals and book publishers, but the data weren’t available for all the years I wanted — and by eyeball it didn’t look as though they showed much greater increase than the all commodities index.)
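For concreteness, “percent cumulative change” just means each series rescaled to its growth since the 1976 baseline, which is what makes indices with very different absolute values comparable on one plot. A minimal sketch, using made-up index values rather than the real CPI/HEPI/PPI figures:

```python
import numpy as np

# Made-up index values standing in for a price index over five years;
# these are NOT the real figures from the spreadsheet.
index = np.array([56.9, 60.6, 65.2, 72.6, 82.4])

# Percent cumulative change relative to the first (baseline) year.
pct_change = (index / index[0] - 1.0) * 100.0
print(pct_change.round(1))  # first value is 0, by construction
```

Applying the same transformation to each series (EXPSER, TOTSAL, TOTEXP, CPI, HEPI, PPI) puts them all on a common footing.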
Plotting percent cumulative change against time, we see:


There isn’t a lot of difference between the HEPI and the CPI, and the all commodities PPI shows even less increase. Davis suggests that salaries (professional worker input) are at least part of the reason why the CPI is a poor choice for comparison with serials costs, but (to the extent that the HEPI is a better “professional worker weighted” measure) the data do not bear him out. Neither does his claim regarding librarian salaries fit the data I have to hand:

If we plotted academic librarian salaries against the CPI, we could claim that the profession was in crisis, that salary growth was unsustainable, and that the system was simply broken.

It’s clear from the data, though, that library salary expenditures have outstripped the HEPI and CPI, but not by as much as total expenses and not by nearly as much as serials costs.
Remember, too, that this is still only part of the story: “serials” includes a great many publications whose costs have not increased at the same rate as the scholarly literature. The Abridged Index Medicus data I got from EBSCO only cover 1990 onwards, so I reworked the comparison to include the AIM data:


I used the AIM data because comparison with a much larger data set, broken down by individual discipline, showed that the AIM data gave what looks like a reasonable “middle value” — and as you can see, scholarly journal price increases outstrip all others, including total serials, by a considerable margin.
Note also that there’s little difference between “total salaries” and “professional salaries” — the professional salary data series (SALPRF) only goes back to 1986, which is why I’ve included it in this second graph.
None of this is to say that the CPI is the ideal comparison index against which to measure increases in the cost of the scholarly literature. It seems from the comparisons above, though, that there’s not much difference for this particular purpose between the CPI and the HEPI. While I don’t have data for publishing industry salaries, library salaries hew fairly closely to the HEPI and to total library expenditures. It therefore doesn’t seem that salaries have much to do with the much-bruited discrepancy between “general cost of living/doing business/whatever” increases and the rise and rise of the cost of scholarly literature.
If you want the data I used, the spreadsheet is here.

Motes, beams &c.

A while back, Philip Davis over at The Scholarly Kitchen posted about a small but useful research project of his:

All I did was ask five librarians at institutions administrating Open Access publication charges two simple questions:
“Can you provide a list of Open Access articles that you have supported through your author support program,” and “Have you rejected any requests to date?”

This is (to me) clearly information that such programs should be collating and reporting, and after two weeks Davis’ results were not exactly stellar:

Two weeks after asking my simple questions, I received just two short responses. No list, no numbers, but at least a few details: There was some confusion on the part of faculty of what an OA article publication charge really was. Some faculty requests were actually for page charges in conventional subscription journals; one faculty submitted a request for reprint charges; others submitted invoices to the library when they should have been directed to the external granting agency (like the HHMI). To date, no bona fide requests have been denied.

That’s useful information, as far as it goes, but it doesn’t go very far. Davis plays the conspiracy theory card way too hard for my taste, with “dark secrets” in the post title and an opening paragraph that reeks of melodrama:

You would have thought I was requesting a field manual for interrogating prisoners of war or a list of members on Dick Cheney’s Energy Taskforce. At least in those instances, I would have received a response that answering my questions violated national security or “executive privilege.”

Whoa, cowboy, back up a minute. As commenter Amanda R pointed out, we don’t know much about how Davis went about gathering the information:

As a point of clarification, were you directly refused data, or did libraries simply not respond? Did you contact them back and ask why there was no response, or if there was a reason they weren’t providing the full data you wanted?
Obviously, you deserve a professional response from the libraries you contacted. But, as much as it pains me to say it, I could easily imagine a library in which a request for statistics was bumped around internally for a few weeks before actually being answered.

In a Friendfeed discussion, librarian Christina Pikas made a related point:

the worst part of this is figuring out who you would send a request like that to. It takes me 10 e-mails and 3 phone calls to find the right person at my mothership main library. Almost seems that he’s taking confusion for malicious intent

as did commenter JQ Johnson:

when I in March queried the same institutions that Davis did, I got lots of cooperation. For example, UNC pointed me to a public letter (2/20/2009) to their vice chancellor that summarized in some detail the 12 requests they had funded to date. I’m puzzled why Davis got the response he did. Did he ask the wrong people?

Davis replied to both Amanda R and JQJ, but he gave non-answers containing no information about his methodology and insisted that what he had shown was a lack of transparency:

Whether the lack of response was caused by human error, technological barriers or internal policy, the result is a lack of transparency in how these author-support programs are performing.
These are all good questions but they skirt around the main issue of why I received only 2 responses, and why even these two responses were unable to provide me with any meaningful (even summarized or anonymized) data.

I found this very frustrating and left a comment1 aimed at clarifying why that was so:

JQJ’s comments and questions do not seem to me to skirt the issue at all, but rather to speak directly to alternative explanations for the lack of response. Methodological concerns are not trivial here.

  • Whom did you contact?
  • Did you say explicitly that you were sensitive to confidentiality issues and happy with various forms of anonymized data?
  • Did you phone anyone, or simply email?
  • How do you know your emails didn’t just end up in the spam bin?
  • Did you follow up (an unanswered question from Amanda, above)?

And so on. You have asked good questions, and have shown that routine reporting could be improved for such programs (already a useful outcome). But you need a good deal more evidence — including a more transparent methodology — before you go claiming there are “dark secrets” at work.

Now, it’s been almost two weeks since I left that comment, and it hasn’t appeared or been answered. What dark secrets is Philip Davis hiding? What dim, Crotty-esque ambitions of being the famous naysayer, the Nicholas Carr of Open Access, are forming even now in the troubled subconscious of this —
Or, you know, I just got stuck in the spam queue. It happens. 🙂
Davis finishes up by saying something relatively unexceptionable if taken out of the context of his insistence on ignoring both Occam’s and Hanlon’s razors:

Library Open Access policies cannot exist with secret budgets, ambiguous guidelines, and a practice of stonewalling requests for information.
Those who campaign for Open Access need to be held accountable just like everyone else, and budget transparency is the first step.

Exactly so — everyone else, including bloggers who wish to hold librarian feet to the accountability fire.

1I added the list formatting for this post, hoping for improved readability.