Perfect match?

Surely this:


You may find a technical report that you want to share with others or you think worthy of making broadly available on the Web to support the advancement of science. When you search for important science information in your area of interest, you can choose to sponsor the digitization of any adoptable technical report. The cost is $85 (approximately the same cost as ordering a hard copy). Discounts for multiples of 5 or more adoptions may be available. If you are interested in a larger scale project, please contact (865) 576-5699.

is a job for this guy:


… Most recently, Malamud has set up the nonprofit, headquartered in Sebastopol, California, to work for the publication of public domain information from local, state, and federal government agencies. Among his victories have been digitizing 588 government films for the Internet Archive and YouTube, publishing a 5 million page crawl of the Government Printing Office, and persuading the state of Oregon to not assert copyright over its legislative statutes.


(CC-BY image of Carl Malamud from Joe Hall via Wikimedia)

OA vs TA costs: I think I have finally got this straight.

I made some errors in the last few posts, making the information somewhat scrambled — my apologies. Here is what I hope is a clear picture of what we know about the relative costs of OA and TA publishing.
1. The NIH estimates that it pays $100 million/year in author-side charges, and supports the production of some 80,000 scholarly articles; that’s an average of $1250/article.
Update: Peter Suber points out that some fraction of that 80,000 articles did not use NIH funds, either because they were published in no-fee journals or because the authors found other ways to pay. I can’t think of any way to estimate the actual number of articles the $100 million paid for in order to adjust the estimated fee/article, but it’s worth remembering that it’s an underestimate. Much later update: see this post, I wouldn’t put too much weight on those NIH figures given the nature of the sources.
2. Björk et al. found that less than 5% of all articles worldwide are available through no-embargo Gold OA. We don’t know what proportion of the NIH’s $100 million went to Gold OA fees, nor what the average such fee might be. In order to be conservative, let’s assume that the average Gold OA fee is triple the average TA fee (it almost certainly isn’t that high). Then (if that 5% is evenly distributed) the NIH paid for (0.95×80000=) 76,000 articles at $average and 4,000 articles at 3x$average, bringing the average author-side charge for a TA article to $1136.
3. Philip Davis’ 2004 library costs spreadsheet estimates the average subscription charge per scholarly article at between $970 and $1750, depending on what proportion of the library serials budget is allocated to scholarly publications.


Davis’ original study estimated this proportion at 50% (on what basis I don’t know), but I think the real value is closer to 90%. My reasoning is based on my observation (see Table 2) that the average unit cost of a curated list of scholarly journals from UCOSC is about ten times the average unit cost of “all serials” from ACRL, ARL and NCES datasets. If that result is broadly representative it means that scholarly journals must contribute either a small fraction or the vast majority of the cost (see here for a brief explanation).
So that gives an estimated fee of between $2106 and $2886 per toll-access article. That money isn’t all coming from the same place — the NIH is paying author-side fees and libraries are paying subscriptions — but it’s all going to the same place, publisher coffers.
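For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation above. The function and variable names are mine, invented for illustration; the inputs are the NIH, Björk et al. and Davis figures quoted in this post, and the tripled Gold OA fee is the same deliberately conservative assumption.

```python
# All inputs are the figures quoted in the post; names are illustrative only.
NIH_SPEND = 100_000_000      # dollars per year on author-side charges
NIH_ARTICLES = 80_000        # articles supported per year
GOLD_OA_SHARE = 0.05         # Bjork et al.: under 5% no-embargo Gold OA
GOLD_FEE_MULTIPLE = 3        # assumption: Gold OA fee = 3 x average TA fee


def adjusted_ta_fee(spend, articles, oa_share, oa_multiple):
    """Average TA fee when a share of the articles pays a multiple of it."""
    ta_articles = articles * (1 - oa_share)
    oa_articles = articles * oa_share
    # spend = ta_articles * fee + oa_articles * oa_multiple * fee
    return spend / (ta_articles + oa_multiple * oa_articles)


raw_average = NIH_SPEND / NIH_ARTICLES
ta_fee = adjusted_ta_fee(NIH_SPEND, NIH_ARTICLES, GOLD_OA_SHARE, GOLD_FEE_MULTIPLE)

# Davis' subscription cost per article, low and high ends of the range.
SUBSCRIPTION_LOW, SUBSCRIPTION_HIGH = 970, 1750
total_low = SUBSCRIPTION_LOW + ta_fee
total_high = SUBSCRIPTION_HIGH + ta_fee

print(f"raw average author-side fee: ${raw_average:.0f}")
print(f"adjusted TA author-side fee: ${ta_fee:.0f}")
print(f"revenue per TA article: ${total_low:.0f} to ${total_high:.0f}")
```

Changing GOLD_FEE_MULTIPLE or GOLD_OA_SHARE shows how little the final range moves under different guesses.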
I’ve added a current (under)estimate of NIH costs for author-side fees, adjusted for a 2006 estimate of %OA by article, to a 2004 estimate of subscription fee/article, but I’m confident that the real cost (if I could get up-to-the-minute figures for all inputs) would be in the same ballpark.
Sure puts one-time, up-front Gold OA fees in a different perspective, doesn’t it? Here’s a reminder (stupid Impact Factors in brackets just because I know a lot of people still think they mean something even though they don’t):

average revenue¹ per toll-access article ………….. $2100 – $2900

BioMed Central
Genome Biology (6.6) ………………………………. $2250
BMC Biology (5.1) …………………………………. $1950
Molecular Cancer (3.7) …………………………….. $1710
Retrovirology (4.0) ……………………………….. $1390
J. of Cardiovascular Magnetic Resonance (1.9) ………… $1195

Hindawi
Comparative and Functional Genomics (1.6) ……………. $850
J. of Biomedicine and Biotechnology (1.9) ……………. $975
Mediators of Inflammation (1.2) …………………….. $975
Bioinorganic Chemistry and Applications (1.0) ………… $700

Public Library of Science
PLoS Biology (13.5), PLoS Medicine (12.6) ……………. $2850
PLoS Pathogens (9.3), Neglected Tropical Diseases (n/a),
Genetics (8.7) and Comp Biol (6.2) ………………….. $2200
PLoS ONE (n/a) ……………………………………. $1300

J. Medical Internet Research (3.6, best in field) …….. $1590
Biological Procedures Online (1.2) ………………….. $1250
J. of Clinical Investigation (16.9) ………………… ~$2500

1 Update: since D0r0th34 has already pointed out one dumb thing I did, neglecting other revenue streams available to TA but not OA publishers, I think that rather than continually update this post I’ll just go ahead and embed the FriendFeed discussion right here:

OA and strategy

Stuart Shieber recently gave a talk at Caltech, which prompted the following blogospheric exchange with Stevan Harnad (which I recommend highly if you are interested in Green vs Gold OA and the intricacies of OA mandate politics):

Harnad –> Shieber –> Harnad

followed by this related post on “proportion and strategy” from Prof Harnad, the main points of which he also left as a comment on a couple of my posts:

#1: The vast majority of current (peer-reviewed) journal articles are not OA (Open Access) (neither Green OA nor Gold OA).
#2: The vast majority of journals are not Gold OA.
#3: The vast majority of journals are Green OA.
#4: The vast majority of citations are to the top minority of articles (the Pareto/Seglen 90/10 rule).
#5: The vast majority of journals (or journal articles) are not among the top minority of journals (or journal articles).
#6: The vast majority of the top journals are not Gold OA.
#7: The vast majority of the top journals are Green OA.
#8: The vast majority of article authors would comply willingly with a Green OA mandate from their institutions and/or funders.
#9: The vast majority of institutions and funders do not yet mandate Green OA.
#10: The vast majority of Gold OA journals are not paid-publication journals.
#11: The vast majority of the top Gold OA journals are paid-publication journals.
#12: The vast majority of institutions do not have the funds to subscribe to all the journals their users need.
CONCLUSION I: The fact that the vast majority of Gold OA journals are not paid-publication journals is not relevant if we are concerned about providing OA to the articles in the top journals.
CONCLUSION II: Green OA, mandated by institutions and funders, is the vastly underutilized means of providing OA.
CONCLUSION III: It is vastly more productive (of OA) for universities and funders to mandate Green OA than to fund Gold OA.

I think there is a considerable strategic error embedded in those premises and the conclusions which follow, the basis of which is the emphasis on “the top minority of journals (or journal articles)”. The 90/10 rule is not relevant: the goal of OA is 100% OA, not 10% — not even “the top” 10% in which is concentrated 90% of whatever your metrics are measuring.
Much of the potential of OA lies in the provision of a comprehensive corpus of information on which to build the semantic web. Comprehensivity matters, because just as re-use beyond the scope of the original author’s imagination is a primary impetus for information sharing between humans, it is folly to imagine that we can determine ahead of time what will matter to machines — that is, which articles will be crucial to finding new and unexpected connections in text- and data-mining initiatives. The more complete the corpus, the more likely we can refine from it insights that are currently unpredictable.
Also, in an odd bit of circularity, 100% OA is vital to the development of rich, fine-grained, multiply cross-validated metrics that will likely be more reliable than existing metrics in guiding management decisions and researcher information searches. If we focus on “the top” journals and articles, we hamstring our best strategy for improving the methods with which we identify quality in the first place.
It’s also worth addressing claim #11 separately. For the direct argument against the assertion that most of the “top” Gold OA journals charge fees, see Peter Suber:

If this is a claim about quality, or about future submission patterns, as opposed to present submission patterns, then it’s an assumption for which there is no evidence.  Nobody has done the studies. […] In the absence of studies, this is all we know:

[T]here are strong and weak OA journals, just as there are strong and weak TA journals. Hence, any analysis focusing on weak OA journals and strong TA journals (as if to show the superiority of TA journals) would be as arbitrary as one focusing on weak TA journals and strong OA journals (as if to show the superiority of OA journals). Without some additional argument showing that the journals on which they focus are typical of their breeds, they would be guilty of cherry-picking and generalizing from an unrepresentative sample.

There is, however, a neglected and (in my opinion) important counter-argument: even if that assertion is true, it is surely equally or more the case that the vast majority of toll-access journals charge author-side fees in addition to subscription charges. A 2005 Kaufman-Wills study found that 75% of TA journals in their sample charged author-side fees. There is at least as much reason to suppose that the top-ranked TA journals are to be found among the fee-charging cohort as there is to suppose the same of OA journals.
The NIH estimates that it pays author-side fees to the tune of $100 million per year, and funds the publication of some 80,000 scholarly articles. Assuming, in order to be conservative, 5% Gold OA at fees that are triple the average TA fee, that averages out to $1136/article, but what’s sauce for the TA goose is sauce for the OA gander: if the Kaufman-Wills figures are broadly representative then those TA journals that charge additional author-side fees are charging, on average, $1515 per article. That’s more than PLoS ONE, more than most BMC journals and more than any Hindawi journal.
It follows that, since we are not — that is, I argue that we should not be — “concerned about providing OA to the articles in the top journals”, the fact that most Gold OA journals do not charge fees is in fact relevant to all strategies for increasing OA to the research literature.
I think I disagree with the second conclusion also: in the most comprehensive study so far, about 8% of articles published in 2006 were available via Gold OA, whereas a further 11% were available as self-archived copies. I agree, of course, that both are vastly underutilized relative to the goal of 100% OA, but it doesn’t seem to me that Green suffers more neglect than Gold.
Given the flaws in some premises and the first two conclusions, I don’t believe that conclusion 3 stands up either. I find Stuart Shieber’s argument for the Harvard model compelling:

In summary, a university that commits to the open access compact¹ will more easily be able to answer objections against green OA policies specifically because it has an approach to long-range support for gold OA publishing, not in spite of it. The two models are inextricably tied. I, like Professor Harnad, am interested in facilitating the adoption of green OA policies. I proposed the open access compact in large part because I expect that adoption of the compact will lead to more green OA policies. The open access compact is therefore contributory to the promotion of green OA, not a sidetrack to it. I of course encourage universities to adopt green OA policies before gold OA support, but given that dystopian fears of faculty are preventing adoption of such policies, an open access compact that might assuage these worries should not be delayed.

1 The compact simply states that “The university commits to underwrite reasonable article processing fees for open-access journals for which funds are not otherwise available”.
Given all of the above, the optimal strategy seems to me to be the one adopted by Harvard: a Green OA mandate and careful (fiscally responsible) support for Gold OA.

Update and correction re: cost to libraries and author-side fees

In comments below, Peter Suber points out that the NIH has amended its estimates to $100 million/yr spent on author-side charges and 80,000 manuscripts funded — which brings the estimated average author-side fee to $1250, well in line with the individual journal estimates I made and the published figures I found. This is an important number because it is derived from a very large sample of the scholarly literature and casts a very different light on OA author-side fees than the one that TA publishers are wont to shine on their competitors. Compare, for instance, PLoS ONE at $1300, or the standard BMC charge of $1470 — for a couple hundred dollars more than the average cost of a TA publication, you can make your work free for all users to access, immediately and permanently. (It would be interesting to know what proportion of the $100 million is going to OA fees, though I doubt it would be large enough to make a significant dent in the average TA charge. Edit: according to Björk et al., less than 5% of all articles are available via no-embargo Gold OA; taking this into account, and assuming that the average Gold OA fee is triple the average TA fee, gives an average of $1136/article.) Much later update: see this post, I wouldn’t put too much weight on those NIH figures given the nature of the sources.
But! However! There is a flaw in my reasoning!
The problem is not with the estimate of author-side charges, but in my use of that estimate to update Philip Davis’ library costs study. The point of that study was to look at what libraries would pay in an all-OA model, which is why I used the fractional cost matrix¹ and graph in the first place. See the problem? Libraries don’t pay the toll-access author-side charges, the NIH does! This makes the model a little artificial, perhaps, since *someone* has to pay those charges regardless of which journal levies them; nonetheless, the idea was to estimate practical library costs, so the TA author-side fees should not be included.
Here’s what the updated situation looks like with the subscription/article estimate NOT adjusted for TA author-side fees (see my earlier post for details of the calculations):


The fractional cost has to drop to 0.4 before there are no libraries predicted to pay more in the OA model — as I pointed out in the original post, there are numerous realistic combinations that will result in a fractional cost of 0.4 or lower:


The new figures also show that the fractional cost has to drop below 0.2 before all 113 libraries are predicted to save money in an OA model. That still seems to me to fall within a realistic range, given that 70% of journals in the DOAJ don’t charge author-side fees (so 30% do) and 45% of researchers in a recent RCUK study had their OA fees covered by their research funders (so 55% did not), for a fractional cost of 0.30 × 0.55 = 0.165.
Nonetheless, it’s worth taking a quick look at the libraries which are predicted to pay about the same in the OA and TA models. At a fractional cost of 0.4, they are: UC Davis, LA and San Diego, Univ Colorado, Cornell, Harvard, Johns Hopkins, McGill, Univ Massachusetts, Univ Maryland, MIT, Univ Toronto, Univ Washington and Univ Wisconsin. At a fractional cost of 0.3, only UC Davis, UCLA, Harvard, McGill, Maryland and Washington remain in the “pay about the same” category.
It’s easy enough to guess what these universities have in common, and a simple analysis confirms it:


Shading the top six yellow and the next 8 blue for visibility and ranking the libraries according to FTE, serials expenditure and “estimated scholarly articles published” reveals that the 14 “pay-same” libraries have only a slight tendency to be among the larger schools, but cluster very strongly at the high end of the “scholarly articles published” ranking. In other words, research-intensive schools that publish a lot may put more pressure on their libraries in the OA world (to the extent that libraries are likely to be asked to repurpose serials costs for OA charges).
Among other things, it was in order to examine this particular concern in detail that Davis carried out his original study, and for the same reason I have here updated it with more recent estimates and assumptions. The newer numbers show that a realistic worst-case scenario is that the libraries in question (14 out of 113 total) don’t save any money in the OA model.
1 I neglected to mention in earlier posts that I got the %fee x %funded matrix idea (of which the fractional cost graph is an obvious extension) from Peter Suber. My apologies to Peter; I’m usually more careful about crediting sources.

Cost to libraries: OA vs TA

Note: important update/correction.
In 2004, Philip Davis carried out a study of library costs in which he estimated the average subscription cost/article for a subset of ARL libraries and compared this with a range of estimated author-side fees for Gold OA, in order to determine whether libraries might pay more or less if all journals switched to OA. Here I’ve tried to update that study using information that wasn’t available back then.
Davis set the spreadsheet up to make it easy to update his assumptions and recalculate (kudos!), and Peter Suber (among others) pointed out that at least the following assumptions should be updated:

  1. all OA journals charge author-side fees
  2. the full cost of OA fees will be borne by libraries
  3. TA journals charge no author-side fees

We now have five different studies (one recently confirmed, improved and updated) showing that in fact the majority of OA journals do not charge author-side fees. The highest proportion of no-fee journals is in the DOAJ psychology subset (90%) and the lowest is in the chemistry subset (49-58%); the most recent analysis of the entire DOAJ showed 70% no-fee.
We also know that research funders are increasingly willing to foot the bill for OA. For example, HHMI has institutional agreements/memberships with BMC, Springer and Elsevier, and BMC’s page of funder policies shows that a majority of UK funders either make additional funds available or allow publication charges to be treated as an indirect cost. A recent RCUK report showed that 45% of authors publishing in fee-based OA journals had their costs covered by their research funders.
Rather than pick a single number for either of these updates, I’ve plotted the fraction of the OA cost borne by libraries against the number of institutions at which OA is predicted to cost more than, the same as, or less than the TA model. The fractional cost borne by libraries is the product of (100 – %covered by funders)(%OA journals charging fees). (See Figs 1 and 2 below.)
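A minimal sketch of that product, for readers who want to try their own combinations. The function name and the 20-point grid are my own illustrative choices; the 30% and 45% inputs are the DOAJ and RCUK figures mentioned above, read as percentages.

```python
def fractional_cost(pct_funded, pct_fee_charging):
    """Fraction of the all-OA cost left for libraries to cover.

    pct_funded: percent of fee-based OA articles paid for by funders.
    pct_fee_charging: percent of OA journals that charge author-side fees.
    """
    return (100 - pct_funded) * pct_fee_charging / 10_000


# Illustrative grid: rows are % covered by funders, columns are % of OA
# journals charging fees, both in steps of 20 (the step size is arbitrary).
steps = list(range(0, 101, 20))
matrix = [[fractional_cost(funded, fee) for fee in steps] for funded in steps]

print("funded\\fee " + " ".join(f"{fee:>5}" for fee in steps))
for funded, row in zip(steps, matrix):
    print(f"{funded:>10} " + " ".join(f"{value:>5.2f}" for value in row))

# DOAJ: 70% of journals charge no fee, so 30% do; RCUK: 45% funder-covered.
doaj_rcuk = fractional_cost(45, 30)
print(f"DOAJ/RCUK combination: {doaj_rcuk:.3f}")
```

The worst case (no funder support, every journal charging) is 1.0, and either input alone can drive the fraction to zero.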

We don’t know much about author-side fees at toll-access journals, but we do have some information. Firstly, the 2005 Kaufman-Wills report showed that more than 75% of the 247 toll-access journals in their sample charged author-side fees in addition to subscriptions. Secondly, I just had a rough-and-ready look at a small number of TA journals and found average author-side fees ranging from $400 to almost $3000. Finally, the NIH estimates (scroll to section L) that it spends over $30 million/year in author-side fees and funds the production of around 60,000 manuscripts. This means that the NIH is paying, on average, about $500/article in page charges. Since this is the largest sample we have, I’ve used this figure to update the spreadsheet. I added $500/article to the calculated serials expenditure/article and compared this adjusted TA cost/article to the OA costs.
Update: this was a mistake! The point of the exercise was to compare existing library subscription costs with predicted OA costs, and libraries are not currently paying the TA author side fees. See this post for the correctly updated version of the Davis study. ( Much later update: see this post, I wouldn’t put too much weight on those NIH figures given the nature of the sources.)
I’ve updated two further aspects of Davis’ spreadsheet. First, we now have better information about the actual range of author-side fees charged by those OA journals that do charge them. Rather than Davis’ $2500 – $5000 range, I’ve used $1300 (PLoS ONE) to $3000 (most of the high-profile hybrid programs). If the adjusted TA cost/article falls within this range, the prediction is that the OA and TA models cost about the same from a library point of view.
Second, Davis assumed that the scholarly literature made up 50% of library serials expenditures. I don’t know where this figure came from (the spreadsheet refers to a report which does not give any further information), but I think the real value is closer to 90%. My reasoning is based on my observation (see Table 2) that the average unit cost of a curated list of scholarly journals from UCOSC is about ten times the average unit cost of “all serials” from ACRL, ARL and NCES datasets. If that result is broadly representative it means that scholarly journals must contribute either a small fraction or the vast majority of the cost. Here’s a simple explanation: suppose 1000 items at an average cost of $10; then average cost of the scholarly items must be about $100 if the “10 x all serials” rule is accurate. So you can either have 90 scholarly items and 910 non-scholarly items at about $1, or you can have one scholarly item and 999 non-scholarly items at about $10. What you can’t have, for the averages to work out according to the “10 x” rule, is any ratio close to 50% scholarly/50% non-scholarly.
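The toy example can be checked with a few lines of code. This is only a sketch of the 1000-item illustration above, not of any real library budget; the names are hypothetical and the "10 x all serials" ratio is taken as given.

```python
TOTAL_ITEMS = 1000
OVERALL_AVG = 10.0                  # average cost over all serials, dollars
SCHOLARLY_AVG = 10 * OVERALL_AVG    # the "10 x all serials" rule


def non_scholarly_avg(scholarly_items):
    """Average cost the non-scholarly items must have, or None if impossible."""
    total_cost = TOTAL_ITEMS * OVERALL_AVG
    remaining = total_cost - scholarly_items * SCHOLARLY_AVG
    if remaining < 0:
        return None  # scholarly items alone would exceed the whole budget
    return remaining / (TOTAL_ITEMS - scholarly_items)


def scholarly_cost_share(scholarly_items):
    """Share of total spending that goes to the scholarly items."""
    return scholarly_items * SCHOLARLY_AVG / (TOTAL_ITEMS * OVERALL_AVG)


for n in (1, 90, 500):
    avg = non_scholarly_avg(n)
    if avg is None:
        print(f"{n} scholarly items: not possible under the 10x rule")
    else:
        print(f"{n} scholarly items: others average ${avg:.2f}, "
              f"scholarly share of spend {scholarly_cost_share(n):.0%}")
```

With 90 scholarly items the rest average about $1.10 and the scholarly titles take 90% of the spend; with one scholarly item the rest average about $9.91; an even split by item count cannot satisfy the rule at all.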
Summary of updates:

  1. plot fractional cost borne by libraries to account for %OA journals that don’t charge fees and % OA costs borne by research funders (or other bodies)
  2. add $500/article to TA model costs to account for author-side fees charged in addition to subscriptions
  3. predicted OA fee range = $1300 to $3000
  4. assume scholarly literature makes up 90% of serials expenditure

The updated spreadsheet is here, and the end result is this:


At a fractional cost of 0.8, there are no libraries at which OA is predicted to cost more than the TA model, and at a fractional cost of 0.3 the OA model is predicted to cost less than the TA model at all 113 libraries.
To see how the %fee and %funder proportions affect the fractional cost borne by libraries, I constructed a simple matrix and highlighted the two cutoff points shown on the graph above:


As you can see, there are a number of perfectly reasonable combinations which result in a fractional cost of 0.3 or less, at which all the libraries in the sample would save money under the OA model. (This, by the way, is exactly what Peter Suber predicted.)
Update/correction: see this post.

Author-side fee comparison: OA vs TA.

I’ve posted a couple of times about the misconception that all OA journals charge author-side fees, and each time I’ve mentioned the Kaufman-Wills study which found that 75% of the toll-access journals they examined charged author-side fees in addition to subscription charges. I thought it would be useful to compare author-side fees charged by OA and TA journals.
It’s easy to work out what OA and hybrid journals charge; BMC maintains a detailed list of publisher article processing charges.  Here are some examples:

PLoS journals charge in three tiers:

PLoS ONE, $1300
PLoS Pathogens, NTDs, Genetics and Comp Biol, $2200
PLoS Biology and Medicine, $2850

BMC charges between $1105 and $2095 for most journals, and their standard charge is $1470
Hindawi charges between $275 and $850 for most of their journals, with a few titles up to $1400
Springer Open Choice, Wiley Funded Access and Elsevier’s Sponsored Articles all cost $3000. (*cough*)

What is much more difficult to determine is how much the average author is paying in author-side fees at toll-access journals, because the charge for a given article depends on number of pages and/or color figures, and in some cases also on whether supplementary information is included.
Below are a few examples; in each case for which I calculated a figure, I extracted the page and figure counts manually from a single issue. This is far too small a sample to be representative, but I’m just trying to get some kind of feel for the numbers. Further, the published figures I managed to find (indicated by footnotes) are consistent with my “calculated guesses”. Also, the NIH estimates (scroll to section L) that it spends “over $30 million annually in direct costs for publication and other page charges” and produces “roughly 50,000 – 70,000 manuscripts”, which means that the NIH is paying, on average, about $500/article in page charges. If around 8% of all new articles are Gold OA, that number goes up to about $543/article. If the Kaufman-Wills 75% figure is representative, then the average author-side fee being charged is $666/article, or $724/article if the %OA is taken into account. (Note that the %OA adjustment might be spurious and the estimated average slightly off, because we don’t know how much of the estimated $30 million is going to Gold OA fees.) Edit: according to Björk et al., only about 5% of all articles are available through Gold OA without an embargo period. Taking this into account, and assuming that the average Gold OA fee is triple the average TA fee, gives an average of $454/article, or $606/article on the Kaufman-Wills estimate.
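Since that paragraph packs in several variants of the same estimate, here is a rough sketch that reproduces them. The helper names are mine; the two adjustment models are my reading of the calculations above (the 8% version treats Gold OA articles as paid for from elsewhere, the 5% version assumes they are paid from the same pot at triple the TA fee). The figures quoted in the paragraph appear to be truncated to whole dollars rather than rounded.

```python
NIH_SPEND = 30_000_000     # "over $30 million annually"
NIH_ARTICLES = 60_000      # midpoint of "roughly 50,000 - 70,000 manuscripts"
SHARE_CHARGING = 0.75      # Kaufman-Wills: share of TA journals charging fees


def fee_excluding_oa(spend, articles, oa_share):
    """Average fee if the Gold OA share of articles drew nothing from spend."""
    return spend / (articles * (1 - oa_share))


def fee_with_oa_multiple(spend, articles, oa_share, multiple):
    """Average TA fee if Gold OA articles cost `multiple` times the TA fee."""
    return spend / (articles * ((1 - oa_share) + multiple * oa_share))


def fee_among_chargers(average_fee, share_charging):
    """Average fee at the journals that actually charge one."""
    return average_fee / share_charging


raw = NIH_SPEND / NIH_ARTICLES
excluding_8 = fee_excluding_oa(NIH_SPEND, NIH_ARTICLES, 0.08)
triple_5 = fee_with_oa_multiple(NIH_SPEND, NIH_ARTICLES, 0.05, 3)

for label, fee in [("raw", raw), ("8% OA excluded", excluding_8),
                   ("5% OA at triple fee", triple_5)]:
    charged = fee_among_chargers(fee, SHARE_CHARGING)
    print(f"{label}: ${fee:.2f} overall, ${charged:.2f} at fee-charging journals")
```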
Update: In comments, Peter Suber points out that the NIH has amended its estimates to $100 million/yr spent on author-side charges and 80,000 manuscripts funded, which brings the raw average author-side fee to $1250, or $1136 after the same adjustment for 5% Gold OA at triple the TA fee; if only 75% of TA journals are charging such fees, then they are charging on average $1515. Much later update: see this post, I wouldn’t put too much weight on those NIH figures given the nature of the sources.
This section became way too cluttered, so I’ve put a summary here and the details are below:
journal ……………………………… average author side fee
PNAS ……………………………………….. $1446
Science …………………………………….. $1019
Nature ……………………………………… $1669
Cell ……………………………………….. $2031
Cell Cycle ………………………………….. $756
EMBO J ……………………………………… $2974
Mol Biol Cell ……………………………….. $1829 1
American Physiological Society (14 journals) ……. $1000 2
Journal of Nutrition …………………………. $456
J Neuroscience ………………………………. $850 + color charges 2
Molecular Biology and Evolution ……………….. $922 3
Molecular Plant-Microbe Interactions …………… $1275 4
J Natural Res & Life Sci Education …………….. $400

1 official figures, 2006
2 official figures, current
3 official figures, 2008
4 official figures, 2000
The selection of journals is fairly random, just the first few that came to mind then whatever turned up when I was searching for things like “average page color charges”. They range from prestige to niche, and even the cheapest charge fees that amount to a significant fraction of Gold OA author-side fees.
It would be very interesting to extend this half-baked pilot study, but I think it would also be unavoidably labor intensive. Except for rare cases where publishers provide the numbers, there’s really no way to calculate average author-side fees based on page and figure counts except by doing those counts for a representative sample of issues in each journal. (Perhaps a passing statistician could help me figure out what would constitute a representative sample — perhaps sqrt(issues/year)?) Then you have to select which journals to investigate — perhaps high, middle and low ranked journals in a handful of broad categories? Finally, it’s pretty slow going, so I don’t think Mechanical Turk would be cost effective for this job — even if you could solve the problem of giving Turkers access to the journals. In the end I think you’d have to inflict the counting task on some hapless grad student or intern, who would probably find it easiest to sit in a library with a stack of journals and a spreadsheet.

—————————————-details of “calculated guesses” and official figures—————————————-
PNAS: $70/page, $250 for supplementary information, $300 per color figure or table

March 17 2009 vol 106 issue 11: 88 papers, pp 4079 to 4570; mean = 5.6 pages
5.6 pages = $392
10 papers had no supplementary info so mean SI=78/88=0.886 = $221
approx every 5-6th paper examined, 18 in total:
5 color figures ($1500) ii
4 color figures ($1200) iiiii i
3 color figures ($900)  ii
2 color figures ($600)  iiii
1 color figure  ($300)  ii
0 color figures ii
mean color cost = $833; mean total cost/article = $1446
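The PNAS "calculated guess" can be reproduced in a few lines. This is just a sketch of the counts listed above, with variable names of my own choosing; it carries unrounded values through, so the intermediate numbers differ by a dollar or so from the rounded ones.

```python
PAGE_FEE = 70       # dollars per page
SI_FEE = 250        # dollars for supplementary information
COLOR_FEE = 300     # dollars per color figure or table

papers = 88
total_pages = 4570 - 4079 + 1          # pp 4079 to 4570 inclusive
papers_with_si = papers - 10           # 10 papers had no supplementary info

# Color figures per paper -> number of sampled papers (the tally marks above).
color_tally = {5: 2, 4: 6, 3: 2, 2: 4, 1: 2, 0: 2}
sampled = sum(color_tally.values())

mean_page_cost = total_pages / papers * PAGE_FEE
mean_si_cost = papers_with_si / papers * SI_FEE
mean_color_cost = sum(n * COLOR_FEE * count
                      for n, count in color_tally.items()) / sampled
mean_total = mean_page_cost + mean_si_cost + mean_color_cost

print(f"pages ${mean_page_cost:.0f}, SI ${mean_si_cost:.0f}, "
      f"color ${mean_color_cost:.0f}, total ${mean_total:.0f}")
```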

In 2004 Cozzarelli et al. suggested that around $2000/article would be needed to cover PNAS’  costs without subscription income.
Science: $650 for the first color figure, $450/color figure thereafter

March 20 2009 vol 323 issue 5921: 2 research articles, 11 reports:
4 color figures ($2000) iii
3 color figures ($1550) i
2 color figures ($1100) iiii
1 color figure ($650) ii
0 color figures iii
mean color cost = mean cost/article = $1019
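A small sketch of the same tally for Science, assuming the fee schedule is exactly "first figure at one price, each further figure at another". The function name and parameters are illustrative, not anything the journal publishes.

```python
def color_charge(n_figures, first, additional):
    """Charge for n color figures: `first` for one, `additional` for each more."""
    if n_figures <= 0:
        return 0
    return first + additional * (n_figures - 1)


# Science: $650 for the first color figure, $450 for each one after that.
# Color figures per paper -> number of papers in the issue examined.
science_tally = {4: 3, 3: 1, 2: 4, 1: 2, 0: 3}

papers = sum(science_tally.values())
mean_cost = sum(color_charge(n, 650, 450) * count
                for n, count in science_tally.items()) / papers

print(f"{papers} papers, mean color cost ${mean_cost:.0f}")
```

The same function should work for any journal with a "first figure plus flat rate" schedule by swapping in its two prices and its tally.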

Nature: £735 ($1072) for the first colour figure and £262.50 ($383) for each additional figure (note: “Inability to pay this charge will not prevent publication of colour figures judged essential by the editors”)

March 19 2009 vol 458 number 7236: 2 articles, 12 letters:
5 color figures ($2604) ii
4 color figures ($2221) iiii
3 color figures ($1838) iii
2 color figures ($1455) iii
1 color figure ($1072) i
0 color figures ii
mean color cost = mean cost/article = $1669

Cell: $1000 for the first color figure and $275 for each additional color figure.

March 20 2009 vol 135 number 6: 12 articles:
7 color figures ($2650) iii
6 color figures ($2375) iii
5 color figures ($2100) ii
4 color figures ($1825)
3 color figures ($1550) ii
2 color figures ($1275)
1 color figure  ($1000) ii
0 color figures
mean color cost = mean cost/article = $2031

J Neurosci: $850 for regular manuscripts, $450 for brief communications, color figures are free “when color is judged essential by the editors and when the first and last authors are members of the Society for Neuroscience”, otherwise $1,000 each.

March 18 2009 vol 29 issue 11: 28 articles; looked at 4 random articles, number of color figures = 6, 8, 5, 1.  Regular SfN membership is $160.  I’m guessing most authors are members but it’s still impossible to tell how much each paper is being charged for color.

Landes Bioscience (all journals): four pages free, then $80/page; $340 for the first color page and $150 for each additional color page (in print — color is free online)

Cell Cycle March 15 2009 vol 8 issue 6: 10 research reports, pp 870 – 949
pages = 5,12,6,5,6,6,8,5,8,9
pages charged = 1,8,2,1,2,2,4,1,4,5; total = 30, mean = 3 = $240
7 color figures ($1240)
6 color figures ($1090)
5 color figures ($940)
4 color figures ($790) iiii
3 color figures ($640) i
2 color figures ($490)
1 color figure  ($340) iiii
0 color figures i
mean color cost = $516; mean total cost/article = $756
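Here is a sketch of the Cell Cycle estimate, combining the free-page allowance with the color schedule. Names are mine, and like the tally above it treats each color figure as one color page, which is an assumption rather than something the publisher states.

```python
FREE_PAGES = 4
PAGE_FEE = 80        # dollars per page beyond the free allowance
FIRST_COLOR = 340    # dollars for the first color page
EXTRA_COLOR = 150    # dollars for each additional color page

pages = [5, 12, 6, 5, 6, 6, 8, 5, 8, 9]
pages_charged = [max(0, p - FREE_PAGES) for p in pages]
mean_page_cost = sum(pages_charged) / len(pages) * PAGE_FEE

# Color figures per paper -> number of papers (assumes one figure = one page).
color_tally = {4: 4, 3: 1, 1: 4, 0: 1}


def color_cost(n_figures):
    """Print color charge for a paper with n color figures."""
    if n_figures == 0:
        return 0
    return FIRST_COLOR + EXTRA_COLOR * (n_figures - 1)


mean_color_cost = sum(color_cost(n) * count
                      for n, count in color_tally.items()) / len(pages)
mean_total = mean_page_cost + mean_color_cost

print(f"pages ${mean_page_cost:.0f}, color ${mean_color_cost:.0f}, "
      f"total ${mean_total:.0f}")
```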

EMBO J: $250/page over 6 pages, plus color charges: $650/figure for the first three figures, $432/figure for the next two, $2928 for six figures and $326 per additional figure thereafter.

March 18 2009 vol 28 number 6: 15 articles
pages = 10,8,10,10,13,8,10,13,13,10,8,9,10,12,12
pages charged = 4,2,4,4,7,2,4,7,7,4,2,3,4,6,6; total = 66, mean = 4.4 = $1100
9 color figures ($3906) i
7 color figures ($3254) ii
6 color figures ($2928) ii
5 color figures ($2814) ii
4 color figures ($2382) i
3 color figures ($1950)
2 color figures ($1300) ii
1 color figure  ($650) ii
0 color figures iii
mean color cost = $1874; mean total cost/article = $2974
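
EMBO J's color schedule is tiered, so it is easy to get wrong by hand. A small sketch of the schedule as described above (function names are mine, for illustration only) reproduces the per-article dollar amounts in the tally and the mean page charge:

```python
# EMBO J color schedule as described above: $650/figure for the first three,
# $432/figure for the next two, a flat $2928 for six, then $326 per figure.
def embo_color_cost(n_figs):
    """Color charge in dollars for an article with n_figs color figures."""
    if n_figs <= 3:
        return 650 * n_figs
    if n_figs <= 5:
        return 650 * 3 + 432 * (n_figs - 3)
    return 2928 + 326 * (n_figs - 6)

def embo_page_cost(pages):
    """Page charge in dollars: $250 for each page beyond the sixth."""
    return 250 * max(0, pages - 6)

pages = [10, 8, 10, 10, 13, 8, 10, 13, 13, 10, 8, 9, 10, 12, 12]
mean_page = sum(embo_page_cost(p) for p in pages) / len(pages)

print([embo_color_cost(n) for n in range(10)])
# [0, 650, 1300, 1950, 2382, 2814, 2928, 3254, 3580, 3906]
print(mean_page)  # 1100.0
```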

Molecular Biology of the Cell: according to the Am Soc Cell Biol, in their 2006 publication “MBC and the Economics of Scientific Publishing” (available as a pdf from the linked page):

The average article published in MBC in 2006 was 11.7 pages long and included 2.9 color figures. With the 20% discount on page and color charges now offered to ASCB members, publishing such an article would cost the author $1,829.

(Regular ASCB membership is $130.) Interestingly, the same publication gives the following details of budgeted (projected?) journal revenue for 2008:


I don’t know how similar that breakdown would be for other journals, but it’s interesting that subscription revenue is roughly equal to page OR color charges. With three roughly equal revenue streams, two of which authors already pay, the average author would pay about 50% more if the journal switched to full cost recovery from author-side fees.  This would put MBC’s author-side fees roughly on par with those charged by the top two PLoS tiers.
The American Physiological Society’s Author Choice (hybrid OA) fee is $3000 for review articles and $2000 for research articles; according to their FAQ this is because:

For research articles, the Author Choice fee was determined by calculating the real average cost ($3,000) of publishing an article in an APS journal, and subtracting the actual average amount already paid by authors in author fees (page charges and color fees). The Author Choice fee for review articles is $3,000, because there are no other fees paid by authors of review articles. The Author Choice fee was designed to completely cover the cost of publishing an article.

which indicates that the average author-side fee for the 14 journals published by the APS is $1000.
Journal of Nutrition: in this editorial, AC Ross gave some figures regarding costs:

On average, each published page costs about $465, and pages with color, $1300! Each published manuscript costs, on average, $3233. Page charges (starting at $70) and color charges to authors ($400 per figure) are only a fraction of the actual costs of publication. Institutional subscriptions remain a key factor in the financial success of professional society journals like JN.

Page charges are currently $75/page for the first 7 pages and $120/page thereafter, and color charges are still $400/figure.

March 2009 volume 139 issue 3: 29 articles
pages = 5,4,7,4,8,5,6,7,5,6,6,4,6,7,5,4,6,6,7,5,6,7,5,5,5,3,4,7,5
mean page charge = $415
1 color figure  ($400) iii
0 color figures iiiii iiiii iiiii iiiii iiiii i
mean color charge = $41; mean total cost/article = $456
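
The Journal of Nutrition page charge has two rates, so here is a quick check of the means above. This is only a sketch; the names are hypothetical and the page counts are copied from the list above.

```python
def jn_page_cost(pages):
    """$75/page for the first 7 pages, $120/page thereafter."""
    return 75 * min(pages, 7) + 120 * max(0, pages - 7)

pages = [5, 4, 7, 4, 8, 5, 6, 7, 5, 6, 6, 4, 6, 7, 5,
         4, 6, 6, 7, 5, 6, 7, 5, 5, 5, 3, 4, 7, 5]
articles_with_one_color_fig = 3  # from the tally above; the rest had none

mean_page = sum(jn_page_cost(p) for p in pages) / len(pages)
mean_color = 400 * articles_with_one_color_fig / len(pages)

print(len(pages))                            # 29
print(round(mean_page), round(mean_color))   # 415 41
print(round(mean_page) + round(mean_color))  # 456 (sum of the rounded components)
```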

Molecular Biology and Evolution: in the 2008 Editor’s Report (pdf available here) the Society for Mol Biol and Evolution provided the following figures for MBE in 2008:

average article length: 10.1 pages
average number of color figures per article: 0.927

Current charges are $50/page plus $450 per color figure, giving an average cost/article of $922.
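
As a worked check of that average (variable names are mine):

```python
PAGE_RATE = 50     # dollars per page
COLOR_RATE = 450   # dollars per color figure

avg_pages = 10.1
avg_color_figs = 0.927

avg_cost = avg_pages * PAGE_RATE + avg_color_figs * COLOR_RATE
print(round(avg_cost))  # 922
```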
Phytopathology and Plant Disease: $50 per printed page for the first six pages and $80 per printed page for each additional page for members of The American Phytopathological Society and $130 per printed page for nonmembers. In addition, there is a $20 fee charged for each black-and-white figure or line drawing. Color charges are $500 for the first illustration, $500 for the second illustration, and $250 for the third and each subsequent color illustration in one article.
Molecular Plant-Microbe Interactions: $150 for the first 6 pages, $150/page or fraction thereof thereafter. Color charges are $500 for the first illustration, $500 for the second, and $250 for the third and each subsequent color illustration in one article. In addition, there is a $20 fee charged for each black and white figure or line drawing.

The Society’s Reports of Publications from 2000 gives the following figures:
Phytopathology: average article = 7.3 pages, average color figs/article = ?
Plant Disease: average article = 5.4 pages, average color figs/article = ?
MPMI: average article = 9.4 pages, average color figs/article = 1.05; mean cost/article = $1275
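
One way to arrive at the $1275 figure for Molecular Plant-Microbe Interactions is sketched below. This is a reconstruction under two assumptions that are not stated in the post: that the fractional part of the average length is rounded up to a whole charged page, and that the average color cost is taken at the $500 rate that applies to the first two illustrations. Black-and-white figure fees are left out.

```python
import math

def mpmi_page_cost(pages):
    """$150 flat for the first 6 pages, then $150 per page or fraction thereof."""
    extra_pages = max(0, math.ceil(pages - 6))
    return 150 + 150 * extra_pages

avg_pages = 9.4
avg_color_figs = 1.05

# Assumption: price the average number of color figures at $500 each.
avg_color_cost = 500 * avg_color_figs

total = mpmi_page_cost(avg_pages) + avg_color_cost
print(mpmi_page_cost(avg_pages), round(total))  # 750 1275
```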

(Regular membership in Am Phytopath Soc is $76.)
Journal of Natural Resources and Life Sciences Education: $350/article, $10 per table and $10 per figure plus $100/color page (print only; color is free online).

Vol 36, 2007: 17 articles, number of figs/tables = 1,3,6,7,12,4,5,4,8,8,5,5,9,1,2,2,4
only a couple had color figures; mean additional charge = $50, mean cost/article = $400
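
The per-item charge can be checked the same way. In this sketch (names are mine) the color-page charge is ignored, since only a couple of articles had color figures.

```python
# Figures plus tables per article, copied from the list above.
counts = [1, 3, 6, 7, 12, 4, 5, 4, 8, 8, 5, 5, 9, 1, 2, 2, 4]

BASE_FEE = 350   # dollars per article
PER_ITEM = 10    # dollars per table or figure

mean_extra = PER_ITEM * sum(counts) / len(counts)
print(f"{mean_extra:.2f} {BASE_FEE + mean_extra:.2f}")  # 50.59 400.59
```

That is, roughly $50 in additional charges and about $400 per article.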



On FriendFeed, items move back up the temporal sequence when they get “likes” and comments, giving them extra chances to be noticed. In addition, a “like” or comment from one of your friends will bring an item into view even if posted by someone whose stream you don’t follow. The emerging mores of the system include leaving a one-word comment, bump, to indicate that one feels a particular item is worthy of wider attention — “bumping” the item up the queue, as it were.
That’s what I’m doing with this post. Richard Poynder is trying to put together a list of institutions and funding bodies which have established funds to pay for Gold Open Access:

I am trying to establish how many research institutions and funders have created Gold Open Access (Gold OA) authors funds, and would be grateful for input from others.
I am aware that the Wellcome Trust announced a scheme for paying OA publication fees for its grantees in 2006. But what other funders have introduced such schemes?
So far as research institutions are concerned, Peter Suber kindly provided me with the following list of those he knows have created Gold OA funds:
University of Amsterdam
University of Calgary
University of California, Berkeley
Delft University of Technology
ETH Zurich
Griffith University
University of Helsinki
Institute of Social Studies (Netherlands)
Lund University
University of North Carolina, Chapel Hill
University of Nottingham
University of Tennessee, Knoxville
Texas A&M University
Tilburg University
Wageningen University and Research Center
University of Wisconsin
However, I do not think this list is complete.

Richard also points out that it is probably useful to keep track of which Gold funds are complemented by a Green mandate, and makes the (imo excellent) suggestion of establishing a Gold Fund equivalent to ROARMAP, which tracks Green Mandates.
So — *bump* — please go read Richard’s post, and help him out if you can.
Update: Peter Suber has created and pre-populated the Open Access Directory list of journal OA funds, so if you have information please add it there.

That’s the way you do it!

Via Peter Suber, I am delighted to find that Stuart Shieber has started a weblog, and even more delighted that in one of his first entries he has turned my long-ago author-side fees DOAJ hack into an actual, readily reproducible study:

Here are the results computed by my software, as of May 26, 2009:

Charges.......................951  (23.14%)
No charges....................2889 (70.29%)
Information missing...........270  (6.57%)
Hybrid........................1519 (26.99%)

The numbers are consistent with those of Hooker’s study some 16 months earlier.

It’s great to have the numbers confirmed, and even better to be able to make regular updates and construct time series. Thanks to Stuart for doing it right, and for making the code freely available.
(Note, had to reformat the quoted table into ugly text, because I still can’t get MT to play nice. Grrr.)
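
The percentages in the quoted table can be reproduced from the raw counts. This is not Stuart's code, just a minimal check. One thing is an assumption on my part: the hybrid percentage only works out if it is taken over full OA plus hybrid journals together.

```python
counts = {"Charges": 951, "No charges": 2889, "Information missing": 270}
hybrid = 1519

full_oa_total = sum(counts.values())
print(full_oa_total)  # 4110

for label, n in counts.items():
    print(f"{label}: {100 * n / full_oa_total:.2f}%")
# Charges: 23.14%
# No charges: 70.29%
# Information missing: 6.57%

# Assumption: hybrid share is computed over full OA + hybrid journals.
hybrid_pct = 100 * hybrid / (full_oa_total + hybrid)
print(f"Hybrid: {hybrid_pct:.2f}%")  # Hybrid: 26.99%
```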

What use are research patents?

DrugMonkey has a conversation going about the ongoing kerfuffle over (micro)blogging of conference presentations (see also the FriendFeed discussion). I want to go off on a tangent from something that came up in his comment thread, so rather than derail it I thought I’d post here.
In his first comment in the thread, David Crotty made the following claim:

Lots of researchers support their families and labs through money generated by patents, and most universities are heavily dependent upon their patent portfolios for funding.

That doesn’t accord with my (limited!) experience — I know a few researchers who hold multiple patents, and none of them ever made any money that way — and my general impression is that the return on investment for tech transfer offices and the like is fairly dismal.
This seems like the sort of beans that beancounters everywhere should be counting, so I asked on FriendFeed whether anyone knew of any data to address the question of whether universities really make much money from patents. Christina Pikas pointed me to the Association of University Technology Managers, whose 2007 Licensing Activity Survey is now available.
I extracted data for 154 universities and 27 hospitals and research institutions. Between them, in 2007, these institutions filed 11116 patent applications, were awarded 3512 patents, and gave rise to 538 start-up companies. I calculated licensing income as a percentage of research expenditure:


Apart from New York University (I wonder what they own that’s so profitable?), it’s clear that none of these universities are “heavily dependent upon their patent portfolios for funding”. In fact, more than half of them (78/154) made less than 1% of their research expenditure back in licensing income, and the great majority (144/154) made less than 10%.
Licensing income for Massachusetts General Hospital and “City of Hope National Medical Ctr. & Beckman Research” (whoever they are) amounted to 65-70% of research expenditure, but none of the other hospitals or research institutions made more than 20%. More than half of this group (15/27) made less than 2%, and most of them (23/27) made less than 10%.
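
The calculation itself is simple: licensing income divided by research expenditure, then a count of how many institutions fall under a given threshold. The sketch below uses made-up placeholder rows, NOT values from the AUTM survey, and the field names are hypothetical rather than the survey's column names.

```python
# Hypothetical figures in $ millions, for illustration only (not AUTM data).
institutions = {
    "Institution A": {"licensing_income": 0.4, "research_expenditure": 120.0},
    "Institution B": {"licensing_income": 3.0, "research_expenditure": 250.0},
    "Institution C": {"licensing_income": 45.0, "research_expenditure": 300.0},
    "Institution D": {"licensing_income": 0.9, "research_expenditure": 60.0},
}

def licensing_pct(row):
    """Licensing income as a percentage of research expenditure."""
    return 100 * row["licensing_income"] / row["research_expenditure"]

def share_below(rows, threshold_pct):
    """Fraction of institutions whose licensing percentage is under the threshold."""
    below = sum(1 for row in rows.values() if licensing_pct(row) < threshold_pct)
    return below / len(rows)

for name, row in institutions.items():
    print(f"{name}: {licensing_pct(row):.2f}%")

print(share_below(institutions, 1))   # 0.25
print(share_below(institutions, 10))  # 0.75
```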
The distribution looks just about as you would expect:


I also wondered whether there was any evidence that greater numbers of patents awarded, or more money spent per patent, resulted in higher licensing income. As you can see, the answer is no (insets show the same plots with the circled outliers removed):


I don’t know how representative this dataset is; there are several thousand universities and colleges in the US, and surely even more hospitals and research institutions, so the sample size is relatively small. It does include some big names, though (Harvard, Johns Hopkins, MIT, Stanford, U of California), and I would expect a list of schools answering the AUTM survey to be weighted towards those schools with an emphasis on tech transfer.
In any case, I’m not buying David’s assertion that “most universities”, or most hospitals or research institutes for that matter, rely heavily on licensing income. And that being so, I am also somewhat skeptical about the number of researchers’ families being supported by patents.
What’s the Open Science connection? Well, if you’re interested in patenting the results of your research, there are a lot of restrictions on how you can disseminate your results. You can’t keep an Open Notebook, or upload unprotected work to a preprint server or publicly-searchable repository, or even in many cases talk about the IP-related parts of your work at conferences. It seems from the data above that most universities would not be losing much if they gave up chasing patents entirely; nor would they be risking much future income, since so few seem to get significant funds from licensing. My own feeling is that any real or potential losses would be much more than offset by the gains in opportunities for collaboration and full exploitation of research data that come with an Open approach.
1. Christina left a comment pointing out that patents may be required for more than simply making money from licensing:

…an extremely important reason universities patent [is] to protect their work so that they may exploit it for future research… it turns out that universities have to patent in life sciences – even if they don’t actively market and license these patents – to be able to attract new research money from industry.

There are two distinct points here: first, that if you don’t patent you may not attract industry partners, and second, that if you don’t patent you may end up licensing your own tech back from someone else (I note that most tech licenses I know of are cheap or free “for research purposes” so the latter factor might not weigh so heavily). According to the 2007 AUTM data, industry investment in academic research amounted to about 7% of research expenditure and was up 15% over 2006.
2. David responded on DM’s thread with some counter evidence, on reading which I realise that the data above may (likely?) only show what the university received and not any money that went to the labs or researchers involved. Tech transfer may not be financially worth it for the university itself, but it might still be doing good things for individual labs and PIs, and so would constitute a support service the university offers its research community. It also strikes me that my experience, such as it is, is mainly with Australian researchers, whereas David’s is in the US, so cultural differences may also apply.
3. More from Christina at her own place, here.
If you want the data, the spreadsheet I used is here.

What happened to serials prices in 1986-87? (Update: probably nothing.)

This could be nothing but an artifact (e.g. of the way the data were collected), but if you look at Fig 1 from this post, there’s a clear break in the serials expenses (EXPSER) curve that’s not evident in any of the others. Here’s the same plot reworked to emphasize what I’m talking about:


If you squint just right you can imagine a similar but much weaker effect, beginning a year or two later, in the total expenditures (TOTEXP) curve; and the salaries (TOTSAL) curve seems to start a similar upward trend at about the same time but then levels off after 1991 or so. I wouldn’t put any weight on either of those observations though — I’d never have noticed either if I hadn’t been comparing carefully with the EXPSER curve.
I’ve added linear regression lines for the 1976-1986 and 1987-2003 sections of the EXPSER data, just to emphasize the change in rate of increase. For those of you who will twitch until they know, just ’cos, the correlation coefficients of the two lines are 0.99 and 0.98 respectively. If you extrapolate from just the 76-86 section, TOTEXP exceeds the forecast for EXPSER after about 2000.
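
The two-segment fit can be sketched as follows. The series here is synthetic (a smooth 5%/year growth curve starting at 100), NOT the real EXPSER data, and the helper names are hypothetical. It fits an ordinary least-squares line to each section and reports the slope and Pearson r.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope, intercept and Pearson r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r

# Synthetic stand-in series: steady 5%/year growth, no break anywhere.
years = list(range(1976, 2004))
values = [100 * 1.05 ** (y - 1976) for y in years]

def fit_segment(first, last):
    """Fit a line to the points with first <= year <= last."""
    pairs = [(y, v) for y, v in zip(years, values) if first <= y <= last]
    xs, ys = zip(*pairs)
    return linear_fit(xs, ys)

early = fit_segment(1976, 1986)
late = fit_segment(1987, 2003)
print(f"1976-1986: slope={early[0]:.2f}, r={early[2]:.3f}")
print(f"1987-2003: slope={late[0]:.2f}, r={late[2]:.3f}")
```

Even on this perfectly smooth curve the second line is steeper than the first and both fits have r close to 1, so a high r on each section does not by itself show that anything changed at the join.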
I have no idea if this means anything, but it is tempting to speculate. For instance: when did the big mergers begin in Big Publishing, and when did the big publishing companies start the odious practice of “bundling”, that is, selling their subscriptions in packages so that libraries are forced to subscribe to journals they don’t want just to get the ones they do?

Update: it’s probably nothing; the curve simply shows an increasing rate of increase, and you can break it up into at least three reasonably convincing-looking segments with breaks at 86-87 and 94-95. It’s possible there were two “pricing events” around those times, but I think this is most likely just an illustration of what can happen when you look a little too hard for patterns in your data!