Ada Lovelace Day is an international day of blogging to draw attention to women excelling in technology.

Women’s contributions often go unacknowledged, their innovations seldom mentioned, their faces rarely recognised. We want you to tell the world about these unsung heroines. Entrepreneurs, innovators, sysadmins, programmers, designers, games developers, hardware experts, tech journalists, tech consultants. The list of tech-related careers is endless.

Since most of my role models who happen to be female are not really in any kind of tech career, I’m spared the need to write the enormous essay that it would take to cover them all. Instead I’ll point to just two for whom I can reasonably make a tech connection: Rosie Redfield and Maureen Hoatlin.
I’ve never met Rosie, who is a PI in the Zoology Department at University of British Columbia, but she is one of the first biomed researchers — if not the very first — to embrace Open Science and I’ve been following her online presence for a couple of years now. From her lab’s homepage you can read not just the usual list of publications and personnel, but also submitted research proposals and work in progress. The latter is communicated by blog: Rosie has one, and so do several other lab members. They discuss upcoming and ongoing experiments, work up data and think out loud about their research in general.
I met Maureen after we were both quoted in Mitch Waldrop’s SciAm article on Open Science, and she realized that we worked on the same campus. Maureen is a PI in the Biochem Dept at OHSU. She tells a great story about neglecting her family one weekend while she sat in bed reading scientific articles online — “this changes everything” was all she would say to their pleas for breakfast, etc. Well, Maureen meant what she said, and she’s walking the walk. You can find the Hoatlin lab on OpenWetWare, along with a wiki-based, bottom-up, ongoing experiment in improving grad student education that she pioneered, and you can find Maureen on a range of social networking sites including FriendFeed and LinkedIn. Her lab has its own Twitter account.
Since I think this sort of open, collaborative model is very much the way of the future, if science is to have a future at all, I’d like to see Rosie and Maureen get their props for having been such early adopters. It’s also worth mentioning that, in addition to still being a Boys’ Club in many ways, research is a very conservative environment in which new ideas are usually met with scorn and active resistance. So, having made it up the foodchain in the face of irrational opposition, they are now confronting the same tribe with another set of new and threatening ideas. Both are worthy additions to the Ada Lovelace Day pantheon.

New blog in town.

I don’t normally promote new blogs, other than to add them to my blogroll if I think they are worth my readers’ time, but I’ll make an exception for PLoS ONE’s new community blog, EveryONE:

Why a blog and why now? As of March 2009, PLoS ONE, the peer-reviewed open-access journal for all scientific and medical research, has published over 5,000 articles, representing the work of over 30,000 authors and co-authors, and receives over 160,000 unique visitors per month. That’s a good sized online community and we thought it was about time that you had a blog to call your own. This blog is for authors who have published with us and for users who haven’t and it contains something for everyone.

Why did you call the blog everyONE? For three main reasons that encapsulate the mission of the journal:

Firstly, because PLoS ONE is for every rigorous research article that passes our peer-review process.

Secondly, because PLoS ONE is a forum for research in every scientific discipline (with a current emphasis on life and health sciences because of PLoS’s history).

Thirdly, because PLoS ONE is a source of information for every inquisitive reader with an interest in high-quality scientific research.

I hope, and on my better days believe, that PLoS ONE is one of the leading models for the future of scientific journals:

  • they offer gold OA — that is, free online to everyone everywhere from the moment of publication, including submission to PubMed Central
  • they offer a sustainable business model for OA: in the black after less than three years and with an author-side fee of $1300
  • their peer review process is as rigorous as any, but it does not ask reviewers to make guesses about what is “hot”, or what is likely to be important at some time in the future: if it’s solid science, PLoS ONE will publish it
  • they don’t have an Impact Factor: homey don’t play dat, as the kids around here say
  • that’s not to say that they are not actively seeking rich measures of utility/impact for scientific publications: for instance, here’s Bora’s roundup of analyses of an experimental dataset that they passed around a while back, and an update from Euan
  • in the same vein, I can’t find a link right now but there are plans afoot to release real-time access to such data as downloads, comment frequency and so on — post-publication measures which can improve and speed up citation based measures; for another example, scroll down on this page for some self-measurement that represents a level of disclosure I have not seen from any other journal
  • they are responsive to and engaged with the community: for instance, both Bora Zivkovic (community manager — how many journals have one of those?) and Peter Binfield (managing editor) are active on FriendFeed
  • they encourage and enable community input in the form of notes, comments and ratings on every article; I particularly like the option given to reviewers to have their reviews included as comments with the paper

EveryONE is another way for PLoS ONE to engage with their community of readers and contributors, and well worth a look.
DISCLAIMER: I consider Bora and Peter friends of mine, and I’ve previously applied to work at PLoS.

Should we talk about the “journals crisis” instead of the “serials crisis”?

I stumbled upon something new-to-me, and possibly even useful-to-others, in my fooling around with numbers (table 2 and discussion thereof here), but it’s somewhat buried under all the “how I made this figure” and “where I got these data” details. For that reason, and because I didn’t trust my idea until I had some external reinforcement, I thought I’d give it a separate post all its own.
Here’s the thing: what is widely known as the serials crisis in library costs is probably driven largely by the pricing of scholarly journals. In library parlance, “serials” includes, inter no doubt many alia, newspapers, government reports issued in series, yearbooks and magazines (periodicals), in addition to the scholarly literature. Of the 225,000 or so periodicals in Ulrich’s, only about 25,000 are peer reviewed. In the FriendFeed discussion started by my post, Walt Crawford said:

…some of us have long argued that there isn’t a serials crisis for library budgets, there’s a scholarly journal crisis. Magazines (and there are about 1/4 million magazines as compared to about 25,000 scholarly journals) tend to have very low prices and very modest increases.

Although non-refereed serials dominate product counts (and, apparently, library collections), the situation is reversed for unit expenditures. The average unit cost for the UCOSC dataset, which is composed entirely of scholarly journals, is roughly ten times the average unit cost for any of the other datasets I used, all of which were general data that included all types of serial. Here’s Walt again:

the 10:1 ratio for UC (that is, scholarly journals averaging 10x as expensive as all serials) sounds about right

When the numbers and Walt’s experience began to line up, I became much more confident in my conclusion, that the serials crisis is really a scholarly journals crisis. It’s not clear to me, in fact, why the phenomenon got the nickname it did; perhaps it’s just that “serials crisis” is a punchier phrase.
I’m not at all sure that any of this is more than semantic nitpicking, but giving things their proper name can be important. Most researchers who only hear the name won’t care about a “serials crisis” — that’s a library problem, nothing to do with us. But if they hear about a “scholarly literature crisis”, it becomes clearer that the issue is the potential loss of access to resources necessary to do our jobs. I suspect most researchers who’ve heard of the serials crisis are aware that it is, at least in part, about journal pricing, but I wonder how many know that it’s pretty much only about journal pricing? This little “discovery” of mine really did put things in a different perspective for me, and I’m probably more informed about library- and publishing-related issues than most benchmonkeys.
I doubt that an alternative name will catch on, and I’m not going to start campaigning for one — but I think that from now on I’ll at least occasionally refer to the “serials/scholarly literature” crisis, or something similar, if only to remind myself of my own little satori. (Question for the lazyweb: can anyone suggest a better phrase, one which would make it more apparent to researchers that they should care about this?)

Fooling around with numbers, part 5

As promised, here is the distribution of journal prices for the subsets of the Elsevier life sciences dataset which either have or don’t have impact factors, and for the entire UCOSC dataset (in which all journals have IFs):
Each interval is $500 wide: $0 to $499, $500 to $999, etc., and datapoints are plotted at the midpoint of each interval.
The conclusion is the same as in part 1, just a bit clearer now. Elsevier journals without an impact factor are priced lower than those which have an IF, and the price distributions are somewhat different between journals with and without an IF. Note, though, that if I’d used a $1000 interval instead of $500, the initial rise in the +IF curves would not appear; if these are power-law distributions the main difference is probably the scaling exponent. I think. (Math is not my friend.)
It almost looks as though low-end journals are shunted out of the lowest price bracket as soon as they get an IF, any IF, and then tend to increase in price as the IF goes up. Update: no it doesn’t. I don’t know what I was thinking there.
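For anyone who wants to try the same kind of binning on their own price list, here is a minimal sketch in Python. It is not the code behind the figure: the function name `price_histogram` and the sample prices are made up for illustration, and it assumes half-open $500 brackets ($0 up to but not including $500, and so on) with each count reported at the bracket midpoint.

```python
from collections import Counter

def price_histogram(prices, width=500):
    """Count prices per fixed-width bracket; return (midpoint, count) pairs.

    Brackets are half-open: [0, width), [width, 2*width), ...
    Empty brackets are kept (count 0) so the curve has no gaps when plotted.
    """
    counts = Counter(int(p // width) for p in prices)
    if not counts:
        return []
    return [(b * width + width / 2, counts[b]) for b in range(max(counts) + 1)]

# Made-up prices, for illustration only (not taken from either dataset).
sample = [120, 450, 499, 500, 980, 1500, 1720, 2600]
for midpoint, n in price_histogram(sample):
    print(f"${midpoint:,.0f}: {n}")
```

Calling `price_histogram(sample, width=1000)` merges adjacent brackets, which is the sort of change in interval that can make an initial rise in a curve disappear.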

The rest of the series: part 1, part 2, part 3, part 4.

How ’bout it, codemonkey? One for all you web app wizards out there.

A great opportunity has opened up for a code-savvy free culture type to earn a little good karma. Here’s the thing:

  1. Bora Zivkovic’s Open Laboratory project is way cool
  2. the more submissions they get, the cooler it is
  3. they have a badge that blogs can display for one-click access to the submission form, but no bookmarklet

Now, a bookmarklet seems to me even better than a badge, because it’s independent of the blog you’re reading, right there on your browser toolbar. When you think to yourself “this is such a good post that I should submit it to The Open Lab”, rather than finding the submission form and filling it in or looking to see whether the blog has a badge, you can just hit the bookmarklet. Even better, the bookmarklet can be set up to autofill at least your details, and perhaps to extract information from the page you’re on as well. In any case, the various submission mechanisms are not mutually exclusive: there’s no reason not to have badges and bookmarklets and anything else the community can think of.
I could build one, in principle, since I’ve hacked around with js a little, but it would take me literally days of screaming frustration to do a half-assed job. Surely there’s some web app wizard out there who could whip up something over their lunch break?
So — how about it? Help the Open Laboratory, help the science blogging community in general: build Bora a bookmarklet.
For those not familiar with it, some background on The Open Laboratory: in 2006, Bora was approached by print-on-demand web publishers Lulu.com about the possibility of putting together a print anthology of science blogging. Being the community-centric type, Bora posted a call for suggestions and arranged a panel of reviewers to help him edit the resulting list of blog posts. I was privileged to be on that panel; here’s what I said at the time about the first edition:

As Bora intimates in his introduction, blogs are conversations and so they lose a certain liveliness when embalmed in a blook (blog + book; don’t blame me, I didn’t coin it!) like this. Nonetheless, there is some excellent writing in this thing, it is as perfect an introduction to science blogging as you’re likely to see offline, and it’s a fun read all on its own. True to the open nature of the original medium, you can of course surf over to Bora’s blog and find the anthology entries listed there. No one will mind if you do, but I hope you will also consider buying the blook — which, after all, unlike the internets, you can carry with you on the bus and leave on the break-room table at work. It’s priced at cost and any incidental proceeds will go towards next year’s edition.

There have since been two further editions, 2007 and 2008, and what I said of the 2006 incarnation remains true (except that incidental proceeds now go towards the Science Online conference). (Incidentally, if you follow those links you can read not only the posts that made it into each anthology but all the entries as well.)

Author-side fees in hybrid and OA chemistry journals

Peter Suber, responding to a J Cheminfo paper, wondered what proportion of chemistry journals in the DOAJ charge author-side fees. Since I was in that mode, as it were:


Hybrid journals are those that offer OA-for-a-fee, so of course all of those charge fees. “Open” above refers to Gold OA journals, roughly half of which charge author-side fees in this chemistry subset. This is broadly consistent with the overall DOAJ listing (as of December 2007) and also with several other studies that Peter mentions.
I still can’t solve the tables bug; if you want the numbers, view source — I’ve commented out a simple table that displays fine unless Moveable bloody Type gets hold of it. If you want to see how I generated the numbers, grab this spreadsheet. I first cut-and-pasted from the DOAJ subject listings into a text editor, then used the replace function to introduce tabs before “hybrid” or “open” and between “publication fee” and the entry for each journal. Then I used the replace function to delete all lines between “hybrid/open” and “publication fee”, to simplify the Excel formula… you’ll see what I mean if you look at the spreadsheet.
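If search-and-replace plus Excel isn't your thing, the same tally could be scripted. The sketch below is not what I actually did, and it assumes a layout I've invented for illustration: one journal per line, tab-separated, with the title, then "hybrid" or "open", then the publication fee entry. The function name `tally_fees` and the sample rows are hypothetical too; a real DOAJ listing would need its own cleanup first.

```python
from collections import Counter

def tally_fees(lines):
    """Count (access type, fee entry) pairs from tab-separated lines.

    Assumed layout per line (hypothetical, not the actual DOAJ format):
    title <TAB> hybrid|open <TAB> fee entry (e.g. Yes / No)
    """
    counts = Counter()
    for line in lines:
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        access, fee = fields[1].lower(), fields[2].lower()
        counts[(access, fee)] += 1
    return counts

# Invented rows for illustration; not real journals and not real counts.
rows = [
    "Journal A\topen\tYes",
    "Journal B\topen\tNo",
    "Journal C\thybrid\tYes",
    "Journal D\topen\tYes",
]
counts = tally_fees(rows)
open_total = sum(n for (access, _), n in counts.items() if access == "open")
print(counts[("open", "yes")], "of", open_total, "open journals charge a fee")
```

Keying the counter on the (type, fee) pair gives the hybrid and open breakdowns in one pass, so there is no need to delete the lines in between as I did for the spreadsheet formula.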