That bloody video.

This video annoyed me the first time I saw it, but I just figured, you know, not everything is made for me. Now it seems to be making another round of the social media stream; it ended up on my radar via FriendFeed, and this time I just had to say something.
First of all, that’s five minutes you’ll never get back. Five minutes isn’t much, but when you only have 30 or 60 minutes a day to spend online — as, e.g., I did in my last job — you resent every stolen second. This is why I hate, with a fierce and curmudgeonly hate, multimedia without transcripts or text versions.
Secondly, here’s the content — in a form you can use at your own pace without needing pause and fast forward buttons:

  • if you’re 1 in a million in China, there are 1300 people just like you
  • China will soon become the number 1 English speaking country in the world
  • the 25% of India’s population with the highest IQ’s is greater than the total population of the United States
  • translation: India has more honors kids than America has kids
  • the top 10 in-demand jobs in 2010 did not exist in 2004
  • we are currently preparing students for jobs that don’t yet exist, using technologies that haven’t been invented, in order to solve problems we don’t even know are problems yet
  • US Dept of Labor estimates that today’s learner will have 10-14 jobs by the age of 38
  • 1 in 4 workers have been with their current employer less than a year; 1 in 2 have been there less than five years
  • 1 in 8 couples married in the US last year met online
  • if MySpace were a country, its 200 million registered users would make it the 5th largest in the world, between Indonesia and Brazil
  • the #1 ranked country in broadband internet penetration is Bermuda; #19 the US; #22 Japan
  • we are living in exponential times
  • Google searches: 2008, 31 billion/month; 2006, 2.7 billion/month
  • to whom were these questions addressed Before Google?
  • the first commercial text message was sent in Dec 1992; today, the number of text messages sent and received every day exceeds the total population of the planet
  • years it took to reach a market audience of 50 million: radio 38 years; television 13 years; internet 4 years; iPod 3 years; facebook 2 years.
  • in 1984 there were 1,000 internet devices, in 1992 there were 1,000,000, in 2008 there were 1,000,000,000
  • there are about 540,000 words in the English language, 5 X as many as in Shakespeare’s time
  • it is estimated that a week’s worth of the NY Times contains more information than a person was likely to come across in a lifetime in the 18th century
  • it is estimated that 4 exabytes (4×10^19 bytes) of unique information will be generated this year — more than in the previous 5,000 years
  • the amount of new technical information is doubling every 2 years; for students in a 4-year degree this means that half of what they learn in their first year of study will be outdated by their third year
  • NTT Japan has successfully tested a fiber optic cable that pushes 14 trillion bits/second down a single strand of fiber — that is 2,660 CDs or 210 million phone calls every second
  • it is currently tripling every 6 months and expected to do so for the next 20 years
  • by 2013, a supercomputer will be built that exceeds the computational capabilities of the human brain
  • predictions are that by 2049, a $1000 computer will exceed the computational capabilities of the entire human species
  • during the course of this presentation (4:55), 67 babies were born in the US, 274 were born in China, 395 were born in India and 694,000 songs were downloaded illegally
  • credit: Karl Fisch, Scott McLeod, and Jeff Brenman

When you see it like that, not zooming out at you with a soundtrack and a bunch of twee effects, it becomes obvious that there’s nothing much there, and what there is, is rather disjointed and incoherent. Many of the factoids look shaky to me, and there are only a couple of references or sources provided (why not provide the others?). I’m not going to bother with a fisking, but here are some obvious eyebrow-raisers:

  • All that stuff about China and India smacks of xenophobic scaremongering to me — I very much doubt that’s the intent, but there’s nothing to tie it to the technological stuff, so it starts to sound like “flee, the brown people are coming!”
  • “We are currently preparing…” — feels good, means nothing; it’s just an overblown description of what good teachers have always done.
  • “We are living in exponential times” — that word (“exponential”), I don’t think it means what you think it means…
  • OK, the Google searches, text messages and years-to-50-million stuff is neat, though I still want sources.
  • The prefix exa- denotes 10^18, so 4 exabytes is 4 x 10^18 bytes, not 4 x 10^19; even using the unofficial binary-base interpretation, 4 exabytes is only about 4.61 x 10^18 bytes. (See what I did there, with the links to my sources? In a slideshow, you can do that with footnotes and a final slide.)
  • In any case, 4 or 40 exabytes of what? How do you define/count “unique information”?
  • Even if we gloss over “unique information”, how do any of the other quoted rates of change square with “more than in the previous 5,000 years”? What would that mean for the following 1/5000th of a year (~1.75 hours)? In other words, we must have maxed out — right?
  • If the optical fiber example needs a human-scale yardstick, so does 4 exabytes; e.g., if you wrote that data to CD-ROM and covered a football field with the discs, the resulting stack would be about 16 m high, or roughly the height of a four-story house.
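
For anyone who wants to check the arithmetic in the bullets above, here is a minimal Python sketch. The CD capacity (700 MB), the disc dimensions (12 cm across, 1.2 mm thick), the field size (roughly 109.7 m by 48.8 m) and the square-grid packing are all my assumptions, as are the function names, so the stack height only lands in the same ballpark as the figure quoted above rather than reproducing it.

```python
import math

EXA_SI = 10**18      # SI meaning of the prefix exa-
EXA_BINARY = 2**60   # unofficial binary-base interpretation


def fraction_of_year_in_hours(fraction, days_per_year=365):
    """Convert a fraction of a year into hours (365-day year assumed)."""
    return fraction * days_per_year * 24


def cd_stack_height_m(total_bytes,
                      cd_bytes=700 * 10**6,        # assumed CD-ROM capacity
                      field_area_m2=109.7 * 48.8,  # assumed field size
                      cd_diameter_m=0.12,          # assumed disc diameter
                      cd_thickness_m=0.0012):      # assumed disc thickness
    """Height of the stack if the discs tile the field in a square grid."""
    discs = math.ceil(total_bytes / cd_bytes)
    discs_per_layer = math.floor(field_area_m2 / cd_diameter_m**2)
    layers = math.ceil(discs / discs_per_layer)
    return layers * cd_thickness_m


if __name__ == "__main__":
    print(f"4 EB, SI prefix:     {4 * EXA_SI:.2e} bytes")
    print(f"4 EB, binary prefix: {4 * EXA_BINARY:.2e} bytes")
    print(f"1/5000 of a year:    {fraction_of_year_in_hours(1 / 5000):.2f} hours")
    print(f"CD stack height:     {cd_stack_height_m(4 * EXA_SI):.1f} m")
```

With these particular assumptions the stack comes out in the high teens of metres; a bigger field or tighter (hexagonal) packing brings it down, and a smaller field pushes it up.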

Update, written after all of the above:
It’s important to note that although the version discussed above is the only one I’d ever seen before today, it is actually the third version on YouTube and was “remixed” by Sony BMG in August 2008. The original was made by Karl Fisch in August 2006; Scott McLeod’s version dates to January 2007 (this was the first one to make it to YouTube and was responsible for the first viral wave); Jeff Brenman created a SlideShare version a couple of months later, and the official version 2.0 was made in consultation with XPLANE in June 2007.
In fairness to Fisch (sounds like a PETA chant), many of the shortcomings of the version that so annoyed me must be laid at the feet of the anonymous Sony drone responsible for the “remix”.
Not only did Fisch provide a text version and a list of his sources with version 1.0, but version 2.0 does a better job than the Sony version of acknowledging the sources in the course of the presentation and even comes with its own wiki, mentioned in the presentation. Version 2.0 is also considerably more coherent and much nicer to look at, and does a (somewhat) better job of avoiding the “eek, brown people!” tone. (Fisch says in a couple of places that he and McLeod, in response to criticism, consciously worked to reduce that “us vs them” feeling, and points out here that he views it as largely an unforeseen side-effect of some of the changes between his original PowerPoint version, made for his immediate colleagues, and the first YouTube version.) Finally, kudos for choosing a Creative Commons license (even though I don’t like copyleft): although the Sony version leaves this out, all versions are CC-BY-NC-SA (source files are available on the wiki).
In my opinion it’s a damn shame that the Sony version took off (at the time of writing, there are two copies on YouTube with 4,458,229 and 29,828 views, respectively). If you come across someone talking about that version, do everyone a favour and point them to version 2.0.

history in the making

I have plenty to blog about and no time to do it in, but I can’t turn down a request for a quick plug on such a cool topic: Aaron Rowe from Wired Science alerted me to the fact that Falcon 1 made it to space: the first privately developed liquid-fueled launch vehicle to achieve Earth orbit. Aaron’s followup entry has some good background for those who haven’t been following the SpaceX story.
Our Star Trek future gets closer every day.

In which Gavin Baker finds one of my pet peeves

It really chafes my scrote when someone says something like this:

A comment to bloggers. I do my best to credit blog posts by the author’s real name. However, if you blog under a psuedonym [sic] and don’t make it easy to find your actual name, I may not. Unless you want me to attribute your writings to your silly Internet handle, you should include your name somewhere prominent (if not on every page, on the “About” or “Contact” page).

With all due respect, Mr Baker, it’s not up to you where I should or shouldn’t put my “real” name; plenty of people have damn good reasons for remaining anonymous online. Nor is it up to you to sneer at someone’s “silly internet handle”. Put the nick in quotes if you must, and move on. It’s a name, it attaches to a person, and it matters — at least it should matter — a good deal less than the substance of whatever you’re quoting.
I realise that netonyms have been passé among the hipsterati for some time now, and my impression is that this is a good thing, due mainly to their being more comfortable online than crusty old luddites like me. Nonetheless, that you haven’t been online long enough to have a nick that half your friends use instead of your “real” name is no reason the rest of us should subscribe to your particular view of how the internets should work. You can quote me on that — you can even use my “real” name if you want.
Damn kids, get off my lawn, mutter grumble mutter mutter…

brief idea/question

This reminded me of the famous psych experiments conducted by Milgram and Zimbardo, about which every thinking person spends some time wondering and which are more on the public mind than ever since Abu Ghraib. For some reason, this time it occurred to me, as it has not previously, that I’d like to hear from the subjects themselves. I found this account of the Milgram experiment by a participant, but nothing else like it. Does anyone know where I could find more such accounts?

Ray: thanks man, I needed that.

Ray wrote this little song for his daughter, who is having a hard time at work. I hope it gave the young lady in question as much of a lift as it did me.
Ze Frank has a couple of dozen remixes in this gallery; I like Brother Klez’s Feel-The-Spirit Revival Mix, Psycho Dragon Joe’s HateMyJob Club Mix and the Portishead-Style Mix, but the original file is still my favourite.
Thanks, Ray.

When life gives you melons…

Breasts, breasts, breasts, breasts, breasts.
Breasts, breasts, breasts.
Breasts, breasts, breasts, breasts, breasts, breasts, breasts!
Breasts, breasts, breasts.
Breasts, breasts!
There. Now, if Ann Outhouse wants to accuse someone of using breasts to drive traffic to a blog, let her accuse me!
It’s a shame about BoobGate, because Terrance has a more interesting point to make, but except for one noteworthy outbreak of vileness it seems to have been (uh-huh, uh-huh) overshadowed by Jessica’s breasts.
(Title shamelessly stolen from Ann.)

OK, I’ll play.

From Julie via Janet, the random quotations game: Go here and look through random quotes until you find 5 that you think reflect who you are or what you believe; grab the first five you come to or you’ll be there forever looking for a perfect set.

I do not want people to be agreeable, as it saves me the trouble of liking them.

Jane Austen (1775 – 1817)

When a man assumes a public trust, he should consider himself as public property.

Thomas Jefferson (1743 – 1826)

It may be true that the law cannot make a man love me, but it can stop him from lynching me, and I think that’s pretty important.

Martin Luther King Jr. (1929 – 1968)

The public interest is best served by the free exchange of ideas.

Judge John Kane, US District Court

What are the facts? Again and again and again – what are the facts? Shun wishful thinking, ignore divine revelation, forget what “the stars foretell”, avoid opinion, care not what the neighbors think, never mind the unguessable “verdict of history” – what are the facts, and to how many decimal places? You pilot always into an unknown future; facts are your single clue. Get the facts!

Lazarus Long, from Robert Heinlein’s “Time Enough for Love”

There you have it: the eternal tension between my desire to help others and my distaste for them in the flesh; an important facet of my view of both politics and academia; a succinct expression of the function of law and the reason we form societies in the first place; an endorsement of Open Access; and the reason I’m a scientist by trade (I know, Heinlein’s kind of a jerk, but that doesn’t stop him from being right every now and then).

Wayback Weirdness

Peter Suber recently linked to a post on the LibraryLaw blog which asked why the Wayback Machine does not seem to archive National Science Foundation pages:

I was just looking on the National Science Foundation’s web site to try to find the Index of FOIA Frequently Requested Documents. The Index is mentioned in the NSF’s Public Information Handbook. When I couldn’t find the Index, I realized the Handbook was written in 1999, and perhaps an older version of the NSF website had a copy of the Index. So I went to the Internet Archive’s trusty Wayback Machine, and put in the NSF’s web address. Yesterday when I looked at the results page, there were no results, and the statement that the site had been blocked by robots.txt was the only information returned. Today, the Wayback Machine’s results page shows each instance when the site was archive, from 1997 to 2005, but when you click on a link, the resulting page is empty and has this message:”We’re sorry, access to has been blocked by the site owner via robots.txt.”

I thought this was weird, and wrote the NSF webmaster, who wrote back to say this:

NSF blocks all indexing of the site between 7AM and 7PM ET, our peak traffic hours, for the convenience of our users. However, there is no block on the site from 7PM to 7AM ET. This is standard policy for most high traffic sites. The owner of [the Wayback Machine] need only comply with our policy in order to index our pages.

So that made me wonder whether the Wayback Machine is aware that NSF has this policy, or whether there might be some other error somewhere. Searching the Wayback Machine for “” or “” produces a list of archived pages. Clicking on any of those links earlier today produced a file location error, but right now (some hours later) it’s working fine. The earliest available version of the relevant public information page says that the document Susan was looking for is “coming soon”, but I couldn’t find it even though I went through about six versions of the public information page from 1997 to 2005. The Public Info Handbook actually says

An index of FOIA Frequently Requested Records will be published, if applicable, on the Home Page under “Public Information – FOIA and Privacy Act Requests.” Where possible, this will include an electronic version of the actual records released.

(emphasis mine), so perhaps it was never added. Searching the current NSF site for “frequently requested” does not turn up the index in question, and neither does searching their publications for “FOIA”, but I did find a recent management plan (pdf) which includes “Review Agency posting of statements of policy, administrative staff manuals and copies of frequently requested records” in a list of areas “identified for review”. So perhaps it’s still “coming soon”, 9 years on. We are, after all, talking about a government agency.
Incidentally, the NSF’s robots.txt file is right where it should be:

# robots.txt for
# Change history:
User-agent: vspider
Disallow: /cgi-bin/
Disallow: /stats/
Disallow: /home/nsforg/
Disallow: /awards/
Disallow: /pubsys/data/
Disallow: /search97cgi/
Disallow: /seind98/topdemo.htm
Disallow: /nsf99338/topdemo.htm
Disallow: /home/ebulletin/archive/
Disallow: /sbe/srs/start.htm
Disallow: /web/
Disallow: /geo/
Disallow: /eng/
Disallow: /home/crssprgm/igert/survey/
Disallow: /staff/
Disallow: /ads-cgi/
Disallow: /awardsearch/
User-agent: *
Disallow: /cgi-bin/
Disallow: /nsf99338/topdemo.htm
Disallow: /home/ebulletin/archive/
Disallow: /home/crssprgm/igert/survey/
Disallow: /staff/
Disallow: /ads-cgi/
Disallow: /awardsearch/

The Wayback Machine uses Alexa crawlers, so as far as I can tell the file as shown allows vspider (a commercial spiderbot) more limited access, but every other robot can go to most of the site. It doesn’t change (I checked before and after 7pm ET; same file), so NSF must be implementing their block some other way. F’rinstance, .htaccess can serve/block pages depending on the time of day.
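
If you’d rather not read the rules by eye, here is a minimal sketch that feeds the file above to Python’s standard urllib.robotparser and asks what each robot may fetch. The helper name `allowed` and the sample paths are mine, and I’m assuming the Alexa crawler identifies itself as `ia_archiver`; the check only reflects what the robots.txt says, not whatever time-of-day block NSF applies elsewhere.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules exactly as listed above (comment lines omitted).
ROBOTS_TXT = """\
User-agent: vspider
Disallow: /cgi-bin/
Disallow: /stats/
Disallow: /home/nsforg/
Disallow: /awards/
Disallow: /pubsys/data/
Disallow: /search97cgi/
Disallow: /seind98/topdemo.htm
Disallow: /nsf99338/topdemo.htm
Disallow: /home/ebulletin/archive/
Disallow: /sbe/srs/start.htm
Disallow: /web/
Disallow: /geo/
Disallow: /eng/
Disallow: /home/crssprgm/igert/survey/
Disallow: /staff/
Disallow: /ads-cgi/
Disallow: /awardsearch/
User-agent: *
Disallow: /cgi-bin/
Disallow: /nsf99338/topdemo.htm
Disallow: /home/ebulletin/archive/
Disallow: /home/crssprgm/igert/survey/
Disallow: /staff/
Disallow: /ads-cgi/
Disallow: /awardsearch/
"""


def allowed(agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets `agent` fetch `path` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # "ia_archiver" is an assumed user-agent name for the Alexa crawler.
    for agent in ("vspider", "ia_archiver"):
        for path in ("/", "/awards/", "/geo/", "/staff/"):
            verdict = "allowed" if allowed(agent, path) else "blocked"
            print(f"{agent:12} {path:10} {verdict}")
```

Any robot without its own section falls through to the `User-agent: *` rules, which is why the generic crawler gets into /awards/ and /geo/ while vspider does not.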
So, to sum up: NSF only restricts access during peak hours, and the Wayback Machine knows about this and archives the site just fine. The index of FOIA requests that Susan was looking for, however, does not appear to be available. The person to ask would appear to be the FOIA Officer.