spurious factoid proves exceedingly hard to kill

Ever notice how things that “everybody knows” are usually wrong? Here’s another one (from here, concerning this paper):

It is an established fact that 98 percent of the DNA, or the code of life, is exactly the same between humans and chimpanzees. So the key to what it means to be human resides in that other 2 percent.

Argh. Actually, it’s an established fact that this meme, or trope, or whatever you want to call it, is bollocks. Individual human genomes vary by about 0.08% at the single-nucleotide level, whereas human and chimpanzee genomes differ by about 1-1.5% at the same level. Even that figure is misleading, because single-nucleotide comparison means aligning comparable sequences base by base and counting only the positions where one base has been swapped for another. To line up the two sequences in the first place, however, you have to introduce gaps into each sequence to allow for insertions and deletions, and those gaps are left out of the count. Like this:


In this made-up example, three bases out of fifty are different (6%) but the gaps account for a further 7 bases’ worth of difference (14%). Do this with enough regions of each genome to get a representative sample and you can estimate the degree of sequence identity between the two genomes. Of the optimally-aligned sections of our genomes, we share about 98.5-99% with chimps, but taking the gaps into account produces a rather lower figure of about 95%, something Roy Britten showed in 2002.
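
To make that arithmetic concrete, here is a minimal Python sketch that counts substitutions and gap positions in a pairwise alignment. The two sequences are invented for illustration (they are not the alignment from the original figure, and they are not real human or chimp sequence), and the function name `alignment_stats` is hypothetical. It follows the made-up example's convention of dividing by the total number of alignment columns, which is an assumption; published estimates may choose the denominator differently.

```python
def alignment_stats(a, b, gap="-"):
    """Count substitutions and gap columns in two pre-aligned sequences.

    Returns (mismatches, gap_columns, total_columns).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    mismatches = 0
    gap_columns = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            gap_columns += 1      # insertion or deletion in one sequence
        elif x != y:
            mismatches += 1       # single-nucleotide substitution
    return mismatches, gap_columns, len(a)


# Hypothetical 50-column alignment, written in blocks of ten for readability.
seq_a = "".join([
    "ACGTACGTAC",
    "GTTAGC---A",   # 3-base gap in sequence A
    "CCGTAGGCTA",
    "TTGACCGATC",
    "GGATCCATGA",
])
seq_b = "".join([
    "ACGTACGTAC",
    "GTTAGCTTGA",
    "CCGAAGGCTA",   # 1 substitution
    "TT----GATC",   # 4-base gap in sequence B
    "GGCTCCATGT",   # 2 substitutions
])

m, g, n = alignment_stats(seq_a, seq_b)
print(f"substitutions: {m}/{n} = {m / n:.0%}")                  # substitutions: 3/50 = 6%
print(f"gap positions: {g}/{n} = {g / n:.0%}")                  # gap positions: 7/50 = 14%
print(f"identity counting gaps: {(n - m - g) / n:.0%}")         # identity counting gaps: 80%
```

Counting only the substitutions gives 94% identity here; counting the gaps as well drops it to 80%, which is the same effect, on a toy scale, as the drop from roughly 98.5-99% to roughly 95% described above.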

What both figures overlook, and tend to obscure, are differences in the organisation of large sections of the genetic information: duplications, inversions, recombinations between and within chromosomes, insertions of retroviral sequences, and so on. I wrote earlier about a method (ROMA) that allows us to measure such differences. Variation between individual humans on this scale seems to run at about 1.5% (cf. 0.08% at the nucleotide level); it will be interesting to see a ROMA-based comparison between humans and chimps. It is on this organisational scale that the real clues to the inscrutable Decree will be found.