What I’ve Been Reading: Darwin’s Doubt – Pt 4

For the previous installments: part 1, 2, 3.

Chapter 9

In the early 1960s, MIT professor Murray Eden set out to discover whether neo-Darwinism could account for the origin of new organisms. He knew life was based on a genetic code and, drawing on our shared experience of all other coded systems, he assumed the sequence of nucleotides was absolutely critical to its function. If you start adding, deleting, or moving pieces of a digital code, for example, the meaning (function) is degraded or even lost. If we can’t create a better program by randomly adding, deleting, and moving pieces of digital code, why think a Darwinian process that makes random changes to the DNA code could build better and novel organisms (indeed, why think a coded system could ever be built by random processes to begin with)?

In 1966, Eden and other colleagues convened a conference at the Wistar Institute in Philadelphia. The conference was titled “Mathematical Challenges to the Neo-Darwinian Interpretation of Evolution.” The conference sought to explore the creative power of natural selection acting on random mutations. Those present recognized that there are an enormous number of ways to combine amino acids together to form protein chains. And while they did not know precisely how many combinations could result in a functional protein compared to those that could not, they did know the number of functional combinations was extremely small.

The difficulty of creating a functional protein increases exponentially as the size of the protein increases (what’s called “combinatorial inflation”). To see why, consider the math. There are four DNA bases that can occur at any point along the backbone of a DNA molecule. Each DNA base you add to the chain multiplies the number of possible combinations by 4. So for a DNA chain consisting of 2 DNA bases, there are 16 possible combinations (4 x 4); for a chain of 3 DNA bases there are 64 (4 x 4 x 4), etc. For a DNA string composed of just 15 bases, there are 1,073,741,824 different possible arrangements. But when it comes to building proteins, we must concern ourselves with groups of DNA bases. It takes a group of three consecutive DNA bases (called a “codon”) to code for a single amino acid in a protein chain, so to calculate the probability of creating any given protein one must calculate the odds at the codon level. Most genes consist of ~1000 DNA bases, which is ~333 codons. Each codon could code for one of 20 different amino acids, so a protein chain consisting of 333 amino acids is just one of roughly 10⁴³³ possible sequences in which those amino acids could have been arranged (20 multiplied by itself 333 times). What are the chances that blind, random processes would create a sequence so specific? Prohibitively low (a scientific way of saying “not a chance!”). There are only about 10⁸⁰ elementary particles in the entire known universe. That means that even if every particle in the universe were working toward creating this single sequence for all 13.7 billion years the universe has existed, it would amount to only a tiny fraction of the time required to create even one functional protein.
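The arithmetic in this paragraph can be checked with a few lines of Python. This is only a sketch: the function names are mine, and it takes the post's round figures (4 bases, 20 amino acids, ~1000 bases per gene) at face value. It works in powers of ten because the full numbers are unwieldy to read.

```python
import math

def count_sequences(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length

def log10_sequences(alphabet_size: int, length: int) -> float:
    """Same count, expressed as a power of ten (the exponent only)."""
    return length * math.log10(alphabet_size)

# DNA: 4 possible bases at each position
print(count_sequences(4, 2))    # 16
print(count_sequences(4, 3))    # 64
print(count_sequences(4, 15))   # 1073741824

# Protein: 20 possible amino acids at each position
codons = 1000 // 3              # ~333 codons in a ~1000-base gene
print(codons)                                # 333
print(round(log10_sequences(20, codons)))    # 433, i.e. about 10^433
print(round(log10_sequences(20, 300)))       # 390, i.e. about 10^390
```

Note that a 333-amino-acid chain comes out near 10⁴³³ possible sequences, while a figure of 10³⁹⁰ corresponds to a chain of 300 amino acids.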

It won’t help to imagine blocks of different genes being reassembled into new genes either. This is no more likely to produce functional biological information than taking random paragraphs from hundreds of different novels and pasting them together is going to create a coherent, new novel.

Evolutionary biologists were not deterred by the math because they assumed that the ratio of functional to non-functional sequences of DNA was high. In other words – unlike human language in which only a very small number of letter combinations create meaningful messages, and even small changes to the sequence degrade or destroy the information – they thought virtually any combination of DNA letters would produce meaningful biological information. If true, then virtually any combination of letters produced by random mutations will result in new function.

In the 1960s, we didn’t know enough about genetics to answer the question of how rare functional genetic sequences are in the total space of all genetic possibilities (“sequence space”). We only knew that different proteins, each with a slightly different amino acid sequence, could perform the same basic function. This suggested that function was not dependent on a highly specified ordering of amino acids, and thus a highly ordered sequence of DNA.

So how rare are functional proteins? MIT molecular biologist Robert Sauer was the first to attempt to answer this question with precision. In the late 1980s and early 90s, Sauer performed experiments that allowed him to estimate the chances of producing a single functional protein (consisting of 92 amino acids) by chance at 1 in 10⁶³.[1] That means for every one functional protein that is 92 amino acids in length, there are 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 other combinations that will not function.

Once again, defenders of evolution were not deterred by the math. Instead of focusing on the extreme rarity of functional proteins, and the prohibitively low odds of nature stumbling on even one, they focused on Sauer’s finding that there are a lot of functional sequences, and that even different sequences can produce the same basic function. To see why there are so many, once again we must do the math. There are 20⁹² (roughly 10¹²⁰) possible combinations of amino acids in a protein consisting of 92 amino acids. The great disparity between the number of possible combinations (20⁹²) and the chances of producing a functional combination (1 in 10⁶³) means that there are many functional combinations. The elephant in the room that they chose to ignore, however, is how extremely unlikely it would be that random processes would stumble on even one of these functional sequences given the enormous number of possible combinations nature has to try. It’s like a bike thief who is overwhelmed by the odds of stumbling on the combination for a ten-dial bike lock (1 in 10 billion), but then gains hope when he discovers that there are millions of bike locks in the world. If he doesn’t have the resources to stumble on the combination of even one bike lock,[2] the existence of millions of others with different combinations does him no good.
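Both calculations in this paragraph are easy to reproduce. The sketch below is illustrative only: the variable names are mine, the bike lock is assumed to have ten digits on each of its ten dials, and a year is taken as 365 days. The six-seconds-per-guess figure comes from footnote 2.

```python
import math

# 92-amino-acid protein: total sequences vs. the 1-in-10^63 functional ratio
total_log10 = 92 * math.log10(20)      # exponent of 20^92 as a power of ten
functional_log10 = total_log10 - 63    # exponent of the implied functional count

print(round(total_log10, 1))        # 119.7 -> about 10^120 sequences in all
print(round(functional_log10, 1))   # 56.7  -> about 10^57 functional sequences

# Ten-dial bike lock, assuming ten digits per dial
lock_combinations = 10 ** 10                   # 10 billion
seconds_per_guess = 6                          # figure from footnote 2
seconds_per_year = 365 * 24 * 60 * 60          # 365-day year (assumption)
years_to_try_all = lock_combinations * seconds_per_guess / seconds_per_year

print(lock_combinations)            # 10000000000
print(int(years_to_try_all))        # 1902
```

So even with something like 10⁵⁷ functional sequences available, each one remains a 1-in-10⁶³ target, which is the point of the bike lock analogy.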

Chapter 10

Douglas Axe was also interested in determining the rarity of functional sequences in protein sequence space, as well as the power of the neo-Darwinian mechanism to explain the origin of new protein folds. For proteins to function, they not only need to have the right sequence of amino acids, but the protein chain also needs to be folded in a very particular way. Axe recognized that protein folds are the smallest level of structural innovation in an organism. If neo-Darwinian mechanisms are not able to account for the origin of new protein folds, then neo-Darwinism is dead in the water as an explanation for the origin of functional biological novelty.

Axe was able to demonstrate that folded, functional proteins are extremely rare within sequence space. The chances of nature stumbling on a functional, folded protein consisting of just 150 amino acids are 1 in 10⁷⁴! Whether an improbable event is likely to occur depends on the number of attempts available (probabilistic resources). If the odds of winning a game are 1 in 100, and it is played only once, the odds of winning are small. But if you play the game 200 times, you can expect to win about twice. When it comes to odds, anything less than half is unlikely to occur, while anything more than half is likely to occur. So how many opportunities does nature have to stumble on a functional, folded protein?

At most, only 10⁴⁰ organisms have ever lived since the origin of life 3.5 billion years ago, and the vast majority of these are microbial life. Even if every organism that ever lived miraculously experienced a series of mutations that caused each organism to develop one of the unique 10⁷⁴ possible protein sequences, we would not even come close to making it likely that one organism would stumble on the correct sequence. The odds are considerably less than half, and thus unlikely. And when you consider that hundreds of thousands of new proteins are necessary to account for the diversity of life forms, that most proteins are much longer than 150 amino acids, that proteins are extremely sensitive to functional loss when mutations are multiplied, and that the greatest diversification of life occurred in just tens of millions of years at best, the chances of these new proteins/folds being discovered by random change processes are negligible to the point of being laughable.
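The idea of probabilistic resources can be made concrete with a short calculation. This is a rough sketch under stated assumptions: the function name is mine, each attempt is treated as an independent trial with the same odds, and the inputs are simply the figures quoted above (1 in 100 for the game, 1 in 10⁷⁴ and 10⁴⁰ organisms for the protein search).

```python
import math

def prob_at_least_one(p: float, n: float) -> float:
    """Chance of at least one success in n independent tries, each with odds p."""
    # Equivalent to 1 - (1 - p)**n, but written with log1p/expm1 so that
    # a very small p is not rounded away by floating-point arithmetic.
    return -math.expm1(n * math.log1p(-p))

# The 1-in-100 game from the text
game_once = prob_at_least_one(0.01, 1)      # a single play
game_200 = prob_at_least_one(0.01, 200)     # 200 plays
expected_wins_200 = 0.01 * 200              # average number of wins in 200 plays

# The protein search: 1-in-10^74 odds, 10^40 attempts
protein_search = prob_at_least_one(1e-74, 1e40)

print(f"one play:            {game_once:.2f}")
print(f"200 plays:           {game_200:.2f}")
print(f"expected wins (200): {expected_wins_200:.0f}")
print(f"protein search:      {protein_search:.0e}")
```

Under these assumptions, 10⁴⁰ attempts at 1-in-10⁷⁴ odds yield a chance of about 1 in 10³⁴ of even a single success, far below the one-half threshold.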

Chapter 11

While Douglas Axe demonstrated that there is not enough time in the history of the universe for natural selection acting on random mutations to find even a single, functionally folded protein sequence, detractors point to a scientific review essay by Manyuan Long, which reviews several studies purporting to show the origin of new genetic information capable of creating new proteins. The paper, “The Origin of New Genes: Glimpses from the Young and Old,” appeared in Nature Reviews Genetics in 2003. But does this paper truly demonstrate how Darwinian processes can create new genetic information as claimed? No.

Each of the studies reviewed by Long is based on taking an existing gene, then finding a similar gene in a related organism, constructing a hypothetical common ancestor to both genes, and then postulating an evolutionary scenario (different types of mutation events) that explains how ancestral gene A became gene B in organism X and gene C in organism Y. The story usually starts with gene duplication – an event in which an extra copy of a gene is created during reproduction. After this, a host of mutational events occur to the duplicate gene that turn it into a new gene capable of coding for new proteins:

Exon shuffling
Retropositioning
Lateral gene transfer
Point mutations

There are a number of problems with these studies. First, no one actually shows how these mutational events worked together in a step-wise fashion to produce the modern forms of the ancestral gene. There is no biochemical pathway of the evolutionary change provided to the reader. No analysis is done to show how much genetic change was required for gene A to evolve into genes B/C, nor is there any attempt to show that there was enough time between gene A and genes B/C for evolution to work its magic through trial and error. No mathematical models are offered. No empirical evidence is appealed to.[3] Instead, these mutational events are offered more as a conceptual model as to how the modern genes could have been produced.

Second, even if the studies Long surveys did explain how new genetic information is formed, because all of the experiments start with existing information (an ancestral gene), none of them are capable of showing how the genetic information coded into DNA arose in the first place. They attempt to trace the history of changing genetic information rather than the origin of genetic information itself. It’s much easier to create small amounts of new information from large amounts of existing information than it is to create large amounts of information from informationless resources. So even if the studies did explain the origin of new information in existing organisms, it would do nothing to explain the origin of information in the first life form, or the explosion of information in the Cambrian.

Third, the ancestral genes that serve as the basis for constructing models of genetic evolution are hypothetical, not actual. These ancestral genes appear nowhere in the biological world. One must presuppose the truth of common ancestry before it becomes plausible to think such ancestral genes truly existed, and even then, who is to say that the biologists’ reconstruction of that ancestral gene is correct? Trying to construct the ancestral gene from two closely related modern genes is like trying to construct an ancestral story of the flood based on a reading of the Jewish and Babylonian accounts. One could reconstruct the original story in all sorts of ways, which creates a myriad of evolutionary accounts one could give for the story.

Fourth, biologists will counter that they are not merely presupposing the truth of common descent when they postulate these ancestral genes, because homologous genes in similar organisms provide evidence for common ancestry. But this just argues in a circle. The reason they see homologous genes as evidence for common ancestry is that they already presuppose the truth of common ancestry. The only way appealing to homologous genes in support of common descent would not be guilty of begging the question is if the only viable explanation for why two organisms share similar genes is that they descended from a common ancestor. But that’s not clear at all. An equally plausible explanation is that a designing intelligence created both organisms with similar genes to perform similar functions unique to each organism. Before homologous genes can be used as evidence for common descent (as opposed to merely being consistent with common descent), one must show why the design hypothesis is not viable. That’s a tall order.

A second problem with appealing to homologous genes as evidence of common descent (and thus a common ancestral gene) is that homologous genes do not always appear in similar organisms. Sometimes they appear in organisms whose common ancestor could not have possessed an ancestral form of the gene. In such cases, evolutionists claim the same (or similar) gene arose more than once, each time independently of the other(s). This is genetic convergent evolution. Examples of so-called genetic convergent evolution undermine the presupposition underlying the theory of common descent: that homologous genes indicate similar ancestry. Such examples demonstrate that just because two genes in two organisms are similar, it does not mean that the two genes are related by a common, ancestral organism/gene. Genetic convergent evolution negates the logic of the argument for common descent from homology.

Fifth, there are many genes that cannot be explained by descent with modification from an ancestral gene because the genes are unique to a specific organism, and radically different from any genes present in similar organisms or the closest supposed ancestor. Scientists call these ORFan genes (open reading frames of unknown origin). The list of ORFan genes continues to grow, with no homologous genes in sight. This presents a problem for Darwinists because they do not have a mechanism to explain the sudden appearance of such genes. The studies examined in Long’s review essay do not address the origin of ORFan genes at all. They do refer to genes being created de novo, but that is not a mechanism. That is merely a recognition that the genes are so different and so unique that they cannot be explained by their mutational scenarios. To say a gene originated de novo is worse than pleading ignorance; it’s an appeal to naturalistic magic.

Sixth, the mutational processes invoked to explain the rise of new genes and new genetic information do nothing to explain how evolution gets around the combinatorial search problem. There simply isn’t enough time for the Darwinian process to stumble on a combination that results in a new functional protein, and no attempt is made to show how Darwinian processes can overcome this.

Seventh, the studies cited by Long provide no plausible evidence to support their storytelling. It’s one thing to provide a theory or plausible mechanism for generating new genetic information, but it’s an entirely different matter to provide empirical evidence that this is what happened. No actual biochemical pathways are ever provided. We’re just told stories. Hand-waving and using intellectual-sounding phrases such as “extensive refashioning,” “genes emerge and evolve rapidly,” “hypermutability,” “rapid, adaptive evolution,” and “fortuitous juxtaposition of suitable sequences” are no substitutes for actual empirical evidence.

[1]There are 20⁹² possible combinations of amino acids in a protein consisting of 92 amino acids, which means the ratio of functional to non-functional sequences is extremely low. Functional sequences are very rare in sequence space. And yet, the great disparity between the number of possible combinations and the chances of producing a functional combination means that there are many functional combinations. It’s just extremely unlikely that nature would stumble on any one of them given the enormous number of possibilities to try.

[2]If each attempt to guess the correct combination by chance took six seconds, it would take the thief more than 1,902 years to crack the combination (assuming he did not sleep).

[3]Of course, evidence is appealed to in support of the fact that such mutations do occur in nature. But no empirical evidence is provided to show that specific mutations were responsible for evolving gene A into genes B/C, the order in which they occurred, etc.

Thinking to Believe