Sandwalk: Revisiting the genetic load argument with Dan Graur

Friday, July 14, 2017

The genetic load argument is one of the oldest arguments for junk DNA and it's one of the most powerful arguments that most of our genome must be junk. The concept dates back to J.B.S. Haldane in the late 1930s but the modern argument traditionally begins with Hermann Muller's classic paper from 1950. It has been extended and refined by him and many others since then (Muller, 1950; Muller, 1966).

Several prominent scientists have used the genetic load data to argue that most of our genome must be junk (King and Jukes, 1969; Ohta and Kimura, 1971; Ohno, 1972). Ohno concluded in 1972 that ...

... all in all, it appears that calculations made by Muller, Kimura and others are not far off the mark in that at least 90% of our genome is 'junk' or 'garbage' of various sorts.

It's important to keep in mind that the genetic load argument is one of the Five Things You Should Know if You Want to Participate in the Junk DNA Debate. It's also very important to understand that this is positive evidence for junk DNA based on fundamental population genetics. It refutes the popular view that the idea of junk DNA is just based on not knowing all the functions of our genome. There's delicious irony in being accused of argumentum ad ignorantiam by those who are ignorant.

I've discussed genetic load several times on this blog (e.g. Genetic Load, Neutral Theory, and Junk DNA) but a recent paper by Dan Graur provides a good opportunity to explain it once more. The basic idea of genetic load is that a population can only tolerate a finite number of deleterious mutations before going extinct. The theory is sound but many of the variables are not known with precision.

Let's see how Dan handles them in his paper (Graur, 2017). In order to calculate the genetic load (or mutation load), we need to know the size of the genome, the mutation rate, and the percentage of mutations that are deleterious. Dan Graur assumes that the diploid genome size is 6.114 × 10⁹ bp based on accurate cytology measurements from 2010. I think the DNA sequence data is more accurate so I would use 6.4 Gb. The difference isn't important.

There's a huge literature on mutation rates in humans. We don't know the exact value because there's a fair bit of controversy in the scientific literature. The values range from about 70 new mutations per generation to about 150 [see: Human mutation rates - what's the right number?]. Graur uses a range of mutation rates covering these values. He expresses them as mutations per site per generation, which translates to values from 1.0 × 10⁻⁸ to 2.5 × 10⁻⁸. As we shall see, he calculates the genetic load for a range of mutation rates in order to get an upper limit to the amount of functional DNA in our genome.
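The conversion between the two ways of quoting the mutation rate is just a division by the genome size. Here is a minimal sketch in Python, assuming Graur's diploid genome size of 6.114 × 10⁹ bp as quoted above; the function and variable names are illustrative only.

```python
# Diploid genome size used by Graur, as quoted in the post (base pairs).
DIPLOID_GENOME_BP = 6.114e9


def per_site_rate(count: float, genome_bp: float = DIPLOID_GENOME_BP) -> float:
    """Convert new mutations per genome per generation to a per-site rate."""
    return count / genome_bp


def mutations_per_generation(rate: float, genome_bp: float = DIPLOID_GENOME_BP) -> float:
    """Convert a per-site rate back to new mutations per genome per generation."""
    return rate * genome_bp


# Counts quoted in the post, expressed per site.
for count in (70, 100, 150):
    print(f"{count} mutations/generation = {per_site_rate(count):.2e} per site")

# Graur's per-site range, expressed as mutations per generation.
for rate in (1.0e-8, 2.5e-8):
    print(f"{rate:.1e} per site = {mutations_per_generation(rate):.0f} mutations/generation")
```

With this genome size, 70 to 150 mutations per generation corresponds to roughly 1.1 × 10⁻⁸ to 2.5 × 10⁻⁸ per site, so the per-site range quoted above brackets it.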

The most difficult part of these calculations is estimating the percentage of mutations that are beneficial, neutral, and deleterious. Population geneticists have rightly assumed that the number of beneficial (selected) mutations is insignificant so they concentrate on the number of deleterious mutations. The estimates range from about 4% of the total mutations to about 40% of the total based on the analysis of mutations in coding regions.

Most scientists assume that the correct value is about 10% of the total. What this means is that if there are 100 new mutations in every newborn there will be about 10 deleterious mutations if the entire genome is functional. If only 10% is functional then there will be only 1 deleterious mutation per generation. A mutation load of about one deleterious mutation per generation is the limit that a population can tolerate. Graur assumes 0.99. Others have proposed that the mutation load could be higher (Lynch, 2010; Agrawal and Whitlock, 2012) but it's unlikely to be more than 1.5. The difference isn't important.

Graur calculates a range of deleterious mutation rates (μ_del) based on multiplying the percentage of deleterious mutations times the total number of mutations.
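The arithmetic in the last two paragraphs can be sketched in Python. The function names are hypothetical, and the sketch assumes that mutations land uniformly across the genome, so the share that hits functional DNA equals the functional fraction.

```python
def deleterious_per_generation(total_mutations: float,
                               functional_fraction: float,
                               deleterious_fraction: float) -> float:
    """Expected new deleterious mutations per individual per generation."""
    return total_mutations * functional_fraction * deleterious_fraction


def mu_del(rate_per_site: float, deleterious_fraction: float) -> float:
    """Per-site deleterious mutation rate within functional DNA."""
    return rate_per_site * deleterious_fraction


# 100 new mutations, with 10% of mutations in functional DNA being deleterious.
print(deleterious_per_generation(100, 1.0, 0.10))  # entire genome functional
print(deleterious_per_generation(100, 0.1, 0.10))  # only 10% of the genome functional

# Extremes obtained by combining the ranges quoted in the post.
print(f"{mu_del(1.0e-8, 0.04):.1e}")  # lowest mutation rate, smallest deleterious fraction
print(f"{mu_del(2.5e-8, 0.40):.1e}")  # highest mutation rate, largest deleterious fraction
```

The first two calls reproduce the "about 10" and "only 1" figures above. The last two combine the extremes of the quoted ranges, giving per-site deleterious rates of 4 × 10⁻¹⁰ and 1 × 10⁻⁸.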

The other variable is the replacement level fertility of humans (F). Think of it this way: if every child has a significant number of deleterious mutations then the population can still survive if every couple has a huge number of children. Statistically, some of them will have fewer deleterious mutations and those ones will survive. If F = 50 then in order to get one survivor each person needs to have 50 children (or each couple needs to have 100 children).

Historical data suggests that the range of values goes from 1.05 to 1.75 per person (2.1 to 3.5 children per couple). Graur makes the reasonable assumption that the maximum sustainable replacement level fertility rate is 1.8 per person in human populations over the past million years or so.

The important part of the Graur paper is the table he constructs where he estimates the number of deleterious mutations by combining the mutation rate and the percentage of deleterious mutations on the y-axis and the fraction of the genome that may be functional on the x-axis. At the intersection of each value he calculates the minimum replacement level fertility values required to sustain the population.

Let's look at the first line in this table. The deleterious mutation rate is calculated using the lowest possible mutation rate and the smallest percentage of deleterious mutations (4%). Under these conditions, the human population could survive with a fertility value of 1.8 as long as less than 25% of the genome is functional (i.e. 75% junk) (red circle). That's the UPPER LIMIT on the functional fraction of the human genome.
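The post does not spell out how a deleterious mutation rate is turned into a required fertility. Here is a rough sketch in Python. It assumes the textbook approximation that deleterious mutations act independently, so that mean fitness is about e^(-U), where U is the expected number of new deleterious mutations per individual, and that the replacement level fertility is its reciprocal, F = e^U. This is an assumption for illustration, not necessarily the exact formula in Graur's paper, and all names are hypothetical.

```python
import math

DIPLOID_GENOME_BP = 6.114e9  # genome size quoted in the post


def required_fertility(mu_del: float, functional_fraction: float,
                       genome_bp: float = DIPLOID_GENOME_BP) -> float:
    """Fertility per person needed to offset the mutation load.

    Assumes mean fitness = exp(-U), with U = mu_del * number of functional sites.
    """
    u = mu_del * functional_fraction * genome_bp
    return math.exp(u)


def max_functional_fraction(mu_del: float, max_fertility: float,
                            genome_bp: float = DIPLOID_GENOME_BP) -> float:
    """Largest functional fraction tolerable at a given maximum fertility (capped at 1)."""
    return min(1.0, math.log(max_fertility) / (mu_del * genome_bp))


# First row of the table: lowest mutation rate (1.0e-8) times 4% deleterious.
mu_del_low = 1.0e-8 * 0.04
for fraction in (0.10, 0.25, 0.50, 1.00):
    fertility = required_fertility(mu_del_low, fraction)
    print(f"{fraction:.0%} functional -> F = {fertility:.2f}")

print(f"limit at F = 1.8: {max_functional_fraction(mu_del_low, 1.8):.0%}")
```

Under this approximation the first row crosses F = 1.8 at a functional fraction of roughly 24%, in line with the 25% upper limit read off Graur's table.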

But that limit is quite unreasonable. It's more reasonable to assume about 100 new mutations per generation with about 10% deleterious. Using these assumptions, only 10% of the genome could be functional with a fertility value of 1.8 (green circle).

Whatever the exact percentage of junk DNA, it's clear that the available data and population genetics point to a genome that's mostly junk DNA. If you want to argue for more functionality then you have to refute this data.

Note: Strictly speaking, the genetic load argument only applies to sequence-specific DNA where mutations have a direct effect on function. Some DNA serves as necessary spacers between functional sequences and this DNA will only be affected by deletion mutations. From what we know right now, this is a small percentage of the genome. However, there are bulk DNA hypotheses that attribute non-sequence specific function to most of the genome and if they are correct the genetic load argument carries no weight. So far, there is no good evidence that these bulk DNA hypotheses are valid and most objections to junk DNA are based on sequence-specific functions.

Agrawal, A. F., and Whitlock, M. C. (2012) Mutation load: the fitness of individuals in populations where deleterious alleles are abundant. Annual Review of Ecology, Evolution, and Systematics, 43:115-135. [doi: 10.1146/annurev-ecolsys-110411-160257]

Graur, D. (2017) An upper limit on the functional fraction of the human genome. Genome Biol Evol evx121 [doi: 10.1093/gbe/evx121]

King, J.L., and Jukes, T.H. (1969) Non-darwinian evolution. Science, 164:788-798. [PDF]

Lynch, M. (2010) Rate, molecular spectrum, and consequences of human mutation. Proceedings of the National Academy of Sciences, 107:961-968. [doi: 10.1073/pnas.0912629107]

Muller, H.J. (1950) Our load of mutations. American journal of human genetics, 2:111-175. [PDF]

Muller, H.J. (1966) The gene material as the initiator and the organizing basis of life. American Naturalist, 100:493-517. [PDF]

Ohno, S. (1972) An argument for the genetic simplicity of man and other mammals. Journal of Human Evolution, 1:651-662. [doi: 10.1016/0047-2484(72)90011-5]

Ohta, T., and Kimura, M. (1971) Functional organization of genetic material as a product of molecular evolution. Nature, 233:118-119. [PDF]

84 comments :

Anonymous said...: Although this https://www.quantamagazine.org/missing-mutations-suggest-a-reason-for-sex-20170713/ doesn't use the phrase "genetic load," it seems to be addressing the same topic to I think greatly different effect. Any comment?; Friday, July 14, 2017 12:42:00 PM
Markk said...: To me this shows that the word junk is bad in describing these parts of the genome. It should be called the buffer area or something like that.; Friday, July 14, 2017 2:11:00 PM
Eric said...: If I am understanding the data correctly, what it actually points to is the maximum size of the functional part of the human genome. Humans could have a genome that is more than 90% functional if the genome were about 600 Mbp.

On the flip side, if the size of the human functional genome was larger, say 1.2 Gbp, then this would tend to select for lower mutation rates. There might be a balancing act between the size of the functional genome and the fidelity of DNA replication during meiosis.; Friday, July 14, 2017 5:37:00 PM
Larry Moran said...: About 90% of our genome is nonfunctional DNA. We've been calling it "junk" for the past 45 years. Why would you want to call this nonfunctional DNA "buffer" unless you have evidence that it has such a function?; Friday, July 14, 2017 10:21:00 PM
Larry Moran said...: There is no evidence to support your bizarre speculation.

BTW, the vast majority of mutations we are discussing occur during mitosis, not meiosis.; Friday, July 14, 2017 10:25:00 PM
daedalus2u said...: Humans have a haploid life stage, which may be important in removing deleterious mutations. Data on other eukaryotes seems to indicate that selection in the haploid life stage does have F1 and longer effects.

Presumably these effects would be important in humans; in both haploid gametes.

http://www.pnas.org/content/early/2017/07/10/1705601114.short

If I may speculate, senescence in haploid gametes may be a low-cost “feature” that culls deleterious mutations from the F1 generation. Senescence in the adult (observed in essentially all organisms with a haploid life stage) may be an unavoidable consequence of senescence of haploid gametes.; Saturday, July 15, 2017 11:26:00 AM
Larry Moran said...: The mutation rates are based on actual data. Even if your speculation were true it has no effect on the rate calculated from actually looking at the number of mutations per generation.; Saturday, July 15, 2017 1:41:00 PM
Faizal Ali said...: Why are people so averse to the idea of junk DNA? Other than creationists, of course. I know why they don't like the idea.; Saturday, July 15, 2017 1:57:00 PM
daedalus2u said...: Maybe not on the mutation rate per se, but perhaps on how many of the mutations that show up in the F1 generation are deleterious.; Saturday, July 15, 2017 3:48:00 PM
Larry Moran said...: There are many reasons. Here's one ...

The Deflated Ego Problem

But the main reason is that many scientists don't understand evolution. They believe that everything is shaped by natural selection. This means that every transcript must have a purpose or it would have been eliminated by natural selection. The idea of junk DNA doesn't fit with their (false) worldview. They have no satisfactory explanation for junk.

The other (related) problem is that they don't understand the sloppiness of biochemistry. They haven't assimilated the idea that RNA polymerase can make mistakes, the spliceosome can screw up, and some transcription factors can bind nonproductively. They tend to have a Swiss watch view of biochemistry where everything is precisely tuned to regulate and control basic cellular processes.; Sunday, July 16, 2017 10:37:00 AM
Bill Cole said...: Transcriptional landscape of repetitive elements in normal and cancer human cells (July 2014; DOI: 10.1186/1471-2164-15-583)

Cancer cells are acting similar to how cells behave during embryo development. This paper shows increased activity of repeat sequences in cancer cells. The DNA that appears to be junk when measured in adult cells may be very active during embryo development. If DNA was degrading due to generational mutation why would there be conservation of repetitive sequences?; Sunday, July 16, 2017 12:42:00 PM
Mikkel Rumraket Rasmussen said...: Nothing you said made any sense. Evolutionary Conservation is an indication the sequences aren't junk. So if they really are conserved, then no-one here is claiming it's junk DNA.

Also, mutations in junk DNA can create activities in that junk DNA that make it interfere with normal cellular processes. In effect, the accumulation of mutations makes the "junk DNA" functional but in a deleterious way. For that reason, the mere fact that some piece of DNA is involved in disease like cancer doesn't actually indicate that it isn't junk DNA.; Sunday, July 16, 2017 1:17:00 PM
Bill Cole said...: "Nothing you said made any sense. Evolutionary Conservation is an indication the sequences aren't junk. So if they really are conserved, then no-one here is claiming it's junk DNA."

If repeat counts are not junk then Larry's estimates are off by a lot.

". For that reason, the mere fact that some piece of DNA is involved in disease like cancer doesn't actually indicate that it isn't junk-DNA."

It is evidence of function during embryo development.; Sunday, July 16, 2017 1:52:00 PM
Larry Moran said...: The paper refers to transcription of retrotransposon sequences. This is not evidence of function. It's almost certainly spurious transcription of bits and pieces of transposons that still retain promoter sequences in the LTR remnants.; Sunday, July 16, 2017 3:26:00 PM
Bill Cole said...: "Cancer cell lines display increased RNA Polymerase II binding to retrotransposons than cell lines derived from normal tissue. Consistent with increased transcriptional activity of retrotransposons in cancer cells we found significantly higher levels of L1 retrotransposon RNA expression in prostate tumors compared to normal-matched controls.

Conclusions
Our results support increased transcription of retrotransposons in transformed cells, which may explain the somatic retrotransposition events recently reported in several types of cancers."

Increased expression levels show a change based on the cell being in cancer or rapid cell division mode. This is usually accompanied by activation of embryonic pathways. The question is whether increased expression levels are indicative of function during cell division.; Sunday, July 16, 2017 5:54:00 PM
Dave Carlson said...: You seem to be confused about what this study is showing. With some exceptions, transcription of TEs is generally harmful, which is why organisms need mechanisms for repressing TE activity.

Tumors are obviously cases in which the normal cellular processes have gone awry. One aspect of this abnormal behavior is de-repression of TE activity.; Sunday, July 16, 2017 6:08:00 PM
Faizal Ali said...: Do you know what "cancer" is, Bill? It's a disease, often a really, really bad one.; Sunday, July 16, 2017 8:37:00 PM
Bill Cole said...: This paper shows increased activity during embryo development. Again embryo pathways are activated in cancer cells.

"Exploratory bioinformatics investigation reveals importance of “junk” DNA in early embryo development. Steven Xijin Ge. BMC Genomics 18:200 (2017). DOI: 10.1186/s12864-017-3566-0. Received 13 October 2016; accepted 7 February 2017; published 23 February 2017.
Abstract

Background
Instead of testing predefined hypotheses, the goal of exploratory data analysis (EDA) is to find what data can tell us. Following this strategy, we re-analyzed a large body of genomic data to study the complex gene regulation in mouse pre-implantation development (PD).

Results
Starting with a single-cell RNA-seq dataset consisting of 259 mouse embryonic cells derived from zygote to blastocyst stages, we reconstructed the temporal and spatial gene expression pattern during PD. The dynamics of gene expression can be partially explained by the enrichment of transposable elements in gene promoters and the similarity of expression profiles with those of corresponding transposons. Long Terminal Repeats (LTRs) are associated with transient, strong induction of many nearby genes at the 2-4 cell stages, probably by providing binding sites for Obox and other homeobox factors. B1 and B2 SINEs (Short Interspersed Nuclear Elements) are correlated with the upregulation of thousands of nearby genes during zygotic genome activation. Such enhancer-like effects are also found for human Alu and bovine tRNA SINEs. SINEs also seem to be predictive of gene expression in embryonic stem cells (ESCs), raising the possibility that they may also be involved in regulating pluripotency. We also identified many potential transcription factors underlying PD and discussed the evolutionary necessity of transposons in enhancing genetic diversity, especially for species with longer generation time.

Conclusions
Together with other recent studies, our results provide further evidence that many transposable elements may play a role in establishing the expression landscape in early embryos. It also demonstrates that exploratory bioinformatics investigation can pinpoint developmental pathways for further study, and serve as a strategy to generate novel insights from big genomic data.

Keywords

Single-cell RNA-seq Exploratory data analysis Pre-implantation development Early embryogenesis Transposons Repetitive DNA"; Sunday, July 16, 2017 9:04:00 PM
W. Benson said...: Good post. I've always thought the mutational load argument merited more emphasis as strong support for lots of junk DNA in 3+ Gb mammalian genomes.; Sunday, July 16, 2017 11:20:00 PM
Faizal Ali said...: Yes. And...?; Monday, July 17, 2017 6:33:00 AM
Aceofspades said...: Larry writes:

> Let's look at the first line in this table. The deleterious mutation rate is calculated using the lowest possible mutation rate and the smallest percentage of deleterious mutations (4%).

If μdel in this row is calculated based on the smallest percentage of deleterious mutations (4%) then aren't we already assuming for this row that the functional fraction of the genome is 4%? Why do we then go on to compare this value of μdel to other functional fractions of the genome?

That is the one thing I don't understand about this paper - it seems that the variable μdel already incorporates the functional fraction of the genome into it - yet in the table, it is plotted against the functional fraction of the genome.; Monday, July 17, 2017 6:47:00 AM
Larry Moran said...: We know that mutations occur and we know the approximate rate. What we don't know is what percentage of mutations in FUNCTIONAL DNA are deleterious and what percentage are neutral or beneficial.

It's safe to ignore the beneficial mutations since those are rare. The fraction of deleterious mutations in functional DNA can be estimated from looking at coding regions where we have a pretty good idea about which mutations can be harmful. Those estimates say that more than half of the mutations are likely neutral in effect. The upper estimate of the fraction of deleterious mutations is 40% and the lowest estimate is 4%.

If there are 100 new mutations per genome and 4% are deleterious then this means that each individual in each generation will acquire 4 deleterious mutations. That's an intolerable genetic load; it suggests that most of our genome is not a target (i.e. not functional).

The fraction of all mutations in functional DNA that are deleterious is not the same as the fraction of the genome that is junk.; Monday, July 17, 2017 11:45:00 AM
judmarc said...: [T]here are bulk DNA hypotheses that attribute non-sequence specific function to most of the genome and if they are correct the genetic load argument carries no weight. So far, there is no good evidence that these bulk DNA hypotheses are valid and most objections to junk DNA are based on sequence-specific functions.

Is there a way to tell the difference between "bulk" DNA that serves a function, albeit non-sequence-specific, and junk that is there by accident (and because effective population sizes aren't large enough) but serves at least to give mutations a somewhat safer place to go?

I can think of widely varying genome size in reasonably closely related species. Anything else?; Monday, July 17, 2017 1:14:00 PM
Eric said...: Dr. Moran,

Correct me if I am wrong, but if we took out all of the junk DNA and were left with a 600 million base diploid genome the number of mutations within the functional part of the genome would be the same since the number of mutations scales with genome size.

As to mitosis v. meiosis, you are probably right on that one. Most mutations come from the father, and those mutations occur in sperm progenitor cells during mitosis. I guess what I was getting at is that the mutations occur in germ line cells, not somatic cells.; Monday, July 17, 2017 1:25:00 PM
Larry Moran said...: Data on genome sizes in closely related species suggests that some species can harbor a great deal of nonfunctional (junk) DNA. This is the so-called C-Value Paradox argument. It demonstrates that there are genomes that are mostly junk.

It's very difficult to distinguish between junk and DNA sequences that are required to bulk up the genome for some reason. The problem right now is that there are no compelling reasons to assume that such bulk DNA is necessary.; Monday, July 17, 2017 1:41:00 PM
Larry Moran said...: If only 10% of our genome is functional then there are about 10 mutations per generation in the functional part of the genome. If we eliminate all the junk DNA there will still be 10 mutations per generation in the functional part of the genome.; Monday, July 17, 2017 1:44:00 PM
David said...: I blame ENCODE for offering up such provocative findings that are contingent upon the interpretability of such an ambiguous term. And, while I generally agree with Graur's thesis, I think in the end he (like ENCODE) is arguing more about semantics than biology. His points are good but their significance will be lost because the words being used are contextually defined and re-defined from study to study.

In common parlance, I don't think that even SJ Gould would deny that spandrels (literal or biologic) served *some* "function" at *some* level (i.e., you cannot have a functional arch without the spandrels). However, I believe Gould's larger caution is very much in play in the current debate, namely that trying to ascribe "function" to genomic regions in order to then *deduce* function is destined to result in an untestable tautology rather than a testable hypothesis.; Monday, July 17, 2017 2:32:00 PM
judmarc said...: I was thinking the C-Value Paradox would indicate that among related species in similar environments, some of these species appear not to have any need for "bulk" DNA, suggesting the reason other species have it isn't due to any "bulk" function but rather simple accident of inheritance.; Monday, July 17, 2017 3:47:00 PM
unknowing said...: I don't think that it can be dismissed as semantics. As Larry has mentioned several times, one's choice of interpretation vastly alters one's investigative approach. Assuming that all transcribed RNAs have sequence-specific biological functions often leads to a lot of wasteful research being performed. It's all too common to see months or years of time and money poured into the study of such sequences without ever first rigorously testing that said sequences have meaningful physiological or pathological functions.; Tuesday, July 18, 2017 9:53:00 AM
David said...: It is a matter of semantics. You can prove that simply by realizing that a) there is anything but consensus on what "function" means and b) authors are always compelled to offer their own definition of "function" that carries no application beyond the study at hand. Here Graur offers a definition that is indeed very well delineated--I buy it. However it is not a definition that most biologists will intuitively embrace. Too late now, but ENCODE should never have been so free with the term "functional" and Ohno should never have used the word "junk". Using common words that already carry a ton of often irrelevant baggage only muddies the waters of discourse.; Tuesday, July 18, 2017 12:16:00 PM
unknowing said...: On the contrary, while authors may quibble over the specifics of the exact criteria used to categorize sequences, there is a clear distinction between function as dependent on sequence specificity and the generic "functionality" of ENCODE.

Such a broad definition of functionality renders the term meaningless.

Also junk was chosen as a descriptor precisely because the lay meaning is an apt metaphor. Junk DNA has the potential to see use someday, but at present sits idle taking up genomic space.; Tuesday, July 18, 2017 9:15:00 PM
Aceofspades said...: Thanks, that makes sense; Wednesday, July 19, 2017 4:49:00 AM
Aceofspades said...: > However it is not a definition that most biologists will intuitively embrace

What is wrong with a definition of function that relates it to the genetic fitness of an organism or its offspring? This seems to me to be the only obvious definition.

I also don't understand why you think the term "junk" shouldn't be used if it is literally referring to sequences that can be removed without any effect on the organism?; Wednesday, July 19, 2017 4:57:00 AM
Bill Cole said...: Ace
"I also don't understand why you think the term "junk" shouldn't be used if it is literally referring to sequences that can be removed without any effect on the organism?"

So do you consider genes that were required for embryo development but are since inactive junk?; Wednesday, July 19, 2017 10:13:00 AM
John Harshman said...: Bill,

So do you consider raising absurd straw men a legitimate form of argument?; Wednesday, July 19, 2017 11:20:00 AM
Bill Cole said...: John
"So do you consider raising absurd straw men a legitimate form of argument?"
This is not an argument. I am asking to get definitional clarification. I previously answered a question about introns. Are they junk if their sequence is not important yet their length is? Larry already answered this question: he does not consider an intron where length matters to be junk.; Wednesday, July 19, 2017 11:31:00 AM
John Harshman said...: Bill,

Then the answer to your question is "no", and it was a question that showed you have no understanding of the subject, as many of your questions and statements do.

And I believe Larry said that he considers that intron mostly junk, unless the exact length is important, which is very unlikely.; Wednesday, July 19, 2017 1:31:00 PM
Bill Cole said...: John
"Then the answer to your question is "no", and it was a question that showed you have no understanding of the subject, as many of your questions and statements do."
Ace
"I also don't understand why you think the term "junk" shouldn't be used if it is literally referring to sequences that can be removed without any effect on the organism?""

So how would you validate junk DNA based on Ace's definition?; Wednesday, July 19, 2017 10:18:00 PM
John Harshman said...: Bill,

Ah, so it was an attempt at "gotcha" after all. You must understand that the embryo is a stage of the organism, and that knockouts are not done on adult organisms but in the single-celled stage. If a bit that was knocked out turned out to be necessary for development that would be noticed, generally by having the embryo die. That's the ignorance I was talking about. Now, ignorance is no sin. The sin is in being proud of ignorance and making no attempt to repair it.; Wednesday, July 19, 2017 10:28:00 PM
Faizal Ali said...: Really, John? So you mean knockout organisms are not created by painstakingly going thru every single cell of an adult, and removing the specific gene one by one? Imagine that. You learn something new every day. When one is as ignorant as Bill Cole, that is.; Thursday, July 20, 2017 10:12:00 AM
Bill Cole said...: John,

My question is not a "gotcha"; it's to stimulate conversation and real discussion about the issue. Now that you have informed me of my ignorance, no actually you have assumed ignorance again just like you continually assume UCD, let's talk about other stages where the cell changes expression levels and activates non coding genes. Apoptosis, DNA repair, hypoxia are just a few examples involving DNA which during resting cell measurements would appear non functional or have low expression levels. I honestly don't care how much DNA is actually functional but have found this to be a discussion about measuring ignorance. Larry is at 10%, the NIH is at 80%, so I guess we have a 70% ignorance factor or perhaps 90% if the NIH's figure is conservative.; Thursday, July 20, 2017 1:47:00 PM
John Harshman said...: I can see how you might take offense when someone says you're ignorant. But stop for a minute and ask yourself whether you really are. Your question assumed, somehow, that "removed without any effect on the organism" referred to the adult organism only, or to some particular stage of development. Yes, that is indeed ignorance on display. Rather than get all huffy, just accept it. The ignorance being measured here is yours, personally, not that of anyone else.

Now once again you make the assumption that any sequence whose function we don't know is assumed to be junk, when Larry's post to which that was a supposed comment is right there at the top of the page to point out that no, that isn't the reason. 90% of your genome is junk because of the genetic load argument, the non-conservation argument, the fugu argument, and a number of others that you are apparently incapable of noticing even when they're in front of your face (or at the top of a page).; Thursday, July 20, 2017 3:32:00 PM
Bill Cole said...: "Now once again you make the assumption that any sequence whose function we don't know is assumed to be junk, when Larry's post to which that was a supposed comment is right there at the top of the page to point out that no, that isn't the reason. 90% of your genome is junk because of the genetic load argument, the non-conservation argument, the fugu argument, and a number of others that you are apparently incapable of noticing even when they're in front of your face (or at the top of a page)."

What are the assumptions that support the genetic load argument? To what extent have they been validated?; Friday, July 21, 2017 11:46:00 AM
Joe Felsenstein said...: @Bill Cole: You could try reading the post, above, and the Dan Graur paper that it reports on. That's what the post and the paper are about.; Friday, July 21, 2017 3:01:00 PM
Bill Cole said...: Joe
"But that limit is quite unreasonable. It's more reasonable to assume about 100 new mutations per generation with about 10% deleterious. Using these assumptions, only 10% of the genome could be functional with a fertility value of 1.8 (green circle)."

Good point. The mutation rate is assumed to be about 100 per generation. Do you have a number for DNA variation caused by recombination which does not degrade function? The human variation chosen from a random sample is around 0.1%. The variation in dogs is around 0.25%. The other thing I am noticing is that species that are said to share a common ancestor typically have 5 to 100 times these numbers. The other issue is that dogs have split 600 times sooner from their common ancestor (the grey wolf) than humans, yet the within-species variation is greater. Does this make sense to you if the mutation rate is fixed at 100 mutations per generation? Given that dogs are bred, it seems like the genetic variation must be coming from recombination rather than mutation. DNA repair is running repeatedly in cells and a mutation measurement at time x may not capture the actual end mutation rate, which may be close to 0 given the dog and human data.; Friday, July 21, 2017 6:52:00 PM
John Harshman said...: Bill, you're just embarrassing yourself. Stop posting and start reading.; Friday, July 21, 2017 10:07:00 PM
Faizal Ali said...: FFS, Bill.; Saturday, July 22, 2017 8:00:00 AM
Dave Carlson said...: It's almost as if there are factors other than the mutation rate that can influence the amount of variation present in a population!

Snark aside, Bill, I sincerely recommend that if you continue to be interested in topics such as this that you make a good faith effort to read a textbook about evolutionary biology. There are several quality ones available. Doug Futuyma (disclosure: he is a Professor in the same department in which I'm doing my PhD, so I may be biased) just published an updated edition of his that I would recommend. You can find it on Amazon.; Saturday, July 22, 2017 1:06:00 PM
Bill Cole said...: Dave
"It's almost as if there are factors other than the mutation rate that can influence the amount of variation present in a population!"

I agree with you. Do you have an opinion on why there are jumps in variation between the same species and the closest ancestor? An example is humans have about .1% variation and chimps vs humans 1.3%. How would isolated populations account for that? The example with dogs and humans makes time as an explanation suspect.

Thanks for the recommendation. I will purchase Dr Futuyma's book. Please collect a commission from him.. :-); Sunday, July 23, 2017 1:05:00 PM
Dave Carlson said...: Hope you find the book useful, Bill.

As for your question, I'm not sure why you would find it surprising that within-species diversity is lower than between-species divergence, nor is it especially peculiar (although it is certainly interesting) that different species or populations harbor varying amounts of genetic diversity. Many of the reasons why this might be the case can be found in Futuyma's textbook!; Sunday, July 23, 2017 2:35:00 PM
Unknown said...: You're assuming mutations are substitutions. Many are nasty indels and viral insertions that can destroy the "good stuff" in the genome. If there is more "junk" DNA out there then this makes it more likely that the indels will affect this region and not the "functional" part. So "junk DNA" does act as a buffer. Another major role it has to play is as sequence spacers that reduce the collision rates of transcription factors and allow more accident-free interactions between DNA-binding proteins and DNA.; Thursday, July 27, 2017 6:31:00 AM
Unknown said...: Larry thinks that because most retrotransposons are silent/inactive they are non-functional. He doesn't realise that they contain important regulatory sequences that can serve to alter transcriptional control. We are finding more and more SINEs and LINEs as having a role in genomic regulation: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4914423/; Thursday, July 27, 2017 6:41:00 AM
Unknown said...: The fact that a mutation is deleterious or not has no bearing on whether it is affecting "functional" DNA or not. A mutation can be only very very very slightly deleterious and so it may be tolerable. Natural selection will serve to reduce the genetic load by weeding out the really harmful mutations.; Thursday, July 27, 2017 6:46:00 AM
judmarc said...: Natural selection will serve to reduce the genetic load

Now that is some funny stuff.; Thursday, July 27, 2017 9:01:00 AM
judmarc said...: The fact that a mutation is deleterious or not has no bearing on whether it is affecting "functional" DNA or not.

Here, let me rephrase that for you and see if it continues to strike you as a sensible statement: Whether there is a hole in a tail light or in the oil drain pan has no bearing on how deleterious its effect is on your car.; Thursday, July 27, 2017 9:06:00 AM
judmarc said...: Actually, my analogy in the last comment wasn't a good one, since tail lights still serve a function. Better then to say this: Whether there is a hole in one of the fuzzy dice you have tied to your rear view mirror or in the oil drain pan has no bearing on how deleterious its effect is on your car.; Thursday, July 27, 2017 9:09:00 AM
judmarc said...: So "junk DNA" does act as a buffer.

You do realize this is precisely opposite to the argument you make in another comment that whether a mutation is in functional DNA or not has no bearing on whether it's deleterious, right?; Thursday, July 27, 2017 9:16:00 AM
judmarc said...: Since Larry has made it abundantly clear he doesn't consider regulatory sequences to be junk, and that he's quite aware of the amount of DNA that might be found to serve some function, including regulatory, you either:

- Haven't read much on this site and are making an unwarranted assumption;

- Have read some of what Larry has written and haven't understood it; or

- Have read and understood what Larry says and are deliberately mischaracterizing his position.

So which is it?; Thursday, July 27, 2017 9:31:00 AM
Unknown said...: Now that is some funny stuff.

That's what natural selection does. It weeds out the most serious deleterious mutations. The very slightly deleterious, however, often remain in the gene pool.; Thursday, July 27, 2017 11:51:00 AM
Unknown said...: What Larry doesn't seem to understand is that much of the genome consists of functionally redundant copies of genes. As such, a deleterious mutation can be compensated for by another intact copy. So the genetic load can be tolerated by the buffering effect of gene duplicates.; Thursday, July 27, 2017 11:53:00 AM
Unknown said...: Larry posits an argument from ignorance and personal incredulity: "I can't believe that so much of the genome is functional..therefore it's junk." He also thinks that an inactive retrotransposon cannot be functional because it can't move about even though he knows they donate a regulatory sequence (as with Alus).; Thursday, July 27, 2017 12:01:00 PM
Unknown said...: One more thing, Larry. Many harmful mutations have RECESSIVE phenotypes and their effects are masked by the other allele.; Thursday, July 27, 2017 12:05:00 PM
Peter said...: Solovei et al. provide pretty compelling evidence that heterochromatin (i.e. mostly repetitive "junk" elements) has a causal-role function in visual acuity in nocturnal mammals. Mechanistically, what's going on is that the rod cells compartmentalise their heterochromatin towards the centre of the nucleus, where the increased density (heterochromatin is more compact) acts as a mini-lens focusing the light.
http://www.cell.com/cell/fulltext/S0092-8674(09)00137-8

Of course, a causal-role function doesn't necessarily imply a selected-effect function. However, a new paper shows that in owl monkeys - a species that is transitioning to a more nocturnal lifestyle - there has been an accumulation of specific heterochromatic repeats, and that these are the genomic elements involved in the light-focusing role. So that looks very much like something that has been selected for in the evolution of this species.
https://academic.oup.com//gbe/article/4048064/Co-opted-megasatellite-DNA-drives-evolution-of

Of course, this only applies to nocturnal animals and isn't a general answer, however I think it's a very clear demonstration that there can indeed be selection to increase DNA bulk in a sequence-indifferent manner.; Thursday, July 27, 2017 5:09:00 PM
Larry Moran said...: Thank-you for all your helpful comments. I'm sure the experts in genomes and evolution have never thought of them before now. I'll make sure to pass them along to all the ignorant population geneticists who have been studying the genetic load of populations for almost a century.; Thursday, July 27, 2017 9:24:00 PM
Larry Moran said...: @Peter

It's abundantly clear that some DNA is functional even though the specific sequence is irrelevant. That point is not controversial ... at least not among knowledgeable experts. Spacer DNA is a good example.

Your example doesn't make any sense and I doubt very much whether that explanation will hold up in the future.; Thursday, July 27, 2017 10:16:00 PM
Aceofspades said...: An experiment was conducted earlier this year to see if researchers could come up with some functional de novo genes from randomised DNA sequences. The result was that they were able to create hundreds of functional genes from random strings of DNA of length 150 bp (this should put to bed any argument from ID advocates which states that new, useful genes cannot arise from junk DNA).

In an article about the study, one of the researchers recounts:

"During my early months in the Tautz lab, while still a Master’s student, I contemplated the possibility of doing an experiment that could support de novo evolution as a general process, and so I came up with a thought experiment. I would insert random sequences in living cells, together with enough regulatory machinery to make sure they would be transcribed and translated by the host. Then, I would wait until any of those would mutate enough to “acquire a function.” It occurred to me that starting with a sufficiently large pool of random sequences would reduce the waiting time, because some would exhibit some biochemical activity upon their introduction."

https://natureecoevocommunity.nature.com/posts/16396-exploring-random-sequence-space-in-the-name-of-de-novo-genes

The results were surprising - 25% of the random sequences they generated were beneficial to the bacteria that received them and 52% inhibited growth.

Here is the paper: https://www.nature.com/articles/s41559-017-0127

My question to Prof. Moran then is: If 25% of the transcripts from these random bits of DNA were able to promote growth in E. coli then wouldn't that imply that the same might be possible for us?

Maybe 25% of our own transcripts which are not evolutionarily conserved are also conferring some small transient advantage? It might be that this advantage is not strong enough to be selected for and that is why it is transient - a few tens of thousands of generations from now perhaps most will be mutated out of existence, but by then other new transcripts that are somewhat beneficial will have popped up in the meantime.; Monday, July 31, 2017 9:46:00 AM
John Harshman said...: Always possible. But, following your logic, 52% are slightly deleterious, so the net effect is not "somewhat beneficial" at all. Note, however, that the said random sequences were embedded with "enough regulatory machinery to make sure they would be transcribed and translated by the host". What you're talking about here are random protein-coding genes, and it's the expressed proteins that are beneficial or otherwise. Junk DNA is not generally transcribed (except as noise) and certainly not translated, so the case doesn't arise.; Monday, July 31, 2017 9:54:00 AM
judmarc said...: That's what natural selection does.

Duh. Now tell us what "genetic load" is, and you're halfway to understanding why your post that I commented on is nonsense.; Monday, July 31, 2017 10:11:00 AM
Aceofspades said...: Yes, I agree that the net effect would NOT be "somewhat beneficial". But going back to thinking about the functionality of individual transcripts, this does suggest that for any given sequence that is transcribed, there might be a fairly high chance (25%) that it does something beneficial regardless of whether it goes on to be conserved or not.

Note that this study did attempt to look at whether the functions were a result of the expressed transcripts or the translated peptides.

In the abstract they write:

"Testing of individual clones in competition assays confirms their activity and provides an indication that their activity could be exerted by either the transcribed RNA or the translated peptide."

To test whether the functionality was coming about through the expressed transcripts or the translated peptides, they added a stop codon to the start of some of these random sequences to see what difference it would make:

"Given that the bioactivity could be conveyed by either the transcribed RNA or the translated peptide, we produced versions of these clones harbouring a stop codon directly at the start of the random part of the sequence, that is, only the first four amino acids that are common among all clones would be translated. These mutated clones were also tested in pairwise competition assays with the empty vector. Only one of the clones (clone 600) showed a clear difference between the mutated and the non-mutated version (Fig. 5c), which would suggest that only this clone exerts its effect via the encoded peptide, whereas the two other clones might act through their RNA alone. To study this in more detail, we did an experiment with a direct competition of each clone with its stop codon counterpart, but with the same qualitative results"

If about 76% of our DNA is transcribed (according to ENCODE) and most of those transcripts are junk but 25% of them confer some advantage to the organism either through peptides or simple RNA transcripts then that comes to about 19% of our genome altogether.

In real animals though I imagine that the proportion of useful transcripts is much lower than 25% because these randomly inserted sequences were all given promoters that I assume were causing them to be transcribed in high concentrations.

As Moran has pointed out many times before, most of our transcripts are transcribed at frequencies of less than one transcript per cell. This probably significantly affects whether they can be functional or not.; Monday, July 31, 2017 11:03:00 AM
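The 19% figure in the comment above is a straightforward product of the two percentages it cites. A minimal sketch of that arithmetic follows; the 76% and 25% values are taken from the comment, the function name is hypothetical, and the premise that the bacterial result carries over to human transcripts is the commenter's own hypothetical rather than an established result.

```python
def genome_share(transcribed_fraction, beneficial_fraction):
    """Share of the genome that would be both transcribed and beneficial."""
    return transcribed_fraction * beneficial_fraction

# 76% transcribed (the ENCODE figure cited in the comment) and
# 25% of transcripts beneficial (the commenter's reading of the E. coli study).
share = genome_share(0.76, 0.25)
print(f"{share:.0%} of the genome")
```

As the comment itself goes on to note, the 25% input is probably far too high for real transcripts expressed at low levels, so this product is best read as an upper bound on the commenter's scenario.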
Larry Moran said...: It's very difficult to understand exactly what the authors did. As far as I can tell, they only analyzed a subset of clones - the ones that were not highly deleterious after induction. Among this subset of 1,082 clones about 70% showed an effect on growth of the population after induction.

Keep in mind that following induction there will be hundreds of copies of the transcripts in the bacterial cells. That's a lot of copies.

Among the clones that showed an effect there were far more that were deleterious than were beneficial. The breakdown was between 20-25% beneficial. If you include the 30% that showed no effect then there were 14-17% beneficial.

However, the authors remind us that all the severely deleterious clones "are already mostly lost" as a consequence of the experimental design. Thus, of the total number of random transcripts only a small percentage (~1-5% ?) were beneficial under conditions where a huge amount of transcription was present.; Monday, July 31, 2017 3:17:00 PM
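The percentages in the comment above can be checked with a short calculation. The sketch below only uses numbers stated in the comment (1,082 clones, about 70% showing an effect, 20-25% of those beneficial); the variable and function names are hypothetical, and it says nothing about the severely deleterious clones that were lost before this subset was analysed.

```python
clones_tested = 1082                          # subset of clones analysed, per the comment
with_effect = 0.70                            # share of that subset showing a growth effect
beneficial_low, beneficial_high = 0.20, 0.25  # beneficial share among clones with an effect

def share_of_subset(effect_fraction, beneficial_fraction):
    """Beneficial clones as a share of all tested clones, including the no-effect ones."""
    return effect_fraction * beneficial_fraction

low = share_of_subset(with_effect, beneficial_low)
high = share_of_subset(with_effect, beneficial_high)
print(f"beneficial share of tested clones: {low:.1%} to {high:.1%}")
print(f"approximate clone counts: {round(clones_tested * low)} to {round(clones_tested * high)}")
```

This gives roughly 14% to 17.5% of the tested subset, in line with the 14-17% quoted in the comment.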
David said...: >> However it is not a definition that most biologists will intuitively embrace

>What is wrong with a definition of function that relates it
>to the genetic fitness of an organism or its offspring?
>This seems to me to be the only obvious definition.

I disagree that there's anything "obvious" about function equating to fitness. I would say that among most scientists, the more common conception of "function" does not entail essentiality, especially to the whole. A protein product of a gene may very well play a local, mechanistic/metabolic role without it having any bearing upon the reproductive fitness of the organism. In this case, it would be common for many to say that this gene still had a "function" without then going on to make any presumptions about that gene's impact on fitness. Indeed, the concept of "function" seems to naturally lend itself to being sub-divided into "essential" and "non-essential" sub-types. That's not to say that ENCODE didn't take this notion too far, only that "function" is not a self-evident concept (as I have already said).; Monday, August 28, 2017 1:19:00 PM
John Harshman said...: David,

You seem to be claiming that a DNA sequence must be essential in order to affect fitness. What sort of "local, mechanistic/metabolic role" would not affect fitness?

While we're here, what's wrong with "junk"?; Monday, August 28, 2017 2:58:00 PM
Eli said...: You and Graur might be interested in this recent preprint (https://www.biorxiv.org/content/biorxiv/early/2019/09/30/785865.1.full.pdf); Saturday, November 16, 2019 5:33:00 PM
Larry Moran said...: I look forward to reading the paper after it has been reviewed and published. Please let me know when that happens.; Monday, November 18, 2019 3:03:00 PM
John Harshman said...: This seems relevant:

"We stress that we, in this work, take no position on the actual proportion of the human genome that is likely to be functional. It may indeed be quite low, as the contemporary evidence from species divergence and intraspecies polymorphism data suggests. Many of the criticisms of the ENCODE claim of 80% functionality (e.g. Doolittle 2013, Graur 2013) strike us as well-founded. We wish only to point out that an argument from mutational load does not appear to be particularly limiting on f."; Monday, November 18, 2019 7:20:00 PM
João said...: Larry, do you know if Graur responded or commented on the following paper by Galeota-Sprung et al., 2019?

Mutational Load and the Functional Fraction of the Human Genome

"We find that the functional fraction is not very likely to be limited substantially by mutational load, and that any such limit, if it exists, depends strongly on the selection coefficients of new deleterious mutations."

https://academic.oup.com/gbe/article/12/4/273/5762616; Monday, April 13, 2020 12:24:00 AM
Larry Moran said...: I don't know if Graur has commented on that paper. I just read the Galeota-Sprung et al. paper and I find it very annoying that they avoid making any substantive comment on the amount of functional DNA in a genome. Surely as experts they have some idea about reasonable selection coefficients? Also, it's not clear to me why the fitness of the most fit individual is so important. Perhaps some other expert can take a look at the paper?; Monday, April 13, 2020 12:28:00 PM
João said...: I asked Dan on Twitter if he has some comments on Galeota-Sprung et al. This is what he said:

"I was one of the two reviewers of this paper and I recommended publication without revision. I don't tend to fall in love with my own papers."

You say "I find it very annoying that they avoid making any substantive comment on the amount of functional DNA in a genome". And I totally agree. They say mutational load is not that limiting; however, they decline to discuss the probable amount of junk any further. Well, at least they seem not to deny that possibly a large fraction of our genome is junk.; Monday, April 13, 2020 12:39:00 PM
João said...: Larry, here is what Ben Galeota-Sprung commented on their paper:

https://twitter.com/SprungBen/status/1249790349725368321; Monday, April 13, 2020 4:35:00 PM
Larry Moran said...: Thanks. I'm still not clear about what the problem is with mutational load arguments.; Tuesday, April 14, 2020 11:49:00 AM
João said...: Larry, did you see that somebody on EN wrote something about the Galeota-Sprung et al. paper? The claim is the following: "Paper Shows that “Mutational Load” Arguments Don’t Refute ENCODE".

https://evolutionnews.org/2020/05/paper-shows-that-mutational-load-arguments-dont-refute-encode/; Friday, May 08, 2020 1:18:00 AM
João said...: Also, I just became aware that Ben Galeota-Sprung briefly explained their argument here: https://twitter.com/SprungBen/status/1249800322236723200; Friday, May 08, 2020 1:25:00 AM
michaeljf said...: Joao (and Larry) - I am working my way through the Galeota-Sprung et al. paper and comparing it to Graur (2017). I agree they don't make a substantive comment on the % functional ("f") in their abstract, intro or conclusion. However, they do say some interesting things in the middle. They reconcile their model with Graur's (note: they believe Graur didn't correctly include a factor of 1/2 to account for diploid vs haploid counts - but they show the corrected interpretation of Graur to have a better apples-to-apples comparison). Then they do make this statement: "Our interpretation of the quantity w_max/w_bar is more liberal than Graur’s: We do not interpret it as mean requisite fertility, because we are not using a pure viability selection model, but as the fitness of the fittest individual. Thus, our interpretation of the approach of Graur yields somewhat higher possible values for f than occur in Graur (2017), as shown in the bottom row of table 1, but still almost certainly no higher than f = 0.10 given the parameter values in (3)". And the "parameter values in (3)" are their estimates earlier in the paper of the main inputs, namely, size of genome, probability of a deleterious mutation at a site, impact of a mutation on fitness ("s") and steady size of human population (conservative at 1 billion). So I find this interesting, because they have no real issue with the input values that reconcile to Graur (save the 1/2 correction) and they come up with an upper limit of f = 0.10. With the correction applied to Graur's paper this would be Graur's own table figure of 0.20 (and note Graur had 0.15 as the upper limit). So they seem to be coming in around the same value, i.e. 0.80 or more = non-functional. Apologies, as I'm a layperson with respect to genetics/biology, but I was following the rabbit trail of people trying to rebut Casey Luskin (an effort which I support, but don't have the full training for!); Tuesday, February 01, 2022 5:54:00 PM
João said...: Michaeljf said:
"Joao (and Larry) - I am working my way through this Galeota-Sprung et al and comparing to Graur (2017)."

Thank you for your comment, Michael! I revived this comment thread as I read "What's in your genome". I think I brought Larry's attention to the Galeota-Sprung paper, and I'm happy this is somewhat in the book. Larry wrote:

"Using these values, Dan Graur estimated that at least 75 percent of our genome has to be junk, and it’s likely that the actual amount of junk DNA is closer to 90 percent. However, a more recent analysis shows that calculating the fraction of junk DNA is a lot more difficult than Graur thought and certainly a lot more complicated than the simplistic calculations that I presented earlier[14]."

Footnote 14 refers to Graur (2017) and Galeota-Sprung et al. (2020).

:); Monday, June 05, 2023 5:55:00 PM

