Amino acid repeats cause extraordinary coding sequence variation in the social amoeba Dictyostelium discoideum

PLoS One. 2012;7(9):e46150. doi: 10.1371/journal.pone.0046150. Epub 2012 Sep 28.

Abstract

Protein sequences are normally the most conserved elements of genomes owing to purifying selection to maintain their functions. We document an extraordinary amount of within-species protein sequence variation in the model eukaryote Dictyostelium discoideum stemming from triplet DNA repeats coding for long strings of single amino acids. D. discoideum has a very large number of such strings, many of which are polyglutamine repeats, the same sequence that causes various neurological disorders in humans, such as Huntington's disease. We show here that D. discoideum coding repeat loci are highly variable among individuals, making D. discoideum a candidate for the most variable proteome. The coding repeat loci are not significantly less variable than similar non-coding triplet repeats. This pattern is consistent with these amino-acid repeats being largely non-functional sequences evolving primarily by mutation and drift.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Motifs
  • Animals
  • Dictyostelium / genetics*
  • Genetic Drift
  • Genetic Loci*
  • Genetic Variation
  • Genome, Protozoan*
  • Humans
  • Molecular Sequence Data
  • Mutation
  • Open Reading Frames
  • Peptides / genetics*
  • Phylogeny
  • Trinucleotide Repeats*

Substances

  • Peptides
  • polyglutamine

Grants and funding

This material is based upon work supported by the National Science Foundation under Grant No. DEB-0918931 (http://www.nsf.gov/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.