On universal coding events in protein biogenesis
The complete ribosomal protein synthesis cycle and the codon-amino acid associations are universally preserved across all taxa of life on Earth. This process is accompanied by a set of hierarchically organized recognition and control events at different levels of complexity. It starts with amino acid activation by aminoacyl-tRNA synthetases (aaRS), followed by matching with the acceptor units of their cognate tRNAs (“operational RNA code”) and ribosomal codon-anticodon pairing with messenger RNA (“triplet code”). However, this codon-anticodon matching is possible only when the protein translation machinery (translation factors, ribosome) accepts the esterified amino acid. This capacity (“charge code”) correlates mainly with the nature of the amino acid and the identity elements in the tRNA 3D structure. A fourth potential “folding code” (also referred to as the “stereochemical code”), which relates translation dynamics, sequence composition and folding of the resulting protein, can also be defined in the frame of the ‘Anfinsen dogma’, followed by post-translational modifications. All these coding events, as well as the basic chemistry of life, are deemed invariant across biological taxa due to horizontal gene transfer (HGT), which makes the ‘universal genetic code’ the ‘lingua franca’ of life on Earth.
Life is a process that requires the transmission, modification and interpretation of information. It is easy to assume that this information is simply encoded in genes, because there is a simple and universal set of rules by which it is translated from the abstract form of four letters into the more complex 20(+2) letter set of the executive machinery, the proteins. However, like any process, translation occurs in time and requires a chemical implementation in the form of a complex molecular machinery, and therefore has several levels of control (or sub-coding). Moreover, this machinery has developed from a simpler to a more sophisticated form in the course of evolution, and thus contains rudiments of its evolutionary development. As a result, the picture of how a gene is accurately translated into an amino acid sequence, or how an amino acid ends up in a finished protein, is complex. Engineering of this process requires understanding of the:
- 1) Individual interacting components (discussed here as coding levels);
- 2) Driving forces behind them (e.g., HGT provides a source of innovations);
- 3) Interactions between them (e.g., correlation of the translation and protein folding velocities);
- 4) Biological context (e.g., how the change in code can be fixed by different mechanisms like gain or loss of novel functions under specific selective pressures).
The interacting components of the translation process are amino acids, aaRSs, tRNAs, translation factors, mRNAs and ribosomes, the latter being composed of dozens of proteins and ribosomal RNAs. In addition, folding features are encoded in mRNAs and in primary amino acid sequences. Understanding the interacting components of protein biogenesis, together with engineering them experimentally, will help us to articulate the chemical boundaries of life, i.e., how the chemical environment influences the size and content of genetically encoded information.
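As an illustration of the “triplet code” level alone, the mapping from four-letter codons to amino acids can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it uses the standard genetic code table with the 20 canonical amino acids only (the “+2”, selenocysteine and pyrrolysine, are not handled), it reads from the first nucleotide rather than searching for a start codon, and the names `CODON_TABLE` and `translate` are hypothetical helpers introduced here. It deliberately ignores the other levels discussed above (operational RNA code, charge code, folding code), which is exactly why a lookup table is not a model of translation.

```python
from itertools import product

# Assumption: standard genetic code, bases ordered U, C, A, G.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons starting with U
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)

# Build the 64-entry codon -> amino acid lookup (UUU, UUC, UUA, UUG, UCU, ...).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string codon by codon until the first stop codon.

    Hypothetical helper: reads in frame from position 0 and accepts DNA-style
    input by converting T to U. Trailing nucleotides that do not form a full
    codon are ignored.
    """
    mrna = mrna.upper().replace("T", "U")
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon terminates translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # AUG (Met), GCC (Ala), UGG (Trp), UAA (stop)
    print(translate("AUGGCCUGGUAA"))
```

The table has 64 entries that collapse onto 20 amino acids plus stop signals, which makes the degeneracy of the triplet code directly visible; everything that determines whether a given codon is actually decoded this way in a cell lies outside the table.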
