Properties characteristic of the genetic code. What is the genetic code: general information

  • Date: 11.10.2019

Gene classification

1) By the nature of the interaction in the allelic pair:

Dominant (a gene capable of suppressing the manifestation of the allelic recessive gene) and recessive (a gene whose manifestation is suppressed by the allelic dominant gene).

2) Functional classification:

2) Genetic code - a set of certain combinations of nucleotides and the order of their arrangement in the DNA molecule. It is a way of encoding the amino acid sequence of proteins by a sequence of nucleotides, and it is characteristic of all living organisms.

Four nucleotides are used in DNA - adenine (A), guanine (G), cytosine (C) and thymine (T). These letters make up the alphabet of the genetic code. In RNA the same nucleotides are used, with the exception of thymine, which is replaced by a similar nucleotide, uracil, denoted by the letter U. In DNA and RNA molecules the nucleotides line up in chains, and thus sequences of genetic letters are obtained.

Genetic code

There are 20 different amino acids used in nature to build proteins. Each protein is a chain or several chains of amino acids in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties. The set of amino acids is also universal for almost all living organisms.

The implementation of genetic information in living cells (that is, the synthesis of a protein encoded by a gene) is carried out using two matrix processes: transcription (that is, mRNA synthesis on a DNA matrix) and translation of the genetic code into an amino acid sequence (synthesis of a polypeptide chain on an mRNA matrix). Three consecutive nucleotides are enough to encode 20 amino acids, as well as the stop signal, which means the end of the protein sequence. A set of three nucleotides is called a triplet. Accepted abbreviations corresponding to amino acids and codons are shown in the figure.
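The two steps described above can be sketched in a few lines of Python. This is only an illustration: the DNA fragment is a hypothetical example, the function names are invented here, and the table is the standard genetic code written in one-letter amino acid symbols, with "*" marking a stop codon.

```python
from itertools import product

# Standard genetic code; codons are enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def transcribe(coding_strand: str) -> str:
    """Return the mRNA that matches a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Read the mRNA triplet by triplet until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":      # stop signal: end of the protein sequence
            break
        protein.append(amino_acid)
    return "".join(protein)

dna = "ATGGTGCTTAATGTGTAA"        # hypothetical coding-strand fragment
mrna = transcribe(dna)
print(mrna)                        # AUGGUGCUUAAUGUGUAA
print(translate(mrna))             # MVLNV
```

Real transcription copies the template strand rather than the coding strand; replacing T with U in the coding strand gives the same mRNA and keeps the sketch short.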

Properties of the genetic code

1. Triplet nature - the meaningful unit of the code is a combination of three nucleotides (a triplet, or codon).

2. Continuity - there are no punctuation marks between the triplets, that is, the information is read continuously.

3. Discreteness (non-overlapping) - the same nucleotide cannot be part of two or more triplets at the same time.

4. Specificity - a given codon corresponds to only one amino acid.

5. Degeneracy (redundancy) - several codons can correspond to the same amino acid.

6. Universality - the genetic code works the same way in organisms of different levels of complexity, from viruses to humans (the methods of genetic engineering are based on this).
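Specificity and degeneracy can be checked directly against the standard codon table. The sketch below assumes the standard genetic code in one-letter amino acid symbols ("*" is a stop codon); the variable names are chosen for illustration.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code; codons are enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Specificity: a dict maps each codon to exactly one amino acid.
# Degeneracy: grouping by amino acid shows that several codons share one meaning.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

stop_codons = sorted(codons_for.pop("*"))
single_codon = sorted(aa for aa, codons in codons_for.items() if len(codons) == 1)

print(len(CODON_TABLE))        # 64
print(stop_codons)             # ['UAA', 'UAG', 'UGA']
print(len(codons_for))         # 20
print(single_codon)            # ['M', 'W']
print(len(codons_for["L"]))    # 6
```

Only methionine (M) and tryptophan (W) have a single codon; leucine (L), for example, has six.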

3) Transcription - the process of RNA synthesis using DNA as a template, which occurs in all living cells. In other words, it is the transfer of genetic information from DNA to RNA.

Transcription is catalyzed by the enzyme DNA-dependent RNA polymerase. RNA is synthesized in the direction from the 5' end to the 3' end, that is, RNA polymerase moves along the template DNA strand in the 3' → 5' direction.
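The relation between the two directions can be sketched as follows. The template sequence is hypothetical and the function name is invented for this example; the template is written in the conventional 5' → 3' order, so it is reversed first to imitate the polymerase reading it 3' → 5'.

```python
# Pairing rule for transcription: DNA template base -> RNA base.
PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe_template(template_5_to_3: str) -> str:
    """Return the RNA (written 5'->3') made from a template strand written 5'->3'."""
    read_order = reversed(template_5_to_3.upper())   # the polymerase reads 3'->5'
    return "".join(PAIR[base] for base in read_order)  # the RNA grows 5'->3'

template = "AAGCACCAT"                 # hypothetical template strand, 5'->3'
print(transcribe_template(template))   # AUGGUGCUU
```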

Transcription consists of the stages of initiation, elongation and termination.

Transcription initiation is a complex process that depends on the DNA sequence near the transcribed sequence (and in eukaryotes also on more distant parts of the genome - enhancers and silencers) and on the presence or absence of various protein factors.

Elongation - the DNA continues to unwind and RNA is synthesized along the template strand. Like DNA synthesis, RNA synthesis proceeds in the 5' → 3' direction.

Termination - as soon as the polymerase reaches the terminator, it detaches from the DNA, the local DNA-RNA hybrid is destroyed, and the newly synthesized RNA is transported from the nucleus to the cytoplasm; this completes transcription.

Processing - a set of reactions that convert the primary products of transcription and translation into functioning molecules. Processing acts on functionally inactive precursor molecules of various ribonucleic acids (tRNA, rRNA, mRNA) and of many proteins.

For catabolic enzymes (enzymes that break down substrates), prokaryotes use inducible enzyme synthesis. This lets the cell adapt to environmental conditions and save energy by stopping the synthesis of an enzyme when it is no longer needed.
To induce the synthesis of catabolic enzymes, the following conditions are required:

1. The enzyme is synthesized only when the cell needs to break down the corresponding substrate.
2. The substrate concentration in the medium must exceed a certain level before the corresponding enzyme can be formed.
The mechanism of regulation of gene expression in E. coli can be illustrated by the lac operon, which controls the synthesis of three catabolic enzymes that break down lactose. If there is a lot of glucose and little lactose in the cell, the promoter remains inactive and the repressor protein sits on the operator, so transcription of the lac operon is blocked. When the amount of glucose in the environment, and therefore in the cell, decreases and the amount of lactose increases, the following events occur: the amount of cyclic adenosine monophosphate (cAMP) increases and it binds to the CAP protein; this complex activates the promoter, to which RNA polymerase binds. At the same time, the excess lactose binds to the repressor protein and frees the operator from it. The path for RNA polymerase is open, and transcription of the structural genes of the lac operon begins. Lactose acts as an inducer of the synthesis of the enzymes that break it down.
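The logic of this switch can be written as a small Boolean model. It is a deliberately simplified sketch: real regulation is graded rather than all-or-nothing, and the function and parameter names are invented for illustration.

```python
def lac_operon_transcribed(glucose_high: bool, lactose_present: bool) -> bool:
    """Simplified on/off model of lac operon control described above."""
    # Low glucose -> cAMP rises -> the cAMP-CAP complex activates the promoter.
    promoter_active = not glucose_high
    # Lactose binds the repressor and frees the operator.
    operator_free = lactose_present
    # RNA polymerase transcribes only if both conditions hold.
    return promoter_active and operator_free

for glucose_high in (True, False):
    for lactose_present in (True, False):
        print(glucose_high, lactose_present,
              lac_operon_transcribed(glucose_high, lactose_present))
# True True False
# True False False
# False True True
# False False False
```

In this model the structural genes are transcribed in only one of the four situations: little glucose and lactose available.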

5) Regulation of gene expression in eukaryotes is much more complex. Different cell types of a multicellular eukaryotic organism synthesize a number of identical proteins, yet each type also differs from the others in a set of proteins specific to it. The level of production depends on the cell type and on the stage of development of the organism. Gene expression is regulated at the level of the cell and at the level of the organism. The genes of eukaryotic cells are divided into two main types: the first determines universal cellular functions, the second determines specialized cellular functions. The functions of genes of the first group appear in all cells. To carry out differentiated functions, specialized cells must express a specific set of genes.
Chromosomes, genes and operons of eukaryotic cells have a number of structural and functional features, which explains the complexity of gene expression.
1. Operons of eukaryotic cells have several regulator genes, which can be located on different chromosomes.
2. Structural genes that control the synthesis of the enzymes of one biochemical process can be concentrated in several operons located not only in one DNA molecule but in several.
3. The sequence of the DNA molecule is complex. There are informative and non-informative sections, and unique and repeated informative nucleotide sequences.
4. Eukaryotic genes consist of exons and introns, and mRNA maturation is accompanied by the excision of introns from the primary RNA transcripts (pre-mRNA), i.e. splicing.
5. Gene transcription depends on the state of chromatin. Local compaction of DNA completely blocks RNA synthesis.
6. Transcription in eukaryotic cells is not always coupled with translation. The synthesized mRNA can be stored for a long time in the form of informosomes. Transcription and translation occur in different compartments.
7. Some eukaryotic genes have a non-permanent location (labile genes, or transposons).
8. Methods of molecular biology have revealed the inhibitory effect of histone proteins on mRNA synthesis.
9. During the development and differentiation of organs, gene activity depends on hormones that circulate in the body and cause specific reactions in certain cells. In mammals the action of sex hormones is important.
10. In eukaryotes, 5-10% of genes are expressed at each stage of ontogenesis; the rest must be blocked.

6) Repair of genetic material

Genetic repair - the process of eliminating genetic damage and restoring the hereditary apparatus, which occurs in the cells of living organisms under the action of special enzymes. The ability of cells to repair genetic damage was first discovered in 1949 by the American geneticist A. Kelner. Repair is a special function of cells: the ability to correct chemical damage and breaks in DNA molecules that arise during normal DNA biosynthesis in the cell or as a result of exposure to physical or chemical agents. It is carried out by special enzyme systems of the cell. A number of hereditary diseases (e.g. xeroderma pigmentosum) are associated with impaired repair systems.

Types of repair:

Direct repair is the simplest way to eliminate damage in DNA. It usually involves specific enzymes that can quickly (usually in one step) repair the corresponding damage and restore the original structure of the nucleotides. This is how O6-methylguanine-DNA methyltransferase acts, for example: it transfers a methyl group from the nitrogenous base to one of its own cysteine residues.

In the body's metabolism the main role belongs to proteins and nucleic acids.
Proteins form the basis of all vital cell structures, have an unusually high reactivity, and are endowed with catalytic functions.
Nucleic acids are part of the most important organelle of the cell, the nucleus, and are also found in the cytoplasm, ribosomes, mitochondria, etc. Nucleic acids play a primary role in heredity, in the variability of the organism, and in protein synthesis.

The plan for protein synthesis is stored in the cell nucleus, while the synthesis itself takes place outside the nucleus, so a delivery service is needed to carry the encoded plan from the nucleus to the site of synthesis. This delivery service is performed by RNA molecules.

The process starts in the cell nucleus: part of the DNA "ladder" unwinds and opens. The RNA letters then form bonds with the exposed DNA letters of one of the DNA strands. An enzyme moves along the RNA letters and joins them into a strand. In this way the letters of DNA are "rewritten" into the letters of RNA. The newly formed RNA chain separates, and the DNA "ladder" twists again. The process of reading information from DNA and synthesizing RNA on this template is called transcription, and the synthesized RNA is called informational (messenger) RNA, i-RNA or mRNA.

After further modifications this encoded mRNA is ready. The mRNA leaves the nucleus and goes to the site of protein synthesis, where its letters are deciphered. Each set of three mRNA letters forms a "word" that stands for one specific amino acid.

Another type of RNA finds this amino acid, captures it with the help of an enzyme, and delivers it to the site of protein synthesis. This RNA is called transfer RNA, or tRNA. As the mRNA message is read and translated, the chain of amino acids grows. This chain twists and folds into a unique shape, creating one kind of protein. Even the process of protein folding is remarkable: a computer calculating all the options would need 10^27 (!) years to fold a medium-sized protein consisting of 100 amino acids. Yet in the body a chain of 20 amino acids forms in no more than one second, and this process occurs continuously in all cells of the body.

Genes, genetic code and its properties.

About 7 billion people live on Earth. Apart from 25-30 million pairs of identical twins, all people are genetically different: each is unique and has unique hereditary characteristics, character traits, abilities and temperament.

These differences are explained by differences in genotypes, the sets of genes of an organism; each one is unique. The genetic traits of a particular organism are embodied in proteins, so the structure of one person's proteins differs, although only slightly, from that of another person's.

This does not mean that people have no identical proteins. Proteins that perform the same functions may be identical or may differ from each other very slightly, by one or two amino acids. But there are no two people on Earth (with the exception of identical twins) in whom all proteins are the same.

Information about the primary structure of a protein is encoded as a sequence of nucleotides in a section of the DNA molecule, the gene - a unit of hereditary information of an organism. Each DNA molecule contains many genes. The totality of all the genes of an organism makes up its genotype. Thus,

A gene is a unit of hereditary information of an organism, which corresponds to a separate section of DNA

Hereditary information is encoded using the genetic code, which is universal for all organisms; organisms differ only in the order of the nucleotides that form their genes and code for their specific proteins.

The genetic code consists of triplets of DNA nucleotides combined in different sequences (AAT, GCA, ACG, TGC, etc.), each of which encodes a specific amino acid (which will be built into the polypeptide chain).

Strictly speaking, the code is taken to be the sequence of nucleotides in the mRNA molecule, because mRNA copies information from DNA (the process of transcription) and converts it into the sequence of amino acids in the synthesized protein molecules (the process of translation).
The composition of mRNA includes the nucleotides A, C, G and U, whose triplets are called codons: the DNA triplet CGT becomes the triplet GCA on mRNA, and the DNA triplet AAG becomes the triplet UUC. It is the mRNA codons that represent the genetic code in written form.
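The complementarity rule behind these examples (A pairs with U, T with A, G with C, C with G) can be sketched as a lookup table. The function name is an assumption made for illustration.

```python
# Complementary pairing during transcription: DNA base -> mRNA base.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def dna_triplet_to_codon(triplet: str) -> str:
    """Return the mRNA codon complementary to a DNA triplet, base by base."""
    return "".join(DNA_TO_MRNA[base] for base in triplet.upper())

print(dna_triplet_to_codon("CGT"))  # GCA
print(dna_triplet_to_codon("AAG"))  # UUC
```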

Thus, the genetic code is a unified system for recording hereditary information in nucleic acid molecules in the form of a sequence of nucleotides. The genetic code is based on an alphabet of only four nucleotide letters that differ in their nitrogenous bases: A, T, G, C.

The main properties of the genetic code:

1. The genetic code is triplet. A triplet (codon) is a sequence of three nucleotides that codes for one amino acid. Since proteins contain 20 amino acids, each of them obviously cannot be encoded by a single nucleotide (there are only four types of nucleotides in DNA, so 16 amino acids would remain uncoded). Two nucleotides per amino acid are not enough either, since only 4^2 = 16 amino acids could then be encoded. This means that the smallest number of nucleotides encoding one amino acid is three. In this case the number of possible nucleotide triplets is 4^3 = 64.
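The counting argument can be verified with a short calculation. The names below are chosen for this illustration only.

```python
AMINO_ACIDS = 20        # amino acids that must be encoded
NUCLEOTIDE_TYPES = 4    # A, T, G, C

def combinations(codon_length: int) -> int:
    """Number of distinct nucleotide words of the given length."""
    return NUCLEOTIDE_TYPES ** codon_length

for length in (1, 2, 3):
    print(length, combinations(length), combinations(length) >= AMINO_ACIDS)
# 1 4 False
# 2 16 False
# 3 64 True

# Smallest word length that covers all 20 amino acids.
minimal_length = next(n for n in range(1, 10) if combinations(n) >= AMINO_ACIDS)
print(minimal_length)   # 3
```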

2. Redundancy (degeneracy) of the code is a consequence of its triplet nature and means that one amino acid can be encoded by several triplets (since there are 20 amino acids and 64 triplets). The exceptions are methionine and tryptophan, which are encoded by only one triplet each. In addition, some triplets perform specific functions: in the mRNA molecule the triplets UAA, UAG and UGA are terminating codons, i.e. stop signals that end the synthesis of the polypeptide chain. The triplet corresponding to methionine (AUG), when it stands at the beginning of the coding sequence, also performs the function of initiating reading (start codon).
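The roles of the start and stop triplets can be sketched as a scan for an open reading frame: find the first AUG, then read triplets until UAA, UAG or UGA. The mRNA fragment and the function name are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def open_reading_frame(mrna: str) -> list[str]:
    """Return the codons from the first AUG up to (not including) a stop codon."""
    start = mrna.find("AUG")          # initiation: the first AUG sets the frame
    if start == -1:
        return []
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:      # termination: stop signal reached
            break
        codons.append(codon)
    return codons

mrna = "GGAUGGUGCUUUGAAC"             # hypothetical mRNA fragment
print(open_reading_frame(mrna))       # ['AUG', 'GUG', 'CUU']
```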

3. The code is unambiguous - along with redundancy, the code has the property of unambiguity: each codon corresponds to only one specific amino acid.

4. The code is collinear, i.e. the sequence of nucleotides in a gene corresponds exactly to the sequence of amino acids in the protein.

5. The genetic code is non-overlapping and compact, i.e. it does not contain "punctuation marks". This means that reading does not allow codons (triplets) to overlap and that, starting at a certain codon, reading proceeds continuously, triplet by triplet, up to the stop signals (termination codons).

6. The genetic code is universal, i.e. the nuclear genes of all organisms encode information about proteins in the same way, regardless of the level of organization and the systematic position of these organisms.

There are genetic code tables for deciphering mRNA codons and building the chains of protein molecules.

Matrix synthesis reactions.

In living systems, there are reactions that are unknown in inanimate nature - matrix synthesis reactions.

In technology the term "matrix" denotes a mold used for casting coins, medals or typographic type: the hardened metal reproduces exactly all the details of the mold. Matrix synthesis resembles casting in a mold: new molecules are synthesized in strict accordance with the plan laid down in the structure of already existing molecules.

The matrix principle underlies the most important synthetic reactions of the cell, such as the synthesis of nucleic acids and proteins. These reactions ensure an exact, strictly specific sequence of monomeric units in the synthesized polymers.

Here the monomers are drawn in a directed way to a specific place in the cell - to the molecules that serve as the matrix on which the reaction takes place. If such reactions depended on random collisions of molecules, they would proceed infinitely slowly. Synthesis of complex molecules on the matrix principle is fast and accurate. In matrix reactions the role of the matrix is played by macromolecules of nucleic acids, DNA or RNA.

The monomeric molecules from which the polymer is synthesized - nucleotides or amino acids - are arranged and fixed on the matrix in a strictly defined, predetermined order in accordance with the principle of complementarity.

Then the monomer units are "crosslinked" into a polymer chain, and the finished polymer is released from the matrix.

After that the matrix is ready for the assembly of a new polymer molecule. Clearly, just as only one kind of coin or one letter can be cast in a given mold, only one kind of polymer can be "assembled" on a given matrix molecule.

The matrix type of reaction is a specific feature of the chemistry of living systems. Such reactions are the basis of the fundamental property of all living things - the ability to reproduce their own kind.

Matrix synthesis reactions

1. DNA replication (from Latin replicatio - renewal) - the process of synthesis of a daughter molecule of deoxyribonucleic acid on the matrix of the parent DNA molecule. During the subsequent division of the mother cell, each daughter cell receives one copy of the DNA molecule, identical to the DNA of the original mother cell. This process ensures the accurate transmission of genetic information from generation to generation. DNA replication is carried out by a complex of 15-20 different proteins called the replisome. The material for synthesis is the free nucleotides present in the cell. The biological meaning of replication lies in the exact transfer of hereditary information from the parent molecule to the daughter molecules, which normally occurs during the division of somatic cells.

The DNA molecule consists of two complementary strands. These strands are held together by weak hydrogen bonds that can be broken by enzymes. The DNA molecule is capable of self-doubling (replication): a new half is synthesized on each old half of the molecule.
In addition, an mRNA molecule can be synthesized on a DNA molecule; it then transfers the information received from DNA to the site of protein synthesis.
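The self-doubling described above, in which each old strand serves as the matrix for a new one, can be sketched as follows. The sequence is hypothetical, strand direction is ignored for brevity (the two strands are written aligned base to base), and the function names are invented for this example.

```python
# Complementary base pairs in DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the strand that pairs with the given one, base by base."""
    return "".join(COMPLEMENT[base] for base in strand)

def replicate(duplex: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Each old strand is kept and a new complementary strand is built on it."""
    old_a, old_b = duplex
    daughter_1 = (old_a, complement(old_a))   # old strand A + new strand
    daughter_2 = (complement(old_b), old_b)   # new strand + old strand B
    return daughter_1, daughter_2

parent = ("ATGGTGCTT", complement("ATGGTGCTT"))   # hypothetical duplex
daughter_1, daughter_2 = replicate(parent)
print(parent)                                     # ('ATGGTGCTT', 'TACCACGAA')
print(daughter_1 == parent, daughter_2 == parent) # True True
```

Both daughter molecules are identical to the parent, and each contains one old and one newly built strand.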

Information transfer and protein synthesis follow the matrix principle, comparable to the work of a printing press in a printing house. Information from DNA is copied over and over again. If errors occur during copying, they are repeated in all subsequent copies.

True, some errors made when a DNA molecule is copied can be corrected; the process of eliminating errors is called repair. The first reaction in the process of information transfer is the replication of the DNA molecule and the synthesis of new DNA strands.

2. Transcription (from Latin transcriptio - rewriting) - the process of RNA synthesis using DNA as a template, which occurs in all living cells. In other words, it is the transfer of genetic information from DNA to RNA.

Transcription is catalyzed by the enzyme DNA-dependent RNA polymerase. RNA polymerase moves along the DNA molecule in the 3' → 5' direction. Transcription consists of the stages of initiation, elongation and termination. The unit of transcription is the operon, a fragment of the DNA molecule consisting of a promoter, a transcribed region and a terminator. mRNA consists of a single strand and is synthesized on DNA according to the rule of complementarity, with the participation of an enzyme that starts and ends the synthesis of the mRNA molecule.

The finished mRNA molecule enters the cytoplasm and reaches the ribosomes, where the synthesis of polypeptide chains takes place.

3. Translation (from Latin translatio - transfer, movement) - the process of protein synthesis from amino acids on the matrix of messenger (informational) RNA (mRNA, i-RNA), carried out by the ribosome. In other words, it is the process of translating the information contained in the nucleotide sequence of mRNA into the sequence of amino acids in a polypeptide.

4. Reverse transcription is the process of forming double-stranded DNA based on information from single-stranded RNA. It is called reverse transcription because the transfer of genetic information occurs in the "reverse" direction relative to transcription. The idea of reverse transcription was initially very unpopular, as it went against the central dogma of molecular biology, which assumed that DNA is transcribed into RNA and then translated into proteins.

However, in 1970 Temin and Baltimore independently discovered an enzyme called reverse transcriptase (revertase), and the possibility of reverse transcription was finally confirmed. In 1975 Temin and Baltimore were awarded the Nobel Prize in Physiology or Medicine. Some viruses (such as the human immunodeficiency virus, which causes HIV infection) are able to transcribe RNA into DNA. HIV has an RNA genome that is copied into DNA, and as a result the DNA of the virus can be combined with the genome of the host cell. The main enzyme responsible for the synthesis of DNA from RNA is reverse transcriptase. One of its functions is to create complementary DNA (cDNA) from the viral genome. The associated ribonuclease H activity cleaves the RNA, and reverse transcriptase completes the cDNA into a DNA double helix. The cDNA is integrated into the host cell genome by integrase. The result is the synthesis of viral proteins by the host cell, which form new viruses. In the case of HIV, apoptosis (cell death) of T-lymphocytes is also programmed. In other cases the cell may remain a distributor of viruses.
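The information flow of reverse transcription, from an RNA template to a complementary DNA strand, can be sketched as below. Only the base-pairing logic is modelled, not the enzyme; the RNA fragment and the function name are hypothetical.

```python
# Pairing rule for reverse transcription: RNA base -> DNA base.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna_5_to_3: str) -> str:
    """Return the first cDNA strand, written 5'->3', for an RNA written 5'->3'."""
    # The cDNA is antiparallel to its template, so the RNA is read from its 3' end.
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5_to_3.upper()))

rna = "AUGGUGCUU"                 # hypothetical fragment of an RNA genome
print(reverse_transcribe(rna))    # AAGCACCAT
```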

The sequence of matrix reactions in protein biosynthesis can be represented as a diagram.

Thus, protein biosynthesis is one of the types of plastic exchange, during which the hereditary information encoded in the genes of DNA is realized as a certain sequence of amino acids in protein molecules.

Protein molecules are essentially polypeptide chains made up of individual amino acids. But amino acids are not active enough to join each other on their own. Therefore, before they combine and form a protein molecule, the amino acids must be activated. This activation occurs under the action of special enzymes.

As a result of activation the amino acid becomes more labile and, under the action of the same enzyme, binds to tRNA. Each amino acid corresponds to a strictly specific tRNA, which finds "its" amino acid and carries it into the ribosome.

The ribosome therefore receives various activated amino acids linked to their tRNAs. The ribosome works like a conveyor that assembles a protein chain from the amino acids entering it.

Simultaneously with the tRNA carrying its own amino acid, the ribosome receives a "signal" from the DNA contained in the nucleus. In accordance with this signal, one protein or another is synthesized in the ribosome.

The directing influence of DNA on protein synthesis is exerted not directly but through a special intermediary - matrix or messenger RNA (mRNA or i-RNA), which is synthesized in the nucleus under the influence of DNA, so its composition reflects the composition of DNA. The RNA molecule is, as it were, a cast of the DNA form. The synthesized mRNA enters the ribosome and hands this structure a plan: the order in which the activated amino acids entering the ribosome must be joined to synthesize a certain protein. In other words, the genetic information encoded in DNA is transferred to mRNA and then to protein.

The mRNA molecule enters the ribosome and threads through it. The segment of mRNA that is in the ribosome at a given moment, a codon (triplet), interacts in a completely specific way with the structurally matching triplet (anticodon) of the transfer RNA that brought the amino acid into the ribosome.

A transfer RNA with its amino acid approaches a certain codon of the mRNA and binds to it; another tRNA with a different amino acid joins the next, neighboring section of the mRNA, and so on until the entire mRNA chain has been read and all the amino acids have been strung together in the appropriate order, forming a protein molecule. The tRNA that delivered its amino acid to a specific site of the polypeptide chain is freed from the amino acid and leaves the ribosome.
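The codon-anticodon matching described above can be sketched with a small pool of tRNAs. The pool holds only three hypothetical entries, the anticodons are written aligned base to base with the codon, and refinements such as wobble pairing are ignored.

```python
# Complementary pairing between mRNA codon bases and tRNA anticodon bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Hypothetical tRNA pool: anticodon -> amino acid it carries.
TRNA_POOL = {"UAC": "Met", "CAC": "Val", "GAA": "Leu"}

def anticodon_for(codon: str) -> str:
    """Return the anticodon that pairs with the codon, base by base."""
    return "".join(PAIR[base] for base in codon)

def assemble(mrna: str) -> list[str]:
    """Step through the mRNA codon by codon and string the amino acids together."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        anticodon = anticodon_for(mrna[i:i + 3])
        if anticodon not in TRNA_POOL:    # no matching tRNA, e.g. at a stop codon
            break
        chain.append(TRNA_POOL[anticodon])
    return chain

print(assemble("AUGGUGCUUUAA"))   # ['Met', 'Val', 'Leu']
```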

Back in the cytoplasm, the tRNA can bind the required amino acid again and carry it to the ribosome once more. Not one but several ribosomes (polyribosomes) take part in protein synthesis simultaneously.

The main stages of the transfer of genetic information:

1. Synthesis of mRNA on DNA as a template (transcription).
2. Synthesis of the polypeptide chain in ribosomes according to the program contained in mRNA (translation).

The stages are universal for all living beings, but the temporal and spatial relationships of these processes differ in prokaryotes and eukaryotes.

In prokaryotes, transcription and translation can occur simultaneously because the DNA is located in the cytoplasm. In eukaryotes, transcription and translation are strictly separated in space and time: the various RNAs are synthesized in the nucleus, after which the RNA molecules must leave the nucleus by passing through the nuclear membrane. The RNA is then transported in the cytoplasm to the site of protein synthesis.

The genetic code is a unified system for recording hereditary information in nucleic acid molecules in the form of a sequence of nucleotides. The genetic code is based on an alphabet of only four nucleotide letters that differ in their nitrogenous bases: A, T, G, C.

The main properties of the genetic code are as follows:

1. The genetic code is triplet. A triplet (codon) is a sequence of three nucleotides that codes for one amino acid. Since proteins contain 20 amino acids, each of them obviously cannot be encoded by a single nucleotide (there are only four types of nucleotides in DNA, so 16 amino acids would remain uncoded). Two nucleotides per amino acid are not enough either, since only 16 amino acids could then be encoded. This means that the smallest number of nucleotides encoding one amino acid is three. (In this case the number of possible nucleotide triplets is 4^3 = 64.)

2. The redundancy (degeneracy) of the code is a consequence of its triplet nature and means that one amino acid can be encoded by several triplets (since there are 20 amino acids and 64 triplets). The exceptions are methionine and tryptophan, which are encoded by only one triplet each. In addition, some triplets perform specific functions. In an mRNA molecule three of them - UAA, UAG, UGA - are terminating codons, i.e. stop signals that end the synthesis of the polypeptide chain. The triplet corresponding to methionine (AUG), when it stands at the beginning of the coding sequence, also performs the function of initiating reading (start codon).

3. Along with redundancy, the code has the property of unambiguity, which means that each codon corresponds to only one specific amino acid.

4. The code is collinear, i.e. the sequence of nucleotides in a gene corresponds exactly to the sequence of amino acids in the protein.

5. The genetic code is non-overlapping and compact, that is, it does not contain "punctuation marks". This means that reading does not allow codons (triplets) to overlap and that, starting at a certain codon, reading proceeds continuously, triplet by triplet, up to the stop signals (terminating codons). For example, in mRNA the sequence of nitrogenous bases AUGGUGCUUAAUGUG will be read only in triplets like this: AUG, GUG, CUU, AAU, GUG, and not AUG, UGG, GGU, GUG, etc., or AUG, GGU, UGC, CUU, etc., or in some other way (for example, codon AUG, punctuation mark G, codon UGC, punctuation mark U, etc.).
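The difference between the actual reading and the overlapping alternatives can be shown on the 15-nucleotide sequence AUGGUGCUUAAUGUG. The function names are assumptions made for this sketch.

```python
def triplets(mrna: str) -> list[str]:
    """Non-overlapping reading: every nucleotide belongs to exactly one triplet."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

def overlapping(mrna: str, step: int) -> list[str]:
    """Hypothetical overlapping reading: the window moves by fewer than 3 bases."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, step)]

mrna = "AUGGUGCUUAAUGUG"
print(triplets(mrna))             # ['AUG', 'GUG', 'CUU', 'AAU', 'GUG']
print(overlapping(mrna, 1)[:4])   # ['AUG', 'UGG', 'GGU', 'GUG']
print(overlapping(mrna, 2)[:4])   # ['AUG', 'GGU', 'UGC', 'CUU']
```

Only the first reading corresponds to how the code is actually read; the other two are the overlapping variants that the code rules out.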

6. The genetic code is universal, that is, the nuclear genes of all organisms encode information about proteins in the same way, regardless of the level of organization and the systematic position of these organisms.

Earlier we emphasized that nucleotides have a feature important for the emergence of life on Earth: when one polynucleotide chain is present in solution, a second (parallel) chain forms spontaneously through the complementary pairing of related nucleotides. The equal number of nucleotides in both chains and their chemical kinship are an indispensable condition for such reactions. During protein synthesis, however, when information from mRNA is realized in the structure of a protein, the principle of complementarity cannot apply. In mRNA and in the synthesized protein not only is the number of monomers different but, more importantly, there is no structural similarity between them (nucleotides on the one hand, amino acids on the other). Clearly, a new principle is needed for the exact transfer of information from a polynucleotide to a polypeptide structure. In the course of evolution such a principle arose, and the genetic code lies at its basis.

The genetic code is a system for recording hereditary information in nucleic acid molecules, based on a certain alternation of nucleotide sequences in DNA or RNA that form codons corresponding to amino acids in a protein.

The genetic code has several properties.

    Triplet nature.

    Degeneracy or redundancy.

    Unambiguity.

    Polarity.

    Non-overlapping.

    Compactness.

    Universality.

It should be noted that some authors also offer other properties of the code related to the chemical features of the nucleotides included in the code or to the frequency of occurrence of individual amino acids in the proteins of the body, etc. However, these properties follow from the above, so we will consider them there.

a. Tripletity. The genetic code, like many complexly organized systems, has the smallest structural and smallest functional unit. A triplet is the smallest structural unit of the genetic code. It consists of three nucleotides. A codon is the smallest functional unit of the genetic code. As a rule, mRNA triplets are called codons. In the genetic code, a codon performs several functions. First, its main function is that it codes for one amino acid. Second, a codon may not code for an amino acid, but in this case it has a different function (see below). As can be seen from the definition, a triplet is a concept that characterizes elementary structural unit genetic code (three nucleotides). codon characterizes elementary semantic unit genome - three nucleotides determine the attachment to the polypeptide chain of one amino acid.

The elementary structural unit was first worked out theoretically, and its existence was then confirmed experimentally. Indeed, 20 amino acids cannot be encoded by one or two nucleotides, because there are only 4 kinds of nucleotide: single nucleotides give 4 variants and pairs give 4^2 = 16. Combinations of three nucleotides give 4^3 = 64 variants, which more than covers the number of amino acids present in living organisms (see Table 1).
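The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch; the names NUCLEOTIDES, AMINO_ACIDS and count_words are chosen here for the example and do not come from the text.

```python
from itertools import product

NUCLEOTIDES = "ACGU"  # the four mRNA nucleotides
AMINO_ACIDS = 20      # number of amino acids to encode, as stated in the text


def count_words(length: int) -> int:
    """Number of distinct nucleotide 'words' of the given length."""
    return len(list(product(NUCLEOTIDES, repeat=length)))


for length in (1, 2, 3):
    n = count_words(length)
    verdict = "enough" if n >= AMINO_ACIDS else "too few"
    print(f"word length {length}: {n} combinations ({verdict})")
```

Only words of length three reach the 20 combinations required, which is the theoretical reasoning behind the triplet.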

Table 1. Messenger RNA codons and their corresponding amino acids

The combinations of nucleotides presented in Table 1 have two features. First, of the 64 triplet variants, only 61 are codons that encode an amino acid; they are called sense codons. Three triplets do not encode amino acids but are stop signals marking the end of translation. There are three such triplets, UAA, UAG and UGA; they are also called "meaningless" (nonsense) codons. As a result of a mutation in which one nucleotide in a triplet is replaced by another, a nonsense codon can arise from a sense codon. This type of mutation is called a nonsense mutation. If such a stop signal forms inside a gene (in its coding part), protein synthesis will be interrupted at that point every time: only the first part of the protein (up to the stop signal) will be synthesized. A person with such a pathology will lack the protein and will show symptoms associated with this deficiency. For example, a mutation of this kind has been found in the gene encoding the hemoglobin beta chain. A shortened, inactive hemoglobin chain is synthesized and is rapidly destroyed. As a result, a hemoglobin molecule devoid of a beta chain is formed. Clearly, such a molecule is unlikely to perform its function fully. A serious disease arises that develops as a hemolytic anemia (beta-zero thalassemia, from the Greek word "thalassa", meaning sea, a reference to the Mediterranean, where the disease was first discovered).
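The effect of a nonsense mutation can be illustrated with a short Python sketch. The codon table is the standard genetic code written in compact form; the mRNA sequence is hypothetical and chosen only for the example (it is not the real beta-globin sequence), and the function name translate is likewise an assumption.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(mrna: str) -> str:
    """Read triplets from the first nucleotide and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: AUG GUG CAG CUG ACU UAA
normal = "AUGGUGCAGCUGACUUAA"
# Nonsense mutation: C -> U in the third codon turns CAG (Gln) into UAG (stop).
mutant = normal[:6] + "U" + normal[7:]

print("normal :", translate(normal))
print("mutant :", translate(mutant))
```

In this sketch the mutant message yields only the part of the chain that precedes the new stop signal, as described above.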

The mechanism of action of stop codons is different from the mechanism of action of sense codons. This follows from the fact that for all the codons encoding amino acids, the corresponding tRNAs were found. No tRNAs were found for nonsense codons. Therefore, tRNA does not take part in the process of stopping protein synthesis.

The codon AUG (and sometimes GUG in bacteria) not only encodes an amino acid (methionine and valine, respectively) but also acts as the initiator of translation.

b. Degeneracy or redundancy.

61 of the 64 triplets code for 20 amino acids. Such a threefold excess of the number of triplets over the number of amino acids suggests that two coding options can be used in the transfer of information. Firstly, not all 64 codons can be involved in encoding 20 amino acids, but only 20, and secondly, amino acids can be encoded by several codons. Studies have shown that nature used the latter option.

The reason for this preference is clear. If only 20 of the 64 triplet variants were involved in coding amino acids, then 44 triplets would remain non-coding, i.e. meaningless (nonsense codons). Earlier, we pointed out how dangerous it is for the life of a cell when a mutation turns a coding triplet into a nonsense codon: this prematurely terminates protein synthesis and can ultimately lead to disease. There are currently three nonsense codons in the genetic code; now imagine what would happen if their number increased about 15-fold. Clearly, in such a situation the probability of a normal codon turning into a nonsense codon would be immeasurably higher.
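This thought experiment can be made concrete with a rough Python sketch. It counts, for every sense codon, how many of its single-nucleotide substitutions produce a nonsense codon, first for the standard code and then for a hypothetical code with only 20 sense codons. The hypothetical code is an assumption made for illustration: it arbitrarily keeps the first codon encountered for each amino acid and treats all other triplets as nonsense.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD = dict(zip(CODONS, AMINO))

# Hypothetical non-degenerate code: one sense codon per amino acid, 44 nonsense triplets.
SPARSE = {}
seen = set()
for codon, aa in STANDARD.items():
    if aa != "*" and aa not in seen:
        seen.add(aa)
        SPARSE[codon] = aa
    else:
        SPARSE[codon] = "*"


def nonsense_share(table):
    """Return (nonsense, total) over all single-nucleotide substitutions in sense codons."""
    nonsense = total = 0
    for codon, aa in table.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    mutant = codon[:pos] + base + codon[pos + 1:]
                    total += 1
                    nonsense += table[mutant] == "*"
    return nonsense, total


for name, table in (("standard code", STANDARD), ("hypothetical 20-codon code", SPARSE)):
    n, t = nonsense_share(table)
    print(f"{name}: {n} of {t} substitutions give a nonsense codon ({n / t:.1%})")
```

The exact share for the hypothetical code depends on which 20 triplets are kept, so the figure it prints should be read only as an order-of-magnitude comparison.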

A code in which one amino acid is encoded by several triplets is called degenerate or redundant. Almost every amino acid has several codons. Thus, the amino acid leucine can be encoded by six triplets: UUA, UUG, CUU, CUC, CUA and CUG. Valine is encoded by four triplets, phenylalanine by two, and only tryptophan and methionine are each encoded by a single codon. This property, the recording of the same information with different symbols, is called degeneracy.
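The codon counts quoted above can be reproduced by inverting the standard codon table. The sketch below is illustrative; the compact table layout and the name codons_for are choices made for this example.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

# Invert the table: amino acid -> list of codons that encode it.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

examples = (("L", "leucine"), ("V", "valine"), ("F", "phenylalanine"),
            ("W", "tryptophan"), ("M", "methionine"))
for letter, name in examples:
    print(f"{name}: {len(codons_for[letter])} codon(s): {', '.join(codons_for[letter])}")

sense_codons = sum(len(c) for aa, c in codons_for.items() if aa != "*")
print("sense codons:", sense_codons, "| stop codons:", ", ".join(codons_for["*"]))
```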

The number of codons assigned to one amino acid correlates well with the frequency of occurrence of the amino acid in proteins.

And this is most likely not accidental. The higher the frequency of occurrence of an amino acid in a protein, the more often the codon of this amino acid is represented in the genome, the higher the probability of its damage by mutagenic factors. Therefore, it is clear that a mutated codon is more likely to code for the same amino acid if it is highly degenerate. From these positions, the degeneracy of the genetic code is a mechanism that protects the human genome from damage.

It should be noted that the term degeneracy is also used in molecular genetics in another sense. Since most of the information in a codon is carried by its first two nucleotides, the base in the third position is of little importance. This phenomenon is called "degeneracy of the third base". It minimizes the effect of mutations. For example, the main function of red blood cells is to transport oxygen from the lungs to the tissues and carbon dioxide from the tissues to the lungs. This function is carried out by the respiratory pigment hemoglobin, which fills the entire cytoplasm of the erythrocyte. Hemoglobin consists of a protein part, globin, which is encoded by the corresponding gene, and heme, which contains iron. Mutations in globin genes lead to the appearance of different hemoglobin variants. Most often, a mutation replaces one nucleotide with another and creates a new codon in the gene, which may code for a new amino acid in the hemoglobin polypeptide chain. Any nucleotide of a triplet, the first, second or third, can be replaced as a result of a mutation. Several hundred mutations are known to affect the integrity of globin genes. About 400 of them involve the replacement of a single nucleotide in the gene and a corresponding amino acid substitution in the polypeptide. Of these, only 100 substitutions lead to hemoglobin instability and to diseases ranging from mild to very severe. 300 (approximately 64%) substitution mutations do not affect hemoglobin function and do not lead to pathology. One reason for this is the "degeneracy of the third base" mentioned above: replacing the third nucleotide in a triplet encoding serine, leucine, proline, arginine or certain other amino acids produces a synonymous codon encoding the same amino acid. Phenotypically, such a mutation will not manifest itself.

In contrast, replacing the first or second nucleotide of a triplet almost always leads to the appearance of a new hemoglobin variant. Even then, there may be no severe phenotypic disorder, because the amino acid in hemoglobin may be replaced by another with similar physicochemical properties, for example when one hydrophilic amino acid is replaced by another hydrophilic amino acid.
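The different weight of the three codon positions can be checked directly against the standard codon table. The following Python sketch counts, for every sense codon, how many single-nucleotide substitutions at each position leave the encoded amino acid unchanged. The function name synonymous_by_position is an assumption made for this example.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def synonymous_by_position(table):
    """Count synonymous single-nucleotide substitutions at codon positions 1, 2 and 3."""
    counts = [0, 0, 0]
    for codon, aa in table.items():
        if aa == "*":
            continue  # start only from sense codons
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    mutant = codon[:pos] + base + codon[pos + 1:]
                    if table[mutant] == aa:
                        counts[pos] += 1
    return counts


per_position = 61 * 3  # 61 sense codons, 3 alternative bases at each position
for number, count in enumerate(synonymous_by_position(CODON_TABLE), start=1):
    print(f"position {number}: {count} of {per_position} substitutions are synonymous")
```

With the standard table this gives 8, 0 and 126 synonymous substitutions at the first, second and third positions. The few synonymous first-position changes occur within the leucine and arginine codon families.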

Hemoglobin consists of an iron-porphyrin group, heme (to which oxygen and carbon dioxide molecules attach), and a protein, globin. Adult hemoglobin (HbA) contains two identical α-chains and two β-chains. The α-chain contains 141 amino acid residues and the β-chain 146; the α- and β-chains differ in many amino acid residues. The amino acid sequence of each globin chain is encoded by its own gene. The gene encoding the α-chain is located on the short arm of chromosome 16, and the β-gene on the short arm of chromosome 11. A change of the first or second nucleotide in the gene encoding the hemoglobin β-chain almost always leads to the appearance of a new amino acid in the protein, disruption of hemoglobin function and severe consequences for the patient. For example, replacing "C" in one of the CAU (histidine) triplets with "U" produces a new triplet, UAU, which encodes another amino acid, tyrosine. Phenotypically, this manifests itself as a serious illness. Such a replacement of histidine by tyrosine at position 63 of the β-chain destabilizes hemoglobin, and the disease methemoglobinemia develops. The replacement, as a result of mutation, of glutamic acid by valine at position 6 of the β-chain is the cause of a severe disease, sickle cell anemia. Let us not continue the sad list. We note only that when one of the first two nucleotides is replaced, the new amino acid may be similar in physicochemical properties to the previous one. Thus, replacing the second nucleotide in one of the triplets encoding glutamic acid (GAA) in the β-chain with "U" produces a new triplet (GUA) encoding valine, while replacing the first nucleotide with "A" forms the triplet AAA, encoding the amino acid lysine. Glutamic acid and lysine are similar in physicochemical properties: both are hydrophilic. Valine is a hydrophobic amino acid.

Therefore, the replacement of hydrophilic glutamic acid with hydrophobic valine significantly changes the properties of hemoglobin, which ultimately leads to the development of sickle cell anemia, whereas the replacement of hydrophilic glutamic acid with hydrophilic lysine changes hemoglobin function to a lesser extent: patients develop a mild form of anemia. When the third base is replaced, the new triplet may encode the same amino acid as the previous one. For example, if uracil is replaced by cytosine in the CAU triplet and a CAC triplet arises, practically no phenotypic changes will be detected in the person. This is understandable, because both triplets code for the same amino acid, histidine.
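The codon substitutions discussed for hemoglobin can be traced through the standard codon table. The sketch below is illustrative only: the helper substitute and the EXAMPLES list are names introduced here, and the examples are the first-, second- and third-position substitutions of the CAU and GAA triplets described above.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

NAMES = {"H": "histidine", "Y": "tyrosine", "E": "glutamic acid",
         "V": "valine", "K": "lysine"}


def substitute(codon: str, position: int, base: str) -> str:
    """Replace the nucleotide at a 1-based position of a codon."""
    return codon[:position - 1] + base + codon[position:]


# (original codon, position of the replaced nucleotide, new base)
EXAMPLES = [("CAU", 1, "U"), ("GAA", 2, "U"), ("GAA", 1, "A"), ("CAU", 3, "C")]

for codon, position, base in EXAMPLES:
    mutant = substitute(codon, position, base)
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    kind = "synonymous" if before == after else "amino acid changed"
    print(f"{codon} -> {mutant}: {NAMES[before]} -> {NAMES[after]} ({kind})")
```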

In conclusion, it is appropriate to emphasize that the degeneracy of the genetic code and the degeneracy of the third base from a general biological position are protective mechanisms that are incorporated in evolution in the unique structure of DNA and RNA.

c. Unambiguity.

Each triplet (except for meaningless ones) encodes only one amino acid. Thus, in the direction of codon - amino acid, the genetic code is unambiguous, in the direction of amino acid - codon - ambiguous (degenerate).

codon → amino acid: unambiguous

amino acid → codon: degenerate

And in this case, the need for unambiguity in the genetic code is obvious. In another variant, during the translation of the same codon, different amino acids would be inserted into the protein chain and, as a result, proteins with different primary structures and different functions would be formed. The cell's metabolism would switch to the "one gene - several polypeptides" mode of operation. It is clear that in such a situation the regulatory function of genes would be completely lost.

d. Polarity.

Information is read from DNA and from mRNA in one direction only (mRNA, for example, is translated from its 5′ end to its 3′ end). Polarity is essential for determining higher-order structures (secondary, tertiary, etc.). Earlier we noted that lower-order structures determine higher-order ones. Tertiary and higher-order structures begin to form as soon as the synthesized RNA chain moves away from the DNA molecule or the polypeptide chain moves away from the ribosome. While the free end of the RNA or polypeptide acquires its tertiary structure, the other end of the chain is still being synthesized on DNA (if RNA is being transcribed) or on the ribosome (if a polypeptide is being translated).

Therefore, the unidirectional reading of information (in the synthesis of RNA and protein) is essential not only for determining the sequence of nucleotides or amino acids in the synthesized substance, but also for the rigid determination of secondary, tertiary and higher structures.

e. Non-overlapping.

The code may or may not overlap. In most organisms, the code is non-overlapping. An overlapping code has been found in some phages.

The essence of a non-overlapping code is that the nucleotide of one codon cannot be the nucleotide of another codon at the same time. If the code were overlapping, then the sequence of seven nucleotides (GCUGCUG) could encode not two amino acids (alanine-alanine) (Fig. 33, A) as in the case of a non-overlapping code, but three (if one nucleotide is common) (Fig. 33, B) or five (if two nucleotides are common) (see Fig. 33, C). In the last two cases, a mutation of any nucleotide would lead to a violation in the sequence of two, three, etc. amino acids.

However, it has been found that a mutation of one nucleotide always disrupts the inclusion of one amino acid in a polypeptide. This is a significant argument in favor of the fact that the code is non-overlapping.

Let us explain this with Figure 34. Bold lines show the triplets encoding amino acids in the cases of non-overlapping and overlapping codes. Experiments have unambiguously shown that the genetic code is non-overlapping. Without going into the details of the experiments, we note that if the third nucleotide in the sequence, U (marked with an asterisk in Fig. 34), is replaced with some other nucleotide, then:

1. With a non-overlapping code, the protein controlled by this sequence would have a replacement of one (the first) amino acid (marked with asterisks).

2. With an overlapping code, in option B a replacement would occur in two (the first and second) amino acids (marked with asterisks). In option C, the substitution would affect three amino acids (marked with asterisks).

However, numerous experiments have shown that when one nucleotide in DNA is altered, only one amino acid in the protein is ever affected, which is typical of a non-overlapping code.

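Sequence: GCUGCUG (the third nucleotide, U, is the one replaced; affected triplets and amino acids are marked with asterisks)

A. Non-overlapping code: GCU* GCU → Ala* - Ala

B. Overlapping code (one nucleotide shared): GCU* UGC* CUG → Ala* - Cys* - Leu

C. Overlapping code (two nucleotides shared): GCU* CUG* UGC* GCU CUG → Ala* - Leu* - Cys* - Ala - Leu

Fig. 34. Scheme explaining the presence of a non-overlapping code in the genome (explanation in the text).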
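The three reading schemes for the sequence GCUGCUG can be reproduced with a short Python sketch. The function names read_triplets and affected_by are introduced here for illustration, and the small lookup table lists only the three triplets that occur in this sequence, with their meanings in the standard code.

```python
SEQUENCE = "GCUGCUG"
# Standard-code meanings of the only triplets that occur in this sequence.
AMINO_ACID = {"GCU": "Ala", "UGC": "Cys", "CUG": "Leu"}


def read_triplets(seq: str, step: int):
    """Return (start, triplet) pairs; step 3 is non-overlapping, 2 and 1 overlap."""
    return [(i, seq[i:i + 3]) for i in range(0, len(seq) - 2, step)]


def affected_by(seq: str, step: int, index: int) -> int:
    """Count the triplets that contain the nucleotide at a 0-based index."""
    return sum(start <= index < start + 3 for start, _ in read_triplets(seq, step))


schemes = (("non-overlapping", 3),
           ("overlapping, one shared nucleotide", 2),
           ("overlapping, two shared nucleotides", 1))
for label, step in schemes:
    peptide = " - ".join(AMINO_ACID[t] for _, t in read_triplets(SEQUENCE, step))
    hit = affected_by(SEQUENCE, step, 2)  # index 2 is the third nucleotide
    print(f"{label}: {peptide}; amino acids affected by changing nucleotide 3: {hit}")
```

Only the non-overlapping scheme confines the change to a single amino acid, which is what the experiments show.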

The non-overlapping nature of the genetic code is associated with another property: the reading of information begins from a definite point, the initiation signal. In mRNA this initiation signal is the codon AUG, which encodes methionine.

It should be noted that humans nevertheless have a small number of genes that deviate from the general rule and overlap.
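How the initiation codon fixes the reading frame can be sketched in Python as follows. This is a deliberate simplification that assumes translation starts at the first AUG found in the message; the sequence and the function name open_reading_frame are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def open_reading_frame(mrna: str):
    """Return the codons read from the first AUG up to (not including) a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no initiation signal, nothing is read
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


# Hypothetical message: two leading nucleotides, then AUG GCU GCU UGA, then a tail.
message = "GGAUGGCUGCUUGAAC"
print(open_reading_frame(message))
```

Because reading starts at AUG rather than at the first nucleotide, the two leading nucleotides do not shift the triplets that follow.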

f. Compactness.

There are no punctuation marks between codons. In other words, the triplets are not separated from each other, for example, by one meaningless nucleotide. The absence of "punctuation marks" in the genetic code has been proven in experiments.

g. Universality.

The code is the same for all organisms living on Earth. Direct evidence of the universality of the genetic code was obtained by comparing DNA sequences with the corresponding protein sequences. It turned out that the same set of code values is used in all bacterial and eukaryotic genomes. There are exceptions, but not many.

The first exceptions to the universality of the genetic code were found in the mitochondria of some animal species. This concerned the terminator codon UGA, which was read in the same way as the UGG codon encoding the amino acid tryptophan. Other, rarer deviations from universality have also been found.
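The mitochondrial exception can be modelled as a one-entry override of the standard codon table. The Python sketch below changes only the UGA assignment described above and ignores any other deviations; the table names, the function translate and the test sequence are assumptions made for illustration.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD = dict(zip(CODONS, AMINO))

# Variant table: UGA is read as tryptophan (W), like UGG. Nothing else is changed.
MITO_VARIANT = dict(STANDARD, UGA="W")


def translate(mrna: str, table) -> str:
    """Translate from the first nucleotide until a stop codon in the given table."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: AUG UGA UGG UAA
message = "AUGUGAUGGUAA"
print("standard code      :", translate(message, STANDARD))
print("mitochondrial UGA=W:", translate(message, MITO_VARIANT))
```

The same message is cut short after the first amino acid under the standard table but read through UGA under the variant table.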

The genetic code is a special way of encoding hereditary information in nucleic acid molecules. On this basis, genes control the synthesis of proteins and enzymes in the body and thereby determine metabolism. In turn, the structure and functions of individual proteins are determined by the order and composition of amino acids, the structural units of the protein molecule.

In the middle of the last century, genes were identified as separate sections of deoxyribonucleic acid (abbreviated as DNA). Chains of nucleotides form a characteristic double strand coiled into a helix.

Scientists have found a connection between genes and the chemical structure of individual proteins, the essence of which is that the structural order of amino acids in protein molecules fully corresponds to the order of nucleotides in the gene. Having established this connection, scientists decided to decipher the genetic code, i.e. establish the laws of correspondence between the structural orders of nucleotides in DNA and amino acids in proteins.

There are only four types of nucleotides:

1) A - adenylic (adenine);

2) G - guanylic (guanine);

3) T - thymidylic (thymine);

4) C - cytidylic (cytosine).

Proteins contain twenty types of standard amino acids. Deciphering the genetic code was difficult because there are far fewer kinds of nucleotides than amino acids. To solve this problem, it was suggested that each amino acid is encoded by a combination of three nucleotides (a so-called codon or triplet).

In addition, it was necessary to explain exactly how the triplets are located along the gene. Thus, three main groups of theories arose:

1) triplets follow each other continuously, i.e. form a continuous code;

2) triplets are arranged with alternation of "meaningless" sections, i.e. the so-called "commas" and "paragraphs" are formed in the code;

3) triplets can overlap, i.e. the end of the first triplet may form the beginning of the next.

Currently, the theory of a continuous code is the generally accepted one.

The genetic code and its properties

1) The code is triplet - it consists of arbitrary combinations of three nucleotides that form codons.

2) The genetic code is redundant: one amino acid can be encoded by several codons, since, by simple calculation, there are about three times as many codons as amino acids. Some codons perform special functions: some are "stop signals" that mark the end of an amino acid chain, while others signal the initiation of code reading.

3) The genetic code is unambiguous - only one amino acid can correspond to each of the codons.

4) The genetic code is collinear, i.e. the sequence of nucleotides and the sequence of amino acids clearly correspond to each other.

5) The code is written continuously and compactly, there are no "meaningless" nucleotides in it. It begins with a certain triplet, which is replaced by the next one without a break and ends with a termination codon.

6) The genetic code is universal - the genes of any organism encode information about proteins in exactly the same way. This does not depend on the level of complexity of the organization of the organism or its systemic position.

Modern science suggests that the genetic code arose together with the emergence of the first organisms from inert (non-living) matter. Random changes and evolutionary processes make any variant of the code possible, i.e. amino acids could have been assigned to codons in any order. Why, then, did this particular code survive in the course of evolution, and why is it universal and similar in structure in all organisms? The more science learns about the phenomenon of the genetic code, the more new mysteries arise.