The Human Genome Project managed to sequence only 92% of the human genome. The remaining 8% took roughly two more decades to complete because it consists of long, highly repetitive stretches of nucleotides that the technology of the time could not piece together.
It’s amazing how four different molecular letters can combine to make living organisms as small as a microbe and as enormous as a blue whale. These four molecular letters, A (Adenine), T (Thymine), G (Guanine), and C (Cytosine), make up our DNA (along with sugars and phosphates).
The entire library of A, T, C, and G sequences in an organism is called its genome. The genome contains all of the information necessary to construct that organism and allow it to grow and develop over time. The size and complexity of a genome vary from species to species, but in every case the instructions are written in the form of DNA.
Think of the genome as a multi-story building constructed by the repetition of building blocks, while the different stories of the building store the information necessary for the proper functioning, signaling, and survival of the organism.
It is necessary to conduct a detailed analysis of this building in order to understand the problems (genetic disorders) that may be occurring in any part of it, which can be accomplished by beginning with the foundation.
Sequencing is the process of reading a genome letter by letter, that is, determining the exact order of the nucleotides that make it up.
The Human Genome Project
The Human Genome Project began in October 1990 with the goal of sequencing the entire human genome. By 2003, however, the project had managed to sequence only 92% of it. The remaining 8% was finally decoded in March 2022, following roughly two more decades of research. But why did it take decades longer to sequence only 8% of the genome?
The answer lies in the framework of the human genome.
Each of the four molecular letters, joined to a sugar and a phosphate, forms a nucleotide. Nucleotides on the two opposite strands of DNA pair up, A with T and C with G. A long string of these nucleotides composes our entire genome. A string of nucleotides that codes for particular information (usually a protein) is a gene. The genes, together with all the stretches of DNA that are not genes, collectively make up the genome.
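The pairing rule (A with T, C with G) means that one strand of DNA fully determines its partner. A minimal Python sketch of that idea follows; the function name `complementary_strand` and the example sequence are made up for illustration and do not come from any sequencing tool.

```python
# Pairing rule described above: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the partner base for each letter of `strand`, in the same order."""
    return "".join(COMPLEMENT[base] for base in strand)


example = "ATGC"  # hypothetical four-letter stretch of DNA
partner = complementary_strand(example)
print(example, "pairs with", partner)
```

Applying the function twice returns the original strand, which is why reading one strand is enough to know both.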
Scientists estimate that the human genome contains approximately 3 billion nucleotides, with only about 1% of them involved in signaling and information processing. Most of the remaining 99% was long considered useless "junk." Much of this 99% consists of long repeats of nucleotides, which was a significant problem.
It is like building with identical-looking bricks that must nevertheless be laid in one specific order. Because these repeating sequences are so similar in structure and content, the sequencing technology of the time could not tell them apart.
In other words, let’s say we have a cast that can only hold 100 bricks at a time; how can we be certain that the specific 100 bricks should be placed in this section? As a result, we needed a more extensive cast to lay all the bricks at once.
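The brick analogy can be tried out on a toy example. The sketch below is only an illustration under simplifying assumptions (exact matching, no sequencing errors, and a made-up 51-letter "genome"); real assemblers work very differently. It counts how many places a read could fit: a short read that falls entirely inside a repeat fits in several places, while a read long enough to reach the unique sequence on both sides fits in only one.

```python
def count_placements(genome: str, read: str) -> int:
    """Count every position where `read` matches `genome` exactly."""
    return sum(
        1
        for start in range(len(genome) - len(read) + 1)
        if genome[start:start + len(read)] == read
    )


# Hypothetical toy genome: unique flank, five copies of a repeat, unique flank.
left_flank = "ACGTTGCA"
repeat_unit = "GATTACA"
right_flank = "CCGGTATC"
genome = left_flank + repeat_unit * 5 + right_flank

# A short read that lies entirely inside the repeat (the small "cast").
short_read = repeat_unit
# A long read that spans the whole repeat plus a little unique sequence on each side.
long_read = left_flank[-3:] + repeat_unit * 5 + right_flank[:3]

print("short read fits in", count_placements(genome, short_read), "places")
print("long read fits in", count_placements(genome, long_read), "place")
```

With this toy data the short read fits in five places and the long read in exactly one, which is the ambiguity that reads spanning a whole repeat remove.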
How Did They Solve The Problem?
This difficulty was overcome by a group of scientists working together as part of the Telomere-to-Telomere (T2T) Consortium. The consortium managed to piece together the long repeats thanks to advancements in biotechnology and computational methods. The newly developed methods required less memory than previous techniques, which allowed researchers to process the lengthy repeats. The new technology also helped bring down the cost of processing the data.
A complete genome refers to an individual’s entire genetic sequence. The complete human genome will therefore serve as a reference for comparing different people’s genomes and identifying the genetic differences that make each of us unique. It will also help researchers compare genomes within a family and trace the source of genetic differences, in order to identify the genes (active or inactive) that cause various inheritable diseases. This represents a significant step forward in understanding and treating numerous genetic illnesses and mutations in people.