Images in Neuroscience

The Human Genome Sequence

The Human Genome I: Chromosomes and Protein Coding

In 2001, the National Institutes of Health Human Genome Project and Celera will announce the full sequence of the human genome. This progress is being hailed as the beginning of a new era in biomedical research, one intended to advance both the understanding of human disease and the approaches to treating it. It is now the responsibility of each area of medical research to exploit this resource to its limit for the understanding of its target diseases. In humans, the full genome sequence is distributed over 23 pairs of chromosomes and purportedly carries the code for the fewer than 100,000 proteins that represent the full hereditary blueprint of a human being.

Chromosomes are composed of double strands of DNA nucleotides, each nucleotide made up of a sugar (deoxyribose), a phosphate group, and one of the four nitrogenous bases (adenine [A], thymine [T], cytosine [C], or guanine [G]). A strand of DNA is generated through bonds between the phosphate and the sugar groups that, once linked, form the outer backbone (figure, left). Each strand is paired to its complement (A with T and C with G) by weak hydrogen bonding between the paired bases, and the two strands are intertwined to form the double helix (figure, right). A chromosome is a single, genetically specific DNA molecule surrounded by large numbers of proteins that are involved in maintaining the structure of the chromosome and regulating its expression. DNA sequences encode protein structures by having each contiguous nucleotide triplet code for a single amino acid. In addition, DNA sequences code for the start and the stop of each protein, thereby defining the correct reading frame.
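The pairing rule and the triplet reading described above can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical, chosen only for illustration; the sketch assumes an unambiguous sequence written with the letters A, T, C, and G.

```python
# Complementary base pairing as stated in the text: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def triplets(strand: str) -> list[str]:
    """Split a strand into contiguous nucleotide triplets.

    Any incomplete tail (one or two leftover bases) is dropped,
    because only a full triplet can code for an amino acid.
    """
    usable = len(strand) - len(strand) % 3
    return [strand[i:i + 3] for i in range(0, usable, 3)]


if __name__ == "__main__":
    example = "ATGGCCTA"  # hypothetical sequence
    print(complement(example))
    print(triplets(example))
```

Shifting the starting position by one base would yield a completely different set of triplets, which is why the start signal matters for establishing the reading frame.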

DNA is expressed as its corresponding protein through RNA: RNA copies the code from DNA, and the ribosome then reads the RNA's nucleotide triplet codons to assemble the protein. Even though the order of amino acids in a protein is determined by DNA, by way of the RNA nucleotide code, substantial posttranslational processing of the protein takes place locally within the cell and eventually influences the function of the protein. Only a small fraction of the genome is thought to code for proteins. The known coding sequences for many proteins (exons) are not contiguous but are interrupted at varying intervals by noncoding regions (introns) within the full gene. The function of the vast noncoding regions of DNA is not known, but these regions are expected to include the DNA sequences that regulate genetic expression.
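As a rough illustration of the flow from DNA to RNA to protein, the following Python sketch copies a DNA sequence into RNA, removes a noncoding stretch, and reads the remaining codons until a stop signal. It is a simplification under stated assumptions: the gene sequence, the exon coordinates, and the function names are hypothetical, and the codon table contains only the handful of codons used here rather than all 64.

```python
# Partial codon table (assumption: only the codons needed for this example).
# None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met",  # also serves as the start signal
    "GCC": "Ala",
    "AAA": "Lys",
    "UGG": "Trp",
    "UAA": None,
    "UAG": None,
    "UGA": None,
}


def transcribe(coding_strand: str) -> str:
    """Copy the DNA code into RNA; RNA uses uracil (U) in place of thymine (T)."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments, discarding the introns between them.

    Each exon is a (start, end) pair of 0-based, end-exclusive positions.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


def translate(mrna: str) -> list[str]:
    """Read triplet codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start signal, so no reading frame
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in the partial table")
        residue = CODON_TABLE[codon]
        if residue is None:  # stop codon ends the protein
            break
        protein.append(residue)
    return protein


if __name__ == "__main__":
    # Hypothetical gene: exon, intron, exon.
    gene = "ATGGCC" + "GTAAGT" + "AAATGGTAA"
    exons = [(0, 6), (12, 21)]
    mature = splice(transcribe(gene), exons)
    print(mature)
    print(translate(mature))
```

The sketch stops at the amino acid sequence; the posttranslational processing mentioned above is not modeled.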

Address reprint requests to Dr. Tamminga, Maryland Psychiatric Research Center, University of Maryland, P.O. Box 21247, Baltimore, MD 21228. Image courtesy of Paul Thiessen of Custom Chemical Graphics. Additional DNA graphic images created by Mr. Thiessen can be viewed at www.ChemicalGraphics.com/paul/DNA.html.

Figure

The image on the left depicts a standard CPK (Corey-Pauling-Koltun) space-filling model, colored by element, in which the flat base pairs are stacked in the center of the strand, perpendicular to the view, with the backbone making a right-handed spiral around the central, vertical axis. The same atomic positions and balls colored by element are shown in the image on the right, with “sticks” conveying additional information about the nitrogenous bases (purple=adenine, green=thymine, yellow=guanine, and cyan=cytosine).