A massive consortium of researchers, led in part by local scientists, has taken the next step after researchers mapped the human genome, compiling an encyclopedia that illuminates how the vast majority of the 3 billion building blocks of human DNA work.
When the genome, the blueprint of a person, was first deciphered 11 years ago, scientists were faced with a conundrum: only a tiny fraction was made up of genes, the stretches that carried instructions to make proteins that gave rise to inherited traits, such as having blue eyes or black hair. The rest was called “junk DNA.”
The raft of publications being released in top scientific journals Wednesday should permanently change the meaning of “junk.” Hundreds of scientists from 30 institutions elucidated the functions of 80 percent of the genome, finding regulatory elements that act like switches, determining which genes are “on” or turning their volume up or down.
The information, gained through more than 1,600 experiments on cells from 147 types of tissue, could help explain the causes of human disease. Many studies that scan the genome for changes associated with common illnesses such as diabetes or cardiovascular disease have highlighted areas that contain no genes but may be important in regulating genes.
“This is Google Maps,” said Eric Lander, director of the Broad Institute, a genomic research center in Cambridge that participated in the new project, called ENCODE for Encyclopedia of DNA Elements. “The human genome project gave us the picture of the whole human genome” like a satellite image, but “it’s not immediately obvious: where are the cities, where are the pizzerias, where are the coffee shops, where are the highways, what has traffic.”
The new data, he said, give scientists the ability to understand how the genome works and begin to unravel human disease.
The effort does not have a single, simple finding, but provides a resource that will be useful in making sense of genetic information. It also helps solve a puzzle that first emerged when the human genome was sequenced: why so few genes?
“When the genome was published, people had all kinds of speculations about how many genes there are in the human genome; the number thrown around was 100,000 genes,” said Zhiping Weng, director of the program in bioinformatics and integrative biology at the University of Massachusetts Medical School, who will play a leading role in the next phase of the project. When it became clear it was “25 to 30,000 genes, a lot of people are very upset, and why? Because the fly has 20,000 genes, the worm has 20,000 genes, and are we just bigger? What exactly makes us us? It’s how our genes are regulated.”
Weng will lead a four-year, $8 million grant to continue analyzing and integrating the vast amounts of information as scientists work out the function of the remainder of the genome and apply the techniques they have developed to understand the genome more broadly.
Job Dekker, codirector of the program in systems biology at UMass Medical School, who led one of the papers published Wednesday in the journal Nature, has developed a tool to map the genome in three dimensions, showing where regulatory elements touch each other.
His laboratory is already beginning to use the data to better understand a deadly inherited disease, cystic fibrosis. Although the mutations that cause that disease are known, his laboratory is looking at the regulatory elements that lie outside the gene and control it. Knowing how genes are regulated could turn out to be as important as knowing which ones are there.
“You can compare it to building a house,” Dekker said. “With the genome sequence you have the outline of the house, you know there’s a house. Now, we know what part is actually a bathroom, which part is a living room, which part is a kitchen. So in this kitchen where actually is the stove and how is it connected, how do you turn it on? It’s those details you really have to know.”
Carolyn Y. Johnson can be reached at email@example.com. Follow her on Twitter @carolynyjohnson.