High-Throughput Computing and the Shape of DNA to Come

First Posted: Aug 15, 2013 04:43 PM EDT

Could we inherit genetic behavioral changes without changes to our genetic code? Could epigenetic changes that lead to disease be reversed with a drug? Could manipulation of epigenetic processes in cells lead to new therapies? In the relatively new field of epigenetics, discoveries are on the horizon, particularly as advances in modeling and simulation enable researchers to ask ever more detailed questions about the behavior of genes.

“Now is an exciting time across many areas of research for the field of epigenetics,” says Gordon Freeman, a graduate student in the de Pablo Research Group at the University of Wisconsin–Madison (UWM), US. “Where genetics actually looks at the code in our DNA – the As, Cs, Gs, Ts that make up our genome – epigenetics looks at everything else in a cell that also interacts with and affects DNA and the expression of the genes contained therein.”

Freeman’s main focus is on the interactions between DNA and proteins. “I focus, in particular, on how the sequence of As, Cs, Gs, Ts changes the behavior of the molecule, and in turn how that change in behavior causes different DNA sequences to interact with proteins in different ways,” explains Freeman.

The primary question is whether different sequences, with only subtle differences between them, bind to proteins in the same way. Freeman and his colleagues are developing better models to answer this question. “There is a lot you can do with very simple models, but when we started three years ago we weren't getting results that were consistent with experiments,” notes Freeman. “We were coming up short in benchmarking our results.”

The complexity of the systems Freeman and colleagues study is always increasing. Capturing the important physics of interactions adds layers of complexity that can be computationally expensive. “For us, computational resources had always been a challenge.”

“Now on the Open Science Grid (OSG), we can have a few thousand simulations running on opportunistic resources all over the nation, including at UWM.” High-throughput computing provides Freeman and his colleagues with a trove of data they can use to inform their research and build better models. “Without the hundreds of thousands of hours of compute time, our research would be at a standstill.”

The research team is assisted by the Center for High Throughput Computing (CHTC) at UWM, where thousands of multi-core machines use special software – HTCondor – to run thousands of small jobs. CHTC provides access to some of the largest collections of resources at UWM, as well as on OSG.

There are a lot of competing hypotheses about what drives the interactions between DNA and proteins, and a lot of computational and experimental evidence that supports a number of different models. Freeman’s research suggests the shape of DNA matters. “For example, if you change the sequence, you can subtly change the physical dimensions of the DNA,” he says. This could enable Freeman to come up with design rules that determine which attributes of DNA change when the sequence is changed.

Freeman’s research also points to the possibility of designing highly specific proteins that could go through the entire genome, find a specific sequence of DNA, and bind at that particular position. Many naturally occurring enzymes already do this, and biologists and engineers use them frequently, despite being limited to those that occur naturally. “It would be very powerful to be able to design your own enzyme that goes to any specific sequence you want and modifies it in some beneficial way,” says Freeman.

“Another application is a molecule designed to have a high degree of specificity and bind only to a particular DNA sequence. And, if you could bind proteins very strongly to DNA, decreasing the expression of some gene could be possible. This could lead to therapeutic devices that bind very strongly in a cancer cell, for example, to genes that are responsible for uncontrolled cell growth that results in tumors.”

High-throughput computing and resources like OSG are critical in facilitating these and other discoveries. “I think moving to high-throughput computing is tough for some researchers because naturally they already have their workflow organized to streamline things, making their daily operations easy and straightforward,” says Freeman. “But my recommendation is to be willing to change your workflow in any way you have to because high-throughput resources are just too big and too powerful to overlook.”

Ultimately, Freeman’s model can inform his experimental counterparts about the fundamental physics and mechanisms driving the interactions between DNA and proteins. Together with experimental data, this generates a very powerful and complete picture of the physics behind what is observed. -- by Amber Harmon, © iSGTW
