What Is Bioinformatics?: The Protein Folding Problem

First, if you've never had the opportunity to visit xkcd, I highly suggest you do it. It's a wonderful website. My personal favorite is the What If? feature. Read one of them; you'll like it. The comic relevant to this post is:

If you're unfamiliar with the protein folding problem, let me briefly set the stage. Imagine a protein made of 100 amino acids linked together in a chain. To work properly, the protein has to figure out where to put each of these amino acids relative to the others. (As an aside, proteins that fail to fold properly are implicated in many horrible diseases.) For simplicity, let's also imagine that each of the amino acids can take one of 2 positions. This implies that the protein as a whole can take 2¹⁰⁰ different shapes. This may or may not seem like a lot of choices to you. Try putting 2¹⁰⁰ into WolframAlpha. It works out to be 1267650600228229401496703205376 different shapes, and only one of them is right.
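If you'd rather not leave the page for WolframAlpha, a couple of lines of Python give the same number, since Python integers don't overflow. This is just a sketch of the counting argument above; the function name `count_conformations` is my own and not from any library.

```python
def count_conformations(n_residues, positions_per_residue=2):
    """Number of shapes a chain can take if every amino acid
    independently picks one of `positions_per_residue` positions."""
    return positions_per_residue ** n_residues


total = count_conformations(100)
print(total)           # 1267650600228229401496703205376
print(f"{total:.3e}")  # same number in scientific notation
```

Changing `positions_per_residue` to 3 shows how quickly things get worse when each amino acid has even one more choice.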

To think about this number a different way, consider having 100 computers that will each check a trillion of these different possibilities a second. That's pretty good, right? Actually, it's not even close. It's going to take you roughly 400,000,000 years to look at all the possibilities. It's absolutely impossible to fold a protein this way. It's not how nature does it, and it isn't how we attempt to do it either. Instead, the general hypothesis is that as the protein begins to fold into a favorable/proper state, it lowers its energy, which basically crosses certain possibilities off the list without even having to check them. The usual analogy is a folding funnel:

So, while all points on the funnel are technically possible, the protein is almost always just going to fall down the funnel. When scientists try to fold proteins on a computer, they use this funnel idea to check only some of the possible conformations. But they check intelligently, mostly looking at the conformations that have a high chance of being the natural shape of the protein.
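To put rough numbers on the contrast between checking everything and falling down a funnel, here is a minimal sketch in Python. It is a toy, not how real folding software scores a structure: I'm assuming the "energy" is simply the number of amino acids that are not yet in their native position, which gives a perfectly smooth funnel. The search is a Metropolis Monte Carlo walk (flip one amino acid at a time, always accept downhill moves, occasionally accept uphill ones). The function names, the `kT` value, and the seed are all made up for illustration.

```python
import math
import random

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_years(n_residues=100, states=2,
                      machines=100, checks_per_second=10**12):
    """Years needed to check every conformation exhaustively."""
    total = states ** n_residues
    return total / (machines * checks_per_second) / SECONDS_PER_YEAR


def funnel_search(n_residues=100, kT=0.2, max_steps=100_000, seed=0):
    """Metropolis walk on a toy funnel; returns (flips tried, final energy)."""
    rng = random.Random(seed)
    # Assumption: the native shape is known and energy counts mismatches.
    native = [rng.randint(0, 1) for _ in range(n_residues)]
    current = [rng.randint(0, 1) for _ in range(n_residues)]
    energy = sum(c != n for c, n in zip(current, native))

    steps = 0
    while energy > 0 and steps < max_steps:
        i = rng.randrange(n_residues)
        # Flipping a correct residue raises the energy by 1,
        # flipping a wrong one lowers it by 1.
        delta = 1 if current[i] == native[i] else -1
        # Metropolis criterion: downhill always, uphill sometimes.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current[i] ^= 1
            energy += delta
        steps += 1
    return steps, energy


if __name__ == "__main__":
    print(f"brute force: about {brute_force_years():.2e} years")
    steps, energy = funnel_search()
    print(f"funnel walk: {steps} flips tried, final energy {energy}")
```

On this idealized funnel the walk reaches the native shape after a tiny number of flips compared with the 2¹⁰⁰ conformations an exhaustive check would need. Real energy landscapes are bumpy rather than smooth, so real searches are much harder than this toy suggests.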

The specifics of how to implement such a search on a computer can get as technical as you like. But a teaching tool called PyRosetta has been made that allows students like me (and you?) to get a pretty good idea of what's going on and what tools we have available to us. In my next post I'll show you a very simple script, written for class, that folds a short peptide.

Until next time, cheers.

Thursday, February 5, 2015
