The GoD of B-cells

I took my first immunology class at UCSD in the spring of 2004. I've always been interested in signaling (how cells take information from the outside and translate it to the inside), but the subject matter of this class was set to disappoint - in terms of signaling, it more or less stopped at the outer membrane of cells. Looking back, I can see that a subject as vast as immunology has to cut some corners in a 10-week course, but early that quarter I was a bit frustrated. Then, just before the first midterm, we started learning about one of the most bizarre cell behaviors I've ever encountered, the thing that would get me hooked on immunology and make it my passion. This is the molecular magic that can generate a functionally limitless number of different genes, allowing B-cells to make antibodies that recognize almost any chemical structure that has ever existed or will ever exist. It's so important for the immune system that its acronym amongst immunologists is GoD - Generation of Diversity.

GoD is made possible by a complex series of steps that, for reasons that will soon become apparent, we call V(D)J recombination. But in order to explain why V(D)J is so amazing, we have to take a few steps back and look at some of the underlying principles of biology: DNA, genes and proteins.

For all living things that we know of (except a few types of virus), genes are encoded by DNA. The word gene gets thrown around a lot, but the simplest way to think about a gene is as a code that tells a cell to make a particular protein. Antibodies are proteins, so you would be correct to assume that there are genes that code for antibodies written on the DNA within your cells. The trouble is, humans have between 20,000 and 30,000 genes, but the average person is capable of making about 10 billion (with a b) different types of antibodies that all bind unique molecular shapes. There simply isn't enough room in your genome to fit a gene for every antibody you can produce. What in GoD's name is going on here?

It didn't take the early immunologists long to figure out that this was a problem, or to start looking for solutions. The first thing that became apparent is that all of those individual, unique antibodies are actually far more similar than they are different. As I mentioned a couple of weeks ago, antibodies are shaped like a Y, and the ends of the two arms of the Y are the business end that sticks to things. Within a particular class of antibodies, the bottoms (or butts, according to Abbie) are remarkably similar within individuals, and even across the population.

So, one solution to the not-enough-room-in-the-DNA problem is to have one copy of a code for all of the stuff that remains constant, and a bunch of different copies of the code for the part that's variable.

Antibody: gene to protein (simple)

This is sort of how it works, but it's a bit more complicated. Even if you vary only a tiny part of the gene, getting to 10 billion would still take up more space than the entire rest of your genome. Instead, the gene that codes for the variable part of the antibody protein is broken up into a bunch of different segments. The end of the gene still codes for the constant region, but the variable region is split up into three segments, and there are multiple versions of each of these segments. In humans, there are about 100 versions of the V (variable) segment, 30 of the D (diversity) segment and 6 of the J (joining) segment [1]. During the development of the B-cell, one V, one D and one J are brought together to form a complete variable region.
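To put a rough number on the "more space than the entire rest of your genome" claim, here is a quick back-of-envelope sketch in Python. The 10 billion figure comes from the text above; the size of a variable region (about 110 amino acids, so about 330 bases) and the size of one copy of the human genome (about 3.1 billion base pairs) are round-number assumptions added for illustration, not figures from the cited reference.

```python
# Back-of-envelope check. The two sizes marked "assumption" are round numbers
# chosen for illustration, not values taken from the post or its reference.
ANTIBODY_TYPES = 10_000_000_000     # "about 10 billion" from the post
VARIABLE_REGION_BP = 330            # assumption: ~110 amino acids x 3 bases each
HAPLOID_GENOME_BP = 3_100_000_000   # assumption: ~3.1 billion base pairs


def bases_needed(n_antibodies, bp_per_variable_region):
    """DNA needed if every antibody had its own hard-coded variable region."""
    return n_antibodies * bp_per_variable_region


needed = bases_needed(ANTIBODY_TYPES, VARIABLE_REGION_BP)
genomes_worth = needed / HAPLOID_GENOME_BP

print(f"bases needed: {needed:,}")
print(f"genomes' worth of DNA: {genomes_worth:,.0f}")
```

Under those assumptions, hard-coding every variable region would take DNA on the order of a thousand genomes, which is why splitting the gene into reusable segments matters.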

Antibody gene to protein (with VDJ)

In order for B-cells to accomplish this, they actually break apart their DNA. An enzyme randomly picks a D and a J, cuts across both strands of DNA and then stitches them back together. Then, the same enzyme grabs a random V as well as the new DJ segment, cuts the DNA again and mashes V to DJ.
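The two-step cut-and-paste described above can be sketched as a toy simulation. This is only a cartoon, under some loud assumptions: the locus is modeled as a Python list of labels in the order V, D, J, constant (C); segments are picked uniformly at random, which is a simplification; and the function names `build_locus`, `join` and `recombine` are made up for this sketch.

```python
import random


def build_locus(n_v=100, n_d=30, n_j=6):
    """Germline heavy-chain locus as an ordered list of segment labels.

    Default counts are the approximate numbers quoted in the post.
    """
    return ([f"V{i}" for i in range(1, n_v + 1)]
            + [f"D{i}" for i in range(1, n_d + 1)]
            + [f"J{i}" for i in range(1, n_j + 1)]
            + ["C"])  # constant region, left untouched


def join(locus, left, right):
    """Cut at two segments and stitch them together, dropping what lay between."""
    i, j = locus.index(left), locus.index(right)
    return locus[:i + 1] + locus[j:]


def recombine(locus, rng=random):
    """Two-step rearrangement: D to J first, then V to the new DJ."""
    d = rng.choice([s for s in locus if s.startswith("D")])
    j = rng.choice([s for s in locus if s.startswith("J")])
    locus = join(locus, d, j)          # step 1: D-J joining

    v = rng.choice([s for s in locus if s.startswith("V")])
    locus = join(locus, v, d)          # step 2: V-DJ joining

    start = locus.index(v)
    return locus, tuple(locus[start:start + 3])


rearranged, vdj = recombine(build_locus())
print(vdj)              # one V, one D and one J label; varies from run to run
print(len(rearranged))  # shorter than the germline list: the DNA in between is gone
```

Note that the toy version throws away everything between the chosen segments, which mirrors the point above: the cell really is deleting chunks of its own DNA.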

This is astonishing - double-strand breaks are incredibly dangerous, but B-cells in your body are making them all the time. As Abbie mentioned with regard to class-switching:

This is an abomination. This should not happen (HELLO??? CANCER!!! We have a million safe-guards in our DNA to kill cells that start doing crazy stuff like CUT UP THEIR OWN DNA!!!), but in this case, it does, for a very good reason.

The reason for doing this at the DNA level is a topic for another post, but we're not through with GoD just yet. The algebra-obsessed among you might have noticed that 100 x 30 x 6 does not equal 10 billion. This is true, but I've left out a few things. First, an antibody isn't just a single protein, it's actually four proteins stuck together - two copies of a heavy chain, and two copies of a light chain.

The variable region of the light chain is also spliced together (though it only has V and J segments), and it's the combination of heavy and light chain variable regions that ends up sticking to the antibody's target. Plus, there are two different light chain genes, either of which can be combined with the heavy chain. On top of that, you have two copies of each of these genes (one from mom and one from dad). You can't get a V from mom pairing with a DJ from dad, but you can get a light chain from one and a heavy chain from the other. Even so, if you do all the math, this combinatorial diversity will only get you to a few hundred thousand possible antibodies - a far cry from the true extent of the average person's antibody repertoire.
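For the algebra-obsessed, the arithmetic so far can be checked in a few lines of Python. The heavy-chain segment counts and the 10 billion target come straight from the text; the post gives no light-chain segment counts, so this sketch does not invent any and instead uses 300,000 as an assumed stand-in for "a few hundred thousand".

```python
from math import prod

HEAVY_SEGMENTS = {"V": 100, "D": 30, "J": 6}  # approximate counts quoted in the post
TARGET_REPERTOIRE = 10_000_000_000            # "about 10 billion" from the post
COMBINATORIAL_TOTAL = 300_000                 # assumption: stand-in for "a few hundred thousand"


def segment_combinations(counts):
    """Ways to pick exactly one segment of each type."""
    return prod(counts.values())


def missing_factor(target, achieved):
    """How many times larger the target is than what has been reached so far."""
    return target / achieved


heavy = segment_combinations(HEAVY_SEGMENTS)
print(f"heavy-chain V x D x J combinations: {heavy:,}")
print(f"gap after heavy chain alone: {missing_factor(TARGET_REPERTOIRE, heavy):,.0f}x")
print(f"gap after all combinatorial diversity: "
      f"{missing_factor(TARGET_REPERTOIRE, COMBINATORIAL_TOTAL):,.0f}x")
```

Whatever the exact stand-in value, the gap that remains is a factor in the tens of thousands, so shuffling segments alone cannot explain the repertoire.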

The final piece of GoD is what we call "junctional diversity." When the DNA is severed during V(D)J recombination, it's not always a clean cut, and some extra nucleotides must be added to fill the gap. In addition, there's an enzyme whose only job is to add random nucleotides to the junction. This randomness can be extreme, and it's here that GoD manages to reach the 10 billion mark. This is also the reason that every individual has their own set of antibodies. The inherent randomness, from the selection of V's, D's and J's, to the jagged cuts, to the randomly inserted nucleotides, means that not even identical twins will have the same repertoire, or even similar repertoires.
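Junctional diversity can be sketched in the same toy style. The assumptions here are loud ones: the segment sequences are made-up strings, the inserted nucleotides are drawn uniformly at random and written in lowercase so they stand out, and the cap of 6 inserted nucleotides per junction is an arbitrary illustrative number, not a measured one. The count at the end covers distinct DNA-level inserts only; it ignores the jagged-cut trimming and the fact that many such sequences would not yield a working protein.

```python
import random

NUCLEOTIDES = "acgt"  # lowercase so inserted bases stand out from the segments


def random_insert(max_len, rng=random):
    """Random run of 0..max_len nucleotides, standing in for junctional additions."""
    n = rng.randint(0, max_len)
    return "".join(rng.choice(NUCLEOTIDES) for _ in range(n))


def join_with_junctions(v, d, j, max_len=6, rng=random):
    """Join V, D and J, adding random nucleotides at both junctions."""
    return v + random_insert(max_len, rng) + d + random_insert(max_len, rng) + j


def possible_inserts(max_len):
    """Number of distinct inserts of length 0..max_len (4 choices per position)."""
    return sum(4 ** k for k in range(max_len + 1))


# Made-up toy segments, not real immunoglobulin sequences.
V, D, J = "TGTGCA", "GGTAC", "CTTTGA"

for _ in range(3):
    print(join_with_junctions(V, D, J))  # same V, D and J, different junctions each time

per_junction = possible_inserts(6)
print(f"distinct inserts per junction: {per_junction:,}")
print(f"across two junctions: {per_junction ** 2:,}")
```

Even with that small cap, a single V-D-J choice fans out into millions of distinct junction sequences, which is the kind of multiplier needed to close the gap to 10 billion.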


[1] Matsuda and Honjo, "Organization of the Human Immunoglobulin Heavy-Chain Locus," Advances in Immunology, Volume 62, 1996, Pages 1-29.