Whole Genome Approaches to Complex Kidney Disease
February 11-12, 2012 Conference Videos

Population Genetics to Personalized Medicine: An Icelandic Saga
Kari Stefansson, deCode, Iceland

Video Transcript

00:00:00,000 --> 00:00:04,000
JEFFREY KOPP: So, we are very pleased to have as our keynote speaker Dr. Kári Stefánsson, who actually has been a frequent

00:00:04,000 --> 00:00:13,733
visitor to NIH over the years. I know I’ve heard him speak in the last two or three years. He received his M.D. from the University of Iceland

00:00:13,733 --> 00:00:20,666
and then trained in neurology and neuropathology at the University of Chicago and then afterwards was at Harvard for a number of

00:00:20,666 --> 00:00:31,232
years in these fields. He then returned to Iceland to study the genetics of multiple sclerosis. In fact, he founded deCODE genetics in 1996 to study the

00:00:31,233 --> 00:00:40,699
genetics of just about everything. He’s received numerous awards, including from the European Society of Human Genetics and the Anders Jahre

00:00:40,700 --> 00:00:48,633
Award and he was on the Time 100 list of “People Who Shape Our World” in 2007. He’s agreed to give our keynote address, addressing

00:00:48,633 --> 00:00:57,533
topics at the intersection of medical genetics, population genetics and public health, exactly what this conference is all about. Thank you,

00:00:57,533 --> 00:01:04,033
Dr. Stefánsson. KÁRI STEFÁNSSON: I thank the organizers for

00:01:04,033 --> 00:01:14,466
inviting me to come here and I will tell you a little bit about how we have gone about doing genetics in Iceland, and I will focus more on our

00:01:14,466 --> 00:01:23,932
search for rare variants coming out of whole genome sequencing, but I will be relatively light on the application of genetics in healthcare because

00:01:23,933 --> 00:01:35,466
we simply haven’t done all that much with it. But I think it is terribly important when we think about human genetics, is that basically all of the

00:01:35,466 --> 00:01:49,132
diversity in the biosphere is explained in a large way by diversity in the sequence of A, T, C and G. So, basically there is an awful lot of the secret to

00:01:49,133 --> 00:02:00,299
life in general that can be read out of this miraculous molecule that we call DNA, and how much, if we just focus on how much can we

00:02:00,300 --> 00:02:11,600
explain on the basis of the diversity in the sequence. And I think basically these identical twins—we read about it in the newspaper last

00:02:11,600 --> 00:02:21,266
year—that they died within hours of each other at the age of 86 from the same disease. And you will then say that’s not all because of their

00:02:21,266 --> 00:02:30,299
genetics. These guys were in the same jobs, they dressed in the same clothes, etc., so perhaps there was a large contribution by a

00:02:30,300 --> 00:02:39,066
shared environment, and my response to that, yes of course, there was a large contribution by shared environment but actually the environment

00:02:39,066 --> 00:02:49,666
they shared was highly genetically…was highly inherited, and I will end my presentation by presenting you with arguments that I believe that

00:02:49,666 --> 00:02:59,332
there’s a large genetic component to the environmental component of the risk of common diseases. But basically, I think that when we think

00:02:59,333 --> 00:03:08,366
about human genetics, when we think about genetics and the discovery coming out of genetics and then go to the question of “how do

00:03:08,366 --> 00:03:19,566
we use this in a clinical practice?” Basically, the use of genetics in diagnostics versus the use of genetics in discovery is basically the mirror

00:03:19,566 --> 00:03:28,632
image of each other. When we are working on the discovery we are trying to figure out what it is that characterizes the group, what people who

00:03:28,633 --> 00:03:36,499
have a particular disease share. When we are using genetics in diagnostics, we are trying to figure out into what group does the individual fit? And actually,

00:03:36,500 --> 00:03:44,566
in the end, we will be using the same software systems, we will be using the same analytical methods in both instances, and I will come back

00:03:44,566 --> 00:03:55,099
to that again in this presentation. When we are trying to make discovery in human genetics, the kinds of human genetics that I work from, we are

00:03:55,100 --> 00:04:02,500
basically dealing with two datasets. We are dealing with datasets on diversity and the sequence of the human genome and we are

00:04:02,500 --> 00:04:11,833
dealing with datasets on diversity in phenotypes, and we are trying to find non-chance association by juxtaposing the two datasets. And when I

00:04:11,833 --> 00:04:19,433
was in medical school, which unfortunately was not yesterday, the only genetic diseases we were taught about were the ones with this

00:04:19,433 --> 00:04:26,899
relationship which is very simple. You have a mutation, you develop a disease, you don’t have a mutation, and you won’t. These were the

00:04:26,900 --> 00:04:35,233
Mendelian diseases. And actually, coming out of the sequencing, the whole genome sequencing that we are going through now, we are drifting

00:04:35,233 --> 00:04:45,699
back towards this relationship. We are finding out that what we call common complex diseases is in many ways a collection of a very large number of

00:04:45,700 --> 00:04:55,766
Mendelian, let’s call them Mendelian phenocopies, what we have thought about as these common complex diseases. But what we saw mostly

00:04:55,766 --> 00:05:04,632
coming out of what we call the GWAS era, although mind you, that we are continuing to do genomic associations just with a little bit rarer

00:05:04,633 --> 00:05:13,033
variants. But looking at the common variants, this is pretty much what we have been seeing, a large number of variants affecting one phenotype, but

00:05:13,033 --> 00:05:24,566
we have also been seeing this, one mutation affects many phenotypes. This has been particularly the case in cancer, where one

00:05:24,566 --> 00:05:34,299
mutation can confer risk for many cancers, but this is also fairly common in the case of diseases of the brain. One and the same mutation confers

00:05:34,300 --> 00:05:43,566
risk of schizophrenia, autism, ADHD, and epilepsy, and it’s actually very, very interesting when we begin to speculate why that should be,

00:05:43,566 --> 00:05:54,766
and I may have time to come to that also a little bit later. We have a tendency to divide the variants in the sequence that affect the risk of common

00:05:54,766 --> 00:06:02,566
diseases into the common and the rare, like these are two distinct categories, and it is true that we have been finding both common, we have been

00:06:02,566 --> 00:06:13,766
finding rare variants, and there is a little bit of…there is a relationship between the size of the effect and how common the variants are, and it is

00:06:13,766 --> 00:06:21,966
probably biologically important in the case of the common variants because we are not going to see a lot of very common variants conferring

00:06:21,966 --> 00:06:30,899
very large risk because there is inevitably negative selection against them, but I’m convinced that there are rare variants of small effect but we

00:06:30,900 --> 00:06:43,166
don’t find them because we are not powered to do so. But I think that the meaningful separation, the meaningful dichotomy of these variants lies in

00:06:43,166 --> 00:06:52,366
the following. We have these variants, the common variants, the variants that basically create the normal human diversity that produce a

00:06:52,366 --> 00:06:59,166
normal distribution curve of physiologic function, and if you’re at one end of the curve you’re at risk, if you’re at the other end of the curve you

00:06:59,166 --> 00:07:10,166
are somewhat protected, and it is interesting that the definition of normal distribution is that it is a measurement of this under the influence of many

00:07:10,166 --> 00:07:22,332
factors, each one of them equally likely or almost equally likely to add and subtract, which means that they’re common and the nature of the

00:07:22,333 --> 00:07:32,133
variants that do this must be fairly similar. But then you have the rare variants that disrupt the same physiological function, all right? And I will

00:07:32,133 --> 00:07:40,666
give you one example where we have a common variant and a rare variant in the same gene: one of them affects the physiological function, the

00:07:40,666 --> 00:07:54,432
other disrupts the same. I cannot resist. You see, I’m putting up this slide because I am a vain man and I like to thump my own chest, and this is just

00:07:54,433 --> 00:08:08,766
a list of phenotypes where we at deCODE have found common variants affecting the risk of these phenotypes and we’re the first ones to do so,

00:08:08,766 --> 00:08:20,166
and this ranges from diseases like myocardial infarction to funny phenotypes like the love of crossword puzzles, and that is a terribly

00:08:20,166 --> 00:08:28,199
important phenotype because it shows that we can go all the way from subtle differences in the sequence of the human genome to a very

00:08:28,200 --> 00:08:39,633
complex human behavior and human feelings, and we can find unequivocal associations, terribly important, because the brain is just like the

00:08:39,633 --> 00:08:49,233
kidney; it’s just an organ. The kidney makes piss, the brain makes thoughts and emotions, and we can analyze all of these using human genetics.

00:08:49,233 --> 00:09:02,833
And let me give you just a couple of examples of the common variants where we have shown that they affect physiological function and

00:09:02,833 --> 00:09:14,699
through that they generate the risk of disease, and the thyroid cancer example is actually a fairly good one, and actually it is the most familial of

00:09:14,700 --> 00:09:27,966
cancers, and actually, we have pulled together, we have isolated now…this is an example of four variants here…five variants that we have

00:09:27,966 --> 00:09:38,699
isolated that actually affect the risk of thyroid cancer, and all of the risk variants are associated with decreased concentration of TSH.

00:09:38,700 --> 00:09:49,233
Remember, TSH, the way in which the endocrine function of the thyroid works is that you have the thyrotropin-releasing factor coming from the

00:09:49,233 --> 00:09:58,833
hypothalamus, stimulating the production and secretion of TSH. That then stimulates the thyroid and actually stimulates the differentiation of

00:09:58,833 --> 00:10:04,366
thyroid epithelium, and it is interesting. You see, we looked at about 1,000 cases of thyroid

00:10:04,366 --> 00:10:14,832
cancer, but we looked at about 15,000 individuals when it comes to level of TSH—normal people without any cancer—and we showed that all of

00:10:14,833 --> 00:10:23,533
these risk variants are associated with decreased concentration of TSH. So, our hypothesis is that the way in which these

00:10:23,533 --> 00:10:31,666
variants confer risk is that they lead to less secretion of TSH, which leads to less differentiation of the thyroid epithelium that then

00:10:31,666 --> 00:10:43,432
leads to thyroid cancer. The second example is actually a little bit different; I wanted to emphasize the importance. You see, we are

00:10:43,433 --> 00:10:54,399
working in Iceland always with the same population. So, the same individual plays a role in many studies, either as a patient or a control, but

00:10:54,400 --> 00:11:07,566
it is so important to be able to go again and again to the same genome sequence, you have genotypes you want, etc., and in this instance

00:11:07,566 --> 00:11:20,232
this is from work we did on uromodulin and chronic kidney disease, and actually, our contribution to that study is to demonstrate that

00:11:20,233 --> 00:11:30,599
the variants in the sequence that affect the risk or confer risk of chronic kidney disease do so through interactions with age and interaction with

00:11:30,600 --> 00:11:44,600
co-morbid conditions. So, the impact of the uromodulin variant on creatinine concentration is completely dependent on increasing age and the

00:11:44,600 --> 00:11:51,833
presence of co-morbid conditions. So, this is an example actually. The uromodulin variant is an example of variants in sequence that

00:11:51,833 --> 00:12:04,966
predispose to diabetic nephropathy, for example. But there is a relatively small percentage of the heritability that is accounted for

00:12:04,966 --> 00:12:13,366
by the common variants that have been discovered, and there are people in the various places of this world tearing their hair out in

00:12:13,366 --> 00:12:22,566
despair over the fact that these variants explain only a relatively small amount of the heritability. For example, an extraordinarily distinguished

00:12:22,566 --> 00:12:30,699
colleague of mine, Eric Lander, has come up to Iceland to sit down with us to see whether we can use the Icelandic population to explain where

00:12:30,700 --> 00:12:39,666
the missing heritability is, and he published a paper the other day in PNAS where he insists that he has figured out how to calculate how

00:12:39,666 --> 00:12:52,299
interactions between variants confer, explain, a large part of the missing heritability. The problem with that approach is that no one has been able

00:12:52,300 --> 00:13:06,466
to show this interaction. So, I could substitute these interactions that Eric is calculating out but cannot demonstrate by almost anything. I am

00:13:06,466 --> 00:13:15,532
convinced he is correct, that interactions play a role, but since we cannot demonstrate them it is really difficult to make use of a formula that

00:13:15,533 --> 00:13:25,166
includes them in the calculation of heritability. But there are things that definitely contribute, although they do not explain everything, and as I

00:13:25,166 --> 00:13:32,532
said, I’m convinced that Eric is right, that some are there; there are these interactions, we just haven’t been able to pull them out. One thing we

00:13:32,533 --> 00:13:42,399
have to look at is the mechanism of inheritance, and one of the things that we have been fortunate to be able to do is to figure out how to

00:13:42,400 --> 00:13:50,766
phase the entire genome. You know, when you sequence or you genotype you get out this [---] soup of variants coming from mothers and

00:13:50,766 --> 00:14:01,566
fathers, but it is extremely important to be able to string the variants into haplotypes, and usually you’ll need to have the genotypes of the parents

00:14:01,566 --> 00:14:11,866
and the proband to be able to phase, and you cannot do that if they are strictly heterozygous, but we put together a method that is based on our deep

00:14:11,866 --> 00:14:18,699
understanding of the Icelandic population structure. So, we simply used the next paternal and

00:14:18,700 --> 00:14:27,333
maternal relative who is homozygous for the stuff that you’re looking at, and what is more, if you use this overlapping segment we can

00:14:27,333 --> 00:14:36,333
actually convincingly demonstrate whether the haplotypes come from the mother or the father. So, every individual we genotype or we

00:14:36,333 --> 00:14:44,633
sequence, we can figure out what comes from the mother and what comes from the father, and why is that important? It is important because it is

00:14:44,633 --> 00:14:54,366
absolutely clear that in many instances it matters whether you inherit the variant from the mother or the father, and I’m just showing you an example

00:14:54,366 --> 00:15:05,732
coming out in one of my papers where we have variants that affect breast cancer, skin cancer,

00:15:05,733 --> 00:15:13,233
and Type II diabetes, and it depends on…the effect varies depending on whether it comes from the mother or the father, and actually, the

00:15:13,233 --> 00:15:22,433
most spectacular example is a variant on chromosome 11 that if you inherit it from the father it confers risk for Type II diabetes, if you

00:15:22,433 --> 00:15:32,999
inherit it from the mother it protects against the same. So this is one spectacular way in which nature is squeezing more information out of

00:15:33,000 --> 00:15:40,933
the genome. So, remember that it’s still another type of demonstration of the fact that sexual reproduction was not invented for our pleasure,

00:15:40,933 --> 00:15:55,066
but simply to increase human diversity. We have looked at this in many instances, and yes, the

00:15:55,066 --> 00:16:03,866
parental origin of variants affects things like human height, BMI, etc. and several diseases, but it is not the major player, all right? And in all

00:16:03,866 --> 00:16:16,032
instances where we have been able to figure out mechanism, the mechanism goes to differential methylation. But there are also rare variants, no

00:16:16,033 --> 00:16:27,099
question about it, that play an important role. In Iceland we have started a fairly large sequencing process. We are aspiring to sequence

00:16:27,100 --> 00:16:40,233
somewhere between 2,500 and 3,750 Icelandic genomes. We have now genotyped almost 120,000 people and this allows us, because of

00:16:40,233 --> 00:16:49,766
what we know about the structure, to impute the whole genome sequence in the entire population. And why can we do that? In sort of simple terms,

00:16:49,766 --> 00:16:59,532
because we can take any two random Icelanders and we can determine what they’ve inherited from a common ancestor, and if you take about a

00:16:59,533 --> 00:17:08,533
megabase of DNA we can find about 250-300 Icelanders who have inherited this from a common ancestor, so we only need to sequence

00:17:08,533 --> 00:17:19,499
it in one of them, and it allows us to impute variants down to a frequency of about .1%. Everyone else is struggling with imputing

00:17:19,500 --> 00:17:31,200
sequence variants; after 1 or 2 or 3 percent they cannot go farther down than that. We have now sequenced about 2,000 Icelanders to 10x; of

00:17:31,200 --> 00:17:42,233
them, 700 to 30x; and we are already beginning to get a lot of exciting stuff coming out of it. And there are some sort of basic things that we know

00:17:42,233 --> 00:17:52,433
about the Icelandic genome or the human genome: we now know that we get about 40-50 de novo mutations in each individual and evidence of 60

00:17:52,433 --> 00:18:07,699
recombinations. And actually, it is interesting when you take diseases like autism and schizophrenia. We know that the probability that

00:18:07,700 --> 00:18:16,633
a father will have a child with schizophrenia is dependent on age. For example, a 40-year-old father is 3 times more likely to conceive a

00:18:16,633 --> 00:18:28,333
schizophrenic child than a 20-year-old father, and this is, in many ways, interesting, you know, and I actually think that Pfizer should be punished just

00:18:28,333 --> 00:18:39,199
like Exxon for the oil spill. Just by marketing Viagra and extending the sex life of men, they are introducing a lot of new mutations into our

00:18:39,200 --> 00:18:52,933
population. And this is interesting when you look at all of these in the context of our data on de novo mutations, all right? Ninety-seven percent of

00:18:52,933 --> 00:19:05,533
diversity in new mutations—de novo mutations—is explained by paternal age. Ninety-seven percent of the diversity in de novo mutations is

00:19:05,533 --> 00:19:13,799
explained by paternal age. That’s absolutely astonishing, right? And we have more data that’s going to be…this is coming out of looking at 75

00:19:13,800 --> 00:19:25,100
trios, which is terribly interesting. But this brings us to sort of the new aspect of the genetics that we are doing because of the de novo mutations.
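
As a rough illustration of what "explained by paternal age" means statistically, the Python sketch below computes the fraction of variance (R^2) in a de novo mutation count that a straight-line fit on the father's age accounts for. The trio data are synthetic, and the baseline of 25 mutations, the slope of 2 mutations per year, and the noise level are assumptions chosen only for illustration; they are not deCODE's figures or method.

```python
import random


def r_squared(xs, ys):
    """Fraction of the variance in ys explained by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


def simulate_trios(n_trios=75, seed=0):
    """Synthetic (hypothetical) trios: father's age and de novo mutation count."""
    rng = random.Random(seed)
    ages = [rng.uniform(18, 45) for _ in range(n_trios)]
    # Assumed model: baseline of 25 mutations, 2 more per year of age, small noise.
    counts = [25 + 2.0 * (age - 18) + rng.gauss(0, 3) for age in ages]
    return ages, counts


ages, counts = simulate_trios()
print(f"Share of variance explained by paternal age: {r_squared(ages, counts):.2f}")
```

With little noise around the line, the value approaches 1; raising the noise term in `simulate_trios` pulls it down, which is the sense in which a figure such as 97 percent leaves little room for other factors.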

00:19:25,100 --> 00:19:32,600
We are not just looking at correlations between diversity in sequence and diversity in phenotype; we have to include the changes in

00:19:32,600 --> 00:19:44,500
diversity in the sequence, and actually, I think an awful lot of what we are missing today is due to the fact that we are not in a position, really in a

00:19:44,500 --> 00:19:59,166
position, to pick up de novo mutations. There’s still another interesting thing coming out. What we have is that we have now about 7,000 individuals

00:19:59,166 --> 00:20:11,232
who are homozygous for a truncating mutation in one of 500 genes. So basically there are, at least on the average, 12 individuals in our sample

00:20:11,233 --> 00:20:24,266
that have a knockout of one of these 500 genes, and we are now starting a process where we are phenotyping all of these guys. But this is sort

00:20:24,266 --> 00:20:35,332
of the list, not complete. This is the list that is now about three weeks old, but this is a list of sort of where we have pulled out rare variants with

00:20:35,333 --> 00:20:43,699
large effect, and the reason we can do this is that we can impute the rare variants where others cannot and there’s a founder effect for

00:20:43,700 --> 00:20:53,600
many of these traits in Iceland, and I can give you an example of the importance of the founder effect. When BRCA2 was discovered it was

00:20:53,600 --> 00:21:03,233
basically simultaneously in Utah and Cambridge, and in both places they were using Icelandic material, and there’s only one mutation in BRCA2

00:21:03,233 --> 00:21:17,166
in Iceland and that is a 5-base deletion out of exon 9. It has an allelic frequency of .4%. Myriad tells me that, now after having had the test

00:21:17,166 --> 00:21:32,599
ongoing for a couple of decades, they have found 6,300 mutations in BRCA2 in America and the combined allelic frequency is .1%. So, to be able

00:21:32,600 --> 00:21:42,133
to pick, if you were doing sequencing, if you compare the position we are in in Iceland versus what you have here, you would have to

00:21:42,133 --> 00:21:51,766
sequence a thousand times more people to be able to make the kind of discoveries that we are doing, if this indeed goes for all of these

00:21:51,766 --> 00:22:04,932
phenotypes. One of the things…I’m not going to go into details of this. I’m not going to discuss the kidney cancer stuff that we pulled out two or

00:22:04,933 --> 00:22:10,533
three weeks ago. I’m just going to go over a few of the things that we have already published. One of the first things that we published was on

00:22:10,533 --> 00:22:27,799
a variant that has also an allelic frequency of about .4%. It’s in BRIP1 which is one of the proteins interacting with BRCA1 and BRCA2. It

00:22:27,800 --> 00:22:39,133
confers an odds ratio of 8 for ovarian cancer. It is a frameshift mutation; it is a truncating mutation. We have found another truncating mutation in a

00:22:39,133 --> 00:22:48,866
Spanish sample, and this is an important discovery because ovarian cancer is one of the difficult cancers with a 5-year survival of about

00:22:48,866 --> 00:22:58,632
45% today. It’s very important to be able to diagnose it early. So, you can use this, probably, in the same kind of a way that people are using

00:22:58,633 --> 00:23:10,899
the BRCA test today, is to genotype the close relatives of those who are diagnosed with the cancer. A second one we pulled out was a rare

00:23:10,900 --> 00:23:18,766
variant, low-frequency variant that conferred risk of gout and has impact on uric acid concentration, and it’s actually fairly interesting

00:23:18,766 --> 00:23:25,699
because the impact on women and men is the same. The relative risk or odds ratio for men is much higher because men start off with a much

00:23:25,700 --> 00:23:36,866
higher concentration of uric acid, so they’re much closer to the saturating concentration that leads to precipitation of crystals in joint fluids. The one I

00:23:36,866 --> 00:23:46,899
want to just dwell on a little bit is one that confers an odds ratio of 12.5 for sick sinus syndrome, which is the most common cause of the

00:23:46,900 --> 00:24:01,933
placement of pacemakers in our society. The reason I want to dwell on it a little bit is that the mutation is in the myosin heavy chain 6 and we

00:24:01,933 --> 00:24:10,133
have previously found a common variant in that gene that affects heart rate. So, you have a common variant that affects heart rate, you have

00:24:10,133 --> 00:24:21,699
a rare variant that abolishes heart rate, which is terribly interesting. Another interesting aspect of this is that the mutation is in the myosin gene.

00:24:21,700 --> 00:24:31,533
What does myosin have to do with generation of heartbeat? We’re still puzzled after we have published this paper, but now a couple of weeks

00:24:31,533 --> 00:24:39,833
ago we found another mutation. It’s a mutation in the myosin light chain 4. There are eight individuals in our dataset who are homozygous

00:24:39,833 --> 00:24:50,933
for that mutation. All of them have early-onset atrial fibrillation before the age of 30 and have developed embolic stroke because of that. So,

00:24:50,933 --> 00:24:59,499
you have another example of a mutation in a sarcomere gene that seems to be very important for the generation and the maintenance of heart

00:24:59,500 --> 00:25:08,300
rate. So, this is an example of a gene that has variants that place you on the normal distribution curve and also mutations that

00:25:08,300 --> 00:25:19,433
disrupt that physiologic function. But that brings me to the brain, and I am an old neurologist and a neuropathologist and what I’m

00:25:19,433 --> 00:25:28,066
saying now is true. It may sound outrageous, but if I would take this lovely woman’s body and I would peel it away from her brain, put her brain

00:25:28,066 --> 00:25:35,632
in a bucket and keep it alive, that would be her, all right? If I would keep the rest of her body alive elsewhere, it would be something totally

00:25:35,633 --> 00:25:45,166
different, all right? So, we are nothing but a brain; the rest of it is only to get us from point A to point B, etc., so it is important to figure out how the

00:25:45,166 --> 00:25:53,932
brain works. And actually, I couldn’t resist because your administration over the past couple of days has put a lot of emphasis on

00:25:53,933 --> 00:26:04,633
Alzheimer’s disease, and they want to spend $300 million doing sequencing to find protective variants, etc. We have done this work, all right?

00:26:04,633 --> 00:26:14,633
It’s over. They’re too late. They are not going to be the first one to get a man to that moon, I can promise you that. So, we have pulled

00:26:14,633 --> 00:26:21,733
out, over the past four weeks, three variants that affect the risk of Alzheimer’s disease. One of them

00:26:21,733 --> 00:26:34,166
confers an odds ratio of 3, which is equivalent to APOE4. The second one, an odds ratio of 2.4 (that happens to be an inflammatory gene), and then

00:26:34,166 --> 00:26:47,732
we also found the variant that basically confers almost 100% protection against Alzheimer’s disease, and the first two have an allelic frequency of

00:26:47,733 --> 00:26:58,366
about…one of them 1.2, the other 1.05. The protective variant has an allelic frequency of about .5. What’s interesting, if you look at the

00:26:58,366 --> 00:27:10,132
protective variant, is that in the general population it has a frequency of .5%; in Alzheimer’s disease, .1%; and in people who are 85 years of age or

00:27:10,133 --> 00:27:21,699
cognitively intact at 80 years of age, it is found in .7%; and in cognitively intact over the age of 85, you find it in 1.1%. We have only found this in

00:27:21,700 --> 00:27:31,600
four cases of Alzheimer’s disease and they are really, really old individuals, and I basically question whether they had true Alzheimer’s disease. But

00:27:31,600 --> 00:27:41,566
what is interesting here is that if you take this variant and you look at it in people who are admitted to nursing homes and you look at their

00:27:41,566 --> 00:27:55,632
cognitive performance scale (so it’s a cognitive function measurement coming out of the Resident Assessment Instrument), as you can see if

00:27:55,633 --> 00:28:05,333
you look at it between the ages of 80 and 100, the red line is those who do not carry this variant, the blue line are those who carry this variant. So,

00:28:05,333 --> 00:28:15,733
basically the people with this variant—the normal people—they retain their cognitive abilities up to 90 years of age and do not catch up with the

00:28:15,733 --> 00:28:25,799
ones without this variant until they are 95. And this is how it looks if you take out the Alzheimer’s patients from the red line, basically. So, this is a

00:28:25,800 --> 00:28:34,700
variant that protects against Alzheimer’s disease, this is a variant that allows you to retain your cognitive function longer than the control

00:28:34,700 --> 00:28:49,800
population. So, this is an example of…you see, the thing is that when you’re considering what to do when it comes to investing in whole genome

00:28:49,800 --> 00:28:56,866
sequencing for the purpose of discovery, what matters is not how many people you sequence, what matters is what you know about the

00:28:56,866 --> 00:29:05,299
population structure, how well you can mine it, how well is it suited for making discoveries, and I can assure you, doing whole genome

00:29:05,300 --> 00:29:12,166
sequencing in a population like the Icelandic population, it’s a thousand times less expensive than if you were going to do it in a heterogeneous

00:29:12,166 --> 00:29:21,866
population. Make the discovery of the gene in a founder population, then go into the outbred population and do pooled sequencing of your

00:29:21,866 --> 00:29:33,832
samples, looking for variants there. It’s the most effective. It’s the least expensive way of doing it. One of the things we have done is, as I

00:29:33,833 --> 00:29:40,966
mentioned before, we have all of these human knockouts and we have all of the CNVs we have pulled out and we have actually about 46 CNVs

00:29:40,966 --> 00:29:49,266
that we have pulled out that we have shown to be under negative selection. How do we do that? We just count the number of children that people

00:29:49,266 --> 00:29:56,432
have who have this variant and show that it is a smaller number than the average in our society, and then we have gone and we have

00:29:56,433 --> 00:30:07,066
phenotyped these people who carry this variant. And it is interesting that if you look at the GWAS era, relatively little was discovered of common

00:30:07,066 --> 00:30:16,899
variants that affect the risk of brain diseases. Most of the variants that have an effect on the risk of brain disease are going to be rare variants and

00:30:16,900 --> 00:30:25,166
probably under negative selection because the brain is a terribly important reproductive organ, all right? We probably have defined the

00:30:25,166 --> 00:30:36,766
normal function of the brain relatively broadly, but we phenotyped these people extraordinarily thoroughly for cognitive function, and actually, a

00:30:36,766 --> 00:30:43,932
few of these variants confer risk of schizophrenia, which is actually the most human of all diseases; it’s a disease of thoughts and

00:30:43,933 --> 00:30:53,166
emotions, and thoughts and emotions are the functions that define us as a species and define us as individuals. We actually published a paper

00:30:53,166 --> 00:31:08,599
on CNVs that affect the risk of schizophrenia, and two of them confer an odds ratio in the neighborhood of 10, but the one on chromosome 15q11.2 confers

00:31:08,600 --> 00:31:17,300
only an odds ratio of 3. So, we have two of these that confer the very large odds ratio, the one in the middle confers a modest odds ratio,

00:31:17,300 --> 00:31:26,766
and this is interesting when you begin to look at the phenotyping of the species. So, these are not 100% penetrant variants, so we have looked at

00:31:26,766 --> 00:31:35,366
them in quote-unquote “normal people.” So, this is an example of the effect that these variants have on spatial working memory and what’s interesting

00:31:35,366 --> 00:31:53,066
here is that the variant that carries the smallest odds ratio, the 15q11.2, has the least effect on spatial working memory. The high odds ratio

00:31:53,066 --> 00:32:02,399
variants have much greater impact on spatial working memory. So, when you take out the variant of the small effect and you look at this

00:32:02,400 --> 00:32:12,066
together, these are normal people without the CNV, these are the ones with high-risk CNVs, and these are people with schizophrenia. So you

00:32:12,066 --> 00:32:21,032
see, when it comes to spatial working memory, the quote-unquote “normal carriers” of the CNVs fall in between the patients and the normal. So,

00:32:21,033 --> 00:32:30,866
this is an example of variants that affect a very important disease of the brain, that out of the context of this disease, affect the physiological

00:32:30,866 --> 00:32:44,432
function that the disease affects, all right? The variant on 15q11.2 is important for many reasons. It’s a fairly common one. It’s found in 1 out of 350

00:32:44,433 --> 00:32:56,133
individuals, so there are 20 million people in the world who carry this variant. This is a variant that affects…this is a CNV deletion and in the

00:32:56,133 --> 00:33:08,133
middle of that deletion is the gene that makes CYFIP1, which is cytoplasmic Fragile X interacting protein 1, and actually it turns out that

00:33:08,133 --> 00:33:20,066
the Fragile X protein forms a complex with two other proteins that serve the function of protecting RNA that is transported from the

00:33:20,066 --> 00:33:28,432
nucleus into the dendrites of the neurons; very important for synaptic function. So, this is a CNV that includes a very important thing that makes a

00:33:28,433 --> 00:33:39,999
very important protein for synaptic transmission, and actually, when we looked at people with this variant in our test of reading, it’s absolutely

00:33:40,000 --> 00:33:50,066
clear: the people with the deletion have a very, very specific learning disability that affects reading, whereas the controls and even people

00:33:50,066 --> 00:34:03,399
with duplication in this region performed much, much better, and when it comes to arithmetic, these guys also performed very poorly. So, this is

00:34:03,400 --> 00:34:14,233
an example of how you can use what you know about variants that affect the risk of diseases of the brain to explore the function of the brain. I

00:34:14,233 --> 00:34:24,633
cannot resist the temptation to show you an example of our duplication on chromosome 16p13.1, which we have described as affecting ADHD,

00:34:24,633 --> 00:34:36,299
and actually, it does not only affect ADHD, this is what comes out of our phenotyping. We have shown that people with this CNV have a longer

00:34:36,300 --> 00:34:46,533
arm span than the normal population; about 8 cm longer than expected for height, and actually women with this duplication also have menarche

00:34:46,533 --> 00:34:56,599
somewhat later than women without it. What is interesting here is that Michael Phelps, the greatest swimmer of all time, he is a poster child

00:34:56,600 --> 00:35:05,300
for the association of people with ADHD. He also has an arm span that is exactly 8 cm longer than expected for his height, so we have found the

00:35:05,300 --> 00:35:18,366
Michael Phelps mutation. That brings me back to the brothers, the identical twins, who died within hours of each other from the same disease, and I

00:35:18,366 --> 00:35:29,132
insist that they did so because they have exactly the same genome. They did so not because they were members of this elite profession, and

00:35:29,133 --> 00:35:39,133
actually, my arguments come from the following work. We have done a lot of work on the genetics of nicotine addiction, all right? And the

00:35:39,133 --> 00:35:51,566
original variants we found were on chromosome 15 and when we had found it, we went and we looked at whether they affected the risk of

00:35:51,566 --> 00:35:57,666
various smoking-related diseases, and we showed that this variant has very significant effect on the risk of lung cancer. In Iceland, no

00:35:57,666 --> 00:36:04,332
one really develops lung cancer unless they smoke—unless they smoke for decades; 14.5% of those who smoke for decades develop lung

00:36:04,333 --> 00:36:11,333
cancer. So, it’s a pure environmental disease, all right? However, you inherit compulsion to seek the environment that causes the disease. So,

00:36:11,333 --> 00:36:21,999
where lies the line of distinction between nature and nurture? I’m absolutely convinced that this applies to almost every common complex

00:36:22,000 --> 00:36:30,966
disease. I can give you an example which is relatively simple. We took all of the variants in the sequence that affect BMI—the ones that we

00:36:30,966 --> 00:36:40,032
have discovered and the ones that others have discovered—and we sort of regressed them on various addiction phenotypes, and indeed, they

00:36:40,033 --> 00:36:51,133
have very, very large impact as a group on the risk of smoking quantity, smoking initiation, alcoholism, coffee consumption, opiate addiction,

00:36:51,133 --> 00:37:03,133
etc. But it is interesting that if you look at individual variants like FTO, which has the largest impact on BMI of all of the BMI variants, it has no impact on

00:37:03,133 --> 00:37:14,199
any of the addiction phenotypes, in spite of the fact that it is expressed widely in the brain. So basically, this is what I’m going to tell you about,

00:37:14,200 --> 00:37:33,266
the discoveries, but I want to emphasize that the question that comes up when you think about this in the context of clinical tests, is that it is

00:37:33,266 --> 00:37:43,232
absolutely amazing how much you can pull out and how much there is of these very rare variants—and remember, rare is synonymous with

00:37:43,233 --> 00:37:52,099
recent—and an awful lot of these are de novo mutations and when you’re designing ways to make discoveries, you have to include also, you

00:37:52,100 --> 00:38:00,400
can pull out these de novo mutations. You have to select your population, not just throw an enormous amount of money—$300 million—to

00:38:00,400 --> 00:38:08,866
sequence patients with Alzheimer’s disease because it’s silly. You select your population well, make sure that you understand the population

00:38:08,866 --> 00:38:18,899
structure, preferentially select a population where you have a founder effect, and where you can do a reasonable amount of imputation

00:38:18,900 --> 00:38:27,100
where you can do this more cheaply, more effectively, more quickly. But anyway, go back to this. In the discovery, you’re trying to figure out

00:38:27,100 --> 00:38:37,400
what characterizes the group. When it comes to delivery of healthcare, you’re trying to figure out into which group people fit. The

00:38:37,400 --> 00:38:48,066
second aspect of this is complex when you begin to do the sequencing because we are talking in terms of, in many instances, de novo mutations,

00:38:48,066 --> 00:38:56,332
and there are going to be many of them private for many years to come. In the end, they will stop being private, but in the beginning they are

00:38:56,333 --> 00:39:04,066
private, so you will need pretty much sort of their [---] sight of what we have. One of the things we have at deCODE is that we have very, very good

00:39:04,066 --> 00:39:13,632
software systems into which we imparted our own analytical algorithm, and that’s a key to our success—our ability to manage the information

00:39:13,633 --> 00:39:24,233
and for us to manage all of the enormous amounts of information data coming out of the sequencing. But basically, in the delivery of a use

00:39:24,233 --> 00:39:32,099
of sequencing in a clinical setting, you have to have exactly the same software systems. You have to be able to do the same thing, and we

00:39:32,100 --> 00:39:44,366
basically, you know, we have put together sort of an analysis platform that is independent of the sequencing technology being used, and basically

00:39:44,366 --> 00:39:52,399
we are absolutely convinced that the way in which we are approaching the research and the way in which we will be approaching the use of

00:39:52,400 --> 00:40:03,700
sequencing in a clinical practice, these will converge, because when you’re putting together a test…let’s say that you put together a test to try to

00:40:03,700 --> 00:40:15,033
diagnose developmental disorder of children. Every time you do the test you have to compare the sequence you get to a reference sequence

00:40:15,033 --> 00:40:23,666
that will be the whole database of every test that you have previously done, because the test itself will not be the sequence. Sequencing is a commodity;

00:40:23,666 --> 00:40:36,866
it’s the ability to put the sequence in context. And we have, actually, put together an algorithm that we call GOR, which is basically an Icelandic

00:40:36,866 --> 00:40:47,966
word for the intestines of fish, and I don’t understand how we arrived at this exact word. I mean, the adjective “gory” is derived directly from that. But

00:40:47,966 --> 00:41:01,499
it’s very true, basically look at a system to analyze this on the basis of access patterns, not just on the basis of pure sequence. Then we

00:41:01,500 --> 00:41:14,233
have put a lot of emphasis on putting together [---] to allow us to present the data to clinicians as well as to patients themselves, and we have

00:41:14,233 --> 00:41:23,066
recently started a collaboration with the only tertiary care hospital in Iceland to try to figure out how to use whole genome sequence data in the

00:41:23,066 --> 00:41:33,699
delivery of healthcare. This hospital has about 100,000 visits a year and we are going to impute sequence into the patients from the past 3 years,

00:41:33,700 --> 00:41:42,333
about 300,000 people. We have now tried to determine in what clinical settings the data would make the most difference, and at 1 and 2 it’s

00:41:42,333 --> 00:41:48,999
going to take us about 6-12 months, and that’s actually a generous time for that, and then we are going to sequence the genomes of

00:41:49,000 --> 00:41:55,066
newcomers who fit the criteria coming out of 2 for the following 2 years. So, this is going to take us all together 3 years, and I think this is going to

00:41:55,066 --> 00:42:09,932
be very exciting, terribly exciting, and I’ll tell you, this study is hopefully going to teach us how to use this data in the delivery of healthcare, but a

00:42:09,933 --> 00:42:21,733
side benefit is that it is going to generate a new tidal wave of discoveries, all right? And thank you for listening. This is all I have to say today.

00:42:28,566 --> 00:42:32,566
PAUL KIMMEL: Thank you so much.

00:42:37,700 --> 00:42:46,333
FEMALE: First of all, I certainly want to thank you for a most fascinating presentation. I have a couple of questions. First, if I followed correctly,

00:42:46,333 --> 00:43:01,399
one of the observations you’ve made about common variants is that they map to physiological function. First of all, is that a kind of a

00:43:01,400 --> 00:43:07,833
generalization that can be made? KÁRI STEFÁNSSON: What was the question? FEMALE: Common variants mapping to

00:43:07,833 --> 00:43:23,599
physiological function. And then, if I was following the line that the observation in your efforts to phase to do the haplotype phasing that

00:43:23,600 --> 00:43:40,200
the variant often is related or can be related to parental…to imprinting from the parent. Now my question is, I was just curious as to how often,

00:43:40,200 --> 00:43:56,933
particularly in terms of the common variant and in the imprinting from the parent, is that variant in a regulatory or a region? And that’s kind of the

00:43:56,933 --> 00:44:04,233
answer to that. KÁRI STEFÁNSSON: The common variants in general that we have discovered are outside of

00:44:04,233 --> 00:44:18,099
coding sequences. In a significant number of instances we have shown that the variants affect the expression of the gene they are close

00:44:18,100 --> 00:44:28,366
to. In many instances we haven’t been able to show that. Very rarely are these common variants in known regulatory sequences, but our

00:44:28,366 --> 00:44:41,466
assumption is that they are in the kinds of sequences that have yet to be defined. But in all of the instances where we have shown common

00:44:41,466 --> 00:44:54,266
variants to have a parent-of-origin effect, in all instances they have been outside of coding sequences, and also the type where the variant

00:44:54,266 --> 00:45:04,766
that affects Type II diabetes on chromosome 11, it is in the region where it’s reasonable to expect that you have some regulatory influence.

00:45:04,766 --> 00:45:17,132
FEMALE: Okay. Thank you. Then just to kind of follow up. I was truly fascinated with your comment about the genomic contribution to the

00:45:17,133 --> 00:45:37,866
environment, and given your statement about neurology and the observation of the thoughts and emotion center, I couldn’t help but wonder if,

00:45:37,866 --> 00:45:54,499
indeed, it’s worth beginning to consider how much of the genetics of our thinking or abstracting biology is related to the environment

00:45:54,500 --> 00:46:05,833
in which that genome is essentially telling us about. KÁRI STEFÁNSSON: You see, this becomes a

00:46:05,833 --> 00:46:16,366
very sensitive issue when you begin to talk about genetics of thoughts and emotions, but it is interesting, you know. The thoughts and emotions

00:46:16,366 --> 00:46:24,999
are the things that define us as individuals and as a species. We haven’t the faintest idea how the brain generates a thought or emotion, and even if

00:46:25,000 --> 00:46:33,566
I would ask this great audience to come up with a definition of a thought and I would give you about 15 years to do so, you would struggle, all right?

00:46:33,566 --> 00:46:42,032
Philosophers have tried to do this for centuries and have failed. One of the things you can do with genetics is not have any opinion on any of

00:46:42,033 --> 00:46:48,333
this, neither on the way in which the brain generates thought nor on how you should define it. The only thing you do is that you map out the

00:46:48,333 --> 00:46:57,799
diversity in cognition and then you go out and you try to figure out what are the variants in the sequence that drive this diversity, and we have

00:46:57,800 --> 00:47:05,733
started to do this. We have now done cognitive function phenotyping in about 10,000 Icelanders, and we have started to pull out variants in the

00:47:05,733 --> 00:47:16,066
sequence that affect this, all right? There is no question that we do not inherit thoughts. We inherit propensities to think in a certain way, and

00:47:16,066 --> 00:47:24,666
certainly, our environment—how we are brought up, the books we read, the music we listen to, and all of that—has great impact on who we are

00:47:24,666 --> 00:47:35,932
eventually. But it is awesome, when you begin to look at it, how important our choice of parents seems to be, all right? It’s almost like once you

00:47:35,933 --> 00:47:46,633
have chosen your parents, there is no other choice to be made. INGA PETER: Inga Peter, Mt. Sinai School of

00:47:46,633 --> 00:47:54,033
Medicine here. I have a question with regard to your sequencing study. You said that you sequenced about 2,000 individuals, but you were

00:47:54,033 --> 00:48:07,133
able to show data on the effect of rare variants on a long list of phenotypes. I was wondering how many individuals per phenotype you initially

00:48:07,133 --> 00:48:15,733
sequenced? KÁRI STEFÁNSSON: The reason we can do this is that we can impute these variants today into

00:48:15,733 --> 00:48:25,966
somewhere between 100,000 to 200,000 people. So, although we have only sequenced 2,000, in our discovery phase we can use imputed

00:48:25,966 --> 00:48:36,666
variants, and then every time before we publish the data we sequence that particular or associated variant so we do not base our final

00:48:36,666 --> 00:48:49,699
conclusion on the basis of imputation, but we use the imputation for the discovery purposes. In some of these we have a very large number of

00:48:49,700 --> 00:48:59,266
people we look at. In some of these we have very few people. For example, in the atrial fibrillation study I pointed out where we have only

00:48:59,266 --> 00:49:07,299
eight individuals who are homozygous for this particular mutation and all of them have atrial fibrillation, there we are only looking at eight

00:49:07,300 --> 00:49:15,633
individuals. So, when you are into something that is 100% penetrant, you can get away with a relatively small number of people.
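The remark about eight homozygotes can be made concrete with a small calculation. The sketch below is illustrative and not taken from the talk: it asks how likely it would be for every one of n carriers to be affected by chance alone if the variant had no effect. The baseline prevalence of 0.05 is an assumed placeholder, not a figure from the study, and the calculation treats the carriers as independent, which related individuals would not be.

```python
def prob_all_affected(n_carriers: int, baseline_prevalence: float) -> float:
    """Chance that all n carriers are affected if the variant had no effect.

    Assumes each carrier independently has the baseline risk.
    """
    if n_carriers < 0:
        raise ValueError("n_carriers must be non-negative")
    if not 0.0 <= baseline_prevalence <= 1.0:
        raise ValueError("baseline_prevalence must be between 0 and 1")
    return baseline_prevalence ** n_carriers


# Hypothetical baseline risk; the talk does not give one.
ASSUMED_PREVALENCE = 0.05

# Eight homozygotes, all affected, as described in the answer above.
p_value = prob_all_affected(8, ASSUMED_PREVALENCE)
print(f"{p_value:.3e}")
```

Under that assumed 5% baseline the probability is on the order of 10^-11, which is why a fully penetrant variant can be convincing with only a handful of carriers.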

00:49:15,633 --> 00:49:22,966
INGA PETER: Thank you. ERWIN BOTTINGER: Erwin Bottinger, Mt. Sinai. Thank you for taking us 3 to 5 years into the

00:49:22,966 --> 00:49:32,099
future, and that’s a space we are thinking about very intensely, and we are in Manhattan where we have a number of large medical

00:49:32,100 --> 00:49:42,833
centers and not always the best of friends. So, I’m interested and very deeply thinking about a scenario where we take the kind of information

00:49:42,833 --> 00:49:51,833
solutions, information management, and knowledge management solutions that you’ve so beautifully described and predict will affect

00:49:51,833 --> 00:50:02,333
medicine in the future. How do we take these solutions into a larger society that’s diverse and where there are a lot of competing interests and

00:50:02,333 --> 00:50:12,699
where the genome is static—the individual carries the genome—but also the information—medical information—is so far not travelling, and

00:50:12,700 --> 00:50:21,566
so that sets up a conflict between the medical institution and the patient going from one medical institution to medical institution, and I was just

00:50:21,566 --> 00:50:31,032
wondering what are your thoughts on the portability of this information? KÁRI STEFÁNSSON: You see, I’m not worried

00:50:31,033 --> 00:50:41,399
about you guys [laughter]…let me finish my sentence…because you guys have led the world in biomedical research, you have led the world in

00:50:41,400 --> 00:50:51,966
the way in which biomedical research is turned into delivery of healthcare. So, you will solve that. I think that, you see, there has been a lot of…we

00:50:51,966 --> 00:51:04,099
were the first ones in the world to launch a consumer genetic service. You know, 23andMe got the Time Magazine award for the invention of

00:51:04,100 --> 00:51:13,933
the year for having followed our suit, all right? That’s how you guys treat foreigners, and people have criticized, and there are even some states that

00:51:13,933 --> 00:51:22,433
have made it illegal to market genetic tests directly to consumers, but the consumer will become the key to the solution of the problem you’re talking

00:51:22,433 --> 00:51:32,833
about because once the information is managed by the patient or the individual about whom the information is, the institutions become sort of

00:51:32,833 --> 00:51:42,566
irrelevant. You just go to places with the information on yourself and that is, in many ways, in keeping with what has happened in

00:51:42,566 --> 00:51:50,932
healthcare over the past 10-15 years. There’s hardly a well-educated patient who comes to see a physician with a new problem without having

00:51:50,933 --> 00:52:03,533
downloaded six feet of papers from the Internet to be able to have a dialogue with his or her physician, and I think that people will want this

00:52:03,533 --> 00:52:12,533
information about themselves, they will have it, and therefore they will be able to take it from Mt. Sinai to Albert Einstein, you know, these two

00:52:12,533 --> 00:52:23,899
great institutions that have been at war with each other for 200 years. PAUL KIMMEL: Paul Kimmel. I’m not from Mt. Sinai,

00:52:23,900 --> 00:52:31,833
I’m from NIDDK and there is an institution, NYU in Manhattan as well, so I just… KÁRI STEFÁNSSON: I apologize. I was trying to

00:52:31,833 --> 00:52:38,366
be brief. PAUL KIMMEL: I’m just waving that flag. You painted a very complex interaction between

00:52:38,366 --> 00:52:48,366
domains of phenotypes, genotypes, environment, age, gender, and I just wonder, with your 20 years of experience in doing so much work in a

00:52:48,366 --> 00:52:59,699
very precise, homogeneous population, can you give us your speculations regarding the generalizability of the results that you’ve put

00:52:59,700 --> 00:53:07,300
forth? You’ve talked about founder populations and it’s almost like Erwin’s question; we have a very diverse population in the United States. We

00:53:07,300 --> 00:53:16,233
know in kidney disease that there are completely dramatic differences between different populations in terms of genetic susceptibility and

00:53:16,233 --> 00:53:23,199
allele frequencies. So, can you speculate on this important question? I thought it was an important question.

00:53:23,200 --> 00:53:33,200
KÁRI STEFÁNSSON: Yes, I can speculate and I can give you a little bit of data. Number one, in spite of the way I look, Icelanders

00:53:33,200 --> 00:53:43,600
are a reasonably good [---] model for Homo sapiens, to begin with. So basically, everything that we have discovered in Iceland has been

00:53:43,600 --> 00:53:57,166
replicated elsewhere. So, the same biochemical pathway seems to be important in our population as in most other populations. There are allelic

00:53:57,166 --> 00:54:07,766
differences and some of them are very, very significant and some of them are transparent. It is easy to understand why they are different. Some

00:54:07,766 --> 00:54:18,799
of them are extremely difficult to figure out. One of the ones that is relatively easy to understand came out of our work on malignant melanoma

00:54:18,800 --> 00:54:26,100
when we started the work there. At the time when we started the work there was only one mutation known to predispose to malignant

00:54:26,100 --> 00:54:36,533
melanoma and that was a mutation in the melanocortin Type 1 receptor which conferred an odds ratio of malignant melanoma in Spain of 3, in

00:54:36,533 --> 00:54:46,999
Sweden of 2, but it had no impact on the risk of melanoma in Iceland. This happens to be a mutation that turns your hair red and your skin

00:54:47,000 --> 00:54:56,900
very white, and you become sensitive to sunlight. If you live in Spain it is difficult to avoid the sunlight. If you live in Sweden, you’re

00:54:56,900 --> 00:55:04,700
somewhere in-between. I can promise you, if you’re sensitive to sunlight, you can avoid sunlight in Iceland, and what is interesting there is

00:55:04,700 --> 00:55:18,533
that the frequency—allelic frequency—of this mutation in Spain is 6%, in Sweden 17%, in Iceland 26%. So, there you have an example of a

00:55:18,533 --> 00:55:29,899
variant with geographic difference in effect and probably a consequent geographic difference in frequency. Another example comes out of our work

00:55:29,900 --> 00:55:44,233
on atrial fibrillation. We found several common variants that affect atrial fibrillation, where we have an allelic frequency in Europeans that is

00:55:44,233 --> 00:55:58,866
about 20%, the allelic frequency in China is 60%, the odds ratio in the European population is about 1.75-1.8, but only 1.3 in the Chinese population.

00:55:58,866 --> 00:56:11,699
So, there is an example of variants that affect heart rhythm that have very different effects depending on whether you are of European or

00:56:11,700 --> 00:56:25,900
you are of Chinese descent. So, there is some…you see, man is now spread all over the world, all right? So, we are exposed to maximum

00:56:25,900 --> 00:56:36,633
diversity in environment which basically leads to the generation of maximum diversity both in our genome and our phenotype, but the environment

00:56:36,633 --> 00:56:46,199
takes a long time to have impact on us, so it is not inconsequential when you move a large population from one part of the world to another.

00:56:46,200 --> 00:56:55,166
A good example can be seen in the African-American population in this country. If you look at, for example, hypertension, if you go to rural

00:56:55,166 --> 00:57:09,432
Nigeria the prevalence of hypertension is about 15%, in urban Nigeria about 25%, the Caribbean about 30-40%, and in African-Americans up to

00:57:09,433 --> 00:57:18,666
50%. So, that’s an example of where you take people with a genetic background that fits well to a particular environment and then put them into a

00:57:18,666 --> 00:57:27,566
different environment. There are differences like this. There are differences that you have to be aware of. You have to keep your eyes open, but

00:57:27,566 --> 00:57:38,166
the beauty of using the founder population is that it gives you the gene that you then go to and then you can tally the total diversity of the sequence in

00:57:38,166 --> 00:57:51,799
the gene, and you look at that in the context of an outbred population. Another way to use that is like many of us use that BRCA2 example. They

00:57:51,800 --> 00:58:01,733
found a truncating mutation in Iceland and then they go and sequence the gene in people here and they call it a positive test when they find any

00:58:01,733 --> 00:58:11,166
truncating mutation for the same gene. There is no question about it that they cannot demonstrate that all of these mutations associate with the

00:58:11,166 --> 00:58:21,632
disease because they don’t have enough cases to do that. But they know that if you truncate the gene, if you prevent it from generating a normal

00:58:21,633 --> 00:58:30,266
protein, this leads to risk of the disease. So, you will have to use some deductive reasoning in the way in which you apply the genetic test.
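The deductive step described above, treating any truncating mutation in the gene as a positive test, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any clinical laboratory implements it: the function names are hypothetical, the sequences are toy examples rather than real BRCA2 sequence, and only premature stop codons in an in-frame coding sequence are checked (frameshifts and splice variants are ignored).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds: str):
    """Return the 0-based codon index of the first stop codon that occurs
    before the final codon, or None if there is none."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # The last codon is expected to be the normal stop, so skip it.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            return index
    return None


def is_truncating(reference_cds: str, observed_cds: str) -> bool:
    """True if the observed sequence gains a premature stop that the
    reference sequence does not have."""
    return (first_premature_stop(reference_cds) is None
            and first_premature_stop(observed_cds) is not None)


# Toy sequences (hypothetical): the second codon AAA is changed to TAA.
reference = "ATGAAAGGATAA"
observed = "ATGTAAGGATAA"
print(is_truncating(reference, observed))
```

The design choice mirrors the argument in the talk: the check does not ask whether a particular mutation has been associated with disease, only whether it belongs to a class (truncation) whose consequence is already understood.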

Date Last Updated: 9/18/2012
