Whole Genome Approaches to Complex Kidney Disease
February 11-12, 2012 Conference Videos

Population Genetics to Personalized Medicine: An Icelandic Saga
Kári Stefánsson, deCODE genetics, Iceland

Video Transcript

1
00:00:00,000 --> 00:00:04,000
JEFFREY KOPP: So, we are very pleased to have as our keynote speaker Dr. Kári Stefánsson, who actually has been a frequent

2
00:00:04,000 --> 00:00:13,733
visitor to NIH over the years. I know I’ve heard him speak in the last two or three years. He received his M.D. from the University of Iceland

3
00:00:13,733 --> 00:00:20,666
and then trained in neurology and neuropathology at the University of Chicago and then afterwards was at Harvard for a number of

4
00:00:20,666 --> 00:00:31,232
years in these fields. He then returned to Iceland to study the genetics of multiple sclerosis. In fact, he founded deCODE genetics in 1996 to study the

5
00:00:31,233 --> 00:00:40,699
genetics of just about everything. He’s received numerous awards, including from the European Society of Human Genetics and the Anders Jahre

6
00:00:40,700 --> 00:00:48,633
Award and he was on the Time 100 list of “People Who Shape Our World” in 2007. He’s agreed to give our keynote address, addressing

7
00:00:48,633 --> 00:00:57,533
topics at the intersection of medical genetics, population genetics and public health, exactly what this conference is all about. Thank you,

8
00:00:57,533 --> 00:01:04,033
Dr. Stefánsson. KÁRI STEFÁNSSON: I thank the organizers for

9
00:01:04,033 --> 00:01:14,466
inviting me to come here and I will tell you a little bit about how we have gone about doing genetics in Iceland, and I will focus more on our

10
00:01:14,466 --> 00:01:23,932
search for rare variants coming out of whole genome sequencing, but I will be relatively light on the application of genetics in healthcare because

11
00:01:23,933 --> 00:01:35,466
we simply haven’t done all that much with it. But I think what is terribly important, when we think about human genetics, is that basically all of the

12
00:01:35,466 --> 00:01:49,132
diversity in the biosphere is explained in large part by diversity in the sequence of A, T, C and G. So, basically there is an awful lot of the secret to

13
00:01:49,133 --> 00:02:00,299
life in general that can be read out of this miraculous molecule that we call DNA, and how much, if we just focus on how much can we

14
00:02:00,300 --> 00:02:11,600
explain on the basis of the diversity in the sequence. And I think basically these identical twins—we read about it in the newspaper last

15
00:02:11,600 --> 00:02:21,266
year—that they died within hours of each other at the age of 86 from the same disease. And you will then say that’s not all because of their

16
00:02:21,266 --> 00:02:30,299
genetics. These guys were in the same jobs, they dressed in the same clothes, etc., so perhaps there was a large contribution by a

17
00:02:30,300 --> 00:02:39,066
shared environment, and my response to that, yes of course, there was a large contribution by shared environment but actually the environment

18
00:02:39,066 --> 00:02:49,666
they shared was highly genetically…was highly inherited, and I will end my presentation by presenting you with arguments that I believe that

19
00:02:49,666 --> 00:02:59,332
there’s a large genetic component to the environmental component of the risk of common diseases. But basically, I think that when we think

20
00:02:59,333 --> 00:03:08,366
about human genetics, when we think about genetics and the discovery coming out of genetics and then go to the question of “how do

21
00:03:08,366 --> 00:03:19,566
we use this in clinical practice?” Basically, the use of genetics in diagnostics and the use of genetics in discovery are the mirror

22
00:03:19,566 --> 00:03:28,632
image of each other. When we are working on the discovery we are trying to figure out what it is that characterizes the group, what people who

23
00:03:28,633 --> 00:03:36,499
have a particular disease share. When we are using genetics in diagnostics, we are trying to figure out into what group the individual fits. And actually,

24
00:03:36,500 --> 00:03:44,566
in the end, we will be using the same software systems, we will be using the same analytical methods in both instances, and I will come back

25
00:03:44,566 --> 00:03:55,099
to that again in this presentation. When we are trying to make discovery in human genetics, the kinds of human genetics that I work from, we are

26
00:03:55,100 --> 00:04:02,500
basically dealing with two datasets. We are dealing with datasets on diversity in the sequence of the human genome and we are

27
00:04:02,500 --> 00:04:11,833
dealing with datasets on diversity in phenotypes, and we are trying to find non-chance association by data-punching the two datasets. And when I

28
00:04:11,833 --> 00:04:19,433
was in medical school, which unfortunately was not yesterday, the only genetic diseases we were taught about were the ones with this

29
00:04:19,433 --> 00:04:26,899
relationship which is very simple. You have a mutation, you develop a disease; you don’t have a mutation, you won’t. These were the

30
00:04:26,900 --> 00:04:35,233
Mendelian diseases. And actually, coming out of the sequencing, the whole genome sequencing that we are going through now, we are drifting

31
00:04:35,233 --> 00:04:45,699
back towards this relationship. We are finding out that what we call common complex diseases are in many ways a collection of a very large number of

32
00:04:45,700 --> 00:04:55,766
Mendelian, let’s call them Mendelian phenocopies, what we have thought about as these common complex diseases. But what we saw mostly

33
00:04:55,766 --> 00:05:04,632
coming out of what we call the GWAS era, although mind you, that we are continuing to do genomic associations just with a little bit rarer

34
00:05:04,633 --> 00:05:13,033
variants. But looking at the common variants, this is pretty much what we have been seeing, a large number of variants affecting one phenotype, but

35
00:05:13,033 --> 00:05:24,566
we have also been seeing this, one mutation affects many phenotypes. This has been particularly the case in cancer, where one

36
00:05:24,566 --> 00:05:34,299
mutation can confer risk for many cancers, but this is also fairly common in the case of diseases of the brain. One and the same mutation confers

37
00:05:34,300 --> 00:05:43,566
risk of schizophrenia, autism, ADHD, and epilepsy, and it’s actually very, very interesting when we begin to speculate why that should be,

38
00:05:43,566 --> 00:05:54,766
and I may have time to come to that also a little bit later. We have a tendency to divide the variants in the sequence that affect the risk of common

39
00:05:54,766 --> 00:06:02,566
diseases into the common and the rare, like these are two distinct categories, and it is true that we have been finding both common, we have been

40
00:06:02,566 --> 00:06:13,766
finding rare variants, and there is a little bit of…there is a relationship between the size of the effect and how common the variants are, and it is

41
00:06:13,766 --> 00:06:21,966
probably biologically important in the case of the common variants because we are not going to see a lot of very common variants conferring

42
00:06:21,966 --> 00:06:30,899
very large risk because there is inevitably negative selection against them, but I’m convinced that there are rare variants of small effect but we

43
00:06:30,900 --> 00:06:43,166
don’t find them because we are not powered to do so. But I think that the meaningful separation, the meaningful dichotomy of these variants lies in

44
00:06:43,166 --> 00:06:52,366
the following. We have these variants, the common variants, the variants that basically create the normal human diversity that produce a

45
00:06:52,366 --> 00:06:59,166
normal distribution curve of physiologic function, and if you’re at one end of the curve you’re at risk, if you’re at the other end of the curve you

46
00:06:59,166 --> 00:07:10,166
are somewhat protected, and it is interesting that the definition of normal distribution is that it is a measurement under the influence of many

47
00:07:10,166 --> 00:07:22,332
factors, each one of them equally likely or almost equally likely to add and subtract, which means that they’re common and the nature of the

48
00:07:22,333 --> 00:07:32,133
variants that do this must be fairly similar. But then you have the rare variants that disrupt the same physiological function, all right? And I will

49
00:07:32,133 --> 00:07:40,666
give you one example where we have a common variant and a rare variant in the same gene: one of them affects the physiological function, the

50
00:07:40,666 --> 00:07:54,432
other disrupts the same. I cannot resist. You see, I’m putting up this slide because I am a vain man and I like to thump my own chest, and this is just

51
00:07:54,433 --> 00:08:08,766
a list of phenotypes where we at deCODE have found common variants affecting the risk of these phenotypes and we’re the first ones to do so,

52
00:08:08,766 --> 00:08:20,166
and this ranges from diseases like myocardial infarction to funny phenotypes like the love of crossword puzzles, and that is a terribly

53
00:08:20,166 --> 00:08:28,199
important phenotype because it shows that we can go all the way from subtle differences in the sequence of the human genome to a very

54
00:08:28,200 --> 00:08:39,633
complex human behavior and human feelings, and we can find unequivocal associations, which is terribly important, because the brain is just like the

55
00:08:39,633 --> 00:08:49,233
kidney; it’s just an organ. The kidney makes piss, the brain makes thoughts and emotions, and we can analyze all of these using human genetics.

56
00:08:49,233 --> 00:09:02,833
And let me give you just a couple of examples of the common variants where we have shown that they affect physiological function and

57
00:09:02,833 --> 00:09:14,699
through that generate the risk of disease, and the thyroid cancer example is actually a fairly good one, and actually it is the most familial of

58
00:09:14,700 --> 00:09:27,966
cancers, and actually, we have pulled together, we have isolated now…this is an example of four variants here…five variants that we have

59
00:09:27,966 --> 00:09:38,699
isolated that actually affect the risk of thyroid cancer, and all of the risk variants are associated with decreased concentration of TSH.

60
00:09:38,700 --> 00:09:49,233
Remember, TSH, the way in which the endocrine function of the thyroid works is that you have the thyrotropin-releasing factor coming from the

61
00:09:49,233 --> 00:09:58,833
hypothalamus, stimulating the production and secretion of TSH. That then stimulates the thyroid and actually stimulates the differentiation of

62
00:09:58,833 --> 00:10:04,366
thyroid epithelium, and it is interesting. You see, we looked at about 1,000 cases of thyroid

63
00:10:04,366 --> 00:10:14,832
cancer, but we looked at about 15,000 individuals when it comes to level of TSH—normal people without any cancer—and we showed that all of

64
00:10:14,833 --> 00:10:23,533
these risk variants are associated with decreased concentration of TSH. So, our hypothesis is that the way in which these

65
00:10:23,533 --> 00:10:31,666
variants confer risk is that they lead to less secretion of TSH, which leads to less differentiation of the thyroid epithelium that then

66
00:10:31,666 --> 00:10:43,432
leads to thyroid cancer. The second example is actually a little bit different; I wanted to emphasize the importance. You see, we are

67
00:10:43,433 --> 00:10:54,399
working in Iceland always with the same population. So, the same individual plays a role in many studies, either as a patient or a control, but

68
00:10:54,400 --> 00:11:07,566
it is so important to be able to go again and again to the same genome sequence, you have genotypes you want, etc., and in this instance

69
00:11:07,566 --> 00:11:20,232
this is from work we did on uromodulin and chronic kidney disease, and actually, our contribution to that study is to demonstrate that

70
00:11:20,233 --> 00:11:30,599
the variants in the sequence that affect the risk or confer risk of chronic kidney disease do so through interactions with age and interactions with

71
00:11:30,600 --> 00:11:44,600
co-morbid conditions. So, the impact of the uromodulin variant on creatinine concentration is completely dependent on increasing age and the

72
00:11:44,600 --> 00:11:51,833
presence of co-morbid conditions. So, this is an example actually. The uromodulin variant is an example of a variant in the sequence that

73
00:11:51,833 --> 00:12:04,966
predisposes to diabetic nephropathy, for example. But there is a relatively small percentage of the heritability that is accounted for

74
00:12:04,966 --> 00:12:13,366
by the common variants that have been discovered, and there are people in the various places of this world tearing their hair out in

75
00:12:13,366 --> 00:12:22,566
despair over the fact that these variants explain only a relatively small amount of heritability. For example, an extraordinarily distinguished

76
00:12:22,566 --> 00:12:30,699
colleague of mine, Eric Lander, has come up to Iceland to sit down with us to see whether we can use the Icelandic population to explain where

77
00:12:30,700 --> 00:12:39,666
the missing heritability is, and he published a paper the other day in PNAS where he insists that he has figured out how to calculate how

78
00:12:39,666 --> 00:12:52,299
interactions between variants confer, explain, a large part of the missing heritability. The problem with that approach is that no one has been able

79
00:12:52,300 --> 00:13:06,466
to show this interaction. So, I could substitute these interactions that Eric is calculating out but cannot demonstrate by almost anything. I am

80
00:13:06,466 --> 00:13:15,532
convinced he is correct, that interactions play a role, but since we cannot demonstrate them it is really difficult to make use of a formula that

81
00:13:15,533 --> 00:13:25,166
includes them in the calculation of heritability. But there are things that definitely contribute, although they do not explain everything, and as I

82
00:13:25,166 --> 00:13:32,532
said, I’m convinced that Eric is right, that some are there; there are these interactions, we just haven’t been able to pull them out. One thing we

83
00:13:32,533 --> 00:13:42,399
have to look at is the mechanism of inheritance, and one of the things that we have been fortunate to be able to do is to figure out how to

84
00:13:42,400 --> 00:13:50,766
trace the entire genome. You know, when you sequence or you genotype you get out this [---] soup of variants coming from mothers and

85
00:13:50,766 --> 00:14:01,566
fathers, but it is extremely important to be able to string the variants into haplotypes, and usually you’ll need to have the genotypes of the parents

86
00:14:01,566 --> 00:14:11,866
and the proband to be able to trace or you cannot do that if they are strictly heterozygous, but we put together a method that is based on our deep

87
00:14:11,866 --> 00:14:18,699
understanding of the Icelandic population structure. So, we simply used the next paternal and

88
00:14:18,700 --> 00:14:27,333
maternal relative who is homozygous for the stuff that you’re looking at, and what is more, if you use this overlapping segment we can

89
00:14:27,333 --> 00:14:36,333
actually convincingly demonstrate whether the haplotypes come from the mother or the father. So, every individual we genotype or we

90
00:14:36,333 --> 00:14:44,633
sequence, we can figure out what comes from the mother and what comes from the father. And why is that important? It is important because it is

91
00:14:44,633 --> 00:14:54,366
absolutely clear that in many instances it matters whether you inherit the variant from the mother or the father, and I’m just showing you an example

92
00:14:54,366 --> 00:15:05,732
coming out in one of my papers where we have variants that affect breast cancer, skin cancer,

93
00:15:05,733 --> 00:15:13,233
and Type II diabetes, and it depends on…the effect varies depending on whether it comes from the mother or the father, and actually, the

94
00:15:13,233 --> 00:15:22,433
most spectacular example is a variant on chromosome 11 that if you inherit it from the father it confers risk for Type II diabetes, if you

95
00:15:22,433 --> 00:15:32,999
inherit it from the mother it protects against the same. So this is one spectacular way in which nature is squeezing more information out of

96
00:15:33,000 --> 00:15:40,933
the genome. So, remember that it’s still another type of demonstration of the fact that sexual reproduction was not invented for our pleasure,

97
00:15:40,933 --> 00:15:55,066
but simply to increase human diversity. We have looked at this in many instances, and yes, the

98
00:15:55,066 --> 00:16:03,866
parental origin of variants affects things like human height, BMI, etc. and several diseases, but it is not the major player, all right? And in all

99
00:16:03,866 --> 00:16:16,032
instances where we have been able to figure out mechanism, the mechanism goes to differential methylation. But there are also rare variants, no

100
00:16:16,033 --> 00:16:27,099
question about it, that play an important role. In Iceland we have started a fairly large sequencing process. We are aspiring to sequence

101
00:16:27,100 --> 00:16:40,233
somewhere between 2,500 and 3,750 Icelandic genomes. We have now genotyped almost 120,000 people and this allows us, because of

102
00:16:40,233 --> 00:16:49,766
what we know about the structure, to impute the whole genome sequence in the entire population. And why can we do that? In sort of simple terms,

103
00:16:49,766 --> 00:16:59,532
because we can take any two random Icelanders and we can determine what they’ve inherited from a common ancestor, and if you take about a

104
00:16:59,533 --> 00:17:08,533
megabase of DNA we can find about 250-300 Icelanders who have inherited this from a common ancestor, so we only need to sequence

105
00:17:08,533 --> 00:17:19,499
it in one of them, and it allows us to impute variants down to a frequency of about .1%. Everyone else is struggling with imputing

106
00:17:19,500 --> 00:17:31,200
sequence variants; after 1 or 2 or 3 percent they cannot go farther down than that. We have now sequenced about 2,000 Icelanders to 10x; of

107
00:17:31,200 --> 00:17:42,233
them, 700 to 30x; and we are already beginning to get a lot of exciting stuff coming out of it. And there are some sort of basic things that we know

108
00:17:42,233 --> 00:17:52,433
about the Icelandic genome or the human genome now: that we get about 40-50 de novo mutations in each individual and evidence of 60

109
00:17:52,433 --> 00:18:07,699
recombinations. And actually, it is interesting when you take diseases like autism and schizophrenia. We know that the probability that

110
00:18:07,700 --> 00:18:16,633
a father will have a child with schizophrenia is dependent on age. For example, a 40-year-old father is 3 times more likely to conceive a

111
00:18:16,633 --> 00:18:28,333
schizophrenic child than a 20-year-old father, and this is, in many ways, interesting, you know, and I actually think that Pfizer should be punished just

112
00:18:28,333 --> 00:18:39,199
like Exxon for the oil spill. Just by marketing Viagra and extending the sex life of men, they are introducing a lot of new mutations into our

113
00:18:39,200 --> 00:18:52,933
population. And this is interesting when you look at all of these in the context of our data on de novo mutations, all right? Ninety-seven percent of

114
00:18:52,933 --> 00:19:05,533
diversity in new mutations—de novo mutations—is explained by paternal age. Ninety-seven percent of the diversity in de novo mutations is

115
00:19:05,533 --> 00:19:13,799
explained by paternal age. That’s absolutely astonishing, right? And we have more data that’s going to be…this is coming out of looking at 75

116
00:19:13,800 --> 00:19:25,100
trios, which is terribly interesting. But this brings us to sort of the new aspect of the genetics that we are doing because of the de novo mutations.

117
00:19:25,100 --> 00:19:32,600
We are not just looking at correlations between diversity in sequence and diversity in phenotype; we have to include the changes in

118
00:19:32,600 --> 00:19:44,500
diversity in the sequence, and actually, I think an awful lot of what we are missing today is due to the fact that we are not in a position, really in a

119
00:19:44,500 --> 00:19:59,166
position, to pick up de novo mutations. There’s still another interesting thing coming out. What we have is that we have now about 7,000 individuals

120
00:19:59,166 --> 00:20:11,232
who are homozygous for a truncating mutation in one of 500 genes. So basically there are, at least on the average, 12 individuals in our sample

121
00:20:11,233 --> 00:20:24,266
that have a knockout of one of these 500 genes, and we are now starting a process where we are phenotyping all of these guys. But this is sort

122
00:20:24,266 --> 00:20:35,332
of the list, not complete. This is the list that is now about three weeks old, but this is a list of sort of where we have pulled out rare variants with

123
00:20:35,333 --> 00:20:43,699
large effect, and the reason we can do this is that we can impute the rare variants where others cannot and there’s a founder effect for

124
00:20:43,700 --> 00:20:53,600
many of these traits in Iceland, and I can give you an example of the importance of the founder effect. When BRCA2 was discovered it was

125
00:20:53,600 --> 00:21:03,233
basically simultaneously in Utah and Cambridge, and in both places they were using Icelandic material, and there’s only one mutation in BRCA2

126
00:21:03,233 --> 00:21:17,166
in Iceland and that is a 5-base deletion out of exon 9. It has an allelic frequency of .4%. Myriad tells me that, now after having had the test

127
00:21:17,166 --> 00:21:32,599
ongoing for a couple of decades, they have found 6,300 mutations in BRCA2 in America and a combined allelic frequency is .1%. So, to be able

128
00:21:32,600 --> 00:21:42,133
to pick, if you were doing sequencing, if you compare the position we are in in Iceland versus what you have here, you would have to

129
00:21:42,133 --> 00:21:51,766
sequence a thousand times more people to be able to make the kind of discoveries that we are doing, if this indeed goes for all of these

130
00:21:51,766 --> 00:22:04,932
phenotypes. One of the things…I’m not going to go into details of this. I’m not going to discuss the kidney cancer stuff that we pulled out two or

131
00:22:04,933 --> 00:22:10,533
three weeks ago. I’m just going to go over a few of the things that we have already published. One of the first things that we published was on

132
00:22:10,533 --> 00:22:27,799
a variant that has also an allelic frequency of about .4%. It’s in BRIP1 which is one of the proteins interacting with BRCA1 and BRCA2. It

133
00:22:27,800 --> 00:22:39,133
confers an odds ratio of 8 for ovarian cancer. It is a frameshift mutation; it is a truncating mutation. We have found another truncating mutation in a

134
00:22:39,133 --> 00:22:48,866
Spanish sample, and this is an important discovery because ovarian cancer is one of the difficult cancers with a 5-year survival of about

135
00:22:48,866 --> 00:22:58,632
45% today. It’s very important to be able to diagnose it early. So, you can use this, probably, in the same kind of a way that people are using

136
00:22:58,633 --> 00:23:10,899
the BRCA test today, is to genotype the close relatives of those who are diagnosed with the cancer. A second one we pulled out was a rare

137
00:23:10,900 --> 00:23:18,766
variant, low-frequency variant that conferred risk of gout and has impact on uric acid concentration, and it’s actually fairly interesting

138
00:23:18,766 --> 00:23:25,699
because the impact on women and men is the same. The relative risk or odds ratio for men is much higher because men start off with a much

139
00:23:25,700 --> 00:23:36,866
higher concentration of uric acid, so they’re much closer to the saturating concentration that leads to precipitation of crystals in joint fluids. The one I

140
00:23:36,866 --> 00:23:46,899
want to just dwell on a little bit is one that confers an odds ratio of 12.5 for sick sinus syndrome, which is the most common cause of the

141
00:23:46,900 --> 00:24:01,933
placement of pacemakers in our society. The reason I want to dwell on it a little bit is that the mutation is in the myosin heavy chain 6 and we

142
00:24:01,933 --> 00:24:10,133
have previously found a common variant in that gene that affects heart rate. So, you have a common variant that affects heart rate, you have

143
00:24:10,133 --> 00:24:21,699
a rare variant that abolishes heart rate, which is terribly interesting. Another interesting aspect of this is that the mutation is in the myosin gene.

144
00:24:21,700 --> 00:24:31,533
What does myosin have to do with generation of heartbeat? We’re still puzzled after we have published this paper, but now a couple of weeks

145
00:24:31,533 --> 00:24:39,833
ago we found another mutation. It’s a mutation in the myosin light chain 4. There are eight individuals in our dataset who are homozygous

146
00:24:39,833 --> 00:24:50,933
for that mutation. All of them have early-onset atrial fibrillation, before the age of 30, and have developed embolic stroke because of that. So,

147
00:24:50,933 --> 00:24:59,499
you have another example of a mutation in a sarcomere gene that seems to be very important for the generation and the maintenance of heart

148
00:24:59,500 --> 00:25:08,300
rate. So, this is an example of a gene that has variants that place you on the normal distribution curve and also mutations that

149
00:25:08,300 --> 00:25:19,433
disrupt that physiologic function. But that brings me to the brain, and I am an old neurologist and a neuropathologist and what I’m

150
00:25:19,433 --> 00:25:28,066
saying now is true. It may sound outrageous, but if I would take this lovely woman’s body and I would peel it away from her brain, put her brain

151
00:25:28,066 --> 00:25:35,632
in a bucket and keep it alive, that would be her, all right? If I would keep the rest of her body alive elsewhere, it would be something totally

152
00:25:35,633 --> 00:25:45,166
different, all right? So, we are nothing but a brain; the rest of it is only to get us from point A to point B, etc., so it is important to figure out how the

153
00:25:45,166 --> 00:25:53,932
brain works. And actually, I couldn’t resist because your administration over the past couple of days has put a lot of emphasis on

154
00:25:53,933 --> 00:26:04,633
Alzheimer’s disease, and they want to spend $300 million doing sequencing to find protective variants, etc. We have done this work, all right?

155
00:26:04,633 --> 00:26:14,633
It’s over. They’re too late. They are not going to be the first ones to get a man to that moon, I can promise you that. So, we have pulled

156
00:26:14,633 --> 00:26:21,733
out, over the past four weeks, three variants that affect the risk of Alzheimer’s disease. One of them

157
00:26:21,733 --> 00:26:34,166
confers an odds ratio of 3, which is equivalent to APOE4. The second one, an odds ratio of 2.4 (that happens to be an inflammatory gene), and then

158
00:26:34,166 --> 00:26:47,732
we also found a variant that basically confers almost 100% protection against Alzheimer’s disease, and the first two have an allelic frequency of

159
00:26:47,733 --> 00:26:58,366
about…one of them 1.2, the other 1.05. The protective variant has an allelic frequency of about .5. What’s interesting, if you look at the

160
00:26:58,366 --> 00:27:10,132
protective variant, is that in the general population it has a frequency of .5%; in Alzheimer’s disease, .1%; and in people who are 85 years of age or

161
00:27:10,133 --> 00:27:21,699
cognitively intact at 80 years of age, it is found in .7%; and in cognitively intact over the age of 85, you find it in 1.1%. We have only found this in

162
00:27:21,700 --> 00:27:31,600
four cases of Alzheimer’s disease and they are really, really old individuals who I basically question had true Alzheimer’s disease. But

163
00:27:31,600 --> 00:27:41,566
what is interesting here is that if you take this variant and you look at it in people who are admitted to nursing homes and you look at their

164
00:27:41,566 --> 00:27:55,632
cognitive performance scale (so it’s a cognitive function measurement coming out of the Resident Assessment Instrument), as you can see if

165
00:27:55,633 --> 00:28:05,333
you look at it between the ages of 80 and 100, the red line is those who do not carry this variant, the blue line are those who carry this variant. So,

166
00:28:05,333 --> 00:28:15,733
basically the people with this variant—the normal people—they retain their cognitive abilities up to 90 years of age and do not catch up with the

167
00:28:15,733 --> 00:28:25,799
ones without this variant until they are 95. And this is how it looks if you take out the Alzheimer’s patients from the red line, basically. So, this is a

168
00:28:25,800 --> 00:28:34,700
variant that protects against Alzheimer’s disease, this is a variant that allows you to retain your cognitive function longer than the control

169
00:28:34,700 --> 00:28:49,800
population. So, this is an example of…you see, the thing is that when you’re considering what to do when it comes to investing in whole genome

170
00:28:49,800 --> 00:28:56,866
sequencing for the purpose of discovery, what matters is not how many people you sequence, what matters is what you know about the

171
00:28:56,866 --> 00:29:05,299
population structure, how well you can mine it, how well is it suited for making discoveries, and I can assure you, doing whole genome

172
00:29:05,300 --> 00:29:12,166
sequencing in a population like the Icelandic population, it’s a thousand times less expensive than if you were going to do it in a heterogeneous

173
00:29:12,166 --> 00:29:21,866
population. Make the discovery of the gene in a founder population, then go into the outbred population and do pool sequencing of your

174
00:29:21,866 --> 00:29:33,832
samples, looking for variants there. It’s the most effective. It’s the least expensive way of doing it. One of the things we have done is, as I

175
00:29:33,833 --> 00:29:40,966
mentioned before, we have all of these human knockouts and we have all of the CNVs we have pulled out and we have actually about 46 CNVs

176
00:29:40,966 --> 00:29:49,266
that we have pulled out that we have shown to be under negative selection. How do we do that? We just count the number of children that people

177
00:29:49,266 --> 00:29:56,432
have who have this variant and show that it is lower than the average in our society, and then we have gone and we have

178
00:29:56,433 --> 00:30:07,066
phenotyped these people who carry this variant. And it is interesting that if you look at the GWAS era, relatively little was discovered of common

179
00:30:07,066 --> 00:30:16,899
variants that affect the risk of brain diseases. Most of the variants that have an effect on the risk of brain disease are going to be rare variants and

180
00:30:16,900 --> 00:30:25,166
probably under negative selection because the brain is a terribly important reproductive organ, all right? We probably have defined the

181
00:30:25,166 --> 00:30:36,766
normal function of the brain relatively broadly, but we phenotyped these people extraordinarily thoroughly for cognitive function, and actually, a

182
00:30:36,766 --> 00:30:43,932
few of these variants confer risk of schizophrenia, which is actually the most human of all diseases; it’s a disease of thoughts and

183
00:30:43,933 --> 00:30:53,166
emotions, and thoughts and emotions are the functions that define us as a species and define us as individuals. We actually published a paper

184
00:30:53,166 --> 00:31:08,599
on CNVs as risk factors for schizophrenia, and two of them confer an odds ratio of about 10, but the one on chromosome 15q11.2 confers

185
00:31:08,600 --> 00:31:17,300
only an odds ratio of 3. So, we have two of these that confer the very large odds ratio, the one in the middle confers a modest odds ratio,

186
00:31:17,300 --> 00:31:26,766
and this is interesting when you begin to look at the phenotyping of the species. So, these are not 100% penetrant variants, so we have looked at

187
00:31:26,766 --> 00:31:35,366
them in quote-unquote “normal people.” So, this is an example of the effect that these variants have on spatial working memory and what’s interesting

188
00:31:35,366 --> 00:31:53,066
here is that the variant that carries the smallest odds ratio, the 15q11.2, has the least effect on spatial working memory. The high odds ratio

189
00:31:53,066 --> 00:32:02,399
variants have much greater impact on spatial working memory. So, when you take out the variant of the small effect and you look at this

190
00:32:02,400 --> 00:32:12,066
together, these are normal people without the CNV, these are the ones with high-risk CNVs, and these are people with schizophrenia. So you

191
00:32:12,066 --> 00:32:21,032
see, when it comes to spatial working memory, the quote-unquote “normal carriers” of the CNVs fall in between the patients and the normal. So,

192
00:32:21,033 --> 00:32:30,866
this is an example of variants that affect a very important disease of the brain, that out of the context of this disease, affect the physiological

193
00:32:30,866 --> 00:32:44,432
function that the disease affects, all right? The variant on 15q11.2 is important for many reasons. It’s a fairly common one. It’s found in 1 out of 350

194
00:32:44,433 --> 00:32:56,133
individuals, so there are 20 million people in the world who carry this variant. This is a variant that affects…this is a CNV deletion and in the

195
00:32:56,133 --> 00:33:08,133
middle of that deletion is the gene that makes CYFIP1, which is cytoplasmic Fragile X interacting protein 1, and actually it turns out that

196
00:33:08,133 --> 00:33:20,066
the Fragile X protein forms a complex with two other proteins that serve the function of protecting RNA that is transported from the

197
00:33:20,066 --> 00:33:28,432
nucleus into the dendrites of the neurons; very important for synaptic function. So, this is a CNV that includes a very important thing that makes a

198
00:33:28,433 --> 00:33:39,999
very important protein for synaptic transmission, and actually, when we looked at people with this variant in our test of reading, it is absolutely

199
00:33:40,000 --> 00:33:50,066
clear that the people with the deletion have a very, very specific learning disability that affects reading, whereas the controls and even people

200
00:33:50,066 --> 00:34:03,399
with duplication in this region performed much, much better, and when it comes to arithmetic, these guys also performed very poorly. So, this is

201
00:34:03,400 --> 00:34:14,233
an example of how you can use what you know about variants that affect the risk of diseases of the brain to explore the function of the brain. I

202
00:34:14,233 --> 00:34:24,633
cannot resist the temptation to show you an example of our duplication on chromosome 16p13.1, which we have described as affecting ADHD,

203
00:34:24,633 --> 00:34:36,299
and actually, it does not only affect ADHD, this is what comes out of our phenotyping. We have shown that people with this CNV have a longer

204
00:34:36,300 --> 00:34:46,533
arm span than the normal population; about 8 cm longer than expected for height, and actually women with this duplication also have menarche

205
00:34:46,533 --> 00:34:56,599
somewhat later than women without it. What is interesting here is that Michael Phelps, the greatest swimmer of all time, he is a poster child

206
00:34:56,600 --> 00:35:05,300
for the association of people with ADHD. He also has an arm span that is exactly 8 cm longer than expected for his height, so we have found the

207
00:35:05,300 --> 00:35:18,366
Michael Phelps mutation. That brings me back to the brothers, the identical twins, who died within hours of each other from the same disease, and I

208
00:35:18,366 --> 00:35:29,132
insist that they did so because they have exactly the same genome. They did so not because they were members of this elite profession, and

209
00:35:29,133 --> 00:35:39,133
actually, my arguments come from the following work. We have done a lot of work on the genetics of nicotine addiction, all right? And the

210
00:35:39,133 --> 00:35:51,566
original variants we found were on chromosome 15 and when we had found it, we went and we looked at whether they affected the risk of

211
00:35:51,566 --> 00:35:57,666
various smoking-related diseases, and we showed that this variant has a very significant effect on the risk of lung cancer. In Iceland, no

212
00:35:57,666 --> 00:36:04,332
one really develops lung cancer unless they smoke—unless they smoke for decades; 14.5% of those who smoke for decades develop lung

213
00:36:04,333 --> 00:36:11,333
cancer. So, it’s a pure environmental disease, all right? However, you inherit the compulsion to seek the environment that causes the disease. So,

214
00:36:11,333 --> 00:36:21,999
where lies the line of distinction between nature and nurture? I’m absolutely convinced that this applies to almost every common complex

215
00:36:22,000 --> 00:36:30,966
disease. I can give you an example which is relatively simple. We took all of the variants in the sequence that affect BMI—the ones that we

216
00:36:30,966 --> 00:36:40,032
have discovered and the ones that others have discovered—and we sort of regressed them on various addiction phenotypes, and indeed, they

217
00:36:40,033 --> 00:36:51,133
have very, very large impact as a group on the risk of smoking quantity, smoking initiation, alcoholism, coffee consumption, opiate addiction,

218
00:36:51,133 --> 00:37:03,133
etc. But it is interesting that, if you look at individual variants, FTO, which has the largest impact on BMI of all of the BMI variants, has no impact on

219
00:37:03,133 --> 00:37:14,199
any of the addiction phenotypes, in spite of the fact that it is expressed widely in the brain. So basically, this is what I was going to tell you about

220
00:37:14,200 --> 00:37:33,266
the discoveries, but I want to emphasize that the question that comes up when you think about this in the context of clinical tests, is that it is

221
00:37:33,266 --> 00:37:43,232
absolutely amazing how much you can pull out and how much there is of these very rare variants—and remember, rare is synonymous with

222
00:37:43,233 --> 00:37:52,099
recent—and an awful lot of these are de novo mutations and when you’re designing ways to make discoveries, you have to include also, you

223
00:37:52,100 --> 00:38:00,400
can pull out these de novo mutations. You have to select your population, not just throw an enormous amount of money—$300 million—to

224
00:38:00,400 --> 00:38:08,866
sequence patients with Alzheimer’s disease because it’s silly. You select your population well, make sure that you understand the population

225
00:38:08,866 --> 00:38:18,899
structure, preferentially select a population where you have a founder effect, and where you can do a reasonable amount of imputation

226
00:38:18,900 --> 00:38:27,100
where you can do this more cheaply, more effectively, more quickly. But anyway, go back to this. In the discovery, you’re trying to figure out

227
00:38:27,100 --> 00:38:37,400
what characterizes the group. When it comes to delivery of healthcare you’re trying to figure out into which group people fit. The

228
00:38:37,400 --> 00:38:48,066
second aspect of this is complex when you begin to do the sequencing because we are talking in terms of, in many instances, de novo mutations,

229
00:38:48,066 --> 00:38:56,332
and there are going to be many of them private for many years to come. In the end, they will stop being private, but in the beginning they are

230
00:38:56,333 --> 00:39:04,066
private, so you will need pretty much sort of their [---] sight of what we have. One of the things we have at deCODE is that we have very, very good

231
00:39:04,066 --> 00:39:13,632
software systems into which we imparted our own analytical algorithm, and that’s a key to our success—our ability to manage the information

232
00:39:13,633 --> 00:39:24,233
and for us to manage all of the enormous amounts of information data coming out of the sequencing. But basically, in the delivery or use

233
00:39:24,233 --> 00:39:32,099
of sequencing in a clinical setting, you have to have exactly the same software systems. You have to be able to do the same thing, and we

234
00:39:32,100 --> 00:39:44,366
basically, you know, we have put together sort of an analysis platform that is independent of the sequencing technology being used, and basically

235
00:39:44,366 --> 00:39:52,399
we are absolutely convinced that the way in which we are approaching the research and the way in which we will be approaching the use of

236
00:39:52,400 --> 00:40:03,700
sequencing in clinical practice, these will converge, because when you’re putting together a test…let’s say that you put together a test to try to

237
00:40:03,700 --> 00:40:15,033
diagnose developmental disorder of children. Every time you do the test you have to compare the sequence you get to a reference sequence

238
00:40:15,033 --> 00:40:23,666
that will be the whole database of every test that you have previously done, because the test itself will not be the sequence. Sequencing is a commodity;

239
00:40:23,666 --> 00:40:36,866
it’s the ability to put the sequence in context. And we have, actually, put together an algorithm that we call GOR, which is basically an Icelandic

240
00:40:36,866 --> 00:40:47,966
word for the intestines of fish, and I don’t understand how we came to this exact word. I mean, the adjective “gory” is derived directly from that. But

241
00:40:47,966 --> 00:41:01,499
it’s very true, basically it is a system to analyze this on the basis of access patterns, not just on the basis of pure sequence. Then we

242
00:41:01,500 --> 00:41:14,233
have put a lot of emphasis on putting together [---] to allow us to present the data to clinicians as well as to patients themselves, and we have

243
00:41:14,233 --> 00:41:23,066
recently started the collaboration with the only tertiary care hospital in Iceland to try to figure out how to use those genome sequence data in the

244
00:41:23,066 --> 00:41:33,699
delivery of healthcare. This hospital has about 100,000 visits a year and we are going to impute sequence into the patients from the past 3 years—

245
00:41:33,700 --> 00:41:42,333
about 300,000 people. We have now tried to determine in what clinical settings the data would make the most difference, and at 1 and 2 it’s

246
00:41:42,333 --> 00:41:48,999
going to take us about 6-12 months, and that’s actually a generous time for that, and then we are going to sequence the genomes on

247
00:41:49,000 --> 00:41:55,066
newcomers who fit the criteria coming out of 2 for the following 2 years. So, this is going to take us all together 3 years, and I think this is going to

248
00:41:55,066 --> 00:42:09,932
be very exciting, terribly exciting, and I’ll tell you, this study is hopefully going to teach us how to use this data in the delivery of healthcare, but a

249
00:42:09,933 --> 00:42:21,733
side benefit is that it is going to generate a new tidal wave of discoveries, all right? And thank you for listening. This is all I have to say today.

250
00:42:28,566 --> 00:42:32,566
PAUL KIMMEL: Thank you so much.

251
00:42:37,700 --> 00:42:46,333
FEMALE: First of all, I certainly want to thank you for a most fascinating presentation. I have a couple of questions. First, if I followed correctly,

252
00:42:46,333 --> 00:43:01,399
one of the observations you’ve made about common variants is that they map to physiological function. First of all, is that a kind of a

253
00:43:01,400 --> 00:43:07,833
generalization that can be made? KÁRI STEFÁNSSON: What was the question? FEMALE: Common variants mapping to

254
00:43:07,833 --> 00:43:23,599
physiological function. And then, if I was following the line that the observation in your efforts to phase to do the haplotype phasing that

255
00:43:23,600 --> 00:43:40,200
the variant often is related or can be related to parental…to imprinting from the parent. Now my question is, I was just curious as to how often,

256
00:43:40,200 --> 00:43:56,933
particularly in terms of the common variant and in the imprinting from the parent, is that variant in a regulatory or a region? And that’s kind of the

257
00:43:56,933 --> 00:44:04,233
answer to that. KÁRI STEFÁNSSON: The common variants in general that we have discovered are outside of

258
00:44:04,233 --> 00:44:18,099
coding sequences. In a significant number of instances we have shown that the variants affect the expression of the gene they are close

259
00:44:18,100 --> 00:44:28,366
to. In many instances we haven’t been able to show that. Very rarely are these common variants in known regulatory sequences, but our

260
00:44:28,366 --> 00:44:41,466
assumption is that they are in the kinds of sequences that have yet to be defined. But in all of the instances where we have shown common

261
00:44:41,466 --> 00:44:54,266
variants to have a parent-of-origin effect, in all instances they have been outside of coding sequences, and also the type where the variant

262
00:44:54,266 --> 00:45:04,766
that affects Type II diabetes on chromosome 11, it is in the region where it’s reasonable to expect that you have some regulatory influence.

263
00:45:04,766 --> 00:45:17,132
FEMALE: Okay. Thank you. Then just to kind of follow up. I was truly fascinated with your comment about the genomic contribution to the

264
00:45:17,133 --> 00:45:37,866
environment, and given your statement about neurology and the observation of the thoughts and emotion center, I couldn’t help but wonder if,

265
00:45:37,866 --> 00:45:54,499
indeed, it’s worth beginning to consider how much of the genetics of our thinking or abstracting biology is related to the environment

266
00:45:54,500 --> 00:46:05,833
in which that genome is essentially telling us about. KÁRI STEFÁNSSON: You see, this becomes a

267
00:46:05,833 --> 00:46:16,366
very sensitive issue when you begin to talk about genetics of thoughts and emotions, but it is interesting, you know. The thoughts and emotions

268
00:46:16,366 --> 00:46:24,999
are the things that define us as individuals and as a species. We haven’t the faintest idea how the brain generates a thought or emotion, and even if

269
00:46:25,000 --> 00:46:33,566
I would ask this great audience to come up with a definition of a thought and I would give you about 15 years to do so, you would struggle, all right?

270
00:46:33,566 --> 00:46:42,032
Philosophers have tried to do this for centuries and have failed. One of the things you can do with genetics is not have any opinion on any of

271
00:46:42,033 --> 00:46:48,333
this, neither on the way in which the brain generates thought nor on how you should define it. The only thing you do is that you map out the

272
00:46:48,333 --> 00:46:57,799
diversity in cognition and then you go out and you try to figure out what are the variants in the sequence that drive this diversity, and we have

273
00:46:57,800 --> 00:47:05,733
started to do this. We have now done cognitive function phenotyping in about 10,000 Icelanders, and we have started to pull out variants in the

274
00:47:05,733 --> 00:47:16,066
sequence that affect this, all right? There is no question that we do not inherit thoughts. We inherit propensities to think in a certain way, and

275
00:47:16,066 --> 00:47:24,666
certainly, our environment—how we are brought up, the books we read, the music we listen to, and all of that—has great impact on who we are

276
00:47:24,666 --> 00:47:35,932
eventually. But it is awesome, when you begin to look at it, how important our choice of parents seems to be, all right? It’s almost like once you

277
00:47:35,933 --> 00:47:46,633
have chosen your parents, there is no other choice to be made. INGA PETER: Inga Peter, Mt. Sinai School of

278
00:47:46,633 --> 00:47:54,033
Medicine here. I have a question with regard to your sequencing study. You said that you sequenced about 2,000 individuals, but you were

279
00:47:54,033 --> 00:48:07,133
able to show data on the effect of rare variants on a long list of phenotypes. I was wondering how many individuals per phenotype you initially

280
00:48:07,133 --> 00:48:15,733
sequenced? KÁRI STEFÁNSSON: The reason we can do this is that we can impute these variants today into

281
00:48:15,733 --> 00:48:25,966
somewhere between 100,000 to 200,000 people. So, although we have only sequenced 2,000, in our discovery phase we can use imputed

282
00:48:25,966 --> 00:48:36,666
variants, and then every time before we publish the data we sequence that particular or associated variant so we do not base our final

283
00:48:36,666 --> 00:48:49,699
conclusion on the basis of imputation, but we use the imputation for the discovery purposes. In some of these we have a very large number of

284
00:48:49,700 --> 00:48:59,266
people we look at. In some of these we have very few people. For example, in the atrial fibrillation study I pointed out where we have only

285
00:48:59,266 --> 00:49:07,299
eight individuals who are homozygous for this particular mutation and all of them have atrial fibrillation, there we are only looking at eight

286
00:49:07,300 --> 00:49:15,633
individuals. So, when you are into something that is 100% penetrant, you can get away with a relatively small number of people.
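
[Editorial illustration, not part of the talk.] The point about 100% penetrance can be made concrete with a rough calculation: if a condition has population prevalence p, the chance that all of n unrelated people have it by coincidence is p to the power n. The sketch below assumes independence between individuals, which is a simplification in a founder population, and the prevalence values are hypothetical placeholders rather than figures from the study.

```python
def chance_all_affected(prevalence: float, carriers: int) -> float:
    """Probability that every one of `carriers` independent people is affected
    purely by chance, given a population prevalence between 0 and 1."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if carriers < 0:
        raise ValueError("carriers must be non-negative")
    return prevalence ** carriers


if __name__ == "__main__":
    # Hypothetical prevalences; the talk gives only the count of eight homozygotes.
    for p in (0.01, 0.05, 0.10):
        prob = chance_all_affected(p, 8)
        print(f"prevalence={p:.2f}: P(8 of 8 affected by chance) = {prob:.1e}")
```

Even with a generous assumed prevalence, the coincidence probability is tiny, which is why a handful of fully penetrant carriers can be persuasive.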

287
00:49:15,633 --> 00:49:22,966
INGA PETER: Thank you. ERWIN BOTTINGER: Erwin Bottinger, Mt. Sinai. Thank you for taking us 3 to 5 years into the

288
00:49:22,966 --> 00:49:32,099
future, and that’s a space where we are thinking about very intensely, and we are in Manhattan where we have a number of large medical

289
00:49:32,100 --> 00:49:42,833
centers and not always the best of friends. So, I’m interested and very deeply thinking about a scenario where we take the kind of information

290
00:49:42,833 --> 00:49:51,833
solutions, information management, and knowledge management solutions that you’ve so beautifully described and predict will affect

291
00:49:51,833 --> 00:50:02,333
medicine in the future. How do we take these solutions into a larger society that’s diverse and where there are a lot of competing interests and

292
00:50:02,333 --> 00:50:12,699
where the genome is static—the individual carries the genome—but also the information—medical information—is so far not travelling, and

293
00:50:12,700 --> 00:50:21,566
so that sets up a conflict between the medical institution and the patient going from one medical institution to medical institution, and I was just

294
00:50:21,566 --> 00:50:31,032
wondering what are your thoughts on the portability of this information? KÁRI STEFÁNSSON: You see, I’m not worried

295
00:50:31,033 --> 00:50:41,399
about you guys [laughter]…let me finish my sentence…because you guys have led the world in biomedical research, you have led the world in

296
00:50:41,400 --> 00:50:51,966
the way in which biomedical research is turned into delivery of healthcare. So, you will solve that. I think that, you see, there has been a lot of…we

297
00:50:51,966 --> 00:51:04,099
were the first ones in the world to launch a consumer genetic service. You know, 23andMe got the Time Magazine award for the invention of

298
00:51:04,100 --> 00:51:13,933
the year for having followed our suit, all right? That’s how you guys treat foreigners, and people have criticized, and there are even some states that

299
00:51:13,933 --> 00:51:22,433
have made it illegal to market genetic tests directly to consumers, but the consumer will become the key to the solution of the problem you’re talking

300
00:51:22,433 --> 00:51:32,833
about because once the information is managed by the patient or the individual about whom the information is, the institutions become sort of

301
00:51:32,833 --> 00:51:42,566
irrelevant. You just go to places with the information on yourself and that is, in many ways, in keeping with what has happened in

302
00:51:42,566 --> 00:51:50,932
healthcare over the past 10-15 years. There’s hardly a well-educated patient who comes to see a physician with a new problem without having

303
00:51:50,933 --> 00:52:03,533
downloaded six feet of papers from the Internet to be able to have a dialogue with his or her physician, and I think that people will want this

304
00:52:03,533 --> 00:52:12,533
information about themselves, they will have it, and therefore they will be able to take it from Mt. Sinai to Albert Einstein, you know, these two

305
00:52:12,533 --> 00:52:23,899
great institutions that have been at war with each other for 200 years. PAUL KIMMEL: Paul Kimmel. I’m not from Mt. Sinai,

306
00:52:23,900 --> 00:52:31,833
I’m from NIDDK and there is an institution, NYU in Manhattan as well, so I just… KÁRI STEFÁNSSON: I apologize. I was trying to

307
00:52:31,833 --> 00:52:38,366
be brief. PAUL KIMMEL: I’m just waving that flag. You painted a very complex interaction between

308
00:52:38,366 --> 00:52:48,366
domains of phenotypes, genotypes, environment, age, gender, and I just wonder, with your 20 years of experience in doing so much work in a

309
00:52:48,366 --> 00:52:59,699
very precise, homogenous population, can you give us your speculations regarding the generalizability of the results that you’ve put

310
00:52:59,700 --> 00:53:07,300
forth. You’ve talked about founder populations and it’s almost like Erwin’s question; we have a very diverse population in the United States. We

311
00:53:07,300 --> 00:53:16,233
know in kidney disease that there are completely dramatic differences between different populations in terms of genetic susceptibility and

312
00:53:16,233 --> 00:53:23,199
allele frequencies. So, can you speculate on this important question? I thought it was an important question.

313
00:53:23,200 --> 00:53:33,200
KÁRI STEFÁNSSON: Yes, I can speculate and I can give you a little bit of data. Number one, in spite of the way I look, Icelanders

314
00:53:33,200 --> 00:53:43,600
are a reasonably good [---] model for Homo sapiens, to begin with. So basically, everything that we have discovered in Iceland has been

315
00:53:43,600 --> 00:53:57,166
replicated elsewhere. So, the same biochemical pathway seems to be important in our population as in most other populations. There are allelic

316
00:53:57,166 --> 00:54:07,766
differences and some of them are very, very significant and some of them are transparent. It is easy to understand why they are different. Some

317
00:54:07,766 --> 00:54:18,799
of them are extremely difficult to figure out. One of the ones that is relatively easy to understand came out of our work on malignant melanoma

318
00:54:18,800 --> 00:54:26,100
when we started the work there. At the time when we started the work there was only one mutation known to predispose to malignant

319
00:54:26,100 --> 00:54:36,533
melanoma and that was a mutation in the melanocortin Type 1 receptor which conferred an odds ratio of malignant melanoma in Spain of 3, in

320
00:54:36,533 --> 00:54:46,999
Sweden of 2, but it had no impact on the risk of melanoma in Iceland. This happens to be a mutation that turns your hair red and your skin

321
00:54:47,000 --> 00:54:56,900
very white, and you become sensitive to sunlight. If you live in Spain it is difficult to avoid the sunlight. If you live in Sweden, you’re

322
00:54:56,900 --> 00:55:04,700
somewhere in-between. I can promise you, if you’re sensitive to sunlight, you can avoid sunlight in Iceland, and what is interesting there is

323
00:55:04,700 --> 00:55:18,533
that the frequency—allelic frequency—of this mutation in Spain is 6%, in Sweden 17%, in Iceland 26%. So, there you have an example of a

324
00:55:18,533 --> 00:55:29,899
variant with geographic difference in effect and probably a consequent geographic difference in frequency. Another example comes out of our work

325
00:55:29,900 --> 00:55:44,233
on atrial fibrillation. We found several common variants that affect atrial fibrillation, where we have an allelic frequency in Europeans that is

326
00:55:44,233 --> 00:55:58,866
about 20%, the allelic frequency in China is 60%, the odds ratio in the European population is about 1.75-1.8, but only 1.3 in the Chinese population.

327
00:55:58,866 --> 00:56:11,699
So, there is an example of variants that affect heart rhythm and have very different effects depending on whether you are of European or

328
00:56:11,700 --> 00:56:25,900
you are of Chinese descent. So, there is some…you see, man is now spread all over the world, all right? So, we are exposed to maximum

329
00:56:25,900 --> 00:56:36,633
diversity in environment which basically leads to the generation of maximum diversity both in our genome and our phenotype, but the environment

330
00:56:36,633 --> 00:56:46,199
takes a long time to have impact on us, so it is not inconsequential when you move a large population from one part of the world to another.

331
00:56:46,200 --> 00:56:55,166
A good example can be seen in the African-American population in this country. If you look at, for example, hypertension, if you go to rural

332
00:56:55,166 --> 00:57:09,432
Nigeria the prevalence of hypertension is about 15%, in urban Nigeria about 25%, the Caribbean about 30-40%, and in African-Americans up to

333
00:57:09,433 --> 00:57:18,666
50%. So, that’s an example of where you take people with a genetic background that fits well to a particular environment and then put them into a

334
00:57:18,666 --> 00:57:27,566
different environment. There are differences like this. There are differences that you have to be aware of. You have to keep your eyes open, but

335
00:57:27,566 --> 00:57:38,166
the beauty of using the founder population is that it gives you the gene that you then go to and then you can tally the total diversity of the sequence in

336
00:57:38,166 --> 00:57:51,799
the gene, and you look at that in the context of an outbred population. Another way to use that is like many of us use that BRCA2 example. They

337
00:57:51,800 --> 00:58:01,733
found a truncating mutation in Iceland and then they go and sequence the gene in people here and they call it a positive test when they find any

338
00:58:01,733 --> 00:58:11,166
truncating mutation for the same gene. There is no question about it that they cannot demonstrate that all of these mutations associate with the

339
00:58:11,166 --> 00:58:21,632
disease because they don’t have enough cases to do that. But they know that if you truncate the gene, if you prevent it from generating a normal

340
00:58:21,633 --> 00:58:30,266
protein, this leads to a risk of the disease. So, you will have to use some deductive reasoning in the way in which you apply the genetic test.
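
[Editorial illustration, not part of the talk.] The deductive step described here, treating any truncating mutation in a known disease gene as a positive result, can be sketched as a check for a premature in-frame stop codon. This is a minimal toy example under stated assumptions: the sequences are invented and are not BRCA2, the function names are hypothetical, and only the standard stop codons (TAA, TAG, TGA) are considered. Real variant annotation also handles splicing, transcripts, and frameshifts.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def first_stop_index(cds: str):
    """Return the 0-based codon index of the first in-frame stop codon,
    or None if the coding sequence contains no in-frame stop."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def is_truncating(ref_cds: str, alt_cds: str) -> bool:
    """True if the altered sequence hits an in-frame stop codon earlier
    than the reference sequence does (a premature stop)."""
    ref_stop = first_stop_index(ref_cds)
    alt_stop = first_stop_index(alt_cds)
    if alt_stop is None:
        return False
    return ref_stop is None or alt_stop < ref_stop


if __name__ == "__main__":
    reference = "ATGGCCTGGAAATAA"  # toy sequence: ATG GCC TGG AAA TAA
    nonsense = "ATGGCCTGAAAATAA"   # TGG -> TGA creates an early stop
    missense = "ATGGCCTGGAGATAA"   # AAA -> AGA changes one amino acid only
    print("nonsense variant truncating:", is_truncating(reference, nonsense))
    print("missense variant truncating:", is_truncating(reference, missense))
```

The design mirrors the argument in the talk: the classification rests on the predicted consequence for the protein, not on case-control evidence for each individual mutation.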




Date Last Updated: 9/18/2012
