Whole Genome Approaches to Complex Kidney Disease
February 11-12, 2012 Conference Videos

Data Annotation to Identify Actionable Variants
Ben Solomon, NHGRI

Video Transcript

1
00:00:00,000 --> 00:00:09,566
ANDREY SHAW: Okay, why don’t we get started? The last speaker in this session is Ben Solomon, who is a pediatric geneticist and staff

2
00:00:09,566 --> 00:00:22,566
clinician in the Medical Genetics branch of NHGRI. Ben studies the natural history and genetic causes of rare disease like VACTERL, which I

3
00:00:22,566 --> 00:00:29,466
think we are going to hear more about today, and holoprosencephaly. He’s been employing whole genome sequencing as a tool and the title of his

4
00:00:29,466 --> 00:00:37,599
talk is “Actionable Annotation of Genomic Data.” BEN SOLOMON: Thanks very much for having

5
00:00:37,600 --> 00:00:46,233
me, it’s an honor to be here and hopefully we will all make it out of here in one piece with the weather outside. This slide makes it look like

6
00:00:46,233 --> 00:00:52,399
things are going to be much more organized than they actually are but I’ve tried to loosely frame things into three separate sections. One, I am

7
00:00:52,400 --> 00:00:59,466
going to try to talk about some background and frameworks; ways to organize data. Second, I am going to try to give you some kind of raw

8
00:00:59,466 --> 00:01:06,866
sense of the numbers of what you find and the types of things that you find, and then I am going to try to hone in on some concrete examples

9
00:01:06,866 --> 00:01:13,366
because I think in talking about this stuff it is very easy to get hung up on the abstract, and it’s very helpful to illustrate some of the points by looking

10
00:01:13,366 --> 00:01:21,699
at some concrete examples of what you find and what you have to deal with. So, I often put this slide in talks I do on this topic. One term I

11
00:01:21,700 --> 00:01:28,133
myself—hopefully you guys won’t look like this about five minutes into the session—but also to point out that if you guys want to shout out

12
00:01:28,133 --> 00:01:35,466
questions or comments or throw coffee cups at me, you’re welcome to. I think this picture is actually taken across the street at the military

13
00:01:35,466 --> 00:01:41,932
med school at USUHS; I’m not positive about that but I think that’s true. You can confirm or deny this; this is a med school lecture. That’s what I

14
00:01:41,933 --> 00:01:50,366
have to go through. So, the topic of this talk is Incidental Medical Information, and obviously this isn’t a new thing right? If you’re doing a study on

15
00:01:50,366 --> 00:01:56,532
scoliosis and you’re doing chest x-rays and you see a blip on the lung, then that would be incidental information that you would probably

16
00:01:56,533 --> 00:02:04,899
have to deal with. In this context of next generation sequencing and this high throughput sequencing, or whatever you want to call it, is

17
00:02:04,900 --> 00:02:12,233
sometimes called “genomic risk information,” and as you can see from this article that was published a few years back in JAMA—I like this

18
00:02:12,233 --> 00:02:20,199
cute term, “The Incidentalome”—but you can see by the subtitle there that there’s a lot of concern about what this is going to do and what is going

19
00:02:20,200 --> 00:02:28,466
to happen to research using these methods. I think the idea is that we’re opening Pandora’s Box here, right? That we are opening up and we’re

20
00:02:28,466 --> 00:02:33,599
going to find a lot of things that we didn’t want to find and we’re not going to be able to deal with them in any rational way and it’s just going to be

21
00:02:33,600 --> 00:02:42,766
honestly a disaster. I think that we have to be careful and we have to be conscientious about it, but I think with careful thought and planning that

22
00:02:42,766 --> 00:02:50,132
this can be a good thing both for research projects and for study participants and for our patients. So, we have to do things very well but

23
00:02:50,133 --> 00:02:57,833
it’s not…my feeling, and this is up for debate, is that it’s not necessarily a Pandora’s Box. So, a lot of what I am going to talk about—just to give

24
00:02:57,833 --> 00:03:04,599
you guys some background—is in the context of this condition I studied, VACTERL association. I know nobody has heard of this or is interested in

25
00:03:04,600 --> 00:03:12,066
this, but just to let you know. This is a cluster of congenital malformations. By NHGRI standards, this isn’t rare at all, though it’s not a common

26
00:03:12,066 --> 00:03:19,499
disease. We don’t know much about causes, though we’ve had some good preliminary data using some of the exome methods, so I feel like

27
00:03:19,500 --> 00:03:28,000
we are cautiously optimistic that we’re starting to make inroads in some of this now. What I am going to talk about for most of this is the first

28
00:03:28,000 --> 00:03:35,266
case that we did exomes on because I’ve had the most chance to kind of look at this incidental or actionable medical information in this family and

29
00:03:35,266 --> 00:03:41,866
the most chance, I think more importantly, to think about it. So this was a set of monozygotic or identical twins and they were about one year old

30
00:03:41,866 --> 00:03:47,599
when they participated when they came to the NIH to take part in my study, and so, just keep that in the back of your mind: these are one-year-old

31
00:03:47,600 --> 00:03:55,033
kids that we’re finding this stuff on, this isn’t you or your parents, this is a one-year-old child, right? One had this condition; one did not have

32
00:03:55,033 --> 00:04:01,499
this condition. There is no family history, so we thought we’d do whole exome sequencing of the twin pair to see if there is any kind of discordant

33
00:04:01,500 --> 00:04:09,600
genetic explanations there. So, this is going to be a very dry slide here. I’m just kind of warning you that you can take some coffee here, but it’s going

34
00:04:09,600 --> 00:04:16,066
to be a little slow. I just want to give you a sense of what we do from start to finish with some of this actionable data. To start with, obviously, the

35
00:04:16,066 --> 00:04:25,966
patient takes part in my main protocol. This is a choice, but for us we select certain people who separately consent to go through some of this

36
00:04:25,966 --> 00:04:32,132
genomic sequencing. I know other protocols just blanket consent everybody to do, you know, everything from genome sequencing to looking at

37
00:04:32,133 --> 00:04:39,933
one SNP and everything in between, but we select certain people and they sign a separate consent. I’m not going to go over this too much

38
00:04:39,933 --> 00:04:46,966
because some of the other very nice talks before me talked about why this might be helpful or not, but basically we do an array first and then if we

39
00:04:46,966 --> 00:04:55,366
decide it’s a good idea we do whole exome sequencing. So, these first and second passes, because of Jamie’s really nice tool…and I’ll say

40
00:04:55,366 --> 00:05:01,299
I’m not just trying to make an ad for this—maybe I kind of am—but it’s very fun to use and for someone like me who is not originally a

41
00:05:01,300 --> 00:05:10,500
bioinformatician and still not and you’re not a computer programmer, it’s very fun to kind of get your hands into this stuff and it’s very exciting.

42
00:05:10,500 --> 00:05:18,566
The two things that we do with VarSifter quickly for this actionable data is we look through some of these databases to see what’s out there, and

43
00:05:18,566 --> 00:05:27,532
these are databases like Jamie mentioned, the Human Gene Mutation Database, dbSNP, and at the same time…this used to be a second step but

44
00:05:27,533 --> 00:05:34,233
I think Jamie puts out a new version of VarSifter about every week, so with these latest versions what’s really nice is that you at the same time can

45
00:05:34,233 --> 00:05:40,266
filter out artifacts, and that’s absolutely critical because you don’t want to be confirming and thinking about thousands of artifacts that aren’t

46
00:05:40,266 --> 00:05:48,199
really there, even if they seem like they might be disease-causing. This one’s in red and there’s going to be another one in red and my point here

47
00:05:48,200 --> 00:05:55,500
and one of the Take-Home points from this talk is that it’s really great and very important and necessary to automate things to a great extent,

48
00:05:55,500 --> 00:06:01,466
but there’s got to be some steps where you have to think about things where there has got to be some manual curation, and I don’t think that can or

49
00:06:01,466 --> 00:06:08,299
should ever go away. So, the first manual review is doing what we would all do when we see something new, is looking at OMIM, looking at

50
00:06:08,300 --> 00:06:16,100
PubMed, looking at what you can find online. Then, if we think this meets our criteria—and I’ll hold off on that because I am going into what

51
00:06:16,100 --> 00:06:24,166
those criteria are—we would go ahead and confirm those in our lab just using regular old Sanger sequencing, and then if there is still a

52
00:06:24,166 --> 00:06:32,066
question in my mind and in the mind of our team, then I individually contact—and this is another manual step—some experts on these disorders

53
00:06:32,066 --> 00:06:39,299
or on these genes. I know a few genes really well but I certainly don’t know all 20,000-30,000 genes and wouldn’t know what to do with certain

54
00:06:39,300 --> 00:06:45,800
variants in some of them, so it’s another manual step there. Then we actually have a formal working group where folks like Sarah Hull,

55
00:06:45,800 --> 00:06:55,400
bioethicists, genetic counselors, clinical and molecular geneticists go through some of these kinds of gray zone pieces of information. We

56
00:06:55,400 --> 00:07:03,100
CLIA confirm things, and that’s a requirement; we have to confirm things so you can return this data to the participants and we return it, and then I

57
00:07:03,100 --> 00:07:09,366
think in my mind if you return it to the participants you’re obligated to do some kind of follow-up and that follow-up might just be: let’s make sure your

58
00:07:09,366 --> 00:07:15,799
primary care doctor gets this. Or, it might be something in place that if you have to test relatives you might also have that, but you can’t

59
00:07:15,800 --> 00:07:22,766
just kind of give them the information and say, you know, “Best of luck with that mutation,” there’s got to be something in place there and that’s up to

60
00:07:22,766 --> 00:07:30,166
you. So, the Take-Home Message from this is that you have to start with a clearly defined, carefully formulated algorithm. Recently in the last few

61
00:07:30,166 --> 00:07:35,166
months I reviewed grants for some very good places and I won’t mention any of them because probably some of you guys are from some of

62
00:07:35,166 --> 00:07:41,266
these places and they have in these statements that, “We’re just going to manage these things, we have a team that will meet and we’ll talk about

63
00:07:41,266 --> 00:07:46,232
these things,” and I thought, great. You’ve got to have a team and you’ve got to meet and talk, but you also have to have an algorithm in place or

64
00:07:46,233 --> 00:07:52,866
else you’re going to look like…this is a cartoon version of what I look like with my coffee drip in front of my computer all night while my wife

65
00:07:52,866 --> 00:08:01,766
screams at me and wonders why I took this job. So here is the algorithm for my protocol. I’m not going to go through this; it’s summarized here.
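
The return criteria that the speaker goes on to list (urgency at the patient's age, an available intervention, a benefit from knowing early, a recessive-disorder bar of 1 in 40,000, participant opt-in, and strong gene and variant evidence) could be sketched roughly as below. This is only an illustration under stated assumptions: the `Finding` record, its field names, and the way the carrier rule is combined with the other checks are hypothetical and are not taken from the actual protocol.

```python
from dataclasses import dataclass

# Recessive-disorder bar quoted in the talk: disease incidence of 1 in 40,000.
RECESSIVE_INCIDENCE_BAR = 1 / 40_000


@dataclass
class Finding:
    """Hypothetical record for one candidate incidental finding."""
    gene_disease_evidence_strong: bool  # good evidence the gene is tied to human disease
    variant_high_confidence: bool       # variant type, or evidence on this exact variant
    urgent_at_current_age: bool         # needs to be known now, given the patient's age
    intervention_available: bool        # something can actually be done about it
    earlier_knowledge_helps: bool       # better to know before symptoms arise
    better_known_than_unknown: bool
    participant_opted_in: bool
    carrier_state: bool = False         # heterozygous for a recessive disorder
    disease_incidence: float = 0.0      # only consulted for carrier states


def meets_return_criteria(f: Finding) -> bool:
    """Return True if the finding passes every check in this sketch."""
    if not f.participant_opted_in:
        return False
    if not (f.gene_disease_evidence_strong and f.variant_high_confidence):
        return False
    if f.carrier_state:
        # Assumption: carrier findings are judged on the incidence bar alone.
        return f.disease_incidence >= RECESSIVE_INCIDENCE_BAR
    return (f.urgent_at_current_age and f.intervention_available
            and f.earlier_knowledge_helps and f.better_known_than_unknown)


# Illustrative incidences only; neither number comes from the talk.
common_carrier = Finding(True, True, False, False, False, False, True,
                         carrier_state=True, disease_incidence=1 / 3_000)
rare_carrier = Finding(True, True, False, False, False, False, True,
                       carrier_state=True, disease_incidence=1 / 1_000_000)
print(meets_return_criteria(common_carrier), meets_return_criteria(rare_carrier))
```

A checklist like this does not replace the manual review steps described earlier; it only makes explicit which questions get asked before a finding reaches the working group.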

66
00:08:01,766 --> 00:08:11,099
There’s two ways to look at how we decide what to return. First, this is kind of how actionable things are. I know that’s not

67
00:08:11,100 --> 00:08:18,466
necessarily a good word but that’s the word that I use in my mind, so, “actionable.” First of all, things must be of urgent clinical significance,

68
00:08:18,466 --> 00:08:24,766
meaning that you need to know it and you need to know about it NOW in the context of the age of the patient there. I know some protocols say

69
00:08:24,766 --> 00:08:30,866
they’re going to re-examine things as people age but you have to take into account the age of them, and you have to be able to do something

70
00:08:30,866 --> 00:08:37,432
about it. It shouldn’t be just, “Here’s your mutation, now go stress about it for 30 years until symptoms do or do not show up.” You have to

71
00:08:37,433 --> 00:08:43,733
have some kind of treatment or intervention that’s going to make a difference, and it should be better to know sooner from the genetic

72
00:08:43,733 --> 00:08:50,499
information than it should be later when symptoms arise. Knowing sooner has to have an advantage over if they were just to find out and

73
00:08:50,500 --> 00:08:57,900
go to the doctor on their own saying, “Oh, I have this or that.” Number four obviously doesn’t give you a lot of information but it should be better to

74
00:08:57,900 --> 00:09:04,233
know than not know. Then for recessive disorders… and this is a controversial point. I certainly know investigators who are much

75
00:09:04,233 --> 00:09:12,099
higher up the totem pole than I that say we should not return anything that’s recessive. We, from my protocol, happen to set the bar at 1 in 40,000. So,

76
00:09:12,100 --> 00:09:18,533
to give you a sense of what that means, if you do the math it means 1 in 100 people would be carriers for these. So, this would allow things like

77
00:09:18,533 --> 00:09:26,266
cystic fibrosis, phenylketonuria, sickle cell disease, but actually you wouldn’t return a lot of stuff that’s pretty common—for example,

78
00:09:26,266 --> 00:09:33,499
ascertained by newborn screening, okay? Then, an important point is that there has to be an OPT IN/OPT OUT. Some people are going to want to

79
00:09:33,500 --> 00:09:39,166
find out some of this stuff, some people are going to say, “Absolutely not, no matter what, I don’t want to know it,” and so there needs to be some

80
00:09:39,166 --> 00:09:47,732
participant or patient or study cohort decision making in that process as well, at least that’s what I think. So that’s kind of the actionable side

81
00:09:47,733 --> 00:09:53,833
of things in terms of the variant side of things, and I know there’s been a number of talks about this today, but you have to set the bar high. You

82
00:09:53,833 --> 00:10:00,833
have to have good evidence that the gene is associated with human disease. We all know that there are studies out there that are just—pardon

83
00:10:00,833 --> 00:10:04,899
my language, but—crap, right, about what’s determined with what. So, you have to have good evidence that the gene is associated with

84
00:10:04,900 --> 00:10:13,266
human disease. Then the variant you find, the nature of that variant, has to be such that either just by itself it predicts pathogenicity in a high

85
00:10:13,266 --> 00:10:21,699
confidence manner, like a nonsense mutation in a loss of function disease, or in the case of, say, a very polymorphic gene where you’re going to find

86
00:10:21,700 --> 00:10:28,833
a lot of variants that that exact same variant has really good evidence to be associated with human disease. And I think this is a nice article,

87
00:10:28,833 --> 00:10:34,833
so if… probably many of you aren’t interested in this topic, but for those of you who are and are going to be working on this, I would highly

88
00:10:34,833 --> 00:10:41,866
recommend this article as maybe the one thing you take away from my talk. This is mainly by these folks down at Chapel Hill that I work with

89
00:10:41,866 --> 00:10:50,466
on some of these questions. From this article, this is a very nice table that put things in tabular format, kind of what I just said in words, and

90
00:10:50,466 --> 00:10:58,866
what they do is this idea of “binning.” So, they bin the variants into different categories along the lines of what I was talking about. Here’s just

91
00:10:58,866 --> 00:11:07,232
another and I’m not going to go through all these little boxes, but just to point out that this is an algorithm that a group applied to a genome that

92
00:11:07,233 --> 00:11:12,666
they sequenced and they just kind of went through how they decided what was clinically relevant and not. So again, if you’re interested in

93
00:11:12,666 --> 00:11:19,299
this, another good article to read. It’s interesting, too, because this was two years ago and so it’s interesting to see how things have changed

94
00:11:19,300 --> 00:11:28,100
since then. So at this point, I just want to share a couple of examples. Maybe horror stories is an exaggeration; maybe it should be “Tales of

95
00:11:28,100 --> 00:11:34,266
Caution” or something, but just a couple things to drive home some of the points about, for example, what other speakers said before me

96
00:11:34,266 --> 00:11:41,599
about why the databases… why you need some manual looking at them, why you can’t just depend on looking at what’s on the computer. So

97
00:11:41,600 --> 00:11:48,800
in these twins, when I first sequenced…when Jamie and his team first sequenced them, I was first looking at the data. We saw quickly that they

98
00:11:48,800 --> 00:11:56,733
had…one of the first mutations I saw was this frame-shift mutation of PMS1, and in this semi-reputable journal Nature about 20 years ago, they

99
00:11:56,733 --> 00:12:05,266
published that mutations in this gene caused a common form of hereditary colon cancer. So I thought, “Oh gosh, I found this in two one-

100
00:12:05,266 --> 00:12:10,832
year-old kids.” There’s not much of a cancer history in this family but I don’t really focus on cancer histories, or there could have been—I

101
00:12:10,833 --> 00:12:18,399
could have missed it—and I was just really not excited about finding this. One of the genetic counselors that’s involved with me in this study,

102
00:12:18,400 --> 00:12:24,800
one of his main areas of research is actually colon cancer and other kinds of cancer, and I talked to him about this and he said, “Oh yeah,

103
00:12:24,800 --> 00:12:32,633
that’s very interesting, they publish this and you notice there’s not a lot else published on this gene,” and it turns out that in those families

104
00:12:32,633 --> 00:12:38,533
where they found this, they’re actually segregating mutations in a different gene but nobody really wants to publish that stuff quite as

105
00:12:38,533 --> 00:12:46,366
much; it’s not quite as exciting. Then, I guess the other post-script of this is that with some of the really new artifact filters—the really nice artifact

106
00:12:46,366 --> 00:12:52,899
filters, rather, that were added later—I could quickly see that what we found here was an artifact, so that saved some time and stress

107
00:12:52,900 --> 00:12:59,200
there. Another one, this one is one of the few examples that’s not from these twins. This is from a different family that I did exomes on recently.

108
00:12:59,200 --> 00:13:10,933
Within the last two months we found that this woman, who was right about my age—so, in her mid-30s—had a missense change in VHL, which

109
00:13:10,933 --> 00:13:17,733
is—you guys being renal folks probably know—Von Hippel-Lindau, this relatively horrible cancer syndrome. We had done exomes on other

110
00:13:17,733 --> 00:13:24,166
members of the family so we knew her dad and paternal relatives didn’t have it, but she had a couple of kids that could have had this and her

111
00:13:24,166 --> 00:13:30,366
mom who could have had it, her siblings could have had it, multiple maternal relatives, and if you look at these very nice databases, like the Human

112
00:13:30,366 --> 00:13:37,132
Gene Mutation Database, that exact variant was listed as “yep,” from this article. This is associated with this very horrible cancer

113
00:13:37,133 --> 00:13:42,833
syndrome; kind of a worst case scenario of what you could find in terms of incidental medical information. So, with one of the post-docs in the

114
00:13:42,833 --> 00:13:49,633
labs, we designed, sent off for some primers so we could confirm this, and then thought about how we were going to tell the family, and so on.

115
00:13:49,633 --> 00:13:56,033
And then, looking and designing primers and looking a little harder—and this takes a couple of hours, you don’t immediately find this stuff—it

116
00:13:56,033 --> 00:14:02,333
turns out that this exact variant, even though it was published here, is not associated with dominant Von Hippel-Lindau syndrome; this exact

117
00:14:02,333 --> 00:14:08,866
variant is associated with an allelic disorder, a recessive polycythemia disorder. So, this woman is a carrier for a rare disorder, it’s not going to

118
00:14:08,866 --> 00:14:17,732
affect her health, almost certainly not going to affect her kids’ health, but you could see how one could easily return this data if you didn’t take

119
00:14:17,733 --> 00:14:23,999
a little extra time, and you’re dooming this family because they are going to believe you. This was a very smart family but they are going to buy

120
00:14:24,000 --> 00:14:31,366
what you say—you’re the geneticist—if you had returned it to them. So, Take-Home Message 2 is that you’ve got to take everything out there

121
00:14:31,366 --> 00:14:37,899
with a grain of salt, even the stuff that’s in the best curated database; you’ve got to be really careful about it. And along those lines, my

122
00:14:37,900 --> 00:14:44,633
Take-Home Message 3 is that, because of some of these situations that have happened to me in trying to manage some of this data, I have

123
00:14:44,633 --> 00:14:52,799
evolved or devolved, or at least changed the way I look at things. So, I amended my protocol formally to actually make sure I was promising

124
00:14:52,800 --> 00:15:01,900
less to research participants in terms of this incidental medical information, because I would have hated to do something that I wasn’t totally

125
00:15:01,900 --> 00:15:07,033
sure about and violate the “do no harm” idea. So to give you a little sense about some raw numbers here, and the numbers are a little lower

126
00:15:07,033 --> 00:15:16,333
you’ll notice here than what Jamie quoted. These exomes were done a couple of years ago, so you get more variants now, if that’s correct,

127
00:15:16,333 --> 00:15:26,466
Jamie, if you do exomes. You can consider this kind of one person, right? These are monozygotic twins so we did both of them, but they’re

128
00:15:26,466 --> 00:15:32,832
monozygotic. We started with about 80,000 variants, you get rid of a lot of stuff that looks like bad data, you get down to about 65,000. If you

129
00:15:32,833 --> 00:15:41,299
look purely at the variant types—so, nonsense, missense where it looks like a bad quote-unquote “bad missense” and we can talk about that in a

130
00:15:41,300 --> 00:15:50,166
second—frame-shift, etc., you get about 8,000 potentially pathogenic variants. Of those, 400 are in some of these databases as known

131
00:15:50,166 --> 00:15:56,232
disease-associated variants. When I say disease-associated—maybe that’s a misnomer—maybe it should be health-associated, because

132
00:15:56,233 --> 00:16:02,333
they could be associated with a risk of asthma, a risk of diabetes, a risk of high blood-pressure. There are some that were risks for good things,

133
00:16:02,333 --> 00:16:10,233
like if you have this, you might have a better memory than the average person. Then, 32 that were in disease-associated genes, and by the

134
00:16:10,233 --> 00:16:16,899
nature of the variant, looked like they could be pathogenic. So, we applied our filter and we get down to about three that we thought might meet

135
00:16:16,900 --> 00:16:25,400
criteria for return of information. After talking about that in some of the groups it turned out that one of them didn’t meet criteria in talking to some

136
00:16:25,400 --> 00:16:31,433
experts about that, so we were down to two, and of those two, one I think is returnable—and I will show you what these are—and then one I’m

137
00:16:31,433 --> 00:16:39,433
honestly still on the fence and I don’t know if I’ll ever have… I don’t know if I’ll ever decide until the science is much better on this particular gene.
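
To make the scale of that narrowing concrete, the counts quoted above can be laid out as a filtering funnel. The sketch below only restates the speaker's approximate numbers. The stage labels are paraphrases, and treating each stage as a strict subset of the one before it is an assumption, since the talk does not say whether the 32 variants sit inside the 400.

```python
# Approximate counts quoted in the talk for one monozygotic twin pair.
# Assumption: each stage is treated as a subset of the previous one.
funnel = [
    ("all called variants", 80_000),
    ("after removing data that looks bad", 65_000),
    ("potentially pathogenic by variant type", 8_000),
    ("listed in databases as disease-associated", 400),
    ("in disease-associated genes and plausibly pathogenic", 32),
    ("thought to meet criteria for return", 3),
    ("remaining after expert and group review", 2),
    ("judged returnable", 1),
]


def funnel_report(stages):
    """Return (label, count, fraction of the starting count) for each stage."""
    start = stages[0][1]
    return [(label, count, count / start) for label, count in stages]


for label, count, fraction in funnel_report(funnel):
    print(f"{count:>7,}  {fraction:10.4%}  {label}")
```

Read this way, the automated filters do most of the narrowing, while the last steps from 32 down to 1 are the ones that depend on manual review.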

138
00:16:39,433 --> 00:16:49,333
Another way of thinking about things is that they break down to kind of three categories of variants. One is carrier states and a lot of those,

139
00:16:49,333 --> 00:16:54,699
depending on how your protocol is set up, you may or may not choose to manage that, and I think there are arguments either way, obviously.
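
For carrier states, the earlier remark that a disease incidence of 1 in 40,000 corresponds to about 1 in 100 people being carriers follows from the Hardy-Weinberg relationship. The sketch below shows that arithmetic. It assumes a single recessive disease allele in a randomly mating population, which is a simplification, and the function name is made up for illustration.

```python
import math


def carrier_frequency(disease_incidence: float) -> tuple[float, float]:
    """Hardy-Weinberg carrier frequency for a recessive disorder.

    Returns (exact, approximate), where exact is 2pq and approximate is 2q.
    Assumes one disease allele with frequency q and random mating.
    """
    q = math.sqrt(disease_incidence)  # incidence of affected individuals is q squared
    p = 1.0 - q
    return 2.0 * p * q, 2.0 * q


exact, approx = carrier_frequency(1 / 40_000)
print(f"allele frequency q is about 1 in {1 / math.sqrt(1 / 40_000):.0f}")
print(f"carrier frequency: exact {exact:.5f}, approximate {approx:.5f}")
```

With q at roughly 1 in 200, both the exact and the approximate carrier frequency come out close to 1 in 100, which matches the figure given in the talk.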

140
00:16:54,700 --> 00:17:02,700
The other is susceptibility variants and a lot of those we don’t return to folks, and I will kind of give some rationale for that, and then the

141
00:17:02,700 --> 00:17:09,866
interesting ones are the ones where they should have the disease, where you see something, whether you’re sequencing an infant or an

142
00:17:09,866 --> 00:17:17,032
elderly person where they should have the disease, and those are ones that we think about. So, here are some examples of things that are

143
00:17:17,033 --> 00:17:24,166
carrier states and I am a card-carrying, board certified pediatrician and clinical geneticist and some of these I haven’t even heard of, so these

144
00:17:24,166 --> 00:17:33,666
are very rare. To give you an example, we found that these twins were both carriers for this what looked like a bad pathogenic mutation in ORAI1. If

145
00:17:33,666 --> 00:17:42,132
you have two mutations, one from mom and one from dad, you have a pretty horrible immune deficiency. What you can see from this

146
00:17:42,133 --> 00:17:50,099
pedigree—I don’t know if you guys are familiar with pedigrees like this—this is very consanguineous, so a very inbred pedigree. This

147
00:17:50,100 --> 00:17:55,866
has only been described, this syndrome, in one consanguineous Turkish family, so it’s probably not something that most people would return to

148
00:17:55,866 --> 00:18:03,199
their research participants, unless of course they are members of this consanguineous Turkish family, but these guys were not Turkish; they are

149
00:18:03,200 --> 00:18:12,100
probably not going to be consanguineous, and so on. You get a lot of things that are these possible susceptibilities. I know it’s probably hard to read

150
00:18:12,100 --> 00:18:19,566
these but you get a lot of things from various GWAS studies or other types of studies. You’re getting a lot of things that say, oh, you’re at slightly

151
00:18:19,566 --> 00:18:31,466
higher risk for autoimmune disorders or infertility or having low bone density; you get a lot of this stuff. My feeling is, that unless the evidence is

152
00:18:31,466 --> 00:18:40,432
terrific and you have a clinical link to this, this should be weighted against returning. So for example, in our twins we found this variant in this

153
00:18:40,433 --> 00:18:46,899
gene that’s associated with a risk of schizophrenia, so having this variant raises the risk of schizophrenia, right? I’m going to make up

154
00:18:46,900 --> 00:18:53,033
these numbers and correct me; I’m sure I’m wrong. So, let’s say the risk of schizophrenia in the general population is—I don’t know—1 in 500,

155
00:18:53,033 --> 00:19:01,699
so maybe having this gives you a risk of 1 in 400. Again, I’m making up these numbers, but it’s not like it’s going from 1 in 500 to 1 in 2, right? With a

156
00:19:01,700 --> 00:19:12,133
lot of these when you look them up, the data is mixed, right? It hasn’t necessarily been validated, there are not follow-up studies, maybe it actually

157
00:19:12,133 --> 00:19:21,099
has to do with a different gene, and so you have to take these again with a big grain of salt. Some of the interesting ones are we find things that

158
00:19:21,100 --> 00:19:27,766
they say, when you find it, you should have the disease. I got a little cute with this slide. I should have eliminated some of these but I left some of

159
00:19:27,766 --> 00:19:34,732
them up because, when I looked through the artifact filter, a couple of these drop away really quickly, okay? But if you look at some of these,

160
00:19:34,733 --> 00:19:42,099
these are bad things, that neither these kids nor anybody in the family had, like neonatal/severe cataracts or congenital diabetes and mental

161
00:19:42,100 --> 00:19:50,966
retardation, cardiomyopathy and psychiatric disturbances. So they didn’t have these, but their exomes said that they should have these and

162
00:19:50,966 --> 00:19:56,199
these are very interesting, so there’s a lot of possibilities, right? One is that the gene isn’t associated with the disease. One is that this is

163
00:19:56,200 --> 00:20:02,833
not a pathogenic allele. One is that these people are going to develop this and they don’t have it yet, or it’s somehow much less penetrant. It’s

164
00:20:02,833 --> 00:20:10,633
very interesting what you find here, and then you go back and look at the papers and you think, wow, something doesn’t match up here. I left this

165
00:20:10,633 --> 00:20:17,933
slide on. When I give a similar talk to less medical folks I use this to explain it, and I am not going to harp on this point at all. You guys know that,

166
00:20:17,933 --> 00:20:25,233
basically, having the same amino acid substitution, one or two amino acids later could have a very different effect, so it’s very hard to

167
00:20:25,233 --> 00:20:34,299
extrapolate from a different variant to the next. There are tools that Jamie talked about that are very nice, and I think they’re very nice primarily

168
00:20:34,300 --> 00:20:41,633
for the primary research target—what is the cause of this rare condition—but in terms of this actionable secondary, or whatever you want to

169
00:20:41,633 --> 00:20:48,166
call it genomic risk information, maybe these shouldn’t be depended on quite as much. As Jamie says, these depend on things like

170
00:20:48,166 --> 00:20:55,599
conservation and functional motifs and so on. As you guys know, there’s also ones that are freely available on the Web, but again, these are great

171
00:20:55,600 --> 00:21:03,800
for kind of our primary research target, looking for candidate genes, but maybe for when you’re talking about actionable stuff these aren’t quite as

172
00:21:03,800 --> 00:21:13,133
useful here. I just wanted to run through—I’m not sure how I’m doing on time—a few concrete examples from our data just to kind of give you

173
00:21:13,133 --> 00:21:19,733
food for thought here. The Take-Home Message from this is that you can have, again, the best algorithm, the most carefully designed algorithm

174
00:21:19,733 --> 00:21:26,566
with the world’s most brilliant bioethicist and IRB members as we do, but they’re still going to permit gray areas and I think it’s good to have

175
00:21:26,566 --> 00:21:31,899
some of those gray areas because there are going to be some that are going to be completely up for discussion until our genome is well

176
00:21:31,900 --> 00:21:40,966
understood, which will be probably never. This is the one that I still don’t have an answer for and I know other groups looking at this variant have

177
00:21:40,966 --> 00:21:48,066
come up with different answers for this, but in any case, in these twins we found that they had this missense variation, that was predicted to be

178
00:21:48,066 --> 00:21:55,299
functionally significant by software, where you’re susceptible to malignant hyperthermia, hypokalemic periodic paralysis. Has anybody

179
00:21:55,300 --> 00:22:04,400
here worked on this gene or disorder by the way, just out of curiosity? It’s mostly—and people get into trouble here, not always but mostly—when

180
00:22:04,400 --> 00:22:11,233
they are exposed to certain anesthetics, but it can be a life-threatening horrible thing that’s treatable, that’s pre-treatable; that could make a

181
00:22:11,233 --> 00:22:18,333
difference. This variant that we found in our twins hasn’t been seen in controls. It also hasn’t been seen in the disease population. This is a

182
00:22:18,333 --> 00:22:25,366
very polymorphic gene, meaning it has a lot of changes in it, so it’s hard to know. There’s no family history of this, and I called the family. I

183
00:22:25,366 --> 00:22:34,099
didn’t spill the beans but I just went into a very careful history of anything related to this, and nothing really popped up. So I took the next step,

184
00:22:34,100 --> 00:22:41,566
and I think the point of this isn’t, “Look what I did,” but the point is that it does take time to do these things and so you have to think about that.

185
00:22:41,566 --> 00:22:49,032
Fortunately, there is a great national/international consortium on this disorder, so I talked to and had some very interesting e-mail exchanges with

186
00:22:49,033 --> 00:22:57,799
folks in New York, France, and Canada about this and we had lots of discussion and we ultimately felt that, given the family history, given the clinical

187
00:22:57,800 --> 00:23:06,400
history of these kids, given the nature of the variant, it’s unlikely to be disease-related, but this is, again, one of these things that honestly do

188
00:23:06,400 --> 00:23:13,366
keep me up at night sometimes, and this is one of the ones I think about. The Take-Home Message 5 from that is that you have to share the wealth,

189
00:23:13,366 --> 00:23:21,532
and that’s sharing the wealth both with some of these individual experts in some of these conditions as well as to kind of advertise

190
00:23:21,533 --> 00:23:29,966
ourselves: folks who are bioethicists and IRB members and clinical geneticists, and are kind of used to dealing with some of these questions.

191
00:23:29,966 --> 00:23:39,899
So, just to kind of end up, I thought this is a nice story of something that we found of a perhaps clinically actionable piece of information. So, we

192
00:23:39,900 --> 00:23:47,000
found they had this missense variation in this gene, in this enzyme, in this gene that encodes this enzyme in the urea cycle. And so, if you’re

193
00:23:47,000 --> 00:23:55,400
homozygous or compound heterozygous for mutations here, you have this horrible biochemical disorder and most folks die. If you live you need a

194
00:23:55,400 --> 00:24:02,466
liver transplant and you’re usually severely and neuro-cognitively impaired. So, the software predicts that this is a very deleterious change, and

195
00:24:02,466 --> 00:24:08,732
I talked to this very, very, very nice gentleman in Spain who was very sure and had some preliminary data modeling that said this is, indeed,

196
00:24:08,733 --> 00:24:19,933
a pathogenic allele. It’s very rare, this disorder, so less than 1 in 100,000, so unlikely that their kids could be affected, even though they are both

197
00:24:19,933 --> 00:24:25,399
carriers, but what’s interesting is, starting about 10 years ago, it was found that there’s very good data published in the New England Journal and

198
00:24:25,400 --> 00:24:34,033
other journals like that, that carriers of this can be at risk for pulmonary hypertension, especially as babies, right? What was interesting about

199
00:24:34,033 --> 00:24:41,266
these guys…so the kid who had VACTERL had a tracheoesophageal fistula—an abnormal connection between his trachea and his

200
00:24:41,266 --> 00:24:47,999
esophagus. His esophagus doesn’t go down to his stomach and so he had to have surgical repair on the first day of life to fix that. So, I

201
00:24:48,000 --> 00:24:58,066
remembered from my pediatric training, and I talked to some NICU folks about this. This tends to be not a big deal in these kids. They get their

202
00:24:58,066 --> 00:25:04,866
surgery, they hang out in the NICU for a couple weeks kind of learning to eat, learning to use their esophagus, and then they get discharged and

203
00:25:04,866 --> 00:25:10,932
they do fine, but this kid completely crashed after surgery, and I reviewed his record and talked to his parents and he just fell apart. He was in the

204
00:25:10,933 --> 00:25:17,733
NICU for several months and on all—for those of you who are clinicians—sorts of pressors and drips to keep this kid alive, and I’m surprised he

205
00:25:17,733 --> 00:25:25,966
made it out of there, and I’m surprised that he is doing well now. The reason that he fell apart was he had very severe pulmonary hypertension. So

206
00:25:25,966 --> 00:25:35,266
the idea here and what we are pretty confident happened, is that he had a hypomorphic allele in this gene and what that does is it drives down

207
00:25:35,266 --> 00:25:43,299
your citrulline and arginine, and that makes nitric oxide less available, and this is part of the Nobel Prize winning stuff, this system here but I think it

208
00:25:43,300 --> 00:25:51,900
was more related to Viagra than this. The thing that’s interesting is that having surgery when you look at the literature—and I didn’t know any of this

209
00:25:51,900 --> 00:25:58,033
ahead of time—is that it completely washes out, or some types of surgery of the type this kid gets—some approaches—wash out, your

210
00:25:58,033 --> 00:26:05,566
citrulline and arginine stores, and so you basically get severe pulmonary hypertension. This kid is never going to be a baby again but he

211
00:26:05,566 --> 00:26:15,666
has siblings and he’s going to have kids probably some day, and if any of them need surgery in the future as babies, this would be something that

212
00:26:15,666 --> 00:26:21,666
would be reasonable, I think, for them to know about because there are things that can be done during or before surgery to prevent this from

213
00:26:21,666 --> 00:26:29,832
happening. Dad was the carrier. This just shows this, that dad was the carrier and the other twin had it. I won’t go into this but the

214
00:26:29,833 --> 00:26:40,699
bottom line is that there are a lot of issues here and I think since I was invited kindly to speak here I’ll go ahead and give a little more of my two cents

215
00:26:40,700 --> 00:26:47,933
here which boils down to first, do no harm, and the point here is that you’re promising… if you’re telling your research participants, your patients,

216
00:26:47,933 --> 00:26:56,099
that you’re going to look through genomic data for this clinically relevant risk information, it takes a lot of time to do a thorough job and a lot of this

217
00:26:56,100 --> 00:27:03,400
stuff can be automated and I am working on an automated tool. I know lots of others are working on automated tools that will probably be much

218
00:27:03,400 --> 00:27:10,100
better than mine. So, things will get more automated, be faster, but are still going to take a lot of time, and it still should take a lot of time. It’s

219
00:27:10,100 --> 00:27:19,266
very easy, as some of the other speakers talked about, to miss potentially significant variants, and it’s equally easy to overcall benign variants. So, I

220
00:27:19,266 --> 00:27:29,332
think it’s a fine thing to do but if you’re going to do it, you want to design things very carefully beforehand and do it conscientiously and well.

221
00:27:29,333 --> 00:27:46,199
Thank you. I’m happy to take questions. FEMALE: So, I’ll ask a question. I’m over here,

222
00:27:46,200 --> 00:27:56,100
sorry. Thank you for your presentation. I’m sorry I missed the first few minutes, so I hope I am not overlooking anything. You know, there is a subset

223
00:27:56,100 --> 00:28:04,033
of people out there in the world—research participants—who are making this claim of, “That’s my data and if you’re going to know

224
00:28:04,033 --> 00:28:08,133
something about me, then I damn well get to know about it too.”

225
00:28:08,133 --> 00:28:11,299
BEN SOLOMON: Yeah. FEMALE: How do you think about that claim in

226
00:28:11,300 --> 00:28:13,800
light of some of your experience here with these patients?

227
00:28:13,800 --> 00:28:20,533
BEN SOLOMON: I don’t have a perfect answer. I think there’s arguments that could be made either way. I can tell you my opinion is that, as a trained

228
00:28:20,533 --> 00:28:30,733
clinical geneticist, I have a lot of trouble managing this data and I think, you know, just giving someone everything on a disc or giving someone

229
00:28:30,733 --> 00:28:38,333
everything to their primary care doctor, I think then you’re really opening up Pandora’s Box. I think this stuff… just like you wouldn’t give an MRI

230
00:28:38,333 --> 00:28:46,766
to a person who came in for a possibility of cancer; you’d interpret it for them and I think it’s very similar to that. I think the other problem that,

231
00:28:46,766 --> 00:28:55,099
maybe this will change, but still these days is that so much of the information you get has to be confirmed through other methods that if you’re

232
00:28:55,100 --> 00:29:01,600
giving it to them you’re not going to be able to separately confirm it. They’re not going to go set up a lab in their house and do Sanger sequencing

233
00:29:01,600 --> 00:29:07,566
to confirm the findings. I think for now we have to be the confirmers, as well.

234
00:29:07,566 --> 00:29:14,766
MALE: Just to be clear, when you talked about those “should’ve known” variants—I think there were 11 of them—they were previously

235
00:29:14,766 --> 00:29:21,866
published, those particular variants in the genes but not previously published, exactly.

236
00:29:21,866 --> 00:29:28,266
BEN SOLOMON: Exactly. So, those exact variants…great point. Thank you. So, those exact variants weren’t published, but when you look at

237
00:29:28,266 --> 00:29:38,332
the alleles, they had high…you would expect them to be pathogenic. These were things like nonsense or frame-shift, most of those and

238
00:29:38,333 --> 00:29:44,933
things like that. FEMALE: Thank you. That was a great talk. I

239
00:29:44,933 --> 00:29:58,033
wanted to see your two cents and raise it to four. Your last slide you said “if” you promised results like this to your participants it’s a lot of work, and

240
00:29:58,033 --> 00:30:05,599
you’ve demonstrated that very clearly and I’m aware of a lot of what’s been going on behind the scenes, but I’m curious what you think about

241
00:30:05,600 --> 00:30:16,533
that “if” and what your advice would be to investigators who aren’t equipped to do the kind of analysis that your group had been doing. Is it

242
00:30:16,533 --> 00:30:25,399
okay to say, “No, we’re not going to return these kinds of results”? What is your feeling about a sense of obligation to do this for investigators

243
00:30:25,400 --> 00:30:30,233
who are generating exomic data? BEN SOLOMON: I think it’s a fantastic question

244
00:30:30,233 --> 00:30:39,233
and a very hard question, obviously. To give you a little context I certainly know investigators who say that we should be only looking at the primary

245
00:30:39,233 --> 00:30:46,233
research target, it’s unethical to look at this other stuff because the reason people take part in our studies isn’t to find out about this other stuff, it’s

246
00:30:46,233 --> 00:30:58,199
to find out about why I have a renal disorder or why I have whatever I came to you in the first place for. My response now is different than it

247
00:30:58,200 --> 00:31:05,600
would have been two years ago, and now I honestly think that if you’re very clear about things, it’s okay not to promise to manage some of

248
00:31:05,600 --> 00:31:13,566
that data. But having said that, I still think you have to have an algorithm and a plan in place, because you can try not to see that data but if

249
00:31:13,566 --> 00:31:19,799
some post-doc in your lab comes to you and says, “This is so weird, I found this stop mutation in this BRCA…I’ve never heard of this BRCA1

250
00:31:19,800 --> 00:31:26,100
gene…what does that mean?” there has to be an algorithm still even if you’re not going to plan to return that to manage it, because these situations

251
00:31:26,100 --> 00:31:34,366
are still going to arise. No matter what you want to do you unfortunately can’t be an ostrich in the sand.

252
00:31:34,366 --> 00:31:42,399
MALE: I applaud you for being very thoughtful about these issues and suddenly tomorrow I am sure there is going to be more discussion, and

253
00:31:42,400 --> 00:31:54,633
there is going to be a breakout session about what we do in nephrology practice. Now, I can’t help but totally agree with you--everything that

254
00:31:54,633 --> 00:32:03,699
you said in the pediatric space—but I want to caution that when we move into the adult space, when we actually deal with the concept of

255
00:32:03,700 --> 00:32:13,800
autonomy, and it was mentioned before that people increasingly feel that their medical information, not just the genome, and certainly

256
00:32:13,800 --> 00:32:22,700
eventually also the genome sequence, belongs to them and they are going to take it wherever they want to take it, and we lose control; we, as those

257
00:32:22,700 --> 00:32:30,700
who have so far been sort of the people who’ve been sitting on the medical information on the genome. So, this is a trend that is happening and

258
00:32:30,700 --> 00:32:45,333
it’s not going to be reversed, but getting back to…I would like for you to comment on your perspective on this for the adult medicine space

259
00:32:45,333 --> 00:32:56,166
and most of the renal disease patients we are dealing with on adult space, and perhaps your reflection may be that the bar may not need to be

260
00:32:56,166 --> 00:33:04,999
that high, particularly considering that we’ve now done structured interviews with hundreds of patients and people in the community, and

261
00:33:05,000 --> 00:33:18,733
overwhelmingly after long discussion and so on, people really do want to know what’s in their genome. So, just before you close can you

262
00:33:18,733 --> 00:33:22,966
reflect on your bar for the adult space, if you could?

263
00:33:22,966 --> 00:33:30,932
BEN SOLOMON: I think it’s a fantastic point and these points about patient or participant autonomy I think have to be very carefully considered, and I

264
00:33:30,933 --> 00:33:42,066
absolutely agree that there’s a strong argument to be made that “it’s my genome, don’t hide it, give me what I need.” I don’t, again as I think I’ve said

265
00:33:42,066 --> 00:33:48,266
for every answer, I don’t have a perfect response but I think one thing that people are moving towards and groups are moving towards

266
00:33:48,266 --> 00:33:59,099
is that if you’re setting the bar a little lower, which I think is fine and there are certainly arguments for that, is that it’s very easy to cripple a

267
00:33:59,100 --> 00:34:06,966
research endeavor because all your resources suddenly get shunted towards dealing with things that you frankly don’t have an interest in or

268
00:34:06,966 --> 00:34:14,699
expertise in. I think that one of the models for managing some of this stuff is to, within institutions whether it’s NIH or the Broad

269
00:34:14,700 --> 00:34:23,700
Institute or CHOP or wherever else—to set up independent but collaborating centers who manage some of this data, and I think that might

270
00:34:23,700 --> 00:34:33,766
be some of the answers to being able to lower the bar to the point where people are more comfortable in participating. Thanks.




Date Last Updated: 9/18/2012

General Inquiries may be addressed to:
Office of Communications and Public Liaison
NIDDK, NIH
Building 31, Rm 9A06
31 Center Drive, MSC 2560
Bethesda, MD 20892-2560
USA
Phone: 301.496.3583