Whole Genome Approaches to Complex Kidney Disease
February 11-12, 2012 Conference Videos

NCBI Databases of Genetic Variants—What Is Available and How Is It Best Used?
Wendy Rubinstein, NIH/NLM/NCBI

Video Transcript

1
00:00:00,000 --> 00:00:10,500
WENDY RUBINSTEIN: Thanks for the invitation to speak. I kind of drew a funny picture here and it’s a little bit silly but I wondered if anybody could

2
00:00:10,500 --> 00:00:30,166
guess what that variant is? It’s really easy. Anyway, it’s sickle-cell disease. So, I’m going to talk about NCBI databases of genetic variants and

3
00:00:30,166 --> 00:00:39,466
try to keep this at the practical level. I wasn’t sure how to gauge who was actually going to attend here, so I’m making this more or less a “how do

4
00:00:39,466 --> 00:00:50,966
you navigate through the NCBI resources?” and at the end I’ll talk a bit more about the two resources that I’m really more involved with: the

5
00:00:50,966 --> 00:01:01,566
Genetic Testing Registry as well as ClinVar. So, I was asked, fortunately, to speak about a narrow slice of the genome, called exomes, to make life

6
00:01:01,566 --> 00:01:16,099
easier and that’s where a lot of the good stuff is. So, I’ll talk about how our variation data is organized among the NCBI resources. I will talk

7
00:01:16,100 --> 00:01:26,966
briefly about dbSNP and dbVar and let you know a little bit about dbMHC because I think it’s relevant to a kidney audience. I will be talking a bit more

8
00:01:26,966 --> 00:01:40,399
in-depth about dbGaP and how to find your way around, a resource called PheGenI, and then a little more in-depth about ClinVar and GTR. So this

9
00:01:40,400 --> 00:01:53,533
is your landing page at NCBI and it is, I think, not too hard to get through. The resources are organized over here and what I want to draw

10
00:01:53,533 --> 00:02:03,866
your attention to, if you’re interested in variation, are three sections: genetics in medicine; training and tutorials, which is really about everything;

11
00:02:03,866 --> 00:02:16,932
and variation. And so, if you go deeper into variation you’ll see all the set-up here is tabbed, so you’ll find everything you want on this page. If

12
00:02:16,933 --> 00:02:27,566
you want to scroll down all the way, the database is here; what you’d expect for variation, like ClinVar, dbVar, dbGaP, and GTR. But

13
00:02:27,566 --> 00:02:37,399
you can either go to the tab “how to” or you can go all the way down if you want to know more about how do you actually use this resource,

14
00:02:37,400 --> 00:02:45,000
how do you find human variations associated with a phenotype or a disease, in other words, clinical associations. And when you’re there

15
00:02:45,000 --> 00:02:56,133
you’re going to find pretty detailed instructions about how to use PubMed, OMIM, dbVar, PheGenI, and dbSNP, so I’m not going to take you

16
00:02:56,133 --> 00:03:09,099
through that in detail, I just want to point out how you get there. So, dbSNP is about short genetic variations. Actually, the logo’s been changed to

17
00:03:09,100 --> 00:03:18,966
try to reflect the fact that it represents a wide range of variation content and there’s been misunderstanding about the scope of this

18
00:03:18,966 --> 00:03:30,132
resource, so I want to go through that a bit. dbSNP isn’t limited to single nucleotide polymorphisms. It includes multiple small scale

19
00:03:30,133 --> 00:03:42,999
variations, so insertions-deletions, microsatellites and also non-polymorphic variants. So, there are common as well as rare variations in these

20
00:03:43,000 --> 00:03:54,200
genotypes and you can get the allele frequencies. And importantly, don’t go to dbSNP thinking that, you know, it’s all benign stuff. It

21
00:03:54,200 --> 00:04:05,833
does have clinically significant variations and you shouldn’t go to it assuming that this is sort of your baseline for benign polymorphisms. Just a couple

22
00:04:05,833 --> 00:04:18,033
of words about dbVar. It’s the database of genomic structural variation, defined as regions of about 1 kb or larger, and here I just want to make the point that

23
00:04:18,033 --> 00:04:30,333
there’s been, obviously, much more focus on shorter types of variation in disease resistance, susceptibility, common disease, but there is

24
00:04:30,333 --> 00:04:41,399
evidence that copy number variation may be important for common diseases, and here I’ve pulled out just one example of a strong

25
00:04:41,400 --> 00:04:57,333
association between a gene with a copy number variation and the risk of lupus with a decent P value. And I also want to mention dbMHC. This is

26
00:04:57,333 --> 00:05:06,899
about the major histocompatibility locus on chromosome 6 and it has a comprehensive list of the known WHO alleles that are used for

27
00:05:06,900 --> 00:05:15,766
transplantation. In fact, I’m told that the people who do tissue typing will go to this resource and compare their variants against it to use it, even

28
00:05:15,766 --> 00:05:26,899
though it’s sort of not really out there for clinical use but it’s used that way, and it has other HLA-related datasets. So, you can find interesting

29
00:05:26,900 --> 00:05:38,766
things like survival curves for how much HLA matching you’ve got in the transplantation set. What I really want to point out here is the MHC

30
00:05:38,766 --> 00:05:51,132
locus generally isn’t reported as part of GWAS studies, I think, because of its complexity, and it’s also not part of the 1000 Genomes Project data

31
00:05:51,133 --> 00:06:00,299
that’s given out. So, if you’re interested in this type of data, this is a resource that you might want to go to, particularly if you’re interested in

32
00:06:00,300 --> 00:06:11,000
autoimmunity and so forth, and you can also use this for baseline association studies. It’s got a lot of detailed data, sequence data, allele

33
00:06:11,000 --> 00:06:23,100
frequencies, it’s got anthropology maps and, you know, it’s nicely laid out. So I’ll spend a little more time talking about dbGaP. So, this is the database

34
00:06:23,100 --> 00:06:33,133
of genotypes and phenotypes and the phenotypes are clinical phenotypes you might measure but also include exposure variables,

35
00:06:33,133 --> 00:06:41,433
which you’ve been hearing about this morning, genotypes and sequences. The sequences are so-called brokered by the Sequence Read

36
00:06:41,433 --> 00:06:50,699
Archive and there are several exome sequencing studies I’ll mention here. Many study documents you can pore through and also analysis results

37
00:06:50,700 --> 00:07:03,800
that you can go through as well. So, I kind of collated slides from Mike Feolo here just to kind of give you a sense of the amount of data in dbGaP.

38
00:07:03,800 --> 00:07:14,933
This slide is actually just showing the number of variables here and it hasn’t been updated since the end of 2011, but just to give you the idea of

39
00:07:14,933 --> 00:07:27,533
lots of studies, lots of documents, participants and so forth, so a lot of data in here. And this contains information from GWAS from multi-study

40
00:07:27,533 --> 00:07:41,499
projects, the names of which are very familiar to you. This just gives you an idea of some of the projects, the studies that entailed, the documents,

41
00:07:41,500 --> 00:07:48,666
the variables, subjects, so again, lots of data. And here are more longitudinal cohorts that you’ve heard a bit about this morning. I think the

42
00:07:48,666 --> 00:07:59,732
speakers this afternoon will go into a lot more depth, so that’s not my intention, but these are large cohort studies used in several of these NIH

43
00:07:59,733 --> 00:08:10,199
programs that are doing different, let’s say, degrees of sequencing from pooled exome sequencing to focused medical sequencing to

44
00:08:10,200 --> 00:08:21,766
genome-wide SNPs, and you’ll hear more about GO ESP. I think you’ve seen a slide like this already this morning, but the Grand Opportunity

45
00:08:21,766 --> 00:08:31,332
would be the Exome Sequencing Project, with these participating groups, and these cohorts are being sequenced as part of GO ESP; they’re also

46
00:08:31,333 --> 00:08:42,233
part of GWAS studies funded by NHLBI and some of the other institutes. And I also want to point out, you know, so many of these studies have

47
00:08:42,233 --> 00:08:54,699
a heart name associated with them but they can certainly be mined for kidney types and renal types of phenotypes. The Framingham Heart

48
00:08:54,700 --> 00:09:06,100
Study has phenotypic and exposure variables that could apply to renal disease and I want to point out it also has expression data, and dbGaP

49
00:09:06,100 --> 00:09:15,233
has been designed so that if you have access to the study you basically get everything in it. If it’s got GWAS, if it’s got sequencing, expression, or

50
00:09:15,233 --> 00:09:28,233
any type of molecular data, you get that whole package once you have access to the study. Here I did, as you can see, a search on NIDDK.

51
00:09:28,233 --> 00:09:36,266
You can search on, you know, renal or any sort of phenotype that you want and see what pops out, and it didn’t exactly pop out this way. I

52
00:09:36,266 --> 00:09:44,166
basically took some studies and clipped them on the page, but this can give you an idea of how things are laid out. So, you can read across and

53
00:09:44,166 --> 00:09:51,899
see how many studies came out of your search, the number of variables associated with that—so it’s a huge number—the number of study

54
00:09:51,900 --> 00:10:02,833
documents, analyses, datasets, and the study. So, here’s some of your favorite terms: diabetic nephropathy, here’s more nephropathy in

55
00:10:02,833 --> 00:10:15,433
diabetes, and here’s…I think the FIND Study is in here someplace, and you know, there’s a lot else in here. You can also hover over these little sort

56
00:10:15,433 --> 00:10:23,966
of “Chiclets” and find out what you want to know about the variables—how many there are—you can click on it and that will take you to all the

57
00:10:23,966 --> 00:10:32,432
details about those variables and you can hopefully find what’s interesting to you. So, the document analyses, and then if it has SRA

58
00:10:32,433 --> 00:10:47,966
components. You can go directly to the study links, you can figure out the platform and so forth. This is a resource you may not have heard of. I’m

59
00:10:47,966 --> 00:10:58,266
sure you’ve all heard of dbGaP but this is called PheGenI which stands for Phenotype Genotype Integrator and this assembles results from GWAS

60
00:10:58,266 --> 00:11:11,199
studies based on NHGRI curation and dbGaP. So, if you go in, that’s your interface. It’s pretty simple. I already kind of plugged in “male urogenital

61
00:11:11,200 --> 00:11:20,100
diseases”—which isn’t probably the best terminology—but within that you can find kidney diseases or other kidney phenotypes, and here

62
00:11:20,100 --> 00:11:28,600
you’ll make selections about your genotype and whether you care about exons or you want to see about anything that comes back. So, you can

63
00:11:28,600 --> 00:11:39,166
query based on phenotype. Actually, you can also set your P value. You can say any P value or set a threshold, and here you can say what type

64
00:11:39,166 --> 00:11:49,899
of genotype you’re interested in or leave it blank. And so doing this type of search and putting in these terms and leaving the P value unspecified

65
00:11:49,900 --> 00:12:03,500
and hitting “search,” this happened to come up with two specific missense mutations that happened to be on the same

66
00:12:03,500 --> 00:12:11,133
chromosome—chromosome 2—and here you can do a lot of things. You can, you know, see the trait, the RS number, you can see the

67
00:12:11,133 --> 00:12:20,666
chromosomal location. They actually do have respectable P values. You can go straight to the PubMed document. You can see they’re both,

68
00:12:20,666 --> 00:12:32,332
actually, in the same paper, and then you can go in-depth on any of the genes or the SNPs and you can see the validation study and actually go

69
00:12:32,333 --> 00:12:45,199
into the study itself here. Or, if you click on the ideogram you can get sequence view here, you can hover over these little variant “Chiclets” and

70
00:12:45,200 --> 00:13:02,466
get, you know, the basics on the variant you’re looking at or you can go deeper. So, ClinVar is an interesting resource. It’s actually not out there yet

71
00:13:02,466 --> 00:13:10,432
but it will be later this year and it’s about representing these relationships between genotype, phenotype and what is the

72
00:13:10,433 --> 00:13:20,599
interpretation that goes with that; and that interpretation is based on the supporting evidence. So, what ClinVar does is it takes these

73
00:13:20,600 --> 00:13:31,866
structured observations and records them so that they can be aggregated and compared with each other because assertions can be different, and

74
00:13:31,866 --> 00:13:43,299
they can be searched and then over time they can be re-evaluated. So, ClinVar is essentially an archive. It is not an interpretation tool in and of

75
00:13:43,300 --> 00:13:52,300
itself, but you know, until we assemble all the data in one place on a variant, it’s hard to interpret anything. So, there are layers of

76
00:13:52,300 --> 00:14:07,366
assertions and confidence in any assertion is indicated as a range. It can be a single source making an assertion about a single variant, it can

77
00:14:07,366 --> 00:14:23,932
be an assembly of observations about a single variant, and it can also be a matter of a group coming together of experts and they are curating

78
00:14:23,933 --> 00:14:34,499
and making decisions about the level of evidence on a variant and then recording that, or it can go all the way up to a clinical practice guideline. For

79
00:14:34,500 --> 00:14:46,700
example, these are the 23 variants in the cystic fibrosis gene that you ought to be testing for based on expert opinion. The sources are

80
00:14:46,700 --> 00:14:56,266
acknowledged so there’s attribution, which has become quite important in the locus-specific database world. There’s gateways. You can find

81
00:14:56,266 --> 00:15:08,232
your way directly to publications and the external databases. There’s a strong effort to use terminology that’s consistent with community

82
00:15:08,233 --> 00:15:18,266
standards and that’s no small challenge. The data are available. It’s unrestricted availability. You can download, integrate with external databases and

83
00:15:18,266 --> 00:15:25,966
coming out later this year and then, I think what’s also important, is the resource that I’ll tell you a bit more about, the Genetic Testing Registry. It’s very

84
00:15:25,966 --> 00:15:35,699
closely aligned with ClinVar, and functionally what that means is that if you are perhaps selecting a test based on what you’re looking for

85
00:15:35,700 --> 00:15:43,600
but you come up with a result, you could potentially go to a resource like this and at least find out what’s known about this variant of

86
00:15:43,600 --> 00:15:55,833
uncertain significance by looking in one place. So, this is a little snippet of the home page of the Genetic Testing Registry. This is going to be

87
00:15:55,833 --> 00:16:09,199
publicly announced later this month. So, I think you’ve heard the description of it. There’s a, I believe, a need to assemble a sort of transparent

88
00:16:09,200 --> 00:16:17,666
data about the genetic tests that are out there. It’s been a bit of a black box about what tests you’re actually ordering and what you get when you get

89
00:16:17,666 --> 00:16:27,799
it back or why you should order it. So, this is laid out in a tabular format. You can search any term across all the Genetic Testing Registry or you can

90
00:16:27,800 --> 00:16:36,466
focus it on tests, conditions, and phenotypes, which is actually where you would find pharmacogenetic tests or you can look for genes

91
00:16:36,466 --> 00:16:50,732
or labs, and then we’ve put GeneReviews in here as well. So, the call for a test registry really accumulated over the course of time. There are

92
00:16:50,733 --> 00:17:01,699
about…it’s estimated there are about 2,000 genetic conditions, according to the GeneTests registry, for which tests exist. There’s actually

93
00:17:01,700 --> 00:17:11,000
not a clear answer about how many tests are actually out there, although we estimate there’s somewhere between 7,000-8,000 genetic tests

94
00:17:11,000 --> 00:17:20,233
out there, but nobody really knows, and there’s actually no single source of information about these tests. The registry that exists is really

95
00:17:20,233 --> 00:17:32,433
about diseases and labs for which there are tests mostly about the diseases. So, in 2008 the Secretary’s Advisory Committee on Genetics,

96
00:17:32,433 --> 00:17:42,766
Health, and Society came out with a report which, among other things, recommended that a test registry be created by HHS to increase the

97
00:17:42,766 --> 00:17:51,499
transparency of genetic testing. So, some time was spent thinking about where that ought to be housed and a lot of debate about whether it

98
00:17:51,500 --> 00:18:05,566
should be a mandatory registry or voluntary registry. It is a voluntary registry and NIH was tasked with taking this on, which fits in one way

99
00:18:05,566 --> 00:18:19,099
because it’s not a regulatory agency and also NCBI is, I think, well capable of managing the type of data in here; and it wasn’t just the Secretary’s

100
00:18:19,100 --> 00:18:26,466
Advisory Committee, it was also other policy and advocacy groups that called for a registry and actually many of these also wanted it to be

101
00:18:26,466 --> 00:18:41,299
mandatory. The background here, of course, is that genomics is becoming clinical and we need a database that’s anchored on tests to know what

102
00:18:41,300 --> 00:18:51,333
we’re getting and not just diseases, and the database structure must be able to accommodate complex information, and that means information

103
00:18:51,333 --> 00:19:00,233
in arrays and information in whole genome and whole exome tests so that when these doctors are scratching their head at a sequence—and

104
00:19:00,233 --> 00:19:12,366
hopefully there’s some genetic counselors to help them out—they can figure out what it all means. So, as I said, NIH responded to this in order to

105
00:19:12,366 --> 00:19:21,266
encourage providers of genetic tests to enhance transparency, share some of the information—nonproprietary information—about the availability

106
00:19:21,266 --> 00:19:30,766
of these tests—what’s the scientific basis—and it also is to provide an information resource for health care providers as well as researchers and

107
00:19:30,766 --> 00:19:42,766
patients to find laboratories that offer particular tests and bolster research and discovery. This is a phased approach; we can’t do everything at

108
00:19:42,766 --> 00:19:52,832
once. So, in the initial phase it will include single-gene tests for heritable mutations, including pharmacogenetic tests—which is currently not a

109
00:19:52,833 --> 00:20:02,666
type of test that’s contained in the GeneTests Laboratory Directory—but in subsequent phases, it will expand to include somatic mutations, for

110
00:20:02,666 --> 00:20:10,199
example, for solid tumors and hematologic malignancies. It will include direct-to-consumer tests because the desire is to increase

111
00:20:10,200 --> 00:20:19,666
transparency there as well, and it will include whole exome and whole genome sequencing so that we can accommodate that type of complex

112
00:20:19,666 --> 00:20:31,932
data. So, to sort of take you through a very quick tour—I’m not showing you, sort of, all the pathways in GTR—but if you were to be either

113
00:20:31,933 --> 00:20:42,999
on the home page or right here and you put in “kidney” and you were searching here in all GTR, this is what would pop out. You would find that

114
00:20:43,000 --> 00:20:56,633
there are 81 tests which in some way or form have the word “kidney” associated with them, and 44 conditions, 153 genes, and 41

115
00:20:56,633 --> 00:21:08,933
laboratories that use this term. And the way this page is laid out is that you’ll see the first of these hits and you can dig further if you want, but you’ll

116
00:21:08,933 --> 00:21:19,266
see the tests and conditions and phenotypes and down where I’m not showing you, the genes and laboratories and you can decide where to go.

117
00:21:19,266 --> 00:21:30,866
And also, when you first do this kind of search and you’re deciding to look at a condition phenotype—in this case, kidney—the default that

118
00:21:30,866 --> 00:21:41,699
we’ve set is actually to check off hits where there is actually a test available in the Genetic Testing Registry, but you can unclick that and

119
00:21:41,700 --> 00:21:48,833
you’ll be able to find more things that, you know, might be in OMIM or you can look at things where there’s only GeneReviews because you want to

120
00:21:48,833 --> 00:22:02,533
know what’s written about it. You can also toggle back and forth between seeing a detailed description—well, a short summary—of each of

121
00:22:02,533 --> 00:22:13,766
these conditions, and instead look at more of a hierarchy effect, and if you do that you’ll see all of the results laid out. There’s some hierarchical

122
00:22:13,766 --> 00:22:23,132
relationships here and you can click and find whether there’s a…you can decide if you want to look based on if there’s a clinical test or

123
00:22:23,133 --> 00:22:33,399
whether it’s a research test—which none of these are—and you can see if there’s an OMIM record or GeneReviews available; and of course,

124
00:22:33,400 --> 00:22:47,433
these are all active links to the conditions. Or, you can also look at the test view of things. These are actually all sort of named the same until

125
00:22:47,433 --> 00:22:55,833
people come in and they fully register a Genetic Testing Registry test, but this is neat information that we have now based on what’s in GeneTests.

126
00:22:55,833 --> 00:23:06,533
These are all, in theory, distinct tests, and we also have, for when it’s available, we have a methodology laid out that you can go to another

127
00:23:06,533 --> 00:23:14,833
view and lay out all of the methodologies kind of in a grid format. But here I want to point out is that you can narrow your condition down by

128
00:23:14,833 --> 00:23:23,633
selecting it and then you can actually compare these labs, one by one, for what they offer if they’re offering for that condition. You can find

129
00:23:23,633 --> 00:23:33,866
research tests if they’re available and sort based on that, and then you can also look at test methodology alone if you’re a genetic counselor

130
00:23:33,866 --> 00:23:44,566
or physician and you’re going to order a test but you know the next methodology you need is such-and-such, you can search based on that.

131
00:23:44,566 --> 00:23:58,699
And here I just put in “renal” and, you know, you get a different set of results here and it sort of looks like that. So, I think I went through that

132
00:23:58,700 --> 00:24:08,133
pretty quick. I just wanted you to have some contact information: Steve Sherry; Mike Feolo; Donna Maglott; and Deanna Church, who really

133
00:24:08,133 --> 00:24:16,766
managed most of these resources I’ve been telling you about. I encourage you to go to the home page and troll your way around, or if you

134
00:24:16,766 --> 00:24:29,166
have an interest in GTR, to call me. Okay. Thank you very much.
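
[Editor's illustration, not part of the spoken talk.] The keyword searches walked through above (for example, typing "kidney" into GTR) can in principle also be scripted against NCBI's public E-utilities ESearch service. The sketch below only builds the request URLs and makes no network call. The Entrez database codes in `ENTREZ_DBS` and the helper name `build_search_url` are assumptions made for illustration; they were not named in the talk, and result counts today will differ from the 2012 figures quoted by the speaker.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed Entrez database codes for the resources named in the talk.
ENTREZ_DBS = {
    "dbSNP": "snp",
    "dbVar": "dbvar",
    "dbGaP": "gap",
    "ClinVar": "clinvar",
    "GTR": "gtr",
}


def build_search_url(resource: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for a keyword search in one resource (hypothetical helper)."""
    if resource not in ENTREZ_DBS:
        raise ValueError(f"unknown resource: {resource}")
    params = {
        "db": ENTREZ_DBS[resource],  # which Entrez database to search
        "term": term,                # free-text query, e.g. "kidney"
        "retmode": "json",           # ask for a JSON response
        "retmax": retmax,            # maximum number of record IDs to return
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Print one search URL per resource for the same keyword.
    for name in ENTREZ_DBS:
        print(name, build_search_url(name, "kidney"))
```

Opening one of the printed URLs in a browser, or fetching it with any HTTP client, should return the matching record identifiers, which is roughly the scripted counterpart of the "search all GTR" box shown on the slide.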

135
00:24:29,166 --> 00:24:40,666
FEMALE: Thank you for that talk. It gives me real concern, however, in terms of the direction where the science is going in terms of using

136
00:24:40,666 --> 00:24:54,232
information gleaned from the database searches, and then when you get to what you might consider a candidate gene that ties now to the

137
00:24:54,233 --> 00:25:05,499
person in terms of the clinical—when you go to the clinical you’re talking about the real personalized aspect of this—and a statement that

138
00:25:05,500 --> 00:25:19,033
you made that begin to mine the data to even describe sentiment feel, all the kind of phenotypic information that would then be applied to the

139
00:25:19,033 --> 00:25:34,433
individual. My concern is, we recognize right now that what’s in the database that you’re gleaning from is largely dependent upon study design,

140
00:25:34,433 --> 00:25:47,899
population use and what kind. So my question first of all, in this rush to justify, if you will, the investment in genomics from a public health or

141
00:25:47,900 --> 00:26:03,233
clinical side, are we planning to put a Surgeon General’s warning on recommendations that this data is relevant to individuals that meet this kind of

142
00:26:03,233 --> 00:26:17,033
criteria in terms of generalization in a clinical context? And if I might, just another real concern about the whole move to electronic databases

143
00:26:17,033 --> 00:26:26,566
that will eventually then become the resource from which we define diseases, define treatments, and what have you. I’m very much

144
00:26:26,566 --> 00:26:35,999
concerned that, once again, the MHC and HLA may be a microcosm of what we’re going to deal with from a whole genome perspective in terms

145
00:26:36,000 --> 00:26:48,700
of, to the extent that information on a given patient population is not in the database, particularly as we look at these common variants

146
00:26:48,700 --> 00:27:02,733
and find relationship between regulation and expression of the disease, to the extent that we don’t even know how the regulatory genes

147
00:27:02,733 --> 00:27:15,666
impact the system that’s tied to the exon that we would put the emphasis on testing, so to speak. I mean, is there a task force at least in place to

148
00:27:15,666 --> 00:27:28,366
begin to be sensitive to the extent to which data is missing from these very rich databases on, for example, from Stefánsson’s talk this morning on

149
00:27:28,366 --> 00:27:40,799
how individual’s thinking and emotions and how these impact regulation of the system of these exons? If that’s not in the database to put into

150
00:27:40,800 --> 00:27:54,500
your model, the predictions that we make…are we at least making some probabilistic estimate of how applicable it is in personalized medicine that

151
00:27:54,500 --> 00:28:08,400
deals with the individual? I just wanted to believe that there is thought going into deficiencies in the source that’s going to impact application to the

152
00:28:08,400 --> 00:28:13,000
individual.
WENDY RUBINSTEIN: Yeah. Well, there’s a lot in

153
00:28:13,000 --> 00:28:24,266
that question, but let me try to unpack it a little bit. So, let me say first of all, there are a number…if you’re talking about the clinical realm, which I

154
00:28:24,266 --> 00:28:34,999
think you are, there are many, many tests out there and there is relatively little known or reported about what those tests are actually

155
00:28:35,000 --> 00:28:47,300
doing and yet they’re used in the clinical realm every day, and the attempt…and I showed you many databases here, so they may be sort of

156
00:28:47,300 --> 00:28:56,233
melding into each other a bit, but the Genetic Testing Registry is definitely at the clinical edge of this. The others are really much more in the

157
00:28:56,233 --> 00:29:04,666
research realm and I don’t think doctors or patients are going to really find them or use them. But, the Genetic Testing Registry is really an

158
00:29:04,666 --> 00:29:18,132
attempt to bring to light or really, if you will, demand from individuals or organizations that offer tests: what does this test actually do and

159
00:29:18,133 --> 00:29:29,899
can you substantiate the claims that you’re making? Initially, these tests are really focused on sort of traditional genetic diseases where I

160
00:29:29,900 --> 00:29:38,900
recognize even that if it’s still the same thing, that if you do a test in one population versus another you can get pretty different answers. So, there’s

161
00:29:38,900 --> 00:29:52,500
a request for the providers of these tests to describe, for example, analytical validity, clinical validity, and clinical utility. The analytical validity,

162
00:29:52,500 --> 00:30:02,433
after much debate is a minimal field; it’s required to register a test. We’d also like to have clinical validity, which I think is the point you’re trying to

163
00:30:02,433 --> 00:30:12,566
make, and that’s where the sensitivity in different populations really comes out, and that pertains to: how should you really use this test? Is it

164
00:30:12,566 --> 00:30:19,766
appropriate for that person that you have in front of you or not? So, we’re somewhat reliant on the providers of these tests to give us that

165
00:30:19,766 --> 00:30:33,832
information. We think that by putting up a registry where the information’s assembled in one place it will hopefully even the playing field a bit. And so,

166
00:30:33,833 --> 00:30:42,366
if certain manufacturers have their test up and you can read about it and you like the depth of information in it, you’re probably more likely to

167
00:30:42,366 --> 00:30:51,166
order that test, whereas if there’s another manufacturer that’s not being so clear about what they’re doing, you remain with questions

168
00:30:51,166 --> 00:30:59,566
about it and you’re probably not going to go there. And then I think also, over the course of time, there will be more pressure on test providers

169
00:30:59,566 --> 00:31:09,966
who are not giving out the details that their competitors are and, over time, that will mean that there will be more and more willing or there will

170
00:31:09,966 --> 00:31:19,032
be more and more pressure to fill in that information. I brought up this screen to show…I’m not sure if you can read it from there…but there’s

171
00:31:19,033 --> 00:31:32,666
a disclaimer. There’s certainly concern on the part of NIH and its leadership about how this registry will be used. Until now, most of the people looking

172
00:31:32,666 --> 00:31:41,799
up genetic tests have been genetics professionals, and I think more and more with so-called “genomic medicine”—I saw this in the

173
00:31:41,800 --> 00:31:51,333
BRCA world—physicians are canvassed by a testing company and given kits and they do the test themselves and they don’t necessarily have

174
00:31:51,333 --> 00:32:02,533
all of the knowledge base to use that effectively, but those physicians will come to this resource and we want them to come to it because this is

175
00:32:02,533 --> 00:32:11,066
an information resource. And so, this disclaimer says: NIH does not independently verify information submitted to the Genetic Testing

176
00:32:11,066 --> 00:32:20,532
Registry. It relies on submitters to provide information that is accurate and not misleading. NIH makes no endorsement of tests or labs. It’s

177
00:32:20,533 --> 00:32:27,533
not a substitute for medical advice. Patients who have specific questions should contact a health care provider; the sort of things you’d expect in a

178
00:32:27,533 --> 00:32:36,733
disclaimer. Whether everybody’s going to behave this way, I don’t know, but it’s our job to make sure that the information and the way it’s

179
00:32:36,733 --> 00:32:48,399
presented is not only understandable to genetics professionals, but to a wide variety of users, including patients and general

180
00:32:48,400 --> 00:32:56,400
physicians. So, I hope that partially answered your question.

181
00:32:56,400 --> 00:33:08,466
MALE: Thank you. That was a difficult question and I think you gave a good answer. So, is there any thought going on within HHS about taking the

182
00:33:08,466 --> 00:33:15,466
next step—which would be a huge step—in trying to provide some sort of a vetting process for at least some of this huge number of

183
00:33:15,466 --> 00:33:25,066
variants; something that might be analogous to the U.S. Preventive Services Task Force. But of course, it would have to be many, many task forces and it

184
00:33:25,066 --> 00:33:32,532
might be shared with individual professional societies, so nephrology could be doing the genes and the variants that are important for

185
00:33:32,533 --> 00:33:39,399
them and for us and so forth, through the different fields, realizing we can never be complete, we can never keep up with the

186
00:33:39,400 --> 00:33:51,333
waterfall that’s delivering new variants, but somebody—some neutral party less interested economically in the other test—would call in or

187
00:33:51,333 --> 00:33:58,866
would weigh in and say “these are variants that you should pay attention to and the level of evidence is a, b, and c” and so forth. So, I guess

188
00:33:58,866 --> 00:34:01,932
it’s a simple question: is there any talk about that?

189
00:34:01,933 --> 00:34:11,266
WENDY RUBINSTEIN: That’s a great question. So, yes, there’s talk about that. It’s very difficult. So, I think I’d answer that on two levels. One is the

190
00:34:11,266 --> 00:34:22,066
effort to assemble information about variants in one place, so that you can start to do curation—and, you know, ClinVar, that is its aim—and the idea is

191
00:34:22,066 --> 00:34:32,266
that there would be experts—content experts—from renal diseases and, you know, ad infinitum, who will create what they’re literally calling

192
00:34:32,266 --> 00:34:42,799
mutacircles and will sit around and assess the evidence for unique variants and then come to an understanding at that point in time and make an

193
00:34:42,800 --> 00:34:54,666
assertion and then that can be displayed in ClinVar and it can also be linked to the Genetic Testing Registry or used, you know, outwardly.

194
00:34:54,666 --> 00:35:05,266
There’s also…because the Genetic Testing Registry is putting information out there about tests and there’s the concern about how they’ll

195
00:35:05,266 --> 00:35:13,199
be used, especially when you come to the vast amount of genome data and the vast number of variants of uncertain significance, there’s

196
00:35:13,200 --> 00:35:26,200
increasing pressure to come to an understanding about what are even the genes that we think you can do a test on that’s reliable and accurate, and

197
00:35:26,200 --> 00:35:36,133
hopefully that if you do that test there’s something useful about doing that test, so there’s utility. I think we’re still stuck, mostly, at the clinical

198
00:35:36,133 --> 00:35:45,666
validity, meaning, if you do a test and it says you’re at two-times increased risk for hemangioma, is that really true? And there, I think

199
00:35:45,666 --> 00:35:59,732
there’s a vast vacuum of data. So, many people think where the bar should be set in even doing a genetic test is that it has clinical validity. So,

200
00:35:59,733 --> 00:36:10,999
there’s an attempt within this resource to have the test provider at least state their claim about clinical validity and to support it with either

201
00:36:11,000 --> 00:36:25,400
a reference, which could be anything in PubMed or it could be a clinical practice guideline, you know, or anything. And in addition, I sort of alluded

202
00:36:25,400 --> 00:36:42,133
because this has a strong public face to it, this resource. There’s been concern by the genetics societies like ASHG and ACMG and NSGC to sort

203
00:36:42,133 --> 00:36:54,533
of be clear about all of this, about: what’s the snake oil test and what’s the test we all believe in? So, we definitely want that to be clear but,

204
00:36:54,533 --> 00:37:02,599
you know, I’m a medical geneticist, there are two genetic counselors in our group, and even if we said what we think about these tests, I don’t think

205
00:37:02,600 --> 00:37:14,900
you should necessarily believe us. So, it’s really up to the community to, I think, step up to the plate and come together as organizations and to state

206
00:37:14,900 --> 00:37:27,166
what are, really, valid tests. So, ASHG…they approached us, we asked them to put it together and I think as a scientific organization they don’t

207
00:37:27,166 --> 00:37:41,966
feel totally comfortable making statements about clinical tests. At this point, ACMG is looking into creating a work group to, again, come up with a

208
00:37:41,966 --> 00:37:52,932
list of tests and variants. That will certainly take time and require a lot of people to donate their time and figure out how to organize themselves

209
00:37:52,933 --> 00:38:03,399
to do it, but once that is done—and I think it has to happen—what the Genetic Testing Registry would like to do is then to assign to each test

210
00:38:03,400 --> 00:38:12,533
what a given organization has said about that gene or variant, so that, in reading about that test you can also see what the organization

211
00:38:12,533 --> 00:38:22,066
says. We already are doing that—and I didn’t really show you—but where they exist, we’re showing clinical practice guidelines by EGAPP or ACMG

212
00:38:22,066 --> 00:38:30,866
and so forth and anything we can lay our hands on to be aligned on the page you land on about the test or the disease.

213
00:38:30,866 --> 00:38:34,799
MALE: So, that material would be chiefly in ClinVar? Is that how I understood it?

214
00:38:34,800 --> 00:38:42,300
WENDY RUBINSTEIN: So, you know, when it comes to information about genes and variants that resides principally…well, variants is

215
00:38:42,300 --> 00:38:53,266
principally in ClinVar, but it’s linked closely to the GTR, and when you’re starting to see clinical practice guidelines and displays about how to

216
00:38:53,266 --> 00:38:58,732
think about ordering a test, that would be the GTR.

217
00:38:58,733 --> 00:39:01,599
MALE: Thank you.

218
00:39:01,600 --> 00:39:05,900
WENDY RUBINSTEIN: Okay. Thank you very much.




Date Last Updated: 9/18/2012
