This is the second part of a three-part series that started with Understanding Genetics. In this article, I will identify and describe, at a high level, the standards that are needed to obtain genetic test results.
But first, a little segue. A colleague reminded me recently that what best helps us deliver is a real understanding of why our customer needs something. Let's see if we can create a little fantasy that might help.
Imagine that you are in Tier 3 technical support at your company. Assume for the time being that all of your company's computers (and those of companies like yours) are the same; it's just the software that's different. A new technique in computer diagnostics now allows technicians like you to actually read the stored programs inside the computer (work with me here). A few years ago, your company and hundreds of others like it came together to work on a major project. They took one of the computers at random, and cataloged every bit in its memory, all 750 megabytes of it. Within this vast amount of data were somewhere between 65 and 80 thousand little subprograms, each of them anywhere from 10 to 15 thousand microinstructions in length. We know only a little bit about how the processor works. We can understand start and stop instructions, can interpret some of the sequences of microinstructions that make up larger operational instructions, and have some basic understanding of some of these programs, but we are still learning more every day.
Your job, given a particular computer malfunction, is this: Based on a particular set of symptoms, and other random information that comes your way about where the computer has been, and what subsystems it was built from, you need to:
Just to complicate matters, the computer that was selected at random is known to have a few wonky subprograms installed on it that aren't quite right either. Also, addressing a particular memory location is not an exact science. It's more an art form, and the way you access it is by looking for sequences that you know typically precede or follow the particular memory address you want. It's more like associative memory than RAM.
By the way, you have a budget to work with. You can read out vast sections of memory, but it is very expensive (like disassembly), or you can look for known problem-causing sequences (like a virus scanner), which is faster and cheaper, but doesn't find everything.
Add to this that the information you have to work with and understand to identify a problem is not only growing at tremendous rates (see the second paragraph in Clinical Decision Support), but also being changed. What you knew yesterday might be different tomorrow. It may be that one of those wonky subprograms has now been replaced by a better sample.
This is just a small sample of the complexity that faces the clinical geneticist. Hopefully this little segue into an analogical fantasy might help you understand a little bit about how genetic testing works.
Now, back in the real world, we will start simple. A genetic test is, at its core, a laboratory test. This simplifies matters for us, because we can make use of the same standards used in ordering and reporting for laboratory tests. The most commonly used standard for ordering laboratory tests and reporting on results is HL7 Version 2. There are many different releases of HL7 Version 2 (we could call them variants, but that would just be too confusing), including 2.2, 2.3, 2.3.1, 2.4, 2.5, 2.5.1 and 2.6, and coming soon Version 2.7 (it isn't clear whether these would be alleles or mutations).
Various organizations have selected different releases of HL7 Version 2 messages for laboratory orders and results, including:
- HL7 Version 2.4
  - Used in the original ELINCS implementation guide developed initially by the California HealthCare Foundation. This guide is now being completed by HL7 using HL7 Version 2.5.1.
- HL7 Version 2.5
  - Used in the Laboratory Technical Framework from Integrating the Healthcare Enterprise.
- HL7 Version 2.5.1
  - Selected by ANSI/HITSP, and recognized by Secretary Leavitt of Health and Human Services for use in the US for reporting laboratory results. ANSI/HITSP selected this version because it supports the conveyance of information required by CLIA regulations.
While HL7 Version 3 does support laboratory test orders and results, this is still a work in progress.
HL7 CDA Release 2.0 (this is another gene altogether) has also been selected by ANSI/HITSP and recognized by Secretary Leavitt for reporting laboratory results in a clinical document. ANSI/HITSP's selection of this standard is constrained by the IHE XD-LAB profile found in the IHE Laboratory Technical Framework. The XD-LAB integration profile also conforms to the HL7 Laboratory Claims Attachments Implementation Guide.
Finally, laboratory reports often use Logical Observation Identifiers Names and Codes, or LOINC®, to identify results (all of the examples above use LOINC).
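To make the message format a little more concrete, here is a minimal Python sketch of how a single observation in an HL7 Version 2 result message might be pulled apart. The sample segment, the placeholder code 00000-0, and the names Observation and parse_obx are assumptions made for illustration; they do not come from any of the implementation guides above. Real messages carry many more fields, escape sequences, and repetitions that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    value_type: str     # OBX-2, e.g. "ST" for string data
    code: str           # OBX-3, first component: the observation identifier
    name: str           # OBX-3, second component: human-readable text
    coding_system: str  # OBX-3, third component: "LN" denotes LOINC
    value: str          # OBX-5, the observation value


def parse_obx(segment: str) -> Observation:
    """Split one OBX segment using the default HL7 v2 delimiters (| and ^)."""
    fields = segment.split("|")
    if fields[0] != "OBX":
        raise ValueError("not an OBX segment")
    # Pad short segments so the indexing below cannot fail.
    fields += [""] * (6 - len(fields))
    identifier = fields[3].split("^") + ["", "", ""]
    return Observation(
        value_type=fields[2],
        code=identifier[0],
        name=identifier[1],
        coding_system=identifier[2],
        value=fields[5],
    )


# Hypothetical segment: "00000-0" is a placeholder, not a real LOINC code.
sample = "OBX|1|ST|00000-0^Example genetic test result^LN||Positive||||||F"
obs = parse_obx(sample)
print(obs.code, obs.coding_system, obs.value)
```

The point of the sketch is only that the observation identifier travels as a coded triplet (code, text, coding system), which is where LOINC fits into the message.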
Ordering the Test
Sequencing, or re-sequencing, refers to reading off the nucleotides (A, C, G and T) of the gene sequence directly. This is usually the more expensive, but also the most accurate, way to obtain a gene sequence. Some researchers use the term re-sequencing because the human genome has already been sequenced once.
So, in looking at why, what are the questions that are typically being asked? In genetic testing, there are six common clinical questions. Half of these are related to specific genetically related diseases, and the other half to medications used for treatment. The "question" being asked by the clinician needs to be described in the order.
Tests on Genetic Conditions
Tests that identify variants associated with genetic conditions can assist the provider in determining if a patient:
- has a genetic condition,
- is at (increased) risk of contracting a genetically related disease, or
- carries a particular genetic variant and can potentially pass it onto their children.
Tests on Medications
Pharmacogenomic tests can tell a provider:
- Whether a particular medication will be effective or not in treatment,
- How quickly a particular medication will be metabolized by the patient, or
- How toxic a medication may be to the patient.
LOINC vocabulary terms have been proposed to represent each of these different kinds of test results in panels. The SNOMED CT and RxNorm terminologies have been proposed to represent disease conditions and medications respectively. Some experts have noted that SNOMED CT does not provide great coverage for family-related disease (i.e. genetic conditions) but feel that it is more important to use a common reference vocabulary than it is to introduce vocabularies that are not yet used in healthcare. Use of these vocabularies will enable linkage of genetic data with other clinical data in the health record. I find myself in agreement with them.
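As a rough illustration, the six clinical questions and the terminology proposed for the subject of each could be modeled as follows. This is only a sketch in Python: the enumeration names and the subject_vocabulary helper are invented for this example and are not part of any standard.

```python
from enum import Enum


class ClinicalQuestion(Enum):
    # Questions about genetic conditions
    DIAGNOSIS = "has a genetic condition"
    RISK = "is at increased risk of a genetically related disease"
    CARRIER = "carries a variant that can be passed on to children"
    # Questions about medications (pharmacogenomics)
    EFFICACY = "whether a medication will be effective"
    METABOLISM = "how quickly a medication will be metabolized"
    TOXICITY = "how toxic a medication may be"


MEDICATION_QUESTIONS = {
    ClinicalQuestion.EFFICACY,
    ClinicalQuestion.METABOLISM,
    ClinicalQuestion.TOXICITY,
}


def subject_vocabulary(question: ClinicalQuestion) -> str:
    """Return the terminology proposed for the subject of the question."""
    if question in MEDICATION_QUESTIONS:
        return "RxNorm"      # medications
    return "SNOMED CT"       # disease conditions


print(subject_vocabulary(ClinicalQuestion.CARRIER))   # SNOMED CT
print(subject_vocabulary(ClinicalQuestion.TOXICITY))  # RxNorm
```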
Describing the Specimen
The specimen is the source of the DNA examined, as well as the eventual source of the variant identified. Genetic material in a tumor specimen can have somatic or germline variations. A somatic variation occurs after cells have been formed, for example from UV damage to skin cells after too much sun exposure. A majority of cancers occur due to somatic changes. A germline variation is one that is incorporated into every cell. The last classification is for specimens of fetal tissue (prenatal). Results can be identified based on these three categories: a proposed classification system uses the terms somatic, germline, or prenatal to describe the specimen.
Reporting the Results
When reporting on the results of genetic tests, it is important to include the information a healthcare provider needs to interpret them. The first step in reporting the results is to repeat everything in the order that was stated or clarified later during the ordering process. The reason for this is to allow subsequent reviewers of the result to understand the original intent of the provider ordering the test. Ordering a genetic test may be an iterative process. In reporting the test results it should be necessary only to report what was finally agreed upon.
Region of Interest
Once a genetic test is selected, the testing laboratory can specify more detail about the region of the gene that was examined. The human genome includes on the order of 3 billion base pairs (which fits into about 750 MB). At present, it isn't practical to sequence a single person's entire genome; doing so would take quite a long time (although that may change).
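The 750 megabyte figure follows from a little arithmetic, assuming each base is one of four nucleotides and so can be stored in two bits. That encoding is an assumption of this sketch, which also ignores the second copy of each chromosome and any file format overhead.

```python
BASE_PAIRS = 3_000_000_000  # approximate size of the human genome
BITS_PER_BASE = 2           # A, C, G, T: four values fit in two bits

total_bytes = BASE_PAIRS * BITS_PER_BASE // 8
megabytes = total_bytes / 1_000_000

print(f"{megabytes:.0f} MB")  # prints "750 MB"
```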
So, the testing laboratory determines where on the genome the test will focus. This region of interest can be described by identifying the genomic or transcriptional reference sequence (to align the region with the genome), the starting and ending nucleotides in the sequence (using the numeric portion of HGVS nomenclature), and a specific gene using the HGNC nomenclature (remember that "associative memory access"? Here it is). Much of this information is, or can be, tied together in appropriate knowledge bases (and these are continuously being updated).
The next step is to report the interpretation. Each of the test types described in the previous sections will require different values to interpret the results. Vocabulary has also been proposed in LOINC using LOINC Answer Codes, but the LOINC documentation does not presently describe how to relate LOINC Answer Codes to the supplied data. Certainly any set of values used for these interpretations will also need to be mapped into SNOMED CT, to allow for their eventual use in clinical decision support systems that rely on SNOMED CT. Note that some interpretations will remain "inconclusive" (have you ever finished a technical support call only to get no solution to your problem?).
Because sequencing and genotyping are so expensive, they shouldn't be repeated unnecessarily. That means that enough detailed information should be conveyed in the result that future re-interpretation is possible. The average gene contains from 10 to 15 thousand base pairs (think of these as the microinstructions), but this can vary dramatically, with some genes using millions of base pairs. This information is maintained by the testing laboratory and is absolutely essential in the initial interpretation of the results. When reported, these findings are summarized using the recommended standards. This will enable linkage of the genetic data to clinical genetic knowledgebases, so that interpretations can be maintained in a manner similar to other laboratory tests.
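One way to picture a region of interest is as a small record holding the gene, the reference sequence, and the first and last nucleotides examined. The Python sketch below is just that, a picture: the class and field names are assumptions of mine, and the coordinates in the example are made up rather than taken from any real test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionOfInterest:
    gene_symbol: str         # gene name in HGNC nomenclature
    reference_sequence: str  # genomic or transcript reference sequence identifier
    start: int               # first nucleotide examined, numbered on the reference
    end: int                 # last nucleotide examined (inclusive)

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("start must not come after end")

    def length(self) -> int:
        """Number of nucleotides covered by the region."""
        return self.end - self.start + 1

    def contains(self, position: int) -> bool:
        """True if the given position falls inside the region."""
        return self.start <= position <= self.end


# Illustrative values only; the start and end positions are hypothetical.
region = RegionOfInterest("BRCA1", "NM_007294.3", 100, 250)
print(region.length(), region.contains(175))
```

A record like this is what lets a later reader check whether a variant of interest was actually inside the part of the gene the laboratory examined.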
Different kinds of tests will require differing detailed results. A test that is attempting to identify a particular DNA marker or allele will need to describe what was found. Again, this identification can be performed using HGNC to describe the gene, NCBI Nucleotide Reference sequence identifiers, and HGVS nomenclature to describe the variations.
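To show how a variant description hangs together, here is a minimal Python sketch that takes apart the simplest kind of HGVS-style description, a single-nucleotide substitution written against a reference sequence. The function name and the regular expression are my own assumptions, and the pattern deliberately covers only this one form; full HGVS nomenclature includes deletions, duplications, intronic offsets, and much more.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    reference_sequence: str  # reference sequence identifier with version
    coordinate_type: str     # "c" for coding DNA, "g" for genomic
    position: int
    reference_base: str
    variant_base: str


# Matches only descriptions of the form <accession.version>:<c|g>.<pos><ref>><alt>
_PATTERN = re.compile(
    r"^(?P<ref>[A-Z]{2}_\d+\.\d+):(?P<kind>[cg])\.(?P<pos>\d+)"
    r"(?P<old>[ACGT])>(?P<new>[ACGT])$"
)


def parse_substitution(description: str) -> Substitution:
    match = _PATTERN.match(description)
    if match is None:
        raise ValueError(f"unsupported variant description: {description!r}")
    return Substitution(
        match["ref"],
        match["kind"],
        int(match["pos"]),
        match["old"],
        match["new"],
    )


# Used here purely as an example of the format.
variant = parse_substitution("NM_004006.2:c.4375C>T")
print(variant.reference_sequence, variant.position, variant.variant_base)
```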
Details are not sufficient. The final component of the report should include interpretation of the results performed by a geneticist. Genetics is so complex that providers will need that expertise to understand the results. This analysis may include references to research, educational materials, suggested treatments or additional testing.
The Final Step
For most clinical uses, once the provider's question about the patient's genetics is answered, many more questions are asked by the provider and by the patient. Some of these involve how to communicate the information to the patient and/or their relatives; others involve what the next steps should be in management of the patient's health. These fall outside of the domain of Healthcare Standards, so I will not dwell upon them. However, the American Society of Clinical Oncology published a policy statement that addresses some of these issues. I recommend reading it.
American Society of Clinical Oncology Policy Statement Update: Genetic Testing for Cancer Susceptibility, Journal of Clinical Oncology, Vol 21, No 12 (Jun 15), 2003: pp 2397-2406 available on the web from http://jco.ascopubs.org/cgi/reprint/21/12/2397
This is an excellent article that describes many of the issues surrounding the need for, and appropriate use of, genetic testing. While its principal audience is clinical oncologists, the explanations given for the ASCO positions are very clear, and address many issues that need to be considered with respect to genetic testing.
Thanks to Sandy Aronson, Director of IT, and Mollie Ullman-Cullere, both of HPCGG, for arranging a tour of their genetic testing laboratory and answering my many questions on genetic testing. Thanks also to Mollie, and to Scott Bolte of GE Healthcare, for their reviews of an early draft of this article. Accuracy is due to them; any errors are of course my own.