Blood Paper-Next Generation Sequencing

SECTION XXI: NEXT GENERATION SEQUENCING
Applications of Next-Generation Sequencing to Blood
and Marrow Transplantation
Michael Chapman,1 Edus H. Warren, III,2 Catherine J. Wu3
Since the advent of next-generation sequencing (NGS) in 2005, there has been an explosion of published
studies employing the technology to tackle previously intractable questions in many disparate biological
fields. This has been coupled with technology development that has occurred at a remarkable pace. This
review discusses the potential impact of this new technology on the field of blood and marrow stem cell
transplantation. Hematologic malignancies have been at the forefront of those cancers whose genomes
have been the subject of NGS. Hence, these studies have opened novel areas of biology that can be exploited
for prognostic, diagnostic, and therapeutic means. Because of the unprecedented depth, resolution and
accuracy achievable by NGS, this technology is well-suited for providing detailed information on the diversity
of receptors that govern antigen recognition; this approach has the potential to contribute important insights
into understanding the biologic effects of transplantation. Finally, the ability to perform comprehensive tumor
sequencing provides a systematic approach to the discovery of genetic alterations that can encode peptides
with restricted tumor expression, and hence serve as potential target antigens of graft-versus-leukemia
responses. Altogether, this increasingly affordable technology will undoubtedly impact the future practice
and care of patients with hematologic malignancies.
Biol Blood Marrow Transplant 18: S151-S160 (2012) © 2012 American Society for Blood and Marrow Transplantation
KEY WORDS: Genome analysis, Exome sequencing, Transplantation, TCR
NEXT-GENERATION SEQUENCING (NGS)
FOR UNDERSTANDING HEMATOLOGIC
MALIGNANCIES
This section describes some of the main analytic
considerations in NGS, reviews some of the key find-
ings from sequencing hematologic malignancies, and
finally looks to the future use of the technology and
the challenges and the opportunities that this will bring.
Analytical Considerations
Generating sequence
A detailed description of the different NGS tech-
nologies is beyond the scope of this discussion; these
are reviewed elsewhere [1]. Briefly, NGS methods
such as 454, SOLiD, and Illumina all involve a process
of fragmenting DNA, ligating adapters, and immobi-
lizing the fragments via the adapters to create libraries.
The libraries then undergo a process of amplification,
generating multiple copies of each DNA fragment.
The immobilized, amplified DNA is then sequenced
in parallel by a fluorescence- or chemiluminescence-
based method, yielding billions of short sequence
reads.
If there is no prior selection of the DNA, this is
known as whole genome shotgun (WGS) sequencing.
Alternatively, by initially hybridizing the DNA frag-
ments to target-specific baits, selected areas of DNA
can be captured for sequencing, while excluding the
rest [2]. Frequently, the baits are designed to the cod-
ing portion of the genome, resulting in whole exome
(WE) sequencing. If only coding mutations are of
interest, WE sequencing offers significant cost advan-
tages over WGS sequencing as the same coverage can
be achieved with far fewer reads.
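The cost argument above can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the genome size, capture target size, and read length are assumed round numbers (not values taken from this review), and it idealizes capture by assuming that every read maps on target.

```python
import math


def reads_needed(target_bases: int, coverage: float, read_length: int) -> int:
    """Reads required for a mean depth of `coverage` over `target_bases`.

    Idealized: assumes every read maps within the target and is fully used.
    """
    return math.ceil(target_bases * coverage / read_length)


# Assumed, illustrative sizes (not from the source text).
GENOME_BASES = 3_100_000_000  # approximate human genome
EXOME_BASES = 50_000_000      # approximate exome capture target
READ_LENGTH = 100             # bases per read

wgs_reads = reads_needed(GENOME_BASES, 30, READ_LENGTH)
we_reads = reads_needed(EXOME_BASES, 30, READ_LENGTH)

print(f"WGS reads for 30x coverage: {wgs_reads:,}")
print(f"WE reads for 30x coverage:  {we_reads:,}")
print(f"Ratio WGS/WE:               {wgs_reads / we_reads:.0f}")
```

In practice a fraction of exome reads fall off target, so the real saving is smaller than this idealized ratio suggests.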
Unlike genome-wide association studies, the aim in cancer genome sequencing is generally
to detect somatic mutations, that is, those mutations that are unique to the tumor. This
necessitates sequencing both tumor and normal tissue from the same patient. This is
because the rate of private single nucleotide polymorphisms (SNPs), that is, variations
in the DNA sequence that are unique to the individual and not annotated in databases, is
of the order of 1 SNP per 10,000 bases. The choice of the normal tissue is governed by
the tumor sequenced, and the normal tissue should not be contaminated by tumor cells. For
example, when sequencing the acute myeloid leukemia (AML) genome, it was necessary to use
skin to provide normal tissue [3], whereas in sequencing the multiple myeloma genome,
peripheral blood was used [4]. However, in the latter case, even though cases of plasma
cell leukemia had been excluded, 3 samples had to be omitted from the analysis because of
the presence of low levels of circulating malignant plasma cells.
From the 1Department of Haematology, Cambridge University, Cambridge, United Kingdom;
2Program in Immunology, Fred Hutchinson Cancer Research Center, Seattle, Washington; and
3Cancer Vaccine Center, Dana-Farber Cancer Institute, Boston, Massachusetts.
Financial disclosure: See Acknowledgments on page S159.
Correspondence and reprint requests: Catherine J. Wu, MD, Dana-Farber Cancer Institute,
Harvard Institutes of Medicine, Rm 416B, 77 Avenue Louis Pasteur, Boston, MA 02115
(e-mail: [email protected]).
1083-8791/$36.00
doi:10.1016/j.bbmt.2011.11.011
Sequencing data processing
Once generated, the short sequence reads are
aligned to the human reference genome. Although
simple in principle, this presents a number of difficulties in
practice. Given the number of reads and the size of the
genome, highly efficient algorithms have to be em-
ployed to perform the mapping. However, these algo-
rithms are less accurate than ‘‘traditional’’ alignment
algorithms, such as BLAST [5]. Compared with
Sanger sequencing reads, the reads from NGS are
short and can often map in more than 1 place, espe-
cially with repetitive or low complexity DNA.
Sequence variations resulting from either sequencing
errors or from somatic mutations or SNPs often cause
misalignment. The algorithms struggle particularly to
accurately align reads containing small insertions or
deletions (indels). For this reason, it is often necessary
to perform local realignments in areas containing
indels with more accurate but slower algorithms [4].
Sequencing coverage is the average number of
reads covering any particular base. The greater the
coverage, the greater the chance of accurately calling
sequence variations. For cancer genome sequencing,
30× coverage is typical, although accurate mutation
calling can be made with coverage of 20× (Illumina
technical note: Calling Sequencing SNPs). It is important
to remember that the coverage is not uniform
across the genome. GC-rich regions are covered
poorly, partly because of difficulties with aligning the
sequence reads. With hybrid capture techniques in
WE sequencing, not all exons may be targeted by the
baits or the baits may fail to reliably capture their
targets. This may mean that the ability to detect muta-
tions in some genes is markedly impaired [4].
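To illustrate why a mean coverage figure can hide poorly covered bases, the following minimal sketch computes per-base depth from aligned read intervals. The function names and the toy intervals are hypothetical, and real pipelines would read alignments from BAM files rather than from a Python list.

```python
from statistics import mean


def per_base_depth(region_length, reads):
    """Depth at each base of a region, given reads as (start, end) intervals.

    Intervals are 0-based and half-open. A difference array keeps the cost
    proportional to the number of reads plus the region length.
    """
    delta = [0] * (region_length + 1)
    for start, end in reads:
        start, end = max(start, 0), min(end, region_length)
        if start < end:
            delta[start] += 1
            delta[end] -= 1
    depth, running = [], 0
    for change in delta[:region_length]:
        running += change
        depth.append(running)
    return depth


def coverage_summary(depth, min_depth):
    """Mean depth and the fraction of bases covered at least `min_depth` times."""
    return {
        "mean": mean(depth),
        "fraction_at_least_min": sum(d >= min_depth for d in depth) / len(depth),
    }


# Toy example: four short reads over a 10-base region.
toy_reads = [(0, 5), (2, 7), (3, 8), (6, 10)]
toy_depth = per_base_depth(10, toy_reads)
toy_summary = coverage_summary(toy_depth, min_depth=2)
print(toy_depth)
print(toy_summary)
```

Reporting the fraction of bases above a depth threshold alongside the mean makes it easier to see which targets, such as GC-rich exons, are effectively unassessable.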
Mutation detection
Point mutations are relatively straightforward to
identify by comparison of the reads with the reference
sequence they have been mapped to. Poorly mapped
reads will give rise to factitious sequence variations
and should be excluded before mutation calling. Statistical
arguments are applied to define the likelihood that
the variant is both real and absent from the normal tissue
from the same patient. Only mutations scoring above
a given threshold are accepted. As a final step, false
positive mutations arising from known artefacts of
NGS are excluded. A broadly similar approach can be
applied to small indels (ie, indels smaller than the length
of a single read), although these are more challenging.
Larger indels and translocations are detected with algo-
rithms that detect reads in which the ends are separated
by an unexpected length of sequence or indeed are on
different chromosomes. With all these approaches,
there is a balance to be struck between sensitivity and
specificity. In any sequencing project, validation of
identified mutations by an alternative sequencing
method is usually undertaken and it is becoming clear
that the various mutation-calling algorithms have im-
proved markedly in a short period of time.
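The statistical reasoning for point mutations can be sketched as follows. This is a deliberately simplified illustration, not the method used by any of the cited studies: it assumes a single uniform per-base error rate, tests the tumor reads against a binomial error model, and treats the variant as absent from the normal sample only if the normal is adequately covered and shows no supporting reads. All thresholds are assumptions chosen for the example.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def call_somatic(tumor_alt, tumor_depth, normal_alt, normal_depth,
                 error_rate=0.001, alpha=1e-6,
                 max_normal_alt=0, min_normal_depth=8):
    """Return (is_somatic, p_error) for one candidate site.

    p_error is the chance of seeing at least `tumor_alt` variant reads in the
    tumor from sequencing error alone. All default thresholds are assumed.
    """
    p_error = binomial_tail(tumor_alt, tumor_depth, error_rate)
    is_real = p_error < alpha
    absent_from_normal = (normal_alt <= max_normal_alt
                          and normal_depth >= min_normal_depth)
    return is_real and absent_from_normal, p_error


# Toy sites: (tumor_alt, tumor_depth, normal_alt, normal_depth)
print(call_somatic(8, 30, 0, 30))    # well supported, absent from normal
print(call_somatic(1, 30, 0, 30))    # compatible with sequencing error
print(call_somatic(15, 30, 14, 30))  # also present in normal: germline
```

Tightening `alpha` or `max_normal_alt` raises specificity at the expense of sensitivity, which is the balance described in the text.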
Identifying driver mutations
One of the most critical steps in the analytical process
is to distinguish the minority of
mutations that afford some growth or survival
advantage to the tumor (driver mutations) from the
majority of mutations that occurred randomly and
became fixed within the tumor clones (passenger mutations).
This can be done statistically, arguing that
across a number of patients with the same tumor, the
driver mutations should occur more frequently than
would be expected by chance. Such calculations do
not simply consider the frequency of the mutations,
but also the size of the mutated genes, their base com-
position, the nature of the mutated bases, and the level
to which the genes are expressed in the tumor. Using
this framework, there are 2 broad approaches for defin-
ing driver mutations. One is to sequence a small num-
ber of samples in a discovery set, define a list of mutated
genes, then perform directed sequencing of these genes
in a much larger validation set to detect the driver
mutations. This is a relatively economical approach,
but there is a risk of missing less frequently mutated
genes in the discovery set. The alternative model,
adopted by The Cancer Genome Atlas (http://cancergenome.nih.gov)
and others, is to sequence a large
number of samples up front. The driver mutations
are defined in the discovery set and the problem of
lower frequency mutations is overcome to some extent,
but it is currently a more expensive approach.
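A minimal sketch of the recurrence argument is given below. It accounts only for gene size and a uniform background mutation rate; the base composition, mutation type, and expression covariates mentioned above are deliberately left out, and the gene lengths, counts, and background rate are invented for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def gene_p_value(mutated_samples, total_samples, coding_length, background_rate):
    """Chance that at least `mutated_samples` tumors hit this gene at random.

    `background_rate` is an assumed per-base, per-tumor somatic mutation rate.
    """
    # Probability that one tumor carries at least one mutation in the gene.
    p_per_sample = 1 - (1 - background_rate) ** coding_length
    return binomial_tail(mutated_samples, total_samples, p_per_sample)


BACKGROUND_RATE = 1e-6  # assumed, illustrative
COHORT_SIZE = 38        # tumors in a hypothetical discovery set

# The same number of mutated tumors in a short and in a very long gene.
p_short = gene_p_value(4, COHORT_SIZE, coding_length=1_500,
                       background_rate=BACKGROUND_RATE)
p_long = gene_p_value(4, COHORT_SIZE, coding_length=100_000,
                      background_rate=BACKGROUND_RATE)
print(f"short gene: p = {p_short:.2e}")
print(f"long gene:  p = {p_long:.2e}")
```

Four mutated tumors are strong evidence for a short gene but unremarkable for a very long one, which is why raw mutation frequency alone is a poor guide to driver status. In a real analysis the resulting p-values would also need correction for the number of genes tested.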
Genome Analysis of Hematologic Malignancies
AML
The first cancer genome to be sequenced by NGS
was that of a cytogenetically normal AML by a group
at the Genome Institute at Washington University
[3]. This represented an extraordinary technical feat
with the technology available at the time and was an
important proof of principle. They identified around
31,600 novel somatic mutations and then focused on
those that affected genic coding sequences. Following
validation by resequencing, they were left with 8 true
novel somatic coding mutations, in addition to 2
well-described AML mutations, that is, insertions
into the FLT3 (FLT3-ITD) and NPM1 genes. The
same group followed this work with the WGS se-
quencing of a second AML genome and the directed
sequencing of mutated genes in a further 188 AML
samples [6]. They found a mutation affecting the isoci-
trate dehydrogenase gene at residue 132 (IDH1 R132)
in 9% of AML samples, exclusively in cases with inter-
mediate risk cytogenetics. Although these mutations
had not previously been identified in AML, they
were known to occur commonly in glioma. Subsequent
studies have demonstrated an association
between IDH1 mutations and NPM1 mutations,
with IDH1 mutation associated with a worse prognosis in
NPM1-mutated/FLT3-ITD negative AML [7]. More
recently, the Washington University group adopted
a similar NGS approach to identify mutations in the
DNA methyltransferase, DNMT3A, in 22% of cases
of AML [8]. Similar to IDH1, this was associated
with intermediate-risk cytogenetics, but predicted
a poor prognosis.
Multiple myeloma (MM)
As discussed previously, an alternative approach to
that described in the sequencing of AML is to sequence
several tumors up front. This was the approach that
was taken in a large, multicenter study at the Broad
Institute to sequence the MM genome, in what was
one of the first studies of its kind [4]. This study used
a combination of WGS and WE sequencing to explore
the genomes of 38 MM cases. Ten genes were mutated
more frequently than would be expected by chance,
including previously described mutations in NRAS,
KRAS, and TP53. A quarter of the samples were
affected by mutations in 1 of 2 genes, DIS3 and
FAM46C, not previously implicated in cancer but
known or predicted to be involved in RNA processing
and/or translation initiation. A single, known activat-
ing mutation in the BRAF gene (G469A), previously
described in melanoma, was found, which prompted
the genotyping for known BRAF mutations in a large
separate cohort of MM patients. Four percent of
samples were affected by these mutations. A highly
effective BRAF inhibitor, PLX4032, is already under
investigation in a Phase 3 clinical trial in melanoma,
and these results suggest that trials of PLX4032 in
MM in targeted individuals would be promising. Hav-
ing multiple samples in the initial sequencing cohort
enabled the application of network analyses to look
for mutations in multiple genes targeting the same
pathway. By this means, this group was able to confirm
and extend the observations of mutations affecting the
NF-κB pathway in MM [9,10] and identify novel
mutations predicted to affect histone methylation.
Finally, the presence of noncoding mutations
clustering in regulatory regions of the genome in
a statistically recurrent manner was demonstrated.
Over a quarter of samples had mutations in the
promoter or first intron of the putative tumor
suppressor BCL7A. One potential weakness of this
study, currently being addressed by further discovery
phase sequencing, was that the size was too small and
the patient group was too heterogeneous to draw any
conclusions about the prognostic significance of any
of the mutations. However, it provided an important
demonstration of the power of examining multiple
genomes by NGS in the discovery set.
Chronic lymphocytic leukemia (CLL)
A collaboration between multiple centers in Spain
and the Cancer Genome Project (CGP) at the Well-
come Trust Sanger Institute undertook WGS se-
quencing of 4 cases of CLL [11]. The mutation rates
were relatively low, with only 45 genes affected by cod-
ing mutations across all the samples. These genes
informed further sequencing in a much larger series.
Four genes, NOTCH1, MYD88, XPO1, and KLHL6
were recurrently mutated with an apparently nonran-
dom distribution. The expression of these mutated
genes was examined in relationship with an established
predictive biologic marker, the degree of somatic
hypermutation of the immunoglobulin heavy chain
variable region (IGHV). NOTCH1 and XPO1 ap-
peared to be associated with the more aggressive un-
mutated IGHV status, whereas MYD88 and KLHL6
appeared to be associated with mutated IGHV status.
The NOTCH1 mutations were frequent (12%), con-
tained premature stop codons predicted to result in
activation and stabilization of the protein, and pre-
dicted for poor overall survival, although it was not
clear whether or not this is independent of the associ-
ated unmutated IGHV phenotype.
A second CLL sequencing project, employing 91
tumors in its discovery set, has recently been reported
[12]. This group, from the Dana-Farber Cancer Insti-
tute together with the Broad Institute, described 9
genes mutated at significant frequency, among them
NOTCH1 and MYD88. They confirmed statistically
significant associations of these 2 genes with unmu-
tated and mutated IGHV status, respectively. In addi-
tion, they demonstrated significant associations with
trisomy 12 and heterozygous 13q deletion, respec-
tively. Perhaps more important, however, was their
observation of frequent mutations affecting the gene
SF3B1. After TP53, this was the most frequently mu-
tated gene, with 15% of CLL affected. There was
a strong association with del(11q) and, in a multivariate
Cox analysis, SF3B1 mutation was predictive of poor
prognosis, establishing it as an independent prognostic
marker. SF3B1 is a component of the catalytic core of
the spliceosome and these investigators were able to
demonstrate that SF3B1 mutation was associated with
aberrant splicing in CLL. It is likely that these muta-
tions are associated with widespread changes in the
transcriptome, echoing the large-scale transcriptional
changes predicted to occur as a result of the frequent
DIS3 and FAM46C mutations in MM.
Myelodysplastic syndrome (MDS)
Recently, frequent mutations in genes in the RNA
splicing machinery have been also detected in MDS by
2 independent groups [13,14]. Papaemmanuil et al.
[13], representing the CGP group at the Wellcome Trust
Sanger Institute, performed WE sequencing in 9 samples
from patients with MDS. They identified 46 mutations
affecting protein coding across these cases.
Intriguingly, 6 out of 9 samples exhibited mutations
in SF3B1, the same splicing gene found to be mutated
in 15% of CLL [12]. Targeted sequencing in a much
larger cohort revealed that SF3B1 was mutated in
20% of cases of MDS. These mutations were associ-
ated with ringed sideroblasts and a benign clinical
course. Many of the mutations were recurrent and
there was considerable overlap with the mutations
seen in CLL, including the commonest mutation,
K700E. This strengthens the idea that these mutations
alter rather than abrogate function of the spliceosome.
Finally, to understand whether any part of the tran-
scriptome is particularly affected by these mutations,
Papaemmanuil et al. performed gene expression pro-
filing [13]. Of note, SF3B1 mutation was associated
with downregulation of several pathways relating to
mitochondrial function, including genes involved in
the mitochondrial ribosome and the electron trans-
port chain.
In an alternate approach, Yoshida et al. [14], repre-
senting the International Cancer Genome Consor-
tium, examined 29 patients with MDS by whole
exome sequencing of paired tumor/control DNA,
and subsequently, 582 subjects with myeloid neoplasm
by high-throughput mutation screening of pooled
DNA. These investigators also identified mutations
in multiple components of the spliceosome that function
in the recognition of the 3′ splice site during
pre-mRNA splicing, including SF3B1 and U2AF35.
Expression of a mutant form of U2AF35 in HeLa cells
induced the increased expression of genes that func-
tion in nonsense-mediated mRNA decay pathway, in-
dicating activation of cellular responses to abnormally
spliced RNA. Furthermore, both exon array and RNA
sequencing analysis confirmed the increased expres-
sion of nonexon regions of the genome in cells trans-
fected with the mutated U2AF35. Intriguingly,
expression of mutated U2AF35 suppressed cell
proliferation both in vitro and in vivo, suggesting that
the oncogenic effects of these spliceosome mutations
are mediated through mechanisms other than cell
proliferation.
Hairy cell leukemia (HCL)
Whole exome sequencing of a patient with HCL
revealed 5 nonsynonymous coding mutations [15].
One of these was a known activating BRAF mutation
(V600E), well described in melanoma and latterly in
MM (see above). Strikingly, Sanger sequencing of
BRAF in a further 47 cases of HCL revealed 100%
presence of the V600E mutation, suggesting that it is
an obligate driver of the disease and that HCL might
well be highly susceptible to PLX4032.
Challenges and Opportunities of Cancer
Genome Sequencing for the Future
There are a number of factors that may influence
the future of NGS in cancer. One is the concept of sta-
tistical power. Genes whose true mutation frequency is
10% should be identified as significant at 80% power in
approximately 75 samples. However, for genes with
a true mutation frequency of 5%, around 200 samples
are required for the same power. For a true frequency
of 3%, several hundred samples are required. Com-
pounding this is the concept of the Winner’s Curse.
Originally applied to bidding for oil drilling rights in
the Gulf of Mexico (where the winner of the bid fre-
quently overpays) and later to the interpretation of
genome-wide association studies (where polymorphism
rates determined in the discovery sets are frequently
found to be large overestimates following validation),
it is equally applicable to NGS. Many of the observed
frequencies from these initial NGS studies may therefore
overestimate the true frequencies. Taken together
with the fact that we are only now beginning to generate
any discovery sets in the hematologic malignancies
adequately powered to detect mutations less frequent than
about 10%, it is entirely conceivable that we are missing
the vast majority of driver mutations, which may affect
large numbers of genes at low frequency. This has im-
plications for understanding the interplay of these mu-
tations and for identifying true independent prognostic
markers. Sequencing of thousands of tumors may be
required to address these issues.
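The dependence of power on mutation frequency can be explored with a simple binomial model. The sketch below makes a strong simplifying assumption: a gene counts as "detected" when at least a fixed number of tumors in the cohort carry a mutation (here 5, an arbitrary stand-in for a significance threshold). Because the real threshold depends on gene size and background mutation rate, the sample sizes it prints are illustrative and are not expected to reproduce the figures quoted above.

```python
from math import comb


def detection_power(n_samples, true_frequency, min_mutated):
    """P(at least `min_mutated` of `n_samples` tumors carry the mutation)."""
    p_fewer = sum(
        comb(n_samples, i)
        * true_frequency**i
        * (1 - true_frequency) ** (n_samples - i)
        for i in range(min_mutated)
    )
    return 1.0 - p_fewer


def samples_for_power(true_frequency, min_mutated,
                      target_power=0.8, max_samples=5000):
    """Smallest cohort size reaching `target_power` under the model above."""
    for n in range(min_mutated, max_samples + 1):
        if detection_power(n, true_frequency, min_mutated) >= target_power:
            return n
    raise ValueError("target power not reached within max_samples")


MIN_MUTATED = 5  # assumed detection threshold, for illustration only

for frequency in (0.10, 0.05, 0.03):
    n = samples_for_power(frequency, MIN_MUTATED)
    print(f"true frequency {frequency:.0%}: about {n} samples for 80% power")
```

Even under this crude model, the required cohort grows roughly in inverse proportion to the mutation frequency, which is the qualitative point made in the text.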
Fortunately, the cost of NGS has fallen at an aston-
ishing rate. The introduction of the so-called third-
generation of sequencing machines, capable of true
single molecule sequencing, will drive down costs
and increase productivity further [16], and it may be
the case that the computational infrastructure becomes
the rate-limiting step to progress. Other novel tech-
nologies that are either operational or in the pipeline
are RNA sequencing (RNAseq) and single cell se-
quencing [17]. The former will allow us to probe the
transcriptome in an unbiased fashion, examining non-
coding and coding RNA, allelic expression of muta-
tions, and splicing patterns. The latter will enable us
to ask questions about the clonal composition and evo-
lution of tumors, revealing potential novel Achilles’
heels for targeted therapy as well as novel markers of
minimal residual disease.
Perspectives
NGS is a truly revolutionary technology that in
a short time has identified novel biology in the hematologic
malignancies that would probably not have been
discovered by traditional hypothesis-driven research.
These new findings offer the potential for exploring
targeted therapeutics and some, such as BRAF muta-
tions in MM and HCL, suggest that existing therapies
for other conditions should be tested for novel indica-
tions. Another promising avenue is to use the novel
mutations in risk stratification to better employ cur-
rent treatment modalities. For example, mutations in
intermediate risk AML could in theory be used to
define groups of patients who may benefit from early
marrow or stem cell transplantation. The principal
barrier to this approach is an incomplete understand-
ing of how these mutations interact with one another
and with more ‘‘traditional’’ diagnostic information,
such as cytogenetics. It may require much larger se-
quencing efforts to understand this and to define single
mutations or combinations of mutations with indepen-
dent prognostic significance. However, it is hoped that
the staggering rate of technology development in this
field will bring these answers sooner rather than later.
NEXT-GENERATION DNA SEQUENCING FOR
PROBING LYMPHOCYTE REPERTOIRES AND
TRACKING LYMPHOID MALIGNANCIES
The Generation of Antigenic Specificity and
Diversity of Lymphoid Cells
B- and T-lymphocytes are distinguished from all
other somatic cells by the fact that much of their biol-
ogy—indeed, their primary function—is directed by
DNA sequence information that is not encoded within
the germline. The antigenic specificity of B and T cells
is in large part determined by the amino acid sequence
in the complementarity-determining regions (CDRs)
of their antigen receptors. The CDR1 and CDR2
regions in both B and T cell antigen receptors ex-
pressed by antigen-naive lymphocytes are encoded in
the germline, but the sequence that encodes the
CDR3 region (arguably the most critical determinant
of antigenic specificity) is generated during lymphocyte
development by recombination between noncontiguous
gene segments in the B and T cell receptor loci.
The CDR3 regions in the β and δ chains of αβ and
γδ T cell receptors (TCRs) and in the heavy chain of
B cell receptors (BCRs) are formed by recombination
between noncontiguous variable (V), diversity (D),
and joining (J) gene segments in the TCRβ, TCRδ,
and IgH loci, while the CDR3 regions in the α and γ
chains of αβ and γδ TCRs and in BCR light chains
are formed by recombination between analogous sets
of variable and joining gene segments in the TCRα,
TCRγ, κ light chain, and λ light chain loci. The existence
of multiple variable, diversity, and joining gene
segments in the 4 T cell and 3 B cell antigen receptor
loci allows for a large number of distinct TCR and
BCR CDR3 sequences to be encoded. CDR3 sequence
diversity in both TCRs and BCRs is further increased
by template-independent addition and deletion of
nucleotides at the junctions between the different clas-
ses of gene segments. Somatic hypermutation of previ-
ously rearranged B cell receptor genes, which is not
limited to the CDR3 region, can also occur in B cells after
initial antigen encounter, further increasing
diversity within the immunoglobulin repertoire. The
adaptive immune system utilizes this ingenious multicomponent
strategy to generate an extremely diverse
repertoire of B and T cell antigen receptors that can collectively
recognize the universe of potential pathogens.
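The two sources of CDR3 diversity described above, combinatorial choice of gene segments and template-independent editing at the junctions, can be illustrated with a toy simulation. The segment sequences below are invented placeholders rather than real germline genes, and the trimming and insertion limits are arbitrary assumptions.

```python
import random

# Invented toy segments; real loci contain many more V, D, and J segments.
V_SEGMENTS = ["TGTGCCAGCAGC", "TGTGCCAGCAGT", "TGTGCCAGCTCA"]
D_SEGMENTS = ["GGGACAGGG", "GGGACTAGC"]
J_SEGMENTS = ["AATGAAAAACTG", "CTATGGCTACAC"]


def combinatorial_diversity(v_segments, d_segments, j_segments):
    """Number of distinct V-D-J segment combinations, ignoring the junctions."""
    return len(v_segments) * len(d_segments) * len(j_segments)


def trim(seq, rng, max_trim, from_end):
    """Delete a random number of nucleotides from one end of a segment."""
    n = rng.randint(0, min(max_trim, len(seq)))
    if n == 0:
        return seq
    return seq[:-n] if from_end else seq[n:]


def random_insert(rng, max_insert):
    """Template-independent nucleotides added at a junction."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_insert)))


def rearrange(rng, max_trim=3, max_insert=4):
    """Simulate one V-D-J rearrangement with junctional editing."""
    v = trim(rng.choice(V_SEGMENTS), rng, max_trim, from_end=True)
    d = trim(rng.choice(D_SEGMENTS), rng, max_trim, from_end=True)
    d = trim(d, rng, max_trim, from_end=False)
    j = trim(rng.choice(J_SEGMENTS), rng, max_trim, from_end=False)
    return v + random_insert(rng, max_insert) + d + random_insert(rng, max_insert) + j


rng = random.Random(0)
junctions = {rearrange(rng) for _ in range(5000)}
n_combinations = combinatorial_diversity(V_SEGMENTS, D_SEGMENTS, J_SEGMENTS)
print("segment combinations:", n_combinations)
print("distinct simulated junctions:", len(junctions))
```

Even with only a dozen segment combinations, junctional editing yields far more distinct sequences, which mirrors why the junctions dominate CDR3 diversity.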
Deep Sequencing of the CDR3 Region to Probe
Repertoire Diversity
The enormous magnitude of the CDR3 sequence
diversity that can be created with this strategy is far
too great to permit comprehensive exploration and
definition using conventional capillary-based DNA se-
quencing. The advent of next-generation sequencing
over the last 6 years [17], however, has enabled analysis
of the B and T cell receptor CDR3 region sequence
repertoires realized in any given individual with un-
precedented depth, resolution, and accuracy [18-21].
Because the CDR3 region in the vast majority of
successfully rearranged BCR and TCR genes com-
prises no more than 60 nucleotides, encoding no
more than 20 amino acids, the CDR3 region sequence
repertoire is ideally suited to comprehensive definition
by current sequencing platforms that can generate on
the order of 2 × 10^8 (or more) sequence reads of length
≥60 nucleotides in a single sequencing run. The capacity
for generating such extremely large sequence
datasets has made it necessary to enlist the efforts of
computational biologists to develop analytical strate-
gies to deal with the deluge of sequence data, and has
thus provided the rationale for the nascent field of
computational immunology. The development of
these powerful computational and molecular strategies
for probing the BCR and TCR CDR3 sequence repertoires
expressed in lymphocyte populations of virtually
any degree of complexity has, in turn, made it possible
to address biological questions that were never before
amenable to direct experimental analysis.
Normal Adult Human B and T Cell Repertoire
Diversity—Parameters and Questions
In one example, high-throughput sequencing of
rearranged BCR and TCR genes from peripheral
blood lymphocytes of healthy adults has begun to de-
fine the critical characteristics of the B and T cell rep-
ertoires that are established and maintained in adult
life. The number of unique IgH, TCRα, and TCRβ
rearrangements that can be detected in the blood of
a single individual provides a basis on
which to estimate the total number of unique B and
αβ T cell receptors that are present in the individual
at any one time. The number of distinct IgH rearrangements
present in the peripheral blood B cell
pool, for example, appears to be at least 2 × 10^6
[18]. If comparable diversity exists within the Igκ and
Igλ repertoires, as is thought to be likely, this would
suggest that the diversity of B cell antigen receptors expressed
in the peripheral blood B cell repertoire is potentially
very high. Analogous studies of TCRβ
rearrangements in peripheral blood αβ T cells [19-22]
have likewise established lower bounds for the
diversity of TCRβ chains expressed in the naïve and
memory CD4+ and CD8+ compartments of healthy
adults, and indicate that the total number of unique
TCRβ chains in the peripheral blood is at least 3-4 × 10^6
[20]. Although published data on the diversity of
TCRα chains expressed in peripheral blood αβ T cells
are not as extensive as for TCRβ, the available data
suggest that TCRα diversity may be comparable to
that of TCRβ [21]. Global comparison of the TCRβ
repertoires expressed in peripheral blood CD8+ T
cells from different individuals of diverse geographic
and ethnic origin and sharing few or no major histocompatibility
complex class I alleles has shown that
the overlap between the repertoires of any 2 individuals
is far higher than expected and also seemingly independent
of HLA type [23]. Whether the TCRα
repertoires, and, more importantly, the repertoires of
TCR αβ heterodimers, expressed in different individuals
exhibit similar overlap has not yet been determined.
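The two quantities discussed in this paragraph, the number of unique rearrangements observed in a sample and the overlap between the repertoires of two individuals, are straightforward to compute once reads have been reduced to CDR3 sequences. The sketch below uses invented CDR3 amino acid strings and a Jaccard index as one possible overlap measure; the cited studies may have used different statistics.

```python
from collections import Counter


def clonotype_counts(cdr3_reads):
    """Collapse CDR3 reads into clonotypes with their read counts."""
    return Counter(cdr3_reads)


def observed_diversity(counts):
    """Number of distinct CDR3 sequences seen: a lower bound on true diversity."""
    return len(counts)


def repertoire_overlap(counts_a, counts_b):
    """Shared clonotypes between two repertoires and their Jaccard index."""
    shared = counts_a.keys() & counts_b.keys()
    union = counts_a.keys() | counts_b.keys()
    jaccard = len(shared) / len(union) if union else 0.0
    return len(shared), jaccard


# Invented toy reads for two hypothetical individuals.
reads_a = ["CASSLGQETQYF", "CASSLGQETQYF", "CASSPGTGELFF", "CASRDRGNTEAFF"]
reads_b = ["CASSLGQETQYF", "CASSQDRYEQYF", "CASSQDRYEQYF"]

counts_a = clonotype_counts(reads_a)
counts_b = clonotype_counts(reads_b)
n_shared, jaccard = repertoire_overlap(counts_a, counts_b)
print("unique in A:", observed_diversity(counts_a))
print("unique in B:", observed_diversity(counts_b))
print("shared:", n_shared, "Jaccard:", jaccard)
```

Because any finite sample misses rare clonotypes, the observed count can only bound the true diversity from below, which is why the figures above are stated as "at least".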
An important challenge currently facing the new
discipline of computational immunology is to define
how, and to what extent, effective clinical immunity
against pathogens relates to and depends on the enor-
mous diversity that characterizes lymphocyte antigen
receptor repertoires. Is there, for example, a minimum
or threshold level of repertoire diversity required for
effective immunity against the wide spectrum of viral,
bacterial, fungal, and protozoan pathogens that we typ-
ically encounter in our lives? To what extent is the di-
versity that exists in the entire B- and T-lymphocyte
repertoires found in the repertoires of the numerous
functional subsets of B and T cells, and is the diversity
within any specific subset particularly important? For
example, is receptor diversity in the regulatory T cell
repertoire—which, on the basis of published data,
appears to be comparable to that in the effector
T cell repertoire [21]—related in any systematic way
to autoimmune disease, or to the occurrence of chronic
graft-versus-host disease (GVHD) after allogeneic
HCT? Studies addressing these questions are currently
in progress.
Monitoring Rare T Cell Populations
High throughput sequencing has recently been
used to study diversity within the small minority of
T cells in peripheral blood that express γδ rather
than αβ T cell receptors [24]. Commitment of T cells
to the αβ or γδ lineages takes place in the thymus, but
the factors that influence this lineage decision have not
been well defined [25-27]. In contrast to αβ T cells,
which primarily recognize peptide antigens presented
by class I or class II MHC molecules, γδ T cells
recognize a variety of unconventional ligands without
the participation of class I or class II MHC [28]. Although
it has been estimated that the potential diversity
of antigen receptors that can be expressed in γδ
T cells exceeds that of αβ T cells or B cells [28], the antigen
receptors expressed by peripheral blood γδ
T cells are overwhelmingly dominated by specific subsets
characterized by a very limited range of antigenic
specificity. Indeed, deep sequencing of the rearranged
TCRγ genes expressed in peripheral blood γδ T cells
from 3 healthy adults revealed that >45% of the
TCRγ CDR3 sequences from the 3 individuals were
identical to a previously described sequence found in
a shared γδ TCR that is specifically reactive with nonpeptide
prenyl pyrophosphate antigens [29].
Deep sequencing of both the TCRβ and TCRγ
loci in αβ and γδ T cells in peripheral blood of healthy
adults has also provided valuable insights into the process
of αβ versus γδ lineage commitment in the thymus
that have important implications for the monitoring of
T-lymphoid malignancies using tumor-specific TCR
rearrangements [24]. Although the vast majority of
αβ and γδ T cells in peripheral blood carry rearranged
TCRγ genes, suggesting that the TCRγ locus rearranges
before αβ/γδ lineage commitment, a minuscule
fraction (<4%) of γδ T cells appear to have rearranged
TCRβ loci, suggesting that rearrangements of TCRβ
occur only in T cells that have committed to the αβ lineage.
These results therefore suggest that the TCRγ
locus should be the focus of molecular strategies for
monitoring T-lymphoid malignancies.
Monitoring of Malignant Clones
The enormous capacity of current high-throughput
sequencing platforms and the profound
sequencing depth that they provide are increasingly
being exploited to monitor the lymphocyte repertoires
in patients with B- and T-lymphoid malignancies,
to identify the malignant clone(s) at the time of
diagnosis, and to track them during and after therapy. Serial
monitoring of the peripheral blood B cell compartment
in patients with CLL, follicular non-Hodgkin’s
lymphoma, and posttransplant lymphoproliferative
disease using IgH CDR3 sequencing, for example, can
uniquely identify the malignant B cell clones that drive
the disease and follow their suppression as therapy is
administered, as well as their subsequent reappearance
before clinical relapse is detectable [18,30].
Moreover, high-throughput sequencing can provide
accurate and precise assessments of the volume of
disease in a given sample, such as peripheral blood, bone
marrow, or lymph node.
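As a minimal sketch of how such serial monitoring could be summarized, the snippet below computes the share of sequencing reads carrying one CDR3 sequence at successive time points. It assumes that each sample has already been reduced to a list of CDR3 amino acid sequences, one per read. The function names, the time-point labels, and the CDR3 strings are hypothetical toy values, not data or methods from the cited studies.

```python
from collections import Counter


def clone_frequencies(cdr3_reads):
    """Return each CDR3 sequence's share of the total reads in one sample."""
    counts = Counter(cdr3_reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n / total for seq, n in counts.items()}


def track_clone(samples, clone_cdr3):
    """Frequency of one clone at each time point (0.0 when it is not observed)."""
    return [
        (label, clone_frequencies(reads).get(clone_cdr3, 0.0))
        for label, reads in samples
    ]


# Toy data only: hypothetical CDR3 sequences and read counts.
MALIGNANT = "CARDYYGSSYFDYW"
OTHER_A = "CARGGWELLDYW"
OTHER_B = "CAKDRSSGWYFDLW"
samples = [
    ("diagnosis", [MALIGNANT] * 80 + [OTHER_A] * 15 + [OTHER_B] * 5),
    ("post-therapy", [MALIGNANT] * 1 + [OTHER_A] * 60 + [OTHER_B] * 39),
    ("follow-up", [MALIGNANT] * 10 + [OTHER_A] * 50 + [OTHER_B] * 40),
]

for label, freq in track_clone(samples, MALIGNANT):
    print(f"{label}: {freq:.2%}")
```

In this toy series the clone falls after therapy and rises again at follow-up, which is the pattern of suppression and reappearance described above. Read share is only a proxy for disease volume; a real analysis would also need to account for amplification bias and sequencing error.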
Perspectives
Recent studies suggest that sequencing methods
can reproducibly detect the presence of a T cell clone
with a specific TCR rearrangement in a background of
100,000 others [31], which is comparable to the sensi-
tivity of polymerase chain reaction (PCR)-based
methods. In contrast to PCR-based methods, however,
high-throughput sequencing provides comprehensive
information about the entire lymphocyte repertoire
in a patient, not just the malignant clone(s), and will
therefore provide insights into the disease process
that would not be obtainable with a PCR-based ap-
proach. For this reason, it is anticipated that high-
throughput sequencing will soon replace PCR for
monitoring disease burden in patients with lymphoid
malignancies, particularly if the cost of such sequencing
continues to fall as rapidly as it has over the last
several years
(http://www.synthesis.cc/assets_c/2011/06/carlson_cost%2520per_base_june_2011.html).
Indeed, it is tempting to speculate that serial monitoring
of lymphocyte repertoires with high-throughput se-
quencing will soon become a standard feature of the
care of patients with lymphoid malignancies, autoim-
mune diseases, and immunodeficiency, as well as
patients undergoing allogeneic hematopoietic cell
transplantation or other forms of immunotherapy.
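Whether a clone present at 1 in 100,000 is actually observed also depends on how many sequences are sampled. The sketch below is a rough illustration, assuming that each read is an independent draw from the repertoire, so that the number of reads from the clone is binomially distributed. This is an idealization that ignores amplification bias and sequencing error, and it is not the method used in [31]. The function name and parameters are illustrative.

```python
import math


def detection_probability(clone_fraction, reads_sampled, min_reads=1):
    """P(at least min_reads reads come from the clone) under binomial sampling."""
    # Probability of observing fewer than min_reads reads from the clone.
    p_fewer = sum(
        math.comb(reads_sampled, k)
        * clone_fraction ** k
        * (1 - clone_fraction) ** (reads_sampled - k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


# A clone at 1 in 100,000, sampled at increasing depth.
for depth in (10_000, 100_000, 1_000_000):
    p = detection_probability(1e-5, depth)
    print(f"{depth:>9,} reads: P(detect) = {p:.3f}")
```

Under these assumptions, sampling about as many reads as the inverse of the clone frequency gives a detection probability of roughly 0.63, and a tenfold deeper sample makes detection nearly certain. Requiring more than one supporting read (`min_reads`) trades sensitivity for robustness against spurious reads.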
GENOME ANALYSIS TO DISCOVER TARGETS
OF GRAFT-VERSUS-LEUKEMIA (GVL)
RESPONSES
The Holy Grail, still: Separating GVL from
GVHD
Several lines of evidence have definitively estab-
lished the critical role played by donor-derived T cells
following allogeneic hematopoietic stem cell trans-
plantation (allo-HSCT) in generating curative re-
sponses. These include the observations of improved
relapse-free survival following allogeneic compared
with autologous HSCT, increased disease relapse fol-
lowing HSCT when using T cell-depleted grafts, and
examples of leukemia regression observed following
infusion of donor T cells [32]. Donor-derived T cells
can mediate GVL effects through 2 general mecha-
nisms. First, engraftment of donor cells restores
normal immune function, thus overcoming tumor-
or treatment-induced host immune defects and restor-
ing immunosurveillance of malignant cells. Second,
donor-derived T cells may recognize host antigens
and eliminate cells bearing these antigens. Recognition
of the beneficial effects of GVL has led to major
changes in the landscape of allo-HSCT. In particular,
it has provided the underlying rationale for developing
less intensive preparative regimens that have
broadened the availability of allo-HSCT as a therapeutic
option to older patients and to those individuals
with comorbidities who otherwise would not have
been able to withstand more intensive chemotherapy
or radiation. These regimens typically do not completely
eliminate host hematopoiesis, and hence rely
on GVL for their efficacy.
Unfortunately, these desired immunologic effects
come at a cost. Too often, GVL responses following
allo-HSCT arise in the setting of acute or chronic
GVHD. Although the immunologic targeting of
tumor cells is beneficial, similar targeting of normal
recipient tissues remains a major cause of morbidity
and mortality following HSCT. Thus, a major goal
of allotransplantation remains separation of GVL
from GVHD, so that curative responses can be gener-
ated with minimal toxicity.
Genetic Basis of GVHD versus GVL Targets
One approach to distinguish GVL from GVHD
effects is to define differences in their target antigen
specificities. Numerous studies have provided evidence
that GVHD arises from immunologic recognition of
polypeptides encoded by genetic polymorphisms
that exist throughout the human genome and
differ between donor and recipient [33]. Transplantation
of mature T cells during allogeneic HSCT results
in the transfer of large numbers of cells capable of
recognizing these alloantigens. The various mechanisms
by which genetic polymorphisms can give rise to
alloantigens (minor histocompatibility antigens, or “mHA”)
include: amino acid substitutions that create
antigenic peptides, creation of alternate transcripts,
modification of proteasomal processing, posttranslational
modifications, or gene deletions. The clinical
significance of an mHA is highly dependent on the
tissues and cell types that express the target antigen.
Targeting allo-antigens that are broadly expressed in
normal recipient tissues (hematopoietic and nonhema-
topoietic) results in GVHD. When these allo-antigens
are also expressed on leukemia cells, targeting these
antigens contributes to GVL. When mHA are only ex-
pressed in hematopoietic tissues, donor T cells target-
ing these antigens result in the elimination of recipient
hematopoiesis and conversion to full donor hemato-
poiesis. Finally, when leukemia cells also express these
mHA, targeting hematopoietic allo-antigens can result
in GVL without concomitant GVHD.
Alternatively, GVL responses can also be observed
if immune responses are directed against antigens that
are expressed solely on the tumor cell. A handful of
antigens with leukemia-restricted expression are
already known. These include leukemia-specific anti-
gens (epitopes arising from chromosomal rearrange-
ments such as BCR-ABL), virally encoded antigens
(latent Epstein-Barr virus epitopes), overexpressed
self-antigens (proteinase-3, WT-1), cancer-testis anti-
gens (NY-ESO-1) or mutated/modified self-antigens
[32]. Although recipients with leukemia may have be-
come tolerant to these antigens, normal donors remain
capable of developing effective immune responses after
transplantation. Overall, tumor-associated antigens
have been challenging to identify because conventional
methods for identifying tumor or transplantation anti-
gens are laborious, typically requiring isolation of
patient T cell clones followed by determination of
their peptide-HLA target using expression cloning.
In recent years, antibody responses to several
GVL-associated antigens have been identified, many of
which appear to represent tumor-associated antigens rather
than mHA and have the potential to elicit donor
T cell immunity [34,35]. Nonetheless, relatively little
is known about the nature of tumor-specific antigens.
If bona fide GVL target antigens were identified,
novel immunotherapy approaches, perhaps through
vaccination or adoptive T cell therapy, could be
implemented to generate and maintain tumor control
and eradication through development of tumor-
specific responses.
Neoantigens as Potential GVL Targets
Tumor neoantigens have been previously proposed
to be an immunologically important class of
tumor-specific antigens [36], but this hypothesis could not
be rigorously tested until now because of technical
barriers to their identification. Like mHA, they can
arise as a result of genetic changes, but in this case
the changes result from tumor-driven mutation rather than
from polymorphism: somatic mutations that give rise
to missense changes, frameshift insertions or deletions,
gene fusions, or alternative splicing. The potential
effectiveness of targeting mutated antigens, or
neoepitopes, in the immune control of tumors has been
appreciated in seminal studies showing that: (1) mice and
humans often mount T cell responses to mutated
antigens [36,37]; (2) mice can be protected from a tumor by
immunization with a single mutated peptide that is
present in the tumor [38]; (3) spontaneous or
vaccine-mediated long-term melanoma survivors mount
strong memory cytotoxic T cell responses to mutated
antigens [39-41]; and (4) patients with
follicular lymphoma show molecular remission when
immunized with patient-specific mutated immunoglobulin
proteins that are present in autologous tumor
cells [42,43]. However, neoantigens have rarely been
used as vaccine targets because of the
technical difficulties in identifying them [36].
Comprehensive Discovery of Tumor
Neoantigens
The obstacles to discovering personal tumor neoantigens
may recently have been surmounted
with the advent of next-generation sequencing
technology. Recent large-scale traditional sequencing
efforts have demonstrated that an average tumor may
have tens to hundreds of protein-coding changes
[44,45]. Such mutated proteins have the potential to:
(1) uniquely mark a tumor for recognition and
destruction by the immune system [36], thus reducing
the risk for autoimmunity; and (2) avoid central and
peripheral T cell tolerance, allowing the antigen to
be recognized by more effective, high-avidity T cell
receptors. In this instance, “passenger” mutations
that may not have been of interest from the standpoint
of oncogenesis do have potential relevance for
eliciting immunity. From a probability standpoint, higher
mutation rates may create more opportunities to generate
epitopes. A recent in silico analysis of sequences derived
from tumor and normal cells in the same patients
suggests that these somatic mutations provide 10 novel
neoepitopes that can bind HLA-A*0201 for tumor
vaccine development [46].
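To illustrate the first step of such an in silico analysis, the sketch below enumerates every peptide of a fixed length that spans a single missense change. These are the candidates that a separate HLA-binding predictor would then score; binding prediction itself is not implemented here. The 9-residue length is an assumption chosen for illustration, and the function name, the toy protein sequence, and the mutation are hypothetical rather than taken from [46].

```python
def mutant_peptides(protein, position, mutant_residue, length=9):
    """All length-mers of the mutated protein that contain the changed residue.

    position is 1-based, as in conventional mutation notation (e.g. R10W).
    """
    if not 1 <= position <= len(protein):
        raise ValueError("position lies outside the protein")
    idx = position - 1
    if protein[idx] == mutant_residue:
        raise ValueError("mutant residue equals the wild-type residue")
    mutated = protein[:idx] + mutant_residue + protein[idx + 1:]
    # Window start positions for which the window still covers idx.
    first = max(0, idx - length + 1)
    last = min(idx, len(mutated) - length)
    return [mutated[s:s + length] for s in range(first, last + 1)]


# Toy example: a made-up 22-residue protein with an R-to-W change at position 10.
TOY_PROTEIN = "MKTAYIAKQRQISFVKSHFSRQ"
for peptide in mutant_peptides(TOY_PROTEIN, 10, "W"):
    print(peptide)
```

A change in the interior of a protein yields up to `length` candidate peptides, so a tumor with tens to hundreds of protein-coding changes produces a candidate list that is still small enough to screen computationally. Frameshifts and gene fusions would need a different enumeration, since they alter every downstream residue.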
Perspectives
Fusing genomic data with immunologic studies
will enable the evaluation of the immunologic effects
of personal tumor neoantigens in a way that has not
been historically possible using conventional
methodologies of T cell antigen discovery. Recent studies in
cancer genome sequencing have increasingly suggested
that a greater diversity of genetic changes exists
in tumors than previously anticipated, including
point mutations, gene fusions, and alternative splicing
events, the vast majority of which appear to be private
to an individual tumor.
These results have interesting immunologic impli-
cations. Based on the older studies described above and
on the difficult challenge that overexpressed antigens
are likely to be expressed in some normal tissue, we an-
ticipate that personal tumor neoantigens will play an
increasingly important role in the development of
highly focused and potent cancer vaccines. Our tool
kit for generating effective vaccines has vastly
expanded in recent years and now includes novel
vaccine delivery methods, more
potent adjuvants, and highly active checkpoint
blockade inhibitors [47]. In this context, defining
truly tumor-specific antigens becomes more
important, to ensure that focused, potent immune
responses that could lead to effective destruction of
tumor cells (without autoimmunity) can be
implemented. Integrating whole tumor
sequencing into the clinical therapeutic setting is
increasingly feasible as the costs of genome sequencing
drop. Deeper investigation in this area can address
the many as yet unanswered questions related to tumor
neoantigens, including: Which, and what fraction of, tumor
neoantigens are detected by T cells? How frequent are
neoantigen-specific memory and effector T cells in
circulation and in the tumor? How much avidity do
T cells have for these antigens? Are neoantigen-specific
T cells functional? Addressing these questions
will inform any potential applications of tumor
neoantigens in vaccines.
ACKNOWLEDGMENTS
M.C. acknowledges support from Leukaemia and
Lymphoma Research. E.H.W. is supported by a Clini-
cal Scientist Award in Translational Research from
the Burroughs Wellcome Fund (1007475) and NIH
Grants P30 CA015704-37, R43 DK089783, and R56
AI081860. C.J.W. is supported by a Clinical Investigator
award from the Damon-Runyon Cancer Research
Foundation (CI-38-07) and acknowledges support
from the Leukemia and Lymphoma Society Translational
Research Program, the Blavatnik Family Foundation,
and from the NIH (NCI-1R01CA155010-01A1).
Financial disclosure: Dr. Warren serves on the Sci-
entific Advisory Board of Adaptive TCR, Inc., but
has no financial interest in the company and receives
no financial compensation of any kind from the com-
pany. Drs. Wu and Chapman have no conflicts of
interest to disclose.
REFERENCES
1. Morozova O, Marra MA. Applications of next-generation
sequencing technologies in functional genomics. Genomics. 2008;
92:255-264.
2. Gnirke A, Melnikov A, Maguire J, et al. Solution hybrid selec-
tion with ultra-long oligonucleotides for massively parallel tar-
geted sequencing. Nat Biotechnol. 2009;27:182-189.
3. Ley TJ, Mardis ER, Ding L, et al. DNA sequencing of a cytoge-
netically normal acute myeloid leukaemia genome. Nature. 2008;
456:66-72.
4. Chapman MA, Lawrence MS, Keats JJ, et al. Initial genome se-
quencing and analysis of multiple myeloma. Nature. 2011;471:
467-472.
5. Trapnell C, Salzberg SL. How to map billions of short reads
onto genomes. Nat Biotechnol. 2009;27:455-457.
6. Mardis ER, Ding L, Dooling DJ, et al. Recurring mutations
found by sequencing an acute myeloid leukemia genome.
N Engl J Med. 2009;361:1058-1066.
7. Paschka P, Schlenk RF, Gaidzik VI, et al. IDH1 and IDH2 mu-
tations are frequent genetic alterations in acute myeloid leuke-
mia and confer adverse prognosis in cytogenetically normal
acute myeloid leukemia with NPM1 mutation without FLT3
internal tandem duplication. J Clin Oncol. 2010;28:3636-3643.
8. Ley TJ, Ding L, Walter M, et al. DNMT3A mutations in acute
myeloid leukemia. N Engl J Med. 2010;363:2424-2433.
9. Annunziata CM, Davis RE, Demchenko Y, et al. Frequent
engagement of the classical and alternative NF-κB pathways
by diverse genetic abnormalities in multiple myeloma. Cancer
Cell. 2007;12:115-130.
10. Keats JJ, Fonseca R, Chesi M, et al. Promiscuous mutations
activate the non-canonical NF-κB pathway in multiple
myeloma. Cancer Cell. 2007;12:131-144.
11. Puente XS, Pinyol M, Quesada V, et al. Whole-genome se-
quencing identifies recurrent mutations in chronic lymphocytic
leukaemia. Nature. 2011;475:101-105.
12. Wang L, Lawrence MS, Wan Y, et al. SF3B1 and other novel
cancer genes in chronic lymphocytic leukemia. N Engl J Med.
2011;365.
13. Papaemmanuil E, Cazzola M, Boultwood J, et al. Somatic
SF3B1 mutation in myelodysplasia with ring sideroblasts. N
Engl J Med. 2011;365:1384-1395.
14. Yoshida K, Sanada M, Shiraishi Y, et al. Frequent pathway mu-
tations of splicing machinery in myelodysplasia. Nature. 2011;
478:64-69.
15. Tiacci E, Trifonov V, Schiavoni G, et al. BRAF mutations in
hairy-cell leukemia. N Engl J Med. 2011;364:2305-2315.
16. Munroe DJ, Harris TJR. Third-generation sequencing fire-
works at Marco Island. Nat Biotechnol. 2010;28:426-428.
17. Mardis ER. A decade’s perspective on DNA sequencing
technology. Nature. 2011;470:198-203.
18. Boyd SD, Marshall EL, Merker JD, et al. Measurement and clin-
ical monitoring of human lymphocyte clonality by massively
parallel VDJ pyrosequencing. Sci Transl Med. 2009;1:1-16.
19. Freeman JD, Warren RL, Webb JR, Nelson BH, Freeman JD,
Holt RA. Profiling the T-cell receptor beta-chain repertoire
by massively parallel sequencing. Genome Res. 2009;1817-1824.
20. Robins HS, Campregher PV, Srivastava SK, et al.
Comprehensive assessment of T-cell receptor β-chain diversity
in αβ T cells. Blood. 2009;114:4099-4107.
21. Wang C, Sanders CM, Yang Q, et al. High throughput sequenc-
ing reveals a complex pattern of dynamic interrelationships
among human T cell subsets. Proc Natl Acad Sci USA. 2010;
107:1518-1523.
22. Warren RL, Freeman JD, Zeng T, et al. Exhaustive T-cell rep-
ertoire sequencing of human peripheral blood samples reveals
signatures of antigen selection and a directly measured reper-
toire size of at least 1 million clonotypes. Genome Res. 2011;21:
790-797.
23. Robins HS, Srivastava SK, Campregher PV, et al. Overlap and
effective size of the human CD8+ T cell receptor repertoire.
Sci Transl Med. 2010;2:47ra64.
24. Sherwood AM, Desmarais C, Livingston RJ, et al. Deep
sequencing of the human TCRγ and TCRβ repertoires suggests
that TCRβ rearranges after αβ and γδ T cell commitment.
Sci Transl Med. 2011;3:90ra61.
25. Kreslavsky T, Gleimer M, Garbe A, von Boehmer H. αβ versus
γδ fate choice: counting the T-cell lineages at the branch point.
Immunol Rev. 2010;238:169-181.
26. Kreslavsky T, von Boehmer H. γδ TCR ligands and lineage
commitment. Semin Immunol. 2010;22:214-221.
27. Ciofani M, Zúñiga-Pflücker JC. Determining γδ versus αβ
T cell development. Nat Rev Immunol. 2010;10:657-663.
28. Carding SR, Egan PJ. γδ T cells: functional plasticity and
heterogeneity. Nat Rev Immunol. 2002;2:336-345.
29. Wang H, Fang Z, Morita CT. Vgamma2Vdelta2 T Cell Recep-
tor recognition of prenyl pyrophosphates is dependent on all
CDRs. J Immunol. 2010;184:6209-6222.
30. Kalos M, Levine BL, Porter DL, et al. T cells with chimeric an-
tigen receptors have potent antitumor effects and can establish
memory in patients with advanced leukemia. Sci Transl Med.
2011;3:95ra73.
31. Robins H, Desmarais C, Matthis J, et al. Ultra-sensitive detec-
tion of rare T cell clones. J Immunol Methods. 2011.
32. Wu C, Ritz J. Induction of tumor immunity following allogeneic
stem cell transplantation. Adv Immunol. 2006;90:133-173.
33. Mullally A, Ritz J. Beyond HLA: the significance of genomic
variation for allogeneic hematopoietic stem cell transplantation.
Blood. 2007;109:1355-1362.
34. Biernacki AM, Marina O, Zhang W, et al. Efficacious immune
therapy in chronic myelogenous leukemia (CML) recognizes an-
tigens that are expressed on CML progenitor cells. Cancer Res.
2010;70:906-915.
35. Zhang W, Choi J, Zeng W, et al. Graft-versus-leukemia antigen
CML66 elicits coordinated B-cell and T-cell immunity after do-
nor lymphocyte infusion. Clin Cancer Res. 2010;16:2729-2739.
36. Sensi M, Anichini A. Unique tumor antigens: evidence for im-
mune control of genome integrity and immunogenic targets
for T cell-mediated patient-specific immunotherapy. Clin
Cancer Res. 2006;12:5023-5032.
37. Parmiani G, De Filippo A, Novellino L, Castelli C. Unique
human tumor antigens: immunobiology and use in clinical trials. J
Immunol. 2007;178:1975-1979.
38. Mandelboim O, Vadai E, Fridkin M, et al. Regression of
established murine carcinoma metastases following vaccination with
tumour-associated antigen peptides. Nat Med. 1995;1:1179-1183.
39. Zhou J, Shen X, Huang J, Hodes RJ, Rosenberg SA,
Robbins PF. Telomere length of transferred lymphocytes corre-
lates with in vivo persistence and tumor regression in melanoma
patients receiving cell transfer therapy. J Immunol. 2005;175:
7046-7052.
40. Lennerz V, Fatho M, Gentilini C, et al. The response of
autologous T cells to a human melanoma is dominated by
mutated neoantigens. Proc Natl Acad Sci USA. 2005;102:
16013-16018.
41. Huang J, El-Gamil M, Dudley ME, Li YF, Rosenberg SA,
Robbins PF. T cells associated with tumor regression recognize
frameshifted products of the CDKN2A tumor suppressor gene
locus and a mutated HLA class I gene product. J Immunol.
2004;172:6057-6064.
42. Baskar S, Kobrin CB, Kwak LW. Autologous lymphoma vac-
cines induce human T cell responses against multiple, unique
epitopes. J Clin Invest. 2004;113:1498-1510.
43. Timmerman JM, Czerwinski DK, Davis TA, et al. Idiotype-
pulsed dendritic cell vaccination for B-cell lymphoma: clinical
and immune responses in 35 patients. Blood. 2002;99:1517-1526.
44. Sjöblom T, Jones S, Wood LD, et al. The consensus coding
sequences of human breast and colorectal cancers. Science. 2006;
314:268-274.
45. Thomas RK, Baker AC, Debiasi RM, et al. High-throughput
oncogene mutation profiling in human cancer. Nat Genet.
2007;39:347-351.
46. Segal NH, Parsons DW, Peggs KS, et al. Epitope landscape in
breast and colorectal cancer. Cancer Res. 2008;68:889-892.
47. Brusic A, Wu CJ. Anti-cancer vaccines following hematopoietic
stem cell transplantation to enhance graft-versus-leukemia re-
sponses. Front Biosci (in press).