National Institutes of Health, The Cancer Genome Atlas, National Cancer Institute, National Human Genome Research Institute

Comprehensive Molecular Characterization of Oesophageal Carcinoma

Nature, 12 January 2017 issue

DOI: 10.1038/nature20805


Oesophageal cancers are prominent worldwide; however, there are few targeted therapies owing to a lack of insight into the molecular drivers of this cancer. Here we performed a comprehensive molecular analysis of 164 carcinomas of the oesophagus derived from Western and Eastern populations. Beyond known histopathological and epidemiologic distinctions, molecular features differentiated oesophageal squamous cell carcinomas from oesophageal adenocarcinomas. Oesophageal squamous cell carcinomas resembled squamous carcinomas of other organs more than they did oesophageal adenocarcinomas. Our analyses identified three molecular subclasses of oesophageal squamous cell carcinomas, but none showed evidence for an aetiological role of human papillomavirus. Squamous cell carcinomas showed frequent genomic amplifications of CCND1 and SOX2 and/or TP63, whereas ERBB2, VEGFA and GATA4 and/or GATA6 were more commonly amplified in adenocarcinomas. Oesophageal adenocarcinomas strongly resembled the chromosomally unstable variant of gastric adenocarcinoma, suggesting that these cancers could be considered a single disease entity. However, some molecular features, including DNA hypermethylation, occurred disproportionally in oesophageal adenocarcinomas. These data provide a framework to facilitate more rational categorization of these tumours and a foundation for new therapies.

Views of the Data

Tools for Exploring Data and Analyses

Associated Data Files

These data represent a data freeze from November 6, 2015. The 164 esophageal carcinoma cases in the manuscript are sourced from both TCGA ESCA and TCGA STAD, and the study includes gastric cases for comparison. Both ESCA and STAD data are provided here and can be filtered using the annotation in Supplemental Table 1 of the manuscript. All data marked [DCC] are DCC-validated archives. All data marked [Supplementary] were created by the manuscript authors; direct any questions about them to the corresponding author.

Participant List, BAM File List, and Full Listing of TCGA Archives for Data Freeze


Mutation Annotation Format (MAF) files are available from the Genomic Data Commons (GDC). They are controlled-access data and can be obtained as described in the GDC documentation.

The latter should be filtered by retaining barcodes corresponding to the STAD Participant List above, or by excluding barcodes corresponding to the ESCA Participant List above.
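The barcode filtering described above could be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes a tab-delimited MAF with the standard `Tumor_Sample_Barcode` column and a participant list given as TCGA barcodes, matched on the participant-level prefix (the first three hyphen-separated fields). The helper names `participant_id` and `filter_maf` and all barcodes in the demo are hypothetical.

```python
import csv
import io


def participant_id(barcode: str) -> str:
    """Return the participant-level prefix (TCGA-XX-XXXX) of a TCGA barcode."""
    return "-".join(barcode.strip().split("-")[:3])


def filter_maf(maf_lines, participants, keep=True):
    """Yield MAF rows as dicts.

    keep=True  retains rows whose tumor barcode is in the participant list.
    keep=False excludes those rows and retains everything else.
    """
    wanted = {participant_id(p) for p in participants}
    # MAF files are tab-delimited; lines starting with '#' are comments.
    data = (line for line in maf_lines if not line.startswith("#"))
    for row in csv.DictReader(data, delimiter="\t"):
        in_list = participant_id(row["Tumor_Sample_Barcode"]) in wanted
        if in_list == keep:
            yield row


# Hypothetical two-row MAF; real files carry many more columns.
demo_maf = (
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\n"
    "TP53\tTCGA-AA-0001-01A-11D-0000-08\n"
    "CCND1\tTCGA-BB-0002-01A-11D-0000-08\n"
)

# Hypothetical participant list (one barcode per entry).
demo_participants = ["TCGA-AA-0001"]

retained = list(filter_maf(io.StringIO(demo_maf), demo_participants, keep=True))
excluded = list(filter_maf(io.StringIO(demo_maf), demo_participants, keep=False))
print([r["Hugo_Symbol"] for r in retained])
print([r["Hugo_Symbol"] for r in excluded])
```

Retaining against one participant list and excluding against the other use the same function with `keep` flipped, which mirrors the two alternatives given in the text.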

Reverse Phase Protein Array (RPPA) Expression

RNA Expression from IlluminaGA RNASeq and IlluminaHiseq RNASeq

SNP and Copy Number variation from Affymetrix SNP6 and IlluminaHiSeq Low Pass Whole Genome sequencing

miRNA from Illumina HiSeq 2000

Methylation from Illumina Infinium Human Methylation450

Additional Information