NATIONAL CANCER INSTITUTE - CANCER.GOV

Contact Information


Primary Contact

JC Zenklusen
Deputy Director

Location

MSC 2440, 31 Center Drive
Bethesda, MD 20892

Additional Contacts

Sharon Gaheen
Technical Project Manager

Overview

The NCI Genomic Data Commons (GDC) was established by the NCI Center for Cancer Genomics (CCG) to support the receipt, harmonization, distribution, and analysis of genomic and clinical data from cancer research programs. The mission of the GDC is to provide the cancer research community with a unified data repository and cancer knowledge base that enables data sharing across cancer genomic studies in support of precision medicine. The GDC accomplishes this by harmonizing raw sequence data against a common reference genome (GRCh38), applying state-of-the-art methods for generating high-level data such as mutation calls and structural variants, and providing scalable tools for data download and analysis. The GDC maintains harmonized data from supported cancer research programs, including The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), the Clinical Proteomic Tumor Analysis Consortium (CPTAC), and other contributing programs.

List of Services

  • GDC Data Portal – The GDC Data Portal is a robust web-based platform that allows users to search, download, and analyze data from cancer genomic studies. The GDC Data Portal provides advanced search and cohort building features, gene and variant level data visualization and analysis, and a repository for data download. Links: Launch the GDC Data Portal | User’s Guide
  • GDC Data Transfer Tool (DTT) – The GDC DTT is a command-line application for downloading and uploading large, high-volume data. It provides an optimized method for transferring data to and from the GDC and can resume interrupted transfers. The GDC DTT Client provides a command-line interface supporting both GDC data downloads and submissions. The GDC DTT User Interface (UI) provides a user-friendly interface to the GDC DTT Client for downloading data from the GDC. Links: Download the GDC DTT Client and UI | User’s Guide
  • GDC Application Programming Interface (API) – The GDC API is a programmatic interface for searching, downloading, submitting, and analyzing GDC data and metadata. The GDC API is the external-facing Representational State Transfer (REST) interface for the GDC. It uses JSON as its communication format and standard HTTP methods (GET, PUT, POST, and DELETE). Links: User’s Guide
  • GDC Data Dictionary and Data Model – The GDC Data Dictionary is a resource that describes the clinical, biospecimen, administrative, and genomic metadata that can be used in parallel with the genomic data generated by the GDC. The dictionary defines the structure of the GDC graph-based data model and the rules the data need to follow. In addition, the dictionary includes information about the relationships between entities within the data model. Links: View the GDC Data Dictionary | GDC Data Model
  • GDC Publication Pages – GDC Publication Pages provide access to information and supplementary files from publications associated with NCI supported programs. Search facilities are provided to filter publications by program, project, publication year, and keywords. Links: View GDC Publication Pages
  • GDC Web Site – The GDC Web Site provides information on the GDC, data hosted in the GDC, and processes and tools supporting data access, submission, and analysis. The GDC Web Site also provides access to GDC support information including webinar videos and news on GDC releases. Links: GDC Web Site
  • GDC Documentation Site – The GDC Documentation Site provides access to User’s Guides for GDC applications and services and includes detailed information on GDC Bioinformatics pipelines. The site also hosts the GDC Data Dictionary. Links: GDC Documentation Site
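
The GDC API entry above describes a REST interface that exchanges JSON over standard HTTP methods. As a rough illustration of what that looks like in practice, the sketch below builds a GET request that searches file metadata. The base URL, the `files` endpoint, the filter field names, and the example project and data type are assumptions made for illustration and should be checked against the GDC API User's Guide; the helper names (`build_files_request`, `to_get_url`, `fetch_json`) are hypothetical and not part of any GDC library.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed public base URL of the GDC API; confirm against the API User's Guide.
GDC_API_BASE = "https://api.gdc.cancer.gov"


def build_files_request(project_id, data_type, size=5):
    """Return (endpoint_url, params) for a search of the assumed `files` endpoint.

    The filter is a JSON document combining two conditions with "and".
    Field names here are illustrative assumptions.
    """
    filters = {
        "op": "and",
        "content": [
            {"op": "in",
             "content": {"field": "cases.project.project_id",
                         "value": [project_id]}},
            {"op": "in",
             "content": {"field": "data_type", "value": [data_type]}},
        ],
    }
    params = {
        "filters": json.dumps(filters),           # JSON-encoded filter
        "fields": "file_id,file_name,data_type",  # metadata fields to return
        "format": "JSON",
        "size": str(size),                        # maximum number of results
    }
    return f"{GDC_API_BASE}/files", params


def to_get_url(endpoint, params):
    """Encode the parameters into a query string for an HTTP GET."""
    return f"{endpoint}?{urlencode(params)}"


def fetch_json(url, timeout=30):
    """Send the GET request and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(to_get_url(*build_files_request("TCGA-BRCA", "Gene Expression Quantification")))` would send the request over the network. The structure of the returned JSON is not described here and should be taken from the User's Guide. Building the request separately from sending it keeps the query easy to inspect before any data is transferred.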

User Guidelines

GDC applications and services are available to the global cancer research community. The GDC Data Portal is the primary application for exploring, analyzing, and downloading harmonized data maintained in the GDC. The GDC Resources page of the GDC Web Site summarizes the tools, applications, and other resources the GDC provides for retrieving, downloading, and analyzing GDC data, submitting data to the GDC, and processing data through GDC bioinformatics pipelines.

Keywords

Bioinformatics, Cancer, DNA-Seq, GDC, Genomics, Methylation, Next Generation Sequencing (NGS), RNA-Seq, TCGA, Whole Exome Sequencing (WXS), Whole Genome Sequencing (WGS), Bioinformatics Biostatistics and Computing, miRNA-Seq, nci-ncr, nih-ncr, scRNA-Seq