Overview

CCR Sequencing Facility

Mission:

The mission of the Center for Cancer Research Sequencing Facility (CCR-SF) is to utilize high-throughput sequencing technologies to enrich cancer research and ensure that the NCI community can leverage the leading edge of Next-Generation Sequencing technology.

CCR Sequencing Facility: Overview

The introduction of DNA sequencing instruments capable of producing millions of DNA sequence reads in a single run has profoundly altered the landscape of genetics and cancer biology. Complex questions can now be answered at previously unthinkable speeds and at a fraction of their former cost. At the Sequencing Facility, NCI researchers are provided access to the latest technologies, with consultation and Q&A services available throughout the design and execution of sequencing projects.

Our lab currently employs the following sequencing platforms:

Illumina (Short Read) Sequencing Technology

Illumina sequencing utilizes reversible terminator chemistry optimized to achieve high levels of cost-effectiveness and throughput
Millions of reads produced per sample lane at 50 bp to 300 bp read lengths
Support for the multiplexing of 96 bar-coded samples into a single lane
Available resources include NovaSeq 6000, NextSeq 2000, and MiSeq sequencers

Long-Read Technologies: PacBio Sequel IIe Sequencing

Amplification-free sequencing via Single-Molecule Real-Time (SMRT) technology enables rapid identification of long nucleotide chains and DNA methylation.
HiFi reads generated using Circular Consensus Sequencing mode (up to 25 kb read length) achieve > 99.9% accuracy.
Long and accurate reads are ideal for many applications such as de novo assemblies, identification of structural variants, full-length transcriptomes and long amplicons.
Average yields per SMRTcell are insert-size dependent but can reach 30 Gb of HiFi reads for WGS.
Project flexibility with production protocols that include large and small genome sequencing, whole and targeted transcriptome sequencing (Iso-Seq) for bulk and single cell samples, and amplicon sequencing, either single or multiplexed.

Long-Read Technologies: Oxford Nanopore Technologies (ONT)

ONT sequencing uses flow cells containing an array of tiny holes — nanopores — embedded in an electro-resistant membrane. When a molecule passes through a nanopore, the current is disrupted to produce a characteristic ‘squiggle’. The squiggle is then decoded using basecalling algorithms to determine the DNA or RNA sequence.
ONT can analyze native DNA or RNA in real-time and sequence any length of fragment to achieve short (20 bases) to ultra-long read lengths (> 1 million bases), with accuracies over 99%.
Amplification-free direct sequencing of individual DNA and RNA molecules precludes PCR bias and artifacts and allows base modification identification at single nucleotide resolution, including 5mC and 5hmC in DNA and m6A in RNA. Detection of further natural or synthetic epigenetic modifications is possible through training basecalling algorithms.
The Sequencing Facility offers several ONT devices with different throughput capacities: GridION (up to 50 Gb per flow cell) and PromethION 2 Solo (up to 200 Gb per flow cell).
Minimal machine turnaround time provides flexibility in experimental and run design, including whole and targeted genome sequencing, whole transcriptome sequencing for bulk and single cell samples and direct RNA sequencing.

Single Cell Technologies: 10X Genomics

The 10X Genomics Chromium X system can partition hundreds of thousands of fresh, frozen, or fixed cells in a single run.
The system can easily handle a range of projects, from a pilot size study to the highest-throughput analysis.
The Chromium X system is compatible with all 10X Genomics single-cell assays and can capture molecular readouts of cell activity in multiple dimensions, including gene expression, chromatin accessibility, cell surface proteins, immune clonotype, and antigen specificity.

Single Cell Technologies: Mission Bio

The Mission Bio Tapestri platform can provide both genotype and phenotype data from the same cell.
The Tapestri Platform enables targeted single-cell DNA and protein analysis.
Tapestri Single-cell DNA panels allow researchers to focus on the mutations and regions of interest that are most relevant to their disease research.
The targeted single cell DNA panel can detect SNVs, CNVs and proteins simultaneously from the same cell.

Single Cell Technologies: Fluent BioSciences PIPseq

Instrument-free single-cell RNA-seq (3’ gene end counting)
Fast, scalable, and cost-effective scRNA-seq technology that performs comparably to 10X assays
Low multiplet rate
High quality transcript capture and complex cell population resolution
Application areas: cancer, immunology, neuroscience, infectious diseases

Optical Mapping Technologies: Bionano Genomics

Bionano’s non-sequencing-based genome mapping technology images and analyzes extremely long, high-molecular-weight DNA.
Facilitates identification of structural variants and creation of de novo genome assemblies.

Services

SF Services

To request services from the CCR Sequencing Facility
Submit a Sequencing Facility Request
Prior to filling out a NAS request, you are advised to consult with Dr. Maggie Cam and/or Mr. Bao Tran to discuss your project design and bioinformatics approach to data analysis:

Bao Tran
Director, Sequencing Facility
bao.tran@nih.gov
301-360-3460

Maggie Cam, Ph.D.
maggie.cam@nih.gov
301-443-2965

Please visit the Protocols and Resources page for more details about the sequencing chemistry and technology utilized by each platform. We encourage you to contact us so we can provide you with the most current information and help you plan your project to meet your sequencing needs.

Short reads with Illumina Sequencing:

Illumina sequencing enables a wide variety of applications, allowing researchers to ask virtually any question related to the genome, transcriptome, or epigenome of any organism.

ChIP-Seq
Cut and Run
ATAC-Seq*
RNA-Seq (mRNA, Total RNA and microRNA)
Whole Genome Sequencing
Whole Exome Sequencing – For further information please contact NCI-FredLMTSFExome@mail.nih.gov.
Methylated DNA sequencing (bisulfite)
Amplicon Sequencing

* ATAC-Seq is only provided as a pilot project for a maximum of 12 samples. After the pilot, or for more than 12 samples, we can transfer the protocol to you.

Long Read Sequencing Technologies:

Whole Genome Sequencing: de novo assembly, haplotype resolution, structural variant detection, DNA epigenetic modification detection.
RNA Sequencing: Full-length transcript sequencing for whole-transcriptome or gene-specific targets. Full-length RNA sequencing can be performed on bulk or single cell samples. Direct RNA sequencing with Oxford Nanopore Sequencing.
Targeted Sequencing: Long amplicon sequencing, full-length viral sequencing, full-length vector sequencing, target enrichment, adaptive sampling and multiplexing strategies.
HLA Typing: Amplification of full gene for HLA class I and/or class II.
16S sequencing: Amplification of full length 16S for bacterial communities.

Optical Mapping using Bionano Technology:

Imaging and analysis of extremely long, high-molecular-weight DNA facilitates identification of structural variants and creation of de novo genome assemblies

R&D Resources:

The R&D group works closely with investigators to provide customized support for a variety of applications, utilizing the most recent state-of-the-art NGS technologies.
Testing and validation of new sequencing applications/products before offering them as production services.
Development of new sequencing applications/protocols to assist the NCI community.
Training of production team members and PI labs on newly developed NGS technologies and new instruments.

Bioinformatics Support:

CCR-SF bioinformatics group provides coordinated joint consultation services for sequencing technology selection, project design, and data analysis for next generation sequencing projects. We support analysis for Whole Genome/Exome sequencing, ATAC-seq, ChIP-seq, RNA-seq, miRNA-seq and analysis for new data types from single cell sequencing, long-read sequencing and optical mapping. We work collaboratively with the CCR Collaborative Bioinformatics Resource (CCBR) and provide a mechanism for CCR researchers to obtain many different types of bioinformatics assistance to further their research goals. Please check the details of Bioinformatics Support at the Bioinformatics Info page.

Pricing

Illumina Library Construction	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Sample	Cost/Sample
ChIP-Seq	$80 + (Sequencing Cost)	$160 + (Sequencing Cost)
gDNA-Seq	$50 + (Sequencing Cost)	$100 + (Sequencing Cost)
Nextera XT	$52 + (Sequencing Cost)	$104 + (Sequencing Cost)
Whole Genome Methyl-Seq	$54 + (Sequencing Cost)	$108 + (Sequencing Cost)
5hmC Detection	$91 + (Sequencing Cost)	$182 + (Sequencing Cost)
Total RNA-Seq	$76 + (Sequencing Cost)	$152 + (Sequencing Cost)
mRNA-Seq	$57 + (Sequencing Cost)	$114 + (Sequencing Cost)
miRNA-Seq	$94 + (Sequencing Cost)	$188 + (Sequencing Cost)
ATAC-Seq	$104 + (Sequencing Cost)	$208 + (Sequencing Cost)

Single Cell 10X Library Construction	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Sample	Cost/Sample
10X Chromium Single Cell RNA_Seq (GEX, for 3′ or 5′)	$1723 + (Sequencing Cost)	$3446 + (Sequencing Cost)
10X Chromium Single Cell VDJ Enrichment (TCR or BCR, for 5’ only)	$113 + (10X Capture) + (Sequencing Cost)	$226 + (10X Capture) + (Sequencing Cost)
10X Chromium Single Cell Feature Barcode (for 3’ or 5’)	$133 + (10X Capture) + (Sequencing Cost)	$266 + (10X Capture) + (Sequencing Cost)
10X Chromium Single Cell ATAC-seq	$1644 + (Sequencing Cost)	$3288 + (Sequencing Cost)
10X Chromium HT Single Cell RNA_Seq (GEX, for 3′ or 5′)	$1854 + (Sequencing Cost)	$3708 + (Sequencing Cost)
10X Chromium Fixed RNA_Seq 4×4	$1010 + (Sequencing Cost)	$2020 + (Sequencing Cost)
10X Chromium Fixed RNA_Seq 4×1	$1863 + (Sequencing Cost)	$3726 + (Sequencing Cost)

Single Cell Mission Bio Tapestri Library Construction	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Sample	Cost/Sample
Tapestri Single Cell DNA *	$2441 + (Sequencing Cost)	$4228 + (Sequencing Cost)
Tapestri Single Cell DNA and Protein (Multi-omics)	$2704 + (Sequencing Cost)	$5408 + (Sequencing Cost)
Tapestri Single-Cell Targeted DNA panels (AML, Myeloid, THP, CLL)	$66 + (Capture) + (Sequencing Cost)	$132 + (Capture) + (Sequencing Cost)
*Custom DNA panel price is not included

Single Cell Fluent PIPSeq 3′ Library Construction	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Sample	Cost/Sample
Fluent PIPseq T2	$362 + (Sequencing Cost)	$724 + (Sequencing Cost)
Fluent PIPseq T20	$1087 + (Sequencing Cost)	$2174 + (Sequencing Cost)
Fluent PIPseq T100	$3864 + (Sequencing Cost)	$7728 + (Sequencing Cost)

Illumina Sequencing

NovaSeq	NCI/NIAID Price	NON NCI Price
Run Type	Cost/Flowcell	Cost/Flowcell
SP 100 Cycle (v1.5)	$2,324	$4,648
SP 200 Cycle (v1.5)	$3,044	$6,088
SP 300 Cycle (v1.5)	$3,321	$6,642
SP 500 Cycle (v1.5)	$4,649	$9,298
S1 100 Cycle (v1.5)	$4,261	$8,522
S1 200 Cycle (v1.5)	$5,368	$10,736
S1 300 Cycle (v1.5)	$5,811	$11,622
S2 100 Cycle (v1.5)	$8,025	$16,050
S2 200 Cycle (v1.5)	$9,963	$19,926
S2 300 Cycle (v1.5)	$10,627	$21,254
1.5B 100 cycle	$1,900	$3,800
1.5B 200 cycle	$2,400	$4,800
1.5B 300 cycle	$2,600	$5,200
10B 100 cycle	$7,100	$14,200
10B 200 cycle	$11,654	$23,307
10B 300 cycle	$12,770	$25,540
25B 300 cycle	$16,000	$32,000
Xp 2-Lane Kit	$330	$660
Xp 4-Lane Kit	$663	$1,326

NextSeq 2000	NCI/NIAID Price	NON NCI Price
Run Type	Cost/Flowcell	Cost/Flowcell
P1 100 cycle	$900	$1,800
P2 100 Cycle	$1,455	$2,910
P2 200 Cycle	$2,736	$5,472
P2 300 Cycle	$3,628	$7,256
P3 100 Cycle	$3,331	$6,662
P3 200 Cycle	$4,612	$9,224
P3 300 Cycle	$6,150	$12,300

MiSeq	NCI/NIAID Price	NON NCI Price
Run Type	Cost/Flowcell	Cost/Flowcell
1 x 50 Cycle (v2)	$1,007	$2,014
1 x 150 Cycle (v3)	$1,118	$2,236
1 x 300 Cycle (v2)	$1,289	$2,578
1 x 500 Cycle (v2)	$1,450	$2,900
1 x 600 Cycle (v3)	$1,887	$3,774
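
As a rough illustration of how the per-sample library construction prices combine with the per-flowcell sequencing prices listed above, the following Python sketch estimates a project total. It assumes (this is not stated by the facility) that all samples are pooled onto the given number of flow cells and that no other fees apply. The function name `estimate_project_cost` is hypothetical. Please contact the facility for an actual quote.

```python
def estimate_project_cost(n_samples, library_cost_per_sample,
                          flowcell_cost, n_flowcells=1):
    """Return an estimated project total in dollars.

    Assumption (not a facility rule): total = per-sample library
    construction cost + cost of the flow cells used.
    """
    if n_samples < 1 or n_flowcells < 1:
        raise ValueError("n_samples and n_flowcells must be at least 1")
    library_total = n_samples * library_cost_per_sample
    sequencing_total = n_flowcells * flowcell_cost
    return library_total + sequencing_total


# Example using NCI/NIAID prices from the tables above:
# 8 mRNA-Seq libraries ($57 each) sequenced on one
# NextSeq 2000 P2 100 Cycle flow cell ($1,455).
total = estimate_project_cost(8, 57, 1455)
print(f"Estimated total: ${total:,}")  # Estimated total: $1,911
```

The same function can be reused for other project types by substituting the relevant per-sample and per-flowcell prices.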

Pacbio Library Construction	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Sample	Cost/Sample
Amplicon	$48 + (Sequencing Cost)	$96 + (Sequencing Cost)
Iso-seq	$86 + (Sequencing Cost)	$172 + (Sequencing Cost)
WGS	$147 + (Sequencing Cost)	$294 + (Sequencing Cost)
HLA- multiplex class I and II	$146 + (Sequencing Cost)	$292 + (Sequencing Cost)
Full-length 16S	$7 + (Sequencing Cost)	$14 + (Sequencing Cost)
Single Cell MAS Iso-seq *	$485 + (Sequencing Cost)	$970 + (Sequencing Cost)
Ultra-low Input (PCR-based)	$263 + (Sequencing Cost)	$526 + (Sequencing Cost)
Targeted Iso-seq **	$225 + (Sequencing Cost)	$450 + (Sequencing Cost)
Xdrop Sort	$580 + (Sequencing Cost)	$1160 + (Sequencing Cost)
* Single cell capture is not included
** Probes cost is not included

Pacbio Sequencing	NCI/NIAID Price	NON NCI Price
Project Type	Cost/SMRT Cell	Cost/SMRT Cell
Sequel II SMRT Cell	$1,269	$2,538
Revio SMRT Cell	$995	$1,990

Oxford Nanopore Library Construction	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Sample	Cost/Sample
WGS	$139 + (Sequencing Cost)	$278 + (Sequencing Cost)
Targeted WGS (adaptive sampling)	$176 + (Sequencing Cost)	$352 + (Sequencing Cost)
CRISPR-Cas9 crRNA	$95 + (Sequencing Cost)	$190 + (Sequencing Cost)
Direct RNA-Seq	$134 + (Sequencing Cost)	$268 + (Sequencing Cost)
Ultra-Long	$181 + (Sequencing Cost)	$362 + (Sequencing Cost)
RNA-seq (cDNA-PCR)	$138 + (Sequencing Cost)	$276 + (Sequencing Cost)
Single cell transcriptomics	$103 + (Sequencing Cost)	$206 + (Sequencing Cost)
* Cost of CRISPR-Cas9 crRNA = $95

Oxford Nanopore Sequencing	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Flowcell	Cost/Flowcell
ONT Flow Cell R9.4.1/R10.4.1	$900	$1,800

Bionano Genomics-Optical Mapping Sample Prep	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Sample	Cost/Sample
Bionano Sample Prep for Blood and Cell	$195 + (Sequencing Cost)	$293 + (Sequencing Cost)

Bionano Sequencing	NCI/NIAID Price	NON NCI Price
Project Type	Cost/Sample	Cost/Sample
1 Sample	$1,100	$2,200
2 Samples	$825	$1,650
Set of 3 Samples	$549	$1,098

Protocols and Resources

SF Protocols and Resources

Here you will find all the forms necessary for submitting your sequencing proposal and samples to the laboratory. To aid in project planning, we have also provided handouts of the technical details of each sequencing platform as well as the sample preparation protocols used by our laboratory. Do you have additional questions about the Sequencing Facility? Check out our sequencing FAQs, containing the most common questions we receive!

Laboratory Forms and Information
Illumina	Single Cell
Sample Manifest and Submission Instructions	Sample Requirements Sample Manifest Form Sample Delivery Instructions

PacBio	ONT	Bionano
Sample Manifest Form Sample Delivery Instructions	Sample Manifest Form Sample Delivery Instructions SOP Preparing Frozen Cell Pellets Recommended (Generation 2)	Sample Manifest Form Sample Delivery Instructions

Protocols and Resources

Illumina

Sample processing workflow
Illumina sequencing primer
- Please contact Illumina Technical Support techsupport@illumina.com for up-to-date information

Illumina’s Sample Preparation Protocols

Single Cell

Single Cell Gene Expression (3’ or 5’)
Single Cell Immune Profiling
Single Cell ATAC Sequencing
PIPseq Single Cell RNA Sequencing (T2, T20, T100)
Tapestri Single Cell DNA Sequencing
Tapestri Single Cell DNA-Protein Sequencing

PacBio

Sample processing workflow
PacBio Sequencing Protocols
Whole genome and metagenome sequencing
RNA Iso-Seq
Amplicons sequencing
MAS Single Cell Iso-seq
16S sequencing
Targeted Iso-Seq
DNA Extraction Recommendations
Technical Note
SMRT Sequencing Information
Sequel System
HiFi Sequencing

ONT

ONT Sequencing Protocols
Direct RNA Sequencing
Whole Genome Sequencing
Ultra-long Whole Genome Sequencing
ONT Instruments
GridION
PromethION 2 Solo

Bionano

Bionano Sequencing Protocols
Bionano-Prep-SP-Frozen-Cell-Pellet-DNA-Isolation-Protocol
Bionano-Prep-Direct-Label-and-Stain-DLS-Protocol
Bionano Instruments
Bionano Saphyr

Bioinformatics Info

Bioinformatics Support at CCR-SF:

The CCR-SF uses high-throughput sequencing technologies to enrich cancer research and ensure that the NCI community can remain at the leading edge of NGS. Since its inception in 2009, the CCR-SF Bioinformatics group (CCR-SF IFX) has provided a broad range of bioinformatics support services to CCR investigators and their collaborators. Our team has diverse expertise in bioinformatics pipeline development and NGS data analysis support. Our mission is to provide the highest quality of sequencing data to our customers. We work closely with investigators to help get their NGS projects off the ground.

Main Area of Support:

Provide experimental design consultation including sequencing technology recommendation, library protocol consultation, sequencing coverage and cost estimate, etc.
Perform QC, secondary and tertiary data analysis for sequencing data from different platforms, including Illumina, PacBio, Oxford Nanopore, and Bionano.
Develop robust and reproducible analysis workflows/pipelines based on application types and sequencing technologies.
Work closely with CCR-SF’s laboratory component to support the adoption of new sequencing protocols and new technology development.
Provide training to customers for NGS technology and data analysis.

New Analysis Services:

Full Length Transcriptome Analysis – utilize PacBio Iso-seq for full-length transcript and novel splice variant discovery; utilize Oxford Nanopore technology for direct RNA sequencing and analysis.
Targeted Sequencing Analysis – utilize adaptive sampling with Oxford Nanopore technology to enrich or deplete any regions of interest (ROI) in the genome for high-quality, efficient coverage of the ROI; targeted RNA Iso-seq using hybridization capture to enrich full-length transcripts of interest for sequencing on PacBio; utilize Xdrop Sort for whole genome structural variation detection or viral integration site detection.
Single Cell Analysis – support both whole transcriptome and 3’ and 5’ capture-based technologies such as 10X Genomics and Fluent BioSciences PIPseq 3’ single-cell RNA sequencing, as well as single-cell multi-omics (targeted DNA, DNA+Protein) analysis using the Mission Bio Tapestri platform. In addition to short-read single cell sequencing analysis, we also provide analysis support for PacBio MAS-Seq or Oxford Nanopore sequencing of full-length transcripts from 10X Genomics single cell captures. Additionally, we provide analysis support for the ResolveOME kit, which offers a comprehensive view of the genome, mRNA transcriptome, and inferred impacts of protein sequence alterations.
Whole Genome Structural Variation and Copy Number Variation Detection – utilize long-read technologies such as PacBio and Oxford Nanopore, or Bionano Optical Genome Mapping technology, to detect variations or rearrangements in the structure of chromosomes as well as copy number variations.
Whole Genome Methylation Analysis – 5-mC and 5-hmC detection and analysis using protocols for Illumina short-read sequencing; DNA base modification detection (5-mC) using PacBio Single-Molecule Real-Time sequencing; direct detection of DNA and RNA base modifications (5mC, 5hmC, and 6mA in DNA, and m6A in RNA) using Oxford Nanopore sequencing.

General Questions for Bioinformatics

What analyses does the CCR-SF Bioinformatics group perform?
Sequencing depth and experimental design questions?
What is required to assure timely processing and delivery of my data?
What types of analysis workflows does the CCR-SF use to perform analyses?
What types of data formats will I receive from CCR-SF?
How do I analyze the data?
How large are the data delivery files?
How are the data files delivered?
How long is the data made available to download?
How do I obtain a LIMS account and submit an order in the CCR-SF LIMS?
What is the yield per run for different sequencing platforms?

Answers for General Questions

What analyses does the CCR-SF Bioinformatics group perform?

Currently we offer primary and secondary analyses for all NGS projects, including initial base-calling, demultiplexing, data quality control, and reference genome alignment of NGS reads. We also offer tertiary analyses on a limited basis for R&D projects, which may include de novo assembly, whole genome structural variant analysis, full-length transcriptomic splice variant detection, and single cell analysis. For all projects, we ensure that every sequencing run we deliver meets our high standards for yield, base-call quality, and base alignment percentage, as well as the application-specific metrics that we have established.

Sequencing depth and experimental design questions?

Coverage requirements vary by application, library protocol, sequencing platform, and project-specific considerations. To provide the best approach for your project, a meeting is set up between you and representatives from our sequencing facility in order to make recommendations on sequencing platform, library protocol, and other needs.

For assistance in planning your experiment or to discuss specifics of your project, please contact Bao Tran (bao.tran@nih.gov); for bioinformatics consultation, please contact Yongmei Zhao (yongmei.zhao@nih.gov).

Please go to the following web links for experimental design best practices:

ATAC-seq Best Practices: https://informatics.fas.harvard.edu/atac-seq-guidelines.html
ChIP-seq, RNA-seq, and Exome and Whole Genome-seq: https://bioinformatics.ccr.cancer.gov/ccbr/project-support/experimental-design-best-practices/
Whole genome sequencing and Structural Variation Detection Best Practices: coming soon
Single cell RNA-seq: coming soon
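
To make the depth discussion above more concrete, a common back-of-the-envelope estimate is mean coverage = total sequenced bases / genome size. The Python sketch below applies that formula. The function names and the 3.1 Gb human genome size are illustrative assumptions, and the estimate ignores duplicates, mapping rate, and other factors that affect real coverage, so it is a planning aid rather than a facility recommendation.

```python
import math

# Approximate haploid human genome size in bases (assumption for illustration).
HUMAN_GENOME_BP = 3_100_000_000


def mean_coverage(total_bases, genome_size_bp=HUMAN_GENOME_BP):
    """Mean depth = total sequenced bases / genome size."""
    return total_bases / genome_size_bp


def reads_needed(target_coverage, read_length_bp,
                 genome_size_bp=HUMAN_GENOME_BP):
    """Number of reads of a given length needed to reach a target mean depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


# 30 Gb of HiFi reads (the per-SMRT-cell figure quoted in the overview)
# over a human genome:
print(f"{mean_coverage(30e9):.1f}x")   # 9.7x

# Reads of 150 bp needed for 30x mean coverage of a human genome:
print(f"{reads_needed(30, 150):,}")    # 620,000,000
```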

What is required to assure timely processing and delivery of my data?

We recommend an initial consultation with the CCR-SF Bioinformatics group to discuss data analysis requirements and to establish expectations. It is also important to specify the reference genome version and annotation build for projects with human or mouse genome mapping requirements. For other reference-based sequencing projects, you will need to provide us with the reference sequences (FASTA file format or weblink).

If you have any questions regarding your preferred data processing options, please contact Yongmei Zhao (yongmei.zhao@nih.gov) or the SF Bioinformatics Team (CCRSF_IFX@nih.gov).

What types of analysis workflows does the CCR-SF use to perform analyses?

We currently provide analyses based on sequencing application type. We have designed and implemented in-house data analysis pipelines that integrate platform/vendor specific data analysis tools with popular open-source software.

Currently available data analysis pipelines:

Illumina Sequencing

ATAC-seq
ChIP-seq
Exome-seq
Whole Genome Sequencing for SNVs, CNVs and SVs
RNA-Seq
miRNA-Seq
Whole genome bisulfite-seq

PacBio Long-read Sequencing

16S amplicon
Iso-seq
De novo Assembly
Whole genome sequencing for mutations (SNVs) and structural variant (SV) analysis
DNA base modification
HLA Genotyping

Single Cell Analysis

Single Cell RNA-seq and CITE-seq
Single Cell Immune Profiling
Single Cell Multiome

Oxford Nanopore Sequencing

Direct RNA or full-length transcript sequencing
Adaptive sampling for regions of interest analysis or virus integration sites detection
Single cell full-length transcriptomic sequencing analysis

Bionano Optical Genome Mapping

Whole genome structural variant (SV) analysis and copy number variation (CNV) profiling

What types of data formats will I receive from CCR-SF?

For projects using the Illumina sequencing platform, you will receive a PDF report containing a summary of the sequencing project (i.e., library and sequencing protocols, sequencing result summary, application-based QC metrics, and software details) and an Excel file containing the detailed data analysis results. Depending on the application, you will also receive an HTML QC report file containing detailed QC statistics and plots for the analysis workflows included for that specific application. In addition, you will receive the pass-filtered raw sequence reads in FASTQ format and the reference alignment data in BAM format. BAM files contain base-call and quality score information for all pass-filtered reads, as well as alignment information for reads that have mapped to the reference genome. Additional application-specific data files are listed under the deliverable data file types below.
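
For readers unfamiliar with the FASTQ format mentioned above, the following minimal Python sketch reads standard four-line FASTQ records and computes a mean base quality per read. It assumes Phred+33 quality encoding and unwrapped (single-line) sequences. The function names `read_fastq` and `mean_phred` and the two example reads are hypothetical, and this is not the facility's own pipeline code.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:          # end of file
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()       # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


# Two made-up reads for demonstration only.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n!!II\n")
for name, seq, qual in read_fastq(example):
    print(name, len(seq), mean_phred(qual))
# read1 4 40.0
# read2 4 20.0
```

For delivered files, the same generator can be fed an open file handle (for example one returned by `gzip.open(path, "rt")` for compressed FASTQ) instead of the in-memory example.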

For projects using the PacBio sequencing platform, the data delivery choice is driven by the specific needs of the project. For example, when circular consensus processing is performed, the raw subreads BAM file, run definition XML files, and the consensus reads (CCS) are included in the data delivery package. If alignment and variant calling are performed, the resulting data are provided within BAM and VCF files. Files containing the intermediate results of pipeline processing (such as the read-to-cluster mapping for Iso-Seq) are sometimes included as well. Beyond that, we are happy to deliver any of the files produced by our processing upon request. The content of the data delivery package should be discussed at project definition time.

For standard projects, the deliverable data file types are:

Sequencing FASTQ/FASTA files
Alignment BAM files or assembly files
Data QC statistics reports
Mapping or variant calling statistics

For projects with secondary and application specific analysis, the deliverable data file types are:

Exome-seq or WGS Structural Variants Discovery:

Raw FASTQ files
Alignment BAM files
SNP/Indel and structural variant call VCF files
Structural variant call BED file
Variant annotation files
QC and variant analysis statistics reports

RNA-Seq:

Raw FASTQ files
STAR-2pass alignment BAM files
RSEM gene and transcript quantification count matrix files
QC and RNA analysis statistics reports

PacBio Iso-seq:

Raw data: CCS/HiFi reads BAM or FASTQ
QC and Statistics reports: MultiQC report, SQANTI3 report and Kraken contamination check
Analysis data: high quality clustered isoforms, full-length cDNAs, SQANTI3 filtered results including BAM, GTF as well as classification table.

PacBio De novo Assembly:

Raw data: CCS reads BAM or FASTQ
QC and Statistics reports: MultiQC report, assembly report and Kraken contamination check
Analysis data: polished contigs

PacBio Long Amplicon Sequencing:

Raw data: CCS reads BAM or FASTQ
QC and Statistics reports: MultiQC report, Kraken contamination check
Analysis data: Clustered long amplicon consensus, phasing and variant analysis file

PacBio WGS Sequencing:

Raw data: CCS reads BAM or FASTQ
QC and Statistics reports: MultiQC report, Kraken contamination check
Analysis data: mapped BAM file, SV VCF file

PacBio HLA Genotyping:

Raw data: CCS reads BAM or FASTQ
QC and Statistics reports: MultiQC report, Kraken contamination check
Analysis data: mapped BAM files, standard HLA genotyping reports

Single Cell RNA:

Cell Ranger output
Seurat clustering
SingleR annotations
Nozzle report

Single Cell ATAC:

Cell Ranger output
Signac clustering

Single Cell Multiome:

Cell Ranger output

Single Cell Immune Profiling:

Cell Ranger output

Single Cell Fixed RNA Profiling:

Cell Ranger output

Single Cell CNV:

Cell Ranger DNA output

Single Cell PIPseq:

PIPseeker output

How do I analyze the data?

The SF typically provides primary, secondary and sometimes tertiary data analysis, which includes delivery of the FASTQ pass-filtered raw read files and alignment BAM files, gene quantification counting files, or variant analysis VCF files to the customer. Investigators are expected to provide for their own downstream analyses not offered by the SF bioinformatics group. For investigators interested in performing their own bioinformatics in-house, there are several commercial software options from Illumina, PacBio, and third-party vendors. In addition, many open-source NGS software tools are freely available from Biowulf and other online computing sources.

For investigators in need of assistance with downstream NGS data analyses, the CCR Collaborative Bioinformatics Resource (CCBR) provides expert bioinformatics data analysis for the Center for Cancer Research at the NCI free of charge. To contact the CCBR, please submit a request through the CCBR Project Submission Form at https://bioinformatics.ccr.cancer.gov/ccbr/project-support/

How large are the data delivery files?

Because NGS sequencing is still a rapidly evolving field, this answer changes regularly. Please contact the bioinformatics group for current data delivery file size information.

How are the data files delivered?

Please contact the bioinformatics group to discuss your options. The original sequence, alignment, and analysis files are available to download through the CBIIT DME system. To access your project data in the DME system, please email Yongmei Zhao (yongmei.zhao@nih.gov) or the SF Bioinformatics Team (CCRSF_IFX@nih.gov) to get your NIH account linked to the DME system. You will need to register an account for each lab member planning to log in. Please follow the DME tutorials to access and download your project data.

If you or your collaborator does not have an NIH account, we recommend that you register an account at GlobusFTP (https://www.globus.org/) in order to transfer data via the GlobusFTP site. Please see the following tutorial on registering an account and transferring data: https://hpc.nih.gov/docs/globus/

If you have any issues setting up a Globus account or transferring data via the shared endpoint, please contact us via email CCRSF_IFX@nih.gov.

How long is the data made available to download?

The retention of data files located on the CBIIT DME system depends on the data life cycle defined by the data policy implemented at CBIIT. Currently, data is available online for 5 years after the initial project data generation. For data files uploaded to the Globus system, we make data available for up to 2 weeks starting from the date of our data delivery email announcement. It is the responsibility of the investigator, laboratory contact, or bioinformatics contact to ensure that they have retrieved their data promptly. To maintain sufficient data storage for upcoming projects, the analysis files are then archived and stored for an additional four weeks for Globus data transfer.

If your data is no longer available for download, please contact the SF bioinformatics group and we can re-run the data processing and alignment as necessary. However, please note that it may take longer to receive the re-analyzed data due to resource conflicts with current production runs. Whenever possible, it is best to download the data in a timely manner after receipt of the delivery notice.
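
For planning purposes, the Globus windows described above (up to 2 weeks of availability after the delivery email, then an additional four weeks in archive) can be turned into calendar dates. The Python sketch below does this. The function name `globus_deadlines` and the example delivery date are hypothetical, and the exact cut-off handling is an assumption, so please confirm actual deadlines with the SF bioinformatics group.

```python
from datetime import date, timedelta

# Windows taken from the answer above; exact cut-off rules are an assumption.
GLOBUS_AVAILABLE = timedelta(weeks=2)   # download window after delivery email
GLOBUS_ARCHIVED = timedelta(weeks=4)    # additional archive period


def globus_deadlines(delivery_date):
    """Return (download_by, archived_until) for a Globus data delivery."""
    download_by = delivery_date + GLOBUS_AVAILABLE
    archived_until = download_by + GLOBUS_ARCHIVED
    return download_by, archived_until


# Hypothetical delivery email dated 1 March 2024.
download_by, archived_until = globus_deadlines(date(2024, 3, 1))
print(download_by, archived_until)  # 2024-03-15 2024-04-12
```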

How do I obtain a LIMS account and submit an order in the CCR-SF LIMS?

Instructions on how to initiate the LIMS account set-up process for your group are available in the LIMS user guide, as well as instructions on how to submit an order once your account is authorized by CCR-SF.

In order to have your account authorized, the PI should email Yongmei Zhao (yongmei.zhao@nih.gov) or the SF Bioinformatics Team (CCRSF_IFX@nih.gov) with a list of group members requiring LIMS account access after successful completion of the account creation steps in the LIMS user guide.

What is the yield per run for different sequencing platforms?

The following table links to the vendor specification pages, which give example yields per sequencing instrument type for the vendor-supported chemistry and flow cell types used for applications at CCR-SF. Actual performance may vary based on sample type, sample quality, and clusters passing filter.

Sequencing Platform	Specification Website
Illumina NextSeq 2000	https://www.illumina.com/systems/sequencing-platforms/nextseq-1000-2000/specifications.html
Illumina NovaSeq 6000	https://www.illumina.com/systems/sequencing-platforms/novaseq/specifications.html
Illumina NovaSeq X Plus	https://www.illumina.com/systems/sequencing-platforms/novaseq-x-plus/specifications.html
PacBio Sequel System	https://www.pacb.com/technology/hifi-sequencing/sequel-system/
PacBio Revio System	https://www.pacb.com/wp-content/uploads/Revio-specification-sheet.pdf
Oxford Nanopore GridION	https://nanoporetech.com/products/gridion
Oxford Nanopore PromethION	https://nanoporetech.com/products/promethion

For further questions, please contact the SF Bioinformatics Team at CCRSF_IFX@nih.gov.

Contacts

For questions concerning the Sequencing Facility, proposal submission and funding, and project status, please contact:

Bao Tran

Director, Sequencing Facility

ATRF Room D-3047
301-360-3460
bao.tran@nih.gov

Jyoti Shetty

Illumina Lab Manager

ATRF Room D-3038
301-360-3454
jyoti.shetty@nih.gov

Yongmei Zhao

Bioinformatics Manager

ATRF Room D-3048
301-360-3455
yongmei.zhao@nih.gov

Oksana German

Illumina QA Specialist

ATRF Room D-3037
301-360-3457
oksana.german@nih.gov

Yunlong He

PacBio Lab Manager / R&D Scientist

ATRF Room D-3004
301-846-7087
yunlong.he2@nih.gov

Juanma Caravaca

ONT Lab Manager / R&D Scientist

ATRF Room D-3006
301-228-4526
juanmanuel.carava@nih.gov

R&D/Coming Soon

New Instruments:

Revio: Newest generation long read sequencer from PacBio

Upgraded flow cells with 25M ZMWs (3x increase from Sequel IIe)
Shorter runtimes/parallel sequencing
On-board computation with Google DeepConsensus (>20x more computing power)

NovaSeq X Plus from Illumina

XLEAP-SBS chemistry – an even faster, higher quality, and more robust version of sequencing by synthesis (SBS) chemistry.
Ultra-high density patterned flow cell
Three flow cell types (1.5B, 10B and 25B) and up to 16 Tb output per run (approximately a 3-fold increase over the current NovaSeq 6000)

Nabsys: High-Definition Mapping (HDM) using electronic detection of tagged single high molecular weight (HMW) DNA molecules

HDM provides routine, accurate, cost-effective analysis of genomic structural information, unavailable with short read technologies.
These characteristics make HDM an ideal first-line approach for a variety of applications for small and large genomes, including de novo map assembly, structural variant analysis, hybrid assembly, metagenome characterization and strain identification.

Xdrop-Sort: Target DNA enrichment for SV or virus integration detection

A novel microfluidic-based system that allows for targeted enrichment of long DNA molecules using only a few nanograms of DNA.
The method is based on the isolation of long DNA fragments in millions of droplets, where the droplets containing a target sequence of interest are fluorescently labeled and sorted. The final product is an enriched population of DNA molecules that can be investigated by long read sequencing.
Single cell RNA-seq applications will be coming soon with the release of a new cartridge from Samplix.

New Applications:

Cell-free DNA sequencing: Cell-free DNA (cfDNA) refers to all non-encapsulated DNA in the bloodstream. cfDNA are nucleic acid fragments that enter the bloodstream during apoptosis or necrosis. A portion of that cell-free DNA may originate from a tumor clone and is called circulating tumor DNA (ctDNA). cfDNA sequencing will therefore provide a quick and easy way for early cancer detection. The R&D team is currently in the process of developing adapted protocols for short read and long read cfDNA sequencing.

Single cell RNA-seq on ONT with or without adaptive sampling depletion/enrichment: Single cell RNA sequencing (scRNA-seq) technology has become the state-of-the-art approach for unraveling the heterogeneity and complexity of RNA transcripts within individual cells, as well as revealing the composition of different cell types and functions within highly organized tissues/organs/organisms. ONT’s high-throughput long read sequencer, PromethION, can sequence full length cDNA generated from single cell RNA-seq captures and detect not only gene expression but also isoform information at the single cell level. Adaptive sampling on ONT can selectively sequence genes of interest and increase the coverage of the regions of interest. Our R&D team is establishing the protocol for single cell RNA Iso-seq on ONT with or without adaptive sampling.

ResolveOME single-cell whole genome and transcriptome amplification from BioSkryb Genomics: BioSkryb Genomics has developed a unified system named ResolveOME for single cell whole transcriptome and whole genome amplification sequencing analysis. The ResolveOME system allows comprehensive analysis of the transcriptome and genome in parallel from the same cell. It provides high-resolution genome analysis down to the single base level combined with the comprehensive full length mRNA transcriptome, and it helps reveal the interplay of these omic layers within and between individual cells. Our R&D team is evaluating the performance of this protocol.

Illumina Complete Long Read Sequencing Technology generates contiguous long-read sequences with N50 of 5–7 kb with some reads > 10 kb. It has the potential to improve the efficiency and accuracy of some existing DNA sequencing applications while increasing the resolution of clinically important genes. The technology simplifies de novo sequencing because large repeat regions in the DNA fragments can easily be spanned.

5-hmC and 5-mC Detection and Analysis: Discrimination between 5-mC and 5-hmC in CCGG sequences by enzymatic digestion and PCR amplification using the EpiMark® 5-hmC and 5-mC Analysis Kit. The kit can also be used to analyze and quantitate 5-methylcytosine and 5-hydroxymethylcytosine within a specific locus.

Publications

2019

Zhao Y*, Mehta M*, Walton A*, Talsania K*, Levin Y, Shetty J, Gillanders EM, Tran B, Carrick DM. Robustness of RNA sequencing on older formalin-fixed paraffin-embedded tissue from high-grade ovarian serous adenocarcinomas. PLoS 2019 May 6;14(5): e0216050.

Levin Y^†, Talsania K^†, Tran B, Shetty J*, Zhao Y*, Mehta M*. Optimization for sequencing and analysis of degraded FFPE-RNA samples. JoVE, In Press. (^†Co-first authors; * Co-corresponding authors)

Magen A, Nie J, Ciucci T, Tamoutounour S, Zhao Y, Mehta M, Tran B, McGavern DB, Hannenhalli S, Bosselut R. Single-Cell Profiling Defines Transcriptomic Signatures Specific to Tumor-Reactive versus Virus-Responsive CD4⁺T Cells. Cell Reports, 2019, 29(10): 3019-3032.e6

The Somatic Mutation Working Group of the SEQC-II consortium, Xiao W, Kusko R, Ren L, Fang F, Shen T, Talsania K, Kriga Y, Shetty J, Tran B, Zhao Y, et al. Towards best practice in cancer mutation detection with whole-genome and whole-exome sequencing. Nat Biotechnol, 2019. Accepted

Ma L, Hernandez M, Zhao Y, Mehta M, Tran B, Kelly M, Rae Z, Hernandez J, Davis J, Martin S, Kleiner D, Hewitt S, Ylaya K, Wood B, Greten T, Wang X. Tumor Cell Biodiversity Drives Microenvironmental Reprogramming in Liver Cancer. Cancer Cell. 2019 Oct 03.

Jiao X, Sui H, Lyons C, Tran B, Sherman BT, Imamichi T. Complete Genome Sequence of Herpes Simplex Virus 1 Strain McKrae. Microbiol Resour Announc. 2019 Sep 26;8(39).

Jiao X, Sui H, Lyons C, Tran B, Sherman BT, Imamichi T. Complete Genome Sequence of Herpes Simplex Virus 1 Strain MacIntyre. Microbiol Resour Announc. 2019 Sep 12;8(37).

Vacchio MS, Ciucci T, Gao Y, Watanabe M, Balmaceno-Criss M, McGinty MT, Huang A, Xiao Q, McConkey C, Zhao Y, Shetty J, Tran B, Pepper M, Vahedi G, Jenkins MK, McGavern DB, Bosselut R. A Thpok-Directed Transcriptional Circuitry Promotes Bcl6 and Maf Expression to Orchestrate T Follicular Helper Differentiation. Immunity. 2019 Sep 17;51(3):465-478.e6. Epub 2019 Aug 15.

Talsania K, Mehta M, Raley C, Kriga Y, Gowda S, Grose C, Drew M, Roberts V, Tai Cheng K, Burkett S, Oeser S, Stephens R, Soppet D, Chen X, Kumar P, German O, Smirnova T, Hautman C, Shetty J, Tran B, Zhao Y, & Esposito D. Genome Assembly and Annotation of the Trichoplusia ni Tni-FNL Insect Cell Line Enabled by Long-Read Technologies. Gene, 2019, 10 (2). pii: E79.

Ciucci T, Vacchio MS, Gao Y, Ardori FT, Candia J, Mehta M, Zhao Y, Tran B, Tessarollo L, McGavern D, & Bosselut R. Emergence and functional fitness of memory CD4+ T cells require the transcription factor Thpok. Immunity, 2019, 50(1): 91-105.e4.

2018

Zheng H, Pomyen Y, Hernandez MO, Li C, Livak F, Tang W, Dang H, Greten T, Zhao Y, Mehta M, Levin Y, Shetty J, Tran B, Budhu A, and Wang XW. Single cell analysis reveals cancer stem cell heterogeneities in hepatocellular carcinoma. Hepatology, 2018, 68(1): 127-140.

Schmitz R, Wright GW, Huang DW, Johnson CA, Phelan JD, Wang JQ, Roulland S, Kasbekar M, Young RM, Shaffer AL, Hodson DJ, Xiao W, Yu X, Yang Y, Zhao H, Xu W, Liu X, Zhou B, Du W, Chan WC, Jaffe ES, Gascoyne RD, Connors JM, Campo E, Lopez-Guillermo A, Rosenwald A, Ott G, Delabie J, Rimsza LM, Tay Kuang Wei K, Zelenetz AD, Leonard JP, Bartlett NL, Tran B, Shetty J, Zhao Y, Soppet DR, Pittaluga S, Wilson WH, Staudt LM. Genetics and Pathogenesis of Diffuse Large B-Cell Lymphoma. N Engl J Med. 2018 Apr 12;378(15):1396-1407.

Miller ME, Zhang Y, Omidvar V, Sperschneider J, Schwessinger B, Raley C, Palmer JM, Garnica D, Upadhyaya N, Rathjen J, Taylor JM, Park RF, Dodds PN, Hirsch CD, Kianian SF, Figueroa M: De Novo assembly and phasing of dikaryotic genomes from two isolates of Puccinia coronata f. sp. avenae, the causal agent of oat crown rust. MBio, 9(1):2018.

Greer YE, Porat-Shliom N, Nagashima K, Stuelten C, Crooks D, Koparde VN, Gilbert SF, Islam C, Ubaldini A, Ji Y, Gattinoni L, Soheilian F, Wang X, Hafner M, Shetty J, Tran B, Jailwala P, Cam M, Lang M, Voeller D, Reinhold WC, Rajapakse V, Pommier Y, Weigert R, Linehan WM, Lipkowitz S. ONC201 kills breast cancer cells in vitro by targeting mitochondria. Oncotarget. 2018 Apr 6;9(26):18454-18479.

Cramer SD, Hixon JA, Andrews C, Porter RJ, Rodrigues GOL, Wu X, Back T, Czarra K, Michael H, Cam M, Chen J, Esposito D, Senkevitch E, Negi V, Aplan PD, Li W, Durum SK. Mutant IL-7Rα and mutant NRas are sufficient to induce murine T cell acute lymphoblastic leukemia. Leukemia. 2018 Aug;32(8):1795-1882.

2017

Shukla A, Zhu J, Kim SY, Hager G, Ruan Y and Hunter KW (2017) Identification of a core inherited metastatic susceptibility network by integrated epigenetic, genetic and chromosomal interaction analysis. Manuscript in preparation

Carpenter AC, Wohlfert E, Chopp LB, Vacchio MS, Nie J, Zhao Y, Shetty J, Xiao Q, Deng C, Tran B, Cam M, Gaida MM, Belkaid Y, Bosselut R. Control of Regulatory T Cell Differentiation by the Transcription Factors Thpok and LRF. J Immunol. 2017 Sep 1;199(5): 1716-1728.

2016

Hodson DJ, Shaffer AL, Xiao W, Wright GW, Schmitz R, Phelan JD, Yang Y, Webster DE, Rui L, Kohlhammer H, Nakagawa M, Waldmann TA, Staudt LM. Regulation of normal B cell differentiation and malignant B cell survival by OCT2. Proc Natl Acad Sci 2016 113:E2039-E2046.

Thompson, Bethtrice; Varticovski, Lyuba; Baek, Songjoon; et al. Hager GL. Genome-Wide Chromatin Landscape Transitions Identify Novel Pathways in Early Commitment to Osteoblast Differentiation. PLOS ONE Volume: 11 Issue: 2

Yang Y, Kelly P, Shaffer AL, Schmitz R, Liu X, Huang DW, Webster D, Young RM, Yoo H, Nakagawa M, Ceribelli M, Wright GW, Yang Y, Zhao H, Yu X, Xu W, Chan WC, Jaffe ES, Gascoyne RD, Campo E, Rosenwald A, Ott G, Delabie J, Rimsza L, Staudt LM. Targeting non-proteolytic protein ubiquitination for the treatment of diffuse large B cell lymphoma. Cancer Cell 2016 29:494-507.

Kuschal C, Botta E, Orioli D, Digiovanna JJ, Seneca S, Keymolen K, Tamura D, Heller E, Khan SG, Caligiuri G, Lanzafame M, Nardo T, Ricotti R, Peverali FA, Stephens R, Zhao Y, Lehmann AR, Baranello L, Levens D, Kraemer KH, Stefanini M. GTF2E2 Mutations Destabilize the General Transcription Factor Complex TFIIE in Individuals with DNA Repair-Proficient Trichothiodystrophy. Am J Hum Genet. 2016 Apr 7;98(4):627-42.

Liang M, Raley C, Zheng X, Kutty G, Gogineni E, Sherman BT, Sun Q, Chen X, Skelly T, Jones K, Stephens R, Zhou B, Lau W, Johnson C, Imamichi T, Jiang M, Dewar R, Lempicki RA, Tran B, Kovacs JA, Huang DW. Distinguishing highly similar gene isoforms with a clustering-based bioinformatics analysis of PacBio single-molecule long reads. BioData Min. 2016 Apr 5; 9:13.

Huang DW, Raley C, Jiang MK, Zheng X, Liang D, Rehman MT, Highbarger HC, Jiao X, Sherman B, Ma L, Chen X, Skelly T, Troyer J, Stephens R, Imamichi T, Pau A, Lempicki RA, Tran B, Nissley D, Lane HC, Dewar RL. Towards Better Precision Medicine: PacBio Single-Molecule Long Reads Resolve the Interpretation of HIV Drug Resistant Mutation Profiles at Explicit Quasispecies (Haplotype) Level. J Data Mining Genomics Proteomics. 2016 Jan;7(1). pii: 182. Epub 2015 Nov 8.

Ma L, Chen Z, Huang da W, Kutty G, Ishihara M, Wang H, Abouelleil A, Bishop L, Davey E, Deng R, Deng X, Fan L, Fantoni G, Fitzgerald M, Gogineni E, Goldberg JM, Handley G, Hu X, Huber C, Jiao X, Jones K, Levin JZ, Liu Y, Macdonald P, Melnikov A, Raley C, Sassi M, Sherman BT, Song X, Sykes S, Tran B, Walsh L, Xia Y, Yang J, Young S, Zeng Q, Zheng X, Stephens R, Nusbaum C, Birren BW, Azadi P, Lempicki RA, Cuomo CA, Kovacs JA. Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts. Nat Commun. 2016 Feb 22;7:10740.

Rui L, Drennan AC, Ceribelli M, Zhu F, Wright GW, Xiao W, Grindle KM, Lu L, Hodson DJ, Zhao H, Xu W, Yang Y, Staudt LM. Epigenetic gene regulation by Janus kinase 1 in diffuse large B cell lymphoma. Proc Natl Acad Sci, in press, 2016.

Smith OK, Kim RG, Fu H, Martin M, Utani K, Zhang Y, Marks AB, Lalande M, Chamberlaine S, Libbrecht MW, Bouhassira EE, Ryan MC, Noble WC, Aladjem MI. Distinct Epigenetic Features of Differentiation-Regulated Replication Origins. Epigenetics and Chromatin 9:18. 2016.

Zhang Y, Huang L, Fu H, Smith OK, Lin CM, Utani K, Rao M, Reinhold WC, Redon CE, Ryan M, Kim RG, You Y, Hanna H, Boisclair Y, Long Q, Aladjem MI. A Replicator-Specific Binding Protein Essential For Site-Specific Initiation of DNA Replication in Mammalian Cells. Nat. Commun. 7:11748. 2016.

Ceribelli M, Hou EZ, Kelly PN, Huang DW, Ganapathi K, Evbuomwan MO, Pittaluga S, Shaffer AL, Wright G, Marcucci G, Forman SJ, Xiao W, Guha R, Zhang X, Ferrer M, Chaperot L, Plumas L, Jaffe ES, Thomas CJ, Reizis B, Staudt LM. A druggable TCF4- and BRD4-dependent transcriptional network sustains malignancy in blastic plasmacytoid dendritic cell neoplasm. Cancer Cell 2016, in press.

Zhang M, Lykke-Andersen S, Zhu B, Xiao W, Hoskins JW, Jermusyk A, Zhang X, Rost L, Collins I, Jia J, Parikh H, Zhang T, Song L, Zhu B, Zhou W, Matters GL, Kurtz RC, Yeager M, Jensen TH, Brown KM, Bamlet WR, TCGA Research Network, Chanock S, Chatterjee N, Wolpin BM, Smith J, Olson SH, Petersen GM, Shi J, Amundadottir LT. Characterizing cis-regulatory variation in the transcriptome of histologically normal and tumor-derived pancreatic tissues. 2016: Gut

Doran AG, Wong K, Flint J, Adams DJ, Hunter KW* and Keane TM* (2016) Deep genome sequencing and variation analysis of 13 inbred mouse strains find novel missense mutations in essential DNA repair pathway genes. Genome Biology, 17:167.

Zhang S, Zhu I, Deng T, Furusawa T, Rochman M, Vacchio MS, Bosselut R, Yamane A, Casellas R, Landsman D, Bustin M. HMGN proteins modulate chromatin regulatory sites and gene expression during activation of naïve B cells. Nucleic Acids Res. 2016 Sep 6;44(15):7144-58.

Deng T, Zhu ZI, Zhang S, Postnikov Y, Huang D, Horsch M, Furusawa T, Beckers J, Rozman J, Klingenspor M, Amarie O, Graw J, Rathkolb B, Wolf E, Adler T, Busch DH, Gailus-Durner V, Fuchs H, Hrabě de Angelis M, van der Velde A, Tessarollo L, Ovcherenko I, Landsman D, Bustin M. Functional compensation among HMGN variants modulates the DNase I hypersensitive sites at enhancers. Genome Res. 2015 Sep;25(9):1295-308.

Deng T, Zhu ZI, Zhang S, Leng F, Cherukuri S, Hansen L, Mariño-Ramírez L, Meshorer E, Landsman D, Bustin M. HMGN1 modulates nucleosome occupancy and DNase I hypersensitivity at the CpG island promoters of embryonic stem cells. Mol Cell Biol. 2013 Aug;33(16):3377-89.

Bai L, Yang H, Hu Y, Shukla, A, Ha, N-H, Doran A, Faraji F, Goldberger N, Lee M, Keane T and Hunter KW. (2016) An integrated genome-wide systems genetics screen for breast cancer susceptibility genes. PLoS Genetics.

Ha N-H, Long J, Cai Q, Shu X-O and Hunter KW. The circadian rhythm gene Arntl2 is a metastasis susceptibility gene for estrogen receptor-negative breast cancer. PLoS Genetics, 12(9) e1006267. The article highlighted by the journal (Siracusa and Bussard, PLoS Genetics 12(9) e1006299).

Kim J, Sturgill D, Tran AD, Sinclair DA, Oberdoerffer P. Controlled DNA double-strand break induction in mice reveals post-damage transcriptome stability. Nucleic Acids Res. 2016 Apr 20;44(7):e64.

Khurana S, Kruhlak MJ, Kim J, Tran AD, Liu J, Nyswaner K, Shi L, Jailwala P, Sung MH, Hakim O, Oberdoerffer P. A macrohistone variant links dynamic chromatin compaction to BRCA1-dependent genome maintenance. Nucleic Acids Res. 2016 Apr 20;44(7):e64.

2015

Young RM, Wu T, Schmitz T, Dawood M, Xiao W, Phelan JD, Xu W, Menard L, Meffre E, Chan WC, Jaffe ES, Gascoyne RD, Campo E, Rosenwald A, Ott G, Delabie J, Rimsza L, Staudt LM. Survival of human lymphoma cells requires B cell receptor engagement by self-antigens. Proc Natl Acad Sci 2015 112:13447-54.

Manna S, Kim JK, Baugé C, Cam M, Zhao Y, Shetty J, Vacchio MS, Castro E, Tran B, Tessarollo L, Bosselut R. Histone H3 Lysine 27 demethylases Jmjd3 and Utx are required for T-cell differentiation. Nat Commun. 2015;6:8152

Miles, George; Zhao, Yongmei; Levin, Yelena; et al. Multiplex Tissue and Clinical Proteomics By Next-Generation Sequencing Conference: 104th Annual Meeting of the United-States-and-Canadian-Academy-of-Pathology Location: Boston, MA Date: MAR 21-27, 2015

Fu H, Martin MM, Regairaz M, Huang L, You Y, Lin CM, Ryan M, Kim R, Shimura T, Pommier Y, Aladjem MI. The DNA repair endonuclease Mus81 facilitates fast DNA replication in the absence of exogenous damage. Nature Communications 6:6746. 2015.

Bartholdy B, Mukhopadhyay R, Lajugie J, Aladjem MI, Bouhassira EE. Allele-specific analysis of DNA replication origins in mammalian cells. Nat Commun.6:7051. 2015.

2014

Schmitz R, Ceribelli M, Pitaluga S, Wright G, and Staudt LM. Oncogenic mechanisms in Burkitt lymphoma. Cold Spring Harb Perspect Med. 2014 4:1-13.

Yang Y, Schmitz R, Mitala J, Whiting A, Xiao W, Ceribelli M, Wright G, Zhao H, Yang Y, Xu W, Rosenwald A, Ott G, Gascoyne RD, Connors JM, Rimsza LM, Campo E, Jaffe ES, Delabie J, Smeland EB, Braziel RM, Tubbs RR, Cook JR, Weisenburger DD, Chan WC, Wiestner A, Kruhlak MJ, Iwai K, Bernal F, Staudt LM. Essential role of the linear ubiquitin chain assembly complex in lymphoma revealed by rare germline polymorphisms. Cancer Discovery 2014 4:480-93.

Yudkin D, Hayward B, Aladjem MI, Kumari D, Usdin K. Chromosome fragility and the abnormal replication of the FMR1 locus in Fragile X syndrome. Hum Mol Genet, 23:2940-52. 2014.

Mukhopadhyay R, Lajugie J, Fourel N, Selzer A, Schizas M, Bartholdy B, Mar J, Lin CM, Martin MM, Ryan M, Aladjem MI, Bouhassira EE. Allele-specific genome-wide profiling in human primary erythroblasts reveals replication program organization. PLoS Genetics 10(5): e1004319. 2014.

Hoskins JW, Jia J, Flandez M, Parikh H, Xiao W, Collins I, Emmanuel MA, Ibrahim A, Powell J, Zhang L, Malats N, Bamlet WR, Petersen GM, Real FX, Amundadottir LT. Transcriptome analysis of pancreatic cancer reveals a tumor suppressor function for HNF1A. Carcinogenesis 2014; 35(12): 2670-2678.

Yi, Ming; Zhao, Yongmei; Jia, Li; et al. Performance comparison of SNP detection tools with Illumina exome sequencing data-an assessment using both family pedigree information and sample-matched SNP array data. NAR Volume: 42. Issue: 12 Article Number: e101

Muppidi JR, Schmitz R, Green JA, Xiao W, Larsen AB, Braun SE, An J, Xu Y, Rosenwald A, Ott G, Gascoyne RD, Rimsza LM, Campo E, Jaffe ES, Delabie J, Smeland EB, Braziel RM, Tubbs RR, Cook JR, Weisenburger DD, Chan WC, Vaidehi N, Staudt LM*, Cyster JG*. Loss of signaling via Gα13 in germinal center B cell-derived lymphoma. Nature 2014 516: 254-8.

Ceribelli M, Kelly P, Shaffer AL, Wright G, Yang Y, Mathews-Griner LA, Guha R, Shinn P, Keller JM, Liu D, Patel PR, Ferrer M, Joshi S, Nerle S, Sandy P, Normant E, Thomas CJ, Staudt LM. Blockade of oncogenic IkB kinase activity in ABC DLBCL by small molecule BET protein inhibitors. Proc Natl Acad Sci 2014 111:11365-70.

Nakagawa M, Schmitz R, Xiao W, Goldman CK, Xu W, Yang Y, Yu X, Waldmann TA, Staudt LM. Gain-of-function CCR4 mutations in adult T-cell leukemia/lymphoma. J Exp Med 2014 211:2497-2505.

2013

Xiao W, Tran B, Staudt LM, Schmitz R. High-throughput RNA sequencing in B-cell lymphomas. Methods Mol Biol 2013 971:295-312.

Jia J, Parikh H, Xiao W, Hoskins JW, Pflicke H, Liu X, Collins I, Zhou W, Wang Z, Powell J, Thorgeirsson SS, Rudloff U, Petersen GM, Amundadottir LT. An integrated transcriptome and epigenome analysis identifies a novel candidate gene for pancreatic cancer. BMC Med Genomics 2013; 6:33.

Fu YP, Kohaar I, Rothman N, Earl J, Figueroa JD, Ye Y, Malats N, Tang W, Liu L, Garcia-Closas M, Muchmore B, Chatterjee N, Tarway M, Kogevinas M, Porter-Gill P, Baris D, Mumy A, Albanes D, Purdue MP, Hutchinson A, Carrato A, Tardón A, Serra C, García-Closas R, Lloreta J, Johnson A, Schwenn M, Karagas MR, Schned A, Diver WR, Gapstur SM, Thun MJ, Virtamo J, Chanock SJ, Fraumeni JF Jr, Silverman DT, Wu X, Real FX, Prokunina-Olsson L. Common genetic variants in the PSCA gene influence gene expression and bladder cancer risk. Proc Natl Acad Sci U S A. 2012 Mar 27;109(13):4974-9.

Swaminathan, Sanjay; Hu, Xiaojun; Zheng, Xin; et al. Interleukin-27 treated human macrophages induce the expression of novel microRNAs which may mediate anti-viral properties. BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS Volume: 434 Issue: 2 Pages: 228-234.

Fu H, Maunakea AK, Martin MM, Huang L, Zhang Y, Ryan M, Kim R, Lin CM, Zhao K, Aladjem MI. Methylation of histone H3 on lysine 79 associates with a group of replication origins and helps limit DNA replication once per cell cycle. PLoS Genet. 9:e1003542. 2013.

Collaborative Publications

2012

Snow AL, Xiao W, Stinson JR, Lu W, Chaigne-Delalande B, Zheng L, Pittaluga S, Matthews HF, Schmitz R, Jhavar S, Kuchen S, Kardava L, Wang W, Lamborn IT, Jing H, Raffeld M, Moir S, Fleisher TA, Staudt LM, Su HC, Lenardo MJ. Congenital B cell lymphocytosis explained by novel germline CARD11 mutations. J Exp Med 2012 209:2247-61.

Grontved L, Hager GL. Impact of chromatin structure on PR signaling: Transition from local to global analysis. Mol Cell Endocrinol. 357, 30-36.

Li M¹, He Y, Dubois W, Wu X, Shi J, Huang J. Distinct Regulatory Mechanisms and Functions for p53-Activated and p53-Repressed DNA Damage Response Genes in Embryonic Stem Cells, Molecular Cell (2012)

Yang Y, Shaffer AL, Emre NCT, Ceribelli M, Wright G, Xiao W, Powell J, Platig J, Kohlhammer H, Young RM, Zhao H, Yang Y, Xu W, Balasubramanian S, Buggy JJ, Mathews LA, Shinn P, Guha R, Ferrer M, Thomas C, Staudt LM. Exploiting synthetic lethality for the therapy of ABC diffuse large B cell lymphoma. Cancer Cell 2012 21:723–737.

Koh Y, Wu X, Ferris AL, Matreyek KA, Smith SJ, Lee K, KewalRamani VN, Hughes SH, Engelman A: Differential effects of human immunodeficiency virus type 1 capsid and cellular factors nucleoporin 153 and ledgf/p75 on the efficiency and specificity of viral DNA integration. Journal of Virology. 2012.

Wang H, Jurado KA, Wu X, Shun MC, Li X, Ferris AL, Smith SJ, Patel PA, Fuchs JR, Cherepanov P, Kvaratskheila M, Hughes SH, Engelman A: Hrp2 determines the efficiency and specificity of hiv-1 integration in ledgf/p75 knockout cells but does not contribute to the antiviral activity of a potent ledgf/p75-binding site integrase inhibitor. Nucleic acids research. 2012.

Schmitz R, Young RM, Cerribeli M, Jhavar S, Xiao W, Zhang M, Wright G, Shaffer AL, Hodson D, Buras E, Lu X, Powell J, Yang Y, Xu W, Zhao H, Kohlhammer H, Rosenwald A, Kluin P, Muller-Hermelink HK, Ott G, Gascoyne RD, Connors JM, Rimsza LM, Campo E, Jaffe ES, Delabie J, Smeland EB, Fisher RI , Braziel RM, Tubbs RR, Cook JR, Weisenburger DD, Chan WC, Pittaluga S, Wilson W, Waldmann TA, Rowe M, Mbulaiteye SM, Rickinson AB, Staudt LM. Pathogenetic mechanisms and therapeutic targets in Burkitt lymphoma from structural and functional genomics. Nature 2012 490:116-20.

Grontved L, Bandle R, John S, Baek S, Chung H-J, Liu Y, Aguilera G, Oberholtzer C, Hager GL, Levens D: Rapid genome-scale mapping of chromatin accessibility in tissue. Epigenetics Chromatin 2012 Jun 26;5(1):10.

2011

Ngo VN, Young RM, Schmitz R, Jhavar S, Xiao W, Lim KH, Kohlhammer H, Xu W, Yang Y, Zhao H, Shaffer AL, Romesser P, Wright G, Powell J, Rosenwald A, Muller-Hermelink HK, Ott G, Gascoyne RD, Connors JM, Rimsza LM, Campo E, Jaffe ES, Delabie J, Smeland EB, Fisher RI , Braziel RM, Tubbs RR, Cook JR, Weisenburger DD, Chan WC, Staudt LM. Oncogenically active MYD88 mutations in human lymphoma. Nature 2011 470:115-119.

Martin MM, Ryan M, Kim R, Zakas AL, Fu H, Lin CM, Reinhold WC, Davis SR, Bilke S, Liu H, Doroshow JH, Reimers MA, Valenzuela MS, Pommier Y, Meltzer PS, Aladjem MI. Genome-wide depletion of replication initiation events in highly transcribed regions. Genome Research 21: 1822-1832. 2011.

CCR Sequencing Facility Presented Posters

Monika Mehta, Parimal Kumar, Vicky Chen, John Bettridge, Yongmei Zhao, Jyoti Shetty, Bao Tran. Single Cell Sequencing at CCR-Sequencing Facility. Molecular Biology in Single Cells Symposium, NCI, April 2018 & NCI Frederick Spring Research Festival, May 2018.

Keyur Talsania, Jack Chen, Tsai-wei Shen, Vicky Chen, Bao Tran, Jack Collins, Yongmei Zhao. Data Analysis for Genome Assembly and Structural Variant Detection. National Interagency Confederation for Biological Research Spring Research Festival at Fort Detrick and the National Cancer Institute, May 2018.

Vicky Chen, Tsai-wei Shen, Keyur Talsania, John Bettridge, Monika Mehta, Michael Kelly, Xiaolin Wu, Bao Tran, Jack Collins, Yongmei Zhao. High throughput Single Cell Transcriptome Sequencing Data Analysis. NIH Single Cell Symposium, April 2018.

Jack Chen, Oksana German, Sujatha Gowda, Yuliya Kriga, Christopher Hautman, Yelena Levin, Monika Mehta, Castle Raley, Jyoti Shetty, Tatyana Smirnova, Heidi Smith, Keyur Talsania, Vicky Chen, Tsai-wei Shen, Yongmei Zhao and Bao Tran. Innovative Sequencing Resources in the CCR Sequencing Facility. March 2018.

Wenming Xiao, Yongmei Zhao. A comprehensive investigation of factors impacting the accuracy of mutation detection using next-generation sequencing technology. 18-A-4219-AACR 2018.

Monika Mehta, Yongmei Zhao, Keyur Talsania, Ashley Walton, Yelena Levin, Jyoti Shetty, Elizabeth Gillanders, Bao Tran, Danielle Carrick. RNA Sequencing from Archived FFPE Tissues. AGBT Meeting, Feb 2018.

Yongmei Zhao, Keyur Talsania, Castle Raley, Monika Mehta, Jyoti Shetty, Yuliya Kriga, Sujatha Gowda, Jack Chen, Carissa Grose, Matthew Drew, Veronica Roberts, Kwong Tai Cheng, Sandra Burkett, Steffen Oeser, Robert Stephens, Daniel Soppet, Jack Collins, Bao Tran, Dominic Esposito. Draft Genome Assembly and Annotation of the Trichoplusia ni Insect Cell Line Tni-FNL. AGBT Conference 2018.

Cristobal Vera, Keyur Talsania, Ashley Walton, Sucheta Godbole, Bao Tran, Jack Collins, Yongmei Zhao. Data Analysis for Structural Variation Detection and Genome Assembly. National Interagency Confederation for Biological Research Spring Research Festival, May 2017.

Keyur Talsania, Sucheta Godbole, Ashley Walton, J. Cristobal Vera, Bao Tran, Jack Collins, Yongmei Zhao. Data Analysis Pipelines for Transcriptome Sequence Analysis. National Interagency Confederation for Biological Research Spring Research Festival, May 2017.

Monika Mehta, Yongmei Zhao, Jyoti Shetty, Castle Raley, Bao Tran. New Advancements in Next-Generation Sequencing Approaches to Address a Variety of Biological Questions. Advances in Genome Biology and Technology (AGBT) Meeting, Feb 2017.

Keyur Talsania, Sucheta Godbole, J. Cristobal Vera, Thomas Skelly, Jack Chen, Robert Stephens, Jack Collins, Bao Tran, Yongmei Zhao. Bioinformatics Support for Next-Generation Sequencing and Data Analysis at CCR-SF. National Interagency Confederation for Biological Research Spring Research Festival, May 2016.

Brenda Ho, Ashley Walton, Monika Mehta. Analysis of Illumina library preparation protocols for NGS analysis of FFPE RNA samples in cancer research. NIH Summer Intern Poster Day at NIH Bethesda campus. July 29, 2016.

Monika Mehta, Castle Raley, Yongmei Zhao, Jyoti Shetty, Bao Tran. New Advances In Studying Cellular RNA By Next-generation Sequencing. Presented at: CCR RNA Biology Workshop at NCI Shady Grove. November 1, 2016.

CCR Sequencing Facility Presented Posters and Seminars

SF seminar Oct 28, 2020 (presentations and video recording)

FAQ

General Questions

What services does the Sequencing Facility provide?
Who can order services through the Sequencing Facility?
How do I submit a sequencing request?
What happens after my sample is submitted?

Illumina Questions

How do I submit samples for Illumina sequencing?
What are the requirements for submitted samples for Illumina sequencing?

PacBio Questions

How do I submit samples for PacBio sequencing?
What are the requirements for submitted samples for PacBio sequencing?
What happens during PacBio library preparation?
What are PacBio HiFi and CCS reads?
What is the estimated output for PacBio sequencing?
What can I sequence on one SMRT Cell 8M?
How can I extract HMW DNA?
How can I perform target enrichment for PacBio Iso-seq?

Oxford Nanopore Technologies (ONT) Questions

How do I submit samples for ONT sequencing?
What happens after my sample is submitted?
What are the requirements for submitted samples for ONT sequencing?
How can I extract HMW DNA?
What is the estimated output for ONT sequencing?

Bionano Questions

How do I submit samples for Bionano optical mapping?
What happens after my sample is submitted?
What are the requirements for submitted samples for Bionano optical mapping?
What is the recommended coverage for Bionano optical mapping?

Single Cell Questions

What is the sample processing workflow for single cell projects?
What is Single Cell RNA-Seq?
Do dead cells impact the data quality?
How many cells do I need to provide?
How many cells can I expect to get information for?
How many reads do I need for my experiment?
What 10X applications do you support?
What buffers should I use to resuspend my cells on the day of submission?
How should I prepare and send my samples?
What is the cost per sample?
What is included in the price?
What is the Turn Around Time for your single cell core?
How should I schedule the experiment?

Answers for General Questions

What services does the Sequencing Facility provide?

Please see the services page for a detailed list of projects we support. If your project design is not listed, please contact ccrsfhelp@mail.nih.gov or the Sequencing Facility director to discuss the feasibility of a custom project.

Who can order services through the Sequencing Facility?

All NIH research labs are eligible to order services through the Sequencing Facility. Labs outside of CCR and NIAID will have overhead charges added.

How do I submit a sequencing request?

Please complete a sequencing proposal form at NAS request. There will be an option for Illumina, which you should select for all short read and single-cell projects, and Long Read, which you should select for all PacBio, ONT and Bionano projects. You may also contact ccrsfhelp@mail.nih.gov to discuss the available platforms and best choice for your project.

What happens after my sample is submitted?

Before sequencing, we will perform an internal QC to confirm the information in the sample manifest and notify you if any samples do not meet the minimum sequencing requirements. You will then be able to choose whether to resubmit those samples or to continue and sequence them at your own risk. You will be notified again when the analysis of each sample is completed and available for download.

Answers for Illumina Questions

How do I submit samples for Illumina sequencing?

Before submitting samples, ensure that the sequencing project has been discussed with the Sequencing Facility team and the NAS request submitted. Illumina service is listed under Sequencing Facility – Illumina (CCR).

You may then submit samples by delivering them to ATRF Room D3040 (instructions for sample delivery). Be sure to include a sample manifest form with your submission and to send an electronic version of the form to Jyoti Shetty before sending your samples.

What are the requirements for submitted samples for Illumina sequencing?

Sample Quantity/Quality Requirements and Recommendations:

All samples must be shipped on dry ice, and the individual tubes (1.5-2 ml) must be clearly labeled.

Type of library	Minimum DNA/RNA Requirement for Library Construction	Recommended DNA/RNA for Optimal Library Construction	Maximum Sample Volume Requirement for Library Construction	Additional requirements
ChIP DNA Sequencing	5 ng	10 ng	30 µL	Bulk of the DNA fragments in the 100-300 bp range
gDNA Sequencing	100 ng	1 µg	30 µL	DNA should be as intact as possible with no contamination, OD260/280 1.8–2.0
mRNA Sequencing	25 ng	1 µg	30 µL	RIN should be at least 8.0, DNase treated
mRNA ultralow	100 pg	10 ng	10 µL	RIN should be at least 8.0, DNase treated
microRNA Sequencing	100 ng	1 µg	6 µL
Total RNA Sequencing	10 ng	1 µg	10 µL	DNase treated, FFPE and degraded RNA can be used; DV200 < 30% not recommended

You can use any extraction protocol as long as the DNA/RNA samples meet our sample requirements.
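As a rough illustration of how the table above can be used before submission, the sketch below compares a measured sample amount against the minimum and recommended inputs. The amounts are copied from the table; the function name `classify_input` and the three labels it returns are hypothetical and are not part of any facility tool.

```python
# Minimum and recommended input per library type, in nanograms,
# copied from the Illumina requirements table (100 pg = 0.1 ng).
ILLUMINA_INPUT_NG = {
    "ChIP DNA Sequencing": (5.0, 10.0),
    "gDNA Sequencing": (100.0, 1000.0),
    "mRNA Sequencing": (25.0, 1000.0),
    "mRNA ultralow": (0.1, 10.0),
    "microRNA Sequencing": (100.0, 1000.0),
    "Total RNA Sequencing": (10.0, 1000.0),
}


def classify_input(library_type: str, amount_ng: float) -> str:
    """Hypothetical helper: label an amount relative to the table thresholds."""
    minimum, recommended = ILLUMINA_INPUT_NG[library_type]
    if amount_ng < minimum:
        return "below minimum"
    if amount_ng < recommended:
        return "acceptable"
    return "optimal"


print(classify_input("mRNA Sequencing", 50))  # acceptable
```

This only covers quantity; the volume limits and the quality criteria in the last column (RIN, DV200, fragment size) still need to be checked separately.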

Requirements for ATAC-seq from frozen cells:

Bulk ATAC-seq requires replicates for each sample; triplicates are preferred.
Cells for ATAC-seq should be cryopreserved at high viability. Please ensure that the cells are cryopreserved properly in freezing medium.
Please send a minimum of 2 million cells per sample.
Please ship the cells on dry ice.

Answers for PacBio Questions

How do I submit samples for PacBio sequencing?

Before submitting samples, ensure that the sequencing project has been discussed with the Sequencing Facility team and the NAS request submitted. PacBio service is listed under Sequencing Facility – Long Read Technology (CCR).

You may then submit samples by delivering them to ATRF Room D3040 (instructions for sample delivery). Be sure to include a sample manifest form with your submission and to send an electronic version of the form to caroline.fromont@nih.gov before sending your samples.

What are the requirements for submitted samples for PacBio sequencing?

All samples must be sent in 1.5 ml or 2 ml tubes.

Some requirements, such as the DNA input needed when multiplexing or sequencing on multiple SMRT Cells, may be project dependent. Please contact us for more details.

Quality and quantity requirements are listed in the table below:

Type of Library	Minimum DNA/RNA Quantity Requirement	Recommended DNA/RNA Quantity Requirement	Minimum Concentration Requirement	Quality Requirements
WGS	1.5 µg	5 µg	n/a	OD260/280:1.8-2.0 OD260/230:1.7-2.2
Amplicons (< 5000 bp)	200 ng	500 ng	n/a	OD260/280:1.8-2.0 OD260/230:1.7-2.2
Amplicons (> 5000 bp)	300 ng	800 ng	n/a	OD260/280:1.8-2.0 OD260/230:1.7-2.2
HLA (Class I)	250 ng	1 µg	20 ng/µL	OD260/280:1.8-2.0 OD260/230:1.7-2.2
16S	2 ng	10 ng	500 pg/µL	OD260/280:1.8-2.0 OD260/230:1.7-2.2
MAS Single Cell	15 ng	50 ng	1 ng/µL	OD260/280:1.8-2.0 OD260/230:1.7-2.2
WTS	300 ng	1 µg	50 ng/µL	RIN ≥ 8.0
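The purity ranges in the table can be checked with a few lines of code. The sketch below is only an illustration: the function name `purity_ok` is hypothetical, and it applies to the DNA rows of the table (the WTS row uses RIN instead of OD ratios).

```python
def purity_ok(od260_280: float, od260_230: float) -> bool:
    """Hypothetical helper: True when both absorbance ratios fall in the table's ranges.

    Ranges taken from the table: OD260/280 1.8-2.0 and OD260/230 1.7-2.2.
    """
    return 1.8 <= od260_280 <= 2.0 and 1.7 <= od260_230 <= 2.2


print(purity_ok(1.85, 2.10))  # True
print(purity_ok(1.60, 2.10))  # False
```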

What happens during PacBio library preparation?

After initial sample QC, we proceed with library preparation. Depending on the project, the samples are handled differently before PacBio library preparation. For amplicon samples, we first perform an AMPure bead clean-up, which also allows us to concentrate the samples if necessary. For gDNA samples, we shear the samples to the target size required by the project and perform an AMPure bead clean-up to concentrate them. For WTS, we generate cDNA using poly(dT) primers and a template-switching oligo (TSO), which allows us to target full-length transcripts with a poly(A) tail. During PacBio library preparation, fragments undergo damage repair, end-repair/A-tailing and adapter ligation. The hairpin adapters, once ligated to double-stranded DNA, form the circular molecule required for PacBio sequencing. Barcoded hairpin adapters are also available if the project requires pooling of multiple samples. The libraries are then cleaned using AMPure beads, and we perform a QC before setting up a sequencing run.

What are PacBio HiFi and CCS reads?

CCS stands for Circular Consensus Sequencing. CCS reads are produced for sequencing libraries with insert sizes shorter than 25 kb. In CCS mode, the circular template (dsDNA with hairpin adapters) generated during library preparation is read multiple times, producing numerous read passes (subreads). Those subreads are then used to call a consensus sequence and generate highly accurate reads. Four passes of the molecule usually yield Q20 data, while 8 passes should yield Q30 data. HiFi reads are CCS reads with a quality above Q20.

For a quick explanation of SMRT sequencing, please watch the following PacBio video: https://youtu.be/_lD8JyAbwEo

More information is available on the PacBio website: https://www.pacb.com/technology/hifi-sequencing/
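The Q20 and Q30 values mentioned in the CCS answer are Phred quality scores, where Q = -10 × log10(error probability). The short sketch below converts a Q value to the corresponding per-base accuracy; the function name is chosen for illustration only.

```python
def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score to per-base accuracy (1 - error probability)."""
    return 1.0 - 10 ** (-q / 10.0)


for q in (20, 30):
    print(f"Q{q}: {phred_to_accuracy(q):.3%} accuracy")
```

Under this definition, Q20 corresponds to 1 error in 100 bases (99% accuracy) and Q30 to 1 error in 1,000 bases (99.9% accuracy).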

What is the estimated output for PacBio sequencing?

Please note that these are estimates only: both library type and insert size influence the output, and the figures are subject to change. The Sequel II is estimated to produce 3-4.5 million raw reads. For RNA Iso-Seq libraries, you can expect to get 3-4 million CCS reads. For WGS libraries, you can expect to get about 20-30 Gb of HiFi reads.

What can I sequence on one SMRT Cell 8M?

According to PacBio, one SMRT Cell is enough to sequence a genome up to 2 Gb and a whole transcriptome, detect structural variants in up to 2 samples of ~3 Gb genome, and multiplex numerous amplicons. For variant detection (single nucleotides, indels and structural variants) in a ~3 Gb genome, using at least 2 SMRT Cells is recommended.

See “What can you do with one SMRT cell?” for more information.
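To relate these numbers to sequencing depth, mean coverage can be approximated as total yield divided by genome size. The sketch below uses the 20-30 Gb HiFi estimate for WGS and a ~3 Gb genome quoted in this FAQ; the function name and the simple division (which ignores unmapped or filtered reads) are simplifying assumptions.

```python
def mean_coverage(yield_gb: float, genome_size_gb: float, cells: int = 1) -> float:
    """Approximate mean coverage: total sequenced bases divided by genome size."""
    return yield_gb * cells / genome_size_gb


# One SMRT Cell on a ~3 Gb genome, at the low and high ends of the estimate.
print(round(mean_coverage(20, 3), 1))  # 6.7
print(round(mean_coverage(30, 3), 1))  # 10.0

# Two SMRT Cells, as recommended for variant detection in a ~3 Gb genome.
print(round(mean_coverage(30, 3, cells=2), 1))  # 20.0
```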

How can I extract HMW DNA?

PacBio has released a list of HMW DNA extraction protocols and QC methods, which can be found in the DNA preparation technical note.

How can I perform target enrichment for PacBio Iso-seq?

If you are interested in long-read isoform sequencing but are focused on only one or a few genes, you may consider a target enrichment protocol. This protocol relies on hybridization of biotinylated probes to your cDNA target of interest and subsequent pulldown with streptavidin beads. The enriched cDNA is then amplified and prepared for PacBio sequencing. To design probes for your project, please contact IDT at NGSDesign@idtdna.com or fill out a probe design request at https://go.idtdna.com/Request-consult-NGS-xGen-Custom-Hyb-Panel. To complete your sequencing request, you will need to submit your probe panel in addition to your RNA samples. Please contact us for more details.

Answers for Oxford Nanopore Technologies (ONT) Questions

How do I submit samples for ONT sequencing?

Before submitting samples, ensure that the sequencing project has been discussed with the Sequencing Facility team and the NAS request submitted. ONT service is listed under Sequencing Facility – Long Read Technology (CCR).

What happens after my sample is submitted?

What are the requirements for submitted samples for ONT sequencing?

All samples, except ONT Ultralong, must be sent in 1.5 or 2 mL tubes on dry ice. For ONT Ultralong, send the cells as a frozen pellet or cryopreserved vial on dry ice.

Quality and quantity requirements are listed in the table below:

Type of library	Minimum DNA/RNA Requirement for Library Construction	Recommended DNA/RNA for Optimal Library Construction	Maximum Sample Volume Requirement for Library Construction	Additional requirements
WGS	1 µg	4 µg	48 µL	HMW DNA
WGS Adaptive Sampling	2 µg	5 µg	48 µL	HMW DNA
WGS Ultralong	6 million human cells or the cell number equivalent to 40 µg of DNA	n/a	n/a	n/a
Direct RNA Sequencing	50 ng of poly(A) tailed or 500 ng total RNA	150 ng of poly(A) tailed or 1 µg total RNA	9 µL	RIN ≥ 8.0

How can I extract HMW DNA?

We recommend the HMW Circulomics kit (NB-900-001-01) for DNA extraction.

What is the estimated output for ONT sequencing?

GridION: up to 50 Gb per flow cell
PromethION 2 Solo: up to 200 Gb per flow cell

Answers for Bionano Questions

How do I submit samples for Bionano optical mapping?

Before submitting samples, ensure that the sequencing project has been discussed with the Sequencing Facility team and the NAS request submitted. Bionano service is listed under Sequencing Facility – Long Read Technology (CCR).

What happens after my sample is submitted?

What are the requirements for submitted samples for Bionano optical mapping?

Please send the cells as a frozen pellet (1.5-2 million cells/sample).
Please use the Bionano DNA stabilizer buffer to prepare frozen pellets, as specified in the recommended protocol for preparing frozen cell pellets.
Please ship the cells on dry ice.

What is the recommended coverage for Bionano optical mapping?

Answers for Single Cell Questions

What is the sample processing workflow for single cell projects?

Upon receiving a single-cell suspension, we check the quality of the cells and wash them a couple of times with PBS+BSA before loading them onto a microfluidic chip, where single cells are captured in nanoliter-sized droplets along with barcoded beads. Reverse transcription (RT) takes place inside the droplets; we then break the droplets and PCR-amplify the cDNA in bulk. Next, we purify the cDNA and check its quality. Generally, the quality of the cDNA correlates very well with the final sequencing results. Finally, we make Illumina-compatible libraries from the cDNA and sequence them on a sequencer.

What is Single Cell RNA-Seq?

Single-Cell RNA-Seq provides transcriptional profiling of thousands of individual cells. This level of throughput analysis enables researchers to understand at the single-cell level what genes are expressed, in what quantities, and how they differ across thousands of cells within a heterogeneous sample.

Do dead cells impact the data quality?

10X Genomics Single Cell Protocols require suspensions of viable single cells as input (90% viability is optimal; 70-90% is acceptable). Dead cells lyse easily, releasing ambient RNA. This cell-free RNA can contribute to the background noise of the assay and will compromise the quality of single cell data.

Clusters of dying cells typically have relatively higher levels of mitochondrial expression, lower gene counts, and more ambiguous cell type identification scores that equally or comparably match multiple major cell types.

How many cells do I need to provide?

We recommend a concentration above 1×10⁶ cells/mL (minimum 200,000 cells/mL) so that we can load 10K live cells for non-hashing experiments and 20K-30K live cells for cell hashing experiments.

How many cells can I expect to get information for?

The capture rate of 10X is approximately 60%, depending on cell type and cell quality. When you load 10K (70-90% live) cells, you will capture around 6K cells.
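The estimate above is a simple multiplication, sketched below. The 60% capture rate is the approximate figure quoted in this answer and varies with cell type and quality; the function name is hypothetical.

```python
def expected_captured_cells(loaded_cells: int, capture_rate: float = 0.60) -> int:
    """Estimate the number of captured cells from the number loaded."""
    return round(loaded_cells * capture_rate)


print(expected_captured_cells(10_000))  # 6000
print(expected_captured_cells(20_000))  # 12000
```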

How many reads do I need for my experiment?

We aim to provide 20K-50K reads/cell for gene expression libraries, 10K reads/cell for V(D)J enriched samples, 5K-10K reads/cell for CITE-Seq libraries, 50K reads/cell for single cell ATAC libraries, and 20K-50K reads/cell for single cell multiome libraries, as recommended by 10X Genomics.
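To turn the per-cell targets into a per-sample sequencing depth, multiply by the number of captured cells. The sketch below uses the reads/cell ranges listed in this answer; the dictionary keys, the function name and the 6,000-cell example (10K cells loaded at roughly 60% capture) are illustrative assumptions.

```python
# Target reads per cell (low, high) for each library type, from the answer above.
READS_PER_CELL = {
    "gene expression": (20_000, 50_000),
    "V(D)J": (10_000, 10_000),
    "CITE-Seq": (5_000, 10_000),
    "single cell ATAC": (50_000, 50_000),
    "single cell multiome": (20_000, 50_000),
}


def reads_per_sample(library: str, captured_cells: int) -> tuple[int, int]:
    """Return the (low, high) total read targets for one sample."""
    low, high = READS_PER_CELL[library]
    return captured_cells * low, captured_cells * high


print(reads_per_sample("gene expression", 6_000))  # (120000000, 300000000)
```

For a gene expression library with about 6,000 captured cells, this gives roughly 120-300 million reads per sample.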

What 10X applications do you support?

We support the following 10X assays: single cell gene expression (3’ and 5’), single cell immune profiling (V(D)J, TCR, BCR), single cell multiome ATAC + gene expression, and single cell ATAC. We also support the Mission Bio Tapestri single cell targeted DNA application.

What buffers should I use to resuspend my cells on the day of submission?

10X Genomics recommends using 1X PBS (calcium- and magnesium-free) containing 0.04% weight/volume BSA (400 μg/ml) for washing and resuspension. It is also possible to use most cell culture media with up to 10% FBS or up to 2% BSA to maintain cell health with little to no adverse downstream effects. Media should not contain excessive amounts of EDTA (> 0.1 mM) or magnesium (> 3 mM), as those components will inhibit the reverse transcription reaction. Any surfactants (Tween-20, etc.) should also be avoided, as they may interfere with GEM generation.
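The buffer limits above can be checked with a small script. The sketch below converts a weight/volume percentage to μg/ml (1% w/v = 1 g per 100 ml = 10,000 μg/ml) and flags the components named in this answer; the function names and warning messages are hypothetical.

```python
def percent_wv_to_ug_per_ml(percent_wv: float) -> float:
    """Convert % weight/volume to micrograms per milliliter (1% w/v = 10,000 ug/ml)."""
    return percent_wv * 10_000


def buffer_warnings(edta_mm: float, magnesium_mm: float, has_surfactant: bool) -> list[str]:
    """Hypothetical check against the limits quoted above (EDTA > 0.1 mM, Mg > 3 mM)."""
    warnings = []
    if edta_mm > 0.1:
        warnings.append("EDTA above 0.1 mM may inhibit reverse transcription")
    if magnesium_mm > 3:
        warnings.append("magnesium above 3 mM may inhibit reverse transcription")
    if has_surfactant:
        warnings.append("surfactants may interfere with GEM generation")
    return warnings


print(round(percent_wv_to_ug_per_ml(0.04)))  # 400
print(buffer_warnings(edta_mm=0.5, magnesium_mm=1.0, has_surfactant=False))
```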

How should I prepare and send my samples?

We accept fresh and cryopreserved cells. Please use 10X Genomics recommended or supported cryopreservation protocols (human/mouse) for your cells, and follow the 10X Genomics full cell preparation guide.

Fresh samples need to be at CCR-SF within 1-1.5 hours after dissociation and must be loaded onto the 10X machine as soon as possible. Please bring your samples before 2pm!

Cell parameters: Recommended cell number: > 1×10⁶ cells/mL. Minimum 200,000 cells/mL
Viability: Recommended > 90%. Acceptable 70%-90%.
Container: Please use 15mL conical Falcon tubes or 2 mL Eppendorf tubes
Medium: You can hand over your dissociated cells in cell culture medium (up to 10% FBS or up to 2% BSA) or in 1X PBS/0.04% BSA.
Labeling: Please label tubes clearly and use permanent markers or labels. Always label your tubes on the lid and the side. Please use short, unambiguous names (e.g., CTRL, IFN1).
Temperature: Please deliver your fresh cells on ice. Please ship your cryopreserved samples on dry ice.

What is the cost per sample?

All the information about pricing is listed on our website.

What is included in the price?

We provide a full service, which includes administrative services, consultations, advice on experimental design, 10X Genomics reagents, sample QC prior to loading, cDNA QC, post-library-generation QC, and primary bioinformatic analysis (using the Cell Ranger pipeline).

What is the Turn Around Time for your single cell core?

It takes about 6-8 weeks from sample submission to data delivery, which follows the Cell Ranger pipeline run. Turnaround time can increase for large projects (> 48 samples) and close to the end of the fiscal year.

How should I schedule the experiment?

Please email our scientific team. They will reply to you and may arrange a short meeting to discuss the project. Experiments should be scheduled at least 2 weeks in advance. When your samples are ready, you will need to submit a NAS request listed under Sequencing Facility, Illumina (CCR). Please fill out a sample manifest form and send it back to us.

Announcement:

FNL Couriers no longer accept pickup/delivery requests via phone or email. All FNL Courier pickups from Bethesda to Frederick must be submitted through the Ship Wizard system. All shipments transported on the highway (FNL Courier pickups) must be classified as Hazardous or Non-Hazardous, and the only way to do so is to submit a Request for Shipment in the Ship Wizard system.

Ship Wizard Link: https://ncifrederick.cancer.gov/Cad/ShippingWizard

This is a free service, but you need to enter your PID number when filling out the form. If you do not know your PID, check with your AO; it is also listed in your NAS request.

Contacts

Bao Tran

Director, Sequencing Facility

Jyoti Shetty

Illumina Lab Manager

Yongmei Zhao

Bioinformatics Manager

Oksana German

Illumina QA Specialist

Yunlong He

PacBio Lab Manager / R&D Scientist

Juanma Caravaca

ONT Lab Manager / R&D Scientist
