Frederick, MD
Core Facility
The function of the SAIP is to collaborate with NCI investigators in developing mouse models and new molecular imaging probes for early detection and therapy, monitoring tumors in vivo, and performing drug efficacy studies Read More...
Web Page
Bioinformatics
We will need to use a manifest file to import. See the Import tutorial. Note: The manifest file can be comma-separated, depending on the format that you use at import, despite what is written Read More...
Web Page
Bioinformatics
We can create a "for" loop to perform iterative actions in Unix. The commands can be written all on one line or on separate lines ("i" can be any variable name). These steps can be saved Read More...
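As a minimal sketch of the loop described in this snippet, the following shows the multi-line and one-line forms side by side. The sample names and the `count` variable are hypothetical and only serve the illustration; they are not taken from the lesson.

```shell
#!/usr/bin/env bash
# Multi-line form: "i" is the loop variable and could have any name.
count=0
for i in sample1 sample2 sample3; do
    echo "Processing ${i}"
    count=$((count + 1))   # track how many items were visited
done

# One-line form: the same loop, with commands separated by semicolons.
for i in sample1 sample2 sample3; do echo "Processing ${i}"; done
```

Either form can be saved in a script file and run again later.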
Web Page
Bioinformatics
Before jumping into submitting scripts in job files, let's first focus on how to run R from the command line. The primary way to run R from the command line is to call Rscript. Read More...
Web Page
Bioinformatics
We can create a "for" loop to perform iterative actions in Unix. The commands can be written all on one line or on separate lines (i can be any variable name). These steps can be saved Read More...
Web Page
Bioinformatics
How do we navigate this directory tree? We use cd, which means "change directory". Let's change directory to our data directory, which is the larger of the two allocations we are allotted Read More...
Web Page
Bioinformatics
We can also take advantage of swarm for running jobs in parallel, as we did in the previous lesson. Note This should be run from Biowulf, not Helix. cd .. mkdir swarm cd swarm We can Read More...
Web Page
Bioinformatics
Solution: qiime demux summarize \ --i-data 01_import/import.qza \ --o-visualization 01_import/import.qzv Again, to view this file, you will need to move it to public. Note: It is easier to create the Read More...
Web Page
Bioinformatics
The "parallel" tool executes commands in "parallel", one for each CPU core in your system. See Tool: Gnu Parallel - Parallelize Serial Command Line Programs Without Changing Them cat runids.txt | Read More...
Web Page
Bioinformatics
GNU parallel executes commands in "parallel", one for each CPU core on your system. It can serve as a nice replacement for the for loop. See Tool: Gnu Parallel - Parallelize Serial Command Read More...
Web Page
Bioinformatics
Now that we have accession numbers to work with, let's use parallel and fastq-dump to download the data. GNU parallel executes commands in "parallel", one for each CPU core on your system. Read More...
Web Page
Bioinformatics
Now that we have accession numbers to work with, let's use parallel and fastq-dump to download the data. GNU parallel is a shell tool for executing jobs in parallel using one or more computers. Read More...
Web Page
Bioinformatics
To retrieve all the files at once, you can create a swarm job. nano set.swarm Copy the following and paste into the swarm file. #SWARM --threads-per-process 6 #SWARM --gb-per-process 4 #SWARM --gres=lscratch:20 #SWARM --module sratoolkit # Read More...
Web Page
Bioinformatics
Practice Lesson 2 For the help sessions, we will work on processing sequences generated in Zhang Z, Feng Q, Li M, Li Z, Xu Q, Pan X, Chen W. Age-Related Cancer-Associated Microbiota Potentially Promotes Oral Squamous Read More...
Web Page
Bioinformatics
To retrieve all the files at once, you can create a swarm job. nano set.swarm Inside the swarm file type #SWARM --threads-per-process 3 #SWARM --gb-per-process 1 #SWARM --gres=lscratch:10 #SWARM --module sratoolkit fasterq-dump -t /lscratch/$SLURM_ Read More...
Web Page
Bioinformatics
To help us align all of the FASTQ files in one go, we should create, in the reads directory, a file with the sample ID names for the Golden Snidget. First, change into the ~/biostar_ Read More...
Web Page
Bioinformatics
For this exercise, change back into the data directory. cd /data/username Make a directory called text_files_and_tabular_data. mkdir text_files_and_tabular_data cd text_files_and_tabular_data The touch Read More...
Web Page
Bioinformatics
cat hcc1395_sample_ids.txt | parallel "bowtie2 -x references/22 -1 reads/{}_R1.fq -2 reads/{}_R2.fq -S hcc1395_bowtie2/{}.sam" Change into hcc1395_bowtie2 and remove hcc1395 from the SAM alignment outputs. Read More...
Web Page
Bioinformatics
For today's practice, we are going to embark on a Unix treasure hunt created by the Sanders Lab at the University of California San Francisco. Note: the treasure hunt materials can be obtained directly Read More...
Web Page
Bioinformatics
Lesson 4 Practice For today's practice, we are going to embark on a Unix treasure hunt created by the Sanders Lab at the University of California San Francisco. Note: the treasure hunt materials can be Read More...
Web Page
Bioinformatics
To align FASTQ files for one sample, we construct the HISAT2 command with the following options. The "-x" flag prompts us to enter the base name (i.e., without the extension) of the genome index. The Read More...
Web Page
Bioinformatics
We will build a database out of all features of the 2014 Ebola genome under accession number KM233118. This data will go into a new directory named "db_2014". mkdir -p db_2014 # Get the 2014 Ebola Read More...
Web Page
Bioinformatics
Next, we need to generate the counts (i.e., the number of reads that map to a transcript). But first, change back into the ~/biostar_class/snidget folder and then take a moment to think about how Read More...
Web Page
Bioinformatics
Because we are now creating different folders that store results from various stages of our data analysis, we could set up some environment variables for these so we can more easily reference these folders while Read More...
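The idea in this snippet can be sketched as follows. This is only an illustration under stated assumptions: the folder names (`qc`, `alignments`), the variable names (`qc_dir`, `align_dir`), and the temporary project directory are hypothetical and do not come from the lesson.

```shell
#!/usr/bin/env bash
# Hypothetical project layout created under a temporary directory,
# so the sketch can run anywhere without touching real data.
project_dir=$(mktemp -d)
mkdir -p "${project_dir}/qc" "${project_dir}/alignments"

# Store each results folder in a variable (names are illustrative).
export qc_dir="${project_dir}/qc"
export align_dir="${project_dir}/alignments"

# Reference the folders by variable instead of typing full paths.
touch "${qc_dir}/report.txt"
cd "${align_dir}"
pwd
ls "${qc_dir}"
```

Variables set this way last only for the current shell session unless they are also added to a startup file.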
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Review: * downloading data from SRA * decompressing tar files * e-utilities * fastq-dump Learn: * sra-stat * XML format * automating SRA downloads * working with comma-separated values (csv) format * Read More...
Web Page
Bioinformatics
Lesson 13: Aligning raw sequences to reference genome Before getting started, remember to be signed on to the DNAnexus GOLD environment. Lesson 11 Review In Lesson 11 we learned to aggregate multiple FASTQC reports into one using MultiQC, Read More...
Web Page
Bioinformatics
Lesson 6: sra-tools, e-utilities, and parallel This page uses some content directly from the Biostar Handbook by Istvan Albert. Lesson 5 Review: The majority of computational tasks on Biowulf should be submitted as jobs: sbatch or swarm Read More...
Web Page
Bioinformatics
Lesson 6: Downloading data from the SRA For this lesson, you will need to login to the GOLD environment on DNAnexus. Lesson 5 Review: The majority of computational tasks on Biowulf should be submitted as jobs: sbatch Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Remember to activate the bioinformatics environment and create a directory for today's work. conda activate bioinfo mkdir blast cd blast What is Read More...
Web Page
Bioinformatics
Lesson 16 Practice Objectives In this lesson, we learned about the classification-based approach for RNA sequencing analysis. In this approach, we align our raw sequencing reads to a reference transcriptome rather than a genome. Read More...
Web Page
Bioinformatics
More useful Unix Flags and command options - making programs do what they do Use of wildcards Using tab complete for less typing Access your history with the "up" and "down" Read More...
Web Page
Bioinformatics
R scripts can be run from the command line with command line arguments. Here is a great resource from software carpentry explaining command line arguments. To use command line arguments with an R script, we Read More...
Web Page
Bioinformatics
Lesson 4: Submitting R Scripts via command line Learning Objectives Learn how to use R with less interaction Learn how to deploy sbatch R jobs, and learn about alternatives such as swarm. Learn about R job Read More...
Web Page
Bioinformatics
Mac users will need to open a Terminal window. Then change into the local Downloads folder which should be /Users/username/Downloads (where username should be your NIH username). cd /Users/username/Downloads If you Read More...
Web Page
Bioinformatics
Author: Stephan Sanders, PhD (UCSF) For today's practice, we are going to embark on a Unix treasure hunt created by the Sanders Lab at the University of California San Francisco. Note: the treasure hunt Read More...
Web Page
Bioinformatics
Author: Stephan Sanders, PhD (UCSF) In this exercise, we are going to embark on a Unix treasure hunt created by the Sanders Lab at the University of California San Francisco. Note: the treasure hunt materials Read More...
Web Page
Bioinformatics
Lesson 4: Useful Unix For this lesson, you will need to login to the GOLD environment on DNAnexus. Lesson 3 Review Biowulf is the high performance computing cluster at NIH. When you apply for a Biowulf account Read More...
Web Page
Bioinformatics
This page contains content directly from The Biostar Handbook. Always remember to start the bioinformatics environment. conda activate bioinfo Pseudoalignment-based methods identify locations in the genome using patterns rather than via alignment type algorithms. It Read More...
Web Page
Bioinformatics
Lesson 13 Practice Objectives In this lesson we learned how to align raw sequencing reads to a reference and to process alignment results for downstream analysis. Here, we will test our knowledge by continuing with the Golden Read More...
Web Page
Bioinformatics
Here, let's change back into the ~/biostar_class/hbr_uhr/hbr_uhr_hisat2 folder. cd $hbr_uhr_hisat2 To align FASTQ files for one sample, we construct the HISAT2 command with the following options Read More...
Web Page
Bioinformatics
Before we can align the HBR and UHR raw sequencing data to the human chromosome 22 transcriptome, we need to create an index of this transcriptome (like we did with the genome). This will make the alignment Read More...
Web Page
Bioinformatics
“Gene set enrichment analysis” refers to the process of discovering the common characteristics potentially present in a list of genes. When these characteristics are GO terms, the process is called “functional enrichment.” Warning Overall GO Read More...
Web Page
Bioinformatics
Lesson 5: Working on Biowulf Lesson 4 Review Flags and command options Wildcards (*) Tab complete Accessing user history with the "up" and "down" arrows cat, head, and tail Working with file content (input, Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Obtain RNA-seq test data. The test data consists of two commercially available RNA samples: Universal Human Reference (UHR) and Human Brain Reference (HBR). Read More...
Web Page
Bioinformatics
Lesson 16: RNA sequencing review and classification based analysis Before getting started, remember to be signed on to the DNAnexus GOLD environment. Review In the previous classes, we learned about the steps involved in RNA sequencing Read More...