Web Page
Bioinformatics
Accessing the Biostar Handbook This course follows along closely with the Biostar Handbook content, and course registration comes with a Biostar Handbook license. You will receive an email from hello@biostarhandbook.com with information detailing Read More...
Web Page
Bioinformatics
This course follows along closely with the Biostar Handbook content, and course registration comes with a Biostar Handbook license. You will receive an email from hello@biostarhandbook.com with information detailing how to set up Read More...
Web Page
Bioinformatics
As a reminder, when you registered for the course, you also gained a 6-month subscription to the Biostar Handbook Collection. This is a fantastic resource covering a range of bioinformatics topics, not just RNA-Seq. At Read More...
Web Page
Bioinformatics
Below is a list of links and resources mentioned in the BIOSTAR Sequencing Instruments class given on 06/10/20 and 06/11/20.
Web Page
Bioinformatics
The URL for the Biostar handbook is https://www.biostarhandbook.com . Once you sign into this handbook, you will find that it is composed of several different books including one for RNA sequencing. Scroll to Read More...
Web Page
Bioinformatics
Biostar Class - Sequencing Instruments Below is a list of links and resources mentioned in the BIOSTAR Sequencing Instruments class given on 06/10/20 and 06/11/20. Sequencing Technologies - Company Web Sites Illumina PacBio Oxford Nanopore 10X Genomics Read More...
Web Page
Bioinformatics
Normalization - the process of scaling data to account for uncontrolled factors affecting variation. Effect size - "the quantitative measure of the magnitude of a phenomenon" (Biostar Handbook). P-value - "the probability Read More...
Web Page
Bioinformatics
To complement this course, there is a module available on Biowulf with installed programs associated with the Biostar Handbook. During class, we will work on the command line on the GOLD system on DNAnexus. However, Read More...
Web Page
Bioinformatics
This page uses information directly from the Biostar Handbook by Istvan Albert (PSU). Activate the bioinformatics environment and install some software as directed. conda activate bioinfo conda update -y blast conda install -y cd-hit pip Read More...
Web Page
Bioinformatics
The Biostar handbook has included bioinformatics software in a conda environment named bioinfo . If you followed Biostar Handbook instructions and created a bioinfo environment on your local computer, you will need to activate the environment Read More...
Web Page
Bioinformatics
This page uses some content directly from the Biostar Handbook by Istvan Albert.
Web Page
Bioinformatics
The Biostar handbook quite often uses a command called parallel . Let's break this command down further here.
Web Page
Bioinformatics
For your convenience, we have created a module on Biowulf that includes many of the same programs in the bioinfo environment from The Biostar Handbook . To use this module, please see the instructions documented under Read More...
Web Page
Bioinformatics
For your convenience, we have created a module on Biowulf that includes many of the same programs in the bioinfo environment from The Biostar Handbook . Instructions for using this module can be found at Additional Read More...
Web Page
Bioinformatics
In Lesson 7, you learned how to download and work with archived and compressed files. To practice what you have learned, we will use the ERCC spike in control data, which Istvan Albert, creator of the Read More...
Web Page
Bioinformatics
The documentation you are currently reading will be accessible in the future from the BTEP website . For this course, there are additional resources worthy of note under the Additional Resource tab, including Further Readings and Read More...
Web Page
Bioinformatics
After obtaining the expression counts for each gene, we can run helper R scripts provided by the author of the Biostar Handbook to obtain differential gene expression results. In our lessons, we used deseq2.r. Read More...
Web Page
Bioinformatics
Licenses to the Biostar Handbook are available to CCR researchers. Please email BTEP at ncibtep@nih.gov if you would like a license.
Web Page
Bioinformatics
Biostars: Bioinformatics Explained is a question and answer forum where researchers can obtain answers to questions ranging from simple to advanced in the fields of bioinformatics, computational genomics, and biological data analysis. The developers of Read More...
Web Page
Bioinformatics
The handbooks are opinionated. The opinions expressed in the book may or may not align with those of other bioinformaticians. A license is required to access the Biostar Handbook, but anyone can submit a question Read More...
Web Page
Bioinformatics
A scripting language that can be used for manipulating data and generating reports. Awk is a utility that enables a programmer to write tiny but effective programs in the form of statements that define text Read More...
Web Page
Bioinformatics
To participate in this course, you will need a computer, a reliable internet connection, and a web browser. All classes and help sessions will be held virtually through Webex. In addition, this class will be Read More...
Web Page
Bioinformatics
As with any language, the learning curve for Unix can be quite steep. However, to get started analyzing data you really need to understand the following: Directory navigation: what the directory tree is, how to Read More...
Web Page
Bioinformatics
As with any language, the learning curve for Unix can be quite steep. However, to get started analyzing data you really need to understand the following: Directory navigation: what the directory tree is, how to Read More...
Web Page
Bioinformatics
VCF is a data representation format used to describe variations in the genome. VCF files can contain information on any number of samples (thousands). VCF is composed of two sections, a header section and record Read More...
Web Page
Bioinformatics
NCBI BioProject : PRJN#### (example: PRJNA257197 ) contains the overall description of a single research initiative; a project will typically relate to multiple samples and datasets. NCBI BioSample : SAMN#### or SRS#### (example: SAMN03254300) describe biological source material; Read More...
Web Page
Bioinformatics
Type cmd + spacebar and search for "terminal". Once open, right click on the app logo in the dock. Select Options and Keep in Dock . The default shell starting with Mac OSX version 10.14 is Read More...
Web Page
Bioinformatics
Biostars on Biowulf To complement this course, there is a module available on Biowulf with installed programs associated with the Biostar Handbook. During class, we will work on the command line on the GOLD system Read More...
Web Page
Bioinformatics
Tools that we will be using for RNA sequencing analysis in this course series include command line applications for raw data quality assessment, data cleanup, trimming, alignment, etc. We will also be Read More...
Web Page
Bioinformatics
Content for this course series was adapted / inspired by the following sources: The Biostar Handbook , 2nd Edition, Istvan Albert. Malachi Griffith, Jason R. Walker, Nicholas C. Spies, Benjamin J. Ainscough, Obi L. Griffith. 2015. Informatics for Read More...
Web Page
Bioinformatics
References Content for this course series was adapted / inspired by the following sources: The Biostar Handbook , 2nd Edition, Istvan Albert. Malachi Griffith, Jason R. Walker, Nicholas C. Spies, Benjamin J. Ainscough, Obi L. Griffith. 2015. Informatics Read More...
Web Page
Bioinformatics
Lesson 12: RNA sequencing review 1 Learning objectives Here, we will do a quick review of what we have learned about RNA sequencing in Lessons 8 through 11. Accessing the Biostar handbook The URL for the Biostar handbook is Read More...
Web Page
Bioinformatics
As with any language, the learning curve for Unix can be quite steep. However, to work on Biowulf you really need to understand the following: Directory navigation: what the directory tree is, how to navigate Read More...
Web Page
Bioinformatics
To work on Biowulf you really need to understand the following: Directory navigation: what the directory tree is, how to navigate and move around with cd Absolute and relative paths: how to access files located Read More...
Web Page
Bioinformatics
Click here to download the class data as a zip file to your local computer. Macs should automatically unzip the file upon download, but Windows users will have to unzip it after download. Follow the instructions at https://bioinformatics. Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Always remember to start the bioinformatics environment when working on Biostar class material. conda activate bioinfo Let's start by creating a directory Read More...
Web Page
Bioinformatics
Course Wrap-up This lesson concludes the Bioinformatics for Beginners course series. Please email us any time at ncibtep@nih.gov for help with your bioinformatics questions or concerns. Lesson Objectives Short course overview. Review BTEP Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Learn: FASTQC for assaying quality of sequence reads MultiQC for combining multiple FASTQC reports into one report Trimmomatic for removing sequence data based Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Always remember to activate the bioinfo environment when working on Biostar class materials. conda activate bioinfo The bulk RNA-Seq test data we've Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Remember to activate the bioinformatics environment. conda activate bioinfo The jellyfish program is dependent on a program called "gcc" which is Read More...
Web Page
Bioinformatics
The Biostar Handbook works with programs installed within a conda environment named bioinfo . Conda is commonly used for bioinformatics package installations. Conda is often used for scientific software installation because... Installing software is hard. Installing Read More...
Web Page
Bioinformatics
After aligning the sequencing data to the genome, we need to count how many reads aligned to each gene. We did this using the tool featureCounts. This tool takes as input our Read More...
Web Page
Bioinformatics
As mentioned above, instructions for using the Biostar module on Biowulf can be found in the course documents . The Biostars module loads the software used in this course series. A list of the software included Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Start by activating the bioinfo environment. conda activate bioinfo Create a new directory for the multiqc data. mkdir multi cd multi Retrieve the Read More...
Web Page
Bioinformatics
First we will obtain the SRA data from the biostar handbook web site curl http://data.biostarhandbook.com/sra/sra-runinfo-2019-01.tar.gz --output sra-runinfo-2019-01.tar.gz Now we can unpack the data. tar Read More...
Web Page
Bioinformatics
Lesson 1: Introduction to Unix and the Shell Lesson Objectives Course overview. Introduce Unix and describe how it differs from other operating systems. Introduce and get set up on DNAnexus and the GOLD system. Discuss ways Read More...
Web Page
Bioinformatics
Lesson 1: Introduction to Unix and the Shell Lesson Objectives Review the course syllabus and general structure of lessons to come. Introduce Unix and describe how it differs from other operating systems. Introduce and get set Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Review: * cd * mkdir * curl * tar * cat * grep * wc * outputting data * piping data from one command to another * cut Learn: * du * pip * csvkit * datamash Read More...
Web Page
Bioinformatics
Lesson 7 Practice In Lesson 7, you learned how to download and work with archived and compressed files. To practice what you have learned, we will use the ERCC spike in control data, which Istvan Albert, creator Read More...
Web Page
Bioinformatics
This page contains content directly from the Biostar Handbook by Istvan Albert. Always remember to activate your bioinformatics environment. conda activate bioinfo What is a sequence pattern? A sequence pattern is a sequence of bases Read More...
Web Page
Bioinformatics
The bulk RNA-Seq test data we've been working with is in FASTQ format. We'd like to do a BLAST search on a couple of these sequences. Data must be in FASTA format to Read More...
Web Page
Bioinformatics
This page uses material directly from the Biostar Handbook by Istvan Albert. Always remember to activate the bioinformatics environment. conda activate bioinfo To align sequences to a genome, we need (1) genome sequence and the (2) sequence Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Obtain RNA-seq test data. The test data consists of two commercially available RNA samples: Universal Human Reference (UHR) and Human Brain Reference (HBR) . Read More...
Web Page
Bioinformatics
The test data consists of two commercially available RNA samples: Universal Human Reference (UHR) and Human Brain Reference (HBR) . The UHR is total RNA isolated from a diverse set of 10 cancer cell lines. The HBR Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Always remember to activate the bioinfo environment when working on Biostar class material. conda activate bioinfo Retrieving a FASTA genome from NCBI/GenBank Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Learn * What are sequence adapters? * Do we need to trim them before alignment? * How can I trim with a new adapter sequence? Be Read More...
Web Page
Bioinformatics
Bioinformatics for Beginners: RNA-Seq Course Description: This course was designed to teach the basic skills needed for bioinformatics, including working on the Unix command line. This course primarily focuses on RNA-Seq analysis. All steps of Read More...
Web Page
Bioinformatics
This page contains content taken directly from the Biostar Handbook by Istvan Albert. Activate the bioinformatics environment. conda activate bioinfo First let's make a place to store today's work. In your biostar_class Read More...
Web Page
Bioinformatics
This page contains content taken directly from the Biostar Handbook by Istvan Albert. Always remember to start the bioinformatics environment. conda activate bioinfo We will be analyzing differential expression of genes on Chr22 from the Read More...
Web Page
Bioinformatics
Bioinformatics Training and Education Program 15 January – 15 February 2023 BTEP Bulletin Contact us at ncibtep@nih.gov FEATURED BIOINFORMATICS EVENTS Data Management Sharing: Part 1 and Part 2 DOE-NCI Collaboration: MOSSAIC for Advancing Computational Models for Cancer Research Variation Read More...
Web Page
Bioinformatics
BTEP is currently running on-line classes to enable CCR scientists to learn computer and bioinformatics skills remotely. We are using two on-line resources as the basis of each class and have added bi-weekly interactive learning Read More...
Web Page
Bioinformatics
This page uses content taken directly from the Biostar Handbook by Istvan Albert. Remember to activate the class bioinformatics environment. conda activate bioinfo Introduction to Genomic Variation Genomic variations are typically categorized into different classes Read More...
Web Page
Bioinformatics
Lesson 7: Downloading the RNA-Seq Data and Dataset Overview Lesson Review pwd (print working directory) ls (list) touch (creates an empty file) nano (basic editor for creating small text files) using the rm command to remove Read More...
Web Page
Bioinformatics
This page contains content directly from The Biostar Handbook . Always remember to start the bioinformatics environment. conda activate bioinfo Pseudoalignment-based methods identify locations in the genome using patterns rather than via alignment type algorithms. It Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Learn: using trimmomatic to remove low-quality bases from a sequence Always remember to activate the bioinformatics environment. conda activate bioinfo We will be Read More...
Web Page
Bioinformatics
Retrieve R "helper" scripts developed for Biostars environment. curl -O http://data.biostarhandbook.com/rnaseq/code/deseq1.r curl -O http://data.biostarhandbook.com/rnaseq/code/deseq2.r curl -O http://data.biostarhandbook. Read More...
Web Page
Bioinformatics
Lesson 6: sra-tools, e-utilities, and parallel This page uses some content directly from the Biostar Handbook by Istvan Albert. Lesson 5 Review: The majority of computational tasks on Biowulf should be submitted as jobs: sbatch or swarm Read More...
Web Page
Bioinformatics
Lesson 17: RNA sequencing review 2 Learning objectives This lesson will serve as a comprehensive review of Module 2. We will spend roughly the first hour reviewing the Module 2 material and the second hour answering specific questions from the poll Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Remember to activate the bioinfo environment. conda activate bioinfo Then create a new directory for files we will be working with today in Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Review: * downloading data from SRA * decompressing tar files * e-utilities * fastq-dump Learn: * sra-stat * XML format * automating SRA downloads * working with comma-separated values (csv) format * Read More...
Web Page
Bioinformatics
“Gene set enrichment analysis” refers to the process of discovering the common characteristics potentially present in a list of genes. When these characteristics are GO terms, the process is called “functional enrichment.” Warning Overall GO Read More...
Web Page
Bioinformatics
More useful Unix Flags and command options - making programs do what they do Use of wildcards Using tab complete for less typing Access your history with the "up" and "down" Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Learning objectives: 1. Understand what a sequence alignment is and how different algorithms can affect alignments. 2. Learn how scoring matrices and gap penalties (gap Read More...
Web Page
Bioinformatics
Lesson 6: Downloading data from the SRA For this lesson, you will need to login to the GOLD environment on DNAnexus. Lesson 5 Review: The majority of computational tasks on Biowulf should be submitted as jobs: sbatch Read More...
Web Page
Bioinformatics
This page contains content taken directly from the Biostar Handbook (Istvan Albert). Always remember to activate the class bioinformatics environment. conda activate bioinfo For this data analysis, we will be using: Two commercially available RNA Read More...
Web Page
Bioinformatics
Lesson 16: RNA sequencing review and classification based analysis Before getting started, remember to be signed on to the DNAnexus GOLD environment. Review In the previous classes, we learned about the steps involved in RNA sequencing Read More...
Web Page
Bioinformatics
This page uses content directly from the Biostar Handbook by Istvan Albert. Remember to activate the bioinformatics environment and create a directory for today's work. conda activate bioinfo mkdir blast cd blast What is Read More...
Web Page
Bioinformatics
Lesson 1: Introduction to Biowulf, Unix, and R Learning Objectives Learn about why you may want to use R on Biowulf. Refresh Unix and R skills. This lesson will not be hands on. Why use R Read More...
Web Page
Bioinformatics
Learning Objectives Understand the components of an HPC system. How does this compare to your local desktop? Learn about Biowulf, the NIH HPC cluster. Learn about the command line interface and resources for learning. What Read More...