Writers write, painters paint, genomics researchers… wait patiently for algorithm pipelines to finish processing NGS data, only to spend as much time visualizing results?
As long-time researchers and bioinformaticians, we’re well aware that in NGS data analysis, every bottleneck hampers innovation. Every hour spent feeding genomics data into algorithms, converting file formats, and waiting for one step to finish before moving on to the next is an hour you could have used to delve into downstream analysis and derive crucial insights.
We created Basepair to automate and dramatically speed up the time-consuming, multistep process researchers have been undertaking to analyze NGS data. Today, Basepair’s web platform is a hit with everyone from university students and staff to large enterprise teams, who use it to analyze up to thousands of samples in parallel, with only a few minutes of setup and less than an hour of processing time even for the most computationally expensive tasks.
Our goal was to give individuals and teams back the thousands of hours lost to manual NGS data analysis. But we didn’t see fit to simply output the resulting files and call it a day. We understand how crucial reporting and visuals are to downstream analysis, and what a pain they can be to set up. That’s why we include a host of rich visual and interactive components as part of our NGS data analysis report.
In addition to output files, Basepair includes three main reporting components that speed up downstream NGS data analysis: QC and alignment metrics, the Genome Browser, and interactive figures specific to certain workflows, such as an adjustable volcano plot for differential expression analysis and the Variant Browser for variant calling analysis.
What you see in your Basepair report
Depending on the workflow you choose for your RNA-Seq, ChIP-Seq, DNA-Seq, or ATAC-Seq data, your report will feature unique visuals and interactive elements. For example, let’s take a look at the overall report for an expression count workflow using STAR on an RNA-Seq data sample:
For this workflow, your report will include quality scores, alignment percentages, alignment to exons, and the distribution of read counts over genes or transcripts. You will be able to explore each gene further in the Genome Browser (more on this feature below).
Let’s take a peek at a DESeq report:
You can adjust the interactive volcano plot with the P-Value and Fold Change sliders on the left, and use the filter box to locate and label specific genes. Here’s a video of these interactive elements:
One size definitely doesn’t fit all in the NGS data analysis world, which is why we made sure each of our 30 workflows has a tailored set of graphs and tables to best represent your genomics data. Let’s further explore the types of plots you will see in Basepair’s reports.
Graph options in your report
In this section we’ll explore a cross-section of publication-ready visuals available to you, starting with a QC figure and Q30 summary from an Expression Count (STAR) report.
The same report shows the percentage of reads at various stages of the pipeline…
… and a histogram (log10 scale) and table for the expression counts.
For the Alignment (BWA) workflow report on DNA-Seq data, besides QC and alignment, we feature a line graph showing the percentage of bases with varying levels of coverage.
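If you are curious what sits behind a coverage curve like this, here is a minimal Python sketch. It is not Basepair’s implementation: the function name `coverage_breadth`, the depth thresholds, and the sample depths are all assumptions made for illustration. It takes per-base depths (the kind of values a depth tool reports for each position) and computes the percentage of bases covered at or above each threshold.

```python
def coverage_breadth(depths, thresholds=(1, 10, 20, 30)):
    """Return {threshold: percent of bases with depth >= threshold}.

    `depths` is a list of per-base read depths. The thresholds are
    illustrative defaults, not values taken from the Basepair report.
    """
    total = len(depths)
    if total == 0:
        return {t: 0.0 for t in thresholds}
    return {
        t: 100.0 * sum(d >= t for d in depths) / total
        for t in thresholds
    }


# Hypothetical per-base depths for a ten-base region.
depths = [0, 5, 12, 30, 45, 8, 22, 0, 15, 31]
breadth = coverage_breadth(depths)

for threshold, percent in breadth.items():
    print(f">= {threshold}x: {percent:.1f}% of bases")
```

Plotting these percentages against the thresholds gives a line that falls as the required depth rises, which is the general shape you would expect from a coverage graph of this kind.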
To locate peaks and motifs in ChIP-Seq data, we use MACS for peaks and HOMER for motifs. The report includes a simple peak intensity line graph and a distance-from-peak heatmap:
For the DESeq workflow for RNA-Seq data, we feature a volcano plot with interactive sliders and a filter box, along with an accompanying heatmap. Users can download the volcano plot as a .png or .svg file.
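For readers who want to see the logic that sliders like these typically control, here is a rough Python sketch. It is an illustration under stated assumptions, not Basepair’s code: the function `classify_genes`, the default cutoffs, the use of adjusted p-values, and the gene names are all hypothetical. Each gene becomes a point at (log2 fold change, -log10 p-value) and is marked up, down, or not significant depending on the two thresholds.

```python
import math


def classify_genes(results, max_padj=0.05, min_abs_log2fc=1.0, labels=()):
    """Turn (gene, log2_fold_change, padj) rows into volcano-plot points.

    The two cutoffs play the role of the P-Value and Fold Change
    sliders; `labels` plays the role of the filter box. All names and
    defaults here are illustrative assumptions.
    """
    points = []
    for gene, log2fc, padj in results:
        significant = padj <= max_padj and abs(log2fc) >= min_abs_log2fc
        if not significant:
            status = "not significant"
        elif log2fc > 0:
            status = "up"
        else:
            status = "down"
        points.append({
            "gene": gene,
            "x": log2fc,
            # Guard against log10(0) for p-values reported as exactly 0.
            "y": -math.log10(max(padj, 1e-300)),
            "status": status,
            "label": gene in labels,
        })
    return points


# Hypothetical differential expression results.
results = [
    ("GENE_A", 2.5, 0.001),
    ("GENE_B", -1.8, 0.01),
    ("GENE_C", 0.4, 0.0005),
    ("GENE_D", 3.0, 0.2),
]
points = classify_genes(results, labels={"GENE_B"})

for p in points:
    print(p["gene"], p["status"], "labeled" if p["label"] else "")
```

Moving a slider corresponds to calling the function again with a different `max_padj` or `min_abs_log2fc`, which changes which points are highlighted without touching the underlying results.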
Feel free to explore these plots and many others with your own data by signing up for our free 14-day Basepair demo.
Exploring raw data in Basepair’s Genome Browser
Each report includes a Genome Browser, an interactive visualization of your raw data.
Each of the tracks above is customizable: you can set the name, height, color, and data range to your preferences, as well as add or remove tracks. The Genome Browser is a quick way to assess data quality, home in on specific genes, and generally observe your dataset.
Exploring variant data in the Variant Browser
Secondary analysis gives you a few thousand variants in a .vcf file. What can you do with this data? Which parts of it are of interest? We received many requests for guidance on these questions, so we developed the Variant Browser.
Now, DNA-Seq data analyzed via the Variant Annotation or Variant Calling workflows includes an interactive variant browser that allows Basepair users to experiment with various parameters and filters, making the discovery of useful data more intuitive, and much faster.
Parameters include quality, depth, gnomAD frequency, and alternate reads frequency sliders, among many more. You can also filter by specific gene, annotation, feature type, or chromosome. In addition to the powerful flexibility of the table, an Analytics tab takes you to a set of useful metrics, from base change counts, to the number of variant types, to a host of other useful plots.
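To give a feel for what filters like these do to a variant list, here is a small Python sketch. It is an assumption-laden illustration rather than the Variant Browser’s actual logic: the function `filter_variants`, the record fields, the default cutoffs, and the sample variants are all hypothetical. Variants with no gnomAD frequency are treated as rare, which is a choice made for this sketch.

```python
def filter_variants(variants, min_qual=30.0, min_depth=10,
                    max_gnomad_af=0.01, min_alt_fraction=0.2,
                    genes=None, chromosomes=None):
    """Keep variants that pass every threshold and optional selection.

    Each variant is a dict with hypothetical keys: chrom, pos, gene,
    qual, depth, alt_reads, gnomad_af (None if absent from gnomAD).
    """
    kept = []
    for v in variants:
        # Fraction of reads at this site that support the alternate allele.
        alt_fraction = v["alt_reads"] / v["depth"] if v["depth"] else 0.0
        # Assumption: a variant missing from gnomAD counts as rare.
        gnomad_af = v["gnomad_af"] if v["gnomad_af"] is not None else 0.0
        if v["qual"] < min_qual or v["depth"] < min_depth:
            continue
        if gnomad_af > max_gnomad_af or alt_fraction < min_alt_fraction:
            continue
        if genes is not None and v["gene"] not in genes:
            continue
        if chromosomes is not None and v["chrom"] not in chromosomes:
            continue
        kept.append(v)
    return kept


# Hypothetical variant records, as if parsed from an annotated .vcf file.
variants = [
    {"chrom": "chr1", "pos": 1000, "gene": "GENE_A", "qual": 50.0,
     "depth": 40, "alt_reads": 20, "gnomad_af": 0.0001},
    {"chrom": "chr1", "pos": 2000, "gene": "GENE_B", "qual": 12.0,
     "depth": 35, "alt_reads": 17, "gnomad_af": 0.002},
    {"chrom": "chr2", "pos": 3000, "gene": "GENE_C", "qual": 80.0,
     "depth": 60, "alt_reads": 30, "gnomad_af": 0.25},
    {"chrom": "chr2", "pos": 4000, "gene": "GENE_D", "qual": 60.0,
     "depth": 50, "alt_reads": 4, "gnomad_af": None},
    {"chrom": "chrX", "pos": 5000, "gene": "GENE_E", "qual": 45.0,
     "depth": 25, "alt_reads": 12, "gnomad_af": None},
]

kept = filter_variants(variants)
kept_on_x = filter_variants(variants, chromosomes={"chrX"})
print([v["pos"] for v in kept])
print([v["pos"] for v in kept_on_x])
```

Tightening or loosening any one argument mirrors dragging a single slider: the table shrinks or grows while the original variant list stays untouched.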
At Basepair, our mission is to make NGS data analysis as seamless, fast, and rewarding as possible, so you can focus on what you do best. Our interactive reports include a variety of workflow-specific graphs, tables, and interactive elements that streamline reporting and downstream analysis, no matter how large the number of samples is.
Explore our sample reports – and try analyzing your own data – free for 14 days (no card details necessary).