Data

When running bioinformatics analyses at the command line, you may want publicly available data for testing or analysis. This page gives some examples of how to obtain it, along with sample datasets.

Get data with wget or curl

Both wget and curl download a file from a URL given at the command line.
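
As a minimal sketch of this step, the shell function below wraps both tools: it uses curl when it is installed and falls back to wget otherwise. The function name `download`, the optional output-name argument, and the example.org URL are illustrative assumptions, not part of the original page. In their simplest forms, `wget URL` and `curl -L -O URL` each save the file under its remote name.

```shell
# Sketch: download one file with whichever of curl or wget is installed.
# "download" is a hypothetical helper name chosen for this example.
download() {
    url="$1"
    out="${2:-$(basename "$url")}"   # default: keep the remote file name

    if command -v curl >/dev/null 2>&1; then
        # -L follows redirects, -f fails on HTTP errors, -o names the output file
        curl -L -f -o "$out" "$url"
    elif command -v wget >/dev/null 2>&1; then
        # -O names the output file
        wget -O "$out" "$url"
    else
        echo "neither curl nor wget found" >&2
        return 1
    fi
}

# Example call (the URL is a placeholder, not a real dataset):
# download "https://example.org/data/reads.fastq.gz"
```

Passing a second argument, as in `download URL local_name.fastq.gz`, saves the file under a different local name.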

More about wget: https://en.wikipedia.org/wiki/Wget

More about curl: https://en.wikipedia.org/wiki/CURL

SRA Toolkit
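
The SRA Toolkit provides command-line tools for downloading runs from the NCBI Sequence Read Archive. The following is a sketch only, assuming the toolkit's `prefetch` and `fasterq-dump` programs are installed and on the PATH. The function name `fetch_sra_run`, the default output directory, and the placeholder accession are hypothetical.

```shell
# Sketch: fetch one run from the SRA and convert it to FASTQ.
# Assumes the SRA Toolkit (prefetch, fasterq-dump) is installed and on PATH.
# "fetch_sra_run" is a hypothetical helper name chosen for this example.
fetch_sra_run() {
    accession="$1"          # a run accession, typically starting with SRR
    outdir="${2:-fastq}"    # hypothetical default output directory

    mkdir -p "$outdir" &&
    # Download the run in SRA format
    prefetch "$accession" &&
    # Convert to FASTQ, writing paired reads to separate files
    fasterq-dump --split-files --outdir "$outdir" "$accession"
}

# Example call (SRRXXXXXXX is a placeholder, not a real accession):
# fetch_sra_run SRRXXXXXXX fastq
```

Splitting the work into `prefetch` followed by `fasterq-dump` means an interrupted conversion can be repeated without downloading the run again.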

Sample sequencing data

PacBio reads

From Canu’s quickstart page:

Synthetic PacBio data

Nanopore reads

From Canu’s quickstart page; data from the Loman lab:

Downsampled nanopore data

fast5 files

10X (linked) reads