Quickstart

Requirements

quicksand has two dependencies:

Nextflow:

Version 22.04 or above. See the Nextflow documentation for installation instructions.

Containerization-Software:

Please use Singularity or Docker.

Tip

Verify that the software is installed correctly by running:

nextflow -v
    >>> nextflow version 22.10
singularity --version
    >>> singularity version 3.7.2-dirty
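
The version requirement above can also be checked in a script. The following is a minimal sketch, not part of quicksand: the helper names version_ge and extract_version are hypothetical, and the comparison assumes a sort command that supports version sorting (-V, as in GNU coreutils).

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the first dotted number from a string such as
# "nextflow version 22.10" or "singularity version 3.7.2-dirty".
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Example usage; skipped if nextflow is not on the PATH.
if command -v nextflow >/dev/null 2>&1; then
    installed=$(extract_version "$(nextflow -v)")
    if version_ge "$installed" 22.04; then
        echo "Nextflow $installed satisfies the 22.04 requirement"
    else
        echo "Nextflow $installed is older than 22.04" >&2
    fi
fi
```

The same extract_version helper works on the output of singularity --version, although this guide does not state a minimum Singularity version to compare against.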

Download test-data

The input for quicksand is a directory with user-supplied files in BAM or FASTQ format. Adapter-trimming, overlap-merging and sequence demultiplexing need to be performed by the user prior to running quicksand. Provide the directory with the --split DIR flag.

As a test file, download a mammalian mtDNA capture library from Denisova Cave Layer 20 (published in Zavala et al. 2021) into a directory named split:

wget -P split \
    ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR564/ERR5640810/A20896.bam
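
Before running the pipeline, it can be useful to confirm that the input directory actually contains BAM or FASTQ files. The sketch below is an illustration only: count_inputs is a hypothetical helper, and the list of file extensions is an assumption, so adjust it to the file names your data uses.

```shell
# Count the files directly inside directory $1 that look like BAM or FASTQ input.
# The extension list is an assumption, not taken from the quicksand documentation.
count_inputs() {
    find "$1" -maxdepth 1 -type f \
        \( -name '*.bam' -o -name '*.fastq' -o -name '*.fq' \
           -o -name '*.fastq.gz' -o -name '*.fq.gz' \) | wc -l | tr -d ' '
}

# Example usage; skipped if the split directory does not exist yet.
if [ -d split ]; then
    echo "split/ contains $(count_inputs split) candidate input file(s)"
fi
```

After the download above, the count for split should be at least 1.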

Create test-database

Create a small test-database containing only the Hominidae, Bovidae and Hyaenidae mtDNA reference genomes (~150 genomes, runtime: ~3-5 minutes, size: ~5 GB):

nextflow run mpieva/quicksand-build -r v3.1 \
    --include  Hominidae,Bovidae,Hyaenidae \
    --outdir   refseq \
    -profile   singularity

Download the database

For real analyses (or the Examples / Tutorial page), download or create a complete database.

A custom version of the full reference material can be created with the quicksand-build pipeline (warning: requires at least 100 GB of RAM):

nextflow run mpieva/quicksand-build -r v3.1 \
  --outdir   refseq \
  -profile   singularity

Alternatively, download the latest complete data structure (RefSeq release 221) from the MPI EVA FTP server (~50 GB):

latest=$(curl http://ftp.eva.mpg.de/quicksand/LATEST)
wget -r -np -nc -nH --cut-dirs=3 --reject="*index.html*" -q --show-progress -P refseq http://ftp.eva.mpg.de/quicksand/build/$latest
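
A recursive download can end up incomplete, so a quick look at the directory layout is a cheap sanity check. The sketch below assumes the three sub-directories that the quicksand run command in this guide points to (kraken/Mito_db_kmer22, genomes and masked); check_db_layout is a hypothetical helper and does not validate the contents of these directories.

```shell
# Report every expected sub-directory that is missing under database root $1.
# Returns 0 if all are present, 1 otherwise.
# The sub-directory names are taken from the run command in this guide.
check_db_layout() {
    root=$1
    missing=0
    for sub in kraken/Mito_db_kmer22 genomes masked; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing: $root/$sub" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage; skipped if the refseq directory does not exist yet.
if [ -d refseq ]; then
    if check_db_layout refseq; then
        echo "refseq/ contains all expected sub-directories"
    else
        echo "refseq/ is incomplete, consider repeating the download" >&2
    fi
fi
```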

Run quicksand

quicksand is executed directly from GitHub. With the database created and the test data downloaded, run the pipeline as follows:

nextflow run mpieva/quicksand -r v2.6 \
  -profile   singularity \
  --db       refseq/kraken/Mito_db_kmer22 \
  --genomes  refseq/genomes/ \
  --bedfiles refseq/masked/ \
  --split    split/

The output of quicksand is written to the directory quicksand_v2.6/.

See the final_report.tsv and filtered_report_0.5p_0.5b.tsv for a summary of the results.
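
For a first look at a report from the command line, something like the following may help. It is a generic sketch for tab-separated files and makes no assumptions about the column names in the report. It does assume that the first line is a header, and tsv_summary is a hypothetical helper.

```shell
# Print the number of columns and the number of data rows of a TSV file.
# Assumes the first line of the file is a header line.
tsv_summary() {
    awk -F '\t' '
        NR == 1 { cols = NF }
        END     { printf "%d columns, %d rows\n", cols, (NR > 0 ? NR - 1 : 0) }
    ' "$1"
}

# Example usage; skipped if the report has not been written yet.
report=quicksand_v2.6/final_report.tsv
if [ -f "$report" ]; then
    tsv_summary "$report"
    head -n 5 "$report"   # header plus the first four data rows
fi
```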

See the Input and Output section for a detailed explanation of all the output files.