Quickstart

Requirements

quicksand has two dependencies

Nextflow:

Version 22.04 or above. See here

Containerization-Software:

Please use Singularity or Docker

Tip

check the successful installation of the software by running:

nextflow -v
    >>> nextflow version 22.10
singularity --version
    >>> singularity version 3.7.2-dirty

Download test-data

The input for quicksand is a directory with user-supplied files in BAM or FASTQ format. Adapter-trimming, overlap-merging and sequence demultiplexing need to be performed by the user prior to running quicksand. Provide the directory with the --split DIR flag.

As input for the quickstart, download the Hominin "Hohlenstein-Stadel" mtDNA [1] into a directory split:

wget -q --show-progress -P split http://ftp.eva.mpg.de/neandertal/Hohlenstein-Stadel/BAM/mtDNA/HST.raw_data.ALL.bam

Download the database

The required KrakenUniq database, the reference genomes for mapping and the bed-files for low-complexity filtering are available on the MPI EVA FTP Servers. Custom versions of the reference material can be created with the quicksand-build pipeline

For quickstarting quicksand, create a fresh database containing only the Hominidae mtDNA reference genomes (runtime: ~3-5 minutes):

nextflow run mpieva/quicksand-build -r v3.0
--include

Hominidae

--outdir

refseq

-profile

singularity

Alternatively, download the most full datastructure from the MPI EVA FTP SERVERS (~50 GB):

latest=$(curl http://ftp.eva.mpg.de/quicksand/LATEST)
wget -r -np -nc -nH --cut-dirs=3 --reject="*index.html*" -q --show-progress -P refseq http://ftp.eva.mpg.de/quicksand/build/$latest

Run quicksand

quicksand is executed directly from github. With the databases created and the testdata downloaded, run the pipeline as follows:

nextflow run mpieva/quicksand -r v2.5 \
  -profile   singularity \
  --db       refseq/kraken/Mito_db_kmer22 \
  --genomes  refseq/genomes/ \
  --bedfiles refseq/masked/ \
  --split    split/

The output of quicksand can be found in the directory quicksand_v2.5/

See the final_report.tsv and filtered_report_0.5p_0.5b.tsv for a summary of the results.

See the Input and Output section for a detailed explaination of all the output files.