Quickstart¶
Requirements¶
quicksand has two dependencies:
- Nextflow: version 22.04 or above (see the Nextflow installation instructions)
- Containerization software: Singularity or Docker
Tip
Check that both tools are installed correctly by running:
nextflow -v
>>> nextflow version 22.10
singularity --version
>>> singularity version 3.7.2-dirty
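The version requirement above can also be checked automatically. The following is a minimal sketch, not part of quicksand itself: the helper name `version_at_least` is hypothetical, it relies on `sort -V` from GNU coreutils, and it assumes that the output of `nextflow -v` contains a dotted version number as in the example above.

```shell
#!/usr/bin/env bash
# Return success (0) if version $1 is at least version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_at_least() {
    local have="$1" need="$2"
    # The lower of the two versions sorts first; if that is $need, $have is new enough.
    [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)" = "$need" ]
}

# Extract the first dotted number from the `nextflow -v` output, e.g. "22.10".
# Assumption: the output looks like "nextflow version 22.10...".
if command -v nextflow >/dev/null 2>&1; then
    installed="$(nextflow -v | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
    if version_at_least "$installed" "22.04"; then
        echo "Nextflow $installed is recent enough"
    else
        echo "Nextflow $installed is older than 22.04" >&2
    fi
else
    echo "nextflow not found on PATH" >&2
fi
```

The same helper could be reused for the container software, although no minimum Singularity or Docker version is stated here.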
Download test-data¶
The input for quicksand is a directory with user-supplied files in BAM or FASTQ format.
Adapter-trimming, overlap-merging and sequence demultiplexing need to be performed by the user prior to running quicksand.
Provide the directory with the --split DIR flag.
As a test file, download a mammalian mtDNA capture library from Denisova Cave Layer 20 (published in Zavala et al. 2021) into a directory named split:
wget -P split \
ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR564/ERR5640810/A20896.bam
Create test-database¶
Create a small test-database containing only the Hominidae, Bovidae and Hyaenidae mtDNA reference genomes (~150 genomes, runtime: ~3-5 minutes, size: ~5 GB):
nextflow run mpieva/quicksand-build -r v3.1 \
--include Hominidae,Bovidae,Hyaenidae \
--outdir refseq \
-profile singularity
Download the database¶
For real analyses (or the Examples / Tutorial page), download or create a complete database.
A custom version of the full reference material can be created with the quicksand-build pipeline (warning: requires at least 100 GB of RAM):
nextflow run mpieva/quicksand-build -r v3.1 \
--outdir refseq \
-profile singularity
Alternatively, download the latest complete database (RefSeq release 221) from the MPI EVA FTP server (~50 GB):
latest=$(curl http://ftp.eva.mpg.de/quicksand/LATEST)
wget -r -np -nc -nH --cut-dirs=3 --reject="*index.html*" -q --show-progress -P refseq http://ftp.eva.mpg.de/quicksand/build/$latest
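To see where the downloaded files end up, it helps to trace how `-nH`, `--cut-dirs=3` and `-P refseq` rewrite a remote path. The sketch below only imitates that mapping with a hypothetical helper `local_path`; it does not call wget. `RELEASE` is a placeholder for whatever value the LATEST file contains, and the `kraken/Mito_db_kmer22` subpath is assumed from the paths used when running the pipeline.

```shell
#!/usr/bin/env bash
# Imitate how `wget -nH --cut-dirs=N -P PREFIX` maps a URL path to a local path.
# Usage: local_path PREFIX N URL_PATH   (URL_PATH is the part after the host name)
local_path() {
    local prefix="$1" n="$2" path="${3#/}"
    local rest
    # -nH drops the host name; --cut-dirs=N drops the first N directory components.
    rest="$(printf '%s\n' "$path" | cut -d/ -f"$((n + 1))"-)"
    # -P PREFIX places whatever remains below PREFIX.
    printf '%s/%s\n' "$prefix" "$rest"
}

# RELEASE is a hypothetical placeholder for the value read from LATEST.
local_path refseq 3 quicksand/build/RELEASE/kraken/Mito_db_kmer22
# prints: refseq/kraken/Mito_db_kmer22
```

Cutting three components (`quicksand`, `build` and the release name) is what makes the download line up with the `refseq/` paths passed to the pipeline.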
Run quicksand¶
quicksand is executed directly from GitHub. With the database created and the test data downloaded, run the pipeline as follows:
nextflow run mpieva/quicksand -r v2.6 \
-profile singularity \
--db refseq/kraken/Mito_db_kmer22 \
--genomes refseq/genomes/ \
--bedfiles refseq/masked/ \
--split split/
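Because the command above references four paths, a quick check that they exist can save a failed launch. This is a minimal sketch with a hypothetical helper `check_paths`; the paths are the ones from the command above and assume the database was written to refseq and the test data to split.

```shell
#!/usr/bin/env bash
# Report which of the given paths are missing; return non-zero if any is.
check_paths() {
    local missing=0 p
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Paths taken from the quicksand command above.
if check_paths refseq/kraken/Mito_db_kmer22 refseq/genomes refseq/masked split; then
    echo "all inputs found"
fi
```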
The output of quicksand can be found in the directory quicksand_v2.6/.
See the final_report.tsv and filtered_report_0.5p_0.5b.tsv for a summary of the results.
See the Input and Output section for a detailed explanation of all the output files.
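For a first look at the report files, the header and row count are often enough. The helpers below are a generic sketch for any tab-separated file; the names `tsv_columns` and `tsv_rows` are hypothetical, and nothing is assumed about which columns the reports contain.

```shell
#!/usr/bin/env bash
# Print the column names of a tab-separated file, one per line, numbered.
tsv_columns() {
    awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) print i ": " $i; exit }' "$1"
}

# Count data rows (all lines after the header line).
tsv_rows() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}
```

Assuming the report sits directly in the output directory, it could be inspected with `tsv_columns quicksand_v2.6/final_report.tsv` followed by `tsv_rows quicksand_v2.6/final_report.tsv`.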