How to install snpArcher

Prerequisites

  • Operating system: Linux or macOS (Windows users should use WSL)
  • Git: for cloning the repository
  • Conda or Mamba: for managing environments

If you are on an institutional cluster, conda may already be available. Check by running:

conda --version

If you see a version number, you are set. If you see "command not found", install conda through Miniforge:

wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
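Before moving on, it may help to confirm the prerequisites in one pass. The following is a minimal sketch and not part of snpArcher; the helper names `have` and `check_prereqs` are made up for illustration. If you have just installed Miniforge, you may need to open a new shell before conda appears on your PATH.

```shell
# Hypothetical helpers, not part of snpArcher.
# have: succeeds if the named command is on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# check_prereqs: report on git and conda/mamba; returns 1 if anything is missing.
check_prereqs() {
  missing=0
  if have git; then
    echo "git: found"
  else
    echo "git: missing"
    missing=1
  fi
  if have conda || have mamba; then
    echo "conda or mamba: found"
  else
    echo "conda or mamba: missing"
    missing=1
  fi
  return "$missing"
}

check_prereqs || echo "Install the missing tools before continuing."
```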

Install Snakemake

Create a dedicated conda environment with Snakemake:

conda create -c conda-forge -c bioconda -n snparcher "snakemake>=9"

Activate the environment:

conda activate snparcher
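With the environment active, you can confirm that the installed Snakemake satisfies the `>=9` constraint used above. This is a sketch under the assumption that `snakemake --version` prints a plain dotted version string; the function name `major_at_least_9` is hypothetical.

```shell
# Hypothetical helper: succeeds if the version string's major number is >= 9.
# Non-numeric input (for example, an error message) is treated as a failure.
major_at_least_9() {
  major="${1%%.*}"
  [ "$major" -ge 9 ] 2>/dev/null
}

if command -v snakemake >/dev/null 2>&1; then
  ver="$(snakemake --version)"
  if major_at_least_9 "$ver"; then
    echo "Snakemake $ver is recent enough"
  else
    echo "Snakemake $ver is older than 9; recreate the environment"
  fi
else
  echo "snakemake not found; is the snparcher environment active?"
fi
```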

Note

This is the only software you need to install manually. snpArcher uses Snakemake's --use-conda flag to create isolated environments for each pipeline step automatically.

Clone snpArcher

Clone the repository:

git clone https://github.com/harvardinformatics/snpArcher.git

You can place this clone anywhere on your filesystem. A single clone can serve multiple independent projects.

Pinning a release

To use a specific release, visit the Releases page and download the version you want, or check out a tag after cloning:

cd snpArcher
git checkout v2.0.0  # <-- change this to the desired version
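To see which versions are available before checking one out, you can list the tags in your clone. The sketch below assumes release tags start with `v`, as in the `v2.0.0` example, and that you run it from the directory containing the clone; `list_release_tags` is a hypothetical helper name.

```shell
# Hypothetical helper: print up to five of the newest v* tags in a clone.
# Returns non-zero if the given directory is not a git repository.
list_release_tags() {
  git -C "$1" rev-parse --git-dir >/dev/null 2>&1 || return 1
  git -C "$1" tag --list 'v*' --sort=-version:refname | head -n 5
}

if list_release_tags snpArcher; then
  # Describe the current checkout relative to the nearest tag,
  # falling back to an abbreviated commit id if no tag is reachable.
  git -C snpArcher describe --tags --always
else
  echo "snpArcher is not a git clone in this directory"
fi
```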

Verify the installation

Run the bundled example dataset to confirm everything is working:

conda activate snparcher
snakemake --use-conda --cores 4 --directory snpArcher/example/

This processes five simulated samples against a small reference genome and should complete in about five minutes on a machine with four cores. A successful run finishes without error messages and writes output files under snpArcher/example/results/.
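A quick way to confirm that the run wrote output is to check that the results directory exists and is not empty. This is only a sketch: `results_present` is a hypothetical helper, and the path assumes you ran the example from the directory containing the clone. It does not validate the contents of the files.

```shell
# Hypothetical helper: succeeds if the directory exists and is not empty.
results_present() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Assumed location, relative to the directory that contains the clone.
results_dir="snpArcher/example/results"

if results_present "$results_dir"; then
  echo "Found output in $results_dir:"
  ls "$results_dir"
else
  echo "No output in $results_dir; check the Snakemake log for errors."
fi
```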

To do a faster check without actually running any jobs, use the dry-run flag:

snakemake --use-conda --dry-run --directory snpArcher/example/

This resolves the full dependency graph and reports which jobs would be run, without executing them.

Optional: install executor plugins for HPC

If you plan to run snpArcher on a cluster, you need an executor plugin. For SLURM (the most common scheduler):

conda activate snparcher
pip install snakemake-executor-plugin-slurm
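After installing, you can check that the plugin is visible in the active environment and that SLURM's submission command is available. This is a rough sketch: `pip_has` is a hypothetical helper, and the `sbatch` check assumes you are on a machine where SLURM client tools are installed.

```shell
# Hypothetical helper: succeeds if pip reports the named package as installed.
pip_has() {
  python -m pip show "$1" >/dev/null 2>&1
}

if pip_has snakemake-executor-plugin-slurm; then
  echo "SLURM executor plugin: installed"
else
  echo "SLURM executor plugin: not found in this environment"
fi

# sbatch is SLURM's job submission command; it is only present on SLURM systems.
if command -v sbatch >/dev/null 2>&1; then
  echo "sbatch: found"
else
  echo "sbatch: not found (are you on a cluster login node?)"
fi
```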

For other schedulers, find the appropriate plugin in the Snakemake Plugin Catalog and follow its installation instructions.

See How to run on HPC for full cluster setup details.

Next steps