How to install snpArcher

Prerequisites

Operating system: Linux or macOS (Windows users should use WSL)
Git: for cloning the repository
Conda or Mamba: for managing environments

If you are on an institutional cluster, conda may already be available. Check by running:

conda --version

If you see a version number, you are set. If you see "command not found", install conda through Miniforge:

wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash "Miniforge3-$(uname)-$(uname -m).sh"
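
The availability check above can be extended to cover both tools named in the prerequisites. The following is a minimal sketch, not part of snpArcher: find_env_manager is a hypothetical helper name, and preferring mamba over conda when both are present is an assumption made here for illustration.

```shell
# Hypothetical helper: print the first environment manager found on PATH.
# Assumption: mamba is preferred over conda when both are installed.
find_env_manager() {
  for tool in mamba conda; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  echo "none"
  return 1
}

# Report the result without aborting when nothing is installed.
if manager=$(find_env_manager); then
  echo "Found environment manager: $manager"
else
  echo "Neither mamba nor conda found; install Miniforge first"
fi
```

If the helper reports that nothing was found, the Miniforge commands above are the next step.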

Install Snakemake

Create a dedicated conda environment with Snakemake:

conda create -c conda-forge -c bioconda -n snparcher "snakemake>=9"

Activate the environment:

conda activate snparcher
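
To confirm that the activated environment satisfies the snakemake>=9 constraint, something like the following sketch could be used. The helper names major_version and meets_minimum are hypothetical, and the sketch assumes that snakemake --version prints a plain dotted version string such as 9.1.0.

```shell
# Hypothetical helper: print the major component of a dotted version string.
major_version() {
  printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: succeed only if the major version is at least 9.
meets_minimum() {
  [ "$(major_version "$1")" -ge 9 ]
}

# Usage (inside the activated snparcher environment):
#   meets_minimum "$(snakemake --version)" && echo "Snakemake is new enough"
```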

Note

This is the only software you need to install manually. snpArcher uses Snakemake's --use-conda flag to create isolated environments for each pipeline step automatically.

Clone snpArcher

Clone the repository:

git clone https://github.com/harvardinformatics/snpArcher.git

You can place this clone anywhere on your filesystem. A single clone can serve multiple independent projects.

Pinning a release

To use a specific release, visit the Releases page and download the version you want, or check out a tag after cloning:

cd snpArcher
git checkout v2.0.0  # <-- change this to the desired version
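
To see which versions are available before checking one out, the tags in the clone can be listed. This is a sketch under the assumption that release tags start with "v", as v2.0.0 does; list_release_tags is a hypothetical helper name.

```shell
# Hypothetical helper: list tags starting with "v" in the clone at "$1",
# sorted so that the highest version comes first.
list_release_tags() {
  git -C "$1" tag --sort=-v:refname --list 'v*'
}

# Usage (from the directory that contains the clone):
#   list_release_tags snpArcher | head -n 5
```

Sorting by version rather than by name keeps a tag such as v10.0.0 ahead of v2.0.0.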

Verify the installation

From the directory that contains your snpArcher clone (not from inside it), run the bundled example dataset to confirm that everything works:

conda activate snparcher
snakemake --use-conda --cores 4 --directory snpArcher/example/

This processes five simulated samples against a small reference genome. It should complete in about five minutes on a machine with four cores. A successful run finishes with no error messages and writes its output files to snpArcher/example/results/.
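
A quick way to confirm that the run left output behind is to check that the results directory exists and is not empty. The sketch below is illustrative only: check_results is a hypothetical helper, and it does not inspect the contents of the files.

```shell
# Hypothetical helper: succeed if the directory "$1" exists and is non-empty.
check_results() {
  results_dir=$1
  if [ -d "$results_dir" ] && [ -n "$(ls -A "$results_dir")" ]; then
    echo "ok: $results_dir contains output files"
  else
    echo "missing or empty: $results_dir"
    return 1
  fi
}

# Usage (from the directory that contains the clone):
#   check_results snpArcher/example/results
```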

To do a faster check without actually running any jobs, use the dry-run flag:

snakemake --use-conda --dry-run --directory snpArcher/example/

This resolves the full dependency graph and reports which jobs would be run, without executing them.
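
The two commands can be chained so that the real run starts only if the dry run succeeds. This is a sketch using the same flags as above; run_example is a hypothetical wrapper name, and the default of four cores simply mirrors the earlier command.

```shell
# Hypothetical wrapper: dry-run first, then execute only if the dry run passed.
# "$1" is the example directory; "$2" is the core count (defaults to 4).
run_example() {
  example_dir=$1
  cores=${2:-4}
  snakemake --use-conda --dry-run --directory "$example_dir" &&
    snakemake --use-conda --cores "$cores" --directory "$example_dir"
}

# Usage (from the directory that contains the clone):
#   run_example snpArcher/example/
```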

Optional: install executor plugins for HPC

If you plan to run snpArcher on a cluster, you need an executor plugin. For SLURM (the most common scheduler):

conda activate snparcher
pip install snakemake-executor-plugin-slurm

For other schedulers, find the appropriate plugin in the Snakemake Plugin Catalog and follow its installation instructions.
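
To check that a plugin was installed into the active environment, pip show can be queried; it exits with a non-zero status when the package is not installed. The helper name plugin_installed below is hypothetical, and the sketch assumes that pip refers to the pip of the activated snparcher environment.

```shell
# Hypothetical helper: succeed if the package named "$1" is visible to pip.
plugin_installed() {
  pip show "$1" >/dev/null 2>&1
}

# Usage (inside the activated snparcher environment):
#   plugin_installed snakemake-executor-plugin-slurm && echo "SLURM plugin found"
```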

See How to run on HPC for full cluster setup details.

Next steps

Create a sample sheet describing your samples
Configure your run with a config.yaml file
Run locally or run on HPC