Commits on Source (10)
SINGULARITY_VER=2.4.2
version: 2
jobs:
build:
machine: true
environment:
- GCLOUD: /opt/google-cloud-sdk/bin/gcloud
steps:
- checkout
- restore_cache:
keys:
- snakemake-{{ checksum ".circleci/setup.sh" }}-{{ checksum "test-environment.yml" }}-{{ checksum ".circleci/common.sh" }}
- run:
name: Update PATH
command: echo 'export PATH="`pwd`/miniconda/bin:$PATH"' >> $BASH_ENV
- run:
name: Setup Conda
command: .circleci/setup.sh
- save_cache:
key: snakemake-{{ checksum ".circleci/setup.sh" }}-{{ checksum "test-environment.yml" }}-{{ checksum ".circleci/common.sh" }}
paths:
- miniconda
- run:
name: Setup singularity
command: |
# TODO only install if singularity is not yet present
# if type singularity > /dev/null; then exit 0; fi
source .circleci/common.sh
sudo apt-get update; sudo apt-get install squashfs-tools
wget https://github.com/singularityware/singularity/releases/download/$SINGULARITY_VER/singularity-$SINGULARITY_VER.tar.gz
tar xvf singularity-$SINGULARITY_VER.tar.gz
cd singularity-$SINGULARITY_VER
./configure --prefix=/usr/local --sysconfdir=/etc
make
sudo make install
- run:
name: Setup Snakemake
command: |
source activate snakemake
pip install -e .
- run:
name: Setup iRODS Docker image
command: |
docker build -t irods-server tests/test_remote_irods
docker run -d -p 1247:1247 --name provider irods-server -i run_irods
sleep 10
docker exec -u irods provider iput /incoming/infile
cp -r tests/test_remote_irods/setup-data ~/.irods
- run:
name: Setup gcloud
command: |
# skip if key is unset
if [ -z $GCLOUD_SERVICE_KEY ]; then exit 0; fi
# otherwise init cloud
echo $GCLOUD_SERVICE_KEY | base64 --decode --ignore-garbage > ${HOME}/gcloud-service-key.json
sudo $GCLOUD components install kubectl
sudo $GCLOUD auth activate-service-account --key-file=${HOME}/gcloud-service-key.json
sudo $GCLOUD config set project snakemake-testing
- run:
name: Run tests
command: |
export GCLOUD_CLUSTER=t-`uuidgen`
export GOOGLE_APPLICATION_CREDENTIALS=${HOME}/gcloud-service-key.json
source activate snakemake
py.test tests/test*.py -v -x
#!/bin/bash
set -euo pipefail
if type conda > /dev/null; then exit 0; fi
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p miniconda
conda env create --name snakemake --file test-environment.yml
snakemake/_version.py export-subst
# Change Log
# [4.8.0] - 2018-03-13
### Added
- Integration with CWL: the `cwl` directive allows using CWL tool definitions in addition to shell commands or Snakemake wrappers.
- A global `singularity` directive allows defining a global singularity container to be used for all rules that don't specify their own.
- Singularity and Conda can now be combined. This can be used to specify the operating system (via singularity), and the software stack (via conda), without the overhead of creating specialized container images for workflows or tasks.
# [4.7.0] - 2018-02-19
### Changed
- Speedups when calculating dry-runs.
- Speedups for workflows with many rules when calculating the DAG.
- Accept SIGTERM to gracefully finish all running jobs and exit.
- Various minor bug fixes.
# [4.6.0] - 2018-02-06
### Changed
- Log files can now be used as input files for other rules.
- Adapted to changes in Kubernetes client API.
- Fixed minor issues in --archive option.
- Search path order in scripts was changed to fix a bug with leaked packages from root env when using script directive together with conda.
# [4.5.1] - 2018-02-01
### Added
- Input and output files can now tag pathlib objects.
### Changed
- Various minor bug fixes.
# [4.5.0] - 2018-01-18
### Added
- iRODS remote provider
### Changed
- Bug fix in shell usage of scripts and wrappers.
- Bug fixes for cluster execution, --immediate-submit and subworkflows.
## [4.4.0] - 2017-12-21
### Added
- A new shadow mode (minimal) that only symlinks input files has been added.
### Changed
- The default shell is now bash on Linux and macOS. If bash is not installed, we fall back to sh. Previously, Snakemake used the default shell of the user, which defeats the purpose of portability. If the developer decides so, the shell can always be overridden using shell.executable().
- Snakemake now requires Singularity 2.4.1 at least (only when running with --use-singularity).
- HTTP remote provider no longer automatically unpacks gzipped files.
- Fixed various smaller bugs.
## [4.3.1] - 2017-11-16
### Added
- List all conda environments with their location on disk via --list-conda-envs.
......
include versioneer.py
include snakemake/_version.py
[![wercker status](https://app.wercker.com/status/5b4faec0485e3b6ed5497f3e8e551b34/s/master "wercker status")](https://app.wercker.com/project/byKey/5b4faec0485e3b6ed5497f3e8e551b34)
[![CircleCI](https://circleci.com/bb/snakemake/snakemake/tree/master.svg?style=shield)](https://circleci.com/bb/snakemake/snakemake/tree/master)
# Snakemake - a pythonic workflow system
......
snakemake (4.8.0-1) UNRELEASED; urgency=medium
* Team upload.
* New upstream version
* Standards-Version: 4.1.3
* debhelper 11
* Build-Depends: r-cran-rmarkdown and disable patch that skips test using
RMarkdown
* Recommends: r-cran-rmarkdown
TODO: Needs python-datrie (#828741)
-- Andreas Tille <tille@debian.org> Tue, 27 Mar 2018 16:07:45 +0200
snakemake (4.3.1-1) unstable; urgency=medium
* Team upload
......
......@@ -3,11 +3,12 @@ Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.
Uploaders: Kevin Murray <kdmfoss@gmail.com>
Section: science
Priority: optional
Build-Depends: debhelper (>= 10),
Build-Depends: debhelper (>= 11~),
dh-python,
python3 (>= 3.2),
python3,
python3-boto,
python3-configargparse,
python3-datrie,
python3-nose,
python3-psutil,
python3-pytools,
......@@ -19,8 +20,9 @@ Build-Depends: debhelper (>= 10),
python3-wrapt,
python3-yaml,
python3-sphinx-rtd-theme,
ca-certificates
Standards-Version: 4.1.2
ca-certificates,
r-cran-rmarkdown
Standards-Version: 4.1.3
Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/snakemake.git/
Vcs-Git: https://anonscm.debian.org/git/debian-med/snakemake.git
Homepage: https://bitbucket.org/snakemake/snakemake
......@@ -43,6 +45,7 @@ Depends: ${misc:Depends},
libjs-d3,
ca-certificates
Recommends: python3-boto,
r-cran-rmarkdown
Description: pythonic workflow management system
Build systems like GNU Make are frequently used to create complicated
workflows, e.g. in bioinformatics. This project aims to reduce the
......
......@@ -22,11 +22,11 @@ Description: Avoid privacy breach
<script type="text/javascript" src="http://cdnjs.cloudflare.com/ajax/libs/bootstrap-select/1.5.4/bootstrap-select.min.js"></script>
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -4,25 +4,22 @@
@@ -4,25 +4,25 @@
Snakemake
=========
-.. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg
-.. image:: https://img.shields.io/conda/dn/bioconda/snakemake.svg?label=Bioconda
+.. image:: file:///usr/share/doc/snakemake/html/_svg/install_with-bioconda-brightgreen.svg
:target: https://bioconda.github.io/recipes/snakemake/README.html
......@@ -42,16 +42,17 @@ Description: Avoid privacy breach
+.. image:: file:///usr/share/doc/snakemake/html/_svg/status.svg
:target: https://quay.io/repository/snakemake/snakemake
-.. image:: https://app.wercker.com/status/5b4faec0485e3b6ed5497f3e8e551b34/s/master
- :target: https://app.wercker.com/project/byKey/5b4faec0485e3b6ed5497f3e8e551b34
-
-.. image:: https://circleci.com/bb/snakemake/snakemake/tree/master.svg?style=shield
+.. image:: file:///usr/share/doc/snakemake/html/_svg/master.svg
:target: https://circleci.com/bb/snakemake/snakemake/tree/master
-.. image:: https://img.shields.io/badge/stack-overflow-orange.svg
+.. image:: file:///usr/share/doc/snakemake/html/_svg/stack-overflow-orange.svg
:target: http://stackoverflow.com/questions/tagged/snakemake
-.. image:: https://img.shields.io/twitter/follow/johanneskoester.svg?style=social&label=Follow
+.. image:: file:///usr/share/doc/snakemake/html/_svg/johanneskoester_follow.svg
:target: https://twitter.com/johanneskoester
:target: https://twitter.com/search?l=&q=%23snakemake%20from%3Ajohanneskoester
The Snakemake workflow management system is a tool to create **reproducible and scalable** data analyses.
--- a/docs/project_info/citations.rst
......
......@@ -7,5 +7,5 @@
# 0007-noop-rate-limiter.patch - broken and obsolete with https://bugs.debian.org/880661
0008-remove_sphinx.ext.patch
0009-skip-test-without-google-cloud-sdk.patch
0010-skip-test-without-rmarkdown.patch
# 0010-skip-test-without-rmarkdown.patch
0011-fix-privacy-breach.patch
......@@ -7,5 +7,6 @@ wget -q https://img.shields.io/pypi/pyversions/snakemake.svg -O pyversions_snake
wget -q https://img.shields.io/pypi/v/snakemake.svg
wget -q https://img.shields.io/badge/stack-overflow-orange.svg
wget -q https://img.shields.io/twitter/follow/johanneskoester.svg?style=social\&label=Follow -O johanneskoester_follow.svg
wget -q wget https://img.shields.io/badge/snakemake-≥3.5.2-brightgreen.svg?style=flat-square -O snakemake_gt_3.5.2-brightgreen.svg
wget -q https://img.shields.io/badge/snakemake-≥3.5.2-brightgreen.svg?style=flat-square -O snakemake_gt_3.5.2-brightgreen.svg
wget -q https://quay.io/repository/snakemake/snakemake/status -O status.svg
wget -q https://circleci.com/bb/snakemake/snakemake/tree/master.svg?style=shield -O master.svg
<svg xmlns="http://www.w3.org/2000/svg" width="102" height="20"><linearGradient id="b" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><mask id="a"><rect width="102" height="20" rx="3" fill="#fff"/></mask><g mask="url(#a)"><path fill="#555" d="M0 0h49v20H0z"/><path fill="#4c1" d="M49 0h53v20H49z"/><path fill="url(#b)" d="M0 0h102v20H0z"/></g><g fill="#fff" text-anchor="middle" font-family="DejaVu Sans,Verdana,Geneva,sans-serif" font-size="11"><text x="24.5" y="15" fill="#010101" fill-opacity=".3">circleci</text><text x="24.5" y="14">circleci</text><text x="74.5" y="15" fill="#010101" fill-opacity=".3">passing</text><text x="74.5" y="14">passing</text></g></svg>
\ No newline at end of file
......@@ -114,6 +114,10 @@ Of course, if any input or output already defines a different remote location, t
Importantly, this means that Snakemake does **not** require a shared network
filesystem to work in the cloud.
Currently, this mode requires that the Snakemake workflow is stored in a git repository.
Snakemake uses git to query necessary source files (the Snakefile, scripts, config, ...)
for workflow execution and encodes them into the kubernetes job.
It is further possible to forward arbitrary environment variables to the kubernetes
jobs via the flag ``--kubernetes-env`` (see ``snakemake --help``).
......
......@@ -4,7 +4,7 @@
Snakemake
=========
.. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg
.. image:: https://img.shields.io/conda/dn/bioconda/snakemake.svg?label=Bioconda
:target: https://bioconda.github.io/recipes/snakemake/README.html
.. image:: https://img.shields.io/pypi/pyversions/snakemake.svg
......@@ -16,14 +16,14 @@ Snakemake
.. image:: https://quay.io/repository/snakemake/snakemake/status
:target: https://quay.io/repository/snakemake/snakemake
.. image:: https://app.wercker.com/status/5b4faec0485e3b6ed5497f3e8e551b34/s/master
:target: https://app.wercker.com/project/byKey/5b4faec0485e3b6ed5497f3e8e551b34
.. image:: https://circleci.com/bb/snakemake/snakemake/tree/master.svg?style=shield
:target: https://circleci.com/bb/snakemake/snakemake/tree/master
.. image:: https://img.shields.io/badge/stack-overflow-orange.svg
:target: http://stackoverflow.com/questions/tagged/snakemake
.. image:: https://img.shields.io/twitter/follow/johanneskoester.svg?style=social&label=Follow
:target: https://twitter.com/johanneskoester
:target: https://twitter.com/search?l=&q=%23snakemake%20from%3Ajohanneskoester
The Snakemake workflow management system is a tool to create **reproducible and scalable** data analyses.
Workflows are described via a human readable, Python based language.
......@@ -63,7 +63,7 @@ Rules describe how to create **output files** from **input files**.
* Rules can either use shell commands, plain Python code or external Python or R scripts to create output files from input files.
* Snakemake workflows can be easily executed on **workstations**, **clusters**, **the grid**, and **in the cloud** without modification. The job scheduling can be constrained by arbitrary resources like e.g. available CPU cores, memory or GPUs.
* Snakemake can automatically deploy required software dependencies of a workflow using `Conda <https://conda.io>`_ or `Singularity <http://singularity.lbl.gov/>`_.
* Snakemake can use Amazon S3, Google Storage, Dropbox, FTP, WebDAV and SFTP to access input or output files and further access input files via HTTP and HTTPS.
* Snakemake can use Amazon S3, Google Storage, Dropbox, FTP, WebDAV, SFTP and iRODS to access input or output files and further access input files via HTTP and HTTPS.
.. _main-getting-started:
......@@ -71,6 +71,7 @@ Rules describe how to create **output files** from **input files**.
Getting started
---------------
News about Snakemake is published via `Twitter <https://twitter.com/search?l=&q=%23snakemake%20from%3Ajohanneskoester>`_.
To get started, consider the :ref:`tutorial`, the `introductory slides <http://slides.com/johanneskoester/snakemake-tutorial-2016>`_, and the :ref:`FAQ <project_info-faq>`.
.. _main-support:
......@@ -105,7 +106,7 @@ Resources
The provided code should also serve as a best-practice example of how to build production-ready workflows with Snakemake.
Everybody is invited to contribute.
`Snakemake Profiles Project <https://github.com/snakemake-profiles/doc`_
`Snakemake Profiles Project <https://github.com/snakemake-profiles/doc>`_
This project provides Snakemake configuration profiles for various execution environments.
Please consider contributing your own if it is still missing.
......
......@@ -66,6 +66,20 @@ he quick fix for virtualenv is to temporarily deactivate the check for unbound v
For more details on bash strict mode, see `here <http://redsymbol.net/articles/unofficial-bash-strict-mode/>`_.
My shell command fails with exit code != 0 from within a pipe, what's wrong?
----------------------------------------------------------------------------
Snakemake is using `bash strict mode <http://redsymbol.net/articles/unofficial-bash-strict-mode/>`_ to ensure best practice error reporting in shell commands.
This includes the pipefail option, which makes a pipe fail if any command within it fails, not only the last one. If you don't want this, e.g., to handle empty output in the pipe, you can disable pipefail by prepending
.. code-block:: bash
set +o pipefail;
to your shell command in the problematic rule.
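The effect of pipefail can be checked outside Snakemake with a minimal sketch. It assumes that the strict mode prefix is ``set -euo pipefail;`` and that ``bash`` is on the PATH; ``run_shell`` is a hypothetical helper, not part of Snakemake.

```python
import subprocess

# Assumption: strict mode is modelled as this prefix in front of the command.
STRICT_PREFIX = "set -euo pipefail; "


def run_shell(cmd):
    """Run cmd in bash strict mode and return (exit code, stripped stdout)."""
    result = subprocess.run(
        ["bash", "-c", STRICT_PREFIX + cmd],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.returncode, result.stdout.strip()


# grep finds no match and exits with 1, although wc itself succeeds.
pipeline = "echo abc | grep xyz | wc -l"

strict = run_shell(pipeline)                          # pipe failure is reported
relaxed = run_shell("set +o pipefail; " + pipeline)   # only wc's status counts

print(strict[0], relaxed[0])
```

With pipefail active, the non-matching ``grep`` makes the whole command fail; after ``set +o pipefail;`` only the exit status of the last command in the pipe is considered.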
.. _glob-wildcards:
How do I run my rule on all files of a certain directory?
......@@ -177,6 +191,12 @@ You can use the entire Python `format minilanguage <http://docs.python.org/3/lib
Here the double braces are escapes, i.e. there will remain single braces in the final command. In contrast, ``{input}`` is replaced with an input filename.
In addition, if your shell command has literal backslashes, `\`, you must escape them with a backslash, `\\`. For example:
.. code-block:: python
shell: """printf \\">%s\\"" {{input}}"""
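Both escaping rules can be tried in plain Python. This sketch approximates Snakemake's formatting with ``str.format``, which is an assumption made for illustration; the command and file name are hypothetical.

```python
# Double braces survive formatting as single braces; a double backslash in the
# Python source is a single literal backslash in the resulting string.
template = "awk '{{print $1}}' {input} | tr '\\t' ','"

command = template.format(input="table.txt")

# The shell would receive: awk '{print $1}' table.txt | tr '\t' ','
print(command)
```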
How do I incorporate files that do not follow a consistent naming scheme?
-------------------------------------------------------------------------
......@@ -434,3 +454,15 @@ Git is messing up the modification times of my input files, what can I do?
--------------------------------------------------------------------------
When you checkout a git repository, the modification times of updated files are set to the time of the checkout. If you rely on these files as input **and** output files in your workflow, this can cause trouble. For example, Snakemake could think that a certain (git-tracked) output has to be re-executed, just because its input has been checked out a bit later. In such cases, it is advisable to set the file modification dates to the last commit date after an update has been pulled. See `here <https://stackoverflow.com/questions/2458042/restore-files-modification-time-in-git/22638823#22638823>`_ for a solution to achieve this.
How do I exit a running Snakemake workflow?
-------------------------------------------
There are two ways to exit a currently running workflow.
1. If you want to kill all running jobs, hit Ctrl+C. Note that when using --cluster, this will only cancel the main Snakemake process.
2. If you want to stop the scheduling of new jobs and wait for all running jobs to be finished, you can send a TERM signal, e.g., via
.. code-block:: bash
killall -TERM snakemake
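The second option can also be triggered from Python. The following is a minimal sketch on a POSIX system: ``CHILD`` is a hypothetical stand-in for a long-running process that finishes its work when it receives TERM, not Snakemake itself.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a running workflow: on SIGTERM it stops looping
# and exits cleanly instead of being killed immediately.
CHILD = r'''
import signal, time
stop = False
def on_term(signum, frame):
    global stop
    stop = True
signal.signal(signal.SIGTERM, on_term)
print("ready", flush=True)
while not stop:
    time.sleep(0.05)
print("finished running jobs", flush=True)
'''


def stop_gracefully():
    """Start the stand-in process, send it TERM, and wait for a clean exit."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    # Wait until the child has installed its signal handler.
    assert proc.stdout.readline().strip() == "ready"
    proc.send_signal(signal.SIGTERM)  # same signal that `killall -TERM` sends
    out, _ = proc.communicate(timeout=10)
    return proc.returncode, out.strip()


returncode, output = stop_gracefully()
print(returncode, output)
```

For a real workflow, the process ID of the main Snakemake process would take the place of the stand-in, e.g. via ``os.kill(pid, signal.SIGTERM)``.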
......@@ -27,6 +27,7 @@ External Resources
These resources are not part of the official documentation.
* `A number of tutorials on the subject "Tools for reproducible research" <http://nbis-reproducible-research.readthedocs.io>`_
* `Snakemake workflow used for the Kallisto paper <https://github.com/pachterlab/kallisto_paper_analysis>`_
* `An alternative tutorial for Snakemake <http://slowkow.com/notes/snakemake-tutorial/>`_
* `An Emacs mode for Snakemake <http://melpa.milkbox.net/#/snakemake-mode>`_
......@@ -36,3 +37,4 @@ These resources are not part of the official documentation.
* `Japanese version of the Snakemake tutorial <https://github.com/joemphilips/Translate_Snakemake_Tutorial>`_
* `Basic <http://bioinfo-fr.net/snakemake-pour-les-nuls>`_ and `advanced <http://bioinfo-fr.net/snakemake-aller-plus-loin-avec-la-parallelisation>`_ French Snakemake tutorial.
* `Mini tutorial on Snakemake and Bioconda <https://github.com/dlaehnemann/TutMinicondaSnakemake>`_
* `Snakeparse: a utility to expose Snakemake workflow configuration via a command line interface <https://github.com/nh13/snakeparse>`_
......@@ -27,7 +27,7 @@ In the workflow, the configuration is accessible via the global variable `config
rule all:
input:
expand("{sample}.{yourparam}.output.pdf", sample=config["samples"], param=config["yourparam"])
expand("{sample}.{param}.output.pdf", sample=config["samples"], param=config["yourparam"])
If the `configfile` statement is not used, the config variable provides an empty array.
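The wildcard names in the pattern have to match the keyword arguments given to ``expand``. The following rough stand-in illustrates this; ``expand_sketch`` is a hypothetical helper built on ``itertools.product`` and ``str.format``, not Snakemake's actual implementation, and the ``config`` values are made up.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Fill pattern with every combination of the given wildcard values."""
    names = list(wildcards)
    # Accept both single values and lists of values.
    values = [v if isinstance(v, (list, tuple)) else [v] for v in wildcards.values()]
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*values)
    ]


# Hypothetical configuration, as it might be loaded from a config file.
config = {"samples": ["a", "b"], "yourparam": [1, 2]}

files = expand_sketch(
    "{sample}.{param}.output.pdf",
    sample=config["samples"],
    param=config["yourparam"],
)
print(files)
```

In this sketch, a pattern containing ``{yourparam}`` raises a ``KeyError`` when the value is passed as ``param=...``, because no keyword with that name exists.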
In addition to the `configfile` statement, config values can be overwritten via the command line or the :ref:`api_reference_snakemake`, e.g.:
......@@ -60,6 +60,9 @@ Snakemake supports a separate configuration file for execution on a cluster.
A cluster config file allows you to specify cluster submission parameters outside the Snakefile.
The cluster config is a JSON- or YAML-formatted file that contains objects that match names of rules in the Snakefile.
The parameters in the cluster config are then accessed by the ``cluster.*`` wildcard when you are submitting jobs.
Note that a workflow shall never depend on a cluster configuration, because this would limit its portability.
Therefore, it is also not intended to access the cluster configuration from **within** the workflow.
For example, say that you have the following Snakefile:
.. code-block:: python
......
====================================
Workflow Distribution and Deployment
====================================
================================
Distribution and Reproducibility
================================
It is recommended to store each workflow in a dedicated git repository of the
following structure:
......@@ -82,6 +82,7 @@ with the following `environment definition <http://conda.pydata.org/docs/using/e
Snakemake will store the environment persistently in ``.snakemake/conda/$hash`` with ``$hash`` being the MD5 hash of the environment definition file content. This way, updates to the environment definition are automatically detected.
Note that you need to clean up environments manually for now. However, in many cases they are lightweight and consist of symlinks to your central conda installation.
.. _singularity:
--------------------------
Running jobs in containers
......@@ -112,6 +113,52 @@ Allowed image urls entail everything supported by singularity (e.g., ``shub://``
When ``--use-singularity`` is combined with ``--kubernetes`` (see :ref:`kubernetes`), cloud jobs will be automatically configured to run in privileged mode, because this is a current requirement of the singularity executable.
Importantly, those privileges won't be shared by the actual code that is executed in the singularity container though.
--------------------------------------------------
Combining Conda package management with containers
--------------------------------------------------
While :ref:`integrated_package_management` provides control over the used software in exactly
the desired versions, it does not control the underlying operating system.
Here, it is convenient that Snakemake >=4.8.0 allows combining Conda-based package management
with :ref:`singularity`.
For example, you can write
.. code-block:: python
singularity: "docker://continuumio/miniconda3:4.4.10"
rule NAME:
input:
"table.txt"
output:
"plots/myplot.pdf"
conda:
"envs/ggplot.yaml"
script:
"scripts/plot-stuff.R"
in other words, a global definition of a container image can be combined with a
per-rule conda directive.
Then, upon invocation with
.. code-block:: bash
snakemake --use-conda --use-singularity
Snakemake will first pull the defined container image, and then create the requested conda environment from within the container.
The conda environments will still be stored in your working environment, such that they don't have to be recreated unless they have changed.
The hash under which the environments are stored includes the URL of the container image used, such that changes to the container image also cause new environments to be created.
When a job is executed, Snakemake will first enter the container and then activate the conda environment.
By this, both packages and OS can be easily controlled without the overhead of creating and distributing specialized container images.
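The idea of a storage hash that depends on both the environment definition and the container image can be sketched as follows. This is an illustration only: ``env_hash`` is a hypothetical helper, and the order in which the image URL and the file content are fed into MD5 is an assumption, not Snakemake's actual scheme.

```python
import hashlib


def env_hash(env_definition, container_url=None):
    """Return an MD5 hex digest over the image URL (if any) and the env file content."""
    md5 = hashlib.md5()
    if container_url is not None:
        # Assumption: the URL is mixed in first, so a new image gives a new hash.
        md5.update(container_url.encode("utf-8"))
    md5.update(env_definition)
    return md5.hexdigest()


# Hypothetical content of envs/ggplot.yaml.
definition = b"channels:\n  - r\ndependencies:\n  - r-ggplot2\n"

plain = env_hash(definition)
in_container = env_hash(definition, "docker://continuumio/miniconda3:4.4.10")
print(plain, in_container)
```

Changing either the environment file or the image URL yields a different digest, so a fresh environment would be created in both cases.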
Of course, it is also possible (though less common) to define a container image per rule in this scenario.
The user can, upon execution, freely choose the desired level of reproducibility:
* no package management (use whatever is on the system)
* Conda based package management (use versions defined by the workflow developer)
* Conda based package management in containerized OS (use versions and OS defined by the workflow developer)
--------------------------------------
Sustainable and reproducible archiving
--------------------------------------
......