Commits on Source (2)
comment: off
codecov:
require_ci_to_pass: no
coverage:
precision: 1
round: down
range: "90...100"
status:
project: yes
patch: no
changes: no
comment: off
[*.{py,pyx,rst}]
charset=utf-8
end_of_line=lf
insert_final_newline=true
indent_style=space
indent_size=4
<!--
When reporting an issue, please include this information:
- Cutadapt and Python version
- How you installed the tool (conda or pip, for example)
- Which command-line parameters you used
If you report unexpected trimming behavior, this would also be helpful:
- An example input read (or read pair)
- The output that cutadapt produces
- The output that you would have expected
Feel free to delete this text before submitting your issue.
-->
# Editor-specific ignore patterns such as "*~" should be added to
# ~/.config/git/ignore, not here.
*.pyc
__pycache__/
/MANIFEST
/build/
/dist/
/.coverage
/.tox
/.cache
/.pytest_cache
/src/cutadapt/*.c
/src/cutadapt/*.so
/doc/_build
/src/cutadapt.egg-info/
src/cutadapt/_version.py
language: python
dist: xenial
cache:
directories:
- $HOME/.cache/pip
python:
- "3.5"
- "3.6"
- "3.7"
- "3.8-dev"
- "nightly"
install:
- pip install 'Cython>=0.28' tox-travis
script:
- tox
after_success:
- pip install codecov
- codecov
env:
global:
#- TWINE_REPOSITORY_URL=https://test.pypi.org/legacy/
- TWINE_USERNAME=marcelm
# TWINE_PASSWORD is set in Travis settings
jobs:
include:
- stage: deploy
services:
- docker
python: "3.6"
install: python3 -m pip install Cython twine
if: tag IS present
script:
- |
python3 setup.py sdist
./buildwheels.sh
ls -l dist/
python3 -m twine upload dist/*
- name: flake8
python: "3.6"
install: python3 -m pip install flake8
script: flake8 src/ tests/
allow_failures:
- python: "nightly"
......@@ -2,6 +2,19 @@
Changes
=======
v2.6 (2019-10-26)
-----------------
* :issue:`395`: Do not show animated progress when ``--quiet`` is used.
* :issue:`399`: When two adapters align to a read equally well (in terms
of the number of matches), prefer the alignment that has fewer errors.
* :issue:`401`: Give priority to adapters given earlier on the command
line. Previously, the priority was: All 3' adapters, all 5' adapters,
all anywhere adapters. In rare cases this could lead to different results.
* :issue:`404`: Fix an issue preventing Cutadapt from being used on Windows.
* This release no longer supports Python 3.4 (which has reached end of life).
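The selection rules described in the entries for issues 399 and 401 can be illustrated with a small sketch. It is not Cutadapt's actual code: the ``Match`` tuple and ``best_match`` function are hypothetical names, and it assumes that command-line order is only used as the final tie-breaker.

```python
from typing import NamedTuple, Optional, Sequence


class Match(NamedTuple):
    """Hypothetical summary of one adapter alignment (not a Cutadapt class)."""
    adapter_index: int  # position of the adapter on the command line (0 = first)
    matches: int        # number of matching bases in the alignment
    errors: int         # number of errors in the alignment


def best_match(candidates: Sequence[Match]) -> Optional[Match]:
    """Pick the preferred match among several adapter alignments.

    Assumed ordering: most matches first, then fewest errors, then the
    adapter that was given earliest on the command line.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda m: (-m.matches, m.errors, m.adapter_index))


# Two adapters align equally well (10 matches each); the one with fewer errors wins.
print(best_match([Match(0, 10, 2), Match(1, 10, 1)]))
```

With equal matches and equal errors, the sketch falls back to the adapter with the lower ``adapter_index``, which mirrors the "given earlier on the command line" rule.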
v2.5 (2019-09-04)
-----------------
......
Contributing
------------
Contributions to Cutadapt in the form of source code or documentation
improvements or helping out with responding to issues are welcome!
To contribute to Cutadapt development, it is easiest to send in a pull request
(PR) on GitHub.
Here are some guidelines for how to do this. They are not strict rules. When in
doubt, send in a PR and we will sort it out.
* Limit a PR to a single topic. Submit multiple PRs if necessary. This way, it
is easier to discuss the changes individually, and in case we find that one
of them should not go in, the others can still be accepted.
* For larger changes, consider opening an issue first to plan what you want to
do.
* Include appropriate unit or integration tests. Sometimes, tests are hard to
write or don’t make sense. If you think this is the case, just leave the tests
out initially and we can discuss whether to add any.
* Add documentation and a changelog entry if appropriate.
Code style
~~~~~~~~~~
* Cutadapt tries to follow PEP8, except that the allowed line length is 100
characters, not 80. But try to wrap comments after 80 characters.
* There are inconsistencies in the current code base since it’s a few years old
already. New code should follow the current rules, however.
* At the moment, no automatic code formatting is done, but one idea might be to
switch to the `black <https://black.readthedocs.io/>`_ code formatter at some
point. If you’re familiar with its style, you can use that already now for
new code to make the diff smaller.
* Prefer double quotation marks in new code. This will also make the diff smaller
if we eventually switch to black.
* Using an IDE is beneficial (PyCharm, for example). It helps to catch lots of
style issues early (unused imports, spacing etc.).
* Avoid unnecessary abbreviations for variable names. Code is more often read
than written.
* When writing a help text for a new command-line option, look at the output of
``cutadapt --help`` and try to make it look nice and short.
* In comments and documentation, capitalize FASTQ, BWA, CPU etc.
include CHANGES.rst
include CITATION
include LICENSE
include README.rst
include pyproject.toml
include doc/*.rst
include doc/conf.py
include doc/Makefile
include src/cutadapt/*.c
include src/cutadapt/*.pyx
include tests/*.py
graft tests/data
graft tests/cut
Metadata-Version: 2.1
Name: cutadapt
Version: 2.5
Summary: trim adapters from high-throughput sequencing reads
Home-page: https://cutadapt.readthedocs.io/
Author: Marcel Martin
Author-email: marcel.martin@scilifelab.se
License: MIT
Description: .. image:: https://travis-ci.org/marcelm/cutadapt.svg?branch=master
:target: https://travis-ci.org/marcelm/cutadapt
:alt:
.. image:: https://img.shields.io/pypi/v/cutadapt.svg?branch=master
:target: https://pypi.python.org/pypi/cutadapt
:alt:
.. image:: https://codecov.io/gh/marcelm/cutadapt/branch/master/graph/badge.svg
:target: https://codecov.io/gh/marcelm/cutadapt
:alt:
========
Cutadapt
========
Cutadapt finds and removes adapter sequences, primers, poly-A tails and other
types of unwanted sequence from your high-throughput sequencing reads.
Cleaning your data in this way is often required: Reads from small-RNA
sequencing contain the 3’ sequencing adapter because the read is longer than
the molecule that is sequenced. Amplicon reads start with a primer sequence.
Poly-A tails are useful for pulling out RNA from your sample, but often you
don’t want them to be in your reads.
Cutadapt helps with these trimming tasks by finding the adapter or primer
sequences in an error-tolerant way. It can also modify and filter reads in
various ways. Adapter sequences can contain IUPAC wildcard characters. Also,
paired-end reads and even colorspace data is supported. If you want, you can
also just demultiplex your input data, without removing adapter sequences at all.
Cutadapt comes with an extensive suite of automated tests and is available under
the terms of the MIT license.
If you use Cutadapt, please cite
`DOI:10.14806/ej.17.1.200 <http://dx.doi.org/10.14806/ej.17.1.200>`_ .
Links
-----
* `Documentation <https://cutadapt.readthedocs.io/>`_
* `Source code <https://github.com/marcelm/cutadapt/>`_
* `Report an issue <https://github.com/marcelm/cutadapt/issues>`_
* `Project page on PyPI (Python package index) <https://pypi.python.org/pypi/cutadapt/>`_
* `Follow @marcelm_ on Twitter <https://twitter.com/marcelm_>`_
* `Wrapper for the Galaxy platform <https://bitbucket.org/lance_parsons/cutadapt_galaxy_wrapper>`_
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Cython
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.4
Provides-Extra: dev
......@@ -10,6 +10,11 @@
:target: https://codecov.io/gh/marcelm/cutadapt
:alt:
.. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat
:target: http://bioconda.github.io/recipes/cutadapt/README.html
:alt: install with bioconda
========
Cutadapt
========
......@@ -24,13 +29,16 @@ Poly-A tails are useful for pulling out RNA from your sample, but often you
don’t want them to be in your reads.
Cutadapt helps with these trimming tasks by finding the adapter or primer
sequences in an error-tolerant way. It can also modify and filter reads in
various ways. Adapter sequences can contain IUPAC wildcard characters. Also,
paired-end reads and even colorspace data is supported. If you want, you can
also just demultiplex your input data, without removing adapter sequences at all.
sequences in an error-tolerant way. It can also modify and filter single-end
and paired-end reads in various ways. Adapter sequences can contain IUPAC
wildcard characters. Cutadapt can also demultiplex your reads.
Cutadapt is available under the terms of the MIT license.
Cutadapt comes with an extensive suite of automated tests and is available under
the terms of the MIT license.
Cutadapt development was started at `TU Dortmund University <https://www.tu-dortmund.de>`_
in the group of `Prof. Dr. Sven Rahmann <https://www.rahmannlab.de/>`_.
It is currently being developed within
`NBIS (National Bioinformatics Infrastructure Sweden) <https://nbis.se/>`_.
If you use Cutadapt, please cite
`DOI:10.14806/ej.17.1.200 <http://dx.doi.org/10.14806/ej.17.1.200>`_ .
......@@ -44,4 +52,4 @@ Links
* `Report an issue <https://github.com/marcelm/cutadapt/issues>`_
* `Project page on PyPI (Python package index) <https://pypi.python.org/pypi/cutadapt/>`_
* `Follow @marcelm_ on Twitter <https://twitter.com/marcelm_>`_
* `Wrapper for the Galaxy platform <https://bitbucket.org/lance_parsons/cutadapt_galaxy_wrapper>`_
* `Wrapper for the Galaxy platform <https://github.com/galaxyproject/tools-iuc/tree/master/tools/cutadapt>`_
#!/bin/bash
#
# Build manylinux1 wheels for cutadapt. Based on the example at
# <https://github.com/pypa/python-manylinux-demo>
#
# It is best to run this in a fresh clone of the repository!
#
# Run this within the repository root:
# docker run --rm -v $(pwd):/io quay.io/pypa/manylinux1_x86_64 /io/buildwheels.sh
#
# The repaired wheels will be put into the dist/ subdirectory.
#
# For interactive tests:
# docker run -it -v $(pwd):/io quay.io/pypa/manylinux1_x86_64 /bin/bash
set -xeuo pipefail
MANYLINUX=quay.io/pypa/manylinux2010_x86_64
# For convenience, if this script is called from outside of a docker container,
# it starts a container and runs itself inside of it.
if ! grep -q docker /proc/1/cgroup; then
# We are not inside a container
docker pull "${MANYLINUX}"
exec docker run --rm -v "$(pwd)":/io "${MANYLINUX}" "/io/$0"
fi
# Strip binaries (copied from multibuild)
STRIP_FLAGS=${STRIP_FLAGS:-"-Wl,-strip-all"}
export CFLAGS="${CFLAGS:-$STRIP_FLAGS}"
export CXXFLAGS="${CXXFLAGS:-$STRIP_FLAGS}"
# We require Python 3.5+
rm /opt/python/cp27* /opt/python/cp34*
PYBINS="/opt/python/*/bin"
HAS_CYTHON=0
for PYBIN in ${PYBINS}; do
# ${PYBIN}/pip install -r /io/requirements.txt
${PYBIN}/pip wheel /io/ -w wheelhouse/
done
# Bundle external shared libraries into the wheels
for whl in wheelhouse/cutadapt-*.whl; do
auditwheel repair "$whl" --plat manylinux1_x86_64 -w repaired/
done
# Created files are owned by root, so fix permissions.
chown -R --reference=/io/setup.py repaired/
mv repaired/*.whl /io/dist/
# TODO Install packages and test them
#for PYBIN in ${PYBINS}; do
# ${PYBIN}/pip install cutadapt --no-index -f /io/wheelhouse
# (cd $HOME; ${PYBIN}/nosetests ...)
#done
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
version="1.0"
width="500.50909"
height="365.63535"
id="svg5571">
<defs
id="defs5573" />
<metadata
id="metadata5576">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title></dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<g
transform="translate(-4.4323702,147.9297)"
id="layer1">
<rect
width="35.933102"
height="7.0866098"
x="111.386"
y="-52.720001"
id="rect6974"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#b3b3b3;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="106.299"
height="7.0866098"
x="5.0866399"
y="-52.720001"
id="rect3625"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#ffffff;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="141.73199"
height="7.0866098"
x="5.5865898"
y="-52.720001"
id="rect5585"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866096"
height="7.0866199"
x="83.385101"
y="-123.586"
id="rect6102"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#84b818;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="71.020401"
height="7.0866299"
x="111.732"
y="-66.893303"
id="rect6104"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#84b818;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866096"
height="7.0866098"
x="268.57001"
y="136.66589"
id="rect6130"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#ffffff;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866096"
height="7.0866098"
x="268.57001"
y="172.099"
id="rect6972"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#84b818;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866096"
height="7.0866199"
x="268.57001"
y="207.532"
id="rect7032"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#b3b3b3;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<text
x="353.60956"
y="214.61865"
id="text6978"
xml:space="preserve"
style="font-size:18px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Lato;-inkscape-font-specification:Lato"><tspan
x="353.60956"
y="214.61865"
id="tspan6980">Removed sequence</tspan></text>
<text
x="353.60956"
y="179.18559"
id="text6982"
xml:space="preserve"
style="font-size:18px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Lato;-inkscape-font-specification:Lato"><tspan
x="353.60956"
y="179.18559"
id="tspan6984">Adapter</tspan></text>
<text
x="353.60956"
y="143.75253"
id="text6986"
xml:space="preserve"
style="font-size:18px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Lato;-inkscape-font-specification:Lato"><tspan
x="353.60956"
y="143.75253"
id="tspan6988">Read </tspan></text>
<rect
width="70.866096"
height="7.0866098"
x="4.9323802"
y="193.35901"
id="rect5587"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#84b818;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866096"
height="7.0866098"
x="4.9324002"
y="207.532"
id="rect7030"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#b3b3b3;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="141.73199"
height="7.0866098"
x="4.9324002"
y="207.532"
id="rect6976"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="99.712601"
height="7.0866199"
x="82.885101"
y="-109.413"
id="rect7028"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#b3b3b3;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="77.952797"
height="7.0866199"
x="4.9324002"
y="-109.413"
id="rect3627"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#ffffff;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="177.16499"
height="7.0866098"
x="5.4323401"
y="-109.413"
id="rect7199"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866096"
height="7.0865698"
x="4.9323702"
y="24.8864"
id="rect6128"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#84b818;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866096"
height="7.0865698"
x="4.9323902"
y="81.5793"
id="rect6114"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#84b818;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866203"
height="7.0866299"
x="4.9323702"
y="39.059551"
id="rect7058"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#b3b3b3;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.366096"
height="7.0866299"
x="-146.16499"
y="39.059551"
transform="scale(-1,1)"
id="rect3629"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#ffffff;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="141.73199"
height="7.0866098"
x="4.9323702"
y="39.059551"
id="rect6132"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="35.433102"
height="7.0866199"
x="40.365501"
y="95.752502"
id="rect7056"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#b3b3b3;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="70.866203"
height="7.0866199"
x="75.2985"
y="95.752502"
id="rect3631"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#ffffff;fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:3;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<rect
width="106.299"
height="7.0866098"
x="40.365501"
y="95.752502"
id="rect6134"
style="font-size:12px;font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:none;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0;marker:none;visibility:visible;display:inline;overflow:visible;enable-background:accumulate;font-family:Times New Roman;-inkscape-font-specification:'Times New Roman, Bold'" />
<text
x="4.9323802"
y="10.713129"
id="text3333"
xml:space="preserve"
style="font-size:18px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Lato;-inkscape-font-specification:Lato"><tspan
x="4.9323802"
y="10.713129"
id="tspan3335">5' Adapter</tspan></text>
<text
x="4.9323802"
y="-130.6727"
id="text3337"
xml:space="preserve"
style="font-size:18px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Lato;-inkscape-font-specification:Lato"><tspan
x="4.9323802"
y="-130.6727"
id="tspan3339">3' Adapter</tspan></text>
<text
x="4.9323802"
y="179.18558"
id="text3341"
xml:space="preserve"
style="font-size:18px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Lato;-inkscape-font-specification:Lato"><tspan
x="4.9323802"
y="179.18558"
id="tspan3343">Anchored 5' adapter</tspan></text>
<text
x="40.865387"
y="-81.066414"
id="text3349"
xml:space="preserve"
style="font-size:13.63599968px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Lato;-inkscape-font-specification:Lato"><tspan
x="40.865387"
y="-81.066414"
id="tspan3351">or</tspan></text>
<text
x="40.365467"
y="67.405998"
id="text3353"
xml:space="preserve"
style="font-size:13.63599968px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Lato;-inkscape-font-specification:Lato"><tspan
x="40.365467"
y="67.405998"
id="tspan3355">or</tspan></text>
</g>
</svg>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="56.122009"
height="51.8545"
id="svg3076"
version="1.1"
inkscape:version="0.48.5 r10040"
sodipodi:docname="Cutadapt logo">
<defs
id="defs3078" />
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="2.6502665"
inkscape:cx="41.639266"
inkscape:cy="34.486602"
inkscape:document-units="px"
inkscape:current-layer="layer1"
showgrid="false"
fit-margin-top="2"
fit-margin-left="2"
fit-margin-right="2"
fit-margin-bottom="2"
inkscape:window-width="1305"
inkscape:window-height="763"
inkscape:window-x="-4"
inkscape:window-y="56"
inkscape:window-maximized="0" />
<metadata
id="metadata3081">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title></dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-346.939,-506.43493)">
<g
transform="translate(44.935994,179.79303)"
style="display:inline"
id="g4093"
inkscape:export-filename="cutadapt.png"
inkscape:export-xdpi="276"
inkscape:export-ydpi="276">
<path
inkscape:connector-curvature="0"
id="path4068"
transform="translate(0,308.2677)"
d="m 349.625,34.1875 -7.78125,3.625 c 0.91994,1.970873 1.4375,4.181472 1.4375,6.5 0,2.318528 -0.51756,4.497877 -1.4375,6.46875 l 7.78125,3.625 c 2.98923,-6.41042 2.98923,-13.808331 0,-20.21875 z"
style="fill:#aad400;fill-opacity:1;display:inline" />
<path
inkscape:connector-curvature="0"
id="path4066"
transform="translate(0,308.2677)"
d="m 328.15625,20.375 c -6.89497,-0.05633 -13.78424,2.867991 -18.5625,8.5625 -8.49469,10.123572 -7.15482,25.192814 2.96875,33.6875 10.12357,8.494686 25.19281,7.186072 33.6875,-2.9375 l -6.40625,-5.375 c -2.85492,3.3998 -7.11936,5.5625 -11.90625,5.5625 -8.59538,0 -15.5625,-6.96712 -15.5625,-15.5625 0,-8.59538 6.96712,-15.5625 15.5625,-15.5625 4.78689,0 9.05133,2.1627 11.90625,5.5625 l 6.40625,-5.375 c -0.8954,-1.067094 -1.87041,-2.073352 -2.9375,-2.96875 -4.42906,-3.716425 -9.7935,-5.54994 -15.15625,-5.59375 z"
style="fill:#217821;fill-opacity:1;display:inline" />
<path
sodipodi:nodetypes="cccccc"
inkscape:connector-curvature="0"
id="path4072"
transform="translate(0,308.2677)"
d="m 353.4375,25.09375 -15.28125,11.5 0.0497,1.108915 1.04406,0.578585 16.875,-8.96875 c -0.78525,-1.47685 -1.68088,-2.882924 -2.6875,-4.21875 z"
style="fill:#217821;fill-opacity:1;display:inline" />
<path
sodipodi:nodetypes="cccccc"
inkscape:connector-curvature="0"
id="path4074"
transform="translate(0,308.2677)"
d="m 339.25,50.3125 -1.04688,0.4375 -0.0469,1.25 15.28125,11.53125 c 1.00662,-1.335826 1.90224,-2.77315 2.6875,-4.25 z"
style="fill:#217821;fill-opacity:1;display:inline" />
</g>
</g>
</svg>
......@@ -20,9 +20,8 @@ Compressed in- and output files are also supported::
Cutadapt searches for the adapter in all reads and removes it when it finds it.
Unless you use a filtering option, all reads that were present in the input file
will also be present in the output file, some of them trimmed, some of them not.
Even reads that were trimmed entirely (because the adapter was found in the very
beginning) are output. All of this can be changed with command-line options,
explained further down.
Even reads that were trimmed to a length of zero are output. All of this can be
changed with command-line options, explained further down.
:ref:`Trimming of paired-end data <paired-end>` is also supported.
......@@ -30,30 +29,18 @@ explained further down.
Input and output file formats
-----------------------------
Input and output files need to be in FASTA or FASTQ format. Reading and writing
compressed file formats ``.gz``, ``.bz2`` or ``.xz`` is also supported. Cutadapt
uses ``pigz`` internally if possible to speed up writing and reading of
gzipped files.
The supported input and output file formats are FASTA and FASTQ, with
optional compression.
The input file format is recognized from the file name extension. If the
extension was not recognized or when Cutadapt reads from standard input,
the contents are inspected instead.
The output file format is also recognized from the file name extension. If the
extension was not recognized or when Cutadapt writes to standard input, the
extension was not recognized or when Cutadapt writes to standard output, the
same format as the input is used for the output.
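The fallback rule described above can be sketched in a few lines of Python. This is only an illustration and is not taken from Cutadapt's source: the function names ``guess_format`` and ``output_format`` and the list of recognized extensions are assumptions.

```python
from typing import Optional

# Hypothetical helper for illustration; the suffix lists are assumptions,
# not the set of extensions Cutadapt actually recognizes.
COMPRESSION_SUFFIXES = (".gz", ".bz2", ".xz")
FORMAT_BY_SUFFIX = {
    ".fasta": "fasta",
    ".fa": "fasta",
    ".fastq": "fastq",
    ".fq": "fastq",
}


def guess_format(file_name: str) -> Optional[str]:
    """Return 'fasta' or 'fastq' based on the file name, or None if unknown."""
    name = file_name.lower()
    # Strip one compression suffix so that "reads.fastq.gz" is seen as "reads.fastq".
    for suffix in COMPRESSION_SUFFIXES:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    for suffix, fmt in FORMAT_BY_SUFFIX.items():
        if name.endswith(suffix):
            return fmt
    return None


def output_format(output_name: Optional[str], input_format: str) -> str:
    """Pick the output format, falling back to the input format."""
    if output_name is None:
        # Writing to standard output: there is no file name to inspect.
        return input_format
    return guess_format(output_name) or input_format
```

With these definitions, ``output_format("output.fasta.gz", "fastq")`` selects FASTA, while an unrecognized name such as ``reads.txt`` or a missing name keeps the input format.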
You can use this to convert from FASTQ to FASTA (without doing any adapter
trimming)::
cutadapt -o output.fasta.gz input.fastq.gz
When you want to do the same (read FASTQ, write FASTA), but want to write to
standard output, you need to use ``--fasta`` instead because there is no
output file name::
cutadapt --fasta input.fastq.gz > out.fasta
See also :ref:`file format conversion <file-format-conversion>`.
.. _compressed-files:
......@@ -76,13 +63,15 @@ The default compression level for gzip output is 6. Use option ``-Z`` to
change this to level 1. The files need more space, but it is faster and
therefore a good choice for short-lived intermediate files.
If available, Cutadapt uses `pigz <https://zlib.net/pigz/>`_ to speed up
writing and reading of gzipped files.
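To get a feeling for the trade-off between compression levels 1 and 6, the following sketch compresses the same data at both levels with Python's standard ``gzip`` module. The FASTQ-like record is made up for illustration, and the resulting sizes depend entirely on the data, so no particular ratio should be expected.

```python
import gzip

# Made-up FASTQ-like record, repeated to get a reasonably large buffer.
record = b"@read1\nACGTACGTACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIIIIIII\n"
data = record * 10000

# Level 1 favors speed, level 6 is the gzip default.
fast = gzip.compress(data, compresslevel=1)
default = gzip.compress(data, compresslevel=6)

print("uncompressed:", len(data))
print("level 1:     ", len(fast))
print("level 6:     ", len(default))
```

Both results decompress to the same bytes; only the file size and the CPU time spent differ.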
Standard input and output
-------------------------
If no output file is specified via the ``-o`` option, then the output is sent to
the standard output stream. Instead of the example command line from above, you
can therefore also write::
the standard output stream. Example::
cutadapt -a AACCGGTT input.fastq > output.fastq
......@@ -142,17 +131,15 @@ the output will be done in a single thread and therefore be a bottleneck.
There are some limitations at the moment:
* Multi-core Cutadapt can only write to output files given by ``-o`` and ``-p``.
This implies that the following command-line arguments are not compatible with
* The following command-line arguments are not compatible with
multi-core:
- ``--info-file``
- ``--rest-file``
- ``--wildcard-file``
- ``--format``
* Multi-core is also not compatible with ``--format``
* Multi-core is also not available when you use Cutadapt for demultiplexing.
* Multi-core is not available when you use Cutadapt for demultiplexing.
If you try to use multiple cores with an incompatible command-line option, you
will get an error message.
......@@ -171,7 +158,7 @@ Some of these limitations will be lifted in the future, as time allows.
Speed-up tricks
---------------
There are several tricks for limiting wall-clock time while using cutadapt.
There are several tricks for limiting wall-clock time while using Cutadapt.
``-Z`` (alternatively ``--compression-level=1``) can be used to limit the
amount of CPU time which is spent on the compression of output files.
......@@ -1567,7 +1554,7 @@ to use ``{name1}}`` and ``{name2}`` in both output file name templates. For exam
-e 0.15 --no-indels \
-g file:barcodes_fwd.fasta \
-G file:barcodes_rev.fasta \
-o trimmed-{name1}-{name2}.1.fastq.gz -p trimmed-{name1}-{name2}.2.fastq.gz \
-o {name1}-{name2}.1.fastq.gz -p {name1}-{name2}.2.fastq.gz \
input.1.fastq.gz input.2.fastq.gz
The ``{name1}`` will be replaced with the name of the best-matching R1 adapter and ``{name2}`` will
......@@ -1583,6 +1570,26 @@ Read the :ref:`demultiplexing <demultiplexing>` section for how to choose the er
Also, the tips below about how to speed up demultiplexing apply even with combinatorial
demultiplexing.
When doing the above, you will end up with lots of files named ``first-second.x.fastq.gz``, where
*first* is the name of the first indexed adapter and *second* is the name of the second indexed
adapter, and *x* is 1 or 2. Each indexed adapter combination may correspond to a sample name and
you may want to name your files according to the sample name, not the name of the adapters.
Cutadapt does not have built-in functionality to achieve this, but you can use an external
tool such as ``mmv`` (“multiple move”). First, create a list of patterns in ``patterns.txt``::
fwdindex1-revindex1.[12].fastq.gz sampleA.#1.fastq.gz
fwdindex1-revindex2.[12].fastq.gz sampleB.#1.fastq.gz
fwdindex1-revindex3.[12].fastq.gz sampleC.#1.fastq.gz
fwdindex2-revindex1.[12].fastq.gz sampleD.#1.fastq.gz
fwdindex2-revindex2.[12].fastq.gz sampleE.#1.fastq.gz
...
Here, *fwdindex1*/*revindex1* etc. are the names of indexes, and *sampleA* etc.
are your sample names. Then rename all files at once with ::
mmv < patterns.txt
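If ``mmv`` is not available, the same renaming can be done with a short Python script. The sketch below is an assumption-laden illustration, not part of Cutadapt: the function ``rename_demultiplexed`` and the ``SAMPLES`` mapping are hypothetical and need to be adapted to your own adapter and sample names.

```python
import os

# Hypothetical mapping from "forward-reverse" adapter name combinations
# to sample names; replace with your own.
SAMPLES = {
    "fwdindex1-revindex1": "sampleA",
    "fwdindex1-revindex2": "sampleB",
}


def rename_demultiplexed(directory, samples):
    """Rename <combination>.<1|2>.fastq.gz to <sample>.<1|2>.fastq.gz.

    Files that do not exist are skipped. Returns a list of (old, new) paths.
    """
    renamed = []
    for combination, sample in samples.items():
        for read in ("1", "2"):
            old = os.path.join(directory, "{}.{}.fastq.gz".format(combination, read))
            new = os.path.join(directory, "{}.{}.fastq.gz".format(sample, read))
            if os.path.exists(old):
                os.rename(old, new)
                renamed.append((old, new))
    return renamed
```

Calling ``rename_demultiplexed(".", SAMPLES)`` in the output directory renames both files of each pair and leaves all other files untouched.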
.. versionadded:: 2.4
......
......@@ -274,10 +274,28 @@ By prefixing the adapter sequence with ``NN``, the bases will be automatically
stripped during adapter trimming.
.. _file-format-conversion:
File format conversion
----------------------
You can use Cutadapt to convert FASTQ to FASTA format::
cutadapt -o output.fasta.gz input.fastq.gz
Cutadapt detects that the file name extension of the output file is ``.fasta``
and writes in FASTA format, omitting the qualities.
When writing to standard output, you need to use the ``--fasta`` option::
cutadapt --fasta input.fastq.gz > out.fasta
Without the option, Cutadapt writes in FASTQ format.
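What the conversion amounts to can be shown with a minimal pure-Python sketch. It is not how Cutadapt implements it; the function name ``fastq_to_fasta`` is hypothetical, and the code assumes well-formed FASTQ with exactly four lines per record and no trailing blank lines.

```python
import io


def fastq_to_fasta(fastq, fasta):
    """Copy records from a FASTQ text stream to a FASTA text stream.

    Qualities are dropped. Assumes four lines per record.
    Returns the number of records written.
    """
    count = 0
    while True:
        header = fastq.readline()
        if not header:
            break  # end of input
        sequence = fastq.readline()
        fastq.readline()  # '+' separator line, ignored
        fastq.readline()  # quality line, omitted in FASTA
        if not header.startswith("@"):
            raise ValueError("Expected '@' at start of FASTQ record")
        fasta.write(">" + header[1:].rstrip("\n") + "\n")
        fasta.write(sequence.rstrip("\n") + "\n")
        count += 1
    return count


# Usage with in-memory streams; real files opened in text mode work the same way.
source = io.StringIO("@read1\nACGT\n+\nIIII\n")
target = io.StringIO()
fastq_to_fasta(source, target)
print(target.getvalue(), end="")
```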
Other things (unfinished)
-------------------------
* How to detect adapters
* Use Cutadapt for quality-trimming only
* Use it for minimum/maximum length filtering
* Use it for conversion to FASTQ
sphinx_issues
cython
setuptools_scm
......@@ -11,8 +11,8 @@ from distutils.command.build_ext import build_ext as _build_ext
MIN_CYTHON_VERSION = '0.28'
if sys.version_info[:2] < (3, 4):
sys.stdout.write('You need at least Python 3.4\n')
if sys.version_info[:2] < (3, 5):
sys.stdout.write('You need at least Python 3.5\n')
sys.exit(1)
......@@ -83,7 +83,7 @@ class SDist(_sdist):
super().run()
encoding_arg = {'encoding': 'utf-8'} if sys.version > '3' else dict()
encoding_arg = {'encoding': 'utf-8'} if sys.version_info[0] >= 3 else dict()
with open('README.rst', **encoding_arg) as f:
long_description = f.read()
......@@ -102,11 +102,14 @@ setup(
package_dir={'': 'src'},
packages=find_packages('src'),
entry_points={'console_scripts': ['cutadapt = cutadapt.__main__:main']},
install_requires=['dnaio~=0.3.0', 'xopen~=0.8.1'],
install_requires=[
'dnaio~=0.4.0',
'xopen~=0.8.4',
],
extras_require={
'dev': ['Cython', 'pytest', 'pytest-timeout', 'sphinx', 'sphinx_issues'],
},
python_requires='>=3.4',
python_requires='>=3.5',
classifiers=[
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
......
__all__ = ["__version__"]
from ._version import version as __version__
......@@ -55,15 +55,16 @@ See https://cutadapt.readthedocs.io/ for full documentation.
"""
import sys
import errno
import time
from argparse import ArgumentParser, SUPPRESS, HelpFormatter
import logging
import platform
from argparse import ArgumentParser, SUPPRESS, HelpFormatter
from xopen import xopen
import dnaio
from cutadapt import __version__
from cutadapt.adapters import warn_duplicate_adapters
from cutadapt.parser import AdapterParser
from cutadapt.modifiers import (LengthTagModifier, SuffixRemover, PrefixSuffixAdder,
ZeroCapper, QualityTrimmer, UnconditionalCutter, NEndTrimmer, AdapterCutter,
......@@ -71,7 +72,7 @@ from cutadapt.modifiers import (LengthTagModifier, SuffixRemover, PrefixSuffixAd
from cutadapt.report import full_report, minimal_report
from cutadapt.pipeline import (SingleEndPipeline, PairedEndPipeline, InputFiles, OutputFiles,
SerialPipelineRunner, ParallelPipelineRunner)
from cutadapt.utils import available_cpu_count
from cutadapt.utils import available_cpu_count, Progress, DummyProgress
from cutadapt.log import setup_logging, REPORT
logger = logging.getLogger()
......@@ -109,6 +110,7 @@ class CommandLineError(Exception):
def get_argument_parser():
# noqa: E131
parser = CutadaptArgumentParser(usage=__doc__, add_help=False)
group = parser.add_argument_group("Options")
group.add_argument("-h", "--help", action="help", help="Show this help message and exit")
......@@ -140,19 +142,21 @@ def get_argument_parser():
"trimmed (but see the --times option). When the special notation "
"'file:FILE' is used, adapter sequences are read from the given "
"FASTA file.")
group.add_argument("-a", "--adapter", action="append", default=[], metavar="ADAPTER",
dest="adapters",
group.add_argument("-a", "--adapter", type=lambda x: ("back", x), action="append",
default=[], metavar="ADAPTER", dest="adapters",
help="Sequence of an adapter ligated to the 3' end (paired data: of the "
"first read). The adapter and subsequent bases are trimmed. If a "
"'$' character is appended ('anchoring'), the adapter is only "
"found if it is a suffix of the read.")
group.add_argument("-g", "--front", action="append", default=[], metavar="ADAPTER",
group.add_argument("-g", "--front", type=lambda x: ("front", x), action="append",
default=[], metavar="ADAPTER", dest="adapters",
help="Sequence of an adapter ligated to the 5' end (paired data: of the "
"first read). The adapter and any preceding bases are trimmed. "
"Partial matches at the 5' end are allowed. If a '^' character is "
"prepended ('anchoring'), the adapter is only found if it is a "
"prefix of the read.")
group.add_argument("-b", "--anywhere", action="append", default=[], metavar="ADAPTER",
group.add_argument("-b", "--anywhere", type=lambda x: ("anywhere", x), action="append",
default=[], metavar="ADAPTER", dest="adapters",
help="Sequence of an adapter that may be ligated to the 5' or 3' end "
"(paired data: of the first read). Both types of matches as "
"described under -a and -g are allowed. If the first base of the "
......@@ -283,11 +287,14 @@ def get_argument_parser():
group = parser.add_argument_group("Paired-end options", description="The "
"-A/-G/-B/-U options work like their -a/-g/-b/-u counterparts, but "
"are applied to the second read in each pair.")
group.add_argument("-A", dest='adapters2', action='append', default=[], metavar='ADAPTER',
group.add_argument("-A", type=lambda x: ("back", x), dest='adapters2',
action='append', default=[], metavar='ADAPTER',
help="3' adapter to be removed from second read in a pair.")
group.add_argument("-G", dest='front2', action='append', default=[], metavar='ADAPTER',
group.add_argument("-G", type=lambda x: ("front", x), dest='adapters2',
action='append', default=[], metavar='ADAPTER',
help="5' adapter to be removed from second read in a pair.")
group.add_argument("-B", dest='anywhere2', action='append', default=[], metavar='ADAPTER',
group.add_argument("-B", type=lambda x: ("anywhere", x), dest='adapters2',
action='append', default=[], metavar='ADAPTER',
help="5'/3' adapter to be removed from second read in a pair.")
group.add_argument("-U", dest='cut2', action='append', default=[], type=int, metavar="LENGTH",
help="Remove LENGTH bases from second read in a pair.")
......@@ -330,19 +337,21 @@ def get_argument_parser():
def parse_cutoffs(s):
"""Parse a string INT[,INT] into a two-element list of integers"""
cutoffs = s.split(',')
"""Parse a string INT[,INT] into a two-element list of integers
>>> parse_cutoffs("5")
[0, 5]
>>> parse_cutoffs("6,7")
[6, 7]
"""
try:
cutoffs = [int(value) for value in s.split(",")]
except ValueError as e:
raise CommandLineError("Quality cutoff value not recognized: {}".format(e))
if len(cutoffs) == 1:
try:
cutoffs = [0, int(cutoffs[0])]
except ValueError as e:
raise CommandLineError("Quality cutoff value not recognized: {}".format(e))
elif len(cutoffs) == 2:
try:
cutoffs = [int(cutoffs[0]), int(cutoffs[1])]
except ValueError as e:
raise CommandLineError("Quality cutoff value not recognized: {}".format(e))
else:
cutoffs = [0, cutoffs[0]]
elif len(cutoffs) != 2:
raise CommandLineError("Expected one value or two values separated by comma for "
"the quality cutoff")
return cutoffs
......@@ -378,13 +387,12 @@ def open_output_files(args, default_outfile, interleaved):
attributes are not opened files, but paths (out and out2 with the '{name}' template).
"""
compression_level = args.compression_level
rest_file = info_file = wildcard = None
if args.rest_file is not None:
rest_file = xopen(args.rest_file, 'w', compresslevel=compression_level)
if args.info_file is not None:
info_file = xopen(args.info_file, 'w', compresslevel=compression_level)
if args.wildcard_file is not None:
wildcard = xopen(args.wildcard_file, 'w', compresslevel=compression_level)
def open1(path):
"""Return opened file (or None if path is None)"""
if path is None:
return None
return xopen(path, "w", compresslevel=compression_level)
def open2(path1, path2):
file1 = file2 = None
......@@ -394,6 +402,10 @@ def open_output_files(args, default_outfile, interleaved):
file2 = xopen(path2, 'wb', compresslevel=compression_level)
return file1, file2
rest_file = open1(args.rest_file)
info_file = open1(args.info_file)
wildcard = open1(args.wildcard_file)
too_short = too_short2 = None
if args.minimum_length is not None:
too_short, too_short2 = open2(args.too_short_output, args.too_short_paired_output)
......@@ -490,8 +502,6 @@ def determine_paired_mode(args):
args.paired_output
or args.interleaved
or args.adapters2
or args.front2
or args.anywhere2
or args.cut2
or args.pair_filter
or args.too_short_paired_output
......@@ -549,7 +559,11 @@ def check_arguments(args, paired, is_interleaved_output):
if not paired:
if args.untrimmed_paired_output:
raise CommandLineError("Option --untrimmed-paired-output can only be used when "
"trimming paired-end reads (with option -p).")
"trimming paired-end reads.")
if args.pair_adapters:
raise CommandLineError("Option --pair-adapters can only be used when trimming "
"paired-end reads")
if paired:
if not is_interleaved_output:
......@@ -581,6 +595,9 @@ def check_arguments(args, paired, is_interleaved_output):
if not (0 <= args.gc_content <= 100):
raise CommandLineError("GC content must be given as percentage between 0 and 100")
if args.pair_adapters and args.times != 1:
raise CommandLineError("--pair-adapters cannot be used with --times")
def pipeline_from_parsed_args(args, paired, is_interleaved_output):
"""
......@@ -602,14 +619,12 @@ def pipeline_from_parsed_args(args, paired, is_interleaved_output):
indels=args.indels,
)
try:
adapters = adapter_parser.parse_multi(args.adapters, args.anywhere, args.front)
adapters2 = adapter_parser.parse_multi(args.adapters2, args.anywhere2, args.front2)
except IOError as e:
if e.errno == errno.ENOENT:
raise CommandLineError(e)
raise
except ValueError as e:
adapters = adapter_parser.parse_multi(args.adapters)
adapters2 = adapter_parser.parse_multi(args.adapters2)
except (FileNotFoundError, ValueError) as e:
raise CommandLineError(e)
warn_duplicate_adapters(adapters)
warn_duplicate_adapters(adapters2)
if args.debug:
for adapter in adapters + adapters2:
adapter.enable_debug()
......@@ -627,26 +642,7 @@ def pipeline_from_parsed_args(args, paired, is_interleaved_output):
args.discard_untrimmed or args.untrimmed_output or args.untrimmed_paired_output):
pipeline.override_untrimmed_pair_filter = True
for i, cut_arg in enumerate([args.cut, args.cut2]):
# cut_arg is a list
if not cut_arg:
continue
if len(cut_arg) > 2:
raise CommandLineError("You cannot remove bases from more than two ends.")
if len(cut_arg) == 2 and cut_arg[0] * cut_arg[1] > 0:
raise CommandLineError("You cannot remove bases from the same end twice.")
for c in cut_arg:
if c == 0:
continue
if i == 0: # R1
if paired:
pipeline.add(UnconditionalCutter(c), None)
else:
pipeline.add(UnconditionalCutter(c))
else:
# R2
assert isinstance(pipeline, PairedEndPipeline)
pipeline.add(None, UnconditionalCutter(c))
add_unconditional_cutters(pipeline, args.cut, args.cut2, paired)
pipeline_add = pipeline.add_both if paired else pipeline.add
......@@ -657,11 +653,6 @@ def pipeline_from_parsed_args(args, paired, is_interleaved_output):
pipeline_add(QualityTrimmer(cutoffs[0], cutoffs[1], args.quality_base))
if args.pair_adapters:
if not paired:
raise CommandLineError("Option --pair-adapters can only be used when trimming "
"paired-end reads")
if args.times != 1:
raise CommandLineError("--pair-adapters cannot be used with --times")
try:
cutter = PairedAdapterCutter(adapters, adapters2, args.action)
except PairedAdapterCutterError as e:
......@@ -680,19 +671,8 @@ def pipeline_from_parsed_args(args, paired, is_interleaved_output):
if adapter_cutter:
pipeline.add(adapter_cutter)
# Remaining modifiers that apply to both reads of paired-end reads
if args.length is not None:
pipeline_add(Shortener(args.length))
if args.trim_n:
pipeline_add(NEndTrimmer())
if args.length_tag:
pipeline_add(LengthTagModifier(args.length_tag))
for suffix in args.strip_suffix:
pipeline_add(SuffixRemover(suffix))
if args.prefix or args.suffix:
pipeline_add(PrefixSuffixAdder(args.prefix, args.suffix))
if args.zero_cap:
pipeline_add(ZeroCapper(quality_base=args.quality_base))
for modifier in modifiers_applying_to_both_ends_if_paired(args):
pipeline_add(modifier)
# Set filtering parameters
# Minimum/maximum length
......@@ -713,6 +693,44 @@ def pipeline_from_parsed_args(args, paired, is_interleaved_output):
return pipeline
def add_unconditional_cutters(pipeline, cut1, cut2, paired):
for i, cut_arg in enumerate([cut1, cut2]):
# cut_arg is a list
if not cut_arg:
continue
if len(cut_arg) > 2:
raise CommandLineError("You cannot remove bases from more than two ends.")
if len(cut_arg) == 2 and cut_arg[0] * cut_arg[1] > 0:
raise CommandLineError("You cannot remove bases from the same end twice.")
for c in cut_arg:
if c == 0:
continue
if i == 0: # R1
if paired:
pipeline.add(UnconditionalCutter(c), None)
else:
pipeline.add(UnconditionalCutter(c))
else:
# R2
assert isinstance(pipeline, PairedEndPipeline)
pipeline.add(None, UnconditionalCutter(c))
def modifiers_applying_to_both_ends_if_paired(args):
if args.length is not None:
yield Shortener(args.length)
if args.trim_n:
yield NEndTrimmer()
if args.length_tag:
yield LengthTagModifier(args.length_tag)
for suffix in args.strip_suffix:
yield SuffixRemover(suffix)
if args.prefix or args.suffix:
yield PrefixSuffixAdder(args.prefix, args.suffix)
if args.zero_cap:
yield ZeroCapper(quality_base=args.quality_base)
def log_header(cmdlineargs):
"""Print the "This is cutadapt ..." header"""
......@@ -769,7 +787,7 @@ def main(cmdlineargs=None, default_outfile=sys.stdout.buffer):
pipeline = pipeline_from_parsed_args(args, paired, is_interleaved_output)
outfiles = open_output_files(args, default_outfile, is_interleaved_output)
except CommandLineError as e:
parser.error(e)
parser.error(str(e))
return # avoid IDE warnings below
if args.cores < 0:
......@@ -792,8 +810,12 @@ def main(cmdlineargs=None, default_outfile=sys.stdout.buffer):
runner_kwargs = dict()
infiles = InputFiles(input_filename, file2=input_paired_filename,
interleaved=is_interleaved_input)
if sys.stderr.isatty() and not args.quiet:
progress = Progress()
else:
progress = DummyProgress()
try:
runner = runner_class(pipeline, infiles, outfiles, **runner_kwargs)
runner = runner_class(pipeline, infiles, outfiles, progress, **runner_kwargs)
except (dnaio.UnknownFileFormat, IOError) as e:
parser.error(e)
return # avoid IDE warnings below
......@@ -807,10 +829,8 @@ def main(cmdlineargs=None, default_outfile=sys.stdout.buffer):
except KeyboardInterrupt:
print("Interrupted", file=sys.stderr)
sys.exit(130)
except IOError as e:
if e.errno == errno.EPIPE:
sys.exit(1)
raise
except BrokenPipeError:
sys.exit(1)
except (dnaio.FileFormatError, dnaio.UnknownFileFormat, EOFError) as e:
sys.exit("cutadapt: error: {}".format(e))
......