Commits on Source (4)
......@@ -18,3 +18,4 @@ pybedtools/featurefuncs.cpp
*.bak
cythonize.dat
docs/source/autodocs/*.rst
MANIFEST
# os:
# - linux
# - osx
# make edits to this line to trigger a new travis build.
# (sometimes it errors out trying to download from PyPI)
language: python
......@@ -7,9 +11,16 @@ sudo: false
python:
- "2.7"
#- "3.3"
- "3.4"
- "3.5"
- "3.6"
# see https://github.com/travis-ci/travis-ci/issues/9815 for py3.7 support
matrix:
include:
- python: 3.7
dist: xenial
sudo: true
notifications:
email:
......@@ -39,9 +50,5 @@ install:
- conda update -q conda
- conda info -a
# Base env only needs to cythonize sources; test script takes care of
# everything else.
- conda install cython
script:
- ./condatest.sh "$TRAVIS_PYTHON_VERSION"
include src/*
recursive-include pybedtools/include/ *
include README.rst
include LICENSE.txt
include ez_setup.py
recursive-include docs/source *.rst
recursive-include docs/source *.py
recursive-include docs/source/images *
recursive-include docs/source/_templates *
recursive-include pybedtools/test/data *
recursive-include pybedtools/test *
include docs/Makefile
include docs/make.bat
recursive-include pybedtools *.cxx
recursive-include pybedtools *.cpp
recursive-include pybedtools *.c
recursive-exclude * __pycache__
recursive-exclude * *.py[co]
#!/bin/bash
# Installs pybedtools and requirements into a fresh Python 2 or 3 environment
# and runs tests.
#
# Note that this script needs to be called from an environment with Cython
# since this does a clean/sdist operation which will Cythonize the source
set -e
PY_VERSION=$1
......@@ -13,56 +7,95 @@ PY_VERSION=$1
usage="Usage: $0 py_version[2|3]"
: ${PY_VERSION:?$usage}
log () {
echo
echo "[`date`] TEST HARNESS: $1"
echo
}
log "removing existing env pbtpy${PY_VERSION}"
name=pbtpy${PY_VERSION}
conda env list | grep -q $name && conda env remove -y -n $name
log "starting with basic environment"
conda create -y -n $name --channel bioconda python=${PY_VERSION} \
bedtools \
"htslib<1.4" \
ucsc-bedgraphtobigwig \
ucsc-bigwigtobedgraph
source activate $name
log "temporarily install cython"
conda install cython
log "force re-cythonizing"
rm -rf dist build
python setup.py clean
python setup.py build
python setup.py sdist
log "uninstall cython"
conda remove cython
log "test installation of sdist"
set -x
(cd dist && pip install pybedtools-*.tar.gz && python -c 'import pybedtools')
set +x
python setup.py clean
log "install test requirements"
# ----------------------------------------------------------------------------
# sdist and pip install tests
# ----------------------------------------------------------------------------
# Build an environment with just Python and Cython. We do this fresh each time.
log "building fresh environment with just python and cython"
with_cy="pbtpy${PY_VERSION}_sdist_cython"
if conda env list | grep -q $with_cy; then
conda env remove -y -n $with_cy
fi
conda create -n $with_cy -y --channel conda-forge --channel bioconda python=${PY_VERSION} cython
source activate $with_cy
# Clone the repo -- so we're only catching things committed to git -- into
# a temp dir
log "cloning into temp dir"
HERE="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
TMP=/tmp/pybedtools-deploy
rm -rf $TMP
git clone $HERE $TMP
cd $TMP
log "cythonizing source files and building source package"
# Cythonize the .pyx files to .cpp, and build a source package
python setup.py clean cythonize sdist
log "installing source package with pip"
# Install into the environment to verify that everything works (just an import
# test)
(cd dist && pip install pybedtools-*.tar.gz && python -c 'import pybedtools; print(pybedtools.__file__)')
# ----------------------------------------------------------------------------
# Unit tests
# ----------------------------------------------------------------------------
# Deactivate that env, and build another one with all requirements that we'll
# use for unit tests.
source deactivate
conda env list | grep -q $name && conda env remove -y -n $name
conda create -y -n $name --channel bioconda python=${PY_VERSION} \
--file "requirements.txt" \
--file "test-requirements.txt" \
--file "optional-requirements.txt"
source activate $name
log "install pybedtools from setup.py in develop mode to trigger re-cythonizing"
python setup.py develop
log "run tests"
nosetests
(cd docs && make clean && make doctest)
no_cy="pbtpy${PY_VERSION}_conda_no_cython"
if ! conda env list | grep -q $no_cy; then
log "creating environment"
# pysam not available from bioconda for py37 so remove it from
# requirements.
TMPREQS=$(tempfile)
grep -v pysam requirements.txt > $TMPREQS
if [[ "$PY_VERSION" == "3.7" ]]; then
REQS=$TMPREQS
else
REQS=requirements.txt
fi
conda create -n $no_cy -y \
--channel conda-forge \
--channel bioconda \
python=${PY_VERSION} \
--file $REQS \
--file test-requirements.txt \
--file optional-requirements.txt
else
echo "Using existing environment '${no_cy}'"
fi
source activate $no_cy
log "unpacking source package and install with pip install -e into $no_cy env"
mkdir -p /tmp/pybedtools-uncompressed
cd /tmp/pybedtools-uncompressed
tar -xf $TMP/dist/pybedtools-*.tar.gz
cd pybedtools-*
pip install -e .
log "Unit tests"
pytest -v --doctest-modules
# ----------------------------------------------------------------------------
# sphinx doctests
# ----------------------------------------------------------------------------
# Since the docs aren't included in the MANIFEST and therefore aren't included
# in the source distribution, we copy them over from the repo we checked out.
log "copying over docs directory from repo"
cp -r $TMP/docs .
log "sphinx doctests"
(cd docs && make clean doctest)
python-pybedtools (0.8.0-1) unstable; urgency=medium
* New upstream version.
* Standards-Version: 4.3.0
-- Steffen Moeller <moeller@debian.org> Thu, 17 Jan 2019 17:39:31 +0100
python-pybedtools (0.7.10-2.1) REMOVED; urgency=medium
* Removed from unstable and testing, see
......
......@@ -7,16 +7,6 @@ Priority: optional
Build-Depends: debhelper (>= 10),
dh-python,
bedtools,
python-all-dev,
python-setuptools,
python-pysam (>= 0.8.1),
python-matplotlib,
python-nose,
python-numpy,
python-numpydoc,
python-pandas,
python-yaml,
python-tk,
python3-all-dev,
python3-setuptools,
python3-pysam,
......@@ -28,14 +18,25 @@ Build-Depends: debhelper (>= 10),
python3-yaml,
python3-tk,
python3-sphinx,
cython,
r-base-core,
zlib1g-dev
Standards-Version: 4.1.5
Standards-Version: 4.3.0
Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/python-pybedtools.git
Vcs-Git: https://anonscm.debian.org/git/debian-med/python-pybedtools.git
Homepage: https://daler.github.io/pybedtools/
# cython,
# python-all-dev,
# python-setuptools,
# python-pysam (>= 0.8.1),
# python-matplotlib,
# python-nose,
# python-numpy,
# python-numpydoc,
# python-pandas,
# python-yaml,
# python-tk,
Package: python-pybedtools
Architecture: any
Depends: ${python:Depends},
......
......@@ -3,11 +3,13 @@ Last-Update: Wed, 13 July 2018 11:36:44 +0300
Description: Define FileNotFoundError as OSError for python 2
--- a/pybedtools/test/test1.py
+++ b/pybedtools/test/test1.py
@@ -20,6 +20,12 @@
import warnings
Index: pybedtools-0.8.0/pybedtools/test/test1.py
===================================================================
--- pybedtools-0.8.0.orig/pybedtools/test/test1.py
+++ pybedtools-0.8.0/pybedtools/test/test1.py
@@ -28,6 +28,12 @@ def teardown_module():
os.system('rm -r %s' % tempdir)
pybedtools.cleanup()
+try:
+ FileNotFoundError
......@@ -18,9 +20,11 @@ Description: Define FileNotFoundError as OSError for python 2
def fix(x):
"""
--- a/pybedtools/contrib/bigwig.py
+++ b/pybedtools/contrib/bigwig.py
@@ -6,6 +6,13 @@
Index: pybedtools-0.8.0/pybedtools/contrib/bigwig.py
===================================================================
--- pybedtools-0.8.0.orig/pybedtools/contrib/bigwig.py
+++ pybedtools-0.8.0/pybedtools/contrib/bigwig.py
@@ -6,6 +6,13 @@ import os
import subprocess
......
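The patch body above is elided, so the following is only a sketch of the general idiom its description names ("Define FileNotFoundError as OSError for python 2"), not the patch's actual content. The helper `require_file` is hypothetical and exists only to exercise the name.

```python
import os

# Python 2 has no built-in FileNotFoundError; looking the name up raises
# NameError there, so fall back to OSError. On Python 3 this is a no-op.
try:
    FileNotFoundError
except NameError:
    FileNotFoundError = OSError


def require_file(path):
    """Hypothetical helper (not part of pybedtools): return `path` if it
    exists, otherwise raise FileNotFoundError."""
    if not os.path.exists(path):
        raise FileNotFoundError('No such file: %s' % path)
    return path
```

With this shim in place, `except FileNotFoundError:` clauses can be written once and parsed by both interpreters.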
spelling
enable-package-data
disable-write-version
disable-test-156
rename-scripts
c8dff864dbc942bb7adb3e719b0702f7e3989d36.patch
#enable-package-data
#disable-write-version
#disable-test-156
#rename-scripts
#c8dff864dbc942bb7adb3e719b0702f7e3989d36.patch
define_filenotfounderror_python2.patch
remove_badges_from_documentation.patch
......@@ -11,11 +11,11 @@ export PYBUILD_BEFORE_TEST=cp {dir}/debian/mpl-expected.png {build_dir}/pybedtoo
export HOME=$(shell echo $$PWD"/fakehome")
%:
dh $@ --with python2,python3,sphinxdoc --buildsystem=pybuild
dh $@ --with python3,sphinxdoc --buildsystem=pybuild
override_dh_auto_build:
python3 setup.py cythonize
dh_auto_build
python setup.py develop --user
python3 setup.py develop --user
python3 setup.py build_sphinx
......@@ -27,6 +27,16 @@ override_dh_auto_install:
rm -fr debian/python-pybedtools/usr/bin
rm -fr debian/python3-pybedtools/usr/bin
override_dh_auto_test:
@echo
@echo *** NOT TESTING ***
@echo
override_dh_auto_clean:
dh_auto_clean
rm -rf fakehome .eggs
rm -rf fakehome .eggs .pybuild
rm -rf debian/python3-pybedtools
rm -rf pybedtools.egg-info
find pybedtools -name "*.so" -delete
rm -rf pybedtools/cbedtools.cpp pybedtools/featurefuncs.cpp
find docs/source/autodocs/ -name "*.rst" | grep -v "pybedtools.contrib.plotting.Track" | xargs -r /bin/rm
#dh_auto_clean
......@@ -85,14 +85,6 @@ features of :mod:`pybedtools` such as:
* streaming (for more, see :ref:`BedTools as iterators`)
* ability to use parallel processing
The first listing has many explanatory comments, and the second listing shows
the same code with no comments to give more of a feel for :mod:`pybedtools`.
.. literalinclude:: example_3
Here's the same code but with no comments:
.. literalinclude:: example_3_no_comments
For more on using :mod:`pybedtools`, continue on to the :ref:`tutorial` . .
.
For more on using :mod:`pybedtools`, continue on to the :ref:`tutorial` . . .
......@@ -324,17 +324,6 @@ Working with bigBed files
pybedtools.contrib.bigbed.bigbed
pybedtools.contrib.bigbed.bigbed_to_bed
:class:`MultiClassifier`
~~~~~~~~~~~~~~~~~~~~~~~~
An example use-case of the :class:`MultiClassifier` class would be to determine the
distribution of ChIP-seq peaks in introns/exons/intergenic space.
.. autosummary::
:toctree: autodocs
pybedtools.contrib.MultiClassifier
pybedtools.contrib.MultiClassifier.classify
pybedtools.contrib.MultiClassifier.print_table
:class:`IntersectionMatrix`
~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -30,7 +30,7 @@ pybedtools.contrib.plotting.Track
~Track.get_alpha
~Track.get_animated
~Track.get_array
~Track.get_axes
~Track.get_capstyle
~Track.get_children
~Track.get_clim
~Track.get_clip_box
......@@ -49,6 +49,7 @@ pybedtools.contrib.plotting.Track
~Track.get_fill
~Track.get_gid
~Track.get_hatch
~Track.get_joinstyle
~Track.get_label
~Track.get_linestyle
~Track.get_linestyles
......@@ -91,7 +92,7 @@ pybedtools.contrib.plotting.Track
~Track.set_antialiased
~Track.set_antialiaseds
~Track.set_array
~Track.set_axes
~Track.set_capstyle
~Track.set_clim
~Track.set_clip_box
~Track.set_clip_on
......@@ -107,6 +108,7 @@ pybedtools.contrib.plotting.Track
~Track.set_figure
~Track.set_gid
~Track.set_hatch
~Track.set_joinstyle
~Track.set_label
~Track.set_linestyle
~Track.set_linestyles
......
......@@ -2,6 +2,75 @@
Changelog
=========
Changes in v0.8.0
-----------------
This version further improves testing, improves the way C++ files are included
in the package, and fixes many long-standing bugs.
* Using pytest framework rather than nose for testing
* Updated `setup.py` to be more robust and to more clearly separate
"cythonization" into .cpp files
* Updated test harness for testing in independent conda environments
* All issue tests go in their own test module
* Included Python 3.7 tests (note that at the time of this writing, pysam is
not yet available on bioconda so that dependency is pip-installed in the
test) (`#254 <https://github.com/daler/pybedtools/issues/254>`_)
* Updated tests to reflect BEDTools 2.27.1 output (`#260
<https://github.com/daler/pybedtools/issues/260>`_, `#261
<https://github.com/daler/pybedtools/issues/261>`_)
* Removed the `contrib.classifier` module, which has been unsupported for
a while.
* More informative error messages for UCSC tools if they're missing (`#227
<https://github.com/daler/pybedtools/issues/227>`_)
* BedTool objects that are the result of operations that create files that are
not BED/GTF/GFF/BAM can be more easily converted to pandas.DataFrame with
`disable_auto_names=True` arg to `BedTool.to_dataframe()` (`#258
<https://github.com/daler/pybedtools/issues/258>`_)
* Added aliases to existing methods to match current BEDTools commands, e.g.
the `BedTool.nucleotide_content` method can now also be called using
`BedTool.nuc` which is consistent with the `bedtools nuc` command line name.
* New wrapper for `bedtools split`. The wrapper method is called `splitbed` to
maintain backwards compatibility because `pybedtools.BedTool` objects have
long had a `split` method that splits intervals based on a custom function.
* New wrapper for `bedtools spacing`.
* `BedTool.from_dataframe` handles NaN in dataframes by replacing with `"."`,
and is more explicit about kwargs that are passed to `pandas.DataFrame`
(`#257 <https://github.com/daler/pybedtools/issues/257>`_)
* Raise FileNotFoundError when on Python 3 (thanks Gosuke Shibahara, `#255
<https://github.com/daler/pybedtools/issues/255>`_)
* Relocated BEDTools header and .cpp files to the `pybedtools/include`
directory, so they can more easily be linked to from external packages
(`#253 <https://github.com/daler/pybedtools/issues/253>`_)
* Add test for (`#118 <https://github.com/daler/pybedtools/issues/118>`_)
* `BedTool.tabix_contigs` will list the sequence names indexed by tabix
(`#180 <https://github.com/daler/pybedtools/issues/180>`_)
* `BedTool.tabix_intervals` will return an empty generator if the coordinates
provided are not indexed, unless `check_coordinates=True` in which case the
previous behavior of raising a ValueError is triggered (`#181
<https://github.com/daler/pybedtools/issues/181>`_)
* Bugfix: Avoid "ResourceWarning: unclosed file" in `helpers.isBGZIP` (thanks
Stephen Bush)
* Bugfix: Interval objects created directly no longer have their filetype set
to None (`#217 <https://github.com/daler/pybedtools/issues/217>`_)
* Bugfix: Fixed the ability to set paths and reload module afterwards (`#218
<https://github.com/daler/pybedtools/issues/218>`_, `#220
<https://github.com/daler/pybedtools/issues/220>`_, `#222
<https://github.com/daler/pybedtools/issues/222>`_)
* Bugfix: `BedTool.head()` no longer uses an IntervalIterator (which would
check to make sure lines are valid BED/GTF/GFF/BAM/SAM). Instead, it simply
prints the first lines of the underlying file.
* Bugfix: functions passed to `BedTool.filter` and `BedTool.each` no longer
silently pass ValueErrors (`#231
<https://github.com/daler/pybedtools/issues/231>`_)
* Bugfix: Fixed IndexError in IntervalIterator if there was an empty line (`#233
<https://github.com/daler/pybedtools/issues/233>`_)
* Bugfix: Add additional constraint to SAM file detection to avoid incorrectly
detecting a BED file as SAM (`#246
<https://github.com/daler/pybedtools/issues/246>`_)
* Bugfix: accessing Interval.fields after accessing Interval.attrs no longer
raises ValueError (`#246 <https://github.com/daler/pybedtools/issues/246>`_)
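As an illustration of the `BedTool.from_dataframe` entry above (NaN replaced with `"."`), here is a minimal standard-library sketch of the idea. It is not pybedtools' implementation: the helper `row_to_bed_line` and the sample rows are hypothetical, and the real method operates on a pandas DataFrame.

```python
import math


def row_to_bed_line(row, missing="."):
    """Hypothetical helper: join one row of values into a tab-delimited line,
    writing `missing` in place of None or float NaN."""
    fields = []
    for value in row:
        if value is None or (isinstance(value, float) and math.isnan(value)):
            fields.append(missing)
        else:
            fields.append(str(value))
    return "\t".join(fields)


# Made-up rows with a missing score and a missing name.
rows = [
    ("chr1", 1, 100, "feature1", float("nan"), "+"),
    ("chr1", 150, 500, float("nan"), 0, "-"),
]
lines = [row_to_bed_line(row) for row in rows]
```

Using `"."` as the placeholder keeps every line at the same number of tab-separated fields.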
Changes in v0.7.10
------------------
Various bug fixes and some minor feature additions:
......
import sys
import multiprocessing
import pybedtools
# get example GFF and BAM filenames
gff = pybedtools.example_filename('gdc.gff')
bam = pybedtools.example_filename('gdc.bam')
# Some GFF files have invalid entries -- like chromosomes with negative coords
# or features of length = 0. This line removes them and saves the result in a
# tempfile
g = pybedtools.BedTool(gff).remove_invalid().saveas()
# Next, we create a function to pass only features for a particular
# featuretype. This is similar to a "grep" operation when applied to every
# feature in a BedTool
def featuretype_filter(feature, featuretype):
if feature[2] == featuretype:
return True
return False
# This function will eventually be run in parallel, applying the filter above
# to several different BedTools simultaneously
def subset_featuretypes(featuretype):
result = g.filter(featuretype_filter, featuretype).saveas()
return pybedtools.BedTool(result.fn)
# This function performs the intersection of a BAM file with a GFF file and
# returns the total number of hits. It will eventually be run in parallel.
def count_reads_in_features(features_fn):
"""
Callback function to count reads in features
"""
# BAM files are auto-detected; no need for an `abam` argument. Here we
# construct a new BedTool out of the BAM file and intersect it with the
# features filename.
# We use stream=True so that no intermediate tempfile is
# created, and bed=True so that the .count() method can iterate through the
# resulting streamed BedTool.
return pybedtools.BedTool(bam).intersect(
b=features_fn,
stream=True).count()
# Set up a pool of workers for parallel processing
pool = multiprocessing.Pool()
# Create separate files for introns and exons, using the function we defined
# above
featuretypes = ('intron', 'exon')
introns, exons = pool.map(subset_featuretypes, featuretypes)
# Perform some genome algebra to get unique and shared intron/exon regions.
# Here we keep only the filename of the results, which is safer than an entire
# BedTool for passing around in parallel computations.
exon_only = exons.subtract(introns).merge().remove_invalid().saveas().fn
intron_only = introns.subtract(exons).merge().remove_invalid().saveas().fn
intron_and_exon = exons.intersect(introns).merge().remove_invalid().saveas().fn
# Do intersections with BAM file in parallel, using the other function we
# defined above
features = (exon_only, intron_only, intron_and_exon)
results = pool.map(count_reads_in_features, features)
# Print the results
labels = (' exon only:',
' intron only:',
'intron and exon:')
for label, reads in zip(labels, results):
sys.stdout.write('%s %s\n' % (label, reads))
../../pybedtools/scripts/intron_exon_reads.py
\ No newline at end of file
......@@ -9,17 +9,17 @@ installed.
.. _condainstall:
Quick install via `conda`
~~~~~~~~~~~~~~~~~~~~~~~~~
If you're using the `Anaconda Python distribution
<http://continuum.io/downloads>`_ on Linux, then the following will install
:mod:`pybedtools`::
Install via `conda`
~~~~~~~~~~~~~~~~~~~
This is by far the easiest option. If you're using the `Anaconda Python
distribution <http://continuum.io/downloads>`_ on Linux, then the following
will install :mod:`pybedtools`::
conda install -c bioconda pybedtools
conda install --channel conda-forge --channel bioconda pybedtools
You can also install Tabix and BEDTools via conda::
conda install -c bioconda bedtools htslib
conda install --channel conda-forge --channel bioconda bedtools htslib
Otherwise, read on for installation on other platforms and in other
environments.
......@@ -39,13 +39,11 @@ Required
:A C/C++ compiler:
* **Windows:** Use Cygwin, http://www.cygwin.com. It is probably easiest to select
all of the 'Devel' group items to be installed. In addition, ensure the
`zlib` items are selected for installation as well (using the search
function in the Cygwin install program).
* **OSX:** Install Xcode from http://developer.apple.com/xcode/
* **Linux:** `gcc`, usually already installed; on Ubuntu, install with `sudo apt-get install
build-essential`
* **Windows:** may work with conda compilers or Cygwin but this is
untested. Windows is not supported.
Optional
++++++++
......@@ -62,14 +60,7 @@ Installing :mod:`pybedtools`
Install latest release via `conda` (recommended)
++++++++++++++++++++++++++++++++++++++++++++++++
Use the Anaconda channel `daler`::
conda install -c daler pybedtools
This example installs :mod:`pybedtools` and BEDTools into an isolated
environment called `myenv` running Python 3::
conda create -n myenv -c daler pybedtools bedtools python=3
See :ref:`condainstall` section above.
Install latest release using `pip`
......@@ -90,13 +81,36 @@ Assumptions:
1. `git` is installed
2. Cython is installed (`conda install cython` or `pip install cython`)
The following commands will clone the repository
.. code-block:: bash
git clone https://github.com/daler/pybedtools.git
cd pybedtools
git pull
The only time the C++ files will be rebuilt from Cython .pyx source is if the
`cythonize` subcommand is used. To rebuild the C++ files using Cython, run:
.. code-block:: bash
python setup.py cythonize
To install in develop mode, where changes to Python files will be picked up
without having to re-install, use:
.. code-block:: bash
python setup.py develop
The above will not update when the .pyx files are updated, so if the Cython
source files have been changed, run:
.. code-block:: bash
python setup.py cythonize develop
See `python setup.py --usage` for more information.
Quick test
......@@ -131,17 +145,19 @@ e.g., by running::
Test current installation
~~~~~~~~~~~~~~~~~~~~~~~~~
Testing the "current installation" means testing the installation into the
current environment, whether this is the system-wide Python, a virtualenv, or
a conda environment. It requires some additional packages to be installed::
To test within the existing installation, install the additional packages for
testing::
pip install -r dev-requirements.txt
conda install --channel conda-forge --channel bioconda \
--file requirements.txt \
--file test-requirements.txt \
--file optional-requirements.txt
Run unit tests::
Then run unit tests along with module doctests::
nosetests -v
pytest --doctest-modules
Run doctests::
Finally, run sphinx doctests::
(cd docs && make doctest)
......@@ -159,20 +175,6 @@ To run tests under Python 3::
./condatest.sh 3
Test within isolated Docker containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This assumes that `Docker <https://www.docker.com/>`_ is installed.
The following command will build two Docker containers -- one for Python 2 and
one for Python 3 -- starting with the base Ubuntu 14.04 container. The first
time the containers are built it will take some time, but they are cached so
subsequent tests will run quickly. Within each of these containers, unit tests
and doctests are run::
(cd docker && ./full-test.sh)
Compile docs
~~~~~~~~~~~~
To compile the docs, from the top-level `pybedtools` directory::
......
......@@ -68,7 +68,7 @@ it to a list in order to look at it:
100
>>> print(results[:10])
[1, 1, 2, 2, 1, 2, 1, 0, 2, 3]
[0, 3, 1, 1, 2, 2, 2, 1, 4, 2]
Running thousands of iterations on files with many features will of course
result in more complex results. We could then take these results and plot
......@@ -149,7 +149,7 @@ For example:
actual: 3
median randomized: 2.0
normalized: 1.5
percentile: 92.0
percentile: 93.5
Contributions toward improving this code or implementing other methods of
statistical testing are very welcome!
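To make the summary fields above concrete, here is a small standard-library sketch of how such values can be derived from a list of randomized counts. It assumes that "normalized" is the actual count divided by the median (consistent with 3 / 2.0 = 1.5 in the output shown) and that the percentile averages the strict and non-strict ranks. Both are assumptions, not a statement of how pybedtools computes them, and `summarize` is a hypothetical name.

```python
def summarize(actual, randomized):
    """Hypothetical helper: summarize `actual` against randomized counts."""
    ordered = sorted(randomized)
    n = len(ordered)
    mid = n // 2
    # Median: middle value, or the mean of the two middle values.
    if n % 2:
        median = float(ordered[mid])
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2.0
    # Percentile of `actual`: average of the share of values strictly below
    # it and the share of values at or below it (an assumed convention).
    below = sum(1 for x in randomized if x < actual)
    at_or_below = sum(1 for x in randomized if x <= actual)
    percentile = 100.0 * (below + at_or_below) / (2.0 * n)
    return {
        "actual": actual,
        "median randomized": median,
        # Assumes a nonzero median; a zero median raises ZeroDivisionError.
        "normalized": actual / median,
        "percentile": percentile,
    }


# Only the ten values printed earlier, used as a toy sample; the documented
# run used 100 iterations, so its percentile differs.
sample = [0, 3, 1, 1, 2, 2, 2, 1, 4, 2]
summary = summarize(3, sample)
```

A percentile near 100 means the observed count is rarely matched by chance under the randomization.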
......
......@@ -81,21 +81,18 @@ https://github.com/arq5x/bedtools2/issues/436 for details).
>>> # bedtools v2.26.0
>>> print(open(d.fn).read())
30
>>> # bedtools != v2.26.0
>>> # UTR 0
>>> # CDS 2
>>> # intron 4
>>> # CDS 0
>>> # UTR 1
>>> # exon 3
>>> # mRNA 7
>>> # CDS 2
>>> # exon 2
>>> # tRNA 2
>>> # gene 7
>>> # <BLANKLINE>
UTR 0
CDS 2
intron 4
CDS 0
UTR 1
exon 3
mRNA 7
CDS 2
exon 2
tRNA 2
gene 7
<BLANKLINE>
Trying to iterate over `d` (`[i for i in d]`) or save it (`d.saveas()`) raises
......