Commits on Source (16)
*.pyc
.*.swp
build/
dist/
pbalign.egg-info/
.DS_Store
include LICENSES.txt
......@@ -15,10 +15,13 @@ install:
develop:
python setup.py develop
test:
pylint:
pylint --errors-only pbalign
test: pylint
# Unit tests
#find tests/unit -name "*.py" | xargs nosetests
nosetests --verbose tests/unit/*.py
python setup.py test
# End-to-end tests
@echo pbalign cram tests require blasr installed.
find tests/cram -name "*.t" | xargs cram
......
......@@ -3,3 +3,7 @@ pbalign maps PacBio reads to reference sequences.
Want to know how to install and run pbalign?
Please refer to https://github.com/PacificBiosciences/pbalign/blob/master/doc/howto.rst
DISCLAIMER
----------
THIS WEBSITE AND CONTENT AND ALL SITE-RELATED SERVICES, INCLUDING ANY DATA, ARE PROVIDED "AS IS," WITH ALL FAULTS, WITH NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE. YOU ASSUME TOTAL RESPONSIBILITY AND RISK FOR YOUR USE OF THIS SITE, ALL SITE-RELATED SERVICES, AND ANY THIRD PARTY WEBSITES OR APPLICATIONS. NO ORAL OR WRITTEN INFORMATION OR ADVICE SHALL CREATE A WARRANTY OF ANY KIND. ANY REFERENCES TO SPECIFIC PRODUCTS OR SERVICES ON THE WEBSITES DO NOT CONSTITUTE OR IMPLY A RECOMMENDATION OR ENDORSEMENT BY PACIFIC BIOSCIENCES.
#!/usr/bin/env bash
set -vex
################
# DEPENDENCIES #
################
## Load modules
type module >& /dev/null || . /mnt/software/Modules/current/init/bash
module purge
module load gcc
module load git
module load samtools
module load python/2
case "${bamboo_planRepository_branchName}" in
master)
module load blasr/master
;;
*)
module load blasr/develop
;;
esac
rm -rf prebuilts build
test -d .pip/wheels && find .pip/wheels -type f ! -name '*none-any.whl' -print -delete || true
export NX3PBASEURL=http://nexus/repository/unsupported/pitchfork/gcc-6.4.0
export PATH="${PWD}/build/bin:${PATH}"
export PYTHONUSERBASE="${PWD}/build"
export PIP="pip --cache-dir=$bamboo_build_working_directory/.pip"
CUR_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
source "${CUR_DIR}"/scripts/ci/build.sh
source "${CUR_DIR}"/scripts/ci/test.sh
pbalign (0.3.1-1) unstable; urgency=medium
* Team upload.
* Add watch file
* debhelper 11
* Point Vcs fields to salsa.debian.org
* Standards-Version: 4.2.1
* Testsuite: autopkgtest-pkg-python
* Secure URI in copyright format
* Remove trailing whitespace in debian/control
* Build-Depends: s/python-sphinx/python3-sphinx/
* pbalign versioned depends: python-pbalign (= ${source:Version})
* Create manpages before building package
-- Andreas Tille <tille@debian.org> Wed, 31 Oct 2018 07:53:41 +0100
pbalign (0.3.0-1) unstable; urgency=medium
* New upstream release (corresponds to smrtanalysis-4.0.0 tag)
......
Source: pbalign
Section: python
Priority: optional
Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.org>
Uploaders: Afif Elghraoui <afif@debian.org>
Build-Depends:
debhelper (>= 9),
Section: python
Testsuite: autopkgtest-pkg-python
Priority: optional
Build-Depends: debhelper (>= 11~),
dh-python,
python-all,
python-setuptools,
python-pbcore (>= 0.8.5),
python-pbcommand (>= 0.2.0),
python-sphinx,
help2man,
Standards-Version: 3.9.8
python-pbcore,
python-pbcommand,
python3-sphinx,
python-nose
Standards-Version: 4.2.1
Vcs-Browser: https://salsa.debian.org/med-team/pbalign
Vcs-Git: https://salsa.debian.org/med-team/pbalign.git
Homepage: https://github.com/PacificBiosciences/pbalign
Vcs-Git: https://anonscm.debian.org/git/debian-med/pbalign.git
Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/pbalign.git
Package: pbalign
Architecture: all
Depends:
${misc:Depends},
Depends: ${misc:Depends},
${python:Depends},
python-pbalign,
python-pbalign (= ${source:Version}),
python-pkg-resources,
blasr (>= 5.3+0),
Recommends:
python-pbh5tools,
hdf5-tools,
Suggests:
bowtie2,
blasr (>= 5.3+0)
Recommends: python-pbh5tools,
hdf5-tools
Suggests: bowtie2,
gmap,
pbalign-doc,
pbalign-doc
Description: map Pacific Biosciences reads to reference DNA sequences
pbalign aligns PacBio reads to reference sequences, filters aligned
reads according to user-specific filtering criteria, and converts the
......@@ -42,17 +39,14 @@ Description: map Pacific Biosciences reads to reference DNA sequences
Package: python-pbalign
Architecture: all
Depends:
${misc:Depends},
Depends: ${misc:Depends},
${python:Depends},
blasr (>= 5.3+0),
Recommends:
python-pbh5tools,
hdf5-tools,
Suggests:
bowtie2,
blasr (>= 5.3+0)
Recommends: python-pbh5tools,
hdf5-tools
Suggests: bowtie2,
gmap,
pbalign-doc,
pbalign-doc
Description: map Pacific Biosciences reads to reference DNA sequences (Python2)
pbalign aligns PacBio reads to reference sequences, filters aligned
reads according to user-specific filtering criteria, and converts the
......@@ -63,12 +57,11 @@ Description: map Pacific Biosciences reads to reference DNA sequences (Python2)
Python library backend.
Package: pbalign-doc
Section: doc
Architecture: all
Depends:
${misc:Depends},
Section: doc
Depends: ${misc:Depends},
libjs-jquery,
libjs-underscore,
libjs-underscore
Description: documentation for pbalign
pbalign aligns PacBio reads to reference sequences, filters aligned
reads according to user-specific filtering criteria, and converts the
......
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: pbalign
Upstream-Contact: Pacific Biosciences <devnet@pacificbiosciences.com>
Source: https://github.com/PacificBiosciences/pbalign
......
#!/bin/sh
MANDIR=debian/mans
mkdir -p $MANDIR
VERSION=`dpkg-parsechangelog | awk '/^Version:/ {print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'`
NAME=`grep "^Description:" debian/control | sed 's/^Description: *//' | head -n1`
PROGNAME=`grep "^Package:" debian/control | sed 's/^Package: *//' | head -n1`
AUTHOR=".SH AUTHOR\nThis manpage was written by $DEBFULLNAME for the Debian distribution and
can be used for any other usage of the program.
"
# If program name is different from package name or title should be
# different from package short description change this here
progname=${PROGNAME}
help2man --no-info --no-discard-stderr \
--name="Mapping PacBio sequences to references" \
--version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
echo $AUTHOR >> $MANDIR/${progname}.1
progname=createChemistryHeader
help2man --no-info --no-discard-stderr \
--name="Create a SAM header with PacBio sequencing chemistry information" \
--version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
echo $AUTHOR >> $MANDIR/${progname}.1
progname=extractUnmappedSubreads
help2man --no-info --no-discard-stderr \
--name="Extract unmapped subreads from a fasta file" \
--version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
echo $AUTHOR >> $MANDIR/${progname}.1
progname=loadChemistry
help2man --no-info --no-discard-stderr \
--name="Load PacBio sequencing chemistry information" \
--version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
echo $AUTHOR >> $MANDIR/${progname}.1
progname=maskAlignedReads
help2man --no-info --no-discard-stderr \
--name="Mask aligned reads in regions file" \
--version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
echo $AUTHOR >> $MANDIR/${progname}.1
rm -f $MANDIR/*.py.1
for man in $MANDIR/*.1 ; do
ln -s `basename $man` $MANDIR/`basename $man .1`.py.1
done
echo "$MANDIR/*.1" > debian/manpages
cat <<EOT
Please enhance the help2man output.
The following web page might be helpful in doing so:
http://liw.fi/manpages/
EOT
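The VERSION assignment in the script above reduces a Debian version string to its upstream part with three sed expressions: it drops an epoch prefix, then everything from the first hyphen on (the Debian revision), then a trailing +dfsg or ~dfsg suffix. A minimal sketch of that pipeline in isolation follows. The helper name upstream_version is hypothetical and not part of the package, and the second sample version is made up for illustration.

```shell
#!/bin/sh
# upstream_version is a hypothetical helper used only for illustration;
# the three sed expressions are copied from the VERSION assignment.
upstream_version() {
    printf '%s\n' "$1" | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'
}

upstream_version '0.3.1-1'          # prints 0.3.1
upstream_version '1:0.3.1+dfsg-2'   # prints 0.3.1 (made-up sample value)
```

Because the second expression cuts at the first hyphen, an upstream version that itself contains a hyphen would be truncated by this approach.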
debian/mans/*.1
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.8.
.TH CREATECHEMISTRYHEADER "1" "October 2018" "createChemistryHeader 0.3.1" "User Commands"
.SH NAME
createChemistryHeader \- Create a SAM header with PacBio sequencing chemistry information
.SH DESCRIPTION
usage: getChemistryHeader.py [\-h] [\-\-debug] \fB\-\-bas_files\fR BAS_FILES
.TP
[BAS_FILES ...]
input_alignment_file output_header_file
.PP
createChemistryHeader creates a SAM header that contains the chemistry
information used by Quiver.
.SS "positional arguments:"
.TP
input_alignment_file
A SAM or BAM file produced by BLASR.
.TP
output_header_file
Name of the SAM or BAM header file that will be
created with chemistry information loaded.
.SS "optional arguments:"
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.TP
\fB\-\-debug\fR
Output detailed log information. (default: False)
.TP
\fB\-\-bas_files\fR BAS_FILES [BAS_FILES ...]
The bas or bax files containing the reads that were
aligned in the input_alignment_file. Also can be a
fofn of bas or bax files. (default: None)
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
createChemistryHeader.1
\ No newline at end of file
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.8.
.TH EXTRACTUNMAPPEDSUBREADS "1" "October 2018" "extractUnmappedSubreads 0.3.1" "User Commands"
.SH NAME
extractUnmappedSubreads \- Extract unmapped subreads from a fasta file
.SH DESCRIPTION
usage: extractUnmappedSubreads [\-h] [\-\-verbose] [\-\-version] [\-\-profile]
.TP
[\-\-debug]
fasta cmp.h5 [cmp.h5 ...]
.PP
Extract unmapped subreads from a fasta file.
.SS "positional arguments:"
.TP
fasta
a fasta file containing all subreads.
.TP
cmp.h5
input cmp.h5 files.
.SS "optional arguments:"
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.TP
\fB\-\-verbose\fR, \fB\-v\fR
Set the verbosity level (default: None)
.TP
\fB\-\-version\fR
show program's version number and exit
.TP
\fB\-\-profile\fR
Print runtime profile at exit (default: False)
.TP
\fB\-\-debug\fR
Catch exceptions in debugger (requires ipdb) (default: False)
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
extractUnmappedSubreads.1
\ No newline at end of file
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.8.
.TH LOADCHEMISTRY "1" "October 2018" "loadChemistry 0.3.1" "User Commands"
.SH NAME
loadChemistry \- Load PacBio sequencing chemistry information
.SH DESCRIPTION
loadChemistry.py
.TP
Load chemistry info into a cmp.h5, just copying the triple.
Note
.IP
that there is no attempt to "decode" chemistry barcodes here\-\-\-this
is a dumb pipe.
.IP
usage:
.IP
\f(CW% loadChemistry [input.fofn | list of input.ba[sx].h5] aligned_reads.cmp.h5\fR
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
loadChemistry.1
\ No newline at end of file
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.8.
.TH MASKALIGNEDREADS "1" "October 2018" "maskAlignedReads 0.3.1" "User Commands"
.SH NAME
maskAlignedReads \- Mask aligned reads in regions file
.SH DESCRIPTION
usage: maskAlignedReads [\-h] [\-v] [\-l LOGFILE] [\-d] [\-i]
.IP
inCmpFile inRgnFofn outRgnFofn
.PP
Use in.cmp.h5 to mask corresponding regions of files in in.rgn.h5, write output
to a new rgn.fofn.
.SS "positional arguments:"
.TP
inCmpFile
An input cmp.h5 file.
.TP
inRgnFofn
A fofn of input region table files.
.TP
outRgnFofn
A fofn of output region table files.
.SS "optional arguments:"
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.TP
\fB\-v\fR, \fB\-\-version\fR
show program's version number and exit
.TP
\fB\-l\fR LOGFILE, \fB\-\-logFile\fR LOGFILE
Specify a file to log to. Defaults to stderr.
(default: None)
.TP
\fB\-d\fR, \fB\-\-debug\fR
Increases verbosity of logging (default: False)
.TP
\fB\-i\fR, \fB\-\-info\fR
Display informative log entries (default: False)
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
maskAlignedReads.1
\ No newline at end of file
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.8.
.TH PBALIGN "1" "October 2018" "pbalign 0.3.1" "User Commands"
.SH NAME
pbalign \- Mapping PacBio sequences to references
.SH DESCRIPTION
usage: pbalign [\-h] [\-\-version] [\-\-log\-file LOG_FILE]
.IP
[\-\-log\-level {DEBUG,INFO,WARNING,ERROR,CRITICAL} | \fB\-\-debug\fR | \fB\-\-quiet\fR | \fB\-v]\fR
[\-\-pdb] [\-\-regionTable REGIONTABLE] [\-\-configFile CONFIGFILE]
[\-\-pulseFile PULSEFILE] [\-\-algorithm {blasr,bowtie,gmap}]
[\-\-maxHits MAXHITS] [\-\-minAnchorSize MINANCHORSIZE]
[\-\-maxMatch MAXMATCH]
[\-\-useccs {useccs,useccsall,useccsdenovo}] [\-\-noSplitSubreads]
[\-\-concordant] [\-\-nproc NPROC]
[\-\-algorithmOptions ALGORITHMOPTIONS]
[\-\-maxDivergence MAXDIVERGENCE] [\-\-minAccuracy MINACCURACY]
[\-\-minLength MINLENGTH] [\-\-scoreCutoff SCORECUTOFF]
[\-\-hitPolicy {randombest,allbest,random,all,leftmost}]
[\-\-filterAdapterOnly] [\-\-unaligned UNALIGNED] [\-\-seed SEED]
[\-\-tmpDir TMPDIR] [\-\-profile]
inputFileName referencePath outputFileName
.PP
Mapping PacBio sequences to references using an algorithm selected from a
selection of supported command\-line alignment algorithms. Input can be a
fasta, pls.h5, bas.h5 or ccs.h5 file or a fofn (file of file names). Output
can be in SAM or BAM format. If output is BAM format, aligner can only be
blasr and QVs will be loaded automatically. NOTE that pbalign no longer
supports CMP.H5 in 3.0.
.SS "positional arguments:"
.TP
inputFileName
SubreadSet or unaligned .bam
.TP
referencePath
Reference DataSet or FASTA file
.TP
outputFileName
Alignment results dataset
.SS "optional arguments:"
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.TP
\fB\-\-version\fR
show program's version number and exit
.TP
\fB\-\-log\-file\fR LOG_FILE
Write the log to file. Default(None) will write to
stdout. (default: None)
.TP
\fB\-\-log\-level\fR {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set log level (default: INFO)
.TP
\fB\-\-debug\fR
Alias for setting log level to DEBUG (default: False)
.TP
\fB\-\-quiet\fR
Alias for setting log level to CRITICAL to suppress
output. (default: False)
.TP
\fB\-v\fR, \fB\-\-verbose\fR
Set the verbosity level. (default: None)
.TP
\fB\-\-pdb\fR
Enable Python debugger (default: False)
.TP
\fB\-\-profile\fR
Print runtime profile at exit (default: False)
.SS "Optional input arguments:"
.TP
\fB\-\-regionTable\fR REGIONTABLE
Specify a region table for filtering reads. (default:
None)
.TP
\fB\-\-configFile\fR CONFIGFILE
Specify a set of user\-defined argument values.
(default: None)
.TP
\fB\-\-pulseFile\fR PULSEFILE
When input reads are in fasta format and output is a
cmp.h5 this option can specify pls.h5 or bas.h5 or
FOFN files from which pulse metrics can be loaded for
Quiver. (default: None)
.SS "Alignment options:"
.TP
\fB\-\-algorithm\fR {blasr,bowtie,gmap}
Select an algorithm from ('blasr', 'bowtie', 'gmap').
(default: blasr)
.TP
\fB\-\-maxHits\fR MAXHITS
The maximum number of matches of each read to the
reference sequence that will be evaluated. (default:
None)
.TP
\fB\-\-minAnchorSize\fR MINANCHORSIZE
The minimum anchor size defines the length of the read
that must match against the reference sequence.
(default: None)
.TP
\fB\-\-maxMatch\fR MAXMATCH
BLASR maxMatch option. (Will be overridden if it is also
set in algorithmOptions) (default: 30)
.TP
\fB\-\-useccs\fR {useccs,useccsall,useccsdenovo}
Map the ccsSequence to the genome first, then align
subreads to the interval that the CCS reads mapped to.
useccs: only maps subreads that span the length of the
template. useccsall: maps all subreads. useccsdenovo:
maps ccs only. (default: None)
.TP
\fB\-\-noSplitSubreads\fR
Do not split reads into subreads even if subread
regions are available. (default: False)
.TP
\fB\-\-concordant\fR
Map subreads of a ZMW to the same genomic location.
(default: False)
.TP
\fB\-\-nproc\fR NPROC
Number of threads. (default: 8)
.TP
\fB\-\-algorithmOptions\fR ALGORITHMOPTIONS
Pass alignment options through. (default: None)
.SS "Filter criteria options:"
.TP
\fB\-\-maxDivergence\fR MAXDIVERGENCE
The maximum allowed percentage divergence of a read
from the reference sequence. (default: 30.0)
.TP
\fB\-\-minAccuracy\fR MINACCURACY
The minimum concordance of alignments that will be
evaluated. (default: 70.0)
.TP
\fB\-\-minLength\fR MINLENGTH
The minimum aligned read length of alignments that
will be evaluated. (default: 50)
.TP
\fB\-\-scoreCutoff\fR SCORECUTOFF
The worst score to output an alignment. (default:
None)
.TP
\fB\-\-hitPolicy\fR {randombest,allbest,random,all,leftmost}
Specify a policy for how to treat multiple hits.
random: selects a random hit. all: selects all hits.
allbest: selects all the best score hits. randombest:
selects a random hit from all best score hits.
leftmost: selects a hit which has the best score and
the smallest mapping coordinate in any reference.
(default: randombest)
.TP
\fB\-\-filterAdapterOnly\fR
If specified, do not report adapter\-only hits using
annotations with the reference entry. (default: False)
.SS "Miscellaneous options:"
.TP
\fB\-\-unaligned\fR UNALIGNED
Output names of unaligned reads to specified file.
(default: None)
.TP
\fB\-\-seed\fR SEED
Initialize the random number generator with a nonzero
integer. Zero means that the current system time is
used. (default: 1)
.TP
\fB\-\-tmpDir\fR TMPDIR
Specify a directory for saving temporary files.
(default: \fI\,/tmp\/\fP)
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
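
The pbalign usage text above takes its options first and then three positional arguments (inputFileName, referencePath, outputFileName). A minimal sketch of how such a command line might be assembled is shown below. The file names are hypothetical placeholders and the option values are arbitrary picks; only the option names come from the usage text. The helper pbalign_cmd is likewise hypothetical, and it only prints the command so it can be reviewed without pbalign or blasr installed.

```shell
#!/bin/sh
# Sketch only: pbalign_cmd is a hypothetical helper, not part of pbalign.
# Option names are taken from the usage text; the values are arbitrary.
pbalign_cmd() {
    # $1 = inputFileName, $2 = referencePath, $3 = outputFileName
    printf '%s\n' "pbalign --algorithm blasr --nproc 4 --minAccuracy 75 --minLength 100 --hitPolicy randombest $1 $2 $3"
}

# Print the command for review (placeholder file names).
pbalign_cmd movie.subreads.bam reference.fasta aligned.bam
```

Printing rather than executing keeps the sketch side-effect free. File names containing spaces would need quoting before the printed line is run.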