Imported Upstream version 2.2.0

parent fa595dac
......@@ -133,10 +133,10 @@ Obtaining Bowtie 2
==================
Download Bowtie 2 sources and binaries from the [Download] section of the
Sourceforge site. Binaries are available for Intel architectures (`i386` and
`x86_64`) running Linux, and Mac OS X. A 32-bit version is available for
Windows. If you plan to compile Bowtie 2 yourself, make sure to get the source
package, i.e., the filename that ends in "-source.zip".
Sourceforge site. Binaries are available for the Intel `x86_64` architecture
running Linux, Mac OS X, and Windows. If you plan to compile Bowtie 2 yourself,
make sure to get the source package, i.e., the filename that ends in
"-source.zip".
Building from source
--------------------
......@@ -396,7 +396,8 @@ instance, when seeking [structural variants].
The expected relative orientation of the mates is set using the `--ff`,
`--fr`, or `--rf` options. The expected range of inter-mates distances (as
measured from the furthest extremes of the mates; also called "outer distance")
is set with the `-I` and `-X` options.
is set with the `-I` and `-X` options. Note that setting `-I` and `-X`
far apart makes Bowtie 2 slower. See documentation for `-I` and `-X`.
To declare that a pair aligns discordantly, Bowtie 2 requires that both mates
align uniquely. This is a conservative threshold, but this is often desirable
......@@ -730,27 +731,35 @@ For datasets consisting of pairs, the summary might look like this:
The indentation indicates how subtotals relate to totals.
Wrapper
-------
Wrapper scripts
---------------
The `bowtie2` executable is actually a Perl wrapper script that calls the
compiled `bowtie2-align` binary. It is recommended that you always run the
`bowtie2` wrapper and not run `bowtie2-align` directly.
The `bowtie2`, `bowtie2-build` and `bowtie2-inspect` executables are actually
wrapper scripts that call binary programs as appropriate. The wrappers shield
users from having to distinguish between "small" and "large" index formats,
discussed briefly in the following section. Also, the `bowtie2` wrapper
provides some key functionality, like the ability to handle compressed inputs,
and the functionality for `--un`, `--al` and related options.
Performance tuning
------------------
It is recommended that you always run the bowtie2 wrappers and not run the
binaries directly.
1. Use 64-bit version if possible
Small and large indexes
-----------------------
The 64-bit version of Bowtie 2 is faster than the 32-bit version, owing to
its use of 64-bit arithmetic. If possible, download the 64-bit binaries for
Bowtie 2 and run on a 64-bit computer. If you are building Bowtie 2 from
sources, you may need to pass the `-m64` option to `g++` to compile the
64-bit version; you can do this by including `BITS=64` in the arguments to
the `make` command; e.g.: `make BITS=64 bowtie2`. To determine whether your
version of bowtie is 64-bit or 32-bit, run `bowtie2 --version`.
`bowtie2-build` can index reference genomes of any size. For genomes less than
about 4 billion nucleotides in length, `bowtie2-build` builds a "small" index
using 32-bit numbers in various parts of the index. When the genome is longer,
`bowtie2-build` builds a "large" index using 64-bit numbers. Small indexes are
stored in files with the `.bt2` extension, and large indexes are stored in
files with the `.bt2l` extension. The user need not worry about whether a
particular index is small or large; the wrapper scripts will automatically build
and use the appropriate index.
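The selection rule described above can be sketched roughly as follows. This is an illustration only, not the wrappers' actual code: the function names and the constant `kSmallIndexLimit` are hypothetical, and the exact cutoff is an assumption (the manual says only "about 4 billion nucleotides"; the largest 32-bit unsigned value is used here as a stand-in). The `forceLarge` parameter mirrors the `--large-index` option of `bowtie2-build`, and the binary names are the `-s`/`-l` targets that appear in the Makefile part of this change.

```cpp
#include <cstdint>
#include <string>

// Hypothetical cutoff: stand-in for "about 4 billion nucleotides".
constexpr std::uint64_t kSmallIndexLimit = 0xffffffffULL;

// True when a "large" (64-bit) index would be chosen for this reference.
// forceLarge models the effect of passing --large-index.
bool needsLargeIndex(std::uint64_t referenceLength, bool forceLarge = false) {
    return forceLarge || referenceLength > kSmallIndexLimit;
}

// File extension of the index files for the chosen format.
std::string indexExtension(std::uint64_t referenceLength, bool forceLarge = false) {
    return needsLargeIndex(referenceLength, forceLarge) ? ".bt2l" : ".bt2";
}

// Name of the aligner binary a wrapper would dispatch to for a given format.
std::string alignerBinary(bool largeIndex) {
    return largeIndex ? "bowtie2-align-l" : "bowtie2-align-s";
}
```

Under these assumptions, a 3-billion-nucleotide reference maps to `.bt2` files and the `bowtie2-align-s` binary, while a 5-billion-nucleotide reference maps to `.bt2l` files and `bowtie2-align-l`.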
2. If your computer has multiple processors/cores, use `-p`
Performance tuning
------------------
1. If your computer has multiple processors/cores, use `-p`
The `-p` option causes Bowtie 2 to launch a specified number of parallel
search threads. Each thread runs on a different processor/core and all
......@@ -758,6 +767,23 @@ Performance tuning
approximately a multiple of the number of threads (though in practice,
speedup is somewhat worse than linear).
2. If reporting many alignments per read, try reducing
`bowtie2-build --offrate`
If you are using `-k` or `-a` options and Bowtie 2 is reporting many
alignments per read, using an index with a denser SA sample can speed
things up considerably. To do this, specify a smaller-than-default
`-o`/`--offrate` value when running `bowtie2-build`.
A denser SA sample yields a larger index, but is also particularly
effective at speeding up alignment when many alignments are reported per
read.
3. If `bowtie2` "thrashes", try increasing `bowtie2-build --offrate`
If `bowtie2` runs very slowly on a relatively low-memory computer, try
setting `-o`/`--offrate` to a *larger* value when building the index.
This decreases the memory footprint of the index.
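The trade-off in items 2 and 3 can be made concrete with a little arithmetic. The sketch below assumes, as the `-o`/`--offrate` option documentation describes, that the indexer marks every 2^offrate-th Burrows-Wheeler row with its reference position. The function names and the `bytesPerEntry` parameter are hypothetical; 4 bytes per entry for a small index and 8 for a large one is an assumption, not something stated in this section.

```cpp
#include <cstdint>

// Approximate number of sampled suffix-array entries when every
// 2^offrate-th row is marked (offrate is assumed to be below 64).
std::uint64_t saSampleEntries(std::uint64_t rows, unsigned offrate) {
    const std::uint64_t step = std::uint64_t{1} << offrate;
    return (rows + step - 1) / step;  // ceiling division
}

// Approximate memory taken by the sample, given an assumed entry size.
std::uint64_t saSampleBytes(std::uint64_t rows, unsigned offrate,
                            unsigned bytesPerEntry) {
    return saSampleEntries(rows, offrate) * bytesPerEntry;
}
```

Each step down in `--offrate` roughly doubles the sample (a larger index, but fewer steps to resolve each reported alignment), and each step up roughly halves it (a smaller memory footprint, but slower resolution).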
Command Line
------------
......@@ -1143,7 +1169,15 @@ specified and a paired-end alignment consists of two 20-bp alignments in the
appropriate orientation with a 20-bp gap between them, that alignment is
considered valid (as long as `-X` is also satisfied). A 19-bp gap would not
be valid in that case. If trimming options `-3` or `-5` are also used, the
`-I` constraint is applied with respect to the untrimmed mates. Default: 0.
`-I` constraint is applied with respect to the untrimmed mates.
The larger the difference between `-I` and `-X`, the slower Bowtie 2 will
run. This is because larger differences between `-I` and `-X` require that
Bowtie 2 scan a larger window to determine if a concordant alignment exists.
For typical fragment length ranges (200 to 400 nucleotides), Bowtie 2 is very
efficient.
Default: 0 (essentially imposing no minimum)
-X/--maxins <int>
......@@ -1153,7 +1187,15 @@ proper orientation with a 60-bp gap between them, that alignment is considered
valid (as long as `-I` is also satisfied). A 61-bp gap would not be valid in
that case. If trimming options `-3` or `-5` are also used, the `-X`
constraint is applied with respect to the untrimmed mates, not the trimmed
mates. Default: 500.
mates.
The larger the difference between `-I` and `-X`, the slower Bowtie 2 will
run. This is because larger differences between `-I` and `-X` require that
Bowtie 2 scan a larger window to determine if a concordant alignment exists.
For typical fragment length ranges (200 to 400 nucleotides), Bowtie 2 is very
efficient.
Default: 500.
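The worked examples for `-I` and `-X` amount to a range check on the outer distance of the pair. A minimal sketch, assuming non-overlapping mates with a non-negative gap: the function names are hypothetical, and the thresholds used in the usage note (60 for `-I`, 100 for `-X`) are inferred from the 20-bp and 60-bp gap examples in the text rather than quoted from it. Trimming and mate orientation are not modelled.

```cpp
#include <cstdint>

// Outer distance of a pair: both mate lengths plus the gap between them.
std::uint64_t fragmentLength(std::uint64_t mate1Len, std::uint64_t mate2Len,
                             std::uint64_t gap) {
    return mate1Len + mate2Len + gap;
}

// True when the fragment length satisfies both the minimum (-I) and the
// maximum (-X) constraint; both bounds are inclusive.
bool withinInsertRange(std::uint64_t fragLen, std::uint64_t minins,
                       std::uint64_t maxins) {
    return fragLen >= minins && fragLen <= maxins;
}
```

With a minimum of 60 and a maximum of 100, two 20-bp mates separated by a 20-bp gap (fragment length 60) pass, a 19-bp gap (59) fails the minimum, a 60-bp gap (100) passes, and a 61-bp gap (101) fails the maximum.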
--fr/--rf/--ff
......@@ -1494,10 +1536,12 @@ alignment:
XS:i:<N>
Alignment score for second-best alignment. Can be negative. Can be greater
than 0 in `--local` mode (but not in `--end-to-end` mode). Only present
if the SAM record is for an aligned read and more than one alignment was
found for the read.
Alignment score for the best-scoring alignment found other than the
alignment reported. Can be negative. Can be greater than 0 in `--local`
mode (but not in `--end-to-end` mode). Only present if the SAM record is
for an aligned read and more than one alignment was found for the read.
Note that, when the read is part of a concordantly-aligned pair, this score
could be greater than `AS:i`.
YS:i:<N>
......@@ -1559,7 +1603,8 @@ The `bowtie2-build` indexer
`bowtie2-build` builds a Bowtie index from a set of DNA sequences.
`bowtie2-build` outputs a set of 6 files with suffixes `.1.bt2`, `.2.bt2`,
`.3.bt2`, `.4.bt2`, `.rev.1.bt2`, and `.rev.2.bt2`. These files together
`.3.bt2`, `.4.bt2`, `.rev.1.bt2`, and `.rev.2.bt2`. For a large index, these
suffixes end in `.bt2l` instead of `.bt2`. These files together
constitute the index: they are all that is needed to align reads to that
reference. The original sequence FASTA files are no longer used by Bowtie 2
once the index is built.
......@@ -1582,19 +1627,11 @@ profitable trade-offs depending on the application. They have been set to
defaults that are reasonable for most cases according to our experiments. See
[Performance tuning] for details.
Because `bowtie2-build` uses 32-bit pointers internally, it can handle up to a
theoretical maximum of 2^32-1 (somewhat more than 4 billion) characters in an
index, though, with other constraints, the actual ceiling is somewhat less than
that. If your reference exceeds 2^32-1 characters, `bowtie2-build` will print
an error message and abort. To resolve this, divide your reference sequences
into smaller batches and/or chunks and build a separate index for each.
If your computer has more than 3-4 GB of memory and you would like to exploit
that fact to make index building faster, use a 64-bit version of the
`bowtie2-build` binary. The 32-bit version of the binary is restricted to using
less than 4 GB of memory. If a 64-bit pre-built binary does not yet exist for
your platform on the sourceforge download site, you will need to build one from
source.
`bowtie2-build` can generate either [small or large indexes]. The wrapper
will decide which based on the length of the input genome. If the reference
does not exceed 4 billion characters but a large index is preferred, the user
can specify `--large-index` to force `bowtie2-build` to build a large index
instead.
The Bowtie 2 index is based on the [FM Index] of Ferragina and Manzini, which in
turn is based on the [Burrows-Wheeler] transform. The algorithm used to build
......@@ -1636,6 +1673,11 @@ The reference input files (specified as `<reference_in>`) are FASTA files
The reference sequences are given on the command line. I.e. `<reference_in>` is
a comma-separated list of sequences rather than a list of FASTA files.
--large-index
Force `bowtie2-build` to build a [large index], even if the reference is less
than ~ 4 billion nucleotides long.
-a/--noauto
Disable the default behavior whereby `bowtie2-build` automatically selects
......
......@@ -143,10 +143,10 @@ Obtaining Bowtie 2
==================
Download Bowtie 2 sources and binaries from the [Download] section of the
Sourceforge site. Binaries are available for Intel architectures (`i386` and
`x86_64`) running Linux, and Mac OS X. A 32-bit version is available for
Windows. If you plan to compile Bowtie 2 yourself, make sure to get the source
package, i.e., the filename that ends in "-source.zip".
Sourceforge site. Binaries are available for the Intel `x86_64` architecture
running Linux, Mac OS X, and Windows. If you plan to compile Bowtie 2 yourself,
make sure to get the source package, i.e., the filename that ends in
"-source.zip".
Building from source
--------------------
......@@ -749,27 +749,35 @@ For datasets consisting of pairs, the summary might look like this:
The indentation indicates how subtotals relate to totals.
Wrapper
-------
Wrapper scripts
---------------
The `bowtie2` executable is actually a Perl wrapper script that calls the
compiled `bowtie2-align` binary. It is recommended that you always run the
`bowtie2` wrapper and not run `bowtie2-align` directly.
The `bowtie2`, `bowtie2-build` and `bowtie2-inspect` executables are actually
wrapper scripts that call binary programs as appropriate. The wrappers shield
users from having to distinguish between "small" and "large" index formats,
discussed briefly in the following section. Also, the `bowtie2` wrapper
provides some key functionality, like the ability to handle compressed inputs,
and the functionality for [`--un`], [`--al`] and related options.
Performance tuning
------------------
It is recommended that you always run the bowtie2 wrappers and not run the
binaries directly.
1. Use 64-bit version if possible
Small and large indexes
-----------------------
The 64-bit version of Bowtie 2 is faster than the 32-bit version, owing to
its use of 64-bit arithmetic. If possible, download the 64-bit binaries for
Bowtie 2 and run on a 64-bit computer. If you are building Bowtie 2 from
sources, you may need to pass the `-m64` option to `g++` to compile the
64-bit version; you can do this by including `BITS=64` in the arguments to
the `make` command; e.g.: `make BITS=64 bowtie2`. To determine whether your
version of bowtie is 64-bit or 32-bit, run `bowtie2 --version`.
`bowtie2-build` can index reference genomes of any size. For genomes less than
about 4 billion nucleotides in length, `bowtie2-build` builds a "small" index
using 32-bit numbers in various parts of the index. When the genome is longer,
`bowtie2-build` builds a "large" index using 64-bit numbers. Small indexes are
stored in files with the `.bt2` extension, and large indexes are stored in
files with the `.bt2l` extension. The user need not worry about whether a
particular index is small or large; the wrapper scripts will automatically build
and use the appropriate index.
2. If your computer has multiple processors/cores, use `-p`
Performance tuning
------------------
1. If your computer has multiple processors/cores, use `-p`
The [`-p`] option causes Bowtie 2 to launch a specified number of parallel
search threads. Each thread runs on a different processor/core and all
......@@ -777,6 +785,23 @@ Performance tuning
approximately a multiple of the number of threads (though in practice,
speedup is somewhat worse than linear).
2. If reporting many alignments per read, try reducing
`bowtie2-build --offrate`
If you are using [`-k`] or [`-a`] options and Bowtie 2 is reporting many
alignments per read, using an index with a denser SA sample can speed
things up considerably. To do this, specify a smaller-than-default
[`-o`/`--offrate`](#bowtie2-build-options-o) value when running `bowtie2-build`.
A denser SA sample yields a larger index, but is also particularly
effective at speeding up alignment when many alignments are reported per
read.
3. If `bowtie2` "thrashes", try increasing `bowtie2-build --offrate`
If `bowtie2` runs very slowly on a relatively low-memory computer, try
setting [`-o`/`--offrate`] to a *larger* value when building the index.
This decreases the memory footprint of the index.
Command Line
------------
......@@ -2197,10 +2222,12 @@ alignment:
</td>
<td>
Alignment score for second-best alignment. Can be negative. Can be greater
than 0 in [`--local`] mode (but not in [`--end-to-end`] mode). Only present
if the SAM record is for an aligned read and more than one alignment was
found for the read.
Alignment score for the best-scoring alignment found other than the
alignment reported. Can be negative. Can be greater than 0 in [`--local`]
mode (but not in [`--end-to-end`] mode). Only present if the SAM record is
for an aligned read and more than one alignment was found for the read.
Note that, when the read is part of a concordantly-aligned pair, this score
could be greater than [`AS:i`].
</td></tr>
<tr><td id="bowtie2-build-opt-fields-ys">
......@@ -2338,7 +2365,8 @@ The `bowtie2-build` indexer
`bowtie2-build` builds a Bowtie index from a set of DNA sequences.
`bowtie2-build` outputs a set of 6 files with suffixes `.1.bt2`, `.2.bt2`,
`.3.bt2`, `.4.bt2`, `.rev.1.bt2`, and `.rev.2.bt2`. These files together
`.3.bt2`, `.4.bt2`, `.rev.1.bt2`, and `.rev.2.bt2`. For a large index, these
suffixes end in `.bt2l` instead of `.bt2`. These files together
constitute the index: they are all that is needed to align reads to that
reference. The original sequence FASTA files are no longer used by Bowtie 2
once the index is built.
......@@ -2361,19 +2389,11 @@ profitable trade-offs depending on the application. They have been set to
defaults that are reasonable for most cases according to our experiments. See
[Performance tuning] for details.
Because `bowtie2-build` uses 32-bit pointers internally, it can handle up to a
theoretical maximum of 2^32-1 (somewhat more than 4 billion) characters in an
index, though, with other constraints, the actual ceiling is somewhat less than
that. If your reference exceeds 2^32-1 characters, `bowtie2-build` will print
an error message and abort. To resolve this, divide your reference sequences
into smaller batches and/or chunks and build a separate index for each.
If your computer has more than 3-4 GB of memory and you would like to exploit
that fact to make index building faster, use a 64-bit version of the
`bowtie2-build` binary. The 32-bit version of the binary is restricted to using
less than 4 GB of memory. If a 64-bit pre-built binary does not yet exist for
your platform on the sourceforge download site, you will need to build one from
source.
`bowtie2-build` can generate either [small or large indexes](#small-and-large-indexes). The wrapper
will decide which based on the length of the input genome. If the reference
does not exceed 4 billion characters but a large index is preferred, the user
can specify [`--large-index`] to force `bowtie2-build` to build a large index
instead.
The Bowtie 2 index is based on the [FM Index] of Ferragina and Manzini, which in
turn is based on the [Burrows-Wheeler] transform. The algorithm used to build
......@@ -2438,6 +2458,18 @@ The reference input files (specified as `<reference_in>`) are FASTA files
The reference sequences are given on the command line. I.e. `<reference_in>` is
a comma-separated list of sequences rather than a list of FASTA files.
</td></tr>
</td></tr><tr><td id="bowtie2-build-options-large-index">
[`--large-index`]: #bowtie2-build-options-large-index
--large-index
</td><td>
Force `bowtie2-build` to build a [large index](#small-and-large-indexes), even if the reference is less
than ~ 4 billion nucleotides long.
</td></tr>
<tr><td id="bowtie2-build-options-a">
......
......@@ -56,6 +56,12 @@ ifneq (,$(findstring Darwin,$(shell uname)))
MACOS = 1
endif
POPCNT_CAPABILITY ?= 1
ifeq (1, $(POPCNT_CAPABILITY))
EXTRA_FLAGS += -DPOPCNT_CAPABILITY
INC += -I third_party
endif
MM_DEF =
ifeq (1,$(BOWTIE_MM))
......@@ -111,16 +117,20 @@ SEARCH_CPPS = qual.cpp pat.cpp sam.cpp \
aligner_driver.cpp
SEARCH_CPPS_MAIN = $(SEARCH_CPPS) bowtie_main.cpp
DP_CPPS = qual.cpp aligner_sw.cpp aligner_result.cpp ref_coord.cpp mask.cpp \
simple_func.cpp sse_util.cpp aligner_bt.cpp aligner_swsse.cpp \
aligner_swsse_loc_i16.cpp aligner_swsse_ee_i16.cpp \
aligner_swsse_loc_u8.cpp aligner_swsse_ee_u8.cpp scoring.cpp
BUILD_CPPS = diff_sample.cpp
BUILD_CPPS_MAIN = $(BUILD_CPPS) bowtie_build_main.cpp
SEARCH_FRAGMENTS = $(wildcard search_*_phase*.c)
VERSION = $(shell cat VERSION)
# Convert BITS=?? to a -m flag
BITS=32
ifeq (x86_64,$(shell uname -m))
BITS=64
BITS=64
endif
# msys will always be 32 bit so look at the cpu arch instead.
ifneq (,$(findstring AMD64,$(PROCESSOR_ARCHITEW6432)))
......@@ -128,33 +138,35 @@ ifneq (,$(findstring AMD64,$(PROCESSOR_ARCHITEW6432)))
BITS=64
endif
endif
BITS_FLAG =
ifeq (32,$(BITS))
BITS_FLAG = -m32
$(error bowtie2 compilation requires a 64-bit platform )
endif
ifeq (64,$(BITS))
BITS_FLAG = -m64
endif
SSE_FLAG=-msse2
SSE_FLAG=-msse2
DEBUG_FLAGS = -O0 -g3 $(BITS_FLAG) $(SSE_FLAG)
DEBUG_FLAGS = -O0 -g3 -m64 $(SSE_FLAG)
DEBUG_DEFS = -DCOMPILER_OPTIONS="\"$(DEBUG_FLAGS) $(EXTRA_FLAGS)\""
RELEASE_FLAGS = -O3 $(BITS_FLAG) $(SSE_FLAG) -funroll-loops -g3
RELEASE_FLAGS = -O3 -m64 $(SSE_FLAG) -funroll-loops -g3
RELEASE_DEFS = -DCOMPILER_OPTIONS="\"$(RELEASE_FLAGS) $(EXTRA_FLAGS)\""
NOASSERT_FLAGS = -DNDEBUG
FILE_FLAGS = -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE
BOWTIE2_BIN_LIST = bowtie2-build \
bowtie2-align \
bowtie2-inspect
BOWTIE2_BIN_LIST_AUX = bowtie2-build-debug \
bowtie2-align-debug \
bowtie2-inspect-debug
BOWTIE2_BIN_LIST = bowtie2-build-s \
bowtie2-build-l \
bowtie2-align-s \
bowtie2-align-l \
bowtie2-inspect-s \
bowtie2-inspect-l
BOWTIE2_BIN_LIST_AUX = bowtie2-build-s-debug \
bowtie2-build-l-debug \
bowtie2-align-s-debug \
bowtie2-align-l-debug \
bowtie2-inspect-s-debug \
bowtie2-inspect-l-debug
GENERAL_LIST = $(wildcard scripts/*.sh) \
$(wildcard scripts/*.pl) \
$(wildcard third_party/*) \
doc/manual.html \
doc/README \
doc/style.css \
......@@ -164,6 +176,8 @@ GENERAL_LIST = $(wildcard scripts/*.sh) \
example/reference/lambda_virus.fa \
$(PTHREAD_PKG) \
bowtie2 \
bowtie2-build \
bowtie2-inspect \
AUTHORS \
LICENSE \
NEWS \
......@@ -172,6 +186,10 @@ GENERAL_LIST = $(wildcard scripts/*.sh) \
TUTORIAL \
VERSION
ifeq (1,$(WINDOWS))
BOWTIE2_BIN_LIST := $(BOWTIE2_BIN_LIST) bowtie2.bat bowtie2-build.bat bowtie2-inspect.bat
endif
# This is helpful on Windows under MinGW/MSYS, where Make might go for
# the Windows FIND tool instead.
FIND=$(shell which find)
......@@ -192,9 +210,9 @@ all: $(BOWTIE2_BIN_LIST)
allall: $(BOWTIE2_BIN_LIST) $(BOWTIE2_BIN_LIST_AUX)
both: bowtie2 bowtie2-build
both: bowtie2-align-s bowtie2-build-s bowtie2-align-l bowtie2-build-l
both-debug: bowtie2-align-debug bowtie2-build-debug
both-debug: bowtie2-align-s-debug bowtie2-build-s-debug bowtie2-align-l-debug bowtie2-build-l-debug
DEFS=-fno-strict-aliasing \
-DBOWTIE2_VERSION="\"`cat VERSION`\"" \
......@@ -210,15 +228,23 @@ DEFS=-fno-strict-aliasing \
# bowtie2-build targets
#
bowtie2-build: bt2_build.cpp $(SHARED_CPPS) $(HEADERS)
bowtie2-build-s: bt2_build.cpp $(SHARED_CPPS) $(HEADERS)
$(CXX) $(RELEASE_FLAGS) $(RELEASE_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 $(NOASSERT_FLAGS) -Wall \
$(INC) \
-o $@ $< \
$(SHARED_CPPS) $(BUILD_CPPS_MAIN) \
$(LIBS) $(BUILD_LIBS)
bowtie2-build-debug: bt2_build.cpp $(SHARED_CPPS) $(HEADERS)
bowtie2-build-l: bt2_build.cpp $(SHARED_CPPS) $(HEADERS)
$(CXX) $(RELEASE_FLAGS) $(RELEASE_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_64BIT_INDEX $(NOASSERT_FLAGS) -Wall \
$(INC) \
-o $@ $< \
$(SHARED_CPPS) $(BUILD_CPPS_MAIN) \
$(LIBS) $(BUILD_LIBS)
bowtie2-build-s-debug: bt2_build.cpp $(SHARED_CPPS) $(HEADERS)
$(CXX) $(DEBUG_FLAGS) $(DEBUG_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -Wall \
$(INC) \
......@@ -226,11 +252,19 @@ bowtie2-build-debug: bt2_build.cpp $(SHARED_CPPS) $(HEADERS)
$(SHARED_CPPS) $(BUILD_CPPS_MAIN) \
$(LIBS) $(BUILD_LIBS)
bowtie2-build-l-debug: bt2_build.cpp $(SHARED_CPPS) $(HEADERS)
$(CXX) $(DEBUG_FLAGS) $(DEBUG_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_64BIT_INDEX -Wall \
$(INC) \
-o $@ $< \
$(SHARED_CPPS) $(BUILD_CPPS_MAIN) \
$(LIBS) $(BUILD_LIBS)
#
# bowtie targets
# bowtie2-align targets
#
bowtie2-align: bt2_search.cpp $(SEARCH_CPPS) $(SHARED_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
bowtie2-align-s: bt2_search.cpp $(SEARCH_CPPS) $(SHARED_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
$(CXX) $(RELEASE_FLAGS) $(RELEASE_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 $(NOASSERT_FLAGS) -Wall \
$(INC) \
......@@ -238,7 +272,15 @@ bowtie2-align: bt2_search.cpp $(SEARCH_CPPS) $(SHARED_CPPS) $(HEADERS) $(SEARCH_
$(SHARED_CPPS) $(SEARCH_CPPS_MAIN) \
$(LIBS) $(SEARCH_LIBS)
bowtie2-align-debug: bt2_search.cpp $(SEARCH_CPPS) $(SHARED_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
bowtie2-align-l: bt2_search.cpp $(SEARCH_CPPS) $(SHARED_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
$(CXX) $(RELEASE_FLAGS) $(RELEASE_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_64BIT_INDEX $(NOASSERT_FLAGS) -Wall \
$(INC) \
-o $@ $< \
$(SHARED_CPPS) $(SEARCH_CPPS_MAIN) \
$(LIBS) $(SEARCH_LIBS)
bowtie2-align-s-debug: bt2_search.cpp $(SEARCH_CPPS) $(SHARED_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
$(CXX) $(DEBUG_FLAGS) \
$(DEBUG_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -Wall \
......@@ -247,11 +289,20 @@ bowtie2-align-debug: bt2_search.cpp $(SEARCH_CPPS) $(SHARED_CPPS) $(HEADERS) $(S
$(SHARED_CPPS) $(SEARCH_CPPS_MAIN) \
$(LIBS) $(SEARCH_LIBS)
bowtie2-align-l-debug: bt2_search.cpp $(SEARCH_CPPS) $(SHARED_CPPS) $(HEADERS) $(SEARCH_FRAGMENTS)
$(CXX) $(DEBUG_FLAGS) \
$(DEBUG_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_64BIT_INDEX -Wall \
$(INC) \
-o $@ $< \
$(SHARED_CPPS) $(SEARCH_CPPS_MAIN) \
$(LIBS) $(SEARCH_LIBS)
#
# bowtie2-inspect targets
#
bowtie2-inspect: bt2_inspect.cpp $(HEADERS) $(SHARED_CPPS)
bowtie2-inspect-s: bt2_inspect.cpp $(HEADERS) $(SHARED_CPPS)
$(CXX) $(RELEASE_FLAGS) \
$(RELEASE_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_INSPECT_MAIN -Wall \
......@@ -260,7 +311,16 @@ bowtie2-inspect: bt2_inspect.cpp $(HEADERS) $(SHARED_CPPS)
$(SHARED_CPPS) \
$(LIBS) $(INSPECT_LIBS)
bowtie2-inspect-debug: bt2_inspect.cpp $(HEADERS) $(SHARED_CPPS)
bowtie2-inspect-l: bt2_inspect.cpp $(HEADERS) $(SHARED_CPPS)
$(CXX) $(RELEASE_FLAGS) \
$(RELEASE_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_INSPECT_MAIN -DBOWTIE_64BIT_INDEX -Wall \
$(INC) -I . \
-o $@ $< \
$(SHARED_CPPS) \
$(LIBS) $(INSPECT_LIBS)
bowtie2-inspect-s-debug: bt2_inspect.cpp $(HEADERS) $(SHARED_CPPS)
$(CXX) $(DEBUG_FLAGS) \
$(DEBUG_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_INSPECT_MAIN -Wall \
......@@ -269,6 +329,49 @@ bowtie2-inspect-debug: bt2_inspect.cpp $(HEADERS) $(SHARED_CPPS)
$(SHARED_CPPS) \
$(LIBS) $(INSPECT_LIBS)
bowtie2-inspect-l-debug: bt2_inspect.cpp $(HEADERS) $(SHARED_CPPS)
$(CXX) $(DEBUG_FLAGS) \
$(DEBUG_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_64BIT_INDEX -DBOWTIE_INSPECT_MAIN -Wall \
$(INC) -I . \
-o $@ $< \
$(SHARED_CPPS) \
$(LIBS) $(INSPECT_LIBS)
#
# bowtie2-dp targets
#
bowtie2-dp: bt2_dp.cpp $(HEADERS) $(SHARED_CPPS) $(DP_CPPS)
$(CXX) $(RELEASE_FLAGS) \
$(RELEASE_DEFS) $(EXTRA_FLAGS) $(NOASSERT_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_DP_MAIN -Wall \
$(INC) -I . \
-o $@ $< \
$(DP_CPPS) $(SHARED_CPPS) \
$(LIBS) $(SEARCH_LIBS)
bowtie2-dp-debug: bt2_dp.cpp $(HEADERS) $(SHARED_CPPS) $(DP_CPPS)
$(CXX) $(DEBUG_FLAGS) \
$(DEBUG_DEFS) $(EXTRA_FLAGS) \
$(DEFS) -DBOWTIE2 -DBOWTIE_DP_MAIN -Wall \
$(INC) -I . \
-o $@ $< \
$(DP_CPPS) $(SHARED_CPPS) \
$(LIBS) $(SEARCH_LIBS)
bowtie2.bat:
echo "@echo off" > bowtie2.bat
echo "perl %~dp0/bowtie2 %*" >> bowtie2.bat
bowtie2-build.bat:
echo "@echo off" > bowtie2-build.bat
echo "python %~dp0/bowtie2-build %*" >> bowtie2-build.bat
bowtie2-inspect.bat:
echo "@echo off" > bowtie2-inspect.bat
echo "python %~dp0/bowtie2-inspect %*" >> bowtie2-inspect.bat
.PHONY: bowtie2-src
bowtie2-src: $(SRC_PKG_LIST)
chmod a+x scripts/*.sh scripts/*.pl
......@@ -287,15 +390,15 @@ bowtie2-bin: $(BIN_PKG_LIST) $(BOWTIE2_BIN_LIST) $(BOWTIE2_BIN_LIST_AUX)
rm -rf .bin.tmp
mkdir .bin.tmp
mkdir .bin.tmp/bowtie2-$(VERSION)
if [ -f bowtie.exe ] ; then \
if [ -f bowtie2-align-s.exe ] ; then \
zip tmp.zip $(BIN_PKG_LIST) $(addsuffix .exe,$(BOWTIE2_BIN_LIST) $(BOWTIE2_BIN_LIST_AUX)) ; \
else \
zip tmp.zip $(BIN_PKG_LIST) $(BOWTIE2_BIN_LIST) $(BOWTIE2_BIN_LIST_AUX) ; \
fi
mv tmp.zip .bin.tmp/bowtie2-$(VERSION)
cd .bin.tmp/bowtie2-$(VERSION) ; unzip tmp.zip ; rm -f tmp.zip
cd .bin.tmp ; zip -r bowtie2-$(VERSION)-$(BITS).zip bowtie2-$(VERSION)
cp .bin.tmp/bowtie2-$(VERSION)-$(BITS).zip .
cd .bin.tmp ; zip -r bowtie2-$(VERSION).zip bowtie2-$(VERSION)
cp .bin.tmp/bowtie2-$(VERSION).zip .
rm -rf .bin.tmp
bowtie2-seeds-debug: aligner_seed.cpp ccnt_lut.cpp alphabet.cpp aligner_seed.h bt2_idx.cpp bt2_io.cpp
......
......@@ -3,7 +3,7 @@ Bowtie 2 NEWS
Bowtie 2 is now available for download from the project website,
http://bowtie-bio.sf.net/bowtie2. 2.0.0-beta1 is the first version released to
the public and 2.0.7 is the latest version. Bowtie 2 is licensed under
the public and 2.2.0 is the latest version. Bowtie 2 is licensed under
the GPLv3 license. See `LICENSE' file for details.
Reporting Issues
......@@ -16,6 +16,31 @@ Please report any issues using the Sourceforge bug tracker:
Version Release History
=======================
Version 2.2.0 - February 10, 2014
* Improved index querying efficiency using "population count" instructions
available since SSE4.2.
* Added support for large and small indexes, removing 4-billion-nucleotide
barrier. Bowtie 2 can now be used with reference genomes of any size.
* Fixed bug that could cause bowtie2-build to crash when reference length
is close to 4 billion.
* Fixed issue in bowtie2-inspect that caused -e mode not to output
nucleotides properly.
* Added a CL: string to the @PG SAM header to preserve information about
the aligner binary and parameters.
* No longer releasing 32-bit binaries. Simplified manual and Makefile
accordingly.
* Credits to the Intel(r) enabling team for performance optimizations
included in this release. Thank you!
* Phased out Cygwin support.
* Added the .bat generation for Windows.
* Fixed issue with references consisting of one very large sequence.
* Fixed some issues with rare characters in FASTA files.
* Fixed wrappers so Bowtie 2 can now be used with symlinks.
Bowtie 2 on GitHub - February 4, 2014
* Bowtie 2 source now lives in a public GitHub repository:
https://github.com/BenLangmead/bowtie2.
Version 2.1.0 - February 21, 2013
* Improved multithreading support so that Bowtie 2 now uses native Windows
threads when compiled on Windows and uses a faster mutex. Threading
......
......@@ -54,10 +54,10 @@ bool SAVal::repOk(const AlignmentCache& ac) const {
bool AlignmentCache::addOnTheFly(
QVal& qv, // qval that points to the range of reference substrings
const SAKey& sak, // the key holding the reference substring
uint32_t topf, // top range elt in BWT index
uint32_t botf, // bottom range elt in BWT index
uint32_t topb, // top range elt in BWT' index
uint32_t botb, // bottom range elt in BWT' index
TIndexOffU topf, // top range elt in BWT index
TIndexOffU botf, // bottom range elt in BWT index
TIndexOffU topb, // top range elt in BWT' index
TIndexOffU botb, // bottom range elt in BWT' index
bool getLock)
{
ThreadSafe ts(lockPtr(), shared_ && getLock);
......@@ -85,14 +85,14 @@ bool AlignmentCache::addOnTheFly(
}
assert(s->key.repOk());
if(added) {
s->payload.i = (uint32_t)salist_.size();
s->payload.i = (TIndexOffU)salist_.size();
s->payload.len = botf - topf;
s->payload.topf = topf;
s->payload.topb = topb;
for(size_t j = 0; j < (botf-topf); j++) {
if(!salist_.add(pool(), 0xffffffff)) {
if(!salist_.add(pool(), OFF_MASK)) {
// Change the payload's len field
s->payload.len = (uint32_t)j;
s->payload.len = (TIndexOffU)j;
return false; // Exhausted pool memory
}
}
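The hunks above swap fixed `uint32_t` offsets for `TIndexOffU` and the literal `0xffffffff` for `OFF_MASK`. The header `btypes.h` that defines them is not shown in this diff, so the following is only a plausible sketch of how such definitions could work, keyed on the `BOWTIE_64BIT_INDEX` macro that the Makefile passes to the `-l` targets. The helper `isPlaceholder` is hypothetical, and reading `OFF_MASK` as an "all bits set" placeholder is an interpretation of the `salist_.add(pool(), OFF_MASK)` call, not a documented fact.

```cpp
#include <cstdint>
#include <limits>

#ifdef BOWTIE_64BIT_INDEX
typedef std::uint64_t TIndexOffU;  // "large" index builds (the -l binaries)
#else
typedef std::uint32_t TIndexOffU;  // "small" index builds (the -s binaries)
#endif

// All-bits-set value of the offset type; equals 0xffffffff in a small-index
// build, matching the literal this change replaces.
const TIndexOffU OFF_MASK = std::numeric_limits<TIndexOffU>::max();

// Hypothetical helper: true when an offset still holds the placeholder value.
inline bool isPlaceholder(TIndexOffU off) {
    return off == OFF_MASK;
}
```

Tying the offset width to one typedef lets the same source compile into both the small-index and large-index binaries, which is what the paired `-s` and `-l` Makefile targets rely on.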
......@@ -214,7 +214,7 @@ static void aligner_cache_tests() {
}
// Add all of the 4-mers in several different random orders
RandomSource rand;
for(uint32_t runs = 0; runs < 100; runs++) {
for(unsigned runs = 0; runs < 100; runs++) {
rb.clear();
p.clear();
assert_eq(0, rb.size());
......
......@@ -61,10 +61,11 @@
#include "threading.h"
#include "mem_ids.h"
#include "simple_func.h"
#include "btypes.h"
#define CACHE_PAGE_SZ (16 * 1024)
typedef PListSlice<uint32_t, CACHE_PAGE_SZ> TSlice;
typedef PListSlice<TIndexOffU, CACHE_PAGE_SZ> TSlice;
/**
* Key for the query multimap: the read substring and its length.
......@@ -194,13 +195,13 @@ public:
/**
* Return the offset of the first reference substring in the qlist.
*/
uint32_t offset() const { return i_; }
TIndexOffU offset() const { return i_; }
/**
* Return the number of reference substrings associated with a read
* substring.
*/
uint32_t numRanges() const {
TIndexOffU numRanges() const {
assert(valid());
return rangen_;