Commit 9aae020c authored by Charles Plessy

Imported Upstream version 0.5.7

parent 723987e1
------------------------------------------------------------------------
r1309 | lh3 | 2010-02-26 21:42:22 -0500 (Fri, 26 Feb 2010) | 4 lines
Changed paths:
M /branches/prog/bwa/bwape.c
M /branches/prog/bwa/bwtaln.c
M /branches/prog/bwa/main.c
* bwa-0.5.6-2 (r1309)
* fixed an unfixed bug (by Carol Scott)
* fixed some tiny formatting
------------------------------------------------------------------------
r1305 | lh3 | 2010-02-25 13:47:58 -0500 (Thu, 25 Feb 2010) | 3 lines
Changed paths:
M /branches/prog/bwa/bwape.c
M /branches/prog/bwa/bwase.c
M /branches/prog/bwa/bwtaln.c
M /branches/prog/bwa/bwtsw2_main.c
M /branches/prog/bwa/main.c
* bwa-0.5.6-1 (r1304)
* optionally write output to a file (by Tim Fennel)
------------------------------------------------------------------------
r1303 | lh3 | 2010-02-10 23:43:48 -0500 (Wed, 10 Feb 2010) | 2 lines
Changed paths:
M /branches/prog/bwa/ChangeLog
M /branches/prog/bwa/NEWS
M /branches/prog/bwa/bwa.1
M /branches/prog/bwa/bwtsw2_main.c
M /branches/prog/bwa/main.c
Release bwa-0.5.6
------------------------------------------------------------------------
r1302 | lh3 | 2010-02-10 11:11:49 -0500 (Wed, 10 Feb 2010) | 3 lines
Changed paths:
......
Installation Instructions
*************************
Simply type `make' to compile and copy the resulting executable `bwa'
anywhere you want. You may also like to copy `solid2fastq.pl' and
`qualfa2fq.pl' for format conversion.
Copyright (C) 1994, 1995, 1996, 1999, 2000, 2001, 2002, 2004, 2005,
2006 Free Software Foundation, Inc.
This file is free documentation; the Free Software Foundation gives
unlimited permission to copy, distribute and modify it.
Basic Installation
==================
Briefly, the shell commands `./configure; make; make install' should
configure, build, and install this package. The following
more-detailed instructions are generic; see the `README' file for
instructions specific to this package.
The `configure' shell script attempts to guess correct values for
various system-dependent variables used during compilation. It uses
those values to create a `Makefile' in each directory of the package.
It may also create one or more `.h' files containing system-dependent
definitions. Finally, it creates a shell script `config.status' that
you can run in the future to recreate the current configuration, and a
file `config.log' containing compiler output (useful mainly for
debugging `configure').
It can also use an optional file (typically called `config.cache'
and enabled with `--cache-file=config.cache' or simply `-C') that saves
the results of its tests to speed up reconfiguring. Caching is
disabled by default to prevent problems with accidental use of stale
cache files.
If you need to do unusual things to compile the package, please try
to figure out how `configure' could check whether to do them, and mail
diffs or instructions to the address given in the `README' so they can
be considered for the next release. If you are using the cache, and at
some point `config.cache' contains results you don't want to keep, you
may remove or edit it.
The file `configure.ac' (or `configure.in') is used to create
`configure' by a program called `autoconf'. You need `configure.ac' if
you want to change it or regenerate `configure' using a newer version
of `autoconf'.
The simplest way to compile this package is:
1. `cd' to the directory containing the package's source code and type
`./configure' to configure the package for your system.
Running `configure' might take a while. While running, it prints
some messages telling which features it is checking for.
2. Type `make' to compile the package.
3. Optionally, type `make check' to run any self-tests that come with
the package.
4. Type `make install' to install the programs and any data files and
documentation.
5. You can remove the program binaries and object files from the
source code directory by typing `make clean'. To also remove the
files that `configure' created (so you can compile the package for
a different kind of computer), type `make distclean'. There is
also a `make maintainer-clean' target, but that is intended mainly
for the package's developers. If you use it, you may have to get
all sorts of other programs in order to regenerate files that came
with the distribution.
Compilers and Options
=====================
Some systems require unusual options for compilation or linking that the
`configure' script does not know about. Run `./configure --help' for
details on some of the pertinent environment variables.
You can give `configure' initial values for configuration parameters
by setting variables in the command line or in the environment. Here
is an example:
./configure CC=c99 CFLAGS=-g LIBS=-lposix
*Note Defining Variables::, for more details.
Compiling For Multiple Architectures
====================================
You can compile the package for more than one kind of computer at the
same time, by placing the object files for each architecture in their
own directory. To do this, you can use GNU `make'. `cd' to the
directory where you want the object files and executables to go and run
the `configure' script. `configure' automatically checks for the
source code in the directory that `configure' is in and in `..'.
With a non-GNU `make', it is safer to compile the package for one
architecture at a time in the source code directory. After you have
installed the package for one architecture, use `make distclean' before
reconfiguring for another architecture.
Installation Names
==================
By default, `make install' installs the package's commands under
`/usr/local/bin', include files under `/usr/local/include', etc. You
can specify an installation prefix other than `/usr/local' by giving
`configure' the option `--prefix=PREFIX'.
You can specify separate installation prefixes for
architecture-specific files and architecture-independent files. If you
pass the option `--exec-prefix=PREFIX' to `configure', the package uses
PREFIX as the prefix for installing programs and libraries.
Documentation and other data files still use the regular prefix.
In addition, if you use an unusual directory layout you can give
options like `--bindir=DIR' to specify different values for particular
kinds of files. Run `configure --help' for a list of the directories
you can set and what kinds of files go in them.
If the package supports it, you can cause programs to be installed
with an extra prefix or suffix on their names by giving `configure' the
option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'.
Optional Features
=================
Some packages pay attention to `--enable-FEATURE' options to
`configure', where FEATURE indicates an optional part of the package.
They may also pay attention to `--with-PACKAGE' options, where PACKAGE
is something like `gnu-as' or `x' (for the X Window System). The
`README' should mention any `--enable-' and `--with-' options that the
package recognizes.
For packages that use the X Window System, `configure' can usually
find the X include and library files automatically, but if it doesn't,
you can use the `configure' options `--x-includes=DIR' and
`--x-libraries=DIR' to specify their locations.
Specifying the System Type
==========================
There may be some features `configure' cannot figure out automatically,
but needs to determine by the type of machine the package will run on.
Usually, assuming the package is built to be run on the _same_
architectures, `configure' can figure that out, but if it prints a
message saying it cannot guess the machine type, give it the
`--build=TYPE' option. TYPE can either be a short name for the system
type, such as `sun4', or a canonical name which has the form:
CPU-COMPANY-SYSTEM
where SYSTEM can have one of these forms:
OS KERNEL-OS
See the file `config.sub' for the possible values of each field. If
`config.sub' isn't included in this package, then this package doesn't
need to know the machine type.
If you are _building_ compiler tools for cross-compiling, you should
use the option `--target=TYPE' to select the type of system they will
produce code for.
If you want to _use_ a cross compiler, that generates code for a
platform different from the build platform, you should specify the
"host" platform (i.e., that on which the generated programs will
eventually be run) with `--host=TYPE'.
Sharing Defaults
================
If you want to set default values for `configure' scripts to share, you
can create a site shell script called `config.site' that gives default
values for variables like `CC', `cache_file', and `prefix'.
`configure' looks for `PREFIX/share/config.site' if it exists, then
`PREFIX/etc/config.site' if it exists. Or, you can set the
`CONFIG_SITE' environment variable to the location of the site script.
A warning: not all `configure' scripts look for a site script.
Defining Variables
==================
Variables not defined in a site shell script can be set in the
environment passed to `configure'. However, some packages may run
configure again during the build, and the customized values of these
variables may be lost. In order to avoid this problem, you should set
them in the `configure' command line, using `VAR=value'. For example:
./configure CC=/usr/local2/bin/gcc
causes the specified `gcc' to be used as the C compiler (unless it is
overridden in the site shell script).
Unfortunately, this technique does not work for `CONFIG_SHELL' due to
an Autoconf bug. Until the bug is fixed you can use this workaround:
CONFIG_SHELL=/bin/bash /bin/bash ./configure CONFIG_SHELL=/bin/bash
`configure' Invocation
======================
`configure' recognizes the following options to control how it operates.
`--help'
`-h'
Print a summary of the options to `configure', and exit.
`--version'
`-V'
Print the version of Autoconf used to generate the `configure'
script, and exit.
`--cache-file=FILE'
Enable the cache: use and save the results of the tests in FILE,
traditionally `config.cache'. FILE defaults to `/dev/null' to
disable caching.
`--config-cache'
`-C'
Alias for `--cache-file=config.cache'.
`--quiet'
`--silent'
`-q'
Do not print messages saying which checks are being made. To
suppress all normal output, redirect it to `/dev/null' (any error
messages will still be shown).
`--srcdir=DIR'
Look for the package's source code in directory DIR. Usually
`configure' can determine that directory automatically.
`configure' also accepts some other, not widely useful, options. Run
`configure --help' for more details.
On 32-bit systems, you should compile bwa with `make CFLAGS=-O2'.
The GNU build system is also supported, which is necessary for
building the Java binding. Nonetheless, directly running `make' is
recommended in most other cases.
\ No newline at end of file
Beta Release 0.5.7 (1 March, 2010)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This release only has an effect on paired-end data with fat insert-size
distribution. Users are still recommended to update as the new release
improves the robustness to poor data.
* The fix for `weird pairing' was not working in version 0.5.6, pointed
out by Carol Scott. It should work now.
* Optionally output to a normal file rather than to stdout (by Tim
Fennel).
(0.5.7: 1 March 2010, r1310)
Beta Release 0.5.6 (10 Feburary, 2010)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -88,7 +88,7 @@ static double ierfc(double x) // inverse erfc(); iphi(x) = M_SQRT2 *ierfc(2 * x)
static int infer_isize(int n_seqs, bwa_seq_t *seqs[2], isize_info_t *ii, double ap_prior, int64_t L)
{
uint64_t x, *isizes;
int n, i, tot, p25, p75, p50;
int n, i, tot, p25, p75, p50, max_len = 1, tmp;
double skewness = 0.0, kurtosis = 0.0, y;
ii->avg = ii->std = -1.0;
......@@ -99,6 +99,8 @@ static int infer_isize(int n_seqs, bwa_seq_t *seqs[2], isize_info_t *ii, double
p[0] = seqs[0] + i; p[1] = seqs[1] + i;
if (p[0]->mapQ >= 20 && p[1]->mapQ >= 20)
isizes[tot++] = (p[0]->pos < p[1]->pos)? p[1]->pos + p[1]->len - p[0]->pos : p[0]->pos + p[0]->len - p[1]->pos;
if (p[0]->len > max_len) max_len = p[0]->len;
if (p[1]->len > max_len) max_len = p[1]->len;
}
if (tot < 20) {
fprintf(stderr, "[infer_isize] fail to infer insert size: too few good pairs\n");
......@@ -109,7 +111,8 @@ static int infer_isize(int n_seqs, bwa_seq_t *seqs[2], isize_info_t *ii, double
p25 = isizes[(int)(tot*0.25 + 0.5)];
p50 = isizes[(int)(tot*0.50 + 0.5)];
p75 = isizes[(int)(tot*0.75 + 0.5)];
ii->low = (int)(p25 - OUTLIER_BOUND * (p75 - p25) + .499);
tmp = (int)(p25 - OUTLIER_BOUND * (p75 - p25) + .499);
ii->low = tmp > max_len? tmp : max_len; // ii->low is unsigned
ii->high = (int)(p75 + OUTLIER_BOUND * (p75 - p25) + .499);
for (i = 0, x = n = 0; i < tot; ++i)
if (isizes[i] >= ii->low && isizes[i] <= ii->high)
......@@ -135,7 +138,6 @@ static int infer_isize(int n_seqs, bwa_seq_t *seqs[2], isize_info_t *ii, double
for (y = 1.0; y < 10.0; y += 0.01)
if (.5 * erfc(y / M_SQRT2) < ap_prior / L * (y * ii->std + ii->avg)) break;
ii->high_bayesian = (bwtint_t)(y * ii->std + ii->avg + .499);
if (ii->low < 0) ii->low = 35;
fprintf(stderr, "[infer_isize] (25, 50, 75) percentile: (%d, %d, %d)\n", p25, p50, p75);
fprintf(stderr, "[infer_isize] low and high boundaries: %d and %d for estimating avg and std\n", ii->low, ii->high);
fprintf(stderr, "[infer_isize] inferred external isize from %d pairs: %.3lf +/- %.3lf\n", n, ii->avg, ii->std);
......@@ -663,7 +665,7 @@ void bwa_sai2sam_pe_core(const char *prefix, char *const fn_sa[2], char *const f
fprintf(stderr, "[bwa_sai2sam_pe_core] convert to sequence coordinate... \n");
cnt_chg = bwa_cal_pac_pos_pe(prefix, bwt, n_seqs, seqs, fp_sa, &ii, popt, &opt, &last_ii);
fprintf(stderr, "[bwa_sai2sam_pe_core] time elapses: %.2f sec\n", (float)(clock() - t) / CLOCKS_PER_SEC); t = clock();
fprintf(stderr, "[bwa_sai2sam_pe_core] change of coordinates in %d alignments.\n", cnt_chg);
fprintf(stderr, "[bwa_sai2sam_pe_core] changing coordinates of %d alignments.\n", cnt_chg);
fprintf(stderr, "[bwa_sai2sam_pe_core] align unmapped mate...\n");
pacseq = bwa_paired_sw(bns, pac, n_seqs, seqs, popt, &ii);
......@@ -708,7 +710,7 @@ int bwa_sai2sam_pe(int argc, char *argv[])
int c;
pe_opt_t *popt;
popt = bwa_init_pe_opt();
while ((c = getopt(argc, argv, "a:o:sPn:N:c:")) >= 0) {
while ((c = getopt(argc, argv, "a:o:sPn:N:c:f:")) >= 0) {
switch (c) {
case 'a': popt->max_isize = atoi(optarg); break;
case 'o': popt->max_occ = atoi(optarg); break;
......@@ -717,6 +719,7 @@ int bwa_sai2sam_pe(int argc, char *argv[])
case 'n': popt->n_multi = atoi(optarg); break;
case 'N': popt->N_multi = atoi(optarg); break;
case 'c': popt->ap_prior = atof(optarg); break;
case 'f': freopen(optarg, "w", stdout); break;
default: return 1;
}
}
......@@ -730,8 +733,9 @@ int bwa_sai2sam_pe(int argc, char *argv[])
fprintf(stderr, " -N INT maximum hits to output for discordant pairs [%d]\n", popt->N_multi);
fprintf(stderr, " -c FLOAT prior of chimeric rate [%.1le]\n", popt->ap_prior);
fprintf(stderr, " -P preload index into memory (for base-space reads only)\n");
fprintf(stderr, " -s disable Smith-Waterman for the unmapped mate\n\n");
fprintf(stderr, "Notes: 1. For SOLiD read, <in1.fq> corresponds R3 reads and <in2.fq> to F3.\n");
fprintf(stderr, " -s disable Smith-Waterman for the unmapped mate\n");
fprintf(stderr, " -f FILE sam file to output results to instead of stdout\n\n");
fprintf(stderr, "Notes: 1. For SOLiD reads, <in1.fq> corresponds R3 reads and <in2.fq> to F3.\n");
fprintf(stderr, " 2. For reads shorter than 30bp, applying a smaller -o is recommended to\n");
fprintf(stderr, " to get a sensible speed at the cost of pairing accuracy.\n");
fprintf(stderr, "\n");
......
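The `infer_isize` hunks above drop the late `ii->low < 0` check and instead clamp the lower bound to the longest read length before it is stored. The in-line comment notes that `ii->low` is unsigned, so a negative bound would wrap around rather than compare below zero. A minimal sketch of that clamping pattern follows. The helper name `clamp_low_bound`, the `uint64_t` return type, and the value given to `OUTLIER_BOUND` are assumptions for illustration, not taken from bwa.

```c
#include <stdint.h>

/* Assumed value, for illustration only. */
#define OUTLIER_BOUND 2.0

/* Hypothetical helper (not a bwa function): derive the lower insert-size
 * bound from the 25th and 75th percentiles, then clamp it so it is never
 * smaller than the longest read. */
static uint64_t clamp_low_bound(int p25, int p75, int max_len)
{
    /* Signed intermediate: negative when the interquartile range is wide. */
    int tmp = (int)(p25 - OUTLIER_BOUND * (p75 - p25) + .499);

    /* Compare while still signed, then convert to the unsigned type. */
    return (uint64_t)(tmp > max_len ? tmp : max_len);
}
```

Under these assumptions, quartiles of 100 and 400 with a longest read of 50 give a negative raw bound, so the helper returns 50.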
......@@ -612,16 +612,17 @@ void bwa_sai2sam_se_core(const char *prefix, const char *fn_sa, const char *fn_f
int bwa_sai2sam_se(int argc, char *argv[])
{
int c, n_occ = 3;
while ((c = getopt(argc, argv, "hn:")) >= 0) {
while ((c = getopt(argc, argv, "hn:f:")) >= 0) {
switch (c) {
case 'h': break;
case 'n': n_occ = atoi(optarg); break;
case 'f': freopen(optarg, "w", stdout); break;
default: return 1;
}
}
if (optind + 3 > argc) {
fprintf(stderr, "Usage: bwa samse [-n max_occ] <prefix> <in.sai> <in.fq>\n");
fprintf(stderr, "Usage: bwa samse [-n max_occ] [-f out.sam] <prefix> <in.sai> <in.fq>\n");
return 1;
}
bwa_sai2sam_se_core(argv[optind], argv[optind+1], argv[optind+2], n_occ);
......
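In the hunks above, `-f` is handled by calling `freopen` on `stdout`, and the return value is not checked. The sketch below shows the same redirection with a failure check. `redirect_stream` is a hypothetical helper, not a bwa function, and it takes the stream as a parameter only so that it can be tried on something other than `stdout`.

```c
#include <stdio.h>

/* Hypothetical helper (not part of bwa): reopen `stream` on `path` with
 * the given fopen-style mode. Returns 0 on success and -1 on failure,
 * after reporting the reason on stderr. */
static int redirect_stream(FILE *stream, const char *path, const char *mode)
{
    if (freopen(path, mode, stream) == NULL) {
        perror(path);
        return -1;
    }
    return 0;
}
```

A caller could write `if (redirect_stream(stdout, optarg, "w") != 0) return 1;` in the `case 'f'` branch. Since `freopen` closes the original stream even when the new open fails, returning early is safer than continuing to write.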
......@@ -230,7 +230,7 @@ int bwa_aln(int argc, char *argv[])
gap_opt_t *opt;
opt = gap_init_opt();
while ((c = getopt(argc, argv, "n:o:e:i:d:l:k:cLR:m:t:NM:O:E:q:")) >= 0) {
while ((c = getopt(argc, argv, "n:o:e:i:d:l:k:cLR:m:t:NM:O:E:q:f:")) >= 0) {
switch (c) {
case 'n':
if (strstr(optarg, ".")) opt->fnr = atof(optarg), opt->max_diff = -1;
......@@ -252,6 +252,7 @@ int bwa_aln(int argc, char *argv[])
case 'q': opt->trim_qual = atoi(optarg); break;
case 'c': opt->mode &= ~BWA_MODE_COMPREAD; break;
case 'N': opt->mode |= BWA_MODE_NONSTOP; opt->max_top2 = 0x7fffffff; break;
case 'f': freopen(optarg, "wb", stdout); break;
default: return 1;
}
}
......@@ -281,6 +282,7 @@ int bwa_aln(int argc, char *argv[])
fprintf(stderr, " -c input sequences are in the color space\n");
fprintf(stderr, " -L log-scaled gap penalty for long deletions\n");
fprintf(stderr, " -N non-iterative mode: search for all n-difference hits (slooow)\n");
fprintf(stderr, " -f FILE file to write output to instead of stdout\n");
fprintf(stderr, "\n");
return 1;
}
......
......@@ -16,7 +16,7 @@ int bwa_bwtsw2(int argc, char *argv[])
opt = bsw2_init_opt();
srand48(11);
while ((c = getopt(argc, argv, "q:r:a:b:t:T:w:d:z:m:y:s:c:N:H")) >= 0) {
while ((c = getopt(argc, argv, "q:r:a:b:t:T:w:d:z:m:y:s:c:N:H:f:")) >= 0) {
switch (c) {
case 'q': opt->q = atoi(optarg); break;
case 'r': opt->r = atoi(optarg); break;
......@@ -32,6 +32,7 @@ int bwa_bwtsw2(int argc, char *argv[])
case 'c': opt->coef = atof(optarg); break;
case 'N': opt->t_seeds = atoi(optarg); break;
case 'H': opt->hard_clip = 1; break;
case 'f': freopen(optarg, "w", stdout);
}
}
opt->qr = opt->q + opt->r;
......@@ -57,6 +58,7 @@ int bwa_bwtsw2(int argc, char *argv[])
fprintf(stderr, " -N INT # seeds to trigger reverse alignment [%d]\n", opt->t_seeds);
fprintf(stderr, " -c FLOAT coefficient of length-threshold adjustment [%.1f]\n", opt->coef);
fprintf(stderr, " -H in SAM output, use hard clipping rather than soft\n");
fprintf(stderr, " -f FILE file to output results to instead of stdout\n");
fprintf(stderr, "\n");
{
......
rm -fr autom4te.cache Makefile.in depcomp configure config.guess config.sub missing aclocal.m4 install-sh bwt_gen/Makefile.in
\ No newline at end of file
......@@ -3,7 +3,7 @@
#include "main.h"
#ifndef PACKAGE_VERSION
#define PACKAGE_VERSION "0.5.6 (r1303)"
#define PACKAGE_VERSION "0.5.7 (r1310)"
#endif
static int usage()
......