Imported Upstream version 2.1.1+dfsg

parent edb0104c
......@@ -17,8 +17,18 @@ The SeqAn-1.3 library is used in TopHat and some of its sources are included
in TopHat source releases; its authors are Andreas Doring, David Weese,
Tobias Rausch, and Knut Reinert.
The IntervalTree library is used by TopHat in its fusion-calling pipeline;
its authors are Chaim-Leib Halbert and Konstantin Tretyakov. This package
is released under the Apache 2.0 license.
The SortedContainers library is used by TopHat in its fusion-calling
pipeline; its author is Grant Jenks. This package is released under the
Apache 2.0 license.
Websites:
TopHat: http://ccb.jhu.edu/software/tophat
Bowtie: http://bowtie-bio.sourceforge.net/bowtie2
Samtools: http://samtools.sourceforge.net
SeqAn: http://www.seqan.de
IntervalTree: https://github.com/chaimleib/intervaltree
SortedContainers: http://www.grantjenks.com/docs/sortedcontainers/
The Artistic License
Preamble
The intent of this document is to state the conditions under which a
Package may be copied, such that the Copyright Holder maintains some
semblance of artistic control over the development of the package,
while giving the users of the package the right to use and distribute
the Package in a more-or-less customary fashion, plus the right to
make reasonable modifications.
Definitions:
* "Package" refers to the collection of files distributed by the
Copyright Holder, and derivatives of that collection of files
created through textual modification.
* "Standard Version" refers to such a Package if it has not been
modified, or has been modified in accordance with the wishes of
the Copyright Holder.
* "Copyright Holder" is whoever is named in the copyright or
copyrights for the package.
* "You" is you, if you're thinking about copying or distributing
this Package.
* "Reasonable copying fee" is whatever you can justify on the
basis of media cost, duplication charges, time of people
involved, and so on. (You will not be required to justify it to
the Copyright Holder, but only to the computing community at
large as a market that must bear the fee.)
* "Freely Available" means that no fee is charged for the item
itself, though there may be fees involved in handling the
item. It also means that recipients of the item may redistribute
it under the same conditions they received it.
1. You may make and give away verbatim copies of the source form of
the Standard Version of this Package without restriction, provided
that you duplicate all of the original copyright notices and
associated disclaimers.
2. You may apply bug fixes, portability fixes and other modifications
derived from the Public Domain or from the Copyright Holder. A
Package modified in such a way shall still be considered the
Standard Version.
3. You may otherwise modify your copy of this Package in any way,
provided that you insert a prominent notice in each changed file
stating how and when you changed that file, and provided that you
do at least ONE of the following:
a) place your modifications in the Public Domain or otherwise make
them Freely Available, such as by posting said modifications to
Usenet or an equivalent medium, or placing the modifications on a
major archive site such as ftp.uu.net, or by allowing the
Copyright Holder to include your modifications in the Standard
Version of the Package.
b) use the modified Package only within your corporation or
organization.
c) rename any non-standard executables so the names do not
conflict with standard executables, which must also be provided,
and provide a separate manual page for each non-standard
executable that clearly documents how it differs from the Standard
Version.
d) make other distribution arrangements with the Copyright Holder.
4. You may distribute the programs of this Package in object code or
executable form, provided that you do at least ONE of the
following:
a) distribute a Standard Version of the executables and library
files, together with instructions (in the manual page or
equivalent) on where to get the Standard Version.
b) accompany the distribution with the machine-readable source of
the Package with your modifications.
c) accompany any non-standard executables with their corresponding
Standard Version executables, giving the non-standard executables
non-standard names, and clearly documenting the differences in
manual pages (or equivalent), together with instructions on where
to get the Standard Version.
d) make other distribution arrangements with the Copyright Holder.
5. You may charge a reasonable copying fee for any distribution of
this Package. You may charge any fee you choose for support of this
Package. You may not charge a fee for this Package itself. However,
you may distribute this Package in aggregate with other (possibly
commercial) programs as part of a larger (possibly commercial)
software distribution provided that you do not advertise this
Package as a product of your own.
6. The scripts and library files supplied as input to or produced as
output from the programs of this Package do not automatically fall
under the copyright of this Package, but belong to whomever
generated them, and may be sold commercially, and may be aggregated
with this Package.
7. C or perl subroutines supplied by you and linked into this Package
shall not be considered part of this Package.
8. The name of the Copyright Holder may not be used to endorse or
promote products derived from this software without specific prior
written permission.
9. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES
OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The End
This license is approved by the Open Source Initiative
(www.opensource.org) for certifying software as OSI Certified Open
Source.
......@@ -2,6 +2,6 @@ ALWAYS_BUILT = src
SUBDIRS = $(ALWAYS_BUILT)
DIST_SUBDIRS = $(ALWAYS_BUILT)
EXTRA_DIST = LICENSE
EXTRA_DIST = LICENSE README AUTHORS
.PHONY: FORCE
......@@ -52,8 +52,8 @@ host_triplet = @host@
subdir = .
DIST_COMMON = README $(am__configure_deps) $(srcdir)/Makefile.am \
$(srcdir)/Makefile.in $(srcdir)/config.h.in \
$(top_srcdir)/configure AUTHORS COPYING ChangeLog INSTALL NEWS \
THANKS ar-lib config.guess config.sub install-sh missing
$(top_srcdir)/configure AUTHORS ChangeLog INSTALL NEWS THANKS \
ar-lib config.guess config.sub depcomp install-sh missing
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/ax_boost_base.m4 \
$(top_srcdir)/ax_boost_thread.m4 $(top_srcdir)/configure.ac
......@@ -248,7 +248,7 @@ top_srcdir = @top_srcdir@
ALWAYS_BUILT = src
SUBDIRS = $(ALWAYS_BUILT)
DIST_SUBDIRS = $(ALWAYS_BUILT)
EXTRA_DIST = LICENSE
EXTRA_DIST = LICENSE README AUTHORS
all: config.h
$(MAKE) $(AM_MAKEFLAGS) all-recursive
......
......@@ -68,12 +68,12 @@ if test "$ax_cv_boost_thread" = yes; then
AC_CHECK_LIB($ax_lib, main, [BOOST_THREAD_LIBS=-l$ax_lib; break])
done
# in recent Boost versions, boost::thread depends on boost::system
# in some Boost versions, boost::thread depends on boost::system
AC_CACHE_CHECK(whether Boost::Thread needs Boost::System library,
ax_cv_boost_thread_system,
[LIBS="$LIBS $BOOST_THREAD_LIBS"
AC_LINK_IFELSE([AC_LANG_PROGRAM([[#include <boost/thread/thread.hpp>]],
[[boost::thread_group thrds; return 0;]])],
[[boost::thread_group thrds; return 0;]])],
[ax_cv_boost_thread_system=no],
[LIBS="$LIBS $BOOST_THREAD_LIBS -lboost_system$with_boost_thread"
AC_LINK_IFELSE([
......@@ -82,8 +82,17 @@ if test "$ax_cv_boost_thread" = yes; then
],
[BOOST_THREAD_LIBS="$BOOST_THREAD_LIBS -lboost_system$with_boost_thread"
ax_cv_boost_thread_system=yes],
[AC_ERROR([Cannot use Boost::Thread])]
)])
[LIBS="$LIBS $BOOST_THREAD_LIBS -lboost_system$with_boost_thread -lrt"
AC_LINK_IFELSE([
AC_LANG_PROGRAM([[#include <boost/thread/thread.hpp>]],
[[boost::thread_group thrds; return 0;]])
],
[BOOST_THREAD_LIBS="$BOOST_THREAD_LIBS -lboost_system$with_boost_thread -lrt"
ax_cv_boost_thread_system=yes],
[AC_ERROR([Cannot use Boost::Thread])]
)
])
])
])
CXXFLAGS=$CXXFLAGS_SAVE
......
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for tophat 2.1.0.
# Generated by GNU Autoconf 2.69 for tophat 2.1.1.
#
# Report bugs to <tophat.cufflinks@gmail.com>.
#
......@@ -580,8 +580,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='tophat'
PACKAGE_TARNAME='tophat'
PACKAGE_VERSION='2.1.0'
PACKAGE_STRING='tophat 2.1.0'
PACKAGE_VERSION='2.1.1'
PACKAGE_STRING='tophat 2.1.1'
PACKAGE_BUGREPORT='tophat.cufflinks@gmail.com'
PACKAGE_URL=''
......@@ -1304,7 +1304,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures tophat 2.1.0 to adapt to many kinds of systems.
\`configure' configures tophat 2.1.1 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
......@@ -1374,7 +1374,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of tophat 2.1.0:";;
short | recursive ) echo "Configuration of tophat 2.1.1:";;
esac
cat <<\_ACEOF
......@@ -1484,7 +1484,7 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
tophat configure 2.1.0
tophat configure 2.1.1
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
......@@ -1945,7 +1945,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by tophat $as_me 2.1.0, which was
It was created by tophat $as_me 2.1.1, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
......@@ -2294,7 +2294,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
$as_echo "#define SVN_REVISION \"exported\"" >>confdefs.h
$as_echo "#define SVN_REVISION \"ecf7617\"" >>confdefs.h
......@@ -2777,7 +2777,7 @@ fi
# Define the identity of the package.
PACKAGE='tophat'
VERSION='2.1.0'
VERSION='2.1.1'
cat >>confdefs.h <<_ACEOF
......@@ -5602,7 +5602,7 @@ fi
done
# in recent Boost versions, boost::thread depends on boost::system
# in some Boost versions, boost::thread depends on boost::system
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether Boost::Thread needs Boost::System library" >&5
$as_echo_n "checking whether Boost::Thread needs Boost::System library... " >&6; }
if ${ax_cv_boost_thread_system+:} false; then :
......@@ -5640,12 +5640,35 @@ _ACEOF
if ac_fn_cxx_try_link "$LINENO"; then :
BOOST_THREAD_LIBS="$BOOST_THREAD_LIBS -lboost_system$with_boost_thread"
ax_cv_boost_thread_system=yes
else
LIBS="$LIBS $BOOST_THREAD_LIBS -lboost_system$with_boost_thread -lrt"
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
#include <boost/thread/thread.hpp>
int
main ()
{
boost::thread_group thrds; return 0;
;
return 0;
}
_ACEOF
if ac_fn_cxx_try_link "$LINENO"; then :
BOOST_THREAD_LIBS="$BOOST_THREAD_LIBS -lboost_system$with_boost_thread -lrt"
ax_cv_boost_thread_system=yes
else
as_fn_error $? "Cannot use Boost::Thread" "$LINENO" 5
fi
rm -f core conftest.err conftest.$ac_objext \
conftest$ac_exeext conftest.$ac_ext
fi
rm -f core conftest.err conftest.$ac_objext \
conftest$ac_exeext conftest.$ac_ext
fi
rm -f core conftest.err conftest.$ac_objext \
conftest$ac_exeext conftest.$ac_ext
......@@ -6900,7 +6923,7 @@ fi
CFLAGS="${generic_CFLAGS} ${ext_CFLAGS} ${user_CFLAGS} ${debug_CFLAGS}"
CXXFLAGS="$CFLAGS"
CXXFLAGS="$CXXFLAGS $BAM_CPPFLAGS $BOOST_CPPFLAGS -I./SeqAn-1.3"
CXXFLAGS="$CXXFLAGS $BAM_CPPFLAGS $BOOST_CPPFLAGS -I./SeqAn-1.4.2"
LDFLAGS="$BAM_LDFLAGS $BOOST_LDFLAGS $user_LDFLAGS"
if test "`cd $srcdir && pwd`" != "`pwd`"; then
......@@ -6925,7 +6948,7 @@ fi
# Define the identity of the package.
PACKAGE='tophat'
VERSION='2.1.0'
VERSION='2.1.1'
cat >>confdefs.h <<_ACEOF
......@@ -7868,7 +7891,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by tophat $as_me 2.1.0, which was
This file was extended by tophat $as_me 2.1.1, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
......@@ -7934,7 +7957,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
tophat config.status 2.1.0
tophat config.status 2.1.1
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"
......
define([svnversion], esyscmd([sh -c "svnversion|tr -d '\n'"]))dnl
AC_INIT([tophat],[2.1.0],[tophat.cufflinks@gmail.com])
AC_DEFINE(SVN_REVISION, "svnversion", [SVN Revision])
define([gitversion], esyscmd([sh -c 'git show -s --pretty=format:%h']))dnl
AC_INIT([tophat],[2.1.1],[tophat.cufflinks@gmail.com])
AC_DEFINE(SVN_REVISION, "gitversion", [SVN Revision])
AC_CONFIG_SRCDIR([config.h.in])
AC_CONFIG_HEADERS([config.h])
......@@ -105,7 +105,7 @@ AS_IF([test "x$enable_debug" = xyes],
CFLAGS="${generic_CFLAGS} ${ext_CFLAGS} ${user_CFLAGS} ${debug_CFLAGS}"
CXXFLAGS="$CFLAGS"
CXXFLAGS="$CXXFLAGS $BAM_CPPFLAGS $BOOST_CPPFLAGS -I./SeqAn-1.3"
CXXFLAGS="$CXXFLAGS $BAM_CPPFLAGS $BOOST_CPPFLAGS -I./SeqAn-1.4.2"
LDFLAGS="$BAM_LDFLAGS $BOOST_LDFLAGS $user_LDFLAGS"
AM_INIT_AUTOMAKE([-Wall foreign tar-pax foreign])
......
#include "GBase.h"
#include <stdarg.h>
#include <ctype.h>
#include <errno.h>
#ifndef S_ISDIR
#define S_ISDIR(mode) (((mode) & S_IFMT) == S_IFDIR)
......@@ -188,22 +189,38 @@ int Gstrcmp(const char* a, const char* b, int n) {
}
int G_mkdir(const char* path, int perms=0775) {
int G_mkdir(const char* path, int perms = (S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH) ) {
//int perms=(S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH) ) {
#ifdef __WIN32__
return _mkdir(path);
#else
//#if _POSIX_C_SOURCE
// return ::mkdir(path);
//#else
return mkdir(path, perms); // not sure if this works on mac
//#endif
return mkdir(path, perms);
#endif
}
void Gmktempdir(char* templ) {
#ifdef __WIN32__
int blen=strlen(templ);
if (_mktemp_s(templ, blen)!=0)
GError("Error creating temp dir %s!\n", templ);
#else
char* cdir=mkdtemp(templ);
if (cdir==NULL)
GError("Error creating temp dir %s!(%s)\n", templ, strerror(errno));
#endif
}
int Gmkdir(const char *path, bool recursive, int perms) {
if (path==NULL || path[0]==0) return -1;
if (!recursive) return G_mkdir(path, perms);
mode_t process_mask = umask(0); //is this really needed?
if (!recursive) {
int r=G_mkdir(path, perms);
if (r!=0)
GMessage("Warning: G_mkdir(%s) failed: %s\n", path, strerror(errno));
umask(process_mask);
return r;
}
int plen=strlen(path);
char* gpath=NULL;
//make sure gpath ends with /
......@@ -221,17 +238,35 @@ int Gmkdir(const char *path, bool recursive, int perms) {
while (*ss!=0 && (psep=strchr(ss, '/'))!=NULL) {
*psep=0; //now gpath is the path up to this /
ss=psep; ++ss; //ss repositioned just after the /
// create current level
if (fileExists(gpath)!=1 && G_mkdir(gpath, perms)!=0) {
// create current level if it doesn't exist
if (fileExists(gpath)) { //path exists
*psep='/';
continue; //assume it's a directory or a symlink to one
//if not, it'll fail later
}
int mkdir_err=0;
if ((mkdir_err=G_mkdir(gpath, perms))!=0) {
GMessage("Warning: mkdir(%s) failed: %s\n", gpath, strerror(errno));
GFREE(gpath);
umask(process_mask);
return -1;
}
*psep='/';
}
GFREE(gpath);
umask(process_mask);
return 0;
}
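The recursive branch of Gmkdir above walks the path one '/' at a time and creates each level that does not yet exist. A minimal standalone sketch of that idea follows. The name make_dirs is hypothetical, and it uses std::string rather than GBase's own buffer handling. Unlike the patched code it does not touch the process umask, so the effective permissions are perms & ~umask.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper (not part of GBase): create every missing component
// of `path`. Returns 0 on success, -1 on the first mkdir() failure, in which
// case errno is left as set by mkdir().
int make_dirs(const std::string& path,
              mode_t perms = S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH) {
    if (path.empty()) return -1;
    std::string p = path;
    if (p[p.size() - 1] != '/') p += '/';   // so the last component is handled in the loop
    // Start searching at index 1 so a leading '/' never yields an empty prefix.
    for (std::string::size_type pos = p.find('/', 1);
         pos != std::string::npos;
         pos = p.find('/', pos + 1)) {
        assert(pos > 0);
        const std::string prefix = p.substr(0, pos);
        struct stat st;
        if (stat(prefix.c_str(), &st) == 0)
            continue;   // exists: assume a directory; if not, a later mkdir() fails
        if (mkdir(prefix.c_str(), perms) != 0 && errno != EEXIST)
            return -1;
    }
    return 0;
}
```

Calling make_dirs("out/tmp/logs") would create out, out/tmp and out/tmp/logs as needed, and calling it again on an existing tree is a no-op that returns 0.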
FILE* Gfopen(const char *path, char *mode) {
FILE* f=NULL;
if (mode==NULL) f=fopen(path, "rb");
else f=fopen(path, mode);
if (f==NULL)
GMessage("Error opening file '%s': %s\n", path, strerror(errno));
return f;
}
bool GstrEq(const char* a, const char* b) {
if (a==NULL || b==NULL) return false;
register int i=0;
......@@ -674,12 +709,12 @@ const char* getFileExt(const char* filepath) {
int fileExists(const char* fname) {
struct stat stFileInfo;
int r=0;
// Attempt to get the file attributes
// Attempt to get the path attributes
int fs = stat(fname,&stFileInfo);
if (fs == 0) {
r=3;
// We were able to get the file attributes
// so the file obviously exists.
// so the path exists
if (S_ISREG (stFileInfo.st_mode)) {
r=2;
}
......@@ -690,14 +725,6 @@ int fileExists(const char* fname) {
return r;
}
/*bool fileExists(const char* filepath) {
if (filepath==NULL) return false;
FILE* ft=fopen(filepath, "rb");
if (ft==NULL) return false;
fclose(ft);
return true;
}
*/
int64 fileSize(const char* fpath) {
struct stat results;
if (stat(fpath, &results) == 0)
......
#ifndef G_BASE_DEFINED
#define G_BASE_DEFINED
#ifndef _POSIX_SOURCE
//#ifndef _POSIX_SOURCE
//mostly for MinGW
#define _POSIX_SOURCE
#endif
//#define _POSIX_SOURCE
//#endif
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
......@@ -171,8 +171,8 @@ inline int iround(double x) {
return (int)floor(x + 0.5);
}
int Gmkdir(const char *path, bool recursive=true, int perms=0775);
int Gmkdir(const char *path, bool recursive=true, int perms = (S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH));
void Gmktempdir(char* templ);
/****************************************************************************/
......@@ -240,6 +240,9 @@ char* strupper(char * str);
void* Gmemscan(void *mem, unsigned int len,
void *part, unsigned int partlen);
FILE* Gfopen(const char *path, char *mode=NULL);
// test if a char is in a string:
bool chrInStr(char c, const char* str);
......@@ -441,7 +444,7 @@ const char* getFileExt(const char* filepath);
int fileExists(const char* fname);
//returns 0 if file entry doesn't exist
//returns 0 if path doesn't exist
// 1 if it's a directory
// 2 if it's a regular file
// 3 otherwise (?)
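The return codes documented above can be made explicit with a small enum. This is only an illustrative sketch: PathKind and classify_path are hypothetical names, not part of GBase, and the mapping simply mirrors the 0/1/2/3 convention in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/stat.h>

// Hypothetical names; values follow the fileExists() comment above.
enum PathKind {
    PATH_MISSING = 0,   // stat() failed: path doesn't exist (or isn't accessible)
    PATH_DIR     = 1,   // directory
    PATH_FILE    = 2,   // regular file
    PATH_OTHER   = 3    // anything else (device, FIFO, socket, ...)
};

PathKind classify_path(const char* path) {
    assert(path != NULL);
    struct stat st;
    if (stat(path, &st) != 0) return PATH_MISSING;
    if (S_ISDIR(st.st_mode))  return PATH_DIR;
    if (S_ISREG(st.st_mode))  return PATH_FILE;
    return PATH_OTHER;
}
```

Because stat() follows symbolic links, a symlink to a directory is reported as PATH_DIR, which matches the assumption Gmkdir makes about existing path components.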
......
......@@ -679,17 +679,23 @@ GStr GStr::substr(int idx, int len) const {
// A negative idx specifies an idx from the right of the string.
if (idx < 0)
idx += length();
// A length of -1 specifies the rest of the string.
if (len < 0 || len>length()-idx)
else if (idx>=length()) {
len=0;
idx=length();
}
if (len) {
// A length of -1 specifies the rest of the string.
if (len < 0 || len>length()-idx)
len = length() - idx;
if (idx<0 || idx>=length() || len<0 )
if (idx<0 || idx>=length() || len<0 )
invalid_args_error("substr()");
}
GStr newstring;
newstring.replace_data(len);
::memcpy(newstring.chrs(), &chars()[idx], len);
if (len) {
newstring.replace_data(len);
::memcpy(newstring.chrs(), &chars()[idx], len);
}
return newstring;
}
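The patched substr() returns an empty string when the start index is at or past the end, and only validates the arguments when a non-zero length is requested. A rough equivalent written against std::string is sketched below. substr_clamped is a hypothetical name, and std::out_of_range stands in for GStr's invalid_args_error(), whose behavior is not shown in this diff.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the patched GStr::substr, using std::string.
std::string substr_clamped(const std::string& s, int idx, int len = -1) {
    const int n = static_cast<int>(s.size());
    if (idx < 0) {
        idx += n;               // negative idx counts from the right
    } else if (idx >= n) {
        idx = n;                // start at or past the end: empty result
        len = 0;
    }
    if (len != 0) {
        // a negative or oversized length means "the rest of the string"
        if (len < 0 || len > n - idx) len = n - idx;
        if (idx < 0 || idx >= n || len < 0)
            throw std::out_of_range("substr_clamped()");
    }
    assert(len == 0 || (idx >= 0 && idx + len <= n));
    return len ? s.substr(idx, len) : std::string();
}
```

With this sketch, substr_clamped("tophat", -3) yields "hat" and substr_clamped("tophat", 10) yields an empty string instead of reading past the buffer.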
......
......@@ -566,38 +566,38 @@ template <class OBJ> GPVec<OBJ>::GPVec(GPVec& list) { //copy constructor
fCount=list.fCount;
fCapacity=list.fCapacity;
fList=NULL;
if (fCapacity>0) {
GMALLOC(fList, fCapacity*sizeof(OBJ*));
}
fFreeProc=list.fFreeProc;
fCount=list.fCount;
memcpy(fList, list.fList, fCount*sizeof(OBJ*));
//for (int i=0;i<list.Count();i++) Add(list[i]);
if (fCapacity>0) {
GMALLOC(fList, fCapacity*sizeof(OBJ*));
memcpy(fList, list.fList, fCount*sizeof(OBJ*));
}
}
template <class OBJ> GPVec<OBJ>::GPVec(GPVec* plist) { //another copy constructor
fCount=0;
fCapacity=plist->fCapacity;
fList=NULL;
if (fCapacity>0) {
GMALLOC(fList, fCapacity*sizeof(OBJ*));
}
fFreeProc=plist->fFreeProc;
fCount=plist->fCount;
memcpy(fList, plist->fList, fCount*sizeof(OBJ*));
//for (int i=0;i<list->fCount;i++) Add(plist->Get(i));
fCount=0;
fCapacity=plist->fCapacity;
fList=NULL;
fFreeProc=plist->fFreeProc;
fCount=plist->fCount;
if (fCapacity>0) {
GMALLOC(fList, fCapacity*sizeof(OBJ*));
memcpy(fList, plist->fList, fCount*sizeof(OBJ*));
}
}
template <class OBJ> const GPVec<OBJ>& GPVec<OBJ>::operator=(GPVec& list) {
if (&list!=this) {
Clear();
fFreeProc=list.fFreeProc;
//Attention: the object *POINTERS* are copied,
// but the actual object content is NOT duplicated
//for (int i=0;i<list.Count();i++) Add(list[i]);
//Attention: only the *POINTERS* are copied,
// the actual objects are NOT duplicated
fCount=list.fCount;
GMALLOC(fList, fCapacity*sizeof(OBJ*));
memcpy(fList, list.fList, fCount*sizeof(OBJ*));
fCapacity=list.fCapacity;
if (fCapacity>0) {
GMALLOC(fList, fCapacity*sizeof(OBJ*));
memcpy(fList, list.fList, fCount*sizeof(OBJ*));
}
}
return *this;
}
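The GPVec changes above move the memcpy inside the capacity check, so an empty source vector no longer copies into a NULL buffer, and the assignment operator now takes over the source capacity before allocating. The sketch below shows the same guarded shallow copy on a stripped-down container. PtrVec and its members are hypothetical, and malloc/realloc stand in for the GMALLOC macro, whose definition is not part of this diff.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical minimal pointer vector; not the GPVec class itself.
// Only the POINTERS are copied, the pointed-to objects are shared.
template <class T>
struct PtrVec {
    T** items;
    int count;
    int capacity;

    PtrVec() : items(NULL), count(0), capacity(0) {}
    PtrVec(const PtrVec& o) : items(NULL), count(0), capacity(0) { copyFrom(o); }
    PtrVec& operator=(const PtrVec& o) {
        if (&o != this) {
            std::free(items);
            items = NULL;
            copyFrom(o);
        }
        return *this;
    }
    ~PtrVec() { std::free(items); }

    void add(T* p) {
        if (count == capacity) {
            const int ncap = capacity ? capacity * 2 : 4;
            T** grown = static_cast<T**>(std::realloc(items, ncap * sizeof(T*)));
            if (grown == NULL) std::abort();
            items = grown;
            capacity = ncap;
        }
        items[count++] = p;
    }

private:
    void copyFrom(const PtrVec& o) {
        assert(o.count <= o.capacity);
        count = o.count;
        capacity = o.capacity;          // set before allocating
        if (capacity > 0) {             // allocate and copy only when there is a buffer
            items = static_cast<T**>(std::malloc(capacity * sizeof(T*)));
            if (items == NULL) std::abort();
            std::memcpy(items, o.items, count * sizeof(T*));
        }
    }
};
```

Copying an empty PtrVec leaves items as NULL and never calls memcpy, which is the case the reordered code in the diff protects against.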
......
#ifdef HAVE_CONFIG_H
#include <config.h>
#else
#define PACKAGE_VERSION "local"
#define SVN_REVISION "unknown"
#endif
#include <cstdlib>
#include <cstdio>
#include <cstring>
......@@ -22,18 +30,45 @@ bool ignoreOQ=false; // ignore OQ tag
string outfname;
#define USAGE "Usage: bam2fastx [--fasta|-a|--fastq|-q] [--color] [-Q] [--sam|-s|-t]\n\
[-M|--mapped-only|-A|--all] [-o <outfile>] [-P|--paired] [-N] <in.bam>\n\
\nNote: By default, reads flagged as not passing quality controls are\n\
discarded; the -Q option can be used to ignore the QC flag.\n\
\nUse the -N option if the /1 and /2 suffixes should be appended to\n\
read names according to the SAM flags\n\
\nUse the -O option to ignore the OQ tag, if present, when writing quality values\n"
#define USAGE "bam2fastx v%s (%s) usage:\n\
bam2fastx [--fasta|-a] [-C|--color] [-P|--paired] [-N]\n\
[-A|--all|-M|--mapped-only] [-Q] [--sam|-s|-t] [-o <outfname>] <in.bam>\n\
\nBy default, bam2fastx only converts the unmapped reads from the input file,\n\
discarding those unmapped reads flagged as QC failed.\n\
The input BAM/SAM file MUST be sorted by read name (-n option for samtools\n\
sort). If the input file name is \"-\", stdin will be used instead.\n\
\nOptions:\n\
-A,--all convert all reads (mapped and unmapped)\n\
(but discarding those flagged as QC failed, unless -Q)\n\
-P paired reads are expected and converted into two output\n\
files (see <outfname> comments below)\n\
-Q convert unmapped reads even when flagged as QC failed\n\
-M,--mapped-only convert only mapped reads\n\
-N for -P, append /1 and /2 suffixes to read names\n\
-O ignore the original quality values (OQ tag) and write the\n\
current quality values (default is to use OQ data if found)\n\
-C,--color reads are in ABI SOLiD color format\n\
-s,-t,--sam input is a SAM text file (default: BAM input expected)\n\
-a,--fasta output FASTA records, not FASTQ (discard quality values)\n\
-o <outfname> output file name or template (see below)\n\
\n\
<outfname> serves as a name template when -P option is provided, as suffixes\n\
.1 and .2 will be automatically inserted before the file extension in \n\
<outfname>, such that two file names will be created.\n\
If <outfname> ends in .gz or .bz2 then bam2fastx will write the\n\
output compressed by gzip or bzip2 respectively.\n\n\
Example of converting all paired reads from a BAM file to FASTQ format:\n\
bam2fastx -PANQ -o sample.fq.gz sample.sortedbyname.bam\n\
In this example the output will be written in two files: \n\
sample.1.fq.gz and sample.2.fq.gz\n\
"
const char *short_options = "o:ac:qstOQMAPN";
const char *short_options = "o:ac:qvhstOQCMAPN";
enum {
OPT_FASTA = 127,
OPT_HELP = 127,
OPT_VERSION,
OPT_FASTA,
OPT_FASTQ,
OPT_SAM,
OPT_PAIRED,
......@@ -41,7 +76,7 @@ enum {
OPT_ALL,
OPT_COLOR
};
struct Read {
string name;
int mate;
......@@ -56,6 +91,8 @@ struct Read {
};
struct option long_options[] = {
{"help", no_argument, 0, OPT_HELP},
{"version", no_argument, 0, OPT_VERSION},
{"fasta", no_argument, 0, OPT_FASTA},
{"fastq", no_argument, 0, OPT_FASTQ},
{"sam", no_argument, 0, OPT_SAM},
......@@ -73,54 +110,63 @@ int parse_options(int argc, char** argv)
do {
next_option = getopt_long(argc, argv, short_options, long_options, &option_index);
switch (next_option) {
case -1:
break;
case 'a':
case OPT_FASTA:
is_fastq = false;
break;
case 'q':
case OPT_FASTQ:
is_fastq = true;
break;
case 's':
case 't':
case OPT_SAM: //sam (text) input
sam_input = true;
break;
case 'M':
case OPT_MAPPED_ONLY:
mapped_only = true;
case -1:
break;
case 'A':
case OPT_ALL:
all_reads = true;
break;
case OPT_COLOR:
color = true;
break;
case 'P':
case OPT_PAIRED:
pairs = true;
break;
case 'Q':
ignoreQC = true;
break;
case 'O':
ignoreOQ = true;
break;
case 'o':
outfname=optarg;
break;
case 'N':
add_matenum=true;
break;
default:
return 1;
}
case 'h':
case OPT_HELP:
fprintf(stdout, USAGE, PACKAGE_VERSION, SVN_REVISION);
exit(0);
case 'v':
case OPT_VERSION:
fprintf(stdout, "%s\n", PACKAGE_VERSION);