Commits on Source (7)
segemehl (0.2.0+dfsg-1) UNRELEASED; urgency=medium
segemehl (0.3-1) UNRELEASED; urgency=medium
* Initial release (Closes: #<bug>)
-- Andreas Tille <tille@debian.org> Mon, 20 Jun 2016 12:09:31 +0200
-- Andreas Tille <tille@debian.org> Fri, 05 Oct 2018 09:54:55 +0200
@@ -3,12 +3,14 @@ Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.
Uploaders: Andreas Tille <tille@debian.org>
Section: science
Priority: optional
Build-Depends: debhelper (>= 9),
Build-Depends: debhelper (>= 11),
pkg-config,
libhts-dev,
libncurses-dev,
zlib1g-dev
Standards-Version: 3.9.8
Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/segemehl.git
Vcs-Git: https://anonscm.debian.org/git/debian-med/segemehl.git
Standards-Version: 4.2.1
Vcs-Browser: https://salsa.debian.org/med-team/segemehl
Vcs-Git: https://salsa.debian.org/med-team/segemehl.git
Homepage: http://www.bioinf.uni-leipzig.de/Software/segemehl/
Package: segemehl
@@ -16,12 +18,22 @@ Architecture: any
Depends: ${shlibs:Depends},
${misc:Depends}
Description: short read mapping with gaps
segemehl is a software to map short sequencer reads to reference
genomes. Unlike other methods, segemehl is able to detect not only
mismatches but also insertions and deletions. Furthermore, segemehl
is not limited to a specific read length and is able to mapprimer-
or polyadenylation contaminated reads correctly. segemehl implements
a matching strategy based on enhanced suffix arrays (ESA). Segemehl
now supports the SAM format, reads gziped queries to save both disk
and memory space and allows bisulfite sequencing mapping and split
read mapping.
Segemehl is a software to map short sequencer reads to reference
genomes. Segemehl implements a matching strategy based on enhanced
suffix arrays (ESA). Segemehl accepts fasta and fastq queries (gzip’ed
and bgzip'ed). In addition to the alignment of reads from standard DNA-
and RNA-seq protocols, it also allows the mapping of bisulfite converted
reads (Lister and Cokus) and implements a split read mapping strategy.
The output of segemehl is a SAM or BAM formatted alignment file. In the
case of split-read mapping, additional BED files are written to the
disc. These BED files may be summarized with the postprocessing tool
haarz. In the case of the alignment of bisulfite converted reads, raw
methylation rates may also be called with haarz.
.
In brief, for each suffix of a read, segemehl aims to find the
best-scoring seed. Seeds might contain insertions, deletions, and
mismatches (differences). The number of differences allowed within a
single seed is user-controlled and is crucial for the runtime of the
program. Subsequently, seeds that undercut the user-defined E-value are
passed on to an exact semi-global alignment procedure. Finally, reads
with a minimum accuracy of percent are reported to the user.
#!/bin/sh
MANDIR=debian
mkdir -p $MANDIR
VERSION=`dpkg-parsechangelog | awk '/^Version:/ {print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'`
NAME=`grep "^Description:" debian/control | sed 's/^Description: *//' | head -n1`
PROGNAME=`grep "^Package:" debian/control | sed 's/^Package: *//'`
AUTHOR=".SH AUTHOR\nThis manpage was written by $DEBFULLNAME for the Debian distribution and
can be used for any other usage of the program.
"
# If program name is different from package name or title should be
# different from package short description change this here
progname=${PROGNAME}
help2man --no-info --no-discard-stderr --help-option=" " \
--name="$NAME" \
--version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
echo $AUTHOR >> $MANDIR/${progname}.1
progname=haarz
help2man --no-info --no-discard-stderr --help-option=" " \
--name="Heuristic mapping of short sequences" \
--version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
echo $AUTHOR >> $MANDIR/${progname}.1
echo "$MANDIR/*.1" > debian/manpages
cat <<EOT
Please enhance the help2man output.
The following web page might be helpful in doing so:
http://liw.fi/manpages/
EOT
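Note on the VERSION= line in the script above: its three sed expressions strip, in order, an epoch prefix, everything from the first hyphen onward (the Debian revision), and a trailing +dfsg or ~dfsg marker. The following is a minimal sketch of that behaviour in isolation. The helper name upstream_version is hypothetical and not part of the package; only the sed expressions are taken from the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): applies the same three sed
# expressions as the VERSION= line above to a version string given as $1.
upstream_version() {
    printf '%s\n' "$1" | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'
}

upstream_version '0.2.0+dfsg-1'   # prints 0.2.0
upstream_version '0.3-1'          # prints 0.3
```

Because the second expression cuts at the first hyphen, an upstream version that itself contains a hyphen would be truncated by this pipeline.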
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.7.
.TH HAARZ "1" "October 2018" "haarz 0.3" "User Commands"
.SH NAME
haarz \- Heuristic mapping of short sequences
.SH DESCRIPTION
The program haarz belongs to the segemehl package.
.SH SYNOPSIS
.B haarz
<program>
.SH OPTIONS
.SS available programs
.TP
callmethyl
generate methylation vcf from bam
.TP
methylstring
get SAM file with methylation string annotation
.TP
split
summarize and annotate segemehl split info
.SH BUGS
Please report bugs to steve@bioinf.uni\-leipzig.de
.SH REFERENCES
.IP
2008 Bioinformatik Leipzig
.IP
2018 Leibniz Institute on Aging (FLI)
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
segemehl/*.x usr/bin
segemehl usr/bin
haarz usr/bin
debian/mans/*.1
debian/*.1
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.4.
.TH LACK.X "1" "June 2016" "lack.x 0.2.0" "User Commands"
.SH NAME
lack.x \- Remapping of unmapped reads
.SH SYNOPSIS
.B lack.x
[\-s] \fB\-d\fR <file> [<file> ...] \fB\-q\fR <file> [<file> ...] [\-o <string>] \fB\-r\fR <file> [\-u <file>] [\-t <n>] [\-A <n>] [\-W <n>] [\-U <n>] [\-Z <n>] [\-M <n>]
.SH DESCRIPTION
This program belongs to the segemehl package (see segemehl(1))
.P
Segemehl is a software to map short sequencer reads to reference
genomes. Unlike other methods, segemehl is able to detect not only
mismatches but also insertions and deletions. Furthermore, segemehl
is not limited to a specific read length and is able to mapprimer-
or polyadenylation contaminated reads correctly. segemehl implements
a matching strategy based on enhanced suffix arrays (ESA). Segemehl
now supports the SAM format, reads gziped queries to save both disk
and memory space and allows bisulfite sequencing mapping and split
read mapping.
.SH OPTIONS
.TP
\fB\-d\fR, \fB\-\-database\fR <file> [<file> ...]
list of path/filename(s) of database sequence(s)
.TP
\fB\-q\fR, \fB\-\-query\fR <file> [<file> ...]
path/filename of alignment file
.TP
\fB\-o\fR, \fB\-\-outfile\fR <string>
outputfile (default:none)
.TP
\fB\-r\fR, \fB\-\-remapfilename\fR <file>
filename for reads to be remapped (default:none)
.TP
\fB\-u\fR, \fB\-\-nomatchfilename\fR <file>
filename for unmatched reads (default:none)
.TP
\fB\-t\fR, \fB\-\-threads\fR <n>
start <n> threads for remapping (default:1)
.TP
\fB\-s\fR, \fB\-\-silent\fR
shut up!
.TP
\fB\-A\fR, \fB\-\-accuracy\fR <n>
min percentage of matches per read in semi\-global alignment (default:90)
.TP
\fB\-W\fR, \fB\-\-minsplicecover\fR <n>
min coverage for spliced transcripts (default:80)
.TP
\fB\-U\fR, \fB\-\-minfragscore\fR <n>
min score of a spliced fragment (default:5)
.TP
\fB\-Z\fR, \fB\-\-minfraglen\fR <n>
min length of a spliced fragment (default:5)
.TP
\fB\-M\fR, \fB\-\-maxdist\fR <n>
max number of distant sites to consider, 0 to disable (default:100)
.SH BUGS
Please report bugs to christian@bioinf.uni\-leipzig.de
.SH AUTHOR
This software was written by Christian Otto and others at Bioinformatik Leipzig
.P
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.4.
.TH TESTREALIGN.X "1" "June 2016" "testrealign.x 0.2.0" "User Commands"
.SH NAME
testrealign.x \- Heuristic mapping of short sequences
.SH SYNOPSIS
.B testrealign.x
[\-Evn] \fB\-d\fR <file> [<file> ...] \fB\-q\fR <file> [<file> ...] [\-t <n>] [\-U <file>] [\-T <file>] [\-o <file>] [\-M <n>]
.SH DESCRIPTION
This program belongs to the segemehl package (see segemehl(1))
.P
Segemehl is a software to map short sequencer reads to reference
genomes. Unlike other methods, segemehl is able to detect not only
mismatches but also insertions and deletions. Furthermore, segemehl
is not limited to a specific read length and is able to mapprimer-
or polyadenylation contaminated reads correctly. segemehl implements
a matching strategy based on enhanced suffix arrays (ESA). Segemehl
now supports the SAM format, reads gziped queries to save both disk
and memory space and allows bisulfite sequencing mapping and split
read mapping.
.SH OPTIONS
.TP
\fB\-d\fR, \fB\-\-database\fR <file> [<file> ...]
list of path/filename(s) of database sequence(s)
.TP
\fB\-q\fR, \fB\-\-query\fR <file> [<file> ...]
path/filename of alignment file
.TP
\fB\-E\fR, \fB\-\-expand\fR
expand
.TP
\fB\-v\fR, \fB\-\-verbose\fR
verbose
.TP
\fB\-n\fR, \fB\-\-norealign\fR
do not realign
.TP
\fB\-t\fR, \fB\-\-threads\fR <n>
start <n> threads for realigning (default:1)
.TP
\fB\-U\fR, \fB\-\-splitfile\fR <file>
path/filename of the split bedfile (default:"splicesites.bed")
.TP
\fB\-T\fR, \fB\-\-transfile\fR <file>
path/filename of bed files containing trans\-split (default:"transrealigned.bed")
.TP
\fB\-o\fR, \fB\-\-outfile\fR <file>
path/filename of output sam file (default:none)
.TP
\fB\-M\fR, \fB\-\-maxdist\fR <n>
max number of distant sites to consider, 0 to disable (default:100)
.SH BUGS
Please report bugs to steve@bioinf.uni\-leipzig.de
.SH AUTHOR
This software was written by Christian Otto and others at Bioinformatik Leipzig
.P
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
Author: Andreas Tille <tille@debian.org>
Last-Update: Mon, 20 Jun 2016 12:09:31 +0200
Last-Update: Fri, 05 Oct 2018 09:54:55 +0200
Description: Propagate hardening options
--- a/segemehl/Makefile
+++ b/segemehl/Makefile
@@ -1,7 +1,7 @@
CC=gcc
--- a/Makefile
+++ b/Makefile
@@ -1,10 +1,10 @@
CC?=gcc
LD=${CC}
- CFLAGS= -Wall -pedantic -std=c99 -g -O3 -DFIXINSMALL -DFIXINBACKSPLICE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DDBGNFO -DSHOWALIGN -DDBGLEVEL=0 -DPROGNFO -Isrc -Ilibs -Ilibs/sufarray -Lsrc
- LDFLAGS= -lm -lpthread -lz -lncurses
+ CFLAGS+= -Wall -pedantic -std=c99 -g -O3 -DFIXINSMALL -DFIXINBACKSPLICE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DDBGNFO -DSHOWALIGN -DDBGLEVEL=0 -DPROGNFO -Isrc -Ilibs -Ilibs/sufarray -Lsrc
+ LDFLAGS+= -lm -lpthread -lz -lncurses
-CFLAGS= -Wall -pedantic -std=c99 -g -O3 -DSORTEDUNMAPPED -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DDBGNFO -DSHOWALIGN -DDBGLEVEL=0 -DPROGNFO -Ilibs -Ilibs/sufarray -Isamtools
+CFLAGS += -Wall -pedantic -std=c99 -g -O3 -DSORTEDUNMAPPED -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DDBGNFO -DSHOWALIGN -DDBGLEVEL=0 -DPROGNFO -Ilibs -Ilibs/sufarray -Isamtools
CFLAGS += `pkg-config --cflags htslib`
INC := -I include
CTAGS = ctags > tags
LIBS=-lob -lm -lpthread
-LIB = -lm -lpthread -lz -lncurses -L libs -lform -lmenu -L/usr/local/lib/
+LIB += -lm -lpthread -lz -lncurses -L libs -lform -lmenu -L/usr/local/lib/
LIB += `pkg-config --libs htslib`
LIB += "-Wl,-rpath,`pkg-config --variable=libdir htslib`"
@@ -30,7 +30,7 @@ LIBOBJECTS := $(patsubst $(LIBDIR)/%,$(
$(PRGTARGETS): $(OBJECTS)
@echo "Linking $@";
- $(LD) $(LIBOBJECTS) $(BUILDDIR)/$@.o -o $(TARGETDIR)/$@$(TARGETEXT) $(LIB)
+ $(LD) $(LIBOBJECTS) $(BUILDDIR)/$@.o -o $(TARGETDIR)/$@$(TARGETEXT) $(LIB) $(LDFLAGS)
$(BUILDDIR)/%.o: $(LIBDIR)/%.c
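Note on the CFLAGS= to CFLAGS += change in the patch above: a plain assignment inside a makefile overrides a value inherited from the environment, whereas += appends to it. The following is a minimal sketch with GNU make, assuming the hardening flags reach make through the environment. The helper show_cflags, the throwaway Makefile and the flag values are illustrative assumptions, not taken from the package or from dpkg-buildflags.

```shell
#!/bin/sh
# Hypothetical demo: write a one-rule Makefile that sets CFLAGS with the
# given assignment operator, then print the CFLAGS value make ends up with.
show_cflags() {
    op=$1
    dir=$(mktemp -d)
    # \t produces the tab that make requires in front of a recipe line.
    printf 'CFLAGS %s -Wall\nshow:\n\t@echo "$(CFLAGS)"\n' "$op" > "$dir/Makefile"
    # Illustrative flags supplied via the environment, as a build tool might do.
    CFLAGS='-O2 -fstack-protector-strong' make -s -f "$dir/Makefile" show
    rm -rf "$dir"
}

show_cflags '='    # environment flags are discarded by the makefile assignment
show_cflags '+='   # environment flags are kept and -Wall is appended
```

The same reasoning applies to CC?=gcc, which only sets a default when CC is not already defined, and to appending $(LDFLAGS) to the link command so that linker flags from the environment are actually used.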
Author: Andreas Tille <tille@debian.org>
Last-Update: Fri, 05 Oct 2018 09:54:55 +0200
Description: Do not specify rpath
--- a/Makefile
+++ b/Makefile
@@ -6,7 +6,6 @@ INC := -I include
CTAGS = ctags > tags
LIB += -lm -lpthread -lz -lncurses -L libs -lform -lmenu -L/usr/local/lib/
LIB += `pkg-config --libs htslib`
-LIB += "-Wl,-rpath,`pkg-config --variable=libdir htslib`"
PRGTARGETS := segemehl haarz
hardening.patch
rpath.patch
spelling.patch
Author: Andreas Tille <tille@debian.org>
Last-Update: Fri, 05 Oct 2018 09:54:55 +0200
Description: Fix spelling
--- a/libs/biofiles.c
+++ b/libs/biofiles.c
@@ -1586,7 +1586,7 @@ bl_fastxAddMate(void *space,
if(bl_fastaCheckMateID(f, n, descr, descrlen) == 0) {
NFO("The fasta/fastq IDs in both mate files do not match.\n", NULL);
- NFO("The first mismatch occured at fastq number %u\n", n);
+ NFO("The first mismatch occurred at fastq number %u\n", n);
NFO("Exiting.\n", NULL);
exit(EXIT_FAILURE);
}
@@ -3448,7 +3448,7 @@ bl_annotationRead (void *space, char *fn
} else if(!strcmp(suf, ".gff") || !strcmp(suf, ".gff3")) {
annot = bl_GFFread(NULL, fn);
} else {
- NFO("please provide a bed or gff file with the approriate extension.\n", NULL);
+ NFO("please provide a bed or gff file with the appropriate extension.\n", NULL);
exit(EXIT_FAILURE);
}
return annot;
--- a/libs/gzidx.c
+++ b/libs/gzidx.c
@@ -554,7 +554,7 @@ int bl_bgzFillStream(FILE *fp, unsigned
n = fread(&input[strm->avail_in], 1, CHUNK-strm->avail_in, fp);
if (ferror(fp)) {
fprintf(stderr, "error reading bgz file.\n");
- perror("The following error occured:");
+ perror("The following error occurred:");
exit(EXIT_FAILURE);
}
--- a/libs/haarz.c
+++ b/libs/haarz.c
@@ -413,7 +413,7 @@ int main(int argc,char** argv) {
prg = manopt_getopts(&prgset, MIN(argc,2), argv);
if(prg->noofvalues == 1) {
- manopt_help(&prgset, "programm needs to be selected\n");
+ manopt_help(&prgset, "program needs to be selected\n");
}
manopt_initoptionset(&optset, argv[0], NULL,
@@ -848,7 +848,7 @@ int main(int argc,char** argv) {
FREEMEMORY(NULL, unflagged);
} else {
- manopt_help(&prgset, "unkown program selected\n");
+ manopt_help(&prgset, "unknown program selected\n");
}
manopt_destructoptionset(&optset);
--- a/libs/manopt.c
+++ b/libs/manopt.c
@@ -1039,7 +1039,7 @@ manopt_checkconstraint(manopt_optionset*
case MANOPT_BLOCKSEPARATOR:
break;
default:
- manopt_help(optset, "unkown option %s type\n", argset->args[arg].flagname);
+ manopt_help(optset, "unknown option %s type\n", argset->args[arg].flagname);
break;
}
--- a/libs/multicharseq.c
+++ b/libs/multicharseq.c
@@ -423,7 +423,7 @@ initMultiCharSeqAlignment(
a->refstart = MAX(sub_start, (Lint)pos-loff);
if(a->refstart > sub_end) {
- fprintf(stderr, "refstart > substart: skiping MultiCharSeqAlignment\n");
+ fprintf(stderr, "refstart > substart: skipping MultiCharSeqAlignment\n");
return 0;
}
@@ -480,7 +480,7 @@ initMultiCharSeqAlignmentOpt(
//this should not happen
if(a->refstart > sub_end) {
- fprintf(stderr, "refstart > substart: skiping MultiCharSeqAlignment\n");
+ fprintf(stderr, "refstart > substart: skipping MultiCharSeqAlignment\n");
return 0;
}
@@ -3,4 +3,13 @@
export DEB_BUILD_MAINT_OPTIONS = hardening=+all
%:
dh $@ --sourcedirectory=segemehl
dh $@
override_dh_auto_build:
dh_auto_build -- all
override_dh_install:
mv segemehl.x segemehl
mv haarz.x haarz
find . -name "*.x"
dh_install
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.4.
.TH SEGEMEHL.X "1" "June 2016" "segemehl.x 0.2.0" "User Commands"
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.7.
.TH SEGEMEHL "1" "October 2018" "segemehl 0.3" "User Commands"
.SH NAME
segemehl.x \- Heuristic mapping of short sequences
segemehl \- Heuristic mapping of short sequences
.SH SYNOPSIS
.B segemehl.x
[\-sbcKVTYCO] \fB\-d\fR <file> [<file>] [\-q <file>] [\-p <file>] [\-i <file>] [\-j <file>] [\-x <file>] [\-y <file>] [\-B <string>] [\-F <n>] [\-m <n>] [\-t <n>] [\-o <string>] [\-u <file>] [\-D <n>] [\-J <n>]
[\-E <double>] [\-w <double>] [\-M <n>] [\-r <n>] [\-S] [\-\-nohead] [\-e <n>] [\-n <n>] [\-X <n>] [\-A <n>] [\-W <n>] [\-U <n>] [\-Z <n>] [\-l <f>] [\-H] [\-\-showalign] [\-P <string>] [\-Q <string>]
[\-R <n>] [\-I <n>]
.B segemehl
[\-besVOc] \fB\-d\fR <file> [<file>] [\-q <file>] [\-p <file>] [\-i <file>] [\-j <file>] [\-x <file>] [\-y <file>] [\-G <file>] [\-g <string>] [\-t <n>] [\-o <string>] [\-u <file>] [\-B <string>] [\-F <n>]
[\-S [<basename>]] [\-A <n>] [\-D <n>] [\-E <double>] [\-H] [\-m <n>] [\-Z <n>] [\-W <n>] [\-U <n>] [\-l <f>] [\-w <double>] [\-X <n>] [\-J <n>] [\-I <n>] [\-M <n>] [\-n <n>] [\-r <n>] [\-\-skipidcheck]
[\-\-showalign] [\-\-nohead]
.SH DESCRIPTION
Segemehl is a software to map short sequencer reads to reference
genomes. Unlike other methods, segemehl is able to detect not only
mismatches but also insertions and deletions. Furthermore, segemehl
is not limited to a specific read length and is able to mapprimer-
or polyadenylation contaminated reads correctly. segemehl implements
a matching strategy based on enhanced suffix arrays (ESA). Segemehl
now supports the SAM format, reads gziped queries to save both disk
and memory space and allows bisulfite sequencing mapping and split
read mapping.
genomes. Segemehl implements a matching strategy based on enhanced
suffix arrays (ESA). Segemehl accepts fasta and fastq queries (gzip’ed
and bgzip'ed). In addition to the alignment of reads from standard DNA-
and RNA-seq protocols, it also allows the mapping of bisulfite converted
reads (Lister and Cokus) and implements a split read mapping strategy.
The output of segemehl is a SAM or BAM formatted alignment file. In the
case of split-read mapping, additional BED files are written to the
disc. These BED files may be summarized with the postprocessing tool
haarz. In the case of the alignment of bisulfite converted reads, raw
methylation rates may also be called with haarz.
.P
In brief, for each suffix of a read, segemehl aims to find the
best-scoring seed. Seeds might contain insertions, deletions, and
mismatches (differences). The number of differences allowed within a
single seed is user-controlled and is crucial for the runtime of the
program. Subsequently, seeds that undercut the user-defined E-value are
passed on to an exact semi-global alignment procedure. Finally, reads
with a minimum accuracy of percent are reported to the user.
.SH OPTIONS
.SS Input options
.SS INPUT
.TP
\fB\-d\fR, \fB\-\-database\fR <file> [<file>]
list of path/filename(s) of database sequence(s)
list of path/filename(s) of fasta database sequence(s)
.TP
\fB\-q\fR, \fB\-\-query\fR <file>
path/filename of query sequences (default:none)
@@ -41,78 +51,61 @@ generate db index and store to disk (default:none)
\fB\-y\fR, \fB\-\-generate2\fR <file>
generate second db index and store to disk (default:none)
.TP
\fB\-B\fR, \fB\-\-filebins\fR <string>
file bins with basename <string> for easier data handling (default:none)
.TP
\fB\-F\fR, \fB\-\-bisulfite\fR <n>
bisulfite mapping with methylC\-seq/Lister et al. (=1) or bs\-seq/Cokus et al. protocol (=2) (default:0)
.SS General options
\fB\-G\fR, \fB\-\-readgroupfile\fR <file>
filename to read @RG header (default:none)
.TP
\fB\-m\fR, \fB\-\-minsize\fR <n>
minimum size of queries (default:12)
.TP
\fB\-s\fR, \fB\-\-silent\fR
shut up!
.TP
\fB\-b\fR, \fB\-\-brief\fR
brief output
.TP
\fB\-c\fR, \fB\-\-checkidx\fR
check index
\fB\-g\fR, \fB\-\-readgroupid\fR <string>
read group id (default:none)
.TP
\fB\-t\fR, \fB\-\-threads\fR <n>
start <n> threads (default:1)
.SS OUTPUT
.TP
\fB\-o\fR, \fB\-\-outfile\fR <string>
outputfile (default:none)
.TP
\fB\-b\fR, \fB\-\-bamabafixoida\fR
generate a bam output (\fB\-o\fR <filename> required)
.TP
\fB\-u\fR, \fB\-\-nomatchfilename\fR <file>
filename for unmatched reads (default:none)
.SS Options for SEEDPARAMS
.TP
\fB\-D\fR, \fB\-\-differences\fR <n>
search seeds initially with <n> differences (default:1)
.TP
\fB\-J\fR, \fB\-\-jump\fR <n>
search seeds with jump size <n> (0=automatic) (default:0)
.TP
\fB\-E\fR, \fB\-\-evalue\fR <double>
max evalue (default:5.000000)
\fB\-e\fR, \fB\-\-briefcigar\fR
brief cigar string (M vs X and =)
.TP
\fB\-w\fR, \fB\-\-maxsplitevalue\fR <double>
max evalue for splits (default:50.000000)
\fB\-s\fR, \fB\-\-progressbar\fR
show a progress bar
.TP
\fB\-M\fR, \fB\-\-maxinterval\fR <n>
maximum width of a suffix array interval, i.e. a query seed will be omitted if it matches more than <n> times (default:100)
\fB\-B\fR, \fB\-\-filebins\fR <string>
file bins with basename <string> for easier data handling (default:none)
.TP
\fB\-r\fR, \fB\-\-maxout\fR <n>
maximum number of alignments that will be reported. If set to zero, all alignments will be reported (default:0)
\fB\-V\fR, \fB\-\-MEOP\fR
output MEOP field for easier variance calling in SAM (XE:Z:)
.SS ALIGNMENT
.TP
\fB\-S\fR, \fB\-\-splits\fR
detect split/spliced reads (default:none)
\fB\-F\fR, \fB\-\-bisulfite\fR <n>
bisulfite aln with methylC\-seq/Lister et al. (=1) or bs\-seq/Cokus et al. protocol (=2) (default:0)
.TP
\fB\-K\fR, \fB\-\-SEGEMEHL\fR
output SEGEMEHL format (needs to be selected for brief)
\fB\-S\fR, \fB\-\-splits\fR [<basename>]
detect split/spliced reads. (default:none)
.TP
\fB\-V\fR, \fB\-\-MEOP\fR
output MEOP field for easier variance calling in SAM (XE:Z:)
\fB\-A\fR, \fB\-\-accuracy\fR <n>
min percentage of matches per read in semi\-global alignment (default:90)
.TP
\fB\-\-nohead\fR
do not output header
.SS Options for SEEDEXTENSIONPARAMS
\fB\-D\fR, \fB\-\-differences\fR <n>
search seeds initially with <n> differences (default:1)
.TP
\fB\-e\fR, \fB\-\-extensionscore\fR <n>
score of a match during extension (default:2)
\fB\-E\fR, \fB\-\-evalue\fR <double>
max evalue (default:5.000000)
.TP
\fB\-n\fR, \fB\-\-extensionpenalty\fR <n>
penalty for a mismatch during extension (default:4)
\fB\-H\fR, \fB\-\-hitstrategy\fR
report only best scoring hits (=1) or all (=0) (default:1)
.TP
\fB\-X\fR, \fB\-\-dropoff\fR <n>
dropoff parameter for extension (default:8)
.SS Options for ALIGNPARAMS
\fB\-m\fR, \fB\-\-minsize\fR <n>
minimum length of queries (default:12)
.TP
\fB\-A\fR, \fB\-\-accuracy\fR <n>
min percentage of matches per read in semi\-global alignment (default:90)
\fB\-Z\fR, \fB\-\-minfraglen\fR <n>
min length of a spliced fragment (default:20)
.TP
\fB\-W\fR, \fB\-\-minsplicecover\fR <n>
min coverage for spliced transcripts (default:80)
@@ -120,44 +113,53 @@ min coverage for spliced transcripts (default:80)
\fB\-U\fR, \fB\-\-minfragscore\fR <n>
min score of a spliced fragment (default:18)
.TP
\fB\-Z\fR, \fB\-\-minfraglen\fR <n>
min length of a spliced fragment (default:20)
.TP
\fB\-l\fR, \fB\-\-splicescorescale\fR <f>
report spliced alignment with score s only if <f>*s is larger than next best spliced alignment (default:1.000000)
report spliced alignment with score s only if <f>*s is larger than next best spliced alignment (default:0.900000)
.TP
\fB\-H\fR, \fB\-\-hitstrategy\fR
report only best scoring hits (=1) or all (=0) (default:1)
\fB\-w\fR, \fB\-\-maxsplitevalue\fR <double>
max evalue for splits (default:50.000000)
.SS SPECIAL
.TP
\fB\-\-showalign\fR
show alignments
\fB\-X\fR, \fB\-\-dropoff\fR <n>
dropoff parameter for extension (default:8)
.TP
\fB\-P\fR, \fB\-\-prime5\fR <string>
add 5' adapter (default:none)
\fB\-J\fR, \fB\-\-jump\fR <n>
search seeds with jump size <n> (0=automatic) (default:0)
.TP
\fB\-Q\fR, \fB\-\-prime3\fR <string>
add 3' adapter (default:none)
\fB\-O\fR, \fB\-\-order\fR
sorts the output by chromsome and position (might take a while!)
.TP
\fB\-R\fR, \fB\-\-clipacc\fR <n>
clipping accuracy (default:70)
\fB\-I\fR, \fB\-\-maxpairinsertsize\fR <n>
maximum size of the inserts (paired end) in case of multiple hits (default:200000)
.TP
\fB\-T\fR, \fB\-\-polyA\fR
clip polyA tail
\fB\-M\fR, \fB\-\-maxinterval\fR <n>
maximum width of a suffix array interval, i.e. a query seed will be omitted if it matches more than <n> times (default:100)
.TP
\fB\-Y\fR, \fB\-\-autoclip\fR
autoclip unknown 3prime adapter
\fB\-c\fR, \fB\-\-checkidx\fR
check index
.TP
\fB\-C\fR, \fB\-\-hardclip\fR
enable hard clipping
\fB\-n\fR, \fB\-\-extensionpenalty\fR <n>
penalty for a mismatch during extension (default:4)
.TP
\fB\-O\fR, \fB\-\-order\fR
sorts the output by chromsome and position (might take a while!)
\fB\-r\fR, \fB\-\-maxout\fR <n>
maximum number of alignments that will be reported. If set to zero, all alignments will be reported (default:0)
.TP
\fB\-\-skipidcheck\fR
do not check whether the fastq ids of mates / paired ends match. Instead, the first mate (\fB\-q\fR) will be used for output only.
.TP
\fB\-\-showalign\fR
show alignments
.TP
\fB\-I\fR, \fB\-\-maxinsertsize\fR <n>
maximum size of the inserts (paired end) (default:5000)
\fB\-\-nohead\fR
do not output header
.SH BUGS
Please report bugs to steve@bioinf.uni\-leipzig.de
.SH SEE ALSO
http://www.bioinf.uni-leipzig.de/Software/segemehl/
.SH REFERENCES
.IP
2008 Bioinformatik Leipzig
.IP
2018 Leibniz Institute on Aging (FLI)
.SH AUTHOR
This software was written by Christian Otto and others at Bioinformatik Leipzig
.P
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.