Commits on Source (6)
language:
- cpp
os:
- linux
- osx
compiler:
- g++
- clang
install:
- if [ $TRAVIS_OS_NAME = linux ]; then sudo apt-get install ghostscript; else brew install ghostscript; fi
script:
- ./autogen.sh
- ./configure
- make
[![Build Status](https://travis-ci.org/torognes/vsearch.svg?branch=master)](https://travis-ci.org/torognes/vsearch)
# VSEARCH
## Introduction
......@@ -22,7 +24,7 @@ Most of the nucleotide based commands and options in USEARCH version 7 are suppo
## Getting Help
If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.7.1/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.8.0/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
## Example
......@@ -35,9 +37,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
**Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:
```
wget https://github.com/torognes/vsearch/archive/v2.7.1.tar.gz
tar xzf v2.7.1.tar.gz
cd vsearch-2.7.1
wget https://github.com/torognes/vsearch/archive/v2.8.0.tar.gz
tar xzf v2.8.0.tar.gz
cd vsearch-2.8.0
./autogen.sh
./configure
make
......@@ -68,36 +70,36 @@ Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (v
Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:
```sh
wget https://github.com/torognes/vsearch/releases/download/v2.7.1/vsearch-2.7.1-linux-x86_64.tar.gz
tar xzf vsearch-2.7.1-linux-x86_64.tar.gz
wget https://github.com/torognes/vsearch/releases/download/v2.8.0/vsearch-2.8.0-linux-x86_64.tar.gz
tar xzf vsearch-2.8.0-linux-x86_64.tar.gz
```
Or these commands if you are using a Linux ppc64le system:
```sh
wget https://github.com/torognes/vsearch/releases/download/v2.7.1/vsearch-2.7.1-linux-ppc64le.tar.gz
tar xzf vsearch-2.7.1-linux-ppc64le.tar.gz
wget https://github.com/torognes/vsearch/releases/download/v2.8.0/vsearch-2.8.0-linux-ppc64le.tar.gz
tar xzf vsearch-2.8.0-linux-ppc64le.tar.gz
```
Or these commands if you are using a Mac:
```sh
wget https://github.com/torognes/vsearch/releases/download/v2.7.1/vsearch-2.7.1-macos-x86_64.tar.gz
tar xzf vsearch-2.7.1-macos-x86_64.tar.gz
wget https://github.com/torognes/vsearch/releases/download/v2.8.0/vsearch-2.8.0-macos-x86_64.tar.gz
tar xzf vsearch-2.8.0-macos-x86_64.tar.gz
```
Or if you are using Windows, download and extract (unzip) the contents of this file:
```
https://github.com/torognes/vsearch/releases/download/v2.7.1/vsearch-2.7.1-win-x86_64.zip
https://github.com/torognes/vsearch/releases/download/v2.8.0/vsearch-2.8.0-win-x86_64.zip
```
Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.7.1-linux-x86_64` or `vsearch-2.7.1-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.8.0-linux-x86_64` or `vsearch-2.8.0-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
Windows: You will now have the binary distribution in a folder called `vsearch-2.7.1-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
Windows: You will now have the binary distribution in a folder called `vsearch-2.8.0-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.7.1/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.7.1/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.8.0/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.8.0/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
## Plugins, packages, and wrappers
......@@ -137,6 +139,8 @@ On Windows these libraries are called zlib1.dll and bz2.dll.
VSEARCH will automatically check whether these libraries are available and load them dynamically.
To create the PDF file with the manual the ps2pdf tool is required. It is part of the ghostscript package.
## VSEARCH license and third party licenses
......
......@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.63])
AC_INIT([vsearch], [2.7.1], [torognes@ifi.uio.no])
AC_INIT([vsearch], [2.8.0], [torognes@ifi.uio.no])
AC_CANONICAL_TARGET
AM_INIT_AUTOMAKE([subdir-objects])
AC_LANG([C++])
......
vsearch (2.8.0-1) unstable; urgency=medium
* New upstream version
* Point Vcs fields to salsa.debian.org
* Standards-Version: 4.1.4
-- Andreas Tille <tille@debian.org> Sat, 28 Apr 2018 22:27:42 +0200
vsearch (2.7.1-1) unstable; urgency=medium
* New upstream version
......
......@@ -10,9 +10,9 @@ Build-Depends: debhelper (>= 11~),
python-markdown,
ghostscript,
time
Standards-Version: 4.1.3
Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/vsearch.git
Vcs-Git: https://anonscm.debian.org/git/debian-med/vsearch.git
Standards-Version: 4.1.4
Vcs-Browser: https://salsa.debian.org/med-team/vsearch
Vcs-Git: https://salsa.debian.org/med-team/vsearch.git
Homepage: https://github.com/torognes/vsearch/
Package: vsearch
......
.\" ============================================================================
.TH vsearch 1 "February 16, 2018" "version 2.7.1" "USER COMMANDS"
.TH vsearch 1 "April 24, 2018" "version 2.8.0" "USER COMMANDS"
.\" ============================================================================
.SH NAME
vsearch \(em chimera detection, clustering, dereplication and
rereplication, FASTA/FASTQ file processing, masking, pairwise
alignment, searching, shuffling, sorting and subsampling of amplicons
for metagenomics, genomics, and population genetics.
alignment, searching, shuffling, sorting, subsampling, and taxonomic
classification of amplicons for metagenomics, genomics, and population
genetics.
.\" ============================================================================
.SH SYNOPSIS
.\" left justified, ragged right
......@@ -110,6 +111,13 @@ Subsampling:
\-\-sample_size \fIpositive integer\fR) [\fIoptions\fR]
.PP
.RE
Taxonomic classification:
.RS
\fBvsearch\fR \-\-sintax \fIfastafile\fR \-\-db \fIfastafile\fR
\-\-tabbedout \fIoutputfile\fR [\-\-sintax_cutoff \fIreal\fR]
[\fIoptions\fR]
.PP
.RE
UDB database handling:
.RS
\fBvsearch\fR \-\-makeudb_usearch \fIfastafile\fR \-\-output \fIoutputfile\fR [\fIoptions\fR]
......@@ -1088,6 +1096,13 @@ non-matching nucleotides allowed in the overlap region. That option
has a strong influence on the merging success rate. The default
value is 10.
.TP
.BI \-\-fastq_maxdiffpct\~ real
When using \-\-fastq_mergepairs, specify the maximum percentage of
non-matching nucleotides allowed in the overlap region. The default
value is 100.0%. There are other more sophisticated rules in the
merging algorithm that will discard read pairs with a high fraction of
mismatches.
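The interaction between the absolute limit (\-\-fastq_maxdiffs) and the percentage limit (\-\-fastq_maxdiffpct) can be illustrated with a small Python sketch. This is not the vsearch implementation: the function name `overlap_passes` is hypothetical, and it assumes the percentage is computed as mismatches relative to the overlap length, which the text above does not state explicitly. The additional, more sophisticated merging rules are not modelled.

```python
def overlap_passes(mismatches, overlap_length, maxdiffs=10, maxdiffpct=100.0):
    """Return True if an overlap satisfies both mismatch limits.

    Defaults mirror the documented defaults (10 mismatches, 100.0%).
    Assumption: percentage = 100 * mismatches / overlap_length.
    """
    if overlap_length <= 0:
        raise ValueError("overlap_length must be positive")
    mismatch_pct = 100.0 * mismatches / overlap_length
    # A read pair is discarded if either limit is exceeded.
    return mismatches <= maxdiffs and mismatch_pct <= maxdiffpct


# 5 mismatches in a 50-base overlap is 10%: accepted with the defaults.
print(overlap_passes(5, 50))
# The same 5 mismatches in a 20-base overlap is 25%: rejected at a 20% limit.
print(overlap_passes(5, 20, maxdiffpct=20.0))
```

With the default of 100.0% the percentage limit never rejects a pair on its own, so only the absolute limit applies unless the option is lowered.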
.TP
.BI \-\-fastq_maxee\~ real
When using \-\-fastq_filter, \-\-fastq_mergepairs or \-\-fastx_filter,
discard sequences with more than the specified number of expected
......@@ -1126,12 +1141,14 @@ ambiguous bases (N's), as specified with the \-\-fastq_maxns are also
discarded (no limit by default). Staggered reads are not merged unless
the \-\-fastq_allowmergestagger option is specified. The minimum
length of the overlap region between the reads may be specified with
the \-\-fastq_minovlen option (default 10), and the overlap region may
the \-\-fastq_minovlen option (default 10). The overlap region may
not include more mismatches than specified with the \-\-fastq_maxdiffs
option (10 by default), otherwise the read pair is discarded.
Additional rules will avoid merging of reads that cannot be aligned
reliably and unambiguously. The minimum and maximum length of the
merged sequence may be specified with the \-\-fastq_minmergelen and
option (10 by default) or a higher percentage of mismatches than
specified with the \-\-fastq_maxdiffpct option (100.0% by default),
otherwise the read pair is discarded. Additional rules will avoid
merging of reads that cannot be aligned reliably and
unambiguously. The minimum and maximum length of the merged sequence
may be specified with the \-\-fastq_minmergelen and
\-\-fastq_maxmergelen options, respectively. Other relevant options
are: \-\-fastq_ascii, \-\-fastq_maxee, \-\-fastq_nostagger,
\-\-fastq_qmax, \-\-fastq_qmaxout, \-\-fastq_qmin, \-\-fastq_qminout,
......@@ -2355,6 +2372,56 @@ file.
.RE
.PP
.\" ----------------------------------------------------------------------------
Taxonomic classification options:
.RS
The vsearch command \-\-sintax will classify the input sequences
according to the Sintax algorithm as described by Robert Edgar (2016)
in SINTAX: a simple non-Bayesian taxonomy classifier for 16S and ITS
sequences, BioRxiv, 074161. Preprint. doi: https://doi.org/10.1101/074161.
The name of the fasta file containing the input sequences to be
classified is given as an argument to the \-\-sintax command. The reference
sequence database is specified with the \-\-db option. The results are
written in a tab delimited text file whose name is specified with the
\-\-tabbedout option. The \-\-sintax_cutoff option may be used to set a
minimum level of bootstrap support for the taxonomic ranks to be reported.
Multithreading is supported. Databases in UDB files are supported.
The strand option may be specified.
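As a rough illustration of how the options described above fit together, the following Python sketch assembles a command line for the \-\-sintax command. The helper name `build_sintax_command` and the file names are hypothetical; only the options named in the text (\-\-sintax, \-\-db, \-\-tabbedout, \-\-sintax_cutoff) are used, and the command is printed rather than executed.

```python
import shlex


def build_sintax_command(query, database, output, cutoff=None):
    """Assemble a vsearch --sintax argument list (nothing is executed here)."""
    args = ["vsearch", "--sintax", query, "--db", database, "--tabbedout", output]
    if cutoff is not None:
        # The cutoff is optional; e.g. 0.9 corresponds to 90% bootstrap support.
        args += ["--sintax_cutoff", str(cutoff)]
    return args


# Hypothetical file names, for illustration only.
cmd = build_sintax_command("queries.fasta", "reference.fasta", "classified.tsv", cutoff=0.9)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` on a system where vsearch is installed.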
The reference database must contain taxonomic information in the
header of each sequence in the form of a string starting with ";tax="
and followed by a comma-separated list of up to eight taxonomic
identifiers. Each taxonomic identifier must start with an indication
of the rank by one of the letters d (for domain), k (kingdom), p
(phylum), c (class), o (order), f (family), g (genus), or s
(species). The letter is followed by a colon (:) and the name of that
rank. Commas and semicolons are not allowed in the name of the rank.
Example: ">X80725_S000004313;tax=d:Bacteria,p:Proteobacteria,c:Gammaproteobacteria,o:Enterobacteriales,f:Enterobacteriaceae,g:Escherichia/Shigella,s:Escherichia_coli".
.PP
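The header format described above can be checked with a short Python sketch. It is only an illustration of the stated rules (rank letter, colon, name, comma-separated, at most eight identifiers), not the parser used by vsearch; the function name `parse_tax_header` and the error handling are assumptions.

```python
RANK_LETTERS = {
    "d": "domain", "k": "kingdom", "p": "phylum", "c": "class",
    "o": "order", "f": "family", "g": "genus", "s": "species",
}


def parse_tax_header(header):
    """Split a ';tax=' annotated FASTA header into (label, [(rank, name), ...])."""
    label, sep, rest = header.lstrip(">").partition(";tax=")
    if not sep:
        raise ValueError("no ';tax=' annotation found")
    # Names may not contain semicolons, so the annotation ends at the next ';'.
    tax_field = rest.split(";", 1)[0]
    ranks = []
    for item in tax_field.split(","):
        letter, colon, name = item.partition(":")
        if not colon or letter not in RANK_LETTERS or not name:
            raise ValueError(f"malformed taxonomic identifier: {item!r}")
        ranks.append((RANK_LETTERS[letter], name))
    if len(ranks) > 8:
        raise ValueError("more than eight taxonomic identifiers")
    return label, ranks


example = (">X80725_S000004313;tax=d:Bacteria,p:Proteobacteria,"
           "c:Gammaproteobacteria,o:Enterobacteriales,f:Enterobacteriaceae,"
           "g:Escherichia/Shigella,s:Escherichia_coli")
label, ranks = parse_tax_header(example)
print(label)
for rank, name in ranks:
    print(rank, name)
```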
.TP 9
.BI \-\-db \0filename
Read the reference sequences from \fIfilename\fR, in FASTA, FASTQ or UDB format. These sequences need to be annotated with taxonomy.
.TP
.BI \-\-sintax_cutoff\~ "real"
Specify a minimum level of bootstrap support for the taxonomic ranks that will be included in column 4 of the output file. For instance 0.9, corresponding to 90%.
.TP
.BI \-\-sintax \0filename
Read the input sequences from \fIfilename\fR, in FASTA or FASTQ format.
.TP
.BI \-\-tabbedout \0filename
Write the results to \fIfilename\fR, in a tab-separated text
format. Column 1 contains the query label. Column 2 contains the
predicted taxonomy in the same format as for the reference data, with
bootstrap support indicated in parentheses after each rank. Column 3
contains the strand. If the \-\-sintax_cutoff option is used, the
predicted taxonomy will be repeated in column 4 while omitting the
bootstrap values and including only the ranks with support at or above
the threshold.
.RE
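The output columns described for \-\-tabbedout can be read back with a few lines of Python. This sketch is an assumption-laden illustration: the function name `parse_sintax_line` and the sample line are invented for the example, the exact spelling of the support values in real output may differ, and the cutoff filter follows the literal wording above (keep ranks with support at or above the threshold) rather than the actual vsearch code.

```python
def parse_sintax_line(line, cutoff=None):
    """Parse one tab-separated result line into label, ranks and strand.

    Column 2 is assumed to look like 'd:Bacteria(1.00),p:Proteobacteria(0.95)'.
    """
    fields = line.rstrip("\n").split("\t")
    label, prediction, strand = fields[0], fields[1], fields[2]
    ranks = []
    for item in prediction.split(","):
        if not item:
            continue  # tolerate an empty prediction column
        name, _, support = item.rpartition("(")
        ranks.append((name, float(support.rstrip(")"))))
    result = {"label": label, "ranks": ranks, "strand": strand}
    if cutoff is not None:
        # Rebuild a column-4 style string: no bootstrap values, supported ranks only.
        kept = [name for name, support in ranks if support >= cutoff]
        result["filtered"] = ",".join(kept)
    return result


# Hypothetical result line, for illustration only.
sample = "query1\td:Bacteria(1.00),p:Proteobacteria(0.95),c:Gammaproteobacteria(0.60)\t+"
parsed = parse_sintax_line(sample, cutoff=0.9)
print(parsed["filtered"])
```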
.PP
.\" ----------------------------------------------------------------------------
UDB options:
.RS
Databases to be used with the \-\-usearch_global command may be
......@@ -2808,7 +2875,7 @@ Source code and binaries are available at
.PP
.\" ============================================================================
.SH COPYRIGHT
Copyright (C) 2014-2017, Torbjørn Rognes, Frédéric Mahé and Tomás
Copyright (C) 2014-2018, Torbjørn Rognes, Frédéric Mahé and Tomás
Flouri
.PP
All rights reserved.
......@@ -3320,6 +3387,13 @@ extraction of abundance and other attributes from the headers.
Fix several bugs on Windows related to large files, use of "-" as a
file name to mean stdin or stdout, alignment errors, missed kmers and
corrupted UDB files. Added documentation of UDB-related commands.
.TP
.BR v2.7.2\~ "released April 20th, 2018"
Added the sintax command for taxonomic classification. Fixed a bug with
incorrect FASTA headers of consensus sequences after clustering.
.TP
.BR v2.8.0\~ "released April 24th, 2018"
Added the fastq_maxdiffpct option to the fastq_mergepairs command.
.RE
.LP
.\" ============================================================================
......
......@@ -49,6 +49,7 @@ searchexact.h \
showalign.h \
sha1.h \
shuffle.h \
sintax.h \
sortbylength.h \
sortbysize.h \
subsample.h \
......@@ -126,6 +127,7 @@ searchexact.cc \
sha1.c \
showalign.cc \
shuffle.cc \
sintax.cc \
sortbylength.cc \
sortbysize.cc \
subsample.cc \
......
......@@ -2,7 +2,7 @@
VSEARCH: a versatile open source tool for metagenomics
Copyright (C) 2014-2017, Torbjorn Rognes, Frederic Mahe and Tomas Flouri
Copyright (C) 2014-2018, Torbjorn Rognes, Frederic Mahe and Tomas Flouri
All rights reserved.
Contact: Torbjorn Rognes <torognes@ifi.uio.no>,
......