Commits on Source (7)
VSEARCH: a versatile open source tool for metagenomics
Copyright (C) 2014-2017, Torbjorn Rognes, Frederic Mahe and Tomas Flouri
Copyright (C) 2014-2019, Torbjorn Rognes, Frederic Mahe and Tomas Flouri
All rights reserved.
Contact: Torbjorn Rognes <torognes@ifi.uio.no>,
......
......@@ -16,7 +16,7 @@ We have implemented a tool called VSEARCH which supports *de novo* and reference
VSEARCH stands for vectorized search, as the tool takes advantage of parallelism in the form of SIMD vectorization as well as multiple threads to perform accurate alignments at high speed. VSEARCH uses an optimal global aligner (full dynamic programming Needleman-Wunsch), in contrast to USEARCH which by default uses a heuristic seed and extend aligner. This usually results in more accurate alignments and overall improved sensitivity (recall) with VSEARCH, especially for alignments with gaps.
VSEARCH binaries are provided for x86-64 systems running GNU/Linux, macOS (version 10.7 or higher) and Windows (64-bit, version 7 or higher), as well as ppc64le systems running GNU/Linux.
VSEARCH binaries are provided for x86-64 systems running GNU/Linux, macOS (version 10.7 or higher) and Windows (64-bit, version 7 or higher), as well as for 64-bit little-endian POWER8 (ppc64le) and 64-bit ARMv8 systems (aarch64) running GNU/Linux. VSEARCH contains dedicated SIMD code for these three processor architectures (SSE2, AltiVec/VMX/VSX, Neon).
VSEARCH can directly read input query and database files that are compressed using gzip and bzip2 (.gz and .bz2) if the zlib and bzip2 libraries are available.
......@@ -24,7 +24,7 @@ Most of the nucleotide based commands and options in USEARCH version 7 are suppo
## Getting Help
If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.10.2/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.10.4/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
## Example
......@@ -37,9 +37,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
**Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:
```
wget https://github.com/torognes/vsearch/archive/v2.10.2.tar.gz
tar xzf v2.10.2.tar.gz
cd vsearch-2.10.2
wget https://github.com/torognes/vsearch/archive/v2.10.4.tar.gz
tar xzf v2.10.4.tar.gz
cd vsearch-2.10.4
./autogen.sh
./configure
make
......@@ -48,8 +48,6 @@ make install # as root or sudo make install
You may customize the installation directory using the `--prefix=DIR` option to `configure`. If the compression libraries [zlib](http://www.zlib.net) and/or [bzip2](http://www.bzip.org) are installed on the system, they will be detected automatically and support for compressed files will be included in vsearch. Support for compressed files may be disabled using the `--disable-zlib` and `--disable-bzip2` options to `configure`. A PDF version of the manual will be created from the `vsearch.1` manual file if `ps2pdf` is available, unless disabled using the `--disable-pdfman` option to `configure`. Other options may also be applied to `configure`; please run `configure -h` to see them all. GNU autotools (version 2.63 or later) and the gcc compiler are required to build vsearch.
The IBM XL C++ compiler is recommended on ppc64le systems.
The Windows binary was compiled using the [Mingw-w64](https://mingw-w64.org/) C++ cross-compiler.
**Cloning the repo** Instead of downloading the source distribution as a compressed archive, you could clone the repo and build it as shown below. The options to `configure` as described above are still valid.
......@@ -65,47 +63,56 @@ make install # as root or sudo make install
**Binary distribution** Starting with version 1.4.0, binary distribution files containing pre-compiled binaries as well as the documentation will be made available as part of each [release](https://github.com/torognes/vsearch/releases). The included executables include support for input files compressed by zlib and bzip2 (with files usually ending in `.gz` or `.bz2`).
Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (version 10.7 or higher) and Windows (64-bit, version 7 or higher), as well as ppc64le systems running GNU/Linux.
Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (version 10.7 or higher) and Windows (64-bit, version 7 or higher), as well as POWER8 (ppc64le) and 64-bit ARMv8 (aarch64) systems running GNU/Linux.
Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:
```sh
wget https://github.com/torognes/vsearch/releases/download/v2.10.2/vsearch-2.10.2-linux-x86_64.tar.gz
tar xzf vsearch-2.10.2-linux-x86_64.tar.gz
wget https://github.com/torognes/vsearch/releases/download/v2.10.4/vsearch-2.10.4-linux-x86_64.tar.gz
tar xzf vsearch-2.10.4-linux-x86_64.tar.gz
```
Or these commands if you are using a Linux ppc64le system:
```sh
wget https://github.com/torognes/vsearch/releases/download/v2.10.2/vsearch-2.10.2-linux-ppc64le.tar.gz
tar xzf vsearch-2.10.2-linux-ppc64le.tar.gz
wget https://github.com/torognes/vsearch/releases/download/v2.10.4/vsearch-2.10.4-linux-ppc64le.tar.gz
tar xzf vsearch-2.10.4-linux-ppc64le.tar.gz
```
Or these commands if you are using a Linux aarch64 system:
```sh
wget https://github.com/torognes/vsearch/releases/download/v2.10.4/vsearch-2.10.4-linux-aarch64.tar.gz
tar xzf vsearch-2.10.4-linux-aarch64.tar.gz
```
Or these commands if you are using a Mac:
```sh
wget https://github.com/torognes/vsearch/releases/download/v2.10.2/vsearch-2.10.2-macos-x86_64.tar.gz
tar xzf vsearch-2.10.2-macos-x86_64.tar.gz
wget https://github.com/torognes/vsearch/releases/download/v2.10.4/vsearch-2.10.4-macos-x86_64.tar.gz
tar xzf vsearch-2.10.4-macos-x86_64.tar.gz
```
Or if you are using Windows, download and extract (unzip) the contents of this file:
```
https://github.com/torognes/vsearch/releases/download/v2.10.2/vsearch-2.10.2-win-x86_64.zip
https://github.com/torognes/vsearch/releases/download/v2.10.4/vsearch-2.10.4-win-x86_64.zip
```
Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.10.2-linux-x86_64` or `vsearch-2.10.2-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.10.4-linux-x86_64` or `vsearch-2.10.4-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
Windows: You will now have the binary distribution in a folder called `vsearch-2.10.2-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
Windows: You will now have the binary distribution in a folder called `vsearch-2.10.4-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.10.2/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.10.2/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.10.4/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.10.4/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
## Plugins, packages, and wrappers
**QIIME 2 plugin** Thanks to the [QIIME 2](https://github.com/qiime2) team, there is now a plugin called [q2-vsearch](https://github.com/qiime2/q2-vsearch) for [QIIME 2](https://qiime2.org).
**Conda package** Thanks to the [BioConda](https://bioconda.github.io/) team, there is now a [vsearch package](https://anaconda.org/bioconda/vsearch) in [Conda](https://conda.io/).
**Homebrew package** Thanks to [Torsten Seemann](https://github.com/tseemann), a [vsearch package](https://github.com/Homebrew/homebrew-science/pull/2409) for [Homebrew](http://brew.sh/) has been made.
**Debian package** Thanks to the [Debian Med](https://www.debian.org/devel/debian-med/) team, there is now a [vsearch](https://packages.debian.org/sid/vsearch) package in [Debian](https://www.debian.org/).
......@@ -187,6 +194,7 @@ File | Description
**eestats.cc** | Produce statistics for fastq_eestats command
**fasta.cc** | FASTA file parser
**fastq.cc** | FASTQ file parser
**fastqjoin.cc** | FASTQ paired-end reads joining
**fastqops.cc** | FASTQ file statistics etc
**fastx.cc** | Detection of FASTA and FASTQ files, wrapper for FASTA and FASTQ parsers
**kmerhash.cc** | Hash for kmers used by paired-end read merger
......@@ -203,6 +211,7 @@ File | Description
**search.cc** | Implements search using global alignment
**searchcore.cc** | Core search functions for searching, clustering and chimera detection
**searchexact.cc** | Exact search functions
**sffconvert.cc** | SFF to FASTQ file conversion
**sha1.c** | SHA1 message digest
**showalign.cc** | Output an alignment in a human-readable way given a CIGAR-string and the sequences
**shuffle.cc** | Shuffle sequences
......@@ -232,14 +241,6 @@ or you could send an email to [torognes@ifi.uio.no](mailto:torognes@ifi.uio.no?s
VSEARCH is designed for rather short sequences, and will be slow when sequences are longer than about 5,000 bp. This is because it always performs optimal global alignment on selected sequences.
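To illustrate why full dynamic programming becomes slow for long sequences, here is a minimal sketch of a Needleman-Wunsch global alignment score. It is not VSEARCH's aligner: the function name `nw_score`, the linear gap penalty and the scoring values are assumptions chosen for brevity. The point is the nested loop, which visits every cell of an m × n matrix, so the work grows with the product of the two sequence lengths.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of VSEARCH: optimal global alignment score
// with a simple linear gap penalty (assumed scoring values).
int nw_score(const std::string & a, const std::string & b,
             int match = 1, int mismatch = -1, int gap = -1)
{
  // Two rolling rows of the dynamic programming matrix.
  std::vector<int> prev(b.size() + 1);
  std::vector<int> cur(b.size() + 1);

  // First row: aligning a prefix of b against an empty prefix of a.
  for (std::size_t j = 0; j <= b.size(); j++)
    prev[j] = static_cast<int>(j) * gap;

  for (std::size_t i = 1; i <= a.size(); i++)
    {
      cur[0] = static_cast<int>(i) * gap;
      for (std::size_t j = 1; j <= b.size(); j++)
        {
          int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
          int up   = prev[j] + gap;    // gap in b
          int left = cur[j - 1] + gap; // gap in a
          cur[j] = std::max({diag, up, left});
        }
      std::swap(prev, cur);
    }
  return prev[b.size()];
}
```

Under these assumed scores, two 5,000 bp sequences require 25 million cell updates per alignment, which is why heuristics such as banded alignment are attractive for long sequences.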
## Future work
Some issues to work on:
* testing and debugging
* heuristics for alignment of long sequences (e.g. banded alignment around selected diagonals)?
## The VSEARCH team
The main contributors to VSEARCH:
......
......@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.63])
AC_INIT([vsearch], [2.10.2], [torognes@ifi.uio.no])
AC_INIT([vsearch], [2.10.4], [torognes@ifi.uio.no])
AC_CANONICAL_TARGET
AM_INIT_AUTOMAKE([subdir-objects])
AC_LANG([C++])
......@@ -86,6 +86,7 @@ if test "x${have_zlib}" = "xyes"; then
fi
case $target in
aarch64*) target_aarch64="yes" ;;
powerpc64*) target_ppc="yes" ;;
esac
......@@ -96,6 +97,7 @@ AM_CONDITIONAL(HAVE_ZLIB, test "x${have_zlib}" = "xyes")
AM_CONDITIONAL(HAVE_PTHREADS, test "x${have_pthreads}" = "xyes")
AM_CONDITIONAL(HAVE_PS2PDF, test "x${have_ps2pdf}" = "xyes")
AM_CONDITIONAL(TARGET_PPC, test "x${target_ppc}" = "xyes")
AM_CONDITIONAL(TARGET_AARCH64, test "x${target_aarch64}" = "xyes")
AM_PROG_CC_C_O
AC_CONFIG_FILES([Makefile
......
vsearch (2.10.4-1) unstable; urgency=medium
* New upstream version
* debhelper 12
* Standards-Version: 4.3.0
* Secure URI in copyright format
-- Andreas Tille <tille@debian.org> Fri, 11 Jan 2019 22:54:35 +0100
vsearch (2.10.2-1) unstable; urgency=medium
* New upstream version
......
......@@ -4,13 +4,13 @@ Uploaders: Tim Booth <tbooth@ceh.ac.uk>,
Andreas Tille <tille@debian.org>
Section: science
Priority: optional
Build-Depends: debhelper (>= 11~),
Build-Depends: debhelper (>= 12~),
zlib1g-dev,
libbz2-dev,
python-markdown,
ghostscript,
time
Standards-Version: 4.2.1
Standards-Version: 4.3.0
Vcs-Browser: https://salsa.debian.org/med-team/vsearch
Vcs-Git: https://salsa.debian.org/med-team/vsearch.git
Homepage: https://github.com/torognes/vsearch/
......
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: VSEARCH
Upstream-Contact: Torbjørn Rognes <torognes@ifi.uio.no>
Source: https://github.com/torognes/vsearch/
......
.\" ============================================================================
.TH vsearch 1 "December 10, 2018" "version 2.10.2" "USER COMMANDS"
.TH vsearch 1 "January 4, 2019" "version 2.10.4" "USER COMMANDS"
.\" ============================================================================
.SH NAME
vsearch \(em chimera detection, clustering, dereplication and
......@@ -1430,7 +1430,7 @@ file containing the reverse reads.
.BI \-\-sff_convert \0filename
Convert the given SFF file to FASTQ. The FASTQ output file is
specified with the \-\-fastqout option. The sequence may be clipped as
specied in the SFF file if the option \-\-sff_clip is specified,
specified in the SFF file if the option \-\-sff_clip is specified,
otherwise no clipping occurs. Bases that would have been clipped are
converted to lower case, while the rest is in upper case. The output
quality encoding may be specified with the \-\-fastq_asciiout option
......@@ -3496,6 +3496,16 @@ speed and memory usage improvements.
.TP
.BR v2.10.2\~ "released December 10th, 2018"
Fixed bug in sintax with reversed order of domain and kingdom.
.TP
.BR v2.10.3\~ "released December 19th, 2018"
Ported to Linux on ARMv8 (aarch64). Fixed compilation warning with gcc
version 8.1.0 and 8.2.0.
.TP
.BR v2.10.4\~ "released January 4th, 2019"
Fixed serious bug in x86_64 SIMD alignment code introduced in version
2.10.3. Added link to BioConda in README. Fixed bug in fastq_stats
with sequence length 1. Fixed use of equals symbol in UC files for
identical sequences with cluster_fast.
.RE
.LP
.\" ============================================================================
......
......@@ -3,8 +3,12 @@ bin_PROGRAMS = $(top_builddir)/bin/vsearch
if TARGET_PPC
AM_CXXFLAGS=-Wall -Wsign-compare -O3 -g -mcpu=power8
else
if TARGET_AARCH64
AM_CXXFLAGS=-Wall -Wsign-compare -O3 -g -march=armv8-a+simd -mtune=generic
else
AM_CXXFLAGS=-Wall -Wsign-compare -O3 -g -march=x86-64 -mtune=generic
endif
endif
AM_CFLAGS=$(AM_CXXFLAGS)
......@@ -66,12 +70,17 @@ if TARGET_PPC
libcpu_a_SOURCES = cpu.cc $(VSEARCHHEADERS)
noinst_LIBRARIES = libcpu.a libcityhash.a
else
if TARGET_AARCH64
libcpu_a_SOURCES = cpu.cc $(VSEARCHHEADERS)
noinst_LIBRARIES = libcpu.a libcityhash.a
else
libcpu_sse2_a_SOURCES = cpu.cc $(VSEARCHHEADERS)
libcpu_sse2_a_CXXFLAGS = $(AM_CXXFLAGS) -msse2
libcpu_ssse3_a_SOURCES = cpu.cc $(VSEARCHHEADERS)
libcpu_ssse3_a_CXXFLAGS = $(AM_CXXFLAGS) -mssse3 -DSSSE3
noinst_LIBRARIES = libcpu_sse2.a libcpu_ssse3.a libcityhash.a
endif
endif
libcityhash_a_SOURCES = city.cc city.h
......@@ -88,8 +97,12 @@ libcityhash_a_CXXFLAGS = $(AM_CXXFLAGS) -Wno-sign-compare
if TARGET_PPC
__top_builddir__bin_vsearch_LDADD = libcityhash.a libcpu.a
else
if TARGET_AARCH64
__top_builddir__bin_vsearch_LDADD = libcityhash.a libcpu.a
else
__top_builddir__bin_vsearch_LDADD = libcityhash.a libcpu_ssse3.a libcpu_sse2.a
endif
endif
endif
......
......@@ -63,7 +63,59 @@
/* This file contains code dependent on special cpu features. */
/* The file may be compiled several times with different cpu options. */
#ifdef __PPC__
#ifdef __aarch64__
void increment_counters_from_bitmap(unsigned short * counters,
unsigned char * bitmap,
unsigned int totalbits)
{
const uint8x16_t c1 =
{ 0x01, 0x01, 0x02, 0x02, 0x04, 0x04, 0x08, 0x08,
0x10, 0x10, 0x20, 0x20, 0x40, 0x40, 0x80, 0x80 };
unsigned short * p = (unsigned short *)(bitmap);
int16x8_t * q = (int16x8_t *)(counters);
int r = (totalbits + 15) / 16;
for(int j=0; j<r; j++)
{
uint16x8_t r0;
uint8x16_t r1, r2, r3, r4;
int16x8_t r5, r6;
// load and duplicate short
r0 = vdupq_n_u16(*p);
p++;
// cast to bytes
r1 = vreinterpretq_u8_u16(r0);
// bit test with mask giving 0x00 or 0xff
r2 = vtstq_u8(r1, c1);
// transpose to duplicate even bytes
r3 = vtrn1q_u8(r2, r2);
// transpose to duplicate odd bytes
r4 = vtrn2q_u8(r2, r2);
// cast to signed 0x0000 or 0xffff
r5 = vreinterpretq_s16_u8(r3);
// cast to signed 0x0000 or 0xffff
r6 = vreinterpretq_s16_u8(r4);
// subtract signed 0 or -1 (i.e. add 0 or 1) with saturation to counter
*q = vqsubq_s16(*q, r5);
q++;
// subtract signed 0 or -1 (i.e. add 0 or 1) with saturation to counter
*q = vqsubq_s16(*q, r6);
q++;
}
}
#elif __PPC__
void increment_counters_from_bitmap(unsigned short * counters,
unsigned char * bitmap,
......@@ -102,7 +154,7 @@ void increment_counters_from_bitmap(unsigned short * counters,
}
}
#else
#elif __x86_64__
#ifdef SSSE3
void increment_counters_from_bitmap_ssse3(unsigned short * counters,
......@@ -170,4 +222,8 @@ void increment_counters_from_bitmap_sse2(unsigned short * counters,
}
}
#else
#error Unknown architecture
#endif
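For readers who want to check what the Neon version above computes, the following is a scalar sketch of the same operation. It is not part of the diff. The function name `increment_counters_from_bitmap_scalar` is hypothetical, and it assumes that bit `i` of the bitmap lives at bit `i % 8` of byte `i / 8`, which is how the mask `c1` and the two transposes appear to map bits to counters on a little-endian system. The limit of 0x7fff mirrors the signed saturating subtract (`vqsubq_s16`) used in the SIMD code.

```cpp
#include <cstdint>

// Hypothetical scalar reference, not part of VSEARCH.
// For every set bit i in the bitmap, increment counters[i],
// saturating at 0x7fff like a signed 16-bit saturating add.
void increment_counters_from_bitmap_scalar(uint16_t * counters,
                                           const unsigned char * bitmap,
                                           unsigned int totalbits)
{
  const uint16_t limit = 0x7fff;
  for (unsigned int i = 0; i < totalbits; i++)
    {
      // Assumed layout: bit i is bit (i % 8) of byte (i / 8).
      unsigned int bit = (bitmap[i / 8] >> (i % 8)) & 1u;
      if (bit && counters[i] < limit)
        counters[i]++;
    }
}
```

Unlike the SIMD version, which always processes whole groups of 16 bits, this sketch stops exactly at `totalbits`.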
......@@ -58,16 +58,15 @@
*/
#ifdef __PPC__
void increment_counters_from_bitmap(unsigned short * counters,
unsigned char * bitmap,
unsigned int totalbits);
#else
#ifdef __x86_64__
void increment_counters_from_bitmap_sse2(unsigned short * counters,
unsigned char * bitmap,
unsigned int totalbits);
void increment_counters_from_bitmap_ssse3(unsigned short * counters,
unsigned char * bitmap,
unsigned int totalbits);
#else
void increment_counters_from_bitmap(unsigned short * counters,
unsigned char * bitmap,
unsigned int totalbits);
#endif
......@@ -889,7 +889,7 @@ void fastq_stats()
fprintf(fp_log, " Len Q=5 Q=10 Q=15 Q=20\n");
fprintf(fp_log, "----- ------ ------ ------ ------\n");
for(int64_t i = len_max; i >= len_max/2; i--)
for(int64_t i = len_max; i >= MAX(1, len_max/2); i--)
{
double read_percentage[4];
......
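The effect of the changed loop bound can be seen in a small sketch. The helper `visited_lengths` is hypothetical and only reproduces the loop range, not the statistics themselves. With `len_max` equal to 1, integer division makes `len_max/2` equal to 0, so the unclamped loop also visits length 0; the clamped bound stops at 1.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of VSEARCH: list the lengths the loop
// visits, with or without the lower bound clamped to 1.
std::vector<int64_t> visited_lengths(int64_t len_max, bool clamp)
{
  std::vector<int64_t> out;
  int64_t lower = len_max / 2;
  if (clamp)
    lower = std::max<int64_t>(1, lower);
  for (int64_t i = len_max; i >= lower; i--)
    out.push_back(i);
  return out;
}
```

For longer reads the clamp makes no difference, since `len_max/2` is already at least 1 whenever `len_max` is 2 or more.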
......@@ -2,7 +2,7 @@
VSEARCH: a versatile open source tool for metagenomics
Copyright (C) 2014-2018, Torbjorn Rognes, Frederic Mahe and Tomas Flouri
Copyright (C) 2014-2019, Torbjorn Rognes, Frederic Mahe and Tomas Flouri
All rights reserved.
Contact: Torbjorn Rognes <torognes@ifi.uio.no>,
......@@ -197,15 +197,28 @@ void results_show_uc_one(FILE * fp,
strand: + or -
0
0
compressed alignment, e.g. 9I92M14D, or "=" if prefect alignment
compressed alignment, e.g. 9I92M14D, or "=" if perfect alignment
query label
target label
*/
if (hp)
{
bool perfect = (hp->matches == qseqlen) &&
((uint64_t)(qseqlen) == db_getsequencelen(hp->target));
bool perfect;
if (opt_cluster_fast)
{
/* cluster_fast */
/* use = for identical sequences ignoring terminal gaps */
perfect = (hp->matches == hp->internal_alignmentlength);
}
else
{
/* cluster_size, cluster_smallmem, cluster_unoise */
/* usearch_global, search_exact, allpairs_global */
/* use = for strictly identical sequences */
perfect = (hp->matches == hp->nwalignmentlength);
}
fprintf(fp,
"H\t%d\t%" PRId64 "\t%.1f\t%c\t0\t0\t%s\t%s\t%s\n",
......
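The two definitions of a perfect hit can be compared with a small sketch. The struct `hit_sketch` and the function `is_perfect` are hypothetical. Only the field names `matches`, `internal_alignmentlength` and `nwalignmentlength` are taken from the diff, and their meaning (alignment length without and with terminal gaps) is inferred from the comments in the changed code.

```cpp
// Hypothetical stand-in for the hit record; only the three fields
// used by the comparison are modelled.
struct hit_sketch
{
  int matches;                  // number of matching columns
  int internal_alignmentlength; // alignment length ignoring terminal gaps
  int nwalignmentlength;        // full global alignment length
};

// Returns true when the UC output would use "=" for this hit.
bool is_perfect(const hit_sketch & h, bool cluster_fast)
{
  if (cluster_fast)
    return h.matches == h.internal_alignmentlength;
  return h.matches == h.nwalignmentlength;
}
```

As an example under these assumptions, a 4 bp query that matches the first 4 bases of a 6 bp target has 4 matches, an internal alignment length of 4 and a full alignment length of 6. It would be reported as "=" with cluster_fast but not with the other commands.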
......@@ -185,15 +185,15 @@ void search_topscores(struct searchinfo_s * si)
if (bitmap)
{
#ifdef __PPC__
increment_counters_from_bitmap(si->kmers, bitmap, indexed_count);
#else
#ifdef __x86_64__
if (ssse3_present)
increment_counters_from_bitmap_ssse3(si->kmers,
bitmap, indexed_count);
else
increment_counters_from_bitmap_sse2(si->kmers,
bitmap, indexed_count);
#else
increment_counters_from_bitmap(si->kmers, bitmap, indexed_count);
#endif
}
else
......
......@@ -278,6 +278,7 @@ int64_t opt_wordlength;
/* cpu features available */
int64_t altivec_present = 0;
int64_t neon_present = 0;
int64_t mmx_present = 0;
int64_t sse_present = 0;
int64_t sse2_present = 0;
......@@ -300,7 +301,7 @@ FILE * fp_log = 0;
char * STDIN_NAME = (char*) "/dev/stdin";
char * STDOUT_NAME = (char*) "/dev/stdout";
#ifndef __PPC__
#ifdef __x86_64__
#define cpuid(f1, f2, a, b, c, d) \
__asm__ __volatile__ ("cpuid" \
: "=a" (a), "=b" (b), "=c" (c), "=d" (d) \
......@@ -309,9 +310,16 @@ char * STDOUT_NAME = (char*) "/dev/stdout";
void cpu_features_detect()
{
#ifdef __PPC__
altivec_present = 1;
#ifdef __aarch64__
#ifdef __ARM_NEON
/* may check /proc/cpuinfo for asimd or neon */
neon_present = 1;
#else
#error ARM Neon not present
#endif
#elif __PPC__
altivec_present = 1;
#elif __x86_64__
unsigned int a, b, c, d;
cpuid(0, 0, a, b, c, d);
......@@ -336,12 +344,16 @@ void cpu_features_detect()
avx2_present = (b >> 5) & 1;
}
}
#else
#error Unknown architecture
#endif
}
void cpu_features_show()
{
fprintf(stderr, "CPU features:");
if (neon_present)
fprintf(stderr, " neon");
if (altivec_present)
fprintf(stderr, " altivec");
if (mmx_present)
......@@ -2962,7 +2974,7 @@ int main(int argc, char** argv)
dynlibs_open();
#ifndef __PPC__
#ifdef __x86_64__
if (!sse2_present)
fatal("Sorry, this program requires a cpu with SSE2.");
#endif
......
......@@ -96,7 +96,17 @@
#define PROG_NAME PACKAGE
#define PROG_VERSION PACKAGE_VERSION
#ifdef __PPC__
#ifdef __x86_64__
#define PROG_CPU "x86_64"
#ifdef __SSE2__
#include <emmintrin.h>
#endif
#ifdef __SSSE3__
#include <tmmintrin.h>
#endif
#elif __PPC__
#ifdef __LITTLE_ENDIAN__
#define PROG_CPU "ppc64le"
......@@ -105,17 +115,14 @@
#error Big endian ppc64 CPUs not supported
#endif
#else
#elif __aarch64__
#define PROG_CPU "x86_64"
#define PROG_CPU "aarch64"
#include <arm_neon.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif
#else
#ifdef __SSSE3__
#include <tmmintrin.h>
#endif
#error Unknown architecture (not ppc64le, aarch64 or x86_64)
#endif
......
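The three-way architecture selection used throughout this change can be summarised in a short sketch. The function `prog_cpu_sketch` is hypothetical, and it differs from the diff in two ways that are assumptions of this example: it tests macros with `defined()`, and it returns "unknown" instead of stopping compilation with `#error`.

```cpp
// Hypothetical helper, not part of VSEARCH: report the architecture
// selected at compile time, in the same order as the header above.
const char * prog_cpu_sketch()
{
#if defined(__x86_64__)
  return "x86_64";
#elif defined(__PPC__)
  return "ppc64le";
#elif defined(__aarch64__)
  return "aarch64";
#else
  // The real code uses #error here; this sketch falls back to a string.
  return "unknown";
#endif
}
```

Using `defined()` avoids relying on the value a compiler assigns to the macro, since `#elif __PPC__` only works when the macro expands to a non-zero integer.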