Skip to content
Commits on Source (7)
Makefile.in
aclocal.m4
autom4te.cache
config.h.in
configure
depcomp
install-sh
missing
compile
config.guess
config.h.in~
config.sub
ltmain.sh
*_cmdline.hpp
test-driver
language: cpp
before_install:
- mkdir -p ~/bin
- curl -L -o ~/bin/yaggo https://github.com/gmarcais/yaggo/releases/download/v1.5.9/yaggo
- chmod a+rx ~/bin/yaggo
script: autoreconf -i && ./configure YAGGO=~/bin/yaggo && make && make check
addons:
artifacts:
paths:
- $(git ls-files -o | grep '\.log$' | tr "\n" ":")
s3_region: "us-east-1"
Version 1.1
===========
* Read input from pipes. For example, to count DNA from zipped files, run:
% gunzip -c *.fa.gz | jellyfish [options] /dev/fd/0
* Support fasta and fastq formats.
* Support for counting 'q-mers', as defined by the Quake error
correcting software (http://www.cbcb.umd.edu/software/quake/).
* New dump subcommands which replaces 'stats -f' and 'stats -c'.
Copyright (c) 2006, Industrial Light & Magic, a division of Lucasfilm
Entertainment Company Ltd. Portions contributed and copyright held by
others as indicated. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above
copyright notice, this list of conditions and the following
disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided with
the distribution.
* Neither the name of Industrial Light & Magic nor the names of
any other contributors to this software may be used to endorse or
promote products derived from this software without specific prior
written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
This diff is collapsed.
# This Makefile is meant to be run in the traget directories. There
# name start with a '_' by convention and contain a config.h.
CC = g++
CPPFLAGS = -Wall -Werror -g -O2 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -I.
# Uncomment following to use SSE
# CPPFLAGS += -msse -msse2 -DSSE
LDFLAGS = -lpthread
TARGETS = jellyfish
# old TARGETS = mer_counter dump_stats dump_stats_compacted hash_merge
SOURCES = $(patsubst %,%.cc,$(TARGETS)) mer_count_thread.cc
.SECONDARY:
#VPATH=..:../lib
%.d: %.cc
$(CC) -M -MM -MG -MP $(CPPFLAGS) $< > $@.tmp
sed -e 's/$*.o/& $@/g' $@.tmp > $@ && rm $@.tmp
all: $(TARGETS)
jellyfish: dump_stats_compacted.o square_binary_matrix.o hash_merge.o storage.o misc.o mer_count_thread.o mer_counter.o
# compare_dump3: compare_dump3.o
# dump_stats_compacted: dump_stats_compacted.o square_binary_matrix.o
total_mers_single_fasta: total_mers_single_fasta.o
mer_counter_C: mer_counter_C.o
# hash_merge: hash_merge.o storage.o misc.o
# mer_counter: mer_counter.o mer_count_thread.o storage.o misc.o
# dump_mers: dump_mers.o storage.o misc.o
dump_stats: dump_stats.o storage.o misc.o
clean:
rm -f $(TARGETS) *.o *.d
# Is the following OK? Shouldn't we include more than that?
include $(SOURCES:.cc=.d)
-include local.mk
Jellyfish
=========
Overview
--------
Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. Jellyfish can count k-mers using an order of magnitude less memory and an order of magnitude faster than other k-mer counting packages by using an efficient encoding of a hash table and by exploiting the "compare-and-swap" CPU instruction to increase parallelism.
JELLYFISH is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format using the "jellyfish dump" command, or queried for specific k-mers with "jellyfish query". See the UserGuide provided on [Jellyfish's home page][1] for more details.
If you use Jellyfish in your research, please cite:
Guillaume Marcais and Carl Kingsford, A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics (2011) 27(6): 764-770 ([first published online January 7, 2011](http://bioinformatics.oxfordjournals.org/cgi/content/abstract/27/6/764 "Paper on Oxford Bioinformatics website")) doi:10.1093/bioinformatics/btr011
Installation
------------
To get an easier to compiled packaged tar ball of the source code, download a release from the [github release][3]. You need make and g++ version 4.4 or higher. To install in your home directory, do:
```Shell
./configure --prefix=$HOME
make -j 4
make install
```
To compile from the git tree, you will also need autoconf/automake, and [yaggo](https://github.com/gmarcais/yaggo/releases "Yaggo release on github"). Then to compile and install (in `/usr/local` in that example) with:
```Shell
autoreconf -i
./configure
make -j 4
sudo make install
```
If the software is installed in system directories (hint: you needed to use `sudo` to install), like the example above, then the system library cache must be updated like such:
```Shell
sudo ldconfig
```
Extra / Examples
----------------
In the examples directory are potentially useful extra programs to query/manipulates output files of Jellyfish, using the shared library of Jellyfish in C++ or with scripting languages. The examples are not compiled by default. Each subdirectory of examples is independent and is compiled with a simple invocation of 'make'.
Binding to script languages
---------------------------
Bindings to Ruby, Python and Perl are provided. This binding allows to read the output file of Jellyfish directly in a scripting language. Compilation of the bindings is easier from the [release tarball][3]. The development files of the target scripting language are required.
Compilation of the bindings from the git tree requires [SWIG][2] version 3 and adding the switch `--enable-swig` to the configure command lines show below.
To compile all three bindings, configure and compile with:
```Shell
./configure --enable-ruby-binding --enable-python-binding --enable-perl-binding
make -j 4
sudo make install
```
By default, Jellyfish is installed in `/usr/local` and the bindings are installed in the proper system location. When the `--prefix` switch is passed, the bindings are installed in the given directory. For example:
```Shell
./configure --prefix=$HOME --enable-python-binding
make -j 4
make install
```
This will install the python binding in `$HOME/lib/python2.7/site-packages` (adjust based on your Python version).
Then, for Python, Ruby or Perl to find the binding, an environment variable may need to be adjusted (`PYTHONPATH`, `RUBYLIB` and `PERL5LIB` respectively). For example:
```Shell
export PYTHONPATH=$HOME/lib/python2.7/site-packages
```
See the [swig directory](../../tree/master/swig) for examples on how to use the bindings.
[1]: http://www.genome.umd.edu/jellyfish.html "Genome group at University of Maryland"
[2]: http://www.swig.org/
[3]: https://github.com/gmarcais/Jellyfish/releases "Jellyfish release"
This diff is collapsed.
#! /bin/sh
# Wrapper for compilers which do not understand '-c -o'.
scriptversion=2016-01-11.22; # UTC
# Copyright (C) 1999-2017 Free Software Foundation, Inc.
# Written by Tom Tromey <tromey@cygnus.com>.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# As a special exception to the GNU General Public License, if you
# distribute this file as part of a program that contains a
# configuration script generated by Autoconf, you may include it under
# the same distribution terms that you use for the rest of that program.
# This file is maintained in Automake, please report
# bugs to <bug-automake@gnu.org> or send patches to
# <automake-patches@gnu.org>.
nl='
'
# We need space, tab and new line, in precisely that order. Quoting is
# there to prevent tools from complaining about whitespace usage.
IFS=" "" $nl"
file_conv=
# func_file_conv build_file lazy
# Convert a $build file to $host form and store it in $file
# Currently only supports Windows hosts. If the determined conversion
# type is listed in (the comma separated) LAZY, no conversion will
# take place.
func_file_conv ()
{
file=$1
case $file in
/ | /[!/]*) # absolute file, and not a UNC file
if test -z "$file_conv"; then
# lazily determine how to convert abs files
case `uname -s` in
MINGW*)
file_conv=mingw
;;
CYGWIN*)
file_conv=cygwin
;;
*)
file_conv=wine
;;
esac
fi
case $file_conv/,$2, in
*,$file_conv,*)
;;
mingw/*)
file=`cmd //C echo "$file " | sed -e 's/"\(.*\) " *$/\1/'`
;;
cygwin/*)
file=`cygpath -m "$file" || echo "$file"`
;;
wine/*)
file=`winepath -w "$file" || echo "$file"`
;;
esac
;;
esac
}
# func_cl_dashL linkdir
# Make cl look for libraries in LINKDIR
func_cl_dashL ()
{
func_file_conv "$1"
if test -z "$lib_path"; then
lib_path=$file
else
lib_path="$lib_path;$file"
fi
linker_opts="$linker_opts -LIBPATH:$file"
}
# func_cl_dashl library
# Do a library search-path lookup for cl
func_cl_dashl ()
{
lib=$1
found=no
save_IFS=$IFS
IFS=';'
for dir in $lib_path $LIB
do
IFS=$save_IFS
if $shared && test -f "$dir/$lib.dll.lib"; then
found=yes
lib=$dir/$lib.dll.lib
break
fi
if test -f "$dir/$lib.lib"; then
found=yes
lib=$dir/$lib.lib
break
fi
if test -f "$dir/lib$lib.a"; then
found=yes
lib=$dir/lib$lib.a
break
fi
done
IFS=$save_IFS
if test "$found" != yes; then
lib=$lib.lib
fi
}
# func_cl_wrapper cl arg...
# Adjust compile command to suit cl
func_cl_wrapper ()
{
# Assume a capable shell
lib_path=
shared=:
linker_opts=
for arg
do
if test -n "$eat"; then
eat=
else
case $1 in
-o)
# configure might choose to run compile as 'compile cc -o foo foo.c'.
eat=1
case $2 in
*.o | *.[oO][bB][jJ])
func_file_conv "$2"
set x "$@" -Fo"$file"
shift
;;
*)
func_file_conv "$2"
set x "$@" -Fe"$file"
shift
;;
esac
;;
-I)
eat=1
func_file_conv "$2" mingw
set x "$@" -I"$file"
shift
;;
-I*)
func_file_conv "${1#-I}" mingw
set x "$@" -I"$file"
shift
;;
-l)
eat=1
func_cl_dashl "$2"
set x "$@" "$lib"
shift
;;
-l*)
func_cl_dashl "${1#-l}"
set x "$@" "$lib"
shift
;;
-L)
eat=1
func_cl_dashL "$2"
;;
-L*)
func_cl_dashL "${1#-L}"
;;
-static)
shared=false
;;
-Wl,*)
arg=${1#-Wl,}
save_ifs="$IFS"; IFS=','
for flag in $arg; do
IFS="$save_ifs"
linker_opts="$linker_opts $flag"
done
IFS="$save_ifs"
;;
-Xlinker)
eat=1
linker_opts="$linker_opts $2"
;;
-*)
set x "$@" "$1"
shift
;;
*.cc | *.CC | *.cxx | *.CXX | *.[cC]++)
func_file_conv "$1"
set x "$@" -Tp"$file"
shift
;;
*.c | *.cpp | *.CPP | *.lib | *.LIB | *.Lib | *.OBJ | *.obj | *.[oO])
func_file_conv "$1" mingw
set x "$@" "$file"
shift
;;
*)
set x "$@" "$1"
shift
;;
esac
fi
shift
done
if test -n "$linker_opts"; then
linker_opts="-link$linker_opts"
fi
exec "$@" $linker_opts
exit 1
}
eat=
case $1 in
'')
echo "$0: No command. Try '$0 --help' for more information." 1>&2
exit 1;
;;
-h | --h*)
cat <<\EOF
Usage: compile [--help] [--version] PROGRAM [ARGS]
Wrapper for compilers which do not understand '-c -o'.
Remove '-o dest.o' from ARGS, run PROGRAM with the remaining
arguments, and rename the output as expected.
If you are trying to build a whole package this is not the
right script to run: please start by reading the file 'INSTALL'.
Report bugs to <bug-automake@gnu.org>.
EOF
exit $?
;;
-v | --v*)
echo "compile $scriptversion"
exit $?
;;
cl | *[/\\]cl | cl.exe | *[/\\]cl.exe | \
icl | *[/\\]icl | icl.exe | *[/\\]icl.exe )
func_cl_wrapper "$@" # Doesn't return...
;;
esac
ofile=
cfile=
for arg
do
if test -n "$eat"; then
eat=
else
case $1 in
-o)
# configure might choose to run compile as 'compile cc -o foo foo.c'.
# So we strip '-o arg' only if arg is an object.
eat=1
case $2 in
*.o | *.obj)
ofile=$2
;;
*)
set x "$@" -o "$2"
shift
;;
esac
;;
*.c)
cfile=$1
set x "$@" "$1"
shift
;;
*)
set x "$@" "$1"
shift
;;
esac
fi
shift
done
if test -z "$ofile" || test -z "$cfile"; then
# If no '-o' option was seen then we might have been invoked from a
# pattern rule where we don't need one. That is ok -- this is a
# normal compilation that the losing compiler can handle. If no
# '.c' file was seen then we are probably linking. That is also
# ok.
exec "$@"
fi
# Name of file we expect compiler to create.
cofile=`echo "$cfile" | sed 's|^.*[\\/]||; s|^[a-zA-Z]:||; s/\.c$/.o/'`
# Create the lock directory.
# Note: use '[/\\:.-]' here to ensure that we don't use the same name
# that we are using for the .o file. Also, base the name on the expected
# object file name, since that is what matters with a parallel build.
lockdir=`echo "$cofile" | sed -e 's|[/\\:.-]|_|g'`.d
while true; do
if mkdir "$lockdir" >/dev/null 2>&1; then
break
fi
sleep 1
done
# FIXME: race condition here if user kills between mkdir and trap.
trap "rmdir '$lockdir'; exit 1" 1 2 15
# Run the compile.
"$@"
ret=$?
if test -f "$cofile"; then
test "$cofile" = "$ofile" || mv "$cofile" "$ofile"
elif test -f "${cofile}bj"; then
test "${cofile}bj" = "$ofile" || mv "${cofile}bj" "$ofile"
fi
rmdir "$lockdir"
exit $ret
# Local Variables:
# mode: shell-script
# sh-indentation: 2
# eval: (add-hook 'write-file-hooks 'time-stamp)
# time-stamp-start: "scriptversion="
# time-stamp-format: "%:y-%02m-%02d.%02H"
# time-stamp-time-zone: "UTC0"
# time-stamp-end: "; # UTC"
# End:
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
AC_INIT([jellyfish], [2.2.8], [gmarcais@umd.edu])
AC_INIT([jellyfish], [2.2.9], [gmarcais@umd.edu])
AC_CANONICAL_HOST
AC_CONFIG_MACRO_DIR([m4])
AM_INIT_AUTOMAKE([subdir-objects foreign parallel-tests color-tests])
......
jellyfish (2.2.9-1) UNRELEASED; urgency=medium
* New upstream version
* Point Vcs fields to salsa.debian.org
* Standards-Version: 4.1.4
-- Andreas Tille <tille@debian.org> Thu, 26 Apr 2018 15:51:15 +0200
jellyfish (2.2.8-3) unstable; urgency=medium
* Build and install fastq2sam, which fixes Debci tests
......
......@@ -18,9 +18,9 @@ Build-Depends: debhelper (>= 11~),
dh-python,
perl,
chrpath
Standards-Version: 4.1.3
Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/jellyfish.git
Vcs-Git: https://anonscm.debian.org/git/debian-med/jellyfish.git
Standards-Version: 4.1.4
Vcs-Browser: https://salsa.debian.org/med-team/jellyfish
Vcs-Git: https://salsa.debian.org/med-team/jellyfish.git
Homepage: http://www.cbcb.umd.edu/software/jellyfish/
Package: jellyfish
......
manpage_whatis_entry.patch
pkg-config-with-name
drop-rpath
modern-g++
spelling
gcc-7.patch
# pkg-config-with-name
# drop-rpath
# modern-g++
# spelling
# gcc-7.patch
reproducible.patch
portability.patch
add_small_mers.sh.patch
6dfc57b516249454dc0a708ceb4833623cbf0ffd.patch
# add_small_mers.sh.patch
# 6dfc57b516249454dc0a708ceb4833623cbf0ffd.patch
......@@ -43,14 +43,6 @@ override_dh_install:
override_dh_auto_configure:
dh_auto_configure -- --enable-swig --enable-perl-binding
override_dh_auto_clean:
set -e && for i in $(build3vers); do \
cd swig/python ; \
python$$i ./setup.py clean --all 2> /dev/null; \
cd ../.. ; \
done
dh_auto_clean
override_dh_clean:
dh_clean
if [ -d debian/tmp_save_tests ] ; then \
......
This diff is collapsed.