Commit 83af76e3 authored by Saif Abdul Cassim

New upstream version 1.3.0

<?xml version="1.0" encoding="UTF-8"?>
<classpath>
<classpathentry kind="src" path="src/java"/>
<classpathentry kind="src" path="build/java"/>
<classpathentry kind="src" path="src/java-unsafe"/>
<classpathentry kind="src" path="src/test-resources"/>
<classpathentry kind="src" path="build/jni"/>
<classpathentry kind="src" path="src/test"/>
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/>
<classpathentry kind="lib" path="lib/hamcrest-core-1.1.jar"/>
<classpathentry kind="lib" path="lib/junit-4.10.jar"/>
<classpathentry kind="lib" path="lib/randomizedtesting-runner-2.0.9.jar"/>
<classpathentry kind="output" path="bin"/>
</classpath>
build
dist
bin
lib
!src/build
.idea
out
*/*.iml
<?xml version="1.0" encoding="UTF-8"?>
<projectDescription>
<name>lz4-java</name>
<comment></comment>
<projects>
</projects>
<buildSpec>
<buildCommand>
<name>org.eclipse.jdt.core.javabuilder</name>
</buildCommand>
</buildSpec>
<natures>
<nature>org.eclipse.jdt.core.javanature</nature>
</natures>
</projectDescription>
#Tue Jul 17 16:36:09 CEST 2012
eclipse.preferences.version=1
formatter_profile=_2 spaces
formatter_settings_version=12
# Change log
## 1.3.0
- lz4 r123
- xxhash r37
- [#49](https://github.com/jpountz/lz4-java/pull/49)
All compression and decompression routines as well as xxhash can now work
with java.nio.ByteBuffer. (Branimir Lambov)
- [#46](https://github.com/jpountz/lz4-java/pull/46)
Fixed incorrect usage of ReleasePrimitiveArrayCritical. (Xiaoguang Sun)
- [#44](https://github.com/jpountz/lz4-java/pull/44)
Added support for xxhash64. (Linnaea Von Lavia)
- [#43](https://github.com/jpountz/lz4-java/pull/43)
The compression level for high compression is now configurable.
(Linnaea Von Lavia)
- [#39](https://github.com/jpountz/lz4-java/pull/39)
The JAR is now a valid OSGI bundle. (Simon Chemouil)
- [#33](https://github.com/jpountz/lz4-java/pull/33)
The implementation based on Java's sun.misc.Unsafe relies on unaligned
memory access and is now only used on platforms that support it.
(Dmitry Shohov)
## 1.2.0
- lz4 r100
- [#16](http://github.com/jpountz/lz4-java/issues/16)
Fix violation of the Closeable contract in LZ4BlockOutputStream: double close
now works as specified in the Closeable interface documentation.
(Steven Schlansker)
- [#17](http://github.com/jpountz/lz4-java/issues/17)
The JNI HC compressor now supports maxDestLen < maxCompressedLength.
(Adrien Grand)
- [#12](http://github.com/jpountz/lz4-java/issues/12)
Fixed ArrayIndexOutOfBoundsException in the Java HC compressors on highly
compressible inputs when srcOff is > 0. (Brian S. O'Neill, @foresteve,
Adrien Grand)
- Decompressors have been renamed to "safe" and "fast" to reflect changes in
the C API. (Adrien Grand)
- [#18](http://github.com/jpountz/lz4-java/issues/18)
Added utility methods that take and return (de)compressed byte[]s.
(Adrien Grand)
## 1.1.2
- LZ4BlockInputStream does not support mark/reset anymore. (Adrien Grand)
- LZ4BlockOutputStream supports a new syncFlush parameter to configure whether
the flush method should flush pending data or just flush the underlying
stream. (Adrien Grand)
- [#14](http://github.com/jpountz/lz4-java/issues/14)
Fixed misspelled API. (Brian S. O'Neill)
- [#13](http://github.com/jpountz/lz4-java/issues/13)
Header must be fully read. (Gabriel Ki)
## 1.1.1
- [#11](http://github.com/jpountz/lz4-java/issues/11)
Fixed bug in LZ4BlockOutputStream.write(int). (Adrien Grand, Brian Moore)
## 1.1.0
- lz4 r88
- [#7](http://github.com/jpountz/lz4-java/issues/7)
LZ4Factory.fastestInstance() only tries to use the native bindings if:
- they have already been loaded by the current class loader,
- or if the current class loader is the system class loader.
(Adrien Grand)
- [#5](http://github.com/jpountz/lz4-java/issues/5)
The native instances unpack a shared library to the temporary directory when
they are first used. lz4-java now tries to remove this file on exit but
this might fail on systems that don't support removal of open files such as
Windows. (Adrien Grand)
- Added LZ4Factory.fastestJavaInstance() and XXHash.fastestJavaInstance().
(Adrien Grand)
- Added StreamingXXHash32.asChecksum() to return a java.util.zip.Checksum
view. (Adrien Grand)
- [#10](http://github.com/jpountz/lz4-java/issues/10)
Added LZ4BlockOutputStream which compresses data into fixed-size blocks of
configurable size.
(Adrien Grand, Brian Moore)
- [#5](http://github.com/jpountz/lz4-java/issues/5)
Fixed Windows build. (Rui Gonçalves)
- Fixed Mac build. (Adrien Maglo)
- [#8](http://github.com/jpountz/lz4-java/issues/8)
Provided pre-built JNI bindings for some major platforms: Windows/64,
Linux/32, Linux/64 and Mac Intel/64. (Rui Gonçalves, Adrien Maglo,
Adrien Grand)
## 1.0.0
- lz4 r87
- xxhash r6
# LZ4 Java
LZ4 compression for Java, based on Yann Collet's work available at
http://code.google.com/p/lz4/.
This library provides access to two compression methods that both generate a
valid LZ4 stream:
- fast scan (LZ4):
- low memory footprint (~ 16 KB),
- very fast (fast scan with skipping heuristics in case the input looks
incompressible),
- reasonable compression ratio (depending on the redundancy of the input).
- high compression (LZ4 HC):
- medium memory footprint (~ 256 KB),
- rather slow (~ 10 times slower than LZ4),
- good compression ratio (depending on the size and the redundancy of the
input).
The streams produced by those 2 compression algorithms use the same compression
format, are very fast to decompress and can be decompressed by the same
decompressor instance.
## Implementations
For LZ4 compressors, LZ4 HC compressors and decompressors, 3 implementations are
available:
- JNI bindings to the original C implementation by Yann Collet,
- a pure Java port of the compression and decompression algorithms,
- a Java port that uses the sun.misc.Unsafe API in order to achieve compression
and decompression speeds close to the C implementation.
Have a look at LZ4Factory for more information.
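For illustration only, here is a minimal sketch of how a specific implementation can be requested through LZ4Factory (the factory methods below are part of the public API; `fastestInstance()` simply picks the quickest implementation that can be loaded at runtime):
```java
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

public class FactorySelection {
  public static void main(String[] args) {
    // Request a specific implementation...
    LZ4Factory jni = LZ4Factory.nativeInstance();     // JNI bindings to the C code
    LZ4Factory pureJava = LZ4Factory.safeInstance();  // pure Java port
    LZ4Factory unsafe = LZ4Factory.unsafeInstance();  // Java port using sun.misc.Unsafe

    // ...or let lz4-java pick the fastest one available at runtime.
    LZ4Factory fastest = LZ4Factory.fastestInstance();
    LZ4Compressor compressor = fastest.fastCompressor();    // fast scan (LZ4)
    LZ4Compressor hcCompressor = fastest.highCompressor();  // high compression (LZ4 HC)
  }
}
```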
## Compatibility notes
- Compressors and decompressors are interchangeable: it is perfectly correct
to compress with the JNI bindings and to decompress with a Java port, or the
other way around (see the sketch after this list).
- Compressors might not generate the same compressed streams on all platforms,
especially if CPU endianness differs, but the compressed streams can be
safely decompressed by any decompressor implementation on any platform.
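As a quick sketch of the first point (using the convenience `byte[]` methods added in 1.2.0; the input data is arbitrary), compressing with the pure Java port and decompressing with whatever implementation is fastest works as expected:
```java
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4FastDecompressor;

public class CrossImplementation {
  public static void main(String[] args) throws Exception {
    byte[] data = "interchangeable compressors and decompressors".getBytes("UTF-8");

    // Compress with the pure Java port...
    LZ4Compressor compressor = LZ4Factory.safeInstance().fastCompressor();
    byte[] compressed = compressor.compress(data);

    // ...and decompress with the fastest available implementation (possibly JNI).
    LZ4FastDecompressor decompressor = LZ4Factory.fastestInstance().fastDecompressor();
    byte[] restored = decompressor.decompress(compressed, data.length);

    System.out.println(new String(restored, "UTF-8"));
  }
}
```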
## Example
```java
LZ4Factory factory = LZ4Factory.fastestInstance();
byte[] data = "12345345234572".getBytes("UTF-8");
final int decompressedLength = data.length;
// compress data
LZ4Compressor compressor = factory.fastCompressor();
int maxCompressedLength = compressor.maxCompressedLength(decompressedLength);
byte[] compressed = new byte[maxCompressedLength];
int compressedLength = compressor.compress(data, 0, decompressedLength, compressed, 0, maxCompressedLength);
// decompress data
// - method 1: when the decompressed length is known
LZ4FastDecompressor decompressor = factory.fastDecompressor();
byte[] restored = new byte[decompressedLength];
int compressedLength2 = decompressor.decompress(compressed, 0, restored, 0, decompressedLength);
// compressedLength == compressedLength2
// - method 2: when the compressed length is known (a little slower)
// the destination buffer needs to be over-sized
LZ4SafeDecompressor decompressor2 = factory.safeDecompressor();
int decompressedLength2 = decompressor2.decompress(compressed, 0, compressedLength, restored, 0);
// decompressedLength == decompressedLength2
```
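For stream-oriented use, the block streams mentioned in the change log (`LZ4BlockOutputStream` / `LZ4BlockInputStream`) wrap this API; the following is a minimal sketch, with the 64 KB block size being an arbitrary choice:
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import net.jpountz.lz4.LZ4BlockInputStream;
import net.jpountz.lz4.LZ4BlockOutputStream;
import net.jpountz.lz4.LZ4Factory;

public class BlockStreamExample {
  public static void main(String[] args) throws Exception {
    byte[] data = "12345345234572".getBytes("UTF-8");

    // Compress into fixed-size blocks of 64 KB.
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    LZ4BlockOutputStream out =
        new LZ4BlockOutputStream(sink, 64 * 1024, LZ4Factory.fastestInstance().fastCompressor());
    out.write(data);
    out.close(); // flushes the pending block and closes the underlying stream

    // Read the compressed blocks back.
    LZ4BlockInputStream in =
        new LZ4BlockInputStream(new ByteArrayInputStream(sink.toByteArray()));
    byte[] restored = new byte[data.length];
    int read = in.read(restored);
    in.close();
    System.out.println(new String(restored, 0, read, "UTF-8"));
  }
}
```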
# xxhash Java
xxhash hashing for Java, based on Yann Collet's work available at
http://code.google.com/p/xxhash/. xxhash is a non-cryptographic, extremely fast
and high-quality ([SMHasher](http://code.google.com/p/smhasher/wiki/SMHasher)
score of 10) hash function.
## Implementations
Similarly to LZ4, 3 implementations are available: JNI bindings, a pure Java port,
and a Java port that uses sun.misc.Unsafe.
Have a look at XXHashFactory for more information.
## Compatibility notes
- All implementations return the same hash for the same input bytes:
- on any JVM,
- on any platform (even if the endianness or integer size differs).
## Example
```java
XXHashFactory factory = XXHashFactory.fastestInstance();
byte[] data = "12345345234572".getBytes("UTF-8");
ByteArrayInputStream in = new ByteArrayInputStream(data);
int seed = 0x9747b28c; // used to initialize the hash value, use whatever
// value you want, but always the same
StreamingXXHash32 hash32 = factory.newStreamingHash32(seed);
byte[] buf = new byte[8]; // for real-world usage, use a larger buffer, like 8192 bytes
for (;;) {
int read = in.read(buf);
if (read == -1) {
break;
}
hash32.update(buf, 0, read);
}
int hash = hash32.getValue();
```
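When the whole input is already in memory, the one-shot hashing API avoids the streaming object; a minimal sketch using the same data and seed as above:
```java
import net.jpountz.xxhash.XXHash32;
import net.jpountz.xxhash.XXHashFactory;

public class OneShotHash {
  public static void main(String[] args) throws Exception {
    XXHashFactory factory = XXHashFactory.fastestInstance();
    byte[] data = "12345345234572".getBytes("UTF-8");
    int seed = 0x9747b28c; // any fixed value works, as long as it is stable

    // One-shot hashing of an in-memory buffer; no streaming state is needed.
    XXHash32 hasher = factory.hash32();
    int hash = hasher.hash(data, 0, data.length, seed);
    System.out.println(Integer.toHexString(hash));
  }
}
```
For the same input and seed, this returns the same value as the streaming version.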
# Download
You can download released artifacts from [Maven Central](http://repo1.maven.org/maven2/net/jpountz/lz4/lz4/).
# Documentation
- [lz4](http://jpountz.github.com/lz4-java/1.2.0/docs/net/jpountz/lz4/package-summary.html)
- [xxhash](http://jpountz.github.com/lz4-java/1.2.0/docs/net/jpountz/xxhash/package-summary.html)
- [changelog](http://github.com/jpountz/lz4-java/blob/master/CHANGES.md)
# Performance
Both lz4 and xxhash focus on speed. Although compression, decompression and
hashing performance can depend a lot on the input (there are lies, damn lies
and benchmarks), here are some benchmarks that try to give a sense of the
speed at which they compress/decompress/hash bytes.
- [lz4 compression](http://jpountz.github.com/lz4-java/1.2.0/lz4-compression-benchmark/)
- [lz4 decompression](http://jpountz.github.com/lz4-java/1.2.0/lz4-decompression-benchmark/)
- [xxhash hashing](http://jpountz.github.com/lz4-java/1.2.0/xxhash-benchmark/)
# Build
## Requirements
- JDK version 7 or newer,
- ant,
- ivy.
If ivy is not installed yet, ant can take care of it for you: just run
`ant ivy-bootstrap`. The library will be installed under `${user.home}/.ant/lib`.
## Instructions
Then run `ant`. It will:
- generate some Java source files in `build/java` from the templates that are
located under `src/build`,
- compile the lz4 and xxhash libraries and their JNI (Java Native Interface)
bindings,
- compile Java sources in `src/java` (normal sources), `src/java-unsafe`
(sources that make use of `sun.misc.Unsafe`) and `build/java`
(auto-generated sources) to `build/classes`, `build/unsafe-classes` and
`build/generated-classes`,
- generate a JAR file called lz4-${version}.jar under the `dist` directory.
The JAR file that is generated contains Java class files, the native library
and the JNI bindings. If you add this JAR to your classpath, the native library
will be copied to a temporary directory and dynamically linked to your Java
application.
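To check which implementation actually gets picked at runtime, one option is to print the factory; a hedged sketch, assuming (as in recent versions) that LZ4Factory's string form names the backing implementation:
```java
import net.jpountz.lz4.LZ4Factory;

public class WhichImplementation {
  public static void main(String[] args) {
    // Prints the factory; its string form names the backing implementation
    // (JNI, unsafe Java or safe Java), which tells you whether the native
    // library could be unpacked and linked.
    System.out.println(LZ4Factory.fastestInstance());
  }
}
```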
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<ivy-module version="2.0">
<info organisation="net.jpountz.lz4" module="lz4" revision="1.3-SNAPSHOT" />
<configurations defaultconfmapping="default->default">
<conf name="default" />
<conf name="test" extends="default" />
</configurations>
<dependencies>
<dependency org="com.carrotsearch.randomizedtesting" name="junit4-ant" rev="2.0.9" transitive="true" conf="test->*,!sources,!javadoc" />
</dependencies>
</ivy-module>
Bundle-SymbolicName: lz4-java
Bundle-Name: LZ4 Java Compression
Bundle-Version: ${ivy.revision}
Export-Package: net.jpountz.*;version:=${packages.version}
import java.io.*;
import java.util.*;
import org.mvel2.templates.*;
outDir = System.getProperty("out.dir");
def get_template(file) {
template = new File(file);
return TemplateCompiler.compileTemplate(template, Collections.emptyMap());
}
def execute_template(template, dest, args) {
System.out.println("Generating " + dest);
dest.getParentFile().mkdirs();
String result = (String) TemplateRuntime.execute(template, null, args);
writer = new PrintWriter(dest, "UTF-8");
writer.print(result);
writer.close();
}
def dest_file(path) {
return new File(outDir + "/net/jpountz/" + path);
}
def generate_decompressors() {
compiledTemplate = get_template("decompressor.template");
for (type : ["Safe", "Unsafe"]) {
for (size : ["Fast", "Safe"]) {
dest = dest_file("lz4/LZ4Java" + type + size + "Decompressor.java");
args = new HashMap();
args.put("type", type);
args.put("size", size);
execute_template(compiledTemplate, dest, args);
}
}
}
def generate_compressors() {
compiledTemplate = get_template("compressor.template");
for (type : ["Safe", "Unsafe"]) {
dest = dest_file("lz4/LZ4Java" + type + "Compressor.java");
args = new HashMap();
args.put("type", type);
execute_template(compiledTemplate, dest, args);
}
}
def generate_hc_compressors() {
compiledTemplate = get_template("compressor_hc.template");
for (type : ["Safe", "Unsafe"]) {
dest = dest_file("lz4/LZ4HCJava" + type + "Compressor.java");
args = new HashMap();
args.put("type", type);
execute_template(compiledTemplate, dest, args);
}
}
def generate_xxhash() {
for (bitness : ["32", "64"]) {
compiledTemplate = get_template("xxhash" + bitness + ".template");
for (type : ["Safe", "Unsafe"]) {
dest = dest_file("xxhash/XXHash" + bitness + "Java" + type + ".java");
args = new HashMap();
args.put("type", type);
execute_template(compiledTemplate, dest, args);
}
}
}
def generate_streaming_xxhash() {
for (bitness : ["32", "64"]) {
compiledTemplate = get_template("xxhash" + bitness + "_streaming.template");
for (type : ["Safe", "Unsafe"]) {
dest = dest_file("xxhash/StreamingXXHash" + bitness + "Java" + type + ".java");
args = new HashMap();
args.put("type", type);
execute_template(compiledTemplate, dest, args);
}
}
}
generate_decompressors();
generate_compressors();
generate_hc_compressors();
generate_xxhash();
generate_streaming_xxhash();
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<name>LZ4 and xxHash</name>
<description>Java ports and bindings of the LZ4 compression algorithm and the xxHash hashing algorithm</description>
<url>https://github.com/jpountz/lz4-java</url>
<modelVersion>4.0.0</modelVersion>
<groupId>${ivy.pom.groupId}</groupId>
<artifactId>${ivy.pom.artifactId}</artifactId>
<packaging>${ivy.pom.packaging}</packaging>
<version>${ivy.pom.version}</version>
<licenses>
<license>
<name>The Apache Software License, Version 2.0</name>
<url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
<distribution>repo</distribution>
</license>
</licenses>
<scm>
<url>git://github.com/jpountz/lz4-java.git</url>
<connection>https://github.com/jpountz/lz4-java</connection>
</scm>
<developers>
<developer>
<id>jpountz</id>
<name>Adrien Grand</name>
<email>jpountz@gmail.com</email>
</developer>
</developers>
</project>
static int compress64k(${storage} src, int srcOff, int srcLen, ${storage} dest, int destOff, int destEnd) {
final int srcEnd = srcOff + srcLen;
final int srcLimit = srcEnd - LAST_LITERALS;
final int mflimit = srcEnd - MF_LIMIT;
int sOff = srcOff, dOff = destOff;
int anchor = sOff;
if (srcLen >= MIN_LENGTH) {
final short[] hashTable = new short[HASH_TABLE_SIZE_64K];
++sOff;
main:
while (true) {
// find a match
int forwardOff = sOff;
int ref;
int step = 1;
int searchMatchNb = 1 << SKIP_STRENGTH;
do {
sOff = forwardOff;
forwardOff += step;
step = searchMatchNb++ >>> SKIP_STRENGTH;
if (forwardOff > mflimit) {
break main;
}
final int h = hash64k(${utils}.readInt(src, sOff));
ref = srcOff + ${type}Utils.readShort(hashTable, h);
${type}Utils.writeShort(hashTable, h, sOff - srcOff);
} while (!LZ4${utils}.readIntEquals(src, ref, sOff));
// catch up
final int excess = LZ4${utils}.commonBytesBackward(src, ref, sOff, srcOff, anchor);
sOff -= excess;
ref -= excess;
// sequence == refsequence
final int runLen = sOff - anchor;
// encode literal length
int tokenOff = dOff++;
if (dOff + runLen + (2 + 1 + LAST_LITERALS) + (runLen >>> 8) > destEnd) {
throw new LZ4Exception("maxDestLen is too small");
}
if (runLen >= RUN_MASK) {
${utils}.writeByte(dest, tokenOff, RUN_MASK << ML_BITS);
dOff = LZ4${utils}.writeLen(runLen - RUN_MASK, dest, dOff);
} else {
${utils}.writeByte(dest, tokenOff, runLen << ML_BITS);
}
// copy literals
LZ4${utils}.wildArraycopy(src, anchor, dest, dOff, runLen);
dOff += runLen;
while (true) {
// encode offset
${utils}.writeShortLE(dest, dOff, (short) (sOff - ref));
dOff += 2;
// count nb matches
sOff += MIN_MATCH;
ref += MIN_MATCH;
final int matchLen = LZ4${utils}.commonBytes(src, ref, sOff, srcLimit);
if (dOff + (1 + LAST_LITERALS) + (matchLen >>> 8) > destEnd) {
throw new LZ4Exception("maxDestLen is too small");
}
sOff += matchLen;
// encode match len
if (matchLen >= ML_MASK) {
${utils}.writeByte(dest, tokenOff, ${utils}.readByte(dest, tokenOff) | ML_MASK);
dOff = LZ4${utils}.writeLen(matchLen - ML_MASK, dest, dOff);
} else {
${utils}.writeByte(dest, tokenOff, ${utils}.readByte(dest, tokenOff) | matchLen);
}
// test end of chunk
if (sOff > mflimit) {
anchor = sOff;
break main;
}
// fill table
${type}Utils.writeShort(hashTable, hash64k(${utils}.readInt(src, sOff - 2)), sOff - 2 - srcOff);
// test next position
final int h = hash64k(${utils}.readInt(src, sOff));
ref = srcOff + ${type}Utils.readShort(hashTable, h);
${type}Utils.writeShort(hashTable, h, sOff - srcOff);
if (!LZ4${utils}.readIntEquals(src, sOff, ref)) {
break;
}
tokenOff = dOff++;
${utils}.writeByte(dest, tokenOff, 0);
}
// prepare next loop
anchor = sOff++;
}
}
dOff = LZ4${utils}.lastLiterals(src, anchor, srcEnd - anchor, dest, dOff, destEnd);
return dOff - destOff;
}
@Override
public int compress(${storage} src, final int srcOff, int srcLen, ${storage} dest, final int destOff, int maxDestLen) {
@if{ storage == "ByteBuffer"}
if (src.hasArray() && dest.hasArray()) {
return compress(src.array(), srcOff, srcLen, dest.array(), destOff, maxDestLen);
}
src = ${utils}.inNativeByteOrder(src);
dest = ${utils}.inNativeByteOrder(dest);
@end{}
${utils}.checkRange(src, srcOff, srcLen);
${utils}.checkRange(dest, destOff, maxDestLen);
final int destEnd = destOff + maxDestLen;
if (srcLen < LZ4_64K_LIMIT) {
return compress64k(src, srcOff, srcLen, dest, destOff, destEnd);
}
final int srcEnd = srcOff + srcLen;
final int srcLimit = srcEnd - LAST_LITERALS;
final int mflimit = srcEnd - MF_LIMIT;
int sOff = srcOff, dOff = destOff;
int anchor = sOff++;
final int[] hashTable = new int[HASH_TABLE_SIZE];
Arrays.fill(hashTable, anchor);
main:
while (true) {
// find a match
int forwardOff = sOff;
int ref;
int step = 1;
int searchMatchNb = 1 << SKIP_STRENGTH;
int back;
do {
sOff = forwardOff;
forwardOff += step;
step = searchMatchNb++ >>> SKIP_STRENGTH;
if (forwardOff > mflimit) {
break main;
}
final int h = hash(${utils}.readInt(src, sOff));
ref = ${type}Utils.readInt(hashTable, h);
back = sOff - ref;
${type}Utils.writeInt(hashTable, h, sOff);
} while (back >= MAX_DISTANCE || !LZ4${utils}.readIntEquals(src, ref, sOff));
final int excess = LZ4${utils}.commonBytesBackward(src, ref, sOff, srcOff, anchor);
sOff -= excess;
ref -= excess;
// sequence == refsequence
final int runLen = sOff - anchor;
// encode literal length
int tokenOff = dOff++;
if (dOff + runLen + (2 + 1 + LAST_LITERALS) + (runLen >>> 8) > destEnd) {
throw new LZ4Exception("maxDestLen is too small");
}
if (runLen >= RUN_MASK) {
${utils}.writeByte(dest, tokenOff, RUN_MASK << ML_BITS);
dOff = LZ4${utils}.writeLen(runLen - RUN_MASK, dest, dOff);
} else {
${utils}.writeByte(dest, tokenOff, runLen << ML_BITS);
}
// copy literals
LZ4${utils}.wildArraycopy(src, anchor, dest, dOff, runLen);
dOff += runLen;
while (true) {
// encode offset
${utils}.writeShortLE(dest, dOff, back);
dOff += 2;
// count nb matches
sOff += MIN_MATCH;
final int matchLen = LZ4${utils}.commonBytes(src, ref + MIN_MATCH, sOff, srcLimit);
if (dOff + (1 + LAST_LITERALS) + (matchLen >>> 8) > destEnd) {
throw new LZ4Exception("maxDestLen is too small");
}
sOff += matchLen;
// encode match len