Commit 71cfbdcd authored by Hilko Bengen

Imported Upstream version 1.0.1

*.iml
*.ipr
*.iws
*.log
.DS_Store
.classpath
.settings
.project
target
pom.xml.releaseBackup
release.properties
*~
temp-testng-customsuite.xml
test-output
.externalToolBuilders
server/logs
runtime
logs
Copyright 2009-2010 Ning, Inc.
Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.
# Ning-Compress
## Overview
Ning-compress is a Java library for encoding and decoding data in LZF format, written by Tatu Saloranta (tatu.saloranta@iki.fi).
Data format and algorithm based on original [LZF library](http://freshmeat.net/projects/liblzf) by Marc A Lehmann. See [LZF Format](https://github.com/ning/compress/wiki/LZFFormat) for full description.
Format differs slightly from some other adaptations, such as one used by [H2 database project](http://www.h2database.com) (by Thomas Mueller); although internal block compression structure is the same, block identifiers differ.
This package uses the original LZF identifiers to be 100% compatible with existing command-line lzf tool(s).
The LZF algorithm itself is optimized for speed, with somewhat more modest compression: compared to Deflate (the algorithm gzip uses), LZF can be 5-6 times as fast to compress, and twice as fast to decompress.
## Usage
See [Wiki](https://github.com/ning/compress/wiki) for more details; here's a "TL;DR" version.
Both compression and decompression can be done either via a streaming approach:
    InputStream in = new LZFInputStream(new FileInputStream("data.lzf"));
    OutputStream out = new LZFOutputStream(new FileOutputStream("results.lzf"));
    InputStream compIn = new LZFCompressingInputStream(new FileInputStream("stuff.txt"));
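As a fuller sketch of the streaming side (class name and file paths here are illustrative, and this assumes the compress-lzf jar on the classpath), a whole-file compression pass is just an ordinary stream copy, since `LZFOutputStream` chunks and encodes transparently:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import com.ning.compress.lzf.LZFOutputStream;

public class LzfFileCompress {
    // Copies 'src' into 'dst'; LZFOutputStream buffers bytes into LZF chunks
    // and writes each chunk in compressed form as data flows through.
    public static void compress(File src, File dst) throws IOException {
        try (InputStream in = new FileInputStream(src);
             OutputStream out = new LZFOutputStream(new FileOutputStream(dst))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) >= 0) {
                out.write(buf, 0, n);
            }
        }
    }
}
```

(try-with-resources needs Java 7+, though the library itself targets 1.6; a plain try/finally with explicit `close()` calls works equally well.)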
or by block operation:
    byte[] compressed = LZFEncoder.encode(uncompressedData);
    byte[] uncompressed = LZFDecoder.decode(compressedData);
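For payloads that already fit in memory the block API is a one-liner each way; a minimal round-trip sketch (variable and class names are illustrative, and this assumes the compress-lzf jar on the classpath):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

import com.ning.compress.lzf.LZFDecoder;
import com.ning.compress.lzf.LZFEncoder;

public class LzfRoundTrip {
    public static void main(String[] args) throws Exception {
        // Repetitive input compresses well: LZF is a plain Lempel-Ziv matcher
        byte[] original = "to be or not to be, to be or not to be".getBytes(StandardCharsets.UTF_8);

        byte[] compressed = LZFEncoder.encode(original); // block-compress whole buffer
        byte[] restored = LZFDecoder.decode(compressed); // decode it back

        if (!Arrays.equals(original, restored)) {
            throw new AssertionError("round-trip mismatch");
        }
    }
}
```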
and you can even use the LZF jar as a command-line tool (it has a manifest that points to 'com.ning.compress.lzf.LZF' as the class with the main() method to call), like so:
    java -jar compress-lzf-1.0.0.jar
which will display the usage arguments for `-c`(ompressing) or `-d`(ecompressing) files.
## Interoperability
Besides Java support, LZF codecs / bindings exist for non-JVM languages as well:
* C: [liblzf](http://oldhome.schmorp.de/marc/liblzf.html) (the original LZF package!)
* Go: [Golly](https://github.com/tav/golly)
* Javascript(!): [freecode LZF](http://freecode.com/projects/lzf) (or via [SourceForge](http://sourceforge.net/projects/lzf/))
* Perl: [Compress::LZF](http://search.cpan.org/dist/Compress-LZF/LZF.pm)
* Python: [Python-LZF](https://github.com/teepark/python-lzf)
* Ruby: [glebtv/lzf](https://github.com/glebtv/lzf), [LZF/Ruby](https://rubyforge.org/projects/lzfruby/)
## Related
Check out [jvm-compressor-benchmark](https://github.com/ning/jvm-compressor-benchmark) for a comparison of the space- and time-efficiency of this LZF implementation, relative to other available Java-accessible compression libraries.
## More
[Project Wiki](https://github.com/ning/compress/wiki).
1.0.1 (08-Apr-2014)
#35: Fix a problem with closing of `DeflaterOutputStream` (for gzip output)
that could cause corrupt state for reusable `Deflater`
(contributed by thmd@github)
1.0.0 (02-Dec-2013)
#34: Add `ChunkEncoder.appendEncodedIfCompresses()` for conditional compression;
useful for building efficient "compress but only if it makes enough difference"
processing systems
0.9.9 (25-Sep-2013)
#14: Added parallel LZF compression, contributed by Cedrik
(javabean@github)
#25: Allow early termination of push-style `Uncompressor` operation
#32: Fix for a rare NPE
(suggested by francoisforster@github)
0.9.8 (09-Mar-2013)
#24: Problems uncompressing certain types of binary documents
- Minor perf improvement for 'appendEncoded', was not reusing buffers
0.9.7 (06-Mar-2013)
#23: Add UnsafeChunkEncoder that uses 'sun.misc.Unsafe' for additional Oomph.
* Add LZFEncoder.estimateMaxWorkspaceSize() to help allocate work buffers.
#22: Add method(s) to allow encoding into caller-provided (pre-allocated) buffer.
0.9.6 (05-Sep-2012)
#17: Add IOException subtypes 'LZFException' and 'GZIPException' (with
common supertype of 'CompressionFormatException') to allow for better
catching of decompression errors
#19: (more) Efficient skipping with LZFInputStream, LZFFileInputStream;
can skip full chunks without decoding -- much faster (as per simple tests)
0.9.5 (25-May-2012)
* Add 'LZFCompressingInputStream' to allow streaming compression
"in reverse" (compared to LZFOutputStream)
* Add GZIP support functionality:
* 'OptimizedGZIPInputStream', 'OptimizedGZIPOutputStream' which add buffer
(and Inflater/Deflater) recycling for improved performance compared to
default JDK implementations (uses same native ZLIB library for actual
decompression)
* Add "push-mode" handler, 'Uncompressor' to be used for un-/decompression
with non-blocking push-style data sources (like async-http-client)
* Implementations for LZF (LZFUncompressor) and GZIP (GZIPUncompressor)
* 'UncompressorOutputStream' convenience wrapper to expose 'Uncompressor'
as 'OutputStream'
0.9.3
* Fixed Issue #12: Command-line tool out of memory
(reported by nodarret@github)
* Implemented Issue #16: Add LZFInputStream.readAndWrite(...) method for copying
uncompressed data, avoiding an intermediate copy.
* Fix for Issue #15: LZFDecoder not passing 'offset', 'length' params
(reported by T.Effland)
* Fix for Issue #13: problems with Unsafe decoder on some platforms
0.9.0 (and prior)
* Rewrote decoder to allow ChunkDecoder variants, to allow optional use of
sun.misc.Unsafe (which can boost uncompression speed by up to +50%)
* #11: Input/OutputStreams not throwing IOException if reading/writing
after close() called, should be.
(reported by Dain S)
* Fix an NPE in BufferRecycler
(reported by Matt Abrams, abramsm@gmail.com)
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.sonatype.oss</groupId>
<artifactId>oss-parent</artifactId>
<version>7</version>
</parent>
<groupId>com.ning</groupId>
<artifactId>compress-lzf</artifactId>
<name>Compress-LZF</name>
<version>1.0.1</version>
<packaging>bundle</packaging>
<description>
Compression codec for LZF encoding, particularly fast at encoding/decoding, with reasonable compression.
Compressor is basic Lempel-Ziv codec, without Huffman (deflate/gzip) or statistical post-encoding.
See "http://oldhome.schmorp.de/marc/liblzf.html" for more on original LZF package.
</description>
<prerequisites>
<maven>2.2.1</maven>
</prerequisites>
<url>http://github.com/ning/compress</url>
<scm>
<connection>scm:git:git@github.com:ning/compress.git</connection>
<developerConnection>scm:git:git@github.com:ning/compress.git</developerConnection>
<url>http://github.com/ning/compress</url>
</scm>
<issueManagement>
<url>http://github.com/ning/compress/issues</url>
</issueManagement>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<developers>
<developer>
<id>tatu</id>
<name>Tatu Saloranta</name>
<email>tatu.saloranta@iki.fi</email>
</developer>
</developers>
<contributors>
<contributor>
<name>Jon Hartlaub</name>
<email>jhartlaub@gmail.com</email>
</contributor>
<contributor>
<name>Cédrik Lime</name>
<email>2013@cedrik.fr</email>
</contributor>
</contributors>
<licenses>
<license>
<name>Apache License 2.0</name>
<url>http://www.apache.org/licenses/LICENSE-2.0.html</url>
<distribution>repo</distribution>
</license>
</licenses>
<dependencies>
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>6.5.2</version>
<type>jar</type>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<defaultGoal>install</defaultGoal>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<!-- 1.6 since 0.9.7 -->
<configuration>
<source>1.6</source>
<target>1.6</target>
<encoding>UTF-8</encoding>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>2.1.2</version>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>2.6.1</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
<encoding>UTF-8</encoding>
<links>
<link>http://docs.oracle.com/javase/6/docs/api/</link>
</links>
</configuration>
<executions>
<execution>
<id>attach-javadocs</id>
<phase>verify</phase>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-release-plugin</artifactId>
<version>2.1</version>
<configuration>
<mavenExecutorId>forked-path</mavenExecutorId>
</configuration>
</plugin>
<!-- Plus, let's make jars OSGi bundles as well -->
<plugin>
<groupId>org.apache.felix</groupId>
<artifactId>maven-bundle-plugin</artifactId>
<version>2.3.7</version>
<extensions>true</extensions>
<configuration>
<instructions><!-- note: artifact id, name, version and description use defaults (which are fine) -->
<Bundle-Vendor>http://ning.com</Bundle-Vendor>
<Import-Package />
<!-- if using high-perf decoder: -->
<DynamicImport-Package>sun.misc</DynamicImport-Package>
<Private-Package>
com.ning.compress.lzf.impl
</Private-Package>
<!-- Export-Package default: set of packages in local Java sources, excluding the default package '.' and any packages containing 'impl' or 'internal' -->
<!--Export-Package>
com.ning.compress,
com.ning.compress.gzip,
com.ning.compress.lzf,
com.ning.compress.lzf.parallel,
com.ning.compress.lzf.util
</Export-Package-->
<Main-Class>com.ning.compress.lzf.LZF</Main-Class>
</instructions>
</configuration>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<id>release-sign-artifacts</id>
<activation>
<property>
<name>performRelease</name>
<value>true</value>
</property>
</activation>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-gpg-plugin</artifactId>
<executions>
<execution>
<id>sign-artifacts</id>
<phase>verify</phase>
<goals>
<goal>sign</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
<profile>
<id>offline-testing</id>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<groups>standalone</groups>
</configuration>
</plugin>
</plugins>
</build>
</profile>
<profile>
<id>online-testing</id>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<groups>standalone, online</groups>
</configuration>
</plugin>
</plugins>
</build>
</profile>
</profiles>
</project>
#!/bin/sh
java -Xmx64m -server \
-cp target/classes:target/test-classes \
-Xrunhprof:cpu=samples,depth=10,verbose=n,interval=2 \
perf.ManualCompressComparison \
$*
#!/bin/sh
java -Xmx64m -server \
-cp target/classes:target/test-classes \
-Xrunhprof:cpu=samples,depth=10,verbose=n,interval=2 \
perf.ManualSkipComparison \
$*
#!/bin/sh
java -Xmx64m -server \
-cp target/classes:target/test-classes \
-Xrunhprof:cpu=samples,depth=10,verbose=n,interval=2 \
perf.ManualUncompressComparison \
$*
#!/bin/sh
java -Xmx200m -server \
-cp target/classes:target/test-classes \
perf.ManualCompressComparison \
$*
#!/bin/sh
java -Xmx64m -server \
-cp target/classes:target/test-classes \
perf.ManualSkipComparison \
$*
#!/bin/sh
java -Xmx200m -server \
-cp target/classes:target/test-classes \
perf.ManualUncompressComparison \
$*
package com.ning.compress;
import java.lang.ref.SoftReference;
/**
* Simple helper class to encapsulate details of basic buffer
* recycling scheme, which helps a lot (as per profiling) for
* smaller encoding cases.
*
* @author Tatu Saloranta (tatu.saloranta@iki.fi)
*/
public final class BufferRecycler
{
private final static int MIN_ENCODING_BUFFER = 4000;
private final static int MIN_OUTPUT_BUFFER = 8000;
/**
* This <code>ThreadLocal</code> contains a {@link java.lang.ref.SoftReference}
* to a {@link BufferRecycler} used to provide a low-cost
* buffer recycling for buffers we need for encoding, decoding.
*/
final protected static ThreadLocal<SoftReference<BufferRecycler>> _recyclerRef
= new ThreadLocal<SoftReference<BufferRecycler>>();
private byte[] _inputBuffer;
private byte[] _outputBuffer;
private byte[] _decodingBuffer;
private byte[] _encodingBuffer;
private int[] _encodingHash;
/**
* Accessor to get thread-local recycler instance
*/
public static BufferRecycler instance()
{
SoftReference<BufferRecycler> ref = _recyclerRef.get();
BufferRecycler br = (ref == null) ? null : ref.get();
if (br == null) {
br = new BufferRecycler();
_recyclerRef.set(new SoftReference<BufferRecycler>(br));
}
return br;
}
/*
///////////////////////////////////////////////////////////////////////
// Buffers for encoding (output)
///////////////////////////////////////////////////////////////////////
*/
public byte[] allocEncodingBuffer(int minSize)
{
byte[] buf = _encodingBuffer;
if (buf == null || buf.length < minSize) {
buf = new byte[Math.max(minSize, MIN_ENCODING_BUFFER)];
} else {
_encodingBuffer = null;
}
return buf;
}
public void releaseEncodeBuffer(byte[] buffer)
{
if (_encodingBuffer == null || (buffer != null && buffer.length > _encodingBuffer.length)) {
_encodingBuffer = buffer;
}
}
public byte[] allocOutputBuffer(int minSize)
{
byte[] buf = _outputBuffer;
if (buf == null || buf.length < minSize) {
buf = new byte[Math.max(minSize, MIN_OUTPUT_BUFFER)];
} else {
_outputBuffer = null;
}
return buf;
}
public void releaseOutputBuffer(byte[] buffer)
{
if (_outputBuffer == null || (buffer != null && buffer.length > _outputBuffer.length)) {
_outputBuffer = buffer;
}
}
public int[] allocEncodingHash(int suggestedSize)
{
int[] buf = _encodingHash;
if (buf == null || buf.length < suggestedSize) {
buf = new int[suggestedSize];
} else {
_encodingHash = null;
}
return buf;
}
public void releaseEncodingHash(int[] buffer)
{
if (_encodingHash == null || (buffer != null && buffer.length > _encodingHash.length)) {
_encodingHash = buffer;
}
}
/*
///////////////////////////////////////////////////////////////////////
// Buffers for decoding (input)
///////////////////////////////////////////////////////////////////////
*/
public byte[] allocInputBuffer(int minSize)
{
byte[] buf = _inputBuffer;
if (buf == null || buf.length < minSize) {
buf = new byte[Math.max(minSize, MIN_OUTPUT_BUFFER)];
} else {
_inputBuffer = null;
}
return buf;
}
public void releaseInputBuffer(byte[] buffer)
{
if (_inputBuffer == null || (buffer != null && buffer.length > _inputBuffer.length)) {
_inputBuffer = buffer;
}
}
public byte[] allocDecodeBuffer(int size)
{
byte[] buf = _decodingBuffer;
if (buf == null || buf.length < size) {
buf = new byte[size];
} else {
_decodingBuffer = null;
}
return buf;
}
public void releaseDecodeBuffer(byte[] buffer)
{
if (_decodingBuffer == null || (buffer != null && buffer.length > _decodingBuffer.length)) {
_decodingBuffer = buffer;
}
}
}
package com.ning.compress;
import java.io.IOException;
/**
* Base exception used by compression codecs when encountering a problem
* with underlying data format, usually due to data corruption.
*/
public class CompressionFormatException extends IOException
{
private static final long serialVersionUID = 1L;
protected CompressionFormatException(String message) {
super(message);
}
protected CompressionFormatException(Throwable t) {
super();
initCause(t);
}
protected CompressionFormatException(String message, Throwable t) {
super(message);
initCause(t);
}
}
package com.ning.compress;
import java.io.IOException;
/**
* Interface used by {@link Uncompressor} implementations: receives
* uncompressed data and processes it appropriately.
*/
public interface DataHandler
{
/**
* Method called with uncompressed data as it becomes available.
*<p>
* NOTE: return value was added (from void to boolean) in 0.9.9
*
* @return True, if caller should process and feed more data; false if
* caller is not interested in more data and processing should be terminated
* (and {@link #allDataHandled} should be called immediately)
*/
public boolean handleData(byte[] buffer, int offset, int len) throws IOException;
/**
* Method called after last call to {@link #handleData}, for successful
operation, if and when the caller is informed about the end of content.
* Note that if an exception thrown by {@link #handleData} has caused processing
* to be aborted, this method might not get called.
* Implementation may choose to free resources, flush state, or perform
* validation at this point.
*/
public void allDataHandled() throws IOException;
}
package com.ning.compress;
import java.io.IOException;
/**
* Abstract class that defines "push" style API for various uncompressors
(aka decompressors or decoders). Implementations are alternatives to stream-based
uncompressors (such as {@link com.ning.compress.lzf.LZFInputStream})
* in cases where "push" operation is important and/or blocking is not allowed;
* for example, when handling asynchronous HTTP responses.
*<p>
* Note that API does not define the way that listener is attached: this is
* typically passed through to constructor of the implementation.
*
* @author Tatu Saloranta (tatu.saloranta@iki.fi)
*/
public abstract class Uncompressor
{
/**
* Method called to feed more compressed data to be uncompressed, and
* sent to possible listeners.
*<p>
* NOTE: return value was added (from void to boolean) in 0.9.9
*
* @return True, if caller should process and feed more data; false if
* caller is not interested in more data and processing should be terminated.
* (and {@link #complete} should be called immediately)
*/
public abstract boolean feedCompressedData(byte[] comp, int offset, int len)
throws IOException;
/**
* Method called to indicate that all data to uncompress has already been fed.
* This typically results in last block of data being uncompressed, and results
* being sent to listener(s); but may also throw an exception if incomplete
* block was passed.
*/
public abstract void complete() throws IOException;
}
package com.ning.compress;
import java.io.*;
/**
* Simple wrapper around {@link Uncompressor}, to help
* with inter-operability.
*/
public class UncompressorOutputStream extends OutputStream
{
protected final Uncompressor _uncompressor;
private byte[] _singleByte = null;
public UncompressorOutputStream(Uncompressor uncomp)
{
_uncompressor = uncomp;
}
/**
* Call to this method will result in call to
* {@link Uncompressor#complete()}, which is idempotent
* (i.e. can be called multiple times without ill effects).
*/
@Override
public void close() throws IOException {
_uncompressor.complete();
}
@Override
public void flush() { }
@Override
public void write(byte[] b) throws IOException {
_uncompressor.feedCompressedData(b, 0, b.length);
}
@Override
public void write(byte[] b, int off, int len) throws IOException {
_uncompressor.feedCompressedData(b, off, len);
}
@Override
public void write(int b) throws IOException
{
if (_singleByte == null) {
_singleByte = new byte[1];
}
_singleByte[0] = (byte) b;
_uncompressor.feedCompressedData(_singleByte, 0, 1);
}
}