Commit b4abe7a1 authored by Ole Streicher

Updated version 0.1+2017.07.31 from 'upstream/0.1+2017.07.31'

with Debian dir 009d257e782df7e0aac1daca14d7eaa368abefd6
parents 47bbe647 2a9f27b0
Extended column convention for FITS BINTABLE
--------------------------------------------
The BINTABLE extension type as described in the FITS Standard
(FITS Standard v3.0, sec 7.3) requires table column metadata
to be described using 8-character keywords of the form XXXXXnnn,
where XXXXX represents one of an open set of mandatory, reserved
or user-defined root keywords up to five characters in length,
for instance TFORM (mandatory), TUNIT (reserved), TUCD (user-defined).
The nnn part is an integer between 1 and 999 indicating the
index of the column to which the keyword in question refers.
Since the header syntax confines this indexed part of the keyword
to three digits, there is an upper limit of 999 columns in
BINTABLE extensions.
Note that the FITS/BINTABLE format does not entail any restriction on
the storage of column *data* beyond the 999 column limit in the data
part of the HDU; the problem is just that client software
cannot be informed about the layout of this data using the
header cards in the usual way.
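The limit can be illustrated with a minimal sketch. The class and method names (`IndexedKeyword`, `of`) are hypothetical and not part of any FITS library; the only rules encoded are the ones stated above, namely a root of at most five characters followed by an index between 1 and 999.

```java
/** Illustrative helper (hypothetical name): builds XXXXXnnn-style keywords. */
final class IndexedKeyword {

    /** Longest keyword root allowed by the XXXXXnnn form. */
    static final int MAX_ROOT_LENGTH = 5;

    /** Largest column index that fits in the three-digit nnn part. */
    static final int MAX_INDEX = 999;

    private IndexedKeyword() {
    }

    /**
     * Concatenates a root such as "TFORM" with a 1-based column index.
     *
     * @throws IllegalArgumentException if the root is empty or too long,
     *         or if the index is outside the range 1..999
     */
    static String of(String root, int icol) {
        if (root.isEmpty() || root.length() > MAX_ROOT_LENGTH) {
            throw new IllegalArgumentException("Bad keyword root: " + root);
        }
        if (icol < 1 || icol > MAX_INDEX) {
            throw new IllegalArgumentException(
                "Column index " + icol + " is outside 1.." + MAX_INDEX);
        }
        return root + icol;
    }
}
```

With this sketch, `IndexedKeyword.of("TFORM", 999)` yields `TFORM999`, while an index of 1000 is rejected because it cannot be written in the three-digit part of the keyword.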
In some cases it is desirable to store FITS tables with a column
count greater than 999. Whether that's a good idea is not within
the scope of this discussion.
To achieve this, I propose the following convention.
Definitions:
- 'BINTABLE columns' are those columns defined using the
FITS BINTABLE standard
- 'Data columns' are the columns to be encoded
- N_TOT is the total number of data columns to be stored
- Data columns with (1-based) indexes from 999 to N_TOT inclusive
are known as 'extended' columns. Their data is stored
within the 'container' column.
- BINTABLE column 999 is known as the 'container' column.
It contains the byte data for all the 'extended' columns.
Convention:
- All column data (for columns 1 to N_TOT) is laid out in the data part
of the HDU in exactly the same way as if there were no 999-column
limit.
- The TFIELDS header is declared with the value 999.
- The container column is declared in the header with some
TFORM999 value corresponding to the total field length required
by all the extended columns ('B' is the obvious data type, but
any legal TFORM value that gives the right width MAY be used).
The byte count implied by TFORM999 MUST be equal to the
total byte count implied by all extended columns.
- Other XXXXX999 headers MAY optionally be declared to describe
the container column in accordance with the usual rules,
e.g. TTYPE999 to give it a name.
- The NAXIS1 header is declared in the usual way to give the width
of a table row in bytes. This is equal to the sum of the widths of
all the BINTABLE columns as usual. It is also equal to the sum of
the widths of all the data columns, which has the same value.
- Headers for Data columns 1-998 are declared as usual,
corresponding to BINTABLE columns 1-998.
- Keyword XT_ICOL indicates the index of the container column.
It MUST be present with the integer value 999 to indicate
that this convention is in use.
- Keyword XT_NCOL indicates the total number of data columns encoded.
It MUST be present with an integer value equal to N_TOT.
- Metadata for each extended column is encoded with keywords
of the form HIERARCH XT XXXXXnnnnn, where XXXXX
are the same keyword roots as used for normal BINTABLE extensions,
and nnnnn is a decimal number written as usual (no leading zeros,
as many digits as are required). Thus the formats for data
columns 999, 1000, 1001 etc are declared with the keywords
HIERARCH XT TFORM999, HIERARCH XT TFORM1000, HIERARCH XT TFORM1001
etc. Note this uses the ESO HIERARCH convention described at
https://fits.gsfc.nasa.gov/registry/hierarch_keyword.html.
The "name space" token has been chosen as "XT" (extended table).
- This convention MUST NOT be used for N_TOT<=999.
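The writer-side rules above can be sketched as follows. This is an illustration under stated assumptions, not the STIL implementation: the class `WideLayoutSketch` and its methods are hypothetical, column widths are supplied as a plain array of byte counts, and keywords are rendered as they appear in a header card rather than in any library-specific form.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the header layout implied by the convention. */
final class WideLayoutSketch {

    /** Index of the container column, fixed at 999 by the convention. */
    static final int ICOL_CONTAINER = 999;

    private WideLayoutSketch() {
    }

    /**
     * Byte count that TFORM999 must imply: the summed widths of
     * data columns 999..N_TOT. colWidths[i] is the width in bytes
     * of data column i+1, so colWidths.length is N_TOT.
     */
    static long containerByteCount(int[] colWidths) {
        requireWide(colWidths.length);
        long nbyte = 0;
        for (int i = ICOL_CONTAINER - 1; i < colWidths.length; i++) {
            nbyte += colWidths[i];
        }
        return nbyte;
    }

    /** Keyword for an extended column, e.g. "HIERARCH XT TFORM1000". */
    static String extendedKeyword(String root, int icol) {
        if (icol < ICOL_CONTAINER) {
            throw new IllegalArgumentException("Not an extended column: " + icol);
        }
        return "HIERARCH XT " + root + icol;
    }

    /** TFORM keywords in order: columns 1-998, container, extended columns. */
    static List<String> tformKeywords(int nTot) {
        requireWide(nTot);
        List<String> keys = new ArrayList<>();
        for (int icol = 1; icol < ICOL_CONTAINER; icol++) {
            keys.add("TFORM" + icol);
        }
        keys.add("TFORM" + ICOL_CONTAINER);   // describes the container column
        for (int icol = ICOL_CONTAINER; icol <= nTot; icol++) {
            keys.add(extendedKeyword("TFORM", icol));
        }
        return keys;
    }

    /** The convention MUST NOT be used for N_TOT <= 999. */
    private static void requireWide(int nTot) {
        if (nTot <= ICOL_CONTAINER) {
            throw new IllegalArgumentException("N_TOT must exceed 999: " + nTot);
        }
    }
}
```

Note that data column 999 appears twice in the keyword list: once as `TFORM999`, which describes the container, and once as `HIERARCH XT TFORM999`, which describes the first extended column stored inside it.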
The resulting HDU is a completely legal FITS BINTABLE extension.
Readers aware of this convention may use it to extract column
data and metadata beyond the 999-column limit.
Readers unaware of this convention will see 998 columns in their
intended form, and an additional (possibly large) column 999
which contains byte data but which cannot be easily interpreted.
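How an aware reader might decide whether the convention applies can be sketched as below. The class `WideReaderSketch` is hypothetical, and the header is modelled simply as a map from keyword to integer value; a real reader would obtain these values from its own header parser.

```java
import java.util.Map;

/** Hypothetical sketch of reader-side detection of the convention. */
final class WideReaderSketch {

    private WideReaderSketch() {
    }

    /**
     * Returns the number of data columns a convention-aware reader
     * should expose. Falls back to TFIELDS unless XT_ICOL and XT_NCOL
     * are both present and consistent with the rules of the convention.
     *
     * @param intCards integer-valued header cards keyed by keyword
     */
    static int dataColumnCount(Map<String, Integer> intCards) {
        Integer tfields = intCards.get("TFIELDS");
        if (tfields == null) {
            throw new IllegalArgumentException("No TFIELDS card");
        }
        Integer icol = intCards.get("XT_ICOL");
        Integer ncol = intCards.get("XT_NCOL");
        boolean inUse = icol != null && ncol != null
                     && icol == 999        // container index MUST be 999
                     && tfields == 999     // TFIELDS is declared as 999
                     && ncol > 999;        // MUST NOT be used for N_TOT<=999
        return inUse ? ncol : tfields;
    }
}
```

A reader that does not recognise the keywords takes the fallback path implicitly and simply reports the TFIELDS value.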
An example header might look like this:
XTENSION= 'BINTABLE' / binary table extension
BITPIX = 8 / 8-bit bytes
NAXIS = 2 / 2-dimensional table
NAXIS1 = 9229 / width of table in bytes
NAXIS2 = 26 / number of rows in table
PCOUNT = 0 / size of special data area
GCOUNT = 1 / one data group
TFIELDS = 999 / number of columns
XT_ICOL = 999 / index of container column
XT_NCOL = 1204 / total columns including extended
TTYPE1 = 'posid_1 ' / label for column 1
TFORM1 = 'J ' / format for column 1
TTYPE2 = 'instrument_1' / label for column 2
TFORM2 = '4A ' / format for column 2
TTYPE3 = 'edge_code_1' / label for column 3
TFORM3 = 'I ' / format for column 3
TUCD3 = 'meta.code.qual'
...
TTYPE998= 'var_min_s_2' / label for column 998
TFORM998= 'D ' / format for column 998
TUNIT998= 'counts/s' / units for column 998
TTYPE999= 'XT_MORECOLS' / label for column 999
TFORM999= '813I ' / format for column 999
HIERARCH XT TTYPE999 = 'var_min_u_2' / label for column 999
HIERARCH XT TFORM999 = 'D' / format for column 999
HIERARCH XT TUNIT999 = 'counts/s' / units for column 999
HIERARCH XT TTYPE1000 = 'var_prob_h_2' / label for column 1000
HIERARCH XT TFORM1000 = 'D' / format for column 1000
...
HIERARCH XT TTYPE1203 = 'var_prob_w_2' / label for column 1203
HIERARCH XT TFORM1203 = 'D' / format for column 1203
HIERARCH XT TTYPE1204 = 'var_sigma_w_2' / label for column 1204
HIERARCH XT TFORM1204 = 'D' / format for column 1204
HIERARCH XT TUNIT1204 = 'counts/s' / units for column 1204
END
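In the example, TFORM999 = '813I' declares a container of 813 16-bit integers, i.e. 1626 bytes per row, which must match the combined width of the 206 extended columns (999 to 1204). The sketch below shows one way to turn a TFORM value into a byte count. The class name `TformWidth` is hypothetical, and the variable-length array descriptor types (P and Q) are deliberately left unsupported.

```java
/** Hypothetical sketch: byte width implied by a BINTABLE TFORM value. */
final class TformWidth {

    private TformWidth() {
    }

    /**
     * Parses a TFORM value of the form rT (repeat count r, default 1,
     * followed by a type code T) and returns its width in bytes.
     *
     * @throws IllegalArgumentException for malformed or unsupported values
     */
    static long byteCount(String tform) {
        String t = tform.trim();
        int i = 0;
        while (i < t.length() && t.charAt(i) >= '0' && t.charAt(i) <= '9') {
            i++;
        }
        if (i == t.length()) {
            throw new IllegalArgumentException("No type code in: " + tform);
        }
        long repeat = i == 0 ? 1 : Long.parseLong(t.substring(0, i));
        switch (t.charAt(i)) {
            case 'L': case 'B': case 'A':
                return repeat;               // 1 byte per element
            case 'I':
                return repeat * 2;           // 16-bit integer
            case 'J': case 'E':
                return repeat * 4;           // 32-bit integer or float
            case 'K': case 'D': case 'C':
                return repeat * 8;           // 64-bit integer, double, complex
            case 'M':
                return repeat * 16;          // double-precision complex
            case 'X':
                return (repeat + 7) / 8;     // bits, padded to whole bytes
            default:
                throw new IllegalArgumentException("Unsupported TFORM: " + tform);
        }
    }
}
```

Summing `byteCount` over the HIERARCH XT TFORMnnnnn values of all extended columns and comparing the total with `byteCount` of TFORM999 gives a simple consistency check for a header of this kind.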
This general approach was suggested by William Pence on the FITSBITS
list in June 2012
(https://listmgr.nrao.edu/pipermail/fitsbits/2012-June/002367.html),
and by Francois-Xavier Pineau (CDS) in private conversation in 2016.
The details have been filled in by Mark Taylor (Bristol).
It was discussed in some detail on the FITSBITS list in July 2017
(https://listmgr.nrao.edu/pipermail/fitsbits/2017-July/002967.html)
--------------------------------------------------------------------
Note: a previous variant of this convention was proposed in which
the metadata for the extended columns was declared by extending
the numbering scheme using three-digit base-26 representations.
It was identical to the above, except that:
- Metadata for each extended column is encoded with keywords
of the form XXXXXaaa, where XXXXX are the same keyword roots
as used for normal BINTABLE extensions, and aaa is a 3-digit
value in base 26 using the characters 'A' (0 in base 26) to
'Z' (25 in base 26), and giving the 1-based data column index
minus 999. The sequence aaa MUST be exactly three characters
long (leading 'A's are required). Thus the formats for data
columns 999, 1000, 1001, etc are declared with the keywords
TFORMAAA, TFORMAAB, TFORMAAC etc.
This convention can therefore allow encoding of tables with data
column counts N_TOT up to 998+26^3 = 18574.
In that case the header looks identical to the previous example up
to TFORM999, but the remaining entries differ:
TTYPE998= 'var_min_s_2' / label for column 998
TFORM998= 'D ' / format for column 998
TUNIT998= 'counts/s' / units for column 998
TTYPE999= 'XT_MORECOLS' / label for column 999
TFORM999= '813I ' / format for column 999
TTYPEAAA= 'var_min_u_2' / label for column 999
TFORMAAA= 'D ' / format for column 999
TUNITAAA= 'counts/s' / units for column 999
TTYPEAAB= 'var_prob_h_2' / label for column 1000
TFORMAAB= 'D ' / format for column 1000
...
TTYPEAHW= 'var_prob_w_2' / label for column 1203
TFORMAHW= 'D ' / format for column 1203
TTYPEAHX= 'var_sigma_w_2' / label for column 1204
TFORMAHX= 'D ' / format for column 1204
TUNITAHX= 'counts/s' / units for column 1204
END
This variant was generally less favoured than the HIERARCH-based
one by participants in the FITSBITS discussion.
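For reference, the base-26 suffix used by this variant can be computed in both directions with a short sketch. The class `AlphaIndex` is hypothetical; it assumes the container index is 999, as in the text above, and maps a 1-based data column index to its three-letter suffix and back.

```java
/** Hypothetical sketch of the three-letter base-26 column suffix. */
final class AlphaIndex {

    /** Index of the container column; suffix AAA denotes this column. */
    static final int ICOL_CONTAINER = 999;

    private static final int BASE = 26;
    private static final int RANGE = BASE * BASE * BASE;   // 17576

    private AlphaIndex() {
    }

    /** Returns the suffix for a data column index in 999..18574. */
    static String encode(int icol) {
        int ix = icol - ICOL_CONTAINER;
        if (ix < 0 || ix >= RANGE) {
            throw new IllegalArgumentException("Column out of range: " + icol);
        }
        char[] digits = new char[3];
        for (int k = 2; k >= 0; k--) {        // least significant digit last
            digits[k] = (char) ('A' + ix % BASE);
            ix /= BASE;
        }
        return new String(digits);
    }

    /** Returns the data column index for a three-letter suffix. */
    static int decode(String aaa) {
        if (aaa.length() != 3) {
            throw new IllegalArgumentException("Need exactly 3 letters: " + aaa);
        }
        int ix = 0;
        for (char c : aaa.toCharArray()) {
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("Not in A-Z: " + aaa);
            }
            ix = ix * BASE + (c - 'A');
        }
        return ix + ICOL_CONTAINER;
    }
}
```

This reproduces the suffixes in the example header: column 1203 is offset 204 = 7*26 + 22, giving AHW, and column 1204 gives AHX. The largest representable column, 18574, encodes as ZZZ.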
@@ -49,9 +49,6 @@ public abstract class AbstractFitsTableWriter extends StreamStarTableWriter
private static final Logger logger_ =
Logger.getLogger( "uk.ac.starlink.fits" );
/** Hard limit for FITS table columns (TTYPE1000 has too many chars). */
private static final int MAX_FITS_COLUMNS = 999;
/**
* Constructor.
*
@@ -147,12 +144,6 @@ public abstract class AbstractFitsTableWriter extends StreamStarTableWriter
*/
public void writeTableHDU( StarTable table, FitsTableSerializer fitser,
DataOutput out ) throws IOException {
int ncol = table.getColumnCount();
if ( ncol > MAX_FITS_COLUMNS ) {
throw new IOException( "Column count " + ncol
+ " exceeds FITS limit of "
+ MAX_FITS_COLUMNS );
}
try {
Header hdr = fitser.getHeader();
addMetadata( hdr );
@@ -167,9 +158,13 @@ public abstract class AbstractFitsTableWriter extends StreamStarTableWriter
/**
* Provides a suitable serializer for a given table.
* Note this should throw an IOException if it can be determined that
* the submitted table cannot be written by this writer, for instance
* if it has too many columns.
*
* @param table table to serialize
* @return FITS serializer
* @throws IOException if the table can't be written
*/
protected abstract FitsTableSerializer createSerializer( StarTable table )
throws IOException;
......
package uk.ac.starlink.fits;
import java.util.logging.Logger;
import nom.tam.fits.FitsFactory;
import nom.tam.fits.Header;
import nom.tam.fits.HeaderCardException;
/**
* Implementations of the WideFits interface.
* This class fills in the details of the general idea defined in
* WideFits. Static methods provide concrete implementations.
*
* <p>The Wide FITS convention is defined in the file
* (fits/src/docs/)wide-fits.txt
*
* @author Mark Taylor
* @since 27 Jul 2017
*/
public abstract class AbstractWideFits implements WideFits {
private final int icolContainer_;
private final int extColMax_;
private final String name_;
/** Index of container column hosting extended column data. */
public static final String KEY_ICOL_CONTAINER = "XT_ICOL";
/** Header key for extended column count - includes standard ones. */
public static final String KEY_NCOL_EXT = "XT_NCOL";
private static final Logger logger_ =
Logger.getLogger( "uk.ac.starlink.fits" );
/**
* Constructor.
*
* @param icolContainer 1-based index of container column
* used for storing extended column data;
* usually 999
* @param extColMax maximum number of extended columns
* (including standard columns) that can be
* represented by this convention
* @param implName base name of this implementation
*/
protected AbstractWideFits( int icolContainer, int extColMax,
String implName ) {
icolContainer_ = icolContainer;
if ( icolContainer_ > MAX_NCOLSTD ) {
throw new IllegalArgumentException( "Container column index > "
+ MAX_NCOLSTD );
}
extColMax_ = extColMax;
name_ = implName
+ ( icolContainer == MAX_NCOLSTD ? "" : icolContainer );
}
public int getContainerColumnIndex() {
return icolContainer_;
}
public int getExtColumnMax() {
return extColMax_;
}
public void addContainerColumnHeader( Header hdr, long nbyteExt,
long nslice ) {
/* Work out how we will declare the data type for this column
* in the FITS header. Since this column data is not supposed
* to have any meaning, it doesn't matter what format we use
* as long as it's the right length.
* The obvious thing would be to use 'B' format (byte),
* and that does work. However, FITS bytes are unsigned,
* and java bytes are signed. To get round that,
* the STIL FITS reading code usually turns FITS bytes into java
* shorts on read. That doesn't break anything, but it means that
* if a non-WideFits-aware STIL reader encounters a WideFits table,
* it may end up doing expensive and useless conversions of
* large byte arrays to large short arrays.
* (Other software may or may not have similar issues with
* FITS unsigned bytes, I don't know).
* So, if the element size is an even number of bytes, write it
* using TFORM = 'I' (16-bit signed integer) instead.
* Since the FITS and java 16-bit types match each other, this
* avoids the problem. If it's an odd number, we still have to
* go with bytes. */
final char formChr;
final short formSiz;
long elSize = nslice > 0 ? nbyteExt / nslice : nbyteExt;
if ( elSize % 2 == 0 ) {
formChr = 'I';
formSiz = 2;
}
else {
formChr = 'B';
formSiz = 1;
}
long nEl = nbyteExt / formSiz;
assert nEl * formSiz == nbyteExt;
/* If requested, prepare to write a TDIM header. */
final String dimStr;
if ( nslice > 0 ) {
if ( nEl % nslice == 0 ) {
dimStr = new StringBuffer()
.append( '(' )
.append( nEl / nslice )
.append( ',' )
.append( nslice )
.append( ')' )
.toString();
}
else {
logger_.severe( nEl + " not divisible by " + nslice
+ " - no TDIM" + icolContainer_ );
dimStr = null;
}
}
else {
dimStr = null;
}
/* Add the relevant entries to the header. */
BintableColumnHeader colhead =
BintableColumnHeader.createStandardHeader( icolContainer_ );
String forcol = " for column " + icolContainer_;
try {
hdr.addValue( colhead.getKeyName( "TTYPE" ), "XT_MORECOLS",
"label" + forcol );
hdr.addValue( colhead.getKeyName( "TFORM" ), nEl + "" + formChr,
"format" + forcol );
if ( dimStr != null ) {
hdr.addValue( colhead.getKeyName( "TDIM" ), dimStr,
"dimensions" + forcol );
}
hdr.addValue( colhead.getKeyName( "TCOMM" ),
"Extension buffer for columns beyond "
+ icolContainer_, null );
}
catch ( HeaderCardException e ) {
throw new RuntimeException( e ); // shouldn't happen
}
}
public void addExtensionHeader( Header hdr, int ncolExt ) {
try {
hdr.addValue( KEY_ICOL_CONTAINER, icolContainer_,
"index of container column" );
hdr.addValue( KEY_NCOL_EXT, ncolExt,
"total columns including extended" );
}
catch ( HeaderCardException e ) {
throw new RuntimeException( e ); // shouldn't happen
}
}
public int getExtendedColumnCount( HeaderCards cards, int ncolStd ) {
Integer icolContainerValue = cards.getIntValue( KEY_ICOL_CONTAINER );
Integer ncolExtValue = cards.getIntValue( KEY_NCOL_EXT );
if ( icolContainerValue == null && ncolExtValue == null ) {
return ncolStd;
}
else if ( icolContainerValue == null || ncolExtValue == null ) {
logger_.warning( "FITS header has one but not both of "
+ KEY_ICOL_CONTAINER + " and " + KEY_NCOL_EXT
+ " - no extended columns" );
return ncolStd;
}
int icolContainer = icolContainerValue.intValue();
int ncolExt = ncolExtValue.intValue();
if ( icolContainer != ncolStd ) {
logger_.warning( "FITS header " + KEY_ICOL_CONTAINER + "="
+ icolContainer
+ " != standard column count (TFIELDS) " + ncolStd
+ " - no extended columns" );
return ncolStd;
}
for ( String tkey : new String[] { "TTYPE", "TFORM", "TCOMM" } ) {
cards.useKey( tkey + icolContainer );
}
logger_.config( "Located extended columns in wide FITS file" );
return ncolExt;
}
@Override
public String toString() {
return name_;
}
/**
* Returns a WideFits instance that uses normal TFORMaaa headers
* where aaa is a 3-digit base-26 integer (each digit is [A-Z]).
*
* <p><strong>Note:</strong> this implementation is a historical relic.
* It could be removed if its maintenance becomes problematic.
*
* @param icolContainer 1-based index of container column
* used for storing extended column data;
* usually 999
* @return WideFits implementation
*/
public static WideFits createAlphaWideFits( int icolContainer ) {
return new AlphaWideFits( icolContainer );
}
/**
* Returns a WideFits instance that uses headers of the form
* HIERARCH XT TFORMnnnnn, using the ESO HIERARCH convention.
*
* @param icolContainer 1-based index of container column
* used for storing extended column data;
* usually 999
* @return WideFits implementation
*/
public static WideFits createHierarchWideFits( int icolContainer ) {
return new HierarchWideFits( icolContainer );
}
/**
* Utility method to write a log message indicating that this
* convention is being used to write a FITS file.
*
* @param logger logger
* @param nStdcol number of standard FITS columns
* @param nAllcol total number of columns including extended
*/
public static void logWideWrite( Logger logger, int nStdcol, int nAllcol ) {
if ( nAllcol > nStdcol ) {
logger.warning( "Using non-standard extended column convention" );
logger.warning( "Other FITS software may not see columns "
+ nStdcol + "-" + nAllcol );
}
}
/**
* Utility method to write a log message indicating that this
* convention is being used to read a FITS file.
*
* @param logger logger
* @param nStdcol number of standard FITS columns
* @param nAllcol total number of columns including extended
*/
public static void logWideRead( Logger logger, int nStdcol, int nAllcol ) {
if ( nAllcol > nStdcol ) {
logger.info( "Using non-standard extended column convention "
+ "for columns " + nStdcol + "-" + nAllcol );
}
}
/**
* WideFits implementation based on using 3-digit base-26 numbers
* to label extended columns in normal 8-character FITS keywords.
*
* <p><strong>Note:</strong> this implementation is a historical relic.
* It could be removed if its maintenance becomes problematic.
*/
static class AlphaWideFits extends AbstractWideFits {
/** First digit used for extended column indexing. */
private static final char DIGIT0 = 'A';
/** Number of distinct digit characters used for extended column indexing.
* This is the base used for the index value encoding.
* All characters in the range DIGIT0..DIGIT0+NDIGIT-1 must be legal FITS
* keyword characters, and must not be decimal digits. */
private static final int NDIGIT = 26;
/**
* Constructor.
*
* @param icolContainer 1-based index of container column
* used for storing extended column data
*/
public AlphaWideFits( int icolContainer ) {
super( icolContainer,
icolContainer - 1 + NDIGIT * NDIGIT * NDIGIT,
"alpha" );
}
public BintableColumnHeader createExtendedHeader( int icolContainer,
int jcol ) {
final String jcolStr = encodeInteger( jcol - icolContainer );
return new BintableColumnHeader() {
public String getKeyName( String stdName ) {
return stdName + jcolStr;
}
};
}
/**
* Encodes an integer so it can be used as an extended column index.
* This uses base 26, with the digits A-Z.
*
* <p>This must give a unique result of not more than 3 characters
* for each input value in the allowed range,
* which is legal for inclusion in a FITS keyword,
* and which is not capable of interpretation as a decimal integer.
*
* @param ix input value in range 0&lt;=ix&lt;17576
* @return string representation in base 26
* @throws NumberFormatException if input value is out of range
*/
public String encodeInteger( int ix ) {
int base = NDIGIT;
int max = base * base * base;
if ( ix >= 0 && ix < max ) {
char[] digits = new char[ 3 ];
int j = ix;
for ( int k = 0; k < 3; k++ ) {
digits[ 2 - k ] = (char) ( DIGIT0 + ( j % base ) );
j = j / base;
}
return new String( digits );
}
else {
String msg = "Out of range (0-" + ( max - 1 ) + "): " + ix;
throw new NumberFormatException( msg );
}
}
}
/**
* WideFits implementation based on the non-standard HIERARCH convention.
*/
static class HierarchWideFits extends AbstractWideFits {
public static final String NAMESPACE = "XT";
/**
* Constructor.
*
* @param icolContainer 1-based index of container column
* used for storing extended column data
*/
public HierarchWideFits( int icolContainer ) {
super( icolContainer, Integer.MAX_VALUE, "hierarch" );
}
public BintableColumnHeader createExtendedHeader( int icolContainer,
final int jcol ) {
return new BintableColumnHeader() {
public String getKeyName( String stdName ) {
return new StringBuffer()
.append( "HIERARCH" )
.append( "." )
.append( NAMESPACE )
.append( "." )
.append( stdName )
.append( jcol )
.toString();
}
};
}
@Override
public void addExtensionHeader( Header hdr, int ncolExt ) {
checkHasHierarch( false );
super.addExtensionHeader( hdr, ncolExt );
}
@Override
public int getExtendedColumnCount( HeaderCards cards, int ncolStd ) {
int ncolExt = super.getExtendedColumnCount( cards, ncolStd );
if ( ncolExt > ncolStd ) {
checkHasHierarch( true );
}
return ncolExt;
}
/**
* Check that the FITS HIERARCH convention is in operation.
* If not, log a severe message; no exception is thrown.
*
* @param isRead true for read, false for write
*/
private void checkHasHierarch( boolean isRead ) {
if ( ! FitsFactory.getUseHierarch() ) {
logger_.severe( "FitsFactory.useHierarch=false: "
+ "HIERARCH-based wide FITS table convention "
+ ( isRead ? "read" : "write" ) + " will fail!" );
}
}
}
}
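The rule described in the comment of addContainerColumnHeader (use 'I' when the element size is an even number of bytes, otherwise 'B') can be restated without the nom.tam.fits dependency. The sketch below is an illustration only: the class `ContainerFormat` is hypothetical, and unlike the method above it assumes that a positive nslice divides the byte count exactly and rejects input where it does not.

```java
/** Hypothetical, dependency-free restatement of the container TFORM rule. */
final class ContainerFormat {

    private ContainerFormat() {
    }

    /**
     * Chooses TFORM and TDIM values for the container column.
     *
     * @param nbyteExt total bytes per row occupied by the extended columns
     * @param nslice   number of slices for TDIM, or 0 for no TDIM
     * @return two-element array {TFORM, TDIM}; TDIM is null if nslice is 0
     */
    static String[] choose(long nbyteExt, long nslice) {
        if (nbyteExt <= 0 || nslice < 0) {
            throw new IllegalArgumentException("Bad arguments");
        }
        if (nslice > 0 && nbyteExt % nslice != 0) {
            throw new IllegalArgumentException(
                nbyteExt + " not divisible by " + nslice);
        }
        long elSize = nslice > 0 ? nbyteExt / nslice : nbyteExt;

        // Even element size: 16-bit integers avoid unsigned-byte handling.
        boolean useShort = elSize % 2 == 0;
        int formSiz = useShort ? 2 : 1;
        char formChr = useShort ? 'I' : 'B';

        long nEl = nbyteExt / formSiz;
        String tform = nEl + "" + formChr;
        String tdim = nslice > 0
                    ? "(" + ( nEl / nslice ) + "," + nslice + ")"
                    : null;
        return new String[] { tform, tdim };
    }
}
```

For a 1626-byte container with no slicing this gives a TFORM of `813I`, which matches the TFORM999 value shown in the example header of wide-fits.txt. An odd byte count such as 1625 falls back to `1625B`.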
package uk.ac.starlink.fits;
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
/**
* ThreadLocal based on an InputFactory.
* This can dispense a BasicInput object private to the current thread.
* The close method will close all the BasicInput objects that this
* has created so far.
*
* @author Mark Taylor
* @since 30 Jun 2017
*/
public class BasicInputThreadLocal extends ThreadLocal<BasicInput>
implements Closeable {
private final InputFactory inputFact_;
private final boolean isSeq_;
private final List<BasicInput> inputs_;
/**
* Constructor.
*
* @param inputFact factory for BasicInput objects
* @param isSeq true if created inputs are sequential, false for random
*/
public BasicInputThreadLocal( InputFactory inputFact, boolean isSeq ) {
inputFact_ = inputFact;
isSeq_ = isSeq;
inputs_ = new ArrayList<BasicInput>();
}
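@Override
protected BasicInput initialValue() {
BasicInput bi = createBasicInput();
// Guard the shared list: initialValue may run concurrently in
// several threads, and close() iterates the list under the same lock.
synchronized ( this ) {
inputs_.add( bi );
}
return bi;
}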
/**
* Creates a BasicInput object without throwing an exception.
* If it fails, a dummy instance that will throw a suitable exception
* when in use is returned.
*
* @return new basic input
*/
private BasicInput createBasicInput() {
try {
return inputFact_.createInput( isSeq_ );
}
catch ( final IOException e ) {
return new FailureBasicInput( e, isSeq_ );
}
}
public synchronized void close() {
for ( Iterator<BasicInput> it = inputs_.iterator(); it.hasNext(); ) {
try {
it.next().close();
}
catch ( IOException e ) {
// never mind
}
it.remove();
}
}
/**
* BasicInput instance that responds to most method invocations
* by throwing a previously supplied exception.
*/
private static class FailureBasicInput implements BasicInput {
private final IOException err_;
private final boolean isSeq_;
/**
* Constructor.
*
* @param err exception on which to base thrown ones
* @param isSeq true if created inputs are sequential,
* false for random
*/
FailureBasicInput( IOException err, boolean isSeq ) {
isSeq_ = isSeq;
err_ = err;
}
public byte readByte() throws IOException {
throw failure();
}
public short readShort() throws IOException {
throw failure();
}
public int readInt() throws IOException {
throw failure();
}
public long readLong() throws IOException {
throw failure();
}
public float readFloat() throws IOException {
throw failure();
}
public double readDouble() throws IOException {
throw failure();
}
public void skip( long nbyte ) throws IOException {
throw failure();
}