Commit 2adf65a3 authored by Filippo Rusconi's avatar Filippo Rusconi

New upstream version 1.9.0

parents
This file lists *major* changes between releases, for a full list see git log.
1.9.1
----------------------------------
- R bindings brought up to speed
- Improvements for 1.0.X backward compatibility in Python module
- Minor fixes and cleanups
1.9.0
----------------------------------
- We no longer override $CXX to use clang when available - the difference in speed of
produced code is no longer large enough to justify it
- Some more minor code cleanups
- Python wrapper now returns probabilities of configurations instead of logprobabilities
by default. This is controllable through (optional) parameter.
- Expose LayeredTabulator in Python
----------------------------------
1.9.0beta2
- API change: rename eprob() functions to prob()
- Added more tests
- Some bugfixes
----------------------------------
1.9.0beta1
- Added more tests
- Code cleanups
- Some bugfixes
----------------------------------
1.9.0alpha3
- Brought C++/Python examples up to the new API
- Small change of Python API
- Speed improvements in IsoThresholdGenerator
- Code cleanup, ripped out much of experimental stuff which will be postponed until 2.0
----------------------------------
1.9.0alpha2
- Rudimentary module for backward-compatibility with IsoSpecPy 1.0.X API
- Bugfix (and usability improvement) for getting configurations in
IsoThreshold
----------------------------------
1.9.0alpha1
- Almost complete rewrite of the entire codebase
- Improvements in the core algorithms, resulting in 10x-1000x speed increase
as well as lower memory consumption
- Reworked API to be more sensible
- Laying the groundwork for multithreaded operation (not complete yet)
- Expose generators in Python (enabling processing configurations on the fly,
as they are produced without the need to compute and store all of them first
(using up memory) and process them later)
- More modular Python part
- C/C++ and Python-only release, R will follow soon after
- Some features from 1.0.5 are not yet (re-)implemented: IsoLayered is only
exposed as a generator, not yet as a fixed table; as a consequence quicktrim
to get the optimal p-set is still missing
- THIS IS AN ALPHA RELEASE: IT'S NOT YET EXHAUSTIVELY TESTED AND MAY CRASH OR
PRODUCE WRONG RESULTS
- API is still not yet finalized.
----------------------------------
1.0.7 (Python-only release)
- Bugfix in Python's getConfs()
----------------------------------
1.0.6 (Python-only release)
- Compatibility fixes for Python3.6
----------------------------------
1.0.5 (Python-only release)
- Only affecting IsoSpecPy: Speed improvements in IsoFromFormula
----------------------------------
1.0.4
- ??? FIXME
----------------------------------
1.0.3
- Only affecting IsoSpecR: improve the efficiency of R to C++ interface
- Add an option to skip discarding the superfluous configurations
----------------------------------
1.0.2
- Some further Python+Windows compatibility improvements
----------------------------------
1.0.1
- Only affecting IsoSpecPy: provide prebuilt library
- Only affecting IsoSpecPy: Improve Python/cygwin functionality
----------------------------------
1.0.0
- Initial release
The easiest (if not the most elegant) way to compile the example is:
clang++ water.cpp ../../IsoSpec++/unity-build.cpp -std=c++11
You may replace clang++ with g++ if you do not have LLVM installed.
Run the example by:
./a.out
#include <iostream>
#include <cassert>
#include "../../IsoSpec++/isoSpec++.h"
using namespace IsoSpec;
int main()
{
int configs[5];
{
// We shall walk through a set of configurations which covers at least 99.9% of the mass.
// For water we could obviously go through the entire spectrum (100%), but for larger
// molecules the entire spectrum has far too many configurations. Luckily, most of the
// likelihood is concentrated in a relatively small set of most probable isotopologues
// - and this method allows one to quickly calculate such a set, parametrising on the
// percentage of coverage of total probability space required.
//
// This is usually better than just calculating all isotopes with probabilities above a
// given threshold, as it allows one to directly parametrise on the accuracy of the
// simplified spectrum - that is, the L1 distance between the full and simplified spectrum
// will be less than (in this example) 0.001.
//
// If for some reason one would wish to just calculate a set of peaks with probabilities
// above a given threshold - it is possible using the IsoThresholdGenerator class.
//
// Note: the returned set will usually contain a bit more configurations than necessary
// to achieve the desired coverage. These configurations need to be computed anyway, however
// it is possible to discard them using the optional trim argument.
IsoLayeredGenerator iso("H2O1", 0.999);
double total_prob = 0.0;
while(iso.advanceToNextConfiguration())
{
std::cout << "Visiting configuration: " << std::endl;
std::cout << "Mass: " << iso.mass() << std::endl;
std::cout << "log-prob: " << iso.lprob() << std::endl;
std::cout << "probability: " << iso.prob() << std::endl;
total_prob += iso.prob();
iso.get_conf_signature(configs);
// Successive isotopologues are ordered by the appearance in the formula of the element, then by nucleon number, and concatenated into one array
std::cout << "Protium atoms: " << configs[0] << std::endl;
std::cout << "Deuterium atoms " << configs[1] << std::endl;
std::cout << "O16 atoms: " << configs[2] << std::endl;
std::cout << "O17 atoms: " << configs[3] << std::endl;
std::cout << "O18 atoms: " << configs[4] << std::endl;
}
assert(total_prob >= 0.99); // We must have covered at least 99% of the spectrum
}
{
std::cout << "Now what if both isotopes of hydrogen were equally probable, while prob. of O16 was 50%, O17 at 30% and O18 at 20%?" << std::endl;
const int elementNumber = 2;
const int isotopeNumbers[2] = {2,3};
const int atomCounts[2] = {2,1};
const double hydrogen_masses[2] = {1.00782503207, 2.0141017778};
const double oxygen_masses[3] = {15.99491461956, 16.99913170, 17.9991610};
const double* isotope_masses[2] = {hydrogen_masses, oxygen_masses};
const double hydrogen_probs[2] = {0.5, 0.5};
const double oxygen_probs[3] = {0.5, 0.3, 0.2};
const double* probs[2] = {hydrogen_probs, oxygen_probs};
IsoLayeredGenerator iso(Iso(elementNumber, isotopeNumbers, atomCounts, isotope_masses, probs), 0.99);
iso.advanceToNextConfiguration();
std::cout << "The first configuration has the following parameters: " << std::endl;
std::cout << "Mass: " << iso.mass() << std::endl;
std::cout << "log-prob: " << iso.lprob() << std::endl;
std::cout << "probability: " << iso.prob() << std::endl;
iso.get_conf_signature(configs);
// Successive isotopologues are ordered by the appearance in the formula of the element, then by nucleon number, and concatenated into one array
std::cout << "Protium atoms: " << configs[0] << std::endl;
std::cout << "Deuterium atoms " << configs[1] << std::endl;
std::cout << "O16 atoms: " << configs[2] << std::endl;
std::cout << "O17 atoms: " << configs[3] << std::endl;
std::cout << "O18 atoms: " << configs[4] << std::endl;
std::cout << "Probabilities of the remaining configurations, are: " << std::endl;
while(iso.advanceToNextConfiguration())
std::cout << iso.prob() << std::endl;
}
// TODO: demonstrate other algorithms
}
# Calculates the isotopic distribution of water in several ways
import IsoSpecPy
from math import exp
try:
if IsoSpecPy.__version__[:3] != '1.9':
raise AttributeError
except AttributeError:
print("This file is meant to be used with IsoSpecPy version 1.9.X. You seem to have a different version installed on your system.")
import sys
sys.exit(-1)
i = IsoSpecPy.IsoLayeredGenerator(formula="H2O1", prob_to_cover = 0.999, get_confs=True)
print("Calculating isotopic distribution of water. Here's a list of configurations necessary to cover at least 0.999 of total probability:")
for mass, log_prob, conf in i:
print("")
print("Mass: " + str(mass))
print("log(probability): " + str(log_prob))
print("probability: " + str(exp(log_prob)))
print("Number of Protium atoms: " + str(conf[0][0]))
print("Number of Deuterium atoms: " + str(conf[0][1]))
print("Number of O16 atoms: " + str(conf[1][0]))
print("Number of O17 atoms: " + str(conf[1][1]))
print("Number of O18 atoms: " + str(conf[1][2]))
print("")
print("Now what if both isotopes of hydrogen were equally probable, while prob. of O16 was 50%, O17 at 30% and O18 at 20%?")
hydrogen_probs = (0.5, 0.5)
oxygen_probs = (0.5, 0.3, 0.2)
hydrogen_masses = (1.00782503207, 2.0141017778)
oxygen_masses = (15.99491461956, 16.99913170, 17.9991610)
i = IsoSpecPy.IsoLayeredGenerator(atomCounts = (2, 1), isotopeMasses = (hydrogen_masses, oxygen_masses), isotopeProbabilities = (hydrogen_probs, oxygen_probs), prob_to_cover = 0.999, get_confs=True)
mass, log_prob, conf = next(i.__iter__())
print("The first configuration has the following parameters:")
print("Mass: " + str(mass))
print("log(probability): " + str(log_prob))
print("probability: " + str(exp(log_prob)))
print("Number of Protium atoms: " + str(conf[0][0]))
print("Number of Deuterium atoms: " + str(conf[0][1]))
print("Number of O16 atoms: " + str(conf[1][0]))
print("Number of O17 atoms: " + str(conf[1][1]))
print("Number of O18 atoms: " + str(conf[1][2]))
library(IsoSpecR)
water = c(H=2,O=1) # A water molecule:
p = .99 # joint probability of the output
# calculate one of the smallest sets of isotopologues with
# probability at least equal to 99%
WATER = IsoSpecify(molecule = water, stopCondition = p)
print(WATER)
# the outcome looks small, but water can only have 3*3 = 9 isotopologues.
# (for a general formula, see the formula in the supplement to our paper)
# you can see all of them by setting the probability to 1: this will include all isotopologues.
WATER = IsoSpecify(molecule = water, stopCondition = 1)
print(WATER)
# The default output consists of a matrix with two columns: mass and log-probability of the
# isotopologues (the natural logarithms, with base equal to 'e').
print(WATER[,"mass"])
print(WATER[,"logProb"])
# to get the probabilities, simply run
WATER = IsoSpecify(molecule = water, stopCondition = 1, showCounts=T)
print(WATER)
# Let's get a bigger example: bovine insulin
bovine_insulin = c(C=254, H=377, N=65, O=75, S=6)
BOVINE_INSULIN = IsoSpecify(bovine_insulin, .999)
# as you can see, there are 1301 isotopologs such that their probability summed together
# exceed .99. This is one of the smallest such sets.
print(nrow(BOVINE_INSULIN))
total_probability = sum(exp(BOVINE_INSULIN[,'logProb']))
print(total_probability)
# there are other ways to get this outcome, but this one is the fastest.
# if you want to get the isotopologues sorted by probability, you can run the
# priority-queue version of the algorithm.
BOVINE_INSULIN = IsoSpecify(bovine_insulin, .999, algo=1)
is.unsorted(BOVINE_INSULIN[,'logProb'][nrow(BOVINE_INSULIN):1])
# is.unsorted == FALSE is the same as sorted == TRUE. It's elementary :)
# using the priority queue has poorer time complexity, which is O(n*log(n)).
T0 = Sys.time()
BOVINE_INSULIN = IsoSpecify(bovine_insulin, .999, algo=1)
T1 = Sys.time()
BOVINE_INSULIN = IsoSpecify(bovine_insulin, .999, algo=0)
T2 = Sys.time()
print(T1 - T0)
print(T2 - T1)
# If, for some reason, you just want to generate peaks above certain probability threshold,
# you could use the third algorithm
BOVINE_INSULIN_peaks_more_probable_than_10_percent_each = IsoSpecify(bovine_insulin, .1, algo=3)
print(exp(BOVINE_INSULIN_peaks_more_probable_than_10_percent_each[,'logProb']))
# ATTENTION: it is not a good idea to do it.
# Remember, that we supply you with infinitely resolved peaks.
# your experimental peaks are most likely generated by an mass spctrometer,
# that does not share this quality. Hence, the smallest observed peak might
# be actually a sum of some smaller, unresolved peaks, giving you false impression of security.
# Anyway, if you still want to, you can also precise the limiting height of the peak w.r.t.
# the height of the heighest peak.
# You can achieve this with algorithm 3:
BOVINE_INSULIN_peaks_higher_than_the_one_ten_thousandth_of_the_height_of_the_heighest_peak = IsoSpecify(bovine_insulin, 1e-04, algo=3)
LP = BOVINE_INSULIN_peaks_higher_than_the_one_ten_thousandth_of_the_height_of_the_heighest_peak[,'logProb']
print(exp(LP[length(LP)] - LP[1]))
# so, as you can see, this is a peak very close to 1/10,000, but still above the threshold.
# The next most probable that was not yet calculated, would have had that ratio below this number.
# You are invited to test it yourself :)
# Matteo and Michał.
\ No newline at end of file
-------------------------------------------
Building and installing the C/C++ library
Requirements:
A C++11 compiler, clang++ (v3.3 or later) or g++ (v4.7 or later)
Make (GNU make is OK)
Note: clang++ is the default (and preferred) compiler as it produces
faster code (in our tests, the difference in speed is about 20%).
If you'd like to use g++ instead, please edit the first line of
IsoSpec++/Makefile appropriately.
Next, execute the following commands:
cd IsoSpec++
make
You may copy the resulting .so file to a convenient location.
If you wish to develop software using this library, you will also
have to place the header files (*.hpp/*.h) somewhere your C/C++
compiler can find them.
On Windows, you should turn on the MANN_LIBRARY pre-processor directive when
building the library.
------------------------------------------
Building and installing the Python package
IMPORTANT: please note that the Python package is standalone, in the
sense that it does not need the C/C++ library to be installed separately.
IsoSpecPy is compatible with both CPython and PyPy, running on Unix,
Cygwin or in MinGW environment.
The preferred method of installation is by using the pip package manager.
In order to do that, simply issue the following command, replacing
"<python>" with your Python executable (python/python3/pypy/...)
<python> -m pip install IsoSpecPy
If, instead, you wish to install the package manually, follow these
instructions:
Requirements:
Python (obviously) (v2 and v3 are supported)
C++11 compiler, clang++ (v3.3 or later) or g++ (v4.7 or later)
setuptools
cffi (this, if not present, will be automatically downloaded and
installed by the setup script, however you may prefer to use
your distribution's package manager to install that)
Execute the following commands:
cd IsoSpecPy
sudo python setup.py install
Again, clang++ is the preferred compiler and will be used if found by the
setup script. If you want to override the behaviour (if you have clang++
installed, but still want to use g++) you will have to replace the last
command with:
ISO_USE_DEFAULT_CXX=TRUE sudo python setup.py install
------------------------------------------
Building and installing the R package
The preferred installation method is to use the CRAN package installer.
To do that, simply issue the command:
install.packages('IsoSpecR')
in your R session. However it is possible to install the package manually,
from source, by following these instructions:
Requirements:
R (>= 3.2.1)
Rcpp package from CRAN
C++11 compiler, clang++ (v3.3 or later) or g++ (v4.7 or later)
Move to the folder containing the IsoSpecR folder. Then run in terminal:
R CMD build IsoSpecR
R CMD INSTALL IsoSpecR_*.tar.gz
All necessary packages should download automatically.
project("IsoSpec")
cmake_minimum_required(VERSION 2.6)
set(my_sources
cwrapper.cpp
allocator.cpp
dirtyAllocator.cpp
isoSpec++.cpp
isoMath.cpp
marginalTrek++.cpp
operators.cpp
element_tables.cpp
misc.cpp
)
## platform dependent compiler flags:
include(CheckCXXCompilerFlag)
if (NOT WIN32) # we only want fPIC on non-windows systems (fPIC is implicitly true there)
CHECK_CXX_COMPILER_FLAG("-fPIC" WITH_FPIC)
if (WITH_FPIC)
add_definitions(-fPIC)
endif()
# Only add those definitions on non-windows systems
add_definitions(-std=c++11 -Wall -pedantic -Wextra)
else()
## On MSVS we need this for mmap
set(my_sources ${my_sources} mman.cpp)
add_definitions(-DMMAN_LIBRARY)
endif()
add_library(IsoSpec SHARED ${my_sources})
configure_file(IsoSpecConfig.cmake.in "${PROJECT_BINARY_DIR}/IsoSpecConfig.cmake" @ONLY)
export(TARGETS IsoSpec FILE IsoSpecLibrary.cmake)
# Create imported target IsoSpec
# adapted from http://www.cmake.org/Wiki/CMake/Tutorials/How_to_create_a_ProjectConfig.cmake_file#The_FooBar.2FFooBarBuildTreeSettings.cmake.in_file
# Compute paths
get_filename_component(ISOSPEC_CMAKE_DIR "${CMAKE_CURRENT_LIST_FILE}" PATH)
if(EXISTS "${ISOSPEC_CMAKE_DIR}/CMakeCache.txt")
# In build tree
set(ISOSPEC_INCLUDE_DIRS
"@PROJECT_SOURCE_DIR@/")
else()
set(ISOSPEC_INCLUDE_DIRS "${ISOSPEC_CMAKE_DIR}/@CONF_REL_INCLUDE_DIR@")
endif()
# Our library dependencies (contains definitions for IMPORTED targets)
include("${ISOSPEC_CMAKE_DIR}/IsoSpecLibrary.cmake")
# These are IMPORTED targets created by IsoSpecLibrary.cmake
set(ISOSPEC_LIBRARIES IsoSpec)
OPTFLAGS=-O3 -march=native -mtune=native
DEBUGFLAGS=-O0 -g -Werror
CXXFLAGS=-std=c++11 -Wall -pedantic -Wextra
SRCFILES=cwrapper.cpp allocator.cpp dirtyAllocator.cpp isoSpec++.cpp isoMath.cpp marginalTrek++.cpp operators.cpp element_tables.cpp misc.cpp mman.cpp
all: unitylib
unitylib:
$(CXX) $(CXXFLAGS) $(OPTFLAGS) unity-build.cpp -fPIC -shared -o libIsoSpec++.so
debug:
$(CXX) $(CXXFLAGS) $(DEBUGFLAGS) unity-build.cpp -DDEBUG -fPIC -shared -o libIsoSpec++.so
debug-gcc:
g++ $(CXXFLAGS) $(DEBUGFLAGS) unity-build.cpp -DDEBUG -fPIC -shared -o libIsoSpec++.so
nonunity:
$(CXX) $(CXXFLAGS) $(OPTFLAGS) $(SRCFILES) -DDEBUG -fPIC -shared -o libIsoSpec++.so
clean:
rm -f libIsoSpec++.so
windows:
g++ -O3 -std=gnu++11 -O3 -shared -static -static-libstdc++ -static-libgcc unity-build.cpp -o ../IsoSpecPy/IsoSpec++.dll
x86_64-w64-mingw32-g++.exe -std=gnu++11 -O3 -shared -static -static-libstdc++ unity-build.cpp -o ../IsoSpecPy/IsoSpecPy/prebuilt-libIsoSpec++1.9.1-x64.dll
i686-w64-mingw32-g++.exe -std=gnu++11 -O3 -shared -static -static-libstdc++ unity-build.cpp -o ../IsoSpecPy/IsoSpecPy/prebuilt-libIsoSpec++1.9.1-x32.dll
/*
* Copyright (C) 2015-2018 Mateusz Łącki and Michał Startek.
*
* This file is part of IsoSpec.
*
* IsoSpec is free software: you can redistribute it and/or modify
* it under the terms of the Simplified ("2-clause") BSD licence.
*
* IsoSpec is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*
* You should have received a copy of the Simplified BSD Licence
* along with IsoSpec. If not, see <https://opensource.org/licenses/BSD-2-Clause>.
*/
#include <iostream>
#include "allocator.h"
namespace IsoSpec
{
template <typename T>
Allocator<T>::Allocator(const int dim, const int tabSize): currentId(-1), dim(dim), tabSize(tabSize)
{
currentTab = new T[dim * tabSize];
}
template <typename T>
Allocator<T>::~Allocator()
{
for(unsigned int i = 0; i < prevTabs.size(); ++i)
{
delete [] prevTabs[i];
}
delete [] currentTab;
}
template <typename T>
void Allocator<T>::shiftTables()
{
prevTabs.push_back(currentTab);
currentTab = new T[dim * tabSize];
currentId = 0;
}
template class Allocator<int>;
}
/*
* Copyright (C) 2015-2018 Mateusz Łącki and Michał Startek.
*
* This file is part of IsoSpec.
*
* IsoSpec is free software: you can redistribute it and/or modify
* it under the terms of the Simplified ("2-clause") BSD licence.
*
* IsoSpec is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*
* You should have received a copy of the Simplified BSD Licence
* along with IsoSpec. If not, see <https://opensource.org/licenses/BSD-2-Clause>.
*/
#pragma once
#include <vector>
#include <iostream>
#include <string.h>
#include "conf.h"
namespace IsoSpec
{
template <typename T> inline void copyConf(
const T* source, T* destination,
int dim
){
memcpy(destination, source, dim*sizeof(T));
}
template <typename T> class Allocator{
private:
T* currentTab;
int currentId;
const int dim, tabSize;
std::vector<T*> prevTabs;
public:
Allocator(const int dim, const int tabSize = 10000);
~Allocator();
void shiftTables();
inline T* newConf()
{
currentId++;
if (currentId >= tabSize)
shiftTables();
return &(currentTab[ currentId * dim ]);
}
inline T* makeCopy(const T* conf)
{
T* currentPlace = newConf();
copyConf<T>( conf, currentPlace, dim );
return currentPlace;
}
inline T* makeExternalCopy(const T* conf)
{
T* res = new T[dim];
copyConf( conf, res, dim );
return res;
}
};
}
/*
* Copyright (C) 2015-2018 Mateusz Łącki and Michał Startek.
*
* This file is part of IsoSpec.
*
* IsoSpec is free software: you can redistribute it and/or modify
* it under the terms of the Simplified ("2-clause") BSD licence.
*
* IsoSpec is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*
* You should have received a copy of the Simplified BSD Licence
* along with IsoSpec. If not, see <https://opensource.org/licenses/BSD-2-Clause>.
*/
#pragma once
namespace IsoSpec
{
typedef int* Conf;
}
/*
* Copyright (C) 2015-2018 Mateusz Łącki and Michał Startek.
*
* This file is part of IsoSpec.
*
* IsoSpec is free software: you can redistribute it and/or modify
* it under the terms of the Simplified ("2-clause") BSD licence.
*
* IsoSpec is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*