Commits on Source (6)
language: python
sudo: true
python:
- "3.6"
- "3.5"
- "2.7"
# command to install dependencies
virtualenv:
system_site_packages: true
before_install:
- sudo apt-get install python-numpy python-scipy
install:
- pip install biopython
- python setup.py build
- sudo apt-get update
- wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
- bash miniconda.sh -b -p $HOME/miniconda
- export PATH="$HOME/miniconda/bin:$PATH"
- hash -r
- conda config --set always_yes yes --set changeps1 no
- conda update -q conda
# Useful for debugging any issues with conda
- conda info -a
- conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION numpy scipy biopython nose
- source activate test-environment
- python setup.py install
# command to run tests
script:
- bash .travis_test.sh
nosetests test/test_treetime.py
cd test
bash command_line_tests.sh
OUT=$?
if [ "$OUT" != 0 ]; then
exit 1
fi
nosetests test_treetime.py
......@@ -6,7 +6,6 @@
TreeTime provides routines for ancestral sequence reconstruction and the inference of molecular-clock phylogenies, i.e., a tree where all branches are scaled such that the locations of terminal nodes correspond to their sampling times and internal nodes are placed at the most likely time of divergence.
TreeTime is implemented in Python 2.7 -- an experimental port to [Python 3 is available in a separate branch](https://github.com/neherlab/treetime/tree/py3).
TreeTime aims at striking a compromise between sophisticated probabilistic models of evolution and fast heuristics. It implements GTR models of ancestral inference and branch length optimization, but takes the tree topology as given.
To optimize the likelihood of time-scaled phylogenies, TreeTime uses an iterative approach that first infers ancestral sequences given the branch length of the tree, then optimizes the positions of unconstrained nodes on the time axis, and then repeats this cycle.
The only topology optimization is the (optional) resolution of polytomies in a way that is most (approximately) consistent with the sampling time constraints on the tree.
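The alternating scheme described above can be illustrated on a toy problem. The sketch below is not TreeTime code: the objective `f` and the names `alternate`, `best_x_given_y`, and `best_y_given_x` are hypothetical stand-ins, with `x` playing the role of one block of unknowns (for example the ancestral sequences) and `y` the other (for example the node times). Each step optimizes one block exactly while the other is held fixed, and the cycle repeats until the values stop changing.

```python
def f(x, y):
    """Toy objective to minimize; a stand-in for a negative log-likelihood."""
    return (x - 1.0) ** 2 + (y - 2.0) ** 2 + x * y


def best_x_given_y(y):
    # Setting df/dx = 2(x - 1) + y to zero gives the exact minimizer in x.
    return 1.0 - y / 2.0


def best_y_given_x(x):
    # Setting df/dy = 2(y - 2) + x to zero gives the exact minimizer in y.
    return 2.0 - x / 2.0


def alternate(x, y, max_iter=100, tol=1e-12):
    """Alternate the two block updates until neither value moves by more than tol."""
    for _ in range(max_iter):
        new_x = best_x_given_y(y)
        new_y = best_y_given_x(new_x)
        converged = abs(new_x - x) < tol and abs(new_y - y) < tol
        x, y = new_x, new_y
        if converged:
            break
    return x, y


x_opt, y_opt = alternate(5.0, 5.0)
print(x_opt, y_opt, f(x_opt, y_opt))
```

For this objective both partial derivatives vanish at `x = 0`, `y = 2`, so the loop should settle there. A real likelihood is not guaranteed to behave this well, which is why a cap such as `max_iter` is useful.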
......@@ -34,22 +33,22 @@ The package is designed to be used as a stand-alone tool or as a library used in
### Installation and prerequisites
TreeTime is written in Python 2.7 and currently doesn't support Python 3.
* The package depends on several python libraries:
- numpy, scipy: for all kinds of mathematical operations such as matrix operations, numerical integration, interpolation, minimization, etc.
- BioPython: for parsing multiple sequence alignments and all phylogenetic functionality
TreeTime is compatible with Python 2.7 upwards and is tested on 2.7, 3.5, and 3.6. It depends on several Python libraries:
- matplotlib: optional dependency for plotting
If you do not have these libraries, you can install them by typing in the terminal:
```bash
$ pip install numpy scipy biopython matplotlib
```
* numpy, scipy, pandas: for all kinds of mathematical operations such as matrix
operations, numerical integration, interpolation, minimization, etc.
* To install the package, run `setup.py` script from the terminal:
```bash
$ python setup.py install
```
* BioPython: for parsing multiple sequence alignments and all phylogenetic
functionality
* matplotlib: optional dependency for plotting
You may install TreeTime and its dependencies by running
pip install .
within this repository.
You might need root privileges for a system-wide installation. Alternatively, you can simply use TreeTime locally without installation. In this case, just download and unpack it, and then add the TreeTime folder to your $PYTHONPATH.
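Before running the scripts, it can help to check that the libraries listed above are importable. The following is a minimal sketch, assuming Python 3.4 or later (it uses `importlib.util.find_spec`); the helper name `missing_modules` is hypothetical and not part of TreeTime.

```python
import importlib.util

# Import names of the libraries listed in the README; "Bio" is the import
# name of BioPython. matplotlib is treated as optional, as the README states.
REQUIRED = ["numpy", "scipy", "pandas", "Bio"]
OPTIONAL = ["matplotlib"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("missing required:", missing_modules(REQUIRED))
print("missing optional:", missing_modules(OPTIONAL))
```

An empty list for the required modules suggests the dependencies are in place; anything listed can be installed with `pip`.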
......
......@@ -4,6 +4,7 @@ import numpy as np
from treetime import TreeAnc, GTR
from Bio import Phylo, AlignIO
from Bio import __version__ as bioversion
import sys
if __name__=="__main__":
###########################################################################
......@@ -85,7 +86,8 @@ if __name__=="__main__":
###########################################################################
### ANCESTRAL RECONSTRUCTION
###########################################################################
treeanc = TreeAnc(params.tree, aln=params.aln, gtr=gtr, verbose=4, fill_overhangs=not params.keep_overhangs)
treeanc = TreeAnc(params.tree, aln=params.aln, gtr=gtr, verbose=4,
fill_overhangs=not params.keep_overhangs)
treeanc.infer_ancestral_sequences('ml', infer_gtr=infer_gtr,
marginal=params.marginal)
......@@ -121,3 +123,5 @@ if __name__=="__main__":
outtree_name = '.'.join(params.tree.split('/')[-1].split('.')[:-1])+'_mutation.nexus'
Phylo.write(treeanc.tree, outtree_name, 'nexus')
print("--- tree saved in nexus format as \n\t %s\n"%outtree_name)
sys.exit(0)
......@@ -4,7 +4,7 @@ import numpy as np
from treetime import TreeAnc, GTR
from Bio import Phylo, AlignIO
from Bio import __version__ as bioversion
import os,shutil
import os,shutil, sys
if __name__=="__main__":
###########################################################################
......@@ -29,7 +29,6 @@ if __name__=="__main__":
"Example: '--gtr K80 --gtr_params kappa=0.2 pis=0.25,0.25,0.25,0.25'. See the exact definitions of "
" the parameters in the GTR creation methods in treetime/nuc_models.py. Only nucleotide models supported at present")
parser.add_argument('--prot', default = False, action="store_true", help ="protein alignment")
parser.add_argument('--zero_based', default = False, action='store_true', help='zero based SNP indexing')
parser.add_argument('-n', default = 10, type=int, help='number of mutations/nodes that are printed to screen')
parser.add_argument('--verbose', default = 1, type=int, help='verbosity of output 0-6')
......@@ -39,6 +38,7 @@ if __name__=="__main__":
###########################################################################
### CHECK FOR TREE, build if not in place
###########################################################################
if params.tree is None:
from treetime.utils import tree_inference
params.tree = os.path.basename(params.aln)+'.nwk'
......@@ -47,6 +47,9 @@ if __name__=="__main__":
tree_inference(params.aln, params.tree, tmp_dir = tmp_dir)
if os.path.isdir(tmp_dir):
shutil.rmtree(tmp_dir)
elif not os.path.isfile(params.tree):
print("Input tree file does not exist:", params.tree)
sys.exit(1)
###########################################################################
### GTR SET-UP
......@@ -216,3 +219,6 @@ if __name__=="__main__":
for name, val in mutation_by_strain_sorted[:params.n]:
if len(val):
print("\t%s\t%d"%(name, len(val)))
sys.exit(0)
......@@ -8,7 +8,7 @@ from Bio.Seq import Seq
from Bio.Align import MultipleSeqAlignment
from Bio import Phylo, AlignIO
from Bio import __version__ as bioversion
import os
import os,sys
if __name__=="__main__":
###########################################################################
......@@ -23,7 +23,7 @@ if __name__=="__main__":
parser.add_argument('--states', required = True, type=str, help ="csv or tsv file with discrete characters."
"\n#name,country,continent\ntaxon1,micronesia,oceania\n...")
parser.add_argument('--weights', type=str, help="csv or tsv file with probabilities that a randomly sampled "
"sequence has a particular state. E.g. population of different continents or countries. E.g.:"
"sequence at equilibrium has a particular state. E.g. population of different continents or countries. E.g.:"
"\n#country,weight\nmicronesia,0.1\n...")
# parser.add_argument('--migration', type=str, help="csv or tsv file with symmetric migration/transition rates "
# "between states. For example passenger flow.")
......@@ -79,7 +79,7 @@ if __name__=="__main__":
tmp_weights = pd.read_csv(params.weights, sep='\t' if params.weights[-3:]=='tsv' else ',',
skipinitialspace=True)
weights = {row[0]:row[1] for ri,row in tmp_weights.iterrows()}
mean_weight = np.mean(weights.values())
mean_weight = np.mean(list(weights.values()))
weights = np.array([weights[c] if c in weights else mean_weight for c in unique_states], dtype=float)
weights/=weights.sum()
else:
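The change from `np.mean(weights.values())` to `np.mean(list(weights.values()))` in this hunk reflects that `dict.values()` returns a view rather than a list on Python 3. A minimal sketch of the same fill-and-normalize logic in plain Python follows; the function name `normalized_weights` and the example data are hypothetical, and the original works on NumPy arrays rather than lists.

```python
def normalized_weights(weights, unique_states):
    """Give states without a weight the mean weight, then normalize to sum to 1."""
    # list(...) materializes the dict view so this behaves the same on Python 2 and 3.
    values = list(weights.values())
    mean_weight = sum(values) / float(len(values))
    filled = [weights.get(state, mean_weight) for state in unique_states]
    total = sum(filled)
    return [w / total for w in filled]


example = {"micronesia": 0.1, "oceania": 0.3}
print(normalized_weights(example, ["micronesia", "oceania", "europe"]))
```

Here `europe` has no entry, so it receives the mean of the known weights before normalization.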
......@@ -144,3 +144,5 @@ if __name__=="__main__":
outtree_name = bname+'.mugration.nexus'
Phylo.write(treeanc.tree, outtree_name, 'nexus')
print("Saved annotated tree as:",outtree_name)
sys.exit(0)
......@@ -4,6 +4,7 @@ import numpy as np
from treetime import TreeTime
from treetime.utils import DateConversion
from Bio import Phylo, AlignIO
import sys
if __name__=="__main__":
###########################################################################
......@@ -75,6 +76,9 @@ if __name__=="__main__":
'\nthe substitution rate. The rate needs to be positive!'
'\nNegative rates suggest an inappropriate root.\n\n')
print('\nThe estimated rate and tree correspond to a root date:\n')
print('\n--root-date:\t %3.2f\n\n'%(-d2d.intercept/d2d.clock_rate))
if not params.keep_root:
# write rerooted tree to file
outtree_name = base_name+'_rerooted.newick'
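The `--root-date` value printed in this hunk is the date at which the fitted root-to-tip regression line crosses zero divergence: assuming `d2d` holds the slope (`clock_rate`) and `intercept` of a fit `divergence = clock_rate * date + intercept`, setting the divergence to zero gives `date = -intercept / clock_rate`. The sketch below reproduces that arithmetic with an ordinary least-squares fit in plain Python; `fit_line`, `root_date`, and the synthetic data are hypothetical and not part of TreeTime.

```python
def fit_line(dates, divergences):
    """Ordinary least-squares fit of divergence = slope * date + intercept."""
    n = float(len(dates))
    mean_x = sum(dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, divergences))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def root_date(slope, intercept):
    """Date at which the fitted line reaches zero divergence."""
    if slope <= 0:
        # Mirrors the warning in the script: a non-positive rate has no meaningful root date.
        raise ValueError("the rate needs to be positive")
    return -intercept / slope


# Synthetic tips diverging at 0.002 substitutions per site per year since 1990.
dates = [2000.0, 2005.0, 2010.0, 2015.0]
divergences = [0.002 * (d - 1990.0) for d in dates]
slope, intercept = fit_line(dates, divergences)
print("%3.2f" % root_date(slope, intercept))
```

With noise-free synthetic data the recovered root date should be very close to 1990; on real data it depends on the choice of root, which is why the script warns about negative rates.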
......@@ -108,6 +112,7 @@ if __name__=="__main__":
plt.savefig(base_name+'_root_to_tip_regression.pdf')
print("--- root-to-tip plot saved to \n\t %s_root_to_tip_regression.pdf\n"%base_name)
sys.exit(0)
......@@ -4,6 +4,7 @@ import numpy as np
from treetime import TreeTime, GTR
from Bio import Phylo, AlignIO
from Bio import __version__ as bioversion
import sys
if __name__=="__main__":
###########################################################################
......@@ -135,7 +136,7 @@ if __name__=="__main__":
myTree.run(root=params.reroot, relaxed_clock=params.relax,
resolve_polytomies=(not params.keep_polytomies),
Tc=Tc, max_iter=params.max_iter,
use_input_branch_length = (not params.optimize_branch_length))
branch_lengths = 'joint' if params.optimize_branch_length else 'input')
###########################################################################
### OUTPUT and saving of results
......@@ -193,3 +194,5 @@ if __name__=="__main__":
Phylo.write(myTree.tree, outtree_name, 'nexus')
print("--- tree saved in nexus format as \n\t %s\n"%outtree_name)
sys.exit(0)
python-treetime (0.4.0-1) UNRELEASED; urgency=medium
* New upstream version
* Drop ancient X-Python-Version field
* Provide Python3 package, move tools also to Python3 package
-- Andreas Tille <tille@debian.org> Fri, 29 Jun 2018 17:06:03 +0200
python-treetime (0.2.4-1) unstable; urgency=medium
* d/watch: Point to github since upstream has started using release tags
......
......@@ -8,11 +8,12 @@ Build-Depends: debhelper (>= 11~),
dh-python,
python-all,
python-setuptools
python3-all,
python3-setuptools
Standards-Version: 4.1.4
Vcs-Browser: https://salsa.debian.org/med-team/python-treetime
Vcs-Git: https://salsa.debian.org/med-team/python-treetime.git
Homepage: https://github.com/neherlab/treetime
X-Python-Version: >= 2.6
Package: python-treetime
Architecture: all
......@@ -23,7 +24,7 @@ Depends: ${python:Depends},
python-scipy,
python-biopython,
python-pandas
Description: inference of time stamped phylogenies and ancestral reconstruction
Description: inference of time stamped phylogenies and ancestral reconstruction (Python 2)
TreeTime provides routines for ancestral sequence reconstruction and the
maximum likelihood inference of molecular-clock phylogenies, i.e., a tree
where all branches are scaled such that the locations of terminal nodes
......@@ -51,6 +52,48 @@ Description: inference of time stamped phylogenies and ancestral reconstruction
* inference of GTR models
* rerooting to obtain best root-to-tip regression
* auto-correlated relaxed molecular clock (with normal prior)
.
This package provides the Python 2 module.
Package: python3-treetime
Architecture: all
Section: python
Depends: ${python3:Depends},
${misc:Depends},
python3-numpy,
python3-scipy,
python3-biopython,
python3-pandas
Description: inference of time stamped phylogenies and ancestral reconstruction (Python 3)
TreeTime provides routines for ancestral sequence reconstruction and the
maximum likelihood inference of molecular-clock phylogenies, i.e., a tree
where all branches are scaled such that the locations of terminal nodes
correspond to their sampling times and internal nodes are placed at the
most likely time of divergence.
.
TreeTime aims at striking a compromise between sophisticated
probabilistic models of evolution and fast heuristics. It implements GTR
models of ancestral inference and branch length optimization, but takes
the tree topology as given. To optimize the likelihood of time-scaled
phylogenies, treetime uses an iterative approach that first infers
ancestral sequences given the branch length of the tree, then optimizes
the positions of unconstrained nodes on the time axis, and then repeats
this cycle. The only topology optimization are (optional) resolution of
polytomies in a way that is most (approximately) consistent with the
sampling time constraints on the tree. The package is designed to be
used as a stand-alone tool or as a library used in larger phylogenetic
analysis workflows.
.
Features
* ancestral sequence reconstruction (marginal and joint maximum
likelihood)
* molecular clock tree inference (marginal and joint maximum
likelihood)
* inference of GTR models
* rerooting to obtain best root-to-tip regression
* auto-correlated relaxed molecular clock (with normal prior)
.
This package provides the Python 3 module.
Package: python-treetime-examples
Architecture: all
......
usr/share/treetime/ancestral_reconstruction.py usr/bin/ancestral_reconstruction
usr/share/treetime/temporal_signal.py usr/bin/temporal_signal
usr/share/treetime/timetree_inference.py usr/bin/timetree_inference
......@@ -4,8 +4,10 @@
export PYBUILD_NAME = treetime
EXECDIR=debian/python3-$(PYBUILD_NAME)/usr/share/treetime
%:
dh $@ --with python2
dh $@ --with python2,python3 --buildsystem=pybuild
override_dh_install:
dh_install
......@@ -14,3 +16,9 @@ override_dh_install:
## enable dh_python finding dependency
##find debian/*/usr/share/treetime -name "*.py" -exec sed -i 's:#!/usr/bin/env python:#!/usr/bin/python:' \{\} \;
find debian/*/usr/lib/*/dist-packages/treetime -name "*.py" -exec sed -i 's:/usr/local/bin/python:/usr/bin/python:' \{\} \;
mkdir -p $(EXECDIR)
mv debian/python-$(PYBUILD_NAME)/usr/bin/* $(EXECDIR)
for py in `find debian/python3-treetime -name "*.py"` ; do \
dh_link -p python3-$(PYBUILD_NAME) usr/share/treetime/`basename $${py}` usr/bin/`basename $${py} .py` ; \
sed -i '1s+#!/usr/bin/python+&3+' $${py} ; \
done
......@@ -11,7 +11,8 @@ if __name__ == '__main__':
# load data
base_name = 'data/H3N2_NA_allyears_NA.20'
T = Phylo.read(base_name+".nwk", "newick")
T.root_with_outgroup(T.find_clades(lambda x:x.name.startswith('A/Scot')).next())
OG = list(T.find_clades(lambda x:x.name.startswith('A/Scot')))[0]
T.root_with_outgroup(OG)
# instantiate treetime
myTree = TreeAnc(gtr='Jukes-Cantor', tree = T, aln = base_name+'.fasta', verbose = 0)
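The `.next()` call replaced in this hunk is a Python 2 only method of the generator returned by `find_clades`. A small sketch of portable alternatives on a plain generator follows; `find_matching` and the sample names are hypothetical stand-ins for `T.find_clades(...)`.

```python
def find_matching(names, prefix):
    """Yield names starting with `prefix` lazily, as a clade search would."""
    for name in names:
        if name.startswith(prefix):
            yield name


names = ["A/Perth/16/2009", "A/Scotland/840/74", "A/Scotland/2/1993"]

# The built-in next() works on Python 2.6+ and 3 and replaces the .next() method.
first = next(find_matching(names, "A/Scot"))

# The approach used in the diff: materialize all matches, then take the first.
first_via_list = list(find_matching(names, "A/Scot"))[0]

# next() accepts a default, which avoids StopIteration when nothing matches.
no_match = next(find_matching(names, "B/"), None)

print(first, first_via_list, no_match)
```

Both forms pick the same element; `next()` stops at the first match, while `list(...)[0]` walks the whole tree first and raises `IndexError` if there is no match.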
......
......@@ -68,7 +68,7 @@ if __name__ == '__main__':
x = np.linspace(-0.1,0.05,1000)+ myTree.tree.root.time_before_present
Phylo.draw(tree, axes=axs[0], show_confidence=False)
offset = myTree.tree.root.time_before_present + myTree.tree.root.branch_length
cols = sns.color_palette()
cols = ['r', 'g', 'c', 'b', 'm', 'y'] #sns.color_palette()
depth = myTree.tree.depths()
for ni,node in enumerate(myTree.tree.find_clades()):
if (not node.is_terminal()):
......
......@@ -29,11 +29,11 @@ if __name__ == '__main__':
base_name = 'data/ebola'
dates = read_dates(base_name)
# instantiate treetime
ebola = TreeTime(gtr='Jukes-Cantor', tree = base_name+'.nwk',
ebola = TreeTime(gtr='Jukes-Cantor', tree = base_name+'.nwk', precision=1,
aln = base_name+'.fasta', verbose = 4, dates = dates)
# infer an ebola time tree while rerooting and resolving polytomies
ebola.run(root='best', relaxed_clock=False, max_iter=2,
ebola.run(root='best', relaxed_clock=False, max_iter=2, branch_length_mode='input',
resolve_polytomies=True, Tc='skyline', time_marginal="assign")
# scatter root to tip divergence vs sampling date
......
......@@ -58,7 +58,7 @@ if __name__ == '__main__':
format_axes(fig, axs)
# rerooting can be done along with the tree time inference
tt.run(root="best")
tt.run(root="best", branch_length_mode='input')
# if more complicated models (relaxed clocks, coalescent models) are to be used
# or you want to resolve polytomies, treetime needs to be run for
# several iterations, for example as
......