Commits on Source (7)
......@@ -2,4 +2,7 @@ language: cpp
compiler:
- gcc
- clang
before_script:
- pip install --upgrade --user pip
- pip install --user networkx
script: make; make dependencies; make test
\ No newline at end of file
Ragout
======
Version: 2.1.1
Version: 2.2
[![Build Status](https://travis-ci.org/fenderglass/Ragout.svg?branch=master)](https://travis-ci.org/fenderglass/Ragout)
......@@ -35,20 +35,21 @@ Manuals
- [Usage](docs/USAGE.md)
Authors
-------
- Mikhail Kolmogorov (St. Petersburg University of the Russian Academy of Sciences, UCSD)
- Pavel Avdeev (St. Petersburg University of the Russian Academy of Sciences)
- Dmitriy Meleshko (St. Petersburg University of the Russian Academy of Sciences)
- Son Pham (UCSD)
Code contributions
------------------
* Mikhail Kolmogorov (St. Petersburg University of the Russian Academy of Sciences, UCSD)
* Pavel Avdeev (St. Petersburg University of the Russian Academy of Sciences)
* Dmitriy Meleshko (St. Petersburg University of the Russian Academy of Sciences)
* Son Pham (UCSD)
* Tatiana Malygina
Publications
------------
- Kolmogorov et al., "Chromosome assembly of large and complex genomes using multiple references",
bioRxiv preprint, 2016
* Kolmogorov et al., "Chromosome assembly of large and complex genomes using multiple references",
Genome Research, 2018
- Kolmogorov et al., "Ragout: A reference-assisted assembly tool for bacterial genomes",
* Kolmogorov et al., "Ragout: A reference-assisted assembly tool for bacterial genomes",
Bioinformatics, 2014
......@@ -62,19 +63,20 @@ Acknowledgments
---------------
The work was partially supported by VP Foundation.
We would like to thank:
- Anna Liosnova (benchmarks and useful suggestions)
- Nikolay Vyahhi (testing and useful suggestions)
- Aleksey Gurevich (testing)
We would also like to thank:
* Anna Liosnova (benchmarks and useful suggestions)
* Nikolay Vyahhi (testing and useful suggestions)
* Aleksey Gurevich (testing)
Third-party
-----------
Ragout package includes some third-party software (see INSTALL.md for details)
Ragout uses some third-party software (see INSTALL.md for details):
* Networkx 1.8 Python library [http://networkx.github.io/]
* Newick 1.3 [http://www.daimi.au.dk/~mailund/newick.html]
* Networkx Python library [http://networkx.github.io/]
* Newick parser by Thomas Mailund [https://www.mailund.dk/]
* Sibelia [http://github.com/bioinf/Sibelia]
* HAL Tools [https://github.com/ComparativeGenomicsToolkit/hal]
License
......
......@@ -12,7 +12,6 @@ and invokes Ragout
import os
import sys
LIB_DIR = "lib"
BIN_DIR = "bin"
#Check Python version
......@@ -23,8 +22,6 @@ if sys.version_info[:2] != (2, 7):
#Setting executable paths
ragout_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
lib_absolute = os.path.join(ragout_root, LIB_DIR)
sys.path.insert(0, lib_absolute)
sys.path.insert(0, ragout_root)
bin_absolute = os.path.join(ragout_root, BIN_DIR)
......
ragout (2.2-1) UNRELEASED; urgency=medium
* New upstream version
* debhelper-compat 12
* Standards-Version: 4.4.0
* Use 2to3 to port to Python3
Closes: #938330
-- Andreas Tille <tille@debian.org> Thu, 05 Sep 2019 10:06:47 +0200
ragout (2.1.1+dfsg-1) unstable; urgency=medium
* Initial release (Closes: #925527)
......
......@@ -3,11 +3,11 @@ Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.
Uploaders: Andreas Tille <tille@debian.org>
Section: science
Priority: optional
Build-Depends: debhelper (>= 12~),
Build-Depends: debhelper-compat (= 12),
dh-python,
python-dev,
python-networkx (>= 2.2)
Standards-Version: 4.3.0
python3-dev,
python3-networkx (>= 2.2)
Standards-Version: 4.4.0
Vcs-Browser: https://salsa.debian.org/med-team/ragout
Vcs-Git: https://salsa.debian.org/med-team/ragout.git
Homepage: https://github.com/fenderglass/Ragout/
......@@ -15,10 +15,10 @@ Homepage: https://github.com/fenderglass/Ragout/
Package: ragout
Architecture: any
Depends: ${shlibs:Depends},
${python:Depends},
${python3:Depends},
${misc:Depends},
sibelia,
python-networkx (>= 2.2)
python3-networkx (>= 2.2)
Description: Reference-Assisted Genome Ordering UTility
Ragout (Reference-Assisted Genome Ordering UTility) is a tool for
chromosome-level scaffolding using multiple references. Given initial
......@@ -37,7 +37,7 @@ Description: Reference-Assisted Genome Ordering UTility
Package: ragout-examples
Architecture: all
Depends: ${shlibs:Depends},
${python:Depends},
${python3:Depends},
${misc:Depends}
Description: Reference-Assisted Genome Ordering UTility (example data)
Ragout (Reference-Assisted Genome Ordering UTility) is a tool for
......
From d49a5c46f73c1a4a2bfcbf5bbe41b9d2dab5f5f0 Mon Sep 17 00:00:00 2001
From: Tatiana Malygina <merlettaia@gmail.com>
Date: Mon, 25 Mar 2019 16:09:21 +0300
Origin: https://github.com/fenderglass/Ragout/pull/41
Subject: [PATCH] replace edges_iter iterator with edges iterator in networkx
graphs
---
ragout/breakpoint_graph/breakpoint_graph.py | 2 +-
ragout/breakpoint_graph/chimera_detector.py | 2 +-
ragout/breakpoint_graph/inferer.py | 4 ++--
ragout/scaffolder/merge_iters.py | 4 ++--
scripts/debug-report.py | 6 +++---
5 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/ragout/breakpoint_graph/breakpoint_graph.py b/ragout/breakpoint_graph/breakpoint_graph.py
index c9aa6c2..e73afd9 100644
--- a/ragout/breakpoint_graph/breakpoint_graph.py
+++ b/ragout/breakpoint_graph/breakpoint_graph.py
@@ -272,7 +272,7 @@ def _output_graph(graph, out_file):
"""
with open(out_file, "w") as fout:
fout.write("graph {\n")
- for v1, v2, data in graph.edges_iter(data=True):
+ for v1, v2, data in graph.edges(data=True):
fout.write("{0} -- {1}".format(v1, v2))
if len(data):
extra = list(map(lambda (k, v) : "{0}=\"{1}\"".format(k, v),
diff --git a/ragout/breakpoint_graph/chimera_detector.py b/ragout/breakpoint_graph/chimera_detector.py
index d72440f..06b8f08 100644
--- a/ragout/breakpoint_graph/chimera_detector.py
+++ b/ragout/breakpoint_graph/chimera_detector.py
@@ -98,7 +98,7 @@ def _get_contig_breaks(self, bp_graph):
logger.debug("Processing component of size {0}"
.format(len(subgr.bp_graph)))
- for (u, v, data) in subgr.bp_graph.edges_iter(data=True):
+ for (u, v, data) in subgr.bp_graph.edges(data=True):
if data["genome_id"] != subgr.target:
continue
diff --git a/ragout/breakpoint_graph/inferer.py b/ragout/breakpoint_graph/inferer.py
index a2c8134..7cdc5c3 100644
--- a/ragout/breakpoint_graph/inferer.py
+++ b/ragout/breakpoint_graph/inferer.py
@@ -115,7 +115,7 @@ def _trim_known_edges(self, graph):
Removes edges with known target adjacencies (red edges from paper)
"""
trimmed_graph = graph.copy()
- for v1, v2 in graph.edges_iter():
+ for v1, v2 in graph.edges():
if not trimmed_graph.has_node(v1) or not trimmed_graph.has_node(v2):
continue
@@ -142,7 +142,7 @@ def _min_weight_matching(graph):
"""
Finds a perfect matching with minimum weight
"""
- for v1, v2 in graph.edges_iter():
+ for v1, v2 in graph.edges():
graph[v1][v2]["weight"] = -graph[v1][v2]["weight"] #want minimum weght
MIN_LOG_SIZE = 20
diff --git a/ragout/scaffolder/merge_iters.py b/ragout/scaffolder/merge_iters.py
index fd6c9cd..66be623 100644
--- a/ragout/scaffolder/merge_iters.py
+++ b/ragout/scaffolder/merge_iters.py
@@ -179,7 +179,7 @@ def project(self):
red_edges = []
black_edges = []
- for (u, v, data) in subgr.edges_iter(data=True):
+ for (u, v, data) in subgr.edges(data=True):
if data["scf_set"] == "old":
red_edges.append((u, v))
else:
@@ -201,7 +201,7 @@ def project(self):
logger.debug("Made {0} k-breaks".format(num_kbreaks))
adjacencies = {}
- for (u, v, data) in self.bp_graph.edges_iter(data=True):
+ for (u, v, data) in self.bp_graph.edges(data=True):
if data["scf_set"] == "old":
gap, support = 0, []
if not data["infinity"]:
diff --git a/scripts/debug-report.py b/scripts/debug-report.py
index 017d3c8..e118f46 100755
--- a/scripts/debug-report.py
+++ b/scripts/debug-report.py
@@ -113,11 +113,11 @@ def compose_breakpoint_graph(base_dot, predicted_dot, true_edges):
predicted_edges = nx.read_dot(predicted_dot)
out_graph = nx.MultiGraph()
- for v1, v2, data in base_graph.edges_iter(data=True):
+ for v1, v2, data in base_graph.edges(data=True):
color = g2c(data["genome_id"])
label = "oo" if data["infinity"] == "True" else ""
out_graph.add_edge(v1, v2, color=color, label=label)
- for v1, v2 in predicted_edges.edges_iter():
+ for v1, v2 in predicted_edges.edges():
out_graph.add_edge(v1, v2, color="red", style="dashed")
for (v1, v2, infinite) in true_edges:
label = "oo" if infinite else ""
@@ -140,7 +140,7 @@ def output_graph(graph, output_dir, only_predicted):
if only_predicted:
to_show = False
- for v1, v2, data in subgr.edges_iter(data=True):
+ for v1, v2, data in subgr.edges(data=True):
if data.get("style") == "dashed":
to_show = True
break
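For context, this patch tracks a NetworkX 2.x API change: `Graph.edges_iter()` was removed and `Graph.edges` / `Graph.edges()` became the single, lazy edge interface. A minimal sketch of the difference on a toy graph (node names and attributes below are illustrative only, not Ragout's own data):

```python
import networkx as nx

g = nx.MultiGraph()
g.add_edge("A", "B", genome_id="target", infinity=False)
g.add_edge("B", "C", genome_id="ref1", infinity=True)

# NetworkX 1.x offered a separate lazy iterator:
#     for u, v, data in g.edges_iter(data=True): ...
# NetworkX 2.x: edges() / the .edges view is already lazy,
# so the same loop simply drops the *_iter suffix.
for u, v, data in g.edges(data=True):
    print(u, v, data["genome_id"])

# .edges is an EdgeView, so len() also works directly
# (this is what the later graph.edges() -> graph.edges change relies on).
print(len(g.edges))
```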
From e3a06ef1256cf132903bbf19c9d4e11ea846e6df Mon Sep 17 00:00:00 2001
From: Tatiana Malygina <merlettaia@gmail.com>
Date: Mon, 25 Mar 2019 16:19:40 +0300
Origin: https://github.com/fenderglass/Ragout/pull/41
Subject: [PATCH] fix TypeError: object of type 'dictionary-keyiterator' has no
len()
---
ragout/scaffolder/merge_iters.py | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/ragout/scaffolder/merge_iters.py b/ragout/scaffolder/merge_iters.py
index 66be623..3536f5c 100644
--- a/ragout/scaffolder/merge_iters.py
+++ b/ragout/scaffolder/merge_iters.py
@@ -174,7 +174,7 @@ def project(self):
subgraphs = list(nx.connected_component_subgraphs(self.bp_graph))
for subgr in subgraphs:
#this is a cycle
- if any(len(subgr.neighbors(node)) != 2 for node in subgr.nodes()):
+ if any(len(subgr[node]) != 2 for node in subgr.nodes()):
continue
red_edges = []
@@ -193,6 +193,7 @@ def project(self):
self.bp_graph.remove_edge(u, v)
self.adj_graph.remove_edge(u, v)
for u, v in black_edges:
+ print(self.bp_graph[u][v])
link = self.bp_graph[u][v][0]["link"]
infinity = self.bp_graph[u][v][0]["infinity"]
self.bp_graph.add_edge(u, v, scf_set="old",
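The root cause here is that NetworkX 2.x returns an iterator from `Graph.neighbors()`, so calling `len()` on it raises the TypeError above, while indexing the graph (`subgr[node]`) yields an adjacency view whose length is the node's degree. A small sketch of the equivalent check on a toy graph (illustrative only, not Ragout's breakpoint graph):

```python
import networkx as nx

# A 3-cycle: every node has exactly two neighbors.
g = nx.Graph([("a", "b"), ("b", "c"), ("c", "a")])

# NetworkX 2.x: neighbors() returns an iterator, so
# len(g.neighbors(node)) raises "object of type ... has no len()".
# g[node] is a dict-like adjacency view, so len() gives the degree.
is_cycle = all(len(g[node]) == 2 for node in g.nodes)
print(is_cycle)  # True
```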
From f4d4e2223a337abdbc5873fd6560e1494e55342d Mon Sep 17 00:00:00 2001
From: Tatiana Malygina <merlettaia@gmail.com>
Date: Mon, 25 Mar 2019 17:08:08 +0300
Origin: https://github.com/fenderglass/Ragout/pull/41
Subject: [PATCH] fix AttributeError: 'set' object has no attribute 'items'
---
ragout/breakpoint_graph/inferer.py | 2 +-
ragout/breakpoint_graph/repeat_resolver.py | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/ragout/breakpoint_graph/inferer.py b/ragout/breakpoint_graph/inferer.py
index 7cdc5c3..eb47032 100644
--- a/ragout/breakpoint_graph/inferer.py
+++ b/ragout/breakpoint_graph/inferer.py
@@ -151,7 +151,7 @@ def _min_weight_matching(graph):
"size {0}".format(len(graph)))
edges = nx.max_weight_matching(graph, maxcardinality=True)
unique_edges = set()
- for v1, v2 in edges.items():
+ for v1, v2 in edges:
if not (v2, v1) in unique_edges:
unique_edges.add((v1, v2))
diff --git a/ragout/breakpoint_graph/repeat_resolver.py b/ragout/breakpoint_graph/repeat_resolver.py
index b3054e5..a20ea11 100644
--- a/ragout/breakpoint_graph/repeat_resolver.py
+++ b/ragout/breakpoint_graph/repeat_resolver.py
@@ -362,7 +362,7 @@ def _profile_similarity(profile, genome_ctx, repeats, same_len):
def _max_weight_matching(graph):
edges = nx.max_weight_matching(graph, maxcardinality=True)
unique_edges = set()
- for v1, v2 in edges.items():
+ for v1, v2 in edges:
if not (v2, v1) in unique_edges:
unique_edges.add((v1, v2))
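This pair of fixes follows from another NetworkX 2.x change: `max_weight_matching()` previously returned a dict mapping each matched node to its mate (so every pair appeared twice), whereas newer versions return a set of edge tuples. A hedged sketch of the deduplication the patched code performs, on a toy graph (weights and node labels are illustrative):

```python
import networkx as nx

g = nx.Graph()
g.add_edge(1, 2, weight=5)
g.add_edge(2, 3, weight=1)
g.add_edge(3, 4, weight=5)

matching = nx.max_weight_matching(g, maxcardinality=True)
# NetworkX 1.x returned {1: 2, 2: 1, 3: 4, 4: 3} -> iterated with .items()
# NetworkX 2.x returns {(1, 2), (3, 4)} (tuple orientation is arbitrary),
# so the loop iterates over the set directly.
unique_edges = set()
for v1, v2 in matching:
    if (v2, v1) not in unique_edges:
        unique_edges.add((v1, v2))
print(unique_edges)
```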
Author: Andreas Tille <tille@debian.org>
Last-Update: Thu, 14 Feb 2019 14:52:05 +0100
Description: Make sure newick is installed as well
Remark: This needs a hack in debian/rules - no idea how to do this more elegantly
--- a/setup.py
+++ b/setup.py
@@ -37,7 +37,8 @@ setup(name='ragout',
packages=['ragout', 'ragout/assembly_graph', 'ragout/breakpoint_graph',
'ragout/maf2synteny', 'ragout/overlap', 'ragout/parsers',
'ragout/phylogeny', 'ragout/scaffolder', 'ragout/shared',
- 'ragout/synteny_backend'],
+ 'ragout/synteny_backend',
+ 'newick'],
package_data={'ragout': ['LICENSE']},
scripts = ['bin/ragout-maf2synteny', 'bin/ragout-overlap', 'bin/ragout'],
cmdclass={'build': MakeBuild}
install_newick.patch
d49a5c46f73c1a4a2bfcbf5bbe41b9d2dab5f5f0.patch
e3a06ef1256cf132903bbf19c9d4e11ea846e6df.patch
f4d4e2223a337abdbc5873fd6560e1494e55342d.patch
2to3.patch
......@@ -8,15 +8,7 @@ include /usr/share/dpkg/default.mk
export DEB_BUILD_MAINT_OPTIONS=hardening=+all
%:
dh $@ --with python2 --buildsystem=pybuild
override_dh_auto_clean:
dh_auto_clean
if [ -L newick ] ; then rm newick ; fi
override_dh_auto_configure:
ln -s lib/newick .
dh_auto_configure
dh $@ --with python3 --buildsystem=pybuild
override_dh_install:
dh_install
......
......@@ -26,29 +26,44 @@ Runtime Dependencies
* Python 2.7
* Sibelia [http://github.com/bioinf/Sibelia]
* HAL Tools [https://github.com/glennhickey/hal] (alternatively to Sibelia)
* python-networkx >= 2.2
* HAL Tools (optionally) [https://github.com/ComparativeGenomicsToolkit/hal]
Building
--------
Local installation
------------------
To build Ragout binaries, type:
If you don't want to use the Bioconda release, you can build a clone
of the Ragout repository and run it locally without installing it
into your system. To do this, run:
git clone https://github.com/fenderglass/Ragout.git
cd Ragout
python setup.py build
pip install -r requirements.txt --user
python scripts/install-sibelia.py
You will also need either Sibelia or HAL Tools installed
This will also build and install Sibelia and all Python dependencies.
See below for HAL installation instructions.
To build and install Sibelia, use:
Once built, you can invoke Ragout from the cloned directory by running:
python scripts/install-sibelia.py
bin/ragout
If you already have Sibelia installed into your system, it will
be picked up automatically.
System installation
-------------------
Optionally, you may install Ragout into your system by typing:
To integrate Ragout into your system, run:
git clone https://github.com/fenderglass/Ragout.git
cd Ragout
python setup.py build
python setup.py install
This assumes that you already have the python-networkx package
installed on your system (via the respective package manager).
Sibelia / HAL Tools should also be installed / integrated separately.
HAL Tools
---------
......@@ -57,7 +72,7 @@ HAL alignment produced by Progressive Cactus could be used for synteny
blocks decomposition instead of Sibelia (recommended for large genomes).
If you want to use HAL alignment as input,
you need to install HAL Tools package [https://github.com/glennhickey/hal]
you need to install HAL Tools package [https://github.com/ComparativeGenomicsToolkit/hal]
as it is described in the manual. Do not forget to properly set PATH and PYTHONPATH
environment variables.
......@@ -68,8 +83,8 @@ Troubleshooting
Q: Many compilation errors, possibly with
"unrecognized command line option '-std=c++0x'" message:
A: Probably your compiler is too old and does not support C++0x. Minimum required
versions of GCC and Clang are given in the beginning of this document.
A: Probably your compiler is too old and does not support C++0x. Make
sure you have at least GCC 4.6+ / Clang 3.2+.
Q: "libstdc++.so.6: version `CXXABI_1.3.5' not found" or similar error when running
......
__version__ = "2.1.1"
__version__ = "2.2"
......@@ -53,7 +53,7 @@ def _load_dot(filename):
def _check_overaps_number(graph, contigs_fasta):
rate = float(len(graph.edges())) / len(contigs_fasta)
rate = float(len(graph.edges)) / len(contigs_fasta)
if rate < config.vals["min_overlap_rate"]:
logger.warning("Too few overlaps ({0}) between contigs were detected "
"-- refine procedure will be useless. Possible reasons:"
......@@ -61,7 +61,7 @@ def _check_overaps_number(graph, contigs_fasta):
"2. Contigs overlap not on a constant value "
"(like k-mer for assemblers which use debruijn graph)\n"
"3. Contigs ends are trimmed/postprocessed\n"
.format(len(graph.edges())))
.format(len(graph.edges)))
def _insert_from_graph(graph, scaffolds_in, max_path_len, contigs_fasta):
......
......@@ -101,9 +101,9 @@ class BreakpointGraph(object):
"""
assert len(self.bp_graph) >= 2
g = nx.Graph()
g.add_nodes_from(self.bp_graph.nodes())
g.add_nodes_from(self.bp_graph.nodes)
for node in self.bp_graph.nodes():
for node in self.bp_graph.nodes:
adjacencies = {}
for neighbor in self.bp_graph.neighbors(node):
for edge in self.bp_graph[node][neighbor].values():
......@@ -272,7 +272,7 @@ def _output_graph(graph, out_file):
"""
with open(out_file, "w") as fout:
fout.write("graph {\n")
for v1, v2, data in graph.edges_iter(data=True):
for v1, v2, data in graph.edges(data=True):
fout.write("{0} -- {1}".format(v1, v2))
if len(data):
extra = list(map(lambda (k, v) : "{0}=\"{1}\"".format(k, v),
......
......@@ -98,7 +98,7 @@ class ChimeraDetector(object):
logger.debug("Processing component of size {0}"
.format(len(subgr.bp_graph)))
for (u, v, data) in subgr.bp_graph.edges_iter(data=True):
for (u, v, data) in subgr.bp_graph.edges(data=True):
if data["genome_id"] != subgr.target:
continue
......@@ -130,7 +130,7 @@ class ChimeraDetector(object):
"""
assert len(bp_graph) == 4
red_1, red_2 = red_edge
cand_1, cand_2 = tuple(set(bp_graph.nodes()) - set(red_edge))
cand_1, cand_2 = tuple(set(bp_graph.nodes) - set(red_edge))
if abs(cand_1) == abs(cand_2):
return False
......
......@@ -76,7 +76,7 @@ class AdjacencyInferer(object):
"""
adjacency = subgraph.to_weighted_graph(self.phylogeny)
trimmed_graph = self._trim_known_edges(adjacency)
unused_nodes = set(trimmed_graph.nodes())
unused_nodes = set(trimmed_graph.nodes)
chosen_edges = []
for trim_subgraph in nx.connected_component_subgraphs(trimmed_graph):
......@@ -84,8 +84,8 @@ class AdjacencyInferer(object):
continue
if len(trim_subgraph) == 2:
chosen_edges.append(tuple(trim_subgraph.nodes()))
for n in trim_subgraph.nodes():
chosen_edges.append(tuple(trim_subgraph.nodes))
for n in trim_subgraph.nodes:
unused_nodes.remove(n)
continue
......@@ -115,7 +115,7 @@ class AdjacencyInferer(object):
Removes edges with known target adjacencies (red edges from paper)
"""
trimmed_graph = graph.copy()
for v1, v2 in graph.edges_iter():
for v1, v2 in graph.edges:
if not trimmed_graph.has_node(v1) or not trimmed_graph.has_node(v2):
continue
......@@ -142,7 +142,7 @@ def _min_weight_matching(graph):
"""
Finds a perfect matching with minimum weight
"""
for v1, v2 in graph.edges_iter():
for v1, v2 in graph.edges:
graph[v1][v2]["weight"] = -graph[v1][v2]["weight"] #want minimum weght
MIN_LOG_SIZE = 20
......@@ -151,7 +151,7 @@ def _min_weight_matching(graph):
"size {0}".format(len(graph)))
edges = nx.max_weight_matching(graph, maxcardinality=True)
unique_edges = set()
for v1, v2 in edges.items():
for v1, v2 in edges:
if not (v2, v1) in unique_edges:
unique_edges.add((v1, v2))
......
......@@ -362,7 +362,7 @@ def _profile_similarity(profile, genome_ctx, repeats, same_len):
def _max_weight_matching(graph):
edges = nx.max_weight_matching(graph, maxcardinality=True)
unique_edges = set()
for v1, v2 in edges.items():
for v1, v2 in edges:
if not (v2, v1) in unique_edges:
unique_edges.add((v1, v2))
......