Commits on Source (6)
......@@ -217,3 +217,7 @@
proteinortho_grab_proteins.pl speedup for -exact and a given proteinortho file
proteinortho6.pl replaced chomp with s/[\r\n]+$//
proteinortho_clustering.cpp fix bug that only uses lapack if -pld is set, regardless of the value.
11. Sept (uid: 3813)
updated shebang of ffadj such that python2.7 is used directly (ffadj fails if called with higher version of python)
-p=blastp is now alias of blastp+ and legacy blast is now -p=blastp_legacy (blastn is equivalent)
Makefile: static now includes -lquadmath
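The `chomp` replacement listed above matters mostly for input files with Windows line endings: Perl's `chomp` removes only the input record separator (a newline by default), so a trailing carriage return survives it. The following is a minimal Python sketch of that difference, not the project's Perl code; the function names `chomp_like` and `strip_eol` and the sample line are hypothetical.

```python
import re

def chomp_like(line):
    # Mimics Perl's default chomp: drop a single trailing "\n" only.
    return line[:-1] if line.endswith("\n") else line

def strip_eol(line):
    # Same pattern as the changelog entry: remove every trailing CR/LF.
    return re.sub(r"[\r\n]+$", "", line)

crlf_line = "gene1\tgene2\r\n"
print(repr(chomp_like(crlf_line)))  # the carriage return is still attached
print(repr(strip_eol(crlf_line)))   # both CR and LF are removed
```

A stray carriage return at the end of an identifier can make otherwise identical names compare as different, which is presumably the kind of problem the change guards against.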
......@@ -132,7 +132,7 @@ ifeq ($(USELAPACK),TRUE)
ifeq ($(USEPRECOMPILEDLAPACK),TRUE)
ifeq ($(STATIC),TRUE)
@echo "[ 20%] Building **proteinortho_clustering** with LAPACK (static linking)";
@$(CXX) $(CXXFLAGS) $(CXXFLAGS_PO) -fopenmp -o $@ $< $(LDFLAGS) $(LDLIBS) -static -Wl,--allow-multiple-definition -llapack -lblas -lgfortran -pthread -Wl,--whole-archive -lpthread -Wl,--no-whole-archive && ([ $$? -eq 0 ] ) || ( \
@$(CXX) $(CXXFLAGS) $(CXXFLAGS_PO) -fopenmp -o $@ $< $(LDFLAGS) $(LDLIBS) -static -Wl,--allow-multiple-definition -llapack -lblas -lgfortran -lquadmath -pthread -Wl,--whole-archive -lpthread -Wl,--no-whole-archive && ([ $$? -eq 0 ] ) || ( \
echo "......$(ORANGE)static linking failed, now I try dynamic linking.$(NC)"; \
$(CXX) $(CXXFLAGS) $(CXXFLAGS_PO) -fopenmp -o $@ $< $(LDFLAGS) $(LDLIBS) -llapack -lblas -pthread -Wl,--whole-archive -lpthread -Wl,--no-whole-archive && ([ $$? -eq 0 ] && echo "......OK dynamic linking was successful for proteinortho_clustering!";) || ( \
echo "......$(ORANGE)dynamic linking failed too, now I try dynamic linking without -WL,-whole-archive (this should now work for OSX).$(NC)"; \
......
......@@ -39,8 +39,8 @@ You can also send a mail to lechner@staff.uni-marburg.de.
# Installation
**Proteinortho comes with precompiled binaries of all executables (Linux/x86) so just run the proteinortho6.pl in the downloaded directory.**
You could also move all executables to your favorite bin directory (e.g. with make install PREFIX=/home/paul/bin).
**Proteinortho comes with precompiled binaries of all executables (Linux/x86) so you should be able to run perl proteinortho6.pl in the downloaded directory.**
You could also move all executables to your favorite directory (e.g. with make install PREFIX=/home/paul/bin).
If you cannot execute the src/BUILD/Linux_x86_64/proteinortho_clustering, then you have to recompile with make, see the section 2. Building and installing proteinortho from source.
<br>
......@@ -73,6 +73,21 @@ If you need brew (see [here](https://brew.sh/index_de))
<br>
#### Easy installation with dpkg (root privileges are required)
The deb package can be downloaded here: [https://packages.debian.org/unstable/proteinortho](https://packages.debian.org/unstable/proteinortho).
Afterwards the deb package can be installed with `sudo dpkg -i proteinortho*deb`.
<br>
#### *(Easy installation with apt-get)*
**! Disclaimer: Work in progress !**
*proteinortho will be released to stable with Debian 11 (~2021), then proteinortho can be installed with `sudo apt-get install proteinortho` (currently this installs the outdated version v5.16b)*
<br>
#### 1. Prerequisites
Proteinortho uses standard software which is often installed already or is part of the package repositories and can thus easily be installed. The sources come with a precompiled version of Proteinortho for 64bit Linux.
......@@ -94,7 +109,7 @@ Proteinortho uses standard software which is often installed already or is part
- mmseqs2 (conda install mmseqs2, https://github.com/soedinglab/MMseqs2)
- Perl v5.08 or higher (to test this, type perl -v in the command line)
- Python v2.6.0 or higher to include synteny analysis (to test this, type 'python -V' in the command line)
- Perl modules: Thread::Queue, File::Basename, Pod::Usage, threads (if you miss one just install with `cpan install Thread::Queue` )
- Perl standard modules (these should come with Perl): Thread::Queue, File::Basename, Pod::Usage, threads (if you miss one just install with `cpan install ...` )
</details>
<br>
......@@ -113,9 +128,9 @@ Proteinortho uses standard software which is often installed already or is part
#### 2. Building and installing proteinortho from source (linux and osx)
Here you <i>can</i> use a working lapack library, check this with 'dpkg --get-selections | grep lapack'. Install lapack e.g. with 'apt-get install libatlas3-base' or liblapack3.
Here you can use a working lapack library, check this with 'dpkg --get-selections | grep lapack'. Install lapack e.g. with 'apt-get install libatlas3-base' or liblapack3.
If you dont have one (or you have no root permissions), then 'make' will automatically compile a lapack (v3.8.0) for you !
If you don't have Lapack, then 'make' will automatically compile Lapack v3.8.0 for you!
Fetch the latest source code archive downloaded from <a href="https://gitlab.com/paulklemm_PHD/proteinortho/-/archive/master/proteinortho-master.zip">here</a>
<details> <summary>or from here (Click to expand)</summary>
......@@ -281,7 +296,7 @@ Open `proteinorthoHelper.html` in your favorite browser or visit [lechnerlab.de/
<details>
<summary>show all algorithms (Click to expand)</summary>
- blastn,blastp,tblastx : legacy blast family (shell commands: blastall -) family. The suffix 'n' or 'p' indicates nucleotide or protein input files.
- blastn_legacy,blastp_legacy,tblastx_legacy : legacy blast family (shell commands: blastall -) family. The suffix 'n' or 'p' indicates nucleotide or protein input files.
- blastn+,blastp+,tblastx+ : standard blast family (shell commands: blastn,blastp,tblastx)
family. The suffix 'n' or 'p' indicates nucleotide or protein input files.
......
proteinortho (6.0.7+dfsg-1) UNRELEASED; urgency=medium
* New upstream version
* debhelper-compat 12
* Use 2to3 to port to Python3
-- Andreas Tille <tille@debian.org> Mon, 16 Sep 2019 12:41:53 +0200
proteinortho (6.0.6+dfsg-1) unstable; urgency=medium
[ Paul Klemm ]
......
......@@ -3,7 +3,7 @@ Maintainer: Debian Med Packaging Team <debian-med-packaging@lists.alioth.debian.
Uploaders: Andreas Tille <tille@debian.org>
Section: science
Priority: optional
Build-Depends: debhelper (>= 12~),
Build-Depends: debhelper-compat (= 12),
ncbi-blast+,
liblapack-dev | libatlas-base-dev | liblapack.so,
diamond-aligner
......
Description: Use 2to3 to port to Python3
Author: Andreas Tille <tille@debian.org>
Last-Update: Mon, 16 Sep 2019 12:41:53 +0200
--- a/src/proteinortho_ffadj_mcs.py
+++ b/src/proteinortho_ffadj_mcs.py
@@ -1,9 +1,9 @@
-#!/usr/bin/env python2.7
+#!/usr/bin/python3
-from sys import stdout, stderr, exit, argv, maxint
+from sys import stdout, stderr, exit, argv, maxsize
from copy import deepcopy
from bisect import bisect
-from itertools import izip, product
+from itertools import product
from os.path import basename, dirname
from random import randint
from math import ceil
@@ -46,7 +46,7 @@ class Run:
adjTerm = 0
if len(self.weight) > 1:
adjTerm = sum([self.weight[i] * self.weight[i+1] for i in
- xrange(len(self.weight)-1)])
+ range(len(self.weight)-1)])
edgeTerm = sum([w **2 for w in self.weight])
# edgeTerm = max(self.weight)**2
return alpha * adjTerm + (1-alpha) * edgeTerm
@@ -101,9 +101,9 @@ def readDistsAndOrder(data, edgeThreshol
if edgeWeight < edgeThreshold:
continue
- if not g1_chromosomes.has_key(chr1):
+ if chr1 not in g1_chromosomes:
g1_chromosomes[chr1] = set()
- if not g2_chromosomes.has_key(chr2):
+ if chr2 not in g2_chromosomes:
g2_chromosomes[chr2] = set()
g1_chromosomes[chr1].add(g1)
@@ -124,19 +124,19 @@ def readDistsAndOrder(data, edgeThreshol
# add telomeres
for t1, t2 in product(tel1, tel2):
- if not res.has_key(t1):
+ if t1 not in res:
res[t1] = dict()
res[t1][t2] = (DIRECTION_BOTH_STRANDS, 1)
-# res[maxint] = dict([
-# (maxint, (DIRECTION_WATSON_STRAND, 1)),
+# res[maxsize] = dict([
+# (maxsize, (DIRECTION_WATSON_STRAND, 1)),
# (0, (DIRECTION_WATSON_STRAND, 1)),
-# (maxint, (DIRECTION_CRICK_STRAND, 1)),
+# (maxsize, (DIRECTION_CRICK_STRAND, 1)),
# (0, (DIRECTION_CRICK_STRAND, 1))])
-# res[maxint] = dict([
-# (maxint, (DIRECTION_WATSON_STRAND, 1)),
+# res[maxsize] = dict([
+# (maxsize, (DIRECTION_WATSON_STRAND, 1)),
# (0, (DIRECTION_WATSON_STRAND, 1)),
-# (maxint, (DIRECTION_CRICK_STRAND, 1)),
+# (maxsize, (DIRECTION_CRICK_STRAND, 1)),
# (0, (DIRECTION_CRICK_STRAND, 1))])
return hasMultipleChromosomes, g1, g2, res
@@ -148,20 +148,20 @@ def establish_linear_genome_order(chromo
g.append((k, -1))
telomeres.add((k, -1))
g.extend([(k, i) for i in sorted(chromosomes[k])])
- g.append((k, maxint))
- telomeres.add((k, maxint))
+ g.append((k, maxsize))
+ telomeres.add((k, maxsize))
return telomeres, g
def insertIntoRunList(runs, runList):
- keys = map(lambda x: x.getWeight(alpha), runList)
+ keys = [x.getWeight(alpha) for x in runList]
for run in runs:
i = bisect(keys, run.getWeight(alpha))
keys.insert(i, run.getWeight(alpha))
runList.insert(i, run)
def checkMatching(g1, g2, g1_runs, g2_runs, runs, dist):
- g1pos = dict(izip(g1, xrange(len(g1))))
- g2pos = dict(izip(g2, xrange(len(g2))))
+ g1pos = dict(zip(g1, range(len(g1))))
+ g2pos = dict(zip(g2, range(len(g2))))
if len(g1) != len(g2):
@@ -177,7 +177,7 @@ def checkMatching(g1, g2, g1_runs, g2_ru
r_counter = 0
prev_run = None
c_adj = 0
- for i in xrange(len(g1)):
+ for i in range(len(g1)):
if not g1_runs[i]:
logging.error('Gene %s is not included in any run' %g1[i])
continue
@@ -213,7 +213,7 @@ def checkMatching(g1, g2, g1_runs, g2_ru
missing_runs = all_included.symmetric_difference(runs)
if missing_runs:
logging.error(('Additional runs in runslist that are not part in the' + \
- ' matching: %s') %(map(str, missing_runs)))
+ ' matching: %s') %(list(map(str, missing_runs))))
logging.info('Number of adjacencies is %s in matching of size %s.' %(c_adj,
len(g1)))
@@ -222,7 +222,7 @@ def checkMatching(g1, g2, g1_runs, g2_ru
logging.error(('Sum of run lengths does not equal matching size! Sum ' + \
'of run lengths: %s, matching size: %s') % (r_counter, len(g1)))
- for j in xrange(len(g2)):
+ for j in range(len(g2)):
if not g2_runs[j]:
logging.error('Gene %s is not included in any run' %g2[j])
if len(g2_runs[j]) > 1:
@@ -262,8 +262,8 @@ def checkMatching(g1, g2, g1_runs, g2_ru
'Weights: %s, run length: %s, run: %s') %(len(r.weight),
g1pos[r.endG1] - g1pos[r.startG1], r))
- g1_chromosomes = set(map(lambda x: x[0], g1[g1pos[r.startG1]:g1pos[r.endG1]+1]))
- g2_chromosomes = set(map(lambda x: x[0], g2[g2pos[r.startG2]:g2pos[r.endG2]+1]))
+ g1_chromosomes = set([x[0] for x in g1[g1pos[r.startG1]:g1pos[r.endG1]+1]])
+ g2_chromosomes = set([x[0] for x in g2[g2pos[r.startG2]:g2pos[r.endG2]+1]])
if len(g1_chromosomes) != 1 and len(g2_chromosomes) != 1:
logging.error(('Number of chromosomes on G1 (#chrs: %s) or G2 ' + \
'(#chrs: %s) in run %s is not 1 (Meaning that possibly' + \
@@ -281,7 +281,7 @@ def checkMatching(g1, g2, g1_runs, g2_ru
run_ends[r.startG1] = (r.direction, r.endG2)
run_ends[r.endG1] = (r.direction, r.startG2)
- for i in xrange(len(g1)-1):
+ for i in range(len(g1)-1):
g1i = g1[i]
g1i2 = g1[i+1]
if g1i in run_ends and g1i2 in run_ends and run_ends[g1i][0] == \
@@ -290,13 +290,13 @@ def checkMatching(g1, g2, g1_runs, g2_ru
g2i = run_ends[g1i][1]
g2i2 = run_ends[g1i2][1]
if direction == DIRECTION_CRICK_STRAND and g2pos[g2i] == g2pos[g2i2]-1:
- logging.error('Runs %s and %s could be merged, but are not!' % (map(str, g1_runs[i])[0], map(str, g1_runs[i+1])[0]))
+ logging.error('Runs %s and %s could be merged, but are not!' % (list(map(str, g1_runs[i]))[0], list(map(str, g1_runs[i+1]))[0]))
elif direction == DIRECTION_WATSON_STRAND and g2pos[g2i] == g2pos[g2i2]+1:
- logging.error('Runs %s and %s could be merged, but are not!' % (map(str, g1_runs[i])[0], map(str, g1_runs[i+1])[0]))
+ logging.error('Runs %s and %s could be merged, but are not!' % (list(map(str, g1_runs[i]))[0], list(map(str, g1_runs[i+1]))[0]))
def getAllRuns(g1, g2, d):
- g2pos = dict(izip(g2, xrange(len(g2))))
+ g2pos = dict(zip(g2, range(len(g2))))
g1_runs = [set() for _ in g1]
g2_runs = [set() for _ in g2]
@@ -305,7 +305,7 @@ def getAllRuns(g1, g2, d):
reportedRuns= list()
- for i in xrange(len(g1)):
+ for i in range(len(g1)):
curPos = g1[i]
@@ -355,7 +355,7 @@ def getAllRuns(g1, g2, d):
# if no edge exists, nothing has to be done...
if e:
- for (g2_gene, (direction, weight)) in d[curPos].items():
+ for (g2_gene, (direction, weight)) in list(d[curPos].items()):
if (direction, g2_gene) not in forbiddenRunStarts:
j = g2pos[g2_gene]
if isinstance(direction, BothStrands):
@@ -391,12 +391,12 @@ def replaceByNew(g1_runs, g2_runs, i, j,
break
def doMatching(g1, g2, g1_runs, g2_runs, m, runList):
- g1pos = dict(izip(g1, xrange(len(g1))))
- g2pos = dict(izip(g2, xrange(len(g2))))
+ g1pos = dict(zip(g1, range(len(g1))))
+ g2pos = dict(zip(g2, range(len(g2))))
newRuns = set()
- for k in xrange(g1pos[m.endG1] - g1pos[m.startG1] + 1):
+ for k in range(g1pos[m.endG1] - g1pos[m.startG1] + 1):
i = g1pos[m.startG1] + k
j = g2pos[m.startG2] + k
@@ -516,13 +516,13 @@ def doMatching(g1, g2, g1_runs, g2_runs,
insertIntoRunList(newRuns, runList)
def mergeRuns(mod_g1, g1, g2, g1_runs, g2_runs, runList, alreadyMatched):
- g1pos = dict(izip(g1, xrange(len(g1))))
- g2pos = dict(izip(g2, xrange(len(g2))))
+ g1pos = dict(zip(g1, range(len(g1))))
+ g2pos = dict(zip(g2, range(len(g2))))
newRuns = set()
wSrt = lambda x: x.getWeight(alpha)
mod_g1 = list(mod_g1)
- for x in xrange(len(mod_g1)):
+ for x in range(len(mod_g1)):
g1i = mod_g1[x]
i = g1pos[g1i]
if len(g1) < i+2:
@@ -601,18 +601,18 @@ def removeSingleGenes(genome, genome_run
def findRandomRunSequence(g1, g2, dists, topXperCent):
g2dists = dict()
- for g1i, x in dists.items():
- for g2j, d in x.items():
+ for g1i, x in list(dists.items()):
+ for g2j, d in list(x.items()):
if g2j not in g2dists:
g2dists[g2j] = dict()
g2dists[g2j][g1i] = d
# copy g1, g2 and dists map, because we'll modify it. Also remove all genes
# that do not contain edges.
- g1 = [x for x in g1 if dists.has_key(x) and len(dists[x])]
- g2 = [x for x in g2 if g2dists.has_key(x) and len(g2dists[x])]
+ g1 = [x for x in g1 if x in dists and len(dists[x])]
+ g2 = [x for x in g2 if x in g2dists and len(g2dists[x])]
- g1pos = dict(izip(g1, xrange(len(g1))))
+ g1pos = dict(zip(g1, range(len(g1))))
g1_runs, g2_runs, runs = getAllRuns(g1, g2, dists)
logging.info('Found %s runs.' %len(runs))
@@ -621,7 +621,7 @@ def findRandomRunSequence(g1, g2, dists,
res = set()
while runList:
- noOfAdjacencies = len(filter(lambda x: x.getWeight(alpha) and x.getWeight(alpha) or 0, runList))
+ noOfAdjacencies = len([x for x in runList if x.getWeight(alpha) and x.getWeight(alpha) or 0])
if noOfAdjacencies:
randPos = randint(1, ceil(noOfAdjacencies * topXperCent))
else:
@@ -645,7 +645,7 @@ def findRandomRunSequence(g1, g2, dists,
for g in del_g1.intersection(mod_g1):
mod_g1.remove(g)
- g1pos = dict(izip(g1, xrange(len(g1))))
+ g1pos = dict(zip(g1, range(len(g1))))
# add new modification points
mod_g1.update(new_mod_g1)
@@ -653,7 +653,7 @@ def findRandomRunSequence(g1, g2, dists,
if del_g2:
logging.info('Zombie genes removed from G2: %s' %', '.join(map(str, del_g2)))
for g2j in mod_g2:
- for g1i, (d, _) in g2dists[g2j].items():
+ for g1i, (d, _) in list(g2dists[g2j].items()):
if g1i in g1:
if d == DIRECTION_CRICK_STRAND:
mod_g1.add(g1i)
@@ -665,8 +665,8 @@ def findRandomRunSequence(g1, g2, dists,
runList, res)
if res:
- logging.info('Matching finished. Longest run size is %s.' %(max(map(len,
- res))))
+ logging.info('Matching finished. Longest run size is %s.' %(max(list(map(len,
+ res)))))
else:
logging.info('Matching finished, but no runs found. Empty input?')
@@ -681,19 +681,19 @@ def repeatMatching(g1, g2, g1_mod, g2_mo
g2_runs_res = g2_runs
selectedRuns_res = list()
- g1pos = dict(izip(g1_mod, xrange(len(g1_mod))))
- g2pos = dict(izip(g2_mod, xrange(len(g2_mod))))
+ g1pos = dict(zip(g1_mod, range(len(g1_mod))))
+ g2pos = dict(zip(g2_mod, range(len(g2_mod))))
noReps = repMatching
while repMatching:
- for i in xrange(len(g1_runs)):
+ for i in range(len(g1_runs)):
run_set = g1_runs[i]
if len(run_set) != 1:
logging.error(('Expected run, set length of 1, but was told' + \
' different: %s.') %(', '.join(map(str, run_set))))
- run = run_set.__iter__().next()
+ run = next(run_set.__iter__())
g1i = g1_mod[i]
@@ -720,11 +720,11 @@ def repeatMatching(g1, g2, g1_mod, g2_mo
len(g1_mod), noReps-repMatching+2))
# remove runs that fall below min length of minCsSize
- ff = lambda x: len(x.__iter__().next()) >= minCsSize
- g1_mod = [g1_mod[i] for i in xrange(len(g1_mod)) if ff(g1_runs[i])]
- g2_mod = [g2_mod[i] for i in xrange(len(g2_mod)) if ff(g2_runs[i])]
- g1_runs = filter(ff, g1_runs)
- g2_runs = filter(ff, g2_runs)
+ ff = lambda x: len(next(x.__iter__())) >= minCsSize
+ g1_mod = [g1_mod[i] for i in range(len(g1_mod)) if ff(g1_runs[i])]
+ g2_mod = [g2_mod[i] for i in range(len(g2_mod)) if ff(g2_runs[i])]
+ g1_runs = list(filter(ff, g1_runs))
+ g2_runs = list(filter(ff, g2_runs))
selectedRuns = set([s for s in selectedRuns if len(s) >= minCsSize])
# stop if no runs were found matching the criteria
@@ -736,10 +736,10 @@ def repeatMatching(g1, g2, g1_mod, g2_mo
logging.info('%s feasible runs retained.' %len(selectedRuns))
# reconciliate with result data
- g2pos = dict(izip(g2_mod, xrange(len(g2_mod))))
- g1pos = dict(izip(g1_mod, xrange(len(g1_mod))))
- g2pos_res = dict(izip(g2_mod_res, xrange(len(g2_mod_res))))
- g1pos_res = dict(izip(g1_mod_res, xrange(len(g1_mod_res))))
+ g2pos = dict(zip(g2_mod, range(len(g2_mod))))
+ g1pos = dict(zip(g1_mod, range(len(g1_mod))))
+ g2pos_res = dict(zip(g2_mod_res, range(len(g2_mod_res))))
+ g1pos_res = dict(zip(g1_mod_res, range(len(g1_mod_res))))
chr_srt = lambda x, y: x[0] == y[0] and (x[1] < y[1] and -1 or 1) or (x[0] < y[0] and -1 or 1)
g1_mod_new = sorted(set(g1_mod_res + g1_mod), cmp=chr_srt)
@@ -749,17 +749,17 @@ def repeatMatching(g1, g2, g1_mod, g2_mo
for g1i in g1_mod_new:
x = set()
- if g1pos_res.has_key(g1i):
+ if g1i in g1pos_res:
x.update(g1_runs_res[g1pos_res[g1i]])
- if g1pos.has_key(g1i):
+ if g1i in g1pos:
x.update(g1_runs[g1pos[g1i]])
g1_runs_new.append(x)
for g2j in g2_mod_new:
x = set()
- if g2pos_res.has_key(g2j):
+ if g2j in g2pos_res:
x.update(g2_runs_res[g2pos_res[g2j]])
- if g2pos.has_key(g2j):
+ if g2j in g2pos:
x.update(g2_runs[g2pos[g2j]])
g2_runs_new.append(x)
@@ -776,21 +776,21 @@ def repeatMatching(g1, g2, g1_mod, g2_mo
def printMatching(g1, g2, g1_runs, hasMultipleChromosomes, out):
if hasMultipleChromosomes:
- print >> f, 'Chr(G1)\tG1\tChr(G2)\tG2\tdirection\tedge weight'
+ print('Chr(G1)\tG1\tChr(G2)\tG2\tdirection\tedge weight', file=f)
else:
- print >> f, 'G1\tG2\tdirection\tedge weight'
+ print('G1\tG2\tdirection\tedge weight', file=f)
- g2pos = dict(izip(g2, xrange(len(g2))))
- g1pos = dict(izip(g1, xrange(len(g1))))
+ g2pos = dict(zip(g2, range(len(g2))))
+ g1pos = dict(zip(g1, range(len(g1))))
cur_index = dict()
- for i in xrange(len(g1_runs)):
+ for i in range(len(g1_runs)):
run_set = g1_runs[i]
for run in run_set:
g1i = g1[i]
j = 0
- if cur_index.has_key(run):
+ if run in cur_index:
j = cur_index[run]
if run.direction == DIRECTION_CRICK_STRAND:
g2j = g2[g2pos[run.startG2] + j]
@@ -800,22 +800,22 @@ def printMatching(g1, g2, g1_runs, hasMu
direction = run.direction == DIRECTION_CRICK_STRAND and '1' or '-1'
g1i1 = g1i[1] == -1 and 'TELOMERE_START' or g1i[1]
- g1i1 = g1i[1] == maxint and 'TELOMERE_END' or g1i1
+ g1i1 = g1i[1] == maxsize and 'TELOMERE_END' or g1i1
g2j1 = g2j[1] == -1 and 'TELOMERE_START' or g2j[1]
- g2j1 = g2j[1] == maxint and 'TELOMERE_END' or g2j1
+ g2j1 = g2j[1] == maxsize and 'TELOMERE_END' or g2j1
if hasMultipleChromosomes:
- print >> f, '%s\t%s\t%s\t%s\t%s\t%s' %(g1i[0], g1i1, g2j[0],
- g2j1, direction, run.weight[j])
+ print('%s\t%s\t%s\t%s\t%s\t%s' %(g1i[0], g1i1, g2j[0],
+ g2j1, direction, run.weight[j]), file=f)
else:
- print >> f, '%s\t%s\t%s\t%s' %(g1i1, g2j1, direction,
- run.weight[j])
+ print('%s\t%s\t%s\t%s' %(g1i1, g2j1, direction,
+ run.weight[j]), file=f)
cur_index[run] = j+1
if __name__ == '__main__':
if len(argv) < 3 or len(argv) > 8:
- print '\tusage: %s <DIST FILE> <ALPHA> [ <EDGE WEIGHT THRESHOLD> --repeat-matching (-R) <NUMBER >= 2> --min-cs-size (-M) <NUMBER >= 1> ]' %argv[0]
+ print('\tusage: %s <DIST FILE> <ALPHA> [ <EDGE WEIGHT THRESHOLD> --repeat-matching (-R) <NUMBER >= 2> --min-cs-size (-M) <NUMBER >= 1> ]' %argv[0])
exit(1)
repMatching= '--repeat-matching' in argv or '-R' in argv
@@ -826,8 +826,8 @@ if __name__ == '__main__':
minCsSize = int(argv[pos+1])
argv = argv[:pos] + argv[pos+2:]
if not repMatching:
- print >> stderr, ('Argument --min-cs-size (-M) only valid in ' + \
- 'combination with --repeat-matching (-R)')
+ print(('Argument --min-cs-size (-M) only valid in ' + \
+ 'combination with --repeat-matching (-R)'), file=stderr)
exit(1)
else:
minCsSize = 1
@@ -869,7 +869,7 @@ if __name__ == '__main__':
# sum of weights of adjacencies
wAdj = sum([r.getWeight(1) for r in selectedRuns])
# sum of weights of all edges of the matching
- wEdg = sum([sum(map(lambda x: x**2, r.weight)) for r in selectedRuns])
+ wEdg = sum([sum([x**2 for x in r.weight]) for r in selectedRuns])
edg = sum(map(len, selectedRuns))
@@ -892,6 +892,6 @@ if __name__ == '__main__':
' is %s with #edg = %s, adj(M) = %.3f and edg(M) = %.3f') %(bkp, edg,
wAdj, wEdg))
- print '#bkp\t#edg\tadj\tedg'
- print '%s\t%s\t%.6f\t%.6f' %(bkp, edg, wAdj, wEdg)
+ print('#bkp\t#edg\tadj\tedg')
+ print('%s\t%s\t%.6f\t%.6f' %(bkp, edg, wAdj, wEdg))
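One call in this patch's unchanged context lines may still need attention under Python 3: `sorted(set(g1_mod_res + g1_mod), cmp=chr_srt)` in `repeatMatching`. Python 3's `sorted` no longer accepts a `cmp` argument, and 2to3 does not rewrite it. A minimal sketch of one possible fix using `functools.cmp_to_key` follows; the comparator is copied from the context line, while the sample tuples are hypothetical and only assume that genes are `(chromosome, position)` pairs as elsewhere in the script.

```python
from functools import cmp_to_key
from sys import maxsize

# Comparator as it appears in the patch context: chromosome first, then position.
chr_srt = lambda x, y: x[0] == y[0] and (x[1] < y[1] and -1 or 1) or (x[0] < y[0] and -1 or 1)

# Hypothetical sample data; -1 and maxsize stand in for the telomere markers.
g1_mod_res = [("chr2", 5), ("chr1", maxsize), ("chr1", -1)]
g1_mod = [("chr1", 3), ("chr2", 5)]

# cmp_to_key wraps the old-style comparator so it can be passed as key=.
g1_mod_new = sorted(set(g1_mod_res + g1_mod), key=cmp_to_key(chr_srt))
print(g1_mod_new)
```

Because the elements come from a set and are plain tuples, a bare `sorted(...)` would likely give the same order, as long as all chromosome names are of one comparable type. Wrapping the existing comparator keeps the change closest to the original code.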
......@@ -58,8 +58,8 @@ You can also send a mail to lechner@staff.uni-marburg.de.</p>
<h1 id="installation">Installation</h1>
<p><strong>Proteinortho comes with precompiled binaries of all executables (Linux/x86) so just run the proteinortho6.pl in the downloaded directory.</strong>
You could also move all executables to your favorite bin directory (e.g. with make install PREFIX=/home/paul/bin).
<p><strong>Proteinortho comes with precompiled binaries of all executables (Linux/x86) so you should be able to run perl proteinortho6.pl in the downloaded directory.</strong>
You could also move all executables to your favorite directory (e.g. with make install PREFIX=/home/paul/bin).
If you cannot execute the src/BUILD/Linux<em>x86</em>64/proteinortho_clustering, then you have to recompile with make, see the section 2. Building and installing proteinortho from source.</p>
<p><br></p>
......@@ -95,6 +95,20 @@ If you cannot execute the src/BUILD/Linux<em>x86</em>64/proteinortho_clustering,
<p><br></p>
<h4 id="easyinstallationwithdpkgrootprivilegesarerequired">Easy installation with dpkg (root privileges are required)</h4>
<p>The deb package can be downloaded here: <a href="https://packages.debian.org/unstable/proteinortho">https://packages.debian.org/unstable/proteinortho</a>.
Afterwards the deb package can be installed with <code>sudo dpkg -i proteinortho*deb</code>.</p>
<p><br></p>
<h4 id="easyinstallationwithaptget"><em>(Easy installation with apt-get)</em></h4>
<p><strong>! Disclaimer: Work in progress !</strong>
<em>proteinortho will be released to stable with Debian 11 (~2021), then proteinortho can be installed with <code>sudo apt-get install proteinortho</code> (currently this installs the outdated version v5.16b)</em></p>
<p><br></p>
<h4 id="1prerequisites">1. Prerequisites</h4>
<p>Proteinortho uses standard software which is often installed already or is part of the package repositories and can thus easily be installed. The sources come with a precompiled version of Proteinortho for 64bit Linux.</p>
......@@ -128,7 +142,7 @@ If you cannot execute the src/BUILD/Linux<em>x86</em>64/proteinortho_clustering,
<li><p>Python v2.6.0 or higher to include synteny analysis (to test this, type 'python -V' in the command line) </p></li>
<li><p>Perl modules: Thread::Queue, File::Basename, Pod::Usage, threads (if you miss one just install with <code>cpan install Thread::Queue</code> )
<li><p>Perl standard modules (these should come with Perl): Thread::Queue, File::Basename, Pod::Usage, threads (if you miss one just install with <code>cpan install ...</code> )
</details></p></li>
</ul>
......@@ -154,9 +168,9 @@ If you cannot execute the src/BUILD/Linux<em>x86</em>64/proteinortho_clustering,
<h4 id="2buildingandinstallingproteinorthofromsourcelinuxandosx">2. Building and installing proteinortho from source (linux and osx)</h4>
<p>Here you <i>can</i> use a working lapack library, check this with 'dpkg --get-selections | grep lapack'. Install lapack e.g. with 'apt-get install libatlas3-base' or liblapack3.</p>
<p>Here you can use a working lapack library, check this with 'dpkg --get-selections | grep lapack'. Install lapack e.g. with 'apt-get install libatlas3-base' or liblapack3.</p>
<p>If you dont have one (or you have no root permissions), then 'make' will automatically compile a lapack (v3.8.0) for you !</p>
<p>If you don't have Lapack, then 'make' will automatically compile Lapack v3.8.0 for you!</p>
<p>Fetch the latest source code archive downloaded from <a href="https://gitlab.com/paulklemm_PHD/proteinortho/-/archive/master/proteinortho-master.zip">here</a>
<details> <summary>or from here (Click to expand)</summary></p>
......@@ -345,7 +359,7 @@ blast. 3 -> run the clustering.</p></li></ul>
<p><details>
<summary>show all algorithms (Click to expand)</summary></p>
<pre><code>- blastn,blastp,tblastx : legacy blast family (shell commands: blastall -) family. The suffix 'n' or 'p' indicates nucleotide or protein input files.
<pre><code>- blastn_legacy,blastp_legacy,tblastx_legacy : legacy blast family (shell commands: blastall -) family. The suffix 'n' or 'p' indicates nucleotide or protein input files.
- blastn+,blastp+,tblastx+ : standard blast family (shell commands: blastn,blastp,tblastx)
family. The suffix 'n' or 'p' indicates nucleotide or protein input files.
......
#!/usr/bin/perl
#!/usr/bin/env perl
#pk
##########################################################################################
......
#!/usr/bin/perl
#!/usr/bin/env perl
##########################################################################################
# This file is part of proteinortho.
......
#!/usr/bin/perl
#!/usr/bin/env perl
use strict;
use warnings "all";
......
#!/usr/bin/perl
#!/usr/bin/env perl
#pk
##########################################################################################
......
#!/usr/bin/perl
#!/usr/bin/env perl
use warnings;
use strict;
......
#!/usr/bin/python
#!/usr/bin/env python2.7
from sys import stdout, stderr, exit, argv, maxint
from copy import deepcopy
......
#!/usr/bin/perl
#!/usr/bin/env perl
use warnings;
use strict;
......
#!/usr/bin/perl
#!/usr/bin/env perl
#pk
##########################################################################################
......
#!/usr/bin/perl
#!/usr/bin/env perl
use strict;
use warnings "all";
......