Commits on Source (9)
repo: 092c2fe2278cb7f0b18d81faeb4aab98b89dc096
node: 89633b311684ece67f32a9461aa1567e32dd42f7
node: b16e6ee425dbf3797cc6c222fe0685818921f853
branch: 2.9
tag: 2.9.20
tag: 2.9.22
......@@ -36,3 +36,5 @@ c20357c9133be435919db0b2948f9aa164f19a0b 2.9.15
71f40111c849408b60e1bc7aeddfef84c9e55eea 2.9.17
170242bd646540bfb5521fe7ff2e54bc8a97dd35 2.9.18
7030c379d395c9eb463ca48943373d52121140d9 2.9.19
+89633b311684ece67f32a9461aa1567e32dd42f7 2.9.20
+c38b12873d9221abc8c6ade3a1c6fbf808e0c325 2.9.21
......@@ -69,7 +69,7 @@ MetaPhlAn2 requires *python 2.7* or newer with argparse, tempfile, [numpy](http:
* MetaPhlAn2 is integrated with advanced heatmap plotting with [hclust2](https://bitbucket.org/nsegata/hclust2) and cladogram visualization with [GraPhlAn](https://bitbucket.org/nsegata/graphlan/wiki/Home). If you use such visualization tool please refer to their prerequisites.
## Installation
-The best way to install MetaPhlAn2 2.9.15 is through conda:
+The best way to install MetaPhlAn2 2.9.20 is through conda:
```
#!bash
......@@ -96,10 +96,10 @@ You can also install and run MetaPhlAn2 through Docker
```
#!bash
-$ docker pull quay.io/biocontainers/metaphlan2:2.9.15
+$ docker pull quay.io/biocontainers/metaphlan2:2.9.20
```
-Alternatively, you can **manually download** from [Bitbucket](https://bitbucket.org/biobakery/metaphlan2/get/2.9.15.zip) or **clone the repository** using the following command ``$ hg clone https://bitbucket.org/biobakery/metaphlan2``.
+Alternatively, you can **manually download** from [Bitbucket](https://bitbucket.org/biobakery/metaphlan2/get/2.9.20.zip) or **clone the repository** using the following command ``$ hg clone https://bitbucket.org/biobakery/metaphlan2``.
If you choose this way, **you'll need to install manually all the dependencies!**
......@@ -115,7 +115,7 @@ By default, the latest MetaPhlAn2 database is downloaded and built. You can down
```
#!bash
-$ metaphlan2.py --install --index v29_CHOCOPhlAn_201901
+$ metaphlan2.py --install --index mpa_v29_CHOCOPhlAn_201901
```
--------------------------
......@@ -612,7 +612,7 @@ for f in $(ls fastqs/*.bz2)
do
echo "Running metaphlan2 on ${f}"
bn=$(basename ${f} | cut -d '.' -f 1)
-../metaphlan2.py --index v29_CHOCOPhlAn_201901 --input_type multifastq --nproc 10s -s sams/${bn}.sam.bz2 --bowtie2out sams/${bn}.bowtie2_out.bz2 -o ssams/${bn}.profile ${f}
+../metaphlan2.py --index mpa_v29_CHOCOPhlAn_201901 --input_type multifastq --nproc 10s -s sams/${bn}.sam.bz2 --bowtie2out sams/${bn}.bowtie2_out.bz2 -o ssams/${bn}.profile ${f}
done
```
......
-metaphlan2 (2.9.20-1) UNRELEASED; urgency=medium
+metaphlan2 (2.9.22-1) unstable; urgency=medium
[ Andreas Tille ]
* New upstream version
* debhelper-compat 12
-* Standards-Version: 4.4.0
+* Standards-Version: 4.4.1
+* Fix watch file
+* New upstream version
[ Steve Langasek ]
* Patches to port to Python3
-Closes: #933661
+Closes: #933661, #937016
--- Andreas Tille <tille@debian.org> Fri, 16 Aug 2019 22:17:00 +0200
+-- Andreas Tille <tille@debian.org> Fri, 08 Nov 2019 07:50:19 +0100
metaphlan2 (2.7.8-1) unstable; urgency=medium
......
......@@ -8,7 +8,7 @@ Build-Depends: debhelper-compat (= 12),
dh-python,
pandoc,
bowtie2
-Standards-Version: 4.4.0
+Standards-Version: 4.4.1
Vcs-Browser: https://salsa.debian.org/med-team/metaphlan2
Vcs-Git: https://salsa.debian.org/med-team/metaphlan2.git
Homepage: https://bitbucket.org/biobakery/metaphlan2
......
......@@ -16,18 +16,9 @@ Description: Instead of setting mpa_dir bash variable the path to the
"$ metaphlan2.py metagenome.sam --input_type sam -o profiled_metagenome.txt\n\n"
"* We can also natively handle paired-end metagenomes, and, more generally, metagenomes stored in \n"
@@ -1159,7 +1159,7 @@ def metaphlan2():
# check for the mpa_pkl file
if not os.path.isfile(pars['mpa_pkl']):
sys.stderr.write("Error: Unable to find the mpa_pkl file at: " + pars['mpa_pkl'] +
- "\nExpecting location ${mpa_dir}/db_v20/map_v20_m200.pkl "
+ "\nExpecting location /usr/share/metaphlan2/db_v20/map_v20_m200.pkl "
"Exiting...\n\n")
sys.exit(1)
--- a/README.md
+++ b/README.md
-@@ -124,13 +124,7 @@ $ metaphlan2.py --install --index v29_CH
+@@ -124,13 +124,7 @@ $ metaphlan2.py --install --index mpa_v2
This section presents some basic usages of MetaPhlAn2, for more advanced usages, please see at [its wiki](https://bitbucket.org/biobakery/biobakery/wiki/metaphlan2).
......
......@@ -51,3 +51,328 @@ Bug-Debian: https://bugs.debian.org/933661
import sys
--- a/strainphlan_tutorial/step3_sam2marker.sh
+++ b/strainphlan_tutorial/step3_sam2marker.sh
@@ -2,7 +2,7 @@
mkdir -p consensus_markers
cwd=$(pwd -P)
export PATH=${cwd}/../strainphlan_src:${PATH}
-python2 ../strainphlan_src/sample2markers.py --ifn_samples sams/*.sam.bz2 \
+python3 ../strainphlan_src/sample2markers.py --ifn_samples sams/*.sam.bz2 \
--input_type sam \
--output_dir consensus_markers \
--nprocs 10 | tee consensus_markers/log.txt
--- a/strainphlan_tutorial/step4_extract_db_marker.sh
+++ b/strainphlan_tutorial/step4_extract_db_marker.sh
@@ -1,7 +1,7 @@
#!/bin/bash
mkdir -p db_markers
bowtie2-inspect ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901 > db_markers/all_markers.fasta
-python2 ../strainphlan_src/extract_markers.py \
+python3 ../strainphlan_src/extract_markers.py \
--mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl \
--ifn_markers db_markers/all_markers.fasta \
--clade s__Bacteroides_caccae \
--- a/strainphlan_tutorial/step5_build_tree.sh
+++ b/strainphlan_tutorial/step5_build_tree.sh
@@ -1,6 +1,6 @@
#!/bin/bash
mkdir -p output
-python2 ../strainphlan.py --mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl \
+python3 ../strainphlan.py --mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl \
--ifn_samples consensus_markers/*.markers \
--ifn_markers db_markers/s__Bacteroides_caccae.markers.fasta \
--ifn_ref_genomes reference_genomes/G000273725.fna \
@@ -8,7 +8,7 @@ python2 ../strainphlan.py --mpa_pkl ../m
--nprocs_main 10 \
--clades s__Bacteroides_caccae | tee output/log.txt
-python2 ../strainphlan_src/add_metadata_tree.py \
+python3 ../strainphlan_src/add_metadata_tree.py \
--ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.tree \
--ifn_metadatas fastqs/metadata.txt \
--metadatas subjectID
--- a/strainphlan_tutorial/step6_build_tree_single_strain.sh
+++ b/strainphlan_tutorial/step6_build_tree_single_strain.sh
@@ -1,10 +1,10 @@
#!/bin/bash
-python2 ../strainphlan_src/build_tree_single_strain.py \
+python3 ../strainphlan_src/build_tree_single_strain.py \
--ifn_alignments output/s__Bacteroides_caccae.fasta \
--nprocs 10 \
--log_ofn output/build_tree_single_strain.log
-python2 ../strainphlan_src/add_metadata_tree.py \
+python3 ../strainphlan_src/add_metadata_tree.py \
--ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.remove_multiple_strains.tree \
--ifn_metadatas fastqs/metadata.txt \
- --metadatas subjectID
\ No newline at end of file
+ --metadatas subjectID
--- a/README.md
+++ b/README.md
@@ -430,14 +430,14 @@ To merge multiple output files, run the
```
#!bash
-$ python utils/merge_metaphlan_tables.py metaphlan_output1.txt metaphlan_output2.txt metaphlan_output3.txt output/merged_abundance_table.txt
+$ python3 utils/merge_metaphlan_tables.py metaphlan_output1.txt metaphlan_output2.txt metaphlan_output3.txt output/merged_abundance_table.txt
```
Wildcards can be used as needed:
```
#!bash
-$ python utils/merge_metaphlan_tables.py metaphlan_output*.txt output/merged_abundance_table.txt
+$ python3 utils/merge_metaphlan_tables.py metaphlan_output*.txt output/merged_abundance_table.txt
```
**Output files can be merged only if the profiling was performed with the same version of the MetaPhlAn2 database.**
@@ -504,7 +504,7 @@ bowtie2-build metaphlan_databases/mpa_v2
* Assume that the new marker was extracted from genome1, genome2. Update the taxonomy file from the Python console as follows:
```
-#!python
+#!/usr/bin/python3
import pickle
import bz2
@@ -575,14 +575,12 @@ Otherwise, all dependence binaries on Li
The script files in folder "strainphlan_src" should be changed to executable mode by:
```
-#!python
chmod +x strainphlan_src/*.py
```
and add to the executable path:
```
-#!python
export PATH=$PATH:$(pwd -P)/strainphlan_src
```
@@ -600,7 +598,6 @@ Each sam file (in SAM format) correspond
The commands to run are:
```
-#!python
mkdir -p sams
for f in $(ls fastqs/*.bz2)
do
@@ -620,11 +617,10 @@ The commands to run are:
```
-#!python
mkdir -p consensus_markers
cwd=$(pwd -P)
export PATH=${cwd}/../strainphlan_src:${PATH}
-python ../strainphlan_src/sample2markers.py --ifn_samples sams/*.sam.bz2 --input_type sam --output_dir consensus_markers --nprocs 10 &> consensus_markers/log.txt
+python3 ../strainphlan_src/sample2markers.py --ifn_samples sams/*.sam.bz2 --input_type sam --output_dir consensus_markers --nprocs 10 &> consensus_markers/log.txt
```
The result is the same if you want run several sample2markers.py scripts in parallel with each run for a sample (this maybe useful for some cluster-system settings).
@@ -637,10 +633,9 @@ This step will extract the markers of *B
The commands to run are:
```
-#!python
mkdir -p db_markers
bowtie2-inspect ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901 > db_markers/all_markers.fasta
-python ../strainphlan_src/extract_markers.py --mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl --ifn_markers db_markers/all_markers.fasta --clade s__Bacteroides_caccae --ofn_markers db_markers/s__Bacteroides_caccae.markers.fasta
+python3 ../strainphlan_src/extract_markers.py --mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl --ifn_markers db_markers/all_markers.fasta --clade s__Bacteroides_caccae --ofn_markers db_markers/s__Bacteroides_caccae.markers.fasta
```
Note that the "all\_markers.fasta" file consists can be reused for extracting other reference genomes.
@@ -650,8 +645,7 @@ This step will take around 1 minute and
Before building the trees, we should get the list of all clades detected from the samples and save them in the "output/clades.txt" file by the following command:
```
-#!python
-python ../strainphlan.py --mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl --ifn_samples consensus_markers/*.markers --output_dir output --nprocs_main 10 --print_clades_only > output/clades.txt
+python3 ../strainphlan.py --mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl --ifn_samples consensus_markers/*.markers --output_dir output --nprocs_main 10 --print_clades_only > output/clades.txt
```
The clade names in the output file "clades.txt" will be used for the next step.
@@ -664,9 +658,8 @@ Note that: all marker files (\*.markers)
The commands to run are:
```
-#!python
mkdir -p output
-python ../strainphlan.py --mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl --ifn_samples consensus_markers/*.markers --ifn_markers db_markers/s__Bacteroides_caccae.markers.fasta --ifn_ref_genomes reference_genomes/G000273725.fna.bz2 --output_dir output --nprocs_main 10 --clades s__Bacteroides_caccae | tee output/log_full.txt
+python3 ../strainphlan.py --mpa_pkl ../metaphlan_databases/mpa_v29_CHOCOPhlAn_201901.pkl --ifn_samples consensus_markers/*.markers --ifn_markers db_markers/s__Bacteroides_caccae.markers.fasta --ifn_ref_genomes reference_genomes/G000273725.fna.bz2 --output_dir output --nprocs_main 10 --clades s__Bacteroides_caccae | tee output/log_full.txt
```
This step will take around 2 minutes. After this step, you will find the tree "output/RAxML\_bestTree.s\_\_Bacteroides\_caccae.tree". All the output files can be found in the folder "output" in [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0).
@@ -677,8 +670,7 @@ By default, if you do not specify refere
In order to add the metadata, we also provide a script called "add\_metadata\_tree.py" which can be used as follows:
```
-#!python
-python ../strainphlan_src/add_metadata_tree.py --ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.tree --ifn_metadatas fastqs/metadata.txt --metadatas subjectID
+python3 ../strainphlan_src/add_metadata_tree.py --ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.tree --ifn_metadatas fastqs/metadata.txt --metadatas subjectID
```
The script "add\_metadata\_tree.py" can accept multiple metadata files (space separated, wild card can also be used) and multiple trees. A metadata file is a tab separated file where the first row is the meta-headers, and the following rows contain the metadata for each sample. Multiple metadata files are used in the case where your samples come from more than one dataset and you do not want to merge the metadata files.
@@ -686,7 +678,6 @@ For more details of using "add\_metadata
An example of a metadata file is the "fastqs/metadata.txt" file with the below content:
```
-#!python
sampleID subjectID
SRS055982 638754422
SRS022137 638754422
@@ -705,8 +696,7 @@ If you have installed [graphlan](https:/
```
-#!python
-python ../strainphlan_src/plot_tree_graphlan.py --ifn_tree output/RAxML_bestTree.s__Bacteroides_caccae.tree.metadata --colorized_metadata subjectID
+python3 ../strainphlan_src/plot_tree_graphlan.py --ifn_tree output/RAxML_bestTree.s__Bacteroides_caccae.tree.metadata --colorized_metadata subjectID
```
and obtain the following figure (output/RAxML\_bestTree.s\_\_Bacteroides\_caccae.tree.metadata.png):
@@ -716,9 +706,8 @@ and obtain the following figure (output/
Step 6. If you want to remove the samples with high-probability of containing multiple strains, you can rebuild the tree by removing the multiple strains:
```
-#!python
-python ../strainphlan_src/build_tree_single_strain.py --ifn_alignments output/s__Bacteroides_caccae.fasta --nprocs 10 --log_ofn output/build_tree_single_strain.log
-python ../strainphlan_src/add_metadata_tree.py --ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.remove_multiple_strains.tree --ifn_metadatas fastqs/metadata.txt --metadatas subjectID
+python3 ../strainphlan_src/build_tree_single_strain.py --ifn_alignments output/s__Bacteroides_caccae.fasta --nprocs 10 --log_ofn output/build_tree_single_strain.log
+python3 ../strainphlan_src/add_metadata_tree.py --ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.remove_multiple_strains.tree --ifn_metadatas fastqs/metadata.txt --metadatas subjectID
```
You will obtain the refined tree "output/RAxML\_bestTree.s\_\_Bacteroides\_caccae.remove\_multiple\_strains.tree.metadata". This tree can be found in the folder "output" in [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0).
@@ -727,8 +716,7 @@ You will obtain the refined tree "output
All option details can be viewed by strainphlan.py help:
```
-#!python
-python ../strainphlan.py -h
+python3 ../strainphlan.py -h
```
The default setting can be stringent for some cases where you have very few samples left in the phylogenetic tree. You can relax some parameters to add more samples back:
--- a/strainphlan_src/add_metadata_tree.py
+++ b/strainphlan_src/add_metadata_tree.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Authors: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/build_tree_single_strain.py
+++ b/strainphlan_src/build_tree_single_strain.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/compute_distance.py
+++ b/strainphlan_src/compute_distance.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/compute_distance_all.py
+++ b/strainphlan_src/compute_distance_all.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/dump_file.py
+++ b/strainphlan_src/dump_file.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/extract_markers.py
+++ b/strainphlan_src/extract_markers.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/fastx_len_filter.py
+++ b/strainphlan_src/fastx_len_filter.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/fix_AF1.py
+++ b/strainphlan_src/fix_AF1.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/mixed_utils.py
+++ b/strainphlan_src/mixed_utils.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/ooSubprocess.py
+++ b/strainphlan_src/ooSubprocess.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
# Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/plot_tree_ete2.py
+++ b/strainphlan_src/plot_tree_ete2.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/plot_tree_graphlan.py
+++ b/strainphlan_src/plot_tree_graphlan.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Authors: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/sam_filter.py
+++ b/strainphlan_src/sam_filter.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/sample2markers.py
+++ b/strainphlan_src/sample2markers.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
# Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
--- a/strainphlan_src/which.py
+++ b/strainphlan_src/which.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python3
#Author: Duy Tin Truong (duytin.truong@unitn.it)
# at CIBIO, University of Trento, Italy
......@@ -2,6 +2,8 @@
# DH_VERBOSE := 1
+include /usr/share/dpkg/default.mk
%:
dh $@ --with python3
......@@ -12,3 +14,10 @@ override_dh_auto_build:
override_dh_installchangelogs:
dh_installchangelogs changeset.txt
+override_dh_fixperms:
+dh_fixperms
+find debian/$(DEB_SOURCE) -name "*.txt" -exec chmod -x \{\} \;
+find debian/$(DEB_SOURCE)/usr/share/$(DEB_SOURCE)/utils -name "*.py" -exec chmod -x \{\} \;
+chmod -x debian/$(DEB_SOURCE)/usr/share/$(DEB_SOURCE)/_*.py
+chmod -x debian/$(DEB_SOURCE)/usr/share/$(DEB_SOURCE)/plugin_setup.py
......@@ -4,8 +4,8 @@ __author__ = ('Nicola Segata (nicola.segata@unitn.it), '
'Duy Tin Truong, '
'Francesco Asnicar (f.asnicar@unitn.it), '
'Francesco Beghini (francesco.beghini@unitn.it)')
-__version__ = '2.9.20'
-__date__ = '14 Aug 2019'
+__version__ = '2.9.22'
+__date__ = '14 Oct 2019'
import sys
import os
......@@ -247,7 +247,7 @@ def read_params(args):
"that 'bowtie2-build is present in the system path")
arg('--bowtie2out', metavar="FILE_NAME", type=str, default=None,
help="The file for saving the output of BowTie2")
-arg('--min_mapq_val', type=str, default=5,
+arg('--min_mapq_val', type=int, default="5",
help="Minimum mapping quality value (MAPQ)")
arg('--no_map', action='store_true',
help="Avoid storing the --bowtie2out map file")
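The new `type=int, default="5"` pairing relies on documented argparse behaviour: a string default is parsed as if it had been given on the command line, so the `type` conversion is applied to it as well. A minimal sketch of that behaviour, using a throwaway parser (an assumption for illustration, not MetaPhlAn2's real `read_params`):

```python
import argparse

# Throwaway parser for illustration; not MetaPhlAn2's read_params().
parser = argparse.ArgumentParser()
parser.add_argument('--min_mapq_val', type=int, default="5",
                    help="Minimum mapping quality value (MAPQ)")

# A string default is run through `type`, so it arrives as an int.
default_value = parser.parse_args([]).min_mapq_val
# An explicit command-line value goes through the same conversion.
explicit_value = parser.parse_args(['--min_mapq_val', '10']).min_mapq_val

print(type(default_value).__name__, default_value)
print(type(explicit_value).__name__, explicit_value)
```

Either way the value reaches downstream code as an integer, which is presumably what a numeric MAPQ comparison needs.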
......@@ -526,6 +526,9 @@ def download_unpack_tar(url, download_file_name, folder, bowtie2_build, nproc):
sys.stderr.write("Fatal error running '{}'\nError message: '{}'\n\n".format(' '.join(bt2_cmd), e))
sys.exit(1)
+for bt2 in glob(os.path.join(folder, download_file_name + "*.bt2")):
+os.chmod(bt2, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IWGRP | stat.S_IROTH) # change permissions to 664
sys.stderr.write('Removing uncompress database {}\n'.format(fna_file))
os.remove(fna_file)
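As a quick check on the "664" comment in this hunk, the sketch below combines the same `stat` flags and applies them to a temporary file. The helper name `make_group_writable` and the temporary `.bt2` file are invented for the example; the result assumes a POSIX filesystem.

```python
import os
import stat
import tempfile

# Same flag combination as in the hunk above.
MODE_664 = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IWGRP | stat.S_IROTH

def make_group_writable(path):
    """Set rw-rw-r-- on path and return the resulting permission bits."""
    os.chmod(path, MODE_664)
    return stat.S_IMODE(os.stat(path).st_mode)

# Demonstrate on a temporary stand-in for a .bt2 index file.
with tempfile.NamedTemporaryFile(suffix=".bt2", delete=False) as handle:
    demo_path = handle.name
try:
    resulting_mode = make_group_writable(demo_path)
    print(oct(MODE_664), oct(resulting_mode))
finally:
    os.remove(demo_path)
```

Unlike file creation, `os.chmod` is not filtered by the process umask, so the requested bits are applied as given.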
......@@ -921,11 +924,11 @@ class TaxTree:
if ignore_eukaryotes or ignore_bacteria or ignore_archaea:
cn = cl.get_full_name()
if ignore_eukaryotes and cn.startswith("k__Eukaryota"):
-return ""
+return (None, None)
if ignore_archaea and cn.startswith("k__Archaea"):
-return ""
+return (None, None)
if ignore_bacteria and cn.startswith("k__Bacteria"):
-return ""
+return (None, None)
# while len(cl.children) == 1:
# cl = list(cl.children.values())[0]
cl.markers2nreads[marker] = n
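One plausible reason for returning `(None, None)` instead of `""` is sketched below. It assumes, hypothetically, that the caller unpacks the return value into two names; the call site is not part of this hunk, and `add_reads_sketch` is an invented stand-in for the patched method.

```python
def add_reads_sketch(clade_name, ignore_eukaryotes=False, legacy=False):
    """Invented stand-in: returns a (clade, reads) pair, or a sentinel when ignored."""
    if ignore_eukaryotes and clade_name.startswith("k__Eukaryota"):
        # legacy=True mimics the old `return ""`, otherwise the new sentinel.
        return "" if legacy else (None, None)
    return (clade_name, 1)

# New behaviour: unpacking always works; callers can then test for None.
clade, reads = add_reads_sketch("k__Eukaryota|p__Ascomycota", ignore_eukaryotes=True)

# Old behaviour: unpacking an empty string into two names raises ValueError.
try:
    clade_old, reads_old = add_reads_sketch("k__Eukaryota|p__Ascomycota",
                                            ignore_eukaryotes=True, legacy=True)
    legacy_error = None
except ValueError as exc:
    legacy_error = exc

print(clade, reads, type(legacy_error).__name__)
```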
......@@ -1104,7 +1107,7 @@ def maybe_generate_biom_file(tree, pars, abundance_predictions):
######## clade_ids, #Modified by George Weingart 5/22/2017 - We will use instead the clade_names
clade_names, #Modified by George Weingart 5/22/2017 - We will use instead the clade_names
sample_metadata = None,
-observation_metadata = map(to_biomformat, clade_names),
+observation_metadata = list(map(to_biomformat, clade_names)),
table_id = table_id,
constructor = biom.table.DenseOTUTable
)
......@@ -1118,7 +1121,7 @@ def maybe_generate_biom_file(tree, pars, abundance_predictions):
clade_names, #Modified by George Weingart 5/22/2017 - We will use instead the clade_names
sample_ids,
sample_metadata = None,
-observation_metadata = map(to_biomformat, clade_names),
+observation_metadata = list(map(to_biomformat, clade_names)),
table_id = table_id,
input_is_dense = True
)
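The `list(map(...))` wrapping matters because Python 3's `map` returns a lazy, single-pass iterator rather than a list. A short sketch of the difference follows; `to_biomformat_sketch` is a made-up stand-in, since the real `to_biomformat` is not shown in this diff.

```python
def to_biomformat_sketch(clade_name):
    """Made-up stand-in: wrap a clade name in a metadata dict."""
    return {"taxonomy": clade_name.split("|")}

clade_names = ["k__Bacteria", "k__Bacteria|p__Firmicutes"]

lazy = map(to_biomformat_sketch, clade_names)
# A map object has no length and is exhausted after one pass.
has_len = hasattr(lazy, "__len__")
first_pass = list(lazy)
second_pass = list(lazy)

# Materialising up front gives an indexable, reusable list.
eager = list(map(to_biomformat_sketch, clade_names))
print(has_len, len(first_pass), len(second_pass), len(eager))
# prints: False 2 0 2
```

Code that indexes the metadata, takes its length, or iterates it twice therefore needs the materialised list.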
......@@ -1159,7 +1162,6 @@ def metaphlan2():
# check for the mpa_pkl file
if not os.path.isfile(pars['mpa_pkl']):
sys.stderr.write("Error: Unable to find the mpa_pkl file at: " + pars['mpa_pkl'] +
-"\nExpecting location ${mpa_dir}/db_v20/map_v20_m200.pkl "
"Exiting...\n\n")
sys.exit(1)
......@@ -1270,7 +1272,10 @@ def metaphlan2():
elif pars['t'] == 'rel_ab':
if pars['CAMI_format_output']:
-outf.write('@SampleID:{}\n@Version:0.9.1\n@__program__:MetaPhlAn{}\n@Ranks:superkingdom|phylum|class|order|family|genus|species\n@@TAXID\tRANK\tTAXPATH\tTAXPATHSN\tPERCENTAGE\n'.format(pars["sample_id"],__version__))
+outf.write("@SampleID:{}\n"
+"@Version:0.10.0\n"
+"@Ranks:superkingdom|phylum|class|order|family|genus|species|strain\n"
+"@@TAXID\tRANK\tTAXPATH\tTAXPATHSN\tPERCENTAGE\n".format(pars["sample_id"],__version__))
elif not pars['legacy_output']:
outf.write('#clade_name\tNCBI_tax_id\trelative_abundance\n')
......@@ -1289,7 +1294,7 @@ def metaphlan2():
if taxid:
rank = ranks2code[clade.split('|')[-1][0]]
leaf_taxid = taxid.split('|')[-1]
-taxpathsh = '|'.join([remove_prefix(name) for name in clade.split('|')])
+taxpathsh = '|'.join([remove_prefix(name) if '_unclassified' not in name else '' for name in clade.split('|')])
outf.write( '\t'.join( [ leaf_taxid, rank, taxid, taxpathsh, str(relab*fraction_mapped_reads) ] ) + '\n' )
else:
if pars['unknown_estimation']:
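The effect of the new `taxpathsh` expression can be sketched as follows. `remove_prefix` is not shown in this diff, so `remove_prefix_sketch` below only assumes it strips a leading rank code such as `k__`; the example lineage is invented.

```python
def remove_prefix_sketch(name):
    """Assumed behaviour: drop a leading rank code such as 'k__' or 's__'."""
    return name.split("__", 1)[1] if "__" in name else name

def build_taxpathsn(clade):
    """Join rank names with '|', leaving unclassified levels empty."""
    return "|".join(
        remove_prefix_sketch(name) if "_unclassified" not in name else ""
        for name in clade.split("|")
    )

# Invented lineage with one unclassified intermediate rank.
example = "k__Bacteria|p__Firmicutes|c__Firmicutes_unclassified|o__Clostridiales"
print(build_taxpathsn(example))
# prints: Bacteria|Firmicutes||Clostridiales
```

An unclassified level becomes an empty field between two `|` separators, so the number of fields in the path is unchanged.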
......
......@@ -102,6 +102,12 @@ def read_and_write_raw_int(fd, min_len=None):
for idx, l in enumerate(fd,1):
_ = sys.stdout.write(ignore_spaces(l))
+#Read again the first line of the file to determine if is a fasta or a fastq
+fd.seek(0)
+l = fd.readline()
+readn = 4 if fastx(l) == 'fastq' else 2
+idx = idx // readn
nreads = idx - discarded
return nreads
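The read-counting logic added in this hunk can be sketched in isolation: count lines, re-read the first line to decide the format, then divide by the number of lines per record. `detect_format_sketch` is a hypothetical stand-in for `fastx`, whose body is not shown, and the in-memory inputs are invented.

```python
import io

def detect_format_sketch(first_line):
    """Hypothetical stand-in for fastx(): '@' header means FASTQ, else FASTA."""
    return "fastq" if first_line.startswith("@") else "fasta"

def count_reads(fd, discarded=0):
    """Count records by dividing the line count by the lines per record."""
    idx = 0
    for idx, _line in enumerate(fd, 1):
        pass
    fd.seek(0)  # re-read the first line to decide the format
    lines_per_read = 4 if detect_format_sketch(fd.readline()) == "fastq" else 2
    return idx // lines_per_read - discarded

fastq = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
fasta = io.StringIO(">r1\nACGT\n>r2\nTTGA\n>r3\nGGCC\n")
print(count_reads(fastq), count_reads(fasta))
# prints: 2 3
```

The division only gives the right answer when each FASTA sequence sits on a single line, and the `seek(0)` call needs a seekable stream, so a non-seekable pipe would have to be handled differently.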
......