Commits on Source (5)
# # Mac
.DS_STORE
.env
# Byte-compiled / optimized / DLL files
**/__pycache__/
**/*.py[cod]
......@@ -6,6 +6,9 @@
__pycache__/
*.py[cod]
# Env
.env
# C extensions
*.so
......@@ -62,9 +65,9 @@ docs/_build/
#Data files
*.wig
*.dat
*.prot_table
*.png
*.out
*.prot_table
NOTES
*~
......@@ -73,3 +76,19 @@ NOTES
tests/testoutput.txt
tests/testoutput_genes.txt
#BWA generated files from fna
/src/pytransit/genomes/*.amb
/src/pytransit/genomes/*.ann
/src/pytransit/genomes/*.bwt
/src/pytransit/genomes/*.pac
/src/pytransit/genomes/*.sa
/tests/**/*.amb
/tests/**/*.ann
/tests/**/*.bwt
/tests/**/*.pac
/tests/**/*.sa
*.amb
*.ann
*.bwt
*.pac
*.sa
language: python
matrix:
include:
- os: linux
dist: trusty
sudo: required
python: 2.7
before_install: "sudo apt-get install -y -f python python-dev python-pip pkg-config libpng-dev libjpeg8-dev libfreetype6-dev"
install: "pip install --upgrade pip setuptools numpy scipy pillow matplotlib pytest statsmodels"
services:
- docker
before_install:
- docker build -t transit .
DISPLAY: 0.0
notifications:
email:
on_success: change # default: change
on_failure: change # default: always
before_script: cd /home/travis/build/mad-lab/transit/tests
script: travis_wait 30 pytest
script: travis_wait 20 docker run -t transit
# Change log
All notable changes to this project will be documented in this file.
## Version 2.5.2 2019-05-16
#### TRANSIT:
- Made some improvements in command-line version of 'tn5gaps'
- Added flags for trimming insertions in N- and C-termini of genes for tn5gaps (-iN and -iC)
## Version 2.5.1 2019-04-25
#### TRANSIT:
- Add support for [handling interactions in ZINB](https://transit.readthedocs.io/en/latest/transit_methods.html#covariates-and-interactions)
- Fix selection bug for gff3 in GUI
## Version 2.5.0 2019-03-28
#### TRANSIT:
- Added analysis method for Zero-Inflated Negative Binomial ([ZINB](https://transit.readthedocs.io/en/latest/transit_methods.html#zinb))
- Fix LOESS flag bug in resampling 2.4.2
- Resampling supports combined_wig files
- Change ordering of metadata and annotation file in ANOVA cmd
## Version 2.4.2 2019-03-15
#### TPP:
- updated docs for TPP; expanded discussion of protocols, including Mme1
- for Mme1, change min read length from 20bp to 15bp (for genomic part of read1)
- replaced '-himar1' and 'tn5' flags with '-protocol [sassetti|tn5|mme1]'
- added 'auto' for -replicon-ids
- added 'pre-trimmed' as option for transposon in TPP GUI (prefix="")
#### TRANSIT:
- [resampling can now be done between TnSeq libraries from different strains](https://transit.readthedocs.io/en/latest/transit_methods.html#re-sampling)
- add documentation for 'griffin' and Mann-Whitney 'utest' analysis methods
## Version 2.4.1 2019-03-04
#### TPP:
- allow the primer sequence to be the empty string (i.e. -primer "" on command-line; for pre-trimmed reads)
- do not throw an error if header ids in read1 and read2 fastq files happen to match identically
- minor bug fixes:
- fixed problem of order of data in tn_stats table when there are multiple contigs but only single-ended reads
- fixed name of flag from "replicon-id" to "replicon-ids"
- prevent div-by-zero error in cases where no reads map
## Version 2.4.0 2019-02-28
#### TPP:
- **can now handle genomes with multiple contigs** (thanks to modifications by Robert Jenquin and William Matern); it creates multiple .wig files as output
- BWA: switched from using 'aln' to 'mem' by default
- added flags to set the nucleotide window for searching for start of primer sequence (-primer-window-start)
- fixed bug in counting misprimed reads, and reads mapped to both R1 and R2
- added some fields to TPP GUI, and made it more consistent about saving/reading parameters in the tpp.cfg config file
#### Transit:
- fixed bug in handling '-minreads' flag in Gumbel analysis
- updated support for converting .gff files to .prot_table format (in GUI and on command line)
- added a status field to ANOVA output
- TrackView scales all plots simultaneously by default
- updated documentation
## Pull Request 18 by Robert Jenquin and William Matern (Jan, 2019)
- Added the ability to accept multiple replicons in the form of either multiline reference genomes or multiple reference genome files.
- Added `-bwa-alg` argument, allowing the user to specify `mem` or `aln` to use `bwa mem` or `bwa aln` algorithms
- Now requires the `-replicon-id` argument to specify names for the replicons if multiple reference genomes are given (in the same order as they appear in the reference genome(s))
- Code cleanup: closing dangling file handles
- Bug fix: if adapter is at exact end of R1, it is now properly handled
- Bug fix: trimmed\_reads now counted properly
- Added support for specifying `-window-size` argument
- **Sample usage:**
```
python2 src/tpp.py -himar1 -bwa /usr/bin/bwa -bwa-alg aln -ref MAC109_genome.fa -replicon-id CP029332 CP029333 CP029334 -reads1 ../HJKK5BCX2_ATGCTG_1.fastq -reads2 ../HJKK5BCX2_ATGCTG_2.fastq -primer AACCTGTTA -mismatches 2 -window-size 6 -output tpp_output/avium
```
- Explanation of arguments
- `-himar1` specifies that the Himar1 transposon was used in the transposon mutagenesis procedure. Tn5 is also supported (`-tn5`)
- `-bwa` specifies the path to the `bwa` executable
- `-bwa-alg` specifies either the `mem` or `aln` algorithm for `bwa` to use. `aln` is widely considered obsolete relative to `mem` for reads longer than 70 bp. `aln` is the default.
- `-ref` specifies the reference genome(s) in FASTA format to which reads will be mapped. If more than one, they can be specified in either multiple FASTAs, or as a multilined FASTA (or a combination of both).
- `-replicon-id [contig1 contig2 ...]` specifies the names of the contigs in the genome(s). These are used as filename suffixes for output files (i.e. \*\_contig1.wig, \*\_contig2.wig, etc). The order of the contigs is assumed to be the same as the order in which they appear in the reference genome(s) (as given with `-ref`). This option is only required if there is more than one contig. Note: While you can technically use any contig name at this step, if you wish to use `wig_gb_to_csv.py` to organize the data you should use the contig names as they appear in the Genbank file (as specified by `wig_gb_to_csv.py -g`).
- `-reads1` specifies the file containing the raw reads (untrimmed) for read1 in FASTQ or FASTA format
- `-reads2` specifies the file containing the raw reads (untrimmed) for read2 in FASTQ or FASTA format
- `-primer` specifies a nucleotide sequence at the end of the transposon, which is used to separate transposon DNA from genomic DNA in read 1.
- `-window-size` specifies how many positions to search for `-primer` within read 1. It should be set to at least the difference between the maximum and minimum expected positions of the first base of genomic DNA in read 1 (and larger if you want to allow for insertions/deletions). For the Long et al 2015 protocol (using a pool of 4 shifting prefixes) the window-size should be at least 6. Default value is 6.
- `-mismatches` specifies the number of mismatches to allow when searching for the transposon in read 1 (i.e. the number of mismatches to `-primer`).
- `-output` specifies the filename prefix to be applied to output files. Can include directories, allowing custom paths to be specified.
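The sample usage above can also be assembled programmatically. The following is a minimal sketch, not part of TPP: `build_tpp_command` is a hypothetical helper, and it uses only the flags documented in this pull-request note, with the values from the sample command as defaults. Later entries in this changelog rename some of these flags (for example `-protocol` and `-replicon-ids`), so the list would need adjusting for newer versions.

```python
def build_tpp_command(ref, reads1, reads2, output, replicon_ids,
                      bwa="/usr/bin/bwa", bwa_alg="aln",
                      primer="AACCTGTTA", mismatches=2, window_size=6):
    """Return the argument list for a TPP run (hypothetical helper)."""
    cmd = ["python2", "src/tpp.py", "-himar1",
           "-bwa", bwa, "-bwa-alg", bwa_alg,
           "-ref", ref]
    # Replicon names are only needed when there is more than one contig.
    if replicon_ids:
        cmd += ["-replicon-id"] + list(replicon_ids)
    cmd += ["-reads1", reads1, "-reads2", reads2,
            "-primer", primer,
            "-mismatches", str(mismatches),
            "-window-size", str(window_size),
            "-output", output]
    return cmd


cmd = build_tpp_command(
    ref="MAC109_genome.fa",
    reads1="../HJKK5BCX2_ATGCTG_1.fastq",
    reads2="../HJKK5BCX2_ATGCTG_2.fastq",
    output="tpp_output/avium",
    replicon_ids=["CP029332", "CP029333", "CP029334"],
)
print(" ".join(cmd))
```

Keeping the command as a list (rather than one string) means it could be handed to `subprocess` without shell quoting concerns.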
## Version 2.3.4 2019-01-14
- TRANSIT:
- Minor bug fixes related to flags in Resampling and HMM
......
From r-base:3.4.1
RUN apt-get update -y && apt-get install -y -f python2 python-dev python-pip
ADD src/ /src
ADD tests/ /tests
RUN pip install pytest 'numpy~=1.15' 'scipy~=1.2' 'matplotlib~=2.2' 'pillow~=5.0' 'statsmodels~=0.9' 'rpy2<2.9.0'
RUN R -e "install.packages('MASS')"
RUN R -e "install.packages('pscl')"
CMD [ "pytest", "./tests" ]
# TRANSIT 2.3.4
# TRANSIT 2.5.1
[![Build Status](https://travis-ci.org/mad-lab/transit.svg?branch=master)](https://travis-ci.org/mad-lab/transit) [![Documentation Status](https://readthedocs.org/projects/transit/badge/?version=latest)](http://transit.readthedocs.io/en/latest/?badge=latest) [![Downloads](https://pepy.tech/badge/tnseq-transit)](https://pepy.tech/project/tnseq-transit)
[![Build Status](https://travis-ci.org/mad-lab/transit.svg?branch=master)](https://travis-ci.org/mad-lab/transit) [![Documentation Status](https://readthedocs.org/projects/transit/badge/?version=latest)](http://transit.readthedocs.io/en/latest/?badge=latest)
Welcome! This is the distribution for the TRANSIT and TPP tools developed by the [Ioerger Lab](http://orca2.tamu.edu/tom/iLab.html) at Texas A&M University.
Welcome! This is the distribution for the TRANSIT and TPP tools developed by the Ioerger Lab at Texas A&M University.
TRANSIT is a tool for processing and statistical analysis of Tn-Seq data.
TRANSIT is a tool for processing and statistical analysis of Tn-Seq data.
It provides an easy to use graphical interface and access to three different analysis methods that allow the user to determine essentiality in a single condition as well as between conditions.
TRANSIT Home page: http://saclab.tamu.edu/essentiality/transit/index.html
......@@ -18,8 +17,8 @@ TRANSIT Documentation: https://transit.readthedocs.io/en/latest/transit_overview
## Features
TRANSIT offers a variety of features including:
- More than **8 analysis methods**, including methods for determining **conditional essentiality** as well as **genetic interactions**.
- More than **10 analysis methods**, including methods for determining **conditional essentiality** as well as **genetic interactions**.
- Ability to analyze datasets from libraries constructed using **himar1 or tn5 transposons**.
......
version: 2.2.1-9-ge742-mod
tnseq-transit (2.5.2-1) unstable; urgency=medium
* New upstream version
* Asked upstream for Python3 port
https://github.com/mad-lab/transit/issues/20
-- Andreas Tille <tille@debian.org> Tue, 09 Jul 2019 15:21:53 +0200
tnseq-transit (2.3.4-1) unstable; urgency=medium
* New upstream version
......
......@@ -4,39 +4,26 @@ Last-Update: Tue, 18 Dec 2018 16:56:48 +0100
--- a/tests/test_tpp.py
+++ b/tests/test_tpp.py
@@ -34,7 +34,7 @@ def get_stats(path):
@@ -122,19 +122,19 @@ def verify_stats(stats_file, expected):
class TestTPP(TransitTestCase):
- @unittest.skipUnless(os.path.exists("/usr/bin/bwa"), "requires BWA")
- @unittest.skipUnless(len(bwa_path) > 0, "requires BWA")
+ @unittest.skipUnless(os.path.exists("../misc/test"), "requires local data file")
def test_tpp_noflag_primer(self):
(args, kwargs) = cleanargs(["-bwa", bwa_path, "-ref", h37fna, "-reads1", reads1, "-output", tpp_output_base, "-protocol", "sassetti"])
tppMain(*args, **kwargs)
self.assertTrue(verify_stats("{0}.tn_stats".format(tpp_output_base), NOFLAG_PRIMER))
arguments = ["-bwa", "/usr/bin/bwa", "-ref", "H37Rv.fna", "-reads1", "test.fastq", "-output",
@@ -44,7 +44,7 @@ class TestTPP(TransitTestCase):
self.assertTrue(NOFLAG_PRIMER == stats)
- @unittest.skipUnless(os.path.exists("/usr/bin/bwa"), "requires BWA")
- @unittest.skipUnless(len(bwa_path) > 0, "requires BWA")
+ @unittest.skipUnless(os.path.exists("../misc/test"), "requires local data file")
def test_tpp_flag_primer(self):
(args, kwargs) = cleanargs(["-bwa", bwa_path, "-ref", h37fna, "-reads1", reads1, "-output", tpp_output_base, "-himar1", "-flags", "-k 1"])
tppMain(*args, **kwargs)
self.assertTrue(verify_stats("{0}.tn_stats".format(tpp_output_base), FLAG_PRIMER))
arguments = ["-bwa", "/usr/bin/bwa", "-ref", "H37Rv.fna", "-reads1", "test.fastq", "-output",
@@ -55,7 +55,7 @@ class TestTPP(TransitTestCase):
@unittest.expectedFailure
- @unittest.skipUnless(os.path.exists("/usr/bin/bwa"), "requires BWA")
+ @unittest.skipUnless(os.path.exists("../misc/test"), "requires local data file")
def test_tpp_noflag_noprimer(self):
with self.assertRaises(SystemExit):
@@ -67,7 +67,7 @@ class TestTPP(TransitTestCase):
@unittest.expectedFailure
- @unittest.skipUnless(os.path.exists("/usr/bin/bwa"), "requires BWA")
- @unittest.skipUnless(len(bwa_path) > 0, "requires BWA")
+ @unittest.skipUnless(os.path.exists("../misc/test"), "requires local data file")
def test_tpp_flag_noprimer(self):
with self.assertRaises(SystemExit):
def test_tpp_protocol_mme1(self):
(args, kwargs) = cleanargs(["-bwa", bwa_path, "-ref", h37fna, "-reads1", reads1, "-output", tpp_output_base, "-protocol", "Mme1"])
tppMain(*args, **kwargs)
......@@ -60,8 +60,10 @@ class UploadCommand(Command):
if not self.yes_or_no("Have you done the following? \n" +
"- Updated README/Documentation?\n"
"- Are in the master branch, and have you merged version branch into master?\n"
"- Have you updated CHANGELOG?\n"
"- Have you updated Transit Essentiality page?\n"
"- Updated version in src/pytransit/__init__.py (used to set git tag)?\n"
"- Is version v{0} correct".format(version)):
self.status("Exiting...")
sys.exit()
......@@ -73,7 +75,7 @@ class UploadCommand(Command):
self.status('Adding and pushing git tags to origin and public...')
os.system('git tag v{0}'.format(version))
os.system('git push origin --tags')
os.system('git push https://github.com/mad-lab/transit')
os.system('git push https://github.com/mad-lab/transit master')
os.system('git push https://github.com/mad-lab/transit --tags')
else:
self.status("Exiting...")
......@@ -142,7 +144,7 @@ setup(
# your project is installed. For an analysis of "install_requires" vs pip's
# requirements files see:
# https://packaging.python.org/en/latest/requirements.html
install_requires=['setuptools', 'numpy~=1.15', 'scipy~=1.1', 'matplotlib~=2.2', 'pillow~=5.0', 'statsmodels~=0.9'],
install_requires=['setuptools', 'numpy~=1.15', 'scipy~=1.2', 'matplotlib~=2.2', 'pillow~=5.0', 'statsmodels~=0.9'],
#dependency_links = [
# "git+https://github.com/wxWidgets/wxPython.git#egg=wxPython"
......
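The `install_requires` change above moves `scipy` from `~=1.1` to `~=1.2`. As a rough illustration of what the `~=` ("compatible release") operator accepts, here is a minimal sketch that handles plain numeric versions only; `satisfies_compatible_release` is a hypothetical helper written for this note, not something pip or setuptools exposes, and it ignores pre-releases, epochs and other parts of the full specification.

```python
def parse_release(version):
    """Turn '1.15.4' into (1, 15, 4). Plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(candidate, spec):
    """Return True if `candidate` satisfies `~=spec`.

    `~=X.Y` is treated as `>= X.Y` together with `== X.*`.
    """
    wanted = parse_release(spec)
    found = parse_release(candidate)
    if len(wanted) < 2:
        raise ValueError("'~=' needs at least two release segments")
    # Pad with zeros so that 1.2 and 1.2.0 compare as equal.
    width = max(len(wanted), len(found))
    wanted_padded = wanted + (0,) * (width - len(wanted))
    found_padded = found + (0,) * (width - len(found))
    prefix = wanted[:-1]
    return found_padded >= wanted_padded and found_padded[:len(prefix)] == prefix


print(satisfies_compatible_release("1.2.1", "1.2"))  # a 1.2.x release against ~=1.2
print(satisfies_compatible_release("1.1.0", "1.2"))  # a 1.1.x release against ~=1.2
```

Under this reading, a `scipy` 1.1.x install that satisfied the old requirement no longer satisfies the new one.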
......@@ -39,7 +39,6 @@ def run_main():
main(*args, **kwargs)
def main(*args, **kwargs):
vars = Globals()
# Check for arguments
if not args and not kwargs and hasWx:
......@@ -48,6 +47,7 @@ def main(*args, **kwargs):
form.update_dataset_list()
form.Show()
form.Maximize(True)
app.MainLoop()
# vars.action not defined, quit...
......@@ -73,7 +73,6 @@ def main(*args, **kwargs):
show_help()
else:
# Show help if needed
if "help" in kwargs or "-help" in kwargs:
show_help()
......@@ -82,7 +81,8 @@ def main(*args, **kwargs):
# Check for strange flags
known_flags = set(["tn5", "help", "himar1", "protocol", "primer", "reads1",
"reads2", "bwa", "ref", "maxreads", "output", "mismatches", "flags",
"barseq_catalog_in", "barseq_catalog_out"])
"barseq_catalog_in", "barseq_catalog_out",
"window-size", "bwa-alg", "replicon-ids","primer-start-window"])
unknown_flags = set(kwargs.keys()) - known_flags
if unknown_flags:
print "error: unrecognized flags:", ", ".join(unknown_flags)
......
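The flag check in the hunk above is a set difference between the supplied keyword names and the known ones. A standalone sketch of the same idea follows; the flag names are copied from the new `known_flags` set in the diff, while `find_unknown_flags` is a hypothetical helper name and returning a sorted list (instead of printing) is a choice made here for testability.

```python
KNOWN_FLAGS = {
    "tn5", "help", "himar1", "protocol", "primer", "reads1",
    "reads2", "bwa", "ref", "maxreads", "output", "mismatches", "flags",
    "barseq_catalog_in", "barseq_catalog_out",
    "window-size", "bwa-alg", "replicon-ids", "primer-start-window",
}


def find_unknown_flags(kwargs, known_flags=KNOWN_FLAGS):
    """Return the supplied flag names that are not recognized, sorted."""
    return sorted(set(kwargs) - set(known_flags))


supplied = {"bwa": "/usr/bin/bwa", "ref": "genome.fa", "replicon-id": "chr1"}
unknown = find_unknown_flags(supplied)
if unknown:
    print("error: unrecognized flags: " + ", ".join(unknown))
```

In this sketch the singular `replicon-id` is reported, since only the plural `replicon-ids` appears in the known set.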
......@@ -52,7 +52,7 @@ if hasWx:
self.vars = vars
initialize_globals(self.vars)
wx.Frame.__init__(self, None, wx.ID_ANY, "TPP: Tn-Seq PreProcessor") # v%s" % vars.version)
wx.Frame.__init__(self, None, wx.ID_ANY, "TPP: Tn-Seq PreProcessor") # v%s" % vars.version
# Add a panel so it looks the correct on all platforms
panel = wx.ScrolledWindow( self, wx.ID_ANY, wx.DefaultPosition, wx.Size( -1,-1 ), wx.HSCROLL|wx.VSCROLL )
......@@ -97,17 +97,27 @@ if hasWx:
# REFERENCE
sizer3 = wx.BoxSizer(wx.HORIZONTAL)
label3 = wx.StaticText(panel, label='Choose a reference genome (FASTA):',size=(330,-1))
label3 = wx.StaticText(panel, label='Choose a reference genome (FASTA) (REQUIRED):',size=(330,-1))
sizer3.Add(label3,0,wx.ALIGN_CENTER_VERTICAL,0)
self.picker3 = wx.lib.filebrowsebutton.FileBrowseButton(panel, id=wx.ID_ANY, dialogTitle='Please select the reference genome', fileMode=wx.FD_OPEN, fileMask='*.fna;*.fasta;*.fa', size=(400,30), startDirectory=os.path.dirname(vars.ref), initialValue=vars.ref, labelText='')
sizer3.Add(self.picker3, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
sizer3.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Select a reference genome in FASTA format."), flag=wx.CENTER, border=0)
sizer3.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Select a reference genome in FASTA format (can be a multi-contig fasta file)."), flag=wx.CENTER, border=0)
sizer3.Add((10, 1), 0, wx.EXPAND)
sizer.Add(sizer3,0,wx.EXPAND,0)
# REPLICON ID NAMES
sizer_replicon_ids = wx.BoxSizer(wx.HORIZONTAL)
label_replicon_ids = wx.StaticText(panel, label='ID names for each replicon: \n(if genome has multiple contigs)',size=(340,-1))
sizer_replicon_ids.Add(label_replicon_ids,0,wx.ALIGN_CENTER_VERTICAL,0)
self.replicon_ids = wx.TextCtrl(panel,value=vars.replicon_ids,size=(400,30))
sizer_replicon_ids.Add(self.replicon_ids, proportion=1.0, flag=wx.EXPAND|wx.ALL, border=5)
sizer_replicon_ids.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Specify names of each contig within the reference genome separated by commas (if using wig_gb_to_csv.py you must use the contig names in the Genbank file). Only required if there are multiple contigs; can leave blank if there is just one sequence.\nEnter 'auto' for autogenerated ids."), flag=wx.CENTER, border=0)
sizer_replicon_ids.Add((10, 1), 0, wx.EXPAND)
sizer.Add(sizer_replicon_ids,0,wx.EXPAND,0)
# READS 1
sizer1 = wx.BoxSizer(wx.HORIZONTAL)
label1 = wx.StaticText(panel, label='Choose the Fastq file for read 1:',size=(330,-1))
label1 = wx.StaticText(panel, label='Choose the Fastq file for read 1 (REQUIRED):',size=(330,-1))
sizer1.Add(label1,0,wx.ALIGN_CENTER_VERTICAL,0)
self.picker1 = wx.lib.filebrowsebutton.FileBrowseButton(panel, id=wx.ID_ANY, dialogTitle='Please select the .fastq file for read 1', fileMode=wx.FD_OPEN, fileMask='*.fastq;*.fq;*.reads;*.fasta;*.fa;*.fastq.gz', size=(400,30), startDirectory=os.path.dirname(vars.fq1), initialValue=vars.fq1, labelText='',changeCallback=self.OnChanged2)
sizer1.Add(self.picker1, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
......@@ -129,11 +139,11 @@ if hasWx:
# OUTPUT PREFIX
sizer5 = wx.BoxSizer(wx.HORIZONTAL)
label5 = wx.StaticText(panel, label='Prefix to use for output filenames:',size=(340,-1))
label5 = wx.StaticText(panel, label='Prefix to use for output filenames (REQUIRED):',size=(340,-1))
sizer5.Add(label5,0,wx.ALIGN_CENTER_VERTICAL,0)
self.base = wx.TextCtrl(panel,value=vars.base,size=(400,30))
sizer5.Add(self.base, proportion=1.0, flag=wx.EXPAND|wx.ALL, border=5)
sizer5.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Select a a label prefix that will be used when writing output files e.g. 'wt_run1'"), flag=wx.CENTER, border=0)
sizer5.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Select a prefix that will be used when writing output files"), flag=wx.CENTER, border=0)
sizer5.Add((10, 1), 0, wx.EXPAND)
sizer.Add(sizer5,0,wx.EXPAND,0)
......@@ -145,9 +155,9 @@ if hasWx:
self.protocol = wx.ComboBox(panel,choices=['Sassetti','Mme1', 'Tn5'],size=(400,30))
self.protocol.SetStringSelection(vars.protocol)
sizer_protocol.Add(self.protocol, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
protocol_tooltip_text = """Select which protocol best represents the reads. Default values will populate the fields.
protocol_tooltip_text = """Select which protocol used to prepare the sequencing samples. Default values will populate the other fields.
The Sassetti protocol generally assumes the reads include the primer prefix and part of the transposon sequencing. It also assumes reads are sequenced in the forward direction.
The Sassetti protocol generally assumes the reads include the primer prefix and part of the transposon sequence, followed by genomic sequence. It also assumes reads are sequenced in the forward direction. Barcodes are in read 2, along with genomic DNA from the other end of the fragment.
The Mme1 protocol generally assumes reads do NOT include the primer prefix, and that the reads are sequenced in the reverse direction"""
sizer_protocol.Add(TPPIcon(panel, wx.ID_ANY, bmp, protocol_tooltip_text), flag=wx.CENTER, border=0)
......@@ -161,7 +171,7 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
sizer8 = wx.BoxSizer(wx.HORIZONTAL)
label8 = wx.StaticText(panel, label='Transposon used:',size=(340,-1))
sizer8.Add(label8,0,wx.ALIGN_CENTER_VERTICAL,0)
self.transposon = wx.ComboBox(panel,choices=['Himar1','Tn5', '[Custom]'],size=(400,30))
self.transposon = wx.ComboBox(panel,choices=['Himar1','Tn5', 'pre-trimmed','[Custom]'],size=(400,30))
self.transposon.SetStringSelection(vars.transposon)
sizer8.Add(self.transposon, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
sizer8.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Select the transposon used to construct the TnSeq libraries. This will automatically populate the primer prefix field. Select custom to specify your own sequence."), flag=wx.CENTER, border=0)
......@@ -186,7 +196,7 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
sizer6 = wx.BoxSizer(wx.HORIZONTAL)
label6 = wx.StaticText(panel, label='Max reads (leave blank to use all):',size=(340,-1))
sizer6.Add(label6,0,wx.ALIGN_CENTER_VERTICAL,0)
self.maxreads = wx.TextCtrl(panel,size=(150,30))
self.maxreads = wx.TextCtrl(panel,value=str(vars.maxreads),size=(150,30)) # or "" if not defined? can't write to tpp.cfg
sizer6.Add(self.maxreads, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
sizer6.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Maximum reads to use from the reads files. Useful for running only a portion of very large number of reads. Leave blank to use all the reads."), flag=wx.CENTER, border=0)
sizer6.Add((10, 1), 0, wx.EXPAND)
......@@ -202,26 +212,54 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
sizer7.Add((10, 1), 0, wx.EXPAND)
sizer.Add(sizer7,0,wx.EXPAND,0)
# PRIMER_START_WINDOW
sizer_primer_start = wx.BoxSizer(wx.HORIZONTAL)
label_primer_start = wx.StaticText(panel, label='Start of window to look for prefix (Tn terminus):', size=(340,-1))
sizer_primer_start.Add(label_primer_start,0,wx.ALIGN_CENTER_VERTICAL,0)
primer_start_window = "%s,%s" % (vars.primer_start_window[0],vars.primer_start_window[1])
self.primer_start = wx.TextCtrl(panel,value=primer_start_window,size=(150,30))
sizer_primer_start.Add(self.primer_start, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
sizer_primer_start.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Region in read 1 to search for start of prefix seq (i.e. end of transposon)."), flag=wx.CENTER, border=0)
sizer_primer_start.Add((10, 1), 0, wx.EXPAND)
sizer.Add(sizer_primer_start,0,wx.EXPAND,0)
# # WINDOW SIZE # [RJ] This block is to add the acceptance of a set window size for setting P,Q parameters
# sizer_window_size = wx.BoxSizer(wx.HORIZONTAL)
# label_window_size = wx.StaticText(panel, label='Window size for Tn prefix in read:', size=(340,-1))
# sizer_window_size.Add(label_window_size,0,wx.ALIGN_CENTER_VERTICAL,0)
# self.window_size = wx.TextCtrl(panel,value=str(vars.window_size),size=(150,30))
# sizer_window_size.Add(self.window_size, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
# sizer_window_size.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Window size for extract_staggered() to look for start of Tn prefix."), flag=wx.CENTER, border=0)
# sizer_window_size.Add((10, 1), 0, wx.EXPAND)
# sizer.Add(sizer_window_size,0,wx.EXPAND,0)
# BWA
sizer0 = wx.BoxSizer(wx.HORIZONTAL)
label0 = wx.StaticText(panel, label='BWA executable:',size=(330,-1))
label0 = wx.StaticText(panel, label='BWA executable (REQUIRED):',size=(330,-1))
sizer0.Add(label0,0,wx.ALIGN_CENTER_VERTICAL,0)
self.picker0 = wx.lib.filebrowsebutton.FileBrowseButton(panel, id = wx.ID_ANY, size=(400,30), dialogTitle='Path to BWA', fileMode=wx.FD_OPEN, fileMask='bwa*', startDirectory=os.path.dirname(vars.bwa), initialValue=vars.bwa, labelText='')
sizer0.Add(self.picker0, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
sizer0.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Specify a path to the BWA executable (including the executable)."), flag=wx.CENTER, border=0)
sizer0.Add((10, 1), 0, wx.EXPAND)
sizer.Add(sizer0,0,wx.EXPAND,0)
self.bwa_alg = wx.ComboBox(panel,choices=["use algorithm 'aln'", "use algorithm 'mem'"],size=(200,30))
if vars.bwa_alg=='aln': self.bwa_alg.SetSelection(0)
else: self.bwa_alg.SetSelection(1) # default
sizer0.Add(self.bwa_alg, proportion=0.5, flag=wx.EXPAND|wx.ALL, border=5) ##
self.bwa_alg.Bind(wx.EVT_COMBOBOX, self.OnBwaAlgSelection, id=self.bwa_alg.GetId())
sizer0.Add(TPPIcon(panel, wx.ID_ANY, bmp, "'mem' is considered to do a better job at mapping reads, but 'aln' is available as an alternative."), flag=wx.CENTER, border=0)
sizer0.Add((10, 1), 0, wx.EXPAND)
#sizer.Add(sizer0,0,wx.EXPAND,0)
# BWA FLAGS
sizer8 = wx.BoxSizer(wx.HORIZONTAL)
label8 = wx.StaticText(panel, label='BWA flags (Optional)',size=(340,-1))
sizer8.Add(label8,0,wx.ALIGN_CENTER_VERTICAL,0)
self.flags = wx.TextCtrl(panel,value=vars.flags,size=(400,30))
sizer8.Add(self.flags, proportion=1, flag=wx.EXPAND|wx.ALL, border=5)
sizer8.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Use this textobx to enter any desired flags for the BWA alignment. For example, to limit the number of mismatches to 1, type: -k 1. See the BWA documentation for all possible flags."), flag=wx.CENTER, border=0)
sizer8.Add(TPPIcon(panel, wx.ID_ANY, bmp, "Use this textbox to enter any desired flags for the BWA alignment. For example, to limit the number of mismatches to 1, type: -k 1. See the BWA documentation for all possible flags."), flag=wx.CENTER, border=0)
sizer8.Add((10, 1), 0, wx.EXPAND)
sizer.Add(sizer8,0,wx.EXPAND,0)
......@@ -250,7 +288,15 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
sizer.Add(sizer9,0,wx.EXPAND,0)
#
def OnBwaAlgSelection(self, event):
if 'aln' in self.bwa_alg.GetValue():
self.vars.bwa_alg = "aln"
elif 'mem' in self.bwa_alg.GetValue():
self.vars.bwa_alg = "mem"
else:
self.vars.bwa_alg = "[Custom]"
#
......@@ -263,6 +309,11 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
self.prefix.SetValue("ACTTATCAGCCAACCTGTTA")
self.transposon.SetStringSelection("Himar1")
self.vars.transposon = "Himar1"
elif self.transposon.GetValue()=="pre-trimmed":
self.transposon.SetValue("pre-trimmed")
self.transposon.SetStringSelection("pre-trimmed")
self.vars.transposon = "pre-trimmed"
self.prefix.SetValue('""')
else:
self.transposon.SetValue("[Custom]")
self.transposon.SetStringSelection("[Custom]")
......@@ -281,9 +332,9 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
self.transposon.SetStringSelection("Himar1")
self.vars.transposon = "Himar1"
elif self.protocol.GetValue()=="Mme1":
self.prefix.SetValue("")
self.transposon.SetStringSelection("Himar1")
self.vars.transposon = "Himar1"
self.prefix.SetValue('""')
self.transposon.SetStringSelection("pre-trimmed")
self.vars.transposon = ""
#
......@@ -292,7 +343,7 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
value = os.path.basename(str_path).split('.')[0]
if '_R1' in value or '_R2':
value = value.split('_')[0]
self.base.SetValue(value)
#self.base.SetValue(value)
#
......@@ -300,8 +351,8 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
value2 = os.path.basename(self.picker2.GetValue()).split('.')[0]
value1 = os.path.basename(self.picker1.GetValue()).split('.')[0]
value = os.path.commonprefix([value1, value2])
self.base.SetValue(value)
self.base.Refresh()
#self.base.SetValue(value)
#self.base.Refresh()
#
......@@ -393,9 +444,9 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
#
def add_data(self, dataset,vals):
self.list_ctrl.InsertStringItem(self.index, dataset)
self.list_ctrl.InsertItem(self.index, dataset)
for i in range(1, len(vals)+1):
self.list_ctrl.SetStringItem(self.index, i, vals[i-1])
self.list_ctrl.SetItem(self.index, i, vals[i-1])
self.index += 1
#
......@@ -430,6 +481,18 @@ The Mme1 protocol generally assumes reads do NOT include the primer prefix, and
self.vars.base = base
self.vars.mm1 = mm1
self.vars.prefix = prefix
#self.vars.window_size = int(self.window_size.GetValue())
if 'aln' in self.bwa_alg.GetValue():
self.vars.bwa_alg = 'aln'
elif 'mem' in self.bwa_alg.GetValue():
self.vars.bwa_alg = 'mem'
self.vars.replicon_ids = self.replicon_ids.GetValue().split(',')
v = self.primer_start.GetValue()
if v!="":
v = v.split(',')
self.vars.primer_start_window = (int(v[0]),int(v[1]))
if maxreads == '': self.vars.maxreads = -1
else: self.vars.maxreads = int(maxreads)
......
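The last hunk above turns two GUI text fields into typed values: a `start,end` pair for the primer start window and an integer (or `-1` for blank) for the maximum number of reads. A minimal sketch of that parsing, separated from wxPython, might look as follows; the function names are hypothetical, and the explicit error for malformed input is an addition that the original code does not make.

```python
def parse_primer_start_window(text, current):
    """Parse 'start,end' into a tuple of ints; keep `current` if blank."""
    text = text.strip()
    if text == "":
        return current
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError("expected two comma-separated integers, got %r" % text)
    return (int(parts[0]), int(parts[1]))


def parse_maxreads(text):
    """Blank means 'use all reads', represented as -1 as in the diff."""
    text = text.strip()
    return -1 if text == "" else int(text)


print(parse_primer_start_window("0,20", current=(5, 10)))
print(parse_maxreads(""))
```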
......@@ -2,6 +2,6 @@
__all__ = ["transit_tools", "tnseq_tools", "norm_tools", "stat_tools"]
__version__ = "v2.3.4"
__version__ = "v2.5.2"
prefix = "[TRANSIT]"
......@@ -21,11 +21,12 @@ import pytransit
import pytransit.transit_tools as transit_tools
import pytransit.analysis
import pytransit.export
import pytransit.convert
method_wrap_width = 250
methods = pytransit.analysis.methods
export_methods = pytransit.export.methods
convert_methods = pytransit.convert.methods
all_methods = {}
all_methods.update(methods)
......@@ -48,10 +49,20 @@ def main(*args, **kwargs):
sys.argv.remove("--debug")
kwargs.pop("-debug")
if (not args and 'h' in kwargs):
if (not args and ('v' in kwargs or '-version' in kwargs)):
print "Version: {0}".format(pytransit.__version__)
sys.exit(0)
if (not args and ('h' in kwargs or '-help' in kwargs)):
print "For commandline mode, please use one of the known methods (or see documentation to add a new one):"
print("Analysis methods: ")
for m in all_methods:
## TODO :: Move normalize to separate subcommand?
if (m == "normalize"): continue
print "\t - %s" % m
print("Other functions: ")
print("\t - normalize")
print("\t - convert")
print("\t - export")
print "Usage: python %s <method>" % sys.argv[0]
sys.exit(0)
......@@ -73,7 +84,7 @@ def main(*args, **kwargs):
#start the applications
app.MainLoop()
# Tried GUI mode but has no wxPython
elif not (args or kwargs) and not hasWx:
print "Please install wxPython to run in GUI Mode."
......@@ -93,7 +104,7 @@ def main(*args, **kwargs):
export_method_name = ""
if len(args) > 1:
export_method_name = args[1]
if export_method_name not in export_methods:
print "Error: Need to specify the export method."
print "Please use one of the known methods (or see documentation to add a new one):"
......@@ -103,7 +114,20 @@ def main(*args, **kwargs):
else:
methodobj = export_methods[export_method_name].method.fromconsole()
methodobj.Run()
elif method_name.lower() == "convert":
convert_method_name = ""
if len(args) > 1:
convert_method_name = args[1]
if convert_method_name not in convert_methods:
print "Error: Need to specify the convert method."
print "Please use one of the known methods (or see documentation to add a new one):"
for m in convert_methods:
print "\t - %s" % m
print "Usage: python %s convert <method>" % sys.argv[0]
else:
methodobj = convert_methods[convert_method_name].method.fromconsole()
methodobj.Run()
else:
print "Error: The '%s' method is unknown." % method_name
print "Please use one of the known methods (or see documentation to add a new one):"
......
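The new `convert` branch mirrors the existing `export` branch: look up the second positional argument in a dictionary of methods and either run the match or list the known names. A simplified sketch of that lookup is shown below; `ExampleConvert`, the `"example_convert"` key and `select_method` are hypothetical stand-ins, not names from pytransit, and the function returns a message instead of printing it.

```python
class ExampleConvert(object):
    """Hypothetical stand-in for a convert method object."""

    def run(self):
        return "converted"


CONVERT_METHODS = {"example_convert": ExampleConvert()}


def select_method(args, methods, kind):
    """Return (method, message): the chosen method, or None plus a usage message."""
    name = args[1] if len(args) > 1 else ""
    if name not in methods:
        lines = ["Error: Need to specify the %s method." % kind,
                 "Please use one of the known methods:"]
        lines += ["\t - %s" % m for m in sorted(methods)]
        return None, "\n".join(lines)
    return methods[name], None


method, message = select_method(["convert", "example_convert"], CONVERT_METHODS, "convert")
print(method.run() if method else message)
```

Factoring the lookup into one function would let the `export` and `convert` branches share it, though the diff keeps them as separate blocks.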
......@@ -20,7 +20,9 @@ import utest
import normalize
import pathway_enrichment #08/22/2018 by Ivan
import anova
import zinb
import tnseq_stats
import winsorize
methods = {}
methods["example"] = example.ExampleAnalysis()
......@@ -34,12 +36,14 @@ methods["rankproduct"] = rankproduct.RankProductAnalysis()
methods["utest"] = utest.UTestAnalysis()
methods["GI"] = gi.GIAnalysis()
methods["anova"] = anova.AnovaAnalysis()
methods["zinb"] = zinb.ZinbAnalysis()
#methods["mcce"] = mcce.MCCEAnalysis()
#methods["mcce2"] = mcce2.MCCE2Analysis()
#methods["motifhmm"] = motifhmm.MotifHMMAnalysis()
methods["normalize"] = normalize.Normalize()
methods["pathway_enrichment"]=pathway_enrichment.GSEAAnalysis()
methods["tnseq_stats"]=tnseq_stats.TnseqStats()
methods["winsorize"] = winsorize.Winsorize()
# EXPORT METHODS
import norm
......
......@@ -33,9 +33,9 @@ class AnovaMethod(base.MultiConditionMethod):
"""
anova
"""
def __init__(self, combined_wig, metadata, annotation, normalization, output_file, ignored_conditions=set()):
base.MultiConditionMethod.__init__(self, short_name, long_name, short_desc, long_desc, combined_wig, metadata, annotation, output_file, normalization=normalization)
self.ignored_conditions = ignored_conditions
def __init__(self, combined_wig, metadata, annotation, normalization, output_file, ignored_conditions=[], included_conditions=[], nterm=0.0, cterm=0.0):
base.MultiConditionMethod.__init__(self, short_name, long_name, short_desc, long_desc, combined_wig, metadata, annotation, output_file,
normalization=normalization, ignored_conditions=ignored_conditions, included_conditions=included_conditions, nterm=nterm, cterm=cterm)
@classmethod
def fromargs(self, rawargs):
......@@ -46,13 +46,21 @@ class AnovaMethod(base.MultiConditionMethod):
sys.exit(0)
combined_wig = args[0]
annotation = args[1]
metadata = args[2]
annotation = args[2]
metadata = args[1]
output_file = args[3]
normalization = kwargs.get("n", "TTR")
ignored_conditions = set(kwargs.get("-ignore-conditions", "Unknown").split(","))
NTerminus = float(kwargs.get("iN", 0.0))
CTerminus = float(kwargs.get("iC", 0.0))
ignored_conditions = filter(None, kwargs.get("-ignore-conditions", "").split(","))
included_conditions = filter(None, kwargs.get("-include-conditions", "").split(","))
if len(included_conditions) > 0 and len(ignored_conditions) > 0:
print(self.transit_error("Cannot use both include-conditions and ignore-conditions flags"))
print(self.usage_string())
sys.exit(0)
return self(combined_wig, metadata, annotation, normalization, output_file, ignored_conditions)
return self(combined_wig, metadata, annotation, normalization, output_file, ignored_conditions, included_conditions, NTerminus, CTerminus)
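The flag handling above can be sketched in isolation. This is a minimal illustration, not the project's code: `parse_condition_flags` is a hypothetical helper, and it raises `ValueError` instead of printing usage and exiting. It is written for Python 3, where `filter` returns an iterator and `len(filter(...))` would fail, so a list comprehension is used instead.

```python
def parse_condition_flags(kwargs):
    """Split comma-separated condition flags into lists, dropping empty entries.

    `kwargs` is assumed to be a dict of parsed command-line flags, keyed the
    same way as in the diff ("-ignore-conditions", "-include-conditions").
    """
    def split_list(raw):
        # "".split(",") yields [""], so empty strings are filtered out.
        return [item for item in raw.split(",") if item]

    ignored = split_list(kwargs.get("-ignore-conditions", ""))
    included = split_list(kwargs.get("-include-conditions", ""))

    # The two flags are mutually exclusive.
    if ignored and included:
        raise ValueError(
            "Cannot use both include-conditions and ignore-conditions flags")
    return ignored, included
```

With no flags given, both lists come back empty, which the caller can treat as "keep every mapped condition".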
def wigs_to_conditions(self, conditionsByFile, filenamesInCombWig):
"""
......@@ -60,7 +68,7 @@ class AnovaMethod(base.MultiConditionMethod):
({FileName: Condition}, [FileName]) -> [Condition]
Condition :: [String]
"""
return [conditionsByFile.get(f, "Unknown") for f in filenamesInCombWig]
return [conditionsByFile.get(f, self.unknown_cond_flag) for f in filenamesInCombWig]
def means_by_condition_for_gene(self, sites, conditions, data):
"""
......@@ -69,45 +77,12 @@ class AnovaMethod(base.MultiConditionMethod):
Site :: Number
Condition :: String
"""
nTASites = len(sites)
wigsByConditions = collections.defaultdict(lambda: [])
for i, c in enumerate(conditions):
wigsByConditions[c].append(i)
return { c: numpy.mean(data[wigIndex][:, sites]) for (c, wigIndex) in wigsByConditions.items() }
def filter_by_conditions_blacklist(self, data, conditions, ignored_conditions):
"""
Filters out wigfiles, with ignored conditions.
([[Wigdata]], [Condition]) -> Tuple([[Wigdata]], [Condition])
"""
d_filtered, cond_filtered = [], [];
for i, c in enumerate(conditions):
if c not in ignored_conditions:
d_filtered.append(data[i])
cond_filtered.append(conditions[i])
return (numpy.array(d_filtered), numpy.array(cond_filtered))
def read_samples_metadata(self, metadata_file):
"""
Filename -> ConditionMap
ConditionMap :: {Filename: Condition}
"""
wigFiles = []
conditionsByFile = {}
headersToRead = ["condition", "filename"]
with open(metadata_file) as mfile:
lines = mfile.readlines()
headIndexes = [i
for h in headersToRead
for i, c in enumerate(lines[0].split())
if c.lower() == h]
for line in lines:
if line[0]=='#': continue
vals = line.split()
[condition, wfile] = vals[headIndexes[0]], vals[headIndexes[1]]
conditionsByFile[wfile] = condition
return conditionsByFile
return { c: numpy.mean(data[wigIndex][:, sites]) if nTASites > 0 else 0 for (c, wigIndex) in wigsByConditions.items() }
def means_by_rv(self, data, RvSiteindexesMap, genes, conditions):
"""
......@@ -121,8 +96,7 @@ class AnovaMethod(base.MultiConditionMethod):
MeansByRv = {}
for gene in genes:
Rv = gene["rv"]
if len(RvSiteindexesMap[gene["rv"]]) > 0: # skip genes with no TA sites
MeansByRv[Rv] = self.means_by_condition_for_gene(RvSiteindexesMap[Rv], conditions, data)
MeansByRv[Rv] = self.means_by_condition_for_gene(RvSiteindexesMap[Rv], conditions, data)
return MeansByRv
def group_by_condition(self, wigList, conditions):
......@@ -134,10 +108,12 @@ class AnovaMethod(base.MultiConditionMethod):
DataForCondition :: [Number]
"""
countsByCondition = collections.defaultdict(lambda: [])
countSum = 0
for i, c in enumerate(conditions):
countSum += numpy.sum(wigList[i])
countsByCondition[c].append(wigList[i])
return [numpy.array(v).flatten() for v in countsByCondition.values()]
return (countSum, [numpy.array(v).flatten() for v in countsByCondition.values()])
def run_anova(self, data, genes, MeansByRv, RvSiteindexesMap, conditions):
"""
......@@ -152,15 +128,25 @@ class AnovaMethod(base.MultiConditionMethod):
count = 0
self.progress_range(len(genes))
pvals,Rvs = [],[]
pvals,Rvs,status = [],[],[]
for gene in genes:
count += 1
Rv = gene["rv"]
if Rv in MeansByRv:
countsvec = self.group_by_condition(map(lambda wigData: wigData[RvSiteindexesMap[Rv]], data), conditions)
stat,pval = scipy.stats.f_oneway(*countsvec)
pvals.append(pval)
Rvs.append(Rv)
if (len(RvSiteindexesMap[Rv]) <= 1):
status.append("TA sites <= 1")
pvals.append(1)
else:
countSum, countsVec = self.group_by_condition(map(lambda wigData: wigData[RvSiteindexesMap[Rv]], data), conditions)
if (countSum == 0):
pval = 1
status.append("No counts in all conditions")
pvals.append(pval)
else:
stat,pval = scipy.stats.f_oneway(*countsVec)
status.append("-")
pvals.append(pval)
Rvs.append(Rv)
# Update progress
text = "Running Anova Method... %5.1f%%" % (100.0*count/len(genes))
......@@ -171,10 +157,10 @@ class AnovaMethod(base.MultiConditionMethod):
qvals = numpy.full(pvals.shape, numpy.nan)
qvals[mask] = statsmodels.stats.multitest.fdrcorrection(pvals[mask])[1] # BH, alpha=0.05
p,q = {},{}
p,q,statusMap = {},{},{}
for i,rv in enumerate(Rvs):
p[rv],q[rv] = pvals[i],qvals[i]
return (p, q)
p[rv],q[rv],statusMap[rv] = pvals[i],qvals[i],status[i]
return (p, q, statusMap)
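The per-gene guard in `run_anova` can be read as a small decision function. The sketch below assumes per-condition counts are plain lists and takes the statistical test as a callable (`test`), standing in for something like `lambda *groups: scipy.stats.f_oneway(*groups)[1]`. The names `gene_status` and `pvalue_with_status` are hypothetical.

```python
def gene_status(n_ta_sites, counts_by_condition):
    """Decide whether the test should run for a gene, and report why not.

    Returns (run_test, status), using the status strings from the diff.
    """
    if n_ta_sites <= 1:
        return False, "TA sites <= 1"
    if sum(sum(counts) for counts in counts_by_condition) == 0:
        return False, "No counts in all conditions"
    return True, "-"


def pvalue_with_status(n_ta_sites, counts_by_condition, test):
    """Return (pval, status); skipped genes get a p-value of 1."""
    run_test, status = gene_status(n_ta_sites, counts_by_condition)
    pval = test(*counts_by_condition) if run_test else 1.0
    return pval, status
```

Assigning a p-value of 1 to skipped genes keeps them in the output table while marking them as non-significant. The status column records the reason.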
def Run(self):
self.transit_message("Starting Anova analysis")
......@@ -186,43 +172,50 @@ class AnovaMethod(base.MultiConditionMethod):
self.transit_message("Normalizing using: %s" % self.normalization)
(data, factors) = norm_tools.normalize_data(data, self.normalization)
conditionsByFile, _, _, _ = tnseq_tools.read_samples_metadata(self.metadata)
conditions = self.wigs_to_conditions(
self.read_samples_metadata(self.metadata),
conditionsByFile,
filenamesInCombWig)
data, conditions = self.filter_by_conditions_blacklist(data, conditions, self.ignored_conditions)
data, conditions, _, _ = self.filter_wigs_by_conditions(data, conditions, ignored_conditions = self.ignored_conditions, included_conditions = self.included_conditions)
genes = tnseq_tools.read_genes(self.annotation_path)
TASiteindexMap = {TA: i for i, TA in enumerate(sites)}
RvSiteindexesMap = tnseq_tools.rv_siteindexes_map(genes, TASiteindexMap)
RvSiteindexesMap = tnseq_tools.rv_siteindexes_map(genes, TASiteindexMap, nterm=self.NTerminus, cterm=self.CTerminus)
MeansByRv = self.means_by_rv(data, RvSiteindexesMap, genes, conditions)
self.transit_message("Running Anova")
pvals,qvals = self.run_anova(data, genes, MeansByRv, RvSiteindexesMap, conditions)
pvals,qvals,run_status = self.run_anova(data, genes, MeansByRv, RvSiteindexesMap, conditions)
self.transit_message("Adding File: %s" % (self.output))
file = open(self.output,"w")
conditionsList = list(set(conditions))
vals = "Rv Gene TAs".split() + conditionsList + "pval padj".split()
file.write('\t'.join(vals)+EOL)
conditionsList = self.included_conditions if len(self.included_conditions) > 0 else list(set(conditions))
heads = ("Rv Gene TAs".split() +
conditionsList +
"pval padj".split() + ["status"])
file.write("#Console: python %s\n" % " ".join(sys.argv))
file.write('\t'.join(heads)+EOL)
for gene in genes:
Rv = gene["rv"]
if Rv in MeansByRv:
vals = ([Rv, gene["gene"], str(len(RvSiteindexesMap[Rv]))] +
["%0.1f" % MeansByRv[Rv][c] for c in conditionsList] +
["%f" % x for x in [pvals[Rv], qvals[Rv]]])
["%0.2f" % MeansByRv[Rv][c] for c in conditionsList] +
["%f" % x for x in [pvals[Rv], qvals[Rv]]] + [run_status[Rv]])
file.write('\t'.join(vals)+EOL)
file.close()
self.transit_message("Finished Anova analysis")
self.transit_message("Time: %0.1fs\n" % (time.time() - start_time))
@classmethod
def usage_string(self):
return """python %s anova <combined wig file> <annotation .prot_table> <samples_metadata file> <output file> [Optional Arguments]
return """python %s anova <combined wig file> <samples_metadata file> <annotation .prot_table> <output file> [Optional Arguments]
Optional Arguments:
-n <string> := Normalization method. Default: -n TTR
--ignore-conditions <cond1,cond2> := Comma seperated list of conditions to ignore, for the analysis. Default --ignore-conditions Unknown
--ignore-conditions <cond1,cond2> := Comma separated list of conditions to ignore, for the analysis. Default --ignore-conditions Unknown
--include-conditions <cond1,cond2> := Comma separated list of conditions to include, for the analysis. Conditions not in this list, will be ignored.
-iN <float> := Ignore TAs occurring within given percentage (as integer) of the N terminus. Default: -iN 0
-iC <float> := Ignore TAs occurring within given percentage (as integer) of the C terminus. Default: -iC 0
""" % (sys.argv[0])
......
......@@ -18,6 +18,7 @@ if hasWx:
import traceback
import datetime
import numpy
import pytransit.transit_tools as transit_tools
file_prefix = "[FileDisplay]"
......@@ -508,14 +509,56 @@ class MultiConditionMethod(AnalysisMethod):
Class to be inherited by analysis methods that compare essentiality between multiple conditions (e.g. Anova).
'''
def __init__(self, short_name, long_name, short_desc, long_desc, combined_wig, metadata, annotation_path, output, normalization=None, LOESS=False, ignoreCodon=True, wxobj=None):
def __init__(self, short_name, long_name, short_desc, long_desc, combined_wig, metadata, annotation_path, output, normalization=None, LOESS=False, ignoreCodon=True, wxobj=None, ignored_conditions=[], included_conditions=[], nterm=0.0, cterm=0.0):
AnalysisMethod.__init__(self, short_name, long_name, short_desc, long_desc, output,
annotation_path, wxobj)
self.combined_wig = combined_wig
self.metadata = metadata
self.normalization = normalization
self.LOESS = LOESS
self.ignoreCodon = ignoreCodon
self.NTerminus = nterm
self.CTerminus = cterm
self.unknown_cond_flag = "FLAG-UNMAPPED-CONDITION-IN-WIG"
self.ignored_conditions = ignored_conditions
self.included_conditions = included_conditions
def filter_wigs_by_conditions(self, data, conditions, covariates = [], interactions = [], ignored_conditions = [], included_conditions = []):
"""
Filters conditions that are ignored/included.
([[Wigdata]], [Condition], [[Covar]], [[Interaction]], [Condition], [Condition]) -> Tuple([[Wigdata]], [Condition], [[Covar]], [[Interaction]])
"""
ignored_conditions, included_conditions = (set(ignored_conditions), set(included_conditions))
d_filtered, cond_filtered, filtered_indexes = [], [], [];
if len(ignored_conditions) > 0 and len(included_conditions) > 0:
self.transit_error("Both ignored and included conditions have len > 0", ignored_conditions, included_conditions)
sys.exit(0)
elif (len(ignored_conditions) > 0):
self.transit_message("conditions ignored: {0}".format(ignored_conditions))
for i, c in enumerate(conditions):
if (c != self.unknown_cond_flag) and (c not in ignored_conditions):
d_filtered.append(data[i])
cond_filtered.append(conditions[i])
filtered_indexes.append(i)
elif (len(included_conditions) > 0):
self.transit_message("conditions included: {0}".format(included_conditions))
for i, c in enumerate(conditions):
if (c != self.unknown_cond_flag) and (c in included_conditions):
d_filtered.append(data[i])
cond_filtered.append(conditions[i])
filtered_indexes.append(i)
else:
for i, c in enumerate(conditions):
if (c != self.unknown_cond_flag):
d_filtered.append(data[i])
cond_filtered.append(conditions[i])
filtered_indexes.append(i)
covariates_filtered = [[c[i] for i in filtered_indexes] for c in covariates]
interactions_filtered = [[c[i] for i in filtered_indexes] for c in interactions]
return (numpy.array(d_filtered),
numpy.array(cond_filtered),
numpy.array(covariates_filtered),
numpy.array(interactions_filtered))
#
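The three branches of `filter_wigs_by_conditions` share one rule: unmapped samples are always dropped, then either the ignore list or the include list is applied. A minimal pure-Python sketch of that rule follows, assuming plain lists instead of numpy arrays and omitting covariates and interactions. `filter_by_conditions` and the module-level constant are hypothetical names; the flag value is the one from the diff.

```python
UNKNOWN_COND_FLAG = "FLAG-UNMAPPED-CONDITION-IN-WIG"  # value taken from the diff


def filter_by_conditions(data, conditions, ignored=(), included=()):
    """Return (data, conditions, kept_indexes) after include/ignore filtering."""
    ignored, included = set(ignored), set(included)
    if ignored and included:
        raise ValueError("Both ignored and included conditions were given")

    def keep(condition):
        if condition == UNKNOWN_COND_FLAG:
            return False  # samples with no metadata entry are always dropped
        if ignored:
            return condition not in ignored
        if included:
            return condition in included
        return True

    kept = [i for i, c in enumerate(conditions) if keep(c)]
    return [data[i] for i in kept], [conditions[i] for i in kept], kept
```

Returning the kept indexes lets a caller apply the same selection to any parallel per-sample lists, which is how the original filters covariates and interactions.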
......
......@@ -516,8 +516,8 @@ class BinomialMethod(base.SingleConditionMethod):
Optional Arguments:
-s <int> := Number of samples to take. Default: -s 10000
-b <int> := Number of burn-in samples to take. Default: -b 500
-iN <float> := Ignore TAs occuring at given fraction of the N terminus. Default: -iN 0.0
-iC <float> := Ignore TAs occuring at given fraction of the C terminus. Default: -iC 0.0
-iN <float> := Ignore TAs occurring at given percentage (as integer) of the N terminus. Default: -iN 0
-iC <float> := Ignore TAs occurring at given percentage (as integer) of the C terminus. Default: -iC 0
Hyper-parameters:
-pi0 <float> := Hyper-parameters for rho, non-essential genes. Default: -pi0 0.5
......
......@@ -901,8 +901,8 @@ class GIMethod(base.QuadConditionMethod):
-n <string> := Normalization method. Default: -n TTR
-iz := Include rows with zero across conditions.
-l := Perform LOESS Correction; Helps remove possible genomic position bias. Default: Turned Off.
-iN <float> := Ignore TAs occuring at given fraction of the N terminus. Default: -iN 0.0
-iC <float> := Ignore TAs occuring at given fraction of the C terminus. Default: -iC 0.0
-iN <float> := Ignore TAs occurring at given percentage (as integer) of the N terminus. Default: -iN 0
-iC <float> := Ignore TAs occurring at given percentage (as integer) of the C terminus. Default: -iC 0
""" % (sys.argv[0])
......