Commits on Source (7)
2019-10-01 Tao Liu <vladimir.liu@gmail.com>
MACS version 2.2.4 (Python3)
* Features added
1) First Python3 version of MACS2 released.
2) Version numbers 2.2.x will be used for MACS2 in Python3, in
parallel to the 2.1.x series.
3) More comprehensive test.sh script to check the consistency of
results between the Python2 and Python3 versions.
4) Simplify the setup.py script, since the newest setuptools
transparently supports Cython; when Cython is not installed by the
user, setup.py can still compile using only the bundled C code.
5) Fix Signal.pyx to use np.array instead of np.mat (see the
illustrative sketch right after this list).
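As a hedged illustration of item 5 above -- this is not the actual Signal.pyx change, only the generic difference between the two container types (`np.mat` was an alias for `np.asmatrix`):

```python
# Generic np.matrix vs. np.ndarray contrast; not MACS2 code.
import numpy as np

data = [[1.0, 2.0], [3.0, 4.0]]
m = np.asmatrix(data)   # old style: np.mat()/np.asmatrix() returns np.matrix; '*' is a matrix product
a = np.array(data)      # new style: plain ndarray; '*' is element-wise

print((m * m).tolist())   # [[7.0, 10.0], [15.0, 22.0]]  (matrix multiplication)
print((a * a).tolist())   # [[1.0, 4.0], [9.0, 16.0]]    (element-wise product)
print((a @ a).tolist())   # [[7.0, 10.0], [15.0, 22.0]]  (explicit matrix product on ndarray)
```

Plain ndarrays behave consistently with the rest of NumPy, whereas the `np.matrix` class is discouraged upstream.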
2019-09-30 Tao Liu <vladimir.liu@gmail.com>
MACS version 2.1.4
......
# INSTALL Guide For MACS
Time-stamp: <2019-10-03 11:56:03 taoliu>
Please check the following instructions to complete your installation.
## Prerequisites
MACS v2.2.x requires Python3. We have tested MACS with Python 3.5,
3.6, and 3.7. MACS runs more slowly under Python 3.5, so Python 3.6 or
3.7 is recommended.
MACS also requires [Numpy](http://www.scipy.org/Download) (>=1.17).
GCC is required to compile the `.c` code in the MACS v2 package, and
Python header files are needed. If you are using Mac OSX, I recommend
you install Xcode; if you are using Linux, you need to make sure the
`python-dev` package is installed -- the actual package name depends
on the Linux distribution you are using.
[Cython](http://cython.org/) is **NOT** required, although most of the
MACS code is written in Cython. If you plan to generate the `.c` files
from the `.pyx` files yourself, install Cython (>=0.29) and then use
the `setup.py` script.
## Prepare a virtual Python environment
I strongly recommend installing your MACS program in a virtual
environment, so that you have full control of your installation and
won't mess up your system libraries. To learn about virtual
environments, read [this
article](https://docs.python.org/3/library/venv.html). A simple way to
create a Python3 virtual environment is
`$ python3 -m venv MyPythonEnv/`
Then activate it by
`$ source MyPythonEnv/bin/activate`
## Install through PyPI
The easiest way to install MACS is through the PyPI system. Get `pip`
if it's not available in your system. If you created a virtual
environment as described above, your `pip` command will install
everything under the folder you specified previously through the
`python3 -m venv` command. Then, on the command line, type `pip
install macs2`. PyPI will install Numpy automatically if it is absent.
To upgrade MACS2, type `pip install --upgrade macs2`. It will check
the currently installed MACS2, compare its version with the one in the
PyPI repository, and download and install a newer version when
necessary.
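A quick, hedged way to confirm which MACS2 ended up installed (assuming the install succeeded; `MACS2.Constants` carries the version string, as the Constants diff further down shows):

```python
# Post-install sanity check: import the installed package and print its version.
from MACS2.Constants import MACS_VERSION

print(MACS_VERSION)  # expect "2.2.4" for this release
```

Running `macs2 --version` from the command line should report the same, provided the executable is on your `PATH`.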
## Install through Conda
Please check [MACS2 page on
bioconda](https://anaconda.org/bioconda/macs2) for instructions.
## Install through Debian APT
MACS is included in Debian Stretch/Buster/Unstable, blessed by some
kind of Super Cow Powers.
## Install from source
MACS uses Python's [setuptools](https://setuptools.readthedocs.io) for
source code installations. To install a source distribution of MACS,
unpack the distribution tarball, or clone the Git repository with `git
clone git@github.com:taoliu/MACS.git`. Go to the directory where you
unpacked MACS, and simply run the install script:
`$ python setup.py install`
By default, the script will install the python library and executable
codes according to the environment. When you run the command under a
virtualenv, the script will install to the virtual environment; when
you run it without a virtual environment, you may need to be root or
administrator of the machine to complete the installation. Please
contact the system administrator if you need their help. If you need
to provide a nonstandard install prefix, or any other nonstandard
options, you can provide many command line options to the install
script. Use the `--help` option to see a brief list of available
options:
`$ python setup.py --help`
...... For example, to install everything under your home directory, use this command:
`$ python setup.py install --prefix /home/taoliu/`
As mentioned in *Prerequisites*, you don't need to install Cython in
order to install MACS. When Cython is available, this setup script
will regenerate the C code from the Pyx code when necessary. When
Cython is not available, the setup script will simply use the C code
included in the release package (or your GitHub clone) for
installation.
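A rough sketch of the "Cython optional" behaviour described above -- this is not the actual MACS2 `setup.py`, and the extension name is only an example:

```python
# Minimal sketch: build from .pyx when Cython is installed, otherwise fall back
# to the pre-generated .c files shipped with the source distribution.
from setuptools import setup, Extension

try:
    from Cython.Build import cythonize
    HAS_CYTHON = True
except ImportError:
    HAS_CYTHON = False

suffix = ".pyx" if HAS_CYTHON else ".c"
extensions = [Extension("MACS2.Signal", ["MACS2/Signal" + suffix])]

if HAS_CYTHON:
    # Regenerate the C code from the Cython source before compiling.
    extensions = cythonize(extensions, compiler_directives={"language_level": 3})

setup(name="MACS2-sketch", ext_modules=extensions)
```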
## Configure environment variables
*Note*: if you are using a virtual environment, you can skip this
section, since all the corresponding environment variables are set
correctly when you `activate` the environment.
After running the setup script, you might need to add the install
location to your `PYTHONPATH` and `PATH` environment variables. The
exact procedure varies by platform, but the concept is the same across platforms.
### PYTHONPATH
To set up your `PYTHONPATH` environment variable, you'll need to add
the value `PREFIX/lib/pythonX.Y/site-packages` to your existing
`PYTHONPATH`. In this value, X.Y stands for the major–minor version of
Python you are using (such as 3.7; you can find this with
`sys.version[:3]` from a Python command line). `PREFIX` is the install
prefix where you installed MACS. If you did not specify a prefix on
the command line, MACS will be installed using Python's sys.prefix
value.
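For example, a minimal way to check the `X.Y` value and the default prefix from Python (assuming CPython; `sys.version[:3]` works for single-digit minor versions such as 3.7, while `sys.version_info` is the more robust form):

```python
# Print the pieces used to build the PREFIX/lib/pythonX.Y/site-packages path.
import sys

print(sys.version[:3])                            # e.g. "3.7", as referenced above
print(".".join(map(str, sys.version_info[:2])))   # same value, robust to two-digit minors
print(sys.prefix)                                 # Python's default install prefix mentioned above
```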
On Linux, using bash, I include the new value in my `PYTHONPATH` by
adding this line to my `~/.bashrc`:
`$ export PYTHONPATH=/home/taoliu/lib/python3.7/site-packages:$PYTHONPATH`
On Windows, you need to open the system properties dialog and locate
the tab labeled Environment. Add your value to the `PYTHONPATH`
variable, or create a new `PYTHONPATH` variable if there isn't one
already.
### PATH
Just like your `PYTHONPATH`, you'll also need to add a new value to
your `PATH` environment variable so that you can use the MACS command
line directly. Unlike the `PYTHONPATH` value, however, this time
you'll need to add `PREFIX/bin` to your `PATH` environment
variable. The process for updating this is the same as described above
for the `PYTHONPATH` variable:
`$ export PATH=/home/taoliu/bin:$PATH`
-- Tao Liu <vladimir.liu@gmail.com>
......@@ -53,6 +53,7 @@ MACS2/IO/BedGraphIO.c
MACS2/IO/BedGraphIO.pyx
MACS2/IO/CallPeakUnit.c
MACS2/IO/CallPeakUnit.pyx
MACS2/IO/CallPeakUnitPrecompiled.c
MACS2/IO/FixWidthTrack.c
MACS2/IO/FixWidthTrack.pyx
MACS2/IO/PairedEndTrack.c
......@@ -64,5 +65,4 @@ MACS2/IO/PeakIO.pyx
MACS2/IO/ScoreTrack.c
MACS2/IO/ScoreTrack.pyx
MACS2/IO/__init__.py
MACS2/data/__init__.py
bin/macs2
\ No newline at end of file
MACS_VERSION = "2.1.4"
#MACSDIFF_VERSION = "1.0.4 20110212 (tag:alpha)"
MACS_VERSION = "2.2.4"
FILTERDUP_VERSION = "1.0.0 20140616"
RANDSAMPLE_VERSION = "1.0.0 20120703"
MAX_PAIRNUM = 1000
......
# Time-stamp: <2019-09-20 11:37:21 taoliu>
# cython: language_level=3
# Time-stamp: <2019-10-02 11:05:35 taoliu>
"""Module for BedGraph data class.
......@@ -91,12 +92,12 @@ cdef class bedGraphTrackI:
self.minvalue = 10000000 # initial minimum value is large since I want safe_add_loc to update it
self.baseline_value = baseline_value
def add_a_chromosome ( self, chrom, d ):
def add_a_chromosome ( self, bytes chrom, d ):
"""Unsafe method. Only to be used by cPileup.pyx.
"""
self.__data[chrom] = d
cpdef add_loc ( self, str chromosome, int startpos, int endpos, float value ):
cpdef add_loc ( self, bytes chromosome, int startpos, int endpos, float value ):
"""Add a chr-start-end-value block into __data dictionary.
Difference between safe_add_loc: no check, but faster. Save
......@@ -112,7 +113,7 @@ cdef class bedGraphTrackI:
if startpos < 0:
startpos = 0
if not self.__data.has_key(chromosome):
if chromosome not in self.__data:
self.__data[chromosome] = [array(BYTE4,[]),array(FBYTE4,[])] # for (endpos,value)
c = self.__data[chromosome]
if startpos:
......@@ -147,16 +148,16 @@ cdef class bedGraphTrackI:
"""
cdef:
set chrs
str chromosome
bytes chromosome
chrs = self.get_chr_names()
for chromosome in chrs:
if self.__data.has_key(chromosome):
if chromosome in self.__data:
self.__data[chromosome] = [None, None]
self.__data.pop(chromosome)
return True
def safe_add_loc ( self, str chromosome, int startpos, int endpos, double value):
def safe_add_loc ( self, bytes chromosome, int startpos, int endpos, double value):
"""Add a chr-start-end-value block into __data dictionary.
"""
......@@ -168,7 +169,7 @@ cdef class bedGraphTrackI:
if startpos < 0:
startpos = 0
if not self.__data.has_key(chromosome):
if chromosome not in self.__data:
self.__data[chromosome] = [array(BYTE4,[]),array(FBYTE4,[])] # for (endpos,value)
c = self.__data[chromosome]
if startpos:
......@@ -209,13 +210,13 @@ cdef class bedGraphTrackI:
if value < self.minvalue:
self.minvalue = value
def get_data_by_chr (self, str chromosome):
def get_data_by_chr (self, bytes chromosome):
"""Return array of counts by chromosome.
The return value is a tuple:
([end pos],[value])
"""
if self.__data.has_key(chromosome):
if chromosome in self.__data:
return self.__data[chromosome]
else:
return None
......@@ -238,7 +239,7 @@ cdef class bedGraphTrackI:
cdef:
int pre, pos, i
double value
str chrom
bytes chrom
if trackline:
trackcontents = (name.replace("\"", "\\\""), description.replace("\"", "\\\""))
......@@ -246,8 +247,8 @@ cdef class bedGraphTrackI:
chrs = self.get_chr_names()
for chrom in chrs:
(p,v) = self.__data[chrom]
pnext = iter(p).next
vnext = iter(v).next
pnext = iter(p).__next__
vnext = iter(v).__next__
pre = 0
for i in range(len(p)):
......@@ -255,7 +256,7 @@ cdef class bedGraphTrackI:
value = vnext()
#if value != self.baseline_value:
# never write baseline_value
fhd.write("%s\t%d\t%d\t%.5f\n" % (chrom,pre,pos,value))
fhd.write("%s\t%d\t%d\t%.5f\n" % (chrom.decode(),pre,pos,value))
pre = pos
def reset_baseline (self, double baseline_value):
......@@ -276,13 +277,13 @@ cdef class bedGraphTrackI:
cdef:
int new_pre_pos, pos, i
double new_pre_value, value
str chrom
bytes chrom
chrs = set(self.__data.keys())
for chrom in chrs:
(p,v) = self.__data[chrom]
pnext = iter(p).next
vnext = iter(v).next
pnext = iter(p).__next__
vnext = iter(v).__next__
# new arrays
new_pos = array(BYTE4,[pnext(),])
......@@ -317,13 +318,13 @@ cdef class bedGraphTrackI:
cdef:
int new_pre_pos, pos, i
double new_pre_value, value
str chrom
bytes chrom
chrs = set(self.__data.keys())
for chrom in chrs:
(p,v) = self.__data[chrom]
pnext = iter(p).next
vnext = iter(v).next
pnext = iter(p).__next__
vnext = iter(v).__next__
# new arrays
new_pos = array(BYTE4,[])
......@@ -412,7 +413,7 @@ cdef class bedGraphTrackI:
cdef:
int peak_length, x, pre_p, p, i, summit, tstart, tend
double v, summit_value, tvalue
str chrom
bytes chrom
#if call_summits: close_peak = self.__close_peak2
#else: close_peak = self.__close_peak
......@@ -423,8 +424,8 @@ cdef class bedGraphTrackI:
peak_content = None
peak_length = 0
(ps,vs) = self.get_data_by_chr(chrom) # arrays for position and values
psn = iter(ps).next # assign the next function to a viable to speed up
vsn = iter(vs).next
psn = iter(ps).__next__ # assign the next function to a viable to speed up
vsn = iter(vs).__next__
x = 0
pre_p = 0 # remember previous position
while True:
......@@ -466,7 +467,7 @@ cdef class bedGraphTrackI:
close_peak(peak_content, peaks, min_length, chrom) #, smoothlen=max_gap / 2 )
return peaks
def __close_peak( self, peak_content, peaks, int min_length, str chrom ):
def __close_peak( self, peak_content, peaks, int min_length, bytes chrom ):
peak_length = peak_content[-1][1]-peak_content[0][0]
if peak_length >= min_length: # if the peak is too small, reject it
......@@ -509,7 +510,7 @@ cdef class bedGraphTrackI:
Return both general PeakIO object for highly enriched regions
and gapped broad regions in BroadPeakIO.
"""
cdef str chrom
cdef bytes chrom
cdef int i, j
#cdef int tmp_n
......@@ -524,7 +525,7 @@ cdef class bedGraphTrackI:
#tmp_n = 0
lvl1peakschrom = lvl1_peaks.get_data_from_chrom(chrom)
lvl2peakschrom = lvl2_peaks.get_data_from_chrom(chrom)
lvl1peakschrom_next = iter(lvl1peakschrom).next
lvl1peakschrom_next = iter(lvl1peakschrom).__next__
tmppeakset = [] # to temporarily store lvl1 region inside a lvl2 region
# our assumption is lvl1 regions should be included in lvl2 regions
try:
......@@ -551,7 +552,7 @@ cdef class bedGraphTrackI:
#print len(lvl1peakschrom), len(lvl2peakschrom), tmp_n
return lvl1_peaks, broadpeaks
return broadpeaks
def __add_broadpeak (self, bpeaks, chrom, lvl2peak, lvl1peakset):
"""Internal function to create broad peak.
......@@ -559,40 +560,40 @@ cdef class bedGraphTrackI:
"""
cdef:
long start, end, blockNum
str blockSizes, blockStarts, thickStart, thickEnd
bytes blockSizes, blockStarts, thickStart, thickEnd
start = lvl2peak["start"]
end = lvl2peak["end"]
# the following code will add those broad/lvl2 peaks with no strong/lvl1 peaks inside
if not lvl1peakset:
# try:
# will complement by adding 1bps start and end to this region
# may change in the future if gappedPeak format was improved.
bpeaks.add(chrom, start, end, score=lvl2peak["score"], thickStart=str(start), thickEnd=str(end),
blockNum = 2, blockSizes = "1,1", blockStarts = "0,"+str(end-start-1), pileup = lvl2peak["pileup"],
bpeaks.add(chrom, start, end, score=lvl2peak["score"], thickStart=(b"%d" % start), thickEnd=(b"%d" % end),
blockNum = 2, blockSizes = b"1,1", blockStarts = (b"0,%d" % (end-start-1)), pileup = lvl2peak["pileup"],
pscore = lvl2peak["pscore"], fold_change = lvl2peak["fc"],
qscore = lvl2peak["qscore"] )
return bpeaks
thickStart = str(lvl1peakset[0]["start"])
thickEnd = str(lvl1peakset[-1]["end"])
thickStart = b"%d" % lvl1peakset[0]["start"]
thickEnd = b"%d" % lvl1peakset[-1]["end"]
blockNum = len(lvl1peakset)
blockSizes = ",".join( map(lambda x:str(x["length"]),lvl1peakset) )
blockStarts = ",".join( map(lambda x:str(x["start"]-start),lvl1peakset) )
blockSizes = b",".join( [b"%d" % x["length"] for x in lvl1peakset] )
blockStarts = b",".join( [b"%d" % (x["start"]-start) for x in lvl1peakset] )
# add 1bp left and/or right block if necessary
if int(thickStart) != start:
# add 1bp left block
thickStart = str(start)
thickStart = b"%d" % start
blockNum += 1
blockSizes = "1,"+blockSizes
blockStarts = "0,"+blockStarts
blockSizes = b"1,"+blockSizes
blockStarts = b"0,"+blockStarts
if int(thickEnd) != end:
# add 1bp right block
thickEnd = str(end)
thickEnd = b"%d" % end
blockNum += 1
blockSizes = blockSizes+",1"
blockStarts = blockStarts+","+str(end-start-1)
blockSizes = blockSizes+b",1"
blockStarts = blockStarts + b"," + (b"%d" % (end-start-1))
bpeaks.add(chrom, start, end, score=lvl2peak["score"], thickStart=thickStart, thickEnd=thickEnd,
blockNum = blockNum, blockSizes = blockSizes, blockStarts = blockStarts)
......@@ -615,7 +616,7 @@ cdef class bedGraphTrackI:
"""
cdef:
str chrom
bytes chrom
int max_p
ret = bedGraphTrackI()
......@@ -663,7 +664,7 @@ cdef class bedGraphTrackI:
cdef:
int pre_p, p1, p2
double v1, v2
str chrom
bytes chrom
nr_tracks = len(bdgTracks) + 1 # +1 for self
assert nr_tracks >= 2, "Specify at least two replicates"
......@@ -698,9 +699,9 @@ cdef class bedGraphTrackI:
ps, vs, pn, vn = [], [], [], []
for data in datas:
ps.append(data[0])
pn.append(iter(ps[-1]).next)
pn.append(iter(ps[-1]).__next__)
vs.append(data[1])
vn.append(iter(vs[-1]).next)
vn.append(iter(vs[-1]).__next__)
pre_p = 0 # remember the previous position in the new bedGraphTrackI object ret
try:
......@@ -748,7 +749,7 @@ cdef class bedGraphTrackI:
for other type of scores.
"""
cdef:
str chrom
bytes chrom
object pos_array, pscore_array
dict pvalue_stat = {}
dict pqtable = {}
......@@ -767,14 +768,14 @@ cdef class bedGraphTrackI:
[pos_array, pscore_array] = self.__data[ chrom ]
pn = iter(pos_array).next
vn = iter(pscore_array).next
pn = iter(pos_array).__next__
vn = iter(pscore_array).__next__
for i in range( len( pos_array ) ):
this_p = pn()
this_v = vn()
this_l = this_p - pre_p
if pvalue_stat.has_key( this_v ):
if this_v in pvalue_stat:
pvalue_stat[ this_v ] += this_l
else:
pvalue_stat[ this_v ] = this_l
......@@ -826,7 +827,7 @@ cdef class bedGraphTrackI:
cdef:
int pre_p, p1, p2, i
double v1, v2
str chrom
bytes chrom
assert isinstance(bdgTrack2,bedGraphTrackI), "bdgTrack2 is not a bedGraphTrackI object"
......@@ -844,12 +845,12 @@ cdef class bedGraphTrackI:
for i in range( len( common_chr ) ):
chrom = common_chr.pop()
(p1s,v1s) = self.get_data_by_chr(chrom) # arrays for position and values
p1n = iter(p1s).next # assign the next function to a viable to speed up
v1n = iter(v1s).next
p1n = iter(p1s).__next__ # assign the next function to a viable to speed up
v1n = iter(v1s).__next__
(p2s,v2s) = bdgTrack2.get_data_by_chr(chrom) # arrays for position and values
p2n = iter(p2s).next # assign the next function to a viable to speed up
v2n = iter(v2s).next
p2n = iter(p2s).__next__ # assign the next function to a viable to speed up
v2n = iter(v2s).__next__
pre_p = 0 # remember the previous position in the new bedGraphTrackI object ret
......@@ -914,7 +915,7 @@ cdef class bedGraphTrackI:
cdef:
int pre_p, p1, p2
double v1, v2
str chrom
bytes chrom
assert isinstance(bdgTrack2,bedGraphTrackI), "bdgTrack2 is not a bedGraphTrackI object"
......@@ -927,12 +928,12 @@ cdef class bedGraphTrackI:
for chrom in common_chr:
(p1s,v1s) = self.get_data_by_chr(chrom) # arrays for position and values
p1n = iter(p1s).next # assign the next function to a viable to speed up
v1n = iter(v1s).next
p1n = iter(p1s).__next__ # assign the next function to a viable to speed up
v1n = iter(v1s).__next__
(p2s,v2s) = bdgTrack2.get_data_by_chr(chrom) # arrays for position and values
p2n = iter(p2s).next # assign the next function to a viable to speed up
v2n = iter(v2s).next
p2n = iter(p2s).__next__ # assign the next function to a viable to speed up
v2n = iter(v2s).__next__
chrom_max_len = len(p1s)+len(p2s) # this is the maximum number of locations needed to be recorded in scoreTrackI for this chromosome.
......@@ -982,7 +983,8 @@ cdef class bedGraphTrackI:
cpdef str cutoff_analysis ( self, int max_gap, int min_length, int steps = 100 ):
cdef:
list chrs, tmplist, peak_content
str chrom, ret
bytes chrom
str ret
float cutoff
long total_l, total_p, i, n, ts, te, lastp, tl, peak_length
dict cutoff_npeaks, cutoff_lpeaks
......@@ -1017,8 +1019,8 @@ cdef class bedGraphTrackI:
continue
# first bit of region above cutoff
acs_next = iter(above_cutoff_startpos).next
ace_next = iter(above_cutoff_endpos).next
acs_next = iter(above_cutoff_startpos).__next__
ace_next = iter(above_cutoff_endpos).__next__
ts = acs_next()
te = ace_next()
......@@ -1062,7 +1064,7 @@ def scoreTracktoBedGraph (scoretrack, str colname):
"""
cdef:
int pre, i
str chrom
bytes chrom
bdgtrack = bedGraphTrackI( baseline_value = 0 )
if colname not in ['sample','control','-100logp','-100logq']:
......@@ -1098,7 +1100,7 @@ class bedRegionTrackI (bedGraphTrackI):
self.minvalue = 0
self.baseline_value = 0
def safe_add_loc (self, str chromosome, int startpos, int endpos):
def safe_add_loc (self, bytes chromosome, int startpos, int endpos):
"""Add a chr-start-end-value block into __data dictionary.
"""
......@@ -1114,7 +1116,7 @@ class bedRegionTrackI (bedGraphTrackI):
if startpos < 0:
startpos = 0
if not self.__data.has_key(chromosome):
if chromosome not in self.__data:
self.__data[chromosome] = [array(BYTE4,[]),array(FBYTE4,[])] # for (endpos,value)
c = self.__data[chromosome]
if startpos:
......
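The hunks above are mostly mechanical Python 2 to Python 3 changes. Here is a hedged, self-contained illustration of the recurring patterns (not MACS2 code; the names are invented for the example):

```python
# Recurring Python 3 porting patterns seen in the diff above.
data = {b"chr1": [100, 200, 300]}   # chromosome names become bytes keys in the port

chrom = b"chr1"
if chrom in data:                   # 'in' replaces the removed dict.has_key()
    positions = data[chrom]

pnext = iter(positions).__next__    # .__next__ replaces the Python 2 .next method
first = pnext()

d = 147
five_shift = d // 2                 # '//' keeps integer division (Python 3 '/' is float)

# bytes must be decoded before mixing into str formatting, as in the write() calls above;
# files opened in binary mode ("rb") likewise yield bytes lines.
print("%s\t%d" % (chrom.decode(), first))   # print is a function in Python 3
```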
# Time-stamp: <2019-09-20 11:37:42 taoliu>
# cython: language_level=3
# Time-stamp: <2019-10-02 11:05:42 taoliu>
"""Module Description: IO Module for bedGraph file
......@@ -62,13 +63,13 @@ cdef class bedGraphIO:
If either of the above two criteria is violated, parsing will fail.
"""
cdef:
char * bedGraph_filename
str bedGraph_filename
def __init__ ( self, bedGraph_filename ):
def __init__ ( self, str bedGraph_filename ):
"""f must be a filename or a file handler.
"""
self.bedGraph_filename = <bytes> bedGraph_filename
self.bedGraph_filename = bedGraph_filename
def build_bdgtrack (self, double baseline_value=0):
"""Use this function to return a bedGraphTrackI object.
......@@ -82,19 +83,19 @@ cdef class bedGraphIO:
Then the region chr1:200..250 should be filled with
baseline_value. Default of baseline_value is 0.
"""
cdef str i
cdef bytes i
data = bedGraphTrackI(baseline_value=baseline_value)
add_func = data.add_loc
# python open file
bedGraph_file = open( self.bedGraph_filename, "r" )
bedGraph_file = open( self.bedGraph_filename, "rb" )
for i in bedGraph_file:
if i.startswith("track"):
if i.startswith(b"track"):
continue
elif i.startswith("#"):
elif i.startswith(b"#"):
continue
elif i.startswith("browse"):
elif i.startswith(b"browse"):
continue
else:
fs = i.split()
......@@ -103,39 +104,6 @@ cdef class bedGraphIO:
bedGraph_file.close()
return data
def build_bdgtrack_c (self, double baseline_value=0):
"""Use this function to return a bedGraphTrackI object. This is a C version.
baseline_value is the value to fill in the regions not defined
in bedGraph. For example, if the bedGraph is like:
chr1 100 200 1
chr1 250 350 2
Then the region chr1:200..250 should be filled with
baseline_value. Default of baseline_value is 0.
"""
cdef:
char[80] chrom # will there be chromosome name longer than 80 characters?
int start, end
float value
FILE * bedGraph_file
data = bedGraphTrackI(baseline_value=baseline_value)
add_func = data.add_loc
# C open file
bedGraph_file = fopen( self.bedGraph_filename, "r" )
if bedGraph_file == NULL:
raise Exception("File '%s' can not be opened!" % self.bedGraph_filename )
while ( fscanf( bedGraph_file, "%s\t%d\t%d\t%f\n", chrom, &start, &end, &value ) != EOF ):
add_func( chrom, start, end, value )
fclose( bedGraph_file )
#data.finalize()
return data
cdef class genericBedIO:
"""File Parser Class for generic bed File with at least column #1,#2,#3,and #5.
......@@ -156,8 +124,8 @@ cdef class genericBedIO:
"""
if type(f) == str:
self.fhd = open(f,"r")
elif type(f) == file:
self.fhd = open(f,"rb")
elif type(f) == io.IOBase:
self.fhd = f
else:
raise Exception("f must be a filename or a file handler.")
......
# Time-stamp: <2019-09-20 11:38:45 taoliu>
# cython: language_level=3
# Time-stamp: <2019-10-02 11:06:00 taoliu>
"""Module for FWTrack classes.
......@@ -15,6 +16,7 @@ import logging
#from array import array
#from random import sample as random_sample
import sys
import io
from copy import copy
from collections import Counter
......@@ -98,18 +100,18 @@ cdef class FWTrack:
"""
cdef:
set chrs
str chromosome
bytes chromosome
chrs = set(self.get_chr_names())
for chromosome in chrs:
if self.__locations.has_key(chromosome):
if chromosome in self.__locations:
self.__locations[chromosome][0].resize( self.buffer_size, refcheck=False )
self.__locations[chromosome][0].resize( 0, refcheck=False )
self.__locations[chromosome][1].resize( self.buffer_size, refcheck=False )
self.__locations[chromosome][1].resize( 0, refcheck=False )
self.__locations[chromosome] = [None, None]
self.__locations.pop(chromosome)
if self.__dup_locations.has_key(chromosome):
if chromosome in self.__dup_locations:
self.__dup_locations[chromosome][0].resize( self.buffer_size, refcheck=False )
self.__dup_locations[chromosome][0].resize( 0, refcheck=False )
self.__dup_locations[chromosome][1].resize( self.buffer_size, refcheck=False )
......@@ -120,7 +122,7 @@ cdef class FWTrack:
return True
cpdef add_loc ( self, str chromosome, int32_t fiveendpos, int strand ):
cpdef add_loc ( self, bytes chromosome, int32_t fiveendpos, int strand ):
"""Add a location to the list according to the sequence name.
chromosome -- mostly the chromosome name
......@@ -130,7 +132,7 @@ cdef class FWTrack:
cdef:
long i
if not self.__locations.has_key(chromosome):
if chromosome not in self.__locations:
self.__locations[chromosome] = [ np.zeros(self.buffer_size, dtype='int32'), np.zeros(self.buffer_size, dtype='int32') ] # [plus,minus strand]
self.__pointer[chromosome] = [ 0, 0 ]
self.__locations[chromosome][strand][0] = fiveendpos
......@@ -143,10 +145,10 @@ cdef class FWTrack:
self.__locations[chromosome][strand][i]= fiveendpos
self.__pointer[chromosome][strand] += 1
except:
print "i:",i
print "self.buffer_size:", self.buffer_size
print "strand:", strand
print "loc len:", len( self.__locations[chromosome][strand] )
print ("i:",i)
print ("self.buffer_size:", self.buffer_size)
print ("strand:", strand)
print ("loc len:", len( self.__locations[chromosome][strand] ))
raise Exception("!!")
cdef np.ndarray[np.int32_t, ndim=1] __expand__ ( self, np.ndarray[np.int32_t, ndim=1] arr ):
......@@ -161,7 +163,7 @@ cdef class FWTrack:
cdef:
int32_t i
str c
bytes c
self.total = 0
......@@ -191,7 +193,7 @@ cdef class FWTrack:
"""
cdef:
set valid_chroms, missed_chroms, extra_chroms
str chrom
bytes chrom
valid_chroms = set(self.__locations.keys()).intersection(rlengths.keys())
for chrom in valid_chroms:
......@@ -211,11 +213,11 @@ cdef class FWTrack:
self.rlengths = dict([(k, INT_MAX) for k in self.__locations.keys()])
return self.rlengths
cpdef get_locations_by_chr ( self, str chromosome ):
cpdef get_locations_by_chr ( self, bytes chromosome ):
"""Return a tuple of two lists of locations for certain chromosome.
"""
if self.__locations.has_key(chromosome):
if chromosome in self.__locations:
return self.__locations[chromosome]
else:
raise Exception("No such chromosome name (%s) in TrackI object!\n" % (chromosome))
......@@ -223,7 +225,7 @@ cdef class FWTrack:
cpdef list get_chr_names ( self ):
"""Return all the chromosome names stored in this track object.
"""
l = self.__locations.keys()
l = list(self.__locations.keys())
l.sort()
return l
......@@ -238,7 +240,7 @@ cdef class FWTrack:
"""
cdef:
int32_t i
str c
bytes c
chrnames = self.get_chr_names()
......@@ -258,7 +260,7 @@ cdef class FWTrack:
int p, m, n, current_loc, i_chrom
unsigned long i_old, i_new # index for old array, and index for new one
unsigned long i_dup, size, new_size, dup_size
str k
bytes k
np.ndarray[np.int32_t, ndim=1] plus, new_plus, dup_plus, minus, new_minus, dup_minus
if not self.__sorted:
......@@ -384,7 +386,7 @@ cdef class FWTrack:
int p, m, n, current_loc, i_chrom
unsigned long i_old, i_new # index for old array, and index for new one
unsigned long i_dup, size, new_size, dup_size
str k
bytes k
np.ndarray[np.int32_t, ndim=1] plus, new_plus, dup_plus, minus, new_minus, dup_minus
if not self.__sorted:
......@@ -452,7 +454,7 @@ cdef class FWTrack:
int p, m, n, current_loc, i_chrom
# index for old array, and index for new one
unsigned long i_old, i_new, size, new_size
str k
bytes k
np.ndarray[np.int32_t, ndim=1] plus, new_plus, minus, new_minus
if maxnum < 0: return self.total # do nothing
......@@ -560,7 +562,7 @@ cdef class FWTrack:
"""
cdef:
int32_t num, i_chrom # num: number of reads allowed on a certain chromosome
str key
bytes key
self.total = 0
self.length = 0
......@@ -612,11 +614,11 @@ cdef class FWTrack:
"""
cdef:
int32_t i, i_chrom, p
str k
bytes k
if not fhd:
fhd = sys.stdout
assert isinstance(fhd, file)
assert isinstance(fhd,io.IOBase)
assert self.fw > 0, "FWTrack object .fw should be set larger than 0!"
chrnames = self.get_chr_names()
......@@ -631,16 +633,16 @@ cdef class FWTrack:
for i in range(plus.shape[0]):
p = plus[i]
fhd.write("%s\t%d\t%d\t.\t.\t%s\n" % (k,p,p+self.fw,"+") )
fhd.write("%s\t%d\t%d\t.\t.\t%s\n" % (k.decode(),p,p+self.fw,"+") )
minus = self.__locations[k][1]
for i in range(minus.shape[0]):
p = minus[i]
fhd.write("%s\t%d\t%d\t.\t.\t%s\n" % (k,p-self.fw,p,"-") )
fhd.write("%s\t%d\t%d\t.\t.\t%s\n" % (k.decode(),p-self.fw,p,"-") )
return
cpdef tuple extract_region_tags ( self, str chromosome, int32_t startpos, int32_t endpos ):
cpdef tuple extract_region_tags ( self, bytes chromosome, int32_t startpos, int32_t endpos ):
cdef:
int32_t i, pos
np.ndarray[np.int32_t, ndim=1] rt_plus, rt_minus
......@@ -696,7 +698,7 @@ cdef class FWTrack:
cdef:
int32_t m, i, j, pre_i, pre_j, pos, startpos, endpos
np.ndarray[np.int32_t, ndim=1] plus, minus, rt_plus, rt_minus
str chrom, name
bytes chrom, name
list temp, retval, pchrnames
pchrnames = peaks.get_chr_names()
......@@ -777,7 +779,7 @@ cdef class FWTrack:
cdef:
int32_t m, i, j, pre_i, pre_j, pos, startpos, endpos #, n_peaks
np.ndarray[np.int32_t, ndim=1] plus, minus, rt_plus, rt_minus
str chrom #, peak_name
bytes chrom #, peak_name
list temp, retval, pchrnames, cpeaks
np.ndarray[np.int32_t, ndim=1] adjusted_summits, passflags
......@@ -863,7 +865,7 @@ cdef class FWTrack:
# end of a loop
return ret_peaks
cpdef pileup_a_chromosome ( self, str chrom, list ds, list scale_factor_s, float baseline_value = 0.0, bint directional = True, int end_shift = 0 ):
cpdef pileup_a_chromosome ( self, bytes chrom, list ds, list scale_factor_s, float baseline_value = 0.0, bint directional = True, int end_shift = 0 ):
"""pileup a certain chromosome, return [p,v] (end position and value) list.
ds : tag will be extended to this value to 3' direction,
......@@ -897,8 +899,8 @@ cdef class FWTrack:
three_shift_s.append( end_shift + d)
else:
# both sides
five_shift_s.append( d/2 - end_shift )
three_shift_s.append( end_shift + d - d/2)
five_shift_s.append( d//2 - end_shift )
three_shift_s.append( end_shift + d - d//2)
prev_pileup = None
......@@ -906,13 +908,13 @@ cdef class FWTrack:
five_shift = five_shift_s[i]
three_shift = three_shift_s[i]
scale_factor = scale_factor_s[i]
tmp_pileup = se_all_in_one_pileup ( self.__locations[chrom][0], self.__locations[chrom][1], five_shift, three_shift, rlength, scale_factor, baseline_value )
if prev_pileup:
prev_pileup = max_over_two_pv_array ( prev_pileup, tmp_pileup )
else:
prev_pileup = tmp_pileup
return prev_pileup
cdef inline int32_t left_sum ( data, int pos, int width ):
......
# Time-stamp: <2019-09-20 11:39:05 taoliu>
# cython: language_level=3
# Time-stamp: <2019-10-02 11:06:06 taoliu>
"""Module for filter duplicate tags from paired-end data
......@@ -8,16 +9,19 @@ the distribution).
"""
from MACS2.Constants import *
import io
import sys
from logging import debug, info
import numpy as np
cimport numpy as np
from libc.stdint cimport uint32_t, uint64_t, int32_t, int64_t
from copy import copy
from cpython cimport bool
cimport cython
from libc.stdint cimport uint32_t, uint64_t, int32_t, int64_t
from MACS2.Constants import *
from MACS2.Pileup import quick_pileup, max_over_two_pv_array, se_all_in_one_pileup
import sys
cdef INT_MAX = <int>((<unsigned int>-1)>>1)
......@@ -67,7 +71,7 @@ cdef class PETrackI:
self.length = 0
self.average_template_length = 0.0
cpdef add_loc ( self, str chromosome, int start, int end):
cpdef add_loc ( self, bytes chromosome, int start, int end):
"""Add a location to the list according to the sequence name.
chromosome -- mostly the chromosome name
......@@ -76,7 +80,7 @@ cdef class PETrackI:
cdef:
long i
if not self.__locations.has_key(chromosome):
if chromosome not in self.__locations:
self.__locations[chromosome] = np.zeros(shape=self.buffer_size, dtype=[('l','int32'),('r','int32')]) # note: ['l'] is the leftmost end, ['r'] is the rightmost end of fragment.
self.__locations[chromosome][ 0 ] = ( start, end )
self.__pointer[chromosome] = 1
......@@ -93,16 +97,16 @@ cdef class PETrackI:
"""
cdef:
set chrs
str chromosome
bytes chromosome
chrs = set(self.get_chr_names())
for chromosome in chrs:
if self.__locations.has_key(chromosome):
if chromosome in self.__locations:
self.__locations[chromosome].resize( self.buffer_size, refcheck=False )
self.__locations[chromosome].resize( 0, refcheck=False )
self.__locations[chromosome] = None
self.__locations.pop(chromosome)
if self.__dup_locations.has_key(chromosome):
if chromosome in self.__dup_locations:
self.__dup_locations[chromosome].resize( self.buffer_size, refcheck=False )
self.__dup_locations[chromosome].resize( 0, refcheck=False )
self.__dup_locations[chromosome] = None
......@@ -126,7 +130,7 @@ cdef class PETrackI:
"""
cdef:
set valid_chroms, missed_chroms
str chrom
bytes chrom
valid_chroms = set(self.__locations.keys()).intersection(rlengths.keys())
for chrom in valid_chroms:
......@@ -154,7 +158,7 @@ cdef class PETrackI:
"""
cdef int32_t i
cdef str c
cdef bytes c
self.total = 0
......@@ -170,11 +174,11 @@ cdef class PETrackI:
self.average_template_length = float( self.length ) / self.total
return
def get_locations_by_chr ( self, str chromosome ):
def get_locations_by_chr ( self, bytes chromosome ):
"""Return a tuple of two lists of locations for certain chromosome.
"""
if self.__locations.has_key(chromosome):
if chromosome in self.__locations:
return self.__locations[chromosome]
else:
raise Exception("No such chromosome name (%s) in TrackI object!\n" % (chromosome))
......@@ -182,7 +186,9 @@ cdef class PETrackI:
cpdef list get_chr_names ( self ):
"""Return all the chromosome names stored in this track object.
"""
l = self.__locations.keys()
cdef list l
l = list(self.__locations.keys())
l.sort()
return l
......@@ -203,7 +209,7 @@ cdef class PETrackI:
"""
cdef uint32_t i
cdef str c
cdef bytes c
cdef list chrnames = self.get_chr_names()
for i in range(len(chrnames)):
......@@ -241,7 +247,7 @@ cdef class PETrackI:
np.ndarray[np.float64_t, ndim=1] pmf
np.ndarray[np.int64_t, ndim=1] bins
np.ndarray[np.int32_t, ndim=1] sizes, locs
str c
bytes c
int i, bins_len
int max_bins = 0
list chrnames = self.get_chr_names()
......@@ -275,7 +281,7 @@ cdef class PETrackI:
np.ndarray locs, new_locs, dup_locs
unsigned long i_old, i_new, i_dup, new_size, dup_size
list chrnames = self.get_chr_names()
str k
bytes k
if not self.__sorted: self.sort()
......@@ -362,7 +368,7 @@ cdef class PETrackI:
# np.ndarray[np.int32_t, ndim=1] current_loc #= np.zeros([1,2], np.int32)
int loc_start, loc_end, current_loc_start, current_loc_end
unsigned long i_old, i_new, size, new_size
str k
bytes k
np.ndarray locs, new_locs
if maxnum < 0: return self.total # condition to return if not filtering
......@@ -435,7 +441,7 @@ cdef class PETrackI:
"""
cdef:
uint32_t num, i_chrom # num: number of reads allowed on a certain chromosome
str key
bytes key
self.total = 0
self.length = 0
......@@ -479,11 +485,11 @@ cdef class PETrackI:
"""
cdef int32_t i, i_chrom, s, e
cdef str k
cdef bytes k
if not fhd:
fhd = sys.stdout
assert isinstance(fhd, file)
assert isinstance(fhd, io.IOBase)
chrnames = self.get_chr_names()
......@@ -497,11 +503,11 @@ cdef class PETrackI:
for i in range(locs.shape[0]):
s, e = locs[ i ]
fhd.write("%s\t%d\t%d\n" % (k, s, e))
fhd.write("%s\t%d\t%d\n" % (k.decode(), s, e))
return
cpdef pileup_a_chromosome ( self, str chrom, list scale_factor_s, float baseline_value = 0.0 ):
cpdef pileup_a_chromosome ( self, bytes chrom, list scale_factor_s, float baseline_value = 0.0 ):
"""pileup a certain chromosome, return [p,v] (end position and value) list.
scale_factor_s : linearly scale the pileup value applied to each d in ds. The list should have the same length as ds.
......@@ -528,7 +534,7 @@ cdef class PETrackI:
return prev_pileup
cpdef pileup_a_chromosome_c ( self, str chrom, list ds, list scale_factor_s, float baseline_value = 0.0 ):
cpdef pileup_a_chromosome_c ( self, bytes chrom, list ds, list scale_factor_s, float baseline_value = 0.0 ):
"""pileup a certain chromosome, return [p,v] (end position and value) list.
This function is for control track. Basically, here is a
......@@ -557,8 +563,8 @@ cdef class PETrackI:
for i in range(len(scale_factor_s)):
d = ds[i]
scale_factor = scale_factor_s[i]
five_shift = d/2
three_shift= d/2
five_shift = d//2
three_shift= d//2
tmp_pileup = se_all_in_one_pileup ( self.__locations[chrom]['l'], self.__locations[chrom]['r'], five_shift, three_shift, rlength, scale_factor, baseline_value )
......