Copyright (C) 2014-2017, New York University
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
include README.rst
include LICENSE.txt
graft native
Metadata-Version: 1.2
Name: reprozip
Version: 1.0.14
Summary: Linux tool enabling reproducible experiments (packer)
Home-page: https://www.reprozip.org/
Author: Remi Rampin, Fernando Chirigati, Dennis Shasha, Juliana Freire
Author-email: reprozip-users@vgc.poly.edu
Maintainer: Remi Rampin
Maintainer-email: remirampin@gmail.com
License: BSD-3-Clause
Project-URL: Homepage, https://github.com/ViDA-NYU/reprozip
Project-URL: Documentation, https://docs.reprozip.org/
Project-URL: Examples, https://examples.reprozip.org/
Project-URL: Say Thanks, https://saythanks.io/to/remram44
Project-URL: Source, https://github.com/ViDA-NYU/reprozip
Project-URL: Tracker, https://github.com/ViDA-NYU/reprozip/issues
Description: ReproZip
========
`ReproZip <https://www.reprozip.org/>`__ is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
reprozip
--------
This is the component responsible for the packing step on Linux distributions.
Please refer to `reprounzip <https://pypi.python.org/pypi/reprounzip>`_, `reprounzip-vagrant <https://pypi.python.org/pypi/reprounzip-vagrant>`_, and `reprounzip-docker <https://pypi.python.org/pypi/reprounzip-docker>`_ for other components and plugins.
Additional Information
----------------------
For more detailed information, please refer to our `website <https://www.reprozip.org/>`_, as well as to our `documentation <https://reprozip.readthedocs.io/>`_.
ReproZip is currently being developed at `NYU <http://engineering.nyu.edu/>`_. The team includes:
* `Fernando Chirigati <http://fchirigati.com/>`_
* `Juliana Freire <https://vgc.poly.edu/~juliana/>`_
* `Remi Rampin <https://remirampin.com/>`_
* `Dennis Shasha <http://cs.nyu.edu/shasha/>`_
* `Vicky Steeves <https://vickysteeves.com/>`_
Keywords: reprozip,reprounzip,reproducibility,provenance,vida,nyu
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: C
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: System :: Archiving
ReproZip
========
`ReproZip <https://www.reprozip.org/>`__ is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
reprozip
--------
This is the component responsible for the packing step on Linux distributions.
Please refer to `reprounzip <https://pypi.python.org/pypi/reprounzip>`_, `reprounzip-vagrant <https://pypi.python.org/pypi/reprounzip-vagrant>`_, and `reprounzip-docker <https://pypi.python.org/pypi/reprounzip-docker>`_ for other components and plugins.
Additional Information
----------------------
For more detailed information, please refer to our `website <https://www.reprozip.org/>`_, as well as to our `documentation <https://reprozip.readthedocs.io/>`_.
ReproZip is currently being developed at `NYU <http://engineering.nyu.edu/>`_. The team includes:
* `Fernando Chirigati <http://fchirigati.com/>`_
* `Juliana Freire <https://vgc.poly.edu/~juliana/>`_
* `Remi Rampin <https://remirampin.com/>`_
* `Dennis Shasha <http://cs.nyu.edu/shasha/>`_
* `Vicky Steeves <https://vickysteeves.com/>`_
reprozip (1.0.14-3) UNRELEASED; urgency=medium
* Remove unnecessary X-Python{,3}-Version field in debian/control.
-- Jelmer Vernooij <jelmer@debian.org> Wed, 17 Oct 2018 22:20:49 +0000
reprozip (1.0.14-2) unstable; urgency=medium
* Use any-linux-amd64 any-linux-i386 instead of linux-any since ATM
builds/supported only on amd64 i386 x32 ports
-- Yaroslav Halchenko <debian@onerussian.com> Tue, 04 Sep 2018 12:44:40 -0400
reprozip (1.0.14-1) unstable; urgency=medium
* New upstream version
* debian/control
- adjusted VCS fields to point to salsa
- boosted policy compliance claim to 4.2.1
- replaced extra priority with optional for -dbg pkg
-- Yaroslav Halchenko <debian@onerussian.com> Thu, 30 Aug 2018 09:44:11 -0400
reprozip (1.0.10-1) unstable; urgency=medium
* New upstream version 1.0.10
* Drop the patch queue, applied upstream
* Set architecture to linux-any
* Bump standards version to 4.0.0, no changes required
-- Ghislain Antony Vaillant <ghisvail@gmail.com> Fri, 14 Jul 2017 09:07:59 +0100
reprozip (1.0.9-4) unstable; urgency=medium
* Enable build for x32.
Thanks to James Clarke for reporting (Closes: #862585)
* Fix FTBFS with Python 3.6
- New patch Commit-sqlite3-transactions-explicitly.patch
Thanks to James Clarke for the patch (Closes: #862595)
-- Ghislain Antony Vaillant <ghisvail@gmail.com> Mon, 15 May 2017 09:40:48 +0100
reprozip (1.0.9-3) unstable; urgency=medium
* Fix file conflict issue.
Thanks to Axel Beckert for reporting (Closes: #862542)
-- Ghislain Antony Vaillant <ghisvail@gmail.com> Sun, 14 May 2017 20:40:54 +0100
reprozip (1.0.9-2) unstable; urgency=medium
* Restrict the build to amd64 and i386.
Thanks to Adrian Bunk for reporting (Closes: #862351)
-- Ghislain Antony Vaillant <ghisvail@gmail.com> Fri, 12 May 2017 19:18:41 +0100
reprozip (1.0.9-1) unstable; urgency=low
* Initial release. (Closes: #860531)
-- Ghislain Antony Vaillant <ghisvail@gmail.com> Tue, 02 May 2017 09:02:08 +0100
Source: reprozip
Maintainer: Debian Science Maintainers <debian-science-maintainers@lists.alioth.debian.org>
Uploaders: Ghislain Antony Vaillant <ghisvail@gmail.com>,
Yaroslav Halchenko <debian@onerussian.com>,
Section: science
Priority: optional
Build-Depends: debhelper (>= 10),
dh-python,
libsqlite3-dev,
python3-all-dbg,
python3-all-dev,
python3-requests,
python3-rpaths,
python3-setuptools,
python3-usagestats,
python3-yaml
Standards-Version: 4.2.1
Vcs-Browser: https://salsa.debian.org/science-team/reprozip
Vcs-Git: https://salsa.debian.org/science-team/reprozip.git
Homepage: https://www.reprozip.org
Package: reprozip
Architecture: all
Multi-Arch: foreign
Section: utils
Depends: ${misc:Depends},
${python3:Depends},
python3-reprozip
Description: tool for reproducing scientific experiments (packer)
ReproZip is a tool aimed at simplifying the process of creating
reproducible experiments from command-line executions, a frequently-used
common denominator in computational science.
.
It tracks operating system calls and creates a package that contains
all the binaries, files and dependencies required to run a given
command on the author’s computational environment (packing step). A
reviewer can then extract the experiment in his environment to
reproduce the results (unpacking step).
.
This package provides the ReproZip packer.
Package: python3-reprozip
Architecture: any-linux-amd64 any-linux-i386
Multi-Arch: same
Section: python
Depends: ${misc:Depends},
${python3:Depends},
${shlibs:Depends}
Description: modules for the ReproZip packer
ReproZip is a tool aimed at simplifying the process of creating
reproducible experiments from command-line executions, a frequently-used
common denominator in computational science.
.
It tracks operating system calls and creates a package that contains
all the binaries, files and dependencies required to run a given
command on the author’s computational environment (packing step). A
reviewer can then extract the experiment in his environment to
reproduce the results (unpacking step).
.
This package provides the modules for Python 3.
Package: python3-reprozip-dbg
Architecture: linux-any
Multi-Arch: same
Section: debug
Priority: optional
Depends: ${misc:Depends},
${python3:Depends},
${shlibs:Depends},
python3-reprozip (= ${binary:Version})
Description: debug extensions for the ReproZip packer
ReproZip is a tool aimed at simplifying the process of creating
reproducible experiments from command-line executions, a frequently-used
common denominator in computational science.
.
It tracks operating system calls and creates a package that contains
all the binaries, files and dependencies required to run a given
command on the author’s computational environment (packing step). A
reviewer can then extract the experiment in his environment to
reproduce the results (unpacking step).
.
This package provides the debug extensions for Python 3.
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: reprozip
Source: https://pypi.python.org/pypi/reprozip
Files: *
Copyright: 2014-2017 New York University
License: BSD-3-Clause
Files: debian/*
Copyright: 2017 Ghislain Antony Vaillant
License: BSD-3-Clause
License: BSD-3-Clause
BSD 3-Clause License
.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
.
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
[DEFAULT]
upstream-branch = upstream
debian-branch = master
upstream-tag = upstream/%(version)s
debian-tag = debian/%(version)s
sign-tags = True
pristine-tar = True
# Upstream does not provide manpages for the command-line tools yet.
binary-without-manpage
#!/usr/bin/make -f
# Uncomment this to turn on verbose mode.
#export DH_VERBOSE = 1
export DEB_BUILD_MAINT_OPTIONS = hardening=+all
export PYBUILD_NAME = reprozip
export PYBUILD_TEST_ARGS = {dir}/debian/tests
export PYBUILD_AFTER_INSTALL = rm -rf {destdir}/usr/bin
%:
	dh $@ --with python3 --buildsystem=pybuild
override_dh_auto_install:
	dh_auto_install
	python3 setup.py install_scripts --skip-build \
		--install-dir=debian/$(PYBUILD_NAME)/usr/bin
override_dh_strip:
	dh_strip --package=python3-$(PYBUILD_NAME) \
		--dbg-package=python3-$(PYBUILD_NAME)-dbg
extend-diff-ignore="^[^/]+\.egg-info/"
Test-Command: set -e
; cd "$AUTOPKGTEST_TMP"
; HOME=/tmp
; reprozip usage_report --disable
; reprozip testrun /bin/echo
; reprozip -v trace /bin/echo
; reprozip -v pack echo.rpz
Depends: reprozip
Restrictions: allow-stderr
version=4
opts=uversionmangle=s/(rc|a|b|c)/~$1/ \
https://pypi.debian.net/reprozip/reprozip@ANY_VERSION@@ARCHIVE_EXT@
#ifndef CONFIG_H
#define CONFIG_H
#define WORD_SIZE sizeof(int)
#if !defined(I386) && !defined(X86_64)
# if defined(__x86_64__) || defined(__x86_64)
# define X86_64
# elif defined(__i386__) || defined(__i386) || defined(_M_I86) || defined(_M_IX86)
# define I386
# else
# error Unrecognized architecture!
# endif
#endif
/* Static assertion trick */
#define STATIC_ASSERT(name, condition) \
enum { name = 1/(!!( \
condition \
)) }
STATIC_ASSERT(ASSERT_POINTER_FITS_IN_LONG_INT,
sizeof(long int) >= sizeof(void*));
#endif
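The static assertion trick in config.h works because a false condition turns the enumerator into `1/0`, which is not a valid constant expression, so compilation stops. A minimal sketch of the same macro in isolation follows; the `ASSERT_INT_AT_LEAST_16_BITS` check and the `static_assert_value()` function are hypothetical names added for illustration and are not part of the original header.

```c
#include <assert.h>
#include <limits.h>

/* Same macro as in config.h: a false condition yields 1/0 in a constant
 * expression, which the compiler has to reject. */
#define STATIC_ASSERT(name, condition) \
    enum { name = 1/(!!( \
        condition \
    )) }

/* Hypothetical check (not in the original header): holds on any
 * conforming C implementation, so this file compiles. */
STATIC_ASSERT(ASSERT_INT_AT_LEAST_16_BITS, sizeof(int) * CHAR_BIT >= 16);

/* Returns the enumerator created by the macro; it is 1 whenever the
 * translation unit compiles at all. */
int static_assert_value(void)
{
    /* Run-time counterpart of the same check, shown only for contrast */
    assert(sizeof(int) * CHAR_BIT >= 16);
    return ASSERT_INT_AT_LEAST_16_BITS;
}
```

Changing the condition to something false, such as `sizeof(int) == 0`, should make the compiler report a division by zero at the `STATIC_ASSERT` line instead of producing a binary.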
#ifndef DATABASE_H
#define DATABASE_H
#define FILE_READ 0x01
#define FILE_WRITE 0x02
#define FILE_WDIR 0x04 /* File is used as a process's working dir */
#define FILE_STAT 0x08 /* File is stat()d (only metadata is read) */
#define FILE_LINK 0x10 /* The link itself is accessed, no dereference */
int db_init(const char *filename);
int db_close(int rollback);
int db_add_process(unsigned int *id, unsigned int parent_id,
const char *working_dir, int is_thread);
int db_add_exit(unsigned int id, int exitcode);
int db_add_first_process(unsigned int *id, const char *working_dir);
int db_add_file_open(unsigned int process,
const char *name, unsigned int mode,
int is_dir);
int db_add_exec(unsigned int process, const char *binary,
const char *const *argv, const char *const *envp,
const char *workingdir);
#endif
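The `FILE_*` constants in database.h are distinct bits, so one `mode` value passed to `db_add_file_open()` can record several kinds of access at once. The sketch below decodes such a value into text; `describe_mode()` and the short names it prints are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Flag values copied from database.h */
#define FILE_READ   0x01
#define FILE_WRITE  0x02
#define FILE_WDIR   0x04
#define FILE_STAT   0x08
#define FILE_LINK   0x10

/* Hypothetical helper: writes a comma-separated list of the flags set in
 * mode into buf (bufsize bytes) and returns buf. */
char *describe_mode(unsigned int mode, char *buf, size_t bufsize)
{
    static const struct { unsigned int bit; const char *name; } names[] = {
        { FILE_READ,  "read"  },
        { FILE_WRITE, "write" },
        { FILE_WDIR,  "wdir"  },
        { FILE_STAT,  "stat"  },
        { FILE_LINK,  "link"  }
    };
    size_t i;

    assert(buf != NULL && bufsize > 0);
    buf[0] = '\0';
    for(i = 0; i < sizeof(names) / sizeof(names[0]); ++i)
    {
        if(mode & names[i].bit)
        {
            size_t used = strlen(buf);
            size_t need = strlen(names[i].name) + (used ? 1 : 0);
            if(used + need + 1 > bufsize)
                break;  /* Not enough room left: stop appending */
            if(used)
                strcat(buf, ",");
            strcat(buf, names[i].name);
        }
    }
    return buf;
}
```

For instance, `describe_mode(FILE_READ | FILE_WRITE, buf, sizeof(buf))` fills `buf` with `read,write`.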
#ifndef LOG_H
#define LOG_H
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <sys/types.h>
extern int logging_level;
int log_setup(void);
void log_real_(pid_t tid, int lvl, const char *format, ...);
#ifdef __GNUC__
#define log_critical(i, s, ...) log_real_(i, 50, s, ## __VA_ARGS__)
#define log_error(i, s, ...) log_real_(i, 40, s, ## __VA_ARGS__)
#define log_warn(i, s, ...) log_real_(i, 30, s, ## __VA_ARGS__)
#define log_info(i, s, ...) log_real_(i, 20, s, ## __VA_ARGS__)
#define log_debug(i, s, ...) log_real_(i, 10, s, ## __VA_ARGS__)
#else
#define log_critical(i, s, ...) log_real_(i, 50, s, __VA_ARGS__)
#define log_error(i, s, ...) log_real_(i, 40, s, __VA_ARGS__)
#define log_warn(i, s, ...) log_real_(i, 30, s, __VA_ARGS__)
#define log_info(i, s, ...) log_real_(i, 20, s, __VA_ARGS__)
#define log_debug(i, s, ...) log_real_(i, 10, s, __VA_ARGS__)
#endif
#endif
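The numeric levels in log.h (50, 40, 30, 20, 10) match the values Python's `logging` module uses for CRITICAL, ERROR, WARNING, INFO and DEBUG. The sketch below shows the level filter and the `[tid]` prefix without Python. It is not the original implementation: the `last_message` sink, the `int` return value and `log_last_message()` are assumptions made for illustration. It also folds the format string into `__VA_ARGS__`, an alternative to the header's `##` form that accepts calls with no extra arguments in standard C99.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

int logging_level = 30;          /* Assumed threshold: WARNING */
static char last_message[256];   /* Hypothetical sink replacing Python */

/* Formats the message into last_message.
 * Returns 1 if the message was logged, 0 if the level filtered it out. */
int log_real_(pid_t tid, int lvl, const char *format, ...)
{
    va_list args;
    size_t off;

    assert(format != NULL);
    if(lvl < logging_level)
        return 0;

    last_message[0] = '\0';
    if(tid > 0)
        snprintf(last_message, sizeof(last_message), "[%d] ", (int)tid);
    off = strlen(last_message);

    va_start(args, format);
    vsnprintf(last_message + off, sizeof(last_message) - off, format, args);
    va_end(args);
    return 1;
}

/* The format string travels inside __VA_ARGS__, so no GNU extension is
 * needed when a call has no further arguments. */
#define log_error(i, ...) log_real_(i, 40, __VA_ARGS__)
#define log_debug(i, ...) log_real_(i, 10, __VA_ARGS__)

/* Accessor for the hypothetical sink */
const char *log_last_message(void)
{
    return last_message;
}
```

With the threshold at 30, `log_debug(0, "hidden")` is dropped, while `log_error(42, "open failed: %s", "ENOENT")` stores `[42] open failed: ENOENT`.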
#include <errno.h>
#include <inttypes.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <unistd.h>
#include "config.h"
#include "log.h"
#include "ptrace_utils.h"
#include "tracer.h"
static long tracee_getword(pid_t tid, const void *addr)
{
long res;
errno = 0;
res = ptrace(PTRACE_PEEKDATA, tid, addr, NULL);
if(errno)
{
/* LCOV_EXCL_START : We only do that on things that went through the
* kernel successfully, and so should be valid. The exception is
* execve(), which will dup arguments when entering the syscall */
log_error(tid, "tracee_getword() failed: %s", strerror(errno));
return 0;
/* LCOV_EXCL_END */
}
return res;
}
void *tracee_getptr(int mode, pid_t tid, const void *addr)
{
if(mode == MODE_I386)
{
/* Pointers are 32 bits */
uint32_t ptr;
tracee_read(tid, (void*)&ptr, addr, sizeof(ptr));
return (void*)(uint64_t)ptr;
}
else /* mode == MODE_X86_64 */
{
/* Pointers are 64 bits */
uint64_t ptr;
tracee_read(tid, (void*)&ptr, addr, sizeof(ptr));
return (void*)ptr;
}
}
uint64_t tracee_getlong(int mode, pid_t tid, const void *addr)
{
if(mode == MODE_I386)
{
/* Longs are 32 bits */
uint32_t val;
tracee_read(tid, (void*)&val, addr, sizeof(val));
return (uint64_t)val;
}
else /* mode == MODE_X86_64 */
{
/* Longs are 64 bits */
uint64_t val;
tracee_read(tid, (void*)&val, addr, sizeof(val));
return val;
}
}
size_t tracee_getwordsize(int mode)
{
if(mode == MODE_I386)
/* Pointers are 32 bits */
return 4;
else /* mode == MODE_X86_64 */
/* Pointers are 64 bits */
return 8;
}
size_t tracee_strlen(pid_t tid, const char *str)
{
uintptr_t ptr = (uintptr_t)str;
size_t j = ptr % WORD_SIZE;
uintptr_t i = ptr - j;
size_t size = 0;
int done = 0;
for(; !done; i += WORD_SIZE)
{
unsigned long data = tracee_getword(tid, (const void*)i);
for(; !done && j < WORD_SIZE; ++j)
{
unsigned char byte = data >> (8 * j);
if(byte == 0)
done = 1;
else
++size;
}
j = 0;
}
return size;
}
void tracee_read(pid_t tid, char *dst, const char *src, size_t size)
{
uintptr_t ptr = (uintptr_t)src;
size_t j = ptr % WORD_SIZE;
uintptr_t i = ptr - j;
uintptr_t end = ptr + size;
for(; i < end; i += WORD_SIZE)
{
unsigned long data = tracee_getword(tid, (const void*)i);
for(; j < WORD_SIZE && i + j < end; ++j)
*dst++ = data >> (8 * j);
j = 0;
}
}
char *tracee_strdup(pid_t tid, const char *str)
{
size_t length = tracee_strlen(tid, str);
char *res = malloc(length + 1);
tracee_read(tid, res, str, length);
res[length] = '\0';
return res;
}
char **tracee_strarraydup(int mode, pid_t tid, const char *const *argv)
{
/* FIXME : This is probably broken on x32 */
char **array;
/* Reads number of pointers in pointer array */
size_t nb_args = 0;
{
const char *const *a = argv;
/* xargv = *a */
const char *xargv = tracee_getptr(mode, tid, a);
while(xargv != NULL)
{
++nb_args;
++a;
xargv = tracee_getptr(mode, tid, a);
}
}
/* Allocs pointer array */
array = malloc((nb_args + 1) * sizeof(char*));
/* Dups array elements */
{
size_t i = 0;
/* xargv = argv[0] */
const char *xargv = tracee_getptr(mode, tid, argv);
while(xargv != NULL)
{
array[i] = tracee_strdup(tid, xargv);
++i;
/* xargv = argv[i] */
xargv = tracee_getptr(mode, tid, argv + i);
}
array[i] = NULL;
}
return array;
}
void free_strarray(char **array)
{
char **ptr = array;
while(*ptr)
{
free(*ptr);
++ptr;
}
free(array);
}
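`tracee_read()` and `tracee_strlen()` fetch aligned words and then pick out individual bytes by shifting, which relies on the little-endian byte order of the x86 targets this code supports. The sketch below runs the same loops against an ordinary local buffer so that it works without `ptrace`. `fake_memory`, `fake_getword()`, `fake_read()` and `fake_strlen()` are stand-ins invented for this example, and addresses are plain offsets into the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_SIZE sizeof(uint32_t)

/* Hypothetical stand-in for the tracee's memory: 15 characters plus NUL */
static const unsigned char fake_memory[16] = "..hello, world!";

/* Stand-in for tracee_getword(): returns the word stored at the aligned
 * offset addr, assembled in little-endian order. */
static uint32_t fake_getword(size_t addr)
{
    uint32_t word = 0;
    size_t k;
    assert(addr % WORD_SIZE == 0 && addr + WORD_SIZE <= sizeof(fake_memory));
    for(k = 0; k < WORD_SIZE; ++k)
        word |= (uint32_t)fake_memory[addr + k] << (8 * k);
    return word;
}

/* Same structure as tracee_read(): start at the aligned word containing
 * src, skip its first j bytes, then copy byte by byte up to end. */
void fake_read(char *dst, size_t src, size_t size)
{
    size_t j = src % WORD_SIZE;
    size_t i = src - j;
    size_t end = src + size;
    for(; i < end; i += WORD_SIZE)
    {
        uint32_t data = fake_getword(i);
        for(; j < WORD_SIZE && i + j < end; ++j)
            *dst++ = (char)(unsigned char)(data >> (8 * j));
        j = 0;
    }
}

/* Same structure as tracee_strlen(): scan bytes until the first NUL */
size_t fake_strlen(size_t str)
{
    size_t j = str % WORD_SIZE;
    size_t i = str - j;
    size_t size = 0;
    int done = 0;
    for(; !done; i += WORD_SIZE)
    {
        uint32_t data = fake_getword(i);
        for(; !done && j < WORD_SIZE; ++j)
        {
            if(((data >> (8 * j)) & 0xFF) == 0)
                done = 1;
            else
                ++size;
        }
        j = 0;
    }
    return size;
}
```

Reading 5 bytes from offset 2 yields `hello` even though the offset is not word-aligned, because the first word is fetched from offset 0 and its first two bytes are skipped.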
#ifndef PTRACE_UTILS_H
#define PTRACE_UTILS_H
void *tracee_getptr(int mode, pid_t tid, const void *addr);
uint64_t tracee_getlong(int mode, pid_t tid, const void *addr);
size_t tracee_getwordsize(int mode);
size_t tracee_strlen(pid_t tid, const char *str);
void tracee_read(pid_t tid, char *dst, const char *src, size_t size);
char *tracee_strdup(pid_t tid, const char *str);
char **tracee_strarraydup(int mode, pid_t tid, const char *const *argv);
void free_strarray(char **array);
#endif
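`tracee_strarraydup()` returns a NULL-terminated array of heap-allocated strings, and `free_strarray()` is its matching release function. The ownership pattern can be sketched with a local array instead of tracee memory. `strarraydup()` below is a hypothetical local counterpart, not a function from this header, and it adds allocation checks that the original does not perform.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical local counterpart of tracee_strarraydup(): deep-copies a
 * NULL-terminated array of strings. Returns NULL if allocation fails. */
char **strarraydup(const char *const *argv)
{
    size_t nb_args = 0, i;
    char **array;

    assert(argv != NULL);
    /* First pass: count the entries before the terminating NULL */
    while(argv[nb_args] != NULL)
        ++nb_args;

    array = malloc((nb_args + 1) * sizeof(char*));
    if(array == NULL)
        return NULL;

    /* Second pass: duplicate each string */
    for(i = 0; i < nb_args; ++i)
    {
        size_t len = strlen(argv[i]);
        array[i] = malloc(len + 1);
        if(array[i] == NULL)
        {
            while(i > 0)
                free(array[--i]);
            free(array);
            return NULL;
        }
        memcpy(array[i], argv[i], len + 1);
    }
    array[nb_args] = NULL;
    return array;
}

/* Same logic as free_strarray() in ptrace_utils.c */
void free_strarray(char **array)
{
    char **ptr = array;
    while(*ptr)
    {
        free(*ptr);
        ++ptr;
    }
    free(array);
}
```

The caller owns both the array and every string in it, so a single `free_strarray()` call releases everything.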
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <Python.h>
#include "log.h"
static PyObject *py_logger = NULL;
static PyObject *py_logger_log = NULL;
int logging_level = 0;
int log_setup()
{
if(py_logger == NULL)
{
// Import Python's logging module
PyObject *logging = PyImport_ImportModuleEx("logging",
NULL, NULL, NULL);
if(logging == NULL)
return -1;
// Get the logger
{
PyObject *func = PyObject_GetAttrString(logging, "getLogger");
Py_DECREF(logging);
/* Py_DECREF() must not be given NULL, so check before using func */
if(func == NULL)
return -1;
py_logger = PyObject_CallFunction(func, "(s)", "reprozip");
Py_DECREF(func);
if(py_logger == NULL)
return -1;
}
// Get the log function
py_logger_log = PyObject_GetAttrString(py_logger, "log");
if(py_logger_log == NULL)
{
Py_DECREF(py_logger);
py_logger = NULL;
return -1;
}
}
// Get the effective logging level
{
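PyObject *meth = PyObject_GetAttrString(py_logger,
"getEffectiveLevel");
PyObject *level;
/* Py_DECREF() must not be given NULL, so check before using meth */
if(meth == NULL)
return -1;
level = PyObject_CallFunctionObjArgs(meth, NULL);
Py_DECREF(meth);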
if(level == NULL)
return -1;
logging_level = PyLong_AsLong(level);
if(PyErr_Occurred())
{
Py_DECREF(level);
return -1;
}
Py_DECREF(level);
}
return 0;
}
void log_real_(pid_t tid, int lvl, const char *format, ...)
{
va_list args;
char datestr[13]; /* HH:MM:SS.mmm */
static char *buffer = NULL;
static size_t bufsize = 4096;
size_t length;
/* Fast filter: don't call Python if level is not enough */
if(lvl < logging_level)
return;
if(buffer == NULL)
buffer = malloc(bufsize);
{
struct timeval tv;
gettimeofday(&tv, NULL);
strftime(datestr, 13, "%H:%M:%S", localtime(&tv.tv_sec));
sprintf(datestr+8, ".%03u", (unsigned int)(tv.tv_usec / 1000));
}
va_start(args, format);
length = (size_t)vsnprintf(buffer, bufsize, format, args);
va_end(args);
if(length + 1 >= bufsize)
{
while(length + 1 >= bufsize)
bufsize *= 2;
free(buffer);
buffer = malloc(bufsize);
va_start(args, format);
length = vsnprintf(buffer, bufsize, format, args);
va_end(args);
}
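{
/* The "l" format unit reads a C long, so int and pid_t arguments are
 * cast explicitly; the call result is released to avoid a leak */
PyObject *res;
if(tid > 0)
res = PyObject_CallFunction(py_logger_log, "(l, s, l, s)",
(long)lvl, "[%d] %s", (long)tid, buffer);
else
res = PyObject_CallFunction(py_logger_log, "(l, s, s)",
(long)lvl, "%s", buffer);
Py_XDECREF(res);
}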
}
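`log_real_()` formats into a buffer with `vsnprintf()`, and when the result does not fit it enlarges the buffer and formats again, restarting the argument list with a second `va_start()`. A minimal sketch of that retry pattern follows. `format_alloc()` and `format_alloc_demo()` are hypothetical names, and unlike the original this version returns a fresh heap string, sizes the retry from the length `vsnprintf()` reports, and checks the allocations.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc()ed formatted string, or NULL on
 * error. The caller must free() the result. */
char *format_alloc(const char *format, ...)
{
    va_list args;
    size_t bufsize = 16;   /* Deliberately small to exercise the retry */
    char *buffer;
    int length;

    assert(format != NULL);
    buffer = malloc(bufsize);
    if(buffer == NULL)
        return NULL;

    /* First attempt: vsnprintf() reports the length it needed */
    va_start(args, format);
    length = vsnprintf(buffer, bufsize, format, args);
    va_end(args);
    if(length < 0)
    {
        free(buffer);
        return NULL;
    }

    if((size_t)length >= bufsize)
    {
        /* Output was truncated: allocate the exact size and format again.
         * A va_list cannot be reused after vsnprintf(), hence the second
         * va_start(). */
        bufsize = (size_t)length + 1;
        free(buffer);
        buffer = malloc(bufsize);
        if(buffer == NULL)
            return NULL;
        va_start(args, format);
        vsnprintf(buffer, bufsize, format, args);
        va_end(args);
    }
    return buffer;
}

/* Usage sketch: returns 1 if a message longer than the initial buffer
 * comes back intact. The path is a made-up example. */
int format_alloc_demo(void)
{
    char *s = format_alloc("tracee %d opened %s", 1234,
                           "/usr/lib/libexample.so");
    int ok = s != NULL
          && strcmp(s, "tracee 1234 opened /usr/lib/libexample.so") == 0;
    free(s);
    return ok;
}
```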
#include <Python.h>
#include "database.h"
#include "log.h"
#include "tracer.h"
PyObject *Err_Base;
/**
* Makes a C string from a Python unicode or bytes object.
*
* If successful, the result is a string that the caller must free().
* Else, returns NULL.
*/
static char *get_string(PyObject *obj)
{
if(PyUnicode_Check(obj))
{
const char *str;
PyObject *pyutf8 = PyUnicode_AsUTF8String(obj);
if(pyutf8 == NULL)
return NULL;
#if PY_MAJOR_VERSION >= 3
str = PyBytes_AsString(pyutf8);
#else
str = PyString_AsString(pyutf8);
#endif
if(str == NULL)
{
/* Release the temporary bytes object on the error path as well */
Py_DECREF(pyutf8);
return NULL;
}
{
char *ret = strdup(str);
Py_DECREF(pyutf8);
return ret;
}
}
else if(
#if PY_MAJOR_VERSION >= 3
PyBytes_Check(obj)
#else
PyString_Check(obj)
#endif
)
{
const char *str;
#if PY_MAJOR_VERSION >= 3
str = PyBytes_AsString(obj);
#else
str = PyString_AsString(obj);
#endif
if(str == NULL)
return NULL;
return strdup(str);
}
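else
{
/* Set an exception so that callers returning NULL to Python do not
 * trigger a SystemError */
PyErr_SetString(PyExc_TypeError, "expected a str or bytes object");
return NULL;
}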
}
static PyObject *pytracer_execute(PyObject *self, PyObject *args)
{
PyObject *ret = NULL;
int exit_status;
char *binary = NULL, *databasepath = NULL;
char **argv = NULL;
size_t argv_len;
PyObject *py_binary, *py_argv, *py_databasepath;
if(log_setup() != 0)
{
PyErr_SetString(Err_Base, "Error occurred");
return NULL;
}
/* Reads arguments */
if(!PyArg_ParseTuple(args, "OO!O",
&py_binary,
&PyList_Type, &py_argv,
&py_databasepath))
return NULL;
binary = get_string(py_binary);
if(binary == NULL)
goto done;
databasepath = get_string(py_databasepath);
if(databasepath == NULL)
goto done;
/* Converts argv from Python list to char[][] */
{
size_t i;
int bad = 0;
argv_len = PyList_Size(py_argv);
argv = malloc((argv_len + 1) * sizeof(char*));
for(i = 0; i < argv_len; ++i)
{
PyObject *arg = PyList_GetItem(py_argv, i);
char *str = get_string(arg);
if(str == NULL)
{
bad = 1;
break;
}
argv[i] = str;
}
if(bad)
{
size_t j;
for(j = 0; j < i; ++j)
free(argv[j]);
free(argv);
argv = NULL;
goto done;
}
argv[argv_len] = NULL;
}
if(fork_and_trace(binary, argv_len, argv, databasepath, &exit_status) == 0)
{
ret = PyLong_FromLong(exit_status);
}
else
{
PyErr_SetString(Err_Base, "Error occurred");
ret = NULL;
}
done:
free(binary);
free(databasepath);
/* Deallocs argv */
if(argv)
{
size_t i;
for(i = 0; i < argv_len; ++i)
free(argv[i]);
free(argv);
}
return ret;
}
static PyMethodDef methods[] = {
{"execute", pytracer_execute, METH_VARARGS,
"execute(binary, argv, databasepath)\n"
"\n"
"Runs the specified binary with the argument list argv under trace and "
"writes\nthe captured events to SQLite3 database databasepath."},
{ NULL, NULL, 0, NULL }
};
#if PY_MAJOR_VERSION >= 3
static struct PyModuleDef moduledef = {
PyModuleDef_HEAD_INIT,
"reprozip._pytracer", /* m_name */
"C interface to tracer", /* m_doc */
-1, /* m_size */
methods, /* m_methods */
NULL, /* m_reload */
NULL, /* m_traverse */
NULL, /* m_clear */
NULL, /* m_free */
};
#endif
#if PY_MAJOR_VERSION >= 3
PyMODINIT_FUNC PyInit__pytracer(void)
#else
PyMODINIT_FUNC init_pytracer(void)
#endif
{
PyObject *mod;
#if PY_MAJOR_VERSION >= 3
mod = PyModule_Create(&moduledef);
#else
mod = Py_InitModule("reprozip._pytracer", methods);
#endif
if(mod == NULL)
{
#if PY_MAJOR_VERSION >= 3
return NULL;
#else
return;
#endif
}
Err_Base = PyErr_NewException("_pytracer.Error", NULL, NULL);
Py_INCREF(Err_Base);
PyModule_AddObject(mod, "Error", Err_Base);
#if PY_MAJOR_VERSION >= 3
return mod;
#endif
}
#ifndef SYSCALL_H
#define SYSCALL_H
#include "tracer.h"
void syscall_build_table(void);
int syscall_handle(struct Process *process);
int syscall_execve_event(struct Process *process);
int syscall_fork_event(struct Process *process, unsigned int event);
#endif
#ifndef TRACER_H
#define TRACER_H
#include "config.h"
int fork_and_trace(const char *binary, int argc, char **argv,
const char *database_path, int *exit_status);
/* This is NOT a union because sign-extension rules depend on actual register
* sizes. */
typedef struct S_register_type {
signed long int i;
unsigned long int u;
void *p;
} register_type;
#define PROCESS_ARGS 6
struct ExecveInfo {
char *binary;
char **argv;
char **envp;
};
void free_execve_info(struct ExecveInfo *execi);
struct ThreadGroup {
pid_t tgid;
char *wd;
unsigned int refs;
};
struct Process {
unsigned int identifier;
unsigned int mode;
struct ThreadGroup *threadgroup;
pid_t tid;
int status;
unsigned int flags;
int in_syscall;
int current_syscall;
register_type retvalue;
register_type params[PROCESS_ARGS];
struct ExecveInfo *execve_info;
};
#define PROCSTAT_FREE 0 /* unallocated entry in table */
#define PROCSTAT_ALLOCATED 1 /* fork() done but not yet attached */
#define PROCSTAT_ATTACHED 2 /* running process */
#define PROCSTAT_UNKNOWN 3 /* attached but no corresponding fork() call
* has finished yet */
#define MODE_I386 1
#define MODE_X86_64 2 /* In x86_64 mode, syscalls might be native x64
* or x32 */
#define PROCFLAG_EXECD 1 /* Process is coming out of execve */
#define PROCFLAG_FORKING 2 /* Process is spawning another with
* fork/vfork/clone */
#define PROCFLAG_OPEN_EXIST 4 /* Process is opening a file that exists */
/* FIXME : This is only exposed because of execve() workaround */
extern struct Process **processes;
extern size_t processes_size;
struct Process *trace_find_process(pid_t tid);
struct Process *trace_get_empty_process(void);
struct ThreadGroup *trace_new_threadgroup(pid_t tgid, char *wd);
void trace_free_process(struct Process *process);
void trace_count_processes(unsigned int *p_nproc, unsigned int *p_unknown);
int trace_add_files_from_proc(unsigned int process, pid_t tid,
const char *binary);
#endif
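The comment on `register_type` says it is a struct rather than a union because sign extension depends on the real register size. The sketch below illustrates this for a 32-bit register value stored in the wider fields. `reg_from_32()` is a hypothetical helper; the way the tracer itself fills these fields from register data is not shown in this header and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as in tracer.h */
typedef struct S_register_type {
    signed long int i;
    unsigned long int u;
    void *p;
} register_type;

/* Hypothetical helper: builds the three views of a 32-bit register.
 * Each field needs its own conversion, which a union could not express. */
register_type reg_from_32(uint32_t raw)
{
    register_type r;
    /* Same requirement as the static assertion in config.h */
    assert(sizeof(long int) >= sizeof(void*));
    r.i = (int32_t)raw;            /* Sign-extended to long */
    r.u = raw;                     /* Zero-extended to unsigned long */
    r.p = (void*)(uintptr_t)raw;   /* Zero-extended to pointer width */
    return r;
}
```

For the raw value `0xFFFFFFFF`, `i` becomes -1 while `u` stays 4294967295. Where `long` is 64 bits, those two fields hold different bit patterns, so a union could not hold both.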
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include "config.h"
#include "database.h"
#include "log.h"
unsigned int flags2mode(int flags)
{
unsigned int mode = 0;
if(!O_RDONLY)
{
if(flags & O_WRONLY)
mode |= FILE_WRITE;
else if(flags & O_RDWR)
mode |= FILE_READ | FILE_WRITE;
else
mode |= FILE_READ;
}
else if(!O_WRONLY)
{
if(flags & O_RDONLY)
mode |= FILE_READ;
else if(flags & O_RDWR)
mode |= FILE_READ | FILE_WRITE;
else
mode |= FILE_WRITE;
}
else
{
if( (flags & (O_RDONLY | O_WRONLY)) == (O_RDONLY | O_WRONLY) )
log_error(0, "encountered bogus open() flags O_RDONLY|O_WRONLY");
/* Carry on anyway */
if(flags & O_RDONLY)
mode |= FILE_READ;
if(flags & O_WRONLY)
mode |= FILE_WRITE;
if(flags & O_RDWR)
mode |= FILE_READ | FILE_WRITE;
if( (mode & FILE_READ) && (mode & FILE_WRITE) && (flags & O_TRUNC) )
/* If O_TRUNC is set, consider this a write */
mode &= ~FILE_READ;
}
return mode;
}
char *abspath(const char *wd, const char *path)
{
size_t len_wd = strlen(wd);
if(wd[len_wd-1] == '/')
{
/* LCOV_EXCL_START : We usually get canonical path names, so we don't
* run into this one */
char *result = malloc(len_wd + strlen(path) + 1);
memcpy(result, wd, len_wd);
strcpy(result + len_wd, path);
return result;
/* LCOV_EXCL_END */
}
else
{
char *result = malloc(len_wd + 1 + strlen(path) + 1);
memcpy(result, wd, len_wd);
result[len_wd] = '/';
strcpy(result + len_wd + 1, path);
return result;
}
}
char *get_wd(void)
{
/* PATH_MAX has issues, don't use it */
size_t size = 1024;
char *path;
for(;;)
{
path = malloc(size);
if(getcwd(path, size) != NULL)
return path;
else
{
if(errno != ERANGE)
{
/* LCOV_EXCL_START : getcwd() really shouldn't fail */
free(path);
log_error(0, "getcwd failed: %s", strerror(errno));
return strdup("/UNKNOWN");
/* LCOV_EXCL_END */
}
free(path);
size <<= 1;
}
}
}
char *read_line(char *buffer, size_t *size, FILE *fp)
{
size_t pos = 0;
if(buffer == NULL)
{
*size = 4096;
buffer = malloc(*size);
}
for(;;)
{
char c;
{
int t = getc(fp);
if(t == EOF)
{
free(buffer);
return NULL;
}
c = t;
}
if(c == '\n')
{
buffer[pos] = '\0';
return buffer;
}
else
{
if(pos + 1 >= *size)
{
*size <<= 2;
buffer = realloc(buffer, *size);
}
buffer[pos++] = c;
}
}
}
int path_is_dir(const char *pathname)
{
struct stat buf;
if(lstat(pathname, &buf) != 0)
{
/* LCOV_EXCL_START : shouldn't happen because a tracer process just
* accessed it */
log_error(0, "error stat()ing %s: %s", pathname, strerror(errno));
/* LCOV_EXCL_END */
return 0;
}
return S_ISDIR(buf.st_mode)?1:0;
}
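On Linux `O_RDONLY` is 0, so it cannot be detected with a bitwise AND, and `flags2mode()` handles that case in its `!O_RDONLY` branch. POSIX also defines the `O_ACCMODE` mask for extracting the access mode from open flags. The sketch below shows that alternative. `access_mode()` and `access_mode_demo()` are hypothetical, are not the original function, and ignore `O_TRUNC`.

```c
#include <assert.h>
#include <fcntl.h>

/* Flag values copied from database.h */
#define FILE_READ   0x01
#define FILE_WRITE  0x02

/* Hypothetical alternative to flags2mode(): masks out everything but the
 * access mode, then compares against the three POSIX values. */
unsigned int access_mode(int flags)
{
    switch(flags & O_ACCMODE)
    {
    case O_RDONLY:
        return FILE_READ;
    case O_WRONLY:
        return FILE_WRITE;
    case O_RDWR:
        return FILE_READ | FILE_WRITE;
    default:
        return 0;   /* Unrecognized access mode */
    }
}

/* Usage sketch: other flags do not affect the decoded access mode */
void access_mode_demo(void)
{
    assert(access_mode(O_RDONLY | O_APPEND) == FILE_READ);
    assert(access_mode(O_WRONLY | O_CREAT) == FILE_WRITE);
}
```

Comparing the masked value with `==` (here through `switch`) works whether or not one of the access constants is zero.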
#ifndef UTILS_H
#define UTILS_H
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
unsigned int flags2mode(int flags);
char *abspath(const char *wd, const char *path);
char *get_wd(void);
char *read_line(char *buffer, size_t *size, FILE *fp);
int path_is_dir(const char *pathname);
#endif
LICENSE.txt
MANIFEST.in
README.rst
setup.py
native/config.h
native/database.c
native/database.h
native/log.h
native/ptrace_utils.c
native/ptrace_utils.h
native/pylog.c
native/pytracer.c
native/syscalls.c
native/syscalls.h
native/tracer.c
native/tracer.h
native/utils.c
native/utils.h
reprozip/__init__.py
reprozip/common.py
reprozip/filters.py
reprozip/main.py
reprozip/pack.py
reprozip/traceutils.py
reprozip/utils.py
reprozip.egg-info/PKG-INFO
reprozip.egg-info/SOURCES.txt
reprozip.egg-info/dependency_links.txt
reprozip.egg-info/entry_points.txt
reprozip.egg-info/requires.txt
reprozip.egg-info/top_level.txt
reprozip/tracer/__init__.py
reprozip/tracer/linux_pkgs.py
reprozip/tracer/trace.py
\ No newline at end of file
[console_scripts]
reprozip = reprozip.main:main
[reprozip.filters]
builtin = reprozip.filters:builtin
python = reprozip.filters:python
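The entry-point declarations above can be read with the standard library alone. The following is a minimal sketch, not ReproZip's actual plugin-loading code: it parses the same INI-style text with `configparser` and splits each `module:attribute` target. The `ENTRY_POINTS_TXT` constant and the `parse_entry_points` helper are hypothetical names introduced only for this illustration.

```python
import configparser

# Copy of the entry_points.txt content shown above (illustrative constant).
ENTRY_POINTS_TXT = """\
[console_scripts]
reprozip = reprozip.main:main

[reprozip.filters]
builtin = reprozip.filters:builtin
python = reprozip.filters:python
"""


def parse_entry_points(text):
    """Return {group: {name: (module, attribute)}} for an entry_points.txt."""
    # Only '=' separates name and target; ':' is part of the target itself.
    parser = configparser.ConfigParser(delimiters=('=',))
    # Keep entry-point names case-sensitive instead of lowercasing them.
    parser.optionxform = str
    parser.read_string(text)
    groups = {}
    for group in parser.sections():
        groups[group] = {}
        for name, target in parser.items(group):
            module, _, attribute = target.partition(':')
            groups[group][name] = (module.strip(), attribute.strip())
    return groups


if __name__ == '__main__':
    for group, entries in parse_entry_points(ENTRY_POINTS_TXT).items():
        for name, (module, attribute) in entries.items():
            print("%s: %s -> %s.%s" % (group, name, module, attribute))
```

Each `reprozip.filters` entry names a callable in `reprozip/filters.py`, so a third-party package could presumably register an additional filter by declaring its own entry in the same group.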
PyYAML
rpaths>=0.8
usagestats>=0.3
requests
# Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
__version__ = '1.0.14'
# Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
from __future__ import division, print_function, unicode_literals
import logging
from reprozip.tracer.trace import TracedFile
from reprozip.utils import irange, iteritems
logger = logging.getLogger('reprozip')
def builtin(input_files, **kwargs):
    """Default heuristics for input files.

    Hidden files, compiled Python files and shared libraries are not
    considered inputs of a run.
    """
    for i in irange(len(input_files)):
        lst = []
        for path in input_files[i]:
            if path.unicodename[0] == '.' or path.ext in ('.pyc', '.so'):
                logger.info("Removing input %s", path)
            else:
                lst.append(path)
        input_files[i] = lst


def python(files, input_files, **kwargs):
    """Heuristics for Python files.

    Adds the .py source that sits next to each traced .pyc file, and stops
    treating .py and .pyc files as inputs.
    """
    add = []
    for path, fi in iteritems(files):
        if path.ext == '.pyc':
            # '/' binds tighter than '+': this is (parent / stem) + '.py'
            pyfile = path.parent / path.stem + '.py'
            if pyfile.is_file():
                if pyfile not in files:
                    logger.info("Adding %s", pyfile)
                    add.append(TracedFile(pyfile))
    # Insert after the loop, so the dict is not modified while iterating
    for fi in add:
        files[fi.path] = fi
    for i in irange(len(input_files)):
        lst = []
        for path in input_files[i]:
            if path.ext in ('.py', '.pyc'):
                logger.info("Removing input %s", path)
            else:
                lst.append(path)
        input_files[i] = lst
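To make the effect of the `builtin` heuristic concrete, here is a minimal, self-contained sketch of the same rule applied to sample data. It is an approximation under stated assumptions: it uses `pathlib.PurePosixPath` in place of the `rpaths` objects the real filter receives, and the function name `drop_uninteresting_inputs` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def drop_uninteresting_inputs(input_files, extensions=('.pyc', '.so')):
    """Stand-in for the `builtin` filter, using pathlib instead of rpaths.

    `input_files` holds one list of paths per run; hidden files and files
    with one of `extensions` are removed from each list in place.
    """
    for i, paths in enumerate(input_files):
        input_files[i] = [
            path for path in paths
            if not path.name.startswith('.') and path.suffix not in extensions
        ]


# Two runs, each with its own list of candidate input files (made-up paths).
runs = [
    [PurePosixPath('/home/user/data.csv'),
     PurePosixPath('/home/user/.bashrc'),
     PurePosixPath('/usr/lib/libm.so')],
    [PurePosixPath('/home/user/script.pyc'),
     PurePosixPath('/home/user/params.txt')],
]
drop_uninteresting_inputs(runs)
print(runs)
```

After the call, only `data.csv` remains for the first run and `params.txt` for the second. The outer list is mutated in place, mirroring how the filters above assign back to `input_files[i]` rather than returning a value.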
[egg_info]
tag_build =
tag_date = 0
import io
import os
import platform
from setuptools import setup, Extension
import sys
# pip workaround
os.chdir(os.path.abspath(os.path.dirname(__file__)))
# This won't build on non-Linux -- don't even try
if platform.system().lower() != 'linux':
    sys.stderr.write("reprozip uses ptrace and thus only works on Linux\n"
                     "You can however install reprounzip and plugins on other "
                     "platforms\n")
    sys.exit(1)
# List the source files
sources = ['pytracer.c', 'tracer.c', 'syscalls.c', 'database.c',
           'ptrace_utils.c', 'utils.c', 'pylog.c']
# They can be found under native/
sources = [os.path.join('native', n) for n in sources]
# Setup the libraries
libraries = ['sqlite3', 'rt']
# Build the C module
pytracer = Extension('reprozip._pytracer',
                     sources=sources,
                     libraries=libraries)
# Need to specify encoding for PY3, which has the worst unicode handling ever
with io.open('README.rst', encoding='utf-8') as fp:
    description = fp.read()
req = [
    'PyYAML',
    'rpaths>=0.8',
    'usagestats>=0.3',
    'requests']
setup(name='reprozip',
      version='1.0.14',
      ext_modules=[pytracer],
      packages=['reprozip', 'reprozip.tracer'],
      entry_points={
          'console_scripts': [
              'reprozip = reprozip.main:main'],
          'reprozip.filters': [
              'python = reprozip.filters:python',
              'builtin = reprozip.filters:builtin']},
      install_requires=req,
      description="Linux tool enabling reproducible experiments (packer)",
      author="Remi Rampin, Fernando Chirigati, Dennis Shasha, Juliana Freire",
      author_email='reprozip-users@vgc.poly.edu',
      maintainer="Remi Rampin",
      maintainer_email='remirampin@gmail.com',
      url='https://www.reprozip.org/',
      project_urls={
          'Homepage': 'https://github.com/ViDA-NYU/reprozip',
          'Documentation': 'https://docs.reprozip.org/',
          'Examples': 'https://examples.reprozip.org/',
          'Say Thanks': 'https://saythanks.io/to/remram44',
          'Source': 'https://github.com/ViDA-NYU/reprozip',
          'Tracker': 'https://github.com/ViDA-NYU/reprozip/issues',
      },
      long_description=description,
      license='BSD-3-Clause',
      keywords=['reprozip', 'reprounzip', 'reproducibility', 'provenance',
                'vida', 'nyu'],
      classifiers=[
          'Development Status :: 5 - Production/Stable',
          'Intended Audience :: Science/Research',
          'License :: OSI Approved :: BSD License',
          'Programming Language :: Python :: 2.7',
          'Programming Language :: Python :: 3',
          'Operating System :: POSIX :: Linux',
          'Programming Language :: C',
          'Topic :: Scientific/Engineering',
          'Topic :: System :: Archiving'])