Commit e0312d10 authored by Iñaki Malerba's avatar Iñaki Malerba

New upstream version 0.31.1

parent 3a542324
......@@ -4,6 +4,15 @@ Changes
=======
0.31.1 (*2018-03-18*)
=====================
- Fix #249 reporter bug when using `--continue` option
- Fix #248 test failures of debian when using GDBM
- Fix #164 `get_var` fails on multiprocess execution on Windows
- Fix #245 custom `clean` action takes `dry-run` into account
0.31.0 (*2018-02-25*)
=====================
......
......@@ -120,7 +120,7 @@ include a *clean* attribute. This attribute can be ``True`` to remove all of its
target files. If one of the targets is a folder, it is removed only if it
is empty; otherwise a warning message is displayed.
The *clean* attribute can be a list of actions. An action could be a
string with a shell command or a tuple with a python callable.
If you want to clean the targets and add some custom clean actions,
......@@ -146,8 +146,6 @@ If you are executing the default tasks this flag is automatically set.
By default only the clean actions of the default tasks are executed, not those of all tasks.
You can clean all tasks using the *-a*/*--all* argument.
If you would also like doit to forget the previous execution of cleaned tasks, use the
option *--forget*. This can be made the default behavior by adding the corresponding
``cleanforget`` configuration switch:
......@@ -158,6 +156,21 @@ configuration switch:
'cleanforget': True,
}
dry run
^^^^^^^
If you want to check which tasks the clean operation would affect, you can use the option `-n/--dry-run`.
When using a custom action on `dry-run`, the action is not executed at all
**if** it does not include a `dryrun` parameter.
If it includes a `dryrun` parameter the action will **always** be executed,
and its implementation is responsible for handling the *dry-run* logic.
.. literalinclude:: samples/custom_clean.py
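For example, with the ``sample`` task defined in the sample above, a dry run of
the clean operation could be invoked as::

    $ doit clean --dry-run sample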
ignore
-------
......
......@@ -38,11 +38,11 @@ documentation
install
tasks
dependencies
cmd_run
cmd_other
configuration
task_args
task_creation
uptodate
tools
extending
......
......@@ -102,6 +102,129 @@ so they are executed in appropriate order.
.. _uptodate_api:
uptodate API
--------------
This section explains how to extend ``doit`` by writing an ``uptodate``
implementation. Unless you need to write your own ``uptodate``
implementation, you can skip this.
Let's start with a trivial example. `uptodate` is a function that returns
a boolean value.
.. literalinclude:: samples/uptodate_callable.py
You could also execute this function in the task-creator and pass the value
to `uptodate`. The advantage of just passing the callable is that this
check will not be executed at all if the task was not selected to be executed.
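As a minimal sketch of such a callable (the names here are illustrative, not
taken from the sample file):

.. code-block:: python

    def never_uptodate():
        # hypothetical check; return True when the task can be skipped
        return False

    def task_example():
        return {
            'actions': ['echo run'],
            'uptodate': [never_uptodate],
        }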
Example: run-once implementation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Most of the time an `uptodate` implementation will compare the current value
of something with the value it had the last time the task was executed.
We have already seen how tasks can save values by returning a dict from their actions.
But usually the "value" we want to check is independent of the task's actions.
So the first step is to add a callable to the task so it can save some extra
values. These values are not used by the task itself; they are only used
for dependency checking.
The Task has a property called ``value_savers`` that contains a list of
callables. These callables should return a dict that will be saved together
with other task values. The ``value_savers`` will be executed after all actions.
The second step is to actually compare the saved value with its "current" value.
The `uptodate` callable can take two positional parameters ``task`` and ``values``. The callable can also be represented by a tuple (callable, args, kwargs).
- ``task`` gives you access to the task object, so you can inspect its
  metadata and even modify the task itself!
- ``values`` is a dictionary with the computed values saved in the last
successful execution of the task.
Let's take a look at the ``run_once`` implementation.
.. literalinclude:: samples/run_once.py
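The sample file is not reproduced here; as a sketch, doit's own ``run_once``
(available in ``doit.tools``) is essentially:

.. code-block:: python

    def run_once(task, values):
        def save_executed():
            # no computed value, just mark that the task has run
            return {'run-once': True}
        task.value_savers.append(save_executed)
        return values.get('run-once', False)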
The function ``save_executed`` returns a dict. In this case it does not save any
computed value; its only purpose is to record that the task was executed at all.
On the next line we use the ``task`` parameter to add ``save_executed`` to
``task.value_savers``. So whenever this task is executed, the task value
'run-once' will be saved.
Finally, the return value should be a boolean indicating whether the task is
up-to-date or not. Remember that the ``values`` parameter contains the dict with
the values saved from the last successful execution of the task.
So it just checks whether this task was executed before by looking for the
``run-once`` entry in ``values``.
Example: timeout implementation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Let's look at another example, ``timeout``. The main difference is that
we actually pass the parameter ``timeout_limit``. Here we present
a simplified version that only accepts integers (seconds) as a parameter.
.. code-block:: python

    import time as time_module

    class timeout(object):
        def __init__(self, timeout_limit):
            self.limit_sec = timeout_limit

        def __call__(self, task, values):
            def save_now():
                return {'success-time': time_module.time()}
            task.value_savers.append(save_now)
            last_success = values.get('success-time', None)
            if last_success is None:
                return False
            return (time_module.time() - last_success) < self.limit_sec
This is a class-based implementation where the objects are made callable
by implementing a ``__call__`` method.
On ``__init__`` we just save the ``timeout_limit`` as an attribute.
The ``__call__`` is very similar to the ``run_once`` implementation.
First it defines a function (``save_now``) that is registered
into ``task.value_savers``. Then it compares the current time
with the time that was saved on the last successful execution.
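As a usage sketch (the task name and action are illustrative), the class above
could be wired into a ``dodo.py`` like this:

.. code-block:: python

    def task_fetch_data():
        return {
            'actions': ['echo fetching...'],
            # considered up-to-date for 5 minutes after a successful run
            'uptodate': [timeout(5 * 60)],
        }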
Example: result_dep implementation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The ``result_dep`` is more complicated for two reasons: it needs to modify
the task's ``task_dep``, and it needs to check the saved values and metadata
of a task different from the one it is being applied to.
A ``result_dep`` implies that its dependency is also a ``task_dep``.
We have seen that the callable takes a `task` parameter that we used
to modify the task object. The problem is that modifying ``task_dep``
when the callable gets called would be "too late" for the
way `doit` works. For this reason, when an object passed to ``uptodate``
has a method named ``configure_task`` on its class, that method is called
during task creation.
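As an illustrative sketch (this is not ``result_dep`` itself), an ``uptodate``
object can use ``configure_task`` to add a ``task_dep`` early enough:

.. code-block:: python

    class depends_on(object):
        """hypothetical uptodate object adding a task_dep at creation time"""
        def __init__(self, dep_task_name):
            self.dep_name = dep_task_name

        def configure_task(self, task):
            # called during task creation, before execution starts,
            # so it is not "too late" to modify task_dep
            task.task_dep.append(self.dep_name)

        def __call__(self, task, values):
            # the actual up-to-date check would go here
            return False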
The base class ``dependency.UptodateCalculator`` gives access to
an attribute named ``tasks_dict``, a dictionary with
all task objects keyed by task name (this is used to get all
sub-tasks from a task-group), and a method called ``get_val`` to access
the saved values and results of any task.
See the `result_dep` `source <https://github.com/pydoit/doit/blob/master/doit/task.py>`_.
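For reference, ``result_dep`` itself is available from ``doit.tools``; a usage
sketch (task name illustrative):

.. code-block:: python

    from doit.tools import result_dep

    def task_consumer():
        return {
            'actions': ['echo producer result changed'],
            'uptodate': [result_dep('producer')],
        }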
task-dependency
---------------
......@@ -148,27 +271,6 @@ its `actions` to ``None``.
Note that tasks are never executed twice in the same "run".
setup-task
-------------
......@@ -268,3 +370,30 @@ If a group-task is used, the values from all its sub-tasks are passed as a dict.
.. note::
``getargs`` creates an implicit setup-task.
.. _attr-calc_dep:
calculated-dependencies
------------------------
Calculation of dependencies might be an expensive operation, so it is not suitable
to be done at load time by task-creators.
In this situation it is better to delegate
the calculation of dependencies to another task.
The task calculating dependencies must have a python-action returning a
dictionary with `file_dep`, `task_dep`, `uptodate` or another `calc_dep`.
.. note::
    An alternative (and often easier) way to have task attributes that
    rely on other tasks' execution is to use `delayed tasks <delayed-task-creation>`.
In the example below ``mod_deps`` prints to the screen all direct dependencies
of a module. The dependencies themselves are calculated by the task ``get_dep``
(note: ``get_dep`` has a fake implementation where the results are taken from a dict).
.. literalinclude:: samples/calc_dep.py
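The sample file is not reproduced here; a minimal sketch of the idea (the names
and the fake dependency dict are illustrative, not the sample's contents):

.. code-block:: python

    # fake dependency data, stands in for real dependency scanning
    MOD_IMPORTS = {'a': ['b', 'c'], 'b': ['c'], 'c': []}

    def get_dep(mod):
        # return a dict with the computed dependency attributes
        return {'file_dep': ['%s.py' % m for m in MOD_IMPORTS[mod]]}

    def task_get_dep():
        for mod in MOD_IMPORTS:
            yield {'name': mod,
                   'actions': [(get_dep, [mod])],
                   'file_dep': ['%s.py' % mod]}

    def task_mod_deps():
        for mod in MOD_IMPORTS:
            yield {'name': mod,
                   'actions': ['echo %s: %%(dependencies)s' % mod],
                   'calc_dep': ['get_dep:%s' % mod],
                   'verbosity': 2}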
......@@ -82,17 +82,22 @@ DSL
DelayedLoader
DelayedTask
Dembia
DevOps
DoIt
DoitCmdBase
DoitMain
ETC2
ETag
Elasticsearch
FPGA
FileChangedChecker
FooCmd
FrankStain
Fuseki
GH
Gliwinski
Guo
Heggø
INI
IPython
IPython's
......@@ -126,6 +131,7 @@ Pythonic
README
RSS
ReST
Repology
S3
SCons
Schettino
......@@ -206,6 +212,7 @@ dict's
dir
dodoFile
doit
dryrun
dumbdbm
dumpdb
efg
......@@ -242,6 +249,7 @@ hg
hggit
hgrc
html
https
hunspell
img
init
......@@ -286,6 +294,7 @@ notavalidchoice
o'
once'
online
org
os
outfile
param
......@@ -372,6 +381,8 @@ toctree
tsetup
txt
txt'
ub
uio
unhandled
unicode
unix
......
......@@ -3,25 +3,49 @@ Installing
==========
pip
^^^
Using the pip `package <http://pip.pypa.io/>`_::
$ pip install doit
The latest version of `doit` supports only Python 3.
If you are using Python 2::
$ pip install "doit<0.30"
Source
^^^^^^
Download `source <http://pypi.python.org/pypi/doit>`_::
$ pip install -e .
git repository
^^^^^^^^^^^^^^
Get latest development version::
$ git clone https://github.com/pydoit/doit.git
OS package
^^^^^^^^^^
Several distributions include native `doit` packages.
`Repology.org <https://repology.org/metapackage/doit/badges>`_
provides up-to-date information about available packages and
`doit` versions on each distribution.
Anaconda
^^^^^^^^
`doit` is also packaged on `Anaconda <https://anaconda.org/conda-forge/doit>`_.
Note that this is not an official package and might be outdated.
.. note::
* `doit` depends on the packages
`pyinotify <http://trac.dbzteam.org/pyinotify>`_ (for Linux),
......
def my_cleaner(dryrun):
    # the presence of the 'dryrun' parameter means this action is
    # executed even when `doit clean` is invoked with --dry-run
    if dryrun:
        print('dryrun, do not really execute')
        return
    print('execute cleaner...')


def task_sample():
    return {
        "actions" : None,
        "clean" : [my_cleaner],
    }
......@@ -150,9 +150,42 @@ many users and contributors.
Document Production
^^^^^^^^^^^^^^^^^^^
(2018-02-01)
`Carve Systems <https://carvesystems.com>`_ uses `doit` as the core automation tool
for all of our document production. This customized tool, based on Pandoc and LaTeX and
coordinated by `doit`, is used by everyone in our company to prepare our primary
customer-facing deliverable. Previously we used makefiles to coordinate builds. `doit`
let us create a system that can be more easily maintained, tested, and extended using
plugins.
DevOps
------
University of Oslo Library, Norway
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
by_ `Dan Michael O. Heggø <https://github.com/danmichaelo>`_ (2018-02-26)
.. _by: #https-data-ub-uio-no
We're using `doit` for the publishing workflow at our vocabulary server https://data.ub.uio.no/ .
The server checks multiple remote sources for changes, and when there are new changes somewhere, the data is fetched,
converted to different formats, published and pushed to Fuseki and Elasticsearch.
One part I love about `doit` is that you can control what is considered a change.
For remote files, I've created a task that checks if some header, like ETag or Last-Modified, has changed.
If it hasn't, I set `uptodate` to True and stop there.
Another part I love is the ability to re-use tasks.
Each vocabulary (like https://github.com/realfagstermer/realfagstermer and https://github.com/scriptotek/humord)
has a different publication workflow, but many tasks are shared.
With `doit`, I've created a collection of tasks and task generators (https://github.com/scriptotek/data_ub_tasks/)
that I use with all the vocabularies.
Finally, it's great that you can mix shell commands and Python tasks so easily.
This cuts development time and makes the move from using Makefiles much easier.
......@@ -4,7 +4,7 @@ custom uptodate
The basics of `uptodate` were already :ref:`introduced <attr-uptodate>`.
Here we look in more
detail at some implementations shipped with `doit`.
.. _result_dep:
......@@ -129,126 +129,3 @@ If a file is a target of another task you should probably add
``task_dep`` on that task to ensure the file is created before it is checked.
.. literalinclude:: samples/check_timestamp_unchanged.py
......@@ -399,7 +399,11 @@ class DoitCmdBase(Command):
db_class = self._backends.get(params['backend'])
checker_cls = self.get_checker_cls(params['check_file_uptodate'])
# note the command has the responsibility to call dep_manager.close()
if self.dep_manager is None:
# dep_manager might have been already set (used on unit-test)
self.dep_manager = Dependency(
db_class, params['dep_file'], checker_cls)
# hack to pass parameter into _execute() calls that are not part
# of command line options
......
......@@ -32,6 +32,11 @@ def reset_vars():
_CMDLINE_VARS = {}
def get_var(name, default=None):
# Ignore if not initialized.
# This is a work-around for Windows multi-processing
# See https://github.com/pydoit/doit/issues/164
if _CMDLINE_VARS is None:
return None
return _CMDLINE_VARS.get(name, default)
def set_var(name, value):
......
......@@ -88,17 +88,21 @@ class ConsoleReporter(object):
# if test fails print output from failed task
for result in self.failures:
task = result['task']
# makes no sense to print output if task was not executed
if not task.executed:
continue
show_err = task.verbosity < 1 or self.failure_verbosity > 0
show_out = task.verbosity < 2 or self.failure_verbosity == 2
if show_err or show_out:
self.write("#"*40 + "\n")
if show_err:
self._write_failure(result,
write_exception=self.failure_verbosity)
err = "".join([a.err for a in task.actions if a.err])
self.write("<stderr>:\n{}\n".format(err))
self.write("{} <stderr>:\n{}\n".format(task.name, err))
if show_out:
out = "".join([a.out for a in task.actions if a.out])
self.write("<stdout>:\n{}\n".format(out))
self.write("{} <stdout>:\n{}\n".format(task.name, out))
if self.runtime_errors:
self.write("#"*40 + "\n")
......
......@@ -111,6 +111,10 @@ class Runner():
self.reporter.get_status(task)
# overwrite with effective verbosity
task.overwrite_verbosity(self.stream)
# check if task should be ignored (user controlled)
if node.ignored_deps or self.dep_manager.status_is_ignore(task):
node.run_status = 'ignore'
......
......@@ -243,6 +243,8 @@ class Task(object):
self.teardown = [create_action(a, self, 'teardown') for a in teardown]
self.doc = self._init_doc(doc)
self.watch = watch
# just indicate if actions were executed at all
self.executed = False
def _init_deps(self, file_dep, task_dep, calc_dep):
......@@ -433,14 +435,15 @@ class Task(object):
for value_saver in self.value_savers:
self.values.update(value_saver())
def overwrite_verbosity(self, stream):
self.verbosity = stream.effective_verbosity(self.verbosity)
def execute(self, stream):
"""Executes the task.
@return failure: see CmdAction.execute
"""
self.executed = True
self.init_options()
task_stdout, task_stderr = stream._get_out_err(self.verbosity)
for action in self.actions:
action_return = action.execute(task_stdout, task_stderr)
......@@ -478,12 +481,14 @@ class Task(object):
outstream.write(msg % (self.name, action))
# add extra arguments used by clean actions
execute_on_dryrun = False
if isinstance(action, PythonAction):
action_sig = inspect.signature(action.py_callable)
if 'dryrun' in action_sig.parameters:
execute_on_dryrun = True
action.kwargs['dryrun'] = dryrun
if (not dryrun) or execute_on_dryrun:
result = action.execute(out=outstream)
if isinstance(result, CatchedException):
sys.stderr.write(str(result))
......
"""doit version, defined out of __init__.py to avoid circular reference"""
VERSION = (0, 31, 1)
......@@ -43,7 +43,7 @@ configuration management, etc.
setup(name = 'doit',
description = 'doit - Automation Tool',
version = '0.31.1',
license = 'MIT',
author = 'Eduardo Naufel Schettino',
author_email = 'schettino72@gmail.com',
......
# Run/test doit on debian unstable
# docker build -t doit-debian .
# docker run -it --cap-add SYS_PTRACE -v /home/eduardo/work/doit/dev:/root/doit doit-debian
# pip3 install -e .
# pip3 install -r dev_requirements.txt
FROM debian:unstable
RUN apt-get update && apt-get install eatmydata --no-install-recommends -y
RUN eatmydata apt-get install python3-pytest python3-pip -y
RUN apt-get install python3-gdbm strace -y
WORKDIR /root/doit
......@@ -75,7 +75,7 @@ db_ext = {'dbhash': [''],
}
@pytest.fixture
def dep_manager(request):
if hasattr(request, 'param'):
dep_class = request.param
else:
......@@ -111,10 +111,6 @@ def depfile_name(request):
return depfile_name
@pytest.fixture
def restore_cwd(request):
......
......@@ -10,6 +10,7 @@ from doit.task import Task
from doit.cmd_base import version_tuple, Command, DoitCmdBase
from doit.cmd_base import ModuleTaskLoader, DodoTaskLoader
from doit.cmd_base import check_tasks_exist, tasks_and_deps_iter, subtasks_iter
from .conftest import CmdFactory
def test_version_tuple():
......@@ -248,7 +249,7 @@ class TestDoitCmdBase(object):
assert dodo_config == {'verbosity': 2}
def test_force_verbosity(self, dep_manager):
members = {
'DOIT_CONFIG': {'verbosity': 0},
'task_xxx1': lambda : {'actions':[]},
......@@ -269,9 +270,10 @@ class TestDoitCmdBase(object):
def _execute(self, verbosity, force_verbosity):
return verbosity, force_verbosity
cmd = CmdFactory(SampleCmd, task_loader=loader, dep_manager=dep_manager)
assert (2, True) == cmd.parse_execute(
['--db-file', dep_manager.name, '-v2'])
assert (0, False) == cmd.parse_execute(['--db-file', dep_manager.name])
......
......@@ -4,14 +4,14 @@ from doit.cmd_dumpdb import DumpDB
class TestCmdDumpDB(object):
def testDefault(self, capsys, dep_manager):
if dep_manager.whichdb in ('dbm', 'dbm.ndbm'): # pragma: no cover
pytest.skip('%s not supported for this operation' % dep_manager.whichdb)
# cmd_main(["help", "task"])
dep_manager._set('tid', 'my_dep', 'xxx')
dep_manager.close()