Commit 7f8e500f authored by Reiner Herrmann, committed by Jérémy Bobbio

Fix typos, grammar and add various wording improvements

parent ba7f69de
......@@ -5,19 +5,19 @@ permalink: /docs/archives/
---
Most archive formats record metadata that will capture details about the
build environment if care is not taken. File last modification time is
build environment if no care is taken. File last modification time is
obvious, but file ordering, users, groups, numeric ids, and permissions
can also be concerns. Tar is going to be used as the main example but
these tips should apply to other archive formats as well.
can also be concerns. Tar will be used as the main example but these tips
should apply to other archive formats as well.
File modification times
-----------------------
Most archive formats will, be default, record file last modification
times. Some will also record file creation times.
Most archive formats will, by default, record file last modification
times, while some will also record file creation times.
Tar has a way to specify the modification time that must be used for all
files:
Tar has a way to specify the modification time that is used for all
archive members:
{% highlight sh %}
$ tar --mtime='2015-10-21 00:00Z' -cf product.tar build
......@@ -26,7 +26,7 @@ $ tar --mtime='2015-10-21 00:00Z' -cf product.tar build
(Notice how `Z` is used to specify that time is in the UTC
[timezone]({{ "/docs/timezones/" | prepend: site.baseurl }}).)
For other achive formats, it is always possible to use `touch` to reset
For other archive formats, it is always possible to use `touch` to reset
the modification times to a [predefined value]({{ "/docs/timestamps/" | prepend: site.baseurl }})
before creating the archive:
......@@ -36,7 +36,7 @@ $ find build -print0 |
$ zip -r product.zip build
{% endhighlight %}
In some cases, it can be preferable to keep the original times for files
In some cases, it is preferable to keep the original times for files
that have not been created or modified during the build process:
{% highlight sh %}
......@@ -45,12 +45,12 @@ $ find build -newermt "@${SOURCE_DATE_EPOCH}" -print0 |
$ zip -r product.zip build
{% endhighlight %}
A patch has been written to make the latter operation easier with GNU
A patch has been written to simplify the latter operation with GNU
Tar. It is currently available in Debian since
[tar](https://packages.qa.debian.org/tar) version 1.28-1. Hopefully it
will be integrated upstream soon but you might want to use it with
caution. It adds a new `--clamp-mtime` flag which will only set time
when the file is more recent than what was given with `--mtime`:
will be integrated upstream soon, but you might want to use it with
caution. It adds a new `--clamp-mtime` flag which will only set the time
when the file is more recent than the value specified with `--mtime`:
{% highlight sh %}
# Only in Debian unstable for now
......@@ -67,7 +67,7 @@ When asked to record directories, most archive formats will read their
content in the order returned by the filesystem which is [likely to be
different on every run]({{ "/docs/stable-inputs/" | prepend: site.baseurl }}).
With version 1.28, GNU Tar has gained `--sort=name` option which will
With version 1.28, GNU Tar has gained the `--sort=name` option which will
sort filenames in a locale independent manner:
{% highlight sh %}
......@@ -94,7 +94,7 @@ can be recorded. Sometimes it will be using a string, sometimes using
the associated numeric ids.
When files belong to predefined system groups, this is not a problem,
but builds most often are made using regular users. Recording of the
but builds are often performed with regular users. Recording of the
account name or its associated ids might be a source of reproducibility
issues.
......@@ -147,17 +147,17 @@ mode* which will use zero for UIDs, GIDs, timestamps, and use consistent
file modes for all files. It can be made the default by passing the
`--enable-deterministic-archives` option to `./configure`. It is already
enabled by default for some distributions[^distros-with-default] and so
far it seemed to be pretty safe [except for
far it seems to be pretty safe [except for
Makefiles](https://bugs.debian.org/798804) using targets like
`archive.a(foo.o)`.
When binutils is not built with deterministic archives by default, build
systems have to be changed to pass the right options to `ar` and
friends. `ARFLAGS` can be set to `Dcvr` with many build systems to turn on the
deterministic mode. Care must be also taken to pass `-D` if `ranlib` is
deterministic mode. Care must also be taken to pass `-D` if `ranlib` is
used to create the function index.
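The options above can be sketched as a small shell helper. This is only an illustration, assuming GNU binutils is installed; the function name `build_static_lib` and any file names passed to it are hypothetical. Members are stored in the order given, so the caller still has to pass them in a stable order.

```sh
# Hypothetical helper: create a static library in deterministic mode.
# Assumes GNU binutils (ar and ranlib that accept D / -D).
build_static_lib() {
    archive=$1
    shift
    rm -f "$archive"
    # D: zero timestamps, UIDs and GIDs, and use consistent file modes
    ar Dcr "$archive" "$@"
    # -D: keep the function index deterministic as well
    ranlib -D "$archive"
}
```

For example, `build_static_lib libexample.a foo.o bar.o` should yield the same bytes no matter when or by whom the object files were written.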
Another option is to do post-processing by using
Another option is post-processing with
[strip-nondeterminism](https://packages.debian.org/sid/strip-nondeterminism)
or `objcopy`:
......
......@@ -4,9 +4,8 @@ layout: docs
permalink: /docs/buy-in/
---
Working on reproducible builds might look like a lot of efforts with
little gain at first. While [this apply to many types of work related
to
Working on reproducible builds might look like a lot of effort with
little gain at first. While [this applies to many types of work related to
security](https://www.schneier.com/blog/archives/2008/09/security_roi_1.html),
there are already some good arguments and testimonies
on why *reproducible builds* matter.
......@@ -22,8 +21,8 @@ from the Snowden leaks the abstract of a talk at an
[Strawhorse: Attacking the MacOS and iOS Software Development
Kit](https://theintercept.com/document/2015/03/10/strawhorse-attacking-macos-ios-software-development-kit/).
The abstract clearly explains how unnamed researchers have been creating
modified version of XCode that would—without any knowledge of the
developper—watermark or insert spyware in the compiled applications.
modified version of XCode that would — without any knowledge of the
developer — watermark or insert spyware in the compiled applications.
A few months later, a malware dubbed “XcodeGhost” has been found
targeting developers to make them unknowingly distribute malware
......@@ -42,14 +41,14 @@ the problem easily visible, especially given the size of the added
payload.
As Mike Perry and Seth Schoen explained in December 2014 during [a talk at
31C3](https://media.ccc.de/events/31c3_-_6240_-_en_-_saal_g_-_201412271400_-_reproducible_builds_-_mike_perry_-_seth_schoen_-_hans_steiner)
in December, problematic changes might be more subtle, and a single bit
31C3](https://media.ccc.de/events/31c3_-_6240_-_en_-_saal_g_-_201412271400_-_reproducible_builds_-_mike_perry_-_seth_schoen_-_hans_steiner),
problematic changes might be more subtle, and a single bit
might be the only thing required to create a remotely exploitable
security hole. Seth Schoen also made the demonstration of a kernel-level
malware that would compromise the source code while it was being read by
security hole. Seth Schoen also demonstrated a kernel-level
malware that would compromise the source code while it is read by
the compiler, without leaving any traces on disk. While to the best of
our knowledge such attacks have not been observed in the wild,
<strong>reproducible builds is the only way to detect them
<strong>reproducible builds are the only way to detect them
early</strong>.
Quality assurance
......@@ -57,8 +56,8 @@ Quality assurance
Regular tests are required to make sure that the software can be built
reproducibly in various environments. Debian and other free software
distributions consider that their users must be able to build the
software they distribute. Such regular tests helps to avoid *fail to
distributions require that their users must be able to build the
software they distribute. Such regular tests help in avoiding *fail to
build from source* bugs.
Build environments may evolve after a project is no longer receiving
......@@ -72,11 +71,11 @@ translations](https://bugs.debian.org/778486), or [changing
dependencies](https://bugs.debian.org/778707).
The constraint of having to reflect about the build environment also
helps developers to think the relationship with external software or
helps developers to think about the relationship with external software or
data providers. Relying on external sources with no backup plans might
cause serious troubles in the long term.
Having reproducible builds also allow to recreate matching [debug
Reproducible builds also enable the recreation of matching [debug
symbols](https://en.wikipedia.org/wiki/Debugging_data_format) for a
distributed build which can help understanding issues in software used
in production.
......@@ -89,7 +88,7 @@ to know if the build environment is not compromised if everyone is using
the same binaries? Or how can I trust that the compiler I just built
was not compromised by a backdoor in the compiler I used to build it?
The latter is known in the academic litterature since the
The latter is known in the academic literature since the
[Reflections on trusting
trust](https://dx.doi.org/10.1145%2F358198.358210) paper from
Ken Thompson published in 1984. It's the paper mentioned in the
......@@ -97,9 +96,9 @@ description of the talk about “Strawhorse” mentioned earlier.
The technique known as [Diverse
Double-Compilation](http://www.dwheeler.com/trusting-trust/) formally
defined and resarched by David A. Wheeler can answer this question.
To sum up quickly how it works: taking two compilers, one trusted and
one under test trusted, the compiler under test is built twice,
defined and researched by David A. Wheeler can answer this question.
To sum up quickly how it works: given two compilers, one trusted and
one under test, the compiler under test is built twice,
once with each compiler. Using the compilers created from this build,
the compiler under test is built again. If the output is the same, then
we have a proof that no backdoors have been inserted during the
......
......@@ -16,14 +16,14 @@ responsibilities.
Getting a deterministic build system
------------------------------------
In order for software to allow reproducible builds, the source code must
In order to allow software to build reproducibly, the source code must
not introduce uncontrollable variations in the build output.
Things will work better if such variations are discovered before users
are confronted with unreproducible binaries. Setting up a test
protocol in which rebuilds are performed under variations in the
environment (aspects like time, *username*, CPU, system version,
filesystems amongst many others) will greatly help.
filesystems, amongst many others) will greatly help.
Defining a build environment
----------------------------
......@@ -31,11 +31,11 @@ Defining a build environment
As different versions of compilation tools are likely to produce
different outputs, users must be able to recreate a build environment
close enough to the original build. It is not required that the
toolchain[^toolchain] itself be byte-for-byte identical, but its
toolchain[^toolchain] itself is byte-for-byte identical, but its
output has to stay the same.
The build environment can either be defined while the software is being
developped or recorded at build time.
developed or recorded at build time.
Distributing the build environment
----------------------------------
......@@ -47,9 +47,9 @@ If the build environment is defined ahead and part of the source code,
then no further steps are required.
In other cases, it needs to be made available alongside the binaries.
The ideal form is a description that can be both understand by human and
machine to make automatic verification possible, while making people
able to review that the environment is sane.
The ideal form is a description that can be understood by both humans
and machines to make automatic verification possible, while enabling people
to review that the environment is sane.
Providing a comparison protocol
-------------------------------
......@@ -57,9 +57,9 @@ Providing a comparison protocol
Users must have an easy way to recreate the build environment, get the
source code, perform the build, and compare the results.
Ideally, the comparison protocol should be simply to see if resulting
bytes are identical. Comparing directly bytes or cryptographic hashes
function is easy to do and understand.
Ideally, the comparison protocol should simply check whether the resulting
binaries are identical. Comparing bytes or cryptographic
hashes is easy to do and understand.
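A byte-for-byte comparison through cryptographic hashes could look like the sketch below. It assumes `sha256sum` from GNU coreutils is available; the function name `same_artifact` is hypothetical and not part of any existing tool.

```sh
# Hypothetical helper: succeed only if both files have the same SHA-256 digest.
# Returns 2 if either file is missing, so an absent artifact never "matches".
same_artifact() {
    [ -f "$1" ] && [ -f "$2" ] || return 2
    # Reading from stdin keeps the file name out of the sha256sum output
    [ "$(sha256sum < "$1")" = "$(sha256sum < "$2")" ]
}
```

It could be used as `same_artifact first/product.tar second/product.tar && echo identical`, where the two paths stand for the outputs of two independent builds.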
Other technologies might require removing cryptographic signatures or
ignore specific parts. Such operations must be both documented and
......
......@@ -5,7 +5,7 @@ permalink: /docs/stable-inputs/
---
If building your software requires processing several inputs at once,
make sure the order is stable accross builds.
make sure the order is stable across builds.
A typical example is creating an archive from the content of a
directory. Most filesystems do not guarantee that listing files in a
......@@ -53,8 +53,8 @@ Watch out for locale-related issues
-----------------------------------
When sorting inputs, one must ensure that the sorting order is not affected by
the system locale settings. For example, some locale will not make differences
between uppercase and lowercase.
the system locale settings. Some locales will not distinguish between uppercase
and lowercase characters.
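As a small illustration of locale-independent sorting, the sketch below forces the `C` locale so that lines are compared byte by byte. The function name `stable_sort` is hypothetical.

```sh
# Hypothetical helper: sort lines in byte order, whatever the system locale is.
stable_sort() {
    LC_ALL=C sort
}

# In the C locale, uppercase ASCII letters sort before lowercase ones.
printf 'b\nA\na\nB\n' | stable_sort
# prints A, B, a, b (one per line)
```

Without `LC_ALL=C`, the same input may come out in a different order depending on the locale settings of the machine running the build.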
For example, `tar` will by default use the filesystem order when
descending directories:
......@@ -86,5 +86,4 @@ $ find src -print0 | LC_ALL=C sort -z |
This might not be the only changes required for [Tar and other archive
formats]({{ "/docs/archives/" | prepend: site.baseurl }}) as they
usually embed more metadata.
problems.
usually embed more metadata problems.
......@@ -7,7 +7,7 @@ permalink: /docs/test-bench/
It is important to detect reproducibility problems in the build system
before users to avoid any false alarms.
The method is usually as follow:
The method is usually as follows:
1. Build a first time.
2. Save the result.
......@@ -39,8 +39,8 @@ far:
* CPU type,
* number of CPU cores.
[disorderfs](https://packages.debian.org/sid/disorderfs) can help to
test variations due to the filesystem in a deterministic manner.
[disorderfs](https://packages.debian.org/sid/disorderfs) can help in
testing variations due to filesystems in a deterministic manner.
The list of [variations tested for
Debian](https://reproducible.debian.net/reproducible.html#variation) is
......
......@@ -26,7 +26,7 @@
Content licensed under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a>.
</p>
<p>
Logos and trademarks belongs to their respective owners.
Logos and trademarks belong to their respective owners.
</p>
</div>
<div class="four columns">
......
......@@ -6,13 +6,13 @@ categories: org
---
While long considered nearly impossible for real software,
the idea of *reproducible builds* has been revived a couple years ago by
developpers from Bitcoin and The Tor Project. Since then, [several major free
the idea of *reproducible builds* has been revived a couple of years ago by
developers from Bitcoin and the Tor Project. Since then, [several major free
software projects]({{ "/who/" | prepend: site.baseurl }}) are now actively
working on supporting reproducible builds.
It was time to give more visibility to the various initiatives and get a common
ground to share general information, specifications, and tutorials. So here's a
It was time to increase the visibility of the various initiatives and establish a
common ground to share general information, specifications, and tutorials. So here's a
new homepage!
**Everyone is welcome to contribute!** To get a copy of the website, just type:
......@@ -21,4 +21,4 @@ new homepage!
Most of the currently available documentation has been written with the
experience and the perspective of the work done in the Debian project. We are
sure missing important research and solutions. Please share your insights!
surely missing important research and solutions. Please share your insights!
......@@ -9,9 +9,9 @@ permalink: /docs/
<div class="eight columns text">
<p>
Getting <em>reproducible builds</em> for your software might be easier than
you think! But it might require—generally small—changes to your build system,
and a strategy on how to allow other to recreate an environment where the builds
can be reproduced.
you think! But it might require—usually small—changes to your build system,
and a strategy on how to enable others to recreate an environment in which
the builds can be reproduced.
</p>
</div>
</div>
......
......@@ -8,7 +8,7 @@ permalink: /events/
<div class="four columns">&nbsp;</div>
<div class="eight columns text">
<p>
Irregular events are organize to exchange ideas about “reproducible
Irregular events are organized to exchange ideas about “reproducible
builds”, get a better understanding or cooperate on improving code
or specifications.
</p>
......
......@@ -21,8 +21,8 @@ permalink: /tools/
<a href="http://diffoscope.org/">diffoscope</a> will try to <strong>get
to the bottom of what makes files or directories different</strong>. It
will recursively unpack archives of many kinds and transform various
binary formats into more human readable form to compare them. It can
compare two tarballs, ISO images, or PDF just as easily.
binary formats into more human readable forms for comparison. It can
compare two tarballs, ISO images, or PDFs just as easily.
</p>
<p>
See an <a href="http://diffoscope.org/examples/igerman98_20131206-5.txt">example text output</a>.
......@@ -51,7 +51,7 @@ permalink: /tools/
<div class="eight columns text">
<p>
Some tools used in build systems might introduce non-determinism in ways
difficult to fix at the source <strong>requiring
difficult to fix at the source, which <strong>requires
post-processing</strong>. <a
href="https://packages.debian.org/sid/strip-nondeterminism">strip-nondeterminism</a>
knows how to <strong>normalize various file formats</strong> such as gzipped files, ZIP
......