Commit 23f28585 authored by James Addison

docs: Add a "Getting Started" guide to the website



Resolves #56

Co-authored-by: Jussi
Co-authored-by: Klaus
Co-authored-by: Pier Angelo Vendrame <pierov@torproject.org>
parent 791c3bec
+3 −0
@@ -10,6 +10,9 @@
- title: Achieve deterministic builds
  docs:
  - commandments
  - getting-started
- title: Managing variance
  docs:
  - env-variations
  - source-date-epoch
  - deterministic-build-systems
+37 −0
---
title: Reproducibility Troubleshooting
layout: docs
permalink: /docs/reproducibility-troubleshooting/
---

To identify the origin of non-deterministic build outputs:

  * Try to isolate the build steps involved.  For example: it may be possible to perform a partial rebuild of the affected files, and/or to temporarily comment out irrelevant source code until the problem disappears.

    * Smaller builds are generally quicker, and this is likely to allow you to experiment more rapidly.
    * Removing irrelevant code allows you and collaborators to narrow the focus of your investigation.
    * Minimal examples are often helpful as test cases and for communication purposes.

  * Compare the differing output files, using a tool such as [diffoscope](https://diffoscope.org/).

    * Sometimes the nature of the content difference itself will provide clues (see the [difference patterns](#difference-patterns) below).
    * Context surrounding the difference may help you to locate relevant source code.

  * Try to pinpoint the [variance factor](/docs/adding-build-variance/) that causes the difference.  Tools such as [reprotest](https://salsa.debian.org/reproducible-builds/reprotest) can help you do this.

    * This will allow you to focus your experimentation on the relevant factor.
    * Knowing the variance factor should also make it easier to consult relevant [documentation](https://reproducible-builds.org/docs/).

With some luck and/or persistence, you should obtain an idea of the affected build step, the different values that appear, and the input factors that introduce them.

We now need to inspect the differences and to try to understand what could have caused each of them.

In some cases, the differences themselves may provide some hints about possible fixes.  The table below lists some common patterns.

# Difference patterns

| Before     | After | Seems to be a...          | Possible remedy |
|------------|-------|---------------------------|-----------------|
| 2021-04-10 | 2026-09-01 | date                      | Configure [`SOURCE_DATE_EPOCH`](https://reproducible-builds.org/docs/source-date-epoch/) |
| APK file A | APK file B | Android build difference  | [Unpack the APK and examine the contents](https://gitlab.torproject.org/tpo/applications/tor-browser/-/wikis/Reproducible-Builds/Debugging-Android) |
| 00 00 A4 81 | 00 00 B4 81 | Zip file permission | [Configure reproducible file permissions](https://blog.jabberhead.tk/2022/06/20/reproducible-builds-telling-of-a-debugging-story/) |
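
As an illustration of the first row of the table, the following is a minimal sketch of how a build script might honour `SOURCE_DATE_EPOCH` when it needs to embed a date. The function name `build_timestamp` and the sample value `1618012800` are assumptions chosen for illustration; only the environment variable name comes from the linked documentation.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ) -> datetime:
    """Return the timestamp to embed in build output, always in UTC.

    Hypothetical helper: prefers SOURCE_DATE_EPOCH when it is set, and
    falls back to the current time otherwise.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    seconds = int(epoch) if epoch is not None else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# With SOURCE_DATE_EPOCH=1618012800 the embedded date is always 2021-04-10,
# regardless of when the build actually runs.
print(build_timestamp({"SOURCE_DATE_EPOCH": "1618012800"}).strftime("%Y-%m-%d"))
```

Formatting the result in UTC also avoids introducing a dependency on the timezone of the build machine.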
+22 −0
Sample use cases
================

[ TODO : fill this page with examples collated from reproducible build investigation results ]

Part one:

Ask people on the r-b mailing list (and elsewhere) for examples of reproducibility failures and how they were fixed.

All the writeups should have the same outline for consistency. The texts should have some technical details, but should not dive too deeply into the nitty-gritty. A potential outline could be something like the following:

How was the issue discovered?

What steps were taken to discover the underlying cause?

How was the issue fixed?

The writeups should be fairly brief to make them easy to read. Two or three paragraphs of text for each section is a good amount to aim for.

Part two:

Links to blog posts and other resources that have similar writeups.
+44 −0
---
title: Adding build variance
layout: docs
permalink: /docs/adding-build-variance/
---

### Background

Rebuilding software on an individual machine and obtaining the same output does not guarantee that the software will always build reproducibly.

For example: a program that embeds the hostname of the build computer would not be bit-for-bit reproducible when built on systems with different hostnames.
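
The hostname example can be sketched with a toy model. The `toy_build` function and the hostnames `alpha` and `beta` below are hypothetical and stand in for a real build step; the point is only that two builds on one machine agree even though the output is not reproducible elsewhere.

```python
import hashlib


def toy_build(source: str, hostname: str) -> bytes:
    """Hypothetical build step that embeds the build machine's hostname."""
    return f"{source}\n# built on {hostname}\n".encode()


def digest(artifact: bytes) -> str:
    """Return the hex SHA-256 digest of a build artifact."""
    return hashlib.sha256(artifact).hexdigest()


source = "print('hello')"

# Two builds on the same machine agree, so a local rebuild detects nothing.
same_machine = digest(toy_build(source, "alpha")) == digest(toy_build(source, "alpha"))

# A build on a machine with a different hostname exposes the problem.
other_machine = digest(toy_build(source, "alpha")) == digest(toy_build(source, "beta"))

print(same_machine, other_machine)  # True False
```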

Changing the compiler, or the compiler version, could also introduce differences.  However, to achieve reproducible build results, it is generally acceptable to specify precise toolchain version(s) that other people should use when attempting to achieve an identical build.

**Note**: There are some conventions about factors that are acceptable to keep constant: these include the compiler, compiler version, and the versions of other software that the software depends upon. In contrast, other factors are expected to vary, and we should accommodate them when the software is rebuilt in diverse environments. It is a good idea to confirm that your software builds reproducibly in those environments too.

### How to add variance to software builds

Tooling exists to systematically explore the factors that can affect build reproducibility, and we recommend re-using existing utilities rather than writing your own.

#### Reprotest

[`reprotest`](https://salsa.debian.org/reproducible-builds/reprotest) is a tool that rebuilds a project in different environments automatically.

It can apply several variations, including build path, file order, locales, and hostname.

It includes native support for Debian and RPM package rebuilds, and can also be configured to run with other build systems.

Its [`README`](https://salsa.debian.org/reproducible-builds/reprotest/-/blob/master/README.rst) includes a variety of usage examples.


### Factors that we would like to prevent from affecting the build output

- Software on your computer unrelated to the build process
- Date, time
- Language and regional settings
- CPU speed, number of cores, load of the build machine
- Hostname, user name, build path


### Factors that are usually acceptable to declare as constants

- Toolchain (compiler, ...)
- Dependencies listed in your project
+44 −0
---
title: Reproducibility Quickstart Guide
layout: docs
permalink: /docs/getting-started/
---

This is a brief guide to help you get started writing software that builds [reproducibly](/docs/definition/).

The easiest check that you can perform, without installing any additional software tooling, is to build your software twice and to compare the build output files.

**Tip**: A common approach is to [compare file checksums](https://reproducible-builds.org/docs/checksums/) rather than the artifacts themselves, but using diff tools or the `cmp` command is also a valid alternative.
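
The checksum comparison could be scripted along the following lines. This is a minimal sketch, not an official tool: the function names and the directory names `build-1` and `build-2` are assumptions, and it presumes that each build's output has been copied to its own directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_tree(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def differing_files(first: Path, second: Path) -> list[str]:
    """List relative paths that are missing from one tree or differ in content."""
    a, b = checksum_tree(first), checksum_tree(second)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `differing_files(Path("build-1"), Path("build-2"))` returns an empty list when every output file matches; any names it does return are candidates for closer inspection.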

**Note**: Software builds that involve [cryptographic code signing](https://en.wikipedia.org/wiki/Code_signing) may complicate basic file-to-file comparisons, because some code signing techniques intentionally introduce randomness. To learn how to deal with those situations, refer to the [embedded signatures](https://reproducible-builds.org/docs/embedded-signatures/) documentation.

If the results differ, then you have found a reproducibility bug either in your software or in your toolchain, and can proceed directly to the [troubleshooting](/docs/reproducibility-troubleshooting/) guide.

If the output is identical, then you should add more variance to the build environment to examine less-obvious factors that might influence the output:

```
┌─────────────────────────────────────────────────┐
│   Define what output needs to be reproducible   │
└──────────────────────┬──────────────────────────┘

      ┌────────────────▼──────────────────┐
      │        Build your project         │
      └────────────────┬──────────────────┘

      ┌────────────────▼──────────────────┐
  ┌──►│         Build it again            │
  │   └────────────────┬──────────────────┘
  │                    │
  │   ┌────────────────▼──────────────────┐   No     ┌───────────────────────┐
  │   │      Is the output identical?     ├─────────►│ GOTO: Troubleshooting │
  │   └────────────────┬──────────────────┘          └───────────────────────┘
  │                    │  Yes
  │         ┌──────────▼───────────┐
  └─────────│ GOTO: Add variations │
            └──────────────────────┘
```

Destinations:

  * [`Troubleshooting`](/docs/reproducibility-troubleshooting/)
  * [`Add variations`](/docs/adding-build-variance/)