Commit 80e4e5b9 authored by Holger Levsen
parent af639729
+1 −1
@@ -52,7 +52,7 @@ Day 1 - Tuesday, November 1st
* 12.30 Lunch
* 13.30 Collaborative Working Sessions
  * Documentation FIXME https://pad.riseup.net/p/rbsummmit2022-documentation-keep
-  * Tools FIXME https://pad.riseup.net/p/rbsummmit2022-tools-keep
+  * [Tools]({{ "/events/venice2022/tools/" | relative_url }})
  * [Packaging]({{ "/events/venice2022/packaging/" | relative_url }})
* 15.00 Break
* 15.30 Closing Circle
+1 −1
@@ -2,7 +2,7 @@
layout: event_detail
title: Collaborative Working Sessions - Documentation + Tooling
event: venice2022
-order: 40
+order: 50
permalink: /events/venice2022/documentation+tooling/
---

+1 −1
@@ -2,7 +2,7 @@
layout: event_detail
title: Collaborative Working Sessions - Metrics
event: venice2022
-order: 50
+order: 60
permalink: /events/venice2022/metrics/
---

+1 −1
@@ -2,7 +2,7 @@
layout: event_detail
title: Collaborative Working Sessions - Packaging
event: venice2022
-order: 60
+order: 40
permalink: /events/venice2022/packaging/
---

+109 −0
---
layout: event_detail
title: Collaborative Working Sessions - Tools
event: venice2022
order: 30
permalink: /events/venice2022/tools/
---

Reproducible Builds Summit 2022

Tools conversation
------------------

 - request for diffoscope to support the "nar" format.
 - there are congratulations and thanks for diffoscope -- for some people here it is the "main tool"!  could not live without it!
 - ... also from the same person: "are there improvements you would like to see?"  "yeah."
 - diffoscope can be quite verbose in its reporting.  it can become difficult to see the high-level picture in large outputs.
   - perhaps it would be nice to have some human-readable explanation of some of the kinds of changes which might be recognizable?
     - (admitted that it is not clear how much this is possible)
     - for example can we guess if this pattern of change indicates that certain compiler flags have been used?
 - would love to have: a source code mirroring tool!
   - example user story: i have an openWRT from 15 years ago I want to rebuild... and though I have some instructions... the source URLs have maybe moved or disappeared, and this stops me.
     - this is one example -- openWRT is not special in this -- source code is source code, we all need this.
   - is this the goal of the Software Heritage Foundation (SWH)?
     - Possibly!
     - Do we trust them?  should there be more decentralization of this?
       - want our own instance!
   - anecdote: we know some projects stop hosting their own source releases if they think they have a security vuln in it.
     - entire table agrees: this is very very bad behavior.  archives should be kept!!
     - "but surely it's still in their VCS"
       - unclear!
       - some projects produce "source" tarballs... but these may have had e.g. autoconf run on them... and maybe the result ISN'T in the VCS.  uh roh.
   - anecdote: git archive behavior changes.
   - discussed that we want this to work such that you ask the service for the source snapshot identified by a hash... this makes it easier to mirror, and does not require much trust.
   - for example deb source files solve this... for debian.
     - this also isn't necessarily archival and total.  doesn't necessarily contain older releases!
     - emphasized that this solution is for debian and doesn't solve it for others.
   - nixOS and guix "substitutes servers" may also contain this kind of snapshotted content.
   - part of the problem is mirroring the content... part of the problem is seeing the names used to refer to stuff, and then snapshotting that as well.
   - this is kinda like transparency logs!  like Certificate Transparency!
     - want this like e.g. maven-central-transparency-logs !
   - tor has some scripts that look at maven-central and the pom files and hashes them, and they do redistribute the hashes.
     - they download the thing the first time, and recall the hash...
       - the future build scripts will _check_ the hash.
         - (somewhat complex: _either_ a hash is checked, _or_ a sig is checked.)
           - (note that the signature files -- the ".asc" files -- are not distributed right now.  maybe they should be.)
       - the hash is stored in the tor git repo after first being seen.
       - the future build scripts are still downloading.
     - "can I have that?!" "we do not do the cleverest thing.  we would like to improve that."
       - "very simple txt files.  only the things we care about."
   - arch linux stores hashes in their build scripts like this sometimes as well, we think!
   - bazel has some fetching functions that take a hash as an _optional_ parameter next to a URL.
     - it's a shame it's optional
     - the UX of this is... you have to find a hash somewhere and copy-paste it.
       - where?
       - this is manual human work!
   - in cases where git is used, sometimes simply using the git source hash seems like a good option.
     - tor notes that they use these often, when possible.
   - it seems many people have some scripts for doing this, but no one is proud of them enough to share :)
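
The "download once, remember the hash, check it on later builds" pattern described for tor above could be sketched roughly as follows. This is not tor's actual script: `fetch_pinned`, the JSON pins file, and the injected `download` callable are hypothetical names chosen for illustration, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def fetch_pinned(url: str, pins_file: Path, download: Callable[[str], bytes]) -> bytes:
    """Fetch `url` and check it against a pinned SHA-256.

    On first sight of a URL the hash is recorded (trust on first use);
    on every later fetch the content must match the recorded hash.
    `download` is injected so any fetcher (or a mirror) can be plugged in.
    """
    pins = json.loads(pins_file.read_text()) if pins_file.exists() else {}
    data = download(url)
    digest = hashlib.sha256(data).hexdigest()

    expected = pins.get(url)
    if expected is None:
        # First time we see this source: remember its hash.
        pins[url] = digest
        pins_file.write_text(json.dumps(pins, indent=2, sort_keys=True) + "\n")
    elif expected != digest:
        raise ValueError(f"hash mismatch for {url}: expected {expected}, got {digest}")
    return data
```

Keeping the pins file in version control, as the notes say tor does with its hashes, means a changed upstream artifact shows up as a build failure rather than a silent substitution. Because the check only depends on the hash, the bytes can come from any mirror.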
 - possible that there are two parts of the mirroring problem...
   - to have and to mirror the content blobs is one task
   - to see the names that are used for this content is also a problem.
 - what if there was a foundation that maintained version names and made sure people don't re-release things?
 - what about tools for rebuilding?
   - some people are not sure it really seems possible to share at this level.  many opinions.
   - we notice reprobuilder tools for debian were topics in previous years but did not really show up on the topic board at all today...
 - could we have some linters for things like `-Werror=date-time` ?
   - something like this example above already exists we think, but we would like more like this!
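
As a rough idea of what such a linter could look like outside the compiler, here is a minimal sketch that flags the C preprocessor macros `__DATE__`, `__TIME__` and `__TIMESTAMP__` (the ones GCC's `-Wdate-time` warns about). The function name `find_time_macros` is hypothetical, and the scan is deliberately naive: it does not skip comments or string literals.

```python
import re
from typing import List, Tuple

# Macros whose expansion depends on when the build ran.
TIME_MACROS = re.compile(r"\b__(?:DATE|TIME|TIMESTAMP)__\b")


def find_time_macros(source: str) -> List[Tuple[int, str]]:
    """Return (line number, macro) pairs for each time-dependent macro found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in TIME_MACROS.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

For example, `find_time_macros(open("main.c").read())` on a file of your own would list every line that embeds a build date or time.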
 - reproducible builds on windows seem to have less representation at this table
   - some people here have done it!
   - not much documentation is shared.  potential room for improvement!
     - "black magic", "sparse information on websites"
   - e.g. there are some checksum fields in the PE executable format on windows that need to be normalized, and this is only described in some blog posts.
   - going to the reproducible builds website was of some help!
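
To make the PE point a little more concrete, here is a minimal sketch that zeroes the `CheckSum` field of the PE optional header so two binaries can be compared without it. The notes do not say which fields people actually normalized, so treat this as one assumed example; `zero_pe_checksum` is a hypothetical helper, and the result is meant for comparison only, not for shipping.

```python
import struct


def zero_pe_checksum(image: bytes) -> bytes:
    """Return a copy of a PE image with the optional-header CheckSum set to 0."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        raise ValueError("not a DOS/PE image")
    # e_lfanew at offset 0x3C points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Signature (4 bytes) + COFF file header (20 bytes) precede the optional
    # header; CheckSum sits at offset 64 in both PE32 and PE32+ layouts.
    checksum_offset = pe_offset + 4 + 20 + 64
    if checksum_offset + 4 > len(image):
        raise ValueError("image too short to contain a CheckSum field")
    normalized = bytearray(image)
    struct.pack_into("<I", normalized, checksum_offset, 0)
    return bytes(normalized)
```

After normalizing both builds this way, the remaining differences (if any) can be handed to a tool like diffoscope.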
 - discussion about difference between controlling supply of content to build vs controlling and describing build environment
   - "do you have buildinfo files"
 - it is interesting that gcc may be introducing new sections in the ELF header which describe the build environment.
   - this could be terrible?!  or it could make some parts of life easier.
     - depends on what information ends up being stored there, exactly!
   - if they embed library version info, we welcome that!
   - if they embed "debian vs redhat": "who cares"
   - if they embed kernel version: please do not!
   - if they embed timestamps: please do not!!!!!
 - "debuginfod" is a thing from a few years ago which puts debug info on a server/service
   - part of an evolution of systems for many years now which puts less debug info in executable binaries themselves.  (e.g. debug symbols began to be put in separate files, starting a few years ago, or at least it is possible to do so.)
 - reprobuilder / reprotest: have you heard of it?  what would you want from it?
   - some have not heard of it.  some have.
   - does it pay off?
     - it has some features to make it easy to use with debian, but it's supposed to be generic
     - you have an "exec this" option as well, which should be general.
       - argument that this isn't very useful, because by the time i've manifested a whole system to hand to that, i've already done a bigger piece of work than reprobuilder itself does.
   - it has a nice suite of things it will aggressively vary!
     - that's nice!
     - sometimes.  some of the things it can vary, some people do not care about.
       - example: hostname.  several people state they have no complaint to just hardcode a hostname, and don't care.
       - example: timestamps: these already vary quite naturally so _usually_ it's not the biggest need.
       - you can disable the variations you don't find interesting so you don't waste time on them, but arguably it is still complexity that someone has to maintain or know about without using it.
 - misc things
   - tor has ended up needing faketime for something on macOS
 - signing is a problem for reproducibility sometimes, especially when embedded in bundles.
   - the fdroid people have worked on some scripts for taking signatures out and putting them back in, in android packages!
   - sometimes a "published private key" is used.
 - it would be nice if we could convince more things to store _no_ time info instead of SOURCE_DATE_EPOCH
   - the spec does say already that you should only use SOURCE_DATE_EPOCH as a last resort... but.  well.
   - idea: should we perhaps amend the spec to say "if SOURCE_DATE_EPOCH=10000" (or some random number we select), "then that means please store no time at all".
     - same wishes as already, but nudge people more explicitly to support this.
     - would make it visible if a tool listens to SOURCE_DATE_EPOCH but shouldn't.
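
A minimal sketch of how a tool might honour the idea above. The sentinel value 10000 is only the number floated in the discussion and is not part of the SOURCE_DATE_EPOCH specification; `NO_TIMESTAMP_SENTINEL` and `build_timestamp` are hypothetical names.

```python
import os
import time
from typing import Mapping, Optional

# Hypothetical: the "please store no time at all" value proposed in the notes.
NO_TIMESTAMP_SENTINEL = 10000


def build_timestamp(environ: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the timestamp to embed, or None if no timestamp should be stored."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        # No override set: fall back to the (non-reproducible) current time.
        return int(time.time())
    value = int(raw)
    if value == NO_TIMESTAMP_SENTINEL:
        # Proposed behaviour: omit the timestamp field entirely.
        return None
    return value
```

A tool that still writes a date when `build_timestamp()` returns `None` would then stand out, which is the visibility benefit mentioned in the last point.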