Alec the Geek

or “My big fat geek’s blogging”

The myth of the reproducible build

I have just finished reading Martin Fowler’s excellent paper on Continuous Integration (CI). It is well worth a read, and if you are not using CI then I urge you to spend time implementing it. However, there is one idea that I did want to pick up on:

In general you should store in source control everything you need to build anything, but nothing that you actually build. Some people do keep the build products in source control, but I consider that to be a smell - an indication of a deeper problem, usually an inability to reliably recreate builds.

I think that in the context of CI, or when using most build tools, this approach is quite correct. We have limited control over the build environment and the products it creates, so it is better to start afresh each time. But there is a problem: this is not in fact a reproducible build; it is a reproducible build process, which is not at all the same thing.

In my view a reproducible build is an activity that, when given a specific set of input artifacts (all at a given revision), will create output that is bit for bit identical to that of the previous build. In practice this is hard to achieve for two reasons:

  1. We usually have poor control or knowledge of the environment on which the build is performed. For example, exactly which 3rd party libraries are used? Are we certain that no old copies of artifacts are being picked up by the build?
  2. Things such as date and time differences can modify the built files in trivial ways.
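To make the second point concrete, here is a minimal sketch of how a timestamp alone can break bit for bit identity. It assumes the build output is a zip archive; `build_archive` and the file names are hypothetical, chosen only for illustration.

```python
import io
import zipfile


def build_archive(files, timestamp):
    """Pack files (name -> bytes) into an in-memory zip, stamping every entry with timestamp."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):  # fixed order, so entry ordering cannot vary between runs
            entry = zipfile.ZipInfo(name, date_time=timestamp)
            archive.writestr(entry, files[name])
    return buffer.getvalue()


# Hypothetical inputs: identical for every build below.
sources = {"app.txt": b"hello", "lib.txt": b"world"}

# Same inputs, built one minute apart: only the embedded timestamps differ.
morning = build_archive(sources, (2006, 11, 13, 9, 0, 0))
later = build_archive(sources, (2006, 11, 13, 9, 1, 0))
same_when_clock_moves = morning == later

# Same inputs with a pinned timestamp: the archives match byte for byte.
pinned = (1980, 1, 1, 0, 0, 0)
same_when_pinned = build_archive(sources, pinned) == build_archive(sources, pinned)

print(same_when_clock_moves, same_when_pinned)
```

Pinning the timestamp is one possible way around the problem; it does nothing for the first point, the uncontrolled environment.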

Now insisting on exact bit for bit reproduction may seem a little harsh, but it does allow the use of checksum hashes to verify the integrity of our different environments. For some people this is absolutely critical, as production environments are audited. For the rest of us it is also important: for example, we can ensure that the correct runtime environment is being used in test to reproduce a production problem.
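As a rough sketch of that kind of check, the snippet below hashes every file in two directory trees and lists the paths that do not match. The function names, the choice of SHA-256 and the two temporary directories standing in for "production" and "test" are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum_tree(root):
    """Return {relative path: SHA-256 hex digest} for every file under root."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digests[path.relative_to(root).as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differences(expected, actual):
    """List the paths that are missing, unexpected, or changed between two checksum maps."""
    return sorted(
        name
        for name in expected.keys() | actual.keys()
        if expected.get(name) != actual.get(name)
    )


# Hypothetical environments: test carries one stale file that production lacks.
with tempfile.TemporaryDirectory() as prod, tempfile.TemporaryDirectory() as test:
    Path(prod, "app.bin").write_bytes(b"release 1.0")
    Path(test, "app.bin").write_bytes(b"release 1.0")
    Path(test, "old.bin").write_bytes(b"stale artifact")
    drift = differences(checksum_tree(prod), checksum_tree(test))

print(drift)
```

A comparison like this is only meaningful if the deployed files are the very same bits that were built, which is the argument for exact reproduction.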

A way to achieve this is to preserve the outputs from our builds in our version control system. Then instead of ‘reproducing the build’ we release our build artifacts as often as required.

However, there are two caveats to this: we need to tag the created output files to show

  1. Which source files (and versions) were used as input. This includes files such as build scripts.
  2. Under which configuration the files were built (this should relate back to the software tools, libraries etc. used). Note that this means all system build environments should be subject to rigorous change control.
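One possible shape for such a tag is sketched below: a small record stored alongside the outputs that names the input revisions, the build configuration and a checksum for each built file. The field names, revision numbers and tool names are invented for illustration and are not taken from any particular version control system.

```python
import hashlib
import json


def make_build_tag(source_revisions, environment, artifacts):
    """Describe one build: what went in, how it was built, and what came out."""
    return {
        "sources": dict(sorted(source_revisions.items())),
        "environment": dict(sorted(environment.items())),
        "artifacts": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(artifacts.items())
        },
    }


# Hypothetical values throughout; note that the build script is listed as an input too.
tag = make_build_tag(
    source_revisions={"src/main.c": "r1041", "build.xml": "r987"},
    environment={"compiler": "example-compiler 1.0", "libexample": "2.3"},
    artifacts={"app.bin": b"built output bytes"},
)

print(json.dumps(tag, indent=2, sort_keys=True))
```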

Now that we have this information we can also clearly document how our software was built and what file revisions were used to build it. This information is vital, even if the built outputs are not saved in our version control repository.

A minor side effect is that we can now re-use built files in later builds, provided the tags are correct. There are still some systems where this potential saving is a significant benefit, as complete builds can take days.
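A minimal sketch of that re-use, assuming the tag can be reduced to a single key derived from the input revisions and the build configuration. `ArtifactStore` is a hypothetical in-memory stand-in for the repository of built outputs, and the revisions and toolchain name are made up.

```python
import hashlib
import json


def input_key(source_revisions, environment):
    """Digest of everything that went into a build; equal keys mean equal inputs."""
    payload = json.dumps(
        {"sources": source_revisions, "environment": environment}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ArtifactStore:
    """In-memory stand-in for the version-controlled store of built outputs."""

    def __init__(self):
        self._artifacts = {}
        self.builds_run = 0

    def get_or_build(self, source_revisions, environment, build):
        """Return the stored output for these inputs, running build() only if none exists."""
        key = input_key(source_revisions, environment)
        if key not in self._artifacts:
            self._artifacts[key] = build()
            self.builds_run += 1
        return self._artifacts[key]


store = ArtifactStore()
env = {"toolchain": "example-1.0"}  # hypothetical configuration
first = store.get_or_build({"lib.c": "r10"}, env, lambda: b"lib built from r10")
again = store.get_or_build({"lib.c": "r10"}, env, lambda: b"lib built from r10")
changed = store.get_or_build({"lib.c": "r11"}, env, lambda: b"lib built from r11")

print(store.builds_run)
```

The second request is served from the store because its inputs match the first; only the changed revision triggers another build. This is safe only to the extent that the tags really do capture every input.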

powered by performancing firefox

13 November 2006 Posted by Alec | Application Lifecycle Management, Software Configuration Management, Software Development | | No Comments

What is Process?


zx12bob asks what is process and who needs it. The answer to this question is easy — the details and implementation are of course complex, but at the highest level it comes down to three things.

  • We must do things cheaper
  • We must do things faster
  • We must do things better

What does that mean?

Cheaper
  • Use fewer people
  • Perform less rework (rework costs money)
Faster
  • Offer useful business services to users as quickly as possible
Better
  • Improve the user experience with better design and fewer bugs

The way to achieve this is to have everyone consistently use the best process for the project.

So now the complex questions:

  • What is the best process?
  • How do we use it consistently?


13 November 2006 Posted by Alec | Application Lifecycle Management | | 1 Comment

The dangers of scientific ignorance

New Scientist has an opinion piece about the rise of home schooling in the US, its approach to scientific truth and its long term goals. It makes chilling reading to realise that the world’s biggest military and economic power could end up in the dark ages, unable to understand how to survive in the third millennium.

Let’s hope they don’t drag the rest of us with them…


13 November 2006 Posted by Alec | Personal Opinion | | 1 Comment