Skip to content

Conversation

ryanwaite
Copy link

Summary

This PR adds informative header comments to all generated Dockerfiles to improve their self-documentation and make them easier to understand for maintainers, security auditors, and users.

Changes

  • Added header comment generation to both Alpine and Debian Dockerfile templates
  • Headers include package name, ecosystem, version, artifact, and purpose description
  • Updated all test cases to expect the new header format
  • Feature works across all supported ecosystems (NPM, PyPI, Debian, CratesIO, Maven)

Example Output

Before:

#syntax=docker/dockerfile:1.10
FROM docker.io/library/alpine:3.19
RUN <<'EOF'
# ... build commands

After:

# OSS-Rebuild Dockerfile
# Package: express (npm)
# Version: 4.18.2
# Artifact: express-4.18.2.tgz
# This Dockerfile rebuilds the above package from source for supply chain verification
FROM docker.io/library/alpine:3.19
RUN <<'EOF'
# ... build commands

Benefits

  • Should be easy for anyone to immediatley understand what package is being rebuilt
  • From an auditing perspective, it's clear what version, artifact, and package we intended to rebuild
  • Helps with maintenance and debugging of the generated Dockerfiles

Testing

  • All existing test pass
  • Updated test cases to handle header generation (this was previously assumed to be empty)
  • Manually verified for PyPI and NPM.
  • Build succeeded without errors.

@msuozzo
Copy link
Member

msuozzo commented Jul 31, 2025

Overall: I like the self-describing nature of this embedded Dockerfile content and I like the way it's done here.

I suppose I'd first like to understand the context for the change, if any. Do you have a particular circumstance that makes this necessary?

Regardless, thanks for this work!

@ryanwaite
Copy link
Author

Overall: I like the self-describing nature of this embedded Dockerfile content and I like the way it's done here.

I suppose I'd first like to understand the context for the change, if any. Do you have a particular circumstance that makes this necessary?

Regardless, thanks for this work!

Thanks for your note. I got a demo of oss-rebuild and I've been thinking about how we might be able to use the project in production environments in a place like Microsoft. I thought about cases where we might be building several versions of a package from source. This could be because teams consuming those packages have pinned to particular versions, etc.

There might be a case where I'm checking these Dockerfiles into a repository and anything that could help them be more self-describing would be helpful, especially because it's hard to read the Dockerfile and intuit how a particular commit hash relates to a specific version of a package. This change was to make it more explicit and hopefully more useful in production environments.

Does that help explain the use case?

@wbxyz
Copy link
Member

wbxyz commented Aug 5, 2025

Agreed, thanks for the contribution! I'm also in favor of self-documenting artifacts.

textwrap.Dedent(`
#syntax=docker/dockerfile:1.10
# OSS-Rebuild Dockerfile
# Package: {{.Target.Package}} ({{.Target.Ecosystem}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the ecosystem be on its own line? Then this description would more clearly match the rebuild.Target type which is the common descriptor across the codebase.

{{- if .Target.Artifact}}
# Artifact: {{.Target.Artifact}}
{{- end}}
# This Dockerfile rebuilds the above package from source for supply chain verification
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this description important to include in each dockerfile?

Attempting to capture the project mission sufficiently in the dockerfile comments seems challenging. For example, some people might not be using rebuilds for supply chain verification. Also, just because we generate the dockerfile doesn't mean it works. It's possible the dockerfile will eventually fail to build, so claiming "this file rebuilds ..." might not be accurate.

Instead, WDYT of adding to the OSS-Rebuild line something like this:
"OSS-Rebuild generated build instructions"

And removing the "This Dockerfile line"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, I was putting my intent for how I'd use OSS-rebuild, but that's not everyone's intent. I like the suggestion and removed the "This Dockerfile" line and updated the top comment line as suggested.

- Add Target field to rebuildContainerArgs struct
- Update both debuildContainerTpl and alpineContainerTpl templates to include:
  * Package name and ecosystem
  * Version number
  * Artifact filename (if present)
  * Purpose description
- Update MakeDockerfile function to pass Target information to templates
- Update test cases with proper Target information and expected headers

The generated Dockerfiles now include clear documentation about what
package is being rebuilt, making them more self-documenting and easier
to understand for supply chain verification purposes.
@ryanwaite ryanwaite force-pushed the add-dockerfile-header-comments branch from 3ce8e99 to 4915955 Compare August 13, 2025 22:26
…rification these are generated build instructions.
@wbxyz
Copy link
Member

wbxyz commented Aug 14, 2025

This looks good to me, but I'd be interested to hear if @msuozzo has any thoughts.

@wbxyz wbxyz requested a review from msuozzo August 14, 2025 18:40
// TODO: Find a base image that has build-essentials installed, that would improve startup time significantly, and it would pin the build tools we're using.
textwrap.Dedent(`
#syntax=docker/dockerfile:1.10
# OSS-Rebuild generated build instructions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this indentation here looks like it differs from the surrounding lines. may be a tabs vs spaces issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's unfortunate. I've pushed a fix.

// TODO: Find a base image that has build-essentials installed, that would improve startup time significantly, and it would pin the build tools we're using.
textwrap.Dedent(`
#syntax=docker/dockerfile:1.10
# OSS-Rebuild generated build instructions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wbxyz mentioned the idea that we could use a PURL here which I thought had some nice properties. While a bit less human-readable, we'd get a bit more density and could probably fit the description comfortably into a single comment line.

Given that it'd be a bit more involved to implement, I'd be happy to merge this and to take on the change in a follow-on. What do you think @ryanwaite?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A PURL would have some nice properties. I think it'd be great if you took on that work given how involved it is. If you wanted any feedback, please let me know. I'd be happy to help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants