feat(layerscanning): add optional FileRequirer to image Config #2170
Open
waldemar-kindler wants to merge 4 commits into
Materialize a regular file only when FileRequired returns true; dirs and symlinks are always kept and a nil requirer means require-all (unchanged default). Lets callers skip unpacking files no extractor needs, shrinking the content store (~29.9k -> ~6.5k files on a 398MB image). Decompression is unaffected, so this is a footprint win with a minor unpack-time gain.
Whiteouts are 0-byte regular files whose path encodes the deleted entry. Gating tar.TypeReg entries on FileRequirer skipped any whiteout whose de-whiteouted path was not required, dropping directory whiteouts and leaking deleted files back into the merged filesystem. Exempt whiteouts from the requirer check so layer deletion semantics are preserved. Add a regression test that filters with a path requirer while an upper layer deletes a directory via a whiteout, asserting the file stays gone.
A required path may be a symlink whose target regular file is not itself required. The single-pass requirer filter skipped the target, so reading the required path through the symlink dangled (e.g. /etc/os-release -> /usr/lib/os-release). Sweep the layers repeatedly until a fixpoint: a required symlink records its resolved target, and a later pass materializes the regular file it resolves to, following symlink chains. The sweep is idempotent and breaks early, so the default (require-all) path stays single-pass. Mirrors the multi-pass requiredTargets approach already used in image/unpack.
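The fixpoint sweep might look roughly like the sketch below. It is a simplified model under stated assumptions, not the PR's implementation: entries live in an in-memory slice rather than being re-read from layer tar streams, only the final path component is resolved (intermediate directory symlinks are ignored), and `entry`, `resolve`, `sweep`, and `maxPasses` are hypothetical names.

```go
package main

import (
	"fmt"
	"path"
)

// entry is a simplified image entry; link is empty for regular files.
type entry struct {
	name string
	link string
}

// resolve returns the path a symlink points at, relative to its own directory.
func resolve(name, link string) string {
	if path.IsAbs(link) {
		return path.Clean(link)
	}
	return path.Join(path.Dir(name), link)
}

// sweep repeats passes over the entries until no newly recorded symlink
// target is left unhandled, or maxPasses is reached. It returns the set
// of kept paths and the number of passes taken.
func sweep(entries []entry, required func(string) bool, maxPasses int) (map[string]bool, int) {
	kept := map[string]bool{}
	targets := map[string]bool{} // resolved targets of wanted symlinks
	done := map[string]bool{}    // wanted entries already handled
	passes := 0
	for passes < maxPasses {
		passes++
		var added []string
		for _, e := range entries {
			if e.link != "" {
				kept[e.name] = true // symlinks are always kept
			}
			if done[e.name] || !(required(e.name) || targets[e.name]) {
				continue
			}
			done[e.name] = true
			if e.link == "" {
				kept[e.name] = true // materialize the regular file
				continue
			}
			if t := resolve(e.name, e.link); !targets[t] {
				targets[t] = true
				added = append(added, t)
			}
		}
		// Another pass is needed only if a target recorded in this pass
		// was not already handled during it.
		pending := false
		for _, t := range added {
			if !done[t] {
				pending = true
			}
		}
		if !pending {
			break
		}
	}
	return kept, passes
}

func main() {
	entries := []entry{
		{name: "/usr/lib/os-release"},
		{name: "/usr/share/doc/README"},
		{name: "/etc/os-release", link: "../usr/lib/os-release"},
	}
	onlyOSRelease := func(p string) bool { return p == "/etc/os-release" }
	kept, passes := sweep(entries, onlyOSRelease, 4)
	fmt.Println("kept:", kept, "passes:", passes)
}
```

Because every pass skips entries that were already handled, rerunning the sweep is idempotent, and a require-all requirer resolves every existing target within the first pass, so the early break leaves the default path single-pass.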
The comment claimed decompression of layer streams is unaffected by filtering, but resolving required symlink targets sweeps the layers repeatedly, re-decompressing each stream up to MaxSymlinkDepth+1 times. Note the trade-off and that the default requirer settles in one pass.
Summary
Adds an optional `FileRequirer` to the layerscanning image `Config` so callers can avoid materializing files that no extractor needs. When set, a regular file is unpacked into the image content store only if `FileRequired` returns true; directories and symlinks are always kept, and a `nil` requirer means "require all", so the existing default behavior is unchanged.
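Impact: this is primarily a footprint win (with a minor unpack-time gain). On a 398MB test image the content store shrank from ~29.9k to ~6.5k files. Filtering does not reduce decompression work: with a filtering requirer, resolving required symlink targets sweeps the layers repeatedly and re-decompresses each stream up to MaxSymlinkDepth+1 times, while the default require-all requirer settles in one pass.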
Changes
- feat(layerscanning): add the optional `FileRequirer` field to `Config`. `DefaultConfig` and `validateConfig` default a nil requirer to `FileRequirerAll{}`, so the require-all path is preserved.
- fix(layerscanning): keep whiteouts when filtering with a requirer. Whiteouts are 0-byte regular files whose path encodes a deleted entry; gating `tar.TypeReg` entries on the requirer dropped whiteouts whose de-whiteouted path was not required, leaking deleted files back into the merged filesystem. Whiteouts are now exempt from the requirer check, preserving layer deletion semantics. Includes a regression test where an upper layer deletes a directory via a whiteout while a path requirer is active.
- fix(layerscanning): materialize symlink targets of required files. A required path may be a symlink whose target regular file is not itself required; the single-pass filter skipped the target, leaving the required path dangling (e.g. `/etc/os-release -> /usr/lib/os-release`). The layers are now swept to a fixpoint: a required symlink records its resolved target, and a later pass materializes the file it resolves to, following symlink chains. The sweep is idempotent and breaks early, so the default require-all path stays single-pass. This mirrors the multi-pass `requiredTargets` approach already used in `image/unpack`.

Testing
- `go test ./artifact/image/layerscanning/image/` passes.
- `requirer_test.go` covering: requirer gating of regular files, whiteout preservation under a requirer, and symlink-target materialization (including chains).
- No new dependencies.
- `gofmt`, `go vet`, and `golangci-lint` are clean on the changed files.