copy: add InstancePlatforms field for platform-based filtering#656
copy: add InstancePlatforms field for platform-based filtering#656aguidirh wants to merge 2 commits intocontainers:mainfrom
Conversation
Add the ability to select images by platform name instead of requiring digest hashes. This implements the functionality originally proposed in containers/image#1938. When copying specific images from a multi-architecture manifest list, users currently must specify exact digest hashes. This is cumbersome and error-prone, as users must manually look up digests and can easily confuse which digest corresponds to which platform. This commit adds a new InstancePlatforms field that allows users to specify platforms by human-readable names like "linux/amd64" or "linux/arm64". The field works alongside the existing Instances field, allowing users to combine both methods. The implementation includes: 1. New InstancePlatforms []imgspecv1.Platform field in Options 2. determineSpecificImages() function to resolve platforms to digests and combine with digest-based selection 3. Efficient Set-based filtering (O(1) lookup vs O(n) with slices) 4. Size() method added to internal/set for Set length queries 5. Table-driven tests covering all selection scenarios Based on original work by @nalind in containers/image#1938, adapted for the container-libs monorepo structure. Relates to containers#227 Signed-off-by: Alex Guidi <aguidi@redhat.com>
|
Packit jobs failed. @containers/packit-build please check. |
1 similar comment
|
Packit jobs failed. @containers/packit-build please check. |
mtrmac
left a comment
There was a problem hiding this comment.
Thanks!
An extremely brief look for now.
| ArchitectureChoice: platform.Architecture, | ||
| VariantChoice: platform.Variant, | ||
| } | ||
| instanceDigest, err := updatedList.ChooseInstanceByCompression(&platformContext, options.PreferGzipInstances) |
There was a problem hiding this comment.
Hum… we have multi-compression images (where there is a gzip instance and a zstd instance for the same platform). This would only copy one of the instances.
OTOH a trivial “does the instance match the required Platform value” check might copy too much, because a v1 variant requirement would match a v1,v2,v3 instances.
There was a problem hiding this comment.
Thank you for the feedback! You're right that using ChooseInstanceByCompression() only copies one instance per platform, potentially missing multi-compression variants.
Proposed Solution
Default behavior: Copy ALL compression variants for platforms specified in InstancePlatforms. This preserves all available data for selected platforms, maintains the original digest for each instance, and avoids implicit filtering.
For compression-specific selection: Introduce a wrapper struct that allows optional compression filtering:
type PlatformSelection struct {
Platform imgspecv1.Platform
Compressions []compression.Algorithm // nil or empty = copy all compressions
}
type Options struct {
// ... existing fields ...
InstancePlatforms []PlatformSelection
}Usage Examples
Copy all compressions (default, most common):
InstancePlatforms: []PlatformSelection{
{Platform: {OS: "linux", Architecture: "amd64"}}, // Compressions nil = all
}Filter to specific compressions:
InstancePlatforms: []PlatformSelection{
{
Platform: {OS: "linux", Architecture: "amd64"},
Compressions: []compression.Algorithm{compression.Gzip},
},
}Different compressions per platform:
InstancePlatforms: []PlatformSelection{
{
Platform: {OS: "linux", Architecture: "amd64"},
Compressions: []compression.Algorithm{compression.Gzip},
},
{
Platform: {OS: "linux", Architecture: "arm64"},
Compressions: nil, // all compressions
},
}Rationale
-
Better UX: Platform-based selection should be inclusive by default. Users shouldn't lose compression variants unless explicitly requested.
-
Avoids digest lookup: The whole point of
InstancePlatformsis convenience. While users could use theInstancesfield with exact digests for compression-specific control, this would require manually looking up digests for each compression variant—a poor user experience that defeats the purpose of platform-based selection. -
Clear semantics:
InstancePlatforms= broad selection (by platform)Instances= precise selection (by digest)Compressionsfield = optional filter
-
Future extensibility: Using a wrapper struct that we control (rather than directly exposing
imgspecv1.Platform) makes it easier to extend in the future. We can add new fields for additional filtering or options without being constrained by the external OCI spec types. -
Not a breaking change: Since
InstancePlatformsis new in this PR, we can get the design right before it becomes public API.
Does this approach address your concerns? I'm happy to implement it if you agree with the direction.
There was a problem hiding this comment.
Different compressions per platform:
I don’t know, do we need that? I guess oc-mirror would be the primary consumer of this kind of API.
I guess that "copy only $platforms" and "copy only $compressionVariants" are orthogonal (we don’t have the latter, and I’m not sure we need it), and eventually, they can be implemented by intersection. (At some point, we can just give up and let users choose variants manually by choosing outside of c/image and passing Instances.)
I’m somewhat more {interested in, worried about} variant matching. I think users would typically say “mirror the amd64 images” (or, fine, Linux amd64 images, whether adding Linux happens explicitly, in c/image, or in a caller), and not worry about variants too much. In that case I guess we’d want to include all variants, that’s fine.
But if the user says “amd64 v2”, do we copy v2 only, or v2 and v3? If the user says "amd64 v2" and there is no v2 but there is v3, do we copy v3 or fail?
There’s probably a clean mental model where this all behaves ~consistently and in a way that is fairly easy to document, I didn’t now take the time to figure that out. Or maybe it’s just inherently complex and there’s no way to simplify that much, I don’t know.
|
I've approved this and restarted a couple of failng tests. I'd be interested in seeing your reply to @mtrmac 's comment on the multi-compression images. Overall, a nice change, thanks! |
Address code review feedback from @mtrmac: - Use assert.ElementsMatch() instead of manual assertion loop - Remove unnecessary expectedSize field from test struct Signed-off-by: Alex Guidi <aguidi@redhat.com>
| // Convert Set to slice for comparison | ||
| actualDigests := []digest.Digest{} | ||
| for d := range specificImages.All() { | ||
| actualDigests = append(actualDigests, d) | ||
| } |
Add platform-based filtering for copying specific images
This PR adds the ability to select images by platform name (e.g.,
linux/amd64) instead of requiring digest hashes, implementing the functionality proposed in containers/image#1938.Motivation
Relates to #227
When copying specific images from a multi-architecture manifest list, the current
CopySpecificImagesmode requires users to specify exact digest hashes in theInstancesfield. This is cumbersome because:This PR adds a new
InstancePlatformsfield that allows users to specify platforms by human-readable names.Changes
1. New
InstancePlatformsfield inimage/copyInstancePlatforms []imgspecv1.Platformfield toOptionsstruct{OS: "linux", Architecture: "amd64"}Instancesfield (both can be used simultaneously)2. Platform resolution in
determineSpecificImages()(image/copy/multiple.go)ChooseInstanceByCompression()to find best match for each platform3. Efficient Set-based filtering
slices.Contains()to Set-based lookupSize()method tointernal/set/set.go4. Comprehensive test coverage
TestDetermineSpecificImageswith table-driven testsUser Experience Improvement
Before (digest-based only):
After (platform-based):
Combined (both methods):
Testing
All existing tests pass. New tests verify:
Compatibility
Instancesfield continues to work unchangedCredits
This implementation is based on the original work by @nalind in containers/image#1938, adapted for the container-libs monorepo structure with the following changes: