Zstd:chunked podman-side tests #25007
base: main
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: mtrmac. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing …
Force-pushed 16d79da to 6d5c1d2
Ephemeral COPR build failed. @containers/packit-build please check.
Force-pushed fb12098 to fb18284
Force-pushed 0864767 to 1942947
Force-pushed 56c3692 to 62891ef
test/e2e/pull_chunked_test.go (Outdated)

```go
	imgspecv1 "github.com/opencontainers/image-spec/specs-go/v1"
)

func pullChunkedTests() { // included in pull_test.go
```
All of this is rather tedious.

Plausibly we could pre-build all the test images and ship them in the CI images… but we still need to `podman system reset` for each test case, and the pulls must run from a real registry (not from `oci:` nor using `podman load`), so I'm not sure that would be any better.

It's quite likely there is some infrastructure that could shorten these tests; I'd welcome pointers.
@containers/podman-maintainers PTAL, this is the last part of the tracked Zstd blocker work.
Hard blocking, as this will cause a lot of flakes due to `system reset`.
test/e2e/pull_chunked_test.go (Outdated)

```go
	// Do each test with a clean slate: no layer metadata known, no blob info cache.
	// Annoyingly, we have to re-start and re-populate the registry as well.
	podmanTest.PodmanExitCleanly("system", "reset", "-f")
```
This is a hard blocker; we cannot call `system reset` in a parallel context. `system reset` modifies global shared state, see 054154c.

In particular, looking through the CI logs here, I see you have rerun flakes that are almost certainly caused by this.
Now marked as a serial test.
Alternatively, for the purposes of this test, it should be enough to remove all images+layers and the BlobInfoCache. Do you think it would make sense to do that manually?
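(For the serial marking itself, Ginkgo v2's `Serial` decorator is enough to opt a container out of parallel execution. A minimal sketch, assuming the suite's usual `podmanTest` helper; the spec text and body are illustrative, not this PR's exact code:)

```go
// Sketch: Serial guarantees no other spec runs concurrently, so the global
// `podman system reset` cannot race other tests.
var _ = Describe("Podman pull chunked images", Serial, func() {
	It("pulls from a clean state", func() {
		// Safe only because of the Serial decorator above.
		podmanTest.PodmanExitCleanly("system", "reset", "-f")
		// … the actual pull assertions …
	})
})
```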
Currently BlobInfoCache’s location can’t be changed by a public option to Podman, so the test would have to delete the global one, and possibly affect other tests.
… and even if I added a Podman option for the purpose of the test, the test also needs Skopeo, which does not have the option either :|
Various options to move forward:

- Stay with the serial test (and pay the cost, about 15 seconds or so of non-concurrent execution).
- Add the BlobInfoCache option to both Podman (possibly local-only, in a hidden option) and Skopeo (~necessarily public as a cross-project dependency), wait with tests until a Skopeo RPM ships.
- Teach c/image to do chunked pulls from `dir:`. Then we might not need a running registry at all (and, hypothetically, the test image files could be included in the CI image and not created at runtime), probably avoiding the need for running Skopeo at all — we would `podman pull dir:…`, and we would still need a Podman option.
- Some shenanigans with running as non-root with a custom home directory, to trigger the use of per-user BlobInfoCache. That would not test running as root, and, at a guess, introduces way too much unexpected complexity.
- Don't try to work from an empty state; go with whatever the BlobInfoCache knows from other operations or from previous cases of this test. (To an extent, that can be helped by using custom, very small, images, not the traditional ALPINE. But the BIC would still play a role.) I think this would be sufficient to confirm that the tests correctly reject invalid images. We could no longer test the fallback behavior of images with unknown information, or the like — various test cases would have to be dropped. Worse, the relative test timing could influence which exact code path is invoked, making the tests non-deterministic even if they did reveal a bug.
> Stay with the serial test (and pay the cost, about 15 seconds or so of non-concurrent execution)

If zstd:chunked and partial pulls are important, then we should test them and accept whatever time they take. It certainly sounds like a very important feature, so I'd rather have it tested and not blocked because 15s was too long for us.

> Add the BlobInfoCache option to both Podman (possibly local-only, in a hidden option) and Skopeo (~necessarily public as a cross-project dependency), wait with tests until a Skopeo RPM ships

Why is this a global, non-configurable location to begin with? Should it not be part of the storage location instead? (I mean, I know it is for the default, because they are the same, but if other processes use a different storage, why should they still default to a global path?)

> Teach c/image to do chunked pulls from `dir:`. Then we might not need a running registry at all (and, hypothetically, the test image files could be included in the CI image and not created at runtime), probably avoiding the need for running Skopeo at all — we would `podman pull dir:…`, and we would still need a Podman option.

Well, users still pull from real registries, so not testing a real registry seems like not an option? I mean, sure, we could maybe get rid of some test cases, but ultimately it seems better to use the registry, as this is what most people would interact with.

> Some shenanigans with running as non-root with a custom home directory, to trigger the use of per-user BlobInfoCache. That would not test running as root, and, at a guess, introduces way too much unexpected complexity.

Yes, let's not go there; we do that for machine tests, but it is ugly and I would not recommend it.

> Don't try to work from an empty state; go with whatever the BlobInfoCache knows from other operations or from previous cases of this test. (To an extent, that can be helped by using custom, very small, images, not the traditional ALPINE. But the BIC would still play a role.) I think this would be sufficient to confirm that the tests correctly reject invalid images. We could no longer test the fallback behavior of images with unknown information, or the like — various test cases would have to be dropped. Worse, the relative test timing could influence which exact code path is invoked, making the tests non-deterministic even if they did reveal a bug.

Non-deterministic tests sound horrible. I don't know what the BlobInfoCache does, but I would guess in reality the code must work with an empty cache AND a fully/partially populated cache, possibly concurrently modified?

Only testing the empty cache case would not reflect reality either, I assume, so do we test both?
> Teach c/image to do chunked pulls from `dir:` …

> Well, users still pull from real registries, so not testing a real registry seems like not an option?

(There is at least `test/e2e/pull_test.go`: `@test "push and pull zstd chunked image"` to cover that.)
> Non-deterministic tests sound horrible. I don't know what the BlobInfoCache does, but I would guess in reality the code must work with an empty cache AND a fully/partially populated cache, possibly concurrently modified?

The code is structured so that:

- If it knows the DiffID (uncompressed layer digest), and the config contains a DiffID, they must match exactly (see the sketch after this list).
- (We always compute the DiffID of fully-pulled layers, as we decompress them.)
- During a chunked pull, we always compute the DiffID by default. (Users who don't care about the security impact and value performance way too much can opt out.)
- Together, the above resolve the signing ambiguity, unless the users opt out.
- For purposes of layer reuse without pulling anything, various information known with certainty about layers is stored in BlobInfoCache and in the c/storage store. So, even if the users opt out, we might know the DiffID. And in that case we do enforce the consistency, because we don't have any reason not to.
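(To make the DiffID rule concrete, a hedged sketch, assuming a gzip-compressed layer; the package and function names are illustrative, not the actual c/image code:)

```go
package layercheck

import (
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"io"
)

// computeDiffID digests the *uncompressed* layer stream; that digest is what
// the image config's rootfs.diff_ids entries refer to.
func computeDiffID(compressedLayer io.Reader) (string, error) {
	gz, err := gzip.NewReader(compressedLayer)
	if err != nil {
		return "", err
	}
	defer gz.Close()
	h := sha256.New()
	if _, err := io.Copy(h, gz); err != nil {
		return "", err
	}
	return fmt.Sprintf("sha256:%x", h.Sum(nil)), nil
}

// checkDiffID enforces the consistency rule described above: if both values
// are known, they must match exactly.
func checkDiffID(computed, fromConfig string) error {
	if fromConfig != "" && computed != fromConfig {
		return fmt.Errorf("DiffID mismatch: computed %s, config has %s", computed, fromConfig)
	}
	return nil
}
```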
> Only testing the empty cache case would not reflect reality either, I assume, so do we test both?
Each case:

- Resets storage.
- `fresh`: Tests pull of the test image from this empty state (no information should be recorded anywhere).
- Triggers a pull of `chunkedNormal`, which can record the full set of metadata for the relevant layer (if something was not recorded during `fresh`, which might have refused to complete the pull).
- `reuse`: Tests pull with all metadata available.
This should cover the “nothing known” and “everything known” extremes. I wouldn’t claim that it covers every single code path the metadata can travel — that can be determined for certain only with code coverage (branch coverage?) instrumentation — but I think it’s enough for the image ID / signing ambiguity logic.
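(In other words, the per-case flow looks roughly like this; `testCases`, `pullExpecting`, `chunkedNormal`, and `expectOK` are hypothetical names standing in for the discussion above, not this PR's exact API:)

```go
for _, c := range testCases {
	// Clean slate: no layer metadata known, no blob info cache.
	podmanTest.PodmanExitCleanly("system", "reset", "-f")
	pullExpecting(c.img, c.fresh)          // "fresh": nothing known yet
	pullExpecting(chunkedNormal, expectOK) // records full metadata for the layer
	pullExpecting(c.img, c.reuse)          // "reuse": all metadata available
}
```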
test/e2e/pull_chunked_test.go (Outdated)

```go
	// The actual test
	for _, c := range []struct {
		img *pullChunkedTestImage
```
This is not how to do table-driven tests with Ginkgo. In particular, all this output ends up under one It() block, and it exits on the first error without running the following test cases, which may hide whether this is a single issue or multiple ones.

Just look at a CI log for podman pull chunked images:
https://api.cirrus-ci.com/v1/artifact/task/6056101424136192/html/int-podman-fedora-41-root-host-sqlite.log.html

The output is basically unreadable; the println calls do not really help. Each of these must be their own test block for the log formatter to properly highlight all the cases.

Ginkgo does support table-like tests; see DescribeTable() in the quadlet test as an example.
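(For illustration, a minimal DescribeTable sketch in the Ginkgo v2 style; the image names and expected exit codes are placeholders, not this PR's actual cases:)

```go
var _ = DescribeTable("podman pull chunked images",
	func(image string, expectedExit int) {
		session := podmanTest.Podman([]string{"pull", image})
		session.WaitWithDefaultTimeout()
		Expect(session).Should(Exit(expectedExit))
	},
	// Each Entry becomes its own spec, so one failure does not hide the rest.
	Entry("normal chunked image", "localhost:5013/chunked-normal", 0),
	Entry("mismatching DiffID is rejected", "localhost:5013/chunked-mismatch", 125),
)
```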
The difficulty with using `DescribeTable` is that I need to create the various `pullChunkedTestImage` on-disk images only once, to decrease test runtime. AFAICS the two options are:

- Use `BeforeEach` within a `Describe`/`It` container; the setup will run repeatedly, every time, for each table entry.
- Add the code to generate the images to the top-level `SynchronizedBeforeSuite` (see the sketch after this list). Technically possible, but hard to maintain/scale if more tests added their custom setup to that top level.
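(The shape of the second option, as a sketch; `buildPullChunkedTestImages` and `pullChunkedImageDir` are hypothetical names:)

```go
// SynchronizedBeforeSuite: the first function runs exactly once, in parallel
// process #1; the returned bytes are handed to the second function, which
// runs in every parallel process.
var _ = SynchronizedBeforeSuite(func() []byte {
	dir := buildPullChunkedTestImages() // expensive setup, runs once
	return []byte(dir)
}, func(data []byte) {
	pullChunkedImageDir = string(data) // each process learns the shared path
})
```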
> The output is basically unreadable

I agree with that, but I think that's primarily because it includes the voluminous `--log-level=debug` output. Would this be more acceptable if that output were silenced and only printed on test failure?
You can set up a new Describe() and have the It() blocks in there. If the BeforeEach() is outside of the Describe, it should only be run once AFAICT; compare vs JustBeforeEach().
Unless you need --log-level debug for the test, I would prefer not to use it.

Consider that our logs are already quite large; it takes a fair bit to download and render them in my browser. So there is some value in avoiding unnecessary output.
> You can set up a new Describe() and have the It() blocks in there. If the BeforeEach() is outside of the Describe, it should only be run once AFAICT; compare vs JustBeforeEach().

That's not how I read https://onsi.github.io/ginkgo/#mental-model-how-ginkgo-traverses-the-spec-hierarchy, but I'll set up a test and confirm.
> Unless you need --log-level debug for the test, I would prefer not to use it.

The tests currently do rely on the debug level, e.g. to confirm that we don't do a partial pull and fall back if we can't resolve the ambiguities.
Ok, I took another look, and yeah, the BeforeEach() thing does not work like I thought in that context.

Also, more importantly, even if there were a way to do it, each parallel runner is its own process. That means the setup cannot be run once, and as the order of It() would be random, it would not be workable.

As far as logging each test case, I would say it should use By() instead of manually printing; I guess the end result is the same, but this is more Ginkgo style.
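(A sketch of what that would look like; `testImages` and `img.ref` are illustrative names, not this PR's code:)

```go
It("pulls chunked images", func() {
	for _, img := range testImages {
		// By() records a step in the spec report instead of a bare print.
		By("Testing " + img.ref)
		// … pull and assert …
	}
})
```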
Regarding not printing the log-level debug output: that is an option we can choose, but if such output is needed in the test, then it SHOULD be printed, IMO. When a test fails we want to see such output, hard errors or flakes. (Yes, we can make it so the output is dumped on failures, but I don't feel so strongly about special-casing this one test.)

My main concern was that individual test cases are hard to follow/recognize; we just see this between a lot of lines:

```
===
===Testing docker://localhost:5013/chunked-mismatch
===
```

If we cannot fix that with the By() ginkgo lines, then we should consider teaching logformatter to color them. Or maybe add more padding around it; adding an extra newline above/below would make it easier to spot.

I know this is very pedantic, but given almost everything flakes at one point, I would like to quickly spot the broken test case.
Updated:

- Now using `By()`. I don't think it's particularly more readable, but it is the standard and potentially visible to tools.
- The `--log-level=debug` output is only shown if a test fails, and only for that failing command. This is a very visible change — most noise in the test now comes from `WaitContainerReady` (re)starting the registry.

See https://api.cirrus-ci.com/v1/artifact/task/4684706555363328/html/int-podman-fedora-41-root-host-sqlite.log.html for an example failure.
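(One mechanism for the only-on-failure behavior, as a sketch: Ginkgo buffers anything written to GinkgoWriter and emits it only for failing specs, unless running with -v. The `session` variable is assumed from the e2e framework:)

```go
// Buffered by Ginkgo; shown only if this spec fails (or with -v).
fmt.Fprintln(GinkgoWriter, session.OutputToString())
```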
Force-pushed 4f09725 to 624bf91
... because (podman system reset) will delete all of it, interfering with the test storing other data in the directory. Signed-off-by: Miloslav Trmač <[email protected]>
Force-pushed 1355296 to c2c2b6f
Force-pushed c2c2b6f to 37a5534
Signed-off-by: Miloslav Trmač <[email protected]>
Force-pushed 37a5534 to 00db254
```go
	if expectation.success != nil {
		Expect(session).Should(Exit(0))
		for _, s := range expectation.success {
			Expect(regexp.MatchString(".*"+s+".*", log)).To(BeTrue(), s)
```
Note these Expect(false).To(BeTrue()) lead to poor error messages; you should use something like Expect(string).To(MatchRegexp("regex")) instead.

Sorry for not noticing this earlier.
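(To illustrate the difference, using the same `s` and `log` variables as the snippet above; a sketch, not the final code:)

```go
// Current pattern: the failure message is just "Expected false to be true".
matched, err := regexp.MatchString(".*"+s+".*", log)
Expect(err).ToNot(HaveOccurred())
Expect(matched).To(BeTrue(), s)

// Idiomatic matcher: the failure message shows the actual text and the
// regexp that did not match.
Expect(log).To(MatchRegexp(s))
```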
Yes; using the idiomatic way would trigger posting the whole debug log of that Podman command into that error structure. “Expected … $theWholeDebugLog … to match $regexp”. E.g. the stack trace would then be rather far from the failure header.
```go
		for _, s := range expectation.failure {
			Expect(regexp.MatchString(".*"+s+".*", log)).To(BeTrue(), s)
		}
```
Same here.
This adds Podman tests of chunked pulls.
Does this PR introduce a user-facing change?