In test I/O utility, restore the old stdin/stdout instead of the "true" I/O streams #5049
base: master
Conversation
Instead of restoring `sys.stdin` to `sys.__stdin__`, restore it to whatever it was before we installed our dummy I/O hooks. This is relevant in pytest, for example, which installs its *own* `sys.stdin`, which we were then clobbering. This was leading to the suppression of test failures observed in #5021 and addressed in #5027.
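The save-and-restore idea can be sketched like this (a simplified illustration of the pattern, not the actual beets `DummyIO` implementation; the attribute and method names here are assumptions):

```python
import sys


class DummyIn:
    """Illustrative fake stdin that always answers with a blank line."""

    def readline(self):
        return "\n"


class DummyIO:
    """Sketch of the fix: save whatever stream is active at install
    time and restore exactly that, rather than sys.__stdin__."""

    def __init__(self):
        self.stdin = DummyIn()
        self._saved_stdin = None

    def install(self):
        # Save the currently active stdin -- under pytest this is
        # pytest's capture object, not the interpreter's sys.__stdin__.
        self._saved_stdin = sys.stdin
        sys.stdin = self.stdin

    def restore(self):
        # Put back the saved stream; assigning sys.__stdin__ here would
        # clobber whatever another tool had installed before us.
        sys.stdin = self._saved_stdin
```

The same treatment would apply symmetrically to `sys.stdout`.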
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.
Awesome, yeah, will do once I'm available; thanks so much for looking into this! I'll rebase on top of your diff and validate.
@sampsyo all green when running:
============================= test session starts ==============================
platform darwin -- Python 3.11.6, pytest-7.4.3, pluggy-1.3.0
rootdir: /Users/pjulien/repos/beets
plugins: typeguard-3.0.2, anyio-4.1.0
collected 132 items / 131 deselected / 1 selected
test/test_ui.py . [100%]
=============================== warnings summary ===============================
../../homebrew/lib/python3.11/site-packages/mediafile.py:52
/Users/pjulien/homebrew/lib/python3.11/site-packages/mediafile.py:52: DeprecationWarning: 'imghdr' is deprecated and slated for removal in Python 3.13
import imghdr
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================= 1 passed, 131 deselected, 1 warning in 0.34s =================
There's only one more failing test for me now (…
======================================================================
FAIL: test_load_item_types (test_metasync.MetaSyncTest.test_load_item_types)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/pjulien/repos/beets/test/test_metasync.py", line 93, in test_load_item_types
self.assertIn("amarok_score", Item._types)
AssertionError: 'amarok_score' not found in {'data_source': <beets.dbcore.types.String object at 0x101b3a2d0>}
----------------------------------------------------------------------
Ran 1169 tests in 42.193s
FAILED (failures=1, skipped=16)
Seems to be similar in nature to the issue I raised (i.e. some shared state between tests being changed based on test ordering; running …
Awesome, thanks for tracking this down! I didn't try to think through the whole behavior of the tests before this PR, but in hindsight, the new code feels like what `DummyIO` should have been doing in the first place.
EDIT: Maybe, given that there's now the `installed` flag, we could add a guard for nested `install` calls?
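Such a guard might look roughly like this (a hypothetical sketch of the suggestion, not the actual beets code; the `installed` flag is the only detail taken from the discussion):

```python
import sys


class DummyIn:
    """Minimal fake stdin for illustration."""

    def readline(self):
        return "\n"


class DummyIO:
    """Sketch of an install() guard based on the `installed` flag."""

    def __init__(self):
        self.stdin = DummyIn()
        self.installed = False
        self._saved_stdin = None

    def install(self):
        if self.installed:
            # A nested install would overwrite _saved_stdin and lose
            # the stream we are supposed to put back on restore().
            raise RuntimeError("DummyIO is already installed")
        self._saved_stdin = sys.stdin
        sys.stdin = self.stdin
        self.installed = True

    def restore(self):
        if self.installed:
            sys.stdin = self._saved_stdin
            self.installed = False
```

Raising loudly on a nested `install` turns a silent state-corruption bug into an immediate test failure at the offending call site.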
Awesome; thanks for taking a look!!
From @wisp3rwind:
This is a good call. I'll add a check right now.
From @Phil305:
Fascinating! That does look like an unrelated problem; we can try to investigate that next. For this specific problem, I'm going to pull the change from #5027 into this PR and see if everything goes green. 🤞
Resolves #5027: failing test: test_nonexistant_db
This test was failing because it was requesting input from the user on stdin. This diff mocks stdin with a canned response.
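A canned-response mock along those lines might look like this (the prompt function below is a toy stand-in for illustration, not the beets code under test):

```python
import io
import sys
from unittest import mock


def ask_yes_no():
    # Toy stand-in for code that prompts the user on stdin.
    return input("Continue? [y/n] ").strip().lower() == "y"


# Supply a canned "y" so the test never blocks waiting for a human;
# input() falls back to sys.stdin.readline() once stdin is replaced.
with mock.patch.object(sys, "stdin", io.StringIO("y\n")):
    answer = ask_yes_no()
```

Using `mock.patch.object` as a context manager ensures the original `sys.stdin` is put back even if the assertion fails.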
Oh dear, that actually revealed several places where a …
Hey guys, I think we'll want to hold off on putting that assertion there, unless we're willing to redo the structure of the test suites for the failing tests.

TL;DR: I'm wondering if there's a way to fix the failing assertion by moving away from shared state, and opting for sharing functionality via stateless functions and explicit definition of state changes that are completely isolated to each test suite. Any worthwhile code duplication could be abstracted away behind the previously mentioned stateless functions. I think this could expose whatever is causing the test failures in an obvious way, making for an easy fix (though I have no proof to back that up, unfortunately! My investigation has only gotten to this point so far 🙂)

Context: I looked into the code, and found a lot of non-obvious implicit state changes in the test suites that the failing test cases (listed at the bottom of this comment) belong to (i.e. global and class-instance-level state changes happening via a class hierarchy, instead of explicitly), which makes it a lot harder for me to fix the assertion failures (most of the info in the diagram isn't directly related to the failing tests, but it highlights the overall situation, so please bear with me!). Some questions I have, based on what I'm seeing:
Here's a class diagram to show the current class hierarchy in a cleaner way (the other failing tests are set up essentially the same way).

Note: the following tests are failing because of the DummyIO assertion:

ImportTest::test_empty_directory_singleton_warning
ImportTest::test_empty_directory_warning
ImportSingletonTest::test_import_single_files
ImportExistingTest::test_asis_updated_moves_file
ImportExistingTest::test_asis_updated_without_copy_does_not_move_file
ImportExistingTest::test_asis_updates_metadata
ImportExistingTest::test_does_not_duplicate_album
ImportExistingTest::test_does_not_duplicate_item
ImportExistingTest::test_does_not_duplicate_singleton_track
ImportExistingTest::test_outside_file_is_copied
ImportExistingTest::test_outside_file_is_moved
I got a little bit nerdsniped by the problems observed in #5027. In short, my high-level diagnosis in #5027 (comment) seems to have been correct: other tests were suppressing the legitimate failure of a flaky test.
I found the problem by running other tests before the problem test, like this: …

When running `test_nonexistant_db` alone, it fails. When running it like this, with another test that goes first, it passes. That's the problem.

However, `test_delete_removes_item` is just one example that works to make this problem happen. It appeared that any test in a class that used our `_common.TestCase` base class had this power. I tracked down the issue to our `DummyIO` utility, which was having an unintentional effect even when it was never actually used.

Here's the solution. Instead of restoring `sys.stdin` to `sys.__stdin__`, we now restore it to whatever it was before we installed our dummy I/O hooks. This is relevant in pytest, for example, which installs its own `sys.stdin`, which we were then clobbering. This was leading to the suppression of test failures observed in #5021 and addressed in #5027.

The CI will fail for this PR because it now (correctly) exposes a failing test. Hopefully by combining this with the fixes in the works in #5027, we'll be back to a passing test suite. 😃 @Phil305, could you perhaps help validate that hypothesis?
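The clobbering described here can be demonstrated in isolation (pytest's capture object is simulated with a plain `StringIO`; this is a self-contained illustration, not beets or pytest code):

```python
import io
import sys

# Simulate pytest installing its own capture object as sys.stdin.
pytest_capture = io.StringIO()
original = sys.stdin
sys.stdin = pytest_capture

# Old behavior: the helper restores to sys.__stdin__, clobbering
# pytest's capture stream even though the helper never owned it.
sys.stdin = io.StringIO()   # helper installs its dummy
sys.stdin = sys.__stdin__   # buggy restore
clobbered = sys.stdin is not pytest_capture

# New behavior: save whatever stream was active and put exactly
# that back.
sys.stdin = pytest_capture
saved = sys.stdin
sys.stdin = io.StringIO()   # helper installs its dummy
sys.stdin = saved           # fixed restore
preserved = sys.stdin is pytest_capture

sys.stdin = original        # clean up for anything that runs after
```

After the buggy restore, `clobbered` is true; after the fixed restore, `preserved` is true, which is exactly the difference this PR makes.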