Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v2] Fix flaky prompt-toolkit tests #9099

Merged
merged 4 commits into from
Nov 22, 2024
Merged

[v2] Fix flaky prompt-toolkit tests #9099

merged 4 commits into from
Nov 22, 2024

Conversation

hssyoo
Copy link
Contributor

@hssyoo hssyoo commented Nov 22, 2024

The PromptToolkitAppRunner test utility only waits 2 seconds for the app to finish rendering. This causes flaky test failures because prompt-toolkit is tested using EC2 service model, which is ~3MB and can take a few seconds to load on hosts with low resources. This PR changes the test service model to CloudWatch, which is ~230KB.

Also bumps the event timeout to 5 seconds to allow more time for the app to render for input buffer tests. Unlike other flaky tests, there's not a slimmer command/model to switch to.

@hssyoo hssyoo requested a review from a team November 22, 2024 14:45
Copy link
Member

@nateprewitt nateprewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming we get a clean run without having to retry 🤞

@nateprewitt
Copy link
Member

The one failure seems to match a lot of the cases I've seen:

functional/autoprompt/test_prompttoolkit.py::TestPromptToolkitPrompterBuffer::test_input_buffer_initialization[args3-s3 ls ] - AssertionError: assert 's3 ls' == 's3 ls '

For whatever reason we're missing trailing spaces on some of the tests periodically. I haven't looked at this at all but 1.) is the trailing space relevant and 2.) if not can we just normalize the strings by trimming before comparison?

@hssyoo
Copy link
Contributor Author

hssyoo commented Nov 22, 2024

The one failure seems to match a lot of the cases I've seen:

functional/autoprompt/test_prompttoolkit.py::TestPromptToolkitPrompterBuffer::test_input_buffer_initialization[args3-s3 ls ] - AssertionError: assert 's3 ls' == 's3 ls '

For whatever reason we're missing trailing spaces on some of the tests periodically. I haven't looked at this at all but 1.) is the trailing space relevant and 2.) if not can we just normalize the strings by trimming before comparison?

It's relevant because when autoprompt is enabled and a command is entered with no ambiguity, then there will be a trailing space. For example, aws s3 ls is unambiguous since there aren't any other s3 commands that fuzzy-match with ls. But aws ec2 is ambiguous because it could be ec2, ec2-instance-connect, etc.

I was able to reproduce this locally by setting the event timeout to a very low value. So what I assume is happening is that the test is actually failing to render the app by timing out, so the input aws s3 ls stays as-is, instead of being transformed to aws s3 ls by the prompt-toolkit app. I think bumping the timeout from 2 to 3 seconds should make this much less common.

@hssyoo
Copy link
Contributor Author

hssyoo commented Nov 22, 2024

Played around with this a bit on my fork. I timed the test durations and saw a couple instances of prompt-toolkit tests taking ~3.5 seconds. Setting the timeout to 5 seconds for now and the plan is to monitor future CI runs to see if we need to bump it higher or look into improving performance of the autoprompt feature. Not too worried about the bump causing longer test times since this is only observed in GitHub CI and it's faster than manually retrying failed builds.

@hssyoo hssyoo merged commit daa3a26 into aws:v2 Nov 22, 2024
45 checks passed
@hssyoo hssyoo deleted the ptk-tests branch November 22, 2024 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants