Add outer timeout for mpi testing #5150
Open
cpjordan wants to merge 1 commit into
connorjward
requested changes
Jun 3, 2026
Contributor
This should probably go into release so it becomes available for Thetis sooner.
kill_after=${FIREDRAKE_RUN_SPLIT_TESTS_KILL_AFTER:-60s}

if ! command -v timeout >/dev/null 2>&1; then
    echo "GNU timeout is required" >&2
Contributor
In your other PR you allow for GNU parallel not being there. This should also be optional, no?
Contributor
And we definitely want this to run on macOS, so making it optional would help with that if you don't want to set up gtimeout as you suggested.
Contributor
Author
Contributor
Yeah #5147 should also go into release. I plan to merge that ASAP
Contributor
Author
I've updated the base on #5147, will update this branch once merged
Description
Fixes thetisproject/thetis#459 as demonstrated on thetisproject/thetis#438 (@connorjward).
This PR adds a shell-level timeout around each split job run by `firedrake-run-split-tests`.
The existing pytest timeout catches long-running pytest items, but it does not protect the whole `mpiexec ... pytest` process tree. We have observed (in Thetis, both locally and in CI) MPI split jobs that print pytest success and then hang indefinitely during PETSc/OpenMPI finalization. In that state, the wrapper never writes `jobN.errcode`, GNU `parallel` never returns, and CI remains stuck.
Change
Each split job now runs as:
inside the existing `tee` pipeline.
The timeout settings are configurable:
This PR assumes GNU `timeout` is available, matching the current Linux CI environment where `firedrake-run-split-tests` is used. If this helper is later used in macOS CI, a follow-up can either install Homebrew `coreutils` and use `gtimeout`, or add a small platform-specific wrapper.
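For illustration only, here is a minimal sketch of how an outer timeout with an optional fallback might look. It is not the code from this PR. `FIREDRAKE_RUN_SPLIT_TESTS_KILL_AFTER` and its `60s` default come from the diff excerpt in the review; the `SPLIT_JOB_TIMEOUT` variable, its `30m` default, and the helper names `find_timeout` and `run_with_outer_timeout` are hypothetical. The sketch assumes GNU `timeout` semantics (`-k` for the kill-after delay, exit status 124 when the limit is hit).

```shell
#!/usr/bin/env bash
# Sketch only: run a command under an outer timeout when one is available.

# Hypothetical variable name and default; not taken from the PR.
job_timeout=${SPLIT_JOB_TIMEOUT:-30m}
# Variable name and default as shown in the reviewed diff excerpt.
kill_after=${FIREDRAKE_RUN_SPLIT_TESTS_KILL_AFTER:-60s}

# Print the name of a usable timeout command: GNU timeout, or gtimeout
# (the name Homebrew coreutils uses on macOS). Prints nothing if neither exists.
find_timeout() {
    if command -v timeout >/dev/null 2>&1; then
        echo timeout
    elif command -v gtimeout >/dev/null 2>&1; then
        echo gtimeout
    fi
}

# Hypothetical helper: run "$@" under the outer timeout if possible,
# otherwise warn and run the command unprotected.
run_with_outer_timeout() {
    local timeout_cmd
    timeout_cmd=$(find_timeout)
    if [ -n "$timeout_cmd" ]; then
        # Send TERM after $job_timeout, then KILL if the command is
        # still running $kill_after later.
        "$timeout_cmd" -k "$kill_after" "$job_timeout" "$@"
    else
        echo "warning: no timeout command found, running without an outer timeout" >&2
        "$@"
    fi
}
```

With GNU `timeout`, a job that hits the limit returns exit status 124 (or 137 if the follow-up KILL was needed), so a wrapper along these lines could still record an error code instead of hanging. Whether a missing `timeout` should be a warning, as here, or a hard error is a design choice for the PR.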