Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 19 additions & 2 deletions scripts/firedrake-run-split-tests
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,14 @@ Usage:
* <njobs> is the number of different jobs
* <pytest_args...> are additional arguments that are passed to pytest

The following environment variables can be used to configure the
outer process-tree timeout for each split job:

* FIREDRAKE_RUN_SPLIT_TESTS_TIMEOUT: maximum wall time for each job
(default: 1800s)
* FIREDRAKE_RUN_SPLIT_TESTS_KILL_AFTER: grace period before forcibly
killing a timed-out job (default: 60s)

Example:

firedrake-run-split-tests 3 4 tests/unit --verbose
Expand All @@ -28,7 +36,8 @@ Requires:
* pytest
* pytest-split
* mpi-pytest
* GNU parallel"
* GNU parallel
* GNU timeout"

# Print out help message with no arguments or "-h" or "--help"
if [[ "$#" -eq "0" ]] || [[ "$1" == "-h" ]] || [[ "$1" == "--help" ]]; then
Expand All @@ -39,6 +48,13 @@ fi
num_procs=$1
num_jobs=$2
extra_args=${@:3}
job_timeout=${FIREDRAKE_RUN_SPLIT_TESTS_TIMEOUT:-1800s}
kill_after=${FIREDRAKE_RUN_SPLIT_TESTS_KILL_AFTER:-60s}

if ! command -v timeout >/dev/null 2>&1; then
echo "GNU timeout is required" >&2

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your other PR you allow for GNU parallel not being there. This should also be optional no?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we definitely want this to run on macOS, so making it optional would help with that if you don't want to setup gtimeout as you suggested.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense but I was waiting for #5147 to be merged (I don't have the permissions to do so).

Do you want me to make the target branch for #5147 to be release as well? Then I can update this branch for your comments.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah #5147 should also go into release. I plan to merge that ASAP

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the base on #5147, will update this branch once merged

exit 1
fi

if [ $num_procs = 1 ]; then
# Cannot use mpiexec -n 1 because this can sometimes hang with
Expand All @@ -58,10 +74,11 @@ set -x

# This incantation:
# * Runs pytest under GNU parallel using the right number of jobs
# * Applies an outer timeout to the whole pytest/mpiexec process tree
# * Uses tee to pipe stdout+stderr to both stdout and a log file
# * Writes pytest's exit code to a file called jobN.errcode (for later inspection)
parallel --line-buffer --tag \
"${pytest_cmd} 2>&1 | tee ${log_file_prefix}{#}.log; \
"timeout --kill-after=${kill_after} ${job_timeout} ${pytest_cmd} 2>&1 | tee ${log_file_prefix}{#}.log; \
echo \${PIPESTATUS[0]} > job{#}.errcode" \
::: $(seq ${num_jobs})

Expand Down