Flaky test: test_dump_modules[description_no_llm] — CategoricalDistribution does not support dynamic value space

## Symptom

`tests/pipeline/test_optimization.py::test_dump_modules[description_no_llm]` fails intermittently with:

```
ValueError: CategoricalDistribution does not support dynamic value space.
```

The test sometimes passes, sometimes fails, on identical commits — i.e. it is a real flake, not a regression.

## Where it surfaces

- **dev branch**: alternating green/red across consecutive CI runs (e.g. run `27102499946` green, prior run `27101359057` red, run before that `27098583531` green).
- **mypy-on-tests Phase B**: surfaced on two parallel subagent PRs (#304, #307) whose diffs do **not** touch `tests/pipeline/` at all, confirming it's not driven by the diff.

Example failing job logs:
- https://github.com/deeppavlov/AutoIntent/actions/runs/27127958767/job/80061423953 (PR #304)
- https://github.com/deeppavlov/AutoIntent/actions/runs/27128038542/job/80061871290 (PR #307)

## Likely root cause

The failure is preceded by warnings emitted from `src/autointent/nodes/_node_optimizer.py` lines 386, 389, 392, 394:

```
UserWarning: Choices for a categorical distribution should be a tuple of
None, bool, int, float and str for persistent storage but contains
{'model_name': 'sergeyzh/rubert-tiny-turbo', 'device': 'cpu'} which is of type dict.
```

Optuna expects `CategoricalDistribution` choices to be `None`, `bool`, `int`, `float`, or `str` so that they can be written to persistent storage. The search space currently passes `dict` objects (embedder configs) as choices.

Optuna raises `does not support dynamic value space` when the choices suggested for a parameter do not compare equal to the choices already recorded for that parameter name in the study. The working hypothesis is that the dict choices do not survive the storage round trip intact: Optuna appears to persist them via `repr()`, so the value space rebuilt from those strings no longer matches the original dicts. For the `description_no_llm` variant, the particular ordering of choices then appears to trip the equality check non-deterministically. Python dict insertion order is stable, but the dict reprs may interact with set/list de-duplication downstream in Optuna.

If this is right, the flakiness comes from the equality check being sensitive to ordering decisions inside Optuna's storage layer, which can shift between runs.
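
To find which choices trigger the warning quoted above, a small check along these lines could be run against the search space before it reaches Optuna. This is a minimal sketch: `find_unsupported_choices` and `sample_choices` are hypothetical names, and the allowed types are taken from the warning text rather than from Optuna's source.

```python
from typing import Any

# Types listed in the Optuna warning as safe for persistent storage.
ALLOWED_CHOICE_TYPES = (type(None), bool, int, float, str)


def find_unsupported_choices(choices: list[Any]) -> list[tuple[int, Any]]:
    """Return (index, value) pairs for choices that are not plain scalars."""
    return [
        (index, value)
        for index, value in enumerate(choices)
        if not isinstance(value, ALLOWED_CHOICE_TYPES)
    ]


# Illustrative input: one dict-typed embedder config mixed with scalar choices.
sample_choices = [
    {"model_name": "sergeyzh/rubert-tiny-turbo", "device": "cpu"},
    "plain-string-choice",
    0.5,
    None,
]

offenders = find_unsupported_choices(sample_choices)
for index, value in offenders:
    print(f"choice {index} has unsupported type {type(value).__name__}: {value!r}")
```

Only the dict entry is reported here; the string, float, and `None` choices pass the check.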

## Suggested fix direction

Convert dict-typed categorical choices into a hashable, scalar representation before handing them to Optuna — e.g. a stable string key (`"st:sergeyzh/rubert-tiny-turbo:cpu"`) or an index into a side-table of full configs. Resolve the chosen key back to the dict at trial time.

This would:
1. Eliminate the four `UserWarning`s pointing at `_node_optimizer.py:386-394`.
2. Let the persisted study round-trip cleanly.
3. Remove the flake.
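
The string-key approach could look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: `build_choice_table`, the use of canonical JSON as the key, and the second model name are all hypothetical.

```python
import json
from typing import Any


def build_choice_table(configs: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Map each dict config to a deterministic string key."""
    table: dict[str, dict[str, Any]] = {}
    for config in configs:
        # sort_keys makes the key independent of dict insertion order,
        # so equal configs always produce the same string.
        key = json.dumps(config, sort_keys=True)
        table[key] = config
    return table


embedder_configs = [
    {"model_name": "sergeyzh/rubert-tiny-turbo", "device": "cpu"},
    # Same config with keys in a different order: collapses to one entry.
    {"device": "cpu", "model_name": "sergeyzh/rubert-tiny-turbo"},
    # Hypothetical second embedder, included only to show two distinct keys.
    {"model_name": "hypothetical/other-embedder", "device": "cpu"},
]

table = build_choice_table(embedder_configs)

# Sorted string keys are what would be handed to Optuna as choices.
choices = sorted(table)

# Resolving a chosen key back to the full config is a plain lookup.
resolved = table[choices[0]]
```

At trial time the key list would go to Optuna, for example `key = trial.suggest_categorical("embedder_config", choices)` followed by `config = table[key]`; the parameter name `embedder_config` is assumed here. Canonical JSON is only one option for the key. The shorter `"st:..."` form mentioned above works equally well as long as it is unique and deterministic.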

## Workaround

Rerun the failing job. In most cases where CI flags this test, the diff under review is unrelated to it.

## Discovered during

The strict-mypy-on-tests workstream (Phase B fan-out). Not blocking that workstream — the failing test is outside every subagent's diff.
