Skip to content

Improve RSample I/O for distributed RDataFrame usage#22749

Merged
vepadulano merged 3 commits into
root-project:masterfrom
vepadulano:df-distributed-improve-rsample-io
Jul 3, 2026
Merged

Improve RSample I/O for distributed RDataFrame usage#22749
vepadulano merged 3 commits into
root-project:masterfrom
vepadulano:df-distributed-improve-rsample-io

Conversation

@vepadulano

Copy link
Copy Markdown
Member

Since
74fbf73,
initialization of cppyy is defered to a later point then before, when it used to
be initialized at import ROOT time.

This caused in a specific scenario of distributed RDataFrame usage a particular
mix of behaviours that ended in a crash.

In case of running a distributed RDataFrame application using FromSpec, the
distributed tasks need to serialize the information relative to the various
samples. This is done in the Python class SerializableRSample in Ranges.py.

At serialization time, i.e. in setstate, the class was storing a payload
with a few objects among which collections of strings (for names of files and
datasets) with type std::vectorstd::string.

During the deserialization of a distributed task, the following would happen:

  1. The ROOT module is deserialized, i.e. the reduce method of the ROOT
    facade is called.
  2. The getstate method of the SerializableRSample class is called.
  3. Inside of it, the members of the payload are accessed, thus its
    types must be deserialized, including std::vectorstd::string.
  4. cppyy functions are invoked in order to gather information about
    those types.
  5. cppyy has not been initialized yet at this point, hence the crash.

This commit simplifies the implementation of SerializableRSample so that it does
not require cppyy anymore for I/O.

An issue occurs during deserialization of distributed tasks for an RDataFrame
created via FromSpec. The issue is only visible if the application runs
standalone.
Since
root-project@74fbf73,
initialization of cppyy is defered to a later point then before, when it used to
be initialized at `import ROOT` time.

This caused in a specific scenario of distributed RDataFrame usage a particular
mix of behaviours that ended in a crash.

In case of running a distributed RDataFrame application using `FromSpec`, the
distributed tasks need to serialize the information relative to the various
samples. This is done in the Python class SerializableRSample in Ranges.py.

At serialization time, i.e. in __setstate__, the class was storing a payload
with a few objects among which collections of strings (for names of files and
datasets) with type std::vector<std::string>.

During the deserialization of a distributed task, the following would happen:

1. The ROOT module is deserialized, i.e. the __reduce__ method of the ROOT
   facade is called.
2. The __getstate__ method of the SerializableRSample class is called.
3. Inside of it, the members of the payload are accessed, thus its
   types must be deserialized, including std::vector<std::string>.
4. cppyy functions are invoked in order to gather information about
   those types.
5. cppyy has not been initialized yet at this point, hence the crash.

This commit simplifies the implementation of SerializableRSample so that it does
not require cppyy anymore for I/O.
@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown

Test Results

    23 files      23 suites   3d 15h 42m 18s ⏱️
 3 875 tests  3 848 ✅ 0 💤 27 ❌
78 814 runs  78 787 ✅ 0 💤 27 ❌

For more details on these failures, see this check.

Results for commit 6f003db.

@vepadulano

Copy link
Copy Markdown
Member Author

This is to confirm the failure of the test introduced in this PR without the fixes https://github.com/root-project/root/actions/runs/28660728005/job/85000413196?pr=22750#step:8:13348

*** Break *** segmentation violation
   Generating stack trace...
   *** Break *** segmentation violation
   Generating stack trace...
   0x00007f03cdec5393 in <unknown> from /github/home/ROOT-CI/build/lib/libCore.so
   0x00007f03e525fc60 in <unknown> from /lib64/libc.so.6
   0x00007f03e5663c72 in PyObject_GetAttrString + 0x12 from /lib64/libpython3.12.so.1.0
   0x00007f03cf2140d4 in CPyCppyy::CreateScopeProxy(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, _object*, unsigned int) at /github/home/ROOT-CI/src/bindings/pyroot/cppyy/CPyCppyy/src/ProxyWrappers.cxx:648 (discriminator 4) from /github/home/ROOT-CI/build/lib/libCPyCppyy.so
   0x00007f03cf21375d in CPyCppyy::CreateScopeProxy(unsigned long, unsigned int) at /github/home/ROOT-CI/src/bindings/pyroot/cppyy/CPyCppyy/src/ProxyWrappers.cxx:518 from /github/home/ROOT-CI/build/lib/libCPyCppyy.so
   0x00007f03cf215058 in CPyCppyy::BindCppObjectNoCast(void*, unsigned long, unsigned int) at /github/home/ROOT-CI/src/bindings/pyroot/cppyy/CPyCppyy/src/ProxyWrappers.cxx:833 from /github/home/ROOT-CI/build/lib/libCPyCppyy.so
   0x00007f03cf159c4a in CPyCppyy::Instance_FromVoidPtr(void*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) at /github/home/ROOT-CI/src/bindings/pyroot/cppyy/CPyCppyy/src/API.cxx:141 from /github/home/ROOT-CI/build/lib/libCPyCppyy.so
   0x00007f03dc0a9ed7 in PyROOT::CPPInstanceExpand(_object*, _object*) at /github/home/ROOT-CI/src/bindings/pyroot/pythonizations/src/CPPInstancePyz.cxx:53 (discriminator 2) from /github/home/ROOT-CI/build/lib/ROOT/libROOTPythonizations.so
   0x00007f03e565cda0 in <unknown> from /lib64/libpython3.12.so.1.0
   0x00007f03e566f389 in _PyObject_Call + 0xb9 from /lib64/libpython3.12.so.1.0
   0x00007f03e4d66150 in <unknown> from /usr/lib64/python3.12/lib-dynload/_pickle.cpython-312-x86_64-linux-gnu.so
   0x00007f03e4d656ca in <unknown> from /usr/lib64/python3.12/lib-dynload/_pickle.cpython-312-x86_64-linux-gnu.so

@hageboeck hageboeck left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I think I saw a usage of too many lists that may be simplified further. Sorry if I overlooked other uses of these lists.

Comment thread bindings/distrdf/python/DistRDF/HeadNode.py
@vepadulano vepadulano merged commit 6d98c84 into root-project:master Jul 3, 2026
36 of 38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants