fix(qlib): prevent MultiIndex duplication from groupby().rolling() pattern + custom baseline features#1400

Closed
shin4 wants to merge 1 commit into microsoft:main from shin4:next_agent

shin4 commented Apr 28, 2026


📚 Documentation preview 📚: https://RDAgent--1400.org.readthedocs.build/en/1400/

## Summary

This PR introduces a **preventive fix** for pandas MultiIndex issues caused by `groupby().rolling()` patterns in LLM-generated factor code, complementing the remedial approach in microsoft#1375.

Fixes microsoft#678

## Problem

When an LLM generates factor code with rolling operations on MultiIndex data (index: `['datetime', 'instrument']`), a common pattern produces a 3-level index instead of the expected 2-level one:

```python
# ❌ WRONG - Creates 3-level index: ['instrument', 'datetime', 'instrument']
ma_20 = volume.groupby(level='instrument').rolling(window=20).mean()
# Later lookups of the 'instrument' level by name then raise:
# ValueError: The name instrument occurs multiple times, use a level number
```

When such a 3-level result is combined with 2-level factors, `pd.concat()` fails with:
```
AssertionError: Length of new_levels (3) must be <= self.nlevels (2)
```
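
For reviewers who want to reproduce the index shape locally, here is a minimal sketch on toy data. The series name, dates, and window size are made up for illustration, and the behavior shown is what recent pandas versions produce:

```python
import numpy as np
import pandas as pd

# Toy data: 5 dates x 2 instruments, indexed like qlib factor data.
index = pd.MultiIndex.from_product(
    [pd.date_range("2024-01-01", periods=5), ["A", "B"]],
    names=["datetime", "instrument"],
)
volume = pd.Series(np.arange(10, dtype=float), index=index, name="volume")

# groupby().rolling() prepends the group key to the original index,
# so the 'instrument' level appears twice.
bad = volume.groupby(level="instrument").rolling(window=2).mean()
print(list(bad.index.names))   # ['instrument', 'datetime', 'instrument']

# transform() returns a result aligned to the original 2-level index.
good = volume.groupby(level="instrument").transform(
    lambda x: x.rolling(window=2).mean()
)
print(list(good.index.names))  # ['datetime', 'instrument']
```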

## Solution

We provide a **two-layer fix**:

### Layer 1: Preventive Code Fix (This PR)

Auto-detect and fix the problematic pattern in generated factor code **before execution**:

```python
# rdagent/scenarios/qlib/developer/utils.py
def _fix_groupby_rolling_pattern(code: str) -> str:
    """Convert groupby().rolling().{op}() to groupby().transform(lambda x: x.rolling().{op}())"""
    # Pattern: .groupby(level='instrument').rolling(window=N).mean()
    # Fixed:   .groupby(level='instrument').transform(lambda x: x.rolling(window=N).mean())
```

**Advantages**:
- ✅ Fixes root cause - factor code produces correct 2-level index from the start
- ✅ No data loss or incorrect index ordering
- ✅ Factor values can be used in subsequent operations (division, etc.)
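
As a rough illustration of the rewrite described above, the following regex-based sketch converts the simple form of the pattern. It is not the code in this PR: the function name `fix_groupby_rolling_sketch` and the regex are assumptions, and arguments containing nested parentheses are deliberately left untouched.

```python
import re

# Hypothetical sketch; the PR's _fix_groupby_rolling_pattern may differ.
_OPS = "mean|sum|std|min|max"
_PATTERN = re.compile(
    r"\.groupby\((?P<by>[^()]*)\)"      # .groupby(<args without parentheses>)
    r"\.rolling\((?P<roll>[^()]*)\)"    # .rolling(<args without parentheses>)
    r"\.(?P<op>" + _OPS + r")\(\)"      # .mean() / .sum() / .std() / .min() / .max()
)


def fix_groupby_rolling_sketch(code: str) -> str:
    """Rewrite groupby(...).rolling(...).op() into a transform() call."""

    def _rewrite(m: re.Match) -> str:
        return (
            f".groupby({m['by']})"
            f".transform(lambda x: x.rolling({m['roll']}).{m['op']}())"
        )

    return _PATTERN.sub(_rewrite, code)


src = "ma_20 = volume.groupby(level='instrument').rolling(window=20).mean()"
fixed_src = fix_groupby_rolling_sketch(src)
print(fixed_src)
```

A text-level rewrite like this only covers calls written on one line with plain arguments; an AST-based transform would be more robust if that turns out to matter.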

### Layer 2: Remedial Index Fix (Complementary with microsoft#1375)

Normalize index levels before concat as a **fallback safety net**:
- PR microsoft#1375's approach handles any remaining edge cases
- Both approaches work together for robustness
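
To make the fallback idea concrete, here is one possible shape for an index normalization step before concat. This is a sketch only and does not reproduce microsoft#1375; the helper name `normalize_factor_index` and the `EXPECTED_LEVELS` constant are hypothetical.

```python
import numpy as np
import pandas as pd

EXPECTED_LEVELS = ["datetime", "instrument"]  # assumed factor index layout


def normalize_factor_index(obj):
    """Hypothetical helper: coerce a factor Series/DataFrame to the 2-level index."""
    # Drop leading levels whose name also appears later in the index; this is
    # the duplicate group key that groupby().rolling() prepends.
    while (
        obj.index.nlevels > len(EXPECTED_LEVELS)
        and obj.index.names[0] in obj.index.names[1:]
    ):
        obj = obj.droplevel(0)
    if obj.index.nlevels != 2 or set(obj.index.names) != set(EXPECTED_LEVELS):
        raise ValueError(f"unexpected index levels: {list(obj.index.names)}")
    # Restore the canonical level order and row order.
    return obj.reorder_levels(EXPECTED_LEVELS).sort_index()


index = pd.MultiIndex.from_product(
    [pd.date_range("2024-01-01", periods=4), ["A", "B"]],
    names=EXPECTED_LEVELS,
)
volume = pd.Series(np.arange(8, dtype=float), index=index, name="volume")
rolled = volume.groupby(level="instrument").rolling(window=2).mean()  # 3 levels
fixed = normalize_factor_index(rolled).rename("ma_2")
combined = pd.concat([volume, fixed], axis=1)  # both inputs now share one index
```

Raising on an unrecognized layout, rather than guessing which level to drop, is one way to avoid silently dropping the wrong level.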

## Changes

### Core Fix
- `rdagent/scenarios/qlib/developer/utils.py`: Add `_fix_groupby_rolling_pattern()` function
  - Auto-fixes `groupby().rolling().{mean|sum|std|min|max}()` patterns
  - Converts to `groupby().transform(lambda x: x.rolling().{op}())`
  - Applied before factor code execution

### Prompt Enhancement
- `rdagent/scenarios/qlib/experiment/prompts.yaml`: Add documentation for correct pattern
  - Guides LLM to generate correct code from the start
  - Reduces occurrence of the problematic pattern

### Configuration Updates
- `rdagent/scenarios/qlib/experiment/factor_template/conf_*.yaml`: Use local qlib_bin data, CSI500 market
- `rdagent/oai/llm_conf.py`: Add `request_timeout`, `extra_headers` for LLM API flexibility
- `rdagent/oai/backend/litellm.py`: Support API base/key override, custom headers

### CLI Enhancement
- `rdagent/app/cli.py`: Add `--base_features_path` for custom baseline factors

### Bug Fix
- `rdagent/components/runner/__init__.py`: Include `base_feature_codes` in cache key
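
The reasoning behind the cache-key change can be sketched as follows: if the key ignores the baseline features, two runs with different baselines would collide on the same cached result. The function below is a hypothetical illustration, not the runner's actual keying logic.

```python
import hashlib
import json


def cache_key_sketch(factor_codes, base_feature_codes):
    """Hypothetical key: hash both the new factor code and the baseline code."""
    payload = json.dumps(
        {"factors": sorted(factor_codes), "base": sorted(base_feature_codes)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = cache_key_sketch(["f1 = close / open"], ["vol = std(close, 20)"])
key_b = cache_key_sketch(["f1 = close / open"], ["mom = close / ref(close, 10)"])
key_a_again = cache_key_sketch(["f1 = close / open"], ["vol = std(close, 20)"])
```

Here `key_a` and `key_b` differ because the baselines differ, while `key_a_again` matches `key_a`.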

## Additional Features

### Custom Baseline Factors
- `baseline_features/`: 9 custom factors (Volatility, Momentum, etc.) + Alpha20 config
- Enables starting factor evolution from optimized baseline

## Testing

- All offline tests pass: `pytest -m offline`
- Manual testing with qlib fin_factor scenario
- Verified factor data produces correct 2-level MultiIndex

## Comparison with microsoft#1375

| Aspect | This PR (Preventive) | microsoft#1375 (Remedial) |
|--------|---------------------|------------------|
| Fix timing | Before code execution | Before concat |
| Root cause | ✅ Yes | ⚠️ Partially |
| Data integrity | ✅ Preserved | ⚠️ May drop level incorrectly |
| Index ordering | ✅ Correct | ⚠️ May need swaplevel |
| Complementary | ✅ Works together | ✅ Works together |

**Recommendation**: Merge both for defense-in-depth.

## Related

- Fixes microsoft#678
- Complements microsoft#1375

shin4 commented Apr 28, 2026

@microsoft-github-policy-service agree

shin4 commented Apr 28, 2026

Superseded by #1401 - this PR contained unrelated changes. A new clean PR has been created with only the core MultiIndex fix.

@shin4 shin4 closed this Apr 28, 2026
Development

Successfully merging this pull request may close these issues.

Fail to concat factors with different MultiIndex

1 participant