chore: allow additional fields to EvaluationData and flexible experiment report type #232
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,4 @@ | ||
| from pydantic import BaseModel | ||
| from pydantic import BaseModel, ConfigDict | ||
| from typing_extensions import Any, Generic, TypedDict, TypeVar | ||
| from .trace import Session | ||
| @@ -93,8 +93,15 @@ class EvaluationData(BaseModel, Generic[InputT, OutputT]): | ||
| metadata: Additional information about the test case. | ||
| actual_interactions: The actual interaction sequence given the input. | ||
| expected_interactions: The expected interaction sequence given the input. | ||
poshinchen marked this conversation as resolved.
| Extra fields from `Case` subclasses are preserved as model attributes (e.g. a typed | ||
| `config` field on a `Case` subclass passes through to the evaluator with its type intact). | ||
| Note: `extra="allow"` means misspelled field names will be silently accepted rather than | ||
| rejected — the tradeoff for supporting subclass passthrough. | ||
| """ | ||
| model_config = ConfigDict(extra="allow") | ||
Collaborator
Let's double-check that there's not a subtle gotcha lurking here. The pydantic extras are preserved as live objects in memory, but they have no field annotation, so `model_validate` can't reconstruct their type. Any […] So an evaluator written with […] Please either: […]
Minimal repro:
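A minimal sketch of the round-trip behavior described above, assuming pydantic v2. This is not the reviewer's original snippet: `JudgeConfig` and `Data` are hypothetical stand-ins for a typed `config` extra and for `EvaluationData`.

```python
from pydantic import BaseModel, ConfigDict


class JudgeConfig(BaseModel):
    """Hypothetical typed object passed through as an extra field."""

    threshold: float


class Data(BaseModel):
    """Hypothetical stand-in for EvaluationData with extra="allow"."""

    model_config = ConfigDict(extra="allow")

    input: str


# In memory, the extra keeps its original type.
original = Data(input="hi", config=JudgeConfig(threshold=0.5))
in_memory_type = type(original.config)

# After a dump/validate round trip there is no annotation for `config`,
# so it is restored as whatever the serialized form was (a plain dict).
restored = Data.model_validate(original.model_dump())
restored_type = type(restored.config)
```

Under these assumptions, attribute access such as `restored.config.threshold` would fail after the round trip, because the restored value is a dict rather than a `JudgeConfig`.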
Contributor
Author
Yes, I think this is not a blocker as of now, but we definitely need to think about how to achieve the recovery when loading from the […]
| input: InputT | ||
| actual_output: OutputT | None = None | ||
| name: str | None = None | ||
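The tradeoff noted in the docstring (misspelled field names being accepted silently) can be illustrated with a small sketch, assuming pydantic v2. `Lenient` and `Strict` are hypothetical models, not classes from this repository.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class Lenient(BaseModel):
    """Hypothetical model mirroring extra="allow"."""

    model_config = ConfigDict(extra="allow")

    input: str
    name: Optional[str] = None


class Strict(BaseModel):
    """Hypothetical model with extra="forbid", for comparison."""

    model_config = ConfigDict(extra="forbid")

    input: str
    name: Optional[str] = None


# The typo `nmae` is stored as an extra; the real `name` stays None.
lenient = Lenient(input="hi", nmae="case-1")

# With extra="forbid", the same typo raises a validation error.
try:
    Strict(input="hi", nmae="case-1")
    strict_rejected = False
except ValidationError:
    strict_rejected = True
```

Inspecting `lenient.model_extra` is one way to surface unexpected keys when `extra="allow"` is in use.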