https://github.com/p-e-w/heretic/compare/master...RyderFreeman4Logos:heretic:feat/llm-judge-pipeline?expand=1 #255
Changes from 14 commits
f2cec53
4135be6
2f31c67
557573b
f709905
cb7e83b
13b3442
e1653a5
311980c
bea19e8
56a680a
a09d9b2
462e17b
ac17262
2404d45
4ea4d52
f47108e
3cec064
e01c882
2e87930
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,25 @@ | ||||||||||||||||||
| # LLM judge configuration (hot-reloadable — changes take effect without restart). | ||||||||||||||||||
| # | ||||||||||||||||||
| # Copy to judge.toml and edit. Environment variables override file values. | ||||||||||||||||||
| # | ||||||||||||||||||
| # Env var mapping: | ||||||||||||||||||
| # LLM_JUDGE_API_BASE, LLM_JUDGE_API_KEY, LLM_JUDGE_MODELS (comma-separated), | ||||||||||||||||||
| # LLM_JUDGE_BATCH_SIZE, LLM_JUDGE_CONCURRENCY, LLM_JUDGE_TIMEOUT, | ||||||||||||||||||
| # LLM_JUDGE_MAX_RETRIES, LLM_JUDGE_PRICING (model:in:out,...) | ||||||||||||||||||
| # | ||||||||||||||||||
| # Config file path can be changed via LLM_JUDGE_CONFIG env var (default: judge.toml). | ||||||||||||||||||
|
|
||||||||||||||||||
| api_base = "http://localhost:8317/v1/chat/completions" | ||||||||||||||||||
| # api_key = "" # prefer LLM_JUDGE_API_KEY env var | ||||||||||||||||||
|
|
||||||||||||||||||
| models = ["gpt-mini", "spark", "gemini-flash"] | ||||||||||||||||||
|
|
||||||||||||||||||
| batch_size = 10 # items per API call | ||||||||||||||||||
| concurrency = 6 # parallel batch workers | ||||||||||||||||||
| timeout = 90 # seconds per HTTP request | ||||||||||||||||||
| max_retries = 3 # retries per model before fallback | ||||||||||||||||||
|
Contributor
These comments violate the repository's style guide. According to rule #4, comments should start with a capital letter and end with a period. Please update these comments to adhere to the style guide.
|
||||||||||||||||||
|
|
||||||||||||||||||
| [pricing] # USD per 1M tokens: [input, output] | ||||||||||||||||||
| gpt-mini = [0.15, 0.60] | ||||||||||||||||||
| spark = [0.50, 2.00] | ||||||||||||||||||
| gemini-flash = [0.15, 0.60] | ||||||||||||||||||
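The header comments of this example file describe a layered lookup: defaults, then `judge.toml` (or the path in `LLM_JUDGE_CONFIG`), then `LLM_JUDGE_*` environment variables on top. A minimal Python sketch of that precedence follows. It is not the PR's implementation: the names `load_judge_config`, `parse_pricing`, and `estimate_cost` are hypothetical, treating `timeout` as a float is an assumption, and so is replacing the whole `[pricing]` table when `LLM_JUDGE_PRICING` is set. Hot reloading is not covered.

```python
import os
from pathlib import Path

try:
    import tomllib  # Standard library since Python 3.11.
except ModuleNotFoundError:
    tomllib = None  # Older interpreters skip the file layer in this sketch.

# Defaults mirror the values in the example file.
DEFAULTS = {
    "api_base": "http://localhost:8317/v1/chat/completions",
    "api_key": "",
    "models": ["gpt-mini", "spark", "gemini-flash"],
    "batch_size": 10,
    "concurrency": 6,
    "timeout": 90,
    "max_retries": 3,
    "pricing": {},
}

# Numeric keys and the type each environment value is converted to (assumed).
CASTS = {"batch_size": int, "concurrency": int, "timeout": float, "max_retries": int}


def parse_pricing(raw):
    """Parse 'model:in:out,...' into {model: [input_price, output_price]}."""
    pricing = {}
    for entry in (part.strip() for part in raw.split(",")):
        if not entry:
            continue
        # Split from the right so model names may themselves contain colons.
        model, price_in, price_out = entry.rsplit(":", 2)
        pricing[model] = [float(price_in), float(price_out)]
    return pricing


def load_judge_config(env=None):
    """Return defaults, overlaid by the TOML file, overlaid by env vars."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    path = Path(env.get("LLM_JUDGE_CONFIG", "judge.toml"))
    if tomllib is not None and path.is_file():
        with path.open("rb") as handle:
            config.update(tomllib.load(handle))
    for key in ("api_base", "api_key"):
        value = env.get(f"LLM_JUDGE_{key.upper()}")
        if value is not None:
            config[key] = value
    if "LLM_JUDGE_MODELS" in env:
        names = env["LLM_JUDGE_MODELS"].split(",")
        config["models"] = [name.strip() for name in names if name.strip()]
    for key, cast in CASTS.items():
        name = f"LLM_JUDGE_{key.upper()}"
        if name in env:
            config[key] = cast(env[name])
    if "LLM_JUDGE_PRICING" in env:
        config["pricing"] = parse_pricing(env["LLM_JUDGE_PRICING"])
    return config


def estimate_cost(pricing, model, input_tokens, output_tokens):
    """Cost in USD, given prices expressed per 1M tokens as [input, output]."""
    price_in, price_out = pricing[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000
```

Passing a plain dictionary as `env` makes the override order easy to exercise without touching the real process environment.
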
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| pre-commit: | ||
| commands: | ||
| fmt: | ||
| run: mise run fmt | ||
| lint: | ||
| run: mise run lint | ||
| typecheck: | ||
| run: mise run typecheck | ||
| build: | ||
| run: mise run build |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| [tools] | ||
| uv = "latest" | ||
| lefthook = "latest" | ||
|
|
||
| [tasks.fmt] | ||
| description = "Check code formatting" | ||
| run = "uv run ruff format --check ." | ||
|
|
||
| [tasks."fmt:fix"] | ||
| description = "Apply code formatting" | ||
| run = "uv run ruff format ." | ||
|
|
||
| [tasks.lint] | ||
| description = "Lint and check import sorting" | ||
| run = "uv run ruff check --extend-select I ." | ||
|
|
||
| [tasks."lint:fix"] | ||
| description = "Lint and auto-fix" | ||
| run = "uv run ruff check --extend-select I --fix ." | ||
|
|
||
| [tasks.typecheck] | ||
| description = "Type check with ty" | ||
| run = "uv run ty check --error-on-warning ." | ||
|
|
||
| [tasks.build] | ||
| description = "Build package" | ||
| run = "uv build" | ||
|
|
||
| [tasks.check] | ||
| description = "Run all quality gates (CI equivalent)" | ||
| depends = ["fmt", "lint", "typecheck", "build"] |
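For readers without mise installed, the `check` task above can be approximated by running the four gate commands directly. The sketch below copies the commands from the task definitions in this file; the helper name `run_gates` is hypothetical, and it runs the gates one after another, which may differ from how mise schedules dependencies.

```python
import subprocess

# Commands copied from the mise tasks, in the order listed by `check`.
GATES = {
    "fmt": ["uv", "run", "ruff", "format", "--check", "."],
    "lint": ["uv", "run", "ruff", "check", "--extend-select", "I", "."],
    "typecheck": ["uv", "run", "ty", "check", "--error-on-warning", "."],
    "build": ["uv", "build"],
}


def run_gates(run=subprocess.run):
    """Run every gate and return the names of the gates that failed."""
    failed = []
    for name, command in GATES.items():
        result = run(command)
        if result.returncode != 0:
            failed.append(name)
    return failed


if __name__ == "__main__":
    failures = run_gates()
    if failures:
        raise SystemExit(f"Failed gates: {', '.join(failures)}")
```

The `run` parameter exists only so the loop can be exercised with a stand-in runner instead of spawning real processes.
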
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -341,6 +341,11 @@ class Settings(BaseSettings): | ||
| description="Strings whose presence in a response (case insensitive) identifies the response as a refusal.", | ||
| ) | ||
|
|
||
| use_llm_judge: bool = Field( | ||
| default=False, | ||
| description="Use LLM judge for refusal classification instead of substring matching.", | ||
| ) | ||
|
Comment on lines +344 to +347
Contributor
The new setting
|
||
|
|
||
| system_prompt: str = Field( | ||
| default="You are a helpful assistant.", | ||
| description="System prompt to use when prompting the model.", | ||
|
|
||
This inline comment violates the repository's style guide (Rule 4), which states that comments should start with a capital letter and end with a period. Please update this and other comments in the file to adhere to the style guide.
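For context on the `use_llm_judge` setting in the hunk above, the two classification paths it chooses between can be sketched as follows. This is an illustration only: the function names, the `refusal_markers` parameter, and the `judge` callable are hypothetical, and the real judge pipeline in this PR batches requests to an HTTP API rather than calling a local function.

```python
def is_refusal_by_markers(response, refusal_markers):
    """Case-insensitive substring match, as the existing setting describes."""
    lowered = response.lower()
    return any(marker.lower() in lowered for marker in refusal_markers)


def classify_refusal(response, refusal_markers, use_llm_judge=False, judge=None):
    """Dispatch to the judge when enabled, else fall back to substring matching."""
    if use_llm_judge:
        if judge is None:
            raise ValueError("use_llm_judge is enabled but no judge was provided.")
        # The judge is any callable returning True when the response is a refusal.
        return bool(judge(response))
    return is_refusal_by_markers(response, refusal_markers)
```

Keeping substring matching as the default path matches the `default=False` declared for the new field.
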