Add models configuration object to init() #164
Braintrust eval report: Autoevals (models-config-1768445476)
569d23a to c3d81a0
py/autoevals/ragas.py
Outdated
```python
def _get_ragas_embedding_model(user_model):
    """Get embedding model with RAGAS-specific default fallback.

    Priority:
    1. Explicitly provided user_model parameter
    2. User-configured global embedding default (via init())
    3. RAGAS-specific default (text-embedding-3-small)
    """
    if user_model is not None:
        return user_model

    # Check if user has explicitly configured a global embedding default
    configured_default = _default_embedding_model_var.get(None)
    if configured_default is not None:
        return configured_default

    # Fall back to RAGAS-specific default
    return DEFAULT_RAGAS_EMBEDDING_MODEL
```
This exists because (for some reason) Python and TypeScript are inconsistent about the embedding model to use here. Python has its own fallback to text-embedding-3-small while TypeScript delegates to the EmbeddingSimilarity default which will use text-embedding-ada-002. Should we just be switching everywhere to text-embedding-3-small though?
b064549 to 2a3d1c5
Introduces a new `models` parameter to init() that allows configuring
default models for different evaluation types:
```typescript
init({
  models: {
    completion: 'claude-3-5-sonnet-20241022',
    embedding: 'text-embedding-3-large',
  }
})
```
Changes:
- Added `models` parameter to init() in both JS and Python
- Models object supports:
  - `completion`: Default model for LLM-as-a-judge evaluations
  - `embedding`: Default model for embedding-based evaluations
- `models.completion` takes precedence over deprecated `defaultModel`
- All embedding scorers now use configured default embedding model
- Added getDefaultEmbeddingModel() function
- Maintains backward compatibility with existing `defaultModel` parameter
- Added comprehensive tests for both languages
Default values:
- Completion: "gpt-4o" (unchanged)
- Embedding: "text-embedding-ada-002"
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
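The precedence rules described in the commit message above can be sketched in Python as follows. This is only an illustration under stated assumptions: `resolve_models`, `ModelsConfig`, `ResolvedModels`, and the `default_model` parameter name are hypothetical and are not taken from the library; only the default values and the rule that `models.completion` overrides the deprecated setting come from the description.

```python
from dataclasses import dataclass
from typing import Optional, TypedDict

# Default values as listed in the commit message.
DEFAULT_COMPLETION_MODEL = "gpt-4o"
DEFAULT_EMBEDDING_MODEL = "text-embedding-ada-002"


class ModelsConfig(TypedDict, total=False):
    """Hypothetical shape of the `models` object; both keys are optional."""
    completion: str
    embedding: str


@dataclass(frozen=True)
class ResolvedModels:
    completion: str
    embedding: str


def resolve_models(
    models: Optional[ModelsConfig] = None,
    default_model: Optional[str] = None,  # stands in for the deprecated defaultModel
) -> ResolvedModels:
    """Resolve effective defaults; models.completion beats the deprecated setting."""
    models = models or {}
    completion = models.get("completion") or default_model or DEFAULT_COMPLETION_MODEL
    embedding = models.get("embedding") or DEFAULT_EMBEDDING_MODEL
    return ResolvedModels(completion=completion, embedding=embedding)


# Usage: the new key wins when both are supplied.
resolved = resolve_models(
    models={"completion": "claude-3-5-sonnet-20241022"},
    default_model="gpt-4o-mini",
)
```

Keeping the deprecated parameter as the second choice rather than removing it is what preserves backward compatibility for callers that only pass the old setting.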