courseexam: Add Experimental Validation Workflow
#63
Signed-off-by: Tarek <tareknaser360@gmail.com>
Pull request overview
This PR adds a GitHub Actions workflow that enables experimental validation of courseexam submissions. The workflow is manually triggered via workflow_dispatch and runs evaluations on new or modified exam files using Claude Haiku 4.5 for both testing and judging, then posts results as PR comments.
Changes:
- Adds a new GitHub Actions workflow for experimental courseexam validation
- Implements automatic detection of new/modified exams in PRs
- Provides automated feedback via PR comments with evaluation results
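The detection step could look roughly like the sketch below. It is not the workflow's actual logic: it assumes a hypothetical layout in which each exam lives in its own directory under `exams/`, and it parses the text printed by `git diff --name-status` between the base and head of the PR.

```python
from pathlib import PurePosixPath

# Assumption: each exam lives in its own directory under this root.
# The real repository layout may differ.
EXAMS_ROOT = PurePosixPath("exams")


def changed_exams(name_status: str, root: PurePosixPath = EXAMS_ROOT) -> list[str]:
    """Return the sorted names of exam directories that were added or modified.

    `name_status` is the text printed by `git diff --name-status <base>...<head>`,
    one tab-separated record per line.
    """
    exams = set()
    for line in name_status.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # blank or malformed line
        status = parts[0]
        # A = added, M = modified, R = renamed; deletions and others are skipped.
        if not status or status[0] not in "AMR":
            continue
        # For renames git prints "R<score>\told\tnew", so the last field is the new path.
        path = PurePosixPath(parts[-1])
        try:
            rel = path.relative_to(root)
        except ValueError:
            continue  # file is outside the exams tree
        if len(rel.parts) >= 2:
            exams.add(rel.parts[0])
    return sorted(exams)
```

Keeping the parsing in a pure function makes it easy to test without a git checkout; the workflow would only need to pass in the diff output.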
xuafeng left a comment:
Let's merge it to see how it works.
Description
As discussed, this PR adds a CI workflow that runs experimental evaluations on new exam submissions, so contributors can validate their exams before merging.
- It is triggered manually with workflow_dispatch on PRs targeting main.
- It detects new/modified exams and runs the evaluation using anthropic/claude-haiku-4-5 for both testing and LLM-as-judge.
- It then adds a comment on the PR with the results and instructions for inspecting the full results.
I tested this workflow on a private repository.
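As an illustration of the final step, the comment body might be assembled along these lines. This is only a sketch: the function name, the per-exam score mapping (fractions between 0 and 1), and the run URL are hypothetical and not taken from the PR.

```python
def format_comment(model: str, scores: dict[str, float], run_url: str) -> str:
    """Build a Markdown comment body summarizing per-exam scores.

    Assumption: `scores` maps an exam name to a fraction in [0, 1];
    the real workflow may report different fields.
    """
    lines = [
        "## Experimental validation results",
        "",
        f"Model (testing and judging): {model}",
        "",
        "| Exam | Score |",
        "| --- | --- |",
    ]
    for exam in sorted(scores):
        # Render the fraction as a percentage with one decimal place.
        lines.append(f"| {exam} | {scores[exam]:.1%} |")
    lines += ["", f"Full results: {run_url}"]
    return "\n".join(lines)
```

The resulting string could then be posted through whichever mechanism the workflow uses to comment on the PR.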