Popular repositories
- deconfounding-embeddings (Public)
[EMNLP 2025] The Medium is Not the Message: Deconfounding Document Embeddings via Linear Concept Erasure
Jupyter Notebook 7
- LEXam (Public, forked from LEXam-Benchmark/LEXam)
This repo provides code for evaluating LLMs on LEXam, a comprehensive benchmark that evaluates AI systems' legal reasoning ability using law exam questions. It has two subsets of open questions…
Python
- evaluate-idk (Public, forked from JoelNiklaus/evaluate-idk)
Evaluates the extent to which LLMs correctly signal that they don't know the answer to a question.
Python

