Hello,
Thank you for this impressive work. I am currently trying to reproduce the results using the provided code.
- Low accuracy observation: I ran inference on the arc_challenge task using the provided reference datasets (ref/correct, incorrect, and ambiguous). The resulting accuracy is only around 37%, which seems lower than expected.
- Potential dataset issues: On inspecting the data, I noticed two things that might be causing this:
• Row mismatch in incorrect data: The row counts or alignment in arc_challenge/ref/incorrect do not appear to match the corresponding questions.
• Irrelevant knowledge in ambiguous data: In the merged ambiguous dataset, knowledge1 and knowledge2 often appear to be unrelated to each other.
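For reference, the row-mismatch point could be checked with something like the sketch below. It is only an illustration: it assumes each split has been loaded as a list of dicts and that rows carry a `question` field, and both the field name and the helper `find_misaligned_rows` are hypothetical, not taken from the repository. The toy lists stand in for the real files.

```python
from typing import Dict, List


def find_misaligned_rows(
    questions: List[Dict[str, str]],
    references: List[Dict[str, str]],
    key: str = "question",  # hypothetical field name, adjust to the real schema
) -> Dict[str, object]:
    """Compare two splits row by row and report count and alignment mismatches."""
    # Only indices present in both splits can be compared directly.
    shared = min(len(questions), len(references))
    mismatched = [
        i
        for i in range(shared)
        if questions[i].get(key, "").strip() != references[i].get(key, "").strip()
    ]
    return {
        "question_rows": len(questions),
        "reference_rows": len(references),
        "mismatched_indices": mismatched,
    }


# Toy data standing in for the task questions and the ref/incorrect split.
questions = [{"question": "Q1"}, {"question": "Q2"}, {"question": "Q3"}]
incorrect_ref = [{"question": "Q1"}, {"question": "Q3"}]

report = find_misaligned_rows(questions, incorrect_ref)
print(report)
```

A differing row count or a non-empty `mismatched_indices` list would indicate that the reference rows are not aligned one-to-one with the questions.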
My Question: Could you please clarify if I am using the correct version of the reference data? Or is there a specific preprocessing step I might have missed to align these datasets correctly?
Any guidance would be greatly appreciated.
Thanks!