Low accuracy (37%) on ARC Challenge & Potential Dataset Issues in ref files #36

Hello,
Thank you for this impressive work. I am currently trying to reproduce the results using the provided code.
1. Low Accuracy Observation
I ran inference on the arc_challenge task using the provided reference datasets (ref/correct, incorrect, and ambiguous). However, the resulting accuracy is only around 37%, which seems lower than expected.
2. Potential Dataset Issues
Looking further into the data, I noticed two things that might be causing this:
• Row mismatch in incorrect data: The row counts or alignment in arc_challenge/ref/incorrect do not seem to match the corresponding questions.
• Unrelated knowledge in ambiguous data: In the merged ambiguous dataset, knowledge1 and knowledge2 often appear to be unrelated to each other.
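Both observations could be checked with something like the minimal sketch below. It is not taken from the repository: it assumes each file has already been loaded into a list of dict records, that question rows and reference rows share a field (here a hypothetical `question` key), and that knowledge1 and knowledge2 are plain strings. The function names are made up for illustration, and word overlap is only a crude proxy for whether two knowledge snippets are related.

```python
import re


def find_misaligned_rows(questions, ref_rows, key="question"):
    """Compare two lists of dict records row by row.

    Returns (counts_match, mismatched_indices). The `key` field is an
    assumption; replace it with whatever field both files actually share.
    """
    mismatched = [
        i
        for i, (q, r) in enumerate(zip(questions, ref_rows))
        if q.get(key) != r.get(key)
    ]
    return len(questions) == len(ref_rows), mismatched


def token_overlap(text_a, text_b):
    """Jaccard overlap of lowercase word sets; 0.0 means no shared words."""
    words_a = set(re.findall(r"\w+", text_a.lower()))
    words_b = set(re.findall(r"\w+", text_b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


# Toy records for illustration only, not rows from the actual ref files.
toy_questions = [{"question": "Q1"}, {"question": "Q2"}, {"question": "Q3"}]
toy_incorrect = [{"question": "Q1"}, {"question": "Q3"}]

counts_match, mismatched = find_misaligned_rows(toy_questions, toy_incorrect)
overlap = token_overlap("Plants need sunlight", "Sunlight helps plants grow")
```

A row-count mismatch or a long list of mismatched indices would point to an alignment problem, and overlap scores near zero across many rows would be consistent with knowledge1 and knowledge2 coming from different questions.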
My Question: Could you please clarify if I am using the correct version of the reference data? Or is there a specific preprocessing step I might have missed to align these datasets correctly?
Any guidance would be greatly appreciated.
Thanks!
