Typo report

Thanks for releasing this wonderful work.

In `evaluate_from_local.py`, the `extract_xx` functions appear to contain typos (see L101, L111, and L119).

Since the MMLU-Pro dataset has questions with answers ranging from A to P, the pattern should be something like *A-P* instead of *A-J*.
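
To illustrate the concern, here is a minimal sketch of how the character range changes what gets extracted. The function name `extract_answer`, the two pattern constants, and the regex itself are hypothetical stand-ins, not the actual code in `evaluate_from_local.py`; only the *A-J* vs. *A-P* range reflects what is described above.

```python
import re

# Hypothetical patterns for illustration only; the repository's real
# regexes may differ. Only the letter range is the point here.
NARROW_PATTERN = re.compile(r"answer is \(?([A-J])\)?")  # range as reported
WIDE_PATTERN = re.compile(r"answer is \(?([A-P])\)?")    # suggested range


def extract_answer(text, pattern):
    """Return the first captured option letter, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


response = "Let's think step by step. The answer is (K)."
narrow_result = extract_answer(response, NARROW_PATTERN)  # letter K is outside A-J
wide_result = extract_answer(response, WIDE_PATTERN)      # letter K is inside A-P
```

With the narrower range, a response whose answer letter falls after J would not be extracted at all, so it would presumably be scored as wrong or handled by whatever fallback the script uses.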