Fixing ValueError: Predictions and/or references don't match the expected format.

AFL · January 10, 2024

error shooting

...

File "/home/sujinkwon/anaconda3/envs/py39_langs/lib/python3.9/site-packages/evaluate/module.py", line 616, in _infer_feature_from_example
    raise ValueError(error_msg) from None
ValueError: Predictions and/or references don't match the expected format.
Expected format:
Feature option 0: {'predictions': Value(dtype='string', id='sequence'), 'references': Sequence(feature=Value(dtype='string', id='sequence'), length=-1, id='references')}
Feature option 1: {'predictions': Value(dtype='string', id='sequence'), 'references': Value(dtype='string', id='sequence')},
Input predictions: 43445,
Input references: 20650

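When computing a metric for translation, load it with metric = evaluate.load("sacrebleu"). For masked language modeling (MLM) or causal language modeling (CLM), load it with metric = evaluate.load("accuracy") instead. In the traceback above, the inputs are the integers 43445 and 20650, while both expected feature options take strings, so the loaded metric and the data do not match.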

With the right metric loaded, translation uses the features [{'predictions': Value(dtype='string', id='sequence'), 'references': Sequence(feature=Value(dtype='string', id='sequence'), length=-1, id='references')}, {'predictions': Value(dtype='string', id='sequence'), 'references': Value(dtype='string', id='sequence')}],
and MLM and CLM use {'predictions': Value(dtype='int32', id=None), 'references': Value(dtype='int32', id=None)}.
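
The rule above can be sketched as a small helper that picks the metric name per task and checks the element types before anything is computed. This is only a rough sketch, not part of the evaluate library: the task labels ("translation", "mlm", "clm") and the function names metric_name_for_task, check_inputs, and compute_metric are hypothetical names made up for illustration. Only evaluate.load and metric.compute come from the library, and compute_metric assumes evaluate is installed and the metric script can be downloaded.

```python
# Hypothetical task labels for this sketch; the metric names are the ones used in the post.
METRIC_FOR_TASK = {
    "translation": "sacrebleu",  # string predictions / string (or list of string) references
    "mlm": "accuracy",           # int32 predictions / int32 references
    "clm": "accuracy",
}


def metric_name_for_task(task):
    """Return the metric name to pass to evaluate.load for a given task label."""
    return METRIC_FOR_TASK[task]


def check_inputs(metric_name, predictions, references):
    """Fail early, with a readable message, if element types do not fit the metric."""
    expected = str if metric_name == "sacrebleu" else int
    for pred, ref in zip(predictions, references):
        # sacrebleu also accepts several references per prediction.
        if metric_name == "sacrebleu" and isinstance(ref, (list, tuple)):
            refs = list(ref)
        else:
            refs = [ref]
        ok = isinstance(pred, expected) and all(isinstance(r, expected) for r in refs)
        if not ok:
            raise TypeError(
                f"{metric_name} expects {expected.__name__} values, "
                f"got prediction={pred!r}, reference={ref!r}"
            )


def compute_metric(task, predictions, references):
    """Load the metric that matches the task and compute it."""
    import evaluate  # imported lazily so the helpers above work without the package

    name = metric_name_for_task(task)
    check_inputs(name, predictions, references)
    metric = evaluate.load(name)
    return metric.compute(predictions=predictions, references=references)
```

With this in place, passing integer values such as 43445 and 20650 under the "translation" task raises a TypeError from check_inputs before the metric is loaded, while the same values pass the check under "mlm" or "clm".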