NLTK's sentence_nist() raises ZeroDivisionError when the hypothesis and reference are the same

NLTK's sentence_nist() raises a ZeroDivisionError when the hypothesis and the reference are identical. Here is my code:

from nltk.translate.nist_score import sentence_nist

ref_test = '太少'
hypo_test = '太少'
split_ref = ''.join(splitKeyword(ref_test))
split_hypo = ''.join(splitKeyword(hypo_test))

# Case 1: tokenized each Chinese character with space
nist_test = sentence_nist([split_ref], split_hypo)  # ZeroDivisionError: division by zero
print(nist_test)

# Case 2: without splitting up the Chinese characters
nist_test = sentence_nist([ref_test], hypo_test)  # ZeroDivisionError: division by zero
print(nist_test)

Both calls raise the same error:

ZeroDivisionError: division by zero

I expected the NIST score to be high when the hypothesis and the reference are identical.

How can I correctly calculate the NIST score of the Chinese sentence pair above?

Detailed Traceback

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[41], line 6
      4 split_ref = ''.join(splitKeyword(ref_test))
      5 split_hypo = ''.join(splitKeyword(hypo_test))
----> 6 nist_test = sentence_nist([split_ref], split_hypo)
      7 #nist_test = sentence_nist([ref_test], hypo_test)
      8 print(nist_test)

File ~/.local/lib/python3.10/site-packages/nltk/translate/nist_score.py:70, in sentence_nist(references, hypothesis, n)
     18 def sentence_nist(references, hypothesis, n=5):
     19     """
     20     Calculate NIST score from
     21     George Doddington. 2002. "Automatic evaluation of machine translation quality
   (...)
     68     :type n: int
     69     """
---> 70     return corpus_nist([references], [hypothesis], n)

File ~/.local/lib/python3.10/site-packages/nltk/translate/nist_score.py:165, in corpus_nist(list_of_references, hypotheses, n)
    162 nist_precision = 0
    163 for i in nist_precision_numerator_per_ngram:
    164     precision = (
--> 165         nist_precision_numerator_per_ngram[i]
    166         / nist_precision_denominator_per_ngram[i]
    167     )
    168     nist_precision += precision
    169 # Eqn 3 in Doddington(2002)
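
The traceback shows the division failing inside a loop over n-gram orders, with the default n=5. A likely cause, assuming nist_precision_denominator_per_ngram[i] holds the number of hypothesis n-grams of order i (an inference from the variable name, not confirmed here), is that a two-character hypothesis has no n-grams of order 3, 4, or 5. The sketch below uses only the standard library to show which orders come up empty; ngram_counts_per_order and largest_safe_order are hypothetical helper names, not part of NLTK.

```python
def ngram_counts_per_order(tokens, max_order=5):
    """Return {order: number of n-grams of that order} for a token sequence."""
    counts = {}
    for order in range(1, max_order + 1):
        # A sequence of length L contains max(L - order + 1, 0) n-grams of this order.
        counts[order] = max(len(tokens) - order + 1, 0)
    return counts


def largest_safe_order(tokens, max_order=5):
    """Largest order for which the sequence still has at least one n-gram.

    Returns 0 for an empty sequence, which no order can satisfy.
    """
    return min(max_order, len(tokens))


# Character-level tokens for the sentence in the question: ['太', '少']
hypothesis = list("太少")

counts = ngram_counts_per_order(hypothesis)
empty_orders = [order for order, count in counts.items() if count == 0]
safe_n = largest_safe_order(hypothesis)

print(counts)        # n-gram count for each order 1..5
print(empty_orders)  # orders whose count is zero
print(safe_n)        # candidate value for the n argument
```

If that assumption holds, one thing to try is passing a smaller order explicitly, for example sentence_nist([list(ref_test)], list(hypo_test), n=safe_n), since the signature in the traceback accepts n. Whether the resulting score is meaningful for such a short sentence is a separate question.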
