StopIteration error when using word_segmentation #79
To make it easier, I put the full code here:
Hi,
@rebouvet Hi, can you upload a sample of the dictionary which causes the error so I can try and debug?
@mammothb Same problem here using this dictionary: https://raw.githubusercontent.com/hermitdave/FrequencyWords/master/content/2018/pt_br/pt_br_full.txt |
Has anyone managed to solve it? I also get a StopIteration error when loading a French dictionary and using word_segmentation. I used this one: link. Error:
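On the frequency lists themselves: assuming they follow the usual layout of one whitespace-separated "term count" pair per line (which is what term_index=0 and count_index=1 expect in load_dictionary), a quick format check could look like the sketch below; "fr_full.txt" is only a placeholder for whichever file was downloaded.

# Sketch: sanity-check the first few lines of a downloaded frequency list.
# "fr_full.txt" is a placeholder filename, not a file from this issue.
with open("fr_full.txt", encoding="utf8") as dict_file:
    for _ in range(3):
        line = dict_file.readline().rstrip("\n")
        parts = line.split()
        print(parts)  # expected: a term followed by its count
        assert len(parts) == 2, "unexpected line format: " + repr(line)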
@lucaslrolim I was able to run word_segmentation without a problem:

import os.path

from symspellpy.symspellpy import SymSpell

# Set max_dictionary_edit_distance to avoid spelling correction
sym_spell = SymSpell(max_dictionary_edit_distance=0, prefix_length=7)
dictionary_path = os.path.join(
    os.path.dirname(os.path.realpath(__file__)), "symspellpy", "pt_br_full.txt"
)
# term_index is the column of the term and count_index is the
# column of the term frequency
sym_spell.load_dictionary(dictionary_path, term_index=0, count_index=1,
                          encoding="utf8")

# a sentence without any spaces
input_term = "thequickbrownfoxjumpsoverthelazydog"
result = sym_spell.word_segmentation(input_term)
print("{}, {}, {}".format(result.corrected_string, result.distance_sum,
                          result.log_prob_sum))

and the output is

Initially, I ran into the
@vection I see you're swapping out the For example,
should work.
@alexvaca0 @rebouvet May I know if you have a similar problem with the dictionary path not pointing to the right location? Similar to what I have described in #79 (comment)
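One quick way to rule that out (just a sketch; the path and filename below are placeholders) is to check the boolean that load_dictionary returns and the size of the loaded vocabulary before calling word_segmentation:

import os.path

from symspellpy.symspellpy import SymSpell

sym_spell = SymSpell(max_dictionary_edit_distance=0, prefix_length=7)
# Placeholder path: point this at wherever the frequency file really lives.
dictionary_path = os.path.join(os.path.dirname(os.path.realpath(__file__)),
                               "es_full.txt")

# load_dictionary returns False when the file cannot be read, so a wrong
# path fails silently unless the return value is checked.
if not sym_spell.load_dictionary(dictionary_path, term_index=0,
                                 count_index=1, encoding="utf8"):
    raise FileNotFoundError("no dictionary file at " + dictionary_path)

# An empty vocabulary leaves word_segmentation with nothing to segment against.
print(len(sym_spell.words), "terms loaded")

If the check fails or prints 0, the path is the thing to fix before looking at word_segmentation itself.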
Hi, I'm trying to use symspellpy to correct some Spanish texts. I've loaded a dictionary of Spanish words and their absolute frequencies, and it seems to load correctly. However, when I try to use word_segmentation, the following error appears no matter what text I pass to it:
StopIteration Traceback (most recent call last)
in
----> 1 result = symspell.word_segmentation('holaadiós')
~/miniconda/envs/bertology/lib/python3.7/site-packages/symspellpy/symspellpy.py in word_segmentation(self, phrase, max_edit_distance, max_segmentation_word_length, ignore_token)
1001 compositions[idx].distance_sum + separator_len + top_ed,
1002 compositions[idx].log_prob_sum + top_log_prob)
-> 1003 idx = next(circular_index)
1004 return compositions[idx]
1005
StopIteration:
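For context on the exception type itself: StopIteration is simply Python's signal that next() was called on an iterator with nothing left to yield, which is exactly what the last frame (idx = next(circular_index)) is doing. A tiny illustration, independent of symspellpy's internals:

from itertools import cycle

it = iter([10])          # an iterator holding a single item
print(next(it))          # prints 10
# next(it)               # a second call would raise StopIteration

# cycle() normally repeats forever, but cycling an *empty* sequence
# yields nothing, so even a "circular" iterator can raise StopIteration.
try:
    next(cycle([]))
except StopIteration:
    print("StopIteration from an empty cycle")

That pattern would at least be consistent with the dictionary-path question above: if nothing was loaded, the iterator word_segmentation builds may have nothing to yield, though that is a guess rather than a confirmed reading of the library code.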