Replies: 2 comments
-
As I also answered in bab2min/kiwi-farm#1, KiwiTokenizer is included in the kiwipiepy.transformers_addon package, not in kiwipiepy.
from kiwipiepy.transformers_addon import KiwiTokenizer
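The difference between the failing import and the one suggested here can be checked without crashing a notebook cell. The following is only a minimal sketch: `try_import` is a hypothetical helper introduced for illustration, not part of kiwipiepy, and it assumes that the addon module may also fail to import when its own dependencies (presumably the transformers package) are missing.

```python
import importlib


def try_import(module_name, attr_name):
    """Return module_name.attr_name, or None if it cannot be imported.

    Hypothetical helper for illustration; not part of kiwipiepy.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a package that fails
        # while importing its own dependencies.
        return None
    return getattr(module, attr_name, None)


if __name__ == "__main__":
    # Compare the location used in the question with the suggested one.
    for module_name in ("kiwipiepy", "kiwipiepy.transformers_addon"):
        found = try_import(module_name, "KiwiTokenizer") is not None
        print(f"{module_name}.KiwiTokenizer available: {found}")
```

If both lines report False, the problem is likely in the installation itself rather than in the import path.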
-
from kiwipiepy import Kiwi
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Initialize the Kiwi morphological analyzer
kiwi = Kiwi()

# Initialize the tokenizer
tokenizer = Tokenizer(models.WordLevel(unk_token="<unk>"))

# Define the special tokens
special_tokens = ["<pad>", "<unk>", "<s>", "</s>", "<mask>"]

# Custom tokenizing function that uses Kiwi
def kiwi_tokenizer(text):
    # Each token unpacks as (form, tag, start, length); keep only the form
    return [word for word, tag, start, length in kiwi.analyze(text)[0][0]]

# Apply the custom tokenizer
tokenizer.pre_tokenizer = pre_tokenizers.PreTokenizer.custom(kiwi_tokenizer)

# Configure the trainer
trainer = trainers.WordLevelTrainer(vocab_size=36000, special_tokens=special_tokens)

# Train the tokenizer
data_files = ["./data/1234567.txt"]  # training data file
tokenizer.train(data_files, trainer)

# Save the tokenizer
tokenizer.save("kiwi_korean_tokenizer.json")
ImportError                               Traceback (most recent call last)
Cell In[1], line 1
----> 1 from kiwipiepy import Kiwi
      2 from tokenizers import Tokenizer, models, pre_tokenizers, trainers
      4 # Initialize the Kiwi morphological analyzer

File D:\projects\kiwi\kiwipiepy\__init__.py:7
      5 from kiwipiepy._version import __version__
      6 from kiwipiepy._wrap import Kiwi, Sentence, TypoTransformer, TypoDefinition, HSDataset, MorphemeSet, PretokenizedToken
----> 7 import kiwipiepy.sw_tokenizer as sw_tokenizer
      8 import kiwipiepy.utils as utils
      9 from kiwipiepy.const import Match

File D:\projects\kiwi\kiwipiepy\sw_tokenizer.py:15
     11 import warnings
     13 import tqdm
---> 15 from _kiwipiepy import Sw_Tokenizer
     17 from kiwipiepy import Kiwi, Token
     19 @dataclass
     20 class SwTokenizerConfig:

ImportError: cannot import name 'Sw_Tokenizer' from '_kiwipiepy' (D:\anaconda3\Lib\site-packages\_kiwipiepy.cp311-win_amd64.pyd)

I also tried running the tomotopy examples from your blog, and they end with the same error. Is it just me? I created a kiwi virtual environment and removed and reinstalled it several times, but the result is always the same error. I have searched on Google and asked ChatGPT and Claude 2.1, but neither gave a clear solution.
On Sun, Dec 17, 2023 at 9:38 PM, Minchul Lee ***@***.***> wrote:
As I also answered in bab2min/kiwi-farm#1, KiwiTokenizer is included in the kiwipiepy.transformers_addon package, not in kiwipiepy.
from kiwipiepy.transformers_addon import KiwiTokenizer
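In the `Sw_Tokenizer` traceback earlier in this reply, the Python package is loaded from D:\projects\kiwi\kiwipiepy while the compiled `_kiwipiepy` extension is loaded from D:\anaconda3\Lib\site-packages. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that a local source checkout is shadowing the installed package and does not match the installed binary. A minimal sketch for checking where each module would be loaded from, without importing it (`module_origin` is a hypothetical helper name):

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None.

    Hypothetical helper for illustration; find_spec locates a top-level
    module without executing it.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # If the two paths point at different installations, the Python
    # sources and the compiled extension may come from different versions.
    for name in ("kiwipiepy", "_kiwipiepy"):
        print(name, "->", module_origin(name))
```

If the two paths do come from different locations, starting the notebook from a directory outside the source checkout may be worth trying before reinstalling again.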
-
I wanted to try 'KiwiTokenizer' and followed the example as given, but I get the error below. The examples in kiwi-farm also all seem to fail because of 'KiwiTokenizer'.
import kiwipiepy
from kiwipiepy import KiwiTokenizer
ImportError Traceback (most recent call last)
Cell In[3], line 2
1 import kiwipiepy
----> 2 from kiwipiepy import KiwiTokenizer
ImportError: cannot import name 'KiwiTokenizer' from 'kiwipiepy' (D:\projects\kiwi\kiwipiepy\__init__.py)