icu tokenizer may panic on invalid UTF-8 #34
Comments
Maybe the same or a related error?
Is it because there is a dependency I need to install?
By running I see this panic too in my system,
I confirmed the issue is fixed by adding these lines into
@mschoch Do you have any plans to include this patch in the mainline? It would be really nice, thank you.
Thanks @atthakorn. For anybody else running into this who needs a temporary workaround, I wonder if those lines of init() code could also be invoked from any app code.
Wow, I did try it: the following lines can be invoked from app code and it works fine. Great, thanks (I'm new to Go).
// #cgo LDFLAGS: -licuuc -licudata
// #include "unicode/ucnv.h"
import "C"
func init() {
	// Make UTF-8 the default ICU converter before the tokenizer is used.
	C.ucnv_setDefaultName(C.CString("UTF-8"))
}
However, to leave more of a trail for others, due to $dep ensure -add github.com/blevesearch/blevex
Solving failure: No versions of github.com/blevesearch/blevex met constraints: To install
wherever blevex modules are: So we don't pollute core extension and keep code clean.
When the icu tokenizer gets invalid UTF-8 input like:
You may get a panic. This seems to depend on the version of ICU you have installed, and may also depend on some default ICU settings and/or environment variables.
Some users have reported that adding the following fixes the issue for them.
This issue has been moved from the bleve repo: blevesearch/bleve#185