Web demo #45
I would like to work on this feature. As a first step I have tried to run the nlprule example from the README.md as WASM.
Do you know an example WASM project that successfully uses nlprule? |
After some hours of sleep, I have discovered that it seems to be possible to switch to the "fancy-regex" feature in nlprule. I will try that next. |
I have tried to use fancy-regex instead of onig in nlprule using
but it still tries to compile onig_sys. |
Hi, thanks for taking a shot at this!
```toml
[dependencies]
nlprule = { version = "0.6.4", default-features = false, features = ["fancy-regex"] }
```

looks right, I wouldn't expect onig to be used with that configuration. Could you maybe check this into a repository so I can try compiling it? FYI, I also test compilation to WASM in CI, e.g. https://github.com/bminixhofer/nlprule/runs/2761121829, using |
I have followed this tutorial https://rustwasm.github.io/docs/book/game-of-life/hello-world.html and, after just adding nlprule to the Cargo.toml, it fails to compile because it tries to compile onig_sys. It doesn't matter whether I try to compile it using
or
both fail because of onig_sys. |
Ok, thanks! So there were two issues:
```toml
[dependencies]
nlprule = { version = "0.6.4", default-features = false, features = ["regex-fancy"] }
```
 |
I have tried to follow your advice and now it compiles, but this function https://github.com/shybyte/nplrule-wasm-example/blob/master/src/lib.rs throws in the browser console:
Apparently the line

```rust
let tokenizer = Tokenizer::from_reader(&mut tokenizer_bytes).expect("tokenizer binary is valid");
```

is already causing this exception. Interestingly, this line causes no exception:

```rust
let rules = Rules::from_reader(&mut rules_bytes).expect("rules binary is valid");
```
 |
OK, sorry, seems a bit bumpy to get this working, but we'll get there :) I just looked into this: it's a consequence of using parallelism internally, which doesn't work in WASM. Unfortunately, disabling parallelism is handled via an env variable at the moment, so for now the workaround is patching `get_parallelism` in https://github.com/bminixhofer/nlprule/blob/main/nlprule/src/utils/parallelism.rs#L26-L34 and depending on a local copy of nlprule:

```diff
 pub fn get_parallelism() -> bool {
-    match std::env::var(ENV_VARIABLE) {
-        Ok(mut v) => {
-            v.make_ascii_lowercase();
-            !matches!(v.as_ref(), "" | "off" | "false" | "f" | "no" | "n" | "0")
-        }
-        Err(_) => true, // If we couldn't get the variable, we use the default
-    }
+    false
 }
```
 |
Oh, and you should probably also call utils::set_panic_hook(); at the start of your code so that panic messages show up in the browser console. |
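For context, a minimal sketch of what such a panic hook does (the wasm-pack template's `utils::set_panic_hook` delegates to the `console_error_panic_hook` crate; this stand-in approximates it with a plain std hook, so no extra dependency is assumed):

```rust
use std::panic;

// Sketch of a panic hook: without one, a Rust panic in WASM often surfaces
// only as an opaque "unreachable executed" error. The wasm-pack template
// uses the `console_error_panic_hook` crate for this; here we just format
// the panic info ourselves.
pub fn set_panic_hook() {
    panic::set_hook(Box::new(|info| {
        // In the browser this message would be forwarded to console.error;
        // natively it goes to stderr.
        eprintln!("panic occurred: {info}");
    }));
}
```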
Thank you for your help. By the way, this is my first encounter with Wasm. |
That's great! If there's anything else, please ask. I've found for example https://quilljs.com useful for this kind of thing in the past, but I'm not that up to date with web development. |
https://quilljs.com is an HTML editor. I don't know its API, but usually handling HTML increases complexity because we would
All of this is solvable, but I would propose to start simple with simple plain text. |
I agree, an HTML editor would definitely be overkill. It shouldn't only be "start with plaintext", but "keep plaintext" in my opinion. In the past I've disabled all of the HTML functionality and just used Quill to create some highlights in the text, that worked quite nicely. I'm also using Quill for the demo here for example: https://bminixhofer.github.io/nnsplit/ But there's definitely also other solutions. Keep me updated :) |
@bminixhofer Due to unexpected private events, I hadn't done anything for 9 days.
Unfortunately the results are a bit disappointing in regard to performance:
(My CPU is an Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz.) Originally I had implemented a "check as you type" experience (without an explicit check button), but this does not work even for small texts, because the UI thread is blocked for so long. My first suspicion was that it's caused by a debug build, but "wasm-pack build" should build a release build by default, and adding the --release flag also changes nothing. |
Hi, great work!! It's really cool to see this running in the browser :)
No, WASM should run at near-native performance. My first thought also was that it might be a debug build, but it seems you checked that. I think it might have to do with running on the main thread; maybe you could try running it in a web worker, since as you said, it would be better anyway not to block the UI. On my machine (Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz), a check of the sample text usually takes 27ms, but interestingly, after refreshing the page a couple of times, sometimes consistently only 5ms. 5ms is actually somewhat reasonable, so that may be another hint that the issue is running on the main thread. If running in a web worker doesn't solve the issue, I can do some profiling to check where the slowdown is coming from. Also, 8.66MB wire size for the WASM is surprisingly good; I guess the code adds almost no overhead to the raw binary size. |
I have now implemented a webworker-based version: https://github.com/shybyte/nlprule-web-demo/tree/main/webworker-example The unblocked UI thread is a bit nicer in regard to UX, but unfortunately the actual performance is equally bad 😢 This isn't an urgent blocker for further development of the web demo, but the result won't be very impressive with the current WASM performance. It doesn't matter so much for a short demo text, but if you want to check a medium-size blog article and it needs 15 seconds for a check and re-check, then it becomes unwieldy.
However, these are tricks that in the best case shouldn't be necessary, if WASM delivered what it promises. In order to verify that it's not caused by a debug build, I have recompiled it with All this said, I think for now performance is OKish for a web demo (assuming that we don't want to demo performance itself 😄). |
@bminixhofer It's based on https://codemirror.net/ and https://github.com/solidjs/solid. The current state should look like this: The corrections are marked in the CodeMirror editor and also listed as cards on the right. It's possible to apply the replacements by clicking on them. It's far from perfect, but we are not so far from a usable open-source Grammarly replacement, running locally in a web browser with full privacy, because nothing is sent to a server. How cool is that! |
Really, really cool. Great work. I knew neither CodeMirror nor solid.js, but they seem like good choices 🙂 I've thought a bit about the speed issue. Without taking a closer look, I guess there are probably two reasons:
With (1) you could actually play around a bit if you haven't already. You could try another That said, with nlprule running in a WebWorker now the demo actually feels very smooth to me. I think the biggest improvement to speed is as you said splitting into sentences first and caching the sentence-level results. That will actually be easily possible with #72, so another reason to try to get that merged :) |
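The sentence-level caching idea mentioned above can be sketched roughly like this (hypothetical code; `check_sentence` stands in for the real nlprule call, and suggestions are plain strings for simplicity):

```rust
use std::collections::HashMap;

// Hypothetical sketch of sentence-level result caching: on a re-check,
// only sentences whose text changed are passed to the (expensive) checker;
// unchanged sentences reuse their cached suggestions.
struct CheckCache {
    cache: HashMap<String, Vec<String>>, // sentence text -> suggestions
}

impl CheckCache {
    fn new() -> Self {
        Self { cache: HashMap::new() }
    }

    // `check_sentence` stands in for the real rule-checking call.
    fn check(
        &mut self,
        sentences: &[&str],
        check_sentence: impl Fn(&str) -> Vec<String>,
    ) -> Vec<Vec<String>> {
        sentences
            .iter()
            .map(|s| {
                self.cache
                    .entry((*s).to_string())
                    .or_insert_with(|| check_sentence(s))
                    .clone()
            })
            .collect()
    }
}
```

While typing, usually only one sentence changes per keystroke, so the cost of a re-check drops to a single sentence check.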
I have now enabled all possible optimize-for-speed knobs I could find:
and the resulting WASM is around 10% faster with more or less equal size. |
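The exact settings aren't listed above, but the usual optimize-for-speed knobs for a wasm-pack build look roughly like this (a sketch of typical settings; which of these the demo actually uses is an assumption):

```toml
# Cargo.toml (sketch of typical release-profile tuning)
[profile.release]
opt-level = 3      # optimize for speed rather than size ("s"/"z")
lto = true         # whole-program link-time optimization
codegen-units = 1  # better optimization at the cost of compile time

# wasm-pack also runs wasm-opt; its flags can be tuned here:
[package.metadata.wasm-pack.profile.release]
wasm-opt = ["-O4"]
```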
I'm currently experimenting with different ways to optimize the end user performance on the JavaScript side like:
As a side effect, the main demo is no longer suited for performance tests of the base nlprule WASM code. However, I have done some unscientific manual raw-performance measurements (using https://github.com/shybyte/nlprule-web-demo/tree/main/webworker-example) which might be interesting:
Correct includes tokenize and tokenize includes sentencize. All code was compiled with:
From these results I would conclude that WASM (Chrome) is around 5-10 times slower in all areas, so there is no obvious bottleneck (sentencize/tokenize/correct) to fix, but executing the rules dominates the cost, especially for the first execution. (But please note that these measurements are very unscientific.) |
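For reproducing such per-stage measurements, a tiny helper like this is enough (a sketch; natively `std::time::Instant` works, while inside the browser one would read `performance.now()` instead):

```rust
use std::time::Instant;

// Minimal timing helper: run a closure, print how long it took, and
// return its result, mirroring the sentencize/tokenize/correct timings.
fn timed<T>(label: &str, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let result = f();
    println!("{label}: {:.2} ms", start.elapsed().as_secs_f64() * 1000.0);
    result
}
```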
Interesting. Thanks! The demo feels very smooth now! The numbers are a bit disappointing though. Others apparently found a ~50% slowdown compared to native (https://www.usenix.org/conference/atc19/presentation/jangda), but 5-10 times slower is another big jump. I still feel it's likely that there is some issue either in nlprule or in the WASM <-> JS interaction that makes it that slow. If you have time and interest to continue in this direction, you could try replacing the sentencize function with https://docs.rs/srx/0.1.3/srx/. This is actually used under the hood in nlprule, but |
A web demo running client-side via WebAssembly would be really cool and useful. Ideally this should have:
The website should live in another repository and be hosted via GH pages. It should already be possible to implement with the library in its current state.
It's completely open how this is implemented (could be yew, or vuejs / react with a JS interface to nlprule made with wasm-pack, or something else).
It's quite a piece of work but it would be amazing to have. I want to focus on the other things first since they are more important for the core library.
Here I would appreciate contributions a lot!