-
Notifications
You must be signed in to change notification settings - Fork 393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster syntax highlighting #3006
Conversation
merged rules include: texMathCmdStyle, texMathDelim, texCmdStyle, texCmdPackage, texCmdRef, some rules for amsTeX, texSpecialChar, texCmdRef, texCmdPart this results in a ~20% speedup
…Env` rule (also only one `texMathError` rule for un-matched `\end`s) This results in a ~9% speedup. Unfortunately I had to restrict funcionality a bit, it is no longer possible to specify predicates when using `new_env` with `{'math': v:true}`
`:syntime report` shows that the "skip" rules for `texMathZoneTI` and `texMathGroup` are very expensive. The change might speed things up a bit because every non-backslash character only needs to be looked at once. According to `:syntime report` this reduces the time spent on the `texMathZoneTI` skip rule from 3.1% to 2.7%, and similar for texMathGroup. Also change other skip rules of the form `\\<char 1>\|\\<char 2>`. Overall, this leads to a ~1% speedup.
Re-order means change the order in which they are checked. Most importantly, (using the `_texMathBackslash` rule) we try to make the syntax engine check syntax rules for commands first when we find a backslash (instead of looking for things like `[(){}+-=]` etc.) This leads to a ~5% speedup. This is definitely suboptimal, because it's one more syntax rule, when we could in theory just move the definitions of ``` texComment, texSpecialChar, texMathSymbol, texCmdGreek, texTabularChar, texCmdEnvM, texCmdFootnote, texCmdMinipage, texCmdParbox, texCmdRef, texCmdSize, texCmdStyle, texCmdTodo, texCmdVerb, texMathCmd, texMathCmdEnv, texMathCmdStyle, texMathCmdStyleBold, texMathCmdStyleItal, texMathCmdText, texMathDelimMod, texMathSymbol, ``` to the bottom of `vimtex#syntax#core#init_rules()` to get the same effect.
This leads to a ~1.5% speedup.
this leads to a massive speedup for most regexes that aren't just literal strings (and some slowdown for matching strings, so we don't force engine 1 for those regexes). Overall, this leads to a ~23% speedup.
This results in a ~3% speedup.
Thanks for this! At a first glance, it looks very good, and I'm very glad to get help at speeding things up! I very much like how you've documented your commits and the speedup from each separate change. For reference, how are you benchmarking things here? I mean, when you refer to an x % increased performance, what is the exact approach to calculate the speedup? Did you check with both Vim and neovim? I want to verify things on my end, and perhaps also add a CI test that in which I can track progress or changes to the performance in the future. I'm still very surprised by the conclusion that the old regex engine is faster - it really "feels wrong". I'm curious if we should ask some of the Vim or neovim devs to ask what is happening here..? Finally, I've made a few updates to the syntax rules which lead to a conflict here. Sorry! I can probably take the job of merging this and sorting out the conflict after the discussions. Also, for future reference to me and others: See the earlier related discussion in #2877. |
I was using hyperfine before, but later I realized that (at least on linux) you can just count processor cycles which has less noise than wall time. I've only tested nvim so far, but I'll test vim right now. #!/bin/bash
vim=vi
while [[ $# -gt 0 ]]; do
case "$1" in
--vim)
vim="$2"
shift 2
;;
esac
done
if vim --version | grep nvim > /dev/null; then
vim2="$vim --headless"
else
vim2="$vim"
fi
fname="nvim0_vimtex$(git log -1 --format="%h").term"
export TERM=xterm-kitty
"$vim" -u NONE -c 'color desert' -c 'set nospell' -c 'source work.vim' | sed 's/\x1b\[?\(12\|25\|1004\)[lh]//g' | cat -v > "$fname"
echo 'diff:'
if ! diff "$fname" base.term > /dev/null; then
echo 'first diff failed, retrying'
diff "$fname" base.term
fi
echo '---------------------'
echo ''
eval sudo taskset -c 0 nice -n -11 perf stat -r 4 "$vim2 -u NONE -c 'source work.vim'" 2>&1 | sed 's/ \+$//g' | grep -A1 -P 'cycles' It also checks that the output is identical to the baseline, so you need to run set nocompatible
set runtimepath^=../..
set runtimepath+=../../after
filetype plugin indent on
if exists('g:nosyntax') == 0
syntax enable
endif
set nolazyredraw
set columns=130
function! Scroll()
let LINES = line('$')
for s:x in range(2*LINES/winheight(0))
exe "norm! \<C-D>"
redraw!
endfor
for s:x in range(2*LINES/winheight(0))
exe "norm! \<C-U>"
redraw!
endfor
endfunction
let g:vimtex_syntax_conceal_disable = 1
"let g:vimtex_syntax_conceal_disable = 0
"set conceallevel=2
let g:vimtex_syntax_match_unicode = 0
silent edit main1.tex
if exists('g:nolink') == 0
syntax link
endif
call Scroll()
echo 'moving to second file...'
silent edit main2.tex
if exists('g:nolink') == 0
syntax link
endif
call Scroll()
quitall and I've been using the tex files of https://arxiv.org/abs/1512.07213 and https://arxiv.org/abs/1405.0401 to test things. |
OK I've just been testing with vim, and everything looks good. Output is identical to 80c9bc1 and speedup is ~83%. I had to modify the testing script a bit: #!/bin/bash
vim=vi
base=0
while [[ $# -gt 0 ]]; do
case "$1" in
--vim)
vim="$2"
shift 2
;;
--base)
shift 1
base=1
;;
esac
done
if vim --version | grep nvim > /dev/null; then
vim2="$vim --headless"
name='nvim'
else
vim2="$vim"
name='vim'
fi
if [ $base -gt 0 ]; then
fname="base_$name.term"
else
fname="$name""0_vimtex$(git log -1 --format="%h").term"
fi
export TERM=xterm-kitty
"$vim" -u NONE -c 'color desert|set nospell|source work.vim' \
| sed 's/\x1b\[?\(12\|25\|1004\)[lh]//g' \
| sed 's/\x1b\[[0-9]\+;1H/\0\n/g' \
| head -n -1 \
| cat -v > "$fname"
if [ $base -eq 0 ]; then
echo 'diff:'
if ! diff "$fname" base_$name.term > /dev/null; then
echo 'first diff failed, retrying'
"$vim" -u NONE -c 'color desert|set nospell|source work.vim' \
| sed 's/\x1b\[?\(12\|25\|1004\)[lh]//g' \
| sed 's/\x1b\[[0-9]\+;1H/\0\n/g' \
| head -n -1 \
| cat -v > "$fname"
diff "$fname" base_$name.term
fi
echo '---------------------'
echo ''
fi
eval sudo taskset -c 0 nice -n -11 perf stat -r 4 "$vim2 -u NONE -c 'source work.vim'" 2>&1 | sed 's/ \+$//g' | grep -A1 -P 'cycles' |
I've been looking into this a bit -- there are definitely plenty of inefficiencies in the code of the NFA engine. On linux one of them is that it spends something like 25% of the time allocating and freeing memory. This might however be different on another platform. Also, the old engine still seems to be faster when accounting for this. |
I've tried to measure the same as you now between the current master and your latest syntax2 branch. I do observe a significant speedup, but on my end it is closer to |
I'm curious, since you delved deep into this rabbit hole - did you find that there was any significant benefit of changing |
Ok, I've merged your proposed changes now. I did a rebase of everything on top of master and I made a few minor adjustments. I also added the performance tests to the test framework. Thanks for this, it's a major contribution and very much appreciated! |
Barely. I just tried replacing all |
Hi, I finally found the time to clean up my work on speeding up syntax highlighting.
Are the comments in
autoload/vimtex/syntax/core.vim
somewhat understandable? Are there other things that need changing?There's an explicit functional change in this branch: I've added highlighting for
\lVert
and\rVert
. There's also some new restrictions on the arguments of thevimtex#syntax#core#new_env
function.