~tsdh/highlight-parentheses.el#5: 
Laggy movement due to highlight-parentheses--highlight → scan-sexps

When I'm viewing a log file of size 359 556 lines ("text-mode"), I noticed every time I move caret Emacs freezes for a second. Further research showed that the reason is scan-sexps call in the highlight-parentheses--highlight.

Here's profiler report:

    9512  89% - timer-event-handler
    9512  89%  - apply
    9501  89%   - highlight-parentheses--highlight
    9501  89%      scan-sexps
       8   0%   - #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_11>
       8   0%      jit-lock-context-fontify
       3   0%   - sp-show--pair-function
       3   0%    - sp--get-allowed-regexp
       3   0%     - sp--strict-regexp-opt
       3   0%      - #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_77>
       3   0%       - sp--regexp-for-group
       3   0%          regexp-opt
     608   5% + redisplay_internal (C function)
     460   4% + command-execute
       0   0%   ...

#Steps to reproduce

  1. Make a file with echo "(" > file.txt && base64 /dev/urandom | head -n 359556 >> file.txt && echo ")" >> file.txt
  2. Open it emacs ./file.txt
  3. Move caret up and down

#Expected

There's no lags

#Actual

There are lags

Status
RESOLVED CLOSED
Submitter
~hi_angel
Assigned to
No-one
Submitted
9 months ago
Updated
9 months ago
Labels
No labels applied.

~tsdh 9 months ago

"~hi_angel" outgoing@sr.ht writes:

When I'm viewing a log file of size 359 556 lines ("text-mode"), I noticed every time I move caret Emacs freezes for a second.

Which version of emacs is that?

Further research showed that the reason is scan-sexps call in the highlight-parentheses--highlight.

Here's profiler report:

    9512  89% - timer-event-handler
    9512  89%  - apply
    9501  89%   - highlight-parentheses--highlight
    9501  89%      scan-sexps

So the culprit is emacs, no? I mean, scan-sexps is the the right tool for determining the other part of a balanced thing, e.g., finding the position of the closing paren given the position of the opening one. Of course, when the distance between those is huge, it can take a while.

#Steps to reproduce

  1. Make a file with echo "(" > file.txt && base64 /dev/urandom | head -n 359556 >> file.txt && echo ")" >> file.txt
  2. Open it emacs ./file.txt
  3. Move caret up and down

My profiler report also shows that scan-sexps is where most time is spent although its not really laggy on my 10 years old thinkpad.

Anyway, I'd just not use highlight-parentheses-mode for log files. In general, I think it's suited mainly for programming modes and prose. So instead of using global-highlight-parentheses-mode you could use something like

(add-hook 'prog-mode-hook #'highlight-parentheses-mode)

HTH, Tassilo

~hi_angel 9 months ago

Which version of emacs is that?

Latest git built 2-3 weeks ago

So the culprit is emacs, no? I mean, scan-sexps is the the right tool for determining the other part of a balanced thing, e.g., finding the position of the closing paren given the position of the opening one

It's true that the function being slow is from Emacs library, but I don't know if the change is required there or in the mode. Emacs provides various slow functions, such as looking-back for example, and then optimizing the actual usage is what usually modes do. For example, doing some kind of caching or what not… I don't know the details of how scan-sexps work (so if there's a way to cache results for example) or if there are better alternatives, so I reported it here because I figured the mode author has better knowledge in that regard 😊

its not really laggy on my 10 years old thinkpad.

Well, if you replace ( with (( and ) with )) the lags increase a lot. I constructed the testcase just to make sure it shows at least some lag, but it can be easily increased.

~tsdh 9 months ago

"~hi_angel" outgoing@sr.ht writes:

Which version of emacs is that?

Latest git built 2-3 weeks ago

Alright.

So the culprit is emacs, no? I mean, scan-sexps is the the right tool for determining the other part of a balanced thing, e.g., finding the position of the closing paren given the position of the opening one

It's true that the function being slow is from Emacs library, but I don't know if the change is required there or in the mode. Emacs provides various slow functions, such as looking-back for example, and then optimizing the actual usage is what usually modes do. For example, doing some kind of caching or what not…

Well, the highlighting of the surrounding parens are updated every highlight-parentheses-delay seconds but only if point has changed since the last time. I don't see how I could optimize that further.

I mean, actually re-highlighting the surrounding parens/braces/brackets is only needed when between the current and previous value of point there was at least one paren/brace/bracket. However, checking that could be as expensive as scan-sexps when the distance of the move was large, e.g., by M->.

Well, since I have the values of old/new point, I could do that check if the values are not more than DIFF = 1000 away from each other... I guess that would make the normal operation a bit slower but improve the worst case scenario you have. Not sure if the tradeoff is justified...

I'll give it a try...

its not really laggy on my 10 years old thinkpad.

Well, if you replace ( with (( and ) with )) the lags increase a lot. I constructed the testcase just to make sure it shows at least some lag, but it can be easily increased.

Yes, highlight-parentheses highlights the N surrounding pairs of parens depending on the longest list of highlight-parentheses-colors, highlight-parentheses-background-colors, and highlight-parentheses-attributes. So by default, it'll highlight the 4 nearest surrounding pairs of parens. Each pair requires one scan-sexps call to compute the position of the closing one from the opening one.

Bye, Tassilo

~hi_angel 9 months ago

[…] but only if point has changed since the last time. I don't see how I could optimize that further.

I mean, actually re-highlighting the surrounding parens/braces/brackets is only needed when between the current and previous value of point there was at least one paren/brace/bracket. However, checking that could be as expensive as scan-sexps when the distance of the move was large, e.g., by M->.

But the mode knows location of the nearest pair (as it highlights them), right? So I'd imagine it could simply check if (point) is still in the range of the pair and completely avoid calling (scan-sexps) in that case? It is O(1), so both cheap and also beneficial to the common case editing a code as well.

Well, since I have the values of old/new point, I could do that check if the values are not more than DIFF = 1000 away from each other... I guess that would make the normal operation a bit slower but improve the worst case scenario you have. Not sure if the tradeoff is justified...

I'm not sure if we think about the same thing or not. Since you're saying it would make the normal operation a bit slower, you probably think of something else. How would you want to do the check? What do you think about my suggestion in the prev. paragraph?

~tsdh 9 months ago

Your idea is good and better than mine. However, it could fall apart when editing and inserting a closing paren for example. Then we are still inside the formerly surrounding pair of parens but that isn't the currently surrounding pair anymore.

~hi_angel 9 months ago

it could fall apart when editing and inserting a closing paren for example. Then we are still inside the formerly surrounding pair of parens but that isn't the currently surrounding pair anymore.

You are right, but we could simply invalidate the cache every time the buffer is edited. It's a fast operation and I imagine overall change should be still beneficial because navigation happens more often than editing, and also complexity of (scan-sexps) that we skip is O(n).

~tsdh referenced this from #5 9 months ago

~tsdh 9 months ago

Yes, right. Will do it that way. 👍

Am Do, 28. Mär 2024, um 15:16, schrieb ~hi_angel:

it could fall apart when editing and inserting a closing paren for example. Then we are still inside the formerly surrounding pair of parens but that isn't the currently surrounding pair anymore.

You are right, but we could simply invalidate the cache every time the buffer is edited. It's a fast operation and I imagine overall change should be still beneficial because navigation happens more often than editing, and also complexity of (scan-sexps) that we skip is O(n).

-- View on the web: https://todo.sr.ht/~tsdh/highlight-parentheses.el/5#event-338461

~hi_angel 9 months ago

Thank you very much for looking into it! 🎉

~tsdh 9 months ago

"~hi_angel" outgoing@sr.ht writes:

Thank you very much for looking into it! 🎉

Too bad, we didn't see the following problem:

(((point is here: | (more (parens)))))
  ^                                ^

So when point is at the | above, the cache contains the paren pair marked with ^. Now when we move point to the the "more", we are still in the boundaries, so no re-highlighting is done although it should be.

That's why my original idea was to check every character between old and new point location if there is some paren in between. Of course, that is more costly than your approach.

If you have some better idea, I'm all ears. Otherwise, I'll test my original approach.

~hi_angel 9 months ago

Oh, I'm sorry, I didn't see that. Okay, well, we can modify the original idea a bit.

So, your problem basically comes down to the fact that movement may change what pair is the closest even if we're still in the range of the one last cached. Which makes tracking the range useless… or does it? 😊

Well, note that you often have to call (scan-sexps) more than once, 4 times by default since you want 4 pairs to be highlighted (IIUC). So what you can do here is you call (scan-sexps) as usual but only once. And then you check if the boundaries you get are matching the closest pair (one you cached). If it does, that means you're all good and no need to call (scan-sexps) again.

So the idea becomes:

  1. Have the closest pair location cached.
  2. If (point) is moved, call (scan-sexps) just once to check if cache is still valid, otherwise invalidate it.
  3. If buffer is modified, unconditionally invalidate the cache (buffer may get modified in multiple locations at once, so cache can no longer be trusted).

That's why my original idea was to check every character between old and new point location if there is some paren in between. Of course, that is more costly than your approach.

Well, I think my suggestion now becomes similar to yours, except buffer scanning is still done by Emacs' (scan-sexps) and that I imagine it's still beneficial to the common usecase of just navigating the code.

~tsdh 9 months ago

Tassilo Horn referenced this ticket in commit 3715c2a.

~tsdh 9 months ago

Could you please try the current version of the main branch. I've implemented both your and my optimizations. I've tested both individually but it seems the effect is best when used together.

~hi_angel 9 months ago

I've implemented both your and my optimizations. I've tested both individually but it seems the effect is best when used together

Cool, thanks! Just tested, works for me!

I'm not seeing how to leave a code commentary on this site, so some comments here:

@@ 170,27 176,66 @@ If the optional argument OVERLAYS (a list) is non-nil, delete all
 overlays in it instead."
   (mapc #'delete-overlay overlays))

+(defun highlight-parentheses--highlight-needed-p ()
+  "Return non-nil when if re-highlighting is needed."

"when if" in the description, probably meant to be just when or just if.

+            (let ((start (if (< point last-point) point last-point))
+                  (end (if (< point last-point) last-point point)))

This part can be simplified (and optimized for that matter) by using min and max like:

        (let ((start (min point last-point))
              (end (max point last-point)))

+              ;; Lets assume we keep moving in the same direction, so update
+              ;; last-point to the current value of point.
+              (setq highlight-parentheses--last-point point)
+              nil))))))

You execute (setq highlight-parentheses--last-point point) both here and in (highlight-parentheses--highlight) right afterwards. I think one of them is extraneous. I don't understand the algorithm too well to say which one, but offhand I would remove the one you just added to highlight-parentheses--highlight-needed-p because the function seems to scan range for parentheses, and then changing "last point" is sudden for me as a side-reader. Especially so since it's not even changing the "last pair" and postfixed with -p.

~hi_angel 9 months ago

Hmmm, weird. Part of the text got lost. It is good that I edited it via Qutebrowser + Emacs and I still have a buffer with the text.

The lost part:

+      (setq highlight-parentheses--last-pair
+            (cons (overlay-start (car highlight-parentheses--overlays))
+                  (overlay-start (cadr highlight-parentheses--overlays)))))))

I presume that part should be at the end of the (catch 'no-change …) instead of being outside it? I.e. because you only want to update the last pair when it needs to be updated, and it only needs to be updated when 'no-change wasn't thrown. Unless I'm misunderstanding something of course.


I have a question. I'm probably asking something stupid, because I don't understand this algorithm too well, but why in the code below you execute move-overlay always, disregarding if last pair changed or not? It would seem to me overlay only needs to be moved when pair cache is invalidated…

+              (move-overlay (pop overlays) pos1 (1+ pos1))
+              (when (setq pos2 (scan-sexps pos1 1))
+                (move-overlay (pop overlays) (1- pos2) pos2)
+                ;; Check if the immediately surrounding pair of parens is at the
+                ;; same location as before.  If so, we can skip moving the other
+                ;; overlays since they haven't changed, too.
+                (when (and first-iteration
+                           (equal highlight-parentheses--last-pair
+                                  (cons pos1 (1- pos2))))
+                  (throw 'no-change t))
+                (setq first-iteration nil)))))

~hi_angel 9 months ago

Hmmm, weird. Part of the text got lost. It is good that I edited it via Qutebrowser + Emacs and I still have a buffer with the text.

Actually, nvm, it may have been on my side but it's weird. I just noticed that the leftover buffer is still marked as unsaved (and if I undo I see that the last time it was saved matches with the "truncated text"). But it's weird, because I edit texts in browser via a separate emacsclient which I always exit with :x (using Evil). I mean, I just don't exit this window in a way that may not get saved. No idea what happened there.

~tsdh 9 months ago

"~hi_angel" outgoing@sr.ht writes:

Cool, thanks! Just tested, works for me!

Great.

I'm not seeing how to leave a code commentary on this site, so some comments here:

@@ 170,27 176,66 @@ If the optional argument OVERLAYS (a list) is non-nil, delete all
 overlays in it instead."
   (mapc #'delete-overlay overlays))

+(defun highlight-parentheses--highlight-needed-p ()
+  "Return non-nil when if re-highlighting is needed."

"when if" in the description, probably meant to be just when or just if.

Fixed.

+            (let ((start (if (< point last-point) point last-point))
+                  (end (if (< point last-point) last-point point)))

This part can be simplified (and optimized for that matter) by using min and max like:

        (let ((start (min point last-point))
              (end (max point last-point)))

Right, fixed.


+              ;; Lets assume we keep moving in the same direction, so update
+              ;; last-point to the current value of point.
+              (setq highlight-parentheses--last-point point)
+              nil))))))

You execute (setq highlight-parentheses--last-point point) both here and in (highlight-parentheses--highlight) right afterwards. I think one of them is extraneous.

No, that's intended. highlight-parentheses--highlight-needed-p moves point when no re-highlighting is needed and therefore the setq in highlight-parentheses--highlight won't be executed.

I don't understand the algorithm too well to say which one, but offhand I would remove the one you just added to highlight-parentheses--highlight-needed-p because the function seems to scan range for parentheses, and then changing "last point" is sudden for me as a side-reader. Especially so since it's not even changing the "last pair" and postfixed with -p.

Yes, that's surely a sneaky side-effect. But it's an optimization. It assumes that you probably keep moving in the same direction so it sets the last point value in that direction if no paren crosses our way so that the 5000-char window where we scan each character moves with point.

+      (setq highlight-parentheses--last-pair
+            (cons (overlay-start (car highlight-parentheses--overlays))
+                  (overlay-start (cadr highlight-parentheses--overlays)))))))

I presume that part should be at the end of the (catch 'no-change …) instead of being outside it? I.e. because you only want to update the last pair when it needs to be updated, and it only needs to be updated when 'no-change wasn't thrown. Unless I'm misunderstanding something of course.

I update it always with the values of the surrounding pair which are the first 2 overlays in highlight-parentheses--overlays. That has the nice effect that we change the value to (nil) also when we aren't surrounded by parens anymore. In your version, we would update the value only if there is a surrounding pair. And in your version, we also wouldn't update if the closing paren was gone (we killed it with some editing command).

I have a question. I'm probably asking something stupid, because I don't understand this algorithm too well, but why in the code below you execute move-overlay always, disregarding if last pair changed or not? It would seem to me overlay only needs to be moved when pair cache is invalidated…

Yes, true. But I benchmarked it. Moving an overlay to its current location one million times takes 0.2 seconds so no need to worry.

Bye, Tassilo

~hi_angel 9 months ago

I see, thank you very much!

~tsdh REPORTED FIXED 9 months ago

Thank you for making me aware, having good ideas, and doing a code review. :-)

~yantar92 9 months ago

After this fix I started getting errors - when highlight-parentheses--overlays happens to be nil in buffer.

~tsdh FIXED REPORTED 9 months ago

~tsdh 9 months ago

Can you get a backtrace with toogle-debug-on-error? Does that happen when you switch to a buffer where highlight-parentheses-mode is off?

~yantar92 9 months ago

"~tsdh" outgoing@sr.ht writes:

Can you get a backtrace with toogle-debug-on-error? Does that happen when you switch to a buffer where highlight-parentheses-mode is off?

Debugger entered--Lisp error: (wrong-type-argument overlayp nil) (overlay-start nil) (highlight-parentheses--highlight) (apply highlight-parentheses--highlight nil) (timer-event-handler [t 26126 44053 245233 nil highlight-parentheses--highlight nil nil 416000 nil])

e (current-buffer) #<buffer Org Agenda(i)>

highlight-parenthesis-mode is off in both agenda buffer and Org buffers. However, it is not off in the minibuffer. Doing something that involves answering prompt triggers the backtrace. (prompt is not the only way the bug is triggered)

My config:

(add-hook 'minibuffer-setup-hook #'highlight-parentheses-minibuffer-setup) (dolist (hook '( prog-mode-hook helpful-mode-hook debugger-mode-hook lisp-interaction-mode-hook)) (add-hook hook #'highlight-parentheses-mode)))

-- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at https://orgmode.org/. Support Org development at https://liberapay.com/org-mode, or support my work at https://liberapay.com/yantar92

~tsdh 9 months ago

Hm, I thought I had fixed that with 6a41199 concerning #6 where the fix is in version 2.2.1 which has appeared on NonGNU ELPA on March, 31st. Is it possible you are still using 2.2.0?

~yantar92 9 months ago

"~tsdh" outgoing@sr.ht writes:

Hm, I thought I had fixed that with 6a41199 concerning #6 where the fix is in version 2.2.1 which has appeared on NonGNU ELPA on March, 31st. Is it possible you are still using 2.2.0?

You are right. I happened to update my packages right in the middle between 2.2.0 and 2.2.1. Judging from the commits, the backtrace I am seeing should be addressed. I will test 2.2.1 version and let you know if any problems remain. Meanwhile, feel free to close the task.

~tsdh 9 months ago

Tassilo Horn referenced this ticket in commit [e6a4ce1].

[e6a4ce1]: https://git.sr.ht/~tsdh/highlight-parentheses.el/commit/e6a4ce1 "Improve fix for #6 by always running the highlight fn in the right buffer"

~tsdh REPORTED CLOSED 9 months ago

Alright. I've just pushed a better fix for #6. Instead of just safe-guarding when the highlighting function is run in the wrong buffer, it's now ensured runs in the correct buffer where the post-command-hook initiated the highlighting, i.e., where the timer eventually calling the highlighting function is started.

It would be nice if you could give that version some testing before I make another ELPA release.

~yantar92 9 months ago

"~tsdh" outgoing@sr.ht writes:

It would be nice if you could give that version some testing before I make another ELPA release.

After two days, I do not see any issues.

-- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at https://orgmode.org/. Support Org development at https://liberapay.com/org-mode, or support my work at https://liberapay.com/yantar92

~tsdh 9 months ago

Thanks! I'll make a new ELPA release then. :-)

Register here or Log in to comment, or comment via email.