Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There should be a simpler alternative to the(.*)? pattern #1301

Open
zcrittendon opened this issue Feb 8, 2022 · 9 comments
Open

There should be a simpler alternative to the(.*)? pattern #1301

zcrittendon opened this issue Feb 8, 2022 · 9 comments
Labels
discussion This problem still needs more feedback

Comments

@zcrittendon
Copy link

zcrittendon commented Feb 8, 2022

Version

4.0.12

Reproduction link

jsfiddle.net

Steps to reproduce

  1. Create route containing multiple (.*)* patterns. For example path: '/:seoPath(.*)*/c/:categoryId/:facetsPath(.*)*'
  2. Trigger processing of route containing the (.*)* pattern with a long path matching the pattern.
  3. Observe a delay of seconds (~ 20 path segments) or minutes (~ 30 path segments)

This delay occurs when rendering router-link as well as during route navigation.

This fiddle has ~20 path segments and shows a 1-second delay when loading the page or clicking the link:
https://jsfiddle.net/zcrittendon/j1fmu680/2/

This fiddle has ~30 path segments and shows shows a multi-minute delay (or browser hang) when loading the page:
https://jsfiddle.net/zcrittendon/dmzjvq56/8/

This fiddle has has ~30 path segments, but has no delay because it uses pattern (.*)? instead of (.*)*
https://jsfiddle.net/zcrittendon/9ze24hx7/5/

What is expected?

Router performance should not be longer than ~1 second even for very long wildcard URL paths.

What is actually happening?

Router performance is extremely long (or hands) for very long wildcard URL paths.

@posva
Copy link
Member

posva commented Feb 8, 2022

If you don't need the array format you should just use (.*)?, otherwise the generated regexp becomes too complex which is what makes it slow.

@posva posva closed this as completed Feb 8, 2022
@posva
Copy link
Member

posva commented Feb 8, 2022

Although I think we could maybe introduce a new syntax: * to just match anything without creating a complex regex group. This needs more thought.

@posva posva reopened this Feb 8, 2022
@posva posva changed the title Poor performance for long routes containing (.*)* pattern There should be a simpler alternative to the(.*)? pattern Feb 8, 2022
@posva posva added the discussion This problem still needs more feedback label Feb 8, 2022
@Mash19
Copy link

Mash19 commented Apr 11, 2022

I was a little confused by the grouping but looking at the fiddles, I'm seeing that the slashes aren't part of the regex used for the parameters. I guess they are being handled internally, so instead of using the matchAll metacharacter you can use a character class and just include the characters you using in your route. This performs pretty well even with the greedy regex pattern.
For the jsfiddle examples replacing the path with
path: '/:seoPath([a-z]*)*/c/:categoryId/:facetsPath([-\\w]*)*',
There is no hang loading or with the long route and the submatches are in the parameter array.
Of course the regex is customized to the routes in the examples, and could be a little broader [a-zA-Z] instead of [a-z] to handle upper and lower case. Or [\w] if you are matching alphanumeric like the facetsPath parameter.
There are a few other characters that are valid in an URL path that could be added but you'd probably be better off keeping it simple and only adding what you actually need.

@posva posva moved this to 🆕 Triaging in Vue Router Roadmap Jun 10, 2024
@Tofandel
Copy link

Tofandel commented Sep 19, 2024

I would also add that the pattern should allow for lazy quantifiers, right now any param match will always be greedy and this is very problematic

https://paths.esm.dev/?p=AAMeJSyA3BXABQKmLXANglpgEHQH0R4wAcl3LC4AjAFI8ASkAvA6gKgA4AeDEgDKz3ETgUMBAcAC0BpguIdQAABA&t=/hu/termek-kategoria/page/2/#

The previous path would work if I was allowed to add ?? as a param quantifier to produce the following regex: /^\/hu\/termek-kategoria(?:\/(.*?))??(?:\/(page\/\d+))?\/?$/

And if anything this should be the default, greedy matching makes no sense for a router

https://github.com/vuejs/router/blob/main/packages/router/src/matcher/pathParserRanker.ts#L190

This should basically be replaced from += '?' to += '??' or we should allow the tokenizer to have multiple tokens in the regex part

@Tofandel
Copy link

Tofandel commented Oct 7, 2024

If you don't need the array format you should just use (.*)?, otherwise the generated regexp becomes too complex which is what makes it slow.

I would point out that this opens the router to a bug where navigating will url encode the parameter

Eg: You are on path /some/nested/path/c/123/facet, running router.replace({query: { test: 'test' }) will result in the path becoming /some%2Fnested%2Fpath/c/123/facet?test=test

#2291

Copy link
Member

posva commented Oct 7, 2024

FYI the regexp are already non greedy (.*? and .+?). Regarding the problem the other bug, in your case you can workaround it by explicitely reusing the route

router.replace({
    ...router.currentRoute.value,
    query: { n: Date.now() }
  })

@Tofandel
Copy link

Tofandel commented Oct 7, 2024

FYI the regexp are already non greedy (.*? and .+?)

That's not true because our patterns are wrapped in (?:\/(.*?))? by vue-router

This is the generated regex from the tool

/^\/hu\/termek-kategoria(?:\/(.*?))?(?:\/(page\/\d+))?\/?$/i

As you can see it is greedy:
image
https://regex101.com/r/psZG8r/1

And because of that page/2 goes into the wrong parameter

This is what it needs to be lazy
https://regex101.com/r/YDu8Dx/1
image

But we don't have any way to modify this quantifier because it is added by the core there

Maybe we can tweak it? If the last char of the regex is a ? then we add 2 ?? to the quantifier to respect the users wish for the regex to be lazy

This would result in (.*) still being (?:\/(.*))? but (.*?) becomes (?:\/(.*?))??

@Tofandel
Copy link

Tofandel commented Oct 7, 2024

There also seems to be a problem in the path ranking if I split it into 2 routes with a trailing slash (which is one of our requirement)

https://paths.esm.dev/?p=AAMeJUyB3A7ACAROgDYMTAHuJXIFqHQAUAFACTg_OQA5INikft_qADWg4AUBBfMMSSMghySCcrsAAA..&t=/en/brands/abcd/efh/page/2#

Without a trailing slash the ranking is correct, I think the trailingslash should have a lower rank than a regex
https://paths.esm.dev/?p=AAMeJUyB3A7ACAROgDYMTAHuJXIFqHQAUAFACTg_OQA5INikft_qADWgAQVDQMk8w6RBgNwSAIA.&t=/en/brands/abcd/efh/page/2#

Effectively I'm doomed

@posva
Copy link
Member

posva commented Oct 7, 2024

You have multiple ways of handling it. Maybe a query param for the page instead of a param otherwise your routes are indeed redundant and the trailing slash makes it more specific. In your case the upcoming new custom matches could help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion This problem still needs more feedback
Projects
Status: 🆕 Triaging
Development

No branches or pull requests

4 participants