[META] Performance #1246

Closed
jsangmeister opened this issue Feb 28, 2022 · 0 comments

jsangmeister commented Feb 28, 2022

This issue serves as a central place for all performance-related ideas and problems. Concrete solutions for specific problems, as well as proposed implementations, can be split out into their own issues to keep better track of them. This issue should not have any linked PRs at any time.

"Big" ideas to improve the performance of the backend:

  • prefetch for all important actions (see below)
  • change relation handling to use list updates instead of normal updates: This would remove most datastore requests from the relation handling. Currently, reverse relations are calculated as follows: Suppose motion/42 was just created for meeting/1 and therefore has to be added to meeting/1/motion_ids. The current value of this field is fetched from the datastore, the new id (42) is added, and the result is written back via an update event. List updates could improve this process since the meeting would no longer have to be fetched. Instead, only a list update with the content {"add": [42]} is generated and sent to the datastore, where the insertion takes place. Related issue: Reduce the number of get(-many) requests #904
  • Replace the types FullQualifiedId and Collection with typed strings, see Performance: Replace FullQualifiedIdentifier structure by typed string #1330
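To make the list update idea more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the event layout, the helper names `build_list_update` and `apply_list_update`, and the in-memory dictionary that stands in for the datastore. It is not the actual backend or datastore API; it only shows that the writer never has to read meeting/1 before adding motion 42.

```python
from typing import Any, Dict, List, Optional

# Hypothetical in-memory stand-in for the datastore, keyed by fqid.
Store = Dict[str, Dict[str, Any]]


def build_list_update(
    fqid: str,
    field: str,
    add: Optional[List[int]] = None,
    remove: Optional[List[int]] = None,
) -> Dict[str, Any]:
    """Build a list update event without fetching the model first.

    The event layout is an assumption made for this sketch.
    """
    return {
        "type": "list_update",
        "fqid": fqid,
        "field": field,
        "add": list(add or []),
        "remove": list(remove or []),
    }


def apply_list_update(store: Store, event: Dict[str, Any]) -> None:
    """Datastore side: merge the event into the stored list."""
    model = store.setdefault(event["fqid"], {})
    current = list(model.get(event["field"]) or [])
    for value in event["add"]:
        if value not in current:  # keep the relation list free of duplicates
            current.append(value)
    current = [value for value in current if value not in event["remove"]]
    model[event["field"]] = current


# Usage: motion/42 was created for meeting/1.
store: Store = {"meeting/1": {"motion_ids": [7, 13]}}
event = build_list_update("meeting/1", "motion_ids", add=[42])
apply_list_update(store, event)
```

The read-modify-write cycle moves into the datastore, so the backend only needs to know the id it wants to add or remove.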

Smaller, more local improvements:

Actual problems encountered (may be investigated to find more ideas on how to improve performance):

Performance ideas for the datastore:

  • add indices to frequently used columns, like meeting_id
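A small, self-contained way to see the effect of such an index is sketched below. It uses Python's built-in `sqlite3` module purely as a stand-in; the table name `models`, its columns, and the index name are assumptions for illustration and do not reflect the real datastore schema or database engine.

```python
import sqlite3

# Hypothetical schema: one row per model with its meeting_id as a column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE models (fqid TEXT PRIMARY KEY, meeting_id INTEGER, data TEXT)"
)
conn.executemany(
    "INSERT INTO models VALUES (?, ?, ?)",
    [(f"motion/{i}", i % 10, "{}") for i in range(1000)],
)

QUERY = "SELECT fqid, data FROM models WHERE meeting_id = ?"


def query_plan(connection: sqlite3.Connection, sql: str, params: tuple) -> str:
    """Return the human-readable plan SQLite chooses for the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description.
    return " ".join(str(row[-1]) for row in rows)


plan_before = query_plan(conn, QUERY, (1,))  # full table scan
conn.execute("CREATE INDEX idx_models_meeting_id ON models (meeting_id)")
plan_after = query_plan(conn, QUERY, (1,))  # lookup via the new index

matching_rows = conn.execute(QUERY, (1,)).fetchall()
```

Comparing `plan_before` and `plan_after` shows whether the filter on meeting_id is served by the index. For the real datastore, the same check would have to be done with the query plan tooling of its own database.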

Done:

  • Reduction of the number of update events: Add a merging of update events after the relation handling #1269
  • Cache layer: This layer would be implemented between the base datastore adapter and the actual datastore. It would cache all results of get or get_many requests to make them available if the same data is requested again in the same action. Care must be taken not to overwrite locked_fields if they might still be needed. Related issue: Reduce the number of get(-many) requests #904
    • prefetch all necessary data: As soon as we have the cache layer, all cache hits are very cheap in comparison to cache misses (where the datastore must be accessed). One idea to increase the number of cache hits is to prefetch all necessary data at the start of each action (depending on the structure of the data, maybe in multiple steps, e.g. if multiple relations must be followed), so that during the actual action, no data is requested from the datastore at all. This is somewhat similar to the prefetching already done in OS3 with Django. @r-peschke even proposed to forbid any datastore access after the prefetch phase, but we will have to see whether this is possible or wanted at all.
    • prefetch complete models: This may be done as an alternative to or together with the previous approach. As soon as a model is requested, it is fetched from the datastore with all fields instead of only the mapped_fields from the request, in order to fill the cache. This should not be done for all models or all fields, since this would result in a huge amount of data, but with smart whitelists it could reduce the number of actual datastore requests. It may just as well be obsolete with the previous approach, since all data is fetched anyway. This must be evaluated before implementation.
  • Add meeting id to all filters where it is possible #1203
  • Possible performance problem with write (single insert vs bulk insert) openslides-datastore-service#178
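The cache layer and the prefetch idea from the list above can be sketched roughly as follows. This is a simplified illustration, not the implemented adapter: the classes `CountingDatastore` and `CachingAdapter`, their method signatures, and the example data are hypothetical, and the handling of locked_fields is left out entirely.

```python
from typing import Any, Dict, Iterable, List


class CountingDatastore:
    """Hypothetical stand-in for the real datastore; counts requests."""

    def __init__(self, models: Dict[str, Dict[str, Any]]) -> None:
        self.models = models
        self.requests = 0

    def get_many(
        self, fqids: List[str], mapped_fields: List[str]
    ) -> Dict[str, Dict[str, Any]]:
        self.requests += 1
        return {
            fqid: {field: self.models[fqid].get(field) for field in mapped_fields}
            for fqid in fqids
            if fqid in self.models
        }


class CachingAdapter:
    """Sits between the action code and the datastore (assumed layout)."""

    def __init__(self, datastore: CountingDatastore) -> None:
        self.datastore = datastore
        self.cache: Dict[str, Dict[str, Any]] = {}

    def prefetch(self, fqids: Iterable[str], mapped_fields: Iterable[str]) -> None:
        """Load everything an action is expected to need in one request."""
        fetched = self.datastore.get_many(list(fqids), list(mapped_fields))
        for fqid, fields in fetched.items():
            self.cache.setdefault(fqid, {}).update(fields)

    def get(self, fqid: str, mapped_fields: List[str]) -> Dict[str, Any]:
        """Serve from the cache; only ask the datastore for missing fields."""
        cached = self.cache.get(fqid, {})
        missing = [field for field in mapped_fields if field not in cached]
        if missing:  # cache miss: one datastore request for the missing fields
            self.prefetch([fqid], missing)
        cached = self.cache.get(fqid, {})
        return {field: cached.get(field) for field in mapped_fields}


# Usage: prefetch at the start of the action, then read without new requests.
datastore = CountingDatastore(
    {
        "meeting/1": {"name": "General assembly", "motion_ids": [42]},
        "motion/42": {"title": "Motion A", "meeting_id": 1},
    }
)
adapter = CachingAdapter(datastore)
adapter.prefetch(["meeting/1", "motion/42"], ["name", "title", "meeting_id"])
motion = adapter.get("motion/42", ["title", "meeting_id"])
```

Forbidding datastore access after the prefetch phase, as proposed above, would correspond to raising an error in `get` instead of falling back to the datastore on a miss.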
@jsangmeister jsangmeister added this to the 4.0 milestone Feb 28, 2022
@rrenkert rrenkert removed this from the 4.0 milestone Nov 3, 2022