[META] Performance #1246

jsangmeister · 2022-02-28T14:55:03Z

This issue serves as a central place for all performance-related ideas and problems. Concrete solutions, whether for specific problems or for proposed implementations, can be split off into their own issues to keep better track of them. This issue should have no linked PRs at any time.

"Big" ideas to improve the performance of the backend:

prefetch for all important actions (see below)
change relation handling to use list updates instead of normal updates: This would remove most datastore requests from the relation handling. Currently, reverse relations are calculated as follows: Suppose motion/42 was just created for meeting/1 and therefore has to be added to meeting/1/motion_ids. The current value of this field is fetched from the datastore, the new id (42) is added, and the result is written back via an update event. List updates could improve this process since the meeting would not have to be fetched. Instead, a list update with the content {"add": [42]} is generated and sent to the datastore, where the insertion takes place. Related issue: Reduce the number of get(-many) requests #904
Replace the types FullQualifiedId and Collection with typed strings, see Performance: Replace FullQualifiedIdentifier structure by typed string #1330
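
The list update idea above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory stand-in for the datastore and a hypothetical helper `apply_list_update`; neither is the actual datastore API. The `"remove"` key is also an assumption, since only `{"add": [42]}` is mentioned above. The point is that the caller never reads `meeting/1` before sending the update:

```python
from typing import Any, Dict, List

# Hypothetical in-memory stand-in for the datastore; keys are fqids.
Datastore = Dict[str, Dict[str, Any]]


def apply_list_update(
    store: Datastore, fqid: str, field: str, update: Dict[str, List[int]]
) -> None:
    """Apply a list update on the datastore side.

    The caller only sends the delta, e.g. {"add": [42]}, and does not
    need to fetch the current value of the field first.
    """
    model = store.setdefault(fqid, {})
    current = list(model.get(field, []))
    # Append new ids, skipping ones that are already present.
    for value in update.get("add", []):
        if value not in current:
            current.append(value)
    # "remove" is an assumed counterpart to "add".
    removed = set(update.get("remove", []))
    model[field] = [value for value in current if value not in removed]


# motion/42 was just created for meeting/1: only the delta is sent.
store: Datastore = {"meeting/1": {"motion_ids": [7, 13]}}
apply_list_update(store, "meeting/1", "motion_ids", {"add": [42]})
print(store["meeting/1"]["motion_ids"])  # [7, 13, 42]
```

With normal updates, the same change needs one get request for `meeting/1/motion_ids` plus one update event carrying the full list; with the list update only the write remains.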

Smaller, more local improvements:

Actual problems encountered (may be investigated to find more ideas on how to improve performance):

Performance on motion.delete with a "big" database #1172
Optimize performance of bulk actions #1052, which was already a summary of:
- deleting a poll with 2000 votes does not work #997
- Optimise Poll stopping #1036

Performance ideas for the datastore:

add indices to certain frequently used columns, like meeting_id

Done:

Reduction of amount of update events: Add a merging of update events after the relation handling #1269
Cache layer: This layer would be implemented between the base datastore adapter and the actual datastore. It would cache all results of get or get_many requests to make them available if the same data is requested again in the same action. Care has to be taken not to overwrite locked_fields if they might be needed. Related issue: Reduce the number of get(-many) requests #904
- prefetch all necessary data: As soon as we have the cache layer, all cache hits are very cheap in comparison to cache misses (where the datastore must be accessed). One idea to increase the number of cache hits is to prefetch all necessary data at the start of each action (depending on the structure of the data, maybe in multiple steps, e.g. if multiple relations must be followed), so that during the actual action, no data is requested from the datastore at all. This is somewhat similar to the prefetching already done in OS3 with Django. @r-peschke even proposed to forbid any datastore access after the prefetch phase, but we will have to see if this is possible or wanted at all.
- prefetch complete models: This may be done alternatively or together with the previous approach. As soon as a model is requested, it is fetched from the datastore with all fields instead of only the mapped_fields from the request, in order to fill the cache. This should not be done for all models or all fields, since this would result in a huge amount of data, but maybe through smart whitelists this could reduce the number of actual datastore requests. It may just as well be obsolete with the previous approach, since all data is fetched anyway. This must be evaluated before implementation.
Add meeting id to all filters where it is possible #1203
Possible performance problem with write (single insert vs bulk insert) openslides-datastore-service#178
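
The cache layer and the prefetch idea from the list above can be sketched roughly as follows. This is not the actual implementation: `CountingDatastore`, `CachingAdapter` and `prefetch` are hypothetical names, the sample models are made up, and `locked_fields` handling is left out entirely. The sketch only shows that after a prefetch phase, the action itself causes no further datastore requests:

```python
from typing import Any, Dict, List


class CountingDatastore:
    """Hypothetical datastore stand-in that counts how often it is accessed."""

    def __init__(self, data: Dict[str, Dict[str, Any]]) -> None:
        self.data = data
        self.requests = 0

    def get(self, fqid: str, mapped_fields: List[str]) -> Dict[str, Any]:
        self.requests += 1
        model = self.data[fqid]
        return {field: model[field] for field in mapped_fields if field in model}


class CachingAdapter:
    """Sits between the caller and the datastore and caches fields per fqid."""

    def __init__(self, datastore: CountingDatastore) -> None:
        self.datastore = datastore
        self.cache: Dict[str, Dict[str, Any]] = {}

    def get(self, fqid: str, mapped_fields: List[str]) -> Dict[str, Any]:
        cached = self.cache.setdefault(fqid, {})
        # Only fields that are not cached yet cause a datastore request.
        # Simplification: fields absent from the model are never cached
        # and would be requested again on every call.
        missing = [field for field in mapped_fields if field not in cached]
        if missing:
            cached.update(self.datastore.get(fqid, missing))
        return {field: cached[field] for field in mapped_fields if field in cached}

    def prefetch(self, requests: Dict[str, List[str]]) -> None:
        """Fill the cache at the start of an action."""
        for fqid, fields in requests.items():
            self.get(fqid, fields)


# Made-up sample data for illustration.
backend = CountingDatastore(
    {
        "motion/42": {"title": "Budget", "meeting_id": 1},
        "meeting/1": {"name": "AGM", "motion_ids": [42]},
    }
)
adapter = CachingAdapter(backend)

# Prefetch phase: one request per model.
adapter.prefetch({"motion/42": ["title", "meeting_id"], "meeting/1": ["motion_ids"]})
requests_after_prefetch = backend.requests

# Action phase: both lookups are cache hits.
motion = adapter.get("motion/42", ["meeting_id"])
meeting = adapter.get(f"meeting/{motion['meeting_id']}", ["motion_ids"])
assert backend.requests == requests_after_prefetch
```

Forbidding datastore access after the prefetch phase, as proposed above, would in this sketch amount to raising an error in `get` whenever `missing` is non-empty once the action has started.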


jsangmeister added the performance label Feb 28, 2022

jsangmeister added this to the 4.0 milestone Feb 28, 2022

jsangmeister mentioned this issue Mar 1, 2022

Optimise Poll stopping #1036

Closed

rrenkert removed this from the 4.0 milestone Nov 3, 2022

jsangmeister closed this as completed Dec 12, 2023
