deal with erroneous/outdated cached data #75

Open · 3 of 7 tasks
schlegelp opened this issue Aug 1, 2018 · 4 comments

schlegelp (Collaborator) commented Aug 1, 2018

Examples:

  1. Pulling neurons with annotation X, then adding that annotation to another neuron (in CATMAID), will not change subsequent queries from pymaid, because the now-outdated cached response is reused.
  2. CatmaidNeuron.reload() also uses cached data. Is this working as intended?
  3. Querying for a non-existent annotation throws an error. Even if that annotation is later added, the query will still fail due to the cached response.

Possible solutions:

  • Exclude some URLs (annotations, tags, etc.) from caching - most definitely everything that writes to the CATMAID database (thanks for flagging, Greg!). Could use a @no_cache decorator (a minimal sketch follows this list).
  • Check if an error is raised before caching the server response (e.g. if an annotation is not found).
  • Allow relevant functions (e.g. .reload) to override the caching=True decorator.
  • Allow relevant functions to force a cache update.
  • Add clear_cached_response() analogous to get_cached_response().
  • Raise an Exception/Warning if data might be outdated.
  • Add a pymaid.clear_cache() function and refer to it in the "Cached data used" message.
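
A minimal sketch of how such a @no_cache decorator could look, assuming the wrapped function receives the connection as a `remote_instance` keyword and that the instance exposes a boolean `caching` flag - both are illustrative assumptions, not pymaid's confirmed API:

```python
import functools

def no_cache(func):
    """Temporarily disable response caching while ``func`` runs."""
    @functools.wraps(func)
    def wrapper(*args, remote_instance=None, **kwargs):
        # Remember the current setting so we can restore it afterwards
        old_setting = getattr(remote_instance, 'caching', False)
        try:
            remote_instance.caching = False  # bypass the cache for this call
            return func(*args, remote_instance=remote_instance, **kwargs)
        finally:
            remote_instance.caching = old_setting  # always restore, even on error
    return wrapper
```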

jefferis (Contributor) commented Aug 1, 2018

https://martinfowler.com/bliki/TwoHardThings.html

But more positively, my own use case for a cache was very specific - to act as a snapshot of catmaid state at a given time, associated with a full run of a single markdown analysis. However, initial attempts to share a cache even across analyses failed. And none of those were writing to the catmaid database.

I think you probably want to define what your target use cases are and then either cache the most expensive stuff or try and cater for read-only situations completely.

schlegelp (Collaborator, Author) commented

What seemed to be the problem when sharing it across analyses, @jefferis?

jefferis (Contributor) commented Aug 2, 2018

So I'm not quite sure yet, but I think it was one of two things: caching the login object, or annotation requests where somehow different POST requests were being aliased to the same hash value (which defines the on-disk response).
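
Illustrative only: one way to avoid that kind of aliasing is to fold the POST payload into the cache key rather than hashing the URL alone. `make_cache_key` is a hypothetical helper, not a pymaid internal:

```python
import hashlib
import json

def make_cache_key(url, post_data=None):
    """Hash the URL plus a canonicalised POST payload so that requests to
    the same URL with different bodies never collide in the cache."""
    # sort_keys makes the serialisation order-independent for dicts
    payload = json.dumps(post_data, sort_keys=True) if post_data else ''
    return hashlib.sha256((url + payload).encode('utf-8')).hexdigest()
```

Two POSTs to the same endpoint with different annotation payloads would then map to distinct on-disk responses.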

schlegelp (Collaborator, Author) commented Aug 5, 2018

Made some improvements to the caching. From commit 5c3c8fd:

  • new decorator @no_caching: prevents caching for a function. Currently used for remove_annotations, add_annotations, delete_neuron, delete_tags, add_tags and rename_neurons.
  • new decorator @no_cache_on_error: if the function fails, purges server responses cached while that function ran - will not touch the pre-existing cache (i.e. won't help if the existing cache was the reason for failure).
  • new decorator @retry_clear_cache: if the function fails, clears the given URL from the cache and retries. Unfortunately does not work with POST requests.
  • new decorator @retry_no_caching: retries a function without caching if it fails (sketched after this list).
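
For illustration, a minimal sketch of what @retry_no_caching could look like, under the same assumptions as the @no_cache sketch above (a `remote_instance` keyword exposing a boolean `caching` flag - hypothetical names, not the actual implementation in 5c3c8fd):

```python
import functools

def retry_no_caching(func):
    """Retry a failed call once with caching disabled."""
    @functools.wraps(func)
    def wrapper(*args, remote_instance=None, **kwargs):
        try:
            # First attempt: may be served from (possibly stale) cache
            return func(*args, remote_instance=remote_instance, **kwargs)
        except Exception:
            old_setting = getattr(remote_instance, 'caching', False)
            try:
                remote_instance.caching = False  # second attempt hits the server
                return func(*args, remote_instance=remote_instance, **kwargs)
            finally:
                remote_instance.caching = old_setting
    return wrapper
```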

The main problem here is how to deal with cases where the existing cache is the reason for failure. That's because we don't know what the function requested from the server. A possible solution is to (a) let the cache keep a log of requests and (b) let the decorator track that log while the function runs - upon failure, delete all cached URLs that were requested as per the log.
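
A sketch of that request-log idea: the cache records every key it serves, and on failure the decorator purges exactly those entries before re-raising. All names here (LoggingCache, purge_on_error, the `cache` keyword) are hypothetical:

```python
import functools

class LoggingCache(dict):
    """Dict-backed response cache that logs every key it serves."""
    def __init__(self):
        super().__init__()
        self.request_log = []

    def __getitem__(self, key):
        self.request_log.append(key)  # (a) keep a log of requests
        return super().__getitem__(key)

def purge_on_error(func):
    """(b) On failure, delete all cache entries requested during the call."""
    @functools.wraps(func)
    def wrapper(*args, cache=None, **kwargs):
        log_start = len(cache.request_log)  # only track this call's requests
        try:
            return func(*args, cache=cache, **kwargs)
        except Exception:
            for key in cache.request_log[log_start:]:
                cache.pop(key, None)  # drop entries that may have caused the failure
            raise  # re-raise so the caller can retry against a clean cache
    return wrapper
```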
