deal with erroneous/outdated cached data #75

Open · 3 of 7 tasks
schlegelp opened this issue Aug 1, 2018 · 4 comments

schlegelp (Collaborator) commented Aug 1, 2018

Examples:

  1. Pulling neurons with annotation X, then adding that annotation to another neuron (in CATMAID), will not change subsequent queries from pymaid, because the now-outdated cached response is reused.
  2. CatmaidNeuron.reload() also uses cached data. Is this working as intended?
  3. Querying for a non-existent annotation throws an error. Even if that annotation is later added, the query will still fail due to the cached response.

Possible solutions:

  • Exclude some URLs (annotations, tags, etc.) from caching - most definitely everything that writes to the CATMAID database (thanks for flagging, Greg!). Could use a @no_cache decorator (a minimal sketch follows this list).
  • Check if an error is raised before caching the server response (e.g. if an annotation is not found).
  • Allow relevant functions (e.g. .reload) to override the caching=True decorator.
  • Allow relevant functions to force a cache update.
  • Add clear_cached_response() analogous to get_cached_response().
  • Raise an Exception/Warning if data might be outdated.
  • Add a pymaid.clear_cache() function and refer to it in the "Cached data used" message.
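
A minimal sketch of how such a @no_cache decorator could look, assuming the wrapped function receives the connection as a `remote_instance` keyword and that the instance exposes a boolean `caching` flag - both are illustrative assumptions, not pymaid's confirmed API:

```python
import functools

def no_cache(func):
    """Temporarily disable response caching while ``func`` runs."""
    @functools.wraps(func)
    def wrapper(*args, remote_instance=None, **kwargs):
        # Remember the current setting so we can restore it afterwards
        old_setting = getattr(remote_instance, 'caching', False)
        try:
            remote_instance.caching = False  # bypass the cache for this call
            return func(*args, remote_instance=remote_instance, **kwargs)
        finally:
            remote_instance.caching = old_setting  # always restore, even on error
    return wrapper
```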

jefferis (Contributor) commented Aug 1, 2018

https://martinfowler.com/bliki/TwoHardThings.html

But more positively, my own use case for a cache was very specific - to act as a snapshot of catmaid state at a given time, associated with a full run of a single markdown analysis. However, initial attempts to share a cache even across analyses failed. And none of those were writing to the catmaid database.

I think you probably want to define what your target use cases are and then either cache the most expensive stuff or try and cater for read-only situations completely.

schlegelp (Collaborator, Author) commented

What seemed to be the problem when sharing it across analyses, @jefferis?

jefferis (Contributor) commented Aug 2, 2018

So I'm not quite sure yet, but I think it was one of two things: caching the login object, or annotation requests where somehow different POST requests were being aliased to the same hash value (which defines the on-disk response).
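
Illustrative only: one way to avoid that kind of aliasing is to fold the POST payload into the cache key rather than hashing the URL alone. `make_cache_key` is a hypothetical helper, not a pymaid internal:

```python
import hashlib
import json

def make_cache_key(url, post_data=None):
    """Hash the URL plus a canonicalised POST payload so that requests to
    the same URL with different bodies never collide in the cache."""
    # sort_keys makes the serialisation order-independent for dicts
    payload = json.dumps(post_data, sort_keys=True) if post_data else ''
    return hashlib.sha256((url + payload).encode('utf-8')).hexdigest()
```

Two POSTs to the same endpoint with different annotation payloads would then map to distinct on-disk responses.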

schlegelp (Collaborator, Author) commented Aug 5, 2018

Made some improvements to the caching. From commit 5c3c8fd:

  • new decorator @no_caching: prevents caching for a function. Currently used for remove_annotations, add_annotations, delete_neuron, delete_tags, add_tags and rename_neurons.
  • new decorator @no_cache_on_error: if the function fails, purges server responses cached while that function ran - will not touch the pre-existing cache (i.e. won't help if the existing cache was the reason for failure).
  • new decorator @retry_clear_cache: if the function fails, clears the given URL from the cache and retries. Unfortunately does not work with POST requests.
  • new decorator @retry_no_caching: retries a function without caching if it fails (sketched after this list).
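
For illustration, a minimal sketch of what @retry_no_caching could look like, under the same assumptions as the @no_cache sketch above (a `remote_instance` keyword exposing a boolean `caching` flag - hypothetical names, not the actual implementation in 5c3c8fd):

```python
import functools

def retry_no_caching(func):
    """Retry a failed call once with caching disabled."""
    @functools.wraps(func)
    def wrapper(*args, remote_instance=None, **kwargs):
        try:
            # First attempt: may be served from (possibly stale) cache
            return func(*args, remote_instance=remote_instance, **kwargs)
        except Exception:
            old_setting = getattr(remote_instance, 'caching', False)
            try:
                remote_instance.caching = False  # second attempt hits the server
                return func(*args, remote_instance=remote_instance, **kwargs)
            finally:
                remote_instance.caching = old_setting
    return wrapper
```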

The main problem here is how to deal with cases where the existing cache is the reason for failure. That's because we don't know what the function requested from the server. A possible solution is to (a) let the cache keep a log of requests and (b) let the decorator track that log while the function runs - upon failure, delete all cached URLs that were requested as per the log.
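
A sketch of that request-log idea: the cache records every key it serves, and on failure the decorator purges exactly those entries before re-raising. All names here (LoggingCache, purge_on_error, the `cache` keyword) are hypothetical:

```python
import functools

class LoggingCache(dict):
    """Dict-backed response cache that logs every key it serves."""
    def __init__(self):
        super().__init__()
        self.request_log = []

    def __getitem__(self, key):
        self.request_log.append(key)  # (a) keep a log of requests
        return super().__getitem__(key)

def purge_on_error(func):
    """(b) On failure, delete all cache entries requested during the call."""
    @functools.wraps(func)
    def wrapper(*args, cache=None, **kwargs):
        log_start = len(cache.request_log)  # only track this call's requests
        try:
            return func(*args, cache=cache, **kwargs)
        except Exception:
            for key in cache.request_log[log_start:]:
                cache.pop(key, None)  # drop entries that may have caused the failure
            raise  # re-raise so the caller can retry against a clean cache
    return wrapper
```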
