Memoize with lease #178

@yurj

Hi!

Use case:

A web API that must return values as quickly as possible, while refreshing them once some time has passed since the last call, so the API stays fast and the data stays reasonably fresh. The data depends on an external service that takes anywhere from one second to several seconds to compute. The data is collected at the beginning of the week through the external service and rarely changes, but when it does, it would be nice to have the change displayed within a few minutes.

memoize_stampede was not suitable because, when the cache entry expires, the caller has to wait for the whole recomputation, so the API would be slow. So I modified memoize_stampede a little, removing the heuristic and adding a lease time parameter. It is not perfect: if the service has not been called for some time, the first call returns the stale cached value and only then triggers the update. But since it is called from JavaScript on some home pages, this should not happen often, and changes in the data are rare, so it is not a problem.

Here is the code (it would be nice if it were included in the recipes!):

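import functools
import threading
import time

# Internal diskcache helpers (diskcache/recipes.py imports the same names);
# their signatures may differ between releases.
from diskcache.core import ENOVAL, args_to_key, full_name


def memoize_lease(cache, expire, lease, name=None, typed=False, tag=None):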
    """Memoizing cache decorator with cache lease.

    The memoization decorator updates the cache entry when the lease expires.
    The decorator always returns from the cache, and calls the function in a
    separate thread only when the lease time has passed. Setting expire to
    None means the caller never waits for the execution of the function once
    a value is cached, always reading from the cache. Suitable for fairly
    long-running (rarely called) functions, for example a long calculation or
    web pages which can change every 5-10 minutes and take some seconds to be
    computed.

    If name is set to None (default), the callable name will be determined
    automatically.

    If typed is set to True, function arguments of different types will be
    cached separately. For example, f(3) and f(3.0) will be treated as distinct
    calls with distinct results.

    The original underlying function is accessible through the `__wrapped__`
    attribute. This is useful for introspection, for bypassing the cache, or
    for rewrapping the function with a different cache.

    >>> from diskcache import Cache
    >>> cache = Cache()
    >>> @memoize_lease(cache, expire=10, lease=1)
    ... def fib(number):
    ...     if number == 0:
    ...         return 0
    ...     elif number == 1:
    ...         return 1
    ...     else:
    ...         return fib(number - 1) + fib(number - 2)
    >>> print(fib(100))
    354224848179261915075

    An additional `__cache_key__` attribute can be used to generate the cache
    key used for the given arguments.

    >>> key = fib.__cache_key__(100)
    >>> del cache[key]

    Remember to call memoize_lease when decorating a callable. If you forget,
    then a TypeError will occur.

    :param cache: cache to store callable arguments and return values
    :param float expire: seconds until arguments expire
    :param float lease: minimum seconds since the last execution before
                        the cached value is refreshed
    :param str name: name given for callable (default None, automatic)
    :param bool typed: cache different types separately (default False)
    :param str tag: text to associate with arguments (default None)
    :return: callable decorator

    """
    # Caution: Nearly identical code exists in Cache.memoize
    def decorator(func):
        "Decorator created by memoize call for callable."
        base = (full_name(func),) if name is None else (name,)

        def timer(*args, **kwargs):
            "Time `func`; return result, time delta, and completion timestamp."
            start = time.time()
            result = func(*args, **kwargs)
            end = time.time()
            return result, end - start, end

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            "Wrapper for callable to cache arguments and return values."
            key = wrapper.__cache_key__(*args, **kwargs)
            trio, expire_time = cache.get(
                key, default=ENOVAL, expire_time=True, retry=True,
            )

            if trio is not ENOVAL:
                result, delta, last_exec = trio
                now = time.time()

                if (now - last_exec) > lease:
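                    # Marker entry so only one refresh runs at a time; it
                    # expires after `delta` seconds (the last run's duration).
                    thread_key = key + (ENOVAL,)
                    thread_added = cache.add(
                        thread_key, None, expire=delta, retry=True,
                    )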

                    if thread_added:
                        # Start thread for early recomputation.
                        def recompute():
                            with cache:
                                trio = timer(*args, **kwargs)
                                cache.set(
                                    key, trio, expire=expire,
                                    tag=tag, retry=True,)
                        thread = threading.Thread(target=recompute)
                        thread.daemon = True
                        thread.start()

                return result

            trio = timer(*args, **kwargs)
            cache.set(key, trio, expire=expire, tag=tag, retry=True)
            return trio[0]

        def __cache_key__(*args, **kwargs):
            "Make key for cache given function arguments."
            return args_to_key(base, args, kwargs, typed)

        wrapper.__cache_key__ = __cache_key__
        return wrapper

    return decorator
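
The serve-stale-then-refresh behaviour of the decorator above can be seen in isolation with a small in-memory sketch. This is not the diskcache recipe: `lease_cached`, its `clock` parameter, the `last_refresh` attribute, and `slow_lookup` are hypothetical names used only for illustration, and a plain dict stands in for the cache (no expiry, no tags, no cross-process locking). A fake clock keeps the demo deterministic.

```python
import threading
import time


def lease_cached(lease, clock=time.monotonic):
    """Serve cached results; refresh in a background thread after `lease` seconds.

    Hypothetical in-memory stand-in for illustration only (no diskcache).
    """
    def decorator(func):
        store = {}          # args -> (result, time of last execution)
        refreshing = set()  # keys with a refresh thread in flight
        lock = threading.Lock()

        def refresh(key):
            try:
                result = func(*key)
                with lock:
                    store[key] = (result, clock())
            finally:
                with lock:
                    refreshing.discard(key)

        def wrapper(*args):
            with lock:
                entry = store.get(args)
                due = (
                    entry is not None
                    and clock() - entry[1] > lease
                    and args not in refreshing
                )
                if due:
                    refreshing.add(args)
            if entry is None:
                # Cold cache: the caller has to wait once.
                result = func(*args)
                with lock:
                    store[args] = (result, clock())
                return result
            if due:
                thread = threading.Thread(
                    target=refresh, args=(args,), daemon=True
                )
                wrapper.last_refresh = thread  # exposed so the demo can join it
                thread.start()
            return entry[0]  # possibly stale, but returned immediately

        wrapper.last_refresh = None
        return wrapper
    return decorator


# Demo with a fake clock so the behaviour is deterministic.
now = [0.0]
calls = []


@lease_cached(lease=5, clock=lambda: now[0])
def slow_lookup(x):
    calls.append(x)
    return len(calls)


first = slow_lookup("a")          # computed synchronously
now[0] = 10.0                     # the lease has passed
stale = slow_lookup("a")          # old value returned, refresh started
slow_lookup.last_refresh.join()   # wait for the background refresh
fresh = slow_lookup("a")          # refreshed value
```

Here `first` and `stale` are both the value from the first execution, and only `fresh` reflects the second one. That is the trade-off described above: after a quiet period, one caller still sees the old value before the update lands.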
