
chu-k/redis-cache

Redis proxy cache service


Quickstart

Running make test from the repo root (as defined in the spec) brings up all containers and executes the tests.

Running make services brings up just the proxy and Redis.

Overview

This system is implemented as a Docker Compose stack containing:

  • the proxy service, a Sanic server that implements an HTTP GET <key> endpoint
  • a Redis backing cache
  • a (py)test container that executes:
    • local cache unit tests (LRU, TTL eviction policies)
    • system-level tests using the 'requests' library

Design

[system diagram]

The web server implements only the GET <key> route.

The GET handler calls the ClientCache get method. ClientCache holds a Redis client and a LocalCache instance, both passed in via its constructor. The LocalCache interface is implemented by a simple fixed-size LRU cache with a global TTL expiry, built on standard Python library data structures (OrderedDict, heapq).
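
A minimal sketch of that read-through flow, assuming hypothetical method names (the actual interfaces live in the repo):

```python
class ClientCache:
    """Illustrative read-through cache client; names are not the repo's exact API."""

    def __init__(self, redis_client, local_cache):
        self.redis = redis_client   # async Redis client
        self.local = local_cache    # TTL/LRU LocalCache instance

    async def get(self, key):
        # 1. Try the local TTL/LRU cache first.
        value = self.local.get(key)
        if value is not None:
            return value
        # 2. On a miss (or expired entry), fall back to the backing Redis.
        value = await self.redis.get(key)
        if value is not None:
            # 3. Populate the local cache so subsequent reads are served locally.
            self.local.set(key, value)
        return value
```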

Cache complexity

  • get is O(1) due to the OrderedDict lookup. Expiration is lazy, i.e. get only checks whether the key is expired; expired keys are not removed until a set call.
  • set is O(log n), due to the call to remove_all_expired. Expiry tracking was initially implemented as a standard dict, but in doing this analysis I realized the worst case was O(n) to iterate over all keys during removal. I ended up refactoring to use the Python standard-library heapq, which improves this to O(log n). A minimal sketch of the data structures is shown after this list.
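
A minimal sketch of the combination described above (OrderedDict for LRU order, a heapq min-heap for expiry tracking); the details are illustrative rather than the repo's exact TTLLRUCache:

```python
import heapq
import time
from collections import OrderedDict

class TTLLRUCache:
    """Illustrative fixed-size LRU cache with a global TTL; not the repo's exact code."""

    def __init__(self, max_keys, ttl_sec):
        self.max_keys = max_keys
        self.ttl_sec = ttl_sec
        self.data = OrderedDict()   # key -> (value, expires_at), kept in LRU order
        self.expiry_heap = []       # (expires_at, key) min-heap for O(log n) expiry removal

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= time.monotonic():
            return None                 # lazy expiration: report a miss, remove later in set()
        self.data.move_to_end(key)      # O(1) LRU bump
        return value

    def set(self, key, value):
        self._remove_all_expired()      # one heappop per expired key: O(log n) each
        expires_at = time.monotonic() + self.ttl_sec
        self.data[key] = (value, expires_at)
        self.data.move_to_end(key)
        heapq.heappush(self.expiry_heap, (expires_at, key))
        while len(self.data) > self.max_keys:
            self.data.popitem(last=False)   # evict the least-recently-used key

    def _remove_all_expired(self):
        now = time.monotonic()
        while self.expiry_heap and self.expiry_heap[0][0] <= now:
            _, key = heapq.heappop(self.expiry_heap)
            entry = self.data.get(key)
            if entry is not None and entry[1] <= now:   # skip stale heap entries
                del self.data[key]
```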

Configuration

Config values as defined in the spec are passed via environment variables:

  • CACHE_TTL_SEC: Cache expiry time
  • CACHE_MAX_KEYS: Cache capacity (number of keys)
  • REDIS_HOSTNAME:REDIS_PORT: Address of backing Redis
  • PROXY_HOST:PROXY_PORT: Proxy listen TCP/IP address and port
  • CONCURRENT_REQUESTS_MAX: Concurrent client request limit

A low-resource .env file is included in the repo, but it should be replaced with appropriate production values. Default fallback values are also provided (proxy/config.py) for when env vars are not present, though truly sensible defaults for the cache configuration would depend on the intended workload.
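
As a rough illustration of the fallback pattern, a config module might look like the following; the default values below are placeholders, not the ones in proxy/config.py:

```python
import os

# Illustrative env-var loading with fallback defaults; the real values and
# structure live in proxy/config.py, and the numbers below are placeholders.
CACHE_TTL_SEC = int(os.environ.get("CACHE_TTL_SEC", 60))
CACHE_MAX_KEYS = int(os.environ.get("CACHE_MAX_KEYS", 1000))
REDIS_HOSTNAME = os.environ.get("REDIS_HOSTNAME", "redis")
REDIS_PORT = int(os.environ.get("REDIS_PORT", 6379))
PROXY_HOST = os.environ.get("PROXY_HOST", "0.0.0.0")
PROXY_PORT = int(os.environ.get("PROXY_PORT", 8000))
CONCURRENT_REQUESTS_MAX = int(os.environ.get("CONCURRENT_REQUESTS_MAX", 10))
```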

Concurrency

Sanic was initially selected as the web framework due to its async implementation (as opposed to, say, Flask). It can handle several client requests concurrently. Access to shared resources (the local cache data structures) is guarded by an asyncio Lock to ensure safe concurrent access. Redis client calls are also async and use a connection pool.
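
A sketch of those two concurrency guards, assuming redis-py's asyncio client and an illustrative wrapper around the local cache:

```python
import asyncio

import redis.asyncio as aioredis

class GuardedLocalCache:
    """Illustrative wrapper that serializes access to the local cache structures."""

    def __init__(self, cache):
        self.cache = cache              # e.g. the TTL/LRU cache sketched earlier
        self.lock = asyncio.Lock()

    async def get(self, key):
        async with self.lock:           # guard the OrderedDict/heapq mutation
            return self.cache.get(key)

    async def set(self, key, value):
        async with self.lock:
            self.cache.set(key, value)

# Async Redis client backed by a connection pool (hostname, port, and pool size are assumptions).
redis_client = aioredis.Redis.from_url("redis://redis:6379", max_connections=10)
```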

The ability to give the Sanic app a --workers flag to scale across multiple cores was intriguing; however, each worker process ends up with a different context, i.e., its own copy of the local cache. There is a documented way to share context, but it requires native multiprocessing types (e.g. multiprocessing.Queue) that must be instantiated at the app level. In other words, the TTLLRUCache local cache object can't be added to the shared context, so only a single server worker process can be used.

Max concurrent requests

I leveraged Sanic middleware to check/update a global concurrent_requests counter, protected by a mutex.
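
A sketch of that middleware approach (handler names, the placeholder limit, and the error payload are assumptions, not the repo's exact code):

```python
import asyncio

from sanic import Sanic
from sanic.response import json

app = Sanic("proxy")

CONCURRENT_REQUESTS_MAX = 10   # placeholder; read from config/env in practice
concurrent_requests = 0
counter_lock = asyncio.Lock()

@app.middleware("request")
async def acquire_slot(request):
    global concurrent_requests
    request.ctx.counted = False
    async with counter_lock:
        if concurrent_requests >= CONCURRENT_REQUESTS_MAX:
            # Returning a response from request middleware short-circuits the handler.
            return json({"error": "too many concurrent requests"}, status=503)
        concurrent_requests += 1
        request.ctx.counted = True

@app.middleware("response")
async def release_slot(request, response):
    global concurrent_requests
    # Only decrement for requests that were actually counted above.
    if getattr(request.ctx, "counted", False):
        async with counter_lock:
            concurrent_requests -= 1
```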

With the current test setup using threading, the maximum number of concurrent active requests is ~5 on my machine due to request/response timing. I'm not aware of a clean way to test different environment-level config values and assert different responses for the same test case. That said, I temporarily set CONCURRENT_REQUESTS_MAX=2 and ran the tests by hand to verify the correct (503) response. This could probably be scripted by passing the variable to Make, e.g. CONCURRENT_REQUESTS_MAX=5 make test, but adding conditional asserts based on an env variable feels clunky.
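
For reference, the env-conditional variant mentioned above might look roughly like this; the endpoint, thresholds, and request counts are assumptions:

```python
import os
from concurrent.futures import ThreadPoolExecutor

import requests

PROXY_URL = "http://proxy:8000"   # assumed service address inside the compose network
LIMIT = int(os.environ.get("CONCURRENT_REQUESTS_MAX", 10))

def test_concurrent_request_limit():
    # Fire many overlapping requests and count 503 rejections.
    with ThreadPoolExecutor(max_workers=LIMIT * 4) as pool:
        futures = [pool.submit(requests.get, f"{PROXY_URL}/somekey") for _ in range(LIMIT * 8)]
        rejected = sum(1 for f in futures if f.result().status_code == 503)
    if LIMIT <= 2:
        assert rejected > 0    # a low limit should reject some overlapping requests
    else:
        assert rejected == 0   # with only ~5 concurrent requests achievable here, a higher limit never trips
```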

Redis Serialization Protocol

The RESP server is implemented separately as a simple asyncio TCP server, but it uses the same ClientCache and is configured similarly. The specification states it needs to handle GET. In testing with redis-py, it also needs to handle the CLIENT SETINFO commands sent on the initial connection; for now the server just responds OK, but it could be extended to track connected-client history and support other CLIENT commands. The server also tracks the number of connected clients, incrementing/decrementing the count in the main connection handler.
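
A minimal sketch of that RESP handling with asyncio.start_server, assuming the async ClientCache.get sketched earlier; the command parsing is simplified and the port is an assumption:

```python
import asyncio

connected_clients = 0

async def read_command(reader):
    r"""Read one RESP array, e.g. *2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n -> [b'GET', b'foo']."""
    header = await reader.readline()                 # e.g. b"*2\r\n"
    if not header or not header.startswith(b"*"):
        return None
    args = []
    for _ in range(int(header[1:])):
        length = int((await reader.readline())[1:])  # bulk string header, e.g. b"$3\r\n"
        args.append((await reader.readexactly(length + 2))[:-2])  # strip trailing \r\n
    return args

async def handle_client(reader, writer, cache):
    global connected_clients
    connected_clients += 1                           # track connected clients
    try:
        while True:
            command = await read_command(reader)
            if not command:
                break
            name = command[0].upper()
            if name == b"GET":
                value = await cache.get(command[1].decode())
                if value is None:
                    writer.write(b"$-1\r\n")         # RESP nil bulk string
                else:
                    data = value if isinstance(value, bytes) else str(value).encode()
                    writer.write(b"$%d\r\n%s\r\n" % (len(data), data))
            elif name == b"CLIENT":
                writer.write(b"+OK\r\n")             # accept CLIENT SETINFO etc.
            else:
                writer.write(b"-ERR unknown command\r\n")
            await writer.drain()
    finally:
        connected_clients -= 1
        writer.close()

async def serve(cache, host="0.0.0.0", port=6380):   # port is an assumption
    server = await asyncio.start_server(lambda r, w: handle_client(r, w, cache), host, port)
    async with server:
        await server.serve_forever()
```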

Improvements

Refs

Time breakdown (approx.)

Thurs/Fri

  • 2 hr pre-dev docs/research on web frameworks, redis, planning
  • 4 hrs initial impl, container stack functioning

Sat

  • 2 hrs setting up/debug test container

Sun

  • 1 hr setup concurrent requests tests
  • 2 hr refactoring/tidying up
  • 2 hr docs, ci
  • 2 hr impl/debug configurable max concurrent requests (incl. digging around in sanic docs)

Mon

  • 2.5 hr reading docs + simple RESP server impl working locally with redis-py
