Feature Implementation Plan: Proactive API Caching Strategy

📋 Todo Checklist

  • [ ] Configure Symfony's Cache component and choose a cache adapter (e.g., Redis, Filesystem).
  • [ ] Identify all high-traffic, read-only API endpoints that would benefit from caching.
  • [ ] Implement caching logic in the repository methods that serve these endpoints.
  • [ ] Implement a cache invalidation strategy to ensure data freshness.
  • [ ] Write integration tests to verify caching behavior and invalidation.
  • [ ] Final Review and Testing

🔍 Analysis & Investigation

Codebase Structure

  • This plan will introduce a new configuration file, config/packages/cache.yaml.
  • It will primarily modify the repository layer (src/Repository/*Repository.php) to add caching logic around read-only database queries.
  • It will also touch entities or admin controllers to handle cache invalidation.

Current Architecture & Problem

  • Problem: The API currently hits the database for every request. For high-traffic, read-only endpoints (like lists of varieties, regions, or top-10 aggregations), this is inefficient and can lead to performance bottlenecks under load.
  • Solution: This plan introduces a proactive caching layer. The results of expensive or frequently requested queries are stored in a fast cache (such as Redis), so repeat requests within the TTL are served without querying the database. This cuts API response times and reduces database load.

Dependencies & Integration Points

  • Symfony Cache: This is the core component (symfony/cache). Its Redis adapter ships with the component but needs a Redis client: either the PHP redis extension or the predis/predis package. A shared cache such as Redis is the right choice for a multi-server environment.
  • Doctrine: The caching logic will wrap Doctrine query results.

Considerations & Challenges

  • Cache Invalidation: This is famously one of the "hard problems" in computer science. A stale cache can be worse than no cache at all. The plan must include a robust strategy for clearing or updating cached data when the underlying information changes.
  • Cache Keys: A consistent and predictable strategy for naming cache keys is essential. Keys should be unique to the query and its parameters to avoid returning the wrong data.
  • Choosing a TTL: The Time-To-Live (TTL) for cached items is a trade-off between performance and data freshness. The plan starts with 1 hour (default_lifetime: 3600); expect to tune this based on real-world usage.
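
One way to keep cache keys unique and predictable is to derive them from the query parameters. The sketch below is only an illustration: buildCacheKey is a hypothetical helper (not part of Symfony or the existing codebase), and the prefix and parameter values are placeholders. It sorts top-level parameters only, so nested arrays would need extra normalization.

```php
<?php
declare(strict_types=1);

/**
 * Build a deterministic cache key from a prefix and the query parameters.
 * Hypothetical helper for illustration; not an existing API.
 */
function buildCacheKey(string $prefix, array $params = []): string
{
    // Sort by parameter name so the same parameters in any order give one key.
    ksort($params);

    // Hashing keeps the key short and free of the PSR-6 reserved characters {}()/\@:
    $hash = hash('sha256', json_encode($params, JSON_THROW_ON_ERROR));

    return $prefix . '.' . substr($hash, 0, 16);
}

// Placeholder parameters, chosen only to show the behavior.
$a = buildCacheKey('varieties_list', ['page' => 2, 'region' => 'example-region']);
$b = buildCacheKey('varieties_list', ['region' => 'example-region', 'page' => 2]);
$c = buildCacheKey('varieties_list', ['page' => 3, 'region' => 'example-region']);

var_dump($a === $b); // same parameters, different order: keys match
var_dump($a === $c); // different page: keys differ
```

A readable prefix makes keys easy to spot when inspecting the cache, while the hash absorbs any number of filter parameters.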

📝 Implementation Plan

Prerequisites

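  • Install the cache component if it is not already present: composer require symfony/cache. The Redis adapter is part of that package; to use it, also enable the PHP redis extension or run composer require predis/predis.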

Step-by-Step Implementation

  1. Configure Symfony Cache

    • Files to create/modify: config/packages/cache.yaml
    • Changes needed: Define one or more cache pools. A general pool for API data is a good start.
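      framework:
          cache:
              pools:
                  api.cache:
                      adapter: cache.adapter.redis # Recommended for production
                      # Redis connection; if omitted, framework.cache.default_redis_provider is used
                      # provider: 'redis://localhost'
                      # Or for local development:
                      # adapter: cache.adapter.filesystem
                      default_lifetime: 3600 # 1 hour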
      
  2. Implement Caching in Repositories

    • Files to modify: All repositories serving high-traffic, read-only API endpoints (e.g., VarietyRepository, RegionRepository, ProcessingMethodRepository).
    • Changes needed:
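      • Inject the configured cache pool into the repository's constructor: private CacheItemPoolInterface $apiCache. With autowiring, the argument name $apiCache is what selects the api.cache pool, so the name must match.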
      • In each read-only method (e.g., findTopByAvailableBeanCount, findByFiltersWithPagination), wrap the core logic in a cache block.
      • Example Implementation:
        public function findTopByAvailableBeanCount(int $limit = 10): array
        {
            // 1. Create a unique key for this specific query
            $cacheKey = 'top_varieties_' . $limit;
            $cachedItem = $this->apiCache->getItem($cacheKey);
        
            // 2. Check if the item is in the cache
            if ($cachedItem->isHit()) {
                return $cachedItem->get();
            }
        
            // 3. If not, run the original query
            $query = $this->createQueryBuilder('v')
                // ... a lot of query logic ...
                ->getQuery();
            $result = $query->getResult();
        
            // 4. Save the result to the cache
            $cachedItem->set($result);
            $this->apiCache->save($cachedItem);
        
            // 5. Return the result
            return $result;
        }
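
The same get-or-compute flow can also be written with the callback-style CacheInterface from Symfony's cache contracts, which folds steps 2 to 5 into a single call. This is a sketch of an alternative, not what the plan prescribes: it assumes symfony/cache is installed via Composer, uses the in-memory ArrayAdapter in place of the real pool, and replaces the Doctrine query with a hypothetical $runQuery closure returning placeholder data.

```php
<?php
declare(strict_types=1);

// Assumption: symfony/cache was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\Cache\Adapter\ArrayAdapter;
use Symfony\Contracts\Cache\CacheInterface;
use Symfony\Contracts\Cache\ItemInterface;

// Hypothetical stand-in for the repository method; $runQuery replaces the Doctrine query.
function findTopVarieties(CacheInterface $cache, callable $runQuery, int $limit = 10): array
{
    return $cache->get(
        'top_varieties_' . $limit,
        function (ItemInterface $item) use ($runQuery, $limit): array {
            // Runs only on a cache miss; the returned value is stored under the key.
            $item->expiresAfter(3600); // 1 hour, mirroring default_lifetime
            return $runQuery($limit);
        }
    );
}

// Count how often the "database" is queried.
$queryCount = 0;
$runQuery = function (int $limit) use (&$queryCount): array {
    $queryCount++;
    return array_slice(['variety-a', 'variety-b', 'variety-c'], 0, $limit); // placeholder rows
};

$cache = new ArrayAdapter();
$first = findTopVarieties($cache, $runQuery, 2);  // miss: runs the query
$second = findTopVarieties($cache, $runQuery, 2); // hit: served from the cache

var_dump($queryCount);
```

In a real repository the type hint would change from CacheItemPoolInterface to CacheInterface; Symfony's pools implement both, so the api.cache pool can be injected either way.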
        
  3. Implement Cache Invalidation

    • Files to modify: Admin CRUD controllers (e.g., VarietyCrudController) or the entities themselves.
    • Changes needed: When data is updated, the relevant cache keys must be cleared.
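    • Strategy 1 (Simpler): In the admin controller, after a successful update/create/delete operation, invalidate the relevant keys. The cache pool must be injected into the controller as well, and because the limit is part of the key, each limit value in use needs its own deleteItem call.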
      // In VarietyCrudController after updating a variety
      public function updateEntity(EntityManagerInterface $entityManager, $entityInstance): void
      {
          parent::updateEntity($entityManager, $entityInstance);
          $this->apiCache->deleteItem('top_varieties_10');
          $this->apiCache->deleteItem('top_varieties_5');
          // ... and any other related keys
      }
      
    • Strategy 2 (More Advanced): Use Doctrine Lifecycle Callbacks or an Event Subscriber that listens for entity changes and invalidates caches. This is cleaner but more complex to set up. The first strategy is sufficient for this plan.
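
Listing every key by hand becomes fragile as more limits and filters appear. Symfony's cache tags offer another option, sketched below as an alternative technique rather than part of this plan: each item is tagged when it is stored, and one invalidateTags call clears every item carrying that tag. The sketch assumes symfony/cache is installed via Composer and uses in-memory adapters; the tag name 'varieties' and the returned strings are placeholders. In the framework configuration, the pool would additionally need tags: true.

```php
<?php
declare(strict_types=1);

// Assumption: symfony/cache was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\Cache\Adapter\ArrayAdapter;
use Symfony\Component\Cache\Adapter\TagAwareAdapter;
use Symfony\Contracts\Cache\ItemInterface;

// Wrap a plain adapter so that items can carry tags.
$cache = new TagAwareAdapter(new ArrayAdapter());

$queryCount = 0;

// Hypothetical loader standing in for a cached repository method.
$loadTop = function (int $limit) use ($cache, &$queryCount): string {
    return $cache->get(
        'top_varieties_' . $limit,
        function (ItemInterface $item) use (&$queryCount, $limit): string {
            $item->tag(['varieties']); // every variety-related item shares this tag
            $queryCount++;
            return "top {$limit} varieties"; // placeholder for the query result
        }
    );
};

$loadTop(5);  // miss
$loadTop(10); // miss
$loadTop(5);  // hit, no new query

// After an admin update: one call clears all items tagged 'varieties'.
$cache->invalidateTags(['varieties']);

$loadTop(5);  // miss again, so fresh data is loaded

var_dump($queryCount);
```

With tags, the admin controller no longer needs to know which limits or filters have been cached.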

Testing Strategy

  • Unit Tests: It's difficult to unit test the cache itself, but you can test the logic that generates cache keys to ensure they are unique and predictable.
  • Integration Tests: This is where caching is primarily tested.
    • Write a test that calls a cached repository method twice. On the second call, assert that the database was not hit again (you can use a Doctrine query logger or a mock cache adapter to verify this).
    • Write a test that first calls a cached endpoint, then performs an action that should invalidate the cache (e.g., update an entity via the admin controller), and then calls the endpoint again, asserting that the new, updated data is returned.
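
The invalidation scenario can be sketched outside a test framework to show the sequence such a test would assert on. This is a minimal illustration under stated assumptions: symfony/cache is installed via Composer, ArrayAdapter replaces the real pool, a plain array stands in for the database, and findVarietyName with its key 'variety_name_1' is hypothetical.

```php
<?php
declare(strict_types=1);

// Assumption: symfony/cache was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Psr\Cache\CacheItemPoolInterface;
use Symfony\Component\Cache\Adapter\ArrayAdapter;

// Hypothetical cached read, following the same PSR-6 steps as the repository example.
function findVarietyName(CacheItemPoolInterface $cache, array $db): string
{
    $item = $cache->getItem('variety_name_1');
    if ($item->isHit()) {
        return $item->get();
    }

    $item->set($db['name']);
    $cache->save($item);

    return $db['name'];
}

$cache = new ArrayAdapter();
$db = ['name' => 'old name'];            // stand-in for the database row

$before = findVarietyName($cache, $db);  // miss: reads the row and caches it

$db['name'] = 'new name';                // simulated admin update
$stale = findVarietyName($cache, $db);   // without invalidation the old value is served

$cache->deleteItem('variety_name_1');    // what the admin controller should do
$fresh = findVarietyName($cache, $db);   // miss again: the updated row is returned

var_dump($before, $stale, $fresh);
```

A real integration test would make the same three reads through the API and assert on the responses, using the test database instead of the array.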

🎯 Success Criteria

  • High-traffic API endpoints respond significantly faster on cache hits than on the first, uncached request.
  • Database load is measurably reduced for read-only operations.
  • Updating data through the admin panel correctly invalidates the relevant caches, ensuring users see fresh data.
  • The caching logic is reliable and does not serve stale data.