Phase 3: Image Caching & Proxy¶
Priority: Later
Complexity: High
Dependencies: None (but benefits from Phase 2 for validation integration)
Goal¶
Cache external images in S3/MinIO, transform to optimized formats, and serve through our infrastructure.
Design Decisions¶
| Decision | Choice | Rationale |
|---|---|---|
| Storage backend | S3/MinIO | Scalable, CDN-ready for future |
| Image transformation | Optimize + thumbnails | WebP conversion, multiple sizes |
| API response | Replace imageUrl | Proxy URL replaces original in API response |
Storage: S3/MinIO with Flysystem¶
Use league/flysystem-aws-s3-v3 for abstraction:
- Development: MinIO container
- Production: S3 or S3-compatible storage
- Easy CDN integration later (CloudFront, etc.)
Bucket Structure¶
images/
├── original/
│   └── {bean_uuid}.{ext}      # Original fetched image
├── optimized/
│   └── {bean_uuid}.webp       # Full-size WebP
└── thumbnails/
    ├── {bean_uuid}_sm.webp    # 150x150
    └── {bean_uuid}_md.webp    # 400x400
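The layout above maps each bean UUID and variant to exactly one object key. A minimal sketch of that mapping follows. The helper name `imageObjectKey` and the variant labels `original` and `full` are illustrative assumptions, and `images/` is treated as a key prefix (drop it if `images` is the bucket name).

```php
<?php
declare(strict_types=1);

/**
 * Build the object key for one stored image, following the bucket layout.
 * Hypothetical helper: the name and the variant labels are assumptions.
 *
 * @param string      $variant 'original', 'full', 'sm' or 'md'
 * @param string|null $ext     file extension, required only for 'original'
 */
function imageObjectKey(string $beanUuid, string $variant, ?string $ext = null): string
{
    if ($variant === 'original') {
        if ($ext === null) {
            throw new InvalidArgumentException('Original images need an extension.');
        }
        // Accept both "jpg" and ".jpg".
        return sprintf('images/original/%s.%s', $beanUuid, ltrim($ext, '.'));
    }

    return match ($variant) {
        'full' => sprintf('images/optimized/%s.webp', $beanUuid),
        'sm', 'md' => sprintf('images/thumbnails/%s_%s.webp', $beanUuid, $variant),
        default => throw new InvalidArgumentException("Unknown variant: {$variant}"),
    };
}
```

Keeping the mapping in one place means the storage service and the proxy controller cannot drift apart on naming.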
Entity: CachedImage¶
// Namespace and imports assume the default Symfony App\ layout and symfony/uid.
namespace App\Entity;

use App\Enum\CachedImageStatus;
use App\Repository\CachedImageRepository;
use DateTimeImmutable;
use Doctrine\ORM\Mapping as ORM;
use Symfony\Component\Uid\Uuid;

#[ORM\Entity(repositoryClass: CachedImageRepository::class)]
class CachedImage
{
    #[ORM\Id]
    #[ORM\Column(type: 'uuid')]
    private Uuid $id;

    #[ORM\OneToOne(targetEntity: CoffeeBean::class)]
    #[ORM\JoinColumn(nullable: false, onDelete: 'CASCADE')]
    private CoffeeBean $coffeeBean;

    #[ORM\Column(length: 255)]
    private string $originalUrl; // Source URL

    #[ORM\Column(length: 64)]
    private string $originalUrlHash; // SHA256 (hex) of the source URL, for dedup

    #[ORM\Column(length: 64, nullable: true)]
    private ?string $contentHash = null; // SHA256 (hex) of the image bytes

    #[ORM\Column(type: 'datetime_immutable')]
    private DateTimeImmutable $cachedAt;

    #[ORM\Column(type: 'datetime_immutable', nullable: true)]
    private ?DateTimeImmutable $lastValidatedAt = null;

    #[ORM\Column(enumType: CachedImageStatus::class)]
    private CachedImageStatus $status; // PENDING, CACHED, FAILED, STALE

    #[ORM\Column(length: 50, nullable: true)]
    private ?string $originalMimeType = null;

    #[ORM\Column(nullable: true)]
    private ?int $originalSize = null; // Bytes

    #[ORM\Column(nullable: true)]
    private ?int $optimizedSize = null; // Bytes

    #[ORM\Column(type: 'json', nullable: true)]
    private ?array $variants = null; // ['sm' => true, 'md' => true, 'full' => true]
}
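Both hash columns are 64 characters long, which matches a hex-encoded SHA-256 digest. A small sketch of how the two values could be computed; the function names are hypothetical, and trimming the URL before hashing is an assumption (whatever normalization is chosen has to be applied consistently, or dedup lookups will miss).

```php
<?php
declare(strict_types=1);

/** SHA-256 of the source URL as 64 lowercase hex characters (fits length: 64). */
function hashOriginalUrl(string $url): string
{
    // Assumption: surrounding whitespace is not significant in a stored URL.
    return hash('sha256', trim($url));
}

/** SHA-256 of the fetched image bytes, intended for CachedImage::$contentHash. */
function hashImageContent(string $bytes): string
{
    return hash('sha256', $bytes);
}
```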
Image Transformation: Intervention Image¶
Use intervention/image with GD or Imagick driver:
- Convert to WebP (80% quality)
- Generate thumbnail sizes: 150x150 (sm), 400x400 (md)
- Preserve aspect ratio with cover/contain
Sizes Configuration¶
// config/packages/image_cache.php or service parameter
'image_variants' => [
    'sm'   => ['width' => 150,  'height' => 150,  'fit' => 'cover'],
    'md'   => ['width' => 400,  'height' => 400,  'fit' => 'contain'],
    'full' => ['width' => 1200, 'height' => 1200, 'fit' => 'contain'],
],
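The two fit modes produce different output dimensions: `cover` always fills the box (cropping the overflow), while `contain` scales the image to fit inside it. The sketch below computes only the resulting size, independent of the image library. The function name is hypothetical, and never upscaling in `contain` mode is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Compute the output size of one variant for a given source image.
 * Hypothetical helper; $variant has the shape used in image_variants.
 *
 * @param array{width: int, height: int, fit: string} $variant
 * @return array{width: int, height: int}
 */
function variantOutputSize(int $srcW, int $srcH, array $variant): array
{
    if ($srcW < 1 || $srcH < 1) {
        throw new InvalidArgumentException('Source dimensions must be positive.');
    }

    if ($variant['fit'] === 'cover') {
        // cover: the box is filled completely, so the output equals the box.
        return ['width' => $variant['width'], 'height' => $variant['height']];
    }

    // contain: keep the aspect ratio and fit inside the box.
    // Assumption: images smaller than the box are not upscaled (scale <= 1).
    $scale = min($variant['width'] / $srcW, $variant['height'] / $srcH, 1.0);

    return [
        'width' => max(1, (int) round($srcW * $scale)),
        'height' => max(1, (int) round($srcH * $scale)),
    ];
}
```

Under these assumptions a 2400x1600 source yields a 400x267 `md` variant, so the "400x400" label is an upper bound for `contain` variants rather than an exact size.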
Cache Invalidation Strategy¶
- Time-based: Re-validate after 7 days (via Phase 2 job if available)
- On broken detection: Phase 2 marks as STALE, triggers re-fetch
- Manual: Admin action to invalidate and re-cache
- Source URL change: If CoffeeBean.imageUrl changes, invalidate
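The time-based and URL-change rules above can be expressed as one check. This is a sketch only: the function name and parameters are hypothetical, and it assumes `originalUrlHash` is a SHA-256 of the URL exactly as stored on the bean.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a cached image should be re-validated or re-fetched.
 * Hypothetical helper covering the time-based and source-URL-change rules.
 */
function needsRevalidation(
    DateTimeImmutable $cachedAt,
    ?DateTimeImmutable $lastValidatedAt,
    string $storedUrlHash,
    string $currentImageUrl,
    DateTimeImmutable $now,
    int $maxAgeDays = 7,
): bool {
    // Source URL change: the bean now points at a different image.
    if (!hash_equals($storedUrlHash, hash('sha256', $currentImageUrl))) {
        return true;
    }

    // Time-based: measure from the last validation, or from caching time
    // if the image has never been validated.
    $lastChecked = $lastValidatedAt ?? $cachedAt;

    return $lastChecked->modify("+{$maxAgeDays} days") <= $now;
}
```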
API Integration¶
Proxy Controller¶
GET /api/images/{beanUuid} → Full optimized WebP
GET /api/images/{beanUuid}?size=sm → 150x150 thumbnail
GET /api/images/{beanUuid}?size=md → 400x400 medium
GET /api/images/{beanUuid}?original=1 → Original (if stored)
Response headers:
- Cache-Control: public, max-age=86400 (1 day)
- ETag based on content hash
- Content-Type: image/webp (for ?original=1, the stored original MIME type)
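A sketch of how these headers and the matching conditional-request check might look. The function names are hypothetical. Appending the variant name to the ETag is an assumption: `contentHash` describes the original bytes, while each variant has different bytes and so needs a distinct validator.

```php
<?php
declare(strict_types=1);

/**
 * Build the response headers for one image variant.
 *
 * @return array<string, string>
 */
function imageResponseHeaders(string $contentHash, string $variant, string $mimeType = 'image/webp'): array
{
    return [
        'Cache-Control' => 'public, max-age=86400', // 1 day
        'ETag' => sprintf('"%s-%s"', $contentHash, $variant),
        'Content-Type' => $mimeType,
    ];
}

/** True when the client's If-None-Match header matches, i.e. a 304 can be sent. */
function isNotModified(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null) {
        return false;
    }

    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // If-None-Match uses weak comparison, so a W/ prefix is ignored.
        if (str_starts_with($candidate, 'W/')) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === '*' || $candidate === $etag) {
            return true;
        }
    }

    return false;
}
```

Answering matching requests with 304 avoids reading the object from S3 at all, which matters more once a CDN revalidates on the client's behalf.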
DTO Changes¶
Modify EntityToDtoMapper:
- imageUrl returns the proxy URL when the image is cached and falls back to the original URL otherwise
- Original URL stored in CachedImage.originalUrl for reference
- Add imageVariants array for size options
// In EntityToDtoMapper
$imageUrl = $cachedImage?->getStatus() === CachedImageStatus::CACHED
    ? $this->imageProxyUrlGenerator->generate($coffeeBean)
    : $coffeeBean->getImageUrl();
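The imageVariants array could be derived from CachedImage.variants and the proxy routes in the same mapper. A minimal sketch, assuming a hypothetical helper `buildImageVariants` and the route shape `/api/images/{beanUuid}?size=...`; the real ImageProxyUrlGenerator may build URLs differently (for example, signed).

```php
<?php
declare(strict_types=1);

/**
 * Map available variants to proxy URLs.
 * Hypothetical helper; $variants has the shape stored in CachedImage::$variants.
 *
 * @param array<string, bool>|null $variants e.g. ['sm' => true, 'md' => true, 'full' => true]
 * @return array<string, string> variant name => proxy URL
 */
function buildImageVariants(string $beanUuid, ?array $variants, string $baseUrl = ''): array
{
    $urls = [];
    $path = sprintf('%s/api/images/%s', rtrim($baseUrl, '/'), $beanUuid);

    foreach ($variants ?? [] as $name => $available) {
        if ($available !== true) {
            continue; // Variant was not generated; do not advertise it.
        }
        // The full-size image is the default route without a size parameter.
        $urls[$name] = $name === 'full'
            ? $path
            : $path . '?size=' . rawurlencode((string) $name);
    }

    return $urls;
}
```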
Implementation¶
Services¶
| Service | Responsibility |
|---|---|
| ImageCacheService | Orchestrates caching workflow |
| ImageStorageService | S3/Flysystem operations |
| ImageTransformService | Resize/convert with Intervention |
| ImageProxyUrlGenerator | Generate signed/public URLs |
Async Processing¶
- ImageCacheMessage: Trigger caching for a bean
- ImageCacheHandler: Fetch, transform, upload to S3
- Batch job for initial migration of existing images
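The message/handler pair can be sketched without framework code to show the fetch, transform, upload sequence. Everything here is an assumption for illustration: the constructor arguments stand in for the real services, the upload key format is made up, and in the application the handler would presumably be registered with Symfony Messenger like the existing CrawlStepHandler.

```php
<?php
declare(strict_types=1);

/** Message carrying only the bean identifier; the handler loads the rest. */
final class ImageCacheMessage
{
    public function __construct(public readonly string $beanUuid)
    {
    }
}

/** Framework-free sketch of the handler; the closures are stand-ins for services. */
final class ImageCacheHandler
{
    public function __construct(
        private Closure $fetch,     // fn(string $beanUuid): string, raw image bytes
        private Closure $transform, // fn(string $bytes): array<string, string>, variant => bytes
        private Closure $upload,    // fn(string $key, string $bytes): void
    ) {
    }

    public function __invoke(ImageCacheMessage $message): void
    {
        // 1. Fetch the original image for this bean.
        $original = ($this->fetch)($message->beanUuid);

        // 2. Transform into variants, 3. upload each one.
        foreach (($this->transform)($original) as $variant => $bytes) {
            // Hypothetical key format, for illustration only.
            ($this->upload)(sprintf('%s/%s', $variant, $message->beanUuid), $bytes);
        }
    }
}
```

Carrying only the UUID in the message keeps it small and avoids acting on a stale image URL if the bean changes while the message is queued.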
Controller¶
ImageProxyController: Serve images, handle cache-on-demand
Integration with Phase 2¶
If Phase 2 (Broken Image Detection) is implemented:
- Phase 2 validates → marks ImageCheck as BROKEN
- Listener detects broken check → marks CachedImage as STALE
- Re-cache job picks up STALE images → attempts re-fetch
- If still broken → CachedImage status = FAILED, API returns fallback
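The flow above amounts to a small state machine over CachedImageStatus. A sketch under stated assumptions: the string backing values and both function names are hypothetical, and only an image that is currently CACHED is moved to STALE.

```php
<?php
declare(strict_types=1);

// Case names come from this document; the backing values are assumptions.
enum CachedImageStatus: string
{
    case PENDING = 'pending';
    case CACHED = 'cached';
    case FAILED = 'failed';
    case STALE = 'stale';
}

/** Phase 2 reported a broken source: a cached image becomes stale. */
function statusAfterBrokenCheck(CachedImageStatus $current): CachedImageStatus
{
    // Assumption: PENDING and FAILED images are left for their own retry path.
    return $current === CachedImageStatus::CACHED
        ? CachedImageStatus::STALE
        : $current;
}

/** Result of the re-cache job for a stale image. */
function statusAfterRefetch(bool $fetchSucceeded): CachedImageStatus
{
    return $fetchSucceeded ? CachedImageStatus::CACHED : CachedImageStatus::FAILED;
}
```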
Files to Create¶
| File | Description |
|---|---|
| src/Entity/CachedImage.php | Entity |
| src/Repository/CachedImageRepository.php | Repository |
| src/Enum/CachedImageStatus.php | Status enum (PENDING, CACHED, FAILED, STALE) |
| src/Service/Image/ImageCacheService.php | Orchestration |
| src/Service/Image/ImageStorageService.php | S3 operations |
| src/Service/Image/ImageTransformService.php | Image transformation |
| src/Service/Image/ImageProxyUrlGenerator.php | URL generation |
| src/Controller/Api/ImageProxyController.php | Proxy endpoint |
| src/Message/ImageCacheMessage.php | Message class |
| src/MessageHandler/ImageCacheHandler.php | Handler |
| src/EventListener/BrokenImageListener.php | Optional: event-driven |
| Migration | cached_image table |
| config/packages/flysystem.php | S3 adapter config |
Dependencies to Add¶
- league/flysystem-aws-s3-v3 (S3/MinIO storage via Flysystem)
- intervention/image (image transformation)
Reference Files¶
- src/Scheduler/AvailabilityCrawlSchedulerService.php - Scheduler pattern
- src/MessageHandler/CrawlStepHandler.php - Message handler pattern
- src/DTO/Api/CoffeeBeanDTO.php - DTO structure
- src/Service/Api/Mapper/EntityToDtoMapper.php - DTO mapping