Multi-Market Implementation: Phase 2 - Database Migration and Data Migration¶
Status: Planning¶
Dependencies¶
Requires completed: - Phase 1 (Entities) - Must have Market, AffiliateProgram, BuyingOption entities
Blocks: - Phase 3 (Services) - Services need these new fields - Phase 4 (Cache/Repository) - Repositories query these fields - All subsequent phases
📋 Todo Checklist¶
- [ ] Add
marketandpriorityfields to RoasterCrawlConfig entity - [ ] Add
availablefield to CrawlUrl entity - [ ] Create database migration for schema changes
- [ ] Create data migration script for CoffeeBean.available → CrawlUrl.available
- [ ] Test migration up/down (reversibility)
- [ ] Verify existing queries still work
- [ ] Document transition period behavior
- [ ] Plan removal of CoffeeBean.available field (future phase)
🔍 Analysis & Investigation¶
Problem Statement¶
The current system needs fields added to existing entities to support multi-market architecture:
1. RoasterCrawlConfig needs market assignment and priority
2. CrawlUrl needs availability tracking (replacing global CoffeeBean.available)
3. Existing data must be migrated safely
Current State¶
RoasterCrawlConfig:
- Has shipsTo countries (logistics)
- No market assignment (monetization)
- No priority system
CrawlUrl:
- No availability tracking
- Relies on global CoffeeBean.available
CoffeeBean:
- Has global available boolean
- Applied to ALL crawl URLs (not granular enough)
Target State¶
RoasterCrawlConfig:
- Has market (nullable during transition)
- Has priority (default 50)
- Keeps shipsTo (serves different purpose than Market.countries)
CrawlUrl:
- Has available boolean (per-URL availability)
CoffeeBean:
- available field removed (AFTER transition complete)
Migration Strategy¶
Transition Period:
- New fields are nullable/have defaults
- Configs with market = NULL work in "legacy mode"
- Both old and new systems work simultaneously
- Gradual migration of configs to markets
Backward Compatibility: - Down migration restores previous state - Data migration is reversible - No breaking changes during transition
📝 Implementation Plan¶
Prerequisites¶
- Phase 1 completed (Market, AffiliateProgram, BuyingOption entities exist)
- Database backup created
- All existing migrations applied
Step-by-Step Implementation¶
Step 1: Update RoasterCrawlConfig Entity¶
File: src/Entity/RoasterCrawlConfig.php
Add new fields and relationships:
use App\Entity\Market;
// Add to class body:
#[ORM\ManyToOne(targetEntity: Market::class, inversedBy: 'roasterCrawlConfigs')]
#[ORM\JoinColumn(name: 'market_id', referencedColumnName: 'id', nullable: true)]
private ?Market $market = null;
#[ORM\Column(type: Types::INTEGER, options: ['default' => 50])]
#[Assert\Range(min: 0, max: 1000, notInRangeMessage: 'Priority must be between 0 and 1000.')]
private int $priority = 50;
// Add getters/setters:
public function getMarket(): ?Market
{
return $this->market;
}
public function setMarket(?Market $market): self
{
$this->market = $market;
return $this;
}
public function getPriority(): int
{
return $this->priority;
}
public function setPriority(int $priority): self
{
$this->priority = $priority;
return $this;
}
Add indexes to entity annotations:
#[ORM\Index(name: 'idx_rcc_market', columns: ['market_id'])]
#[ORM\Index(name: 'idx_rcc_priority', columns: ['priority'])]
Notes:
- market is nullable to support transition period
- priority defaults to 50 (allows higher/lower on both sides)
- Keep existing shippingRegion and shipsTo fields (serve different purpose)
Step 2: Update CrawlUrl Entity¶
File: src/Entity/CrawlUrl.php
Add availability field:
#[ORM\Column(type: Types::BOOLEAN, options: ['default' => true])]
private bool $available = true;
// Add getter/setter:
public function isAvailable(): bool
{
return $this->available;
}
public function setAvailable(bool $available): self
{
$this->available = $available;
return $this;
}
Add index:
Notes:
- Default true for new crawl URLs
- Existing URLs will be updated via data migration
Step 3: Generate Schema Migration¶
Run Doctrine migration diff:
Verify generated migration includes:
// Up migration:
$this->addSql('ALTER TABLE roaster_crawl_config ADD COLUMN market_id UUID NULL');
$this->addSql('ALTER TABLE roaster_crawl_config ADD COLUMN priority INT DEFAULT 50 NOT NULL');
$this->addSql('ALTER TABLE crawl_url ADD COLUMN available BOOLEAN DEFAULT true NOT NULL');
// Add foreign key
$this->addSql('ALTER TABLE roaster_crawl_config ADD CONSTRAINT FK_RCC_MARKET FOREIGN KEY (market_id) REFERENCES market (id)');
// Add indexes
$this->addSql('CREATE INDEX idx_rcc_market ON roaster_crawl_config (market_id)');
$this->addSql('CREATE INDEX idx_rcc_priority ON roaster_crawl_config (priority)');
$this->addSql('CREATE INDEX idx_crawl_url_available ON crawl_url (available)');
Step 4: Create Data Migration Script¶
File: migrations/VersionXXX_migrate_availability_data.php
Create separate migration for data transformation:
<?php
declare(strict_types=1);
namespace DoctrineMigrations;
use Doctrine\DBAL\Schema\Schema;
use Doctrine\Migrations\AbstractMigration;
final class VersionXXX_migrate_availability_data extends AbstractMigration
{
public function getDescription(): string
{
return 'Migrate CoffeeBean.available to CrawlUrl.available';
}
public function up(Schema $schema): void
{
// For each bean where available = false, set all its CrawlUrls to available = false
$this->addSql('
UPDATE crawl_url cu
SET available = false
WHERE cu.coffee_bean_id IN (
SELECT id FROM coffee_bean WHERE available = false
)
');
// Log the migration
$affectedRows = $this->connection->executeStatement('
SELECT COUNT(*) FROM crawl_url WHERE available = false
');
$this->write(sprintf('Migrated availability for %d CrawlUrls', $affectedRows));
}
public function down(Schema $schema): void
{
// Restore CoffeeBean.available from CrawlUrl.available
// This assumes CoffeeBean.available field still exists
// Set bean available = false if ANY of its CrawlUrls are unavailable
$this->addSql('
UPDATE coffee_bean cb
SET available = false
WHERE EXISTS (
SELECT 1 FROM crawl_url cu
WHERE cu.coffee_bean_id = cb.id
AND cu.available = false
)
');
// Set bean available = true if ALL CrawlUrls are available
$this->addSql('
UPDATE coffee_bean cb
SET available = true
WHERE NOT EXISTS (
SELECT 1 FROM crawl_url cu
WHERE cu.coffee_bean_id = cb.id
AND cu.available = false
)
');
}
}
Important: Run this migration AFTER the schema migration.
Step 5: Document Transition Behavior¶
Create documentation file: docs/multi-market-transition.md
# Multi-Market Transition Period
## Current State (During Transition)
### RoasterCrawlConfig.market
- **Nullable**: Yes
- **NULL behavior**: Legacy mode (uses shipsTo, no affiliate transformation)
- **Non-NULL behavior**: Multi-market mode (uses market countries, affiliate transformation)
### URL Resolution Logic
For each CrawlUrl, the system checks:
1. **Logistics check**: Does RCC ship to visitor's country? (RCC.shipsTo)
2. **Monetization check**: If RCC has market, does market serve visitor's country? (Market.countries)
3. **Both must pass** for bean to be shown
### Priority System
Even in legacy mode (market = NULL), priority is used:
- Default priority: 50
- Higher priority wins when multiple configs serve same country
- Allows gradual migration without URL resolution changes
## Migration Workflow
### Immediate (After Migration)
1. All existing RCCs have `market = NULL` and `priority = 50`
2. System works as before (legacy mode)
3. All CrawlUrls inherit availability from CoffeeBean.available
### Gradual (Admin Actions)
1. Create AffiliateProgram entities
2. Create Market entities
3. Assign RCCs to markets one by one
4. Adjust priorities as needed
5. Create BuyingOptions for special cases
### Eventually (After Full Transition)
1. All RCCs have markets assigned
2. Verify all beans have correct availability
3. Remove CoffeeBean.available field (separate migration)
## Rollback Procedure
If issues occur:
1. Revert code deployment
2. Run migration down: `bin/console doctrine:migrations:migrate prev`
3. Data is restored to previous state
Step 6: Create Migration Tests¶
File: tests/Migration/MultiMarketMigrationTest.php
<?php
namespace App\Tests\Migration;
use Symfony\Bundle\FrameworkBundle\Test\KernelTestCase;
use Doctrine\DBAL\Connection;
class MultiMarketMigrationTest extends KernelTestCase
{
private Connection $connection;
protected function setUp(): void
{
self::bootKernel();
$this->connection = self::getContainer()->get('doctrine')->getConnection();
}
public function testSchemaChangesApplied(): void
{
// Verify new columns exist
$this->assertTrue(
$this->columnExists('roaster_crawl_config', 'market_id')
);
$this->assertTrue(
$this->columnExists('roaster_crawl_config', 'priority')
);
$this->assertTrue(
$this->columnExists('crawl_url', 'available')
);
}
public function testIndexesCreated(): void
{
// Verify indexes exist
$this->assertTrue(
$this->indexExists('roaster_crawl_config', 'idx_rcc_market')
);
$this->assertTrue(
$this->indexExists('roaster_crawl_config', 'idx_rcc_priority')
);
$this->assertTrue(
$this->indexExists('crawl_url', 'idx_crawl_url_available')
);
}
public function testDataMigration(): void
{
// Create test data: bean with available=false
$this->connection->insert('coffee_bean', [
'id' => 'test-bean-id',
'name' => 'Test Bean',
'available' => false,
// ... other required fields
]);
$this->connection->insert('crawl_url', [
'id' => 'test-url-id',
'url' => 'https://example.com/product',
'coffee_bean_id' => 'test-bean-id',
'available' => true, // Default
// ... other required fields
]);
// Run data migration logic
$this->connection->executeStatement('
UPDATE crawl_url cu
SET available = false
WHERE cu.coffee_bean_id IN (
SELECT id FROM coffee_bean WHERE available = false
)
');
// Verify CrawlUrl.available was updated
$result = $this->connection->fetchAssociative(
'SELECT available FROM crawl_url WHERE id = ?',
['test-url-id']
);
$this->assertFalse($result['available']);
}
private function columnExists(string $table, string $column): bool
{
$schemaManager = $this->connection->createSchemaManager();
$columns = $schemaManager->listTableColumns($table);
return isset($columns[$column]);
}
private function indexExists(string $table, string $index): bool
{
$schemaManager = $this->connection->createSchemaManager();
$indexes = $schemaManager->listTableIndexes($table);
return isset($indexes[$index]);
}
}
Step 7: Test Migration Reversibility¶
Create test script to verify up/down migrations work:
#!/bin/bash
# Test migration reversibility
set -e
echo "Creating database backup..."
make db-backup
echo "Running migration up..."
make migrate
echo "Verifying schema..."
bin/console doctrine:schema:validate
echo "Running migration down..."
bin/console doctrine:migrations:migrate prev --no-interaction
echo "Verifying rollback..."
bin/console doctrine:schema:validate
echo "Running migration up again..."
make migrate
echo "Migration reversibility test passed!"
Testing Strategy¶
Manual Testing¶
- Database Backup:
- Create backup before migration
-
Verify backup contains all data
-
Schema Migration:
- Run
make migrate - Verify new columns exist with correct types
- Verify indexes created
-
Verify foreign keys created
-
Data Migration:
- Verify CrawlUrls inherit availability from CoffeeBean
-
Check edge cases (beans with no URLs, NULL availability)
-
Rollback Test:
- Run migration down
- Verify schema restored
- Verify data restored
- Run migration up again
Integration Tests¶
- Existing Queries:
- Verify all existing queries still work
- Test RoasterCrawlConfig queries with market = NULL
-
Test CrawlUrl queries with available field
-
Legacy Mode:
- Verify configs with market = NULL still function
- Test URL resolution with legacy configs
- Verify no breaking changes
Performance Testing¶
- Query Performance:
- Verify new indexes improve query performance
- Test queries on large datasets
- Check for N+1 query issues
🎯 Success Criteria¶
- Schema migration runs successfully without errors
- New fields added to RoasterCrawlConfig (market, priority)
- New field added to CrawlUrl (available)
- All indexes created correctly
- Data migration completes: CrawlUrl.available reflects CoffeeBean.available
- Migration is reversible (up/down works correctly)
- Existing queries continue working
- Legacy mode (market = NULL) functions correctly
- No breaking changes to existing functionality
- Doctrine schema validation passes
- All tests pass
Related Files¶
Files to Modify:¶
src/Entity/RoasterCrawlConfig.php(add market, priority)src/Entity/CrawlUrl.php(add available)
Files to Create:¶
migrations/VersionXXX_multi_market_add_fields.php(auto-generated schema)migrations/VersionXXX_migrate_availability_data.php(manual data migration)docs/multi-market-transition.md(documentation)tests/Migration/MultiMarketMigrationTest.php
Files to Keep (not remove yet):¶
src/Entity/CoffeeBean.php- Keepavailablefield during transition- Will be removed in Phase 8 (Cleanup) after full transition
Notes¶
Why Keep RCC.shipsTo?¶
Different purposes:
- RCC.shipsTo = Physical shipping capability
- Market.countries = Monetization regions
Both are checked: 1. Can roaster ship to visitor? (RCC.shipsTo) 2. Can we monetize in visitor's country? (Market.countries)
Not deprecated: RCC.shipsTo serves a distinct purpose and will NOT be removed.
Transition Period¶
Duration: Indefinite - no forced migration deadline Flexibility: Admins migrate configs to markets at their own pace Safety: Both systems work simultaneously
Data Integrity¶
Unique constraint: BuyingOption (coffee_bean_id, market_id) prevents duplicates Foreign keys: All relationships enforced at database level Cascading deletes: Market deletion removes BuyingOptions
Next Steps¶
After completing this phase: - Proceed to Phase 3 (Services) to implement BuyingOptionService and AffiliateUrlService - Entities and schema ready for service layer - Data migrated and safe to use in new system