Sitemap URL Filter Matchers

Filter sitemap URLs by locale/region patterns during crawling.

Quick Start

Add a new matcher in config/services_sitemap_matchers.php:

$services->set('app.sitemap_matcher.locale.shopify_de', UrlFragmentFilterMatcher::class)
         ->arg('$baseMatcher', service(ShopifySitemapMatcher::class))
         ->arg('$pattern', '/de/')
         ->arg('$platformName', 'locale:shopify:de')
         ->tag('app.sitemap_matcher');

Then set sitemap_platform_name to locale:shopify:de on the roaster's crawl config.
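
For context, the snippet above omits the surrounding file. A minimal sketch of what config/services_sitemap_matchers.php might look like as a whole is shown here, assuming the standard Symfony PHP service configuration format. The App\Sitemap\Matcher namespace is a placeholder assumption, so adjust the imports to wherever the two matcher classes actually live, and the sketch assumes ShopifySitemapMatcher is already registered as a service under its class name.

```php
<?php
// config/services_sitemap_matchers.php

// Declaring this namespace makes the service() helper available without an extra import.
namespace Symfony\Component\DependencyInjection\Loader\Configurator;

// ASSUMPTION: placeholder namespace, not confirmed by the project.
use App\Sitemap\Matcher\ShopifySitemapMatcher;
use App\Sitemap\Matcher\UrlFragmentFilterMatcher;

return static function (ContainerConfigurator $container): void {
    $services = $container->services();

    // Wrap the Shopify matcher so that only URLs containing /de/ are kept.
    $services->set('app.sitemap_matcher.locale.shopify_de', UrlFragmentFilterMatcher::class)
        ->arg('$baseMatcher', service(ShopifySitemapMatcher::class))
        ->arg('$pattern', '/de/')
        ->arg('$platformName', 'locale:shopify:de')
        ->tag('app.sitemap_matcher');
};
```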

Options

Parameter       Description
$baseMatcher    CMS matcher to delegate to (e.g., ShopifySitemapMatcher)
$pattern        String or regex pattern (see below)
$platformName   Unique identifier, set on crawl config to activate
$exclude        true = exclude matching URLs (default: false)

Pattern Formats

Plain string - matches URLs containing the string:

->arg('$pattern', '/en/')  // matches URLs with /en/

Regex - use # delimiters:

->arg('$pattern', '#/en(-[a-z]{2})?/#')  // matches /en/, /en-us/, /en-gb/
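
One way to picture the two pattern formats is the sketch below. It is not the actual UrlFragmentFilterMatcher code: urlMatchesPattern is a hypothetical helper, and the rule "starts and ends with # means regex" is an assumption inferred from the description above (a pattern with trailing modifiers such as #...#i would not be recognised by this simplified check). It requires PHP 8.0+ for str_contains.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper illustrating the two pattern formats.
 * ASSUMPTION: a pattern wrapped in # delimiters is treated as a regex,
 * anything else as a plain substring.
 */
function urlMatchesPattern(string $url, string $pattern): bool
{
    $isRegex = strlen($pattern) >= 2
        && $pattern[0] === '#'
        && substr($pattern, -1) === '#';

    if ($isRegex) {
        // preg_match returns 1 on a match, 0 on no match, false on error.
        return preg_match($pattern, $url) === 1;
    }

    // Plain string: a simple "contains" check.
    return str_contains($url, $pattern);
}

var_dump(urlMatchesPattern('https://example.com/en/products', '/en/'));                  // bool(true)
var_dump(urlMatchesPattern('https://example.com/en-us/products', '#/en(-[a-z]{2})?/#')); // bool(true)
var_dump(urlMatchesPattern('https://example.com/de/products', '#/en(-[a-z]{2})?/#'));    // bool(false)
```

Note that the regex as written is case-sensitive, so under these assumptions a path such as /en-US/ would not match.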

Examples

Include only English URLs:

->arg('$pattern', '/en/')
->arg('$exclude', false)  // default

Exclude German URLs (include everything else):

->arg('$pattern', '/de/')
->arg('$exclude', true)

Exclude all German variants:

->arg('$pattern', '#/de(-[a-z]{2})?/#')
->arg('$exclude', true)
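
To see how $pattern and $exclude interact, the three examples above can be tried against a small URL list. This is only a sketch under stated assumptions: filterSitemapUrls is a hypothetical stand-in for the filtering step, it ignores the delegation to $baseMatcher, and the shop.example URLs are made up for illustration. It requires PHP 8.0+.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the filtering step; not the project's actual class.
 *
 * @param string[] $urls
 * @return string[]
 */
function filterSitemapUrls(array $urls, string $pattern, bool $exclude = false): array
{
    // ASSUMPTION: # delimiters mark a regex; anything else is a substring.
    $isRegex = strlen($pattern) >= 2
        && $pattern[0] === '#'
        && substr($pattern, -1) === '#';

    $kept = array_filter(
        $urls,
        static function (string $url) use ($pattern, $isRegex, $exclude): bool {
            $matches = $isRegex
                ? preg_match($pattern, $url) === 1
                : str_contains($url, $pattern);

            // $exclude = false keeps matching URLs; $exclude = true keeps the rest.
            return $matches !== $exclude;
        }
    );

    return array_values($kept); // reindex from 0
}

// Made-up sample URLs.
$urls = [
    'https://shop.example/de/kaffee',
    'https://shop.example/de-at/kaffee',
    'https://shop.example/en/coffee',
    'https://shop.example/products/decaf',
];

$englishOnly       = filterSitemapUrls($urls, '/en/');
$withoutPlainDe    = filterSitemapUrls($urls, '/de/', true);
$withoutAnyGerman  = filterSitemapUrls($urls, '#/de(-[a-z]{2})?/#', true);

print_r($englishOnly);      // only the /en/coffee URL
print_r($withoutPlainDe);   // /de-at/kaffee, /en/coffee, /products/decaf
print_r($withoutAnyGerman); // /en/coffee, /products/decaf
```

In this sketch the plain '/de/' pattern leaves /de-at/ URLs in place, which is why the last example switches to a regex to cover regional variants.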