How Google indexes multilingual sites — XML sitemaps and hreflang under the hood
Multilingual SEO is one of the most underappreciated aspects of technical site optimization. Many websites serve content in different languages under the same URL, switching on the Accept-Language header. By default, Googlebot crawls without sending Accept-Language, so such a site answers in its fallback language (English here) and Googlebot never sees the Polish version of the page.
The solution is an XML sitemap with <xhtml:link rel="alternate" hreflang="..."> tags, one with hreflang="en" and one with hreflang="pl". This mechanism tells Google: "the same article exists under two different URLs, in two languages. Here are both. Don't treat them as duplicate content; they are language variants." Each URL in the sitemap includes references to all its language variants, itself included, creating a mesh of mutual connections.
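The structure described above can be sketched in a few lines. This is a language-neutral illustration in Python, not the project's PHP code; the function name build_sitemap and the example.com URLs are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize sitemap elements unprefixed and alternates as xhtml:link.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_sitemap(pages):
    """pages: list of dicts mapping a language code to an absolute URL."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for variants in pages:
        # One <url> entry per language variant of the page.
        for url in variants.values():
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # Every entry lists all variants, including itself.
            for lang, href in variants.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical article with an English and a Polish slug.
sitemap_xml = build_sitemap([
    {
        "en": "https://example.com/blog/hello-world",
        "pl": "https://example.com/blog/witaj-swiecie",
    }
])
```

One article with two language variants yields two <url> entries, and each entry carries both alternates, which is what makes the references mutual.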
Cyberapis v0.14.0 rebuilds the entire sitemap system from scratch. Instead of relying on a crawler (which generated localhost URLs and pulled in admin pages), the sitemap is now built explicitly from Post::indexable(), DocsPage::indexable(), and 6 static pages. Each blog post and docs page gets two <url> entries, one for the EN slug and one for the PL slug, with mutual hreflang alternates. The release also adds auto-regeneration on every Filament change and a human-readable HTML sitemap at /sitemap.
Additionally, the slug lookup was rewritten: PostController::show() now matches on slug->en OR slug->pl, not just the slug for the current locale. A Polish URL therefore resolves even when the request locale is English, as it is for Googlebot, so Polish URLs can be discovered and indexed.
New Features
Rebuilt XML Sitemap
URLs are constructed explicitly instead of being crawled. The sitemap includes 6 static pages plus every blog post and docs page with is_published=true AND is_indexable=true. An XSL stylesheet provides a human-readable view in the browser.
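A minimal sketch of this kind of explicit construction, written in Python for illustration rather than taken from the project. The Record type, the STATIC_PATHS subset, and the /docs/ path prefix are assumptions; only the two flags and the /blog/ prefix come from the notes.

```python
from dataclasses import dataclass


@dataclass
class Record:
    slug: str
    is_published: bool
    is_indexable: bool


# Hypothetical subset; the release mentions 6 static pages without listing them all.
STATIC_PATHS = ["/", "/terms", "/privacy-policy"]


def sitemap_paths(posts, docs_pages):
    """Return sitemap paths: static pages first, then eligible posts and docs."""

    def indexable(records):
        # A record must be both published and marked indexable.
        return [r for r in records if r.is_published and r.is_indexable]

    paths = list(STATIC_PATHS)
    paths += [f"/blog/{p.slug}" for p in indexable(posts)]
    paths += [f"/docs/{d.slug}" for d in indexable(docs_pages)]  # assumed prefix
    return paths


posts = [
    Record("the-slug", True, True),
    Record("draft", False, True),    # unpublished: excluded
    Record("hidden", True, False),   # noindex: excluded
]
docs = [Record("install", True, True)]
paths = sitemap_paths(posts, docs)
```

Because nothing is crawled, a route can only appear in the output if it is listed explicitly or comes from a record that passes both flags.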
Multilingual Sitemap with hreflang
Each blog post and docs page gets two <url> entries (EN slug + PL slug). Each entry carries <xhtml:link rel="alternate"> tags with hreflang="en" and hreflang="pl" that point to both variants. Search engines can now discover and index content in both languages.
Auto-Regeneration
Post and DocsPage models trigger sitemap:generate on their saved and deleted events, so the sitemap reflects any Filament admin change within seconds. A daily cron run acts as a fallback.
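The event wiring can be pictured with a small stand-in. This Python sketch only illustrates the pattern; the Model class and regenerate_sitemap function are hypothetical and do not reproduce the framework's actual event API.

```python
class Model:
    """Tiny stand-in for a model that fires lifecycle events."""

    def __init__(self):
        self._listeners = {"saved": [], "deleted": []}

    def on(self, event, listener):
        self._listeners[event].append(listener)

    def save(self):
        # Persisting is omitted; only the event matters here.
        self._fire("saved")

    def delete(self):
        self._fire("deleted")

    def _fire(self, event):
        for listener in self._listeners[event]:
            listener()


regenerations = []


def regenerate_sitemap():
    # Stand-in for invoking the sitemap:generate command.
    regenerations.append("sitemap:generate")


post = Model()
post.on("saved", regenerate_sitemap)
post.on("deleted", regenerate_sitemap)

post.save()
post.delete()
```

Each save or delete triggers one regeneration, so the scheduled run only has to cover changes that bypass these events.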
is_indexable Field
Added to posts and docs_pages tables. Toggled in Filament. When disabled, the page gets <meta name="robots" content="noindex"> and is excluded from the sitemap.
Polish Legal Page URLs
/regulamin-serwisu and /polityka-prywatnosci with hreflang alternates to English /terms and /privacy-policy. Footer links use locale-appropriate URLs.
HTML Sitemap for Humans
The /sitemap page renders a live, always-up-to-date list of all indexable pages grouped by section (static, blog, docs).
Localized Footer
footer.legal translation key replaces the hardcoded heading. Legal links are locale-aware (PL vs EN URLs).
Bug Fixes
Post::toSitemapTag() — dead code
A premature return route('blog.show', $this) made the rest of the method unreachable and generated URLs with IDs instead of slugs (/blog/5 instead of /blog/the-slug), so the setLastModificationDate/setPriority block never executed. The method now returns proper Url objects with hreflang alternates.
Localhost URLs in sitemap
Removed crawler-based generation (SitemapGenerator::create()) that pulled APP_URL from env (often localhost in dev). All URLs now use the route() helper, which resolves correctly per environment.
/admin/* in sitemap
The crawler was pulling in admin pages. Manual URL construction only includes public routes.
Changes & Improvements
Cross-Locale Slug Lookup
PostController::show() and DocsController::show() now match on slug->en OR slug->pl instead of only the slug for the session locale. Polish slugs resolve even when the request locale is English (as it is for Googlebot). The matched locale is then set via app()->setLocale() so the UI chrome matches the content language.
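The lookup logic can be sketched as follows. This is an illustrative Python version under stated assumptions, not the controller code: the Post type and find_by_any_slug function are hypothetical, and a real implementation would query the database rather than scan a list.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    slug: dict  # one slug per locale, e.g. {"en": "...", "pl": "..."}


def find_by_any_slug(posts, slug, locales=("en", "pl")):
    """Return (post, matched_locale), or None if no locale's slug matches."""
    for post in posts:
        for locale in locales:
            if post.slug.get(locale) == slug:
                return post, locale
    return None


posts = [Post("Hello", {"en": "hello-world", "pl": "witaj-swiecie"})]

# A Polish slug resolves regardless of the locale the request arrived with.
match = find_by_any_slug(posts, "witaj-swiecie")
post, matched_locale = match
missing = find_by_any_slug(posts, "no-such-slug")
```

The caller would then switch the application locale to matched_locale before rendering. If both locales share the same slug, this sketch reports the first locale in the tuple.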