Technical SEO Guide

XML Sitemap Best Practices: How to Create and Submit for Faster Indexing

A poorly configured sitemap wastes crawl budget. Learn to generate a proper XML sitemap, avoid common validation errors, and submit it to Google Search Console so your important pages get indexed fast. We cover priority tags, lastmod usage, and real-world filtering strategies.

Field notes

Why Your Sitemap Is Not Getting Your Pages Indexed

Most SEOs treat the sitemap as a checklist item: upload once, then forget. That is a mistake. Google uses the sitemap to discover URLs, but inclusion does not guarantee indexing. A common situation we see: a site with 50,000 product pages submits a sitemap containing all of them, but only 200 get indexed. Why? Because the sitemap included blocked URLs, noindexed pages, and pages with thin content. The official Google documentation on building a sitemap is clear: only include canonical, indexable URLs. Filter out everything else before you even generate the file.

In practice, the most common bottleneck when you create and submit an XML sitemap to Google is not the submission itself but the quality of the URLs inside. If you dump every URL into the file, Google spends crawl budget on pages that will never be indexed. A lean, validated sitemap with 5,000 high-priority pages will usually outperform a bloated one with 50,000 entries. The rest of this guide focuses on the decisions that separate effective sitemaps from noise.

Data table

Tactical Comparison: Static vs Dynamic Sitemap Generators

| Generator Type | How It Works | Best For | Hidden Risk / Failure Mode |
| --- | --- | --- | --- |
| Static generator (e.g., Screaming Frog, XML-Sitemaps.com) | Crawl site, generate flat XML file. Manual upload via FTP or CMS. | Small sites (< 500 pages) or one-time builds. Quick fix. | Stale data. If you add pages, the sitemap is outdated. No automatic update. Easy to accidentally include noindex or canonicalised URLs. |
| Dynamic generator, plugin (e.g., Yoast, Rank Math, SEOPress) | Sits inside CMS. Regenerates on post publish. Adds lastmod automatically. | WordPress sites. Medium traffic. Content updates daily. | Plugin can conflict with caching. Lastmod may be wrong if post modified date is set to current time by cache. Can include pagination pages or tag archives. |
| Dynamic generator, custom script (e.g., Python + Flask, PHP cron job) | Queries database directly. Filters by custom rules. Outputs gzipped XML. | Large sites (10k+ pages). Custom logic needed. Agency workflows. | Requires dev maintenance. Common bug: forgetting to exclude staging or test domains. Slow database queries can time out on large catalogs. Gzip must be streamed, not buffered. |
| Dynamic generator, API-based (e.g., SpeedyIndex, JetOctopus) | Uses API to fetch indexable URLs. Allows bulk submit. Some validate before adding. | Agencies managing multiple client sites. Bulk workflows. Scale. | Cost per API call. Vendor lock-in. If the API returns empty results due to a filter mismatch, you get a blank sitemap. Always validate the output before submission. |

Priority, Lastmod, and Changefreq: When They Matter and When They Don't

Priority tags are a hint, not a directive, and Google's sitemap documentation states that it ignores priority values. Lastmod, on the other hand, is useful, but only if it is consistently accurate. A common edge case: your CMS sets lastmod to the current date every time a post is saved, even if nothing changed. This tricks Google into recrawling unchanged pages, wasting budget, and it teaches Google to distrust the field. Use the actual content modification date, not the database update timestamp.
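One way to keep lastmod honest is to move it only when the page content itself changes. The sketch below is a minimal illustration, not a CMS integration: `content_fingerprint` and `next_lastmod` are hypothetical helper names, and it assumes you store a content hash and the previous lastmod next to each URL.

```python
import hashlib
from datetime import datetime, timezone


def content_fingerprint(body: str) -> str:
    """Hash the page body after collapsing whitespace-only differences."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def next_lastmod(stored_hash, stored_lastmod, body, now=None):
    """Return (hash, lastmod); lastmod moves only when the content hash changes."""
    now = now or datetime.now(timezone.utc)
    new_hash = content_fingerprint(body)
    if new_hash == stored_hash:
        # A re-save with no real edit keeps the old date.
        return stored_hash, stored_lastmod
    return new_hash, now
```

With this in place, saving a post without editing it leaves lastmod untouched, while a real edit to the body moves it to the current time.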

Changefreq is the least reliable tag. Google ignores it and derives crawl frequency from its own crawl history. Do not waste time on it. Instead, focus on ensuring that your lastmod values are correct and that you only include URLs that are actually indexable. For a deeper look at how Google can still choose a different canonical URL than what you set, see this analysis of why Google sometimes picks a different canonical URL and how to fix it.


Worked Example: Filtering a 12,000-URL Sitemap Down to 4,500 Indexable Pages

We had a client with 12,000 product URLs. Initial sitemap included everything. After one month, only 1,200 pages were indexed. We rebuilt the sitemap using these filters:

  • Exclude: URLs with meta noindex (found 2,100 URLs)
  • Exclude: URLs returning 3xx or 4xx (found 1,800)
  • Exclude: pagination pages (page/2/, page/3/ etc.) (found 600)
  • Exclude: URLs with thin content (less than 300 words) (found 900)
  • Exclude: internal search result pages (found 1,100)

Final sitemap: 4,500 URLs. Within 10 days of resubmission, 3,800 of those were indexed. The rest had other issues (slow server, no backlinks), but the share of submitted URLs that got indexed jumped from 10% (1,200 of 12,000) to 84% (3,800 of 4,500). The key: do not just generate a sitemap. Audit it first.
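The exclusion rules from this example can be sketched as a filter over crawl records. This is an illustration under assumptions: the `Page` fields are hypothetical names for whatever your crawler exports, and the 300-word threshold is the one used above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    status: int               # HTTP status seen by the crawler
    noindex: bool             # meta robots noindex present
    word_count: int
    is_pagination: bool       # e.g. /page/2/
    is_internal_search: bool  # internal search result page


MIN_WORDS = 300  # thin-content threshold from the worked example


def exclusion_reason(page):
    """Return the first rule that excludes the page, or None to keep it."""
    if page.noindex:
        return "noindex"
    if 300 <= page.status < 500:
        return "3xx/4xx status"
    if page.is_pagination:
        return "pagination"
    if page.word_count < MIN_WORDS:
        return "thin content"
    if page.is_internal_search:
        return "internal search"
    return None


def filter_pages(pages):
    """Split pages into the kept list and a per-rule count of exclusions."""
    kept, reasons = [], Counter()
    for page in pages:
        reason = exclusion_reason(page)
        if reason is None:
            kept.append(page)
        else:
            reasons[reason] += 1
    return kept, reasons


def indexed_share(indexed, submitted):
    """Fraction of submitted URLs that ended up indexed."""
    return indexed / submitted
```

Because each page is counted under the first rule it hits, the per-rule counts sum to the number of removed URLs, which makes the audit easy to reconcile against the final sitemap size. `indexed_share(1200, 12000)` and `indexed_share(3800, 4500)` reproduce the 10% and 84% figures.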

Workflow map

Sitemap Creation and Submission Workflow

Audit Current URLs

Export all site URLs. Run bulk checks for indexability using a crawler or API. Identify duplicates, redirects, noindex, and thin content pages.

Apply Inclusion Rules

Keep only canonical, indexable, and high-value URLs. Set priority based on business value: 1.0 for home and core pages, 0.5 for standard posts, 0.3 for archives.

Generate XML File

Use a dynamic generator or custom script. Ensure lastmod comes from the actual content change date. Validate XML syntax and size limits (at most 50 MB uncompressed and 50,000 URLs per file).
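For the custom-script route, a minimal generator might look like the sketch below. It uses only Python's standard library. The `(url, lastmod)` tuple shape and the `build_urlset` name are assumptions made for illustration, and the function rejects naive datetimes so that lastmod always carries a timezone.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Build sitemap XML bytes from (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url, lastmod in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url  # special characters are escaped on output
        if lastmod is not None:
            if lastmod.tzinfo is None:
                raise ValueError(f"lastmod for {url} has no timezone")
            ET.SubElement(node, "lastmod").text = lastmod.isoformat(timespec="seconds")
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

Passing `None` as lastmod omits the tag entirely, which is safer than emitting a date you cannot trust.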

Validate & Test

Open XML in browser. Check for namespace errors. Use Google's URL Inspection Tool on a sample of included URLs. Confirm they return 200 and are indexable.

Submit to Google Search Console

Go to Sitemaps section. Add sitemap URL. Monitor the 'Submitted' vs 'Indexed' count daily for the first week. Watch for errors under 'Coverage'.

Monitor & Refresh

Check weekly for sudden drops. If new pages are added, regenerate the sitemap. Use Google's Index Coverage report to spot patterns. After submission, check if Google actually indexed your pages (https://teletype.in/@speedyindex/check-if-google-indexed) using URL inspection.


Common Validation Errors and How to Fix Them

When you create and submit an XML sitemap to Google, the Search Console will show errors if the file is invalid. The most frequent ones:

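  • Namespace error: The XML namespace attribute is missing or misspelled. Fix: use the correct xmlns="http://www.sitemaps.org/schemas/sitemap/0.9".
  • URL too long: Each URL must be shorter than 2,048 characters. Do not truncate it or swap in a URL shortener, because the sitemap must list the final canonical URL. Shorten the URL at the source instead, for example by dropping unnecessary query parameters.
  • Date format error: lastmod must be in W3C Datetime format, either a plain date (2025-03-21) or a full timestamp with timezone (e.g., 2025-03-21T14:30:00+00:00). Many CMS plugins output local time without a timezone, and this fails validation.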
  • Size limit exceeded: Sitemap files cannot exceed 50MB uncompressed or 50,000 URLs. Split into multiple sitemaps and use a sitemap index file.
  • Duplicate URLs: Same absolute URL appearing more than once. Deduplicate before generation.
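The checks in this list can be run locally before submission. The following is a rough pre-flight sketch, not a replacement for Search Console's own validation: `validate_sitemap` is a hypothetical helper, and the lastmod pattern covers only the date-only and full date-time forms of W3C Datetime.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB uncompressed
MAX_LOC_CHARS = 2048          # <loc> must be shorter than this

# Date only, or date and time with a mandatory timezone designator.
W3C_DATETIME = re.compile(
    r"\d{4}-\d{2}-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
)


def _q(tag):
    """Qualify a tag name with the sitemap namespace."""
    return f"{{{SITEMAP_NS}}}{tag}"


def validate_sitemap(xml_bytes):
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    if len(xml_bytes) > MAX_BYTES:
        errors.append("file exceeds 50 MB uncompressed")
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return errors + [f"not well-formed XML: {exc}"]
    if root.tag != _q("urlset"):
        return errors + ["root is not <urlset> in the sitemap namespace"]
    url_nodes = root.findall(_q("url"))
    if len(url_nodes) > MAX_URLS:
        errors.append(f"{len(url_nodes)} URLs exceeds the 50,000 limit")
    seen = set()
    for node in url_nodes:
        loc = (node.findtext(_q("loc")) or "").strip()
        if not loc:
            errors.append("<url> entry without <loc>")
            continue
        if len(loc) >= MAX_LOC_CHARS:
            errors.append(f"URL too long ({len(loc)} chars): {loc[:60]}...")
        if loc in seen:
            errors.append(f"duplicate URL: {loc}")
        seen.add(loc)
        lastmod = node.findtext(_q("lastmod"))
        if lastmod is not None and not W3C_DATETIME.fullmatch(lastmod.strip()):
            errors.append(f"bad lastmod {lastmod!r} for {loc}")
    return errors
```

A missing or misspelled namespace shows up as the root-element error, because the parser then sees a plain `urlset` tag instead of the namespaced one.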

FAQ

How often should I create and submit an XML sitemap to Google for a news site?

For news sites, regenerate the sitemap every time new content is published, and at least daily. Use dynamic generation that triggers on publish. Include only recent articles, not the entire archive: Google's news sitemap guidelines cap a news sitemap at 1,000 URLs and ask for articles published in the last two days. Submit via Google Search Console's Sitemaps section. Monitor the 'Submitted' count to ensure it updates within 24 hours.

What are the most common validation errors when submitting an XML sitemap for an ecommerce site with 100,000 products?

The top errors: namespace missing, lastmod format incorrect (use W3C datetime with timezone), URLs over 2048 characters (common with long product filters), and exceeding the 50,000 URL or 50MB limit. You must split into multiple sitemaps and use a sitemap index file. Also, ensure no URLs return 4xx or 5xx status codes.
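Splitting by URL count and pointing a sitemap index at the parts can be sketched as follows. The sketch splits by count only, so a file with very long URLs could still exceed the 50 MB limit and would need a smaller chunk size. The helper names and the sitemap file URLs in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split URLs into consecutive lists of at most `size` entries."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index that points at each child sitemap file."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for url in sitemap_urls:
        node = ET.SubElement(index, "sitemap")
        ET.SubElement(node, "loc").text = url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)
```

For a catalog of 100,000 products, `chunk_urls` yields two lists of 50,000, and the index would list something like `https://example.com/sitemap-products-1.xml` and `https://example.com/sitemap-products-2.xml`. Only the index needs to be submitted in Search Console.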

Can I submit an XML sitemap via API for bulk indexing of guest posts across multiple domains?

Yes. Use Google Search Console API or third-party tools like SpeedyIndex that support bulk submission. For each domain, you need ownership verification in GSC. The API allows submitting sitemap URLs programmatically. However, Google does not guarantee immediate indexing. The sitemap just signals existence; the pages must still pass quality and relevance checks.

What is the best way to create and submit an XML sitemap for backlinks from a PBN without leaking footprints?

Use a custom script that generates a separate sitemap for each domain. Exclude any URLs that link to the money site. Do not use the same IP or CMS pattern. Validate that no sitemap includes internal search pages or author pages that might expose relationships. Submit each sitemap individually to its respective Google Search Console account.

How do I check if Google indexed my pages after submitting an XML sitemap?

Use Google Search Console's URL Inspection tool. Enter a sample of URLs from the sitemap. The tool shows 'URL is on Google' or 'URL is not on Google'. Also check the 'Indexed' count in the Sitemaps report. For a quick bulk check, use a third-party tool that checks your sitemap URLs through the Search Console URL Inspection API. The Indexing API is not suited to this: it notifies Google about pages and does not report index status.

What should I do if my XML sitemap shows 0 indexed pages after submission?

First, verify the sitemap is valid and accessible (200 status, no robots.txt block). Use URL Inspection to test a few pages—they may be noindex or blocked. Check the 'Coverage' report for errors like 'Submitted URL blocked' or 'Submitted URL has crawl issue'. Ensure your pages are not behind a login. If everything looks correct, wait 1-2 weeks; Google may index them slowly.
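The robots.txt part of this check can be scripted with Python's standard library. This is a rough pre-check only: `urllib.robotparser` does not replicate every detail of Google's robots.txt matching, and `blocked_urls` is a hypothetical helper that takes the robots.txt text you have already downloaded.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]
```

Run it over the sitemap URL itself plus a sample of the URLs listed inside it. Anything returned here should be either unblocked in robots.txt or removed from the sitemap.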

How do I fix a sitemap that includes thousands of thin or duplicate product pages?

Audit all URLs for uniqueness (canonical tags, duplicate content). Exclude pages with less than 300 words of original content. Use canonical tags to consolidate duplicates, then include only the canonical URL in the sitemap. Set priority low (0.3) for thin pages if you must include them, but better to exclude entirely. Regenerate and resubmit.

Is there a way to automate the creation and submission of XML sitemaps for multiple client sites in an agency workflow?

Yes. Use a custom script that loops through client sites, generates sitemaps via API or database query, validates XML, and submits to each site's Google Search Console via the API. Tools like SpeedyIndex offer a dashboard for multi-site management. Set up cron jobs to regenerate daily. Monitor error logs for each client to catch validation failures early.

What are the best settings for priority and lastmod in an XML sitemap for a blog with daily updates?

Set priority: 1.0 for homepage and cornerstone articles, 0.8 for recent posts (last 30 days), 0.5 for older posts. For lastmod, use the actual publication or substantial update date—not the date a comment was added. Do not use lastmod if your CMS cannot output a stable, correct date; omit the tag entirely. Google will still crawl based on its own signals.
