Stop waiting for crawlers. Programmatically notify Google of new or updated pages. This guide covers API setup, quota limits, error handling, and production-ready code in Python and JavaScript.
The Indexing API is not for every page. It's built for time-sensitive content: job postings, event pages, live blogs, and product availability updates. You send a URL and a notification type; the structured data lives on the page itself, and Google decides when to crawl and index. Structured data is mandatory: without it the API returns a 400 error. If you run a large site with 50,000 new job listings daily, waiting for a standard crawl means losing revenue. Automation changes that.
A common situation we see: a team implements the API, hits quota at 10:00 AM, and wonders why half their URLs are missing. Quota is 200 URLs per day per verified property. That's tight. You must prioritize. URLs with no structured data get silently ignored. URLs already indexed get a 200 response but no re-crawl guarantee. Use the index status checker to verify before submission.
Must have owner permission. Use domain property for API access.
Enable Indexing API in GCP console. Create service account with JSON key.
In Search Console, add service account email as owner (delegate via GCP group).
Embed JSON-LD for JobPosting or Event. Validate with Rich Results Test.
Use the urlNotifications.publish method. One URL per call. Handle 429 and retry with exponential backoff.
Track dailyUsage and errors via Cloud Monitoring. Adjust priority based on business impact.
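The retry step can be sketched as a small helper. This is a minimal illustration, not the API client's own behavior: `call_with_backoff` and its `send` argument (a zero-argument callable returning an HTTP status code) are hypothetical names, and the set of retryable codes is an assumption.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical zero-argument callable returning an HTTP status code.
    """
    retryable = {429, 500, 503}  # assumption: treat these as transient
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status not in retryable:
            return status
        if attempt < max_attempts - 1:
            # Delays grow 2s, 4s, 8s, ... plus up to 1s of random jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return status
```

Backoff helps with short rate-limit bursts. If the 429 means the daily quota is spent, retrying the same day is unlikely to succeed, so it is usually better to stop and resume after the reset.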
| Error Code | Error Message | Root Cause | Fix / Workaround |
|---|---|---|---|
| 400 | No matching structured data found | URL missing required schema (JobPosting, Event, etc.) | Add JSON-LD and test with Rich Results Test tool. |
| 403 | Permission denied | Service account not added as owner in Search Console | Verify property ownership. Add service account via GCP group delegation. |
| 429 | Quota exceeded | More than 200 URLs/day or request rate too high | Reduce batch rate. Use exponential backoff. Prioritize high-value URLs. |
| 500 | Internal error | Google server-side issue (rare) | Retry with backoff (2s, 4s, 8s). If persists, submit to Google Issue Tracker. |
| 409 | URL already pending | Duplicate submission within brief window | Deduplicate queue. Skip URLs that returned 200 in last hour. |
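One way to act on the table is to map each status code to a next step in the submission queue. The sketch below is an illustration only; the action names and the `next_action` helper are hypothetical, and the mapping simply mirrors the rows above.

```python
# Hypothetical action labels; the mapping follows the error table above.
ACTIONS = {
    400: "fix_page",     # add or repair JSON-LD, then resubmit
    403: "fix_access",   # service account is not an owner of the property
    409: "skip",         # duplicate submission; do not retry
    429: "retry_later",  # back off, or wait for the quota reset
    500: "retry_later",  # transient server-side error
}


def next_action(status):
    """Return what the queue should do with a URL after a given HTTP status."""
    if 200 <= status < 300:
        return "done"
    return ACTIONS.get(status, "investigate")
```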
Assume a job board with 2,000 new listings per day. Quota is 200. You must choose. Priority: 1) Paid listings, 2) Listings from top 50 companies, 3) All others. Script below (Python) reads a CSV with columns: url, priority, title, datePosted. It sorts by priority, takes top 200, and sends each to the Indexing API.

    import csv
    import time

    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    SCOPES = ['https://www.googleapis.com/auth/indexing']
    DAILY_QUOTA = 200

    creds = service_account.Credentials.from_service_account_file('key.json', scopes=SCOPES)
    service = build('indexing', 'v3', credentials=creds)

    # Priority 1 (paid listings) is the most important tier, so sort ascending.
    with open('jobs.csv', newline='') as f:
        rows = sorted(csv.DictReader(f), key=lambda r: (int(r['priority']), r['url']))

    # Cap the run at the daily quota; failed calls count as attempts here too.
    for row in rows[:DAILY_QUOTA]:
        body = {'url': row['url'], 'type': 'URL_UPDATED'}
        try:
            service.urlNotifications().publish(body=body).execute()
            print(f'OK {row["url"]}')
        except Exception as e:  # broad on purpose: log the failure and move on
            print(f'FAIL {row["url"]}: {e}')
        time.sleep(1)  # pause between calls to stay under the rate limit

Edge case: if a URL returns 409 (already pending), log it and skip. Do not retry — you burn quota. After submission, run a check with the index status checker. Typical result: ~80% indexed within 2 hours, ~95% within 24 hours. The remaining 5% often have no structured data or are blocked by robots.txt.
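To avoid triggering the 409 case in the first place, a short-lived cache of recent submissions can sit in front of the API call. This is a minimal in-memory sketch; `RecentSubmissions` is a hypothetical helper, and the one-hour window is taken from the error table, not from any documented API behavior.

```python
import time


class RecentSubmissions:
    """Remember when each URL was last submitted and block quick repeats."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the window can be tested
        self._seen = {}

    def should_submit(self, url):
        """Return True and record the URL unless it was submitted within the TTL."""
        now = self._clock()
        last = self._seen.get(url)
        if last is not None and now - last < self._ttl:
            return False
        self._seen[url] = now
        return True
```

A production queue would likely keep this state in a shared store so that several workers see the same history.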
200 URLs per day per property. No batch endpoint. A site with 10,000 pages takes 50 days to notify. That's not automation — that's a trickle. The canonical URL mismatch is a silent killer. Google might index a version with a trailing slash, and your API call uses the non-slash variant. Both get indexed, splitting signals. Canonical confusion wastes quota and dilutes ranking. Always canonicalize URLs before sending.
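A small normalizer applied before every submission keeps slash variants from diverging. The sketch below assumes the site's canonical form has no trailing slash; that policy, and the `normalize_url` name, are assumptions. The real source of truth is whatever the page declares in rel=canonical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url, trailing_slash=False):
    """Lowercase scheme and host, unify the trailing slash, drop the fragment."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    if path != "/":
        path = path.rstrip("/")
        if trailing_slash:
            path += "/"
    # The query string is kept as-is; only the fragment is removed.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))
```

For example, `normalize_url("HTTPS://Example.com/jobs/123/")` and `normalize_url("https://example.com/jobs/123")` both yield the same string, so the two variants collapse into one queue entry.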
Another operational failure: sending URLs that return 404 or redirect. The API accepts them, but Google treats them as low-quality. You burn quota on dead pages. Filter your list: resolve each URL server-side, check HTTP status 200, and discard non-200. Also remove URLs with noindex meta or X-Robots-Tag. Empty results from your CMS pipeline? That's a data pipeline bug, not an API problem.
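The filter described above can be sketched as a pure function over an already-fetched response. The `is_submittable` helper is hypothetical, and the simple substring check for "noindex" is an assumption that ignores bot-specific directives; fetching the page is left to whatever HTTP client the pipeline already uses.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collect the content of <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attr.get("content", "").lower())


def is_submittable(status, headers, html):
    """True only for HTTP 200 pages without noindex in headers or meta tags."""
    if status != 200:
        return False
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return False
    parser = _RobotsMetaParser()
    parser.feed(html)
    return not any("noindex" in d for d in parser.directives)
```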
URL is live and returns HTTP 200 (not 3xx, 4xx, or 5xx).
Page contains valid structured data (JobPosting, Event, or BroadcastEvent).
Canonical URL is consistent across sitemap, rel=canonical, and API call.
Page is not blocked by robots.txt or meta robots noindex.
Service account is added as owner in Search Console (via GCP group).
Quota not exceeded today (check via Cloud Monitoring or Search Console API).
URL not already submitted in the last hour (deduplication cache).
Content is unique and substantial (Google may ignore thin pages).
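Most of the checklist can be run automatically before each call. The sketch below assumes the pipeline has already collected the facts about each page; the `Candidate` record, its field names, and the failure labels are all hypothetical. Ownership of the property and content quality are not covered, since they cannot be judged from these fields.

```python
from dataclasses import dataclass

# Schema types named in the checklist above.
ACCEPTED_TYPES = {"JobPosting", "Event", "BroadcastEvent"}


@dataclass
class Candidate:
    url: str
    status: int               # HTTP status from a fresh fetch
    schema_types: tuple       # schema.org types found in the page's JSON-LD
    canonical: str            # value of rel=canonical on the page
    noindex: bool             # robots.txt, meta robots, or X-Robots-Tag block
    recently_submitted: bool  # hit in the deduplication cache


def preflight(candidate, quota_left):
    """Return the list of failed checks; an empty list means ready to submit."""
    failures = []
    if candidate.status != 200:
        failures.append("not_200")
    if not ACCEPTED_TYPES.intersection(candidate.schema_types):
        failures.append("no_structured_data")
    if candidate.canonical != candidate.url:
        failures.append("canonical_mismatch")
    if candidate.noindex:
        failures.append("noindex")
    if candidate.recently_submitted:
        failures.append("recently_submitted")
    if quota_left <= 0:
        failures.append("quota_exhausted")
    return failures
```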
The Indexing API lets you programmatically notify Google when a URL is added or updated. It's designed for time-sensitive content like job postings or events. You send a URL plus structured data; Google decides when to crawl. For large sites, quota is 200 URLs/day per property, so you must prioritize high-value pages.
Create a separate GCP project per client or use a single project with multiple service accounts. Each property needs its own service account email added as owner in Search Console. Watch out: quota is per property, not per project. If you manage 50 sites, that's 200 URLs/site/day total, not a shared pool.
Only three schema.org types are accepted: JobPosting, Event, and BroadcastEvent. The structured data must be embedded as JSON-LD on the page. If the URL lacks valid structured data, the API returns a 400 error. Validate with Google's Rich Results Test before submission.
Quota is 200 URLs/day per verified property. No batch endpoint exists. To handle bulk, prioritize: assign scores to URLs (e.g., based on page traffic or business value), submit only the top 200 daily. Use Cloud Monitoring to track usage. If you need more, create additional properties (e.g., subdomain per country).
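Picking the daily batch from scored URLs can be sketched in a few lines. This assumes each URL already has a numeric score where higher means more valuable; how the score is computed (traffic, revenue, tier) is up to the site, and `select_daily_batch` is a hypothetical name.

```python
import heapq

DAILY_QUOTA = 200  # the per-property figure quoted above


def select_daily_batch(pages, quota=DAILY_QUOTA):
    """pages: iterable of (url, score) pairs. Returns the top URLs, best first."""
    best = heapq.nlargest(quota, pages, key=lambda page: page[1])
    return [url for url, _ in best]
```

`heapq.nlargest` avoids sorting the full list, which matters little at 2,000 URLs but helps when the candidate pool runs to millions.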
400: missing structured data — add JSON-LD. 403: permission denied — add service account as owner in Search Console. 429: quota exceeded — reduce rate and retry with backoff. 409: duplicate — deduplicate queue. 500: server error — retry with exponential backoff. Always log and analyze errors per URL.
Use the URL Inspection tool in Search Console, or call the Search Console API's urlInspection.index.inspect method. For bulk checks, a third-party index status checker such as https://teletype.in/@speedyindex/check-if-google-indexed is faster. Expect ~80% indexed within 2 hours, ~95% within 24 hours if structured data is valid.
Common reasons: URL lacks required structured data, URL returns redirect or 404, page has noindex tag, content is thin or duplicate, canonical URL mismatch, or quota was exceeded earlier. Google also may ignore if the page is not deemed fresh enough. Validate each URL against the checklist before submission.
No. The Indexing API is only for pages with JobPosting, Event, or BroadcastEvent structured data. It is not for generic content or backlinks. Using it for non-qualifying pages will return 400 errors and waste quota. For guest posts, use standard sitemap submission and internal linking.
URL_UPDATED tells Google a page is new or changed. URL_DELETED indicates the page no longer exists (returns 404 or 410). Use URL_DELETED for removed job listings or expired events; it helps Google remove them from search faster without waiting for a recrawl. Both count toward the 200/day quota.
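Choosing between the two notification types can follow the page's current HTTP status. The sketch below builds the request body in the same shape as the script earlier in this guide; `build_notification` is a hypothetical helper, and skipping any other status is an assumption.

```python
def build_notification(url, status):
    """Return the publish body for a URL, or None if it should not be sent."""
    if status == 200:
        kind = "URL_UPDATED"
    elif status in (404, 410):
        kind = "URL_DELETED"
    else:
        # Redirects and server errors: skip rather than spend quota.
        return None
    return {"url": url, "type": kind}
```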
You can't with one property — quota is 200/day. Split listings across subdomain properties (e.g., city1.example.com, city2.example.com) with separate Search Console verification. Each subdomain gets its own 200 quota. Then write a Python script that reads a DB, assigns each listing to a subdomain, and submits the top 200 per property per day.
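The per-property split might look like the sketch below. It assumes each hostname corresponds to its own verified property and that listings arrive as (url, score) pairs; `batches_by_property` is a hypothetical name, and reading from a database is left out.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def batches_by_property(listings, quota=200):
    """Group (url, score) pairs by hostname and keep the top `quota` per host."""
    grouped = defaultdict(list)
    for url, score in listings:
        grouped[urlsplit(url).netloc.lower()].append((score, url))
    return {
        host: [url for _, url in sorted(items, key=lambda item: -item[0])[:quota]]
        for host, items in grouped.items()
    }
```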
We worked with a client who sent the same 200 URLs every day for a week. They thought the API was persistent. It's not. Each call is an independent notification. Google may ignore repeated submissions if content hasn't changed. Worse: they sent URLs with different query parameters (e.g., /job?id=123 and /job/123). Both were indexed, creating duplicate content. Canonical confusion wasted their quota. Fix: deduplicate at the source. Store canonical URLs in a DB column, and only submit unique ones that have actually changed. Use the canonical URL guide to normalize.
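Submitting only what changed can be approximated by hashing page content per canonical URL. This is a minimal sketch: `changed_urls` is a hypothetical helper, the hashes would normally live in the same DB column set as the canonical URL, and hashing the full content is an assumption (a real pipeline might hash only the fields that matter, such as title and description).

```python
import hashlib


def changed_urls(pages, previous_hashes):
    """pages: {canonical_url: content}; previous_hashes: {canonical_url: sha256 hex}.

    Returns (urls_to_submit, updated_hashes). New and modified pages are submitted.
    """
    to_submit = []
    updated = dict(previous_hashes)
    for url, content in pages.items():
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if previous_hashes.get(url) != digest:
            to_submit.append(url)
            updated[url] = digest
    return to_submit, updated
```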
Another edge case: slow vendors. One API call takes ~500ms. 200 calls = 100 seconds. If your script runs synchronously, it blocks. Solution: use async I/O or parallel threads with rate limiting. Python's asyncio with aiohttp works well. But don't exceed 10 concurrent requests — Google returns 429 even within the 200 quota if you blast them. Throttle to 5 requests/second.
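The throttling described above can be sketched with plain asyncio. The `send` argument is a hypothetical coroutine function that performs one API call and returns a status code (for instance a thin aiohttp wrapper); the limits of 10 concurrent calls and 5 per second are the figures from the paragraph above, not documented API limits.

```python
import asyncio


async def submit_all(urls, send, max_concurrent=10, per_second=5.0):
    """Submit URLs with a cap on in-flight calls and on how fast calls start."""
    semaphore = asyncio.Semaphore(max_concurrent)
    interval = 1.0 / per_second

    async def worker(url):
        try:
            return url, await send(url)
        finally:
            semaphore.release()  # free the slot even if send() raised

    tasks = []
    for url in urls:
        await semaphore.acquire()       # wait for a free slot
        tasks.append(asyncio.create_task(worker(url)))
        await asyncio.sleep(interval)   # space out request starts
    return await asyncio.gather(*tasks)
```

With these defaults, 200 URLs start over roughly 40 seconds. Results come back as (url, status) pairs in submission order, so they can be logged per URL.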