For developers and SEOs managing 10k+ URLs, manual index checks are a bottleneck. The Google Index Status API lets you check index status programmatically, with batch queries, quota monitoring, and error handling. Here is how to wire it up without getting blocked.
If you are still pasting URLs into Google Search Console or using a browser extension for index checks, you are wasting hours that should go into triage. The Google Index Status API is not a toy: it is the only reliable way to verify at scale whether Googlebot sees your content. A common situation we see: an agency with 50,000 client pages runs a weekly script and finds 4,200 URLs returning 200 OK but INDEX_STATUS_NONE. Root cause? Bloated sitemaps and orphaned pages. Without the API, that pattern hides for weeks.
This guide covers the Google Indexing API setup, authentication pitfalls, real-world quota limits, and the scripts that separate pros from burnt-out SEOs.
| Method / Endpoint | What It Returns | Best Use Case | Failure Mode / Risk |
|---|---|---|---|
| URL Inspection (index.status) <code>POST /v1/urlInspection/index</code> | Index status, coverage state, crawlability, canonical, AMP, rich results | Single URL deep diagnostics for high-value pages (landing pages, product sheets) | Returns INDEX_STATUS_UNSPECIFIED if URL has never been crawled. Risk: Misinterpreted as 'not indexed' when it is only uncrawled. |
| Batch Indexing (indexing.batch) <code>POST /v1/urlNotifications:batch</code> | Confirmation of notification receipt; does NOT return index status | Requesting indexing for new/updated content (e.g., blog posts, news) | Quota is shared with the URL Inspection endpoint. Risk: Hitting 200/day cap when both flows run in parallel. |
| Fetch Queue (crawlQueue.list) <code>GET /v1/sites/{siteUrl}/crawlRequests</code> | List of pending crawl requests and their priority | Monitoring crawl backlog after a content refresh | Only available for verified Search Console properties. Risk: Empty results if Google ignores low-priority requests. |
Service account with domain-wide delegation. Scope: <code>https://www.googleapis.com/auth/webmasters.readonly</code>. Store key as JSON, not env var.
Max 200 URLs per batch. Filter out <code>noindex</code> pages first using your CMS or sitemap. Cuts API calls by 30-50%.
Send one POST per URL. The response includes <code>inspectionResult.indexStatusResult.verdict</code>; map it to PASS / FAIL / UNCERTAIN.
Group by <code>coverageState</code>: <strong>Submitted and indexed</strong>, <strong>Not indexed (blocked by robots.txt)</strong>, <strong>Crawled but not indexed</strong>.
Retry on <code>429</code> with exponential backoff. Log <code>403</code> as auth failure. Ignore <code>404</code> URLs.
Write results to BigQuery or CSV. Alert if >10% of URLs show <code>INDEX_STATUS_NONE</code> for 3 consecutive runs.
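The grouping and alerting steps above can be sketched as follows. This is a minimal sketch, assuming each result has already been flattened into a dict with <code>coverageState</code> and <code>status</code> keys; that shape, the <code>UNKNOWN</code> bucket, and all function names are hypothetical and not part of the API.

```python
from collections import Counter

# Status label used in this article for "no record of the URL".
NOT_INDEXED_MARKER = "INDEX_STATUS_NONE"


def group_by_coverage(results):
    """Count results per coverageState; rows without one land in 'UNKNOWN'."""
    return Counter(r.get("coverageState", "UNKNOWN") for r in results)


def none_share(results):
    """Share of results flagged with NOT_INDEXED_MARKER (0.0 for an empty run)."""
    if not results:
        return 0.0
    hits = sum(1 for r in results if r.get("status") == NOT_INDEXED_MARKER)
    return hits / len(results)


def should_alert(run_shares, threshold=0.10, consecutive=3):
    """True when each of the last `consecutive` runs exceeded `threshold`."""
    if len(run_shares) < consecutive:
        return False
    return all(share > threshold for share in run_shares[-consecutive:])
```

Store one <code>none_share</code> value per run and pass the history to <code>should_alert</code>; a single bad run then never pages anyone on its own.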
Scenario: E-commerce site with 1,500 product pages, weekly index audit.
Step 1: Sitemap extraction yields 1,500 URLs. Run a pre-filter: 230 URLs have noindex meta robots. Remove them. Left: 1,270.
Step 2: API quota = 200/day. 1,270 / 200 = 6.35 days. So split across 7 days: 182 URLs/day. Schedule cron job for 02:00 UTC daily.
Step 3: After 7 days, results: 1,270 queries. 1,025 indexed (80.7%). 145 not indexed. Breakdown: 82 blocked by robots.txt (disallowed for /product/), 63 Crawled - currently not indexed (thin content).
Step 4: Fix: update robots.txt for the blocked section. For thin pages, add canonical to parent category. Re-run API on those 63 after fixes—48 indexed within 48 hours.
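The scheduling arithmetic in this scenario is easy to reproduce. The snippet below is a small sketch using the scenario's own numbers; <code>plan_audit</code> is a hypothetical helper name, and the daily quota is passed in as a parameter rather than assumed.

```python
import math


def plan_audit(total_urls, noindex_urls, daily_quota):
    """Return (urls_to_check, days_needed, urls_per_day) for an even split."""
    to_check = total_urls - noindex_urls
    if to_check <= 0:
        return 0, 0, 0
    days = math.ceil(to_check / daily_quota)   # 1270 / 200 = 6.35 -> 7 days
    per_day = math.ceil(to_check / days)       # 1270 / 7 = 181.4 -> 182 per day
    return to_check, days, per_day


to_check, days, per_day = plan_audit(1500, 230, 200)
print(to_check, days, per_day)        # 1270 7 182
print(f"{1025 / to_check:.1%}")       # 80.7%
```

Splitting evenly (182 per day) rather than running six full days plus a remainder keeps each run under the cap with headroom for retries.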
In practice, when you script against the Google Index Status API, you will hit silent failures. The API returns a 200 OK even when the URL is blocked by robots.txt, so you have to inspect robotsTxtState in the response. Another trap: the verdict field can be PASS for a URL that is indexed but with a different canonical. That means your page is not the one ranking. You must compare googleCanonical (the canonical Google selected) with userCanonical (the one you declared).
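Both traps can be caught with a small post-processing check. The sketch below assumes the response is a parsed JSON dict whose index data sits under <code>inspectionResult.indexStatusResult</code> with <code>verdict</code>, <code>robotsTxtState</code>, and <code>googleCanonical</code> keys; treat those field names as assumptions to verify against the responses you actually receive. The function name and issue labels are hypothetical.

```python
def find_silent_issues(inspected_url, response):
    """Return issue labels for one inspection response that came back 200 OK."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    issues = []

    # A 200 response can still describe a URL that robots.txt blocks.
    if status.get("robotsTxtState") == "DISALLOWED":
        issues.append("blocked_by_robots")

    # PASS with a different Google-selected canonical means another URL is indexed.
    google_canonical = status.get("googleCanonical")
    if (
        status.get("verdict") == "PASS"
        and google_canonical
        and google_canonical != inspected_url
    ):
        issues.append("canonical_mismatch")

    return issues
```

The comparison is an exact string match, so normalize both URLs the same way first or a trailing slash will show up as a false mismatch.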
We also see duplicate URL lists crashing the batch endpoint. If you include both https://example.com/page and https://example.com/page/, Google treats them as separate. That eats your quota. Normalize trailing slashes before the call.
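A normalization pass before the call might look like the sketch below. It assumes your site's preferred form has no trailing slash; if your canonical URLs end in a slash, invert that rule. The helper names are illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase scheme and host, drop the fragment, strip a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/")
    if not path:
        path = "/"  # keep the root path explicit
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def dedupe(urls):
    """Normalize every URL and keep the first occurrence of each, in order."""
    seen = set()
    unique = []
    for url in urls:
        normalized = normalize_url(url)
        if normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique
```

Running <code>dedupe</code> on a list containing both <code>https://example.com/page</code> and <code>https://example.com/page/</code> leaves a single entry, so only one query is spent on that page.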
Google Cloud project with Search Console API enabled
Service account with JSON key (no expired secrets)
Domain-wide delegation for the service account
Verified Search Console property for the target site
Normalized URL list (no trailing slash duplicates, no fragments)
Retry logic for 429 and 5xx errors (exponential backoff)
Alerting when quota hits 80% (prevent silent failures)
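The last two checklist items can be sketched in a few lines. This is an illustration under stated assumptions, not a definitive client: <code>HttpStatusError</code> is a hypothetical exception your own HTTP wrapper would raise, <code>func</code> is whatever callable performs one request, and the 200/day default is the figure quoted in this article.

```python
import random
import time


class HttpStatusError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def backoff_delays(max_retries=5, base=1.0, cap=60.0):
    """Delays of base * 2**attempt seconds, each capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retry(func, max_retries=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Call func(), retrying on 429 and 5xx; other statuses are re-raised."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return func()
        except HttpStatusError as err:
            if err.status != 429 and not 500 <= err.status < 600:
                raise
            sleep(delay + random.uniform(0, delay * 0.1))  # small jitter
    return func()  # final attempt; any error propagates to the caller


def quota_warning(used, daily_limit=200, percent=80):
    """True once usage reaches `percent` of the daily limit."""
    return used * 100 >= daily_limit * percent
```

Passing <code>sleep</code> as a parameter keeps the retry loop testable without real waiting.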
The default quota is 200 URLs per day per Google Cloud project. For sites with 10k+ URLs, you need to distribute checks across multiple projects or request a quota increase via Google Cloud Console. Each inspection counts as one query, even if the URL is invalid. Plan your crawl frequency accordingly—weekly audits for 1,500 URLs take 7-8 days at default quota.
The Indexing API (indexing.batch) is for notifying Google of new or updated content—it does NOT return index status. The URL Inspection API (urlInspection.index) returns actual index status, coverage state, and canonical information. For automated index checking, always use the Inspection endpoint. The Indexing API is only for pushing URLs, not verifying them.
Use a single service account with domain-wide delegation. In the Google Cloud Console, add each client domain as a verified Search Console property. The service account must be added as an owner or user in each Search Console property. Then in your script, specify the siteUrl parameter. This avoids handling multiple keys—but monitor quotas per project, not per property.
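To make the siteUrl step concrete, here is a rough sketch that picks the matching property for each URL and builds a request body. It assumes properties are listed either as URL-prefix strings (<code>https://client-a.example/</code>) or as domain properties (<code>sc-domain:client-b.example</code>), and that the request body uses <code>inspectionUrl</code> and <code>siteUrl</code> keys; the helper names and example domains are hypothetical.

```python
from urllib.parse import urlsplit

DOMAIN_PREFIX = "sc-domain:"


def pick_property(url, properties):
    """Return the longest property string that covers `url`, or None."""
    host = urlsplit(url).hostname or ""
    best = None
    for prop in properties:
        if prop.startswith(DOMAIN_PREFIX):
            domain = prop[len(DOMAIN_PREFIX):]
            matches = host == domain or host.endswith("." + domain)
        else:
            matches = url.startswith(prop)
        if matches and (best is None or len(prop) > len(best)):
            best = prop
    return best


def build_request(url, properties):
    """Build the JSON body for one inspection call."""
    site_url = pick_property(url, properties)
    if site_url is None:
        raise ValueError(f"No verified property covers {url}")
    return {"inspectionUrl": url, "siteUrl": site_url}
```

Failing fast on an uncovered URL saves a query that would otherwise come back as a permission error.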
INDEX_STATUS_NONE means Google has no record of the URL—it has never been crawled or has been removed. This is different from 'crawled but not indexed.' First verify the URL is not blocked by robots.txt or noindex. If the page is new, push it via Indexing API and wait 24-48 hours. If it is old, check sitemap inclusion and internal linking. Do not immediately re-submit; investigate root cause.
Yes, but with a caveat: the API only works for URLs on Search Console properties you own or manage. For guest posts on external domains, you cannot use this API directly. Instead, ask the host site to add you as a property user, or use third-party tools like SERP APIs. For your own site, automate checks on guest post landing pages to ensure they get indexed quickly.
PASS means the URL is indexed, not that it ranks. A page can be indexed with a different canonical URL, meaning Google treats it as a duplicate. Compare the googleCanonical and userCanonical fields in the response, and also check robotsTxtState and coverageState. If the page is indexed but under a different canonical, update your rel=canonical tags or consolidate content.
Top errors: 403 (auth failure—check service account permissions), 429 (quota exceeded—implement exponential backoff), 400 (invalid URL format—normalize to absolute, no fragments). Silent errors: API returns 200 but verdict is UNSPECIFIED because URL was never submitted. Also, missing trailing slash normalization causes duplicate queries that waste quota.
Write a script that: 1) reads URLs from your sitemap or CMS, 2) filters out noindex URLs, 3) batches them in groups of 200 (daily limit), 4) calls urlInspection.index for each, 5) logs results to a database, 6) alerts on anomalies (e.g., >10% not indexed). Schedule via cron or Cloud Scheduler. Example: Monday 02:00 UTC for URLs 1-200, Tuesday for 201-400, etc.
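The batching step (3) can be sketched as below. The batch size of 200 mirrors the daily limit quoted in this article, and <code>daily_batches</code> and <code>schedule</code> are illustrative names; reading the sitemap, calling the API, and logging are left out.

```python
from itertools import islice


def daily_batches(urls, batch_size=200):
    """Yield consecutive slices of at most `batch_size` URLs."""
    iterator = iter(urls)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def schedule(urls, batch_size=200):
    """Map a day offset (0 = first run day) to the URLs checked that day."""
    return dict(enumerate(daily_batches(urls, batch_size)))
```

With 450 URLs, <code>schedule</code> returns three entries: days 0 and 1 hold 200 URLs each and day 2 holds the remaining 50. A cron job can then look up its batch by day offset.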
The API itself is free, but quota is limited (200 URLs/day). For 10k+ URLs, you need multiple Google Cloud projects or request a quota increase (not guaranteed). Alternatively, use Search Console API for aggregated data (not per-URL). For strict programmatic index checking at scale, consider third-party tools or a private proxy pool. Budget for developer hours to build robust error handling.