This site's search functionality.
The site search is part of every version of docs.github.com. On any page, you can use the search box to search the documents we've indexed.
You can also query our search endpoint directly. It responds in JSON format and fronts our search querying functionality. We recommend using it, as it will be more stable. The endpoint is:
https://docs.github.com/search?version=<VERSION>&language=<LANGUAGE CODE>&query=<QUERY>
- The VERSION can be any numbered supported GitHub Enterprise Server version (e.g., 3.12), Enterprise Cloud (ghec), or the Free, Pro, and Team plans (dotcom).
- The LANGUAGE CODE can be one of: es, ja, pt, zh, ru, fr, ko, de
- The QUERY can be any alphanumeric string value.
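As a minimal sketch, the URL above could be assembled like this in TypeScript. The parameter names version, language, and query come from the URL template; the helper buildSearchUrl and the SearchParams type are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper: fills in the search URL template shown above.
interface SearchParams {
  version: string; // e.g. "3.12", "ghec", or "dotcom"
  language: string; // e.g. "es", "ja", "pt"
  query: string; // any alphanumeric string
}

const SEARCH_BASE = "https://docs.github.com/search";

function buildSearchUrl({ version, language, query }: SearchParams): string {
  const pairs: Array<[string, string]> = [
    ["version", version],
    ["language", language],
    ["query", query],
  ];
  // encodeURIComponent keeps spaces and reserved characters URL-safe.
  const queryString = pairs
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${SEARCH_BASE}?${queryString}`;
}

// Example: a Japanese-language search for "clone" on Enterprise Cloud.
const exampleSearchUrl = buildSearchUrl({
  version: "ghec",
  language: "ja",
  query: "clone",
});
```

Encoding each value separately means a multi-word query such as "pull request" still produces a valid URL.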
Our backend currently supports two "types" of searching.
All searches accept a query param, e.g. ?query=how, and return results based on their type:
- General search
  - Results: The pages of our sites that match the query, sorted by popularity
  - Example: Query = "clone" -> Results
  - Endpoint: /api/search/v1
- AI search autocomplete
  - Results: Human-readable full-sentence questions that best match the query. Questions are based on previous searches and popular pages
  - Example: Query = "How do I clone" -> A Result = "How do I clone a repository?"
  - Endpoint: /api/search/ai-search-autocomplete/v1
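A small sketch of how a client might target the two search types. The paths and the query param come from the list above. The SearchType names, the helper apiSearchUrl, and the assumption that the API is served from docs.github.com are illustrative only. The response shape is not described here, so the sketch stops at building the request URL.

```typescript
// The two search types described above, keyed by illustrative names.
type SearchType = "general" | "aiAutocomplete";

const SEARCH_ENDPOINTS: Record<SearchType, string> = {
  general: "/api/search/v1",
  aiAutocomplete: "/api/search/ai-search-autocomplete/v1",
};

// Assumption: the API endpoints live on the same host as the docs site.
const DOCS_ORIGIN = "https://docs.github.com";

function apiSearchUrl(type: SearchType, query: string): string {
  const path = SEARCH_ENDPOINTS[type];
  return `${DOCS_ORIGIN}${path}?query=${encodeURIComponent(query)}`;
}

const generalUrl = apiSearchUrl("general", "clone");
const autocompleteUrl = apiSearchUrl("aiAutocomplete", "How do I clone");
```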
Elasticsearch is an external service that we use for searching. When a user types a search, our backend queries Elasticsearch for the most relevant results.
To provide relevant results to queries, we prefill Elasticsearch with data via indexes. See the Indexing README for how we index on Docs.
A GitHub Actions workflow that runs every 24 hours syncs the search data. This process generates structured data for all pages on the site and compares that data to what's currently in search. It then adds, updates, or removes indices based on the diff of the local and remote data, taking care not to create duplicate records and avoiding any unnecessary (and costly) indexing operations.
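The diff step can be pictured with a simplified sketch. This is not the sync script itself: the PageRecord shape, the SyncPlan type, and the planSync function are hypothetical, and the sketch assumes each record carries a stable objectID so that local and remote data can be matched up.

```typescript
// Hypothetical record shape; the real indexed fields are not shown in this README.
interface PageRecord {
  objectID: string; // stable ID, so re-uploads overwrite instead of duplicating
  title: string;
  content: string;
}

interface SyncPlan {
  toAdd: PageRecord[];
  toUpdate: PageRecord[];
  toRemove: string[]; // objectIDs that exist remotely but are no longer generated locally
}

function planSync(local: PageRecord[], remote: PageRecord[]): SyncPlan {
  const remoteById = new Map<string, PageRecord>();
  for (const rec of remote) remoteById.set(rec.objectID, rec);
  const localIds = new Set<string>(local.map((rec) => rec.objectID));

  const plan: SyncPlan = { toAdd: [], toUpdate: [], toRemove: [] };
  for (const rec of local) {
    const existing = remoteById.get(rec.objectID);
    if (existing === undefined) {
      plan.toAdd.push(rec);
    } else if (existing.title !== rec.title || existing.content !== rec.content) {
      plan.toUpdate.push(rec);
    }
    // Identical records are skipped, which avoids unnecessary indexing operations.
  }
  for (const rec of remote) {
    if (!localIds.has(rec.objectID)) plan.toRemove.push(rec.objectID);
  }
  return plan;
}
```

Keying both sides by the same ID is what keeps the plan free of duplicates: a page that already exists remotely can only land in toUpdate or be skipped, never in toAdd.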
The workflow runs are only accessible to GitHub employees using internal resources.
You can manually run the workflow to generate the indexes after you push your changes to main to speed up the indexing when needed. We recommend doing this only for the free-pro-team@latest version and the en language, because running all languages and versions takes about 40 minutes. To run it manually, click the "Run workflow" button in the Actions tab and enter the language and version you'd like to generate the indexes for as inputs to the workflow. By default, all languages and versions are generated.
The preferred way to build and sync the search indices is to do so via the GitHub Actions workflow.
- .github/workflows/index-general-search.yml - Populates search indices for general search using the main branch every four hours. Search indices are stored in an internal-only Elasticsearch instance. To run it manually, click the "Run workflow" button in the Actions tab.
- .github/workflows/index-autocomplete-search.yml - Populates search indices for AI search autocomplete using data from an internal repo. Runs daily.
- src/search/components/Search.tsx - The browser-side code that enables the search.
- src/search/components/SearchResults.tsx - The browser-side code that displays search results.
- src/search/middleware/general-search-middleware.ts - Entrypoint to general search when you hit docs.github.com/search
- src/search/middleware/search-routes - Entrypoint to the API endpoints for our search routes
- src/search/scripts/ - Scripts used by Actions workflows or for manual operations like scraping data for indexing and performing the indexing.
- src/search/tests - Tests relevant to searching.
- It's not strictly necessary to set an objectID as the search index will create one automatically, but by creating our own we have a guarantee that subsequent invocations of this upload script will overwrite existing records instead of creating numerous duplicate records with differing IDs.
- Our search querying has typo tolerance. Try spelling something wrong and see what you get!
- Our search querying has lots of controls for customizing each index, so we can add weights to certain attributes and create rules like "title is more important than body", etc. But it works pretty well as-is without any configuration.
- Our search querying has support for "advanced query syntax" for exact matching of quoted expressions and exclusion of words preceded by a - sign. This is off by default, but it is enabled in our browser client. The settings in the web interface can be overridden by the search endpoint. See middleware/search.ts.
- When needed, the Docs Engineering team can commit updates to the search index, as long as the label skip-index-check is applied to the PR.
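To illustrate the "advanced query syntax" mentioned above, here is a simplified sketch of how a query string could be split into exact phrases, excluded words, and ordinary terms. It is not the code in our middleware or in Elasticsearch. parseAdvancedQuery and ParsedQuery are hypothetical names, and real query parsing handles more edge cases.

```typescript
interface ParsedQuery {
  phrases: string[]; // quoted expressions, to be matched exactly
  excluded: string[]; // words that were preceded by a - sign
  terms: string[]; // everything else
}

function parseAdvancedQuery(input: string): ParsedQuery {
  const parsed: ParsedQuery = { phrases: [], excluded: [], terms: [] };
  // Match either a "quoted phrase" or a run of non-whitespace characters.
  const tokenPattern = /"([^"]+)"|(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = tokenPattern.exec(input)) !== null) {
    const phrase = match[1];
    const word = match[2];
    if (phrase !== undefined) {
      parsed.phrases.push(phrase);
    } else if (word !== undefined) {
      if (word.length > 1 && word.charAt(0) === "-") {
        parsed.excluded.push(word.slice(1)); // drop the leading - sign
      } else {
        parsed.terms.push(word);
      }
    }
  }
  return parsed;
}

const parsedExample = parseAdvancedQuery('git "pull request" -fork');
// parsedExample.terms    -> ["git"]
// parsedExample.phrases  -> ["pull request"]
// parsedExample.excluded -> ["fork"]
```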
- Team: Docs Engineering
- Primary contacts: @docs-engineering (GitHub team)
- Search infrastructure: Internal Elasticsearch cluster for autocomplete and general search results, and an external RAG app (cse-copilot) owned by @github/customer-success-engineering for LLM-generated responses
- Slack: #docs-engineering
If search is not working:
- Check search health
  - Test search on docs.github.com
  - Check Elasticsearch cluster status (internal)
  - Review recent deploys and index updates
- Index issues
  - Check .github/workflows/index-general-search.yml logs
  - Verify last successful index run
  - Test manual index update for single version/language
- API issues
  - Check /api/search/v1 endpoint
  - Review middleware logs for errors
  - Test search queries directly against API
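The API checks above could be scripted along these lines. This is a sketch only: probeSearchApi, classifyResponse, ProbeResult, and the injected HttpGet function are hypothetical names, and the probe relies only on what this README states, namely that the endpoint accepts a query param and responds with JSON.

```typescript
// Minimal response data the probe needs; a thin wrapper around any HTTP client can supply it.
interface HttpResult {
  status: number;
  body: string;
}
type HttpGet = (url: string) => Promise<HttpResult>;

type ProbeResult = "ok" | "http-error" | "invalid-json";

// A non-2xx status points at the API or middleware;
// a body that is not valid JSON points at a broken response.
function classifyResponse(result: HttpResult): ProbeResult {
  if (result.status < 200 || result.status >= 300) return "http-error";
  try {
    JSON.parse(result.body);
    return "ok";
  } catch {
    return "invalid-json";
  }
}

// Assumption: the API is reachable on the public docs host.
async function probeSearchApi(httpGet: HttpGet, query: string): Promise<ProbeResult> {
  const probeUrl = `https://docs.github.com/api/search/v1?query=${encodeURIComponent(query)}`;
  return classifyResponse(await httpGet(probeUrl));
}
```

Passing the HTTP call in as a function keeps the classification logic testable without network access. In a runtime with a global fetch, the wrapper only needs to return the response status and text.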
- Real-time indexing - Reduce lag between content changes and search index
- Relevance tuning - Improve search result ranking and quality
- Performance optimization - Faster search queries and results
- Version handling - Better support for version-specific search
- Language support - Improve multilingual search quality
- Faceted search - Filter by product, version, content type
- Search analytics - Track what users are searching for
- Did you mean - Suggest corrections for misspellings
- Related searches - Show similar or related queries
- Result previews - Better snippets and highlighting
- Query understanding - Better interpret user intent
- Answer generation - Provide direct answers, not just links
- Contextual results - Consider user's current page/version
- Personalization - Learn from search patterns
- Index efficiency - Reduce index size and update time
- Cache optimization - Improve query caching
- API versioning - Stable search API with version control
- Testing coverage - More comprehensive search tests
- Error handling - Better error messages and recovery
- Elasticsearch upgrade - Keep cluster up to date
- Redundancy - Improve search availability
- Monitoring - Better observability of search health
- Cost optimization - Reduce Elasticsearch costs
- Index validation - Ensure all pages are indexed correctly
- Freshness indicators - Show when content was last updated
- Broken link detection - Identify 404s in search results
- Duplicate detection - Prevent duplicate results
Search is largely KTLO (keep the lights on). We will continue to ensure the search is working as expected and support updates to both Elasticsearch and Copilot models underlying our search.
- Index lag - 24-hour delay between content changes and search updates
- Manual triggers - Urgent updates require manual workflow run
- Full reindex - Can't update individual pages incrementally
- Version complexity - Hard to search across all versions simultaneously
- Full index rebuild takes ~40 minutes for all versions/languages
- Single version/language takes ~5-10 minutes
- Search queries cached but cache can become stale
- High search volume can impact Elasticsearch cluster
