Typesense Search

The platform uses self-hosted Typesense for full-text search with faceted filtering by client and site. The search experience is provided by docusaurus-theme-search-typesense with a custom swizzled SearchBar component.


Architecture Overview

Browser Search Modal
└── Typesense Query (with filter_by: client_tag, site_tag)
    └── Typesense Server (localhost:8108)
        └── Collection: rhino-docs (indexed by DocSearch scraper)

How Faceted Filtering Works

  1. Build time: The docsearch-meta-plugin injects <meta name="docsearch:client_tag"> and <meta name="docsearch:site_tag"> into every HTML page
  2. Scrape time: The DocSearch scraper reads these meta tags and stores them as faceted fields in Typesense
  3. Search time: The swizzled SearchBar component lets users select a client and/or site from dropdowns, which are sent as filter_by parameters to Typesense
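Steps 1 and 3 can be sketched as two small helpers. This is a minimal illustration, not the project's actual plugin or SearchBar code: the function names docsearchMetaTags and buildFilterBy are hypothetical, and the sketch assumes client and site slugs are plain strings that need no HTML or filter escaping.

```typescript
// Hypothetical helpers; names are illustrative, not taken from the project.

/** Step 1: the meta tags the build step is described as injecting. */
function docsearchMetaTags(client: string, site: string): string[] {
  return [
    `<meta name="docsearch:client_tag" content="${client}">`,
    `<meta name="docsearch:site_tag" content="${site}">`,
  ];
}

/** Step 3: combine the optional dropdown selections into a filter_by string. */
function buildFilterBy(client?: string, site?: string): string {
  const clauses: string[] = [];
  // ":=" requests an exact match on a faceted string field.
  if (client) clauses.push(`client_tag:=${client}`);
  if (site) clauses.push(`site_tag:=${site}`);
  // "&&" requires every clause to match; no selection means no filter.
  return clauses.join(" && ");
}
```

With both dropdowns set, buildFilterBy("acme", "plant-1") yields a filter that restricts results to one site of one client; with nothing selected it returns an empty string, so the query runs unfiltered.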

Typesense Server Setup

Typesense runs via Docker using typesense/docker-compose.yml:

cd typesense
cp .env.example .env
# Edit .env and set TYPESENSE_API_KEY to a strong random string
docker compose up -d typesense

Verify:

curl http://localhost:8108/health
# {"ok":true}

Search-Only API Key

The admin key should never be exposed to the browser. Create a search-only key:

curl -X POST 'http://localhost:8108/keys' \
  -H 'X-TYPESENSE-API-KEY: <your-admin-key>' \
  -H 'Content-Type: application/json' \
  -d '{
    "description": "Search-only key for Docusaurus",
    "actions": ["documents:search"],
    "collections": ["rhino-docs"]
  }'

Put the returned key value in docusaurus.config.ts, under themeConfig.typesense.typesenseServerConfig.apiKey.
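For orientation, the relevant part of the config might look like the sketch below. The option names follow the docusaurus-theme-search-typesense README as commonly documented; the host, port, and protocol are taken from the local setup described on this page, and the apiKey value is a placeholder. The project's real config may carry additional options.

```typescript
// Sketch of the object assigned to themeConfig.typesense in docusaurus.config.ts.
// Values are assumptions based on the local setup described on this page.
const typesenseThemeConfig = {
  // Must match "index_name" in the scraper config.
  typesenseCollectionName: "rhino-docs",
  typesenseServerConfig: {
    nodes: [{ host: "localhost", port: 8108, protocol: "http" }],
    // Placeholder: paste the search-only key here, never the admin key.
    apiKey: "<search-only-key>",
  },
  // Extra search parameters; the swizzled SearchBar adds filter_by at query time.
  typesenseSearchParameters: {},
};
```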


Scraper Configuration

The scraper config lives at typesense/config.json:

{
  "index_name": "rhino-docs",
  "start_urls": ["http://host.docker.internal:3000"],
  "selectors": { ... },
  "custom_settings": {
    "field_definitions": [
      {"name": ".*_tag", "type": "string", "facet": true}
    ]
  }
}

The .*_tag pattern matches every field whose name ends in _tag (including client_tag and site_tag) and marks it as a facet, which is what allows those fields to be used in filter_by.
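To check which field names the pattern would cover, a quick local test can help. This sketch assumes the pattern is applied to the whole field name; the actual matching is done by the Typesense server, and isFacetedTagField is a hypothetical helper.

```typescript
// Assumption: the field-name pattern must match the entire name.
const tagFieldPattern = new RegExp("^.*_tag$");

/** Returns true when a field name would fall under the ".*_tag" definition. */
function isFacetedTagField(fieldName: string): boolean {
  return tagFieldPattern.test(fieldName);
}
```

Under that assumption, client_tag and site_tag match, while names such as content or tags do not.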


Running the Scraper

After building the site:

npm run build
npm run serve & # Serve the build locally on port 3000
./typesense/scrape.sh # Run the scraper against the local server

Or use Docker Compose mode:

./typesense/scrape.sh --compose

SearchBar Component

The search bar is swizzled at src/theme/SearchBar/index.tsx. Key features:

  • Client dropdown: Filters search by client_tag
  • Site dropdown: Filters search by site_tag (populated based on selected client)
  • Auto-detection: Automatically selects the current client/site based on URL path
  • Persistence: Selections are stored in sessionStorage
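Auto-detection and persistence could work roughly as follows. This is a sketch under stated assumptions, not the swizzled component's code: it assumes client pages live under a path shaped like /clients/<client>/<site>/..., and the names detectSelection, saveSelection, loadSelection, and the storage key are hypothetical. In the browser, window.sessionStorage satisfies the KeyValueStore interface.

```typescript
interface Selection {
  client?: string;
  site?: string;
}

// Minimal subset of the Web Storage API, so the logic can run outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "search-selection"; // hypothetical key name

/** Derive client/site from a URL path, assuming /clients/<client>/<site>/... */
function detectSelection(pathname: string, root = "clients"): Selection {
  const segments = pathname.split("/").filter(Boolean);
  const rootIndex = segments.indexOf(root);
  const selection: Selection = {};
  if (rootIndex === -1) return selection;
  const [client, site] = segments.slice(rootIndex + 1, rootIndex + 3);
  if (client) selection.client = client;
  if (site) selection.site = site;
  return selection;
}

/** Persist the current selection for the lifetime of the tab. */
function saveSelection(store: KeyValueStore, selection: Selection): void {
  store.setItem(STORAGE_KEY, JSON.stringify(selection));
}

/** Restore a previously saved selection, or an empty one if none exists. */
function loadSelection(store: KeyValueStore): Selection {
  const raw = store.getItem(STORAGE_KEY);
  return raw ? (JSON.parse(raw) as Selection) : {};
}
```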

Client Registry

The clientRegistry array in SearchBar/index.tsx must be kept in sync with the clients/ directory. It maps client slugs to their sites for the dropdown UI.
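The registry's real shape is not shown here, so the following is only a plausible sketch: the ClientEntry fields, the example entries, and the helpers sitesForClient and missingFromRegistry are all hypothetical. The second helper illustrates one way to catch drift between the registry and the clients/ directory.

```typescript
interface SiteEntry {
  slug: string;
  label: string;
}

interface ClientEntry {
  slug: string;
  label: string;
  sites: SiteEntry[];
}

// Example data only; real entries must mirror the clients/ directory.
const clientRegistry: ClientEntry[] = [
  {
    slug: "example-client",
    label: "Example Client",
    sites: [
      { slug: "site-a", label: "Site A" },
      { slug: "site-b", label: "Site B" },
    ],
  },
];

/** Sites to show in the site dropdown for the selected client. */
function sitesForClient(registry: ClientEntry[], clientSlug: string): SiteEntry[] {
  const entry = registry.find((client) => client.slug === clientSlug);
  return entry ? entry.sites : [];
}

/** Client directory names that have no matching registry entry. */
function missingFromRegistry(registry: ClientEntry[], clientDirs: string[]): string[] {
  const known = new Set(registry.map((client) => client.slug));
  return clientDirs.filter((dir) => !known.has(dir));
}
```

Running missingFromRegistry against a listing of clients/ (for example in a build script) would flag a newly added client before it goes missing from the dropdown.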


Troubleshooting

| Issue | Solution |
|---|---|
| Search returns no results | Check if scraper has been run after latest build |
| Facets not filtering | Verify meta tags exist in built HTML (docsearch:client_tag) |
| Typesense connection refused | Verify Docker container is running: docker ps |
| Scraper fails to connect | Ensure site is being served and accessible from Docker |
| New client not in dropdown | Add to clientRegistry in SearchBar/index.tsx |