Pinecone Vector Search
The platform vectorizes all managed site data into Pinecone, enabling semantic search and AI-powered queries across the entire site portfolio.
What's Indexed
Each managed site is represented as a vector containing:
| Data | Example |
|---|---|
| Site name | "ScentLok Retail" |
| URL | "https://scentlok.com" |
| Platform | "WordPress" |
| Server | "WP Engine" |
| Organization | "Nexus Outdoors" |
| Service tier | "Premium" |
| E-commerce flag | Yes/No |
| SSL status | Valid/Invalid |
| Google indexed | Yes/No |
| Debug mode | On/Off |
| Plugin count | 34 |
This data is combined into a text summary and converted to a 768-dimensional vector using Google's Gemini embedding model.
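As a rough illustration of the summary step, the sketch below joins the fields from the table into a single line of text. The field names, the sentence layout, and the `build_site_summary` helper are assumptions made for this example; the platform's actual template is not documented here, and the embedding call itself is left out.

```python
from dataclasses import dataclass


@dataclass
class Site:
    # Field names are illustrative; they mirror the rows of the table above.
    name: str
    url: str
    platform: str
    server: str
    organization: str
    service_tier: str
    ecommerce: bool
    ssl_valid: bool
    google_indexed: bool
    debug_mode: bool
    plugin_count: int


def build_site_summary(site: Site) -> str:
    """Combine a site's fields into one text string ready for embedding."""
    parts = [
        f"Site: {site.name}",
        f"URL: {site.url}",
        f"Platform: {site.platform}",
        f"Server: {site.server}",
        f"Organization: {site.organization}",
        f"Service tier: {site.service_tier}",
        f"E-commerce: {'Yes' if site.ecommerce else 'No'}",
        f"SSL: {'Valid' if site.ssl_valid else 'Invalid'}",
        f"Google indexed: {'Yes' if site.google_indexed else 'No'}",
        f"Debug mode: {'On' if site.debug_mode else 'Off'}",
        f"Plugins: {site.plugin_count}",
    ]
    return ". ".join(parts)


# Sample values only; the boolean flags here are made up for the example.
example = Site("ScentLok Retail", "https://scentlok.com", "WordPress", "WP Engine",
               "Nexus Outdoors", "Premium", True, True, True, False, 34)
summary = build_site_summary(example)
```

Keeping the summary as labelled key/value phrases is one plausible choice: it gives the embedding model explicit context for short values such as "Yes" or "34".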
Auto-Sync Behavior
When Does It Sync?
Vectors update automatically whenever a site's data changes:
- Cron cycle: every 5 minutes, `site-updates.php` processes 15 sites
- Site updated: the `suma_site_updated` hook fires
- Pinecone auto-sync: captures the hook and re-vectorizes the site
- Vector upserted: the updated vector replaces the old one in Pinecone
Timing
- After a site sync: vector updates within seconds
- Full portfolio sync: 4–6 hours (all sites through the queue)
- A site's vector is always as fresh as its last sync
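The queue figures in this section allow a back-of-envelope estimate of how long a full pass takes. The sketch assumes the cron batch (15 sites every 5 minutes) is the only limit on throughput; the `full_cycle_hours` helper and the site counts are hypothetical.

```python
import math


def full_cycle_hours(site_count: int, batch_size: int = 15, interval_minutes: int = 5) -> float:
    """Estimate the hours needed for every site to pass through the queue once."""
    batches = math.ceil(site_count / batch_size)
    return batches * interval_minutes / 60


# At 15 sites per 5 minutes the queue handles 180 sites per hour.
hours_for_720 = full_cycle_hours(720)    # 48 batches
hours_for_1080 = full_cycle_hours(1080)  # 72 batches
```

Under these assumptions, a full pass of 4 to 6 hours corresponds to roughly 720 to 1,080 sites. The real figure depends on the actual portfolio size and on how the queue behaves.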
Manual Sync
Sync All Sites
Triggers re-vectorization of every active site:
- Navigate to Suma Management → Pinecone Settings
- Click Sync All
- Wait for completion (processes sequentially — may take several minutes)
- Completion message shows count of synced vectors
Sync All calls two external APIs for every site (Gemini for embedding generation, Pinecone for the upsert). With 50+ sites, expect the run to take 5–10 minutes.
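A sequential full sync can be pictured as the loop below. This is a minimal sketch, not the plugin's code: `embed` and `upsert` are hypothetical callables standing in for the Gemini embedding request and the Pinecone upsert, and the `sync_all` name and its return shape are assumptions.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def sync_all(
    sites: Iterable[Dict[str, str]],
    embed: Callable[[str], List[float]],
    upsert: Callable[[str, List[float]], None],
) -> Tuple[int, List[str]]:
    """Re-vectorize sites one at a time; return (synced count, failed site ids)."""
    synced = 0
    failed: List[str] = []
    for site in sites:
        try:
            vector = embed(site["summary"])  # embedding generation for this site
            upsert(site["id"], vector)       # replaces any existing vector with this id
            synced += 1
        except Exception:
            # Keep going so that one bad site does not abort the whole run.
            failed.append(site["id"])
    return synced, failed


# Stand-in callables so the sketch runs without network access.
store: Dict[str, List[float]] = {}


def fake_embed(text: str) -> List[float]:
    if not text:
        raise ValueError("empty summary")
    return [float(len(text))] * 3


def fake_upsert(site_id: str, vector: List[float]) -> None:
    store[site_id] = vector


result = sync_all(
    [{"id": "site-1", "summary": "WordPress on WP Engine"},
     {"id": "site-2", "summary": ""}],
    fake_embed,
    fake_upsert,
)
```

Because each site waits for the previous one to finish, the total time grows linearly with the number of sites.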
Sync Single Site
Re-vectorize one specific site:
- Navigate to the site's detail page
- Click the Pinecone Sync button
- Immediate feedback on success/failure
Viewing Stats
The Pinecone Settings page displays:
| Stat | Description |
|---|---|
| Total Vectors | Number of site vectors in the namespace |
| Namespace | manage-rhinogroup |
| Dimensions | 768 |
| Connection | Whether Pinecone is reachable |
Testing Connection
- Navigate to Pinecone Settings
- Click Test Connection
- The system verifies:
- API key is valid
- Index is accessible
- Namespace exists
- Can read stats
Success: Shows vector count and index info.
Failure: Shows error message (invalid key, network issue, etc.)
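One way to picture the test is as an ordered list of checks that stops at the first failure. The sketch below is illustrative only: the check functions are stubs, and whether the real test short-circuits like this is an assumption.

```python
from typing import Callable, List, Tuple

Check = Tuple[str, Callable[[], bool]]


def run_connection_test(checks: List[Check]) -> Tuple[bool, str]:
    """Run checks in order and stop at the first one that fails or raises."""
    for label, check in checks:
        try:
            ok = check()
        except Exception as exc:
            return False, f"{label}: {exc}"
        if not ok:
            return False, f"{label}: failed"
    return True, "all checks passed"


# Stub checks named after the four items listed above.
outcome = run_connection_test([
    ("API key is valid", lambda: True),
    ("Index is accessible", lambda: True),
    ("Namespace exists", lambda: False),
    ("Can read stats", lambda: True),
])
```

Stopping at the first failure keeps the error message specific. A missing namespace, for example, is reported as such rather than as a generic stats error.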
Understanding Embeddings
What Is a Vector Embedding?
A vector embedding is a numerical representation of text that captures its meaning. Similar concepts have similar vectors, enabling "semantic search" — finding things by meaning rather than exact keywords.
How It Works Here
Site Data (text) → Gemini Embedding API → 768 numbers → Pinecone Storage
↓
Query (text) → Gemini Embedding API → 768 numbers → Similarity Search
↓
Most Similar Sites
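The similarity step can be illustrated with toy vectors. The sketch below uses 3 dimensions instead of 768 and ranks by cosine similarity. The vectors and site ids are invented, and cosine is an assumed metric; the index may be configured with a different one.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(
    query: List[float], vectors: Dict[str, List[float]], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Return the top_k (site id, score) pairs, highest score first."""
    scored = [(site_id, cosine_similarity(query, vec)) for site_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional "embeddings"; real ones would come from the embedding API.
toy_vectors = {
    "hunting-store": [0.9, 0.8, 0.1],
    "hunting-gear-shop": [0.8, 0.9, 0.2],
    "documentation-site": [0.1, 0.0, 0.9],
}
query_vector = [1.0, 1.0, 0.0]
ranked = most_similar(query_vector, toy_vectors)
```

The two store vectors point in nearly the same direction as the query, so they rank first. The documentation vector points elsewhere and drops out of the top two.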
Practical Example
Searching for "hunting apparel with e-commerce" would find:
- ScentLok (hunting clothing, WooCommerce)
- Blocker Outdoors (hunting gear, BigCommerce)
- Not: Rhino Group Documentation (no e-commerce, no hunting)
Metadata & Filtering
Each vector stores metadata that enables filtered searches:
| Filter | Use Case |
|---|---|
| `platform = "WordPress"` | Only search WordPress sites |
| `ecommerce = true` | Only e-commerce sites |
| `organization = "GSM Outdoors"` | Only one client's sites |
| `ssl = false` | Find sites with SSL issues |
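Pinecone expresses metadata filters with MongoDB-style operators such as `$eq`. The sketch below writes one of the filters from the table in that shape and applies it locally to sample records, to show the matching logic. The metadata key names are taken from the table and may not match the stored keys exactly; the `matches` helper and the sample sites are hypothetical.

```python
from typing import Any, Dict, List


def matches(metadata: Dict[str, Any], flt: Dict[str, Dict[str, Any]]) -> bool:
    """Local stand-in for `$eq` filtering: every listed field must equal its value."""
    return all(metadata.get(field) == cond["$eq"] for field, cond in flt.items())


# Invented sample metadata records.
sites: List[Dict[str, Any]] = [
    {"name": "Shop A", "platform": "WordPress", "ecommerce": True},
    {"name": "Shop B", "platform": "BigCommerce", "ecommerce": True},
    {"name": "Docs C", "platform": "WordPress", "ecommerce": False},
]

# Multiple fields in one filter are combined with AND.
wordpress_shops = {"platform": {"$eq": "WordPress"}, "ecommerce": {"$eq": True}}
hits = [site["name"] for site in sites if matches(site, wordpress_shops)]
```

In a real query the filter dictionary would be sent with the query vector, so the similarity search only considers vectors whose metadata matches.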
Who Uses This Data?
AI Assistant (Burt)
The osTicket AI assistant queries Pinecone to find relevant site context when answering questions about managed sites.
Future Search Features
The vector index enables planned features like:
- Natural language site search ("find BigCommerce sites on WP Engine")
- AI-powered recommendations
- Automated site categorization
Settings
| Setting | Default | Description |
|---|---|---|
| API Key | — | Pinecone API key from dashboard |
| Host | — | Index host URL |
| Namespace | manage-rhinogroup | Isolation namespace |
| Auto-Sync | Enabled | Sync on every site update |
| Gemini Key | — | For embedding generation (falls back to Suma Gemini key) |
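The Gemini key fallback in the last row can be sketched as follows. The function name and the way the two keys are passed in are assumptions; only the fallback order comes from the table.

```python
from typing import Optional


def resolve_gemini_key(pinecone_gemini_key: Optional[str], suma_gemini_key: Optional[str]) -> str:
    """Prefer the key from the Pinecone settings; fall back to the Suma Gemini key."""
    for candidate in (pinecone_gemini_key, suma_gemini_key):
        if candidate and candidate.strip():
            return candidate.strip()
    raise ValueError("No Gemini API key configured")


# Placeholder key strings, not real credentials.
key_from_fallback = resolve_gemini_key("", "suma-key-placeholder")
key_from_setting = resolve_gemini_key("pinecone-page-key", "suma-key-placeholder")
```

Treating blank strings as "not set" avoids sending an empty key to the embedding API when the field was saved empty.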
Troubleshooting
"Connection Failed" on Test
- Verify API key is correct (copy from Pinecone dashboard)
- Check Host URL includes full domain (e.g., `rhino-tickets-xxx.svc.aped-xxx.pinecone.io`)
- Ensure network allows outbound HTTPS to Pinecone
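A quick sanity check on the Host value can catch a truncated entry before it reaches the network. The sketch below is a hypothetical helper, not part of the plugin. It assumes index hosts end in `.pinecone.io`, as the example above does, and accepts the value with or without a scheme.

```python
from urllib.parse import urlparse


def normalize_host(value: str) -> str:
    """Return the bare hostname, or raise if it does not look like a full index host."""
    value = value.strip()
    if "://" not in value:
        value = "https://" + value  # urlparse needs a scheme to find the hostname
    host = urlparse(value).hostname or ""
    if not host.endswith(".pinecone.io"):
        raise ValueError(f"Host does not look like a full Pinecone domain: {host!r}")
    return host


full_host = normalize_host("https://rhino-tickets-xxx.svc.aped-xxx.pinecone.io/")
```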
Vectors Not Updating
- Check Auto-Sync is enabled in settings
- Verify the `suma_site_updated` hook is firing (check debug.log)
- Try manual sync for one site to isolate the issue
- Check Gemini API key hasn't expired
Vector Count Doesn't Match Site Count
- Archived sites have their vectors deleted
- New sites need one sync cycle before vectorization
- Development sites are included (toggle in settings)
- Check for failed syncs in debug.log
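When the counts disagree, comparing the two sets of ids shows which side is off. The sketch below assumes each vector id maps one-to-one to a site id, which is not confirmed here; the helper name and the sample ids are hypothetical.

```python
from typing import Dict, List, Set


def diff_ids(site_ids: Set[str], vector_ids: Set[str]) -> Dict[str, List[str]]:
    """Report sites without a vector and vectors without an active site."""
    return {
        "missing_vectors": sorted(site_ids - vector_ids),
        "orphan_vectors": sorted(vector_ids - site_ids),
    }


# Invented ids: site-3 has not been vectorized yet, site-9 is no longer active.
report = diff_ids({"site-1", "site-2", "site-3"}, {"site-1", "site-2", "site-9"})
```

Sites listed under `missing_vectors` are candidates for a manual single-site sync. Entries under `orphan_vectors` may belong to archived sites whose vectors were not yet deleted.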