Pinecone Vector Integration

The Suma Management plugin vectorizes site data into Pinecone for semantic search and AI-powered queries across all managed sites.


Configuration

| Setting | Value |
|---|---|
| Index | rhino-tickets |
| Namespace | manage-rhinogroup |
| Embedding Model | gemini-embedding-001 (Google) |
| Dimensions | 768 |
| Cloud | AWS |
| Region | us-east-1 |

Architecture

Site Update Completes
        │
        ▼
suma_site_updated hook fires
        │
        ▼
Pinecone_Sync::on_site_updated($site_id)
        │
        ├── 1. Load site data (Site model)
        ├── 2. Build text summary (name, URL, platform, plugins, etc.)
        ├── 3. Generate embedding via Gemini API
        ├── 4. Upsert vector to Pinecone (namespace: manage-rhinogroup)
        └── 5. Log sync result
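
The five steps above can be sketched in a language-neutral way. The Python below is only an illustration of the ordering and error handling, not the plugin's PHP implementation: `sync_site`, `SyncResult`, and the four injected callables are hypothetical names, and the `site_<id>` vector ID format is assumed from the example response shown under "Sync Single Site".

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SyncResult:
    site_id: int
    ok: bool
    detail: str


def sync_site(
    site_id: int,
    load_site: Callable[[int], dict],
    build_text: Callable[[dict], str],
    embed: Callable[[str], Sequence[float]],
    upsert: Callable[[str, Sequence[float], dict], None],
) -> SyncResult:
    """Run the sync steps in order; every callable is a hypothetical stand-in."""
    try:
        site = load_site(site_id)                  # 1. load site data
        text = build_text(site)                    # 2. build text summary
        vector = embed(text)                       # 3. generate embedding
        upsert(f"site_{site_id}", vector, site)    # 4. upsert vector + metadata
        result = SyncResult(site_id, True, "synced")
    except Exception as exc:
        # Any failing step aborts the sync for this site only.
        result = SyncResult(site_id, False, str(exc))
    # 5. log the outcome (stdout here; the plugin writes to its own log)
    print(f"[pinecone-sync] site={result.site_id} ok={result.ok} {result.detail}")
    return result
```

Passing the steps in as callables keeps the sketch runnable without network access and makes each step easy to stub in tests.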

Auto-Sync Flow

When a site update completes (via cron or manual sync), the suma_site_updated hook fires:

// Hook registration (in suma-management.php)
add_action('suma_site_updated', [Pinecone_Sync::class, 'on_site_updated']);

class Pinecone_Sync {
    /**
     * Handles automatic vectorization when a site is updated.
     *
     * @param int $site_id The ID of the updated site.
     * @return void
     */
    public static function on_site_updated(int $site_id): void {
        // Check if auto-sync is enabled
        if (get_option('suma_pinecone_auto_sync') !== '1') {
            return;
        }

        $sync = new self();
        $sync->sync_site($site_id);
    }
}

Text Summary Generation

Each site is converted to a text summary for embedding:

private function build_site_text(Site $site): string {
    $parts = [
        "Site: {$site->name}",
        "URL: {$site->url}",
        "Platform: {$site->platforms}",
        "Server: {$site->server}",
        "Organization: {$site->organization}",
        "Tier: {$site->tier}",
        "E-commerce: " . ($site->ecommerce ? 'Yes' : 'No'),
        "Plugins: {$site->plugin_count}",
        "SSL: " . ($site->ssl ? 'Valid' : 'Invalid/Missing'),
        "Indexed: " . ($site->indexed ? 'Yes' : 'No'),
        "Debug: " . ($site->debug ? 'Enabled' : 'Disabled'),
    ];

    return implode("\n", $parts);
}

Embedding Generation

Embeddings are generated via the Google Gemini API:

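private function generate_embedding(string $text): array {
    $api_key = get_option('suma_pinecone_gemini_key')
        ?: get_option('suma-gemini_api_key');

    $response = wp_remote_post(
        "https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent?key={$api_key}",
        [
            'headers' => ['Content-Type' => 'application/json'],
            'body'    => wp_json_encode([
                'model'   => 'models/gemini-embedding-001',
                'content' => ['parts' => [['text' => $text]]],
                // gemini-embedding-001 returns 3072 dimensions by default;
                // request 768 explicitly to match the index.
                'outputDimensionality' => 768,
            ]),
            'timeout' => 30,
        ]
    );

    // Surface transport errors instead of decoding an invalid response.
    if (is_wp_error($response)) {
        throw new \RuntimeException($response->get_error_message());
    }

    $body = json_decode(wp_remote_retrieve_body($response), true);
    if (!isset($body['embedding']['values'])) {
        throw new \RuntimeException('Gemini response did not contain an embedding.');
    }

    return $body['embedding']['values']; // 768-dimensional float array
}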
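
An external system that queries the index needs a query vector produced by the same model and with the same dimension. The following is a minimal Python sketch of the same `embedContent` call using only the standard library. The names `build_embed_payload` and `embed_query` are hypothetical, and requesting 768 dimensions through `outputDimensionality` is an assumption made here so the result matches the index configuration above.

```python
import json
import urllib.request

GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-embedding-001:embedContent"
)


def build_embed_payload(text: str, dimensions: int = 768) -> dict:
    """Build the embedContent request body for a single text."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "model": "models/gemini-embedding-001",
        "content": {"parts": [{"text": text}]},
        # Assumption: ask for the index's dimension explicitly.
        "outputDimensionality": dimensions,
    }


def embed_query(text: str, api_key: str, timeout: float = 30.0) -> list[float]:
    """POST the payload and return the embedding values from the response."""
    payload = json.dumps(build_embed_payload(text)).encode("utf-8")
    request = urllib.request.Request(
        f"{GEMINI_URL}?key={api_key}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["embedding"]["values"]
```

Keeping the payload builder separate from the network call makes it possible to verify the request shape without an API key.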

Vector Metadata

Each vector is stored with metadata for filtered queries:

| Field | Type | Example |
|---|---|---|
| site_id | int | 42 |
| name | string | "ScentLok Retail" |
| url | string | "https://scentlok.com" |
| platform | string | "WordPress" |
| server | string | "WPEngine" |
| organization | string | "Nexus Outdoors" |
| tier | string | "Premium" |
| ecommerce | boolean | true |
| ssl | boolean | true |
| indexed | boolean | true |
| debug | boolean | false |
| plugin_count | int | 34 |
| archived | boolean | false |
| last_update | string | "2025-05-04T08:00:00Z" |
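
To illustrate what a filtered query could look like, here is a small Python sketch that builds a Pinecone metadata filter from the fields listed above. `build_site_filter` is a hypothetical helper, not part of the plugin; it relies only on Pinecone's `$eq` filter operator and on top-level conditions being combined with AND.

```python
from typing import Optional


def build_site_filter(
    platform: Optional[str] = None,
    ecommerce: Optional[bool] = None,
    include_archived: bool = False,
) -> dict:
    """Build a metadata filter dict; omitted arguments add no condition."""
    conditions: dict = {}
    if platform is not None:
        conditions["platform"] = {"$eq": platform}
    if ecommerce is not None:
        conditions["ecommerce"] = {"$eq": ecommerce}
    if not include_archived:
        # Hide archived sites unless the caller asks for them.
        conditions["archived"] = {"$eq": False}
    return conditions
```

The resulting dict would be passed as the `filter` argument of `index.query(...)`, for example `filter=build_site_filter(platform="WordPress", ecommerce=True)`.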

AJAX Endpoints

Test Connection

jQuery.post(ajaxurl, {
    action: 'pinecone_test_connection',
    nonce: sumaManagement.nonce
}, function(response) {
    // response.data = { connected: true, index: 'rhino-tickets', stats: {...} }
});

Sync All Sites

Triggers full vectorization of all active sites:

jQuery.post(ajaxurl, {
    action: 'pinecone_sync_all',
    nonce: sumaManagement.nonce
}, function(response) {
    // response.data = { synced: 52, failed: 1, errors: [...] }
});

Note: Full sync processes all active sites sequentially. With 50+ sites, it can take several minutes because of the embedding API calls.

Sync Single Site

jQuery.post(ajaxurl, {
    action: 'pinecone_sync_site',
    nonce: sumaManagement.nonce,
    site_id: 42
}, function(response) {
    // response.data = { success: true, vector_id: 'site_42' }
});

Get Stats

jQuery.post(ajaxurl, {
    action: 'pinecone_get_stats',
    nonce: sumaManagement.nonce
}, function(response) {
    // response.data = { total_vectors: 52, namespace: 'manage-rhinogroup', dimension: 768 }
});

Settings (WordPress Options)

| Option | Default | Description |
|---|---|---|
| suma_pinecone_api_key | | Pinecone API key |
| suma_pinecone_host | | Index host URL (e.g., rhino-tickets-xxx.svc.aped-xxxx.pinecone.io) |
| suma_pinecone_namespace | manage-rhinogroup | Vector namespace |
| suma_pinecone_auto_sync | 1 | Auto-sync on site updates |
| suma_pinecone_gemini_key | | Gemini API key (falls back to suma-gemini key) |

Pinecone Settings Page

Located at Suma Management → Pinecone Settings in WP Admin.

Features:

  • API key and host configuration
  • Auto-sync toggle
  • Test connection button (live connectivity check)
  • Sync all button (with progress indicator)
  • Stats display (vector count, namespace info)
  • Individual site sync (from site detail page)

Querying Vectors

While the management plugin handles upserting, vector queries are typically made from other systems (e.g., the AI assistant in osTicket):

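# Example: query from an external system (Pinecone Python client v3+)
import os
from pinecone import Pinecone

# Assumption: the API key is supplied via the PINECONE_API_KEY env variable.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("rhino-tickets")

# `embedding` must be a 768-dimensional query vector produced by the same
# model used at sync time (gemini-embedding-001).
results = index.query(
    namespace="manage-rhinogroup",
    vector=embedding,
    top_k=5,
    include_metadata=True,
)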
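
Once a query returns, the consuming system usually needs to turn the matches into something readable. The sketch below assumes the response has already been converted to a plain dict with a `matches` list whose entries carry `id`, `score`, and `metadata` (the shape Pinecone query results take). `summarize_matches` and the `min_score` threshold are hypothetical.

```python
def summarize_matches(response: dict, min_score: float = 0.0) -> list[str]:
    """Return one line per match at or above min_score, in response order."""
    lines = []
    for match in response.get("matches", []):
        score = match.get("score", 0.0)
        if score < min_score:
            continue
        meta = match.get("metadata") or {}
        name = meta.get("name", "?")
        url = meta.get("url", "?")
        lines.append(f"{match['id']} {score:.3f} {name} ({url})")
    return lines
```

A score threshold is a simple way to drop weak matches before passing site context to an assistant; a suitable value depends on the index metric and data, so it is left to the caller.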

Troubleshooting

Vectors Not Updating

  1. Check suma_pinecone_auto_sync is set to 1
  2. Verify suma_site_updated hook is firing (check debug.log)
  3. Test API key with "Test Connection" button
  4. Ensure Gemini API key is valid and has quota

Embedding Failures

  1. Check Gemini API rate limits (may need delays for bulk sync)
  2. Verify text content isn't empty (site must have data)
  3. Check wp-content/debug.log for API error responses
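
One way to add the delays mentioned in step 1 is retrying with exponential backoff. The Python sketch below shows the idea in isolation; it is not the plugin's code. `embed_with_retry` is a hypothetical wrapper, and the attempt count and base delay are arbitrary starting values.

```python
import time
from typing import Callable, Sequence


def embed_with_retry(
    embed: Callable[[str], Sequence[float]],
    text: str,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Sequence[float]:
    """Call embed(text), waiting 1x, 2x, 4x... base_delay between failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return embed(text)
        except Exception:
            # A production version would retry only rate-limit errors
            # (for example HTTP 429) and re-raise everything else at once.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without actually waiting.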

Dimension Mismatch

The index expects 768-dimensional vectors. If switching embedding models, ensure the new model also produces 768 dimensions or recreate the index.
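
A cheap guard against this problem is to check the vector length before every upsert or query. The following Python sketch shows such a check; `check_dimension` and `EXPECTED_DIMENSION` are hypothetical names, with the value 768 taken from the configuration above.

```python
from typing import Sequence

EXPECTED_DIMENSION = 768  # must match the Pinecone index configuration


def check_dimension(
    vector: Sequence[float], expected: int = EXPECTED_DIMENSION
) -> Sequence[float]:
    """Return the vector unchanged, or raise if its length is wrong."""
    if len(vector) != expected:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, index expects {expected}"
        )
    return vector
```

Failing early with an explicit message is easier to diagnose than an error returned by the vector store after the request has been sent.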