AI System — Gemini & Knowledge Assistant

The ticket system uses Google's Gemini API for three distinct AI features: automatic ticket urgency analysis, an interactive knowledge assistant for staff (powered by the modular Suma\AI php-chatbot architecture), and AI note optimization for Harvest time entries.


Gemini AI Urgency Analysis

Overview

Every time a ticket is created or a new thread entry is posted, the system queues it for AI analysis. A cron job processes the queue, sends ticket data to a Gemini API endpoint, and updates custom form fields with urgency levels, estimated hours, summaries, and suggested titles.

Architecture Flow

Thread entry posted (message, response, or note)
→ Motherload signal: threadentry.created
→ plugin_ai_summary_update::run()
→ INSERT INTO ost_gemini_queue (ticket_id, processed=0)
→ cron-gemini.php runs Gemini::process_queue()
→ Gemini::prepare_ticket_data() fetches entries via:
    $ticket->getThreadEntries(['M', 'R', 'N'])
→ Includes previous urgency context + last entry poster type
→ Gemini::call_gemini_api() builds prompt locally using urgency criteria
→ Check ost_gemini_response_cache for cached result
→ POST directly to Google Gemini REST API
    (generativelanguage.googleapis.com/v1beta/models/{model}:generateContent)
→ Retry with exponential backoff + model fallback chain on failure
→ Parse JSON response (urgency_percentage, estimated_hours, summary, timeline, suggested_title)
→ Staff Reply Urgency Cap enforcement (server-side clamp)
→ Cache response in ost_gemini_response_cache (configurable TTL)
→ Results saved to ost_form_entry_values (custom fields)
→ Snapshot saved to ost_ticket_ai_snapshots (history tracking)
→ History logged to ost_ticket_ai_history
→ If CRITICAL: triggers email + SMS + browser notification
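
The "exponential backoff" step in the flow above could look roughly like the sketch below. The base delay, the cap, and the function name backoffDelayMs are illustrative assumptions; the document does not state the real values.

```php
<?php
declare(strict_types=1);

/**
 * Delay before retry number $attempt (1-based), doubling each time up to a cap.
 * The base and cap values are illustrative assumptions, not the system's settings.
 */
function backoffDelayMs(int $attempt, int $baseMs = 500, int $capMs = 8000): int
{
    if ($attempt < 1) {
        throw new InvalidArgumentException('attempt must be >= 1');
    }
    // 500, 1000, 2000, ... capped at $capMs; limiting the exponent avoids overflow
    $delay = $baseMs * (2 ** min($attempt - 1, 20));
    return (int) min($delay, $capMs);
}
```

A caller would sleep for this many milliseconds between attempts on the same model before moving on to the next model in the fallback chain.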

Direct Gemini Integration

The system calls the Google Gemini REST API directly — there is no proxy server. The shared gemini_api_key from the AI Assistant config namespace is used for authentication.

Model Fallback Chains

If the primary model fails, fallback models are tried automatically:

  • gemini-2.5-flash → gemini-2.5-flash-lite
  • gemini-2.5-pro → gemini-2.5-flash → gemini-3-flash-preview
  • gemini-3-pro-preview → gemini-3-flash-preview → gemini-2.5-pro → gemini-2.5-flash
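
A minimal sketch of how such a chain might be walked. The chain map mirrors the list above; the helper callWithFallback and the $send callable are hypothetical and not part of the documented Gemini class.

```php
<?php
declare(strict_types=1);

// Fallback chains copied from the list above (primary model => fallbacks in order).
const FALLBACK_CHAINS = [
    'gemini-2.5-flash'     => ['gemini-2.5-flash-lite'],
    'gemini-2.5-pro'       => ['gemini-2.5-flash', 'gemini-3-flash-preview'],
    'gemini-3-pro-preview' => ['gemini-3-flash-preview', 'gemini-2.5-pro', 'gemini-2.5-flash'],
];

/**
 * Hypothetical helper: try the primary model, then each fallback in order.
 * $send is an assumed callable that returns a response array, or null on failure.
 *
 * @return array{model: string, response: array}|null
 */
function callWithFallback(string $primary, callable $send): ?array
{
    $models = array_merge([$primary], FALLBACK_CHAINS[$primary] ?? []);
    foreach ($models as $model) {
        $response = $send($model);
        if ($response !== null) {
            return ['model' => $model, 'response' => $response];
        }
    }
    return null; // every model in the chain failed
}
```

Returning the model name alongside the response makes it possible to record which model actually produced the result.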

Response Cache

Responses are cached in ost_gemini_response_cache with a configurable TTL. The cache is bypassed during normal queue processing (bypass_cache=true). Expired entries are cleaned up automatically during cron runs.
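
The TTL, bypass, and cleanup behavior can be sketched with an in-memory stand-in for the table. The class name, the key scheme (SHA-256 of model plus prompt), and the method names are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/** Hypothetical in-memory stand-in for ost_gemini_response_cache. */
final class ResponseCacheSketch
{
    /** @var array<string, array{expires: int, value: array}> */
    private array $rows = [];
    private int $ttlSeconds;

    public function __construct(int $ttlSeconds)
    {
        $this->ttlSeconds = $ttlSeconds;
    }

    /** Assumed key scheme: SHA-256 of model plus prompt. */
    public static function key(string $model, string $prompt): string
    {
        return hash('sha256', $model . "\n" . $prompt);
    }

    /** Returns the cached value, or null when bypassed, missing, or expired. */
    public function get(string $key, int $now, bool $bypassCache = false): ?array
    {
        if ($bypassCache || !isset($this->rows[$key])) {
            return null;
        }
        return $this->rows[$key]['expires'] > $now ? $this->rows[$key]['value'] : null;
    }

    public function put(string $key, array $value, int $now): void
    {
        $this->rows[$key] = ['expires' => $now + $this->ttlSeconds, 'value' => $value];
    }

    /** Mirrors the cleanup done during cron runs; returns the number of rows removed. */
    public function purgeExpired(int $now): int
    {
        $before = count($this->rows);
        $this->rows = array_filter($this->rows, fn(array $r): bool => $r['expires'] > $now);
        return $before - count($this->rows);
    }
}
```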

Failure Retry Logic

If the Gemini API returns a non-200 response, the failures column on the queue entry is incremented. After 5 failed attempts, the ticket is skipped. When a new thread entry is posted (re-queuing), failures reset to 0.
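
The counter rules above can be expressed as three small functions over a queue row. This is a sketch: the row is modeled as a plain array, and marking a row processed=1 on success is an assumption inferred from the processed=0 insert.

```php
<?php
declare(strict_types=1);

const MAX_FAILURES = 5; // from the text: skipped after 5 failed attempts

/** Returns the queue row after an API attempt; only HTTP 200 counts as success. */
function applyAttemptResult(array $row, int $httpStatus): array
{
    if ($httpStatus === 200) {
        $row['processed'] = 1; // assumed success marker
    } else {
        $row['failures']++;
    }
    return $row;
}

/** A row is picked up only while unprocessed and under the failure limit. */
function isEligible(array $row): bool
{
    return $row['processed'] === 0 && $row['failures'] < MAX_FAILURES;
}

/** Re-queuing on a new thread entry resets the failure counter to 0. */
function requeue(array $row): array
{
    return ['ticket_id' => $row['ticket_id'], 'processed' => 0, 'failures' => 0];
}
```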

API Request Format

The Gemini API receives a structured payload with one object per thread entry:

{
  "subject": "Ticket Subject",
  "body": [
    {
      "poster_name": "Jane Smith",
      "poster_type": "Client",
      "date": "2026-03-13 07:10:10",
      "message": "The checkout page throws a 500 error...",
      "is_internal": false,
      "entry_label": "Original Ticket Request"
    },
    {
      "poster_name": "David Sinclair",
      "poster_type": "Staff",
      "date": "2026-03-13 08:38:07",
      "message": "I've identified the issue...",
      "is_internal": false,
      "entry_label": "Reply #2"
    }
  ],
  "bypass_cache": true,
  "ticket_id": 8826,
  "customer_type": "small",
  "reported_at": "2026-03-13 07:10:10 AM CDT"
}
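
As a rough illustration, a payload of this shape could be assembled as follows. The function buildPayload is hypothetical, and the labeling rule (first entry versus "Reply #N") is only inferred from the sample above; internal notes may well be labeled differently in the real prepare_ticket_data().

```php
<?php
declare(strict_types=1);

/**
 * Builds the request payload from a list of thread entries.
 * The entry_label rule is an assumption inferred from the sample payload.
 */
function buildPayload(int $ticketId, string $subject, array $entries, string $customerType, string $reportedAt): array
{
    $body = [];
    foreach (array_values($entries) as $i => $e) {
        $body[] = [
            'poster_name' => $e['poster_name'],
            'poster_type' => $e['poster_type'],
            'date'        => $e['date'],
            'message'     => $e['message'],
            'is_internal' => $e['is_internal'] ?? false,
            'entry_label' => $i === 0 ? 'Original Ticket Request' : 'Reply #' . ($i + 1),
        ];
    }
    return [
        'subject'       => $subject,
        'body'          => $body,
        'bypass_cache'  => true, // queue processing bypasses the cache
        'ticket_id'     => $ticketId,
        'customer_type' => $customerType,
        'reported_at'   => $reportedAt,
    ];
}
```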

AI Fields Updated

Field | Values | Purpose
ai-urgency | LOW / MEDIUM / HIGH / CRITICAL | Urgency classification
ai-urgency-percentage | 0–100 | Numeric urgency score
ai-estimated-hours | Decimal | Estimated effort
ticket-summary | HTML | Ticket summary
ai-timeline | HTML | Event timeline
ai-suggested-title | Text | Better title suggestion
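
A sketch of how the parsed JSON response might be mapped onto these fields. The function name is hypothetical, and the level thresholds are placeholders: the document does not say how the percentage is turned into LOW / MEDIUM / HIGH / CRITICAL.

```php
<?php
declare(strict_types=1);

/**
 * Maps a Gemini JSON result onto the custom field names in the table above.
 * The level thresholds are placeholders; the real cut-offs are not documented here.
 */
function mapResponseToFields(string $json, array $thresholds = ['CRITICAL' => 90, 'HIGH' => 70, 'MEDIUM' => 40]): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    $pct = max(0, min(100, (int) ($data['urgency_percentage'] ?? 0))); // clamp to 0..100

    $level = 'LOW';
    foreach ($thresholds as $name => $min) { // ordered from highest to lowest
        if ($pct >= $min) {
            $level = $name;
            break;
        }
    }

    return [
        'ai-urgency'            => $level,
        'ai-urgency-percentage' => $pct,
        'ai-estimated-hours'    => (float) ($data['estimated_hours'] ?? 0),
        'ticket-summary'        => (string) ($data['summary'] ?? ''),
        'ai-timeline'           => (string) ($data['timeline'] ?? ''),
        'ai-suggested-title'    => (string) ($data['suggested_title'] ?? ''),
    ];
}
```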

Critical Alert Deduplication

Critical urgency emails and SMS messages are sent only once per ticket, tracked via ost_ticket_ai_history:

// Check before sending
Gemini::has_critical_alert_been_sent($ticketId); // returns bool

// Record after sending
Gemini::record_critical_alert_sent($ticketId);
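
Put together, the check-then-record pattern might look like the sketch below. AlertLedgerSketch is an in-memory stand-in for the two Gemini:: methods, and $notify is a hypothetical sender callable; neither is the real implementation.

```php
<?php
declare(strict_types=1);

/** In-memory stand-in for the two Gemini:: methods (the real ones use ost_ticket_ai_history). */
final class AlertLedgerSketch
{
    private array $sent = [];

    public function hasBeenSent(int $ticketId): bool
    {
        return isset($this->sent[$ticketId]);
    }

    public function recordSent(int $ticketId): void
    {
        $this->sent[$ticketId] = true;
    }
}

/** Sends at most one critical alert per ticket; $notify is a hypothetical sender callable. */
function maybeSendCriticalAlert(AlertLedgerSketch $ledger, int $ticketId, string $urgency, callable $notify): bool
{
    if ($urgency !== 'CRITICAL' || $ledger->hasBeenSent($ticketId)) {
        return false;
    }
    $notify($ticketId);
    $ledger->recordSent($ticketId); // record only after the send call returned
    return true;
}
```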

API Logging

All Gemini requests are logged to var/logs/api/gemini/gemini-YYYY-MM-DD.log via the ApiLogger class. Sensitive headers are automatically redacted.
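
The redaction and daily file naming could be sketched as below. Both helpers are hypothetical and do not reflect the actual ApiLogger interface; the list of sensitive header names is an assumption (x-goog-api-key is the header Google's API accepts for keys).

```php
<?php
declare(strict_types=1);

/** Replaces values of sensitive headers before logging. The header list is an assumption. */
function redactHeaders(array $headers, array $sensitive = ['authorization', 'x-goog-api-key', 'api-key']): array
{
    $out = [];
    foreach ($headers as $name => $value) {
        // Header names are case-insensitive, so compare in lower case
        $out[$name] = in_array(strtolower((string) $name), $sensitive, true) ? '[REDACTED]' : $value;
    }
    return $out;
}

/** Daily log path in the format described above. */
function geminiLogPath(DateTimeInterface $when): string
{
    return 'var/logs/api/gemini/gemini-' . $when->format('Y-m-d') . '.log';
}
```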

Dedicated Gemini Queue Logging

The GeminiLogger class (include/class.gemini-logger.php) provides a separate file-based logger specifically for the queue processing pipeline:

  • Log location: var/logs/gemini/gemini-queue-YYYY-MM-DD.log
  • Daily rotation with 5MB max file size and 30-day retention
  • Logged events:
    • Cron throttle (10-minute gate) — why processing was skipped
    • Queue state snapshots — total/pending/processed/failed counts
    • Per-ticket processing — start, success (with urgency results), or failure (with error + failure count)
    • API calls — model used, HTTP response code, duration
    • Fallback chain — each step logged when primary model fails

Cron Configuration

// cron-gemini.php
// urgency_max_tickets_per_run: configurable via Settings (default 2)

# Run manually (Windows dev)
.\run-cron-dev.ps1

# Or directly
php cron-gemini.php

The cron should run every minute in production via api/cron.php.


Staff Reply Urgency Cap

When a staff member posts a reply or note, the AI re-analyzes the ticket. To prevent the AI from artificially escalating urgency after an internal response, a server-side urgency cap is enforced.

How It Works

  1. prepare_ticket_data() includes previous_urgency_percentage, previous_urgency_level, last_entry_poster_type, and last_entry_poster_name in the data sent to Gemini
  2. The urgency criteria prompt includes a "STAFF REPLY RULE" section instructing Gemini that staff replies must never increase urgency
  3. Server-side enforcement in process_ticket(): if the last thread entry was posted by Staff and Gemini returns a higher urgency than previously stored, the result is hard-clamped to the previous percentage
  4. This ensures no AI hallucination can override the rule, regardless of what the model returns

Data Flow

Staff posts reply
→ Queued for Gemini analysis
→ prepare_ticket_data() includes previous_urgency + last_entry_poster
→ Gemini receives "STAFF REPLY RULE" in criteria
→ Response parsed → server-side cap applied
→ Final urgency = min(gemini_result, previous_urgency) if last poster is Staff
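
The clamp in the last step is small enough to show in full. This is a sketch: the function name is hypothetical, and leaving the result untouched when no previous value exists is an assumption about the first-analysis case.

```php
<?php
declare(strict_types=1);

/**
 * Server-side cap: when the last poster is Staff, urgency may not rise above the stored value.
 * A null previous value (first analysis) leaves the model's result untouched; that case is an assumption.
 */
function applyStaffReplyCap(int $geminiPct, ?int $previousPct, string $lastPosterType): int
{
    if ($lastPosterType === 'Staff' && $previousPct !== null) {
        return min($geminiPct, $previousPct);
    }
    return $geminiPct;
}
```

Because the clamp runs after the response is parsed, it holds even when the model ignores the "STAFF REPLY RULE" in the prompt.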

AI Analysis Snapshots

Every time Gemini processes a ticket, a complete snapshot of all 6 AI fields is stored for historical tracking.

Database Table

ost_ticket_ai_snapshots — stores one row per analysis run:

Column | Description
id | Auto-increment primary key
ticket_id | Foreign key to ticket
urgency | LOW / MEDIUM / HIGH / CRITICAL
urgency_percentage | 0–100
estimated_hours | Decimal
summary | HTML summary text
timeline | HTML timeline
suggested_title | Text
model_used | Gemini model that produced the result
created | Timestamp of snapshot

Staff History Page

scp/ai-history.php — Paginated, searchable table of all AI snapshots:

  • Filter by ticket number
  • Filter by urgency level
  • Click any row to expand and view full summary/timeline
  • Accessible to all staff via footer "AI History" link

Recording Snapshots

// Called at the end of every update_fields_directly() invocation
Gemini::insert_ai_snapshot($ticketId, $model, $fields);

Migration

-- deploy/deploy-ai-snapshots.sql

AI Knowledge Assistant ("the Rhino")

Overview

The AI Knowledge Assistant (internally called "the Rhino") is a conversational chatbot available to staff that can answer questions about tickets, clients, and historical work. It uses Pinecone vector search for context retrieval and Gemini for response generation.

As of v1.0.1136, the assistant is built on the php-chatbot architecture (rumenx/php-chatbot v1.1.0) with a modular Suma\AI namespace providing clean separation of concerns.

Architecture (php-chatbot)

Staff asks question in chat widget
→ SSE POST /scp/ajax.php/ai-assistant/chat
→ ChatService::forStaff($staffId, $conversationId) factory
→ ContextService detects org analysis intent (or normal RAG)
  → Normal path:
    → Embed question via Gemini gemini-embedding-001 (~200ms)
    → Query Pinecone default namespace with metadata source filter (~40ms)
    → Tickets: Fetch sanitized_text from ost_ticket_embeddings + ORM metadata
    → Sites/Docs/Wiki: Build text blocks from Pinecone metadata
    → Direct ticket lookup: extractDirectTicketLookups() finds #NNNNN patterns
    → Cited ticket carry-forward: re-fetches previously cited tickets
  → Org analysis path:
    → Resolve org by name or current ticket context
    → Fetch up to 500 ticket summaries from MySQL (bypasses Pinecone)
    → Specialized prompt for theme categorization
→ PromptBuilder assembles system instructions + context + conversation history
→ OsTicketGeminiModel streams response via CURLOPT_WRITEFUNCTION SSE
→ Tokens appear in real-time in the chat UI
→ After stream: extract cited tickets, store message via OsTicketConversationStorage
→ Auto-title conversation if first message
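
The direct ticket lookup step in the flow above amounts to a pattern match on the message text. A minimal sketch, assuming references look like "#12345"; the accepted digit count and the function name are assumptions, and the real extractDirectTicketLookups() may differ.

```php
<?php
declare(strict_types=1);

/**
 * Finds "#12345"-style references in a chat message.
 * The accepted digit count (4 to 7) is an assumption.
 *
 * @return string[] unique ticket numbers in order of first appearance
 */
function extractTicketNumbersSketch(string $message): array
{
    preg_match_all('/#(\d{4,7})\b/', $message, $matches);
    return array_values(array_unique($matches[1]));
}
```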

Module Architecture (include/custom/AI/)

Class | Responsibility
ChatService | Main orchestrator and entry point. Factory method forStaff(). Handles streaming, context, prompt building, message storage, auto-titling
ContextService | RAG context retrieval. Delegates to AIAssistant for Pinecone queries. Detects org analysis intent, resolves orgs, extracts cited tickets
ModelConfig | Typed configuration bridge. Reads AI settings from ost_config table (namespace ai_assistant) and exposes typed accessors
OsTicketGeminiModel | Extends php-chatbot's GeminiModel. Real-time SSE streaming via CURLOPT_WRITEFUNCTION, multi-turn conversation format, model fallback chains, thinking model filtering (skips "thought" parts from Gemini 2.5+)
OsTicketConversationStorage | Implements php-chatbot's MemoryStorageInterface. Maps to ost_ai_conversations and ost_ai_conversation_messages tables. Composite session IDs (staff_id:conversation_id)
PromptBuilder | Constructs system prompts and Gemini contents arrays. Dynamic knowledge source descriptions, source priority guidance, HTML output format rules, org analysis specialized prompts
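
The "thought" filtering mentioned for OsTicketGeminiModel can be illustrated on a single decoded response chunk. The sketch below assumes the usual Gemini response shape (candidates, content, parts) with thought parts flagged by a boolean thought field; the function name is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Extracts visible text from one decoded response chunk,
 * skipping parts flagged as thoughts by thinking models.
 */
function visibleTextFromChunk(array $chunk): string
{
    $text = '';
    foreach ($chunk['candidates'][0]['content']['parts'] ?? [] as $part) {
        if (!empty($part['thought'])) {
            continue; // reasoning output is not shown in the chat UI
        }
        $text .= $part['text'] ?? '';
    }
    return $text;
}
```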

Technical Notes

  • All files use declare(strict_types=1) and the Suma\AI namespace
  • Composer autoloader (loaded in include/ost-config.php) provides Rumenx\PhpChatbot\* class autoloading
  • AIAssistant class remains intact for embedding/Pinecone operations, delegated to by ContextService
  • SSE streaming contract preserved: {"type":"token","content":"..."} and {"type":"done",...} events unchanged
  • No database migrations required — uses existing tables
  • No frontend changes required — same AJAX endpoints and response format
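
For reference, a token event in the streaming contract could be framed as shown below. The "data:" line followed by a blank line is standard SSE framing, but whether the endpoint emits exactly this framing is an assumption, and both helper names are hypothetical.

```php
<?php
declare(strict_types=1);

/** Formats one event as an SSE "data:" frame terminated by a blank line. */
function sseFrame(array $event): string
{
    // json_encode escapes newlines, so the payload always fits on one data line
    return 'data: ' . json_encode($event, JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR) . "\n\n";
}

/** Builds the {"type":"token","content":"..."} event from the contract above. */
function tokenEvent(string $content): string
{
    return sseFrame(['type' => 'token', 'content' => $content]);
}
```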

Ticket & Agent Context Awareness

The chat widget is aware of the ticket the agent is currently viewing:

  • ai-chat-widget-host element includes data-ticket-number and data-agent-name attributes (populated from PHP)
  • JavaScript reads these into instance properties on initialization
  • Chat messages include ticket_number and agent_name in the POST payload
  • PromptBuilder includes a "CURRENT CONTEXT" section: "You are assisting [Agent Name] who is viewing ticket #[Number]"
  • Agents can say "this ticket" without specifying a number

Organization-Level Issue Analysis

The chatbot can analyze all tickets for an organization and identify recurring issue themes:

Detection: Natural language queries like:

  • "Top issues for Acme Corp"
  • "Common problems at [org name]"
  • "What does this org usually have issues with?"

How it works:

  1. ContextService detects org analysis intent via pattern matching
  2. Resolves org from current ticket context ("this org") or fuzzy name matching
  3. Fetches up to 500 ticket summaries directly from MySQL (subject + AI summary + status + urgency)
  4. Time filter support: "last 6 months", "since 2025", "this year"
  5. Specialized Gemini prompt instructs categorization into top 20 themes with frequency, descriptions, and example ticket numbers
  6. Uses existing SSE streaming — no new endpoints needed
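
Steps 1 and 4 can be sketched with regular expressions. The patterns below are illustrative only, written to match the example queries above; the real ContextService rules are not shown in this document, and both function names are hypothetical.

```php
<?php
declare(strict_types=1);

/** Illustrative intent patterns; not the actual ContextService rules. */
function looksLikeOrgAnalysis(string $question): bool
{
    $patterns = [
        '/\btop issues\b/i',
        '/\bcommon (problems|issues)\b/i',
        '/\busually have issues\b/i',
    ];
    foreach ($patterns as $p) {
        if (preg_match($p, $question) === 1) {
            return true;
        }
    }
    return false;
}

/** Parses "last N months", "since YYYY", or "this year" into a start date (Y-m-d), or null. */
function parseTimeFilter(string $question, DateTimeImmutable $now): ?string
{
    if (preg_match('/\blast (\d+) months?\b/i', $question, $m) === 1) {
        return $now->modify('-' . (int) $m[1] . ' months')->format('Y-m-d');
    }
    if (preg_match('/\bsince (\d{4})\b/i', $question, $m) === 1) {
        return $m[1] . '-01-01';
    }
    if (preg_match('/\bthis year\b/i', $question) === 1) {
        return $now->format('Y') . '-01-01';
    }
    return null;
}
```

The returned start date would then bound the MySQL query in step 3.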

All vectors live in Pinecone's default namespace with a source metadata field. A single query with $in metadata filter fetches results from all enabled sources:

Source | Vector IDs | Content
ticket | ticket_{id} | Ticket embeddings (always active)
site | site-{id} | Client site metadata from manage.rhinogroup.com
wiki | docs-{id} | Internal wiki articles
docs | {path-slug} | Docusaurus technical documentation
gsm | {domain} | GSM Outdoors brand website details
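
A query body with such a filter might be built as follows. The field names (vector, topK, includeMetadata, filter, $in) follow Pinecone's query API; the helper itself is hypothetical, and always adding the ticket source is one reading of "always active" in the table above.

```php
<?php
declare(strict_types=1);

/**
 * Builds a Pinecone query body that restricts matches to the enabled sources.
 * The "ticket" source is always included, per the table above.
 */
function buildPineconeQuery(array $vector, array $enabledSources, int $topK): array
{
    $sources = array_values(array_unique(array_merge(['ticket'], $enabledSources)));
    return [
        'vector'          => $vector,
        'topK'            => $topK,
        'includeMetadata' => true,
        'filter'          => ['source' => ['$in' => $sources]],
    ];
}
```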

Components

Component | File | Description
Chat Widget | include/staff/ai-chat-widget.inc.php | Floating bubble (bottom-right), Shadow DOM isolated
Full Page | include/staff/ai-assistant-full.inc.php | Sidebar + full-width chat interface
Page Bootstrap | scp/ai-assistant.php | Full-page AI Assistant entry point
AI History | scp/ai-history.php | AI analysis snapshots history page
ChatService | include/custom/AI/ChatService.php | Main orchestrator (php-chatbot architecture)
ContextService | include/custom/AI/ContextService.php | RAG retrieval + org analysis
PromptBuilder | include/custom/AI/PromptBuilder.php | Prompt construction
Model | include/custom/AI/OsTicketGeminiModel.php | SSE streaming Gemini model
Storage | include/custom/AI/OsTicketConversationStorage.php | Conversation persistence
Config | include/custom/AI/ModelConfig.php | AI settings accessor
Core Logic | include/class.ai-assistant.php | Embeddings, Pinecone search (delegated to by ContextService)

Shadow DOM Isolation

The chat widget is rendered inside Shadow DOM to prevent osTicket's jQuery UI and Bootstrap styles from interfering:

<div id="ai-chat-container"></div>
<script>
  const shadow = document.getElementById('ai-chat-container')
    .attachShadow({mode: 'open'});
  // Widget renders inside shadow root
</script>

Embedding Generation

Ticket content is vectorized and stored in Pinecone for semantic search:

  • Cron: cron-embeddings.php runs every 10 minutes
  • Storage: ost_ticket_embeddings table caches text before embedding
  • Model: Gemini embedding model for vector generation
  • Index: Pinecone vector database with ticket metadata (client, status, dates)

AJAX Endpoints

Method | Endpoint | Purpose
POST | /ajax.php/ai-assistant/chat | Send message, receive SSE stream
GET | /ajax.php/ai-assistant/conversations | List conversations
POST | /ajax.php/ai-assistant/conversations | Create new conversation
DELETE | /ajax.php/ai-assistant/conversations/{id} | Delete conversation
GET | /ajax.php/ai-assistant/status | Check AI system status
POST | /ajax.php/ai-assistant/reembed | Force re-embed a ticket

Conversation Storage

Table | Columns | Purpose
ost_ai_conversations | id, staff_id, title, created, updated | Chat sessions
ost_ai_conversation_messages | id, conversation_id, role, content, created | Messages (user/assistant)
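
The composite session IDs (staff_id:conversation_id) used by OsTicketConversationStorage map onto these tables. A sketch of building and parsing such an ID; both function names are hypothetical, and strict validation of the format is an assumption.

```php
<?php
declare(strict_types=1);

/** Builds a composite session ID in the staff_id:conversation_id form. */
function makeSessionId(int $staffId, int $conversationId): string
{
    return $staffId . ':' . $conversationId;
}

/**
 * Splits a composite session ID back into its two numeric parts.
 *
 * @return array{staff_id: int, conversation_id: int}
 */
function parseSessionId(string $sessionId): array
{
    if (preg_match('/^(\d+):(\d+)$/', $sessionId, $m) !== 1) {
        throw new InvalidArgumentException('Malformed session ID: ' . $sessionId);
    }
    return ['staff_id' => (int) $m[1], 'conversation_id' => (int) $m[2]];
}
```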

Motherload Plugin: AI Summary Update

The plugin_ai_summary_update plugin triggers the AI pipeline:

class plugin_ai_summary_update extends motherloadPlugin {
    public const signalConnect = 'threadentry.created';

    public function run(): bool {
        // Get the ticket from the thread entry
        $ticket = $this->getTicket();

        // Queue for Gemini processing. Stock osTicket db_query() does not
        // substitute placeholders, so the ID is formatted in with sprintf();
        // %d forces an integer.
        db_query(sprintf(
            'INSERT INTO ' . GEMINI_QUEUE_TABLE .
            ' (ticket_id, processed) VALUES (%d, 0)',
            $ticket->getId()
        ));

        return true;
    }
}

Located at: include/plugins/motherload/plugins/ai_summary_update/plugin_ai_summary_update.php


Configuration

AI settings are managed in the osTicket admin panel under Admin → Settings → AI Assistant:

Setting | Purpose
Gemini API Key | Shared key for all Gemini features (urgency, assistant, note optimization)
Generative Model | Model for chat and urgency analysis (e.g. gemini-2.5-flash)
Embedding Model | Model for vector generation (e.g. gemini-embedding-001)
Pinecone API Key | Vector database authentication
Pinecone Host | Pinecone index host URL
Enabled Sources | JSON array of Pinecone source types to query
Max Context Tickets | Total results included as chat context
Urgency Criteria | Editable textarea for urgency analysis prompt criteria (with Reset to Default button)
Max Tickets Per Run | Configurable throttle for queue processing (default 2)
Chatbot Name | Display name for the AI assistant (default "the Rhino")

Email Templates

Critical urgency alerts use the template at emails/:

  • Template uses %{ticket.number} for URLs
  • Subject: includes urgency level and ticket number
  • Body: AI summary, urgency percentage, estimated hours
  • Recipients: assigned staff + PMs + department manager