AI System — Gemini & Knowledge Assistant
The ticket system uses Google's Gemini API for three distinct AI features: automatic ticket urgency analysis, an interactive knowledge assistant for staff (powered by the modular Suma\AI php-chatbot architecture), and AI note optimization for Harvest time entries.
Gemini AI Urgency Analysis
Overview
Every time a ticket is created or a new thread entry is posted, the system queues it for AI analysis. A cron job processes the queue, sends ticket data to a Gemini API endpoint, and updates custom form fields with urgency levels, estimated hours, summaries, and suggested titles.
Architecture Flow
Thread entry posted (message, response, or note)
→ Motherload signal: threadentry.created
→ plugin_ai_summary_update::run()
→ INSERT INTO ost_gemini_queue (ticket_id, processed=0)
→ cron-gemini.php runs Gemini::process_queue()
→ Gemini::prepare_ticket_data() fetches entries via:
$ticket->getThreadEntries(['M', 'R', 'N'])
→ Includes previous urgency context + last entry poster type
→ Gemini::call_gemini_api() builds prompt locally using urgency criteria
→ Check ost_gemini_response_cache for cached result
→ POST directly to Google Gemini REST API
(generativelanguage.googleapis.com/v1beta/models/{model}:generateContent)
→ Retry with exponential backoff + model fallback chain on failure
→ Parse JSON response (urgency_percentage, estimated_hours, summary, timeline, suggested_title)
→ Staff Reply Urgency Cap enforcement (server-side clamp)
→ Cache response in ost_gemini_response_cache (configurable TTL)
→ Results saved to ost_form_entry_values (custom fields)
→ Snapshot saved to ost_ticket_ai_snapshots (history tracking)
→ History logged to ost_ticket_ai_history
→ If CRITICAL: triggers email + SMS + browser notification
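The parse step in the flow above can be sketched as follows. This is a minimal illustration, not the project's actual parser: the function name `parse_urgency_response()` and the constant are hypothetical, the `candidates[0].content.parts[0].text` path follows the public generateContent response shape, and clamping the percentage to 0-100 is an assumption based on the documented field range.

```php
<?php
declare(strict_types=1);

// Keys named in the flow above; the constant name itself is illustrative.
const URGENCY_RESPONSE_KEYS = [
    'urgency_percentage', 'estimated_hours', 'summary', 'timeline', 'suggested_title',
];

/**
 * Extract and validate the analysis JSON from a decoded generateContent response.
 * Returns null when the text is missing, is not JSON, or lacks an expected key.
 */
function parse_urgency_response(array $apiResponse): ?array
{
    $text = $apiResponse['candidates'][0]['content']['parts'][0]['text'] ?? null;
    if (!is_string($text)) {
        return null;
    }

    $data = json_decode(trim($text), true);
    if (!is_array($data)) {
        return null;
    }

    foreach (URGENCY_RESPONSE_KEYS as $key) {
        if (!array_key_exists($key, $data)) {
            return null;
        }
    }

    // Assumption: keep the score inside the documented 0-100 range.
    $data['urgency_percentage'] = max(0, min(100, (int) $data['urgency_percentage']));

    return $data;
}
```

Returning null rather than throwing lets the caller treat a malformed answer like any other failed attempt.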
Direct Gemini Integration
The system calls the Google Gemini REST API directly — there is no proxy server. The shared gemini_api_key from the AI Assistant config namespace is used for authentication.
Model Fallback Chains
If the primary model fails, fallback models are tried automatically:
- gemini-2.5-flash → gemini-2.5-flash-lite
- gemini-2.5-pro → gemini-2.5-flash → gemini-3-flash-preview
- gemini-3-pro-preview → gemini-3-flash-preview → gemini-2.5-pro → gemini-2.5-flash
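One way to express such a chain is a lookup table plus a loop that stops at the first model that answers. The sketch below uses the three chains documented here; the constant name, the function name `call_with_fallback()`, and the `$call` callable (which stands in for the real API request and returns null on failure) are all hypothetical.

```php
<?php
declare(strict_types=1);

// Primary model => ordered fallbacks, as documented in this section.
const GEMINI_FALLBACK_CHAINS = [
    'gemini-2.5-flash'     => ['gemini-2.5-flash-lite'],
    'gemini-2.5-pro'       => ['gemini-2.5-flash', 'gemini-3-flash-preview'],
    'gemini-3-pro-preview' => ['gemini-3-flash-preview', 'gemini-2.5-pro', 'gemini-2.5-flash'],
];

/**
 * Try the primary model, then each fallback in order.
 * $call is a stand-in for the API request: fn(string $model): ?array (null = failure).
 * Returns [modelUsed, result] for the first success, or null if every model failed.
 */
function call_with_fallback(string $primary, callable $call): ?array
{
    $models = array_merge([$primary], GEMINI_FALLBACK_CHAINS[$primary] ?? []);

    foreach ($models as $model) {
        $result = $call($model);
        if ($result !== null) {
            return [$model, $result];
        }
    }

    return null;
}
```

Returning the model name alongside the result makes it straightforward to record which model produced an analysis.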
Response Cache
Responses are cached in ost_gemini_response_cache with configurable TTL. Cache is bypassed during normal queue processing (bypass_cache=true). Expired entries are cleaned up automatically during cron runs.
Failure Retry Logic
If the Gemini API returns a non-200 response, the failures column on the queue entry is incremented. After 5 failed attempts, the ticket is skipped. When a new thread entry is posted (re-queuing), failures reset to 0.
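The counter behaviour described above can be modelled with three small functions. This is a sketch over a plain array, assuming a queue row carries `processed` and `failures` values; the function names and the array shape are illustrative and not taken from the codebase.

```php
<?php
declare(strict_types=1);

// From the text: a ticket is skipped after 5 failed attempts.
const GQ_MAX_FAILURES = 5;

/** Apply one API result to a queue row (hypothetical array shape). */
function gq_record_result(array $entry, int $httpStatus): array
{
    if ($httpStatus === 200) {
        $entry['processed'] = 1;
    } else {
        $entry['failures']++;
    }
    return $entry;
}

/** True once the row has used up its attempts. */
function gq_should_skip(array $entry): bool
{
    return $entry['failures'] >= GQ_MAX_FAILURES;
}

/** A new thread entry re-queues the ticket and clears the failure count. */
function gq_requeue(array $entry): array
{
    $entry['processed'] = 0;
    $entry['failures'] = 0;
    return $entry;
}
```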
API Request Format
The Gemini API receives a structured payload with one object per thread entry:
{
  "subject": "Ticket Subject",
  "body": [
    {
      "poster_name": "Jane Smith",
      "poster_type": "Client",
      "date": "2026-03-13 07:10:10",
      "message": "The checkout page throws a 500 error...",
      "is_internal": false,
      "entry_label": "Original Ticket Request"
    },
    {
      "poster_name": "David Sinclair",
      "poster_type": "Staff",
      "date": "2026-03-13 08:38:07",
      "message": "I've identified the issue...",
      "is_internal": false,
      "entry_label": "Reply #2"
    }
  ],
  "bypass_cache": true,
  "ticket_id": 8826,
  "customer_type": "small",
  "reported_at": "2026-03-13 07:10:10 AM CDT"
}
AI Fields Updated
| Field | Values | Purpose |
|---|---|---|
| ai-urgency | LOW / MEDIUM / HIGH / CRITICAL | Urgency classification |
| ai-urgency-percentage | 0–100 | Numeric urgency score |
| ai-estimated-hours | Decimal | Estimated effort |
| ticket-summary | HTML | Ticket summary |
| ai-timeline | HTML | Event timeline |
| ai-suggested-title | Text | Better title suggestion |
Critical Alert Deduplication
Critical urgency emails and SMS are sent once per ticket only, tracked via ost_ticket_ai_history:
// Check before sending
Gemini::has_critical_alert_been_sent($ticketId); // returns bool
// Record after sending
Gemini::record_critical_alert_sent($ticketId);
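The check-then-record pattern above might be wrapped as shown below. This sketch replaces the ost_ticket_ai_history lookup with an in-memory ledger so it can run on its own; `CriticalAlertLedger`, `send_critical_alert_once()`, and the `$send` callable are hypothetical names, not part of the Gemini class.

```php
<?php
declare(strict_types=1);

/** In-memory stand-in for the ost_ticket_ai_history lookup (illustrative only). */
final class CriticalAlertLedger
{
    /** @var array<int, bool> */
    private array $sent = [];

    public function hasBeenSent(int $ticketId): bool
    {
        return isset($this->sent[$ticketId]);
    }

    public function recordSent(int $ticketId): void
    {
        $this->sent[$ticketId] = true;
    }
}

/**
 * Deliver the alert only if none was recorded for this ticket.
 * $send is a stand-in for the email/SMS delivery. Returns true when it ran.
 */
function send_critical_alert_once(CriticalAlertLedger $ledger, int $ticketId, callable $send): bool
{
    if ($ledger->hasBeenSent($ticketId)) {
        return false;
    }

    $send($ticketId);
    $ledger->recordSent($ticketId);

    return true;
}
```

Recording only after a successful send means a delivery exception leaves the ticket eligible for another attempt.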
API Logging
All Gemini requests are logged to var/logs/api/gemini/gemini-YYYY-MM-DD.log via the ApiLogger class. Sensitive headers are automatically redacted.
Dedicated Gemini Queue Logging
The GeminiLogger class (include/class.gemini-logger.php) provides a separate file-based logger specifically for the queue processing pipeline:
- Log location: var/logs/gemini/gemini-queue-YYYY-MM-DD.log
- Daily rotation with 5MB max file size and 30-day retention
- Logged events:
  - Cron throttle (10-minute gate): why processing was skipped
  - Queue state snapshots: total/pending/processed/failed counts
  - Per-ticket processing: start, success (with urgency results), or failure (with error + failure count)
  - API calls: model used, HTTP response code, duration
  - Fallback chain: each step logged when primary model fails
Cron Configuration
// cron-gemini.php
// urgency_max_tickets_per_run — configurable via Settings (default 2)
# Run manually (Windows dev)
.\run-cron-dev.ps1
# Or directly
php cron-gemini.php
The cron should run every minute in production via api/cron.php.
Staff Reply Urgency Cap
When a staff member posts a reply or note, the AI re-analyzes the ticket. To prevent the AI from artificially escalating urgency after an internal response, a server-side urgency cap is enforced.
How It Works
- prepare_ticket_data() includes previous_urgency_percentage, previous_urgency_level, last_entry_poster_type, and last_entry_poster_name in the data sent to Gemini
- The urgency criteria prompt includes a "STAFF REPLY RULE" section instructing Gemini that staff replies must never increase urgency
- Server-side enforcement in process_ticket(): if the last thread entry was posted by Staff and Gemini returns a higher urgency than previously stored, the result is hard-clamped to the previous percentage
- This ensures no AI hallucination can override the rule, regardless of what the model returns
Data Flow
Staff posts reply
→ Queued for Gemini analysis
→ prepare_ticket_data() includes previous_urgency + last_entry_poster
→ Gemini receives "STAFF REPLY RULE" in criteria
→ Response parsed → server-side cap applied
→ Final urgency = min(gemini_result, previous_urgency) if last poster is Staff
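The final step of this flow reduces to a single `min()` guarded by the poster type. The sketch below is an illustration of that rule, not the code in process_ticket(): the function name is hypothetical, and treating a missing previous value as "no cap" is an assumption for tickets that have not been analyzed before.

```php
<?php
declare(strict_types=1);

/**
 * Clamp the model's urgency to the previously stored value when the
 * last thread entry came from Staff.
 */
function apply_staff_reply_cap(int $geminiPct, ?int $previousPct, string $lastPosterType): int
{
    // Assumption: with no earlier analysis there is nothing to clamp against.
    if ($lastPosterType === 'Staff' && $previousPct !== null) {
        return min($geminiPct, $previousPct);
    }

    return $geminiPct;
}
```

Because `min()` is used, a staff reply can still lower the urgency; only increases are blocked.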
AI Analysis Snapshots
Every time Gemini processes a ticket, a complete snapshot of all 6 AI fields is stored for historical tracking.
Database Table
ost_ticket_ai_snapshots — stores one row per analysis run:
| Column | Description |
|---|---|
| id | Auto-increment primary key |
| ticket_id | Foreign key to ticket |
| urgency | LOW / MEDIUM / HIGH / CRITICAL |
| urgency_percentage | 0–100 |
| estimated_hours | Decimal |
| summary | HTML summary text |
| timeline | HTML timeline |
| suggested_title | Text |
| model_used | Gemini model that produced the result |
| created | Timestamp of snapshot |
Staff History Page
scp/ai-history.php — Paginated, searchable table of all AI snapshots:
- Filter by ticket number
- Filter by urgency level
- Click any row to expand and view full summary/timeline
- Accessible to all staff via footer "AI History" link
Recording Snapshots
// Called at the end of every update_fields_directly() invocation
Gemini::insert_ai_snapshot($ticketId, $model, $fields);
Migration
-- deploy/deploy-ai-snapshots.sql
AI Knowledge Assistant ("the Rhino")
Overview
The AI Knowledge Assistant (internally called "the Rhino") is a conversational chatbot available to staff that can answer questions about tickets, clients, and historical work. It uses Pinecone vector search for context retrieval and Gemini for response generation.
As of v1.0.1136, the assistant is built on the php-chatbot architecture (rumenx/php-chatbot v1.1.0) with a modular Suma\AI namespace providing clean separation of concerns.
Architecture (php-chatbot)
Staff asks question in chat widget
→ SSE POST /scp/ajax.php/ai-assistant/chat
→ ChatService::forStaff($staffId, $conversationId) factory
→ ContextService detects org analysis intent (or normal RAG)
→ Normal path:
→ Embed question via Gemini gemini-embedding-001 (~200ms)
→ Query Pinecone default namespace with metadata source filter (~40ms)
→ Tickets: Fetch sanitized_text from ost_ticket_embeddings + ORM metadata
→ Sites/Docs/Wiki: Build text blocks from Pinecone metadata
→ Direct ticket lookup: extractDirectTicketLookups() finds #NNNNN patterns
→ Cited ticket carry-forward: re-fetches previously cited tickets
→ Org analysis path:
→ Resolve org by name or current ticket context
→ Fetch up to 500 ticket summaries from MySQL (bypasses Pinecone)
→ Specialized prompt for theme categorization
→ PromptBuilder assembles system instructions + context + conversation history
→ OsTicketGeminiModel streams response via CURLOPT_WRITEFUNCTION SSE
→ Tokens appear in real-time in the chat UI
→ After stream: extract cited tickets, store message via OsTicketConversationStorage
→ Auto-title conversation if first message
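The direct ticket lookup step in this flow scans the question for #NNNNN references. A minimal sketch of that idea follows; it is not the body of extractDirectTicketLookups(), the function name is hypothetical, and accepting any run of digits after `#` is an assumption about the ticket number format.

```php
<?php
declare(strict_types=1);

/**
 * Return the distinct ticket numbers referenced as #NNNNN, in order of
 * first appearance. Numbers are kept as strings so leading zeros survive.
 *
 * @return string[]
 */
function find_ticket_references(string $question): array
{
    preg_match_all('/#(\d+)\b/', $question, $matches);

    return array_values(array_unique($matches[1]));
}
```

For example, `find_ticket_references('Is #10432 related to #10398?')` yields the two numbers as strings.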
Module Architecture (include/custom/AI/)
| Class | Responsibility |
|---|---|
| ChatService | Main orchestrator and entry point. Factory method forStaff(). Handles streaming, context, prompt building, message storage, auto-titling |
| ContextService | RAG context retrieval. Delegates to AIAssistant for Pinecone queries. Detects org analysis intent, resolves orgs, extracts cited tickets |
| ModelConfig | Typed configuration bridge. Reads AI settings from ost_config table (namespace ai_assistant) and exposes typed accessors |
| OsTicketGeminiModel | Extends php-chatbot's GeminiModel. Real-time SSE streaming via CURLOPT_WRITEFUNCTION, multi-turn conversation format, model fallback chains, thinking model filtering (skips "thought" parts from Gemini 2.5+) |
| OsTicketConversationStorage | Implements php-chatbot's MemoryStorageInterface. Maps to ost_ai_conversations and ost_ai_conversation_messages tables. Composite session IDs (staff_id:conversation_id) |
| PromptBuilder | Constructs system prompts and Gemini contents arrays. Dynamic knowledge source descriptions, source priority guidance, HTML output format rules, org analysis specialized prompts |
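The thinking model filtering mentioned for OsTicketGeminiModel can be pictured as a pass over the `parts` array of a response chunk. The sketch below assumes each part is an associative array in which reasoning parts carry a truthy `thought` flag, as in the public Gemini response format; the function name is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Concatenate the user-visible text of a chunk, skipping reasoning parts.
 *
 * @param array<int, array<string, mixed>> $parts
 */
function visible_text_from_parts(array $parts): string
{
    $text = '';

    foreach ($parts as $part) {
        if (!empty($part['thought'])) {
            continue; // internal reasoning, never shown in the chat UI
        }
        $text .= $part['text'] ?? '';
    }

    return $text;
}
```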
Technical Notes
- All files use declare(strict_types=1) and the Suma\AI namespace
- Composer autoloader (loaded in include/ost-config.php) provides Rumenx\PhpChatbot\* class autoloading
- AIAssistant class remains intact for embedding/Pinecone operations, delegated to by ContextService
- SSE streaming contract preserved: {"type":"token","content":"..."} and {"type":"done",...} events unchanged
- No database migrations required: uses existing tables
- No frontend changes required: same AJAX endpoints and response format
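The two event shapes in the streaming contract can be produced with a small formatter. This is a sketch only: the function names are hypothetical, and the `data: ...` line followed by a blank line is the standard Server-Sent Events framing, assumed here rather than confirmed from the codebase.

```php
<?php
declare(strict_types=1);

/** Encode one payload as a single SSE message. */
function sse_event(array $payload): string
{
    return 'data: ' . json_encode($payload, JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR) . "\n\n";
}

/** A streamed token: {"type":"token","content":"..."} */
function sse_token(string $content): string
{
    return sse_event(['type' => 'token', 'content' => $content]);
}

/** The closing event: {"type":"done",...}; extra fields are caller-defined. */
function sse_done(array $extra = []): string
{
    return sse_event(['type' => 'done'] + $extra);
}
```

In a real handler each string would be echoed and flushed as soon as it is produced.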
Ticket & Agent Context Awareness
The chat widget is aware of the ticket the agent is currently viewing:
- ai-chat-widget-host element includes data-ticket-number and data-agent-name attributes (populated from PHP)
- JavaScript reads these into instance properties on initialization
- Chat messages include ticket_number and agent_name in the POST payload
- PromptBuilder includes a "CURRENT CONTEXT" section: "You are assisting [Agent Name] who is viewing ticket #[Number]"
- Agents can say "this ticket" without specifying a number
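A prompt fragment of this kind could be assembled as below. The sentence wording is quoted from this section; the function name, the heading layout, and the choice to return an empty string when either value is missing are assumptions made for the sketch.

```php
<?php
declare(strict_types=1);

/**
 * Build the "CURRENT CONTEXT" fragment from the POST payload values.
 * Returns '' when the agent is not viewing a ticket.
 */
function build_current_context(?string $agentName, ?string $ticketNumber): string
{
    if ($agentName === null || $ticketNumber === null) {
        return '';
    }

    return sprintf(
        "CURRENT CONTEXT\nYou are assisting %s who is viewing ticket #%s",
        $agentName,
        $ticketNumber
    );
}
```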
Organization-Level Issue Analysis
The chatbot can analyze all tickets for an organization and identify recurring issue themes:
Detection: Natural language queries like:
- "Top issues for Acme Corp"
- "Common problems at [org name]"
- "What does this org usually have issues with?"
How it works:
- ContextService detects org analysis intent via pattern matching
- Resolves org from current ticket context ("this org") or fuzzy name matching
- Fetches up to 500 ticket summaries directly from MySQL (subject + AI summary + status + urgency)
- Time filter support: "last 6 months", "since 2025", "this year"
- Specialized Gemini prompt instructs categorization into top 20 themes with frequency, descriptions, and example ticket numbers
- Uses existing SSE streaming — no new endpoints needed
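Intent detection and the time filter can both be approximated with regular expressions. The sketch below is built only from the example phrases in this section; the patterns and function names are hypothetical and are unlikely to match what ContextService really uses. The current time is passed in so the result is deterministic.

```php
<?php
declare(strict_types=1);

/** Rough intent check based on the example queries (illustrative patterns). */
function looks_like_org_analysis(string $query): bool
{
    $pattern = '/\b(top|common|recurring)\s+(issues|problems)\b'
             . '|\busually\s+have\s+(issues|problems)\b/i';

    return preg_match($pattern, $query) === 1;
}

/** Translate "last N months", "since YYYY", or "this year" into a cutoff date. */
function parse_time_filter(string $query, DateTimeImmutable $now): ?DateTimeImmutable
{
    $q = strtolower($query);

    if (preg_match('/last\s+(\d+)\s+months?/', $q, $m)) {
        return $now->modify('-' . (int) $m[1] . ' months');
    }
    if (preg_match('/since\s+(\d{4})\b/', $q, $m)) {
        return new DateTimeImmutable($m[1] . '-01-01');
    }
    if (strpos($q, 'this year') !== false) {
        return new DateTimeImmutable($now->format('Y') . '-01-01');
    }

    return null; // no time restriction requested
}
```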
Multi-Source Pinecone Search
All vectors live in Pinecone's default namespace with a source metadata field. A single query with $in metadata filter fetches results from all enabled sources:
| Source | Vector IDs | Content |
|---|---|---|
| ticket | ticket_{id} | Ticket embeddings (always active) |
| site | site-{id} | Client site metadata from manage.rhinogroup.com |
| wiki | docs-{id} | Internal wiki articles |
| docs | {path-slug} | Docusaurus technical documentation |
| gsm | {domain} | GSM Outdoors brand website details |
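A single filtered query of this kind might be built as follows. The body keys (`vector`, `topK`, `includeMetadata`, `filter`) and the `$in` operator follow Pinecone's public query API; the function name and the way enabled sources are passed in are assumptions for the sketch.

```php
<?php
declare(strict_types=1);

/**
 * Build the JSON body for one Pinecone query restricted to the enabled sources.
 *
 * @param float[]  $vector         embedding of the staff question
 * @param string[] $enabledSources e.g. ['ticket', 'site', 'wiki']
 */
function build_pinecone_query(array $vector, array $enabledSources, int $topK): array
{
    return [
        'vector'          => $vector,
        'topK'            => $topK,
        'includeMetadata' => true,
        // Match any vector whose "source" metadata is one of the enabled types.
        'filter'          => ['source' => ['$in' => array_values($enabledSources)]],
    ];
}
```

The key `'$in'` is written in single quotes so PHP does not treat it as a variable.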
Components
| Component | File | Description |
|---|---|---|
| Chat Widget | include/staff/ai-chat-widget.inc.php | Floating bubble (bottom-right), Shadow DOM isolated |
| Full Page | include/staff/ai-assistant-full.inc.php | Sidebar + full-width chat interface |
| Page Bootstrap | scp/ai-assistant.php | Full-page AI Assistant entry point |
| AI History | scp/ai-history.php | AI analysis snapshots history page |
| ChatService | include/custom/AI/ChatService.php | Main orchestrator (php-chatbot architecture) |
| ContextService | include/custom/AI/ContextService.php | RAG retrieval + org analysis |
| PromptBuilder | include/custom/AI/PromptBuilder.php | Prompt construction |
| Model | include/custom/AI/OsTicketGeminiModel.php | SSE streaming Gemini model |
| Storage | include/custom/AI/OsTicketConversationStorage.php | Conversation persistence |
| Config | include/custom/AI/ModelConfig.php | AI settings accessor |
| Core Logic | include/class.ai-assistant.php | Embeddings, Pinecone search (delegated to by ContextService) |
Shadow DOM Isolation
The chat widget is rendered inside Shadow DOM to prevent osTicket's jQuery UI and Bootstrap styles from interfering:
<div id="ai-chat-container"></div>
<script>
const shadow = document.getElementById('ai-chat-container')
.attachShadow({mode: 'open'});
// Widget renders inside shadow root
</script>
Embedding Generation
Ticket content is vectorized and stored in Pinecone for semantic search:
- Cron: cron-embeddings.php runs every 10 minutes
- Storage: ost_ticket_embeddings table caches text before embedding
- Model: Gemini embedding model for vector generation
- Index: Pinecone vector database with ticket metadata (client, status, dates)
AJAX Endpoints
| Method | Endpoint | Purpose |
|---|---|---|
| POST | /ajax.php/ai-assistant/chat | Send message, receive SSE stream |
| GET | /ajax.php/ai-assistant/conversations | List conversations |
| POST | /ajax.php/ai-assistant/conversations | Create new conversation |
| DELETE | /ajax.php/ai-assistant/conversations/{id} | Delete conversation |
| GET | /ajax.php/ai-assistant/status | Check AI system status |
| POST | /ajax.php/ai-assistant/reembed | Force re-embed a ticket |
Conversation Storage
| Table | Columns | Purpose |
|---|---|---|
| ost_ai_conversations | id, staff_id, title, created, updated | Chat sessions |
| ost_ai_conversation_messages | id, conversation_id, role, content, created | Messages (user/assistant) |
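The storage class maps php-chatbot sessions onto these tables through composite session IDs of the form staff_id:conversation_id. A sketch of building and parsing such an ID is shown below; both function names are hypothetical, and rejecting malformed input with an exception is a design choice of the example.

```php
<?php
declare(strict_types=1);

/** Compose the session ID, e.g. staff 12 + conversation 345 => "12:345". */
function make_session_id(int $staffId, int $conversationId): string
{
    return $staffId . ':' . $conversationId;
}

/**
 * Split a composite session ID back into its two integer parts.
 *
 * @return array{staff_id: int, conversation_id: int}
 */
function parse_session_id(string $sessionId): array
{
    if (!preg_match('/^(\d+):(\d+)$/', $sessionId, $m)) {
        throw new InvalidArgumentException('Malformed session id: ' . $sessionId);
    }

    return ['staff_id' => (int) $m[1], 'conversation_id' => (int) $m[2]];
}
```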
Motherload Plugin: AI Summary Update
The plugin_ai_summary_update plugin triggers the AI pipeline:
class plugin_ai_summary_update extends motherloadPlugin {
    public const signalConnect = 'threadentry.created';

    public function run(): bool {
        // Get the ticket from the thread entry
        $ticket = $this->getTicket();

        // Queue for Gemini processing. The ID is formatted with sprintf()
        // because db_query() does not substitute placeholders itself.
        db_query(sprintf(
            'INSERT INTO ' . GEMINI_QUEUE_TABLE .
            ' (ticket_id, processed) VALUES (%d, 0)',
            $ticket->getId()
        ));

        return true;
    }
}
Located at: include/plugins/motherload/plugins/ai_summary_update/plugin_ai_summary_update.php
Configuration
AI settings are managed in the osTicket admin panel under Admin → Settings → AI Assistant:
| Setting | Purpose |
|---|---|
| Gemini API Key | Shared key for all Gemini features (urgency, assistant, note optimization) |
| Generative Model | Model for chat and urgency analysis (e.g. gemini-2.5-flash) |
| Embedding Model | Model for vector generation (e.g. gemini-embedding-001) |
| Pinecone API Key | Vector database authentication |
| Pinecone Host | Pinecone index host URL |
| Enabled Sources | JSON array of Pinecone source types to query |
| Max Context Tickets | Total results included as chat context |
| Urgency Criteria | Editable textarea for urgency analysis prompt criteria (with Reset to Default button) |
| Max Tickets Per Run | Configurable throttle for queue processing (default 2) |
| Chatbot Name | Display name for the AI assistant (default "the Rhino") |
Email Templates
Critical urgency alerts use the template at emails/:
- Template uses %{ticket.number} for URLs
- Subject: includes urgency level and ticket number
- Body: AI summary, urgency percentage, estimated hours
- Recipients: assigned staff + PMs + department manager