AI System — Gemini & Knowledge Assistant

The ticket system uses Google's Gemini API for three distinct AI features: automatic ticket urgency analysis, an interactive knowledge assistant for staff (powered by the modular Suma\AI php-chatbot architecture), and AI note optimization for Harvest time entries.

Gemini AI Urgency Analysis

Overview

Every time a ticket is created or a new thread entry is posted, the system queues it for AI analysis. A cron job processes the queue, sends ticket data to a Gemini API endpoint, and updates custom form fields with urgency levels, estimated hours, summaries, and suggested titles.

Architecture Flow

Thread entry posted (message, response, or note)
  → Motherload signal: threadentry.created
  → plugin_ai_summary_update::run()
  → INSERT INTO ost_gemini_queue (ticket_id, processed=0)
  → cron-gemini.php runs Gemini::process_queue()
  → Gemini::prepare_ticket_data() fetches entries via:
       $ticket->getThreadEntries(['M', 'R', 'N'])
  → Includes previous urgency context + last entry poster type
  → Gemini::call_gemini_api() builds prompt locally using urgency criteria
  → Check ost_gemini_response_cache for cached result
  → POST directly to Google Gemini REST API
       (generativelanguage.googleapis.com/v1beta/models/{model}:generateContent)
  → Retry with exponential backoff + model fallback chain on failure
  → Parse JSON response (urgency_percentage, estimated_hours, summary, timeline, suggested_title)
  → Staff Reply Urgency Cap enforcement (server-side clamp)
  → Cache response in ost_gemini_response_cache (configurable TTL)
  → Results saved to ost_form_entry_values (custom fields)
  → Snapshot saved to ost_ticket_ai_snapshots (history tracking)
  → History logged to ost_ticket_ai_history
  → If CRITICAL: triggers email + SMS + browser notification
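
The "Parse JSON response" step can be sketched roughly as follows. This is a minimal illustration, not the project's actual parser: the function name `parse_analysis_sketch()` is hypothetical, the input is assumed to be an already-decoded `generateContent` response, and skipping "thought" parts and clamping the score to 0 to 100 are assumptions about the urgency pipeline.

```php
<?php
// Keys the flow above lists as parsed from the model's JSON answer.
const EXPECTED_KEYS = ['urgency_percentage', 'estimated_hours', 'summary', 'timeline', 'suggested_title'];

/**
 * Extract the analysis array from a decoded generateContent response.
 * Returns null when the text is missing, is not valid JSON, or lacks a key.
 * (Hypothetical helper, for illustration only.)
 */
function parse_analysis_sketch(array $apiResponse): ?array
{
    $text = '';
    foreach ($apiResponse['candidates'][0]['content']['parts'] ?? [] as $part) {
        if (!empty($part['thought'])) {
            continue; // assumption: ignore "thought" parts from thinking models
        }
        $text .= $part['text'] ?? '';
    }

    $data = json_decode(trim($text), true);
    if (!is_array($data)) {
        return null;
    }
    foreach (EXPECTED_KEYS as $key) {
        if (!array_key_exists($key, $data)) {
            return null;
        }
    }

    // Assumption: keep the score inside the documented 0 to 100 range.
    $data['urgency_percentage'] = max(0, min(100, (int) $data['urgency_percentage']));
    return $data;
}
```

Returning null rather than throwing lets the caller treat a malformed answer like any other failed attempt.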

Direct Gemini Integration

The system calls the Google Gemini REST API directly — there is no proxy server. The shared gemini_api_key from the AI Assistant config namespace is used for authentication.

Model Fallback Chains

If the primary model fails, fallback models are tried automatically:

gemini-2.5-flash → gemini-2.5-flash-lite
gemini-2.5-pro → gemini-2.5-flash → gemini-3-flash-preview
gemini-3-pro-preview → gemini-3-flash-preview → gemini-2.5-pro → gemini-2.5-flash
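
A minimal sketch of how such a chain might be walked, combined with the exponential backoff mentioned in the architecture flow. The chains are copied from the list above; the function name, the `$send` callable, the retry count, and the base delay are all hypothetical and do not come from the codebase.

```php
<?php
// Fallback chains as listed above; everything else in this block is illustrative.
const FALLBACK_CHAINS = [
    'gemini-2.5-flash'     => ['gemini-2.5-flash-lite'],
    'gemini-2.5-pro'       => ['gemini-2.5-flash', 'gemini-3-flash-preview'],
    'gemini-3-pro-preview' => ['gemini-3-flash-preview', 'gemini-2.5-pro', 'gemini-2.5-flash'],
];

/**
 * Try the primary model, then each fallback in order. Each model is retried
 * with exponential backoff before moving on.
 *
 * $send is a hypothetical callable: fn(string $model): ?array, null on failure.
 */
function call_with_fallback(string $primary, callable $send, int $retries = 2, int $baseDelayMs = 500): ?array
{
    $models = array_merge([$primary], FALLBACK_CHAINS[$primary] ?? []);

    foreach ($models as $model) {
        for ($attempt = 0; $attempt <= $retries; $attempt++) {
            $result = $send($model);
            if ($result !== null) {
                // Record which model actually produced the answer.
                return ['model' => $model] + $result;
            }
            if ($attempt < $retries) {
                // Delay doubles each attempt: base, 2x base, 4x base, ...
                usleep($baseDelayMs * (2 ** $attempt) * 1000);
            }
        }
    }
    return null; // every model in the chain failed
}
```

Passing the HTTP call in as a callable keeps the chain logic testable without network access.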

Response Cache

Responses are cached in ost_gemini_response_cache with a configurable TTL. Normal queue processing bypasses the cache (bypass_cache=true), and expired entries are cleaned up automatically during cron runs.
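
The TTL and bypass behavior can be illustrated with an in-memory stand-in for the cache table. The class name, the SHA-256 cache key, and the method names below are assumptions made for this sketch. The real table schema and key derivation are not documented here.

```php
<?php
/** In-memory stand-in for ost_gemini_response_cache (illustrative only). */
final class ResponseCacheSketch
{
    /** @var array<string, array{expires:int, response:array}> */
    private array $rows = [];

    public function __construct(private int $ttlSeconds) {}

    /** Assumed key derivation: hash of model plus prompt. */
    public static function key(string $model, string $prompt): string
    {
        return hash('sha256', $model . "\n" . $prompt);
    }

    /** Returns the cached response, or null on miss, expiry, or bypass. */
    public function get(string $key, int $now, bool $bypassCache = false): ?array
    {
        if ($bypassCache || !isset($this->rows[$key])) {
            return null;
        }
        return $this->rows[$key]['expires'] > $now ? $this->rows[$key]['response'] : null;
    }

    public function put(string $key, array $response, int $now): void
    {
        $this->rows[$key] = ['expires' => $now + $this->ttlSeconds, 'response' => $response];
    }

    /** Drop expired rows, as a cron run would; returns the number removed. */
    public function purgeExpired(int $now): int
    {
        $before = count($this->rows);
        $this->rows = array_filter($this->rows, fn(array $row): bool => $row['expires'] > $now);
        return $before - count($this->rows);
    }
}
```

Taking `$now` as a parameter instead of calling `time()` keeps expiry behavior deterministic in tests.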

Failure Retry Logic

If the Gemini API returns a non-200 response, the failures column on the queue entry is incremented. After 5 failed attempts, the ticket is skipped. When a new thread entry is posted (re-queuing), failures reset to 0.
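
The failure counter can be modeled as a small state machine over a queue row. The sketch below uses plain arrays instead of database rows; the function names and the simplified row shape (`ticket_id`, `processed`, `failures`) are assumptions for illustration.

```php
<?php
// Threshold taken from the text above; the row shape is a simplified assumption.
const MAX_QUEUE_FAILURES = 5;

/** Apply the outcome of one API attempt to a queue row. */
function record_attempt_sketch(array $row, int $httpStatus): array
{
    if ($httpStatus === 200) {
        $row['processed'] = 1;
    } else {
        $row['failures']++;
    }
    return $row;
}

/** A row is eligible while it is unprocessed and under the failure limit. */
function is_eligible_sketch(array $row): bool
{
    return $row['processed'] === 0 && $row['failures'] < MAX_QUEUE_FAILURES;
}

/** Re-queuing after a new thread entry resets the failure counter. */
function requeue_sketch(array $row): array
{
    return ['processed' => 0, 'failures' => 0] + $row;
}
```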

API Request Format

The Gemini API receives a structured payload with one object per thread entry:

{
  "subject": "Ticket Subject",
  "body": [
    {
      "poster_name": "Jane Smith",
      "poster_type": "Client",
      "date": "2026-03-13 07:10:10",
      "message": "The checkout page throws a 500 error...",
      "is_internal": false,
      "entry_label": "Original Ticket Request"
    },
    {
      "poster_name": "David Sinclair",
      "poster_type": "Staff",
      "date": "2026-03-13 08:38:07",
      "message": "I've identified the issue...",
      "is_internal": false,
      "entry_label": "Reply #2"
    }
  ],
  "bypass_cache": true,
  "ticket_id": 8826,
  "customer_type": "small",
  "reported_at": "2026-03-13 07:10:10 AM CDT"
}
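
A rough sketch of how a payload of this shape could be assembled from thread entries. The function name and the input array shape are hypothetical. Treating only type `N` (note) entries as internal and labeling later entries `Reply #N` by position are assumptions inferred from the example above, not confirmed behavior of prepare_ticket_data().

```php
<?php
/**
 * Build a payload shaped like the example above.
 * Each $entries item is assumed to carry: poster_name, poster_type, date,
 * message, and type ('M' message, 'R' response, 'N' note).
 */
function build_payload_sketch(
    int $ticketId,
    string $subject,
    array $entries,
    string $customerType,
    string $reportedAt
): array {
    $body = [];
    foreach (array_values($entries) as $i => $entry) {
        $body[] = [
            'poster_name' => $entry['poster_name'],
            'poster_type' => $entry['poster_type'],
            'date'        => $entry['date'],
            'message'     => $entry['message'],
            // Assumption: only notes count as internal entries.
            'is_internal' => $entry['type'] === 'N',
            // Assumption: labels are positional, matching "Reply #2" above.
            'entry_label' => $i === 0 ? 'Original Ticket Request' : 'Reply #' . ($i + 1),
        ];
    }

    return [
        'subject'       => $subject,
        'body'          => $body,
        'bypass_cache'  => true,
        'ticket_id'     => $ticketId,
        'customer_type' => $customerType,
        'reported_at'   => $reportedAt,
    ];
}
```

The result can be passed to `json_encode()` to produce the JSON document shown above.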

AI Fields Updated

Field	Values	Purpose
`ai-urgency`	`LOW` / `MEDIUM` / `HIGH` / `CRITICAL`	Urgency classification
`ai-urgency-percentage`	0–100	Numeric urgency score
`ai-estimated-hours`	Decimal	Estimated effort
`ticket-summary`	HTML	Ticket summary
`ai-timeline`	HTML	Event timeline
`ai-suggested-title`	Text	Better title suggestion

Critical Alert Deduplication

Critical urgency emails and SMS are sent once per ticket only, tracked via ost_ticket_ai_history:

// Check before sending
Gemini::has_critical_alert_been_sent($ticketId);  // returns bool

// Record after sending
Gemini::record_critical_alert_sent($ticketId);

API Logging

All Gemini requests are logged to var/logs/api/gemini/gemini-YYYY-MM-DD.log via the ApiLogger class. Sensitive headers are automatically redacted.

Dedicated Gemini Queue Logging

The GeminiLogger class (include/class.gemini-logger.php) provides a separate file-based logger specifically for the queue processing pipeline:

Log location: var/logs/gemini/gemini-queue-YYYY-MM-DD.log
Daily rotation with 5MB max file size and 30-day retention
Logged events:
- Cron throttle (10-minute gate) — why processing was skipped
- Queue state snapshots — total/pending/processed/failed counts
- Per-ticket processing — start, success (with urgency results), or failure (with error + failure count)
- API calls — model used, HTTP response code, duration
- Fallback chain — each step logged when primary model fails

Cron Configuration

// cron-gemini.php
// urgency_max_tickets_per_run — configurable via Settings (default 2)

# Run manually (Windows dev)
.\run-cron-dev.ps1

# Or directly
php cron-gemini.php

The cron should run every minute in production via api/cron.php.

Staff Reply Urgency Cap

When a staff member posts a reply or note, the AI re-analyzes the ticket. To prevent the AI from artificially escalating urgency after an internal response, a server-side urgency cap is enforced.

How It Works

prepare_ticket_data() includes previous_urgency_percentage, previous_urgency_level, last_entry_poster_type, and last_entry_poster_name in the data sent to Gemini
The urgency criteria prompt includes a "STAFF REPLY RULE" section instructing Gemini that staff replies must never increase urgency
Server-side enforcement in process_ticket(): if the last thread entry was posted by Staff and Gemini returns a higher urgency than previously stored, the result is hard-clamped to the previous percentage
This ensures no AI hallucination can override the rule, regardless of what the model returns

Data Flow

Staff posts reply
  → Queued for Gemini analysis
  → prepare_ticket_data() includes previous_urgency + last_entry_poster
  → Gemini receives "STAFF REPLY RULE" in criteria
  → Response parsed → server-side cap applied
  → Final urgency = min(gemini_result, previous_urgency) if last poster is Staff
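
The clamp in the last step is small enough to show in full. This is a sketch of the rule as described, with a hypothetical function name; treating a missing previous value as "no cap" is an assumption.

```php
<?php
/**
 * Apply the Staff Reply Urgency Cap to a freshly returned score.
 * Hypothetical helper; the real check is described as living in process_ticket().
 */
function apply_staff_reply_cap(int $geminiPercentage, ?int $previousPercentage, string $lastPosterType): int
{
    // Assumption: with no stored previous score there is nothing to clamp to.
    if ($lastPosterType === 'Staff' && $previousPercentage !== null) {
        return min($geminiPercentage, $previousPercentage);
    }
    return $geminiPercentage;
}
```

Because `min()` is used, a staff reply can still lower the urgency; only increases are blocked.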

AI Analysis Snapshots

Every time Gemini processes a ticket, a complete snapshot of all 6 AI fields is stored for historical tracking.

Database Table

ost_ticket_ai_snapshots — stores one row per analysis run:

Column	Description
`id`	Auto-increment primary key
`ticket_id`	Foreign key to ticket
`urgency`	LOW / MEDIUM / HIGH / CRITICAL
`urgency_percentage`	0–100
`estimated_hours`	Decimal
`summary`	HTML summary text
`timeline`	HTML timeline
`suggested_title`	Text
`model_used`	Gemini model that produced the result
`created`	Timestamp of snapshot

Staff History Page

scp/ai-history.php — Paginated, searchable table of all AI snapshots:

Filter by ticket number
Filter by urgency level
Click any row to expand and view full summary/timeline
Accessible to all staff via footer "AI History" link

Recording Snapshots

// Called at the end of every update_fields_directly() invocation
Gemini::insert_ai_snapshot($ticketId, $model, $fields);

Migration

-- deploy/deploy-ai-snapshots.sql

AI Knowledge Assistant ("the Rhino")

Overview

The AI Knowledge Assistant (internally called "the Rhino") is a conversational chatbot available to staff that can answer questions about tickets, clients, and historical work. It uses Pinecone vector search for context retrieval and Gemini for response generation.

As of v1.0.1136, the assistant is built on the php-chatbot architecture (rumenx/php-chatbot v1.1.0) with a modular Suma\AI namespace providing clean separation of concerns.

Architecture (php-chatbot)

Staff asks question in chat widget
  → SSE POST /scp/ajax.php/ai-assistant/chat
  → ChatService::forStaff($staffId, $conversationId) factory
  → ContextService detects org analysis intent (or normal RAG)
  → Normal path:
      → Embed question via Gemini gemini-embedding-001 (~200ms)
      → Query Pinecone default namespace with metadata source filter (~40ms)
      → Tickets: Fetch sanitized_text from ost_ticket_embeddings + ORM metadata
      → Sites/Docs/Wiki: Build text blocks from Pinecone metadata
      → Direct ticket lookup: extractDirectTicketLookups() finds #NNNNN patterns
      → Cited ticket carry-forward: re-fetches previously cited tickets
  → Org analysis path:
      → Resolve org by name or current ticket context
      → Fetch up to 500 ticket summaries from MySQL (bypasses Pinecone)
      → Specialized prompt for theme categorization
  → PromptBuilder assembles system instructions + context + conversation history
  → OsTicketGeminiModel streams response via CURLOPT_WRITEFUNCTION SSE
  → Tokens appear in real-time in the chat UI
  → After stream: extract cited tickets, store message via OsTicketConversationStorage
  → Auto-title conversation if first message

Module Architecture (`include/custom/AI/`)

Class	Responsibility
`ChatService`	Main orchestrator and entry point. Factory method `forStaff()`. Handles streaming, context, prompt building, message storage, auto-titling
`ContextService`	RAG context retrieval. Delegates to `AIAssistant` for Pinecone queries. Detects org analysis intent, resolves orgs, extracts cited tickets
`ModelConfig`	Typed configuration bridge. Reads AI settings from `ost_config` table (namespace `ai_assistant`) and exposes typed accessors
`OsTicketGeminiModel`	Extends php-chatbot's `GeminiModel`. Real-time SSE streaming via `CURLOPT_WRITEFUNCTION`, multi-turn conversation format, model fallback chains, thinking model filtering (skips "thought" parts from Gemini 2.5+)
`OsTicketConversationStorage`	Implements php-chatbot's `MemoryStorageInterface`. Maps to `ost_ai_conversations` and `ost_ai_conversation_messages` tables. Composite session IDs (`staff_id:conversation_id`)
`PromptBuilder`	Constructs system prompts and Gemini contents arrays. Dynamic knowledge source descriptions, source priority guidance, HTML output format rules, org analysis specialized prompts

Technical Notes

All files use declare(strict_types=1) and the Suma\AI namespace
Composer autoloader (loaded in include/ost-config.php) provides Rumenx\PhpChatbot\* class autoloading
AIAssistant class remains intact for embedding/Pinecone operations, delegated to by ContextService
SSE streaming contract preserved: {"type":"token","content":"..."} and {"type":"done",...} events unchanged
No database migrations required — uses existing tables
No frontend changes required — same AJAX endpoints and response format
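
For reference, the two SSE event shapes named above could be framed like this. The helper names are hypothetical, the `data:` framing follows the standard Server-Sent Events wire format, and any extra fields on the `done` event (such as `conversation_id` in the usage below) are placeholders, since the source elides them.

```php
<?php
/** Encode one event as a standard SSE "data:" frame. */
function sse_frame(array $event): string
{
    // json_encode escapes newlines, so the payload always fits on one data line.
    return 'data: ' . json_encode($event, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES) . "\n\n";
}

/** A streamed token event: {"type":"token","content":"..."} */
function sse_token(string $content): string
{
    return sse_frame(['type' => 'token', 'content' => $content]);
}

/** The final event: {"type":"done",...}; extra fields are caller-supplied. */
function sse_done(array $extra = []): string
{
    return sse_frame(['type' => 'done'] + $extra);
}
```

A streaming handler would `echo sse_token($chunk); flush();` for each chunk and finish with something like `echo sse_done(['conversation_id' => 7]);`.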

Ticket & Agent Context Awareness

The chat widget is aware of the ticket the agent is currently viewing:

ai-chat-widget-host element includes data-ticket-number and data-agent-name attributes (populated from PHP)
JavaScript reads these into instance properties on initialization
Chat messages include ticket_number and agent_name in the POST payload
PromptBuilder includes a "CURRENT CONTEXT" section: "You are assisting [Agent Name] who is viewing ticket #[Number]"
Agents can say "this ticket" without specifying a number
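
A sketch of how the "CURRENT CONTEXT" section might be rendered from the two payload fields. The function name, the fallback wording when a field is missing, and the exact layout are assumptions. Only the quoted sentence pattern comes from the description above.

```php
<?php
/**
 * Render the CURRENT CONTEXT block for the system prompt.
 * Returns an empty string when neither value is known.
 */
function current_context_section(?string $agentName, ?string $ticketNumber): string
{
    if ($agentName === null && $ticketNumber === null) {
        return '';
    }

    $who = $agentName ?? 'a staff member'; // assumed fallback wording
    $line = "You are assisting {$who}";
    if ($ticketNumber !== null) {
        $line .= " who is viewing ticket #{$ticketNumber}";
    }

    return "CURRENT CONTEXT\n{$line}.";
}
```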

Organization-Level Issue Analysis

The chatbot can analyze all tickets for an organization and identify recurring issue themes:

Detection: Natural language queries like:

"Top issues for Acme Corp"
"Common problems at [org name]"
"What does this org usually have issues with?"

How it works:

ContextService detects org analysis intent via pattern matching
Resolves org from current ticket context ("this org") or fuzzy name matching
Fetches up to 500 ticket summaries directly from MySQL (subject + AI summary + status + urgency)
Time filter support: "last 6 months", "since 2025", "this year"
Specialized Gemini prompt instructs categorization into top 20 themes with frequency, descriptions, and example ticket numbers
Uses existing SSE streaming — no new endpoints needed
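
The detection and time filter steps can be approximated with a few regular expressions. Both functions below are hypothetical sketches: the patterns cover only the example phrasings listed above, and the real ContextService matching (including fuzzy org name resolution) is likely broader.

```php
<?php
/**
 * Returns null when the question is not an org analysis request, otherwise
 * ['org' => ?string, 'use_current_ticket' => bool].
 */
function detect_org_analysis_sketch(string $question): ?array
{
    $q = trim($question, " \t\r\n?.!");

    // "this org" resolves through the ticket currently being viewed.
    if (preg_match('/\bthis org(?:anization)?\b/i', $q)
        && preg_match('/\b(?:issues|problems)\b/i', $q)) {
        return ['org' => null, 'use_current_ticket' => true];
    }
    if (preg_match('/\b(?:top issues|common problems) (?:for|at) (.+)$/i', $q, $m)) {
        return ['org' => trim($m[1]), 'use_current_ticket' => false];
    }
    return null;
}

/** Map "last N months", "since YYYY", or "this year" to a start date. */
function parse_time_filter_sketch(string $question, DateTimeImmutable $now): ?DateTimeImmutable
{
    if (preg_match('/\blast (\d+) months?\b/i', $question, $m)) {
        return $now->modify("-{$m[1]} months");
    }
    if (preg_match('/\bsince ((?:19|20)\d{2})\b/i', $question, $m)) {
        return new DateTimeImmutable("{$m[1]}-01-01 00:00:00", $now->getTimezone());
    }
    if (preg_match('/\bthis year\b/i', $question)) {
        return new DateTimeImmutable($now->format('Y') . '-01-01 00:00:00', $now->getTimezone());
    }
    return null; // no time filter requested
}
```

The returned start date would then bound the MySQL query that fetches the ticket summaries.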

Multi-Source Pinecone Search

All vectors live in Pinecone's default namespace and carry a source metadata field. A single query with an $in metadata filter fetches results from all enabled sources:

Source	Vector IDs	Content
`ticket`	`ticket_{id}`	Ticket embeddings (always active)
`site`	`site-{id}`	Client site metadata from manage.rhinogroup.com
`wiki`	`docs-{id}`	Internal wiki articles
`docs`	`{path-slug}`	Docusaurus technical documentation
`gsm`	`{domain}`	GSM Outdoors brand website details
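
A sketch of the request body such a query might send. The field names (`vector`, `topK`, `includeMetadata`, `filter`) follow Pinecone's query API; the function name is hypothetical, and always prepending `ticket` is an assumption drawn from the "always active" note in the table.

```php
<?php
/**
 * Build a Pinecone query body that restricts matches to the enabled sources.
 * Illustrative only; the real query is issued by the AIAssistant class.
 */
function build_pinecone_query_sketch(array $embedding, array $enabledSources, int $topK): array
{
    // Assumption: "ticket" is always queried, whatever the settings say.
    $sources = array_values(array_unique(array_merge(['ticket'], $enabledSources)));

    return [
        'vector'          => $embedding,
        'topK'            => $topK,
        'includeMetadata' => true,
        // Single quotes keep PHP from treating $in as a variable.
        'filter'          => ['source' => ['$in' => $sources]],
    ];
}
```

No `namespace` field is set, which matches the statement that all vectors live in the default namespace.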

Components

Component	File	Description
Chat Widget	`include/staff/ai-chat-widget.inc.php`	Floating bubble (bottom-right), Shadow DOM isolated
Full Page	`include/staff/ai-assistant-full.inc.php`	Sidebar + full-width chat interface
Page Bootstrap	`scp/ai-assistant.php`	Full-page AI Assistant entry point
AI History	`scp/ai-history.php`	AI analysis snapshots history page
ChatService	`include/custom/AI/ChatService.php`	Main orchestrator (php-chatbot architecture)
ContextService	`include/custom/AI/ContextService.php`	RAG retrieval + org analysis
PromptBuilder	`include/custom/AI/PromptBuilder.php`	Prompt construction
Model	`include/custom/AI/OsTicketGeminiModel.php`	SSE streaming Gemini model
Storage	`include/custom/AI/OsTicketConversationStorage.php`	Conversation persistence
Config	`include/custom/AI/ModelConfig.php`	AI settings accessor
Core Logic	`include/class.ai-assistant.php`	Embeddings, Pinecone search (delegated to by ContextService)

Shadow DOM Isolation

The chat widget is rendered inside Shadow DOM to prevent osTicket's jQuery UI and Bootstrap styles from interfering:

<div id="ai-chat-container"></div>
<script>
  const shadow = document.getElementById('ai-chat-container')
    .attachShadow({mode: 'open'});
  // Widget renders inside shadow root
</script>

Embedding Generation

Ticket content is vectorized and stored in Pinecone for semantic search:

Cron: cron-embeddings.php runs every 10 minutes
Storage: ost_ticket_embeddings table caches text before embedding
Model: Gemini embedding model for vector generation
Index: Pinecone vector database with ticket metadata (client, status, dates)

AJAX Endpoints

Method	Endpoint	Purpose
POST	`/ajax.php/ai-assistant/chat`	Send message, receive SSE stream
GET	`/ajax.php/ai-assistant/conversations`	List conversations
POST	`/ajax.php/ai-assistant/conversations`	Create new conversation
DELETE	`/ajax.php/ai-assistant/conversations/{id}`	Delete conversation
GET	`/ajax.php/ai-assistant/status`	Check AI system status
POST	`/ajax.php/ai-assistant/reembed`	Force re-embed a ticket

Conversation Storage

Table	Columns	Purpose
`ost_ai_conversations`	`id`, `staff_id`, `title`, `created`, `updated`	Chat sessions
`ost_ai_conversation_messages`	`id`, `conversation_id`, `role`, `content`, `created`	Messages (user/assistant)

Motherload Plugin: AI Summary Update

The plugin_ai_summary_update plugin triggers the AI pipeline:

class plugin_ai_summary_update extends motherloadPlugin {
    public const signalConnect = 'threadentry.created';

    public function run(): bool {
        // Get the ticket from the thread entry
        $ticket = $this->getTicket();

        // Queue for Gemini processing
        // db_query() does not substitute placeholders, so build the SQL with sprintf()
        db_query(sprintf(
            'INSERT INTO %s (ticket_id, processed) VALUES (%d, 0)',
            GEMINI_QUEUE_TABLE,
            (int) $ticket->getId()
        ));

        return true;
    }
}

Located at: include/plugins/motherload/plugins/ai_summary_update/plugin_ai_summary_update.php

Configuration

AI settings are managed in the osTicket admin panel under Admin → Settings → AI Assistant:

Setting	Purpose
Gemini API Key	Shared key for all Gemini features (urgency, assistant, note optimization)
Generative Model	Model for chat and urgency analysis (e.g. `gemini-2.5-flash`)
Embedding Model	Model for vector generation (e.g. `gemini-embedding-001`)
Pinecone API Key	Vector database authentication
Pinecone Host	Pinecone index host URL
Enabled Sources	JSON array of Pinecone source types to query
Max Context Tickets	Total results included as chat context
Urgency Criteria	Editable textarea for urgency analysis prompt criteria (with Reset to Default button)
Max Tickets Per Run	Configurable throttle for queue processing (default 2)
Chatbot Name	Display name for the AI assistant (default "the Rhino")

Email Templates

Critical urgency alerts use the template at emails/:

Template uses %{ticket.number} for URLs
Subject: includes urgency level and ticket number
Body: AI summary, urgency percentage, estimated hours
Recipients: assigned staff + PMs + department manager
