Don't Just Trust LLMs. Verify Them.

Fire Mission is a third-party verification layer that sits between your code and AI providers. We verify that the actual backend model processing your data matches your expectations—not an unverified substitute. Content passes through—we retain only metadata (costs, tokens, security events) for your analytics dashboards.

Integration Time: ~5 min
Response Verification: 100%
Data by Default: Pass-Through

How Verification Works: Fire Mission validates the backend LLM serving your requests by analyzing response patterns for model identity markers. We confirm the model matches your expected provider before the response is returned to you. Verified responses pass through instantly; unverified models are blocked and logged for compliance.
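When verification fails, the response is blocked before it reaches your application. The error shape below is purely illustrative; the actual field names are defined in the API docs, not here:

// Illustrative only: a request whose backend model fails verification is blocked.
// Field names below are assumptions for the sake of the example.
{
  "error": "model_verification_failed",
  "expected_provider": "openai",
  "detail": "Response patterns did not match the expected backend model",
  "logged": true
}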

šŸ—ļø Transparent Verification Architecture

Fire Mission operates as a pass-through verification layer. Your requests go to POST /api/v1/gateway, we verify the backend model matches your expected provider, and return the response to you. We store metadata only (timestamps, token counts, costs) by default. Full request/response content is logged only when we detect model substitution, prompt injection, or anomalous patterns—giving your security team the evidence they need.

BYOK Proxy Architecture

Key Benefits

  • āœ“ Provider-agnostic: Switch providers without code changes
  • āœ“ BYOK security: You own your API keys, zero lock-in
  • āœ“ Automatic failover: Retry with backup providers on errors
  • āœ“ Cost optimization: Route to cheapest/fastest provider

Data Handling

  • āœ“ All processing is in-memory (ephemeral)
  • āœ“ API keys encrypted with AES-256-GCM
  • āœ“ Pass-through default; full data logged only on security violations
  • āœ“ Compatible with LangChain, LlamaIndex, OpenAI SDK

šŸš€ 5-Minute Integration Quickstart

Step 1: Sign Up & Get API Key

Create a RECON tier account ($75/mo with 7-day free trial)

Sign Up Now

Step 2: Add Your Provider API Keys

Navigate to Provider Setup in your dashboard. Add your OpenAI, Anthropic, Gemini, or other provider API keys. Fire Mission encrypts them with AES-256-GCM and never shares them.

Step 3: Make Your First Request

Replace provider-specific API calls with Fire Mission's universal endpoint:

fetch('https://firemission.us/api/v1/gateway', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': 'your-fire-mission-api-key'
  },
  body: JSON.stringify({
    operation: 'ai.proxy',
    provider: 'openai',
    model: 'gpt-4',
    messages: [
      { role: 'user', content: 'Hello, AI!' }
    ]
  })
})
  .then(res => res.json())
  .then(data => console.log(data));
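A successful call returns the provider's completion together with Fire Mission metadata. The shape below is a sketch for orientation only; field names are assumptions, so consult the API docs for the authoritative schema:

// Illustrative response sketch (field names are assumptions, not documented):
{
  "provider": "openai",
  "model": "gpt-4",
  "verified": true,
  "usage": { "total_tokens": 18, "cost_usd": 0.0006 },
  "choices": [{ "message": { "role": "assistant", "content": "Hello! How can I help?" } }]
}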

Step 4: View Analytics

Check your dashboard for real-time cost tracking, token usage, latency metrics, and security alerts. All AI calls are automatically logged with metadata (no prompts stored).

Complete API Documentation

See full endpoint specifications, authentication details, error codes, and advanced features.

View API Docs →

šŸ”‘ One API Key to Rule Them All

External applications, scripts, and CI/CD pipelines only need ONE Fire Mission API key. Configure your AI provider keys once in the dashboard—Fire Mission automatically retrieves them for every request.
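In practice, that means a CI job needs exactly one secret. A minimal Node 18+ sketch, assuming the key is exposed as a FIRE_MISSION_API_KEY environment variable (the variable name is our choice, not a requirement):

// ci-smoke-test.mjs: one secret, any provider (Node 18+ has built-in fetch)
const res = await fetch('https://firemission.us/api/v1/gateway', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': process.env.FIRE_MISSION_API_KEY  // the only key the pipeline holds
  },
  body: JSON.stringify({
    operation: 'ai.proxy',
    provider: 'anthropic',  // provider keys are resolved server-side from your dashboard config
    model: 'claude-3-5-sonnet-20241022',
    messages: [{ role: 'user', content: 'Summarize the latest build log.' }]
  })
});
if (!res.ok) throw new Error(`Gateway returned ${res.status}`);
console.log(await res.json());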

Zero Key Management Overhead
  • āœ“ Configure AI keys once in dashboard
  • āœ“ Automatic provider key resolution
  • āœ“ No multi-key juggling in apps
Organization Support
  • āœ“ Service accounts for CI/CD
  • āœ“ Team-level automation
  • āœ“ Centralized access control

šŸ”’ Mandatory Security Scanning

All AI requests through Fire Mission include security scanning. There is no bypass option—security is built into every request to protect your organization.
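Clients should therefore be ready for a request to be rejected by the scanner. A sketch, assuming blocked requests surface as a non-2xx status with a JSON error body (the status code and body shape are assumptions, not documented behavior):

// Assumption: blocked requests return a non-2xx status with a JSON error body.
const res = await fetch('https://firemission.us/api/v1/gateway', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': process.env.FIRE_MISSION_API_KEY
  },
  body: JSON.stringify({
    operation: 'ai.proxy',
    provider: 'openai',
    model: 'gpt-4',
    // A prompt like this should trip PII detection (fake SSN, for illustration):
    messages: [{ role: 'user', content: 'My SSN is 123-45-6789. Remember it.' }]
  })
});
if (!res.ok) {
  console.error('Blocked by security scanning:', await res.json());
}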

Security Features (All Tiers)

  • āœ“ PII detection (SSN, credit cards, emails)
  • āœ“ Prompt injection blocking
  • āœ“ API key leakage prevention
  • āœ“ Backend model verification
  • āœ“ Cost tracking & analytics
  • āœ“ Rate limit monitoring

Tier Differences

All tiers have full access. Only usage limits differ:

  • RECON ($75/mo): 2 ingested keys, 1 external key, 30 days retention
  • Asymmetric Ops ($175/mo): 4 ingested keys, 2 external keys, 90 days retention
  • Full Spectrum ($275/mo): 8 ingested keys, 4 external keys, 180 days retention

Content vs. Metadata: How Analytics Work

All security scanning happens in-memory. Content (prompts/responses) passes through without storage. Metadata (token counts, cost calculations, provider used, timestamps, security scan results) is retained—this powers your cost analytics and security dashboards. Full content is captured only when security violations are detected, providing forensic evidence.

šŸŽÆ Mixture-of-Experts (MoE) Intelligent Routing

Available in paid tiers. Fire Mission analyzes each request and automatically routes to the optimal provider based on cost, performance, or speed requirements.

MoE Intelligent Routing Logic

Cost Optimization

Route simple tasks to cheapest models

  • GPT-3.5: $0.0005/1k tokens
  • Gemini Flash: $0.00015/1k
  • Groq Llama: $0.0001/1k āœ“

Save up to 85% on AI costs

Performance Routing

Complex tasks → best accuracy

  • GPT-4: 94% accuracy
  • Claude Opus: 96% āœ“
  • Gemini Pro: 91%

Maximize output quality

Speed Priority

Real-time apps → lowest latency

  • OpenAI: 800ms latency
  • Anthropic: 1200ms
  • Groq: 200ms āœ“

Up to 4x faster responses

Automatic Failover

If your primary provider hits rate limits or returns errors, Fire Mission automatically retries with backup providers. Configure priority rules and fallback chains in your dashboard.
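Conceptually, a fallback chain is an ordered list of providers plus the error conditions that trigger a retry. The JSON below is a hypothetical illustration of such a rule; the dashboard's actual configuration schema may differ:

// Hypothetical fallback-chain rule (invented field names, for illustration only):
{
  "primary":   { "provider": "openai",    "model": "gpt-4" },
  "fallbacks": [
    { "provider": "anthropic", "model": "claude-3-5-sonnet-20241022" },
    { "provider": "groq",      "model": "llama-3.1-70b" }
  ],
  "retry_on": ["rate_limit", "server_error", "timeout"]
}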

šŸ¤– Supported AI Providers

Fire Mission supports security-vetted AI providers plus self-hosted LLMs. All providers work with BYOK (Bring Your Own Keys)—you control your API keys and data.

OpenAI

GPT-4, GPT-3.5, Embeddings

  • āœ“ Chat completions
  • āœ“ Embeddings
  • āœ“ Vision (GPT-4V)

Anthropic

Claude 3.5 Sonnet, Opus, Haiku

  • āœ“ Messages API
  • āœ“ Long context (200k)
  • āœ“ Tool use

Google Gemini

Gemini Pro, Flash, Ultra

  • āœ“ Text generation
  • āœ“ Multi-modal
  • āœ“ Code execution

Groq

Llama 3.1, Mixtral (Ultra-fast)

  • āœ“ 200ms latency
  • āœ“ Open-source models
  • āœ“ Low cost

Together AI

Open-source model hosting

  • āœ“ Llama, Mistral, etc.
  • āœ“ Custom fine-tunes
  • āœ“ Flexible pricing

Self-Hosted LLMs

Ollama, LM Studio, LocalAI

  • āœ“ Zero API costs
  • āœ“ Full data control
  • āœ“ Works with SaaS now

Custom OpenAI-Compatible APIs

Fire Mission supports any OpenAI-compatible API endpoint. Add your own custom provider URLs in the dashboard for Azure OpenAI, AWS Bedrock, or IBM watsonx.

šŸ–„ļø Self-Hosted LLM Integration

All tiers (including the entry-level RECON tier) support self-hosted LLMs. Run Ollama, LM Studio, or LocalAI on your own infrastructure and route requests through Fire Mission for unified cost tracking and analytics.

Ollama Setup

Run open-source LLMs locally with Ollama

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3.1

# Start Ollama server (default: http://localhost:11434)
ollama serve

Add your Ollama endpoint in the Fire Mission dashboard: http://localhost:11434
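Once the endpoint is registered, requests route through the same gateway call as cloud providers. A sketch, assuming your endpoint is registered under the provider name 'ollama' (the exact identifier depends on how you name it in the dashboard):

// Sketch: assumes the self-hosted endpoint is registered as provider 'ollama'.
fetch('https://firemission.us/api/v1/gateway', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': 'your-fire-mission-api-key'
  },
  body: JSON.stringify({
    operation: 'ai.proxy',
    provider: 'ollama',  // routes to your registered local endpoint
    model: 'llama3.1',
    messages: [{ role: 'user', content: 'Hello from my own hardware!' }]
  })
})
  .then(res => res.json())
  .then(data => console.log(data));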

LM Studio Setup

Desktop app for running LLMs with GPU acceleration

  1. Download LM Studio from lmstudio.ai
  2. Load a model (Llama, Mistral, CodeLlama, etc.)
  3. Enable local server (default: http://localhost:1234)
  4. Add endpoint to Fire Mission dashboard

Tier-Based Endpoint Limits

RECON ($75/mo): 2 provider keys, 1 external app key
Asymmetric Ops ($175/mo): 4 provider keys, 2 external app keys
Full Spectrum ($275/mo): 8 provider keys, 4 external app keys

Why Self-Host?

  • āœ“ Zero API costs - Run models on your own hardware
  • āœ“ Full data control - Never send data to third parties
  • āœ“ GPU acceleration - Fast inference with local GPUs
  • āœ“ Works with SaaS today - Connect local LLMs to Fire Mission cloud

šŸ”’ Security for Self-Hosted LLMs

Self-hosted LLMs receive the same security protections as cloud providers. Fire Mission scans all requests before they reach your local endpoint:

  • āœ“ PII detection and blocking
  • āœ“ Prompt injection prevention
  • āœ“ API key leakage protection
  • āœ“ Backend model verification

Model Validation: Fire Mission verifies that self-hosted models are from approved, compliant sources. Unverified models are blocked to ensure your data handling meets compliance requirements.

All scanning happens in-memory. Content passes through without storage—we retain only metadata (token counts, costs, security scan results) for your analytics. Full request/response content is captured only when security violations are detected, providing forensic evidence.

šŸš€ Deployment Options

Cloud SaaS is available now. VPC and air-gapped deployments are under active development.

AVAILABLE NOW

SaaS (Default)

Multi-tenant cloud deployment

  • āœ“ Instant signup
  • āœ“ Zero maintenance
  • āœ“ Auto-scaling
  • āœ“ Automatic updates
UNDER DEVELOPMENT

VPC Deployment

Isolated cloud environment

  • Your cloud account (AWS/Azure/GCP)
  • Network isolation
  • Custom compliance rules
  • Contact for timeline
UNDER DEVELOPMENT

Air-Gap

Fully offline deployment

  • No internet required
  • DoW IL2+ architecture
  • On-premise/SCIF
  • Contact for timeline

ā“ Technical FAQ

How does Fire Mission handle API key security?

All API keys are encrypted with AES-256-GCM at rest and in transit. Fire Mission uses BYOK (Bring Your Own Keys) architecture—you own your provider API keys, and we never share them with third parties.

Keys are stored in an encrypted database with strict tenant isolation. Only your organization can access your keys.

Are prompts and responses stored?

No. Fire Mission follows a "Process and Forget" architecture. Prompts and responses are processed in-memory for security scanning and cost tracking, then immediately discarded.

We only store metadata: timestamp, provider, model, token count, cost, latency. Content is persisted only when a security violation is detected; in that case the offending request and response are captured as forensic evidence (see Data Handling above).

Can I switch providers without changing code?

Yes. Fire Mission's universal gateway abstracts provider-specific APIs. Change the provider parameter in your request to switch between OpenAI, Anthropic, Gemini, Groq, etc.

Example: Change "provider": "openai" to "provider": "anthropic" without any code refactoring.
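A minimal sketch of that switch, reusing the quickstart call (the Claude model ID is one current example; use whichever model your keys cover):

// Identical call, different backend: only provider and model change.
const res = await fetch('https://firemission.us/api/v1/gateway', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': process.env.FIRE_MISSION_API_KEY
  },
  body: JSON.stringify({
    operation: 'ai.proxy',
    provider: 'anthropic',                // was 'openai'
    model: 'claude-3-5-sonnet-20241022',  // was 'gpt-4'
    messages: [{ role: 'user', content: 'Hello, AI!' }]
  })
});
console.log(await res.json());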

What's the latency overhead?

Base Proxy Mode: ~10-30ms overhead (simple passthrough)

Security-Enhanced Mode: ~50-100ms overhead (in-memory PII/injection scanning)

For most use cases, this is negligible compared to AI provider latency (500-2000ms).

Does Fire Mission support streaming responses?

Yes. Fire Mission supports Server-Sent Events (SSE) for streaming completions from all providers. Set "stream": true in your request.

Streaming works in both Base and Security-Enhanced proxy modes.
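A consumption sketch for Node 18+, assuming the gateway emits standard SSE framing (lines beginning with "data:"); the per-chunk payload follows whatever the upstream provider streams:

// Streaming sketch: assumes standard SSE framing ("data: {...}\n\n").
const res = await fetch('https://firemission.us/api/v1/gateway', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': process.env.FIRE_MISSION_API_KEY
  },
  body: JSON.stringify({
    operation: 'ai.proxy',
    provider: 'openai',
    model: 'gpt-4',
    stream: true,  // enable Server-Sent Events
    messages: [{ role: 'user', content: 'Write a haiku about gateways.' }]
  })
});
const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(decoder.decode(value));  // raw "data: ..." lines
}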

Can I use Fire Mission with LangChain or LlamaIndex?

Yes. Fire Mission's universal gateway is compatible with OpenAI SDK, LangChain, LlamaIndex, and any framework that supports custom API endpoints.

Simply point your base URL to https://firemission.us/api/v1/gateway and add your Fire Mission API key.
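For example, with the official OpenAI Node SDK (npm install openai), the compatibility claim above suggests a setup like this; treat it as a sketch, since the gateway's expected headers are an assumption here:

// Sketch: OpenAI Node SDK pointed at the Fire Mission gateway.
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://firemission.us/api/v1/gateway',
  apiKey: process.env.FIRE_MISSION_API_KEY,  // your Fire Mission key, not a provider key
  // The quickstart uses an X-API-Key header, so we also send it explicitly
  // in case the gateway expects it instead of the SDK's Authorization header:
  defaultHeaders: { 'X-API-Key': process.env.FIRE_MISSION_API_KEY }
});

const completion = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello, AI!' }]
});
console.log(completion.choices[0].message.content);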

Ready to Integrate?

Start with RECON ($75/mo with 7-day free trial). All tiers include full feature access, MoE routing, and mandatory security scanning; only key counts and data retention periods differ.