AI Providers Guide (v1)

Master each AI provider option for Gonzo. From cloud-based OpenAI to privacy-focused local models, this comprehensive guide helps you choose and configure the perfect AI solution for your needs.

Already completed basic setup? This guide provides advanced configuration, optimization tips, and provider-specific best practices for production use.

Provider Comparison Matrix

| Provider | Setup Complexity | Privacy Level | Cost Model | Performance | Best Use Case |
| --- | --- | --- | --- | --- | --- |
| OpenAI | Low | Cloud | Pay-per-use | Excellent | Production incidents, complex analysis |
| Ollama | Medium | Complete | Hardware only | Good | Privacy-sensitive, unlimited usage |
| LM Studio | Low-Medium | Complete | Hardware only | Good | Development, testing, experimentation |
| Azure OpenAI | Medium | Enterprise Cloud | Pay-per-use | Excellent | Enterprise compliance, hybrid cloud |
| Custom APIs | High | Configurable | Varies | Varies | Specialized models, existing infrastructure |

OpenAI Provider Deep Dive

Model Selection Strategy

Production Deployment:

# Tier 1: Critical incidents (use best model)
export GONZO_PROD_MODEL="gpt-4"
alias gonzo-incident='gonzo --ai-model="$GONZO_PROD_MODEL"'

# Tier 2: Regular monitoring (balanced cost/performance)
export GONZO_MONITOR_MODEL="gpt-3.5-turbo"
alias gonzo-monitor='gonzo --ai-model="$GONZO_MONITOR_MODEL"'

# Tier 3: Development/testing (cost-optimized)
export GONZO_DEV_MODEL="gpt-3.5-turbo"
alias gonzo-dev='gonzo --ai-model="$GONZO_DEV_MODEL"'

Advanced OpenAI Configuration

Cost Optimization Settings:

Enterprise OpenAI Setup:

OpenAI Model Characteristics

GPT-4 (Recommended for Production)

  • Cost: $0.03/1K input tokens, $0.06/1K output

  • Context: 8K tokens

  • Strengths: Best reasoning, complex log analysis, accurate root cause identification

  • Best for: Critical incidents, complex debugging, production monitoring

GPT-4 Turbo

  • Cost: $0.01/1K input tokens, $0.03/1K output

  • Context: 128K tokens

  • Strengths: Large context, cost-effective, latest training data

  • Best for: Large log files, comprehensive analysis, cost-sensitive production
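The per-token prices above translate into a per-request cost with simple arithmetic. The sketch below is a hypothetical helper (the function name `estimate_cost` and the `gpt-4-turbo` key are assumptions for illustration, not part of Gonzo) that applies the listed rates to given token counts.

```shell
# Hypothetical helper: estimate the USD cost of one request from the
# per-1K-token prices listed above.
# Usage: estimate_cost <model> <input_tokens> <output_tokens>
estimate_cost() {
  local model="$1" input_tokens="$2" output_tokens="$3"
  local in_rate out_rate
  case "$model" in
    gpt-4)       in_rate=0.03; out_rate=0.06 ;;
    gpt-4-turbo) in_rate=0.01; out_rate=0.03 ;;  # assumed identifier for GPT-4 Turbo
    *) echo "unknown model: $model" >&2; return 1 ;;
  esac
  # cost = (input / 1000) * input rate + (output / 1000) * output rate
  awk -v i="$input_tokens" -v o="$output_tokens" \
      -v irate="$in_rate" -v orate="$out_rate" \
      'BEGIN { printf "%.4f\n", (i / 1000) * irate + (o / 1000) * orate }'
}
```

For example, `estimate_cost gpt-4 2000 500` prints `0.0900`, while the same request priced as GPT-4 Turbo comes to `0.0350`.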

OpenAI Rate Limiting and Quotas

Understanding Rate Limits:

Handling Rate Limits:
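One common way to cope with rate limits is to retry with exponential backoff. The following is a minimal sketch of that general technique, not Gonzo's built-in behavior; `retry_with_backoff`, `MAX_ATTEMPTS`, and `BACKOFF_BASE_SECONDS` are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: run a command, retrying with exponential backoff
# when it exits non-zero (for example, after a rate-limit error).
# Usage: retry_with_backoff <command> [args...]
retry_with_backoff() {
  local max_attempts="${MAX_ATTEMPTS:-5}"
  local delay="${BACKOFF_BASE_SECONDS:-2}"
  local attempt=1
  while true; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))        # double the wait after every failure
    attempt=$((attempt + 1))
  done
}
```

Any command can be wrapped this way, for example a scripted request to the provider. The wrapper cannot tell a rate-limit error from another failure, so it suits cases where a non-zero exit usually means "try again later".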

Ollama Provider Deep Dive

Model Selection for Log Analysis

Recommended Models by Use Case:

| Model | Size | RAM Required | Quality | Best For |
| --- | --- | --- | --- | --- |
| llama3:8b | 4.7GB | 8GB+ | Excellent | General log analysis, production ready |
| llama3:70b | 40GB | 64GB+ | Outstanding | Complex analysis, enterprise use |
| mistral | 4.1GB | 8GB+ | Good | Fast analysis, resource-constrained systems |
| codellama | 3.8GB | 8GB+ | Good | Technical logs, code-related issues |
| mixtral | 26GB | 32GB+ | Excellent | Complex reasoning, multi-language logs |
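The RAM column above can be turned into a simple selection rule. This is a sketch under the assumption that you want the highest-quality general-purpose model your memory allows; `recommend_ollama_model` is a hypothetical helper, and the thresholds are taken directly from the table.

```shell
# Hypothetical helper: suggest a model from the table above based on
# available RAM in whole gigabytes.
# Usage: recommend_ollama_model <ram_gb>
recommend_ollama_model() {
  local ram_gb="$1"
  if [ "$ram_gb" -ge 64 ]; then
    echo "llama3:70b"
  elif [ "$ram_gb" -ge 32 ]; then
    echo "mixtral"
  elif [ "$ram_gb" -ge 8 ]; then
    echo "llama3:8b"
  else
    echo "less than 8GB RAM: none of the listed models fit" >&2
    return 1
  fi
}
```

The result can be passed straight to Ollama, for example `ollama pull "$(recommend_ollama_model 16)"`.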

Advanced Ollama Configuration

Performance Optimization:

Memory Management:
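Ollama's server reads several environment variables that bound how much memory it holds at once. The values below are illustrative assumptions rather than recommendations from this guide; they must be set in the environment of the `ollama serve` process to take effect.

```shell
# Illustrative values only; tune them for your hardware.
export OLLAMA_MAX_LOADED_MODELS=1   # keep at most one model resident in memory
export OLLAMA_KEEP_ALIVE=5m         # unload a model after five idle minutes
export OLLAMA_NUM_PARALLEL=1        # handle one request at a time per model

# Then start (or restart) the server so it picks these up:
#   ollama serve
```

Lower values trade responsiveness for memory: a model that has been unloaded must be loaded again before the next analysis can run.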

Multi-Model Ollama Setup

Strategy: Different Models for Different Tasks

Smart Model Switching:

Ollama Performance Tuning

System Optimization:

Monitoring Ollama Performance:

LM Studio Provider Deep Dive

Model Recommendations for LM Studio

Balanced Models (8-16GB RAM):

High-Performance Models (32GB+ RAM):

LM Studio Configuration

Server Settings:
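LM Studio's local server exposes an OpenAI-compatible API, by default on port 1234. The sketch below assumes Gonzo picks up its endpoint and key from `OPENAI_API_BASE` and `OPENAI_API_KEY`, as in the basic setup; `LMSTUDIO_HOST` and `LMSTUDIO_PORT` are hypothetical convenience variables.

```shell
# Hypothetical convenience variables; override them if the server runs elsewhere.
LMSTUDIO_HOST="${LMSTUDIO_HOST:-localhost}"
LMSTUDIO_PORT="${LMSTUDIO_PORT:-1234}"   # LM Studio's default server port

# Assumption: Gonzo reads these two variables for any OpenAI-compatible endpoint.
export OPENAI_API_BASE="http://${LMSTUDIO_HOST}:${LMSTUDIO_PORT}/v1"
export OPENAI_API_KEY="local-key"        # placeholder value for a local server
```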

Model-Specific Tuning:

LM Studio Best Practices

Model Management:

Performance Optimization:

Azure OpenAI Service

Enterprise Setup

Azure Resource Configuration:

Gonzo Configuration for Azure:

Model Deployment in Azure:

Azure-Specific Features

Private Endpoints:

Managed Identity:

Custom API Providers

AWS Bedrock Integration

Setup with Bedrock Proxy:

Google Cloud Vertex AI

Vertex AI Proxy Setup:

Self-Hosted Models

Hugging Face Transformers:

Provider Selection Decision Tree

Choose Based on Your Needs
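As a rough illustration, the "Best Use Case" column of the comparison matrix can be read as a lookup from primary need to provider. The helper below is a hypothetical sketch of that mapping; the need keywords and provider labels are made up for this example, and it is not a substitute for weighing cost and hardware yourself.

```shell
# Hypothetical helper: map a primary need to a provider, following the
# "Best Use Case" column of the comparison matrix.
# Usage: choose_provider <privacy|enterprise|experiment|existing-infra|incident>
choose_provider() {
  case "$1" in
    privacy)        echo "ollama" ;;        # privacy-sensitive, unlimited usage
    enterprise)     echo "azure-openai" ;;  # enterprise compliance, hybrid cloud
    experiment)     echo "lm-studio" ;;     # development, testing, experimentation
    existing-infra) echo "custom-api" ;;    # specialized models, existing infrastructure
    incident)       echo "openai" ;;        # production incidents, complex analysis
    *) echo "unknown need: $1" >&2; return 1 ;;
  esac
}
```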

Multi-Provider Strategy

Hybrid Approach:

Provider Fallback Chain:
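A fallback chain can be sketched as "try each endpoint in order and use the first one that answers". The helper names below (`probe_endpoint`, `pick_endpoint`) are hypothetical, and the probe URLs in the usage comment are the standard model-listing routes for a local Ollama and LM Studio server.

```shell
# Hypothetical probe: succeeds when the URL answers an HTTP request
# with a success status within three seconds.
probe_endpoint() {
  curl -fsS --max-time 3 -o /dev/null "$1"
}

# Hypothetical helper: print the first candidate the probe accepts.
# Usage: pick_endpoint <probe_command> <candidate>...
pick_endpoint() {
  local probe="$1" candidate
  shift
  for candidate in "$@"; do
    if "$probe" "$candidate"; then
      echo "$candidate"
      return 0
    fi
  done
  echo "no endpoint available" >&2
  return 1
}

# Example: prefer a local Ollama server, then LM Studio.
#   pick_endpoint probe_endpoint \
#     http://localhost:11434/api/tags \
#     http://localhost:1234/v1/models
```

Passing the probe as an argument keeps the selection logic independent of how availability is checked, so the same function works with a stricter or authenticated probe.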

Performance Comparison

Benchmark Results

Analysis Speed (avg response time):

| Provider | Model | Simple Query | Complex Analysis | Large Context |
| --- | --- | --- | --- | --- |
| OpenAI | gpt-4 | 2-3s | 8-12s | 15-25s |
| OpenAI | gpt-3.5-turbo | 1-2s | 3-5s | 8-12s |
| Ollama | llama3:8b | 5-8s | 15-25s | 30-45s |
| Ollama | mistral | 3-5s | 10-15s | 20-30s |
| LM Studio | Various | 4-10s | 12-30s | 25-60s |

Quality Assessment (log analysis accuracy):

| Provider | Model | Technical Accuracy | Context Understanding | Actionable Insights |
| --- | --- | --- | --- | --- |
| OpenAI | gpt-4 | 95% | 90% | 85% |
| OpenAI | gpt-3.5-turbo | 85% | 80% | 75% |
| Ollama | llama3:8b | 80% | 75% | 70% |
| Ollama | mistral | 75% | 70% | 65% |

Resource Usage

Memory Requirements:

Best Practices by Provider

OpenAI Best Practices

Do:

  • Use gpt-3.5-turbo for development and routine analysis

  • Reserve gpt-4 for complex incidents and production issues

  • Monitor API usage and costs regularly

  • Implement token budgets and alerts

  • Use specific, targeted questions for better responses

Don't:

  • Send sensitive data without understanding OpenAI's data policies

  • Use gpt-4 for simple queries that gpt-3.5-turbo can handle

  • Ignore rate limits and quotas

  • Include unnecessary context that increases token usage
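A token budget can be approximated before any request is sent. The sketch below uses the rough rule of thumb of about four characters per token for English text, which is an approximation and not the provider's tokenizer; `estimate_tokens` and `within_token_budget` are hypothetical helpers.

```shell
# Hypothetical helper: rough token estimate for a file, assuming about
# four bytes per token (a rule of thumb, not an exact tokenizer count).
estimate_tokens() {
  local bytes
  bytes=$(wc -c < "$1") || return 1
  echo $(( (bytes + 3) / 4 ))   # round up to the next whole token
}

# Hypothetical helper: succeed only if the file fits the token budget.
# Usage: within_token_budget <file> <max_tokens>
within_token_budget() {
  local tokens
  tokens=$(estimate_tokens "$1") || return 1
  [ "$tokens" -le "$2" ]
}
```

A check such as `within_token_budget app.log 8000 || echo "trim the log first"` catches oversized inputs before they cost anything; `app.log` here is a placeholder file name.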

Ollama Best Practices

Do:

  • Keep Ollama service running as a daemon

  • Use appropriate model sizes for your hardware

  • Monitor system resources during model loading

  • Download models during off-peak hours

  • Use GPU acceleration when available

Don't:

  • Load multiple large models simultaneously without sufficient RAM

  • Ignore model update notifications

  • Run Ollama on systems with insufficient memory

  • Use CPU-only inference for large models

LM Studio Best Practices

Do:

  • Organize models in logical folders

  • Test models before production use

  • Monitor disk space for model storage

  • Use appropriate model settings for log analysis

  • Keep LM Studio updated

Don't:

  • Download models without checking system requirements

  • Run multiple models simultaneously without adequate resources

  • Ignore model performance metrics

  • Use default settings without optimization

What's Next?

Now that you understand all AI provider options, learn how to use them effectively:

  • Using AI Features - Master AI-powered workflows and practical usage patterns

  • Log Analysis - Combine AI insights with algorithmic analysis

  • Configuration - Set up provider-specific configurations

Or start using your chosen provider immediately:


You now have complete mastery over AI provider selection and configuration! 🚀 Whether you choose cloud-based APIs for maximum quality or local models for privacy and cost control, you can optimize your AI setup for any scenario.
