Custom Log Formats
Not every log is the same...
Gonzo supports custom log formats through YAML configuration files, allowing you to parse logs from any application and convert them to OpenTelemetry (OTLP) attributes for analysis.
Using Built-in Formats
Gonzo includes pre-built formats in the formats directory:
Available formats:
loki-stream.yaml - Grafana Loki streaming (individual entries)
loki-batch.yaml - Loki batch format with multi-entry expansion
vercel-stream.yaml - Vercel logs
nodejs.yaml - Node.js application logs
apache-combined.yaml - Apache/Nginx access logs
Setup:
# Install a format (copy the file into Gonzo's formats directory)
mkdir -p ~/.config/gonzo/formats
cp <format-file>.yaml ~/.config/gonzo/formats/
# Use the format
gonzo --format=loki-stream -f logs.json
# List available formats
ls ~/.config/gonzo/formats/
Examples:
# Loki with logcli
logcli query --addr=http://localhost:3100 --follow '{service=~".+"}' -o jsonl 2>/dev/null | gonzo --format=loki-stream
# Loki Live Tail API using "wscat" (batch format)
wscat -c 'ws://localhost:3100/loki/api/v1/tail?query={service_name=~".%2B"}&limit=50' | gonzo --format=loki-batch
# Vercel logs
vercel logs <deployment_id> -j | gonzo --format=vercel-stream
# File with custom format
gonzo --format=nodejs -f application.log
Creating Your Own Custom Formats
Quick Start
1. Create a Format File
Create a YAML file in the ~/.config/gonzo/formats/ directory:
mkdir -p ~/.config/gonzo/formats
vim ~/.config/gonzo/formats/myapp.yaml
2. Define Your Format
name: myapp
description: My Application Log Format
type: text
pattern:
  use_regex: true
  main: '^(?P<timestamp>[\d\-T:\.]+)\s+\[(?P<level>\w+)\]\s+(?P<message>.*)$'
mapping:
  timestamp:
    field: timestamp
    time_format: rfc3339
  severity:
    field: level
  body:
    field: message
3. Use the Format
gonzo --format=myapp -f application.log
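Before pointing Gonzo at a real file, it can help to check the regex from step 2 against a sample line. The sketch below uses Python's re module, which happens to accept the same (?P<name>...) named-group syntax. The sample lines and the helper name parse_line are made up for illustration; this is only a sanity check on the pattern, not how Gonzo itself applies it.

```python
import re

# Regex copied from the "main" pattern in the myapp format above.
MAIN = re.compile(
    r'^(?P<timestamp>[\d\-T:\.]+)\s+\[(?P<level>\w+)\]\s+(?P<message>.*)$'
)

def parse_line(line):
    """Return the named groups as a dict, or None if the line does not match."""
    m = MAIN.match(line)
    return m.groupdict() if m else None

if __name__ == "__main__":
    # Hypothetical sample line in the shape the pattern expects.
    print(parse_line("2024-01-15T10:30:45.123 [ERROR] Database connection failed"))
    # The timestamp character class has no "Z", so this line is rejected.
    print(parse_line("2024-01-15T10:30:45Z [INFO] Started"))
```

One thing the check makes visible: the character class [\d\-T:\.] contains neither Z nor +, so a timestamp with a zone suffix such as 2024-01-15T10:30:45Z does not match unless the class is widened (for example to [\d\-T:\.Z+]).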
Basic Structure
# Metadata
name: format-name              # Required: Unique identifier
description: Description       # Optional: Human-readable description
author: Your Name              # Optional: Format author
type: text|json|structured     # Required: Format type
# Pattern Configuration (for text/structured types)
pattern:
  use_regex: true|false        # Use regex or template matching
  main: "pattern"              # Main pattern for parsing
  fields:                      # Additional field patterns
    field_name: "pattern"
# JSON Configuration (for json type)
json:
  fields:                      # Field mappings
    internal_name: json_path
  array_path: "path"           # For nested arrays
  root_is_array: true|false    # If root is an array
# Field Mapping
mapping:
  timestamp:                   # Timestamp extraction
    field: field_name
    time_format: format
    default: value
  severity:                    # Log level/severity
    field: field_name
    transform: operation
    default: value
  body:                        # Main log message
    field: field_name
    template: "{{.field}}"
  attributes:                  # Additional attributes
    attr_name:
      field: source_field
      pattern: "regex"
      transform: operation
      default: value
Format Types
text - Plain text logs with regex patterns:
type: text
pattern:
  use_regex: true
  main: 'your-regex-pattern-here'
json - JSON structured logs:
type: json
json:
  fields:
    timestamp: $.timestamp
    message: $.msg
structured - Fixed position logs (Apache-style):
type: structured
pattern:
  use_regex: true
  main: 'pattern-with-named-groups'
Common Regex Patterns
Pattern        Matches                Example
[\d\-T:\.]+    ISO timestamp          2024-01-15T10:30:45.123
\w+            Word characters        ERROR, INFO
\d+            Digits                 12345
[^\]]+         Everything except ]    Content inside brackets
.*             Any characters         Rest of line
\S+            Non-whitespace         Token or word
Time Formats
Format                   Example                Description
rfc3339                  2024-01-15T10:30:45Z   ISO 8601
unix                     1705316445             Unix seconds
unix_ms                  1705316445123          Unix milliseconds
unix_ns                  1705316445123456789    Unix nanoseconds
auto                     Various                Auto-detect format
"2006-01-02 15:04:05"    2024-01-15 10:30:45    Custom Go format
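The three epoch formats differ only in their unit, and picking the wrong one produces timestamps that are off by a factor of a thousand or more. As a rough illustration, the Python sketch below converts the example values from the table to UTC. The helper from_epoch is hypothetical and written for this page; it does not reflect Gonzo's internal parsing.

```python
from datetime import datetime, timezone

# Number of units per second for each epoch format named in the table.
UNITS_PER_SECOND = {"unix": 1, "unix_ms": 1_000, "unix_ns": 1_000_000_000}

def from_epoch(value, unit):
    """Convert an integer epoch value to a timezone-aware UTC datetime."""
    per_second = UNITS_PER_SECOND[unit]
    seconds, remainder = divmod(int(value), per_second)
    # datetime stores microseconds at most, so finer precision is truncated.
    micros = remainder * 1_000_000 // per_second
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=micros)

if __name__ == "__main__":
    print(from_epoch(1705316445, "unix").isoformat())
    print(from_epoch(1705316445123, "unix_ms").isoformat())
    print(from_epoch(1705316445123456789, "unix_ns").isoformat())
    # strptime counterpart of the Go layout "2006-01-02 15:04:05".
    print(datetime.strptime("2024-01-15 10:30:45", "%Y-%m-%d %H:%M:%S"))
```

A custom layout such as "2006-01-02 15:04:05" is written in terms of Go's reference time (2 January 2006, 15:04:05), so each component of the layout shows where that part of the reference time would appear.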
Field Transforms
uppercase: Convert to uppercase (info → INFO)
lowercase: Convert to lowercase (ERROR → error)
trim: Remove whitespace (" text " → "text")
status_to_severity: HTTP status to severity (200→INFO, 404→WARN, 500→ERROR)
Complete Examples
Example 1: Node.js Application Logs
Log format: [Backend] 5300 LOG [Module] Message +6ms
# Format for: [Backend] 5300 LOG [Module] Message +6ms
name: nodejs
type: text
pattern:
  use_regex: true
  main: '^\[(?P<project>[^\]]+)\]\s+(?P<pid>\d+)\s+(?P<level>\w+)\s+\[(?P<module>[^\]]+)\]\s+(?P<message>[^+]+?)(?:\s+\+(?P<duration>\d+)ms)?$'
mapping:
  severity:
    field: level
    transform: uppercase
  body:
    field: message
  attributes:
    project:
      field: project
    pid:
      field: pid
    module:
      field: module
    duration_ms:
      field: duration
      default: "0"
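To see what this pattern captures, the following Python sketch runs the same regex over the sample line and applies the uppercase transform and the "0" default by hand. The function parse_nodejs and the second and third sample lines are invented for illustration; the sketch approximates the mapping above and makes no claim about Gonzo's actual code path.

```python
import re

# Same regex as the "main" pattern of the nodejs format, split for readability.
MAIN = re.compile(
    r'^\[(?P<project>[^\]]+)\]\s+(?P<pid>\d+)\s+(?P<level>\w+)\s+'
    r'\[(?P<module>[^\]]+)\]\s+(?P<message>[^+]+?)(?:\s+\+(?P<duration>\d+)ms)?$'
)

def parse_nodejs(line):
    """Approximate the nodejs mapping: severity, body, and attributes."""
    m = MAIN.match(line)
    if m is None:
        return None
    g = m.groupdict()
    return {
        "severity": g["level"].upper(),           # transform: uppercase
        "body": g["message"],
        "attributes": {
            "project": g["project"],
            "pid": g["pid"],
            "module": g["module"],
            "duration_ms": g["duration"] or "0",  # default: "0"
        },
    }

if __name__ == "__main__":
    print(parse_nodejs("[Backend] 5300 LOG [Module] Message +6ms"))
    print(parse_nodejs("[Backend] 5300 LOG [Module] Started"))
    print(parse_nodejs("[Backend] 5300 LOG [Module] a+b done"))
```

Because the message group is [^+]+?, a message that itself contains a + character causes the whole line to fail to match. If your messages can contain +, the message group would need a different definition.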
Example 2: Kubernetes/Docker JSON Logs
Format configuration:
name: k8s-json
type: json
json:
  fields:
    timestamp: time
    message: log
    stream: stream
mapping:
  timestamp:
    field: timestamp
    time_format: rfc3339
  body:
    field: message
  attributes:
    stream:
      field: stream
    container_name:
      field: kubernetes.container_name
    pod_name:
      field: kubernetes.pod_name
    namespace:
      field: kubernetes.namespace_name
Example 3: Apache Access Logs
Log format: 192.168.1.1 - - [14/Oct/2024:10:30:45 +0000] "GET /api/users HTTP/1.1" 200 1234
name: apache-access
type: structured
pattern:
  use_regex: true
  main: '^(?P<ip>[\d\.]+).*?\[(?P<timestamp>[^\]]+)\]\s+"(?P<method>\w+)\s+(?P<path>[^\s]+).*?"\s+(?P<status>\d+)\s+(?P<bytes>\d+)'
mapping:
  timestamp:
    field: timestamp
    time_format: "02/Jan/2006:15:04:05 -0700"
  body:
    template: "{{.method}} {{.path}} - {{.status}}"
  attributes:
    client_ip:
      field: ip
    http_method:
      field: method
    http_path:
      field: path
    http_status:
      field: status
    response_bytes:
      field: bytes
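The Apache example combines three ideas: named groups, a custom time layout, and a body template. A minimal Python sketch of how they fit together is shown below. The helpers render and parse_access are hypothetical, the render function is only a stand-in for {{.name}} substitution, and the strptime format string is the Python counterpart of the Go layout used above.

```python
import re
from datetime import datetime

# Same regex as the "main" pattern of the apache-access format.
MAIN = re.compile(
    r'^(?P<ip>[\d\.]+).*?\[(?P<timestamp>[^\]]+)\]\s+"(?P<method>\w+)\s+'
    r'(?P<path>[^\s]+).*?"\s+(?P<status>\d+)\s+(?P<bytes>\d+)'
)

def render(template, fields):
    """Stand-in for a {{.name}} template: substitute captured fields by name."""
    return re.sub(r'\{\{\.(\w+)\}\}', lambda m: fields.get(m.group(1), ""), template)

def parse_access(line):
    """Approximate the apache-access mapping for one log line."""
    m = MAIN.match(line)
    if m is None:
        return None
    f = m.groupdict()
    return {
        # strptime counterpart of the Go layout "02/Jan/2006:15:04:05 -0700".
        "timestamp": datetime.strptime(f["timestamp"], "%d/%b/%Y:%H:%M:%S %z"),
        "body": render("{{.method}} {{.path}} - {{.status}}", f),
        "attributes": {
            "client_ip": f["ip"],
            "http_method": f["method"],
            "http_path": f["path"],
            "http_status": f["status"],
            "response_bytes": f["bytes"],
        },
    }

if __name__ == "__main__":
    sample = '192.168.1.1 - - [14/Oct/2024:10:30:45 +0000] "GET /api/users HTTP/1.1" 200 1234'
    print(parse_access(sample))
```

Note that Apache writes - instead of a number when no response bytes are sent, and the (?P<bytes>\d+) group does not match such lines. Widening the group (for example to \d+|-) is one way to handle that case.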
Advanced Features
Batch Processing
For logs where a single line contains multiple entries (like Loki batch format):
batch:
  enabled: true
  expand_path: "streams[].values[]"      # Arrays to expand
  context_paths: ["streams[].stream"]    # Metadata to preserve
How it works:
Original line:
{"streams":[{"stream":{"service":"app"},"values":[["1234","msg1"],["5678","msg2"]]}]}
Gets expanded to: 2 separate log entries
Each entry retains the stream metadata
Common patterns:
logs[] - Expand top-level array
streams[].values[] - Expand nested arrays (Loki)
events[].entries[] - Multi-level expansion
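The expansion described under "How it works" can be pictured as a nested loop: one output entry per item in streams[].values[], each carrying a copy of its streams[].stream object. The Python sketch below hard-codes that one path for the Loki example. The function name expand_loki_batch and the output keys are assumptions made for illustration, not Gonzo's implementation.

```python
import json

def expand_loki_batch(line):
    """Yield one entry per item in streams[].values[], keeping streams[].stream."""
    doc = json.loads(line)
    for stream in doc.get("streams", []):
        context = stream.get("stream", {})       # metadata to preserve
        for ts, msg in stream.get("values", []):  # each value is [timestamp, line]
            yield {"timestamp": ts, "body": msg, "attributes": dict(context)}

if __name__ == "__main__":
    line = '{"streams":[{"stream":{"service":"app"},"values":[["1234","msg1"],["5678","msg2"]]}]}'
    for entry in expand_loki_batch(line):
        print(entry)
```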
Nested JSON Fields
Access nested fields using dot notation:
attributes:
  user_id:
    field: user.id
  user_name:
    field: user.profile.name
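Dot notation simply walks the JSON object one key at a time. A small Python sketch of that lookup follows; get_path and the sample record are hypothetical, and the default argument mirrors the idea of falling back to a value when a field is missing.

```python
def get_path(record, path, default=None):
    """Follow a dot-separated path through nested dicts.

    Returns default as soon as any step along the path is missing.
    """
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

if __name__ == "__main__":
    # Hypothetical log record with nested user data.
    record = {"user": {"id": 42, "profile": {"name": "Ada"}}}
    print(get_path(record, "user.id"))
    print(get_path(record, "user.profile.name"))
    print(get_path(record, "env", default="production"))
```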
Pattern Extraction
Extract values from within a field:
attributes:
  error_code:
    field: message
    pattern: 'ERROR\[(\d+)\]' # Extracts code from "ERROR[404]: Not found"
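The pattern above can be tried in isolation before it goes into a format file. This Python sketch assumes the extracted value is the first capture group, which is what the "Extracts code" comment suggests; the helper name extract is made up for illustration.

```python
import re

def extract(value, pattern, default=None):
    """Search value for pattern and return the first capture group.

    Falls back to the whole match if the pattern has no groups, and to
    default if nothing matches. (Assumed behaviour, for illustration only.)
    """
    m = re.search(pattern, value)
    if m is None:
        return default
    return m.group(1) if m.groups() else m.group(0)

if __name__ == "__main__":
    print(extract("ERROR[404]: Not found", r'ERROR\[(\d+)\]'))
    print(extract("Request completed", r'ERROR\[(\d+)\]', default="none"))
```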
Conditional Defaults
Use defaults when fields are missing:
attributes:
  environment:
    field: env
    default: "production"
HTTP Status Code to Severity Mapping
For web server logs, use the status_to_severity transform:
severity:
  field: http_status
  transform: status_to_severity
Status code mapping:
1xx (100-199): DEBUG (Informational)
2xx (200-299): INFO (Success)
3xx (300-399): INFO (Redirection)
4xx (400-499): WARN (Client Error)
5xx (500-599): ERROR (Server Error)
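The mapping table translates directly into a range check. The Python sketch below mirrors the five ranges listed above; returning None for codes outside 100-599 is an assumption of this sketch, since the table does not say what happens in that case.

```python
def status_to_severity(status):
    """Map an HTTP status code (int or numeric string) to a severity name."""
    code = int(status)
    if 100 <= code <= 199:
        return "DEBUG"   # 1xx Informational
    if 200 <= code <= 399:
        return "INFO"    # 2xx Success and 3xx Redirection
    if 400 <= code <= 499:
        return "WARN"    # 4xx Client Error
    if 500 <= code <= 599:
        return "ERROR"   # 5xx Server Error
    return None          # outside the documented ranges (assumed behaviour)

if __name__ == "__main__":
    for code in ("101", "200", "301", "404", "500"):
        print(code, status_to_severity(code))
```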
Multiple Pattern Matching
Define additional patterns for specific fields:
pattern:
  use_regex: true
  main: '^(?P<base>.*)'
  fields:
    request_id: 'RequestID:\s*(\w+)'
    user_id: 'UserID:\s*(\d+)'
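One way to think about the fields block is as a set of independent searches over the same line, each contributing a value only when its pattern is found. The Python sketch below assumes exactly that, with the first capture group taken as the field value; the function extract_fields and the sample line are invented here, and Gonzo's real evaluation order may differ.

```python
import re

# Patterns copied from the "fields" block above.
FIELD_PATTERNS = {
    "request_id": re.compile(r'RequestID:\s*(\w+)'),
    "user_id": re.compile(r'UserID:\s*(\d+)'),
}

def extract_fields(line):
    """Search the line once per field pattern; keep only the fields that matched."""
    found = {}
    for name, rx in FIELD_PATTERNS.items():
        m = rx.search(line)
        if m:
            found[name] = m.group(1)
    return found

if __name__ == "__main__":
    # Hypothetical log line containing both identifiers.
    print(extract_fields("Payment failed RequestID: abc123 UserID: 42"))
    print(extract_fields("Cache warmed UserID:7"))
```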
Testing & Troubleshooting
Test your format:
# Test with small sample
head -n 10 app.log | gonzo --format=myformat
# Test without TUI
gonzo --format=myformat -f app.log --test-mode
Common issues:
Pattern not matching: Test regex at regex101.com, verify named groups (?P<name>...)
Wrong timestamps: Check time_format matches exactly, use Go format syntax
Missing attributes: Verify field paths (use dot notation for nested: user.profile.name)
Performance issues: Use specific patterns instead of .*, avoid overly complex regex
Debug tips:
Start with simple patterns, add complexity gradually
Use defaults for optional fields
Test with various log samples
Check Gonzo output for parsing errors
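When a pattern matches only some lines, it helps to know which ones it misses. The Python sketch below counts matches over a handful of sample lines and returns the rejects. It is a standalone pre-check written for this page (match_report is a hypothetical helper) and does not call Gonzo or read its output.

```python
import re

def match_report(pattern, lines):
    """Return (matched_count, unmatched_lines) for a regex over sample lines."""
    rx = re.compile(pattern)
    unmatched = [ln for ln in lines if not rx.match(ln.rstrip("\n"))]
    return len(lines) - len(unmatched), unmatched

if __name__ == "__main__":
    pattern = r'^(?P<timestamp>[\d\-T:\.]+)\s+\[(?P<level>\w+)\]\s+(?P<message>.*)$'
    # Hypothetical sample; in practice, read the first few lines of your log file.
    sample = [
        "2024-01-15T10:30:45.123 [INFO] Server started",
        "2024-01-15T10:30:46.001 [ERROR] Connection refused",
        "    at connect (net.js:12)",
    ]
    matched, rejects = match_report(pattern, sample)
    print(f"{matched}/{len(sample)} lines matched")
    for ln in rejects:
        print("no match:", ln)
```

Lines that fail are often continuation lines such as stack traces, which a single-line pattern is not expected to match.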
Best Practices
Document your format: Add description and example log lines
Use meaningful names: Descriptive field names aid understanding
Handle edge cases: Provide defaults for optional fields
Test thoroughly: Verify with various log samples
Version control: Keep formats in Git for team sharing
Optimize patterns: Specific patterns perform better than generic ones
Additional Resources
Format Examples: https://github.com/control-theory/gonzo/tree/main/formats
Full Guide: https://github.com/control-theory/gonzo/blob/main/guides/CUSTOM_FORMATS.md
Quick Reference: https://github.com/control-theory/gonzo/blob/main/guides/FORMAT_QUICK_REFERENCE.md
Issue Tracker: https://github.com/control-theory/gonzo/issues