Webhook API

OnCallM receives alerts from AlertManager via webhook endpoints.

Webhook Endpoint

POST /webhook

Receives alerts from AlertManager for AI analysis.

Request Format

OnCallM expects the standard AlertManager webhook payload format:

json

{
  "receiver": "oncallm-webhook",
  "status": "firing",
  "alerts": [
    {
      "status": "firing",
      "labels": {
        "alertname": "HighCPUUsage",
        "instance": "node-1:9100",
        "job": "node-exporter",
        "severity": "warning",
        "namespace": "production"
      },
      "annotations": {
        "summary": "CPU usage is above 80%",
        "description": "CPU usage on node-1 has been above 80% for more than 5 minutes"
      },
      "startsAt": "2024-01-15T10:30:00.000Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://prometheus:9090/graph?g0.expr=...",
      "fingerprint": "abc123def456"
    }
  ],
  "groupLabels": {
    "alertname": "HighCPUUsage"
  },
  "commonLabels": {
    "job": "node-exporter",
    "severity": "warning"
  },
  "commonAnnotations": {},
  "externalURL": "http://alertmanager:9093",
  "version": "4",
  "groupKey": "{}:{alertname=\"HighCPUUsage\"}"
}
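
As a rough illustration of how a receiver might read this payload, the sketch below parses it into small dataclasses. The `Alert` and `WebhookPayload` classes and the `parse_payload` helper are hypothetical names chosen for this example, not OnCallM's actual models; only the JSON field names come from the request format above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Alert:
    """One entry of the "alerts" array (hypothetical model)."""
    status: str
    labels: dict
    annotations: dict
    starts_at: str
    fingerprint: str


@dataclass
class WebhookPayload:
    """Top-level webhook body (hypothetical model)."""
    status: str
    alerts: list = field(default_factory=list)
    group_labels: dict = field(default_factory=dict)
    common_labels: dict = field(default_factory=dict)


def parse_payload(raw: str) -> WebhookPayload:
    """Parse a webhook body; raise ValueError if it is not JSON with an "alerts" list."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("alerts"), list):
        raise ValueError("Invalid alert format")
    alerts = [
        Alert(
            status=item.get("status", ""),
            labels=item.get("labels", {}),
            annotations=item.get("annotations", {}),
            starts_at=item.get("startsAt", ""),
            fingerprint=item.get("fingerprint", ""),
        )
        for item in data["alerts"]
    ]
    return WebhookPayload(
        status=data.get("status", ""),
        alerts=alerts,
        group_labels=data.get("groupLabels", {}),
        common_labels=data.get("commonLabels", {}),
    )
```

Optional fields fall back to empty values here so that a reduced payload, such as the one used in the manual test, still parses.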

Response Format

json

{
  "status": "success",
  "message": "Alerts queued for analysis",
  "report_urls": [
    {
      "fingerprint": "abc123def456",
      "alert_name": "HighCPUUsage",
      "namespace": "production",
      "report_url": "https://oncallm.example.com/report/abc123def456"
    }
  ]
}
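
To make the shape of this response concrete, here is a minimal sketch of how such a body could be assembled from the incoming alerts. The `build_response` function and its `base_url` parameter are hypothetical. The `"Unknown"` fallback for a missing namespace mirrors the testing example later on this page, while applying the same fallback to `alert_name` is an assumption.

```python
def build_response(alerts: list, base_url: str) -> dict:
    """Build a response body shaped like the documented example (illustrative only)."""
    report_urls = []
    for alert in alerts:
        labels = alert.get("labels", {})
        fingerprint = alert["fingerprint"]
        report_urls.append({
            "fingerprint": fingerprint,
            # Fallback for a missing alertname is an assumption.
            "alert_name": labels.get("alertname", "Unknown"),
            "namespace": labels.get("namespace", "Unknown"),
            "report_url": f"{base_url.rstrip('/')}/report/{fingerprint}",
        })
    return {
        "status": "success",
        "message": "Alerts queued for analysis",
        "report_urls": report_urls,
    }
```

Each report URL is derived from the alert fingerprint, so a client can match entries in `report_urls` back to the alerts it sent.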

Error Responses

400 Bad Request

json

{
  "detail": "Invalid alert format"
}

503 Service Unavailable

json

{
  "detail": "Service not initialised"
}

Authentication

Bearer Token

Configure bearer token authentication:

yaml

# alertmanager.yml
receivers:
- name: 'oncallm-webhook'
  webhook_configs:
  - url: 'http://oncallm:8001/webhook'
    http_config:
      bearer_token: 'your-secret-token'

OnCallM validates the token:

python

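# Verify bearer token (sketch; assumes a FastAPI/Starlette-style request and HTTPException)
import hmac

authorization = request.headers.get("Authorization", "")
# Constant-time comparison avoids leaking token contents through response timing
if not hmac.compare_digest(authorization.encode(), f"Bearer {expected_token}".encode()):
    raise HTTPException(status_code=401, detail="Unauthorized")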

Testing

Manual Test

bash

curl -X POST http://localhost:8001/webhook \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-token' \
  -d '{
    "alerts": [{
      "labels": {
        "alertname": "TestAlert",
        "severity": "warning"
      },
      "annotations": {
        "summary": "Test alert for integration"
      },
      "startsAt": "2024-01-15T10:30:00.000Z",
      "fingerprint": "test123"
    }]
  }'
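
The same request can be built from Python with only the standard library. This is a sketch rather than an official client: `build_webhook_request` is a hypothetical helper, and the URL and token are the placeholder values from the curl command above.

```python
import json
import urllib.request


def build_webhook_request(url: str, token: str, alerts: list) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the given alerts."""
    body = json.dumps({"alerts": alerts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


test_alert = {
    "labels": {"alertname": "TestAlert", "severity": "warning"},
    "annotations": {"summary": "Test alert for integration"},
    "startsAt": "2024-01-15T10:30:00.000Z",
    "fingerprint": "test123",
}
request = build_webhook_request("http://localhost:8001/webhook", "your-token", [test_alert])
```

Passing `request` to `urllib.request.urlopen(request, timeout=10)` sends it. Building and sending are kept separate so the request can be inspected without a running server.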

Expected Response

json

{
  "status": "success",
  "message": "Alerts queued for analysis",
  "report_urls": [
    {
      "fingerprint": "test123",
      "alert_name": "TestAlert",
      "namespace": "Unknown",
      "report_url": "http://localhost:8001/report/test123"
    }
  ]
}

Rate Limiting

OnCallM implements rate limiting to prevent abuse:

Per IP: 100 requests per minute
Global: 1000 requests per minute
Per Alert: Duplicate alerts within 60 seconds are deduplicated
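
The limits above could be enforced in several ways, and this page does not say which one OnCallM uses. The sketch below shows one possibility under stated assumptions: a sliding-window counter per key (an IP address, or a single constant key for the global limit) and deduplication keyed on the alert fingerprint. `SlidingWindowLimiter` and `Deduplicator` are hypothetical names.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._events = defaultdict(deque)  # key -> timestamps of accepted events

    def allow(self, key: str) -> bool:
        now = self.clock()
        events = self._events[key]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True


class Deduplicator:
    """Flag a fingerprint as duplicate if it was accepted within `window` seconds."""

    def __init__(self, window: float = 60.0, clock=time.monotonic):
        self.window = window
        self.clock = clock
        self._last_accepted = {}  # fingerprint -> time it was last accepted

    def is_duplicate(self, fingerprint: str) -> bool:
        now = self.clock()
        last = self._last_accepted.get(fingerprint)
        if last is not None and now - last < self.window:
            return True
        self._last_accepted[fingerprint] = now
        return False


per_ip = SlidingWindowLimiter(limit=100)      # Per IP: 100 requests per minute
global_limit = SlidingWindowLimiter(limit=1000)  # Global: use one constant key
dedup = Deduplicator(window=60.0)             # Per Alert: 60-second window
```

The injectable `clock` argument exists only so the behavior can be exercised without waiting in real time.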

Rate Limit Headers

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1642249200
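
A client can use these headers to decide how long to wait before retrying. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but this page does not state. The helper names are hypothetical, and `headers` is treated as a plain mapping with exactly these header spellings.

```python
import time
from typing import Mapping, Optional


def remaining_requests(headers: Mapping[str, str]) -> Optional[int]:
    """Return the remaining request budget, or None if the header is absent."""
    value = headers.get("X-RateLimit-Remaining")
    return int(value) if value is not None else None


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how many seconds remain until the limit resets (never negative)."""
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return 0.0
    current = time.time() if now is None else now
    # Assumption: the header carries a Unix epoch timestamp in seconds.
    return max(0.0, float(reset) - current)
```

Most HTTP libraries expose headers through a case-insensitive mapping, in which case the lookups work regardless of how the server capitalizes the names.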

Rate Limit Response

json

{
  "detail": "Rate limit exceeded. Try again in 60 seconds."
}

Monitoring

Metrics

OnCallM exposes webhook metrics:

promql

# Total webhook requests
oncallm_webhook_requests_total

# Webhook request duration
oncallm_webhook_duration_seconds

# Webhook errors
oncallm_webhook_errors_total

Health Check

Monitor webhook health:

bash

curl http://localhost:8001/health

json

{
  "status": "healthy",
  "webhook_endpoint": "available",
  "ai_service": "connected",
  "queue_size": 5
}
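
For automated checks, the health response can be reduced to a single readiness decision. The following is only a sketch: `is_ready` and its `max_queue_size` threshold are hypothetical, and because this page documents only the healthy values of each field, any other value is treated as not ready.

```python
import json


def is_ready(health_body: str, max_queue_size: int = 100) -> bool:
    """Return True if every documented field reports its healthy value."""
    health = json.loads(health_body)
    return (
        health.get("status") == "healthy"
        and health.get("webhook_endpoint") == "available"
        and health.get("ai_service") == "connected"
        # The queue threshold is an arbitrary example value, not a documented limit.
        and health.get("queue_size", 0) <= max_queue_size
    )
```

A growing `queue_size` may be worth alerting on separately, since it can indicate that analysis is falling behind even while the endpoint itself reports healthy.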

Troubleshooting

Common Issues

Webhook not receiving requests?

Check AlertManager configuration:

bash

amtool check-config alertmanager.yml

Verify network connectivity:

bash

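# Forward the service port in the background, then probe the health endpoint
kubectl port-forward svc/oncallm 8001:8001 &
sleep 2  # give the tunnel a moment to come up
curl http://localhost:8001/health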

Check AlertManager logs:

bash

kubectl logs deployment/alertmanager | grep webhook

Authentication failures?

Verify that the bearer token in the AlertManager config matches the token OnCallM expects

Check OnCallM logs:

bash

kubectl logs deployment/oncallm | grep "Unauthorized"

Rate limiting issues?

Check rate limit headers in responses
Implement exponential backoff in client
Contact support for higher limits
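
The exponential backoff suggested above can be sketched as follows. This example assumes that a rate-limited request is signalled with HTTP status 429, which this page does not state explicitly. `send` is a hypothetical callable that performs one request and returns its status code.

```python
import random
import time


def backoff_delays(count: int, base: float = 1.0, factor: float = 2.0, cap: float = 60.0):
    """Yield `count` capped exponential delays: base, base*factor, base*factor**2, ..."""
    for attempt in range(count):
        yield min(cap, base * factor ** attempt)


def send_with_backoff(send, attempts: int = 5, sleep=time.sleep, jitter=random.random):
    """Call `send()` up to `attempts` times, backing off while it returns 429."""
    status = send()
    for delay in backoff_delays(attempts - 1):
        if status != 429:  # assumption: 429 marks a rate-limited request
            break
        # Full jitter: wait a random fraction of the delay to avoid synchronized retries.
        sleep(delay * jitter())
        status = send()
    return status
```

The `sleep` and `jitter` parameters default to real waiting and real randomness. They are injectable so the retry schedule can be checked deterministically.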

Integration Examples

Slack Integration

Combine OnCallM with Slack notifications:

yaml
receivers:
- name: 'oncallm-and-slack'
  webhook_configs:
  - url: 'http://oncallm:8001/webhook'
  slack_configs:
  - api_url: 'YOUR_SLACK_WEBHOOK'
    channel: '#alerts'
    title: 'Alert: {{ .GroupLabels.alertname }}'
    text: 'AI Analysis: http://oncallm:8001/report/{{ (index .Alerts 0).Fingerprint }}'

PagerDuty Integration

Route critical alerts to both OnCallM and PagerDuty:

yaml

receivers:
- name: 'critical-alerts'
  webhook_configs:
  - url: 'http://oncallm:8001/webhook'
  pagerduty_configs:
  - routing_key: 'YOUR_PAGERDUTY_KEY'
    description: 'Critical alert - AI analysis available'
