Troubleshooting
Diagnose and resolve common issues with DB Audit. Find step-by-step solutions for connection problems, missing events, and more.
Quick Diagnostic Steps
Before diving into specific issues, run these quick checks to identify the problem area.
1. Is the collector process running? Check systemd, Docker, or Kubernetes status.
2. Review the collector logs for error messages indicating connection failures or configuration issues.
3. Query the /health/detailed endpoint, which shows connectivity status for the databases and the API.
4. Run `dbaudit-collector validate` to check for configuration errors.
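The quick checks can be chained into one pass/fail summary. The sketch below assumes a systemd install and the health endpoint on localhost:8080 (as used elsewhere in this guide); `run_check` and `quick_diagnostics` are hypothetical helper names, not part of the collector CLI.

```shell
# run_check NAME COMMAND [ARGS...]
# Runs COMMAND with its output suppressed and prints "PASS NAME" or
# "FAIL NAME" depending on the exit status.
run_check() {
    name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS $name"
    else
        echo "FAIL $name"
        return 1
    fi
}

# quick_diagnostics (hypothetical helper): runs the checks in order.
# Every check runs even if an earlier one fails, so the summary is complete.
quick_diagnostics() {
    run_check "collector service active" systemctl is-active --quiet dbaudit-collector
    run_check "health endpoint reachable" curl -sf localhost:8080/health/detailed
    run_check "configuration valid" dbaudit-collector validate
}
```

Call `quick_diagnostics` from a shell on the collector host. Log review does not reduce to a single exit code, so inspect the logs separately with `journalctl -u dbaudit-collector -n 100`.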
Common Issues
Connection Issues
Collector cannot connect to database
Symptoms
- Connection timeout errors
- Authentication failures
- SSL/TLS handshake errors
Possible Causes
- Incorrect credentials
- Firewall blocking connection
- SSL certificate issues
- Database not accepting connections
Solutions
1. Verify that the credentials are correct and the user has the required permissions
2. Check that firewall rules allow traffic from the collector to the database port
3. For SSL issues, verify that the certificates are valid and that their paths are correct in the config
4. Test connectivity with: `dbaudit-collector test-connection --database prod-postgres`
Collector cannot reach DB Audit API
Symptoms
- API connection errors
- Events not appearing in dashboard
- Cache filling up
Possible Causes
- Network/firewall blocking outbound HTTPS
- Invalid API key
- Proxy configuration missing
Solutions
1. Verify that outbound HTTPS (port 443) to api.dbaudit.ai is allowed
2. Check that the API key is valid: `dbaudit-collector validate-key`
3. If the collector is behind a proxy, set the HTTPS_PROXY environment variable
4. Test API connectivity: `curl -I https://api.dbaudit.ai/health`
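The HTTP status returned by the connectivity test often points at which cause applies. The following is a minimal sketch, assuming the API follows standard HTTP status semantics (401/403 for a rejected key, 429 for rate limiting), which this guide does not confirm. `classify_api_status` is a hypothetical helper name.

```shell
# classify_api_status CODE
# Maps an HTTP status code to the most likely cause from this section.
# curl reports 000 when it received no HTTP response at all.
classify_api_status() {
    case "$1" in
        2??)     echo "ok: API reachable" ;;
        000)     echo "network: no HTTP response; check firewall, DNS, and HTTPS_PROXY" ;;
        401|403) echo "auth: API key rejected; run dbaudit-collector validate-key" ;;
        407)     echo "proxy: the proxy requires authentication" ;;
        429)     echo "rate-limit: requests are being throttled; back off and retry" ;;
        *)       echo "unexpected: HTTP $1" ;;
    esac
}

# Usage (requires network access):
#   code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 https://api.dbaudit.ai/health)
#   classify_api_status "$code"
```

Treat the result as a hint rather than a diagnosis; the collector logs remain the authoritative source for the actual error.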
Event Capture Issues
Events not appearing in dashboard
Symptoms
- Zero events in dashboard
- Some queries not captured
- Missing events for specific databases
Possible Causes
- Database audit logging not enabled
- Collector not running
- Incorrect database configuration
- Sampling filtering events
Solutions
1. Verify that native audit logging is enabled on the database (see connector docs)
2. Check collector status: `systemctl status dbaudit-collector`
3. Review collector logs: `journalctl -u dbaudit-collector -f`
4. Verify that the database is listed in the config and its credentials are correct
5. Check whether sampling is excluding events by reviewing the sampling config
Event lag or delays
Symptoms
- Events appearing minutes after execution
- Dashboard showing stale data
Possible Causes
- High event volume overwhelming collector
- Network latency to API
- Insufficient collector resources
Solutions
1. Check collector buffer utilization: `curl localhost:8080/health/detailed`
2. Increase collector resources (CPU/memory) if the buffer is frequently full
3. Deploy additional collectors to distribute load
4. Consider enabling sampling for high-volume, low-value queries
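To turn buffer numbers into a decision, a small helper can compute utilization and flag when it crosses a threshold. This is a sketch only: `buffer_verdict` is a hypothetical name, the 80% default threshold is an illustrative choice, and the JSON field names in the usage comment are assumptions, since this guide does not document the shape of the /health/detailed response.

```shell
# buffer_verdict USED CAPACITY [THRESHOLD_PERCENT]
# Prints the integer buffer utilization and whether it crosses the
# threshold (default 80, an illustrative value).
buffer_verdict() {
    used=$1
    capacity=$2
    threshold=${3:-80}
    if [ "$capacity" -le 0 ]; then
        echo "invalid capacity: $capacity" >&2
        return 1
    fi
    pct=$(( used * 100 / capacity ))
    if [ "$pct" -ge "$threshold" ]; then
        echo "buffer at ${pct}%: add resources, add collectors, or enable sampling"
    else
        echo "buffer at ${pct}%: ok"
    fi
}

# Usage (field names .buffer.used and .buffer.capacity are hypothetical):
#   health=$(curl -s localhost:8080/health/detailed)
#   buffer_verdict "$(echo "$health" | jq '.buffer.used')" \
#                  "$(echo "$health" | jq '.buffer.capacity')"
```

Sampling the value several times over a few minutes gives a better picture than a single reading, because a buffer that is only briefly full may not need more resources.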
Performance Issues
High collector CPU/memory usage
Symptoms
- Collector using excessive resources
- OOM kills
- System slowdown
Possible Causes
- Very high event volume
- Complex policy rules
- Memory leak (rare)
Solutions
1. Enable sampling to reduce event volume
2. Increase resource limits in the deployment config
3. Simplify complex regex patterns in policies
4. If you suspect a memory leak, check for a newer collector version
Slow API queries
Symptoms
- Dashboard loading slowly
- API timeouts
- Report generation failing
Possible Causes
- Querying large time ranges
- Missing time filters
- Rate limit throttling
Solutions
1. Add time bounds to queries (for example, the last 24 hours instead of all time)
2. Use aggregations instead of fetching raw events for dashboards
3. Check rate limit headers and implement backoff if needed
4. Contact support if queries with reasonable bounds are consistently slow
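The backoff mentioned in step 3 can be sketched as a retry wrapper with exponentially growing delays. This is a generic pattern, not DB Audit's documented client behavior; `backoff_delay` and `retry_with_backoff` are hypothetical helper names, and the 1-second base and 60-second cap are illustrative defaults.

```shell
# backoff_delay ATTEMPT [BASE] [CAP]
# Prints the delay in seconds before the next retry:
# min(CAP, BASE * 2^(ATTEMPT-1)). Defaults: BASE=1, CAP=60.
backoff_delay() {
    attempt=$1
    delay=${2:-1}
    cap=${3:-60}
    i=1
    while [ "$i" -lt "$attempt" ]; do
        delay=$(( delay * 2 ))
        if [ "$delay" -ge "$cap" ]; then
            break
        fi
        i=$(( i + 1 ))
    done
    if [ "$delay" -gt "$cap" ]; then
        delay=$cap
    fi
    echo "$delay"
}

# retry_with_backoff MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# sleeping between attempts.
retry_with_backoff() {
    max=$1
    shift
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$max" ] && return 1
        sleep "$(backoff_delay "$n")"
        n=$(( n + 1 ))
    done
}
```

For example, `retry_with_backoff 5 curl -sf https://api.dbaudit.ai/health` retries up to five times; `curl -f` exits non-zero on HTTP errors such as a throttled response, which triggers the next attempt.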
Alert Issues
Alerts not triggering
Symptoms
- Expected alerts not firing
- Policy violations not notifying
Possible Causes
- Alert channel not configured
- Policy not enabled
- Severity threshold too high
- Alert cooldown active
Solutions
1. Verify that the alert channel is configured and tested
2. Check that the policy is enabled and matches the events
3. Review severity settings on the alert channel
4. Check whether a cooldown is suppressing duplicate alerts
5. Test alert delivery: Settings > Alerts > Send Test
Too many alerts (alert fatigue)
Symptoms
- Hundreds of alerts per day
- Important alerts buried in noise
Possible Causes
- Thresholds too sensitive
- Missing exclusions for known patterns
- Baseline not tuned
Solutions
1. Increase thresholds for noisy alert rules
2. Add exclusions for known-good patterns (e.g., monitoring queries)
3. Enable anomaly detection to reduce false positives
4. Use alert aggregation to batch similar alerts
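Before raising thresholds, it helps to know which rules produce most of the noise. The sketch below counts alerts per rule from a plain-text list with one rule name per line. That input format is an assumption: this guide does not describe an alert export, so adapt the input to whatever your alert channel provides. `top_noisy_rules` is a hypothetical helper name.

```shell
# top_noisy_rules [N]
# Reads one alert rule name per line on stdin and prints the N most
# frequent rules as "COUNT RULE", highest count first (default N=5).
top_noisy_rules() {
    sort | uniq -c | sort -rn | head -n "${1:-5}" | sed 's/^ *//'
}

# Usage (alerts.txt is a hypothetical file with one rule name per line):
#   top_noisy_rules 10 < alerts.txt
```

Rules at the top of the list are the first candidates for higher thresholds or exclusions.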
Diagnostic Commands
Systemd (Linux)
# Check collector service status
systemctl status dbaudit-collector
# View collector logs (last 100 lines)
journalctl -u dbaudit-collector -n 100
# Follow logs in real-time
journalctl -u dbaudit-collector -f
# Check collector health endpoint
curl -s localhost:8080/health/detailed | jq
# Validate configuration
dbaudit-collector validate --config /etc/dbaudit/config.yaml
# Test database connectivity
dbaudit-collector test-connection --database prod-postgres
# Test API connectivity
dbaudit-collector test-api
Docker
# Check container status
docker ps -a | grep dbaudit
# View container logs
docker logs dbaudit-collector --tail 100 -f
# Execute commands inside container
docker exec -it dbaudit-collector /bin/sh
# Check health inside container
docker exec dbaudit-collector curl -s localhost:8080/health/detailed
# Restart container
docker restart dbaudit-collector
Kubernetes
# Check pod status
kubectl get pods -l app=dbaudit-collector
# Describe pod for events
kubectl describe pod -l app=dbaudit-collector
# View pod logs
kubectl logs -l app=dbaudit-collector --tail 100 -f
# Check resource usage
kubectl top pod -l app=dbaudit-collector
# Execute into pod
kubectl exec -it deploy/dbaudit-collector -- /bin/sh
# Check configmap
kubectl get configmap dbaudit-config -o yaml
Network Diagnostics
# Test outbound connectivity to DB Audit API
curl -v https://api.dbaudit.ai/health
# Test database connectivity
nc -zv your-database.example.com 5432
# Check DNS resolution
nslookup api.dbaudit.ai
# Test with proxy (if configured)
HTTPS_PROXY=http://proxy:8080 curl -v https://api.dbaudit.ai/health
# Check listening ports
netstat -tlnp | grep dbaudit
Error Codes Reference
Common error codes you may encounter in collector logs and their meanings.
| Code | Description | Action |
|---|---|---|
| CONN_TIMEOUT | Connection to database timed out | Check network path and firewall rules |
| AUTH_FAILED | Database authentication failed | Verify credentials in config file |
| SSL_HANDSHAKE | SSL/TLS handshake failed | Check certificate validity and paths |
| API_UNAUTHORIZED | Invalid or expired API key | Regenerate API key in dashboard |
| API_RATE_LIMIT | API rate limit exceeded | Implement exponential backoff |
| BUFFER_FULL | Event buffer capacity reached | Increase buffer size or add collectors |
| CACHE_FULL | Local cache capacity reached | Increase cache size or check API connectivity |
| PARSE_ERROR | Failed to parse audit log entry | Check database audit log format |
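The table can be applied to a log stream mechanically. The sketch below assumes the codes appear verbatim in collector log lines, which this guide implies but does not show; `suggest_action` and `scan_log` are hypothetical helper names, and the actions are copied from the table.

```shell
# suggest_action CODE
# Prints the recommended action for a known error code.
suggest_action() {
    case "$1" in
        CONN_TIMEOUT)     echo "Check network path and firewall rules" ;;
        AUTH_FAILED)      echo "Verify credentials in config file" ;;
        SSL_HANDSHAKE)    echo "Check certificate validity and paths" ;;
        API_UNAUTHORIZED) echo "Regenerate API key in dashboard" ;;
        API_RATE_LIMIT)   echo "Implement exponential backoff" ;;
        BUFFER_FULL)      echo "Increase buffer size or add collectors" ;;
        CACHE_FULL)       echo "Increase cache size or check API connectivity" ;;
        PARSE_ERROR)      echo "Check database audit log format" ;;
        *)                return 1 ;;
    esac
}

# scan_log
# Reads log lines on stdin and prints "COUNT CODE: ACTION" for each
# known code found, most frequent first.
scan_log() {
    grep -oE 'CONN_TIMEOUT|AUTH_FAILED|SSL_HANDSHAKE|API_UNAUTHORIZED|API_RATE_LIMIT|BUFFER_FULL|CACHE_FULL|PARSE_ERROR' |
        sort | uniq -c | sort -rn |
        while read -r count code; do
            echo "$count $code: $(suggest_action "$code")"
        done
}

# Usage on a systemd host:
#   journalctl -u dbaudit-collector -n 1000 --no-pager | scan_log
```

The most frequent code is usually the best place to start, since several of the issues in this guide cascade (for example, an unreachable API eventually fills the local cache).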
Still Need Help?
If you're still experiencing issues after following this guide, our support team is here to help.