DevOps troubleshooting agents
Agents that parse logs, metrics, and deployment events to diagnose incidents and recommend or execute remediation steps.
Muhammad Zeeshan
Technologies Used
Python
Docker
Kubernetes
LLMs
AI Agents
Key Features
1
Log and metric correlation across services
2
Runbook-aware remediation suggestions
3
Incident timeline reconstruction
Implementation
Connected observability feeds to agent toolchains that classify failure modes and surface ranked fix paths with command snippets.
Results & Impact
Shortened mean time to diagnosis for deployment and infrastructure regressions.