Description
Project summary
AI-powered DevOps automation: infrastructure provisioning, issue remediation, governance management, and project setup through natural language.
Project description
DevOps AI Toolkit democratizes platform engineering and cloud native operations through AI-powered automation across multiple workflow areas: infrastructure provisioning, issue remediation, governance management, and project setup.
The project addresses critical challenges in modern DevOps:
- Capability Discovery: Teams struggle to understand what their Kubernetes clusters can provision (operators, CRDs, native resources)
- Troubleshooting Complexity: Diagnosing issues requires deep expertise and multi-step investigation
- Knowledge Silos: Platform engineering best practices and governance policies locked in expert engineers' heads
- Repetitive Toil: Setting up repositories with proper governance, security, and automation files is manual and inconsistent
How it works: Discovers cluster capabilities (resources, operators, CRDs) and uses AI to semantically understand what each resource does. Users describe needs in plain English ("I need a PostgreSQL database"), and the system matches intent to available capabilities, generates manifests, and provisions resources across any infrastructure. It performs multi-step root cause analysis for issues with executable remediation commands, codifies organizational best practices as reusable patterns, and generates 25+ governance/security files for repository standardization.
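The intent-matching step described above can be illustrated with a toy example. This is a minimal sketch, not the project's implementation: it stands in bag-of-words cosine similarity for real embeddings and a vector database, and the capability names and descriptions are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical capability descriptions; in practice these would come from
# cluster discovery plus AI-generated summaries of each resource.
CAPABILITIES = {
    "PostgresCluster": "postgresql relational database with automated backups",
    "RedisCache": "in-memory redis cache for session data",
    "ObjectBucket": "object storage bucket for files",
}


def tokenize(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_capabilities(intent: str, capabilities: dict[str, str]) -> list[tuple[str, float]]:
    """Score every capability against the intent, best match first."""
    query = tokenize(intent)
    scored = [(name, cosine(query, tokenize(desc))) for name, desc in capabilities.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranked = rank_capabilities("I need a PostgreSQL database", CAPABILITIES)
print(ranked[0][0])
```

A production system would replace `tokenize` and `cosine` with an embedding model and a vector store query; the ranking step stays conceptually the same.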
Built on the Model Context Protocol (MCP), it integrates with any MCP-compatible AI agent and supports multiple AI providers (Claude, GPT, Gemini, Mistral, DeepSeek, xAI).
What makes it unique: To our knowledge, it is the first project combining semantic capability discovery, MCP integration, and AI-powered automation across infrastructure provisioning, remediation, governance, and project setup. Unlike single-purpose tools such as K8sGPT, it provides a comprehensive DevOps automation platform that works with any infrastructure operators and MCP-compatible agents.
Org repo URL (provide if all repos under the org are in scope of the application)
Project repo URL in scope of application
https://github.com/vfarcic/dot-ai
Additional repos in scope of the application
No response
Website URL
https://github.com/vfarcic/dot-ai
Roadmap
https://github.com/vfarcic/dot-ai/blob/main/docs/ROADMAP.md
Roadmap context
Key Roadmap Highlights:
- Phase 1 (Current): Semantic capability management, MCP integration, core deployment recommendations
- Phase 2 (6 months): Enhanced AI model support (Bedrock, etc.), improved remediation playbooks, web UI
- Phase 3 (12 months): Advanced pattern learning, multi-cluster support, ecosystem integration with more CNCF projects
The roadmap emphasizes building a robust foundation for AI-powered platform engineering while maintaining flexibility to respond to community feedback and cloud native ecosystem evolution.
Contributing guide
https://github.com/vfarcic/dot-ai/blob/main/CONTRIBUTING.md
Code of Conduct (CoC)
https://github.com/vfarcic/dot-ai/blob/main/CODE_OF_CONDUCT.md
Adopters
No response
Maintainers file
https://github.com/vfarcic/dot-ai/blob/main/docs/MAINTAINERS.md
Security policy file
https://github.com/vfarcic/dot-ai/blob/main/SECURITY.md
Standard or specification?
No formal specification, but the project implements:
- Model Context Protocol (MCP) specification for AI tool integration (https://modelcontextprotocol.io/)
- Kubernetes API standards for cluster interaction
- OpenAPI 3.0 for REST API endpoints
The project follows cloud native best practices and standards, but does not define new specifications. It acts as an implementation layer that bridges AI assistants with Kubernetes infrastructure.
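For readers unfamiliar with MCP, tool invocations travel as JSON-RPC 2.0 messages using the `tools/call` method. The sketch below builds such a request; the tool name `recommend` and its `intent` argument are hypothetical placeholders chosen for illustration, not a statement of the project's actual tool schema.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tool invocation as a JSON-RPC 2.0 request."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# "recommend" and "intent" are assumed names used only for this example.
request = build_tool_call(1, "recommend", {"intent": "I need a PostgreSQL database"})
print(request)
```

An MCP-compatible agent would send a message of this shape over the configured transport and read the tool result from the matching JSON-RPC response.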
Business product or service to project separation
DevOps AI Toolkit is 100% open source with no commercial product or service.
- All code is MIT licensed and publicly available
- No "enterprise" or "pro" versions exist
- No hosted SaaS offering
- No vendor lock-in or proprietary extensions
- No commercial entity backing the project
The project creator (Viktor Farcic) is a Developer Advocate at Upbound and maintains this as an independent open source project separate from any employer. This ensures neutral governance aligned with CNCF principles.
Why CNCF?
We're applying to CNCF for four strategic reasons:
1. Natural Ecosystem Fit
DevOps AI Toolkit is fundamentally a cloud native project built on Kubernetes, integrating with CNCF projects (Kubernetes, Argo CD, Crossplane, Kyverno, Prometheus), and solving cloud native pain points. CNCF is the natural home for projects in the Kubernetes ecosystem.
2. Neutral Governance & Sustainability
As a single-maintainer project with growing adoption, we need neutral governance to ensure long-term sustainability. CNCF provides the framework to transition from an individual project to a community-governed initiative, reducing single-point-of-failure risk.
3. Community & Visibility
CNCF membership will significantly increase project visibility within the platform engineering and DevOps communities we serve. Access to CNCF resources (events, marketing, technical guidance) will accelerate community building and contributor recruitment.
4. Credibility for Adopters
Enterprise teams hesitate to adopt single-maintainer projects without formal governance. CNCF Sandbox membership signals project viability, governance structure, and community backing, which lowers the adoption barrier for organizations.
What we hope to gain:
- Access to CNCF technical advisory groups for guidance on architecture and integrations
- Visibility at KubeCon + CloudNativeCon and through CNCF channels
- Connection to potential contributors and adopters
- Governance best practices and mentorship
- Integration opportunities with other CNCF projects
- Long-term project sustainability beyond single maintainer
Benefit to the landscape
DevOps AI Toolkit fills a critical gap in the cloud native landscape: an AI-powered DevOps automation platform for infrastructure provisioning, remediation, governance, and project setup.
Problem It Solves:
Modern Kubernetes clusters have evolved with sophisticated operators and CRDs that can provision resources inside and outside the cluster, but teams struggle to discover and leverage these capabilities. Platform engineers encode knowledge in operators, but developers don't know what exists or how to use it. Additionally, teams need help with issue diagnosis, governance enforcement, and project standardization. This creates:
- Knowledge gap: "What can my cluster do?" is difficult to answer
- Underutilization: Teams don't leverage available operators and capabilities
- Tribal knowledge: Platform capabilities and best practices locked in expert engineers' heads
- Slow onboarding: New team members struggle to discover available resources
- Manual toil: Repetitive tasks like project setup, issue diagnosis, and governance enforcement
Landscape Impact:
- Democratizes Infrastructure Operations: Makes capability-driven provisioning accessible through natural language, working with any operators on any infrastructure
- Increases CNCF Project Adoption: Helps teams discover and leverage any operators already in their clusters
- Bridges AI & Cloud Native: First project combining Model Context Protocol (MCP) with Kubernetes for conversational DevOps automation
- Enables Semantic Discovery: Introduces vector database-powered semantic search to understand cluster capabilities
- Comprehensive Automation: Covers the full DevOps lifecycle from provisioning to remediation to governance
Unique Position:
Unlike single-purpose tools (K8sGPT for troubleshooting, Helm for packaging), DevOps AI Toolkit provides a comprehensive AI-powered automation platform covering infrastructure provisioning (any resources inside/outside cluster via any operators), issue remediation, governance management, and project setup. This integrated approach is a net-new category in the landscape.
Cloud native 'fit'
DevOps AI Toolkit embodies core cloud native principles:
1. Kubernetes-Native
- Built entirely on Kubernetes APIs for resource and infrastructure management
- Discovers and leverages any Kubernetes operators and CRDs that exist in clusters
- Works with resources inside the cluster (native Kubernetes) and outside (via any operators)
- Follows Kubernetes patterns (declarative configuration, controller-based reconciliation)
- Universal compatibility with any infrastructure operators on any environment (cloud, on-prem, edge)
2. API-Driven & Extensible
- Exposes capabilities via RESTful HTTP API and Model Context Protocol
- Extensible through pluggable AI providers (Claude, GPT, Gemini, Mistral, DeepSeek)
- Integrates with external systems (Qdrant vector DB, Kubernetes clusters)
- Container-first: Runs as container or npm package
3. Declarative & Intent-Based
- Users declare intent in plain English ("I need a PostgreSQL database")
- System generates appropriate Kubernetes manifests based on discovered capabilities (whatever operators exist)
- Those manifests provision resources wherever capabilities reach (cluster-internal or external infrastructure)
- Supports organizational patterns/policies for declarative governance
4. Observable & Debuggable
- Comprehensive logging and error handling
- AI-powered remediation with root cause analysis
- Integration with existing observability tools (Prometheus, Grafana)
5. Composable & Interoperable
- Works alongside existing CNCF tools (doesn't replace, enhances)
- MCP enables integration with any MCP-compatible AI coding assistant
- Multi-cloud via Kubernetes control plane abstraction
6. Community-Driven Development
- Open source (MIT license) with transparent development
- Public roadmap and GitHub-based collaboration
- Comprehensive documentation and contribution guidelines
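As a rough illustration of the intent-based flow in item 3 above, the sketch below turns a discovered capability (group, version, kind) into a minimal manifest. The group `db.example.org`, the kind `PostgresCluster`, and the `spec` fields are hypothetical; a real manifest would follow whatever schema the installed operator defines.

```python
import json


def build_manifest(group: str, version: str, kind: str,
                   name: str, namespace: str, spec: dict) -> dict:
    """Build a minimal Kubernetes manifest for a discovered resource type."""
    # Core-group resources use a bare version ("v1"); all others use "group/version".
    api_version = f"{group}/{version}" if group else version
    return {
        "apiVersion": api_version,
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


# Hypothetical capability and spec values, for illustration only.
manifest = build_manifest(
    group="db.example.org",
    version="v1alpha1",
    kind="PostgresCluster",
    name="orders-db",
    namespace="team-a",
    spec={"version": "16", "storageGB": 10},
)
print(json.dumps(manifest, indent=2))
```

The printed JSON is also valid YAML, so it could be written to a file and applied with `kubectl apply -f` or committed to a GitOps repository.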
Cloud native 'integration'
DevOps AI Toolkit integrates with and enhances CNCF projects through universal discovery and capability-driven operations. The system works with any operators and resources that exist in your cluster. Common examples include:
Core Integration - Kubernetes:
- Primary interface for cluster management
- Discovers and leverages all Kubernetes resource types (native and custom)
- Generates manifests for any discovered resources (Deployments, StatefulSets, Services, operator CRs, etc.)
GitOps & Continuous Delivery:
- Argo CD: Recommends Argo CD Applications when Argo CD is detected in the cluster
- Flux: Can generate Flux-compatible manifests
- Supports GitOps workflows through manifest generation
Infrastructure Operators (Examples):
- Crossplane: Discovers Crossplane providers and generates XRs/XRDs when available
- AWS Controllers for Kubernetes (ACK): Leverages ACK operators when available
- Cluster API: Discovers and uses Cluster API resources when available
- Any infrastructure operators: Works universally with whatever operators exist in the cluster
Policy & Security:
- Kyverno: Integrates policy validation into recommendations
- OPA/Gatekeeper: Can validate generated manifests against policies
- Falco: Integrates security context into remediation recommendations
Observability:
- Prometheus: Uses metrics for remediation root cause analysis
- Jaeger/Tempo: Integrates distributed tracing for debugging
Service Mesh:
- Istio/Linkerd: Generates service mesh configurations when detected
Storage & Databases:
- Operator Framework: Discovers and uses database operators (PostgreSQL, MySQL, etc.)
- Recommends StatefulSets with proper PVC configurations
Build & Registry:
- Harbor: References container registries in deployment recommendations
- Buildpacks: Can recommend Cloud Native Buildpacks for image creation
Key Differentiator: DevOps AI Toolkit acts as a universal discovery and automation layer that works with any CNCF projects and operators through capability-driven discovery. The examples above represent common integrations, but the system adapts to whatever exists in your cluster.
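The discovery idea can be sketched against the standard CustomResourceDefinition shape (`spec.group`, `spec.names.kind`, `spec.versions[].served`). This is an illustrative assumption about how discovery might be approached, not the project's code; the sample data is hypothetical and mimics the list structure returned by `kubectl get crds -o json`.

```python
def summarize_crds(crd_list: dict) -> list[dict]:
    """Reduce a CRD list to the fields needed to describe each capability."""
    capabilities = []
    for crd in crd_list.get("items", []):
        spec = crd["spec"]
        # Only versions marked as served can actually be used by clients.
        served = [v["name"] for v in spec["versions"] if v.get("served")]
        if not served:
            continue
        capabilities.append({
            "group": spec["group"],
            "kind": spec["names"]["kind"],
            "scope": spec["scope"],
            "versions": served,
        })
    return capabilities


# Hypothetical CRDs, trimmed to the fields used above.
SAMPLE_CRDS = {
    "kind": "List",
    "items": [
        {"spec": {"group": "db.example.org", "scope": "Namespaced",
                  "names": {"kind": "PostgresCluster"},
                  "versions": [{"name": "v1alpha1", "served": True},
                               {"name": "v1", "served": True}]}},
        {"spec": {"group": "legacy.example.org", "scope": "Cluster",
                  "names": {"kind": "OldThing"},
                  "versions": [{"name": "v1beta1", "served": False}]}},
    ],
}

discovered = summarize_crds(SAMPLE_CRDS)
print(discovered)
```

Records like these, enriched with a description of what each kind does, are the sort of input a semantic search index would need.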
Cloud native overlap
Primary Overlap Area: AI-powered Kubernetes operations
Similar Projects:
1. K8sGPT (CNCF Sandbox, Dec 2023)
- Focus: Troubleshooting and diagnosis - scans clusters for issues and explains them in plain English
- Overlap: Both use AI with Kubernetes
- Differentiation:
  - K8sGPT: Single-purpose troubleshooting tool (diagnose existing problems)
  - DevOps AI Toolkit: Comprehensive DevOps automation platform (provisioning, remediation, governance, project setup)
  - K8sGPT: Works with current cluster state to diagnose issues
  - DevOps AI Toolkit: Discovers capabilities (any operators/resources), matches intent, provisions infrastructure anywhere
  - K8sGPT: Explains what's wrong
  - DevOps AI Toolkit: Provides end-to-end automation from provisioning to remediation to governance
- Complementary: Can use both - K8sGPT for diagnosis, DevOps AI Toolkit for comprehensive DevOps automation
2. HolmesGPT (CNCF Sandbox, Sept 2025)
- Focus: AI troubleshooting with broader data sources (Prometheus, logs)
- Overlap: AI-powered root cause analysis
- Differentiation:
  - HolmesGPT: Investigation from alerts/incidents
  - DevOps AI Toolkit: Semantic capability discovery + deployment recommendations + remediation
- Complementary: Can integrate HolmesGPT for alert analysis, DevOps AI Toolkit for deployment
3. Kagent (CNCF Sandbox, May 2025)
- Focus: General-purpose agentic AI framework for Kubernetes
- Overlap: AI agents for Kubernetes automation
- Differentiation:
  - Kagent: Framework for building custom AI agents
  - DevOps AI Toolkit: Purpose-built solution for platform engineering workflows
- Complementary: Kagent provides framework, we provide specific platform engineering tools
4. Kubernetes Dashboard (CNCF Graduated)
- Focus: Web-based UI for Kubernetes cluster management
- Overlap: Both help discover cluster resources
- Differentiation:
  - Kubernetes Dashboard: Manual UI navigation and visualization
  - DevOps AI Toolkit: AI-powered semantic discovery and conversational interface with intent-based provisioning
- Complementary: Dashboard for visual inspection, DevOps AI Toolkit for AI-driven automation
5. Helm (CNCF Graduated)
- Focus: Package manager for Kubernetes with pre-built charts
- Overlap: Both simplify Kubernetes deployments
- Differentiation:
  - Helm: Pre-built package manager with standardized charts
  - DevOps AI Toolkit: AI-generated manifests tailored to your cluster's actual discovered capabilities
  - Helm: Users must find and choose appropriate charts
  - DevOps AI Toolkit: Discovers what operators/CRDs exist and generates appropriate manifests dynamically
- Complementary: Helm for packaging applications, DevOps AI Toolkit for capability-driven provisioning
Key Differentiators That Make DevOps AI Toolkit Unique:
- Semantic Capability Discovery: Only project using vector search to understand what any cluster resources can do
- Universal Compatibility: Works with any operators on any infrastructure (not prescriptive about specific operators or clouds)
- Comprehensive Platform: Covers full DevOps lifecycle (provisioning, remediation, governance, project setup) vs. single-purpose tools
- MCP Integration: Native integration with any MCP-compatible AI agent for conversational workflows
- Capability-Driven: Discovers and adapts to whatever exists in your cluster dynamically
- Organizational Intelligence: Codifies institutional knowledge as patterns and policies for consistent operations
Ecosystem Positioning: DevOps AI Toolkit is complementary to single-purpose tools (K8sGPT for troubleshooting, Helm for packaging) while providing a comprehensive AI-powered DevOps automation platform that works universally with any infrastructure operators and any MCP-compatible agents.
Similar projects
1. kubectl-ai / Kube-Copilot
- Similarity: AI-assisted kubectl commands
- Difference:
  - kubectl-ai / Kube-Copilot: Limited to generating kubectl commands from natural language
  - DevOps AI Toolkit: Discovers capabilities, understands operators/CRDs, generates complete manifests
  - We provide organizational governance (patterns/policies) on top of AI
2. Pulumi/Terraform with AI Assistants
- Similarity: Infrastructure as Code with AI code generation
- Difference:
  - IaC tools: Define infrastructure in code
  - DevOps AI Toolkit: Kubernetes-native, leverages existing cluster capabilities, semantic discovery
  - We work with what's already in the cluster rather than defining everything from scratch
3. Commercial AI DevOps Platforms (Harness AI, GitLab Duo, GitHub Copilot for Azure)
- Similarity: AI for DevOps automation
- Difference:
  - Commercial platforms: Closed source, vendor lock-in, SaaS-only
  - DevOps AI Toolkit: 100% open source, self-hosted, MIT licensed, neutral governance
  - Focuses on Kubernetes capability discovery vs. broad DevOps automation
Unique Value Proposition:
- Only open source project combining semantic capability discovery + vector search + MCP integration for Kubernetes platform engineering
- Kubernetes-native: Works with cluster as it exists, not imposing new abstractions
- AI-powered intent translation: Converts "what I want" to "how to deploy it with what's available"
- Governance-aware: Organizational patterns/policies influence recommendations
Target Audience Alignment:
Unlike general-purpose IaC or AI code assistants, DevOps AI Toolkit is purpose-built for platform engineers and DevOps teams working with complex Kubernetes environments who need AI to help bridge the capability discovery gap.
Landscape
Not currently listed.
We plan to apply for Cloud Native Landscape listing upon Sandbox acceptance, as CNCF projects receive prioritized placement. We recommend being listed under:
Primary Category: Provisioning > Automation & Configuration
- Rationale: DevOps AI Toolkit's core functionality aligns with infrastructure orchestration and deployment automation, which is the focus of this category. The project automates infrastructure provisioning, configuration management, and operational workflows through AI-powered intelligence.
Secondary Category (if dual listing permitted): CNAI (Cloud Native AI)
- Rationale: As an AI-powered automation toolkit, the project also fits the emerging Cloud Native AI category, demonstrating innovative application of AI technologies to cloud native operations.
Why Landscape Listing Matters:
Landscape visibility helps potential adopters discover the project when researching automation solutions and AI-powered Kubernetes tools, reinforcing our position as a cloud native community member while increasing project discoverability across multiple relevant categories.
Trademark and accounts
- If the project is accepted, I agree to donate all project trademarks and accounts to the CNCF
IP policy
- If the project is accepted, I agree the project will follow the CNCF IP Policy
Will the project require a license exception?
Minimal exceptions may be needed for 4 transitive dependencies (1% of total).
Non-Allowlist Dependencies (all permissive, transitive):
- BlueOak-1.0.0 (3 packages: [email protected], [email protected], [email protected])
  - Status: OSI-approved permissive license (approved by Open Source Initiative in Jan 2024)
  - Nature: Similar to MIT, explicitly addresses patents
  - Risk: Low - permissive license, widely considered FOSS-compatible
- Unlicense (1 package: [email protected])
  - Status: Public domain dedication
  - Nature: Maximum permissiveness (no restrictions)
  - Risk: Low - effectively public domain
Mitigation Strategy:
If CNCF requests license alignment, we can identify and replace these 4 transitive dependencies with functionally equivalent packages using CNCF allowlist licenses. Given these are common utility libraries (path handling, argument parsing, stream buffers), allowlist-licensed alternatives are readily available in the npm ecosystem.
Note: 99% of dependencies (374 of 378 packages) are on the CNCF allowlist. No GPL, AGPL, or other copyleft licenses are present.
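The package counts in this section can be checked with a short calculation. It uses only the figures stated above (378 packages in total, 3 under BlueOak-1.0.0, 1 under Unlicense); the variable names are illustrative.

```python
total_packages = 378
non_allowlist = 3 + 1  # BlueOak-1.0.0 packages + Unlicense package
allowlisted = total_packages - non_allowlist

# Shares of the dependency tree, as percentages.
allowlisted_share = allowlisted / total_packages * 100
non_allowlist_share = non_allowlist / total_packages * 100

print(f"{allowlisted} of {total_packages} allowlisted: {allowlisted_share:.1f}%")
print(f"{non_allowlist} non-allowlist: {non_allowlist_share:.1f}%")
```

Rounded to whole numbers, that is roughly 99% allowlisted and 1% requiring review.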
Project "Domain Technical Review"
No formal TAG engagement yet.
Primary Recommendation: Platform Working Group
DevOps AI Toolkit is fundamentally a platform engineering tool that helps teams build internal developer platforms through AI-powered automation. The CNCF Platform Working Group is the most natural fit for review, as the project directly addresses platform engineering challenges: abstracting infrastructure complexity, enabling self-service workflows, codifying organizational patterns, and improving developer experience.
Alternative TAG Assignments (if Platform WG is unavailable or TOC prefers TAG review):
Given the project's cross-domain nature, it could fit multiple TAGs depending on review focus:
- TAG Infrastructure - Capability discovery and infrastructure provisioning
- TAG Developer Experience - Project setup, governance automation, developer workflows
- TAG Operational Resilience - AI-powered issue remediation and troubleshooting
- TAG Security and Compliance - Policy management and governance enforcement
- Cloud Native AI Working Group (TAG Runtime) - AI/LLM-powered automation across all workflows
Flexibility:
We're open to TOC assigning any TAG or working group based on their strategic priorities and full application context. The project's cross-cutting nature means it can provide value to multiple technical communities.
Post-Submission Engagement:
We're committed to actively participating in assigned TAG/working group meetings, presenting project demos, and incorporating feedback into roadmap priorities.
Application contact email(s)
Contributing or sponsoring entity signatory information
| Name | Country | Email address |
|---|---|---|
| Viktor Farcic | Spain | [email protected] |
CNCF contacts
No response
Additional information
Project Maturity & Readiness:
- Active Development: 442 commits in 6 months, regular releases (currently v0.118.0)
- Production Quality: Comprehensive integration test suite, CI/CD automation, semantic versioning
- Documentation: 12+ detailed guides covering all features, quick start, troubleshooting
- Community Infrastructure: Issue/PR templates, automated labeling, stale bot, OpenSSF Scorecard
Timing & Market Context:
- AI DevOps Surge: Rapid growth in AI-powered DevOps tools; positioning early in CNCF establishes leadership
- Platform Engineering Movement: Strong industry momentum toward platform engineering practices; project directly addresses this trend
- MCP Ecosystem Growth: Model Context Protocol adoption is accelerating; being an early CNCF project with MCP integration creates differentiation
Commitment to CNCF Principles:
- Neutral Governance: Committed to evolving toward multi-organization maintainer base
- Community First: Will prioritize community needs over individual/employer interests
- Ecosystem Collaboration: Eager to integrate deeply with other CNCF projects (Argo, Crossplane, Kyverno, etc.)
- Transparency: Public roadmap, open development, responsive to feedback
Sandbox as Growth Vehicle:
We view Sandbox not as an endpoint but as a foundation for community building:
- Sandbox visibility will help attract contributors and maintainers
- TAG engagement will guide architecture and integration decisions
- CNCF credibility will lower adoption barriers for enterprise teams
- Goal: Progress to Incubating as community and governance mature
Challenges We're Transparent About:
- Single Maintainer Currently: Actively working to recruit contributors; Sandbox acceptance will accelerate this
- Limited Production Evidence: Early stage with growing adoption; will document adopters as they emerge
- Rapid AI Evolution: Committed to evolving with AI/LLM ecosystem while maintaining stability
Why CNCF Should Accept DevOps AI Toolkit:
- Fills Real Gap: Addresses underserved "platform engineering enablement" category
- Cloud Native DNA: Built on Kubernetes, integrates with CNCF projects, follows cloud native principles
- Innovative Technology: First to combine semantic discovery + vector search + MCP for Kubernetes
- Strong Foundation: 6 months of active development, comprehensive docs, solid architecture
- Growth Trajectory: Community interest indicators (stars, forks, downloads) show upward momentum
- Committed Maintainer: Track record of open source contribution, DevOps community presence (YouTube channel: DevOps Toolkit, published author)
We're excited about the opportunity to contribute to the CNCF community and help advance cloud native platform engineering practices.