Onboarding to a new codebase shouldn't feel like deciphering ancient hieroglyphics.
GitMate transforms the daunting task of codebase onboarding into an intuitive, interactive experience. By fusing deterministic static analysis (ASTs, LSP) with probabilistic AI reasoning (LLMs, RAG), GitMate provides a complete semantic understanding of any repository. It allows developers to "chat" with their code, visualizes complex dependencies, and accelerates the time-to-understanding by up to 70%.
Modern software repositories are complex, interconnected ecosystems. For new developers, the learning curve is steep and costly:
- High Code Volume, Low Documentation: READMEs rarely capture the intricate runtime behaviors or architectural decisions.
- Invisible Dependencies: Modifying a single function can have cascading effects that static linters miss.
- Inefficient Onboarding: Developers spend nearly 75% of their time reading code versus writing it. The "Time-to-First-Commit" often spans weeks.
- Legacy Black Boxes: Inheriting undocumented legacy code is risky and error-prone without deep contextual understanding.
Who struggles most?
- New Team Members needing to become productive immediately.
- Open Source Contributors navigating massive, unfamiliar projects.
- Maintainers auditing legacy systems or refactoring complex modules.
GitMate bridges the gap between raw code and human understanding:
- Precision Parsing (The Logic): Utilizing Tree-sitter, GitMate constructs a rigorous Abstract Syntax Tree (AST) of the codebase, ensuring every function, class, and variable is indexed with 100% accuracy.
- Semantic Enrichment (The Knowledge): An LSP (Language Server Protocol) client resolves symbol references and call hierarchies, mapping the "connectome" of the software.
- AI Synthesis (The Insight): Large Language Models (Llama 3.3 via Groq) generate human-readable explanations for every entity, stored in a FAISS vector database for semantic retrieval.
- Interactive Exploration: A modern Next.js 16 web dashboard allows users to query the codebase using natural language, visualize data flows, and navigate complex architectures effortlessly.
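To make the parsing stage in the list above more concrete, here is a minimal sketch of symbol indexing. It is not GitMate's implementation: GitMate uses Tree-sitter, while this example swaps in Python's built-in `ast` module so it runs without extra dependencies. The function name `index_symbols`, the record fields, and the sample source are all hypothetical.

```python
import ast


def index_symbols(source: str) -> list[dict]:
    """Return one record per function or class definition found in `source`.

    Stand-in for a Tree-sitter pass: the stdlib `ast` module only handles
    Python, whereas Tree-sitter covers many languages.
    """
    tree = ast.parse(source)
    symbols = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            symbols.append({"name": node.name, "kind": kind, "line": node.lineno})
    # ast.walk does not visit nodes in source order, so sort by line number.
    return sorted(symbols, key=lambda s: s["line"])


# Hypothetical sample input, used only for illustration.
SAMPLE = '''\
class Greeter:
    def greet(self, name):
        return f"Hello, {name}"

def main():
    print(Greeter().greet("world"))
'''

index = index_symbols(SAMPLE)
for entry in index:
    print(entry)
```

Each record pairs a symbol with its kind and line number, which is the sort of entry a later stage could enrich with references or an LLM-generated explanation.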
Experience the power of GitMate through our modern web interface.
- Python 3.13+ (Backend)
- Node.js 18+ & pnpm (Frontend)
- PostgreSQL (Database)
- Ollama (Embeddings) - Install Ollama
- UV (Python Package Manager) - Install UV
git clone https://github.com/bigsparsh/gitmate.git
cd gitmate

cd backend
# Install dependencies using UV
uv sync
# Configure environment
# Create .env file with GROQ_API_KEY and DATABASE_URL

cd frontend
# Install dependencies
pnpm install
# Initialize Database
pnpm prisma generate
pnpm prisma db push

# Pull the embedding model locally
ollama pull nomic-embed-text

For enhanced tracking and call hierarchy features:
# For C/C++ support
sudo apt install clangd # Ubuntu/Debian
brew install llvm # macOS

cd backend
source .venv/bin/activate
uv run server.py
# Server runs at http://localhost:8000

cd frontend
pnpm dev
# Dashboard available at http://localhost:3000

- Open http://localhost:3000 in your browser.
- Enter a GitHub Repository URL to start a new project.
- The system will clone, parse, and analyze the repo in the background.
- Interact with the Chat, File Explorer, or Dependency Graph to understand the code.
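If the backend fails to start, a common first check is whether the `.env` file from the configuration step defines both `GROQ_API_KEY` and `DATABASE_URL`. The sketch below assumes the file uses plain `KEY=VALUE` lines. The helper names `parse_env` and `missing_settings` and the sample values are hypothetical, and this is not how GitMate itself loads its configuration.

```python
REQUIRED_VARS = ("GROQ_API_KEY", "DATABASE_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(text: str) -> list[str]:
    """Return the required variable names that are absent or empty."""
    values = parse_env(text)
    return [name for name in REQUIRED_VARS if not values.get(name)]


# Placeholder content; a real check would read backend/.env from disk.
example_env = """
# GitMate backend settings (placeholder values)
GROQ_API_KEY=your-key-here
"""

print(missing_settings(example_env))
```

To check a real file, pass the result of `open(".env").read()` to `missing_settings`. An empty list means both settings are present.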
GitMate employs a Hybrid Neuro-Symbolic Architecture that combines the deterministic precision of static analysis with the probabilistic reasoning of Large Language Models.
- FastAPI (Python 3.13): High-performance async API server handling WebSocket streams for real-time chat.
- Tree-sitter: Incremental parsing library extracting precise ASTs for C++, Python, TypeScript, and Java.
- LSP Client: A custom Python wrapper interacting with clangd and tsserver via stdio pipes to extract Call Hierarchies and References.
- Vector Store: FAISS (Facebook AI Similarity Search) indexes code chunks using Nomic Embed Text (via Ollama) for local, privacy-focused semantic retrieval.
- Inference: Groq API running Llama 3.3 70B provides near-instantaneous reasoning and code explanation.
- RAG Pipeline: LangChain orchestrates the retrieval of semantic context + AST structure + Call Graph data to ground the LLM's responses in reality.
- Framework: Next.js 16 (App Router) & React 19 for server-side rendering and static generation.
- State & UI: TailwindCSS 4 for styling, Mermaid.js for rendering live dependency graphs, and Prisma ORM for managing user sessions and history.
- Streaming: Server-Sent Events (SSE) and WebSockets ensure a fluid, "typing-like" experience during AI generation.
- PostgreSQL: Stores relational data (Users, Projects, Chat History).
- Relational Integrity: Tracks the lineage of every analysis session and user interaction.
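The retrieval and grounding steps described in the component list above can be sketched in a few lines. This is an illustration under stated assumptions, not GitMate's pipeline. A brute-force cosine-similarity search stands in for FAISS, hand-written 3-dimensional vectors stand in for Nomic Embed Text embeddings, and the prompt is assembled by hand instead of through LangChain. All symbol names, texts, and helper functions (`top_k`, `build_prompt`) are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=2):
    """Rank chunks by similarity to the query (brute-force stand-in for FAISS)."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question, hits, call_graph):
    """Combine retrieved text and call-graph edges into one grounded prompt."""
    lines = ["Answer using only the context below.", "", "Context:"]
    for hit in hits:
        callees = ", ".join(call_graph.get(hit["symbol"], [])) or "none"
        lines.append(f"- {hit['symbol']}: {hit['text']} (calls: {callees})")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


# Toy data: real vectors would come from an embedding model.
CHUNKS = [
    {"symbol": "parse_repo", "text": "Walks the repository and parses each file.", "vector": [0.9, 0.1, 0.0]},
    {"symbol": "embed_chunk", "text": "Turns a code chunk into a vector.", "vector": [0.1, 0.9, 0.1]},
    {"symbol": "render_graph", "text": "Draws the dependency graph.", "vector": [0.0, 0.2, 0.9]},
]
CALL_GRAPH = {"parse_repo": ["embed_chunk"], "embed_chunk": []}

query = [0.8, 0.3, 0.0]  # pretend embedding of the question
hits = top_k(query, CHUNKS, k=2)
prompt = build_prompt("How is a repository parsed?", hits, CALL_GRAPH)
print(prompt)
```

The point of the design is that the model only sees retrieved chunks plus structural facts (here, call edges), so its answer can be traced back to specific symbols.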
gitmate/
├── frontend/
│ ├── app/
│ ├── components/
│ ├── hooks/
│ ├── lib/
│ ├── prisma/
│ ├── public/
│ ├── types/
│ ├── .gitignore
│ ├── README.md
│ ├── components.json
│ ├── eslint.config.mjs
│ ├── instructions.md
│ ├── middleware.ts
│ ├── next.config.ts
│ ├── package.json
│ ├── pnpm-lock.yaml
│ ├── pnpm-workspace.yaml
│ ├── postcss.config.mjs
│ ├── prisma.config.ts
│ ├── tsconfig.json
│
├── backend/
│ ├── assets/
│ ├── instructions.md
│ ├── lsp_client.py
│ ├── main.py
│ ├── pyproject.toml
│ ├── tree-sitter-docs.md
│ └── uv.lock
│
├── README.md
└── .gitignore
- v0.1: Core Tree-sitter + LSP + LLM integration (CLI)
- v0.2: Vector Database Memory & Context Awareness
- v1.0: Full Web Dashboard (Next.js 16) & Streaming Chat
- v1.1: Multi-repo support & Organization workspaces
- v1.2: IDE Extensions for VS Code & JetBrains
- v2.0: Autonomous Refactoring Agents
- Every developer who struggled with a new codebase
- The open-source community's commitment to accessibility
- The vision of AI-augmented development
Made with ❤️ for Developers, by Developers











