Alder Security Scanner

An AI-powered security analysis tool for web applications that combines Large Language Model (LLM) analysis with intelligent agent-based verification to identify and validate security vulnerabilities in your codebase.

πŸ” Overview

Alder Security Scanner uses Google's Gemini AI to perform comprehensive security analysis across multiple vulnerability categories, then employs a sophisticated agent workflow to verify findings and reduce false positives. The scanner provides detailed reports with exploitability assessments, business impact analysis, and actionable remediation steps.

✨ Key Features

  • AI-Powered Analysis: Leverages Google Gemini 2.5 Pro for deep code understanding and vulnerability detection
  • Agent-Based Verification: Multi-stage verification workflow that assesses exploitability, context, and impact
  • Comprehensive Coverage: Analyzes 10 security categories including injection, XSS, authentication, authorization, and more
  • Smart Code Processing: Uses Tree-sitter AST parsing for accurate code chunking and analysis
  • Vector Database Integration: Efficient code retrieval using ChromaDB for contextual analysis
  • Cost Management: Built-in cost tracking and limits to control LLM API usage
  • Multiple Deployment Options: Run locally, in Docker, or integrate into CI/CD pipelines
  • Rich Reporting: Generates detailed Markdown and HTML reports with severity classification

πŸ›‘οΈ Security Categories Analyzed

The scanner performs analysis across these security domains:

  1. Authentication - Weak credentials, session management, password policies
  2. Authorization - Access control, privilege escalation, missing checks
  3. Injection - SQL injection, command injection, SSTI
  4. Cross-Site Scripting (XSS) - Reflected, stored, and DOM-based XSS
  5. Data Protection - Sensitive data exposure, insecure storage/transmission
  6. API Security - Insecure API design, missing authentication/authorization
  7. Configuration - Security misconfigurations in frameworks and servers
  8. Cryptography - Weak algorithms, improper key management, insecure RNG
  9. Client-Side - JavaScript vulnerabilities, insecure data storage
  10. Business Logic - Logic flaws, race conditions, validation bypasses

πŸš€ Quick Start

Prerequisites

  • Docker
  • Google Gemini API key
  • OpenAI API key (for embeddings)

Installation

  1. Clone the repository:

    git clone https://github.com/adamsmith6300/alder-security-scanner.git
    cd alder-security-scanner
  2. Set up environment variables:

    cp .env.example .env
    # Edit .env and add your API keys:
    # GOOGLE_API_KEY=your_google_api_key
    # OPENAI_API_KEY=your_openai_api_key
  3. Build the Docker image:

    docker build -t alder-security-scanner .

πŸ“– Usage

Docker Execution

# Scan a repository using Docker
./local.sh /path/to/your/repository

Docker Script Options

The local.sh script builds and runs the scanner in a Docker container with the following features (a rough equivalent of its docker run step is sketched after the list):

  • Automatic Docker image building: The script builds the image locally on each run
  • Environment variable injection: Your API keys from .env are passed to the container
  • Volume mounting: The target repository and reports directory are mounted for analysis
  • Report generation: Security reports are saved to ./security-reports/ directory

Configuration

You can customize the analysis by modifying the Docker script or the scanner's built-in defaults (limit enforcement is sketched after this list):

  • Maximum tokens: 5,000,000 tokens per scan (configurable)
  • Maximum cost: $5.00 USD per scan (configurable)
  • Ignore patterns: Common directories like node_modules, dist, build are automatically ignored

πŸ“Š Example Usage Scenarios

1. Quick Security Assessment

# Fast scan of a small project
./local.sh ./my-web-app

2. Comprehensive Enterprise Scan

# Full scan of a large codebase
./local.sh ./enterprise-app

3. Scanning Remote Repositories

# Clone and scan a repository
git clone https://github.com/user/repo.git
./local.sh ./repo

⏱️ Performance & Cost

  • Analysis Time: Scans typically take 5-10 minutes for medium-sized codebases, with larger repositories potentially taking longer
  • Cost Estimates:
    • Small repositories (<500K tokens): Usually under $1
    • Medium repositories (~2M tokens): $1-$4
    • Large repositories (>5M tokens): $5+ (costs are automatically limited to $5.00 by default)
  • Token Limits: Default maximum of 5M tokens per scan with automatic cost management

πŸ—οΈ Architecture

The scanner follows a multi-stage analysis pipeline:

  1. Repository Preparation: Scans and filters files based on ignore patterns
  2. Code Processing: Uses Tree-sitter for AST-based code chunking
  3. Vector Indexing: Creates embeddings for efficient code retrieval
  4. LLM Analysis: Gemini analyzes code chunks for each security category
  5. Agent Verification: A multi-agent workflow verifies findings (a sketch follows this list):
    • Exploitability Agent: Assesses if vulnerabilities are exploitable
    • Context Agent: Analyzes attack scenarios and risk levels
    • Impact Agent: Evaluates business impact and consequences
    • Synthesis Agent: Combines analyses and provides final recommendations
  6. Report Generation: Creates comprehensive reports with verified findings
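
A sketch of that verification chain under stated assumptions: call_llm is a placeholder for the Gemini call, and the prompts are illustrative, not the scanner's actual prompts:

```python
from dataclasses import dataclass, field

def call_llm(prompt: str) -> str:
    """Placeholder for the Gemini call; returns the model's text reply."""
    return "stub analysis"  # swap in a real LLM client here

@dataclass
class Finding:
    category: str
    file: str
    snippet: str
    verdicts: dict[str, str] = field(default_factory=dict)

# One prompt per specialist agent (illustrative wording).
AGENTS = {
    "exploitability": "Is this finding exploitable by an attacker? Explain.",
    "context": "Describe realistic attack scenarios and the overall risk level.",
    "impact": "Assess the business impact if this finding is exploited.",
}

def verify(finding: Finding) -> Finding:
    # Each specialist agent examines the same finding independently...
    for name, question in AGENTS.items():
        prompt = f"{question}\n\nFile: {finding.file}\n\n{finding.snippet}"
        finding.verdicts[name] = call_llm(prompt)
    # ...then a synthesis agent combines their answers into a final verdict.
    combined = "\n".join(f"{k}: {v}" for k, v in finding.verdicts.items())
    finding.verdicts["synthesis"] = call_llm(
        "Combine these analyses into a final recommendation:\n" + combined
    )
    return finding
```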

πŸ“– For detailed architecture documentation, see Architecture.md, which explains the system design, component interactions, data flow, and extensibility patterns.

πŸ“‹ Supported File Types

The scanner analyzes these programming languages and file types:

  • Python (.py)
  • JavaScript (.js, .jsx)
  • TypeScript (.ts, .tsx)
  • CoffeeScript (.coffee)
  • HTML (.html)
  • CSS (.css)
  • Java (.java)
  • PHP (.php)
  • Ruby (.rb)
  • Go (.go)
  • C/C++ (.c, .cpp, .h)
  • C# (.cs)
  • Swift (.swift)
  • Rust (.rs)
  • Shell Scripts (.sh)
  • Configuration Files (.json, .yaml, .yml, .xml)
  • SQL (.sql)
  • GraphQL (.graphql)

🚫 Limitations

  • Language Support: Best results with Python, JavaScript, and TypeScript; other languages have basic support
  • Context Window: Limited by LLM context windows; very large files may be truncated
  • API Dependencies: Requires internet connection and valid API keys for Gemini and OpenAI
  • Cost Considerations: Analysis costs scale with codebase size and complexity
  • False Positives: While agent verification reduces false positives, manual review is still recommended
  • Static Analysis Only: Does not perform dynamic analysis or runtime testing

πŸ“Š Report Format

The scanner generates detailed reports including:

  • Executive Summary: Overview of findings and verification status
  • Severity Classification: Critical, High, Medium, Low, Informational
  • Exploitability Assessment: Exploitable, Not Exploitable, Uncertain
  • Detailed Findings: Each finding includes:
    • File location and line numbers
    • Code snippets
    • Exploitability analysis with confidence scores
    • Risk and impact assessment
    • Attack scenarios
    • Remediation steps
    • CWE mappings where applicable

Example Report Output

Here's a sample of what the generated security report looks like:

# Security Analysis Report: MyApp
*Generated: 2025-05-27 20:27:09 UTC*

## Executive Summary

This report summarizes potential security findings identified through Large Language Model (LLM) analysis and verified through an AI agent workflow.

### Verification Summary

- **Total Findings**: 33
- **Agent Verified**: 33
- **Exploitable**: 25
- **Not Exploitable**: 3
- **Uncertain**: 5

### Findings Summary

| Severity      | Code Findings | Exploitable | Not Exploitable | Uncertain |
|---------------|---------------|-------------|-----------------|-----------|
| Critical      | 18            | 18          | 0               | 0         |
| High          | 11            | 7           | 1               | 3         |
| Medium        | 3             | 0           | 1               | 2         |
| Low           | 1             | 0           | 1               | 0         |

## Detailed Findings

### Critical Findings

#### 1. SQL injection vulnerability in the `/items/product/:id` endpoint [P0-Critical] πŸ”΄ Exploitable
**Source:** Category: injection
**File:** `app.js:18`
**CWE:** CWE-89
**Verification Status:** Verified By Agent Workflow

**Description:**
SQL injection vulnerability allows attackers to execute arbitrary SQL commands due to unsanitized user input in the 'id' parameter.

**πŸ” Exploitability Analysis:**
- **Status:** Exploitable
- **Confidence:** 90%
- **Reasoning:** The `id` parameter is directly concatenated into the SQL query without sanitization, allowing arbitrary SQL injection.
- **Data Source Analysis:** User-controlled input from URL path parameter.

**πŸ“Š Risk & Impact Analysis:**
- **Risk Level:** High
- **Business Impact:** Critical
- **Attack Scenario:** Attacker can send crafted requests to execute arbitrary SQL commands
- **Potential Consequences:**
  - Complete database compromise
  - Unauthorized data access and modification
  - Service disruption

**Code Snippet:**
```javascript
app.get('/items/product/:id', (req, res) => {
  const id = req.params.id;
  const sql = `SELECT * FROM items WHERE id = ${id}`;
  // Vulnerable: direct concatenation of user input
  // ...
});
```

**πŸ”§ Remediation Steps:**

1. Use parameterized queries or prepared statements
2. Implement input validation for the 'id' parameter
3. Apply principle of least privilege to database connections

**πŸ€– AI Analysis Summary:** High-confidence exploitable vulnerability with critical business impact. Immediate remediation required.
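
The sample's first remediation step calls for parameterized queries; for reference, that fix looks like this in Python with the standard sqlite3 module (the sample app itself is JavaScript):

```python
import sqlite3

def get_item(conn: sqlite3.Connection, item_id: str):
    # The "?" placeholder binds item_id as data, so crafted input
    # can never change the structure of the query itself.
    cur = conn.execute("SELECT * FROM items WHERE id = ?", (item_id,))
    return cur.fetchall()
```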



πŸ”§ Configuration

Environment Variables

```bash
# Required
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key

# Optional (for GitHub integration)
GITHUB_TOKEN=your_github_token
GITHUB_REPOSITORY=owner/repo-name
INPUT_CREATE_ISSUES=true
```

Ignore Patterns

The scanner automatically ignores common directories and files (a minimal filtering sketch follows the lists below). You can customize this by:

  1. Using --extra-ignore-dirs for additional directories
  2. Modifying src/repository/ignore_patterns.py for permanent changes

Default ignored directories include:

  • Version control: .git, .svn
  • Dependencies: node_modules, vendor, venv
  • Build artifacts: dist, build, target
  • IDE files: .vscode, .idea
  • And many more...

πŸ”— GitHub Action Integration

While this repository doesn't include a pre-built GitHub Action, you can easily integrate the scanner into your CI/CD pipeline by:

  1. Publishing the Docker image to a container registry
  2. Creating a custom GitHub Action that uses the published image
  3. Using the provided Docker configuration as a starting point

The scanner includes GitHub integration features for creating issues from findings when appropriate environment variables are set.
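
For illustration, opening an issue from a verified finding with the GitHub REST API could look like the sketch below (it assumes the requests library; the scanner's own integration code may differ):

```python
import os
import requests

def create_issue(title: str, body: str) -> int:
    """Open a GitHub issue for a finding via POST /repos/{owner}/{repo}/issues."""
    repo = os.environ["GITHUB_REPOSITORY"]  # e.g. "owner/repo-name"
    resp = requests.post(
        f"https://api.github.com/repos/{repo}/issues",
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        json={"title": title, "body": body, "labels": ["security"]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["number"]
```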

🀝 Contributing

We welcome contributions! To get started:

  1. Understand the Architecture: Read the Architecture.md documentation to understand the system design, component structure, and data flow. This is essential for making meaningful contributions.

  2. Review the Codebase: The architecture documentation provides a roadmap to navigate the codebase effectively.

  3. Make a pull request!

Key Areas for Contribution

  • New Security Categories: Add additional vulnerability detection categories
  • Language Support: Extend support for additional programming languages
  • Agent Workflows: Enhance or create new verification agent workflows
  • LLM Providers: Add support for additional LLM providers (Claude, GPT-4, etc.)
  • Report Formats: Create new report output formats
  • Performance Optimizations: Improve analysis speed and cost efficiency

πŸ“„ License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

πŸ†˜ Support


⚠️ Disclaimer: This tool is designed to assist in security analysis but should not be the only security measure. Always perform manual security reviews and testing in addition to automated scanning.
