Commit ada6755

Initial open source import: AWS IDP Pipeline
Author: yunwoong
1 parent f025593 commit ada6755

459 files changed, +108569 -9 lines changed

.DS_Store

6 KB
Binary file not shown.

CHANGELOG

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+# Changelog
+## v0.3.0
+Added smart search functionality using hybrid search
+Added document verification feature for comparing documents
+
+## v0.2.0
+---
+Added WebSocket functionality
+Fixed ECS deployment issues
+
+## v0.1.0
+---
+- initial project setup and configuration
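
Note on the v0.3.0 entry: the README in this commit describes hybrid search as semantic plus keyword search, but the visible diff does not show how the two result lists are merged. The following is only a minimal Python sketch of one common merging technique, reciprocal rank fusion (RRF). The function name, the constant `k = 60`, and the document IDs are illustrative assumptions, not code from this repository.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranked list.

    Each document scores 1 / (k + rank) per list it appears in; the
    scores are summed across lists. `k` is a smoothing constant
    (60 is a conventional default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from keyword search, one from vector search.
keyword_hits = ["doc-3", "doc-1", "doc-7"]
semantic_hits = ["doc-1", "doc-5", "doc-3"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents that rank well in both lists (`doc-1`, `doc-3`) rise above documents that appear in only one, which is the behavior a "meaning + precision" search would want before any reranking step.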

README.md

Lines changed: 138 additions & 9 deletions
@@ -1,17 +1,146 @@
-## My Project
+<div align="center">
+<img src="docs/assets/logo/logo.png" alt="AWS IDP Logo" width="100"/>
+</div>
 
-TODO: Fill this README out!
 
-Be sure to:
 
-* Change the title in this README
-* Edit your repository description on GitHub
+<h2 align="center"> AWS IDP </h2>
 
-## Security
+<div align="center">
+<img src="https://img.shields.io/badge/Nx-20.6.4-143055?logo=nx&logoColor=white"/>
+<img src="https://img.shields.io/badge/AWS-Bedrock-FF9900?logo=amazon&logoColor=white"/>
+<img src="https://img.shields.io/badge/Python-3.12-3776AB?logo=python"/>
+<img src="https://img.shields.io/badge/Next.js-15.3-black?logo=next.js"/>
+<img src="https://img.shields.io/badge/React-19-blue?logo=react"/>
+</div>
 
-See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
+---
 
-## License
+## Overview
 
-This library is licensed under the MIT-0 License. See the LICENSE file.
+**AWS IDP** is an **AI-powered Intelligent Document Processing solution** designed for unstructured data.
 
+Transform unstructured data into actionable insights with our advanced AI-powered IDP: Analyze **documents, videos, audio files, and images** with unprecedented accuracy and speed.
+
+<div align="center">
+<img src="docs/assets/main-screen.png" alt="AWS IDP Main Screen" width="900"/>
+</div>
+
+---
+
+### Application Demo
+
+For a visual walkthrough of the application's key features, see the [**Application Demo**](docs/demo.md).
+
+---
+
+### Key Features
+
+**Multi-Modal Unstructured Data Processing**
+
+- **Document Processing**
+Content extraction, key data summarization, and layout analysis
+- **Video Analysis**
+Scene detection, chaptering, and transcript generation
+- **Audio Analysis**
+Speech-to-text, speaker identification
+- **Image Understanding**
+Object, scene, and text recognition
+
+**AI-Powered Automation**
+
+- **Bedrock Data Automation (BDA)**: Fast, scalable OCR + analysis
+- **ReAct Agent-based Workflow**: Adaptive tool orchestration for any file type
+- **Iterative Reasoning**: Multi-step refinement for accurate outputs
+
+**Hybrid Search System**
+
+- **Semantic + Keyword Search**: Meaning + precision combined
+- **Vector Indexing with OpenSearch**
+- **Real-time Reranking** for best matches
+
+**Conversational AI Interface**
+
+- **MCP Server-based Chatbot**: Natural language Q&A across all content
+- **Contextual Conversation**: Multi-turn dialogue with memory
+- **Domain-Specific Language Support**
+
+---
+
+### Documentation
+
+- Analysis Pipeline Guide: [Kor](docs/analysis_pipeline_kr.md) / [Eng](docs/analysis_pipeline.md)
+- Agent Usage Guide: [Kor](docs/agents_usage_kr.md) / [Eng](docs/agents_usage.md)
+
+---
+
+### System Architecture
+
+See [**Backend System Architecture Overview**](docs/ARCHITECTURE_OVERVIEW.md) for details.
+
+<div align="center">
+<img src="docs/assets/architecture.png" alt="AWS IDP Architecture" width="900"/>
+</div>
+
+---
+
+## Repository Structure
+
+```bash
+aws-idp/
+├── packages/
+│   ├── frontend/                # Next.js + React user interface
+│   ├── backend/                 # FastAPI + MCP Tools + ReAct agent
+│   ├── infra/                   # AWS CDK-based infrastructure
+│   │   ├── .toml                # Infrastructure configuration file (e.g., dev.toml)
+│   │   ├── deploy-infra.sh      # Deploys core infrastructure (VPC, S3, etc.)
+│   │   └── deploy-services.sh   # Optional: Deploys services like ECS and ALB
+│   └── results/                 # Analysis results
+├── docs/                        # Documentation and diagrams
+└── .env                         # Auto-generated env vars
+```
+
+---
+
+## Quick Start
+
+Getting started with AWS IDP involves setting up your environment, deploying the necessary cloud infrastructure, and running the application. Choose one of the following environment setup methods.
+
+### **Environment Setup**
+
+You can set up your development environment in one of two ways. The Devcontainer method is recommended for a consistent and automated experience.
+
+#### Quick Deployment & Destroy (CloudShell + CodeBuild) — Recommended First
+
+- Quick Deployment: [Guide](docs/quick_deploy.md) / [Kor](docs/quick_deploy_kr.md)
+- Quick Destroy: [Guide](docs/quick_destory.md) / [Kor](docs/quick_destroy_kr.md)
+
+Use these scripts to deploy or remove the stack quickly without local setup. They run from AWS CloudShell and execute via CodeBuild.
+
+| Method | Description | Guide |
+| --------------------------- | ------------------------------------------------------------------------------------------------------- | ------------------------------------------------- |
+| **Manual Local Setup** | Manually install dependencies on your local machine. For advanced users or specific needs. | [**Manual Setup Guide**](docs/manual_setup.md) / [**Kor**](docs/manual_setup_kr.md) |
+| **Devcontainer** | A fully containerized environment with all dependencies pre-installed. Requires Docker and VS Code. | [**Devcontainer Setup Guide**](docs/devcontainer_setup.md) / [**Kor**](docs/devcontainer_setup_kr.md) |
+
+After setting up your environment using one of the guides above, proceed with the infrastructure deployment.
+
+---
+
+## **Configuration**
+
+- Infrastructure: packages/infra/.toml
+- Env vars: .env (auto-generated)
+
+---
+
+## **Technology Stack**
+
+- **Infrastructure**: AWS CDK, Lambda, DynamoDB, S3, OpenSearch, Step Functions, Bedrock, BDA, SQS
+- **Backend**: Python, FastAPI, MCP Tools, ReAct Agent Pattern
+- **Frontend**: Next.js 15, React 19, TypeScript
+
+------
+
+## **License**
+
+This project is licensed under the [Amazon Software License](https://aws.amazon.com/asl/).
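
Note on the README's "ReAct Agent-based Workflow" and "Iterative Reasoning" items: the visible diff does not include the agent code, so the following is only a minimal Python sketch of the general ReAct pattern (reason, act with a tool, observe, repeat). Everything in it is hypothetical: `run_react`, `scripted_policy`, the `page_count` tool, and the `PAGES` table are stand-ins, and the scripted policy takes the place of what would be an LLM call in a real agent.

```python
# Hypothetical lookup table standing in for a real document store.
PAGES = {"report.pdf": 12}


def run_react(policy, tools, question, max_steps=5):
    """Run a reason/act/observe loop until the policy finishes.

    `policy` returns ("act", tool_name, tool_input) to call a tool,
    or ("finish", answer) to stop. Observations are appended to the
    history so that later steps can refine earlier ones.
    """
    history = []  # list of (tool_name, tool_input, observation)
    for _ in range(max_steps):
        step = policy(question, history)
        if step[0] == "finish":
            return step[1]
        _, tool_name, tool_input = step
        observation = tools[tool_name](tool_input)
        history.append((tool_name, tool_input, observation))
    return "no answer within step budget"


def scripted_policy(question, history):
    """Stand-in for an LLM: call one tool, then answer from its result."""
    if not history:
        return ("act", "page_count", question)
    return ("finish", f"The document has {history[-1][2]} pages.")


tools = {"page_count": lambda name: str(PAGES.get(name, 0))}

answer = run_react(scripted_policy, tools, "report.pdf")
print(answer)
```

The `max_steps` budget is a design choice in this sketch: it guarantees the loop terminates even if the policy never decides to finish.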

THIRD-PARTY-LICENSES.md

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
+# Third-Party Libraries and Licenses
+
+This project uses the following third-party libraries and their respective licenses:
+
+## Backend Dependencies (Python)
+
+| Name | Version | License |
+|------|---------|---------|
+| PyYAML | 6.0.2 | MIT License |
+| Pygments | 2.19.2 | BSD License |
+| aiofiles | 24.1.0 | Apache Software License |
+| aiosqlite | 0.21.0 | MIT License |
+| annotated-types | 0.7.0 | MIT License |
+| anyio | 4.9.0 | MIT License |
+| appnope | 0.1.4 | BSD License |
+| arize-phoenix-otel | 0.12.1 | Elastic-2.0 |
+| asttokens | 3.0.0 | Apache 2.0 |
+| aws-xray-sdk | 2.14.0 | Apache Software License |
+| aws_lambda_powertools | 3.16.0 | MIT License; MIT No Attribution License (MIT-0) |
+| certifi | 2025.7.9 | Mozilla Public License 2.0 (MPL 2.0) |
+| charset-normalizer | 3.4.2 | MIT License |
+| click | 8.2.1 | BSD License |
+| colorama | 0.4.6 | BSD License |
+| comm | 0.2.2 | BSD License |
+| debugpy | 1.8.14 | MIT License |
+| decorator | 5.2.1 | BSD License |
+| executing | 2.2.0 | MIT License |
+| fastapi | 0.116.0 | MIT License |
+| googleapis-common-protos | 1.70.0 | Apache Software License |
+| grpcio | 1.73.1 | Apache Software License |
+| h11 | 0.16.0 | MIT License |
+| httpcore | 1.0.9 | BSD License |
+| httpx | 0.28.1 | BSD License |
+| httpx-sse | 0.4.1 | MIT |
+| idna | 3.10 | BSD License |
+| importlib_metadata | 8.7.0 | Apache Software License |
+| ipykernel | 6.29.5 | BSD License |
+| ipython | 9.4.0 | BSD License |
+| ipython_pygments_lexers | 1.1.1 | BSD License |
+| jedi | 0.19.2 | MIT License |
+| jsonpatch | 1.33 | BSD License |
+| jsonpointer | 3.0.0 | BSD License |
+| jupyter_client | 8.6.3 | BSD License |
+| langchain-aws | 0.2.27 | MIT License |
+| langchain-core | 0.3.68 | MIT |
+| matplotlib-inline | 0.1.7 | BSD License |
+| mcp | 1.10.1 | MIT License |
+| nest-asyncio | 1.6.0 | BSD License |
+| numpy | 2.3.1 | BSD License |
+| openinference-instrumentation | 0.1.34 | Apache Software License |
+| openinference-instrumentation-langchain | 0.1.46 | Apache Software License |
+| openinference-semantic-conventions | 0.1.21 | Apache Software License |
+| opentelemetry-instrumentation | 0.56b0 | Apache Software License |
+| orjson | 3.10.18 | Apache Software License; MIT License |
+| ormsgpack | 1.10.0 | Apache Software License; MIT License |
+| packaging | 24.2 | Apache Software License; BSD License |
+| parso | 0.8.4 | MIT License |
+| pexpect | 4.9.0 | ISC License (ISCL) |
+| platformdirs | 4.3.8 | MIT License |
+| prompt_toolkit | 3.0.51 | BSD License |
+| protobuf | 6.31.1 | 3-Clause BSD License |
+| psutil | 7.0.0 | BSD License |
+| ptyprocess | 0.7.0 | ISC License (ISCL) |
+| pure_eval | 0.2.3 | MIT License |
+| pydantic | 2.11.7 | MIT License |
+| pydantic-settings | 2.10.1 | MIT License |
+| pydantic_core | 2.33.2 | MIT License |
+| python-dateutil | 2.9.0.post0 | Apache Software License; BSD License |
+| python-dotenv | 1.1.1 | BSD License |
+| python-multipart | 0.0.20 | Apache Software License |
+| pyzmq | 27.0.0 | BSD License |
+| requests | 2.32.4 | Apache Software License |
+| requests-toolbelt | 1.0.0 | Apache Software License |
+| rpds-py | 0.26.0 | MIT |
+| six | 1.17.0 | MIT License |
+| sniffio | 1.3.1 | Apache Software License; MIT License |
+| sqlite-vec | 0.1.6 | MIT License, Apache License, Version 2.0 |
+| stack-data | 0.6.3 | MIT License |
+| starlette | 0.46.2 | BSD License |
+| tenacity | 9.1.2 | Apache Software License |
+| tornado | 6.5.1 | Apache Software License |
+| traitlets | 5.14.3 | BSD License |
+| typing-inspection | 0.4.1 | MIT License |
+| uvicorn | 0.35.0 | BSD License |
+| wrapt | 1.17.2 | BSD License |
+| xxhash | 3.5.0 | BSD License |
+| zstandard | 0.23.0 | BSD License |
+
+## Frontend Dependencies (JavaScript/TypeScript)
+
+| Name | Version | License |
+|------|---------|---------|
+| next | 15.3.3 | MIT License |
+| react | 19.0.0 | MIT License |
+| react-dom | 19.0.0 | MIT License |
+| tailwindcss | 4 | MIT License |
+
+---
+
+## License Information
+
+This project is compliant with all third-party license requirements. Please refer to the individual license files or documentation for each library for more detailed license terms and conditions.
+
+### Key License Types Used:
+- **MIT License**: Permissive license allowing commercial use, modification, and distribution
+- **BSD License**: Permissive license similar to MIT with additional attribution requirements
+- **Apache Software License**: Permissive license with explicit patent grants
+- **Mozilla Public License 2.0**: Copyleft license requiring derivative works to maintain same license
+- **ISC License**: Permissive license functionally equivalent to MIT
+- **Elastic-2.0**: Source-available license with certain commercial use restrictions
+
+For any questions regarding licensing or compliance, please contact the project maintainers.
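
Note on THIRD-PARTY-LICENSES.md: a three-column table like the one above can be tallied by license to spot entries that deserve a closer look, such as Elastic-2.0 or MPL 2.0. The following is a minimal Python sketch under stated assumptions: `tally_licenses` is a hypothetical helper, not part of this repository, and the sample rows are a small excerpt copied from the table. It tolerates a leading `+` so that it works on diff lines as well as on the file itself, and it splits dual-license cells on `;` only.

```python
from collections import Counter


def tally_licenses(markdown):
    """Count license names in a `| Name | Version | License |` table."""
    counts = Counter()
    for line in markdown.splitlines():
        # Drop an optional diff "+" prefix, then the outer pipes.
        row = line.strip().lstrip("+").strip().strip("|")
        cells = [cell.strip() for cell in row.split("|")]
        if len(cells) != 3:
            continue  # not a table row
        if cells[0] == "Name" or set(cells[0]) <= {"-"}:
            continue  # header or separator row
        # A dual-licensed package counts once per license.
        for name in cells[2].split(";"):
            counts[name.strip()] += 1
    return counts


# Excerpt of rows from the table above.
sample = """\
| Name | Version | License |
|------|---------|---------|
| PyYAML | 6.0.2 | MIT License |
| certifi | 2025.7.9 | Mozilla Public License 2.0 (MPL 2.0) |
| orjson | 3.10.18 | Apache Software License; MIT License |
| arize-phoenix-otel | 0.12.1 | Elastic-2.0 |
"""

counts = tally_licenses(sample)
print(counts.most_common())
```

Because the table spells some licenses more than one way (for example `MIT` and `MIT License`), a real audit would need to normalize names before counting; this sketch counts the strings as written.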
