This repository was archived by the owner on Sep 20, 2025. It is now read-only.

Commit eb8dd41 (1 parent: 0891c99)

doc: add readme and gh-pages (#12)

* doc: add readme and gh-pages
* per-commit fix
* finish complete readme
* docs: verified readme

21 files changed: 220 additions, 221 deletions

.github/workflows/gh-pages.yml (44 additions, 0 deletions)

```yaml
name: GitHub Pages

on:
  push:
    branches:
      - main
    paths:
      - docs/**
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0 # Fetch all history for .GitInfo and .Lastmod

      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          python3 -m pip install mkdocs==1.6.1
          python3 -m pip install mkdocs-material==9.6.5
          python3 -m pip install mkdocs-include-markdown-plugin==7.1.4
          python3 -m pip install mkdocs-macros-plugin==1.3.7
          python3 -m pip install mkdocs-with-pdf==0.9.3
          python3 -m pip install mkdocs-print-site-plugin==2.6.0

      - name: Build mkdocs
        run: |
          mkdocs build -f ./docs/mkdocs.en.yml
          cp -av ./docs/index.html ./docs/site

      - name: Deploy
        uses: peaceiris/actions-gh-pages@v3
        if: ${{ github.ref == 'refs/heads/main' }}
        with:
          github_token: ${{ secrets.ACTION_TOKEN }}
          publish_dir: ./docs/site
          publish_branch: gh-pages
```

.gitignore (1 addition, 0 deletions)

```diff
@@ -127,6 +127,7 @@ cython_debug/
 
 # VS Code
 .vscode/
+docs/site
 
 # data
 emd_models/
```

README.md (66 additions, 39 deletions; replaces the previous installation guide)

<h3 align="center">
Easy Model Deployer - Simple, Efficient, and Easy-to-Integrate
</h3>

---

*Latest News* 🔥

- [2025/03] We officially released EMD! Check out our [blog post](https://vllm.ai).

---

## About

EMD (Easy Model Deployer) is a lightweight tool that simplifies model deployment, built for developers who need reliable and scalable model serving without complex setup.

**Key Features**

- One-click deployment of models to the cloud (Amazon SageMaker, Amazon ECS) or on-premises
- Diverse model types (LLMs, VLMs, embeddings, vision models, etc.)
- Rich inference engines (vLLM, TGI, LMDeploy, etc.)
- Different instance types (CPU/GPU/AWS Inferentia)
- Convenient integration (OpenAI-compatible API, LangChain client, etc.)

**Notes**

- See [Supported Models](docs/supported_models.md) for the complete list.
- The OpenAI-compatible API is supported only for Amazon ECS deployments.
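Because ECS deployments expose an OpenAI-compatible API, existing OpenAI-style clients can talk to them directly. A minimal standard-library sketch of the request shape; the base URL placeholder and the `/v1/chat/completions` path follow the OpenAI API convention and are assumptions here, not confirmed EMD defaults:

```python
import json
import urllib.request

# Hypothetical endpoint of an ECS deployment exposing the OpenAI-compatible API.
BASE_URL = "http://your-ecs-endpoint:8080"  # placeholder, not a real EMD address
PATH = "/v1/chat/completions"               # standard OpenAI path (assumed)

# Standard OpenAI chat-completions request body.
payload = {
    "model": "DeepSeek-R1-Distill-Qwen-1.5B",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}
body = json.dumps(payload).encode()

# Build (but do not send) the request; urllib.request.urlopen(req) would send it.
req = urllib.request.Request(
    BASE_URL + PATH, data=body, headers={"Content-Type": "application/json"}
)
print(req.full_url)
```

Any client library that speaks the OpenAI wire format should work the same way once pointed at the deployment's URL.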
## Getting Started

### Installation

Install EMD with `pip`; currently only Python 3.9 and above are supported:

```bash
curl https://github.com/aws-samples/easy-model-deployer/releases/download/dev/emd-0.6.0-py3-none-any.whl -o emd-0.6.0-py3-none-any.whl && pip install "emd-0.6.0-py3-none-any.whl[all]"
```

Visit our [documentation](https://aws-samples.github.io/easy-model-deployer/) to learn more.

### Usage

#### Set your default AWS profile

```bash
emd config set-default-profile-name
```

Note: If you don't set an AWS profile, EMD uses the default profile from your environment (suitable for temporary credentials). Whenever you want to switch deployment accounts, run `emd config set-default-profile-name` again.

![alt text](docs/images/emd-config.png)

#### Bootstrap the EMD stack

```bash
emd bootstrap
```

Note: This sets up the resources required for model deployment. Whenever you change the EMD version, run this command again.

![alt text](docs/images/emd-bootstrap.png)

#### Deploy a model

Choose deployment parameters interactively with `emd deploy`, or deploy with a single command:

```bash
emd deploy --model-id DeepSeek-R1-Distill-Qwen-1.5B --instance-type g5.8xlarge --engine-type vllm --framework-type fastapi --service-type sagemaker --extra-params {} --skip-confirm
```

Note: List the complete parameters with `emd deploy --help` and find the values of the required parameters [here](docs/en/supported_models.md). When you see "Waiting for model: ...", the deployment task has started; you can quit the current task with Ctrl+C.

![alt text](docs/images/emd-deploy.png)
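For scripted, repeatable deployments, the one-command form above can be assembled programmatically and handed to `subprocess.run`. A small sketch; `build_deploy_cmd` is a hypothetical helper (not part of EMD), and only flags documented above are used:

```python
import shlex

def build_deploy_cmd(model_id: str, instance_type: str, engine: str = "vllm",
                     framework: str = "fastapi", service: str = "sagemaker"):
    """Assemble the non-interactive `emd deploy` argument vector (hypothetical helper)."""
    return ["emd", "deploy",
            "--model-id", model_id,
            "--instance-type", instance_type,
            "--engine-type", engine,
            "--framework-type", framework,
            "--service-type", service,
            "--extra-params", "{}",
            "--skip-confirm"]

cmd = build_deploy_cmd("DeepSeek-R1-Distill-Qwen-1.5B", "g5.8xlarge")
print(shlex.join(cmd))  # pass `cmd` to subprocess.run(cmd) to actually deploy
```

Building the argument list instead of a shell string avoids quoting issues when `--extra-params` later carries real JSON.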
#### Check deployment status

```bash
emd status
```

![alt text](docs/images/emd-status.png)

Note: EMD can run multiple deployment tasks at the same time.

#### Verify the deployment

For quick functional verification, invoke the model directly; check our [documentation](https://aws-samples.github.io/easy-model-deployer/) for integration examples.

```bash
emd invoke DeepSeek-R1-Distill-Qwen-1.5B
```

Note: Find the *ModelId* in the output of `emd status`.

![alt text](docs/images/emd-invoke.png)

#### Delete the deployed model

```bash
emd destroy DeepSeek-R1-Distill-Qwen-1.5B
```

Note: Find the *ModelId* in the output of `emd status`.

## Documentation

For advanced configurations and detailed guides, visit our [documentation site](https://emd-docs.example.com).

## Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

docs/en/about.md (18 additions, 0 deletions)

# About Easy Model Deployer

Easy Model Deployer is a lightweight tool designed to simplify the machine learning model deployment process.

## Features

- Simple deployment workflow
- Support for multiple ML frameworks
- Easy configuration
- Minimal dependencies

## Why Use Easy Model Deployer?

Perfect for developers who want to quickly deploy ML models without dealing with complex infrastructure setup.

## Getting Started

Check our [Usage Guide](usage.md) to start deploying your models in minutes.

docs/en/installation.md (whitespace-only changes)

docs/en/supported_models.md (36 additions, 0 deletions)

| ModelId | ModelSeries | ModelType | Supported Engines | Supported Instances | Supported Services | Support China Region |
|---|---|---|---|---|---|---|
| glm-4-9b-chat | glm4 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-20b-chat-4bit-awq | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-20b-chat | internlm2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-7b-chat | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-7b-chat-4bit | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| internlm2_5-1_8b-chat | internlm2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-7B-Instruct | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-72B-Instruct-AWQ | qwen2.5 | llm | vllm,tgi | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-72B-Instruct | qwen2.5 | llm | vllm | g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-72B-Instruct-AWQ-128k | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-32B-Instruct | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-0.5B-Instruct | qwen2.5 | llm | vllm,tgi | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge,inf2.8xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-1.5B-Instruct | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-3B-Instruct | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-14B-Instruct-AWQ | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2.5-14B-Instruct | qwen2.5 | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| QwQ-32B-Preview | qwen reasoning model | llm | huggingface,vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| llama-3.3-70b-instruct-awq | llama | llm | tgi | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-32B | deepseek reasoning model | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-14B | deepseek reasoning model | llm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-7B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Qwen-1.5B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| DeepSeek-R1-Distill-Llama-8B | deepseek reasoning model | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,sagemaker_async,ecs | |
| deepseek-r1-distill-llama-70b-awq | deepseek reasoning model | llm | tgi,vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Baichuan-M1-14B-Instruct | baichuan | llm | huggingface | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs | |
| Qwen2-VL-72B-Instruct-AWQ | qwen2vl | vlm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | |
| QVQ-72B-Preview-AWQ | qwen reasoning model | vlm | vllm | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | |
| Qwen2-VL-7B-Instruct | qwen2vl | vlm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge,g6e.2xlarge | sagemaker,sagemaker_async | |
| InternVL2_5-78B-AWQ | internvl2.5 | vlm | lmdeploy | g5.12xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async | |
| txt2video-LTX | comfyui | video | comfyui | g5.4xlarge,g5.8xlarge,g6e.2xlarge | sagemaker_async | |
| whisper | whisper | whisper | huggingface | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker_async | |
| bge-base-en-v1.5 | bge | embedding | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker | |
| bge-m3 | bge | embedding | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker,ecs | |
| bge-reranker-v2-m3 | bge | rerank | vllm | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker | |
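Because the table is plain pipe-delimited Markdown, it can be filtered programmatically, e.g. to find models that fit an instance type you already have quota for. A small standard-library sketch; `ROWS` holds three rows copied from the table above, and `models_for_instance` is an illustrative helper, not part of EMD:

```python
# Filter the supported-models table by instance type.
ROWS = """\
| Qwen2.5-7B-Instruct | qwen2.5 | llm | vllm | g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.12xlarge,g5.16xlarge,g5.24xlarge,g5.48xlarge | sagemaker,sagemaker_async,ecs |
| Qwen2.5-72B-Instruct | qwen2.5 | llm | vllm | g5.48xlarge | sagemaker,sagemaker_async,ecs |
| whisper | whisper | whisper | huggingface | g5.xlarge,g5.2xlarge,g5.4xlarge,g5.8xlarge,g5.16xlarge | sagemaker_async |
"""

def models_for_instance(rows: str, instance: str):
    """Return ModelIds whose Supported Instances column contains `instance`."""
    out = []
    for line in rows.strip().splitlines():
        cells = [c.strip() for c in line.strip("|").split("|")]
        model_id, instances = cells[0], cells[4].split(",")
        if instance in instances:
            out.append(model_id)
    return out

print(models_for_instance(ROWS, "g5.48xlarge"))
# → ['Qwen2.5-7B-Instruct', 'Qwen2.5-72B-Instruct']
```

The same parsing loop works on the full table if the remaining rows are pasted into `ROWS`.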

docs/en/usage.md (whitespace-only changes)

docs/images/emd-bootstrap.png (binary, 524 KB)

docs/images/emd-config.png (binary, 32.5 KB)

docs/images/emd-deploy.png (binary, 419 KB)
