# docs/en/CLOSED_NETWORK.md
Options related to closed network mode have the `closedNetwork` prefix. The following options are available:

| Option | Description |
| --- | --- |
| closedNetworkCreateTestEnvironment | Whether to create a test environment. Created by default; specify false if not needed. The test environment is created as an EC2 Windows instance and accessed via Fleet Manager. (Detailed procedures are described later.) |
| closedNetworkCreateResolverEndpoint | Whether to create a Route 53 Resolver endpoint. Default is true. |
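
As a minimal sketch, disabling the test environment while keeping the Resolver endpoint might look like the following, assuming these options are set as CDK context values like the solution's other deployment options (the file location and surrounding keys may differ in your version):

```
{
  "context": {
    "closedNetworkCreateTestEnvironment": false,
    "closedNetworkCreateResolverEndpoint": true
  }
}
```
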
- Deployment must be performed in an environment with internet connectivity. Internet connectivity is also required when accessing the operation verification environment from the management console.
# docs/en/DEPLOY_OPTION.md
This is a use case for integrating with agents created in AgentCore. (Experimental)
Enabling `createGenericAgentCoreRuntime` will deploy the default AgentCore Runtime.
By default, it is deployed to the `modelRegion`, but you can override this by specifying `agentCoreRegion`.

The default agents available in AgentCore can utilize MCP servers defined in [mcp.json](https://github.com/aws-samples/generative-ai-use-cases/blob/main/packages/cdk/lambda-python/generic-agent-core-runtime/mcp.json).
The MCP servers defined by default are AWS-related MCP servers and an MCP server that provides the current time.
For details, please refer to the documentation [here](https://awslabs.github.io/mcp/).
When adding MCP servers, please add them to the aforementioned `mcp.json`.
However, MCP servers launched by commands other than `uvx` require additional development work, such as rewriting the Dockerfile.
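
As an illustrative sketch, adding another `uvx`-launched server to `mcp.json` might look like the following. The layout assumes the common MCP configuration schema (a `mcpServers` map with `command` and `args`), and the two entries are examples rather than the file's actual contents:

```
{
  "mcpServers": {
    "awslabs.aws-documentation-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.aws-documentation-mcp-server@latest"]
    },
    "time": {
      "command": "uvx",
      "args": ["mcp-server-time"]
    }
  }
}
```
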
With `agentCoreExternalRuntimes`, you can use externally created AgentCore Runtimes.
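
This excerpt does not show the configuration syntax for `agentCoreExternalRuntimes`. Purely as a hypothetical sketch, assuming the same CDK context style as the other options and that each entry identifies an existing runtime (the field names `name` and `arn` are guesses, not the documented schema):

```
{
  "context": {
    "createGenericAgentCoreRuntime": true,
    "agentCoreRegion": "us-west-2",
    "agentCoreExternalRuntimes": [
      {
        "name": "my-external-agent",
        "arn": "arn:aws:bedrock-agentcore:us-west-2:123456789012:runtime/example-runtime-id"
      }
    ]
  }
}
```
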
To enable AgentCore use cases, the `docker` command must be executable.
> [!WARNING]
> On Linux machines using x86_64 CPUs (Intel, AMD, etc.), run the following command before cdk deployment:
>
> ```
> docker run --privileged --rm tonistiigi/binfmt --install arm64
> ```
>
> During the deployment process, ARM-based container images used by the AgentCore Runtime are built. Building ARM container images on an x86_64 CPU fails because of the difference in CPU architecture, so if you do not run the above command, an error like the following occurs:
>
> ```
> ERROR: failed to solve: process "/bin/sh -c apt-get update -y && apt-get install curl nodejs npm graphviz -y" did not complete successfully: exit code: 255
> AgentCoreStack: fail: docker build --tag cdkasset-64ba68f71e3d29f5b84d8e8d062e841cb600c436bb68a540d6fce32fded36c08 --platform linux/arm64 . exited with error code 1: #0 building with "default" instance using docker driver
> ```
>
> Running this command makes a temporary configuration change to the host Linux kernel: it registers the QEMU emulator's custom handlers in binfmt_misc (Binary Format Miscellaneous), enabling ARM container image builds. The configuration reverts after a reboot, so the command must be re-executed before re-deploying.
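
To verify that the emulator registration took effect, one quick check is to run a trivial ARM image and confirm the architecture it reports (this uses only standard Docker flags; any small multi-arch image works):

```
docker run --rm --platform linux/arm64 alpine uname -m
# Expected output: aarch64
```
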
## When you want to use Amazon SageMaker custom models
You can use large language models deployed to Amazon SageMaker endpoints. This solution supports SageMaker endpoints that use [Hugging Face's Text Generation Inference (TGI) LLM inference containers](https://aws.amazon.com/blogs/machine-learning/announcing-the-launch-of-new-hugging-face-llm-inference-containers-on-amazon-sagemaker/). Because it uses TGI's [Messages API](https://huggingface.co/docs/text-generation-inference/messages_api), TGI must be version 1.4.0 or later, and the model must support a chat template (a `chat_template` defined in `tokenizer_config.json`). Currently, only text models are supported.
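
Concretely, TGI's Messages API accepts OpenAI-style chat requests, so a compliant endpoint can be invoked with a payload along these lines (the field values are illustrative):

```
{
  "model": "tgi",
  "messages": [
    { "role": "user", "content": "What is deep learning?" }
  ],
  "max_tokens": 256
}
```
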
There are currently two ways to deploy models using TGI containers to SageMaker endpoints.

**Deploy models pre-packaged by AWS with SageMaker JumpStart**

SageMaker JumpStart provides open-source large language models packaged for one-click deployment. Open a model from the JumpStart screen in SageMaker Studio and deploy it by clicking the "Deploy" button.

**Deploy with a few lines of code using the SageMaker SDK**

Through the [partnership between AWS and Hugging Face](https://aws.amazon.com/jp/blogs/news/aws-and-hugging-face-collaborate-to-make-generative-ai-more-accessible-and-cost-efficient/), you can deploy a model by simply specifying the ID of a model published on Hugging Face with the SageMaker SDK.

From a model's page on Hugging Face, select _Deploy_ > _Amazon SageMaker_ to display the code for deploying the model. You can deploy the model by copying and executing this code. (Depending on the model, you may need to change parameters such as the instance size or `SM_NUM_GPUS`. If deployment fails, you can check the logs in CloudWatch Logs.)
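
For reference, the displayed code typically follows the standard Hugging Face-on-SageMaker pattern sketched below; this is not the exact code from any particular model page. The model ID, instance type, and container version are illustrative, and the optional `endpoint_name` simply gives the endpoint a recognizable name for later configuration:

```
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# Illustrative model and settings; adjust to the model's requirements.
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="1.4.0"),
    env={
        "HF_MODEL_ID": "elyza/ELYZA-japanese-Llama-2-7b-instruct",
        "SM_NUM_GPUS": "1",
    },
    role=role,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="elyza-llama-2",  # a distinguishable endpoint name
)
```
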
To use deployed SageMaker endpoints with this solution, specify them as follows:

`endpointNames` is a list of SageMaker endpoint names (for example, `["elyza-llama-2", "rinna"]`). Optionally, you can specify a region for each endpoint.
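
The full configuration block is not shown in this excerpt. As a minimal sketch, assuming the same CDK context style as the solution's other options:

```
{
  "context": {
    "endpointNames": ["elyza-llama-2", "rinna"]
  }
}
```
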