-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[Inference] Implementing the completion task type on EIS. #137677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hi @afoucret, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements support for the completion task type in the Elastic Inference Service (EIS). The implementation adapts simple text inputs from the completion task into chat message format and routes them to the existing EIS chat endpoint. Key architectural changes include introducing a distinction between COMPLETION and CHAT_COMPLETION task types, where completion uses a simpler text-based input format while chat completion uses structured message objects.
- Introduces
ElasticInferenceServiceCompletionModelfor the completion task type - Refactors the original completion model to
ElasticInferenceServiceChatCompletionModelto distinguish between task types - Adds request adapters to convert simple text inputs into chat message format
Reviewed Changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| ElasticInferenceServiceCompletionRequestTests.java | Tests for the completion request adapter that converts text inputs to chat messages |
| ElasticInferenceServiceCompletionRequestEntityTests.java | Tests for the XContent serialization of completion request entities |
| ElasticInferenceServiceCompletionModelTests.java | Tests for the completion model including URI creation and service settings |
| ElasticInferenceServiceChatCompletionModelTests.java | Tests for the chat completion model (renamed from original completion model) |
| ElasticInferenceServiceAuthorizationHandlerTests.java | Updates test references to use ElasticInferenceServiceChatCompletionModel |
| ElasticInferenceServiceTests.java | Updates references and adds COMPLETION to supported task types |
| ElasticInferenceServiceUnifiedChatCompletionRequest.java | Updates to use ElasticInferenceServiceChatCompletionModel |
| ElasticInferenceServiceCompletionRequestEntity.java | Entity class for serializing completion requests to JSON |
| ElasticInferenceServiceCompletionRequest.java | Request adapter for COMPLETION task type |
| ElasticInferenceServiceCompletionModel.java | Model class for COMPLETION task type that adapts inputs to chat format |
| ElasticInferenceServiceChatCompletionModel.java | Renamed model class for CHAT_COMPLETION task type |
| ElasticInferenceServiceActionVisitor.java | Adds visitor method for completion model |
| ElasticInferenceServiceActionCreator.java | Implements action creator for completion requests |
| ElasticInferenceServiceUnifiedCompletionRequestManager.java | Updates to use ElasticInferenceServiceChatCompletionModel |
| ElasticInferenceService.java | Adds COMPLETION to supported task types and creates model instances |
| 137677.yaml | Changelog entry for the enhancement |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
...arch/xpack/inference/services/elastic/completion/ElasticInferenceServiceCompletionModel.java
Outdated
Show resolved
Hide resolved
...arch/xpack/inference/services/elastic/completion/ElasticInferenceServiceCompletionModel.java
Outdated
Show resolved
Hide resolved
jonathan-buttner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I left a few suggestions.
|
|
||
| public class ElasticInferenceServiceChatCompletionModel extends ElasticInferenceServiceModel { | ||
|
|
||
| public static ElasticInferenceServiceChatCompletionModel of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, this is the only method that makes this class different from the ElasticInferenceServiceCompletionModel. How about we just use ElasticInferenceServiceCompletionModel and add this method there?
| * Request adapter for COMPLETION task type that converts simple text inputs | ||
| * into chat message format and sends them to the EIS chat endpoint. | ||
| */ | ||
| public class ElasticInferenceServiceCompletionRequest extends ElasticInferenceServiceRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of creating a new class for this, let's follow the OpenAI example here: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/openai/action/OpenAiActionCreator.java#L75-L86
Basically we should be able to leverage the existing ElasticInferenceServiceUnifiedChatCompletionRequest and just construct a request with only a list of input.
| chunkingSettings | ||
| ); | ||
| case CHAT_COMPLETION -> new ElasticInferenceServiceCompletionModel( | ||
| case CHAT_COMPLETION -> new ElasticInferenceServiceChatCompletionModel( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we switch to using a single model class we should be able to follow what OpenAI does here: and have case CHAT_COMPLETION, COMPLETION -> ...
|
@jonathan-buttner: thank you for the careful review. I made the requested change and it looks better now. |
Draft PR to add support for the
completiontask type for Elastic Inference Service.