The purpose of this project is to provide simple, minimal examples of custom OpenSearch plugins. These plugins are designed to:
- Show the basic structure of different types of plugins.
- Demonstrate how to extend OpenSearch with custom functionalities.
- Serve as a starting point for developers exploring plugin development.
This plugin provides a custom analyzer filter that extends the default ASCII folding functionality in OpenSearch.
The concept is simple: it performs ASCII folding on every token, except for the specific tokens (words) listed in the rules block.
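Each rule has the form "variant1, variant2, ... => canonical": every listed variant is rewritten to the canonical form instead of being ASCII folded, and a rule that maps a token to itself simply protects it from folding. For example, the rules used later in this README look like this:
"rules": [
"paté, pâté, patè, pàté => paté",
"dés => dés"
]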
Start by compiling the code with mvn clean package
You should then see the zipped package at analyzer-plugin/target/analyzer-plugin-${VERSION}.zip
Install the plugin to OpenSearch using the following command:
$OS_HOME/bin/opensearch-plugin install --batch file:///analyzer-plugin/target/analyzer-plugin-${VERSION}.zip
If you plan to run OpenSearch with Docker, you can install the plugin with this minimal Dockerfile:
FROM opensearchproject/opensearch:3.2.0
ARG plugins_version
# needed on Mac (ARM) environments; comment out otherwise
ENV _JAVA_OPTIONS="-XX:UseSVE=0"
COPY libs/analyzer-plugin-1.0-SNAPSHOT.zip analyzer-plugin.zip
RUN /usr/share/opensearch/bin/opensearch-plugin install --batch "file://$(pwd)/analyzer-plugin.zip"
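For example, assuming the built zip was copied into a libs/ directory next to the Dockerfile, you could build and run the image like this (image name and password are placeholders):
docker build -t opensearch-analyzer-demo .
docker run -p 9200:9200 -e "discovery.type=single-node" -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=<a strong password>" opensearch-analyzer-demo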
After installing the plugin, here is how to declare a smart ASCII folding analyzer in your index settings:
"analysis": {
"analyzer": {
"myanalyzer": {
"tokenizer": "standard",
"filter": [
"custom_smartasciifolding"
]
}
},
"filter": {
"custom_smartasciifolding": {
"type": "smartasciifolding",
"rules": [
"paté, pâté, patè, pàté => paté",
"dés => dés"
]
}
}
}
}
In this example, all tokens will be ASCII folded, except for "paté", "pâté", "patè" and "pàté", which will all be normalized to "paté", and "dés", which won't be folded at all.
So an input like "Ce matin après avoir chargé mon téléphone je suis allé au marché acheter du pâté et des dés de fromage" would output the following tokens: ["ce", "matin", "apres", "avoir", "charge", "mon", "telephone", "je", "suis", "alle", "au", "marche", "acheter", "du", "paté", "et", "des", "dés", "de", "fromage"]
You can test your analyzer very simply. Start by creating a test index:
PUT /test_analyzer
{
"settings": {
"analysis": {
"analyzer": {
"myanalyzer": {
"tokenizer": "standard",
"filter": [
"custom_smartasciifolding"
]
}
},
"filter": {
"custom_smartasciifolding": {
"type": "smartasciifolding",
"rules": [
"paté, pâté, patè, pàté => paté",
"dés => dés"
]
}
}
}
}
}
You can then test your input like this:
GET /test_analyzer/_analyze
{
"analyzer": "myanalyzer",
"text": "Ce matin après avoir chargé mon téléphone je suis allé au marché acheter du pâté et des dés de fromage"
}
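If the plugin is installed correctly, the response should contain the token stream listed above; an abridged version looks roughly like this (offsets shown are illustrative):
{
"tokens": [
{ "token": "ce", "start_offset": 0, "end_offset": 2, "type": "<ALPHANUM>", "position": 0 },
{ "token": "matin", "start_offset": 3, "end_offset": 8, "type": "<ALPHANUM>", "position": 1 },
...
{ "token": "paté", "start_offset": 76, "end_offset": 80, "type": "<ALPHANUM>", "position": 14 },
...
{ "token": "dés", "start_offset": 88, "end_offset": 91, "type": "<ALPHANUM>", "position": 17 },
...
]
}
Note that "paté" and "dés" keep their accents while every other token is folded.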
Once you are satisfied with the results, you can use it like any other analyzer:
PUT /my-index
{
"settings": {
"analysis": {
"analyzer": {
"myanalyzer": {
"tokenizer": "standard",
"filter": [
"custom_smartasciifolding"
]
}
},
"filter": {
"custom_smartasciifolding": {
"type": "smartasciifolding",
"rules": [
====> put your rules here <=====
]
}
}
}
},
"mappings": {
"properties": {
"cat": { # <======= field where you want to use the smartasciifolding
"type": "text",
"analyzer": "myanalyzer" # add this line
},
# other fields
}
}
}
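As a quick sanity check (not part of the plugin itself), a plain match query on the cat field should now match accent-free input against the folded tokens, for example:
GET /my-index/_search
{
"query": {
"match": {
"cat": "telephone"
}
}
}
Because "téléphone" is folded to "telephone" at both index and search time, this query matches documents containing "téléphone", while protected tokens such as "dés" keep their accented form.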
This plugin provides a custom OpenSearch ingest processor that enriches documents at index time by performing lookups into other indices. Based on a configurable joinKey and a set of lookup indices, the processor fetches fields from the reference indices and adds them to the document before it is stored, making enrichment an integrated part of the ingestion pipeline.
Start by compiling the code with mvn clean package
You should then see the zipped package at lookup-plugin/target/lookup-plugin-${VERSION}.zip
Install the plugin to OpenSearch using the following command:
$OS_HOME/bin/opensearch-plugin install --batch file:///lookup-plugin/target/lookup-plugin-${VERSION}.zip
If you plan to run OpenSearch with Docker, you can install the plugin with this minimal Dockerfile:
FROM opensearchproject/opensearch:3.2.0
ARG plugins_version
# needed on Mac (ARM) environments; comment out otherwise
ENV _JAVA_OPTIONS="-XX:UseSVE=0"
COPY libs/lookup-plugin-1.0-SNAPSHOT.zip lookup-plugin.zip
RUN /usr/share/opensearch/bin/opensearch-plugin install --batch "file://$(pwd)/lookup-plugin.zip"
In our example, we want to add the price and quantity of each product, which are stored in .product_prices and .product_availability respectively, with the product field asin serving as the id for all three indices:
PUT _ingest/pipeline/shopping_products_enrich
{
"description": "Enrich shopping_products docs",
"processors": [
{
"lookup_processor": {
"joinKey": "asin",
"lookups": {
".product_prices": ["price"],
".product_availability": ["qty"]
}
}
}
]
}
This will, during indexing, send requests to both .product_prices and .product_availability with the current document's asin as the id and retrieve price and qty, respectively. NOTE: The plugin is able to retrieve fields / objects at any level of a document.
To test the lookup plugin, you will need at least one lookup index and some data that has matches in it.
Keeping our previous example, we can manually create a few entries in a .product_prices index:
PUT .product_prices/_doc/myid1
{
"price": 10.5
}
PUT .product_prices/_doc/myid2
{
"price": 15
}
PUT .product_prices/_doc/myid3
{
"price": 20
}
Now we will set up our processor:
PUT _ingest/pipeline/shopping_processor
{
"description": "Enrich shopping_products docs",
"processors": [
{
"lookup_processor": {
"joinKey": "asin",
"lookups": {
".product_prices": ["price"]
}
}
}
]
}
Now we can create our index, specifying the pipeline we just created as its default pipeline:
PUT my_products/
{
"settings": {
"index": {
"default_pipeline": "shopping_processor"
}
}
}
We can now add some documents to our index:
PUT my_products/_doc/myid1
{
"title" : "Iphone 16 pro",
"description": "latest iphone",
"asin": "myid1"
}
PUT my_products/_doc/myid3
{
"title" : "Playstation portal",
"description": "Play your playstation games anywhere",
"asin": "myid3"
}
PUT my_products/_doc/myid5
{
"title" : "Round sunglasses",
"description": "Stylish sunglasses",
"asin": "myid5"
}
You can check the result with GET my_products/_search
As you can see, the third document doesn't have a price; that's because .product_prices has no entry for the id "myid5".
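For illustration, the enriched hits should look roughly like this (only the _source is shown, and the exact placement of the price field is an assumption based on the lookup configuration above):
{
"_id": "myid1",
"_source": {
"title": "Iphone 16 pro",
"description": "latest iphone",
"asin": "myid1",
"price": 10.5
}
}
{
"_id": "myid5",
"_source": {
"title": "Round sunglasses",
"description": "Stylish sunglasses",
"asin": "myid5"
}
}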
This plugin works in two phases, as it implements both IngestPlugin and ActionPlugin.
When documents are indexed, it automatically takes one chosen field from each document and stores it in a separate secondary index.
Later, using our newly developed endpoint custom_plugin/autocomplete, the plugin searches both the main index and the secondary one. The results are then combined to offer autocomplete suggestions.
NOTE: The provided code assumes the main index is made of products containing the fields title, stars, ratings, asin, and image. This can be changed by implementing a new ActionResponse.
Start by compiling the code with mvn clean package
You should then see the zipped package at autocomplete-plugin/target/autocomplete-plugin-${VERSION}.zip
Install the plugin to OpenSearch using the following command:
$OS_HOME/bin/opensearch-plugin install --batch file:///autocomplete-plugin/target/autocomplete-plugin-${VERSION}.zip
If you plan to run OpenSearch with Docker, you can install the plugin with this minimal Dockerfile:
FROM opensearchproject/opensearch:3.2.0
ARG plugins_version
# needed on Mac (ARM) environments; comment out otherwise
ENV _JAVA_OPTIONS="-XX:UseSVE=0"
COPY libs/autocomplete-plugin-1.0-SNAPSHOT.zip autocomplete-plugin.zip
RUN /usr/share/opensearch/bin/opensearch-plugin install --batch "file://$(pwd)/autocomplete-plugin.zip"
First, you need to declare a new ingest pipeline specifying both the field to be extracted and the name of the index to extract it to:
PUT _ingest/pipeline/extract-category
{
"description": "Extract Product category",
"processors": [
{
"field2doc": {
"description": "Extracting categories from products",
"fieldName": "category",
"indexName": "autocomplete-category"
}
}
]
}
This pipeline now needs to be set as the default pipeline for our index. We also need to set up our analyzer for both indices here:
PUT my_products/
{
"settings": {
"index": {
"default_pipeline": "extract-category",
"analysis": {
"filter": {
"edge_ngram_1_20": {
"type": "edge_ngram",
"min_gram": "1",
"max_gram": "20"
}
},
"analyzer": {
"autocomplete_analyzer": {
"filter": [
"lowercase",
"edge_ngram_1_20"
],
"type": "custom",
"tokenizer": "standard"
}
}
}
}
}
}
PUT autocomplete-category/
{
"settings": {
"index": {
"analysis": {
"filter": {
"edge_ngram_1_20": {
"type": "edge_ngram",
"min_gram": "1",
"max_gram": "20"
}
},
"analyzer": {
"autocomplete_analyzer": {
"filter": [
"lowercase",
"edge_ngram_1_20"
],
"type": "custom",
"tokenizer": "standard"
}
}
}
}
}
}
Then you need to add the edge ngram analyzer to the fields you want the match query to run on. For this example we'll use the title field, and we'll define a minimal mapping for our product documents:
PUT my_products/_mapping
{
"properties": {
"asin": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"category": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"description": {
"type": "text",
"fields": {
"autocomplete": {
"type": "text",
"search_analyzer": "standard"
},
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"image": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"ratings": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"stars": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"title": {
"type": "text",
"fields": {
"autocomplete": {
"type": "text",
"analyzer": "autocomplete_analyzer",
"search_analyzer": "standard"
},
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
The same needs to be done on our secondary index:
PUT autocomplete-category/_mapping
{
"properties": {
"value": {
"type": "text",
"fields": {
"autocomplete": {
"type": "text",
"analyzer": "autocomplete_analyzer",
"search_analyzer": "standard"
},
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
Now, we can insert some documents into our index:
PUT my_products/_doc/B06XXXLJ6V
{
"asin": "B06XXXLJ6V",
"title": "FYY Leather Case with Mirror for Samsung Galaxy S8 Plus, Leather Wallet Flip Folio Case with Mirror and Wrist Strap for Samsung Galaxy S8 Plus Black",
"description": "Product Description Premium PU Leather Top quality. Made with Premium PU Leather. Receiver design.",
"stars": "4.3 out of 5 stars",
"ratings": "1,116 ratings",
"category": [
"Cell Phones & Accessories",
"Cases, Holsters & Sleeves",
"Flip Cases"
],
"image": "https://m.media-amazon.com/images/I/81bdoltQWVL.__AC_SY300_SX300_QL70_FMwebp_.jpg"
}
PUT my_products/_doc/B00PMMB4X8
{
"asin": "B00PMMB4X8",
"title": "Amazon Basics Large Carrying Case for GoPro And Accessories - 13 x 9 x 2.5 Inches, Black",
"description": "Product Description Product Description Amazon Basics Large Carrying Case for GoPro And Accessories - 13 x 9 x 2.5 Inches, Black From the Manufacturer Amazon Basics Product Description Product Description Amazon Basics Large Carrying Case for GoPro And Accessories - 13 x 9 x 2.5 Inches",
"stars": "4.6 out of 5 stars",
"ratings": "23,224 ratings",
"category": [
"Electronics",
"Camera & Photo",
"Bags & Cases",
"Bag & Case Accessories"
],
"image": "https://m.media-amazon.com/images/W/WEBP_402378-T2/images/I/91KBV4dvXLS.__AC_SX300_SY300_QL70_FMwebp_.jpg"
}
PUT my_products/_doc/B08B1RM2JW
{
"asin": "B08B1RM2JW",
"title": """Golden Age Comic Book Bags Collector Bundle - 300-pack of Acid-Free Archival Protective Storage Sleeves for Cataloguing Vintage Comics - Fits Books Up to 7 5/8" x 10.5” - Resealable Adhesive Strip""",
"description": "Product Description WORTHY OF THE COLLECTOR You've amassed a collection of the World's Finest comic books in the world.",
"stars": "4.1 out of 5 stars",
"ratings": "14 ratings",
"category": [
"Office Products",
"Office & School Supplies",
"Book Covers & Book Accessories",
"Book & Bible Covers",
"Book Covers"
],
"image": "https://m.media-amazon.com/images/I/81hGmDgpoKL.__AC_SX300_SY300_QL70_FMwebp_.jpg"
}
You can see that both my_products and autocomplete-category were populated. We can now query our index using our custom action plugin:
POST _autocomplete
{
"primaryIndex": "my_products",
"field": "title",
"secondaryIndex": "autocomplete-category",
"text": "cas"
}
As you can see, the results differ from a classic search response:
{
"products": [
{
"title": "FYY Leather Case with Mirror for Samsung Galaxy S8 Plus, Leather Wallet Flip Folio Case with Mirror and Wrist Strap for Samsung Galaxy S8 Plus Black",
"stars": "4.3 out of 5 stars",
"ratings": "1,116 ratings",
"asin": "B06XXXLJ6V",
"image": "https://m.media-amazon.com/images/I/81bdoltQWVL.__AC_SY300_SX300_QL70_FMwebp_.jpg"
},
{
"title": "Amazon Basics Large Carrying Case for GoPro And Accessories - 13 x 9 x 2.5 Inches, Black",
"stars": "4.6 out of 5 stars",
"ratings": "23,224 ratings",
"asin": "B00PMMB4X8",
"image": "https://m.media-amazon.com/images/W/WEBP_402378-T2/images/I/91KBV4dvXLS.__AC_SX300_SY300_QL70_FMwebp_.jpg"
}
],
"categories": [
"Flip Cases",
"Bags & Cases",
"Cases, Holsters & Sleeves",
"Bag & Case Accessories"
]
}
This is because of our custom Response class (see AutocompleteResponse).
NOTE: The inserts into our secondary index rely on a BulkProcessor, which can be configured in the opensearch.yml configuration file.
This plugin is meant to reduce the number of zero-result queries by transforming the input query into a less strict version when it yields no results. Our simple use case focuses on the MultiMatchQuery type; in case of 0 results:
- we rerun the query with the OR operator instead of AND
- if there are still no results, we set the fuzziness to AUTO
It is an action plugin, meaning it works through a custom endpoint, here {index}/_reesearch
Start by compiling the code with mvn clean package
You should then see the zipped package at query-relaxing-plugin/target/query-relaxing-plugin-${VERSION}.zip
Install the plugin to OS using the command :
$OS_HOME/bin/opensearch-plugin install --batch file:///query-relaxing-plugin/target/query-relaxing-plugin-${VERSION}.zip
If planning to use OS with docker you can add to this minimal Dockerfile:
FROM opensearchproject/opensearch:3.2.0
ARG plugins_version
# for Mac environments, comment otherwise
ENV _JAVA_OPTIONS="-XX:UseSVE=0"
COPY libs/query-relaxing-plugin-1.0-SNAPSHOT.zip query-relaxing-plugin.zip
RUN /usr/share/opensearch/bin/opensearch-plugin install --batch "file://$(pwd)/query-relaxing-plugin.zip"
This particular plugin requires little to no setup; you just need to have data in your index. Let's use this as our example:
POST _bulk
{ "index": { "_index": "my-qr-demo-index" } }
{ "name": "iPhone 15", "category": "Smartphone" }
{ "index": { "_index": "my-qr-demo-index" } }
{ "name": "Galaxy S23", "category": "Smartphone" }
{ "index": { "_index": "my-qr-demo-index" } }
{ "name": "MacBook Pro", "category": "Laptop" }
Now we can compare the results we get using OpenSearch's _search endpoint and our custom-made {index}/_reesearch endpoint.
GET my-qr-demo-index/_search
{
"query": {
"multi_match": {
"query": "iPhone Laptop",
"fields": ["name", "category"],
"operator": "and",
"fuzziness": 0
}
}
}
No document contains both "iPhone" and "Laptop". Running the same query with our endpoint returns results, as the operator gets changed to OR:
POST my-qr-demo-index/_reesearch
{
"multi_match": {
"query": "iPhone Laptop",
"fields": [
"name",
"category"
],
"operator": "and",
"fuzziness": 0
}
}
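Under the hood, this first relaxation step is roughly equivalent to rerunning the standard query with the operator switched to OR (a sketch of the behaviour described above; the actual rewrite happens inside the plugin):
GET my-qr-demo-index/_search
{
"query": {
"multi_match": {
"query": "iPhone Laptop",
"fields": ["name", "category"],
"operator": "or",
"fuzziness": 0
}
}
}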
We can now try the same thing, but with typos in both terms, so the plugin will have to change the fuzziness:
GET my-qr-demo-index/_search
{
"query": {
"multi_match": {
"query": "iphon Laptoop",
"fields": ["name", "category"],
"operator": "and",
"fuzziness": 0
}
}
}
We can change the operator to verify that this alone is not enough to get a result:
GET my-qr-demo-index/_search
{
"query": {
"multi_match": {
"query": "iphon Laptoop",
"fields": ["name", "category"],
"operator": "or",
"fuzziness": 0
}
}
}
Now, with our endpoint, we can see that we retrieve the same two documents as before:
POST my-qr-demo-index/_reesearch
{
"multi_match": {
"query": "iphon Laptoop",
"fields": ["name", "category"],
"operator": "or",
"fuzziness": 0
}
}
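Since switching the operator alone still returns nothing here, the plugin's second relaxation step is roughly equivalent to this standard query with fuzziness set to AUTO (again a sketch; the rewrite is internal to the plugin):
GET my-qr-demo-index/_search
{
"query": {
"multi_match": {
"query": "iphon Laptoop",
"fields": ["name", "category"],
"operator": "or",
"fuzziness": "AUTO"
}
}
}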
The search-as-you-type plugin is a custom search plugin that registers a custom query, "autocomplete". The syntax is as follows:
POST /my_index/_search
{
"query": {
"autocomplete": {
"matching_query": {my input},
"query_fields": [my fields]
}
}
}
The goal of this plugin is to automatically add the ._2gram and ._3gram subfields to the query.
Note: the fields in query_fields need to be of type search_as_you_type.
Start by compiling the code with mvn clean package
You should then see the zipped package at search-as-you-type-plugin/target/search-as-you-type-plugin-${VERSION}.zip
Install the plugin to OpenSearch using the following command:
$OS_HOME/bin/opensearch-plugin install --batch file:///search-as-you-type-plugin/target/search-as-you-type-plugin-${VERSION}.zip
If you plan to run OpenSearch with Docker, you can install the plugin with this minimal Dockerfile:
FROM opensearchproject/opensearch:3.2.0
ARG plugins_version
# needed on Mac (ARM) environments; comment out otherwise
ENV _JAVA_OPTIONS="-XX:UseSVE=0"
COPY libs/search-as-you-type-plugin-1.0-SNAPSHOT.zip search-as-you-type-plugin.zip
RUN /usr/share/opensearch/bin/opensearch-plugin install --batch "file://$(pwd)/search-as-you-type-plugin.zip"
First, create an index with at least one search_as_you_type field:
PUT books
{
"mappings": {
"properties": {
"title": { "type": "search_as_you_type" },
"shortSummary": { "type": "search_as_you_type" },
"words": { "type": "search_as_you_type" }
}
}
}
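Then index a couple of documents so there is something to match (purely illustrative sample data):
PUT books/_doc/1
{
"title": "The Jungle Book",
"shortSummary": "A boy raised by wolves in the jungle",
"words": "jungle book wolves"
}
PUT books/_doc/2
{
"title": "Booked for Murder",
"shortSummary": "A cozy mystery set in a small town bookshop",
"words": "mystery bookshop murder"
}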
Then you can query it like this:
POST /books/_search
{
"query": {
"autocomplete": {
"matching_query": "boo",
"query_fields": ["title", "words", "shortSummary"]
}
}
}
Internally, this is the equivalent of the following multi_match query:
POST books/_search
{
"query": {
"multi_match": {
"query": "boo",
"type": "bool_prefix",
"fields": [
"title",
"title._2gram",
"title._3gram",
"shortSummary",
"shortSummary._2gram",
"shortSummary._3gram",
"words",
"words._2gram",
"words._3gram"
]
}
}
}
This plugin is a dashboard plugin. It is meant to be used with Telicent's synonyms plugin. To use it, you'll need to have the Telicent synonyms plugin installed on your OpenSearch instance.
The goal of this plugin is to edit the synonyms list stored in the synonyms field of the .synonyms index.
See https://github.com/Telicent-io/telicent-opensearch for more info on the synonyms plugin.