Elasticsearch (`OrchardCore.Elasticsearch`)¶

The Elasticsearch module allows you to manage Elasticsearch indices.

How to use¶

You can use an Elasticsearch cloud service, such as the one offered at https://www.elastic.co, or install it on-premises. For development and testing purposes, you can also deploy it with Docker.

Install Elasticsearch with Docker compose¶

Elasticsearch uses a mmapfs directory by default to store its indices. The default operating system limit on mmap counts is likely to be too low, which may result in out-of-memory exceptions. See:

https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html

For Docker with WSL2, you will need to persist this setting by using a .wslconfig file.

In your Windows %userprofile% directory (typically C:\Users\<username>) create or edit the file .wslconfig with the following:

[wsl2]
kernelCommandLine = "sysctl.vm.max_map_count=262144"

Then exit any WSL instance, run wsl --shutdown, and restart. To verify the setting afterwards, run sysctl from inside WSL:

> sysctl vm.max_map_count
vm.max_map_count = 262144
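The same check can be scripted. The following is a minimal sketch, assuming a Linux or WSL2 environment where the kernel exposes the value at /proc/sys/vm/max_map_count; the function names are illustrative and not part of Orchard Core or Elasticsearch.

```python
from pathlib import Path

# Value used in the .wslconfig example above.
REQUIRED_MAX_MAP_COUNT = 262144


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the kernel's current vm.max_map_count as an integer."""
    return int(Path(path).read_text().strip())


def is_high_enough(current, required=REQUIRED_MAX_MAP_COUNT):
    """True when the current limit meets or exceeds the required one."""
    return current >= required


# Usage inside WSL or on a Linux host:
#   print(is_high_enough(read_max_map_count()))
```

If the check reports a value that is too low after a restart, the .wslconfig change has most likely not been picked up yet.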

Elasticsearch Docker Compose file (check the current Elasticsearch version in the file if you need to run a specific version): docker-compose.yml.

Copy this file into a folder named Elasticsearch somewhere safe.
Open up a Terminal or Command Shell in this folder.
Execute docker-compose up to deploy Elasticsearch containers.

Tip

Don't remove this file from its folder if you want to be able to remove all of its containers at once later on in Docker Desktop.

You should see this result in the Docker Desktop app:

Elasticsearch docker containers
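Besides looking at Docker Desktop, you can ask the Elasticsearch root endpoint for its version. This is a rough sketch that assumes the compose file publishes Elasticsearch on http://localhost:9200 over plain HTTP with security disabled; if your setup uses HTTPS or credentials, the request needs to be adapted. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: Elasticsearch is reachable here without authentication.
ELASTICSEARCH_URL = "http://localhost:9200"


def parse_version(body):
    """Extract the version number from the JSON returned by the root endpoint."""
    return json.loads(body)["version"]["number"]


def fetch_version(url=ELASTICSEARCH_URL, timeout=5.0):
    """Call the Elasticsearch root endpoint and return its version number."""
    with urlopen(url, timeout=timeout) as response:
        return parse_version(response.read().decode("utf-8"))


# Usage, once the containers are running:
#   print(fetch_version())
```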

Failure

If you've done this previously with an older Elasticsearch Docker image, you might get errors similar to "The index [.geoip_databases/Zgrk5UXCRhmCFz98BImAHg] created in version [7.17.5] with current compatibility version [7.17.5] must be marked as read-only using the setting [index.blocks.write] set to [true] before upgrading to 9.0.0." While you can do what the error message says, if you only use the Elasticsearch instance for local development, we recommend removing the volume where it stores its data and starting over. docker volume ls shows which volumes exist; you can then run docker volume rm elasticsearchdocker_data01 (or similar) to remove the volumes used by Elasticsearch.

Set up Elasticsearch in Orchard Core¶

Add the Elasticsearch connection to the shell configuration (the appsettings.json file of OrchardCore.Cms.Web). See Elasticsearch configuration.
Start an Orchard Core instance with your IDE or the .NET CLI.
Go to the Orchard Core features and enable Elasticsearch.

Recipe step¶

Create Index Step¶

Elasticsearch indices can be created during recipe execution using the ElasticIndexSettings step.
Here is a sample step:

{
  "steps":[
    {
      "name":"ElasticIndexSettings",
      "Indices": [
        {
          "Search": {
            "AnalyzerName": "standard",
            "IndexLatest": false,
            "IndexedContentTypes": [
              "Article",
              "BlogPost"
            ]
          }
        }
      ]
    }
  ]
}

Note

It's recommended to use the CreateOrUpdateIndexProfile recipe step instead as the ElasticIndexSettings step is obsolete.

Here is an example of how to create an Elasticsearch index profile for content items using the CreateOrUpdateIndexProfile step.

{
  "steps":[
    {
      "name":"CreateOrUpdateIndexProfile",
      "indexes": [
        {
            "Name": "BlogPostsES",
            "IndexName": "blogposts",
            "ProviderName": "Elasticsearch",
            "Type": "Content",
            "Properties": {
                "ContentIndexMetadata": {
                    "IndexLatest": false,
                    "IndexedContentTypes": ["BlogPosts"],
                    "Culture": "any"
                },
                "ElasticsearchIndexMetadata": {
                    "AnalyzerName": "standard",
                    "StoreSourceData": true
                },
                "ElasticsearchDefaultQueryMetadata": {
                    "QueryAnalyzerName": "standard",
                    "SearchType": "", // The search type can be "query_string", "custom", or empty for default search type.
                    "DefaultQuery": "", // When using "custom" search type, this is the query to use.
                    "DefaultSearchFields": [
                        "Content.ContentItem.FullText"
                    ]
                }
            }
        }
      ]
    }
  ]
}

Reset Elasticsearch Index Step¶

This step resets an Elasticsearch index: it restarts the indexing process from the beginning in order to update current content items. It doesn't delete existing entries from the index.

{
  "steps":[
    {
      "name":"elastic-index-reset",
      "Indices":[
        "IndexName1",
        "IndexName2"
      ]
    }
  ]
}

To reset all indices:

{
  "steps":[
    {
      "name":"elastic-index-reset",
      "IncludeAll":true
    }
  ]
}

Note

It's recommended to use the ResetIndex recipe step instead as the elastic-index-reset step is obsolete.

Rebuild Elasticsearch Index Step¶

This step rebuilds an Elasticsearch index: it deletes and recreates the full index content.

{
  "steps":[
    {
      "name":"elastic-index-rebuild",
      "Indices":[
        "IndexName1",
        "IndexName2"
      ]
    }
  ]
}

To rebuild all indices:

{
  "steps":[
    {
      "name":"elastic-index-rebuild",
      "IncludeAll":true
    }
  ]
}

Note

It's recommended to use the RebuildIndex recipe step instead as the elastic-index-rebuild step is obsolete.

Queries recipe step¶

Here is an example of creating an Elasticsearch query from a Queries recipe step:

{
  "steps":[
    {
      "name": "Queries",
      "Queries": [
        {
          "Source": "Elasticsearch",
          "Name": "RecentBlogPosts",
          "Index": "Search",
          "Template": "...", // json encoded query template
          "ReturnContentItems": true
        }
      ]
    }
  ]
}

Indexing custom data¶

The indexing module supports multiple sources for indexing. This allows you to create indexes based on different data sources, such as content items or custom data.

To register a new source, you can add the following code to your Startup.cs file:

services.AddElasticsearchIndexingSource("CustomSource", o =>
{
    o.DisplayName = S["Custom Source in Provider"];
    o.Description = S["Create a Provider index based on custom source."];
});

Web APIs¶

`api/elasticsearch/content`¶

Executes a query with the specified name and returns the corresponding content items.

Verbs: POST and GET

Parameter	Example	Description
`indexName`	`search`	The name of the index to query.
`query`	`{ "query": { "match_all": {} }, "size": 10 }`	A JSON object representing the query.
`parameters`	`{ size: 3}`	A JSON object representing the parameters of the query.

`api/elasticsearch/documents`¶

Executes a query with the specified name and returns the corresponding Elasticsearch documents. Only the stored fields are returned.

Verbs: POST and GET

Parameter	Example	Description
`indexName`	`search`	The name of the index to query.
`query`	`{ "query": { "match_all": {} }, "size": 10 }`	A JSON object representing the query.
`parameters`	`{ size: 3}`	A JSON object representing the parameters of the query.
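As an illustration of how the parameters in the tables above fit together, here is a hedged sketch that builds the URL for a GET request to either endpoint. It assumes that the query and parameters are passed as JSON-encoded query-string values, and the base URL used in the usage comment (https://localhost:5001) is a placeholder. The request itself still has to be authenticated the way your site's API is configured, which is not shown here; build_api_url is a hypothetical helper, not part of Orchard Core.

```python
import json
from urllib.parse import urlencode

ENDPOINTS = ("content", "documents")


def build_api_url(base_url, endpoint, index_name, query, parameters=None):
    """Build the GET URL for api/elasticsearch/content or api/elasticsearch/documents."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"endpoint must be one of {ENDPOINTS}")
    # The query and parameters are JSON objects, sent here as JSON strings.
    params = {"indexName": index_name, "query": json.dumps(query)}
    if parameters is not None:
        params["parameters"] = json.dumps(parameters)
    return f"{base_url.rstrip('/')}/api/elasticsearch/{endpoint}?{urlencode(params)}"


# Usage:
#   build_api_url("https://localhost:5001", "content", "search",
#                 {"query": {"match_all": {}}, "size": 10})
```

Encoding the JSON with urlencode keeps braces and quotes from breaking the query string.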

Elasticsearch Queries¶

The Elasticsearch module provides a management UI and APIs for querying Elasticsearch data using Elasticsearch Queries. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html

Elasticsearch configuration¶

Connection configuration¶

The Elasticsearch module connection configuration can be set in the appsettings.json file or other configuration sources, either globally for the whole app or per tenant (see Configuration).

{
  "OrchardCore_Elasticsearch": {
    "ConnectionType": "SingleNodeConnectionPool",
    "Url": "http://localhost",
    "Ports": [
      9200
    ],
    "AuthenticationType":"Basic", // Supported values are: 'Basic', 'ApiKey', 'Base64ApiKey' or 'KeyIdAndKey'
    "ApiKey": "", // Required when using the ApiKey authentication type
    "Base64ApiKey": "", // Required when using the Base64ApiKey authentication type
    "CloudId": "The cloud id", // Required when using the CloudConnectionPool connection type
    "Username": "admin", // Required when using the Basic authentication type
    "Password": "admin", // Required when using the Basic authentication type
    "KeyId": "The key id", // Required when using the KeyIdAndKey authentication type
    "Key": "The key", // Required when using the KeyIdAndKey authentication type
    "CertificateFingerprint": "75:21:E7:92:8F:D5:7A:27:06:38:8E:A4:35:FE:F5:17:D7:37:F4:DF:F0:9A:D2:C0:C4:B6:FF:EE:D1:EA:2B:A7",
    "EnableDebugMode": false,
    "EnableHttpCompression": true,
    "IndexPrefix": "",
    "Analyzers": {
      "standard": {
        "type": "standard"
      }
    }
  }
}

Note

When CloudConnectionPool connection type is used, CertificateFingerprint is not needed.

The connection types documentation and examples can be found in the official Elasticsearch documentation.
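The "required when" comments in the sample configuration can be turned into a quick sanity check. The sketch below is only an illustration of those rules as stated in the comments; it is not how Orchard Core validates its configuration, and missing_settings is a hypothetical helper. It expects the OrchardCore_Elasticsearch section as an already parsed dictionary, since the sample above contains comments that a strict JSON parser would reject.

```python
# Settings required for each AuthenticationType, per the comments in the sample.
REQUIRED_BY_AUTHENTICATION_TYPE = {
    "Basic": ("Username", "Password"),
    "ApiKey": ("ApiKey",),
    "Base64ApiKey": ("Base64ApiKey",),
    "KeyIdAndKey": ("KeyId", "Key"),
}


def missing_settings(section):
    """Return the names of required settings that are absent or empty."""
    missing = []
    auth_type = section.get("AuthenticationType")
    for key in REQUIRED_BY_AUTHENTICATION_TYPE.get(auth_type, ()):
        if not section.get(key):
            missing.append(key)
    # CloudId is tied to the connection type rather than the authentication type.
    if section.get("ConnectionType") == "CloudConnectionPool" and not section.get("CloudId"):
        missing.append("CloudId")
    return missing
```

For example, a section that selects KeyIdAndKey but only provides KeyId would be reported as missing Key.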

Indexing settings¶

When editing an Elasticsearch-based index from the Index Management page (Search > Indexing), you can configure both general indexing settings and Elasticsearch-specific settings.

Under the Elasticsearch-specific settings, the following configuration options are available:

Query analyzer: The analyzer to use when executing queries on this index. By default, it uses the standard analyzer; other analyzers are available if registered.
Search type: Determines how the index will be searched.
Multi-Match Query: The default search type that uses the multi_match query to search across multiple fields, as configured.
Query String Query: Uses the query_string query, which allows for more complex queries using a query syntax.
Custom Query: Allows you to define a custom Elasticsearch query for each search request. Liquid is supported, so use the {{ term }} template in place of the user-provided search term. An example query that utilizes search highlights is shown below:

{
  "query": {
    "multi_match": {
      "fields": [
        "Content.ContentItem.FullText"
      ],
      "query": "{{ term }}",
      "fuzziness": "AUTO"
    }
  },
  "highlight": {
    "pre_tags": [
      "<span style='background-color: #FFF3CD;'>"
    ],
    "post_tags": [
      "</span>"
    ],
    "fields": {
      "Content.ContentItem.FullText": {
        "fragment_size": 150,
        "number_of_fragments": 3
      }
    }
  }
}

With this feature, Elasticsearch returns highlighted fragments wrapped in the tags given in pre_tags and post_tags (in the example above, a <span> with a background color), which can then be displayed in the Search module or other components. This enables the presentation of more relevant content that directly matches the search term.

Note

Highlight requests only work when the content item is stored in the Elasticsearch service, i.e., the "Store Source Data" checkbox under the index settings is checked.
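To make the shape of the highlight data more concrete, here is a small sketch that builds the request body that results once {{ term }} has been replaced, and collects the highlighted fragments from a search response. Orchard Core renders the Liquid template itself; the substitution is done in Python here only for illustration. The function names are hypothetical, and the response layout (hits.hits[].highlight) follows the standard Elasticsearch search response.

```python
FULL_TEXT_FIELD = "Content.ContentItem.FullText"


def build_highlight_query(term):
    """Build the custom query from the example above for a given search term."""
    return {
        "query": {
            "multi_match": {"fields": [FULL_TEXT_FIELD], "query": term, "fuzziness": "AUTO"}
        },
        "highlight": {
            "pre_tags": ["<span style='background-color: #FFF3CD;'>"],
            "post_tags": ["</span>"],
            "fields": {FULL_TEXT_FIELD: {"fragment_size": 150, "number_of_fragments": 3}},
        },
    }


def collect_highlights(response, field=FULL_TEXT_FIELD):
    """Return every highlighted fragment for `field` across all hits."""
    fragments = []
    for hit in response.get("hits", {}).get("hits", []):
        # Hits without a highlight section simply contribute nothing.
        fragments.extend(hit.get("highlight", {}).get(field, []))
    return fragments
```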

Elasticsearch Analyzers¶

As of version 1.6, built-in and custom analyzers are supported. By default, only the standard analyzer is available. You may update the Elasticsearch configurations to enable any built-in or custom analyzers. For example, to enable the built-in stop and standard analyzers, you may add the following to the appsettings.json file:

{
  "OrchardCore_Elasticsearch": {
    "Analyzers": {
      "standard": {
        "type": "standard"
      },
      "stop": {
        "type": "stop"
      }
    }
  }
}

At the same time, you may define custom analyzers using the appsettings.json file as well. In the following example, we are enabling the standard analyzer, customizing the stop analyzer and creating a custom analyzer named english_analyzer.

{
  "OrchardCore_Elasticsearch": {
    "Analyzers": {
      "standard": {
        "type": "standard"
      },
      "stop": {
        "type": "stop",
        "stopwords": [
          "a",
          "the",
          "and",
          "or"
        ]
      },
      "english_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase",
          "stop"
        ],
        "char_filter": [
          "html_strip"
        ]
      }
    }
  }
}

Elasticsearch Token-Filters¶

As of version 2.1, you can define custom token filters in your Elasticsearch configuration. To add new custom token filters, update your Elasticsearch settings accordingly.

For instance, to create a token filter named english_stop, you can include the following configuration in your appsettings.json file:

{
  "OrchardCore_Elasticsearch": {
    "TokenFilters": {
      "english_stop": {
        "type": "stop",
        "stopwords": "_english_"
      }
    },
    "Analyzers": {
      "my_new_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "english_stop"
        ]
      }
    }
  }
}

In this example, the english_stop token filter removes English stop words, and the my_new_analyzer uses the standard tokenizer along with the english_stop filter to process text.
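The effect of such an analyzer can be approximated locally. The sketch below is a rough simulation, not the Elasticsearch implementation: the tokenizer is a simple regular expression standing in for the standard tokenizer, and the stop word list is assumed to match Lucene's default English set that _english_ refers to. It also illustrates one consequence of the configuration above: because no lowercase filter is configured and the stop filter is case-sensitive by default, a capitalized "The" is not removed.

```python
import re

# Assumption: Lucene's default English stop words, referenced by "_english_".
ENGLISH_STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
})


def standard_tokenize(text):
    """Rough stand-in for the standard tokenizer: split on non-word characters."""
    return re.findall(r"\w+", text)


def english_stop(tokens):
    """Drop tokens that exactly match a stop word (case-sensitive)."""
    return [token for token in tokens if token not in ENGLISH_STOP_WORDS]


def my_new_analyzer(text):
    """Apply the tokenizer and then the stop filter, in that order."""
    return english_stop(standard_tokenize(text))


# my_new_analyzer("The fox and the hound") returns ["The", "fox", "hound"]
```

Adding the built-in lowercase filter before english_stop in the analyzer's filter list would make the stop word removal case-insensitive in practice.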

Elasticsearch vs Lucene¶

Both modules are complementary and can be enabled at the same time. The Lucene module uses Lucene.NET and is not as feature-complete as the Elasticsearch module.

There will be discrepancies between the two modules' implementations because Lucene.NET implements an older version of Lucene. The most basic types of queries will work with both, though.

The Lucene module always returns only stored fields from Lucene queries, whereas the Elasticsearch module can be set to return specific fields or the entire source data.

Here is an example of a query that returns only specific fields from Elasticsearch.

{
  "query": {
    "match_all": { }
  },
  "fields": [
    "ContentItemId.keyword", "ContentItemVersionId.keyword"
  ],
  "_source": false
}
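When a query like this one is run with "_source" set to false, each hit in a standard Elasticsearch response carries a "fields" object in which every requested field maps to an array of values. The following sketch flattens that structure into one dictionary per hit. extract_fields is a hypothetical helper, and the identifiers in the usage comment are made up.

```python
def extract_fields(response):
    """Return one {field name: first value} dictionary per hit."""
    rows = []
    for hit in response.get("hits", {}).get("hits", []):
        fields = hit.get("fields", {})
        # Elasticsearch returns each requested field as a list of values.
        rows.append({name: values[0] for name, values in fields.items() if values})
    return rows


# Usage with a made-up response:
#   extract_fields({"hits": {"hits": [
#       {"fields": {"ContentItemId.keyword": ["id-1"],
#                   "ContentItemVersionId.keyword": ["version-1"]}}]}})
```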

The Elasticsearch index settings let you choose whether to store the "source" data. It is stored by default.

Elasticsearch performs automatic mapping based on CLR types. Every data field passed to Elasticsearch that is mapped as a "string" becomes both a text field and a keyword field. For example, Content.ContentItem.DisplayText becomes a text field and Content.ContentItem.DisplayText.keyword becomes a keyword field so that it can be used as a technical value.
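In practice, the difference shows up in which kind of query you write against each variant of the field. As a brief sketch (the helper names are illustrative), an exact comparison targets the .keyword field with a term query, while a full-text search targets the text field with a match query:

```python
DISPLAY_TEXT = "Content.ContentItem.DisplayText"


def exact_display_text_query(value):
    """Term query on the keyword field: matches the whole value, unanalyzed."""
    return {"query": {"term": {DISPLAY_TEXT + ".keyword": value}}}


def full_text_display_text_query(words):
    """Match query on the text field: the input is analyzed before matching."""
    return {"query": {"match": {DISPLAY_TEXT: words}}}
```

Either dictionary can be serialized to JSON and used as the body of an Elasticsearch query.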

There may be differences between Lucene and Elasticsearch indexed fields. Lucene lets you store a field and set it as a keyword explicitly. Elasticsearch, for now, is not affected by the stored or keyword options in a content field's index settings. We may eventually allow it by executing manual mapping on the indices. As a result, a field can currently be text in Lucene and keyword in Elasticsearch when the same field name is used in a query. You then need to adapt your queries to use the proper query type.

Indexed vs Stored¶

When we say that a field is indexed, it means that it is parsed by the analyzer configured on the index (Elasticsearch also allows passing custom analyzers on queries).

When a field is stored, however, the meaning depends on the context.

As an example, Elasticsearch stores the original value passed in the "_source" field of its index. None of the automatically mapped fields are stored in the index; they are only indexed.

Lucene, on the other hand, can currently store the original value when the Store source data option is set on a specific index. Lucene also has fields that are stored by design, such as the ContentItemId of a content item.

The equivalent of a StringField, which behaves the same way as a keyword in Elasticsearch, has been added to all content fields that pass "string" values, by using the .keyword suffix on the field name.

Here is a small table to compare Lucene and Elasticsearch (string) types:

Lucene	Elasticsearch	Description	When Stored	Search Query type
StringField	Keyword	A field that is indexed but not tokenized: the entire value is indexed as a single token	original value AND indexed	stored fields because indexed as a single token.
TextField	Text	A field that is indexed and tokenized, without term vectors	original value AND indexed	analyzed fields. Also known as full-text search
StoredField	stored in _source by mapping configuration	A field containing original value (not analyzed)	original value	stored fields
