Elasticsearch (`OrchardCore.Search.Elasticsearch`)¶

The Elasticsearch module allows you to manage Elasticsearch indices.

How to use¶

You can use an Elasticsearch cloud service, such as the one offered at https://www.elastic.co, or install Elasticsearch on-premises. For development and testing purposes, you can also deploy it with Docker.

Install Elasticsearch 7.x with Docker compose¶

Elasticsearch uses an mmapfs directory by default to store its indices. The default operating system limit on mmap counts is likely to be too low, which may result in out-of-memory exceptions.

https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html

For Docker with WSL2, you will need to persist this setting by using a .wslconfig file.

In your Windows %userprofile% directory (typically C:\Users\<username>) create or edit the file .wslconfig with the following:

[wsl2]
kernelCommandLine = "sysctl.vm.max_map_count=262144"

Then exit any WSL instance, run wsl --shutdown, and restart WSL. You can verify the setting from a WSL shell:

> sysctl vm.max_map_count
vm.max_map_count = 262144
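
The same check can be scripted. The following is a minimal Python sketch, assuming a Linux or WSL2 environment where the setting is exposed at /proc/sys/vm/max_map_count. The function names are illustrative and are not part of Orchard Core or Elasticsearch.

```python
from pathlib import Path

# Value set in the .wslconfig example above.
REQUIRED_MAX_MAP_COUNT = 262144


def parse_max_map_count(text: str) -> int:
    """Parse either the raw /proc value or the `sysctl vm.max_map_count` output."""
    # `sysctl` prints "vm.max_map_count = 262144"; /proc contains just the number.
    return int(text.strip().split("=")[-1])


def is_sufficient(value: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True when the current limit is at least the required value."""
    return value >= required


if __name__ == "__main__":
    proc_file = Path("/proc/sys/vm/max_map_count")  # Linux/WSL2 only (assumption)
    if proc_file.exists():
        current = parse_max_map_count(proc_file.read_text())
        print(current, "OK" if is_sufficient(current) else "too low")
```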

Elasticsearch v7.17.5 Docker Compose file: docker-compose.yml

Copy this file into a folder named Elasticsearch somewhere safe.
Open up a Terminal or Command Shell in this folder.
Execute docker-compose up to deploy Elasticsearch containers.

Advice: don't remove this file from its folder if you want to be able to remove all of its containers at once later on in Docker Desktop.

You should get this result in Docker Desktop app:

Elasticsearch docker containers

Set up Elasticsearch in Orchard Core¶

Add Elastic Connection in the shell configuration (OrchardCore.Cms.Web appsettings.json file). See Elasticsearch Configurations.
Start an Orchard Core instance with the VS Code debugger.
Go to Orchard Core features and enable Elasticsearch.

Recipe step¶

Elasticsearch indices can be created during recipe execution using the ElasticIndexSettings step.
Here is a sample step:

{
  "steps":[
    {
      "name":"ElasticIndexSettings",
      "Indices":[
        {
          "Search":{
            "AnalyzerName":"standardanalyzer",
            "IndexLatest":false,
            "IndexedContentTypes":[
              "Article",
              "BlogPost"
            ]
          }
        }
      ]
    }
  ]
}
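
When a recipe has to cover several indices, the step can be generated rather than written by hand. The following Python sketch reproduces the sample step above; the helper name and its parameters are hypothetical, and only the property names already shown in the sample are used.

```python
import json


def elastic_index_settings_step(index_name, content_types,
                                analyzer="standardanalyzer", index_latest=False):
    """Build an ElasticIndexSettings recipe step for a single index."""
    return {
        "name": "ElasticIndexSettings",
        "Indices": [
            {
                # The index name is the key, as in the sample step above.
                index_name: {
                    "AnalyzerName": analyzer,
                    "IndexLatest": index_latest,
                    "IndexedContentTypes": list(content_types),
                }
            }
        ],
    }


recipe = {"steps": [elastic_index_settings_step("Search", ["Article", "BlogPost"])]}
print(json.dumps(recipe, indent=2))
```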

Elasticsearch settings recipe step¶

Here is an example for setting default search settings:

{
  "steps":[
    {
      // Create the search settings.
      "name":"Settings",
      "ElasticSettings":{
        "SearchIndex":"search",
        "DefaultSearchFields":[
          "Content.ContentItem.FullText"
        ],
        "SearchType": "", // Use 'custom' for a custom query in DefaultQuery and 'query_string' for a Query String Query search. Leave it blank for the default, which is a Multi-Match Query search.
        "DefaultQuery": null,
        "SyncWithLucene":true // Allows syncing the content index settings.
      }
    }
  ]
}

Reset Elasticsearch Index Step¶

The Reset Index step resets an Elasticsearch index. It restarts the indexing process from the beginning in order to update current content items. It doesn't delete existing entries from the index.

{
  "steps":[
    {
      "name":"elastic-index-reset",
      "Indices":[
        "IndexName1",
        "IndexName2"
      ]
    }
  ]
}

To reset all indices:

{
  "steps":[
    {
      "name":"elastic-index-reset",
      "IncludeAll":true
    }
  ]
}

Rebuild Elasticsearch Index Step¶

The Rebuild Index step rebuilds an Elasticsearch index. It deletes and recreates the full index content.

{
  "steps":[
    {
      "name":"elastic-index-rebuild",
      "Indices":[
        "IndexName1",
        "IndexName2"
      ]
    }
  ]
}

To rebuild all indices:

{
  "steps":[
    {
      "name":"elastic-index-rebuild",
      "IncludeAll":true
    }
  ]
}
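
The reset and rebuild steps share the same shape: either a list of index names or IncludeAll. The following Python sketch builds either step from the names shown above. The helper function is hypothetical and only illustrates the structure of the JSON.

```python
import json

# Step names taken from the samples above.
STEP_NAMES = {"reset": "elastic-index-reset", "rebuild": "elastic-index-rebuild"}


def index_maintenance_step(action, indices=None):
    """Return a reset or rebuild step; with no indices given, target all of them."""
    step = {"name": STEP_NAMES[action]}
    if indices:
        step["Indices"] = list(indices)
    else:
        step["IncludeAll"] = True
    return step


recipe = {
    "steps": [
        index_maintenance_step("reset", ["IndexName1", "IndexName2"]),
        index_maintenance_step("rebuild"),
    ]
}
print(json.dumps(recipe, indent=2))
```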

Queries recipe step¶

Here is an example of creating an Elasticsearch query from a Queries recipe step:

{
  "steps":[
    {
      "name":"Queries",
      "Queries":[
        {
          "Source": "Elasticsearch",
          "Name": "RecentBlogPosts",
          "Index": "Search",
          "Template": "...", // json encoded query template
          "ReturnContentItems": true
        }
      ]
    }
  ]
}

Web APIs¶

`api/elasticsearch/content`¶

Executes the specified query against the given index and returns the corresponding content items.

Verbs: POST and GET

Parameter	Example	Description
`indexName`	`search`	The name of the index to query.
`query`	`{ "query": { "match_all": {} } }`	A JSON object representing the query.
`parameters`	`{ "size": 3 }`	A JSON object representing the parameters of the query.

`api/elasticsearch/documents`¶

Executes the specified query against the given index and returns the corresponding Elasticsearch documents. Only the stored fields are returned.

Verbs: POST and GET

Parameter	Example	Description
`indexName`	`search`	The name of the index to query.
`query`	`{ "query": { "match_all": {} } }`	A JSON object representing the query.
`parameters`	`{ "size": 3 }`	A JSON object representing the parameters of the query.
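
As an illustration, the following Python sketch builds a GET URL for either endpoint from the three parameters listed in the tables above. It assumes that the parameters are passed in the query string for GET requests; the site URL and the helper name are hypothetical, and any authentication the endpoints may require is not shown.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, endpoint, index_name, query, parameters=None):
    """Build a GET URL for api/elasticsearch/content or api/elasticsearch/documents."""
    params = {"indexName": index_name, "query": json.dumps(query)}
    if parameters is not None:
        params["parameters"] = json.dumps(parameters)
    # urlencode percent-encodes the JSON so it is safe in a query string.
    return f"{base_url.rstrip('/')}/api/elasticsearch/{endpoint}?{urlencode(params)}"


url = build_query_url(
    "https://localhost:5001",  # hypothetical site URL
    "content",
    "search",
    {"query": {"match_all": {}}},
    {"size": 3},
)
print(url)
```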

Elasticsearch Queries¶

The Elasticsearch module provides a management UI and APIs for querying Elasticsearch data using Elasticsearch queries. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html

Elasticsearch configuration¶

The Elasticsearch module connection configuration can be set globally in the appsettings.json file or per tenant.

"OrchardCore_Elasticsearch": {
  "ConnectionType": "SingleNodeConnectionPool",
  "Url": "http://localhost",
  "Ports": [ 9200 ],
  "CloudId": "Orchard_Core_deployment:ZWFzdHVzMi5henVyZS5lbGFzdGljLWNsb3VkLmNvbTo0NDMkNmMxZGQ4YzBrQ2Y2NDI5ZDkyNzc1MTUxN2IyYjZkYTgkMTJmMjA1MzBlOTU0NDgyNDlkZWVmZWYzNmZlY2Q5Yjc=",
  "Username": "admin",
  "Password": "admin",
  "CertificateFingerprint": "75:21:E7:92:8F:D5:7A:27:06:38:8E:A4:35:FE:F5:17:D7:37:F4:DF:F0:9A:D2:C0:C4:B6:FF:EE:D1:EA:2B:A7",
  "EnableApiVersioningHeader": false,
  "IndexPrefix": "",
  "Analyzers": {
    "standard": {
      "type": "standard"
    }
  }
}

Note

When CloudConnectionPool connection type is used, CertificateFingerprint is not needed.

Documentation and examples for the connection types can be found at:

https://www.elastic.co/guide/en/elasticsearch/client/net-api/7.17/connection-pooling.html

Elasticsearch Analyzers¶

As of version 1.6, built-in and custom analyzers are supported. By default, only the standard analyzer is available. You may update the Elasticsearch configuration to enable any of the built-in analyzers and any custom analyzers. For example, to enable the built-in stop and standard analyzers, add the following to the appsettings.json file:

"OrchardCore_Elasticsearch": {
  "Analyzers": {
    "standard": {
      "type": "standard"
    },
    "stop": {
      "type": "stop"
    }
  }
}

You may also define custom analyzers in the appsettings.json file. The following example enables the standard analyzer, customizes the stop analyzer, and creates a custom analyzer named english_analyzer.

"OrchardCore_Elasticsearch": {
  "Analyzers": {
    "standard": {
      "type": "standard"
    },
    "stop": {
      "type": "stop",
      "stopwords": [
        "a",
        "the",
        "and",
        "or"
      ]
    },
    "english_analyzer": {
      "type": "custom",
      "tokenizer": "standard",
      "filter": [
        "lowercase",
        "stop"
      ],
      "char_filter": [
        "html_strip"
      ]
    }
  }
}
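
One way to try out a custom analyzer definition before adding it to the configuration is the Elasticsearch _analyze API, which accepts a tokenizer, filters, and character filters together with a sample text. The following Python sketch converts the english_analyzer definition above into such a request body. The helper name and the sample text are illustrative, and sending the request is left to the HTTP client of your choice.

```python
import json

# The custom analyzer from the configuration example above.
english_analyzer = {
    "type": "custom",
    "tokenizer": "standard",
    "filter": ["lowercase", "stop"],
    "char_filter": ["html_strip"],
}


def analyze_request_body(analyzer, text):
    """Translate a custom analyzer definition into an _analyze request body."""
    if analyzer.get("type") != "custom":
        raise ValueError("only custom analyzers are handled in this sketch")
    body = {
        key: analyzer[key]
        for key in ("tokenizer", "filter", "char_filter")
        if key in analyzer
    }
    body["text"] = text
    return body


body = analyze_request_body(english_analyzer, "<p>The Quick Brown Fox</p>")
print(json.dumps(body, indent=2))
# Send it with, for example: POST http://localhost:9200/_analyze
```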

Elasticsearch vs Lucene¶

Both modules are complementary and can be enabled at the same time. The Lucene module uses Lucene.NET and is not as feature complete as the Elasticsearch module.

The two modules' implementations differ because Lucene.NET implements an older version of Lucene, though the most basic types of queries work with both.

The Lucene module always returns only stored fields from Lucene queries, while the Elasticsearch module can be set to return specific fields or the entire source data.

Here is one example of a Query that will return only specific fields from Elasticsearch.

{
  "query": {
    "match_all": { }
  },
  "fields": [
    "ContentItemId.keyword", "ContentItemVersionId.keyword"
  ],
  "_source": false
}
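
The same request body can be assembled programmatically. The following Python sketch builds the query shown above for an arbitrary list of fields; the function name is hypothetical.

```python
import json


def fields_only_query(fields, query=None):
    """Build a search body that returns the given fields and omits _source."""
    return {
        "query": query or {"match_all": {}},
        "fields": list(fields),
        "_source": False,
    }


body = fields_only_query(["ContentItemId.keyword", "ContentItemVersionId.keyword"])
print(json.dumps(body, indent=2))
```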

The Elasticsearch index settings let you choose whether to store the "source" data. The source data is stored by default.

Elasticsearch automatically maps fields based on CLR types. Every data field passed to Elasticsearch that is mapped as a "string" becomes both a text field and a keyword field. For example, Content.ContentItem.DisplayText is mapped as a text field and Content.ContentItem.DisplayText.keyword as a keyword field, so that it can be used as a technical value.

There may be differences between Lucene and Elasticsearch indexed fields. Lucene lets you store a field and set it as a keyword explicitly. Elasticsearch, for now, is not affected by the stored or keyword options in a ContentField's index settings. We may eventually allow this by applying manual mappings to the indices. As a result, the same field name can currently be text in Lucene and keyword in Elasticsearch, and you need to adapt your queries to use the query type that matches the field.
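
As an illustration of matching the query type to the field type, the following Python sketch builds a match query for an analyzed text field and a term query for its .keyword counterpart. The function names are hypothetical; the field name is the one used in the example above.

```python
import json


def full_text_query(field, text):
    """Match query: the search text is analyzed, which suits text fields."""
    return {"query": {"match": {field: text}}}


def exact_value_query(field, value):
    """Term query: the value is compared as a single token, which suits keyword fields."""
    return {"query": {"term": {f"{field}.keyword": value}}}


text_query = full_text_query("Content.ContentItem.DisplayText", "about us")
keyword_query = exact_value_query("Content.ContentItem.DisplayText", "About Us")
print(json.dumps(text_query))
print(json.dumps(keyword_query))
```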

Indexed vs Stored¶

When we say that a field is indexed, we mean that it is parsed by the analyzer configured on the index (Elasticsearch also allows custom analyzers to be passed in queries).

When a field is stored, however, the meaning depends on the context.

For example, Elasticsearch stores the original value it receives in the "_source" field of its index. The automatically mapped fields are never stored in the index; they are only indexed.

Lucene can currently store the original value when the Store source data option is set on a specific index setting. Lucene also has fields that are stored by design, such as the ContentItemId of a content item.

The equivalent of a StringField that will behave the same way as a keyword in Elasticsearch has been added to all ContentFields that are passing "string" values by using the .keyword suffix on the field name.

Here is a small table to compare Lucene and Elasticsearch (string) types:

Lucene	Elasticsearch	Description	When Stored	Search Query type
StringField	Keyword	A field that is indexed but not tokenized: the entire value is indexed as a single token	original value AND indexed	stored fields because indexed as a single token.
TextField	Text	A field that is indexed and tokenized, without term vectors	original value AND indexed	analyzed fields. Also known as full-text search
StoredField	stored in _source by mapping configuration	A field containing original value (not analyzed)	original value	stored fields
