When we look at popular commercial websites such as Amazon, Netflix and eBay, we can clearly see that autocomplete (search suggestion) boxes are important to these companies.
Good search results also matter to end users. On a commercial website, directing the user to the right product or category quickly has a positive effect on sales.
Seeing related suggestions while we are still typing: isn’t that great?
Following e-mail requests from readers, I decided to write an autosuggest sample using the Elasticsearch Completion Suggester and .NET Core.
There are several ways to implement an autocomplete/suggest feature in Elasticsearch, such as “ngrams”, “prefix queries” and the “completion suggester”. Each has trade-offs in query speed, indexing speed, document size and so on. For the “autocomplete/search-as-you-type” example in this article, I will try to show how to implement autocomplete in what I think is the most performant way, using the “completion suggester”.
Completion Suggester
I think response speed is important when the goal is to give end users instant feedback with good suggestions. This is where the completion suggester differs from the other approaches. Every text to be suggested must be indexed in Elasticsearch in a field mapped with the “completion” type. The completion suggester uses an in-memory data structure called an FST (finite-state transducer) to provide fast lookups.
As a result, it can perform prefix lookups faster than term-based queries.
Let’s take a look at the example from the Elastic engineering blog to better understand how it works. Suppose that the words “hotel”, “marriot”, “mercure”, “munchen” and “munich” are stored in an FST.
In the resulting in-memory graph, the suggester matches from left to right according to the text the user has entered. For example, when the user enters “h”, the word is completed immediately because the only possible match is “hotel”. If the user enters “m”, the suggester lists all the words that start with “m”.
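To make the left-to-right matching more concrete, here is a minimal sketch that uses a plain prefix tree (trie) as a simplified stand-in for the FST. It is not how Elasticsearch implements the suggester; a real FST also shares suffixes and stores weights. The class name “PrefixTree” and its methods are my own assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the suggester's FST: a plain prefix tree (trie).
public class PrefixTree
{
    private sealed class Node
    {
        // Sorted so that completions come back in a stable, alphabetical order.
        public readonly SortedDictionary<char, Node> Children = new SortedDictionary<char, Node>();
        public string Word; // set only on nodes that end a complete word
    }

    private readonly Node _root = new Node();

    public void Add(string word)
    {
        Node node = _root;
        foreach (char ch in word.ToLowerInvariant())
        {
            if (!node.Children.TryGetValue(ch, out Node next))
            {
                next = new Node();
                node.Children[ch] = next;
            }
            node = next;
        }
        node.Word = word;
    }

    // Walks the prefix from left to right, then collects every word below that node.
    public List<string> Complete(string prefix)
    {
        var results = new List<string>();
        Node node = _root;
        foreach (char ch in prefix.ToLowerInvariant())
        {
            if (!node.Children.TryGetValue(ch, out node))
            {
                return results; // no indexed word starts with this prefix
            }
        }
        Collect(node, results);
        return results;
    }

    private static void Collect(Node node, List<string> results)
    {
        if (node.Word != null) results.Add(node.Word);
        foreach (Node child in node.Children.Values) Collect(child, results);
    }
}
```

After adding the five words from the example, Complete("h") returns only “hotel”, while Complete("m") returns “marriot”, “mercure”, “munchen” and “munich”.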
The disadvantage of the completion suggester is that matching always starts from the left, as described above. For example, “Sam” will match “Samsung Note 8” but not “Note 8 Samsung”. In such cases, term-based queries are a better fit. However, as I mentioned above, we can still match “Note 8 Samsung” by indexing all the relevant input combinations for a single suggestion. I will show this in an example later.
The completion suggester also supports “Fuzzy Matching”, as well as “Weights” for scoring.
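As a small sketch of the “Weights” part: NEST’s “CompletionField” type, which is used for the “Suggest” property later in this article, has a “Weight” property next to “Input”. The helper class and method names below are hypothetical, and the ranking effect should be verified against your own Elasticsearch version.

```csharp
using Nest;

public static class WeightedSuggestExample
{
    // Builds a completion field for the given inputs and attaches a weight to it.
    // Suggestions with a higher weight are meant to rank earlier in the result list.
    public static CompletionField Build(int weight, params string[] inputs)
    {
        return new CompletionField
        {
            Input = inputs,
            Weight = weight
        };
    }
}
```

For example, WeightedSuggestExample.Build(10, "Apple Iphone X", "Iphone X") could be assigned to a best-selling product, with a lower weight for the others.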
Scenario
Let’s assume we work at an e-commerce company. The product owner responsible for the search domain has asked us to build an autocomplete feature that returns near-real-time results based on the “brand” and “product name” fields.
First, we need to feed Elasticsearch with the items that we want to suggest.
1) Creating the Mapping and Index
To use the completion suggester, we have to create a mapping that has a completion-type field. To do that, first create a .NET Core class library called “Autocomplete.Business.Objects” and add the “NEST” library using the NuGet package manager.
We create this library separately because the models defined here will be used by both the feeder application and the autocomplete API.
First, define the “Product” and “ProductSuggestResponse” models as follows.
using Nest;

namespace Autocomplete.Business.Objects
{
    public class Product
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public CompletionField Suggest { get; set; }
    }
}
using System.Collections.Generic;

namespace Autocomplete.Business.Objects
{
    public class ProductSuggestResponse
    {
        public IEnumerable<ProductSuggest> Suggests { get; set; }
    }

    public class ProductSuggest
    {
        public int Id { get; set; }
        public string Name { get; set; }
        // The input text that matched; AutocompleteService.SuggestAsync sets it from option.Text.
        public string SuggestedName { get; set; }
        public double Score { get; set; }
    }
}
We will use the “Suggest” property of the “Product” model for the texts that we want to suggest during autocomplete.
Next, create a new .NET Core class library called “Autocomplete.Business” and reference the “Autocomplete.Business.Objects” and “NEST” libraries. Then define an interface called “IAutocompleteService”.
using System.Collections.Generic;
using System.Threading.Tasks;
using Autocomplete.Business.Objects;

namespace Autocomplete.Business
{
    public interface IAutocompleteService
    {
        Task<bool> CreateIndexAsync(string indexName);
        Task IndexAsync(string indexName, List<Product> products);
        Task<ProductSuggestResponse> SuggestAsync(string indexName, string keyword);
    }
}
Then implement it as follows.
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Autocomplete.Business.Objects;
using Nest;

namespace Autocomplete.Business
{
    public class AutocompleteService : IAutocompleteService
    {
        readonly ElasticClient _elasticClient;

        public AutocompleteService(ConnectionSettings connectionSettings)
        {
            _elasticClient = new ElasticClient(connectionSettings);
        }

        public async Task<bool> CreateIndexAsync(string indexName)
        {
            // Elasticsearch index names must be lowercase.
            string index = indexName.ToLowerInvariant();

            // Map Product automatically and declare Suggest as a completion field.
            var createIndexDescriptor = new CreateIndexDescriptor(index)
                .Mappings(ms => ms
                    .Map<Product>(m => m
                        .AutoMap()
                        .Properties(ps => ps
                            .Completion(c => c
                                .Name(p => p.Suggest))))
                );

            // Drop any existing index so that the sample always starts from scratch.
            if ((await _elasticClient.IndexExistsAsync(index)).Exists)
            {
                await _elasticClient.DeleteIndexAsync(index);
            }

            ICreateIndexResponse createIndexResponse = await _elasticClient.CreateIndexAsync(createIndexDescriptor);

            return createIndexResponse.IsValid;
        }

        public async Task IndexAsync(string indexName, List<Product> products)
        {
            await _elasticClient.IndexManyAsync(products, indexName);
        }

        public async Task<ProductSuggestResponse> SuggestAsync(string indexName, string keyword)
        {
            // Prefix completion on the Suggest field, tolerating small typos, top 5 options.
            ISearchResponse<Product> searchResponse = await _elasticClient.SearchAsync<Product>(s => s
                .Index(indexName)
                .Suggest(su => su
                    .Completion("suggestions", c => c
                        .Field(f => f.Suggest)
                        .Prefix(keyword)
                        .Fuzzy(f => f
                            .Fuzziness(Fuzziness.Auto)
                        )
                        .Size(5))
                ));

            // Flatten every option of the "suggestions" entry into the response model.
            var suggests = from suggest in searchResponse.Suggest["suggestions"]
                           from option in suggest.Options
                           select new ProductSuggest
                           {
                               Id = option.Source.Id,
                               Name = option.Source.Name,
                               SuggestedName = option.Text,
                               Score = option.Score
                           };

            return new ProductSuggestResponse
            {
                Suggests = suggests
            };
        }
    }
}
In the “CreateIndexAsync” method above, we map the “Product” model and declare its “Suggest” property as a completion field. By default, the “simple” analyzer is used for this field. The simple analyzer splits the text into terms at non-letter characters and lowercases them. If you want, you can replace the analyzer using the “Analyzer” method on the completion mapping.
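To illustrate what the “simple” analyzer does to an input, here is a small sketch that approximates its documented behaviour: break the text at every non-letter character, drop those characters and lowercase the rest. It is only an approximation written for this article, not the Lucene implementation, and the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SimpleAnalyzerSketch
{
    // Splits the text at every non-letter character and lowercases the terms.
    public static List<string> Analyze(string text)
    {
        var terms = new List<string>();
        var current = new StringBuilder();
        foreach (char ch in text)
        {
            if (char.IsLetter(ch))
            {
                current.Append(char.ToLowerInvariant(ch));
            }
            else if (current.Length > 0)
            {
                // A digit, space or punctuation mark ends the current term.
                terms.Add(current.ToString());
                current.Clear();
            }
        }
        if (current.Length > 0) terms.Add(current.ToString());
        return terms;
    }
}
```

With this sketch, Analyze("Samsung Galaxy Note 8") yields “samsung”, “galaxy” and “note”. Notice that the digit is dropped, which is worth keeping in mind when product names differ only by a number.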
The “SuggestAsync” method is used during autocomplete. In it, we specify that the completion should run against the “Suggest” field, and we pass the text entered by the user to the completion using the “Prefix” method. I think “Fuzzy” is a great feature for autocomplete. We all make simple typos when we type, right? 🙂
Finally, we map the “Text” and “Score” properties of the suggestion options returned in “searchResponse” to the “ProductSuggest” model.
NOTE: The original document behind a suggested text is also accessible through the “Source” property.
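To give a feel for what “Fuzziness.Auto” allows, here is a rough sketch. It assumes the default AUTO thresholds from the Elasticsearch documentation (inputs of up to 2 characters must match exactly, 3 to 5 characters allow one edit, longer inputs allow two) and compares the keyword with the equally long prefix of a candidate using plain Levenshtein distance. The real suggester works on the FST and has further options such as a minimum length, so treat this only as an approximation; all names here are hypothetical.

```csharp
using System;

public static class FuzzyPrefixSketch
{
    // Edit budget that AUTO fuzziness grants by default, based on the input length.
    public static int AutoFuzziness(int length)
    {
        if (length <= 2) return 0;
        if (length <= 5) return 1;
        return 2;
    }

    // Classic Levenshtein distance (insertions, deletions, substitutions).
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous; previous = current; current = swap;
        }
        return previous[b.Length];
    }

    // True when the keyword is within the AUTO edit budget of the candidate's prefix.
    public static bool Matches(string keyword, string candidate)
    {
        string k = keyword.ToLowerInvariant();
        string c = candidate.ToLowerInvariant();
        string prefix = c.Substring(0, Math.Min(k.Length, c.Length));
        return EditDistance(k, prefix) <= AutoFuzziness(k.Length);
    }
}
```

Under these assumptions, Matches("iph", "Iphone 8") and Matches("iph", "iPad Pro") are both true, because “ipa” is one substitution away from “iph”, while Matches("iph", "Galaxy S8") is false.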
Now we can start to develop the console application that will feed the suggestion documents to Elasticsearch. Let’s create a .NET Core console application project called “Autocomplete.Feed” and reference the “Autocomplete.Business.Objects” and “Autocomplete.Business” libraries.
First of all, we need an Elasticsearch instance to test against. Let’s start one with Docker using the following command.
docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:6.1.2
Then replace the contents of “Program.cs” with the following code block.
using System;
using System.Collections.Generic;
using Autocomplete.Business;
using Autocomplete.Business.Objects;
using Nest;

namespace Autocomplete.Feed
{
    class Program
    {
        static void Main(string[] args)
        {
            // Each product lists every input text that should lead to it.
            List<Product> products = new List<Product>();

            products.Add(new Product()
            {
                Id = 1,
                Name = "Samsung Galaxy Note 8",
                Suggest = new CompletionField()
                {
                    Input = new[] { "Samsung Galaxy Note 8", "Galaxy Note 8", "Note 8" }
                }
            });

            products.Add(new Product()
            {
                Id = 2,
                Name = "Samsung Galaxy S8",
                Suggest = new CompletionField()
                {
                    Input = new[] { "Samsung Galaxy S8", "Galaxy S8", "S8" }
                }
            });

            products.Add(new Product()
            {
                Id = 3,
                Name = "Apple Iphone 8",
                Suggest = new CompletionField()
                {
                    Input = new[] { "Apple Iphone 8", "Iphone 8" }
                }
            });

            products.Add(new Product()
            {
                Id = 4,
                Name = "Apple Iphone X",
                Suggest = new CompletionField()
                {
                    Input = new[] { "Apple Iphone X", "Iphone X" }
                }
            });

            products.Add(new Product()
            {
                Id = 5,
                Name = "Apple iPad Pro",
                Suggest = new CompletionField()
                {
                    Input = new[] { "Apple iPad Pro", "iPad Pro" }
                }
            });

            var connectionSettings = new ConnectionSettings(new Uri("http://localhost:9200"));
            IAutocompleteService autocompleteService = new AutocompleteService(connectionSettings);
            string productSuggestIndex = "product_suggest";

            // Recreate the index, then bulk-index the products if creation succeeded.
            bool isCreated = autocompleteService.CreateIndexAsync(productSuggestIndex).Result;

            if (isCreated)
            {
                autocompleteService.IndexAsync(productSuggestIndex, products).Wait();
            }
        }
    }
}
In the code block above, we first create the products that will be used in autocomplete. While creating them, we set the inputs that should match each product on its “CompletionField” property. Thus, if a user types “Galaxy Note 8” or “Note 8”, we can match these inputs to “Samsung Galaxy Note 8”.
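Writing these inputs by hand does not scale to a large catalog. As a minimal sketch, the inputs could be generated from the product name by taking the full name plus every trailing word sequence. The “SuggestInputBuilder” class and its “minWords” parameter are hypothetical helpers for illustration and are not part of NEST or the sample project.

```csharp
using System;
using System.Collections.Generic;

public static class SuggestInputBuilder
{
    // Returns the full name plus every trailing word sequence that still has
    // at least minWords words. minWords is a hypothetical knob, not a NEST setting.
    public static string[] BuildInputs(string name, int minWords = 2)
    {
        string[] words = name.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var inputs = new List<string>();
        for (int start = 0; start < words.Length; start++)
        {
            int remaining = words.Length - start;
            // Always keep the full name; stop once the tail becomes too short.
            if (start > 0 && remaining < minWords) break;
            inputs.Add(string.Join(" ", words, start, remaining));
        }
        return inputs.ToArray();
    }
}
```

BuildInputs("Samsung Galaxy Note 8") produces “Samsung Galaxy Note 8”, “Galaxy Note 8” and “Note 8”, the same inputs as in the feeder above. Passing minWords = 1 would also add single-word tails such as “S8” for “Samsung Galaxy S8”.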
After creating the products, we feed them to Elasticsearch using the “AutocompleteService” class that we created earlier.
We now have a feeder project, so let’s run it. If it runs successfully, we can see the “product_suggest” index in Elasticsearch with the following mapping.
GET product_suggest/_mapping
{
  "product_suggest": {
    "mappings": {
      "product": {
        "properties": {
          "id": {
            "type": "integer"
          },
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "suggest": {
            "type": "completion",
            "analyzer": "simple",
            "preserve_separators": true,
            "preserve_position_increments": true,
            "max_input_length": 50
          }
        }
      }
    }
  }
}
2) Designing the Autocomplete API
Now all we need to do is design an API to expose the autocomplete feature. Let’s create a .NET Core Web API project called “Autocomplete.API” and reference the “Autocomplete.Business.Objects”, “Autocomplete.Business” and “NEST” libraries.
After that, create a controller called “ProductSuggestsController”.
using System.Threading.Tasks;
using Autocomplete.Business;
using Autocomplete.Business.Objects;
using Microsoft.AspNetCore.Mvc;

namespace Autocomplete.API.Controllers
{
    [Route("api/product-suggests")]
    public class ProductSuggestsController : Controller
    {
        readonly IAutocompleteService _autocompleteService;
        const string PRODUCT_SUGGEST_INDEX = "product_suggest";

        public ProductSuggestsController(IAutocompleteService autocompleteService)
        {
            _autocompleteService = autocompleteService;
        }

        [HttpGet]
        public async Task<ProductSuggestResponse> Get(string keyword)
        {
            return await _autocompleteService.SuggestAsync(PRODUCT_SUGGEST_INDEX, keyword);
        }
    }
}
In the “Get” method, we simply return the “ProductSuggestResponse” obtained from “IAutocompleteService”.
In the “Startup” class, we also need to register the services in the service collection.
public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc();

    services.AddSingleton(x => new ConnectionSettings(new Uri("http://localhost:9200")));
    services.AddTransient<IAutocompleteService, AutocompleteService>();
}
That’s all.
Let’s run the API project to test the autocomplete feature, and assume a user has entered “iph”.
GET http://localhost:5000/api/product-suggests?keyword=iph
In the response, we can see the related results for “iph”, such as “Iphone 8”, “Iphone X” and “iPad Pro”.
Now let’s assume the user entered “app” instead of “iph”.
This time, “Apple Iphone 8”, “Apple Iphone X” and “Apple iPad Pro” are suggested to the user.
Conclusion
As I mentioned at the beginning of this article, there are many ways to implement an autocomplete feature in Elasticsearch, each with its own trade-offs. The completion suggester works faster than term-based queries. Its disadvantage is that matching always starts at the beginning of the text. In addition, sort order options are limited.
https://github.com/GokGokalp/Elasticsearch-Autocomplete-API-Sample
References
https://www.elastic.co/blog/you-complete-me
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
Thanks for the great article! You describe a lot. I only wanted to ask: what if we are using some sort of DB with a large amount of data in it? What would be the right way to create the suggesters? I see you are inputting them by yourself.
Hi Kirill, thanks for your interest. Could you please give me a little more information about your question? What do you mean by “I see you inputting them by yourself”? Is it about indexing data from the DB into Elasticsearch, or something else?
Well, you have your own examples in your code. I’ve changed your solution a bit and now I can post some kind of news with fields like Name, Tags and Short Description. I wanted to ask if you know how to connect ES with MS SQL, for example (I’ve read something about Logstash). And is it possible to suggest not only by Name but by Description too? Thanks for your time.
Hi Kirill, I’m sorry for the late reply; these days I’m a little bit busy with wedding preparations. 🙂 So yes, you can set multiple inputs on one suggestion item, e.g. Input = new[] { "Name", "Tags", "Description" }. If any of these inputs matches the search term, you can display whatever text you want for the suggestion. It is the same as in my example. Thanks.
Hi.
Thanks for your nice article.
I have a problem implementing an autocomplete suggester in ASP.NET Core with SQL Server.
How can I create an index from multiple fields of several tables?
For example, I want to search by product name, category name, product attributes, etc.
Thanks.
Hi Hamid, thanks.
There are a few ways to do that.
For example, while you prepare the index data, you can add more input texts so that the product can be found by product name, category name, attributes, etc.
products.Add(new Product()
{
Id = 3,
Name = "Apple Iphone 8",
Suggest = new CompletionField()
{
Input = new[] { "Apple Iphone 8", "Iphone 8", "Cell Phones", "128GB Iphone 8", "Silver 128GB Iphone 8", "ADD WHAT YOU WANT" }
}
});
Or you can create a different index and use term-based queries instead of the completion suggester. That way you can search by product name, category name or product attributes in parallel and then aggregate the results.
Regards
Hello,
Does ES ingest a table from our existing DB directly, and do we also have to update ES every time we do an insert, update or delete? In an established system, how can ES be used to search among millions of rows in a products table?
Hello, yes. You need to create indexes for whatever you want to search, and keep those indexes up to date on every change.
Hello,
Searching with suggest works fine, but whenever I try to add one more field to it, I keep getting errors.
In other words, I could not manage to write a query that searches with suggest but only returns the items where userid = 5.
Can you help?
Hello, since the completion suggester uses the FST structure, I am not sure it is suitable for this. If you want to apply additional filters, I would recommend using a term-based approach instead of the completion suggester.
Hello, in your example you perform all operations on Elasticsearch. What happens if our data is in MS SQL?
Could you elaborate? The feature we built and used here is a feature of Elasticsearch. If you are talking about doing the same thing in MS SQL instead of Elasticsearch, you need to follow a different approach.
Hello, thank you for your efforts.
However, when the application is created in September 2021 with .NET Core 2.0 and NEST Version="5.6.0", it cannot create the index.
The driver package needs to be upgraded for this. Anyone who wants to try it needs to upgrade the driver to the latest version and revise the index-related parts of the code to something like _elasticClient.Indices.Exists() or _elasticClient.Indices.CreateAsync() 🙂
Thank you for the update.
Hello Gökhan,
Frankly, the other commenters also tried to ask this, but I think it was not fully understood.
What do we need to do to transfer the records in our existing MSSQL database to ES, keep them up to date, and run the search operations through ES?
That is:
1- Transferring the existing MSSQL records to ES,
2- To keep ES synchronized with MSSQL, do we also need to call the ES-related methods after every Insert/Update/Delete method in our project that goes to MSSQL?
I guess that to transfer the existing data we will need a loop that runs once and inserts all the relevant tables. For later operations, it seems we will need to write code that updates ES together with MSSQL.
I would appreciate it if you could give short code examples for these.
Hello, thank you for your comment.
Yes, for a one-off migration you can transfer the records to ES as you like with a script or application that you prepare. When it comes to keeping those records up to date, the best solution is to make use of event-based systems. Whenever one of those records changes within the relevant domain, you can publish an event and update the corresponding record on the ES side asynchronously. You can also search for this kind of operation under the name “invalidation”.
Hello, I applied the “Fuzzy Matching” example with the suggester, but I could not find how to specify “Weights” for scoring. Could you make an example?
Thanks.
Hello, sorry for the late reply. Frankly, it has been quite a while since I wrote this article. 🙂 I cannot promise, but if I find some free time I will try to look into it. Thank you.