Elasticsearch implementation

We helped architect and design the migration of the bol.com main application from Endeca to Elasticsearch.

Bol.com used Endeca as its main search engine for quite some time, powering search results, navigation and product faceting. The scalability, flexibility and price structure of this tool were not future-proof for bol.com, so they initiated a project to move to Elasticsearch.

In a big project like this, some decisions need to be made very early, because early mistakes are costly to fix later. Setting up the right data model for Elasticsearch is one such decision that is important to get right the first time.

For bol.com, two main entities needed to be represented: Products and Offers. A hierarchical relationship like that can be modelled in many different ways; the most prominent are the Document model, Nested documents, and Parent/Child relationships.
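To make the three options concrete, here is a minimal sketch of what the mappings could look like. It is an illustration, not the mapping used in the project: all field names (product_id, title, offer_id, seller, price, relation) are hypothetical, it assumes one Product has many Offers, and it reads "Document model" as one flat, denormalized document per offer. The Parent/Child variant uses the join field of Elasticsearch 6 and later; older versions expressed the same idea with a `_parent` mapping.

```python
# Hypothetical field names; none of them come from the bol.com project.
PRODUCT_FIELDS = {
    "product_id": {"type": "keyword"},
    "title": {"type": "text"},
}
OFFER_FIELDS = {
    "offer_id": {"type": "keyword"},
    "seller": {"type": "keyword"},
    "price": {"type": "float"},
}

# 1. Document model (assumed here: one flat document per offer,
#    with the product fields duplicated into every offer document).
DENORMALIZED_MAPPING = {
    "properties": {**PRODUCT_FIELDS, **OFFER_FIELDS},
}

# 2. Nested documents: one document per product, offers stored as
#    nested objects that are indexed and queried as a unit.
NESTED_MAPPING = {
    "properties": {
        **PRODUCT_FIELDS,
        "offers": {"type": "nested", "properties": dict(OFFER_FIELDS)},
    },
}

# 3. Parent/Child: products and offers are separate documents in the
#    same index, linked through a join field (Elasticsearch 6+ syntax).
PARENT_CHILD_MAPPING = {
    "properties": {
        **PRODUCT_FIELDS,
        **OFFER_FIELDS,
        "relation": {"type": "join", "relations": {"product": "offer"}},
    },
}
```

Roughly, the trade-off is that the flat model duplicates product data, the nested model requires reindexing the whole product when a single offer changes, and the parent/child model allows independent offer updates at the price of more expensive queries.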

We set up a test environment, loaded it with randomized data sets in each of the models, and fired a series of user query scenarios at it. We repeated this at different total data sizes, so that we could assess the performance, query flexibility and scalability of each model.
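The data-generation side of such an experiment could be sketched as follows. This is an assumed reconstruction, not the code used in the project: the function names, field names, value ranges and the reading of "Document model" as one flat document per offer are all hypothetical. The idea is to generate the same random products once and emit them in each of the three shapes, so that every model is tested against identical data.

```python
import random


def random_product(rng, product_id, max_offers=5):
    """Build one hypothetical product with a random number of offers."""
    offers = [
        {
            "offer_id": f"{product_id}-{i}",
            "seller": f"seller-{rng.randint(1, 50)}",
            "price": round(rng.uniform(1, 500), 2),
        }
        for i in range(rng.randint(1, max_offers))
    ]
    return {"product_id": str(product_id), "title": f"product {product_id}", "offers": offers}


def as_flat(product):
    """Document model (assumed): one document per offer, product fields duplicated."""
    base = {k: v for k, v in product.items() if k != "offers"}
    return [{**base, **offer} for offer in product["offers"]]


def as_nested(product):
    """Nested model: one document per product with offers embedded."""
    return [product]


def as_parent_child(product):
    """Parent/child model: one parent document plus one child document per offer."""
    base = {k: v for k, v in product.items() if k != "offers"}
    parent = {**base, "relation": "product"}
    children = [
        {**offer, "relation": {"name": "offer", "parent": product["product_id"]}}
        for offer in product["offers"]
    ]
    return [parent] + children


def build_datasets(n_products, seed=42):
    """Generate the same random products in all three shapes."""
    rng = random.Random(seed)  # fixed seed keeps runs comparable
    products = [random_product(rng, pid) for pid in range(n_products)]
    return {
        "flat": [d for p in products for d in as_flat(p)],
        "nested": [d for p in products for d in as_nested(p)],
        "parent_child": [d for p in products for d in as_parent_child(p)],
    }
```

Calling `build_datasets` with increasing values of `n_products` gives data sets of different total sizes; each set would then be bulk-loaded into its own index and the same query scenarios timed against each.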

We used the outcome of this experiment to model the objects in the system, and then started to implement the loading of the data from the source systems.

Scalable Data models with Elasticsearch

Slides from my presentation about this project at the Elasticsearch Meetup Amsterdam in 2016 can be found on Slideshare.
