A recommended resilient and scalable Elasticsearch data indexing architecture — Part 2

Outline

Part 1

  1. What is Data Indexing? And why do I need to index?
  2. Open-source Search Engines — Elasticsearch
  3. Message queue/brokers — RabbitMQ
  4. Conventional data indexing flow

Part 2

  1. Problems with conventional data indexing flow
  2. Introduction to Extract-Transform-Load (ETL) indexing architecture
  3. Benefits of proposed architecture
  4. Complication of horizontal scaling

Problems with conventional data indexing flow

Diagram of a conventional data indexing flow

Transaction Boundary

Impact of a larger transaction boundary

Proposed Architecture

Diagram of Proposed Architecture

Benefits of Proposed Architecture

Scalable

Resilient

Modular — Separation of Concerns (SoC)

Complication of Horizontal Scaling

Diagram of horizontal scaling complication

Mitigation

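  1. If the version stored in Elasticsearch (ES) is older than the incoming document's version, we can proceed to index the incoming document.
  2. If the version stored in ES is newer than the incoming document's version, the incoming document is stale and we can safely discard it.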
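
The two rules above amount to a compare-before-write check. Below is a minimal sketch in Python, assuming each document carries an integer version that only ever increases. The names `VersionedDoc`, `InMemoryIndex`, and `index_if_newer` are hypothetical, and the dictionary is only a stand-in for an Elasticsearch index, not its real client API. Treating an equal version as a duplicate to discard is also an assumption, since the rules only cover strictly older and strictly newer versions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VersionedDoc:
    """Hypothetical document shape: an id, a version number, and a payload."""
    doc_id: str
    version: int
    body: dict


class InMemoryIndex:
    """Hypothetical stand-in for an Elasticsearch index (a plain dict)."""

    def __init__(self) -> None:
        self._docs: Dict[str, VersionedDoc] = {}

    def get(self, doc_id: str) -> Optional[VersionedDoc]:
        return self._docs.get(doc_id)

    def index_if_newer(self, incoming: VersionedDoc) -> bool:
        """Index the document only if it is newer than what is stored.

        Returns True if the document was indexed, False if it was discarded.
        """
        stored = self._docs.get(incoming.doc_id)
        # Rule 1: nothing stored yet, or stored version is older -> index it.
        if stored is None or stored.version < incoming.version:
            self._docs[incoming.doc_id] = incoming
            return True
        # Rule 2: stored version is newer (or equal, by assumption) -> discard.
        return False


if __name__ == "__main__":
    index = InMemoryIndex()
    # Two workers process updates for the same document out of order:
    # version 2 arrives first, then the stale version 1.
    index.index_if_newer(VersionedDoc("user-1", 2, {"name": "new"}))
    index.index_if_newer(VersionedDoc("user-1", 1, {"name": "old"}))
    print(index.get("user-1"))
```

Against a real cluster this comparison may not need to live in the worker at all: Elasticsearch's index API accepts a `version` together with `version_type=external`, and rejects a write whose version is not higher than the stored one.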

Conclusion

Full stack software engineer; the only way to not be replaced by robots, is to be the architect of said robots.

Dan Ong Chew Seng
