Message Brokers

Introduction

Message brokers move messages or streams of data from senders to receivers. They play a key role in microservice architectures and, for example, when you want to control data feeds. Using a broker gives you a centralized place to process and store these messages, which can act as a single source of truth.

There are two main messaging patterns:

  1. queueing
  2. publish-subscribe (pub-sub)
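The difference between the two patterns can be sketched with a small in-memory example. This is only an illustration, not how any particular broker is implemented: the function names, the consumer names, and the round-robin choice of consumer are all assumptions made for the sketch.

```python
from collections import defaultdict
from itertools import cycle


def deliver_queue(messages, consumers):
    """Queueing: each message goes to exactly one consumer (round-robin here)."""
    received = defaultdict(list)
    targets = cycle(consumers)
    for message in messages:
        received[next(targets)].append(message)
    return dict(received)


def deliver_pubsub(messages, subscribers):
    """Publish-subscribe: every subscriber gets its own copy of each message."""
    return {name: list(messages) for name in subscribers}


messages = ["m1", "m2", "m3", "m4"]

# Hypothetical consumer names, used only for illustration.
queue_result = deliver_queue(messages, ["worker-a", "worker-b"])
pubsub_result = deliver_pubsub(messages, ["billing", "audit"])

print(queue_result)
print(pubsub_result)
```

With queueing the work is split between the consumers, while with pub-sub each subscriber sees the full stream.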

Apache Kafka

https://kafka.apache.org/

Powerful event streaming platform originally developed at LinkedIn. It is fault-tolerant and reliable, with good horizontal scalability, and because it is designed as a stream processor it can handle big data projects and real-time processing. It is a little harder to install and maintain, however, because it has traditionally depended on Apache ZooKeeper (newer Kafka releases can run without it, in KRaft mode). ZooKeeper is used, for example, to track the status of cluster nodes, for leader election, and for configuration management.

Kafka can easily be connected to multiple consumers, such as ELK.

At its core, Kafka lets clients publish and subscribe to streams of records (topics). Topics can be partitioned across multiple nodes for a highly available deployment. Each partition is a log: a time-ordered, append-only sequence of records, where every record consists of a key, a value, and a timestamp.
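To make the idea of a partitioned, append-only log more concrete, here is a toy in-memory model. It is a sketch under stated assumptions, not Kafka's implementation: the `Topic` and `Record` classes are hypothetical, and CRC32 is used only as a stand-in for whatever hash a real partitioner applies to the key.

```python
import time
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    value: str
    timestamp: float


class Topic:
    """Toy topic: a fixed number of append-only partition logs."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: CRC32 stands in for the real partitioner's hash.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        """Append a record to its partition and return (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append(Record(key, value, time.time()))
        return partition, len(log) - 1


topic = Topic(num_partitions=3)
first = topic.append("user-42", "login")
second = topic.append("user-42", "logout")
print(first, second)
```

Records that share a key land in the same partition, so their relative order is preserved within that partition's log.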

Kafka messages do not have separate IDs; they are addressed by their offset in the log. Kafka is highly performant because it does not offer random access (messages are always delivered in order within a partition, starting at a given offset) and because producers can be configured not to wait for an ack from the broker.
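Offset-based addressing can be illustrated with a minimal pull-style consumer. The `Log` and `Consumer` classes below are hypothetical names for a toy model and do not correspond to any Kafka client API; the batch size of two is an arbitrary assumption.

```python
class Log:
    """Toy append-only log; a message's only address is its offset."""

    def __init__(self):
        self._entries = []

    def append(self, value):
        self._entries.append(value)
        return len(self._entries) - 1  # offset of the new entry

    def read(self, offset, max_messages):
        # Sequential read: always a contiguous slice starting at `offset`.
        return self._entries[offset:offset + max_messages]


class Consumer:
    """Pulls batches at its own pace and remembers its own position."""

    def __init__(self, log, start_offset=0):
        self.log = log
        self.offset = start_offset

    def poll(self, max_messages=2):
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)  # advance only past what was read
        return batch


log = Log()
for event in ["a", "b", "c"]:
    log.append(event)

late_consumer = Consumer(log)        # connects after the messages were written
first_batch = late_consumer.poll()   # ["a", "b"]
second_batch = late_consumer.poll()  # ["c"]
third_batch = late_consumer.poll()   # [] (nothing new yet)
```

Because the log keeps the messages and the consumer only tracks a position, a consumer that connects late can still read everything from the offset it chooses.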

RabbitMQ

https://www.rabbitmq.com/

It is easy to install and deploy (Puppet, Chef, Docker, and others), supports all mainstream programming languages, and has a nice built-in UI. It supports the pub-sub communication pattern and, like Apache ActiveMQ, is suitable for a wide range of projects. Routing and clustering are available across various zones and regions.

Similarly to ActiveMQ, it tracks the state of each message and verifies whether it was delivered successfully.

It can struggle with processing large amounts of data, where Apache Kafka might be more useful. It keeps message queues in memory by default, so with a temporary queue a consumer that is not connected when a message is published will not receive it (in contrast to Kafka, where a consumer can still get the message even if it connects later).

Apache ActiveMQ

https://activemq.apache.org/

It is a push-based messaging system: messages are pushed out to the consumers by ActiveMQ, whereas in Apache Kafka each consumer pulls messages at its own pace.

Another contrast to Kafka is that it maintains the state of every message (there is an ack from the consumer). Its message storage has about 70% more overhead than Apache Kafka's, so Kafka is much better for smaller messages.
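What "maintaining the state of every message" means can be sketched with a toy push broker. This is an illustrative model only, not ActiveMQ's implementation: the `AckTrackingBroker` class, the three states, and the convention that a consumer callback returns `True` to acknowledge are all assumptions.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    DELIVERED = "delivered"
    ACKED = "acked"


class AckTrackingBroker:
    """Toy push broker that records a state for every message."""

    def __init__(self):
        self.states = {}
        self.payloads = {}
        self._next_id = 0

    def publish(self, payload):
        message_id = self._next_id
        self._next_id += 1
        self.payloads[message_id] = payload
        self.states[message_id] = State.PENDING
        return message_id

    def push(self, consumer):
        """Push every unacknowledged message to the consumer callback."""
        for message_id, state in list(self.states.items()):
            if state is not State.ACKED:
                self.states[message_id] = State.DELIVERED
                # Assumption: the callback returns True to acknowledge.
                if consumer(self.payloads[message_id]):
                    self.states[message_id] = State.ACKED

    def unacked(self):
        return [m for m, s in self.states.items() if s is not State.ACKED]


broker = AckTrackingBroker()
ok_id = broker.publish("order-1")
bad_id = broker.publish("order-2")

broker.push(lambda payload: payload != "order-2")  # second message is not acked
pending_after_first_push = broker.unacked()

broker.push(lambda payload: True)                  # redelivery is acknowledged
pending_after_second_push = broker.unacked()
```

The bookkeeping per message is what allows redelivery, and it is also extra work that a log with consumer-managed offsets does not have to do.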

As a result, Kafka has about 2-4x higher throughput. In the end, though, ActiveMQ's simplicity and its sufficiency for most retail-based scenarios make it a good fit. A bit of a downside is that its client libraries are sometimes not well maintained and can contain bugs (in contrast to RabbitMQ, which provides its own libraries).

Redis

https://redis.io/

Redis has already been discussed separately; it also has pub-sub features. Similarly to RabbitMQ, messages are sent and forgotten if there is no subscriber on the channel. Redis is really lightweight, and its simplicity and performance make it popular for a range of use cases.
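The "sent and forgotten" behaviour can be modelled in a few lines. The sketch below is an in-memory toy, not the Redis client API: `FireAndForgetChannels` is a hypothetical class, and its return value is modelled loosely on the receiver count that the Redis PUBLISH command reports.

```python
from collections import defaultdict


class FireAndForgetChannels:
    """Toy pub-sub: messages reach current subscribers only and are not stored."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Deliver to current subscribers; return how many received it."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


channels = FireAndForgetChannels()
inbox = []

missed = channels.publish("news", "early")    # no subscriber yet: dropped
channels.subscribe("news", inbox.append)
delivered = channels.publish("news", "late")  # reaches the new subscriber

print(missed, delivered, inbox)
```

The message published before anyone subscribed is never seen by the later subscriber, which is the key difference from a retained log.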