Build Data Insights and Business Metrics with ELK Stack

With the world around us getting more and more connected, different types of computing devices have emerged: heavy-duty servers, laptops, desktops, mobile phones or even your refrigerator. One unique thread that connects all these devices is their logging of system information. These logs are nothing but a time-sequenced stream of messages. Systems can now log any piece of structured or unstructured data: application logs, transactions, audit logs, alarms, statistics or even tweets. Add to this the sheer scale of logs, and the earlier methodology of human analysis no longer works. There has to be some automated mechanism for analyzing logs and deciphering useful information from them.

The trio of Logstash, Kibana and Elasticsearch is one of the most popular open source solutions for log management. The three products together are known as the ELK stack and provide an elegant solution. At the heart of the ELK stack is Elasticsearch, a distributed, open source search and analytics engine. It is based on Apache Lucene and is designed for horizontal scalability, reliability, and easy management. Logstash is a data collection, enrichment, and transportation pipeline. The ELK stack is completed by Kibana, a data visualization platform enabling interaction with data through stunning, powerful graphics.

To start your discovery of the ELK stack, check out my book, Applied ELK Stack: Data Insights and Business Metrics with Collective Capability of ElasticSearch, Logstash and Kibana. With this book you will discover:

  • The need for log analytics, and the current challenges
  • How to perform real-time analytics on streaming data, and turn it into actionable insights
  • How to create indexes and delete data
  • The different components of the ELK (Elasticsearch, Logstash, and Kibana) stack
  • How to ship, filter, and parse events with Logstash
  • How to build amazing visualizations and dashboards using the data discovery, visualization, and dashboard features of Kibana

I hope this book helps you with log management and with deriving business insights. Do let me know your valuable feedback on the book.

Microservices Asynchronous Communication

Communication among microservices can be either synchronous or asynchronous. Each mode has its own pros and cons. In this post, I would like to share the mechanics of asynchronous communication. As an example, let's consider MyKart, an online shopping portal. Just like on a typical e-commerce site, customers (users) can log in, browse through product categories, select products and order after paying online. The application is based on a microservice architecture and is built on top of a Java stack with a Node.js/JavaScript-based UI.

Let's see how RabbitMQ can be used for asynchronous communication amongst the microservices.

Service Interaction

The interaction amongst the different services is depicted below:


A typical use case for ordering products is:

  1. Customer (user) logs into the MyKart portal. This is facilitated by the Order service. For existing customers, the details are fetched from Customer service. For new users, a new customer account is created.
  2. Customer can scan through all the different categories of products. The Order service queries the Catalog service to get the details of all the products.
  3. Customer selects the products which he/she wants to purchase. The Billing service is used for the online transaction, and a payment gateway is used to pay for the purchased products.
  4. The Order service maintains records of purchased products and notifies the Shipment service of the transaction. The Shipment service further informs the logistics team to prepare for shipping of the purchased products.

Service Interaction and Communication

To help with scalability, the bulk of the interaction amongst the services would be through asynchronous communication, facilitated by the RabbitMQ messaging system. It would help to first mention some key concepts of RabbitMQ.

RabbitMQ Fundamentals

The three parts that route messages in RabbitMQ are exchanges, queues, and bindings. Applications direct messages to exchanges. Exchanges route those messages to queues based on bindings. Bindings tie queues to exchanges and define the key (or topic) with which a queue is bound.


Besides the exchange type, exchanges are declared with a number of attributes, the most important of which are:

  • Name
  • Durability (exchanges survive broker restart)
  • Auto-delete (exchange is deleted when all queues have finished using it)
  • Arguments (these are broker-dependent)

Exchanges can be durable or transient. Durable exchanges survive broker restart whereas transient exchanges do not. The different types of exchanges are as follows:

  • Default Exchange – The default exchange is a direct exchange with no name (empty string) pre-declared by the broker. It has one special property that makes it very useful for simple applications: every queue that is created is automatically bound to it with a routing key which is the same as the queue name.
  • Direct Exchange – A direct exchange delivers messages to queues based on the message routing key. A direct exchange is ideal for the unicast routing of messages.
  • Fanout Exchange – A fanout exchange routes messages to all of the queues that are bound to it and the routing key is ignored. If N queues are bound to a fanout exchange, when a new message is published to that exchange a copy of the message is delivered to all N queues. Fanout exchanges are ideal for the broadcast routing of messages.
  • Topic Exchange – Topic exchanges route messages to one or many queues based on matching between a message routing key and the pattern that was used to bind a queue to an exchange. The topic exchange type is often used to implement various publish/subscribe pattern variations. Topic exchanges are commonly used for the multicast routing of messages.
  • Headers Exchange – A headers exchange is designed for routing on multiple attributes that are more easily expressed as message headers than a routing key. Headers exchanges ignore the routing key attribute. Instead, the attributes used for routing are taken from the headers attribute. A message is considered matching if the value of the header equals the value specified upon binding.


Queues store messages that are consumed by applications. The key properties of queues are as follows:

  • Queue Name – Applications may pick queue names or ask the broker to generate a name for them.
  • Queue Durability – Durable queues are persisted to disk and thus survive broker restarts. Queues that are not durable are called transient. Not all scenarios and use cases mandate queues to be durable.
  • Bindings – These are rules that exchanges use (among other things) to route messages to queues. To instruct an exchange E to route messages to a queue Q, Q has to be bound to E. Bindings may have an optional routing key attribute used by some exchange types.
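To make the exchange–binding–queue relationship concrete, here is a tiny stdlib-only Java sketch. This is a toy model, not the RabbitMQ API; the class and routing-key names are made up. It shows the rule a direct exchange follows: a message is copied into every queue whose binding key exactly matches the message's routing key.

```java
import java.util.*;

// Toy model (not the RabbitMQ API) of a direct exchange:
// bindings map a key to the queues bound with that key.
public class DirectExchangeModel {
    private final Map<String, List<Deque<String>>> bindings = new HashMap<>();

    // Declare a queue and bind it to this exchange with the given key.
    public Deque<String> declareAndBindQueue(String bindingKey) {
        Deque<String> queue = new ArrayDeque<>();
        bindings.computeIfAbsent(bindingKey, k -> new ArrayList<>()).add(queue);
        return queue;
    }

    // Deliver the message to every queue whose binding key matches
    // the routing key; other queues never see it.
    public void publish(String routingKey, String message) {
        for (Deque<String> q : bindings.getOrDefault(routingKey, List.of())) {
            q.addLast(message);
        }
    }

    public static void main(String[] args) {
        DirectExchangeModel exchange = new DirectExchangeModel();
        Deque<String> orderQueue = exchange.declareAndBindQueue("order.created");
        Deque<String> auditQueue = exchange.declareAndBindQueue("order.created");
        Deque<String> shipQueue  = exchange.declareAndBindQueue("order.shipped");

        exchange.publish("order.created", "order #42");
        System.out.println(orderQueue.peek());   // order #42
        System.out.println(auditQueue.peek());   // order #42
        System.out.println(shipQueue.isEmpty()); // true
    }
}
```

Note that both queues bound with "order.created" receive their own copy; a direct exchange is unicast per key, not per queue.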

Message Attributes

Messages have a payload and attributes. Some of the key attributes are listed below:

  • Content type
  • Content encoding
  • Routing key
  • Delivery mode (persistent or not)
  • Message priority
  • Message publishing timestamp
  • Expiration period
  • Publisher application id

Message Acknowledgements – We know that networks can be unreliable and this might lead to application failures, so it is often necessary to have some kind of processing acknowledgement. In some cases it is only necessary to acknowledge the fact that a message has been received. In other cases an acknowledgement means that a message was validated and processed by a consumer.
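Why acknowledgements matter can be sketched with a toy broker model (illustrative only, not the RabbitMQ API; names are made up): a delivered message stays "unacked" until the consumer confirms it, and is requeued if the consumer fails first.

```java
import java.util.*;

// Toy sketch of broker-side acknowledgement handling.
public class AckModel {
    private final Deque<String> ready = new ArrayDeque<>();
    private final Map<Long, String> unacked = new HashMap<>();
    private long nextTag = 1;

    public void publish(String msg) { ready.addLast(msg); }

    // Deliver the next message; it is held as unacked until acked.
    public long deliver() {
        String msg = ready.pollFirst();
        long tag = nextTag++;
        unacked.put(tag, msg);
        return tag;
    }

    // Consumer confirms processing; the broker can now forget the message.
    public void ack(long tag) { unacked.remove(tag); }

    // Simulates a consumer crash: every unacked message is requeued.
    public void consumerFailed() {
        unacked.values().forEach(ready::addFirst);
        unacked.clear();
    }

    public int readyCount() { return ready.size(); }
}
```

An unacked message survives a consumer failure and is redelivered; an acked one is gone for good, which is exactly the processing guarantee the paragraph above describes.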

Service Communication

All the services of MyKart would utilize RabbitMQ infrastructure for communicating with each other. This is depicted in the following diagram:


When different services want to implement request-response communication using RabbitMQ, there are two key patterns:

  • Exclusive reply queue per request – In this case each request creates its own reply queue. The benefit is simplicity of implementation: there is no problem correlating the response with the request, since each request has its own response consumer. If the connection between the client and the broker fails before a response is received, the broker disposes of any remaining reply queues and the response message is lost. A key consideration is that reply queues must be cleaned up if a problem on the server means the response is never published. This pattern also has a performance cost, because a new queue and consumer have to be created for each request.
  • Exclusive reply queue per client – In this case each client connection maintains a reply queue which many requests can share. This avoids the performance cost of creating a queue and consumer per request, but adds the overhead that the client needs to keep track of the reply queue and match up responses with their respective requests. The standard way of doing this is with a correlation id that is copied by the server from the request to the response.

Durable reply queue – In both the above patterns, the response message can be lost if the connection between the client and broker goes down while the response is in flight. This is because they use exclusive queues that are deleted by the broker when the connection that owns them is closed. This can be avoided by using a non-exclusive reply queue. However, this creates some management overhead: one needs some way to name the reply queue and associate it with a particular client. The problem is that it is difficult for the client to know whether a given reply queue belongs to itself or to another instance, and it is easy to naively create a situation where responses are delivered to the wrong instance of the client. In the worst case, we would have to manually create and name response queues, which removes one of the main benefits of choosing broker-based messaging in the first place.

Rather than using an exclusive reply queue per request, I would recommend using an exclusive reply queue per client. Each service would have its own direct exchange to which other services send request events. For response events, the default exchange would be leveraged. Each service would have a response queue, bound to the default exchange, corresponding to each service from which a response is expected.

Request – Response Event
The request and response messages would be sent as an event structure containing the following information:

  1. Correlation Id
    In order to correlate a response event with its request event, a unique correlation id would be included in each request. This helps in uniquely identifying a request and tying it up with its response. One way of generating a unique id is to leverage Java's UUID class. There can be other mechanisms for unique id generation as well, but in my view this is the most suitable and convenient.
  2. Payload
    The request information and response details would be serialized as a JSON string. While sending a request, the request object details would be converted into JSON format. Similarly, while receiving a response, the JSON information would be used to populate the actual response object. The third-party Jackson library would be leveraged for object-to-JSON conversion and vice versa.
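The UUID-based correlation id mentioned in point 1 can be generated with Java's UUID class as shown below; the wrapper class and method name are illustrative.

```java
import java.util.UUID;

// Generates a unique correlation id for each request event.
// The matching response event echoes the same id back.
public class CorrelationId {
    public static String next() {
        // randomUUID() returns a version-4 (random) UUID,
        // rendered as 36 characters: 8-4-4-4-12 hex digits.
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        System.out.println(CorrelationId.next());
    }
}
```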

Service Interaction
Let us take an example to see how the different services would leverage the messaging infrastructure for interaction: the use case in which a customer logs in and the Order service then fetches the customer information.

  1. Order service needs to get customer information, like name, address, etc. and for this it has to query the Customer service.
  2. The request object contains the customer id. It is converted into JSON format using Jackson.
  3. The Java UUID class is used to generate the correlation id.
  4. The message is sent to the Customer exchange with a routing key. The CustomerReadQueue is bound to the Customer exchange with this key, and so the message gets delivered to CustomerReadQueue.
  5. The reply_to field is used to indicate the queue on which the response event should be sent. In this case, the reply_to field is filled with the value CustomerResponseQueue.
  6. Once the Customer service is able to fetch the desired information, it converts it into JSON format. It then populates the correlation id received earlier. The response event is then sent to CustomerResponseQueue, which is bound to the default exchange.
  7. The Order service waits for response messages on CustomerResponseQueue. Once it receives an event, it uses the JSON payload to populate the response object. The correlation id is used to map it to the request object sent earlier.
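The correlation step above can be sketched independently of the broker plumbing. Assuming the Order service keeps a map of in-flight requests keyed by correlation id (the class below is a hypothetical helper, not part of any library), the single consumer on CustomerResponseQueue completes the matching request:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Client side of the "reply queue per client" pattern: many requests
// share one response queue, so responses are matched by correlation id.
public class ResponseCorrelator {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Called when a request event is published; the returned future
    // will hold the JSON payload of the matching response.
    public CompletableFuture<String> register(String correlationId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        return future;
    }

    // Called by the consumer on CustomerResponseQueue for each event.
    public void onResponse(String correlationId, String jsonPayload) {
        CompletableFuture<String> future = pending.remove(correlationId);
        if (future != null) {
            future.complete(jsonPayload);
        } // else: response for an unknown or expired request; log and drop
    }
}
```

With this in place, a response arriving out of order still completes the right request, which is the whole point of carrying the correlation id in both events.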

Service Notification
Besides the regular request-response interaction, the messaging infrastructure can also be used for one-way notifications. Any good application should log important information and also raise metrics and alarms for it. All the services of MyKart would send important notifications to the Logging service and the Metrics service. The Logging service would use these events to log information in the logging infrastructure. The Metrics service would use this information to raise metrics and alarms, which help with the serviceability of the microservices.

For MyKart, there is a fanout exchange named Notification. It sends received events to all the bound queues. Both the Logging service and the Metrics service have their queues bound to this exchange: events for the Logging service are received on LoggingQueue, and events for the Metrics service are received on MetricsQueue.
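A toy model of this fanout behaviour (again, a sketch rather than the RabbitMQ API) with the queue names from the MyKart setup: every bound queue gets its own copy of each published event, and the routing key plays no part.

```java
import java.util.*;

// Toy fanout exchange: publish copies the event to all bound queues.
public class FanoutModel {
    private final Map<String, Deque<String>> boundQueues = new LinkedHashMap<>();

    public void bind(String queueName) {
        boundQueues.put(queueName, new ArrayDeque<>());
    }

    public void publish(String event) {
        boundQueues.values().forEach(q -> q.addLast(event));
    }

    public Deque<String> queue(String name) { return boundQueues.get(name); }

    public static void main(String[] args) {
        FanoutModel notification = new FanoutModel();
        notification.bind("LoggingQueue");
        notification.bind("MetricsQueue");
        notification.publish("order #42 shipped");
        System.out.println(notification.queue("LoggingQueue").peek()); // order #42 shipped
        System.out.println(notification.queue("MetricsQueue").peek()); // order #42 shipped
    }
}
```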


As has been demonstrated in the shopping cart example, RabbitMQ can be used effectively for asynchronous communication among different services.

Mining Mailboxes with Elasticsearch and Kibana

In a previous post I had mentioned that the trio of Logstash, Kibana and Elasticsearch (the ELK stack) is one of the most popular open source solutions not only for log management but also for data analysis. In this post I will demonstrate how ELK can be used to perform big data analysis effectively and efficiently. As a reference, let's take some huge mailbox data. Mail archives are arguably one of the most interesting kinds of social web data. Mail is omnipresent, and each message throws light on the communication people are having. As a CXO of an organization, you may want to analyze corporate mail for trends and patterns.

As a reference, I will take the well-known Enron corpus, as it is a huge collection of mails and there is no risk of any legal or privacy concerns. This data will first be standardized into the Unix mailbox (mbox) format, and from there transformed into a single JSON file.

Getting the Enron corpus data

The full Enron dataset in a raw form is available for download in various formats. I will start with the original raw form of the dataset, which is essentially a set of folders organizing a collection of mailboxes by person and folder. The following snippet illustrates the basic structure of the corpus after you have downloaded and unarchived it. Go ahead and play with it a little so that you become familiar with it.

C:\> cd enron_mail_20110402\maildir  # Go into the mail directory

C:\enron_mail_20110402\maildir> dir  # Show folders/files in the current directory
allen-p         crandell-s     gay-r           horton-s
lokey-t         nemec-g        rogers-b        slinger-r
tycholiz-b      arnold-j       cuilla-m        geaccone-t
               …directory listing truncated…
neal-s          rodrique-r     skilling-j      townsend-j

C:\enron_mail_20110402\maildir> cd allen-p  # Go into the allen-p folder

C:\enron_mail_20110402\maildir\allen-p> dir  # Show files in the current directory
_sent_mail          contacts          discussion_threads  notes_inbox
sent_items          all_documents     deleted_items       inbox
sent                straw

C:\enron_mail_20110402\maildir\allen-p> cd inbox  # Go into the inbox for allen-p

C:\enron_mail_20110402\maildir\allen-p\inbox> dir  # Show the files in the inbox for allen-p
1. 11. 13. 15. 17. 19. 20. 22. 24. 26. 28. 3. 31. 33. 35. 37. 39. 40.
2. 44. 5. 62. 64. 66. 68. 7. 71. 73. 75. 79. 83. 85. 87. 10. 12. 14.
3. 18. 2. 21. 23. 25. 27. 29. 30. 32. 34. 36. 38. 4. 41. 43. 45. 6.
63. 65. 67. 69. 70. 72. 74. 78. 8. 84. 86. 9.

C:\enron_mail_20110402\maildir\allen-p\inbox> cat 1.  # Show contents of the file named “1.”

Message-ID: <16159836.1075855377439.JavaMail.evans@thyme>
Date: Fri, 7 Dec 2001 10:06:42 -0800 (PST)
Subject: RE: West Position
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-From: Dunton, Heather </O=ENRON/OU=NA/CN=RECIPIENTS/CN=HDUNTON>
X-To: Allen, Phillip K. </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Pallen>
X-Folder: \Phillip_Allen_Jan2002_1\Allen, Phillip K.\Inbox
X-Origin: Allen-P
X-FileName: pallen (Non-Privileged).pst

Please let me know if you still need Curve Shift.



Now the next step is to convert the mail data into the Unix mbox format. An mbox is in fact just a large text file of concatenated mail messages that is easily accessible by text-based tools. I have used a Python script to convert the data into mbox format. Thereafter, this mbox file is converted into an ELK-compatible JSON format. The JSON file can be found here. A snippet of the JSON file appears below:


[{"X-cc": "", "From": "", "X-Folder": "\jskillin\Inbox", "Content-Transfer-Encoding": "7bit", "X-bcc": "", "X-Origin": "SKILLING-J", "To": [""], "parts": [{"content": "\n[IMAGE]\n[IMAGE]\nJoin us June 26th for an on-line seminar featuring Steven J. Kafka, Senior Analyst at Forrester Research, as he discusses how technology can create more effective collaboration in today's virtualized enterprise. Also featuring Mike Hager, VP, OppenheimerFunds, offering insights into implementing these technologies through real-world experiences. Brian Anderson, CMO, Access360 will share techniques and provide tips on how to successfully deploy resources across the virtualized enterprise. \nDon't miss this important event. Register now at . For a sneak preview, check out our one-minute animation that illustrates the challenges of provisioning access rights across the \"virtualized\" enterprise.\nAbout Access360\nAccess360 provides the software and services needed for deploying policy-based provisioning solutions. Our solutions help companies automate the process of provisioning employees, contractors and business partners with access rights to the applications they need. With Access360, companies can react instantaneously to changing business environments and relationships and operate with confidence, whether in a closed enterprise environment or across a virtual or extended enterprise.\n \nAccess360 \n\nIf you would prefer not to receive further messages from this sender:\n1. Click on the Reply button.\n2. Replace the Subject field with the word REMOVE.\n3. Click the Send button.\nYou will receive one additional e-mail message confirming your removal.\n\n", "contentType": "text/plain"}], "X-FileName": "jskillin.pst", "Mime-Version": "1.0", "X-From": "Access360 <>@ENRON", "Date": {"$date": 991326029000}, "X-To": "Skilling, Jeff </o=ENRON/ou=NA/cn=Recipients/cn=JSKILLIN>", "Message-ID": "<14649554.1075840159275.JavaMail.evans@thyme>", "Content-Type": "text/plain; charset=us-ascii", "Subject": "Forrester Research on Best Practices for the \"Virtualized\" Enterprise"}]


When you have a huge amount of data to push into Elasticsearch, it is better to do a bulk import by specifying a data file. Each mail message sits on a line of its own, preceded by an action entry specifying the index (enron) and type (inbox). There is no need to specify an id, as Elasticsearch will generate one automatically.
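The layout of the bulk file can be sketched as follows; the helper class and the sample Subject values are purely illustrative, and the action line matches the index (enron) and type (inbox) used above.

```java
// Builds entries in the newline-delimited bulk format: an action line
// naming the index and type, then the document itself, one pair per message.
public class BulkBodyBuilder {
    public static String entry(String docJson) {
        return "{\"index\":{\"_index\":\"enron\",\"_type\":\"inbox\"}}\n"
             + docJson + "\n";
    }

    public static void main(String[] args) {
        StringBuilder body = new StringBuilder();
        body.append(entry("{\"Subject\":\"West Position\"}"));
        body.append(entry("{\"Subject\":\"Curve Shift\"}"));
        // This body is what gets POSTed to localhost:9200/_bulk.
        System.out.print(body);
    }
}
```

Because the action line carries no `_id` field, Elasticsearch assigns each document an id automatically, as noted above.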

Data in Elasticsearch can be broadly divided into two types – exact values and full text. Exact values are exactly what they sound like: examples are a date or a user ID, but they can also include exact strings such as a username or an email address. For example, the exact value Foo is not the same as the exact value foo, and the exact value 2014 is not the same as the exact value 2014-09-15. Full text, on the other hand, refers to textual data – usually written in some human language – like the text of a tweet or the body of an email. For the purpose of this exercise, it is better to treat email addresses (To, CC, BCC) as exact values. Hence, we first need to specify the mapping, which can be done in the following manner:

curl -XPUT "localhost:9200/enron" -d '{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    "inbox": {
      "_all": { "enabled": false },
      "properties": {
        "BCC":  { "type": "string", "index": "not_analyzed" },
        "CC":   { "type": "string", "index": "not_analyzed" },
        "From": { "type": "string", "index": "not_analyzed" },
        "To":   { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'


You can verify that the mapping has indeed been set.

curl -XGET "http://localhost:9200/_mapping?pretty"
{
  "enron" : {
    "mappings" : {
      "inbox" : {
        "_all" : {
          "enabled" : false
        },
        "properties" : {
          "BCC" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "CC" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "From" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "To" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    }
  }
}


Now let’s load all the mailbox data by using the json file, in the following manner:

curl -XPOST "http://localhost:9200/_bulk" --data-binary @enron.json


We can check if all the data has been uploaded successfully.

curl "localhost:9200/enron/inbox/_count?pretty"
{
  "count" : 41299,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  }
}


You can see that 41,299 records, each corresponding to a different message, have been uploaded. Now let's start the fun part by doing some analysis on this data. Kibana provides awesome analytic capabilities and associated charts. Let's try to see how many messages were circulated on a weekly basis.


The above histogram shows the message spread on a weekly basis. The date value is in milliseconds past the epoch. You can see that one particular week has a peak of 3546 messages. Something interesting must have been happening that week. Now let's see who the top recipients of messages are.


You can see that Gerald, Sara, Kenneth are some of the top recipients of messages. How about checking out the top senders?


You can see that Pete, Jae and Ken are the top senders of messages. In case you are wondering what exactly Enron employees used to discuss, let's check out the top keywords from message subjects.


It seems the most interesting discussions centered on enron, gas, energy and power. There is a lot more interesting analysis that can be done with the Enron mail data. I would recommend you try the following:

  • Counting sent/received messages for particular email addresses
  • What was the maximum number of recipients on a message?
  • Which two people exchanged the most messages amongst one another?
  • How many messages were person-to-person messages?



Microservices Orchestration

In my earlier post, I had introduced the concept of microservices. As a quick recap, the microservices model promises ease of development and maintenance, flexibility for developers and teams to work on different things, building blocks for a scalable system and a true polyglot model of development. However, this model is not without challenges, the biggest being the complexity of a distributed system. Since we now have to deal with multiple services spread across multiple hosts, it becomes difficult to keep track of the different hosts and services. To scale further, the number of instances of services would increase, and this would in turn lead to an increase in the number of hosts.

Container technology can help reduce the complexity of microservice orchestration. The idea of the operating system container is not new and has been present in one form or another for decades. Operating systems in general provide access to various resources in some standard way and share them amongst the running processes. Container technology takes this idea further by not only sharing but also segregating access to networks, filesystems, CPU and memory. Each container can have its own restricted amount of CPU, memory and network, independent of the other containers.

Let's see how container technology helps with microservice deployment.

  • The isolation between containers running on the same host makes it very easy to deploy microservices developed using different languages and tools.
  • The portability of containers makes deployment of microservices seamless. Whenever a new version of a service is to be rolled out, the running container can be stopped and a new container started using the latest version of the service code. All the other containers keep running unaffected.
  • Resource utilization is efficient, as containers comprise just the application and its dependencies.
  • Resource requirements are strictly controlled, without using virtualization.

Two popular container offerings are described below.

Docker – Google was amongst the biggest contributors to the underlying container technology in Linux, which evolved into LXC. Docker was initially developed as an LXC client and allows the easy creation of containers. The container images are essentially filesystems with a configuration file that gives a startup profile for the process that runs inside the container. The biggest USP of Docker is its ease of use, wherein a single command can download an existing image and boot it into a running container. This has helped its popularity and widespread use. The latest versions of Docker have no dependency on LXC and, through their libcontainer project, have a full stack of container technology all the way down to the kernel. Docker is all set to shake up the virtualization market by consuming it from the bottom.

There are some open source additions on top of Docker – Kubernetes can deploy and manage multiple Docker containers of the same type, and TOSCA can help with more complex orchestration involving multiple tiers.

Amazon ECS – The Amazon EC2 Container Service (Amazon ECS) defines a pool of compute resources called a “cluster.” Each cluster consists of one or more Amazon EC2 instances. Amazon ECS manages the state of all container-based applications running in the cluster, provides telemetry and logging, and manages capacity utilization of the cluster, allowing for efficient scheduling of work. A group of containers can be defined using a construct called a “task definition.” Each container in the task definition specifies the resources required by that container, and Amazon ECS schedules that task for execution based on the available resources in the cluster. Amazon ECS also simplifies “service discovery” by integrating with third-party tools like Zookeeper.

I would say the use of containers to deploy microservices is an evolutionary step, driven mostly by the need to make better use of compute resources and to maintain increasingly complex applications. The microservices architectural philosophy combined with containers addresses these needs, and containers offer the right abstraction to encapsulate microservices.

Microservices in a Nutshell

At first glance, “Microservices” sounds like yet another term on the long list of software architecture styles. Many experts consider it a lightweight or fine-grained form of SOA. Though it is not entirely a new idea, microservices seem to have peaked in popularity in recent years, with numerous blogs, articles and conferences evangelizing the benefits of building software systems in this style. The coming together of trends like cloud, DevOps and continuous delivery has contributed to this popularity. Netflix, in particular, has been the poster boy of the microservices architectural model.

Nevertheless, microservices deserve their place as a separate architectural style which, while leveraging earlier styles, has a certain novelty. The fundamental idea behind microservices is to develop systems as sets of fine-grained, independent and collaborating services which run in their own process space and communicate with lightweight mechanisms like HTTP. The stress is on services being small, and they can evolve over time.

In order to better understand the concept of microservices, let's start by comparing them with traditional monolithic systems. Consider an e-commerce store which sells products online. Customers can browse through the catalog for the products of their choice and place orders. The store verifies the inventory, takes the payment and then ships the order. The figure below shows a sample e-commerce store.


The online store consists of many components, like the UI which helps customers with the interaction. Besides the UI, there are other components for organizing the product catalog, order processing, taking care of any deals or discounts and managing the customer's account. The business domain shares resources like Product, Order and Deals, which are persisted in a relational database. Customers may access the store using a desktop browser or a mobile device.

Despite being a modularly designed system, the application is still monolithic. In the Java world the application would be deployed as a WAR file on a Web Server like Tomcat. The monolithic architecture has its own benefits:

  • Simple to develop as most of the IDEs and development tools are tailored for developing monolithic applications.
  • Ease of testing as only one application has to be launched.
  • Deployment friendly as in most cases a single file or directory has to be deployed on a server.

As long as the monolithic application is simple, everything seems normal from the outside. However, when the application starts getting complex in terms of scale or functionality, the handicaps become apparent. Some of the undesired shortcomings of a monolithic application are as follows:

  • A large monolithic system becomes unmanageable when it comes to bug fixes or new features.
  • Even a single line of code change leads to a significant cycle time for deploying on production environment. The entire application has to be rebuilt and redeployed even if one module gets changed.
  • The entry barrier is high for trial and adoption of new technologies. It is difficult for different modules to use different technologies.
  • It is not possible to scale different modules differently. This can be a huge headache as monolithic architectures don’t scale to support large, long lived applications.

This is where the book “The Art of Scalability” comes as a big help. It describes architectural styles that do scale, using a scalability model based on a three-dimensional scale cube.

X-axis scaling is the easiest way of scaling an application. It requires running multiple identical copies of the application behind a load balancer. In many cases it can provide a good improvement in the capacity and availability of an application without any refactoring.

In the case of Z-axis scaling, each server runs an identical copy of the application. It sounds similar to X-axis scaling, but the crucial difference is that each server is responsible for only a subset of the data. Requests are routed to the appropriate server based on some intelligent mechanism. A common way is to route on the basis of the primary key of the entity being accessed, i.e. sharding. This technique is similar to X-axis scaling in that it improves the application's capacity and availability. However, neither of these approaches does a good job of easing development and application complexity. This is where Y-axis scaling helps.
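The routing idea behind Z-axis scaling can be shown in a minimal sketch, assuming a fixed server count and numeric primary keys (both illustrative assumptions):

```java
// Z-axis scaling sketch: route each request to a server based on the
// primary key of the entity being accessed (sharding).
public class ShardRouter {
    private final int serverCount;

    public ShardRouter(int serverCount) { this.serverCount = serverCount; }

    // The same key always lands on the same server, and different keys
    // spread evenly across all servers.
    public int serverFor(long primaryKey) {
        return (int) Math.floorMod(primaryKey, serverCount);
    }
}
```

The routing is deterministic, so each server only ever sees (and caches) its own subset of the data, which is what improves capacity and availability.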

Y-axis scaling, or functional decomposition, is the third dimension of scaling. We have seen that Z-axis scaling splits things that are similar. Y-axis scaling, on the other hand, splits things that are different. It divides a monolithic application into a set of services, each corresponding to a related area of functionality like catalog, order management, etc. How do you split a system into services? It is not as easy as it appears. One way is to split by verb or use case. For example, the UI for the compare-products use case could be a separate service.

Another approach is to split on the basis of nouns or system resources. Such a service takes ownership of all operations on an entity/resource. For example, a separate catalog service could manage the catalog of products.

This leads to another interesting question: how big should a service really be? One school of thought recommends that each service should have only a small set of responsibilities. As per Robert C. Martin (Uncle Bob), services should be designed using the Single Responsibility Principle (SRP). The SRP suggests that a class should have only one reason to change. It makes great sense to apply the SRP to service design as well.

Another way of addressing this is to design services just like Unix utilities. Unix provides a large number of utilities such as grep, cat, ls and find. Each utility does exactly one thing, often exceptionally well, and can be combined with other utilities using a shell script to perform complex tasks. While designing services, try to model them on Unix utilities so that they perform a single function.
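The Unix analogy can be sketched in a few lines: small single-purpose functions that compose, much like `grep app.log | sort | uniq` in a shell. The function names and sample log are illustrative:

```python
# Sketch: model services on Unix utilities -- each function does exactly
# one thing and composes with others, like piping `grep | sort | uniq`.

def grep(lines, pattern):
    """Keep only lines containing the pattern (like `grep`)."""
    return (line for line in lines if pattern in line)

def sort_lines(lines):
    """Emit the lines in sorted order (like `sort`)."""
    return iter(sorted(lines))

def uniq(lines):
    """Drop duplicate lines (like `sort | uniq`)."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line

log = ["INFO boot", "ERROR disk", "ERROR disk", "ERROR net"]
result = list(uniq(sort_lines(grep(log, "ERROR"))))
# result == ["ERROR disk", "ERROR net"]
```

Each piece is trivial on its own; the value comes from composition, which is exactly the property we want from well-scoped services.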

Some people argue that lines of code (10 – 100 LOC) could be a benchmark for service size. I am not sure that is a good yardstick, given that some languages are more verbose than others. For example, Java is quite verbose compared to Scala. An interesting take is that a microservice should be small enough that a scrum team can develop it from scratch in a single sprint.

Amid all these interesting and diverse viewpoints, it should be remembered that the goal is to create a scalable system without the problems associated with monolithic architecture. It is perfectly fine to have some services that are very tiny while others are substantially larger.

On applying the Y-axis decomposition to the example monolithic application, we would get the following microservices based system:

As you can see, after splitting the monolithic architecture there are different frontend services, each addressing a specific UI need and interacting with multiple backend services. One such example is the Catalog UI service, which communicates with both the Catalog service and the Product service. The backend services encompass the same logical modules we had seen earlier in the monolithic system.
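A hypothetical sketch of that interaction: a Catalog UI service composing responses from two backend services. The backend calls are stubbed with plain functions here; in a real system they would be REST calls over the network:

```python
# Illustrative stubs for two backend services and a frontend service that
# aggregates them. All names and payloads are hypothetical.

def catalog_service(category):
    """Stub for the Catalog service: which products are in a category."""
    return {"category": category, "product_ids": [101, 102]}

def product_service(product_id):
    """Stub for the Product service: details for one product."""
    return {"id": product_id, "name": "Product %d" % product_id}

def catalog_ui(category):
    """Frontend service: compose catalog and product data for one view."""
    catalog = catalog_service(category)
    products = [product_service(pid) for pid in catalog["product_ids"]]
    return {"category": catalog["category"], "products": products}

view = catalog_ui("books")
# view["products"] holds the details fetched per product id
```

Note that the frontend owns the aggregation; each backend service still owns exactly one concern.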

Benefits of Microservices

The microservices based architecture has numerous benefits and some of them can be found below:

  • The small size of microservice leads to ease of development and maintenance.
  • Multiple developers and teams can deliver relatively independently of each other.
  • Each microservice can be deployed independently of other services. If there is a change in a particular service then only that service need be deployed. Other services are unaffected. This in turn leads to easy transition to continuous deployment.
  • This architectural pattern leads to truly scalable systems. Each service can be scaled independently of other services.
  • The microservice architecture helps in troubleshooting. For example, a memory leak in one service only affects that service. Other services will continue to run normally.
  • Microservice architecture leads to a true polyglot model of development. Different services can deploy different technology stacks as per their needs. This gives a lot of flexibility to experiment with various tools and technologies. The small nature of these services makes the task of rewriting the service possible without much effort.

Drawbacks of Microservices

No technology is a silver bullet and microservice architecture is no exception. Let’s look at some challenges with this model.

  • There is the added complexity of dealing with a distributed system. Each service runs as a separate process, and an inter-process communication mechanism is required for message exchange amongst services. There are lots of remote procedure calls, REST APIs or messages gluing components together across different processes and servers. We have to consider challenges like network latency, fault tolerance, message serialization, unreliable networks, asynchronicity, versioning, varying loads within application tiers, etc.
  • Microservices are more likely to communicate with each other in an asynchronous manner. This helps in decomposing work into genuinely separate, independent tasks which can happen out of order at different times. Things start getting complex with the need to manage correlation IDs and distributed transactions to tie the various actions together.
  • Different test strategies are required, as asynchronous communication and dynamic message loads make systems much harder to test. Services have to be tested in isolation, and their interactions have to be tested as well.
  • Since there are many moving parts, the complexity is shifted towards orchestration. To manage this a PaaS-like technology such as Pivotal Cloud Foundry is required.
  • Careful planning is required for features spanning multiple services.
  • Substantial DevOps skills are required, as different services may use different tools. Developers also need to be involved in operations.
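Two of the cross-cutting concerns above, correlation IDs and unreliable remote calls, can be sketched in a few lines. The remote call is stubbed and all names are illustrative, not taken from any particular framework:

```python
# Sketch of two distributed-system concerns mentioned above: retrying a
# flaky remote call, and generating a correlation ID that downstream
# services would log and forward to tie one request's actions together.
import uuid

def call_with_retry(func, retries=3):
    """Retry a remote call a few times before giving up."""
    last_exc = None
    for _ in range(retries):
        try:
            return func()
        except ConnectionError as exc:  # transient network failure
            last_exc = exc
    raise last_exc

def handle_request():
    """Entry point at the edge: mint a correlation ID for this request."""
    correlation_id = str(uuid.uuid4())
    headers = {"X-Correlation-ID": correlation_id}
    # each downstream call would carry these headers along
    return headers

headers = handle_request()
```

In a real system you would add backoff between retries and make sure every service logs the correlation ID, so one user action can be traced across all the processes it touched.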


I am sure that, with all the aspects mentioned above, you must be wondering whether you should start splitting your legacy monolithic architectures into microservices, or indeed start any fresh system with this architectural style. The answer, as in most cases, is: “It depends!” Organizations like Netflix, Amazon and The Guardian have pioneered this architectural style and demonstrated that it can give tremendous benefits.

Success with microservices depends a lot on how well you componentize your system. If the components do not compose cleanly, then all you are doing is shifting complexity from inside a component to the connections between components. Not only that, it also moves complexity to a place that’s less explicit and harder to control.

The success of using microservices also depends on the complexity of the system you’re contemplating. The approach introduces its own set of complexities: continuous deployment, monitoring, dealing with failure, eventual consistency, and the other factors a distributed system brings. These complexities can be handled, but it takes extra effort. So, I would say that if you can keep your system simple, you may not need microservices. If your system risks becoming complex, then you can certainly think about microservices.

Log Management in the Cloud Age

In traditional systems, logs are lines of text intended for offline human consumption. With the advent of Cloud and Big Data, there is a paradigm shift in what can be logged. Systems can now log any piece of structured or unstructured data, application logs, transactions, audit logs, alarms, statistics or even tweets. Add to this the scale of logs. The earlier methodology of human analysis would not work in this kind of scenario. There has to be some automated mechanism for log analysis and deciphering useful information from them.

The trio of Logstash, Kibana and Elasticsearch is one of the most popular open source solutions for logs management. The three products together are known as the ELK stack and provide an elegant solution for log management.

Elasticsearch is a distributed, flexible, powerful, RESTful search and analytics engine based on Apache Lucene. It gives you the ability to move beyond simple full-text search. It organizes data into indices, which can easily be divided into shards (roughly equivalent to partitions in an RDBMS), and each shard can have zero or more replicas. This helps in providing near real-time search. Elasticsearch provides a robust set of APIs and a query DSL, in addition to clients for most popular programming languages.
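As a small illustration of the shard and replica settings, here is what an index-creation request body can look like. Sending it is shown commented out and assumes a local Elasticsearch node at `http://localhost:9200`; the index name is made up:

```python
# Sketch: request body for creating an index with explicit shard and
# replica counts. Only the JSON body is built here; actually sending it
# (commented out) would require a running Elasticsearch node.
import json

index_settings = {
    "settings": {
        "number_of_shards": 5,    # split the index into 5 shards
        "number_of_replicas": 1,  # keep one replica of each shard
    }
}

body = json.dumps(index_settings)
# import requests
# requests.put("http://localhost:9200/app-logs", data=body,
#              headers={"Content-Type": "application/json"})
```

With one replica per shard, each shard exists on two nodes, which is what gives the cluster both read capacity and resilience to a node failure.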

Elasticsearch was built from the ground up to handle any kind of data. It can slice and aggregate data on the fly, based on any field in the logs, turning raw logs into valuable insights.
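"Aggregate on the fly, based on any field" translates into the query DSL as an aggregation clause. A minimal sketch, assuming a hypothetical `level` field on the log documents:

```python
# Sketch of a query DSL body that buckets log events by a field on the
# fly -- here, counting documents per log level. The field name is an
# illustrative assumption about the indexed documents.
query = {
    "size": 0,  # we only want the aggregation buckets, not the hits
    "aggs": {
        "events_per_level": {
            "terms": {"field": "level.keyword"}
        }
    },
}
```

POSTed to an index's `_search` endpoint, this kind of body returns one bucket per distinct level with a document count, without any pre-computed rollups.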

Kibana is a data visualization engine used along with Elasticsearch. It lets you interact natively with all your data in Elasticsearch via custom dashboards. You can make dynamic, shareable and exportable dashboards. Data analysis becomes a breeze with Kibana’s elegant user interface, using pre-designed or custom dashboards in real time for on-the-fly data analysis. Kibana is easy to set up and integrates seamlessly with different log aggregators like Logstash, Apache Flume, etc. See below for a sample Kibana dashboard:


Logstash is one of the most popular open source log and event shippers/processors. It takes logs, processed events and other time-based events from any system as input and stores the data in a single place for additional processing. It scrubs logs and parses all data sources into an easy-to-read JSON format. This means that your logging data can now be analyzed in real time. You can then use Kibana to explore and monitor the analytics. The Logstash-Elasticsearch-Kibana pipeline is illustrated below:


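To give a feel for that pipeline, here is a minimal Logstash configuration sketch: read a log file, parse each line into JSON fields with the grok filter, and ship the result to Elasticsearch. The file path, log format and host are illustrative assumptions:

```conf
# Sketch of a minimal Logstash pipeline (input -> filter -> output).
input {
  file { path => "/var/log/app/app.log" }
}
filter {
  # Parse "2016-01-15T10:00:00 ERROR something broke" style lines
  # into structured timestamp/level/msg fields.
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
```

Once events flow through a pipeline like this, the `level` and `msg` fields become queryable and aggregatable in Elasticsearch, and visualizable in Kibana.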
The ELK stack is a very powerful tool for monitoring and analytics of cloud-scale logs.