Modularity in Java 9

One of the distinctive features of any good software engineering practice is modularity. Before Java 9, developers had to take care of modularity themselves with a limited tool set. One of the key features of Java 9 is modularity, and it is going to be a game changer in how we develop software. Modular programming is essentially a way of implementing a program as a number of individual components, each with its own distinct functionality. These individual components are known as modules. The central idea is to avoid monolithic design and break a complex system into manageable parts.

Challenges till Java 8

A thought often comes up that maybe the idea of modularity is over-hyped. Do we really need modularity in Java? Before going into the details of the various issues faced due to lack of modularity, does "JAR hell" ring a bell? Arbitrarily complex JAR loading mechanisms lead to a situation described as classpath hell or JAR hell, and there are a lot of configurations which can lead to it. The challenges with Java 8 and earlier systems are as follows:

  • Many a time, the classes and interfaces contained in one JAR file depend on classes in another JAR file to function appropriately. As a result, one JAR file may be dependent on another to execute. The Java runtime simply loads JARs in the order in which they appear on the classpath, irrespective of the fact that a copy of the same class may exist in multiple JARs.
  • There are instances where some classes or interfaces are missing altogether. This is only discovered during execution; most of the time, it causes the application to crash with a runtime error message.
  • A very common and frustrating problem is version mismatch. A JAR file dependent on another JAR file may stop working because one or more of its dependencies has been upgraded or downgraded to an incompatible version.
  • The large size of the JDK makes it tough to scale down to small devices. Although Java 8 introduced three different profiles – compact1, compact2 and compact3 – the problem could not be fully resolved. Moreover, the large size of the JRE makes it cumbersome to test and maintain applications.
  • The Java ecosystem lacks strong encapsulation, as the "public" access modifier is too open: anyone can access a public type, and even internal APIs can be reached.

The characteristics of a modular system are as follows:

  • Module Id and Version – Consistent and unique identity
  • Loose Coupling – Autonomous unit of deployment
  • Communication Contract – Open and Understandable interface
  • Encapsulation – Hidden implementation details

Java 9 Modular System

There is an argument, with some merit, that Java 8 and earlier releases were also modular in some sense. The object-oriented nature of Java ensures basic modularity, but it has limitations: no versioning of interfaces, non-uniqueness of classes at the deployment level, and no strict enforcement of loose coupling. Packages add a level of abstraction to the Java programming environment; their key benefit is unique coding namespaces and configuration contexts. But these package conventions are conveniently bypassed, very often leading to an environment of dangerous compile-time couplings. JARs are a good attempt at modularization, but they don't fulfill all the requirements for a truly modular environment.

Java 9 has modular components and segments throughout the entire JDK. The key features supporting modularization are:

  • Modular Source Code – The JRE and JDK are organized into interoperable modules, which enables the creation of scalable runtimes that can be executed on small devices.
  • Segmented Code Cache – The new code cache makes intelligent decisions to compile frequently accessed code segments to native code and store them for optimized lookup and future execution. The code cache is segmented into three distinct units:
    • Non-method code that will be stored permanently in the cache
    • Code that has a potentially long lifecycle (non-profiled code)
    • Transient code (profiled code)
  • Deployment facilities – Tools are provided to support module boundaries, constraints, and dependencies at deployment time

Module Types

The different types of modules are listed below with their descriptions:

  • Application Modules – Modules created to implement functionality. All third-party dependencies also fall in this category.
  • Automatic Modules – JARs placed on the module path without a module descriptor become automatic modules. They implicitly export all their packages and read all other modules. The main benefit of these modules is to allow the use of JARs built before Java 9.
  • Unnamed Modules – Any JAR or class on the class path ends up in the unnamed module. Since it does not have a name, it can read all other modules, and it exports all of its packages.
  • Platform Modules – The JDK itself has been transformed into a modular structure, and these modules are known as platform modules, e.g., java.se and java.xml.ws.

Declaring a Jar file as a module

In order to declare a JAR file as a named module, one needs to provide a module-info.class file, which is obtained by compiling a module-info.java file. The role of this file is to declare dependencies within the module system and to allow the compiler and the runtime to govern the boundaries and access violations between the modules. Some of the module descriptor directives are described below, followed by a small example:

  • module module.name – declares a module called module.name.
  • requires module.name – specifies that our module depends on the module module.name and allows this module to access public types exported by the target module.
  • requires transitive module.name – any module that depends on this module automatically depends on module.name.
  • exports pkg.name – says that our module exports public members in package pkg.name for every module requiring this one.
  • exports pkg.name to module.name – the same as above, but limits which modules can use the public members from the package pkg.name.
  • uses class.name – makes the current module a consumer of the service class.name.
  • provides class.name with class.name.impl – registers the class class.name.impl as a service that provides an implementation of the class.name service.
  • opens pkg.name – allows other modules to use reflection to access the private members of package pkg.name.
  • opens pkg.name to module.name – does the same, but limits which modules have reflective access to the private members of pkg.name.
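Putting a few of these directives together, a minimal module-info.java might look like the following sketch (all module and package names here are purely illustrative):

module com.example.inventory {
	// this module depends on a persistence module
	requires com.example.persistence;
	// any module that requires inventory also reads the shared API module
	requires transitive com.example.api;
	// only this package is visible to the outside world
	exports com.example.inventory.service;
	// allow reflective access (e.g., for serialization) to this package
	opens com.example.inventory.model;
}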

All the popular IDEs support the module-info.java syntax.

Summary

Prior to Java 9, JAR files were the closest one could get to modules. There were pain points like "JAR hell" which made the development process quite frustrating. Java 9 has tried to address these primarily through project Jigsaw. Both the JRE and the JDK have been made modular without breaking existing systems. The modular nature of Java 9 should give the necessary boost for creating interesting systems.


Java 8 Lambda Expression for Design Patterns – Iterator Design Pattern

The iterator pattern allows for access and traversal of an aggregate object (list, map, etc.) without putting that overhead on the aggregate object. An iterator object is responsible for keeping track of the current element, i.e., it knows which elements have been traversed already, and it provides the mechanism for accessing the aggregate's elements.

In Java, the relationship between List and ListIterator interfaces can be depicted as follows:

Iterator Design Pattern
  • List – An ordered collection which gives precise control over where in the list each element is inserted. Elements can be accessed by their integer index (position in the list) and searched for in the list.
  • ListIterator – Facilitates traversal of the list in either direction, modification of the list during traversal, and obtaining the iterator's current position in the list. A ListIterator has no current element; its cursor position always lies between the element that would be returned by a call to previous() and the element that would be returned by a call to next().
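As a quick illustration of the cursor behaviour described above, here is a small self-contained sketch using ListIterator (the list contents are arbitrary):

import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

public class ListIteratorDemo {

	public static void main(String[] args) {
		List<String> names = Arrays.asList("Alice", "Bob", "Carol");
		ListIterator<String> iterator = names.listIterator();

		// forward traversal: the cursor moves past each returned element
		while (iterator.hasNext()) {
			System.out.println("next: " + iterator.next());
		}

		// backward traversal: the cursor is now at the end of the list
		while (iterator.hasPrevious()) {
			System.out.println("previous: " + iterator.previous());
		}
	}
}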

Now let’s look at a concrete example of a list of numbers. The most common way to print numbers in a list is to use the good old “For Loop”.

for (int index = 0; index < myNumbers.size(); index++)
    System.out.println(myNumbers.get(index));

or

for (int element : myNumbers)
    System.out.println(element);

Both the above-mentioned mechanisms work perfectly. However, with the introduction of lambdas, you can minimize the code size further. You can use internal iterators in the following manner:

myNumbers.forEach(element->System.out.println(element));

You see how the code got reduced. It can be reduced even further:

myNumbers.forEach(System.out::println);

The use of lambda functions while iterating over a list simplifies the code, as the focus is on what to do with each element rather than on the mechanics of iterating over the list.


Java 9 Features

Just as Java 8 is known as the major release of lambdas, streams and API changes, Java 9 is all about project Jigsaw, utilities and changes under the hood. In this post, I would like to talk about some of the most exciting features targeted for Java 9. The full list of new features is available here.

Modular System – Jigsaw Project

This is the biggest addition to the JDK and brings modularity to the Java platform. A growing codebase quite often leads to complicated, tangled "spaghetti code". It is hard to encapsulate code, and there are no clear dependencies between the different parts (JAR files) of a system. Any public class can access any other public class on the classpath, which can lead to inadvertent usage of classes that were never supposed to be public API. Moreover, the classpath itself is not an elegant way of verifying whether the required JARs are present or whether there are any duplicate entries. These challenges are handled very well by the module system.

The capabilities of a modular system are quite similar to those of the OSGi framework. Modules have an inherent concept of dependencies, can expose a public API and keep implementation details hidden/private. One of the biggest motivations is to provide a modular JVM with a smaller memory footprint that can run on small devices. The JVM can then run with only those modules and APIs which are essential.

Modular JAR files contain an additional module descriptor. In this module descriptor, dependencies on other modules are expressed through "requires" statements. Additionally, "exports" statements control which packages are accessible to other modules. All packages which are not exported are encapsulated in the module by default. Let's see an example of a module descriptor, which lives in `module-info.java`:

module com.thistechnologylife.java9.modules.house {
    requires com.thistechnologylife.java9.modules.furniture;
    exports com.thistechnologylife.java9.modules.lawn;
}

The module house requires module furniture and exports a package for lawn.

JShell – The Interactive Java REPL

Java 9 comes with a new tool called "jshell", which stands for Java Shell and is also known as a REPL (Read Evaluate Print Loop). It can be used to execute and test any Java construct, such as a class, interface, enum, object or statement, very easily without wrapping it in a separate method or project.

JShell can be launched directly from the console and you can start typing and executing Java code. One great example of jshell is to test regular expressions.

The jshell executable itself can be found in <JAVA_HOME>/bin folder:

jdk-9\bin>jshell.exe

| Welcome to JShell -- Version 9

| For an introduction type: /help intro

jshell> "Say Hello To Java 9 Features".substring(13,19);

$5 ==> "Java 9"

Improved Network Communication with HTTP/2.0 Support

Java 9 comes with a new way of performing HTTP-based communication. It provides a long-awaited replacement for the old HttpURLConnection and supports both WebSockets and HTTP/2. In Java 9 the new API ships as an incubating module, under the jdk.incubator.http package (it was standardized as java.net.http in a later release).

HttpClient client = HttpClient.newHttpClient();

HttpRequest request = HttpRequest.newBuilder(URI.create("http://www.amazon.com"))
		.header("User-Agent", "Java")
		.GET()
		.build();

HttpResponse<String> response = client.send(request, HttpResponse.BodyHandler.asString());

HttpClient also provides new APIs to deal with HTTP/2 features such as streams and server push.

Enhanced Process API

Before Java 9, the Process API had only a limited capability to control and manage operating-system processes. Even getting a process PID required either native code or a tricky workaround, and you would need a different implementation for each platform.

In Java 9, the process API has been enhanced for controlling and managing operating-system processes. Most of the new functionality is present in the class java.lang.ProcessHandle. Process specific information can be obtained in the following manner:

ProcessHandle procHandle = ProcessHandle.current();
long pid = procHandle.pid();

ProcessHandle.Info processInfo = procHandle.info();
Optional<String[]> args = processInfo.arguments();
Optional<String> cmd = processInfo.commandLine();
Optional<Instant> startTime = processInfo.startInstant();
Optional<Duration> cpuUsage = processInfo.totalCpuDuration();

As the example illustrates, the current method returns an object representing a process of currently running JVM. The Info subclass provides details about the process.

Now let's see how we can stop all child processes using the destroy method:

Stream<ProcessHandle> childProcesses = ProcessHandle.current().children();

childProcesses.forEach(procHandle -> {
	assertTrue("Could not kill process " + procHandle.pid(), procHandle.destroy());
});

Stream API improvements

Java 9 brings significant improvements to the Stream API, which packs a punch in the creation of declarative pipelines of transformations on collections. Four additions have been made to the java.util.stream.Stream interface: the new methods dropWhile, takeWhile and ofNullable, plus a new overload of iterate which takes a Predicate indicating when to stop iterating.

The takeWhile method takes a predicate as an argument and returns a Stream of the given Stream's values up to the point where the Predicate returns false for the first time. If the first value does not satisfy the Predicate, it simply returns an empty Stream.
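A minimal sketch of takeWhile and the new iterate overload (the values are purely illustrative):

import java.util.stream.Stream;

public class StreamAdditionsDemo {

	public static void main(String[] args) {
		// takeWhile stops at the first failing element: prints 1, 2, 3 and
		// also drops the trailing 4, because 10 already broke the condition
		Stream.of(1, 2, 3, 10, 4)
				.takeWhile(n -> n < 5)
				.forEach(System.out::println);

		// the new iterate overload stops once the predicate fails:
		// prints 1, 2, 4, 8, 16, 32, 64
		Stream.iterate(1, n -> n < 100, n -> n * 2)
				.forEach(System.out::println);
	}
}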

In this post I have tried to give a glimpse of some of the newly introduced features in Java 9. Trust me, we have just scratched the surface; there are many more useful and diverse features that will be available with Java 9. I am quite excited about this. What about you?


Evolution of Java Security Architecture

The Java platform has had a strong emphasis on security from the very beginning. Its multi-platform support has made people sceptical about the way Java handles security aspects and whether there are hidden risks. Java is a type-safe language and provides automatic garbage collection. The use of a secure class loading and verification mechanism allows only legitimate Java code to be executed. The initial version of the Java platform focused specifically on providing a safe environment for running potentially untrusted code, such as Java applets. With the growth of the platform and its widespread deployment, the Java security architecture has evolved to support an expanding set of services. Over the years, the architecture has grown to include a large set of APIs, tools and implementations of security algorithms, mechanisms and protocols. Java security revolves around two key aspects:

  • Secure platform to run applications on.
  • Tools and services available in the Java programming language that facilitate a range of security-sensitive applications

Language Level Security

There are multiple mechanisms inbuilt within the Java language to ensure secure programming:

  • Strictly Object-Oriented – All security-related advantages of the object-oriented paradigm can be availed of, as there can be no structures outside the boundary of a class.
  • Final Classes and Methods – Undesired modification of functionality can be prevented by declaring classes and methods final.
  • Automated Memory Management – There are fewer chances of errors related to memory access, as the Java language does all the memory management automatically.
  • Automatic Initialization – All memory constructs on the heap are automatically initialized, while stack-based local variables must be assigned before use. This ensures that classes and instances are never left with undefined values.

Basic Security Architecture

There are a set of APIs spanning all the major security areas like cryptography, public key infrastructure, authentication, secure communication and access control. These APIs help in easy integration of security in application code. They are designed around the following principles:

  • Implementation Independence – Applications need not worry about the finer details of security and can request security services from the Java platform. These security services are implemented in providers, which are plugged into the Java platform via a standard interface. An application may leverage multiple independent providers for security.
  • Implementation Interoperability – By their very nature, providers are interoperable across applications. Just like an application is not bound to a specific provider, similarly a provider is not bound to any specific application.
  • Algorithm Extensibility – There are a number of built-in providers that implement a basic set of security services. There is a provision for installing custom providers which address specific needs of applications.

Security Providers

The concept of a security provider is encapsulated by the java.security.Provider class. It specifies the provider's name and lists the security services it supports. Multiple providers may be configured at the same time, listed in order of preference. When a security service is requested, the provider with the highest priority is selected. Message digest creation is one type of service available from providers. An application can invoke the getInstance method in the java.security.MessageDigest class to obtain an implementation of a specific message digest algorithm like MD5.

MessageDigest md = MessageDigest.getInstance("MD5");

Optionally, the provider name can be specified while requesting an implementation.

MessageDigest md = MessageDigest.getInstance("MD5", "ProviderA");

The following figure illustrates these options for requesting an MD5 message digest implementation. Both instances depict three providers that implement the message digest algorithm, ordered by preference from left to right. In the first instance, the application requests the MD5 algorithm without specifying a provider name. The providers are searched in preference order, and the implementation from the first provider supplying that particular algorithm, ProviderB, is returned. In the second instance, the application requests the MD5 implementation from a specific provider, ProviderC. In this case, the implementation from that provider is returned, even though a provider with a higher preference order, ProviderB, also supplies an MD5 implementation.

Figure 1 – Provider Selection

Cryptography

Java provides a cryptography framework for accessing and developing cryptographic functionality for the Java platform. It includes APIs for a large variety of cryptographic services like Message digest algorithms, Digital signature algorithms, Symmetric stream encryption, Asymmetric encryption, etc. The cryptographic interfaces are provider-based which allows for multiple and interoperable cryptography implementations.

Public Key Infrastructure

The Public Key Infrastructure (PKI) framework enables secure exchange of information based on public key cryptography. It allows identities (of people, organizations, etc.) to be bound to digital certificates and provides a means of verifying the authenticity of certificates. PKI comprises keys, certificates, public key encryption, and trusted Certification Authorities (CAs) who generate and digitally sign certificates.

Authentication

Authentication helps in identifying the user of an executing Java program. The Java platform provides APIs to perform user authentication via pluggable login modules.

Secure Communication

It is quite important to ensure that the data that travels across a network is not accessed by an unintended recipient. In case the data includes private information, like passwords and credit card numbers, steps must be taken to make the data unintelligible to unauthorized parties. The Java platform provides API support and provider implementations for a number of standard secure communication protocols like SSL/TLS, SASL, GSS-API and Kerberos.

Access Control

The access control architecture protects access to sensitive resources (for example, local files) or sensitive application code (for example, methods in a class). All access control decisions are mediated by a security manager, represented by the java.lang.SecurityManager class.  A  SecurityManager must be installed into the Java runtime in order to activate the access control checks.

Enhancements in JDK 8

JDK 8 has a wide range of security-related enhancements and features. Some of them are listed below:

TLS 1.1 and TLS 1.2 enabled by default

TLS 1.1 and TLS 1.2 are enabled by default on the client side by the SunJSSE provider. The set of enabled protocols can be controlled using the new system property jdk.tls.client.protocols.
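As an illustration, the property can also be set programmatically, provided this happens before the JSSE stack is first used; a minimal sketch:

// restrict the client side to TLS 1.1 and 1.2; must run before the first handshake
System.setProperty("jdk.tls.client.protocols", "TLSv1.1,TLSv1.2");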

Limited doPrivileged

A new version of the method AccessController.doPrivileged can be used to assert a subset of its privileges, without preventing the full traversal of the stack to check for other permissions.
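A minimal sketch of this overload, assuming we only want to assert permission to read a single system property (the property chosen is just an example):

import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.PropertyPermission;

public class LimitedDoPrivilegedDemo {

	public static void main(String[] args) {
		// assert only the permission to read user.home, rather than the
		// full set of privileges of this protection domain
		String home = AccessController.doPrivileged(
				(PrivilegedAction<String>) () -> System.getProperty("user.home"),
				null, // a null context applies no additional restriction
				new PropertyPermission("user.home", "read"));
		System.out.println(home);
	}
}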

Stronger Algorithms for Password-Based Encryption

Many AES-based Password-Based Encryption (PBE) algorithms, such as PBEWithSHA256AndAES_128 and PBEWithSHA512AndAES_256, have been added to the SunJCE provider.

SSL/TLS Server Name Indication (SNI) Extension Support in JSSE Server

The SNI extension extends the SSL/TLS protocols to indicate what server name the client is attempting to connect to during handshaking. Servers can use the server name indication to decide whether specific SSLSocket or SSLEngine instances should accept a connection. JDK 7 already enabled the SNI extension by default for client applications; JDK 8 adds support for the SNI extension in server applications.

KeyStore Enhancements

A new keytool command option, -importpassword, accepts a password and stores it securely as a secret key. A new class, java.security.DomainLoadStoreParameter, has been added to support the DKS keystore type.

SHA-224 Message Digests

JDK 8 has enhanced cryptographic algorithms with the SHA-224 variant of the SHA-2 family of message-digest implementations.

Improved Support for High Entropy Random Number Generation

The SecureRandom class helps in generating cryptographically strong random numbers used for private or public keys, ciphers, signed messages, and so on.
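JDK 8 also adds the SecureRandom.getInstanceStrong method, which returns an instance backed by the strongest algorithms configured on the platform; a minimal sketch:

import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class StrongRandomDemo {

	public static void main(String[] args) throws NoSuchAlgorithmException {
		// obtain a SecureRandom from the strongest configured source;
		// note that this instance may block while the OS gathers entropy
		SecureRandom random = SecureRandom.getInstanceStrong();
		byte[] seed = new byte[32];
		random.nextBytes(seed);
		System.out.println("Generated " + seed.length + " random bytes");
	}
}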

64-bit PKCS11 for Windows

The PKCS 11 provider support for Windows has been expanded to include 64-bit.

Weak Encryption Disabled by Default

The DES-related Kerberos 5 encryption types are not supported by default. These encryption types can be enabled by adding allow_weak_crypto=true in the krb5.conf file, but DES-related encryption types are considered highly insecure and should be avoided.

Unbound SASL for the GSS-API/Kerberos 5 mechanism

The Krb5LoginModule principal value in a JAAS configuration file can be set to an asterisk (*) on the acceptor side to denote an unbound acceptor. This means that the initiator can access the server using any service principal name, provided the acceptor has the long-term secret keys to that service. The name can be retrieved by the acceptor using the GSSContext.getTargName() method after the context is established.

SASL service for multiple host names

While creating a SASL server, the server name can be set to null to denote an unbound server, which means a client can request for the service using any server name.

Java 8 has taken a definitive step in making Java more secure by introducing these new security enhancements.


Build Data Insights and Business Metrics with ELK Stack

With the world around us getting more and more connected, there is an advent of different types of computing devices. It could be a heavy-duty server, laptop, desktop, mobile phone or even your refrigerator. One unique thread that connects all these devices is their logging of system information. These logs are nothing but a stream of messages in time sequence. Systems can now log any piece of structured or unstructured data: application logs, transactions, audit logs, alarms, statistics or even tweets. Add to this the sheer scale of logs. The earlier methodology of human analysis would not work in this kind of scenario; there has to be some automated mechanism for analysing logs and deciphering useful information from them.

The trio of Logstash, Kibana and Elasticsearch is one of the most popular open source solutions for logs management. The three products together are known as the ELK stack and provide an elegant solution for log management. At the heart of ELK stack is Elasticsearch which is a distributed, open source search and analytic engine. It is based on Apache Lucene and is designed for horizontal scalability, reliability, and easy management. Logstash is a data collection, enrichment, and transportation pipeline. The ELK stack is completed by Kibana, which is a data visualization platform enabling interaction with data through stunning, powerful graphics.

In order to start your discovery of the ELK stack, check out my book, Applied ELK Stack: Data Insights and Business Metrics with Collective Capability of ElasticSearch, Logstash and Kibana. With this book you will discover:

  • The need for log analytics, and current challenges
  • How to perform real-time data analytics on streaming data, and turn them into actionable insights
  • How to create indexes and delete data
  • The different components of ELK (Elasticsearch, Logstash, and Kibana) stack
  • Shipping, Filtering, and Parsing Events with Logstash
  • How to build amazing visualizations and dashboards using Data Discovery, Visualization, and Dashboard with Kibana

I hope this book is able to help you with log management along with providing business insights. Do let me know your valuable feedback on the book.


Microservices Asynchronous Communication

Communication among microservices can be either synchronous or asynchronous. Each mode has its own pros and cons. In this post, I would like to share the mechanism of asynchronous communication. As an example, let's consider MyKart, an online shopping portal. Just like on a typical e-commerce site, customers (users) can log in, browse through product categories, select products and order after paying online. The system follows a microservice architecture, and the application is built on top of the Java stack with a Node.js/JavaScript-based UI.

Let's see how RabbitMQ can be used for asynchronous communication amongst the microservices.

Service Interaction

The interaction amongst the different services is depicted below:

SequenceDiagram

A typical use case for ordering products is:

  1. Customer (user) logs into the MyKart portal. This is facilitated by the Order service. For existing customers, the details are fetched from Customer service. For new users, a new customer account is created.
  2. Customer can scan through all the different categories of products. The Order service queries the Catalog service to get the details of all the products.
  3. Customer selects the products which he/she wants to purchase. The Billing service is used for online transaction. A payment gateway is used to pay for the purchased products.
  4. The Order service maintains records of purchased products and notifies the Shipment service of the transaction. The Shipment service further informs the logistics team to prepare for shipping of the purchased products.

Service Interaction and Communication

In order to help with scalability, the bulk of the interaction amongst the services would be through asynchronous communication, facilitated by the RabbitMQ messaging system. It would help to mention here the key concepts of RabbitMQ.

RabbitMQ Fundamentals

The three parts that route messages in RabbitMQ are exchanges, queues, and bindings. Applications direct messages to exchanges. Exchanges route those messages to queues based on bindings. Bindings tie queues to exchanges and define the key (or topic) with which they bind.

Exchange

Besides the exchange type, exchanges are declared with a number of attributes, the most important of which are:

  • Name
  • Durability (exchanges survive broker restart)
  • Auto-delete (exchange is deleted when all queues have finished using it)
  • Arguments (these are broker-dependent)

Exchanges can be durable or transient. Durable exchanges survive broker restart whereas transient exchanges do not. The different types of exchanges are as follows (a declaration sketch appears after this list):

  • Default Exchange – The default exchange is a direct exchange with no name (empty string) pre-declared by the broker. It has one special property that makes it very useful for simple applications: every queue that is created is automatically bound to it with a routing key which is the same as the queue name.
  • Direct Exchange – A direct exchange delivers messages to queues based on the message routing key. A direct exchange is ideal for the unicast routing of messages.
  • Fanout Exchange – A fanout exchange routes messages to all of the queues that are bound to it and the routing key is ignored. If N queues are bound to a fanout exchange, when a new message is published to that exchange a copy of the message is delivered to all N queues. Fanout exchanges are ideal for the broadcast routing of messages.
  • Topic Exchange – Topic exchanges route messages to one or many queues based on matching between a message routing key and the pattern that was used to bind a queue to an exchange. The topic exchange type is often used to implement various publish/subscribe pattern variations. Topic exchanges are commonly used for the multicast routing of messages.
  • Headers Exchange – A headers exchange is designed for routing on multiple attributes that are more easily expressed as message headers than a routing key. Headers exchanges ignore the routing key attribute. Instead, the attributes used for routing are taken from the headers attribute. A message is considered matching if the value of the header equals the value specified upon binding.
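To make this concrete, here is a minimal sketch using the RabbitMQ Java client that declares one durable exchange of each of the common types (the exchange names are illustrative):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class ExchangeSetup {

	public static void main(String[] args) throws Exception {
		ConnectionFactory factory = new ConnectionFactory();
		factory.setHost("localhost");
		Connection connection = factory.newConnection();
		Channel channel = connection.createChannel();

		// durable direct exchange: unicast routing on an exact routing key
		channel.exchangeDeclare("orders.direct", "direct", true);
		// durable fanout exchange: broadcast to every bound queue
		channel.exchangeDeclare("notifications.fanout", "fanout", true);
		// durable topic exchange: pattern-based publish/subscribe routing
		channel.exchangeDeclare("events.topic", "topic", true);

		channel.close();
		connection.close();
	}
}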

Queues

Queues store messages that are consumed by applications. The key properties of queues are as follows:

  • Queue Name – Applications may pick queue names or ask the broker to generate a name for them.
  • Queue Durability – Durable queues are persisted to disk and thus survive broker restarts. Queues that are not durable are called transient. Not all scenarios and use cases mandate queues to be durable.
  • Bindings – These are rules that exchanges use (among other things) to route messages to queues. To instruct an exchange E to route messages to a queue Q, Q has to be bound to E. Bindings may have an optional routing key attribute used by some exchange types.

Message Attributes

Messages have a payload and attributes. Some of the key attributes are listed below:

  • Content type
  • Content encoding
  • Routing key
  • Delivery mode (persistent or not)
  • Message priority
  • Message publishing timestamp
  • Expiration period
  • Publisher application id

Message Acknowledgements – We know that networks can be unreliable and this might lead to failure of applications. So it is often necessary to have some kind of processing acknowledgement. In some cases it is only necessary to acknowledge the fact that a message has been received. In other cases acknowledgements mean that a message was validated and processed by a consumer.

Service Communication

All the services of MyKart would utilize RabbitMQ infrastructure for communicating with each other. This is depicted in the following diagram:

RabbitMQ-Comm

When different services want to implement request-response communication using RabbitMQ, there are two key patterns:

  • Exclusive reply queue per request – In this case each request creates its own reply queue. The benefit is that it is simple to implement, and there is no problem correlating the response with the request, since each request has its own response consumer. If the connection between the client and the broker fails before a response is received, the broker will dispose of any remaining reply queues and the response message will be lost. A key consideration is that we need to clean up any reply queues in the event that a problem with the server means it never publishes the response. This pattern also has a performance cost, because a new queue and consumer have to be created for each request.
  • Exclusive reply queue per client – In this case each client connection maintains a reply queue which many requests can share. This avoids the performance cost of creating a queue and consumer per request, but adds the overhead that the client needs to keep track of the reply queue and match up responses with their respective requests. The standard way of doing this is with a correlation id that is copied by the server from the request to the response.

Durable reply queue – In both the above patterns, the response message can be lost if the connection between the client and broker goes down while the response is in flight. This is because they use exclusive queues that are deleted by the broker when the connection that owns them is closed. This can be avoided by using a non-exclusive reply queue, but that creates some management overhead: one needs some way to name the reply queue and associate it with a particular client. The problem is that it is difficult for the client to know whether any one reply queue belongs to itself or to another instance, so it is easy to naively create a situation where responses are delivered to the wrong instance of the client. In the worst case, we would have to manually create and name response queues, which removes one of the main benefits of choosing broker-based messaging in the first place.

Rather than using an exclusive reply queue per request, I would recommend using an exclusive reply queue per client. Each service would have its own exchange of type direct where other services send request events. For response events, the default exchange is leveraged: each service has a response queue, bound to the default exchange, corresponding to each service from which a response is expected.

Request – Response Event
The request and response messages would be sent as an event structure. They would contain the following information:

  1. Correlation Id
    In order to correlate a response event with its request event, a unique request correlation id would be used. This helps in uniquely identifying a request and tying it up with its response. One way of generating a unique id is to leverage the UUID class of Java. See the code snippet below:
    UUID.randomUUID().toString();
    There can be other mechanisms for unique id generation, but in my view this is the most suitable and user-friendly.
  2. Payload
    The request information and response details would be serialized as a JSON string. While sending a request, the request object details would be converted into JSON format. Similarly, while receiving a response, the JSON information would be used to populate the actual response object. The third-party Jackson utility would be leveraged for object-to-JSON conversion and vice versa (see the sketch after this list).
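A minimal sketch of the Jackson round trip (CustomerRequest is a hypothetical payload class):

import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonRoundTrip {

	public static void main(String[] args) throws Exception {
		ObjectMapper mapper = new ObjectMapper();

		// serialize the request object to a JSON string before publishing
		CustomerRequest request = new CustomerRequest("CUST-42");
		String json = mapper.writeValueAsString(request);

		// on the receiving side, rebuild the object from the JSON payload
		CustomerRequest received = mapper.readValue(json, CustomerRequest.class);
		System.out.println(received.getCustomerId());
	}

	// hypothetical payload class; Jackson needs a no-arg constructor and accessors
	public static class CustomerRequest {
		private String customerId;

		public CustomerRequest() { }

		public CustomerRequest(String customerId) {
			this.customerId = customerId;
		}

		public String getCustomerId() {
			return customerId;
		}

		public void setCustomerId(String customerId) {
			this.customerId = customerId;
		}
	}
}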

Service Interaction
Let us take an example to see how the different services would leverage the messaging infrastructure for interaction. Let us take the use case when a customer logs in to the Order service and it then fetches customer information.

  1. The Order service needs to get customer information, like name, address, etc., and for this it has to query the Customer service.
  2. The request object contains the customer id. It is converted into JSON format using Jackson.
  3. The Java UUID class is used to generate the correlation id.
  4. The message is sent to the Customer exchange with customer.read as the routing key. The CustomerReadQueue is bound to the Customer exchange with this key, so the message gets delivered to CustomerReadQueue.
  5. The reply_to field is used to indicate on which particular queue the response event should be sent. In this case, the reply_to field is filled with the value CustomerResponseQueue.
  6. Once the Customer service is able to fetch the desired information, it converts it into JSON format. It then populates the Correlation Id received earlier. The response event is then sent to CustomerResponseQueue, which is bound to the default exchange.
  7. The Order service waits for response messages on CustomerResponseQueue. Once it receives an event, it uses the JSON payload to populate the Response object. The Correlation Id is used to map it to the request object sent earlier (see the sketch after this list).
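A minimal sketch of the request side of this flow with the RabbitMQ Java client (exchange, queue and routing-key names follow the steps above, and the JSON payload is assumed to have been produced with Jackson as shown earlier):

import java.util.UUID;

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class CustomerReadRequest {

	public static void main(String[] args) throws Exception {
		ConnectionFactory factory = new ConnectionFactory();
		factory.setHost("localhost");
		Connection connection = factory.newConnection();
		Channel channel = connection.createChannel();

		String json = "{\"customerId\":\"CUST-42\"}"; // payload produced by Jackson

		// the correlation id ties the eventual response back to this request;
		// reply_to tells the Customer service where to publish the response
		AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
				.correlationId(UUID.randomUUID().toString())
				.replyTo("CustomerResponseQueue")
				.build();

		channel.basicPublish("Customer", "customer.read", props, json.getBytes("UTF-8"));

		channel.close();
		connection.close();
	}
}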

Service Notification
Besides the regular request-response interaction, the messaging infrastructure can also be used for one-way notifications. Any good application should always log important information and also raise metrics/alarms for the same. All the services of MyKart would send important notifications to the Logging service and the Metrics service. The Logging service would use these events to log information in the logging infrastructure. The Metrics service would use this information to raise metrics and alarms. These metrics are helpful in the serviceability of microservices.

For MyKart, there is a fanout exchange by the name of Notification. It sends the received information to all the registered queues. Both the Logging service and the Metrics service have their queues bound to this exchange: events for the Logging service are received on LoggingQueue, and events for the Metrics service are received on MetricsQueue.

Conclusion

As has been demonstrated in the shopping cart example, RabbitMQ can be used effectively for asynchronous communication among different services.


How to Create a Blog

Today I would like to share my discovery of an interesting blog. It goes by the name of Simple Programmer and its founder is John Sonmez. This blog is a little different from your regular tech blog, and that adds to its ingenuity. It offers a fresh perspective on how programmers can live fuller lives and improve their careers. The content at Simple Programmer is holistic in nature, covering a diverse range of topics like developing your people skills, getting in shape and tackling the mental aspects of being a software developer.

John also offers a free seven part course titled “How to Create a Blog“. I subscribed to this course and received valuable insights on how to start a career boosting blog. The key points of this course are:

  • How to start a blog with a good domain name and hosting server
  • How to select your niche
  • How to keep a backlog of topics
  • How to be consistent with your blogging activities

Before I went through this course, I had a blog on WordPress, which is essentially a free service. I realized that it is important to have my own domain name. I used Bluehost to register my domain name and also host my blog. Following John's advice, I now have a backlog of ideas, and I have promised myself to be consistent.

Go check out this awesome course.


Mining Mailboxes with Elasticsearch and Kibana

In a previous post I mentioned that the trio of Logstash, Kibana and Elasticsearch (the ELK stack) is one of the most popular open source solutions for not only log management but also data analysis. In this post I will demonstrate how ELK can be used to effectively and efficiently perform big data analysis. As a reference, let's take some huge mailbox data. Mail archives are arguably one of the most interesting kinds of social web data. They are omnipresent, and each message throws light on the communication people are having. As the CXO of an organization, you may want to analyze corporate mails for trends and patterns.

As a reference, I will take the well-known Enron corpus, as it is a huge collection of mails and there is no risk of any legal or privacy concerns. This data will be standardized into the Unix mailbox (mbox) format, and from the mbox format it will be transformed into a single JSON file.

Getting the Enron corpus data

The full Enron dataset in raw form is available for download in various formats. I will start with the original raw form of the data set, which is essentially a set of folders that organizes a collection of mailboxes by person and folder. The following snippet illustrates the basic structure of the corpus after you have downloaded and unarchived it. Go ahead and play with it a little bit so that you become familiar with it.


C:\> cd enron_mail_20110402\maildir                 # Go into the mail directory

C:\enron_mail_20110402\maildir> dir                 # Show folders/files in the current directory
allen-p        crandell-s     gay-r          horton-s
lokey-t        nemec-g        rogers-b       slinger-r
tycholiz-b     arnold-j       cuilla-m       geaccone-t
               ...directory listing truncated...
neal-s         rodrique-r     skilling-j     townsend-j

C:\enron_mail_20110402\maildir> cd allen-p          # Go into the allen-p folder

C:\enron_mail_20110402\maildir\allen-p> dir         # Show files in the current directory
_sent_mail         contacts          discussion_threads  notes_inbox
sent_items         all_documents     deleted_items       inbox
sent               straw

C:\enron_mail_20110402\maildir\allen-p> cd inbox    # Go into the inbox for allen-p

C:\enron_mail_20110402\maildir\allen-p\inbox> dir   # Show the files in the inbox for allen-p
1. 11. 13. 15. 17. 19. 20. 22. 24. 26. 28. 3. 31. 33. 35. 37. 39. 40.
2. 44. 5. 62. 64. 66. 68. 7. 71. 73. 75. 79. 83. 85. 87. 10. 12. 14.
3. 18. 2. 21. 23. 25. 27. 29. 30. 32. 34. 36. 38. 4. 41. 43. 45. 6.
63. 65. 67. 69. 70. 72. 74. 78. 8. 84. 86. 9.

C:\enron_mail_20110402\maildir\allen-p\inbox> cat 1.   # Show contents of the file named "1."

Message-ID: <16159836.1075855377439.JavaMail.evans@thyme>
Date: Fri, 7 Dec 2001 10:06:42 -0800 (PST)
From: heather.dunton@enron.com
To: k..allen@enron.com
Subject: RE: West Position
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-From: Dunton, Heather </O=ENRON/OU=NA/CN=RECIPIENTS/CN=HDUNTON>
X-To: Allen, Phillip K. </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Pallen>
X-cc:
X-bcc:
X-Folder: \Phillip_Allen_Jan2002_1\Allen, Phillip K.\Inbox
X-Origin: Allen-P
X-FileName: pallen (Non-Privileged).pst

Please let me know if you still need Curve Shift.

Thanks,


Now the next step is to convert the mail data into the Unix mbox format. An mbox is in fact just a large text file of concatenated mail messages that is easily accessible by text-based tools. I have used a Python script to convert the data into mbox format. Thereafter, this mbox file is converted into an ELK-compatible JSON format. The json file can be found here. A snippet of the json file can be found below:


{"index":{"_index":"enron","_type":"inbox"}}
{"X-cc": "", "From": "r-3-728402-1640008-2-359-us2-982d4478@xmr3.com", "X-Folder": "\\jskillin\\Inbox", "Content-Transfer-Encoding": "7bit", "X-bcc": "", "X-Origin": "SKILLING-J", "To": ["jeff.skilling@enron.com"], "parts": [{"content": "\n[IMAGE]\n[IMAGE]\nJoin us June 26th for an on-line seminar featuring Steven J. Kafka, Senior Analyst at Forrester Research, as he discusses how technology can create more effective collaboration in today's virtualized enterprise. Also featuring Mike Hager, VP, OppenheimerFunds, offering insights into implementing these technologies through real-world experiences. Brian Anderson, CMO, Access360 will share techniques and provide tips on how to successfully deploy resources across the virtualized enterprise. \nDon't miss this important event. Register now at http://www.access360.com/webinar/ . For a sneak preview, check out our one-minute animation that illustrates the challenges of provisioning access rights across the \"virtualized\" enterprise.\nAbout Access360\nAccess360 provides the software and services needed for deploying policy-based provisioning solutions. Our solutions help companies automate the process of provisioning employees, contractors and business partners with access rights to the applications they need. With Access360, companies can react instantaneously to changing business environments and relationships and operate with confidence, whether in a closed enterprise environment or across a virtual or extended enterprise.\n\nAccess360\n\nIf you would prefer not to receive further messages from this sender:\n1. Click on the Reply button.\n2. Replace the Subject field with the word REMOVE.\n3. Click the Send button.\nYou will receive one additional e-mail message confirming your removal.\n\n", "contentType": "text/plain"}], "X-FileName": "jskillin.pst", "Mime-Version": "1.0", "X-From": "Access360 <R-3-728402-1640008-2-359-US2-982D4478@xmr3.com>@ENRON", "Date": {"$date": 991326029000}, "X-To": "Skilling, Jeff </o=ENRON/ou=NA/cn=Recipients/cn=JSKILLIN>", "Message-ID": "<14649554.1075840159275.JavaMail.evans@thyme>", "Content-Type": "text/plain; charset=us-ascii", "Subject": "Forrester Research on Best Practices for the \"Virtualized\" Enterprise"}

When you have a huge amount of data to be pushed into Elasticsearch, it is better to do a bulk import by specifying the data file. Each mail message is on a line of its own, associated with an entry specifying the index (enron) and document type (inbox). There is no need to specify an id, as Elasticsearch will assign one automatically.

Data in Elasticsearch can be broadly divided into two types – exact values and full text. Exact values are exactly what they sound like: examples are a date or a user ID, but they can also include exact strings such as a username or an email address. For example, the exact value Foo is not the same as the exact value foo, and the exact value 2014 is not the same as the exact value 2014-09-15. Full text, on the other hand, refers to textual data – usually written in some human language – like the text of a tweet or the body of an email. For the purpose of this exercise, it is better to treat email addresses (To, CC, BCC) as exact values. Hence, we first need to specify the mapping, which can be done in the following manner:

curl -XPUT "localhost:9200/enron" -d '{
"settings":
{
    "number_of_shards": 5,
    "number_of_replicas": 1
},
"mappings":
{
    "inbox":
    {
        "_all":
        {
            "enabled": false
        },
        "properties":
        {
            "To":
            {
                "type": "string",
                "index": "not_analyzed"
            },
            "From":
            {
                "type": "string",
                "index": "not_analyzed"
            },
            "CC":
            {
                "type": "string",
                "index": "not_analyzed"
            },
            "BCC":
            {
                "type": "string",
                "index": "not_analyzed"
            }
        }
    }
}
}'

You can verify that the mapping has indeed been set.

curl -XGET "http://localhost:9200/_mapping?pretty"
{
    "enron" :
    {
        "mappings" :
        {
            "inbox" :
            {
                "_all" :
                {
                    "enabled" : false
                },
                "properties" :
                {
                    "BCC" :
                    {
                        "type" : "string",
                        "index" : "not_analyzed"
                    },
                    "CC" :
                    {
                        "type" : "string",
                        "index" : "not_analyzed"
                    },
                    "From" :
                    {
                        "type" : "string",
                        "index" : "not_analyzed"
                    },
                    "To" :
                    {
                        "type" : "string",
                        "index" : "not_analyzed"
                    }
                }
            }
        }
    }
}


Now let’s load all the mailbox data by using the json file, in the following manner:


curl -XPOST "http://localhost:9200/_bulk" --data-binary @enron.json


We can check if all the data has been uploaded successfully.

curl "localhost:9200/enron/inbox/_count?pretty"
{
    "count" : 41299,
    "_shards" :
    {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    }
}


You can see that 41299 records, each corresponding to a different message, have been uploaded. Now let's start the fun part by doing some analysis on this data. Kibana provides awesome analytics capability and associated charts. Let's try to see how many messages are circulated on a weekly basis.

enron-date

The above histogram shows the message spread on a weekly basis. The date value is in terms of milliseconds past the epoch. You can see that one particular week has a peak of 3546 messages; something interesting must have been happening that week. Now let's see who the top recipients of messages are.

enron-to

You can see that Gerald, Sara and Kenneth are some of the top recipients of messages. How about checking out the top senders?

enron-from

You can see that Pete, Jae and Ken are the top senders of messages. In case you are wondering what exactly Enron employees used to discuss, let's check out the top keywords from message subjects.

enron-subject

It seems the most interesting discussions centered on enron, gas, energy and power. There is a lot more interesting analysis that can be done with the Enron mail data. I would recommend you try the following:

  • Counting sent/received messages for particular email addresses
  • What was the maximum number of recipients on a message?
  • Which two people exchanged the most messages amongst one another?
  • How many messages were person-to-person messages?



Java 8 Lambda Expression for Design Patterns – Strategy Design Pattern

The strategy pattern defines a family of algorithms encapsulated behind a driver class usually known as the Context, and makes the algorithms easily interchangeable. It provides a mechanism to choose the appropriate algorithm at a particular time.

The algorithms (strategies) are chosen at runtime either by a Client or by the Context. The Context class handles all the data during the interaction with the client.

The key participants of the Strategy pattern are represented below:

strategy.png
  • Strategy – Specifies the interface for all algorithms. This interface is used to invoke the algorithms defined by a ConcreteStrategy.
  • Context – Maintains a reference to a Strategy object.
  • ConcreteStrategy – Actual implementation of the algorithm as per Strategy interface

Now let's look at a concrete example of the strategy pattern and see how it gets transformed with lambda expressions. Suppose we have different rates for calculating income tax. Based on whether tax is paid in advance or late, there is a rebate or a penalty, respectively. We could encapsulate this functionality in the same class as different methods, but that would require modifying the class whenever some other tax calculation is needed in future. This is not an efficient approach; changes to the implementation of a class should be the last resort.

Let's take an optimal approach by using the Strategy pattern. We will make an interface for the tax strategy with a basic method:

public interface TaxStrategy {

	public double calculateTax(double income);
}

Now let’s define the concrete strategy for normal income tax.

public class PersonalTaxStrategy implements TaxStrategy {

	public PersonalTaxStrategy() { }

	@Override
	public double calculateTax(double income) {

		System.out.println("PersonalTax");

		double tax = income * 0.3;
		return tax;
	}
}

The PersonalTaxStrategy class conforms to the TaxStrategy interface. Similarly, let’s define a concrete strategy for late tax payment which incurs a penalty.

public class PersonalTaxPenaltyStrategy implements TaxStrategy {

	public PersonalTaxPenaltyStrategy() { }

	@Override
	public double calculateTax(double income) {

		System.out.println("PersonalTaxWithPenalty");

		double tax = income * 0.4;
		return tax;
	}
}

Next, let's define a concrete strategy for advance tax payment, which results in a tax rebate.

public class PersonalTaxRebateStrategy implements TaxStrategy {

	public PersonalTaxRebateStrategy() { }

	@Override
	public double calculateTax(double income) {

		System.out.println("PersonalTaxWithRebate");

		double tax = income * 0.2;
		return tax;
	}
}

Now let's combine all the classes and interfaces defined so far to leverage the power of the Strategy pattern. Let the main method act as the Context for the different strategies. Here is just one sample interplay of all these classes:

import java.util.Arrays;
import java.util.List;

public class TaxStrategyMain {

	public static void main(String [] args) {

		//Create a List of Tax strategies for different scenarios
		List<TaxStrategy> taxStrategyList =
				Arrays.asList(
						new PersonalTaxStrategy(),
						new PersonalTaxPenaltyStrategy(),
						new PersonalTaxRebateStrategy());

		//Calculate Tax for different scenarios with corresponding strategies
		for (TaxStrategy taxStrategy : taxStrategyList) {
			System.out.println(taxStrategy.calculateTax(30000.0));
		}
	}
}

Running this gives the following output:

PersonalTax
9000.0
PersonalTaxWithPenalty
12000.0
PersonalTaxWithRebate
6000.0


It clearly demonstrates how different tax rates can be calculated by using the appropriate concrete strategy class. I have combined all the concrete strategies (algorithms) in a list and then accessed them by iterating over the list.

What we have seen till now is just the standard strategy pattern, and it's been around for a long time. In these times, when functional programming is the new buzzword, one may ponder whether, with the support of lambda expressions in Java, things can be done differently. Indeed, since the strategy interface is effectively a functional interface, we can rehash the code using lambda expressions. Let's see what the code looks like:

import java.util.Arrays;
import java.util.List;

public class TaxStrategyMainWithLambda {

	public static void main(String [] args) {

		//Create a List of Tax strategies for different scenarios with inline logic using Lambda
		List<TaxStrategy> taxStrategyList =
				Arrays.asList(
						(income) -> { System.out.println("PersonalTax"); return 0.30 * income; },
						(income) -> { System.out.println("PersonalTaxWithPenalty"); return 0.40 * income; },
						(income) -> { System.out.println("PersonalTaxWithRebate"); return 0.20 * income; }
			);

		//Calculate Tax for different scenarios with corresponding strategies
		taxStrategyList.forEach((strategy) -> System.out.println(strategy.calculateTax(30000.0)));
	}
}

Running this gives similar output:

PersonalTax
9000.0
PersonalTaxWithPenalty
12000.0
PersonalTaxWithRebate
6000.0


We can see that the use of lambda expressions makes the additional classes for concrete strategies redundant. You don't need additional classes; simply specify additional behavior using a lambda expression.

All the code snippets can be accessed from my GitHub repo.


2015 in review

The WordPress.com stats helper monkeys prepared a 2015 annual report for this blog.

Here's an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 1,700 times in 2015. If it were a cable car, it would take about 28 trips to carry that many people.

Click here to see the complete report.
