Kubernetes Architecture: An Overview

Software containers have transformed cloud application development. Docker is the most widely used container technology; it uses Linux kernel facilities such as cgroups and namespaces (optionally hardened with mechanisms like SELinux) to isolate processes. Unix-like operating systems have long offered built-in container-style isolation, for example Solaris Zones and FreeBSD jails, but it is Docker that led to the widespread use of containers.

Core container implementations focus on the life cycle of individual containers, but production application workloads require many containers spread across multiple hosts. Infrastructure tasks like container deployment, scaling, and load balancing call for an orchestration framework. Kubernetes is an open-source container management platform that provides highly resilient infrastructure with always-available deployment capabilities, scaling, and self-healing of containers. It was initially developed by Google in Go and later donated to the Cloud Native Computing Foundation (CNCF). Kubernetes hides the complexity of managing containers and, being flexible by design, can run on bare-metal machines as well as on various public or private cloud platforms.

Kubernetes Architecture and Components

Kubernetes has a flexible, loosely coupled architecture with built-in service discovery. Like many distributed computing platforms, a Kubernetes cluster consists of one or more master nodes and multiple worker nodes. The master node is responsible for serving the Application Programming Interface (API), scheduling deployments, and managing the cluster as a whole. Each worker node runs a container runtime, such as Docker or rkt, along with an agent that communicates with the master; additional components handle monitoring, logging, and service discovery. Nodes can be bare-metal servers or virtual machines running in a cloud, and applications run on them using the cluster's compute, network, and storage resources.

Kubernetes manages scalable applications that typically consist of multiple microservices interacting with each other. Quite often these microservices are so tightly coupled that, in a non-containerized setup, they would run together on one server. Such a group of containers, the smallest unit that Kubernetes can schedule and deploy, is called a pod.
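As a sketch, a pod grouping two tightly coupled containers might be declared as below. All names and images here are illustrative, not taken from the article:

```yaml
# Hypothetical pod: a web server plus a log-shipping sidecar sharing a volume.
apiVersion: v1
kind: Pod
metadata:
  name: web-app
  labels:
    app: web
spec:
  volumes:
    - name: logs
      emptyDir: {}              # scratch volume shared by both containers
  containers:
    - name: frontend            # main application container
      image: nginx:1.25
      ports:
        - containerPort: 80
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
    - name: log-shipper         # sidecar: same network namespace, same volumes
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /var/log/nginx/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
```

Both containers are scheduled together onto one node and share the pod's IP address and storage, which is exactly the "would run together on one server" relationship described above.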

Think of a pod as a logical boundary for containers that share the same context and resources: storage, Linux namespaces, cgroups, and an IP address. You can scale pods by creating ReplicaSets, which ensure that a deployment always runs the required number of pods. Individual pods are not long-lived, so they cannot reliably be identified by a fixed IP address. To address this, Kubernetes uses the concept of a Service, which acts as an abstraction on top of multiple pods: a Service selects its pods by matching key-value labels, which also makes the pods discoverable. The figure below illustrates the components of a Kubernetes cluster.
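A minimal sketch of this label-based discovery, with hypothetical names and a label taken only for illustration:

```yaml
# Hypothetical Service: a stable virtual IP in front of all pods labeled app: web.
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web          # every pod carrying this label becomes a backend
  ports:
    - protocol: TCP
      port: 80        # port exposed by the Service
      targetPort: 80  # port on the selected pods
```

Pods behind the Service can come and go; clients keep using the Service's stable name and address while the label selector tracks the current set of backends.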

Kubernetes Architecture

Master Node Components

The master node is the entry point for all administrative tasks and takes care of orchestrating the worker nodes. Let's look at the components of the master node in more detail.

  • API Server – The API Server is the entry point for performing operations on the cluster using the API. It provides a clean REST interface so that different tools and libraries can easily communicate with it. It validates incoming requests and then executes the corresponding business logic, persisting the resulting state.
  • Controller Manager – Different kinds of controllers run inside the master node to regulate the state of the cluster and perform various tasks. The controller manager is a daemon that embeds all of these controllers and uses the API server to watch the shared state of the cluster. Its primary objective is to move the current state of the cluster toward the desired state. The main controllers include the endpoints controller, namespace controller, replication controller, and service account controller.
  • etcd key-value store – etcd is a distributed, highly available key-value store that holds configuration and state information used by all nodes in the cluster, and it exposes an API for reading and writing this data. Examples of data stored in etcd include service details and state, namespace and replication information, and the number of scheduled and active jobs.
  • Scheduler – The Scheduler is responsible for placing configured pods onto nodes. It tracks resource utilization across the cluster, maintains information about the resources available on each member node, and decides where each pod, and therefore each service, should run.
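The controller manager's reconciliation loop operates on declarative desired state. A hypothetical ReplicaSet manifest (names and image are illustrative) shows what that desired state looks like:

```yaml
# Hypothetical ReplicaSet: the replication controller logic keeps
# three copies of the pod template running; if one dies, the scheduler
# places a replacement on an available node.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-rs
spec:
  replicas: 3              # desired state watched by the controller manager
  selector:
    matchLabels:
      app: web
  template:                # pod template used to create replacements
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: frontend
          image: nginx:1.25
```

The `replicas` field is the desired state stored in etcd; the controllers and scheduler together drive the actual state toward it.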

Worker Node Components

Worker nodes are VMs or bare-metal servers that run application workloads in pods. They provide the essential services for managing networking between containers, communicating with the master node, and assigning resources to containers. Let's look at the components of a worker node in more detail.

  • Container Runtime – The primary role of the container runtime is to run containers and manage their life cycle. It runs on each worker node and executes the configured pods. Docker is often loosely referred to as a container runtime, but strictly speaking it is a platform that uses a container runtime (containerd) under the hood.
  • Kubelet – Kubelet is an agent that obtains pod configurations from the API server and makes sure that the required containers are up and running. It runs on every worker node and is the node's channel of communication with the master: it watches the API server for pods assigned to its node and reports node and pod status back.
  • Kube-proxy – Kube-proxy runs on each worker node and acts as a network proxy and load balancer for Services, handling the routing of TCP and UDP packets to the appropriate backend pods.
  • Kubectl – Kubectl is a command-line tool that sends commands to the cluster by communicating with the API server; it typically runs on an administrator's machine rather than on the nodes themselves.
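The kubelet's "keep the required containers running" role can be steered with probes. A hypothetical pod whose container the kubelet restarts when its health endpoint stops responding (names and image are illustrative):

```yaml
# Hypothetical pod with a liveness probe: the kubelet on the node polls
# the endpoint and restarts the container after repeated failures.
apiVersion: v1
kind: Pod
metadata:
  name: probed-app
spec:
  containers:
    - name: app
      image: nginx:1.25
      ports:
        - containerPort: 80
      livenessProbe:
        httpGet:
          path: /            # endpoint the kubelet checks
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10    # probe interval
```

This is the self-healing behavior in miniature: the master records the desired pod, and the kubelet on the worker node enforces it locally.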

Kubernetes Design Principles

  • High Availability – Cloud architecture patterns such as microservices are intended to support high availability. Kubernetes provides high availability at both the application and the infrastructure level through ReplicaSets, replication controllers, and StatefulSets (formerly PetSets). ReplicaSets are responsible for keeping the desired number of replicas of a stateless pod running at all times. For stateful workloads, Kubernetes is compatible with multiple storage backends such as NFS, AWS EBS, Azure Disk, and Google Persistent Disk.
  • Scalability – Applications based on a microservices architecture run on Kubernetes as pods. Kubernetes can horizontally scale the number of pods based on CPU utilization against a configurable threshold, and it load-balances traffic across the pods of an application. Horizontal scaling of stateful pods is supported through StatefulSets, which are similar to Deployments but preserve each pod's storage even when the pod is removed.
  • Security – Kubernetes manages security at several levels: cluster, application, and network. API endpoints are secured with Transport Layer Security (TLS), which ensures that users are authenticated over a secure channel. For the benefit of applications, Kubernetes Secrets store sensitive information such as passwords and tokens; a Secret can be accessed by pods in the same namespace that reference it. Network policies specify how pods may communicate with each other and with other network endpoints.
  • Portability – Kubernetes clusters can be deployed on popular Linux distributions, including Red Hat Enterprise Linux, Ubuntu, CentOS, and Debian. They run in varied environments: cloud platforms such as Azure, AWS, and Google Cloud; virtualized environments based on vSphere or KVM; and bare-metal servers. There is also flexibility in the container runtime: users can choose Docker, rkt, or any new container runtime in the future.
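The CPU-threshold scaling described above can be expressed declaratively. A hypothetical HorizontalPodAutoscaler targeting a Deployment named web (the `autoscaling/v1` form is shown for brevity; the target name and limits are illustrative):

```yaml
# Hypothetical autoscaler: grow or shrink the web deployment between
# 2 and 10 replicas, aiming for 70% average CPU utilization.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                          # deployment whose replica count is managed
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70   # the configurable CPU threshold
```

When observed CPU utilization rises above the target, the autoscaler raises the replica count; the ReplicaSet machinery then creates the extra pods and the Service spreads traffic across them.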

Summary

Kubernetes is an open-source orchestration tool for managing containerized applications and distributed services across a cluster of nodes. As a distributed computing platform, it consists of one or more master nodes and multiple worker nodes. The master node runs components such as the API server, scheduler, controller manager, and etcd key-value store; worker nodes run the container runtime, kubelet, and kube-proxy, and are managed with kubectl. Core design principles of high availability, scalability, portability, and security make Kubernetes a true distributed computing environment.

