Intro to Kubernetes and AWS EKS

Wednesday, October 9, 2024

Having worked with Kubernetes (colloquially known as "K8s") both on-prem and in AWS EKS (Elastic Kubernetes Service), and having passed the CKA (Certified Kubernetes Administrator) and CKAD (Certified Kubernetes Application Developer) exams, it seems only right to do a refresher and explain the main concepts around the technology. We'll begin by trying to understand why we use K8s in the first place.

We use K8s, at its core, to run containers, and containers are essential for enterprise-grade, large-scale workloads because they let multiple apps and services share the same underlying infrastructure: a single physical server, virtual machine (VM), or cloud instance, or a set of these. Instead of confining a single app or service to a single machine, you can run multiple apps and services on one machine as containers, ensuring that you are maximizing the compute and memory capacity of that machine. If you were running a single app or service per machine, there could be underutilized capacity, which would not be cost-effective.

Let's assume the set of machines your containers run on consists of three, meaning you have three machines/servers/VMs allocated to run your apps or services as containers. With K8s, you can rely on its scheduler to allocate the predetermined number of containers to whichever machine in your set has available capacity. It's intelligent enough to determine where to run the containers and can reschedule them across the allocated machines based on factors such as compute and memory utilization, as well as the health of the machine itself. For example, let's say we have machines A, B, and C, with a total of six containers running. Three of them are front-end containers that handle user traffic, and the other three serve as the back-end API layer for the front-end. K8s will place, and when necessary reschedule, these six containers across the three machines to keep performance and cost-efficiency in check.

In my opinion, K8s is analogous to the game of Tetris. In Tetris, you have arrangements of blocks that you must try to fit into whatever space is available, and that's exactly what K8s does with containers. Containers have varying memory and compute requirements, and they need to fit onto machines that can accommodate them. Pods will be discussed in a later section, but in short, a Pod is the smallest deployable unit in K8s. Typically, each Pod runs a single container, but it can run multiple containers. If you're interested in use cases for a multi-container Pod, look up “sidecar.”

Further building upon this analogy, K8s also has the ability to “self-heal.” This means that if, for any reason, a container dies due to an application-level error or the machine it runs on encounters issues, K8s will automatically spin up a new container on another machine, based on the desired count and available compute/memory capacity.

tetris

Image credit: https://containers.goffinet.org/k8s/setup-k8s

K8s can also “auto-scale,” but what does this mean? K8s can scale the number of containers up or down based on certain metrics and parameters you set. For example, if I have a configuration that states CPU (compute) utilization must not exceed 50%, and that the number of containers can scale up to 8 (with a minimum of 3), K8s will handle the scaling for you. If front-end traffic is high during peak operating hours and containers are hitting the 50% CPU threshold, K8s will increase the number of running containers, up to 8, to accommodate the increased traffic. This alleviates the compute pressure on the existing containers by distributing the traffic across more containers. This mechanism is called “Horizontal Pod Autoscaling” and is available out of the box. Additionally, there is “Vertical Pod Autoscaling,” which can increase the compute and memory allocated to the Pods themselves, and “Node Autoscaling,” which adds more machines to your set of nodes as needed.
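To make the example above concrete, here's a minimal sketch of a HorizontalPodAutoscaler manifest encoding those parameters (the `frontend` Deployment name is hypothetical, not from a real project):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend              # hypothetical Deployment to scale
  minReplicas: 3                # never fewer than 3 Pods
  maxReplicas: 8                # never more than 8 Pods
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # target roughly 50% average CPU utilization
```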

Now that we understand the basics of K8s, let's shift our focus to Docker and running containers. Without a foundational understanding of these two concepts, diving into the intricacies of K8s and its various components would be premature.

Docker is a platform that allows developers to package applications and their dependencies into lightweight, portable containers that run consistently across different environments. A container encapsulates everything needed to execute the application, including the code, runtime, libraries, and system tools, ensuring it works the same in development, testing, and production. Containers are efficient and scalable, making them ideal for microservices architectures (an architectural style where applications are broken down into small, loosely coupled services that can be developed, deployed, and scaled independently). K8s extends this by providing a system to automatically manage, scale, and orchestrate containers across clusters of machines, ensuring high availability and efficient resource utilization.

If you've read my prior blogs (like this one), you'll see that we've worked with a Dockerfile (a script that outlines the steps to build a Docker image, specifying the base image, application code, dependencies, and configurations for running the container) to prepare our Docker images which will run as containers once deployed to AWS ECS (Elastic Container Service). Now, EKS is just an alternative to ECS; they both do "container-orchestration", but ECS is native only to AWS, while K8s is open-source and can be implemented on-prem or with other cloud providers like Microsoft Azure and Google Cloud. AWS' K8s implementation is known as EKS, and since I work solely with AWS, we will be looking into EKS.

OK, so we discussed at the start why K8s is so great; now let's remind ourselves of what it is and what it's used for.

K8s is an open-source platform designed for automating the deployment, scaling, and management of containerized applications across clusters of machines. It provides the tools to ensure that containers are efficiently distributed, load-balanced, and resilient to failure, handling tasks like scaling based on demand, rolling updates, and self-healing when containers crash or become unresponsive. K8s is widely used for managing complex, large-scale containerized environments, making it easier to deploy microservices architectures while maintaining high availability and performance. It helps solve challenges such as handling distributed systems, managing resource allocation, and automating the orchestration of containers across multiple hosts, reducing the manual effort involved in scaling, updating, and maintaining applications in production environments.

A K8s cluster is a set of machines, called nodes, that work together to run and manage containerized applications. These machines can be physical servers, virtual machines (VMs), or cloud instances, and they are categorized into two types: control plane (master node) and worker nodes. The control plane is responsible for managing the cluster's overall state, making decisions about scheduling, scaling, and orchestrating resources, while the worker nodes handle the actual running of application workloads. Each worker node runs components like the kubelet to manage containers and a container runtime to execute them. K8s ensures scalability, reliability, and high availability within the cluster by distributing tasks, managing resources, and maintaining the desired state of applications automatically across the nodes.

The control plane, worker nodes, and K8s objects together form the core of K8s and its ability to orchestrate containerized applications. The control plane acts as the decision-maker, managing the overall state of the cluster by continuously evaluating and enforcing the desired state, which is defined through K8s objects like Pods, Services, and Deployments. These objects describe what applications and resources should be running. The control plane assigns workloads (Pods) to the worker nodes, where the actual application containers are executed. The worker nodes, managed by the kubelet and powered by a container runtime like Docker, ensure that the workloads are running as expected. The kube-proxy manages communication between these Pods and across nodes, while the control plane constantly monitors and adjusts workloads to ensure the system remains in the desired state. Together, these three components provide the scalability, reliability, and automation that K8s is known for.

Control Plane Components

The control plane in K8s is responsible for managing the overall state and operations of the cluster. It ensures that the desired state of the cluster, as defined by the user through objects (we'll explore objects in the following section) like Pods, Services, and Deployments, is maintained. The control plane orchestrates tasks like scheduling, scaling, and networking across all the nodes in the cluster, effectively coordinating the lifecycle of containers and workloads. It is composed of several critical components that communicate to manage and control the K8s environment.

The kube-apiserver is the core component of the control plane, exposing the K8s API and serving as the entry point for all administrative tasks. etcd is a consistent and highly available key-value store that holds all cluster data and configuration states, ensuring reliability and persistence. The kube-scheduler assigns Pods to nodes based on resource availability and other constraints, while the kube-controller-manager runs various controllers that enforce the desired state of the cluster, such as ensuring Pods stay running or handling scaling events. The cloud-controller-manager is an optional component that integrates K8s with underlying cloud providers, handling tasks like managing load balancers and storage resources tied to the cloud infrastructure. Together, these components keep the K8s cluster running smoothly and in accordance with the desired configurations.

K8s Control Plane Components

The control plane is responsible for managing the overall state of the K8s cluster. It ensures that the cluster's desired state is maintained, coordinating everything from scheduling to resource management. Below is a breakdown of the key components in the control plane and how they interact to ensure the smooth operation of the cluster.

kube-apiserver

The kube-apiserver is the core component of the K8s control plane. It exposes the K8s API, which serves as the primary interface for interacting with the cluster. Whether you're using kubectl, making requests via the K8s dashboard, or utilizing the API directly, the kube-apiserver handles all these requests. It validates incoming requests and processes changes to the cluster, allowing users and system components to communicate with K8s efficiently.

The K8s API can be accessed directly or through tools like kubectl, which simplifies cluster management. With kubectl, you can perform actions such as deploying applications, scaling services, and managing resources like Pods, Services, and Deployments. For example, to create a new deployment, you would use kubectl apply -f {deployment.yaml}, where the YAML file defines the configuration of your application. You can also interact with live objects, such as checking the status of Pods using kubectl get pods or scaling a deployment with kubectl scale deployment {deployment-name} --replicas={number}. Through these commands, kubectl makes API calls to the kube-apiserver, which then processes the requests and ensures the cluster's desired state is maintained.
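For illustration, a minimal deployment.yaml you might pass to kubectl apply could look something like this sketch (the names and the nginx image are placeholders, not from a real project):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                 # hypothetical name
spec:
  replicas: 3                    # desired number of Pod copies
  selector:
    matchLabels:
      app: frontend
  template:                      # Pod template the Deployment stamps out
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: web
          image: nginx:1.25      # placeholder image
          ports:
            - containerPort: 80
```

Running kubectl apply -f deployment.yaml sends this desired state to the kube-apiserver, which then works to make the cluster match it.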

etcd

etcd is a highly available key-value store that holds all the configuration data for the cluster, including the current and desired states of K8s resources. It provides reliable, consistent storage for all the data managed by the API server. In a distributed K8s environment, etcd ensures data consistency and availability, even during network partitions or node failures, making it the backbone of K8s persistence.

kube-scheduler

The kube-scheduler is responsible for assigning Pods to nodes within the cluster. It continuously monitors for Pods that have not yet been assigned to a node and matches them to nodes based on resource requirements, node capacity, and other constraints like affinity, taints, or tolerations. The scheduler ensures efficient resource utilization by balancing workloads across the cluster's nodes.

The kube-scheduler uses several factors to determine the best node for a Pod, ensuring it runs optimally within the cluster. Affinity rules allow you to specify preferences or requirements for where Pods should be scheduled. For instance, you can configure node affinity so that a Pod is only scheduled on nodes with specific labels, like ensuring that a Pod runs on nodes in a particular region or with certain hardware. Taints and tolerations work together to control which Pods can land on specific nodes. A node may be tainted to prevent most Pods from being scheduled on it (e.g., if it's reserved for critical workloads), and only Pods with a matching toleration will be scheduled there. This mechanism is useful for ensuring that high-priority or specialized workloads don't compete for resources on nodes intended for general use. The scheduler balances these rules with resource requirements (like CPU and memory) to select the most appropriate node for each Pod, maximizing efficiency and resource utilization.
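As a rough sketch, here's how node affinity and a toleration might appear in a Pod spec; the zone value, taint key, and image are illustrative assumptions, not prescriptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo            # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone   # well-known node label
                operator: In
                values: ["us-east-1a"]             # only schedule in this zone (example)
  tolerations:
    - key: dedicated              # tolerates a hypothetical "dedicated=critical" taint
      operator: Equal
      value: critical
      effect: NoSchedule
  containers:
    - name: app
      image: nginx:1.25           # placeholder image
      resources:
        requests:
          cpu: 500m               # requests the scheduler uses when bin-packing Pods onto nodes
          memory: 256Mi
```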

kube-controller-manager

The kube-controller-manager runs various controllers that continuously work to maintain the desired state of the cluster. These controllers manage important functions like ensuring that the correct number of Pods are running, managing node availability, handling scaling, and managing networking and storage. Each controller monitors the state of the cluster and makes changes as needed to align the actual state with the desired state.

cloud-controller-manager (optional)

The cloud-controller-manager integrates K8s with the underlying cloud provider's infrastructure. In the case of AWS EKS, this component is responsible for managing cloud resources such as Elastic Load Balancers (ELBs), Elastic Block Store (EBS), and VPC routing. AWS EKS relies on this integration to handle cloud-specific resources like automatic provisioning of load balancers for Services of type LoadBalancer, attaching persistent storage volumes to nodes, and managing node health. The AWS cloud-controller-manager ensures seamless interaction between K8s and AWS infrastructure, providing a unified management experience for workloads running in the cloud.

These control plane components work together to manage the state and operation of the K8s cluster. The kube-apiserver acts as the front door to the cluster, etcd stores the desired state, the kube-scheduler assigns Pods to nodes, and the kube-controller-manager ensures that everything in the cluster remains in the desired state. In AWS EKS, the cloud-controller-manager handles cloud-specific resources like load balancers, storage, and networking, making the cluster integration with AWS infrastructure seamless. Together, they enable K8s to be a powerful and efficient orchestration tool for containerized applications.

Node Components

The node components in K8s are responsible for maintaining the runtime environment on each individual node, ensuring that the Pods and containers are running as expected. These components interact with the control plane to execute the desired state of the cluster on a per-node basis. Each node in the K8s cluster includes these components to manage networking, container execution, and monitoring of the running workloads.

The kubelet is the primary agent that runs on each node, ensuring that all containers within Pods are running as instructed by the control plane. It communicates with the control plane to receive updates about the desired state of the node and acts to maintain it. The kube-proxy (optional) manages network rules to enable communication between Pods and across the network, ensuring Services can route traffic to the correct Pods. Finally, the container runtime is the software that is responsible for actually running the containers within the Pods, with Docker, containerd, and CRI-O being common examples. These node components collectively enable K8s to manage workloads effectively across the entire cluster.

K8s Node Components

Node components are responsible for running workloads and maintaining the runtime environment on each individual node within the K8s cluster. These components work together to ensure that Pods are running as expected and that the necessary network rules and resources are in place. Below is an explanation of the key node components and their roles in K8s.

kubelet

The kubelet is the primary agent running on every node in the K8s cluster. It ensures that all containers in a Pod are running as intended. The kubelet interacts with the control plane to receive instructions and monitors the state of Pods on the node. If a container fails or needs to be restarted, the kubelet ensures that it aligns with the desired state as specified by the control plane, taking corrective action when necessary.

kube-proxy (optional)

kube-proxy is responsible for maintaining network rules on each node. It facilitates networking in K8s by ensuring that Pods can communicate with each other, both within the node and across different nodes in the cluster. kube-proxy enables the implementation of K8s Services by managing IP forwarding and load balancing rules. While kube-proxy is optional, it plays an important role in ensuring smooth service discovery and communication within the cluster.

Container runtime

The container runtime is the software responsible for running the containers that make up a Pod. While Docker is the most ubiquitous container runtime, widely known and used due to its simplicity and vast ecosystem, K8s supports other container runtimes as well. Alternatives include containerd, which is a lightweight runtime designed for K8s, and CRI-O, a K8s-native container runtime designed to be efficient and optimized for K8s workloads. The container runtime abstracts the low-level container execution, allowing K8s to manage and orchestrate containers across nodes, regardless of the runtime being used.

These node components work together to provide the runtime environment for K8s Pods. The kubelet ensures that containers are running as specified, kube-proxy manages network rules to allow communication across Pods and services, and the container runtime, such as Docker, containerd, or CRI-O, runs the containers themselves. Together, these components ensure that each node can fulfill its role in running and managing the workloads in a K8s cluster.

k8s control plane and node components

Image credit: https://kubernetes.io/docs/concepts/overview/components/

Objects

In K8s, resources are represented as objects, which are persistent entities that define the desired state of the cluster. As mentioned previously, a cluster is a set of machines that work together to run and manage containerized applications, orchestrated by K8s to ensure scalability, reliability, and high availability. These objects describe the resources needed and specify how they should behave, including what applications are running, their resource allocation, and policies such as restart behavior, upgrades, and fault tolerance. Once an object is created, K8s continually works to maintain the desired state you declare, following a declarative model where you specify the intended outcome, and the system ensures it is achieved.

For example, a Pod is the smallest deployable unit and encapsulates one or more containers that share storage, networking, and runtime configurations. A Service is another object that exposes a set of Pods to external traffic, facilitating communication between services or users. Deployments manage the desired state of Pods by ensuring the correct number of replicas are running and handling updates or rollbacks. Other objects like ConfigMaps and Secrets store configuration data and sensitive information, such as API keys or environment variables, that can be used by containers. By abstracting the complexity of managing containerized applications, K8s simplifies deployment, scaling, and maintenance, with tools like kubectl or direct API integration to manage these objects.

Now let's go into some detail on the main objects in K8s and learn what each one does.

K8s Objects Overview

In K8s, everything is managed through objects, which represent the desired state of your cluster. These objects define the configuration, scaling, and behavior of applications and resources running within the cluster. Below is an explanation of the key K8s objects and how they interact to form a dynamic, scalable, and self-healing infrastructure.

Pod

A Pod is the smallest and simplest K8s object. It encapsulates one or more tightly coupled containers, along with their storage and networking. Pods are the basic building blocks for running applications in K8s, and they ensure that containers within them share resources and communicate with each other. However, Pods are ephemeral, which means they can be destroyed and recreated, so other objects like Deployments are often used to manage Pods more efficiently.
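For reference, a bare-bones Pod manifest looks roughly like this (name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod               # hypothetical name
  labels:
    app: hello
spec:
  containers:
    - name: hello
      image: nginx:1.25         # placeholder image
      ports:
        - containerPort: 80
```

In practice you rarely create standalone Pods like this; objects such as Deployments (covered below) manage them for you.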

Service

A Service is an abstraction that defines a logical set of Pods and the policies by which they are accessed. K8s Services provide load balancing and ensure that even as Pods are dynamically created or destroyed, the application remains reachable via a consistent endpoint. Services can expose Pods both internally within the cluster or externally to the internet, depending on their configuration (ClusterIP, NodePort, LoadBalancer).

A K8s Service adds a layer of abstraction by providing a unified and stable networking interface for Pods, even as they are created or destroyed across different nodes with varying IP addresses. Without Services, managing network communication between constantly changing Pods would be complex, as each Pod has its own unique IP address, which can change when the Pod is rescheduled or recreated. The Service solves this problem by acting as a consistent entry point for accessing Pods. It keeps track of the Pods that belong to the logical group it manages and automatically routes traffic to healthy Pods, regardless of where they are running in the cluster. This ensures that applications remain accessible through a single, stable endpoint, regardless of the underlying changes in Pod IPs or node locations. By decoupling the network layer from the actual Pod infrastructure, Services make communication between distributed components in K8s much more reliable and scalable. This concept is often referred to as K8s "Service Discovery".

ClusterIP is the default service type and exposes the service on an internal IP within the cluster, making it accessible only to other services and Pods inside the cluster. This is ideal for internal communication between microservices or components that don't need to be exposed externally. NodePort exposes the service on a static port on each node's IP, allowing external traffic to access the service through <NodeIP>:<NodePort>. It's typically used for development or testing, as it offers basic external access but can be less secure and scalable. LoadBalancer creates an external load balancer (e.g., AWS ELB or GCP Load Balancer) that automatically routes traffic to the underlying Pods, offering a fully managed and scalable solution for exposing services to the internet. LoadBalancer is best used when you need to expose a service externally with automatic traffic distribution, especially in production environments.
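Here's a minimal sketch of a Service manifest; swapping the type field between ClusterIP, NodePort, and LoadBalancer changes how it is exposed (the names are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend-svc            # hypothetical name
spec:
  type: LoadBalancer            # or ClusterIP (default) / NodePort
  selector:
    app: frontend               # routes traffic to Pods carrying this label
  ports:
    - port: 80                  # port the Service listens on
      targetPort: 80            # port the container listens on
```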

Deployment

A Deployment ensures the desired state of your application by managing Pods. It defines how many replicas (copies) of a Pod should be running at any given time and takes care of rolling updates, rollbacks, and scaling. Deployments provide high availability by automatically distributing Pods across nodes, ensuring that your application remains up and running even during updates or failures.

Rolling updates, rollbacks, and scaling are key features of K8s Deployments that help manage application lifecycle and stability. Rolling updates allow for seamless updates to your application by gradually replacing old Pods with new ones, ensuring zero downtime. This process ensures that a portion of the application remains available at all times while new Pods are deployed incrementally. Rollbacks come into play when an update causes issues, allowing you to revert to a previous stable version of the application quickly, ensuring minimal disruption. Scaling refers to adjusting the number of Pod replicas running your application to meet demand. You can scale up (add more Pods) during high traffic or scale down (reduce Pods) during low demand, ensuring efficient resource usage while maintaining high availability.
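To sketch what this looks like in practice, a rolling-update strategy is just a few lines in the Deployment spec shown earlier (the values here are illustrative):

```yaml
# Excerpt from a Deployment spec; these fields sit alongside replicas and template
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # at most one extra Pod above the desired count during an update
      maxUnavailable: 1     # at most one Pod may be unavailable during an update
```

Rollbacks and scaling are then a command away, e.g. kubectl rollout undo deployment {deployment-name} and kubectl scale deployment {deployment-name} --replicas={number}.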

ReplicaSet

A ReplicaSet's purpose is to maintain a stable set of replicas (copies) of a Pod running at all times. While ReplicaSets ensure the correct number of Pods are running, they are usually managed by Deployments. ReplicaSets are important for maintaining the desired state of applications when Pods fail or are removed, automatically replacing them as needed to keep the system running smoothly.

StatefulSet

A StatefulSet is a K8s object designed for managing stateful applications. Unlike Deployments, StatefulSets provide guarantees about the ordering and uniqueness of Pods, making them ideal for use with databases or applications that require stable network identifiers and persistent storage.

StatefulSets are particularly useful in scenarios where applications require consistent identities, stable storage, or ordered deployment and scaling. For example, databases like MySQL, Cassandra, or MongoDB (both Cassandra and MongoDB are "NoSQL" databases, similar to DynamoDB, which I've worked with in other blog posts), which require persistent data across restarts, benefit from StatefulSets because they ensure each Pod retains its unique identifier and associated storage. Another common use case is for distributed systems, such as Apache Zookeeper or Kafka, which rely on stable network identities and ordered Pod creation and termination to maintain cluster coordination. In these situations, StatefulSets ensure that even as Pods are scaled or restarted, they maintain their connections, data, and predictable behavior across the cluster.
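A minimal StatefulSet sketch for a small database might look like this; the names, MySQL image, referenced Secret, headless Service, and storage size are all illustrative assumptions:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                       # Pods will be named db-0, db-1, db-2
spec:
  serviceName: db                # headless Service giving each Pod a stable DNS name (assumed to exist)
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: mysql
          image: mysql:8.0       # placeholder image
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret    # hypothetical Secret
                  key: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:          # each Pod gets its own PersistentVolumeClaim
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```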

Job & CronJob

Jobs and CronJobs are K8s objects for running tasks. A Job runs a task to completion, ensuring that the task runs successfully even if Pods fail and need to be restarted. A CronJob schedules recurring tasks, similar to cron jobs on a Linux system, making them useful for periodic tasks like backups or scheduled maintenance.
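As an example of a scheduled task, a CronJob for a nightly backup could be sketched like this; the image, bucket, and schedule are hypothetical, and it assumes the Pod has IAM permissions to write to S3 (e.g., via IRSA):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup           # hypothetical name
spec:
  schedule: "0 2 * * *"          # every day at 02:00 (standard cron syntax)
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: amazon/aws-cli:2.15.0                             # placeholder image/tag
              args: ["s3", "sync", "/data", "s3://my-backup-bucket"]   # hypothetical bucket
```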

PersistentVolume & PersistentVolumeClaim

PersistentVolumes (PV) and PersistentVolumeClaims (PVC) are used to manage storage in K8s. A PersistentVolume represents a piece of storage provisioned in the cluster, while a PersistentVolumeClaim allows a Pod to request storage. This separation between PV and PVC allows for dynamic storage management and ensures that data can persist across Pod restarts or rescheduling.
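A typical claim is just a few lines; this sketch assumes an EBS-backed gp3 StorageClass exists in the cluster, and the name and size are illustrative:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                 # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce              # mountable read-write by a single node at a time
  storageClassName: gp3          # assumes a gp3 StorageClass (e.g., EBS CSI driver) is available
  resources:
    requests:
      storage: 20Gi
```

A Pod then references the claim under its volumes section, and the matching PersistentVolume is bound (or dynamically provisioned) behind the scenes.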

ConfigMap & Secret

ConfigMaps and Secrets are K8s objects used to store configuration data for your applications. ConfigMaps hold non-sensitive data, such as environment variables or configuration files, while Secrets store sensitive information like passwords, API keys, or tokens. Both objects allow you to decouple configuration from your containerized application, making it easier to update or manage settings without modifying the actual application image.
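A minimal sketch of both objects (names and values are placeholders; real secrets should never be committed in plain text):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config               # hypothetical name
data:
  LOG_LEVEL: "info"
  FEATURE_FLAG: "true"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret               # hypothetical name
type: Opaque
stringData:
  API_KEY: "replace-me"          # placeholder value only
```

Containers can then consume these via envFrom or mounted volumes, keeping configuration out of the image itself.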

Ingress

Ingress is an API object that manages external access to services within the cluster, typically HTTP or HTTPS routes. Unlike Services that expose Pods directly, Ingress allows you to define more complex routing rules and can offer SSL termination, load balancing, and name-based virtual hosting, providing a flexible and secure way to manage traffic coming into the cluster.

In AWS EKS, Ingress can be seamlessly integrated with AWS's native load balancers (a service that distributes incoming network traffic across multiple backend services or Pods to ensure reliability, high availability, and optimal resource utilization in the cluster) to manage external traffic. When an Ingress resource is deployed in EKS, AWS often provisions an Elastic Load Balancer (ELB)—either an Application Load Balancer (ALB) or Network Load Balancer (NLB)—to handle incoming requests based on the routing rules defined in the Ingress object. By using the AWS ALB Ingress Controller, for example, K8s can automatically configure an ALB to manage HTTP/HTTPS traffic, offer SSL termination, and distribute traffic across services, all while leveraging AWS's scalable and high-availability infrastructure. This integration combines K8s' flexibility in routing with AWS's powerful load balancing capabilities, providing a streamlined solution for handling complex traffic patterns securely and efficiently.
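A rough sketch of an Ingress wired to an ALB on EKS might look like this; it assumes the AWS Load Balancer Controller is installed in the cluster, and the Service name is hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress              # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing   # provision a public ALB
    alb.ingress.kubernetes.io/target-type: ip           # route directly to Pod IPs
spec:
  ingressClassName: alb          # handled by the AWS Load Balancer Controller
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-svc    # hypothetical Service
                port:
                  number: 80
```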

Namespace

A Namespace is used to organize and isolate resources within a K8s cluster. Namespaces allow you to group related resources together, making it easier to manage large clusters and divide resources across different teams or environments (such as development and production).

Namespaces are especially useful in scenarios where multiple teams or environments share the same K8s cluster. For example, if you have separate development, staging, and production environments, you can create a namespace for each to ensure that resources like Pods, Services, and ConfigMaps are isolated from one another, reducing the risk of accidental conflicts or misconfigurations. Additionally, when managing multiple applications, each application can be placed in its own namespace, as I've experienced personally, to simplify resource management, monitoring, and access control. This approach also allows administrators to apply specific quotas, policies, or security settings at the namespace level, ensuring each application has the right amount of resources and permissions without interfering with others in the same cluster.
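Creating a namespace and attaching a quota to it is a small amount of YAML; the names and limits below are purely illustrative:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging                  # hypothetical environment namespace
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: staging-quota            # hypothetical name
  namespace: staging
spec:
  hard:
    requests.cpu: "4"            # total CPU requested across the namespace
    requests.memory: 8Gi         # total memory requested across the namespace
    pods: "20"                   # maximum number of Pods in the namespace
```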

These K8s objects work together to build a robust, self-healing, and scalable infrastructure. Pods run your containerized workloads, Services ensure consistent access to those workloads, and Deployments and StatefulSets manage the lifecycle and scaling. ConfigMaps, Secrets, PersistentVolumes, and Ingress handle configuration, storage, and traffic management, while Namespaces help organize and isolate resources in large-scale environments. Together, they form the backbone of modern cloud-native applications in K8s.

Sure, this is all great—a bunch of concepts pertaining to K8s—but it still remains very esoteric. So let's look at an architecture diagram to understand the interplay and synergies between the control plane, node components, and objects, and how they all work together to run a simple app in K8s. The architecture diagram will focus on running K8s in EKS, and any AWS-specific concepts and tooling will be elaborated on. However, before we dive into EKS, why use a cloud-managed service like EKS instead of running "vanilla" K8s on your own, either locally or in a cloud VM like EC2?

K8s is a complex technology that isn't easy to set up and maintain. As discussed above, there are many components to manage (e.g., the control plane, worker nodes, objects, etc.), along with numerous add-ons and plugins (e.g., Cilium, CoreDNS, etc.). For many developers, it's just easier to rely on a cloud provider's managed service to handle the "heavy lifting." By "heavy lifting," I mean the work required to maintain a highly available, secure, and scalable environment for running K8s; this is exactly what AWS EKS provides. For example, there's an open-source CLI tool called eksctl (recently taken over by AWS), which makes life much easier by allowing you to create a cluster from the command line. Check out the documentation here.

using eksctl to build an EKS cluster

Image credit: Cumulus Cycles on YouTube - https://www.youtube.com/watch?v=QiE6YpA5jk4

kubectl and eksctl do not serve the same function when working with K8s and AWS EKS. kubectl is the command-line tool used to interact with any K8s cluster, allowing you to manage resources like Pods, Deployments, and Services, perform CRUD operations, view logs, and execute commands within containers. It works across all K8s environments, regardless of where the cluster is running. On the other hand, eksctl is a specialized tool for creating and managing AWS EKS clusters. It simplifies tasks such as provisioning EKS control planes, node groups, and networking components with just a few commands. While kubectl is used to manage applications and resources inside an existing K8s cluster, eksctl is focused on the infrastructure, automating the setup and maintenance of EKS clusters. Both tools are often used together when working with AWS EKS—eksctl for cluster creation and kubectl for day-to-day management of resources within the cluster.
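For a flavor of what eksctl looks like in practice, here's a minimal sketch of a ClusterConfig file; the cluster name, region, and instance sizes are assumptions, not recommendations:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster             # hypothetical cluster name
  region: us-east-1              # hypothetical region
managedNodeGroups:               # EKS Managed Node Group of EC2 worker nodes
  - name: general
    instanceType: t3.medium      # illustrative instance type
    desiredCapacity: 3
    minSize: 3
    maxSize: 8
```

You would apply it with eksctl create cluster -f cluster.yaml, and eksctl takes care of provisioning the VPC, control plane, and node group.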

Returning to EKS, it also provisions and maintains the control plane for you, in addition to offering something called "Managed Node Groups," which make it easier to set up groups of EC2 instances that serve as your worker nodes. In the diagram below, I'll discuss some of these features.

diagram showing how EKS works

What you see is an incredibly rudimentary app running in K8s. You can have apps that rely on storage, for which you can leverage PersistentVolumes and PersistentVolumeClaims, and you can expose your app to the public internet using Ingress. The potential for K8s truly knows no bounds, and the ecosystem continues to grow with third-party offerings that can make your clusters even more efficient. For example, a few months ago, I was introduced to something called "cast.ai," which uses AI to significantly improve cost efficiency for EKS clusters. Based on your cluster setup (I saw a combination of EC2 Spot and Reserved Instances), it automatically selects the most cost-efficient EC2 pricing model and instance type (e.g., t2.micro, t2.large, t2.nano) available in your allocation, then provisions new nodes sized to the memory and compute requirements of your Pods and Deployments. This all happens on the fly, and it was magical to witness, as it even scales down based on the cluster metrics it continuously observes.

While I'm not currently using K8s for TLDRLW, it's something I might explore adopting once we need to scale beyond the capabilities of AWS ECS (Elastic Container Service). For now, ECS is sufficiently meeting my workload requirements. ECS is easier to provision and maintain, but at some point, the extra effort involved in spinning up an EKS cluster could pay dividends.