Service Mesh - A new era for Microservices

This article explores the history of the service mesh and its architecture, and briefly examines its challenges.

Before examining what a service mesh is and how it operates, I would like to take a step back and provide some context.

Early Ages!

Traditional architectures such as monoliths used to be widely adopted. They take less effort to build and deploy, yet they struggle to respond quickly to business needs in terms of elasticity and maintainability.

Monoliths are typically a single package with a range of different features and components, all tightly coupled in one solution and dependent on one another. A change to one component is likely to impact the others.

Era of Microservices

More recently, architects have largely moved away from monolithic designs in favour of microservices, which bring agility by speeding up changes and lowering the risk of those changes. Microservices are now widely seen as the next step in the evolution of software development.

Under the microservices approach, an application is decomposed into a set of separate, distributed and independent modules, each offering a distinct piece of functionality. The main benefits are faster delivery, isolation, flexibility and culture.

Microservices — Challenges

At first glance, microservices look like the solution, but they demand considerable effort for management, maintenance and observability, as a growing number of sophisticated services run side by side. In addition, there is a continuous requirement to ensure secure and reliable service-to-service communication.

Developers need a suitable solution to observe, orchestrate and secure these distributed services efficiently.

Microservices — Approaching challenges

Several workarounds come to mind to address these challenges, such as:

  • Building reliability and security into the microservices themselves,
  • Relying on orchestration tools, such as Kubernetes, to improve reliability and observability.

These two approaches are either non-repeatable across services or labour-intensive, making them weak and unsustainable (for example, manually building encryption and access control into the source code of every service).

Is there a way to push these complexities outside of the microservices themselves, keeping them clean with a primary focus on business functions?

Service meshes were introduced on the promise that they make day-2 operations and the security of microservices simpler to manage.

Service Mesh — In a nutshell

First, a service mesh is a concept applied to microservices. From a theoretical perspective, the term describes the network of microservices and the interactions between them. In practice, it is not so much a “mesh of services” as a mesh of sidecar proxies attached to services.

A broader definition exists today:

A service mesh complements Kubernetes, handling things that are outside of Kubernetes’ scope and solving challenges around security, observability and networking. With a service mesh, encryption and granular access control rules can be put in place in a way that is centrally controlled and monitored, with minimal impact on the applications themselves.
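
To make the idea of centrally defined access control a little more concrete, here is a minimal Python sketch. It is not how any particular mesh implements policy: the `AllowRule` type, the `is_allowed` function and the service names are hypothetical, and the caller identity is a plain string, whereas a real mesh would typically derive it from the mTLS certificate.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class AllowRule:
    """Hypothetical rule: which caller may use which methods on a service."""

    source: str
    destination: str
    methods: FrozenSet[str]


def is_allowed(rules: Iterable[AllowRule], source: str, destination: str, method: str) -> bool:
    """Deny by default; allow only when at least one rule matches the request."""
    return any(
        rule.source == source
        and rule.destination == destination
        and method in rule.methods
        for rule in rules
    )


# Illustrative policy, defined once and outside of any service's source code.
RULES = [
    AllowRule(source="frontend", destination="orders", methods=frozenset({"GET", "POST"})),
    AllowRule(source="orders", destination="payments", methods=frozenset({"POST"})),
]

if __name__ == "__main__":
    print(is_allowed(RULES, "frontend", "orders", "GET"))
    print(is_allowed(RULES, "frontend", "payments", "POST"))
```

The point of the sketch is that the rules live in one place, so they can be reviewed and changed without touching the services they protect.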

Service Mesh — Architecture overview

A service mesh is logically split into two planes, the control plane and the data plane, detailed below:

Figure: Service Mesh logical planes

Data Plane:

The data plane is an interconnected set of intelligent proxies deployed as sidecars, each running in the same Kubernetes pod as the service it fronts.

These sidecar proxies intercept all inbound and outbound traffic throughout the mesh, allowing the service mesh to control traffic without the microservice’s application logic being aware of it.

The data plane takes care of functions such as service discovery, load balancing, traffic management (shaping and routing), health checks, metrics and telemetry.
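
As a rough illustration of those responsibilities, the following Python sketch models a sidecar proxy as a small in-process object. It is a toy under stated assumptions, not how Envoy or any real proxy works: the `SidecarProxy` class, its fields and the endpoint addresses are all hypothetical, and the "network call" is just a function passed in by the caller.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Callable, Dict, Iterator, List, Set


@dataclass
class SidecarProxy:
    """Toy stand-in for a sidecar proxy (every name here is hypothetical)."""

    # Service discovery table: service name -> known endpoints.
    registry: Dict[str, List[str]]
    # Endpoints that failed their last health check.
    unhealthy: Set[str] = field(default_factory=set)
    # Telemetry: number of requests routed per destination service.
    request_count: Dict[str, int] = field(default_factory=dict)
    # One round-robin cursor per service.
    _cursors: Dict[str, Iterator[str]] = field(default_factory=dict)

    def pick_endpoint(self, service: str) -> str:
        """Round-robin load balancing that skips unhealthy endpoints."""
        endpoints = self.registry[service]
        cursor = self._cursors.setdefault(service, cycle(endpoints))
        for _ in range(len(endpoints)):
            candidate = next(cursor)
            if candidate not in self.unhealthy:
                return candidate
        raise RuntimeError(f"no healthy endpoint for {service}")

    def call(self, service: str, send: Callable[[str], str]) -> str:
        """Route one outbound request and record telemetry for it."""
        endpoint = self.pick_endpoint(service)
        self.request_count[service] = self.request_count.get(service, 0) + 1
        return send(endpoint)


if __name__ == "__main__":
    proxy = SidecarProxy(registry={"orders": ["10.0.0.1:8080", "10.0.0.2:8080"]})
    proxy.unhealthy.add("10.0.0.2:8080")
    print(proxy.call("orders", lambda endpoint: f"sent to {endpoint}"))
```

Note that the calling code only names the logical service (`"orders"`); discovery, balancing, health filtering and counting all happen inside the proxy, which is the separation the data plane is meant to provide.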

Control Plane:

At a glance, the control plane provides a set of tools for centrally controlling the behaviour of the data plane (the proxies) and collecting metrics. It manages and configures the sidecar proxies to route traffic, enforce policies and collect telemetry, and it holds the configuration and policies that keep the service mesh running as expected.
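
The relationship between the two planes can be sketched in a few lines of Python. This is only a mental model, assuming a simple push-based design: the `ControlPlane` and `Proxy` classes and the setting names (`timeout_seconds`, `retries`) are hypothetical and do not correspond to any real mesh API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proxy:
    """Hypothetical data-plane proxy that simply stores the config it receives."""

    name: str
    config: Dict[str, object] = field(default_factory=dict)
    version: int = 0


@dataclass
class ControlPlane:
    """Hypothetical control plane: one source of truth, pushed to every proxy."""

    proxies: List[Proxy] = field(default_factory=list)
    desired: Dict[str, object] = field(default_factory=dict)
    version: int = 0

    def register(self, proxy: Proxy) -> None:
        """A newly registered proxy immediately receives the current config."""
        self.proxies.append(proxy)
        self._push(proxy)

    def apply(self, **settings: object) -> None:
        """Update the desired state, then push it to the whole fleet."""
        self.desired.update(settings)
        self.version += 1
        for proxy in self.proxies:
            self._push(proxy)

    def _push(self, proxy: Proxy) -> None:
        # Copy the settings so a proxy cannot mutate the source of truth.
        proxy.config = dict(self.desired)
        proxy.version = self.version


if __name__ == "__main__":
    control_plane = ControlPlane()
    orders_proxy = Proxy("orders-sidecar")
    control_plane.register(orders_proxy)
    control_plane.apply(timeout_seconds=2, retries=3)
    print(orders_proxy.config, orders_proxy.version)
```

The services themselves never appear in this sketch, which is the point: configuration flows from the control plane to the proxies, not through application code.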

Service Mesh —Adoption

Security and observability are probably the most common reasons leading organizations to implement a service mesh, alongside reliability and management:

  • Observability: A service mesh provides Layer 7 visibility, tracing requests between applications and making microservices easier to troubleshoot.
  • Security: A service mesh makes it possible to encrypt east-west traffic, that is, traffic between services inside a cluster, through transparent mTLS.
  • Reliability: Built-in features such as load balancing, retries and timeouts help ensure that applications continue to perform well.
  • Management: Central management is one of the critical features required by customers.

A service mesh also helps govern load balancing and routing policies. One central team can push changes to a fleet of running microservices through the sidecar proxies, without losing visibility or touching application code.
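
The retries and timeouts mentioned under reliability can be illustrated with a short Python sketch. It assumes a simplified policy: a maximum number of attempts plus an overall deadline that is only checked between attempts, whereas a real proxy can also cancel a request that is still in flight. The function name `call_with_retries` and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], T],
    max_attempts: int = 3,
    deadline_seconds: float = 1.0,
    backoff_seconds: float = 0.0,
) -> T:
    """Retry a failing call until it succeeds, attempts run out, or the deadline passes."""
    started = time.monotonic()
    attempts = 0
    last_error: Optional[Exception] = None
    while attempts < max_attempts and time.monotonic() - started < deadline_seconds:
        attempts += 1
        try:
            return request()
        except ConnectionError as error:
            # Only connection failures are retried; other errors propagate.
            last_error = error
            time.sleep(backoff_seconds * attempts)
    raise TimeoutError(f"gave up after {attempts} attempt(s)") from last_error
```

When this logic lives in the sidecar, a policy such as "three attempts within one second" can be changed for a whole fleet without rebuilding any service.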

Service Mesh —Variants

Many service mesh technologies started as open-source initiatives. The oldest service meshes that exist today, Linkerd and Conduit, were created by former Twitter engineers. Istio, the most popular, and Red Hat Service Mesh share many core features, and both offer a multi-cluster configuration.

The landscape includes, but is not limited to, the list below:

  • Kuma [by Kong]: An open-source service mesh designed to increase the adoption of microservices.
  • Linkerd [by Buoyant]: A CNCF project licensed under Apache v2. It provides both a control plane and a data plane.
  • Envoy [by Lyft]: A high-performance proxy that, deployed as a sidecar, forms the data plane.
  • Istio, backed by technology powerhouses such as Google, Lyft and IBM, provides only the control plane and relies on Envoy’s high-performance sidecars for the data plane.
  • Consul [by HashiCorp]: Provides both a control plane and a data plane.

Service Mesh Interface

In May 2019, Microsoft, Red Hat and others launched an initiative to create a new standard called Service Mesh Interface, or SMI for short. SMI provides a standardized API, freeing developers to use service mesh capabilities without being tied to a particular implementation.

Its initial specification is based on three main pillars:

  • Traffic Policies: Applying policies such as identity and transport encryption across services,
  • Traffic Telemetry: Capturing metrics such as error rates and latencies,
  • Traffic Management: Shifting and weighting traffic between services.

Figure: Service Mesh Interface (SMI)
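
The traffic management pillar, shifting and weighting traffic between services, boils down to a weighted choice between backends. The Python sketch below shows only that idea. It does not use the SMI API: the `choose_backend` function and the backend names `v1` and `v2` are hypothetical.

```python
import random
from typing import Dict, Optional


def choose_backend(weights: Dict[str, int], rng: Optional[random.Random] = None) -> str:
    """Pick a backend with probability proportional to its weight."""
    rng = rng or random.Random()
    # Backends with a weight of zero receive no traffic at all.
    backends = [name for name, weight in weights.items() if weight > 0]
    if not backends:
        raise ValueError("at least one backend needs a positive weight")
    return rng.choices(backends, weights=[weights[name] for name in backends], k=1)[0]


if __name__ == "__main__":
    # Canary-style split: roughly 90% of requests to v1 and 10% to v2.
    split = {"v1": 90, "v2": 10}
    picks = [choose_backend(split) for _ in range(1000)]
    print({name: picks.count(name) for name in split})
```

Gradually moving the weights from `v1` to `v2` is what allows a new version to be rolled out, or rolled back, without redeploying the callers.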

To finish

Service meshes are already live in production environments, but they are not as widely used as one would expect. This is likely due in part to the additional layers of complexity and the performance impact they introduce.

With the Service Mesh Interface in mind, I trust that all service mesh solutions will converge on a common industry standard, analogous to what happened to container formats and runtimes with the OCI (Open Container Initiative).

Thank you for reading!

Is a DevOps & Cloud enthusiast with 10 plus years of experience. He’s continuously immersing himself in the latest technologies trends & projects.