Deep Dive into the AWS Nitro System

January 7, 2020

By John Valentine, Kovarus Cloud Practice Manager

For me, one of the more exciting things to learn about at re:Invent this year was the Nitro system. It has been out for a little while, but it is what allows AWS to drive much of the innovation we are seeing, because it moves a lot of the computing burden off the host servers and onto specialized hardware modules. These modules, combined with a new Nitro hypervisor that AWS created, make up the full Nitro system. Let’s take a deeper look at what Nitro is, how it works and the benefits we see from it.

Nitro was first launched in 2017 and was featured only on the C5 instance type. Now, in December of 2019, all new instance types are built on Nitro. Nitro is a purpose-built platform for AWS and is made up of a specialized Nitro hypervisor, several Nitro cards (for VPC, EBS and instance storage, plus a controller card) and the Nitro Security Chip. Each card is physically different from the others: the VPC card looks like a Network Interface Card (NIC), for example, and the EBS card looks like a storage IO card. Let’s take a deeper look at how the cards work and what they do.

Nitro cards for VPC — This looks like a NIC, both physically and to the OS. The OS sees a network adapter, called the Elastic Network Adapter (ENA), and loads a driver for it that provides network connectivity. The same ENA driver works across multiple generations of network cards and does not change between instance types, which makes it much more flexible. The card also improves the VPC data plane: work such as Security Group enforcement is offloaded to the Nitro card for VPC, which places it as close as possible to the instance. Lastly, the use of limiters helps prevent issues like network fragmentation and enforces standard performance. Ultimately, the goal is to get the same predictable performance across regions.
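To see this from inside a Linux instance, a minimal sketch like the one below can report which kernel driver is bound to a network interface. It assumes the standard Linux sysfs layout (`/sys/class/net/<interface>/device/driver`) and that the ENA driver registers under the name `ena`; the function names and the `sysfs_root` parameter are my own, added so the check can be pointed at a test directory.

```python
import os
from typing import Optional


def interface_driver(ifname: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the name of the kernel driver bound to a network interface.

    Reads the sysfs 'driver' symlink for the interface. Returns None if the
    interface does not exist or has no backing device (e.g. loopback).
    """
    link = os.path.join(sysfs_root, "class", "net", ifname, "device", "driver")
    try:
        target = os.readlink(link)
    except OSError:
        return None
    # The symlink points at .../bus/pci/drivers/<driver-name>.
    return os.path.basename(target)


def uses_ena(ifname: str, sysfs_root: str = "/sys") -> bool:
    """True if the interface is driven by the ENA driver (assumed name: 'ena')."""
    return interface_driver(ifname, sysfs_root) == "ena"
```

On a Nitro-based instance, something like `uses_ena("eth0")` would be expected to return `True`, although the interface name varies by distribution (for example `ens5`).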

Nitro cards for EBS — Again, this card looks physically different from the others (it looks like a storage card), and if you put it in a server it would show up as a PCIe device with an NVMe storage class. Like the VPC card, the Nitro card for EBS improves the EBS data plane: all packet processing is offloaded to the card, and the software on the card takes the packets and ships them across the network to the network-attached EBS storage. Lastly, encryption is offloaded to the card, so there is no performance tradeoff for enabling encryption.

Nitro card for Instance Storage — This card also shows up as an NVMe storage device when placed into a server. Instance storage is commonly used for transient data, for applications that have their own durability solution, or for database replication. Wear leveling is built into the card by default, and AWS monitors wear levels and proactively informs customers when a device is wearing out. This helps prevent data loss in the event of a card failure.
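Since both EBS volumes and instance storage appear to the OS as NVMe devices, it can be useful to tell them apart. The sketch below reads the controller model string that Linux exposes under `/sys/class/nvme/<controller>/model`. The two model strings are assumptions based on what these devices commonly report; verify them on your own instance before relying on them. The function names and the `sysfs_root` parameter are hypothetical.

```python
import glob
import os
from typing import Dict

# Assumed model strings; check the output of `cat /sys/class/nvme/nvme*/model`.
EBS_MODEL = "Amazon Elastic Block Store"
INSTANCE_STORE_MODEL = "Amazon EC2 NVMe Instance Storage"


def classify_model(model: str) -> str:
    """Map an NVMe model string to 'ebs', 'instance-store' or 'other'."""
    model = model.strip()  # sysfs pads the model string with spaces
    if model == EBS_MODEL:
        return "ebs"
    if model == INSTANCE_STORE_MODEL:
        return "instance-store"
    return "other"


def nvme_controllers(sysfs_root: str = "/sys") -> Dict[str, str]:
    """Return {controller name: classification} for every NVMe controller."""
    pattern = os.path.join(sysfs_root, "class", "nvme", "nvme*", "model")
    result = {}
    for path in sorted(glob.glob(pattern)):
        name = os.path.basename(os.path.dirname(path))
        with open(path) as fh:
            result[name] = classify_model(fh.read())
    return result
```

Calling `nvme_controllers()` on an instance with one EBS root volume and one local disk would be expected to return one entry of each kind, assuming the model strings above match.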

Nitro card controller — When launching an instance, the EC2 control plane talks to the Nitro card controller, which in turn coordinates with all the other Nitro cards, the hypervisor and the security chip for fast and seamless instance provisioning.

Nitro Security Chip — The Security Chip’s purpose in life is to prevent an EC2 or bare metal instance from making any changes to the on-board firmware devices or the BIOS. Rather than allowing access and putting an approval process around it, Nitro blocks all access. Traditionally, something like Secure Boot was used for this, ensuring that every step of the boot process is done correctly; unfortunately, that approach can be complicated and hard to update. Blocking access is especially important with bare metal instances, where a customer could otherwise run tools that update flash on the physical motherboard. That could contaminate the BIOS, and the physical hardware would have to be sanitized before the next customer could use the server. Additionally, the Nitro Security Chip implements something called the Nitro Root of Trust, which sits in front of the physical bus and blocks access to physical stores. Nothing is written to non-volatile stores any longer, and once the customer is done with the server, Nitro ensures every image is as it should be.

Nitro Hypervisor — This is a stripped-down version of KVM that runs only 5–10 user space processes, which do the minimum amount of work needed. Beyond that, it doesn’t really do anything else. This type of hypervisor is called a quiescent hypervisor, and it overcomes a drawback of traditional hypervisors, which reserve a small subset of memory and cores for back-end processes. With the Nitro Hypervisor, when you launch an instance you get the full machine resources.
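If you want to check which hypervisor a given instance type runs on, the EC2 `DescribeInstanceTypes` API reports a `Hypervisor` field (`nitro` or `xen`) and a `BareMetal` flag. The sketch below is one way to use it with boto3, assuming boto3 is installed and AWS credentials are configured; the helper names and the default region are my own choices, not part of any AWS tooling.

```python
from typing import Any, Dict


def summarize_instance_type(info: Dict[str, Any]) -> str:
    """Summarize one entry from a DescribeInstanceTypes response."""
    name = info.get("InstanceType", "unknown")
    if info.get("BareMetal"):
        # Bare metal instance types have no hypervisor at all.
        return f"{name}: bare metal (no hypervisor)"
    hypervisor = info.get("Hypervisor", "unknown")
    return f"{name}: virtualized on {hypervisor}"


def fetch_instance_type_info(instance_type: str, region: str = "us-east-1") -> Dict[str, Any]:
    """Fetch one instance type's details (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the summary helper works without it

    client = boto3.client("ec2", region_name=region)
    response = client.describe_instance_types(InstanceTypes=[instance_type])
    return response["InstanceTypes"][0]
```

For example, `summarize_instance_type(fetch_instance_type_info("c5.large"))` should describe a Nitro-based virtualized instance, while a C4 type should report `xen`.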

That’s an overview of the Nitro system. Let’s quickly look at some of the benefits AWS is seeing now that they’ve built this out.

  • Performance — Various virtualized instance types now closely mirror the latency of bare metal instances. In a test run by AWS to measure hypervisor jitter (wakeup delay), they compared the latency of C5, i3.metal and C4 instances. In this test, the C4 instance (running the older Xen hypervisor) showed a latency of about 100 microseconds. This is because, on the older platform, the BIOS has a system management mode that steals CPU cycles from the OS to run BIOS-level processes, which has now been optimized. The C5 and i3.metal instances both showed latencies of ~15 microseconds, meaning that virtual and bare metal instances are within a few microseconds of each other.
  • Innovation — AWS has been able to focus on new instance types that are AMD-based, as well as on creating the Graviton processor. Additionally, AWS released the Elastic Fabric Adapter (EFA), which allows for linear scaling of machine learning and HPC workloads. A few other areas Nitro allows AWS to focus on are offering more bare metal instance types, improving network performance on certain instances to 100 Gbps, creating a custom HPC software stack, releasing accelerator instances for caching workloads and, lastly, creating incredibly dense storage instances (up to 60 TB).
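To get a rough feel for the wakeup-delay measurement mentioned under Performance, the sketch below asks the OS to sleep for a short interval and records how much later than requested the process actually wakes up. This is not the test AWS ran: the numbers also include OS scheduler and Python interpreter overhead, so treat them as a relative comparison between instances rather than a figure comparable to the ones quoted above. All names and default values are assumptions for illustration.

```python
import time
from typing import List


def wakeup_delays_us(samples: int = 200, sleep_us: int = 500) -> List[float]:
    """Return the overshoot, in microseconds, of `samples` short sleeps."""
    delays = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        time.sleep(sleep_us / 1_000_000)
        elapsed_ns = time.perf_counter_ns() - start
        # Overshoot = time actually elapsed minus time requested.
        delays.append(elapsed_ns / 1000 - sleep_us)
    return delays


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank style percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(round(pct / 100 * (len(ordered) - 1))))
    return ordered[index]


def report(samples: int = 200) -> str:
    """Format the median and tail wakeup delay for a quick comparison."""
    delays = wakeup_delays_us(samples)
    return (f"p50={percentile(delays, 50):.1f}us "
            f"p99={percentile(delays, 99):.1f}us "
            f"max={max(delays):.1f}us")
```

Running `print(report())` on two instance types with the same OS image and comparing the p99 and max values gives a crude view of how much jitter each platform adds.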

Nitro is one of the many game-changing solutions AWS created that is truly accelerating the services they release. It’s a simple, yet effective solution and shows that sometimes, keeping software and hardware simple is the best way to improve usability, performance and most importantly, customer experience.
