Since 2014, Amazon Cloud Technology has offered serverless computing through its Lambda service. With Lambda, users do not have to manage servers or adjust capacity to meet changing demand; capacity is provisioned automatically, and users pay only for the resources they actually consume.
Datadog's 2021 State of Serverless report shows that developers are accelerating their adoption of serverless architectures to tackle new, more advanced business challenges. "We are pleased to see more and more organizations benefiting from the agility, resiliency, and cost efficiency of serverless technologies such as Lambda, and to support this growing and diverse community of developers," said Ajay Nair, General Manager of the Lambda Service Experience at Amazon Cloud Technology.
Behind this powerful serverless service lies a foundation of continuous innovation. Today we will look at how Amazon Cloud Technology keeps innovating on the serverless base.
Serverless is a cloud computing architecture in which users do not purchase or manage a fixed number of servers. Instead, the cloud provider dynamically manages server resources according to the characteristics of the user's workload, providing elastic scheduling, automatic scaling, fault self-healing, and other capabilities. Users are not charged for servers; they pay only for the resources actually consumed.
Serverless in the broad sense usually consists of FaaS (Function as a Service) and BaaS (Backend as a Service). Users do not need to care about infrastructure management and maintenance; they only need to care about business code. They can run their own server-side business logic in a FaaS environment and consume the various BaaS services offered by the cloud, without worrying about operations or server scaling.
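The FaaS model can be illustrated with a minimal sketch: the only code the user owns is the business logic in the handler below, while provisioning, scaling, and patching of the underlying servers are handled by the platform. (The event shape here is a hypothetical example, not tied to any specific trigger.)

```python
# A minimal AWS Lambda handler: the only code the user writes is this
# business logic; server management and scaling are the platform's job.
def lambda_handler(event, context):
    # 'event' carries the invocation payload, e.g. from API Gateway or S3.
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```

The same handler runs unchanged whether the function serves one request per day or thousands per second; the platform scales the execution environments behind it.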
Therefore, beyond the diversity of BaaS offerings, the entire serverless base essentially depends on the performance and security of the serverless computing service (FaaS). In other words, the base innovation we explore here is that of the Lambda service. What do the performance and security of FaaS depend on? Put simply, they depend heavily on the underlying chips and on virtualization. So today's exploration of continuous innovation in the serverless base focuses on the chips and virtualization behind FaaS.
Chip innovation for FaaS
Graviton is an ARM-based server processor released by Amazon Cloud Technology in 2018 for users of EC2 virtual machine instances. The first-generation Graviton processor features custom silicon with 64-bit Arm cores, and EC2 A1 instances support ARM-based applications such as web servers, caching fleets, distributed big data processing, and containerized microservices. Building on the open ARM architecture saves the cost of designing a new chip from scratch: the existing ARM architecture is customized for how EC2 instances work, giving users more choice in instance selection and providing high availability and security for ARM-based applications, while reducing virtualization overhead and offering good server performance at a lower price.
Compared to the first generation, the Graviton2 processor delivers a major leap in performance and functionality. Graviton2-based instances provide the best price/performance for workloads on EC2. They support a wide range of general-purpose, burstable, compute-optimized, memory-optimized, storage-optimized, and accelerated-computing workloads, including application servers, microservices, high-performance computing, machine learning inference, video encoding, EDA, gaming, databases, and in-memory caching. Many serverless services, including Lambda and Fargate, also run on Graviton2 to provide a cost-effective, fully managed experience with better performance at lower cost.
The Graviton3 processor is the latest member of the Graviton family. Compared to Graviton2, it delivers 25% better compute performance, 2x faster floating-point performance, and 2x faster performance for cryptographic workloads. For ML workloads it delivers up to 3x better performance than Graviton2, including support for bfloat16. It also supports DDR5 memory, which provides 50% more memory bandwidth than DDR4, and it will be supported by serverless services such as Lambda in the future.
Lambda currently supports running on ARM-based Graviton2 processors, an architecture option that delivers up to 34% better price/performance. Julian Wood's article, Migrating AWS Lambda Functions to Arm-based AWS Graviton2 processors, explains how to migrate from x86 to ARM64 and what to consider during migration. The performance and security built into the Graviton2 processor can deliver up to 19% better performance for compute-intensive workloads, and workloads that use multithreading and multiprocessing or perform I/O operations can see shorter invocation times, reducing cost; duration charges, billed in milliseconds, are 20% lower than x86 pricing. Moving from x86 to ARM does not change how Lambda functions are invoked: integrations with APIs, services, applications, and tools continue to work as before.
Many functions can be migrated seamlessly with a configuration change; others need to be rebuilt to use arm64 packages. We can also use the AWS Lambda Power Tuning tool to test the performance of migrated Lambda functions.
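After switching a function's architecture setting, a quick sanity check is to have the function report the architecture it actually runs on. A minimal sketch (the handler name is illustrative):

```python
import platform

def lambda_handler(event, context):
    # platform.machine() returns "x86_64" on x86 Lambda and "aarch64"
    # on Graviton2 (arm64) -- a simple post-migration sanity check.
    return {"architecture": platform.machine()}
```

Invoking the function after migration should return `aarch64`; if it still returns `x86_64`, the architecture change has not taken effect.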
Virtualization innovation for FaaS
When we first built the Lambda service, we had two paths:
- Containers: fast and resource-efficient, but without strong isolation between users
- Virtual machines: strong security, at the cost of compute overhead
Security has always been a top priority at AWS, so we initially built Lambda on traditional VMs.
But users asked us to scale faster, with lower latency, and asked for advanced features like provisioned concurrency. We knew these requirements could not be met on traditional VMs.
As serverless technology was adopted more widely, we recognized that existing virtualization technology was not optimized for ephemeral workloads. We believed we needed virtualization built specifically for serverless computing: something that provides the security boundaries of hardware-based virtualization while retaining the lightweight agility of containers and functions.
That is why we built Firecracker and open-sourced the virtualization platform in November 2018. Firecracker is an open source virtual machine monitor (VMM) that uses the Linux Kernel-based Virtual Machine (KVM). Firecracker makes it possible to create micro virtual machines, known as microVMs. It follows a minimalist design principle, containing only the components needed to run secure, lightweight virtual machines. Firecracker is licensed under Apache 2.0; visit the Firecracker GitHub repository to learn more and contribute.
Firecracker satisfies both requirements at once:
- the security of hardware-virtualization-based virtual machines
- the resource efficiency and fast startup times of containers
At the 2020 USENIX Symposium on Networked Systems Design and Implementation (NSDI '20), we published a paper that explains in detail how Firecracker works.
Containers (left) give code direct access to some of the operating system's core functionality (the "kernel") and enhance security by denying access to the rest (the "x" marks in the sandbox layer); virtual machines (right) give workloads their own guest kernel and are isolated using hardware virtualization features.
At the core of virtualization is the virtual machine monitor (VMM), which sets up virtualization, manages memory, and handles I/O such as networking and disk storage. A traditional VMM is almost as complex as a full operating system: QEMU, for example, a VMM often used together with KVM, has more than 1.4 million lines of code (with correspondingly extensive and powerful features).
Firecracker is far more efficient than QEMU because of its streamlined VMM: only about 50,000 lines of code, 96 percent less than QEMU. This makes it practical to create a dedicated microVM for every user program, a simple but powerful security model. Those 50,000 lines are written in Rust; in the fall of 2017 we decided to write Firecracker in Rust, a modern programming language that guarantees thread and memory safety and prevents buffer overflows and many other memory safety issues that can lead to security vulnerabilities. Firecracker also exposes a built-in REST control API to launch instances, get or set VM configuration, manage snapshots, and more. A single server can create up to 150 Firecracker microVMs per second and run thousands of microVMs simultaneously.
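The REST control API is served over a Unix domain socket. As a rough sketch, booting a microVM takes three PUT calls: configure the boot source, attach a root drive, then trigger `InstanceStart`. The socket, kernel, and rootfs paths below are assumptions for illustration; a real setup needs a running Firecracker process plus a kernel image and root filesystem.

```python
import http.client
import json
import socket

class UnixSocketConnection(http.client.HTTPConnection):
    """HTTP over the Unix domain socket that Firecracker's API listens on."""
    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)

def boot_request_bodies(kernel_path, rootfs_path):
    """Build the (endpoint, JSON body) pairs for the three PUT calls
    that configure and start a microVM."""
    return [
        ("/boot-source", {"kernel_image_path": kernel_path,
                          "boot_args": "console=ttyS0 reboot=k panic=1"}),
        ("/drives/rootfs", {"drive_id": "rootfs",
                            "path_on_host": rootfs_path,
                            "is_root_device": True,
                            "is_read_only": False}),
        ("/actions", {"action_type": "InstanceStart"}),
    ]

def start_microvm(socket_path, kernel_path, rootfs_path):
    """Issue the boot sequence against a running Firecracker process."""
    for path, body in boot_request_bodies(kernel_path, rootfs_path):
        conn = UnixSocketConnection(socket_path)
        conn.request("PUT", path, body=json.dumps(body),
                     headers={"Content-Type": "application/json"})
        conn.getresponse().read()
        conn.close()
```

With Firecracker started as, say, `firecracker --api-sock /tmp/firecracker.socket`, calling `start_microvm("/tmp/firecracker.socket", ...)` would boot the guest; each additional user program gets its own microVM behind its own socket.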
Of course, drastically reducing the size of the VMM also drastically reduces its functionality. Instead of emulating legacy devices such as a BIOS or a PCI bus, Firecracker communicates with the guest kernel through an optimized virtio interface. Importantly, serverless workloads do not need hardware such as USB devices, monitors, speakers, or microphones, so there is no need to implement support for them at all.
Firecracker powers the Lambda service; as of 2020, it was processing trillions of requests per month for hundreds of thousands of users. A stable 1.0 version has now been released. A Firecracker microVM can currently be created in under 7 milliseconds and boots fully in under 125 milliseconds, with a memory footprint of less than 5 MB per microVM.
Firecracker is used by multiple container hosting platforms such as appfleet, Fly.io, and Koyeb, and it can be managed with container runtimes, including firecracker-containerd, Weave Ignite, and Kata Containers, which enables integration with Kubernetes.
This article has shared Amazon Cloud Technology's ongoing innovation on the serverless base. We will continue to invest heavily in the three layers of serverless computing: the application layer, the virtualization layer, and the hardware layer, providing users with superior computing power while ensuring security, scalability, and high performance. Heavy investment in foundational technology is one of the keys to continuous innovation, not only for tomorrow but for the next decade and beyond. We also work closely with the community and share this innovation with it. By open-sourcing Firecracker, we invite you not only to dig into the foundational technology of serverless computing but also to help us enhance and refine Firecracker. For more information, see the Firecracker issue list and the Firecracker roadmap.