Highlights From AWS re:Invent 2021 Infrastructure Keynote With Peter DeSantis
by Jason Pavao, Senior Solution Architect, Rackspace Technology
On Wednesday, December 1st, Peter DeSantis, Senior Vice President, Utility Computing and Apps at Amazon Web Services (AWS), took the stage for his 10th re:Invent keynote presentation and opened with a trip down memory lane. Fifteen years ago, AWS was born with S3, followed by SQS and EC2 a few months later. At that point, cloud computing wasn’t in our general vocabulary, but as we all know, that would soon change.
“It’s amazing to see how far we’ve come from those humble beginnings,” said DeSantis as he recalled the days when AWS was a single region and availability zones were not yet even a concept. The only underlying storage medium for EC2 was ephemeral, and there was only one instance type available.
Peter reminded us that the table stakes for all AWS services have always been security, availability, elasticity, performance, cost, and sustainability. With those points, Peter set the stage for his keynote and signaled that AWS intends to remain a leader in innovation.
AWS Nitro SSD
While Nitro is not a new product announcement, it took up a large portion of Peter’s keynote. As he put it, “Nitro is the reason why AWS began developing in-house silicon.” The Nitro controller provides consistency across multiple storage, processor, and networking vendors, giving AWS customers a seamless experience. With the addition of the Nitro SSD, AWS offers EBS io2 Block Express, which provides 260,000 IOPS with consistent sub-millisecond latency.
AWS Graviton 3
AWS is laser-focused on improving performance for real-world customer workloads. While many chip makers seek to amaze with sticker stats like processor frequency and core count, those numbers are not the end goal, and pushing them higher consumes much more power. More power consumption produces more heat, which means lower efficiency.
So how did AWS efficiently increase the performance of a Graviton core? The answer: make the core wider! A wider core can do more work per processing cycle. Instead of increasing the number of cycles per second, AWS increased the amount of work the core can do in each cycle, from five to eight instructions per cycle. This approach is known as instruction-level parallelism. AWS also improved the performance of the Graviton 3 by giving it 50% more memory bandwidth than the previous Graviton 2 processor.
Though still in preview and not yet generally available, the Graviton 3 processor provides 25% better overall performance for most workloads.
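The arithmetic behind a wider core can be sketched in a few lines of Python. This is only an illustration: the clock speed below is a hypothetical value chosen for the example, not a Graviton specification, and the result is a theoretical peak rather than a measured figure.

```python
def peak_instructions_per_second(ipc: float, clock_hz: float) -> float:
    """Theoretical peak throughput: instructions per cycle times cycles per second."""
    return ipc * clock_hz


# Hypothetical clock speed, held constant so that only the core width changes.
CLOCK_HZ = 2.5e9

narrow_ips = peak_instructions_per_second(ipc=5, clock_hz=CLOCK_HZ)
wide_ips = peak_instructions_per_second(ipc=8, clock_hz=CLOCK_HZ)

# Ratio of peak throughput between the wider and the narrower core.
speedup = wide_ips / narrow_ips
print(f"Peak throughput ratio at the same clock: {speedup:.2f}x")
```

Real workloads rarely sustain the peak issue width, because instructions depend on one another and on memory, so the gain seen in practice is smaller than this ratio suggests.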
Machine Learning Improvements With the AWS Trainium and Inferentia Processors
The first thing to note is that machine learning has two distinct components: training and inference. The second is that you’ll need very different infrastructure for each.
Training is where you build your model by iterating through your training data. Think of a model as a math formula with lots of variables. All this math is computed on huge matrices with floating-point numbers. Training uses statistics to find optimal coefficients for all those variables, and those coefficients are called parameters.
Inference is where you use the trained model to make predictions. Inference accounts for the vast majority of cost because, once a model is deployed, you are continually running inference against it.
AWS Trainium and Inferentia are purpose-built processors, designed to deliver high-performance machine learning training and ongoing inference, respectively.
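To make the distinction concrete, here is a minimal sketch in plain Python. It assumes the simplest possible model, a straight line with two parameters, and uses gradient descent as a stand-in for the optimization step. The function names, data, and settings are hypothetical and chosen only for illustration; production models have millions or billions of parameters and run on matrix hardware.

```python
def train(xs, ys, learning_rate=0.05, epochs=2000):
    """Training: iterate over the data to find the parameters w and b."""
    w, b = 0.0, 0.0  # the model is y = w * x + b
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Inference: apply the already-trained parameters to a new input."""
    return w * x + b


# Toy training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train(xs, ys)            # done once, and expensive
prediction = predict(w, b, 10)  # done on every request, and cheap per call
print(f"w={w:.3f}, b={b:.3f}, prediction for x=10: {prediction:.3f}")
```

Training loops over the whole dataset many times, while each inference call is a single cheap evaluation. Inference is repeated for every request, though, which is why it tends to dominate cost over the lifetime of a model.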
Distributed Training Techniques
The simplest way to perform distributed training across multiple training processors is called data parallelism. Each processor has a complete copy of the model in memory. The training data is partitioned, and each processor works through its own subset. Periodically, the processors must exchange information as they converge toward a common solution, so the network between them has to be fast enough to keep that exchange from becoming a bottleneck.
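Data parallelism can be simulated on a single machine to show its shape. The sketch below is an illustration under stated assumptions, not how Trainium or any AWS library implements it: the workers are plain Python lists, the model is a two-parameter line, and the hypothetical `all_reduce_mean` helper stands in for the network exchange between processors.

```python
def local_gradient(w, b, shard):
    """Each worker computes gradients on its own shard with its own model copy."""
    n = len(shard)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b


def all_reduce_mean(gradients):
    """Stand-in for the network exchange: average the gradients of all workers."""
    k = len(gradients)
    return (sum(g[0] for g in gradients) / k,
            sum(g[1] for g in gradients) / k)


def data_parallel_train(data, num_workers=2, learning_rate=0.05, steps=2000):
    # Partition the training data so that each worker sees only a subset.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w, b = 0.0, 0.0  # every worker starts from the same model copy
    for _ in range(steps):
        gradients = [local_gradient(w, b, shard) for shard in shards]
        grad_w, grad_b = all_reduce_mean(gradients)  # the communication step
        # Every worker applies the same update, so the copies stay identical.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data from y = 2x + 1, split evenly across two simulated workers.
data = [(float(x), 2.0 * x + 1.0) for x in range(6)]
dp_w, dp_b = data_parallel_train(data)
print(f"w={dp_w:.3f}, b={dp_b:.3f}")
```

The only step that crosses worker boundaries is the gradient average. In a real cluster that step runs over the network on every iteration, which is why interconnect speed matters so much for distributed training.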
AWS Climate Pledge
AWS pursues efficiency at every level, from data center design to modeling and tracking the performance of its operations, but the most significant gains have come from the design of AWS silicon. For example, Graviton is AWS’s most efficient general-purpose processor, providing 60% better efficiency for most workloads. In addition, Inferentia is AWS’s most efficient inference processor. Unfortunately, AWS did not have actual work efficiency benchmarks available for the Trainium processor during the keynote.
Amazon is committed to becoming net-zero carbon by 2040 with substantial investments in green technologies, which puts Amazon 10 years ahead of the Paris Agreement. In addition, Amazon has always worked toward improving efficiency and reducing the energy needed to deliver services to customers by focusing on all aspects of their infrastructure.
This is just one more way Amazon Web Services (AWS) is leading the way in innovation. Rackspace Technology is proud to be an “all-in” AWS Partner Network (APN) Premier Consulting Partner with the deep AWS expertise and scalability to take on the most complex AWS projects.