
Software Development Engineer, GenAI/ML GPU Orchestration Service, Alloy Greenland

Job ID: 2797882 | Amazon.com Services LLC

DESCRIPTION

GenAI is revolutionizing every industry in the world, yet we are still at the very beginning. As the appetite for GenAI continues to grow exponentially, the demand for GPU instances grows exponentially with it, resulting in one of the biggest problems of our generation: how do we get more GPU capacity? To solve it as an industry, we need to (1) scale up the production of GPUs, and (2) optimize usage of existing scarce GPU resources to do more with less.

The second bucket is where our team comes into the picture. Large cloud providers are already spending billions on GPU resources, and those resources are distributed in a siloed fashion to teams that are unable to utilize them to their fullest extent, whether due to peak/off-peak seasonality or workloads completing sooner than expected during vacation. When we looked at the data, we saw 15-30% idle capacity across GPU allocations, which presented a huge opportunity for us to tackle.

Alloy Greenland is a new team, started at the beginning of the year and operating startup-style with true Day 1 spirit. We are on a mission to accelerate AI/ML innovation for all teams across Amazon and, by extension through our partnership with AWS SageMaker, for the rest of the world.

If you love working backwards from customers, building 0-1, and having visibility with senior leadership, and if making a dent in the world excites you, this is the right place for you!

The Alloy Greenland team is part of the Alloy organization, the central efficiency org that drives cost savings for all service teams within Amazon through efficient use of AWS resources as they build and operate their services. This team is special in 3 ways: (1) business impact: we have a proven record of cost savings in the hundred-million-dollar range annually, and we have earned trust and reputation from service teams, partner teams (business and technical), and senior leadership; (2) technical complexity: our system is not a single product but the whole of Amazon, and we create central efficiency solutions that save costs for thousands of internal services with minimal or zero effort from their engineers; (3) professional network: we work closely with a group of Principal Engineers and Distinguished Engineers. Working with brilliant people helps you grow your career.


In this role, you will:

* Build 0-1 products and services that delight our customers.
* Work closely with AI/ML customers across Amazon and innovate on their behalf.
* Solve performance and efficiency problems that manifest at scale.
* Design metrics and measure performance and cost efficiency of services in Amazon's ecosystem.
* Collaborate with service teams to identify inefficiencies, and design and implement solutions.
* Design and develop highly available components and profiling tools.
* Lead and mentor a team of engineers.

BASIC QUALIFICATIONS

- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language

PREFERRED QUALIFICATIONS

- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $129,300/year in our lowest geographic market up to $223,600/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.