Senior Software Development Engineer, Stores Incident Monitoring

Job ID: 2653380 | Amazon.com Services LLC

DESCRIPTION

We’re hiring a Senior Software Development Engineer to help shape and drive Incident Monitoring tooling and engineering efforts as part of the incident response program for the worldwide Amazon retail websites.

We are re-imagining incident management and response for Amazon’s retail operations. Amazon is evolving faster than our incident management and response programs can keep up. You'll play a critical role in the design and implementation of a strategic platform for the central incident response team. You will help us build next-generation indicators to monitor the retail website through the customer's lens. Your efforts will influence Amazon leadership decisions and every team at Amazon that interacts with our centralized control centers or outage calls. You’ll be required to dive deep into post-incident analysis and work with multiple Amazonians to understand how we can improve visibility into their Service Level Objectives (SLOs) in the future.

Key job responsibilities
You are enthusiastic about helping define, build, and integrate key performance indicators for different website experiences into our product by navigating Amazon's complex architectures. You are comfortable taking initiative and working across feature owners in a relatively unstructured environment. You have well-honed, insightful architectural design instincts, and enjoy building simple and elegant solutions that will scale to support thousands of unique retail website experiences. A passion for understanding the retail business and providing technical solutions that give real-time visibility into Amazon's health is a key requirement. As an engineer, you enjoy working with the Amazon ecosystem, innovating on behalf of the feature owners, and building solutions that will form the foundation for the central command center.
It won’t be easy. The challenges that come with scale and the semi-connected nature of Amazon will pose interesting, unique technical problems. These are big challenges that will move the needle for the central reliability and response organization.

About the team
The Incident Command Systems team at Amazon is responsible for envisioning and building programs that consistently improve remediation times for outages. This group consists of multiple 2-pizza teams (teams of 6-10 engineers) that each own software components for monitoring and anomaly detection of website-degrading issues, as well as the incident management software used during these outages.

BASIC QUALIFICATIONS

- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one software programming language
- 5+ years of experience leading design or architecture (design patterns, reliability and scaling) of new and existing systems
- Experience as a mentor, tech lead or leading an engineering team

PREFERRED QUALIFICATIONS

- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
- Experience contributing to the architecture and design (design patterns, reliability and scaling) of new and current systems

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $151,300/year in our lowest geographic market up to $261,500/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.