What you'll do
- Lead design and development of next-generation AWS platforms for AI/ML and HPC workloads.
- Own and proactively improve server reliability, testability, and diagnosability, drawing on both hardware and software expertise.
- Collaborate with cross-functional teams including SDEs, hardware engineers, TPMs, and managers across multiple locations.
- Solve complex architectural problems and deliver scalable, performant software to production.
- Operate within a fast-paced, growing team focused on continuous innovation and direct impact on AWS cloud offerings.
What you should know
- Ideal candidates are innovative self-starters with deep knowledge across hardware and software stacks.
- The role requires strong leadership and communication skills to manage complex projects and teams.
- Applicants should be comfortable working in a highly collaborative, multi-location environment.
- Candidates will have the opportunity to work on cutting-edge AI/ML infrastructure impacting billions of users.
- The position demands ownership and accountability for delivering high-quality, scalable solutions.
About the company
- Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform.
- AWS fosters an inclusive culture with employee-led affinity groups and ongoing diversity and learning initiatives.
- The company values work-life harmony and offers flexibility to support success at work and home.
- AWS is a global leader in cloud computing with continuous innovation in services and infrastructure.
- Amazon emphasizes mentorship and career growth, providing resources to develop well-rounded professionals.
Key required skills
- C++
- Java
- Python
- Linux
- Systems design
- Hardware knowledge
- Cloud computing