What you'll do
- Lead design and development of next-generation AWS platforms for AI/ML and HPC workloads.
- Own and proactively improve server system reliability, testability, and diagnosability using hardware and software expertise.
- Collaborate with cross-functional teams including SDEs, hardware engineers, TPMs, and managers across multiple locations.
- Drive complex architectural problem solving and deliver scalable, reliable software solutions in production.
- Contribute to continuous price-performance improvements for AI model training on AWS cloud infrastructure.
What you should know
- Role requires a self-starter with broad technical knowledge, from bare-metal hardware to userland software.
- Candidates should be comfortable working in a fast-paced, collaborative, and global team environment.
- Opportunity to have direct impact on AWS product improvements and bottom-line results.
- Expect to tackle complex, undefined problems requiring strong debugging and system design skills.
- Applicants with diverse or non-traditional backgrounds are encouraged to apply, in line with Amazon’s inclusive hiring practices.
About the company
- AWS is a global leader in cloud computing, serving customers from startups to Global 500 companies.
- The company culture emphasizes inclusion, curiosity, and continuous learning through employee-led affinity groups and events.
- AWS values work-life harmony and offers flexible working arrangements to support employee well-being.
- Amazon is committed to being Earth’s Best Employer with strong mentorship and career growth resources.
- AWS Hardware Engineering is known for innovative, frugal, and operationally excellent server designs critical to AWS success.
Key required skills
- C++
- Java
- Python
- Linux
- Systems design
- Debugging
- Cloud computing