What you'll do
- Lead design and development of next-generation AWS platforms for AI/ML and HPC workloads.
- Own and proactively improve server systems reliability, testability, and diagnosis at cloud scale.
- Collaborate cross-functionally with engineers, TPMs, and managers across multiple AWS teams globally.
- Develop tactical software and hardware solutions using deep knowledge of x86 architecture and systems engineering.
- Drive continuous price-performance improvements for AI model training infrastructure supporting large language models.
What you should know
- Ideal candidates are innovative self-starters with broad technical knowledge from hardware to software.
- The role requires strong systems debugging skills and the ability to solve undefined, complex architectural problems.
- Applicants should be comfortable working in a fast-paced, high-impact environment with ownership of deliverables.
- Experience with Agile methodologies and cross-team collaboration is highly valued.
- Candidates with non-traditional backgrounds or alternative experiences are encouraged to apply.
About the company
- AWS is a global leader in cloud computing, pioneering innovation across infrastructure and services.
- The company values an inclusive culture with employee-led affinity groups and diversity initiatives.
- AWS emphasizes work-life harmony and offers flexibility to support employees’ personal and professional lives.
- Amazon is a massive, fast-paced organization with a strong focus on operational excellence and customer impact.
- The Hardware Engineering team is highly collaborative, working across locations like Seattle, Cupertino, and Austin.
Key required skills
- C++
- Python
- PowerShell
- Systems engineering
- Networking
- Storage systems
- Operating systems
- Agile