What you'll do
- Lead design and development of next-generation AWS platforms for AI/ML and HPC workloads with a focus on server hardware and software integration.
- Collaborate cross-functionally with engineers, TPMs, and managers across multiple AWS teams and locations to deliver scalable, reliable cloud infrastructure.
- Own system-level debugging and proactive issue resolution using a deep understanding of hardware, software, and x86 architecture.
- Drive improvements in server testability, reliability, and performance impacting AWS’s bottom line and customer experience.
- Operate within a fast-paced, growing team environment with ownership of implementation and direct visibility into product impact.
What you should know
- Candidates should be prepared to work onsite in Cupertino within a collaborative and fast-paced environment.
- The role requires a broad technical skill set spanning hardware, operating systems, networking, and software development.
- Applicants will have opportunities for mentorship, career growth, and working on cutting-edge cloud infrastructure.
- The position demands strong organizational and communication skills to lead complex architectural problem-solving.
- AWS encourages applicants from diverse and non-traditional backgrounds to apply, valuing varied experiences.
About the company
- Amazon Web Services (AWS) is the world’s largest and most broadly adopted cloud platform, pioneering cloud computing innovation.
- The company values inclusion and diversity, fostering employee-led affinity groups and ongoing cultural learning experiences.
- AWS emphasizes work-life harmony and flexibility to support employee success both professionally and personally.
- AWS Hardware Engineering focuses on industry-leading, frugal, and operationally excellent server designs critical to AWS’s success.
- The team operates globally with a large-scale, distributed workforce across Seattle, Cupertino, and Austin.
Key required skills
- Systems engineering fundamentals including networking, storage, and operating systems
- Proficiency in at least one modern programming language such as C++, Python, Java, Golang, or PowerShell
- Experience in designing or architecting scalable and reliable systems
- Hands-on experience with server hardware and software integration
- Strong debugging and problem-solving skills across the full technical stack