What you'll do
- Lead design and development of next-generation AWS AI/ML cloud infrastructure with a focus on server systems and hardware-software integration.
- Collaborate cross-functionally with engineers, TPMs, and managers across AWS Hardware Engineering and EC2 teams to deliver scalable, reliable server solutions.
- Develop and implement AI-driven automation tools and workflows to enhance productivity and system diagnostics.
- Own system reliability, testability, and debugging of complex server platforms involving x86, ARM, GPU/FPGA, and various hardware interfaces.
- Drive architectural problem-solving and build tactical solutions for high-performance cloud AI training and inference workloads.
What you should know
- Ideal candidates are innovative self-starters with deep knowledge across hardware and software stacks and strong system debugging skills.
- The role offers a unique opportunity to work at the intersection of AI automation and cloud platform development impacting billions of users.
- Applicants should be comfortable working onsite in Austin, TX, within a fast-paced, collaborative team environment.
- Candidates will face challenges involving complex server system interactions, which require strong systems thinking and a focus on reliability.
- AWS encourages applicants with diverse and non-traditional backgrounds to apply and highlights a supportive, flexible work culture.
About the company
- AWS is a global leader in cloud computing, pioneering innovation with a broad suite of services trusted by startups and Global 500 companies.
- The company fosters an inclusive culture that values diversity and curiosity, with employee-led affinity groups supporting equity and belonging.
- AWS Hardware Engineering focuses on frugal, operationally excellent server designs critical to AWS’s business and customer success.
- AWS Infrastructure Services manages the global cloud infrastructure, ensuring high availability, security, and cost efficiency at massive scale.
- Amazon emphasizes continuous learning, mentorship, and career growth to develop well-rounded professionals in a fast-paced environment.
Key required skills
- C++
- Python
- Go
- x86 architecture
- Linux
- PCIe protocol
- System debugging
- Automation
- AI-driven tools