Infrastructure Architect

Gurugram | Job No. atci-5556290-s2021422 | Full-time

Job Description

Project Role : Infrastructure Architect
Project Role Description : Lead the definition, design and documentation of technical environments. Deploy solution architectures, conduct analysis of alternative architectures, create architectural standards, define processes to ensure conformance with standards, institute solution-testing criteria, define a solution's cost of ownership, and promote a clear and consistent business vision through technical architectures.
Must have skills : Infrastructure Automation
Good to have skills : NA
Minimum 12 year(s) of experience is required
Educational Qualification : 15 years full time education

Summary:
As an Infrastructure Architect, a typical day involves leading the design and documentation of complex technical environments that support organizational goals. This role requires deploying solution architectures and evaluating various architectural alternatives to determine the most effective approach. The professional ensures that architectural standards are clearly defined and adhered to, while establishing processes that maintain consistency and quality across projects. Additionally, the role includes setting criteria for solution testing and assessing the total cost of ownership for proposed solutions. Throughout the day, the Infrastructure Architect works to align technical strategies with the broader business vision, fostering clarity and coherence in architectural decisions.

Key Responsibilities:
Design and implement HPC and AI infrastructure solutions, aligning system architecture and deployment roadmaps to industry-specific performance and scalability needs
Deploy, configure, and manage XPU-based clusters (CPU/GPU/accelerators) using schedulers, VM/K8s orchestration platforms, Slurm, and containerized platforms in scalable designs to provide Metal as a Service (MaaS), GPUaaS, AIaaS, and other offerings
Optimize cluster performance, scalability, energy, and cost efficiency across on-premises, cloud, and hybrid environments
Integrate AI and HPC platforms with existing IT systems, data pipelines, and security frameworks
Monitor, troubleshoot, and tune infrastructure to ensure high availability, low-latency networking, and workload resiliency
Develop and maintain documentation including architecture diagrams, configuration baselines, and operational runbooks
Provide technical guidance and support to users, enabling efficient execution of HPC/AI workloads, large-scale models, and simulations


Required Skills and Qualifications:

Proven ability to advise and engage with C-Suite executives and senior leadership, translating complex AI and HPC technologies into business and strategic value
Deep knowledge of infrastructure components including XPUs, high-performance fabrics (InfiniBand, Ethernet), and modern storage/data platforms (e.g. NVMe-oF, Lustre, BeeGFS, VAST, DDN, Weka)
Familiarity with orchestration and management frameworks (Slurm, Kubernetes, Docker) and performance/monitoring tools for AI/HPC environments
Strong grasp of MLOps, DevSecOps, and automation principles (Terraform, Ansible) as they apply to large-scale, secure, and reproducible workflows
Experience in agentic AI-based automation, including developing and integrating agents for automation and observability
Excellent communication and client-facing skills, with the ability to present complex architectures to both executives and technical teams

Preferred Skills and Qualifications:

Understanding of cloud and virtualization platforms (AWS, Azure, GCP, VMware, Nutanix) and how to align them with AI/HPC workload requirements
Experience advising or overseeing large-scale AI/HPC deployments (1,000+ GPUs or clusters of 100+ servers), providing architecture and strategic guidance
Familiarity with GPU computing and accelerator ecosystems (NVIDIA CUDA, AMD ROCm) and integration considerations for HPC/AI workloads
Knowledge of AI/ML frameworks (TensorFlow, PyTorch) and their operational and performance implications in HPC/AI environments
Industry experience in Life Sciences, Resources, Automotive, Financial Services, Telecommunications, or other HPC/AI-intensive sectors
Relevant cloud or infrastructure certifications (e.g., AWS Solutions Architect, GCP Professional Data Engineer) or equivalent technical credentials
Experience in workload planning, optimization, and orchestration guidance to align infrastructure with business and research objectives
Demonstrated ability to develop roadmaps, ROI analysis, and architecture recommendations that balance performance, scalability, and cost efficiency

Additional Information:
- The candidate should have a minimum of 12 years of experience in Infrastructure Automation.
- This position is based at our Gurugram office.
- 15 years of full-time education is required.

Qualifications

15 years full time education
