Infra Tech Support Practitioner

Bengaluru Job No. atci-5353776-s1970548 Full-time

Job Description

Project Role : Infra Tech Support Practitioner
Project Role Description : Provide ongoing technical support and maintenance, both remote and onsite, for production and development systems, software products, and configured services running on various platforms, operating within a defined operating model and processes. Provide hardware and software support and implement technology at the operating-system level across all server and network areas, and for particular software solutions, vendors, and brands. Work includes L1 and L2 (basic and intermediate level) troubleshooting.
Must have skills : Site Reliability Engineering
Good to have skills : NA
Minimum 5 years of experience is required
Educational Qualification : 15 years of full-time education

Summary
A Site Reliability Engineer (SRE) ensures systems are stable, scalable, and highly available, bridging the gap between business application development and IT operations. This role combines automation, observability, incident response, and performance engineering to maintain continuous service reliability while accelerating delivery velocity. The SRE designs and maintains production systems that meet defined Service Level Objectives (SLOs) and stay within their error budgets. Using software engineering principles, an SRE prevents downtime, automates operations, and improves platform performance through observability, fault tolerance, and system resilience.

Key Responsibilities:
-Reliability and Performance: Monitor and optimize system uptime, latency, and throughput to meet SLOs and SLIs.
-Incident Management: Lead incident response, manage escalations, perform root cause analysis (RCA), and drive postmortem reviews.
-Automation and Tooling: Develop CI/CD pipelines, automate infrastructure management, and eliminate manual toil through scripting and orchestration.
-Monitoring and Observability: Implement metrics, logging, and tracing frameworks (Prometheus, Grafana, ELK, Datadog) to gain real-time visibility into distributed systems.
-Capacity Planning: Conduct resource forecasting, design scalable infrastructure, and handle performance under surge conditions.
-Change & Release Management: Partner with developers to ensure safe, reliable rollout of new features with automated testing and rollback mechanisms.
-Disaster Recovery & Resilience Engineering: Implement multi-region resilience strategies, chaos tests, and failover automation for business continuity.
-Process Improvement: Use post-incident analytics to refine operational practices and improve reliability with data-driven improvements.
-Cross-Team Collaboration: Work with product, design, ML, and DevOps teams to build intelligent workflows and user experiences.
-Infrastructure as Code: Implement Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, Azure DevOps, or Pulumi.
-Cloud Services: Apply expert knowledge of cloud IaaS and PaaS services.

Required Skills:
-Expertise in Python, Go, Bash, or JavaScript for automation and tooling.
-Hands-on experience with cloud environments (AWS, Azure, GCP) and with tools such as Kubernetes and Terraform.
-Deep understanding of Linux systems, networking, and distributed architectures.
-Experience with observability solutions such as Prometheus, Grafana, Datadog, CloudWatch, or New Relic.
-Familiarity with incident management and alerting platforms (PagerDuty, xMatters).
-Proficiency in CI/CD frameworks such as Jenkins, GitHub Actions, or GitLab CI.
-Working knowledge of security, compliance, and performance optimization for highly available systems.

Certifications (Required / Preferred):
-AWS Certified Solutions Architect - Professional
-Microsoft Certified: Azure Solutions Architect Expert
-Google Professional Cloud Architect
-Certified Kubernetes Administrator (CKA)
-HashiCorp Certified: Terraform Associate
-DevOps Engineer certifications (AWS, Azure, or Google)

Additional Information:
- The candidate should have a minimum of 5 years of experience in Site Reliability Engineering.
- This position is based at our Bengaluru office.
- 15 years of full-time education is required.
- The candidate needs to be AI-ready.

Qualifications

15 years of full-time education

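Learn More About Accenture

Our Expertise

Guided by our purpose, to deliver on the promise of technology and human ingenuity, we are committed to creating value by leading change and to building a better future for our clients, employees, shareholders, partners, and society as a whole.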