Infra Tech Support Practitioner

Software/Application/Cloud Tech Support Team Lead/Consultant | Full time | Experience: 5-10 years

Job No. ATCI-5379915-S1970694 | Bengaluru | Required Skill: Site Reliability Engineering

Job Description

Project Role: Infra Tech Support Practitioner
Project Role Description: Provide ongoing technical support and maintenance of production and development systems and software products (both remote and onsite) and for configured services running on various platforms (operating within a defined operating model and processes). Provide hardware/software support and implement technology at the operating-system level across all server and network areas, and for particular software solutions/vendors/brands. Work includes L1 and L2 (basic and intermediate level) troubleshooting.
Must have skills: Site Reliability Engineering
Good to have skills: Python (Programming Language), Kubernetes, Prometheus Event Monitoring System
Minimum 5 year(s) of experience is required
Educational Qualification: 15 years of full-time education

Summary
A Site Reliability Engineer (SRE) ensures systems are stable, scalable, and highly available, bridging the gap between business application development and IT operations. This role combines automation, observability, incident response, and performance engineering to maintain continuous service reliability while accelerating delivery velocity. The SRE designs and maintains production systems that meet defined Service Level Objectives (SLOs) and stay within error budgets. Using software engineering principles, an SRE prevents downtime, automates operations, and improves platform performance through observability, fault tolerance, and system resilience.

Key Responsibilities:
- Reliability and Performance: Monitor and optimize system uptime, latency, and throughput to meet SLOs as measured by Service Level Indicators (SLIs).
- Incident Management: Lead incident response, manage escalations, perform root cause analysis (RCA), and drive postmortem reviews.
- Automation and Tooling: Develop CI/CD pipelines, automate infrastructure management, and eliminate manual toil through scripting and orchestration.
- Monitoring and Observability: Implement metrics, logging, and tracing frameworks (Prometheus, Grafana, ELK, Datadog) to gain real-time visibility into distributed systems.
- Capacity Planning: Conduct resource forecasting, design scalable infrastructure, and handle performance under surge conditions.
- Change & Release Management: Partner with developers to ensure safe, reliable rollout of new features with automated testing and rollback mechanisms.
- Disaster Recovery & Resilience Engineering: Implement multi-region resilience strategies, chaos tests, and failover automation for business continuity.
- Process Improvement: Use post-incident analytics to refine operational practices and drive data-driven reliability improvements.
- Collaboration: Work with product, design, ML, and DevOps teams to build intelligent workflows and user experiences.
- Infrastructure as Code: Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, Azure DevOps, or Pulumi.
- Cloud Services: Apply expert knowledge of cloud IaaS and PaaS services.

Required Skills:
- Expertise in Python, Go, Bash, or JavaScript for automation and tooling.
- Hands-on experience with cloud environments (AWS, Azure, GCP) and with orchestration and provisioning tools such as Kubernetes and Terraform.
- Deep understanding of Linux systems, networking, and distributed architectures.
- Experience with observability solutions such as Prometheus, Grafana, Datadog, CloudWatch, or New Relic.
- Familiarity with incident management and alerting platforms (PagerDuty, xMatters).
- Proficiency in CI/CD frameworks such as Jenkins, GitHub Actions, or GitLab CI.
- Working knowledge of security, compliance, and performance optimization for highly available systems.

Certifications (Required / Preferred):
- AWS Certified Solutions Architect - Professional
- Microsoft Certified: Azure Solutions Architect Expert
- Google Professional Cloud Architect
- Certified Kubernetes Administrator (CKA)
- HashiCorp Certified: Terraform Associate
- DevOps Engineer certifications (AWS, Azure, or Google Cloud)

Additional Information:
- The candidate should have a minimum of 5 years of experience in Site Reliability Engineering.
- This position is based at our Bengaluru office.
- 15 years of full-time education is required.
- The candidate needs to be AI-ready.
- AI Powered Tech Talent

Qualifications

15 years of full-time education

Location

Bengaluru

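Additional Information

Equal Employment Opportunity Statement

All employment decisions are made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status, or any other basis protected by federal, state, or local law.

Job candidates are not obligated to disclose sealed or expunged records of conviction or arrest as part of the hiring process.

Accenture is committed to providing veteran employment opportunities to our service men and women.

Please read Accenture's Recruiting and Hiring Statement for more information on how we process your data during the recruiting and hiring process.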

About Accenture

We work with one shared purpose: to deliver on the promise of technology and human ingenuity. Every day, more than 775,000 of us help our stakeholders continuously reinvent. Together, we drive positive change and deliver value to our clients, partners, shareholders, communities, and each other.

We believe that delivering value requires innovation, and innovation thrives in an inclusive and diverse environment. We actively foster a workplace free from bias, where everyone feels a sense of belonging and is respected and empowered to do their best work.

At Accenture, we see well-being holistically, supporting our people’s physical, mental, and financial health. We also provide opportunities to keep skills relevant through certifications, learning, and diverse work experiences. We’re proud to be consistently recognized as one of the World’s Best Workplaces™.

Join Accenture to work at the heart of change. Visit us at www.accenture.com.