Team Lead - Site Reliability Engineer
Location: Chandigarh
Salary: Negotiable
Contact: Tushar Kumar
Contact email: t.kumar@ioassociates.co.uk
Job ref: BBBH162971_1750911544
Consultant: Tushar Kumar
Site Reliability Engineer - Team Lead | Chandigarh (Onsite) | Permanent
POSITION:
We are looking for an experienced "Site Reliability Engineer - Team Lead" to lead an SRE team. The ideal candidate will have a strong background in enhancing the reliability and scalability of services, leading technical teams, and driving strategic initiatives to improve a Lodging-as-a-Service platform.
RESPONSIBILITIES:
- Leadership & Mentorship: Lead, mentor, and develop a team of SREs, fostering a culture of reliability, collaboration, and continuous improvement.
- Strategic Planning: Drive the design and implementation of scalable, sustainable solutions, and lead the transition towards a cloud-native, serverless, and NoOps environment.
- Service Excellence: Oversee service availability, system performance, and capacity planning for critical
- Cross-Functional Collaboration: Work closely with stakeholders across the organization to solve complex technical challenges and enhance user experiences.
- Incident Management: Lead incident response efforts, perform root cause analysis, and implement preventative measures.
- Process Optimization: Champion the adoption of best practices in monitoring, automation, and observability.
- SLO Management: Define and manage Service Level Objectives (SLOs) to guide prioritization and ensure reliability.
REQUIRED EXPERIENCE:
- Experience: 7+ years in site reliability engineering or related fields, with at least 2 years in a leadership role.
- Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Technical Expertise:
  - Extensive experience with AWS cloud services and cloud engineering best practices.
  - Proficiency in programming languages such as Java and Python, and familiarity with React.
  - Deep understanding of software engineering methodologies and development cycles.
  - Expertise in monitoring and observability tools (New Relic, Kibana, Prometheus, Grafana, Elasticsearch).
- Leadership Skills: Proven ability to lead technical teams, manage projects, and communicate effectively with stakeholders.
- Problem-Solving Skills: Exceptional analytical abilities to perform root cause analysis and develop effective solutions.
- Automation & Efficiency: Strong background in automating processes and driving operational efficiency.
