Team Lead - Site Reliability Engineer

  • Location:

    Chandigarh

  • Sector:

    Tech & Digital

  • Job type:

    Permanent

  • Salary:

    Negotiable

  • Contact:

    Tushar Kumar

  • Contact email:

    t.kumar@ioassociates.co.uk

  • Job ref:

    BBBH162971_1750911544

  • Consultant:

    Tushar Kumar

Site Reliability Engineer - Team Lead | Chandigarh (Onsite) | Permanent

POSITION:

We are looking for an experienced "Site Reliability Engineer - Team Lead" to lead an SRE team. The ideal candidate will have a strong background in enhancing the reliability and scalability of services, leading technical teams, and driving strategic initiatives to improve a Lodging-as-a-Service platform.


RESPONSIBILITIES:

  • Leadership & Mentorship: Lead, mentor, and develop a team of SREs, fostering a culture of reliability, collaboration, and continuous improvement.
  • Strategic Planning: Drive the design and implementation of scalable, sustainable solutions, and lead the transition towards a cloud-native, serverless, and NoOps environment.
  • Service Excellence: Oversee service availability, system performance, and capacity planning for critical
  • Cross-Functional Collaboration: Work closely with stakeholders across the organization to solve complex technical challenges and enhance user experiences.
  • Incident Management: Lead incident response efforts, perform root cause analysis, and implement preventative measures.
  • Process Optimization: Champion the adoption of best practices in monitoring, automation, and observability.
  • SLO Management: Define and manage Service Level Objectives (SLOs) to guide prioritization and ensure reliability.

REQUIRED EXPERIENCE:

  • Experience: 7+ years in site reliability engineering or related fields, with at least 2 years in a leadership role.
  • Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • Technical Expertise:
      • Extensive experience with AWS cloud services and cloud engineering best practices.
      • Proficiency in programming languages such as Java and Python, and familiarity with React.
      • Deep understanding of software engineering methodologies and development cycles.
      • Expertise in monitoring and observability tools (New Relic, Kibana, Prometheus, Grafana, Elasticsearch).
  • Leadership Skills: Proven ability to lead technical teams, manage projects, and communicate effectively with stakeholders.
  • Problem-Solving Skills: Exceptional analytical abilities to perform root cause analysis and develop effective solutions.
  • Automation & Efficiency: Strong background in automating processes and driving operational efficiency.