Sr. Site Reliability Engineer - Incident Response
India - Bengaluru
HashiCorp
HashiCorp delivers consistent workflows to provision, secure, connect, and run any infrastructure for any application. HashiCorp solves development, operations, and security challenges in infrastructure so organizations can focus on business-critical tasks. We build products to give organizations a consistent way to manage their move to cloud-based IT infrastructures for running their applications. Our products enable companies large and small to mix and match AWS, Microsoft Azure, Google Cloud, and other clouds as well as on-premises environments, easing their ability to deliver new applications.
At HashiCorp, we use the Tao of HashiCorp as our guiding principles for product development, and we operate according to a strong set of company principles for how we interact with each other. We value top-notch collaboration and communication skills, both among internal teams and in how we interact with our users.
About this Role
As a Senior Site Reliability Engineer specializing in Incident Response, you will play a pivotal role in enhancing our operational resilience and maintaining the reliability of our cloud-based products. With a focus on rapid identification, response, and resolution of incidents, you will be at the forefront of ensuring high availability and performance across HashiCorp’s offerings.
In this role, you can expect to:
- Lead and refine our incident response strategy, ensuring rapid and effective response to operational disruptions.
- Implement best practices for system reliability, including proactive identification of potential failure points and the development of automated mitigations.
- Work closely with development, operations, and security teams to coordinate incident response efforts and post-incident analyses.
- Analyze incident trends and root causes to drive continuous improvements in system reliability and response processes.
- Develop and maintain tools for incident detection, analysis, and resolution, automating responses where possible to minimize human intervention.
- Create comprehensive incident response documentation and conduct training sessions to prepare all relevant teams for effective incident handling.
- Participate in and occasionally lead the on-call rotation, serving as a key decision-maker in the management of live incidents.
You may be a good fit for our team if you have:
- 6+ years of experience in site reliability engineering, systems administration, or software engineering, with a significant focus on incident response and operational reliability.
- Proven track record of managing and resolving incidents in cloud-based environments, with expertise in major public cloud platforms (AWS, GCP, Azure).
- Strong understanding of monitoring and alerting systems, with the ability to develop metrics and alarms that accurately reflect system health and operational risks.
- Experience with incident management tools and practices, including post-mortem analysis and root cause investigation.
- Excellent communication skills, capable of working effectively across multiple teams and with stakeholders at all levels.
- Familiarity with HashiCorp’s product suite and infrastructure automation tools is a plus.
Tags: Automation AWS Azure Cloud GCP Incident response Monitoring Strategy