Reliability Engineer

233 jobs found

Job Position, Company, and Location	Posted	Salary	Tags
Site Reliability Engineer Zamp 📍 Bangalore, India	1y	$77k - $85k	engineer reliability aws +2
Director of Site Reliability Engineering Stellar 📍 Remote	1y	$210k - $310k	executive reliability blockchain +2
Site Reliability Engineer Fully Remote Parity 📍 Remote	1y	$72k - $72k	remote engineer reliability +7
Site Reliability Engineer SRE Cloud Efficiency Engineer FinOps Chainlink Labs 📍 United States	1y	$98k - $112k	cloud engineer reliability +6
Senior Software Engineer Infrastructure Platform Core Reliability Coinbase 📍 Remote	1y	$180k - $218k	infrastructure engineer reliability +9
Sr. Staff Site Reliability Engineer Zscaler 📍 Remote	1y	$140k - $200k	engineer reliability senior +2
Staff Site Reliability Engineer Platform Coinbase 📍 Remote	1y	$211k - $249k	engineer reliability remote +7
Senior Site Reliability Engineer Platform Coinbase 📍 Remote	1y	$122k - $140k	engineer reliability senior +8
Senior Site Reliability Engineer Zscaler 📍 Remote	1y	$103k - $117k	engineer reliability senior +2
Senior Principal Site Reliability Engineer NYC MIA Crossmint 📍 New York, NY, United States	1y	$170k - $210k	engineer executive reliability +7
Senior Site Reliability Engineer Core AI Infrastructure Coinbase 📍 Remote	1y	$186k - $218k	infrastructure ai engineer +7
Site Reliability Engineer SRE Weekend Coverage Elwoodtechnologies 📍 Remote	1y		engineer reliability senior +4
Site Reliability Engineer Indonesia Argus Labs 📍 Jakarta, Indonesia	1y	$90k - $145k	engineer reliability blockchain +4
Site Reliability Engineer APAC Argus Labs 📍 San Francisco, CA, United States	1y	$90k - $145k	engineer reliability blockchain +4
Site Reliability Engineer South East Asia Argus Labs 📍 San Francisco, CA, United States	1y	$90k - $145k	ai engineer reliability +5

Site Reliability Engineer

Zamp

$77k - $85k estimated

Bangalore

Engineering – DevOps / Full time / On-site

About Zamp:

At Zamp, we’re building AI agents that empower people to move at the speed of thought. Our vision is a world where AI handles the routine, so humans can focus on strategy and innovation. We are building a platform where all operational work runs autonomously. We partner with Fortune 500s, leading global banks and companies to streamline complex Finance and Operations processes.

Founded in 2022 by Amit Jain—an IIT Delhi and Stanford graduate with over 20 years of industry leadership, including roles as Managing Director at Sequoia Capital and Head of Asia Pacific at Uber—Zamp is backed by a stellar $22M seed round. Our investors include Sequoia Capital, Dara Khosrowshahi (CEO, Uber), Tony Xu (CEO, DoorDash), and other global visionaries.

About the team:

At Zamp, our engineering team is the force behind our technological innovations. Transforming the most ambitious ideas into reality, we breathe life into dreams through code and hardware. With the right mix of expertise and creativity, the team works as unseen magicians who add a dash of tech wizardry to make Zamp’s products shine.

From coding late into the night to brainstorming over a cup of coffee, we are always on a mission to make our technology stand out. We are not just the engineering team but the tech superheroes that keep Zamp at the forefront of innovation.

You are likely to succeed in this role if you bring experience in:

Self-Hosted Infrastructure Ownership: Design, deploy, and maintain self-hosted systems and services, ensuring reliability, scalability, and security.
Kubernetes & Container Orchestration: Architect, manage, and scale Kubernetes clusters in production and self-hosted environments.
Infrastructure as Code (IaC): Write and manage Terraform modules to provision and manage infrastructure across AWS/GCP and on-prem setups.
CI/CD Automation: Build and maintain reliable CI/CD pipelines using tools like Jenkins, GitLab CI, or ArgoCD to ensure fast and safe deployments.
Monitoring & Observability: Set up and fine-tune observability tools like Grafana, Prometheus, and Graylog to monitor infrastructure, detect anomalies, and ensure uptime SLAs.
Scripting & Engineering: Write clean, modular automation scripts in Python, Bash, or Go to support operational needs and improve team productivity.
System Reliability & Incident Response: Own on-call responsibilities, drive root cause analysis, and continuously improve incident handling and system resilience.
Security & Compliance: Implement security and access controls within infrastructure, focusing on hardened self-hosted environments.

What we are actively looking for:

4-6 years of experience
Proven experience managing self-hosted systems and internal tooling at scale
Deep hands-on knowledge of Kubernetes, including Helm, Ingress, scaling, and custom operators
Solid experience with Terraform for IaC; optionally Ansible for configuration
Expertise in monitoring/logging stacks: Grafana, Prometheus, Graylog, ELK
Hands-on with AWS, GCP, or Azure; strong understanding of cloud-native + on-prem hybrid setups
Strong scripting experience in Python, Bash, or Go
Proficiency in version control systems like Git for managing code repositories and facilitating collaboration among development teams

Our Culture and Benefits:

At Zamp, we promote a culture of open communication, collaboration, and empowerment. We value transparency, meritocracy, and a strong work ethic. Join our early team and help us build something exceptional.

Perks:

- Competitive salaries and stock options with substantial potential upside.

- Collaborate with top talent.

- Diverse and inclusive workspace.

- Comprehensive medical insurance for employees, spouses, and children.

- A culture celebrating every victory.

- Continuous learning and skill development opportunities.

- Enjoy good food, games, and a comfortable office environment.

What does a Reliability Engineer do?

A Reliability Engineer is a professional who is responsible for ensuring the reliability and availability of systems and equipment in an organization.

They use their knowledge of engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
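
As a rough illustration of the data analysis described above, the sketch below computes mean time between failures (MTBF), mean time to repair (MTTR), and steady-state availability from a failure history. The `Outage` record, the `reliability_summary` helper, and the sample figures are all hypothetical and invented for this example; real reliability work would typically draw on actual maintenance logs and richer statistical models.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One failure: hours the system ran before failing, and hours to repair."""
    uptime_hours: float
    repair_hours: float


def reliability_summary(outages: list[Outage]) -> dict[str, float]:
    """Return MTBF, MTTR, and steady-state availability for a failure history."""
    if not outages:
        raise ValueError("need at least one recorded outage")
    # MTBF: average operating time between consecutive failures.
    mtbf = sum(o.uptime_hours for o in outages) / len(outages)
    # MTTR: average time needed to restore service after a failure.
    mttr = sum(o.repair_hours for o in outages) / len(outages)
    # Steady-state availability: fraction of time the system is up.
    availability = mtbf / (mtbf + mttr)
    return {"mtbf_hours": mtbf, "mttr_hours": mttr, "availability": availability}


# Hypothetical maintenance history, invented for illustration only.
history = [Outage(700.0, 2.0), Outage(500.0, 4.0), Outage(900.0, 3.0)]
summary = reliability_summary(history)
print(summary)
```

With figures like these, a longer MTBF or a shorter MTTR both raise availability, which is one simple way to compare the effect of preventive maintenance against faster incident response.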
