Reliability Engineer

233 jobs found


Job Position, Company and Location	Posted	Salary	Tags
DevOps Site Reliability Engineer Okx 📍 Remote	2mo	$140k - $144k	devops engineer reliability +7
Site Reliability Engineer AI Agents Kraken 📍 United States	2mo	$96k - $192k	ai engineer reliability +3
Site Reliability Engineer Alpaca 📍 Remote	2mo	$119k - $135k	engineer reliability crypto +2
Sr. Staff Site Reliability Engineer, Federal Security Clearance Zscaler 📍 Remote	3mo	$140k - $200k	engineer reliability security +4
Operations Reliability Engineer Automations Alpaca 📍 Remote	3mo	$90k - $165k	engineer operations reliability +7
Staff Site Reliability Engineer, Federal Security Clearance Zscaler 📍 Remote	3mo	$119k - $170k	engineer reliability security +3
Senior Site Reliability Engineer Payward Services Kraken 📍 London, United Kingdom	3mo	$119k - $131k	engineer reliability senior +4
Engineering Manager Site Reliability Engineering Kraken 📍 London, United Kingdom	3mo	$105k - $120k	engineering manager engineer reliability +3
Senior DevOps Site Reliability Engineer Limit Break 📍 Tokyo, Japan	3mo	$112k - $130k	devops engineer reliability +6
Senior Site Reliability Engineer Hyperbolic Labs 📍 San Francisco, CA, United States	4mo	$103k - $120k	engineer reliability senior +1
Senior Site Reliability Engineer Node Platform Chainlink Labs 📍 United States	4mo	$115k - $117k	engineer javascript node +5
SRE Site Reliability Engineer Keyrock 📍 Brussels, Belgium	5mo	$133k - $135k	engineer reliability senior +6
Sr. Site Reliability Engineer SRE Zora 📍 Remote	5mo	$170k - $225k	engineer reliability senior +6
Site Reliability Engineer asymmetric.re 📍 Remote	5mo	$124k - $150k	engineer reliability bitcoin +6
Site Reliability Engineer II Chainlink Labs 📍 Argentina	5mo	$112k - $156k	engineer reliability blockchain +3

DevOps Site Reliability Engineer

Okx

$140k - $144k estimated

Remote

OKX will be prioritising applicants who have a current right to work in Singapore, and do not require OKX's sponsorship of a visa.

Who We Are

At OKX, we believe that the future will be reshaped by crypto, and will ultimately contribute to every individual's freedom. OKX is a leading crypto exchange and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also a brand trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves. Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er.

OKX is part of OKG, a group that brings the value of Blockchain to users around the world, through our leading products OKX, OKX Wallet, OKLink and more.

What You’ll Be Doing

Build and maintain the core infrastructure of the AIOps platform, including the unified monitoring & alerting system and the FinOps cost observability platform.

Maintain and continuously optimize internal R&D infrastructure (GitLab, Nexus, Sonar, etc.).

Manage monitoring data collection, alert governance, and cost data visualization across multi-cloud environments (Alibaba Cloud / AWS).

Support cloud security operations, including cloud security alert management and compliance auditing.

What We Look For In You

3+ years of DevOps or SRE experience; experience with AIOps or observability platform development is a plus.

Proficient in Python; familiar with at least one of Go or Java. Full-stack capability (React/Vue frontend + backend API) is a plus.

Hands-on experience with at least one major cloud platform (Alibaba Cloud or AWS); familiar with cloud monitoring products (CloudWatch / Alibaba Cloud CloudMonitor) and cost management tools.

Familiar with monitoring and logging stacks such as Prometheus, Grafana, and ELK.

Experience maintaining and optimizing CI/CD toolchains (GitLab CI, Nexus, container registries).

Experience with AI/LLM application development (e.g., LLM API integration, RAG, Agent frameworks) is a plus.

Good written and verbal English communication skills.

Perks & Benefits

Competitive total compensation package

L&D programs and education subsidy for employees' growth and development

Various team building programs and company events

Wellness and meal allowances

Comprehensive healthcare schemes for employees and dependants

And more that we'd love to tell you about during the process!

Notice: All official OKX vacancies are published on this website. While roles may appear on selected third-party platforms from time to time, information on other sites may be inaccurate or outdated. If in doubt, please apply directly through our official careers website.

Information collected and processed as part of the recruitment process of any job application you choose to submit is subject to OKX's Candidate Privacy Notice.

What does a Reliability Engineer do?

A Reliability Engineer is a professional who is responsible for ensuring the reliability and availability of systems and equipment in an organization.

They use their knowledge of engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
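
As a rough illustration of the failure-rate analysis mentioned in the list above, here is a minimal Python sketch that computes mean time between failures (MTBF), mean time to repair (MTTR), and steady-state availability from a maintenance history. The `FailureRecord` type, the function names, and the sample figures are hypothetical and used only for illustration; real reliability work typically uses richer models than these simple averages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureRecord:
    """One failure event from a (hypothetical) maintenance history."""
    uptime_hours: float  # operating time before this failure occurred
    repair_hours: float  # time taken to restore service afterwards


def mtbf(records: List[FailureRecord]) -> float:
    """Mean time between failures: average operating time per failure."""
    return sum(r.uptime_hours for r in records) / len(records)


def mttr(records: List[FailureRecord]) -> float:
    """Mean time to repair: average restoration time per failure."""
    return sum(r.repair_hours for r in records) / len(records)


def availability(records: List[FailureRecord]) -> float:
    """Steady-state availability estimate: MTBF / (MTBF + MTTR)."""
    up = mtbf(records)
    down = mttr(records)
    return up / (up + down)


# Made-up sample data: three failures with their preceding uptime and repair time.
records = [
    FailureRecord(uptime_hours=700.0, repair_hours=2.0),
    FailureRecord(uptime_hours=500.0, repair_hours=4.0),
    FailureRecord(uptime_hours=900.0, repair_hours=3.0),
]

if __name__ == "__main__":
    print(f"MTBF: {mtbf(records):.1f} h")
    print(f"MTTR: {mttr(records):.1f} h")
    print(f"Availability: {availability(records):.4%}")
```

Figures like these are one common starting point for judging whether a preventive maintenance program is having an effect, since both MTBF and MTTR can be tracked over time.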
