Reliability Engineer

420 jobs found

Company                         Location                     Salary
OKX                             Hong Kong, Hong Kong         $26k - $61k
Terraform Labs                  Remote                       $22k - $31k
Nethermind                      London, United Kingdom       $53k - $91k
Pintu                           Setiabudi, Indonesia         $63k - $100k
Consensys                       Remote                       $90k - $100k
Pyth Network                    Remote                       $63k - $150k
Worldcoin                       Remote                       $72k - $91k
Stellar Development Foundation  New York, NY, United States  $165k - $205k
CoinDesk                        New York, NY, United States  $135k - $195k
CoinDesk                        Sao Paulo, Brazil            $90k - $100k
CoinDesk                        London, United Kingdom       $90k - $100k
CoinDesk                        Bangalore, India             $90k - $100k
Gemini                          Gurgaon, India               $63k - $70k
NFT Now                         Remote                       $54k - $100k
Consensys                       Remote                       $72k - $100k

Site Reliability Engineering Lead Stability Architecture

OKX
$26k - $61k estimated

This job is closed

Who We Are

At OKX, we believe our future is reshaped with technology. Founded in 2017, OKX is one of the world’s leading cryptocurrency spot and derivatives exchanges. OKX innovatively adopted blockchain technology to reshape the financial ecosystem by offering some of the most diverse and sophisticated products, solutions, and trading tools on the market. Trusted by more than 20 million users in over 180 regions globally, OKX strives to provide an engaging platform that empowers every individual to explore the world of crypto. In addition to its world-class DeFi exchange, OKX serves its users with OKX Insights, a research arm that is at the cutting edge of the latest trends in the cryptocurrency industry. With its extensive range of crypto products and services, and unwavering commitment to innovation, OKX’s vision is a world of financial access backed by blockchain and the power of decentralized finance.
We invest in our people as much as we invest in technology. We are united by our engaging culture: here we win as a team, embrace change, and do the right thing. We are committed to creating a friendly, rewarding, and diverse environment for OKers. It doesn’t matter where you come from; here everyone feels valued and respected and has the same opportunities to develop and thrive. We want to bring out the best in you.

About the Opportunity

What You’ll Be Doing:
  • Identify hidden system issues and establish technical support systems related to stability and performance;
  • Lead cross-team collaboration to promote stability improvement and system development.
What We Look For In You:
  • A bachelor's degree or above in computer science or a related major, with about 8 years of development and architecture experience;
  • Proficient in Java and the Spring Cloud technology stack;
  • Experience in high-concurrency distributed system architecture;
  • Proficient in the core principles of commonly used middleware (Nginx, Nacos, Apollo, Kafka, Redis, Elasticsearch, etc.);
  • Preferably experienced in building service governance systems, capacity scheduling and management, stability assurance, architecture optimization, event assurance, and chaos engineering practices.
Perks & Benefits:
  • Competitive total compensation
  • Comprehensive insurance coverage for employees and their dependants
  • More that we would love to tell you about along the way!

What does a Reliability Engineer do?

A Reliability Engineer is a professional who is responsible for ensuring the reliability and availability of systems and equipment in an organization.

They use their knowledge of engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

  1. Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
  2. Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
  3. Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
  4. Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
  5. Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
  6. Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
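
To make the statistical modeling in item 1 more concrete, here is a minimal Python sketch of three common reliability calculations: mean time between failures (MTBF), the probability of surviving a given period, and steady-state availability. It assumes a constant failure rate (the exponential model), which is a simplification and not something the listing above prescribes. The function names and the sample figures are hypothetical and chosen only for illustration.

```python
import math


def mtbf(total_operating_hours: float, failure_count: int) -> float:
    """Mean time between failures: operating time divided by observed failures."""
    if failure_count <= 0:
        raise ValueError("need at least one observed failure to estimate MTBF")
    return total_operating_hours / failure_count


def reliability(hours: float, mtbf_hours: float) -> float:
    """Probability of running `hours` without a failure.

    Assumes a constant failure rate (exponential model): R(t) = exp(-t / MTBF).
    """
    return math.exp(-hours / mtbf_hours)


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + mean time to repair)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical maintenance history: 10,000 operating hours with 4 failures.
estimate = mtbf(total_operating_hours=10_000, failure_count=4)  # 2500.0 hours
print(f"MTBF: {estimate:.0f} h")
print(f"30-day survival probability: {reliability(720, estimate):.3f}")
print(f"Availability with 5 h repairs: {availability(estimate, mttr_hours=5):.4f}")
```

In practice, failure rates often change over a system's lifetime, so a Reliability Engineer may fit a different distribution (for example, Weibull) to the failure history instead of assuming a constant rate.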