Reliability Engineer

491 jobs found

Company | Location | Salary
Blockchain.com | London, United Kingdom | $105k - $180k
Flock Safety | Atlanta, GA, United States | $150k - $185k
Chainlink Labs | Remote | not listed
Circle | Seattle, WA, United States | $147k - $195k
Chainlink Labs | Remote | not listed
Chainlink Labs | Remote | not listed
Circle | Seattle, WA, United States | $100k - $140k
CoinGecko | Malaysia | $97k - $120k
Bitpanda | Bucharest, Romania | $30k - $90k
Gemini | Remote | $120k - $168k
Gemini | Gurgaon, India | $36k - $54k
Gemini | New York, NY, United States | $172k - $241k
Edge & Node | Remote | $112k - $156k
Gemini | Singapore, Singapore | $87k - $102k
Elwood Technologies | Remote | not listed

Blockchain.com
$105k - $180k estimated
ENG London, England, United Kingdom

Blockchain is the world's leading software platform for digital assets. Offering the largest production blockchain platform in the world, we share the passion to code, create, and ultimately build an open, accessible and fair financial future, one piece of software at a time.

We are looking for a Site Reliability Engineer to join our engineering team as we tackle some of the most interesting problems in the crypto space, such as how to securely scale a distributed financial platform that touches millions of people a day.

Site Reliability Engineering (SRE), together with DevOps, is an engineering discipline that combines software and systems engineering to build and run large-scale, distributed, fault-tolerant systems. SRE ensures that Blockchain’s services are reliable and available enough to meet the needs of our users and the business, and delivers a fast rate of improvement while keeping an ever-watchful eye on capacity and performance.

At Blockchain, SRE is also a mindset and a set of engineering approaches to running better production systems: we build our own creative engineering solutions to operations problems. SREs are responsible for the big picture of how our systems are designed for operability and how they relate to each other, and we use a breadth of tools and approaches to solve a broad spectrum of problems. Practices such as limiting time spent on operational work, blameless postmortems, and proactive identification of potential outages factor into iterative improvement and are key to both product quality and interesting, dynamic day-to-day work.

The SRE/DevOps environment at Blockchain is a work in progress - we are looking for an experienced, senior SRE to provide engineering leadership across the SRE team and the broader engineering team. Are you ready for a challenge?

WHAT YOU WILL DO

  • You will play a critical role in evolving our infrastructure as we develop solutions to complex technical problems involving reliability, latency, bandwidth, and, most importantly, security.
  • You will focus heavily on writing tooling to replace manual, repetitive work in a scalable way.
  • You will work in a fast-paced, dynamic environment, complementing our existing high-calibre team.

WHAT YOU WILL NEED

  • Deep understanding and demonstrable experience running Bitcoin Core/Cash and Ethereum (geth, parity, openethereum) nodes. Experience with other networks is a plus (Cosmos, Polkadot, Solana).
  • Understanding of blockchain consensus mechanics (PoW/PoS, slashing, jailing).
  • Ability to adopt new technologies.
  • Experience with containerization and service orchestration, including best practices and security. Experience with HashiCorp Nomad, Consul and Vault is a plus.
  • Strong at automation in at least one programming language, preferably Python/Golang.
  • Experience with Linux, including an understanding of resource allocation, networking and/or internals.
  • Solid background with configuration management tools.
  • Experience with using GitOps and CI to make changes.
  • Experience with infrastructure as code tools. Experience with complex Terraform deployments is a plus.
  • Experience with messaging systems such as Kafka.
  • Experience with database management.
  • Experience working in Data Centers is a plus.
  • Knowledge of routing and switching protocols is a plus.

COMPENSATION & PERKS

  • Full-time salary based on experience and meaningful equity in an industry-leading company
  • Hybrid model working from home & awesome office location in the heart of London
  • Unlimited vacation policy; work hard and take time when you need it
  • Apple equipment
  • The opportunity to be a key player and build your career at a rapidly expanding, global technology company in an emerging field
  • Flexible work culture

Blockchain is committed to diversity and inclusion in the workplace and is proud to be an equal opportunity employer. We prohibit discrimination and harassment of any kind based on race, religion, color, national origin, gender, gender expression, sex, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. This policy applies to all employment practices within our organization, including hiring, recruiting, promotion, termination, layoff, recall, leave of absence, and apprenticeship. Blockchain makes hiring decisions based solely on qualifications, merit, and business need at the time.

What does a Reliability Engineer do?

A Reliability Engineer is responsible for ensuring the reliability and availability of systems and equipment in an organization.

They use their knowledge of engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

  1. Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
  2. Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
  3. Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
  4. Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
  5. Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
  6. Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
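
As a rough illustration of the data analysis described in the first task, the following Python sketch derives a failure rate, a mean time between failures, and an availability figure from a list of recorded outages. The `Outage` record, the `summarize` function, and the sample numbers are all hypothetical and chosen only for illustration; real reliability work would typically draw on richer maintenance data and more careful statistical models.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One recorded failure (hypothetical record layout)."""
    start_hour: float  # hours since the start of the observation window
    end_hour: float    # hour at which service was restored


def summarize(outages, window_hours):
    """Compute basic reliability figures over an observation window."""
    downtime = sum(o.end_hour - o.start_hour for o in outages)
    uptime = window_hours - downtime
    failures = len(outages)
    return {
        # Failures per hour of operation.
        "failure_rate_per_hour": failures / uptime if uptime > 0 else float("inf"),
        # Mean time between failures: operating hours per failure.
        "mtbf_hours": uptime / failures if failures else float("inf"),
        # Fraction of the window during which the system was up.
        "availability": uptime / window_hours,
    }


if __name__ == "__main__":
    # Made-up history: two outages in a 1000-hour window.
    history = [Outage(100, 102), Outage(500, 506)]
    for name, value in summarize(history, 1000).items():
        print(f"{name}: {value:.4f}")
```

Figures like these are only a starting point: they can be tracked over time to check whether a maintenance or reliability strategy is actually improving things.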