Reliability Engineer

485 jobs found

web3.career is now part of the Bondex Ecosystem

Company | Location | Salary
CoinDesk | London, United Kingdom | $90k - $100k
CoinDesk | Bangalore, India | $90k - $100k
Gemini | Gurgaon, India | $63k - $70k
NFT Now | Remote | $54k - $100k
Consensys | Remote | $72k - $100k
Consensys | Remote | $72k - $100k
Pinax | Canada | $54k - $100k
Binance | Taipei, Taiwan | (not listed)
Nethermind | London, United Kingdom | $90k - $100k
OKX | Singapore, Singapore | $105k - $120k
Coinmarketcap | Remote | $129k - $149k
OKX | Singapore, Singapore | $105k - $120k
bloXroute Labs | Tel Aviv, Israel | $77k - $87k
Solana US | United States | $120k
TRM Labs | Remote | $54k - $80k

Senior Site Reliability Engineer CoinDesk Indices Europe

CoinDesk
$90k - $100k estimated
ENG London, England, United Kingdom
This job is closed

About CoinDesk Indices

CoinDesk Indices (CDI), a subsidiary of CoinDesk, has been the leading provider of digital asset indices by AUM since 2014. We are driven by precision, rigor, research and a desire to educate the marketplace and empower investors. CoinDesk, a media, events, data, and indices company, is the most influential and trusted platform for the global crypto ecosystem.

CDI has three distinct product lines: single-asset reference indices, broad market and sector indices, and systematic strategy indices. The CoinDesk Bitcoin Price Index (XBX) has the longest index track record and underlies the world’s largest digital asset products. Our broad market and sector indices offer the most comprehensive broad market benchmarks, and our investible sectors are constructed using CDI’s industry-adopted taxonomy. Our systematic strategy indices help investors target specific outcomes.

About the Role

As a Senior Site Reliability Engineer (SRE), you will design and build operational systems and processes for critical services across CoinDesk Indices. In this role, you will have direct input on how we scale, secure, and monitor our products and platforms throughout the entire organization.

We have a growing infrastructure footprint, which means we need to hire an experienced lead to run point on our DevOps: someone with the wisdom and experience to build mature operational processes for the future. If this sounds like you, then we’d love to hear from you!

This is a fully remote position based in Europe.

Responsibilities:

  • Design and build operational systems and processes for critical services across CoinDesk Indices
  • Lead / drive / participate in the design, automation, administration, and security aspects of our AWS Cloud Infrastructure
  • Develop Infrastructure as Code (IaC) to automate infrastructure provisioning, deployment, and administration using Terraform
  • Improve and manage the CI/CD pipeline
  • Design and implement Disaster Recovery & backup solutions
  • Communicate, escalate, and follow up as appropriate to ensure that problems are solved; participate in an on-call rotation and incident response
  • Partner with engineering to identify & implement improvements to the development, deployment & monitoring workflows.
  • Own & maintain platform services that measure and monitor availability, latency, and overall system health
  • Help develop security standards and participate in the security and compliance review process
  • Manage vulnerabilities and enable security integrations into the SDLC process and CI/CD pipeline, and investigate and triage production security alerts

Requirements:

  • 7+ years of software and/or DevOps engineering experience, highly proficient in managing AWS Cloud Infrastructure in a distributed systems environment
  • Experience working in a fast-paced startup environment
  • Mastery of Linux command line interface is a must
  • Deep knowledge and hands-on experience automating cloud native technologies, deploying applications, and provisioning infrastructure on AWS services
  • Solid experience in Observability tooling, AWS Cloud Infrastructure/Services and Terraform is a must.
  • Experience building CI/CD pipelines for large-scale enterprise applications and monitoring & maintaining mission-critical systems
  • Experience with Kubernetes or similar is preferred
  • Experience with network security in a cloud computing context
  • Excellent communication and interpersonal skills to engage both technical and non-technical stakeholders. Being proactive, responsible, and highly accurate, and taking ownership of resources, is a must.

What does a Reliability Engineer do?

A Reliability Engineer is a professional who is responsible for ensuring the reliability and availability of systems and equipment in an organization.

They use their knowledge of engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

  1. Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
  2. Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
  3. Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
  4. Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
  5. Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
  6. Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
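
As a rough illustration of the data analysis described in the list above, the sketch below computes three common reliability figures from an outage log: mean time between failures (MTBF), mean time to repair (MTTR), and availability. The `Outage` record, the `reliability_summary` function, and the sample numbers are hypothetical and chosen only for illustration; real teams will likely use their own data sources and definitions.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One outage, in hours since the start of the observation window (hypothetical record)."""
    start_hour: float
    end_hour: float


def reliability_summary(outages, window_hours):
    """Return MTBF, MTTR, and availability for one observation window.

    Assumes outages do not overlap and all fall inside the window.
    """
    downtime = sum(o.end_hour - o.start_hour for o in outages)
    uptime = window_hours - downtime
    count = len(outages)
    return {
        # Average operating time between failures.
        "mtbf_hours": uptime / count if count else float("inf"),
        # Average time needed to restore service.
        "mttr_hours": downtime / count if count else 0.0,
        # Fraction of the window during which the system was up.
        "availability": uptime / window_hours,
    }


# Made-up sample data: three outages in a 30-day (720-hour) window.
outages = [Outage(100.0, 102.0), Outage(400.0, 401.0), Outage(650.0, 653.0)]
summary = reliability_summary(outages, window_hours=720.0)
print(summary)
```

With the sample data, total downtime is 6 hours out of 720, so MTBF is 238 hours, MTTR is 2 hours, and availability is roughly 99.17%. Tracking these figures over successive windows is one simple way to evaluate whether a reliability strategy is working.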