Reliability Engineer

420 jobs found

Company             Location                          Salary

Coinme              Seattle, WA, United States        $105k - $120k
Chainlink Labs      Remote
Socean Finance      Remote                            $40k - $70k
MetaMask            San Francisco, CA, United States  $72k - $75k
MetaMask            San Francisco, CA, United States  $72k - $75k
Snickerdoodle Labs  San Jose, CA, United States       $63k - $90k
Nuri                Berlin, Germany                   $105k - $120k
BlockFi             New York, NY, United States       $84k - $180k
BlockFi             London, United Kingdom            $84k - $180k
Triton One          Remote                            $54k - $82k
Triton One          Remote                            $54k - $82k
Solana Labs         Remote                            $40k - $86k
WOO Network         Remote                            $45k - $75k
IOTA Foundation     Remote                            $105k - $120k
Senior Manager, Quality and Reliability Engineering (Remote Optional)

Coinme
$105k - $120k estimated

This job is closed

Senior Manager, Quality and Reliability Engineering

Coinme is hiring an engineering leader to contribute to developing new technology products. Your duties will include overseeing the stages of development across functional squads while collaborating with the software development team using agile processes and scrum rituals. To succeed, you should have strong leadership skills and extensive experience in quality and reliability, as well as engineering management in a related industry. Top-notch engineering leaders combine technical knowledge with people skills to build strong, high-performing teams, promote product innovation, and ensure high-quality delivery within the given constraints.

What You'll Be Working On

  • Partner with technical leadership to define and execute quality and reliability strategy
  • Implement comprehensive quality and reliability gates to ensure the health of production and pre-production environments
  • Lead engineers to develop frameworks and tools and integrate them into our pipelines and quality gates
  • Manage and coordinate testing and validation of functional and non-functional aspects for key services and end-to-end workflows
  • Contribute to architecture discussions and planning with development teams
  • Define, measure, and implement standardized key performance metrics
  • Drive scalability for business-critical services ensuring performance, capacity planning, and resiliency
  • Define and manage internal and external stakeholder expectations, and collaborate with peers and stakeholders to achieve common goals
  • Provide technical leadership, career development, and mentoring to engineers
  • Coordinate reproduction of critical customer situations requiring special performance tests or simulations

What We're Looking For:

  • A Bachelor’s degree in Computer Science or equivalent combination of technical education and work experience
  • 4+ years of people management experience leading engineering teams, with deep knowledge of large-scale, highly available distributed systems and experience engaging with the executive suite
  • 5+ years of hands-on experience in software quality assurance in an agile environment
  • Knowledge of event-driven and cloud-based architecture and systems
  • Experience defining and measuring reliability metrics
  • Experience with APIs, web and mobile validation, and testing strategies
  • Familiarity with web automation frameworks such as Cypress, Selenium, and Playwright
  • Demonstrated business and technical judgment, especially relevant in balancing long-term strategic investments with near-term business goals
  • Ability to communicate clearly and effectively with developers, product managers, business owners, and senior business leaders
  • Ability to develop and evaluate performance metrics, as well as diagnose and resolve issues

Not Required, But Nice to Have:

  • Experience in a startup environment
  • Experience in crypto or fintech, including financial and compliance-related reporting

What does a Reliability Engineer do?

A Reliability Engineer is a professional responsible for ensuring the reliability and availability of an organization's systems and equipment.

They use engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

  1. Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
  2. Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
  3. Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
  4. Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
  5. Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
  6. Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
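
To make the data-analysis task in item 1 more concrete, here is a minimal Python sketch of three common reliability figures: mean time between failures (MTBF), steady-state availability, and the probability of running failure-free for a given period. The function names and the maintenance figures are hypothetical, chosen only for illustration, and the survival estimate assumes a constant failure rate (an exponential model), which may not hold for real equipment.

```python
import math


def mtbf(total_operating_hours: float, failure_count: int) -> float:
    """Mean time between failures: operating time divided by number of failures."""
    if failure_count <= 0:
        raise ValueError("failure_count must be positive")
    return total_operating_hours / failure_count


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def survival_probability(hours: float, mtbf_hours: float) -> float:
    """Probability of no failure within `hours`, assuming a constant failure rate."""
    return math.exp(-hours / mtbf_hours)


# Hypothetical maintenance history: 8,000 operating hours and four failures,
# each with a recorded repair duration in hours.
operating_hours = 8_000.0
repair_hours = [4.0, 6.0, 2.0, 8.0]

m = mtbf(operating_hours, len(repair_hours))
mttr = sum(repair_hours) / len(repair_hours)  # mean time to repair
a = availability(m, mttr)
r = survival_probability(500.0, m)

print(f"MTBF: {m:.0f} h, MTTR: {mttr:.1f} h")
print(f"Availability: {a:.4%}")
print(f"P(no failure in 500 h): {r:.3f}")
```

In practice, figures like these would typically feed the maintenance planning and performance monitoring described in items 1 and 4, with the model choice revisited as more failure data accumulates.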