Reliability Engineer

458 jobs found


Company          Location                           Salary
Crypto.com       Taipei, Taiwan                     $185k
Coinbase         Remote                             $185k
Fuel Labs        Web3                               $103k - $156k
Chainlink Labs   Remote
Ramp             Poland                             $90k - $100k
Chainlink Labs   Remote
Chainlink Labs   Remote
Wallet           Remote                             $90k - $100k
Ripple           Lausanne, Switzerland              $95k - $144k
Ripple           Singapore, Singapore               $90k - $115k
Heretic          San Francisco, CA, United States   $103k - $117k
LayerZero Labs   Remote                             $112k - $156k
Blockchain.com   London, United Kingdom             $105k - $180k
Flock Safety     Atlanta, GA, United States         $150k - $185k
Chainlink Labs   Remote

Crypto.com
$185k estimated
Taipei, Taiwan

Software Engineer/Site Reliability Engineer

Taipei, Taiwan
Engineering / Full-time / Hybrid

We are a team that designs, develops, maintains, and improves software for various ventures projects, i.e., projects that are adjacent to our core businesses and are bootstrapped quickly by a lean team. You will be actively involved in designing the components behind scalable applications, from frontend UI to backend infrastructure.


What you’ll be doing

    • Ensure the entire stack is healthy, with hardware, software, applications, and network operating at optimal performance
    • Perform deep dives into both systemic and latent reliability issues, partnering with other software and DevOps engineers across the organization to design, implement, and roll out fixes
    • Run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent the incident from ever happening again
    • Continuously improve availability, reliability, and observability, and reduce the burden of human toil with tooling and automation
    • Define SLAs/SLOs for different services in partnership with product engineers
    • Represent the SRE team in system design reviews and operational readiness exercises for new and existing services

What you need

    • Experience coding in Ruby and/or Go
    • Familiarity with GitOps principles and tools (GitHub Actions, Docker, Kubernetes)
    • Experience in designing, analyzing, and troubleshooting large-scale distributed systems
    • Curiosity about finding root causes in incidents and outages
    • Ability to build alignment, cultivate relationships, and drive impact
    • A mindset for designing fault-tolerant system architecture
    • Comfort with being uncomfortable in ambiguous situations
    • Involvement with incident management and response
    • Desire to grow expertise, inform, and educate others
    • A fast learner who can pick up various technologies and has a “get things done” mentality
    • Humble enough to embrace better ideas from others, eager to make things better, and open to challenges and possibilities

Desirable

    • Familiarity with cloud platforms and microservice-based architecture (AWS is a big plus)
    • Familiarity with monitoring tools (e.g. New Relic, Datadog, and/or OpenTelemetry)
    • Familiar with IaC tools (e.g. Terraform, Spacelift)
    • Experience in designing resilient system architecture
    • Experience in optimizing the performance of large-scale production systems
    • Experience in promoting site reliability engineering practices

Life @ Crypto.com

Empowered to think big. Try new opportunities while working with a talented, ambitious and supportive team.
Transformational and proactive working environment. Empower employees to find thoughtful and innovative solutions.
Growth from within. We help to develop new skill-sets that would impact the shaping of your personal and professional growth.
Work Culture. Our colleagues are some of the best in the industry; we are all here to help and support one another.
One cohesive team. Engage stakeholders to achieve our ultimate goal - Cryptocurrency in every wallet.
Work flexibility. Flexible working hours and a hybrid or remote setup.
Aspire career alternatives through us - our internal mobility program offers employees a new scope.
Work Perks: crypto.com visa card provided upon joining

Are you ready to kickstart your future with us?

Benefits

Competitive salary
Attractive annual leave entitlement, including birthday and work anniversary leave
Work flexibility. Flexible working hours and a hybrid or remote setup.
Aspire career alternatives through us. Our internal mobility program can offer employees a diverse scope.
Work Perks: crypto.com visa card provided upon joining

Our Crypto.com benefits packages vary depending on regional requirements; you can learn more from our talent acquisition team.


About Crypto.com:

Founded in 2016, Crypto.com serves more than 80 million customers and is the world's fastest growing global cryptocurrency platform. Our vision is simple: Cryptocurrency in Every Wallet™. Built on a foundation of security, privacy, and compliance, Crypto.com is committed to accelerating the adoption of cryptocurrency through innovation and empowering the next generation of builders, creators, and entrepreneurs to develop a fairer and more equitable digital ecosystem.

Learn more at https://crypto.com.

Crypto.com is an equal opportunities employer and we are committed to creating an environment where opportunities are presented to everyone in a fair and transparent way. Crypto.com values diversity and inclusion, seeking candidates with a variety of backgrounds, perspectives, and skills that complement and strengthen our team.

Personal data provided by applicants will be used for recruitment purposes only.

Please note that only shortlisted candidates will be contacted.

What does a Reliability Engineer do?

A Reliability Engineer is a professional who is responsible for ensuring the reliability and availability of systems and equipment in an organization.

They use their knowledge of engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

  1. Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
  2. Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
  3. Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
  4. Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
  5. Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
  6. Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
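
As a rough illustration of the data analysis described in item 1, the sketch below computes two common reliability metrics from a maintenance history: mean time between failures (MTBF) and mean time to repair (MTTR), plus the steady-state availability derived from them. The FailureRecord type, the function names, and the sample history are all hypothetical and chosen only for this example; a real analysis would depend on how an organization records failures and would usually involve more careful statistical modeling.

```python
from dataclasses import dataclass


@dataclass
class FailureRecord:
    """One failure event (hypothetical record layout for this sketch)."""
    uptime_hours: float   # hours the equipment ran before this failure
    repair_hours: float   # hours needed to restore it to service


def mtbf(records):
    """Mean time between failures: total operating hours / number of failures."""
    if not records:
        raise ValueError("at least one failure record is required")
    return sum(r.uptime_hours for r in records) / len(records)


def mttr(records):
    """Mean time to repair: total repair hours / number of failures."""
    if not records:
        raise ValueError("at least one failure record is required")
    return sum(r.repair_hours for r in records) / len(records)


def availability(records):
    """Steady-state availability, estimated as MTBF / (MTBF + MTTR)."""
    up = mtbf(records)
    down = mttr(records)
    return up / (up + down)


# Made-up maintenance history, for illustration only.
history = [
    FailureRecord(uptime_hours=400.0, repair_hours=4.0),
    FailureRecord(uptime_hours=600.0, repair_hours=6.0),
    FailureRecord(uptime_hours=500.0, repair_hours=5.0),
]

if __name__ == "__main__":
    print(f"MTBF: {mtbf(history):.1f} h")
    print(f"MTTR: {mttr(history):.1f} h")
    print(f"Availability: {availability(history):.2%}")
```

For the made-up history above, this gives an MTBF of 500 hours, an MTTR of 5 hours, and an availability of roughly 99.0%. Simple averages like these are only a starting point; they assume failures are comparable events and say nothing about trends over time.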