Reliability Engineer
412 jobs found
Company | Location | Salary |
---|---|---|
Crypto.com | Hong Kong, Hong Kong | $185k |
Zscaler | Remote | $161k - $230k |
Wehrtyou | Remote | $200k - $275k |
Wehrtyou | Remote | $150k - $250k |
Scrollio | Remote | $133k - $135k |
Blockchain | Remote | $120k - $144k |
Aurosglobal | Remote | $87k - $87k |
Coinbase | Remote | $186k - $218k |
Auros | Remote | $87k - $87k |
Chainlink Labs | United States | $126k - $135k |
Argus Labs | Toronto, Canada | $90k - $145k |
Avalabs | Remote | $85k - $107k |
Kraken | European Union | $112k - $156k |
Zinnia | Remote | $126k - $127k |
Anagram | Remote | $112k - $156k |
Crypto.com
$185k estimated
Senior Software Engineer, Site Reliability Engineering
Hong Kong, Hong Kong SAR
Engineering – Engineering
Hybrid
We are a team that designs, develops, maintains, and improves software for various venture projects, i.e., projects that are adjacent to our core businesses and are bootstrapped fast with a lean team. You will be actively involved in the design of various components behind scalable applications, from frontend UI to backend infrastructure.
What you’ll be doing
- Ensure the entire stack is healthy, with hardware, software, applications, and network all operating at optimal performance
- Perform deep dives into both systemic and latent reliability issues, partnering with other software and DevOps engineers across the organization to design, implement, and roll out fixes
- Continuously improve availability, reliability, and observability and reduce the burden of human toil with tooling and automation
- Lead and drive SRE initiatives to improve operational efficiency
- Represent the SRE team in system design reviews and operational readiness exercises for new and existing services
What you need
- Experience coding in Ruby and/or Go
- Familiar with GitOps principles and tools (GitHub Actions, Docker, Kubernetes)
- Experience in designing, analyzing, and troubleshooting large-scale distributed systems
- Curiosity about finding root causes in incidents and outages
- Ability to develop alignment, cultivate relationships, and drive impact
- A mindset for designing fault-tolerant system architecture
- Comfort with being uncomfortable in ambiguous situations
- Involvement with incident management and response
- Desire to grow expertise, inform, and educate others
- Able to pick up various technologies, a fast learner with a “get things done” mentality
- Humble enough to embrace better ideas from others, eager to make things better, and open to challenges and possibilities
Desirable
- Familiar with cloud platforms and microservice-based architecture (AWS is a big plus)
- Familiar with monitoring tools (e.g. Datadog, OpenTelemetry)
- Familiar with CI/CD tools (e.g. GitHub Actions)
- Familiar with IaC tools (e.g. Terraform, Spacelift)
- Experience in designing resilient system architecture
- Experience in optimizing the performance of large-scale production systems
Life @ Crypto.com
Empowered to think big. Try new opportunities while working with a talented, ambitious and supportive team.
Transformational and proactive working environment. Empower employees to find thoughtful and innovative solutions.
Growth from within. We help you develop new skill sets that shape your personal and professional growth.
Work Culture. Our colleagues are some of the best in the industry; we are all here to help and support one another.
One cohesive team. Engage stakeholders to achieve our ultimate goal - Cryptocurrency in every wallet.
Work Flexibility Adoption. Flexible working hours and a hybrid or remote set-up.
Explore career alternatives through us: our internal mobility program offers employees a new scope.
Work Perks: Crypto.com Visa card provided upon joining
Are you ready to kickstart your future with us?
Benefits
Competitive salary
Attractive annual leave entitlement, including birthday and work anniversary leave
Work Flexibility Adoption. Flexible working hours and a hybrid or remote set-up.
Explore career alternatives through us. Our internal mobility program can offer employees a diverse scope.
Work Perks: Crypto.com Visa card provided upon joining
Crypto.com benefits packages vary depending on regional requirements; you can learn more from our talent acquisition team.
About Crypto.com:
Founded in 2016, Crypto.com serves more than 80 million customers and is the world's fastest-growing global cryptocurrency platform. Our vision is simple: Cryptocurrency in Every Wallet™. Built on a foundation of security, privacy, and compliance, Crypto.com is committed to accelerating the adoption of cryptocurrency through innovation and empowering the next generation of builders, creators, and entrepreneurs to develop a fairer and more equitable digital ecosystem.
Learn more at https://crypto.com.
Crypto.com is an equal opportunities employer and we are committed to creating an environment where opportunities are presented to everyone in a fair and transparent way. Crypto.com values diversity and inclusion, seeking candidates with a variety of backgrounds, perspectives, and skills that complement and strengthen our team.
Personal data provided by applicants will be used for recruitment purposes only.
Please note that only shortlisted candidates will be contacted.
What does a Reliability Engineer do?
A Reliability Engineer is a professional who is responsible for ensuring the reliability and availability of systems and equipment in an organization.
They use their knowledge of engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.
Here are some of the typical tasks and responsibilities of a Reliability Engineer:
- Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
- Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
- Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
- Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
- Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
- Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
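To make the statistical side of the role concrete, here is a minimal Python sketch of three quantities commonly derived from maintenance history: mean time between failures (MTBF), steady-state availability, and the probability of surviving a given period without failure. The function names and the sample figures are hypothetical and chosen only for illustration, and the survival estimate assumes a constant failure rate (an exponential model), which real equipment may not follow.

```python
import math


def mtbf(operating_hours: float, failure_count: int) -> float:
    """Mean time between failures: hours in operation per observed failure."""
    if failure_count <= 0:
        raise ValueError("need at least one observed failure")
    return operating_hours / failure_count


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + mean time to repair)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def survival_probability(hours: float, mtbf_hours: float) -> float:
    """Chance of running `hours` without failure, assuming a constant failure rate."""
    return math.exp(-hours / mtbf_hours)


# Hypothetical maintenance history (illustrative numbers, not real data).
operating_hours = 8760.0             # hours the system was in operation
repair_hours = [2.0, 5.0, 3.0, 2.0]  # duration of each repair, one per failure

system_mtbf = mtbf(operating_hours, len(repair_hours))
system_mttr = sum(repair_hours) / len(repair_hours)
system_availability = availability(system_mtbf, system_mttr)
thirty_day_survival = survival_probability(720.0, system_mtbf)

print(f"MTBF: {system_mtbf:.1f} h")
print(f"MTTR: {system_mttr:.1f} h")
print(f"Availability: {system_availability:.4%}")
print(f"P(no failure in 30 days): {thirty_day_survival:.3f}")
```

In practice a Reliability Engineer would compare figures like these against targets and against earlier periods to judge whether a reliability strategy is working, and would pick a failure model that fits the observed data rather than assuming a constant rate.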