Reliability Engineer

468 jobs found

RealtyBits
$54k - $90k estimated
remote

This is a key role at the company: we are getting ready to open up our platform to a wider audience and need reliable cloud infrastructure to handle the higher loads ahead.

As an infrastructure engineer, you will be responsible for the reliability, availability and security of the platform. You need to be well versed in building and maintaining cloud infrastructure on AWS using services such as Lambda, ECS (Fargate) and Aurora. Automation, performance metrics, logging, tracing and notifications that catch issues before they happen should be standard practice for you. You need to be able to manage bare-metal Linux servers, serverless architecture built on Lambda functions, and PostgreSQL databases on Aurora, among others. We manage the architecture with AWS CloudFormation (via AWS CDK), Serverless Framework tools and plain shell scripts.

You will play a critical role in setting up the services that guarantee uptime, performance and security. As the company and the team grow, you will lead the infrastructure work, so experience leading a team and managing people is important for this role.
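To make the uptime goal concrete, here is a minimal sketch of how an availability target translates into a downtime budget. The listing does not state a target, so the 99.9% figure, the 30-day period and the function names are illustrative assumptions only.

```python
def allowed_downtime_minutes(availability_target: float, period_days: int = 30) -> float:
    """Return the downtime budget, in minutes, for an availability target.

    availability_target is a fraction such as 0.999 (an assumed example,
    not a figure taken from the listing).
    """
    if not 0.0 < availability_target <= 1.0:
        raise ValueError("availability_target must be in (0, 1]")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability_target)


def budget_remaining_minutes(
    availability_target: float,
    downtime_so_far_minutes: float,
    period_days: int = 30,
) -> float:
    """Return how much of the downtime budget is left (negative if overspent)."""
    budget = allowed_downtime_minutes(availability_target, period_days)
    return budget - downtime_so_far_minutes
```

Under these assumptions, a 99.9% target over 30 days leaves roughly 43.2 minutes of allowed downtime, which is the kind of number that alerting thresholds and on-call practices are usually sized against.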

Experience with fintech and/or blockchain technologies is a plus but not a requirement.

What does a Reliability Engineer do?

A Reliability Engineer is a professional responsible for ensuring the reliability and availability of systems and equipment in an organization.

They use their knowledge of engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

  1. Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
  2. Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
  3. Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
  4. Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
  5. Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
  6. Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
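
The statistical modeling mentioned in item 1 can be illustrated with a small sketch: estimating mean time between failures (MTBF) from a failure history and using it to predict the chance of running failure-free for a given time. This assumes a constant failure rate (an exponential model), which is a simplification; the function names and the sample data in the usage note are hypothetical.

```python
import math
from typing import Sequence


def mean_time_between_failures(failure_times_hours: Sequence[float]) -> float:
    """Estimate MTBF from cumulative operating hours at which failures occurred.

    Expects at least two failure times in ascending order.
    """
    if len(failure_times_hours) < 2:
        raise ValueError("need at least two failure times")
    gaps = [b - a for a, b in zip(failure_times_hours, failure_times_hours[1:])]
    if any(gap <= 0 for gap in gaps):
        raise ValueError("failure times must be strictly increasing")
    return sum(gaps) / len(gaps)


def reliability(t_hours: float, mtbf_hours: float) -> float:
    """Probability of surviving t_hours without failure.

    Assumes a constant failure rate, so R(t) = exp(-t / MTBF).
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return math.exp(-t_hours / mtbf_hours)
```

For example, with hypothetical failures logged at 100, 300 and 600 operating hours, `mean_time_between_failures([100, 300, 600])` gives 250.0 hours, and `reliability(250, 250.0)` is about 0.37. Real equipment often shows wear-out or early-life failures, in which case a different model (such as a Weibull distribution) may fit better.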