Reliability Engineer

478 jobs found

web3.career is now part of the Bondex Ecosystem

Company             Location           Salary
Polygon Labs        LATAM              $84k - $100k
Zscaler             Remote             $115k - $165k
Layerzerolabs       Remote             $86k - $110k
Gsrmarkets          Remote             $80k - $100k
Zetachain           Remote             $157k - $171k
Parity              Remote             $80k - $120k
D3                  Remote             $112k - $156k
Binance             Dublin, Ireland
Autumn Compass      Sydney, Australia  $120k - $150k
Chainlink Labs      United States      $98k - $112k
Wormholefoundation  Remote             $112k - $156k
Kiln                Paris, France      $112k - $156k
Kiln                Paris, France      $84k - $112k
Zscaler             Remote             $140k - $200k
Zscaler             Remote             $130k - $131k

Polygon Labs
$84k - $100k estimated

About Polygon Labs

Polygon Labs is a global blockchain payments company building and operating infrastructure to move money instantly, reliably, and at internet scale, with the mission to move all money onchain. It is building the Polygon Open Money Stack, an open and integrated stack of services and technologies to instantly and reliably move money anywhere, and put it to work. Its infrastructure has facilitated trillions of dollars in onchain value transfer and supported millions of transactions daily for some of the globe's largest banks, fintechs, enterprises, and consumer applications.

Your Role

As a Site Reliability Engineer (SRE) at Polygon Labs, you will play a key role in helping operate and support the production infrastructure that powers the Polygon network. Working alongside experienced SREs and protocol engineers, you will gain hands-on exposure to running large-scale, distributed blockchain systems while learning best practices for reliability, observability, and incident response.

This is an ideal role for someone early in their SRE or infrastructure career who is curious about how production systems work, motivated to learn through real-world operational challenges, and excited to grow within a collaborative and mentorship-driven environment. Your work will directly contribute to the reliability and performance of critical public infrastructure used by developers and users globally.

Your Responsibilities

You will support the day-to-day reliability and operations of Polygon Labs’ production systems, with responsibilities that include:

  • Monitoring production systems, alerts, dashboards, and logs across Polygon networks, including Polygon PoS and the Agglayer.

  • Assisting with incident detection, triage, escalation, and resolution under the guidance of senior engineers.

  • Supporting on-call and operational coverage through structured rotations, with training and mentorship.

  • Following, maintaining, and improving runbooks and standard operating procedures.

  • Assisting with routine operational tasks such as service restarts, upgrades, and configuration changes.

  • Helping maintain and improve monitoring, logging, and alerting systems, including dashboards for network health, RPC performance, and node metrics.

  • Learning to improve alert signal quality and reduce operational noise.

  • Supporting cloud-based and containerized infrastructure, including nodes, RPC endpoints, and supporting services.

  • Collaborating with protocol, product, and cross-functional teams to understand production issues and user impact.

  • Participating in post-incident reviews and contributing to root-cause analysis documentation.

  • Continuously building knowledge of blockchain fundamentals, distributed systems, and networking.


What You'll Need

  • A foundational understanding of Linux systems, processes, and basic networking concepts.

  • Familiarity with at least one scripting or programming language, such as Python, Bash, or Go.

  • An interest in site reliability, monitoring, and operating production infrastructure.

  • Clear written and verbal communication skills, with a willingness to ask questions and learn.

  • The ability to remain calm, methodical, and responsive during incidents or operational events.

Preferred Qualifications

  • Exposure to cloud platforms such as AWS or GCP.

  • Familiarity with containerization or orchestration technologies, including Docker or Kubernetes.

  • Basic understanding of blockchain or Web3 concepts, such as nodes, RPC services, or validators.

  • Experience with monitoring and observability tools such as Grafana, Prometheus, Datadog, or ELK-based stacks.


Polygon Labs Perks

The goal of the Polygon Labs total rewards program is to support the health and well-being of you and your family. Our comprehensive compensation plan includes the following benefits for our full-time employees:

  • Remote-first global workforce

  • Industry-leading Medical, Dental, and Vision health insurance

  • 401k with a 3% company match

  • $1,500 Home Office Setup Allowance (lifetime max)

  • $200 Annual Book Allowance Program

  • $75 Monthly internet or phone reimbursement

  • Flexible Time Off

  • Company-issued laptop

  • Egg freezing, mental health, and employee wellness benefits

In certain countries, medical, dental, and vision coverage is fully paid for employees and their dependents. This is country- and plan-specific.

The 401k is for United States employees only.

Polygon Labs is committed to a diverse and inclusive workplace and is an equal opportunity employer. We do not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. Polygon Labs is committed to treating all people in a way that allows them to maintain their dignity and independence. We believe in integration and equal opportunity. Accommodations are available throughout the recruitment process and applicants with a disability may request to be accommodated throughout the recruitment process. We will work with all applicants to accommodate their individual accessibility needs.

If you think you have what it takes, but don't necessarily meet every single point on the job description, please still get in touch. We'd love to have a chat and see if you could be a great fit.


What does a Reliability Engineer do?

A Reliability Engineer is responsible for ensuring the reliability and availability of systems and equipment in an organization.

They use engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

  1. Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
  2. Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
  3. Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
  4. Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
  5. Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
  6. Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up to date with the latest technologies and best practices in the field and identify opportunities for improvement.
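
Two of the tasks above can be made concrete with a little arithmetic: summarizing failure history and prioritizing failure modes in an FMEA. The Python sketch below is only an illustration, not a method taken from any listing on this page. It assumes the common textbook definitions MTBF = operating hours / number of failures, availability = MTBF / (MTBF + MTTR), and risk priority number (RPN) = severity × occurrence × detection, each scored 1 to 10. The function names, the `FailureMode` class, and the sample failure modes and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of a simplified FMEA worksheet (scores are illustrative)."""
    name: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost always caught) to 10 (almost never caught)

    @property
    def rpn(self) -> int:
        # Risk priority number: higher values deserve attention first.
        return self.severity * self.occurrence * self.detection


def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures, in hours."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return operating_hours / failures


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from MTBF and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def rank_by_rpn(modes: list[FailureMode]) -> list[FailureMode]:
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


if __name__ == "__main__":
    # Hypothetical history: 990 hours running, 5 failures, 10 hours of repair.
    between = mtbf(operating_hours=990, failures=5)
    repair = 10 / 5
    print(f"MTBF: {between} h, availability: {availability(between, repair):.2%}")

    # Hypothetical failure modes with made-up scores.
    modes = [
        FailureMode("disk full", severity=6, occurrence=5, detection=2),
        FailureMode("certificate expiry", severity=8, occurrence=2, detection=7),
        FailureMode("node desync", severity=7, occurrence=4, detection=3),
    ]
    for mode in rank_by_rpn(modes):
        print(mode.name, mode.rpn)
```

In practice the inputs would come from maintenance or incident records rather than hard-coded values, and teams often weight or threshold RPN differently, so treat the ranking as a starting point for discussion rather than a verdict.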