Reliability Engineer

502 jobs found

web3.career is now part of the Bondex Ecosystem

Company           Location                          Salary
Alpaca            Remote                            $90k - $165k
Zscaler           Remote                            $119k - $170k
Kraken            London, United Kingdom            $119k - $131k
Kraken            London, United Kingdom            $105k - $120k
Limit Break       Tokyo, Japan                      $112k - $130k
Hyperbolic Labs   San Francisco, CA, United States  $103k - $120k
Chainlink Labs    United States                     $115k - $117k
Keyrock           Brussels, Belgium                 $133k - $135k
Zora              Remote                            $170k - $225k
asymmetric.re     Remote                            $124k - $150k
Chainlink Labs    Argentina                         $112k - $156k
Kraken            Remote                            $88k - $101k
Zinnia            Remote                            $126k - $127k
Gsrmarkets        Remote                            $80k - $100k
Douro Labs        North America                     $112k - $156k

Alpaca
$90k - $165k estimated
Remote

Who We Are:

Alpaca is a US-headquartered self-clearing broker-dealer and brokerage infrastructure for stocks, ETFs, options, crypto, fixed income, 24/5 trading, and more. Our recent Series D funding round brought our total investment to over $320 million, fueling our ambitious vision.

Amongst our subsidiaries, Alpaca is a licensed financial services company, serving hundreds of financial institutions across 40 countries with our institutional-grade APIs. This includes broker-dealers, investment advisors, wealth managers, hedge funds, and crypto exchanges, totalling over 9 million brokerage accounts.

Our global team is a diverse group of experienced engineers, traders, and brokerage professionals who are working to achieve our mission of opening financial services to everyone on the planet. We're deeply committed to open-source contributions and fostering a vibrant community, continuously enhancing our award-winning, developer-friendly API and the robust infrastructure behind it. Alpaca is proudly backed by top-tier global investors, including Portage Ventures, Spark Capital, Tribe Capital, Social Leverage, Horizons Ventures, Unbound, SBI Group, Derayah Financial, Elefund, and Y Combinator.

Our Team Members:

We're a dynamic team of 230+ globally distributed members who thrive working from our favorite places around the world, with teammates spanning the USA, Canada, Japan, Hungary, Nigeria, Brazil, the UK, and beyond!

We're searching for passionate individuals eager to contribute to Alpaca's rapid growth. If you align with our core values (Stay Curious, Have Empathy, and Be Accountable) and are ready to make a significant impact, we encourage you to apply.

Your Role

As an Operations Reliability Engineer, you will embed directly within brokerage operations functions to systematically eliminate manual work and replace it with durable, auditable software systems. You start by immersing yourself in operational workflows: observing, documenting, and deeply understanding processes end-to-end before designing solutions. Every recurring manual process is treated as a system defect, and every fix you ship is measured by its real-world impact on efficiency and reliability. You will work closely with licensed brokerage staff, domain experts, and platform engineers to build automations and tooling that allow Alpaca's operations to scale globally without scaling headcount linearly. The ideal candidate is equally comfortable shadowing an operational process and architecting the backend service that replaces it.

Things You Get To Do

Design, build, test, deploy, and monitor production automations and UIs that remove manual steps and reduce operation time.
Partner with frontend engineers to productize ops tooling so global teams can run functions with predictable staffing.
Execute operational procedures to surface painful manual processes prior to automation.
Instrument and report baseline and outcome metrics (MTTC, manual steps removed, queue sizes, ops satisfaction) and iterate based on measured impact.
Produce Platform Opportunity Briefs / RFCs for higher-level platform tooling and automations.
Collaborate with licensed BD leadership, Compliance, and Security to build auditable, safe automations with role-based access and clear runbooks.
Own the full lifecycle of the systems you build, including automated deployment (CI/CD with tools like ArgoCD and Terraform), proactive monitoring, on-call support rotations, and incident response, following a "you build it, you run it" philosophy.
Build systems with auditability, traceability, and data lineage as a first-class concern to ensure transparency for our auditors and regulators.

Who You Are (must-haves)

5+ years of professional software engineering experience, with a proven track record of shipping and operating complex, large-scale systems in production.
Strong business sense and understanding of operations.
Deep, hands-on expertise in Golang, including a strong command of its concurrency models (goroutines, channels), memory management, and standard library.
Proven track record of building user-facing features end-to-end with TypeScript/React.
Proficient with SQL and relational databases, preferably PostgreSQL.
Demonstrated ability to reason about human workflows as systems, not just software services.
Experience with observability, tracing, and continuous profiling.
Exceptional analytical and problem-solving skills, with the ability to deconstruct complex requirements into clear technical components and excellent communication skills for working in a cross-functional environment.
High ownership mindset with a bias toward durable, structural fixes over tactical patches.

Who You Might Be (nice-to-haves)

Knowledge of service-oriented architectures.
Experience with major cloud platforms (we primarily use GCP).
Financial market (exchange, broker-dealers, clearing, etc.) knowledge.
Experience with Docker and Kubernetes.
A passion for financial markets or the desire to learn.
Knowledge of Agile/Scrum methodologies.
Demonstrable experience in designing, building, and reasoning about distributed systems, including a strong understanding of microservices architecture and API design patterns (e.g., REST, gRPC).
Experience with capacity planning and benchmarking.

How We Take Care of You:

Competitive Salary & Stock Options
Health Benefits
New Hire Home-Office Setup: One-time USD $500
Monthly Stipend: USD $150 per month via a Brex Card

Alpaca is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse workforce.

What does a Reliability Engineer do?

A Reliability Engineer is responsible for ensuring the reliability and availability of an organization's systems and equipment.

They use engineering principles, statistical analysis, and data science to identify and mitigate risks, prevent failures, and optimize system performance.

Here are some of the typical tasks and responsibilities of a Reliability Engineer:

  1. Analyze data and perform statistical modeling: Reliability Engineers analyze data related to equipment performance, failure rates, and maintenance history to identify trends and patterns. They use statistical modeling to predict future failures and plan maintenance activities accordingly.
  2. Develop and implement reliability strategies: Reliability Engineers develop and implement strategies to improve the reliability and availability of equipment and systems. This may include performing root cause analysis, implementing preventive maintenance programs, and conducting failure mode and effects analysis (FMEA).
  3. Collaborate with other teams: Reliability Engineers collaborate with other teams such as operations, maintenance, and engineering to identify and address reliability issues. They may also work with suppliers to ensure the reliability of equipment and materials.
  4. Monitor and evaluate performance: Reliability Engineers monitor the performance of systems and equipment to identify areas for improvement. They use data to evaluate the effectiveness of reliability strategies and make adjustments as necessary.
  5. Provide technical support: Reliability Engineers provide technical support to other teams and stakeholders, answering questions and providing guidance on reliability-related issues.
  6. Continuously improve processes: Reliability Engineers are responsible for continuously improving reliability processes and methodologies. They stay up-to-date with the latest technologies and best practices in the field and identify opportunities for improvement.
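
As a rough illustration of the data analysis described in the first task, the sketch below computes three common reliability figures from a failure history: mean time between failures (MTBF), mean time to repair (MTTR), and steady-state availability, taken here as MTBF / (MTBF + MTTR). The Outage record and the sample numbers are hypothetical and do not come from any listing on this page; a real analysis would use the organization's own incident or maintenance data.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One failure: hours of operation before it, and hours spent repairing it."""
    uptime_hours: float
    repair_hours: float


def mtbf(outages):
    """Mean time between failures: total operating hours / number of failures."""
    return sum(o.uptime_hours for o in outages) / len(outages)


def mttr(outages):
    """Mean time to repair: total repair hours / number of failures."""
    return sum(o.repair_hours for o in outages) / len(outages)


def availability(outages):
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    up = mtbf(outages)
    return up / (up + mttr(outages))


# Hypothetical failure history, for illustration only.
history = [Outage(400.0, 2.0), Outage(350.0, 4.0), Outage(450.0, 6.0)]

print(f"MTBF: {mtbf(history):.1f} h")
print(f"MTTR: {mttr(history):.1f} h")
print(f"Availability: {availability(history):.2%}")
```

Tracking these figures before and after a change is one simple way to evaluate whether a reliability strategy had the intended effect.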