| Job Position | Company | Posted | Location | Salary | Tags |
|---|---|---|---|---|---|
| | Impossible Cloud | | Hamburg, Germany | $103k - $117k | |
| | Impossible Cloud | | Hamburg, Germany | $103k - $171k | |
| | BitGo | | Toronto, Canada | $180k - $240k | |
| | BitGo | | Palo Alto, CA, United States | $165k - $210k | |
| | Impossible Cloud | | Hamburg, Germany | $80k - $106k | |
| | DFINITY | | San Francisco, CA, United States | $175k - $235k | |
| | NUMBER GROUP | | Remote | | |
| | Ledger | | Paris, France | $90k - $144k | |
| | capital.com | | Lima, Peru | $84k - $150k | |
| | Coinbase | | Remote | $87k - $87k | |
| | Coinbase | | Remote | $57k - $60k | |
| | Bitso | | Latin America | $126k - $127k | |
| | ZetaChain | | Remote | | |
| | Fireblocks | | | $98k - $150k | |
| | ZetaChain | | Remote | | |

Site Reliability Engineer (SRE)
IN THIS ROLE YOU WILL
- Define and track Service Level Indicators (SLIs) to measure against Service Level Objectives (SLOs). Provide weekly reports on error rates, availability, and SLO compliance.
- Develop and maintain self-hosted observability to collect logs, metrics, and traces, ensuring end-to-end visibility of system performance and availability.
- Build and maintain dashboards and alerts to monitor system functionality, health and performance.
- Participate in on-call rotation, handle incidents and lead post-mortems.
- Assist teams with incident investigation and root cause analysis. Provide mentorship and consultation to other teams to prevent future incidents.
- Analyze and provide input for the scalability of services and core components. Ensure that systems can scale efficiently and handle increasing load.
- Build automation to improve site reliability and performance.
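
To make the SLI and SLO reporting in the first bullet concrete, here is a minimal Python sketch of a weekly availability report. It is only an illustration under stated assumptions: the `WindowStats` input shape, the function names, and the example 99.9% target are hypothetical and are not taken from this posting or from any specific tool. The availability SLI is the fraction of successful requests, and the error budget is the number of failures the SLO target allows in the window.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Request counts for one reporting window (hypothetical input shape)."""
    total_requests: int
    failed_requests: int


def error_rate(stats: WindowStats) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    if stats.total_requests == 0:
        return 0.0
    return stats.failed_requests / stats.total_requests


def error_budget_remaining(stats: WindowStats, slo_target: float) -> float:
    """Unspent share of the error budget; negative once the SLO is breached."""
    allowed_failures = (1 - slo_target) * stats.total_requests
    if allowed_failures == 0:
        # No failures are allowed (100% target or no traffic at all).
        return 1.0 if stats.failed_requests == 0 else 0.0
    return 1 - stats.failed_requests / allowed_failures


def weekly_report(stats: WindowStats, slo_target: float) -> dict:
    """Collect the figures a weekly SLO report would typically show."""
    rate = error_rate(stats)
    availability = 1 - rate  # availability SLI
    return {
        "availability": availability,
        "error_rate": rate,
        "slo_target": slo_target,
        "slo_met": availability >= slo_target,
        "error_budget_remaining": error_budget_remaining(stats, slo_target),
    }


if __name__ == "__main__":
    # Example figures are invented for illustration only.
    report = weekly_report(WindowStats(1_000_000, 250), slo_target=0.999)
    for key, value in report.items():
        print(f"{key}: {value}")
```

In practice the request counts would come from the metrics backend rather than being passed in by hand; the arithmetic stays the same.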
YOU COULD BE A GREAT FIT FOR THE ROLE IF YOU HAVE
- Experience with the Loki, Grafana, Tempo, and Mimir stack for log aggregation, metrics collection, and visualization.
- A proven track record in configuring alerting systems and on-call work.
- Strong foundations in working with Kubernetes and understanding how to operate services in Kubernetes environments.
- Ability to follow and create runbooks for incident management and ensure rapid response to critical issues.
- Experience generating reports on error rates, SLO compliance, and other key metrics to provide visibility into system health.
- Proactive and analytical approach to problem-solving with a focus on long-term solutions.
- Excellent collaboration skills to work across teams and assist other engineers with root cause analysis and incident resolution.
- Ability to mentor other team members and consult with teams on best practices for reliability and scalability.
Is Kubernetes in high demand?
Yes, Kubernetes is currently in high demand in the technology industry.
Kubernetes is an open-source container orchestration platform that is widely used for deploying, scaling, and managing containerized applications.
It provides a standardized way to automate the deployment of containerized applications across multiple hosts, with benefits such as reliability, scalability, and flexibility.
As more organizations move to containerized architectures, Kubernetes has become a critical component of their infrastructure.
It is used by companies of all sizes, from startups to large enterprises, and across industries including finance, healthcare, and e-commerce.
According to various job market and salary surveys, Kubernetes-related skills are in high demand, and Kubernetes-related positions are growing rapidly.
Kubernetes is often listed among the top skills sought by technology companies.
Overall, Kubernetes is a highly sought-after skill and is likely to remain in demand for the foreseeable future as more organizations adopt containerization and cloud-native architectures.