Heretic is hiring a
Web3 Senior Site Reliability Engineer (Heretic Stealth PortCo)

Compensation: $103k - $117k estimated

Location: San Francisco

2y ago

Stealth Mode Portfolio Company / Full-Time / Hybrid

Overview of Role

Heretic Ventures is seeking an experienced Site Reliability Engineer to join an early-stage generative AI business that the studio is launching.

The ideal candidate has built and operationalized cloud infrastructure from the ground up, including monitoring, alerting, and deployment orchestration. You are entrepreneurial and adapt to a fast-changing environment with limited time and resources. You get excited about designing and taking full ownership of our SaaS architecture. In addition, you have worked with complex systems at scale. You understand how to plan for growth and traffic patterns, and you know how to implement the right safety checks to mitigate the unexpected. Working with both web app and AI engineering teams, you will help define and implement the engineering tooling and processes needed to ensure our platform is performant, stable, and scalable.

This is a unique opportunity to help build a billion-dollar company from the ground up while learning from successful repeat entrepreneurs and a team of powerful and experienced mentors and advisors.

This is a hybrid role with the expectation of partial in-person work in our sunny Presidio, SF office. The position is compensated with salary, benefits, and equity.

About Heretic

Heretic Ventures is a San Francisco-based venture studio ideating and launching new businesses in the creator economy, including those that capitalize on AI/ML technology. Heretic is run by Managing Partner Mariam Naficy, who founded and built the pioneering internet companies Minted and Eve.com. Heretic is backed by household names in Silicon Valley (investors and entrepreneurs), who serve as the studio’s advisors, both in selecting companies and in guiding them.

Responsibilities

Build and extend tooling for end-to-end ML model deployment and lifecycle management
Set up, configure, and connect cloud infrastructure services to serve as the foundation of our platform
Automate deployment orchestration, building a fast and maintainable CI/CD pipeline for our web applications
Hook up real-time monitoring and alerting for all parts of the web platform, enabling engineering teams to respond quickly to incidents
Build and maintain an analytics pipeline, connecting data sources to the data warehouse, then from the data warehouse to the reporting platform and back to model training
Collaborate with cross-functional teams to deploy and maintain AI models in production environments, ensuring scalability, reliability, efficiency and robustness
Orchestrate model serving to accommodate our unique infrastructure in a scalable manner
Configure and maintain Kubernetes clusters on Ubuntu
Maintain backend planning and optimize GPU capacity continuously

Qualifications

Bachelor's or Master's degree in Computer Science, a related field, or equivalent work experience
5+ years of professional experience as a DevOps, TechOps, or SRE engineer
Extensive experience with setting up IaaS cloud platforms (GCP preferred)
Experience scaling infrastructure for consumer-facing web applications
Proven experience in working with and scaling GPUs
Proficiency in containerization technologies, especially Docker and Kubernetes
Proficient in Python and creating scripts to automate pipelines and processes
Extensive Linux troubleshooting experience
Excellent problem-solving and analytical thinking skills, with a strong attention to detail
Effective verbal and written communication skills
Comfortable working in a dynamic, fast-paced, and collaborative environment

Nice to Haves

Marketplace and/or e-commerce experience
Experience with deploying AI models in cloud-based environments (Diffusion models preferred)
Experience managing Triton inference servers
Experience with popular machine learning libraries (e.g., TensorFlow, PyTorch, Spark)

This job is closed
