OpenAI is hiring a
Web3 Research Scientist, Post-Training Core Algorithms

Compensation: $90k - $90k estimated

Location: San Francisco, California, United States

2y ago

About the Team

The Post-Training - Core Algorithms team is responsible for researching and developing the next generation of algorithms to power our RLHF (reinforcement learning from human feedback) stack. The algorithms we develop are used in the ChatGPT consumer product and the OpenAI API.

About the Role

As a Member of Technical Staff on our team, you will research and develop improvements to all components of our RLHF stack, including data collection, supervised finetuning, reward modeling, off- and on-policy learning, active learning, and evaluations. The ultimate test for our algorithms is how useful they are to our users, and we often deploy our algorithms into new ChatGPT models.

We’re looking for people who have an extensive background in reinforcement learning research, are able to iterate quickly, and are proficient at coding.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

Come up with improvements to RLHF
Prototype and evaluate these ideas
Scale up your innovations to ChatGPT scale

You might thrive in this role if you:

Love being on the cutting edge of RL and language model research
Can iterate fast on lots of ideas
Like doing research that has real-world impact

Apply Now:

This job is closed
