REQ_ID // RES-02Y

RL Research Scientist

NODE_LOC // SF / REMOTE_NODE
EMP_TYPE // Full-Time
DOMAIN // Hyper-Parallel Policy Search
CORE_STACK // PyTorch, JAX, Distributed RL

The Mission

General Purpose Labor cannot be programmed—it must be learned. As a Research Scientist at Iacon, you will scale our reinforcement learning pipelines to unprecedented throughput. We are moving beyond manual reward shaping; your mission is to perfect the Optimus Agent, a system capable of autonomous reward synthesis and policy discovery.

You will have access to massive compute grids. We do not constrain our researchers.

Core Responsibilities

  • Algorithm Scaling: Distribute PPO and SAC training across 4096+ parallel GPU nodes while minimizing communication overhead and keeping gradients synchronized across workers.
  • Autonomous Reward Tuning: Design the heuristics that allow our system to learn unconstrained reward functions via Large Behavior Models (LBMs) given only a sparse end-goal.
  • Manifold Optimization: Engineer mathematical bounds to keep the RL agent's exploration strictly within stable physical manifolds, preventing explosive failure modes in Sim-to-Real.
  • State Space Generalization: Formulate representations that generalize seamlessly across completely different robotic morphologies (bipeds, quadrupeds, swarms).

Qualifications

  • Unparalleled expertise in Deep Reinforcement Learning (PPO, SAC, DDPG).
  • Experience contributing to, or authoring, foundational papers on distributed ML.
  • Extreme proficiency in PyTorch, JAX, and CUDA kernel development.
  • Deep understanding of numerical optimization and heuristic search spaces.

The Iacon Standard

We expect researchers to deploy code. Theory is useless without physical execution. Join the grid.
