REQ_ID // RES-02Y

RL Research Scientist

NODE_LOC // SF / REMOTE_NODE

EMP_TYPE // Full-Time

DOMAIN // Hyper-Parallel Policy Search

CORE_STACK // PyTorch, JAX, Distributed RL

The Mission

General Purpose Labor cannot be programmed—it must be learned. As a Research Scientist at Iacon, you will scale our reinforcement learning pipelines to unprecedented throughput. We are moving beyond manual reward shaping; your mission is to perfect the Optimus Agent, a system capable of autonomous reward synthesis and policy discovery.

You will have access to massive compute grids. We do not constrain our researchers.

Core Responsibilities

Algorithm Scaling: Distribute PPO and SAC training across 4096+ parallel GPU nodes, keeping communication overhead minimal and gradient synchronization consistent across every worker.
Autonomous Reward Tuning: Design the heuristics that allow our system to learn unconstrained reward functions via Large Behavior Models (LBMs) given only a sparse end-goal.
Manifold Optimization: Engineer mathematical bounds to keep the RL agent's exploration strictly within stable physical manifolds, preventing explosive failure modes during sim-to-real transfer.
State Space Generalization: Formulate representations that generalize seamlessly across completely different robotic morphologies (bipeds, quadrupeds, swarms).

Qualifications

Unparalleled expertise in Deep Reinforcement Learning (PPO, SAC, DDPG).
Experience contributing to, or authoring, foundational papers on distributed ML.
Extreme proficiency in PyTorch, JAX, and CUDA kernel development.
Deep understanding of numerical optimization and heuristic search spaces.

The Iacon Standard

We expect researchers to deploy code. Theory is useless without physical execution. Join the grid.
