github.com/FareedKhan-dev/train-llm-from-scratch / train_ppo.py
File
train_ppo.py
scripts/train_ppo.py
Source
from the content-addressed store, hash-verified
"""
PPO RLHF on GSM8K (the classic InstructGPT recipe), from scratch.

Per iteration: roll out completions with the current policy, score them (verifiable GSM8K
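The docstring excerpt is cut off at the scoring step, so how train_ppo.py actually scores completions is not shown here. As a rough illustration of what a verifiable score on GSM8K could look like, the sketch below compares the final number in a completion against the final number in the reference answer, relying on the GSM8K convention that reference solutions end with `#### <answer>`. The names `extract_final_number` and `gsm8k_reward` are hypothetical and are not taken from the script. The 0/1 reward is likewise an assumption.

```python
import re
from typing import Optional

# Hypothetical helpers for illustration only; not taken from train_ppo.py.
# Matches integers and decimals, allowing thousands separators (e.g. 1,234.5).
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[str]:
    """Return the last number in `text`, looking only after '####' if present."""
    tail = text.split("####")[-1] if "####" in text else text
    matches = _NUMBER.findall(tail)
    if not matches:
        return None
    # Drop thousands separators so "1,234" and "1234" compare equal.
    return matches[-1].replace(",", "")


def gsm8k_reward(completion: str, reference: str) -> float:
    """Assumed binary reward: 1.0 if the final numbers agree, else 0.0."""
    pred = extract_final_number(completion)
    gold = extract_final_number(reference)
    if pred is None or gold is None:
        return 0.0
    try:
        return 1.0 if float(pred) == float(gold) else 0.0
    except ValueError:
        return 0.0


if __name__ == "__main__":
    reference = "Natalia sold 48/2 = 24 clips in May.\n#### 72"
    print(gsm8k_reward("48 + 24 = 72 clips in total. #### 72", reference))
    print(gsm8k_reward("I think the answer is 70", reference))
```

A binary exact-match score like this needs no learned reward model, which is what makes it verifiable. A real training script might add format checks or partial credit on top.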
Callers
nothing calls this directly
Calls (1)
main · Function · 0.70
Tested by
no test coverage detected