| 1 | """ |
| 2 | Vanilla Policy Gradient(VPG or REINFORCE) |
| 3 | ----------------------------------------- |
| 4 | The policy gradient algorithm works by updating policy parameters via stochastic gradient ascent on policy performance. |
nothing calls this directly
no test coverage detected