hub / github.com/tensorlayer/TensorLayer / get_action

Method get_action

examples/reinforcement_learning/tutorial_TRPO.py:213–235

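get action
:param state: state input
:param greedy: if True, use the mean action instead of sampling from the policy
:return: pi, v, logp_pi, mean, log_std (in the code, the first value is the clipped action, `action[0]`)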

(self, state, greedy=False)

Source from the content-addressed store, hash-verified

211         self.action_bound = action_bound
212
213     def get_action(self, state, greedy=False):
214         """
215         get action
216         :param state: state input
217         :param greedy: get action greedy or not
218         :return: pi, v, logp_pi, mean, log_std
219         """
220         state = np.array([state], np.float32)
221         mean = self.actor(state)
222         log_std = tf.convert_to_tensor(self.actor.log_std)
223         std = tf.exp(log_std)
224         std = tf.ones_like(mean) * std
225         pi = tfp.distributions.Normal(mean, std)
226
227         if greedy:
228             action = mean
229         else:
230             action = pi.sample()
231         action = np.clip(action, -self.action_bound, self.action_bound)
232         logp_pi = pi.log_prob(action)
233
234         value = self.critic(state)
235         return action[0], value, logp_pi, mean, log_std
236
237     def pi_loss(self, states, actions, adv, old_log_prob):
238         """
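
The sampling logic in the listing can be illustrated without TensorFlow. The following is a minimal NumPy sketch, not the tutorial's implementation: the helper names `gaussian_log_prob` and `sample_action` and the `rng` parameter are hypothetical, and the actor and critic networks are replaced by a `mean` array passed in directly. It mirrors the same steps: broadcast the standard deviation to the shape of the mean, pick the mean (greedy) or a Gaussian sample, clip to the action bound, then evaluate the log-probability at the clipped action.

```python
import numpy as np


def gaussian_log_prob(action, mean, log_std):
    """Element-wise log density of Normal(mean, exp(log_std)) at `action`."""
    std = np.exp(log_std)
    return -0.5 * ((action - mean) / std) ** 2 - log_std - 0.5 * np.log(2.0 * np.pi)


def sample_action(mean, log_std, action_bound, greedy=False, rng=None):
    """Hypothetical stand-in for the sampling part of get_action.

    `mean` plays the role of the actor output; `log_std` is a per-dimension
    log standard deviation, as in the listing.
    """
    rng = np.random.default_rng() if rng is None else rng
    mean = np.asarray(mean, dtype=np.float32)
    # Broadcast std to the shape of mean, like tf.ones_like(mean) * std.
    std = np.ones_like(mean) * np.exp(log_std)
    # Greedy mode uses the distribution mean; otherwise draw a sample.
    action = mean if greedy else rng.normal(mean, std)
    # Clip to the valid range before computing the log-probability,
    # in the same order as the listing.
    action = np.clip(action, -action_bound, action_bound)
    logp = gaussian_log_prob(action, mean, np.log(std))
    return action, logp


# Usage: a batch of one state with a two-dimensional action.
action, logp = sample_action(
    mean=[[0.5, -3.0]], log_std=np.zeros(2), action_bound=2.0, greedy=True
)
```

Note that the log-probability is evaluated at the clipped action, so whenever clipping changes the value it is no longer the log-density of the raw sample. The sketch keeps that behaviour because the listing does the same.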

Callers 1

tutorial_TRPO.py · File · 0.45

Calls 1

sample · Method · 0.45

Tested by

no test coverage detected