hub / github.com/tensorlayer/TensorLayer / gradient

Method gradient

examples/reinforcement_learning/tutorial_TRPO.py:253–267

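Computes the policy (pi) gradients: the gradient of the policy loss with respect to the actor's trainable weights, flattened into a single vector.

:param states: state batch
:param actions: actions batch
:param adv: advantage batch
:param old_log_prob: old log probability batch
:return: the flattened gradient, together with the loss (the source docstring lists only the gradient)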

(self, states, actions, adv, old_log_prob)

Source from the content-addressed store, hash-verified

def gradient(self, states, actions, adv, old_log_prob):
    """
    pi gradients
    :param states: state batch
    :param actions: actions batch
    :param adv: advantage batch
    :param old_log_prob: old log probability batch
    :return: gradient
    """
    pi_params = self.actor.trainable_weights
    with tf.GradientTape() as tape:
        loss = self.pi_loss(states, actions, adv, old_log_prob)
    grad = tape.gradient(loss, pi_params)
    gradient = self._flat_concat(grad)
    return gradient, loss
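The method returns one flat vector rather than a list with one gradient per trainable weight. A minimal pure-Python sketch of that flattening step follows, assuming `_flat_concat` reshapes each per-variable gradient to 1-D in row-major order and concatenates the results (its body is not shown on this page). The name `flat_concat` and the toy values are hypothetical, and nested lists stand in for tensors.

```python
def flat_concat(grads):
    """Flatten each (possibly nested) gradient and join them into one list.

    Hypothetical stand-in for the `_flat_concat` helper; nested lists are
    walked in row-major order, which matches reshaping a tensor to 1-D.
    """
    flat = []

    def _walk(value):
        # Recurse into nested lists/tuples; append scalars as floats.
        if isinstance(value, (list, tuple)):
            for item in value:
                _walk(item)
        else:
            flat.append(float(value))

    for g in grads:
        _walk(g)
    return flat


# Toy per-variable gradients: a 2x2 weight matrix and a length-2 bias.
grads = [[[0.1, 0.2], [0.3, 0.4]], [0.5, 0.6]]
flat = flat_concat(grads)
print(flat)  # one vector with 4 + 2 = 6 entries
```

A single vector is convenient when later steps treat all policy parameters as one point in parameter space, as trust-region methods typically do.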

Callers 15

update (Method) · 0.95

_train_step (Function) · 0.80

train_step (Function) · 0.80

test_basic_simplernn (Method) · 0.80

test_basic_simplernn_class (Method) · 0.80

test_basic_simplernn_dynamic (Method) · 0.80

test_basic_simplernn_dynamic_class (Method) · 0.80

Calls 2

pi_loss (Method) · 0.95

_flat_concat (Method) · 0.95
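The loss that `gradient` differentiates comes from `pi_loss`, whose body is not shown on this page. As a rough illustration only, the sketch below assumes it is the negated standard TRPO surrogate objective: the mean of the probability ratio `exp(log_prob - old_log_prob)` times the advantage. The function name `surrogate_loss` and the sample numbers are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def surrogate_loss(log_prob, old_log_prob, adv):
    """Negated surrogate objective (assumed form, not taken from the source)."""
    # Probability ratio between the current and the old policy, per sample.
    ratios = [math.exp(lp - olp) for lp, olp in zip(log_prob, old_log_prob)]
    # Surrogate objective: mean of ratio * advantage over the batch.
    surr = sum(r * a for r, a in zip(ratios, adv)) / len(adv)
    # Negate so that minimizing the loss maximizes the surrogate.
    return -surr


# With identical old and new log-probabilities every ratio is 1,
# so the loss is minus the mean advantage.
loss = surrogate_loss([-1.0, -2.0], [-1.0, -2.0], [1.0, 3.0])
print(loss)  # -2.0
```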

Tested by 15

test_basic_simplernn (Method) · 0.64

test_basic_simplernn_class (Method) · 0.64

test_basic_simplernn_dynamic (Method) · 0.64

test_basic_simplernn_dynamic_class (Method) · 0.64

test_basic_simplernn_dynamic_2 (Method) · 0.64

test_basic_simplernn_dynamic_3 (Method) · 0.64

test_basic_lstmrnn (Method) · 0.64

test_basic_lstmrnn_class (Method) · 0.64

test_basic_grurnn (Method) · 0.64

test_basic_grurnn_class (Method) · 0.64

test_basic_birnn_simplernncell (Method) · 0.64

test_basic_birnn_lstmcell (Method) · 0.64