hub / github.com/lisa-lab/DeepLearningTutorials / rmsprop

Function rmsprop

code/lstm.py:302–364

A variant of SGD that scales the step size by a running average of the recent step norms.

(lr, tparams, grads, x, mask, y, cost)
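Read off the update expressions in the source listing, the rule can be written per parameter as follows. This is a sketch, not an authoritative statement of the algorithm: the symbols are labels introduced here (g_t for the gradient, r_t and s_t for the buffers named running_grads and running_grads2, u_t for updir, θ for the parameter), and it assumes the running averages are refreshed by f_grad_shared before the parameter step is evaluated.

```latex
\begin{aligned}
r_t      &= 0.95\, r_{t-1} + 0.05\, g_t \\
s_t      &= 0.95\, s_{t-1} + 0.05\, g_t^{2} \\
u_t      &= 0.9\, u_{t-1} - 10^{-4}\, \frac{g_t}{\sqrt{s_t - r_t^{2} + 10^{-4}}} \\
\theta_t &= \theta_{t-1} + u_t
\end{aligned}
```

The term s_t - r_t^2 is a running estimate of the gradient variance, so each step is divided by an estimated standard deviation. In the lines shown, the step size is the hard-coded 1e-4; lr does not appear in these expressions.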

Source from the content-addressed store, hash-verified

def rmsprop(lr, tparams, grads, x, mask, y, cost):
    """
    A variant of SGD that scales the step size by a running average of the
    recent step norms.

    Parameters
    ----------
    lr : Theano SharedVariable
        Initial learning rate
    tparams : Theano SharedVariable
        Model parameters
    grads : Theano variable
        Gradients of cost w.r.t. the parameters
    x : Theano variable
        Model inputs
    mask : Theano variable
        Sequence mask
    y : Theano variable
        Targets
    cost : Theano variable
        Objective function to minimize

    Notes
    -----
    For more information, see [Hint2014]_.

    .. [Hint2014] Geoff Hinton, *Neural Networks for Machine Learning*,
       lecture 6a,
       http://cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
    """

    # Zero-initialised shared buffers, one per parameter: the latest gradient
    # and running averages of the gradient and of its square.
    zipped_grads = [theano.shared(p.get_value() * numpy_floatX(0.),
                                  name='%s_grad' % k)
                    for k, p in tparams.items()]
    running_grads = [theano.shared(p.get_value() * numpy_floatX(0.),
                                   name='%s_rgrad' % k)
                     for k, p in tparams.items()]
    running_grads2 = [theano.shared(p.get_value() * numpy_floatX(0.),
                                    name='%s_rgrad2' % k)
                      for k, p in tparams.items()]

    # Update pairs: store the gradient and refresh both running averages.
    zgup = [(zg, g) for zg, g in zip(zipped_grads, grads)]
    rgup = [(rg, 0.95 * rg + 0.05 * g) for rg, g in zip(running_grads, grads)]
    rg2up = [(rg2, 0.95 * rg2 + 0.05 * (g ** 2))
             for rg2, g in zip(running_grads2, grads)]

    # Computes the cost and applies the buffer updates above as a side effect.
    f_grad_shared = theano.function([x, mask, y], cost,
                                    updates=zgup + rgup + rg2up,
                                    name='rmsprop_f_grad_shared')

    # Momentum-smoothed update direction, scaled by the running estimate of
    # the gradient standard deviation.
    updir = [theano.shared(p.get_value() * numpy_floatX(0.),
                           name='%s_updir' % k)
             for k, p in tparams.items()]
    updir_new = [(ud, 0.9 * ud - 1e-4 * zg / tensor.sqrt(rg2 - rg ** 2 + 1e-4))
                 for ud, zg, rg, rg2 in zip(updir, zipped_grads, running_grads,
                                            running_grads2)]
    # Each parameter moves by the new update direction (udn[1]).
    param_up = [(p, p + udn[1])
                for p, udn in zip(tparams.values(), updir_new)]
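
To make the arithmetic in the listing easier to follow, here is a minimal NumPy sketch of the same per-parameter update, applied to a toy quadratic. It is an illustration under stated assumptions, not the tutorial's code: rmsprop_step and init_state are hypothetical helpers introduced here, plain arrays stand in for Theano shared variables, and the constants (0.95, 0.9, 1e-4) are copied from the expressions above.

```python
import numpy as np


def init_state(param):
    """Zero buffers matching the parameter's shape (hypothetical helper)."""
    zeros = np.zeros_like(param)
    return {"rg": zeros.copy(), "rg2": zeros.copy(), "updir": zeros.copy()}


def rmsprop_step(param, grad, state, decay=0.95, momentum=0.9,
                 step=1e-4, eps=1e-4):
    """One update for a single parameter array, using the listing's constants."""
    # Running averages of the gradient and of its elementwise square.
    state["rg"] = decay * state["rg"] + (1.0 - decay) * grad
    state["rg2"] = decay * state["rg2"] + (1.0 - decay) * grad ** 2
    # rg2 - rg**2 is a running variance estimate; eps keeps the root positive.
    scale = np.sqrt(state["rg2"] - state["rg"] ** 2 + eps)
    # Momentum-smoothed update direction, then move the parameter by it.
    state["updir"] = momentum * state["updir"] - step * grad / scale
    return param + state["updir"]


# Single step from zero buffers, with parameter and gradient both equal to 1.
one_step = rmsprop_step(np.array([1.0]), np.array([1.0]),
                        init_state(np.array([1.0])))

# Toy problem: minimize 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([1.0, -2.0, 3.0])
state = init_state(w)
start_norm = np.linalg.norm(w)
for _ in range(200):
    w = rmsprop_step(w, w, state)
end_norm = np.linalg.norm(w)
```

Because the buffers start at zero, the variance estimate is small at first and the early steps are short; the momentum term then accumulates them. The sketch folds into one function what the listing splits between f_grad_shared (buffer refresh) and the later parameter update.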

Callers

nothing calls this directly

Calls 1

numpy_floatX · Function · 0.85
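
The listing uses this helper only in the idiom p.get_value() * numpy_floatX(0.), which builds a zero buffer with the same shape as a parameter. The definition of numpy_floatX is not shown on this page, so the sketch below uses a hypothetical stand-in that casts to a fixed float dtype; FLOATX and the float32 choice are assumptions made for illustration.

```python
import numpy as np

FLOATX = "float32"  # assumption: stands in for the configured float type


def numpy_floatX(data):
    """Hypothetical stand-in: cast data to the configured float dtype."""
    return np.asarray(data, dtype=FLOATX)


# Pretend this array came from p.get_value() on a shared parameter.
p_value = np.ones((2, 3), dtype=FLOATX)

# Multiplying by a zero of the same dtype yields a zero buffer that matches
# the parameter in both shape and dtype.
zeros = p_value * numpy_floatX(0.)
```

Under these assumptions the result has shape (2, 3), dtype float32, and all entries equal to zero, which is what the gradient and running-average buffers need as their starting state.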

Tested by

no test coverage detected