Learning Rate in TensorFlow
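Learning rate (learning_rate): the size of each parameter update. If the learning rate is too large, the parameter being optimized oscillates around the minimum and does not converge; if it is too small, the parameter converges slowly. During training, parameters are updated in the direction in which the gradient of the loss function descends.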
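Suppose the loss function is loss = (w + 1)^2. The gradient is the derivative of the loss with respect to w: ∇ = 2w + 2. Take an initial parameter value of 5 and a learning rate of 0.3, so that each update sets w to w - 0.3 × (2w + 2).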
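The graph of the loss function loss = (w + 1)^2 is an upward-opening parabola.
As the graph shows, the loss reaches its minimum at the point (-1, 0), where the derivative of the loss is 0, which gives the final parameter w = -1.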
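To make the update rule concrete, here is a minimal sketch that applies gradient descent by hand, without TensorFlow. It assumes the starting value w = 5 and a learning rate of 0.3 (the value passed to the optimizer in the TensorFlow code that follows); the helper names `grad` and `descend` are illustrative and not part of any library.

```python
def grad(w):
    """Derivative of loss = (w + 1)^2 with respect to w."""
    return 2 * w + 2


def descend(w, learning_rate, steps):
    """Return the list of w values after each gradient-descent update."""
    history = []
    for _ in range(steps):
        # Move w against the gradient, scaled by the learning rate.
        w = w - learning_rate * grad(w)
        history.append(w)
    return history


if __name__ == "__main__":
    # Assumed values: w starts at 5, learning rate 0.3, four updates.
    for step, w in enumerate(descend(5.0, 0.3, 4)):
        print("step %d: w = %.6f, loss = %.6f" % (step, w, (w + 1) ** 2))
```

The first update gives 5 - 0.3 × 12 = 1.4 and the second gives 1.4 - 0.3 × 4.8 = -0.04, which should agree with the first two lines of the TensorFlow output.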
The code is as follows:
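#coding:utf-8
# Let the loss function be loss = (w + 1)^2, with w initialized to the constant 5.
# Backpropagation finds the optimal w, i.e. the w that gives the smallest loss.
# Note: this uses the TensorFlow 1.x graph API (tf.Session, tf.train).
import tensorflow as tf

# Define the parameter to optimize, with initial value 5
w = tf.Variable(tf.constant(5, dtype=tf.float32))
# Define the loss function
loss = tf.square(w + 1)
# Define the backpropagation method: gradient descent with learning rate 0.3
train_step = tf.train.GradientDescentOptimizer(0.3).minimize(loss)
# Create a session and train
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    for i in range(40):
        sess.run(train_step)
        w_val = sess.run(w)
        loss_val = sess.run(loss)
        # i is the zero-based index of the update that was just applied
        print("After %s steps: w is %f , loss is %f" % (i, w_val, loss_val))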
Running it prints:
After 0 steps: w is 1.400000 , loss is 5.759999
After 1 steps: w is -0.040000 , loss is 0.921600
After 2 steps: w is -0.616000 , loss is 0.147456
After 3 steps: w is -0.846400 , loss is 0.023593
After 4 steps: w is -0.938560 , loss is 0.003775
After 5 steps: w is -0.975424 , loss is 0.000604
After 6 steps: w is -0.990170 , loss is 0.000097
After 7 steps: w is -0.996068 , loss is 0.000015
After 8 steps: w is -0.998427 , loss is 0.000002
After 9 steps: w is -0.999371 , loss is 0.000000
After 10 steps: w is -0.999748 , loss is 0.000000
After 11 steps: w is -0.999899 , loss is 0.000000
After 12 steps: w is -0.999960 , loss is 0.000000
After 13 steps: w is -0.999984 , loss is 0.000000
After 14 steps: w is -0.999994 , loss is 0.000000
After 15 steps: w is -0.999997 , loss is 0.000000
After 16 steps: w is -0.999999 , loss is 0.000000
After 17 steps: w is -1.000000 , loss is 0.000000
After 18 steps: w is -1.000000 , loss is 0.000000
After 19 steps: w is -1.000000 , loss is 0.000000
After 20 steps: w is -1.000000 , loss is 0.000000
After 21 steps: w is -1.000000 , loss is 0.000000
After 22 steps: w is -1.000000 , loss is 0.000000
After 23 steps: w is -1.000000 , loss is 0.000000
After 24 steps: w is -1.000000 , loss is 0.000000
After 25 steps: w is -1.000000 , loss is 0.000000
After 26 steps: w is -1.000000 , loss is 0.000000
After 27 steps: w is -1.000000 , loss is 0.000000
After 28 steps: w is -1.000000 , loss is 0.000000
After 29 steps: w is -1.000000 , loss is 0.000000
After 30 steps: w is -1.000000 , loss is 0.000000
After 31 steps: w is -1.000000 , loss is 0.000000
After 32 steps: w is -1.000000 , loss is 0.000000
After 33 steps: w is -1.000000 , loss is 0.000000
After 34 steps: w is -1.000000 , loss is 0.000000
After 35 steps: w is -1.000000 , loss is 0.000000
After 36 steps: w is -1.000000 , loss is 0.000000
After 37 steps: w is -1.000000 , loss is 0.000000
After 38 steps: w is -1.000000 , loss is 0.000000
After 39 steps: w is -1.000000 , loss is 0.000000
As the results show, w approaches -1 as the loss decreases, and the model arrives at the optimal parameter w = -1.
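The opening claim about learning rates that are too large or too small can be checked on the same loss function. The sketch below is plain Python rather than TensorFlow, and the values 1.0 and 0.0001 are assumptions chosen only for illustration; `final_w` is a hypothetical helper name.

```python
def final_w(learning_rate, steps=40, w=5.0):
    """Run gradient descent on loss = (w + 1)^2 and return the final w."""
    for _ in range(steps):
        # The gradient of (w + 1)^2 is 2w + 2.
        w = w - learning_rate * (2 * w + 2)
    return w


if __name__ == "__main__":
    # Compare a too-large, a moderate, and a too-small learning rate.
    for lr in (1.0, 0.3, 0.0001):
        w = final_w(lr)
        print("learning_rate=%g: w = %.6f, loss = %.6f" % (lr, w, (w + 1) ** 2))
```

For this particular loss, each update multiplies the distance w + 1 by the factor 1 - 2 × learning_rate. With a learning rate of 1.0 that factor is -1, so w jumps between 5 and -7 and the loss never decreases. With 0.0001 the factor is 0.9998, so after 40 steps w has moved only slightly below 5.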