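# 0. Summary

1. What the ReLU function is and how to implement it in TensorFlow
2. Backpropagation
3. Deep neural networks in TensorFlow, including how to initialize them, define weights, and set the various hyperparameters
4. How to prevent overfitting, and how weight decay is applied in TensorFlow
5. Dropout, another way to prevent overfitting, and how it is applied in TensorFlow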

# 3. Number of Parameters

``````
number of parameters = size of W + size of b
                     = 28x28x10 + 10
                     = 7850
``````
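
As a quick check of the arithmetic above, here is a minimal sketch that counts the parameters of one fully connected layer. The helper name `count_parameters` is hypothetical; the 28x28 input and the 10 classes come from the calculation above.

```python
def count_parameters(n_inputs, n_outputs):
    """Return the number of trainable values in one dense layer."""
    n_weights = n_inputs * n_outputs  # one weight per (input, output) pair
    n_biases = n_outputs              # one bias per output unit
    return n_weights + n_biases


# 28x28 pixel images flattened to 784 inputs, 10 output classes
total = count_parameters(28 * 28, 10)
print(total)  # 7850
```

The same helper gives 3 x 5 + 5 = 20 for a layer with 3 features and 5 labels.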

``````
import tensorflow as tf

n_features = 3
n_labels = 5
# Weights start as small random values; biases start at zero
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))
bias = tf.Variable(tf.zeros(n_labels))
``````

# 7. 2-Layer Neural Network

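ReLU is a nonlinear function: y = x when x is greater than 0, and y = 0 otherwise. Its derivative is 1 for x > 0 and 0 for x < 0.

1. The first layer consists of the input x with its weights w and bias. The result passes through the ReLU function and is handed to the next layer.
2. The second layer consists of that intermediate result together with this layer's own weights w and bias. Its result is finally passed to an activation function such as softmax, which computes the probabilities.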
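
The two steps above can be sketched in plain Python, without TensorFlow. This is only an illustration under assumed toy values: the helper names (`relu`, `relu_derivative`, `dense`, `softmax`) and all numbers are hypothetical, not taken from the course code.

```python
import math


def relu(values):
    """Apply f(x) = max(0, x) to every element of a list."""
    return [max(0.0, v) for v in values]


def relu_derivative(values):
    """Slope of ReLU: 1 where the input is positive, 0 elsewhere."""
    return [1.0 if v > 0 else 0.0 for v in values]


def dense(inputs, weights, bias):
    """Compute inputs @ weights + bias for one example (weights: n_in rows)."""
    return [
        sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + bias[j]
        for j in range(len(bias))
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [v - max(logits) for v in logits]  # subtract max for stability
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy values: 2 inputs, 2 hidden units, 2 classes
x = [1.0, -2.0]
w1, b1 = [[0.5, -1.0], [-0.25, 0.5]], [0.0, 0.0]
w2, b2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]

hidden = relu(dense(x, w1, b1))                 # layer 1: linear step, then ReLU
probabilities = softmax(dense(hidden, w2, b2))  # layer 2: linear step, then softmax
print(hidden, probabilities)
```

The second hidden unit receives a negative pre-activation, so ReLU sets it to 0 and only the first unit contributes to the output layer.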

# 8. Quiz: TensorFlow ReLU

The ReLU function, f(x) = max(0, x), is also an activation function. In TensorFlow it is available as `tf.nn.relu()`, as in the following example:

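``````
# Hidden Layer with ReLU activation function
hidden_layer = tf.add(tf.matmul(features, hidden_weights), hidden_biases)
hidden_layer = tf.nn.relu(hidden_layer)

output = tf.add(tf.matmul(hidden_layer, output_weights), output_biases)
``````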

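1. `tf.nn.relu()` is applied to the hidden layer `hidden_layer`.
2. A new layer, the output layer, is added. Its input is the output of the previous `hidden_layer`, that is, the values after the nonlinear ReLU function.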

In this quiz, you’ll use TensorFlow’s ReLU function to turn the linear model below into a nonlinear model.

``````
import tensorflow as tf

output = None
hidden_layer_weights = [
    [0.1, 0.2, 0.4],
    [0.4, 0.6, 0.6],
    [0.5, 0.9, 0.1],
    [0.8, 0.2, 0.8]]
out_weights = [
    [0.1, 0.6],
    [0.2, 0.1],
    [0.7, 0.9]]

# Weights and biases
weights = [
    tf.Variable(hidden_layer_weights),
    tf.Variable(out_weights)]
biases = [
    tf.Variable(tf.zeros(3)),
    tf.Variable(tf.zeros(2))]

# Input
features = tf.Variable([[1.0, 2.0, 3.0, 4.0], [-1.0, -2.0, -3.0, -4.0], [11.0, 12.0, 13.0, 14.0]])
``````
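``````
# TODO: Create Model
# Hidden layer: linear step followed by ReLU
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
# Output layer: linear step on the ReLU activations
output = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])
``````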
``````
# TODO: save and print session results on a variable named "output"

init = tf.global_variables_initializer()

with tf.Session() as sess:
    # Initialize the variables, then evaluate the model
    sess.run(init)
    result = sess.run(output)
    print(result)
``````

``````
[[  5.11000013   8.44000053]
 [  0.           0.        ]
 [ 24.01000214  38.23999786]]
``````
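
The printed values can be cross-checked by hand. The sketch below repeats the quiz's forward pass in plain Python, using the weights and inputs from the quiz code; the helper names `matmul` and `relu` are hypothetical, and the biases are left out because they are all zero.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(matrix):
    """Apply max(0, x) element-wise."""
    return [[max(0.0, v) for v in row] for row in matrix]


hidden_layer_weights = [
    [0.1, 0.2, 0.4],
    [0.4, 0.6, 0.6],
    [0.5, 0.9, 0.1],
    [0.8, 0.2, 0.8]]
out_weights = [
    [0.1, 0.6],
    [0.2, 0.1],
    [0.7, 0.9]]
features = [
    [1.0, 2.0, 3.0, 4.0],
    [-1.0, -2.0, -3.0, -4.0],
    [11.0, 12.0, 13.0, 14.0]]

hidden = relu(matmul(features, hidden_layer_weights))  # negative rows become 0
output = matmul(hidden, out_weights)
for row in output:
    print([round(v, 2) for v in row])
```

The second input row is entirely negative, so every hidden unit is clipped to 0 by ReLU and the corresponding output row is zero, which agrees with the result above.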

# 12. Deep Neural Network in TensorFlow

## 12.1 Example Code

``````
from tensorflow.examples.tutorials.mnist import input_data

import tensorflow as tf

# Load MNIST; the data directory "." is an assumption, reshape=False keeps images as 28x28x1
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 128  # Decrease batch size if you don't have enough memory
display_step = 1

n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

n_hidden_layer = 256 # layer number of features

# Store layers weight & bias
weights = {
    'hidden_layer': tf.Variable(tf.random_normal([n_input, n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_hidden_layer, n_classes]))
}
biases = {
    'hidden_layer': tf.Variable(tf.random_normal([n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# tf Graph input
x = tf.placeholder("float", [None, 28, 28, 1])
y = tf.placeholder("float", [None, n_classes])

x_flat = tf.reshape(x, [-1, n_input])

# Hidden layer with RELU activation
layer_1 = tf.add(tf.matmul(x_flat, weights['hidden_layer']), biases['hidden_layer'])
layer_1 = tf.nn.relu(layer_1)
# Output layer with linear activation
logits = tf.matmul(layer_1, weights['out']) + biases['out']

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
# The optimizer class is an assumption; these notes only preserve ".minimize(cost)"
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        # Display logs per epoch step
        if epoch % display_step == 0:
            c = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", \
                "{:.9f}".format(c))
    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    # Decrease test_size if you don't have enough memory
    test_size = 256
    print("Accuracy:", accuracy.eval({x: mnist.test.images[:test_size], y: mnist.test.labels[:test_size]}))
``````

## 12.2 Code Walkthrough

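• Use the MNIST data set that ships with TensorFlow, which already handles batching and one-hot encoding:
``````
from tensorflow.examples.tutorials.mnist import input_data
# The data directory "." is an assumption; reshape=False keeps images as 28x28x1
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)
``````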
• Learning Parameters

``````
import tensorflow as tf

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 128  # Decrease batch size if you don't have enough memory
display_step = 1

n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)
``````
• Hidden Layer Parameters

``````
n_hidden_layer = 256 # layer number of features
``````
• Weights and Biases

``````
# Store layers weight & bias
weights = {
    'hidden_layer': tf.Variable(tf.random_normal([n_input, n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_hidden_layer, n_classes]))
}
biases = {
    'hidden_layer': tf.Variable(tf.random_normal([n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
``````
• Input
``````
# tf Graph input
x = tf.placeholder("float", [None, 28, 28, 1])
y = tf.placeholder("float", [None, n_classes])

x_flat = tf.reshape(x, [-1, n_input])
``````

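• Multilayer Perceptron

``````
# Hidden layer with RELU activation
layer_1 = tf.add(tf.matmul(x_flat, weights['hidden_layer']),\
    biases['hidden_layer'])
layer_1 = tf.nn.relu(layer_1)
# Output layer with linear activation
logits = tf.matmul(layer_1, weights['out']) + biases['out']
``````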
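• Optimizer
``````
# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
# The optimizer class is an assumption; only ".minimize(cost)" survives in these notes
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)
``````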
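
To make the cost above more concrete, here is a minimal plain-Python sketch of what a softmax cross-entropy averaged over a batch computes. It is an illustration rather than TensorFlow's implementation; the function names and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [v - max(logits) for v in logits]  # subtract max for stability
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot_label):
    """Cross-entropy between softmax(logits) and a one-hot label."""
    probabilities = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probabilities))


def mean_cost(batch_logits, batch_labels):
    """Average the per-example losses, the role tf.reduce_mean plays above."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# Hypothetical logits for two examples and three classes
batch_logits = [[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]
batch_labels = [[1, 0, 0], [0, 1, 0]]
print(mean_cost(batch_logits, batch_labels))
```

For the second example all logits are equal, so each class gets probability 1/3 and the loss is ln 3. The optimizer then adjusts the weights to push this average down.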
• Session

The MNIST library in TensorFlow can serve the data set in batches: the `mnist.train.next_batch()` function returns a subset of the training data.

``````
# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
``````
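
The number of `next_batch()` calls per epoch depends on how the division is rounded. The loop above uses `int(...)`, while the training code in section 14.3 uses `math.ceil(...)`. The sketch below contrasts the two; the helper name `batches_per_epoch` is hypothetical and the training-set size of 55000 is an assumption used only for illustration.

```python
import math


def batches_per_epoch(num_examples, batch_size, keep_partial=False):
    """Number of batches drawn in one pass over the training data."""
    if keep_partial:
        return math.ceil(num_examples / batch_size)  # count the last short batch
    return int(num_examples / batch_size)             # ignore the remainder


# 55000 is an assumed training-set size used only for illustration
print(batches_per_epoch(55000, 128))                     # 429
print(batches_per_epoch(55000, 128, keep_partial=True))  # 430
```

With a batch size that does not divide the data evenly, the two roundings differ by exactly one batch per epoch.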

# 14. Save and Restore TensorFlow Models

## 14.1 Saving Variables

``````
import tensorflow as tf

# The file path to save the data
save_file = './model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()

with tf.Session() as sess:
    # Initialize all the Variables
    sess.run(tf.global_variables_initializer())

    # Show the values of weights and bias
    print('Weights:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))

    # Save the model
    saver.save(sess, save_file)
``````
``````
Weights:
[[ 0.74129212  1.16585362  0.18823986]
 [ 0.84469903 -0.30504367 -0.9390443 ]]
Bias:
[-0.0300845   0.12080105  0.38587224]
``````

## 14.2 Loading Variables

``````
# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()

with tf.Session() as sess:
    # Load the weights and bias
    saver.restore(sess, save_file)

    # Show the values of weights and bias
    print('Weight:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))
``````
``````
INFO:tensorflow:Restoring parameters from ./model.ckpt
Weight:
[[ 0.74129212  1.16585362  0.18823986]
 [ 0.84469903 -0.30504367 -0.9390443 ]]
Bias:
[-0.0300845   0.12080105  0.38587224]
``````

``````
with tf.Session() as sess:
    # Skip the restore and initialize the variables instead
    # saver.restore(sess, save_file)
    sess.run(tf.global_variables_initializer())
    # Show the values of weights and bias
    print('Weight:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))
``````

``````
Weight:
[[ 0.51144725  0.18832855  0.00272263]
 [ 0.05852098 -0.44724768 -0.96787697]]
Bias:
[-0.05925143 -1.33713555  0.32981932]
``````

## 14.3 Save a Trained Model

• First, build a model
``````
# Remove previous Tensors and Operations
tf.reset_default_graph()

from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

learning_rate = 0.001
n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

# Import MNIST data (the data directory '.' is an assumption)
mnist = input_data.read_data_sets('.', one_hot=True)

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
# The optimizer class is an assumption; only ".minimize(cost)" survives in these notes
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
``````
• Then train the model and save its weights:
``````
import math

save_file = './train_model.ckpt'
batch_size = 128
n_epochs = 100

saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Training cycle
    for epoch in range(n_epochs):
        total_batch = math.ceil(mnist.train.num_examples / batch_size)

        # Loop over all batches
        for i in range(total_batch):
            batch_features, batch_labels = mnist.train.next_batch(batch_size)
            sess.run(
                optimizer,
                feed_dict={features: batch_features, labels: batch_labels})

        # Print status for every 10 epochs
        if epoch % 10 == 0:
            valid_accuracy = sess.run(
                accuracy,
                feed_dict={
                    features: mnist.validation.images,
                    labels: mnist.validation.labels})
            print('Epoch {:<3} - Validation Accuracy: {}'.format(
                epoch,
                valid_accuracy))

    # Save the model
    saver.save(sess, save_file)
    print('Trained Model Saved.')
``````

## 14.4 Load a Trained Model

``````
saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    saver.restore(sess, save_file)

    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: mnist.test.images, labels: mnist.test.labels})

print('Test Accuracy: {}'.format(test_accuracy))
``````

# 15. Fine tuning

## 15.1 Naming Error

``````
import tensorflow as tf

# Remove the previous weights and bias
tf.reset_default_graph()

save_file = 'model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Save Weights: {}'.format(weights.name))
print('Save Bias: {}'.format(bias.name))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias (created in the opposite order)
bias = tf.Variable(tf.truncated_normal([3]))
weights = tf.Variable(tf.truncated_normal([2, 3]))

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Load Weights: {}'.format(weights.name))
print('Load Bias: {}'.format(bias.name))

with tf.Session() as sess:
    # Load the weights and bias - ERROR
    saver.restore(sess, save_file)
``````

``````
Assign requires shapes of both tensors to match
``````
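
The error comes from how unnamed variables are matched. TensorFlow names them in creation order (`Variable`, `Variable_1`, ...), and restoring matches checkpoint entries by name, so creating `bias` before `weights` pairs each name with the wrong shape. The toy model below imitates that behavior in plain Python. It is a sketch for illustration, not TensorFlow's implementation, and every function in it is hypothetical.

```python
def auto_name(existing_names, base="Variable"):
    """Mimic default naming: Variable, Variable_1, Variable_2, ..."""
    if base not in existing_names:
        return base
    suffix = 1
    while "{}_{}".format(base, suffix) in existing_names:
        suffix += 1
    return "{}_{}".format(base, suffix)


def build_graph(shapes, names=None):
    """Return {name: shape} for variables created in the given order."""
    graph = {}
    for index, shape in enumerate(shapes):
        name = names[index] if names else auto_name(graph)
        graph[name] = shape
    return graph


def restore(checkpoint, graph):
    """Match variables by name and fail when the shapes disagree."""
    for name, shape in graph.items():
        if checkpoint[name] != shape:
            raise ValueError("shape mismatch for {}: saved {}, graph has {}".format(
                name, checkpoint[name], shape))
    return True


# Saved with weights first, loaded with bias first: names pair with the wrong shapes
saved = build_graph([(2, 3), (3,)])
loaded = build_graph([(3,), (2, 3)])
try:
    restore(saved, loaded)
except ValueError as error:
    print(error)

# Explicit names make the creation order irrelevant
saved_named = build_graph([(2, 3), (3,)], names=["weights_0", "bias_0"])
loaded_named = build_graph([(3,), (2, 3)], names=["bias_0", "weights_0"])
print(restore(saved_named, loaded_named))  # True
```

Giving each variable an explicit `name`, as the next listing does, removes the dependence on creation order.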

``````
import tensorflow as tf

tf.reset_default_graph()

save_file = 'model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]), name='weights_0')
bias = tf.Variable(tf.truncated_normal([3]), name='bias_0')

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Save Weights: {}'.format(weights.name))
print('Save Bias: {}'.format(bias.name))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias (order no longer matters, names match)
bias = tf.Variable(tf.truncated_normal([3]), name='bias_0')
weights = tf.Variable(tf.truncated_normal([2, 3]), name='weights_0')

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Load Weights: {}'.format(weights.name))
print('Load Bias: {}'.format(bias.name))

with tf.Session() as sess:
    # Load the weights and bias - No Error
    saver.restore(sess, save_file)
``````

# 18. Regularization

• Early Termination: stop training as soon as performance stops improving
• L2 Regularization: penalizes large weights; this is the weight decay referred to in these notes
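
Both ideas can be sketched in a few lines of plain Python. The names `l2_penalty` and `should_stop`, the `beta` and `patience` parameters, and all numbers are hypothetical. The factor of 1/2 follows the convention of `tf.nn.l2_loss`, which computes `sum(t ** 2) / 2`.

```python
def l2_penalty(weights, beta):
    """beta * sum(w^2) / 2, the term added to the loss for L2 regularization."""
    return beta * sum(w * w for w in weights) / 2.0


def should_stop(validation_scores, patience):
    """Stop once the best score is at least `patience` evaluations old."""
    best_index = validation_scores.index(max(validation_scores))
    return len(validation_scores) - 1 - best_index >= patience


data_loss = 0.40            # hypothetical unregularized loss
weights = [3.0, -4.0, 0.5]  # hypothetical flattened weights
total_loss = data_loss + l2_penalty(weights, beta=0.01)
print(total_loss)

# Validation accuracy peaked two evaluations ago, so training would stop here
print(should_stop([0.80, 0.85, 0.86, 0.86, 0.85], patience=2))
```

Because the penalty grows with the square of each weight, minimizing the total loss pulls large weights toward zero unless they clearly reduce the data loss.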

# 21. Quiz: TensorFlow Dropout

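Dropout is a technique for preventing overfitting: it randomly drops some of the units during training. In TensorFlow it is applied as follows: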

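``````
keep_prob = tf.placeholder(tf.float32) # probability to keep units

hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)

logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])
``````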

The tf.nn.dropout() function takes in two parameters:

• hidden_layer: the tensor to which you would like to apply dropout
• keep_prob: the probability of keeping (i.e. not dropping) any given unit
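
To show what those two parameters do, here is a minimal plain-Python sketch of dropout with rescaling. `tf.nn.dropout` scales the kept units by `1 / keep_prob` so that the expected sum stays the same. The `dropout` helper below is hypothetical and only imitates that behavior on a list.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale survivors by 1/keep_prob."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0, 2.0, 3.0, 4.0]

# Training: some units are zeroed, the survivors are doubled
print(dropout(activations, keep_prob=0.5, rng=rng))
# Evaluation: every unit is kept unchanged
print(dropout(activations, keep_prob=1.0, rng=rng))
```

This is why `keep_prob` is fed as 0.5 during training and 1.0 when measuring accuracy: at evaluation time the full network should be used.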

``````
...

keep_prob = tf.placeholder(tf.float32) # probability to keep units

hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)

logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

...

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for epoch_i in range(epochs):
        for batch_i in range(batches):
            ....

            # Training: keep each unit with probability 0.5
            sess.run(optimizer, feed_dict={
                features: batch_features,
                labels: batch_labels,
                keep_prob: 0.5})

    # Evaluation: keep every unit
    validation_accuracy = sess.run(accuracy, feed_dict={
        features: test_features,
        labels: test_labels,
        keep_prob: 1.0})
``````

# 22. Quiz 2: TensorFlow Dropout

``````
# Quiz Solution
import tensorflow as tf

hidden_layer_weights = [
    [0.1, 0.2, 0.4],
    [0.4, 0.6, 0.6],
    [0.5, 0.9, 0.1],
    [0.8, 0.2, 0.8]]
out_weights = [
    [0.1, 0.6],
    [0.2, 0.1],
    [0.7, 0.9]]

# set random seed
tf.set_random_seed(123456)

# Weights and biases
weights = [
    tf.Variable(hidden_layer_weights),
    tf.Variable(out_weights)]
biases = [
    tf.Variable(tf.zeros(3)),
    tf.Variable(tf.zeros(2))]

# Input
features = tf.Variable([[0.0, 2.0, 3.0, 4.0], [0.1, 0.2, 0.3, 0.4], [11.0, 12.0, 13.0, 14.0]])

# TODO: Create Model with Dropout
keep_prob = tf.placeholder(tf.float32)
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)

logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

# TODO: save and print session results as variable named "output"
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output = sess.run(logits, feed_dict={keep_prob: 0.5})
    print(output)
``````

``````
[[  9.55999947  16.        ]
 [  0.11200001   0.67200011]
 [ 43.30000305  48.15999985]]
``````