# 0. Summary

1. A filter is essentially a set of weights; through weight sharing it reduces the number of parameters and shrinks the size of the output.
2. How to compute the output size and depth from the input, the filter (stride, size), the bias, and so on.
3. After the convolutional layers, the data ultimately passes through a fully connected layer before being handed to the classifier.
4. Visualizing the layers of a CNN shows that each layer abstracts out certain features, and the features become more distinct in later layers.
5. Pooling is the CNN's way of reducing the amount of input; common variants include max pooling and average pooling.
6. Finally, example code shows how the concepts above are implemented in TensorFlow.

# 7. Filters

The first step of a CNN is to split the image into smaller patches; this splitting is done by the filter.

Filter depth:

# 8. Feature Map Sizes

• An input depth of 3 means the input data has 3 channels.
• An output depth of 8 means the output data has 8 channels, so the filter is a four-dimensional array of size (8, 3, 3, 3), read as (FN number of filters, C channels, FH height, FW width).

# 9. Convolutions continued

A “Fully Connected” layer is a standard, non-convolutional layer, where all inputs are connected to all output neurons. This is also referred to as a “dense” layer, and is what we used in the previous two lessons.

# 10. Parameters

Dimensionality:

``````
1. input layer has a width of W and a height of H
2. our convolutional layer has a filter size F
3. the number of filters: K
4. stride: S
5. padding: P
``````
1. The following formula gives us the width of the next layer: `W_out = [(W - F + 2P)/S] + 1`
2. The output height would be `H_out = [(H - F + 2P)/S] + 1`
3. The output depth would be equal to the number of filters: `D_out = K`
4. The output volume would be `W_out * H_out * D_out`
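As a sanity check, the formulas above can be wrapped in a small helper (the function name `conv_output_shape` is mine, just for illustration):

```python
def conv_output_shape(W, H, F, K, S, P):
    """Output width, height, and depth of a conv layer, per the formulas above."""
    W_out = (W - F + 2 * P) // S + 1
    H_out = (H - F + 2 * P) // S + 1
    D_out = K  # output depth equals the number of filters
    return W_out, H_out, D_out

# 32x32 input, 8x8 filter, 20 filters, stride 2, padding 1
print(conv_output_shape(32, 32, 8, 20, 2, 1))  # (14, 14, 20)
```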

# 11. Quiz: Convolution Output Shape

``````
- We have an input of shape 32x32x3 (HxWxD)
- 20 filters of shape 8x8x3 (HxWxD)
- A stride of 2 for both the height and width (S)
- With padding of size 1 (P)
``````

``````
new_height = (input_height - filter_height + 2 * P)/S + 1
new_width = (input_width - filter_width + 2 * P)/S + 1
``````

``````
input = tf.placeholder(tf.float32, (None, 32, 32, 3))
filter_weights = tf.Variable(tf.truncated_normal((8, 8, 3, 20))) # (height, width, input_depth, output_depth)
filter_bias = tf.Variable(tf.zeros(20))
strides = [1, 2, 2, 1] # (batch, height, width, depth)
padding = 'SAME'  # TensorFlow only accepts 'SAME' or 'VALID', not an arbitrary padding P
conv = tf.nn.conv2d(input, filter_weights, strides, padding) + filter_bias
``````
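Plugging the quiz values into the formulas above confirms the expected output shape:

```python
# input 32x32, filter 8x8, stride 2, padding 1
new_height = (32 - 8 + 2 * 1) // 2 + 1
new_width = (32 - 8 + 2 * 1) // 2 + 1
depth = 20  # one output channel per filter
print(new_height, new_width, depth)  # 14 14 20
```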

In TensorFlow, the output shape under 'SAME' and 'VALID' padding is computed as follows.

With 'SAME' padding:

``````
out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))
``````

With 'VALID' padding:

``````
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
``````
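These two rules are easy to check with `math.ceil` (the helper names here are mine):

```python
import math

def same_out(in_size, stride):
    # 'SAME' padding: TensorFlow pads so the output depends only on the stride
    return math.ceil(in_size / stride)

def valid_out(in_size, filter_size, stride):
    # 'VALID' padding: no zero padding is added
    return math.ceil((in_size - filter_size + 1) / stride)

print(same_out(32, 2))       # 16
print(valid_out(32, 8, 2))   # 13
```

Note that neither matches the 14x14 from the quiz formula, because the quiz assumed P = 1, which TensorFlow cannot express directly.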

# 13. Quiz: Number of Parameters

``````
- We have an input of shape 32x32x3 (HxWxD)
- 20 filters of shape 8x8x3 (HxWxD)
- A stride of 2 for both the height and width (S)
- With padding of size 1 (P)
``````

output layer: `14x14x20 (HxWxD)`

Hint:

Without parameter sharing, each neuron in the output layer must connect to each neuron in the filter. In addition, each neuron in the output layer must also connect to a single bias neuron.

`8 * 8 * 3` is the number of weights and 1 is the bias; `14 * 14 * 20` on the right is the number of neurons in the connected output layer.
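In other words, each of the `14 * 14 * 20` output neurons carries its own copy of the `8 * 8 * 3` filter weights plus one bias, so the total parameter count is:

```python
params_per_neuron = 8 * 8 * 3 + 1   # filter weights plus one bias
output_neurons = 14 * 14 * 20       # neurons in the output layer
total = params_per_neuron * output_neurons
print(total)  # 756560
```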

# 15. Quiz: Parameter Sharing

``````
- We have an input of shape 32x32x3 (HxWxD)
- 20 filters of shape 8x8x3 (HxWxD)
- A stride of 2 for both the height and width (S)
- With padding of size 1 (P)
``````

output layer: `14x14x20 (HxWxD)`

Hint:

With parameter sharing, each neuron in an output channel shares its weights with every other neuron in that channel. So the number of parameters is equal to the number of neurons in the filter, plus a bias neuron, all multiplied by the number of channels in the output layer.

``````
(8 * 8 * 3 + 1) * 20 = 3840 + 20 = 3860
``````

3840 is the number of weights and 20 the number of biases. With parameter sharing we reuse the same filter (i.e. the same weights) across an entire output channel; see the figure below.
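The same arithmetic in code: with sharing, each of the 20 output channels keeps just one copy of its filter and bias, regardless of the spatial output size:

```python
params = (8 * 8 * 3 + 1) * 20  # (weights per filter + 1 bias) x 20 filters
print(params)  # 3860
```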

# 17. Visualizing CNNs

Visualizing the layers of a CNN; the figures below correspond to individual layers:

Layer 1:

Layer 2:

Layer 3:

Layer 5:

# 18. TensorFlow Convolution Layer

``````
# Output depth
k_output = 64

# Image Properties
image_width = 10
image_height = 10
color_channels = 3

# Convolution filter
filter_size_width = 5
filter_size_height = 5

# Input/Image
input = tf.placeholder(
    tf.float32,
    shape=[None, image_height, image_width, color_channels])

# Weight and bias
weight = tf.Variable(tf.truncated_normal(
    [filter_size_height, filter_size_width, color_channels, k_output]))
bias = tf.Variable(tf.zeros(k_output))

# Apply Convolution
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
# Add bias
conv_layer = tf.nn.bias_add(conv_layer, bias)
# Apply activation function
conv_layer = tf.nn.relu(conv_layer)
``````
1. `tf.nn.conv2d()`: takes the filter (weights), strides, and padding scheme as input and produces the convolutional layer.
2. `strides=[1, 2, 2, 1]`: strides: A list of ints. 1-D tensor of length 4. The stride of the sliding window for each dimension of input. For the most common case of the same horizontal and vertical strides, strides = [1, stride, stride, 1].
3. `input`: Given an input tensor of shape [batch, in_height, in_width, in_channels]
4. `tf.nn.bias_add()`: adds a 1-d bias to the last dimension in a matrix.

# 20. TensorFlow Max Pooling

In TensorFlow, `tf.nn.max_pool()` applies pooling to a convolutional layer:

``````
...
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.relu(conv_layer)
# Apply Max Pooling
conv_layer = tf.nn.max_pool(
    conv_layer,
    ksize=[1, 2, 2, 1],
    strides=[1, 2, 2, 1],
    padding='SAME')
``````
1. ksize: the size of the pooling filter
2. strides: the length of the stride

The ksize and strides parameters are structured as 4-element lists, with each element corresponding to a dimension of the input tensor ([batch, height, width, channels]). For both ksize and strides, the batch and channel dimensions are typically set to 1.

A pooling layer is generally used to:

1. Decrease the size of the output
2. Prevent overfitting

Reducing overfitting is a consequence of reducing the output size, which in turn reduces the number of parameters in future layers.

However:

1. Dropout provides better regularization. (Dropout is a method that randomly deletes neurons during training: hidden-layer neurons are selected at random and removed, and the removed neurons no longer propagate signals.)
2. Pooling discards information.

# 23. Quiz: Pooling Mechanics

``````
- We have an input of shape 4x4x5 (HxWxD)
- Filter of shape 2x2 (HxW)
- A stride of 2 for both the height and width (S)
``````

``````
new_height = (input_height - filter_height)/S + 1
new_width = (input_width - filter_width)/S + 1
``````

The output is `2x2x5`; the corresponding code:

``````
input = tf.placeholder(tf.float32, (None, 4, 4, 5))
filter_shape = [1, 2, 2, 1]
strides = [1, 2, 2, 1]
padding = 'VALID'
pool = tf.nn.max_pool(input, filter_shape, strides, padding)
``````
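The same pooling step can be sketched in plain NumPy, which makes the shape change explicit (a simplified stand-in for `tf.nn.max_pool`, not the TensorFlow op itself):

```python
import numpy as np

def maxpool_2x2(x):
    """2x2 max pooling with stride 2 over an (N, H, W, D) array."""
    n, h, w, d = x.shape
    # Group each non-overlapping 2x2 window, then take its max
    return x.reshape(n, h // 2, 2, w // 2, 2, d).max(axis=(2, 4))

x = np.arange(1 * 4 * 4 * 5, dtype=np.float32).reshape(1, 4, 4, 5)
print(maxpool_2x2(x).shape)  # (1, 2, 2, 5)
```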

# 29. 1x1 Convolutions

I couldn't make sense of the Udacity video on this topic at all; Andrew Ng's video "Neural Networks - Networks in Networks and 1x1 Convolutions" is a much clearer explanation.
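For the record, the core idea is simple: a 1x1 convolution is a fully connected layer applied independently at every pixel position, across the channel dimension, so it changes only the depth. A NumPy sketch (all names here are mine):

```python
import numpy as np

h, w, c_in, c_out = 4, 4, 8, 3
x = np.random.randn(h, w, c_in)    # feature map
W = np.random.randn(c_in, c_out)   # 1x1 filter bank: one weight per (in, out) channel pair

# Apply the same c_in -> c_out linear map at every pixel position
y = x @ W
print(y.shape)  # (4, 4, 3): spatial size unchanged, depth reduced from 8 to 3
```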

# 31. Convolutional Network in TensorFlow

## 31.1 示例代码

``````
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)

import tensorflow as tf

# Parameters
learning_rate = 0.00001
epochs = 10
batch_size = 128

# Number of samples to calculate validation and accuracy
# Decrease this if you're running out of memory to calculate accuracy
test_valid_size = 256

# Network Parameters
n_classes = 10  # MNIST total classes (0-9 digits)
dropout = 0.75  # Dropout, probability to keep units

# Store layers weight & bias
weights = {
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    'out': tf.Variable(tf.random_normal([1024, n_classes]))}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_classes]))}

def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)

def maxpool2d(x, k=2):
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')

def conv_net(x, weights, biases, dropout):
    # Layer 1 - 28*28*1 to 14*14*32
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    conv1 = maxpool2d(conv1, k=2)

    # Layer 2 - 14*14*32 to 7*7*64
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    conv2 = maxpool2d(conv2, k=2)

    # Fully connected layer - 7*7*64 to 1024
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1, dropout)

    # Output Layer - class prediction - 1024 to 10
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out

# tf Graph input
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

# Model
logits = conv_net(x, weights, biases, keep_prob)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Accuracy
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(epochs):
        for batch in range(mnist.train.num_examples//batch_size):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: dropout})

            # Calculate batch loss and accuracy
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y, keep_prob: 1.})
            valid_acc = sess.run(accuracy, feed_dict={
                x: mnist.validation.images[:test_valid_size],
                y: mnist.validation.labels[:test_valid_size],
                keep_prob: 1.})

            print('Epoch {:>2}, Batch {:>3} - Loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                epoch + 1,
                batch + 1,
                loss,
                valid_acc))

    # Calculate Test Accuracy
    test_acc = sess.run(accuracy, feed_dict={
        x: mnist.test.images[:test_valid_size],
        y: mnist.test.labels[:test_valid_size],
        keep_prob: 1.})
    print('Testing Accuracy: {}'.format(test_acc))
``````

## 31.2 Code walkthrough:

``````
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)

import tensorflow as tf

# Parameters
learning_rate = 0.00001
epochs = 10
batch_size = 128

# Number of samples to calculate validation and accuracy
# Decrease this if you're running out of memory to calculate accuracy
test_valid_size = 256

# Network Parameters
n_classes = 10  # MNIST total classes (0-9 digits)
dropout = 0.75  # Dropout, probability to keep units
``````

Weights and Biases:

``````
# Store layers weight & bias
weights = {
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    'out': tf.Variable(tf.random_normal([1024, n_classes]))}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_classes]))}
``````

Convolutions:

In TensorFlow, this is all done using tf.nn.conv2d() and tf.nn.bias_add().

``````
def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)
``````

Max Pooling:

``````
def maxpool2d(x, k=2):
    return tf.nn.max_pool(
        x,
        ksize=[1, k, k, 1],
        strides=[1, k, k, 1],
        padding='SAME')
``````

Model:

``````
def conv_net(x, weights, biases, dropout):
    # Layer 1 - 28*28*1 to 14*14*32
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    conv1 = maxpool2d(conv1, k=2)

    # Layer 2 - 14*14*32 to 7*7*64
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    conv2 = maxpool2d(conv2, k=2)

    # Fully connected layer - 7*7*64 to 1024
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1, dropout)

    # Output Layer - class prediction - 1024 to 10
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out
``````

Session:

``````
# tf Graph input
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

# Model
logits = conv_net(x, weights, biases, keep_prob)

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# Accuracy
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(epochs):
        for batch in range(mnist.train.num_examples//batch_size):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(optimizer, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: dropout})

            # Calculate batch loss and accuracy
            loss = sess.run(cost, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: 1.})
            valid_acc = sess.run(accuracy, feed_dict={
                x: mnist.validation.images[:test_valid_size],
                y: mnist.validation.labels[:test_valid_size],
                keep_prob: 1.})

            print('Epoch {:>2}, Batch {:>3} - '
                  'Loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                epoch + 1,
                batch + 1,
                loss,
                valid_acc))

    # Calculate Test Accuracy
    test_acc = sess.run(accuracy, feed_dict={
        x: mnist.test.images[:test_valid_size],
        y: mnist.test.labels[:test_valid_size],
        keep_prob: 1.})
    print('Testing Accuracy: {}'.format(test_acc))
``````

# 32. TensorFlow Convolutional Layer Workspaces

``````
import tensorflow as tf
import numpy as np
``````
``````
"""
Setup the strides, padding and filter weight/bias such that
the output shape is (1, 2, 2, 3).
"""
# `tf.nn.conv2d` requires the input be 4D (batch_size, height, width, depth)
# (1, 4, 4, 1)
x = np.array([
    [0, 1, 0.5, 10],
    [2, 2.5, 1, -8],
    [4, 0, 5, 6],
    [15, 1, 2, 3]], dtype=np.float32).reshape((1, 4, 4, 1))
X = tf.constant(x)

def conv2d(input_array):
    # Filter (weights and bias)
    # The shape of the filter weight is (height, width, input_depth, output_depth)
    # The shape of the filter bias is (output_depth,)
    # TODO: Define the filter weights `F_W` and filter bias `F_b`.
    # NOTE: Remember to wrap them in `tf.Variable`, they are trainable parameters after all.
    F_W = tf.Variable(tf.random_normal([2, 2, 1, 3]))
    F_b = tf.Variable(tf.random_normal([3]))
    # TODO: Set the stride for each dimension (batch_size, height, width, depth)
    strides = [1, 2, 2, 1]
    # TODO: set the padding, either 'VALID' or 'SAME'.
    # https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#conv2d
    padding = 'VALID'
    # `tf.nn.conv2d` does not include the bias computation so we have to add it ourselves after.
    return tf.nn.conv2d(input_array, F_W, strides, padding) + F_b

output = conv2d(X)
output
``````

``````
##### Do Not Modify ######

test_X = tf.constant(np.random.randn(1, 4, 4, 1), dtype=tf.float32)

try:
    response = conv2d(test_X)
    print(response)
except Exception as err:
    print(str(err))
``````

``````
Great job! Your Convolution layer looks good :)
``````

``````
def conv2d(input):
    # Filter (weights and bias)
    F_W = tf.Variable(tf.truncated_normal((2, 2, 1, 3)))
    F_b = tf.Variable(tf.zeros(3))
    strides = [1, 2, 2, 1]
    padding = 'VALID'
    return tf.nn.conv2d(input, F_W, strides, padding) + F_b
``````

1. Since the output depth is 3, there are 3 filters (sets of weights) and likewise 3 biases.
2. Since the output shape is (2, 2), the formula tells us the filter shape is also (2, 2).

With 'VALID' padding, the computation is as follows:

``````
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
``````
``````
out_height = ceil(float(4 - 2 + 1) / float(2)) = ceil(1.5) = 2
out_width  = ceil(float(4 - 2 + 1) / float(2)) = ceil(1.5) = 2
``````

# 34. TensorFlow Pooling Layer Workspaces

Using Pooling Layers in TensorFlow:

In the below exercise, you’ll be asked to set up the dimensions of the pooling filters, strides, as well as the appropriate padding. You should go over the TensorFlow documentation for tf.nn.max_pool(). Padding works the same as it does for a convolution.

Instructions:

1. Finish off each TODO in the maxpool function.
2. Setup the strides, padding and ksize such that the output shape after pooling is (1, 2, 2, 1).
``````
"""
Set the values to `strides` and `ksize` such that
the output shape after pooling is (1, 2, 2, 1).
"""
import tensorflow as tf
import numpy as np

# `tf.nn.max_pool` requires the input be 4D (batch_size, height, width, depth)
# (1, 4, 4, 1)
x = np.array([
    [0, 1, 0.5, 10],
    [2, 2.5, 1, -8],
    [4, 0, 5, 6],
    [15, 1, 2, 3]], dtype=np.float32).reshape((1, 4, 4, 1))
X = tf.constant(x)

def maxpool(input):
    # TODO: Set the ksize (filter size) for each dimension (batch_size, height, width, depth)
    ksize = [1, 2, 2, 1]
    # TODO: Set the stride for each dimension (batch_size, height, width, depth)
    strides = [1, 2, 2, 1]
    # TODO: set the padding, either 'VALID' or 'SAME'.
    # https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#max_pool
    padding = 'VALID'
    return tf.nn.max_pool(input, ksize, strides, padding)

out = maxpool(X)
``````

filter_height / filter_width are the height and width of the pooling window; S is the stride.

``````
new_height = (input_height - filter_height)/S + 1
new_width = (input_width - filter_width)/S + 1
``````
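Plugging the pooling setup from the earlier quiz into this formula:

```python
input_height = input_width = 4
filter_height = filter_width = 2
S = 2

new_height = (input_height - filter_height) // S + 1
new_width = (input_width - filter_width) // S + 1
print(new_height, new_width)  # 2 2
```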

# 36. Lab: LeNet in TensorFlow

Preprocessing:

1. An MNIST image initially has 784 features (1D).
2. If the data is not normalized from [0, 255] to [0, 1], normalize it.
3. We reshape this to (28, 28, 1) (3D), and pad the image with 0s such that the height and width are 32.
4. The input shape going into the first convolutional layer is (32, 32, 1).

Spec:

1. Convolution layer 1. The output shape should be 28x28x6.
2. Activation 1. Your choice of activation function.
3. Pooling layer 1. The output shape should be 14x14x6.
4. Convolution layer 2. The output shape should be 10x10x16.
5. Activation 2. Your choice of activation function.
6. Pooling layer 2. The output shape should be 5x5x16.
7. Flatten layer. Flatten the output shape of the final pooling layer such that it’s 1D instead of 3D. The easiest way to do this is by using tf.contrib.layers.flatten, which is already imported for you.
8. Fully connected layer 1. This should have 120 outputs.
9. Activation 3. Your choice of activation function.
10. Fully connected layer 2. This should have 84 outputs.
11. Activation 4. Your choice of activation function.
12. Fully connected layer 3. This should have 10 outputs.

You’ll return the result of the final fully connected layer from the LeNet function.

If implemented correctly you should see output similar to the following:

``````
EPOCH 1 ...
Validation loss = 52.809
Validation accuracy = 0.864

EPOCH 2 ...
Validation loss = 24.749
Validation accuracy = 0.915

EPOCH 3 ...
Validation loss = 17.719
Validation accuracy = 0.930

EPOCH 4 ...
Validation loss = 12.188
Validation accuracy = 0.943

EPOCH 5 ...
Validation loss = 8.935
Validation accuracy = 0.954

EPOCH 6 ...
Validation loss = 7.674
Validation accuracy = 0.956

EPOCH 7 ...
Validation loss = 6.822
Validation accuracy = 0.956

EPOCH 8 ...
Validation loss = 5.451
Validation accuracy = 0.961

EPOCH 9 ...
Validation loss = 4.881
Validation accuracy = 0.964

EPOCH 10 ...
Validation loss = 4.623
Validation accuracy = 0.964

Test loss = 4.726
Test accuracy = 0.962

``````

# 37. LeNet Lab Workspace

``````
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", reshape=False)
X_train, y_train           = mnist.train.images, mnist.train.labels
X_validation, y_validation = mnist.validation.images, mnist.validation.labels
X_test, y_test             = mnist.test.images, mnist.test.labels

assert(len(X_train) == len(y_train))
assert(len(X_validation) == len(y_validation))
assert(len(X_test) == len(y_test))

print()
print("Image Shape: {}".format(X_train[0].shape))
print()
print("Training Set:   {} samples".format(len(X_train)))
print("Validation Set: {} samples".format(len(X_validation)))
print("Test Set:       {} samples".format(len(X_test)))
``````
``````
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

Image Shape: (28, 28, 1)

Training Set:   55000 samples
Validation Set: 5000 samples
Test Set:       10000 samples
``````

The MNIST data that TensorFlow pre-loads comes as 28x28x1 images.

However, the LeNet architecture only accepts 32x32xC images, where C is the number of color channels.

In order to reformat the MNIST data into a shape that LeNet will accept, we pad the data with two rows of zeros on the top and bottom, and two columns of zeros on the left and right (28+2+2 = 32).

``````
import numpy as np

# Pad images with 0s: two rows/columns on each side (28x28 -> 32x32)
X_train      = np.pad(X_train, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')
X_validation = np.pad(X_validation, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')
X_test       = np.pad(X_test, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')

print("Updated Image Shape: {}".format(X_train[0].shape))
``````
``````
Updated Image Shape: (32, 32, 1)
``````

## 37.2 Visualize Data

``````
import random
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

index = random.randint(0, len(X_train))
image = X_train[index].squeeze()

plt.figure(figsize=(1,1))
plt.imshow(image, cmap="gray")
print(y_train[index])
``````

## 37.3 Preprocess Data

``````
from sklearn.utils import shuffle

X_train, y_train = shuffle(X_train, y_train)
``````

## 37.4 Setup TensorFlow

The `EPOCHS` and `BATCH_SIZE` values affect the training speed and model accuracy.
``````
import tensorflow as tf

EPOCHS = 10
BATCH_SIZE = 128
``````

## 37.4 TODO: Implement LeNet-5

Implement the LeNet-5 neural network architecture.

This is the only cell you need to edit.

### Input

The LeNet architecture accepts a 32x32xC image as input, where C is the number of color channels. Since MNIST images are grayscale, C is 1 in this case.

### Architecture

Layer 1: Convolutional. The output shape should be 28x28x6.

Activation. Your choice of activation function.

Pooling. The output shape should be 14x14x6.

Layer 2: Convolutional. The output shape should be 10x10x16.

Activation. Your choice of activation function.

Pooling. The output shape should be 5x5x16.

Flatten. Flatten the output shape of the final pooling layer such that it’s 1D instead of 3D. The easiest way to do this is by using `tf.contrib.layers.flatten`, which is already imported for you.

Layer 3: Fully Connected. This should have 120 outputs.

Activation. Your choice of activation function.

Layer 4: Fully Connected. This should have 84 outputs.

Activation. Your choice of activation function.

Layer 5: Fully Connected (Logits). This should have 10 outputs.

### Output

Return the result of the 2nd fully connected layer.

### Code

``````
from tensorflow.contrib.layers import flatten

def LeNet(x):
    # Arguments used for tf.truncated_normal, randomly defines variables for the weights and biases for each layer
    mu = 0
    sigma = 0.1

    # TODO: Layer 1: Convolutional. Input = 32x32x1. Output = 28x28x6.
    filter_weights = tf.Variable(tf.truncated_normal((5, 5, 1, 6), mean = mu, stddev = sigma)) # (height, width, input_depth, output_depth)
    filter_bias = tf.Variable(tf.zeros(6))
    strides = [1, 1, 1, 1] # (batch, height, width, depth)
    conv_1 = tf.nn.conv2d(x, filter_weights, strides, 'VALID') + filter_bias

    # TODO: Activation.
    conv_1 = tf.nn.relu(conv_1)

    # TODO: Pooling. Input = 28x28x6. Output = 14x14x6.
    conv_1 = tf.nn.max_pool(conv_1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

    # TODO: Layer 2: Convolutional. Output = 10x10x16.
    filter_weights = tf.Variable(tf.truncated_normal((5, 5, 6, 16), mean = mu, stddev = sigma)) # (height, width, input_depth, output_depth)
    filter_bias = tf.Variable(tf.zeros(16))
    strides = [1, 1, 1, 1] # (batch, height, width, depth)
    conv_2 = tf.nn.conv2d(conv_1, filter_weights, strides, 'VALID') + filter_bias

    # TODO: Activation.
    conv_2 = tf.nn.relu(conv_2)

    # TODO: Pooling. Input = 10x10x16. Output = 5x5x16.
    conv_2 = tf.nn.max_pool(conv_2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

    # TODO: Flatten. Input = 5x5x16. Output = 400.
    fc0 = flatten(conv_2)

    # TODO: Layer 3: Fully Connected. Input = 400. Output = 120.
    wd1 = tf.Variable(tf.truncated_normal(shape=(400, 120), mean = mu, stddev = sigma))
    bd1 = tf.Variable(tf.zeros(120))
    fc1 = tf.matmul(fc0, wd1) + bd1

    # TODO: Activation.
    fc1 = tf.nn.relu(fc1)

    # TODO: Layer 4: Fully Connected. Input = 120. Output = 84.
    wd2 = tf.Variable(tf.truncated_normal(shape=(120, 84), mean = mu, stddev = sigma))
    bd2 = tf.Variable(tf.zeros(84))
    fc2 = tf.matmul(fc1, wd2) + bd2

    # TODO: Activation.
    fc2 = tf.nn.relu(fc2)

    # TODO: Layer 5: Fully Connected. Input = 84. Output = 10.
    wd3 = tf.Variable(tf.truncated_normal(shape=(84, 10), mean = mu, stddev = sigma))
    bd3 = tf.Variable(tf.zeros(10))
    logits = tf.matmul(fc2, wd3) + bd3

    return logits
``````

## 37.5 Features and Labels

Train LeNet to classify MNIST data.

• `x` is a placeholder for a batch of input images.
• `y` is a placeholder for a batch of output labels.
``````
x = tf.placeholder(tf.float32, (None, 32, 32, 1))
y = tf.placeholder(tf.int32, (None))
one_hot_y = tf.one_hot(y, 10)
``````
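`tf.one_hot(y, 10)` turns each integer label into a length-10 indicator vector; in NumPy terms it behaves like this sketch (the helper name is mine):

```python
import numpy as np

def one_hot(labels, n_classes):
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(n_classes)[labels]

print(one_hot(np.array([0, 3, 9]), 10).shape)  # (3, 10)
print(one_hot(np.array([1]), 3))               # [[0. 1. 0.]]
```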

## 37.6 Training Pipeline

Create a training pipeline that uses the model to classify MNIST data.

``````
rate = 0.001

logits = LeNet(x)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_y, logits=logits)
loss_operation = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate = rate)
training_operation = optimizer.minimize(loss_operation)
``````

## 37.7 Model Evaluation

Evaluate the loss and accuracy of the model for a given dataset.

``````
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
saver = tf.train.Saver()

def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0, num_examples, BATCH_SIZE):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(accuracy_operation, feed_dict={x: batch_x, y: batch_y})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples
``````

## 37.8 Train the Model

Run the training data through the training pipeline to train the model.

Before each epoch, shuffle the training set.

After each epoch, measure the loss and accuracy of the validation set.

Save the model after training.

``````
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    num_examples = len(X_train)

    print("Training...")
    print()
    for i in range(EPOCHS):
        X_train, y_train = shuffle(X_train, y_train)
        for offset in range(0, num_examples, BATCH_SIZE):
            end = offset + BATCH_SIZE
            batch_x, batch_y = X_train[offset:end], y_train[offset:end]
            sess.run(training_operation, feed_dict={x: batch_x, y: batch_y})

        validation_accuracy = evaluate(X_validation, y_validation)
        print("EPOCH {} ...".format(i+1))
        print("Validation Accuracy = {:.3f}".format(validation_accuracy))
        print()

    saver.save(sess, './lenet')
    print("Model saved")
``````

``````
Training...

EPOCH 1 ...
Validation Accuracy = 0.969

...

EPOCH 10 ...
Validation Accuracy = 0.988

Model saved
``````

## 37.9 Evaluate the Model

Once you are completely satisfied with your model, evaluate the performance of the model on the test set.

Be sure to only do this once!

If you were to measure the performance of your trained model on the test set, then improve your model, and then measure the performance of your model on the test set again, that would invalidate your test results. You wouldn’t get a true measure of how well your model would perform against real data.

``````
with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint('.'))

    test_accuracy = evaluate(X_test, y_test)
    print("Test Accuracy = {:.3f}".format(test_accuracy))
``````

``````
INFO:tensorflow:Restoring parameters from ./lenet
Test Accuracy = 0.989
``````

# 37. Lab Code Walkthrough

## 37.1 Udacity's LeNet solution code:

``````
from tensorflow.contrib.layers import flatten

def LeNet(x):
    # Arguments used for tf.truncated_normal, randomly defines variables for the weights and biases for each layer
    mu = 0
    sigma = 0.1

    # SOLUTION: Layer 1: Convolutional. Input = 32x32x1. Output = 28x28x6.
    conv1_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 1, 6), mean = mu, stddev = sigma))
    conv1_b = tf.Variable(tf.zeros(6))
    conv1   = tf.nn.conv2d(x, conv1_W, strides=[1, 1, 1, 1], padding='VALID') + conv1_b

    # SOLUTION: Activation.
    conv1 = tf.nn.relu(conv1)

    # SOLUTION: Pooling. Input = 28x28x6. Output = 14x14x6.
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

    # SOLUTION: Layer 2: Convolutional. Output = 10x10x16.
    conv2_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 6, 16), mean = mu, stddev = sigma))
    conv2_b = tf.Variable(tf.zeros(16))
    conv2   = tf.nn.conv2d(conv1, conv2_W, strides=[1, 1, 1, 1], padding='VALID') + conv2_b

    # SOLUTION: Activation.
    conv2 = tf.nn.relu(conv2)

    # SOLUTION: Pooling. Input = 10x10x16. Output = 5x5x16.
    conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

    # SOLUTION: Flatten. Input = 5x5x16. Output = 400.
    fc0   = flatten(conv2)

    # SOLUTION: Layer 3: Fully Connected. Input = 400. Output = 120.
    fc1_W = tf.Variable(tf.truncated_normal(shape=(400, 120), mean = mu, stddev = sigma))
    fc1_b = tf.Variable(tf.zeros(120))
    fc1   = tf.matmul(fc0, fc1_W) + fc1_b

    # SOLUTION: Activation.
    fc1    = tf.nn.relu(fc1)

    # SOLUTION: Layer 4: Fully Connected. Input = 120. Output = 84.
    fc2_W  = tf.Variable(tf.truncated_normal(shape=(120, 84), mean = mu, stddev = sigma))
    fc2_b  = tf.Variable(tf.zeros(84))
    fc2    = tf.matmul(fc1, fc2_W) + fc2_b

    # SOLUTION: Activation.
    fc2    = tf.nn.relu(fc2)

    # SOLUTION: Layer 5: Fully Connected. Input = 84. Output = 10.
    fc3_W  = tf.Variable(tf.truncated_normal(shape=(84, 10), mean = mu, stddev = sigma))
    fc3_b  = tf.Variable(tf.zeros(10))
    logits = tf.matmul(fc2, fc3_W) + fc3_b

    return logits
``````
• In Layer 1, the output depth is 6, so there are 6 filters and 6 biases. The input is 32×32 and the output 28×28; with a stride of 1, the filter height/width works out to 32 - 28 + 1 = 5.
• In Layer 1, pooling goes from 28×28 to 14×14, so the pooling window is 2×2 with a stride of 2; see the figure below.
• Layer 2 works the same way.
• Layer 3 is a fully connected layer: multiply the input by the weights, add the bias, and pass the result through the activation function.
• Layer 4 and Layer 5 follow the same method as Layer 3.
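The layer sizes in the bullets above can be verified end to end with the output-size formulas from section 10:

```python
def conv_out(in_size, f, s=1, p=0):
    # (W - F + 2P)/S + 1
    return (in_size - f + 2 * p) // s + 1

def pool_out(in_size, k=2, s=2):
    return (in_size - k) // s + 1

size = 32
size = conv_out(size, 5)  # Layer 1 conv:  32 -> 28
size = pool_out(size)     # Layer 1 pool:  28 -> 14
size = conv_out(size, 5)  # Layer 2 conv:  14 -> 10
size = pool_out(size)     # Layer 2 pool:  10 -> 5
print(size, size * size * 16)  # 5 400 -- flatten feeds the 400-input fully connected layer
```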

# 38. CNNs - Additional Resources

Additional Resources: There are many wonderful free resources that allow you to go into more depth around Convolutional Neural Networks. In this course, our goal is to give you just enough intuition to start applying this concept on real world problems so you have enough of an exposure to explore more on your own. We strongly encourage you to explore some of these resources more to reinforce your intuition and explore different ideas.

These are the resources we recommend in particular:

• Andrej Karpathy’s CS231n Stanford course on Convolutional Neural Networks.
• Michael Nielsen’s free book on Deep Learning.
• Goodfellow, Bengio, and Courville’s more advanced free book on Deep Learning.