20. Cross entropy

• Reference code:
``````import numpy as np

def cross_entropy(Y, P):
    # Cast to float arrays (np.float_ was removed in NumPy 2.0)
    Y = np.asarray(Y, dtype=float)
    P = np.asarray(P, dtype=float)
    return -np.sum(Y * np.log(P) + (1 - Y) * np.log(1 - P))
``````

``````Y = np.array([1,1,0])
P = np.array([0.8,0.7,0.1])
print(cross_entropy(Y,P))   # ≈ 0.685 -- likely events, so the cross entropy is low
``````
``````Y = np.array([0,0,1])
P = np.array([0.8,0.7,0.1])
print(cross_entropy(Y,P))   # ≈ 5.116 -- unlikely events, so the cross entropy is high
``````

22. Logistic regression

• Get the data
• Pick a random model
• Calculate the error
• Minimize the error, and obtain a better model
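The four steps above can be sketched as a minimal logistic-regression loop. This is only an illustrative sketch: the toy dataset, seed, learning rate, and epoch count are assumptions, and the full lab version appears in section 26.

``````python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# 1. Get the data (a tiny hypothetical dataset, two features per point)
X = np.array([[0.1, 0.9], [0.2, 0.7], [0.8, 0.2], [0.9, 0.3]])
y = np.array([1, 1, 0, 0])

# 2. Pick a random model (random weights, zero bias)
rng = np.random.default_rng(0)
weights = rng.normal(size=2)
bias = 0.0

learnrate = 0.5
for _ in range(2000):
    # 3. Calculate the error: for the log-loss, the gradient is (output - y)
    output = sigmoid(X @ weights + bias)
    d_error = output - y
    # 4. Minimize the error -> a better model
    weights -= learnrate * X.T @ d_error / len(y)
    bias -= learnrate * d_error.mean()

print((sigmoid(X @ weights + bias) > 0.5).astype(int))  # should match y
``````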

26. [Lab] Gradient descent

• sigmoid: the sigmoid activation function.
• output_formula: the output (prediction) formula.
• error_formula: the error function.
• update_weights: the function that updates the weights.

26.1 Reading and plotting the data

``````import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Some helper functions for plotting and drawing lines

def plot_points(X, y):
    admitted = X[np.argwhere(y==1)]
    rejected = X[np.argwhere(y==0)]
    plt.scatter([s[0][0] for s in rejected], [s[0][1] for s in rejected], s = 25, color = 'blue', edgecolor = 'k')
    plt.scatter([s[0][0] for s in admitted], [s[0][1] for s in admitted], s = 25, color = 'red', edgecolor = 'k')

def display(m, b, color='g--'):
    plt.xlim(-0.05,1.05)
    plt.ylim(-0.05,1.05)
    x = np.arange(-10, 10, 0.1)
    plt.plot(x, m*x+b, color)
``````
``````data = pd.read_csv('data.csv', header=None)
X = np.array(data[[0,1]])
y = np.array(data[2])
plot_points(X,y)
plt.show()
``````

26.2 Implementing the basic functions

``````# Activation (sigmoid) function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def output_formula(features, weights, bias):
    return sigmoid(np.dot(features, weights) + bias)

def error_formula(y, output):
    return - y*np.log(output) - (1 - y) * np.log(1-output)

def update_weights(x, y, weights, bias, learnrate):
    output = output_formula(x, weights, bias)
    d_error = -(y - output)
    weights -= learnrate * d_error * x
    bias -= learnrate * d_error
    return weights, bias
``````

26.3 The training function

``````np.random.seed(44)

epochs = 100
learnrate = 0.01

def train(features, targets, epochs, learnrate, graph_lines=False):
    errors = []
    n_records, n_features = features.shape
    last_loss = None
    weights = np.random.normal(scale=1 / n_features**.5, size=n_features)
    bias = 0
    for e in range(epochs):
        for x, y in zip(features, targets):
            output = output_formula(x, weights, bias)
            error = error_formula(y, output)
            weights, bias = update_weights(x, y, weights, bias, learnrate)

        # Printing out the log-loss error on the training set
        out = output_formula(features, weights, bias)
        loss = np.mean(error_formula(targets, out))
        errors.append(loss)
        if e % (epochs / 10) == 0:
            print("\n========== Epoch", e, "==========")
            if last_loss and last_loss < loss:
                print("Train loss: ", loss, "  WARNING - Loss Increasing")
            else:
                print("Train loss: ", loss)
            last_loss = loss
            predictions = out > 0.5
            accuracy = np.mean(predictions == targets)
            print("Accuracy: ", accuracy)
        if graph_lines and e % (epochs / 100) == 0:
            display(-weights[0]/weights[1], -bias/weights[1])

    # Plotting the solution boundary
    plt.title("Solution boundary")
    display(-weights[0]/weights[1], -bias/weights[1], 'black')

    # Plotting the data
    plot_points(features, targets)
    plt.show()

    # Plotting the error
    plt.title("Error Plot")
    plt.xlabel('Number of epochs')
    plt.ylabel('Error')
    plt.plot(errors)
    plt.show()
``````

26.4 Training results

• The training loss and accuracy, printed at 10 evenly spaced updates.
• A plot of the data and several of the boundary lines; the final one is black. Note that as we run through more epochs, the line gets closer to the optimum.
• A plot of the error function. Note how it decreases as we run through more epochs.
``````train(X, y, epochs, learnrate, True)
``````

``````========== Epoch 0 ==========
Train loss:  0.713584519538
Accuracy:  0.4

========== Epoch 10 ==========
Train loss:  0.622583521045
Accuracy:  0.59

========== Epoch 20 ==========
Train loss:  0.554874408367
Accuracy:  0.74

========== Epoch 30 ==========
Train loss:  0.501606141872
Accuracy:  0.84

========== Epoch 40 ==========
Train loss:  0.459333464186
Accuracy:  0.86

========== Epoch 50 ==========
Train loss:  0.425255434335
Accuracy:  0.93

========== Epoch 60 ==========
Train loss:  0.397346157167
Accuracy:  0.93

========== Epoch 70 ==========
Train loss:  0.374146976524
Accuracy:  0.93

========== Epoch 80 ==========
Train loss:  0.354599733682
Accuracy:  0.94

========== Epoch 90 ==========
Train loss:  0.337927365888
Accuracy:  0.94
``````

31. Neural network structure

31.1 Neural network architecture

``````import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid(1.5))   # ≈ 0.8176
print(sigmoid(2.9))   # ≈ 0.9478
``````

31.2 Multiple layers

• Add more nodes to the input, hidden, and output layers.
• Add more layers.

32. Feedforward

32.1 Feedforward

• The simplest neural network, with a single hidden layer

• Adding another node to the hidden layer

• The prediction function of a neural network with one hidden layer

• The prediction function of a neural network with two hidden layers

32.2 The error function

• The prediction function and the error function with one hidden layer

• The prediction function and the error function with two hidden layers
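The prediction and error functions above can be sketched in NumPy. This is only an illustrative sketch: the layer sizes, random weights, and the toy input are assumptions, and the binary log-loss from section 26.2 is reused as the error function.

``````python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(42)
x = rng.random(3)     # a hypothetical input with 3 features
y = 1                 # its (assumed) label

# One hidden layer: y_hat = sigmoid(W2 . sigmoid(W1 . x))
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=4)
y_hat = sigmoid(sigmoid(x @ W1) @ W2)
error = -y * np.log(y_hat) - (1 - y) * np.log(1 - y_hat)

# Two hidden layers: one more weight matrix and one more sigmoid
W2b = rng.normal(size=(4, 4))
W3 = rng.normal(size=4)
y_hat2 = sigmoid(sigmoid(sigmoid(x @ W1) @ W2b) @ W3)
error2 = -y * np.log(y_hat2) - (1 - y) * np.log(1 - y_hat2)

print(error, error2)
``````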

33. Backpropagation

1. Run the feedforward operation.
2. Compare the model's output with the desired output.
3. Calculate the error.
4. Run the feedforward operation backwards (backpropagation) to spread the error across every weight.
5. Update the weights, and obtain a better model.
6. Repeat this process until we have a good model.
• Backpropagation: update the weights to obtain a better model

• Adjust the weights on each layer to obtain a better model
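The six steps above can be sketched for a network with one hidden layer. This is a sketch only: the layer sizes, toy data, seed, learning rate, and epoch count are all assumptions.

``````python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
X = np.array([[0.1, 0.9], [0.2, 0.7], [0.8, 0.2], [0.9, 0.3]])
y = np.array([1., 1., 0., 0.])

W1 = rng.normal(size=(2, 4))   # input -> hidden
W2 = rng.normal(size=4)        # hidden -> output

learnrate = 0.5
for _ in range(5000):
    # 1. Run the feedforward operation
    hidden = sigmoid(X @ W1)
    output = sigmoid(hidden @ W2)
    # 2-3. Compare with the desired output and calculate the error
    error = y - output
    # 4. Run feedforward backwards: spread the error to each weight
    output_term = error * output * (1 - output)
    hidden_term = np.outer(output_term, W2) * hidden * (1 - hidden)
    # 5. Update the weights, and obtain a better model
    W2 += learnrate * hidden.T @ output_term / len(X)
    W1 += learnrate * X.T @ hidden_term / len(X)
    # 6. The loop repeats this process

print((output > 0.5).astype(int))  # should match y
``````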

35. [Lab] Analyzing student data

• One-hot encoding the data
• Scaling the data
• Writing the backpropagation step

• GRE Scores (Test)
• GPA Scores (Grades)
• Class rank (1-4)

35.1 Loading the data

• https://pandas.pydata.org/pandas-docs/stable/
• https://docs.scipy.org/
``````# Importing pandas and numpy
import pandas as pd
import numpy as np

# Reading the csv file into a pandas DataFrame
data = pd.read_csv('student_data.csv')

# Printing out the first 10 rows of our data
data[:10]
``````

35.2 Plotting the data

``````# Importing matplotlib
import matplotlib.pyplot as plt

# Function to help us plot
def plot_points(data):
    X = np.array(data[["gre","gpa"]])
    y = np.array(data["admit"])
    admitted = X[np.argwhere(y==1)]
    rejected = X[np.argwhere(y==0)]
    plt.scatter([s[0][0] for s in rejected], [s[0][1] for s in rejected], s = 25, color = 'red', edgecolor = 'k')
    plt.scatter([s[0][0] for s in admitted], [s[0][1] for s in admitted], s = 25, color = 'cyan', edgecolor = 'k')
    plt.xlabel('Test (GRE)')
    plt.ylabel('Grades (GPA)')

# Plotting the points
plot_points(data)
plt.show()
``````

``````# Separating the ranks
data_rank1 = data[data["rank"]==1]
data_rank2 = data[data["rank"]==2]
data_rank3 = data[data["rank"]==3]
data_rank4 = data[data["rank"]==4]

# Plotting the graphs
plot_points(data_rank1)
plt.title("Rank 1")
plt.show()
plot_points(data_rank2)
plt.title("Rank 2")
plt.show()
plot_points(data_rank3)
plt.title("Rank 3")
plt.show()
plot_points(data_rank4)
plt.title("Rank 4")
plt.show()
``````

35.3 One-hot encoding the rank

``````# Make dummy variables for rank
one_hot_data = pd.concat([data, pd.get_dummies(data['rank'], prefix='rank')], axis=1)

# Drop the previous rank column
one_hot_data = one_hot_data.drop('rank', axis=1)

# Print the first 10 rows of our data
one_hot_data[:10]
``````

35.4 Scaling the data

``````# Copying our data
processed_data = one_hot_data[:]

# Scaling the columns
processed_data['gre'] = processed_data['gre']/800
processed_data['gpa'] = processed_data['gpa']/4.0
processed_data[:10]
``````

35.5 Splitting the data into a training set and a test set

``````sample = np.random.choice(processed_data.index, size=int(len(processed_data)*0.9), replace=False)
train_data, test_data = processed_data.iloc[sample], processed_data.drop(sample)

print("Number of training samples is", len(train_data))
print("Number of testing samples is", len(test_data))
print(train_data[:10])
print(test_data[:10])

``````

35.6 Splitting the data into features and targets (labels)

``````features = train_data.drop('admit', axis=1)
targets = train_data['admit']
features_test = test_data.drop('admit', axis=1)
targets_test = test_data['admit']

print(features[:10])
print(targets[:10])
``````

35.7 Training a two-layer neural network

``````# Activation (sigmoid) function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_prime(x):
    return sigmoid(x) * (1-sigmoid(x))

def error_formula(y, output):
    return - y*np.log(output) - (1 - y) * np.log(1-output)
``````

35.8 Backpropagating the error

``````def error_term_formula(y, output):
    return (y-output) * output * (1 - output)
``````
``````# Neural Network hyperparameters
epochs = 1000
learnrate = 0.5

# Training function
def train_nn(features, targets, epochs, learnrate):

    # Use the same seed to make debugging easier
    np.random.seed(42)

    n_records, n_features = features.shape
    last_loss = None

    # Initialize weights
    weights = np.random.normal(scale=1 / n_features**.5, size=n_features)

    for e in range(epochs):
        del_w = np.zeros(weights.shape)
        for x, y in zip(features.values, targets):
            # Loop through all records, x is the input, y is the target

            # Activation of the output unit
            #   Notice we multiply the inputs and the weights here
            #   rather than storing h as a separate variable
            output = sigmoid(np.dot(x, weights))

            # The error (log-loss) for this record
            error = error_formula(y, output)

            # The error term
            error_term = error_term_formula(y, output)

            # The gradient descent step, the error times the gradient times the inputs
            del_w += error_term * x

        # Update the weights here. The learning rate times the
        # change in weights, divided by the number of records to average
        weights += learnrate * del_w / n_records

        # Printing out the mean square error on the training set
        if e % (epochs / 10) == 0:
            out = sigmoid(np.dot(features, weights))
            loss = np.mean((out - targets) ** 2)
            print("Epoch:", e)
            if last_loss and last_loss < loss:
                print("Train loss: ", loss, "  WARNING - Loss Increasing")
            else:
                print("Train loss: ", loss)
            last_loss = loss
            print("=========")
    print("Finished training!")
    return weights

weights = train_nn(features, targets, epochs, learnrate)
``````

35.9 Calculating accuracy on the test data

``````# Calculate accuracy on test data
test_out = sigmoid(np.dot(features_test, weights))
predictions = test_out > 0.5
accuracy = np.mean(predictions == targets_test)
print("Prediction accuracy: {:.3f}".format(accuracy))
``````