1. Introduction

1.1 Preface

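This series of posts collects my study notes from the Heywhale community (和鲸社区) event "Eat That PyTorch in 20 Days" (《20天吃掉那只PyTorch》). This post is the third in the series and covers the PyTorch hierarchy. The course is a GitHub project with about 2.8K stars. While working through it, you can read Part 1, "Fundamentals of Deep Learning", of the book Deep Learning with Python alongside.

Deep Learning with Python was written by François Chollet, the creator of Keras. It assumes no prior machine learning knowledge, uses Keras as its tool, and demonstrates deep learning best practices through a rich set of examples. The book is easy to follow, contains not a single mathematical formula, and focuses on building the reader's intuition for deep learning.

Part 1 of Deep Learning with Python has the following four chapters, which readers can expect to finish in about 20 hours.

What is deep learning?
The mathematical building blocks of neural networks
Getting started with neural networks
Fundamentals of machine learning

The outline of this series is as follows:

Part 1: The PyTorch modeling workflow
Part 2: Core concepts of PyTorch
Part 3: The PyTorch hierarchy
Part 4: PyTorch's low-level API
Part 5: PyTorch's mid-level API
Part 6: PyTorch's high-level API

Finally, this post provides all the data it uses; readers can download it from the link below: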

Download Now

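1.2 The PyTorch hierarchy

This chapter introduces the five levels of the PyTorch hierarchy:

Hardware level
Kernel level
Low-level API
Mid-level API
High-level API (torchkeras)

Linear regression and a DNN binary classification model serve as examples to compare directly what implementing a model looks like at each level.

From lowest to highest, the PyTorch hierarchy can be divided into the following five levels:

The lowest level is the hardware level. PyTorch can add CPUs and GPUs to its pool of computing resources.
The second level is the kernel, implemented in C++.
The third level consists of operators exposed in Python. It provides low-level API calls that wrap the C++ kernel, mainly tensor operations, automatic differentiation, and variable management.

Examples are torch.tensor, torch.cat, torch.autograd.grad, and nn.Module. If a model is a house, the third-level API is the "bricks of the model".
The fourth level consists of model components implemented in Python. They wrap the low-level API in functions and classes, mainly model layers, loss functions, optimizers, data pipelines, and so on.

Examples are torch.nn.Linear, torch.nn.BCELoss, torch.optim.Adam, and torch.utils.data.DataLoader. If a model is a house, the fourth-level API is the "walls of the model".
The fifth level is the model interface implemented in Python. PyTorch has no official high-level API. To make training easier, the project's author followed the model interface in Keras and wrapped a high-level model interface for PyTorch, torchkeras.Model, in fewer than 300 lines of code. If a model is a house, the fifth-level API is the model itself, the "house of the model".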
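To make the difference between the third and fourth levels concrete, here is a minimal sketch that computes the same affine map twice: once through the ready-made nn.Linear layer and once with raw tensor operations. The input data and shapes are made up for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.randn(5, 2)

# Mid-level: a ready-made layer that owns its weight and bias.
layer = nn.Linear(2, 1)

# Low-level: the same affine map written with raw tensor operations.
# nn.Linear stores its weight with shape (out_features, in_features),
# so the manual version multiplies by the transpose.
manual = x @ layer.weight.T + layer.bias

same = torch.allclose(layer(x), manual, atol=1e-6)
print(same)
```

The layer adds no new math; it packages the parameters and the operation so that they can be registered, saved, and optimized as a unit.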

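2. Low-level API demonstration

The examples below use PyTorch's low-level API to implement a linear regression model and a DNN binary classification model. The low-level API mainly covers tensor operations, the computational graph, and automatic differentiation.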
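As a small illustration of the computational graph and automatic differentiation, the sketch below differentiates a quadratic at a single point. The function and the point are arbitrary choices for this example.

```python
import torch

# A leaf tensor that autograd should track.
x = torch.tensor(3.0, requires_grad=True)

# Building y records the operations in a dynamic computational graph.
y = x ** 2 + 2 * x + 1

# dy/dx = 2x + 2, evaluated at x = 3.
(grad,) = torch.autograd.grad(y, x)
print(grad)  # tensor(8.)
```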

2.1 Linear regression

2.1.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn

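# Number of samples
n = 400

# Generate the synthetic dataset
X = 10*torch.rand([n,2])-5.0  # torch.rand samples from a uniform distribution
w0 = torch.tensor([[2.0],[-3.0]])
b0 = torch.tensor([[10.0]])
Y = X@w0 + b0 + torch.normal( 0.0,2.0,size = [n,1])  # @ is matrix multiplication; add Gaussian noise

Data visualization

# Visualize the data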

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0].numpy(),Y[:,0].numpy(), c = "b",label = "samples")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1].numpy(),Y[:,0].numpy(), c = "g",label = "samples")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)
plt.show()

Results:

Build the data pipeline iterator

# Build the data pipeline iterator
def data_iter(features, labels, batch_size=8):
    num_examples = len(features)
    indices = list(range(num_examples))
    np.random.shuffle(indices)  # samples are read in random order
    for i in range(0, num_examples, batch_size):
        indexs = torch.LongTensor(indices[i: min(i + batch_size, num_examples)])
        yield  features.index_select(0, indexs), labels.index_select(0, indexs)

# Test the data pipeline
batch_size = 8
(features,labels) = next(data_iter(X,Y,batch_size))
print(features)
print(labels)

Result:

tensor([[ 1.9428, -0.7624],
        [-2.5625,  0.4411],
        [-0.7651, -3.8922],
        [ 3.1022, -2.6201],
        [-1.0578, -2.6963],
        [-1.9720,  3.8035],
        [-3.4711, -2.4106],
        [-0.6102,  2.6127]])
tensor([[11.7814],
        [ 6.0209],
        [23.4428],
        [22.5369],
        [17.8275],
        [-8.7643],
        [ 7.5050],
        [ 0.5841]])
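The iterator has two properties worth checking: every sample is visited exactly once per pass, and the last batch is smaller when the dataset size is not a multiple of the batch size. The sketch below checks both on a toy dataset. It is a variant of the iterator above that shuffles with torch.randperm instead of np.random.shuffle, which is my substitution and not the original code.

```python
import torch

def data_iter(features, labels, batch_size=8):
    # Variant of the iterator above that shuffles with torch.randperm.
    num_examples = len(features)
    indices = torch.randperm(num_examples)
    for i in range(0, num_examples, batch_size):
        batch_idx = indices[i:i + batch_size]  # slicing clips at the end
        yield features[batch_idx], labels[batch_idx]

# Toy data: 10 samples whose labels are 0..9, so coverage is easy to check.
X = torch.arange(20, dtype=torch.float32).reshape(10, 2)
Y = torch.arange(10, dtype=torch.float32).reshape(10, 1)

batch_sizes = []
seen = []
for features, labels in data_iter(X, Y, batch_size=4):
    batch_sizes.append(len(features))
    seen.extend(labels.flatten().tolist())

print(batch_sizes)  # [4, 4, 2]
print(sorted(seen) == [float(i) for i in range(10)])
```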

2.2 Model

2.2.1 Define model

# Define the model
class LinearRegression:

    def __init__(self):
        self.w = torch.randn_like(w0,requires_grad=True)
        self.b = torch.zeros_like(b0,requires_grad=True)

    # Forward pass
    def forward(self,x):
        return x@self.w + self.b

    # Loss function (half of the mean squared error)
    def loss_func(self,y_pred,y_true):  
        return torch.mean((y_pred - y_true)**2/2)

model = LinearRegression()

2.2.2 Training model

def train_step(model, features, labels):

    predictions = model.forward(features)
    loss = model.loss_func(predictions,labels)

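    # Backward pass: compute gradients
    loss.backward()

    # torch.no_grad() keeps the update out of the autograd graph; updating model.w.data directly has the same effect
    with torch.no_grad():
        # Update parameters with gradient descent
        model.w -= 0.001*model.w.grad
        model.b -= 0.001*model.b.grad

        # Reset gradients to zero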
        model.w.grad.zero_()
        model.b.grad.zero_()
    return loss

Test train_step

batch_size = 10
(features,labels) = next(data_iter(X,Y,batch_size))
train_step(model,features,labels)

Results:

tensor(68.6391, grad_fn=<MeanBackward0>)

Train model

def train_model(model,epochs):
    for epoch in range(1,epochs+1):
        for features, labels in data_iter(X,Y,10):
            loss = train_step(model,features,labels)

        if epoch%200==0:
            print("epoch =",epoch,"loss = ",loss.item())
            print("model.w =",model.w.data)
            print("model.b =",model.b.data)

train_model(model,epochs = 1000)

Results:

epoch = 200 loss =  3.2508397102355957
model.w = tensor([[ 2.0401],
        [-2.9877]])
model.b = tensor([[9.9169]])
epoch = 400 loss =  3.0016872882843018
model.w = tensor([[ 2.0435],
        [-2.9855]])
model.b = tensor([[9.9173]])
epoch = 600 loss =  2.7006335258483887
model.w = tensor([[ 2.0418],
        [-2.9843]])
model.b = tensor([[9.9174]])
epoch = 800 loss =  1.280609369277954
model.w = tensor([[ 2.0416],
        [-2.9869]])
model.b = tensor([[9.9169]])
epoch = 1000 loss =  2.169107675552368
model.w = tensor([[ 2.0420],
        [-2.9852]])
model.b = tensor([[9.9170]])
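A quick sanity check for results like these is to compare them with the closed-form least-squares fit. The sketch below regenerates data with the same recipe (using a fixed seed of my choosing, so the numbers will not match the run above exactly) and solves for the weights and bias in one step with torch.linalg.lstsq.

```python
import torch

torch.manual_seed(0)
n = 400
X = 10 * torch.rand([n, 2]) - 5.0
w0 = torch.tensor([[2.0], [-3.0]])
b0 = torch.tensor([[10.0]])
Y = X @ w0 + b0 + torch.normal(0.0, 2.0, size=[n, 1])

# Append a column of ones so the bias is estimated together with the weights.
A = torch.cat([X, torch.ones(n, 1)], dim=1)

# Least-squares solution of A @ theta ≈ Y.
theta = torch.linalg.lstsq(A, Y).solution
w_hat, b_hat = theta[:2], theta[2:]
print(w_hat.flatten(), b_hat.flatten())
```

Gradient descent should land close to this solution; the remaining gap to w0 and b0 comes from the noise added to Y.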

2.2.3 Visualization

# Visualize the results

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0].numpy(),Y[:,0].numpy(), c = "b",label = "samples")
ax1.plot(X[:,0].numpy(),(model.w[0].data*X[:,0]+model.b[0].data).numpy(),"-r",linewidth = 5.0,label = "model")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1].numpy(),Y[:,0].numpy(), c = "g",label = "samples")
ax2.plot(X[:,1].numpy(),(model.w[1].data*X[:,1]+model.b[0].data).numpy(),"-r",linewidth = 5.0,label = "model")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)

plt.show()

Results:

2.3 DNN binary classification model

2.3.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# Number of positive and negative samples
n_positive,n_negative = 2000,2000

# Generate positive samples on a small ring
r_p = 5.0 + torch.normal(0.0,1.0,size = [n_positive,1])
theta_p = 2*np.pi*torch.rand([n_positive,1])
Xp = torch.cat([r_p*torch.cos(theta_p),r_p*torch.sin(theta_p)],axis = 1)
Yp = torch.ones_like(r_p)

# Generate negative samples on a large ring
r_n = 8.0 + torch.normal(0.0,1.0,size = [n_negative,1])
theta_n = 2*np.pi*torch.rand([n_negative,1])
Xn = torch.cat([r_n*torch.cos(theta_n),r_n*torch.sin(theta_n)],axis = 1)
Yn = torch.zeros_like(r_n)

# Combine the samples
X = torch.cat([Xp,Xn],axis = 0)
Y = torch.cat([Yp,Yn],axis = 0)

# Visualize
plt.figure(figsize = (6,6))
plt.scatter(Xp[:,0].numpy(),Xp[:,1].numpy(),c = "r")
plt.scatter(Xn[:,0].numpy(),Xn[:,1].numpy(),c = "g")
plt.legend(["positive","negative"]);

Results:

Build the data pipeline iterator

# Build the data pipeline iterator
def data_iter(features, labels, batch_size=8):
    num_examples = len(features)
    indices = list(range(num_examples))
    np.random.shuffle(indices)  # samples are read in random order
    for i in range(0, num_examples, batch_size):
        indexs = torch.LongTensor(indices[i: min(i + batch_size, num_examples)])
        yield  features.index_select(0, indexs), labels.index_select(0, indexs)

# Test the data pipeline
batch_size = 8
(features,labels) = next(data_iter(X,Y,batch_size))
print(features)
print(labels)

Results:

tensor([[ 6.3216, -2.6834],
        [ 2.4433,  4.4928],
        [ 8.5585,  3.0958],
        [-1.0328,  3.3381],
        [-4.6885, -0.1144],
        [ 8.7589, -3.4486],
        [ 0.4830,  3.6482],
        [ 4.9465,  0.3443]])
tensor([[0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.]])

2.3.2 Define model

This example uses nn.Module to organize the model's variables.

class DNNModel(nn.Module):
    def __init__(self):
        super(DNNModel, self).__init__()
        self.w1 = nn.Parameter(torch.randn(2,4))
        self.b1 = nn.Parameter(torch.zeros(1,4))
        self.w2 = nn.Parameter(torch.randn(4,8))
        self.b2 = nn.Parameter(torch.zeros(1,8))
        self.w3 = nn.Parameter(torch.randn(8,1))
        self.b3 = nn.Parameter(torch.zeros(1,1))

    # Forward pass
    def forward(self,x):
        x = torch.relu(x@self.w1 + self.b1)
        x = torch.relu(x@self.w2 + self.b2)
        y = torch.sigmoid(x@self.w3 + self.b3)
        return y

    # Loss function (binary cross-entropy)
    def loss_func(self,y_pred,y_true):  
        # Clamp predictions to [1e-7, 1 - 1e-7] to avoid log(0)
        eps = 1e-7
        y_pred = torch.clamp(y_pred,eps,1.0-eps)
        bce = - y_true*torch.log(y_pred) - (1-y_true)*torch.log(1-y_pred)
        return torch.mean(bce)

    # Metric (accuracy)
    def metric_func(self,y_pred,y_true):
        y_pred = torch.where(y_pred>0.5,torch.ones_like(y_pred,dtype = torch.float32),
                          torch.zeros_like(y_pred,dtype = torch.float32))
        acc = torch.mean(1-torch.abs(y_true-y_pred))
        return acc

model = DNNModel()
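The hand-written loss and metric can be checked against library functions on a few made-up predictions. The sketch below compares the clamped binary cross-entropy with F.binary_cross_entropy and computes accuracy with a direct comparison instead of torch.where. The four predictions and labels are invented for the example.

```python
import torch
import torch.nn.functional as F

y_pred = torch.tensor([[0.9], [0.2], [0.6], [0.4]])
y_true = torch.tensor([[1.0], [0.0], [0.0], [1.0]])

# Hand-written binary cross-entropy, clamped as in loss_func above.
eps = 1e-7
p = torch.clamp(y_pred, eps, 1.0 - eps)
manual_bce = torch.mean(-y_true * torch.log(p) - (1 - y_true) * torch.log(1 - p))

# Library version for comparison.
builtin_bce = F.binary_cross_entropy(y_pred, y_true)

# Accuracy: threshold at 0.5 and compare with the labels (2 of 4 are correct).
accuracy = ((y_pred > 0.5).float() == y_true).float().mean()

print(manual_bce.item(), builtin_bce.item(), accuracy.item())
```

Note that both functions take the prediction first and the label second; binary cross-entropy is not symmetric in its arguments, so swapping them gives a different value.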

Test the model structure

# Test the model structure
batch_size = 10
(features,labels) = next(data_iter(X,Y,batch_size))

predictions = model(features)

# Both functions expect (y_pred, y_true), in that order
loss = model.loss_func(predictions,labels)
metric = model.metric_func(predictions,labels)

print("init loss:", loss.item())
print("init metric:", metric.item())

Results:

init loss: 7.446216583251953
init metric: 0.5362008810043335

len(list(model.parameters()))

Results:

2.3.3 Training model

def train_step(model, features, labels):   

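    # Forward pass: compute the loss and the metric
    predictions = model.forward(features)
    loss = model.loss_func(predictions,labels)
    metric = model.metric_func(predictions,labels)

    # Backward pass: compute gradients
    loss.backward()

    # Update parameters with gradient descent
    for param in model.parameters():
        # Reassign param.data so that the update itself is not recorded by autograd
        param.data = (param.data - 0.01*param.grad.data)

    # Reset gradients to zero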
    model.zero_grad()

    return loss.item(),metric.item()

def train_model(model,epochs):
    for epoch in range(1,epochs+1):
        loss_list,metric_list = [],[]
        for features, labels in data_iter(X,Y,20):
            lossi,metrici = train_step(model,features,labels)
            loss_list.append(lossi)
            metric_list.append(metrici)
        loss = np.mean(loss_list)
        metric = np.mean(metric_list)

        if epoch%100==0:
            print("epoch =",epoch,"loss = ",loss,"metric = ",metric)

train_model(model,epochs = 1000)

Results:

epoch = 100 loss =  0.1934373697731644 metric =  0.9207499933242798
epoch = 200 loss =  0.18901969484053552 metric =  0.918999993801117
epoch = 300 loss =  0.18451461097225547 metric =  0.9247499924898147
epoch = 400 loss =  0.18301934767514466 metric =  0.9247499933838844
epoch = 500 loss =  0.18300161071121693 metric =  0.9274999922513962
epoch = 600 loss =  0.18265636594966053 metric =  0.9219999933242797
epoch = 700 loss =  0.18221229410730302 metric =  0.9239999923110008
epoch = 800 loss =  0.1817048901133239 metric =  0.922749992609024
epoch = 900 loss =  0.18160937033127994 metric =  0.9259999924898148
epoch = 1000 loss =  0.1799963693227619 metric =  0.9282499927282334
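The manual parameter update used in this section is the same arithmetic that plain SGD (no momentum, no weight decay) performs. The sketch below applies one step both ways from identical starting weights and checks that the results agree. The data and the single weight matrix are placeholders for illustration.

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 2)
y = torch.randn(8, 1)
lr = 0.01

# Two copies of the same starting weights.
w_manual = torch.randn(2, 1, requires_grad=True)
w_optim = w_manual.detach().clone().requires_grad_(True)

# Manual update, done under no_grad so it is not recorded by autograd.
loss = torch.mean((x @ w_manual - y) ** 2)
loss.backward()
with torch.no_grad():
    w_manual -= lr * w_manual.grad
w_manual.grad.zero_()

# The same step through torch.optim.SGD.
optimizer = torch.optim.SGD([w_optim], lr=lr)
loss = torch.mean((x @ w_optim - y) ** 2)
loss.backward()
optimizer.step()
optimizer.zero_grad()

print(torch.allclose(w_manual, w_optim))
```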

2.3.4 Visualization

# Visualize the results
fig, (ax1,ax2) = plt.subplots(nrows=1,ncols=2,figsize = (12,5))
ax1.scatter(Xp[:,0],Xp[:,1], c="r")
ax1.scatter(Xn[:,0],Xn[:,1],c = "g")
ax1.legend(["positive","negative"]);
ax1.set_title("y_true");

Xp_pred = X[torch.squeeze(model.forward(X)>=0.5)]
Xn_pred = X[torch.squeeze(model.forward(X)<0.5)]

ax2.scatter(Xp_pred[:,0],Xp_pred[:,1],c = "r")
ax2.scatter(Xn_pred[:,0],Xn_pred[:,1],c = "g")
ax2.legend(["positive","negative"]);
ax2.set_title("y_pred");

Results:

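3. Mid-level API demonstration

The examples below use PyTorch's mid-level API to implement a linear regression model and a DNN binary classification model.

PyTorch's mid-level API mainly includes:

Model layers;
Loss functions;
Optimizers;
Data pipelines, and so on.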

3.1 Linear regression

3.1.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,TensorDataset

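# Number of samples
n = 400

# Generate the synthetic dataset
X = 10*torch.rand([n,2])-5.0  # torch.rand samples from a uniform distribution
w0 = torch.tensor([[2.0],[-3.0]])
b0 = torch.tensor([[10.0]])
Y = X@w0 + b0 + torch.normal( 0.0,2.0,size = [n,1])  # @ is matrix multiplication; add Gaussian noise

Visualization

# Visualize the data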

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0],Y[:,0], c = "b",label = "samples")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1],Y[:,0], c = "g",label = "samples")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)
plt.show()

Results:

Data pipeline

# Build the input data pipeline
ds = TensorDataset(X,Y)
dl = DataLoader(ds,batch_size = 10,shuffle=True,num_workers=2)
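To see what the loader yields, the sketch below builds a TensorDataset and a DataLoader on a small random dataset and records the shape of each batch. It uses num_workers=0, which loads batches in the main process; that is a choice made here to keep the example portable, not the setting used above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.randn(25, 2)
Y = torch.randn(25, 1)

ds = TensorDataset(X, Y)
# num_workers=0 loads batches in the main process.
dl = DataLoader(ds, batch_size=10, shuffle=True, num_workers=0)

shapes = [(tuple(f.shape), tuple(l.shape)) for f, l in dl]
print(len(dl))  # 3 batches: 10 + 10 + 5 samples
print(shapes)
```

Compared with the hand-written data_iter in section 2, shuffling, batching, and the short final batch are all handled by the loader.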

3.2 Model

3.2.1 Define model

model = nn.Linear(2,1) # linear layer

model.loss_func = nn.MSELoss()
model.optimizer = torch.optim.SGD(model.parameters(),lr = 0.01)
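One detail to keep in mind when comparing loss values across sections: the low-level LinearRegression class in section 2.2.1 divides the squared error by 2, while nn.MSELoss does not. The sketch below shows the factor of two on three made-up predictions.

```python
import torch
from torch import nn

y_pred = torch.tensor([[1.0], [3.0], [5.0]])
y_true = torch.tensor([[2.0], [3.0], [8.0]])

# Loss used by the low-level LinearRegression class in section 2.2.1.
half_mse = torch.mean((y_pred - y_true) ** 2 / 2)

# nn.MSELoss averages the squared errors without the factor 1/2.
mse = nn.MSELoss()(y_pred, y_true)

print(half_mse.item(), mse.item())
```

The two losses have the same minimizer; only the scale of the loss and of its gradients differs.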

3.2.2 Training model

Train step

def train_step(model, features, labels):

    predictions = model(features)
    loss = model.loss_func(predictions,labels)
    loss.backward()
    model.optimizer.step()
    model.optimizer.zero_grad()
    return loss.item()

# Test train_step
features,labels = next(iter(dl))
train_step(model,features,labels)

Results:

415.08831787109375

Train model

def train_model(model,epochs):
    for epoch in range(1,epochs+1):
        for features, labels in dl:
            loss = train_step(model,features,labels)
        if epoch%50==0:
            w = model.state_dict()["weight"]
            b = model.state_dict()["bias"]
            print("epoch =",epoch,"loss = ",loss)
            print("w =",w)
            print("b =",b)

train_model(model,epochs = 200)

Results:

epoch = 50 loss =  4.598311901092529
w = tensor([[ 1.9602, -2.9793]])
b = tensor([10.1778])
epoch = 100 loss =  3.397813320159912
w = tensor([[ 2.0284, -2.9681]])
b = tensor([10.2230])
epoch = 150 loss =  1.588686227798462
w = tensor([[ 1.9387, -2.9690]])
b = tensor([10.1770])
epoch = 200 loss =  4.254576206207275
w = tensor([[ 1.8670, -3.1228]])
b = tensor([10.2100])

3.2.3 Visualization

w,b = model.state_dict()["weight"],model.state_dict()["bias"]

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0],Y[:,0], c = "b",label = "samples")
ax1.plot(X[:,0],w[0,0]*X[:,0]+b[0],"-r",linewidth = 5.0,label = "model")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1],Y[:,0], c = "g",label = "samples")
ax2.plot(X[:,1],w[0,1]*X[:,1]+b[0],"-r",linewidth = 5.0,label = "model")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)

plt.show()

Results:

3.3 DNN binary classification model

3.3.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,TensorDataset
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# Number of positive and negative samples
n_positive,n_negative = 2000,2000

# Generate positive samples on a small ring
r_p = 5.0 + torch.normal(0.0,1.0,size = [n_positive,1])
theta_p = 2*np.pi*torch.rand([n_positive,1])
Xp = torch.cat([r_p*torch.cos(theta_p),r_p*torch.sin(theta_p)],axis = 1)
Yp = torch.ones_like(r_p)

# Generate negative samples on a large ring
r_n = 8.0 + torch.normal(0.0,1.0,size = [n_negative,1])
theta_n = 2*np.pi*torch.rand([n_negative,1])
Xn = torch.cat([r_n*torch.cos(theta_n),r_n*torch.sin(theta_n)],axis = 1)
Yn = torch.zeros_like(r_n)

# Combine the samples
X = torch.cat([Xp,Xn],axis = 0)
Y = torch.cat([Yp,Yn],axis = 0)

# Visualize
plt.figure(figsize = (6,6))
plt.scatter(Xp[:,0],Xp[:,1],c = "r")
plt.scatter(Xn[:,0],Xn[:,1],c = "g")
plt.legend(["positive","negative"]);

Results:

Pipeline

# Build the input data pipeline
ds = TensorDataset(X,Y)
dl = DataLoader(ds,batch_size = 10,shuffle=True,num_workers=2)

3.3.2 Define model

class DNNModel(nn.Module):
    def __init__(self):
        super(DNNModel, self).__init__()
        self.fc1 = nn.Linear(2,4)
        self.fc2 = nn.Linear(4,8)
        self.fc3 = nn.Linear(8,1)
        # Optimizer, created once so that Adam keeps its running moment estimates between steps
        self.optimizer = torch.optim.Adam(self.parameters(),lr = 0.001)

    # Forward pass
    def forward(self,x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        y = nn.Sigmoid()(self.fc3(x))
        return y

    # Loss function
    def loss_func(self,y_pred,y_true):
        return nn.BCELoss()(y_pred,y_true)

    # Metric (accuracy)
    def metric_func(self,y_pred,y_true):
        y_pred = torch.where(y_pred>0.5,torch.ones_like(y_pred,dtype = torch.float32),
                          torch.zeros_like(y_pred,dtype = torch.float32))
        acc = torch.mean(1-torch.abs(y_true-y_pred))
        return acc

model = DNNModel()
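The model above applies a sigmoid and then nn.BCELoss. A common alternative, not used in this post, is to return raw logits and use nn.BCEWithLogitsLoss, which fuses the two steps and is numerically more stable for large logits. The sketch below shows that the two formulations agree on a few made-up logits.

```python
import torch
from torch import nn

logits = torch.tensor([[2.0], [-1.0], [0.5]])
targets = torch.tensor([[1.0], [0.0], [1.0]])

# Sigmoid followed by BCELoss, as in the model above.
loss_two_step = nn.BCELoss()(torch.sigmoid(logits), targets)

# Single fused loss that takes raw logits.
loss_fused = nn.BCEWithLogitsLoss()(logits, targets)

print(loss_two_step.item(), loss_fused.item())
```

If you switch to the fused loss, the model's forward pass must stop applying the sigmoid, and predictions are thresholded at logit 0 instead of probability 0.5.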

Test pipeline

# Test the model structure
(features,labels) = next(iter(dl))
predictions = model(features)

loss = model.loss_func(predictions,labels)
metric = model.metric_func(predictions,labels)

print("init loss:",loss.item())
print("init metric:",metric.item())

Results:

init loss: 0.8217536807060242
init metric: 0.6000000238418579

3.3.3 Training model

Train step

def train_step(model, features, labels):

    # Forward pass: compute the loss and the metric
    predictions = model(features)
    loss = model.loss_func(predictions,labels)
    metric = model.metric_func(predictions,labels)

    # Backward pass: compute gradients
    loss.backward()

    # Update the model parameters
    model.optimizer.step()
    model.optimizer.zero_grad()

    return loss.item(),metric.item()

# Test train_step
features,labels = next(iter(dl))
train_step(model,features,labels)

Results:

(1.027471899986267, 0.4000000059604645)

Train model

def train_model(model,epochs):
    for epoch in range(1,epochs+1):
        loss_list,metric_list = [],[]
        for features, labels in dl:
            lossi,metrici = train_step(model,features,labels)
            loss_list.append(lossi)
            metric_list.append(metrici)
        loss = np.mean(loss_list)
        metric = np.mean(metric_list)

        if epoch%100==0:
            print("epoch =",epoch,"loss = ",loss,"metric = ",metric)

train_model(model,epochs = 300)

Results:

epoch = 100 loss =  0.2738241909684248 metric =  0.9302499929070472
epoch = 200 loss =  0.27702247152624065 metric =  0.9312499925494194
epoch = 300 loss =  0.27914922587944946 metric =  0.9309999929368495

3.3.4 Visualization

# Visualize the results
fig, (ax1,ax2) = plt.subplots(nrows=1,ncols=2,figsize = (12,5))
ax1.scatter(Xp[:,0],Xp[:,1], c="r")
ax1.scatter(Xn[:,0],Xn[:,1],c = "g")
ax1.legend(["positive","negative"]);
ax1.set_title("y_true");

Xp_pred = X[torch.squeeze(model.forward(X)>=0.5)]
Xn_pred = X[torch.squeeze(model.forward(X)<0.5)]

ax2.scatter(Xp_pred[:,0],Xp_pred[:,1],c = "r")
ax2.scatter(Xn_pred[:,0],Xn_pred[:,1],c = "g")
ax2.legend(["positive","negative"]);
ax2.set_title("y_pred");

Results:

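4. High-level API demonstration

PyTorch has no official high-level API, so users generally have to implement the training, validation, and prediction loops themselves.

The torchkeras.Model class wraps PyTorch's nn.Module and is modeled on the functionality of tf.keras.Model. It implements the fit, evaluate, predict, and summary methods, and amounts to a user-defined high-level API. The rest of this chapter uses it to implement the linear regression model.

In addition, torchkeras.LightModel is another implementation of a Keras-like interface, built on pytorch_lightning. The rest of this chapter uses it to implement the DNN binary classification model.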
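To make concrete what such a wrapper has to automate, here is a minimal sketch of a training loop and a validation loop in plain PyTorch. The helper names fit and evaluate are hypothetical and written only for this illustration; this is not how torchkeras implements its methods. The data is noise-free and generated on the spot.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def fit(model, loss_func, optimizer, dl_train, epochs):
    """Hypothetical helper: plain training loop, returns the mean loss per epoch."""
    history = []
    for _ in range(epochs):
        model.train()
        total, count = 0.0, 0
        for features, labels in dl_train:
            optimizer.zero_grad()
            loss = loss_func(model(features), labels)
            loss.backward()
            optimizer.step()
            total += loss.item() * len(features)
            count += len(features)
        history.append(total / count)
    return history

@torch.no_grad()
def evaluate(model, loss_func, dl_val):
    """Hypothetical helper: mean loss over a validation loader, without gradients."""
    model.eval()
    total, count = 0.0, 0
    for features, labels in dl_val:
        total += loss_func(model(features), labels).item() * len(features)
        count += len(features)
    return total / count

torch.manual_seed(0)
X = 10 * torch.rand(200, 2) - 5.0
Y = X @ torch.tensor([[2.0], [-3.0]]) + 10.0
dl = DataLoader(TensorDataset(X, Y), batch_size=20, shuffle=True)

model = nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
history = fit(model, nn.MSELoss(), optimizer, dl, epochs=20)
val_loss = evaluate(model, nn.MSELoss(), dl)
print(history[0], history[-1], val_loss)
```

A high-level API adds metric tracking, logging, and a history table on top of loops like these, so that user code shrinks to a compile call and a fit call.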

4.1 Linear regression

4.1.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,TensorDataset

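# Number of samples
n = 400

# Generate the synthetic dataset
X = 10*torch.rand([n,2])-5.0  # torch.rand samples from a uniform distribution
w0 = torch.tensor([[2.0],[-3.0]])
b0 = torch.tensor([[10.0]])
Y = X@w0 + b0 + torch.normal( 0.0,2.0,size = [n,1])  # @ is matrix multiplication; add Gaussian noise

Visualization

# Visualize the data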

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0],Y[:,0], c = "b",label = "samples")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1],Y[:,0], c = "g",label = "samples")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)
plt.show()

Results:

Data pipeline

# Build the input data pipeline
ds = TensorDataset(X,Y)
ds_train,ds_valid = torch.utils.data.random_split(ds,[int(400*0.7),400-int(400*0.7)])
dl_train = DataLoader(ds_train,batch_size = 10,shuffle=True,num_workers=2)
dl_valid = DataLoader(ds_valid,batch_size = 10,num_workers=2)
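The split above is different on every run. If you want a reproducible train/validation split, random_split accepts a seeded generator. The sketch below shows this on random placeholder data; the seed value 42 is arbitrary.

```python
import torch
from torch.utils.data import TensorDataset, random_split

ds = TensorDataset(torch.randn(400, 2), torch.randn(400, 1))

n_train = int(len(ds) * 0.7)
n_valid = len(ds) - n_train

# A seeded generator makes the split reproducible across runs.
generator = torch.Generator().manual_seed(42)
ds_train, ds_valid = random_split(ds, [n_train, n_valid], generator=generator)

print(len(ds_train), len(ds_valid))  # 280 120
```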

4.2 Model

4.2.1 Define model

# Define a custom model by subclassing torchkeras.Model
from torchkeras import Model
class LinearRegression(Model):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.fc = nn.Linear(2,1)

    def forward(self,x):
        return self.fc(x)

model = LinearRegression()

4.2.2 Training model

# Train with the fit method

def mean_absolute_error(y_pred,y_true):
    return torch.mean(torch.abs(y_pred-y_true))

def mean_absolute_percent_error(y_pred,y_true):
    absolute_percent_error = (torch.abs(y_pred-y_true)+1e-7)/(torch.abs(y_true)+1e-7)
    return torch.mean(absolute_percent_error)

model.compile(loss_func = nn.MSELoss(),
              optimizer= torch.optim.Adam(model.parameters(),lr = 0.01),
              metrics_dict={"mae":mean_absolute_error,"mape":mean_absolute_percent_error})

dfhistory = model.fit(20, dl_train = dl_train, dl_val = dl_valid,log_step_freq = 20)

Results:

Start Training ...

================================================================================2022-02-06 22:48:10
{'step': 20, 'loss': 208.126, 'mae': 11.994, 'mape': 1.195}

 +-------+---------+--------+-------+----------+---------+----------+
| epoch |   loss  |  mae   |  mape | val_loss | val_mae | val_mape |
+-------+---------+--------+-------+----------+---------+----------+
|   1   | 201.175 | 11.695 | 1.269 | 195.057  |  11.834 |  1.065   |
+-------+---------+--------+-------+----------+---------+----------+

...

 +-------+-------+-------+-------+----------+---------+----------+
| epoch |  loss |  mae  |  mape | val_loss | val_mae | val_mape |
+-------+-------+-------+-------+----------+---------+----------+
|   20  | 39.91 | 5.993 | 1.649 |  42.392  |  6.193  |  1.032   |
+-------+-------+-------+-------+----------+---------+----------+

================================================================================2022-02-06 22:49:56
Finished Training...
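To read the mae and mape columns in the log, it helps to evaluate the two metric functions by hand on a tiny example. The sketch below repeats the definitions from above and applies them to two made-up predictions.

```python
import torch

def mean_absolute_error(y_pred, y_true):
    return torch.mean(torch.abs(y_pred - y_true))

def mean_absolute_percent_error(y_pred, y_true):
    # The small constant guards against division by zero.
    ape = (torch.abs(y_pred - y_true) + 1e-7) / (torch.abs(y_true) + 1e-7)
    return torch.mean(ape)

y_true = torch.tensor([[2.0], [4.0]])
y_pred = torch.tensor([[1.0], [5.0]])

mae = mean_absolute_error(y_pred, y_true)           # (1 + 1) / 2 = 1.0
mape = mean_absolute_percent_error(y_pred, y_true)  # about (1/2 + 1/4) / 2 = 0.375
print(mae.item(), mape.item())
```

As defined here, mape is a fraction and not a percentage, so a value of 1.0 means the average error is as large as the target itself. Targets close to zero can inflate it considerably.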

4.2.3 Visualization

w,b = model.state_dict()["fc.weight"],model.state_dict()["fc.bias"]

plt.figure(figsize = (12,5))
ax1 = plt.subplot(121)
ax1.scatter(X[:,0],Y[:,0], c = "b",label = "samples")
ax1.plot(X[:,0],w[0,0]*X[:,0]+b[0],"-r",linewidth = 5.0,label = "model")
ax1.legend()
plt.xlabel("x1")
plt.ylabel("y",rotation = 0)

ax2 = plt.subplot(122)
ax2.scatter(X[:,1],Y[:,0], c = "g",label = "samples")
ax2.plot(X[:,1],w[0,1]*X[:,1]+b[0],"-r",linewidth = 5.0,label = "model")
ax2.legend()
plt.xlabel("x2")
plt.ylabel("y",rotation = 0)

plt.show()

Results:

4.2.4 Evaluation

dfhistory.tail()

Results:

	loss	mae	mape	val_loss	val_mae	val_mape
15	51.618867	6.840317	1.773152	54.423827	7.038455	1.124349
16	48.355738	6.618555	1.744567	51.134396	6.821975	1.102371
17	45.444238	6.420669	1.726280	47.896852	6.605719	1.086570
18	42.519069	6.199411	1.682794	45.115399	6.398358	1.055073
19	39.909953	5.992503	1.649152	42.391730	6.192853	1.031992

import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_'+metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation '+ metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_"+metric, 'val_'+metric])
    plt.show()

plot_metric(dfhistory,"loss")

Results:

plot_metric(dfhistory,"mape")

Results:

# Evaluate
model.evaluate(dl_valid)

Results:

{'val_loss': 42.391730308532715,
 'val_mae': 6.19285261631012,
 'val_mape': 1.0319924702246983}

4.2.5 Predict

# Predict
dl = DataLoader(TensorDataset(X))
model.predict(dl)[0:10]

Results:

tensor([[  8.9128],
        [  9.5116],
        [ 12.2481],
        [  0.1308],
        [ 16.1116],
        [-17.9351],
        [-14.6407],
        [  2.9675],
        [ 10.9686],
        [ 14.8227]])

Predict validate data

# Predict
model.predict(dl_valid)[0:10]

Results:

tensor([[ -4.9393],
        [-12.2253],
        [  3.5050],
        [  6.6128],
        [  2.7707],
        [  0.7076],
        [ -6.2700],
        [ -8.4491],
        [ -7.4038],
        [ 10.0306]])

4.3 DNN binary classification model

4.3.1 Prepare data

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import Dataset,DataLoader,TensorDataset
import torchkeras
import pytorch_lightning as pl
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# Number of positive and negative samples
n_positive,n_negative = 2000,2000

# Generate positive samples on a small ring
r_p = 5.0 + torch.normal(0.0,1.0,size = [n_positive,1])
theta_p = 2*np.pi*torch.rand([n_positive,1])
Xp = torch.cat([r_p*torch.cos(theta_p),r_p*torch.sin(theta_p)],axis = 1)
Yp = torch.ones_like(r_p)

# Generate negative samples on a large ring
r_n = 8.0 + torch.normal(0.0,1.0,size = [n_negative,1])
theta_n = 2*np.pi*torch.rand([n_negative,1])
Xn = torch.cat([r_n*torch.cos(theta_n),r_n*torch.sin(theta_n)],axis = 1)
Yn = torch.zeros_like(r_n)

# Combine the samples
X = torch.cat([Xp,Xn],axis = 0)
Y = torch.cat([Yp,Yn],axis = 0)

# Visualize
plt.figure(figsize = (6,6))
plt.scatter(Xp[:,0],Xp[:,1],c = "r")
plt.scatter(Xn[:,0],Xn[:,1],c = "g")
plt.legend(["positive","negative"]);

Results:

Dataloader

ds = TensorDataset(X,Y)

ds_train,ds_valid = torch.utils.data.random_split(ds,[int(len(ds)*0.7),len(ds)-int(len(ds)*0.7)])
dl_train = DataLoader(ds_train,batch_size = 100,shuffle=True,num_workers=2)
dl_valid = DataLoader(ds_valid,batch_size = 100,num_workers=2)

4.3.2 Define model

import torchmetrics as metrics

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2,4)
        self.fc2 = nn.Linear(4,8)
        self.fc3 = nn.Linear(8,1)

    def forward(self,x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        y = nn.Sigmoid()(self.fc3(x))
        return y

class Model(torchkeras.LightModel):

    #loss,and optional metrics
    def shared_step(self,batch)->dict:
        x, y = batch
        prediction = self(x)
        loss = nn.BCELoss()(prediction,y)
        preds = torch.where(prediction>0.5,torch.ones_like(prediction),torch.zeros_like(prediction))
        acc = metrics.functional.accuracy(preds.int(), y.int())
        # attention: there must be a key of "loss" in the returned dict
        dic = {"loss":loss,"acc":acc}
        return dic

    #optimizer,and optional lr_scheduler
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-2)
        lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.0001)
        return {"optimizer":optimizer,"lr_scheduler":lr_scheduler}

pl.seed_everything(1234)
net = Net()
model = Model(net)

torchkeras.summary(model,input_shape =(2,))

Results:

Global seed set to 1234

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Linear-1                    [-1, 4]              12
            Linear-2                    [-1, 8]              40
            Linear-3                    [-1, 1]               9
================================================================
Total params: 61
Trainable params: 61
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.000008
Forward/backward pass size (MB): 0.000099
Params size (MB): 0.000233
Estimated Total Size (MB): 0.000340
----------------------------------------------------------------
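The parameter counts in the summary can be reproduced by hand: a Linear layer with in_features inputs and out_features outputs holds in_features * out_features weights plus out_features biases. The sketch below counts them for an nn.Sequential stand-in with the same three layer sizes; the stand-in is mine and replaces the Net class only for brevity.

```python
import torch
from torch import nn

net = nn.Sequential(
    nn.Linear(2, 4), nn.ReLU(),
    nn.Linear(4, 8), nn.ReLU(),
    nn.Linear(8, 1), nn.Sigmoid(),
)

# Count weights and biases for each Linear layer, then for the whole network.
per_layer = [sum(p.numel() for p in m.parameters())
             for m in net if isinstance(m, nn.Linear)]
total = sum(p.numel() for p in net.parameters())

print(per_layer, total)  # [12, 40, 9] 61
```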

4.3.3 Training model

Note: on a machine without a GPU, the code below can raise a RuntimeError:

RuntimeError: DataLoader worker (pid(s) 6088, 19424) exited unexpectedly

Removing the gpus=0 argument avoids this error.

ckpt_cb = pl.callbacks.ModelCheckpoint(monitor='val_loss')

# gpus=0 uses the CPU
# gpus=1 uses one GPU
# gpus=2 uses two GPUs
# gpus=-1 uses all GPUs
# gpus=[0,1] uses the given GPUs
# tpu_cores=2 uses two TPU cores

trainer = pl.Trainer(max_epochs=100, callbacks=[ckpt_cb])

trainer.fit(model,dl_train,dl_valid)

Results:

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs

  | Name | Type | Params
------------------------------
0 | net  | Net  | 61    
------------------------------
61        Trainable params
0         Non-trainable params
61        Total params
0.000     Total estimated model params size (MB)

Validation sanity check: 0it [00:00, ?it/s]

Global seed set to 1234    

================================================================================2022-02-07 09:45:32
epoch =  0
{'val_loss': 0.6725655794143677, 'val_acc': 0.5399999618530273}

Training: 0it [00:00, ?it/s]

Validating: 0it [00:00, ?it/s]

================================================================================2022-02-07 09:48:22
epoch =  0
{'val_loss': 0.6592584252357483, 'val_acc': 0.5483332872390747}
{'loss': 0.679371178150177, 'acc': 0.5324999690055847}

...

Validating: 0it [00:00, ?it/s]

================================================================================2022-02-07 10:16:49
epoch =  99
{'val_loss': 0.20280574262142181, 'val_acc': 0.9183333516120911}
{'loss': 0.20242063701152802, 'acc': 0.9210714101791382}

4.3.4 Visualization

# Visualize the results
fig, (ax1,ax2) = plt.subplots(nrows=1,ncols=2,figsize = (12,5))
ax1.scatter(Xp[:,0],Xp[:,1], c="r")
ax1.scatter(Xn[:,0],Xn[:,1],c = "g")
ax1.legend(["positive","negative"]);
ax1.set_title("y_true");

Xp_pred = X[torch.squeeze(model.forward(X)>=0.5)]
Xn_pred = X[torch.squeeze(model.forward(X)<0.5)]

ax2.scatter(Xp_pred[:,0],Xp_pred[:,1],c = "r")
ax2.scatter(Xn_pred[:,0],Xn_pred[:,1],c = "g")
ax2.legend(["positive","negative"]);
ax2.set_title("y_pred");

Results:

4.3.5 Evaluation

import pandas as pd

history = model.history
dfhistory = pd.DataFrame(history)
dfhistory

Results:

	val_loss	val_acc	loss	acc	epoch
0	0.659258	0.548333	0.679371	0.532500	0
1	0.633105	0.712500	0.653128	0.617500	1
2	0.560715	0.705833	0.603827	0.702857	2
3	0.468437	0.794167	0.533967	0.737143	3
4	0.345662	0.820000	0.427476	0.795357	4
...	...	...	...	...	...
95	0.202806	0.918333	0.202421	0.921071	95
96	0.202806	0.918333	0.202421	0.921071	96
97	0.202806	0.918333	0.202421	0.921071	97
98	0.202806	0.918333	0.202421	0.921071	98
99	0.202806	0.918333	0.202421	0.921071	99

100 rows × 5 columns
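One possible explanation for the identical values in the last rows of the history is the learning rate schedule defined in section 4.3.2. The sketch below replays StepLR with step_size=10 and gamma=0.0001 on a dummy parameter. It assumes the scheduler is stepped once per epoch, which is an assumption about how the trainer drives it and is not shown in this post.

```python
import torch

# Dummy parameter, only there so the optimizer has something to hold.
param = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.Adam([param], lr=1e-2)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.0001)

lrs = []
for epoch in range(21):
    lrs.append(scheduler.get_last_lr()[0])
    optimizer.step()   # no gradients here; called only to keep the usual call order
    scheduler.step()   # one scheduler step per epoch (assumed)

print(lrs[0], lrs[10], lrs[20])  # roughly 1e-2, 1e-6, 1e-10
```

Under that assumption the learning rate falls by a factor of 10,000 every 10 epochs, so the parameters barely move after the first couple of decays. A gamma closer to 1, for example 0.5, would decay more gently.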

import matplotlib.pyplot as plt

def plot_metric(dfhistory, metric):
    train_metrics = dfhistory[metric]
    val_metrics = dfhistory['val_'+metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics, 'bo--')
    plt.plot(epochs, val_metrics, 'ro-')
    plt.title('Training and validation '+ metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_"+metric, 'val_'+metric])
    plt.show()
plot_metric(dfhistory,"loss")

Results:

plot_metric(dfhistory,"acc")

Results:

results = trainer.test(model, test_dataloaders=dl_valid, verbose = False)
print(results[0])

Results:

Testing: 0it [00:00, ?it/s]

{'test_loss': 0.20280574262142181, 'test_acc': 0.9183333516120911}

4.3.6 Predict

def predict(model,dl):
    model.eval()
    prediction = torch.cat([model.forward(t[0].to(model.device)) for t in dl])
    result = torch.where(prediction>0.5,torch.ones_like(prediction),torch.zeros_like(prediction))
    return(result.data)

result = predict(model,dl_valid)

result

Results:

tensor([[1.],
        [1.],
        [0.],
        ...,
        [0.],
        [0.],
        [1.]])
