├── .gitignore ├── LICENSE ├── README.md ├── chapter2_PyTorch安装和快速上手 ├── __init__.py └── simple_regression.py ├── chapter3_神经网络 ├── __init__.py ├── iris_multi_classification.py └── loss.png ├── chapter4_神经网络的训练 ├── __init__.py ├── mnist_dnn │ ├── __init__.py │ ├── dnn_mnist.py │ └── model │ │ ├── mnist_dnn_model.pkl │ │ └── mnist_model.pkl └── optim │ ├── __init__.py │ ├── dataset.png │ ├── losses.png │ └── optim.py ├── chapter5_卷积神经网络 ├── __init__.py ├── cnn.pth ├── mycnn.py └── predict_cnn.py ├── chapter6_嵌入与表示学习 ├── __init__.py ├── autoencoder.py ├── denoise_autoencoder.py └── word_embeddings.py ├── chapter7_序列预测模型 ├── __init__.py ├── data │ ├── eng-cmn.txt │ ├── eng_cmn_attn_decoder1.model │ ├── eng_cmn_attn_decoder1.stat │ ├── eng_cmn_encoder1.model │ ├── eng_cmn_encoder1.stat │ ├── eng_cmn_input_lang.pkl │ ├── eng_cmn_output_lang.pkl │ └── eng_cmn_pairs.pkl ├── evaluate_cmn_eng.py ├── logger.py ├── model.py ├── process.py ├── seq2seq.py └── train.py ├── chapter8_PyTorch项目实战 ├── __init__.py ├── cat_vs_dog │ ├── convert.py │ ├── data │ ├── preprare_data.py │ └── result.txt ├── speech_command │ ├── README.md │ ├── __init__.py │ ├── make_dataset.py │ ├── model.py │ ├── run.py │ ├── speech_loader.py │ └── train.py └── text_classification │ ├── __init__.py │ ├── model.py │ ├── mydatasets.py │ ├── readme.txt │ └── text_classification.py ├── corrigendum.md ├── environment.yaml ├── images ├── PyTorch-in-action.png └── 图_4-2.png └── requirements.txt /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | chapter8_PyTorch项目实战/cat_vs_dog/ 6 | chapter8_PyTorch项目实战/text_classification/data 7 | chapter8_PyTorch项目实战/text_classification/.vector_cache/ 8 | chapter8_PyTorch项目实战/speech_command/org_data 9 | chapter8_PyTorch项目实战/speech_command/data 10 | # Pycharm 11 | .idea 12 | 13 | # C extensions 14 | *.so 15 | 16 | # Distribution / packaging 17 | .Python 18 | build/ 19 | develop-eggs/ 20 | dist/ 21 | downloads/ 22 | eggs/ 23 | .eggs/ 24 | lib/ 25 | lib64/ 26 | parts/ 27 | sdist/ 28 | var/ 29 | wheels/ 30 | *.egg-info/ 31 | .installed.cfg 32 | *.egg 33 | MANIFEST 34 | 35 | # PyInstaller 36 | # Usually these files are written by a python script from a template 37 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
38 | *.manifest 39 | *.spec 40 | 41 | # Installer logs 42 | pip-log.txt 43 | pip-delete-this-directory.txt 44 | 45 | # Unit test / coverage reports 46 | htmlcov/ 47 | .tox/ 48 | .coverage 49 | .coverage.* 50 | .cache 51 | nosetests.xml 52 | coverage.xml 53 | *.cover 54 | .hypothesis/ 55 | .pytest_cache/ 56 | 57 | # Translations 58 | *.mo 59 | *.pot 60 | 61 | # Django stuff: 62 | *.log 63 | local_settings.py 64 | db.sqlite3 65 | 66 | # Flask stuff: 67 | instance/ 68 | .webassets-cache 69 | 70 | # Scrapy stuff: 71 | .scrapy 72 | 73 | # Sphinx documentation 74 | docs/_build/ 75 | 76 | # PyBuilder 77 | target/ 78 | 79 | # Jupyter Notebook 80 | .ipynb_checkpoints 81 | 82 | # pyenv 83 | .python-version 84 | 85 | # celery beat schedule file 86 | celerybeat-schedule 87 | 88 | # SageMath parsed files 89 | *.sage.py 90 | 91 | # Environments 92 | .env 93 | .venv 94 | env/ 95 | venv/ 96 | ENV/ 97 | env.bak/ 98 | venv.bak/ 99 | 100 | # Spyder project settings 101 | .spyderproject 102 | .spyproject 103 | 104 | # Rope project settings 105 | .ropeproject 106 | 107 | # mkdocs documentation 108 | /site 109 | 110 | # mypy 111 | .mypy_cache/ 112 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 校宝在线 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # 《PyTorch机器学习从入门到实战》 (PyTorch Machine Learning: From Getting Started to Practice)
2 | 
3 | 
4 | ![《PyTorch机器学习从入门到实战》 cover](./images/PyTorch-in-action.png)
5 | 
6 | 
7 | 
8 | ## About
9 | 
10 | This repository contains the companion example code for the book 《PyTorch机器学习从入门到实战》.
11 | 
12 | Purchase links:
13 | [China Machine Press](http://www.cmpbook.com/stackroom.php?id=44729) |
14 | [Amazon.cn](https://www.amazon.cn/dp/B07JRBZJ9M/ref=sr_1_7?ie=UTF8&qid=1540979335&sr=8-7&keywords=PyTorch%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0) |
15 | [china-pub](http://product.china-pub.com/8053408) |
16 | [Dangdang](http://search.dangdang.com/?key=PyTorch%BB%FA%C6%F7%D1%A7%CF%B0%B4%D3%C8%EB%C3%C5%B5%BD%CA%B5%D5%BD&act=input) |
17 | [JD.com](https://search.jd.com/Search?keyword=PyTorch%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E4%BB%8E%E5%85%A5%E9%97%A8%E5%88%B0%E5%AE%9E%E6%88%98&enc=utf-8&wq=PyTorch%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E4%BB%8E%E5%85%A5%E9%97%A8%E5%88%B0%E5%AE%9E%E6%88%98&pvid=0265a6a4990c4bc58fd670c8d5f63aac)
18 | 
19 | 
20 | In recent years, deep-learning-based artificial intelligence has set off a wave of interest in the field. This book is an introductory text on the PyTorch deep learning framework. Starting from the principles of deep learning and progressing step by step, it covers neural networks, deep neural networks, convolutional neural networks, autoencoders, recurrent neural networks and more, interleaving the relevant PyTorch features with worked examples for each topic. Finally, it applies PyTorch and deep learning together to concrete practical problems such as image recognition, text classification and spoken-command recognition. The book is thus both an introductory tutorial on deep learning and PyTorch and a gateway into the opportunity-rich and challenging field of artificial intelligence.
21 | 
22 | The book is aimed at machine learning and artificial intelligence enthusiasts and researchers, who are expected to have some knowledge of machine learning and deep learning as well as basic Python programming skills.
23 | 
24 | ## Errata
25 | The authors' knowledge, time and energy being limited, the book inevitably contains errors. Those already found are listed in the [errata](./corrigendum.md); for newly discovered ones, please open an [issue](https://github.com/xiaobaoonline/pytorch-in-action/issues). It will be greatly appreciated.
26 | 
27 | 
28 | ## Table of Contents
29 | 
30 | ```
31 | Chapter 1  Introduction to Deep Learning
32 | 1.1 Artificial Intelligence, Machine Learning and Deep Learning
33 | 1.2 Deep Learning Tools
34 | 1.3 Introducing PyTorch
35 | 1.4 What You Can Learn from This Book
36 | Chapter 2  Installing PyTorch and Getting Started
37 | 2.1 Installing PyTorch
38 | 2.2 Using Jupyter Notebook
39 | 2.3 NumPy Basics
40 | 2.4 PyTorch Basics
41 | Chapter 3  Neural Networks
42 | 3.1 Neurons and Neural Networks
43 | 3.2 Activation Functions
44 | 3.3 The Forward Pass
45 | 3.4 Loss Functions
46 | 3.5 Backpropagation
47 | 3.6 Preparing the Data
48 | 3.7 PyTorch in Practice: A Single-Layer Neural Network
49 | Chapter 4  Deep Neural Networks and Their Training
50 | 4.1 Deep Neural Networks
51 | 4.2 Gradient Descent
52 | 4.3 Optimizers
53 | 4.4 Regularization
54 | 4.5 PyTorch in Practice: A Deep Neural Network
55 | Chapter 5  Convolutional Neural Networks
56 | 5.1 Computer Vision
57 | 5.2 Convolutional Neural Networks
58 | 5.3 A Convolutional Network on the MNIST Dataset
59 | Chapter 6  Embeddings and Representation Learning
60 | 6.1 PCA
61 | 6.2 Autoencoders
62 | 6.3 Word Embeddings
63 | Chapter 7  Sequence Prediction Models
64 | 7.1 Processing Sequence Data
65 | 7.2 Recurrent Neural Networks
66 | 7.3 LSTM and GRU
67 | 7.4 LSTM in Natural Language Processing
68 | 7.5 Sequence-to-Sequence Networks
69 | 7.6 PyTorch in Practice: Machine Translation with GRU and Attention
70 | Chapter 8  PyTorch Projects in Practice
71 | 8.1 Image Recognition and Transfer Learning: Cats vs. Dogs
72 | 8.2 Text Classification
73 | 8.3 An Introduction to Speech Recognition
74 | ```
75 | 
76 | ## Environment
77 | - python==3.6.5
78 | - pytorch==0.3.0
79 | - torch==0.3.0
80 | - torchtext==0.2.1
81 | - torchvision==0.2.0
82 | - librosa==0.5.1
--------------------------------------------------------------------------------
/chapter2_PyTorch安装和快速上手/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # @Time : 18-5-16 下午2:21
4 | # @File : __init__.py.py
--------------------------------------------------------------------------------
/chapter2_PyTorch安装和快速上手/simple_regression.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # @Time : 18-5-16 下午2:21
4 | # @File : simple_regression.py
5 | 
6 | from __future__ import print_function
7 | 
8 | from itertools import count
9 | 
10 | import matplotlib.pyplot as plt
11 | import numpy as np
12 | import torch
13 | import torch.autograd
14 | import torch.nn.functional as F
15 | from torch.autograd import Variable
16 | 
17 | random_state = 5000
18 | torch.manual_seed(random_state)
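# A quick reproducibility check (an illustrative sketch, not part of the
# original script): with the seed fixed above, torch's random draws replay
# identically across runs.
#   torch.manual_seed(random_state); a = torch.randn(3)
#   torch.manual_seed(random_state); b = torch.randn(3)
#   assert torch.equal(a, b)  # same seed, same W_target / b_target below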
19 | POLY_DEGREE = 4
20 | W_target = torch.randn(POLY_DEGREE, 1) * 5
21 | b_target = torch.randn(1) * 5
22 | 
23 | 
24 | def make_features(x):
25 |     """创建一个特征矩阵,结构为 [x, x^2, x^3, x^4]."""
26 |     x = x.unsqueeze(1)
27 |     return torch.cat([x ** i for i in range(1, POLY_DEGREE + 1)], 1)
28 | 
29 | 
30 | def f(x):
31 |     """近似函数."""
32 |     return x.mm(W_target) + b_target[0]
33 | 
34 | 
35 | def poly_desc(W, b):
36 |     """生成多项式描述内容."""
37 |     result = 'y = '
38 |     for i, w in enumerate(W):
39 |         result += '{:+.2f} x^{} '.format(w, len(W) - i)
40 |     result += '{:+.2f}'.format(b[0])
41 |     return result
42 | 
43 | 
44 | def get_batch(batch_size=32):
45 |     """创建类似 (x, f(x)) 的批数据."""
46 |     random = torch.from_numpy(np.sort(torch.randn(batch_size)))
47 |     x = make_features(random)
48 |     y = f(x)
49 |     return Variable(x), Variable(y)
50 | 
51 | 
52 | # 声明模型
53 | fc = torch.nn.Linear(W_target.size(0), 1)
54 | 
55 | for batch_idx in count(1):
56 |     # 获取数据
57 |     batch_x, batch_y = get_batch()
58 | 
59 |     # 重置求导
60 |     fc.zero_grad()
61 | 
62 |     # 前向传播
63 |     output = F.smooth_l1_loss(fc(batch_x), batch_y)
64 |     loss = output.data[0]
65 | 
66 |     # 后向传播
67 |     output.backward()
68 | 
69 |     # 应用导数
70 |     for param in fc.parameters():
71 |         param.data.add_(-0.1 * param.grad.data)
72 | 
73 |     # 停止条件
74 |     if loss < 1e-3:
75 |         plt.cla()
76 |         plt.scatter(batch_x.data.numpy()[:, 0], batch_y.data.numpy()[:, 0], label='real curve', color='b')
77 |         plt.plot(batch_x.data.numpy()[:, 0], fc(batch_x).data.numpy()[:, 0], label='fitting curve', color='r')
78 |         plt.legend()
79 |         plt.show()
80 |         break
81 | 
82 | print('Loss: {:.6f} after {} batches'.format(loss, batch_idx))
83 | print('==> Learned function:\t' + poly_desc(fc.weight.data.view(-1), fc.bias.data))
84 | print('==> Actual function:\t' + poly_desc(W_target.view(-1), b_target))
85 | 
--------------------------------------------------------------------------------
/chapter3_神经网络/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # @Time : 18-5-16 下午4:52
4 | # @Author : J.W.
5 | # @File : __init__.py.py
--------------------------------------------------------------------------------
/chapter3_神经网络/iris_multi_classification.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # @Time : 18-5-16 下午4:54
4 | # @Author : J.W.
5 | # @File : iris_multi_classification.py
6 | 
7 | '''
8 | PyTorch in action.
9 | PyTorch 实现iris 数据集的分类。
10 | '''
11 | 
12 | from __future__ import print_function
13 | 
14 | import matplotlib.pyplot as plt
15 | import torch
16 | import torch.nn.functional as F
17 | from sklearn.datasets import load_iris
18 | from torch.autograd import Variable
19 | from torch.optim import SGD
20 | 
21 | # GPU 是否可用
22 | use_cuda = torch.cuda.is_available()
23 | print("use_cuda: ", use_cuda)
24 | 
25 | # 加载数据集
26 | iris = load_iris()
27 | print(iris.keys())  # dict_keys(['target_names', 'data', 'feature_names', 'DESCR', 'target'])
28 | 
29 | x = iris['data']  # 特征信息
30 | y = iris['target']  # 目标分类
31 | print(x.shape)  # (150, 4)
32 | print(y.shape)  # (150,)
33 | 
34 | print(y)
35 | 
36 | x = torch.FloatTensor(x)
37 | y = torch.LongTensor(y)
38 | x, y = Variable(x), Variable(y)
39 | 
40 | 
41 | class Net(torch.nn.Module):
42 |     """
43 |     定义网络
44 |     """
45 | 
46 |     def __init__(self, n_feature, n_hidden, n_output):
47 |         """
48 |         初始化函数,接受自定义输入特征维数,隐藏层特征维数,输出层特征维数
49 |         """
50 |         super(Net, self).__init__()
51 |         self.hidden = torch.nn.Linear(n_feature, n_hidden)  # 一个线性隐藏层
52 |         self.predict = torch.nn.Linear(n_hidden, n_output)  # 线性输出层
53 | 
54 |     def forward(self, x):
55 |         """
56 |         前向传播过程
57 |         """
58 |         x = F.sigmoid(self.hidden(x))
59 |         x = self.predict(x)
60 |         out = F.log_softmax(x, dim=1)
61 |         return out
62 | 
63 | 
64 | # iris 中输入特征 4 维,隐藏层和输出层可以自己选择
65 | net = Net(n_feature=4, n_hidden=5, n_output=3)
66 | 
67 | # 如果GPU可用 训练数据和模型都放到GPU上,注意:数据和网络是否在GPU上要同步
68 | if use_cuda:
69 |     x = x.cuda()
70 |     y = y.cuda()
71 |     net = net.cuda()
72 | 
73 | # 查看网络结构
74 | print(net)
75 | 
76 | optimizer = SGD(net.parameters(), lr=0.5)
77 | 
78 | iter_num = 1000
79 | px, py = [], []
80 | 
81 | plt.rcParams['font.sans-serif'] = ['STSong']  # 用来正常显示中文标签
82 | plt.rcParams['axes.unicode_minus'] = False  # 用来正常显示负号
83 | 
84 | for i in range(iter_num):
85 |     # 数据集传入网络前向计算
86 |     prediction = net(x)
87 | 
88 |     # 计算loss
89 |     loss = F.nll_loss(prediction, y)
90 |     # 这里也可用CrossEntropyLoss
91 |     # loss = loss_func(prediction, y)
92 | 
93 |     # 清除网络状态
94 |     optimizer.zero_grad()
95 | 
96 |     # loss 反向传播
97 |     loss.backward()
98 | 
99 |     # 更新参数
100 |     optimizer.step()
101 | 
102 |     # 打印并记录当前的index 和 loss
103 |     print(i, " loss: ", loss.data[0])
104 |     px.append(i)
105 |     py.append(loss.data[0])
106 | 
107 |     if i % 10 == 0:
108 |         # 动态画出loss走向 结果:loss.png
109 |         plt.cla()
110 |         plt.title(u'训练过程的loss曲线')
111 |         plt.xlabel(u'迭代次数')
112 |         plt.ylabel('损失')
113 |         plt.plot(px, py, 'r-', lw=1)
114 |         plt.text(0, 0, 'Loss=%.4f' % loss.data[0], fontdict={'size': 20, 'color': 'red'})
115 |         plt.pause(0.1)
116 |         if i == iter_num - 1:
117 |             # 最后一个图像定格
118 |             plt.show()
--------------------------------------------------------------------------------
/chapter3_神经网络/loss.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter3_神经网络/loss.png
--------------------------------------------------------------------------------
/chapter4_神经网络的训练/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # @Time : 18-5-21 下午5:48
4 | # @Author : J.W.
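# A minimal accuracy check for the chapter-3 iris classifier above (a sketch,
# assuming `net`, `x` and `y` from iris_multi_classification.py are still in
# scope after training):
#   _, pred = torch.max(net(x).data, 1)   # predicted class per flower
#   correct = (pred == y.data).sum()      # matches among the 150 samples
#   print('train accuracy: %.3f' % (correct / y.size(0)))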
5 | # @File : __init__.py.py -------------------------------------------------------------------------------- /chapter4_神经网络的训练/mnist_dnn/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @Time : 18-5-29 上午10:21 4 | # @Author : J.W. 5 | # @File : __init__.py.py -------------------------------------------------------------------------------- /chapter4_神经网络的训练/mnist_dnn/dnn_mnist.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | import torch 3 | import torch.nn as nn 4 | import torchvision.datasets as dsets 5 | import torchvision.transforms as transforms 6 | from torch.autograd import Variable 7 | 8 | # Hyper Parameters 配置参数 9 | torch.manual_seed(1) # 设置随机数种子,确保结果可重复 10 | input_size = 784 # 11 | hidden_size = 500 12 | num_classes = 10 13 | num_epochs = 5 # 训练次数 14 | batch_size = 100 # 批处理大小 15 | learning_rate = 0.001 # 学习率 16 | 17 | # MNIST Dataset 下载训练集 MNIST 手写数字训练集 18 | train_dataset = dsets.MNIST(root='./data', # 数据保持的位置 19 | train=True, # 训练集 20 | transform=transforms.ToTensor(), # 一个取值范围是[0,255]的PIL.Image 21 | # 转化为取值范围是[0,1.0]的torch.FloadTensor 22 | download=True) # 下载数据 23 | 24 | test_dataset = dsets.MNIST(root='./data', 25 | train=False, # 测试集 26 | transform=transforms.ToTensor()) 27 | 28 | # Data Loader (Input Pipeline) 29 | # 数据的批处理,尺寸大小为batch_size, 30 | # 在训练集中,shuffle 必须设置为True, 表示次序是随机的 31 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 32 | batch_size=batch_size, 33 | shuffle=True) 34 | 35 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 36 | batch_size=batch_size, 37 | shuffle=False) 38 | 39 | 40 | # Neural Network Model (1 hidden layer) 定义神经网络模型 41 | class Net(nn.Module): 42 | def __init__(self, input_size, hidden_size, num_classes): 43 | super(Net, self).__init__() 44 | self.fc1 = nn.Linear(input_size, hidden_size) 45 | self.relu = nn.ReLU() 46 | self.fc2 = nn.Linear(hidden_size, num_classes) 47 | 48 | def forward(self, x): 49 | out = self.fc1(x) 50 | out = self.relu(out) 51 | out = self.fc2(out) 52 | return out 53 | 54 | 55 | net = Net(input_size, hidden_size, num_classes) 56 | # 打印模型 57 | print(net) 58 | 59 | # Loss and Optimizer 定义loss和optimizer 60 | criterion = nn.CrossEntropyLoss() 61 | optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate) 62 | 63 | # Train the Model 开始训练 64 | for epoch in range(num_epochs): 65 | for i, (images, labels) in enumerate(train_loader): # 批处理 66 | # Convert torch tensor to Variable 67 | images = Variable(images.view(-1, 28 * 28)) 68 | labels = Variable(labels) 69 | 70 | # Forward + Backward + Optimize 71 | optimizer.zero_grad() # zero the gradient buffer #梯度清零,以免影响其他batch 72 | outputs = net(images) # 前向传播 73 | 74 | # import pdb 75 | # pdb.set_trace() 76 | loss = criterion(outputs, labels) # loss 77 | loss.backward() # 后向传播,计算梯度 78 | optimizer.step() # 梯度更新 79 | 80 | if (i + 1) % 100 == 0: 81 | print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f' 82 | % (epoch + 1, num_epochs, i + 1, len(train_dataset) // batch_size, loss.data[0])) 83 | 84 | # Test the Model 85 | correct = 0 86 | total = 0 87 | for images, labels in test_loader: # test set 批处理 88 | images = Variable(images.view(-1, 28 * 28)) 89 | outputs = net(images) 90 | _, predicted = torch.max(outputs.data, 1) # 预测结果 91 | total += labels.size(0) # 正确结果 92 | correct += (predicted == labels).sum() # 正确结果总数 93 | 94 | print('Accuracy of the network on the 10000 test images: %d %%' % (100 * 
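# (sketch of the bookkeeping) `correct` accumulates (predicted == labels).sum()
# over the test batches and `total` accumulates labels.size(0), so the value
# printed below is the percentage of the 10000 MNIST test digits classified
# correctly: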
correct / total)) 95 | 96 | # Save the Model 97 | torch.save(net.state_dict(), 'mnist_dnn_model.pkl') 98 | -------------------------------------------------------------------------------- /chapter4_神经网络的训练/mnist_dnn/model/mnist_dnn_model.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter4_神经网络的训练/mnist_dnn/model/mnist_dnn_model.pkl -------------------------------------------------------------------------------- /chapter4_神经网络的训练/mnist_dnn/model/mnist_model.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter4_神经网络的训练/mnist_dnn/model/mnist_model.pkl -------------------------------------------------------------------------------- /chapter4_神经网络的训练/optim/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @Time : 18-5-29 上午10:31 4 | # @Author : J.W. 5 | # @File : __init__.py.py -------------------------------------------------------------------------------- /chapter4_神经网络的训练/optim/dataset.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter4_神经网络的训练/optim/dataset.png -------------------------------------------------------------------------------- /chapter4_神经网络的训练/optim/losses.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter4_神经网络的训练/optim/losses.png -------------------------------------------------------------------------------- /chapter4_神经网络的训练/optim/optim.py: -------------------------------------------------------------------------------- 1 | import matplotlib.pyplot as plt 2 | import torch 3 | import torch.nn.functional as F 4 | import torch.utils.data as Data 5 | from torch.autograd import Variable 6 | 7 | torch.manual_seed(1) # 确定随机种子,保证结果可重复 8 | 9 | LR = 0.01 10 | BATCH_SIZE = 20 11 | EPOCH = 10 12 | 13 | # 生成数据 14 | x = torch.unsqueeze(torch.linspace(-1, 1, 1500), dim=1) 15 | y = x.pow(3) + 0.1 * torch.normal(torch.zeros(*x.size())) 16 | 17 | # 绘制数据分布 18 | plt.scatter(x.numpy(), y.numpy()) 19 | plt.show() 20 | 21 | # 把数据转换为torch类型 22 | torch_dataset = Data.TensorDataset(x, y) 23 | loader = Data.DataLoader(dataset=torch_dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2, ) 24 | 25 | 26 | # 定义模型 27 | class Net(torch.nn.Module): 28 | def __init__(self): 29 | super(Net, self).__init__() 30 | self.hidden = torch.nn.Linear(1, 20) # 隐藏层 31 | self.predict = torch.nn.Linear(20, 1) # 输出层 32 | 33 | def forward(self, x): 34 | # pdb.set_trace() 35 | x = F.relu(self.hidden(x)) # 隐藏层的激活函数 36 | x = self.predict(x) # 线性输出 37 | return x 38 | 39 | 40 | # 不同的网络模型 41 | net_SGD = Net() 42 | net_Momentum = Net() 43 | net_RMSprop = Net() 44 | net_AdaGrad = Net() 45 | net_Adam = Net() 46 | 47 | nets = [net_SGD, net_Momentum, net_AdaGrad, net_RMSprop, net_Adam] 48 | # 不同的优化器 49 | opt_SGD = torch.optim.SGD(net_SGD.parameters(), lr=LR) 50 | opt_Momentum = torch.optim.SGD(net_Momentum.parameters(), lr=LR, momentum=0.8) 51 | 52 | opt_AdaGrad = torch.optim.Adagrad(net_AdaGrad.parameters(), lr=LR) 53 | opt_RMSprop = 
torch.optim.RMSprop(net_RMSprop.parameters(), lr=LR, alpha=0.9) 54 | opt_Adam = torch.optim.Adam(net_Adam.parameters(), lr=LR, betas=(0.9, 0.99)) 55 | optimizers = [opt_SGD, opt_Momentum, opt_AdaGrad, opt_RMSprop, opt_Adam] 56 | 57 | loss_func = torch.nn.MSELoss() 58 | losses_his = [[], [], [], [], []] # 记录loss用 59 | 60 | # 模型训练 61 | for epoch in range(EPOCH): 62 | print('Epoch: ', epoch) 63 | for step, (batch_x, batch_y) in enumerate(loader): 64 | b_x = Variable(batch_x) 65 | b_y = Variable(batch_y) 66 | 67 | for net, opt, l_his in zip(nets, optimizers, losses_his): 68 | output = net(b_x) # 前向算法的结果 69 | loss = loss_func(output, b_y) # 计算loss 70 | opt.zero_grad() # 梯度清零 71 | loss.backward() # 后向算法,计算梯度 72 | opt.step() # 应用梯度 73 | l_his.append(loss.data[0]) # 记录loss 74 | 75 | labels = ['SGD', 'Momentum', 'AdaGrad', 'RMSprop', 'Adam'] 76 | for i, l_his in enumerate(losses_his): 77 | plt.plot(l_his, label=labels[i]) 78 | plt.legend(loc='best') 79 | plt.xlabel('Steps') 80 | plt.ylabel('Loss') 81 | plt.ylim((0, 0.2)) 82 | plt.show() 83 | -------------------------------------------------------------------------------- /chapter5_卷积神经网络/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @Time : 18-5-21 下午5:48 4 | # @Author : J.W. 5 | # @File : __init__.py.py -------------------------------------------------------------------------------- /chapter5_卷积神经网络/cnn.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter5_卷积神经网络/cnn.pth -------------------------------------------------------------------------------- /chapter5_卷积神经网络/mycnn.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # __author__ = 'jiangyangbo' 3 | 4 | # 配置库 5 | import torch 6 | from torch import nn, optim 7 | from torch.autograd import Variable 8 | from torch.utils.data import DataLoader 9 | from torchvision import datasets 10 | from torchvision import transforms 11 | 12 | # 配置参数 13 | torch.manual_seed( 14 | 1) # 设置随机数种子,确保结果可重复 15 | batch_size = 128 # 批处理大小 16 | learning_rate = 1e-2 # 学习率 17 | num_epoches = 10 # 训练次数 18 | 19 | # 下载训练集 MNIST 手写数字训练集 20 | train_dataset = datasets.MNIST( 21 | root='./data', # 数据保持的位置 22 | train=True, # 训练集 23 | transform=transforms.ToTensor(), # 一个取值范围是[0,255]的PIL.Image 24 | # 转化为取值范围是[0,1.0]的torch.FloadTensor 25 | download=True) # 下载数据 26 | 27 | test_dataset = datasets.MNIST( 28 | root='./data', 29 | train=False, # 测试集 30 | transform=transforms.ToTensor()) 31 | 32 | # 数据的批处理,尺寸大小为batch_size, 33 | # 在训练集中,shuffle 必须设置为True, 表示次序是随机的 34 | train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) 35 | test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False) 36 | 37 | 38 | # 定义卷积神经网络模型 39 | class Cnn(nn.Module): 40 | def __init__(self, in_dim, n_class): # 28x28x1 41 | super(Cnn, self).__init__() 42 | self.conv = nn.Sequential( 43 | nn.Conv2d(in_dim, 6, 3, stride=1, padding=1), # 28 x28 44 | nn.ReLU(True), 45 | nn.MaxPool2d(2, 2), # 14 x 14 46 | nn.Conv2d(6, 16, 5, stride=1, padding=0), # 10 * 10*16 47 | nn.ReLU(True), nn.MaxPool2d(2, 2)) # 5x5x16 48 | 49 | self.fc = nn.Sequential( 50 | nn.Linear(400, 120), # 400 = 5 * 5 * 16 51 | nn.Linear(120, 84), 52 | nn.Linear(84, n_class)) 53 | 54 | def forward(self, x): 55 | out = self.conv(x) 56 | out = out.view(out.size(0), 
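# shape bookkeeping behind the flatten below (a sketch of the arithmetic):
# 28x28 input -> conv 3x3, pad 1 keeps 28x28 -> pool /2 -> 14x14
# -> conv 5x5, pad 0 -> 14 - 5 + 1 = 10x10 -> pool /2 -> 5x5 over 16 channels,
# i.e. 16 * 5 * 5 =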
400) # 400 = 5 * 5 * 16, 57 | out = self.fc(out) 58 | return out 59 | 60 | 61 | # 打印模型 62 | model = Cnn(1, 10) # 图片大小是28x28, 10 63 | 64 | # 打印模型 65 | print(model) 66 | 67 | # 定义loss和optimizer 68 | criterion = nn.CrossEntropyLoss() 69 | optimizer = optim.SGD(model.parameters(), lr=learning_rate) 70 | 71 | # 开始训练 72 | for epoch in range(num_epoches): 73 | print('epoch {}'.format(epoch + 1)) 74 | print('*' * 10) 75 | running_loss = 0.0 76 | running_acc = 0.0 77 | for i, data in enumerate(train_loader, 1): # 批处理 78 | img, label = data 79 | img = Variable(img) 80 | label = Variable(label) 81 | # 前向传播 82 | out = model(img) 83 | loss = criterion(out, label) # loss 84 | running_loss += loss.data[0] * label.size(0) # total loss , 由于loss 是batch 取均值的,需要把batch size 乘回去 85 | _, pred = torch.max(out, 1) # 预测结果 86 | num_correct = (pred == label).sum() # 正确结果的num 87 | # accuracy = (pred == label).float().mean() #正确率 88 | running_acc += num_correct.data[0] # 正确结果的总数 89 | # 后向传播 90 | optimizer.zero_grad() # 梯度清零,以免影响其他batch 91 | loss.backward() # 后向传播,计算梯度 92 | optimizer.step() # 梯度更新 93 | 94 | # if i % 300 == 0: 95 | # print('[{}/{}] Loss: {:.6f}, Acc: {:.6f}'.format( 96 | # epoch + 1, num_epoches, running_loss / (batch_size * i), 97 | # running_acc / (batch_size * i))) 98 | # 打印一个循环后,训练集合上的loss 和 正确率 99 | print('Train Finish {} epoch, Loss: {:.6f}, Acc: {:.6f}'.format( 100 | epoch + 1, running_loss / (len(train_dataset)), running_acc / (len( 101 | train_dataset)))) 102 | 103 | # 模型测试, 由于训练和测试 BatchNorm, Dropout配置不同,需要说明是否模型测试 104 | model.eval() 105 | eval_loss = 0 106 | eval_acc = 0 107 | for data in test_loader: # test set 批处理 108 | img, label = data 109 | 110 | img = Variable(img, volatile=True) # volatile 确定你是否不调用.backward(), 测试中不需要 111 | label = Variable(label, volatile=True) 112 | out = model(img) # 前向算法 113 | loss = criterion(out, label) # 计算 loss 114 | eval_loss += loss.data[0] * label.size(0) # total loss 115 | _, pred = torch.max(out, 1) # 预测结果 116 | num_correct = (pred == label).sum() # 正确结果 117 | eval_acc += num_correct.data[0] # 正确结果总数 118 | 119 | print('Test Loss: {:.6f}, Acc: {:.6f}'.format(eval_loss / (len( 120 | test_dataset)), eval_acc * 1.0 / (len(test_dataset)))) 121 | 122 | # 保存模型 123 | torch.save(model.state_dict(), './cnn.pth') 124 | -------------------------------------------------------------------------------- /chapter5_卷积神经网络/predict_cnn.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # 配置库 3 | import torch 4 | from torch import nn 5 | from torch.autograd import Variable 6 | from torchvision import datasets 7 | 8 | 9 | # load model 10 | # 保存模型 11 | # torch.save(model.state_dict(), './cnn.pth') 12 | 13 | # 定义卷积神经网络模型 14 | class Cnn(nn.Module): 15 | def __init__(self, in_dim, n_class): # 28x28x1 16 | super(Cnn, self).__init__() 17 | self.conv = nn.Sequential( 18 | nn.Conv2d(in_dim, 6, 3, stride=1, padding=1), # 28 x28 19 | nn.ReLU(True), 20 | nn.MaxPool2d(2, 2), # 14 x 14 21 | nn.Conv2d(6, 16, 5, stride=1, padding=0), # 10 * 10*16 22 | nn.ReLU(True), nn.MaxPool2d(2, 2)) # 5x5x16 23 | 24 | self.fc = nn.Sequential( 25 | nn.Linear(400, 120), # 400 = 5 * 5 * 16 26 | nn.Linear(120, 84), 27 | nn.Linear(84, n_class)) 28 | 29 | def forward(self, x): 30 | out = self.conv(x) 31 | out = out.view(out.size(0), 400) # 400 = 5 * 5 * 16, 32 | out = self.fc(out) 33 | return out 34 | 35 | 36 | # 打印模型 37 | print(Cnn) 38 | 39 | model = Cnn(1, 10) # 图片大小是28x28, 10 40 | # cnn = torch.load('./cnn.pth')['state_dict'] 41 | 
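# state_dict round trip: mycnn.py saved the parameters with
# torch.save(model.state_dict(), './cnn.pth'), so restoring needs a freshly
# constructed Cnn instance (built above) before load_state_dict fills in its
# weights; a fully pickled model object, by contrast, would be restored with
# torch.load alone (the commented-out line above hints at that variant).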
model.load_state_dict(torch.load('./cnn.pth')) 42 | 43 | # 识别 44 | print(model) 45 | test_data = datasets.MNIST(root='./data', train=False, download=True) 46 | test_x = Variable(torch.unsqueeze(test_data.test_data, dim=1), volatile=True).type(torch.FloatTensor)[ 47 | :20] / 255.0 # shape from (2000, 28, 28) to (2000, 1, 28, 28), value in range(0,1) 48 | test_y = test_data.test_labels[:20] 49 | print(test_x.size()) 50 | test_output = model(test_x[:10]) 51 | pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze() 52 | print(pred_y, 'predict result') 53 | print(test_y[:10].numpy(), 'real result') 54 | -------------------------------------------------------------------------------- /chapter6_嵌入与表示学习/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @Time : 18-5-30 上午10:30 4 | # @File : __init__.py.py -------------------------------------------------------------------------------- /chapter6_嵌入与表示学习/autoencoder.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pdb 3 | import torch 4 | import torchvision 5 | from torch import nn 6 | from torch.autograd import Variable 7 | from torch.utils.data import DataLoader 8 | from torchvision import transforms 9 | from torchvision.datasets import MNIST 10 | from torchvision.utils import save_image 11 | from torchvision import datasets 12 | import matplotlib.pyplot as plt 13 | 14 | # 配置参数 15 | torch.manual_seed(1) #设置随机数种子,确保结果可重复 16 | batch_size = 128 #批处理大小 17 | learning_rate = 1e-2 #学习率 18 | num_epochs = 10 #训练次数 19 | 20 | #下载训练集 MNIST 手写数字训练集 21 | train_dataset = datasets.MNIST( 22 | root='./data', #数据保持的位置 23 | train=True, # 训练集 24 | transform=transforms.ToTensor(),# 一个取值范围是[0,255]的PIL.Image 25 | # 转化为取值范围是[0,1.0]的torch.FloadTensor 26 | download=True) #下载数据 27 | 28 | test_dataset = datasets.MNIST( 29 | root='./data', 30 | train=False, # 测试集 31 | transform=transforms.ToTensor()) 32 | 33 | #pdb.set_trace() 34 | #数据的批处理,尺寸大小为batch_size, 35 | #在训练集中,shuffle 必须设置为True, 表示次序是随机的 36 | train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) 37 | test_loader = DataLoader(test_dataset, batch_size=10000, shuffle=False) 38 | 39 | class autoencoder(nn.Module): 40 | def __init__(self): 41 | super(autoencoder, self).__init__() 42 | self.encoder = nn.Sequential( 43 | nn.Linear(28 * 28, 1000), 44 | nn.ReLU(True), 45 | nn.Linear(1000, 500), 46 | nn.ReLU(True), 47 | nn.Linear(500, 250), 48 | nn.ReLU(True), 49 | nn.Linear(250, 2) 50 | ) 51 | self.decoder = nn.Sequential( 52 | nn.Linear(2, 250), 53 | nn.ReLU(True), 54 | nn.Linear(250, 500), 55 | nn.ReLU(True), 56 | nn.Linear(500, 1000), 57 | nn.ReLU(True), 58 | nn.Linear(1000, 28 * 28), 59 | nn.Tanh()) 60 | 61 | def forward(self, x): 62 | x = self.encoder(x) 63 | x = self.decoder(x) 64 | return x 65 | 66 | #model = autoencoder().cuda() 67 | model = autoencoder() 68 | criterion = nn.MSELoss() 69 | optimizer = torch.optim.Adam( 70 | model.parameters(), lr=learning_rate, weight_decay=1e-5) 71 | 72 | for epoch in range(num_epochs): 73 | for data in train_loader: 74 | img, _ = data 75 | img = img.view(img.size(0), -1) 76 | #img = Variable(img).cuda() 77 | img = Variable(img) 78 | # ===================forward===================== 79 | output = model(img) 80 | loss = criterion(output, img) 81 | # ===================backward==================== 82 | optimizer.zero_grad() 83 | loss.backward() 84 | optimizer.step() 85 | # 
===================log========================
86 |     print('epoch [{}/{}], loss:{:.4f}'
87 |           .format(epoch + 1, num_epochs, loss.data.item()))
88 | 
89 | 
90 | #模型测试, 由于训练和测试 BatchNorm, Dropout配置不同,需要说明是否模型测试
91 | model.eval()
92 | eval_loss = 0
93 | import pdb
94 | #pdb.set_trace()
95 | for data in test_loader:  #test set 批处理
96 |     img, label = data
97 | 
98 |     img = img.view(img.size(0), -1)
99 |     #img = Variable(img, volatile=True).cuda() # volatile 确定你是否不调用.backward(), 测试中不需要
100 |     img = Variable(img, volatile=True)
101 |     label = Variable(label, volatile=True)
102 |     out = model.encoder(img)  # 编码器前向计算,得到二维编码用于可视化
103 |     out = out.detach().numpy()
104 |     y = (label.data).numpy()
105 |     plt.scatter(out[:, 0], out[:, 1], c = y)
106 |     plt.colorbar()
107 |     plt.title('autoencoder of MNIST test dataset')
108 |     plt.show()
109 | 
110 | 
111 | 
--------------------------------------------------------------------------------
/chapter6_嵌入与表示学习/denoise_autoencoder.py:
--------------------------------------------------------------------------------
1 | # Simple Convolutional Autoencoder
2 | import torch
3 | import torch.nn as nn
4 | import torch.utils as utils
5 | from torch.autograd import Variable
6 | import torchvision.datasets as dset
7 | import torchvision.transforms as transforms
8 | import numpy as np
9 | import matplotlib.pyplot as plt
10 | # 配置参数
11 | torch.manual_seed(1)  #设置随机数种子,确保结果可重复
12 | n_epoch = 200  #训练次数
13 | batch_size = 100  #批处理大小
14 | learning_rate = 0.0002  #学习率
15 | 
16 | #下载训练集 MNIST 手写数字训练集
17 | mnist_train = dset.MNIST("./", train=True, transform=transforms.ToTensor(), target_transform=None, download=True)
18 | train_loader = torch.utils.data.DataLoader(dataset=mnist_train,batch_size=batch_size,shuffle=True)
19 | 
20 | # Encoder 模型设置
21 | class Encoder(nn.Module):
22 |     def __init__(self):
23 |         super(Encoder,self).__init__()
24 |         self.layer1 = nn.Sequential(
25 |             nn.Conv2d(1,32,3,padding=1),    # batch x 32 x 28 x 28
26 |             nn.ReLU(),
27 |             nn.BatchNorm2d(32),
28 |             nn.Conv2d(32,32,3,padding=1),   # batch x 32 x 28 x 28
29 |             nn.ReLU(),
30 |             nn.BatchNorm2d(32),
31 |             nn.Conv2d(32,64,3,padding=1),   # batch x 64 x 28 x 28
32 |             nn.ReLU(),
33 |             nn.BatchNorm2d(64),
34 |             nn.Conv2d(64,64,3,padding=1),   # batch x 64 x 28 x 28
35 |             nn.ReLU(),
36 |             nn.BatchNorm2d(64),
37 |             nn.MaxPool2d(2,2)               # batch x 64 x 14 x 14
38 |         )
39 |         self.layer2 = nn.Sequential(
40 |             nn.Conv2d(64,128,3,padding=1),  # batch x 128 x 14 x 14
41 |             nn.ReLU(),
42 |             nn.BatchNorm2d(128),
43 |             nn.Conv2d(128,128,3,padding=1), # batch x 128 x 14 x 14
44 |             nn.ReLU(),
45 |             nn.BatchNorm2d(128),
46 |             nn.MaxPool2d(2,2),
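            # size bookkeeping (sketch): the two MaxPool2d(2,2) stages take the
            # 28x28 input to 14x14 and then 7x7, so the conv below yields
            # batch x 256 x 7 x 7, flattened to 256 * 7 * 7 = 12544 code
            # features in forward()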
47 |             nn.Conv2d(128,256,3,padding=1), # batch x 256 x 7 x 7
48 |             nn.ReLU()
49 |         )
50 | 
51 |     def forward(self,x):
52 |         out = self.layer1(x)
53 |         out = self.layer2(out)
54 |         out = out.view(batch_size, -1)
55 |         return out
56 | 
57 | #encoder = Encoder().cuda()
58 | encoder = Encoder()
59 | # decoder模型设置
60 | 
61 | class Decoder(nn.Module):
62 |     def __init__(self):
63 |         super(Decoder,self).__init__()
64 |         self.layer1 = nn.Sequential(
65 |             nn.ConvTranspose2d(256,128,3,2,1,1), # batch x 128 x 14 x 14
66 |             nn.ReLU(),
67 |             nn.BatchNorm2d(128),
68 |             nn.ConvTranspose2d(128,128,3,1,1),   # batch x 128 x 14 x 14
69 |             nn.ReLU(),
70 |             nn.BatchNorm2d(128),
71 |             nn.ConvTranspose2d(128,64,3,1,1),    # batch x 64 x 14 x 14
72 |             nn.ReLU(),
73 |             nn.BatchNorm2d(64),
74 |             nn.ConvTranspose2d(64,64,3,1,1),     # batch x 64 x 14 x 14
75 |             nn.ReLU(),
76 |             nn.BatchNorm2d(64)
77 |         )
78 |         self.layer2 = nn.Sequential(
79 |             nn.ConvTranspose2d(64,32,3,1,1),     # batch x 32 x 14 x 14
80 |             nn.ReLU(),
81 |             nn.BatchNorm2d(32),
82 |             nn.ConvTranspose2d(32,32,3,1,1),     # batch x 32 x 14 x 14
83 |             nn.ReLU(),
84 |             nn.BatchNorm2d(32),
85 |             nn.ConvTranspose2d(32,1,3,2,1,1),    # batch x 1 x 28 x 28
86 |             nn.ReLU()
87 |         )
88 | 
89 |     def forward(self,x):
90 |         out = x.view(batch_size,256,7,7)
91 |         out = self.layer1(out)
92 |         out = self.layer2(out)
93 |         return out
94 | 
95 | 
96 | #decoder = Decoder().cuda()
97 | decoder = Decoder()
98 | 
99 | parameters = list(encoder.parameters()) + list(decoder.parameters())
100 | loss_func = nn.MSELoss()
101 | optimizer = torch.optim.Adam(parameters, lr=learning_rate)
102 | 
103 | # 噪声
104 | noise = torch.rand(batch_size,1,28,28)
105 | for i in range(n_epoch):
106 |     for image,label in train_loader:
107 |         image_n = torch.mul(image+0.25, 0.1 * noise)
108 |         #image = Variable(image).cuda()
109 |         image = Variable(image)
110 |         #image_n = Variable(image_n).cuda()
111 |         image_n = Variable(image_n)
112 |         optimizer.zero_grad()
113 |         output = encoder(image_n)
114 |         output = decoder(output)
115 |         loss = loss_func(output,image)
116 |         loss.backward()
117 |         optimizer.step()
118 |         break  # 演示用:每个 epoch 只用第一个批次训练
119 |     print('epoch [{}/{}], loss:{:.4f}'
120 |           .format(i + 1, n_epoch, loss.data.item()))
121 | 
122 | 
123 | 
124 | img = image[0].cpu()
125 | input_img = image_n[0].cpu()
126 | output_img = output[0].cpu()
127 | origin = img.data.numpy()
128 | inp = input_img.data.numpy()
129 | out = output_img.data.numpy()
130 | plt.figure('denoising autoencoder')
131 | plt.subplot(131)
132 | plt.imshow(origin[0],cmap='gray')
133 | plt.subplot(132)
134 | plt.imshow(inp[0],cmap='gray')
135 | plt.subplot(133)
136 | plt.imshow(out[0],cmap="gray")
137 | plt.show()
138 | print(label[0])
139 | 
--------------------------------------------------------------------------------
/chapter6_嵌入与表示学习/word_embeddings.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | 
3 | import torch
4 | import torch.autograd as autograd
5 | import torch.nn as nn
6 | import torch.nn.functional as F
7 | import torch.optim as optim
8 | 
9 | # 参数设置
10 | torch.manual_seed(1)
11 | CONTEXT_SIZE = 2
12 | EMBEDDING_DIM = 10
13 | N_EPOCHS = 100
14 | 
15 | # 语料
16 | test_sentence = """Word embeddings are dense vectors of real numbers,
17 | one per word in your vocabulary. In NLP, it is almost always the case
18 | that your features are words! But how should you represent a word in a
19 | computer? You could store its ascii character representation, but that
20 | only tells you what the word is, it doesn’t say much about what it means
21 | (you might be able to derive its part of speech from its affixes, or properties
22 | from its capitalization, but not much). Even more, in what sense could you combine
23 | these representations?""".split()
24 | 
25 | # 三元模型语料准备
26 | trigrams = [([test_sentence[i], test_sentence[i + 1]], test_sentence[i + 2])
27 |             for i in range(len(test_sentence) - 2)]
28 | 
29 | vocab = set(test_sentence)
30 | word_to_ix = {word: i for i, word in enumerate(vocab)}
31 | idx_to_word = {word_to_ix[word]: word for word in word_to_ix}
32 | 
33 | 
34 | # 语言模型
35 | class NGramLanguageModeler(nn.Module):
36 |     def __init__(self, vocab_size, embedding_dim, context_size):
37 |         super(NGramLanguageModeler, self).__init__()
38 |         self.embeddings = nn.Embedding(vocab_size, embedding_dim)
39 |         self.linear1 = nn.Linear(context_size * embedding_dim, 128)
40 |         self.linear2 = nn.Linear(128, vocab_size)
41 | 
42 |     def forward(self, inputs):
43 |         embeds = self.embeddings(inputs).view((1, -1))
44 |         out = F.relu(self.linear1(embeds))
45 |         out = self.linear2(out)
46 |         log_probs = F.log_softmax(out)
47 |         return log_probs
48 | 
49 | 
50 | # loss 函数和优化器
51 | loss_function = nn.NLLLoss()
52 | model = NGramLanguageModeler(len(vocab), EMBEDDING_DIM, CONTEXT_SIZE)
53 | optimizer = optim.SGD(model.parameters(), lr=0.001)
54 | 
55 | # 训练语言模型
56 | for epoch in range(N_EPOCHS):
57 |     total_loss = torch.Tensor([0])
58 |     for context, target in trigrams:
59 |         # Step 1. 准备数据
60 |         context_idxs = [word_to_ix[w] for w in context]
61 |         context_var = autograd.Variable(torch.LongTensor(context_idxs))
62 | 
63 |         # Step 2 梯度初始化
64 |         model.zero_grad()
65 | 
66 |         # Step 3. 前向算法
67 |         log_probs = model(context_var)
68 | 
69 |         # Step 4. 计算loss
70 |         loss = loss_function(log_probs, autograd.Variable(
71 |             torch.LongTensor([word_to_ix[target]])))
72 | 
73 |         # Step 5. 后向算法和更新梯度
74 |         loss.backward()
75 |         optimizer.step()
76 | 
77 |         # step 6. loss
78 |         total_loss += loss.data
79 |     # 打印 loss
80 |     print('\r epoch[{}] - loss: {:.6f}'.format(epoch, total_loss[0]))
81 | 
82 | word, label = trigrams[3]
83 | word = autograd.Variable(torch.LongTensor([word_to_ix[i] for i in word]))
84 | out = model(word)
85 | _, predict_label = torch.max(out, 1)
86 | predict_word = idx_to_word[predict_label.data[0]]
87 | print('real word is {}, predict word is {}'.format(label, predict_word))
88 | 
89 | #
90 | # epoch[91] - loss: 243.199814
91 | # epoch[92] - loss: 241.579529
92 | # epoch[93] - loss: 239.956345
93 | # epoch[94] - loss: 238.329926
94 | # epoch[95] - loss: 236.701630
95 | # epoch[96] - loss: 235.069275
96 | # epoch[97] - loss: 233.434341
97 | # epoch[98] - loss: 231.797974
98 | # epoch[99] - loss: 230.158493
99 | #
100 | 
--------------------------------------------------------------------------------
/chapter7_序列预测模型/__init__.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # @Time : 18-6-6 上午11:18
4 | # @Author : J.W.
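# Reading a learned embedding out of the chapter-6 n-gram model above (a
# sketch, assuming `model` and `word_to_ix` from word_embeddings.py are in
# scope after training):
#   vec = model.embeddings.weight.data[word_to_ix['word']]
#   print(vec.size())  # torch.Size([10]), i.e. EMBEDDING_DIM entries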
5 | # @File : __init__.py.py -------------------------------------------------------------------------------- /chapter7_序列预测模型/data/eng_cmn_attn_decoder1.model: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter7_序列预测模型/data/eng_cmn_attn_decoder1.model -------------------------------------------------------------------------------- /chapter7_序列预测模型/data/eng_cmn_attn_decoder1.stat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter7_序列预测模型/data/eng_cmn_attn_decoder1.stat -------------------------------------------------------------------------------- /chapter7_序列预测模型/data/eng_cmn_encoder1.model: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter7_序列预测模型/data/eng_cmn_encoder1.model -------------------------------------------------------------------------------- /chapter7_序列预测模型/data/eng_cmn_encoder1.stat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter7_序列预测模型/data/eng_cmn_encoder1.stat -------------------------------------------------------------------------------- /chapter7_序列预测模型/data/eng_cmn_input_lang.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter7_序列预测模型/data/eng_cmn_input_lang.pkl -------------------------------------------------------------------------------- /chapter7_序列预测模型/data/eng_cmn_output_lang.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter7_序列预测模型/data/eng_cmn_output_lang.pkl -------------------------------------------------------------------------------- /chapter7_序列预测模型/data/eng_cmn_pairs.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/chapter7_序列预测模型/data/eng_cmn_pairs.pkl -------------------------------------------------------------------------------- /chapter7_序列预测模型/evaluate_cmn_eng.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @File : evaluateSentence.py 4 | # @Time : 18-3-15 5 | # @Author : J.W. 
6 | 7 | import pickle 8 | 9 | import matplotlib.pyplot as plt 10 | import torch 11 | from logger import logger 12 | from train import evaluate 13 | from train import evaluateAndShowAtten 14 | from train import evaluateRandomly 15 | 16 | input = 'eng' 17 | output = 'cmn' 18 | logger.info('%s -> %s' % (input, output)) 19 | # 加载处理好的语言信息 20 | input_lang = pickle.load(open('./data/%s_%s_input_lang.pkl' % (input, output), "rb")) 21 | output_lang = pickle.load(open('./data/%s_%s_output_lang.pkl' % (input, output), "rb")) 22 | pairs = pickle.load(open('./data/%s_%s_pairs.pkl' % (input, output), 'rb')) 23 | logger.info('lang loaded.') 24 | 25 | # 加载训练好的编码器和解码器 26 | encoder1 = torch.load(open('./data/%s_%s_encoder1.model' % (input, output), 'rb')) 27 | attn_decoder1 = torch.load(open('./data/%s_%s_attn_decoder1.model' % (input, output), 'rb')) 28 | logger.info('model loaded.') 29 | 30 | # 对单句进行评估并绘制注意力图像 31 | def evaluateAndShowAttention(sentence): 32 | evaluateAndShowAtten(input_lang, output_lang, sentence, encoder1, attn_decoder1) 33 | 34 | evaluateAndShowAttention("他们肯定会相恋的。") 35 | evaluateAndShowAttention("我现在正在学习。") 36 | 37 | # 语料中的数据随机选择评估 38 | evaluateRandomly(input_lang, output_lang, pairs, encoder1, attn_decoder1) 39 | 40 | output_words, attentions = evaluate(input_lang, output_lang, 41 | encoder1, attn_decoder1, "我是中国人。") 42 | plt.matshow(attentions.numpy()) 43 | 44 | 45 | 46 | -------------------------------------------------------------------------------- /chapter7_序列预测模型/logger.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @File : logger.py 4 | # @Time : 18-3-14 5 | # @Author : J.W. 6 | 7 | import logging as logger 8 | 9 | logger.basicConfig(level=logger.DEBUG, 10 | format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s', 11 | datefmt='%Y-%m-%d %H:%M:%S -', 12 | filename='log.log', 13 | filemode='a') # or 'w', default 'a' 14 | 15 | console = logger.StreamHandler() 16 | console.setLevel(logger.INFO) 17 | formatter = logger.Formatter('%(asctime)s %(name)-6s: %(levelname)-6s %(message)s') 18 | console.setFormatter(formatter) 19 | logger.getLogger('').addHandler(console) 20 | 21 | # 22 | # logger.info("info test.") 23 | # logger.debug("debug test.") 24 | # logger.warning('waring test.') 25 | # 26 | # # 指定logger名称 27 | # logger1 = logger.getLogger("logger1") 28 | # logger1.info('logger1 info test.') 29 | -------------------------------------------------------------------------------- /chapter7_序列预测模型/model.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @File : model.py 4 | # @Time : 18-3-14 5 | # @Author : J.W. 
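# Note on the torch.load calls in evaluate_cmn_eng.py above: seq2seq.py saves
# the whole encoder/decoder objects with torch.save(obj, ...), and unpickling
# such objects only works when the defining classes (EncoderRNN and
# AttnDecoderRNN below) are importable at load time, which is why this module
# must ship alongside the saved .model files. The companion .stat files hold
# plain state_dicts and would instead be restored with load_state_dict on
# freshly constructed networks.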
6 | 7 | import torch 8 | from logger import logger 9 | from torch import nn 10 | from torch.autograd import Variable 11 | from torch.nn import functional as F 12 | 13 | # from process import cut 14 | from process import MAX_LENGTH 15 | 16 | use_cuda = torch.cuda.is_available() 17 | 18 | 19 | class EncoderRNN(nn.Module): 20 | ''' 21 | 编码器的定义 22 | ''' 23 | 24 | def __init__(self, input_size, hidden_size, n_layers=1): 25 | ''' 26 | 初始化过程 27 | :param input_size: 输入向量长度,这里是词汇表大小 28 | :param hidden_size: 隐藏层大小 29 | :param n_layers: 叠加层数 30 | ''' 31 | super(EncoderRNN, self).__init__() 32 | self.n_layers = n_layers 33 | self.hidden_size = hidden_size 34 | 35 | self.embedding = nn.Embedding(input_size, hidden_size) 36 | self.gru = nn.GRU(hidden_size, hidden_size) 37 | 38 | def forward(self, input, hidden): 39 | ''' 40 | 前向计算过程 41 | :param input: 输入 42 | :param hidden: 隐藏层状态 43 | :return: 编码器输出,隐藏层状态 44 | ''' 45 | try: 46 | embedded = self.embedding(input).view(1, 1, -1) 47 | output = embedded 48 | for i in range(self.n_layers): 49 | output, hidden = self.gru(output, hidden) 50 | return output, hidden 51 | except Exception as err: 52 | logger.error(err) 53 | 54 | def initHidden(self): 55 | ''' 56 | 隐藏层状态初始化 57 | :return: 初始化过的隐藏层状态 58 | ''' 59 | result = Variable(torch.zeros(1, 1, self.hidden_size)) 60 | if use_cuda: 61 | return result.cuda() 62 | else: 63 | return result 64 | 65 | 66 | class DecoderRNN(nn.Module): 67 | ''' 68 | 解码器定义 69 | ''' 70 | 71 | def __init__(self, hidden_size, output_size, n_layers=1): 72 | ''' 73 | 初始化过程 74 | :param hidden_size: 隐藏层大小 75 | :param output_size: 输出大小 76 | :param n_layers: 叠加层数 77 | ''' 78 | super(DecoderRNN, self).__init__() 79 | self.n_layers = n_layers 80 | self.hidden_size = hidden_size 81 | 82 | self.embedding = nn.Embedding(output_size, hidden_size) 83 | self.gru = nn.GRU(hidden_size, hidden_size) 84 | self.out = nn.Linear(hidden_size, output_size) 85 | self.softmax = nn.LogSoftmax() 86 | 87 | def forward(self, input, hidden): 88 | ''' 89 | 前向计算过程 90 | :param input: 输入信息 91 | :param hidden: 隐藏层状态 92 | :return: 解码器输出,隐藏层状态 93 | ''' 94 | try: 95 | output = self.embedding(input).view(1, 1, -1) 96 | for i in range(self.n_layers): 97 | output = F.relu(output) 98 | output, hidden = self.gru(output, hidden) 99 | output = self.softmax(self.out(output[0])) 100 | return output, hidden 101 | except Exception as err: 102 | logger.error(err) 103 | 104 | def initHidden(self): 105 | ''' 106 | 隐藏层状态初始化 107 | :return: 初始化过的隐藏层状态 108 | ''' 109 | result = Variable(torch.zeros(1, 1, self.hidden_size)) 110 | if use_cuda: 111 | return result.cuda() 112 | else: 113 | return result 114 | 115 | 116 | class AttnDecoderRNN(nn.Module): 117 | ''' 118 | 带注意力的解码器的定义 119 | ''' 120 | 121 | def __init__(self, hidden_size, output_size, n_layers=1, dropout_p=0.1, max_length=MAX_LENGTH): 122 | ''' 123 | 带注意力的解码器初始化过程 124 | :param hidden_size: 隐藏层大小 125 | :param output_size: 输出大小 126 | :param n_layers: 叠加层数 127 | :param dropout_p: dropout率定义 128 | :param max_length: 接受的最大句子长度 129 | ''' 130 | super(AttnDecoderRNN, self).__init__() 131 | self.hidden_size = hidden_size 132 | self.output_size = output_size 133 | self.n_layers = n_layers 134 | self.dropout_p = dropout_p 135 | self.max_length = max_length 136 | 137 | self.embedding = nn.Embedding(self.output_size, self.hidden_size) 138 | self.attn = nn.Linear(self.hidden_size * 2, self.max_length) 139 | self.attn_combine = nn.Linear(self.hidden_size * 2, self.hidden_size) 140 | self.dropout = nn.Dropout(self.dropout_p) 141 | self.gru = 
nn.GRU(self.hidden_size, self.hidden_size) 142 | self.out = nn.Linear(self.hidden_size, self.output_size) 143 | 144 | def forward(self, input, hidden, encoder_output, encoder_outputs): 145 | ''' 146 | 前向计算过程 147 | :param input: 输入信息 148 | :param hidden: 隐藏层状态 149 | :param encoder_output: 编码器分时刻的输出 150 | :param encoder_outputs: 编码器全部输出 151 | :return: 解码器输出,隐藏层状态,注意力权重 152 | ''' 153 | try: 154 | embedded = self.embedding(input).view(1, 1, -1) 155 | embedded = self.dropout(embedded) 156 | 157 | attn_weights = F.softmax( 158 | self.attn(torch.cat((embedded[0], hidden[0]), 1))) 159 | attn_applied = torch.bmm(attn_weights.unsqueeze(0), 160 | encoder_outputs.unsqueeze(0)) 161 | 162 | output = torch.cat((embedded[0], attn_applied[0]), 1) 163 | output = self.attn_combine(output).unsqueeze(0) 164 | 165 | for i in range(self.n_layers): 166 | output = F.relu(output) 167 | output, hidden = self.gru(output, hidden) 168 | 169 | output = F.log_softmax(self.out(output[0])) 170 | return output, hidden, attn_weights 171 | except Exception as err: 172 | logger.error(err) 173 | 174 | def initHidden(self): 175 | ''' 176 | 隐藏层状态初始化 177 | :return: 初始化过的隐藏层状态 178 | ''' 179 | result = Variable(torch.zeros(1, 1, self.hidden_size)) 180 | if use_cuda: 181 | return result.cuda() 182 | else: 183 | return result 184 | -------------------------------------------------------------------------------- /chapter7_序列预测模型/process.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @File : process.py 4 | # @Time : 18-3-14 5 | # @Author : J.W. 6 | 7 | from __future__ import unicode_literals, print_function, division 8 | 9 | import math 10 | import re 11 | import time 12 | import unicodedata 13 | 14 | import jieba 15 | import torch 16 | from logger import logger 17 | from torch.autograd import Variable 18 | 19 | use_cuda = torch.cuda.is_available() 20 | 21 | SOS_token = 0 22 | EOS_token = 1 23 | # 中文的时候要设置大一些 24 | MAX_LENGTH = 25 25 | 26 | 27 | def unicodeToAscii(s): 28 | ''' 29 | Unicode转换成ASCII,http://stackoverflow.com/a/518232/2809427 30 | :param s: 31 | :return: 32 | ''' 33 | return ''.join( 34 | c for c in unicodedata.normalize('NFD', s) 35 | if unicodedata.category(c) != 'Mn' 36 | ) 37 | 38 | 39 | def normalizeString(s): 40 | ''' 41 | 转小写,去除非法字符 42 | :param s: 43 | :return: 44 | ''' 45 | s = unicodeToAscii(s.lower().strip()) 46 | s = re.sub(r"([.!?])", r" \1", s) 47 | # 中文不能进行下面的处理 48 | # s = re.sub(r"[^a-zA-Z.!?]+", r" ", s) 49 | return s 50 | 51 | 52 | class Lang: 53 | def __init__(self, name): 54 | ''' 55 | 添加 need_cut 可根据语种进行不同的分词逻辑处理 56 | :param name: 语种名称 57 | ''' 58 | self.name = name 59 | self.need_cut = self.name == 'cmn' 60 | self.word2index = {} 61 | self.word2count = {} 62 | self.index2word = {0: "SOS", 1: "EOS"} 63 | self.n_words = 2 # 初始化词数为2:SOS & EOS 64 | 65 | def addSentence(self, sentence): 66 | ''' 67 | 从语料中添加句子到 Lang 68 | :param sentence: 语料中的每个句子 69 | ''' 70 | if self.need_cut: 71 | sentence = cut(sentence) 72 | for word in sentence.split(' '): 73 | if len(word) > 0: 74 | self.addWord(word) 75 | 76 | def addWord(self, word): 77 | ''' 78 | 向 Lang 中添加每个词,并统计词频,如果是新词修改词表大小 79 | :param word: 80 | ''' 81 | if word not in self.word2index: 82 | self.word2index[word] = self.n_words 83 | self.word2count[word] = 1 84 | self.index2word[self.n_words] = word 85 | self.n_words += 1 86 | else: 87 | self.word2count[word] += 1 88 | 89 | 90 | def cut(sentence, use_jieba=False): 91 | ''' 92 | 对句子分词。 93 | :param sentence: 要分词的句子 94 | 
:param use_jieba: 是否使用 jieba 进行智能分词,默认按单字切分 95 | :return: 分词结果,空格区分 96 | ''' 97 | if use_jieba: 98 | return ' '.join(jieba.cut(sentence)) 99 | else: 100 | words = [word for word in sentence] 101 | return ' '.join(words) 102 | 103 | 104 | import jieba.posseg as pseg 105 | 106 | 107 | def tag(sentence): 108 | words = pseg.cut(sentence) 109 | result = '' 110 | for w in words: 111 | result = result + w.word + "/" + w.flag +" " 112 | return result 113 | 114 | 115 | def readLangs(lang1, lang2, reverse=False): 116 | ''' 117 | 118 | :param lang1: 源语言 119 | :param lang2: 目标语言 120 | :param reverse: 是否逆向翻译 121 | :return: 源语言实例,目标语言实例,词语对 122 | ''' 123 | logger.info("Reading lines...") 124 | 125 | # 读取txt文件并分割成行 126 | lines = open('data/%s-%s.txt' % (lang1, lang2), encoding='utf-8'). \ 127 | read().strip().split('\n') 128 | 129 | # 按行处理成 源语言-目标语言对,并做预处理 130 | pairs = [[normalizeString(s) for s in l.split('\t')] for l in lines] 131 | 132 | # Reverse pairs, make Lang instances 133 | if reverse: 134 | pairs = [list(reversed(p)) for p in pairs] 135 | input_lang = Lang(lang2) 136 | output_lang = Lang(lang1) 137 | else: 138 | input_lang = Lang(lang1) 139 | output_lang = Lang(lang2) 140 | 141 | return input_lang, output_lang, pairs 142 | 143 | 144 | eng_prefixes = ( 145 | "i am ", "i m ", 146 | "he is", "he s ", 147 | "she is", "she s", 148 | "you are", "you re ", 149 | "we are", "we re ", 150 | "they are", "they re " 151 | ) 152 | 153 | 154 | def filterPair(p): 155 | ''' 156 | 按自定义最大长度过滤 157 | ''' 158 | return len(p[0].split(' ')) < MAX_LENGTH and \ 159 | len(p[1].split(' ')) < MAX_LENGTH and \ 160 | p[1].startswith(eng_prefixes) 161 | 162 | 163 | def filterPairs(pairs): 164 | return [pair for pair in pairs if filterPair(pair)] 165 | 166 | 167 | def prepareData(lang1, lang2, reverse=False): 168 | input_lang, output_lang, pairs = readLangs(lang1, lang2, reverse) 169 | logger.info("Read %s sentence pairs" % len(pairs)) 170 | pairs = filterPairs(pairs) 171 | logger.info("Trimmed to %s sentence pairs" % len(pairs)) 172 | logger.info("Counting words...") 173 | for pair in pairs: 174 | input_lang.addSentence(pair[0]) 175 | output_lang.addSentence(pair[1]) 176 | logger.info("Counted words:") 177 | logger.info('%s, %d' % (input_lang.name, input_lang.n_words)) 178 | logger.info('%s, %d' % (output_lang.name, output_lang.n_words)) 179 | return input_lang, output_lang, pairs 180 | 181 | 182 | def indexesFromSentence(lang, sentence): 183 | ''' 184 | :param lang: 185 | :param sentence: 186 | :return: 187 | ''' 188 | return [lang.word2index[word] for word in sentence.split(' ') if len(word) > 0] 189 | 190 | 191 | def variableFromSentence(lang, sentence): 192 | if lang.need_cut: 193 | sentence = cut(sentence) 194 | # logger.info("cuted sentence: %s" % sentence) 195 | indexes = indexesFromSentence(lang, sentence) 196 | indexes.append(EOS_token) 197 | result = Variable(torch.LongTensor(indexes).view(-1, 1)) 198 | if use_cuda: 199 | return result.cuda() 200 | else: 201 | return result 202 | 203 | 204 | def variablesFromPair(input_lang, output_lang, pair): 205 | input_variable = variableFromSentence(input_lang, pair[0]) 206 | target_variable = variableFromSentence(output_lang, pair[1]) 207 | return (input_variable, target_variable) 208 | 209 | 210 | def asMinutes(s): 211 | m = math.floor(s / 60) 212 | s -= m * 60 213 | return '%dm %ds' % (m, s) 214 | 215 | 216 | def timeSince(since, percent): 217 | now = time.time() 218 | s = now - since 219 | es = s / (percent) 220 | rs = es - s 221 | return '%s (- %s)' % (asMinutes(s), 
asMinutes(rs)) 222 | 223 | 224 | if __name__ == "__main__": 225 | s = 'Fans of Belgium cheer prior to the 2018 FIFA World Cup Group G match between Belgium and Tunisia in Moscow, Russia, June 23, 2018.' 226 | s = '结婚的和尚未结婚的和尚' 227 | s = "买张下周三去南海的飞机票,海航的" 228 | s = "过几天天天天气不好。" 229 | 230 | a = cut(s, use_jieba=True) 231 | print(a) 232 | print(tag(s)) 233 | -------------------------------------------------------------------------------- /chapter7_序列预测模型/seq2seq.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | import pickle 3 | import sys 4 | from io import open 5 | 6 | import torch 7 | from logger import logger 8 | from model import AttnDecoderRNN 9 | from model import EncoderRNN 10 | from process import prepareData 11 | from train import * 12 | 13 | use_cuda = torch.cuda.is_available() 14 | logger.info("Use cuda:{}".format(use_cuda)) 15 | input = 'eng' 16 | output = 'cmn' 17 | # 从参数接收要翻译的语种名词 18 | if len(sys.argv) > 1: 19 | output = sys.argv[1] 20 | logger.info('%s -> %s' % (input, output)) 21 | 22 | # 处理语料库 23 | input_lang, output_lang, pairs = prepareData(input, output, True) 24 | logger.info(random.choice(pairs)) 25 | 26 | # 查看两种语言的词汇大小情况 27 | logger.info('input_lang.n_words: %d' % input_lang.n_words) 28 | logger.info('output_lang.n_words: %d' % output_lang.n_words) 29 | 30 | # 保存处理过的语言信息,评估时加载使用 31 | pickle.dump(input_lang, open('./data/%s_%s_input_lang.pkl' % (input, output), "wb")) 32 | pickle.dump(output_lang, open('./data/%s_%s_output_lang.pkl' % (input, output), "wb")) 33 | pickle.dump(pairs, open('./data/%s_%s_pairs.pkl' % (input, output), "wb")) 34 | logger.info('lang saved.') 35 | 36 | # 编码器和解码器的实例化 37 | hidden_size = 256 38 | encoder1 = EncoderRNN(input_lang.n_words, hidden_size) 39 | attn_decoder1 = AttnDecoderRNN(hidden_size, output_lang.n_words, 40 | 1, dropout_p=0.1) 41 | if use_cuda: 42 | encoder1 = encoder1.cuda() 43 | attn_decoder1 = attn_decoder1.cuda() 44 | 45 | logger.info('train start. ') 46 | # 训练过程,指定迭代次数,此处为迭代75000次,每5000次打印中间信息 47 | trainIters(input_lang, output_lang, pairs, encoder1, attn_decoder1, 75000, print_every=5000) 48 | logger.info('train end. ') 49 | 50 | # 保存编码器和解码器网络状态 51 | torch.save(encoder1.state_dict(), open('./data/%s_%s_encoder1.stat' % (input, output), 'wb')) 52 | torch.save(attn_decoder1.state_dict(), open('./data/%s_%s_attn_decoder1.stat' % (input, output), 'wb')) 53 | logger.info('stat saved.') 54 | 55 | # 保存整个网络 56 | torch.save(encoder1, open('./data/%s_%s_encoder1.model' % (input, output), 'wb')) 57 | torch.save(attn_decoder1, open('./data/%s_%s_attn_decoder1.model' % (input, output), 'wb')) 58 | logger.info('model saved.') 59 | -------------------------------------------------------------------------------- /chapter7_序列预测模型/train.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @File : train.py 4 | # @Time : 18-3-14 5 | # @Author : J.W. 
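#
# [编辑者补充示例, 非原书代码] 下面的小函数与本文件其余部分互不依赖, 用于演示
# model.py 中 AttnDecoderRNN.forward 的关键一步: 注意力权重与编码器全部输出
# 做批量矩阵乘 (torch.bmm) 时的张量形状变化。示例假定较新版本的 PyTorch
# (F.softmax 支持 dim 参数); 在书中使用的 0.3.0 环境下需去掉 dim 参数。
def _attn_shape_demo(hidden_size=256, max_length=25):
    import torch
    import torch.nn.functional as F
    # 随机注意力能量经 softmax 归一化为权重, 形状 (1, max_length)
    attn_weights = F.softmax(torch.randn(1, max_length), dim=1)
    # 编码器全部时刻的输出, 形状 (max_length, hidden_size)
    encoder_outputs = torch.randn(max_length, hidden_size)
    # bmm: (1, 1, max_length) x (1, max_length, hidden_size) -> (1, 1, hidden_size)
    attn_applied = torch.bmm(attn_weights.unsqueeze(0), encoder_outputs.unsqueeze(0))
    return attn_applied.size()  # torch.Size([1, 1, hidden_size])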
6 |
7 | import random
8 | import time
9 |
10 | import matplotlib.pyplot as plt
11 | import matplotlib.ticker as ticker
12 | import torch
13 | from logger import logger
14 | from process import *
15 | from torch import nn
16 | from torch import optim
17 | from torch.autograd import Variable
18 |
19 | use_cuda = torch.cuda.is_available()
20 |
21 |
22 | def evaluate(input_lang, output_lang, encoder, decoder, sentence, max_length=MAX_LENGTH):
23 | '''
24 | 单句评估
25 | :param input_lang: 源语言信息
26 | :param output_lang: 目标语言信息
27 | :param encoder: 编码器
28 | :param decoder: 解码器
29 | :param sentence: 要评估的句子
30 | :param max_length: 可接受最大长度
31 | :return: 翻译过的句子和注意力信息
32 | '''
33 | # 输入句子预处理
34 | input_variable = variableFromSentence(input_lang, sentence)
35 | input_length = input_variable.size()[0]
36 | encoder_hidden = encoder.initHidden()
37 |
38 | encoder_outputs = Variable(torch.zeros(max_length, encoder.hidden_size))
39 | encoder_outputs = encoder_outputs.cuda() if use_cuda else encoder_outputs
40 |
41 | for ei in range(input_length):
42 | encoder_output, encoder_hidden = encoder(input_variable[ei],
43 | encoder_hidden)
44 | encoder_outputs[ei] = encoder_outputs[ei] + encoder_output[0][0]
45 |
46 | decoder_input = Variable(torch.LongTensor([[SOS_token]])) # 起始标志 SOS
47 | decoder_input = decoder_input.cuda() if use_cuda else decoder_input
48 |
49 | decoder_hidden = encoder_hidden
50 |
51 | decoded_words = []
52 | decoder_attentions = torch.zeros(max_length, max_length)
53 | # 翻译过程
54 | for di in range(max_length):
55 | decoder_output, decoder_hidden, decoder_attention = decoder(
56 | decoder_input, decoder_hidden, encoder_output, encoder_outputs)
57 | decoder_attentions[di] = decoder_attention.data
58 | topv, topi = decoder_output.data.topk(1)
59 | ni = topi[0][0]
60 | # 当前时刻输出为句子结束标志,则结束
61 | if ni == EOS_token:
62 | decoded_words.append('<EOS>')
63 | break
64 | else:
65 | decoded_words.append(output_lang.index2word[ni])
66 |
67 | decoder_input = Variable(torch.LongTensor([[ni]]))
68 | decoder_input = decoder_input.cuda() if use_cuda else decoder_input
69 |
70 | return decoded_words, decoder_attentions[:di + 1]
71 |
72 |
73 | teacher_forcing_ratio = 0.5
74 |
75 |
76 | def train(input_variable, target_variable, encoder, decoder, encoder_optimizer, decoder_optimizer, criterion,
77 | max_length=MAX_LENGTH):
78 | '''
79 | 单次训练过程。
80 | :param input_variable: 源语言信息
81 | :param target_variable: 目标语言信息
82 | :param encoder: 编码器
83 | :param decoder: 解码器
84 | :param encoder_optimizer: 编码器的优化器
85 | :param decoder_optimizer: 解码器的优化器
86 | :param criterion: 评价准则,即损失函数的定义
87 | :param max_length: 接受的单句最大长度
88 | :return: 本次训练的平均损失
89 | '''
90 | encoder_hidden = encoder.initHidden()
91 |
92 | # 清除优化器状态
93 | encoder_optimizer.zero_grad()
94 | decoder_optimizer.zero_grad()
95 |
96 | input_length = input_variable.size()[0]
97 | target_length = target_variable.size()[0]
98 | # print(input_length, " -> ", target_length)
99 |
100 | encoder_outputs = Variable(torch.zeros(max_length, encoder.hidden_size))
101 | encoder_outputs = encoder_outputs.cuda() if use_cuda else encoder_outputs
102 | # print("encoder_outputs shape ", encoder_outputs.shape)
103 | loss = 0
104 |
105 | # 编码过程
106 | for ei in range(input_length):
107 | encoder_output, encoder_hidden = encoder(
108 | input_variable[ei], encoder_hidden)
109 | encoder_outputs[ei] = encoder_output[0][0]
110 |
111 | decoder_input = Variable(torch.LongTensor([[SOS_token]]))
112 | decoder_input = decoder_input.cuda() if use_cuda else decoder_input
113 |
114 | decoder_hidden =
encoder_hidden 115 | 116 | use_teacher_forcing = True if random.random() < teacher_forcing_ratio else False 117 | 118 | if use_teacher_forcing: 119 | # Teacher forcing: 以目标作为下一个输入 120 | for di in range(target_length): 121 | decoder_output, decoder_hidden, decoder_attention = decoder( 122 | decoder_input, decoder_hidden, encoder_output, encoder_outputs) 123 | loss += criterion(decoder_output, target_variable[di]) 124 | decoder_input = target_variable[di] # Teacher forcing 125 | 126 | else: 127 | # Without teacher forcing: 网络自己预测的输出为下一个输入 128 | for di in range(target_length): 129 | decoder_output, decoder_hidden, decoder_attention = decoder( 130 | decoder_input, decoder_hidden, encoder_output, encoder_outputs) 131 | topv, topi = decoder_output.data.topk(1) 132 | ni = topi[0][0] 133 | 134 | decoder_input = Variable(torch.LongTensor([[ni]])) 135 | decoder_input = decoder_input.cuda() if use_cuda else decoder_input 136 | 137 | loss += criterion(decoder_output, target_variable[di]) 138 | if ni == EOS_token: 139 | break 140 | 141 | # 反向传播 142 | loss.backward() 143 | 144 | # 网络状态更新 145 | encoder_optimizer.step() 146 | decoder_optimizer.step() 147 | 148 | return loss.data[0] / target_length 149 | 150 | 151 | def showPlot(points): 152 | ''' 153 | 绘制图像 154 | :param points: 155 | :return: 156 | ''' 157 | plt.figure() 158 | fig, ax = plt.subplots() 159 | # this locator puts ticks at regular intervals 160 | loc = ticker.MultipleLocator(base=0.2) 161 | ax.yaxis.set_major_locator(loc) 162 | plt.plot(points) 163 | 164 | 165 | def trainIters(input_lang, output_lang, pairs, encoder, decoder, n_iters, print_every=1000, plot_every=100, 166 | learning_rate=0.01): 167 | ''' 168 | 训练过程,可以指定迭代次数,每次迭代调用 前面定义的train函数,并在迭代结束调用绘制图像的函数 169 | :param input_lang: 输入语言实例 170 | :param output_lang: 输出语言实例 171 | :param pairs: 语料中的源语言-目标语言对 172 | :param encoder: 编码器 173 | :param decoder: 解码器 174 | :param n_iters: 迭代次数 175 | :param print_every: 打印loss间隔 176 | :param plot_every: 绘制图像间隔 177 | :param learning_rate: 学习率 178 | :return: 179 | ''' 180 | start = time.time() 181 | plot_losses = [] 182 | print_loss_total = 0 # Reset every print_every 183 | plot_loss_total = 0 # Reset every plot_every 184 | 185 | encoder_optimizer = optim.SGD(encoder.parameters(), lr=learning_rate) 186 | decoder_optimizer = optim.SGD(decoder.parameters(), lr=learning_rate) 187 | training_pairs = [variablesFromPair(input_lang, output_lang, random.choice(pairs)) 188 | for i in range(n_iters)] 189 | # 损失函数定义 190 | criterion = nn.NLLLoss() 191 | 192 | for iter in range(1, n_iters + 1): 193 | training_pair = training_pairs[iter - 1] 194 | input_variable = training_pair[0] 195 | target_variable = training_pair[1] 196 | 197 | loss = train(input_variable, target_variable, encoder, 198 | decoder, encoder_optimizer, decoder_optimizer, criterion) 199 | print_loss_total += loss 200 | plot_loss_total += loss 201 | 202 | if iter % print_every == 0: 203 | print_loss_avg = print_loss_total / print_every 204 | print_loss_total = 0 205 | logger.info('%s (%d %d%%) %.4f' % (timeSince(start, iter / n_iters), 206 | iter, iter / n_iters * 100, print_loss_avg)) 207 | 208 | if iter % plot_every == 0: 209 | plot_loss_avg = plot_loss_total / plot_every 210 | plot_losses.append(plot_loss_avg) 211 | plot_loss_total = 0 212 | 213 | showPlot(plot_losses) 214 | 215 | 216 | def evaluateRandomly(input_lang, output_lang, pairs, encoder, decoder, n=10): 217 | ''' 218 | 从语料中随机选取句子进行评估 219 | ''' 220 | for i in range(n): 221 | pair = random.choice(pairs) 222 | logger.info('> %s' % pair[0]) 223 
| logger.info('= %s' % pair[1])
224 | output_words, attentions = evaluate(input_lang, output_lang, encoder, decoder, pair[0])
225 | output_sentence = ' '.join(output_words)
226 | logger.info('< %s' % output_sentence)
227 | logger.info('')
228 |
229 |
230 |
231 | def showAttention(input_sentence, output_words, attentions):
232 | try:
233 | # 添加绘图中的中文显示
234 | plt.rcParams['font.sans-serif'] = ['STSong'] # 宋体
235 | plt.rcParams['axes.unicode_minus'] = False # 用来正常显示负号
236 | # 使用 colorbar 初始化绘图
237 | fig = plt.figure()
238 | ax = fig.add_subplot(111)
239 | cax = ax.matshow(attentions.numpy(), cmap='bone')
240 | fig.colorbar(cax)
241 |
242 | # 设置x,y轴信息
243 | ax.set_xticklabels([''] + input_sentence.split(' ') +
244 | ['<EOS>'], rotation=90)
245 | ax.set_yticklabels([''] + output_words)
246 |
247 | # 显示标签
248 | ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
249 | ax.yaxis.set_major_locator(ticker.MultipleLocator(1))
250 |
251 | plt.show()
252 | except Exception as err:
253 | logger.error(err)
254 |
255 |
256 | def evaluateAndShowAtten(input_lang, output_lang, input_sentence, encoder1, attn_decoder1):
257 | output_words, attentions = evaluate(input_lang, output_lang,
258 | encoder1, attn_decoder1, input_sentence)
259 | logger.info('input = %s' % input_sentence)
260 | logger.info('output = %s' % ' '.join(output_words))
261 | # 如果是中文需要分词
262 | if input_lang.name == 'cmn':
263 | print(input_lang.name)
264 | input_sentence = cut(input_sentence)
265 | showAttention(input_sentence, output_words, attentions)
266 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/__init__.py: --------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # @Time : 18-6-8 下午4:46
4 | # @Author : J.W.
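#
# [编辑者补充说明] 本章(chapter8)包含三个实战项目, 入口脚本分别为:
#   - cat_vs_dog: 先运行 preprare_data.py 划分数据, 再运行 convert.py 训练
#   - speech_command: 先运行 make_dataset.py 构建数据集, 再运行 run.py
#   - text_classification: 直接运行 text_classification.py
# 数据获取方式见各子目录下的 data 说明、README.md 或 readme.txt。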
5 | # @File : __init__.py.py -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/cat_vs_dog/convert.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | from __future__ import print_function, division 3 | 4 | import numpy as np 5 | import torch 6 | import torch.nn as nn 7 | import torch.nn.functional as F 8 | import torch.optim as optim 9 | from torch.autograd import Variable 10 | from torch.utils.data import Dataset 11 | from torchvision import transforms, datasets, models 12 | 13 | # 配置参数 14 | random_state = 1 15 | torch.manual_seed(random_state) # 设置随机数种子,确保结果可重复 16 | torch.cuda.manual_seed(random_state) 17 | torch.cuda.manual_seed_all(random_state) 18 | np.random.seed(random_state) 19 | # random.seed(random_state) 20 | 21 | epochs = 10 # 训练次数 22 | batch_size = 4 # 批处理大小 23 | num_workers = 4 # 多线程的数目 24 | use_gpu = torch.cuda.is_available() 25 | 26 | # 对加载的图像作归一化处理, 并裁剪为[224x224x3]大小的图像 27 | data_transform = transforms.Compose([ 28 | transforms.Scale(256), 29 | transforms.CenterCrop(224), 30 | transforms.ToTensor(), 31 | transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) 32 | ]) 33 | 34 | # 数据的批处理,尺寸大小为batch_size, 35 | # 在训练集中,shuffle 必须设置为True, 表示次序是随机的 36 | train_dataset = datasets.ImageFolder(root='cats_and_dogs_small/train/', transform=data_transform) 37 | train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers) 38 | 39 | test_dataset = datasets.ImageFolder(root='cats_and_dogs_small/test/', transform=data_transform) 40 | test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers) 41 | 42 | 43 | # 创建模型 44 | class Net(nn.Module): 45 | def __init__(self): 46 | super(Net, self).__init__() 47 | self.conv1 = nn.Conv2d(3, 6, 5) 48 | self.maxpool = nn.MaxPool2d(2, 2) 49 | self.conv2 = nn.Conv2d(6, 16, 5) 50 | self.fc1 = nn.Linear(16 * 53 * 53, 1024) 51 | self.fc2 = nn.Linear(1024, 512) 52 | self.fc3 = nn.Linear(512, 2) 53 | 54 | def forward(self, x): 55 | x = self.maxpool(F.relu(self.conv1(x))) 56 | x = self.maxpool(F.relu(self.conv2(x))) 57 | x = x.view(-1, 16 * 53 * 53) 58 | x = F.relu(self.fc1(x)) 59 | x = F.relu(self.fc2(x)) 60 | x = self.fc3(x) 61 | 62 | return x 63 | 64 | 65 | # net = Net() 66 | 67 | # 加载resnet18 模型, 68 | net = models.resnet18(pretrained=False) 69 | num_ftrs = net.fc.in_features 70 | net.fc = nn.Linear(num_ftrs, 2) # 更新resnet18模型的fc模型, 71 | 72 | if use_gpu: 73 | net = net.cuda() 74 | print(net) 75 | 76 | ''' 77 | Net ( 78 | (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1)) 79 | (maxpool): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1)) 80 | (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1)) 81 | (fc1): Linear (44944 -> 2048) 82 | (fc2): Linear (2048 -> 512) 83 | (fc3): Linear (512 -> 2) 84 | ) 85 | ''' 86 | 87 | # 定义loss和optimizer 88 | cirterion = nn.CrossEntropyLoss() 89 | optimizer = optim.SGD(net.parameters(), lr=0.0001, momentum=0.9) 90 | 91 | # 开始训练 92 | net.train() 93 | for epoch in range(epochs): 94 | running_loss = 0.0 95 | train_correct = 0 96 | train_total = 0 97 | for i, data in enumerate(train_loader, 0): 98 | inputs, train_labels = data 99 | if use_gpu: 100 | inputs, labels = Variable(inputs.cuda()), Variable(train_labels.cuda()) 101 | else: 102 | inputs, labels = Variable(inputs), Variable(train_labels) 103 | # inputs, labels = Variable(inputs), Variable(train_labels) 104 | 
optimizer.zero_grad() 105 | outputs = net(inputs) 106 | _, train_predicted = torch.max(outputs.data, 1) 107 | # import pdb 108 | # pdb.set_trace() 109 | train_correct += (train_predicted == labels.data).sum() 110 | loss = cirterion(outputs, labels) 111 | loss.backward() 112 | optimizer.step() 113 | 114 | running_loss += loss.data[0] 115 | train_total += train_labels.size(0) 116 | 117 | print('train %d epoch loss: %.3f acc: %.3f ' % ( 118 | epoch + 1, running_loss / train_total, 100 * train_correct / train_total)) 119 | 120 | # 模型测试 121 | correct = 0 122 | test_loss = 0.0 123 | test_total = 0 124 | test_total = 0 125 | net.eval() 126 | for data in test_loader: 127 | images, labels = data 128 | if use_gpu: 129 | images, labels = Variable(images.cuda()), Variable(labels.cuda()) 130 | else: 131 | images, labels = Variable(images), Variable(labels) 132 | outputs = net(images) 133 | _, predicted = torch.max(outputs.data, 1) 134 | loss = cirterion(outputs, labels) 135 | test_loss += loss.data[0] 136 | test_total += labels.size(0) 137 | correct += (predicted == labels.data).sum() 138 | 139 | print('test %d epoch loss: %.3f acc: %.3f ' % (epoch + 1, test_loss / test_total, 100 * correct / test_total)) 140 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/cat_vs_dog/data: -------------------------------------------------------------------------------- 1 | Kaggle 竞赛的猫狗数据集在 https://www.kaggle.com/c/dogs-vs-cats/data 网址下载。 2 | 在本实验中,先解压 train.zip 压缩包,得到一个 train 文件夹。 3 | 数据解压后利用 preprare_data.py 进行数据预处理。 4 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/cat_vs_dog/preprare_data.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | import os 3 | import shutil 4 | 5 | import numpy as np 6 | 7 | # 随机种子设置 8 | random_state = 42 9 | np.random.seed(random_state) 10 | 11 | # kaggle原始数据集地址 12 | original_dataset_dir = 'train' 13 | total_num = int(len(os.listdir(original_dataset_dir)) / 2) 14 | random_idx = np.array(range(total_num)) 15 | np.random.shuffle(random_idx) 16 | 17 | # 待处理的数据集地址 18 | base_dir = 'cats_and_dogs_small' 19 | if not os.path.exists(base_dir): 20 | os.mkdir(base_dir) 21 | 22 | # 训练集、测试集的划分 23 | sub_dirs = ['train', 'test'] 24 | animals = ['cats', 'dogs'] 25 | train_idx = random_idx[:int(total_num * 0.9)] 26 | test_idx = random_idx[int(total_num * 0.9):] 27 | numbers = [train_idx, test_idx] 28 | for idx, sub_dir in enumerate(sub_dirs): 29 | dir = os.path.join(base_dir, sub_dir) 30 | if not os.path.exists(dir): 31 | os.mkdir(dir) 32 | for animal in animals: 33 | animal_dir = os.path.join(dir, animal) # 34 | if not os.path.exists(animal_dir): 35 | os.mkdir(animal_dir) 36 | fnames = [animal[:-1] + '.{}.jpg'.format(i) for i in numbers[idx]] 37 | for fname in fnames: 38 | src = os.path.join(original_dataset_dir, fname) 39 | dst = os.path.join(animal_dir, fname) 40 | shutil.copyfile(src, dst) 41 | 42 | # 验证训练集、验证集、测试集的划分的照片数目 43 | print(animal_dir + ' total images : %d' % (len(os.listdir(animal_dir)))) 44 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/cat_vs_dog/result.txt: -------------------------------------------------------------------------------- 1 | /home/jason/anaconda3/envs/pytorch-in-action/bin/python /home/jason/jason/ebook/cs/ml/book_by_xiaobao/pytorch-in-action/chapter8_PyTorch项目实战/cat_vs_dog/convert.py 2 | 
/home/jason/anaconda3/envs/pytorch-in-action/lib/python3.6/site-packages/torchvision/transforms/transforms.py:156: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead. 3 | "please use transforms.Resize instead.") 4 | ResNet( 5 | (conv1): Conv2d (3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) 6 | (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True) 7 | (relu): ReLU(inplace) 8 | (maxpool): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), dilation=(1, 1)) 9 | (layer1): Sequential( 10 | (0): BasicBlock( 11 | (conv1): Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 12 | (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True) 13 | (relu): ReLU(inplace) 14 | (conv2): Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 15 | (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True) 16 | ) 17 | (1): BasicBlock( 18 | (conv1): Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 19 | (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True) 20 | (relu): ReLU(inplace) 21 | (conv2): Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 22 | (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True) 23 | ) 24 | ) 25 | (layer2): Sequential( 26 | (0): BasicBlock( 27 | (conv1): Conv2d (64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) 28 | (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True) 29 | (relu): ReLU(inplace) 30 | (conv2): Conv2d (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 31 | (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True) 32 | (downsample): Sequential( 33 | (0): Conv2d (64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False) 34 | (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True) 35 | ) 36 | ) 37 | (1): BasicBlock( 38 | (conv1): Conv2d (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 39 | (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True) 40 | (relu): ReLU(inplace) 41 | (conv2): Conv2d (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 42 | (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True) 43 | ) 44 | ) 45 | (layer3): Sequential( 46 | (0): BasicBlock( 47 | (conv1): Conv2d (128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) 48 | (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True) 49 | (relu): ReLU(inplace) 50 | (conv2): Conv2d (256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 51 | (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True) 52 | (downsample): Sequential( 53 | (0): Conv2d (128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False) 54 | (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True) 55 | ) 56 | ) 57 | (1): BasicBlock( 58 | (conv1): Conv2d (256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 59 | (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True) 60 | (relu): ReLU(inplace) 61 | (conv2): Conv2d (256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 62 | (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True) 63 | ) 64 | ) 65 | (layer4): Sequential( 66 | (0): BasicBlock( 67 | (conv1): Conv2d (256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) 68 | (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True) 69 | (relu): ReLU(inplace) 70 | (conv2): Conv2d (512, 512, 
kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 71 | (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True) 72 | (downsample): Sequential( 73 | (0): Conv2d (256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False) 74 | (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True) 75 | ) 76 | ) 77 | (1): BasicBlock( 78 | (conv1): Conv2d (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 79 | (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True) 80 | (relu): ReLU(inplace) 81 | (conv2): Conv2d (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 82 | (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True) 83 | ) 84 | ) 85 | (avgpool): AvgPool2d(kernel_size=7, stride=1, padding=0, ceil_mode=False, count_include_pad=True) 86 | (fc): Linear(in_features=512, out_features=2) 87 | ) 88 | train 1 epoch loss: 0.162 acc: 62.200 89 | test 1 epoch loss: 0.159 acc: 63.800 90 | train 2 epoch loss: 0.149 acc: 68.182 91 | test 2 epoch loss: 0.134 acc: 74.160 92 | train 3 epoch loss: 0.131 acc: 74.378 93 | test 3 epoch loss: 0.118 acc: 77.680 94 | train 4 epoch loss: 0.120 acc: 77.360 95 | test 4 epoch loss: 0.122 acc: 76.040 96 | train 5 epoch loss: 0.112 acc: 79.311 97 | test 5 epoch loss: 0.107 acc: 81.040 98 | train 6 epoch loss: 0.104 acc: 80.969 99 | test 6 epoch loss: 0.100 acc: 81.560 100 | train 7 epoch loss: 0.097 acc: 82.333 101 | test 7 epoch loss: 0.115 acc: 79.000 102 | train 8 epoch loss: 0.091 acc: 83.787 103 | test 8 epoch loss: 0.092 acc: 83.120 104 | train 9 epoch loss: 0.085 acc: 85.102 105 | test 9 epoch loss: 0.088 acc: 84.360 106 | train 10 epoch loss: 0.079 acc: 86.427 107 | test 10 epoch loss: 0.088 acc: 84.640 108 | 109 | Process finished with exit code 0 -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/speech_command/README.md: -------------------------------------------------------------------------------- 1 | # 命令词识别的 PyTorch 实现 2 | - 在 `speech_command` 中创建并进入 `org_data` 文件夹,[点击下载](http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz)数据,下载完成后解压. 3 | - 运行 `make_dataset.py` 进行数据预处理. 
4 | - 运行 `run.py` -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/speech_command/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @File : __init__.py.py 4 | 5 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/speech_command/make_dataset.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import shutil 4 | 5 | 6 | # 把文件从源文件夹移动到目标文件夹 7 | def move_files(original_fold, data_fold, data_filename): 8 | with open(data_filename) as f: 9 | for line in f.readlines(): 10 | vals = line.split('/') 11 | dest_fold = os.path.join(data_fold, vals[0]) 12 | if not os.path.exists(dest_fold): 13 | os.mkdir(dest_fold) 14 | shutil.move(os.path.join(original_fold, line[:-1]), os.path.join(data_fold, line[:-1])) 15 | 16 | 17 | # 建立train文件夹 18 | def create_train_fold(original_fold, train_fold, test_fold): 19 | # 文件夹名列表 20 | dir_names = list() 21 | for file in os.listdir(test_fold): 22 | if os.path.isdir(os.path.join(test_fold, file)): 23 | dir_names.append(file) 24 | 25 | # 建立训练文件夹train 26 | for file in os.listdir(original_fold): 27 | if os.path.isdir(os.path.join(test_fold, file)) and file in dir_names: 28 | shutil.move(os.path.join(original_fold, file), os.path.join(train_fold, file)) 29 | 30 | 31 | # 建立数据集,train,valid, 和 test 32 | def make_dataset(in_path, out_path): 33 | validation_path = os.path.join(in_path, 'validation_list.txt') 34 | test_path = os.path.join(in_path, 'testing_list.txt') 35 | 36 | # train, valid, test三个数据集文件夹的建立 37 | train_fold = os.path.join(out_path, 'train') 38 | valid_fold = os.path.join(out_path, 'valid') 39 | test_fold = os.path.join(out_path, 'test') 40 | 41 | for fold in [valid_fold, test_fold, train_fold]: 42 | if not os.path.exists(fold): 43 | os.mkdir(fold) 44 | # 移动train, valid, test三个数据集所需要的文件 45 | move_files(in_path, test_fold, test_path) 46 | move_files(in_path, valid_fold, validation_path) 47 | create_train_fold(in_path, train_fold, test_fold) 48 | 49 | 50 | if __name__ == '__main__': 51 | parser = argparse.ArgumentParser(description='Make speech commands dataset.') 52 | parser.add_argument('--in_path', default='org_data', 53 | help='the path to the root folder of te speech commands dataset.') 54 | parser.add_argument('--out_path', default='data', help='the path where to save the files splitted to folders.') 55 | args = parser.parse_args() 56 | make_dataset(args.in_path, args.out_path) 57 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/speech_command/model.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch.nn.functional as F 3 | 4 | # 建立VGG卷积神经的模型层 5 | def _make_layers(cfg): 6 | layers = [] 7 | in_channels = 1 8 | for x in cfg: 9 | if x == 'M': # maxpool 池化层 10 | layers += [nn.MaxPool2d(kernel_size=2, stride=2)] 11 | else: # 卷积层 12 | layers += [nn.Conv2d(in_channels, x, kernel_size=3, padding=1), 13 | nn.BatchNorm2d(x), 14 | nn.ReLU(inplace=True)] 15 | in_channels = x 16 | layers += [nn.AvgPool2d(kernel_size=1, stride=1)] #avgPool 池化层 17 | return nn.Sequential(*layers) 18 | 19 | # 各个VGG模型的参数 20 | cfg = { 21 | 'VGG11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'], 22 | 'VGG13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 
'M'], 23 | 'VGG16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'], 24 | 'VGG19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'], 25 | } 26 | 27 | # VGG卷积神经网络 28 | class VGG(nn.Module): 29 | def __init__(self, vgg_name): 30 | super(VGG, self).__init__() 31 | self.features = _make_layers(cfg[vgg_name]) # VGG的模型层 32 | self.fc1 = nn.Linear(7680, 512) 33 | self.fc2 = nn.Linear(512, 30) 34 | 35 | def forward(self, x): 36 | out = self.features(x) 37 | out = out.view(out.size(0), -1) # flatting 38 | out = self.fc1(out) # 线性层 39 | out = self.fc2(out) # 线性层 40 | #import pdb 41 | #pdb.set_trace() 42 | return F.log_softmax(out, dim=1) # log_softmax 激活函数 43 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/speech_command/run.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import argparse 3 | import torch 4 | import torch.optim as optim 5 | from speech_loader import SpeechLoader 6 | import numpy as np 7 | from model import VGG 8 | from train import train, test 9 | import torch.nn.functional as F 10 | from torch.autograd import Variable 11 | 12 | # 参数设置 13 | parser = argparse.ArgumentParser(description='Google Speech Commands Recognition') 14 | parser.add_argument('--train_path', default='data/train', help='path to the train data folder') 15 | parser.add_argument('--test_path', default='data/test', help='path to the test data folder') 16 | parser.add_argument('--valid_path', default='data/valid', help='path to the valid data folder') 17 | parser.add_argument('--batch_size', type=int, default=100, metavar='N', help='training and valid batch size') 18 | parser.add_argument('--test_batch_size', type=int, default=100, metavar='N', help='batch size for testing') 19 | parser.add_argument('--arc', default='VGG11', help='network architecture: VGG11, VGG13, VGG16, VGG19') 20 | parser.add_argument('--epochs', type=int, default=10, metavar='N', help='number of epochs to train') 21 | parser.add_argument('--lr', type=float, default=0.001, metavar='LR', help='learning rate') 22 | parser.add_argument('--momentum', type=float, default=0.9, metavar='M', help='SGD momentum, for SGD only') 23 | parser.add_argument('--optimizer', default='adam', help='optimization method: sgd | adam') 24 | parser.add_argument('--cuda', default=True, help='enable CUDA') 25 | parser.add_argument('--seed', type=int, default=1234, metavar='S', help='random seed') 26 | 27 | # 特征提取参数设置 28 | parser.add_argument('--window_size', default=.02, help='window size for the stft') 29 | parser.add_argument('--window_stride', default=.01, help='window stride for the stft') 30 | parser.add_argument('--window_type', default='hamming', help='window type for the stft') 31 | parser.add_argument('--normalize', default=True, help='boolean, wheather or not to normalize the spect') 32 | 33 | args = parser.parse_args() 34 | 35 | # 确定是否使用CUDA 36 | args.cuda = args.cuda and torch.cuda.is_available() 37 | torch.manual_seed(args.seed) # PyTorch随机种子设置 38 | if args.cuda: 39 | torch.cuda.manual_seed(args.seed) # CUDA随机种子设置 40 | 41 | # 加载数据, 训练集,验证集和测试集 42 | train_dataset = SpeechLoader(args.train_path, window_size=args.window_size, window_stride=args.window_stride, 43 | window_type=args.window_type, normalize=args.normalize) 44 | train_loader = torch.utils.data.DataLoader( 45 | train_dataset, batch_size=args.batch_size, shuffle=True, 46 | 
num_workers=20, pin_memory=args.cuda, sampler=None) 47 | 48 | valid_dataset = SpeechLoader(args.valid_path, window_size=args.window_size, window_stride=args.window_stride, 49 | window_type=args.window_type, normalize=args.normalize) 50 | valid_loader = torch.utils.data.DataLoader( 51 | valid_dataset, batch_size=args.batch_size, shuffle=None, 52 | num_workers=20, pin_memory=args.cuda, sampler=None) 53 | 54 | test_dataset = SpeechLoader(args.test_path, window_size=args.window_size, window_stride=args.window_stride, 55 | window_type=args.window_type, normalize=args.normalize) 56 | test_loader = torch.utils.data.DataLoader( 57 | test_dataset, batch_size=args.test_batch_size, shuffle=None, 58 | num_workers=20, pin_memory=args.cuda, sampler=None) 59 | 60 | # 建立网络模型 61 | model = VGG(args.arc) 62 | 63 | if args.cuda: 64 | print('Using CUDA with {0} GPUs'.format(torch.cuda.device_count())) 65 | model = torch.nn.DataParallel(model).cuda() 66 | 67 | # 定义优化器 68 | if args.optimizer.lower() == 'adam': 69 | optimizer = optim.Adam(model.parameters(), lr=args.lr) 70 | elif args.optimizer.lower() == 'sgd': 71 | optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum) 72 | else: 73 | optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum) 74 | 75 | #import pdb 76 | #pdb.set_trace() 77 | # train 和 valid过程 78 | for epoch in range(1, args.epochs + 1): 79 | # 模型在train集上训练 80 | train(train_loader, model, optimizer, epoch, args.cuda) 81 | 82 | # 验证集测试 83 | test(valid_loader, model, args.cuda, 'valid') 84 | 85 | # 测试集验证 86 | test(test_loader, model, args.cuda, 'test') 87 | 88 | 89 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/speech_command/speech_loader.py: -------------------------------------------------------------------------------- 1 | import torch.utils.data as data 2 | 3 | import os 4 | import os.path 5 | import torch 6 | 7 | import librosa 8 | import numpy as np 9 | 10 | # 音频数据格式,只允许wav和WAV 11 | AUDIO_EXTENSIONS = [ 12 | '.wav', '.WAV', 13 | ] 14 | 15 | # 判断是否是音频文件 16 | def is_audio_file(filename): 17 | return any(filename.endswith(extension) for extension in AUDIO_EXTENSIONS) 18 | 19 | # 找到类名并索引 20 | def find_classes(dir): 21 | classes = [d for d in os.listdir(dir) if os.path.isdir(os.path.join(dir, d))] 22 | classes.sort() 23 | class_to_idx = {classes[i]: i for i in range(len(classes))} 24 | return classes, class_to_idx 25 | 26 | # 构造数据集 27 | def make_dataset(dir, class_to_idx): 28 | spects = [] 29 | dir = os.path.expanduser(dir) 30 | for target in sorted(os.listdir(dir)): 31 | d = os.path.join(dir, target) 32 | if not os.path.isdir(d): 33 | continue 34 | 35 | for root, _, fnames in sorted(os.walk(d)): 36 | for fname in sorted(fnames): 37 | if is_audio_file(fname): 38 | path = os.path.join(root, fname) 39 | item = (path, class_to_idx[target]) 40 | spects.append(item) 41 | return spects 42 | 43 | # 频谱加载器, 处理音频,生成频谱 44 | def spect_loader(path, window_size, window_stride, window, normalize, max_len=101): 45 | y, sr = librosa.load(path, sr=None) 46 | # n_fft = 4096 47 | n_fft = int(sr * window_size) 48 | win_length = n_fft 49 | hop_length = int(sr * window_stride) 50 | 51 | # 短时傅立叶变换 52 | D = librosa.stft(y, n_fft=n_fft, hop_length=hop_length, 53 | win_length=win_length, window=window) 54 | spect, phase = librosa.magphase(D) # 计算幅度谱和相位 55 | 56 | # S = log(S+1) 57 | spect = np.log1p(spect) # 计算log域幅度谱 58 | 59 | # 处理所有的频谱,使得长度一致,少于规定长度,补0到规定长度; 多于规定长度的, 截短到规定长度; 60 | if spect.shape[1] < max_len: 61 | pad = 
np.zeros((spect.shape[0], max_len - spect.shape[1]))
62 | spect = np.hstack((spect, pad))
63 | elif spect.shape[1] > max_len:
64 | spect = spect[:, :max_len]  # 沿帧(时间)轴截断, 而不是频率轴
65 | spect = np.resize(spect, (1, spect.shape[0], spect.shape[1]))
66 | spect = torch.FloatTensor(spect)
67 |
68 | # z-score 归一化
69 | if normalize:
70 | mean = spect.mean()
71 | std = spect.std()
72 | if std != 0:
73 | spect.add_(-mean)
74 | spect.div_(std)
75 |
76 | return spect
77 |
78 | # 音频加载器, 类似PyTorch的加载器,实现对数据的加载
79 | class SpeechLoader(data.Dataset):
80 | """ Google 音频命令数据集的数据形式如下:
81 | root/one/xxx.wav
82 | root/head/123.wav
83 | 参数:
84 | root (string): 原始数据集路径
85 | window_size: STFT的窗长大小,默认参数是 .02
86 | window_stride: 用于STFT窗的帧移是 .01
87 | window_type: 窗的类型, 默认是 hamming窗
88 | normalize: boolean, whether or not to normalize the spect to have zero mean and one std
89 | max_len: 帧的最大长度
90 | 属性:
91 | classes (list): 类别名的列表
92 | class_to_idx (dict): 目标参数(class_name, class_index)(字典类型).
93 | spects (list): 频谱参数(spects path, class_index) 的列表
94 | STFT parameter: 窗长, 帧移, 窗的类型, 归一化
95 | """
96 |
97 | def __init__(self, root, window_size=.02, window_stride=.01, window_type='hamming',
98 | normalize=True, max_len=101):
99 | classes, class_to_idx = find_classes(root)
100 | spects = make_dataset(root, class_to_idx)
101 | if len(spects) == 0: # 错误处理
102 | raise (RuntimeError("Found 0 sound files in subfolders of: " + root + "\nSupported audio file extensions are: " + ",".join(AUDIO_EXTENSIONS)))
103 |
104 | self.root = root
105 | self.spects = spects
106 | self.classes = classes
107 | self.class_to_idx = class_to_idx
108 | self.loader = spect_loader
109 | self.window_size = window_size
110 | self.window_stride = window_stride
111 | self.window_type = window_type
112 | self.normalize = normalize
113 | self.max_len = max_len
114 |
115 | def __getitem__(self, index):
116 | """
117 | Args:
118 | index (int): 序列
119 | Returns:
120 | tuple (spect, target): 返回 (spect, target), 其中 target 是类别的索引。
121 | """
122 | path, target = self.spects[index]
123 | spect = self.loader(path, self.window_size, self.window_stride, self.window_type, self.normalize, self.max_len)
124 |
125 | return spect, target
126 |
127 | def __len__(self):
128 | return len(self.spects)
129 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/speech_command/train.py: --------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | import torch.nn.functional as F
3 | from torch.autograd import Variable
4 |
5 | # 训练函数, 模型在train集上训练
6 | def train(loader, model, optimizer, epoch, cuda):
7 | model.train()
8 | train_loss = 0
9 | train_correct = 0
10 |
11 | for batch_idx, (data, target) in enumerate(loader):
12 | if cuda:
13 | data, target = data.cuda(), target.cuda()
14 | data, target = Variable(data), Variable(target)
15 | optimizer.zero_grad()
16 | output = model(data)
17 | loss = F.nll_loss(output, target)
18 | loss.backward()
19 | optimizer.step()
20 | train_loss += loss.data[0]
21 | pred = output.data.max(1, keepdim=True)[1]
22 | train_correct += pred.eq(target.data.view_as(pred)).cpu().sum()
23 |
24 | train_loss = train_loss / len(loader.dataset)
25 | print('train set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)'.format(
26 | train_loss, train_correct, len(loader.dataset), 100.
* train_correct / len(loader.dataset))) 27 | 28 | # 测试函数,用来测试valid集和test集 29 | def test(loader, model, cuda, dataset): 30 | model.eval() 31 | test_loss = 0 32 | correct = 0 33 | for data, target in loader: 34 | if cuda: 35 | data, target = data.cuda(), target.cuda() 36 | data, target = Variable(data, volatile=True), Variable(target) 37 | output = model(data) 38 | test_loss += F.nll_loss(output, target, size_average=False).data[0] # sum up batch loss 39 | pred = output.data.max(1, keepdim=True)[1] # get the index of the max log-probability 40 | correct += pred.eq(target.data.view_as(pred)).cpu().sum() 41 | 42 | test_loss /= len(loader.dataset) 43 | 44 | print(dataset + ' set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)'.format( 45 | test_loss, correct, len(loader.dataset), 100. * correct / len(loader.dataset))) 46 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/text_classification/__init__.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # @File : __init__.py.py 4 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/text_classification/model.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | 6 | class CNN_Text(nn.Module): 7 | def __init__(self, args): 8 | super(CNN_Text, self).__init__() 9 | self.args = args 10 | 11 | embed_num = args.embed_num 12 | embed_dim = args.embed_dim 13 | class_num = args.class_num 14 | Ci = 1 15 | kernel_num = args.kernel_num 16 | kernel_sizes = args.kernel_sizes 17 | 18 | self.embed = nn.Embedding(embed_num, embed_dim) 19 | 20 | self.convs_list = nn.ModuleList( 21 | [nn.Conv2d(Ci, kernel_num, (kernel_size, embed_dim)) for kernel_size in kernel_sizes]) 22 | 23 | self.dropout = nn.Dropout(args.dropout) 24 | self.fc = nn.Linear(len(kernel_sizes) * kernel_num, class_num) 25 | 26 | def forward(self, x): 27 | x = self.embed(x) 28 | x = x.unsqueeze(1) 29 | x = [F.relu(conv(x)).squeeze(3) for conv in self.convs_list] 30 | x = [F.max_pool1d(i, i.size(2)).squeeze(2) for i in x] 31 | x = torch.cat(x, 1) 32 | x = self.dropout(x) 33 | x = x.view(x.size(0), -1) 34 | logit = self.fc(x) 35 | return logit 36 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/text_classification/mydatasets.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | import tarfile 4 | import urllib 5 | 6 | from torchtext import data 7 | 8 | 9 | class TarDataset(data.Dataset): 10 | """Defines a Dataset loaded from a downloadable tar archive. 11 | 12 | Attributes: 13 | url: URL where the tar archive can be downloaded. 14 | filename: Filename of the downloaded tar archive. 15 | dirname: Name of the top-level directory within the zip archive that 16 | contains the data files. 
17 | """ 18 | 19 | @classmethod 20 | def download_or_unzip(cls, root): 21 | path = os.path.join(root, cls.dirname) 22 | if not os.path.isdir(path): 23 | tpath = os.path.join(root, cls.filename) 24 | if not os.path.isfile(tpath): 25 | print('downloading') 26 | urllib.request.urlretrieve(cls.url, tpath) 27 | with tarfile.open(tpath, 'r') as tfile: 28 | print('extracting') 29 | tfile.extractall(root) 30 | return os.path.join(path, '') 31 | 32 | 33 | class NEWS_20(TarDataset): 34 | url = 'http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz' 35 | filename = 'data/20news-bydate-train' 36 | dirname = '' 37 | 38 | @staticmethod 39 | def sort_key(ex): 40 | return len(ex.text) 41 | 42 | def __init__(self, text_field, label_field, path=None, text_cnt=1000, examples=None, **kwargs): 43 | """Create an MR dataset instance given a path and fields. 44 | 45 | Arguments: 46 | text_field: The field that will be used for text data. 47 | 48 | label_field: The field that will be used for label data. 49 | path: Path to the data file. 50 | examples: The examples contain all the data. 51 | Remaining keyword arguments: Passed to the constructor of 52 | data.Dataset. 53 | """ 54 | 55 | def clean_str(string): 56 | """ 57 | Tokenization/string cleaning for all datasets except for SST. 58 | Original taken from https://github.com/yoonkim/CNN_sentence/blob/master/process_data.py 59 | """ 60 | string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string) 61 | string = re.sub(r"\'s", " \'s", string) 62 | string = re.sub(r"\'ve", " \'ve", string) 63 | string = re.sub(r"n\'t", " n\'t", string) 64 | string = re.sub(r"\'re", " \'re", string) 65 | string = re.sub(r"\'d", " \'d", string) 66 | string = re.sub(r"\'ll", " \'ll", string) 67 | string = re.sub(r",", " , ", string) 68 | string = re.sub(r"!", " ! ", string) 69 | string = re.sub(r"\(", " \( ", string) 70 | string = re.sub(r"\)", " \) ", string) 71 | string = re.sub(r"\?", " \? ", string) 72 | string = re.sub(r"\s{2,}", " ", string) 73 | return string.strip().lower() 74 | 75 | text_field.preprocessing = data.Pipeline(clean_str) 76 | fields = [('text', text_field), ('label', label_field)] 77 | 78 | categories = ['alt.atheism', 'comp.graphics', 'sci.med', 'soc.religion.christian'] 79 | if examples is None: 80 | path = self.dirname if path is None else path 81 | examples = [] 82 | for sub_path in categories: 83 | 84 | sub_path_one = os.path.join(path, sub_path) 85 | sub_paths_two = os.listdir(sub_path_one) 86 | cnt = 0 87 | for sub_path_two in sub_paths_two: 88 | lines = "" 89 | with open(os.path.join(sub_path_one, sub_path_two), encoding="utf8", errors='ignore') as f: 90 | lines = f.read() 91 | examples += [data.Example.fromlist([lines, sub_path], fields)] 92 | cnt += 1 93 | 94 | super(NEWS_20, self).__init__(examples, fields, **kwargs) 95 | 96 | @classmethod 97 | def splits(cls, text_field, label_field, root='./data', 98 | train='20news-bydate-train', test='20news-bydate-test', 99 | **kwargs): 100 | """Create dataset objects for splits of the 20news dataset. 101 | 102 | Arguments: 103 | text_field: The field that will be used for the sentence. 104 | label_field: The field that will be used for label data. 105 | 106 | train: The filename of the train data. Default: 'train.txt'. 107 | Remaining keyword arguments: Passed to the splits method of 108 | Dataset. 
109 | """ 110 | 111 | path = cls.download_or_unzip(root) 112 | 113 | train_data = None if train is None else cls( 114 | text_field, label_field, os.path.join(path, train), 2000, **kwargs) 115 | 116 | dev_ratio = 0.1 117 | dev_index = -1 * int(dev_ratio * len(train_data)) 118 | 119 | return (cls(text_field, label_field, examples=train_data[:dev_index]), 120 | cls(text_field, label_field, examples=train_data[dev_index:])) 121 | -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/text_classification/readme.txt: -------------------------------------------------------------------------------- 1 | text_classification 文件夹下数据: 2 | .vector_cache/glove.6B.zip 3 | 运行 text_classification.py会自动创建.vector_cache并下载数据,(下载地址:http://nlp.stanford.edu/data/glove.6B.zip) 4 | 5 | 20Newsgroups数据集: 6 | data/20news-bydate-test 7 | data/20news-bydate-train -------------------------------------------------------------------------------- /chapter8_PyTorch项目实战/text_classification/text_classification.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import random 3 | 4 | import model 5 | import mydatasets 6 | import numpy as np 7 | import torch 8 | import torch.nn as nn 9 | import torch.nn.functional as F 10 | import torchtext.data as data 11 | 12 | # import torch.nn.init.xavier_uniform_ as xavier 13 | # random_state = 11892 #92% 14 | random_state = 11117 # 20 94.22% 15 | torch.manual_seed(random_state) 16 | torch.cuda.manual_seed(random_state) 17 | torch.cuda.manual_seed_all(random_state) 18 | np.random.seed(random_state) 19 | random.seed(random_state) 20 | 21 | # lr = 0.001 , 17 ,95.11% 22 | # 13 , 96.88% 23 | 24 | parser = argparse.ArgumentParser(description='CNN text classificer') 25 | # learning 26 | parser.add_argument('-lr', type=float, default=0.001, help='initial learning rate [default: 0.001]') 27 | parser.add_argument('-epochs', type=int, default=20, help='number of epochs for train [default: 20]') 28 | parser.add_argument('-batch-size', type=int, default=64, help='batch size for training [default: 64]') 29 | # data 30 | parser.add_argument('-shuffle', action='store_true', default=False, help='shuffle the data every epoch') 31 | # model 32 | parser.add_argument('-dropout', type=float, default=0.2, help='the probability for dropout [default: 0.5]') 33 | parser.add_argument('-embed-dim', type=int, default=100, help='number of embedding dimension [default: 128]') 34 | parser.add_argument('-kernel-num', type=int, default=128, help='number of each kind of kernel, 100') 35 | parser.add_argument('-kernel-sizes', type=str, default='3,5,7', 36 | help='comma-separated kernel size to use for convolution') 37 | parser.add_argument('-static', action='store_true', default=False, help='fix the embedding') 38 | # device 39 | parser.add_argument('-device', type=int, default=-1, help='device to use for iterate data, -1 mean cpu [default: -1]') 40 | parser.add_argument('-no-cuda', action='store_true', default=False, help='disable the gpu') 41 | 42 | args = parser.parse_args() 43 | 44 | 45 | # load 20new dataset 46 | def new_20(text_field, label_field, **kargs): 47 | train_data, dev_data = mydatasets.NEWS_20.splits(text_field, label_field) 48 | 49 | max_document_length = max([len(x.text) for x in train_data.examples]) 50 | print('train max_document_length', max_document_length) 51 | 52 | max_document_length = max([len(x.text) for x in dev_data]) 53 | print('dev max_document_length', max_document_length) 54 | 55 | 
text_field.build_vocab(train_data, dev_data) 56 | text_field.vocab.load_vectors('glove.6B.100d') 57 | 58 | label_field.build_vocab(train_data, dev_data) 59 | train_iter, dev_iter = data.Iterator.splits( 60 | (train_data, dev_data), 61 | batch_sizes=(args.batch_size, len(dev_data)), 62 | **kargs) 63 | return train_iter, dev_iter, text_field 64 | 65 | 66 | def weights_init(m): 67 | classname = m.__class__.__name__ 68 | if classname.find('Conv2d') != -1: 69 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 70 | nn.init.xavier_normal(m.weight.data) 71 | m.bias.data.fill_(0) 72 | elif classname.find('Linear') != -1: 73 | m.weight.data.normal_(0.0, 0.02) 74 | m.bias.data.fill_(0) 75 | 76 | 77 | # load data 78 | print("\nLoading data...") 79 | text_field = data.Field(lower=True) 80 | label_field = data.Field(sequential=False) 81 | train_iter, dev_iter, text_field = new_20(text_field, label_field, device=-1, repeat=False) 82 | 83 | # update args and print 84 | args.embed_num = len(text_field.vocab) 85 | args.class_num = len(label_field.vocab) - 1 86 | 87 | args.cuda = (not args.no_cuda) and torch.cuda.is_available(); 88 | del args.no_cuda 89 | args.kernel_sizes = [int(k) for k in args.kernel_sizes.split(',')] 90 | 91 | print("\nParameters:") 92 | for attr, value in sorted(args.__dict__.items()): 93 | print("\t{}={}".format(attr.upper(), value)) 94 | 95 | # model 96 | cnn = model.CNN_Text(args) 97 | 98 | # load pre-training glove model 99 | cnn.embed.weight.data = text_field.vocab.vectors 100 | # weight init 101 | cnn.apply(weights_init) # 102 | 103 | # print net 104 | print(cnn) 105 | ''' 106 | 107 | CNN_Text( 108 | (embed): Embedding(53605, 100) 109 | (convs_list): ModuleList( 110 | (0): Conv2d (1, 128, kernel_size=(3, 100), stride=(1, 1)) 111 | (1): Conv2d (1, 128, kernel_size=(5, 100), stride=(1, 1)) 112 | (2): Conv2d (1, 128, kernel_size=(7, 100), stride=(1, 1)) 113 | ) 114 | (dropout): Dropout(p=0.2) 115 | (fc): Linear(in_features=384, out_features=4) 116 | ) 117 | ''' 118 | if args.cuda: 119 | torch.cuda.set_device(args.device) 120 | cnn = cnn.cuda() 121 | 122 | optimizer = torch.optim.Adam(cnn.parameters(), lr=args.lr, weight_decay=0.01) 123 | # train 124 | cnn.train() 125 | for epoch in range(1, args.epochs + 1): 126 | corrects, avg_loss = 0, 0 127 | for batch in train_iter: 128 | feature, target = batch.text, batch.label 129 | feature.data.t_(), target.data.sub_(1) # batch first, index align 130 | if args.cuda: 131 | feature, target = feature.cuda(), target.cuda() 132 | 133 | optimizer.zero_grad() 134 | logit = cnn(feature) 135 | 136 | loss = F.cross_entropy(logit, target) 137 | loss.backward() 138 | optimizer.step() 139 | 140 | avg_loss += loss.data[0] 141 | corrects += (torch.max(logit, 1)[1].view(target.size()).data == target.data).sum() 142 | 143 | size = len(train_iter.dataset) 144 | avg_loss /= size 145 | accuracy = 100.0 * corrects / size 146 | print('epoch[{}] Traning - loss: {:.6f} acc: {:.4f}%({}/{})'.format(epoch, 147 | avg_loss, 148 | accuracy, 149 | corrects, 150 | size)) 151 | # test 152 | cnn.eval() 153 | corrects, avg_loss = 0, 0 154 | for batch in dev_iter: 155 | feature, target = batch.text, batch.label 156 | feature.data.t_(), target.data.sub_(1) # batch first, index align 157 | if args.cuda: 158 | feature, target = feature.cuda(), target.cuda() 159 | 160 | logit = cnn(feature) 161 | loss = F.cross_entropy(logit, target, size_average=False) 162 | 163 | avg_loss += loss.data[0] 164 | corrects += (torch.max(logit, 1) 165 | [1].view(target.size()).data == 
target.data).sum() 166 | 167 | size = len(dev_iter.dataset) 168 | avg_loss /= size 169 | accuracy = 100.0 * corrects / size 170 | print('Evaluation - loss: {:.6f} acc: {:.4f}%({}/{}) '.format(avg_loss, 171 | accuracy, 172 | corrects, 173 | size)) 174 | -------------------------------------------------------------------------------- /corrigendum.md: -------------------------------------------------------------------------------- 1 | # 《PyTorch机器学习从入门到实战》- 勘误 2 | 3 | 工作之余编著本书,但自认才疏学浅, 对深度学习仅略知皮毛, 更兼时间和精力所限, 书中错谬之处依然甚多, 虽多次审稿修正,但错误仍在所难免, 若蒙读者诸君不吝告知, 将不胜感激。—— 修改自[西瓜书勘误修订](http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/MLbook2016.htm#%E5%8B%98%E8%AF%AF%E4%BF%AE%E8%AE%A2) 4 | 5 | 欢迎提[issue](https://github.com/xiaobaoonline/pytorch-in-action/issues)指出书中错误或不当之处。 6 | 7 | ## 2018年11月12日 [@jianchengss](https://github.com/jianchengss) 8 | - p73,图4-2用错,应修正为 9 | 10 | ![图_4-2](./images/图_4-2.png) 11 | 12 | ## 2019年4月29日 [Issues#2](https://github.com/xiaobaoonline/pytorch-in-action/issues/2) [@DemonsRhythm](https://github.com/DemonsRhythm) 13 | 14 | - p53,"start-of-art",应修正为"state-of-the-art" 15 | - p79,p87, p93, "Hilton",应修正为"Hinton" 16 | - p85,"L1正则化被定义为",应修正为"L2正则化被定义为" 17 | - p90,计算"less",应修正为计算"loss" -------------------------------------------------------------------------------- /environment.yaml: -------------------------------------------------------------------------------- 1 | name: pytorch-in-action 2 | channels: 3 | - soumith 4 | - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/ 5 | - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/ 6 | - defaults 7 | dependencies: 8 | - blas=1.0=mkl 9 | - ca-certificates=2018.03.07=0 10 | - certifi=2018.4.16=py36_0 11 | - cffi=1.11.5=py36h9745a5d_0 12 | - cudatoolkit=8.0=3 13 | - cudnn=7.0.5=cuda8.0_0 14 | - cycler=0.10.0=py36h93f1223_0 15 | - dbus=1.13.2=h714fa37_1 16 | - expat=2.2.5=he0dffb1_0 17 | - fontconfig=2.12.6=h49f89f6_0 18 | - freetype=2.8=hab7d2ae_1 19 | - glib=2.56.1=h000015b_0 20 | - gst-plugins-base=1.14.0=hbbd80ab_1 21 | - gstreamer=1.14.0=hb453b48_1 22 | - icu=58.2=h9c2bf20_1 23 | - intel-openmp=2018.0.0=8 24 | - jpeg=9b=h024ee3a_2 25 | - kiwisolver=1.0.1=py36h764f252_0 26 | - libedit=3.1.20170329=h6b74fdf_2 27 | - libffi=3.2.1=hd88cf55_4 28 | - libgcc-ng=7.2.0=hdf63c60_3 29 | - libgfortran-ng=7.2.0=hdf63c60_3 30 | - libpng=1.6.34=hb9fc6fc_0 31 | - libstdcxx-ng=7.2.0=hdf63c60_3 32 | - libtiff=4.0.9=he85c1e1_1 33 | - libxcb=1.13=h1bed415_1 34 | - libxml2=2.9.8=hf84eae3_0 35 | - matplotlib=2.2.2=py36h0e671d2_1 36 | - mkl=2019.0=118 37 | - mkl_fft=1.0.1=py36h3010b51_0 38 | - mkl_random=1.0.1=py36h629b387_0 39 | - nccl=1.3.4=cuda8.0_1 40 | - ncurses=6.1=hf484d3e_0 41 | - numpy=1.14.2=py36hdbf6ddf_1 42 | - olefile=0.46=py36_0 43 | - openssl=1.0.2o=h20670df_0 44 | - pcre=8.42=h439df22_0 45 | - pillow=5.1.0=py36h3deb7b8_0 46 | - pip=10.0.1=py36_0 47 | - pycparser=2.18=py36hf9f622e_1 48 | - pyparsing=2.2.0=py36hee85983_1 49 | - pyqt=5.9.2=py36h751905a_0 50 | - python=3.6.5=hc3d631a_2 51 | - python-dateutil=2.7.3=py36_0 52 | - pytorch=0.3.0=py36cuda8.0cudnn7.0_0 53 | - pytz=2018.4=py36_0 54 | - qt=5.9.5=h7e424d6_0 55 | - readline=7.0=ha6073c6_4 56 | - scikit-learn=0.19.1=py36h7aa7ec6_0 57 | - scipy=1.1.0=py36hfc37229_0 58 | - setuptools=39.1.0=py36_0 59 | - sip=4.19.8=py36hf484d3e_0 60 | - six=1.11.0=py36h372c433_1 61 | - sqlite=3.23.1=he433501_0 62 | - tk=8.6.7=hc745277_3 63 | - torchvision=0.2.0=py36_0 64 | - tornado=5.0.2=py36_0 65 | - wheel=0.31.1=py36_0 66 | - xz=5.2.3=h5e939de_4 67 | - zlib=1.2.11=ha838bed_2 68 | - 
cuda80=1.0=0 69 | - pip: 70 | - audioread==2.1.6 71 | - chardet==3.0.4 72 | - decorator==4.3.0 73 | - idna==2.7 74 | - joblib==0.12.5 75 | - librosa==0.5.1 76 | - llvmlite==0.25.0 77 | - mkl-fft==1.0.0 78 | - mkl-random==1.0.1 79 | - numba==0.40.1 80 | - requests==2.20.0 81 | - resampy==0.2.1 82 | - torch==0.3.0 83 | - torchtext==0.2.1 84 | - tqdm==4.28.1 85 | - urllib3==1.24 86 | prefix: /home/wangjiancheng/.conda/envs/pytorch-in-action 87 | 88 | -------------------------------------------------------------------------------- /images/PyTorch-in-action.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/images/PyTorch-in-action.png -------------------------------------------------------------------------------- /images/图_4-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/xiaobaoonline/pytorch-in-action/193745dc0b45b4c292ad9276eac0023c4ac85ae8/images/图_4-2.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | audioread==2.1.6 2 | certifi==2018.4.16 3 | cffi==1.11.5 4 | chardet==3.0.4 5 | cycler==0.10.0 6 | decorator==4.3.0 7 | idna==2.7 8 | joblib==0.12.5 9 | kiwisolver==1.0.1 10 | librosa==0.5.1 11 | llvmlite==0.25.0 12 | matplotlib==2.2.2 13 | mkl-fft==1.0.0 14 | mkl-random==1.0.1 15 | numba==0.40.1 16 | numpy==1.14.2 17 | olefile==0.46 18 | Pillow==6.2.0 19 | pycparser==2.18 20 | pyparsing==2.2.0 21 | python-dateutil==2.7.3 22 | pytz==2018.4 23 | requests==2.20.0 24 | resampy==0.2.1 25 | scikit-learn==0.19.1 26 | scipy==1.1.0 27 | six==1.11.0 28 | torch==0.3.0 29 | torchtext==0.2.1 30 | torchvision==0.2.0 31 | tornado==5.0.2 32 | tqdm==4.28.1 33 | urllib3==1.24.2 34 | --------------------------------------------------------------------------------
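编辑者补充(示例, 非原仓库内容): 按 environment.yaml 或 requirements.txt 配置好环境后, 可用下面几行 Python 粗略自检关键依赖版本是否与书中代码的假定一致(书中代码基于 PyTorch 0.3.0 的 Variable / loss.data[0] 风格 API):

import torch
import torchvision
print('torch:', torch.__version__)              # 期望 0.3.0
print('torchvision:', torchvision.__version__)  # 期望 0.2.0
print('CUDA available:', torch.cuda.is_available())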