├── .github
│   └── workflows
│       └── pythonapp.yml
├── .gitignore
├── README.md
├── TensorFlow深度学习(带目录).pdf
├── assets
│   ├── 0.4.目录-双排-1.jpg
│   ├── 0.4.目录-双排-2.jpg
│   ├── 0.4.目录-双排-3.jpg
│   ├── 1.jpg
│   ├── 2.png
│   ├── book-cover.png
│   ├── dglg.jpg
│   ├── dzkjdx.jpg
│   ├── hnxxxy.jpg
│   └── xbgydx.jpg
├── ch01-人工智能绪论
│   ├── autograd.py
│   ├── gpu_accelerate.py
│   ├── tf1.py
│   └── tf2.py
├── ch02-回归问题
│   ├── data.csv
│   ├── linear_regression.py
│   ├── 回归实战.pdf
│   └── 回归问题.pdf
├── ch03-分类问题
│   ├── forward_layer.py
│   ├── forward_tensor.py
│   ├── main.py
│   ├── 手写数字问题.pdf
│   └── 手写数字问题体验.pdf
├── ch04-TensorFlow基础
│   ├── 4.10-forward-prop.py
│   ├── Broadcasting.pdf
│   ├── MNIST数据集的前向传播训练误差曲线.png
│   ├── ch04-TensorFlow基础.ipynb
│   ├── 创建Tensor.pdf
│   ├── 前向传播.pdf
│   ├── 数学运算.pdf
│   ├── 数据类型.pdf
│   ├── 索引与切片-1.pdf
│   ├── 索引与切片-2.pdf
│   └── 维度变换.pdf
├── ch05-TensorFlow进阶
│   ├── acc_topk.py
│   ├── gradient_clip.py
│   ├── mnist_tensor.py
│   ├── 合并与分割.pdf
│   ├── 填充与复制.pdf
│   ├── 张量排序.pdf
│   ├── 张量限幅.pdf
│   ├── 数据统计.pdf
│   └── 高阶特性.pdf
├── ch06-神经网络
│   ├── auto_efficency_regression.py
│   ├── ch06-神经网络.ipynb
│   ├── forward.py
│   ├── nb.py
│   ├── 全接连层.pdf
│   ├── 误差计算.pdf
│   └── 输出方式.pdf
├── ch07-反向传播算法
│   ├── 0.梯度下降-简介.pdf
│   ├── 2.常见函数的梯度.pdf
│   ├── 2nd_derivative.py
│   ├── 3.激活函数及其梯度.pdf
│   ├── 4.损失函数及其梯度.pdf
│   ├── 5.单输出感知机梯度.pdf
│   ├── 6.多输出感知机梯度.pdf
│   ├── 7.链式法则.pdf
│   ├── 8.多层感知机梯度.pdf
│   ├── ch07-反向传播算法.ipynb
│   ├── chain_rule.py
│   ├── crossentropy_loss.py
│   ├── himmelblau.py
│   ├── mse_grad.py
│   ├── multi_output_perceptron.py
│   ├── numpy-backward-prop.py
│   ├── sigmoid_grad.py
│   └── single_output_perceptron.py
├── ch08-Keras高层接口
│   ├── 1.Metrics.pdf
│   ├── 2.Compile&Fit.pdf
│   ├── 3.自定义层.pdf
│   ├── Keras实战CIFAR10.pdf
│   ├── compile_fit.py
│   ├── keras_train.py
│   ├── layer_model.py
│   ├── metrics.py
│   ├── nb.py
│   ├── pretained.py
│   ├── save_load_model.py
│   ├── save_load_weight.py
│   └── 模型加载与保存.pdf
├── ch09-过拟合
│   ├── 9.8-over-fitting-and-under-fitting.py
│   ├── Regularization.pdf
│   ├── compile_fit.py
│   ├── dropout.py
│   ├── lenna.png
│   ├── lenna_crop.png
│   ├── lenna_crop2.png
│   ├── lenna_eras.png
│   ├── lenna_eras2.png
│   ├── lenna_flip.png
│   ├── lenna_flip2.png
│   ├── lenna_guassian.png
│   ├── lenna_perspective.png
│   ├── lenna_resize.png
│   ├── lenna_rotate.png
│   ├── lenna_rotate2.png
│   ├── misc.pdf
│   ├── regularization.py
│   ├── train_evalute_test.py
│   ├── 交叉验证.pdf
│   ├── 学习率与动量.pdf
│   └── 过拟合与欠拟合.pdf
├── ch10-卷积神经网络
│   ├── BatchNorm.pdf
│   ├── CIFAR与VGG实战.pdf
│   ├── ResNet与DenseNet.pdf
│   ├── ResNet实战.pdf
│   ├── bn_main.py
│   ├── cifar10_train.py
│   ├── nb.py
│   ├── resnet.py
│   ├── resnet18_train.py
│   ├── 什么是卷积.pdf
│   ├── 卷积神经网络.pdf
│   ├── 池化与采样.pdf
│   └── 经典卷积网络.pdf
├── ch11-循环神经网络
│   ├── LSTM.pdf
│   ├── LSTM实战.pdf
│   ├── RNN Layer使用.pdf
│   ├── nb.py
│   ├── pretrained.py
│   ├── sentiment_analysis_cell - GRU.py
│   ├── sentiment_analysis_cell - LSTM.py
│   ├── sentiment_analysis_cell.py
│   ├── sentiment_analysis_layer - GRU.py
│   ├── sentiment_analysis_layer - LSTM - pretrained.py
│   ├── sentiment_analysis_layer - LSTM.py
│   ├── sentiment_analysis_layer.py
│   ├── 循环神经网络.pdf
│   ├── 情感分类实战.pdf
│   ├── 时间序列表示.pdf
│   └── 梯度弥散与梯度爆炸.pdf
├── ch12-自编码器
│   ├── AE实战.pdf
│   ├── AutoEncoders.pdf
│   ├── autoencoder.py
│   └── vae.py
├── ch13-生成对抗网络
│   ├── GAN.pdf
│   ├── GAN实战.pdf
│   ├── dataset.py
│   ├── gan.py
│   ├── gan_train.py
│   ├── wgan.py
│   └── wgan_train.py
├── ch14-强化学习
│   ├── REINFORCE_tf.py
│   ├── a3c_tf_cartpole.py
│   ├── dqn_tf.py
│   └── ppo_tf_cartpole.py
├── ch15-自定义数据集
│   ├── pokemon.py
│   ├── resnet.py
│   ├── train_scratch.py
│   ├── train_transfer.py
│   └── 宝可梦数据集.pdf
└── 【《TensorFlow深度学习》】.pdf
/.github/workflows/pythonapp.yml:
--------------------------------------------------------------------------------
1 | name: Python application
2 | on: [push, pull_request]
3 | jobs:
4 | build:
5 | runs-on: ubuntu-latest
6 | steps:
7 | - uses: actions/checkout@v1
8 | - name: Set up Python 3
9 | uses: actions/setup-python@v1
10 | with:
11 | python-version: 3.x
12 | - name: Install dependencies
13 | run: |
14 | python -m pip install --upgrade pip
15 | pip install flake8 pytest
16 | pip install -r requirements.txt || true
17 | - name: Lint with flake8
18 | run: |
19 | # stop the build if there are Python syntax errors or undefined names
20 | flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
21 | # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
22 | flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
23 | - name: Test with pytest
24 | run: |
25 | pytest || true
26 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | *.DS_Store
2 | *.bak
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Deep Learning with TensorFlow 2: an open-source book (the "Dragon Book")
2 |
3 | Based on the official TensorFlow 2 release!
4 | Combines theory with hands-on practice, and is well suited for beginners.
5 |
6 | - **[Buy the print edition: JD.com](https://item.jd.com/12954866.html)**
7 | - **[Buy the print edition: Taobao/Tmall](https://detail.tmall.com/item.htm?spm=a230r.1.14.16.18b460abi8w8jJ&id=625801924474&ns=1&abbucket=9)**
8 |
9 | This repository contains the PDF e-book, the companion source code, and the companion slides. Some of the code has been converted to IPython Notebook form; thanks to [this contributor](https://github.com/Relph1119/deeplearning-with-tensorflow-notes) for organizing it.
10 |
11 | The open-source PDF can also be downloaded from [Baidu Netdisk](https://pan.baidu.com/s/1GgQjhDqSgSfjxqBMsE3RDQ), extraction code: juqs
12 | Thanks to 云城不及粒火 for providing the bookmarked PDF.
13 |
14 | - **A traditional-Chinese edition has been published and licensed for release in the Taiwan region of China**
15 |
16 | - **The book has been covered by leading media outlets such as Synced (机器之心) and QbitAI (量子位)!**
17 |
18 | - **This repository ranked No. 1 worldwide on the GitHub daily trending list for several consecutive days!**
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 | - To report errors or suggest corrections, please open a ticket on the GitHub [Issues](https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book/issues) page
28 |
29 | - Contact email (for general questions, GitHub Issues is preferred): liangqu.long AT gmail.com
30 |
31 | - **University instructors requesting the original PPT materials** and other teaching resources: please get in touch by email, including your institution and course details; replies are usually sent within 3 days
32 |
33 | - When using any content from this book (**non-commercial use only**), please credit the author and link to this GitHub repository
34 |
35 |
36 | # Partner universities
37 |
38 | The following universities have adopted this book as a course textbook or reference (in no particular order); more are welcome to join! Just send an email to request the original PPT teaching materials.
39 |
40 | | 电子科技大学 | 西北工业大学 | 北京交通大学 | 厦门大学 | 重庆邮电大学 |
41 | |---|---|---|---|---|
42 | | **东南大学** | ** ** | ** ** | ** ** | |
43 | | **湖南信息学院** | **中山大学新华学院** | **东莞理工大学** | **北京科技职业学院** | |
44 | | **郑州轻工业大学** | **金华职业技术学院** | **高雄市立新莊高級中學** | **安徽财经大学** | |
45 | | **长沙民政职业技术学院** | **兰州交通大学** | ** ** | ** ** | |
46 |
47 |
48 |
49 | # The "Dragon Book" ecosystem
50 |
51 | - [Print edition](https://item.jd.com/12954866.html)
52 |
53 | - [Introduction video](https://www.bilibili.com/video/av75331861)
54 |
55 | - [English Version](https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book-EN)
56 |
57 | - [TensorFlow video course](https://study.163.com/course/courseMain.htm?share=2&shareId=480000001847407&courseId=1209092816&_trace_c_p_k2_=9e74eb6f891d47cfaa6f00b5cb5f617c)
58 |
59 | - [Deep Learning with PyTorch open-source book](https://github.com/dragen1860/Deep-Learning-with-PyTorch-book)
60 |
61 | - More TensorFlow 2 hands-on examples are available [here](https://github.com/dragen1860/TensorFlow-2.x-Tutorials)
62 |
63 |
64 | # Brief table of contents
65 |
66 |
67 |
68 |
69 |
70 |
71 |
72 |
73 |
74 | # Companion video courses
75 |
76 | Designed for complete beginners who want to get started with AI quickly; Q&A, mentoring, and other support are provided.
77 |
78 | - Deep Learning with TensorFlow: hands-on introduction
79 | https://study.163.com/course/courseMain.htm?share=2&shareId=480000001847407&courseId=1209092816&_trace_c_p_k2_=9e74eb6f891d47cfaa6f00b5cb5f617c
80 | - Deep Learning with PyTorch: hands-on introduction
81 | https://study.163.com/course/courseMain.htm?share=2&shareId=480000001847407&courseId=1208894818&_trace_c_p_k2_=8d1b10e04bd34d69855bb71da65b0549
82 |
83 |
--------------------------------------------------------------------------------
/TensorFlow深度学习(带目录).pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/TensorFlow深度学习(带目录).pdf
--------------------------------------------------------------------------------
/assets/0.4.目录-双排-1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/0.4.目录-双排-1.jpg
--------------------------------------------------------------------------------
/assets/0.4.目录-双排-2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/0.4.目录-双排-2.jpg
--------------------------------------------------------------------------------
/assets/0.4.目录-双排-3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/0.4.目录-双排-3.jpg
--------------------------------------------------------------------------------
/assets/1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/1.jpg
--------------------------------------------------------------------------------
/assets/2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/2.png
--------------------------------------------------------------------------------
/assets/book-cover.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/book-cover.png
--------------------------------------------------------------------------------
/assets/dglg.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/dglg.jpg
--------------------------------------------------------------------------------
/assets/dzkjdx.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/dzkjdx.jpg
--------------------------------------------------------------------------------
/assets/hnxxxy.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/hnxxxy.jpg
--------------------------------------------------------------------------------
/assets/xbgydx.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/xbgydx.jpg
--------------------------------------------------------------------------------
/ch01-人工智能绪论/autograd.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | # Create 4 tensors
4 | a = tf.constant(1.)
5 | b = tf.constant(2.)
6 | c = tf.constant(3.)
7 | w = tf.constant(4.)
8 |
9 |
10 | with tf.GradientTape() as tape:# build the gradient-recording context
11 | tape.watch([w]) # add w to the list of watched tensors
12 | # build the computation
13 | y = a * w**2 + b * w + c
14 | # differentiate
15 | [dy_dw] = tape.gradient(y, [w])
16 | print(dy_dw)
17 |
18 |
--------------------------------------------------------------------------------
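A quick way to read this file: it differentiates y = a·w² + b·w + c at w = 4, so the analytic derivative is dy/dw = 2aw + b = 2·1·4 + 2 = 10. A minimal sketch that checks the tape result against this closed form (the `analytic` variable is added here purely for illustration):

```python
import tensorflow as tf

a, b, c, w = tf.constant(1.), tf.constant(2.), tf.constant(3.), tf.constant(4.)

with tf.GradientTape() as tape:
    tape.watch([w])                       # w is a constant, so it must be watched explicitly
    y = a * w**2 + b * w + c
[dy_dw] = tape.gradient(y, [w])

analytic = 2 * a * w + b                  # d(a*w^2 + b*w + c)/dw = 2*a*w + b
print(float(dy_dw), float(analytic))      # both print 10.0
```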
/ch01-人工智能绪论/gpu_accelerate.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib
3 | from matplotlib import pyplot as plt
4 | # Default parameters for plots
5 | matplotlib.rcParams['font.size'] = 20
6 | matplotlib.rcParams['figure.titlesize'] = 20
7 | matplotlib.rcParams['figure.figsize'] = [9, 7]
8 | matplotlib.rcParams['font.family'] = ['STKaiti']
9 | matplotlib.rcParams['axes.unicode_minus']=False
10 |
11 |
12 |
13 | import tensorflow as tf
14 | import timeit
15 |
16 |
17 |
18 |
19 | cpu_data = []
20 | gpu_data = []
21 | for n in range(9):
22 | n = 10**n
23 | # create the 2 matrices computed on the CPU
24 | with tf.device('/cpu:0'):
25 | cpu_a = tf.random.normal([1, n])
26 | cpu_b = tf.random.normal([n, 1])
27 | print(cpu_a.device, cpu_b.device)
28 | # create the 2 matrices computed on the GPU
29 | with tf.device('/gpu:0'):
30 | gpu_a = tf.random.normal([1, n])
31 | gpu_b = tf.random.normal([n, 1])
32 | print(gpu_a.device, gpu_b.device)
33 |
34 | def cpu_run():
35 | with tf.device('/cpu:0'):
36 | c = tf.matmul(cpu_a, cpu_b)
37 | return c
38 |
39 | def gpu_run():
40 | with tf.device('/gpu:0'):
41 | c = tf.matmul(gpu_a, gpu_b)
42 | return c
43 |
44 | # the first run is a warm-up, so initialization time is not counted in the measurement
45 | cpu_time = timeit.timeit(cpu_run, number=10)
46 | gpu_time = timeit.timeit(gpu_run, number=10)
47 | print('warmup:', cpu_time, gpu_time)
48 | # time 10 formal runs and take the average
49 | cpu_time = timeit.timeit(cpu_run, number=10)
50 | gpu_time = timeit.timeit(gpu_run, number=10)
51 | print('run time:', cpu_time, gpu_time)
52 | cpu_data.append(cpu_time/10)
53 | gpu_data.append(gpu_time/10)
54 |
55 | del cpu_a,cpu_b,gpu_a,gpu_b
56 |
57 | x = [10**i for i in range(9)]
58 | cpu_data = [1000*i for i in cpu_data]
59 | gpu_data = [1000*i for i in gpu_data]
60 | plt.plot(x, cpu_data, 'C1')
61 | plt.plot(x, cpu_data, color='C1', marker='s', label='CPU')
62 | plt.plot(x, gpu_data,'C0')
63 | plt.plot(x, gpu_data, color='C0', marker='^', label='GPU')
64 |
65 |
66 | plt.gca().set_xscale('log')
67 | plt.gca().set_yscale('log')
68 | plt.ylim([0,100])
69 | plt.xlabel('矩阵大小n:(1xn)@(nx1)')
70 | plt.ylabel('运算时间(ms)')
71 | plt.legend()
72 | plt.savefig('gpu-time.svg')
--------------------------------------------------------------------------------
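Note that the script above assumes a `/gpu:0` device is available; on a CPU-only machine the `tf.device('/gpu:0')` blocks would fail. A small, hedged pre-check, using the same `tf.config.experimental` API that ch03-分类问题/main.py already uses:

```python
import tensorflow as tf

# List the GPUs visible to TensorFlow before attempting the /gpu:0 placements above.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    print('Found GPUs:', [g.name for g in gpus])
else:
    print('No GPU found; run the CPU timings only, or the /gpu:0 placement will raise an error.')
```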
/ch01-人工智能绪论/tf1.py:
--------------------------------------------------------------------------------
1 | import tensorflow.compat.v1 as tf
2 | tf.disable_v2_behavior() # run the following code in static-graph (TF 1.x) mode
3 | assert tf.__version__.startswith('2.')
4 |
5 | # 1. Graph-construction phase
6 | # create 2 input endpoints, specifying dtype and name
7 | a_ph = tf.placeholder(tf.float32, name='variable_a')
8 | b_ph = tf.placeholder(tf.float32, name='variable_b')
9 | # create the op producing the output endpoint, and name it
10 | c_op = tf.add(a_ph, b_ph, name='variable_c')
11 |
12 | # 2. Graph-execution phase
13 | # create the runtime environment (session)
14 | sess = tf.InteractiveSession()
15 | # the initializer must itself be run as an op
16 | init = tf.global_variables_initializer()
17 | sess.run(init) # run the initializer op to complete initialization
18 | # run the output endpoint; the input endpoints must be fed values
19 | c_numpy = sess.run(c_op, feed_dict={a_ph: 2., b_ph: 4.})
20 | # only after running the output endpoint do we obtain the numeric value c_numpy
21 | print('a+b=',c_numpy)
--------------------------------------------------------------------------------
/ch01-人工智能绪论/tf2.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import tensorflow as tf
3 | assert tf.__version__.startswith('2.')
4 |
5 | # 1. Create the input tensors
6 | a = tf.constant(2.)
7 | b = tf.constant(4.)
8 | # 2. Compute directly and print the result
9 | print('a+b=',a+b)
10 |
11 |
12 |
--------------------------------------------------------------------------------
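tf1.py builds a static graph with placeholders and runs it in a Session, while tf2.py executes the same addition eagerly. For completeness, TensorFlow 2 can still compile a Python function into a static graph with `tf.function`; a minimal sketch (the helper name `add` is chosen here for illustration):

```python
import tensorflow as tf

@tf.function        # trace the Python function into a graph on first call
def add(a, b):
    return a + b

print('a+b=', float(add(tf.constant(2.), tf.constant(4.))))  # 6.0
```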
/ch02-回归问题/data.csv:
--------------------------------------------------------------------------------
1 | 32.502345269453031,31.70700584656992
2 | 53.426804033275019,68.77759598163891
3 | 61.530358025636438,62.562382297945803
4 | 47.475639634786098,71.546632233567777
5 | 59.813207869512318,87.230925133687393
6 | 55.142188413943821,78.211518270799232
7 | 52.211796692214001,79.64197304980874
8 | 39.299566694317065,59.171489321869508
9 | 48.10504169176825,75.331242297063056
10 | 52.550014442733818,71.300879886850353
11 | 45.419730144973755,55.165677145959123
12 | 54.351634881228918,82.478846757497919
13 | 44.164049496773352,62.008923245725825
14 | 58.16847071685779,75.392870425994957
15 | 56.727208057096611,81.43619215887864
16 | 48.955888566093719,60.723602440673965
17 | 44.687196231480904,82.892503731453715
18 | 60.297326851333466,97.379896862166078
19 | 45.618643772955828,48.847153317355072
20 | 38.816817537445637,56.877213186268506
21 | 66.189816606752601,83.878564664602763
22 | 65.41605174513407,118.59121730252249
23 | 47.48120860786787,57.251819462268969
24 | 41.57564261748702,51.391744079832307
25 | 51.84518690563943,75.380651665312357
26 | 59.370822011089523,74.765564032151374
27 | 57.31000343834809,95.455052922574737
28 | 63.615561251453308,95.229366017555307
29 | 46.737619407976972,79.052406169565586
30 | 50.556760148547767,83.432071421323712
31 | 52.223996085553047,63.358790317497878
32 | 35.567830047746632,41.412885303700563
33 | 42.436476944055642,76.617341280074044
34 | 58.16454011019286,96.769566426108199
35 | 57.504447615341789,74.084130116602523
36 | 45.440530725319981,66.588144414228594
37 | 61.89622268029126,77.768482417793024
38 | 33.093831736163963,50.719588912312084
39 | 36.436009511386871,62.124570818071781
40 | 37.675654860850742,60.810246649902211
41 | 44.555608383275356,52.682983366387781
42 | 43.318282631865721,58.569824717692867
43 | 50.073145632289034,82.905981485070512
44 | 43.870612645218372,61.424709804339123
45 | 62.997480747553091,115.24415280079529
46 | 32.669043763467187,45.570588823376085
47 | 40.166899008703702,54.084054796223612
48 | 53.575077531673656,87.994452758110413
49 | 33.864214971778239,52.725494375900425
50 | 64.707138666121296,93.576118692658241
51 | 38.119824026822805,80.166275447370964
52 | 44.502538064645101,65.101711570560326
53 | 40.599538384552318,65.562301260400375
54 | 41.720676356341293,65.280886920822823
55 | 51.088634678336796,73.434641546324301
56 | 55.078095904923202,71.13972785861894
57 | 41.377726534895203,79.102829683549857
58 | 62.494697427269791,86.520538440347153
59 | 49.203887540826003,84.742697807826218
60 | 41.102685187349664,59.358850248624933
61 | 41.182016105169822,61.684037524833627
62 | 50.186389494880601,69.847604158249183
63 | 52.378446219236217,86.098291205774103
64 | 50.135485486286122,59.108839267699643
65 | 33.644706006191782,69.89968164362763
66 | 39.557901222906828,44.862490711164398
67 | 56.130388816875467,85.498067778840223
68 | 57.362052133238237,95.536686846467219
69 | 60.269214393997906,70.251934419771587
70 | 35.678093889410732,52.721734964774988
71 | 31.588116998132829,50.392670135079896
72 | 53.66093226167304,63.642398775657753
73 | 46.682228649471917,72.247251068662365
74 | 43.107820219102464,57.812512976181402
75 | 70.34607561504933,104.25710158543822
76 | 44.492855880854073,86.642020318822006
77 | 57.50453330326841,91.486778000110135
78 | 36.930076609191808,55.231660886212836
79 | 55.805733357942742,79.550436678507609
80 | 38.954769073377065,44.847124242467601
81 | 56.901214702247074,80.207523139682763
82 | 56.868900661384046,83.14274979204346
83 | 34.33312470421609,55.723489260543914
84 | 59.04974121466681,77.634182511677864
85 | 57.788223993230673,99.051414841748269
86 | 54.282328705967409,79.120646274680027
87 | 51.088719898979143,69.588897851118475
88 | 50.282836348230731,69.510503311494389
89 | 44.211741752090113,73.687564318317285
90 | 38.005488008060688,61.366904537240131
91 | 32.940479942618296,67.170655768995118
92 | 53.691639571070056,85.668203145001542
93 | 68.76573426962166,114.85387123391394
94 | 46.230966498310252,90.123572069967423
95 | 68.319360818255362,97.919821035242848
96 | 50.030174340312143,81.536990783015028
97 | 49.239765342753763,72.111832469615663
98 | 50.039575939875988,85.232007342325673
99 | 48.149858891028863,66.224957888054632
100 | 25.128484647772304,53.454394214850524
101 |
--------------------------------------------------------------------------------
/ch02-回归问题/linear_regression.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | # data = []
4 | # for i in range(100):
5 | # x = np.random.uniform(3., 12.)
6 | # # mean=0, std=0.1
7 | # eps = np.random.normal(0., 0.1)
8 | # y = 1.477 * x + 0.089 + eps
9 | # data.append([x, y])
10 | # data = np.array(data)
11 | # print(data.shape, data)
12 |
13 | # y = wx + b
14 | def compute_error_for_line_given_points(b, w, points):
15 | totalError = 0
16 | for i in range(0, len(points)):
17 | x = points[i, 0]
18 | y = points[i, 1]
19 | # compute mean-squared-error
20 | totalError += (y - (w * x + b)) ** 2
21 | # average loss over all points
22 | return totalError / float(len(points))
23 |
24 |
25 |
26 | def step_gradient(b_current, w_current, points, learningRate):
27 | b_gradient = 0
28 | w_gradient = 0
29 | N = float(len(points))
30 | for i in range(0, len(points)):
31 | x = points[i, 0]
32 | y = points[i, 1]
33 | # grad_b = 2(wx+b-y)
34 | b_gradient += (2/N) * ((w_current * x + b_current) - y)
35 | # grad_w = 2(wx+b-y)*x
36 | w_gradient += (2/N) * x * ((w_current * x + b_current) - y)
37 | # update w'
38 | new_b = b_current - (learningRate * b_gradient)
39 | new_w = w_current - (learningRate * w_gradient)
40 | return [new_b, new_w]
41 |
42 | def gradient_descent_runner(points, starting_b, starting_w, learning_rate, num_iterations):
43 | b = starting_b
44 | w = starting_w
45 | # update for several times
46 | for i in range(num_iterations):
47 | b, w = step_gradient(b, w, np.array(points), learning_rate)
48 | return [b, w]
49 |
50 |
51 | def run():
52 |
53 | points = np.genfromtxt("data.csv", delimiter=",")
54 | learning_rate = 0.0001
55 | initial_b = 0 # initial y-intercept guess
56 | initial_w = 0 # initial slope guess
57 | num_iterations = 1000
58 | print("Starting gradient descent at b = {0}, w = {1}, error = {2}"
59 | .format(initial_b, initial_w,
60 | compute_error_for_line_given_points(initial_b, initial_w, points))
61 | )
62 | print("Running...")
63 | [b, w] = gradient_descent_runner(points, initial_b, initial_w, learning_rate, num_iterations)
64 | print("After {0} iterations b = {1}, w = {2}, error = {3}".
65 | format(num_iterations, b, w,
66 | compute_error_for_line_given_points(b, w, points))
67 | )
68 |
69 | if __name__ == '__main__':
70 | run()
--------------------------------------------------------------------------------
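Because the model is a single straight line, the result of `gradient_descent_runner` can be sanity-checked against a closed-form least-squares fit. A minimal sketch, assuming `data.csv` is in the working directory as in `run()`:

```python
import numpy as np

points = np.genfromtxt("data.csv", delimiter=",")
# np.polyfit with deg=1 returns the least-squares slope and intercept
w, b = np.polyfit(points[:, 0], points[:, 1], deg=1)
print("closed-form fit: w = {:.4f}, b = {:.4f}".format(w, b))
```

The gradient-descent estimates after 1000 iterations should land close to these values.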
/ch02-回归问题/回归实战.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch02-回归问题/回归实战.pdf
--------------------------------------------------------------------------------
/ch02-回归问题/回归问题.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch02-回归问题/回归问题.pdf
--------------------------------------------------------------------------------
/ch03-分类问题/forward_layer.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
3 |
4 |
5 | import tensorflow as tf
6 | from tensorflow import keras
7 | from tensorflow.keras import layers, optimizers, datasets
8 |
9 |
10 |
11 |
12 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
13 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
14 | y = tf.convert_to_tensor(y, dtype=tf.int32)
15 | y = tf.one_hot(y, depth=10)
16 | print(x.shape, y.shape)
17 | train_dataset = tf.data.Dataset.from_tensor_slices((x, y))
18 | train_dataset = train_dataset.batch(200)
19 |
20 |
21 |
22 |
23 | model = keras.Sequential([
24 | layers.Dense(512, activation='relu'),
25 | layers.Dense(256, activation='relu'),
26 | layers.Dense(10)])
27 |
28 | optimizer = optimizers.SGD(learning_rate=0.001)
29 |
30 |
31 | def train_epoch(epoch):
32 |
33 | # Step4.loop
34 | for step, (x, y) in enumerate(train_dataset):
35 |
36 |
37 | with tf.GradientTape() as tape:
38 | # [b, 28, 28] => [b, 784]
39 | x = tf.reshape(x, (-1, 28*28))
40 | # Step1. compute output
41 | # [b, 784] => [b, 10]
42 | out = model(x)
43 | # Step2. compute loss
44 | loss = tf.reduce_sum(tf.square(out - y)) / x.shape[0]
45 |
46 | # Step3. optimize and update w1, w2, w3, b1, b2, b3
47 | grads = tape.gradient(loss, model.trainable_variables)
48 | # w' = w - lr * grad
49 | optimizer.apply_gradients(zip(grads, model.trainable_variables))
50 |
51 | if step % 100 == 0:
52 | print(epoch, step, 'loss:', loss.numpy())
53 |
54 |
55 |
56 | def train():
57 |
58 | for epoch in range(30):
59 |
60 | train_epoch(epoch)
61 |
62 |
63 |
64 |
65 |
66 |
67 | if __name__ == '__main__':
68 | train()
--------------------------------------------------------------------------------
/ch03-分类问题/forward_tensor.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
3 | import matplotlib
4 | from matplotlib import pyplot as plt
5 | # Default parameters for plots
6 | matplotlib.rcParams['font.size'] = 20
7 | matplotlib.rcParams['figure.titlesize'] = 20
8 | matplotlib.rcParams['figure.figsize'] = [9, 7]
9 | matplotlib.rcParams['font.family'] = ['STKaiTi']
10 | matplotlib.rcParams['axes.unicode_minus']=False
11 |
12 | import tensorflow as tf
13 | from tensorflow import keras
14 | from tensorflow.keras import datasets
15 |
16 |
17 | # x: [60k, 28, 28],
18 | # y: [60k]
19 | (x, y), _ = datasets.mnist.load_data()
20 | # x: [0~255] => [0~1.]
21 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
22 | y = tf.convert_to_tensor(y, dtype=tf.int32)
23 |
24 | print(x.shape, y.shape, x.dtype, y.dtype)
25 | print(tf.reduce_min(x), tf.reduce_max(x))
26 | print(tf.reduce_min(y), tf.reduce_max(y))
27 |
28 |
29 | train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128)
30 | train_iter = iter(train_db)
31 | sample = next(train_iter)
32 | print('batch:', sample[0].shape, sample[1].shape)
33 |
34 |
35 | # [b, 784] => [b, 256] => [b, 128] => [b, 10]
36 | # [dim_in, dim_out], [dim_out]
37 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
38 | b1 = tf.Variable(tf.zeros([256]))
39 | w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1))
40 | b2 = tf.Variable(tf.zeros([128]))
41 | w3 = tf.Variable(tf.random.truncated_normal([128, 10], stddev=0.1))
42 | b3 = tf.Variable(tf.zeros([10]))
43 |
44 | lr = 1e-3
45 |
46 | losses = []
47 |
48 | for epoch in range(20): # iterate the dataset for 20 epochs
49 | for step, (x, y) in enumerate(train_db): # for every batch
50 | # x:[128, 28, 28]
51 | # y: [128]
52 |
53 | # [b, 28, 28] => [b, 28*28]
54 | x = tf.reshape(x, [-1, 28*28])
55 |
56 | with tf.GradientTape() as tape: # tf.Variable
57 | # x: [b, 28*28]
58 | # h1 = x@w1 + b1
59 | # [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b, 256] + [b, 256]
60 | h1 = x@w1 + tf.broadcast_to(b1, [x.shape[0], 256])
61 | h1 = tf.nn.relu(h1)
62 | # [b, 256] => [b, 128]
63 | h2 = h1@w2 + b2
64 | h2 = tf.nn.relu(h2)
65 | # [b, 128] => [b, 10]
66 | out = h2@w3 + b3
67 |
68 | # compute loss
69 | # out: [b, 10]
70 | # y: [b] => [b, 10]
71 | y_onehot = tf.one_hot(y, depth=10)
72 |
73 | # mse = mean(sum(y-out)^2)
74 | # [b, 10]
75 | loss = tf.square(y_onehot - out)
76 | # mean: scalar
77 | loss = tf.reduce_mean(loss)
78 |
79 | # compute gradients
80 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
81 | # print(grads)
82 | # w1 = w1 - lr * w1_grad
83 | w1.assign_sub(lr * grads[0])
84 | b1.assign_sub(lr * grads[1])
85 | w2.assign_sub(lr * grads[2])
86 | b2.assign_sub(lr * grads[3])
87 | w3.assign_sub(lr * grads[4])
88 | b3.assign_sub(lr * grads[5])
89 |
90 |
91 | if step % 100 == 0:
92 | print(epoch, step, 'loss:', float(loss))
93 |
94 | losses.append(float(loss))
95 |
96 | plt.figure()
97 | plt.plot(losses, color='C0', marker='s', label='训练')
98 | plt.xlabel('Epoch')
99 | plt.legend()
100 | plt.ylabel('MSE')
101 | plt.savefig('forward.svg')
102 | # plt.show()
103 |
--------------------------------------------------------------------------------
/ch03-分类问题/main.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
3 |
4 |
5 | # Configure how the GPU is used
6 | # get the list of GPUs
7 | gpus = tf.config.experimental.list_physical_devices('GPU')
8 | if gpus:
9 | try:
10 | # enable memory growth so GPU memory is allocated on demand
11 | for gpu in gpus:
12 | tf.config.experimental.set_memory_growth(gpu, True)
13 | except RuntimeError as e:
14 | # print the exception
15 | print(e)
16 |
17 | (xs, ys),_ = datasets.mnist.load_data()
18 | print('datasets:', xs.shape, ys.shape, xs.min(), xs.max())
19 |
20 | batch_size = 32
21 |
22 | xs = tf.convert_to_tensor(xs, dtype=tf.float32) / 255.
23 | db = tf.data.Dataset.from_tensor_slices((xs,ys))
24 | db = db.batch(batch_size).repeat(30)
25 |
26 |
27 | model = Sequential([layers.Dense(256, activation='relu'),
28 | layers.Dense(128, activation='relu'),
29 | layers.Dense(10)])
30 | model.build(input_shape=(4, 28*28))
31 | model.summary()
32 |
33 | optimizer = optimizers.SGD(lr=0.01)
34 | acc_meter = metrics.Accuracy()
35 |
36 | for step, (x,y) in enumerate(db):
37 |
38 | with tf.GradientTape() as tape:
39 | # flatten, [b, 28, 28] => [b, 784]
40 | x = tf.reshape(x, (-1, 28*28))
41 | # Step1. compute the model output, [b, 784] => [b, 10]
42 | out = model(x)
43 | # [b] => [b, 10]
44 | y_onehot = tf.one_hot(y, depth=10)
45 | # squared error, [b, 10]
46 | loss = tf.square(out-y_onehot)
47 | # average per-sample error, scalar
48 | loss = tf.reduce_sum(loss) / x.shape[0]
49 |
50 |
51 | acc_meter.update_state(tf.argmax(out, axis=1), y)
52 |
53 | grads = tape.gradient(loss, model.trainable_variables)
54 | optimizer.apply_gradients(zip(grads, model.trainable_variables))
55 |
56 |
57 | if step % 200==0:
58 |
59 | print(step, 'loss:', float(loss), 'acc:', acc_meter.result().numpy())
60 | acc_meter.reset_states()
61 |
--------------------------------------------------------------------------------
/ch03-分类问题/手写数字问题.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch03-分类问题/手写数字问题.pdf
--------------------------------------------------------------------------------
/ch03-分类问题/手写数字问题体验.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch03-分类问题/手写数字问题体验.pdf
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/4.10-forward-prop.py:
--------------------------------------------------------------------------------
1 |
2 |
3 | import matplotlib.pyplot as plt
4 | import tensorflow as tf
5 | import tensorflow.keras.datasets as datasets
6 |
7 | plt.rcParams['font.size'] = 16
8 | plt.rcParams['font.family'] = ['STKaiti']
9 | plt.rcParams['axes.unicode_minus'] = False
10 |
11 |
12 | def load_data():
13 | # load the MNIST dataset
14 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
15 | # convert to a float tensor and scale to the range 0~1
16 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
17 | # convert to an integer tensor
18 | y = tf.convert_to_tensor(y, dtype=tf.int32)
19 | # one-hot encoding
20 | y = tf.one_hot(y, depth=10)
21 |
22 | # change the view, [b, 28, 28] => [b, 28*28]
23 | x = tf.reshape(x, (-1, 28 * 28))
24 |
25 | # build the Dataset object
26 | train_dataset = tf.data.Dataset.from_tensor_slices((x, y))
27 | # train in batches
28 | train_dataset = train_dataset.batch(200)
29 | return train_dataset
30 |
31 |
32 | def init_paramaters():
33 | # each layer's tensors need to be optimized, so use tf.Variable and initialize the weights with a truncated normal distribution
34 | # the bias vectors can simply be initialized to 0
35 | # parameters of the first layer
36 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
37 | b1 = tf.Variable(tf.zeros([256]))
38 | # parameters of the second layer
39 | w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1))
40 | b2 = tf.Variable(tf.zeros([128]))
41 | # parameters of the third layer
42 | w3 = tf.Variable(tf.random.truncated_normal([128, 10], stddev=0.1))
43 | b3 = tf.Variable(tf.zeros([10]))
44 | return w1, b1, w2, b2, w3, b3
45 |
46 |
47 | def train_epoch(epoch, train_dataset, w1, b1, w2, b2, w3, b3, lr=0.001):
48 | for step, (x, y) in enumerate(train_dataset):
49 | with tf.GradientTape() as tape:
50 | # first-layer computation, [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b, 256] + [b, 256]
51 | h1 = x @ w1 + tf.broadcast_to(b1, (x.shape[0], 256))
52 | h1 = tf.nn.relu(h1) # apply the activation function
53 |
54 | # second-layer computation, [b, 256] => [b, 128]
55 | h2 = h1 @ w2 + b2
56 | h2 = tf.nn.relu(h2)
57 | # output-layer computation, [b, 128] => [b, 10]
58 | out = h2 @ w3 + b3
59 |
60 | # mean squared error between the network output and the label, mse = mean((y-out)^2)
61 | # [b, 10]
62 | loss = tf.square(y - out)
63 | # reduce the error to a scalar, mean: scalar
64 | loss = tf.reduce_mean(loss)
65 |
66 | # automatic differentiation; the tensors to differentiate are [w1, b1, w2, b2, w3, b3]
67 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
68 |
69 | # gradient update; assign_sub subtracts the argument from the current value in place
70 | w1.assign_sub(lr * grads[0])
71 | b1.assign_sub(lr * grads[1])
72 | w2.assign_sub(lr * grads[2])
73 | b2.assign_sub(lr * grads[3])
74 | w3.assign_sub(lr * grads[4])
75 | b3.assign_sub(lr * grads[5])
76 |
77 | if step % 100 == 0:
78 | print(epoch, step, 'loss:', loss.numpy())
79 |
80 | return loss.numpy()
81 |
82 |
83 | def train(epochs):
84 | losses = []
85 | train_dataset = load_data()
86 | w1, b1, w2, b2, w3, b3 = init_paramaters()
87 | for epoch in range(epochs):
88 | loss = train_epoch(epoch, train_dataset, w1, b1, w2, b2, w3, b3, lr=0.001)
89 | losses.append(loss)
90 |
91 | x = [i for i in range(0, epochs)]
92 | # plot the curve
93 | plt.plot(x, losses, color='blue', marker='s', label='训练')
94 | plt.xlabel('Epoch')
95 | plt.ylabel('MSE')
96 | plt.legend()
97 | plt.savefig('MNIST数据集的前向传播训练误差曲线.png')
98 | plt.close()
99 |
100 |
101 | if __name__ == '__main__':
102 | train(epochs=20)
103 |
--------------------------------------------------------------------------------
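For reference, the quantity minimized inside `train_epoch` is the per-batch mean squared error over all $b$ samples and 10 output dimensions:

$$\mathcal{L} = \frac{1}{10\,b}\sum_{i=1}^{b}\sum_{k=1}^{10}\bigl(y_{ik}-o_{ik}\bigr)^2,$$

where $o$ denotes the network output `out`; `tf.reduce_mean` over the $[b, 10]$ error tensor performs exactly this averaging.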
/ch04-TensorFlow基础/Broadcasting.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/Broadcasting.pdf
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/MNIST数据集的前向传播训练误差曲线.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/MNIST数据集的前向传播训练误差曲线.png
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/创建Tensor.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/创建Tensor.pdf
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/前向传播.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/前向传播.pdf
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/数学运算.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/数学运算.pdf
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/数据类型.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/数据类型.pdf
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/索引与切片-1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/索引与切片-1.pdf
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/索引与切片-2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/索引与切片-2.pdf
--------------------------------------------------------------------------------
/ch04-TensorFlow基础/维度变换.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/维度变换.pdf
--------------------------------------------------------------------------------
/ch05-TensorFlow进阶/acc_topk.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import os
3 |
4 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
5 | tf.random.set_seed(2467)
6 |
7 | def accuracy(output, target, topk=(1,)):
8 | maxk = max(topk)
9 | batch_size = target.shape[0]
10 |
11 | pred = tf.math.top_k(output, maxk).indices
12 | pred = tf.transpose(pred, perm=[1, 0])
13 | target_ = tf.broadcast_to(target, pred.shape)
14 | # [10, b]
15 | correct = tf.equal(pred, target_)
16 |
17 | res = []
18 | for k in topk:
19 | correct_k = tf.cast(tf.reshape(correct[:k], [-1]), dtype=tf.float32)
20 | correct_k = tf.reduce_sum(correct_k)
21 | acc = float(correct_k* (100.0 / batch_size) )
22 | res.append(acc)
23 |
24 | return res
25 |
26 |
27 |
28 | output = tf.random.normal([10, 6])
29 | output = tf.math.softmax(output, axis=1)
30 | target = tf.random.uniform([10], maxval=6, dtype=tf.int32)
31 | print('prob:', output.numpy())
32 | pred = tf.argmax(output, axis=1)
33 | print('pred:', pred.numpy())
34 | print('label:', target.numpy())
35 |
36 | acc = accuracy(output, target, topk=(1,2,3,4,5,6))
37 | print('top-1-6 acc:', acc)
--------------------------------------------------------------------------------
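The top-1 entry returned by `accuracy()` can be cross-checked against TensorFlow's built-in `tf.math.in_top_k`; a minimal sketch using the same kind of random `output`/`target` tensors as above:

```python
import tensorflow as tf

output = tf.math.softmax(tf.random.normal([10, 6]), axis=1)
target = tf.random.uniform([10], maxval=6, dtype=tf.int32)

# True where the true class is among the k highest-scoring predictions
in_top1 = tf.math.in_top_k(target, output, k=1)
print('top-1 acc:', 100. * float(tf.reduce_mean(tf.cast(in_top1, tf.float32))))
```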
/ch05-TensorFlow进阶/gradient_clip.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow import keras
3 | from tensorflow.keras import datasets, layers, optimizers
4 | import os
5 |
6 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
7 | print(tf.__version__)
8 |
9 | (x, y), _ = datasets.mnist.load_data()
10 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 50.
11 | y = tf.convert_to_tensor(y)
12 | y = tf.one_hot(y, depth=10)
13 | print('x:', x.shape, 'y:', y.shape)
14 | train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128).repeat(30)
15 | x,y = next(iter(train_db))
16 | print('sample:', x.shape, y.shape)
17 | # print(x[0], y[0])
18 |
19 |
20 |
21 | def main():
22 |
23 | # 784 => 512
24 | w1, b1 = tf.Variable(tf.random.truncated_normal([784, 512], stddev=0.1)), tf.Variable(tf.zeros([512]))
25 | # 512 => 256
26 | w2, b2 = tf.Variable(tf.random.truncated_normal([512, 256], stddev=0.1)), tf.Variable(tf.zeros([256]))
27 | # 256 => 10
28 | w3, b3 = tf.Variable(tf.random.truncated_normal([256, 10], stddev=0.1)), tf.Variable(tf.zeros([10]))
29 |
30 |
31 |
32 | optimizer = optimizers.SGD(lr=0.01)
33 |
34 |
35 | for step, (x,y) in enumerate(train_db):
36 |
37 | # [b, 28, 28] => [b, 784]
38 | x = tf.reshape(x, (-1, 784))
39 |
40 | with tf.GradientTape() as tape:
41 |
42 | # layer1.
43 | h1 = x @ w1 + b1
44 | h1 = tf.nn.relu(h1)
45 | # layer2
46 | h2 = h1 @ w2 + b2
47 | h2 = tf.nn.relu(h2)
48 | # output
49 | out = h2 @ w3 + b3
50 | # out = tf.nn.relu(out)
51 |
52 | # compute loss
53 | # [b, 10] - [b, 10]
54 | loss = tf.square(y-out)
55 | # [b, 10] => [b]
56 | loss = tf.reduce_mean(loss, axis=1)
57 | # [b] => scalar
58 | loss = tf.reduce_mean(loss)
59 |
60 |
61 |
62 | # compute gradient
63 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
64 | # print('==before==')
65 | # for g in grads:
66 | # print(tf.norm(g))
67 |
68 | grads, _ = tf.clip_by_global_norm(grads, 15)
69 |
70 | # print('==after==')
71 | # for g in grads:
72 | # print(tf.norm(g))
73 | # update w' = w - lr*grad
74 | optimizer.apply_gradients(zip(grads, [w1, b1, w2, b2, w3, b3]))
75 |
76 |
77 |
78 | if step % 100 == 0:
79 | print(step, 'loss:', float(loss))
80 |
81 |
82 |
83 |
84 | if __name__ == '__main__':
85 | main()
--------------------------------------------------------------------------------
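The effect of the `tf.clip_by_global_norm(grads, 15)` call used above can be seen on a toy gradient list; a minimal sketch:

```python
import tensorflow as tf

grads = [tf.constant([3.0, 4.0]), tf.constant([12.0])]     # global norm = sqrt(9+16+144) = 13
clipped, global_norm = tf.clip_by_global_norm(grads, clip_norm=5.0)

print(float(global_norm))                                   # 13.0
# every tensor is rescaled by clip_norm / global_norm = 5/13, so gradient directions are preserved
print([g.numpy() for g in clipped])
```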
/ch05-TensorFlow进阶/mnist_tensor.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import matplotlib
3 | from matplotlib import pyplot as plt
4 | # Default parameters for plots
5 | matplotlib.rcParams['font.size'] = 20
6 | matplotlib.rcParams['figure.titlesize'] = 20
7 | matplotlib.rcParams['figure.figsize'] = [9, 7]
8 | matplotlib.rcParams['font.family'] = ['STKaiTi']
9 | matplotlib.rcParams['axes.unicode_minus']=False
10 | import tensorflow as tf
11 | from tensorflow import keras
12 | from tensorflow.keras import datasets, layers, optimizers
13 | import os
14 |
15 |
16 |
17 |
18 |
19 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
20 | print(tf.__version__)
21 |
22 |
23 | def preprocess(x, y):
24 | # [b, 28, 28], [b]
25 | print(x.shape,y.shape)
26 | x = tf.cast(x, dtype=tf.float32) / 255.
27 | x = tf.reshape(x, [-1, 28*28])
28 | y = tf.cast(y, dtype=tf.int32)
29 | y = tf.one_hot(y, depth=10)
30 |
31 | return x,y
32 |
33 | #%%
34 | (x, y), (x_test, y_test) = datasets.mnist.load_data()
35 | print('x:', x.shape, 'y:', y.shape, 'x test:', x_test.shape, 'y test:', y_test.shape)
36 | #%%
37 | batchsz = 512
38 | train_db = tf.data.Dataset.from_tensor_slices((x, y))
39 | train_db = train_db.shuffle(1000)
40 | train_db = train_db.batch(batchsz)
41 | train_db = train_db.map(preprocess)
42 | train_db = train_db.repeat(20)
43 |
44 | #%%
45 |
46 | test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
47 | test_db = test_db.shuffle(1000).batch(batchsz).map(preprocess)
48 | x,y = next(iter(train_db))
49 | print('train sample:', x.shape, y.shape)
50 | # print(x[0], y[0])
51 |
52 |
53 |
54 |
55 | #%%
56 | def main():
57 |
58 | # learning rate
59 | lr = 1e-2
60 | accs,losses = [], []
61 |
62 |
63 | # 784 => 256
64 | w1, b1 = tf.Variable(tf.random.normal([784, 256], stddev=0.1)), tf.Variable(tf.zeros([256]))
65 | # 256 => 128
66 | w2, b2 = tf.Variable(tf.random.normal([256, 128], stddev=0.1)), tf.Variable(tf.zeros([128]))
67 | # 128 => 10
68 | w3, b3 = tf.Variable(tf.random.normal([128, 10], stddev=0.1)), tf.Variable(tf.zeros([10]))
69 |
70 |
71 |
72 |
73 |
74 | for step, (x,y) in enumerate(train_db):
75 |
76 | # [b, 28, 28] => [b, 784]
77 | x = tf.reshape(x, (-1, 784))
78 |
79 | with tf.GradientTape() as tape:
80 |
81 | # layer1.
82 | h1 = x @ w1 + b1
83 | h1 = tf.nn.relu(h1)
84 | # layer2
85 | h2 = h1 @ w2 + b2
86 | h2 = tf.nn.relu(h2)
87 | # output
88 | out = h2 @ w3 + b3
89 | # out = tf.nn.relu(out)
90 |
91 | # compute loss
92 | # [b, 10] - [b, 10]
93 | loss = tf.square(y-out)
94 | # [b, 10] => scalar
95 | loss = tf.reduce_mean(loss)
96 |
97 |
98 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
99 | for p, g in zip([w1, b1, w2, b2, w3, b3], grads):
100 | p.assign_sub(lr * g)
101 |
102 |
103 | # print
104 | if step % 80 == 0:
105 | print(step, 'loss:', float(loss))
106 | losses.append(float(loss))
107 |
108 | if step %80 == 0:
109 | # evaluate/test
110 | total, total_correct = 0., 0
111 |
112 | for x, y in test_db:
113 | # layer1.
114 | h1 = x @ w1 + b1
115 | h1 = tf.nn.relu(h1)
116 | # layer2
117 | h2 = h1 @ w2 + b2
118 | h2 = tf.nn.relu(h2)
119 | # output
120 | out = h2 @ w3 + b3
121 | # [b, 10] => [b]
122 | pred = tf.argmax(out, axis=1)
123 | # convert one_hot y to number y
124 | y = tf.argmax(y, axis=1)
125 | # bool type
126 | correct = tf.equal(pred, y)
127 | # bool tensor => int tensor => numpy
128 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
129 | total += x.shape[0]
130 |
131 | print(step, 'Evaluate Acc:', total_correct/total)
132 |
133 | accs.append(total_correct/total)
134 |
135 |
136 | plt.figure()
137 | x = [i*80 for i in range(len(losses))]
138 | plt.plot(x, losses, color='C0', marker='s', label='训练')
139 | plt.ylabel('MSE')
140 | plt.xlabel('Step')
141 | plt.legend()
142 | plt.savefig('train.svg')
143 |
144 | plt.figure()
145 | plt.plot(x, accs, color='C1', marker='s', label='测试')
146 | plt.ylabel('准确率')
147 | plt.xlabel('Step')
148 | plt.legend()
149 | plt.savefig('test.svg')
150 |
151 | if __name__ == '__main__':
152 | main()
--------------------------------------------------------------------------------
/ch05-TensorFlow进阶/合并与分割.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/合并与分割.pdf
--------------------------------------------------------------------------------
/ch05-TensorFlow进阶/填充与复制.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/填充与复制.pdf
--------------------------------------------------------------------------------
/ch05-TensorFlow进阶/张量排序.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/张量排序.pdf
--------------------------------------------------------------------------------
/ch05-TensorFlow进阶/张量限幅.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/张量限幅.pdf
--------------------------------------------------------------------------------
/ch05-TensorFlow进阶/数据统计.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/数据统计.pdf
--------------------------------------------------------------------------------
/ch05-TensorFlow进阶/高阶特性.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/高阶特性.pdf
--------------------------------------------------------------------------------
/ch06-神经网络/auto_efficency_regression.py:
--------------------------------------------------------------------------------
1 | #%%
2 | from __future__ import absolute_import, division, print_function, unicode_literals
3 |
4 | import pathlib
5 | import os
6 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
7 |
8 |
9 | import matplotlib.pyplot as plt
10 | import pandas as pd
11 | import seaborn as sns
12 |
13 | import tensorflow as tf
14 |
15 | from tensorflow import keras
16 | from tensorflow.keras import layers, losses
17 |
18 | print(tf.__version__)
19 |
20 |
21 | # download the Auto MPG (fuel-efficiency) dataset online
22 | dataset_path = keras.utils.get_file("auto-mpg.data", "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")
23 |
24 | # fuel efficiency (MPG, miles per gallon), number of cylinders, displacement, horsepower, weight
25 | # acceleration, model year, origin
26 | column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight',
27 | 'Acceleration', 'Model Year', 'Origin']
28 | raw_dataset = pd.read_csv(dataset_path, names=column_names,
29 | na_values = "?", comment='\t',
30 | sep=" ", skipinitialspace=True)
31 |
32 | dataset = raw_dataset.copy()
33 | # inspect part of the data
34 | dataset.tail()
35 | dataset.head()
36 | dataset
37 | #%%
38 |
39 |
40 | #%%
41 |
42 | # count the missing values and drop them
43 | dataset.isna().sum()
44 | dataset = dataset.dropna()
45 | dataset.isna().sum()
46 | dataset
47 | #%%
48 |
49 | # handle the categorical data: the Origin column takes the values 1, 2, 3, which stand for USA, Europe and Japan respectively
50 | # first pop this column
51 | origin = dataset.pop('Origin')
52 | # write new columns based on the Origin column
53 | dataset['USA'] = (origin == 1)*1.0
54 | dataset['Europe'] = (origin == 2)*1.0
55 | dataset['Japan'] = (origin == 3)*1.0
56 | dataset.tail()
57 |
58 |
59 | # split into training and test sets
60 | train_dataset = dataset.sample(frac=0.8,random_state=0)
61 | test_dataset = dataset.drop(train_dataset.index)
62 |
63 |
64 | #%% data statistics
65 | sns.pairplot(train_dataset[["Cylinders", "Displacement", "Weight", "MPG"]],
66 | diag_kind="kde")
67 | #%%
68 | # inspect the statistics of the training inputs X
69 | train_stats = train_dataset.describe()
70 | train_stats.pop("MPG")
71 | train_stats = train_stats.transpose()
72 | train_stats
73 |
74 |
75 | # move the MPG (fuel-efficiency) column out as the ground-truth label Y
76 | train_labels = train_dataset.pop('MPG')
77 | test_labels = test_dataset.pop('MPG')
78 |
79 |
80 | # standardize the data
81 | def norm(x):
82 | return (x - train_stats['mean']) / train_stats['std']
83 | normed_train_data = norm(train_dataset)
84 | normed_test_data = norm(test_dataset)
85 | #%%
86 |
87 | print(normed_train_data.shape,train_labels.shape)
88 | print(normed_test_data.shape, test_labels.shape)
89 | #%%
90 |
91 | class Network(keras.Model):
92 | # regression network
93 | def __init__(self):
94 | super(Network, self).__init__()
95 | # create 3 fully-connected layers
96 | self.fc1 = layers.Dense(64, activation='relu')
97 | self.fc2 = layers.Dense(64, activation='relu')
98 | self.fc3 = layers.Dense(1)
99 |
100 | def call(self, inputs, training=None, mask=None):
101 | # pass through the 3 fully-connected layers in turn
102 | x = self.fc1(inputs)
103 | x = self.fc2(x)
104 | x = self.fc3(x)
105 |
106 | return x
107 |
108 | model = Network()
109 | model.build(input_shape=(None, 9))
110 | model.summary()
111 | optimizer = tf.keras.optimizers.RMSprop(0.001)
112 | train_db = tf.data.Dataset.from_tensor_slices((normed_train_data.values, train_labels.values))
113 | train_db = train_db.shuffle(100).batch(32)
114 |
115 | # # test before training
116 | # example_batch = normed_train_data[:10]
117 | # example_result = model.predict(example_batch)
118 | # example_result
119 |
120 |
121 | train_mae_losses = []
122 | test_mae_losses = []
123 | for epoch in range(200):
124 | for step, (x,y) in enumerate(train_db):
125 |
126 | with tf.GradientTape() as tape:
127 | out = model(x)
128 | loss = tf.reduce_mean(losses.MSE(y, out))
129 | mae_loss = tf.reduce_mean(losses.MAE(y, out))
130 |
131 | if step % 10 == 0:
132 | print(epoch, step, float(loss))
133 |
134 | grads = tape.gradient(loss, model.trainable_variables)
135 | optimizer.apply_gradients(zip(grads, model.trainable_variables))
136 |
137 | train_mae_losses.append(float(mae_loss))
138 | out = model(tf.constant(normed_test_data.values))
139 | test_mae_losses.append(tf.reduce_mean(losses.MAE(test_labels, out)))
140 |
141 |
142 | plt.figure()
143 | plt.xlabel('Epoch')
144 | plt.ylabel('MAE')
145 | plt.plot(train_mae_losses, label='Train')
146 |
147 | plt.plot(test_mae_losses, label='Test')
148 | plt.legend()
149 |
150 | # plt.ylim([0,10])
151 | plt.legend()
152 | plt.savefig('auto.svg')
153 | plt.show()
154 |
155 |
156 |
157 |
158 | #%%
159 |
--------------------------------------------------------------------------------
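The custom training loop above can also be expressed with the `compile`/`fit` interface covered in ch08. A minimal, hedged sketch, assuming `Network`, `normed_train_data`, and `train_labels` from this script are already defined:

```python
model = Network()
model.compile(optimizer=tf.keras.optimizers.RMSprop(0.001), loss='mse', metrics=['mae'])
# train on the same standardized features and MPG labels
history = model.fit(normed_train_data.values, train_labels.values,
                    batch_size=32, epochs=200, verbose=0)
print('final training MSE:', history.history['loss'][-1])
```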
/ch06-神经网络/forward.py:
--------------------------------------------------------------------------------
1 | #%%
2 |
3 | import tensorflow as tf
4 | from tensorflow import keras
5 | from tensorflow.keras import layers
6 | from tensorflow.keras import datasets
7 | import os
8 |
9 |
10 | #%%
11 | x = tf.random.normal([2,28*28])
12 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
13 | b1 = tf.Variable(tf.zeros([256]))
14 | o1 = tf.matmul(x,w1) + b1
15 | o1
16 | #%%
17 | x = tf.random.normal([4,28*28])
18 | fc1 = layers.Dense(256, activation=tf.nn.relu)
19 | fc2 = layers.Dense(128, activation=tf.nn.relu)
20 | fc3 = layers.Dense(64, activation=tf.nn.relu)
21 | fc4 = layers.Dense(10, activation=None)
22 | h1 = fc1(x)
23 | h2 = fc2(h1)
24 | h3 = fc3(h2)
25 | h4 = fc4(h3)
26 |
27 | model = keras.Sequential([
28 | layers.Dense(256, activation=tf.nn.relu) ,
29 | layers.Dense(128, activation=tf.nn.relu) ,
30 | layers.Dense(64, activation=tf.nn.relu) ,
31 | layers.Dense(10, activation=None) ,
32 | ])
33 | out = model(x)
34 |
35 | #%%
36 | 256*784+256+128*256+128+64*128+64+10*64+10
37 | #%%
38 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
39 |
40 | # x: [60k, 28, 28],
41 | # y: [60k]
42 | (x, y), _ = datasets.mnist.load_data()
43 | # x: [0~255] => [0~1.]
44 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
45 | y = tf.convert_to_tensor(y, dtype=tf.int32)
46 |
47 | print(x.shape, y.shape, x.dtype, y.dtype)
48 | print(tf.reduce_min(x), tf.reduce_max(x))
49 | print(tf.reduce_min(y), tf.reduce_max(y))
50 |
51 |
52 | train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128)
53 | train_iter = iter(train_db)
54 | sample = next(train_iter)
55 | print('batch:', sample[0].shape, sample[1].shape)
56 |
57 |
58 | # [b, 784] => [b, 256] => [b, 128] => [b, 10]
59 | # [dim_in, dim_out], [dim_out]
60 | # hidden layer 1 tensors
61 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
62 | b1 = tf.Variable(tf.zeros([256]))
63 | # hidden layer 2 tensors
64 | w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1))
65 | b2 = tf.Variable(tf.zeros([128]))
66 | # hidden layer 3 tensors
67 | w3 = tf.Variable(tf.random.truncated_normal([128, 64], stddev=0.1))
68 | b3 = tf.Variable(tf.zeros([64]))
69 | # output layer tensors
70 | w4 = tf.Variable(tf.random.truncated_normal([64, 10], stddev=0.1))
71 | b4 = tf.Variable(tf.zeros([10]))
72 |
73 | lr = 1e-3
74 |
75 | for epoch in range(10): # iterate db for 10
76 | for step, (x, y) in enumerate(train_db): # for every batch
77 | # x:[128, 28, 28]
78 | # y: [128]
79 |
80 | # [b, 28, 28] => [b, 28*28]
81 | x = tf.reshape(x, [-1, 28*28])
82 |
83 | with tf.GradientTape() as tape: # tf.Variable
84 | # x: [b, 28*28]
85 | # hidden layer 1 forward pass, [b, 28*28] => [b, 256]
86 | h1 = x@w1 + tf.broadcast_to(b1, [x.shape[0], 256])
87 | h1 = tf.nn.relu(h1)
88 | # hidden layer 2 forward pass, [b, 256] => [b, 128]
89 | h2 = h1@w2 + b2
90 | h2 = tf.nn.relu(h2)
91 | # hidden layer 3 forward pass, [b, 128] => [b, 64]
92 | h3 = h2@w3 + b3
93 | h3 = tf.nn.relu(h3)
94 | # output layer forward pass, [b, 64] => [b, 10]
95 | h4 = h3@w4 + b4
96 | out = h4
97 |
98 | # compute loss
99 | # out: [b, 10]
100 | # y: [b] => [b, 10]
101 | y_onehot = tf.one_hot(y, depth=10)
102 |
103 | # mse = mean(sum(y-out)^2)
104 | # [b, 10]
105 | loss = tf.square(y_onehot - out)
106 | # mean: scalar
107 | loss = tf.reduce_mean(loss)
108 |
109 | # compute gradients
110 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3, w4, b4])
111 | # print(grads)
112 | # w1 = w1 - lr * w1_grad
113 | w1.assign_sub(lr * grads[0])
114 | b1.assign_sub(lr * grads[1])
115 | w2.assign_sub(lr * grads[2])
116 | b2.assign_sub(lr * grads[3])
117 | w3.assign_sub(lr * grads[4])
118 | b3.assign_sub(lr * grads[5])
119 | w4.assign_sub(lr * grads[6])
120 | b4.assign_sub(lr * grads[7])
121 |
122 |
123 | if step % 100 == 0:
124 | print(epoch, step, 'loss:', float(loss))
125 |
126 |
127 |
128 |
129 | #%%
130 |
--------------------------------------------------------------------------------
/ch06-神经网络/nb.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import tensorflow as tf
3 | from tensorflow import keras
4 | from tensorflow.keras import datasets, layers
5 | import os
6 |
7 |
8 | #%%
9 | a = tf.random.normal([4,35,8]) # simulate grade book A
10 | b = tf.random.normal([6,35,8]) # simulate grade book B
11 | tf.concat([a,b],axis=0) # merge the grade books
12 |
13 |
14 | #%%
15 | x = tf.random.normal([2,784])
16 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
17 | b1 = tf.Variable(tf.zeros([256]))
18 | o1 = tf.matmul(x,w1) + b1 #
19 | o1 = tf.nn.relu(o1)
20 | o1
21 | #%%
22 | x = tf.random.normal([4,28*28])
23 | # create a fully-connected layer, specifying the number of output nodes and the activation function
24 | fc = layers.Dense(512, activation=tf.nn.relu)
25 | h1 = fc(x) # run one fully-connected computation through the fc layer object
26 |
27 |
28 | #%%
29 | vars(fc)
30 |
31 | #%%
32 | x = tf.random.normal([4,4])
33 | # create a fully-connected layer, specifying the number of output nodes and the activation function
34 | fc = layers.Dense(3, activation=tf.nn.relu)
35 | h1 = fc(x) # run one fully-connected computation through the fc layer object
36 |
37 |
38 | #%%
39 | fc.non_trainable_variables
40 |
41 | #%%
42 | embedding = layers.Embedding(10000, 100)
43 |
44 | #%%
45 | x = tf.ones([25000,80])
46 |
47 | #%%
48 |
49 | embedding(x)
50 |
51 | #%%
52 | z = tf.random.normal([2,10]) # construct the output-layer output (logits)
53 | y_onehot = tf.constant([1,3]) # construct the ground-truth labels
54 | y_onehot = tf.one_hot(y_onehot, depth=10) # one-hot encoding
55 | # the output layer does not apply Softmax, so from_logits is set to True
56 | loss = keras.losses.categorical_crossentropy(y_onehot,z,from_logits=True)
57 | loss = tf.reduce_mean(loss) # compute the mean cross-entropy loss
58 | loss
59 |
60 |
61 | #%%
62 | criteon = keras.losses.CategoricalCrossentropy(from_logits=True)
63 | loss = criteon(y_onehot,z) # compute the loss
64 | loss
65 |
66 |
67 | #%%
68 |
--------------------------------------------------------------------------------
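The cross-entropy computed above from one-hot labels can also be obtained directly from the integer labels with the sparse variant; a minimal sketch:

```python
import tensorflow as tf
from tensorflow import keras

z = tf.random.normal([2, 10])      # logits from the output layer
y = tf.constant([1, 3])            # integer class labels

# equivalent to one-hot encoding y and calling categorical_crossentropy
loss = keras.losses.sparse_categorical_crossentropy(y, z, from_logits=True)
print(tf.reduce_mean(loss))
```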
/ch06-神经网络/全接连层.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch06-神经网络/全接连层.pdf
--------------------------------------------------------------------------------
/ch06-神经网络/误差计算.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch06-神经网络/误差计算.pdf
--------------------------------------------------------------------------------
/ch06-神经网络/输出方式.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch06-神经网络/输出方式.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/0.梯度下降-简介.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/0.梯度下降-简介.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/2.常见函数的梯度.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/2.常见函数的梯度.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/2nd_derivative.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | w = tf.Variable(1.0)
4 | b = tf.Variable(2.0)
5 | x = tf.Variable(3.0)
6 |
7 | with tf.GradientTape() as t1:
8 | with tf.GradientTape() as t2:
9 | y = x * w + b
10 | dy_dw, dy_db = t2.gradient(y, [w, b])
11 | d2y_dw2 = t1.gradient(dy_dw, w)
12 |
13 | print(dy_dw)
14 | print(dy_db)
15 | print(d2y_dw2)
16 |
17 | assert dy_dw.numpy() == 3.0
18 | assert d2y_dw2 is None
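19 |
20 | # Why: y = x*w + b is linear in w, so dy/dw = x = 3.0 and the second derivative
21 | # is zero; GradientTape reports the unconnected gradient as None.
22 | # A minimal sketch of a non-zero second derivative with the same variables:
23 | with tf.GradientTape() as t3:
24 |     with tf.GradientTape() as t4:
25 |         y2 = x * w ** 2 + b          # quadratic in w
26 |     dy2_dw = t4.gradient(y2, w)      # 2*x*w = 6.0
27 | d2y2_dw2 = t3.gradient(dy2_dw, w)    # 2*x = 6.0
28 | print(d2y2_dw2)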
--------------------------------------------------------------------------------
/ch07-反向传播算法/3.激活函数及其梯度.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/3.激活函数及其梯度.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/4.损失函数及其梯度.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/4.损失函数及其梯度.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/5.单输出感知机梯度.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/5.单输出感知机梯度.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/6.多输出感知机梯度.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/6.多输出感知机梯度.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/7.链式法则.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/7.链式法则.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/8.多层感知机梯度.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/8.多层感知机梯度.pdf
--------------------------------------------------------------------------------
/ch07-反向传播算法/chain_rule.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | # create the tensors to differentiate
4 | x = tf.constant(1.)
5 | w1 = tf.constant(2.)
6 | b1 = tf.constant(1.)
7 | w2 = tf.constant(2.)
8 | b2 = tf.constant(1.)
9 |
10 |
11 | with tf.GradientTape(persistent=True) as tape:
12 | # tensors that are not tf.Variable must be watched explicitly to record gradients
13 | tape.watch([w1, b1, w2, b2])
14 | # build a 2-layer network
15 | y1 = x * w1 + b1
16 | y2 = y1 * w2 + b2
17 |
18 | # solve for each derivative independently
19 | dy2_dy1 = tape.gradient(y2, [y1])[0]
20 | dy1_dw1 = tape.gradient(y1, [w1])[0]
21 | dy2_dw1 = tape.gradient(y2, [w1])[0]
22 |
23 | # verify the chain rule
24 | print(dy2_dy1 * dy1_dw1)
25 | print(dy2_dw1)
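26 |
27 | # Expected output: dy2/dy1 = w2 = 2.0 and dy1/dw1 = x = 1.0, so the chain rule
28 | # gives dy2/dw1 = 2.0, matching the directly computed gradient.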
--------------------------------------------------------------------------------
/ch07-反向传播算法/crossentropy_loss.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 |
4 | tf.random.set_seed(4323)
5 |
6 | x=tf.random.normal([1,3])
7 |
8 | w=tf.random.normal([3,2])
9 |
10 | b=tf.random.normal([2])
11 |
12 | y = tf.constant([0, 1])
13 |
14 |
15 | with tf.GradientTape() as tape:
16 |
17 | tape.watch([w, b])
18 | logits = (x@w+b)
19 | loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y, logits, from_logits=True))
20 |
21 | grads = tape.gradient(loss, [w, b])
22 | print('w grad:', grads[0])
23 |
24 | print('b grad:', grads[1])
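25 |
26 | # Sanity check (standard result, not specific to this script): the gradient of
27 | # softmax cross-entropy w.r.t. the logits is softmax(logits) - y, so
28 | d_logits = tf.nn.softmax(logits) - tf.cast(y, tf.float32)   # [1, 2]
29 | print('analytic w grad:', tf.transpose(x) @ d_logits)       # matches grads[0]
30 | print('analytic b grad:', tf.reduce_sum(d_logits, axis=0))  # matches grads[1]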
--------------------------------------------------------------------------------
/ch07-反向传播算法/himmelblau.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from mpl_toolkits.mplot3d import Axes3D
3 | from matplotlib import pyplot as plt
4 | import tensorflow as tf
5 |
6 |
7 |
8 | def himmelblau(x):
9 | # Himmelblau's function
10 | return (x[0] ** 2 + x[1] - 11) ** 2 + (x[0] + x[1] ** 2 - 7) ** 2
11 |
12 |
13 | x = np.arange(-6, 6, 0.1)
14 | y = np.arange(-6, 6, 0.1)
15 | print('x,y range:', x.shape, y.shape)
16 | # generate the x-y sampling grid for visualization
17 | X, Y = np.meshgrid(x, y)
18 | print('X,Y maps:', X.shape, Y.shape)
19 | Z = himmelblau([X, Y]) # evaluate the function on the grid points
20 |
21 | # plot the Himmelblau surface
22 | fig = plt.figure('himmelblau')
23 | ax = fig.add_subplot(projection='3d')
24 | ax.plot_surface(X, Y, Z)
25 | ax.view_init(60, -30)
26 | ax.set_xlabel('x')
27 | ax.set_ylabel('y')
28 | plt.show()
29 |
30 | # The initial value strongly affects which minimum the optimization reaches;
31 | # try different starting points to check which minimum is found
32 | # [1., 0.], [-4, 0.], [4, 0.]
33 | # x = tf.constant([4., 0.])
34 | # x = tf.constant([1., 0.])
35 | # x = tf.constant([-4., 0.])
36 | x = tf.constant([-2., 2.])
37 |
38 | for step in range(200):# optimization loop
39 | with tf.GradientTape() as tape: # track gradients
40 | tape.watch([x]) # watch x so its gradient is recorded
41 | y = himmelblau(x) # forward pass
42 | # backward pass
43 | grads = tape.gradient(y, [x])[0]
44 | # update parameters; 0.01 is the learning rate
45 | x -= 0.01*grads
46 | # periodically print the current minimum
47 | if step % 20 == 19:
48 | print ('step {}: x = {}, f(x) = {}'
49 | .format(step, x.numpy(), y.numpy()))
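50 |
51 | # Himmelblau's function has four global minima, all with f(x) = 0:
52 | # (3.0, 2.0), (-2.805, 3.131), (-3.779, -3.283) and (3.584, -1.848);
53 | # which one the loop converges to depends on the starting point chosen above.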
--------------------------------------------------------------------------------
/ch07-反向传播算法/mse_grad.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 |
4 |
5 |
6 | x=tf.random.normal([1,3])
7 |
8 | w=tf.ones([3,2])
9 |
10 | b=tf.ones([2])
11 |
12 | y = tf.constant([0, 1])
13 |
14 |
15 | with tf.GradientTape() as tape:
16 |
17 | tape.watch([w, b])
18 | logits = tf.sigmoid(x@w+b)
19 | loss = tf.reduce_mean(tf.losses.MSE(y, logits))
20 |
21 | grads = tape.gradient(loss, [w, b])
22 | print('w grad:', grads[0])
23 |
24 | print('b grad:', grads[1])
25 |
26 |
27 |
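28 | # For reference: with o = sigmoid(x@w+b), the loss is mean_k (y_k - o_k)^2, so
29 | # dL/d(x@w+b) = (2/K) * (o - y) * o * (1 - o) elementwise (K = 2 outputs here),
30 | # and the w gradient is x^T times that quantity.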
--------------------------------------------------------------------------------
/ch07-反向传播算法/multi_output_perceptron.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 |
4 |
5 |
6 | x=tf.random.normal([1,3])
7 |
8 | w=tf.ones([3,2])
9 |
10 | b=tf.ones([2])
11 |
12 | y = tf.constant([0, 1])
13 |
14 |
15 | with tf.GradientTape() as tape:
16 |
17 | tape.watch([w, b])
18 | logits = tf.sigmoid(x@w+b)
19 | loss = tf.reduce_mean(tf.losses.MSE(y, logits))
20 |
21 | grads = tape.gradient(loss, [w, b])
22 | print('w grad:', grads[0])
23 |
24 | print('b grad:', grads[1])
25 |
26 |
27 |
--------------------------------------------------------------------------------
/ch07-反向传播算法/sigmoid_grad.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 |
4 | a = tf.linspace(-10., 10., 10)
5 |
6 | with tf.GradientTape() as tape:
7 | tape.watch(a)
8 | y = tf.sigmoid(a)
9 |
10 |
11 | grads = tape.gradient(y, [a])
12 | print('x:', a.numpy())
13 | print('y:', y.numpy())
14 | print('grad:', grads[0].numpy())
15 |
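16 | # Analytic check: d/da sigmoid(a) = sigmoid(a) * (1 - sigmoid(a)),
17 | # so the line below should reproduce grads[0] computed by autodiff.
18 | print('analytic:', (y * (1. - y)).numpy())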
--------------------------------------------------------------------------------
/ch07-反向传播算法/single_output_perceptron.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 |
4 |
5 |
6 | x=tf.random.normal([1,3])
7 |
8 | w=tf.ones([3,1])
9 |
10 | b=tf.ones([1])
11 |
12 | y = tf.constant([1])
13 |
14 |
15 | with tf.GradientTape() as tape:
16 |
17 | tape.watch([w, b])
18 | logits = tf.sigmoid(x@w+b)
19 | loss = tf.reduce_mean(tf.losses.MSE(y, logits))
20 |
21 | grads = tape.gradient(loss, [w, b])
22 | print('w grad:', grads[0])
23 |
24 | print('b grad:', grads[1])
25 |
26 |
27 |
--------------------------------------------------------------------------------
/ch08-Keras高层接口/1.Metrics.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/1.Metrics.pdf
--------------------------------------------------------------------------------
/ch08-Keras高层接口/2.Compile&Fit.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/2.Compile&Fit.pdf
--------------------------------------------------------------------------------
/ch08-Keras高层接口/3.自定义层.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/3.自定义层.pdf
--------------------------------------------------------------------------------
/ch08-Keras高层接口/Keras实战CIFAR10.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/Keras实战CIFAR10.pdf
--------------------------------------------------------------------------------
/ch08-Keras高层接口/compile_fit.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
3 |
4 |
5 | def preprocess(x, y):
6 | """
7 | x is a simple image, not a batch
8 | """
9 | x = tf.cast(x, dtype=tf.float32) / 255.
10 | x = tf.reshape(x, [28*28])
11 | y = tf.cast(y, dtype=tf.int32)
12 | y = tf.one_hot(y, depth=10)
13 | return x,y
14 |
15 |
16 | batchsz = 128
17 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
18 | print('datasets:', x.shape, y.shape, x.min(), x.max())
19 |
20 |
21 |
22 | db = tf.data.Dataset.from_tensor_slices((x,y))
23 | db = db.map(preprocess).shuffle(60000).batch(batchsz)
24 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
25 | ds_val = ds_val.map(preprocess).batch(batchsz)
26 |
27 | sample = next(iter(db))
28 | print(sample[0].shape, sample[1].shape)
29 |
30 |
31 | network = Sequential([layers.Dense(256, activation='relu'),
32 | layers.Dense(128, activation='relu'),
33 | layers.Dense(64, activation='relu'),
34 | layers.Dense(32, activation='relu'),
35 | layers.Dense(10)])
36 | network.build(input_shape=(None, 28*28))
37 | network.summary()
38 |
39 |
40 |
41 |
42 | network.compile(optimizer=optimizers.Adam(lr=0.01),
43 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
44 | metrics=['accuracy']
45 | )
46 |
47 | network.fit(db, epochs=5, validation_data=ds_val, validation_freq=2)
48 |
49 | network.evaluate(ds_val)
50 |
51 | sample = next(iter(ds_val))
52 | x = sample[0]
53 | y = sample[1] # one-hot
54 | pred = network.predict(x) # [b, 10]
55 | # convert back to number
56 | y = tf.argmax(y, axis=1)
57 | pred = tf.argmax(pred, axis=1)
58 |
59 | print(pred)
60 | print(y)
61 |
--------------------------------------------------------------------------------
/ch08-Keras高层接口/keras_train.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
3 |
4 | import tensorflow as tf
5 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
6 | from tensorflow import keras
7 |
8 |
9 |
10 | def preprocess(x, y):
11 | # [0~255] => [-1~1]
12 | x = 2 * tf.cast(x, dtype=tf.float32) / 255. - 1.
13 | y = tf.cast(y, dtype=tf.int32)
14 | return x,y
15 |
16 |
17 | batchsz = 128
18 | # [50k, 32, 32, 3], [10k, 1]
19 | (x, y), (x_val, y_val) = datasets.cifar10.load_data()
20 | y = tf.squeeze(y)
21 | y_val = tf.squeeze(y_val)
22 | y = tf.one_hot(y, depth=10) # [50k, 10]
23 | y_val = tf.one_hot(y_val, depth=10) # [10k, 10]
24 | print('datasets:', x.shape, y.shape, x_val.shape, y_val.shape, x.min(), x.max())
25 |
26 |
27 | train_db = tf.data.Dataset.from_tensor_slices((x,y))
28 | train_db = train_db.map(preprocess).shuffle(10000).batch(batchsz)
29 | test_db = tf.data.Dataset.from_tensor_slices((x_val, y_val))
30 | test_db = test_db.map(preprocess).batch(batchsz)
31 |
32 |
33 | sample = next(iter(train_db))
34 | print('batch:', sample[0].shape, sample[1].shape)
35 |
36 |
37 | class MyDense(layers.Layer):
38 | # to replace standard layers.Dense()
39 | def __init__(self, inp_dim, outp_dim):
40 | super(MyDense, self).__init__()
41 |
42 | self.kernel = self.add_weight('w', [inp_dim, outp_dim])
43 | # self.bias = self.add_variable('b', [outp_dim])
44 |
45 | def call(self, inputs, training=None):
46 |
47 | x = inputs @ self.kernel
48 | return x
49 |
50 | class MyNetwork(keras.Model):
51 |
52 | def __init__(self):
53 | super(MyNetwork, self).__init__()
54 |
55 | self.fc1 = MyDense(32*32*3, 256)
56 | self.fc2 = MyDense(256, 128)
57 | self.fc3 = MyDense(128, 64)
58 | self.fc4 = MyDense(64, 32)
59 | self.fc5 = MyDense(32, 10)
60 |
61 |
62 |
63 | def call(self, inputs, training=None):
64 | """
65 |
66 | :param inputs: [b, 32, 32, 3]
67 | :param training:
68 | :return:
69 | """
70 | x = tf.reshape(inputs, [-1, 32*32*3])
71 | # [b, 32*32*3] => [b, 256]
72 | x = self.fc1(x)
73 | x = tf.nn.relu(x)
74 | # [b, 256] => [b, 128]
75 | x = self.fc2(x)
76 | x = tf.nn.relu(x)
77 | # [b, 128] => [b, 64]
78 | x = self.fc3(x)
79 | x = tf.nn.relu(x)
80 | # [b, 64] => [b, 32]
81 | x = self.fc4(x)
82 | x = tf.nn.relu(x)
83 | # [b, 32] => [b, 10]
84 | x = self.fc5(x)
85 |
86 | return x
87 |
88 |
89 | network = MyNetwork()
90 | network.compile(optimizer=optimizers.Adam(lr=1e-3),
91 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
92 | metrics=['accuracy'])
93 | network.fit(train_db, epochs=15, validation_data=test_db, validation_freq=1)
94 |
95 | network.evaluate(test_db)
96 | network.save_weights('ckpt/weights.ckpt')
97 | del network
98 | print('saved to ckpt/weights.ckpt')
99 |
100 |
101 | network = MyNetwork()
102 | network.compile(optimizer=optimizers.Adam(lr=1e-3),
103 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
104 | metrics=['accuracy'])
105 | network.load_weights('ckpt/weights.ckpt')
106 | print('loaded weights from file.')
107 | network.evaluate(test_db)
--------------------------------------------------------------------------------
/ch08-Keras高层接口/layer_model.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
3 | from tensorflow import keras
4 |
5 | def preprocess(x, y):
6 | """
7 | x is a simple image, not a batch
8 | """
9 | x = tf.cast(x, dtype=tf.float32) / 255.
10 | x = tf.reshape(x, [28*28])
11 | y = tf.cast(y, dtype=tf.int32)
12 | y = tf.one_hot(y, depth=10)
13 | return x,y
14 |
15 |
16 | batchsz = 128
17 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
18 | print('datasets:', x.shape, y.shape, x.min(), x.max())
19 |
20 |
21 |
22 | db = tf.data.Dataset.from_tensor_slices((x,y))
23 | db = db.map(preprocess).shuffle(60000).batch(batchsz)
24 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
25 | ds_val = ds_val.map(preprocess).batch(batchsz)
26 |
27 | sample = next(iter(db))
28 | print(sample[0].shape, sample[1].shape)
29 |
30 |
31 | network = Sequential([layers.Dense(256, activation='relu'),
32 | layers.Dense(128, activation='relu'),
33 | layers.Dense(64, activation='relu'),
34 | layers.Dense(32, activation='relu'),
35 | layers.Dense(10)])
36 | network.build(input_shape=(None, 28*28))
37 | network.summary()
38 |
39 |
40 | class MyDense(layers.Layer):
41 |
42 | def __init__(self, inp_dim, outp_dim):
43 | super(MyDense, self).__init__()
44 |
45 | self.kernel = self.add_weight('w', [inp_dim, outp_dim])
46 | self.bias = self.add_weight('b', [outp_dim])
47 |
48 | def call(self, inputs, training=None):
49 |
50 | out = inputs @ self.kernel + self.bias
51 |
52 | return out
53 |
54 | class MyModel(keras.Model):
55 |
56 | def __init__(self):
57 | super(MyModel, self).__init__()
58 |
59 | self.fc1 = MyDense(28*28, 256)
60 | self.fc2 = MyDense(256, 128)
61 | self.fc3 = MyDense(128, 64)
62 | self.fc4 = MyDense(64, 32)
63 | self.fc5 = MyDense(32, 10)
64 |
65 | def call(self, inputs, training=None):
66 |
67 | x = self.fc1(inputs)
68 | x = tf.nn.relu(x)
69 | x = self.fc2(x)
70 | x = tf.nn.relu(x)
71 | x = self.fc3(x)
72 | x = tf.nn.relu(x)
73 | x = self.fc4(x)
74 | x = tf.nn.relu(x)
75 | x = self.fc5(x)
76 |
77 | return x
78 |
79 |
80 | network = MyModel()
81 |
82 |
83 | network.compile(optimizer=optimizers.Adam(lr=0.01),
84 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
85 | metrics=['accuracy']
86 | )
87 |
88 | network.fit(db, epochs=5, validation_data=ds_val,
89 | validation_freq=2)
90 |
91 | network.evaluate(ds_val)
92 |
93 | sample = next(iter(ds_val))
94 | x = sample[0]
95 | y = sample[1] # one-hot
96 | pred = network.predict(x) # [b, 10]
97 | # convert back to number
98 | y = tf.argmax(y, axis=1)
99 | pred = tf.argmax(pred, axis=1)
100 |
101 | print(pred)
102 | print(y)
103 |
--------------------------------------------------------------------------------
/ch08-Keras高层接口/metrics.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
3 |
4 |
5 | def preprocess(x, y):
6 |
7 | x = tf.cast(x, dtype=tf.float32) / 255.
8 | y = tf.cast(y, dtype=tf.int32)
9 |
10 | return x,y
11 |
12 |
13 | batchsz = 128
14 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
15 | print('datasets:', x.shape, y.shape, x.min(), x.max())
16 |
17 |
18 |
19 | db = tf.data.Dataset.from_tensor_slices((x,y))
20 | db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10)
21 |
22 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
23 | ds_val = ds_val.map(preprocess).batch(batchsz)
24 |
25 |
26 |
27 |
28 | network = Sequential([layers.Dense(256, activation='relu'),
29 | layers.Dense(128, activation='relu'),
30 | layers.Dense(64, activation='relu'),
31 | layers.Dense(32, activation='relu'),
32 | layers.Dense(10)])
33 | network.build(input_shape=(None, 28*28))
34 | network.summary()
35 |
36 | optimizer = optimizers.Adam(lr=0.01)
37 |
38 | acc_meter = metrics.Accuracy()
39 | loss_meter = metrics.Mean()
40 |
41 |
42 | for step, (x,y) in enumerate(db):
43 |
44 | with tf.GradientTape() as tape:
45 | # [b, 28, 28] => [b, 784]
46 | x = tf.reshape(x, (-1, 28*28))
47 | # [b, 784] => [b, 10]
48 | out = network(x)
49 | # [b] => [b, 10]
50 | y_onehot = tf.one_hot(y, depth=10)
51 | # [b]
52 | loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True))
53 |
54 | loss_meter.update_state(loss)
55 |
56 |
57 |
58 | grads = tape.gradient(loss, network.trainable_variables)
59 | optimizer.apply_gradients(zip(grads, network.trainable_variables))
60 |
61 |
62 | if step % 100 == 0:
63 |
64 | print(step, 'loss:', loss_meter.result().numpy())
65 | loss_meter.reset_states()
66 |
67 |
68 | # evaluate
69 | if step % 500 == 0:
70 | total, total_correct = 0., 0
71 | acc_meter.reset_states()
72 |
73 | for step, (x, y) in enumerate(ds_val):
74 | # [b, 28, 28] => [b, 784]
75 | x = tf.reshape(x, (-1, 28*28))
76 | # [b, 784] => [b, 10]
77 | out = network(x)
78 |
79 |
80 | # [b, 10] => [b]
81 | pred = tf.argmax(out, axis=1)
82 | pred = tf.cast(pred, dtype=tf.int32)
83 | # bool type
84 | correct = tf.equal(pred, y)
85 | # bool tensor => int tensor => numpy
86 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
87 | total += x.shape[0]
88 |
89 | acc_meter.update_state(y, pred)
90 |
91 |
92 | print(step, 'Evaluate Acc:', total_correct/total, acc_meter.result().numpy())
93 |
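94 | # Reminder of the tf.keras.metrics API used above: update_state() accumulates new
95 | # samples, result() returns the aggregated value, and reset_states() clears the
96 | # internal buffers so the next interval is measured from scratch.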
--------------------------------------------------------------------------------
/ch08-Keras高层接口/nb.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import tensorflow as tf
3 | from tensorflow import keras
4 | from tensorflow.keras import layers,Sequential,losses,optimizers,datasets
5 |
6 |
7 | #%%
8 | x = tf.constant([2.,1.,0.1])
9 | layer = layers.Softmax(axis=-1)
10 | layer(x)
11 | #%%
12 | def proprocess(x,y):
13 | x = tf.reshape(x, [-1])
14 | return x,y
15 |
16 | # x: [60k, 28, 28],
17 | # y: [60k]
18 | (x, y), (x_test,y_test) = datasets.mnist.load_data()
19 | # x: [0~255] => [0~1.]
20 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
21 | y = tf.convert_to_tensor(y, dtype=tf.int32)
22 |
23 | # x: [0~255] => [0~1.]
24 | x_test = tf.convert_to_tensor(x_test, dtype=tf.float32) / 255.
25 | y_test = tf.convert_to_tensor(y_test, dtype=tf.int32)
26 |
27 | train_db = tf.data.Dataset.from_tensor_slices((x,y))
28 | train_db = train_db.shuffle(1000).map(proprocess).batch(128)
29 |
30 | val_db = tf.data.Dataset.from_tensor_slices((x_test,y_test))
31 | val_db = val_db.shuffle(1000).map(proprocess).batch(128)
32 |
33 | x,y = next(iter(train_db))
34 | print(x.shape, y.shape)
35 | #%%
36 |
37 | from tensorflow.keras import layers, Sequential
38 | network = Sequential([
39 | layers.Dense(3, activation=None),
40 | layers.ReLU(),
41 | layers.Dense(2, activation=None),
42 | layers.ReLU()
43 | ])
44 | x = tf.random.normal([4,3])
45 | network(x)
46 |
47 | #%%
48 | layers_num = 2
49 | network = Sequential([])
50 | for _ in range(layers_num):
51 | network.add(layers.Dense(3))
52 | network.add(layers.ReLU())
53 | network.build(input_shape=(None, 4))
54 | network.summary()
55 |
56 | #%%
57 | for p in network.trainable_variables:
58 | print(p.name, p.shape)
59 |
60 | #%%
61 | # build a 5-layer fully-connected network
62 | network = Sequential([layers.Dense(256, activation='relu'),
63 | layers.Dense(128, activation='relu'),
64 | layers.Dense(64, activation='relu'),
65 | layers.Dense(32, activation='relu'),
66 | layers.Dense(10)])
67 | network.build(input_shape=(4, 28*28))
68 | network.summary()
69 |
70 |
71 | #%%
72 | # import the optimizer and loss-function modules
73 | from tensorflow.keras import optimizers,losses
74 | # use the Adam optimizer with learning rate 0.01 and a cross-entropy loss that includes Softmax
75 | network.compile(optimizer=optimizers.Adam(lr=0.01),
76 | loss=losses.CategoricalCrossentropy(from_logits=True),
77 | metrics=['accuracy'] # track accuracy as the metric
78 | )
79 |
80 |
81 | #%%
82 | # train on train_db and validate on val_db; 5 epochs, validating every 2 epochs
83 | history = network.fit(train_db, epochs=5, validation_data=val_db, validation_freq=2)
84 |
85 |
86 | #%%
87 | history.history # inspect the training history
88 |
89 | #%%
90 | # save the model parameters to a file
91 | network.save_weights('weights.ckpt')
92 | print('saved weights.')
93 | del network # delete the network object
94 | # re-create the same network structure
95 | network = Sequential([layers.Dense(256, activation='relu'),
96 | layers.Dense(128, activation='relu'),
97 | layers.Dense(64, activation='relu'),
98 | layers.Dense(32, activation='relu'),
99 | layers.Dense(10)])
100 | network.compile(optimizer=optimizers.Adam(lr=0.01),
101 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
102 | metrics=['accuracy']
103 | )
104 | # read the saved parameters and load them into the new network
105 | network.load_weights('weights.ckpt')
106 | print('loaded weights!')
107 |
108 |
109 | #%%
110 | # create a global average pooling layer
111 | global_average_layer = layers.GlobalAveragePooling2D()
112 | # feed the previous layer's output into this layer and inspect the result
113 | x = tf.random.normal([4,7,7,2048])
114 | out = global_average_layer(x) # the pooling layer removes the spatial dimensions
115 | print(out.shape)
116 |
117 |
118 | #%%
119 | # create a fully-connected layer
120 | fc = layers.Dense(100)
121 | # feed the previous layer's output into this layer and inspect the result
122 | x = tf.random.normal([4,2048])
123 | out = fc(x)
124 | print(out.shape)
125 |
126 |
127 | #%%
128 |
--------------------------------------------------------------------------------
/ch08-Keras高层接口/pretained.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import tensorflow as tf
3 | from tensorflow import keras
4 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
5 |
6 | #%%
7 | # load the pretrained network and drop its top (classification) layer
8 | resnet = keras.applications.ResNet50(weights='imagenet',include_top=False)
9 | resnet.summary()
10 | # test the network output
11 | x = tf.random.normal([4,224,224,3])
12 | out = resnet(x)
13 | out.shape
14 | #%%
15 | # create a global average pooling layer
16 | global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
17 | # feed the previous layer's output into this layer and inspect the result
18 | x = tf.random.normal([4,7,7,2048])
19 | out = global_average_layer(x)
20 | print(out.shape)
21 | #%%
22 | # create a fully-connected layer
23 | fc = tf.keras.layers.Dense(100)
24 | # feed the previous layer's output into this layer and inspect the result
25 | x = tf.random.normal([4,2048])
26 | out = fc(x)
27 | print(out.shape)
28 | #%%
29 | # wrap everything into our own model
30 | mynet = Sequential([resnet, global_average_layer, fc])
31 | mynet.summary()
32 | #%%
33 | resnet.trainable = False
34 | mynet.summary()
35 |
36 | #%%
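37 | # A typical next step (sketch only; assumes a tf.data pipeline `train_db` of
38 | # 224x224 images with integer labels for 100 classes, which is not defined here):
39 | # mynet.compile(optimizer=optimizers.Adam(lr=1e-4),
40 | #               loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
41 | #               metrics=['accuracy'])
42 | # mynet.fit(train_db, epochs=10)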
--------------------------------------------------------------------------------
/ch08-Keras高层接口/save_load_model.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
3 |
4 | import tensorflow as tf
5 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
6 |
7 |
8 | def preprocess(x, y):
9 | """
10 | x is a simple image, not a batch
11 | """
12 | x = tf.cast(x, dtype=tf.float32) / 255.
13 | x = tf.reshape(x, [28*28])
14 | y = tf.cast(y, dtype=tf.int32)
15 | y = tf.one_hot(y, depth=10)
16 | return x,y
17 |
18 |
19 | batchsz = 128
20 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
21 | print('datasets:', x.shape, y.shape, x.min(), x.max())
22 |
23 |
24 |
25 | db = tf.data.Dataset.from_tensor_slices((x,y))
26 | db = db.map(preprocess).shuffle(60000).batch(batchsz)
27 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
28 | ds_val = ds_val.map(preprocess).batch(batchsz)
29 |
30 | sample = next(iter(db))
31 | print(sample[0].shape, sample[1].shape)
32 |
33 |
34 | network = Sequential([layers.Dense(256, activation='relu'),
35 | layers.Dense(128, activation='relu'),
36 | layers.Dense(64, activation='relu'),
37 | layers.Dense(32, activation='relu'),
38 | layers.Dense(10)])
39 | network.build(input_shape=(None, 28*28))
40 | network.summary()
41 |
42 |
43 |
44 |
45 | network.compile(optimizer=optimizers.Adam(lr=0.01),
46 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
47 | metrics=['accuracy']
48 | )
49 |
50 | network.fit(db, epochs=3, validation_data=ds_val, validation_freq=2)
51 |
52 | network.evaluate(ds_val)
53 |
54 | network.save('model.h5')
55 | print('saved total model.')
56 | del network
57 |
58 | print('loaded model from file.')
59 | network = tf.keras.models.load_model('model.h5', compile=False)
60 | network.compile(optimizer=optimizers.Adam(lr=0.01),
61 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
62 | metrics=['accuracy']
63 | )
64 | x_val = tf.cast(x_val, dtype=tf.float32) / 255.
65 | x_val = tf.reshape(x_val, [-1, 28*28])
66 | y_val = tf.cast(y_val, dtype=tf.int32)
67 | y_val = tf.one_hot(y_val, depth=10)
68 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(128)
69 | network.evaluate(ds_val)
70 |
--------------------------------------------------------------------------------
/ch08-Keras高层接口/save_load_weight.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
3 |
4 | import tensorflow as tf
5 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
6 |
7 |
8 | def preprocess(x, y):
9 | """
10 | x is a simple image, not a batch
11 | """
12 | x = tf.cast(x, dtype=tf.float32) / 255.
13 | x = tf.reshape(x, [28*28])
14 | y = tf.cast(y, dtype=tf.int32)
15 | y = tf.one_hot(y, depth=10)
16 | return x,y
17 |
18 |
19 | batchsz = 128
20 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
21 | print('datasets:', x.shape, y.shape, x.min(), x.max())
22 |
23 |
24 |
25 | db = tf.data.Dataset.from_tensor_slices((x,y))
26 | db = db.map(preprocess).shuffle(60000).batch(batchsz)
27 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
28 | ds_val = ds_val.map(preprocess).batch(batchsz)
29 |
30 | sample = next(iter(db))
31 | print(sample[0].shape, sample[1].shape)
32 |
33 |
34 | network = Sequential([layers.Dense(256, activation='relu'),
35 | layers.Dense(128, activation='relu'),
36 | layers.Dense(64, activation='relu'),
37 | layers.Dense(32, activation='relu'),
38 | layers.Dense(10)])
39 | network.build(input_shape=(None, 28*28))
40 | network.summary()
41 |
42 |
43 |
44 |
45 | network.compile(optimizer=optimizers.Adam(lr=0.01),
46 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
47 | metrics=['accuracy']
48 | )
49 |
50 | network.fit(db, epochs=3, validation_data=ds_val, validation_freq=2)
51 |
52 | network.evaluate(ds_val)
53 |
54 | network.save_weights('weights.ckpt')
55 | print('saved weights.')
56 | del network
57 |
58 | network = Sequential([layers.Dense(256, activation='relu'),
59 | layers.Dense(128, activation='relu'),
60 | layers.Dense(64, activation='relu'),
61 | layers.Dense(32, activation='relu'),
62 | layers.Dense(10)])
63 | network.compile(optimizer=optimizers.Adam(lr=0.01),
64 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
65 | metrics=['accuracy']
66 | )
67 | network.load_weights('weights.ckpt')
68 | print('loaded weights!')
69 | network.evaluate(ds_val)
70 |
--------------------------------------------------------------------------------
/ch08-Keras高层接口/模型加载与保存.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/模型加载与保存.pdf
--------------------------------------------------------------------------------
/ch09-过拟合/Regularization.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/Regularization.pdf
--------------------------------------------------------------------------------
/ch09-过拟合/compile_fit.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
3 |
4 |
5 | def preprocess(x, y):
6 | """
7 | x is a simple image, not a batch
8 | """
9 | x = tf.cast(x, dtype=tf.float32) / 255.
10 | x = tf.reshape(x, [28*28])
11 | y = tf.cast(y, dtype=tf.int32)
12 | y = tf.one_hot(y, depth=10)
13 | return x,y
14 |
15 |
16 | batchsz = 128
17 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
18 | print('datasets:', x.shape, y.shape, x.min(), x.max())
19 |
20 |
21 |
22 | db = tf.data.Dataset.from_tensor_slices((x,y))
23 | db = db.map(preprocess).shuffle(60000).batch(batchsz)
24 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
25 | ds_val = ds_val.map(preprocess).batch(batchsz)
26 |
27 | sample = next(iter(db))
28 | print(sample[0].shape, sample[1].shape)
29 |
30 |
31 | network = Sequential([layers.Dense(256, activation='relu'),
32 | layers.Dense(128, activation='relu'),
33 | layers.Dense(64, activation='relu'),
34 | layers.Dense(32, activation='relu'),
35 | layers.Dense(10)])
36 | network.build(input_shape=(None, 28*28))
37 | network.summary()
38 |
39 |
40 |
41 |
42 | network.compile(optimizer=optimizers.Adam(lr=0.01),
43 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
44 | metrics=['accuracy']
45 | )
46 |
47 | network.fit(db, epochs=5, validation_data=ds_val,
48 | validation_freq=2)
49 |
50 | network.evaluate(ds_val)
51 |
52 | sample = next(iter(ds_val))
53 | x = sample[0]
54 | y = sample[1] # one-hot
55 | pred = network.predict(x) # [b, 10]
56 | # convert back to number
57 | y = tf.argmax(y, axis=1)
58 | pred = tf.argmax(pred, axis=1)
59 |
60 | print(pred)
61 | print(y)
62 |
--------------------------------------------------------------------------------
/ch09-过拟合/dropout.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
3 |
4 | import tensorflow as tf
5 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
6 |
7 |
8 | def preprocess(x, y):
9 |
10 | x = tf.cast(x, dtype=tf.float32) / 255.
11 | y = tf.cast(y, dtype=tf.int32)
12 |
13 | return x,y
14 |
15 |
16 | batchsz = 128
17 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
18 | print('datasets:', x.shape, y.shape, x.min(), x.max())
19 |
20 |
21 |
22 | db = tf.data.Dataset.from_tensor_slices((x,y))
23 | db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10)
24 |
25 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
26 | ds_val = ds_val.map(preprocess).batch(batchsz)
27 |
28 |
29 |
30 |
31 | network = Sequential([layers.Dense(256, activation='relu'),
32 | layers.Dropout(0.5), # 0.5 rate to drop
33 | layers.Dense(128, activation='relu'),
34 | layers.Dropout(0.5), # 0.5 rate to drop
35 | layers.Dense(64, activation='relu'),
36 | layers.Dense(32, activation='relu'),
37 | layers.Dense(10)])
38 | network.build(input_shape=(None, 28*28))
39 | network.summary()
40 |
41 | optimizer = optimizers.Adam(lr=0.01)
42 |
43 |
44 |
45 | for step, (x,y) in enumerate(db):
46 |
47 | with tf.GradientTape() as tape:
48 | # [b, 28, 28] => [b, 784]
49 | x = tf.reshape(x, (-1, 28*28))
50 | # [b, 784] => [b, 10]
51 | out = network(x, training=True)
52 | # [b] => [b, 10]
53 | y_onehot = tf.one_hot(y, depth=10)
54 | # [b]
55 | loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True))
56 |
57 |
58 | loss_regularization = []
59 | for p in network.trainable_variables:
60 | loss_regularization.append(tf.nn.l2_loss(p))
61 | loss_regularization = tf.reduce_sum(tf.stack(loss_regularization))
62 |
63 | loss = loss + 0.0001 * loss_regularization
64 |
65 |
66 | grads = tape.gradient(loss, network.trainable_variables)
67 | optimizer.apply_gradients(zip(grads, network.trainable_variables))
68 |
69 |
70 | if step % 100 == 0:
71 |
72 | print(step, 'loss:', float(loss), 'loss_regularization:', float(loss_regularization))
73 |
74 |
75 | # evaluate
76 | if step % 500 == 0:
77 | total, total_correct = 0., 0
78 |
79 | for step, (x, y) in enumerate(ds_val):
80 | # [b, 28, 28] => [b, 784]
81 | x = tf.reshape(x, (-1, 28*28))
82 | # [b, 784] => [b, 10]
83 | out = network(x, training=True)
84 | # [b, 10] => [b]
85 | pred = tf.argmax(out, axis=1)
86 | pred = tf.cast(pred, dtype=tf.int32)
87 | # bool type
88 | correct = tf.equal(pred, y)
89 | # bool tensor => int tensor => numpy
90 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
91 | total += x.shape[0]
92 |
93 | print(step, 'Evaluate Acc with drop:', total_correct/total)
94 |
95 | total, total_correct = 0., 0
96 |
97 | for step, (x, y) in enumerate(ds_val):
98 | # [b, 28, 28] => [b, 784]
99 | x = tf.reshape(x, (-1, 28*28))
100 | # [b, 784] => [b, 10]
101 | out = network(x, training=False)
102 | # [b, 10] => [b]
103 | pred = tf.argmax(out, axis=1)
104 | pred = tf.cast(pred, dtype=tf.int32)
105 | # bool type
106 | correct = tf.equal(pred, y)
107 | # bool tensor => int tensor => numpy
108 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
109 | total += x.shape[0]
110 |
111 | print(step, 'Evaluate Acc without drop:', total_correct/total)
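112 |
113 | # Note: the first evaluation loop calls the network with training=True, so dropout
114 | # stays active and accuracy is usually lower; the second uses training=False, which
115 | # disables dropout so all units contribute at inference time.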
--------------------------------------------------------------------------------
/ch09-过拟合/lenna.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_crop.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_crop.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_crop2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_crop2.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_eras.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_eras.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_eras2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_eras2.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_flip.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_flip.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_flip2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_flip2.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_guassian.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_guassian.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_perspective.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_perspective.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_resize.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_resize.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_rotate.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_rotate.png
--------------------------------------------------------------------------------
/ch09-过拟合/lenna_rotate2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_rotate2.png
--------------------------------------------------------------------------------
/ch09-过拟合/misc.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/misc.pdf
--------------------------------------------------------------------------------
/ch09-过拟合/regularization.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
3 |
4 |
5 | def preprocess(x, y):
6 |
7 | x = tf.cast(x, dtype=tf.float32) / 255.
8 | y = tf.cast(y, dtype=tf.int32)
9 |
10 | return x,y
11 |
12 |
13 | batchsz = 128
14 | (x, y), (x_val, y_val) = datasets.mnist.load_data()
15 | print('datasets:', x.shape, y.shape, x.min(), x.max())
16 |
17 |
18 |
19 | db = tf.data.Dataset.from_tensor_slices((x,y))
20 | db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10)
21 |
22 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
23 | ds_val = ds_val.map(preprocess).batch(batchsz)
24 |
25 |
26 |
27 |
28 | network = Sequential([layers.Dense(256, activation='relu'),
29 | layers.Dense(128, activation='relu'),
30 | layers.Dense(64, activation='relu'),
31 | layers.Dense(32, activation='relu'),
32 | layers.Dense(10)])
33 | network.build(input_shape=(None, 28*28))
34 | network.summary()
35 |
36 | optimizer = optimizers.Adam(lr=0.01)
37 |
38 |
39 |
40 | for step, (x,y) in enumerate(db):
41 |
42 | with tf.GradientTape() as tape:
43 | # [b, 28, 28] => [b, 784]
44 | x = tf.reshape(x, (-1, 28*28))
45 | # [b, 784] => [b, 10]
46 | out = network(x)
47 | # [b] => [b, 10]
48 | y_onehot = tf.one_hot(y, depth=10)
49 | # [b]
50 | loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True))
51 |
52 |
53 | loss_regularization = []
54 | for p in network.trainable_variables:
55 | loss_regularization.append(tf.nn.l2_loss(p))
56 | loss_regularization = tf.reduce_sum(tf.stack(loss_regularization))
57 |
58 | loss = loss + 0.0001 * loss_regularization
59 |
60 |
61 | grads = tape.gradient(loss, network.trainable_variables)
62 | optimizer.apply_gradients(zip(grads, network.trainable_variables))
63 |
64 |
65 | if step % 100 == 0:
66 |
67 | print(step, 'loss:', float(loss), 'loss_regularization:', float(loss_regularization))
68 |
69 |
70 | # evaluate
71 | if step % 500 == 0:
72 | total, total_correct = 0., 0
73 |
74 | for step, (x, y) in enumerate(ds_val):
75 | # [b, 28, 28] => [b, 784]
76 | x = tf.reshape(x, (-1, 28*28))
77 | # [b, 784] => [b, 10]
78 | out = network(x)
79 | # [b, 10] => [b]
80 | pred = tf.argmax(out, axis=1)
81 | pred = tf.cast(pred, dtype=tf.int32)
82 | # bool type
83 | correct = tf.equal(pred, y)
84 | # bool tensor => int tensor => numpy
85 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
86 | total += x.shape[0]
87 |
88 | print(step, 'Evaluate Acc:', total_correct/total)
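89 |
90 | # Note: tf.nn.l2_loss(p) computes sum(p**2) / 2, so the penalty added above is
91 | # 0.0001 * sum_p ||p||^2 / 2, i.e. standard L2 weight decay applied to every
92 | # trainable variable (including the biases).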
--------------------------------------------------------------------------------
/ch09-过拟合/train_evalute_test.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
3 |
4 |
5 | def preprocess(x, y):
6 | """
7 | x is a simple image, not a batch
8 | """
9 | x = tf.cast(x, dtype=tf.float32) / 255.
10 | x = tf.reshape(x, [28*28])
11 | y = tf.cast(y, dtype=tf.int32)
12 | y = tf.one_hot(y, depth=10)
13 | return x,y
14 |
15 |
16 | batchsz = 128
17 | (x, y), (x_test, y_test) = datasets.mnist.load_data()
18 | print('datasets:', x.shape, y.shape, x.min(), x.max())
19 |
20 |
21 |
22 | idx = tf.range(60000)
23 | idx = tf.random.shuffle(idx)
24 | x_train, y_train = tf.gather(x, idx[:50000]), tf.gather(y, idx[:50000])
25 | x_val, y_val = tf.gather(x, idx[-10000:]) , tf.gather(y, idx[-10000:])
26 | print(x_train.shape, y_train.shape, x_val.shape, y_val.shape)
27 | db_train = tf.data.Dataset.from_tensor_slices((x_train,y_train))
28 | db_train = db_train.map(preprocess).shuffle(50000).batch(batchsz)
29 |
30 | db_val = tf.data.Dataset.from_tensor_slices((x_val,y_val))
31 | db_val = db_val.map(preprocess).shuffle(10000).batch(batchsz)
32 |
33 |
34 |
35 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
36 | db_test = db_test.map(preprocess).batch(batchsz)
37 |
38 | sample = next(iter(db_train))
39 | print(sample[0].shape, sample[1].shape)
40 |
41 |
42 | network = Sequential([layers.Dense(256, activation='relu'),
43 | layers.Dense(128, activation='relu'),
44 | layers.Dense(64, activation='relu'),
45 | layers.Dense(32, activation='relu'),
46 | layers.Dense(10)])
47 | network.build(input_shape=(None, 28*28))
48 | network.summary()
49 |
50 |
51 |
52 |
53 | network.compile(optimizer=optimizers.Adam(lr=0.01),
54 | loss=tf.losses.CategoricalCrossentropy(from_logits=True),
55 | metrics=['accuracy']
56 | )
57 |
58 | network.fit(db_train, epochs=6, validation_data=db_val, validation_freq=2)
59 |
60 | print('Test performance:')
61 | network.evaluate(db_test)
62 |
63 |
64 | sample = next(iter(db_test))
65 | x = sample[0]
66 | y = sample[1] # one-hot
67 | pred = network.predict(x) # [b, 10]
68 | # convert back to number
69 | y = tf.argmax(y, axis=1)
70 | pred = tf.argmax(pred, axis=1)
71 |
72 | print(pred)
73 | print(y)
74 |
--------------------------------------------------------------------------------
/ch09-过拟合/交叉验证.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/交叉验证.pdf
--------------------------------------------------------------------------------
/ch09-过拟合/学习率与动量.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/学习率与动量.pdf
--------------------------------------------------------------------------------
/ch09-过拟合/过拟合与欠拟合.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/过拟合与欠拟合.pdf
--------------------------------------------------------------------------------
/ch10-卷积神经网络/BatchNorm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/BatchNorm.pdf
--------------------------------------------------------------------------------
/ch10-卷积神经网络/CIFAR与VGG实战.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/CIFAR与VGG实战.pdf
--------------------------------------------------------------------------------
/ch10-卷积神经网络/ResNet与DenseNet.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/ResNet与DenseNet.pdf
--------------------------------------------------------------------------------
/ch10-卷积神经网络/ResNet实战.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/ResNet实战.pdf
--------------------------------------------------------------------------------
/ch10-卷积神经网络/bn_main.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | from tensorflow import keras
4 | from tensorflow.keras import layers, optimizers
5 |
6 |
7 | # 2 images with 4x4 size, 3 channels
8 | # we explicitly enforce the mean and stddev to N(1, 0.5)
9 | x = tf.random.normal([2,4,4,3], mean=1.,stddev=0.5)
10 |
11 | net = layers.BatchNormalization(axis=-1, center=True, scale=True,
12 | trainable=True)
13 |
14 | out = net(x)
15 | print('forward in test mode:', net.variables)
16 |
17 |
18 | out = net(x, training=True)
19 | print('forward in train mode(1 step):', net.variables)
20 |
21 | for i in range(100):
22 | out = net(x, training=True)
23 | print('forward in train mode(100 steps):', net.variables)
24 |
25 |
26 | optimizer = optimizers.SGD(lr=1e-2)
27 | for i in range(10):
28 | with tf.GradientTape() as tape:
29 | out = net(x, training=True)
30 | loss = tf.reduce_mean(tf.pow(out,2)) - 1
31 |
32 | grads = tape.gradient(loss, net.trainable_variables)
33 | optimizer.apply_gradients(zip(grads, net.trainable_variables))
34 | print('backward(10 steps):', net.variables)
35 |
36 |
37 |
38 |
39 |
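40 | # What the prints show: calls with training=True update the non-trainable
41 | # moving_mean / moving_variance toward the batch statistics (roughly 1 and 0.25
42 | # here), while gamma and beta only change when the optimizer applies gradients.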
--------------------------------------------------------------------------------
/ch10-卷积神经网络/cifar10_train.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import layers, optimizers, datasets, Sequential
3 | import os
4 |
5 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
6 | tf.random.set_seed(2345)
7 |
8 | conv_layers = [ # 5 units of conv + max pooling
9 | # unit 1
10 | layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
11 | layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
12 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
13 |
14 | # unit 2
15 | layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
16 | layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
17 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
18 |
19 | # unit 3
20 | layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
21 | layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
22 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
23 |
24 | # unit 4
25 | layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
26 | layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
27 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),
28 |
29 | # unit 5
30 | layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
31 | layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
32 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same')
33 |
34 | ]
35 |
36 |
37 |
38 | def preprocess(x, y):
39 | # scale from [0, 255] to [-1, 1]
40 | x = 2*tf.cast(x, dtype=tf.float32) / 255.-1
41 | y = tf.cast(y, dtype=tf.int32)
42 | return x,y
43 |
44 |
45 | (x,y), (x_test, y_test) = datasets.cifar10.load_data()
46 | y = tf.squeeze(y, axis=1)
47 | y_test = tf.squeeze(y_test, axis=1)
48 | print(x.shape, y.shape, x_test.shape, y_test.shape)
49 |
50 |
51 | train_db = tf.data.Dataset.from_tensor_slices((x,y))
52 | train_db = train_db.shuffle(1000).map(preprocess).batch(128)
53 |
54 | test_db = tf.data.Dataset.from_tensor_slices((x_test,y_test))
55 | test_db = test_db.map(preprocess).batch(64)
56 |
57 | sample = next(iter(train_db))
58 | print('sample:', sample[0].shape, sample[1].shape,
59 | tf.reduce_min(sample[0]), tf.reduce_max(sample[0]))
60 |
61 |
62 | def main():
63 |
64 | # [b, 32, 32, 3] => [b, 1, 1, 512]
65 | conv_net = Sequential(conv_layers)
66 |
67 | fc_net = Sequential([
68 | layers.Dense(256, activation=tf.nn.relu),
69 | layers.Dense(128, activation=tf.nn.relu),
70 | layers.Dense(10, activation=None),
71 | ])
72 |
73 | conv_net.build(input_shape=[None, 32, 32, 3])
74 | fc_net.build(input_shape=[None, 512])
75 | conv_net.summary()
76 | fc_net.summary()
77 | optimizer = optimizers.Adam(lr=1e-4)
78 |
79 | # [1, 2] + [3, 4] => [1, 2, 3, 4]
80 | variables = conv_net.trainable_variables + fc_net.trainable_variables
81 |
82 | for epoch in range(50):
83 |
84 | for step, (x,y) in enumerate(train_db):
85 |
86 | with tf.GradientTape() as tape:
87 | # [b, 32, 32, 3] => [b, 1, 1, 512]
88 | out = conv_net(x)
89 | # flatten, => [b, 512]
90 | out = tf.reshape(out, [-1, 512])
91 | # [b, 512] => [b, 10]
92 | logits = fc_net(out)
93 | # [b] => [b, 10]
94 | y_onehot = tf.one_hot(y, depth=10)
95 | # compute loss
96 | loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True)
97 | loss = tf.reduce_mean(loss)
98 |
99 | grads = tape.gradient(loss, variables)
100 | optimizer.apply_gradients(zip(grads, variables))
101 |
102 | if step %100 == 0:
103 | print(epoch, step, 'loss:', float(loss))
104 |
105 |
106 |
107 | total_num = 0
108 | total_correct = 0
109 | for x,y in test_db:
110 |
111 | out = conv_net(x)
112 | out = tf.reshape(out, [-1, 512])
113 | logits = fc_net(out)
114 | prob = tf.nn.softmax(logits, axis=1)
115 | pred = tf.argmax(prob, axis=1)
116 | pred = tf.cast(pred, dtype=tf.int32)
117 |
118 | correct = tf.cast(tf.equal(pred, y), dtype=tf.int32)
119 | correct = tf.reduce_sum(correct)
120 |
121 | total_num += x.shape[0]
122 | total_correct += int(correct)
123 |
124 | acc = total_correct / total_num
125 | print(epoch, 'acc:', acc)
126 |
127 |
128 |
129 | if __name__ == '__main__':
130 | main()
131 |
--------------------------------------------------------------------------------
/ch10-卷积神经网络/resnet.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow import keras
3 | from tensorflow.keras import layers, Sequential
4 |
5 |
6 |
7 | class BasicBlock(layers.Layer):
8 | # residual block
9 | def __init__(self, filter_num, stride=1):
10 | super(BasicBlock, self).__init__()
11 | # first convolution unit
12 | self.conv1 = layers.Conv2D(filter_num, (3, 3), strides=stride, padding='same')
13 | self.bn1 = layers.BatchNormalization()
14 | self.relu = layers.Activation('relu')
15 | # second convolution unit
16 | self.conv2 = layers.Conv2D(filter_num, (3, 3), strides=1, padding='same')
17 | self.bn2 = layers.BatchNormalization()
18 |
19 | if stride != 1:# match the output shape with a 1x1 convolution
20 | self.downsample = Sequential()
21 | self.downsample.add(layers.Conv2D(filter_num, (1, 1), strides=stride))
22 | else:# shapes already match, use an identity shortcut
23 | self.downsample = lambda x:x
24 |
25 | def call(self, inputs, training=None):
26 |
27 | # [b, h, w, c], pass through the first convolution unit
28 | out = self.conv1(inputs)
29 | out = self.bn1(out)
30 | out = self.relu(out)
31 | # pass through the second convolution unit
32 | out = self.conv2(out)
33 | out = self.bn2(out)
34 | # pass through the identity (shortcut) branch
35 | identity = self.downsample(inputs)
36 | # add the outputs of the two branches
37 | output = layers.add([out, identity])
38 | output = tf.nn.relu(output) # activation function
39 |
40 | return output
41 |
42 |
43 | class ResNet(keras.Model):
44 | # generic ResNet implementation
45 | def __init__(self, layer_dims, num_classes=10): # [2, 2, 2, 2]
46 | super(ResNet, self).__init__()
47 | # stem network (input preprocessing)
48 | self.stem = Sequential([layers.Conv2D(64, (3, 3), strides=(1, 1)),
49 | layers.BatchNormalization(),
50 | layers.Activation('relu'),
51 | layers.MaxPool2D(pool_size=(2, 2), strides=(1, 1), padding='same')
52 | ])
53 | # stack 4 stages; each stage contains several BasicBlocks and uses its own stride
54 | self.layer1 = self.build_resblock(64, layer_dims[0])
55 | self.layer2 = self.build_resblock(128, layer_dims[1], stride=2)
56 | self.layer3 = self.build_resblock(256, layer_dims[2], stride=2)
57 | self.layer4 = self.build_resblock(512, layer_dims[3], stride=2)
58 |
59 | # reduce the feature map to 1x1 with a pooling layer
60 | self.avgpool = layers.GlobalAveragePooling2D()
61 | # final fully-connected layer for classification
62 | self.fc = layers.Dense(num_classes)
63 |
64 | def call(self, inputs, training=None):
65 | # pass through the stem network
66 | x = self.stem(inputs)
67 | # pass through the 4 residual stages in turn
68 | x = self.layer1(x)
69 | x = self.layer2(x)
70 | x = self.layer3(x)
71 | x = self.layer4(x)
72 |
73 | # global average pooling
74 | x = self.avgpool(x)
75 | # fully-connected classifier
76 | x = self.fc(x)
77 |
78 | return x
79 |
80 |
81 |
82 | def build_resblock(self, filter_num, blocks, stride=1):
83 | # helper: stack 'blocks' BasicBlocks, each with filter_num filters
84 | res_blocks = Sequential()
85 | # only the first BasicBlock may have stride != 1, performing the downsampling
86 | res_blocks.add(BasicBlock(filter_num, stride))
87 |
88 | for _ in range(1, blocks):# all remaining BasicBlocks use stride 1
89 | res_blocks.add(BasicBlock(filter_num, stride=1))
90 |
91 | return res_blocks
92 |
93 |
94 | def resnet18():
95 | # different ResNets are built by changing the number of BasicBlocks in each stage
96 | return ResNet([2, 2, 2, 2])
97 |
98 |
99 | def resnet34():
100 | # different ResNets are built by changing the number of BasicBlocks in each stage
101 | return ResNet([3, 4, 6, 3])
--------------------------------------------------------------------------------
/ch10-卷积神经网络/resnet18_train.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow.keras import layers, optimizers, datasets, Sequential
3 | import os
4 | from resnet import resnet18
5 |
6 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
7 | tf.random.set_seed(2345)
8 |
9 |
10 |
11 |
12 |
13 | def preprocess(x, y):
14 | # 将数据映射到-1~1
15 | x = 2*tf.cast(x, dtype=tf.float32) / 255. - 1
16 | y = tf.cast(y, dtype=tf.int32) # 类型转换
17 | return x,y
18 |
19 |
20 | (x,y), (x_test, y_test) = datasets.cifar10.load_data() # 加载数据集
21 | y = tf.squeeze(y, axis=1) # 删除不必要的维度
22 | y_test = tf.squeeze(y_test, axis=1) # 删除不必要的维度
23 | print(x.shape, y.shape, x_test.shape, y_test.shape)
24 |
25 |
26 | train_db = tf.data.Dataset.from_tensor_slices((x,y)) # 构建训练集
27 | # 随机打散,预处理,批量化
28 | train_db = train_db.shuffle(1000).map(preprocess).batch(512)
29 |
30 | test_db = tf.data.Dataset.from_tensor_slices((x_test,y_test)) #构建测试集
31 | # 随机打散,预处理,批量化
32 | test_db = test_db.map(preprocess).batch(512)
33 | # 采样一个样本
34 | sample = next(iter(train_db))
35 | print('sample:', sample[0].shape, sample[1].shape,
36 | tf.reduce_min(sample[0]), tf.reduce_max(sample[0]))
37 |
38 |
39 | def main():
40 |
41 | # [b, 32, 32, 3] => [b, 1, 1, 512]
42 | model = resnet18() # ResNet18网络
43 | model.build(input_shape=(None, 32, 32, 3))
44 | model.summary() # 统计网络参数
45 | optimizer = optimizers.Adam(lr=1e-4) # 构建优化器
46 |
47 | for epoch in range(100): # 训练epoch
48 |
49 | for step, (x,y) in enumerate(train_db):
50 |
51 | with tf.GradientTape() as tape:
52 | # [b, 32, 32, 3] => [b, 10],前向传播
 53 |                 logits = model(x, training=True)  # run BatchNorm layers in training mode
54 | # [b] => [b, 10],one-hot编码
55 | y_onehot = tf.one_hot(y, depth=10)
56 | # 计算交叉熵
57 | loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True)
58 | loss = tf.reduce_mean(loss)
59 | # 计算梯度信息
60 | grads = tape.gradient(loss, model.trainable_variables)
61 | # 更新网络参数
62 | optimizer.apply_gradients(zip(grads, model.trainable_variables))
63 |
64 | if step %50 == 0:
65 | print(epoch, step, 'loss:', float(loss))
66 |
67 |
68 |
69 | total_num = 0
70 | total_correct = 0
71 | for x,y in test_db:
72 |
73 | logits = model(x)
74 | prob = tf.nn.softmax(logits, axis=1)
75 | pred = tf.argmax(prob, axis=1)
76 | pred = tf.cast(pred, dtype=tf.int32)
77 |
78 | correct = tf.cast(tf.equal(pred, y), dtype=tf.int32)
79 | correct = tf.reduce_sum(correct)
80 |
81 | total_num += x.shape[0]
82 | total_correct += int(correct)
83 |
84 | acc = total_correct / total_num
85 | print(epoch, 'acc:', acc)
86 |
87 |
88 |
89 | if __name__ == '__main__':
90 | main()
91 |
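To keep the result of a long run, the trained weights can be persisted after evaluation and restored later. A minimal sketch, assuming the `model` object from `main()` above (the checkpoint name is only an example):

# save only the weights of the subclassed model; the architecture is rebuilt from code
model.save_weights('resnet18_cifar10.ckpt')

# restore: rebuild the same architecture, then load the weights
model2 = resnet18()
model2.build(input_shape=(None, 32, 32, 3))
model2.load_weights('resnet18_cifar10.ckpt')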
--------------------------------------------------------------------------------
/ch10-卷积神经网络/什么是卷积.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/什么是卷积.pdf
--------------------------------------------------------------------------------
/ch10-卷积神经网络/卷积神经网络.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/卷积神经网络.pdf
--------------------------------------------------------------------------------
/ch10-卷积神经网络/池化与采样.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/池化与采样.pdf
--------------------------------------------------------------------------------
/ch10-卷积神经网络/经典卷积网络.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/经典卷积网络.pdf
--------------------------------------------------------------------------------
/ch11-循环神经网络/LSTM.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/LSTM.pdf
--------------------------------------------------------------------------------
/ch11-循环神经网络/LSTM实战.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/LSTM实战.pdf
--------------------------------------------------------------------------------
/ch11-循环神经网络/RNN Layer使用.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/RNN Layer使用.pdf
--------------------------------------------------------------------------------
/ch11-循环神经网络/nb.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import tensorflow as tf
3 | from tensorflow import keras
4 | from tensorflow.keras import layers
5 |
6 | import matplotlib.pyplot as plt
7 |
8 | #%%
9 | x = tf.range(10)
10 | x = tf.random.shuffle(x)
11 | # 创建共10个单词,每个单词用长度为4的向量表示的层
12 | net = layers.Embedding(10, 4)
13 | out = net(x)
14 |
15 | out
16 | #%%
17 | net.embeddings
18 | net.embeddings.trainable
19 | net.trainable = False
20 | #%%
21 | # 从预训练模型中加载词向量表
22 | embed_glove = load_embed('glove.6B.50d.txt')
23 | # 直接利用预训练的词向量表初始化Embedding层
24 | net.set_weights([embed_glove])
25 | #%%
26 | cell = layers.SimpleRNNCell(3)
27 | cell.build(input_shape=(None,4))
28 | cell.trainable_variables
29 |
30 |
31 | #%%
32 | # 初始化状态向量
33 | h0 = [tf.zeros([4, 64])]
34 | x = tf.random.normal([4, 80, 100])
35 | xt = x[:,0,:]
36 | # 构建输入特征f=100,序列长度s=80,状态长度=64的Cell
37 | cell = layers.SimpleRNNCell(64)
38 | out, h1 = cell(xt, h0) # 前向计算
39 | print(out.shape, h1[0].shape)
40 | print(id(out), id(h1[0]))
41 |
42 |
43 | #%%
44 | h = h0
45 | # 在序列长度的维度解开输入,得到xt:[b,f]
46 | for xt in tf.unstack(x, axis=1):
47 | out, h = cell(xt, h) # 前向计算
48 | # 最终输出可以聚合每个时间戳上的输出,也可以只取最后时间戳的输出
49 | out = out
50 |
51 | #%%
52 | x = tf.random.normal([4,80,100])
53 | xt = x[:,0,:] # 取第一个时间戳的输入x0
54 | # 构建2个Cell,先cell0,后cell1
55 | cell0 = layers.SimpleRNNCell(64)
56 | cell1 = layers.SimpleRNNCell(64)
57 | h0 = [tf.zeros([4,64])] # cell0的初始状态向量
58 | h1 = [tf.zeros([4,64])] # cell1的初始状态向量
59 |
60 | out0, h0 = cell0(xt, h0)
61 | out1, h1 = cell1(out0, h1)
62 |
63 |
64 | #%%
65 | for xt in tf.unstack(x, axis=1):
 66 |     # xt is the input to cell0, whose output is out0
67 | out0, h0 = cell0(xt, h0)
68 | # 上一个cell的输出out0作为本cell的输入
69 | out1, h1 = cell1(out0, h1)
70 |
71 |
72 | #%%
73 | print(x.shape)
74 | # 保存上一层的所有时间戳上面的输出
75 | middle_sequences = []
76 | # 计算第一层的所有时间戳上的输出,并保存
77 | for xt in tf.unstack(x, axis=1):
78 | out0, h0 = cell0(xt, h0)
79 | middle_sequences.append(out0)
80 | # 计算第二层的所有时间戳上的输出
81 | # 如果不是末层,需要保存所有时间戳上面的输出
82 | for xt in middle_sequences:
83 | out1, h1 = cell1(xt, h1)
84 |
85 |
86 | #%%
87 | layer = layers.SimpleRNN(64)
88 | x = tf.random.normal([4, 80, 100])
89 | out = layer(x)
90 | out.shape
91 |
92 | #%%
93 | layer = layers.SimpleRNN(64,return_sequences=True)
94 | out = layer(x)
95 | out
96 |
97 | #%%
98 | net = keras.Sequential([ # 构建2层RNN网络
99 | # 除最末层外,都需要返回所有时间戳的输出
100 | layers.SimpleRNN(64, return_sequences=True),
101 | layers.SimpleRNN(64),
102 | ])
103 | out = net(x)
104 |
105 |
106 |
107 | #%%
108 | W = tf.ones([2,2]) # 任意创建某矩阵
109 | eigenvalues = tf.linalg.eigh(W)[0] # 计算特征值
110 | eigenvalues
111 | #%%
112 | val = [W]
113 | for i in range(10): # 矩阵相乘n次方
114 |     val.append(val[-1]@W)  # multiply the running product by W once more
115 | # 计算L2范数
116 | norm = list(map(lambda x:tf.norm(x).numpy(),val))
117 | plt.plot(range(1,12),norm)
118 | plt.xlabel('n times')
119 | plt.ylabel('L2-norm')
120 | plt.savefig('w_n_times_1.svg')
121 | #%%
122 | W = tf.ones([2,2])*0.4 # 任意创建某矩阵
123 | eigenvalues = tf.linalg.eigh(W)[0] # 计算特征值
124 | print(eigenvalues)
125 | val = [W]
126 | for i in range(10):
127 |     val.append(val[-1]@W)
128 | norm = list(map(lambda x:tf.norm(x).numpy(),val))
129 | plt.plot(range(1,12),norm)
130 | plt.xlabel('n times')
131 | plt.ylabel('L2-norm')
132 | plt.savefig('w_n_times_0.svg')
133 | #%%
134 | a=tf.random.uniform([2,2])
135 | tf.clip_by_value(a,0.4,0.6) # 梯度值裁剪
136 |
137 | #%%
138 |
139 |
140 |
141 |
142 | #%%
143 | a=tf.random.uniform([2,2]) * 5
144 | # 按范数方式裁剪
145 | b = tf.clip_by_norm(a, 5)
146 | tf.norm(a),tf.norm(b)
147 |
148 | #%%
149 | w1=tf.random.normal([3,3]) # 创建梯度张量1
150 | w2=tf.random.normal([3,3]) # 创建梯度张量2
151 | # 计算global norm
152 | global_norm=tf.math.sqrt(tf.norm(w1)**2+tf.norm(w2)**2)
153 | # 根据global norm和max norm=2裁剪
154 | (ww1,ww2),global_norm=tf.clip_by_global_norm([w1,w2],2)
155 | # 计算裁剪后的张量组的global norm
156 | global_norm2 = tf.math.sqrt(tf.norm(ww1)**2+tf.norm(ww2)**2)
157 | print(global_norm, global_norm2)
158 |
159 | #%%
160 | with tf.GradientTape() as tape:
161 | logits = model(x) # 前向传播
162 | loss = criteon(y, logits) # 误差计算
163 | # 计算梯度值
164 | grads = tape.gradient(loss, model.trainable_variables)
165 | grads, _ = tf.clip_by_global_norm(grads, 25) # 全局梯度裁剪
166 | # 利用裁剪后的梯度张量更新参数
167 | optimizer.apply_gradients(zip(grads, model.trainable_variables))
168 |
169 | #%%
170 | x = tf.random.normal([2,80,100])
171 | xt = x[:,0,:] # 得到一个时间戳的输入
172 | cell = layers.LSTMCell(64) # 创建Cell
173 | # 初始化状态和输出List,[h,c]
174 | state = [tf.zeros([2,64]),tf.zeros([2,64])]
175 | out, state = cell(xt, state) # 前向计算
176 | id(out),id(state[0]),id(state[1])
177 |
178 |
179 | #%%
180 | net = layers.LSTM(4)
181 | net.build(input_shape=(None,5,3))
182 | net.trainable_variables
183 | #%%
184 |
185 | net = layers.GRU(4)
186 | net.build(input_shape=(None,5,3))
187 | net.trainable_variables
188 |
189 | #%%
190 | # 初始化状态向量
191 | h = [tf.zeros([2,64])]
192 | cell = layers.GRUCell(64) # 新建GRU Cell
193 | for xt in tf.unstack(x, axis=1):
194 | out, h = cell(xt, h)
195 | out.shape
196 |
197 |
198 | #%%
199 |
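The cell above calls `load_embed('glove.6B.50d.txt')`, which is not defined anywhere in this chapter. One plausible implementation is sketched below as an assumption (the signature with `word_index`, `vocab_size` and `embedding_dim` is illustrative, not the book's code); it parses a GloVe text file into a numpy matrix that can be handed to `set_weights`:

import numpy as np

def load_embed(path, word_index, vocab_size, embedding_dim):
    # hypothetical helper: build a [vocab_size, embedding_dim] table from a GloVe text file;
    # words that do not appear in the GloVe file keep all-zero rows
    glove = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            values = line.split()
            glove[values[0]] = np.asarray(values[1:], dtype='float32')
    matrix = np.zeros((vocab_size, embedding_dim), dtype='float32')
    for word, idx in word_index.items():
        if idx < vocab_size and word in glove:
            matrix[idx] = glove[word]
    return matrix

Note that the toy `layers.Embedding(10, 4)` above would require a matching [10, 4] table; in practice a helper like this is used with a vocabulary-sized Embedding layer such as the ones in the sentiment-analysis scripts.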
--------------------------------------------------------------------------------
/ch11-循环神经网络/pretrained.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | import numpy as np
4 | import tensorflow as tf
  5 | from tensorflow import keras
  6 | from tensorflow.keras import layers
  7 | from tensorflow.keras.preprocessing.text import Tokenizer
  8 | from tensorflow.keras.preprocessing.sequence import pad_sequences
  9 | from tensorflow.keras.layers import Embedding, Input, Conv1D, MaxPooling1D, GlobalMaxPooling1D, Dense
 10 | from tensorflow.keras.models import Model
 11 | from tensorflow.keras.initializers import Constant
12 | BASE_DIR = ''
13 | GLOVE_DIR = os.path.join(BASE_DIR, 'glove.6B')
14 | TEXT_DATA_DIR = os.path.join(BASE_DIR, '20_newsgroup')
15 | MAX_SEQUENCE_LENGTH = 1000
16 | MAX_NUM_WORDS = 20000
17 | EMBEDDING_DIM = 100
18 | VALIDATION_SPLIT = 0.2
19 |
20 | # first, build index mapping words in the embeddings set
21 | # to their embedding vector
22 |
23 | print('Indexing word vectors.')
24 |
25 | embeddings_index = {}
26 | with open(os.path.join(GLOVE_DIR, 'glove.6B.100d.txt')) as f:
27 | for line in f:
28 | values = line.split()
29 | word = values[0]
30 | coefs = np.asarray(values[1:], dtype='float32')
31 | embeddings_index[word] = coefs
32 |
33 | print('Found %s word vectors.' % len(embeddings_index))
34 |
35 | # second, prepare text samples and their labels
36 | print('Processing text dataset')
37 |
38 | texts = [] # list of text samples
39 | labels_index = {} # dictionary mapping label name to numeric id
40 | labels = [] # list of label ids
41 | for name in sorted(os.listdir(TEXT_DATA_DIR)):
42 | path = os.path.join(TEXT_DATA_DIR, name)
43 | if os.path.isdir(path):
44 | label_id = len(labels_index)
45 | labels_index[name] = label_id
46 | for fname in sorted(os.listdir(path)):
47 | if fname.isdigit():
48 | fpath = os.path.join(path, fname)
49 | args = {} if sys.version_info < (3,) else {'encoding': 'latin-1'}
50 | with open(fpath, **args) as f:
51 | t = f.read()
52 | i = t.find('\n\n') # skip header
53 | if 0 < i:
54 | t = t[i:]
55 | texts.append(t)
56 | labels.append(label_id)
57 |
58 | print('Found %s texts.' % len(texts))
59 |
60 | # finally, vectorize the text samples into a 2D integer tensor
61 | tokenizer = Tokenizer(num_words=MAX_NUM_WORDS)
62 | tokenizer.fit_on_texts(texts)
63 | sequences = tokenizer.texts_to_sequences(texts)
64 |
65 | word_index = tokenizer.word_index
66 | print('Found %s unique tokens.' % len(word_index))
67 |
68 | data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
69 |
 70 | labels = keras.utils.to_categorical(np.asarray(labels))  # one-hot encode the label ids
71 | print('Shape of data tensor:', data.shape)
72 | print('Shape of label tensor:', labels.shape)
73 |
74 | # split the data into a training set and a validation set
75 | indices = np.arange(data.shape[0])
76 | np.random.shuffle(indices)
77 | data = data[indices]
78 | labels = labels[indices]
79 | num_validation_samples = int(VALIDATION_SPLIT * data.shape[0])
80 |
81 | x_train = data[:-num_validation_samples]
82 | y_train = labels[:-num_validation_samples]
83 | x_val = data[-num_validation_samples:]
84 | y_val = labels[-num_validation_samples:]
85 |
86 | print('Preparing embedding matrix.')
87 |
88 | # prepare embedding matrix
89 | num_words = min(MAX_NUM_WORDS, len(word_index)) + 1
90 | embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
91 | for word, i in word_index.items():
92 | if i > MAX_NUM_WORDS:
93 | continue
94 | embedding_vector = embeddings_index.get(word)
95 | if embedding_vector is not None:
96 | # words not found in embedding index will be all-zeros.
97 | embedding_matrix[i] = embedding_vector
98 |
99 | # load pre-trained word embeddings into an Embedding layer
100 | # note that we set trainable = False so as to keep the embeddings fixed
101 | embedding_layer = Embedding(num_words,
102 | EMBEDDING_DIM,
103 | embeddings_initializer=Constant(embedding_matrix),
104 | input_length=MAX_SEQUENCE_LENGTH,
105 | trainable=False)
106 |
107 | print('Training model.')
108 |
109 | # train a 1D convnet with global maxpooling
110 | sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
111 | embedded_sequences = embedding_layer(sequence_input)
112 | x = Conv1D(128, 5, activation='relu')(embedded_sequences)
113 | x = MaxPooling1D(5)(x)
114 | x = Conv1D(128, 5, activation='relu')(x)
115 | x = MaxPooling1D(5)(x)
116 | x = Conv1D(128, 5, activation='relu')(x)
117 | x = GlobalMaxPooling1D()(x)
118 | x = Dense(128, activation='relu')(x)
119 | preds = Dense(len(labels_index), activation='softmax')(x)
120 |
121 | model = Model(sequence_input, preds)
122 | model.compile(loss='categorical_crossentropy',
123 | optimizer='rmsprop',
124 | metrics=['acc'])
125 |
126 | model.fit(x_train, y_train,
127 | batch_size=128,
128 | epochs=10,
129 | validation_data=(x_val, y_val))
--------------------------------------------------------------------------------
/ch11-循环神经网络/sentiment_analysis_cell - GRU.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import os
3 | import tensorflow as tf
4 | import numpy as np
5 | from tensorflow import keras
6 | from tensorflow.keras import layers, losses, optimizers, Sequential
7 |
8 |
9 | tf.random.set_seed(22)
10 | np.random.seed(22)
11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
12 | assert tf.__version__.startswith('2.')
13 |
14 | batchsz = 128 # 批量大小
15 | total_words = 10000 # 词汇表大小N_vocab
16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充
17 | embedding_len = 100 # 词向量特征长度f
18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词
19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
20 | print(x_train.shape, len(x_train[0]), y_train.shape)
21 | print(x_test.shape, len(x_test[0]), y_test.shape)
22 | #%%
23 | x_train[0]
24 | #%%
25 | # 数字编码表
26 | word_index = keras.datasets.imdb.get_word_index()
27 | # for k,v in word_index.items():
28 | # print(k,v)
29 | #%%
30 | word_index = {k:(v+3) for k,v in word_index.items()}
 31 | word_index["<PAD>"] = 0
 32 | word_index["<START>"] = 1
 33 | word_index["<UNK>"] = 2  # unknown
 34 | word_index["<UNUSED>"] = 3
35 | # 翻转编码表
36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
37 |
38 | def decode_review(text):
39 | return ' '.join([reverse_word_index.get(i, '?') for i in text])
40 |
41 | decode_review(x_train[8])
42 |
43 | #%%
44 |
45 | # x_train:[b, 80]
46 | # x_test: [b, 80]
47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充
48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch
51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
54 | db_test = db_test.batch(batchsz, drop_remainder=True)
55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
56 | print('x_test shape:', x_test.shape)
57 |
58 | #%%
59 |
60 | class MyRNN(keras.Model):
61 | # Cell方式构建多层网络
62 | def __init__(self, units):
63 | super(MyRNN, self).__init__()
64 | # [b, 64],构建Cell初始化状态向量,重复使用
65 | self.state0 = [tf.zeros([batchsz, units])]
66 | self.state1 = [tf.zeros([batchsz, units])]
67 | # 词向量编码 [b, 80] => [b, 80, 100]
68 | self.embedding = layers.Embedding(total_words, embedding_len,
69 | input_length=max_review_len)
70 | # 构建2个Cell
71 | self.rnn_cell0 = layers.GRUCell(units, dropout=0.5)
72 | self.rnn_cell1 = layers.GRUCell(units, dropout=0.5)
73 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类
74 | # [b, 80, 100] => [b, 64] => [b, 1]
75 | self.outlayer = Sequential([
76 | layers.Dense(units),
77 | layers.Dropout(rate=0.5),
78 | layers.ReLU(),
79 | layers.Dense(1)])
80 |
81 | def call(self, inputs, training=None):
82 | x = inputs # [b, 80]
83 | # embedding: [b, 80] => [b, 80, 100]
84 | x = self.embedding(x)
85 | # rnn cell compute,[b, 80, 100] => [b, 64]
86 | state0 = self.state0
87 | state1 = self.state1
88 | for word in tf.unstack(x, axis=1): # word: [b, 100]
89 | out0, state0 = self.rnn_cell0(word, state0, training)
90 | out1, state1 = self.rnn_cell1(out0, state1, training)
91 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1]
92 | x = self.outlayer(out1, training)
93 | # p(y is pos|x)
94 | prob = tf.sigmoid(x)
95 |
96 | return prob
97 |
98 | def main():
99 | units = 64 # RNN状态向量长度f
100 | epochs = 50 # 训练epochs
101 |
102 | model = MyRNN(units)
103 | # 装配
104 | model.compile(optimizer = optimizers.RMSprop(0.001),
105 | loss = losses.BinaryCrossentropy(),
106 | metrics=['accuracy'])
107 | # 训练和验证
108 | model.fit(db_train, epochs=epochs, validation_data=db_test)
109 | # 测试
110 | model.evaluate(db_test)
111 |
112 |
113 | if __name__ == '__main__':
114 | main()
115 |
--------------------------------------------------------------------------------
/ch11-循环神经网络/sentiment_analysis_cell - LSTM.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import os
3 | import tensorflow as tf
4 | import numpy as np
5 | from tensorflow import keras
6 | from tensorflow.keras import layers, losses, optimizers, Sequential
7 |
8 |
9 | tf.random.set_seed(22)
10 | np.random.seed(22)
11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
12 | assert tf.__version__.startswith('2.')
13 |
14 | batchsz = 128 # 批量大小
15 | total_words = 10000 # 词汇表大小N_vocab
16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充
17 | embedding_len = 100 # 词向量特征长度f
18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词
19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
20 | print(x_train.shape, len(x_train[0]), y_train.shape)
21 | print(x_test.shape, len(x_test[0]), y_test.shape)
22 | #%%
23 | x_train[0]
24 | #%%
25 | # 数字编码表
26 | word_index = keras.datasets.imdb.get_word_index()
27 | # for k,v in word_index.items():
28 | # print(k,v)
29 | #%%
30 | word_index = {k:(v+3) for k,v in word_index.items()}
 31 | word_index["<PAD>"] = 0
 32 | word_index["<START>"] = 1
 33 | word_index["<UNK>"] = 2  # unknown
 34 | word_index["<UNUSED>"] = 3
35 | # 翻转编码表
36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
37 |
38 | def decode_review(text):
39 | return ' '.join([reverse_word_index.get(i, '?') for i in text])
40 |
41 | decode_review(x_train[8])
42 |
43 | #%%
44 |
45 | # x_train:[b, 80]
46 | # x_test: [b, 80]
47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充
48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch
51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
54 | db_test = db_test.batch(batchsz, drop_remainder=True)
55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
56 | print('x_test shape:', x_test.shape)
57 |
58 | #%%
59 |
60 | class MyRNN(keras.Model):
61 | # Cell方式构建多层网络
62 | def __init__(self, units):
63 | super(MyRNN, self).__init__()
64 | # [b, 64],构建Cell初始化状态向量,重复使用
65 | self.state0 = [tf.zeros([batchsz, units]),tf.zeros([batchsz, units])]
66 | self.state1 = [tf.zeros([batchsz, units]),tf.zeros([batchsz, units])]
67 | # 词向量编码 [b, 80] => [b, 80, 100]
68 | self.embedding = layers.Embedding(total_words, embedding_len,
69 | input_length=max_review_len)
70 | # 构建2个Cell
71 | self.rnn_cell0 = layers.LSTMCell(units, dropout=0.5)
72 | self.rnn_cell1 = layers.LSTMCell(units, dropout=0.5)
73 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类
74 | # [b, 80, 100] => [b, 64] => [b, 1]
75 | self.outlayer = Sequential([
76 | layers.Dense(units),
77 | layers.Dropout(rate=0.5),
78 | layers.ReLU(),
79 | layers.Dense(1)])
80 |
81 | def call(self, inputs, training=None):
82 | x = inputs # [b, 80]
83 | # embedding: [b, 80] => [b, 80, 100]
84 | x = self.embedding(x)
85 | # rnn cell compute,[b, 80, 100] => [b, 64]
86 | state0 = self.state0
87 | state1 = self.state1
88 | for word in tf.unstack(x, axis=1): # word: [b, 100]
89 | out0, state0 = self.rnn_cell0(word, state0, training)
90 | out1, state1 = self.rnn_cell1(out0, state1, training)
91 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1]
92 | x = self.outlayer(out1,training)
93 | # p(y is pos|x)
94 | prob = tf.sigmoid(x)
95 |
96 | return prob
97 |
98 | def main():
99 | units = 64 # RNN状态向量长度f
100 | epochs = 50 # 训练epochs
101 |
102 | model = MyRNN(units)
103 | # 装配
104 | model.compile(optimizer = optimizers.RMSprop(0.001),
105 | loss = losses.BinaryCrossentropy(),
106 | metrics=['accuracy'])
107 | # 训练和验证
108 | model.fit(db_train, epochs=epochs, validation_data=db_test)
109 | # 测试
110 | model.evaluate(db_test)
111 |
112 |
113 | if __name__ == '__main__':
114 | main()
115 |
--------------------------------------------------------------------------------
/ch11-循环神经网络/sentiment_analysis_cell.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import os
3 | import tensorflow as tf
4 | import numpy as np
5 | from tensorflow import keras
6 | from tensorflow.keras import layers, losses, optimizers, Sequential
7 |
8 |
9 | tf.random.set_seed(22)
10 | np.random.seed(22)
11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
12 | assert tf.__version__.startswith('2.')
13 |
14 | batchsz = 128 # 批量大小
15 | total_words = 10000 # 词汇表大小N_vocab
16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充
17 | embedding_len = 100 # 词向量特征长度f
18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词
19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
20 | print(x_train.shape, len(x_train[0]), y_train.shape)
21 | print(x_test.shape, len(x_test[0]), y_test.shape)
22 | #%%
23 | x_train[0]
24 | #%%
25 | # 数字编码表
26 | word_index = keras.datasets.imdb.get_word_index()
27 | # for k,v in word_index.items():
28 | # print(k,v)
29 | #%%
30 | word_index = {k:(v+3) for k,v in word_index.items()}
 31 | word_index["<PAD>"] = 0
 32 | word_index["<START>"] = 1
 33 | word_index["<UNK>"] = 2  # unknown
 34 | word_index["<UNUSED>"] = 3
35 | # 翻转编码表
36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
37 |
38 | def decode_review(text):
39 | return ' '.join([reverse_word_index.get(i, '?') for i in text])
40 |
41 | decode_review(x_train[8])
42 |
43 | #%%
44 |
45 | # x_train:[b, 80]
46 | # x_test: [b, 80]
47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充
48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch
51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
54 | db_test = db_test.batch(batchsz, drop_remainder=True)
55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
56 | print('x_test shape:', x_test.shape)
57 |
58 | #%%
59 |
60 | class MyRNN(keras.Model):
61 | # Cell方式构建多层网络
62 | def __init__(self, units):
63 | super(MyRNN, self).__init__()
64 | # [b, 64],构建Cell初始化状态向量,重复使用
65 | self.state0 = [tf.zeros([batchsz, units])]
66 | self.state1 = [tf.zeros([batchsz, units])]
67 | # 词向量编码 [b, 80] => [b, 80, 100]
68 | self.embedding = layers.Embedding(total_words, embedding_len,
69 | input_length=max_review_len)
70 | # 构建2个Cell
71 | self.rnn_cell0 = layers.SimpleRNNCell(units, dropout=0.5)
72 | self.rnn_cell1 = layers.SimpleRNNCell(units, dropout=0.5)
73 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类
74 | # [b, 80, 100] => [b, 64] => [b, 1]
75 | self.outlayer = Sequential([
76 | layers.Dense(units),
77 | layers.Dropout(rate=0.5),
78 | layers.ReLU(),
79 | layers.Dense(1)])
80 |
81 | def call(self, inputs, training=None):
82 | x = inputs # [b, 80]
83 | # embedding: [b, 80] => [b, 80, 100]
84 | x = self.embedding(x)
85 | # rnn cell compute,[b, 80, 100] => [b, 64]
86 | state0 = self.state0
87 | state1 = self.state1
88 | for word in tf.unstack(x, axis=1): # word: [b, 100]
89 | out0, state0 = self.rnn_cell0(word, state0, training)
90 | out1, state1 = self.rnn_cell1(out0, state1, training)
91 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1]
92 | x = self.outlayer(out1, training)
93 | # p(y is pos|x)
94 | prob = tf.sigmoid(x)
95 |
96 | return prob
97 |
98 | def main():
99 | units = 64 # RNN状态向量长度f
100 | epochs = 50 # 训练epochs
101 |
102 | model = MyRNN(units)
103 | # 装配
104 | model.compile(optimizer = optimizers.RMSprop(0.001),
105 | loss = losses.BinaryCrossentropy(),
106 | metrics=['accuracy'])
107 | # 训练和验证
108 | model.fit(db_train, epochs=epochs, validation_data=db_test)
109 | # 测试
110 | model.evaluate(db_test)
111 |
112 |
113 | if __name__ == '__main__':
114 | main()
115 |
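After `fit()`, the Cell-style model can score a padded batch directly; a minimal sketch, assuming the `model` and `db_test` objects defined above (the pre-built state vectors fix the batch size to `batchsz`, so the prediction batch must have the same size):

x_batch, y_batch = next(iter(db_test))              # [128, 80] token ids, [128] labels
prob = model(x_batch, training=False)               # [128, 1] P(review is positive)
pred = tf.cast(tf.squeeze(prob, axis=1) > 0.5, tf.int32)
acc = tf.reduce_mean(tf.cast(tf.equal(pred, tf.cast(y_batch, tf.int32)), tf.float32))
print('batch accuracy:', float(acc))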
--------------------------------------------------------------------------------
/ch11-循环神经网络/sentiment_analysis_layer - GRU.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import os
3 | import tensorflow as tf
4 | import numpy as np
5 | from tensorflow import keras
6 | from tensorflow.keras import layers, losses, optimizers, Sequential
7 |
8 |
9 | tf.random.set_seed(22)
10 | np.random.seed(22)
11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
12 | assert tf.__version__.startswith('2.')
13 |
14 | batchsz = 128 # 批量大小
15 | total_words = 10000 # 词汇表大小N_vocab
16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充
17 | embedding_len = 100 # 词向量特征长度f
18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词
19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
20 | print(x_train.shape, len(x_train[0]), y_train.shape)
21 | print(x_test.shape, len(x_test[0]), y_test.shape)
22 | #%%
23 | x_train[0]
24 | #%%
25 | # 数字编码表
26 | word_index = keras.datasets.imdb.get_word_index()
27 | # for k,v in word_index.items():
28 | # print(k,v)
29 | #%%
30 | word_index = {k:(v+3) for k,v in word_index.items()}
 31 | word_index["<PAD>"] = 0
 32 | word_index["<START>"] = 1
 33 | word_index["<UNK>"] = 2  # unknown
 34 | word_index["<UNUSED>"] = 3
35 | # 翻转编码表
36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
37 |
38 | def decode_review(text):
39 | return ' '.join([reverse_word_index.get(i, '?') for i in text])
40 |
41 | decode_review(x_train[8])
42 |
43 | #%%
44 |
45 | # x_train:[b, 80]
46 | # x_test: [b, 80]
47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充
48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch
51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
54 | db_test = db_test.batch(batchsz, drop_remainder=True)
55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
56 | print('x_test shape:', x_test.shape)
57 |
58 | #%%
59 |
60 | class MyRNN(keras.Model):
61 | # Cell方式构建多层网络
62 | def __init__(self, units):
63 | super(MyRNN, self).__init__()
64 | # 词向量编码 [b, 80] => [b, 80, 100]
65 | self.embedding = layers.Embedding(total_words, embedding_len,
66 | input_length=max_review_len)
67 | # 构建RNN
68 | self.rnn = keras.Sequential([
69 | layers.GRU(units, dropout=0.5, return_sequences=True),
70 | layers.GRU(units, dropout=0.5)
71 | ])
72 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类
73 | # [b, 80, 100] => [b, 64] => [b, 1]
74 | self.outlayer = Sequential([
75 | layers.Dense(32),
76 | layers.Dropout(rate=0.5),
77 | layers.ReLU(),
78 | layers.Dense(1)])
79 |
80 | def call(self, inputs, training=None):
81 | x = inputs # [b, 80]
82 | # embedding: [b, 80] => [b, 80, 100]
83 | x = self.embedding(x)
84 | # rnn cell compute,[b, 80, 100] => [b, 64]
85 | x = self.rnn(x)
86 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1]
87 | x = self.outlayer(x,training)
88 | # p(y is pos|x)
89 | prob = tf.sigmoid(x)
90 |
91 | return prob
92 |
93 | def main():
94 | units = 32 # RNN状态向量长度f
95 | epochs = 50 # 训练epochs
96 |
97 | model = MyRNN(units)
98 | # 装配
99 | model.compile(optimizer = optimizers.Adam(0.001),
100 | loss = losses.BinaryCrossentropy(),
101 | metrics=['accuracy'])
102 | # 训练和验证
103 | model.fit(db_train, epochs=epochs, validation_data=db_test)
104 | # 测试
105 | model.evaluate(db_test)
106 |
107 |
108 | if __name__ == '__main__':
109 | main()
110 |
--------------------------------------------------------------------------------
/ch11-循环神经网络/sentiment_analysis_layer - LSTM - pretrained.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import os
3 | import tensorflow as tf
4 | import numpy as np
5 | from tensorflow import keras
6 | from tensorflow.keras import layers, losses, optimizers, Sequential
7 |
8 |
9 | tf.random.set_seed(22)
10 | np.random.seed(22)
11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
12 | assert tf.__version__.startswith('2.')
13 |
14 | batchsz = 128 # 批量大小
15 | total_words = 10000 # 词汇表大小N_vocab
16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充
17 | embedding_len = 100 # 词向量特征长度f
18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词
19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
20 | print(x_train.shape, len(x_train[0]), y_train.shape)
21 | print(x_test.shape, len(x_test[0]), y_test.shape)
22 | #%%
23 | x_train[0]
24 | #%%
25 | # 数字编码表
26 | word_index = keras.datasets.imdb.get_word_index()
27 | # for k,v in word_index.items():
28 | # print(k,v)
29 | #%%
30 | word_index = {k:(v+3) for k,v in word_index.items()}
 31 | word_index["<PAD>"] = 0
 32 | word_index["<START>"] = 1
 33 | word_index["<UNK>"] = 2  # unknown
 34 | word_index["<UNUSED>"] = 3
35 | # 翻转编码表
36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
37 |
38 | def decode_review(text):
39 | return ' '.join([reverse_word_index.get(i, '?') for i in text])
40 |
41 | decode_review(x_train[8])
42 |
43 | #%%
44 | print('Indexing word vectors.')
45 | embeddings_index = {}
46 | GLOVE_DIR = r'C:\Users\z390\Downloads\glove6b50dtxt'
47 | with open(os.path.join(GLOVE_DIR, 'glove.6B.100d.txt'),encoding='utf-8') as f:
48 | for line in f:
49 | values = line.split()
50 | word = values[0]
51 | coefs = np.asarray(values[1:], dtype='float32')
52 | embeddings_index[word] = coefs
53 |
54 | print('Found %s word vectors.' % len(embeddings_index))
55 | #%%
56 | len(embeddings_index.keys())
57 | len(word_index.keys())
58 | #%%
59 | MAX_NUM_WORDS = total_words
60 | # prepare embedding matrix
61 | num_words = min(MAX_NUM_WORDS, len(word_index))
62 | embedding_matrix = np.zeros((num_words, embedding_len))
63 | applied_vec_count = 0
64 | for word, i in word_index.items():
65 | if i >= MAX_NUM_WORDS:
66 | continue
67 | embedding_vector = embeddings_index.get(word)
68 | # print(word,embedding_vector)
69 | if embedding_vector is not None:
70 | # words not found in embedding index will be all-zeros.
71 | embedding_matrix[i] = embedding_vector
72 | applied_vec_count += 1
73 | print(applied_vec_count, embedding_matrix.shape)
74 |
75 | #%%
76 | # x_train:[b, 80]
77 | # x_test: [b, 80]
78 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充
79 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
80 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
81 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch
82 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
83 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
84 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
85 | db_test = db_test.batch(batchsz, drop_remainder=True)
86 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
87 | print('x_test shape:', x_test.shape)
88 |
89 | #%%
90 |
91 | class MyRNN(keras.Model):
92 | # Cell方式构建多层网络
93 | def __init__(self, units):
94 | super(MyRNN, self).__init__()
95 | # 词向量编码 [b, 80] => [b, 80, 100]
96 | self.embedding = layers.Embedding(total_words, embedding_len,
97 | input_length=max_review_len,
98 | trainable=False)
99 | self.embedding.build(input_shape=(None,max_review_len))
100 | # self.embedding.set_weights([embedding_matrix])
101 | # 构建RNN
102 | self.rnn = keras.Sequential([
103 | layers.LSTM(units, dropout=0.5, return_sequences=True),
104 | layers.LSTM(units, dropout=0.5)
105 | ])
106 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类
107 | # [b, 80, 100] => [b, 64] => [b, 1]
108 | self.outlayer = Sequential([
109 | layers.Dense(32),
110 | layers.Dropout(rate=0.5),
111 | layers.ReLU(),
112 | layers.Dense(1)])
113 |
114 | def call(self, inputs, training=None):
115 | x = inputs # [b, 80]
116 | # embedding: [b, 80] => [b, 80, 100]
117 | x = self.embedding(x)
118 | # rnn cell compute,[b, 80, 100] => [b, 64]
119 | x = self.rnn(x)
120 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1]
121 | x = self.outlayer(x,training)
122 | # p(y is pos|x)
123 | prob = tf.sigmoid(x)
124 |
125 | return prob
126 |
127 | def main():
128 | units = 512 # RNN状态向量长度f
129 | epochs = 50 # 训练epochs
130 |
131 | model = MyRNN(units)
132 | # 装配
133 | model.compile(optimizer = optimizers.Adam(0.001),
134 | loss = losses.BinaryCrossentropy(),
135 | metrics=['accuracy'])
136 | # 训练和验证
137 | model.fit(db_train, epochs=epochs, validation_data=db_test)
138 | # 测试
139 | model.evaluate(db_test)
140 |
141 |
142 | if __name__ == '__main__':
143 | main()
144 |
145 |
146 | #%%
147 |
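In this script `embedding_matrix` is built from GloVe, but the `set_weights` call in `__init__` is commented out, so the frozen Embedding layer keeps its random initialization. If the pretrained table is actually meant to be used, the weights have to be copied in after the layer is built; a minimal sketch under that assumption:

# inside MyRNN.__init__, right after self.embedding.build(input_shape=(None, max_review_len)):
self.embedding.set_weights([embedding_matrix])   # copy the [total_words, embedding_len] GloVe table

With `trainable=False` already set on the layer, the copied table then stays frozen during `fit()`.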
--------------------------------------------------------------------------------
/ch11-循环神经网络/sentiment_analysis_layer - LSTM.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import os
3 | import tensorflow as tf
4 | import numpy as np
5 | from tensorflow import keras
6 | from tensorflow.keras import layers, losses, optimizers, Sequential
7 |
8 |
9 | tf.random.set_seed(22)
10 | np.random.seed(22)
11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
12 | assert tf.__version__.startswith('2.')
13 |
14 | batchsz = 128 # 批量大小
15 | total_words = 10000 # 词汇表大小N_vocab
16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充
17 | embedding_len = 100 # 词向量特征长度f
18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词
19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
20 | print(x_train.shape, len(x_train[0]), y_train.shape)
21 | print(x_test.shape, len(x_test[0]), y_test.shape)
22 | #%%
23 | x_train[0]
24 | #%%
25 | # 数字编码表
26 | word_index = keras.datasets.imdb.get_word_index()
27 | # for k,v in word_index.items():
28 | # print(k,v)
29 | #%%
30 | word_index = {k:(v+3) for k,v in word_index.items()}
 31 | word_index["<PAD>"] = 0
 32 | word_index["<START>"] = 1
 33 | word_index["<UNK>"] = 2  # unknown
 34 | word_index["<UNUSED>"] = 3
35 | # 翻转编码表
36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
37 |
38 | def decode_review(text):
39 | return ' '.join([reverse_word_index.get(i, '?') for i in text])
40 |
41 | decode_review(x_train[8])
42 |
43 | #%%
44 |
45 | # x_train:[b, 80]
46 | # x_test: [b, 80]
47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充
48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch
51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
54 | db_test = db_test.batch(batchsz, drop_remainder=True)
55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
56 | print('x_test shape:', x_test.shape)
57 |
58 | #%%
59 |
60 | class MyRNN(keras.Model):
61 | # Cell方式构建多层网络
62 | def __init__(self, units):
63 | super(MyRNN, self).__init__()
64 | # 词向量编码 [b, 80] => [b, 80, 100]
65 | self.embedding = layers.Embedding(total_words, embedding_len,
66 | input_length=max_review_len)
67 | # 构建RNN
68 | self.rnn = keras.Sequential([
69 | layers.LSTM(units, dropout=0.5, return_sequences=True),
70 | layers.LSTM(units, dropout=0.5)
71 | ])
72 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类
73 | # [b, 80, 100] => [b, 64] => [b, 1]
74 | self.outlayer = Sequential([
75 | layers.Dense(32),
76 | layers.Dropout(rate=0.5),
77 | layers.ReLU(),
78 | layers.Dense(1)])
79 |
80 | def call(self, inputs, training=None):
81 | x = inputs # [b, 80]
82 | # embedding: [b, 80] => [b, 80, 100]
83 | x = self.embedding(x)
84 | # rnn cell compute,[b, 80, 100] => [b, 64]
85 | x = self.rnn(x)
86 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1]
87 | x = self.outlayer(x,training)
88 | # p(y is pos|x)
89 | prob = tf.sigmoid(x)
90 |
91 | return prob
92 |
93 | def main():
94 | units = 32 # RNN状态向量长度f
95 | epochs = 50 # 训练epochs
96 |
97 | model = MyRNN(units)
98 | # 装配
99 | model.compile(optimizer = optimizers.Adam(0.001),
100 | loss = losses.BinaryCrossentropy(),
101 | metrics=['accuracy'])
102 | # 训练和验证
103 | model.fit(db_train, epochs=epochs, validation_data=db_test)
104 | # 测试
105 | model.evaluate(db_test)
106 |
107 |
108 | if __name__ == '__main__':
109 | main()
110 |
--------------------------------------------------------------------------------
/ch11-循环神经网络/sentiment_analysis_layer.py:
--------------------------------------------------------------------------------
1 | #%%
2 | import os
3 | import tensorflow as tf
4 | import numpy as np
5 | from tensorflow import keras
6 | from tensorflow.keras import layers, losses, optimizers, Sequential
7 |
8 |
9 | tf.random.set_seed(22)
10 | np.random.seed(22)
11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
12 | assert tf.__version__.startswith('2.')
13 |
14 | batchsz = 512 # 批量大小
15 | total_words = 10000 # 词汇表大小N_vocab
16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充
17 | embedding_len = 100 # 词向量特征长度f
18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词
19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
20 | print(x_train.shape, len(x_train[0]), y_train.shape)
21 | print(x_test.shape, len(x_test[0]), y_test.shape)
22 | #%%
23 | x_train[0]
24 | #%%
25 | # 数字编码表
26 | word_index = keras.datasets.imdb.get_word_index()
27 | # for k,v in word_index.items():
28 | # print(k,v)
29 | #%%
30 | word_index = {k:(v+3) for k,v in word_index.items()}
 31 | word_index["<PAD>"] = 0
 32 | word_index["<START>"] = 1
 33 | word_index["<UNK>"] = 2  # unknown
 34 | word_index["<UNUSED>"] = 3
35 | # 翻转编码表
36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
37 |
38 | def decode_review(text):
39 | return ' '.join([reverse_word_index.get(i, '?') for i in text])
40 |
41 | decode_review(x_train[8])
42 |
43 | #%%
44 |
45 | # x_train:[b, 80]
46 | # x_test: [b, 80]
47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充
48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch
51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
54 | db_test = db_test.batch(batchsz, drop_remainder=True)
55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
56 | print('x_test shape:', x_test.shape)
57 |
58 | #%%
59 |
60 | class MyRNN(keras.Model):
61 | # Cell方式构建多层网络
62 | def __init__(self, units):
63 | super(MyRNN, self).__init__()
64 | # 词向量编码 [b, 80] => [b, 80, 100]
65 | self.embedding = layers.Embedding(total_words, embedding_len,
66 | input_length=max_review_len)
67 | # 构建RNN
68 | self.rnn = keras.Sequential([
69 | layers.SimpleRNN(units, dropout=0.5, return_sequences=True),
70 | layers.SimpleRNN(units, dropout=0.5)
71 | ])
72 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类
73 | # [b, 80, 100] => [b, 64] => [b, 1]
74 | self.outlayer = Sequential([
75 | layers.Dense(32),
76 | layers.Dropout(rate=0.5),
77 | layers.ReLU(),
78 | layers.Dense(1)])
79 |
80 | def call(self, inputs, training=None):
81 | x = inputs # [b, 80]
82 | # embedding: [b, 80] => [b, 80, 100]
83 | x = self.embedding(x)
84 | # rnn cell compute,[b, 80, 100] => [b, 64]
85 | x = self.rnn(x)
86 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1]
87 | x = self.outlayer(x,training)
88 | # p(y is pos|x)
89 | prob = tf.sigmoid(x)
90 |
91 | return prob
92 |
93 | def main():
94 | units = 64 # RNN状态向量长度f
95 | epochs = 50 # 训练epochs
96 |
97 | model = MyRNN(units)
98 | # 装配
99 | model.compile(optimizer = optimizers.Adam(0.001),
100 | loss = losses.BinaryCrossentropy(),
101 | metrics=['accuracy'])
102 | # 训练和验证
103 | model.fit(db_train, epochs=epochs, validation_data=db_test)
104 | # 测试
105 | model.evaluate(db_test)
106 |
107 |
108 | if __name__ == '__main__':
109 | main()
110 |
--------------------------------------------------------------------------------
/ch11-循环神经网络/循环神经网络.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/循环神经网络.pdf
--------------------------------------------------------------------------------
/ch11-循环神经网络/情感分类实战.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/情感分类实战.pdf
--------------------------------------------------------------------------------
/ch11-循环神经网络/时间序列表示.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/时间序列表示.pdf
--------------------------------------------------------------------------------
/ch11-循环神经网络/梯度弥散与梯度爆炸.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/梯度弥散与梯度爆炸.pdf
--------------------------------------------------------------------------------
/ch12-自编码器/AE实战.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch12-自编码器/AE实战.pdf
--------------------------------------------------------------------------------
/ch12-自编码器/AutoEncoders.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch12-自编码器/AutoEncoders.pdf
--------------------------------------------------------------------------------
/ch12-自编码器/autoencoder.py:
--------------------------------------------------------------------------------
1 | import os
2 | import tensorflow as tf
3 | import numpy as np
4 | from tensorflow import keras
5 | from tensorflow.keras import Sequential, layers
6 | from PIL import Image
7 | from matplotlib import pyplot as plt
8 |
9 |
10 |
11 | tf.random.set_seed(22)
12 | np.random.seed(22)
13 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
14 | assert tf.__version__.startswith('2.')
15 |
16 |
17 | def save_images(imgs, name):
18 | new_im = Image.new('L', (280, 280))
19 |
20 | index = 0
21 | for i in range(0, 280, 28):
22 | for j in range(0, 280, 28):
23 | im = imgs[index]
24 | im = Image.fromarray(im, mode='L')
25 | new_im.paste(im, (i, j))
26 | index += 1
27 |
28 | new_im.save(name)
29 |
30 |
31 | h_dim = 20
32 | batchsz = 512
33 | lr = 1e-3
34 |
35 |
36 | (x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
37 | x_train, x_test = x_train.astype(np.float32) / 255., x_test.astype(np.float32) / 255.
38 | # we do not need label
39 | train_db = tf.data.Dataset.from_tensor_slices(x_train)
40 | train_db = train_db.shuffle(batchsz * 5).batch(batchsz)
41 | test_db = tf.data.Dataset.from_tensor_slices(x_test)
42 | test_db = test_db.batch(batchsz)
43 |
44 | print(x_train.shape, y_train.shape)
45 | print(x_test.shape, y_test.shape)
46 |
47 |
48 |
49 | class AE(keras.Model):
50 |
51 | def __init__(self):
52 | super(AE, self).__init__()
53 |
54 | # Encoders
55 | self.encoder = Sequential([
56 | layers.Dense(256, activation=tf.nn.relu),
57 | layers.Dense(128, activation=tf.nn.relu),
58 | layers.Dense(h_dim)
59 | ])
60 |
61 | # Decoders
62 | self.decoder = Sequential([
63 | layers.Dense(128, activation=tf.nn.relu),
64 | layers.Dense(256, activation=tf.nn.relu),
65 | layers.Dense(784)
66 | ])
67 |
68 |
69 | def call(self, inputs, training=None):
 70 |         # [b, 784] => [b, 20] (h_dim)
 71 |         h = self.encoder(inputs)
 72 |         # [b, 20] => [b, 784]
73 | x_hat = self.decoder(h)
74 |
75 | return x_hat
76 |
77 |
78 |
79 | model = AE()
80 | model.build(input_shape=(None, 784))
81 | model.summary()
82 |
83 | optimizer = tf.optimizers.Adam(lr=lr)
84 |
85 | for epoch in range(100):
86 |
87 | for step, x in enumerate(train_db):
88 |
89 | #[b, 28, 28] => [b, 784]
90 | x = tf.reshape(x, [-1, 784])
91 |
92 | with tf.GradientTape() as tape:
93 | x_rec_logits = model(x)
94 |
95 | rec_loss = tf.losses.binary_crossentropy(x, x_rec_logits, from_logits=True)
96 | rec_loss = tf.reduce_mean(rec_loss)
97 |
98 | grads = tape.gradient(rec_loss, model.trainable_variables)
99 | optimizer.apply_gradients(zip(grads, model.trainable_variables))
100 |
101 |
102 | if step % 100 ==0:
103 | print(epoch, step, float(rec_loss))
104 |
105 |
106 | # evaluation
107 | x = next(iter(test_db))
108 | logits = model(tf.reshape(x, [-1, 784]))
109 | x_hat = tf.sigmoid(logits)
110 | # [b, 784] => [b, 28, 28]
111 | x_hat = tf.reshape(x_hat, [-1, 28, 28])
112 |
113 | # [b, 28, 28] => [2b, 28, 28]
114 | x_concat = tf.concat([x, x_hat], axis=0)
115 |     x_concat = x_hat  # keep only the reconstructions for saving
116 | x_concat = x_concat.numpy() * 255.
117 | x_concat = x_concat.astype(np.uint8)
118 | save_images(x_concat, 'ae_images/rec_epoch_%d.png'%epoch)
119 |
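`save_images` writes into the relative directory `ae_images/`, which this script never creates, so the first save would raise a FileNotFoundError. A small sketch (any point before the training loop works):

import os
os.makedirs('ae_images', exist_ok=True)   # output folder for the reconstruction grids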
--------------------------------------------------------------------------------
/ch12-自编码器/vae.py:
--------------------------------------------------------------------------------
1 | import os
2 | import tensorflow as tf
3 | import numpy as np
4 | from tensorflow import keras
5 | from tensorflow.keras import Sequential, layers
6 | from PIL import Image
7 | from matplotlib import pyplot as plt
8 |
9 |
10 |
11 | tf.random.set_seed(22)
12 | np.random.seed(22)
13 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
14 | assert tf.__version__.startswith('2.')
15 |
16 |
17 | def save_images(imgs, name):
18 | new_im = Image.new('L', (280, 280))
19 |
20 | index = 0
21 | for i in range(0, 280, 28):
22 | for j in range(0, 280, 28):
23 | im = imgs[index]
24 | im = Image.fromarray(im, mode='L')
25 | new_im.paste(im, (i, j))
26 | index += 1
27 |
28 | new_im.save(name)
29 |
30 |
31 | h_dim = 20
32 | batchsz = 512
33 | lr = 1e-3
34 |
35 |
36 | (x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
37 | x_train, x_test = x_train.astype(np.float32) / 255., x_test.astype(np.float32) / 255.
38 | # we do not need label
39 | train_db = tf.data.Dataset.from_tensor_slices(x_train)
40 | train_db = train_db.shuffle(batchsz * 5).batch(batchsz)
41 | test_db = tf.data.Dataset.from_tensor_slices(x_test)
42 | test_db = test_db.batch(batchsz)
43 |
44 | print(x_train.shape, y_train.shape)
45 | print(x_test.shape, y_test.shape)
46 |
47 | z_dim = 10
48 |
49 | class VAE(keras.Model):
50 |
51 | def __init__(self):
52 | super(VAE, self).__init__()
53 |
54 | # Encoder
55 | self.fc1 = layers.Dense(128)
56 | self.fc2 = layers.Dense(z_dim) # get mean prediction
57 | self.fc3 = layers.Dense(z_dim)
58 |
59 | # Decoder
60 | self.fc4 = layers.Dense(128)
61 | self.fc5 = layers.Dense(784)
62 |
63 | def encoder(self, x):
64 |
65 | h = tf.nn.relu(self.fc1(x))
66 | # get mean
67 | mu = self.fc2(h)
 68 |         # get log variance
69 | log_var = self.fc3(h)
70 |
71 | return mu, log_var
72 |
73 | def decoder(self, z):
74 |
75 | out = tf.nn.relu(self.fc4(z))
76 | out = self.fc5(out)
77 |
78 | return out
79 |
80 | def reparameterize(self, mu, log_var):
81 |
82 | eps = tf.random.normal(log_var.shape)
83 |
84 | std = tf.exp(log_var*0.5)
85 |
86 | z = mu + std * eps
87 | return z
88 |
89 | def call(self, inputs, training=None):
90 |
91 | # [b, 784] => [b, z_dim], [b, z_dim]
92 | mu, log_var = self.encoder(inputs)
93 | # reparameterization trick
94 | z = self.reparameterize(mu, log_var)
95 |
96 | x_hat = self.decoder(z)
97 |
98 | return x_hat, mu, log_var
99 |
100 |
101 | model = VAE()
102 | model.build(input_shape=(4, 784))
103 | optimizer = tf.optimizers.Adam(lr)
104 |
105 | for epoch in range(1000):
106 |
107 | for step, x in enumerate(train_db):
108 |
109 | x = tf.reshape(x, [-1, 784])
110 |
111 | with tf.GradientTape() as tape:
112 | x_rec_logits, mu, log_var = model(x)
113 |
114 | rec_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=x, logits=x_rec_logits)
115 | rec_loss = tf.reduce_sum(rec_loss) / x.shape[0]
116 |
117 | # compute kl divergence (mu, var) ~ N (0, 1)
118 | # https://stats.stackexchange.com/questions/7440/kl-divergence-between-two-univariate-gaussians
119 | kl_div = -0.5 * (log_var + 1 - mu**2 - tf.exp(log_var))
120 | kl_div = tf.reduce_sum(kl_div) / x.shape[0]
121 |
122 | loss = rec_loss + 1. * kl_div
123 |
124 | grads = tape.gradient(loss, model.trainable_variables)
125 | optimizer.apply_gradients(zip(grads, model.trainable_variables))
126 |
127 |
128 | if step % 100 == 0:
129 | print(epoch, step, 'kl div:', float(kl_div), 'rec loss:', float(rec_loss))
130 |
131 |
132 | # evaluation
133 | z = tf.random.normal((batchsz, z_dim))
134 | logits = model.decoder(z)
135 | x_hat = tf.sigmoid(logits)
136 | x_hat = tf.reshape(x_hat, [-1, 28, 28]).numpy() *255.
137 | x_hat = x_hat.astype(np.uint8)
138 | save_images(x_hat, 'vae_images/sampled_epoch%d.png'%epoch)
139 |
140 | x = next(iter(test_db))
141 | x = tf.reshape(x, [-1, 784])
142 | x_hat_logits, _, _ = model(x)
143 | x_hat = tf.sigmoid(x_hat_logits)
144 | x_hat = tf.reshape(x_hat, [-1, 28, 28]).numpy() *255.
145 | x_hat = x_hat.astype(np.uint8)
146 | save_images(x_hat, 'vae_images/rec_epoch%d.png'%epoch)
147 |
148 |
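The `kl_div` term above is the closed-form KL divergence between the approximate posterior $\mathcal{N}(\mu,\sigma^2)$ and the standard normal prior; with `log_var` $= \log\sigma^2$ it reads, per latent dimension,

$$
D_{\mathrm{KL}}\big(\mathcal{N}(\mu,\sigma^2)\,\|\,\mathcal{N}(0,1)\big)
  = -\frac{1}{2}\left(1 + \log\sigma^2 - \mu^2 - \sigma^2\right),
$$

which is exactly `-0.5 * (log_var + 1 - mu**2 - tf.exp(log_var))` summed over the `z_dim` dimensions and divided by the batch size. As with the autoencoder script, the `vae_images/` output directory must exist before the first call to `save_images`.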
--------------------------------------------------------------------------------
/ch13-生成对抗网络/GAN.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch13-生成对抗网络/GAN.pdf
--------------------------------------------------------------------------------
/ch13-生成对抗网络/GAN实战.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch13-生成对抗网络/GAN实战.pdf
--------------------------------------------------------------------------------
/ch13-生成对抗网络/dataset.py:
--------------------------------------------------------------------------------
1 | import multiprocessing
2 |
3 | import tensorflow as tf
4 |
5 |
6 | def make_anime_dataset(img_paths, batch_size, resize=64, drop_remainder=True, shuffle=True, repeat=1):
7 |
8 | # @tf.function
9 | def _map_fn(img):
10 | img = tf.image.resize(img, [resize, resize])
11 | # img = tf.image.random_crop(img,[resize, resize])
12 | # img = tf.image.random_flip_left_right(img)
13 | # img = tf.image.random_flip_up_down(img)
14 | img = tf.clip_by_value(img, 0, 255)
15 | img = img / 127.5 - 1 #-1~1
16 | return img
17 |
18 | dataset = disk_image_batch_dataset(img_paths,
19 | batch_size,
20 | drop_remainder=drop_remainder,
21 | map_fn=_map_fn,
22 | shuffle=shuffle,
23 | repeat=repeat)
24 | img_shape = (resize, resize, 3)
25 | len_dataset = len(img_paths) // batch_size
26 |
27 | return dataset, img_shape, len_dataset
28 |
29 |
30 | def batch_dataset(dataset,
31 | batch_size,
32 | drop_remainder=True,
33 | n_prefetch_batch=1,
34 | filter_fn=None,
35 | map_fn=None,
36 | n_map_threads=None,
37 | filter_after_map=False,
38 | shuffle=True,
39 | shuffle_buffer_size=None,
40 | repeat=None):
41 | # set defaults
42 | if n_map_threads is None:
43 | n_map_threads = multiprocessing.cpu_count()
44 | if shuffle and shuffle_buffer_size is None:
45 | shuffle_buffer_size = max(batch_size * 128, 2048) # set the minimum buffer size as 2048
46 |
47 | # [*] it is efficient to conduct `shuffle` before `map`/`filter` because `map`/`filter` is sometimes costly
48 | if shuffle:
49 | dataset = dataset.shuffle(shuffle_buffer_size)
50 |
51 | if not filter_after_map:
52 | if filter_fn:
53 | dataset = dataset.filter(filter_fn)
54 |
55 | if map_fn:
56 | dataset = dataset.map(map_fn, num_parallel_calls=n_map_threads)
57 |
58 | else: # [*] this is slower
59 | if map_fn:
60 | dataset = dataset.map(map_fn, num_parallel_calls=n_map_threads)
61 |
62 | if filter_fn:
63 | dataset = dataset.filter(filter_fn)
64 |
65 | dataset = dataset.batch(batch_size, drop_remainder=drop_remainder)
66 |
67 | dataset = dataset.repeat(repeat).prefetch(n_prefetch_batch)
68 |
69 | return dataset
70 |
71 |
72 | def memory_data_batch_dataset(memory_data,
73 | batch_size,
74 | drop_remainder=True,
75 | n_prefetch_batch=1,
76 | filter_fn=None,
77 | map_fn=None,
78 | n_map_threads=None,
79 | filter_after_map=False,
80 | shuffle=True,
81 | shuffle_buffer_size=None,
82 | repeat=None):
83 | """Batch dataset of memory data.
84 |
85 | Parameters
86 | ----------
87 | memory_data : nested structure of tensors/ndarrays/lists
88 |
89 | """
90 | dataset = tf.data.Dataset.from_tensor_slices(memory_data)
91 | dataset = batch_dataset(dataset,
92 | batch_size,
93 | drop_remainder=drop_remainder,
94 | n_prefetch_batch=n_prefetch_batch,
95 | filter_fn=filter_fn,
96 | map_fn=map_fn,
97 | n_map_threads=n_map_threads,
98 | filter_after_map=filter_after_map,
99 | shuffle=shuffle,
100 | shuffle_buffer_size=shuffle_buffer_size,
101 | repeat=repeat)
102 | return dataset
103 |
104 |
105 | def disk_image_batch_dataset(img_paths,
106 | batch_size,
107 | labels=None,
108 | drop_remainder=True,
109 | n_prefetch_batch=1,
110 | filter_fn=None,
111 | map_fn=None,
112 | n_map_threads=None,
113 | filter_after_map=False,
114 | shuffle=True,
115 | shuffle_buffer_size=None,
116 | repeat=None):
117 | """Batch dataset of disk image for PNG and JPEG.
118 |
119 | Parameters
120 | ----------
121 | img_paths : 1d-tensor/ndarray/list of str
122 | labels : nested structure of tensors/ndarrays/lists
123 |
124 | """
125 | if labels is None:
126 | memory_data = img_paths
127 | else:
128 | memory_data = (img_paths, labels)
129 |
130 | def parse_fn(path, *label):
131 | img = tf.io.read_file(path)
132 | img = tf.image.decode_jpeg(img, channels=3) # fix channels to 3
133 | return (img,) + label
134 |
135 | if map_fn: # fuse `map_fn` and `parse_fn`
136 | def map_fn_(*args):
137 | return map_fn(*parse_fn(*args))
138 | else:
139 | map_fn_ = parse_fn
140 |
141 | dataset = memory_data_batch_dataset(memory_data,
142 | batch_size,
143 | drop_remainder=drop_remainder,
144 | n_prefetch_batch=n_prefetch_batch,
145 | filter_fn=filter_fn,
146 | map_fn=map_fn_,
147 | n_map_threads=n_map_threads,
148 | filter_after_map=filter_after_map,
149 | shuffle=shuffle,
150 | shuffle_buffer_size=shuffle_buffer_size,
151 | repeat=repeat)
152 |
153 | return dataset
154 |
--------------------------------------------------------------------------------
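The helpers above compose into a single tf.data pipeline: disk_image_batch_dataset wraps memory_data_batch_dataset, which wraps batch_dataset. Below is a minimal usage sketch of the in-memory variant, assuming the module above is importable as `dataset`; the toy arrays and the `to_unit_range` helper are hypothetical and only illustrate the same [-1, 1] scaling convention as `_map_fn`.

import numpy as np
import tensorflow as tf
from dataset import memory_data_batch_dataset

# hypothetical toy data: 10 small RGB "images"
fake_images = np.random.uniform(0, 255, size=(10, 8, 8, 3)).astype(np.float32)

def to_unit_range(img):
    # same convention as _map_fn above: scale pixel values to [-1, 1]
    return img / 127.5 - 1.0

ds = memory_data_batch_dataset(fake_images, batch_size=4,
                               map_fn=to_unit_range, shuffle=True, repeat=1)
for batch in ds:
    print(batch.shape)  # (4, 8, 8, 3); the last incomplete batch is dropped (drop_remainder=True)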
/ch13-生成对抗网络/gan.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow import keras
3 | from tensorflow.keras import layers
4 |
5 |
6 | class Generator(keras.Model):
7 | # 生成器网络
8 | def __init__(self):
9 | super(Generator, self).__init__()
10 | filter = 64
11 | # 转置卷积层1,输出channel为filter*8,核大小4,步长1,不使用padding,不使用偏置
12 | self.conv1 = layers.Conv2DTranspose(filter*8, 4,1, 'valid', use_bias=False)
13 | self.bn1 = layers.BatchNormalization()
14 | # 转置卷积层2
15 | self.conv2 = layers.Conv2DTranspose(filter*4, 4,2, 'same', use_bias=False)
16 | self.bn2 = layers.BatchNormalization()
17 | # 转置卷积层3
18 | self.conv3 = layers.Conv2DTranspose(filter*2, 4,2, 'same', use_bias=False)
19 | self.bn3 = layers.BatchNormalization()
20 | # 转置卷积层4
21 | self.conv4 = layers.Conv2DTranspose(filter*1, 4,2, 'same', use_bias=False)
22 | self.bn4 = layers.BatchNormalization()
23 | # 转置卷积层5
24 | self.conv5 = layers.Conv2DTranspose(3, 4,2, 'same', use_bias=False)
25 |
26 | def call(self, inputs, training=None):
27 | x = inputs # [z, 100]
28 | # Reshape to a 4D tensor for the transposed convolutions below: (b, 1, 1, 100)
29 | x = tf.reshape(x, (x.shape[0], 1, 1, x.shape[1]))
30 | x = tf.nn.relu(x) # 激活函数
31 | # 转置卷积-BN-激活函数:(b, 4, 4, 512)
32 | x = tf.nn.relu(self.bn1(self.conv1(x), training=training))
33 | # 转置卷积-BN-激活函数:(b, 8, 8, 256)
34 | x = tf.nn.relu(self.bn2(self.conv2(x), training=training))
35 | # 转置卷积-BN-激活函数:(b, 16, 16, 128)
36 | x = tf.nn.relu(self.bn3(self.conv3(x), training=training))
37 | # 转置卷积-BN-激活函数:(b, 32, 32, 64)
38 | x = tf.nn.relu(self.bn4(self.conv4(x), training=training))
39 | # 转置卷积-激活函数:(b, 64, 64, 3)
40 | x = self.conv5(x)
41 | x = tf.tanh(x) # 输出x范围-1~1,与预处理一致
42 |
43 | return x
44 |
45 |
46 | class Discriminator(keras.Model):
47 | # 判别器
48 | def __init__(self):
49 | super(Discriminator, self).__init__()
50 | filter = 64
51 | # 卷积层
52 | self.conv1 = layers.Conv2D(filter, 4, 2, 'valid', use_bias=False)
53 | self.bn1 = layers.BatchNormalization()
54 | # 卷积层
55 | self.conv2 = layers.Conv2D(filter*2, 4, 2, 'valid', use_bias=False)
56 | self.bn2 = layers.BatchNormalization()
57 | # 卷积层
58 | self.conv3 = layers.Conv2D(filter*4, 4, 2, 'valid', use_bias=False)
59 | self.bn3 = layers.BatchNormalization()
60 | # 卷积层
61 | self.conv4 = layers.Conv2D(filter*8, 3, 1, 'valid', use_bias=False)
62 | self.bn4 = layers.BatchNormalization()
63 | # 卷积层
64 | self.conv5 = layers.Conv2D(filter*16, 3, 1, 'valid', use_bias=False)
65 | self.bn5 = layers.BatchNormalization()
66 | # 全局池化层
67 | self.pool = layers.GlobalAveragePooling2D()
68 | # 特征打平
69 | self.flatten = layers.Flatten()
70 | # 2分类全连接层
71 | self.fc = layers.Dense(1)
72 |
73 |
74 | def call(self, inputs, training=None):
75 | # 卷积-BN-激活函数:(4, 31, 31, 64)
76 | x = tf.nn.leaky_relu(self.bn1(self.conv1(inputs), training=training))
77 | # 卷积-BN-激活函数:(4, 14, 14, 128)
78 | x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training))
79 | # 卷积-BN-激活函数:(4, 6, 6, 256)
80 | x = tf.nn.leaky_relu(self.bn3(self.conv3(x), training=training))
81 | # 卷积-BN-激活函数:(4, 4, 4, 512)
82 | x = tf.nn.leaky_relu(self.bn4(self.conv4(x), training=training))
83 | # 卷积-BN-激活函数:(4, 2, 2, 1024)
84 | x = tf.nn.leaky_relu(self.bn5(self.conv5(x), training=training))
85 | # Global average pooling: (4, 2, 2, 1024) => (4, 1024)
86 | x = self.pool(x)
87 | # 打平
88 | x = self.flatten(x)
89 | # 输出,[b, 1024] => [b, 1]
90 | logits = self.fc(x)
91 |
92 | return logits
93 |
94 | def main():
95 |
96 | d = Discriminator()
97 | g = Generator()
98 |
99 |
100 | x = tf.random.normal([2, 64, 64, 3])
101 | z = tf.random.normal([2, 100])
102 |
103 | prob = d(x)
104 | print(prob)
105 | x_hat = g(z)
106 | print(x_hat.shape)
107 |
108 |
109 |
110 |
111 | if __name__ == '__main__':
112 | main()
--------------------------------------------------------------------------------
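The shape annotations in Generator.call follow from standard transposed-convolution arithmetic: a 'valid' layer maps size i to (i-1)*s + k, and a 'same' layer maps i to i*s. A small sketch (the `deconv_out` helper is mine, not part of the repository) reproducing the 1 -> 4 -> 8 -> 16 -> 32 -> 64 progression:

def deconv_out(i, k, s, padding):
    # output size of Conv2DTranspose for 'valid' / 'same' padding
    return (i - 1) * s + k if padding == 'valid' else i * s

size = 1  # after reshaping z to (b, 1, 1, 100)
for k, s, p in [(4, 1, 'valid'), (4, 2, 'same'), (4, 2, 'same'), (4, 2, 'same'), (4, 2, 'same')]:
    size = deconv_out(size, k, s, p)
    print(size)  # 4, 8, 16, 32, 64 -> matches the (b, 64, 64, 3) output of conv5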
/ch13-生成对抗网络/gan_train.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import tensorflow as tf
4 | from tensorflow import keras
5 | from PIL import Image  # scipy.misc.toimage was removed from SciPy; use Pillow instead
6 | import glob
7 | from gan import Generator, Discriminator
8 |
9 | from dataset import make_anime_dataset
10 |
11 |
12 | def save_result(val_out, val_block_size, image_path, color_mode):
13 | def preprocess(img):
14 | img = ((img + 1.0) * 127.5).astype(np.uint8)
15 | # img = img.astype(np.uint8)
16 | return img
17 |
18 | preprocesed = preprocess(val_out)
19 | final_image = np.array([])
20 | single_row = np.array([])
21 | for b in range(val_out.shape[0]):
22 | # concat image into a row
23 | if single_row.size == 0:
24 | single_row = preprocesed[b, :, :, :]
25 | else:
26 | single_row = np.concatenate((single_row, preprocesed[b, :, :, :]), axis=1)
27 |
28 | # concat image row to final_image
29 | if (b+1) % val_block_size == 0:
30 | if final_image.size == 0:
31 | final_image = single_row
32 | else:
33 | final_image = np.concatenate((final_image, single_row), axis=0)
34 |
35 | # reset single row
36 | single_row = np.array([])
37 |
38 | if final_image.shape[2] == 1:
39 | final_image = np.squeeze(final_image, axis=2)
40 | Image.fromarray(final_image).save(image_path)
41 |
42 |
43 | def celoss_ones(logits):
44 | # cross-entropy between the logits and an all-ones (real) label
45 | y = tf.ones_like(logits)
46 | loss = keras.losses.binary_crossentropy(y, logits, from_logits=True)
47 | return tf.reduce_mean(loss)
48 |
49 |
50 | def celoss_zeros(logits):
51 | # cross-entropy between the logits and an all-zeros (fake) label
52 | y = tf.zeros_like(logits)
53 | loss = keras.losses.binary_crossentropy(y, logits, from_logits=True)
54 | return tf.reduce_mean(loss)
55 |
56 | def d_loss_fn(generator, discriminator, batch_z, batch_x, is_training):
57 | # 计算判别器的误差函数
58 | # 采样生成图片
59 | fake_image = generator(batch_z, is_training)
60 | # 判定生成图片
61 | d_fake_logits = discriminator(fake_image, is_training)
62 | # 判定真实图片
63 | d_real_logits = discriminator(batch_x, is_training)
64 | # 真实图片与1之间的误差
65 | d_loss_real = celoss_ones(d_real_logits)
66 | # 生成图片与0之间的误差
67 | d_loss_fake = celoss_zeros(d_fake_logits)
68 | # 合并误差
69 | loss = d_loss_fake + d_loss_real
70 |
71 | return loss
72 |
73 |
74 | def g_loss_fn(generator, discriminator, batch_z, is_training):
75 | # 采样生成图片
76 | fake_image = generator(batch_z, is_training)
77 | # 在训练生成网络时,需要迫使生成图片判定为真
78 | d_fake_logits = discriminator(fake_image, is_training)
79 | # 计算生成图片与1之间的误差
80 | loss = celoss_ones(d_fake_logits)
81 |
82 | return loss
83 |
84 | def main():
85 |
86 | tf.random.set_seed(3333)
87 | np.random.seed(3333)
88 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
89 | assert tf.__version__.startswith('2.')
90 |
91 |
92 | z_dim = 100 # 隐藏向量z的长度
93 | epochs = 3000000 # 训练步数
94 | batch_size = 64 # batch size
95 | learning_rate = 0.0002
96 | is_training = True
97 |
98 | # 获取数据集路径
99 | # C:\Users\z390\Downloads\anime-faces
100 | # r'C:\Users\z390\Downloads\faces\*.jpg'
101 | img_path = glob.glob(r'C:\Users\z390\Downloads\anime-faces\*\*.jpg') + \
102 | glob.glob(r'C:\Users\z390\Downloads\anime-faces\*\*.png')
103 | # img_path = glob.glob(r'C:\Users\z390\Downloads\getchu_aligned_with_label\GetChu_aligned2\*.jpg')
104 | # img_path.extend(img_path2)
105 | print('images num:', len(img_path))
106 | # 构建数据集对象
107 | dataset, img_shape, _ = make_anime_dataset(img_path, batch_size, resize=64)
108 | print(dataset, img_shape)
109 | sample = next(iter(dataset)) # 采样
110 | print(sample.shape, tf.reduce_max(sample).numpy(),
111 | tf.reduce_min(sample).numpy())
112 | dataset = dataset.repeat(100) # 重复循环
113 | db_iter = iter(dataset)
114 |
115 |
116 | generator = Generator() # 创建生成器
117 | generator.build(input_shape = (4, z_dim))
118 | discriminator = Discriminator() # 创建判别器
119 | discriminator.build(input_shape=(4, 64, 64, 3))
120 | # 分别为生成器和判别器创建优化器
121 | g_optimizer = keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.5)
122 | d_optimizer = keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.5)
123 |
124 | generator.load_weights('generator.ckpt')
125 | discriminator.load_weights('discriminator.ckpt')
126 | print('Loaded ckpt!!')
127 |
128 | d_losses, g_losses = [],[]
129 | for epoch in range(epochs): # 训练epochs次
130 | # 1. 训练判别器
131 | for _ in range(1):
132 | # 采样隐藏向量
133 | batch_z = tf.random.normal([batch_size, z_dim])
134 | batch_x = next(db_iter) # 采样真实图片
135 | # 判别器前向计算
136 | with tf.GradientTape() as tape:
137 | d_loss = d_loss_fn(generator, discriminator, batch_z, batch_x, is_training)
138 | grads = tape.gradient(d_loss, discriminator.trainable_variables)
139 | d_optimizer.apply_gradients(zip(grads, discriminator.trainable_variables))
140 | # 2. 训练生成器
141 | # 采样隐藏向量
142 | batch_z = tf.random.normal([batch_size, z_dim])
143 | batch_x = next(db_iter) # 采样真实图片
144 | # 生成器前向计算
145 | with tf.GradientTape() as tape:
146 | g_loss = g_loss_fn(generator, discriminator, batch_z, is_training)
147 | grads = tape.gradient(g_loss, generator.trainable_variables)
148 | g_optimizer.apply_gradients(zip(grads, generator.trainable_variables))
149 |
150 | if epoch % 100 == 0:
151 | print(epoch, 'd-loss:',float(d_loss), 'g-loss:', float(g_loss))
152 | # 可视化
153 | z = tf.random.normal([100, z_dim])
154 | fake_image = generator(z, training=False)
155 | img_path = os.path.join('gan_images', 'gan-%d.png'%epoch)
156 | save_result(fake_image.numpy(), 10, img_path, color_mode='P')
157 |
158 | d_losses.append(float(d_loss))
159 | g_losses.append(float(g_loss))
160 |
161 | if epoch % 10000 == 1:
162 | # print(d_losses)
163 | # print(g_losses)
164 | generator.save_weights('generator.ckpt')
165 | discriminator.save_weights('discriminator.ckpt')
166 |
167 |
168 |
169 |
170 |
171 | if __name__ == '__main__':
172 | main()
--------------------------------------------------------------------------------
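One caveat about gan_train.py: it unconditionally calls load_weights on 'generator.ckpt'/'discriminator.ckpt', which raises on a first run when nothing has been saved yet, and save_result does not create the 'gan_images' output directory. A minimal guard (a sketch, assuming the TF checkpoint format written by save_weights, which produces a '.index' file):

import os

os.makedirs('gan_images', exist_ok=True)  # save_result does not create the output directory
if os.path.exists('generator.ckpt.index'):
    # resume training from the latest saved weights
    generator.load_weights('generator.ckpt')
    discriminator.load_weights('discriminator.ckpt')
    print('Loaded ckpt!!')
else:
    print('No checkpoint found, training from scratch')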
/ch13-生成对抗网络/wgan.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from tensorflow import keras
3 | from tensorflow.keras import layers
4 |
5 |
6 |
7 |
8 |
9 |
10 | class Generator(keras.Model):
11 |
12 | def __init__(self):
13 | super(Generator, self).__init__()
14 |
15 | # z: [b, 100] => [b, 3*3*512] => [b, 3, 3, 512] => [b, 64, 64, 3]
16 | self.fc = layers.Dense(3*3*512)
17 |
18 | self.conv1 = layers.Conv2DTranspose(256, 3, 3, 'valid')
19 | self.bn1 = layers.BatchNormalization()
20 |
21 | self.conv2 = layers.Conv2DTranspose(128, 5, 2, 'valid')
22 | self.bn2 = layers.BatchNormalization()
23 |
24 | self.conv3 = layers.Conv2DTranspose(3, 4, 3, 'valid')
25 |
26 | def call(self, inputs, training=None):
27 | # [z, 100] => [z, 3*3*512]
28 | x = self.fc(inputs)
29 | x = tf.reshape(x, [-1, 3, 3, 512])
30 | x = tf.nn.leaky_relu(x)
31 |
32 | #
33 | x = tf.nn.leaky_relu(self.bn1(self.conv1(x), training=training))
34 | x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training))
35 | x = self.conv3(x)
36 | x = tf.tanh(x)
37 |
38 | return x
39 |
40 |
41 | class Discriminator(keras.Model):
42 |
43 | def __init__(self):
44 | super(Discriminator, self).__init__()
45 |
46 | # [b, 64, 64, 3] => [b, 1]
47 | self.conv1 = layers.Conv2D(64, 5, 3, 'valid')
48 |
49 | self.conv2 = layers.Conv2D(128, 5, 3, 'valid')
50 | self.bn2 = layers.BatchNormalization()
51 |
52 | self.conv3 = layers.Conv2D(256, 5, 3, 'valid')
53 | self.bn3 = layers.BatchNormalization()
54 |
55 | # [b, h, w ,c] => [b, -1]
56 | self.flatten = layers.Flatten()
57 | self.fc = layers.Dense(1)
58 |
59 |
60 | def call(self, inputs, training=None):
61 |
62 | x = tf.nn.leaky_relu(self.conv1(inputs))
63 | x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training))
64 | x = tf.nn.leaky_relu(self.bn3(self.conv3(x), training=training))
65 |
66 | # [b, h, w, c] => [b, -1]
67 | x = self.flatten(x)
68 | # [b, -1] => [b, 1]
69 | logits = self.fc(x)
70 |
71 | return logits
72 |
73 | def main():
74 |
75 | d = Discriminator()
76 | g = Generator()
77 |
78 |
79 | x = tf.random.normal([2, 64, 64, 3])
80 | z = tf.random.normal([2, 100])
81 |
82 | prob = d(x)
83 | print(prob)
84 | x_hat = g(z)
85 | print(x_hat.shape)
86 |
87 |
88 |
89 |
90 | if __name__ == '__main__':
91 | main()
--------------------------------------------------------------------------------
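Using the same transposed-convolution arithmetic as above, this generator grows the 3x3 feature map produced by the fully connected layer to the 64x64 output in three 'valid' steps. A quick standalone check (plain Python, not part of the file):

size = 3  # after reshaping the fc output to (b, 3, 3, 512)
for k, s in [(3, 3), (5, 2), (4, 3)]:  # (kernel, stride) of conv1..conv3, all 'valid'
    size = (size - 1) * s + k
    print(size)  # 9, 21, 64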
/ch13-生成对抗网络/wgan_train.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
3 | import numpy as np
4 | import tensorflow as tf
5 | from tensorflow import keras
6 |
7 | from PIL import Image
8 | import glob
9 | from gan import Generator, Discriminator
10 |
11 | from dataset import make_anime_dataset
12 |
13 |
14 | def save_result(val_out, val_block_size, image_path, color_mode):
15 | def preprocess(img):
16 | img = ((img + 1.0) * 127.5).astype(np.uint8)
17 | # img = img.astype(np.uint8)
18 | return img
19 |
20 | preprocesed = preprocess(val_out)
21 | final_image = np.array([])
22 | single_row = np.array([])
23 | for b in range(val_out.shape[0]):
24 | # concat image into a row
25 | if single_row.size == 0:
26 | single_row = preprocesed[b, :, :, :]
27 | else:
28 | single_row = np.concatenate((single_row, preprocesed[b, :, :, :]), axis=1)
29 |
30 | # concat image row to final_image
31 | if (b+1) % val_block_size == 0:
32 | if final_image.size == 0:
33 | final_image = single_row
34 | else:
35 | final_image = np.concatenate((final_image, single_row), axis=0)
36 |
37 | # reset single row
38 | single_row = np.array([])
39 |
40 | if final_image.shape[2] == 1:
41 | final_image = np.squeeze(final_image, axis=2)
42 | Image.fromarray(final_image).save(image_path)
43 |
44 |
45 | def celoss_ones(logits):
46 | # WGAN critic/generator term: despite the name kept from gan_train.py, this is not a
47 | # cross-entropy; it returns -E[D(x)], so larger critic scores lower the loss
48 | # loss = tf.keras.losses.categorical_crossentropy(y_pred=logits,
49 | # y_true=tf.ones_like(logits))
50 | return - tf.reduce_mean(logits)
51 |
52 |
53 | def celoss_zeros(logits):
54 | # WGAN term for fake samples: returns +E[D(x_fake)], pushing the critic to assign
55 | # lower scores to generated images
56 | # loss = tf.keras.losses.categorical_crossentropy(y_pred=logits,
57 | # y_true=tf.zeros_like(logits))
58 | return tf.reduce_mean(logits)
59 |
60 |
61 | def gradient_penalty(discriminator, batch_x, fake_image):
62 |
63 | batchsz = batch_x.shape[0]
64 |
65 | # [b, h, w, c]
66 | t = tf.random.uniform([batchsz, 1, 1, 1])
67 | # [b, 1, 1, 1] => [b, h, w, c]
68 | t = tf.broadcast_to(t, batch_x.shape)
69 |
70 | interplate = t * batch_x + (1 - t) * fake_image
71 |
72 | with tf.GradientTape() as tape:
73 | tape.watch([interplate])
74 | d_interplote_logits = discriminator(interplate, training=True)
75 | grads = tape.gradient(d_interplote_logits, interplate)
76 |
77 | # grads:[b, h, w, c] => [b, -1]
78 | grads = tf.reshape(grads, [grads.shape[0], -1])
79 | gp = tf.norm(grads, axis=1) #[b]
80 | gp = tf.reduce_mean( (gp-1)**2 )
81 |
82 | return gp
83 |
84 |
85 |
86 | def d_loss_fn(generator, discriminator, batch_z, batch_x, is_training):
87 | # 1. treat real image as real
88 | # 2. treat generated image as fake
89 | fake_image = generator(batch_z, is_training)
90 | d_fake_logits = discriminator(fake_image, is_training)
91 | d_real_logits = discriminator(batch_x, is_training)
92 |
93 | d_loss_real = celoss_ones(d_real_logits)
94 | d_loss_fake = celoss_zeros(d_fake_logits)
95 | gp = gradient_penalty(discriminator, batch_x, fake_image)
96 |
97 | loss = d_loss_real + d_loss_fake + 10. * gp
98 |
99 | return loss, gp
100 |
101 |
102 | def g_loss_fn(generator, discriminator, batch_z, is_training):
103 |
104 | fake_image = generator(batch_z, is_training)
105 | d_fake_logits = discriminator(fake_image, is_training)
106 | loss = celoss_ones(d_fake_logits)
107 |
108 | return loss
109 |
110 |
111 | def main():
112 |
113 | tf.random.set_seed(233)
114 | np.random.seed(233)
115 | assert tf.__version__.startswith('2.')
116 |
117 |
118 | # hyper parameters
119 | z_dim = 100
120 | epochs = 3000000
121 | batch_size = 512
122 | learning_rate = 0.0005
123 | is_training = True
124 |
125 |
126 | img_path = glob.glob(r'C:\Users\Jackie\Downloads\faces\*.jpg')
127 | assert len(img_path) > 0
128 |
129 |
130 | dataset, img_shape, _ = make_anime_dataset(img_path, batch_size)
131 | print(dataset, img_shape)
132 | sample = next(iter(dataset))
133 | print(sample.shape, tf.reduce_max(sample).numpy(),
134 | tf.reduce_min(sample).numpy())
135 | dataset = dataset.repeat()
136 | db_iter = iter(dataset)
137 |
138 |
139 | generator = Generator()
140 | generator.build(input_shape = (None, z_dim))
141 | discriminator = Discriminator()
142 | discriminator.build(input_shape=(None, 64, 64, 3))
143 | z_sample = tf.random.normal([100, z_dim])
144 |
145 |
146 | g_optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.5)
147 | d_optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.5)
148 |
149 |
150 | for epoch in range(epochs):
151 |
152 | for _ in range(5):
153 | batch_z = tf.random.normal([batch_size, z_dim])
154 | batch_x = next(db_iter)
155 |
156 | # train D
157 | with tf.GradientTape() as tape:
158 | d_loss, gp = d_loss_fn(generator, discriminator, batch_z, batch_x, is_training)
159 | grads = tape.gradient(d_loss, discriminator.trainable_variables)
160 | d_optimizer.apply_gradients(zip(grads, discriminator.trainable_variables))
161 |
162 | batch_z = tf.random.normal([batch_size, z_dim])
163 |
164 | with tf.GradientTape() as tape:
165 | g_loss = g_loss_fn(generator, discriminator, batch_z, is_training)
166 | grads = tape.gradient(g_loss, generator.trainable_variables)
167 | g_optimizer.apply_gradients(zip(grads, generator.trainable_variables))
168 |
169 | if epoch % 100 == 0:
170 | print(epoch, 'd-loss:',float(d_loss), 'g-loss:', float(g_loss),
171 | 'gp:', float(gp))
172 |
173 | z = tf.random.normal([100, z_dim])
174 | fake_image = generator(z, training=False)
175 | img_path = os.path.join('images', 'wgan-%d.png'%epoch)
176 | save_result(fake_image.numpy(), 10, img_path, color_mode='P')
177 |
178 |
179 |
180 | if __name__ == '__main__':
181 | main()
--------------------------------------------------------------------------------
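Note that wgan_train.py imports Generator and Discriminator from gan.py, so it trains the DCGAN architecture above with the WGAN-GP objective; the models defined in wgan.py are only exercised by that file's own main(). The gradient_penalty term can be sanity-checked with a critic whose gradient is known in closed form. A sketch, assuming the functions of wgan_train.py are in scope (the `toy_critic` below is hypothetical):

import tensorflow as tf

def toy_critic(x, training=True):
    # D(x) = sum of all pixels, so dD/dx is a tensor of ones
    return tf.reduce_sum(x, axis=[1, 2, 3], keepdims=True)

x_real = tf.random.normal([4, 2, 2, 3])
x_fake = tf.random.normal([4, 2, 2, 3])
# per-sample gradient norm is sqrt(2*2*3), so gp = (sqrt(12) - 1)**2 ~= 6.07
print(float(gradient_penalty(toy_critic, x_real, x_fake)))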
/ch14-强化学习/REINFORCE_tf.py:
--------------------------------------------------------------------------------
1 | import gym,os
2 | import numpy as np
3 | import matplotlib
4 | from matplotlib import pyplot as plt
5 | # Default parameters for plots
6 | matplotlib.rcParams['font.size'] = 18
7 | matplotlib.rcParams['figure.titlesize'] = 18
8 | matplotlib.rcParams['figure.figsize'] = [9, 7]
9 | matplotlib.rcParams['font.family'] = ['KaiTi']
10 | matplotlib.rcParams['axes.unicode_minus']=False
11 |
12 | import tensorflow as tf
13 | from tensorflow import keras
14 | from tensorflow.keras import layers,optimizers,losses
15 | from PIL import Image
16 | env = gym.make('CartPole-v1') # 创建游戏环境
17 | env.seed(2333)
18 | tf.random.set_seed(2333)
19 | np.random.seed(2333)
20 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
21 | assert tf.__version__.startswith('2.')
22 |
23 | learning_rate = 0.0002
24 | gamma = 0.98
25 |
26 | class Policy(keras.Model):
27 | # 策略网络,生成动作的概率分布
28 | def __init__(self):
29 | super(Policy, self).__init__()
30 | self.data = [] # 存储轨迹
31 | # 输入为长度为4的向量,输出为左、右2个动作
32 | self.fc1 = layers.Dense(128, kernel_initializer='he_normal')
33 | self.fc2 = layers.Dense(2, kernel_initializer='he_normal')
34 | # 网络优化器
35 | self.optimizer = optimizers.Adam(learning_rate=learning_rate)
36 |
37 | def call(self, inputs, training=None):
38 | # 状态输入s的shape为向量:[4]
39 | x = tf.nn.relu(self.fc1(inputs))
40 | x = tf.nn.softmax(self.fc2(x), axis=1)
41 | return x
42 |
43 | def put_data(self, item):
44 | # 记录r,log_P(a|s)
45 | self.data.append(item)
46 |
47 | def train_net(self, tape):
48 | # 计算梯度并更新策略网络参数。tape为梯度记录器
49 | R = 0 # 终结状态的初始回报为0
50 | for r, log_prob in self.data[::-1]:#逆序取
51 | R = r + gamma * R # 计算每个时间戳上的回报
52 | # 每个时间戳都计算一次梯度
53 | # grad_R=-log_P*R*grad_theta
54 | loss = -log_prob * R
55 | with tape.stop_recording():
56 | # 优化策略网络
57 | grads = tape.gradient(loss, self.trainable_variables)
58 | # print(grads)
59 | self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
60 | self.data = [] # 清空轨迹
61 |
62 | def main():
63 | pi = Policy() # 创建策略网络
64 | pi(tf.random.normal((4,4)))
65 | pi.summary()
66 | score = 0.0 # 计分
67 | print_interval = 20 # 打印间隔
68 | returns = []
69 |
70 | for n_epi in range(400):
71 | s = env.reset() # 回到游戏初始状态,返回s0
72 | with tf.GradientTape(persistent=True) as tape:
73 | for t in range(501): # CartPole-v1 is forced to terminate after at most 500 steps
74 | # 送入状态向量,获取策略
75 | s = tf.constant(s,dtype=tf.float32)
76 | # s: [4] => [1,4]
77 | s = tf.expand_dims(s, axis=0)
78 | prob = pi(s) # 动作分布:[1,2]
79 | # 从类别分布中采样1个动作, shape: [1]
80 | a = tf.random.categorical(tf.math.log(prob), 1)[0]
81 | a = int(a) # Tensor转数字
82 | s_prime, r, done, info = env.step(a)
83 | # 记录动作a和动作产生的奖励r
84 | # prob shape:[1,2]
85 | pi.put_data((r, tf.math.log(prob[0][a])))
86 | s = s_prime # 刷新状态
87 | score += r # 累积奖励
88 |
89 | if n_epi > 1000:
90 | env.render()
91 | # im = Image.fromarray(s)
92 | # im.save("res/%d.jpg" % info['frames'][0])
93 |
94 | if done: # 当前episode终止
95 | break
96 | # episode终止后,训练一次网络
97 | pi.train_net(tape)
98 | del tape
99 |
100 | if n_epi%print_interval==0 and n_epi!=0:
101 | returns.append(score/print_interval)
102 | print(f"# of episode :{n_epi}, avg score : {score/print_interval}")
103 | score = 0.0
104 | env.close() # 关闭环境
105 |
106 | plt.plot(np.arange(len(returns))*print_interval, returns)
107 | plt.plot(np.arange(len(returns))*print_interval, returns, 's')
108 | plt.xlabel('回合数')
109 | plt.ylabel('总回报')
110 | plt.savefig('reinforce-tf-cartpole.svg')
111 |
112 | if __name__ == '__main__':
113 | main()
--------------------------------------------------------------------------------
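The reversed loop in Policy.train_net implements the discounted return G_t = r_t + gamma * G_{t+1}. A tiny worked example of just that recursion (standalone, reward values chosen arbitrarily):

gamma = 0.98
rewards = [1.0, 1.0, 1.0]      # rewards of a 3-step episode
R, returns = 0.0, []
for r in reversed(rewards):    # iterate from the last step backwards
    R = r + gamma * R
    returns.insert(0, R)
print(returns)                 # [2.9404, 1.98, 1.0]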
/ch14-强化学习/dqn_tf.py:
--------------------------------------------------------------------------------
1 | import collections
2 | import random
3 | import gym,os
4 | import numpy as np
5 | import tensorflow as tf
6 | from tensorflow import keras
7 | from tensorflow.keras import layers,optimizers,losses
8 |
9 | env = gym.make('CartPole-v1') # 创建游戏环境
10 | env.seed(1234)
11 | tf.random.set_seed(1234)
12 | np.random.seed(1234)
13 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
14 | assert tf.__version__.startswith('2.')
15 |
16 | # Hyperparameters
17 | learning_rate = 0.0002
18 | gamma = 0.99
19 | buffer_limit = 50000
20 | batch_size = 32
21 |
22 |
23 | class ReplayBuffer():
24 | # 经验回放池
25 | def __init__(self):
26 | # 双向队列
27 | self.buffer = collections.deque(maxlen=buffer_limit)
28 |
29 | def put(self, transition):
30 | self.buffer.append(transition)
31 |
32 | def sample(self, n):
33 | # 从回放池采样n个5元组
34 | mini_batch = random.sample(self.buffer, n)
35 | s_lst, a_lst, r_lst, s_prime_lst, done_mask_lst = [], [], [], [], []
36 | # 按类别进行整理
37 | for transition in mini_batch:
38 | s, a, r, s_prime, done_mask = transition
39 | s_lst.append(s)
40 | a_lst.append([a])
41 | r_lst.append([r])
42 | s_prime_lst.append(s_prime)
43 | done_mask_lst.append([done_mask])
44 | # 转换成Tensor
45 | return tf.constant(s_lst, dtype=tf.float32),\
46 | tf.constant(a_lst, dtype=tf.int32), \
47 | tf.constant(r_lst, dtype=tf.float32), \
48 | tf.constant(s_prime_lst, dtype=tf.float32), \
49 | tf.constant(done_mask_lst, dtype=tf.float32)
50 |
51 |
52 | def size(self):
53 | return len(self.buffer)
54 |
55 |
56 | class Qnet(keras.Model):
57 | def __init__(self):
58 | # 创建Q网络,输入为状态向量,输出为动作的Q值
59 | super(Qnet, self).__init__()
60 | self.fc1 = layers.Dense(256, kernel_initializer='he_normal')
61 | self.fc2 = layers.Dense(256, kernel_initializer='he_normal')
62 | self.fc3 = layers.Dense(2, kernel_initializer='he_normal')
63 |
64 | def call(self, x, training=None):
65 | x = tf.nn.relu(self.fc1(x))
66 | x = tf.nn.relu(self.fc2(x))
67 | x = self.fc3(x)
68 | return x
69 |
70 | def sample_action(self, s, epsilon):
71 | # 送入状态向量,获取策略: [4]
72 | s = tf.constant(s, dtype=tf.float32)
73 | # s: [4] => [1,4]
74 | s = tf.expand_dims(s, axis=0)
75 | out = self(s)[0]
76 | coin = random.random()
77 | # 策略改进:e-贪心方式
78 | if coin < epsilon:
79 | # epsilon大的概率随机选取
80 | return random.randint(0, 1)
81 | else: # 选择Q值最大的动作
82 | return int(tf.argmax(out))
83 |
84 |
85 | def train(q, q_target, memory, optimizer):
86 | # 通过Q网络和影子网络来构造贝尔曼方程的误差,
87 | # 并只更新Q网络,影子网络的更新会滞后Q网络
88 | huber = losses.Huber()
89 | for i in range(10): # 训练10次
90 | # 从缓冲池采样
91 | s, a, r, s_prime, done_mask = memory.sample(batch_size)
92 | with tf.GradientTape() as tape:
93 | # s: [b, 4]
94 | q_out = q(s) # 得到Q(s,a)的分布
95 | # 由于TF的gather_nd与pytorch的gather功能不一样,需要构造
96 | # gather_nd需要的坐标参数,indices:[b, 2]
97 | # pi_a = pi.gather(1, a) # pytorch只需要一行即可实现
98 | indices = tf.expand_dims(tf.range(a.shape[0]), axis=1)
99 | indices = tf.concat([indices, a], axis=1)
100 | q_a = tf.gather_nd(q_out, indices) # Q value of the chosen action, shape [b]
101 | q_a = tf.expand_dims(q_a, axis=1) # [b]=> [b,1]
102 | # 得到Q(s',a)的最大值,它来自影子网络! [b,4]=>[b,2]=>[b,1]
103 | max_q_prime = tf.reduce_max(q_target(s_prime),axis=1,keepdims=True)
104 | # 构造Q(s,a_t)的目标值,来自贝尔曼方程
105 | target = r + gamma * max_q_prime * done_mask
106 | # 计算Q(s,a_t)与目标值的误差
107 | loss = huber(q_a, target)
108 | # 更新网络,使得Q(s,a_t)估计符合贝尔曼方程
109 | grads = tape.gradient(loss, q.trainable_variables)
110 | # for p in grads:
111 | # print(tf.norm(p))
112 | # print(grads)
113 | optimizer.apply_gradients(zip(grads, q.trainable_variables))
114 |
115 |
116 | def main():
117 | env = gym.make('CartPole-v1') # 创建环境
118 | q = Qnet() # 创建Q网络
119 | q_target = Qnet() # 创建影子网络
120 | q.build(input_shape=(2,4))
121 | q_target.build(input_shape=(2,4))
122 | for src, dest in zip(q.variables, q_target.variables):
123 | dest.assign(src) # 影子网络权值来自Q
124 | memory = ReplayBuffer() # 创建回放池
125 |
126 | print_interval = 20
127 | score = 0.0
128 | optimizer = optimizers.Adam(learning_rate=learning_rate)
129 |
130 | for n_epi in range(10000): # 训练次数
131 | # epsilon decays from 8% to 1%, so later episodes rely more on the greedy (max-Q) action
132 | epsilon = max(0.01, 0.08 - 0.01 * (n_epi / 200))
133 | s = env.reset() # 复位环境
134 | for t in range(600): # 一个回合最大时间戳
135 | # if n_epi>1000:
136 | # env.render()
137 | # 根据当前Q网络提取策略,并改进策略
138 | a = q.sample_action(s, epsilon)
139 | # 使用改进的策略与环境交互
140 | s_prime, r, done, info = env.step(a)
141 | done_mask = 0.0 if done else 1.0 # 结束标志掩码
142 | # 保存5元组
143 | memory.put((s, a, r / 100.0, s_prime, done_mask))
144 | s = s_prime # 刷新状态
145 | score += r # 记录总回报
146 | if done: # 回合结束
147 | break
148 |
149 | if memory.size() > 2000: # start training once the replay buffer holds more than 2000 transitions
150 | train(q, q_target, memory, optimizer)
151 |
152 | if n_epi % print_interval == 0 and n_epi != 0:
153 | for src, dest in zip(q.variables, q_target.variables):
154 | dest.assign(src) # 影子网络权值来自Q
155 | print("# of episode :{}, avg score : {:.1f}, buffer size : {}, " \
156 | "epsilon : {:.1f}%" \
157 | .format(n_epi, score / print_interval, memory.size(), epsilon * 100))
158 | score = 0.0
159 | env.close()
160 |
161 |
162 | if __name__ == '__main__':
163 | main()
--------------------------------------------------------------------------------
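The comment inside train() explains that tf.gather_nd needs explicit (row, action) coordinates to pick Q(s, a_t), unlike PyTorch's gather. A standalone sketch of that indexing trick (toy Q values, not taken from the script):

import tensorflow as tf

q_out = tf.constant([[0.1, 0.9],
                     [0.7, 0.3]])  # Q values for a batch of 2 states and 2 actions
a = tf.constant([[1], [0]])        # chosen action per state, shape [b, 1]

indices = tf.expand_dims(tf.range(a.shape[0]), axis=1)  # [[0], [1]]
indices = tf.concat([indices, a], axis=1)                # [[0, 1], [1, 0]]
print(tf.gather_nd(q_out, indices).numpy())              # [0.9 0.7] = Q(s_i, a_i)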
/ch14-强化学习/ppo_tf_cartpole.py:
--------------------------------------------------------------------------------
1 | import matplotlib
2 | from matplotlib import pyplot as plt
3 | matplotlib.rcParams['font.size'] = 18
4 | matplotlib.rcParams['figure.titlesize'] = 18
5 | matplotlib.rcParams['figure.figsize'] = [9, 7]
6 | matplotlib.rcParams['font.family'] = ['KaiTi']
7 | matplotlib.rcParams['axes.unicode_minus']=False
8 |
9 | plt.figure()
10 |
11 | import gym,os
12 | import numpy as np
13 | import tensorflow as tf
14 | from tensorflow import keras
15 | from tensorflow.keras import layers,optimizers,losses
16 | from collections import namedtuple
17 | # mini-batches are drawn with np.random.choice below; no torch.utils.data samplers are needed
18 |
19 | env = gym.make('CartPole-v1') # 创建游戏环境
20 | env.seed(2222)
21 | tf.random.set_seed(2222)
22 | np.random.seed(2222)
23 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
24 | assert tf.__version__.startswith('2.')
25 |
26 |
27 |
28 | gamma = 0.98 # reward discount factor
29 | epsilon = 0.2 # PPO clipping hyperparameter: the ratio is clipped to [0.8, 1.2]
30 | batch_size = 32 # batch size
31 |
32 |
33 | # Recreate the environment unwrapped (no built-in step limit) and reseed it for reproducibility
34 | env = gym.make('CartPole-v0').unwrapped; env.seed(2222)
35 | Transition = namedtuple('Transition', ['state', 'action', 'a_log_prob', 'reward', 'next_state'])
36 |
37 |
38 | class Actor(keras.Model):
39 | def __init__(self):
40 | super(Actor, self).__init__()
41 | # 策略网络,也叫Actor网络,输出为概率分布pi(a|s)
42 | self.fc1 = layers.Dense(100, kernel_initializer='he_normal')
43 | self.fc2 = layers.Dense(2, kernel_initializer='he_normal')
44 |
45 | def call(self, inputs):
46 | x = tf.nn.relu(self.fc1(inputs))
47 | x = self.fc2(x)
48 | x = tf.nn.softmax(x, axis=1) # 转换成概率
49 | return x
50 |
51 | class Critic(keras.Model):
52 | def __init__(self):
53 | super(Critic, self).__init__()
54 | # 偏置b的估值网络,也叫Critic网络,输出为v(s)
55 | self.fc1 = layers.Dense(100, kernel_initializer='he_normal')
56 | self.fc2 = layers.Dense(1, kernel_initializer='he_normal')
57 |
58 | def call(self, inputs):
59 | x = tf.nn.relu(self.fc1(inputs))
60 | x = self.fc2(x)
61 | return x
62 |
63 |
64 |
65 |
66 | class PPO():
67 | # PPO算法主体
68 | def __init__(self):
69 | super(PPO, self).__init__()
70 | self.actor = Actor() # 创建Actor网络
71 | self.critic = Critic() # 创建Critic网络
72 | self.buffer = [] # 数据缓冲池
73 | self.actor_optimizer = optimizers.Adam(1e-3) # Actor优化器
74 | self.critic_optimizer = optimizers.Adam(3e-3) # Critic优化器
75 |
76 | def select_action(self, s):
77 | # 送入状态向量,获取策略: [4]
78 | s = tf.constant(s, dtype=tf.float32)
79 | # s: [4] => [1,4]
80 | s = tf.expand_dims(s, axis=0)
81 | # 获取策略分布: [1, 2]
82 | prob = self.actor(s)
83 | # 从类别分布中采样1个动作, shape: [1]
84 | a = tf.random.categorical(tf.math.log(prob), 1)[0]
85 | a = int(a) # Tensor转数字
86 | return a, float(prob[0][a]) # 返回动作及其概率
87 |
88 | def get_value(self, s):
89 | # 送入状态向量,获取策略: [4]
90 | s = tf.constant(s, dtype=tf.float32)
91 | # s: [4] => [1,4]
92 | s = tf.expand_dims(s, axis=0)
93 | # 获取策略分布: [1, 2]
94 | v = self.critic(s)[0]
95 | return float(v) # 返回v(s)
96 |
97 | def store_transition(self, transition):
98 | # 存储采样数据
99 | self.buffer.append(transition)
100 |
101 | def optimize(self):
102 | # 优化网络主函数
103 | # 从缓存中取出样本数据,转换成Tensor
104 | state = tf.constant([t.state for t in self.buffer], dtype=tf.float32)
105 | action = tf.constant([t.action for t in self.buffer], dtype=tf.int32)
106 | action = tf.reshape(action,[-1,1])
107 | reward = [t.reward for t in self.buffer]
108 | old_action_log_prob = tf.constant([t.a_log_prob for t in self.buffer], dtype=tf.float32)
109 | old_action_log_prob = tf.reshape(old_action_log_prob, [-1,1])
110 | # 通过MC方法循环计算R(st)
111 | R = 0
112 | Rs = []
113 | for r in reward[::-1]:
114 | R = r + gamma * R
115 | Rs.insert(0, R)
116 | Rs = tf.constant(Rs, dtype=tf.float32)
117 | # 对缓冲池数据大致迭代10遍
118 | for _ in range(round(10*len(self.buffer)/batch_size)):
119 | # 随机从缓冲池采样batch size大小样本
120 | index = np.random.choice(np.arange(len(self.buffer)), batch_size, replace=False)
121 | # 构建梯度跟踪环境
122 | with tf.GradientTape() as tape1, tf.GradientTape() as tape2:
123 | # 取出R(st),[b,1]
124 | v_target = tf.expand_dims(tf.gather(Rs, index, axis=0), axis=1)
125 | # 计算v(s)预测值,也就是偏置b,我们后面会介绍为什么写成v
126 | v = self.critic(tf.gather(state, index, axis=0))
127 | delta = v_target - v # 计算优势值
128 | advantage = tf.stop_gradient(delta) # 断开梯度连接
129 | # 由于TF的gather_nd与pytorch的gather功能不一样,需要构造
130 | # gather_nd需要的坐标参数,indices:[b, 2]
131 | # pi_a = pi.gather(1, a) # pytorch只需要一行即可实现
132 | a = tf.gather(action, index, axis=0) # 取出batch的动作at
133 | # batch的动作分布pi(a|st)
134 | pi = self.actor(tf.gather(state, index, axis=0))
135 | indices = tf.expand_dims(tf.range(a.shape[0]), axis=1)
136 | indices = tf.concat([indices, a], axis=1)
137 | pi_a = tf.gather_nd(pi, indices) # 动作的概率值pi(at|st), [b]
138 | pi_a = tf.expand_dims(pi_a, axis=1) # [b]=> [b,1]
139 | # importance-sampling ratio pi(a|s)/pi_old(a|s); note that old_action_log_prob stores probabilities (see select_action), not log-probabilities
140 | ratio = (pi_a / tf.gather(old_action_log_prob, index, axis=0))
141 | surr1 = ratio * advantage
142 | surr2 = tf.clip_by_value(ratio, 1 - epsilon, 1 + epsilon) * advantage
143 | # PPO误差函数
144 | policy_loss = -tf.reduce_mean(tf.minimum(surr1, surr2))
145 | # 对于偏置v来说,希望与MC估计的R(st)越接近越好
146 | value_loss = losses.MSE(v_target, v)
147 | # 优化策略网络
148 | grads = tape1.gradient(policy_loss, self.actor.trainable_variables)
149 | self.actor_optimizer.apply_gradients(zip(grads, self.actor.trainable_variables))
150 | # 优化偏置值网络
151 | grads = tape2.gradient(value_loss, self.critic.trainable_variables)
152 | self.critic_optimizer.apply_gradients(zip(grads, self.critic.trainable_variables))
153 |
154 | self.buffer = [] # 清空已训练数据
155 |
156 |
157 | def main():
158 | agent = PPO()
159 | returns = [] # 统计总回报
160 | total = 0 # 一段时间内平均回报
161 | for i_epoch in range(500): # 训练回合数
162 | state = env.reset() # 复位环境
163 | for t in range(500): # 最多考虑500步
164 | # 通过最新策略与环境交互
165 | action, action_prob = agent.select_action(state)
166 | next_state, reward, done, _ = env.step(action)
167 | # 构建样本并存储
168 | trans = Transition(state, action, action_prob, reward, next_state)
169 | agent.store_transition(trans)
170 | state = next_state # 刷新状态
171 | total += reward # 累积激励
172 | if done: # 合适的时间点训练网络
173 | if len(agent.buffer) >= batch_size:
174 | agent.optimize() # 训练网络
175 | break
176 |
177 | if i_epoch % 20 == 0: # 每20个回合统计一次平均回报
178 | returns.append(total/20)
179 | total = 0
180 | print(i_epoch, returns[-1])
181 |
182 | print(np.array(returns))
183 | plt.figure()
184 | plt.plot(np.arange(len(returns))*20, np.array(returns))
185 | plt.plot(np.arange(len(returns))*20, np.array(returns), 's')
186 | plt.xlabel('回合数')
187 | plt.ylabel('总回报')
188 | plt.savefig('ppo-tf-cartpole.svg')
189 |
190 |
191 | if __name__ == '__main__':
192 | main()
193 | print("end")
--------------------------------------------------------------------------------
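The heart of optimize() is the clipped surrogate objective: with ratio r = pi(a|s)/pi_old(a|s), advantage A and epsilon = 0.2, the policy loss is -mean(min(r*A, clip(r, 1-eps, 1+eps)*A)). A worked two-sample example with toy numbers:

import tensorflow as tf

eps = 0.2
ratio = tf.constant([[1.5], [0.6]])   # importance-sampling ratios
adv   = tf.constant([[2.0], [-1.0]])  # advantages
surr1 = ratio * adv                                       # [3.0, -0.6]
surr2 = tf.clip_by_value(ratio, 1 - eps, 1 + eps) * adv   # [2.4, -0.8]
loss = -tf.reduce_mean(tf.minimum(surr1, surr2))
print(float(loss))                    # -(2.4 + (-0.8)) / 2 = -0.8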
/ch15-自定义数据集/pokemon.py:
--------------------------------------------------------------------------------
1 | import os, glob
2 | import random, csv
3 | import tensorflow as tf
4 |
5 |
6 |
7 | def load_csv(root, filename, name2label):
8 | # 从csv文件返回images,labels列表
9 | # root:数据集根目录,filename:csv文件名, name2label:类别名编码表
10 | if not os.path.exists(os.path.join(root, filename)):
11 | # 如果csv文件不存在,则创建
12 | images = []
13 | for name in name2label.keys(): # 遍历所有子目录,获得所有的图片
14 | # only consider images with png/jpg/jpeg suffixes, e.g. 'pokemon\\mewtwo\\00001.png'
15 | images += glob.glob(os.path.join(root, name, '*.png'))
16 | images += glob.glob(os.path.join(root, name, '*.jpg'))
17 | images += glob.glob(os.path.join(root, name, '*.jpeg'))
18 | # 打印数据集信息:1167, 'pokemon\\bulbasaur\\00000000.png'
19 | print(len(images), images)
20 | random.shuffle(images) # 随机打散顺序
21 | # 创建csv文件,并存储图片路径及其label信息
22 | with open(os.path.join(root, filename), mode='w', newline='') as f:
23 | writer = csv.writer(f)
24 | for img in images: # 'pokemon\\bulbasaur\\00000000.png'
25 | name = img.split(os.sep)[-2]
26 | label = name2label[name]
27 | # 'pokemon\\bulbasaur\\00000000.png', 0
28 | writer.writerow([img, label])
29 | print('written into csv file:', filename)
30 |
31 | # 此时已经有csv文件,直接读取
32 | images, labels = [], []
33 | with open(os.path.join(root, filename)) as f:
34 | reader = csv.reader(f)
35 | for row in reader:
36 | # 'pokemon\\bulbasaur\\00000000.png', 0
37 | img, label = row
38 | label = int(label)
39 | images.append(img)
40 | labels.append(label)
41 | # 返回图片路径list和标签list
42 | return images, labels
43 |
44 |
45 | def load_pokemon(root, mode='train'):
46 | # 创建数字编码表
47 | name2label = {} # "sq...":0
48 | # 遍历根目录下的子文件夹,并排序,保证映射关系固定
49 | for name in sorted(os.listdir(os.path.join(root))):
50 | # 跳过非文件夹
51 | if not os.path.isdir(os.path.join(root, name)):
52 | continue
53 | # 给每个类别编码一个数字
54 | name2label[name] = len(name2label.keys())
55 |
56 | # 读取Label信息
57 | # [file1,file2,], [3,1]
58 | images, labels = load_csv(root, 'images.csv', name2label)
59 |
60 | if mode == 'train': # 60%
61 | images = images[:int(0.6 * len(images))]
62 | labels = labels[:int(0.6 * len(labels))]
63 | elif mode == 'val': # 20% = 60%->80%
64 | images = images[int(0.6 * len(images)):int(0.8 * len(images))]
65 | labels = labels[int(0.6 * len(labels)):int(0.8 * len(labels))]
66 | else: # 20% = 80%->100%
67 | images = images[int(0.8 * len(images)):]
68 | labels = labels[int(0.8 * len(labels)):]
69 |
70 | return images, labels, name2label
71 |
72 | # 这里的mean和std根据真实的数据计算获得,比如ImageNet
73 | img_mean = tf.constant([0.485, 0.456, 0.406])
74 | img_std = tf.constant([0.229, 0.224, 0.225])
75 | def normalize(x, mean=img_mean, std=img_std):
76 | # 标准化
77 | # x: [224, 224, 3]
78 | # mean: [224, 224, 3], std: [3]
79 | x = (x - mean)/std
80 | return x
81 |
82 | def denormalize(x, mean=img_mean, std=img_std):
83 | # 标准化的逆过程
84 | x = x * std + mean
85 | return x
86 |
87 | def preprocess(x,y):
88 | # x: path of a single image (tf.string), y: its integer label
89 | x = tf.io.read_file(x) # 根据路径读取图片
90 | x = tf.image.decode_jpeg(x, channels=3) # 图片解码
91 | x = tf.image.resize(x, [244, 244]) # 图片缩放
92 |
93 | # 数据增强
94 | # x = tf.image.random_flip_up_down(x)
95 | x= tf.image.random_flip_left_right(x) # 左右镜像
96 | x = tf.image.random_crop(x, [224, 224, 3]) # 随机裁剪
97 | # 转换成张量
98 | # x: [0,255]=> 0~1
99 | x = tf.cast(x, dtype=tf.float32) / 255.
100 | # 0~1 => D(0,1)
101 | x = normalize(x) # 标准化
102 | y = tf.convert_to_tensor(y) # 转换成张量
103 |
104 | return x, y
105 |
106 |
107 | def main():
108 | import time
109 |
110 |
111 |
112 | # 加载pokemon数据集,指定加载训练集
113 | images, labels, table = load_pokemon('pokemon', 'train')
114 | print('images:', len(images), images)
115 | print('labels:', len(labels), labels)
116 | print('table:', table)
117 |
118 | # images: string path
119 | # labels: number
120 | db = tf.data.Dataset.from_tensor_slices((images, labels))
121 | db = db.shuffle(1000).map(preprocess).batch(32)
122 |
123 | # 创建TensorBoard对象
124 | writer = tf.summary.create_file_writer('logs')
125 | for step, (x,y) in enumerate(db):
126 | # x: [32, 224, 224, 3]
127 | # y: [32]
128 | with writer.as_default():
129 | x = denormalize(x) # 反向normalize,方便可视化
130 | # 写入图片数据
131 | tf.summary.image('img',x,step=step,max_outputs=9)
132 | time.sleep(5)
133 |
134 |
135 |
136 |
137 | if __name__ == '__main__':
138 | main()
--------------------------------------------------------------------------------
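normalize and denormalize above are exact inverses, which is why main() can call denormalize(x) to recover displayable pixel values for TensorBoard. A quick round-trip check, assuming the pokemon module is importable:

import tensorflow as tf
from pokemon import normalize, denormalize

x = tf.random.uniform([224, 224, 3])           # pixel values already scaled to [0, 1]
x_rt = denormalize(normalize(x))
print(float(tf.reduce_max(tf.abs(x - x_rt))))  # ~1e-7, limited only by float precision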
/ch15-自定义数据集/resnet.py:
--------------------------------------------------------------------------------
1 | import os
2 | import tensorflow as tf
3 | import numpy as np
4 | from tensorflow import keras
5 | from tensorflow.keras import layers
6 |
7 |
8 |
9 | tf.random.set_seed(22)
10 | np.random.seed(22)
11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
12 | assert tf.__version__.startswith('2.')
13 |
14 |
15 |
16 | class ResnetBlock(keras.Model):
17 |
18 | def __init__(self, channels, strides=1):
19 | super(ResnetBlock, self).__init__()
20 |
21 | self.channels = channels
22 | self.strides = strides
23 |
24 | self.conv1 = layers.Conv2D(channels, 3, strides=strides,
25 | padding=[[0,0],[1,1],[1,1],[0,0]])
26 | self.bn1 = keras.layers.BatchNormalization()
27 | self.conv2 = layers.Conv2D(channels, 3, strides=1,
28 | padding=[[0,0],[1,1],[1,1],[0,0]])
29 | self.bn2 = keras.layers.BatchNormalization()
30 |
31 | if strides!=1:
32 | self.down_conv = layers.Conv2D(channels, 1, strides=strides, padding='valid')
33 | self.down_bn = tf.keras.layers.BatchNormalization()
34 |
35 | def call(self, inputs, training=None):
36 | residual = inputs
37 |
38 | x = self.conv1(inputs)
39 | x = tf.nn.relu(x)
40 | x = self.bn1(x, training=training)
41 | x = self.conv2(x)
42 | x = tf.nn.relu(x)
43 | x = self.bn2(x, training=training)
44 |
45 | # 残差连接
46 | if self.strides!=1:
47 | residual = self.down_conv(inputs)
48 | residual = tf.nn.relu(residual)
49 | residual = self.down_bn(residual, training=training)
50 |
51 | x = x + residual
52 | x = tf.nn.relu(x)
53 | return x
54 |
55 |
56 | class ResNet(keras.Model):
57 |
58 | def __init__(self, num_classes, initial_filters=16, **kwargs):
59 | super(ResNet, self).__init__(**kwargs)
60 |
61 | self.stem = layers.Conv2D(initial_filters, 3, strides=3, padding='valid')
62 |
63 | self.blocks = keras.models.Sequential([
64 | ResnetBlock(initial_filters * 2, strides=3),
65 | ResnetBlock(initial_filters * 2, strides=1),
66 | # layers.Dropout(rate=0.5),
67 |
68 | ResnetBlock(initial_filters * 4, strides=3),
69 | ResnetBlock(initial_filters * 4, strides=1),
70 |
71 | ResnetBlock(initial_filters * 8, strides=2),
72 | ResnetBlock(initial_filters * 8, strides=1),
73 |
74 | ResnetBlock(initial_filters * 16, strides=2),
75 | ResnetBlock(initial_filters * 16, strides=1),
76 | ])
77 |
78 | self.final_bn = layers.BatchNormalization()
79 | self.avg_pool = layers.GlobalMaxPool2D()
80 | self.fc = layers.Dense(num_classes)
81 |
82 | def call(self, inputs, training=None):
83 | # print('x:',inputs.shape)
84 | out = self.stem(inputs)
85 | out = tf.nn.relu(out)
86 |
87 | # print('stem:',out.shape)
88 |
89 | out = self.blocks(out, training=training)
90 | # print('res:',out.shape)
91 |
92 | out = self.final_bn(out, training=training)
93 | # out = tf.nn.relu(out)
94 |
95 | out = self.avg_pool(out)
96 |
97 | # print('avg_pool:',out.shape)
98 | out = self.fc(out)
99 |
100 | # print('out:',out.shape)
101 |
102 | return out
103 |
104 |
105 |
106 | def main():
107 | num_classes = 5
108 |
109 | resnet18 = ResNet(num_classes)
110 | resnet18.build(input_shape=(4,224,224,3))
111 | resnet18.summary()
112 |
113 |
114 |
115 |
116 |
117 |
118 | if __name__ == '__main__':
119 | main()
--------------------------------------------------------------------------------
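When strides != 1, ResnetBlock projects the input with a 1x1 convolution (down_conv) so that the shortcut matches the main branch before the addition. A minimal standalone sketch of that shape matching using stock 'same'/'valid' padding (hypothetical shapes, not the block's exact padding scheme):

import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal([4, 32, 32, 16])
main = layers.Conv2D(32, 3, strides=2, padding='same')(x)       # [4, 16, 16, 32]
shortcut = layers.Conv2D(32, 1, strides=2, padding='valid')(x)  # [4, 16, 16, 32]
print((main + shortcut).shape)                                  # shapes agree, so the residual add works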
/ch15-自定义数据集/train_scratch.py:
--------------------------------------------------------------------------------
1 | import matplotlib
2 | from matplotlib import pyplot as plt
3 | matplotlib.rcParams['font.size'] = 18
4 | matplotlib.rcParams['figure.titlesize'] = 18
5 | matplotlib.rcParams['figure.figsize'] = [9, 7]
6 | matplotlib.rcParams['font.family'] = ['KaiTi']
7 | matplotlib.rcParams['axes.unicode_minus']=False
8 |
9 | import os
10 | import tensorflow as tf
11 | import numpy as np
12 | from tensorflow import keras
13 | from tensorflow.keras import layers,optimizers,losses
14 | from tensorflow.keras.callbacks import EarlyStopping
15 |
16 | tf.random.set_seed(1234)
17 | np.random.seed(1234)
18 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
19 | assert tf.__version__.startswith('2.')
20 |
21 |
22 | from pokemon import load_pokemon,normalize
23 |
24 |
25 |
26 | def preprocess(x,y):
27 | # x: 图片的路径,y:图片的数字编码
28 | x = tf.io.read_file(x)
29 | x = tf.image.decode_jpeg(x, channels=3) # decode as 3-channel RGB
30 | x = tf.image.resize(x, [244, 244])
31 |
32 | x = tf.image.random_flip_left_right(x)
33 | x = tf.image.random_flip_up_down(x)
34 | x = tf.image.random_crop(x, [224,224,3])
35 |
36 | # x: [0,255] => [0,1], then standardized with the ImageNet mean/std
37 | x = tf.cast(x, dtype=tf.float32) / 255.
38 | x = normalize(x)
39 | y = tf.convert_to_tensor(y)
40 | y = tf.one_hot(y, depth=5)
41 |
42 | return x, y
43 |
44 |
45 | batchsz = 32
46 | # build the training Dataset object
47 | images, labels, table = load_pokemon('pokemon',mode='train')
48 | db_train = tf.data.Dataset.from_tensor_slices((images, labels))
49 | db_train = db_train.shuffle(1000).map(preprocess).batch(batchsz)
50 | # build the validation Dataset object
51 | images2, labels2, table = load_pokemon('pokemon',mode='val')
52 | db_val = tf.data.Dataset.from_tensor_slices((images2, labels2))
53 | db_val = db_val.map(preprocess).batch(batchsz)
54 | # build the test Dataset object
55 | images3, labels3, table = load_pokemon('pokemon',mode='test')
56 | db_test = tf.data.Dataset.from_tensor_slices((images3, labels3))
57 | db_test = db_test.map(preprocess).batch(batchsz)
58 |
59 | # Build a DenseNet121 backbone with randomly initialized weights (training from scratch),
60 | net = keras.applications.DenseNet121(weights=None, include_top=False, pooling='max')
61 | # drop the top classifier and end with global max pooling; all parameters are trained
62 | net.trainable = True
63 | newnet = keras.Sequential([
64 | net, # 去掉最后一层的DenseNet121
65 | layers.Dense(1024, activation='relu'), # 追加全连接层
66 | layers.BatchNormalization(), # 追加BN层
67 | layers.Dropout(rate=0.5), # 追加Dropout层,防止过拟合
68 | layers.Dense(5) # 根据宝可梦数据的任务,设置最后一层输出节点数为5
69 | ])
70 | newnet.build(input_shape=(4,224,224,3))
71 | newnet.summary()
72 |
73 | # 创建Early Stopping类,连续3次不下降则终止
74 | early_stopping = EarlyStopping(
75 | monitor='val_accuracy',
76 | min_delta=0.001,
77 | patience=3
78 | )
79 |
80 | newnet.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
81 | loss=losses.CategoricalCrossentropy(from_logits=True),
82 | metrics=['accuracy'])
83 | history = newnet.fit(db_train, validation_data=db_val, validation_freq=1, epochs=100,
84 | callbacks=[early_stopping])
85 | history = history.history
86 | print(history.keys())
87 | print(history['val_accuracy'])
88 | print(history['accuracy'])
89 | test_acc = newnet.evaluate(db_test)
90 |
91 | plt.figure()
92 | returns = history['val_accuracy']
93 | plt.plot(np.arange(len(returns)), returns, label='验证准确率')
94 | plt.plot(np.arange(len(returns)), returns, 's')
95 | returns = history['accuracy']
96 | plt.plot(np.arange(len(returns)), returns, label='训练准确率')
97 | plt.plot(np.arange(len(returns)), returns, 's')
98 |
99 | plt.plot([len(returns)-1],[test_acc[-1]], 'D', label='测试准确率')
100 | plt.legend()
101 | plt.xlabel('Epoch')
102 | plt.ylabel('准确率')
103 | plt.savefig('scratch.svg')
--------------------------------------------------------------------------------
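The EarlyStopping callback above stops training after patience=3 epochs without at least min_delta improvement in val_accuracy, but it leaves the model with the weights of the last epoch. If the best-epoch weights are wanted instead, restore_best_weights can be set; a sketch of that variant (an alternative, not what the script uses):

from tensorflow.keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(
    monitor='val_accuracy',
    min_delta=0.001,
    patience=3,
    restore_best_weights=True,  # roll back to the weights of the best val_accuracy epoch
)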
/ch15-自定义数据集/train_transfer.py:
--------------------------------------------------------------------------------
1 | import matplotlib
2 | from matplotlib import pyplot as plt
3 | matplotlib.rcParams['font.size'] = 18
4 | matplotlib.rcParams['figure.titlesize'] = 18
5 | matplotlib.rcParams['figure.figsize'] = [9, 7]
6 | matplotlib.rcParams['font.family'] = ['KaiTi']
7 | matplotlib.rcParams['axes.unicode_minus']=False
8 |
9 | import os
10 | import tensorflow as tf
11 | import numpy as np
12 | from tensorflow import keras
13 | from tensorflow.keras import layers,optimizers,losses
14 | from tensorflow.keras.callbacks import EarlyStopping
15 |
16 | tf.random.set_seed(2222)
17 | np.random.seed(2222)
18 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
19 | assert tf.__version__.startswith('2.')
20 |
21 |
22 | from pokemon import load_pokemon,normalize
23 |
24 |
25 |
26 | def preprocess(x,y):
27 | # x: 图片的路径,y:图片的数字编码
28 | x = tf.io.read_file(x)
29 | x = tf.image.decode_jpeg(x, channels=3) # decode as 3-channel RGB
30 | x = tf.image.resize(x, [244, 244])
31 |
32 | x = tf.image.random_flip_left_right(x)
33 | x = tf.image.random_flip_up_down(x)
34 | x = tf.image.random_crop(x, [224,224,3])
35 |
36 | # x: [0,255] => [0,1], then standardized with the ImageNet mean/std
37 | x = tf.cast(x, dtype=tf.float32) / 255.
38 | x = normalize(x)
39 | y = tf.convert_to_tensor(y)
40 | y = tf.one_hot(y, depth=5)
41 |
42 | return x, y
43 |
44 |
45 | batchsz = 32
46 | # build the training Dataset object
47 | images, labels, table = load_pokemon('pokemon',mode='train')
48 | db_train = tf.data.Dataset.from_tensor_slices((images, labels))
49 | db_train = db_train.shuffle(1000).map(preprocess).batch(batchsz)
50 | # build the validation Dataset object
51 | images2, labels2, table = load_pokemon('pokemon',mode='val')
52 | db_val = tf.data.Dataset.from_tensor_slices((images2, labels2))
53 | db_val = db_val.map(preprocess).batch(batchsz)
54 | # build the test Dataset object
55 | images3, labels3, table = load_pokemon('pokemon',mode='test')
56 | db_test = tf.data.Dataset.from_tensor_slices((images3, labels3))
57 | db_test = db_test.map(preprocess).batch(batchsz)
58 |
59 | # Load a DenseNet121 backbone pretrained on ImageNet, drop its top classifier and end with global max pooling
60 | net = keras.applications.DenseNet121(weights='imagenet', include_top=False, pooling='max')
61 | # Fine-tune the whole backbone together with the new head (set trainable=False to freeze it instead)
62 | net.trainable = True
63 | newnet = keras.Sequential([
64 | net, # 去掉最后一层的DenseNet121
65 | layers.Dense(1024, activation='relu'), # 追加全连接层
66 | layers.BatchNormalization(), # 追加BN层
67 | layers.Dropout(rate=0.5), # 追加Dropout层,防止过拟合
68 | layers.Dense(5) # 根据宝可梦数据的任务,设置最后一层输出节点数为5
69 | ])
70 | newnet.build(input_shape=(4,224,224,3))
71 | newnet.summary()
72 |
73 | # 创建Early Stopping类,连续3次不下降则终止
74 | early_stopping = EarlyStopping(
75 | monitor='val_accuracy',
76 | min_delta=0.001,
77 | patience=3
78 | )
79 |
80 | newnet.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
81 | loss=losses.CategoricalCrossentropy(from_logits=True),
82 | metrics=['accuracy'])
83 | history = newnet.fit(db_train, validation_data=db_val, validation_freq=1, epochs=100,
84 | callbacks=[early_stopping])
85 | history = history.history
86 | print(history.keys())
87 | print(history['val_accuracy'])
88 | print(history['accuracy'])
89 | test_acc = newnet.evaluate(db_test)
90 |
91 | plt.figure()
92 | returns = history['val_accuracy']
93 | plt.plot(np.arange(len(returns)), returns, label='验证准确率')
94 | plt.plot(np.arange(len(returns)), returns, 's')
95 | returns = history['accuracy']
96 | plt.plot(np.arange(len(returns)), returns, label='训练准确率')
97 | plt.plot(np.arange(len(returns)), returns, 's')
98 |
99 | plt.plot([len(returns)-1],[test_acc[-1]], 'D', label='测试准确率')
100 | plt.legend()
101 | plt.xlabel('Epoch')
102 | plt.ylabel('准确率')
103 | plt.savefig('transfer.svg')
--------------------------------------------------------------------------------
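train_transfer.py fine-tunes the entire ImageNet-pretrained DenseNet121 together with the new head. A common alternative is the frozen-backbone variant, which keeps the pretrained weights fixed and trains only the new layers; a sketch of that setup (an alternative, not what the script above does):

from tensorflow import keras
from tensorflow.keras import layers

backbone = keras.applications.DenseNet121(weights='imagenet', include_top=False, pooling='max')
backbone.trainable = False                 # freeze all DenseNet121 parameters
frozen_model = keras.Sequential([
    backbone,
    layers.Dense(1024, activation='relu'),
    layers.Dropout(rate=0.5),
    layers.Dense(5),                       # 5 Pokemon classes
])
frozen_model.compile(optimizer=keras.optimizers.Adam(1e-3),
                     loss=keras.losses.CategoricalCrossentropy(from_logits=True),
                     metrics=['accuracy'])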
/ch15-自定义数据集/宝可梦数据集.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch15-自定义数据集/宝可梦数据集.pdf
--------------------------------------------------------------------------------
/【《TensorFlow深度学习》】.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/【《TensorFlow深度学习》】.pdf
--------------------------------------------------------------------------------