├── .github
│   └── workflows
│       └── pythonapp.yml
├── .gitignore
├── README.md
├── TensorFlow深度学习(带目录).pdf
├── assets
│   ├── 0.4.目录-双排-1.jpg
│   ├── 0.4.目录-双排-2.jpg
│   ├── 0.4.目录-双排-3.jpg
│   ├── 1.jpg
│   ├── 2.png
│   ├── book-cover.png
│   ├── dglg.jpg
│   ├── dzkjdx.jpg
│   ├── hnxxxy.jpg
│   └── xbgydx.jpg
├── ch01-人工智能绪论
│   ├── autograd.py
│   ├── gpu_accelerate.py
│   ├── tf1.py
│   └── tf2.py
├── ch02-回归问题
│   ├── data.csv
│   ├── linear_regression.py
│   ├── 回归实战.pdf
│   └── 回归问题.pdf
├── ch03-分类问题
│   ├── forward_layer.py
│   ├── forward_tensor.py
│   ├── main.py
│   ├── 手写数字问题.pdf
│   └── 手写数字问题体验.pdf
├── ch04-TensorFlow基础
│   ├── 4.10-forward-prop.py
│   ├── Broadcasting.pdf
│   ├── MNIST数据集的前向传播训练误差曲线.png
│   ├── ch04-TensorFlow基础.ipynb
│   ├── 创建Tensor.pdf
│   ├── 前向传播.pdf
│   ├── 数学运算.pdf
│   ├── 数据类型.pdf
│   ├── 索引与切片-1.pdf
│   ├── 索引与切片-2.pdf
│   └── 维度变换.pdf
├── ch05-TensorFlow进阶
│   ├── acc_topk.py
│   ├── gradient_clip.py
│   ├── mnist_tensor.py
│   ├── 合并与分割.pdf
│   ├── 填充与复制.pdf
│   ├── 张量排序.pdf
│   ├── 张量限幅.pdf
│   ├── 数据统计.pdf
│   └── 高阶特性.pdf
├── ch06-神经网络
│   ├── auto_efficency_regression.py
│   ├── ch06-神经网络.ipynb
│   ├── forward.py
│   ├── nb.py
│   ├── 全接连层.pdf
│   ├── 误差计算.pdf
│   └── 输出方式.pdf
├── ch07-反向传播算法
│   ├── 0.梯度下降-简介.pdf
│   ├── 2.常见函数的梯度.pdf
│   ├── 2nd_derivative.py
│   ├── 3.激活函数及其梯度.pdf
│   ├── 4.损失函数及其梯度.pdf
│   ├── 5.单输出感知机梯度.pdf
│   ├── 6.多输出感知机梯度.pdf
│   ├── 7.链式法则.pdf
│   ├── 8.多层感知机梯度.pdf
│   ├── ch07-反向传播算法.ipynb
│   ├── chain_rule.py
│   ├── crossentropy_loss.py
│   ├── himmelblau.py
│   ├── mse_grad.py
│   ├── multi_output_perceptron.py
│   ├── numpy-backward-prop.py
│   ├── sigmoid_grad.py
│   └── single_output_perceptron.py
├── ch08-Keras高层接口
│   ├── 1.Metrics.pdf
│   ├── 2.Compile&Fit.pdf
│   ├── 3.自定义层.pdf
│   ├── Keras实战CIFAR10.pdf
│   ├── compile_fit.py
│   ├── keras_train.py
│   ├── layer_model.py
│   ├── metrics.py
│   ├── nb.py
│   ├── pretained.py
│   ├── save_load_model.py
│   ├── save_load_weight.py
│   └── 模型加载与保存.pdf
├── ch09-过拟合
│   ├── 9.8-over-fitting-and-under-fitting.py
│   ├── Regularization.pdf
│   ├── compile_fit.py
│   ├── dropout.py
│   ├── lenna.png
│   ├── lenna_crop.png
│   ├── lenna_crop2.png
│   ├── lenna_eras.png
│   ├── lenna_eras2.png
│   ├── lenna_flip.png
│   ├── lenna_flip2.png
│   ├── lenna_guassian.png
│   ├── lenna_perspective.png
│   ├── lenna_resize.png
│   ├── lenna_rotate.png
│   ├── lenna_rotate2.png
│   ├── misc.pdf
│   ├── regularization.py
│   ├── train_evalute_test.py
│   ├── 交叉验证.pdf
│   ├── 学习率与动量.pdf
│   └── 过拟合与欠拟合.pdf
├── ch10-卷积神经网络
│   ├── BatchNorm.pdf
│   ├── CIFAR与VGG实战.pdf
│   ├── ResNet与DenseNet.pdf
│   ├── ResNet实战.pdf
│   ├── bn_main.py
│   ├── cifar10_train.py
│   ├── nb.py
│   ├── resnet.py
│   ├── resnet18_train.py
│   ├── 什么是卷积.pdf
│   ├── 卷积神经网络.pdf
│   ├── 池化与采样.pdf
│   └── 经典卷积网络.pdf
├── ch11-循环神经网络
│   ├── LSTM.pdf
│   ├── LSTM实战.pdf
│   ├── RNN Layer使用.pdf
│   ├── nb.py
│   ├── pretrained.py
│   ├── sentiment_analysis_cell - GRU.py
│   ├── sentiment_analysis_cell - LSTM.py
│   ├── sentiment_analysis_cell.py
│   ├── sentiment_analysis_layer - GRU.py
│   ├── sentiment_analysis_layer - LSTM - pretrained.py
│   ├── sentiment_analysis_layer - LSTM.py
│   ├── sentiment_analysis_layer.py
│   ├── 循环神经网络.pdf
│   ├── 情感分类实战.pdf
│   ├── 时间序列表示.pdf
│   └── 梯度弥散与梯度爆炸.pdf
├── ch12-自编码器
│   ├── AE实战.pdf
│   ├── AutoEncoders.pdf
│   ├── autoencoder.py
│   └── vae.py
├── ch13-生成对抗网络
│   ├── GAN.pdf
│   ├── GAN实战.pdf
│   ├── dataset.py
│   ├── gan.py
│   ├── gan_train.py
│   ├── wgan.py
│   └── wgan_train.py
├── ch14-强化学习
│   ├── REINFORCE_tf.py
│   ├── a3c_tf_cartpole.py
│   ├── dqn_tf.py
│   └── ppo_tf_cartpole.py
├── ch15-自定义数据集
│   ├── pokemon.py
│   ├── resnet.py
│   ├── train_scratch.py
│   ├── train_transfer.py
│   └── 宝可梦数据集.pdf
└── 【《TensorFlow深度学习》】.pdf
/.github/workflows/pythonapp.yml: -------------------------------------------------------------------------------- 1 | name: Python application 2 | on: [push, pull_request] 3 | jobs: 4 | build: 5 | runs-on: ubuntu-latest 6 | steps: 7 | - uses: actions/checkout@v1 8 | - name: Set up Python 3 9 | uses: actions/setup-python@v1 
10 | with: 11 | python-version: 3.x 12 | - name: Install dependencies 13 | run: | 14 | python -m pip install --upgrade pip 15 | pip install flake8 pytest 16 | pip install -r requirements.txt || true 17 | - name: Lint with flake8 18 | run: | 19 | # stop the build if there are Python syntax errors or undefined names 20 | flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics 21 | # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide 22 | flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics 23 | - name: Test with pytest 24 | run: | 25 | pytest || true 26 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.DS_Store 2 | *.bak -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TensorFlow 2深度学习开源书(龙书) 2 | 3 | 基于TensorFlow 2正式版!!! 4 | 理论与实战结合,非常适合入门学习!!! 5 | 6 | - **[纸质书购买链接:京东](https://item.jd.com/12954866.html)** 7 | - **[纸质书购买链接:淘宝](https://detail.tmall.com/item.htm?spm=a230r.1.14.16.18b460abi8w8jJ&id=625801924474&ns=1&abbucket=9)** 8 | 9 | 本仓库包含pdf电子书、配套源代码、配套课件等。部分代码已替换为Ipython Notebook形式,感谢这位[童鞋](https://github.com/Relph1119/deeplearning-with-tensorflow-notes)的整理。 10 | 11 | 开源电子版pdf还可以从[百度网盘下载](https://pan.baidu.com/s/1GgQjhDqSgSfjxqBMsE3RDQ) 提取码:juqs 12 | 感谢云城不及粒火童鞋提供的书签版pdf。 13 | 14 | - **本书的繁体版已经出版,已授权在中国台湾地区上市发行** 15 | 16 | - **本书被“机器之心”,“量子位”等权威媒体报道!** 17 | 18 | - **本库在Github趋势日榜单连续多天全球排名第一!** 19 | 20 | 21 | 22 |

23 | 24 | 25 |

26 | 27 | - 提交错误或者修改等反馈意见,请在Github [Issues](https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book/issues)页面提交 28 | 29 | - 联系邮箱(一般问题建议Github issues交流):liangqu.long AT gmail.com 30 | 31 | - **高校老师索取PPT原素材**等教案,请邮箱联系,并详注院校课程等信息,一般3天内发送邮件回复 32 | 33 | - 使用本书的任何内容时(**仅限非商业用途**),请注明作者和Github链接 34 | 35 | 36 | # 合作院校 37 | 38 | 以下高校已采用本书作为专业教材或参考资料(排名不分先后),欢迎更多高校加入!发送邮件即可索取PPT原始教案。 39 | 40 | | 电子科技大学 | 西北工业大学 | 北京交通大学 | 厦门大学 | 重庆邮电大学 | 41 | |---|---|---|---|---| 42 | | **东南大学** | ** ** | ** ** | ** ** | | 43 | | **湖南信息学院** | **中山大学新华学院** | **东莞理工大学** | **北京科技职业学院** | | 44 | | **郑州轻工业大学** | **金华职业技术学院** | **高雄市立新莊高級中學** | **安徽财经大学** | | 45 | | **长沙民政职业技术学院** | **兰州交通大学** | ** ** | ** ** | | 46 | 47 | 48 | 49 | # “龙书”生态系统 50 | 51 | - [纸质书/实体书](https://item.jd.com/12954866.html) 52 | 53 | - [介绍短片](https://www.bilibili.com/video/av75331861) 54 | 55 | - [English Version](https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book-EN) 56 | 57 | - [TensorFlow视频课程](https://study.163.com/course/courseMain.htm?share=2&shareId=480000001847407&courseId=1209092816&_trace_c_p_k2_=9e74eb6f891d47cfaa6f00b5cb5f617c) 58 | 59 | - [PyTorch深度学习开源书](https://github.com/dragen1860/Deep-Learning-with-PyTorch-book) 60 | 61 | - 更多TensorFlow 2实战案例在[这里](https://github.com/dragen1860/TensorFlow-2.x-Tutorials) 62 | 63 | 64 | # 简要目录 65 | 66 |

67 | 68 | 69 | 70 |

71 | 72 | 73 | 74 | # 配套视频课程 75 | 76 | 适合零基础、希望快速入门AI的朋友,提供答疑、指导等全方位服务。 77 | 78 | - 深度学习与TensorFlow入门实战 79 | https://study.163.com/course/courseMain.htm?share=2&shareId=480000001847407&courseId=1209092816&_trace_c_p_k2_=9e74eb6f891d47cfaa6f00b5cb5f617c 80 | - 深度学习与PyTorch入门实战 81 | https://study.163.com/course/courseMain.htm?share=2&shareId=480000001847407&courseId=1208894818&_trace_c_p_k2_=8d1b10e04bd34d69855bb71da65b0549 82 | 83 | -------------------------------------------------------------------------------- /TensorFlow深度学习(带目录).pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/TensorFlow深度学习(带目录).pdf -------------------------------------------------------------------------------- /assets/0.4.目录-双排-1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/0.4.目录-双排-1.jpg -------------------------------------------------------------------------------- /assets/0.4.目录-双排-2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/0.4.目录-双排-2.jpg -------------------------------------------------------------------------------- /assets/0.4.目录-双排-3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/0.4.目录-双排-3.jpg -------------------------------------------------------------------------------- /assets/1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/1.jpg -------------------------------------------------------------------------------- /assets/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/2.png -------------------------------------------------------------------------------- /assets/book-cover.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/book-cover.png -------------------------------------------------------------------------------- /assets/dglg.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/dglg.jpg -------------------------------------------------------------------------------- /assets/dzkjdx.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/dzkjdx.jpg -------------------------------------------------------------------------------- /assets/hnxxxy.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/hnxxxy.jpg -------------------------------------------------------------------------------- /assets/xbgydx.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/assets/xbgydx.jpg -------------------------------------------------------------------------------- /ch01-人工智能绪论/autograd.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | # 创建4个张量 4 | a = tf.constant(1.) 5 | b = tf.constant(2.) 6 | c = tf.constant(3.) 7 | w = tf.constant(4.) 8 | 9 | 10 | with tf.GradientTape() as tape:# 构建梯度环境 11 | tape.watch([w]) # 将w加入梯度跟踪列表 12 | # 构建计算过程 13 | y = a * w**2 + b * w + c 14 | # 求导 15 | [dy_dw] = tape.gradient(y, [w]) 16 | print(dy_dw) 17 | 18 | -------------------------------------------------------------------------------- /ch01-人工智能绪论/gpu_accelerate.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib 3 | from matplotlib import pyplot as plt 4 | # Default parameters for plots 5 | matplotlib.rcParams['font.size'] = 20 6 | matplotlib.rcParams['figure.titlesize'] = 20 7 | matplotlib.rcParams['figure.figsize'] = [9, 7] 8 | matplotlib.rcParams['font.family'] = ['STKaiti'] 9 | matplotlib.rcParams['axes.unicode_minus']=False 10 | 11 | 12 | 13 | import tensorflow as tf 14 | import timeit 15 | 16 | 17 | 18 | 19 | cpu_data = [] 20 | gpu_data = [] 21 | for n in range(9): 22 | n = 10**n 23 | # 创建在CPU上运算的2个矩阵 24 | with tf.device('/cpu:0'): 25 | cpu_a = tf.random.normal([1, n]) 26 | cpu_b = tf.random.normal([n, 1]) 27 | print(cpu_a.device, cpu_b.device) 28 | # 创建使用GPU运算的2个矩阵 29 | with tf.device('/gpu:0'): 30 | gpu_a = tf.random.normal([1, n]) 31 | gpu_b = tf.random.normal([n, 1]) 32 | print(gpu_a.device, gpu_b.device) 33 | 34 | def cpu_run(): 35 | with tf.device('/cpu:0'): 36 | c = tf.matmul(cpu_a, cpu_b) 37 | return c 38 | 39 | def gpu_run(): 40 | with tf.device('/gpu:0'): 41 | c = tf.matmul(gpu_a, gpu_b) 42 | return c 43 | 44 | # 第一次计算需要热身,避免将初始化阶段时间结算在内 45 | cpu_time = timeit.timeit(cpu_run, number=10) 46 | gpu_time = timeit.timeit(gpu_run, number=10) 47 | print('warmup:', cpu_time, gpu_time) 48 | # 正式计算10次,取平均时间 49 | cpu_time = timeit.timeit(cpu_run, number=10) 50 | gpu_time = timeit.timeit(gpu_run, number=10) 51 | print('run time:', cpu_time, gpu_time) 52 | cpu_data.append(cpu_time/10) 53 | gpu_data.append(gpu_time/10) 54 | 55 | del cpu_a,cpu_b,gpu_a,gpu_b 56 | 57 | x = [10**i for i in range(9)] 58 | cpu_data = [1000*i for i in cpu_data] 59 | gpu_data = [1000*i for i in gpu_data] 60 | plt.plot(x, cpu_data, 'C1') 61 | plt.plot(x, cpu_data, color='C1', marker='s', label='CPU') 62 | plt.plot(x, gpu_data,'C0') 63 | plt.plot(x, gpu_data, color='C0', marker='^', label='GPU') 64 | 65 | 66 | plt.gca().set_xscale('log') 67 | plt.gca().set_yscale('log') 68 | plt.ylim([0,100]) 69 | plt.xlabel('矩阵大小n:(1xn)@(nx1)') 70 | plt.ylabel('运算时间(ms)') 71 | plt.legend() 72 | plt.savefig('gpu-time.svg') -------------------------------------------------------------------------------- /ch01-人工智能绪论/tf1.py: -------------------------------------------------------------------------------- 1 | import tensorflow.compat.v1 as tf 2 | tf.disable_v2_behavior() # 使用静态图模式运行以下代码 3 | assert tf.__version__.startswith('2.') 4 
| 5 | # 1.创建计算图阶段 6 | # 创建2个输入端子,指定类型和名字 7 | a_ph = tf.placeholder(tf.float32, name='variable_a') 8 | b_ph = tf.placeholder(tf.float32, name='variable_b') 9 | # 创建输出端子的运算操作,并命名 10 | c_op = tf.add(a_ph, b_ph, name='variable_c') 11 | 12 | # 2.运行计算图阶段 13 | # 创建运行环境 14 | sess = tf.InteractiveSession() 15 | # 初始化操作也需要作为操作运行 16 | init = tf.global_variables_initializer() 17 | sess.run(init) # 运行初始化操作,完成初始化 18 | # 运行输出端子,需要给输入端子赋值 19 | c_numpy = sess.run(c_op, feed_dict={a_ph: 2., b_ph: 4.}) 20 | # 运算完输出端子才能得到数值类型的c_numpy 21 | print('a+b=',c_numpy) -------------------------------------------------------------------------------- /ch01-人工智能绪论/tf2.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import tensorflow as tf 3 | assert tf.__version__.startswith('2.') 4 | 5 | # 1.创建输入张量 6 | a = tf.constant(2.) 7 | b = tf.constant(4.) 8 | # 2.直接计算并打印 9 | print('a+b=',a+b) 10 | 11 | 12 | -------------------------------------------------------------------------------- /ch02-回归问题/data.csv: -------------------------------------------------------------------------------- 1 | 32.502345269453031,31.70700584656992 2 | 53.426804033275019,68.77759598163891 3 | 61.530358025636438,62.562382297945803 4 | 47.475639634786098,71.546632233567777 5 | 59.813207869512318,87.230925133687393 6 | 55.142188413943821,78.211518270799232 7 | 52.211796692214001,79.64197304980874 8 | 39.299566694317065,59.171489321869508 9 | 48.10504169176825,75.331242297063056 10 | 52.550014442733818,71.300879886850353 11 | 45.419730144973755,55.165677145959123 12 | 54.351634881228918,82.478846757497919 13 | 44.164049496773352,62.008923245725825 14 | 58.16847071685779,75.392870425994957 15 | 56.727208057096611,81.43619215887864 16 | 48.955888566093719,60.723602440673965 17 | 44.687196231480904,82.892503731453715 18 | 60.297326851333466,97.379896862166078 19 | 45.618643772955828,48.847153317355072 20 | 38.816817537445637,56.877213186268506 21 | 66.189816606752601,83.878564664602763 22 | 65.41605174513407,118.59121730252249 23 | 47.48120860786787,57.251819462268969 24 | 41.57564261748702,51.391744079832307 25 | 51.84518690563943,75.380651665312357 26 | 59.370822011089523,74.765564032151374 27 | 57.31000343834809,95.455052922574737 28 | 63.615561251453308,95.229366017555307 29 | 46.737619407976972,79.052406169565586 30 | 50.556760148547767,83.432071421323712 31 | 52.223996085553047,63.358790317497878 32 | 35.567830047746632,41.412885303700563 33 | 42.436476944055642,76.617341280074044 34 | 58.16454011019286,96.769566426108199 35 | 57.504447615341789,74.084130116602523 36 | 45.440530725319981,66.588144414228594 37 | 61.89622268029126,77.768482417793024 38 | 33.093831736163963,50.719588912312084 39 | 36.436009511386871,62.124570818071781 40 | 37.675654860850742,60.810246649902211 41 | 44.555608383275356,52.682983366387781 42 | 43.318282631865721,58.569824717692867 43 | 50.073145632289034,82.905981485070512 44 | 43.870612645218372,61.424709804339123 45 | 62.997480747553091,115.24415280079529 46 | 32.669043763467187,45.570588823376085 47 | 40.166899008703702,54.084054796223612 48 | 53.575077531673656,87.994452758110413 49 | 33.864214971778239,52.725494375900425 50 | 64.707138666121296,93.576118692658241 51 | 38.119824026822805,80.166275447370964 52 | 44.502538064645101,65.101711570560326 53 | 40.599538384552318,65.562301260400375 54 | 41.720676356341293,65.280886920822823 55 | 51.088634678336796,73.434641546324301 56 | 55.078095904923202,71.13972785861894 57 | 41.377726534895203,79.102829683549857 58 | 
62.494697427269791,86.520538440347153 59 | 49.203887540826003,84.742697807826218 60 | 41.102685187349664,59.358850248624933 61 | 41.182016105169822,61.684037524833627 62 | 50.186389494880601,69.847604158249183 63 | 52.378446219236217,86.098291205774103 64 | 50.135485486286122,59.108839267699643 65 | 33.644706006191782,69.89968164362763 66 | 39.557901222906828,44.862490711164398 67 | 56.130388816875467,85.498067778840223 68 | 57.362052133238237,95.536686846467219 69 | 60.269214393997906,70.251934419771587 70 | 35.678093889410732,52.721734964774988 71 | 31.588116998132829,50.392670135079896 72 | 53.66093226167304,63.642398775657753 73 | 46.682228649471917,72.247251068662365 74 | 43.107820219102464,57.812512976181402 75 | 70.34607561504933,104.25710158543822 76 | 44.492855880854073,86.642020318822006 77 | 57.50453330326841,91.486778000110135 78 | 36.930076609191808,55.231660886212836 79 | 55.805733357942742,79.550436678507609 80 | 38.954769073377065,44.847124242467601 81 | 56.901214702247074,80.207523139682763 82 | 56.868900661384046,83.14274979204346 83 | 34.33312470421609,55.723489260543914 84 | 59.04974121466681,77.634182511677864 85 | 57.788223993230673,99.051414841748269 86 | 54.282328705967409,79.120646274680027 87 | 51.088719898979143,69.588897851118475 88 | 50.282836348230731,69.510503311494389 89 | 44.211741752090113,73.687564318317285 90 | 38.005488008060688,61.366904537240131 91 | 32.940479942618296,67.170655768995118 92 | 53.691639571070056,85.668203145001542 93 | 68.76573426962166,114.85387123391394 94 | 46.230966498310252,90.123572069967423 95 | 68.319360818255362,97.919821035242848 96 | 50.030174340312143,81.536990783015028 97 | 49.239765342753763,72.111832469615663 98 | 50.039575939875988,85.232007342325673 99 | 48.149858891028863,66.224957888054632 100 | 25.128484647772304,53.454394214850524 101 | -------------------------------------------------------------------------------- /ch02-回归问题/linear_regression.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | # data = [] 4 | # for i in range(100): 5 | # x = np.random.uniform(3., 12.) 
6 | # # mean=0, std=0.1 7 | # eps = np.random.normal(0., 0.1) 8 | # y = 1.477 * x + 0.089 + eps 9 | # data.append([x, y]) 10 | # data = np.array(data) 11 | # print(data.shape, data) 12 | 13 | # y = wx + b 14 | def compute_error_for_line_given_points(b, w, points): 15 | totalError = 0 16 | for i in range(0, len(points)): 17 | x = points[i, 0] 18 | y = points[i, 1] 19 | # computer mean-squared-error 20 | totalError += (y - (w * x + b)) ** 2 21 | # average loss for each point 22 | return totalError / float(len(points)) 23 | 24 | 25 | 26 | def step_gradient(b_current, w_current, points, learningRate): 27 | b_gradient = 0 28 | w_gradient = 0 29 | N = float(len(points)) 30 | for i in range(0, len(points)): 31 | x = points[i, 0] 32 | y = points[i, 1] 33 | # grad_b = 2(wx+b-y) 34 | b_gradient += (2/N) * ((w_current * x + b_current) - y) 35 | # grad_w = 2(wx+b-y)*x 36 | w_gradient += (2/N) * x * ((w_current * x + b_current) - y) 37 | # update w' 38 | new_b = b_current - (learningRate * b_gradient) 39 | new_w = w_current - (learningRate * w_gradient) 40 | return [new_b, new_w] 41 | 42 | def gradient_descent_runner(points, starting_b, starting_w, learning_rate, num_iterations): 43 | b = starting_b 44 | w = starting_w 45 | # update for several times 46 | for i in range(num_iterations): 47 | b, w = step_gradient(b, w, np.array(points), learning_rate) 48 | return [b, w] 49 | 50 | 51 | def run(): 52 | 53 | points = np.genfromtxt("data.csv", delimiter=",") 54 | learning_rate = 0.0001 55 | initial_b = 0 # initial y-intercept guess 56 | initial_w = 0 # initial slope guess 57 | num_iterations = 1000 58 | print("Starting gradient descent at b = {0}, w = {1}, error = {2}" 59 | .format(initial_b, initial_w, 60 | compute_error_for_line_given_points(initial_b, initial_w, points)) 61 | ) 62 | print("Running...") 63 | [b, w] = gradient_descent_runner(points, initial_b, initial_w, learning_rate, num_iterations) 64 | print("After {0} iterations b = {1}, w = {2}, error = {3}". 65 | format(num_iterations, b, w, 66 | compute_error_for_line_given_points(b, w, points)) 67 | ) 68 | 69 | if __name__ == '__main__': 70 | run() -------------------------------------------------------------------------------- /ch02-回归问题/回归实战.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch02-回归问题/回归实战.pdf -------------------------------------------------------------------------------- /ch02-回归问题/回归问题.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch02-回归问题/回归问题.pdf -------------------------------------------------------------------------------- /ch03-分类问题/forward_layer.py: -------------------------------------------------------------------------------- 1 | import os 2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 3 | 4 | 5 | import tensorflow as tf 6 | from tensorflow import keras 7 | from tensorflow.keras import layers, optimizers, datasets 8 | 9 | 10 | 11 | 12 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 13 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255. 
14 | y = tf.convert_to_tensor(y, dtype=tf.int32) 15 | y = tf.one_hot(y, depth=10) 16 | print(x.shape, y.shape) 17 | train_dataset = tf.data.Dataset.from_tensor_slices((x, y)) 18 | train_dataset = train_dataset.batch(200) 19 | 20 | 21 | 22 | 23 | model = keras.Sequential([ 24 | layers.Dense(512, activation='relu'), 25 | layers.Dense(256, activation='relu'), 26 | layers.Dense(10)]) 27 | 28 | optimizer = optimizers.SGD(learning_rate=0.001) 29 | 30 | 31 | def train_epoch(epoch): 32 | 33 | # Step4.loop 34 | for step, (x, y) in enumerate(train_dataset): 35 | 36 | 37 | with tf.GradientTape() as tape: 38 | # [b, 28, 28] => [b, 784] 39 | x = tf.reshape(x, (-1, 28*28)) 40 | # Step1. compute output 41 | # [b, 784] => [b, 10] 42 | out = model(x) 43 | # Step2. compute loss 44 | loss = tf.reduce_sum(tf.square(out - y)) / x.shape[0] 45 | 46 | # Step3. optimize and update w1, w2, w3, b1, b2, b3 47 | grads = tape.gradient(loss, model.trainable_variables) 48 | # w' = w - lr * grad 49 | optimizer.apply_gradients(zip(grads, model.trainable_variables)) 50 | 51 | if step % 100 == 0: 52 | print(epoch, step, 'loss:', loss.numpy()) 53 | 54 | 55 | 56 | def train(): 57 | 58 | for epoch in range(30): 59 | 60 | train_epoch(epoch) 61 | 62 | 63 | 64 | 65 | 66 | 67 | if __name__ == '__main__': 68 | train() -------------------------------------------------------------------------------- /ch03-分类问题/forward_tensor.py: -------------------------------------------------------------------------------- 1 | import os 2 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 3 | import matplotlib 4 | from matplotlib import pyplot as plt 5 | # Default parameters for plots 6 | matplotlib.rcParams['font.size'] = 20 7 | matplotlib.rcParams['figure.titlesize'] = 20 8 | matplotlib.rcParams['figure.figsize'] = [9, 7] 9 | matplotlib.rcParams['font.family'] = ['STKaiTi'] 10 | matplotlib.rcParams['axes.unicode_minus']=False 11 | 12 | import tensorflow as tf 13 | from tensorflow import keras 14 | from tensorflow.keras import datasets 15 | 16 | 17 | # x: [60k, 28, 28], 18 | # y: [60k] 19 | (x, y), _ = datasets.mnist.load_data() 20 | # x: [0~255] => [0~1.] 21 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255. 
22 | y = tf.convert_to_tensor(y, dtype=tf.int32) 23 | 24 | print(x.shape, y.shape, x.dtype, y.dtype) 25 | print(tf.reduce_min(x), tf.reduce_max(x)) 26 | print(tf.reduce_min(y), tf.reduce_max(y)) 27 | 28 | 29 | train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128) 30 | train_iter = iter(train_db) 31 | sample = next(train_iter) 32 | print('batch:', sample[0].shape, sample[1].shape) 33 | 34 | 35 | # [b, 784] => [b, 256] => [b, 128] => [b, 10] 36 | # [dim_in, dim_out], [dim_out] 37 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1)) 38 | b1 = tf.Variable(tf.zeros([256])) 39 | w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1)) 40 | b2 = tf.Variable(tf.zeros([128])) 41 | w3 = tf.Variable(tf.random.truncated_normal([128, 10], stddev=0.1)) 42 | b3 = tf.Variable(tf.zeros([10])) 43 | 44 | lr = 1e-3 45 | 46 | losses = [] 47 | 48 | for epoch in range(20): # iterate db for 10 49 | for step, (x, y) in enumerate(train_db): # for every batch 50 | # x:[128, 28, 28] 51 | # y: [128] 52 | 53 | # [b, 28, 28] => [b, 28*28] 54 | x = tf.reshape(x, [-1, 28*28]) 55 | 56 | with tf.GradientTape() as tape: # tf.Variable 57 | # x: [b, 28*28] 58 | # h1 = x@w1 + b1 59 | # [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b, 256] + [b, 256] 60 | h1 = x@w1 + tf.broadcast_to(b1, [x.shape[0], 256]) 61 | h1 = tf.nn.relu(h1) 62 | # [b, 256] => [b, 128] 63 | h2 = h1@w2 + b2 64 | h2 = tf.nn.relu(h2) 65 | # [b, 128] => [b, 10] 66 | out = h2@w3 + b3 67 | 68 | # compute loss 69 | # out: [b, 10] 70 | # y: [b] => [b, 10] 71 | y_onehot = tf.one_hot(y, depth=10) 72 | 73 | # mse = mean(sum(y-out)^2) 74 | # [b, 10] 75 | loss = tf.square(y_onehot - out) 76 | # mean: scalar 77 | loss = tf.reduce_mean(loss) 78 | 79 | # compute gradients 80 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3]) 81 | # print(grads) 82 | # w1 = w1 - lr * w1_grad 83 | w1.assign_sub(lr * grads[0]) 84 | b1.assign_sub(lr * grads[1]) 85 | w2.assign_sub(lr * grads[2]) 86 | b2.assign_sub(lr * grads[3]) 87 | w3.assign_sub(lr * grads[4]) 88 | b3.assign_sub(lr * grads[5]) 89 | 90 | 91 | if step % 100 == 0: 92 | print(epoch, step, 'loss:', float(loss)) 93 | 94 | losses.append(float(loss)) 95 | 96 | plt.figure() 97 | plt.plot(losses, color='C0', marker='s', label='训练') 98 | plt.xlabel('Epoch') 99 | plt.legend() 100 | plt.ylabel('MSE') 101 | plt.savefig('forward.svg') 102 | # plt.show() 103 | -------------------------------------------------------------------------------- /ch03-分类问题/main.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 3 | 4 | 5 | # 设置GPU使用方式 6 | # 获取GPU列表 7 | gpus = tf.config.experimental.list_physical_devices('GPU') 8 | if gpus: 9 | try: 10 | # 设置GPU为增长式占用 11 | for gpu in gpus: 12 | tf.config.experimental.set_memory_growth(gpu, True) 13 | except RuntimeError as e: 14 | # 打印异常 15 | print(e) 16 | 17 | (xs, ys),_ = datasets.mnist.load_data() 18 | print('datasets:', xs.shape, ys.shape, xs.min(), xs.max()) 19 | 20 | batch_size = 32 21 | 22 | xs = tf.convert_to_tensor(xs, dtype=tf.float32) / 255. 
23 | db = tf.data.Dataset.from_tensor_slices((xs,ys)) 24 | db = db.batch(batch_size).repeat(30) 25 | 26 | 27 | model = Sequential([layers.Dense(256, activation='relu'), 28 | layers.Dense(128, activation='relu'), 29 | layers.Dense(10)]) 30 | model.build(input_shape=(4, 28*28)) 31 | model.summary() 32 | 33 | optimizer = optimizers.SGD(lr=0.01) 34 | acc_meter = metrics.Accuracy() 35 | 36 | for step, (x,y) in enumerate(db): 37 | 38 | with tf.GradientTape() as tape: 39 | # 打平操作,[b, 28, 28] => [b, 784] 40 | x = tf.reshape(x, (-1, 28*28)) 41 | # Step1. 得到模型输出output [b, 784] => [b, 10] 42 | out = model(x) 43 | # [b] => [b, 10] 44 | y_onehot = tf.one_hot(y, depth=10) 45 | # 计算差的平方和,[b, 10] 46 | loss = tf.square(out-y_onehot) 47 | # 计算每个样本的平均误差,[b] 48 | loss = tf.reduce_sum(loss) / x.shape[0] 49 | 50 | 51 | acc_meter.update_state(tf.argmax(out, axis=1), y) 52 | 53 | grads = tape.gradient(loss, model.trainable_variables) 54 | optimizer.apply_gradients(zip(grads, model.trainable_variables)) 55 | 56 | 57 | if step % 200==0: 58 | 59 | print(step, 'loss:', float(loss), 'acc:', acc_meter.result().numpy()) 60 | acc_meter.reset_states() 61 | -------------------------------------------------------------------------------- /ch03-分类问题/手写数字问题.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch03-分类问题/手写数字问题.pdf -------------------------------------------------------------------------------- /ch03-分类问题/手写数字问题体验.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch03-分类问题/手写数字问题体验.pdf -------------------------------------------------------------------------------- /ch04-TensorFlow基础/4.10-forward-prop.py: -------------------------------------------------------------------------------- 1 | 2 | 3 | import matplotlib.pyplot as plt 4 | import tensorflow as tf 5 | import tensorflow.keras.datasets as datasets 6 | 7 | plt.rcParams['font.size'] = 16 8 | plt.rcParams['font.family'] = ['STKaiti'] 9 | plt.rcParams['axes.unicode_minus'] = False 10 | 11 | 12 | def load_data(): 13 | # 加载 MNIST 数据集 14 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 15 | # 转换为浮点张量, 并缩放到0~1 16 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
17 | # 转换为整形张量 18 | y = tf.convert_to_tensor(y, dtype=tf.int32) 19 | # one-hot 编码 20 | y = tf.one_hot(y, depth=10) 21 | 22 | # 改变视图, [b, 28, 28] => [b, 28*28] 23 | x = tf.reshape(x, (-1, 28 * 28)) 24 | 25 | # 构建数据集对象 26 | train_dataset = tf.data.Dataset.from_tensor_slices((x, y)) 27 | # 批量训练 28 | train_dataset = train_dataset.batch(200) 29 | return train_dataset 30 | 31 | 32 | def init_paramaters(): 33 | # 每层的张量都需要被优化,故使用 Variable 类型,并使用截断的正太分布初始化权值张量 34 | # 偏置向量初始化为 0 即可 35 | # 第一层的参数 36 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1)) 37 | b1 = tf.Variable(tf.zeros([256])) 38 | # 第二层的参数 39 | w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1)) 40 | b2 = tf.Variable(tf.zeros([128])) 41 | # 第三层的参数 42 | w3 = tf.Variable(tf.random.truncated_normal([128, 10], stddev=0.1)) 43 | b3 = tf.Variable(tf.zeros([10])) 44 | return w1, b1, w2, b2, w3, b3 45 | 46 | 47 | def train_epoch(epoch, train_dataset, w1, b1, w2, b2, w3, b3, lr=0.001): 48 | for step, (x, y) in enumerate(train_dataset): 49 | with tf.GradientTape() as tape: 50 | # 第一层计算, [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b,256] + [b, 256] 51 | h1 = x @ w1 + tf.broadcast_to(b1, (x.shape[0], 256)) 52 | h1 = tf.nn.relu(h1) # 通过激活函数 53 | 54 | # 第二层计算, [b, 256] => [b, 128] 55 | h2 = h1 @ w2 + b2 56 | h2 = tf.nn.relu(h2) 57 | # 输出层计算, [b, 128] => [b, 10] 58 | out = h2 @ w3 + b3 59 | 60 | # 计算网络输出与标签之间的均方差, mse = mean(sum(y-out)^2) 61 | # [b, 10] 62 | loss = tf.square(y - out) 63 | # 误差标量, mean: scalar 64 | loss = tf.reduce_mean(loss) 65 | 66 | # 自动梯度,需要求梯度的张量有[w1, b1, w2, b2, w3, b3] 67 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3]) 68 | 69 | # 梯度更新, assign_sub 将当前值减去参数值,原地更新 70 | w1.assign_sub(lr * grads[0]) 71 | b1.assign_sub(lr * grads[1]) 72 | w2.assign_sub(lr * grads[2]) 73 | b2.assign_sub(lr * grads[3]) 74 | w3.assign_sub(lr * grads[4]) 75 | b3.assign_sub(lr * grads[5]) 76 | 77 | if step % 100 == 0: 78 | print(epoch, step, 'loss:', loss.numpy()) 79 | 80 | return loss.numpy() 81 | 82 | 83 | def train(epochs): 84 | losses = [] 85 | train_dataset = load_data() 86 | w1, b1, w2, b2, w3, b3 = init_paramaters() 87 | for epoch in range(epochs): 88 | loss = train_epoch(epoch, train_dataset, w1, b1, w2, b2, w3, b3, lr=0.001) 89 | losses.append(loss) 90 | 91 | x = [i for i in range(0, epochs)] 92 | # 绘制曲线 93 | plt.plot(x, losses, color='blue', marker='s', label='训练') 94 | plt.xlabel('Epoch') 95 | plt.ylabel('MSE') 96 | plt.legend() 97 | plt.savefig('MNIST数据集的前向传播训练误差曲线.png') 98 | plt.close() 99 | 100 | 101 | if __name__ == '__main__': 102 | train(epochs=20) 103 | -------------------------------------------------------------------------------- /ch04-TensorFlow基础/Broadcasting.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/Broadcasting.pdf -------------------------------------------------------------------------------- /ch04-TensorFlow基础/MNIST数据集的前向传播训练误差曲线.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/MNIST数据集的前向传播训练误差曲线.png -------------------------------------------------------------------------------- /ch04-TensorFlow基础/创建Tensor.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/创建Tensor.pdf -------------------------------------------------------------------------------- /ch04-TensorFlow基础/前向传播.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/前向传播.pdf -------------------------------------------------------------------------------- /ch04-TensorFlow基础/数学运算.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/数学运算.pdf -------------------------------------------------------------------------------- /ch04-TensorFlow基础/数据类型.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/数据类型.pdf -------------------------------------------------------------------------------- /ch04-TensorFlow基础/索引与切片-1.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/索引与切片-1.pdf -------------------------------------------------------------------------------- /ch04-TensorFlow基础/索引与切片-2.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/索引与切片-2.pdf -------------------------------------------------------------------------------- /ch04-TensorFlow基础/维度变换.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch04-TensorFlow基础/维度变换.pdf -------------------------------------------------------------------------------- /ch05-TensorFlow进阶/acc_topk.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import os 3 | 4 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 5 | tf.random.set_seed(2467) 6 | 7 | def accuracy(output, target, topk=(1,)): 8 | maxk = max(topk) 9 | batch_size = target.shape[0] 10 | 11 | pred = tf.math.top_k(output, maxk).indices 12 | pred = tf.transpose(pred, perm=[1, 0]) 13 | target_ = tf.broadcast_to(target, pred.shape) 14 | # [10, b] 15 | correct = tf.equal(pred, target_) 16 | 17 | res = [] 18 | for k in topk: 19 | correct_k = tf.cast(tf.reshape(correct[:k], [-1]), dtype=tf.float32) 20 | correct_k = tf.reduce_sum(correct_k) 21 | acc = float(correct_k* (100.0 / batch_size) ) 22 | res.append(acc) 23 | 24 | return res 25 | 26 | 27 | 28 | output = tf.random.normal([10, 6]) 29 | output = tf.math.softmax(output, axis=1) 30 | target = tf.random.uniform([10], maxval=6, dtype=tf.int32) 31 | print('prob:', output.numpy()) 32 | pred = tf.argmax(output, axis=1) 33 | print('pred:', pred.numpy()) 34 | print('label:', target.numpy()) 35 | 36 | acc = accuracy(output, target, topk=(1,2,3,4,5,6)) 37 | print('top-1-6 acc:', acc) -------------------------------------------------------------------------------- 
/ch05-TensorFlow进阶/gradient_clip.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow import keras 3 | from tensorflow.keras import datasets, layers, optimizers 4 | import os 5 | 6 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 7 | print(tf.__version__) 8 | 9 | (x, y), _ = datasets.mnist.load_data() 10 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 50. 11 | y = tf.convert_to_tensor(y) 12 | y = tf.one_hot(y, depth=10) 13 | print('x:', x.shape, 'y:', y.shape) 14 | train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128).repeat(30) 15 | x,y = next(iter(train_db)) 16 | print('sample:', x.shape, y.shape) 17 | # print(x[0], y[0]) 18 | 19 | 20 | 21 | def main(): 22 | 23 | # 784 => 512 24 | w1, b1 = tf.Variable(tf.random.truncated_normal([784, 512], stddev=0.1)), tf.Variable(tf.zeros([512])) 25 | # 512 => 256 26 | w2, b2 = tf.Variable(tf.random.truncated_normal([512, 256], stddev=0.1)), tf.Variable(tf.zeros([256])) 27 | # 256 => 10 28 | w3, b3 = tf.Variable(tf.random.truncated_normal([256, 10], stddev=0.1)), tf.Variable(tf.zeros([10])) 29 | 30 | 31 | 32 | optimizer = optimizers.SGD(lr=0.01) 33 | 34 | 35 | for step, (x,y) in enumerate(train_db): 36 | 37 | # [b, 28, 28] => [b, 784] 38 | x = tf.reshape(x, (-1, 784)) 39 | 40 | with tf.GradientTape() as tape: 41 | 42 | # layer1. 43 | h1 = x @ w1 + b1 44 | h1 = tf.nn.relu(h1) 45 | # layer2 46 | h2 = h1 @ w2 + b2 47 | h2 = tf.nn.relu(h2) 48 | # output 49 | out = h2 @ w3 + b3 50 | # out = tf.nn.relu(out) 51 | 52 | # compute loss 53 | # [b, 10] - [b, 10] 54 | loss = tf.square(y-out) 55 | # [b, 10] => [b] 56 | loss = tf.reduce_mean(loss, axis=1) 57 | # [b] => scalar 58 | loss = tf.reduce_mean(loss) 59 | 60 | 61 | 62 | # compute gradient 63 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3]) 64 | # print('==before==') 65 | # for g in grads: 66 | # print(tf.norm(g)) 67 | 68 | grads, _ = tf.clip_by_global_norm(grads, 15) 69 | 70 | # print('==after==') 71 | # for g in grads: 72 | # print(tf.norm(g)) 73 | # update w' = w - lr*grad 74 | optimizer.apply_gradients(zip(grads, [w1, b1, w2, b2, w3, b3])) 75 | 76 | 77 | 78 | if step % 100 == 0: 79 | print(step, 'loss:', float(loss)) 80 | 81 | 82 | 83 | 84 | if __name__ == '__main__': 85 | main() -------------------------------------------------------------------------------- /ch05-TensorFlow进阶/mnist_tensor.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import matplotlib 3 | from matplotlib import pyplot as plt 4 | # Default parameters for plots 5 | matplotlib.rcParams['font.size'] = 20 6 | matplotlib.rcParams['figure.titlesize'] = 20 7 | matplotlib.rcParams['figure.figsize'] = [9, 7] 8 | matplotlib.rcParams['font.family'] = ['STKaiTi'] 9 | matplotlib.rcParams['axes.unicode_minus']=False 10 | import tensorflow as tf 11 | from tensorflow import keras 12 | from tensorflow.keras import datasets, layers, optimizers 13 | import os 14 | 15 | 16 | 17 | 18 | 19 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 20 | print(tf.__version__) 21 | 22 | 23 | def preprocess(x, y): 24 | # [b, 28, 28], [b] 25 | print(x.shape,y.shape) 26 | x = tf.cast(x, dtype=tf.float32) / 255. 
27 | x = tf.reshape(x, [-1, 28*28]) 28 | y = tf.cast(y, dtype=tf.int32) 29 | y = tf.one_hot(y, depth=10) 30 | 31 | return x,y 32 | 33 | #%% 34 | (x, y), (x_test, y_test) = datasets.mnist.load_data() 35 | print('x:', x.shape, 'y:', y.shape, 'x test:', x_test.shape, 'y test:', y_test) 36 | #%% 37 | batchsz = 512 38 | train_db = tf.data.Dataset.from_tensor_slices((x, y)) 39 | train_db = train_db.shuffle(1000) 40 | train_db = train_db.batch(batchsz) 41 | train_db = train_db.map(preprocess) 42 | train_db = train_db.repeat(20) 43 | 44 | #%% 45 | 46 | test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 47 | test_db = test_db.shuffle(1000).batch(batchsz).map(preprocess) 48 | x,y = next(iter(train_db)) 49 | print('train sample:', x.shape, y.shape) 50 | # print(x[0], y[0]) 51 | 52 | 53 | 54 | 55 | #%% 56 | def main(): 57 | 58 | # learning rate 59 | lr = 1e-2 60 | accs,losses = [], [] 61 | 62 | 63 | # 784 => 512 64 | w1, b1 = tf.Variable(tf.random.normal([784, 256], stddev=0.1)), tf.Variable(tf.zeros([256])) 65 | # 512 => 256 66 | w2, b2 = tf.Variable(tf.random.normal([256, 128], stddev=0.1)), tf.Variable(tf.zeros([128])) 67 | # 256 => 10 68 | w3, b3 = tf.Variable(tf.random.normal([128, 10], stddev=0.1)), tf.Variable(tf.zeros([10])) 69 | 70 | 71 | 72 | 73 | 74 | for step, (x,y) in enumerate(train_db): 75 | 76 | # [b, 28, 28] => [b, 784] 77 | x = tf.reshape(x, (-1, 784)) 78 | 79 | with tf.GradientTape() as tape: 80 | 81 | # layer1. 82 | h1 = x @ w1 + b1 83 | h1 = tf.nn.relu(h1) 84 | # layer2 85 | h2 = h1 @ w2 + b2 86 | h2 = tf.nn.relu(h2) 87 | # output 88 | out = h2 @ w3 + b3 89 | # out = tf.nn.relu(out) 90 | 91 | # compute loss 92 | # [b, 10] - [b, 10] 93 | loss = tf.square(y-out) 94 | # [b, 10] => scalar 95 | loss = tf.reduce_mean(loss) 96 | 97 | 98 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3]) 99 | for p, g in zip([w1, b1, w2, b2, w3, b3], grads): 100 | p.assign_sub(lr * g) 101 | 102 | 103 | # print 104 | if step % 80 == 0: 105 | print(step, 'loss:', float(loss)) 106 | losses.append(float(loss)) 107 | 108 | if step %80 == 0: 109 | # evaluate/test 110 | total, total_correct = 0., 0 111 | 112 | for x, y in test_db: 113 | # layer1. 
114 | h1 = x @ w1 + b1 115 | h1 = tf.nn.relu(h1) 116 | # layer2 117 | h2 = h1 @ w2 + b2 118 | h2 = tf.nn.relu(h2) 119 | # output 120 | out = h2 @ w3 + b3 121 | # [b, 10] => [b] 122 | pred = tf.argmax(out, axis=1) 123 | # convert one_hot y to number y 124 | y = tf.argmax(y, axis=1) 125 | # bool type 126 | correct = tf.equal(pred, y) 127 | # bool tensor => int tensor => numpy 128 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy() 129 | total += x.shape[0] 130 | 131 | print(step, 'Evaluate Acc:', total_correct/total) 132 | 133 | accs.append(total_correct/total) 134 | 135 | 136 | plt.figure() 137 | x = [i*80 for i in range(len(losses))] 138 | plt.plot(x, losses, color='C0', marker='s', label='训练') 139 | plt.ylabel('MSE') 140 | plt.xlabel('Step') 141 | plt.legend() 142 | plt.savefig('train.svg') 143 | 144 | plt.figure() 145 | plt.plot(x, accs, color='C1', marker='s', label='测试') 146 | plt.ylabel('准确率') 147 | plt.xlabel('Step') 148 | plt.legend() 149 | plt.savefig('test.svg') 150 | 151 | if __name__ == '__main__': 152 | main() -------------------------------------------------------------------------------- /ch05-TensorFlow进阶/合并与分割.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/合并与分割.pdf -------------------------------------------------------------------------------- /ch05-TensorFlow进阶/填充与复制.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/填充与复制.pdf -------------------------------------------------------------------------------- /ch05-TensorFlow进阶/张量排序.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/张量排序.pdf -------------------------------------------------------------------------------- /ch05-TensorFlow进阶/张量限幅.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/张量限幅.pdf -------------------------------------------------------------------------------- /ch05-TensorFlow进阶/数据统计.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/数据统计.pdf -------------------------------------------------------------------------------- /ch05-TensorFlow进阶/高阶特性.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch05-TensorFlow进阶/高阶特性.pdf -------------------------------------------------------------------------------- /ch06-神经网络/auto_efficency_regression.py: -------------------------------------------------------------------------------- 1 | #%% 2 | from __future__ import absolute_import, division, print_function, unicode_literals 3 | 4 | import pathlib 5 | import os 6 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 7 | 8 | 9 | import matplotlib.pyplot as plt 10 | import pandas as pd 11 | 
import seaborn as sns 12 | 13 | import tensorflow as tf 14 | 15 | from tensorflow import keras 16 | from tensorflow.keras import layers, losses 17 | 18 | print(tf.__version__) 19 | 20 | 21 | # 在线下载汽车效能数据集 22 | dataset_path = keras.utils.get_file("auto-mpg.data", "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data") 23 | 24 | # 效能(公里数每加仑),气缸数,排量,马力,重量 25 | # 加速度,型号年份,产地 26 | column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight', 27 | 'Acceleration', 'Model Year', 'Origin'] 28 | raw_dataset = pd.read_csv(dataset_path, names=column_names, 29 | na_values = "?", comment='\t', 30 | sep=" ", skipinitialspace=True) 31 | 32 | dataset = raw_dataset.copy() 33 | # 查看部分数据 34 | dataset.tail() 35 | dataset.head() 36 | dataset 37 | #%% 38 | 39 | 40 | #%% 41 | 42 | # 统计空白数据,并清除 43 | dataset.isna().sum() 44 | dataset = dataset.dropna() 45 | dataset.isna().sum() 46 | dataset 47 | #%% 48 | 49 | # 处理类别型数据,其中origin列代表了类别1,2,3,分布代表产地:美国、欧洲、日本 50 | # 其弹出这一列 51 | origin = dataset.pop('Origin') 52 | # 根据origin列来写入新列 53 | dataset['USA'] = (origin == 1)*1.0 54 | dataset['Europe'] = (origin == 2)*1.0 55 | dataset['Japan'] = (origin == 3)*1.0 56 | dataset.tail() 57 | 58 | 59 | # 切分为训练集和测试集 60 | train_dataset = dataset.sample(frac=0.8,random_state=0) 61 | test_dataset = dataset.drop(train_dataset.index) 62 | 63 | 64 | #%% 统计数据 65 | sns.pairplot(train_dataset[["Cylinders", "Displacement", "Weight", "MPG"]], 66 | diag_kind="kde") 67 | #%% 68 | # 查看训练集的输入X的统计数据 69 | train_stats = train_dataset.describe() 70 | train_stats.pop("MPG") 71 | train_stats = train_stats.transpose() 72 | train_stats 73 | 74 | 75 | # 移动MPG油耗效能这一列为真实标签Y 76 | train_labels = train_dataset.pop('MPG') 77 | test_labels = test_dataset.pop('MPG') 78 | 79 | 80 | # 标准化数据 81 | def norm(x): 82 | return (x - train_stats['mean']) / train_stats['std'] 83 | normed_train_data = norm(train_dataset) 84 | normed_test_data = norm(test_dataset) 85 | #%% 86 | 87 | print(normed_train_data.shape,train_labels.shape) 88 | print(normed_test_data.shape, test_labels.shape) 89 | #%% 90 | 91 | class Network(keras.Model): 92 | # 回归网络 93 | def __init__(self): 94 | super(Network, self).__init__() 95 | # 创建3个全连接层 96 | self.fc1 = layers.Dense(64, activation='relu') 97 | self.fc2 = layers.Dense(64, activation='relu') 98 | self.fc3 = layers.Dense(1) 99 | 100 | def call(self, inputs, training=None, mask=None): 101 | # 依次通过3个全连接层 102 | x = self.fc1(inputs) 103 | x = self.fc2(x) 104 | x = self.fc3(x) 105 | 106 | return x 107 | 108 | model = Network() 109 | model.build(input_shape=(None, 9)) 110 | model.summary() 111 | optimizer = tf.keras.optimizers.RMSprop(0.001) 112 | train_db = tf.data.Dataset.from_tensor_slices((normed_train_data.values, train_labels.values)) 113 | train_db = train_db.shuffle(100).batch(32) 114 | 115 | # # 未训练时测试 116 | # example_batch = normed_train_data[:10] 117 | # example_result = model.predict(example_batch) 118 | # example_result 119 | 120 | 121 | train_mae_losses = [] 122 | test_mae_losses = [] 123 | for epoch in range(200): 124 | for step, (x,y) in enumerate(train_db): 125 | 126 | with tf.GradientTape() as tape: 127 | out = model(x) 128 | loss = tf.reduce_mean(losses.MSE(y, out)) 129 | mae_loss = tf.reduce_mean(losses.MAE(y, out)) 130 | 131 | if step % 10 == 0: 132 | print(epoch, step, float(loss)) 133 | 134 | grads = tape.gradient(loss, model.trainable_variables) 135 | optimizer.apply_gradients(zip(grads, model.trainable_variables)) 136 | 137 | train_mae_losses.append(float(mae_loss)) 138 | out = 
model(tf.constant(normed_test_data.values)) 139 | test_mae_losses.append(tf.reduce_mean(losses.MAE(test_labels, out))) 140 | 141 | 142 | plt.figure() 143 | plt.xlabel('Epoch') 144 | plt.ylabel('MAE') 145 | plt.plot(train_mae_losses, label='Train') 146 | 147 | plt.plot(test_mae_losses, label='Test') 148 | plt.legend() 149 | 150 | # plt.ylim([0,10]) 151 | plt.legend() 152 | plt.savefig('auto.svg') 153 | plt.show() 154 | 155 | 156 | 157 | 158 | #%% 159 | -------------------------------------------------------------------------------- /ch06-神经网络/forward.py: -------------------------------------------------------------------------------- 1 | #%% 2 | 3 | import tensorflow as tf 4 | from tensorflow import keras 5 | from tensorflow.keras import layers 6 | from tensorflow.keras import datasets 7 | import os 8 | 9 | 10 | #%% 11 | x = tf.random.normal([2,28*28]) 12 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1)) 13 | b1 = tf.Variable(tf.zeros([256])) 14 | o1 = tf.matmul(x,w1) + b1 15 | o1 16 | #%% 17 | x = tf.random.normal([4,28*28]) 18 | fc1 = layers.Dense(256, activation=tf.nn.relu) 19 | fc2 = layers.Dense(128, activation=tf.nn.relu) 20 | fc3 = layers.Dense(64, activation=tf.nn.relu) 21 | fc4 = layers.Dense(10, activation=None) 22 | h1 = fc1(x) 23 | h2 = fc2(h1) 24 | h3 = fc3(h2) 25 | h4 = fc4(h3) 26 | 27 | model = keras.Sequential([ 28 | layers.Dense(256, activation=tf.nn.relu) , 29 | layers.Dense(128, activation=tf.nn.relu) , 30 | layers.Dense(64, activation=tf.nn.relu) , 31 | layers.Dense(10, activation=None) , 32 | ]) 33 | out = model(x) 34 | 35 | #%% 36 | 256*784+256+128*256+128+64*128+64+10*64+10 37 | #%% 38 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 39 | 40 | # x: [60k, 28, 28], 41 | # y: [60k] 42 | (x, y), _ = datasets.mnist.load_data() 43 | # x: [0~255] => [0~1.] 44 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
45 | y = tf.convert_to_tensor(y, dtype=tf.int32) 46 | 47 | print(x.shape, y.shape, x.dtype, y.dtype) 48 | print(tf.reduce_min(x), tf.reduce_max(x)) 49 | print(tf.reduce_min(y), tf.reduce_max(y)) 50 | 51 | 52 | train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128) 53 | train_iter = iter(train_db) 54 | sample = next(train_iter) 55 | print('batch:', sample[0].shape, sample[1].shape) 56 | 57 | 58 | # [b, 784] => [b, 256] => [b, 128] => [b, 10] 59 | # [dim_in, dim_out], [dim_out] 60 | # 隐藏层1张量 61 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1)) 62 | b1 = tf.Variable(tf.zeros([256])) 63 | # 隐藏层2张量 64 | w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1)) 65 | b2 = tf.Variable(tf.zeros([128])) 66 | # 隐藏层3张量 67 | w3 = tf.Variable(tf.random.truncated_normal([128, 64], stddev=0.1)) 68 | b3 = tf.Variable(tf.zeros([64])) 69 | # 输出层张量 70 | w4 = tf.Variable(tf.random.truncated_normal([64, 10], stddev=0.1)) 71 | b4 = tf.Variable(tf.zeros([10])) 72 | 73 | lr = 1e-3 74 | 75 | for epoch in range(10): # iterate db for 10 76 | for step, (x, y) in enumerate(train_db): # for every batch 77 | # x:[128, 28, 28] 78 | # y: [128] 79 | 80 | # [b, 28, 28] => [b, 28*28] 81 | x = tf.reshape(x, [-1, 28*28]) 82 | 83 | with tf.GradientTape() as tape: # tf.Variable 84 | # x: [b, 28*28] 85 | # 隐藏层1前向计算,[b, 28*28] => [b, 256] 86 | h1 = x@w1 + tf.broadcast_to(b1, [x.shape[0], 256]) 87 | h1 = tf.nn.relu(h1) 88 | # 隐藏层2前向计算,[b, 256] => [b, 128] 89 | h2 = h1@w2 + b2 90 | h2 = tf.nn.relu(h2) 91 | # 隐藏层3前向计算,[b, 128] => [b, 64] 92 | h3 = h2@w3 + b3 93 | h3 = tf.nn.relu(h3) 94 | # 输出层前向计算,[b, 64] => [b, 10] 95 | h4 = h3@w4 + b4 96 | out = h4 97 | 98 | # compute loss 99 | # out: [b, 10] 100 | # y: [b] => [b, 10] 101 | y_onehot = tf.one_hot(y, depth=10) 102 | 103 | # mse = mean(sum(y-out)^2) 104 | # [b, 10] 105 | loss = tf.square(y_onehot - out) 106 | # mean: scalar 107 | loss = tf.reduce_mean(loss) 108 | 109 | # compute gradients 110 | grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3, w4, b4]) 111 | # print(grads) 112 | # w1 = w1 - lr * w1_grad 113 | w1.assign_sub(lr * grads[0]) 114 | b1.assign_sub(lr * grads[1]) 115 | w2.assign_sub(lr * grads[2]) 116 | b2.assign_sub(lr * grads[3]) 117 | w3.assign_sub(lr * grads[4]) 118 | b3.assign_sub(lr * grads[5]) 119 | w4.assign_sub(lr * grads[6]) 120 | b4.assign_sub(lr * grads[7]) 121 | 122 | 123 | if step % 100 == 0: 124 | print(epoch, step, 'loss:', float(loss)) 125 | 126 | 127 | 128 | 129 | #%% 130 | -------------------------------------------------------------------------------- /ch06-神经网络/nb.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import tensorflow as tf 3 | from tensorflow import keras 4 | from tensorflow.keras import datasets, layers 5 | import os 6 | 7 | 8 | #%% 9 | a = tf.random.normal([4,35,8]) # 模拟成绩册A 10 | b = tf.random.normal([6,35,8]) # 模拟成绩册B 11 | tf.concat([a,b],axis=0) # 合并成绩册 12 | 13 | 14 | #%% 15 | x = tf.random.normal([2,784]) 16 | w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1)) 17 | b1 = tf.Variable(tf.zeros([256])) 18 | o1 = tf.matmul(x,w1) + b1 # 19 | o1 = tf.nn.relu(o1) 20 | o1 21 | #%% 22 | x = tf.random.normal([4,28*28]) 23 | # 创建全连接层,指定输出节点数和激活函数 24 | fc = layers.Dense(512, activation=tf.nn.relu) 25 | h1 = fc(x) # 通过fc类完成一次全连接层的计算 26 | 27 | 28 | #%% 29 | vars(fc) 30 | 31 | #%% 32 | x = tf.random.normal([4,4]) 33 | # 创建全连接层,指定输出节点数和激活函数 34 | fc = layers.Dense(3, activation=tf.nn.relu) 35 | h1 = fc(x) # 通过fc类完成一次全连接层的计算 36 | 37 | 38 | #%% 39 | 
fc.non_trainable_variables 40 | 41 | #%% 42 | embedding = layers.Embedding(10000, 100) 43 | 44 | #%% 45 | x = tf.ones([25000,80]) 46 | 47 | #%% 48 | 49 | embedding(x) 50 | 51 | #%% 52 | z = tf.random.normal([2,10]) # 构造输出层的输出 53 | y_onehot = tf.constant([1,3]) # 构造真实值 54 | y_onehot = tf.one_hot(y_onehot, depth=10) # one-hot编码 55 | # 输出层未使用Softmax函数,故from_logits设置为True 56 | loss = keras.losses.categorical_crossentropy(y_onehot,z,from_logits=True) 57 | loss = tf.reduce_mean(loss) # 计算平均交叉熵损失 58 | loss 59 | 60 | 61 | #%% 62 | criteon = keras.losses.CategoricalCrossentropy(from_logits=True) 63 | loss = criteon(y_onehot,z) # 计算损失 64 | loss 65 | 66 | 67 | #%% 68 | -------------------------------------------------------------------------------- /ch06-神经网络/全接连层.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch06-神经网络/全接连层.pdf -------------------------------------------------------------------------------- /ch06-神经网络/误差计算.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch06-神经网络/误差计算.pdf -------------------------------------------------------------------------------- /ch06-神经网络/输出方式.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch06-神经网络/输出方式.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/0.梯度下降-简介.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/0.梯度下降-简介.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/2.常见函数的梯度.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/2.常见函数的梯度.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/2nd_derivative.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | w = tf.Variable(1.0) 4 | b = tf.Variable(2.0) 5 | x = tf.Variable(3.0) 6 | 7 | with tf.GradientTape() as t1: 8 | with tf.GradientTape() as t2: 9 | y = x * w + b 10 | dy_dw, dy_db = t2.gradient(y, [w, b]) 11 | d2y_dw2 = t1.gradient(dy_dw, w) 12 | 13 | print(dy_dw) 14 | print(dy_db) 15 | print(d2y_dw2) 16 | 17 | assert dy_dw.numpy() == 3.0 18 | assert d2y_dw2 is None -------------------------------------------------------------------------------- /ch07-反向传播算法/3.激活函数及其梯度.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/3.激活函数及其梯度.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/4.损失函数及其梯度.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/4.损失函数及其梯度.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/5.单输出感知机梯度.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/5.单输出感知机梯度.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/6.多输出感知机梯度.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/6.多输出感知机梯度.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/7.链式法则.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/7.链式法则.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/8.多层感知机梯度.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch07-反向传播算法/8.多层感知机梯度.pdf -------------------------------------------------------------------------------- /ch07-反向传播算法/chain_rule.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | # 构建待优化变量 4 | x = tf.constant(1.) 5 | w1 = tf.constant(2.) 6 | b1 = tf.constant(1.) 7 | w2 = tf.constant(2.) 8 | b2 = tf.constant(1.) 
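# Worked values for reference (a hand derivation to compare against the tape output below):
#   y1 = x*w1 + b1 = 3.0,  y2 = y1*w2 + b2 = 7.0
#   dy2/dy1 = w2 = 2.0,  dy1/dw1 = x = 1.0
#   chain rule: dy2/dw1 = dy2/dy1 * dy1/dw1 = 2.0, which should match the dy2_dw1 printed at the end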
9 | 10 | 11 | with tf.GradientTape(persistent=True) as tape: 12 | # 非tf.Variable类型的张量需要人为设置记录梯度信息 13 | tape.watch([w1, b1, w2, b2]) 14 | # 构建2层网络 15 | y1 = x * w1 + b1 16 | y2 = y1 * w2 + b2 17 | 18 | # 独立求解出各个导数 19 | dy2_dy1 = tape.gradient(y2, [y1])[0] 20 | dy1_dw1 = tape.gradient(y1, [w1])[0] 21 | dy2_dw1 = tape.gradient(y2, [w1])[0] 22 | 23 | # 验证链式法则 24 | print(dy2_dy1 * dy1_dw1) 25 | print(dy2_dw1) -------------------------------------------------------------------------------- /ch07-反向传播算法/crossentropy_loss.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | tf.random.set_seed(4323) 5 | 6 | x=tf.random.normal([1,3]) 7 | 8 | w=tf.random.normal([3,2]) 9 | 10 | b=tf.random.normal([2]) 11 | 12 | y = tf.constant([0, 1]) 13 | 14 | 15 | with tf.GradientTape() as tape: 16 | 17 | tape.watch([w, b]) 18 | logits = (x@w+b) 19 | loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y, logits, from_logits=True)) 20 | 21 | grads = tape.gradient(loss, [w, b]) 22 | print('w grad:', grads[0]) 23 | 24 | print('b grad:', grads[1]) -------------------------------------------------------------------------------- /ch07-反向传播算法/himmelblau.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from mpl_toolkits.mplot3d import Axes3D 3 | from matplotlib import pyplot as plt 4 | import tensorflow as tf 5 | 6 | 7 | 8 | def himmelblau(x): 9 | # himmelblau函数实现 10 | return (x[0] ** 2 + x[1] - 11) ** 2 + (x[0] + x[1] ** 2 - 7) ** 2 11 | 12 | 13 | x = np.arange(-6, 6, 0.1) 14 | y = np.arange(-6, 6, 0.1) 15 | print('x,y range:', x.shape, y.shape) 16 | # 生成x-y平面采样网格点,方便可视化 17 | X, Y = np.meshgrid(x, y) 18 | print('X,Y maps:', X.shape, Y.shape) 19 | Z = himmelblau([X, Y]) # 计算网格点上的函数值 20 | 21 | # 绘制himmelblau函数曲面 22 | fig = plt.figure('himmelblau') 23 | ax = fig.gca(projection='3d') 24 | ax.plot_surface(X, Y, Z) 25 | ax.view_init(60, -30) 26 | ax.set_xlabel('x') 27 | ax.set_ylabel('y') 28 | plt.show() 29 | 30 | # 参数的初始化值对优化的影响不容忽视,可以通过尝试不同的初始化值, 31 | # 检验函数优化的极小值情况 32 | # [1., 0.], [-4, 0.], [4, 0.] 
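# For reference: Himmelblau's function has four local minima, all with f = 0, located approximately at
#   (3.0, 2.0), (-2.805, 3.131), (-3.779, -3.283), (3.584, -1.848),
# so the different initial points below converge to different minima.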
33 | # x = tf.constant([4., 0.]) 34 | # x = tf.constant([1., 0.]) 35 | # x = tf.constant([-4., 0.]) 36 | x = tf.constant([-2., 2.]) 37 | 38 | for step in range(200):# 循环优化 39 | with tf.GradientTape() as tape: #梯度跟踪 40 | tape.watch([x]) # 记录梯度 41 | y = himmelblau(x) # 前向传播 42 | # 反向传播 43 | grads = tape.gradient(y, [x])[0] 44 | # 更新参数,0.01为学习率 45 | x -= 0.01*grads 46 | # 打印优化的极小值 47 | if step % 20 == 19: 48 | print ('step {}: x = {}, f(x) = {}' 49 | .format(step, x.numpy(), y.numpy())) -------------------------------------------------------------------------------- /ch07-反向传播算法/mse_grad.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | 5 | 6 | x=tf.random.normal([1,3]) 7 | 8 | w=tf.ones([3,2]) 9 | 10 | b=tf.ones([2]) 11 | 12 | y = tf.constant([0, 1]) 13 | 14 | 15 | with tf.GradientTape() as tape: 16 | 17 | tape.watch([w, b]) 18 | logits = tf.sigmoid(x@w+b) 19 | loss = tf.reduce_mean(tf.losses.MSE(y, logits)) 20 | 21 | grads = tape.gradient(loss, [w, b]) 22 | print('w grad:', grads[0]) 23 | 24 | print('b grad:', grads[1]) 25 | 26 | 27 | -------------------------------------------------------------------------------- /ch07-反向传播算法/multi_output_perceptron.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | 5 | 6 | x=tf.random.normal([1,3]) 7 | 8 | w=tf.ones([3,2]) 9 | 10 | b=tf.ones([2]) 11 | 12 | y = tf.constant([0, 1]) 13 | 14 | 15 | with tf.GradientTape() as tape: 16 | 17 | tape.watch([w, b]) 18 | logits = tf.sigmoid(x@w+b) 19 | loss = tf.reduce_mean(tf.losses.MSE(y, logits)) 20 | 21 | grads = tape.gradient(loss, [w, b]) 22 | print('w grad:', grads[0]) 23 | 24 | print('b grad:', grads[1]) 25 | 26 | 27 | -------------------------------------------------------------------------------- /ch07-反向传播算法/sigmoid_grad.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | a = tf.linspace(-10., 10., 10) 5 | 6 | with tf.GradientTape() as tape: 7 | tape.watch(a) 8 | y = tf.sigmoid(a) 9 | 10 | 11 | grads = tape.gradient(y, [a]) 12 | print('x:', a.numpy()) 13 | print('y:', y.numpy()) 14 | print('grad:', grads[0].numpy()) 15 | -------------------------------------------------------------------------------- /ch07-反向传播算法/single_output_perceptron.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | 5 | 6 | x=tf.random.normal([1,3]) 7 | 8 | w=tf.ones([3,1]) 9 | 10 | b=tf.ones([1]) 11 | 12 | y = tf.constant([1]) 13 | 14 | 15 | with tf.GradientTape() as tape: 16 | 17 | tape.watch([w, b]) 18 | logits = tf.sigmoid(x@w+b) 19 | loss = tf.reduce_mean(tf.losses.MSE(y, logits)) 20 | 21 | grads = tape.gradient(loss, [w, b]) 22 | print('w grad:', grads[0]) 23 | 24 | print('b grad:', grads[1]) 25 | 26 | 27 | -------------------------------------------------------------------------------- /ch08-Keras高层接口/1.Metrics.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/1.Metrics.pdf -------------------------------------------------------------------------------- /ch08-Keras高层接口/2.Compile&Fit.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/2.Compile&Fit.pdf -------------------------------------------------------------------------------- /ch08-Keras高层接口/3.自定义层.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/3.自定义层.pdf -------------------------------------------------------------------------------- /ch08-Keras高层接口/Keras实战CIFAR10.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/Keras实战CIFAR10.pdf -------------------------------------------------------------------------------- /ch08-Keras高层接口/compile_fit.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 3 | 4 | 5 | def preprocess(x, y): 6 | """ 7 | x is a simple image, not a batch 8 | """ 9 | x = tf.cast(x, dtype=tf.float32) / 255. 10 | x = tf.reshape(x, [28*28]) 11 | y = tf.cast(y, dtype=tf.int32) 12 | y = tf.one_hot(y, depth=10) 13 | return x,y 14 | 15 | 16 | batchsz = 128 17 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 18 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 19 | 20 | 21 | 22 | db = tf.data.Dataset.from_tensor_slices((x,y)) 23 | db = db.map(preprocess).shuffle(60000).batch(batchsz) 24 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 25 | ds_val = ds_val.map(preprocess).batch(batchsz) 26 | 27 | sample = next(iter(db)) 28 | print(sample[0].shape, sample[1].shape) 29 | 30 | 31 | network = Sequential([layers.Dense(256, activation='relu'), 32 | layers.Dense(128, activation='relu'), 33 | layers.Dense(64, activation='relu'), 34 | layers.Dense(32, activation='relu'), 35 | layers.Dense(10)]) 36 | network.build(input_shape=(None, 28*28)) 37 | network.summary() 38 | 39 | 40 | 41 | 42 | network.compile(optimizer=optimizers.Adam(lr=0.01), 43 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 44 | metrics=['accuracy'] 45 | ) 46 | 47 | network.fit(db, epochs=5, validation_data=ds_val, validation_freq=2) 48 | 49 | network.evaluate(ds_val) 50 | 51 | sample = next(iter(ds_val)) 52 | x = sample[0] 53 | y = sample[1] # one-hot 54 | pred = network.predict(x) # [b, 10] 55 | # convert back to number 56 | y = tf.argmax(y, axis=1) 57 | pred = tf.argmax(pred, axis=1) 58 | 59 | print(pred) 60 | print(y) 61 | -------------------------------------------------------------------------------- /ch08-Keras高层接口/keras_train.py: -------------------------------------------------------------------------------- 1 | import os 2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 3 | 4 | import tensorflow as tf 5 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 6 | from tensorflow import keras 7 | 8 | 9 | 10 | def preprocess(x, y): 11 | # [0~255] => [-1~1] 12 | x = 2 * tf.cast(x, dtype=tf.float32) / 255. - 1. 
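# quick sanity check of the mapping above (assuming uint8 pixel values):
#   2*0/255 - 1 = -1.0,  2*255/255 - 1 = 1.0, so inputs are roughly centered around 0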
13 | y = tf.cast(y, dtype=tf.int32) 14 | return x,y 15 | 16 | 17 | batchsz = 128 18 | # [50k, 32, 32, 3], [10k, 1] 19 | (x, y), (x_val, y_val) = datasets.cifar10.load_data() 20 | y = tf.squeeze(y) 21 | y_val = tf.squeeze(y_val) 22 | y = tf.one_hot(y, depth=10) # [50k, 10] 23 | y_val = tf.one_hot(y_val, depth=10) # [10k, 10] 24 | print('datasets:', x.shape, y.shape, x_val.shape, y_val.shape, x.min(), x.max()) 25 | 26 | 27 | train_db = tf.data.Dataset.from_tensor_slices((x,y)) 28 | train_db = train_db.map(preprocess).shuffle(10000).batch(batchsz) 29 | test_db = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 30 | test_db = test_db.map(preprocess).batch(batchsz) 31 | 32 | 33 | sample = next(iter(train_db)) 34 | print('batch:', sample[0].shape, sample[1].shape) 35 | 36 | 37 | class MyDense(layers.Layer): 38 | # to replace standard layers.Dense() 39 | def __init__(self, inp_dim, outp_dim): 40 | super(MyDense, self).__init__() 41 | 42 | self.kernel = self.add_variable('w', [inp_dim, outp_dim]) 43 | # self.bias = self.add_variable('b', [outp_dim]) 44 | 45 | def call(self, inputs, training=None): 46 | 47 | x = inputs @ self.kernel 48 | return x 49 | 50 | class MyNetwork(keras.Model): 51 | 52 | def __init__(self): 53 | super(MyNetwork, self).__init__() 54 | 55 | self.fc1 = MyDense(32*32*3, 256) 56 | self.fc2 = MyDense(256, 128) 57 | self.fc3 = MyDense(128, 64) 58 | self.fc4 = MyDense(64, 32) 59 | self.fc5 = MyDense(32, 10) 60 | 61 | 62 | 63 | def call(self, inputs, training=None): 64 | """ 65 | 66 | :param inputs: [b, 32, 32, 3] 67 | :param training: 68 | :return: 69 | """ 70 | x = tf.reshape(inputs, [-1, 32*32*3]) 71 | # [b, 32*32*3] => [b, 256] 72 | x = self.fc1(x) 73 | x = tf.nn.relu(x) 74 | # [b, 256] => [b, 128] 75 | x = self.fc2(x) 76 | x = tf.nn.relu(x) 77 | # [b, 128] => [b, 64] 78 | x = self.fc3(x) 79 | x = tf.nn.relu(x) 80 | # [b, 64] => [b, 32] 81 | x = self.fc4(x) 82 | x = tf.nn.relu(x) 83 | # [b, 32] => [b, 10] 84 | x = self.fc5(x) 85 | 86 | return x 87 | 88 | 89 | network = MyNetwork() 90 | network.compile(optimizer=optimizers.Adam(lr=1e-3), 91 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 92 | metrics=['accuracy']) 93 | network.fit(train_db, epochs=15, validation_data=test_db, validation_freq=1) 94 | 95 | network.evaluate(test_db) 96 | network.save_weights('ckpt/weights.ckpt') 97 | del network 98 | print('saved to ckpt/weights.ckpt') 99 | 100 | 101 | network = MyNetwork() 102 | network.compile(optimizer=optimizers.Adam(lr=1e-3), 103 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 104 | metrics=['accuracy']) 105 | network.load_weights('ckpt/weights.ckpt') 106 | print('loaded weights from file.') 107 | network.evaluate(test_db) -------------------------------------------------------------------------------- /ch08-Keras高层接口/layer_model.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 3 | from tensorflow import keras 4 | 5 | def preprocess(x, y): 6 | """ 7 | x is a simple image, not a batch 8 | """ 9 | x = tf.cast(x, dtype=tf.float32) / 255. 
10 | x = tf.reshape(x, [28*28]) 11 | y = tf.cast(y, dtype=tf.int32) 12 | y = tf.one_hot(y, depth=10) 13 | return x,y 14 | 15 | 16 | batchsz = 128 17 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 18 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 19 | 20 | 21 | 22 | db = tf.data.Dataset.from_tensor_slices((x,y)) 23 | db = db.map(preprocess).shuffle(60000).batch(batchsz) 24 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 25 | ds_val = ds_val.map(preprocess).batch(batchsz) 26 | 27 | sample = next(iter(db)) 28 | print(sample[0].shape, sample[1].shape) 29 | 30 | 31 | network = Sequential([layers.Dense(256, activation='relu'), 32 | layers.Dense(128, activation='relu'), 33 | layers.Dense(64, activation='relu'), 34 | layers.Dense(32, activation='relu'), 35 | layers.Dense(10)]) 36 | network.build(input_shape=(None, 28*28)) 37 | network.summary() 38 | 39 | 40 | class MyDense(layers.Layer): 41 | 42 | def __init__(self, inp_dim, outp_dim): 43 | super(MyDense, self).__init__() 44 | 45 | self.kernel = self.add_weight('w', [inp_dim, outp_dim]) 46 | self.bias = self.add_weight('b', [outp_dim]) 47 | 48 | def call(self, inputs, training=None): 49 | 50 | out = inputs @ self.kernel + self.bias 51 | 52 | return out 53 | 54 | class MyModel(keras.Model): 55 | 56 | def __init__(self): 57 | super(MyModel, self).__init__() 58 | 59 | self.fc1 = MyDense(28*28, 256) 60 | self.fc2 = MyDense(256, 128) 61 | self.fc3 = MyDense(128, 64) 62 | self.fc4 = MyDense(64, 32) 63 | self.fc5 = MyDense(32, 10) 64 | 65 | def call(self, inputs, training=None): 66 | 67 | x = self.fc1(inputs) 68 | x = tf.nn.relu(x) 69 | x = self.fc2(x) 70 | x = tf.nn.relu(x) 71 | x = self.fc3(x) 72 | x = tf.nn.relu(x) 73 | x = self.fc4(x) 74 | x = tf.nn.relu(x) 75 | x = self.fc5(x) 76 | 77 | return x 78 | 79 | 80 | network = MyModel() 81 | 82 | 83 | network.compile(optimizer=optimizers.Adam(lr=0.01), 84 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 85 | metrics=['accuracy'] 86 | ) 87 | 88 | network.fit(db, epochs=5, validation_data=ds_val, 89 | validation_freq=2) 90 | 91 | network.evaluate(ds_val) 92 | 93 | sample = next(iter(ds_val)) 94 | x = sample[0] 95 | y = sample[1] # one-hot 96 | pred = network.predict(x) # [b, 10] 97 | # convert back to number 98 | y = tf.argmax(y, axis=1) 99 | pred = tf.argmax(pred, axis=1) 100 | 101 | print(pred) 102 | print(y) 103 | -------------------------------------------------------------------------------- /ch08-Keras高层接口/metrics.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 3 | 4 | 5 | def preprocess(x, y): 6 | 7 | x = tf.cast(x, dtype=tf.float32) / 255. 
8 | y = tf.cast(y, dtype=tf.int32) 9 | 10 | return x,y 11 | 12 | 13 | batchsz = 128 14 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 15 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 16 | 17 | 18 | 19 | db = tf.data.Dataset.from_tensor_slices((x,y)) 20 | db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10) 21 | 22 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 23 | ds_val = ds_val.map(preprocess).batch(batchsz) 24 | 25 | 26 | 27 | 28 | network = Sequential([layers.Dense(256, activation='relu'), 29 | layers.Dense(128, activation='relu'), 30 | layers.Dense(64, activation='relu'), 31 | layers.Dense(32, activation='relu'), 32 | layers.Dense(10)]) 33 | network.build(input_shape=(None, 28*28)) 34 | network.summary() 35 | 36 | optimizer = optimizers.Adam(lr=0.01) 37 | 38 | acc_meter = metrics.Accuracy() 39 | loss_meter = metrics.Mean() 40 | 41 | 42 | for step, (x,y) in enumerate(db): 43 | 44 | with tf.GradientTape() as tape: 45 | # [b, 28, 28] => [b, 784] 46 | x = tf.reshape(x, (-1, 28*28)) 47 | # [b, 784] => [b, 10] 48 | out = network(x) 49 | # [b] => [b, 10] 50 | y_onehot = tf.one_hot(y, depth=10) 51 | # [b] 52 | loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True)) 53 | 54 | loss_meter.update_state(loss) 55 | 56 | 57 | 58 | grads = tape.gradient(loss, network.trainable_variables) 59 | optimizer.apply_gradients(zip(grads, network.trainable_variables)) 60 | 61 | 62 | if step % 100 == 0: 63 | 64 | print(step, 'loss:', loss_meter.result().numpy()) 65 | loss_meter.reset_states() 66 | 67 | 68 | # evaluate 69 | if step % 500 == 0: 70 | total, total_correct = 0., 0 71 | acc_meter.reset_states() 72 | 73 | for step, (x, y) in enumerate(ds_val): 74 | # [b, 28, 28] => [b, 784] 75 | x = tf.reshape(x, (-1, 28*28)) 76 | # [b, 784] => [b, 10] 77 | out = network(x) 78 | 79 | 80 | # [b, 10] => [b] 81 | pred = tf.argmax(out, axis=1) 82 | pred = tf.cast(pred, dtype=tf.int32) 83 | # bool type 84 | correct = tf.equal(pred, y) 85 | # bool tensor => int tensor => numpy 86 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy() 87 | total += x.shape[0] 88 | 89 | acc_meter.update_state(y, pred) 90 | 91 | 92 | print(step, 'Evaluate Acc:', total_correct/total, acc_meter.result().numpy()) 93 | -------------------------------------------------------------------------------- /ch08-Keras高层接口/nb.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import tensorflow as tf 3 | from tensorflow import keras 4 | from tensorflow.keras import layers,Sequential,losses,optimizers,datasets 5 | 6 | 7 | #%% 8 | x = tf.constant([2.,1.,0.1]) 9 | layer = layers.Softmax(axis=-1) 10 | layer(x) 11 | #%% 12 | def proprocess(x,y): 13 | x = tf.reshape(x, [-1]) 14 | return x,y 15 | 16 | # x: [60k, 28, 28], 17 | # y: [60k] 18 | (x, y), (x_test,y_test) = datasets.mnist.load_data() 19 | # x: [0~255] => [0~1.] 20 | x = tf.convert_to_tensor(x, dtype=tf.float32) / 255. 21 | y = tf.convert_to_tensor(y, dtype=tf.int32) 22 | 23 | # x: [0~255] => [0~1.] 24 | x_test = tf.convert_to_tensor(x_test, dtype=tf.float32) / 255. 
25 | y_test = tf.convert_to_tensor(y_test, dtype=tf.int32) 26 | 27 | train_db = tf.data.Dataset.from_tensor_slices((x,y)) 28 | train_db = train_db.shuffle(1000).map(proprocess).batch(128) 29 | 30 | val_db = tf.data.Dataset.from_tensor_slices((x_test,y_test)) 31 | val_db = val_db.shuffle(1000).map(proprocess).batch(128) 32 | 33 | x,y = next(iter(train_db)) 34 | print(x.shape, y.shape) 35 | #%% 36 | 37 | from tensorflow.keras import layers, Sequential 38 | network = Sequential([ 39 | layers.Dense(3, activation=None), 40 | layers.ReLU(), 41 | layers.Dense(2, activation=None), 42 | layers.ReLU() 43 | ]) 44 | x = tf.random.normal([4,3]) 45 | network(x) 46 | 47 | #%% 48 | layers_num = 2 49 | network = Sequential([]) 50 | for _ in range(layers_num): 51 | network.add(layers.Dense(3)) 52 | network.add(layers.ReLU()) 53 | network.build(input_shape=(None, 4)) 54 | network.summary() 55 | 56 | #%% 57 | for p in network.trainable_variables: 58 | print(p.name, p.shape) 59 | 60 | #%% 61 | # 创建5层的全连接层网络 62 | network = Sequential([layers.Dense(256, activation='relu'), 63 | layers.Dense(128, activation='relu'), 64 | layers.Dense(64, activation='relu'), 65 | layers.Dense(32, activation='relu'), 66 | layers.Dense(10)]) 67 | network.build(input_shape=(4, 28*28)) 68 | network.summary() 69 | 70 | 71 | #%% 72 | # 导入优化器,损失函数模块 73 | from tensorflow.keras import optimizers,losses 74 | # 采用Adam优化器,学习率为0.01;采用交叉熵损失函数,包含Softmax 75 | network.compile(optimizer=optimizers.Adam(lr=0.01), 76 | loss=losses.CategoricalCrossentropy(from_logits=True), 77 | metrics=['accuracy'] # 设置测量指标为准确率 78 | ) 79 | 80 | 81 | #%% 82 | # 指定训练集为db,验证集为val_db,训练5个epochs,每2个epoch验证一次 83 | history = network.fit(train_db, epochs=5, validation_data=val_db, validation_freq=2) 84 | 85 | 86 | #%% 87 | history.history # 打印训练记录 88 | 89 | #%% 90 | # 保存模型参数到文件上 91 | network.save_weights('weights.ckpt') 92 | print('saved weights.') 93 | del network # 删除网络对象 94 | # 重新创建相同的网络结构 95 | network = Sequential([layers.Dense(256, activation='relu'), 96 | layers.Dense(128, activation='relu'), 97 | layers.Dense(64, activation='relu'), 98 | layers.Dense(32, activation='relu'), 99 | layers.Dense(10)]) 100 | network.compile(optimizer=optimizers.Adam(lr=0.01), 101 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 102 | metrics=['accuracy'] 103 | ) 104 | # 从参数文件中读取数据并写入当前网络 105 | network.load_weights('weights.ckpt') 106 | print('loaded weights!') 107 | 108 | 109 | #%% 110 | # 新建池化层 111 | global_average_layer = layers.GlobalAveragePooling2D() 112 | # 利用上一层的输出作为本层的输入,测试其输出 113 | x = tf.random.normal([4,7,7,2048]) 114 | out = global_average_layer(x) # 池化层降维 115 | print(out.shape) 116 | 117 | 118 | #%% 119 | # 新建全连接层 120 | fc = layers.Dense(100) 121 | # 利用上一层的输出作为本层的输入,测试其输出 122 | x = tf.random.normal([4,2048]) 123 | out = fc(x) 124 | print(out.shape) 125 | 126 | 127 | #%% 128 | -------------------------------------------------------------------------------- /ch08-Keras高层接口/pretained.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import tensorflow as tf 3 | from tensorflow import keras 4 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 5 | 6 | #%% 7 | # 加载预训练网络模型,并去掉最后一层 8 | resnet = keras.applications.ResNet50(weights='imagenet',include_top=False) 9 | resnet.summary() 10 | # 测试网络的输出 11 | x = tf.random.normal([4,224,224,3]) 12 | out = resnet(x) 13 | out.shape 14 | #%% 15 | # 新建池化层 16 | global_average_layer = tf.keras.layers.GlobalAveragePooling2D() 17 | # 利用上一层的输出作为本层的输入,测试其输出 18 | x = 
tf.random.normal([4,7,7,2048]) 19 | out = global_average_layer(x) 20 | print(out.shape) 21 | #%% 22 | # 新建全连接层 23 | fc = tf.keras.layers.Dense(100) 24 | # 利用上一层的输出作为本层的输入,测试其输出 25 | x = tf.random.normal([4,2048]) 26 | out = fc(x) 27 | print(out.shape) 28 | #%% 29 | # 重新包裹成我们的网络模型 30 | mynet = Sequential([resnet, global_average_layer, fc]) 31 | mynet.summary() 32 | #%% 33 | resnet.trainable = False 34 | mynet.summary() 35 | 36 | #%% -------------------------------------------------------------------------------- /ch08-Keras高层接口/save_load_model.py: -------------------------------------------------------------------------------- 1 | import os 2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 3 | 4 | import tensorflow as tf 5 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 6 | 7 | 8 | def preprocess(x, y): 9 | """ 10 | x is a simple image, not a batch 11 | """ 12 | x = tf.cast(x, dtype=tf.float32) / 255. 13 | x = tf.reshape(x, [28*28]) 14 | y = tf.cast(y, dtype=tf.int32) 15 | y = tf.one_hot(y, depth=10) 16 | return x,y 17 | 18 | 19 | batchsz = 128 20 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 21 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 22 | 23 | 24 | 25 | db = tf.data.Dataset.from_tensor_slices((x,y)) 26 | db = db.map(preprocess).shuffle(60000).batch(batchsz) 27 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 28 | ds_val = ds_val.map(preprocess).batch(batchsz) 29 | 30 | sample = next(iter(db)) 31 | print(sample[0].shape, sample[1].shape) 32 | 33 | 34 | network = Sequential([layers.Dense(256, activation='relu'), 35 | layers.Dense(128, activation='relu'), 36 | layers.Dense(64, activation='relu'), 37 | layers.Dense(32, activation='relu'), 38 | layers.Dense(10)]) 39 | network.build(input_shape=(None, 28*28)) 40 | network.summary() 41 | 42 | 43 | 44 | 45 | network.compile(optimizer=optimizers.Adam(lr=0.01), 46 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 47 | metrics=['accuracy'] 48 | ) 49 | 50 | network.fit(db, epochs=3, validation_data=ds_val, validation_freq=2) 51 | 52 | network.evaluate(ds_val) 53 | 54 | network.save('model.h5') 55 | print('saved total model.') 56 | del network 57 | 58 | print('loaded model from file.') 59 | network = tf.keras.models.load_model('model.h5', compile=False) 60 | network.compile(optimizer=optimizers.Adam(lr=0.01), 61 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 62 | metrics=['accuracy'] 63 | ) 64 | x_val = tf.cast(x_val, dtype=tf.float32) / 255. 65 | x_val = tf.reshape(x_val, [-1, 28*28]) 66 | y_val = tf.cast(y_val, dtype=tf.int32) 67 | y_val = tf.one_hot(y_val, depth=10) 68 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(128) 69 | network.evaluate(ds_val) 70 | -------------------------------------------------------------------------------- /ch08-Keras高层接口/save_load_weight.py: -------------------------------------------------------------------------------- 1 | import os 2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 3 | 4 | import tensorflow as tf 5 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 6 | 7 | 8 | def preprocess(x, y): 9 | """ 10 | x is a simple image, not a batch 11 | """ 12 | x = tf.cast(x, dtype=tf.float32) / 255. 
13 | x = tf.reshape(x, [28*28]) 14 | y = tf.cast(y, dtype=tf.int32) 15 | y = tf.one_hot(y, depth=10) 16 | return x,y 17 | 18 | 19 | batchsz = 128 20 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 21 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 22 | 23 | 24 | 25 | db = tf.data.Dataset.from_tensor_slices((x,y)) 26 | db = db.map(preprocess).shuffle(60000).batch(batchsz) 27 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 28 | ds_val = ds_val.map(preprocess).batch(batchsz) 29 | 30 | sample = next(iter(db)) 31 | print(sample[0].shape, sample[1].shape) 32 | 33 | 34 | network = Sequential([layers.Dense(256, activation='relu'), 35 | layers.Dense(128, activation='relu'), 36 | layers.Dense(64, activation='relu'), 37 | layers.Dense(32, activation='relu'), 38 | layers.Dense(10)]) 39 | network.build(input_shape=(None, 28*28)) 40 | network.summary() 41 | 42 | 43 | 44 | 45 | network.compile(optimizer=optimizers.Adam(lr=0.01), 46 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 47 | metrics=['accuracy'] 48 | ) 49 | 50 | network.fit(db, epochs=3, validation_data=ds_val, validation_freq=2) 51 | 52 | network.evaluate(ds_val) 53 | 54 | network.save_weights('weights.ckpt') 55 | print('saved weights.') 56 | del network 57 | 58 | network = Sequential([layers.Dense(256, activation='relu'), 59 | layers.Dense(128, activation='relu'), 60 | layers.Dense(64, activation='relu'), 61 | layers.Dense(32, activation='relu'), 62 | layers.Dense(10)]) 63 | network.compile(optimizer=optimizers.Adam(lr=0.01), 64 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 65 | metrics=['accuracy'] 66 | ) 67 | network.load_weights('weights.ckpt') 68 | print('loaded weights!') 69 | network.evaluate(ds_val) 70 | -------------------------------------------------------------------------------- /ch08-Keras高层接口/模型加载与保存.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch08-Keras高层接口/模型加载与保存.pdf -------------------------------------------------------------------------------- /ch09-过拟合/Regularization.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/Regularization.pdf -------------------------------------------------------------------------------- /ch09-过拟合/compile_fit.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 3 | 4 | 5 | def preprocess(x, y): 6 | """ 7 | x is a simple image, not a batch 8 | """ 9 | x = tf.cast(x, dtype=tf.float32) / 255. 
10 | x = tf.reshape(x, [28*28]) 11 | y = tf.cast(y, dtype=tf.int32) 12 | y = tf.one_hot(y, depth=10) 13 | return x,y 14 | 15 | 16 | batchsz = 128 17 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 18 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 19 | 20 | 21 | 22 | db = tf.data.Dataset.from_tensor_slices((x,y)) 23 | db = db.map(preprocess).shuffle(60000).batch(batchsz) 24 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 25 | ds_val = ds_val.map(preprocess).batch(batchsz) 26 | 27 | sample = next(iter(db)) 28 | print(sample[0].shape, sample[1].shape) 29 | 30 | 31 | network = Sequential([layers.Dense(256, activation='relu'), 32 | layers.Dense(128, activation='relu'), 33 | layers.Dense(64, activation='relu'), 34 | layers.Dense(32, activation='relu'), 35 | layers.Dense(10)]) 36 | network.build(input_shape=(None, 28*28)) 37 | network.summary() 38 | 39 | 40 | 41 | 42 | network.compile(optimizer=optimizers.Adam(lr=0.01), 43 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 44 | metrics=['accuracy'] 45 | ) 46 | 47 | network.fit(db, epochs=5, validation_data=ds_val, 48 | validation_steps=2) 49 | 50 | network.evaluate(ds_val) 51 | 52 | sample = next(iter(ds_val)) 53 | x = sample[0] 54 | y = sample[1] # one-hot 55 | pred = network.predict(x) # [b, 10] 56 | # convert back to number 57 | y = tf.argmax(y, axis=1) 58 | pred = tf.argmax(pred, axis=1) 59 | 60 | print(pred) 61 | print(y) 62 | -------------------------------------------------------------------------------- /ch09-过拟合/dropout.py: -------------------------------------------------------------------------------- 1 | import os 2 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 3 | 4 | import tensorflow as tf 5 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 6 | 7 | 8 | def preprocess(x, y): 9 | 10 | x = tf.cast(x, dtype=tf.float32) / 255. 
11 | y = tf.cast(y, dtype=tf.int32) 12 | 13 | return x,y 14 | 15 | 16 | batchsz = 128 17 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 18 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 19 | 20 | 21 | 22 | db = tf.data.Dataset.from_tensor_slices((x,y)) 23 | db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10) 24 | 25 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 26 | ds_val = ds_val.map(preprocess).batch(batchsz) 27 | 28 | 29 | 30 | 31 | network = Sequential([layers.Dense(256, activation='relu'), 32 | layers.Dropout(0.5), # 0.5 rate to drop 33 | layers.Dense(128, activation='relu'), 34 | layers.Dropout(0.5), # 0.5 rate to drop 35 | layers.Dense(64, activation='relu'), 36 | layers.Dense(32, activation='relu'), 37 | layers.Dense(10)]) 38 | network.build(input_shape=(None, 28*28)) 39 | network.summary() 40 | 41 | optimizer = optimizers.Adam(lr=0.01) 42 | 43 | 44 | 45 | for step, (x,y) in enumerate(db): 46 | 47 | with tf.GradientTape() as tape: 48 | # [b, 28, 28] => [b, 784] 49 | x = tf.reshape(x, (-1, 28*28)) 50 | # [b, 784] => [b, 10] 51 | out = network(x, training=True) 52 | # [b] => [b, 10] 53 | y_onehot = tf.one_hot(y, depth=10) 54 | # [b] 55 | loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True)) 56 | 57 | 58 | loss_regularization = [] 59 | for p in network.trainable_variables: 60 | loss_regularization.append(tf.nn.l2_loss(p)) 61 | loss_regularization = tf.reduce_sum(tf.stack(loss_regularization)) 62 | 63 | loss = loss + 0.0001 * loss_regularization 64 | 65 | 66 | grads = tape.gradient(loss, network.trainable_variables) 67 | optimizer.apply_gradients(zip(grads, network.trainable_variables)) 68 | 69 | 70 | if step % 100 == 0: 71 | 72 | print(step, 'loss:', float(loss), 'loss_regularization:', float(loss_regularization)) 73 | 74 | 75 | # evaluate 76 | if step % 500 == 0: 77 | total, total_correct = 0., 0 78 | 79 | for step, (x, y) in enumerate(ds_val): 80 | # [b, 28, 28] => [b, 784] 81 | x = tf.reshape(x, (-1, 28*28)) 82 | # [b, 784] => [b, 10] 83 | out = network(x, training=True) 84 | # [b, 10] => [b] 85 | pred = tf.argmax(out, axis=1) 86 | pred = tf.cast(pred, dtype=tf.int32) 87 | # bool type 88 | correct = tf.equal(pred, y) 89 | # bool tensor => int tensor => numpy 90 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy() 91 | total += x.shape[0] 92 | 93 | print(step, 'Evaluate Acc with drop:', total_correct/total) 94 | 95 | total, total_correct = 0., 0 96 | 97 | for step, (x, y) in enumerate(ds_val): 98 | # [b, 28, 28] => [b, 784] 99 | x = tf.reshape(x, (-1, 28*28)) 100 | # [b, 784] => [b, 10] 101 | out = network(x, training=False) 102 | # [b, 10] => [b] 103 | pred = tf.argmax(out, axis=1) 104 | pred = tf.cast(pred, dtype=tf.int32) 105 | # bool type 106 | correct = tf.equal(pred, y) 107 | # bool tensor => int tensor => numpy 108 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy() 109 | total += x.shape[0] 110 | 111 | print(step, 'Evaluate Acc without drop:', total_correct/total) -------------------------------------------------------------------------------- /ch09-过拟合/lenna.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_crop.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_crop.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_crop2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_crop2.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_eras.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_eras.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_eras2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_eras2.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_flip.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_flip.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_flip2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_flip2.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_guassian.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_guassian.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_perspective.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_perspective.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_resize.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_resize.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_rotate.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_rotate.png -------------------------------------------------------------------------------- /ch09-过拟合/lenna_rotate2.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/lenna_rotate2.png -------------------------------------------------------------------------------- /ch09-过拟合/misc.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/misc.pdf -------------------------------------------------------------------------------- /ch09-过拟合/regularization.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 3 | 4 | 5 | def preprocess(x, y): 6 | 7 | x = tf.cast(x, dtype=tf.float32) / 255. 8 | y = tf.cast(y, dtype=tf.int32) 9 | 10 | return x,y 11 | 12 | 13 | batchsz = 128 14 | (x, y), (x_val, y_val) = datasets.mnist.load_data() 15 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 16 | 17 | 18 | 19 | db = tf.data.Dataset.from_tensor_slices((x,y)) 20 | db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10) 21 | 22 | ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val)) 23 | ds_val = ds_val.map(preprocess).batch(batchsz) 24 | 25 | 26 | 27 | 28 | network = Sequential([layers.Dense(256, activation='relu'), 29 | layers.Dense(128, activation='relu'), 30 | layers.Dense(64, activation='relu'), 31 | layers.Dense(32, activation='relu'), 32 | layers.Dense(10)]) 33 | network.build(input_shape=(None, 28*28)) 34 | network.summary() 35 | 36 | optimizer = optimizers.Adam(lr=0.01) 37 | 38 | 39 | 40 | for step, (x,y) in enumerate(db): 41 | 42 | with tf.GradientTape() as tape: 43 | # [b, 28, 28] => [b, 784] 44 | x = tf.reshape(x, (-1, 28*28)) 45 | # [b, 784] => [b, 10] 46 | out = network(x) 47 | # [b] => [b, 10] 48 | y_onehot = tf.one_hot(y, depth=10) 49 | # [b] 50 | loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True)) 51 | 52 | 53 | loss_regularization = [] 54 | for p in network.trainable_variables: 55 | loss_regularization.append(tf.nn.l2_loss(p)) 56 | loss_regularization = tf.reduce_sum(tf.stack(loss_regularization)) 57 | 58 | loss = loss + 0.0001 * loss_regularization 59 | 60 | 61 | grads = tape.gradient(loss, network.trainable_variables) 62 | optimizer.apply_gradients(zip(grads, network.trainable_variables)) 63 | 64 | 65 | if step % 100 == 0: 66 | 67 | print(step, 'loss:', float(loss), 'loss_regularization:', float(loss_regularization)) 68 | 69 | 70 | # evaluate 71 | if step % 500 == 0: 72 | total, total_correct = 0., 0 73 | 74 | for step, (x, y) in enumerate(ds_val): 75 | # [b, 28, 28] => [b, 784] 76 | x = tf.reshape(x, (-1, 28*28)) 77 | # [b, 784] => [b, 10] 78 | out = network(x) 79 | # [b, 10] => [b] 80 | pred = tf.argmax(out, axis=1) 81 | pred = tf.cast(pred, dtype=tf.int32) 82 | # bool type 83 | correct = tf.equal(pred, y) 84 | # bool tensor => int tensor => numpy 85 | total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy() 86 | total += x.shape[0] 87 | 88 | print(step, 'Evaluate Acc:', total_correct/total) -------------------------------------------------------------------------------- /ch09-过拟合/train_evalute_test.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics 3 | 4 | 5 | def preprocess(x, y): 6 | """ 7 | x is a simple 
image, not a batch 8 | """ 9 | x = tf.cast(x, dtype=tf.float32) / 255. 10 | x = tf.reshape(x, [28*28]) 11 | y = tf.cast(y, dtype=tf.int32) 12 | y = tf.one_hot(y, depth=10) 13 | return x,y 14 | 15 | 16 | batchsz = 128 17 | (x, y), (x_test, y_test) = datasets.mnist.load_data() 18 | print('datasets:', x.shape, y.shape, x.min(), x.max()) 19 | 20 | 21 | 22 | idx = tf.range(60000) 23 | idx = tf.random.shuffle(idx) 24 | x_train, y_train = tf.gather(x, idx[:50000]), tf.gather(y, idx[:50000]) 25 | x_val, y_val = tf.gather(x, idx[-10000:]) , tf.gather(y, idx[-10000:]) 26 | print(x_train.shape, y_train.shape, x_val.shape, y_val.shape) 27 | db_train = tf.data.Dataset.from_tensor_slices((x_train,y_train)) 28 | db_train = db_train.map(preprocess).shuffle(50000).batch(batchsz) 29 | 30 | db_val = tf.data.Dataset.from_tensor_slices((x_val,y_val)) 31 | db_val = db_val.map(preprocess).shuffle(10000).batch(batchsz) 32 | 33 | 34 | 35 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 36 | db_test = db_test.map(preprocess).batch(batchsz) 37 | 38 | sample = next(iter(db_train)) 39 | print(sample[0].shape, sample[1].shape) 40 | 41 | 42 | network = Sequential([layers.Dense(256, activation='relu'), 43 | layers.Dense(128, activation='relu'), 44 | layers.Dense(64, activation='relu'), 45 | layers.Dense(32, activation='relu'), 46 | layers.Dense(10)]) 47 | network.build(input_shape=(None, 28*28)) 48 | network.summary() 49 | 50 | 51 | 52 | 53 | network.compile(optimizer=optimizers.Adam(lr=0.01), 54 | loss=tf.losses.CategoricalCrossentropy(from_logits=True), 55 | metrics=['accuracy'] 56 | ) 57 | 58 | network.fit(db_train, epochs=6, validation_data=db_val, validation_freq=2) 59 | 60 | print('Test performance:') 61 | network.evaluate(db_test) 62 | 63 | 64 | sample = next(iter(db_test)) 65 | x = sample[0] 66 | y = sample[1] # one-hot 67 | pred = network.predict(x) # [b, 10] 68 | # convert back to number 69 | y = tf.argmax(y, axis=1) 70 | pred = tf.argmax(pred, axis=1) 71 | 72 | print(pred) 73 | print(y) 74 | -------------------------------------------------------------------------------- /ch09-过拟合/交叉验证.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/交叉验证.pdf -------------------------------------------------------------------------------- /ch09-过拟合/学习率与动量.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/学习率与动量.pdf -------------------------------------------------------------------------------- /ch09-过拟合/过拟合与欠拟合.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch09-过拟合/过拟合与欠拟合.pdf -------------------------------------------------------------------------------- /ch10-卷积神经网络/BatchNorm.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/BatchNorm.pdf -------------------------------------------------------------------------------- /ch10-卷积神经网络/CIFAR与VGG实战.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/CIFAR与VGG实战.pdf -------------------------------------------------------------------------------- /ch10-卷积神经网络/ResNet与DenseNet.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/ResNet与DenseNet.pdf -------------------------------------------------------------------------------- /ch10-卷积神经网络/ResNet实战.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/ResNet实战.pdf -------------------------------------------------------------------------------- /ch10-卷积神经网络/bn_main.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | from tensorflow import keras 4 | from tensorflow.keras import layers, optimizers 5 | 6 | 7 | # 2 images with 4x4 size, 3 channels 8 | # we explicitly enforce the mean and stddev to N(1, 0.5) 9 | x = tf.random.normal([2,4,4,3], mean=1.,stddev=0.5) 10 | 11 | net = layers.BatchNormalization(axis=-1, center=True, scale=True, 12 | trainable=True) 13 | 14 | out = net(x) 15 | print('forward in test mode:', net.variables) 16 | 17 | 18 | out = net(x, training=True) 19 | print('forward in train mode(1 step):', net.variables) 20 | 21 | for i in range(100): 22 | out = net(x, training=True) 23 | print('forward in train mode(100 steps):', net.variables) 24 | 25 | 26 | optimizer = optimizers.SGD(lr=1e-2) 27 | for i in range(10): 28 | with tf.GradientTape() as tape: 29 | out = net(x, training=True) 30 | loss = tf.reduce_mean(tf.pow(out,2)) - 1 31 | 32 | grads = tape.gradient(loss, net.trainable_variables) 33 | optimizer.apply_gradients(zip(grads, net.trainable_variables)) 34 | print('backward(10 steps):', net.variables) 35 | 36 | 37 | 38 | 39 | -------------------------------------------------------------------------------- /ch10-卷积神经网络/cifar10_train.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import layers, optimizers, datasets, Sequential 3 | import os 4 | 5 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 6 | tf.random.set_seed(2345) 7 | 8 | conv_layers = [ # 5 units of conv + max pooling 9 | # unit 1 10 | layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 11 | layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 12 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'), 13 | 14 | # unit 2 15 | layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 16 | layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 17 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'), 18 | 19 | # unit 3 20 | layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 21 | layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 22 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'), 23 | 24 | # unit 4 25 | layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 26 | layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 27 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'), 
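# note: each of the five stride-2 poolings halves the 32x32 CIFAR10 input
# (32 -> 16 -> 8 -> 4 -> 2 -> 1), so the final feature map is [b, 1, 1, 512],
# which is why the conv output is flattened to [b, 512] before the fully-connected classifier in main()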
28 | 29 | # unit 5 30 | layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 31 | layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu), 32 | layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same') 33 | 34 | ] 35 | 36 | 37 | 38 | def preprocess(x, y): 39 | # [0~1] 40 | x = 2*tf.cast(x, dtype=tf.float32) / 255.-1 41 | y = tf.cast(y, dtype=tf.int32) 42 | return x,y 43 | 44 | 45 | (x,y), (x_test, y_test) = datasets.cifar10.load_data() 46 | y = tf.squeeze(y, axis=1) 47 | y_test = tf.squeeze(y_test, axis=1) 48 | print(x.shape, y.shape, x_test.shape, y_test.shape) 49 | 50 | 51 | train_db = tf.data.Dataset.from_tensor_slices((x,y)) 52 | train_db = train_db.shuffle(1000).map(preprocess).batch(128) 53 | 54 | test_db = tf.data.Dataset.from_tensor_slices((x_test,y_test)) 55 | test_db = test_db.map(preprocess).batch(64) 56 | 57 | sample = next(iter(train_db)) 58 | print('sample:', sample[0].shape, sample[1].shape, 59 | tf.reduce_min(sample[0]), tf.reduce_max(sample[0])) 60 | 61 | 62 | def main(): 63 | 64 | # [b, 32, 32, 3] => [b, 1, 1, 512] 65 | conv_net = Sequential(conv_layers) 66 | 67 | fc_net = Sequential([ 68 | layers.Dense(256, activation=tf.nn.relu), 69 | layers.Dense(128, activation=tf.nn.relu), 70 | layers.Dense(10, activation=None), 71 | ]) 72 | 73 | conv_net.build(input_shape=[None, 32, 32, 3]) 74 | fc_net.build(input_shape=[None, 512]) 75 | conv_net.summary() 76 | fc_net.summary() 77 | optimizer = optimizers.Adam(lr=1e-4) 78 | 79 | # [1, 2] + [3, 4] => [1, 2, 3, 4] 80 | variables = conv_net.trainable_variables + fc_net.trainable_variables 81 | 82 | for epoch in range(50): 83 | 84 | for step, (x,y) in enumerate(train_db): 85 | 86 | with tf.GradientTape() as tape: 87 | # [b, 32, 32, 3] => [b, 1, 1, 512] 88 | out = conv_net(x) 89 | # flatten, => [b, 512] 90 | out = tf.reshape(out, [-1, 512]) 91 | # [b, 512] => [b, 10] 92 | logits = fc_net(out) 93 | # [b] => [b, 10] 94 | y_onehot = tf.one_hot(y, depth=10) 95 | # compute loss 96 | loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True) 97 | loss = tf.reduce_mean(loss) 98 | 99 | grads = tape.gradient(loss, variables) 100 | optimizer.apply_gradients(zip(grads, variables)) 101 | 102 | if step %100 == 0: 103 | print(epoch, step, 'loss:', float(loss)) 104 | 105 | 106 | 107 | total_num = 0 108 | total_correct = 0 109 | for x,y in test_db: 110 | 111 | out = conv_net(x) 112 | out = tf.reshape(out, [-1, 512]) 113 | logits = fc_net(out) 114 | prob = tf.nn.softmax(logits, axis=1) 115 | pred = tf.argmax(prob, axis=1) 116 | pred = tf.cast(pred, dtype=tf.int32) 117 | 118 | correct = tf.cast(tf.equal(pred, y), dtype=tf.int32) 119 | correct = tf.reduce_sum(correct) 120 | 121 | total_num += x.shape[0] 122 | total_correct += int(correct) 123 | 124 | acc = total_correct / total_num 125 | print(epoch, 'acc:', acc) 126 | 127 | 128 | 129 | if __name__ == '__main__': 130 | main() 131 | -------------------------------------------------------------------------------- /ch10-卷积神经网络/resnet.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow import keras 3 | from tensorflow.keras import layers, Sequential 4 | 5 | 6 | 7 | class BasicBlock(layers.Layer): 8 | # 残差模块 9 | def __init__(self, filter_num, stride=1): 10 | super(BasicBlock, self).__init__() 11 | # 第一个卷积单元 12 | self.conv1 = layers.Conv2D(filter_num, (3, 3), strides=stride, padding='same') 13 | self.bn1 = layers.BatchNormalization() 14 | self.relu = 
layers.Activation('relu') 15 | # 第二个卷积单元 16 | self.conv2 = layers.Conv2D(filter_num, (3, 3), strides=1, padding='same') 17 | self.bn2 = layers.BatchNormalization() 18 | 19 | if stride != 1:# 通过1x1卷积完成shape匹配 20 | self.downsample = Sequential() 21 | self.downsample.add(layers.Conv2D(filter_num, (1, 1), strides=stride)) 22 | else:# shape匹配,直接短接 23 | self.downsample = lambda x:x 24 | 25 | def call(self, inputs, training=None): 26 | 27 | # [b, h, w, c],通过第一个卷积单元 28 | out = self.conv1(inputs) 29 | out = self.bn1(out) 30 | out = self.relu(out) 31 | # 通过第二个卷积单元 32 | out = self.conv2(out) 33 | out = self.bn2(out) 34 | # 通过identity模块 35 | identity = self.downsample(inputs) 36 | # 2条路径输出直接相加 37 | output = layers.add([out, identity]) 38 | output = tf.nn.relu(output) # 激活函数 39 | 40 | return output 41 | 42 | 43 | class ResNet(keras.Model): 44 | # 通用的ResNet实现类 45 | def __init__(self, layer_dims, num_classes=10): # [2, 2, 2, 2] 46 | super(ResNet, self).__init__() 47 | # 根网络,预处理 48 | self.stem = Sequential([layers.Conv2D(64, (3, 3), strides=(1, 1)), 49 | layers.BatchNormalization(), 50 | layers.Activation('relu'), 51 | layers.MaxPool2D(pool_size=(2, 2), strides=(1, 1), padding='same') 52 | ]) 53 | # 堆叠4个Block,每个block包含了多个BasicBlock,设置步长不一样 54 | self.layer1 = self.build_resblock(64, layer_dims[0]) 55 | self.layer2 = self.build_resblock(128, layer_dims[1], stride=2) 56 | self.layer3 = self.build_resblock(256, layer_dims[2], stride=2) 57 | self.layer4 = self.build_resblock(512, layer_dims[3], stride=2) 58 | 59 | # 通过Pooling层将高宽降低为1x1 60 | self.avgpool = layers.GlobalAveragePooling2D() 61 | # 最后连接一个全连接层分类 62 | self.fc = layers.Dense(num_classes) 63 | 64 | def call(self, inputs, training=None): 65 | # 通过根网络 66 | x = self.stem(inputs) 67 | # 一次通过4个模块 68 | x = self.layer1(x) 69 | x = self.layer2(x) 70 | x = self.layer3(x) 71 | x = self.layer4(x) 72 | 73 | # 通过池化层 74 | x = self.avgpool(x) 75 | # 通过全连接层 76 | x = self.fc(x) 77 | 78 | return x 79 | 80 | 81 | 82 | def build_resblock(self, filter_num, blocks, stride=1): 83 | # 辅助函数,堆叠filter_num个BasicBlock 84 | res_blocks = Sequential() 85 | # 只有第一个BasicBlock的步长可能不为1,实现下采样 86 | res_blocks.add(BasicBlock(filter_num, stride)) 87 | 88 | for _ in range(1, blocks):#其他BasicBlock步长都为1 89 | res_blocks.add(BasicBlock(filter_num, stride=1)) 90 | 91 | return res_blocks 92 | 93 | 94 | def resnet18(): 95 | # 通过调整模块内部BasicBlock的数量和配置实现不同的ResNet 96 | return ResNet([2, 2, 2, 2]) 97 | 98 | 99 | def resnet34(): 100 | # 通过调整模块内部BasicBlock的数量和配置实现不同的ResNet 101 | return ResNet([3, 4, 6, 3]) -------------------------------------------------------------------------------- /ch10-卷积神经网络/resnet18_train.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import layers, optimizers, datasets, Sequential 3 | import os 4 | from resnet import resnet18 5 | 6 | os.environ['TF_CPP_MIN_LOG_LEVEL']='2' 7 | tf.random.set_seed(2345) 8 | 9 | 10 | 11 | 12 | 13 | def preprocess(x, y): 14 | # 将数据映射到-1~1 15 | x = 2*tf.cast(x, dtype=tf.float32) / 255. 
- 1 16 | y = tf.cast(y, dtype=tf.int32) # 类型转换 17 | return x,y 18 | 19 | 20 | (x,y), (x_test, y_test) = datasets.cifar10.load_data() # 加载数据集 21 | y = tf.squeeze(y, axis=1) # 删除不必要的维度 22 | y_test = tf.squeeze(y_test, axis=1) # 删除不必要的维度 23 | print(x.shape, y.shape, x_test.shape, y_test.shape) 24 | 25 | 26 | train_db = tf.data.Dataset.from_tensor_slices((x,y)) # 构建训练集 27 | # 随机打散,预处理,批量化 28 | train_db = train_db.shuffle(1000).map(preprocess).batch(512) 29 | 30 | test_db = tf.data.Dataset.from_tensor_slices((x_test,y_test)) #构建测试集 31 | # 随机打散,预处理,批量化 32 | test_db = test_db.map(preprocess).batch(512) 33 | # 采样一个样本 34 | sample = next(iter(train_db)) 35 | print('sample:', sample[0].shape, sample[1].shape, 36 | tf.reduce_min(sample[0]), tf.reduce_max(sample[0])) 37 | 38 | 39 | def main(): 40 | 41 | # [b, 32, 32, 3] => [b, 1, 1, 512] 42 | model = resnet18() # ResNet18网络 43 | model.build(input_shape=(None, 32, 32, 3)) 44 | model.summary() # 统计网络参数 45 | optimizer = optimizers.Adam(lr=1e-4) # 构建优化器 46 | 47 | for epoch in range(100): # 训练epoch 48 | 49 | for step, (x,y) in enumerate(train_db): 50 | 51 | with tf.GradientTape() as tape: 52 | # [b, 32, 32, 3] => [b, 10],前向传播 53 | logits = model(x) 54 | # [b] => [b, 10],one-hot编码 55 | y_onehot = tf.one_hot(y, depth=10) 56 | # 计算交叉熵 57 | loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True) 58 | loss = tf.reduce_mean(loss) 59 | # 计算梯度信息 60 | grads = tape.gradient(loss, model.trainable_variables) 61 | # 更新网络参数 62 | optimizer.apply_gradients(zip(grads, model.trainable_variables)) 63 | 64 | if step %50 == 0: 65 | print(epoch, step, 'loss:', float(loss)) 66 | 67 | 68 | 69 | total_num = 0 70 | total_correct = 0 71 | for x,y in test_db: 72 | 73 | logits = model(x) 74 | prob = tf.nn.softmax(logits, axis=1) 75 | pred = tf.argmax(prob, axis=1) 76 | pred = tf.cast(pred, dtype=tf.int32) 77 | 78 | correct = tf.cast(tf.equal(pred, y), dtype=tf.int32) 79 | correct = tf.reduce_sum(correct) 80 | 81 | total_num += x.shape[0] 82 | total_correct += int(correct) 83 | 84 | acc = total_correct / total_num 85 | print(epoch, 'acc:', acc) 86 | 87 | 88 | 89 | if __name__ == '__main__': 90 | main() 91 | -------------------------------------------------------------------------------- /ch10-卷积神经网络/什么是卷积.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/什么是卷积.pdf -------------------------------------------------------------------------------- /ch10-卷积神经网络/卷积神经网络.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/卷积神经网络.pdf -------------------------------------------------------------------------------- /ch10-卷积神经网络/池化与采样.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/池化与采样.pdf -------------------------------------------------------------------------------- /ch10-卷积神经网络/经典卷积网络.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch10-卷积神经网络/经典卷积网络.pdf 
-------------------------------------------------------------------------------- /ch11-循环神经网络/LSTM.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/LSTM.pdf -------------------------------------------------------------------------------- /ch11-循环神经网络/LSTM实战.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/LSTM实战.pdf -------------------------------------------------------------------------------- /ch11-循环神经网络/RNN Layer使用.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/RNN Layer使用.pdf -------------------------------------------------------------------------------- /ch11-循环神经网络/nb.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import tensorflow as tf 3 | from tensorflow import keras 4 | from tensorflow.keras import layers 5 | 6 | import matplotlib.pyplot as plt 7 | 8 | #%% 9 | x = tf.range(10) 10 | x = tf.random.shuffle(x) 11 | # 创建共10个单词,每个单词用长度为4的向量表示的层 12 | net = layers.Embedding(10, 4) 13 | out = net(x) 14 | 15 | out 16 | #%% 17 | net.embeddings 18 | net.embeddings.trainable 19 | net.trainable = False 20 | #%% 21 | # 从预训练模型中加载词向量表 22 | embed_glove = load_embed('glove.6B.50d.txt') 23 | # 直接利用预训练的词向量表初始化Embedding层 24 | net.set_weights([embed_glove]) 25 | #%% 26 | cell = layers.SimpleRNNCell(3) 27 | cell.build(input_shape=(None,4)) 28 | cell.trainable_variables 29 | 30 | 31 | #%% 32 | # 初始化状态向量 33 | h0 = [tf.zeros([4, 64])] 34 | x = tf.random.normal([4, 80, 100]) 35 | xt = x[:,0,:] 36 | # 构建输入特征f=100,序列长度s=80,状态长度=64的Cell 37 | cell = layers.SimpleRNNCell(64) 38 | out, h1 = cell(xt, h0) # 前向计算 39 | print(out.shape, h1[0].shape) 40 | print(id(out), id(h1[0])) 41 | 42 | 43 | #%% 44 | h = h0 45 | # 在序列长度的维度解开输入,得到xt:[b,f] 46 | for xt in tf.unstack(x, axis=1): 47 | out, h = cell(xt, h) # 前向计算 48 | # 最终输出可以聚合每个时间戳上的输出,也可以只取最后时间戳的输出 49 | out = out 50 | 51 | #%% 52 | x = tf.random.normal([4,80,100]) 53 | xt = x[:,0,:] # 取第一个时间戳的输入x0 54 | # 构建2个Cell,先cell0,后cell1 55 | cell0 = layers.SimpleRNNCell(64) 56 | cell1 = layers.SimpleRNNCell(64) 57 | h0 = [tf.zeros([4,64])] # cell0的初始状态向量 58 | h1 = [tf.zeros([4,64])] # cell1的初始状态向量 59 | 60 | out0, h0 = cell0(xt, h0) 61 | out1, h1 = cell1(out0, h1) 62 | 63 | 64 | #%% 65 | for xt in tf.unstack(x, axis=1): 66 | # xtw作为输入,输出为out0 67 | out0, h0 = cell0(xt, h0) 68 | # 上一个cell的输出out0作为本cell的输入 69 | out1, h1 = cell1(out0, h1) 70 | 71 | 72 | #%% 73 | print(x.shape) 74 | # 保存上一层的所有时间戳上面的输出 75 | middle_sequences = [] 76 | # 计算第一层的所有时间戳上的输出,并保存 77 | for xt in tf.unstack(x, axis=1): 78 | out0, h0 = cell0(xt, h0) 79 | middle_sequences.append(out0) 80 | # 计算第二层的所有时间戳上的输出 81 | # 如果不是末层,需要保存所有时间戳上面的输出 82 | for xt in middle_sequences: 83 | out1, h1 = cell1(xt, h1) 84 | 85 | 86 | #%% 87 | layer = layers.SimpleRNN(64) 88 | x = tf.random.normal([4, 80, 100]) 89 | out = layer(x) 90 | out.shape 91 | 92 | #%% 93 | layer = layers.SimpleRNN(64,return_sequences=True) 94 | out = layer(x) 95 | out 96 | 97 | #%% 98 | net = keras.Sequential([ # 构建2层RNN网络 99 | # 除最末层外,都需要返回所有时间戳的输出 100 | layers.SimpleRNN(64, return_sequences=True), 101 | 
layers.SimpleRNN(64), 102 | ]) 103 | out = net(x) 104 | 105 | 106 | 107 | #%% 108 | W = tf.ones([2,2]) # 任意创建某矩阵 109 | eigenvalues = tf.linalg.eigh(W)[0] # 计算特征值 110 | eigenvalues 111 | #%% 112 | val = [W] 113 | for i in range(10): # 矩阵相乘n次方 114 | val.append([val[-1]@W]) 115 | # 计算L2范数 116 | norm = list(map(lambda x:tf.norm(x).numpy(),val)) 117 | plt.plot(range(1,12),norm) 118 | plt.xlabel('n times') 119 | plt.ylabel('L2-norm') 120 | plt.savefig('w_n_times_1.svg') 121 | #%% 122 | W = tf.ones([2,2])*0.4 # 任意创建某矩阵 123 | eigenvalues = tf.linalg.eigh(W)[0] # 计算特征值 124 | print(eigenvalues) 125 | val = [W] 126 | for i in range(10): 127 | val.append([val[-1]@W]) 128 | norm = list(map(lambda x:tf.norm(x).numpy(),val)) 129 | plt.plot(range(1,12),norm) 130 | plt.xlabel('n times') 131 | plt.ylabel('L2-norm') 132 | plt.savefig('w_n_times_0.svg') 133 | #%% 134 | a=tf.random.uniform([2,2]) 135 | tf.clip_by_value(a,0.4,0.6) # 梯度值裁剪 136 | 137 | #%% 138 | 139 | 140 | 141 | 142 | #%% 143 | a=tf.random.uniform([2,2]) * 5 144 | # 按范数方式裁剪 145 | b = tf.clip_by_norm(a, 5) 146 | tf.norm(a),tf.norm(b) 147 | 148 | #%% 149 | w1=tf.random.normal([3,3]) # 创建梯度张量1 150 | w2=tf.random.normal([3,3]) # 创建梯度张量2 151 | # 计算global norm 152 | global_norm=tf.math.sqrt(tf.norm(w1)**2+tf.norm(w2)**2) 153 | # 根据global norm和max norm=2裁剪 154 | (ww1,ww2),global_norm=tf.clip_by_global_norm([w1,w2],2) 155 | # 计算裁剪后的张量组的global norm 156 | global_norm2 = tf.math.sqrt(tf.norm(ww1)**2+tf.norm(ww2)**2) 157 | print(global_norm, global_norm2) 158 | 159 | #%% 160 | with tf.GradientTape() as tape: 161 | logits = model(x) # 前向传播 162 | loss = criteon(y, logits) # 误差计算 163 | # 计算梯度值 164 | grads = tape.gradient(loss, model.trainable_variables) 165 | grads, _ = tf.clip_by_global_norm(grads, 25) # 全局梯度裁剪 166 | # 利用裁剪后的梯度张量更新参数 167 | optimizer.apply_gradients(zip(grads, model.trainable_variables)) 168 | 169 | #%% 170 | x = tf.random.normal([2,80,100]) 171 | xt = x[:,0,:] # 得到一个时间戳的输入 172 | cell = layers.LSTMCell(64) # 创建Cell 173 | # 初始化状态和输出List,[h,c] 174 | state = [tf.zeros([2,64]),tf.zeros([2,64])] 175 | out, state = cell(xt, state) # 前向计算 176 | id(out),id(state[0]),id(state[1]) 177 | 178 | 179 | #%% 180 | net = layers.LSTM(4) 181 | net.build(input_shape=(None,5,3)) 182 | net.trainable_variables 183 | #%% 184 | 185 | net = layers.GRU(4) 186 | net.build(input_shape=(None,5,3)) 187 | net.trainable_variables 188 | 189 | #%% 190 | # 初始化状态向量 191 | h = [tf.zeros([2,64])] 192 | cell = layers.GRUCell(64) # 新建GRU Cell 193 | for xt in tf.unstack(x, axis=1): 194 | out, h = cell(xt, h) 195 | out.shape 196 | 197 | 198 | #%% 199 | -------------------------------------------------------------------------------- /ch11-循环神经网络/pretrained.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import numpy as np 4 | import tensorflow as tf 5 | from tensorflow import keras 6 | from tensorflow.keras import layers 7 | from tensorflow.keras.preprocessing.text import Tokenizer 8 | from tensorflow.keras.preprocessing.sequence import pad_sequences 9 | 10 | 11 | 12 | BASE_DIR = '' 13 | GLOVE_DIR = os.path.join(BASE_DIR, 'glove.6B') 14 | TEXT_DATA_DIR = os.path.join(BASE_DIR, '20_newsgroup') 15 | MAX_SEQUENCE_LENGTH = 1000 16 | MAX_NUM_WORDS = 20000 17 | EMBEDDING_DIM = 100 18 | VALIDATION_SPLIT = 0.2 19 | 20 | # first, build index mapping words in the embeddings set 21 | # to their embedding vector 22 | 23 | print('Indexing word vectors.') 24 | 25 | embeddings_index = {} 26 | with open(os.path.join(GLOVE_DIR, 
'glove.6B.100d.txt')) as f: 27 | for line in f: 28 | values = line.split() 29 | word = values[0] 30 | coefs = np.asarray(values[1:], dtype='float32') 31 | embeddings_index[word] = coefs 32 | 33 | print('Found %s word vectors.' % len(embeddings_index)) 34 | 35 | # second, prepare text samples and their labels 36 | print('Processing text dataset') 37 | 38 | texts = [] # list of text samples 39 | labels_index = {} # dictionary mapping label name to numeric id 40 | labels = [] # list of label ids 41 | for name in sorted(os.listdir(TEXT_DATA_DIR)): 42 | path = os.path.join(TEXT_DATA_DIR, name) 43 | if os.path.isdir(path): 44 | label_id = len(labels_index) 45 | labels_index[name] = label_id 46 | for fname in sorted(os.listdir(path)): 47 | if fname.isdigit(): 48 | fpath = os.path.join(path, fname) 49 | args = {} if sys.version_info < (3,) else {'encoding': 'latin-1'} 50 | with open(fpath, **args) as f: 51 | t = f.read() 52 | i = t.find('\n\n') # skip header 53 | if 0 < i: 54 | t = t[i:] 55 | texts.append(t) 56 | labels.append(label_id) 57 | 58 | print('Found %s texts.' % len(texts)) 59 | 60 | # finally, vectorize the text samples into a 2D integer tensor 61 | tokenizer = Tokenizer(num_words=MAX_NUM_WORDS) 62 | tokenizer.fit_on_texts(texts) 63 | sequences = tokenizer.texts_to_sequences(texts) 64 | 65 | word_index = tokenizer.word_index 66 | print('Found %s unique tokens.' % len(word_index)) 67 | 68 | data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH) 69 | 70 | labels = to_categorical(np.asarray(labels)) 71 | print('Shape of data tensor:', data.shape) 72 | print('Shape of label tensor:', labels.shape) 73 | 74 | # split the data into a training set and a validation set 75 | indices = np.arange(data.shape[0]) 76 | np.random.shuffle(indices) 77 | data = data[indices] 78 | labels = labels[indices] 79 | num_validation_samples = int(VALIDATION_SPLIT * data.shape[0]) 80 | 81 | x_train = data[:-num_validation_samples] 82 | y_train = labels[:-num_validation_samples] 83 | x_val = data[-num_validation_samples:] 84 | y_val = labels[-num_validation_samples:] 85 | 86 | print('Preparing embedding matrix.') 87 | 88 | # prepare embedding matrix 89 | num_words = min(MAX_NUM_WORDS, len(word_index)) + 1 90 | embedding_matrix = np.zeros((num_words, EMBEDDING_DIM)) 91 | for word, i in word_index.items(): 92 | if i > MAX_NUM_WORDS: 93 | continue 94 | embedding_vector = embeddings_index.get(word) 95 | if embedding_vector is not None: 96 | # words not found in embedding index will be all-zeros. 
97 | embedding_matrix[i] = embedding_vector 98 | 99 | # load pre-trained word embeddings into an Embedding layer 100 | # note that we set trainable = False so as to keep the embeddings fixed 101 | embedding_layer = Embedding(num_words, 102 | EMBEDDING_DIM, 103 | embeddings_initializer=Constant(embedding_matrix), 104 | input_length=MAX_SEQUENCE_LENGTH, 105 | trainable=False) 106 | 107 | print('Training model.') 108 | 109 | # train a 1D convnet with global maxpooling 110 | sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32') 111 | embedded_sequences = embedding_layer(sequence_input) 112 | x = Conv1D(128, 5, activation='relu')(embedded_sequences) 113 | x = MaxPooling1D(5)(x) 114 | x = Conv1D(128, 5, activation='relu')(x) 115 | x = MaxPooling1D(5)(x) 116 | x = Conv1D(128, 5, activation='relu')(x) 117 | x = GlobalMaxPooling1D()(x) 118 | x = Dense(128, activation='relu')(x) 119 | preds = Dense(len(labels_index), activation='softmax')(x) 120 | 121 | model = Model(sequence_input, preds) 122 | model.compile(loss='categorical_crossentropy', 123 | optimizer='rmsprop', 124 | metrics=['acc']) 125 | 126 | model.fit(x_train, y_train, 127 | batch_size=128, 128 | epochs=10, 129 | validation_data=(x_val, y_val)) -------------------------------------------------------------------------------- /ch11-循环神经网络/sentiment_analysis_cell - GRU.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import os 3 | import tensorflow as tf 4 | import numpy as np 5 | from tensorflow import keras 6 | from tensorflow.keras import layers, losses, optimizers, Sequential 7 | 8 | 9 | tf.random.set_seed(22) 10 | np.random.seed(22) 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 12 | assert tf.__version__.startswith('2.') 13 | 14 | batchsz = 128 # 批量大小 15 | total_words = 10000 # 词汇表大小N_vocab 16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充 17 | embedding_len = 100 # 词向量特征长度f 18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词 19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words) 20 | print(x_train.shape, len(x_train[0]), y_train.shape) 21 | print(x_test.shape, len(x_test[0]), y_test.shape) 22 | #%% 23 | x_train[0] 24 | #%% 25 | # 数字编码表 26 | word_index = keras.datasets.imdb.get_word_index() 27 | # for k,v in word_index.items(): 28 | # print(k,v) 29 | #%% 30 | word_index = {k:(v+3) for k,v in word_index.items()} 31 | word_index[""] = 0 32 | word_index[""] = 1 33 | word_index[""] = 2 # unknown 34 | word_index[""] = 3 35 | # 翻转编码表 36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()]) 37 | 38 | def decode_review(text): 39 | return ' '.join([reverse_word_index.get(i, '?') for i in text]) 40 | 41 | decode_review(x_train[8]) 42 | 43 | #%% 44 | 45 | # x_train:[b, 80] 46 | # x_test: [b, 80] 47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充 48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len) 49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len) 50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch 51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train)) 52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True) 53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 54 | db_test = db_test.batch(batchsz, drop_remainder=True) 55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train)) 56 | print('x_test shape:', x_test.shape) 57 | 58 | #%% 59 | 60 | class MyRNN(keras.Model): 61 | # Cell方式构建多层网络 62 
| def __init__(self, units): 63 | super(MyRNN, self).__init__() 64 | # [b, 64],构建Cell初始化状态向量,重复使用 65 | self.state0 = [tf.zeros([batchsz, units])] 66 | self.state1 = [tf.zeros([batchsz, units])] 67 | # 词向量编码 [b, 80] => [b, 80, 100] 68 | self.embedding = layers.Embedding(total_words, embedding_len, 69 | input_length=max_review_len) 70 | # 构建2个Cell 71 | self.rnn_cell0 = layers.GRUCell(units, dropout=0.5) 72 | self.rnn_cell1 = layers.GRUCell(units, dropout=0.5) 73 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类 74 | # [b, 80, 100] => [b, 64] => [b, 1] 75 | self.outlayer = Sequential([ 76 | layers.Dense(units), 77 | layers.Dropout(rate=0.5), 78 | layers.ReLU(), 79 | layers.Dense(1)]) 80 | 81 | def call(self, inputs, training=None): 82 | x = inputs # [b, 80] 83 | # embedding: [b, 80] => [b, 80, 100] 84 | x = self.embedding(x) 85 | # rnn cell compute,[b, 80, 100] => [b, 64] 86 | state0 = self.state0 87 | state1 = self.state1 88 | for word in tf.unstack(x, axis=1): # word: [b, 100] 89 | out0, state0 = self.rnn_cell0(word, state0, training) 90 | out1, state1 = self.rnn_cell1(out0, state1, training) 91 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1] 92 | x = self.outlayer(out1, training) 93 | # p(y is pos|x) 94 | prob = tf.sigmoid(x) 95 | 96 | return prob 97 | 98 | def main(): 99 | units = 64 # RNN状态向量长度f 100 | epochs = 50 # 训练epochs 101 | 102 | model = MyRNN(units) 103 | # 装配 104 | model.compile(optimizer = optimizers.RMSprop(0.001), 105 | loss = losses.BinaryCrossentropy(), 106 | metrics=['accuracy']) 107 | # 训练和验证 108 | model.fit(db_train, epochs=epochs, validation_data=db_test) 109 | # 测试 110 | model.evaluate(db_test) 111 | 112 | 113 | if __name__ == '__main__': 114 | main() 115 | -------------------------------------------------------------------------------- /ch11-循环神经网络/sentiment_analysis_cell - LSTM.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import os 3 | import tensorflow as tf 4 | import numpy as np 5 | from tensorflow import keras 6 | from tensorflow.keras import layers, losses, optimizers, Sequential 7 | 8 | 9 | tf.random.set_seed(22) 10 | np.random.seed(22) 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 12 | assert tf.__version__.startswith('2.') 13 | 14 | batchsz = 128 # 批量大小 15 | total_words = 10000 # 词汇表大小N_vocab 16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充 17 | embedding_len = 100 # 词向量特征长度f 18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词 19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words) 20 | print(x_train.shape, len(x_train[0]), y_train.shape) 21 | print(x_test.shape, len(x_test[0]), y_test.shape) 22 | #%% 23 | x_train[0] 24 | #%% 25 | # 数字编码表 26 | word_index = keras.datasets.imdb.get_word_index() 27 | # for k,v in word_index.items(): 28 | # print(k,v) 29 | #%% 30 | word_index = {k:(v+3) for k,v in word_index.items()} 31 | word_index[""] = 0 32 | word_index[""] = 1 33 | word_index[""] = 2 # unknown 34 | word_index[""] = 3 35 | # 翻转编码表 36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()]) 37 | 38 | def decode_review(text): 39 | return ' '.join([reverse_word_index.get(i, '?') for i in text]) 40 | 41 | decode_review(x_train[8]) 42 | 43 | #%% 44 | 45 | # x_train:[b, 80] 46 | # x_test: [b, 80] 47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充 48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len) 49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len) 50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch 51 | db_train = 
tf.data.Dataset.from_tensor_slices((x_train, y_train)) 52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True) 53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 54 | db_test = db_test.batch(batchsz, drop_remainder=True) 55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train)) 56 | print('x_test shape:', x_test.shape) 57 | 58 | #%% 59 | 60 | class MyRNN(keras.Model): 61 | # Cell方式构建多层网络 62 | def __init__(self, units): 63 | super(MyRNN, self).__init__() 64 | # [b, 64],构建Cell初始化状态向量,重复使用 65 | self.state0 = [tf.zeros([batchsz, units]),tf.zeros([batchsz, units])] 66 | self.state1 = [tf.zeros([batchsz, units]),tf.zeros([batchsz, units])] 67 | # 词向量编码 [b, 80] => [b, 80, 100] 68 | self.embedding = layers.Embedding(total_words, embedding_len, 69 | input_length=max_review_len) 70 | # 构建2个Cell 71 | self.rnn_cell0 = layers.LSTMCell(units, dropout=0.5) 72 | self.rnn_cell1 = layers.LSTMCell(units, dropout=0.5) 73 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类 74 | # [b, 80, 100] => [b, 64] => [b, 1] 75 | self.outlayer = Sequential([ 76 | layers.Dense(units), 77 | layers.Dropout(rate=0.5), 78 | layers.ReLU(), 79 | layers.Dense(1)]) 80 | 81 | def call(self, inputs, training=None): 82 | x = inputs # [b, 80] 83 | # embedding: [b, 80] => [b, 80, 100] 84 | x = self.embedding(x) 85 | # rnn cell compute,[b, 80, 100] => [b, 64] 86 | state0 = self.state0 87 | state1 = self.state1 88 | for word in tf.unstack(x, axis=1): # word: [b, 100] 89 | out0, state0 = self.rnn_cell0(word, state0, training) 90 | out1, state1 = self.rnn_cell1(out0, state1, training) 91 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1] 92 | x = self.outlayer(out1,training) 93 | # p(y is pos|x) 94 | prob = tf.sigmoid(x) 95 | 96 | return prob 97 | 98 | def main(): 99 | units = 64 # RNN状态向量长度f 100 | epochs = 50 # 训练epochs 101 | 102 | model = MyRNN(units) 103 | # 装配 104 | model.compile(optimizer = optimizers.RMSprop(0.001), 105 | loss = losses.BinaryCrossentropy(), 106 | metrics=['accuracy']) 107 | # 训练和验证 108 | model.fit(db_train, epochs=epochs, validation_data=db_test) 109 | # 测试 110 | model.evaluate(db_test) 111 | 112 | 113 | if __name__ == '__main__': 114 | main() 115 | -------------------------------------------------------------------------------- /ch11-循环神经网络/sentiment_analysis_cell.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import os 3 | import tensorflow as tf 4 | import numpy as np 5 | from tensorflow import keras 6 | from tensorflow.keras import layers, losses, optimizers, Sequential 7 | 8 | 9 | tf.random.set_seed(22) 10 | np.random.seed(22) 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 12 | assert tf.__version__.startswith('2.') 13 | 14 | batchsz = 128 # 批量大小 15 | total_words = 10000 # 词汇表大小N_vocab 16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充 17 | embedding_len = 100 # 词向量特征长度f 18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词 19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words) 20 | print(x_train.shape, len(x_train[0]), y_train.shape) 21 | print(x_test.shape, len(x_test[0]), y_test.shape) 22 | #%% 23 | x_train[0] 24 | #%% 25 | # 数字编码表 26 | word_index = keras.datasets.imdb.get_word_index() 27 | # for k,v in word_index.items(): 28 | # print(k,v) 29 | #%% 30 | word_index = {k:(v+3) for k,v in word_index.items()} 31 | word_index[""] = 0 32 | word_index[""] = 1 33 | word_index[""] = 2 # unknown 34 | word_index[""] = 3 35 | # 翻转编码表 36 | reverse_word_index = dict([(value, key) for (key, value) in 
word_index.items()]) 37 | 38 | def decode_review(text): 39 | return ' '.join([reverse_word_index.get(i, '?') for i in text]) 40 | 41 | decode_review(x_train[8]) 42 | 43 | #%% 44 | 45 | # x_train:[b, 80] 46 | # x_test: [b, 80] 47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充 48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len) 49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len) 50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch 51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train)) 52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True) 53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 54 | db_test = db_test.batch(batchsz, drop_remainder=True) 55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train)) 56 | print('x_test shape:', x_test.shape) 57 | 58 | #%% 59 | 60 | class MyRNN(keras.Model): 61 | # Cell方式构建多层网络 62 | def __init__(self, units): 63 | super(MyRNN, self).__init__() 64 | # [b, 64],构建Cell初始化状态向量,重复使用 65 | self.state0 = [tf.zeros([batchsz, units])] 66 | self.state1 = [tf.zeros([batchsz, units])] 67 | # 词向量编码 [b, 80] => [b, 80, 100] 68 | self.embedding = layers.Embedding(total_words, embedding_len, 69 | input_length=max_review_len) 70 | # 构建2个Cell 71 | self.rnn_cell0 = layers.SimpleRNNCell(units, dropout=0.5) 72 | self.rnn_cell1 = layers.SimpleRNNCell(units, dropout=0.5) 73 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类 74 | # [b, 80, 100] => [b, 64] => [b, 1] 75 | self.outlayer = Sequential([ 76 | layers.Dense(units), 77 | layers.Dropout(rate=0.5), 78 | layers.ReLU(), 79 | layers.Dense(1)]) 80 | 81 | def call(self, inputs, training=None): 82 | x = inputs # [b, 80] 83 | # embedding: [b, 80] => [b, 80, 100] 84 | x = self.embedding(x) 85 | # rnn cell compute,[b, 80, 100] => [b, 64] 86 | state0 = self.state0 87 | state1 = self.state1 88 | for word in tf.unstack(x, axis=1): # word: [b, 100] 89 | out0, state0 = self.rnn_cell0(word, state0, training) 90 | out1, state1 = self.rnn_cell1(out0, state1, training) 91 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1] 92 | x = self.outlayer(out1, training) 93 | # p(y is pos|x) 94 | prob = tf.sigmoid(x) 95 | 96 | return prob 97 | 98 | def main(): 99 | units = 64 # RNN状态向量长度f 100 | epochs = 50 # 训练epochs 101 | 102 | model = MyRNN(units) 103 | # 装配 104 | model.compile(optimizer = optimizers.RMSprop(0.001), 105 | loss = losses.BinaryCrossentropy(), 106 | metrics=['accuracy']) 107 | # 训练和验证 108 | model.fit(db_train, epochs=epochs, validation_data=db_test) 109 | # 测试 110 | model.evaluate(db_test) 111 | 112 | 113 | if __name__ == '__main__': 114 | main() 115 | -------------------------------------------------------------------------------- /ch11-循环神经网络/sentiment_analysis_layer - GRU.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import os 3 | import tensorflow as tf 4 | import numpy as np 5 | from tensorflow import keras 6 | from tensorflow.keras import layers, losses, optimizers, Sequential 7 | 8 | 9 | tf.random.set_seed(22) 10 | np.random.seed(22) 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 12 | assert tf.__version__.startswith('2.') 13 | 14 | batchsz = 128 # 批量大小 15 | total_words = 10000 # 词汇表大小N_vocab 16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充 17 | embedding_len = 100 # 词向量特征长度f 18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词 19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words) 20 | print(x_train.shape, len(x_train[0]), y_train.shape) 
21 | print(x_test.shape, len(x_test[0]), y_test.shape) 22 | #%% 23 | x_train[0] 24 | #%% 25 | # 数字编码表 26 | word_index = keras.datasets.imdb.get_word_index() 27 | # for k,v in word_index.items(): 28 | # print(k,v) 29 | #%% 30 | word_index = {k:(v+3) for k,v in word_index.items()} 31 | word_index[""] = 0 32 | word_index[""] = 1 33 | word_index[""] = 2 # unknown 34 | word_index[""] = 3 35 | # 翻转编码表 36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()]) 37 | 38 | def decode_review(text): 39 | return ' '.join([reverse_word_index.get(i, '?') for i in text]) 40 | 41 | decode_review(x_train[8]) 42 | 43 | #%% 44 | 45 | # x_train:[b, 80] 46 | # x_test: [b, 80] 47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充 48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len) 49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len) 50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch 51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train)) 52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True) 53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 54 | db_test = db_test.batch(batchsz, drop_remainder=True) 55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train)) 56 | print('x_test shape:', x_test.shape) 57 | 58 | #%% 59 | 60 | class MyRNN(keras.Model): 61 | # Cell方式构建多层网络 62 | def __init__(self, units): 63 | super(MyRNN, self).__init__() 64 | # 词向量编码 [b, 80] => [b, 80, 100] 65 | self.embedding = layers.Embedding(total_words, embedding_len, 66 | input_length=max_review_len) 67 | # 构建RNN 68 | self.rnn = keras.Sequential([ 69 | layers.GRU(units, dropout=0.5, return_sequences=True), 70 | layers.GRU(units, dropout=0.5) 71 | ]) 72 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类 73 | # [b, 80, 100] => [b, 64] => [b, 1] 74 | self.outlayer = Sequential([ 75 | layers.Dense(32), 76 | layers.Dropout(rate=0.5), 77 | layers.ReLU(), 78 | layers.Dense(1)]) 79 | 80 | def call(self, inputs, training=None): 81 | x = inputs # [b, 80] 82 | # embedding: [b, 80] => [b, 80, 100] 83 | x = self.embedding(x) 84 | # rnn cell compute,[b, 80, 100] => [b, 64] 85 | x = self.rnn(x) 86 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1] 87 | x = self.outlayer(x,training) 88 | # p(y is pos|x) 89 | prob = tf.sigmoid(x) 90 | 91 | return prob 92 | 93 | def main(): 94 | units = 32 # RNN状态向量长度f 95 | epochs = 50 # 训练epochs 96 | 97 | model = MyRNN(units) 98 | # 装配 99 | model.compile(optimizer = optimizers.Adam(0.001), 100 | loss = losses.BinaryCrossentropy(), 101 | metrics=['accuracy']) 102 | # 训练和验证 103 | model.fit(db_train, epochs=epochs, validation_data=db_test) 104 | # 测试 105 | model.evaluate(db_test) 106 | 107 | 108 | if __name__ == '__main__': 109 | main() 110 | -------------------------------------------------------------------------------- /ch11-循环神经网络/sentiment_analysis_layer - LSTM - pretrained.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import os 3 | import tensorflow as tf 4 | import numpy as np 5 | from tensorflow import keras 6 | from tensorflow.keras import layers, losses, optimizers, Sequential 7 | 8 | 9 | tf.random.set_seed(22) 10 | np.random.seed(22) 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 12 | assert tf.__version__.startswith('2.') 13 | 14 | batchsz = 128 # 批量大小 15 | total_words = 10000 # 词汇表大小N_vocab 16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充 17 | embedding_len = 100 # 词向量特征长度f 18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词 19 | (x_train, 
y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words) 20 | print(x_train.shape, len(x_train[0]), y_train.shape) 21 | print(x_test.shape, len(x_test[0]), y_test.shape) 22 | #%% 23 | x_train[0] 24 | #%% 25 | # 数字编码表 26 | word_index = keras.datasets.imdb.get_word_index() 27 | # for k,v in word_index.items(): 28 | # print(k,v) 29 | #%% 30 | word_index = {k:(v+3) for k,v in word_index.items()} 31 | word_index[""] = 0 32 | word_index[""] = 1 33 | word_index[""] = 2 # unknown 34 | word_index[""] = 3 35 | # 翻转编码表 36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()]) 37 | 38 | def decode_review(text): 39 | return ' '.join([reverse_word_index.get(i, '?') for i in text]) 40 | 41 | decode_review(x_train[8]) 42 | 43 | #%% 44 | print('Indexing word vectors.') 45 | embeddings_index = {} 46 | GLOVE_DIR = r'C:\Users\z390\Downloads\glove6b50dtxt' 47 | with open(os.path.join(GLOVE_DIR, 'glove.6B.100d.txt'),encoding='utf-8') as f: 48 | for line in f: 49 | values = line.split() 50 | word = values[0] 51 | coefs = np.asarray(values[1:], dtype='float32') 52 | embeddings_index[word] = coefs 53 | 54 | print('Found %s word vectors.' % len(embeddings_index)) 55 | #%% 56 | len(embeddings_index.keys()) 57 | len(word_index.keys()) 58 | #%% 59 | MAX_NUM_WORDS = total_words 60 | # prepare embedding matrix 61 | num_words = min(MAX_NUM_WORDS, len(word_index)) 62 | embedding_matrix = np.zeros((num_words, embedding_len)) 63 | applied_vec_count = 0 64 | for word, i in word_index.items(): 65 | if i >= MAX_NUM_WORDS: 66 | continue 67 | embedding_vector = embeddings_index.get(word) 68 | # print(word,embedding_vector) 69 | if embedding_vector is not None: 70 | # words not found in embedding index will be all-zeros. 71 | embedding_matrix[i] = embedding_vector 72 | applied_vec_count += 1 73 | print(applied_vec_count, embedding_matrix.shape) 74 | 75 | #%% 76 | # x_train:[b, 80] 77 | # x_test: [b, 80] 78 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充 79 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len) 80 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len) 81 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch 82 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train)) 83 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True) 84 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 85 | db_test = db_test.batch(batchsz, drop_remainder=True) 86 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train)) 87 | print('x_test shape:', x_test.shape) 88 | 89 | #%% 90 | 91 | class MyRNN(keras.Model): 92 | # Cell方式构建多层网络 93 | def __init__(self, units): 94 | super(MyRNN, self).__init__() 95 | # 词向量编码 [b, 80] => [b, 80, 100] 96 | self.embedding = layers.Embedding(total_words, embedding_len, 97 | input_length=max_review_len, 98 | trainable=False) 99 | self.embedding.build(input_shape=(None,max_review_len)) 100 | # self.embedding.set_weights([embedding_matrix]) 101 | # 构建RNN 102 | self.rnn = keras.Sequential([ 103 | layers.LSTM(units, dropout=0.5, return_sequences=True), 104 | layers.LSTM(units, dropout=0.5) 105 | ]) 106 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类 107 | # [b, 80, 100] => [b, 64] => [b, 1] 108 | self.outlayer = Sequential([ 109 | layers.Dense(32), 110 | layers.Dropout(rate=0.5), 111 | layers.ReLU(), 112 | layers.Dense(1)]) 113 | 114 | def call(self, inputs, training=None): 115 | x = inputs # [b, 80] 116 | # embedding: [b, 80] => [b, 80, 100] 117 | x = self.embedding(x) 
118 | # rnn cell compute,[b, 80, 100] => [b, 64] 119 | x = self.rnn(x) 120 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1] 121 | x = self.outlayer(x,training) 122 | # p(y is pos|x) 123 | prob = tf.sigmoid(x) 124 | 125 | return prob 126 | 127 | def main(): 128 | units = 512 # RNN状态向量长度f 129 | epochs = 50 # 训练epochs 130 | 131 | model = MyRNN(units) 132 | # 装配 133 | model.compile(optimizer = optimizers.Adam(0.001), 134 | loss = losses.BinaryCrossentropy(), 135 | metrics=['accuracy']) 136 | # 训练和验证 137 | model.fit(db_train, epochs=epochs, validation_data=db_test) 138 | # 测试 139 | model.evaluate(db_test) 140 | 141 | 142 | if __name__ == '__main__': 143 | main() 144 | 145 | 146 | #%% 147 | -------------------------------------------------------------------------------- /ch11-循环神经网络/sentiment_analysis_layer - LSTM.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import os 3 | import tensorflow as tf 4 | import numpy as np 5 | from tensorflow import keras 6 | from tensorflow.keras import layers, losses, optimizers, Sequential 7 | 8 | 9 | tf.random.set_seed(22) 10 | np.random.seed(22) 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 12 | assert tf.__version__.startswith('2.') 13 | 14 | batchsz = 128 # 批量大小 15 | total_words = 10000 # 词汇表大小N_vocab 16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充 17 | embedding_len = 100 # 词向量特征长度f 18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词 19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words) 20 | print(x_train.shape, len(x_train[0]), y_train.shape) 21 | print(x_test.shape, len(x_test[0]), y_test.shape) 22 | #%% 23 | x_train[0] 24 | #%% 25 | # 数字编码表 26 | word_index = keras.datasets.imdb.get_word_index() 27 | # for k,v in word_index.items(): 28 | # print(k,v) 29 | #%% 30 | word_index = {k:(v+3) for k,v in word_index.items()} 31 | word_index[""] = 0 32 | word_index[""] = 1 33 | word_index[""] = 2 # unknown 34 | word_index[""] = 3 35 | # 翻转编码表 36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()]) 37 | 38 | def decode_review(text): 39 | return ' '.join([reverse_word_index.get(i, '?') for i in text]) 40 | 41 | decode_review(x_train[8]) 42 | 43 | #%% 44 | 45 | # x_train:[b, 80] 46 | # x_test: [b, 80] 47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充 48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len) 49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len) 50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch 51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train)) 52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True) 53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 54 | db_test = db_test.batch(batchsz, drop_remainder=True) 55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train)) 56 | print('x_test shape:', x_test.shape) 57 | 58 | #%% 59 | 60 | class MyRNN(keras.Model): 61 | # Cell方式构建多层网络 62 | def __init__(self, units): 63 | super(MyRNN, self).__init__() 64 | # 词向量编码 [b, 80] => [b, 80, 100] 65 | self.embedding = layers.Embedding(total_words, embedding_len, 66 | input_length=max_review_len) 67 | # 构建RNN 68 | self.rnn = keras.Sequential([ 69 | layers.LSTM(units, dropout=0.5, return_sequences=True), 70 | layers.LSTM(units, dropout=0.5) 71 | ]) 72 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类 73 | # [b, 80, 100] => [b, 64] => [b, 1] 74 | self.outlayer = Sequential([ 75 | layers.Dense(32), 76 | layers.Dropout(rate=0.5), 77 | layers.ReLU(), 
78 | layers.Dense(1)]) 79 | 80 | def call(self, inputs, training=None): 81 | x = inputs # [b, 80] 82 | # embedding: [b, 80] => [b, 80, 100] 83 | x = self.embedding(x) 84 | # rnn cell compute,[b, 80, 100] => [b, 64] 85 | x = self.rnn(x) 86 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1] 87 | x = self.outlayer(x,training) 88 | # p(y is pos|x) 89 | prob = tf.sigmoid(x) 90 | 91 | return prob 92 | 93 | def main(): 94 | units = 32 # RNN状态向量长度f 95 | epochs = 50 # 训练epochs 96 | 97 | model = MyRNN(units) 98 | # 装配 99 | model.compile(optimizer = optimizers.Adam(0.001), 100 | loss = losses.BinaryCrossentropy(), 101 | metrics=['accuracy']) 102 | # 训练和验证 103 | model.fit(db_train, epochs=epochs, validation_data=db_test) 104 | # 测试 105 | model.evaluate(db_test) 106 | 107 | 108 | if __name__ == '__main__': 109 | main() 110 | -------------------------------------------------------------------------------- /ch11-循环神经网络/sentiment_analysis_layer.py: -------------------------------------------------------------------------------- 1 | #%% 2 | import os 3 | import tensorflow as tf 4 | import numpy as np 5 | from tensorflow import keras 6 | from tensorflow.keras import layers, losses, optimizers, Sequential 7 | 8 | 9 | tf.random.set_seed(22) 10 | np.random.seed(22) 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 12 | assert tf.__version__.startswith('2.') 13 | 14 | batchsz = 512 # 批量大小 15 | total_words = 10000 # 词汇表大小N_vocab 16 | max_review_len = 80 # 句子最大长度s,大于的句子部分将截断,小于的将填充 17 | embedding_len = 100 # 词向量特征长度f 18 | # 加载IMDB数据集,此处的数据采用数字编码,一个数字代表一个单词 19 | (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words) 20 | print(x_train.shape, len(x_train[0]), y_train.shape) 21 | print(x_test.shape, len(x_test[0]), y_test.shape) 22 | #%% 23 | x_train[0] 24 | #%% 25 | # 数字编码表 26 | word_index = keras.datasets.imdb.get_word_index() 27 | # for k,v in word_index.items(): 28 | # print(k,v) 29 | #%% 30 | word_index = {k:(v+3) for k,v in word_index.items()} 31 | word_index[""] = 0 32 | word_index[""] = 1 33 | word_index[""] = 2 # unknown 34 | word_index[""] = 3 35 | # 翻转编码表 36 | reverse_word_index = dict([(value, key) for (key, value) in word_index.items()]) 37 | 38 | def decode_review(text): 39 | return ' '.join([reverse_word_index.get(i, '?') for i in text]) 40 | 41 | decode_review(x_train[8]) 42 | 43 | #%% 44 | 45 | # x_train:[b, 80] 46 | # x_test: [b, 80] 47 | # 截断和填充句子,使得等长,此处长句子保留句子后面的部分,短句子在前面填充 48 | x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len) 49 | x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len) 50 | # 构建数据集,打散,批量,并丢掉最后一个不够batchsz的batch 51 | db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train)) 52 | db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True) 53 | db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test)) 54 | db_test = db_test.batch(batchsz, drop_remainder=True) 55 | print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train)) 56 | print('x_test shape:', x_test.shape) 57 | 58 | #%% 59 | 60 | class MyRNN(keras.Model): 61 | # Cell方式构建多层网络 62 | def __init__(self, units): 63 | super(MyRNN, self).__init__() 64 | # 词向量编码 [b, 80] => [b, 80, 100] 65 | self.embedding = layers.Embedding(total_words, embedding_len, 66 | input_length=max_review_len) 67 | # 构建RNN 68 | self.rnn = keras.Sequential([ 69 | layers.SimpleRNN(units, dropout=0.5, return_sequences=True), 70 | layers.SimpleRNN(units, dropout=0.5) 71 | ]) 72 | # 构建分类网络,用于将CELL的输出特征进行分类,2分类 73 | # [b, 80, 100] 
=> [b, 64] => [b, 1] 74 | self.outlayer = Sequential([ 75 | layers.Dense(32), 76 | layers.Dropout(rate=0.5), 77 | layers.ReLU(), 78 | layers.Dense(1)]) 79 | 80 | def call(self, inputs, training=None): 81 | x = inputs # [b, 80] 82 | # embedding: [b, 80] => [b, 80, 100] 83 | x = self.embedding(x) 84 | # rnn cell compute,[b, 80, 100] => [b, 64] 85 | x = self.rnn(x) 86 | # 末层最后一个输出作为分类网络的输入: [b, 64] => [b, 1] 87 | x = self.outlayer(x,training) 88 | # p(y is pos|x) 89 | prob = tf.sigmoid(x) 90 | 91 | return prob 92 | 93 | def main(): 94 | units = 64 # RNN状态向量长度f 95 | epochs = 50 # 训练epochs 96 | 97 | model = MyRNN(units) 98 | # 装配 99 | model.compile(optimizer = optimizers.Adam(0.001), 100 | loss = losses.BinaryCrossentropy(), 101 | metrics=['accuracy']) 102 | # 训练和验证 103 | model.fit(db_train, epochs=epochs, validation_data=db_test) 104 | # 测试 105 | model.evaluate(db_test) 106 | 107 | 108 | if __name__ == '__main__': 109 | main() 110 | -------------------------------------------------------------------------------- /ch11-循环神经网络/循环神经网络.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/循环神经网络.pdf -------------------------------------------------------------------------------- /ch11-循环神经网络/情感分类实战.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/情感分类实战.pdf -------------------------------------------------------------------------------- /ch11-循环神经网络/时间序列表示.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/时间序列表示.pdf -------------------------------------------------------------------------------- /ch11-循环神经网络/梯度弥散与梯度爆炸.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch11-循环神经网络/梯度弥散与梯度爆炸.pdf -------------------------------------------------------------------------------- /ch12-自编码器/AE实战.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch12-自编码器/AE实战.pdf -------------------------------------------------------------------------------- /ch12-自编码器/AutoEncoders.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch12-自编码器/AutoEncoders.pdf -------------------------------------------------------------------------------- /ch12-自编码器/autoencoder.py: -------------------------------------------------------------------------------- 1 | import os 2 | import tensorflow as tf 3 | import numpy as np 4 | from tensorflow import keras 5 | from tensorflow.keras import Sequential, layers 6 | from PIL import Image 7 | from matplotlib import pyplot as plt 8 | 9 | 10 | 11 | tf.random.set_seed(22) 12 | np.random.seed(22) 13 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 14 | assert tf.__version__.startswith('2.') 15 | 16 | 17 | def save_images(imgs, name): 18 | new_im = 
Image.new('L', (280, 280)) 19 | 20 | index = 0 21 | for i in range(0, 280, 28): 22 | for j in range(0, 280, 28): 23 | im = imgs[index] 24 | im = Image.fromarray(im, mode='L') 25 | new_im.paste(im, (i, j)) 26 | index += 1 27 | 28 | new_im.save(name) 29 | 30 | 31 | h_dim = 20 32 | batchsz = 512 33 | lr = 1e-3 34 | 35 | 36 | (x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data() 37 | x_train, x_test = x_train.astype(np.float32) / 255., x_test.astype(np.float32) / 255. 38 | # we do not need label 39 | train_db = tf.data.Dataset.from_tensor_slices(x_train) 40 | train_db = train_db.shuffle(batchsz * 5).batch(batchsz) 41 | test_db = tf.data.Dataset.from_tensor_slices(x_test) 42 | test_db = test_db.batch(batchsz) 43 | 44 | print(x_train.shape, y_train.shape) 45 | print(x_test.shape, y_test.shape) 46 | 47 | 48 | 49 | class AE(keras.Model): 50 | 51 | def __init__(self): 52 | super(AE, self).__init__() 53 | 54 | # Encoders 55 | self.encoder = Sequential([ 56 | layers.Dense(256, activation=tf.nn.relu), 57 | layers.Dense(128, activation=tf.nn.relu), 58 | layers.Dense(h_dim) 59 | ]) 60 | 61 | # Decoders 62 | self.decoder = Sequential([ 63 | layers.Dense(128, activation=tf.nn.relu), 64 | layers.Dense(256, activation=tf.nn.relu), 65 | layers.Dense(784) 66 | ]) 67 | 68 | 69 | def call(self, inputs, training=None): 70 | # [b, 784] => [b, 10] 71 | h = self.encoder(inputs) 72 | # [b, 10] => [b, 784] 73 | x_hat = self.decoder(h) 74 | 75 | return x_hat 76 | 77 | 78 | 79 | model = AE() 80 | model.build(input_shape=(None, 784)) 81 | model.summary() 82 | 83 | optimizer = tf.optimizers.Adam(lr=lr) 84 | 85 | for epoch in range(100): 86 | 87 | for step, x in enumerate(train_db): 88 | 89 | #[b, 28, 28] => [b, 784] 90 | x = tf.reshape(x, [-1, 784]) 91 | 92 | with tf.GradientTape() as tape: 93 | x_rec_logits = model(x) 94 | 95 | rec_loss = tf.losses.binary_crossentropy(x, x_rec_logits, from_logits=True) 96 | rec_loss = tf.reduce_mean(rec_loss) 97 | 98 | grads = tape.gradient(rec_loss, model.trainable_variables) 99 | optimizer.apply_gradients(zip(grads, model.trainable_variables)) 100 | 101 | 102 | if step % 100 ==0: 103 | print(epoch, step, float(rec_loss)) 104 | 105 | 106 | # evaluation 107 | x = next(iter(test_db)) 108 | logits = model(tf.reshape(x, [-1, 784])) 109 | x_hat = tf.sigmoid(logits) 110 | # [b, 784] => [b, 28, 28] 111 | x_hat = tf.reshape(x_hat, [-1, 28, 28]) 112 | 113 | # [b, 28, 28] => [2b, 28, 28] 114 | x_concat = tf.concat([x, x_hat], axis=0) 115 | x_concat = x_hat 116 | x_concat = x_concat.numpy() * 255. 
117 | x_concat = x_concat.astype(np.uint8) 118 | save_images(x_concat, 'ae_images/rec_epoch_%d.png'%epoch) 119 | -------------------------------------------------------------------------------- /ch12-自编码器/vae.py: -------------------------------------------------------------------------------- 1 | import os 2 | import tensorflow as tf 3 | import numpy as np 4 | from tensorflow import keras 5 | from tensorflow.keras import Sequential, layers 6 | from PIL import Image 7 | from matplotlib import pyplot as plt 8 | 9 | 10 | 11 | tf.random.set_seed(22) 12 | np.random.seed(22) 13 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 14 | assert tf.__version__.startswith('2.') 15 | 16 | 17 | def save_images(imgs, name): 18 | new_im = Image.new('L', (280, 280)) 19 | 20 | index = 0 21 | for i in range(0, 280, 28): 22 | for j in range(0, 280, 28): 23 | im = imgs[index] 24 | im = Image.fromarray(im, mode='L') 25 | new_im.paste(im, (i, j)) 26 | index += 1 27 | 28 | new_im.save(name) 29 | 30 | 31 | h_dim = 20 32 | batchsz = 512 33 | lr = 1e-3 34 | 35 | 36 | (x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data() 37 | x_train, x_test = x_train.astype(np.float32) / 255., x_test.astype(np.float32) / 255. 38 | # we do not need label 39 | train_db = tf.data.Dataset.from_tensor_slices(x_train) 40 | train_db = train_db.shuffle(batchsz * 5).batch(batchsz) 41 | test_db = tf.data.Dataset.from_tensor_slices(x_test) 42 | test_db = test_db.batch(batchsz) 43 | 44 | print(x_train.shape, y_train.shape) 45 | print(x_test.shape, y_test.shape) 46 | 47 | z_dim = 10 48 | 49 | class VAE(keras.Model): 50 | 51 | def __init__(self): 52 | super(VAE, self).__init__() 53 | 54 | # Encoder 55 | self.fc1 = layers.Dense(128) 56 | self.fc2 = layers.Dense(z_dim) # get mean prediction 57 | self.fc3 = layers.Dense(z_dim) 58 | 59 | # Decoder 60 | self.fc4 = layers.Dense(128) 61 | self.fc5 = layers.Dense(784) 62 | 63 | def encoder(self, x): 64 | 65 | h = tf.nn.relu(self.fc1(x)) 66 | # get mean 67 | mu = self.fc2(h) 68 | # get variance 69 | log_var = self.fc3(h) 70 | 71 | return mu, log_var 72 | 73 | def decoder(self, z): 74 | 75 | out = tf.nn.relu(self.fc4(z)) 76 | out = self.fc5(out) 77 | 78 | return out 79 | 80 | def reparameterize(self, mu, log_var): 81 | 82 | eps = tf.random.normal(log_var.shape) 83 | 84 | std = tf.exp(log_var*0.5) 85 | 86 | z = mu + std * eps 87 | return z 88 | 89 | def call(self, inputs, training=None): 90 | 91 | # [b, 784] => [b, z_dim], [b, z_dim] 92 | mu, log_var = self.encoder(inputs) 93 | # reparameterization trick 94 | z = self.reparameterize(mu, log_var) 95 | 96 | x_hat = self.decoder(z) 97 | 98 | return x_hat, mu, log_var 99 | 100 | 101 | model = VAE() 102 | model.build(input_shape=(4, 784)) 103 | optimizer = tf.optimizers.Adam(lr) 104 | 105 | for epoch in range(1000): 106 | 107 | for step, x in enumerate(train_db): 108 | 109 | x = tf.reshape(x, [-1, 784]) 110 | 111 | with tf.GradientTape() as tape: 112 | x_rec_logits, mu, log_var = model(x) 113 | 114 | rec_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=x, logits=x_rec_logits) 115 | rec_loss = tf.reduce_sum(rec_loss) / x.shape[0] 116 | 117 | # compute kl divergence (mu, var) ~ N (0, 1) 118 | # https://stats.stackexchange.com/questions/7440/kl-divergence-between-two-univariate-gaussians 119 | kl_div = -0.5 * (log_var + 1 - mu**2 - tf.exp(log_var)) 120 | kl_div = tf.reduce_sum(kl_div) / x.shape[0] 121 | 122 | loss = rec_loss + 1. 
* kl_div 123 | 124 | grads = tape.gradient(loss, model.trainable_variables) 125 | optimizer.apply_gradients(zip(grads, model.trainable_variables)) 126 | 127 | 128 | if step % 100 == 0: 129 | print(epoch, step, 'kl div:', float(kl_div), 'rec loss:', float(rec_loss)) 130 | 131 | 132 | # evaluation 133 | z = tf.random.normal((batchsz, z_dim)) 134 | logits = model.decoder(z) 135 | x_hat = tf.sigmoid(logits) 136 | x_hat = tf.reshape(x_hat, [-1, 28, 28]).numpy() *255. 137 | x_hat = x_hat.astype(np.uint8) 138 | save_images(x_hat, 'vae_images/sampled_epoch%d.png'%epoch) 139 | 140 | x = next(iter(test_db)) 141 | x = tf.reshape(x, [-1, 784]) 142 | x_hat_logits, _, _ = model(x) 143 | x_hat = tf.sigmoid(x_hat_logits) 144 | x_hat = tf.reshape(x_hat, [-1, 28, 28]).numpy() *255. 145 | x_hat = x_hat.astype(np.uint8) 146 | save_images(x_hat, 'vae_images/rec_epoch%d.png'%epoch) 147 | 148 | -------------------------------------------------------------------------------- /ch13-生成对抗网络/GAN.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch13-生成对抗网络/GAN.pdf -------------------------------------------------------------------------------- /ch13-生成对抗网络/GAN实战.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch13-生成对抗网络/GAN实战.pdf -------------------------------------------------------------------------------- /ch13-生成对抗网络/dataset.py: -------------------------------------------------------------------------------- 1 | import multiprocessing 2 | 3 | import tensorflow as tf 4 | 5 | 6 | def make_anime_dataset(img_paths, batch_size, resize=64, drop_remainder=True, shuffle=True, repeat=1): 7 | 8 | # @tf.function 9 | def _map_fn(img): 10 | img = tf.image.resize(img, [resize, resize]) 11 | # img = tf.image.random_crop(img,[resize, resize]) 12 | # img = tf.image.random_flip_left_right(img) 13 | # img = tf.image.random_flip_up_down(img) 14 | img = tf.clip_by_value(img, 0, 255) 15 | img = img / 127.5 - 1 #-1~1 16 | return img 17 | 18 | dataset = disk_image_batch_dataset(img_paths, 19 | batch_size, 20 | drop_remainder=drop_remainder, 21 | map_fn=_map_fn, 22 | shuffle=shuffle, 23 | repeat=repeat) 24 | img_shape = (resize, resize, 3) 25 | len_dataset = len(img_paths) // batch_size 26 | 27 | return dataset, img_shape, len_dataset 28 | 29 | 30 | def batch_dataset(dataset, 31 | batch_size, 32 | drop_remainder=True, 33 | n_prefetch_batch=1, 34 | filter_fn=None, 35 | map_fn=None, 36 | n_map_threads=None, 37 | filter_after_map=False, 38 | shuffle=True, 39 | shuffle_buffer_size=None, 40 | repeat=None): 41 | # set defaults 42 | if n_map_threads is None: 43 | n_map_threads = multiprocessing.cpu_count() 44 | if shuffle and shuffle_buffer_size is None: 45 | shuffle_buffer_size = max(batch_size * 128, 2048) # set the minimum buffer size as 2048 46 | 47 | # [*] it is efficient to conduct `shuffle` before `map`/`filter` because `map`/`filter` is sometimes costly 48 | if shuffle: 49 | dataset = dataset.shuffle(shuffle_buffer_size) 50 | 51 | if not filter_after_map: 52 | if filter_fn: 53 | dataset = dataset.filter(filter_fn) 54 | 55 | if map_fn: 56 | dataset = dataset.map(map_fn, num_parallel_calls=n_map_threads) 57 | 58 | else: # [*] this is slower 59 | if map_fn: 60 | dataset = dataset.map(map_fn, num_parallel_calls=n_map_threads) 61 
| 62 | if filter_fn: 63 | dataset = dataset.filter(filter_fn) 64 | 65 | dataset = dataset.batch(batch_size, drop_remainder=drop_remainder) 66 | 67 | dataset = dataset.repeat(repeat).prefetch(n_prefetch_batch) 68 | 69 | return dataset 70 | 71 | 72 | def memory_data_batch_dataset(memory_data, 73 | batch_size, 74 | drop_remainder=True, 75 | n_prefetch_batch=1, 76 | filter_fn=None, 77 | map_fn=None, 78 | n_map_threads=None, 79 | filter_after_map=False, 80 | shuffle=True, 81 | shuffle_buffer_size=None, 82 | repeat=None): 83 | """Batch dataset of memory data. 84 | 85 | Parameters 86 | ---------- 87 | memory_data : nested structure of tensors/ndarrays/lists 88 | 89 | """ 90 | dataset = tf.data.Dataset.from_tensor_slices(memory_data) 91 | dataset = batch_dataset(dataset, 92 | batch_size, 93 | drop_remainder=drop_remainder, 94 | n_prefetch_batch=n_prefetch_batch, 95 | filter_fn=filter_fn, 96 | map_fn=map_fn, 97 | n_map_threads=n_map_threads, 98 | filter_after_map=filter_after_map, 99 | shuffle=shuffle, 100 | shuffle_buffer_size=shuffle_buffer_size, 101 | repeat=repeat) 102 | return dataset 103 | 104 | 105 | def disk_image_batch_dataset(img_paths, 106 | batch_size, 107 | labels=None, 108 | drop_remainder=True, 109 | n_prefetch_batch=1, 110 | filter_fn=None, 111 | map_fn=None, 112 | n_map_threads=None, 113 | filter_after_map=False, 114 | shuffle=True, 115 | shuffle_buffer_size=None, 116 | repeat=None): 117 | """Batch dataset of disk image for PNG and JPEG. 118 | 119 | Parameters 120 | ---------- 121 | img_paths : 1d-tensor/ndarray/list of str 122 | labels : nested structure of tensors/ndarrays/lists 123 | 124 | """ 125 | if labels is None: 126 | memory_data = img_paths 127 | else: 128 | memory_data = (img_paths, labels) 129 | 130 | def parse_fn(path, *label): 131 | img = tf.io.read_file(path) 132 | img = tf.image.decode_jpeg(img, channels=3) # fix channels to 3 133 | return (img,) + label 134 | 135 | if map_fn: # fuse `map_fn` and `parse_fn` 136 | def map_fn_(*args): 137 | return map_fn(*parse_fn(*args)) 138 | else: 139 | map_fn_ = parse_fn 140 | 141 | dataset = memory_data_batch_dataset(memory_data, 142 | batch_size, 143 | drop_remainder=drop_remainder, 144 | n_prefetch_batch=n_prefetch_batch, 145 | filter_fn=filter_fn, 146 | map_fn=map_fn_, 147 | n_map_threads=n_map_threads, 148 | filter_after_map=filter_after_map, 149 | shuffle=shuffle, 150 | shuffle_buffer_size=shuffle_buffer_size, 151 | repeat=repeat) 152 | 153 | return dataset 154 | -------------------------------------------------------------------------------- /ch13-生成对抗网络/gan.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow import keras 3 | from tensorflow.keras import layers 4 | 5 | 6 | class Generator(keras.Model): 7 | # 生成器网络 8 | def __init__(self): 9 | super(Generator, self).__init__() 10 | filter = 64 11 | # 转置卷积层1,输出channel为filter*8,核大小4,步长1,不使用padding,不使用偏置 12 | self.conv1 = layers.Conv2DTranspose(filter*8, 4,1, 'valid', use_bias=False) 13 | self.bn1 = layers.BatchNormalization() 14 | # 转置卷积层2 15 | self.conv2 = layers.Conv2DTranspose(filter*4, 4,2, 'same', use_bias=False) 16 | self.bn2 = layers.BatchNormalization() 17 | # 转置卷积层3 18 | self.conv3 = layers.Conv2DTranspose(filter*2, 4,2, 'same', use_bias=False) 19 | self.bn3 = layers.BatchNormalization() 20 | # 转置卷积层4 21 | self.conv4 = layers.Conv2DTranspose(filter*1, 4,2, 'same', use_bias=False) 22 | self.bn4 = layers.BatchNormalization() 23 | # 转置卷积层5 24 | self.conv5 = layers.Conv2DTranspose(3, 4,2, 
'same', use_bias=False) 25 | 26 | def call(self, inputs, training=None): 27 | x = inputs # [z, 100] 28 | # Reshape成4D张量,方便后续转置卷积运算:(b, 1, 1, 100) 29 | x = tf.reshape(x, (x.shape[0], 1, 1, x.shape[1])) 30 | x = tf.nn.relu(x) # 激活函数 31 | # 转置卷积-BN-激活函数:(b, 4, 4, 512) 32 | x = tf.nn.relu(self.bn1(self.conv1(x), training=training)) 33 | # 转置卷积-BN-激活函数:(b, 8, 8, 256) 34 | x = tf.nn.relu(self.bn2(self.conv2(x), training=training)) 35 | # 转置卷积-BN-激活函数:(b, 16, 16, 128) 36 | x = tf.nn.relu(self.bn3(self.conv3(x), training=training)) 37 | # 转置卷积-BN-激活函数:(b, 32, 32, 64) 38 | x = tf.nn.relu(self.bn4(self.conv4(x), training=training)) 39 | # 转置卷积-激活函数:(b, 64, 64, 3) 40 | x = self.conv5(x) 41 | x = tf.tanh(x) # 输出x范围-1~1,与预处理一致 42 | 43 | return x 44 | 45 | 46 | class Discriminator(keras.Model): 47 | # 判别器 48 | def __init__(self): 49 | super(Discriminator, self).__init__() 50 | filter = 64 51 | # 卷积层 52 | self.conv1 = layers.Conv2D(filter, 4, 2, 'valid', use_bias=False) 53 | self.bn1 = layers.BatchNormalization() 54 | # 卷积层 55 | self.conv2 = layers.Conv2D(filter*2, 4, 2, 'valid', use_bias=False) 56 | self.bn2 = layers.BatchNormalization() 57 | # 卷积层 58 | self.conv3 = layers.Conv2D(filter*4, 4, 2, 'valid', use_bias=False) 59 | self.bn3 = layers.BatchNormalization() 60 | # 卷积层 61 | self.conv4 = layers.Conv2D(filter*8, 3, 1, 'valid', use_bias=False) 62 | self.bn4 = layers.BatchNormalization() 63 | # 卷积层 64 | self.conv5 = layers.Conv2D(filter*16, 3, 1, 'valid', use_bias=False) 65 | self.bn5 = layers.BatchNormalization() 66 | # 全局池化层 67 | self.pool = layers.GlobalAveragePooling2D() 68 | # 特征打平 69 | self.flatten = layers.Flatten() 70 | # 2分类全连接层 71 | self.fc = layers.Dense(1) 72 | 73 | 74 | def call(self, inputs, training=None): 75 | # 卷积-BN-激活函数:(4, 31, 31, 64) 76 | x = tf.nn.leaky_relu(self.bn1(self.conv1(inputs), training=training)) 77 | # 卷积-BN-激活函数:(4, 14, 14, 128) 78 | x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training)) 79 | # 卷积-BN-激活函数:(4, 6, 6, 256) 80 | x = tf.nn.leaky_relu(self.bn3(self.conv3(x), training=training)) 81 | # 卷积-BN-激活函数:(4, 4, 4, 512) 82 | x = tf.nn.leaky_relu(self.bn4(self.conv4(x), training=training)) 83 | # 卷积-BN-激活函数:(4, 2, 2, 1024) 84 | x = tf.nn.leaky_relu(self.bn5(self.conv5(x), training=training)) 85 | # 卷积-BN-激活函数:(4, 1024) 86 | x = self.pool(x) 87 | # 打平 88 | x = self.flatten(x) 89 | # 输出,[b, 1024] => [b, 1] 90 | logits = self.fc(x) 91 | 92 | return logits 93 | 94 | def main(): 95 | 96 | d = Discriminator() 97 | g = Generator() 98 | 99 | 100 | x = tf.random.normal([2, 64, 64, 3]) 101 | z = tf.random.normal([2, 100]) 102 | 103 | prob = d(x) 104 | print(prob) 105 | x_hat = g(z) 106 | print(x_hat.shape) 107 | 108 | 109 | 110 | 111 | if __name__ == '__main__': 112 | main() -------------------------------------------------------------------------------- /ch13-生成对抗网络/gan_train.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import tensorflow as tf 4 | from tensorflow import keras 5 | from PIL import Image 6 | import glob 7 | from gan import Generator, Discriminator 8 | 9 | from dataset import make_anime_dataset 10 | 11 | 12 | def save_result(val_out, val_block_size, image_path, color_mode): 13 | def preprocess(img): 14 | img = ((img + 1.0) * 127.5).astype(np.uint8) 15 | # img = img.astype(np.uint8) 16 | return img 17 | 18 | preprocesed = preprocess(val_out) 19 | final_image = np.array([]) 20 | single_row = np.array([]) 21 | for b in range(val_out.shape[0]): 22 | # concat image into a 
row 23 | if single_row.size == 0: 24 | single_row = preprocesed[b, :, :, :] 25 | else: 26 | single_row = np.concatenate((single_row, preprocesed[b, :, :, :]), axis=1) 27 | 28 | # concat image row to final_image 29 | if (b+1) % val_block_size == 0: 30 | if final_image.size == 0: 31 | final_image = single_row 32 | else: 33 | final_image = np.concatenate((final_image, single_row), axis=0) 34 | 35 | # reset single row 36 | single_row = np.array([]) 37 | 38 | if final_image.shape[2] == 1: 39 | final_image = np.squeeze(final_image, axis=2) 40 | Image.fromarray(final_image).save(image_path) 41 | 42 | 43 | def celoss_ones(logits): 44 | # 计算输出与标签1之间的交叉熵 45 | y = tf.ones_like(logits) 46 | loss = keras.losses.binary_crossentropy(y, logits, from_logits=True) 47 | return tf.reduce_mean(loss) 48 | 49 | 50 | def celoss_zeros(logits): 51 | # 计算输出与标签0之间的交叉熵 52 | y = tf.zeros_like(logits) 53 | loss = keras.losses.binary_crossentropy(y, logits, from_logits=True) 54 | return tf.reduce_mean(loss) 55 | 56 | def d_loss_fn(generator, discriminator, batch_z, batch_x, is_training): 57 | # 计算判别器的误差函数 58 | # 采样生成图片 59 | fake_image = generator(batch_z, is_training) 60 | # 判定生成图片 61 | d_fake_logits = discriminator(fake_image, is_training) 62 | # 判定真实图片 63 | d_real_logits = discriminator(batch_x, is_training) 64 | # 真实图片与1之间的误差 65 | d_loss_real = celoss_ones(d_real_logits) 66 | # 生成图片与0之间的误差 67 | d_loss_fake = celoss_zeros(d_fake_logits) 68 | # 合并误差 69 | loss = d_loss_fake + d_loss_real 70 | 71 | return loss 72 | 73 | 74 | def g_loss_fn(generator, discriminator, batch_z, is_training): 75 | # 采样生成图片 76 | fake_image = generator(batch_z, is_training) 77 | # 在训练生成网络时,需要迫使生成图片判定为真 78 | d_fake_logits = discriminator(fake_image, is_training) 79 | # 计算生成图片与1之间的误差 80 | loss = celoss_ones(d_fake_logits) 81 | 82 | return loss 83 | 84 | def main(): 85 | 86 | tf.random.set_seed(3333) 87 | np.random.seed(3333) 88 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 89 | assert tf.__version__.startswith('2.') 90 | 91 | 92 | z_dim = 100 # 隐藏向量z的长度 93 | epochs = 3000000 # 训练步数 94 | batch_size = 64 # batch size 95 | learning_rate = 0.0002 96 | is_training = True 97 | 98 | # 获取数据集路径 99 | # C:\Users\z390\Downloads\anime-faces 100 | # r'C:\Users\z390\Downloads\faces\*.jpg' 101 | img_path = glob.glob(r'C:\Users\z390\Downloads\anime-faces\*\*.jpg') + \ 102 | glob.glob(r'C:\Users\z390\Downloads\anime-faces\*\*.png') 103 | # img_path = glob.glob(r'C:\Users\z390\Downloads\getchu_aligned_with_label\GetChu_aligned2\*.jpg') 104 | # img_path.extend(img_path2) 105 | print('images num:', len(img_path)) 106 | # 构建数据集对象 107 | dataset, img_shape, _ = make_anime_dataset(img_path, batch_size, resize=64) 108 | print(dataset, img_shape) 109 | sample = next(iter(dataset)) # 采样 110 | print(sample.shape, tf.reduce_max(sample).numpy(), 111 | tf.reduce_min(sample).numpy()) 112 | dataset = dataset.repeat(100) # 重复循环 113 | db_iter = iter(dataset) 114 | 115 | 116 | generator = Generator() # 创建生成器 117 | generator.build(input_shape = (4, z_dim)) 118 | discriminator = Discriminator() # 创建判别器 119 | discriminator.build(input_shape=(4, 64, 64, 3)) 120 | # 分别为生成器和判别器创建优化器 121 | g_optimizer = keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.5) 122 | d_optimizer = keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.5) 123 | 124 | generator.load_weights('generator.ckpt') # 断点续训;若无断点文件,首次训练请注释掉这两行 125 | discriminator.load_weights('discriminator.ckpt') 126 | print('Loaded ckpt!!') 127 | 128 | d_losses, g_losses = [],[] 129 | for epoch in range(epochs): # 训练epochs次 130 | # 1. 
训练判别器 131 | for _ in range(1): 132 | # 采样隐藏向量 133 | batch_z = tf.random.normal([batch_size, z_dim]) 134 | batch_x = next(db_iter) # 采样真实图片 135 | # 判别器前向计算 136 | with tf.GradientTape() as tape: 137 | d_loss = d_loss_fn(generator, discriminator, batch_z, batch_x, is_training) 138 | grads = tape.gradient(d_loss, discriminator.trainable_variables) 139 | d_optimizer.apply_gradients(zip(grads, discriminator.trainable_variables)) 140 | # 2. 训练生成器 141 | # 采样隐藏向量 142 | batch_z = tf.random.normal([batch_size, z_dim]) 143 | batch_x = next(db_iter) # 采样真实图片 144 | # 生成器前向计算 145 | with tf.GradientTape() as tape: 146 | g_loss = g_loss_fn(generator, discriminator, batch_z, is_training) 147 | grads = tape.gradient(g_loss, generator.trainable_variables) 148 | g_optimizer.apply_gradients(zip(grads, generator.trainable_variables)) 149 | 150 | if epoch % 100 == 0: 151 | print(epoch, 'd-loss:',float(d_loss), 'g-loss:', float(g_loss)) 152 | # 可视化 153 | z = tf.random.normal([100, z_dim]) 154 | fake_image = generator(z, training=False) 155 | img_path = os.path.join('gan_images', 'gan-%d.png'%epoch) 156 | save_result(fake_image.numpy(), 10, img_path, color_mode='P') 157 | 158 | d_losses.append(float(d_loss)) 159 | g_losses.append(float(g_loss)) 160 | 161 | if epoch % 10000 == 1: 162 | # print(d_losses) 163 | # print(g_losses) 164 | generator.save_weights('generator.ckpt') 165 | discriminator.save_weights('discriminator.ckpt') 166 | 167 | 168 | 169 | 170 | 171 | if __name__ == '__main__': 172 | main() -------------------------------------------------------------------------------- /ch13-生成对抗网络/wgan.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow import keras 3 | from tensorflow.keras import layers 4 | 5 | 6 | 7 | 8 | 9 | 10 | class Generator(keras.Model): 11 | 12 | def __init__(self): 13 | super(Generator, self).__init__() 14 | 15 | # z: [b, 100] => [b, 3*3*512] => [b, 3, 3, 512] => [b, 64, 64, 3] 16 | self.fc = layers.Dense(3*3*512) 17 | 18 | self.conv1 = layers.Conv2DTranspose(256, 3, 3, 'valid') 19 | self.bn1 = layers.BatchNormalization() 20 | 21 | self.conv2 = layers.Conv2DTranspose(128, 5, 2, 'valid') 22 | self.bn2 = layers.BatchNormalization() 23 | 24 | self.conv3 = layers.Conv2DTranspose(3, 4, 3, 'valid') 25 | 26 | def call(self, inputs, training=None): 27 | # [z, 100] => [z, 3*3*512] 28 | x = self.fc(inputs) 29 | x = tf.reshape(x, [-1, 3, 3, 512]) 30 | x = tf.nn.leaky_relu(x) 31 | 32 | # 33 | x = tf.nn.leaky_relu(self.bn1(self.conv1(x), training=training)) 34 | x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training)) 35 | x = self.conv3(x) 36 | x = tf.tanh(x) 37 | 38 | return x 39 | 40 | 41 | class Discriminator(keras.Model): 42 | 43 | def __init__(self): 44 | super(Discriminator, self).__init__() 45 | 46 | # [b, 64, 64, 3] => [b, 1] 47 | self.conv1 = layers.Conv2D(64, 5, 3, 'valid') 48 | 49 | self.conv2 = layers.Conv2D(128, 5, 3, 'valid') 50 | self.bn2 = layers.BatchNormalization() 51 | 52 | self.conv3 = layers.Conv2D(256, 5, 3, 'valid') 53 | self.bn3 = layers.BatchNormalization() 54 | 55 | # [b, h, w ,c] => [b, -1] 56 | self.flatten = layers.Flatten() 57 | self.fc = layers.Dense(1) 58 | 59 | 60 | def call(self, inputs, training=None): 61 | 62 | x = tf.nn.leaky_relu(self.conv1(inputs)) 63 | x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training)) 64 | x = tf.nn.leaky_relu(self.bn3(self.conv3(x), training=training)) 65 | 66 | # [b, h, w, c] => [b, -1] 67 | x = self.flatten(x) 68 | # [b, -1] => [b, 1] 69 | 
logits = self.fc(x) 70 | 71 | return logits 72 | 73 | def main(): 74 | 75 | d = Discriminator() 76 | g = Generator() 77 | 78 | 79 | x = tf.random.normal([2, 64, 64, 3]) 80 | z = tf.random.normal([2, 100]) 81 | 82 | prob = d(x) 83 | print(prob) 84 | x_hat = g(z) 85 | print(x_hat.shape) 86 | 87 | 88 | 89 | 90 | if __name__ == '__main__': 91 | main() -------------------------------------------------------------------------------- /ch13-生成对抗网络/wgan_train.py: -------------------------------------------------------------------------------- 1 | import os 2 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 3 | import numpy as np 4 | import tensorflow as tf 5 | from tensorflow import keras 6 | 7 | from PIL import Image 8 | import glob 9 | from gan import Generator, Discriminator 10 | 11 | from dataset import make_anime_dataset 12 | 13 | 14 | def save_result(val_out, val_block_size, image_path, color_mode): 15 | def preprocess(img): 16 | img = ((img + 1.0) * 127.5).astype(np.uint8) 17 | # img = img.astype(np.uint8) 18 | return img 19 | 20 | preprocesed = preprocess(val_out) 21 | final_image = np.array([]) 22 | single_row = np.array([]) 23 | for b in range(val_out.shape[0]): 24 | # concat image into a row 25 | if single_row.size == 0: 26 | single_row = preprocesed[b, :, :, :] 27 | else: 28 | single_row = np.concatenate((single_row, preprocesed[b, :, :, :]), axis=1) 29 | 30 | # concat image row to final_image 31 | if (b+1) % val_block_size == 0: 32 | if final_image.size == 0: 33 | final_image = single_row 34 | else: 35 | final_image = np.concatenate((final_image, single_row), axis=0) 36 | 37 | # reset single row 38 | single_row = np.array([]) 39 | 40 | if final_image.shape[2] == 1: 41 | final_image = np.squeeze(final_image, axis=2) 42 | Image.fromarray(final_image).save(image_path) 43 | 44 | 45 | def celoss_ones(logits): 46 | # [b, 1] 47 | # [b] = [1, 1, 1, 1,] 48 | # loss = tf.keras.losses.categorical_crossentropy(y_pred=logits, 49 | # y_true=tf.ones_like(logits)) 50 | return - tf.reduce_mean(logits) 51 | 52 | 53 | def celoss_zeros(logits): 54 | # [b, 1] 55 | # [b] = [1, 1, 1, 1,] 56 | # loss = tf.keras.losses.categorical_crossentropy(y_pred=logits, 57 | # y_true=tf.zeros_like(logits)) 58 | return tf.reduce_mean(logits) 59 | 60 | 61 | def gradient_penalty(discriminator, batch_x, fake_image): 62 | 63 | batchsz = batch_x.shape[0] 64 | 65 | # [b, h, w, c] 66 | t = tf.random.uniform([batchsz, 1, 1, 1]) 67 | # [b, 1, 1, 1] => [b, h, w, c] 68 | t = tf.broadcast_to(t, batch_x.shape) 69 | 70 | interplate = t * batch_x + (1 - t) * fake_image 71 | 72 | with tf.GradientTape() as tape: 73 | tape.watch([interplate]) 74 | d_interplote_logits = discriminator(interplate, training=True) 75 | grads = tape.gradient(d_interplote_logits, interplate) 76 | 77 | # grads:[b, h, w, c] => [b, -1] 78 | grads = tf.reshape(grads, [grads.shape[0], -1]) 79 | gp = tf.norm(grads, axis=1) #[b] 80 | gp = tf.reduce_mean( (gp-1)**2 ) 81 | 82 | return gp 83 | 84 | 85 | 86 | def d_loss_fn(generator, discriminator, batch_z, batch_x, is_training): 87 | # 1. treat real image as real 88 | # 2. treat generated image as fake 89 | fake_image = generator(batch_z, is_training) 90 | d_fake_logits = discriminator(fake_image, is_training) 91 | d_real_logits = discriminator(batch_x, is_training) 92 | 93 | d_loss_real = celoss_ones(d_real_logits) 94 | d_loss_fake = celoss_zeros(d_fake_logits) 95 | gp = gradient_penalty(discriminator, batch_x, fake_image) 96 | 97 | loss = d_loss_real + d_loss_fake + 10. 
* gp 98 | 99 | return loss, gp 100 | 101 | 102 | def g_loss_fn(generator, discriminator, batch_z, is_training): 103 | 104 | fake_image = generator(batch_z, is_training) 105 | d_fake_logits = discriminator(fake_image, is_training) 106 | loss = celoss_ones(d_fake_logits) 107 | 108 | return loss 109 | 110 | 111 | def main(): 112 | 113 | tf.random.set_seed(233) 114 | np.random.seed(233) 115 | assert tf.__version__.startswith('2.') 116 | 117 | 118 | # hyper parameters 119 | z_dim = 100 120 | epochs = 3000000 121 | batch_size = 512 122 | learning_rate = 0.0005 123 | is_training = True 124 | 125 | 126 | img_path = glob.glob(r'C:\Users\Jackie\Downloads\faces\*.jpg') 127 | assert len(img_path) > 0 128 | 129 | 130 | dataset, img_shape, _ = make_anime_dataset(img_path, batch_size) 131 | print(dataset, img_shape) 132 | sample = next(iter(dataset)) 133 | print(sample.shape, tf.reduce_max(sample).numpy(), 134 | tf.reduce_min(sample).numpy()) 135 | dataset = dataset.repeat() 136 | db_iter = iter(dataset) 137 | 138 | 139 | generator = Generator() 140 | generator.build(input_shape = (None, z_dim)) 141 | discriminator = Discriminator() 142 | discriminator.build(input_shape=(None, 64, 64, 3)) 143 | z_sample = tf.random.normal([100, z_dim]) 144 | 145 | 146 | g_optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.5) 147 | d_optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.5) 148 | 149 | 150 | for epoch in range(epochs): 151 | 152 | for _ in range(5): 153 | batch_z = tf.random.normal([batch_size, z_dim]) 154 | batch_x = next(db_iter) 155 | 156 | # train D 157 | with tf.GradientTape() as tape: 158 | d_loss, gp = d_loss_fn(generator, discriminator, batch_z, batch_x, is_training) 159 | grads = tape.gradient(d_loss, discriminator.trainable_variables) 160 | d_optimizer.apply_gradients(zip(grads, discriminator.trainable_variables)) 161 | 162 | batch_z = tf.random.normal([batch_size, z_dim]) 163 | 164 | with tf.GradientTape() as tape: 165 | g_loss = g_loss_fn(generator, discriminator, batch_z, is_training) 166 | grads = tape.gradient(g_loss, generator.trainable_variables) 167 | g_optimizer.apply_gradients(zip(grads, generator.trainable_variables)) 168 | 169 | if epoch % 100 == 0: 170 | print(epoch, 'd-loss:',float(d_loss), 'g-loss:', float(g_loss), 171 | 'gp:', float(gp)) 172 | 173 | z = tf.random.normal([100, z_dim]) 174 | fake_image = generator(z, training=False) 175 | img_path = os.path.join('images', 'wgan-%d.png'%epoch) 176 | save_result(fake_image.numpy(), 10, img_path, color_mode='P') 177 | 178 | 179 | 180 | if __name__ == '__main__': 181 | main() -------------------------------------------------------------------------------- /ch14-强化学习/REINFORCE_tf.py: -------------------------------------------------------------------------------- 1 | import gym,os 2 | import numpy as np 3 | import matplotlib 4 | from matplotlib import pyplot as plt 5 | # Default parameters for plots 6 | matplotlib.rcParams['font.size'] = 18 7 | matplotlib.rcParams['figure.titlesize'] = 18 8 | matplotlib.rcParams['figure.figsize'] = [9, 7] 9 | matplotlib.rcParams['font.family'] = ['KaiTi'] 10 | matplotlib.rcParams['axes.unicode_minus']=False 11 | 12 | import tensorflow as tf 13 | from tensorflow import keras 14 | from tensorflow.keras import layers,optimizers,losses 15 | from PIL import Image 16 | env = gym.make('CartPole-v1') # 创建游戏环境 17 | env.seed(2333) 18 | tf.random.set_seed(2333) 19 | np.random.seed(2333) 20 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 21 | assert 
tf.__version__.startswith('2.') 22 | 23 | learning_rate = 0.0002 24 | gamma = 0.98 25 | 26 | class Policy(keras.Model): 27 | # 策略网络,生成动作的概率分布 28 | def __init__(self): 29 | super(Policy, self).__init__() 30 | self.data = [] # 存储轨迹 31 | # 输入为长度为4的向量,输出为左、右2个动作 32 | self.fc1 = layers.Dense(128, kernel_initializer='he_normal') 33 | self.fc2 = layers.Dense(2, kernel_initializer='he_normal') 34 | # 网络优化器 35 | self.optimizer = optimizers.Adam(lr=learning_rate) 36 | 37 | def call(self, inputs, training=None): 38 | # 状态输入s的shape为向量:[4] 39 | x = tf.nn.relu(self.fc1(inputs)) 40 | x = tf.nn.softmax(self.fc2(x), axis=1) 41 | return x 42 | 43 | def put_data(self, item): 44 | # 记录r,log_P(a|s) 45 | self.data.append(item) 46 | 47 | def train_net(self, tape): 48 | # 计算梯度并更新策略网络参数。tape为梯度记录器 49 | R = 0 # 终结状态的初始回报为0 50 | for r, log_prob in self.data[::-1]:#逆序取 51 | R = r + gamma * R # 计算每个时间戳上的回报 52 | # 每个时间戳都计算一次梯度 53 | # grad_R=-log_P*R*grad_theta 54 | loss = -log_prob * R 55 | with tape.stop_recording(): 56 | # 优化策略网络 57 | grads = tape.gradient(loss, self.trainable_variables) 58 | # print(grads) 59 | self.optimizer.apply_gradients(zip(grads, self.trainable_variables)) 60 | self.data = [] # 清空轨迹 61 | 62 | def main(): 63 | pi = Policy() # 创建策略网络 64 | pi(tf.random.normal((4,4))) 65 | pi.summary() 66 | score = 0.0 # 计分 67 | print_interval = 20 # 打印间隔 68 | returns = [] 69 | 70 | for n_epi in range(400): 71 | s = env.reset() # 回到游戏初始状态,返回s0 72 | with tf.GradientTape(persistent=True) as tape: 73 | for t in range(501): # CartPole-v1 forced to terminates at 500 step. 74 | # 送入状态向量,获取策略 75 | s = tf.constant(s,dtype=tf.float32) 76 | # s: [4] => [1,4] 77 | s = tf.expand_dims(s, axis=0) 78 | prob = pi(s) # 动作分布:[1,2] 79 | # 从类别分布中采样1个动作, shape: [1] 80 | a = tf.random.categorical(tf.math.log(prob), 1)[0] 81 | a = int(a) # Tensor转数字 82 | s_prime, r, done, info = env.step(a) 83 | # 记录动作a和动作产生的奖励r 84 | # prob shape:[1,2] 85 | pi.put_data((r, tf.math.log(prob[0][a]))) 86 | s = s_prime # 刷新状态 87 | score += r # 累积奖励 88 | 89 | if n_epi >1000: 90 | env.render() 91 | # im = Image.fromarray(s) 92 | # im.save("res/%d.jpg" % info['frames'][0]) 93 | 94 | if done: # 当前episode终止 95 | break 96 | # episode终止后,训练一次网络 97 | pi.train_net(tape) 98 | del tape 99 | 100 | if n_epi%print_interval==0 and n_epi!=0: 101 | returns.append(score/print_interval) 102 | print(f"# of episode :{n_epi}, avg score : {score/print_interval}") 103 | score = 0.0 104 | env.close() # 关闭环境 105 | 106 | plt.plot(np.arange(len(returns))*print_interval, returns) 107 | plt.plot(np.arange(len(returns))*print_interval, returns, 's') 108 | plt.xlabel('回合数') 109 | plt.ylabel('总回报') 110 | plt.savefig('reinforce-tf-cartpole.svg') 111 | 112 | if __name__ == '__main__': 113 | main() -------------------------------------------------------------------------------- /ch14-强化学习/dqn_tf.py: -------------------------------------------------------------------------------- 1 | import collections 2 | import random 3 | import gym,os 4 | import numpy as np 5 | import tensorflow as tf 6 | from tensorflow import keras 7 | from tensorflow.keras import layers,optimizers,losses 8 | 9 | env = gym.make('CartPole-v1') # 创建游戏环境 10 | env.seed(1234) 11 | tf.random.set_seed(1234) 12 | np.random.seed(1234) 13 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 14 | assert tf.__version__.startswith('2.') 15 | 16 | # Hyperparameters 17 | learning_rate = 0.0002 18 | gamma = 0.99 19 | buffer_limit = 50000 20 | batch_size = 32 21 | 22 | 23 | class ReplayBuffer(): 24 | # 经验回放池 25 | def __init__(self): 26 | # 双向队列 27 
| self.buffer = collections.deque(maxlen=buffer_limit) 28 | 29 | def put(self, transition): 30 | self.buffer.append(transition) 31 | 32 | def sample(self, n): 33 | # 从回放池采样n个5元组 34 | mini_batch = random.sample(self.buffer, n) 35 | s_lst, a_lst, r_lst, s_prime_lst, done_mask_lst = [], [], [], [], [] 36 | # 按类别进行整理 37 | for transition in mini_batch: 38 | s, a, r, s_prime, done_mask = transition 39 | s_lst.append(s) 40 | a_lst.append([a]) 41 | r_lst.append([r]) 42 | s_prime_lst.append(s_prime) 43 | done_mask_lst.append([done_mask]) 44 | # 转换成Tensor 45 | return tf.constant(s_lst, dtype=tf.float32),\ 46 | tf.constant(a_lst, dtype=tf.int32), \ 47 | tf.constant(r_lst, dtype=tf.float32), \ 48 | tf.constant(s_prime_lst, dtype=tf.float32), \ 49 | tf.constant(done_mask_lst, dtype=tf.float32) 50 | 51 | 52 | def size(self): 53 | return len(self.buffer) 54 | 55 | 56 | class Qnet(keras.Model): 57 | def __init__(self): 58 | # 创建Q网络,输入为状态向量,输出为动作的Q值 59 | super(Qnet, self).__init__() 60 | self.fc1 = layers.Dense(256, kernel_initializer='he_normal') 61 | self.fc2 = layers.Dense(256, kernel_initializer='he_normal') 62 | self.fc3 = layers.Dense(2, kernel_initializer='he_normal') 63 | 64 | def call(self, x, training=None): 65 | x = tf.nn.relu(self.fc1(x)) 66 | x = tf.nn.relu(self.fc2(x)) 67 | x = self.fc3(x) 68 | return x 69 | 70 | def sample_action(self, s, epsilon): 71 | # 送入状态向量,获取策略: [4] 72 | s = tf.constant(s, dtype=tf.float32) 73 | # s: [4] => [1,4] 74 | s = tf.expand_dims(s, axis=0) 75 | out = self(s)[0] 76 | coin = random.random() 77 | # 策略改进:e-贪心方式 78 | if coin < epsilon: 79 | # epsilon大的概率随机选取 80 | return random.randint(0, 1) 81 | else: # 选择Q值最大的动作 82 | return int(tf.argmax(out)) 83 | 84 | 85 | def train(q, q_target, memory, optimizer): 86 | # 通过Q网络和影子网络来构造贝尔曼方程的误差, 87 | # 并只更新Q网络,影子网络的更新会滞后Q网络 88 | huber = losses.Huber() 89 | for i in range(10): # 训练10次 90 | # 从缓冲池采样 91 | s, a, r, s_prime, done_mask = memory.sample(batch_size) 92 | with tf.GradientTape() as tape: 93 | # s: [b, 4] 94 | q_out = q(s) # 得到Q(s,a)的分布 95 | # 由于TF的gather_nd与pytorch的gather功能不一样,需要构造 96 | # gather_nd需要的坐标参数,indices:[b, 2] 97 | # pi_a = pi.gather(1, a) # pytorch只需要一行即可实现 98 | indices = tf.expand_dims(tf.range(a.shape[0]), axis=1) 99 | indices = tf.concat([indices, a], axis=1) 100 | q_a = tf.gather_nd(q_out, indices) # 动作的概率值, [b] 101 | q_a = tf.expand_dims(q_a, axis=1) # [b]=> [b,1] 102 | # 得到Q(s',a)的最大值,它来自影子网络! 
[b,4]=>[b,2]=>[b,1] 103 | max_q_prime = tf.reduce_max(q_target(s_prime),axis=1,keepdims=True) 104 | # 构造Q(s,a_t)的目标值,来自贝尔曼方程 105 | target = r + gamma * max_q_prime * done_mask 106 | # 计算Q(s,a_t)与目标值的误差 107 | loss = huber(q_a, target) 108 | # 更新网络,使得Q(s,a_t)估计符合贝尔曼方程 109 | grads = tape.gradient(loss, q.trainable_variables) 110 | # for p in grads: 111 | # print(tf.norm(p)) 112 | # print(grads) 113 | optimizer.apply_gradients(zip(grads, q.trainable_variables)) 114 | 115 | 116 | def main(): 117 | env = gym.make('CartPole-v1') # 创建环境 118 | q = Qnet() # 创建Q网络 119 | q_target = Qnet() # 创建影子网络 120 | q.build(input_shape=(2,4)) 121 | q_target.build(input_shape=(2,4)) 122 | for src, dest in zip(q.variables, q_target.variables): 123 | dest.assign(src) # 影子网络权值来自Q 124 | memory = ReplayBuffer() # 创建回放池 125 | 126 | print_interval = 20 127 | score = 0.0 128 | optimizer = optimizers.Adam(lr=learning_rate) 129 | 130 | for n_epi in range(10000): # 训练次数 131 | # epsilon概率也会8%到1%衰减,越到后面越使用Q值最大的动作 132 | epsilon = max(0.01, 0.08 - 0.01 * (n_epi / 200)) 133 | s = env.reset() # 复位环境 134 | for t in range(600): # 一个回合最大时间戳 135 | # if n_epi>1000: 136 | # env.render() 137 | # 根据当前Q网络提取策略,并改进策略 138 | a = q.sample_action(s, epsilon) 139 | # 使用改进的策略与环境交互 140 | s_prime, r, done, info = env.step(a) 141 | done_mask = 0.0 if done else 1.0 # 结束标志掩码 142 | # 保存5元组 143 | memory.put((s, a, r / 100.0, s_prime, done_mask)) 144 | s = s_prime # 刷新状态 145 | score += r # 记录总回报 146 | if done: # 回合结束 147 | break 148 | 149 | if memory.size() > 2000: # 缓冲池只有大于2000就可以训练 150 | train(q, q_target, memory, optimizer) 151 | 152 | if n_epi % print_interval == 0 and n_epi != 0: 153 | for src, dest in zip(q.variables, q_target.variables): 154 | dest.assign(src) # 影子网络权值来自Q 155 | print("# of episode :{}, avg score : {:.1f}, buffer size : {}, " \ 156 | "epsilon : {:.1f}%" \ 157 | .format(n_epi, score / print_interval, memory.size(), epsilon * 100)) 158 | score = 0.0 159 | env.close() 160 | 161 | 162 | if __name__ == '__main__': 163 | main() -------------------------------------------------------------------------------- /ch14-强化学习/ppo_tf_cartpole.py: -------------------------------------------------------------------------------- 1 | import matplotlib 2 | from matplotlib import pyplot as plt 3 | matplotlib.rcParams['font.size'] = 18 4 | matplotlib.rcParams['figure.titlesize'] = 18 5 | matplotlib.rcParams['figure.figsize'] = [9, 7] 6 | matplotlib.rcParams['font.family'] = ['KaiTi'] 7 | matplotlib.rcParams['axes.unicode_minus']=False 8 | 9 | plt.figure() 10 | 11 | import gym,os 12 | import numpy as np 13 | import tensorflow as tf 14 | from tensorflow import keras 15 | from tensorflow.keras import layers,optimizers,losses 16 | from collections import namedtuple 17 | from torch.utils.data import SubsetRandomSampler,BatchSampler 18 | 19 | env = gym.make('CartPole-v1') # 创建游戏环境 20 | env.seed(2222) 21 | tf.random.set_seed(2222) 22 | np.random.seed(2222) 23 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 24 | assert tf.__version__.startswith('2.') 25 | 26 | 27 | 28 | gamma = 0.98 # 激励衰减因子 29 | epsilon = 0.2 # PPO误差超参数0.8~1.2 30 | batch_size = 32 # batch size 31 | 32 | 33 | # 创建游戏环境 34 | env = gym.make('CartPole-v0').unwrapped 35 | Transition = namedtuple('Transition', ['state', 'action', 'a_log_prob', 'reward', 'next_state']) 36 | 37 | 38 | class Actor(keras.Model): 39 | def __init__(self): 40 | super(Actor, self).__init__() 41 | # 策略网络,也叫Actor网络,输出为概率分布pi(a|s) 42 | self.fc1 = layers.Dense(100, kernel_initializer='he_normal') 43 | self.fc2 = layers.Dense(2, 
kernel_initializer='he_normal') 44 | 45 | def call(self, inputs): 46 | x = tf.nn.relu(self.fc1(inputs)) 47 | x = self.fc2(x) 48 | x = tf.nn.softmax(x, axis=1) # 转换成概率 49 | return x 50 | 51 | class Critic(keras.Model): 52 | def __init__(self): 53 | super(Critic, self).__init__() 54 | # 偏置b的估值网络,也叫Critic网络,输出为v(s) 55 | self.fc1 = layers.Dense(100, kernel_initializer='he_normal') 56 | self.fc2 = layers.Dense(1, kernel_initializer='he_normal') 57 | 58 | def call(self, inputs): 59 | x = tf.nn.relu(self.fc1(inputs)) 60 | x = self.fc2(x) 61 | return x 62 | 63 | 64 | 65 | 66 | class PPO(): 67 | # PPO算法主体 68 | def __init__(self): 69 | super(PPO, self).__init__() 70 | self.actor = Actor() # 创建Actor网络 71 | self.critic = Critic() # 创建Critic网络 72 | self.buffer = [] # 数据缓冲池 73 | self.actor_optimizer = optimizers.Adam(1e-3) # Actor优化器 74 | self.critic_optimizer = optimizers.Adam(3e-3) # Critic优化器 75 | 76 | def select_action(self, s): 77 | # 送入状态向量,获取策略: [4] 78 | s = tf.constant(s, dtype=tf.float32) 79 | # s: [4] => [1,4] 80 | s = tf.expand_dims(s, axis=0) 81 | # 获取策略分布: [1, 2] 82 | prob = self.actor(s) 83 | # 从类别分布中采样1个动作, shape: [1] 84 | a = tf.random.categorical(tf.math.log(prob), 1)[0] 85 | a = int(a) # Tensor转数字 86 | return a, float(prob[0][a]) # 返回动作及其概率 87 | 88 | def get_value(self, s): 89 | # 送入状态向量,获取策略: [4] 90 | s = tf.constant(s, dtype=tf.float32) 91 | # s: [4] => [1,4] 92 | s = tf.expand_dims(s, axis=0) 93 | # 获取策略分布: [1, 2] 94 | v = self.critic(s)[0] 95 | return float(v) # 返回v(s) 96 | 97 | def store_transition(self, transition): 98 | # 存储采样数据 99 | self.buffer.append(transition) 100 | 101 | def optimize(self): 102 | # 优化网络主函数 103 | # 从缓存中取出样本数据,转换成Tensor 104 | state = tf.constant([t.state for t in self.buffer], dtype=tf.float32) 105 | action = tf.constant([t.action for t in self.buffer], dtype=tf.int32) 106 | action = tf.reshape(action,[-1,1]) 107 | reward = [t.reward for t in self.buffer] 108 | old_action_log_prob = tf.constant([t.a_log_prob for t in self.buffer], dtype=tf.float32) 109 | old_action_log_prob = tf.reshape(old_action_log_prob, [-1,1]) 110 | # 通过MC方法循环计算R(st) 111 | R = 0 112 | Rs = [] 113 | for r in reward[::-1]: 114 | R = r + gamma * R 115 | Rs.insert(0, R) 116 | Rs = tf.constant(Rs, dtype=tf.float32) 117 | # 对缓冲池数据大致迭代10遍 118 | for _ in range(round(10*len(self.buffer)/batch_size)): 119 | # 随机从缓冲池采样batch size大小样本 120 | index = np.random.choice(np.arange(len(self.buffer)), batch_size, replace=False) 121 | # 构建梯度跟踪环境 122 | with tf.GradientTape() as tape1, tf.GradientTape() as tape2: 123 | # 取出R(st),[b,1] 124 | v_target = tf.expand_dims(tf.gather(Rs, index, axis=0), axis=1) 125 | # 计算v(s)预测值,也就是偏置b,我们后面会介绍为什么写成v 126 | v = self.critic(tf.gather(state, index, axis=0)) 127 | delta = v_target - v # 计算优势值 128 | advantage = tf.stop_gradient(delta) # 断开梯度连接 129 | # 由于TF的gather_nd与pytorch的gather功能不一样,需要构造 130 | # gather_nd需要的坐标参数,indices:[b, 2] 131 | # pi_a = pi.gather(1, a) # pytorch只需要一行即可实现 132 | a = tf.gather(action, index, axis=0) # 取出batch的动作at 133 | # batch的动作分布pi(a|st) 134 | pi = self.actor(tf.gather(state, index, axis=0)) 135 | indices = tf.expand_dims(tf.range(a.shape[0]), axis=1) 136 | indices = tf.concat([indices, a], axis=1) 137 | pi_a = tf.gather_nd(pi, indices) # 动作的概率值pi(at|st), [b] 138 | pi_a = tf.expand_dims(pi_a, axis=1) # [b]=> [b,1] 139 | # 重要性采样 140 | ratio = (pi_a / tf.gather(old_action_log_prob, index, axis=0)) 141 | surr1 = ratio * advantage 142 | surr2 = tf.clip_by_value(ratio, 1 - epsilon, 1 + epsilon) * advantage 143 | # PPO误差函数 144 | policy_loss = 
-tf.reduce_mean(tf.minimum(surr1, surr2)) 145 | # 对于偏置v来说,希望与MC估计的R(st)越接近越好 146 | value_loss = losses.MSE(v_target, v) 147 | # 优化策略网络 148 | grads = tape1.gradient(policy_loss, self.actor.trainable_variables) 149 | self.actor_optimizer.apply_gradients(zip(grads, self.actor.trainable_variables)) 150 | # 优化偏置值网络 151 | grads = tape2.gradient(value_loss, self.critic.trainable_variables) 152 | self.critic_optimizer.apply_gradients(zip(grads, self.critic.trainable_variables)) 153 | 154 | self.buffer = [] # 清空已训练数据 155 | 156 | 157 | def main(): 158 | agent = PPO() 159 | returns = [] # 统计总回报 160 | total = 0 # 一段时间内平均回报 161 | for i_epoch in range(500): # 训练回合数 162 | state = env.reset() # 复位环境 163 | for t in range(500): # 最多考虑500步 164 | # 通过最新策略与环境交互 165 | action, action_prob = agent.select_action(state) 166 | next_state, reward, done, _ = env.step(action) 167 | # 构建样本并存储 168 | trans = Transition(state, action, action_prob, reward, next_state) 169 | agent.store_transition(trans) 170 | state = next_state # 刷新状态 171 | total += reward # 累积激励 172 | if done: # 合适的时间点训练网络 173 | if len(agent.buffer) >= batch_size: 174 | agent.optimize() # 训练网络 175 | break 176 | 177 | if i_epoch % 20 == 0: # 每20个回合统计一次平均回报 178 | returns.append(total/20) 179 | total = 0 180 | print(i_epoch, returns[-1]) 181 | 182 | print(np.array(returns)) 183 | plt.figure() 184 | plt.plot(np.arange(len(returns))*20, np.array(returns)) 185 | plt.plot(np.arange(len(returns))*20, np.array(returns), 's') 186 | plt.xlabel('回合数') 187 | plt.ylabel('总回报') 188 | plt.savefig('ppo-tf-cartpole.svg') 189 | 190 | 191 | if __name__ == '__main__': 192 | main() 193 | print("end") -------------------------------------------------------------------------------- /ch15-自定义数据集/pokemon.py: -------------------------------------------------------------------------------- 1 | import os, glob 2 | import random, csv 3 | import tensorflow as tf 4 | 5 | 6 | 7 | def load_csv(root, filename, name2label): 8 | # 从csv文件返回images,labels列表 9 | # root:数据集根目录,filename:csv文件名, name2label:类别名编码表 10 | if not os.path.exists(os.path.join(root, filename)): 11 | # 如果csv文件不存在,则创建 12 | images = [] 13 | for name in name2label.keys(): # 遍历所有子目录,获得所有的图片 14 | # 只考虑后缀为png,jpg,jpeg的图片:'pokemon\\mewtwo\\00001.png 15 | images += glob.glob(os.path.join(root, name, '*.png')) 16 | images += glob.glob(os.path.join(root, name, '*.jpg')) 17 | images += glob.glob(os.path.join(root, name, '*.jpeg')) 18 | # 打印数据集信息:1167, 'pokemon\\bulbasaur\\00000000.png' 19 | print(len(images), images) 20 | random.shuffle(images) # 随机打散顺序 21 | # 创建csv文件,并存储图片路径及其label信息 22 | with open(os.path.join(root, filename), mode='w', newline='') as f: 23 | writer = csv.writer(f) 24 | for img in images: # 'pokemon\\bulbasaur\\00000000.png' 25 | name = img.split(os.sep)[-2] 26 | label = name2label[name] 27 | # 'pokemon\\bulbasaur\\00000000.png', 0 28 | writer.writerow([img, label]) 29 | print('written into csv file:', filename) 30 | 31 | # 此时已经有csv文件,直接读取 32 | images, labels = [], [] 33 | with open(os.path.join(root, filename)) as f: 34 | reader = csv.reader(f) 35 | for row in reader: 36 | # 'pokemon\\bulbasaur\\00000000.png', 0 37 | img, label = row 38 | label = int(label) 39 | images.append(img) 40 | labels.append(label) 41 | # 返回图片路径list和标签list 42 | return images, labels 43 | 44 | 45 | def load_pokemon(root, mode='train'): 46 | # 创建数字编码表 47 | name2label = {} # "sq...":0 48 | # 遍历根目录下的子文件夹,并排序,保证映射关系固定 49 | for name in sorted(os.listdir(os.path.join(root))): 50 | # 跳过非文件夹 51 | if not os.path.isdir(os.path.join(root, name)): 52 | 
continue 53 | # 给每个类别编码一个数字 54 | name2label[name] = len(name2label.keys()) 55 | 56 | # 读取Label信息 57 | # [file1,file2,], [3,1] 58 | images, labels = load_csv(root, 'images.csv', name2label) 59 | 60 | if mode == 'train': # 60% 61 | images = images[:int(0.6 * len(images))] 62 | labels = labels[:int(0.6 * len(labels))] 63 | elif mode == 'val': # 20% = 60%->80% 64 | images = images[int(0.6 * len(images)):int(0.8 * len(images))] 65 | labels = labels[int(0.6 * len(labels)):int(0.8 * len(labels))] 66 | else: # 20% = 80%->100% 67 | images = images[int(0.8 * len(images)):] 68 | labels = labels[int(0.8 * len(labels)):] 69 | 70 | return images, labels, name2label 71 | 72 | # 这里的mean和std根据真实的数据计算获得,比如ImageNet 73 | img_mean = tf.constant([0.485, 0.456, 0.406]) 74 | img_std = tf.constant([0.229, 0.224, 0.225]) 75 | def normalize(x, mean=img_mean, std=img_std): 76 | # 标准化 77 | # x: [224, 224, 3] 78 | # mean: [224, 224, 3], std: [3] 79 | x = (x - mean)/std 80 | return x 81 | 82 | def denormalize(x, mean=img_mean, std=img_std): 83 | # 标准化的逆过程 84 | x = x * std + mean 85 | return x 86 | 87 | def preprocess(x,y): 88 | # x: 图片的路径List,y:图片的数字编码List 89 | x = tf.io.read_file(x) # 根据路径读取图片 90 | x = tf.image.decode_jpeg(x, channels=3) # 图片解码 91 | x = tf.image.resize(x, [244, 244]) # 图片缩放 92 | 93 | # 数据增强 94 | # x = tf.image.random_flip_up_down(x) 95 | x= tf.image.random_flip_left_right(x) # 左右镜像 96 | x = tf.image.random_crop(x, [224, 224, 3]) # 随机裁剪 97 | # 转换成张量 98 | # x: [0,255]=> 0~1 99 | x = tf.cast(x, dtype=tf.float32) / 255. 100 | # 0~1 => D(0,1) 101 | x = normalize(x) # 标准化 102 | y = tf.convert_to_tensor(y) # 转换成张量 103 | 104 | return x, y 105 | 106 | 107 | def main(): 108 | import time 109 | 110 | 111 | 112 | # 加载pokemon数据集,指定加载训练集 113 | images, labels, table = load_pokemon('pokemon', 'train') 114 | print('images:', len(images), images) 115 | print('labels:', len(labels), labels) 116 | print('table:', table) 117 | 118 | # images: string path 119 | # labels: number 120 | db = tf.data.Dataset.from_tensor_slices((images, labels)) 121 | db = db.shuffle(1000).map(preprocess).batch(32) 122 | 123 | # 创建TensorBoard对象 124 | writter = tf.summary.create_file_writer('logs') 125 | for step, (x,y) in enumerate(db): 126 | # x: [32, 224, 224, 3] 127 | # y: [32] 128 | with writter.as_default(): 129 | x = denormalize(x) # 反向normalize,方便可视化 130 | # 写入图片数据 131 | tf.summary.image('img',x,step=step,max_outputs=9) 132 | time.sleep(5) 133 | 134 | 135 | 136 | 137 | if __name__ == '__main__': 138 | main() -------------------------------------------------------------------------------- /ch15-自定义数据集/resnet.py: -------------------------------------------------------------------------------- 1 | import os 2 | import tensorflow as tf 3 | import numpy as np 4 | from tensorflow import keras 5 | from tensorflow.keras import layers 6 | 7 | 8 | 9 | tf.random.set_seed(22) 10 | np.random.seed(22) 11 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 12 | assert tf.__version__.startswith('2.') 13 | 14 | 15 | 16 | class ResnetBlock(keras.Model): 17 | 18 | def __init__(self, channels, strides=1): 19 | super(ResnetBlock, self).__init__() 20 | 21 | self.channels = channels 22 | self.strides = strides 23 | 24 | self.conv1 = layers.Conv2D(channels, 3, strides=strides, 25 | padding=[[0,0],[1,1],[1,1],[0,0]]) 26 | self.bn1 = keras.layers.BatchNormalization() 27 | self.conv2 = layers.Conv2D(channels, 3, strides=1, 28 | padding=[[0,0],[1,1],[1,1],[0,0]]) 29 | self.bn2 = keras.layers.BatchNormalization() 30 | 31 | if strides!=1: 32 | self.down_conv = 
layers.Conv2D(channels, 1, strides=strides, padding='valid') 33 | self.down_bn = tf.keras.layers.BatchNormalization() 34 | 35 | def call(self, inputs, training=None): 36 | residual = inputs 37 | 38 | x = self.conv1(inputs) 39 | x = tf.nn.relu(x) 40 | x = self.bn1(x, training=training) 41 | x = self.conv2(x) 42 | x = tf.nn.relu(x) 43 | x = self.bn2(x, training=training) 44 | 45 | # 残差连接 46 | if self.strides!=1: 47 | residual = self.down_conv(inputs) 48 | residual = tf.nn.relu(residual) 49 | residual = self.down_bn(residual, training=training) 50 | 51 | x = x + residual 52 | x = tf.nn.relu(x) 53 | return x 54 | 55 | 56 | class ResNet(keras.Model): 57 | 58 | def __init__(self, num_classes, initial_filters=16, **kwargs): 59 | super(ResNet, self).__init__(**kwargs) 60 | 61 | self.stem = layers.Conv2D(initial_filters, 3, strides=3, padding='valid') 62 | 63 | self.blocks = keras.models.Sequential([ 64 | ResnetBlock(initial_filters * 2, strides=3), 65 | ResnetBlock(initial_filters * 2, strides=1), 66 | # layers.Dropout(rate=0.5), 67 | 68 | ResnetBlock(initial_filters * 4, strides=3), 69 | ResnetBlock(initial_filters * 4, strides=1), 70 | 71 | ResnetBlock(initial_filters * 8, strides=2), 72 | ResnetBlock(initial_filters * 8, strides=1), 73 | 74 | ResnetBlock(initial_filters * 16, strides=2), 75 | ResnetBlock(initial_filters * 16, strides=1), 76 | ]) 77 | 78 | self.final_bn = layers.BatchNormalization() 79 | self.avg_pool = layers.GlobalMaxPool2D() 80 | self.fc = layers.Dense(num_classes) 81 | 82 | def call(self, inputs, training=None): 83 | # print('x:',inputs.shape) 84 | out = self.stem(inputs) 85 | out = tf.nn.relu(out) 86 | 87 | # print('stem:',out.shape) 88 | 89 | out = self.blocks(out, training=training) 90 | # print('res:',out.shape) 91 | 92 | out = self.final_bn(out, training=training) 93 | # out = tf.nn.relu(out) 94 | 95 | out = self.avg_pool(out) 96 | 97 | # print('avg_pool:',out.shape) 98 | out = self.fc(out) 99 | 100 | # print('out:',out.shape) 101 | 102 | return out 103 | 104 | 105 | 106 | def main(): 107 | num_classes = 5 108 | 109 | resnet18 = ResNet(5) 110 | resnet18.build(input_shape=(4,224,224,3)) 111 | resnet18.summary() 112 | 113 | 114 | 115 | 116 | 117 | 118 | if __name__ == '__main__': 119 | main() -------------------------------------------------------------------------------- /ch15-自定义数据集/train_scratch.py: -------------------------------------------------------------------------------- 1 | import matplotlib 2 | from matplotlib import pyplot as plt 3 | matplotlib.rcParams['font.size'] = 18 4 | matplotlib.rcParams['figure.titlesize'] = 18 5 | matplotlib.rcParams['figure.figsize'] = [9, 7] 6 | matplotlib.rcParams['font.family'] = ['KaiTi'] 7 | matplotlib.rcParams['axes.unicode_minus']=False 8 | 9 | import os 10 | import tensorflow as tf 11 | import numpy as np 12 | from tensorflow import keras 13 | from tensorflow.keras import layers,optimizers,losses 14 | from tensorflow.keras.callbacks import EarlyStopping 15 | 16 | tf.random.set_seed(1234) 17 | np.random.seed(1234) 18 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 19 | assert tf.__version__.startswith('2.') 20 | 21 | 22 | from pokemon import load_pokemon,normalize 23 | 24 | 25 | 26 | def preprocess(x,y): 27 | # x: 图片的路径,y:图片的数字编码 28 | x = tf.io.read_file(x) 29 | x = tf.image.decode_jpeg(x, channels=3) # RGBA 30 | x = tf.image.resize(x, [244, 244]) 31 | 32 | x = tf.image.random_flip_left_right(x) 33 | x = tf.image.random_flip_up_down(x) 34 | x = tf.image.random_crop(x, [224,224,3]) 35 | 36 | # x: [0,255]=> -1~1 37 | x = tf.cast(x, 
dtype=tf.float32) / 255. 38 | x = normalize(x) 39 | y = tf.convert_to_tensor(y) 40 | y = tf.one_hot(y, depth=5) 41 | 42 | return x, y 43 | 44 | 45 | batchsz = 32 46 | # 创建训练集Dataset对象 47 | images, labels, table = load_pokemon('pokemon',mode='train') 48 | db_train = tf.data.Dataset.from_tensor_slices((images, labels)) 49 | db_train = db_train.shuffle(1000).map(preprocess).batch(batchsz) 50 | # 创建验证集Dataset对象 51 | images2, labels2, table = load_pokemon('pokemon',mode='val') 52 | db_val = tf.data.Dataset.from_tensor_slices((images2, labels2)) 53 | db_val = db_val.map(preprocess).batch(batchsz) 54 | # 创建测试集Dataset对象 55 | images3, labels3, table = load_pokemon('pokemon',mode='test') 56 | db_test = tf.data.Dataset.from_tensor_slices((images3, labels3)) 57 | db_test = db_test.map(preprocess).batch(batchsz) 58 | 59 | # 加载DenseNet网络模型,并去掉最后一层全连接层,最后一个池化层设置为max pooling 60 | net = keras.applications.DenseNet121(include_top=False, pooling='max') 61 | # 此处从零开始训练,DenseNet这部分参数也参与优化 62 | net.trainable = True 63 | newnet = keras.Sequential([ 64 | net, # 去掉最后一层的DenseNet121 65 | layers.Dense(1024, activation='relu'), # 追加全连接层 66 | layers.BatchNormalization(), # 追加BN层 67 | layers.Dropout(rate=0.5), # 追加Dropout层,防止过拟合 68 | layers.Dense(5) # 根据宝可梦数据的任务,设置最后一层输出节点数为5 69 | ]) 70 | newnet.build(input_shape=(4,224,224,3)) 71 | newnet.summary() 72 | 73 | # 创建Early Stopping类,连续3次不下降则终止 74 | early_stopping = EarlyStopping( 75 | monitor='val_accuracy', 76 | min_delta=0.001, 77 | patience=3 78 | ) 79 | 80 | newnet.compile(optimizer=optimizers.Adam(lr=1e-3), 81 | loss=losses.CategoricalCrossentropy(from_logits=True), 82 | metrics=['accuracy']) 83 | history = newnet.fit(db_train, validation_data=db_val, validation_freq=1, epochs=100, 84 | callbacks=[early_stopping]) 85 | history = history.history 86 | print(history.keys()) 87 | print(history['val_accuracy']) 88 | print(history['accuracy']) 89 | test_acc = newnet.evaluate(db_test) 90 | 91 | plt.figure() 92 | returns = history['val_accuracy'] 93 | plt.plot(np.arange(len(returns)), returns, label='验证准确率') 94 | plt.plot(np.arange(len(returns)), returns, 's') 95 | returns = history['accuracy'] 96 | plt.plot(np.arange(len(returns)), returns, label='训练准确率') 97 | plt.plot(np.arange(len(returns)), returns, 's') 98 | 99 | plt.plot([len(returns)-1],[test_acc[-1]], 'D', label='测试准确率') 100 | plt.legend() 101 | plt.xlabel('Epoch') 102 | plt.ylabel('准确率') 103 | plt.savefig('scratch.svg') -------------------------------------------------------------------------------- /ch15-自定义数据集/train_transfer.py: -------------------------------------------------------------------------------- 1 | import matplotlib 2 | from matplotlib import pyplot as plt 3 | matplotlib.rcParams['font.size'] = 18 4 | matplotlib.rcParams['figure.titlesize'] = 18 5 | matplotlib.rcParams['figure.figsize'] = [9, 7] 6 | matplotlib.rcParams['font.family'] = ['KaiTi'] 7 | matplotlib.rcParams['axes.unicode_minus']=False 8 | 9 | import os 10 | import tensorflow as tf 11 | import numpy as np 12 | from tensorflow import keras 13 | from tensorflow.keras import layers,optimizers,losses 14 | from tensorflow.keras.callbacks import EarlyStopping 15 | 16 | tf.random.set_seed(2222) 17 | np.random.seed(2222) 18 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 19 | assert tf.__version__.startswith('2.') 20 | 21 | 22 | from pokemon import load_pokemon,normalize 23 | 24 | 25 | 26 | def preprocess(x,y): 27 | # x: 图片的路径,y:图片的数字编码 28 | x = tf.io.read_file(x) 29 | x = tf.image.decode_jpeg(x, channels=3) # RGB 30 | x = tf.image.resize(x, [244, 244]) 
31 | 32 | x = tf.image.random_flip_left_right(x) 33 | x = tf.image.random_flip_up_down(x) 34 | x = tf.image.random_crop(x, [224,224,3]) 35 | 36 | # x: [0,255]=> -1~1 37 | x = tf.cast(x, dtype=tf.float32) / 255. 38 | x = normalize(x) 39 | y = tf.convert_to_tensor(y) 40 | y = tf.one_hot(y, depth=5) 41 | 42 | return x, y 43 | 44 | 45 | batchsz = 32 46 | # 创建训练集Dataset对象 47 | images, labels, table = load_pokemon('pokemon',mode='train') 48 | db_train = tf.data.Dataset.from_tensor_slices((images, labels)) 49 | db_train = db_train.shuffle(1000).map(preprocess).batch(batchsz) 50 | # 创建验证集Dataset对象 51 | images2, labels2, table = load_pokemon('pokemon',mode='val') 52 | db_val = tf.data.Dataset.from_tensor_slices((images2, labels2)) 53 | db_val = db_val.map(preprocess).batch(batchsz) 54 | # 创建测试集Dataset对象 55 | images3, labels3, table = load_pokemon('pokemon',mode='test') 56 | db_test = tf.data.Dataset.from_tensor_slices((images3, labels3)) 57 | db_test = db_test.map(preprocess).batch(batchsz) 58 | 59 | # 加载DenseNet网络模型,并去掉最后一层全连接层,最后一个池化层设置为max pooling 60 | net = keras.applications.DenseNet121(weights='imagenet', include_top=False, pooling='max') 61 | # 此处设置为可训练,即对预训练的DenseNet参数进行微调 62 | net.trainable = True 63 | newnet = keras.Sequential([ 64 | net, # 去掉最后一层的DenseNet121 65 | layers.Dense(1024, activation='relu'), # 追加全连接层 66 | layers.BatchNormalization(), # 追加BN层 67 | layers.Dropout(rate=0.5), # 追加Dropout层,防止过拟合 68 | layers.Dense(5) # 根据宝可梦数据的任务,设置最后一层输出节点数为5 69 | ]) 70 | newnet.build(input_shape=(4,224,224,3)) 71 | newnet.summary() 72 | 73 | # 创建Early Stopping类,连续3次不下降则终止 74 | early_stopping = EarlyStopping( 75 | monitor='val_accuracy', 76 | min_delta=0.001, 77 | patience=3 78 | ) 79 | 80 | newnet.compile(optimizer=optimizers.Adam(lr=1e-3), 81 | loss=losses.CategoricalCrossentropy(from_logits=True), 82 | metrics=['accuracy']) 83 | history = newnet.fit(db_train, validation_data=db_val, validation_freq=1, epochs=100, 84 | callbacks=[early_stopping]) 85 | history = history.history 86 | print(history.keys()) 87 | print(history['val_accuracy']) 88 | print(history['accuracy']) 89 | test_acc = newnet.evaluate(db_test) 90 | 91 | plt.figure() 92 | returns = history['val_accuracy'] 93 | plt.plot(np.arange(len(returns)), returns, label='验证准确率') 94 | plt.plot(np.arange(len(returns)), returns, 's') 95 | returns = history['accuracy'] 96 | plt.plot(np.arange(len(returns)), returns, label='训练准确率') 97 | plt.plot(np.arange(len(returns)), returns, 's') 98 | 99 | plt.plot([len(returns)-1],[test_acc[-1]], 'D', label='测试准确率') 100 | plt.legend() 101 | plt.xlabel('Epoch') 102 | plt.ylabel('准确率') 103 | plt.savefig('transfer.svg') -------------------------------------------------------------------------------- /ch15-自定义数据集/宝可梦数据集.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/ch15-自定义数据集/宝可梦数据集.pdf -------------------------------------------------------------------------------- /【《TensorFlow深度学习》】.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dragen1860/Deep-Learning-with-TensorFlow-book/f99a94ed7d27cf7264bab37527e3c66735928f6d/【《TensorFlow深度学习》】.pdf --------------------------------------------------------------------------------