├── .gitignore
├── Beginner-Images
    ├── tf2doc-cnn-cifar10.md
    ├── tf2doc-cnn-cifar10
    │   ├── cifar10-eg.jpg
    │   └── pooling.jpg
    ├── tf2doc-tfhub-image-tl.md
    └── tf2doc-tfhub-image-tl
    │   └── cifar10-eg.jpg
├── Beginner-ML-basics
    ├── tf2doc-ml-basic-image.md
    ├── tf2doc-ml-basic-image
    │   ├── fashion-mnist-sm.jpg
    │   └── fashion-mnist.jpg
    ├── tf2doc-ml-basic-overfit.md
    ├── tf2doc-ml-basic-overfit
    │   ├── 3_train_val_loss.jpg
    │   ├── imdb.jpg
    │   └── l2_dropout_loss.jpg
    ├── tf2doc-ml-basic-regression.md
    ├── tf2doc-ml-basic-regression
    │   ├── early_mse.jpg
    │   ├── mse.jpg
    │   ├── sns.jpg
    │   └── test_true_pred.jpg
    ├── tf2doc-ml-basic-save-model.md
    ├── tf2doc-ml-basic-save-model
    │   └── hdf5.png
    ├── tf2doc-ml-basic-structured-data.md
    ├── tf2doc-ml-basic-structured-data
    │   └── structured-data.jpg
    ├── tf2doc-ml-basic-text.md
    └── tf2doc-ml-basic-text
    │   ├── imdb-sm.jpg
    │   └── imdb.jpg
├── Beginner-Text-and-sequences
    ├── tf2doc-rnn-lstm-text.md
    └── tf2doc-rnn-lstm-text
    │   ├── acc1.jpg
    │   ├── acc2.jpg
    │   ├── rnn.jpg
    │   └── rnn_small.jpg
├── README.md
├── code
    ├── auto_mpg_regression.ipynb
    ├── cnn-cifar-10.ipynb
    ├── explore_imdb_overfit.ipynb
    ├── rnn-text.ipynb
    ├── save_restore_model.ipynb
    └── tfhub_image_transfer_learning.ipynb
├── tf2doc.md
└── tf2doc
    └── tf.jpg

/.gitignore:
--------------------------------------------------------------------------------
.DS_Store
.vscode
*.mx
# code/ only includes .ipynb files
code/*
!code/*.ipynb

--------------------------------------------------------------------------------
/Beginner-Images/tf2doc-cnn-cifar10.md:
--------------------------------------------------------------------------------
---
title: TensorFlow 2 Chinese Docs - Classifying CIFAR-10 with a Convolutional Neural Network
date: 2019-07-19 20:00:10
description: TensorFlow2 docs, TensorFlow2.0 docs, Chinese edition of the official TensorFlow 2 / 2.0 (TF2.0) documentation, classifying CIFAR-10 with Convolutional Neural Networks (CNN).
tags:
- TensorFlow 2
- Official docs
keywords:
- TensorFlow2.0
- TensorFlow2 docs
- TensorFlow2.0 docs
nav: Concise Tutorial
categories:
- TensorFlow2 docs
image: post/tf2doc-cnn-cifar10/pooling.jpg
github: https://github.com/geektutu/tensorflow2-docs-zh
---

**TF2.0 TensorFlow 2 / 2.0 Chinese Docs: Classifying CIFAR-10 with Convolutional Neural Networks**

Main content: classify the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) with a Convolutional Neural Network (CNN).

The official tutorial uses the MNIST dataset, which was already covered in detail in [MNIST handwritten digit recognition (CNN)](https://geektutu.com/post/tensorflow2-mnist-cnn.html), including training the model and predicting on real images. This article uses CIFAR-10 instead, to see how a simple convolutional neural network performs on an image classification problem.

## About the CIFAR-10 dataset

Like the MNIST handwritten digits, CIFAR-10 has 10 classes. It contains 60,000 images: 50,000 in the training set and 10,000 in the test set. Unlike MNIST, the images in CIFAR-10 are in color. Each image has shape _32x32x3_, where 3 stands for the three _R/G/B_ channels. The color of each pixel is determined by its three _R/G/B_ values, each in the range 0-255. Readers familiar with computer vision will know that a pixel can also be described by four values, _R/G/B/A_, where A is the alpha (opacity) channel with a range of 0-1. For example, the two colors below are both black, but because their opacity differs they look very different:


- `rgba(0, 0, 0, 1)`: fully opaque black
- `rgba(0, 0, 0, 0.5)`: black at 50% opacity
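
To make the effect of the alpha value concrete, here is a minimal sketch of how a semi-transparent color is blended over an opaque background. The helper name `blend_over` and the white background are assumptions chosen for illustration; they are not part of the original tutorial.

```python
def blend_over(fg_rgb, alpha, bg_rgb):
    """Blend a foreground color with opacity `alpha` over an opaque background.

    Each channel is a weighted average: alpha * foreground + (1 - alpha) * background.
    """
    return tuple(alpha * f + (1 - alpha) * b for f, b in zip(fg_rgb, bg_rgb))


WHITE = (255, 255, 255)  # assumed page background, for illustration only
BLACK = (0, 0, 0)

opaque = blend_over(BLACK, 1.0, WHITE)  # rgba(0, 0, 0, 1) over white
half = blend_over(BLACK, 0.5, WHITE)    # rgba(0, 0, 0, 0.5) over white
print(opaque)
print(half)
```

Under these assumptions the fully opaque black stays black, while the half-transparent black lands midway between black and white, which is why the two swatches look so different.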


Download the CIFAR-10 dataset:

```python
# geektutu.com
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers, datasets, models

(train_x, train_y), (test_x, test_y) = datasets.cifar10.load_data()
```

Take a look at the first 15 images.

```python
# geektutu.com
plt.figure(figsize=(5, 3))
plt.subplots_adjust(hspace=0.1)
for n in range(15):
    plt.subplot(3, 5, n+1)
    plt.imshow(train_x[n])
    plt.axis('off')
_ = plt.suptitle("geektutu.com CIFAR-10 Example")
```

![cifar-10 first 15 images](tf2doc-cnn-cifar10/cifar10-eg.jpg)

Scale the pixel values from 0-255 to 0-1:

```python
# geektutu.com
train_x, test_x = train_x / 255.0, test_x / 255.0
print('train_x shape:', train_x.shape, 'test_x shape:', test_x.shape)
# (50000, 32, 32, 3), (10000, 32, 32, 3)
```

## Convolutional layers

```python
# geektutu.com
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.summary()
```

```bash
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_3 (Conv2D)            (None, 30, 30, 32)        896
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 15, 15, 32)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 13, 13, 64)        18496
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 6, 6, 64)          0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 4, 4, 64)          36928
=================================================================
Total params: 56,320
Trainable params: 56,320
Non-trainable params: 0
_________________________________________________________________
```

The input to the CNN is a 3-D tensor (image_height, image_width, color_channels), which is what `input_shape` specifies. Each convolutional layer is built with `tf.keras.layers.Conv2D`. In the code above, `Conv2D` is given two positional arguments: the first is the number of filters (convolution kernels) and the second is the kernel size. The activation and input shape are passed as keyword arguments.

For more detail on CNNs, see [MNIST handwritten digit recognition (CNN)](https://geektutu.com/post/tensorflow2-mnist-cnn.html), which includes animations of the convolution operation and recommended videos.

The first and second convolutional layers are each followed by a max pooling layer (MaxPooling2D). Max pooling takes the maximum value of an image region as the pooled value for that region. Another common pooling operation is average pooling, which uses the mean of the region instead.

![Max & Avg Pooling](tf2doc-cnn-cifar10/pooling.jpg)

After each convolution or pooling step, the width and height of the image shrink. Suppose the image height is h and the kernel size is m. With the settings used here (no padding, stride 1), the height after convolution is h1 = h - m + 1. If the pooling window size is s, the height after pooling is h1 / s rounded down. Matching this against the `model.summary()` output: the input is (32, 32); after convolution with 32 kernels of size 3x3 it becomes (30, 30); after pooling it becomes (15, 15). Later in the network, (13, 13) is pooled to (6, 6), which is where the rounding down shows up.

## Fully connected layers

Our goal is to classify images, so we want the output to be a 1-D vector of length 10 whose k-th value is the probability that the input image belongs to class k. The 3-D output of the convolutional layers therefore has to be converted to 1-D before it is fed into the Dense (fully connected) layers. `tf.keras.layers.Flatten()` does this conversion.

```python
# geektutu.com
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()
```

Here is the final model.

```bash
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_3 (Conv2D)            (None, 30, 30, 32)        896
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 15, 15, 32)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 13, 13, 64)        18496
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 6, 6, 64)          0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 4, 4, 64)          36928
_________________________________________________________________
flatten_1 (Flatten)          (None, 1024)              0
_________________________________________________________________
dense_2 (Dense)              (None, 64)                65600
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________
```

## Compile and train the model

```python
# geektutu.com
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_x, train_y, epochs=5)
```

## Evaluate the model

```python
# geektutu.com
test_loss, test_acc = model.evaluate(test_x, test_y)
test_acc # 0.683
```

Convolutional neural networks are well suited to image data. Trained on the MNIST handwritten digits, this model can reach 99% accuracy, but on CIFAR-10 it reaches only 68.3%. In later articles we will use more complex network architectures or transfer learning to improve the accuracy.

Back to the [documentation home](https://geektutu.com/post/tf2doc.html)

> Full code: [Github - cnn-cifar-10.ipynb](https://github.com/geektutu/tensorflow2-docs-zh/tree/master/code)
> Reference: [Convolutional Neural Networks](https://www.tensorflow.org/beta/tutorials/images/intro_to_cnns)

## Appendix: Recommended reading

- [Getting started with Python in one article](https://geektutu.com/post/quick-python.html)

--------------------------------------------------------------------------------
/Beginner-Images/tf2doc-cnn-cifar10/cifar10-eg.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-Images/tf2doc-cnn-cifar10/cifar10-eg.jpg

--------------------------------------------------------------------------------
/Beginner-Images/tf2doc-cnn-cifar10/pooling.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-Images/tf2doc-cnn-cifar10/pooling.jpg

--------------------------------------------------------------------------------
/Beginner-Images/tf2doc-tfhub-image-tl.md:
--------------------------------------------------------------------------------
---
title: TensorFlow 2 Chinese Docs - Transfer Learning with TFHub
date: 2019-07-19 22:00:10
description: TensorFlow2 docs, TensorFlow2.0 docs, Chinese edition of the official TensorFlow 2 / 2.0 (TF2.0) documentation, classifying CIFAR-10 with transfer learning.
tags:
- TensorFlow 2
- Official docs
keywords:
- TensorFlow2.0
- TensorFlow2 docs
- TensorFlow2.0 docs
nav: Concise Tutorial
categories:
- TensorFlow2 docs
image: post/tf2doc-cnn-cifar10/cifar10-eg.jpg
github: https://github.com/geektutu/tensorflow2-docs-zh
---

**TF2.0 TensorFlow 2 / 2.0 Chinese Docs: Transfer Learning with TFHub**

Main content: use a model from [TFHub](https://www.tensorflow.org/hub) that was pretrained on _ImageNet_ (MobileNet V2) for transfer learning, and apply it to image classification on the CIFAR-10 dataset.

## About the ImageNet model

TFHub hosts many pretrained models. This time we pick one trained on `ImageNet`. The ImageNet dataset has roughly 15 million images in about 22,000 categories, so almost any kind of picture you can think of is in there. If you want to get a feel for it, you can download it from the official [ImageNet](http://www.image-net.org/) site.

Of course, training rarely uses all of the images. A subset is normally used; for example, the 2012 ILSVRC classification dataset uses roughly 1/10 of them. The pretrained model we use for transfer learning today has only 1001 classes. The full list of those 1001 classes is available [here](https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt).

### Download the ImageNet classifier

```python
# geektutu.com
import numpy as np
from PIL import Image
import matplotlib.pylab as plt
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers, datasets

url = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/4"
model = tf.keras.Sequential([
    hub.KerasLayer(url, input_shape=(224, 224, 3))
])
```

This pretrained model expects input of shape (224, 224, 3), so every image has to be resized to 224x224 with 3 color channels before it is passed to the model.

### Test on an arbitrary image

Here we test the pretrained model on a picture of a rabbit, which is also this site's logo.

```python
# geektutu.com
tutu = tf.keras.utils.get_file('tutu.png','https://geektutu.com/img/icon.png')
tutu = Image.open(tutu).resize((224, 224))
tutu
```
![geektutu](https://geektutu.com/img/icon.png)

```python
# geektutu.com
result = model.predict(np.array(tutu).reshape(1, 224, 224, 3)/255.0)
ans = np.argmax(result[0], axis=-1)
print('result.shape:', result.shape, 'ans:', ans)
# result.shape: (1, 1001) ans: 332
```

The model outputs scores for 1001 classes, and the predicted class index is 332. Next we download _ImageNetLabels.txt_ to look up the name of class 332. The result is hare, i.e. a rabbit.

```python
# geektutu.com
labels_url = 'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt'
labels_path = tf.keras.utils.get_file('ImageNetLabels.txt', labels_url)
imagenet_labels = np.array(open(labels_path).read().splitlines())
print(imagenet_labels[ans])
# hare
```

## Transfer learning

In practice, the inputs and outputs of a pretrained model often do not match what we need. In addition, training on millions or tens of millions of images can take days. Is there a way to reuse only part of a trained model? The early layers of a trained model are very good at feature extraction, so reusing them directly saves a great deal of work. This approach is called transfer learning.

In the following example we reuse the feature-extraction part of the _ImageNet Classifier_ and define our own output layer, because the original model outputs 1001 classes while the CIFAR-10 dataset we want to recognize has only 10.

### Resize the dataset

This demo uses the CIFAR-10 dataset, which was introduced in some detail in the previous article, [Classifying CIFAR-10 with a Convolutional Neural Network](https://geektutu.com/post/tf2doc-cnn-cifar10.html), so the introduction is not repeated here. Below are 15 sample images from the dataset.

![CIFAR-10 examples](tf2doc-tfhub-image-tl/cifar10-eg.jpg)

The input of the _ImageNet Classifier_ is fixed at (224, 224, 3), but the images in CIFAR-10 are 32 * 32. To keep things simple, we resize every image from 32x32 to 224x224 using the resize method from the `pillow` library. Loading the whole dataset at that size would exhaust memory, so only the first 30,000 training images are used.

```python
def resize(d, size=(224, 224)):
    # Image.LANCZOS is the same filter as the older Image.ANTIALIAS alias,
    # which was removed in Pillow 10.
    return np.array([np.array(Image.fromarray(v).resize(size, Image.LANCZOS))
                     for v in d])

(train_x, train_y), (test_x, test_y) = datasets.cifar10.load_data()
train_x, test_x = resize(train_x[:30000])/255.0, resize(test_x)/255.0
train_y = train_y[:30000]
```

### Download the feature extraction layer

TFHub provides a version of the _ImageNet Classifier_ with the final classification layer removed, which can be downloaded and used directly.

```python
feature_extractor_url = 'https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4'
feature_extractor_layer = hub.KerasLayer(feature_extractor_url,
                                         input_shape=(224,224,3))
# Freeze this layer so its pretrained weights stay unchanged during training
feature_extractor_layer.trainable = False
```

### Add a classification layer

```python
model = tf.keras.Sequential([
    feature_extractor_layer,
    layers.Dense(10, activation='softmax')
])
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['acc'])
model.summary()
```

In this step we add a fully connected layer with 10 outputs after the feature extraction layer to do the final classification. From `model.summary()` we can see that the feature extraction layer outputs a vector of length 1280.

```
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
keras_layer_1 (KerasLayer)   (None, 1280)              2257984
_________________________________________________________________
dense (Dense)                (None, 10)                12810
=================================================================
Total params: 2,270,794
Trainable params: 12,810
Non-trainable params: 2,257,984
_________________________________________________________________
```

### Train and evaluate the model

```python
history = model.fit(train_x, train_y, epochs=1)
loss, acc = model.evaluate(test_x, test_y)
print(acc)
```

```bash
10000/10000 [=====] - 256s 26ms/sample - loss: 0.7636 - acc: 0.7657
```

The example model in this article is very simple: an output layer is added directly on top of `feature_extractor_layer`, so there are very few trainable parameters. Even though only 30,000 of the 50,000 training images were used, and for a single epoch, the test accuracy still reaches about 76%.

There are many other models pretrained on ImageNet, such as the well-known VGG model. Feel free to try them if you are interested.

Back to the [documentation home](https://geektutu.com/post/tf2doc.html)

> Full code: [Github - tfhub_image_transfer_learning.ipynb](https://github.com/geektutu/tensorflow2-docs-zh/tree/master/code)
> Reference: [TensorFlow Hub with Keras](https://www.tensorflow.org/beta/tutorials/images/hub_with_keras)

## Appendix: Recommended reading

- [Getting started with Python in one article](https://geektutu.com/post/quick-python.html)

--------------------------------------------------------------------------------
/Beginner-Images/tf2doc-tfhub-image-tl/cifar10-eg.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-Images/tf2doc-tfhub-image-tl/cifar10-eg.jpg

--------------------------------------------------------------------------------
/Beginner-ML-basics/tf2doc-ml-basic-image.md:
--------------------------------------------------------------------------------
---
title: TensorFlow 2 Chinese Docs - Fashion MNIST Image Classification
date: 2019-07-09 00:30:10
description: TensorFlow2 docs, TensorFlow2.0 docs, Chinese edition of the official TensorFlow 2 / 2.0 (TF2.0) documentation, image classification (Classify images), using the Fashion MNIST dataset as the example.
tags:
- TensorFlow 2
- Official docs
keywords:
- TensorFlow2.0
- TensorFlow2 docs
- TensorFlow2.0 docs
- Fashion MNIST
- Image classification
- Classify images
nav: Concise Tutorial
categories:
- TensorFlow2 docs
image: post/tf2doc-ml-basic-image/fashion-mnist-sm.jpg
github: https://github.com/geektutu/tensorflow2-docs-zh
---

**TF2.0 TensorFlow 2 / 2.0 Chinese Docs - Image Classification (Classify images)**

Main content: classify images of clothing with a neural network.

This document uses the high-level `tf.keras` API to build and train a model in TensorFlow.

```python
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
```

## Using the Fashion MNIST dataset

![tf2doc-ml-basic-image](tf2doc-ml-basic-image/fashion-mnist.jpg)

The Fashion MNIST dataset consists of 70,000 grayscale images, each 28x28 pixels, covering ten categories of clothing. The other MNIST dataset contains handwritten digits. Fashion MNIST is more challenging than that one and is well suited for validating algorithms.

We use 60,000 images as the training set and 10,000 images as the test set. The dataset can be loaded directly from TensorFlow and is returned as NumPy arrays.

```python
fashion_mnist = keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
```

Each image is 28x28, with pixel values in the range 0-255. The labels are integers in the range 0-9 and correspond to clothing categories as follows.

| Label | Class       | Label | Class  | Label | Class   | Label | Class      |
| ----- | ----------- | ----- | ------ | ----- | ------- | ----- | ---------- |
| 0     | T-shirt/top | 3     | Dress  | 6     | Shirt   | 9     | Ankle boot |
| 1     | Trouser     | 4     | Coat   | 7     | Sneaker |       |            |
| 2     | Pullover    | 5     | Sandal | 8     | Bag     |       |            |

## Data format

```python
train_images.shape # (60000, 28, 28)
len(train_labels) # 60000
train_labels # array([9, 0, 0, ..., 3, 0, 5], dtype=uint8)
test_images.shape # (10000, 28, 28)
len(test_labels) # 10000
```

## Preprocessing

Before training we need to preprocess the data. Each pixel value is between 0 and 255 and needs to be scaled to 0-1. The training set and the test set must go through the same processing.

```python
train_images = train_images / 255.0
test_images = test_images / 255.0
```

## Build the model

The basic building block of a neural network is the layer. Most deep learning networks consist of several simple layers chained together.

```python
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
```

The first layer, `Flatten`, turns the 28x28 two-dimensional input into a one-dimensional array of 784 values. It has no parameters to learn; it simply lines up the rows of the image one after another.

Next come two `Dense` layers, i.e. fully connected (FC) layers. The first `Dense` layer has 128 neurons. The second has 10 neurons and, after _softmax_, returns an array of 10 probabilities that sum to 1. Each value is the probability that the current image belongs to one of the classes 0-9.

## Compile the model

Before the model is ready for training, a few more settings are needed when it is compiled.

- _Loss function_ - measures how far the model's predictions are from the true labels during training. We want to minimize this function to steer the model in the right direction.
- _Optimizer_ - the algorithm that updates the model's parameters.
- _Metrics_ - used to monitor the training and testing steps. The example below uses `accuracy`, the fraction of images that are classified correctly.

```python
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```

## Train the model

Training a neural network usually involves the following steps.

- Feed in the training data, `train_images` and `train_labels`.
- The model learns to associate images with labels.
- The model makes predictions on the test set `test_images`, and the predictions are checked against `test_labels`.

Call `model.fit` to start training.

```python
model.fit(train_images, train_labels, epochs=10)
```

```bash
Train on 60000 samples
Epoch 1/10
60000/60000 [========] - 4s 70us/sample - loss: 0.5032 - accuracy: 0.8234
Epoch 2/10
60000/60000 [========] - 4s 64us/sample - loss: 0.3793 - accuracy: 0.8618
...
Epoch 10/10
60000/60000 [========] - 4s 66us/sample - loss: 0.2389 - accuracy: 0.9115
```

In this run the model ends up at about 91% accuracy on the training set.



## Evaluate accuracy

Next, let's see how the model performs on the test set.

```python
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy:', test_acc)
# 10000/10000 [========] - 0s 37us/sample - loss: 0.3610 - accuracy: 0.8777
# Test accuracy: 0.8777
```

The accuracy on the test set (about 87.8%) is lower than on the training set. This gap between training and test accuracy is a sign of _overfitting_: the model performs worse on new data that it did not see during training.

## Predict

Use the `predict` function to make predictions.

```python
predictions = model.predict(test_images)
predictions[0]
```

Here is the prediction for the first image.

```bash
array([1.06e-05, 5.06e-12, 8.44e-08, 4.09e-09, 2.87e-07, 2.28e-04,
       6.18e-06, 2.48e-02, 3.81e-06, 9.74e-01], dtype=float32)
```

Each prediction is an array of length 10 giving the probability of each class. The most likely label is 9. The label in the test set is also 9, so the prediction is correct.

```python
np.argmax(predictions[0]) # 9
test_labels[0] # 9
```

Back to the [documentation home](https://geektutu.com/post/tf2doc.html)

> Reference: [Train your first neural network: basic classification](https://www.tensorflow.org/beta/tutorials/keras/basic_classification)

## Appendix: Recommended reading

- [Getting started with Python in one article](https://geektutu.com/post/quick-python.html)

--------------------------------------------------------------------------------
/Beginner-ML-basics/tf2doc-ml-basic-image/fashion-mnist-sm.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-image/fashion-mnist-sm.jpg

--------------------------------------------------------------------------------
/Beginner-ML-basics/tf2doc-ml-basic-image/fashion-mnist.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-image/fashion-mnist.jpg

--------------------------------------------------------------------------------
/Beginner-ML-basics/tf2doc-ml-basic-overfit.md:
--------------------------------------------------------------------------------
---
title: TensorFlow 2 Chinese Docs - Overfitting and Underfitting
date: 2019-07-12 01:10:10
description: TensorFlow2 docs, TensorFlow2.0 docs, Chinese edition of the official TensorFlow 2 / 2.0 (TF2.0) documentation, Explore overfitting and underfitting, using the IMDB movie review dataset as the example.
tags:
- TensorFlow 2
- Official docs
keywords:
- TensorFlow2.0
- TensorFlow2 docs
- TensorFlow2.0 docs
- overfitting
- Overfitting
nav: Concise Tutorial
categories:
- TensorFlow2 docs
image: post/tf2doc-ml-basic-overfit/imdb.jpg
github: https://github.com/geektutu/tensorflow2-docs-zh
---

**TF2.0 TensorFlow 2 / 2.0 Chinese Docs: Explore overfitting and underfitting**


Main content: explore two ways of reducing overfitting, weight regularization and dropout, and use them to improve IMDB movie review classification.

In the earlier examples, both movie review classification and fuel efficiency prediction, the model's accuracy on the validation set peaked after training for a while and then started to drop.

In other words, the model overfit. Reaching high accuracy on the training set is easy, but our real goal is to do well on the test set, i.e. on data the model has never seen.

The opposite of overfitting is underfitting, which means there is still room for improvement on the test data. It is usually caused by one of the following:

- The model is not powerful enough.
- The model is over-regularized.
- The model has not been trained long enough.

All of these mean that the model has not learned the relevant patterns in the training data.

If training goes on for too long, however, the model may start to overfit and perform poorly on the test set, so we need to strike a balance.


## The IMDB dataset

This time we process the IMDB dataset with `multi-hot encoding`, which makes the model overfit quickly. Multi-hot encoding is simple: give each word an index, represent each review as a one-dimensional vector of length 10,000, and set the positions of the words that appear to 1.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import imdb

import numpy as np
import matplotlib.pyplot as plt

# Use a font that can render the Chinese plot labels used below
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
plt.rcParams['font.size'] = 20

N = 10000

def multi_hot_encoding(sentences, dim=10000):
    # One row per review; set position j to 1 if word index j appears in it
    results = np.zeros((len(sentences), dim))
    for i, word_indices in enumerate(sentences):
        results[i, word_indices] = 1.0
    return results


(train_x, train_y), (test_x, test_y) = imdb.load_data(num_words=N)
train_x = multi_hot_encoding(train_x)
test_x = multi_hot_encoding(test_x)

plt.plot(train_x[0])
plt.show()
```
![IMDB words distribution](tf2doc-ml-basic-overfit/imdb.jpg)

If you run into the error `[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1045)`, see [Github - google-images-download](https://github.com/hardikvasa/google-images-download/issues/140) for a fix.

## Overfitting

The simplest way to prevent overfitting is to reduce model complexity, for example by reducing the number of learnable parameters. Using fewer layers or fewer units per layer both achieve this. In deep learning, the number of learnable parameters is often referred to as the model's "capacity". The larger the capacity, the stronger the model's ability to memorize, and the more easily it learns a dictionary-like mapping from training features to labels. If that mapping does not generalize, the model cannot perform well on the test set.

The goal of training is not to make the model fit the training data perfectly, but to produce a model that generalizes.

Conversely, a smaller model has less memorization capacity and finds it harder to learn the mapping. To minimize the loss, it has to learn compressed representations, which gives it better predictive power. But if the model is too small and does not have enough neurons to retain useful information, training becomes very difficult. So a trade-off is needed.

How do you choose the right number of layers and neurons? There is no formula; you can only experiment.

To find a suitable model size, it is best to start with a model that has few parameters and grow it gradually until the validation loss stops improving noticeably. Let's try this on the earlier IMDB movie review classification model.

We first build a baseline model, then create a larger and a smaller model for comparison.

```python
def build_and_train(hidden_dim, regularizer=None, dropout=0):
    model = keras.Sequential([
        keras.layers.Dense(hidden_dim, activation='relu',
                           input_shape=(N,),
                           kernel_regularizer=regularizer),
        keras.layers.Dropout(dropout),
        keras.layers.Dense(hidden_dim, activation='relu',
                           kernel_regularizer=regularizer),
        keras.layers.Dropout(dropout),
        keras.layers.Dense(1, activation='sigmoid')
    ])

    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy', 'binary_crossentropy'])
    history = model.fit(train_x, train_y, epochs=10, batch_size=512,
                        validation_data=(test_x, test_y), verbose=0)

    return history


baseline_history = build_and_train(16)
smaller_history = build_and_train(4)
larger_history = build_and_train(512)
```

Let's plot how these three models perform on the training set and the validation set.

```python
def plot_history(histories, key='binary_crossentropy'):
    plt.figure(figsize=(10, 6))

    for name, history in histories:
        # Dashed line: validation set (验证集); solid line: training set (训练集)
        val = plt.plot(history.epoch, history.history['val_'+key],
                       '--', label=name + ' 验证集')
        plt.plot(history.epoch, history.history[key],
                 color=val[0].get_color(), label=name + ' 训练集')

    plt.xlabel('Epochs')
    plt.ylabel('Loss - ' + key)
    plt.legend()

    plt.xlim([0, max(history.epoch)])

# Labels: 基线 = baseline, 较小 = smaller, 较大 = larger
plot_history([('基线', baseline_history),
              ('较小', smaller_history),
              ('较大', larger_history)])
```

![3_train_val_loss](tf2doc-ml-basic-overfit/3_train_val_loss.jpg)

The large model is already overfitting after the first epoch. The larger the capacity, the faster the model fits the training data. Its training loss is very low, but it has in fact overfit: the validation loss is clearly higher than the training loss.

## How to prevent overfitting

### Weight regularization

> If several theories address the same problem and each makes equally accurate predictions, the one that relies on the fewest assumptions should be chosen. Although more complex methods can often make better predictions, when the results are roughly the same, fewer assumptions are better. -- Wikipedia, "Occam's razor"

You may have heard of Occam's razor: entities should not be multiplied beyond necessity. Applied to neural networks, this means that among models with the same predictive power, the simpler one is preferred, because a simple model is less likely to overfit than a complex one.

Besides making the model smaller (fewer layers and fewer neurons per layer), there is another way to make it simpler: reduce the entropy of the model weights (w). That is, constrain the weights to a small range so that the distribution of weight values looks more "regular". This is called weight regularization, and it is done by adding a penalty on large weights to the loss. The two common forms are **L1 regularization**, where the penalty is proportional to the absolute value of the weights, and **L2 regularization**, where it is proportional to their square. L2 regularization is the more commonly used of the two.

> Recommended reading on regularization: [What does "regularization" in machine learning actually mean?](https://www.zhihu.com/question/20924039)

### Dropout

Dropout is one of the most effective and most widely used regularization techniques for neural networks. It is applied to a layer and, during training, randomly drops (sets to 0) a fraction of that layer's output values. The dropout rate is usually set between 0.2 and 0.5. For example:

```python
[0.2, 0.3, 0.5, 0.7, 0.9]
# after 40% dropout
[0.2, 0, 0.5, 0.7, 0]
```

Let's see the effect.

```python
l2_model_history = build_and_train(16, keras.regularizers.l2(0.001))
dpt_model_history = build_and_train(16, dropout=0.2)
# Labels: 基线 = baseline, L2正则 = L2 regularization
plot_history([('基线', baseline_history),
              ('L2正则', l2_model_history),
              ('Dropout', dpt_model_history)])
```

![l2_dropout_loss](tf2doc-ml-basic-overfit/l2_dropout_loss.jpg)

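The Dropout behaviour described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the Keras implementation: the helper name `dropout`, the `rng` argument and the fixed seed are assumptions made for the example. One detail the small list example in the text leaves out is that `keras.layers.Dropout` also scales the surviving values by 1 / (1 - rate) during training so that the expected sum stays the same; the sketch includes that step.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    # Each value is dropped independently, so the number of zeros varies per call
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
activations = [0.2, 0.3, 0.5, 0.7, 0.9]
dropped = dropout(activations, rate=0.4, rng=rng)
print(dropped)
```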
返回[文档首页](https://geektutu.com/post/tf2doc.html) 180 | 181 | > 完整代码:[Github - explore_imdb_overfit.ipynb](https://github.com/geektutu/tensorflow2-docs-zh/tree/master/code) 182 | > 参考文档:[Explore overfitting and underfitting](https://www.tensorflow.org/beta/tutorials/keras/overfit_and_underfit) 183 | 184 | 185 | ## 附 推荐 186 | 187 | - [一篇文章入门 Python](https://geektutu.com/post/quick-python.html) -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-overfit/3_train_val_loss.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-overfit/3_train_val_loss.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-overfit/imdb.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-overfit/imdb.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-overfit/l2_dropout_loss.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-overfit/l2_dropout_loss.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-regression.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: TensorFlow 2 中文文档 - 回归预测燃油效率 3 | date: 2019-07-11 01:00:10 4 | description: TensorFlow2文档,TensorFlow2.0文档,TensorFlow2.0 TF2.0 TensorFlow 2 / 2.0 官方文档中文版,Regression 回归,示例使用 Auto MPG 燃油效率数据集。 5 | tags: 6 | - 
TensorFlow 2 7 | - 官方文档 8 | keywords: 9 | - TensorFlow2.0 10 | - TensorFlow2文档 11 | - TensorFlow2.0文档 12 | - Regression 13 | - 回归 14 | nav: 简明教程 15 | categories: 16 | - TensorFlow2 文档 17 | image: post/tf2doc-ml-basic-regression/sns.jpg 18 | github: https://github.com/geektutu/tensorflow2-docs-zh 19 | --- 20 | 21 | **TF2.0 TensorFlow 2 / 2.0 文档:Regression 回归** 22 | 23 | 主要内容:使用回归预测烟油效率。 24 | 25 | 回归通常用来预测连续值,比如价格和概率。分类问题不一样,类别是固定的,目的是判断属于哪一类。比如给你一堆猫和狗的图片,判断一张图片是猫还是狗就是一个典型的分类问题。 26 | 27 | 接下来使用的是经典的 [Auto MPG](https://archive.ics.uci.edu/ml/datasets/auto+mpg) 数据集,这个数据集包括气缸(cylinders),排量(displayment),马力(horsepower) 和重量(weight)等属性。我们需要利用这些属性搭建模型,预测汽车的燃油效率(fuel efficiency)。 28 | 29 | 模型搭建使用`tf.keras` API。 30 | 31 | ```python 32 | import pathlib 33 | 34 | import matplotlib.pyplot as plt 35 | import pandas as pd 36 | import seaborn as sns 37 | import tensorflow as tf 38 | from tensorflow import keras 39 | from tensorflow.keras import layers 40 | ``` 41 | 42 | 43 | 44 | ## Auto MPG 数据集 45 | 46 | ### 获取数据 47 | 48 | ```python 49 | # 下载数据集到本地 50 | url = "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data" 51 | dataset_path = keras.utils.get_file("auto-mpg.data", url) 52 | 53 | # 使用Pandas读取数据 54 | column_names = ['MPG','气缸','排量','马力','重量','加速度', '年份', '产地'] 55 | raw_dataset = pd.read_csv(dataset_path, names=column_names, 56 | na_values = "?", comment='\t', 57 | sep=" ", skipinitialspace=True) 58 | 59 | dataset = raw_dataset.copy() 60 | # 查看前3条数据 61 | dataset.head(3) 62 | ``` 63 | 64 | | | MPG | 气缸 | 排量 | 马力 | 重量 | 加速度 | 年份 | 产地 | 65 | | ---: | ---: | ---: | ----: | ----: | -----: | -----: | ---: | ---: | 66 | | 0 | 18.0 | 8 | 307.0 | 130.0 | 3504.0 | 12.0 | 70 | 1 | 67 | | 1 | 15.0 | 8 | 350.0 | 165.0 | 3693.0 | 11.5 | 70 | 1 | 68 | | 2 | 18.0 | 8 | 318.0 | 150.0 | 3436.0 | 11.0 | 70 | 1 | 69 | 70 | ### 清洗数据 71 | 72 | 检查是否有 NA 值。 73 | 74 | ```python 75 | dataset.isna().sum() 76 | ``` 77 | 78 | ```bash 79 | MPG 0 80 | 气缸 0 81 | 排量 0 82 | 马力 6 83 | 
重量 0 84 | 加速度 0 85 | 年份 0 86 | 产地 0 87 | dtype: int64 88 | ``` 89 | 90 | 直接去除含有NA值的行(马力) 91 | 92 | ```python 93 | dataset = dataset.dropna() 94 | ``` 95 | 96 | 在获取的数据集中,`Origin`(产地)不是数值类型,需转为独热编码。 97 | 98 | ```python 99 | origin = dataset.pop('产地') 100 | dataset['美国'] = (origin == 1)*1.0 101 | dataset['欧洲'] = (origin == 2)*1.0 102 | dataset['日本'] = (origin == 3)*1.0 103 | # 看一看转换后的结果 104 | dataset.head(3) 105 | ``` 106 | 107 | | | MPG | 气缸 | 排量 | 马力 | 重量 | 加速度 | 年份 | 美国 | 欧洲 | 日本 | 108 | | ---: | ---: | ---: | ----: | ----: | -----: | -----: | ---: | ---: | ---: | ---: | 109 | | 0 | 18.0 | 8 | 307.0 | 130.0 | 3504.0 | 12.0 | 70 | 1.0 | 0.0 | 0.0 | 110 | | 1 | 15.0 | 8 | 350.0 | 165.0 | 3693.0 | 11.5 | 70 | 1.0 | 0.0 | 0.0 | 111 | | 2 | 18.0 | 8 | 318.0 | 150.0 | 3436.0 | 11.0 | 70 | 1.0 | 0.0 | 0.0 | 112 | 113 | ### 划分训练集与测试集 114 | 115 | ```python 116 | # 训练集 80%, 测试集 20% 117 | train_dataset = dataset.sample(frac=0.8, random_state=0) 118 | test_dataset = dataset.drop(train_dataset.index) 119 | ``` 120 | 121 | ### 检查数据 122 | 123 | 快速看一看训练集中属性两两之间的关系吧。 124 | 125 | ```python 126 | # 解决中文乱码问题 127 | plt.rcParams['font.sans-serif']=['SimHei'] 128 | plt.rcParams['axes.unicode_minus']=False 129 | 130 | sns.pairplot(train_dataset[["MPG", "气缸", "排量", "重量"]], diag_kind="kde") 131 | ``` 132 | 133 | > matplotlib 中文乱码看这里:[matplotlib图例中文乱码?](https://www.zhihu.com/question/25404709) 134 | 135 | ![Seaborn four feature relation](tf2doc-ml-basic-regression/sns.jpg) 136 | 137 | 你还可以使用`train_dataset.describle()`快速浏览每一属性的平均值、标准差、最小值、最大值等信息,能够帮助你快速地识别出不合理的数据。 138 | 139 | ```python 140 | train_stats = train_dataset.describe() 141 | train_stats.pop("MPG") 142 | train_stats = train_stats.transpose() 143 | train_stats 144 | ``` 145 | 146 | | | count | mean | std | min | 25% | 50% | 75% | max | 147 | | ---: | ----: | ---------: | ---------: | ---: | -----: | ----: | -----: | ----: | 148 | | 气缸 | 314.0 | 5.477707 | 1.699788 | 3.0 | 4.00 | 4.0 | 8.00 | 8.0 | 149 | | 排量 | 314.0 | 195.318471 | 
104.331589 | 68.0 | 105.50 | 151.0 | 265.75 | 455.0 | 150 | | ... | ... | ... | ... | ... | ... | ... | ... | ... | 151 | 152 | ### 分离 label 153 | 154 | ```python 155 | # 分离 label 156 | train_labels = train_dataset.pop('MPG') 157 | test_labels = test_dataset.pop('MPG') 158 | ``` 159 | 160 | ### 归一化数据 161 | 162 | 通常训练前需要归一化数据。不同属性使用的计量单位不一样,值的范围不一样,训练就会很困难。比如其中一个属性的范围是[0.1, 0.5],而另一个属性的范围是[1000, 5000],那数值大的属性就容易对训练产生干扰,很可能导致训练不能收敛,或者是数值小的属性在模型中几乎没有发挥作用。这里的归一化采用标准化(z-score)的方式:每个属性减去训练集的均值,再除以训练集的标准差,使各属性的均值约为0、标准差约为1,可以有效地避免这个问题。注意测试集也必须使用训练集的统计量来转换。 163 | 164 | ```python 165 | def norm(x): 166 | return (x - train_stats['mean']) / train_stats['std'] 167 | normed_train_data = norm(train_dataset) 168 | normed_test_data = norm(test_dataset) 169 | ``` 170 | 171 | ## 模型 172 | 173 | ### 搭建模型 174 | 175 | 我们的模型由2个全连接的隐藏层构成,输出层返回一个连续值。 176 | 177 | ```python 178 | def build_model(): 179 | input_dim = len(train_dataset.keys()) 180 | model = keras.Sequential([ 181 | layers.Dense(64, activation='relu', input_shape=[input_dim,]), 182 | layers.Dense(32, activation='relu'), 183 | layers.Dense(1) 184 | ]) 185 | 186 | model.compile(loss='mse', metrics=['mae', 'mse'], 187 | optimizer=tf.keras.optimizers.RMSprop(0.001)) 188 | return model 189 | 190 | model = build_model() 191 | # 打印模型的描述信息,每一层的大小、参数个数等 192 | model.summary() 193 | ``` 194 | 195 | ```python 196 | Model: "sequential_1" 197 | _________________________________________________________________ 198 | Layer (type) Output Shape Param # 199 | ================================================================= 200 | dense_4 (Dense) (None, 64) 640 201 | _________________________________________________________________ 202 | dense_5 (Dense) (None, 32) 2080 203 | _________________________________________________________________ 204 | dense_6 (Dense) (None, 1) 33 205 | ================================================================= 206 | Total params: 2,753 207 | Trainable params: 2,753 208 | Non-trainable params: 0 209 | _________________________________________________________________ 210 | 
``` 211 | 212 | ### 训练模型 213 | 214 | 在之前的案例,比如[结构化数据分类](https://geektutu.com/post/tf2doc-ml-basic-structured-data.html),我们调用`model.fit`会打印出训练的进度。我们可以禁用默认的行为,并自定义训练进度条。 215 | 216 | ```python 217 | import sys 218 | 219 | 220 | EPOCHS = 1000 221 | 222 | class ProgressBar(keras.callbacks.Callback): 223 | def on_epoch_end(self, epoch, logs): 224 | # 显示进度条 225 | self.draw_progress_bar(epoch + 1, EPOCHS) 226 | 227 | def draw_progress_bar(self, cur, total, bar_len=50): 228 | cur_len = int(cur / total * bar_len) 229 | sys.stdout.write("\r") 230 | sys.stdout.write("[{:<{}}] {}/{}".format("=" * cur_len, bar_len, cur, total)) 231 | sys.stdout.flush() 232 | 233 | history = model.fit( 234 | normed_train_data, train_labels, 235 | epochs=EPOCHS, validation_split = 0.2, verbose=0, 236 | callbacks=[ProgressBar()]) 237 | ``` 238 | 239 | ```bash 240 | [==================================================] 1000/1000 241 | ``` 242 | 243 | 训练过程都存储在了`history`对象中,我们可以借助 matplotlib 将训练过程可视化。 244 | 245 | ```python 246 | hist = pd.DataFrame(history.history) 247 | hist['epoch'] = history.epoch 248 | hist.tail(3) 249 | ``` 250 | 251 | | | loss | mae | mse | val_loss | val_mae | val_mse | epoch | 252 | | ---: | -------: | -------: | -------: | -------: | -------: | -------: | ---: | 253 | | 997 | 3.132053 | 1.142280 | 3.132053 | 9.711935 | 2.361466 | 9.711935 | 997 | 254 | | 998 | 3.021109 | 1.093424 | 3.021109 | 9.488593 | 2.298264 | 9.488593 | 998 | 255 | | 999 | 3.028849 | 1.132241 | 3.028849 | 9.453931 | 2.275017 | 9.453931 | 999 | 256 | 257 | ```python 258 | def plot_history(history): 259 | hist = pd.DataFrame(history.history) 260 | hist['epoch'] = history.epoch 261 | plt.figure() 262 | plt.xlabel('epoch') 263 | plt.ylabel('metric - MSE') 264 | plt.plot(hist['epoch'], hist['mse'], label='训练集') 265 | plt.plot(hist['epoch'], hist['val_mse'], label = '验证集') 266 | plt.ylim([0, 20]) 267 | plt.legend() 268 | 269 | plt.figure() 270 | plt.xlabel('epoch') 271 | plt.ylabel('metric - MAE') 272 | 
plt.plot(hist['epoch'], hist['mae'], label='训练集') 273 | plt.plot(hist['epoch'], hist['val_mae'], label = '验证集') 274 | plt.ylim([0, 5]) 275 | plt.legend() 276 | 277 | plot_history(history) 278 | ``` 279 | 280 | ![MAE](tf2doc-ml-basic-regression/mse.jpg) 281 | 282 | 从图中,我们可以看到,从100 epoch开始,训练集的loss仍旧继续降低,但验证集的loss却在升高,说明过拟合了,训练应该早一点结束。接下来,我们使用 `keras.callbacks.EarlyStopping`,在每一轮(epoch)训练结束时检查训练情况,如果验证集的loss(即 val_loss)连续 `patience` 轮(这里是10轮)不再下降,则自动停止训练。 283 | 284 | ```python 285 | model = build_model() 286 | early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10) 287 | history = model.fit(normed_train_data, train_labels, epochs=EPOCHS, 288 | validation_split = 0.2, verbose=0, 289 | callbacks=[early_stop, ProgressBar()]) 290 | plot_history(history) 291 | ``` 292 | 293 | ```bash 294 | [=== ] 70/1000 295 | ``` 296 | 297 | 在第 70 epoch 时,停止了训练。 298 | 299 | ![MAE](tf2doc-ml-basic-regression/early_mse.jpg) 300 | 301 | 接下来使用测试集来评估训练效果。 302 | 303 | ```python 304 | loss, mae, mse = model.evaluate(normed_test_data, test_labels, verbose=0) 305 | print("测试集平均绝对误差(MAE): {:5.2f} MPG".format(mae)) 306 | # 测试集平均绝对误差(MAE): 1.90 MPG 307 | ``` 308 | 309 | 对照上图可以看出,测试集的 MAE(1.9)比验证集还略低一点。 310 | 311 | ## 预测 312 | 313 | 最后,我们使用测试集中的数据来预测 MPG 值。 314 | 315 | ```python 316 | test_pred = model.predict(normed_test_data).flatten() 317 | 318 | plt.scatter(test_labels, test_pred) 319 | plt.xlabel('真实值') 320 | plt.ylabel('预测值') 321 | plt.axis('equal') 322 | plt.axis('square') 323 | plt.xlim([0,plt.xlim()[1]]) 324 | plt.ylim([0,plt.ylim()[1]]) 325 | plt.plot([-100, 100], [-100, 100]) 326 | ``` 327 | 328 | 看起来,模型训练得还不错。 329 | 330 | ![Test True Pred](tf2doc-ml-basic-regression/test_true_pred.jpg) 331 | 332 | 333 | ## 结论 334 | 335 | - 均方误差(Mean Squared Error, MSE) 常作为回归问题的损失函数(loss function),与分类问题不太一样。 336 | - 同样,评价指标(evaluation metrics)也不一样,分类问题常用准确率(accuracy),回归问题常用平均绝对误差 (Mean Absolute Error, MAE)。 337 | - 每一列数据都有不同的范围,每一列,即每一个feature的数据需要分别缩放到相同的范围。常用的方式是标准化(减去均值再除以标准差),本文即采用这种方式。 338 | -
如果训练数据过少,最好搭建一个隐藏层少的小的神经网络,避免过拟合。 339 | - 早停法(Early Stopping)也是防止过拟合的一种方式。 340 | 341 | 返回[文档首页](https://geektutu.com/post/tf2doc.html) 342 | 343 | > 完整代码:[Github - auto_mpg_regression.ipynb](https://github.com/geektutu/tensorflow2-docs-zh/tree/master/code) 344 | > 参考文档:[Regression: Predict fuel efficiency](https://www.tensorflow.org/beta/tutorials/keras/basic_regression) 345 | 346 | ## 附 推荐 347 | 348 | - [一篇文章入门 Python](https://geektutu.com/post/quick-python.html) -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-regression/early_mse.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-regression/early_mse.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-regression/mse.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-regression/mse.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-regression/sns.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-regression/sns.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-regression/test_true_pred.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-regression/test_true_pred.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-save-model.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: TensorFlow 2 中文文档 - 保存与加载模型 3 | date: 2019-07-13 00:05:10 4 | description: TensorFlow2文档,TensorFlow2.0文档,TensorFlow2.0 TF2.0 TensorFlow 2 / 2.0 官方文档中文版,保存与加载模型 Save and Restore model。 5 | tags: 6 | - TensorFlow 2 7 | - 官方文档 8 | keywords: 9 | - TensorFlow2.0 10 | - TensorFlow2文档 11 | - TensorFlow2.0文档 12 | - overfitting 13 | - 过拟合 14 | nav: 简明教程 15 | categories: 16 | - TensorFlow2 文档 17 | image: post/tf2doc-ml-basic-save-model/hdf5.png 18 | github: https://github.com/geektutu/tensorflow2-docs-zh 19 | --- 20 | 21 | ![TensorFlow HDF5](tf2doc-ml-basic-save-model/hdf5.png) 22 | 23 | 27 | 28 | **TF2.0 TensorFlow 2 / 2.0 中文文档:保存与加载模型 Save and Restore model** 29 | 30 | 主要内容:使用 `tf.keras`接口训练、保存、加载模型,数据集选用 MNIST 。 31 | 32 | ```bash 33 | $ pip install -q tensorflow==2.0.0-beta1 34 | $ pip install -q h5py pyyaml 35 | ``` 36 | 37 | ## 准备训练数据 38 | 39 | ```python 40 | import tensorflow as tf 41 | from tensorflow import keras 42 | from tensorflow.keras import datasets, layers, models, callbacks 43 | from tensorflow.keras.datasets import mnist 44 | 45 | import os 46 | file_path = os.path.abspath('./mnist.npz') 47 | 48 | (train_x, train_y), (test_x, test_y) = datasets.mnist.load_data(path=file_path) 49 | train_y, test_y = train_y[:1000], test_y[:1000] 50 | train_x = train_x[:1000].reshape(-1, 28 * 28) / 255.0 51 | test_x = test_x[:1000].reshape(-1, 28 * 28) / 255.0 52 | ``` 53 | 54 | ## 搭建模型 55 | 56 | ```python 57 | def create_model(): 58 | model = models.Sequential([ 59 | layers.Dense(512, activation='relu', input_shape=(784,)), 60 | layers.Dropout(0.2), 61 | layers.Dense(10, 
activation='softmax') 62 | ]) 63 | 64 | model.compile(optimizer='adam', metrics=['accuracy'], 65 | loss='sparse_categorical_crossentropy') 66 | 67 | return model 68 | 69 | def evaluate(target_model): 70 | _, acc = target_model.evaluate(test_x, test_y) 71 | print("Restore model, accuracy: {:5.2f}%".format(100*acc)) 72 | ``` 73 | 74 | ## 自动保存 checkpoints 75 | 76 | 在训练过程中自动保存 checkpoints 有两个好处:一是训练结束后得到了训练好的模型,使用时不必再重新训练;二是训练过程被中断时,可以从断点处继续训练。 77 | 78 | 设置`tf.keras.callbacks.ModelCheckpoint`回调可以实现这一点。 79 | 80 | ```python 81 | # 存储模型的文件名,语法与 str.format 一致 82 | # period=10:每 10 epochs 保存一次 83 | checkpoint_path = "training_2/cp-{epoch:04d}.ckpt" 84 | checkpoint_dir = os.path.dirname(checkpoint_path) 85 | cp_callback = callbacks.ModelCheckpoint( 86 | checkpoint_path, verbose=1, save_weights_only=True, period=10) 87 | 88 | model = create_model() 89 | model.save_weights(checkpoint_path.format(epoch=0)) 90 | model.fit(train_x, train_y, epochs=50, callbacks=[cp_callback], 91 | validation_data=(test_x, test_y), verbose=0) 92 | ``` 93 | 94 | ```python 95 | Epoch 00010: saving model to training_2/cp-0010.ckpt 96 | Epoch 00020: saving model to training_2/cp-0020.ckpt 97 | Epoch 00030: saving model to training_2/cp-0030.ckpt 98 | Epoch 00040: saving model to training_2/cp-0040.ckpt 99 | Epoch 00050: saving model to training_2/cp-0050.ckpt 100 | ``` 101 | 102 | 加载权重: 103 | 104 | ```python 105 | latest = tf.train.latest_checkpoint(checkpoint_dir) 106 | # 'training_2/cp-0050.ckpt' 107 | model = create_model() 108 | model.load_weights(latest) 109 | evaluate(model) 110 | ``` 111 | 112 | ```bash 113 | 1000/1000 [===] - 0s 90us/sample - loss: 0.4703 - accuracy: 0.8780 114 | Restore model, accuracy: 87.80% 115 | ``` 116 | 117 | ## 手动保存权重 118 | 119 | ```python 120 | # 手动保存权重 121 | model.save_weights('./checkpoints/manual_checkpoint') 122 | model = create_model() 123 | model.load_weights('./checkpoints/manual_checkpoint') 124 | evaluate(model) 125 | ``` 126 | 127 | ```python 128 | 1000/1000 [===] - 0s 90us/sample - 
loss: 0.4703 - accuracy: 0.8780 129 | Restore model, accuracy: 87.80% 130 | ``` 131 | 132 | ## 保存整个模型 133 | 134 | 上面的示例仅仅保存了模型中的权重(weights)。其实模型和优化器也可以一起保存,包括权重(weights)、模型配置(architecture)和优化器配置(optimizer configuration)。这样做的好处是,当你恢复模型时,完全不依赖于原来搭建模型的代码。 135 | 136 | 保存完整的模型有很多应用场景,比如在浏览器中使用 TensorFlow.js 加载运行,比如在移动设备上使用 TensorFlow Lite 加载运行。 137 | 138 | ### HDF5 139 | 140 | 直接调用`model.save`即可保存为 HDF5 格式的文件。 141 | 142 | ```python 143 | model.save('my_model.h5') 144 | ``` 145 | 146 | 从 HDF5 中恢复完整的模型。 147 | 148 | ```python 149 | new_model = models.load_model('my_model.h5') 150 | evaluate(new_model) 151 | ``` 152 | 153 | ```bash 154 | 1000/1000 [===] - 0s 90us/sample - loss: 0.4703 - accuracy: 0.8780 155 | Restore model, accuracy: 87.80% 156 | ``` 157 | 158 | ### saved_model 159 | 160 | 保存为`saved_model`格式。 161 | 162 | ```python 163 | import time 164 | saved_model_path = "./saved_models/{}".format(int(time.time())) 165 | tf.keras.experimental.export_saved_model(model, saved_model_path) 166 | ``` 167 | 168 | 恢复模型并预测 169 | 170 | ```python 171 | new_model = tf.keras.experimental.load_from_saved_model(saved_model_path) 172 | # 使用恢复出来的 new_model 预测,而不是原来的 model 173 | new_model.predict(test_x).shape 173 | ``` 174 | 175 | ```bash 176 | (1000, 10) 177 | ``` 178 | 179 | `saved_model`格式的模型可以直接用来预测(predict),但是 saved_model 没有保存优化器配置,如果要使用`evaluate`方法,则需要先 compile。 180 | 181 | ```python 182 | new_model.compile(optimizer=model.optimizer, 183 | loss='sparse_categorical_crossentropy', 184 | metrics=['accuracy']) 185 | evaluate(new_model) 186 | ``` 187 | 188 | ```bash 189 | 1000/1000 [===] - 0s 90us/sample - loss: 0.4703 - accuracy: 0.8780 190 | Restore model, accuracy: 87.80% 191 | ``` 192 | 193 | ## 最后 194 | 195 | TensorFlow 中还有其他的方式可以保存模型。 196 | 197 | - [Saving in eager](https://www.tensorflow.org/guide/eager#object_based_saving) eager 模式下保存模型 198 | - [Save and Restore](https://www.tensorflow.org/guide/saved_model) -- low-level 的接口。 199 | 200 | 返回[文档首页](https://geektutu.com/post/tf2doc.html) 201 | 202 | > 完整代码:[Github - 
save_restore_model.ipynb](https://github.com/geektutu/tensorflow2-docs-zh/tree/master/code) 203 | > 参考文档:[Save and restore models](https://www.tensorflow.org/beta/tutorials/keras/save_and_restore_models) 204 | 205 | ## 附 推荐 206 | 207 | - [一篇文章入门 Python](https://geektutu.com/post/quick-python.html) -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-save-model/hdf5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-save-model/hdf5.png -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-structured-data.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: TensorFlow 2 中文文档 - 特征工程结构化数据分类 3 | date: 2019-07-09 23:45:10 4 | description: TensorFlow2文档,TensorFlow2.0文档,TensorFlow2.0 TF2.0 TensorFlow 2 / 2.0 官方文档中文版,结构化数据分类 Classify structured data,示例使用心脏病数据集。 5 | tags: 6 | - TensorFlow 2 7 | - 官方文档 8 | keywords: 9 | - TensorFlow2.0 10 | - TensorFlow2文档 11 | - TensorFlow2.0文档 12 | - Feature Column 13 | - structured data 14 | nav: 简明教程 15 | categories: 16 | - TensorFlow2 文档 17 | image: post/tf2doc-ml-basic-structured-data/structured-data.jpg 18 | github: https://github.com/geektutu/tensorflow2-docs-zh 19 | --- 20 | 21 | ![structured-data](tf2doc-ml-basic-structured-data/structured-data.jpg) 22 | 23 | 27 | 28 | **TF2.0 TensorFlow 2 / 2.0 中文文档 - 结构化数据分类 Classify structured data** 29 | 30 | 主要内容:介绍如何对结构化数据(例如 CSV 中的表格数据)分类。 31 | 32 | 这个教程包含完整的代码: 33 | 34 | - 使用 Pandas 加载 CSV 文件。 35 | - 使用 tf.data 打乱数据并获取batch。 36 | - 使用特征工程(feature columns)将 CSV 中的列映射为特征值(features) 37 | - 使用 Keras 搭建、训练和评估模型。 38 | 39 | ## 数据集 40 | 41 | 数据集由 Cleveland Clinic Foundation 
提供的几百行心脏病数据构成。每一行代表一个病人,每一列描述一个属性,数据共14列,最后一列为是否患病。我们将使用这些信息预测一个病人是否有心脏病。这是一个典型的二分类问题。 42 | 43 | | 列 | 描述 | 特征类型 | 数据类型 | 44 | | ------ | --------------------------------------------------- | -------- | -------- | 45 | | Age | 年龄 | 数值 | integer | 46 | | Sex | 性别(1男性; 0女性) | 类别 | integer | 47 | | ... | ... | ... | ... | 48 | | Thal | 3 = normal; 6 = fixed defect; 7 = reversable defect | 类别 | string | 49 | | Target | 是否患病(1是;0否) | 分类 | integer | 50 | 51 | ## 导入库 52 | 53 | ```bash 54 | pip install -q scikit-learn 55 | pip install -q tensorflow==2.0.0-beta1 56 | ``` 57 | 58 | ```python 59 | import numpy as np 60 | import pandas as pd 61 | import tensorflow as tf 62 | 63 | from tensorflow import feature_column 64 | from tensorflow.keras import layers 65 | from sklearn.model_selection import train_test_split 66 | ``` 67 | 68 | ## 使用 Pandas 读取数据 69 | 70 | ```python 71 | URL = 'https://storage.googleapis.com/applied-dl/heart.csv' 72 | dataframe = pd.read_csv(URL) 73 | dataframe.head() 74 | ``` 75 | 76 | | - | age | sex | cp | … | slope | ca | thal | target | 77 | | ---- | ---- | ---- | ---- | ---- | ----- | ---- | ---------- | ------ | 78 | | 0 | 63 | 1 | 1 | | 3 | 0 | fixed | 0 | 79 | | 1 | 67 | 1 | 4 | | 2 | 3 | normal | 1 | 80 | | 2 | 67 | 1 | 4 | | 2 | 2 | reversible | 0 | 81 | | 3 | 37 | 1 | 3 | | 3 | 0 | normal | 0 | 82 | | 4 | 41 | 0 | 2 | | 1 | 0 | normal | 0 | 83 | 84 | ## 分割训练集、验证集和测试集 85 | 86 | ```python 87 | train, test = train_test_split(dataframe, test_size=0.2) 88 | train, val = train_test_split(train, test_size=0.2) 89 | print(len(train), 'train examples') # 193 90 | print(len(val), 'validation examples') # 49 91 | print(len(test), 'test examples') # 61 92 | ``` 93 | 94 | ## 创建 input pipeline 95 | 96 | 使用 tf.data 封装 DataFrame 之后,我们可以使用特征工程(feature columns)将 Pandas DataFrame 中的列映射为特征值(features)。如果是一个非常大的 CSV 文件,不能全部放在内存中,就必须使用 tf.data 直接从磁盘中读取数据了。 97 | 98 | ```python 99 | # 帮助函数,返回 tf.data 数据集。 100 | def df_to_dataset(dataframe, shuffle=True, batch_size=32): 101 | dataframe 
= dataframe.copy() 102 | labels = dataframe.pop('target') 103 | ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels)) 104 | if shuffle: 105 | ds = ds.shuffle(buffer_size=len(dataframe)) 106 | ds = ds.batch(batch_size) 107 | return ds 108 | ``` 109 | 110 | ```python 111 | batch_size = 5 # 为了方便展示下面的示例,batch暂时设置为5。 112 | train_ds = df_to_dataset(train, batch_size=batch_size) 113 | val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size) 114 | test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size) 115 | ``` 116 | 117 | ## 理解 input pipeline 118 | 119 | ```python 120 | for feature_batch, label_batch in train_ds.take(1): 121 | print('Every feature:', list(feature_batch.keys())) 122 | print('A batch of ages:', feature_batch['age']) 123 | print('A batch of targets:', label_batch ) 124 | ``` 125 | 126 | ```python 127 | Every feature: ['cp', 'age', 'sex', ... , 'slope', 'ca'] 128 | A batch of ages: tf.Tensor([50 62 37 69 58], shape=(5,), dtype=int32) 129 | A batch of targets: tf.Tensor([0 0 0 0 0], shape=(5,), dtype=int32) 130 | ``` 131 | 132 | 可以看到数据集返回了一个键为列名的字典。 133 | 134 | ## 特征列示例 135 | 136 | TensorFlow 提供了很多种类型的特征列(feature column),接下来给几个例子看一看每一列的值是怎么被转换的。 137 | 138 | ```python 139 | example_batch = next(iter(train_ds))[0] 140 | 141 | # 帮助函数:创建一个特征列,并转换。 142 | def demo(feature_column): 143 | feature_layer = layers.DenseFeatures(feature_column) 144 | print(feature_layer(example_batch).numpy()) 145 | ``` 146 | 147 | ### 1) Numeric column 148 | 149 | 特征列的输出是模型的输入,Numeric columns 是最简单的类型,数值本身代表某个特征真实的值,因此转换后,值不发生改变。 150 | 151 | ```python 152 | age = feature_column.numeric_column("age") 153 | demo(age) 154 | ``` 155 | 156 | ```python 157 | [[50.] 158 | [62.] 159 | [37.] 160 | [69.] 
161 | [58.]] 162 | ``` 163 | 164 | 在这个数据集中,大部分列都是数值类型。 165 | 166 | ### 2) Bucketized columns 167 | 168 | 有时候,并不想直接将数值传给模型,而是希望基于数值的范围离散成几个种类。比如人的年龄,0-10归为一类,用0表示;11-20归为一类,用1表示。我们可以用 bucketized column 将年龄划分到不同的 bucket 中。用中文比喻,就好像提供了不同的桶,在某一范围内的扔进A桶,另一范围的数据扔进B桶,以此类推。下面的例子使用独热编码来表示不同的 bucket。 169 | 170 | ```python 171 | age_buckets = feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) 172 | demo(age_buckets) 173 | ``` 174 | 175 | ```python 176 | [[0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.] 177 | [0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] 178 | [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] 179 | [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.] 180 | [0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]] 181 | ``` 182 | 183 | ### 3) Categorical columns 184 | 185 | 在这个数据集中,`thal`列使用字符串表示(e.g. 'fixed', 'normal', 'reversible')。字符串不能直接传给模型。所以我们要先将字符串映射为数值。可以使用categorical_column_with_vocabulary_list 和 categorical_column_with_vocabulary_file 来转换,前者接受一个列表作为输入,后者可以传入一个文件。 186 | 187 | ```python 188 | thal = feature_column.categorical_column_with_vocabulary_list( 189 | 'thal', ['fixed', 'normal', 'reversible']) 190 | 191 | thal_one_hot = feature_column.indicator_column(thal) 192 | demo(thal_one_hot) 193 | ``` 194 | 195 | ```python 196 | [[0. 1. 0.] 197 | [0. 1. 0.] 198 | [0. 1. 0.] 199 | [0. 1. 0.] 200 | [0. 0. 
1.]] 201 | ``` 202 | 203 | 可以看到,最终的输出向量也是独热编码,和 Bucketized column 相似。 204 | 205 | ### 4) Embedding column 206 | 207 | 假设某一列有上千种类别,用独热编码来表示就不太合适了。这时候,可以使用 embedding column。embedding column 可以压缩维度,因此向量中的值不再只由0或1组成,可以包含任何数字。 208 | 209 | 在有很多种类别时使用 embedding column 是最合适的。接下来只是一个示例,不管输入有多少种可能性,最终的输出向量定长为8。 210 | 211 | ```python 212 | # embedding column的输入是categorical column 213 | thal_embedding = feature_column.embedding_column(thal, dimension=8) 214 | demo(thal_embedding) 215 | ``` 216 | 217 | ```python 218 | [[-0.42 -0.42 0.34 0.47 0.21 0.33 0.34 0.65] 219 | [-0.42 -0.42 0.34 0.47 0.21 0.33 0.34 0.65] 220 | [-0.42 -0.42 0.34 0.47 0.21 0.33 0.34 0.65] 221 | [-0.42 -0.42 0.34 0.47 0.21 0.33 0.34 0.65] 222 | [ 0.20 0.07 0.06 0.01 -0.47 -0.10 -0.70 0.00]] 223 | ``` 224 | 225 | ### 5) Hashed feature columns 226 | 227 | 另一种表示类别很多的 categorical column 的方式是使用 categorical_column_with_hash_bucket。这个特征列会计算输入的哈希值,然后根据哈希值对字符串进行编码。哈希桶(bucket)个数即参数`hash_bucket_size`。哈希桶(hash_buckets)的个数应明显小于实际的类别个数,以节省空间。 228 | 229 | 注意:哈希的一大副作用是可能存在冲突,不同的字符串可能映射到相同的哈希桶中。不过,在某些数据集,这个方式还是非常有效的。 230 | 231 | ```python 232 | thal_hashed = feature_column.categorical_column_with_hash_bucket( 233 | 'thal', hash_bucket_size=1000) 234 | demo(feature_column.indicator_column(thal_hashed)) 235 | ``` 236 | 237 | ```python 238 | [[0. 0. 0. ... 0. 0. 0.] 239 | [0. 0. 0. ... 0. 0. 0.] 240 | [0. 0. 0. ... 0. 0. 0.] 241 | [0. 0. 0. ... 0. 0. 0.] 242 | [0. 0. 0. ... 0. 0. 0.]] 243 | ``` 244 | 245 | ### 6) Crossed feature columns 246 | 247 | 将几个特征组合成一个特征,即 feature crosses,模型可以对每一个特征组合学习独立的权重。接下来,我们将组合 `age` 和 `thal` 列创建一个新的特征。注意:`crossed_column`不会创建所有可能的组合,因为组合可能性会非常多。背后是通过`hashed_column`处理的,可以设置哈希桶的大小。 248 | 249 | ```python 250 | crossed_feature = feature_column.crossed_column([age_buckets, thal], hash_bucket_size=1000) 251 | demo(feature_column.indicator_column(crossed_feature)) 252 | ``` 253 | 254 | ```python 255 | [[0. 0. 0. ... 0. 0. 0.] 256 | [0. 0. 0. ... 0. 0. 0.] 257 | [0. 0. 0. ... 0. 0. 0.] 258 | [0. 0. 0. 
... 0. 0. 0.] 259 | [0. 0. 0. ... 0. 0. 0.]] 260 | ``` 261 | 262 | ## 选择需要使用的列 263 | 264 | 为了训练出准确率高的模型,大数据集、选取有意义的列、数据的展示方式都是非常重要的。 265 | 266 | 接下来的示例,我们随机选取一些列来训练。 267 | 268 | ```python 269 | feature_columns = [] 270 | 271 | # numeric cols 272 | for header in ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', 'slope', 'ca']: 273 | feature_columns.append(feature_column.numeric_column(header)) 274 | 275 | # bucketized cols 276 | age_buckets = feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) 277 | feature_columns.append(age_buckets) 278 | 279 | # indicator cols 280 | thal = feature_column.categorical_column_with_vocabulary_list( 281 | 'thal', ['fixed', 'normal', 'reversible']) 282 | thal_one_hot = feature_column.indicator_column(thal) 283 | feature_columns.append(thal_one_hot) 284 | 285 | # embedding cols 286 | thal_embedding = feature_column.embedding_column(thal, dimension=8) 287 | feature_columns.append(thal_embedding) 288 | 289 | # crossed cols 290 | crossed_feature = feature_column.crossed_column([age_buckets, thal], hash_bucket_size=1000) 291 | crossed_feature = feature_column.indicator_column(crossed_feature) 292 | feature_columns.append(crossed_feature) 293 | ``` 294 | 295 | ### 创建特征层 296 | 297 | 我已经定义好了特征列,接下来使用 DenseFeatures 层将特征列传入到模型中。 298 | 299 | ```python 300 | feature_layer = tf.keras.layers.DenseFeatures(feature_columns) 301 | ``` 302 | 303 | 之前 batch 大小设置为5,是为了方便示例。接下来batch设置为32,创建新的 input pipeline。 304 | 305 | ## 创建、编译和训练模型 306 | 307 | ```python 308 | model = tf.keras.Sequential([ 309 | feature_layer, 310 | layers.Dense(128, activation='relu'), 311 | layers.Dense(128, activation='relu'), 312 | layers.Dense(1, activation='sigmoid') 313 | ]) 314 | 315 | model.compile(optimizer='adam', 316 | loss='binary_crossentropy', 317 | metrics=['accuracy'], 318 | run_eagerly=True) 319 | 320 | model.fit(train_ds, 321 | validation_data=val_ds, 322 | epochs=5) 323 | ``` 324 | 325 | ```bash 326 | Epoch 1/5 327 | 7/7 
[========] - 1s 142ms/step - loss: 1.3386 - accuracy: 0.6090 - val_loss: 1.0882 - val_accuracy: 0.2857 328 | Epoch 2/5 329 | 7/7 [========] - 0s 31ms/step - loss: 1.4225 - accuracy: 0.3849 - val_loss: 0.9518 - val_accuracy: 0.7347 330 | Epoch 3/5 331 | 7/7 [========] - 0s 32ms/step - loss: 0.6602 - accuracy: 0.7165 - val_loss: 0.7390 - val_accuracy: 0.6327 332 | Epoch 4/5 333 | 7/7 [========] - 0s 30ms/step - loss: 0.7332 - accuracy: 0.6310 - val_loss: 0.6794 - val_accuracy: 0.7143 334 | Epoch 5/5 335 | 7/7 [========] - 0s 31ms/step - loss: 0.5617 - accuracy: 0.7003 - val_loss: 0.5326 - val_accuracy: 0.7143 336 | 337 | 338 | ``` 339 | 340 | ```python 341 | loss, accuracy = model.evaluate(test_ds) 342 | print("Accuracy", accuracy) 343 | ``` 344 | 345 | ```bash 346 | 2/2 [========] - 0s 15ms/step - loss: 0.3907 - accuracy: 0.7869 347 | Accuracy 0.78688526 348 | ``` 349 | 350 | 如果你使用深度学习模型,以及更大、更复杂的数据集,准确率会更高。一般来说,像这样的小数据集,建议使用决策树或者随机森林。这个教程的主要目的不是训练一个准确率高的模型,而是作一个示例:TensorFlow 如何处理结构化的数据。 351 | 352 | 试一试吧。 353 | 354 | 返回[文档首页](https://geektutu.com/post/tf2doc.html) 355 | 356 | > 参考文档:[Classify structured data](https://www.tensorflow.org/beta/tutorials/keras/feature_columns) 357 | 358 | ## 附 推荐 359 | 360 | - [一篇文章入门 Python](https://geektutu.com/post/quick-python.html) -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-structured-data/structured-data.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-structured-data/structured-data.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-text.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: TensorFlow 2 中文文档 - IMDB 文本分类 3 | date: 2019-07-09 00:40:10 4 | description: 
TensorFlow2文档,TensorFlow2.0文档,TensorFlow2.0 TF2.0 TensorFlow 2 / 2.0 官方文档中文版,文本分类 Classify text,示例使用 IMDB 数据集。 5 | tags: 6 | - TensorFlow 2 7 | - 官方文档 8 | keywords: 9 | - Classify text 10 | - TensorFlow2.0 11 | - TensorFlow2文档 12 | - TensorFlow2.0文档 13 | - IMDB datasets 14 | - 文本分类 15 | nav: 简明教程 16 | categories: 17 | - TensorFlow2 文档 18 | image: post/tf2doc-ml-basic-text/imdb-sm.jpg 19 | github: https://github.com/geektutu/tensorflow2-docs-zh 20 | --- 21 | 22 | **TF2.0 TensorFlow 2 / 2.0 中文文档 - 文本分类 Classify text** 23 | 24 | 主要内容:使用迁移学习算法解决一个典型的二分分类(Binary Classification)问题——电影正向评论和负向评论分类。 25 | 26 | 这篇文档使用包含有50,000条电影评论的 IMDB 数据集,25,000用于训练,25,000用于测试。而且训练集和测试集是均衡的,即其中包含同等数量的正向评论和负向评论。 27 | 28 | 代码使用`tf.keras`和`TensorFlow Hub`,TensorFlow Hub 是一个用于迁移学习的平台/库。 29 | 30 | ```python 31 | import numpy as np 32 | import tensorflow as tf # 2.0.0-beta1 33 | import tensorflow_hub as hub # 0.5.0 34 | import tensorflow_datasets as tfds 35 | ``` 36 | 37 | ## 下载 IMDB 数据集 38 | 39 | ![IMDB datasets](tf2doc-ml-basic-text/imdb.jpg) 40 | 41 | IMDB 数据集在`tfds`中是可以直接获取的,调用时会自动下载到你的机器上。 42 | 43 | ```python 44 | # 进一步划分训练集。 45 | # 60%(15,000)用于训练,40%(10,000)用于验证(validation)。 46 | train_validation_split = tfds.Split.TRAIN.subsplit([6, 4]) 47 | 48 | (train_data, validation_data), test_data = tfds.load( 49 | name="imdb_reviews", 50 | split=(train_validation_split, tfds.Split.TEST), 51 | as_supervised=True) 52 | ``` 53 | 54 | ## 数据格式 55 | 56 | 每个例子包含一句电影评论和对应的标签,0或1。0代表负向评论,1代表正向评论。 57 | 58 | 看一下前十条数据。 59 | 60 | ```python 61 | train_examples_batch, train_labels_batch = next(iter(train_data.batch(10))) 62 | train_examples_batch 63 | ``` 64 | 65 | ```python 66 | 77 | ``` 78 | 79 | 前十个标签。 80 | 81 | ```python 82 | train_labels_batch 83 | # 84 | ``` 85 | 86 | ## 搭建模型 87 | 88 | 神经网络需要堆叠多层,架构上需要考虑三点。 89 | 90 | - 文本怎么表示? 91 | - 模型需要多少层? 
92 | - 每一层多少个_隐藏节点_ 93 | 94 | 一种表示文本的方式是将句子映射为向量(embeddings vectors),或者称为文本嵌入(text embedding)。嵌入方法很多,比如我们可以采用最简单的独热编码,假设常用单词总共1000个,给每一个单词一个独热编码。假设每句话由10个单词构成,那么每句话均可以映射到10x1000的二维空间中。那么某句话就可以表示为: 95 | 96 | ```python 97 | # 10x1000的二维向量表示一句话 98 | [[0, 0, 0, 1, 0, ... 0], 99 | [0, 0, ..., 0, 1,... 0], 100 | [0, 0, 0, 0, 0, ... 1], 101 | ... 102 | [0, 0, 0, 0, 0, ... 1]] 103 | ``` 104 | 105 | 文本嵌入的方法很多,要考虑的因素也很多,比如同义词如何处理,维度过高怎么办?知乎上有比较详细的回答:[word embedding的解释](https://www.zhihu.com/question/32275069)。 106 | 107 | 我们可以使用一个预训练(pre-trained)好的文本嵌入模型作为第一层,有3个好处。 108 | 109 | - 不用担心文本处理。 110 | - 能从迁移学习中受益。 111 | - 嵌入后size固定,处理起来简单。 112 | 113 | 接下来从 TensorFlow Hub 中选用的**pre-trained 文本嵌入模型**称为[google/tf2-preview/gnews-swivel-20dim/1](https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1)。 114 | 115 | 接下来创建一个 Keras Layer 使用这个模型将句子转为向量。取前三条评论试一试。注意无论句子的长度如何,最终的嵌入结果均为长度20的一维向量。 116 | 117 | ```python 118 | embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1" 119 | hub_layer = hub.KerasLayer(embedding, input_shape=[], 120 | dtype=tf.string, trainable=True) 121 | hub_layer(train_examples_batch[:3]) 122 | ``` 123 | 124 | ```bash 125 | 130 | ``` 131 | 132 | 接下来,搭建完整的神经网络模型。 133 | 134 | ```python 135 | model = tf.keras.Sequential() 136 | model.add(hub_layer) 137 | model.add(tf.keras.layers.Dense(16, activation='relu')) 138 | model.add(tf.keras.layers.Dense(1, activation='sigmoid')) 139 | 140 | model.summary() 141 | ``` 142 | 143 | ```bash 144 | Model: "sequential" 145 | _________________________________________________________________ 146 | Layer (type) Output Shape Param 147 | ================================================================= 148 | keras_layer (KerasLayer) (None, 20) 400020 149 | _________________________________________________________________ 150 | dense (Dense) (None, 16) 336 151 | _________________________________________________________________ 152 | dense_1 (Dense) (None, 1) 17 153 | 
================================================================= 154 | Total params: 400,373 155 | Trainable params: 400,373 156 | Non-trainable params: 0 157 | _________________________________________________________________ 158 | ``` 159 | 160 | 1. 第一层是 TensorFlow Hub 层,将句子转换为 tokens,然后映射每个 token,并组合成最终的向量。输出的维度是:句子个数 * 嵌入维度(20)。 161 | 2. 接下来是全连接层(Fully-connected, FC),即`Dense`层,16个节点。 162 | 3. 最后一层,也是全连接层,只有一个节点。使用`sigmoid`激活函数,输出值是float,范围0-1,代表可能性/置信度。 163 | 164 | ## 损失函数和优化器 165 | 166 | 模型的输出是一个概率,因此损失函数选用更适合处理概率的`binary_crossentropy`;`mean_squared_error`则更适合处理回归(Regression)问题。 167 | 168 | ```python 169 | model.compile(optimizer='adam', 170 | loss='binary_crossentropy', 171 | metrics=['accuracy']) 172 | ``` 173 | 174 | ## 训练模型 175 | 176 | 共 20 epochs,每个batch 512个数据。即对所有训练数据进行20轮迭代。在训练过程中,将监视模型在包含10,000条数据的验证集上的损失(loss)和正确率(accuracy)。 177 | 178 | ```python 179 | history = model.fit(train_data.shuffle(10000).batch(512), 180 | epochs=20, 181 | validation_data=validation_data.batch(512), 182 | verbose=1) 183 | ``` 184 | 185 | ```bash 186 | Epoch 1/20 187 | 30/30 [========] - 6s 190ms/step - loss: 1.0201 - accuracy: 0.4331 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00 188 | Epoch 2/20 189 | 30/30 [========] - 5s 159ms/step - loss: 0.7801 - accuracy: 0.4677 - val_loss: 0.7407 - val_accuracy: 0.5009 190 | ...... 
191 | Epoch 20/20 192 | 30/30 [========] - 5s 152ms/step - loss: 0.1917 - accuracy: 0.9348 - val_loss: 0.2930 - val_accuracy: 0.8784 193 | ``` 194 | 195 | ## 评估模型 196 | 197 | `evaluate`返回2个值,Loss(误差,越小越好) 和 accuracy。 198 | 199 | ```python 200 | results = model.evaluate(test_data.batch(512), verbose=0) 201 | for name, value in zip(model.metrics_names, results): 202 | print("%s: %.3f" % (name, value)) 203 | # loss: 0.314 204 | # accuracy: 0.866 205 | ``` 206 | 207 | 这个非常基础的模型达到了87%的正确率,复杂一点的模型可以达到95%。 208 | 209 | 返回[文档首页](https://geektutu.com/post/tf2doc.html) 210 | 211 | > 参考地址:[Text classification of movie reviews with Keras and TensorFlow Hub](https://www.tensorflow.org/beta/tutorials/keras/basic_text_classification_with_tfhub) 212 | 213 | ## 附 推荐 214 | 215 | - [一篇文章入门 Python](https://geektutu.com/post/quick-python.html) -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-text/imdb-sm.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-text/imdb-sm.jpg -------------------------------------------------------------------------------- /Beginner-ML-basics/tf2doc-ml-basic-text/imdb.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-ML-basics/tf2doc-ml-basic-text/imdb.jpg -------------------------------------------------------------------------------- /Beginner-Text-and-sequences/tf2doc-rnn-lstm-text.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: TensorFlow 2 中文文档 - RNN LSTM 文本分类 3 | date: 2019-07-22 23:18:10 4 | description: TensorFlow2文档,TensorFlow2.0文档,TensorFlow2.0 TF2.0 TensorFlow 2 / 2.0 官方文档中文版,循环神经网络(Recurrent Neural Network, 
RNN) 和 长短期记忆模型(Long Short-Term Memory,LSTM) 分类 IMDB 。 5 | tags: 6 | - TensorFlow 2 7 | - 官方文档 8 | keywords: 9 | - TensorFlow2.0 10 | - TensorFlow2文档 11 | - TensorFlow2.0文档 12 | nav: 简明教程 13 | categories: 14 | - TensorFlow2 文档 15 | image: post/tf2doc-rnn-lstm-text/rnn_small.jpg 16 | github: https://github.com/geektutu/tensorflow2-docs-zh 17 | --- 18 | 19 | **TF2.0 TensorFlow 2 / 2.0 Chinese documentation: RNN LSTM text classification (Text classification with an RNN)** 20 | 21 | Main topic: classifying the IMDB movie-review dataset with a recurrent neural network (RNN). 22 | 23 | Recurrent neural networks (RNNs) are widely used in natural language processing (NLP). What sets an RNN apart? In an ordinary feed-forward network, the output of each layer is the input of the next, and every input is processed independently of the others, so the network keeps no memory of what came before. In language, however, changing the order of the words in a sentence can change its meaning completely. NLP therefore usually needs a network that handles sequential information well, and an RNN meets that need. 24 | 25 | In an RNN, the state of the hidden layer depends not only on the current input but also on the hidden state from the previous time step. 26 | 27 | ![RNN](tf2doc-rnn-lstm-text/rnn.jpg) 28 | 29 | Long short-term memory (LSTM) is a special kind of RNN, designed mainly to mitigate the vanishing-gradient and exploding-gradient problems that arise when training on long sequences. Put simply, an LSTM performs better than a plain RNN on longer sequences. 30 | 31 | Next, we use the LSTM layer provided by `tf.keras` to build an RNN model and classify the IMDB movie reviews. 32 | 33 | ## Download IMDB 34 | 35 | ```python 36 | # geektutu.com 37 | import matplotlib.pyplot as plt 38 | import tensorflow_datasets as tfds 39 | import tensorflow as tf 40 | from tensorflow.keras import Sequential, layers 41 | 42 | ds, info = tfds.load('imdb_reviews/subwords8k', with_info=True, 43 |                      as_supervised=True) 44 | train_ds, test_ds = ds['train'], ds['test'] 45 | 46 | BUFFER_SIZE, BATCH_SIZE = 10000, 64 47 | train_ds = train_ds.shuffle(BUFFER_SIZE) 48 | train_ds = train_ds.padded_batch(BATCH_SIZE, train_ds.output_shapes) 49 | test_ds = test_ds.padded_batch(BATCH_SIZE, test_ds.output_shapes) 50 | ``` 51 | 52 | ## Text preprocessing 53 | 54 | The data loaded through tfds has already been preprocessed: a tokenizer has vectorized the text, i.e. converted it into a sequence of integers. 55 | 56 | Let's look at how that conversion works. 57 | 58 | ```python 59 | # geektutu.com 60 | tokenizer = info.features['text'].encoder 61 | print ('词汇个数:', tokenizer.vocab_size) 62 | 63 | sample_str = 'welcome to geektutu.com' 64 | tokenized_str = tokenizer.encode(sample_str) 65 | print ('向量化文本:', 
tokenized_str) 66 | 67 | for ts in tokenized_str: 68 |     print (ts, '-->', tokenizer.decode([ts])) 69 | ``` 70 | 71 | As you can see, some words were split into pieces. The tokenizer cannot contain every word that might appear, so a word that is missing from its vocabulary is broken into subwords. There are many ways to preprocess text. For example, in [TensorFlow 2 Chinese documentation - IMDB text classification](https://geektutu.com/post/tf2doc-ml-basic-text.html) we used the pre-trained word-embedding model _google/tf2-preview/gnews-swivel-20dim/1_ to turn review text directly into vectors; the well-known natural language toolkit NLTK also offers powerful text-preprocessing features. 72 | 73 | ```bash 74 | 词汇个数: 8185 75 | 向量化文本: [6351, 7961, 7, 703, 3108, 999, 999, 7975, 2449] 76 | 6351 --> welcome 77 | 7961 --> 78 | 7 --> to 79 | 703 --> ge 80 | 3108 --> ek 81 | 999 --> tu 82 | 999 --> tu 83 | 7975 --> . 84 | 2449 --> com 85 | ``` 86 | 87 | ## Build the RNN model 88 | 89 | With `tf.keras` it is very easy to build an LSTM layer. Here we start with a single layer. 90 | 91 | If you look closely, you will notice that the first layer is `tf.keras.layers.Embedding`. Why do we need it? As the preprocessing experiment above shows, IMDB text is encoded as the indices of tokens in the tokenizer's vocabulary. Treated as one-hot vectors, these indices are very high-dimensional (`tokenizer.vocab_size`) and very sparse. The Embedding layer maps each token to a dense vector of fixed size 64. The mapping is trainable: after enough training, words with similar meanings tend to get similar vectors. 92 | 93 | We wrap the LSTM layer in a layer wrapper, `tf.keras.layers.Bidirectional`. This is the bidirectional wrapper for RNNs: it runs the sequence through the layer both forward and backward. 94 | 95 | ```python 96 | # geektutu.com 97 | model = Sequential([ 98 |     layers.Embedding(tokenizer.vocab_size, 64), 99 |     layers.Bidirectional(layers.LSTM(64)), 100 |     layers.Dense(64, activation='relu'), 101 |     layers.Dense(1, activation='sigmoid') 102 | ]) 103 | model.compile(loss='binary_crossentropy', optimizer='adam', 104 |               metrics=['accuracy']) 105 | history1 = model.fit(train_ds, epochs=3, validation_data=test_ds) 106 | loss, acc = model.evaluate(test_ds) 107 | print('准确率:', acc) # 0.81039 108 | ``` 109 | 110 | The model ends up at about 81% accuracy. Let's visualize the training process with _matplotlib_. 111 | 112 | ```python 113 | # geektutu.com 114 | # Use a CJK-capable font so the Chinese legend text renders correctly 115 | plt.rcParams['font.sans-serif'] = ['SimHei'] 116 | plt.rcParams['axes.unicode_minus'] = False 117 | plt.rcParams['font.size'] = 20 118 | 119 | def plot_graphs(history, name): 120 |     plt.plot(history.history[name]) 121 |     plt.plot(history.history['val_' + name])  # Keras stores validation metrics under the 'val_' prefix 122 |     plt.xlabel("Epochs") 123 | 
plt.ylabel(name) 124 |     plt.legend([name, '验证集 - ' + name]) 125 |     plt.show() 126 | 127 | plot_graphs(history1, 'accuracy') 128 | ``` 129 | 130 | ![rnn acc1](tf2doc-rnn-lstm-text/acc1.jpg) 131 | 132 | ## Add more LSTM layers 133 | 134 | ```python 135 | # geektutu.com 136 | model = Sequential([ 137 |     layers.Embedding(tokenizer.vocab_size, 64), 138 |     layers.Bidirectional(layers.LSTM(64, return_sequences=True)), 139 |     layers.Bidirectional(layers.LSTM(32)), 140 |     layers.Dense(64, activation='relu'), 141 |     layers.Dense(1, activation='sigmoid') 142 | ]) 143 | model.compile(loss='binary_crossentropy', optimizer='adam', 144 |               metrics=['accuracy']) 145 | history = model.fit(train_ds, epochs=3, validation_data=test_ds) 146 | loss, acc = model.evaluate(test_ds) 147 | print('准确率:', acc) # 0.83096 148 | ``` 149 | 150 | This time we stack two LSTM layers, and accuracy reaches about 83%. The first LSTM sets `return_sequences=True` so that it passes its output at every time step, not just the last one, to the second LSTM. 151 | 152 | ```python 153 | # geektutu.com 154 | plot_graphs(history, 'accuracy') 155 | ``` 156 | 157 | ![rnn acc2](tf2doc-rnn-lstm-text/acc2.jpg) 158 | 159 | Back to the [documentation home page](https://geektutu.com/post/tf2doc.html) 160 | 161 | > Full code: [GitHub - rnn-text.ipynb](https://github.com/geektutu/tensorflow2-docs-zh/tree/master/code) 162 | > Reference: [Text classification with an RNN](https://www.tensorflow.org/beta/tutorials/text/text_classification_rnn) 163 | 164 | ## Appendix: recommended reading 165 | 166 | - [Getting started with Python in one article](https://geektutu.com/post/quick-python.html) -------------------------------------------------------------------------------- /Beginner-Text-and-sequences/tf2doc-rnn-lstm-text/acc1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-Text-and-sequences/tf2doc-rnn-lstm-text/acc1.jpg -------------------------------------------------------------------------------- /Beginner-Text-and-sequences/tf2doc-rnn-lstm-text/acc2.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-Text-and-sequences/tf2doc-rnn-lstm-text/acc2.jpg -------------------------------------------------------------------------------- /Beginner-Text-and-sequences/tf2doc-rnn-lstm-text/rnn.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-Text-and-sequences/tf2doc-rnn-lstm-text/rnn.jpg -------------------------------------------------------------------------------- /Beginner-Text-and-sequences/tf2doc-rnn-lstm-text/rnn_small.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/Beginner-Text-and-sequences/tf2doc-rnn-lstm-text/rnn_small.jpg -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TensorFlow 2 / 2.0 Official Documentation, Chinese Edition 2 | 3 | ![TensorFlow 2.0](tf2doc/tf.jpg) 4 | 5 | ## Related links 6 | 7 | - [Getting started with Python in one article](https://geektutu.com/post/quick-python.html) 8 | - [Machine learning interview questions](https://geektutu.com/post/qa-ml.html), [GitHub](https://github.com/geektutu/interview-questions) 9 | - [TensorFlow 2.0 Chinese documentation](https://geektutu.com/post/tf2doc.html), [GitHub](https://github.com/geektutu/tensorflow2-docs-zh) 10 | - [TensorFlow 2.0 image recognition and reinforcement learning in practice](https://geektutu.com/post/tensorflow2-mnist-cnn.html), [GitHub](https://github.com/geektutu/tensorflow-tutorial-samples) 11 | ## Contents (continuously updated) 12 | 13 | ### Basics - ML basics 14 | 15 | 1. [Classify images](https://geektutu.com/post/tf2doc-ml-basic-image.html) 16 | 2. [Classify text](https://geektutu.com/post/tf2doc-ml-basic-text.html) 17 | 3. 
[Classify structured data](https://geektutu.com/post/tf2doc-ml-basic-structured-data.html) 18 | 4. [Regression](https://geektutu.com/post/tf2doc-ml-basic-regression.html) 19 | 5. [Overfitting and underfitting](https://geektutu.com/post/tf2doc-ml-basic-overfit.html) 20 | 6. [Save and restore models](https://geektutu.com/post/tf2doc-ml-basic-save-model.html) 21 | 22 | ### Basics - Image classification 23 | 24 | 1. [Convolutional Neural Networks](https://geektutu.com/post/tf2doc-cnn-cifar10.html) 25 | 2. [Transfer learning with TFHub (TensorFlow Hub with Keras)](https://geektutu.com/post/tf2doc-tfhub-image-tl.html) 26 | 3. Transfer Learning Using Pretrained ConvNets 27 | 28 | ### Basics - Text classification 29 | 30 | 1. [Text classification with an RNN](https://geektutu.com/post/tf2doc-rnn-lstm-text.html) 31 | 32 | ### Advanced - Customization 33 | 34 | 1. Tensors and operations 35 | 2. Custom layers 36 | 3. Automatic differentiation 37 | 4. Custom training: walkthrough 38 | 5. TF function and AutoGraph -------------------------------------------------------------------------------- /code/cnn-cifar-10.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## 下载数据集" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 46, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "# geektutu.com\n", 17 | "import matplotlib.pyplot as plt\n", 18 | "import tensorflow as tf\n", 19 | "from tensorflow.keras import layers, datasets, models\n", 20 | "\n", 21 | "(train_x, train_y), (test_x, test_y) = datasets.cifar10.load_data()" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 47, 27 | "metadata": {}, 28 | "outputs": [ 29 | { 30 | "data": { 31 | "image/png": "\n", 32 | "text/plain": [ 33 | "
" 34 | ] 35 | }, 36 | "metadata": { 37 | "needs_background": "light" 38 | }, 39 | "output_type": "display_data" 40 | } 41 | ], 42 | "source": [ 43 | "# geektutu.com\n", 44 | "plt.figure(figsize=(5, 3))\n", 45 | "plt.subplots_adjust(hspace=0.1)\n", 46 | "for n in range(15):\n", 47 | " plt.subplot(3, 5, n+1)\n", 48 | " plt.imshow(train_x[n])\n", 49 | " plt.axis('off')\n", 50 | "_ = plt.suptitle(\"geektutu.com CIFAR-10 Example\")" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": 48, 56 | "metadata": {}, 57 | "outputs": [ 58 | { 59 | "name": "stdout", 60 | "output_type": "stream", 61 | "text": [ 62 | "train_x shape: (50000, 32, 32, 3) test_x shape: (10000, 32, 32, 3)\n" 63 | ] 64 | } 65 | ], 66 | "source": [ 67 | "# geektutu.com\n", 68 | "train_x, test_x = train_x / 255.0, test_x / 255.0\n", 69 | "print('train_x shape:', train_x.shape, 'test_x shape:', test_x.shape)\n", 70 | "# (50000, 32, 32, 3), (10000, 32, 32, 3)" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "## 卷积层" 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": 52, 83 | "metadata": {}, 84 | "outputs": [ 85 | { 86 | "name": "stdout", 87 | "output_type": "stream", 88 | "text": [ 89 | "Model: \"sequential_1\"\n", 90 | "_________________________________________________________________\n", 91 | "Layer (type) Output Shape Param # \n", 92 | "=================================================================\n", 93 | "conv2d_3 (Conv2D) (None, 30, 30, 32) 896 \n", 94 | "_________________________________________________________________\n", 95 | "max_pooling2d_2 (MaxPooling2 (None, 15, 15, 32) 0 \n", 96 | "_________________________________________________________________\n", 97 | "conv2d_4 (Conv2D) (None, 13, 13, 64) 18496 \n", 98 | "_________________________________________________________________\n", 99 | "max_pooling2d_3 (MaxPooling2 (None, 6, 6, 64) 0 \n", 100 | 
"_________________________________________________________________\n", 101 | "conv2d_5 (Conv2D) (None, 4, 4, 64) 36928 \n", 102 | "=================================================================\n", 103 | "Total params: 56,320\n", 104 | "Trainable params: 56,320\n", 105 | "Non-trainable params: 0\n", 106 | "_________________________________________________________________\n" 107 | ] 108 | } 109 | ], 110 | "source": [ 111 | "# geektutu.com\n", 112 | "model = models.Sequential()\n", 113 | "model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))\n", 114 | "model.add(layers.MaxPooling2D((2, 2)))\n", 115 | "model.add(layers.Conv2D(64, (3, 3), activation='relu'))\n", 116 | "model.add(layers.MaxPooling2D((2, 2)))\n", 117 | "model.add(layers.Conv2D(64, (3, 3), activation='relu'))\n", 118 | "model.summary()" 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "metadata": {}, 124 | "source": [ 125 | "## 全连接层" 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": 53, 131 | "metadata": {}, 132 | "outputs": [ 133 | { 134 | "name": "stdout", 135 | "output_type": "stream", 136 | "text": [ 137 | "Model: \"sequential_1\"\n", 138 | "_________________________________________________________________\n", 139 | "Layer (type) Output Shape Param # \n", 140 | "=================================================================\n", 141 | "conv2d_3 (Conv2D) (None, 30, 30, 32) 896 \n", 142 | "_________________________________________________________________\n", 143 | "max_pooling2d_2 (MaxPooling2 (None, 15, 15, 32) 0 \n", 144 | "_________________________________________________________________\n", 145 | "conv2d_4 (Conv2D) (None, 13, 13, 64) 18496 \n", 146 | "_________________________________________________________________\n", 147 | "max_pooling2d_3 (MaxPooling2 (None, 6, 6, 64) 0 \n", 148 | "_________________________________________________________________\n", 149 | "conv2d_5 (Conv2D) (None, 4, 4, 64) 36928 \n", 150 | 
"_________________________________________________________________\n", 151 | "flatten_1 (Flatten) (None, 1024) 0 \n", 152 | "_________________________________________________________________\n", 153 | "dense_2 (Dense) (None, 64) 65600 \n", 154 | "_________________________________________________________________\n", 155 | "dense_3 (Dense) (None, 10) 650 \n", 156 | "=================================================================\n", 157 | "Total params: 122,570\n", 158 | "Trainable params: 122,570\n", 159 | "Non-trainable params: 0\n", 160 | "_________________________________________________________________\n" 161 | ] 162 | } 163 | ], 164 | "source": [ 165 | "# geektutu.com\n", 166 | "model.add(layers.Flatten())\n", 167 | "model.add(layers.Dense(64, activation='relu'))\n", 168 | "model.add(layers.Dense(10, activation='softmax'))\n", 169 | "model.summary()" 170 | ] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": {}, 175 | "source": [ 176 | "## 编译训练模型" 177 | ] 178 | }, 179 | { 180 | "cell_type": "code", 181 | "execution_count": 54, 182 | "metadata": {}, 183 | "outputs": [ 184 | { 185 | "name": "stderr", 186 | "output_type": "stream", 187 | "text": [ 188 | "WARNING: Logging before flag parsing goes to stderr.\n", 189 | "W0720 16:46:59.197520 140735530943360 deprecation.py:323] From /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/ops/math_grad.py:1250: add_dispatch_support..wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.\n", 190 | "Instructions for updating:\n", 191 | "Use tf.where in 2.0, which has the same broadcast rule as np.where\n" 192 | ] 193 | }, 194 | { 195 | "name": "stdout", 196 | "output_type": "stream", 197 | "text": [ 198 | "Train on 50000 samples\n", 199 | "Epoch 1/5\n", 200 | "50000/50000 [==============================] - 20s 394us/sample - loss: 1.4972 - accuracy: 0.4585\n", 201 | "Epoch 2/5\n", 202 | "50000/50000 
[==============================] - 19s 386us/sample - loss: 1.1277 - accuracy: 0.6025\n", 203 | "Epoch 3/5\n", 204 | "50000/50000 [==============================] - 20s 398us/sample - loss: 0.9836 - accuracy: 0.6541\n", 205 | "Epoch 4/5\n", 206 | "50000/50000 [==============================] - 20s 404us/sample - loss: 0.8808 - accuracy: 0.6917\n", 207 | "Epoch 5/5\n", 208 | "50000/50000 [==============================] - 20s 406us/sample - loss: 0.8040 - accuracy: 0.7179\n" 209 | ] 210 | }, 211 | { 212 | "data": { 213 | "text/plain": [ 214 | "" 215 | ] 216 | }, 217 | "execution_count": 54, 218 | "metadata": {}, 219 | "output_type": "execute_result" 220 | } 221 | ], 222 | "source": [ 223 | "model.compile(optimizer='adam',\n", 224 | " loss='sparse_categorical_crossentropy',\n", 225 | " metrics=['accuracy'])\n", 226 | "\n", 227 | "model.fit(train_x, train_y, epochs=5)" 228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "metadata": {}, 233 | "source": [ 234 | "## 评估模型" 235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": 58, 240 | "metadata": {}, 241 | "outputs": [ 242 | { 243 | "name": "stdout", 244 | "output_type": "stream", 245 | "text": [ 246 | "10000/10000 [==============================] - 1s 122us/sample - loss: 0.9077 - accuracy: 0.6830\n" 247 | ] 248 | }, 249 | { 250 | "data": { 251 | "text/plain": [ 252 | "0.683" 253 | ] 254 | }, 255 | "execution_count": 58, 256 | "metadata": {}, 257 | "output_type": "execute_result" 258 | } 259 | ], 260 | "source": [ 261 | "# geektutu.com\n", 262 | "test_loss, test_acc = model.evaluate(test_x, test_y)\n", 263 | "test_acc # 0.683" 264 | ] 265 | } 266 | ], 267 | "metadata": { 268 | "kernelspec": { 269 | "display_name": "Python 3", 270 | "language": "python", 271 | "name": "python3" 272 | }, 273 | "language_info": { 274 | "codemirror_mode": { 275 | "name": "ipython", 276 | "version": 3 277 | }, 278 | "file_extension": ".py", 279 | "mimetype": "text/x-python", 280 | "name": "python", 281 | 
"nbconvert_exporter": "python", 282 | "pygments_lexer": "ipython3", 283 | "version": "3.7.0" 284 | } 285 | }, 286 | "nbformat": 4, 287 | "nbformat_minor": 2 288 | } 289 | -------------------------------------------------------------------------------- /code/rnn-text.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## 下载数据集" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "import matplotlib.pyplot as plt\n", 17 | "import tensorflow_datasets as tfds\n", 18 | "import tensorflow as tf\n", 19 | "from tensorflow.keras import Sequential, layers\n", 20 | "\n", 21 | "ds, info = tfds.load('imdb_reviews/subwords8k', with_info=True,\n", 22 | " as_supervised=True)\n", 23 | "train_ds, test_ds = ds['train'], ds['test']\n", 24 | "\n", 25 | "BUFFER_SIZE, BATCH_SIZE = 10000, 64\n", 26 | "train_ds = train_ds.shuffle(BUFFER_SIZE)\n", 27 | "train_ds = train_ds.padded_batch(BATCH_SIZE, train_ds.output_shapes)\n", 28 | "test_ds = test_ds.padded_batch(BATCH_SIZE, test_ds.output_shapes)" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "metadata": {}, 34 | "source": [ 35 | "## 文本预处理" 36 | ] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": 2, 41 | "metadata": {}, 42 | "outputs": [ 43 | { 44 | "name": "stdout", 45 | "output_type": "stream", 46 | "text": [ 47 | "词汇个数: 8185\n", 48 | "向量化文本: [6351, 7961, 7, 703, 3108, 999, 999, 7975, 2449]\n", 49 | "6351 --> welcome\n", 50 | "7961 --> \n", 51 | "7 --> to \n", 52 | "703 --> ge\n", 53 | "3108 --> ek\n", 54 | "999 --> tu\n", 55 | "999 --> tu\n", 56 | "7975 --> .\n", 57 | "2449 --> com\n" 58 | ] 59 | } 60 | ], 61 | "source": [ 62 | "# geektutu.com\n", 63 | "tokenizer = info.features['text'].encoder\n", 64 | "print ('词汇个数:', tokenizer.vocab_size)\n", 65 | "\n", 66 | "sample_str = 'welcome to 
geektutu.com'\n", 67 | "tokenized_str = tokenizer.encode(sample_str)\n", 68 | "print ('向量化文本:', tokenized_str)\n", 69 | "\n", 70 | "for ts in tokenized_str:\n", 71 | " print (ts, '-->', tokenizer.decode([ts]))" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "## 搭建 RNN 模型" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [ 86 | { 87 | "name": "stdout", 88 | "output_type": "stream", 89 | "text": [ 90 | "Epoch 1/10\n", 91 | " 45/Unknown - 132s 3s/step - loss: 0.6927 - accuracy: 0.5083" 92 | ] 93 | } 94 | ], 95 | "source": [ 96 | "model = Sequential([\n", 97 | " layers.Embedding(tokenizer.vocab_size, 64),\n", 98 | " layers.Bidirectional(layers.LSTM(64)),\n", 99 | " layers.Dense(64, activation='relu'),\n", 100 | " layers.Dense(1, activation='sigmoid')\n", 101 | "])\n", 102 | "model.compile(loss='binary_crossentropy', optimizer='adam',\n", 103 | " metrics=['accuracy'])\n", 104 | "history1 = model.fit(train_ds, epochs=10, validation_data=test_ds)\n", 105 | "loss, acc = model.evaluate(test_ds)\n", 106 | "print('准确率:', acc)" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 7, 112 | "metadata": {}, 113 | "outputs": [ 114 | { 115 | "data": { 116 | "image/png": "\n", 117 | "text/plain": [ 118 | "
" 119 | ] 120 | }, 121 | "metadata": { 122 | "needs_background": "light" 123 | }, 124 | "output_type": "display_data" 125 | } 126 | ], 127 | "source": [ 128 | "# geektutu.com\n", 129 | "# 解决中文乱码问题\n", 130 | "plt.rcParams['font.sans-serif'] = ['SimHei']\n", 131 | "plt.rcParams['axes.unicode_minus'] = False\n", 132 | "plt.rcParams['font.size'] = 20\n", 133 | "\n", 134 | "def plot_graphs(history, name):\n", 135 | " plt.plot(history.history[name])\n", 136 | " plt.plot(history.history['val_'+ name])\n", 137 | " plt.xlabel(\"Epochs\")\n", 138 | " plt.ylabel(name)\n", 139 | " plt.legend([name, '验证集 - ' + name])\n", 140 | " plt.show()\n", 141 | "\n", 142 | "plot_graphs(history1, 'accuracy')" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": {}, 148 | "source": [ 149 | "## 添加更多 LSTM 层" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": 5, 155 | "metadata": {}, 156 | "outputs": [ 157 | { 158 | "name": "stdout", 159 | "output_type": "stream", 160 | "text": [ 161 | "Epoch 1/10\n", 162 | "391/391 [==============================] - 1811s 5s/step - loss: 0.5664 - accuracy: 0.6991 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00\n", 163 | "Epoch 2/10\n", 164 | "391/391 [==============================] - 1513s 4s/step - loss: 0.3780 - accuracy: 0.8415 - val_loss: 0.4220 - val_accuracy: 0.8242\n", 165 | "Epoch 3/10\n", 166 | "391/391 [==============================] - 1461s 4s/step - loss: 0.3037 - accuracy: 0.8775 - val_loss: 0.4419 - val_accuracy: 0.8112\n", 167 | "Epoch 4/10\n", 168 | "391/391 [==============================] - 1450s 4s/step - loss: 0.2260 - accuracy: 0.9142 - val_loss: 0.4181 - val_accuracy: 0.8461\n", 169 | "Epoch 5/10\n", 170 | "391/391 [==============================] - 1434s 4s/step - loss: 0.1900 - accuracy: 0.9297 - val_loss: 0.4600 - val_accuracy: 0.8032\n", 171 | "Epoch 6/10\n", 172 | "391/391 [==============================] - 1436s 4s/step - loss: 0.3483 - accuracy: 0.8509 - val_loss: 0.4899 - 
val_accuracy: 0.7797\n", 173 | "Epoch 7/10\n", 174 | "391/391 [==============================] - 1425s 4s/step - loss: 0.3232 - accuracy: 0.8670 - val_loss: 0.4060 - val_accuracy: 0.8207\n", 175 | "Epoch 8/10\n", 176 | "391/391 [==============================] - 1437s 4s/step - loss: 0.2113 - accuracy: 0.9182 - val_loss: 0.4142 - val_accuracy: 0.8446\n", 177 | "Epoch 9/10\n", 178 | "391/391 [==============================] - 1445s 4s/step - loss: 0.1486 - accuracy: 0.9465 - val_loss: 0.4460 - val_accuracy: 0.8392\n", 179 | "Epoch 10/10\n", 180 | "391/391 [==============================] - 1431s 4s/step - loss: 0.0923 - accuracy: 0.9696 - val_loss: 0.5298 - val_accuracy: 0.8310\n", 181 | " 391/Unknown - 435s 1s/step - loss: 0.5298 - accuracy: 0.8310准确率: 0.83096\n" 182 | ] 183 | } 184 | ], 185 | "source": [ 186 | "model = Sequential([\n", 187 | " layers.Embedding(tokenizer.vocab_size, 64),\n", 188 | " layers.Bidirectional(layers.LSTM(64, return_sequences=True)),\n", 189 | " layers.Bidirectional(layers.LSTM(32)),\n", 190 | " layers.Dense(64, activation='relu'),\n", 191 | " layers.Dense(1, activation='sigmoid')\n", 192 | "])\n", 193 | "model.compile(loss='binary_crossentropy', optimizer='adam',\n", 194 | " metrics=['accuracy'])\n", 195 | "history = model.fit(train_ds, epochs=10, validation_data=test_ds)\n", 196 | "loss, acc = model.evaluate(test_ds)\n", 197 | "print('准确率:', acc)" 198 | ] 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": 8, 203 | "metadata": {}, 204 | "outputs": [ 205 | { 206 | "data": { 207 | "image/png": "\n", 208 | "text/plain": [ 209 | "
" 210 | ] 211 | }, 212 | "metadata": { 213 | "needs_background": "light" 214 | }, 215 | "output_type": "display_data" 216 | } 217 | ], 218 | "source": [ 219 | "plot_graphs(history, 'accuracy')" 220 | ] 221 | } 222 | ], 223 | "metadata": { 224 | "kernelspec": { 225 | "display_name": "Python 3", 226 | "language": "python", 227 | "name": "python3" 228 | }, 229 | "language_info": { 230 | "codemirror_mode": { 231 | "name": "ipython", 232 | "version": 3 233 | }, 234 | "file_extension": ".py", 235 | "mimetype": "text/x-python", 236 | "name": "python", 237 | "nbconvert_exporter": "python", 238 | "pygments_lexer": "ipython3", 239 | "version": "3.7.0" 240 | } 241 | }, 242 | "nbformat": 4, 243 | "nbformat_minor": 2 244 | } 245 | -------------------------------------------------------------------------------- /code/save_restore_model.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## 准备数据" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "import warnings\n", 17 | "warnings.simplefilter(\"ignore\")" 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": 2, 23 | "metadata": {}, 24 | "outputs": [], 25 | "source": [ 26 | "import tensorflow as tf\n", 27 | "from tensorflow import keras\n", 28 | "from tensorflow.keras import datasets, layers, models, callbacks\n", 29 | "from tensorflow.keras.datasets import mnist\n", 30 | "\n", 31 | "import os\n", 32 | "file_path = os.path.abspath('./mnist.npz')\n", 33 | "\n", 34 | "(train_x, train_y), (test_x, test_y) = datasets.mnist.load_data(path=file_path)\n", 35 | "train_y, test_y = train_y[:1000], test_y[:1000]\n", 36 | "train_x = train_x[:1000].reshape(-1, 28 * 28) / 255.0\n", 37 | "test_x = test_x[:1000].reshape(-1, 28 * 28) / 255.0" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 
43 | "source": [ 44 | "## 搭建模型" 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 3, 50 | "metadata": {}, 51 | "outputs": [], 52 | "source": [ 53 | "def create_model():\n", 54 | " model = models.Sequential([\n", 55 | " layers.Dense(512, activation='relu', input_shape=(784,)),\n", 56 | " layers.Dropout(0.2),\n", 57 | " layers.Dense(10, activation='softmax')\n", 58 | " ])\n", 59 | "\n", 60 | " model.compile(optimizer='adam', metrics=['accuracy'],\n", 61 | " loss='sparse_categorical_crossentropy')\n", 62 | "\n", 63 | " return model\n", 64 | "\n", 65 | "def evaluate(target_model):\n", 66 | " _, acc = target_model.evaluate(test_x, test_y)\n", 67 | " print(\"Restore model, accuracy: {:5.2f}%\".format(100*acc))" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "## 自动保存 checkpoints" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": 4, 80 | "metadata": {}, 81 | "outputs": [ 82 | { 83 | "name": "stderr", 84 | "output_type": "stream", 85 | "text": [ 86 | "WARNING: Logging before flag parsing goes to stderr.\n", 87 | "W0713 00:06:20.997914 140735530943360 callbacks.py:859] `period` argument is deprecated. 
Please use `save_freq` to specify the frequency in number of samples seen.\n", 88 | "W0713 00:06:21.173481 140735530943360 deprecation.py:323] From /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/ops/math_grad.py:1250: add_dispatch_support..wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.\n", 89 | "Instructions for updating:\n", 90 | "Use tf.where in 2.0, which has the same broadcast rule as np.where\n" 91 | ] 92 | }, 93 | { 94 | "name": "stdout", 95 | "output_type": "stream", 96 | "text": [ 97 | "\n", 98 | "Epoch 00010: saving model to training_2/cp-0010.ckpt\n", 99 | "\n", 100 | "Epoch 00020: saving model to training_2/cp-0020.ckpt\n", 101 | "\n", 102 | "Epoch 00030: saving model to training_2/cp-0030.ckpt\n", 103 | "\n", 104 | "Epoch 00040: saving model to training_2/cp-0040.ckpt\n", 105 | "\n", 106 | "Epoch 00050: saving model to training_2/cp-0050.ckpt\n" 107 | ] 108 | }, 109 | { 110 | "data": { 111 | "text/plain": [ 112 | "" 113 | ] 114 | }, 115 | "execution_count": 4, 116 | "metadata": {}, 117 | "output_type": "execute_result" 118 | } 119 | ], 120 | "source": [ 121 | "# 存储模型的文件名,语法与 str.format 一致\n", 122 | "# period=10:每 10 epochs 保存一次\n", 123 | "checkpoint_path = \"training_2/cp-{epoch:04d}.ckpt\"\n", 124 | "checkpoint_dir = os.path.dirname(checkpoint_path)\n", 125 | "cp_callback = callbacks.ModelCheckpoint(\n", 126 | " checkpoint_path, verbose=1, save_weights_only=True, period=10)\n", 127 | "\n", 128 | "model = create_model()\n", 129 | "model.save_weights(checkpoint_path.format(epoch=0))\n", 130 | "model.fit(train_x, train_y, epochs=50, callbacks=[cp_callback],\n", 131 | " validation_data=(test_x, test_y), verbose=0)" 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 5, 137 | "metadata": {}, 138 | "outputs": [ 139 | { 140 | "name": "stdout", 141 | "output_type": "stream", 142 | "text": [ 143 | "1000/1000 
[==============================] - 0s 94us/sample - loss: 0.5232 - accuracy: 0.8720\n", 144 | "Restore model, accuracy: 87.20%\n" 145 | ] 146 | } 147 | ], 148 | "source": [ 149 | "latest = tf.train.latest_checkpoint(checkpoint_dir)\n", 150 | "# 'training_2/cp-0050.ckpt'\n", 151 | "model = create_model()\n", 152 | "model.load_weights(latest)\n", 153 | "evaluate(model)" 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "metadata": {}, 159 | "source": [ 160 | "## 手动保存权重" 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": 6, 166 | "metadata": {}, 167 | "outputs": [ 168 | { 169 | "name": "stdout", 170 | "output_type": "stream", 171 | "text": [ 172 | "1000/1000 [==============================] - 0s 90us/sample - loss: 0.5232 - accuracy: 0.8720\n", 173 | "Restore model, accuracy: 87.20%\n" 174 | ] 175 | } 176 | ], 177 | "source": [ 178 | "# 手动保存权重\n", 179 | "model.save_weights('./checkpoints/mannul_checkpoint')\n", 180 | "model = create_model()\n", 181 | "model.load_weights('./checkpoints/mannul_checkpoint')\n", 182 | "evaluate(model)" 183 | ] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": {}, 188 | "source": [ 189 | "## 保存整个模型" 190 | ] 191 | }, 192 | { 193 | "cell_type": "markdown", 194 | "metadata": {}, 195 | "source": [ 196 | "### HDF5" 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": 7, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [ 205 | "model.save('my_model.h5')" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": 8, 211 | "metadata": {}, 212 | "outputs": [ 213 | { 214 | "name": "stderr", 215 | "output_type": "stream", 216 | "text": [ 217 | "W0713 00:06:28.440529 140735530943360 hdf5_format.py:192] Error in loading the saved optimizer state. 
As a result, your model is starting with a freshly initialized optimizer.\n" 218 | ] 219 | }, 220 | { 221 | "name": "stdout", 222 | "output_type": "stream", 223 | "text": [ 224 | "1000/1000 [==============================] - 0s 91us/sample - loss: 0.5232 - accuracy: 0.8720\n", 225 | "Restore model, accuracy: 87.20%\n" 226 | ] 227 | } 228 | ], 229 | "source": [ 230 | "new_model = models.load_model('my_model.h5')\n", 231 | "evaluate(new_model)" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "metadata": {}, 237 | "source": [ 238 | "### saved_model" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": 9, 244 | "metadata": {}, 245 | "outputs": [ 246 | { 247 | "name": "stderr", 248 | "output_type": "stream", 249 | "text": [ 250 | "W0713 00:06:29.094913 140735530943360 deprecation.py:323] From /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:253: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.\n", 251 | "Instructions for updating:\n", 252 | "This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.\n", 253 | "W0713 00:06:29.096133 140735530943360 export_utils.py:182] Export includes no default signature!\n", 254 | "W0713 00:06:29.277875 140735530943360 util.py:244] Unresolved object in checkpoint: (root).optimizer.iter\n", 255 | "W0713 00:06:29.278687 140735530943360 util.py:244] Unresolved object in checkpoint: (root).optimizer.beta_1\n", 256 | "W0713 00:06:29.279356 140735530943360 util.py:244] Unresolved object in checkpoint: (root).optimizer.beta_2\n", 257 | "W0713 00:06:29.280627 140735530943360 util.py:244] Unresolved object in checkpoint: (root).optimizer.decay\n", 258 | "W0713 00:06:29.281178 140735530943360 util.py:244] Unresolved object in checkpoint: 
(root).optimizer.learning_rate\n",
259 |       "W0713 00:06:29.281950 140735530943360 util.py:252] A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/alpha/guide/checkpoints#loading_mechanics for details.\n",
260 |       "W0713 00:06:29.402854 140735530943360 export_utils.py:182] Export includes no default signature!\n"
261 |      ]
262 |     },
263 |     {
264 |      "data": {
265 |       "text/plain": [
266 |        "(1000, 10)"
267 |       ]
268 |      },
269 |      "execution_count": 9,
270 |      "metadata": {},
271 |      "output_type": "execute_result"
272 |     }
273 |    ],
274 |    "source": [
275 |     "import time\n",
276 |     "saved_model_path = \"./saved_models/{}\".format(int(time.time()))\n",
277 |     "tf.keras.experimental.export_saved_model(model, saved_model_path)\n",
278 |     "new_model = tf.keras.experimental.load_from_saved_model(saved_model_path)\n",
279 |     "new_model.predict(test_x).shape"
280 |    ]
281 |   },
282 |   {
283 |    "cell_type": "code",
284 |    "execution_count": 10,
285 |    "metadata": {},
286 |    "outputs": [
287 |     {
288 |      "name": "stdout",
289 |      "output_type": "stream",
290 |      "text": [
291 |       "1000/1000 [==============================] - 0s 95us/sample - loss: 0.5232 - accuracy: 0.8720\n",
292 |       "Restore model, accuracy: 87.20%\n"
293 |      ]
294 |     }
295 |    ],
296 |    "source": [
297 |     "new_model.compile(optimizer=model.optimizer,\n",
298 |     "                  loss='sparse_categorical_crossentropy',\n",
299 |     "                  metrics=['accuracy'])\n",
300 |     "evaluate(new_model)"
301 |    ]
302 |   }
303 |  ],
304 |  "metadata": {
305 |   "kernelspec": {
306 |    "display_name": "Python 3",
307 |    "language": "python",
308 |    "name": "python3"
309 |   },
310 |   "language_info": {
311 |    "codemirror_mode": {
312 |     "name": "ipython",
313 |     "version": 3
314 |    },
315 |
"file_extension": ".py",
316 |    "mimetype": "text/x-python",
317 |    "name": "python",
318 |    "nbconvert_exporter": "python",
319 |    "pygments_lexer": "ipython3",
320 |    "version": "3.7.0"
321 |   }
322 |  },
323 |  "nbformat": 4,
324 |  "nbformat_minor": 2
325 | }
326 |
--------------------------------------------------------------------------------
/tf2doc.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: TensorFlow 2 / 2.0 Chinese Documentation
3 | date: 2019-07-09 00:10:10
4 | description: TensorFlow2 docs, TensorFlow2.0 docs, TensorFlow2.0 TF2.0 TensorFlow 2 / 2.0 official documentation, Chinese edition (abridged and edited), preface introducing the structure of the whole documentation.
5 | tags:
6 | - TensorFlow 2
7 | - Official documentation
8 | keywords:
9 | - TensorFlow2 docs
10 | - TensorFlow2.0 docs
11 | - TensorFlow2.0
12 | - TF2.0
13 | nav: Concise Tutorials
14 | categories:
15 | - TensorFlow2 Docs
16 | top: 3
17 | image: post/tf2doc/tf.jpg
18 | github: https://github.com/geektutu/tensorflow2-docs-zh
19 | ---
20 |
21 | ![TensorFlow 2.0](tf2doc/tf.jpg)
22 |
23 | If you are not yet familiar with Python, read [Getting started with Python in one article](https://geektutu.com/post/quick-python.html) first.
24 |
25 | ## Where to find the docs
26 |
27 | - Docs: [TensorFlow 2 / 2.0 Chinese Documentation](https://geektutu.com/post/tf2doc.html)
28 | - GitHub: [GitHub - tensorflow2-docs](https://github.com/geektutu/tensorflow2-docs-zh)
29 | - Zhihu column: [Zhihu - Tensorflow2-docs](https://zhuanlan.zhihu.com/geektutu)
30 |
31 | ## Contents (continuously updated)
32 |
33 | ### Beginner - ML basics
34 |
35 | 1. [Classify images](https://geektutu.com/post/tf2doc-ml-basic-image.html)
36 | 2. [Classify text](https://geektutu.com/post/tf2doc-ml-basic-text.html)
37 | 3. [Classify structured data](https://geektutu.com/post/tf2doc-ml-basic-structured-data.html)
38 | 4. [Regression](https://geektutu.com/post/tf2doc-ml-basic-regression.html)
39 | 5. [Overfitting and underfitting](https://geektutu.com/post/tf2doc-ml-basic-overfit.html)
40 | 6. [Save and restore models](https://geektutu.com/post/tf2doc-ml-basic-save-model.html)
41 |
42 | ### Beginner - Image classification
43 |
44 | 1.
[Convolutional Neural Networks](https://geektutu.com/post/tf2doc-cnn-cifar10.html)
45 | 2. [Transfer learning with TFHub (TensorFlow Hub with Keras)](https://geektutu.com/post/tf2doc-tfhub-image-tl.html)
46 | 3. Transfer Learning Using Pretrained ConvNets
47 |
48 | ### Beginner - Text classification
49 |
50 | 1. [Text classification with an RNN](https://geektutu.com/post/tf2doc-rnn-lstm-text.html)
51 |
52 | ### Advanced - Customization
53 |
54 | 1. Tensors and operations
55 | 2. Custom layers
56 | 3. Automatic differentiation
57 | 4. Custom training: walkthrough
58 | 5. TF function and AutoGraph
59 |
60 | ## Geektutu hands-on projects
61 |
62 | ### Supervised learning
63 |
64 | 1. [MNIST handwritten digit recognition (CNN)](https://geektutu.com/post/tensorflow2-mnist-cnn.html)
65 | 2. [Playing an OpenAI gym game with supervised learning](https://geektutu.com/post/tensorflow2-gym-nn.html)
66 |
67 | ### Reinforcement learning
68 |
69 | 1. [Playing OpenAI gym with Q-Learning](https://geektutu.com/post/tensorflow2-gym-q-learning.html)
70 | 2. [Playing gym Mountain Car with DQN](https://geektutu.com/post/tensorflow2-gym-dqn.html)
71 | 3.
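[Policy Gradient in 70 lines of code](https://geektutu.com/post/tensorflow2-gym-pg.html)
72 |
73 | ## Statement
74 |
75 | **TensorFlow 2 Chinese Documentation** is written mainly with reference to the [official TensorFlow website](https://www.tensorflow.org/beta/tutorials/keras). It summarizes a selection of the most useful chapters. The table of contents largely follows the official documentation, but the content is heavily simplified and focuses on hands-on code. TensorFlow is a high-level machine learning framework that is powerful and exposes many APIs. TensorFlow 2 deprecates a large number of duplicate APIs, makes Keras the primary interface for building networks, and adds many new features, which greatly improves usability and effectively reduces the amount of code needed.
76 |
77 | The goal of **TensorFlow 2 Chinese Documentation** is to pick representative content from the official documentation, help readers get started quickly, and give an overview of what TensorFlow can do in image recognition, text classification, structured data, and other areas. Plenty of documentation already covers TensorFlow 1.x, so this one focuses on summarizing the new features of TensorFlow 2.
78 |
79 | The documentation on the official TensorFlow website is licensed under [Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/deed.zh), and its code under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). This documentation fully complies with both licenses, and sources are credited in a prominent place.
80 |
81 | The code is implemented with **Python3** and **TensorFlow 2.0 beta**.
82 |
83 | For brevity, some code has been trimmed or modified. For example, all Python 2.x compatibility code has been removed.
--------------------------------------------------------------------------------
/tf2doc/tf.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/geektutu/tensorflow2-docs-zh/3418ce1c0e721f549d0cff7eeff65ca030c55f82/tf2doc/tf.jpg
--------------------------------------------------------------------------------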