Keras Study Notes
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
Using the Sequential model
A Sequential model is appropriate for a plain stack of layers, where each layer has exactly one input tensor and one output tensor.
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
x = tf.ones((3, 3))
x
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>
y = model(x)
y
<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>
The code above is equivalent to the following:
# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")
# Call layers on a test input
x = tf.ones((3, 3))
y = layer3(layer2(layer1(x)))
y
<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>
A Sequential model is not appropriate when:
Your model has multiple inputs or multiple outputs
Any of your layers has multiple inputs or multiple outputs
You need to do layer sharing
You need a non-linear topology (e.g. residual connections, a multi-branch model); a Functional API sketch of such a case follows this list
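Cases like the last two are typically handled with the Keras Functional API instead. The following is a minimal sketch added to these notes (not from the original text) of a model that Sequential cannot express: two inputs, a shared Dense layer, and a concatenation of the two branches. All names and sizes are arbitrary.
# Two inputs, one shared layer, a merge point, and a single output head.
input_a = keras.Input(shape=(8,), name="input_a")
input_b = keras.Input(shape=(8,), name="input_b")
shared_dense = layers.Dense(4, activation="relu", name="shared_dense")  # layer sharing
branch_a = shared_dense(input_a)
branch_b = shared_dense(input_b)
merged = layers.concatenate([branch_a, branch_b])  # multi-branch topology
output = layers.Dense(1, name="score")(merged)
functional_model = keras.Model(inputs=[input_a, input_b], outputs=output)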
The model's layers are accessible via its layers attribute:
model.layers
[<tensorflow.python.keras.layers.core.Dense at 0x1ee679d3cf8>,
<tensorflow.python.keras.layers.core.Dense at 0x1ee679f3a90>,
<tensorflow.python.keras.layers.core.Dense at 0x1ee67a0c208>]
You can also create a Sequential model incrementally via the add() method:
model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))
There is also a corresponding pop() method to remove layers: a Sequential model behaves very much like a list of layers.
print(len(model.layers))
3
model.pop()
print(len(model.layers))
2
The Sequential constructor accepts a name argument, just like any layer or model in Keras. This is useful for annotating TensorBoard graphs with semantically meaningful names:
model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))
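To actually see these names in a TensorBoard graph, you can train the model with a keras.callbacks.TensorBoard callback. The snippet below is only a rough sketch: the random stand-in data, the input size of 4, and the "logs/my_sequential" log directory are assumptions added here, not part of the original notes.
import numpy as np

# Sketch only: random data and an arbitrary log directory.
model.compile(optimizer="adam", loss="mse")
x = np.random.random((32, 4)).astype("float32")
y = np.random.random((32, 4)).astype("float32")
model.fit(
    x, y, epochs=1,
    callbacks=[keras.callbacks.TensorBoard(log_dir="logs/my_sequential")],
)
# Then inspect the graph with: tensorboard --logdir logs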
Specifying the input shape in advance
All layers in Keras need to know the shape of their inputs in order to create their weights. So when you create a layer like the one below, it initially has no weights:
layer = layers.Dense(3)
layer.weights
[]
Since the shape of the weights depends on the shape of the inputs, the weights are created the first time the layer is called on an input:
x = tf.ones((1, 4))
y = layer(x)
layer.weights
[<tf.Variable 'dense_3/kernel:0' shape=(4, 3) dtype=float32, numpy=
 array([[-0.23496091, -0.42415935, -0.38969237],
        [ 0.47878957,  0.6321573 ,  0.53070235],
        [-0.57678986,  0.5862113 , -0.5439472 ],
        [-0.8276289 ,  0.88936853, -0.6267946 ]], dtype=float32)>,
 <tf.Variable 'dense_3/bias:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]
Naturally, this also applies to Sequential models. When you instantiate a Sequential model without an input shape, it isn't "built": it has no weights (and calling model.weights results in an error stating just this). The weights are created when the model first sees some input data:
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)
x = tf.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights)) # 6
Number of weights after calling the model: 6
Once a model is "built", you can call its summary() method to display its contents:
model.summary()
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_4 (Dense)              (1, 2)                    10
_________________________________________________________________
dense_5 (Dense)              (1, 3)                    9
_________________________________________________________________
dense_6 (Dense)              (1, 4)                    16
=================================================================
Total params: 35
Trainable params: 35
Non-trainable params: 0
_________________________________________________________________
However, when building a Sequential model incrementally, it can be very useful to be able to display the summary of the model so far, including the current output shape. In this case, you should start your model by passing an Input object to it, so that it knows its input shape from the start:
model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))
model.summary()
Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_10 (Dense)             (None, 2)                 10
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________
Note that the Input object is not displayed as part of model.layers, since it isn't a layer:
model.layers
[<tensorflow.python.keras.layers.core.Dense at 0x1ee68ade4e0>]
A simple alternative is to just pass an input_shape argument to the first layer:
model = keras.Sequential()
model.add(layers.Dense(2, activation="relu", input_shape=(4,)))
model.summary()
Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_11 (Dense)             (None, 2)                 10
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________
Models built with a predefined input shape like this always have weights (even before seeing any data) and always have a defined output shape.
In general, it's a recommended best practice to always specify the input shape of a Sequential model in advance if you know what it is.
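A quick check of that behavior (added here as a sketch, not part of the original notes): a Sequential model started from an Input object has weights and a defined output shape before it ever sees data.
# Built immediately, because the input shape is known up front.
model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        layers.Dense(2, activation="relu"),
    ]
)
print(len(model.weights))  # 2: kernel and bias already exist
print(model.output_shape)  # (None, 2): the output shape is already defined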
A common debugging workflow: add() + summary()
When building a new Sequential architecture, it's useful to incrementally stack layers with add() and frequently print model summaries. For instance, this lets you monitor how a stack of Conv2D and MaxPooling2D layers is downsampling image feature maps:
model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3))) # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.summary()
# The feature map at this point is (40, 40, 32), so we can keep downsampling.
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))
model.summary()
# Now that we have 4x4 feature maps, it's time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())
# Finally, we add a classification layer.
model.add(layers.Dense(10))
Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_6 (Conv2D)            (None, 123, 123, 32)      2432
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 121, 121, 32)      9248
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 40, 40, 32)        0
=================================================================
Total params: 11,680
Trainable params: 11,680
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_6 (Conv2D)            (None, 123, 123, 32)      2432
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 121, 121, 32)      9248
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 40, 40, 32)        0
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 38, 38, 32)        9248
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 36, 36, 32)        9248
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 12, 12, 32)        0
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 10, 10, 32)        9248
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 8, 8, 32)          9248
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 4, 4, 32)          0
=================================================================
Total params: 48,672
Trainable params: 48,672
Non-trainable params: 0
_________________________________________________________________
What to do once you have a model
Once your model architecture is ready, you will want to:
Train your model, evaluate it, and run inference (see the sketch after this list).
Save your model to disk and restore it.
Speed up model training by leveraging multiple GPUs.
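Below is a minimal sketch of those steps for the 10-class image model built above. The random stand-in data, the optimizer, the loss, and the "my_model" save path are assumptions made for illustration; they are not part of the original notes.
import numpy as np

# Sketch only: a tiny random dataset standing in for real images and labels.
x_train = np.random.random((8, 250, 250, 3)).astype("float32")
y_train = np.random.randint(0, 10, size=(8,))

# Train, evaluate, and run inference.
model.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),  # Dense(10) outputs logits
    metrics=["accuracy"],
)
model.fit(x_train, y_train, epochs=1, batch_size=4)
model.evaluate(x_train, y_train)
predictions = model.predict(x_train[:2])

# Save the model to disk and restore it.
model.save("my_model")
restored_model = keras.models.load_model("my_model")

# To speed up training with multiple GPUs, the model would instead be built and
# compiled inside a tf.distribute.MirroredStrategy().scope().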
Feature extraction with a Sequential model
Once a Sequential model has been built, it behaves like a Functional API model. This means that every layer has an input and an output attribute. These attributes can be used to do neat things, like quickly creating a model that extracts the outputs of all intermediate layers in a Sequential model:
initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
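As a quick usage check (an addition, not in the original notes), you can print the shape of each extracted feature map; the shapes follow from the convolution settings used above.
# One output per layer of initial_model, in order.
for f in features:
    print(f.shape)
# (1, 123, 123, 32)
# (1, 121, 121, 32)
# (1, 119, 119, 32)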
Here is a similar example that only extracts features from one layer:
initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)