Keras Study Notes

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

Using the Sequential model

A Sequential model is appropriate for a plain stack of layers, where each layer has exactly one input tensor and one output tensor.

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
x = tf.ones((3, 3))
x

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>

y = model(x)
y

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>

The code above is equivalent to the following:

# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Call layers on a test input
x = tf.ones((3, 3))
y = layer3(layer2(layer1(x)))
y

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>
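One caveat worth making explicit: the two snippets build separate layer objects, so their randomly initialized weights differ. The outputs match numerically only when the same layer instances are used on both paths. A minimal sketch of that check, reusing the layer names from above (the variable names y_model and y_manual are illustrative, not from the notes):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Create the three layers once so both call paths share the same weights.
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Wrap the very same layer objects in a Sequential model.
model = keras.Sequential([layer1, layer2, layer3])

x = tf.ones((3, 3))
y_model = model(x)                     # call through the Sequential model
y_manual = layer3(layer2(layer1(x)))   # call the layers by hand

# Both paths run the same operations with the same weights.
same = np.allclose(y_model.numpy(), y_manual.numpy())
print(same)
```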

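A Sequential model is not appropriate when:

The model has multiple inputs or multiple outputs
Any of the layers has multiple inputs or multiple outputs
You need layer sharing
You need a non-linear topology (for example, residual connections or a multi-branch model)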
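To make the last case concrete, here is a small sketch of a residual connection written with the Functional API. The layer sizes and variable names are arbitrary choices for illustration. The Add layer takes two input tensors, which is exactly what a plain layer stack cannot express:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Functional API: tensors are passed explicitly from layer to layer.
inputs = keras.Input(shape=(8,))
h = layers.Dense(8, activation="relu")(inputs)
h2 = layers.Dense(8)(h)

# Residual connection: Add receives two tensors, so this is not a plain stack.
outputs = layers.Add()([h, h2])

res_model = keras.Model(inputs=inputs, outputs=outputs)
out = res_model(tf.ones((2, 8)))
print(tuple(out.shape))
```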

Its layers are accessible through the layers attribute:

model.layers

[<tensorflow.python.keras.layers.core.Dense at 0x1ee679d3cf8>,
 <tensorflow.python.keras.layers.core.Dense at 0x1ee679f3a90>,
 <tensorflow.python.keras.layers.core.Dense at 0x1ee67a0c208>]

You can also create a Sequential model incrementally with the add() method:

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))

There is also a corresponding pop() method to remove layers: a Sequential model behaves very much like a list of layers.

print(len(model.layers))

3

model.pop()
print(len(model.layers))

2

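The Sequential constructor also accepts a name argument, just like any layer or model in Keras. This is useful for annotating TensorBoard graphs with semantically meaningful names.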

model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))
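Names are also handy outside TensorBoard: they can be read back and used to look up a layer. A short sketch using the same names as above (the variable names layer_names and picked are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))

# The model and each layer report the names they were given.
layer_names = [layer.name for layer in model.layers]
print(model.name, layer_names)

# A layer can be retrieved by its name instead of its index.
picked = model.get_layer(name="layer2")
print(picked.units)
```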

Specifying the input shape in advance

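Generally, all layers in Keras need to know the shape of their inputs in order to create their weights. So when you create a layer like this, it initially has no weights: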

layer = layers.Dense(3)
layer.weights

[]

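Since the shape of the weights depends on the shape of the inputs, the layer creates its weights the first time it is called on an input: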

x = tf.ones((1, 4))
y = layer(x)
layer.weights

[<tf.Variable 'dense_3/kernel:0' shape=(4, 3) dtype=float32, numpy=
 array([[-0.23496091, -0.42415935, -0.38969237],
        [ 0.47878957,  0.6321573 ,  0.53070235],
        [-0.57678986,  0.5862113 , -0.5439472 ],
        [-0.8276289 ,  0.88936853, -0.6267946 ]], dtype=float32)>,
 <tf.Variable 'dense_3/bias:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]
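The dependence on the input shape can be seen by calling identically configured layers on inputs of different widths: the kernel comes out with shape (input_dim, units), while the bias depends only on units. A small sketch (the input widths 4 and 7 and the dictionary name kernel_shapes are arbitrary choices for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers

kernel_shapes = {}
for in_dim in (4, 7):
    layer = layers.Dense(3)
    # Before the first call there is nothing to inspect.
    assert len(layer.weights) == 0
    # The first call fixes the input width and creates kernel and bias.
    layer(tf.ones((1, in_dim)))
    kernel, bias = layer.weights
    kernel_shapes[in_dim] = (tuple(kernel.shape), tuple(bias.shape))

print(kernel_shapes)
```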

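Naturally, this also applies to Sequential models. When you instantiate a Sequential model without an input shape, it isn't "built": it has no weights (and in the TensorFlow 2.x releases these notes appear to target, accessing model.weights raises an error stating just this). The weights are created when the model first sees some input data: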

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

x = tf.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights))  # 6

Number of weights after calling the model: 6
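Calling the model on data is not the only way to trigger weight creation. As an aside that is not covered in the notes above, a Sequential model can also be built explicitly with its build() method by passing the input shape, batch dimension included. A minimal sketch:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

# Build without any data: None stands for an unspecified batch size.
model.build(input_shape=(None, 4))

# Three Dense layers, each with a kernel and a bias.
n_weights = len(model.weights)
print("Number of weights after build():", n_weights)
```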

Once a model is "built", you can call its summary() method to display its contents:

model.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_4 (Dense)              (1, 2)                    10
_________________________________________________________________
dense_5 (Dense)              (1, 3)                    9
_________________________________________________________________
dense_6 (Dense)              (1, 4)                    16
=================================================================
Total params: 35
Trainable params: 35
Non-trainable params: 0
_________________________________________________________________
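The Param # column can be checked by hand: a Dense layer with a bias has input_features * units kernel entries plus units bias entries. The following sketch redoes that arithmetic for the 4 -> 2 -> 3 -> 4 stack summarized above (the helper name dense_params is hypothetical):

```python
def dense_params(in_features, units, use_bias=True):
    """Parameter count of a fully connected layer: kernel plus optional bias."""
    return in_features * units + (units if use_bias else 0)

# Feature widths along the stack: input of 4, then Dense(2), Dense(3), Dense(4).
sizes = [4, 2, 3, 4]
per_layer = [dense_params(a, b) for a, b in zip(sizes, sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [10, 9, 16] 35
```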

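However, when building a Sequential model incrementally, it can be very useful to display the summary of the model so far, including the current output shape. In this case, you should start your model by passing an Input object to it, so that it knows its input shape from the start: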

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

model.summary()

Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_10 (Dense)             (None, 2)                 10
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________

Note that the Input object is not displayed as part of model.layers, since it isn't a layer:

model.layers

[<tensorflow.python.keras.layers.core.Dense at 0x1ee68ade4e0>]

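A simple alternative is to pass an input_shape argument to your first layer. Recent Keras releases (Keras 3) emit a warning for input_shape and recommend the keras.Input form instead: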

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu", input_shape=(4,)))

model.summary()

Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_11 (Dense)             (None, 2)                 10
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________

Models built with a predefined input shape like this always have weights (even before seeing any data) and always have a defined output shape.

In general, it's a recommended best practice to always specify the input shape of a Sequential model in advance if you know what it is.
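Both claims are easy to verify without feeding the model any data. A minimal sketch, assuming the keras.Input form and the same layer as in the earlier snippet (the variable names n_weights and out_shape are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

# No data has been passed through the model, yet the weights already exist.
n_weights = len(model.weights)

# The output shape is known too; None is the unspecified batch dimension.
out_shape = tuple(model.output_shape)
print(n_weights, out_shape)
```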

A common debugging workflow: add() + summary()

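When building a new Sequential architecture, it's useful to stack layers incrementally with add() and print model summaries frequently. For instance, this lets you monitor how a stack of Conv2D and MaxPooling2D layers downsamples image feature maps: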

model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3)))  # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))

model.summary()

# The output shape at this point is (40, 40, 32), so we can keep downsampling.

model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))

model.summary()

# Now that we have 4x4 feature maps, it is time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())

# Finally, we add a classification layer.
model.add(layers.Dense(10))

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_6 (Conv2D)            (None, 123, 123, 32)      2432
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 121, 121, 32)      9248
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 40, 40, 32)        0
=================================================================
Total params: 11,680
Trainable params: 11,680
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_6 (Conv2D)            (None, 123, 123, 32)      2432
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 121, 121, 32)      9248
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 40, 40, 32)        0
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 38, 38, 32)        9248
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 36, 36, 32)        9248
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 12, 12, 32)        0
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 10, 10, 32)        9248
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 8, 8, 32)          9248
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 4, 4, 32)          0
=================================================================
Total params: 48,672
Trainable params: 48,672
Non-trainable params: 0
_________________________________________________________________
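The spatial sizes in these summaries follow from the usual formula for unpadded ("valid") windows: output = floor((input - window) / stride) + 1, where MaxPooling2D by default uses a stride equal to its pool size. The sketch below replays that arithmetic for the stack above. The helper out_size and the tuple encoding of the layers are illustrative, not part of Keras:

```python
def out_size(n, window, stride=1):
    """Output length along one axis for an unpadded ('valid') window."""
    return (n - window) // stride + 1

# (window, stride) for each layer, in the order they were added.
# Pooling layers use stride == pool size, the MaxPooling2D default.
stack = [
    (5, 2),  # Conv2D(32, 5, strides=2)
    (3, 1),  # Conv2D(32, 3)
    (3, 3),  # MaxPooling2D(3)
    (3, 1),  # Conv2D(32, 3)
    (3, 1),  # Conv2D(32, 3)
    (3, 3),  # MaxPooling2D(3)
    (3, 1),  # Conv2D(32, 3)
    (3, 1),  # Conv2D(32, 3)
    (2, 2),  # MaxPooling2D(2)
]

size = 250  # height and width of the input images
sizes = []
for window, stride in stack:
    size = out_size(size, window, stride)
    sizes.append(size)

print(sizes)  # [123, 121, 40, 38, 36, 12, 10, 8, 4]
```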

What to do once you have a model

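Once your model architecture is ready, you will want to:

Train your model, evaluate it, and run inference.
Save your model to disk and restore it.
Speed up model training by leveraging multiple GPUs.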
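As a rough illustration of the first item, the sketch below compiles a tiny Sequential model, trains it briefly on random data, then evaluates it and runs inference. Everything specific here (layer sizes, optimizer, loss, the synthetic data) is an assumption chosen to keep the example small, not a recommendation:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        layers.Dense(8, activation="relu"),
        layers.Dense(3),  # raw scores (logits) for 3 classes
    ]
)
model.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Synthetic data: 64 samples with 4 features and random labels in {0, 1, 2}.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(64, 4)).astype("float32")
y_train = rng.integers(0, 3, size=(64,))

history = model.fit(x_train, y_train, batch_size=16, epochs=2, verbose=0)  # train
loss, acc = model.evaluate(x_train, y_train, verbose=0)                    # evaluate
preds = model.predict(x_train[:5], verbose=0)                              # inference
print(preds.shape)
```

Because the labels are random, the reported accuracy carries no meaning; the point is only the compile, fit, evaluate, predict sequence.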

Feature extraction with a Sequential model

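Once a Sequential model has been built, it behaves like a Functional API model. This means that every layer has an input attribute and an output attribute. These attributes can be used to do neat things, such as quickly creating a model that extracts the outputs of all intermediate layers in a Sequential model: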

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)

Here is a similar example that extracts features from only one layer:

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)