I'm trying to build a stacked bidirectional LSTM seq2seq model in Keras, but I'm running into a problem when passing the encoder's output states to the decoder as its initial states. Based on this pull request, it looks like this should be possible. Ultimately, I also want to keep the encoder_outputs vector for other downstream tasks.
Error message:
ValueError: An `initial_state` was passed that is not compatible with `cell.state_size`. Received `state_spec`=[InputSpec(shape=(None, 100), ndim=2)]; however `cell.state_size` is (100, 100)
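For context, a cell.state_size of (100, 100) means the LSTM cell keeps two state tensors, the hidden state h and the cell state c, each of width 100, and a Bidirectional wrapper needs that pair for each direction, four tensors in total. A minimal sketch of the shapes Keras expects (the layer sizes and tensor names here are illustrative, not taken from the model below):

from tensorflow.keras.layers import Input, LSTM, Bidirectional

units = 100
x = Input(shape=(None, 16))

# Unidirectional LSTM: initial_state is a list of two tensors [h, c],
# each of shape (batch, units).
h0 = Input(shape=(units,))
c0 = Input(shape=(units,))
y = LSTM(units)(x, initial_state=[h0, c0])

# Bidirectional LSTM: one [h, c] pair per direction, passed in the order
# [forward_h, forward_c, backward_h, backward_c].
initial_states = [Input(shape=(units,)) for _ in range(4)]
y_bi = Bidirectional(LSTM(units))(x, initial_state=initial_states)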
My model:
MAX_SEQUENCE_LENGTH = 50
EMBEDDING_DIM = 250
latent_size_1 = 100
latent_size_2 = 50
latent_size_3 = 250
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False,
                            mask_zero=True)
encoder_inputs = Input(shape=(MAX_SEQUENCE_LENGTH,), name="encoder_input")
encoder_emb = embedding_layer(encoder_inputs)
encoder_lstm_1 = Bidirectional(LSTM(latent_size_1, return_sequences=True),
                               merge_mode="concat",
                               name="encoder_lstm_1")(encoder_emb)
encoder_outputs, forward_h, forward_c, backward_h, backward_c = Bidirectional(
    LSTM(latent_size_2, return_state=True),
    merge_mode="concat",
    name="encoder_lstm_2")(encoder_lstm_1)
state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])
encoder_states = [state_h, state_c]
decoder_inputs = Input(shape=(MAX_SEQUENCE_LENGTH,), name="decoder_input")
decoder_emb = embedding_layer(decoder_inputs)
decoder_lstm_1 = Bidirectional(LSTM(latent_size_1, return_sequences=True),
                               merge_mode="concat",
                               name="decoder_lstm_1")(decoder_emb, initial_state=encoder_states)
decoder_lstm_2 = Bidirectional(LSTM(latent_size_3, return_sequences=True),
                               merge_mode="concat",
                               name="decoder_lstm_2")(decoder_lstm_1)
decoder_outputs = Dense(num_words, activation='softmax', name="Dense_layer")(decoder_lstm_2)
seq2seq_Model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
Many thanks for any help/advice/guidance!
Posted on 2019-07-24 23:21:54
There are two problems with your code:

1. You cannot concatenate the forward and backward states into encoder_states; the four state tensors should be passed separately instead (i.e. encoder_states = [forward_h, forward_c, backward_h, backward_c]).
2. The states returned by the second encoder LSTM have size latent_size_2 (not latent_size_1). So if you want to use them as your decoder's initial state, the decoder's first LSTM must also have latent_size_2 units.

You can find the code with these corrections below.
from tensorflow.keras.layers import Embedding, Input, Bidirectional, LSTM, Dense
from tensorflow.keras.initializers import Constant
from tensorflow.keras.models import Model

MAX_SEQUENCE_LENGTH = 50
EMBEDDING_DIM = 250
latent_size_1 = 100
latent_size_2 = 50
latent_size_3 = 250
num_words = 5000

embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(1.0),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False,
                            mask_zero=True)

encoder_inputs = Input(shape=(MAX_SEQUENCE_LENGTH,), name="encoder_input")
encoder_emb = embedding_layer(encoder_inputs)
encoder_lstm_1 = Bidirectional(LSTM(latent_size_1, return_sequences=True),
                               merge_mode="concat",
                               name="encoder_lstm_1")(encoder_emb)
encoder_outputs, forward_h, forward_c, backward_h, backward_c = Bidirectional(
    LSTM(latent_size_2, return_state=True),
    merge_mode="concat", name="encoder_lstm_2")(encoder_lstm_1)

# Pass the four state tensors separately (do not concatenate them); a
# Bidirectional layer expects [forward_h, forward_c, backward_h, backward_c].
encoder_states = [forward_h, forward_c, backward_h, backward_c]

decoder_inputs = Input(shape=(MAX_SEQUENCE_LENGTH,), name="decoder_input")
decoder_emb = embedding_layer(decoder_inputs)
# The decoder's first LSTM must have latent_size_2 units so that its state
# size matches the states produced by encoder_lstm_2.
decoder_lstm_1 = Bidirectional(
    LSTM(latent_size_2, return_sequences=True),
    merge_mode="concat", name="decoder_lstm_1")(decoder_emb, initial_state=encoder_states)
decoder_lstm_2 = Bidirectional(LSTM(latent_size_3, return_sequences=True),
                               merge_mode="concat",
                               name="decoder_lstm_2")(decoder_lstm_1)
decoder_outputs = Dense(num_words, activation='softmax', name="Dense_layer")(decoder_lstm_2)
seq2seq_Model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
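As a quick sanity check, the corrected model can be compiled and inspected; and since the original goal includes reusing encoder_outputs for downstream tasks, an encoder-only model can be built over the same layers. A minimal sketch (the optimizer and loss are placeholder choices, not from the original post):

seq2seq_Model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
seq2seq_Model.summary()

# Encoder-only model for downstream tasks: it shares the trained encoder
# layers. encoder_outputs has shape (batch, 2 * latent_size_2) because
# merge_mode="concat" joins the forward and backward outputs.
encoder_model = Model(encoder_inputs, encoder_outputs)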
https://stackoverflow.com/questions/57190769