Can not squeeze dim[1], expected a dimension of 1, got 2 for '{{node categorical_crossentropy/weighted_loss/Squeeze}}'

Stack Overflow user
Asked 2022-04-01 16:49:46
Answers: 1, Views: 239, Followers: 0, Votes: 1

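I am working on a neural network that predicts both the next activity and the outcome of a trace (a sequence of events taken from an event log). First, I encode each unique activity as an integer, so each trace becomes an array of integers. Then I extend each trace with one additional final event whose activity is the outcome of the trace (e.g. if trace t is labeled o1, then o1 becomes the activity of the newly appended last event). Finally, each trace is split into windows of fixed size, which are the input to the neural network. For example, for a trace a b b c ... x x y z o, encoded as 1 2 2 3 ... 24 24 25 26 27, and a fixed window size of 4, the resulting windows are 0 0 0 a, 0 0 a b, 0 a b b, ..., x x y z, encoded as 0 0 0 1, 0 0 1 2, 0 1 2 2, ..., 24 24 25 26. As you can see, the outcome is not included in the windows, because it has to be predicted.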

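The target data are then:

  • For next-activity prediction: an array containing the next activity of each window (following the example above, b b c ... o, encoded as 2 2 3 ... 27).
  • For outcome prediction: an array containing the outcome of each window (in our example, an array of o, encoded as 27; since multiple traces are considered, there will be windows with different outcomes).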
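
The windowing described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper name `make_windows` and its signature are hypothetical (they are not taken from the asker's code), the trace is assumed to be already integer-encoded with the outcome as its last element, and 0 is assumed to be the padding value.

```python
from typing import List, Tuple


def make_windows(trace: List[int], size: int = 4) -> Tuple[List[List[int]], List[int], List[int]]:
    """Split one encoded trace into left-padded windows plus both targets.

    Hypothetical helper: `trace` is a list of activity codes whose last
    element is the outcome code appended to the trace.
    """
    *events, outcome = trace  # the outcome is never part of a window
    windows, next_activities, outcomes = [], [], []
    for end in range(1, len(events) + 1):
        prefix = events[max(0, end - size):end]
        # Left-pad with zeros so every window has exactly `size` entries.
        windows.append([0] * (size - len(prefix)) + prefix)
        # The target is the following event, or the outcome for the last window.
        next_activities.append(events[end] if end < len(events) else outcome)
        outcomes.append(outcome)
    return windows, next_activities, outcomes


# Shortened version of the example trace: a b b c o encoded as 1 2 2 3 27.
windows, next_activities, outcomes = make_windows([1, 2, 2, 3, 27], size=4)
print(windows)          # [[0, 0, 0, 1], [0, 0, 1, 2], [0, 1, 2, 2], [1, 2, 2, 3]]
print(next_activities)  # [2, 2, 3, 27]
print(outcomes)         # [27, 27, 27, 27]
```

Each window gets one next-activity target and one outcome target, which is why the network described in this question has two outputs.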

Here are some examples of the data. Activity map (the first two entries are the outcomes):

{'regular': 1, 'deviant': 2, 'Round Grinding - Machine 3': 3, 'Round Grinding - Machine 2': 4, 'Grinding Rework - Machine 27': 5, 'Lapping - Machine 1': 6, 'other': 7, 'Turning & Milling Q.C.': 8, 'Laser Marking - Machine 7': 9, 'Round Grinding - Q.C.': 10, 'Turning & Milling - Machine 4': 11, 'Final Inspection Q.C.': 12, 'Packing': 13, 'Turning & Milling - Machine 8': 14, 'Flat Grinding - Machine 11': 15, 'Round Grinding - Manual': 16, 'Wire Cut - Machine 13': 17, 'Turning & Milling - Machine 9': 18, 'Milling - Machine 16': 19, 'Turning - Machine 8': 20, 'Turning Q.C.': 21, 'Turning & Milling - Machine 5': 22, 'Turning & Milling - Machine 10': 23, 'Turning & Milling - Machine 6': 24, 'Round Grinding - Machine 12': 25, 'Turning - Machine 9': 26, 'Milling - Machine 14': 27, 'Turn & Mill. & Screw Assem - Machine 10': 28}

Input data (variable x_training): the encoded windows of the traces (only 3 traces are shown here):

[array([0., 0., 0., 3.]), array([0., 0., 3., 4.]), array([0., 3., 4., 5.]), array([3., 4., 5., 5.]), array([4., 5., 5., 6.]), array([5., 5., 6., 6.]), array([5., 6., 6., 5.]), array([6., 6., 5., 7.]), array([6., 5., 7., 7.]), array([5., 7., 7., 7.]), array([7., 7., 7., 7.]), array([7., 7., 7., 7.]), array([7., 7., 7., 5.]), array([7., 7., 5., 5.]), array([7., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 8.]), array([5., 5., 8., 6.]), array([5., 8., 6., 9.]), array([8., 6., 9., 3.]), array([ 6.,  9.,  3., 10.]), array([ 0.,  0.,  0., 11.]), array([ 0.,  0., 11., 11.]), array([ 0., 11., 11., 11.]), array([11., 11., 11., 11.]), array([11., 11., 11., 11.]), array([11., 11., 11., 11.]), array([11., 11., 11.,  8.]), array([11., 11.,  8., 11.]), array([11.,  8., 11., 11.]), array([ 8., 11., 11.,  8.]), array([11., 11.,  8., 11.]), array([11.,  8., 11., 11.]), array([ 8., 11., 11., 11.]), array([11., 11., 11.,  8.]), array([11., 11.,  8., 11.]), array([0., 0., 0., 6.]), array([0., 0., 6., 3.]), array([0., 6., 3., 3.]), array([6., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 6.]), array([3., 3., 6., 3.]), array([3., 6., 3., 3.]), array([6., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([ 3.,  3.,  3., 12.]), array([ 3.,  3., 12., 12.]), array([ 3., 12., 12., 12.]), array([12., 12., 12., 12.]), array([12., 12., 12., 13.]), array([12., 12., 13., 12.]), array([12., 13., 12., 12.]), array([13., 12., 12.,  3.]), array([12., 12.,  3., 12.]), array([12.,  3., 12., 12.]), array([ 3., 12., 
12., 12.]), array([12., 12., 12., 12.]), ... (and so on) ]

Target data, next activity (variable y_training). Add 1 to each integer shown here, since I used the label encoder's fit_transform:

[ 3  4  4  5  5  4  6  6  6  6  6  4  4  4  4  4  4  4  4  4  4  4  4  4
  4  4  4  4  7  5  8  2  9  0 10 10 10 10 10  7 10 10  7 10 10 10  7 10
  1  2  2  2  2  2  2  2  2  2  5  2  2  2  2  2 11 11 11 11 12 11 11  2
 11 11 11 11  1 ... (and so on)]

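Target data, outcome (variable z_training). For this dataset the outcome is binary, but that is not always the case. Here too, add 1 to each value, since I used the label encoder's fit_transform: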

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 ... (and so on)]

Note also that I converted both y_training and z_training to categorical (one-hot) form.
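
To make the two encoding steps concrete, here is a minimal plain-Python sketch. The functions `label_encode` and `one_hot` are hypothetical stand-ins that only mimic the general behavior of scikit-learn's `LabelEncoder.fit_transform` and Keras's `to_categorical`; they are not those libraries' implementations.

```python
from typing import List, Tuple


def label_encode(values: List[int]) -> Tuple[List[int], List[int]]:
    """Map each distinct value to its rank among the sorted distinct values."""
    classes = sorted(set(values))
    index = {value: position for position, value in enumerate(classes)}
    return [index[value] for value in values], classes


def one_hot(indices: List[int], num_classes: int) -> List[List[float]]:
    """Turn class indices into rows holding a single 1.0."""
    return [[1.0 if column == i else 0.0 for column in range(num_classes)]
            for i in indices]


# Outcome codes from the activity map: 'regular' = 1, 'deviant' = 2.
encoded, classes = label_encode([1, 1, 2])
print(encoded)                         # [0, 0, 1]
print(one_hot(encoded, len(classes)))  # [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
```

With two outcome classes, every one-hot row has width 2, so the outcome target has shape (number of windows, 2).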

Here is the neural network I built:

        self.x_training = np.asarray(self.x_training)
        outsize_act = len(np.unique(self.y_training)) 
        outsize_out= len(np.unique(self.z_training)) 
        self.y_training = to_categorical(self.y_training)
        self.z_training = to_categorical(self.z_training)

        unique_events = len(self.act_dictionary)
        X_train, X_val, Y_train, Y_val, Z_train, Z_val = train_test_split(self.x_training, self.y_training, self.z_training, test_size=0.2,random_state=42, shuffle=True)
        size_act = (unique_events + 1) // 2 

        input_act = Input(shape=(self.example_size,), dtype='int32', name='input_act')
        x_act = Embedding(output_dim=size_act, input_dim=unique_events + 1, input_length=self.example_size)(input_act)
        
        l1 = LSTM(16, return_sequences=True, kernel_initializer='glorot_uniform')(x_act)
        b1 = BatchNormalization()(l1)
        l2_1 = LSTM(16, return_sequences=False, kernel_initializer='glorot_uniform')(b1) # the layer specialized in activity prediction
        b2_1 = BatchNormalization()(l2_1)
        l2_2 = LSTM(16, return_sequences=False, kernel_initializer='glorot_uniform')(b1) #the layer specialized in outcome prediction
        b2_2 = BatchNormalization()(l2_2)

        output_l = Dense(outsize_act, activation='softmax', name='act_output')(b2_1)
        output_o = Dense(outsize_out, activation='softmax', name='outcome_output')(b2_2)

        model = Model(inputs=input_act, outputs=[output_l, output_o])
        print(model.summary())

        opt = Adam()
        model.compile(loss={'act_output':'categorical_crossentropy', 'outcome_output':'categorical_crossentropy'}, optimizer=opt, metrics=['accuracy'])

        early_stopping = EarlyStopping(monitor='val_loss', patience=42)
        model_checkpoint = ModelCheckpoint('output_files/models/model_{epoch:02d}-{val_loss:.2f}.h5', monitor='val_loss', verbose=0, save_best_only=True,save_weights_only=False, mode='auto')
        lr_reducer = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, verbose=0, mode='auto', min_delta=0.0001, cooldown=0, min_lr=0)

        model.fit(X_train, {'act_output':Y_train, 'outcome_output':Z_train}, epochs=200, batch_size=128, verbose=2, callbacks=[early_stopping, model_checkpoint, lr_reducer], validation_data=(X_val, Y_val,Z_val))
        model.save("model/generate_" + self.log_name + ".h5")

And here is the error I get:

Epoch 1/200
Traceback (most recent call last):
  File "C:\Users\...\manager.py", line 244, in build_neural_network_model
    model.fit(X_train, {'act_output':Y_train, 'outcome_output':Z_train}, epochs=200, batch_size=128, verbose=2, callbacks=[early_stopping, model_checkpoint, lr_reducer],
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\tensorflow\python\framework\func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 1525, in test_function  *
        return step_function(self, iterator)
    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 1514, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 1507, in run_step  **
        outputs = model.test_step(data)
    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 1473, in test_step
        self.compute_loss(x, y, y_pred, sample_weight)
    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 918, in compute_loss
        return self.compiled_loss(
    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\losses.py", line 142, in __call__
        return losses_utils.compute_weighted_loss(
    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\utils\losses_utils.py", line 321, in compute_weighted_loss
        losses, _, sample_weight = squeeze_or_expand_dimensions(  # pylint: disable=unbalanced-tuple-unpacking
    File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\utils\losses_utils.py", line 211, in squeeze_or_expand_dimensions
        sample_weight = tf.squeeze(sample_weight, [-1])

    ValueError: Can not squeeze dim[1], expected a dimension of 1, got 2 for '{{node categorical_crossentropy/weighted_loss/Squeeze}} = Squeeze[T=DT_FLOAT, squeeze_dims=[-1]](IteratorGetNext:2)' with input shapes: [?,2].

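So I am asking for your help: I have searched and found many similar threads, but none of the solutions worked. I suspect this is because I am working with a two-output neural network. This is my first time working with neural networks, so there may be an obvious mistake that I am missing. Thank you for your help.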


1 Answer

Stack Overflow user

Answered 2022-04-11 15:29:09

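@Stefano, glad to know you fixed your error, and thanks for sharing.

Adding Stefano's (the user's) comment to the answer section for the benefit of the community:

Solved. The problem was in the validation_data argument of model.fit(); it should be validation_data = (X_val, [Y_val,Z_val])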
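
A likely explanation of why this works: Keras reads a validation_data tuple as either (x, y) or (x, y, sample_weight). With the 3-tuple (X_val, Y_val, Z_val), Z_val is taken as sample weights, which is consistent with the traceback failing in tf.squeeze(sample_weight, [-1]) on an input of shape [?,2]. The sketch below mimics that unpacking rule in plain Python. `unpack_validation_data` is a hypothetical, simplified stand-in and not Keras's actual code, and the tiny lists are placeholder data.

```python
def unpack_validation_data(validation_data):
    """Simplified stand-in for how a validation tuple is interpreted."""
    if len(validation_data) == 2:
        x, y = validation_data
        return x, y, None
    if len(validation_data) == 3:
        # The third element is read as per-sample weights, not as a target.
        x, y, sample_weight = validation_data
        return x, y, sample_weight
    raise ValueError("expected (x, y) or (x, y, sample_weight)")


# Placeholder data: one window, one-hot activity target, one-hot outcome target.
X_val = [[0, 0, 0, 3]]
Y_val = [[0.0, 1.0, 0.0]]
Z_val = [[1.0, 0.0]]

# Original call: Z_val ends up as sample weights (two columns per sample).
_, y_buggy, weights_buggy = unpack_validation_data((X_val, Y_val, Z_val))
print(weights_buggy is Z_val)  # True

# Corrected call: both targets travel together as y, and there are no weights.
_, y_fixed, weights_fixed = unpack_validation_data((X_val, [Y_val, Z_val]))
print(len(y_fixed), weights_fixed)  # 2 None
```

Since the training targets in the question are passed as a dict keyed by output name, passing validation_data=(X_val, {'act_output': Y_val, 'outcome_output': Z_val}) should be an equivalent way to write the fix.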

Happy coding!

Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/71710344
