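First, let's look at how error messages are structured in the latest version of PaddlePaddle (飞桨), as shown in the figure below:
An error message has a four-part structure. From top to bottom, the parts are: the default Python traceback, the C++ call stack, the Paddle Python call stack (declarative programming mode only), and the core error summary.
When PaddlePaddle reports an error, what does the process of locating it look like? Referring to the error message structure diagram mentioned above, analyze the report step by step following the flow below.
The examples below walk through how to analyze PaddlePaddle error messages (they use the PaddlePaddle develop build from July 1, 2020). PaddlePaddle supports two programming modes, declarative (static graph) and imperative (dynamic graph), and we cover each in turn.
Run the following static graph example code: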
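As a rough illustration of the four-part layout, the sketch below splits a report into sections by looking for the header lines that appear in the sample reports in this article. The function name split_error_report and the section keys are made up for this example and are not part of Paddle. The sample text is abbreviated from the dynamic graph report shown later in this article.

```python
# Header lines copied from the sample reports in this article.
# The keys on the right are hypothetical names chosen for this sketch.
SECTION_HEADERS = {
    "C++ Call Stacks (More useful to developers):": "cpp_stack",
    "Python Call Stacks (More useful to users):": "paddle_python_stack",
    "Error Message Summary:": "summary",
}


def split_error_report(text):
    """Split a Paddle-style error report into named sections."""
    sections = {"python_traceback": []}
    current = "python_traceback"
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and the dashed rules around each header.
        if not stripped or set(stripped) == {"-"}:
            continue
        if stripped in SECTION_HEADERS:
            current = SECTION_HEADERS[stripped]
            sections[current] = []
        else:
            sections[current].append(stripped)
    return sections


sample = """Traceback (most recent call last):
File "paddle_error_case2.py", line 9, in <module>
res = linear(data)
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
2 paddle::operators::MatMulOp::InferShape(paddle::framework::InferShapeContext*) const
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: Input X's width should be equal to the Y's height, but received X's shape: [10, 2],Y's shape: [1, 10].
[operator < matmul > error]
"""

report = split_error_report(sample)
for name, lines in report.items():
    print(name, len(lines))
```

A dynamic graph report has no Paddle Python call stack, so the paddle_python_stack key is simply absent for this sample.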
import paddle.fluid as fluid
import numpy
# 1. Define the network structure
x = fluid.layers.data(name='X', shape=[-1, 13], dtype='float32')
y = fluid.layers.data(name='Y', shape=[-1, 1], dtype='float32')
predict = fluid.layers.fc(input=x, size=1, act=None)
loss = fluid.layers.square_error_cost(input=predict, label=y)
avg_loss = fluid.layers.mean(loss)
# 2. Configure the optimizer
fluid.optimizer.SGD(learning_rate=0.01).minimize(avg_loss)
# 3. Prepare the execution environment
place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
# 4. Run the network
x = numpy.random.random(size=(8, 12)).astype('float32')
y = numpy.random.random(size=(8, 1)).astype('float32')
loss_data, = exe.run(fluid.default_main_program(), feed={'X': x, 'Y': y}, fetch_list=[avg_loss.name])
Running the code produces the following error message:
Traceback (most recent call last):
File "paddle_error_case1.py", line 24, in <module>
loss_data, = exe.run(fluid.default_main_program(), feed={'X': x, 'Y': y}, fetch_list=[avg_loss.name])
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/executor.py", line 1079, in run
six.reraise(*sys.exc_info())
File "/usr/local/lib/python3.5/dist-packages/six.py", line 696, in reraise
raise value
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/executor.py", line 1074, in run
return_merged=return_merged)
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/executor.py", line 1162, in _run_impl
use_program_cache=use_program_cache)
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/executor.py", line 1237, in _run_program
fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::MulOp::InferShape(paddle::framework::InferShapeContext*) const
3 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
5 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
6 paddle::framework::Executor::RunPartialPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, long, long, bool, bool, bool)
7 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
8 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string > > const&, bool, bool)
------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/framework.py", line 2799, in append_op
attrs=kwargs.get("attrs", None))
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/layers/nn.py", line 349, in fc
"y_num_col_dims": 1})
File "paddle_error_case1.py", line 9, in <module>
predict = fluid.layers.fc(input=x, size=1, act=None)
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: After flatten the input tensor X and Y to 2-D dimensions matrix X1 and Y1, the matrix X1's width must be equal with matrix Y1's height. But received X's shape = [8, 12], X1's shape = [8, 12], X1's width = 12; Y's shape = [13, 1], Y1's shape = [13, 1], Y1's height = 13.
[Hint: Expected x_mat_dims[1] == y_mat_dims[0], but received x_mat_dims[1]:12 != y_mat_dims[0]:13.] at (/work/paddle/paddle/fluid/operators/mul_op.cc:83)
[operator < mul > error]
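The example yields the following information:
This is an invalid-argument error. The failing Op is mul. For a valid matrix multiplication, the width of mul's input Tensor X (the size of its second dimension) must equal the height of its input Tensor Y (the size of its first dimension). The message also gives the shapes of X and Y and the mismatched values, so one of the dimensions was written incorrectly, probably 13 mistyped as 12.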
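The same facts can be pulled out of the summary mechanically. The following is a minimal sketch that assumes the summary follows the layout of the sample above (an error type ending in "Error:", a trailing "at (file:line)" and an "[operator < name > error]" line). parse_summary is a hypothetical helper written for this article, not a Paddle API.

```python
import re


def parse_summary(summary):
    """Extract error type, failing operator and C++ source location."""
    error_type = re.search(r"^(\w+Error):", summary, re.MULTILINE)
    operator = re.search(r"\[operator < (\w+) > error\]", summary)
    location = re.search(r"at \((.+?):(\d+)\)", summary)
    return {
        "error_type": error_type.group(1) if error_type else None,
        "operator": operator.group(1) if operator else None,
        "source_file": location.group(1) if location else None,
        "source_line": int(location.group(2)) if location else None,
    }


# Summary text copied from the static graph report above.
summary = "\n".join([
    "InvalidArgumentError: After flatten the input tensor X and Y to 2-D dimensions matrix X1 and Y1, "
    "the matrix X1's width must be equal with matrix Y1's height. But received X's shape = [8, 12], "
    "X1's shape = [8, 12], X1's width = 12; Y's shape = [13, 1], Y1's shape = [13, 1], Y1's height = 13.",
    "[Hint: Expected x_mat_dims[1] == y_mat_dims[0], but received x_mat_dims[1]:12 != y_mat_dims[0]:13.] "
    "at (/work/paddle/paddle/fluid/operators/mul_op.cc:83)",
    "[operator < mul > error]",
])

info = parse_summary(summary)
print(info)
```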
PaddlePaddle currently defines 12 error types. For details, see the Paddle Error Message Writing Specification: https://github.com/PaddlePaddle/Paddle/wiki/Paddle-Error-Message-Writing-Specification
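To stay consistent with the call order of the C++ stack, the Python call stack that Paddle inserts places the user code location at the bottom, which differs from the order of a native Python traceback. Here it shows that the error occurred in the call to fc. fc contains one multiplication and one addition, and from the summary we know that the input to the multiplication is at fault. Checking the code then locates the error: the line that generates the input x with size=(8, 12).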
Changing 12 to 13 in the code fixes the problem.
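A mismatch like this can also be caught before calling exe.run by comparing the fed array's shape with the declared shape. The sketch below is one possible check and is not something Paddle provides: shape_matches is a hypothetical helper, and it assumes that -1 in a declared shape stands for a dimension of any size.

```python
def shape_matches(declared, actual):
    """Return True if `actual` fits `declared`; -1 in `declared` matches any size."""
    if len(declared) != len(actual):
        return False
    return all(d == -1 or d == a for d, a in zip(declared, actual))


declared_x = [-1, 13]  # shape passed to fluid.layers.data for 'X' in the example

print(shape_matches(declared_x, (8, 12)))  # False: the shape fed in the failing example
print(shape_matches(declared_x, (8, 13)))  # True: the shape after the fix
```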
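Dynamic graph mode does not distinguish between the compile time and the run time of a network, so the error message does not need an inserted compile-time Python call stack. Run the following dynamic graph example code: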
import numpy
import paddle.fluid as fluid
place = fluid.CPUPlace()
with fluid.dygraph.guard(place):
    x = numpy.random.random(size=(10, 2)).astype('float32')
    linear = fluid.dygraph.Linear(1, 10)
    data = fluid.dygraph.to_variable(x)
    res = linear(data)
Running the code produces the following error message:
/work/scripts {master} python paddle_error_case2.py
Traceback (most recent call last):
File "paddle_error_case2.py", line 9, in <module>
res = linear(data)
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/dygraph/layers.py", line 600, in __call__
outputs = self.forward(*inputs, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/paddle/fluid/dygraph/nn.py", line 965, in forward
'transpose_Y', False, "alpha", 1)
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::MatMulOp::InferShape(paddle::framework::InferShapeContext*) const
3 paddle::imperative::PreparedOp::Run(paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap const&)
4 paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap, paddle::platform::Place const&, bool)
5 paddle::imperative::Tracer::TraceOp(std::string const&, paddle::imperative::NameVarBaseMap const&, paddle::imperative::NameVarBaseMap const&, paddle::framework::AttributeMap)
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: Input X's width should be equal to the Y's height, but received X's shape: [10, 2],Y's shape: [1, 10].
[Hint: Expected mat_dim_x.width_ == mat_dim_y.height_, but received mat_dim_x.width_:2 != mat_dim_y.height_:1.] at (/work/paddle/paddle/fluid/operators/matmul_op.cc:411)
[operator < matmul > error]
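Again, the error can be analyzed with the steps described earlier.
Checking the code makes it fairly easy to locate the error: the input x is created with size=(10, 2), but Linear(1, 10) expects an input whose width is 1.
Changing 2 to 1 in the code fixes the problem.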
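The rule behind both reports (X's width must equal Y's height) is easy to restate in plain Python. The sketch below only illustrates that rule and is not Paddle's InferShape implementation. check_matmul_shapes is a hypothetical helper, and the weight shape (1, 10) is taken from the "Y's shape" reported in the error above.

```python
def check_matmul_shapes(x_shape, y_shape):
    """Return the output shape of a 2-D X times a 2-D Y, or raise ValueError."""
    if len(x_shape) != 2 or len(y_shape) != 2:
        raise ValueError("this sketch only handles 2-D shapes")
    x_width, y_height = x_shape[1], y_shape[0]
    if x_width != y_height:
        raise ValueError(
            "X's width %d != Y's height %d (X: %s, Y: %s)"
            % (x_width, y_height, list(x_shape), list(y_shape)))
    return (x_shape[0], y_shape[1])


weight_shape = (1, 10)  # Y's shape reported for Linear(1, 10) in the error above

try:
    check_matmul_shapes((10, 2), weight_shape)  # input shape from the failing example
except ValueError as err:
    print("rejected:", err)

fixed = check_matmul_shapes((10, 1), weight_shape)  # input shape after the fix
print(fixed)  # (10, 10)
```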