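ControlFlag is an open-source project that uses machine learning to find bugs in arbitrary codebases. It initially focused on C/C++ code, but with the release of version 1.1 it also supports finding bugs in PHP code.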
Pay attention to the gcc and cmake versions; versions that are too old will not work. 1️⃣
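A quick pre-flight check can catch an outdated compiler before the build fails. The sketch below is only an illustration: the threshold of gcc 8 is an assumption drawn from the footnote (7.3.1 failed, 8.3.1 worked), not a documented requirement, `version_ge` is a hypothetical helper name, and `sort -V` is assumed to be available (GNU coreutils).

```shell
# Returns success when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort), which GNU coreutils provides.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum: only inferred from 7.3.1 failing and 8.3.1 working.
MIN_GCC="8"

if command -v gcc >/dev/null 2>&1; then
    gcc_version="$(gcc -dumpversion)"
    if version_ge "$gcc_version" "$MIN_GCC"; then
        echo "gcc $gcc_version looks new enough"
    else
        echo "gcc $gcc_version is probably too old for C++17"
    fi
else
    echo "gcc not found"
fi

# Print the cmake version for reference; no minimum is asserted here.
if command -v cmake >/dev/null 2>&1; then
    cmake --version | head -n 1
fi
```

Running this before `cmake .` gives a hint about whether the C++17 error described in the footnote is likely to appear.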
# Download the release package
https://github.com/IntelLabs/control-flag/releases/tag/v1.1
cd control-flag-1.1
cmake .
make -j
make test
# Create a log directory
[root@nfsFileSystem control-flag-1.1]# mkdir log
# Prepare a file containing faulty code
vi /vagrant/php/test.php
<?php
if (x = 7) y = x;
if($a=!3) echo 22;
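Note that the first test line uses bare x and y, which PHP does not treat as variables, so it may not be a fair probe of the "= instead of ==" mistake. A variant with $-prefixed variables could be written as sketched below. The directory and the file name test_vars.php are assumptions for illustration, and how ControlFlag reacts to this variant was not tested in the original post.

```shell
# Hypothetical scratch directory; the original post used /vagrant/php.
TEST_DIR="${TEST_DIR:-/tmp/controlflag-php-test}"
mkdir -p "$TEST_DIR"

# The quoted 'EOF' keeps the shell from expanding $x, $y and $a.
cat > "$TEST_DIR/test_vars.php" <<'EOF'
<?php
if ($x = 7) $y = $x;      // assignment where == was probably intended
if ($a =! 3) echo 22;     // "=!" parses as "= !3"; != was probably intended
EOF

# Optional syntax check, only if the php CLI happens to be installed.
if command -v php >/dev/null 2>&1; then
    php -l "$TEST_DIR/test_vars.php"
fi
```

Pointing the -d option of scripts/scan_for_anomalies.sh at this directory would then scan the variant in the same way as the original file.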
# Scan
[root@nfsFileSystem control-flag-1.1]# scripts/scan_for_anomalies.sh -d /vagrant/php -t /vagrant/php_controlflag_if_stmts.ts -o log -l 3
Training: start.
Trie L1 build took: 13.641s
Trie L2 build took: 16.266s
Training: complete.
Storing logs in log
# View the scan results
[vagrant@nfsFileSystem control-flag-1.1]$ grep "Potential anomaly" -C 5 log/thread_0.log
[TID=139824646551296] Scanning File: /vagrant/php/test.php
Level:ONE Expression:(parenthesized_expression (binary_expression ("=") (variable_name (name))(unary_op_expression (integer)))) not found in training dataset: Source file: /vagrant/php/test.php:3:2:($a=!3)
Expression is Okay
Level:TWO Expression:(parenthesized_expression (assignment_expression left: (variable_name (name)) right: (unary_op_expression (integer)))) not found in training dataset: Source file: /vagrant/php/test.php:3:2:($a=!3)
Expression is Potential anomaly
Did you mean:(parenthesized_expression (assignment_expression left: (variable_name (name)) right: (unary_op_expression (integer)))) with editing cost:0 and occurrences: 0
Did you mean:(parenthesized_expression (binary_expression left: (variable_name (name)) right: (unary_op_expression (integer)))) with editing cost:2 and occurrences: 217
Did you mean:(parenthesized_expression (assignment_expression left: (variable_name (name)) right: (variable_name (name)))) with editing cost:2 and occurrences: 3
Judging from the scan results, the code if($a=!3) echo 22; was reported as "Expression is Potential anomaly", along with a few guesses at what was intended.
By contrast, no problem was found in the code if (x = 7) y = x; and it was reported as "Expression is Okay".
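In fact, I have privately scanned several complete PHP projects and also tried many kinds of erroneous PHP syntax. Disappointingly, almost none of the errors were detected, and the few cases that were reported as "Expression is Potential anomaly" were mostly false positives.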
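When scanning a whole project, a tally of the two verdicts is easier to judge than reading every log entry. The sketch below counts the verdict strings seen in the log output above. The function name summarize_log_dir is hypothetical, and it assumes every per-thread log follows the thread_N.log naming seen for thread_0.log.

```shell
# Tally the two verdict strings across all per-thread logs in a directory.
summarize_log_dir() {
    log_dir="$1"
    # "|| true" keeps a no-match result from aborting under "set -e".
    anomalies=$( (grep -h "Expression is Potential anomaly" "$log_dir"/thread_*.log 2>/dev/null || true) | wc -l )
    okay=$( (grep -h "Expression is Okay" "$log_dir"/thread_*.log 2>/dev/null || true) | wc -l )
    # Arithmetic expansion strips any padding that wc adds on some systems.
    echo "potential anomalies: $((anomalies)), okay: $((okay))"
}

# Default to the "log" directory created earlier.
summarize_log_dir "${1:-log}"
```

Each flagged expression still has to be reviewed by hand, since the counts alone cannot tell a real bug from a false positive.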
In short: not much use.
# Scanning C code with the C training dataset
[root@nfsFileSystem control-flag-1.1]# scripts/scan_for_anomalies.sh -d /vagrant/code/ -t /vagrant/c_lang_if_stmts_6000_gitrepos_small.ts -o log
Training: start.
Trie L1 build took: 11.483s
Trie L2 build took: 6.254s
Training: complete.
Storing logs in log
Scan progress:2/2 ... in progress
1️⃣ If the gcc version is too old (for example 7.3.1), an error like the following is reported. After I switched to 8.3.1 the build worked:
CMake Error in src/CMakeLists.txt:
Target "cf_base" requires the language dialect "CXX17" , but CMake does not
know the compile flags to use to enable it.