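Operating system: CentOS 6.5, 64-bit
JDK version: 7
The test program is simple: one class with a single main method. The flow is roughly:
It first reads from the arguments the interval to wait between zip entries, then the path of the zip file. It then uses the ZipFile API to list the full path name of each entry in the zip file, sleeping for the interval after each entry so the behavior is easy to observe.
The code is as follows: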
package com.spiro.test;

import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class App {
    /**
     * Usage: App <interval in ms to get entry> <zip file path>
     * @param args
     */
    public static void main(String[] args) {
        // Delay between two entries, in milliseconds.
        long interval = Long.parseLong(args[0]);
        System.out.println("interval = " + interval);
        String filename = args[1];
        System.out.println("filename = " + filename);
        ZipFile zipFile = null;
        try {
            zipFile = new ZipFile(filename);
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                System.out.println(entry.getName());
                // Sleep after each entry to leave time to modify the file from another terminal.
                try {
                    Thread.sleep(interval);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (zipFile != null) {
                try {
                    zipFile.close();
                } catch (IOException e) {
                    // Ignore failures on close.
                }
            }
        }
    }
}
First, place a test archive named test.zip in the /tmp directory.
Package the program, deploy it to the server, and run the following command:
java -classpath $CLASSPATH com.spiro.test.App 5000 /tmp/test.zip > $LOG_HOME/app.log
app.log output:
interval = 5000
filename = /tmp/test.zip
frontend/
frontend/gulpfile.js
frontend/node_modules/
frontend/node_modules/.bin/
frontend/node_modules/.bin/browser-sync
frontend/node_modules/.bin/browser-sync.cmd
frontend/node_modules/.bin/gulp
Right after that, open a new terminal and run the following command:
echo "abcd" > /tmp/test.zip
This overwrites the content of test.zip.
app.log then shows:
interval = 5000
filename = /tmp/test.zip
frontend/
frontend/gulpfile.js
frontend/node_modules/
frontend/node_modules/.bin/
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0x00007fa6701cb7b2, pid=7262, tid=140352832034560
#
# JRE version: OpenJDK Runtime Environment (7.0_141-b02) (build 1.7.0_141-mockbuild_2017_05_09_14_20-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.141-b02 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.6.10
# Distribution: CentOS release 6.9 (Final), package rhel-2.6.10.1.el6_9-x86_64 u141-b02
# Problematic frame:
# C [libzip.so+0x47b2]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/jvm-7262/hs_error.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
# http://icedtea.classpath.org/bugzilla
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
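Checking with jps shows that the process is gone.
According to the log, the JVM error report was written to /tmp/jvm-7262/hs_error.log.
The stack recorded in that file: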
Stack: [0x00007fa670a25000,0x00007fa670b26000], sp=0x00007fa670b24650, free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libzip.so+0x47b2]
C [libzip.so+0x4dc8] ZIP_GetNextEntry+0x48
j java.util.zip.ZipFile.getNextEntry(JI)J+0
j java.util.zip.ZipFile.access$500(JI)J+2
j java.util.zip.ZipFile$1.nextElement()Ljava/util/zip/ZipEntry;+54
j java.util.zip.ZipFile$1.nextElement()Ljava/lang/Object;+1
j com.spiro.test.App.main([Ljava/lang/String;)V+100
v ~StubRoutines::call_stub
V [libjvm.so+0x60ea4e]
V [libjvm.so+0x60d5e8]
V [libjvm.so+0x61e9c7]
V [libjvm.so+0x632fac]
C [libjli.so+0x34c5]
The Java code where the crash occurs is this native method of java.util.zip.ZipFile:
private static native long getNextEntry(long jzfile, int i);
This reproduces the problem.
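From what I found, this is related to the mmap mechanism of Linux. Roughly: mmap maps a file into memory, which makes access to the file faster, but if the file is modified while it is being read, the access can fail and crash the JVM. Here, echo "abcd" > /tmp/test.zip truncates the file in place, so the part of the mapping beyond the new end of the file is no longer backed by the file, and touching it raises the SIGBUS seen in the log.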
https://bugs.openjdk.java.net/browse/JDK-8160933
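To make the mechanism more concrete, here is a minimal sketch of what reading a file through a memory mapping looks like from Java. It is not the code libzip.so uses (that is native C); the class name MmapSketch and the method firstByteViaMmap are made up for illustration, and the file is assumed to be non-empty.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class, not part of the JDK or of the test program.
final class MmapSketch {

    /** Maps the whole file read-only and returns its first byte (file must be non-empty). */
    static byte firstByteViaMmap(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping covers the file size at the time of this call.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // This is a plain memory access to a page backed by the file, not a read() call.
            // If the file were truncated after map(), a mapped page past the new end of file
            // would no longer be backed by anything, and accessing it would fault.
            return buffer.get(0);
        }
    }
}
```

The point of the sketch is only that a mapped read has no return code to check: once the file shrinks underneath the mapping, the failure arrives as a signal at the memory access itself, which matches the SIGBUS frame inside libzip.so in the crash report.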
One workaround suggested online is to add the JVM option -Dsun.zip.disableMemoryMapping=true, which disables mmap for zip files. Let's see what effect this option has.
Run:
java -Dsun.zip.disableMemoryMapping=true -classpath $CLASSPATH com.spiro.test.App 5000 /tmp/test.zip > $LOG_HOME/app.log
app.log output:
frontend/
frontend/gulpfile.js
frontend/node_modules/
frontend/node_modules/.bin/
frontend/node_modules/.bin/browser-sync
frontend/node_modules/.bin/browser-sync.cmd
frontend/node_modules/.bin/gulp
Run:
echo "abcd" > /tmp/test.zip
This time the process keeps running, and after a while it throws an Error:
Exception in thread "main" java.util.zip.ZipError: jzentry == 0,
jzfile = 140689976146736,
total = 8313,
name = /tmp/test.zip,
i = 62,
message = null
at java.util.zip.ZipFile$1.nextElement(ZipFile.java:505)
at java.util.zip.ZipFile$1.nextElement(ZipFile.java:483)
at com.spiro.test.App.main(App.java:38)
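With mmap disabled, the process did not crash. Instead, it threw an error after a while and then exited.
With mmap disabled, the file is not mapped into memory. Presumably the program loads part of the data into memory in advance and keeps reading from it, and the error only occurs once it reaches the data that has changed in the file. This is only a guess for now; I will look into it further when I have time.
So the root cause of the JVM crash is that, with mmap enabled, the zip file was modified while it was being read.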
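Since the trigger is an in-place overwrite of a file that is still mapped, one way for the writing side to avoid it is to never overwrite the file in place: write the new content to a temporary file in the same directory and then rename it over the target. The sketch below assumes a POSIX file system, where the rename replaces the directory entry while a reader that already opened the old file keeps its old content. The names AtomicReplace and replace are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for the writing side; not part of the original test program.
final class AtomicReplace {

    /** Replaces target with newContent without truncating the existing file in place. */
    static void replace(Path target, byte[] newContent) throws IOException {
        // Create the temporary file next to the target so the rename stays on one file system.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "zip-", ".tmp");
        try {
            Files.write(tmp, newContent);
            // With ATOMIC_MOVE, whether an existing target is replaced is implementation
            // specific; on Linux this maps to rename(), which replaces it.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if writing or moving failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Compared with echo "abcd" > /tmp/test.zip, which truncates the existing file, this leaves the old file data untouched until the last reader closes it, so an existing mapping stays valid.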
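There are two ways to address it:
1. Make sure, in the code logic, that the zip file is not modified by other logic while it is being used.
2. Add -Dsun.zip.disableMemoryMapping=true to the JVM startup options.
Personally, I think option 2 only treats the symptom and not the cause; the real problem is that shared access to the file needs to be controlled.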
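For option 1, when the writer cannot be changed, the reading side can at least avoid holding the shared file open for a long time. The following is a minimal sketch under that assumption: it copies the zip file to a private temporary file and enumerates the copy. ZipSnapshot and listEntryNames are hypothetical names, not an existing API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for the reading side; not part of the original test program.
final class ZipSnapshot {

    /** Copies the zip file to a private temporary file and lists the entry names of the copy. */
    static List<String> listEntryNames(Path source) throws IOException {
        Path snapshot = Files.createTempFile("zip-snapshot-", ".zip");
        try {
            // After this copy, later changes to the shared file no longer affect the reader.
            Files.copy(source, snapshot, StandardCopyOption.REPLACE_EXISTING);
            List<String> names = new ArrayList<String>();
            try (ZipFile zip = new ZipFile(snapshot.toFile())) {
                Enumeration<? extends ZipEntry> entries = zip.entries();
                while (entries.hasMoreElements()) {
                    names.add(entries.nextElement().getName());
                }
            }
            return names;
        } finally {
            Files.deleteIfExists(snapshot);
        }
    }
}
```

This does not remove the race entirely: if the shared file is being rewritten during the copy, the snapshot may be incomplete, in which case opening it is expected to fail with an exception instead of crashing the JVM. It also costs one extra copy per read, so it only makes sense for files of moderate size.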