前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >IKAnalyzer2012FF + Lucene4.9 TokenStream contract violation: reset()/close() call missing

IKAnalyzer2012FF + Lucene4.9 TokenStream contract violation: reset()/close() call missing

作者头像
全栈程序员站长
发布2022-09-15 18:01:39
4050
发布2022-09-15 18:01:39
举报
文章被收录于专栏:全栈程序员必看

大家好,又见面了,我是你们的朋友全栈君。

异常信息如下:

代码语言:javascript
复制
java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
	at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:111)
	at java.io.Reader.read(Reader.java:140)
	at org.wltea.analyzer.core.AnalyzeContext.fillBuffer(AnalyzeContext.java:124)
	at org.wltea.analyzer.core.IKSegmenter.next(IKSegmenter.java:122)
	at org.wltea.analyzer.lucene.IKTokenizer.incrementToken(IKTokenizer.java:78)
	at unit.test.IKAnalyzerTest.test01(IKAnalyzerTest.java:29)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at junit.framework.TestCase.runBare(TestCase.java:134)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:243)
	at junit.framework.TestSuite.run(TestSuite.java:238)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

代码:

代码语言:javascript
复制
public void test01() throws IOException {
<span style="white-space:pre">	</span>String text="基于java语言开发的轻量级的中文分词工具包,一个轻量级框架!"; 
		
        // 创建分词对象  
        Analyzer analyzer = new IKAnalyzer(true);       
        StringReader reader = new StringReader(text);  
        
        // 分词  
        TokenStream ts = analyzer.tokenStream("", reader);  
        CharTermAttribute term = ts.getAttribute(CharTermAttribute.class);  
        
        // 遍历分词数据  
        while(ts.incrementToken()){  
            System.out.print(term.toString()+" | ");  
        }  
        
        reader.close();  
        System.out.println();  
}

上面的代码为旧的分词步骤,按照新的API,调用TokenStream的流程如下:

1、Instantiation of TokenStream/TokenFilters which add/get attributes to/ the AttributeSource. 2、The consumer calls reset(). 3、The consumer retrieves attributes the stream and stores local references to all attributes it wants to access. 4、The consumer calls incrementToken() until it returns false consuming the attributes after each call. 5、The consumer calls end() so that any end-of-stream operations can be performed. 6、The consumer calls close() to release any resource when finished using the TokenStream.

所以在调用incrementToken()之前需要调用一次reset(),如下面的第10行代码:

代码语言:javascript
复制
public void test01() throws IOException {
		String text="基于java语言开发的轻量级的中文分词工具包,一个轻量级框架!"; 
		
        // 创建分词对象  
        Analyzer analyzer = new IKAnalyzer(true);       
        StringReader reader = new StringReader(text);  
        
        // 分词  
        TokenStream ts = analyzer.tokenStream("", reader); 
        ts.reset();
        CharTermAttribute term = ts.getAttribute(CharTermAttribute.class);  
        
        // 遍历分词数据  
        while(ts.incrementToken()){  
            System.out.print(term.toString()+" | ");  
        }  
        
        reader.close();  
        System.out.println();  
	}

发布者:全栈程序员栈长,转载请注明出处:https://javaforall.cn/163106.html原文链接:https://javaforall.cn

本文参与 腾讯云自媒体同步曝光计划,分享自作者个人站点/博客。
如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档