Lucene Tokenizer: IKAnalyzer

The following JARs need to be imported:




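import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.junit.Test;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class IKAnalyze {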

    // Tokenizer test
    @Test
    public void test() throws Exception {
        //String word="a good person,Happy Every Day";
        //String word="我为何不哭,因为我仅存的,就只有坚强了";
        String word="中华人民共和国KWWL  DRGYBN,北大老鸟,我们是";
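        // Analyzer analyzer = new StandardAnalyzer();      // unigram tokenization
        // Analyzer analyzer = new CJKAnalyzer();           // bigram tokenization
        // Analyzer analyzer = new SmartChineseAnalyzer();  // smart Chinese tokenization
        // IK tokenization: true selects smart mode, false selects finest-grained mode
        Analyzer analyzer = new IKAnalyzer(true);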
        testAnalyzer(analyzer,word);
    }
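    // Tokenize the given text with the given analyzer and print each token
    public void testAnalyzer(Analyzer analyzer, String text) throws Exception {
        System.out.println("Analyzer: " + analyzer.getClass());
        // try-with-resources closes both the token stream and the reader
        try (StringReader reader = new StringReader(text);
             TokenStream tokenStream = analyzer.tokenStream("content", reader)) {
            CharTermAttribute cta = tokenStream.addAttribute(CharTermAttribute.class);
            tokenStream.reset();  // must be called before the first incrementToken()
            while (tokenStream.incrementToken()) {
                System.out.println(cta);
            }
            tokenStream.end();    // marks the end of the stream once all tokens are consumed
        }
    }
}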
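
To make the difference between the unigram and bigram tokenization mentioned in the commented-out analyzers more concrete, here is a minimal plain-Java sketch. It is only an illustration and not Lucene's actual implementation: the class `NGramDemo` and its methods `unigrams` and `bigrams` are hypothetical names, and the sketch assumes the input is a single run of Chinese characters. Real analyzers also handle Latin text, punctuation, and case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only; not part of Lucene or IK.
class NGramDemo {

    // Unigram tokenization: every character becomes its own token.
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints()
            .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // Bigram tokenization: overlapping pairs of adjacent characters.
    // A single-character input is returned as one token.
    static List<String> bigrams(String text) {
        List<String> chars = unigrams(text);
        List<String> tokens = new ArrayList<>();
        if (chars.size() == 1) {
            tokens.add(chars.get(0));
        }
        for (int i = 0; i + 1 < chars.size(); i++) {
            tokens.add(chars.get(i) + chars.get(i + 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "中华人民共和国";
        System.out.println(unigrams(text)); // [中, 华, 人, 民, 共, 和, 国]
        System.out.println(bigrams(text));  // [中华, 华人, 人民, 民共, 共和, 和国]
    }
}
```

Neither scheme knows where words actually begin or end, which is why the bigram output contains pairs such as 华人 and 民共 that straddle word boundaries. A dictionary-based analyzer such as IKAnalyzer aims to avoid this by matching the text against a word list.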


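Copyright notice: This is an original article by qq_36024638, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.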