Recently I have been trying to extract information from HTML source code using Java. The basic requirement is to get the main content area of the page: I want to use Java to parse the div (or other element) that contains the content I need. Almost all of that content consists of Chinese characters, like: 好好学习
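A minimal sketch of one possible starting point, assuming the main content is the innermost `div` holding the most Chinese (Han) characters. The class name `MainContentFinder`, the methods `countHan` and `extractMainText`, and the sample HTML are all hypothetical and used only for illustration. It relies on the standard library alone and on regular expressions, which are fragile on real-world HTML; a dedicated HTML parser such as jsoup would likely be more robust.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MainContentFinder {
    // Matches innermost <div> blocks only (the body may not contain another <div>).
    // Assumption: the main content sits in such a leaf div.
    private static final Pattern LEAF_DIV = Pattern.compile(
            "<div\\b[^>]*>((?:(?!<div\\b).)*?)</div>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches any remaining tag so it can be stripped from the div body.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Counts code points whose Unicode script is Han (Chinese characters).
    public static int countHan(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    // Returns the tag-stripped text of the leaf div with the most Han characters,
    // or an empty string if no div contains any.
    public static String extractMainText(String html) {
        String best = "";
        int bestCount = 0;
        Matcher m = LEAF_DIV.matcher(html);
        while (m.find()) {
            String text = TAG.matcher(m.group(1)).replaceAll("").trim();
            int count = countHan(text);
            if (count > bestCount) {
                bestCount = count;
                best = text;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical sample page: a navigation div and a main content div.
        String html = "<html><body><div id=\"nav\">Home | 关于</div>"
                + "<div class=\"main\"><p>好好学习，天天向上</p></div></body></html>";
        System.out.println(extractMainText(html));
    }
}
```

The heuristic simply scores each candidate block by its Han character count, so pages whose sidebars or comment sections contain more Chinese text than the article itself would need a different scoring rule.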