Mondrian对Hive的支持
一.测试Mondrian对Hive的支持
1.创建一个web项目,把olap4j.jar Mondrian.jar以及hive相关的jar包放进项目中
2. 准备四张表 Customer - 客户信息维表 Product - 产品维表 ProductType - 产品类表维表 Sale - 销售记录表:
在hive shell下执行下面命令:
create database mondrian;
use mondrian;
create table Sale (saleId INT, proId INT, cusId INT, unitPrice FLOAT, number INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
create table Product (proId INT, proTypeId INT, proName STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
create table ProductType (proTypeId INT, proTypeName STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
create table Customer (cusId INT, gender STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
3.在hive的存放目录下新建一个文件夹myTmp,放进四个数据文件:
# Customer文件
1,F
2,M
3,M
4,F
# ProductType文件
1,electrical
2,digital
3,furniture
# Product数据文件
1,1,washing machine
2,1,television
3,2,mp3
4,2,mp4
5,2,camera
6,3,chair
7,3,desk
# Sale数据文件
1,1,1,340.34,2
2,1,2,140.34,1
3,2,3,240.34,3
4,3,4,540.34,4
5,4,1,80.34,5
6,5,2,90.34,26
7,6,3,140.34,7
8,7,4,640.34,28
9,6,1,140.34,29
10,7,2,740.34,29
11,5,3,30.34,28
12,4,4,1240.34,72
13,3,1,314.34,27
14,3,2,45.34,27
再把文件数据加载到表里(在hive shell下执行如下命令:)
load data local inpath "/home/apache-hive-2.1.1-bin/myTmp/Customer" OVERWRITE into table Customer
load data local inpath "/home/apache-hive-2.1.1-bin/myTmp/ProductType" OVERWRITE into table ProductType
load data local inpath "/home/apache-hive-2.1.1-bin/myTmp/Product" OVERWRITE into table Product
load data local inpath "/home/apache-hive-2.1.1-bin/myTmp/Sale" OVERWRITE into table Sale
4.将xml文件放进web项目的src下,取名olapSchema.xml
<Schema name="hello">
<Cube name="Sales">
<!-- 事实表(fact table) -->
<Table name="Sale"/>
<!-- 客户维 -->
<Dimension name="cusGender" foreignKey="cusId">
<Hierarchy hasAll="true" allMemberName="allGender" primaryKey="cusId">
<Table name="Customer"/>
<Level name="gender" column="gender"/>
</Hierarchy>
</Dimension>
<!-- 产品类别维 -->
<Dimension name="proType" foreignKey="proId">
<Hierarchy hasAll="true" allMemberName="allPro" primaryKey="proId" primaryKeyTable="Product">
<join leftKey="proTypeId" rightKey="proTypeId">
<Table name="Product"/>
<Table name="ProductType"/>
</join>
<Level name="proTypeId" column="proTypeId" nameColumn="proTypeName" uniqueMembers="true" table="ProductType"/>
<Level name="proId" column="proId" nameColumn="proName" uniqueMembers="true" table="Product"/>
</Hierarchy>
</Dimension>
<Measure name="numb" column="number" aggregator="sum" datatype="Numeric"/>
<Measure name="totalSale" aggregator="sum" formatString="$ #,##0.00">
<!-- unitPrice*number所得值的列 -->
<MeasureExpression>
<SQL dialect="generic">unitPrice*number</SQL>
</MeasureExpression>
</Measure>
<CalculatedMember name="averPri" dimension="Measures">
<Formula>[Measures].[totalSale] / [Measures].[numb]</Formula>
<CalculatedMemberProperty name="FORMAT_STRING" value="$ #,##0.00"/>
</CalculatedMember>
</Cube>
</Schema>
5.在项目中新建一个类
public class ConnectHive { @org.junit.Test public void Test(){ Connection connection = DriverManager.getConnection( "Provider=mondrian;" + "Jdbc=jdbc:hive2://169.254.147.128:10000/mondrian;" + "JdbcUser=hadoop;JdbcPassword=FUyu0117;" + "Catalog="+ this.getClass().getResource("/").getPath() + "olapSchema" + ".xml;" + "JdbcDrivers=org.apache.hive.jdbc.HiveDriver", null); Query query = connection.parseQuery( "select \n" + "{[Measures].[numb],[Measures].[averPri],[Measures].[totalSale]} on columns,\n" + "{([proType].[allPro],[cusGender].[allGender])} \n" + "on rows\n" + "from [Sales]\n"); @SuppressWarnings("deprecation") Result result = connection.execute(query); PrintWriter pw = new PrintWriter(System.out); result.print(pw); pw.flush(); } } |
---|
6.启动hiveserver2 (在hive的bin目录下执行)
hive --service hiveserver2 &
7.测试连接是否已连上
在hive的bin目录下,执行beeline,然后输入 !connect jdbc:hive2://上面设置的ip地址:10000 user password 后面两个是你创建的用户名和密码
!connect jdbc:hive2://169.254.147.128:10000 hadoop FUyu0117
查看表:
8.运行测试类
执行结果:
遇到的问题及处理办法:
1. Caused by: MetaException(message:Version information not found in metastore. )
处理办法:修改conf/hive-site.xml 中的 “hive.metastore.schema.verification” 值为 false 即可
2.org.apache.hadoop.hive.ql.metadata.HiveException:MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))
处理办法:这是因为作为metastore的数据库没有初始化
只要一条命令:./schematool -initSchema -dbType mysql(这里按照你自己用的数据库来初始化,可以是derby,我用的mysql)(hive的bin目录下)
3. java.lang.NoClassDefFoundError: org/eigenbase/xom/XOMUtil
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
Caused by: java.lang.ClassNotFoundException: org.eigenbase.xom.XOMUtil
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
处理办法:加入jar包eigenbase-xom.jar
4. ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
处理办法:将log4j2.xml 配置放在src下
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="OFF">
<Appenders>
<Console name="Console" target="SYSTEM_OUT">
<PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n" />
</Console>
</Appenders>
<Loggers>
<Logger name="hive.server2.query.ApiQueryTest" level="trace">
<AppenderRef ref="Console" />
</Logger>
<Logger name="hive.server2.query" level="debug">
<AppenderRef ref="Console" />
</Logger>
<Root level="error">
<AppenderRef ref="Console" />
</Root>
</Loggers>
</Configuration>
5. java.lang.NoClassDefFoundError: org/eigenbase/resgen/ShadowResourceBundle
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
Caused by: java.lang.ClassNotFoundException: org.eigenbase.resgen.ShadowResourceBundle
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
处理办法:加入jar包 eigenbase-resgen.jar
6. mondrian.olap.MondrianException: Mondrian Error:Internal error: Virtual file is not readable: ${path}/src/olapSchema.xml
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:984)
at mondrian.olap.Util.newInternal(Util.java:2403)
at mondrian.olap.Util.newError(Util.java:2418)
at mondrian.olap.Util.readVirtualFile(Util.java:3356)
处理办法:在代码里把catalog路径"Catalog=file:/src/olapSchema.xml;" +改成下面的
"Catalog="+this.getClass().getResource("/").getPath() + "olapSchema" + ".xml;" +
7. mondrian.olap.MondrianException: Mondrian Error:Internal error: Error while creating SQL connection: Jdbc=jdbc:hive://169.254.147.128:10000/mondrian; JdbcUser=; JdbcPassword=
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:987)
at mondrian.olap.Util.newInternal(Util.java:2410)
at mondrian.olap.Util.newError(Util.java:2426)
Caused by: java.sql.SQLException: No suitable driver found for jdbc:hive://169.254.147.128:10000/mondrian
at java.sql.DriverManager.getConnection(DriverManager.java:604)
at java.sql.DriverManager.getConnection(DriverManager.java:190)
处理办法:在代码里把连接路径的hive改成hive2
"Jdbc=jdbc:hive2://169.254.147.128:10000/mondrian;" +
8. java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at org.apache.hive.jdbc.HiveConnection.createUnderlyingTransport(HiveConnection.java:418)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
处理办法:加入jar包:commons-configuration-1.6.jar
9.启动hiveserver2时卡住不动:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/app/hadoop-2.6.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
处理办法:将hive的lib目录下的log4j-slf4j-impl-2.4.1.jar删除,它与/home/hadoop/app/hadoop-2.6.2/share/hadoop/common/lib下的slf4j-log4j12-1.7.5.jar重复包含了
10. 启动hiveserver2时报错:org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083.
at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:109)
Caused by: java.net.BindException: Address already in use
at java.net.PlainSocketImpl.socketBind(Native Method)
处理办法:重复启动,kill后重新启动即可解决
11. mondrian.olap.MondrianException: Mondrian Error:Internal error: Error while creating SQL connection: Jdbc=jdbc:hive2://169.254.147.128:10000/mondrian; JdbcUser=hive; JdbcPassword=hive
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:987)
at mondrian.olap.Util.newInternal(Util.java:2410)
Caused by: java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://169.254.147.128:10000/mondrian: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: root is not allowed to impersonate hive
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException:
处理办法:修改hadoop 配置文件 etc/hadoop/core-site.xml,加入如下配置项
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
Hadoop.proxyuser.root.hosts配置项名称中root部分为报错User:* 中的用户名部分
例如User: hadoop is not allowed to impersonate anonymous则需要将xml变更为如下格式
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
• 重启hadoop
12. mondrian.olap.MondrianException: Mondrian Error:Internal error: Error while creating SQL connection: Jdbc=jdbc:hive2://169.254.147.128:10000/mondrian; JdbcUser=; JdbcPassword=
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:987)
Caused by: java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://169.254.147.128:10000/mondrian: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=anonymous, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
处理办法:由于Hive没有hdfs:/tmp目录的权限,赋权限即可:
hadoop dfs -chmod -R 777 /tmp
13. mondrian.olap.MondrianException: Mondrian Error:Internal error: Error while executing query [select {[Measures].[numb], [Measures].[averPri], [Measures].[totalSale]} ON COLUMNS,
{([proType].[allPro], [cusGender].[allGender])} ON ROWS
from [Sales]
]
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:987)
Caused by: mondrian.olap.MondrianException: Mondrian Error:mondrian.olap.MondrianException: Mondrian Error:Failed to load segment form SQL
at mondrian.rolap.agg.SegmentLoader.loadImpl(SegmentLoader.java:241)
Caused by: mondrian.olap.MondrianException: Mondrian Error:Internal error: Error while loading segment; sql=[select sum(Sale.number) as m0, sum(unitPrice*number) as m1 from Sale Sale]
at mondrian.resource.MondrianResource$_Def0.ex(MondrianResource.java:987)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException Unable to determine if hdfs://dfy:9000/user/hive/warehouse/mondrian.db/sale is encrypted: org.apache.hadoop.security.AccessControlException: Permission denied: user=anonymous, access=EXECUTE, inode="/user":hadoop:supergroup:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
处理办法:代码中改成JdbcUser=hadoop;JdbcPassword=FUyu0117;
14.报错hive Specified key was too long; max key length is 767 bytes
show variables like 'character%';
+--------------------------+-----------------------------------+
| Variable_name | Value |
+--------------------------+-----------------------------------+
| character_set_client | gbk |
| character_set_connection | gbk |
| character_set_database | gbk |
| character_set_filesystem | binary |
| character_set_results | gbk |
| character_set_server | gbk |
| character_set_system | utf8 |
| character_sets_dir | E:\phpStudy\MySQL\share\charsets\ |
+—————————————+-----------------------------------+
除了character_set_system | utf8,其它可以latin1
set names latin1它相当于下面的三句指令:
SET character_set_client = gbk;
SET character_set_results = gbk;
SET character_set_connection = gbk;
set global binlog_format='MIXED';READ-COMMITTED需要把bin-log以mixed方式来记录
改完上述两种方法后,我还是会出现以上问题,我把mysql的metastore_db里面的所有表删除,hadoop重启.