Common HBase errors:
1. How to fix java.lang.IllegalArgumentException: KeyValue size too large when inserting into HBase
2020-04-08 09:34:38,120 ERROR [main] ExecReducer: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"0","_col1":"","_col2":"2020-04-08","_col3":"joyshebaoBeiJing","_col4":"105","_col5":"北京,"},"value":null}
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: KeyValue size too large
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:763)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
... 7 more
Caused by: java.lang.IllegalArgumentException: KeyValue size too large
at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1577)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.validatePut(BufferedMutatorImpl.java:158)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:133)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:119)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1085)
at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:146)
at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:117)
at org.apache.hadoop.hive.ql.io.HivePassThroughRecordWriter.write(HivePassThroughRecordWriter.java:40)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:717)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1007)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:818)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:692)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:758)
... 8 more
When a put is executed, the HBase client checks every cell to be inserted and verifies that each one is smaller than maxKeyValueSize; if a cell exceeds maxKeyValueSize, the client throws the KeyValue size too large exception.
hbase.client.keyvalue.maxsize is the maximum size of a single KeyValue instance. It sets an upper bound on the size of a single entry in a store file; because a KeyValue cannot be split, the limit prevents a region from becoming unsplittable due to one oversized cell. A sane value is a fraction of the maximum region size. Setting it to 0 or less disables the check.
Default: 10485760
So the default limit is 10 MB: any cell larger than 10 MB fails with the KeyValue size too large error.
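For reference, the client-side check behaves roughly like the sketch below. It is modeled on HTable.validatePut from the stack trace above, simplified rather than copied verbatim from the HBase source:

import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.KeyValueUtil;
import org.apache.hadoop.hbase.client.Put;

// maxKeyValueSize is the configured value of hbase.client.keyvalue.maxsize.
static void validatePut(Put put, int maxKeyValueSize) {
    if (put.isEmpty()) {
        throw new IllegalArgumentException("No columns to insert");
    }
    if (maxKeyValueSize > 0) { // 0 or less disables the check
        // Every cell of every column family in the Put is measured individually.
        for (List<Cell> cells : put.getFamilyCellMap().values()) {
            for (Cell cell : cells) {
                if (KeyValueUtil.length(cell) > maxKeyValueSize) {
                    throw new IllegalArgumentException("KeyValue size too large");
                }
            }
        }
    }
}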
Solutions:
Option 1: As the official documentation suggests, raise hbase.client.keyvalue.maxsize in the configuration; the override belongs in hbase-site.xml (hbase-default.xml only documents the defaults and should not be edited). Here 20971520 bytes = 20 MB:
<property>
  <name>hbase.client.keyvalue.maxsize</name>
  <value>20971520</value>
</property>
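The client loads hbase-site.xml from its classpath when HBaseConfiguration.create() runs, so restart the client process (or resubmit the Hive job) for the change to take effect.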
Option 2: Change the code and set the property through the Configuration object:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

Configuration conf = HBaseConfiguration.create();
conf.set("hbase.client.keyvalue.maxsize", "20971520"); // 20 MB
This second approach is recommended: the override ships with the client code and takes effect without touching cluster configuration files.
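Put together, a minimal end-to-end sketch (the table name demo_table, column family cf, and the 15 MB payload are hypothetical placeholders for illustration). The property must be set before the Connection is created so the client picks it up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class KeyValueMaxSizeExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Raise the client-side cell limit from the 10 MB default to 20 MB.
        conf.set("hbase.client.keyvalue.maxsize", "20971520");

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("demo_table"))) {
            // A 15 MB cell: rejected under the 10 MB default, accepted at 20 MB.
            byte[] bigValue = new byte[15 * 1024 * 1024];
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), bigValue);
            table.put(put);
        }
    }
}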