Version information:
Hadoop 2.4.1, HBase 0.98.7, with ZooKeeper provided by the instance bundled with HBase.
HBase errors:
1> ERROR: org.apache.hadoop.hbase.NotServingRegionException: Region TABLE_DATA,,1499846409408.a956e500977ea35daa46971885f8b6a1. is not online on slave4,60020,1513925992700
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2762)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:4231)
at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3143)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29925)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
at java.lang.Thread.run(Thread.java:662)
2> WARN #11, table=TABLE_DATA, attempt=36/35 failed 1 ops, last exception: org.apache.hadoop.hbase.exceptions.RegionOpeningException: org.apache.hadoop.hbase.exceptions.RegionOpeningException: Region TABLE_DATA,,1499846409408.a956e500977ea35daa46971885f8b6a1. is opening on datacube5,60020,1513933961539
3> Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 3049711 number_of_rows: 100 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3177)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29925)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
at java.lang.Thread.run(Thread.java:662)
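(This error appears occasionally and disappears after the application is restarted.)
Error analysis:
A likely cause is that the ZooKeeper instance bundled with HBase failed after running for a long time and corrupted the metadata. After HBase was restarted, the metadata did not recover properly, so data could not be read from some tables and nodes. Running hbck to check region consistency showed that several tables reported "inconsistencies detected".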
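As a rough sketch of the consistency check described above, the summary that hbck prints at the end of its output can be parsed to see how many inconsistencies were found. The function name `hbck_inconsistencies` is hypothetical, and the summary wording ("N inconsistencies detected") is assumed to match what this hbck version prints; adjust the pattern if yours differs.

```shell
# Hypothetical helper: read hbck output on stdin and print the number taken
# from its summary line ("N inconsistencies detected").
# Prints nothing and returns 1 if no such line is found.
hbck_inconsistencies() {
    count=$(grep -o '[0-9][0-9]* inconsistencies detected' | tail -n 1 | cut -d' ' -f1)
    [ -n "$count" ] || return 1
    echo "$count"
}

# Usage (assumes the hbase launcher is on PATH):
#   hbase hbck 2>&1 | hbck_inconsistencies
```

A non-zero count means hbck found region inconsistencies and the repair step is worth considering.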
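Attempted fixes:
1. Try restarting HBase and see whether it runs normally. Operate both Hadoop and HBase as the hadoop user. Avoid force-killing the Hadoop and HBase processes where possible; stop them with the bundled commands so that data is not corrupted.
2. Reconfigure ZooKeeper: switch to an external ZooKeeper installed on the HBase data nodes, which are also the HDFS data nodes.
3. Check the Hadoop status and confirm that it is reported as healthy: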
bin/hadoop fsck /
bin/hadoop fsck /hbase
4. Repair HBase:
hbase hbck -repair
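Steps 3 and 4 can be chained so that the repair only runs once HDFS reports healthy. The sketch below is one possible way to do that, not a definitive procedure: `fsck_is_healthy` and `repair_if_healthy` are hypothetical names, the `HADOOP_CMD` and `HBASE_CMD` variables are assumptions for locating the launchers, and the check relies on fsck printing "is HEALTHY" in its final status line.

```shell
# Hypothetical helper: succeed if the fsck output read from stdin contains
# a HEALTHY status line (e.g. "The filesystem under path '/hbase' is HEALTHY").
fsck_is_healthy() {
    grep -q 'is HEALTHY'
}

# Hypothetical wrapper: check / and /hbase as in step 3, then run the repair
# from step 4. HADOOP_CMD and HBASE_CMD are assumptions; they default to the
# launchers found on PATH and can be overridden (e.g. HADOOP_CMD=bin/hadoop).
repair_if_healthy() {
    hadoop_cmd=${HADOOP_CMD:-hadoop}
    hbase_cmd=${HBASE_CMD:-hbase}
    for target in / /hbase; do
        if ! "$hadoop_cmd" fsck "$target" 2>/dev/null | fsck_is_healthy; then
            echo "HDFS is not healthy under $target; skipping hbck -repair" >&2
            return 1
        fi
    done
    "$hbase_cmd" hbck -repair
}
```

Calling `repair_if_healthy` as the hadoop user keeps this in line with step 1.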
Reposted from: https://my.oschina.net/nxxYqmvPOvsfH/blog/1594909