目录
snappy压缩:
介绍snappy
安装工具
编译hadoop(生产文件)
测试
使用版本:
hadoop-2.6.0-cdh5.7.0
centos 7
java 1.7.0_80
开始情况:
开始安装
2.安装工具:
java 安装并配置PATH
java下载路径: https://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase7-521261.html
vi /etc/profile (添加如下)
export JAVA_HOME=/opt/jdk1.8.0_191
export PATH=$JAVA_HOME/bin:$PATH
启动环境变量 source /etc/profile
安装基础软件
yum -y install gcc gcc-c++ libtool cmake maven zlib-devel openssl-devel protobuf
解压安装:
hadoop-2.6.0-cdh5.9.0-src.tar.gz(下载地址:http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0-src.tar.gz,也可下载二进制包,内包含src源码:hadoop-2.6.0-cdh5.9.0-tar.gz)
snappy1.1.1.tar.gz(下载地址:http://pkgs.fedoraproject.org/repo/pkgs/snappy/snappy-1.1.1.tar.gz/8887e3b7253b22a31f5486bca3cbc1c2/snappy-1.1.1.tar.gz)
安装snappy
# tar xf snappy-1.1.1.tar.gz
# cd snappy-1.1.1
# ./configure
# make && make install
查看snappy是否安装完成
# ll /usr/local/lib/ | grep snappy
3: 编译hadoop
# cd hadoop-2.6.0-cdh5.7.0
# mvn package -Pdist,native -DskipTests -Dtar -Drequire.snappy
把编译生成的hadoop动态库替换原来的
在hadoop2.x源码已经集成了Snappy压缩了,所以编译安装hadoop-snappy 根本是多余的,只要安装snappy本地库和重新编译hadoop native 库
4:修改配置文件:
hadoop.env.sh
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
core-site.xml
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
mapred-site.xml
<property>
<name>mapred.output.compress</name>
<value>true</value>
</property>
<property>
<name>mapred.output.compression.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapred.map.output.compression.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
hadoop测试:
全部配置好后(集群中所有的节点都需要copy动态库lib和修改配置),重启hadoop集群环境,运行自带的测试实例 wordcount,如果mapreduce过程中没有错误信息即表示snappy压缩安装方法配置成功。
hbase-site.xml
<property>
<name>hbase.regionserver.codecs</name>
<value>snappy</value>
</property>
hbase测试:
bin/hbase shell
create 'test',{NAME=>'cf',COMPRESSION='SNAPPY'}
put 'test','1','cf:f1','1'
hbase(main):014:1> disable 'tsdb' #禁用表
hbase(main):014:1> desc 'tsdb' #查看表结构
hbase(main):014:1> alter 'tsdb',{NAME=>'t',COMPRESSION => 'SNAPPY'} #压缩修改为snappy
hbase(main):014:1> enable 'tsdb' #使用该表
hbase(main):014:1> major_compact 'tsdb' #最好使该表的region compact一次
所遇问题:
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (clean) @ hadoop-main ---
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message:
Detected JDK Version: 1.8.0-191 is not in the allowed range [1.7.0,1.7.1000}].
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................ FAILURE [0.188s]
[INFO] Apache Hadoop Project POM ......................... SKIPPED
[INFO] Apache Hadoop Annotations ......................... SKIPPED
[INFO] Apache Hadoop Assemblies .......................... SKIPPED
[INFO] Apache Hadoop Project Dist POM .................... SKIPPED
[INFO] Apache Hadoop Maven Plugins ....................... SKIPPED
[INFO] Apache Hadoop MiniKDC ............................. SKIPPED
[INFO] Apache Hadoop Auth ................................ SKIPPED
[INFO] Apache Hadoop Auth Examples ....................... SKIPPED
[INFO] Apache Hadoop Common .............................. SKIPPED
[INO] Apache Hadoop NFS ................................. SKIPPED
[INFO] Apache Hadoop KMS ................................. SKIPPED
[INFO] Apache Hadoop Common Project ...................... SKIPPED
[INFO] Apache Hadoop HDFS ................................ SKIPPED
[INFO] Apache Hadoop HttpFS .............................. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SKIPPED
[INFO] Apache Hadoop HDFS-NFS ............................ SKIPPED
[INFO] Apache Hadoop HDFS Project ........................ SKIPPED
[INFO] hadoop-yarn ....................................... SKIPPED
[INFO] hadoop-yarn-api ................................... SKIPPED
[INFO] hadoop-yarn-common ................................ SKIPPED
[INFO] hadoop-yarn-server ................................ SKIPPED
[INFO] hadoop-yarn-server-common ......................... SKIPPED
[INFO] hadoop-yarn-server-nodemanager .................... SKIPPED
[INFO] hadoop-yarn-server-web-proxy ...................... SKIPPED
[INFO] hadoop-yarn-server-applicationhistoryservice ...... SKIPPED
[INFO] hadoop-yarn-server-resourcemanager ................ SKIPPED
[INFO] hadoop-yarn-server-tests .......................... SKIPPED
[INFO] hadoop-yarn-client ................................ SKIPPED
[INFO] hadoop-yarn-applications .......................... SKIPPED
[INFO] hadoop-yarn-applications-distributedshell ......... SKIPPED
[INFO] hadoop-yarn-applications-unmanaged-am-launcher .... SKIPPED
[INFO] hadoop-yarn-site .................................. SKIPPED
[INFO] hadoop-yarn-registry .............................. SKIPPED
[INFO] hadoop-yarn-project ............................... SKIPPED
[INFO] hadoop-mapreduce-client ........................... SKIPPED
[INFO] hadoop-mapreduce-client-core ...................... SKIPPED
[INFO] hadoop-mapreduce-client-common .................... SKIPPED
[INFO] hadoop-mapreduce-client-shuffle ................... SKIPPED
[INFO] hadoop-mapreduce-client-app ....................... SKIPPED
[INFO] hadoop-mapreduce-client-hs ........................ SKIPPED
[INFO] hadoop-mapreduce-client-jobclient ................. SKIPPED
[INFO] hadoop-mapreduce-client-hs-plugins ................ SKIPPED
[INFO] hadoop-mapreduce-client-nativetask ................ SKIPPED
[INFO] Apache Hadoop MapReduce Examples .................. SKIPPED
[INFO] hadoop-mapreduce .................................. SKIPPED
[INFO] Apache Hadoop MapReduce Streaming ................. SKIPPED
[INFO] Apache Hadoop Distributed Copy .................... SKIPPED
[INFO] Apache Hadoop Archives ............................ SKIPPED
[INFO] Apache Hadoop Archive Logs ........................ SKIPPED
[INFO] Apache Hadoop Rumen ............................... SKIPPED
[INFO] Apache Hadoop Gridmix ............................. SKIPPED
[INFO] Apache Hadoop Data Join ........................... SKIPPED
[INFO] Apache Hadoop Ant Tasks ........................... SKIPPED
[INFO] Apache Hadoop Extras .............................. SKIPPED
[INFO] Apache Hadoop Pipes ............................... SKIPPED
[INFO] Apache Hadoop OpenStack support ................... SKIPPED
[INFO] Apache Hadoop Amazon Web Services support ......... SKIPPED
[INFO] Apache Hadoop Azure support ....................... SKIPPED
[INFO] Apache Hadoop Client .............................. SKIPPED
[INFO] Apache Hadoop Mini-Cluster ........................ SKIPPED
[INFO] Apache Hadoop Scheduler Load Simulator ............ SKIPPED
[INFO] Apache Hadoop Tools Dist .......................... SKIPPED
[IFO] Apache Hadoop Tools ............................... SKIPPED
[INFO] Apache Hadoop Distribution ........................ SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.092s
[INFO] Finished at: Tue Nov 20 09:25:13 CST 2018
[INFO] Final Memory: 41M/1920M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.3.1:enforce (clean) on project hadoop-main: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.3.1:enforce (clean) on project hadoop-main: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed.
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:217)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
a org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:414)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:357)
Caused by: org.apache.maven.plugin.MojoExecutionException: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed.
at org.apache.maven.plugins.enforcer.EnforceMojo.execute(EnforceMojo.java:209)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
.. 19 more
[ERRR]
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
java的版本不对, 最开始使用的是[
root@slave1 bin]# java -version
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)
解决方案: 修改java版本
gz压缩:
使用gz压缩格式(非常简单):
core-site.xml
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
mapred-site.xml
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
hbase:
hbase(main):014:1> disable 'tsdb' #禁用表
hbase(main):014:1> desc 'tsdb' #查看表结构
hbase(main):014:1> alter 'tsdb',{NAME=>'t',COMPRESSION => 'GZ'} #压缩修改为snappy
hbase(main):014:1> enable 'tsdb' #使用该表
hbase(main):014:1> major_compact 'tsdb' #最好使该表的region compact一次
所遇问题:
hbase(main):001:0> list
TABLE
ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
at org.apache.hadoop.hbase.master.HMaster.checkServiceStarted(HMaster.java:2296)
at org.apache.hadoop.hbase.master.MasterRpcServices.isMasterRunning(MasterRpcServices.java:936)
at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55654)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
at java.lang.Thread.run(Thread.java:748)
Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list
hbase> list 'abc.*'
hbase> list 'ns:abc.*'
hbase> list 'ns:.*'
解决方案:
查看lHbase的log日志,发现dfs是安全模式
2015-09-11 15:28:09,429 INFO org.apache.hadoop.hbase.util.FSUtils: Waiting for dfs to exit safe mode... 2015-09-11 15:28:19,433 INFO org.apache.hadoop.hbase.util.FSUtils: Waiting for dfs to exit safe mode... 2015-09-11 15:28:29,458 INFO org.apache.hadoop.hbase.util.FSUtils: Waiting for dfs to exit safe mode... 2015-09-11 15:28:29,540 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: getMaster attempt 23 of 140 failed; retrying after sleep of 64134 org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
查看dfs模式
grid@master1:~$ hadoop dfsadmin -safemode get
Safe mode is ON
关闭dfs安全模式
grid@master1:~$ hadoop dfsadmin -safemode leave
Safe mode is OFF
再次启动Hbase,能正常启动
GZ测试结果:
由8.4G的dfs目录文件变成了1.3G,
参考:
https://blog.csdn.net/qq_27078095/article/details/56865443
http://www.micmiu.com/bigdata/hadoop/hadoop-snappy-install-config/