I. Installing Flume
1. Upload the Flume package to master and extract it
tar -zxvf flume-ng-1.2.0-cdh3u5.tar.gz
2. Add the Flume paths to .bash_profile (optional, for convenience only)
export FLUME_HOME=/home/hadoop/flume-ng-1.2.0-cdh3u5
export FLUME_CONF_DIR=$FLUME_HOME/conf
3. Edit the Flume configuration file (location: flume-ng-1.2.0-cdh3u5/conf/)
cp flume-conf.properties.template flume-conf.properties
Edit flume-conf.properties:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hd2
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
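a1.sinks.k1.channel = c1
4. Run Flume (from the flume-ng-1.2.0-cdh3u5/ directory)
bin/flume-ng agent --conf /home/hadoop/flume-ng-1.2.0-cdh3u5/conf/ --conf-file conf/flume-conf.properties --name a1 -Dflume.root.logger=INFO,console
5. Keep that terminal open. In another terminal, connect with telnet and type a line of text (hadoop110 is the hostname here; use the host that a1.sources.r1.bind is set to)
telnet hadoop110 44444
hello world
6. The first terminal should now log the received event, which shows that the setup works.
The configuration above is for testing Flume only. Do not use it in production.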
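The last block of the test configuration binds the source and the sink to channel c1, and a typo in one of those names is easy to make. The following is a minimal sketch of a sanity check that could be run before starting the agent. It is not part of Flume: check_wiring is a hypothetical helper that assumes simple single-line key = value properties, and the demo runs against a temporary copy of the wiring lines.

```shell
#!/usr/bin/env bash
# check_wiring CONF AGENT (hypothetical helper, not a Flume command):
# reports every channel that a source or sink references but that is
# missing from the "<agent>.channels = ..." declaration.
check_wiring() {
  local conf="$1" agent="$2" declared used ch status=0
  # Channels listed on the "<agent>.channels = c1 c2 ..." line.
  declared=$(sed -n "s/^${agent}\.channels[[:space:]]*=[[:space:]]*//p" "$conf")
  # Channels referenced by sources (".channels") and sinks (".channel").
  used=$(sed -n -E "s/^${agent}\.(sources|sinks)\.[^.]+\.channels?[[:space:]]*=[[:space:]]*//p" "$conf")
  for ch in $used; do
    case " $declared " in
      *" $ch "*) ;;
      *) echo "undeclared channel: $ch"; status=1 ;;
    esac
  done
  return "$status"
}

# Demo on a temporary copy of the wiring lines from the test configuration.
conf=$(mktemp)
cat > "$conf" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
check_wiring "$conf" a1 && echo "wiring ok"
```

Against a real installation the call would be check_wiring conf/flume-conf.properties a1, using the agent name passed to --name.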
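II. Production Flume with the exec source: commonly used for collecting logs
1. Just add one more configuration file to the conf directory. The file name is up to you (mine is flume.conf).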
agent.sources = reader
agent.channels = memoryChannel
agent.sinks = avro-forward-sink
agent.sources.reader.type = exec
agent.sources.reader.command = tail -f /home/flume/sf
agent.sources.reader.logStdErr = true
agent.sources.reader.restart = true
agent.sources.reader.channels = memoryChannel
agent.sinks.avro-forward-sink.type = avro
agent.sinks.avro-forward-sink.hostname = 192.168.0.13
agent.sinks.avro-forward-sink.port = 44444
agent.sinks.avro-forward-sink.channel = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 10000
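agent.channels.memoryChannel.transactionCapacity = 100
agent.sources.reader.command specifies the source (the command that reads the file to collect), and agent.sinks.avro-forward-sink.hostname specifies the destination address. For the other parameters, see the Flume documentation.
2. Run
bin/flume-ng agent -n agent -c conf -f /usr/local/flume/conf/flume.conf -Dflume.root.logger=INFO,console
3. Leave that window open and, in another window, write data into /home/flume/sf. If the first window shows the name of the file being collected, the setup works.
III. Production Flume with the spooldir source: commonly used for collecting files
As in section II, just create a new configuration file and add the following content: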
agent.sources = reader
agent.channels = fileChannel
agent.sinks = avro-forward-sink
agent.sources.reader.type=spooldir
agent.sources.reader.spoolDir=/home/history
agent.sources.reader.channels=fileChannel
agent.sources.reader.fileHeader = false
agent.sources.reader.interceptors = i1
agent.sources.reader.interceptors.i1.type = timestamp
agent.sinks.avro-forward-sink.type = avro
agent.sinks.avro-forward-sink.hostname = 192.168.0.12
agent.sinks.avro-forward-sink.port = 44444
agent.sinks.avro-forward-sink.channel = fileChannel
agent.channels.fileChannel.type=file
agent.channels.fileChannel.checkpointDir=/home/test
agent.channels.fileChannel.dataDirs=/home/test1
Run
bin/flume-ng agent -n agent -c conf -f /usr/local/flume/conf/history.conf -Dflume.root.logger=INFO,console
Check the result the same way as above. The final result can also be checked on 192.168.0.12.
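Both production agents forward events through an avro sink, so the target host needs an agent with an avro source listening on port 44444 for anything to arrive there. The article does not show that receiving side, so the following is only a sketch under assumptions: the agent name collector and the component names avro-in, mem, and log-out are hypothetical, and a logger sink is used purely so that received events become visible in the console.

```shell
#!/usr/bin/env bash
# Write a minimal receiving-side configuration into a temporary directory.
# All component names below are hypothetical, chosen for illustration.
conf_dir=$(mktemp -d)
cat > "$conf_dir/collector.conf" <<'EOF'
collector.sources = avro-in
collector.channels = mem
collector.sinks = log-out

# Avro source: listens on the port that the sending agent's avro sink targets
collector.sources.avro-in.type = avro
collector.sources.avro-in.bind = 0.0.0.0
collector.sources.avro-in.port = 44444
collector.sources.avro-in.channels = mem

# Logger sink: prints received events to the Flume log
collector.sinks.log-out.type = logger
collector.sinks.log-out.channel = mem

collector.channels.mem.type = memory
collector.channels.mem.capacity = 10000
collector.channels.mem.transactionCapacity = 100
EOF
echo "wrote $conf_dir/collector.conf"
# On the receiving host, start it the same way as the agents above:
#   bin/flume-ng agent -n collector -c conf -f <path to collector.conf> -Dflume.root.logger=INFO,console
```

The receiving agent would normally be started before the sending agents, so that the avro sink has something to connect to.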
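IV. Flume summary
1. Flume collection is very flexible: adapting it only requires the appropriate changes to the configuration file.
2. Flume cannot collect files such as pdf or jpg; it reports an error.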
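Since files such as pdf or jpg cause errors, one option is to keep them out of the spool directory in the first place. The sketch below is an assumption about how that could be done, not something the setup above includes: stage_text_files is a hypothetical helper that moves only text files into the spool directory and relies on grep -I treating binary files as non-matching. Moving a finished file with mv, rather than writing into the spool directory directly, also avoids modifying a file after the spooldir source has picked it up.

```shell
#!/usr/bin/env bash
# stage_text_files SRC_DIR SPOOL_DIR (hypothetical helper, not part of Flume):
# moves text files from SRC_DIR into SPOOL_DIR and leaves binary files behind.
stage_text_files() {
  local src="$1" spool="$2" f
  for f in "$src"/*; do
    [ -f "$f" ] || continue
    # grep -I treats binary files as if they had no match; -q prints nothing.
    if grep -Iq . "$f"; then
      mv "$f" "$spool/"
    else
      echo "skipped (binary or empty): $f"
    fi
  done
}

# Demo with temporary directories standing in for the real paths.
src=$(mktemp -d)
spool=$(mktemp -d)
printf 'line one\nline two\n' > "$src/app.log"
printf 'JFIF\000\001\002' > "$src/photo.jpg"
stage_text_files "$src" "$spool"
ls "$spool"
```

In the configuration from section III, the second argument would be /home/history, the directory set in agent.sources.reader.spoolDir.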
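Copyright notice: this is an original article by Technology_2016, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.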