presto可以查询hive和mysql_presto-cli通过hive查询hdfs

1.  启动hive metastore

2. 启动hive thrift接口

3. 下载presto server

wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.186/presto-server-0.186.tar.gz

4. 配置

config.properties

coordinator=true

node-scheduler.include-coordinator=true

http-server.http.port=20000

query.max-memory=40GB

query.max-memory-per-node=4GB

exchange.http-client.request-timeout=20s

http-server.threads.max=400

discovery-server.enabled=true

discovery.uri=http://10.183.225.158:20000

该配置将coordinate与worker配置再一台机器上。集群部署建议分开:

etc/config.properties包含 Presto Server 相关的配置,每一个 Presto Server 可以同时作为 coordinator 和 worker 使用。你可以将他们配置在一个节点上,但是,在一个大的集群上建议分开配置以提高性能。

coordinator 的最小配置:

coordinator=truenode-scheduler.include-coordinator=falsehttp-server.http.port=9090task.max-memory=1GB

discovery-server.enabled=truediscovery.uri= http://xx1:9090

worker 的最小配置:

coordinator=falsehttp-server.http.port=9090task.max-memory=1GB

discovery.uri= http://xx2:9090

可选的,作为测试,你可以在一个节点上同时配置两者:

coordinator=truenode-scheduler.include-coordinator=truehttp-server.http.port=9090task.max-memory=1GB

discovery-server.enabled=truediscovery.uri=http://xx3:9090

node.properties

node.environment=xxxoipresto01

node.data-dir=/data/slot0/presto/presto-01-coordinator

node.id=presto-01-coord

jvm.properties

-server

-Xmx10G

-XX:+UseConcMarkSweepGC

-XX:+ExplicitGCInvokesConcurrent

-XX:+CMSClassUnloadingEnabled

-XX:+AggressiveOpts

-XX:+HeapDumpOnOutOfMemoryError

-XX:ReservedCodeCacheSize=600M

-XX:NativeMemoryTracking=summary

-Dcom.sun.management.jmxremote

-Dcom.sun.management.jmxremote.authenticate=false

-Dcom.sun.management.jmxremote.ssl=false

-Dcom.sun.management.jmxremote.port=20001

-Dcom.sun.management.jmxremote.rmi.port=20001

log.properties

com.facebook.presto=INFO

etc/catalog/hive.properties

connector.name=hive-hadoop2

hive.metastore.uri=thrift://hadoop1:9083

hive.config.resources=/xxx/soft/hadoop-2.7.3/etc/hadoop/core-site.xml,/xxx/soft/hadoop-2.7.3/etc/hadoop/hdfs-site.xml

hive.metastore.authentication.type=KERBEROS

hive.metastore.service.principal=hive/hadoop1@JENKIN.COM

hive.metastore.client.principal=presto@JENKIN.COM

hive.metastore.client.keytab=/etc/presto/presto.keytab

hive.hdfs.authentication.type=KERBEROS

hive.hdfs.impersonation.enabled=false

hive.hdfs.presto.principal=presto@JENKIN.COM

hive.hdfs.presto.keytab=/etc/presto/presto.keytab

4. 启动presto coordinate

/usr/bin/nohup /xxx/soft/lesports-presto-server-0.176/bin/launcher run --node-config /etc/presto/01/node.properties --jvm-config /etc/presto/01/jvm.config --config /etc/presto/01/config.properties --log-levels-file /etc/presto/01/log.properties >/xxx/soft/lesports-presto-server-0.176/stdout.log 2>&1 &

这里为了做集群隔离把配置文件放到了另外的目录。

启动是否成功看日志,或者http://10.183.225.158:20000/看管理界面,或者:

netstat -nlp | grep 20000

5. 下载presto-cli

wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.186/presto-cli-0.186-executable.jar

6. 改名为presto-cli

mv presto-cli-0.186-executable.jar presto-cli

7. 启动

./presto-cli --server hadoop1:20000 --catalog hive --schema jenkindb

10.183.225.158的hostname为hadoop1

8. 查询

presto:jenkindb> show tables;

Table

----------

jenkintb

test1

(2 rows)

Query 20171020_023712_00002_a78ua, FINISHED, 1 node

Splits: 18 total, 18 done (100.00%)

0:01 [2 rows, 49B] [2 rows/s, 51B/s]

presto:jenkindb> select * from jenkintb;

id | name

----+----------

1 | jenkin

2 | jenkin.k

3 | anne

(3 rows)

Query 20171020_023724_00003_a78ua, FINISHED, 1 node

Splits: 17 total, 17 done (100.00%)

0:00 [3 rows, 27B] [8 rows/s, 72B/s]

问题解决:

问题:failed: No worker nodes available

解决:presto coordinate config.properties增加 node-scheduler.include-coordinator=true

问题:org.apache.hadoop.ipc.RemoteException: User: presto@JENKIN.COM is not allowed to impersonate root

解决:hive.properties增加 hive.hdfs.impersonation.enabled=false


版权声明:本文为weixin_39600447原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接和本声明。