1. Configure the main method's run parameters
Locate the Engine class in the core module (com.alibaba.datax.core.Engine); its main method is DataX's entry point for local debugging.

2. Fill in the parameters
In the IDEA Run/Debug Configuration for Engine, fill in the VM options, program arguments, and working directory; concrete reference values are given in the next section.

3. Parameter reference
VM options: -Ddatax.home=/Users/XXX/IdeaProjects/PlatformDatax/target/datax/datax
Program args: -job /Users/XXX/IdeaProjects/PlatformDatax/hdfswrite2oss.json -jobid 1
Working directory: /Users/XXX/IdeaProjects/PlatformDatax
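
These values map onto the DataX entry point roughly as follows: -Ddatax.home is read as a JVM system property so DataX can locate its conf/ and plugin/ directories under the packaged target output, while -job and -jobid arrive as ordinary command-line options. Below is a simplified sketch of that parsing, assuming the stock open-source Engine (the real class also handles a -mode option, config loading, and error reporting):

// Simplified sketch of how Engine consumes the debug parameters above.
// Assumes the stock open-source DataX Engine and commons-cli; the real
// class adds a "-mode" option, config merging, and error handling.
import org.apache.commons.cli.BasicParser;
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.Options;

public class EngineArgsSketch {
    public static void main(String[] args) throws Exception {
        // Set via VM options: -Ddatax.home=.../target/datax/datax
        String dataxHome = System.getProperty("datax.home");

        Options options = new Options();
        options.addOption("job", true, "Path of the job json.");
        options.addOption("jobid", true, "Unique id of this job.");

        // Set via program args: -job .../hdfswrite2oss.json -jobid 1
        CommandLine cl = new BasicParser().parse(options, args);
        String jobPath = cl.getOptionValue("job");
        String jobId = cl.getOptionValue("jobid");

        System.out.printf("home=%s, job=%s, jobid=%s%n", dataxHome, jobPath, jobId);
    }
}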

I have modified hdfswriter here so that it can write data to OSS in ORC format; the test job file is hdfswrite2oss.json:
{
    "core": {
        "transport": {
            "channel": {
                "speed": {
                    "byte": "3145728"
                }
            }
        }
    },
    "job": {
        "setting": {
            "speed": {
                "byte": "3145728"
            },
            "errorLimit": {
                "record": "0"
            }
        },
        "content": [{
            "reader": {
                "name": "streamreader",
                "parameter": {
                    "column": [{
                        "value": 1234567,
                        "type": "long"
                    }, {
                        "value": "logistics_channel1,logistics_channel2,logistics_channel2",
                        "type": "bytes"
                    }, {
                        "value": "加密字段",
                        "type": "bytes"
                    }, {
                        "value": "2020-06-17 18:00:00",
                        "type": "bytes"
                    }, {
                        "value": "物流content",
                        "type": "bytes"
                    }, {
                        "value": "2020-06-17 18:00:00",
                        "type": "bytes"
                    }, {
                        "value": "2020-06-17 18:00:00",
                        "type": "bytes"
                    }, {
                        "value": "2020-06-17 18:00:00",
                        "type": "bytes"
                    }],
                    "sliceRecordCount": 100000
                }
            },
            "writer": {
                "name": "hdfswriter",
                "parameter": {
                    "defaultFS": "oss://bucket",
                    "hadoopConfig": {
                        "fs.oss.endpoint": "oss endpoint",
                        "fs.oss.accessKeyId": "accessKeyId",
                        "fs.oss.accessKeySecret": "accessKeySecret",
                        "fs.oss.impl": "org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem",
                        "fs.oss.buffer.dir": "/tmp/oss"
                    },
                    "path": "/metadata/hive/db_name/table_name/p_dt=20200617",
                    "fileName": "table_name",
                    "column": [{
                        "name": "id",
                        "type": "bigint"
                    }, {
                        "name": "logistics_channel",
                        "type": "array<string>"
                    }, {
                        "name": "logistics_number_ex",
                        "type": "string"
                    }, {
                        "name": "operate_time",
                        "type": "string"
                    }, {
                        "name": "content",
                        "type": "string"
                    }, {
                        "name": "create_time",
                        "type": "string"
                    }, {
                        "name": "update_time",
                        "type": "string"
                    }, {
                        "name": "odps_runtime",
                        "type": "string"
                    }],
                    "fileType": "ORC",
                    "encoding": "UTF-8",
                    "fieldDelimiter": "\u0001",
                    "writeMode": "append"
                }
            }
        }]
    }
}
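
On the writer side, the hadoopConfig block is what lets Hadoop's FileSystem API resolve the oss:// scheme. Here is a minimal standalone sketch of the same wiring, assuming the hadoop-aliyun connector is on the classpath (the property keys are taken from the job json above; everything else is illustrative, not the actual hdfswriter patch):

// Minimal sketch: build an OSS-backed Hadoop FileSystem from the same
// properties as the job's hadoopConfig. Assumes hadoop-aliyun and the
// Aliyun OSS SDK are on the classpath; endpoint/keys are placeholders.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OssFileSystemSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.oss.endpoint", "oss endpoint");
        conf.set("fs.oss.accessKeyId", "accessKeyId");
        conf.set("fs.oss.accessKeySecret", "accessKeySecret");
        conf.set("fs.oss.impl", "org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem");
        conf.set("fs.oss.buffer.dir", "/tmp/oss");

        // Same role as "defaultFS" in the writer parameters.
        FileSystem fs = FileSystem.get(URI.create("oss://bucket"), conf);
        System.out.println(fs.exists(new Path("/metadata/hive/db_name/table_name")));
    }
}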
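
One detail worth calling out: the stock hdfswriter does not handle complex Hive types, so the logistics_channel column of type array<string> only works because the modified writer splits the comma-separated reader value into a list before handing the row to the ORC writer. A hedged sketch of that transform follows; the class name, method, and comma delimiter are my assumptions, not the actual patch:

// Hypothetical helper: turn a delimited string field into the
// List<Text> that an ORC array<string> ObjectInspector expects.
// The name, signature, and delimiter are assumptions for illustration.
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Text;

public class ArrayColumnSketch {
    public static List<Text> toStringArray(String raw, String delimiter) {
        List<Text> values = new ArrayList<>();
        if (raw == null || raw.isEmpty()) {
            return values;
        }
        for (String part : raw.split(delimiter)) {
            values.add(new Text(part));
        }
        return values;
    }

    public static void main(String[] args) {
        // "logistics_channel1,logistics_channel2,logistics_channel2"
        // becomes a three-element array<string> value.
        System.out.println(toStringArray(
                "logistics_channel1,logistics_channel2,logistics_channel2", ","));
    }
}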