Operating HDFS
- On node01, start the Hadoop cluster as the root user
[hadoop@node01 bin]$ su root
Password:
[root@node01 bin]#
[root@node01 bin]# cd
[root@node01 ~]# start-all.sh
- Write the flow file
operateHdfs.flow, with the following content:
nodes:
  - name: jobA
    type: command
    config:
      command: echo "start execute"
      command.1: /export/servers/hadoop-2.7.5/bin/hdfs dfs -mkdir /azkaban
      command.2: /export/servers/hadoop-2.7.5/bin/hdfs dfs -put /export/servers/hadoop-2.7.5/NOTICE.txt /azkaban
- Build the project zip file, upload it through the web UI, and execute the flow
- Check the result in HDFS
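
The packaging step above can be sketched as a small script. This is a minimal sketch, not the author's exact procedure: it assumes the project uses Azkaban Flow 2.0, which expects a `flow20.project` marker file next to the `.flow` file, and the working directory `./operateHdfs` and the archive name `operateHdfs.zip` are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: package operateHdfs.flow into an Azkaban Flow 2.0 project zip.
set -eu

HADOOP_BIN="/export/servers/hadoop-2.7.5/bin"   # path taken from the flow file above
PROJECT_DIR="${PROJECT_DIR:-./operateHdfs}"      # hypothetical working directory

mkdir -p "$PROJECT_DIR"

# Assumption: Flow 2.0 projects carry a marker file declaring the flow version.
printf 'azkaban-flow-version: 2.0\n' > "$PROJECT_DIR/flow20.project"

# The flow definition; YAML nesting is expressed by indentation.
cat > "$PROJECT_DIR/operateHdfs.flow" <<EOF
nodes:
  - name: jobA
    type: command
    config:
      command: echo "start execute"
      command.1: $HADOOP_BIN/hdfs dfs -mkdir /azkaban
      command.2: $HADOOP_BIN/hdfs dfs -put /export/servers/hadoop-2.7.5/NOTICE.txt /azkaban
EOF

# Put both files at the archive root (-j drops the directory prefix).
if command -v zip >/dev/null 2>&1; then
  zip -j "$PROJECT_DIR/operateHdfs.zip" \
    "$PROJECT_DIR/flow20.project" "$PROJECT_DIR/operateHdfs.flow"
else
  echo "zip not found; archive the two files with another tool" >&2
fi
```

Once the uploaded flow has finished, running `hdfs dfs -ls /azkaban` on node01 should list `/azkaban/NOTICE.txt`. Note that `hdfs dfs -mkdir /azkaban` fails if the directory already exists, so a second run of the flow may need `-mkdir -p` instead.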

MR tasks
- Remember to start Hadoop's historyserver; otherwise, when the MR project runs, the job log will show errors similar to the following
22-03-2021 23:17:23 CST jobMR INFO - 21/03/22 23:17:23 INFO impl.YarnClientImpl: Submitted application application_1616423563192_0001
22-03-2021 23:17:39 CST jobMR INFO - 21/03/22 23:17:39 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
22-03-2021 23:17:41 CST jobMR INFO - 21/03/22 23:17:41 INFO ipc.Client: Retrying connect to server: node01/192.168.77.30:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
22-03-2021 23:17:42 CST jobMR INFO - 21/03/22 23:17:42 INFO ipc.Client: Retrying connect to server: node01/192.168.77.30:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
22-03-2021 23:17:44 CST jobMR INFO - 21/03/22 23:17:44 INFO ipc.Client: Retrying connect to server: node01/192.168.77.30:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
192.168.77.30:10020 is most likely the Hadoop cluster's historyserver service; 10020 is the default port of mapreduce.jobhistory.address.
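
One way to start the historyserver and confirm that the port from the log answers is sketched below. It assumes the Hadoop 2.x layout used in this post, where `sbin/mr-jobhistory-daemon.sh` ships with the distribution; the `port_open` helper and the 5-second pause are illustrative choices, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Sketch: start the MapReduce JobHistory server and probe port 10020.
HADOOP_HOME="${HADOOP_HOME:-/export/servers/hadoop-2.7.5}"  # install path used in this post
HISTORY_HOST="${HISTORY_HOST:-node01}"                      # host seen in the retry log
HISTORY_PORT=10020                                          # port seen in the retry log

# Hypothetical helper: returns 0 when a TCP connection to host:port succeeds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils timeout command.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Start the daemon only when the Hadoop script is present on this machine.
if [ -x "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" ]; then
  "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
  sleep 5   # arbitrary pause so the daemon has time to bind its port
fi

if port_open "$HISTORY_HOST" "$HISTORY_PORT"; then
  echo "historyserver reachable on $HISTORY_HOST:$HISTORY_PORT"
else
  echo "historyserver not reachable on $HISTORY_HOST:$HISTORY_PORT"
fi
```

Running `jps` on node01 afterwards should also show a `JobHistoryServer` process if the daemon started.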
- Write the flow file
mr.flow, with the following content:
nodes:
  - name: jobMR
    type: command
    config:
      command: /export/servers/hadoop-2.7.5/bin/hadoop jar /export/servers/hadoop-2.7.5/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi 3 3
- To avoid HDFS permission problems while the MR job runs, open up /tmp as root
[hadoop@node01 azkaban-exec-server-4.0.0]$ su root
[root@node01 azkaban-exec-server-4.0.0]# hdfs dfs -chmod -R 777 /tmp/
- Build the project zip file, upload it through the web UI, and execute the flow
- Check the result

- You can also look at how this job ran in the YARN web UI
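
The same check can be done from the command line. The sketch below assumes the `yarn` client sits under the Hadoop install path used in this post; the default application ID is simply the one from the sample log above and will differ on every run.

```shell
#!/usr/bin/env bash
# Sketch: inspect the MR job through the YARN command-line client.
HADOOP_HOME="${HADOOP_HOME:-/export/servers/hadoop-2.7.5}"  # install path used in this post
YARN="$HADOOP_HOME/bin/yarn"
APP_ID="${APP_ID:-application_1616423563192_0001}"          # ID from the sample log; replace with your own

if [ -x "$YARN" ]; then
  # List applications that have already finished.
  "$YARN" application -list -appStates FINISHED
  # Print the status report (state, final status, tracking URL) for one application.
  "$YARN" application -status "$APP_ID"
else
  echo "yarn client not found at $YARN"
fi
```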
