Operating HDFS
- On node01, switch to the root user and start the Hadoop cluster

    [hadoop@node01 bin]$ su root
    Password:
    [root@node01 bin]#
    [root@node01 bin]# cd
    [root@node01 ~]# start-all.sh
- Write the flow file operateHdfs.flow with the following content

    nodes:
      - name: jobA
        type: command
        config:
          command: echo "start execute"
          command.1: /export/servers/hadoop-2.7.5/bin/hdfs dfs -mkdir /azkaban
          command.2: /export/servers/hadoop-2.7.5/bin/hdfs dfs -put /export/servers/hadoop-2.7.5/NOTICE.txt /azkaban
- Build the project zip file, upload the zip through the web UI, and execute the flow
- Check the result in HDFS
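The packaging step can be sketched as a small shell helper. This is only an illustration: it assumes the project follows the Azkaban Flow 2.0 convention of a `.project` file declaring `azkaban-flow-version: 2.0`, that the file is named `flow20.project` (the name is an assumption), and that the `zip` command is installed. The function names `write_project_files` and `package_project` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: write the two project files and zip them for upload.
# Assumptions: flow20.project as the marker file name, `zip` available on PATH.

write_project_files() {
  # $1: directory in which to create the project files
  mkdir -p "$1" || return 1

  # Flow 2.0 projects carry a .project file that declares the flow version.
  printf 'azkaban-flow-version: 2.0\n' > "$1/flow20.project"

  # The operateHdfs.flow definition, written with YAML indentation.
  cat > "$1/operateHdfs.flow" <<'EOF'
nodes:
  - name: jobA
    type: command
    config:
      command: echo "start execute"
      command.1: /export/servers/hadoop-2.7.5/bin/hdfs dfs -mkdir /azkaban
      command.2: /export/servers/hadoop-2.7.5/bin/hdfs dfs -put /export/servers/hadoop-2.7.5/NOTICE.txt /azkaban
EOF
}

package_project() {
  # $1: project directory, $2: path of the zip to create
  # -j drops directory components so both files sit at the root of the zip.
  zip -j -q "$2" "$1/flow20.project" "$1/operateHdfs.flow"
}
```

Calling `write_project_files ./operateHdfs && package_project ./operateHdfs operateHdfs.zip` would produce a zip ready to upload. After the flow has run, `/export/servers/hadoop-2.7.5/bin/hdfs dfs -ls /azkaban` should list NOTICE.txt.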
MapReduce (MR) job
- Remember to start Hadoop's historyserver. Otherwise, when the MR project runs, the job log shows errors like the following

    22-03-2021 23:17:23 CST jobMR INFO - 21/03/22 23:17:23 INFO impl.YarnClientImpl: Submitted application application_1616423563192_0001
    22-03-2021 23:17:39 CST jobMR INFO - 21/03/22 23:17:39 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
    22-03-2021 23:17:41 CST jobMR INFO - 21/03/22 23:17:41 INFO ipc.Client: Retrying connect to server: node01/192.168.77.30:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    22-03-2021 23:17:42 CST jobMR INFO - 21/03/22 23:17:42 INFO ipc.Client: Retrying connect to server: node01/192.168.77.30:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    22-03-2021 23:17:44 CST jobMR INFO - 21/03/22 23:17:44 INFO ipc.Client: Retrying connect to server: node01/192.168.77.30:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
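192.168.77.30:10020 should be the historyserver service of the Hadoop cluster (10020 is the default port of mapreduce.jobhistory.address), so the retries mean the client cannot reach it.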
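A minimal sketch of checking for the historyserver and starting it if needed, under a few assumptions: Hadoop 2.7.5 is installed under /export/servers/hadoop-2.7.5 (as in the flow files), the shell is bash with `/dev/tcp` support, and the helper names `port_open`, `start_historyserver`, and `ensure_historyserver` are hypothetical. In Hadoop 2.x the daemon script is `sbin/mr-jobhistory-daemon.sh`.

```shell
#!/usr/bin/env bash
# Sketch: start the JobHistory server unless its RPC port already answers.
# Assumptions: Hadoop 2.x layout, bash built with /dev/tcp support.

HADOOP_HOME="${HADOOP_HOME:-/export/servers/hadoop-2.7.5}"

port_open() {
  # $1: host, $2: port. Succeeds if a TCP connection can be opened.
  # The subshell closes the descriptor again as soon as it exits.
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

start_historyserver() {
  # Hadoop 2.x ships the JobHistory daemon script under sbin/.
  "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
}

ensure_historyserver() {
  # $1: host, $2: port (10020 is the default for mapreduce.jobhistory.address)
  if port_open "$1" "$2"; then
    echo "historyserver already reachable at $1:$2"
  else
    start_historyserver
  fi
}
```

On node01 this might be invoked as `ensure_historyserver node01 10020`. Afterwards `jps` should list a JobHistoryServer process.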
- Write the flow file mr.flow with the following content

    nodes:
      - name: jobMR
        type: command
        config:
          command: /export/servers/hadoop-2.7.5/bin/hadoop jar /export/servers/hadoop-2.7.5/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi 3 3
- To avoid HDFS permission problems while the MR job runs, open up the permissions on /tmp in HDFS as root

    [hadoop@node01 azkaban-exec-server-4.0.0]$ su root
    [root@node01 azkaban-exec-server-4.0.0]# hdfs dfs -chmod -R 777 /tmp/
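- Build the project zip file, upload the zip through the web UI, and execute the flow
- Check the result
- You can also open the YARN web UI to see how this job ran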
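Besides the web UI, the job can also be looked up from the command line. The sketch below pulls YARN application IDs out of an Azkaban job log so they can be passed to `yarn application -status`. The helper name `extract_app_ids` and the log file name used in the note are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: extract YARN application IDs from Azkaban job log text.

extract_app_ids() {
  # Reads log text on stdin and prints each distinct application ID once.
  grep -oE 'application_[0-9]+_[0-9]+' | sort -u
}
```

Assuming the job log was saved as jobMR.log, `extract_app_ids < jobMR.log` prints the IDs, and each one can then be checked with `/export/servers/hadoop-2.7.5/bin/yarn application -status <id>`.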