ETL: HiveSql Tuning (left join)

Source: compiled from the web  Author: user submission  Published: 2020-12-27 21:38
1. Preface

Our company uses Hadoop to build its data warehouse, so using HiveSql is unavoidable, and in the ETL process speed has become a problem that cannot be sidestepped. I have had a join across a few tables run for a full hour. You may think that does not matter, but multiple ETL runs then take many hours, which wastes a great deal of time, so HiveSql optimization is unavoidable.

Note: this article only covers points to watch for day to day at the SQL level. It does not go into the Hadoop or MapReduce level. For Hive's compilation process, please refer to a dedicated article on that topic.

2. Preparing the data

Suppose we have two tables.

Sight table: sight, 120,000 records (12W). Table structure:

    hive> desc sight;
    OK
    area      string  None
    city      string  None
    country   string  None
    county    string  None
    id        string  None
    name      string  None
    region    string  None

Sight order detail table: order_sight, 10.4 million records (1040W). Table structure:

    hive> desc order_sight;
    OK
    create_time  string  None
    id           string  None
    order_id     string  None
    sight_id     bigint  None

3. Analysis

3.1 The where condition

We want all order ids for sight id 9718 on the date 2015-10-10, so the SQL is written like this:

    hive> select s.id,o.order_id from sight s left join order_sight o on o.sight_id=s.id where s.id=9718 and o.create_time = '2015-10-10';
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks not specified. Estimated from input data size: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=number
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=number
    In order to set a constant number of reducers:
      set mapred.reduce.tasks=number
    Starting Job = job_1434099279301_3562174, Tracking URL = :9981/proxy/application_1434099279301_3562174/
    Kill Command = /home/q/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1434099279301_3562174
    Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 1
    [Stage-1 map/reduce progress lines, 2015-10-12 22:58:22 to 22:59:09]
    MapReduce Total cumulative CPU time: 52 seconds 790 msec
    Ended Job = job_1434099279301_3562174
    MapReduce Jobs Launched:
    Job 0: Map: 8  Reduce: 1  Cumulative CPU: 52.79 sec  HDFS Read: 371210469  HDFS Write: 330  SUCCESS
    Total MapReduce CPU Time Spent: 52 seconds 790 msec
    OK
    9718  210294949
    9718  210294421
    9718  210296438
    9718  210295344
    9718  210297567
    9718  210296076
    9718  210295525
    9718  210298219
    9718  210295840
    9718  210301363
    9718  210297733
    9718  210298066
    9718  210295239
    9718  210298328
    9718  210298008
    9718  210299712
    9718  210295586
    9718  210295050
    9718  210295566
    9718  210299105
    9718  210296318
    9718  210295277
    Time taken: 52.068 seconds, Fetched: 22 row(s)

As you can see, this takes 52 seconds. Now let us write the SQL a different way:

    hive> select s.id,o.order_id from sight s left join (select order_id,sight_id from order_sight where create_time = '2015-10-10') o on o.sight_id=s.id where s.id=9718;
    Total MapReduce jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks not specified. Estimated from input data size: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=number
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=number
    In order to set a constant number of reducers:
      set mapred.reduce.tasks=number
    Starting Job = job_1434099279301_3562218, Tracking URL = :9981/proxy/application_1434099279301_3562218/
    Kill Command = /home/q/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1434099279301_3562218
    Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 1
    [Stage-1 map/reduce progress lines, 2015-10-12 23:03:54 to 23:04:31]
    MapReduce Total cumulative CPU time: 34 seconds 350 msec
    Ended Job = job_1434099279301_3562218
    MapReduce Jobs Launched:
    Job 0: Map: 8  Reduce: 1  Cumulative CPU: 34.35 sec  HDFS Read: 371210469  HDFS Write: 330  SUCCESS
    Total MapReduce CPU Time Spent: 34 seconds 350 msec
    OK
    9718  210297733
    9718  210298066
    9718  210295239
    9718  210298328
    9718  210298008
    9718  210299712
    9718  210297567
    9718  210296076
    9718  210295525
    9718  210298219
    9718  210295840
    9718  210301363
    9718  210295586
    9718  210295050
    9718  210295566
    9718  210299105
    9718  210296318
    9718  210295277
    9718  210294949
    9718  210294421
    9718  210296438
    9718  210295344
    Time taken: 43.709 seconds, Fetched: 22 row(s)

This one takes 43 seconds, somewhat faster. The point is not merely that it is roughly 20% faster (I ran the test several times, and this run showed the smallest gap), but to analyze why.

The difference is visible just from how the two statements are written, especially the highlighted part of the second one: I moved the filter on the left-joined table inside the subquery. The execution changes accordingly. The Reduce time of the second statement is clearly shorter than that of the first, and its cumulative CPU time drops from 52.79 sec to 34.35 sec even though both jobs read the same 371210469 bytes from HDFS.

The reason is that both statements are broken down into 8 Map tasks and 1 Reduce task. If the filter on the left-joined table is written afterwards, in the where clause, the join work on the unfiltered rows is done in the Reduce phase. The single Reduce task then takes longer than the 8 Map tasks, which stretches the overall execution time.

Conclusion: when using an outer join, if the filter condition on the secondary table is written in the where clause, the full tables are joined first and the filter is applied only afterwards.
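The two ways of writing the query can be compared on a toy dataset. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 rather than Hive, the tables are cut down to a few columns with integer ids, and the rows (including sight id 1 and order ids o1 to o4) are made up. It says nothing about Hive's execution time. It only shows how many order rows each form feeds into the join, and that the two forms return the same rows when matching orders exist but differ when a sight has no orders on that date.

```python
import sqlite3

# Toy stand-ins for the article's tables (hypothetical rows, simplified columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sight (id INTEGER, name TEXT);
    CREATE TABLE order_sight (order_id TEXT, sight_id INTEGER, create_time TEXT);
    INSERT INTO sight VALUES (9718, 'A'), (1, 'B');
    INSERT INTO order_sight VALUES
        ('o1', 9718, '2015-10-10'),
        ('o2', 9718, '2015-10-10'),
        ('o3', 9718, '2015-10-09'),
        ('o4', 1,    '2015-10-09');
""")

# Form 1: filter on the joined table written in the WHERE clause.
WHERE_FORM = """
    SELECT s.id, o.order_id
    FROM sight s
    LEFT JOIN order_sight o ON o.sight_id = s.id
    WHERE s.id = ? AND o.create_time = '2015-10-10'
    ORDER BY o.order_id
"""

# Form 2: filter moved into a subquery, so it is applied before the join.
SUBQUERY_FORM = """
    SELECT s.id, o.order_id
    FROM sight s
    LEFT JOIN (SELECT order_id, sight_id
               FROM order_sight
               WHERE create_time = '2015-10-10') o
           ON o.sight_id = s.id
    WHERE s.id = ?
    ORDER BY o.order_id
"""

def run(sql, sight_id):
    """Run one of the two query forms for a given sight id."""
    return conn.execute(sql, (sight_id,)).fetchall()

where_9718 = run(WHERE_FORM, 9718)
sub_9718 = run(SUBQUERY_FORM, 9718)
where_1 = run(WHERE_FORM, 1)      # sight 1 has no orders on 2015-10-10
sub_1 = run(SUBQUERY_FORM, 1)

# Order rows that logically enter the join in each form.
join_input_where = conn.execute(
    "SELECT COUNT(*) FROM order_sight").fetchone()[0]
join_input_sub = conn.execute(
    "SELECT COUNT(*) FROM order_sight WHERE create_time = '2015-10-10'"
).fetchone()[0]

print("sight 9718:", where_9718, sub_9718)
print("sight 1:   ", where_1, sub_1)
print("order rows entering the join:", join_input_where, "vs", join_input_sub)
```

For sight 9718 both forms return the same two orders, which matches the article's observation that both Hive runs fetched the same 22 rows. For the made-up sight 1, the where form returns nothing, because the where clause discards the null-extended row and the left join behaves like an inner join, while the subquery form keeps the sight with an empty order id. If rows for sights without orders matter, the two forms are not interchangeable.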

Article URL: https://www.juheyunku.com/jiaob/zh/9863.shtml


