2非常相似的SQL查询有完全不同的性能

问题描述：

我有以下2个SQL查询，它们是95％相同的，但性能显着不同。2非常相似的SQL查询有完全不同的性能

SQL查询1（< 0,1s）：

SELECT CONCAT(a.`report_year`, '-', a. `report_month`) as `yearmonth`, 
AVG(a.cost_leasing/b.rate*IF(`report_year`=2016,0.73235, 
     IF(`report_year`=2017,0.83430,1))) as average, 
'current' as `type` 
FROM `vehicles` as a, `exchange_rates` as b 
WHERE cid='3' AND 
STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, '-01'), 
     '%Y-%m-%d') >= '2016-01-01' AND 
LAST_DAY(STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, 
     '-01'), '%Y-%m-%d')) <= '2017-06-30' AND 
`country` IN ('XX','UK') AND 
a.currency = b.currency AND 
b.`year` = `report_year` AND 
fxid=2 
GROUP BY `yearmonth` 
ORDER BY `yearmonth`;

解释查询1：

1 SIMPLE a ref new_selectors,... new_cost_leasing 4 const 10812 Using where; Using index; Using temporary; Using f... 
1 SIMPLE b ref PRIMARY,date,fxid fxid 19 const,c1682fleet.a.report_year,c1682fleet.a.curren... 196 Using where; Using index

SQL查询2（> 3秒）：

SELECT CONCAT(c.`report_year`, '-', c.`report_month`) as `yearmonth`, 
AVG(c.cost_leasing/d.rate*IF(`report_year`=2016,0.73235, 
     IF(`report_year`=2017,0.83430,1))), 
'baseline' 
FROM `kpis` as c, `exchange_rates` as d 
WHERE cid='3' AND 
STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, '-01'), 
     '%Y-%m-%d') >= '2016-01-01' AND 
LAST_DAY(STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, 
      '-01'), '%Y-%m-%d')) <= '2017-06-30' AND 
`country` IN ('XX','UK') AND 
c.kid=1 AND 
c.currency = d.currency AND 
d.`year` = `report_year` AND 
fxid=2 
GROUP BY `yearmonth` 
ORDER BY `yearmonth`;

解释查询2 ：

1 SIMPLE c ref oem_group,... cost_leasing 8 const,const 30038 Using where; Using index; Using temporary; Using f... 
1 SIMPLE d ref PRIMARY,date,fxid fxid 19 const,c1682fleet.c.report_year,c1682fleet.c.curren... 196 Using where; Using index

SHOW指数从车辆：

vehicles 0 PRIMARY 1 vid A 146068    BTREE   
vehicles 1 new_cost_leasing 1 cid A 12    BTREE   
vehicles 1 new_cost_leasing 2 cost_leasing A 4564    BTREE   
vehicles 1 new_cost_leasing 3 currency A 5216    BTREE   
vehicles 1 new_cost_leasing 4 report_month A 24344    BTREE   
vehicles 1 new_cost_leasing 5 report_year A 29213    BTREE   
vehicles 1 new_cost_leasing 6 country A 36517    BTREE   
vehicles 1 new_cost_leasing 7 supplier A 29213    BTREE   
vehicles 1 new_cost_leasing 8 jato_segment A 24344    BTREE   
vehicles 1 new_cost_leasing 9 business_unit A 36517    BTREE   
vehicles 1 new_cost_leasing 10 entity A 73034    BTREE

SHOW指数从exchange_rates：

exchange_rates 0 PRIMARY 1 fxid A 2    BTREE   
exchange_rates 0 PRIMARY 2 currency A 160    BTREE   
exchange_rates 0 PRIMARY 3 date A 569250    BTREE   
exchange_rates 1 date 1 fxid A 2    BTREE   
exchange_rates 1 date 2 date A 28462    BTREE   
exchange_rates 1 date 3 currency A 569250    BTREE   
exchange_rates 1 date 4 rate A 569250    BTREE   
exchange_rates 1 fxid 1 fxid A 2    BTREE   
exchange_rates 1 fxid 2 year A 114    BTREE   
exchange_rates 1 fxid 3 currency A 2904    BTREE   
exchange_rates 1 fxid 4 rate A 569250    BTREE

SHOW指数从关键绩效指标：

kpis 0 PRIMARY 1 vid A 60308    BTREE     
kpis 1 cost_leasing 1 cid A 2    BTREE   
kpis 1 cost_leasing 2 kid A 2    BTREE   
kpis 1 cost_leasing 3 cost_leasing A 78    BTREE   
kpis 1 cost_leasing 4 currency A 78    BTREE   
kpis 1 cost_leasing 5 report_month A 1096    BTREE   
kpis 1 cost_leasing 6 report_year A 3350    BTREE   
kpis 1 cost_leasing 7 country A 1884    BTREE   
kpis 1 cost_leasing 8 supplier A 4020    BTREE   
kpis 1 cost_leasing 9 jato_segment A 3015    BTREE   
kpis 1 cost_leasing 10 business_unit A 4307    BTREE   
kpis 1 cost_leasing 11 entity A 6030    BTREE   
kpis 1 avg_cost 1 cid A 2    BTREE   
kpis 1 avg_cost 2 kid A 2    BTREE   
kpis 1 avg_cost 3 country A 48    BTREE   
kpis 1 avg_cost 4 report_year A 96    BTREE   
kpis 1 avg_cost 5 currency A 96    BTREE   
kpis 1 avg_cost 6 cost_leasing A 172    BTREE

问题： 我的问题是，为什么会出现即使你是一个显着的性能差异（因子30） gh在查询2（kid）中只有一个附加标准，它甚至是索引的一部分。

任何想法如何优化查询2？

如果他们使用完全不同的表格，我不会说他们95％是相同的。有可能不同的结构，索引，记录数量......需要更多信息。 –

tabels与kpis另外包含字段'kid'这一事实相同。索引相同，除了kpi索引还包含'kid'作为列的事实之外，它们是相等的。您可以在解释答案中看到受影响的行。 – faulix90

你能发布你的索引定义吗？ – eventHandler

答

，我发现了问题所在：exchange_rates列year不是唯一的和选择的vehicles刚刚选择的kpis的一半大小，但因为没有唯一的列year的大基数的创造exchange_rates和kpis加盟一个临时的超过200万条目，这对于平均运作来说非常大。

解决方案：不要使用year我使用的唯一列date，改变了条件

`date` = MAKEDATE(`report_year`, 1)

如果您知道全局在ONE SELECT中使用的超过一百万行没有任何意义，那么您的my.cnf/ini中的这一行sql_select_limit = 1M＃来停止错误的数据量---在你任职期间会为你工作 - 或者直到有人为你的my.cfg移除它。 –

答

这些都不是优化搜索：

STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, '-01'), '%Y-%m-%d') >= '2016-01-01' 

LAST_DAY(STR_TO_DATE(CONCAT(`report_year`, '-', `report_month`, '-01'), '%Y-%m-%d')) <= '2017-06-30'

您正在使用5个功能来连接每一行并转换为日期，但只有2个日期常量进行比较。如果可以将其逆转并将2个日期常量转换为适合未改变数据的某些事件，那将节省大量的工作量。您不仅可以节省功能的计算工作量，还可以使用report_year和report_month上的索引。

我还没有时间来测试这么多，我猜测invloved的列是整数，但我认为日期范围处理的一组更多的谓词将有助于这两个查询。例如

SQL Fiddle

的MySQL 5。6架构设置：

CREATE TABLE Table1 
    (`Report_Year` int, `Report_Month` int) 
; 

INSERT INTO Table1 
    (`Report_Year`, `Report_Month`) 
VALUES 
    (2015, 1), (2015, 2), (2015, 3), 
    (2015, 4), (2015, 5), (2015, 6), 
    (2015, 7), (2015, 8), (2015, 9), 
    (2015, 10), (2015, 11), (2015, 12), 
    (2016, 1), (2016, 2), (2016, 3), 
    (2016, 4), (2016, 5), (2016, 6), 
    (2016, 7), (2016, 8), (2016, 9), 
    (2016, 10), (2016, 11), (2016, 12), 
    (2017, 1), (2017, 2), (2017, 3), 
    (2017, 4), (2017, 5), (2017, 6), 
    (2017, 7), (2017, 8), (2017, 9), 
    (2017, 10), (2017, 11), (2017, 12) 
;

**查询**：

set @start := '2016-04-04'; 
set @end := '2017-01-30'; 

select *, @start, @end 
from table1 
where (
     ((year(@start) < year(@end)) AND report_year = year(@start) and report_month >= month(@start)) 
     OR 
     ((year(@start) < year(@end)) AND report_year > year(@start) and report_year < year(@end)) 
     OR 
     ((year(@start) <= year(@end)) AND report_year = year(@end) and report_month <= month(@end)) 
    )

[结果]：

| Report_Year | Report_Month |  @start |  @end | 
|-------------|--------------|------------|------------| 
|  2016 |   4 | 2016-04-04 | 2017-01-30 | 
|  2016 |   5 | 2016-04-04 | 2017-01-30 | 
|  2016 |   6 | 2016-04-04 | 2017-01-30 | 
|  2016 |   7 | 2016-04-04 | 2017-01-30 | 
|  2016 |   8 | 2016-04-04 | 2017-01-30 | 
|  2016 |   9 | 2016-04-04 | 2017-01-30 | 
|  2016 |   10 | 2016-04-04 | 2017-01-30 | 
|  2016 |   11 | 2016-04-04 | 2017-01-30 | 
|  2016 |   12 | 2016-04-04 | 2017-01-30 | 
|  2017 |   1 | 2016-04-04 | 2017-01-30 |

[结果]：

set @start := '2016-01-01'; 
set @end := '2016-06-30'; 

| Report_Year | Report_Month |  @start |  @end | 
|-------------|--------------|------------|------------| 
|  2016 |   1 | 2016-01-01 | 2016-06-30 | 
|  2016 |   2 | 2016-01-01 | 2016-06-30 | 
|  2016 |   3 | 2016-01-01 | 2016-06-30 | 
|  2016 |   4 | 2016-01-01 | 2016-06-30 | 
|  2016 |   5 | 2016-01-01 | 2016-06-30 | 
|  2016 |   6 | 2016-01-01 | 2016-06-30 | 

set @start := '2016-01-01'; 
set @end := '2017-06-30';

[结果]：

| Report_Year | Report_Month |  @start |  @end | 
|-------------|--------------|------------|------------| 
|  2016 |   1 | 2016-01-01 | 2017-06-30 | 
|  2016 |   2 | 2016-01-01 | 2017-06-30 | 
|  2016 |   3 | 2016-01-01 | 2017-06-30 | 
|  2016 |   4 | 2016-01-01 | 2017-06-30 | 
|  2016 |   5 | 2016-01-01 | 2017-06-30 | 
|  2016 |   6 | 2016-01-01 | 2017-06-30 | 
|  2016 |   7 | 2016-01-01 | 2017-06-30 | 
|  2016 |   8 | 2016-01-01 | 2017-06-30 | 
|  2016 |   9 | 2016-01-01 | 2017-06-30 | 
|  2016 |   10 | 2016-01-01 | 2017-06-30 | 
|  2016 |   11 | 2016-01-01 | 2017-06-30 | 
|  2016 |   12 | 2016-01-01 | 2017-06-30 | 
|  2017 |   1 | 2016-01-01 | 2017-06-30 | 
|  2017 |   2 | 2016-01-01 | 2017-06-30 | 
|  2017 |   3 | 2016-01-01 | 2017-06-30 | 
|  2017 |   4 | 2016-01-01 | 2017-06-30 | 
|  2017 |   5 | 2016-01-01 | 2017-06-30 | 
|  2017 |   6 | 2016-01-01 | 2017-06-30 |

谢谢你们的例子，努力工作的'sargable谓词'。 –

缺少分号，请在sqlfiddle.com的查询中再次，谢谢。 –

sqlfiddle退出的冒号，但在选择标记布局时被丢弃...可能的sqlfiddle在该布局中的错误。 –

答

可优化搜索是优点。更好地处理dates是一个重点。这里有一些更多的观点。

11列指数几乎保证是浪费。即使是6列指数也不太可能被充分利用。只有索引最左边的列将被使用。通常它会到达下一列无用的点，所以它会停止。

通常将日期分成年，月和日不是一个好主意。由于您似乎只需要年份和月份，建议值为CHAR(7) CHARSET ascii，值为'2017-06'。或者你真的有报告在一个月中停止？

请确认每个列名与所涉及的表一致。我们知道哪个表，例如，fxid是这是非常重要的

请使用JOIN .. ON语法：（我宁愿AS v和AS er作为助记符）

FROM vehicles AS a 
JOIN exchange_rates AS b ON a.currency = b.currency

期望的索引（与当前的年/月列）：

b,d: INDEX(fxid, currency, year) 
a: INDEX(cid, currency, report_year) 
c: INDEX(kid, cid, currency, report_year)

更多关于创建索引：http://mysql.rjweb.org/doc.php/index_cookbook_mysql

2非常相似的SQL查询有完全不同的性能

相关推荐