MySQL large query optimization

Problem description:

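I need to optimize the following query, which takes up to 10 minutes to run. According to EXPLAIN, it scans all 350,815 rows of the "table_3" table, while every other table shows 1 row. What are the general rules for placing indexes? Should I consider multidimensional (multi-column) indexes? Where should I use them: in the JOINs, the WHERE, or the GROUP BY? If I remember correctly, there is a hierarchy. Also, if every table but one shows 1 row (in the rows column of the EXPLAIN output), how can I optimize further? Usually my optimization ends once all tables show only one row except one. Each table averages from 100k to 1,000k+ rows.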

CREATE TABLE datab1.sku_performance 
SELECT 
     table1.sku, 
     CONCAT(table1.sku,' ',table1.fk_container) as sku_container, 
     table1.price as price, 
     SUM(CASE WHEN (table1.fk_table1_status = 82 
        OR table1.fk_table1_status = 119 
        OR table1.fk_table1_status = 124 
        OR table1.fk_table1_status = 141 
        OR table1.fk_table1_status = 131) THEN 1 ELSE 0 END) 
      /COUNT(DISTINCT id_catalog_school_class) as qty_returned, 
     SUM(CASE WHEN (table1.fk_table1_status In (23,13,44,65,6,75,8,171,12,166)) 
       THEN 1 ELSE 0 END) 
      /COUNT(DISTINCT id_catalog_school_class) as qt, 
     container.id_container as container_id, 
     container.idden as container_idden, 
     container.delivery_badge, 
     catalog_school.id_catalog_school, 
     LEFT(catalog_school.flight_fair,2) as departing_country, 
     catalog_school.weight, 
     catalog_school.flight_type, 
     catalog_school.price, 
     table_3.id_table_3, 
     table_3.fk_catalog_brand, 
     MAX(LEFT(table_3.note,3)) AS supplier, 
     GROUP_CONCAT(product_number, ' by ',FORMAT(catalog_school_class.quantity,0) 
      ORDER BY product_number ASC SEPARATOR ' + ') as supplier_prod, 
     Sum(distinct(catalog_school_class.purch_pri * catalog_school_class.quantity)) AS final_purch_pri, 
     catalog_groupp.idden as supplier_idden, 
     catalog_category_details.id_catalog_category, 
     catalog_category_details.cat1 as product_cat1, 
     catalog_category_details.cat2 as product_cat2, 
     COUNT(distinct catalog_school_class.id_catalog_school_class) as setinfo, 
     datab1.pageviewgrouped.pv as page_views, 
     Sum(distinct(catalog_school_class.purch_pri * catalog_school_class.quantity)) AS purch_pri, 
     container_has_table_3.position, 
     max(table1.created_at) as last_order_date 
    FROM 
     table1 
     LEFT JOIN container 
      ON table1.fk_container = container.id_container 
     LEFT JOIN catalog_school 
      ON table1.sku = catalog_school.sku 
      LEFT JOIN table_3 
       ON catalog_school.fk_table_3 = table_3.id_table_3 
       LEFT JOIN container_has_table_3 
        ON table_3.id_table_3 = container_has_table_3.fk_table_3 
       LEFT JOIN datab1.pageviewgrouped 
        on table_3.id_table_3 = datab1.pageviewgrouped.url 
        LEFT JOIN datab1.catalog_category_details 
        ON datab1.catalog_category_details.id_catalog_category = table_3_has_catalog_minority.fk_catalog_category 
       LEFT JOIN catalog_groupp 
        ON table_3.fk_catalog_groupp = catalog_groupp.id_catalog_groupp 
       LEFT JOIN table_3_has_catalog_minority 
        ON table_3.id_table_3 = table_3_has_catalog_minority.fk_table_3 
      LEFT JOIN catalog_school_class 
       ON catalog_school.id_catalog_school = catalog_school_class.fk_catalog_school 
    WHERE 
      table_3.status_ok = 1 
     AND catalog_school.status = 'active' 
     AND table_3_has_catalog_minority.is_primary = '1' 
    GROUP BY 
     table1.sku, 
     table1.fk_container; 


Rows:

.table1 960096 to 1.3mn rows 
.container 9275 to 13000 rows 
.catalog_school 709970 to 1 mn rows 
.table_3 709970 to 1 mn rows 
.container_has_table_3 709970 to 1 mn rows 
.pageviewgrouped 500000 rows 
.catalog_school_class 709970 to 1 mn rows 
.catalog_groupp 3000 rows 
.table_3_has_catalog_minority 709970 to 1 mn rows 
.catalog_category_details 659 rows 
+1

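To optimize a query, **we need to see the table and index definitions**, as well as row counts for each of the tables. Maybe your tables are defined poorly. Maybe the indexes were not created correctly. Maybe you do not have an index on that column you thought you did. Without seeing the table and index definitions, we cannot tell. We also need row counts because they can greatly affect query optimization. If you know how to run an 'EXPLAIN' or get an execution plan, put the results in the question as well. If you have no indexes, visit http://use-the-index-luke.com as soon as possible. – 2015-02-23 19:57:40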

+0

What is going on here? '... WHERE table_3.status_ok = AND ...' Looks like something is missing. – Turophile 2015-02-23 21:40:07

+0

When you provide "SHOW CREATE TABLE" for each table, please also tell us how big each table is. – 2015-02-25 05:22:33

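This is too much to put in a comment, so I will add it here and adjust afterward as needed. You have LEFT JOINs everywhere, but your WHERE clause specifically qualifies fields from Table_3, Catalog_School and Table_3_has_catalog_minority. That effectively turns those joins into INNER JOINs: a row with no match in one of those tables would carry NULL in the filtered column, fail the WHERE condition, and be discarded.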
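The effect can be seen on a tiny example. This is only a sketch: the `parent` and `child` tables below are hypothetical and exist purely to illustrate the behavior; they are not part of the question's schema.

```sql
-- Hypothetical two-table demo (names are illustrative only).
CREATE TABLE parent (id INT PRIMARY KEY);
CREATE TABLE child  (id INT PRIMARY KEY, fk_parent INT, is_primary CHAR(1));

INSERT INTO parent VALUES (1), (2);
INSERT INTO child  VALUES (10, 1, '1');   -- parent 2 has no child row

-- Filter in WHERE: for parent 2, c.is_primary is NULL, the comparison
-- is not true, and the row is dropped. Behaves like an INNER JOIN.
SELECT p.id, c.id AS child_id
FROM parent p
LEFT JOIN child c ON c.fk_parent = p.id
WHERE c.is_primary = '1';

-- Filter in ON: parent 2 is kept, with NULL in the child columns.
SELECT p.id, c.id AS child_id
FROM parent p
LEFT JOIN child c ON c.fk_parent = p.id AND c.is_primary = '1';
```

If rows without a match really should be kept, the condition belongs in the ON clause; if not, writing JOIN instead of LEFT JOIN states the intent plainly.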

For your WHERE clause

WHERE 
      table_3.status_ok = 1 
     AND catalog_school.status = 'active' 
     AND table_3_has_catalog_minority.is_primary = '1' 

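which table/column would produce the smallest result set based on these criteria? For example, Table_3.Status_ok = 1 may match 500k records, while table_3_has_catalog_minority.is_primary may match only 65k and catalog_school.status = 'active' may match 430k.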
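One simple way to find out is to count the rows each condition matches on its own. The sketch below uses the table and column names from the question; it assumes each filter can be evaluated against its table independently, and the counts on real data may of course differ from the illustrative figures above.

```sql
-- Count how many rows each WHERE condition keeps, one table at a time.
SELECT COUNT(*) AS status_ok_rows
FROM table_3
WHERE status_ok = 1;

SELECT COUNT(*) AS active_rows
FROM catalog_school
WHERE status = 'active';

SELECT COUNT(*) AS primary_rows
FROM table_3_has_catalog_minority
WHERE is_primary = '1';
```

The table with the lowest count relative to its total size has the most selective filter and is a reasonable candidate to drive the join.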

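Also, some of your columns are not qualified with the table they come from. Can you confirm where columns such as "id_catalog_school_class" and "product_number" come from?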

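Sometimes, changing the order of the tables based on knowledge of how the data is distributed, and adding the MySQL "STRAIGHT_JOIN" keyword, can improve performance. STRAIGHT_JOIN tells the optimizer to join the tables in the order they are listed instead of choosing its own order. I have done this in the past on a contracts and grants database with over 20 million records joined to about 15 lookup tables. It went from hanging the server to completing the query in under 2 hours. Considering the amount of data I was dealing with, that was actually a good time.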

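After dissecting this thing, I restructured it somewhat for readability, added aliases for the table references, changed the order of the tables in the query, and suggested some indexes. To help the query, I moved the Catalog_School table to the first position and added STRAIGHT_JOIN. Its index starts with STATUS to match the WHERE clause, then includes SKU as the first element of the GROUP BY, followed by the other columns used to join to the subsequent tables. With those columns in the index, the engine can qualify the joins without going to the raw data.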

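By changing the GROUP BY to Catalog_School.SKU instead of table1.SKU, the index on catalog_school can help optimize the grouping. Since the tables are joined on catalog_school.sku = table1.sku, the value is the same. I also listed indexes for table1 and table_3. These are only suggestions as well, again meant to pre-qualify the joins without having to visit the tables' raw data pages.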

I would be interested to know the final performance (good or bad) on your data.

TABLE           INDEX ON...
catalog_school  (status, sku, fk_table_3, id_catalog_school)
table1          (sku, fk_container)
table_3         (id_table_3, status_ok, fk_catalog_groupp)
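The suggested indexes could be created roughly as follows. This is a sketch: the index names (`idx_cs_status_sku` and so on) are made up for illustration, and the first table name follows the question's query, which calls it `table1`. It also assumes none of these indexes already exist; checking with SHOW CREATE TABLE first avoids creating duplicates.

```sql
-- Index names are hypothetical; column lists come from the table above.
CREATE INDEX idx_cs_status_sku
    ON catalog_school (status, sku, fk_table_3, id_catalog_school);

CREATE INDEX idx_t1_sku_container
    ON table1 (sku, fk_container);

CREATE INDEX idx_t3_id_status_groupp
    ON table_3 (id_table_3, status_ok, fk_catalog_groupp);
```

Column order matters in a multi-column index: MySQL can use a leftmost prefix of the index, which is why `status` (the WHERE filter) comes before `sku` (the grouping and join column) in the first one.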

SELECT STRAIGHT_JOIN 
     CS.sku, 
     CONCAT(CS.sku,' ',T1.fk_container) as sku_container, 
     T1.price as price, 
     SUM(CASE WHEN T1.fk_table1_status IN (82, 119, 124, 141, 131) 
       THEN 1 ELSE 0 END) 
      /COUNT(DISTINCT CSC.id_catalog_school_class) as qty_returned, 
     SUM(CASE WHEN (T1.fk_table1_status In (23,13,44,65,6,75,8,171,12,166)) 
       THEN 1 ELSE 0 END) 
      /COUNT(DISTINCT CSC.id_catalog_school_class) as qt, 
     CS.id_catalog_school, 
     LEFT(CS.flight_fair,2) as departing_country, 
     CS.weight, 
     CS.flight_type, 
     CS.price, 
     T3.id_table_3, 
     T3.fk_catalog_brand, 
     MAX(LEFT(T3.note,3)) AS supplier, 
     C.id_container as container_id, 
     C.idden as container_idden, 
     C.delivery_badge, 
     GROUP_CONCAT(product_number, ' by ',FORMAT(CSC.quantity,0) 
      ORDER BY product_number ASC SEPARATOR ' + ') as supplier_prod, 
     Sum(distinct(CSC.purch_pri * CSC.quantity)) AS final_purch_pri, 
     CGP.idden as supplier_idden, 
     CCD.id_catalog_category, 
     CCD.cat1 as product_cat1, 
     CCD.cat2 as product_cat2, 
     COUNT(distinct CSC.id_catalog_school_class) as setinfo, 
     PVG.pv as page_views, 
     Sum(distinct(CSC.purch_pri * CSC.quantity)) AS purch_pri, 
     CHT3.position, 
     max(T1.created_at) as last_order_date 
    FROM 
     catalog_school CS 

     JOIN table1 T1 
      ON CS.sku = T1.sku 
      LEFT JOIN container C 
       ON T1.fk_container = C.id_container 

     LEFT JOIN catalog_school_class CSC 
      ON CS.id_catalog_school = CSC.fk_catalog_school 

     JOIN table_3 T3 
      ON CS.fk_table_3 = T3.id_table_3 
      JOIN table_3_has_catalog_minority T3HCM 
       ON T3.id_table_3 = T3HCM.fk_table_3 
       LEFT JOIN datab1.catalog_category_details CCD 
        ON T3HCM.fk_catalog_category = CCD.id_catalog_category 

      LEFT JOIN container_has_table_3 CHT3 
       ON T3.id_table_3 = CHT3.fk_table_3 

      LEFT JOIN datab1.pageviewgrouped PVG 
       on T3.id_table_3 = PVG.url 

      LEFT JOIN catalog_groupp CGP 
       ON T3.fk_catalog_groupp = CGP.id_catalog_groupp 
    WHERE 
      CS.status = 'active' 
     AND T3.status_ok = 1 
     AND T3HCM.is_primary = '1' 
    GROUP BY 
     CS.sku, 
     T1.fk_container; 
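
To check whether the reordered joins and the suggested indexes are actually picked up, the plan can be inspected with EXPLAIN. The statement below is a reduced sketch of the query above, not a replacement for it: it keeps only the three tables that carry the suggested indexes and drops the aggregates, so its plan may differ from the full query's.

```sql
-- Reduced version of the query, for plan inspection only.
EXPLAIN
SELECT STRAIGHT_JOIN
     CS.sku,
     T1.fk_container
    FROM
     catalog_school CS
     JOIN table1 T1
      ON CS.sku = T1.sku
     JOIN table_3 T3
      ON CS.fk_table_3 = T3.id_table_3
    WHERE
      CS.status = 'active'
     AND T3.status_ok = 1
    GROUP BY
     CS.sku,
     T1.fk_container;
```

In the output, the `key` column shows which index was chosen for each table, `rows` gives the estimated number of rows examined, and "Using index" in the `Extra` column indicates that the table was resolved from the index alone without reading the data rows.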
+0

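@MarkoC, glad this answer appears to have worked/helped, but I would be curious to know the final timing of the query after optimizing based on this input. It could also help others to know which techniques worked and how effective they were. – DRapp 2015-03-08 03:37:24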

+0

Thanks again. When I finish my optimization I will post the timings! I am currently working on it and trying to push it further! – 2015-03-17 14:23:21