Can this MySQL query be optimized?

Problem description:

I'm currently trying to optimize a MySQL query that runs a little slowly on tables with 10,000+ rows.

CREATE TABLE IF NOT EXISTS `person` (
    `_id` int(11) unsigned NOT NULL AUTO_INCREMENT, 
    `_oid` char(8) NOT NULL, 
    `firstname` varchar(255) NOT NULL, 
    `lastname` varchar(255) NOT NULL, 
    PRIMARY KEY (`_id`), 
    KEY `_oid` (`_oid`) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8; 

CREATE TABLE IF NOT EXISTS `person_cars` (
    `_id` int(11) NOT NULL AUTO_INCREMENT, 
    `_oid` char(8) NOT NULL, 
    `idx` varchar(255) NOT NULL, 
    `val` blob NOT NULL, 
    PRIMARY KEY (`_id`), 
    KEY `_oid` (`_oid`), 
    KEY `idx` (`idx`), 
    KEY `val` (`val`(64)) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8; 

# Insert some 10000+ rows… 

INSERT INTO `person` (`_oid`,`firstname`,`lastname`) 
VALUES 
    ('1', 'John', 'Doe'), 
    ('2', 'Jack', 'Black'), 
    ('3', 'Jim', 'Kirk'), 
    ('4', 'Forrest', 'Gump'); 

INSERT INTO `person_cars` (`_oid`,`idx`,`val`) 
VALUES 
    ('1', '0', 'BMW'), 
    ('1', '1', 'PORSCHE'), 
    ('2', '0', 'BMW'), 
    ('3', '1', 'MERCEDES'), 
    ('3', '0', 'TOYOTA'), 
    ('3', '1', 'NISSAN'), 
    ('4', '0', 'OLDMOBILE'); 


SELECT `_person`.`_oid`, 
     `_person`.`firstname`, 
     `_person`.`lastname`, 
     `_person_cars`.`cars[0]`, 
     `_person_cars`.`cars[1]` 

FROM `person` `_person` 

LEFT JOIN (

    SELECT `_person`.`_oid`, 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`, 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]` 
    FROM `person` `_person` 
    JOIN `person_cars` `_person_cars` ON `_person`.`_oid` = `_person_cars`.`_oid` 
    GROUP BY `_person`.`_oid` 

) `_person_cars` ON `_person_cars`.`_oid` = `_person`.`_oid` 

WHERE `cars[0]` = 'BMW' OR `cars[1]` = 'BMW'; 

The SELECT query above takes ~170 ms on a virtual machine running MySQL 5.1.53, with approx. 10,000 rows in each of the two tables.

+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
| id | select_type | table        | type  | possible_keys | key  | key_len | ref  | rows | Extra                                       |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
|  1 | PRIMARY     | <derived2>   | ALL   | NULL          | NULL | NULL    | NULL |    4 | Using where                                 |
|  1 | PRIMARY     | _person      | ALL   | _oid          | NULL | NULL    | NULL |    4 | Using where; Using join buffer              |
|  2 | DERIVED     | _person_cars | ALL   | _oid          | NULL | NULL    | NULL |    7 | Using temporary; Using filesort             |
|  2 | DERIVED     | _person      | index | _oid          | _oid | 24      | NULL |    4 | Using where; Using index; Using join buffer |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+

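When I EXPLAIN the query above, the result differs depending on how many rows each table holds. The plan above appears to come from the four-row sample data; with some 10,000 rows, EXPLAIN gives: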

+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref                    | rows | Extra                           |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
|  1 | PRIMARY     | <derived2>   | ALL  | NULL          | NULL | NULL    | NULL                   | 6613 | Using where                     |
|  1 | PRIMARY     | _person      | ref  | _oid          | _oid | 24      | _person_cars._oid      |   10 |                                 |
|  2 | DERIVED     | _person_cars | ALL  | _oid          | NULL | NULL    | NULL                   | 9913 | Using temporary; Using filesort |
|  2 | DERIVED     | _person      | ref  | _oid          | _oid | 24      | test._person_cars._oid |   10 | Using index                     |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+

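Things get worse when I omit the WHERE clause, or when I join another table similar to person_cars.

Does anyone have an idea how to optimize the SELECT query to make things a bit faster?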


Two quick questions: why are you using the BLOB data type to store the car make? Also, have you considered using InnoDB instead of MyISAM as the storage engine? – GordyD 2011-05-24 10:16:02


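I use a blob because I have to store data of arbitrary length in that column. The cars are just an example. And no, I haven't tried InnoDB yet, because the project doesn't use InnoDB at all. I'll give it a shot, thanks :) – xlttj 2011-05-24 11:10:43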


Also keep in mind that having BLOB or TEXT fields forces all temporary tables (created during joins and sorts) to be on-disk tables. – Marki555 2012-01-09 20:18:25

It's slow because this forces three full table scans on persons, which then get joined together:

LEFT JOIN (
    ... 
    GROUP BY `_person`.`_oid` -- the group by here 
) `_person_cars` ... 

WHERE ... -- and the where clauses on _person_cars. 

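Consider, for one, that the where clause turns the left join into what is effectively an inner join. You could also push the conditions down so they apply before persons is joined. And the join is needlessly applied twice.

This will make it faster, but if you have an order by/limit clause it will still cause a full table scan on persons (i.e. still not good), because of the group by in the subquery: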

JOIN (
SELECT `_person_cars`.`_oid`, 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`, 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]` 
    FROM `person_cars` `_person_cars` -- alias added so the `_person_cars`.* references resolve 
    GROUP BY `_person_cars`.`_oid` 
    HAVING IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) = 'BMW' OR 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) = 'BMW' 
) `_person_cars` ... -- smaller number of rows 
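
Put together with the outer query, the derived table above might be used as in the following sketch. It assumes the sample schema from the question, drops the IFNULL(..., NULL) wrappers (GROUP_CONCAT already returns NULL when it has no non-NULL values to concatenate), refers to the column aliases in HAVING, which MySQL permits, and compares `idx` with string literals because the column is a varchar. The aliases `p` and `pc` are chosen here only for brevity.

```sql
-- Sketch only: filter inside the derived table, then join person once.
SELECT p.`_oid`,
       p.`firstname`,
       p.`lastname`,
       pc.`cars[0]`,
       pc.`cars[1]`
FROM `person` p
JOIN (
    SELECT `_oid`,
           GROUP_CONCAT(IF(`idx` = '0', `val`, NULL)) AS `cars[0]`,
           GROUP_CONCAT(IF(`idx` = '1', `val`, NULL)) AS `cars[1]`
    FROM `person_cars`
    GROUP BY `_oid`
    -- Only groups with a BMW in slot 0 or slot 1 survive to the join
    HAVING `cars[0]` = 'BMW' OR `cars[1]` = 'BMW'
) pc ON pc.`_oid` = p.`_oid`;
```

Running EXPLAIN on this form should show whether `person` is still scanned in full on a given data set.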

If an order by/limit applies, you'll get better results with two queries. First:

SELECT `_person`.`_oid`, 
     `_person`.`firstname`, 
     `_person`.`lastname` 
FROM `person` `_person` 
JOIN `person_cars` `_person_cars` 
ON `_person_cars`.`_oid` = `_person`.`_oid` 
AND `_person_cars`.`val` = 'BMW' 
GROUP BY -- pre-sort the result before grouping, so as to not do the work twice 
     `_person`.`lastname`, 
     `_person`.`firstname`, 
     -- eliminate users with multiple BMWs 
     `_person`.`_oid` 
ORDER BY `_person`.`lastname`, 
     `_person`.`firstname`, 
     `_person`.`_oid` 
LIMIT 10 

Then select the cars with an IN () clause, using the resulting ids.
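
A minimal sketch of that second query follows. The literal ids '2' and '1' are placeholders for whatever `_oid` values the first query returned (they are what the four-row sample data would yield for 'BMW'); in practice the application would substitute its own list.

```sql
-- Sketch only: fetch all cars for the people found by the first query.
SELECT `_oid`, `idx`, `val`
FROM `person_cars`
WHERE `_oid` IN ('2', '1')   -- placeholder ids taken from the first result set
ORDER BY `_oid`, `idx`;
```

Because the list is limited to the page of people already selected, this lookup can use the existing `_oid` key rather than grouping the whole table.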

Oh, and your `val` column should probably be a varchar.
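
A hedged sketch of that column change, assuming every stored value fits in 255 characters. The asker mentions elsewhere that real values can be of arbitrary length, in which case this would not apply as written. The length 255 is an assumption, not something stated in the thread.

```sql
-- Check the longest stored value (in bytes) before changing the type.
SELECT MAX(LENGTH(`val`)) AS longest_val FROM `person_cars`;

-- Only if the result fits: convert the column; the existing prefix key `val`(64) stays valid.
ALTER TABLE `person_cars` MODIFY `val` varchar(255) NOT NULL;
```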


Thanks for your extensive answer, I'll look into this carefully and try to understand... – xlttj 2011-05-24 11:58:24

Check this:

SELECT 
    p._oid  AS oid, 
    p.firstname AS firstname, 
    p.lastname AS lastname, 
    pc.val  AS car1, 
    pc2.val  AS car2 
FROM person AS p 
    LEFT JOIN person_cars AS pc 
    ON pc._oid = p._oid 
     AND pc.idx = 0 
    LEFT JOIN person_cars AS pc2 
    ON pc2._oid = p._oid 
     AND pc2.idx = 1 
WHERE pc.val = 'BMW' 
    OR pc2.val = 'BMW'
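
One possible refinement, which nobody in the thread suggests: because `idx` is declared as a varchar, comparing it with string literals avoids a string-to-number conversion that can keep MySQL from using an index on that column, and a composite key on (`_oid`, `idx`) may let each join be resolved from a single index lookup. The key name `oid_idx` is made up for illustration, and whether the optimizer actually picks it should be confirmed with EXPLAIN.

```sql
-- Hypothetical composite key covering both join conditions.
ALTER TABLE `person_cars` ADD KEY `oid_idx` (`_oid`, `idx`);

SELECT
    p._oid      AS oid,
    p.firstname AS firstname,
    p.lastname  AS lastname,
    pc.val      AS car1,
    pc2.val     AS car2
FROM person AS p
    LEFT JOIN person_cars AS pc
    ON pc._oid = p._oid
     AND pc.idx = '0'      -- string literal matches the varchar column
    LEFT JOIN person_cars AS pc2
    ON pc2._oid = p._oid
     AND pc2.idx = '1'
WHERE pc.val = 'BMW'
    OR pc2.val = 'BMW';
```

Unlike the GROUP_CONCAT version, this form returns one row per matching combination, so a person with several rows for the same `idx` appears more than once.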