Can this MySQL query be optimized?
Problem description:
I'm currently trying to optimize a MySQL query that runs somewhat slowly on tables with 10,000+ rows.
CREATE TABLE IF NOT EXISTS `person` (
`_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`_oid` char(8) NOT NULL,
`firstname` varchar(255) NOT NULL,
`lastname` varchar(255) NOT NULL,
PRIMARY KEY (`_id`),
KEY `_oid` (`_oid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `person_cars` (
`_id` int(11) NOT NULL AUTO_INCREMENT,
`_oid` char(8) NOT NULL,
`idx` varchar(255) NOT NULL,
`val` blob NOT NULL,
PRIMARY KEY (`_id`),
KEY `_oid` (`_oid`),
KEY `idx` (`idx`),
KEY `val` (`val`(64))
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
# Insert some 10000+ rows…
INSERT INTO `person` (`_oid`,`firstname`,`lastname`)
VALUES
('1', 'John', 'Doe'),
('2', 'Jack', 'Black'),
('3', 'Jim', 'Kirk'),
('4', 'Forrest', 'Gump');
INSERT INTO `person_cars` (`_oid`,`idx`,`val`)
VALUES
('1', '0', 'BMW'),
('1', '1', 'PORSCHE'),
('2', '0', 'BMW'),
('3', '1', 'MERCEDES'),
('3', '0', 'TOYOTA'),
('3', '1', 'NISSAN'),
('4', '0', 'OLDMOBILE');
SELECT `_person`.`_oid`,
`_person`.`firstname`,
`_person`.`lastname`,
`_person_cars`.`cars[0]`,
`_person_cars`.`cars[1]`
FROM `person` `_person`
LEFT JOIN (
SELECT `_person`.`_oid`,
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`,
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]`
FROM `person` `_person`
JOIN `person_cars` `_person_cars` ON `_person`.`_oid` = `_person_cars`.`_oid`
GROUP BY `_person`.`_oid`
) `_person_cars` ON `_person_cars`.`_oid` = `_person`.`_oid`
WHERE `cars[0]` = 'BMW' OR `cars[1]` = 'BMW';
The SELECT query above takes ~170 ms on a virtual machine running MySQL 5.1.53, with approx. 10,000 rows in each of the two tables.
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
| id | select_type | table        | type  | possible_keys | key  | key_len | ref  | rows | Extra                                       |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
|  1 | PRIMARY     | <derived2>   | ALL   | NULL          | NULL | NULL    | NULL |    4 | Using where                                 |
|  1 | PRIMARY     | _person      | ALL   | _oid          | NULL | NULL    | NULL |    4 | Using where; Using join buffer              |
|  2 | DERIVED     | _person_cars | ALL   | _oid          | NULL | NULL    | NULL |    7 | Using temporary; Using filesort             |
|  2 | DERIVED     | _person      | index | _oid          | _oid | 24      | NULL |    4 | Using where; Using index; Using join buffer |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
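When I EXPLAIN the query above, the result depends on how many rows each table holds. The plan above is for the small sample data set; with some 10,000 rows it gives this result: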
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref                    | rows | Extra                           |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
|  1 | PRIMARY     | <derived2>   | ALL  | NULL          | NULL | NULL    | NULL                   | 6613 | Using where                     |
|  1 | PRIMARY     | _person      | ref  | _oid          | _oid | 24      | _person_cars._oid      |   10 |                                 |
|  2 | DERIVED     | _person_cars | ALL  | _oid          | NULL | NULL    | NULL                   | 9913 | Using temporary; Using filesort |
|  2 | DERIVED     | _person      | ref  | _oid          | _oid | 24      | test._person_cars._oid |   10 | Using index                     |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
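Things get worse when I leave out the WHERE clause or when I join another table similar to person_cars.
Does anyone have an idea how to optimize the SELECT query to make things a bit faster?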
Answer
It's slow because it forces three full table scans on person, which then get joined together:
LEFT JOIN (
...
GROUP BY `_person`.`_oid` -- the group by here
) `_person_cars` ...
WHERE ... -- and the where clauses on _person_cars.
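For one, given the WHERE clause, the LEFT JOIN is really an inner join. You could also push the conditions down so they are applied before the join with person. That join is also applied twice needlessly.
Moving the conditions into the subquery would make it faster, but if you have an ORDER BY/LIMIT clause it will still lead to a full table scan on person (i.e. still not good) because of the GROUP BY in the subquery: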
JOIN (
SELECT `_person_cars`.`_oid`,
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`,
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]`
FROM `person_cars` `_person_cars` -- alias needed: the columns above are qualified with `_person_cars`
GROUP BY `_person_cars`.`_oid`
HAVING IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) = 'BMW' OR
IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) = 'BMW'
) `_person_cars` ... -- smaller number of rows
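For reference, here is a rough sketch of how the fragment above might be assembled into a complete statement. The short aliases `p` and `c` are mine, not from the question, and the HAVING clause relies on MySQL accepting select-list aliases there. Treat it as an illustration under those assumptions rather than a tested replacement:

```sql
-- Inner join against a pre-filtered, pre-grouped derived table.
-- Only person_cars is grouped; person is joined once, afterwards.
SELECT p.`_oid`,
       p.`firstname`,
       p.`lastname`,
       c.`cars[0]`,
       c.`cars[1]`
FROM `person` p
JOIN (
    SELECT `_oid`,
           GROUP_CONCAT(IF(`idx` = 0, `val`, NULL)) AS `cars[0]`,
           GROUP_CONCAT(IF(`idx` = 1, `val`, NULL)) AS `cars[1]`
    FROM `person_cars`
    GROUP BY `_oid`
    -- filter the groups before they are joined to person
    HAVING `cars[0]` = 'BMW' OR `cars[1]` = 'BMW'
) c ON c.`_oid` = p.`_oid`;
```

Prefixing the statement with EXPLAIN, as in the question, should show whether the second scan of person is gone.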
If you're applying an ORDER BY/LIMIT, you'll get better results with two queries, i.e.:
SELECT `_person`.`_oid`,
`_person`.`firstname`,
`_person`.`lastname`
FROM `person` `_person`
JOIN `person_cars` `_person_cars`
ON `_person_cars`.`_oid` = `_person`.`_oid`
AND `_person_cars`.`val` = 'BMW'
GROUP BY -- pre-sort the result before grouping, so as to not do the work twice
`_person`.`lastname`,
`_person`.`firstname`,
-- eliminate users with multiple BMWs
`_person`.`_oid`
ORDER BY `_person`.`lastname`,
`_person`.`firstname`,
`_person`.`_oid`
LIMIT 10
and then select the cars using an IN () clause on the resulting ids.
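A minimal sketch of what that second query could look like. The ids '1' and '2' are placeholders for whatever the first query returned, not values taken from a real run:

```sql
-- Second step: fetch the cars for the persons on the current page.
-- Replace the placeholder ids with the `_oid` values from the first query.
SELECT `_oid`, `idx`, `val`
FROM `person_cars`
WHERE `_oid` IN ('1', '2')
ORDER BY `_oid`, `idx`;
```

The application would then attach each car row to its person by `_oid`.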
Oh, and your val column should probably be a varchar.
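If the values really are short strings, the change might look like the sketch below. The length 255 is an assumption on my part, and this would not fit if the column has to hold data of arbitrary length:

```sql
-- Assumption: every value fits in 255 characters.
-- The existing prefix index `val` (`val`(64)) stays valid for a varchar(255).
ALTER TABLE `person_cars`
    MODIFY `val` varchar(255) NOT NULL;
```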
Thanks for the extensive answer, I'll look into it carefully and try to understand it... – xlttj 2011-05-24 11:58:24
Answer
Check this:
SELECT
p._oid AS oid,
p.firstname AS firstname,
p.lastname AS lastname,
pc.val AS car1,
pc2.val AS car2
FROM person AS p
LEFT JOIN person_cars AS pc
ON pc._oid = p._oid
AND pc.idx = 0
LEFT JOIN person_cars AS pc2
ON pc2._oid = p._oid
AND pc2.idx = 1
WHERE pc.val = 'BMW'
OR pc2.val = 'BMW'
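Two quick questions: why are you using the BLOB data type to store the car make? Also, have you considered using InnoDB rather than MyISAM as the storage engine? – GordyD 2011-05-24 10:16:02
I use a blob because I have to store data of arbitrary length in that column. Cars are just an example. And no, I haven't tried InnoDB yet, because the project doesn't use InnoDB at all. I'll give it a shot, thanks :) – xlttj 2011-05-24 11:10:43
Also keep in mind that having BLOB or TEXT fields makes all temporary tables (during joins and sorting) actually on-disk tables. – Marki555 2012-01-09 20:18:25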
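One way to check whether a given query is hitting on-disk temporary tables is to compare the session counters before and after running it. This is only a sketch; the counters are standard MySQL status variables, and the query in the middle is whichever one you are testing:

```sql
-- Note the values of Created_tmp_tables and Created_tmp_disk_tables ...
SHOW SESSION STATUS LIKE 'Created_tmp%tables';

-- ... run the query under test here ...

-- ... then look again; a rise in Created_tmp_disk_tables means
-- a temporary table was written to disk.
SHOW SESSION STATUS LIKE 'Created_tmp%tables';
```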