Casting in join filter - is it ruling out an index scan?
I'm joining a good few tables, and the resulting plan is horrible.
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Nested Loop Left Join (cost=15291.63..367830181335340285952.00 rows=8062002970089247211520 width=676) |
| Join Filter: CASE WHEN (x415."SOURCE_URI" IS NOT NULL) THEN ((x415."SOURCE_URI")::text = (x47."SOURCE_URI")::text) ELSE NULL::boolean END |
| -> Nested Loop Left Join (cost=15291.63..53031043002887.09 rows=1767529209075750 width=900) |
| Join Filter: CASE WHEN (CASE WHEN (x414."CAT" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN ((x414."CAT")::text = (x415."CAT")::text) ELSE NULL::boolean END |
| -> Nested Loop Left Join (cost=15291.63..5166730086.91 rows=172147962900 width=846) |
| Join Filter: CASE WHEN (CASE WHEN (x297."ID" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN (x297."ID" = x258."TRACK") ELSE NULL::boolean END |
| -> Nested Loop Left Join (cost=15291.63..2291101.66 rows=71430690 width=816) |
| Join Filter: CASE WHEN (CASE WHEN (x297."ID" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN (x297."ID" = x213."TRACK") ELSE NULL::boolean END |
| -> Nested Loop Left Join (cost=15291.63..148094.28 rows=33270 width=786) |
| Join Filter: CASE WHEN (CASE WHEN (x330."ID" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN (x330."ID" = x297."MEDIUM") ELSE NULL::boolean END |
| -> Nested Loop Left Join (cost=15291.63..146700.24 rows=4 width=724) |
| -> Nested Loop Left Join (cost=15291.22..146681.05 rows=4 width=654) |
| -> Nested Loop Left Join (cost=15290.80..146667.32 rows=4 width=542) |
| -> Nested Loop Left Join (cost=15290.25..146643.82 rows=4 width=496) |
| -> Nested Loop Left Join (cost=15289.69..146621.36 rows=4 width=448) |
| -> Hash Right Join (cost=15289.27..146602.54 rows=4 width=438) |
| Hash Cond: ((x355."SOURCE_URI")::text = (x410."SOURCE_URI")::text) |
| -> Seq Scan on "RELEASE_IMAGE" x355 (cost=0.00..111558.45 rows=5267945 width=120) |
| -> Hash (cost=15289.22..15289.22 rows=4 width=318) |
| -> Hash Right Join (cost=13996.67..15289.22 rows=4 width=318) |
| Hash Cond: ((x376."SOURCE_URI")::text = (x410."SOURCE_URI")::text) |
| -> Seq Scan on "RELEASE_VOTED_TAG" x376 (cost=0.00..1118.94 rows=46294 width=82) |
| -> Hash (cost=13996.62..13996.62 rows=4 width=236) |
| -> Hash Right Join (cost=13934.61..13996.62 rows=4 width=236) |
| Hash Cond: ((x330."SOURCE_URI")::text = (x410."SOURCE_URI")::text) |
| -> Seq Scan on "MEDIUM" x330 (cost=0.00..53.00 rows=2400 width=83) |
| -> Hash (cost=13934.56..13934.56 rows=4 width=153) |
| -> Nested Loop (cost=0.56..13934.56 rows=4 width=153) |
| -> Seq Scan on "RELEASE_BARCODE" (cost=0.00..13900.21 rows=4 width=40) |
| Filter: (("BARCODE")::text = ANY ('{75992731324,075992731324,0075992731324}'::text[])) |
| -> Index Scan using "RELEASE_pkey" on "RELEASE" x410 (cost=0.56..8.58 rows=1 width=153) |
| Index Cond: (("SOURCE_URI")::text = ("RELEASE_BARCODE"."SOURCE_URI")::text) |
| -> Index Only Scan using "RELEASE_CAT_PK" on "RELEASE_CAT_NO" x414 (cost=0.41..4.70 rows=1 width=74) |
| Index Cond: ("SOURCE_URI" = (x410."SOURCE_URI")::text) |
| -> Index Only Scan using "RELEASE_GENRE_PK" on "RELEASE_GENRE" x409 (cost=0.56..5.61 rows=1 width=48) |
| Index Cond: ("SOURCE_URI" = (x410."SOURCE_URI")::text) |
| -> Index Only Scan using "RELEASE_TYPE_PK" on "RELEASE_TYPE" x394 (cost=0.56..5.83 rows=4 width=46) |
| Index Cond: ("SOURCE_URI" = (x410."SOURCE_URI")::text) |
| -> Index Only Scan using "RELEASE_URL_PK" on "RELEASE_URL" x165 (cost=0.41..3.41 rows=2 width=112) |
| Index Cond: ("SOURCE_URI" = (x410."SOURCE_URI")::text) |
| -> Index Scan using release_label_source_uri on "RELEASE_LABEL" x111 (cost=0.41..4.79 rows=1 width=134) |
| Index Cond: ((x410."SOURCE_URI")::text = ("SOURCE_URI")::text) |
| -> Materialize (cost=0.00..437.53 rows=16635 width=62) |
| -> Seq Scan on "TRACK" x297 (cost=0.00..354.35 rows=16635 width=62) |
| -> Materialize (cost=0.00..97.41 rows=4294 width=30) |
| -> Seq Scan on "TRACK_COMPOSER" x213 (cost=0.00..75.94 rows=4294 width=30) |
| -> Materialize (cost=0.00..110.30 rows=4820 width=30) |
| -> Seq Scan on "TRACK_ARTIST" x258 (cost=0.00..86.20 rows=4820 width=30) |
| -> Materialize (cost=0.00..579.02 rows=20535 width=74) |
| -> Seq Scan on "RELEASE_CAT_NO" x415 (cost=0.00..476.35 rows=20535 width=74) |
| -> Materialize (cost=0.00..366235.13 rows=9122342 width=40) |
| -> Seq Scan on "RELEASE" x47 (cost=0.00..249354.42 rows=9122342 width=40) |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
https://explain.depesz.com/s/3DSD
My first reaction was to add some indexes, so I added the following:
CREATE INDEX RELEASE_CAT_CAT_NO on "RELEASE_CAT_NO" ("CAT");
CREATE INDEX "track_medium" on "TRACK" ("MEDIUM");
CREATE INDEX "track_composer_track" on "TRACK_COMPOSER" ("TRACK");
CREATE INDEX "track_artist_track" on "TRACK_ARTIST" ("TRACK");
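But this made no difference. When I run simpler queries I can see the indexes being used, but they are still not used for this query.
That said, adding this index did help: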
CREATE INDEX "release_label_source_uri" on "RELEASE_LABEL" ("SOURCE_URI");
I'm wondering whether the join filter, which may be casting the values to a different type, is responsible:
| Join Filter: CASE WHEN (CASE WHEN (x414."CAT" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN ((x414."CAT")::text = (x415."CAT")::text) ELSE NULL::boolean END |
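CAT is a varchar, and I created an index on it as shown above. The part of the expression that returns 1 or 0 depending on whether CAT is null comes from the subquery's select.
I would have thought this is only applied to the results and would not affect the type of scan? The reason I wonder is that it appears in the "Join Filter" output.
The query is generated by the Slick framework. PostgreSQL 9.6.3.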
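Some thoughts:
- You use outer joins exclusively. That considerably limits the execution plans the planner can choose from.
  Check whether you really need an outer join everywhere, or whether you can use inner joins in some places.
- Many of your join conditions are very complicated and only allow nested loop joins, which hurts performance badly when many rows are involved.
  Try to simplify them! For example, consider this: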
... LEFT JOIN ... ON CASE WHEN (x415."SOURCE_URI" IS NOT NULL) THEN ((x415."SOURCE_URI")::text = (x47."SOURCE_URI")::text) ELSE NULL::boolean END
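This convoluted piece of SQL can be written more simply, because the comparison already yields NULL when x415."SOURCE_URI" is NULL: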
... LEFT JOIN ... ON x415."SOURCE_URI" = x47."SOURCE_URI"
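Then PostgreSQL can use a hash join, which will speed up your query considerably if you have many rows.
- One more index might help your execution plan, depending on how big the table is: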
CREATE INDEX ON "RELEASE_BARCODE"("BARCODE");
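As a rough check that the CASE-wrapped condition and the plain equality behave the same in a LEFT JOIN, including for NULLs, the following sketch compares both on made-up data. The CTE names a and b and their values are assumptions for illustration only; they are not tables from the question.

```sql
-- Hypothetical sample data; a and b stand in for the two joined tables.
WITH a(source_uri) AS (VALUES ('u1'), ('u2'), (NULL)),
     b(source_uri) AS (VALUES ('u1'), ('u3'), (NULL))
-- Variant 1: the generated CASE-wrapped join condition.
SELECT 'case' AS variant, a.source_uri AS a_uri, b.source_uri AS b_uri
FROM a
LEFT JOIN b
  ON CASE WHEN a.source_uri IS NOT NULL
          THEN a.source_uri = b.source_uri
          ELSE NULL::boolean
     END
UNION ALL
-- Variant 2: the simplified condition; NULL = anything is NULL, so no match.
SELECT 'plain', a.source_uri, b.source_uri
FROM a
LEFT JOIN b ON a.source_uri = b.source_uri
ORDER BY 1, 2;
```

Both variants should return the same three rows: one match for 'u1', and one NULL-extended row each for 'u2' and for the NULL key. Only the plain form gives the planner an equality it can use for a hash or merge join.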
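Unfortunately the left joins are required: they are optional relationships. This is generated SQL. I may see whether I can write the SQL by hand to see how it performs. Unfortunately, the index on barcode didn't help (much).
I think most of the gain can be had with good join conditions. I have extended my answer.
Hmm. I hand-rolled an SQL statement, and it executes in 0.16s with a relative cost of 873.05.
Please **[edit]** your question and add the `create table` statements for the tables in question and the query you are using. But in general: if you want an _expression_ in your query to use an index, then the index must be defined using **exactly** the same expression.
Sorry, but the query is too big to fit in the question body. Can you suggest somewhere decent like SQLFiddle that accepts a bigger query (and ideally formats it too)?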
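On the expression-index point raised in the comments, here is a minimal sketch. The table demo_release, the index name, and the use of lower() are all hypothetical and chosen only for illustration; none of them come from the query in the question.

```sql
-- Hypothetical table, not part of the schema in the question.
CREATE TABLE demo_release (
    source_uri varchar(255) NOT NULL,
    title      text
);

-- Expression index: it indexes lower(source_uri), not the raw column.
CREATE INDEX demo_release_lower_uri ON demo_release (lower(source_uri));

-- Same expression as in the index definition, so the index is a candidate.
EXPLAIN SELECT * FROM demo_release WHERE lower(source_uri) = 'some-uri';

-- Different expression: the index above cannot serve this predicate.
EXPLAIN SELECT * FROM demo_release WHERE upper(source_uri) = 'SOME-URI';
```

Whether the planner actually picks the index in the first query also depends on table size and statistics, so on an empty or tiny table a sequential scan may still show up.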