Casting in a join filter - does it preclude an index scan?

Problem description:

So I am joining quite a few tables, and the result is horrible:

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 
| QUERY PLAN                                              | 
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| 
| Nested Loop Left Join (cost=15291.63..367830181335340285952.00 rows=8062002970089247211520 width=676)                       | 
| Join Filter: CASE WHEN (x415."SOURCE_URI" IS NOT NULL) THEN ((x415."SOURCE_URI")::text = (x47."SOURCE_URI")::text) ELSE NULL::boolean END             | 
| -> Nested Loop Left Join (cost=15291.63..53031043002887.09 rows=1767529209075750 width=900)                        | 
|   Join Filter: CASE WHEN (CASE WHEN (x414."CAT" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN ((x414."CAT")::text = (x415."CAT")::text) ELSE NULL::boolean END    | 
|   -> Nested Loop Left Join (cost=15291.63..5166730086.91 rows=172147962900 width=846)                         | 
|    Join Filter: CASE WHEN (CASE WHEN (x297."ID" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN (x297."ID" = x258."TRACK") ELSE NULL::boolean END       | 
|    -> Nested Loop Left Join (cost=15291.63..2291101.66 rows=71430690 width=816)                         | 
|      Join Filter: CASE WHEN (CASE WHEN (x297."ID" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN (x297."ID" = x213."TRACK") ELSE NULL::boolean END     | 
|      -> Nested Loop Left Join (cost=15291.63..148094.28 rows=33270 width=786)                         | 
|       Join Filter: CASE WHEN (CASE WHEN (x330."ID" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN (x330."ID" = x297."MEDIUM") ELSE NULL::boolean END    | 
|       -> Nested Loop Left Join (cost=15291.63..146700.24 rows=4 width=724)                        | 
|         -> Nested Loop Left Join (cost=15291.22..146681.05 rows=4 width=654)                       | 
|          -> Nested Loop Left Join (cost=15290.80..146667.32 rows=4 width=542)                     | 
|            -> Nested Loop Left Join (cost=15290.25..146643.82 rows=4 width=496)                    | 
|             -> Nested Loop Left Join (cost=15289.69..146621.36 rows=4 width=448)                  | 
|               -> Hash Right Join (cost=15289.27..146602.54 rows=4 width=438)                  | 
|                Hash Cond: ((x355."SOURCE_URI")::text = (x410."SOURCE_URI")::text)                | 
|                -> Seq Scan on "RELEASE_IMAGE" x355 (cost=0.00..111558.45 rows=5267945 width=120)            | 
|                -> Hash (cost=15289.22..15289.22 rows=4 width=318)                    | 
|                  -> Hash Right Join (cost=13996.67..15289.22 rows=4 width=318)               | 
|                   Hash Cond: ((x376."SOURCE_URI")::text = (x410."SOURCE_URI")::text)             | 
|                   -> Seq Scan on "RELEASE_VOTED_TAG" x376 (cost=0.00..1118.94 rows=46294 width=82)        |
|                   -> Hash (cost=13996.62..13996.62 rows=4 width=236)                 | 
|                     -> Hash Right Join (cost=13934.61..13996.62 rows=4 width=236)            | 
|                      Hash Cond: ((x330."SOURCE_URI")::text = (x410."SOURCE_URI")::text)          | 
|                      -> Seq Scan on "MEDIUM" x330 (cost=0.00..53.00 rows=2400 width=83)          | 
|                      -> Hash (cost=13934.56..13934.56 rows=4 width=153)              | 
|                        -> Nested Loop (cost=0.56..13934.56 rows=4 width=153)           | 
|                         -> Seq Scan on "RELEASE_BARCODE" (cost=0.00..13900.21 rows=4 width=40)      | 
|                           Filter: (("BARCODE")::text = ANY ('{75992731324,075992731324,0075992731324}'::text[])) | 
|                         -> Index Scan using "RELEASE_pkey" on "RELEASE" x410 (cost=0.56..8.58 rows=1 width=153) | 
|                           Index Cond: (("SOURCE_URI")::text = ("RELEASE_BARCODE"."SOURCE_URI")::text)   | 
|               -> Index Only Scan using "RELEASE_CAT_PK" on "RELEASE_CAT_NO" x414 (cost=0.41..4.70 rows=1 width=74)         | 
|                Index Cond: ("SOURCE_URI" = (x410."SOURCE_URI")::text)                   | 
|             -> Index Only Scan using "RELEASE_GENRE_PK" on "RELEASE_GENRE" x409 (cost=0.56..5.61 rows=1 width=48)          | 
|               Index Cond: ("SOURCE_URI" = (x410."SOURCE_URI")::text)                     | 
|            -> Index Only Scan using "RELEASE_TYPE_PK" on "RELEASE_TYPE" x394 (cost=0.56..5.83 rows=4 width=46)            | 
|             Index Cond: ("SOURCE_URI" = (x410."SOURCE_URI")::text)                      | 
|          -> Index Only Scan using "RELEASE_URL_PK" on "RELEASE_URL" x165 (cost=0.41..3.41 rows=2 width=112)              | 
|            Index Cond: ("SOURCE_URI" = (x410."SOURCE_URI")::text)                        | 
|         -> Index Scan using release_label_source_uri on "RELEASE_LABEL" x111 (cost=0.41..4.79 rows=1 width=134)              | 
|          Index Cond: ((x410."SOURCE_URI")::text = ("SOURCE_URI")::text)                       | 
|       -> Materialize (cost=0.00..437.53 rows=16635 width=62)                            | 
|         -> Seq Scan on "TRACK" x297 (cost=0.00..354.35 rows=16635 width=62)                       | 
|      -> Materialize (cost=0.00..97.41 rows=4294 width=30)                              | 
|       -> Seq Scan on "TRACK_COMPOSER" x213 (cost=0.00..75.94 rows=4294 width=30)                       | 
|    -> Materialize (cost=0.00..110.30 rows=4820 width=30)                              |
|      -> Seq Scan on "TRACK_ARTIST" x258 (cost=0.00..86.20 rows=4820 width=30)                         | 
|   -> Materialize (cost=0.00..579.02 rows=20535 width=74)                                | 
|    -> Seq Scan on "RELEASE_CAT_NO" x415 (cost=0.00..476.35 rows=20535 width=74)                         | 
| -> Materialize (cost=0.00..366235.13 rows=9122342 width=40)                                | 
|   -> Seq Scan on "RELEASE" x47 (cost=0.00..249354.42 rows=9122342 width=40)                           | 
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 

https://explain.depesz.com/s/3DSD

My first reaction was to add some indexes, so I added the following:

CREATE INDEX RELEASE_CAT_CAT_NO on "RELEASE_CAT_NO" ("CAT"); 
CREATE INDEX "track_medium" on "TRACK" ("MEDIUM"); 
CREATE INDEX "track_composer_track" on "TRACK_COMPOSER" ("TRACK"); 
CREATE INDEX "track_artist_track" on "TRACK_ARTIST" ("TRACK"); 

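But this made no difference. When I run simpler queries I can see the indexes being used, but they are still not used for this query.

That said, adding this index did help: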

CREATE INDEX "release_label_source_uri" on "RELEASE_LABEL" ("SOURCE_URI"); 

I wonder whether the join filters, which appear to cast the values to a different type, are responsible:

|   Join Filter: CASE WHEN (CASE WHEN (x414."CAT" IS NULL) THEN NULL::integer ELSE 1 END IS NOT NULL) THEN ((x414."CAT")::text = (x415."CAT")::text) ELSE NULL::boolean END    | 

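CAT is a varchar, and I created an index on it as shown above. The CASE expression above is what gets emitted when a subquery selects 1 or 0 depending on whether CAT is null.

I assumed this only applies to the results and would not affect the type of scan. The reason I wonder is that it appears in the "Join Filter" output.

These queries are generated by the Slick framework. PostgreSQL 9.6.3.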

+1

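Please [edit] your question and add the CREATE TABLE statements for the tables in question, as well as the query you are using. But in general: if you want an expression in your query to use an index, then that index has to be defined using exactly the same expression.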

+0

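Sorry, but the query is too large to fit in the question body. Can you suggest a decent place like SQLFiddle that accepts a larger query (ideally with formatting as well)?

Some ideas: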

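  • You have outer joins exclusively. This greatly limits the possible execution paths.
    Check whether you really need outer joins, or whether you can use inner joins in some places.

  • Many of your join conditions are very complicated and only allow nested loop joins, which hurts performance a lot if many rows are involved.
    Try to simplify them!

    For example, consider this: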

    ... LEFT JOIN ... 
    ON CASE 
         WHEN (x415."SOURCE_URI" IS NOT NULL) 
         THEN ((x415."SOURCE_URI")::text = (x47."SOURCE_URI")::text) 
         ELSE NULL::boolean 
        END 
    

    This brain-dead piece of SQL can be written as

    ... LEFT JOIN ... 
    ON x415."SOURCE_URI" = x47."SOURCE_URI" 
    

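    Then PostgreSQL can use a hash join, which will greatly speed up your query if you have many rows.

  • There is one more index that might help your execution plan, depending on how big the table is: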

    CREATE INDEX ON "RELEASE_BARCODE"("BARCODE"); 
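
The join-condition rewrite suggested above can be checked on a small scale. The following is a minimal sketch, assuming two hypothetical temporary tables (demo_cat and demo_release, with made-up rows; neither is part of the schema in the question). It compares the generated CASE form of the condition with the plain equality and lists any rows that one form returns and the other does not. Both forms reject the join when the left-hand value is NULL, so the difference is expected to be empty.

```sql
-- Hypothetical miniature tables; names and data are illustrative only.
CREATE TEMP TABLE demo_cat (source_uri varchar(64));
CREATE TEMP TABLE demo_release (source_uri varchar(64), title text);

INSERT INTO demo_cat VALUES ('uri:1'), ('uri:2'), (NULL);
INSERT INTO demo_release VALUES ('uri:1', 'A'), ('uri:1', 'B'), ('uri:3', 'C');

-- Symmetric difference of the two join forms.
-- No rows come back, because CASE ... ELSE NULL and a plain "="
-- both evaluate to NULL (no match) when c.source_uri is NULL.
(
  SELECT c.source_uri, r.title
  FROM demo_cat c
  LEFT JOIN demo_release r
    ON CASE WHEN c.source_uri IS NOT NULL
            THEN c.source_uri::text = r.source_uri::text
            ELSE NULL::boolean
       END
  EXCEPT ALL
  SELECT c.source_uri, r.title
  FROM demo_cat c
  LEFT JOIN demo_release r ON c.source_uri = r.source_uri
)
UNION ALL
(
  SELECT c.source_uri, r.title
  FROM demo_cat c
  LEFT JOIN demo_release r ON c.source_uri = r.source_uri
  EXCEPT ALL
  SELECT c.source_uri, r.title
  FROM demo_cat c
  LEFT JOIN demo_release r
    ON CASE WHEN c.source_uri IS NOT NULL
            THEN c.source_uri::text = r.source_uri::text
            ELSE NULL::boolean
       END
);

-- Compare the plans: the CASE form can only appear as a "Join Filter",
-- while the plain equality is eligible for a hash or merge join.
EXPLAIN
SELECT c.source_uri, r.title
FROM demo_cat c
LEFT JOIN demo_release r ON c.source_uri = r.source_uri;
```

On tables this small the planner may still pick a nested loop for the second query; the point is only that the plain equality gives it the option of a hash or merge join, which is what matters at the row counts shown in the plan from the question.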
    
+0

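Unfortunately the left joins are required - they are optional relationships. This is generated SQL. I might see whether I can write the SQL by hand to find out how it performs. Unfortunately, the index on the barcode did not help (much).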

+0

I think most of the improvement can be gained with good join conditions. I have extended my answer.

+0

Hmm. I hand-rolled an SQL statement that executes in 0.16s, with a relative cost of 873.05.