Hierarchical join takes a very long time
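Question:
I have a tree-shaped hierarchy that represents my product structure.
At each product level (6 levels) there is a sales price linked to it. I use two tables joined to each other to link the prices of the lower levels to the prices above them.
I want to do this so that I don't count any price more than once. This is done with the code below (I only show levels 0, 1 and 2 to convey the idea):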
SELECT L0_SALESPRICE
     , L1_SALESPRICE
     , L2_SALESPRICE
FROM
    (SELECT DISTINCT A.*
     FROM BCT A
     JOIN QuotationLine QL ON A.PRICECALCID = QL.PRICECALCID
     WHERE A.Levels = 0) AS L0
JOIN
    (SELECT DISTINCT A.*
     FROM BCT A
     JOIN QuotationLine QL ON A.PRICECALCID = QL.PRICECALCID
     WHERE A.Levels = 1) AS L1 ON L0.ItemId = L1.ParentItemId
JOIN
    (SELECT DISTINCT A.*
     FROM BCT A
     JOIN QuotationLine QL ON A.PRICECALCID = QL.PRICECALCID
     WHERE A.Levels = 2) AS L2 ON L1.ItemId = L2.ParentItemId
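The problem is that the query never finishes executing, and I get an out-of-memory error.
Table BCT has 750,000 rows and table QuotationLine has 22,000 rows.
Any advice is appreciated.
Answer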
To demonstrate how to approach this problem, here are some sample table definitions.
CREATE TABLE [Description]
(
    [DescriptionId] INT IDENTITY NOT NULL CONSTRAINT [PK_Description] PRIMARY KEY,
    [DescriptionText] NVARCHAR(50) NOT NULL
)
GO
CREATE TABLE [Hierarchy]
(
    [HierarchyId] INT NOT NULL CONSTRAINT [PK_Hierarchy] PRIMARY KEY,
    [ParentHierarchyId] INT NULL CONSTRAINT [FK_Hierarchy_ParentHierarchyId]
        REFERENCES [Hierarchy] ([HierarchyId]) ON DELETE NO ACTION ON UPDATE NO ACTION,
    [Price] MONEY NOT NULL,
    [DescriptionId] INT NULL CONSTRAINT [FK_Hierarchy_Description]
        REFERENCES [Description] ([DescriptionId]) ON DELETE SET NULL ON UPDATE CASCADE
)
GO
CREATE INDEX [IX_Hierarchy_ParentHierarchyId] ON [Hierarchy]
    ([ParentHierarchyId]) INCLUDE ([HierarchyId], [Price], [DescriptionId])
GO
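To try the queries in this answer against the sample tables, a few rows are needed. The following is a minimal sketch with made-up data (the product names, prices, and IDs are hypothetical and not from the original post). It assumes the two tables were just created and are empty, and it needs SQL Server 2008 or later for the multi-row VALUES syntax.

```sql
-- Hypothetical sample data: one root (1), two mid-level nodes (2, 3),
-- and three leaves (4, 5, 6).
INSERT [Description] ([DescriptionText])
VALUES (N'Root product'), (N'Assembly A'), (N'Assembly B');

-- Look the DescriptionId up by text instead of assuming which IDENTITY
-- values were assigned. Parents and children can go in one statement,
-- because the self-referencing foreign key is checked per statement.
INSERT [Hierarchy] ([HierarchyId], [ParentHierarchyId], [Price], [DescriptionId])
SELECT v.[HierarchyId], v.[ParentHierarchyId], v.[Price], d.[DescriptionId]
FROM (VALUES (1, NULL, 100.00, N'Root product'),
             (2, 1,     40.00, N'Assembly A'),
             (3, 1,     60.00, N'Assembly B'),
             (4, 2,     15.00, NULL),
             (5, 2,     25.00, NULL),
             (6, 3,     60.00, NULL))
     AS v ([HierarchyId], [ParentHierarchyId], [Price], [DescriptionText])
LEFT JOIN [Description] d ON d.[DescriptionText] = v.[DescriptionText];
GO
```

With this data the leaves are rows 4, 5 and 6, so a query that returns one row per leaf should return three rows.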
A naive way to obtain the hierarchy, that is, one that is unlikely to solve your performance problem, might be:
;WITH RowEnds AS
(
    -- Anchor: leaf nodes (rows that are nobody's parent)
    SELECT h.[HierarchyId], h.[ParentHierarchyId], h.[Price], h.[DescriptionId],
           h.[HierarchyId] AS [RowEndHierarchyId], 0 AS [ReverseLevel]
    FROM [Hierarchy] h
    WHERE NOT EXISTS (SELECT 1 FROM [Hierarchy] i WHERE i.[ParentHierarchyId] = h.[HierarchyId])
    UNION ALL
    -- Recursive step: climb from each row to its parent
    SELECT h.[HierarchyId], h.[ParentHierarchyId], h.[Price], h.[DescriptionId],
           r.[RowEndHierarchyId], r.[ReverseLevel] + 1 AS [ReverseLevel]
    FROM [Hierarchy] h
    INNER JOIN RowEnds r ON h.[HierarchyId] = r.[ParentHierarchyId]
),
InOrder AS
(
    -- Number each leaf's ancestors from the root (1) down to the leaf
    SELECT r.RowEndHierarchyId, r.[HierarchyId], r.Price, d.DescriptionText,
           RANK() OVER (PARTITION BY r.[RowEndHierarchyId] ORDER BY r.[ReverseLevel] DESC) AS [Level]
    FROM RowEnds r
    LEFT JOIN [Description] d ON r.DescriptionId = d.DescriptionId
)
SELECT DISTINCT o.RowEndHierarchyId,
    p.[1] AS Price1, d.[1] AS Description1,
    p.[2] AS Price2, d.[2] AS Description2,
    p.[3] AS Price3, d.[3] AS Description3,
    p.[4] AS Price4, d.[4] AS Description4,
    p.[5] AS Price5, d.[5] AS Description5,
    p.[6] AS Price6, d.[6] AS Description6,
    p.[7] AS Price7, d.[7] AS Description7
FROM InOrder o
INNER JOIN
    (SELECT projp.RowEndHierarchyId, projp.[Level], projp.[Price]
     FROM InOrder projp) ppre
    PIVOT (MIN([Price]) FOR [Level] IN ([1], [2], [3], [4], [5], [6], [7])) p
    ON o.RowEndHierarchyId = p.RowEndHierarchyId
LEFT JOIN
    (SELECT projd.RowEndHierarchyId, projd.[Level], projd.DescriptionText
     FROM InOrder projd) dpre
    PIVOT (MIN(DescriptionText) FOR [Level] IN ([1], [2], [3], [4], [5], [6], [7])) d
    ON o.RowEndHierarchyId = d.RowEndHierarchyId
ORDER BY o.RowEndHierarchyId
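This example, of course, uses a recursive common table expression to obtain the hierarchy. Rather than starting at the root of the tree and working toward the leaves, the query takes the opposite approach: it starts at the leaves and climbs toward the root. The advantage is that each row in the output corresponds to exactly one leaf node in the tree.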
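The leaf-to-root walk is easier to see in isolation. Here is a stripped-down sketch of the same idea on a tiny table variable; the names @Tree, Walk, and LeafId and the four sample rows are made up for illustration and are not part of the schema above.

```sql
-- Hypothetical four-node tree: 1 is the root, 2 and 4 are its children,
-- and 3 is a child of 2. The leaves are therefore 3 and 4.
DECLARE @Tree TABLE (Id INT PRIMARY KEY, ParentId INT NULL);
INSERT @Tree (Id, ParentId) VALUES (1, NULL), (2, 1), (3, 2), (4, 1);

;WITH Walk AS
(
    -- Anchor: every node that has no children is a leaf
    SELECT t.Id, t.ParentId, t.Id AS LeafId, 0 AS ReverseLevel
    FROM @Tree t
    WHERE NOT EXISTS (SELECT 1 FROM @Tree c WHERE c.ParentId = t.Id)
    UNION ALL
    -- Recursive step: move one level up, carrying the leaf's Id along
    SELECT t.Id, t.ParentId, w.LeafId, w.ReverseLevel + 1
    FROM @Tree t
    INNER JOIN Walk w ON t.Id = w.ParentId
)
SELECT LeafId, Id, ReverseLevel
FROM Walk
ORDER BY LeafId, ReverseLevel;
-- Rows returned (LeafId, Id, ReverseLevel):
--   (3, 3, 0), (3, 2, 1), (3, 1, 2), (4, 4, 0), (4, 1, 1)
```

Every ancestor appears once per leaf beneath it, which is why the intermediate result can grow large on a big tree even though the final pivoted output has only one row per leaf.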
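However, the performance of this approach may still be unsatisfactory, because you have no opportunity to index the common table expression, and its result set may be very large. Assuming tempdb has sufficient space and performance, the following more verbose query may perform better.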
CREATE TABLE #RowEnd
(
    [RowEndHierarchyId] INT NOT NULL,
    [HierarchyId] INT NOT NULL,
    [ParentHierarchyId] INT NULL,
    [Price] MONEY NOT NULL,
    [DescriptionId] INT NULL,
    [ReverseLevel] INT NOT NULL,
    PRIMARY KEY ([RowEndHierarchyId], [ReverseLevel] DESC)
)
CREATE INDEX [IX_RowEnd_ParentHierarchyId] ON #RowEnd
    ([ParentHierarchyId], [RowEndHierarchyId], [ReverseLevel])
CREATE INDEX [IX_RowEnd_ReverseLevel] ON #RowEnd
    ([ReverseLevel] DESC, [ParentHierarchyId], [RowEndHierarchyId])

-- Seed the table with the leaf nodes at ReverseLevel 1
INSERT #RowEnd ([HierarchyId], [ParentHierarchyId], [Price], [DescriptionId], [RowEndHierarchyId], [ReverseLevel])
SELECT h.[HierarchyId], h.[ParentHierarchyId], h.[Price], h.[DescriptionId], h.[HierarchyId], 1
FROM [Hierarchy] h
WHERE NOT EXISTS (SELECT 1 FROM [Hierarchy] i WHERE i.ParentHierarchyId = h.[HierarchyId])

-- Climb one level per iteration until no new parents are found
DECLARE @ReverseLevel INT
SET @ReverseLevel = 0
WHILE EXISTS (SELECT 1 FROM #RowEnd re WHERE re.ReverseLevel > @ReverseLevel)
BEGIN
    SET @ReverseLevel = @ReverseLevel + 1
    INSERT #RowEnd ([HierarchyId], [ParentHierarchyId], [Price], [DescriptionId], [RowEndHierarchyId], [ReverseLevel])
    SELECT h.[HierarchyId], h.[ParentHierarchyId], h.[Price], h.[DescriptionId], re.[RowEndHierarchyId], @ReverseLevel + 1
    FROM [Hierarchy] h
    INNER JOIN #RowEnd re ON re.ParentHierarchyId = h.[HierarchyId] AND re.ReverseLevel = @ReverseLevel
END

-- Pivot the prices: one row per leaf, one column per level (1 = root)
CREATE TABLE #Price
(
    RowEndHierarchyId INT NOT NULL PRIMARY KEY,
    [1] MONEY NULL,
    [2] MONEY NULL,
    [3] MONEY NULL,
    [4] MONEY NULL,
    [5] MONEY NULL,
    [6] MONEY NULL,
    [7] MONEY NULL
)
INSERT #Price (RowEndHierarchyId, [1], [2], [3], [4], [5], [6], [7])
SELECT p.RowEndHierarchyId, p.[1], p.[2], p.[3], p.[4], p.[5], p.[6], p.[7]
FROM (SELECT re.RowEndHierarchyId, re.Price,
             RANK() OVER (PARTITION BY re.RowEndHierarchyId ORDER BY re.ReverseLevel DESC) AS [Level]
      FROM #RowEnd re) ppre
PIVOT (MIN([Price]) FOR [Level] IN ([1], [2], [3], [4], [5], [6], [7])) p

-- Pivot the descriptions the same way
CREATE TABLE #Description
(
    RowEndHierarchyId INT NOT NULL PRIMARY KEY,
    [1] NVARCHAR(50) NULL,
    [2] NVARCHAR(50) NULL,
    [3] NVARCHAR(50) NULL,
    [4] NVARCHAR(50) NULL,
    [5] NVARCHAR(50) NULL,
    [6] NVARCHAR(50) NULL,
    [7] NVARCHAR(50) NULL
)
INSERT #Description (RowEndHierarchyId, [1], [2], [3], [4], [5], [6], [7])
SELECT d.RowEndHierarchyId, d.[1], d.[2], d.[3], d.[4], d.[5], d.[6], d.[7]
FROM (SELECT re.RowEndHierarchyId, dt.DescriptionText,
             RANK() OVER (PARTITION BY re.RowEndHierarchyId ORDER BY re.ReverseLevel DESC) AS [Level]
      FROM #RowEnd re
      LEFT JOIN [Description] dt ON re.DescriptionId = dt.DescriptionId) dpre
PIVOT (MIN([DescriptionText]) FOR [Level] IN ([1], [2], [3], [4], [5], [6], [7])) d

-- Combine the two pivoted tables into the final result
SELECT p.RowEndHierarchyId,
    p.[1] AS Price1, d.[1] AS Description1,
    p.[2] AS Price2, d.[2] AS Description2,
    p.[3] AS Price3, d.[3] AS Description3,
    p.[4] AS Price4, d.[4] AS Description4,
    p.[5] AS Price5, d.[5] AS Description5,
    p.[6] AS Price6, d.[6] AS Description6,
    p.[7] AS Price7, d.[7] AS Description7
FROM #Price p
INNER JOIN #Description d ON p.RowEndHierarchyId = d.RowEndHierarchyId
ORDER BY p.RowEndHierarchyId

DROP TABLE #Description
DROP TABLE #Price
DROP TABLE #RowEnd
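The basic logic for obtaining the hierarchy is similar to the previous version. However, indexing the temporary tables in this way may substantially improve query performance.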
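Whether the temp-table version actually helps depends on the data and the server, so it is worth measuring rather than assuming. One way to do that, sketched below, is to wrap each variant in SET STATISTICS IO and SET STATISTICS TIME, which report logical reads and CPU/elapsed time in the messages output. The COUNT(*) statement is only a placeholder for the query under test.

```sql
-- Turn on per-statement I/O and timing output for this session
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder: replace with the query being measured, for example the
-- recursive CTE version or the temp-table version from this answer
SELECT COUNT(*) AS HierarchyRows
FROM [Hierarchy];

-- Turn the output off again so later statements are not affected
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the logical reads on [Hierarchy] and on the work tables between the two variants gives a rough indication of where the time goes.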
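It takes a long time because of the number of rows. Do you really need to fetch all of them? – DavidG 2014-10-20 08:23:10
Post the query plan first, so that we can see whether you are missing indexes and so on. Then describe your hardware: something like this is not that hard on a decent mid-range server, but on something small, ouch. – TomTom 2014-10-20 08:36:19
Try using a 'CASE' expression with 'WHEN A.Levels = 0', so that you only need to query the table once. Can you provide some DDL or a SQLFiddle? – NickyvV 2014-10-20 08:46:01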