Hirschberg 's method for LCS
一种采用了分治思想和动态规划的LCS算法,仅需O(NM)时间复杂度和O(min(N,M))空间复杂度就可得最大子串及其长度,具体参见论文:A Linear Space Algorithm for Computing Maximal Common Subsequences。这里介绍一下自己的理解。
如何分治?
给定两个字符串X(长度为m),Y(长度为n)。将X平分为两个子串X(1,m/2)和X(m/2+1,m),我们想在Y找到一个分割点j将Y切割为Y(1,j)和Y(j+1,n),使得LCS(X(1,m/2),Y(1,j))+LCS(X(m/2+1,m),Y(j+1,n))=LCS(X,Y)。
我们很容易看出,当LCS(X(1,m/2),Y(1,j))+LCS(X(m/2+1,m),Y(j+1,n))取得值最大时,j即为我们想找的切割点。
找到切割点之后我们就可以采用分治思想去推出子串。
具体算法参见论文:
关于时间复杂度:
摘自:https://www.ics.uci.edu/~eppstein/161/960229.html
This is a recursive algorithm, with a time recurrence
T(m,n) = O(mn) + T(m/2,k) + T(m/2,n-k)
You can think of this as sort of like quicksort -- we're breaking both strings into parts. But unlike quicksort it doesn't matter that the second string can be broken unequally. No matter
what k is, the recurrence solves to O(mn). The easiest way to see this is to think about what it's doing in the array L. The main part of the algorithm visits the whole array, then the two calls visit two subarrays, one above and left of (m/2,k) and the other
below and to the right. No matter what k is, the total size of these two subarrays is roughly mn/2. So instead we can write a simplified recurrence
T(mn) = O(mn) + T(mn/2)
which solves to O(mn) time total.