动态规划求解最短编辑距离

最短编辑距离

The edit distance between two strings is the minimum number of letter insertions(插入), letter
deletions(删除), and letter substitutions(替换) required to transform one string into another. For example, the edit distance between FOOD and MONEY is at most four:

动态规划求解最短编辑距离

递归结构

If we remove the last column, the remaining columns must represent the shortest edit sequence for the remaining prefixes.
If the prefixes had a shorter edit sequence, gluing the last column back on would gives us a shorter edit sequence for the original strings. So once we figure out what should happen in the last column, the Recursion Fairy can figure out the rest of the optimal gap representation.
Thus, for any two input strings A[1 … m] and B[1 … n], we can formulate the edit distance problem recursively as follows: For any indices i and j, let Edit(i, j) denote the edit distance between the prefixes A[1 … i] and B[1 … j]. We need to compute Edit(m, n).
We conclude that the Edit function satisfies the following recurrence:
动态规划求解最短编辑距离

动态规划

• Subproblems(子问题): Each recursive subproblem is identified by two indices 0 ≤ i ≤ m and
0 ≤ j ≤ n.
• Memoization structure(存储结构): So we can memoize all possible values of Edit(i, j) in a
two-dimensional array Edit[0 … m, 0 … n].
• Dependencies(依赖): Each entry Edit[i, j] depends only on its three neighboring entries
Edit[i − 1, j], Edit[i, j − 1], and Edit[i − 1, j − 1].
• Evaluation order(求值顺序):
So if we fill in our table in the standard row-major order—row by row from top down, each row from left to right—then whenever we reach an entry in the table, the entries it depends on are already available. (This isn’t the only evaluation order we could use, but it works, so let’s go with it.)
• Space and time(空间与时间复杂度): The memoization structure uses O(mn) space. We can compute each entry Edit[i, j] in O(1) time once we know its predecessors, so the overall algorithm runs in O(mn) time.
动态规划求解最短编辑距离
Transforming ALGORITHM into ALTRUISTIC. These edit sequences are shown in Figure below.

动态规划求解最短编辑距离