Similar Problems
Similar Problems not available
Find The Shortest Superstring - Leetcode Solution
Companies:
LeetCode: Find The Shortest Superstring Leetcode Solution
Difficulty: Hard
Topics: string dynamic-programming bit-manipulation array
Problem Statement:
Given a set of n strings arr[], the task is to find the shortest string containing each string in the set as substring.
Example 1:
Input: arr[] = {"abcd", "cdef", "fgh", "de"} Output: "abcdefgh" Explanation: "abcd" + "cdef" -> "abcdef" "abcdef" + "fgh" -> "abcdefgh"
Example 2:
Input: arr[] = {"abcdef", "efgh", "ghij"} Output: "abcdefghij" Explanation: "abcdef" + "efgh" -> "abcdefgh" "abcdefgh" + "ghij" -> "abcdefghij"
Approach:
- We start by choosing the two strings with the highest overlap
- We merge the two strings using the overlap and add the merged string to the remaining strings
- We repeat this process iteratively until we get a single string with all the substrings
- We calculate the length of the final string and return it as the answer.
Algorithm:
- Create a 2D Matrix of size n*n to store overlaps of all the strings. matrix[i][j] will store overlap of ith and jth string.
- For each pair of strings(i, j), calculate the overlap matrix[i][j].
- Find pair of strings(i,j) with maximum overlap.
- Merge strings i and j using the overlapping part.
- Add the merged string to remaining strings array.
- Repeat steps 2-5 until we have only one string in the remaining strings array.
- Calculate the length of the final string and return it as the answer.
Python Implementation:
import sys
def find_overlaps(s1, s2): n1, n2 = len(s1), len(s2) best = (-sys.maxsize, -1) # (overlap, overlap_start)
# compare all suffixes of s1 with all prefixes of s2
for i in range(1, min(n1, n2) + 1):
if s1[n1 - i:] == s2[:i]:
if best[0] < i:
best = (i, n1 - i)
# return both the overlap length and start index
return best
def shortestSuperstring(strings): N = len(strings)
overlaps = [[0] * N for _ in range(N)]
# calculate overlaps of all string pairs
for i in range(N):
for j in range(N):
if i == j:
continue
overlaps[i][j], _ = find_overlaps(strings[i], strings[j])
# While there are more than one string in strings
while len(strings) > 1:
# Find the pair with maximum overlap
max_overlap = -sys.maxsize
p, q = 0, 0
for i in range(N):
for j in range(N):
if overlaps[i][j] > max_overlap:
max_overlap = overlaps[i][j]
p, q = i, j
# Merge strings p and q using overlap
merged = strings[p] + strings[q][max_overlap:]
# Update strings and overlaps arrays
strings = strings[:p] + [merged] + strings[p+1:q] + strings[q+1:]
N -= 1
overlaps = [[0] * N for _ in range(N)]
for i in range(N):
for j in range(N):
if i == j:
continue
overlaps[i][j], _ = find_overlaps(strings[i], strings[j])
return strings[0]
Time Complexity:
The time complexity of this algorithm is O(n^3) because we need to calculate overlaps of all the string pairs which takes O(n^2) time and for each string pair calculation takes O(n) time.
Space Complexity:
The space complexity of this algorithm is also O(n^2) because we are storing overlap values between each pair of strings in a 2D matrix of size n*n.
Find The Shortest Superstring Solution Code
1