Similar Problems

Similar Problems not available

Encode String With Shortest Length - Leetcode Solution

Companies:

LeetCode:  Encode String With Shortest Length Leetcode Solution

Difficulty: Hard

Topics: string dynamic-programming  

Problem Statement:

Given a non-empty string, encode the string such that its encoded length is the shortest. The encoding rule is: k[encoded_string], where the encoded_string inside the square brackets is being repeated exactly k times.

Note:

  • k will be a positive integer and encoded string will not be empty or have extra space.
  • You may assume that the input string contains only lowercase English letters.
  • If an encoding process does not make the string shorter, then do not encode it. If there are several solutions, return any of them.

Example 1:

Input: "aaa" Output: "aaa" Explanation: There is no way to encode it such that it is shorter than the input string, so we do not encode it.

Example 2:

Input: "aaaaa" Output: "5[a]" Explanation: "5[a]" is shorter than "aaaaa" by one character.

Example 3:

Input: "aaaaaaaaaa" Output: "10[a]" Explanation: "a9[a]" or "9[a]a" are also valid solutions, both of them have the same length = 11, which is the same as "10[a]".

Solution:

Approach: Dynamic Programming

To solve the problem, we need to look at the substring patterns that we can use to reduce the size of the overall string. By doing so, we can create a set of rules that we can follow to minimize the size of the encoded string.

For this problem, we will use dynamic programming which will help us encode each substring with the minimum possible length. We will represent the dynamic programming as a two-dimensional array dp where dp[i][j] represents the shortest possible encoded string that can be obtained by encoding the substring S[i..j]. We will start with minimum length substrings, and gradually increase the length to the full length of the input string. We will use the results of the intermediate substrings to reduce our calculations for the next substring. Finally, the answer will be stored in dp[0][n-1] (where n is the length of the input string).

Algorithm:

  1. Initialize a dp table to save the encoded string for each substring. dp[i][j] represents the shortest possible encoded string of S[i..j] In the beginning, all dp[i][j] = S[i..j], indicating we can only use the original substrings as the encoded string.
  2. We will start with substrings of length 1, the dp table diagonal. For each dp[i][i], the encoding is the character itself, so dp[i][i] should be just S[i].
  3. After initializing the dp table for substrings of length 1, we will gradually increase the length of the substrings. The substring length starts from 2 to n (where n is the length of the input string).
  4. In the outer loop, we set the length of the substring. For each length, we will then move the left endpoint(i) from 0 to n-len. The right endpoint(j) is then calculated as j = i + len - 1.
  5. Next, for each substring S[i..j], we will check if we can find repeating substrings that we can encode
  6. We will loop from i to j to check all the substrings of this substring. We can find a repeating substring if the length of the substring can divide the length of the current part of the string.
  7. At each loop, we will look at the minimum encoded result so far for the two parts of the string we are dividing. If the encoded length of the two parts combined is less than the current best encoded length, we update the best encoded length accordingly. dp[i][j] records the best encoded string for non-overlapping substrings S[i..j] with minimum length
  8. After all the iteration of length, left, and right end, dp[0][n-1] should be the right answer.

Pseudo code:

n = length of string S dp = array of strings of n*n for i = 0 to n-1: dp[i][i] = S[i] for len = 2 to n: for i = 0 to n-len: j = i + len - 1 dp[i][j] = S[i..j] for k = i to j: x = dp[i][k] y = dp[k+1][j] if len(x) + len(y) < len(dp[i][j]): dp[i][j] = len(x+y) + "[" + dp[i][k] + "]" return dp[0][n-1]

Time Complexity:

The outer loop for length is executed n times, the left endpoint moves from 0 to n-len, so it is executed (n-len+1) times, and for each endpoint loop, we move to all possible right endpoint(j = i + len - 1), giving a total of n*(n+1)/2 iterations for the left and right endpoint. The inner loop of checking for substrings has a time complexity of O(N), so the time complexity of the entire solution will be O(N^4) where N is the length of the input string.

Space Complexity:

We store the encoded result string with length at most n, so the space complexity will be O(N^2) where N is the length of the input string.

Code:

class Solution: def encode(self, s: str) -> str: n = len(s) dp = [["" for j in range(n)] for i in range(n)]

    for i in range(n):
        dp[i][i] = s[i]
    
    for length in range(2, n+1):
        for i in range(n-length+1):
            j = i + length - 1
            dp[i][j] = s[i:j+1]
            for k in range(i, j):
                if (len(dp[i][k]) + len(dp[k+1][j])) < len(dp[i][j]):
                    dp[i][j] = dp[i][k] + dp[k+1][j]
    
            sub = s[i:j+1]
            idx = (sub + sub).find(sub, 1)
            if idx < len(sub):
                pattern = str(len(sub) // idx) + "[" + dp[i][i+idx-1] + "]"
                if len(pattern) < len(dp[i][j]):
                    dp[i][j] = pattern

    return dp[0][n-1]

Encode String With Shortest Length Solution Code

1