[Leetcode] 30. Substring with Concatenation of All Words

whitehousechef·2024년 8월 3일

https://leetcode.com/problems/substring-with-concatenation-of-all-words/description/?envType=study-plan-v2&envId=top-interview-150

initial

all about planning first. My initial approach was very complicated. But if we use a Counter to count the words in the given words list and when we slice from [i:i+total_length] and check if that sliced substring has the same words as the given words list, we add i to our answer list.

VERY IMPORTATNT
1) counter can be equated with defaultdict object. So like

if counter == defaultdict()

2) slicing s[i:i+total_length]. I kept doing i+total_length+1 as the end index cuz i knew the end index is exclusive but that +1 is wrong. For example, if [0:6] the substring's length is 0,1,2,3,4,5 which is 6 because it is 0-indexed.

3) when iterating

for i in range(start,final,step)

I need to add the step paramter here but I didnt include start parameter. That is wrong because python doesnt know the step parameter if there are only 2 parameters.

solution

from collections import defaultdict
class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        if not s or not words or len(s) == 0 or len(words) == 0:
            return []
        length = len(words[0])
        total_length = length * len(words)
        ans = []
        freq = Counter(words)

        def isValid(index, substring):
            dic = defaultdict(int)
            for i in range(0,len(substring), length):
                subsubstring = substring[i:i + length ]
                dic[subsubstring] += 1
            return dic == freq

        for i in range(len(s) - total_length + 1):
            substring = s[i:i+total_length]
            if isValid(i, substring):
                ans.append(i)
        return ans

complexity

The provided code aims to find all starting indices in the string s where a concatenation of all the words in words appears exactly once and without any intervening characters. The time and space complexity analysis of the algorithm is as follows:

Time Complexity

Initialization and Edge Case Check:
- Checking if s or words is empty and calculating the lengths are ( O(1) ).
Word Frequency Count:
- Creating the Counter object freq from words takes ( O(m) ), where ( m ) is the number of words.
Main Loop:
- The main loop iterates over all possible starting indices in the string s where the substring of total_length could be valid.
- There are ( n - total_length + 1 ) such indices, where ( n ) is the length of s.
Checking Validity of Substring:
- For each starting index, the isValid function is called.
- Inside isValid, the substring is divided into len(words) chunks, each of length length, and their frequency is compared against freq.
- This takes ( O(total_length) ) time because it involves creating a defaultdict and comparing it to the Counter.

Since total_length is ( length \times len(words) ) and isValid is called ( n - total_length + 1 ) times, the overall time complexity is:

[ O(n \cdot (m \cdot length)) = O(n \cdot m \cdot length) ]

where:

( n ) is the length of s.
( m ) is the number of words in words.
length is the length of each word (assumed to be constant).

Space Complexity

Frequency Counter:
- Storing the word frequencies in freq uses ( O(m) ) space.
Result List:
- The list ans could store up to ( O(n) ) starting indices in the worst case.
Dictionary in isValid:
- The dic dictionary used inside isValid can have up to ( m ) entries in the worst case, using ( O(m) ) space.

Therefore, the overall space complexity is:

[ O(m + n) ]

where:

( m ) is the number of words in words.
( n ) is the length of the string s (for storing the result indices).

Summary

Time Complexity: ( O(n \cdot m \cdot length) )
Space Complexity: ( O(m + n) )

This analysis shows that the algorithm scales linearly with the size of the input string s and the number of words in words, making it efficient for reasonably sized inputs.

whitehousechef

이전 포스트

[Leetcode] 637. Average of Levels in Binary Tree

다음 포스트