all about planning first. My initial approach was very complicated. But if we use a Counter to count the words in the given words list and when we slice from [i:i+total_length] and check if that sliced substring has the same words as the given words list, we add i to our answer list.
VERY IMPORTATNT
1) counter can be equated with defaultdict object. So like
if counter == defaultdict()
2) slicing s[i:i+total_length]. I kept doing i+total_length+1 as the end index cuz i knew the end index is exclusive but that +1 is wrong. For example, if [0:6] the substring's length is 0,1,2,3,4,5 which is 6 because it is 0-indexed.
3) when iterating
for i in range(start,final,step)
I need to add the step paramter here but I didnt include start parameter. That is wrong because python doesnt know the step parameter if there are only 2 parameters.
from collections import defaultdict
class Solution:
def findSubstring(self, s: str, words: List[str]) -> List[int]:
if not s or not words or len(s) == 0 or len(words) == 0:
return []
length = len(words[0])
total_length = length * len(words)
ans = []
freq = Counter(words)
def isValid(index, substring):
dic = defaultdict(int)
for i in range(0,len(substring), length):
subsubstring = substring[i:i + length ]
dic[subsubstring] += 1
return dic == freq
for i in range(len(s) - total_length + 1):
substring = s[i:i+total_length]
if isValid(i, substring):
ans.append(i)
return ans
The provided code aims to find all starting indices in the string s where a concatenation of all the words in words appears exactly once and without any intervening characters. The time and space complexity analysis of the algorithm is as follows:
Initialization and Edge Case Check:
s or words is empty and calculating the lengths are ( O(1) ).Word Frequency Count:
Counter object freq from words takes ( O(m) ), where ( m ) is the number of words.Main Loop:
s where the substring of total_length could be valid.s.Checking Validity of Substring:
isValid function is called.isValid, the substring is divided into len(words) chunks, each of length length, and their frequency is compared against freq.defaultdict and comparing it to the Counter.Since total_length is ( length \times len(words) ) and isValid is called ( n - total_length + 1 ) times, the overall time complexity is:
[ O(n \cdot (m \cdot length)) = O(n \cdot m \cdot length) ]
where:
s.words.length is the length of each word (assumed to be constant).Frequency Counter:
freq uses ( O(m) ) space.Result List:
ans could store up to ( O(n) ) starting indices in the worst case.Dictionary in isValid:
dic dictionary used inside isValid can have up to ( m ) entries in the worst case, using ( O(m) ) space.Therefore, the overall space complexity is:
[ O(m + n) ]
where:
words.s (for storing the result indices).This analysis shows that the algorithm scales linearly with the size of the input string s and the number of words in words, making it efficient for reasonably sized inputs.