https://leetcode.com/problems/number-of-matching-subsequences/description/
we can use 2 pointer approach but causes tle but correct
class Solution:
def numMatchingSubseq(self, s: str, words: List[str]) -> int:
ans=0
def subsequence(word,s):
idx=0
word_idx=0
while word_idx<len(word) and idx<len(s):
if s[idx]!=word[word_idx]:
idx+=1
else:
idx+=1
word_idx+=1
return True if word_idx==len(word) else False
for word in words:
if subsequence(word,s):
ans+=1
return ans
so ive never seen this approach before. We can use dict where
key - character that this word is waiting for
value - list of words that need this current character to move on
for example
word_dict = {
'a': ["abc", "abd"],
'b': ["bcd"]
}
"abc" and "abd" need 'a' next.
"bcd" needs 'b' next.
When checking if words are subsequences of S, instead of checking each word repeatedly by scanning S, we:
Group words by the next character they need to match in S.
Then as we scan through S character by character, we only advance the words waiting for that character.
V IMPT, for each char in string, we need to reset the value of that key, cuz the words that needed this current character are processed. Since they dont need this current character anymore, they move on to the next bucket.
from collections import defaultdict
class Solution:
def numMatchingSubseq(self, s: str, words: List[str]) -> int:
ans=0
dic = defaultdict(list)
for word in words:
dic[word[0]].append(word)
for c in s:
nextWords = dic[c]
dic[c]=[]
for word in nextWords:
if len(word)==1:
ans+=1
else:
dic[word[1]].append(word[1:])
return ans
two pointer is n*m time,
but bucket is n+l
l cuz we process each alphabet in words once with this bucket logic
for space both is n+l, cuz
input is already n+l
bucket (dict) stores total number of suffixes stored across all buckets = total number of characters in all words.
so just al chracters of the words are stored in the dic