[LeetCode / C++] 49. Group Anagrams

Semidragon·2024년 10월 9일
0

CodingTest

목록 보기
78/80

Leetcode Question Link

1. Question

Given an array of strings strs, group the anagrams together. You can return the answer in any order.

Example 1:

Input: strs = ["eat","tea","tan","ate","nat","bat"]

Output: [["bat"],["nat","tan"],["ate","eat","tea"]]

Explanation:

There is no string in strs that can be rearranged to form "bat".
The strings "nat" and "tan" are anagrams as they can be rearranged to form each other.
The strings "ate", "eat", and "tea" are anagrams as they can be rearranged to form each other.

Example 2:

Input: strs = [""]

Output: [[""]]

Example 3:

Input: strs = ["a"]

Output: [["a"]]

2. Thoughts

// @Problem Description
// group anagrams

// @Algorithm
// Use Hashmap
// Sort the string in alphabet order
// Use the sorted string as key // use the unsorted as value
// append the values to the result

// @Complexity
// N = # strs // M = length of each str 
// sort string = O(n * mlogm)
// hashing = O(N)
// Thus O(N MlogM) => time
//space => hashmap O(N*M)

3. Tips learned

3.1 Hashmap in C++ (unordered_map)

In C++, you can use a hash map by using the unordered_map container from the Standard Template Library (STL). unordered_map provides an efficient way to store and access key-value pairs using a hash table. Here’s a step-by-step guide on how to use it:

Step 1: Include the Required Header

To use unordered_map, include the following header:

#include <unordered_map>

Step 2: Create an unordered_map

To declare an unordered_map, use the following syntax:

std::unordered_map<KeyType, ValueType> mapName;
  • KeyType: The data type of the keys (e.g., int, std::string).
  • ValueType: The data type of the values.

4. My solution

using namespace std;
#include <iostream>
#include <vector>
#include <unordered_map>

class Solution
{
public:
    vector<vector<string>> groupAnagrams(vector<string> &strs)
    {
        vector<vector<string>> result;
        unordered_map<string, vector<string>> hashmap;
        for (auto &str : strs)
        {
            string sortedstr = str;
            sort(sortedstr.begin(), sortedstr.end());
            hashmap[sortedstr].push_back(str);
        }
        for (auto &entry : hashmap)
        {
            result.push_back(entry.second);
        }
        return result;
    }
};
126/126 cases passed (21 ms)
Your runtime beats 89.07 % of cpp submissions
Your memory usage beats 92.06 % of cpp submissions (23.7 MB)

5. AI Solution and Improvements

Your solution is quite good and implements the problem requirements efficiently. The use of a hash map (unordered_map) to group the anagrams is optimal for this task. However, there are a few optimizations and alternative approaches you can consider to make your solution slightly better in terms of both time complexity (constant factors) and performance for certain input scenarios.

Potential Improvements:

  1. Use Character Frequency as Key Instead of Sorting:
    • Currently, you're using std::sort() on each string, which has a time complexity of O(M * log(M)), where M is the length of the string. We can avoid this sorting operation by using the frequency of characters in each string to form a unique key.
    • This can be done by counting each character's occurrences and creating a fixed-size vector or string to represent that count, which has O(M) time complexity. This could be faster compared to sorting when M is large.

Here’s the modified version of your code using a frequency array as a key:

#include <iostream>
#include <vector>
#include <unordered_map>
using namespace std;

class Solution
{
public:
    vector<vector<string>> groupAnagrams(vector<string> &strs)
    {
        unordered_map<string, vector<string>> hashmap;

        for (auto &str : strs)
        {
            // Create a frequency count key for each string
            vector<int> count(26, 0);
            for (char c : str)
            {
                count[c - 'a']++;
            }
            
            // Convert frequency vector into a unique string key
            string key;
            for (int i = 0; i < 26; ++i)
            {
                key += '#' + to_string(count[i]);
            }

            hashmap[key].push_back(str);
        }

        // Prepare the result
        vector<vector<string>> result;
        for (auto &entry : hashmap)
        {
            result.push_back(entry.second);
        }

        return result;
    }
};

Explanation:

  1. Frequency Count as Key:

    • Instead of sorting each string, a frequency count vector is used to determine how many times each character appears.
    • This is represented as a vector of size 26 (vector<int> count(26, 0)), assuming we only have lowercase English letters ('a' to 'z').
    • This method has O(M) time complexity for counting characters in each string, which is better than sorting when M is large.
  2. Creating a Key:

    • We convert the frequency count vector into a string by concatenating the counts. We use a delimiter like # to ensure unique keys.
    • For example, the frequency array {1, 0, 0, ..., 2} becomes "#1#0#0#...#2".
  3. Inserting into Hash Map:

    • The key for each string is now based on character frequency rather than sorting, which reduces the constant factor associated with the key creation step.

Complexity Analysis:

  • Time Complexity:

    • The new time complexity is O(N * M), where N is the number of strings and M is the maximum length of the strings.
    • This avoids the O(M * log(M)) cost of sorting, which can make it faster especially when M is large.
  • Space Complexity:

    • The space complexity remains O(N * M) due to storing all strings and using the hash map, but we are also using O(1) extra space for the frequency array (of size 26).

Advantages:

  • Faster Key Generation: Generating a frequency-based key is faster than sorting, especially for longer strings.
  • Improved Efficiency: This is particularly beneficial when M (the length of the strings) is relatively large compared to N (the number of strings).

This new approach will provide significant performance improvement, especially for cases where the strings are long, making it more efficient compared to sorting-based solutions.

profile
Semidragon's network [CS undergrad @ Sungkyunkwan University | Networks + System @ CSI]

0개의 댓글