Check If an Array is a Subset of Another Array

Check If an Array is a Subset of Another Array

Which technique works better? HashMaps or Sorting+Two-pointer

Today, I solved a basic array problem, but it gave me valuable insights into how my thought process works when approaching a problem. Simple questions like these help in analyzing problem-solving patterns. Thanks to my DSA interviewer, I was able to evaluate whether I was on the right track.

Problem Statement:
Given two arrays, a[] and b[], determine whether b[] is a subset of a[].

Expected Complexity:

  • Time Complexity: O(m+n)

  • Space Complexity: O(n)

The image below shows an example provided by GeeksForGeeks:

My Thought Process in coming up with an approach:

  1. What is a subset? When all elements of b[] are present in a[], in any order.

  2. Considering duplicates. Since both a[] and b[] can contain duplicates, we need to track the frequency of each element in a[].

  3. Data structure: A HashMap is ideal for storing elements and their counts efficiently.

  4. Implementation Steps:

    1. Traverse a[] and store each element along with its count in a HashMap.

    2. Traverse b[] and check if each element exists in the HashMap:

      • If present, decrease its count by 1.

      • If an element is missing or its count drops below zero, b[] is not a subset.

  5. Considering Edge cases. Before processing, a quick check:

    • If b[] has more elements than a[], it cannot be a subset.

    • In this case, we return false immediately, saving computation time.

  6. Outcome. This approach successfully passed all test cases.

There is an alternative method that can also be used to solve this problem: Sorting & Two-pointer technique

Below is a comparison table explaining when to use which method based on different scenarios.

Sorting + Two-pointerHashMap
Time complexity: O(nlogn + mlogm), where m, and n are lengths of two arrays. Although sorting modifies the original array, space complexity is O(1).Time complexity (in this question): O(n + m), and space complexity is O(n), which may create concerns in very large datasets.
If you need to check multiple subsets against the same array, sorting it once and using two-pointers is better.Using a hashmap may result in rebuilding the hashmap every time, which may be inefficient in repeated queries.
Sorting avoids using any memory by working directly on the arrays.If a[] and b[] contain large numbers (e.g., values up to 10^9), storing them in a HashMap can lead to higher memory consumption.

Code of the problem using HashMap:

class Solution {
    public boolean isSubset(int a[], int b[]) {
        // edge case optimization
        if (b.length > a.length) {
            return false;
        }
        // create a hashmap, and loop through first array and store elements and 
        // their occurances
        HashMap<Integer, Integer> map = new HashMap<>();
        for(int i = 0; i < a.length; i++) {
            map.put(a[i], map.getOrDefault(a[i], 0) + 1);
        }

        // loop through b[] to check for each element if it's present in map or not
        for(int i = 0; i < b.length; i++) {
            if (map.containsKey(b[i])) {
                if (map.get(b[i]) == 0) {
                     return false;
                }
                else {
                    map.put(b[i], map.get(b[i]) - 1);
                }
            }
            else {
                return false;
            }
        }
        return true;
    }
}

Code of the problem using Sorting and Two-pointer:

class Solution {
    public boolean isSubset(int a[], int b[]) {
        if (b.length > a.length) {
            return false;
        }
        // sort both arrays
        Arrays.sort(a);
        Arrays.sort(b);

        int p1 = 0, p2 = 0;
        while (p1 < a.length && p2 < b.length) {
            // if values at both indexes are equal, move forward
            if (a[p1] == b[p2]) {
                p1++; p2++;
            }
            // if the value at index p1 is less than p2, then bigger value might lie
            // further in a[], so move forward
            else if (a[p1] < b[p2]) {
                p1++;
            }
            // if value at index p2 is greater than value at index p1, means here is 
            // violation, so we can directly return false
            else if (a[p1] > b[p2]) {
                return false;
            }
        }
        return true;
    }
}