Efficient Algorithm for Sorting Elements by Frequency Using Hashing (S4 Approach)

If 2 numbers have the same frequency, the elements of the array are printed at a decreasing frequency, and then the frequency at which the first one occurs.

Example:

Input : arr[] = {2, 5, 2, 8, 5, 6, 8, 8}
Output : arr[] = {8, 8, 8, 2, 2, 5, 5, 6}

Input : arr[] = {2, 5, 2, 6, -1, 9999999, 5, 8, 8, 8}
Output : arr[] = {8, 8, 8, 2, 2, 5, 5, 6, -1, 9999999}

We have discussed the different approaches in the following posts:

Sort elements by frequency|S1

Sort elements by frequency|S2

Sort array elements by frequency S3 (using STL)

All of the above methods are valid in O(n Log n) time, where n is the total number of elements. In this article, we discuss a new way to work in O(n + m Log m) time, where n is the total number of elements and m is the total number of different elements.

The method here is to use hashing

  1. We insert all the elements and their counts into the hash. This step takes O(n) time, where n is the number of elements.
  2. We copy the contents of the hash into an array (or vector) and sort them by count. This step takes O(m Log m) time, where m is the total number of different elements.
  3. To maintain the order of the elements at the same frequency, we use another hash with the keys of the array and the values of the indexes. If both elements have the same frequency, the elements are sorted according to the index.

The following diagram is a simulation of the above method:

We don’t need to declare another mapping, m2, because it doesn’t provide a proper expected outcome for the problem.

Instead, we just need to check the first value of the pair sent as an argument in the sortByVal function.

Here’s an implementation of the above method:

C ++

//     CPP program for the above approach
#include <bits/stdc++.h>
using namespace std;
 
// Used for sorting by frequency. And if frequency is same, // then by appearance
bool sortByVal( const pair< int , int >& a, const pair< int , int >& b)
{
 
    // If frequency is same then sort by index
    if (a.second == b.second) 
        return a.first < b.first;
 
    return a.second > b.second;
}
 
// function to sort elements by frequency
vector< int >sortByFreq( int a[], int n)
{
 
    vector< int >res;
 
    unordered_map< int , int > m;
 
    vector<pair< int , int > > v;
 
    for ( int i = 0; i < n; ++i) {
 
        // Map m is used to keep track of count 
        // of elements in array
        m[a[i]]++;     
    }
 
    // Copy map to vector
    copy(m.begin(), m.end(), back_inserter(v));
 
    // Sort the element of array by frequency
    sort(v.begin(), v.end(), sortByVal);
 
    for ( int i = 0; i < v.size(); ++i) 
       while (v[i].second--)
       {
               res.push_back(v[i].first);
       }
 
    return res;
}
 
// Driver program
int main()
{
 
    int a[] = { 2, 5, 2, 6, -1, 9999999, 5, 8, 8, 8 };
    int n = sizeof (a) / sizeof (a[0]);
    vector< int >res;
    res = sortByFreq(a, n);
 
    for ( int i = 0;i < res.size(); i++)
          cout<<res[i]<< " " ;
 
    return 0;
 
}

The output is as follows:

8 8 8 2 2 5 5 6 -1 9999999