Sort int array by frequency then value
up vote
4
down vote
favorite
Problem
Take an array of ints as input and run a special sorting algorithm on it. This algorithm must first sort by int frequency, less frequent first, then sort by value.
For example [4,5,6,5,4,3] is sorted to [3,6,4,4,5,5].
I'm looking for feedback on my use of data structures to implement this sort, memory management, any redundancies, and other general optimizations I can make. No need to give feedback on how I capture input in main since that is not my priority here.
Code
#include <map>
#include <set>
#include <list>
#include <cmath>
#include <ctime>
#include <deque>
#include <queue>
#include <stack>
#include <string>
#include <bitset>
#include <cstdio>
#include <limits>
#include <vector>
#include <climits>
#include <cstring>
#include <cstdlib>
#include <fstream>
#include <numeric>
#include <sstream>
#include <iostream>
#include <algorithm>
#include <unordered_map>
using namespace std;
int * customSort(int * arr, int n) {
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
map<int, set<int>> orderedfrequencyMap;
for (auto &p : frequencyMap) {
auto mapIt = orderedfrequencyMap.find(p.second);
if (mapIt == orderedfrequencyMap.end()) {
set<int> s = {p.first};
orderedfrequencyMap.insert(make_pair(p.second, s));
} else {
(*mapIt).second.insert(p.first);
}
}
int arrI = 0;
int solution[n];
for (auto &p : orderedfrequencyMap) {
for (auto element : p.second) {
for (int i = 0; i < p.first; i++) {
arr[arrI] = element;
cout << element << endl;
arrI++;
}
}
}
return solution;
}
int main() {
int n;
cin >> n;
int arr[n];
int element;
for (int i = 0; i < n; i++) {
cin >> element;
arr[i] = element;
}
customSort(arr, n);
return 0;
}
c++ sorting hash-map set
add a comment |
up vote
4
down vote
favorite
Problem
Take an array of ints as input and run a special sorting algorithm on it. This algorithm must first sort by int frequency, less frequent first, then sort by value.
For example [4,5,6,5,4,3] is sorted to [3,6,4,4,5,5].
I'm looking for feedback on my use of data structures to implement this sort, memory management, any redundancies, and other general optimizations I can make. No need to give feedback on how I capture input in main since that is not my priority here.
Code
#include <map>
#include <set>
#include <list>
#include <cmath>
#include <ctime>
#include <deque>
#include <queue>
#include <stack>
#include <string>
#include <bitset>
#include <cstdio>
#include <limits>
#include <vector>
#include <climits>
#include <cstring>
#include <cstdlib>
#include <fstream>
#include <numeric>
#include <sstream>
#include <iostream>
#include <algorithm>
#include <unordered_map>
using namespace std;
int * customSort(int * arr, int n) {
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
map<int, set<int>> orderedfrequencyMap;
for (auto &p : frequencyMap) {
auto mapIt = orderedfrequencyMap.find(p.second);
if (mapIt == orderedfrequencyMap.end()) {
set<int> s = {p.first};
orderedfrequencyMap.insert(make_pair(p.second, s));
} else {
(*mapIt).second.insert(p.first);
}
}
int arrI = 0;
int solution[n];
for (auto &p : orderedfrequencyMap) {
for (auto element : p.second) {
for (int i = 0; i < p.first; i++) {
arr[arrI] = element;
cout << element << endl;
arrI++;
}
}
}
return solution;
}
int main() {
int n;
cin >> n;
int arr[n];
int element;
for (int i = 0; i < n; i++) {
cin >> element;
arr[i] = element;
}
customSort(arr, n);
return 0;
}
c++ sorting hash-map set
7
Please don't remove your header includes - not only does it slow down those of us who want to compile your code ourselves, but we frequently find improvements to make in that respect (missing includes, unnecessary includes, C includes, for example). Same goes fornamespace std: we want to see your actual code, not some mangled version of it. Thanks!
– Toby Speight
3 hours ago
5
Don't remove your includes. As part of the code review I want to copy and paste it into a file and compile it. Having to add the includes back adds to my work load. Also if you leave the includes I can spot where you have bad includes.
– Martin York
3 hours ago
1
Use your real code.
– Nic Hartley
54 mins ago
Thanks for the feedback everyone, I will remember this in the future. Added my includes and removed the namesapce comment, as my real code did originally use the bad practice global using statement so myself some time in implementing this prototype.
– greg
18 mins ago
add a comment |
up vote
4
down vote
favorite
up vote
4
down vote
favorite
Problem
Take an array of ints as input and run a special sorting algorithm on it. This algorithm must first sort by int frequency, less frequent first, then sort by value.
For example [4,5,6,5,4,3] is sorted to [3,6,4,4,5,5].
I'm looking for feedback on my use of data structures to implement this sort, memory management, any redundancies, and other general optimizations I can make. No need to give feedback on how I capture input in main since that is not my priority here.
Code
#include <map>
#include <set>
#include <list>
#include <cmath>
#include <ctime>
#include <deque>
#include <queue>
#include <stack>
#include <string>
#include <bitset>
#include <cstdio>
#include <limits>
#include <vector>
#include <climits>
#include <cstring>
#include <cstdlib>
#include <fstream>
#include <numeric>
#include <sstream>
#include <iostream>
#include <algorithm>
#include <unordered_map>
using namespace std;
int * customSort(int * arr, int n) {
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
map<int, set<int>> orderedfrequencyMap;
for (auto &p : frequencyMap) {
auto mapIt = orderedfrequencyMap.find(p.second);
if (mapIt == orderedfrequencyMap.end()) {
set<int> s = {p.first};
orderedfrequencyMap.insert(make_pair(p.second, s));
} else {
(*mapIt).second.insert(p.first);
}
}
int arrI = 0;
int solution[n];
for (auto &p : orderedfrequencyMap) {
for (auto element : p.second) {
for (int i = 0; i < p.first; i++) {
arr[arrI] = element;
cout << element << endl;
arrI++;
}
}
}
return solution;
}
int main() {
int n;
cin >> n;
int arr[n];
int element;
for (int i = 0; i < n; i++) {
cin >> element;
arr[i] = element;
}
customSort(arr, n);
return 0;
}
c++ sorting hash-map set
Problem
Take an array of ints as input and run a special sorting algorithm on it. This algorithm must first sort by int frequency, less frequent first, then sort by value.
For example [4,5,6,5,4,3] is sorted to [3,6,4,4,5,5].
I'm looking for feedback on my use of data structures to implement this sort, memory management, any redundancies, and other general optimizations I can make. No need to give feedback on how I capture input in main since that is not my priority here.
Code
#include <map>
#include <set>
#include <list>
#include <cmath>
#include <ctime>
#include <deque>
#include <queue>
#include <stack>
#include <string>
#include <bitset>
#include <cstdio>
#include <limits>
#include <vector>
#include <climits>
#include <cstring>
#include <cstdlib>
#include <fstream>
#include <numeric>
#include <sstream>
#include <iostream>
#include <algorithm>
#include <unordered_map>
using namespace std;
int * customSort(int * arr, int n) {
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
map<int, set<int>> orderedfrequencyMap;
for (auto &p : frequencyMap) {
auto mapIt = orderedfrequencyMap.find(p.second);
if (mapIt == orderedfrequencyMap.end()) {
set<int> s = {p.first};
orderedfrequencyMap.insert(make_pair(p.second, s));
} else {
(*mapIt).second.insert(p.first);
}
}
int arrI = 0;
int solution[n];
for (auto &p : orderedfrequencyMap) {
for (auto element : p.second) {
for (int i = 0; i < p.first; i++) {
arr[arrI] = element;
cout << element << endl;
arrI++;
}
}
}
return solution;
}
int main() {
int n;
cin >> n;
int arr[n];
int element;
for (int i = 0; i < n; i++) {
cin >> element;
arr[i] = element;
}
customSort(arr, n);
return 0;
}
c++ sorting hash-map set
c++ sorting hash-map set
edited 21 mins ago
asked 3 hours ago
greg
27517
27517
7
Please don't remove your header includes - not only does it slow down those of us who want to compile your code ourselves, but we frequently find improvements to make in that respect (missing includes, unnecessary includes, C includes, for example). Same goes fornamespace std: we want to see your actual code, not some mangled version of it. Thanks!
– Toby Speight
3 hours ago
5
Don't remove your includes. As part of the code review I want to copy and paste it into a file and compile it. Having to add the includes back adds to my work load. Also if you leave the includes I can spot where you have bad includes.
– Martin York
3 hours ago
1
Use your real code.
– Nic Hartley
54 mins ago
Thanks for the feedback everyone, I will remember this in the future. Added my includes and removed the namesapce comment, as my real code did originally use the bad practice global using statement so myself some time in implementing this prototype.
– greg
18 mins ago
add a comment |
7
Please don't remove your header includes - not only does it slow down those of us who want to compile your code ourselves, but we frequently find improvements to make in that respect (missing includes, unnecessary includes, C includes, for example). Same goes fornamespace std: we want to see your actual code, not some mangled version of it. Thanks!
– Toby Speight
3 hours ago
5
Don't remove your includes. As part of the code review I want to copy and paste it into a file and compile it. Having to add the includes back adds to my work load. Also if you leave the includes I can spot where you have bad includes.
– Martin York
3 hours ago
1
Use your real code.
– Nic Hartley
54 mins ago
Thanks for the feedback everyone, I will remember this in the future. Added my includes and removed the namesapce comment, as my real code did originally use the bad practice global using statement so myself some time in implementing this prototype.
– greg
18 mins ago
7
7
Please don't remove your header includes - not only does it slow down those of us who want to compile your code ourselves, but we frequently find improvements to make in that respect (missing includes, unnecessary includes, C includes, for example). Same goes for
namespace std: we want to see your actual code, not some mangled version of it. Thanks!– Toby Speight
3 hours ago
Please don't remove your header includes - not only does it slow down those of us who want to compile your code ourselves, but we frequently find improvements to make in that respect (missing includes, unnecessary includes, C includes, for example). Same goes for
namespace std: we want to see your actual code, not some mangled version of it. Thanks!– Toby Speight
3 hours ago
5
5
Don't remove your includes. As part of the code review I want to copy and paste it into a file and compile it. Having to add the includes back adds to my work load. Also if you leave the includes I can spot where you have bad includes.
– Martin York
3 hours ago
Don't remove your includes. As part of the code review I want to copy and paste it into a file and compile it. Having to add the includes back adds to my work load. Also if you leave the includes I can spot where you have bad includes.
– Martin York
3 hours ago
1
1
Use your real code.
– Nic Hartley
54 mins ago
Use your real code.
– Nic Hartley
54 mins ago
Thanks for the feedback everyone, I will remember this in the future. Added my includes and removed the namesapce comment, as my real code did originally use the bad practice global using statement so myself some time in implementing this prototype.
– greg
18 mins ago
Thanks for the feedback everyone, I will remember this in the future. Added my includes and removed the namesapce comment, as my real code did originally use the bad practice global using statement so myself some time in implementing this prototype.
– greg
18 mins ago
add a comment |
1 Answer
1
active
oldest
votes
up vote
10
down vote
accepted
Maps
If you access an element in a map and it does not exist then it is automatically created and "zero-initialized". This means you don't need to search and conditialy update.
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
Can be simplified too:
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
++frequencyMap[arr[i]];
}
VLA
Variable length arrays are an extension of C (part of C99). They were never part of the C++ standard. So this is not legal C++.
cin >> n;
int arr[n];
Your compiler supports it as an extension to the language but its not a good idea to use it. We use std::vector as a better alternative:
cin >> n;
std::vector<int> arr(n);
This also allows you to use range based for loops:
unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
Order by frequency.
I like your idea of ordering by frequency using a std::map but it becomes clunky when you also have to order by value (and thus have a map of std::set). I would simply copy the frequency info into a vector and use a sort.
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs)
{
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
Standard Issues from Beginners
Don't use
using namespace std;
This is a crutch that will break for you. Don't get into the habit of using it when it breaks it is not obvious and can lead to hours of debugging. The reason the standard namespace is called std and not standard is so prefixing object in the standard namespace is not that burdensome.
std::vector<int> val; // not that hard to add std::
Don't return an array from a function
return solution;
You are returning the address of a local variable. Once this goes out of scope that variable no longer exists. So you have an invalid pointer returned. This is also another good reason to use std::vector as the copy/move semantics will correctly move it from the current scope to the calling scope.
PS. Your compiler should warn you about this.
No need to use return in main
return 0;
The compiler plants a return 0 at the end of main.
By not planting a return in main() it is an indication that your application does not have a failing state (usual for simple code). If it does have a user inserted return 0 I always start looking for other places it could return an error code. By not putting a return 0 you indicate to me that there is no error situation and I don't have to look for it.
Prefer pre increment to post increment
for (int i = 0; i < p.first; i++)
for (int i = 0; i < p.first; ++i)
In this case there is basically no difference. But there are situations where the post increment is slightly more complex.
What happen if we change i from an int into an iterator. Then it is a class the executes a function. Most iteratros behave like POD types with pre/post increment but to do that the post increment is slightly more complex. So why not use the one that will always give you optimal code (even if you change the type). So prefer pre-increment.
Prefer 'n' over std::endl
The difference is that std::endl will force a flush. In simple things like this it makes no difference. But if you are outputting a large amount of information then the extra flush can cause significant performance degradation.
The streams will auto flush themselves and having a human forcing a flush is always non optimal. So prefer to get the same resutls but at optimal speed with 'n'.
Pointer short hand
You should note that we have a shortcut pointer notation.
(*mapIt).second
// Shorthand as:
mapIt->second
Avoid short variable names
for (int i = 0; i < n; i++) {
This is great if your for loop is two lines long. But if the loop grows and becomes more complex then searching for all instances of i becomes a pain because of all the false positives.
Believe me just because your loop only covers two lines today, in ten years after thousands of bug fixes it will not look as neat. So give me something that is easy to search for in the code base.
I know a lot of people like using i as a loop variable. But this is a hold over from Foortran. Please use longer identifiers it will not cost you that much and the maintainer will not hunt you down with an axe.
Looks Like this:
#include <unordered_map>
#include <vector>
#include <iostream>
#include <iterator>
void customSort(std::vector<int> const& arr)
{
std::unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs) {
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
for(auto const& val: values) {
for(int loop = 0;loop < val.second; ++loop) {
std::cout << val.first << "n";
}
}
}
int main()
{
std::vector<int> arr;
int n;
std::cin >> n;
std::copy_n(std::istream_iterator<int>(std::cin), n,
std::back_inserter(arr));
customSort(arr);
}
Excellent response, thanks for breaking it down in the way you have. Very easy to understand how I can refactor now. In the Order by frequency section what issinginsing Val = std::pair<int, int>;?. Also in the Prefer pre increment to post increment you mention most iterators behave like POD types, what are POD types?
– greg
2 hours ago
1
Missed cut and paste:using Val = std::pair<int, int>;This is the modern C++ way of doing typedef.
– Martin York
2 hours ago
1
POD: Plain old data. Integers/char etc. When you increment an integer. It is post increment it increments the value but returns the original value to be used in the expression.
– Martin York
1 hour ago
add a comment |
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
10
down vote
accepted
Maps
If you access an element in a map and it does not exist then it is automatically created and "zero-initialized". This means you don't need to search and conditialy update.
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
Can be simplified too:
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
++frequencyMap[arr[i]];
}
VLA
Variable length arrays are an extension of C (part of C99). They were never part of the C++ standard. So this is not legal C++.
cin >> n;
int arr[n];
Your compiler supports it as an extension to the language but its not a good idea to use it. We use std::vector as a better alternative:
cin >> n;
std::vector<int> arr(n);
This also allows you to use range based for loops:
unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
Order by frequency.
I like your idea of ordering by frequency using a std::map but it becomes clunky when you also have to order by value (and thus have a map of std::set). I would simply copy the frequency info into a vector and use a sort.
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs)
{
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
Standard Issues from Beginners
Don't use
using namespace std;
This is a crutch that will break for you. Don't get into the habit of using it when it breaks it is not obvious and can lead to hours of debugging. The reason the standard namespace is called std and not standard is so prefixing object in the standard namespace is not that burdensome.
std::vector<int> val; // not that hard to add std::
Don't return an array from a function
return solution;
You are returning the address of a local variable. Once this goes out of scope that variable no longer exists. So you have an invalid pointer returned. This is also another good reason to use std::vector as the copy/move semantics will correctly move it from the current scope to the calling scope.
PS. Your compiler should warn you about this.
No need to use return in main
return 0;
The compiler plants a return 0 at the end of main.
By not planting a return in main() it is an indication that your application does not have a failing state (usual for simple code). If it does have a user inserted return 0 I always start looking for other places it could return an error code. By not putting a return 0 you indicate to me that there is no error situation and I don't have to look for it.
Prefer pre increment to post increment
for (int i = 0; i < p.first; i++)
for (int i = 0; i < p.first; ++i)
In this case there is basically no difference. But there are situations where the post increment is slightly more complex.
What happen if we change i from an int into an iterator. Then it is a class the executes a function. Most iteratros behave like POD types with pre/post increment but to do that the post increment is slightly more complex. So why not use the one that will always give you optimal code (even if you change the type). So prefer pre-increment.
Prefer 'n' over std::endl
The difference is that std::endl will force a flush. In simple things like this it makes no difference. But if you are outputting a large amount of information then the extra flush can cause significant performance degradation.
The streams will auto flush themselves and having a human forcing a flush is always non optimal. So prefer to get the same resutls but at optimal speed with 'n'.
Pointer short hand
You should note that we have a shortcut pointer notation.
(*mapIt).second
// Shorthand as:
mapIt->second
Avoid short variable names
for (int i = 0; i < n; i++) {
This is great if your for loop is two lines long. But if the loop grows and becomes more complex then searching for all instances of i becomes a pain because of all the false positives.
Believe me just because your loop only covers two lines today, in ten years after thousands of bug fixes it will not look as neat. So give me something that is easy to search for in the code base.
I know a lot of people like using i as a loop variable. But this is a hold over from Foortran. Please use longer identifiers it will not cost you that much and the maintainer will not hunt you down with an axe.
Looks Like this:
#include <unordered_map>
#include <vector>
#include <iostream>
#include <iterator>
void customSort(std::vector<int> const& arr)
{
std::unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs) {
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
for(auto const& val: values) {
for(int loop = 0;loop < val.second; ++loop) {
std::cout << val.first << "n";
}
}
}
int main()
{
std::vector<int> arr;
int n;
std::cin >> n;
std::copy_n(std::istream_iterator<int>(std::cin), n,
std::back_inserter(arr));
customSort(arr);
}
Excellent response, thanks for breaking it down in the way you have. Very easy to understand how I can refactor now. In the Order by frequency section what issinginsing Val = std::pair<int, int>;?. Also in the Prefer pre increment to post increment you mention most iterators behave like POD types, what are POD types?
– greg
2 hours ago
1
Missed cut and paste:using Val = std::pair<int, int>;This is the modern C++ way of doing typedef.
– Martin York
2 hours ago
1
POD: Plain old data. Integers/char etc. When you increment an integer. It is post increment it increments the value but returns the original value to be used in the expression.
– Martin York
1 hour ago
add a comment |
up vote
10
down vote
accepted
Maps
If you access an element in a map and it does not exist then it is automatically created and "zero-initialized". This means you don't need to search and conditialy update.
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
Can be simplified too:
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
++frequencyMap[arr[i]];
}
VLA
Variable length arrays are an extension of C (part of C99). They were never part of the C++ standard. So this is not legal C++.
cin >> n;
int arr[n];
Your compiler supports it as an extension to the language but its not a good idea to use it. We use std::vector as a better alternative:
cin >> n;
std::vector<int> arr(n);
This also allows you to use range based for loops:
unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
Order by frequency.
I like your idea of ordering by frequency using a std::map but it becomes clunky when you also have to order by value (and thus have a map of std::set). I would simply copy the frequency info into a vector and use a sort.
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs)
{
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
Standard Issues from Beginners
Don't use
using namespace std;
This is a crutch that will break for you. Don't get into the habit of using it when it breaks it is not obvious and can lead to hours of debugging. The reason the standard namespace is called std and not standard is so prefixing object in the standard namespace is not that burdensome.
std::vector<int> val; // not that hard to add std::
Don't return an array from a function
return solution;
You are returning the address of a local variable. Once this goes out of scope that variable no longer exists. So you have an invalid pointer returned. This is also another good reason to use std::vector as the copy/move semantics will correctly move it from the current scope to the calling scope.
PS. Your compiler should warn you about this.
No need to use return in main
return 0;
The compiler plants a return 0 at the end of main.
By not planting a return in main() it is an indication that your application does not have a failing state (usual for simple code). If it does have a user inserted return 0 I always start looking for other places it could return an error code. By not putting a return 0 you indicate to me that there is no error situation and I don't have to look for it.
Prefer pre increment to post increment
for (int i = 0; i < p.first; i++)
for (int i = 0; i < p.first; ++i)
In this case there is basically no difference. But there are situations where the post increment is slightly more complex.
What happen if we change i from an int into an iterator. Then it is a class the executes a function. Most iteratros behave like POD types with pre/post increment but to do that the post increment is slightly more complex. So why not use the one that will always give you optimal code (even if you change the type). So prefer pre-increment.
Prefer 'n' over std::endl
The difference is that std::endl will force a flush. In simple things like this it makes no difference. But if you are outputting a large amount of information then the extra flush can cause significant performance degradation.
The streams will auto flush themselves and having a human forcing a flush is always non optimal. So prefer to get the same resutls but at optimal speed with 'n'.
Pointer short hand
You should note that we have a shortcut pointer notation.
(*mapIt).second
// Shorthand as:
mapIt->second
Avoid short variable names
for (int i = 0; i < n; i++) {
This is great if your for loop is two lines long. But if the loop grows and becomes more complex then searching for all instances of i becomes a pain because of all the false positives.
Believe me just because your loop only covers two lines today, in ten years after thousands of bug fixes it will not look as neat. So give me something that is easy to search for in the code base.
I know a lot of people like using i as a loop variable. But this is a hold over from Foortran. Please use longer identifiers it will not cost you that much and the maintainer will not hunt you down with an axe.
Looks Like this:
#include <unordered_map>
#include <vector>
#include <iostream>
#include <iterator>
void customSort(std::vector<int> const& arr)
{
std::unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs) {
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
for(auto const& val: values) {
for(int loop = 0;loop < val.second; ++loop) {
std::cout << val.first << "n";
}
}
}
int main()
{
std::vector<int> arr;
int n;
std::cin >> n;
std::copy_n(std::istream_iterator<int>(std::cin), n,
std::back_inserter(arr));
customSort(arr);
}
Excellent response, thanks for breaking it down in the way you have. Very easy to understand how I can refactor now. In the Order by frequency section what issinginsing Val = std::pair<int, int>;?. Also in the Prefer pre increment to post increment you mention most iterators behave like POD types, what are POD types?
– greg
2 hours ago
1
Missed cut and paste:using Val = std::pair<int, int>;This is the modern C++ way of doing typedef.
– Martin York
2 hours ago
1
POD: Plain old data. Integers/char etc. When you increment an integer. It is post increment it increments the value but returns the original value to be used in the expression.
– Martin York
1 hour ago
add a comment |
up vote
10
down vote
accepted
up vote
10
down vote
accepted
Maps
If you access an element in a map and it does not exist then it is automatically created and "zero-initialized". This means you don't need to search and conditialy update.
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
Can be simplified too:
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
++frequencyMap[arr[i]];
}
VLA
Variable length arrays are an extension of C (part of C99). They were never part of the C++ standard. So this is not legal C++.
cin >> n;
int arr[n];
Your compiler supports it as an extension to the language but its not a good idea to use it. We use std::vector as a better alternative:
cin >> n;
std::vector<int> arr(n);
This also allows you to use range based for loops:
unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
Order by frequency.
I like your idea of ordering by frequency using a std::map but it becomes clunky when you also have to order by value (and thus have a map of std::set). I would simply copy the frequency info into a vector and use a sort.
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs)
{
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
Standard Issues from Beginners
Don't use
using namespace std;
This is a crutch that will break for you. Don't get into the habit of using it when it breaks it is not obvious and can lead to hours of debugging. The reason the standard namespace is called std and not standard is so prefixing object in the standard namespace is not that burdensome.
std::vector<int> val; // not that hard to add std::
Don't return an array from a function
return solution;
You are returning the address of a local variable. Once this goes out of scope that variable no longer exists. So you have an invalid pointer returned. This is also another good reason to use std::vector as the copy/move semantics will correctly move it from the current scope to the calling scope.
PS. Your compiler should warn you about this.
No need to use return in main
return 0;
The compiler plants a return 0 at the end of main.
By not planting a return in main() it is an indication that your application does not have a failing state (usual for simple code). If it does have a user inserted return 0 I always start looking for other places it could return an error code. By not putting a return 0 you indicate to me that there is no error situation and I don't have to look for it.
Prefer pre increment to post increment
for (int i = 0; i < p.first; i++)
for (int i = 0; i < p.first; ++i)
In this case there is basically no difference. But there are situations where the post increment is slightly more complex.
What happen if we change i from an int into an iterator. Then it is a class the executes a function. Most iteratros behave like POD types with pre/post increment but to do that the post increment is slightly more complex. So why not use the one that will always give you optimal code (even if you change the type). So prefer pre-increment.
Prefer 'n' over std::endl
The difference is that std::endl will force a flush. In simple things like this it makes no difference. But if you are outputting a large amount of information then the extra flush can cause significant performance degradation.
The streams will auto flush themselves and having a human forcing a flush is always non optimal. So prefer to get the same resutls but at optimal speed with 'n'.
Pointer short hand
You should note that we have a shortcut pointer notation.
(*mapIt).second
// Shorthand as:
mapIt->second
Avoid short variable names
for (int i = 0; i < n; i++) {
This is great if your for loop is two lines long. But if the loop grows and becomes more complex then searching for all instances of i becomes a pain because of all the false positives.
Believe me just because your loop only covers two lines today, in ten years after thousands of bug fixes it will not look as neat. So give me something that is easy to search for in the code base.
I know a lot of people like using i as a loop variable. But this is a hold over from Foortran. Please use longer identifiers it will not cost you that much and the maintainer will not hunt you down with an axe.
Looks Like this:
#include <unordered_map>
#include <vector>
#include <iostream>
#include <iterator>
void customSort(std::vector<int> const& arr)
{
std::unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs) {
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
for(auto const& val: values) {
for(int loop = 0;loop < val.second; ++loop) {
std::cout << val.first << "n";
}
}
}
int main()
{
std::vector<int> arr;
int n;
std::cin >> n;
std::copy_n(std::istream_iterator<int>(std::cin), n,
std::back_inserter(arr));
customSort(arr);
}
Maps
If you access an element in a map and it does not exist then it is automatically created and "zero-initialized". This means you don't need to search and conditialy update.
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
auto mapIt = frequencyMap.find(arr[i]);
if (mapIt == frequencyMap.end()) {
frequencyMap.insert(make_pair(arr[i], 1));
} else {
frequencyMap[arr[i]]++;
}
}
Can be simplified too:
unordered_map<int, int> frequencyMap;
for (int i = 0; i < n; i++) {
++frequencyMap[arr[i]];
}
VLA
Variable length arrays are an extension of C (part of C99). They were never part of the C++ standard. So this is not legal C++.
cin >> n;
int arr[n];
Your compiler supports it as an extension to the language but its not a good idea to use it. We use std::vector as a better alternative:
cin >> n;
std::vector<int> arr(n);
This also allows you to use range based for loops:
unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
Order by frequency.
I like your idea of ordering by frequency using a std::map but it becomes clunky when you also have to order by value (and thus have a map of std::set). I would simply copy the frequency info into a vector and use a sort.
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs)
{
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
Standard Issues from Beginners
Don't use
using namespace std;
This is a crutch that will break for you. Don't get into the habit of using it when it breaks it is not obvious and can lead to hours of debugging. The reason the standard namespace is called std and not standard is so prefixing object in the standard namespace is not that burdensome.
std::vector<int> val; // not that hard to add std::
Don't return an array from a function
return solution;
You are returning the address of a local variable. Once this goes out of scope that variable no longer exists. So you have an invalid pointer returned. This is also another good reason to use std::vector as the copy/move semantics will correctly move it from the current scope to the calling scope.
PS. Your compiler should warn you about this.
No need to use return in main
return 0;
The compiler plants a return 0 at the end of main.
By not planting a return in main() it is an indication that your application does not have a failing state (usual for simple code). If it does have a user inserted return 0 I always start looking for other places it could return an error code. By not putting a return 0 you indicate to me that there is no error situation and I don't have to look for it.
Prefer pre increment to post increment
for (int i = 0; i < p.first; i++)
for (int i = 0; i < p.first; ++i)
In this case there is basically no difference. But there are situations where the post increment is slightly more complex.
What happen if we change i from an int into an iterator. Then it is a class the executes a function. Most iteratros behave like POD types with pre/post increment but to do that the post increment is slightly more complex. So why not use the one that will always give you optimal code (even if you change the type). So prefer pre-increment.
Prefer 'n' over std::endl
The difference is that std::endl will force a flush. In simple things like this it makes no difference. But if you are outputting a large amount of information then the extra flush can cause significant performance degradation.
The streams will auto flush themselves and having a human forcing a flush is always non optimal. So prefer to get the same resutls but at optimal speed with 'n'.
Pointer short hand
You should note that we have a shortcut pointer notation.
(*mapIt).second
// Shorthand as:
mapIt->second
Avoid short variable names
for (int i = 0; i < n; i++) {
This is great if your for loop is two lines long. But if the loop grows and becomes more complex then searching for all instances of i becomes a pain because of all the false positives.
Believe me just because your loop only covers two lines today, in ten years after thousands of bug fixes it will not look as neat. So give me something that is easy to search for in the code base.
I know a lot of people like using i as a loop variable. But this is a hold over from Foortran. Please use longer identifiers it will not cost you that much and the maintainer will not hunt you down with an axe.
Looks Like this:
#include <unordered_map>
#include <vector>
#include <iostream>
#include <iterator>
void customSort(std::vector<int> const& arr)
{
std::unordered_map<int, int> frequencyMap;
for (int val: arr) {
++frequencyMap[val];
}
using Val = std::pair<int, int>;
std::vector<Val> values(std::begin(frequencyMap), std::end(frequencyMap));
std::sort(std::begin(values), std::end(values), (Val const& lhs, Val const& rhs) {
return (lhs.second < rhs.second) || (lhs.second == rhs.second && lhs.first < rhs.first);
});
for(auto const& val: values) {
for(int loop = 0;loop < val.second; ++loop) {
std::cout << val.first << "n";
}
}
}
int main()
{
std::vector<int> arr;
int n;
std::cin >> n;
std::copy_n(std::istream_iterator<int>(std::cin), n,
std::back_inserter(arr));
customSort(arr);
}
edited 2 hours ago
answered 3 hours ago
Martin York
72.2k481256
72.2k481256
Excellent response, thanks for breaking it down in the way you have. Very easy to understand how I can refactor now. In the Order by frequency section what issinginsing Val = std::pair<int, int>;?. Also in the Prefer pre increment to post increment you mention most iterators behave like POD types, what are POD types?
– greg
2 hours ago
1
Missed cut and paste:using Val = std::pair<int, int>;This is the modern C++ way of doing typedef.
– Martin York
2 hours ago
1
POD: Plain old data. Integers/char etc. When you increment an integer. It is post increment it increments the value but returns the original value to be used in the expression.
– Martin York
1 hour ago
add a comment |
Excellent response, thanks for breaking it down in the way you have. Very easy to understand how I can refactor now. In the Order by frequency section what issinginsing Val = std::pair<int, int>;?. Also in the Prefer pre increment to post increment you mention most iterators behave like POD types, what are POD types?
– greg
2 hours ago
1
Missed cut and paste:using Val = std::pair<int, int>;This is the modern C++ way of doing typedef.
– Martin York
2 hours ago
1
POD: Plain old data. Integers/char etc. When you increment an integer. It is post increment it increments the value but returns the original value to be used in the expression.
– Martin York
1 hour ago
Excellent response, thanks for breaking it down in the way you have. Very easy to understand how I can refactor now. In the Order by frequency section what is
sing in sing Val = std::pair<int, int>;?. Also in the Prefer pre increment to post increment you mention most iterators behave like POD types, what are POD types?– greg
2 hours ago
Excellent response, thanks for breaking it down in the way you have. Very easy to understand how I can refactor now. In the Order by frequency section what is
sing in sing Val = std::pair<int, int>;?. Also in the Prefer pre increment to post increment you mention most iterators behave like POD types, what are POD types?– greg
2 hours ago
1
1
Missed cut and paste:
using Val = std::pair<int, int>; This is the modern C++ way of doing typedef.– Martin York
2 hours ago
Missed cut and paste:
using Val = std::pair<int, int>; This is the modern C++ way of doing typedef.– Martin York
2 hours ago
1
1
POD: Plain old data. Integers/char etc. When you increment an integer. It is post increment it increments the value but returns the original value to be used in the expression.
– Martin York
1 hour ago
POD: Plain old data. Integers/char etc. When you increment an integer. It is post increment it increments the value but returns the original value to be used in the expression.
– Martin York
1 hour ago
add a comment |
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209168%2fsort-int-array-by-frequency-then-value%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
7
Please don't remove your header includes - not only does it slow down those of us who want to compile your code ourselves, but we frequently find improvements to make in that respect (missing includes, unnecessary includes, C includes, for example). Same goes for
namespace std: we want to see your actual code, not some mangled version of it. Thanks!– Toby Speight
3 hours ago
5
Don't remove your includes. As part of the code review I want to copy and paste it into a file and compile it. Having to add the includes back adds to my work load. Also if you leave the includes I can spot where you have bad includes.
– Martin York
3 hours ago
1
Use your real code.
– Nic Hartley
54 mins ago
Thanks for the feedback everyone, I will remember this in the future. Added my includes and removed the namesapce comment, as my real code did originally use the bad practice global using statement so myself some time in implementing this prototype.
– greg
18 mins ago