Most efficient way to find an entry in a C++ vector











up vote
7
down vote

favorite
2












I'm trying to construct an output table containing 80 rows of table status, that could be EMPTY or USED as below



+-------+-------+
| TABLE | STATE |
+-------+-------+
| 00 | USED |
| 01 | EMPTY |
| 02 | EMPTY |
..
..
| 79 | EMPTY |
+-------+-------+


I have a vector m_availTableNums which contains a list of available table numbers. In the below example, I'm putting 20 random table numbers into the above vector such that all the remaining 60 would be empty. My logic below works fine.



Is there a scope for improvement here on the find logic?



#include <iostream>
#include <iomanip>
#include <cmath>
#include <string.h>
#include <cstdlib>
#include <vector>
#include <algorithm>
#include <ctime>

using namespace std;

int main(int argc, const char * argv) {
std::vector<uint8_t> m_availTableNums;
char tmpStr[50];

uint8_t tableNum;
uint8_t randomCount;
srand(time(NULL));
for( tableNum=0; tableNum < 20; tableNum++ )
{
randomCount = ( rand() % 80 );
m_availTableNums.push_back( randomCount );
}

sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);
tmpStr[0]='';

sprintf(tmpStr, "| TABLE | STATE |");
printf("%sn", tmpStr);
tmpStr[0]='';

sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);

tableNum = 0;
for( tableNum=0; tableNum < 80 ; tableNum++ )
{
tmpStr[0]='';
if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
{
sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
} else {
sprintf(tmpStr, "| %02d | USED |", tableNum );
}

printf("%sn", tmpStr);
}

tmpStr[0]='';
sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);

return 0;
}









share|improve this question




















  • 1




    Is that hardcoded 80 a coincidence (i.e. could be other numbers, too) or by design? Because if that's by design and if you really only want to know taken-or-not-taken, a std::bitset<80> will do, too. This will not only be idiomatic, but also faster than any other solution. A bitset of that size fits in two quadwords, or one cache line, respectively. No runtime allocations (while inserting without having called reserve() like you do). Access is constant, single assembly instruction, it doesn't get easier or faster than that.
    – Damon
    Dec 5 at 19:59












  • @Damon It’s important to remember that not every machine is an x86 computer; many embedded systems for example use 8- or 16-bit processors and have no cache.
    – Edward
    Dec 6 at 10:27










  • @Edward: C++11 pretty much rules out 8- or 16-bit embedded systems, but yes in general you are right. Though a bit set worth 5 16-bit words will still be an order of magnitude more efficient to access than iterating a std::vector even on a no-cache machine.
    – Damon
    Dec 6 at 11:52












  • In fact, g++ has supported 8-bit AVR processors for some years. I have g++ 7.2.0 for AVR on my main computer right now and it supports C++17 on that platform.
    – Edward
    Dec 6 at 13:09















up vote
7
down vote

favorite
2












I'm trying to construct an output table containing 80 rows of table status, that could be EMPTY or USED as below



+-------+-------+
| TABLE | STATE |
+-------+-------+
| 00 | USED |
| 01 | EMPTY |
| 02 | EMPTY |
..
..
| 79 | EMPTY |
+-------+-------+


I have a vector m_availTableNums which contains a list of available table numbers. In the below example, I'm putting 20 random table numbers into the above vector such that all the remaining 60 would be empty. My logic below works fine.



Is there a scope for improvement here on the find logic?



#include <iostream>
#include <iomanip>
#include <cmath>
#include <string.h>
#include <cstdlib>
#include <vector>
#include <algorithm>
#include <ctime>

using namespace std;

int main(int argc, const char * argv) {
std::vector<uint8_t> m_availTableNums;
char tmpStr[50];

uint8_t tableNum;
uint8_t randomCount;
srand(time(NULL));
for( tableNum=0; tableNum < 20; tableNum++ )
{
randomCount = ( rand() % 80 );
m_availTableNums.push_back( randomCount );
}

sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);
tmpStr[0]='';

sprintf(tmpStr, "| TABLE | STATE |");
printf("%sn", tmpStr);
tmpStr[0]='';

sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);

tableNum = 0;
for( tableNum=0; tableNum < 80 ; tableNum++ )
{
tmpStr[0]='';
if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
{
sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
} else {
sprintf(tmpStr, "| %02d | USED |", tableNum );
}

printf("%sn", tmpStr);
}

tmpStr[0]='';
sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);

return 0;
}









share|improve this question




















  • 1




    Is that hardcoded 80 a coincidence (i.e. could be other numbers, too) or by design? Because if that's by design and if you really only want to know taken-or-not-taken, a std::bitset<80> will do, too. This will not only be idiomatic, but also faster than any other solution. A bitset of that size fits in two quadwords, or one cache line, respectively. No runtime allocations (while inserting without having called reserve() like you do). Access is constant, single assembly instruction, it doesn't get easier or faster than that.
    – Damon
    Dec 5 at 19:59












  • @Damon It’s important to remember that not every machine is an x86 computer; many embedded systems for example use 8- or 16-bit processors and have no cache.
    – Edward
    Dec 6 at 10:27










  • @Edward: C++11 pretty much rules out 8- or 16-bit embedded systems, but yes in general you are right. Though a bit set worth 5 16-bit words will still be an order of magnitude more efficient to access than iterating a std::vector even on a no-cache machine.
    – Damon
    Dec 6 at 11:52












  • In fact, g++ has supported 8-bit AVR processors for some years. I have g++ 7.2.0 for AVR on my main computer right now and it supports C++17 on that platform.
    – Edward
    Dec 6 at 13:09













up vote
7
down vote

favorite
2









up vote
7
down vote

favorite
2






2





I'm trying to construct an output table containing 80 rows of table status, that could be EMPTY or USED as below



+-------+-------+
| TABLE | STATE |
+-------+-------+
| 00 | USED |
| 01 | EMPTY |
| 02 | EMPTY |
..
..
| 79 | EMPTY |
+-------+-------+


I have a vector m_availTableNums which contains a list of available table numbers. In the below example, I'm putting 20 random table numbers into the above vector such that all the remaining 60 would be empty. My logic below works fine.



Is there a scope for improvement here on the find logic?



#include <iostream>
#include <iomanip>
#include <cmath>
#include <string.h>
#include <cstdlib>
#include <vector>
#include <algorithm>
#include <ctime>

using namespace std;

int main(int argc, const char * argv) {
std::vector<uint8_t> m_availTableNums;
char tmpStr[50];

uint8_t tableNum;
uint8_t randomCount;
srand(time(NULL));
for( tableNum=0; tableNum < 20; tableNum++ )
{
randomCount = ( rand() % 80 );
m_availTableNums.push_back( randomCount );
}

sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);
tmpStr[0]='';

sprintf(tmpStr, "| TABLE | STATE |");
printf("%sn", tmpStr);
tmpStr[0]='';

sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);

tableNum = 0;
for( tableNum=0; tableNum < 80 ; tableNum++ )
{
tmpStr[0]='';
if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
{
sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
} else {
sprintf(tmpStr, "| %02d | USED |", tableNum );
}

printf("%sn", tmpStr);
}

tmpStr[0]='';
sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);

return 0;
}









share|improve this question















I'm trying to construct an output table containing 80 rows of table status, that could be EMPTY or USED as below



+-------+-------+
| TABLE | STATE |
+-------+-------+
| 00 | USED |
| 01 | EMPTY |
| 02 | EMPTY |
..
..
| 79 | EMPTY |
+-------+-------+


I have a vector m_availTableNums which contains a list of available table numbers. In the below example, I'm putting 20 random table numbers into the above vector such that all the remaining 60 would be empty. My logic below works fine.



Is there a scope for improvement here on the find logic?



#include <iostream>
#include <iomanip>
#include <cmath>
#include <string.h>
#include <cstdlib>
#include <vector>
#include <algorithm>
#include <ctime>

using namespace std;

int main(int argc, const char * argv) {
std::vector<uint8_t> m_availTableNums;
char tmpStr[50];

uint8_t tableNum;
uint8_t randomCount;
srand(time(NULL));
for( tableNum=0; tableNum < 20; tableNum++ )
{
randomCount = ( rand() % 80 );
m_availTableNums.push_back( randomCount );
}

sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);
tmpStr[0]='';

sprintf(tmpStr, "| TABLE | STATE |");
printf("%sn", tmpStr);
tmpStr[0]='';

sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);

tableNum = 0;
for( tableNum=0; tableNum < 80 ; tableNum++ )
{
tmpStr[0]='';
if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
{
sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
} else {
sprintf(tmpStr, "| %02d | USED |", tableNum );
}

printf("%sn", tmpStr);
}

tmpStr[0]='';
sprintf(tmpStr, "+-------+-------+");
printf("%sn", tmpStr);

return 0;
}






c++ performance c++11 vectors






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Dec 5 at 19:25









200_success

128k15149412




128k15149412










asked Dec 5 at 9:23









Inian

14318




14318








  • 1




    Is that hardcoded 80 a coincidence (i.e. could be other numbers, too) or by design? Because if that's by design and if you really only want to know taken-or-not-taken, a std::bitset<80> will do, too. This will not only be idiomatic, but also faster than any other solution. A bitset of that size fits in two quadwords, or one cache line, respectively. No runtime allocations (while inserting without having called reserve() like you do). Access is constant, single assembly instruction, it doesn't get easier or faster than that.
    – Damon
    Dec 5 at 19:59












  • @Damon It’s important to remember that not every machine is an x86 computer; many embedded systems for example use 8- or 16-bit processors and have no cache.
    – Edward
    Dec 6 at 10:27










  • @Edward: C++11 pretty much rules out 8- or 16-bit embedded systems, but yes in general you are right. Though a bit set worth 5 16-bit words will still be an order of magnitude more efficient to access than iterating a std::vector even on a no-cache machine.
    – Damon
    Dec 6 at 11:52












  • In fact, g++ has supported 8-bit AVR processors for some years. I have g++ 7.2.0 for AVR on my main computer right now and it supports C++17 on that platform.
    – Edward
    Dec 6 at 13:09














  • 1




    Is that hardcoded 80 a coincidence (i.e. could be other numbers, too) or by design? Because if that's by design and if you really only want to know taken-or-not-taken, a std::bitset<80> will do, too. This will not only be idiomatic, but also faster than any other solution. A bitset of that size fits in two quadwords, or one cache line, respectively. No runtime allocations (while inserting without having called reserve() like you do). Access is constant, single assembly instruction, it doesn't get easier or faster than that.
    – Damon
    Dec 5 at 19:59












  • @Damon It’s important to remember that not every machine is an x86 computer; many embedded systems for example use 8- or 16-bit processors and have no cache.
    – Edward
    Dec 6 at 10:27










  • @Edward: C++11 pretty much rules out 8- or 16-bit embedded systems, but yes in general you are right. Though a bit set worth 5 16-bit words will still be an order of magnitude more efficient to access than iterating a std::vector even on a no-cache machine.
    – Damon
    Dec 6 at 11:52












  • In fact, g++ has supported 8-bit AVR processors for some years. I have g++ 7.2.0 for AVR on my main computer right now and it supports C++17 on that platform.
    – Edward
    Dec 6 at 13:09








1




1




Is that hardcoded 80 a coincidence (i.e. could be other numbers, too) or by design? Because if that's by design and if you really only want to know taken-or-not-taken, a std::bitset<80> will do, too. This will not only be idiomatic, but also faster than any other solution. A bitset of that size fits in two quadwords, or one cache line, respectively. No runtime allocations (while inserting without having called reserve() like you do). Access is constant, single assembly instruction, it doesn't get easier or faster than that.
– Damon
Dec 5 at 19:59






Is that hardcoded 80 a coincidence (i.e. could be other numbers, too) or by design? Because if that's by design and if you really only want to know taken-or-not-taken, a std::bitset<80> will do, too. This will not only be idiomatic, but also faster than any other solution. A bitset of that size fits in two quadwords, or one cache line, respectively. No runtime allocations (while inserting without having called reserve() like you do). Access is constant, single assembly instruction, it doesn't get easier or faster than that.
– Damon
Dec 5 at 19:59














@Damon It’s important to remember that not every machine is an x86 computer; many embedded systems for example use 8- or 16-bit processors and have no cache.
– Edward
Dec 6 at 10:27




@Damon It’s important to remember that not every machine is an x86 computer; many embedded systems for example use 8- or 16-bit processors and have no cache.
– Edward
Dec 6 at 10:27












@Edward: C++11 pretty much rules out 8- or 16-bit embedded systems, but yes in general you are right. Though a bit set worth 5 16-bit words will still be an order of magnitude more efficient to access than iterating a std::vector even on a no-cache machine.
– Damon
Dec 6 at 11:52






@Edward: C++11 pretty much rules out 8- or 16-bit embedded systems, but yes in general you are right. Though a bit set worth 5 16-bit words will still be an order of magnitude more efficient to access than iterating a std::vector even on a no-cache machine.
– Damon
Dec 6 at 11:52














In fact, g++ has supported 8-bit AVR processors for some years. I have g++ 7.2.0 for AVR on my main computer right now and it supports C++17 on that platform.
– Edward
Dec 6 at 13:09




In fact, g++ has supported 8-bit AVR processors for some years. I have g++ 7.2.0 for AVR on my main computer right now and it supports C++17 on that platform.
– Edward
Dec 6 at 13:09










4 Answers
4






active

oldest

votes

















up vote
19
down vote



accepted










Header files



It's strange that this code uses the C header <string.h> but the C++ versions of <cmath>, <ctime> and <cstdlib>. I recommend sticking to the C++ headers except on the rare occasions that you need to compile the same code with a C compiler. In this case, I don't see anything using <cstring>, so we can probably just drop that, along with <iomanip>, <iostream> and <cmath>. And we need to add some missing includes: <cstdint> and <cstdio>.



Avoid using namespace std



The standard namespace is not one that's designed for wholesale import like this. Unexpected name collisions when you add another header or move to a newer C++ could even cause changes to the meaning of your program.



Use the appropriate signature for main()



Since we're ignoring the command-line arguments, we can use a main() that takes no parameters:



int main()


Remove pointless temporary string



Instead of formatting into tmpStr and immediately printing its contents to standard output, we can eliminate that variable by formatting directly to standard output (using the same format string). For example, instead of:




std::sprintf(tmpStr, "+-------+-------+");
std::printf("%sn", tmpStr);
tmpStr[0]='';

std::sprintf(tmpStr, "| TABLE | STATE |");
std::printf("%sn", tmpStr);
tmpStr[0]='';

std::sprintf(tmpStr, "+-------+-------+");
std::printf("%sn", tmpStr);



we could simply write:



std::puts("+-------+-------+n"
"| TABLE | STATE |n"
"+-------+-------+");


And instead of




    tmpStr[0]='';
if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
{
std::sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
} else {
std::sprintf(tmpStr, "| %02d | USED |", tableNum );
}

printf("%sn", tmpStr);



we would have:



    if (std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()) {
std::printf("| %02d | EMPTY |n", tableNum);
} else {
std::printf("| %02d | USED |n", tableNum);
}


Reduce duplication



Most of these statements are common:




        std::printf("|  %02d   | EMPTY |n", tableNum);
} else {
std::printf("| %02d | USED |n", tableNum);
}



The only bit that's different is the EMPTY or USED string. So let's decide that first:



    const char *status =
std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()
? "EMPTY" : "USED";
std::printf("| %02d | %-5s |n", tableNum, status);


Prefer nullptr value to NULL macro



The C++ null pointer has strong type, whereas NULL or 0 can be interpreted as integer.



Reduce scope of variables



randomCount doesn't need to be valid outside the first for loop, and we don't need to use the same tableNum for both loops. Also, we could follow convention and use a short name for a short-lived loop index; i is the usual choice:



for (std::uint8_t i = 0;  i < 20;  ++i) {
std::uint8_t randomCount = rand() % 80;
m_availTableNums.push_back(randomCount);
}




for (std::uint8_t i = 0;  i < 80;  ++i) {


Avoid magic numbers



What's special about 80? Could we need a different range? Let's give the constant a name, and then we can be sure that the loop matches this range:



constexpr std::uint8_t LIMIT = 80;
...
std::uint8_t randomCount = rand() % LIMIT;
...
for (std::uint8_t i = 0; i < LIMIT; ++i) {


A departure from specification



The description says




I'm putting 20 random table numbers into the above vector such that, all the remaining 60 would be empty.




That's not exactly what's happening, as we're sampling with replacement from the values 0..79. There's nothing to prevent duplicates being added (it's actually quite unlikely that there will be exactly 60 empty values).



Reduce the algorithmic complexity



Each time through the loop, we use std::find() to see whether we have any matching elements. This is a linear search, so it examines elements in turn until it finds a match. Since it only finds a match one-quarter of the time, the other three-quarters will examine every element in the list, and the time it takes will be proportional to the list length - we say it scales as O(n), where n is the size of the vector. The complete loop therefore scales as O(mn), where m is the value of LIMIT.



We can reduce the complexity to O(m + n) if we use some extra storage to store the values in a way that makes them easy to test. For example, we could populate a vector that's indexed by the values from m_availTableNums:



auto by_val = std::vector<bool>(LIMIT, false);
for (auto value: m_availTableNums)
by_val[value] = true;

for (std::uint8_t i = 0; i < LIMIT; ++i) {
const char *status = by_val[i] ? "EMPTY" : "USED";
std::printf("| %02d | %-5s |n", i, status);
}


If the range were much larger, we might use an (unordered) set instead of vector<bool>. We might also choose vector<char> instead of vector<bool> for better speed at a cost of more space.





Simplified code



Here's my version, keeping to the spirit of the original (creating a list of indices, rather than changing to storing in the form we want to use them):



#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <vector>

int main()
{
constexpr std::uint8_t LIMIT = 80;
std::vector<std::uint8_t> m_availTableNums;

std::srand(std::time(nullptr));

for (std::uint8_t i = 0; i < 20; ++i) {
std::uint8_t randomCount = rand() % LIMIT;
m_availTableNums.push_back(randomCount);
}

std::puts("+-------+-------+n"
"| TABLE | STATE |n"
"+-------+-------+");

auto by_val = std::vector<bool>(LIMIT, false);
for (auto value: m_availTableNums)
by_val[value] = true;

for (std::uint8_t i = 0; i < LIMIT; ++i) {
const char *status = by_val[i] ? "EMPTY" : "USED";
std::printf("| %02d | %-5s |n", i, status);
}

std::puts("+-------+-------+");
}





share|improve this answer























  • Excellent points! ++, I do have to let know that I'm not an expert yet in C++, just in the beginner stages. These points are really useful. One question before I accept this answer. Your opinion on dave's comment above to use sort and remove elements at head?
    – Inian
    Dec 5 at 10:03






  • 1




    That's a possibility - in complexity terms, that's similar to my suggestion. To some extent, the best choice may depend on whether modifying the values is acceptable. If you go with the sorting, don't forget that you may need to remove multiple elements from the head of the vector when there are duplicate values in it (oh, and don't actually remove - that's expensive; just advance over them and leave them in place).
    – Toby Speight
    Dec 5 at 10:27










  • The only thing tmpStr is ever used for is to immediately pass it to printf() with a %s format. So why not just printf() directly? - first make sure the string doesn't contain any characters with a special meaning for printf
    – hanshenrik
    Dec 5 at 15:41










  • @hanshenrik, perhaps I didn't word that very well. I meant we should replace the sprintf into tmpStr with a printf directly to standard output - same (or nearly same) format string. I'll see if I can edit to prevent misunderstandings.
    – Toby Speight
    Dec 5 at 15:49






  • 1




    @Graham, have you compared the generated code using iterators and using indices? Decent optimising compilers will transform indexing operations into pointer operations quite routinely.
    – Toby Speight
    Dec 6 at 13:50


















up vote
8
down vote













I would seriously consider whether std::vector is the right data structure. Do you need the available table numbers to be in the order you got them, or can they be sorted? Do you need to allow duplicate table numbers? (I.e. if I add 42 twice, do I have one entry for 42 or two?)



If the data can be sorted and de-duplicated, then the most efficient way to find an entry is to store the data in a std::set. Storing the data takes a little longer (O(log n) for the set vs. amortized O(1) for the vector), but finding the data makes up for it (O(log n) for the set vs. O(n) for the vector). If you use a set, make sure to use set's find function (e.g. mySet.find(tableNum) rather than std::find(mySet.begin(),mySet.end(),tableNum)), or there's no benefit.



On a more general code-style note, since you're using C++ data structures anyway, you should probably use std::string and std::cout instead of char */char and printf.






share|improve this answer






























    up vote
    3
    down vote













    A std::set or std::bitset might be the better containers to use for this, but assuming we're going to stick with std::vector, here's some improvements on this.



    See Toby Speight's answer for some of the improvements I used but did not cover.



    Avoid using unsigned numbers



    Don't use unsigned int or uint8_t. Counterintuitively, it's generally best to use signed int types even when a value should never be negative. Operations on mixed signed types come with a lot of problems. It's an unfortunate mistake the standard went with unsigned numbers for container sizes. Unsigned numbers are best kept to bitwise operations.



    Store whether an index is used at that index



    Instead of storing a list of indexes that have been used, let's just store a bool at each possible index. This uses more space, but then no extra searching has to be done, just a single loop. (Notice I've changed m_availTableNums to availTableNums. The prefix m_ is conventionally only used for class members).



    std::vector<bool> availTableNums (80);


    Putting the 80 here will initialize the vector with 80 false values. Now when we fill the table, we choose random indices to insert true which indicates a value of used.



    std::srand(std::time(nullptr));
    for(int i=0; i < 20; i++)
    {
    availTableNums[std::rand() % availTableNums.size()] = true;
    }


    Using modern RNG facilities



    std::rand has no guarantees of the "randomness" of the numbers it generates. It's fine if you want to run up a quick test, but in production code we could modify the previous example to make it a little more robust.



    std::default_random_engine engine {std::time()};
    std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
    for(int i=0; i < 20; i++)
    {
    availTableNums[uniform_dist(e1)] = true;
    }


    Here we use std::size_t, even though it's an unsigned type, as it is better to not have to cast from one type to another if we could avoid it. The random number generator will automatically pick from the correct range so we no longer need the modulus operator.



    Use C++ string streams



    There is no need for a temporary string, and C++ streams are often a better choice than the C printf family of functions. We can write the header of the table as:



    std::cout << "+-------+-------+n"
    << "| TABLE | STATE |n"
    << "+-------+-------+" << std::endl;


    The extra <<s are left for style but not necessary. Now for the printing inside the loop, use std::setw(2) to make sure the printed index is within 2 columns. The value of this resets if string is sent to std::cout so it will not effect the rest of the line.



    The use of std::endl will cause the output to always be flushed from the buffer and will have a small performance penalty. However, I am usually quite generous with them unless it is shown they have a sizeable impact on performance. It can make code easier to debug if output is consistently flushed and you know exactly where something went wrong.



    for(auto i=0u; i < availTableNums.size(); i++)
    {
    std::cout << "| " << std::setw(2) << i << " | "
    << (availTableNums[i] ? "USED " : "EMPTY")
    << " |" << std::endl;
    }


    We again choose an unsigned index as comparing two unsigned types is better than comparing mixed signs or casting one to the other. The conditional expression ?: allows us to concisely choose between the two strings depending on whether the index is true or false.





    Putting it all together with only the relavent headers, we get:



    #include <ctime>
    #include <iomanip>
    #include <iostream>
    #include <random>
    #include <string>
    #include <vector>


    int main()
    {
    std::vector<bool> availTableNums (80);

    std::default_random_engine engine {std::time()};
    std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
    for(int i=0; i < 20; i++)
    {
    availTableNums[uniform_dist(e1)] = true;
    }

    std::cout << "+-------+-------+n"
    << "| TABLE | STATE |n"
    << "+-------+-------+" << std::endl;

    for(auto i=0u; i < availTableNums.size(); i++)
    {
    std::cout << "| " << std::setw(2) << i << " | "
    << (availTableNums[i] ? "USED " : "EMPTY")
    << " |" << std::endl;
    }

    std::cout << "+-------+-------+" << std::endl;

    return 0;
    }





    share|improve this answer























    • +1 Avoid using unsigned numbers: Really good advice. Hard to get across especially with std::size_t being unsigned but you addressed that point well.
      – Martin York
      Dec 6 at 0:20






    • 1




      Avoid advising using std::endl. Do he really need to pay for a flush? Furthermore, in your TABLE | STATE you can safely keep the 1rst << and remove others.
      – Calak
      Dec 6 at 0:51






    • 2




      I'd consider the unsigned numbers a preference thing. There are some very strong reasons to use unsigned numbers. To call their use a "mistake" in the STL suggests you haven't thought those through.
      – Cort Ammon
      Dec 6 at 1:02










    • @Calak - I put in a note about the performance penalty, but haven't been convinced it's always that bad. I left the extra << in as a style choice but I could probably note that.
      – TheLoneMilkMan
      Dec 6 at 1:29










    • @CortAmmon - Well I'm basing that off of the opinion of Stroustrup and others involved in the STL design. I'm aware there's not a consensus but the issue isn't usually brought up enough.
      – TheLoneMilkMan
      Dec 6 at 1:42


















    up vote
    2
    down vote













    The above are good advice. Do you need exactly N values true? If so, the above loop might be better as (pieces left out for simplicity):



    const std::size_t N = 80;
    const std::size_t K = 20;

    std::unordered_set<std::size_t> exists;

    // Fill exists with K random values between [0 .. N], inclusive.
    std::default_random_engine rng{std::time()};
    std::uniform_int_distribution<std::size_t> uniform_dist {0, N};
    while (exists.size() < K) {
    exists.insert(uniform_dist(rng));
    }

    // Print each value in order, in a table.
    for(auto i = uniform_dist.min(); i <= uniform_dist.max(); ++i) {
    std::cout << "| " << std::setw(2) << i << " | "
    << ((exists.count(i) == 0) ? "EMPTY" : "USED ")
    << " |" << std::endl;
    }





    share|improve this answer





















    • (Nice catch, with the exception of picking the symbolic name for the total number in your question about the requirements instead of that for the "true sub-set".)
      – greybeard
      Dec 8 at 7:56











    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "196"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209055%2fmost-efficient-way-to-find-an-entry-in-a-c-vector%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    19
    down vote



    accepted










    Header files



    It's strange that this code uses the C header <string.h> but the C++ versions of <cmath>, <ctime> and <cstdlib>. I recommend sticking to the C++ headers except on the rare occasions that you need to compile the same code with a C compiler. In this case, I don't see anything using <cstring>, so we can probably just drop that, along with <iomanip>, <iostream> and <cmath>. And we need to add some missing includes: <cstdint> and <cstdio>.



    Avoid using namespace std



    The standard namespace is not one that's designed for wholesale import like this. Unexpected name collisions when you add another header or move to a newer C++ could even cause changes to the meaning of your program.



    Use the appropriate signature for main()



    Since we're ignoring the command-line arguments, we can use a main() that takes no parameters:



    int main()


    Remove pointless temporary string



    Instead of formatting into tmpStr and immediately printing its contents to standard output, we can eliminate that variable by formatting directly to standard output (using the same format string). For example, instead of:




    std::sprintf(tmpStr, "+-------+-------+");
    std::printf("%sn", tmpStr);
    tmpStr[0]='';

    std::sprintf(tmpStr, "| TABLE | STATE |");
    std::printf("%sn", tmpStr);
    tmpStr[0]='';

    std::sprintf(tmpStr, "+-------+-------+");
    std::printf("%sn", tmpStr);



    we could simply write:



    std::puts("+-------+-------+n"
    "| TABLE | STATE |n"
    "+-------+-------+");


    And instead of




        tmpStr[0]='';
    if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
    {
    std::sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
    } else {
    std::sprintf(tmpStr, "| %02d | USED |", tableNum );
    }

    printf("%sn", tmpStr);



    we would have:



        if (std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()) {
    std::printf("| %02d | EMPTY |n", tableNum);
    } else {
    std::printf("| %02d | USED |n", tableNum);
    }


    Reduce duplication



    Most of these statements are common:




            std::printf("|  %02d   | EMPTY |n", tableNum);
    } else {
    std::printf("| %02d | USED |n", tableNum);
    }



    The only bit that's different is the EMPTY or USED string. So let's decide that first:



        const char *status =
    std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()
    ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", tableNum, status);


    Prefer nullptr value to NULL macro



    The C++ null pointer has strong type, whereas NULL or 0 can be interpreted as integer.



    Reduce scope of variables



    randomCount doesn't need to be valid outside the first for loop, and we don't need to use the same tableNum for both loops. Also, we could follow convention and use a short name for a short-lived loop index; i is the usual choice:



    for (std::uint8_t i = 0;  i < 20;  ++i) {
    std::uint8_t randomCount = rand() % 80;
    m_availTableNums.push_back(randomCount);
    }




    for (std::uint8_t i = 0;  i < 80;  ++i) {


    Avoid magic numbers



    What's special about 80? Could we need a different range? Let's give the constant a name, and then we can be sure that the loop matches this range:



    constexpr std::uint8_t LIMIT = 80;
    ...
    std::uint8_t randomCount = rand() % LIMIT;
    ...
    for (std::uint8_t i = 0; i < LIMIT; ++i) {


    A departure from specification



    The description says




    I'm putting 20 random table numbers into the above vector such that, all the remaining 60 would be empty.




    That's not exactly what's happening, as we're sampling with replacement from the values 0..79. There's nothing to prevent duplicates being added (it's actually quite unlikely that there will be exactly 60 empty values).



    Reduce the algorithmic complexity



    Each time through the loop, we use std::find() to see whether we have any matching elements. This is a linear search, so it examines elements in turn until it finds a match. Since it only finds a match one-quarter of the time, the other three-quarters will examine every element in the list, and the time it takes will be proportional to the list length - we say it scales as O(n), where n is the size of the vector. The complete loop therefore scales as O(mn), where m is the value of LIMIT.



    We can reduce the complexity to O(m + n) if we use some extra storage to store the values in a way that makes them easy to test. For example, we could populate a vector that's indexed by the values from m_availTableNums:



    auto by_val = std::vector<bool>(LIMIT, false);
    for (auto value: m_availTableNums)
    by_val[value] = true;

    for (std::uint8_t i = 0; i < LIMIT; ++i) {
    const char *status = by_val[i] ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", i, status);
    }


    If the range were much larger, we might use an (unordered) set instead of vector<bool>. We might also choose vector<char> instead of vector<bool> for better speed at a cost of more space.





    Simplified code



    Here's my version, keeping to the spirit of the original (creating a list of indices, rather than changing to storing in the form we want to use them):



    #include <algorithm>
    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>
    #include <vector>

    int main()
    {
    constexpr std::uint8_t LIMIT = 80;
    std::vector<std::uint8_t> m_availTableNums;

    std::srand(std::time(nullptr));

    for (std::uint8_t i = 0; i < 20; ++i) {
    std::uint8_t randomCount = rand() % LIMIT;
    m_availTableNums.push_back(randomCount);
    }

    std::puts("+-------+-------+n"
    "| TABLE | STATE |n"
    "+-------+-------+");

    auto by_val = std::vector<bool>(LIMIT, false);
    for (auto value: m_availTableNums)
    by_val[value] = true;

    for (std::uint8_t i = 0; i < LIMIT; ++i) {
    const char *status = by_val[i] ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", i, status);
    }

    std::puts("+-------+-------+");
    }





    share|improve this answer























    • Excellent points! ++, I do have to let know that I'm not an expert yet in C++, just in the beginner stages. These points are really useful. One question before I accept this answer. Your opinion on dave's comment above to use sort and remove elements at head?
      – Inian
      Dec 5 at 10:03






    • 1




      That's a possibility - in complexity terms, that's similar to my suggestion. To some extent, the best choice may depend on whether modifying the values is acceptable. If you go with the sorting, don't forget that you may need to remove multiple elements from the head of the vector when there are duplicate values in it (oh, and don't actually remove - that's expensive; just advance over them and leave them in place).
      – Toby Speight
      Dec 5 at 10:27










    • The only thing tmpStr is ever used for is to immediately pass it to printf() with a %s format. So why not just printf() directly? - first make sure the string doesn't contain any characters with a special meaning for printf
      – hanshenrik
      Dec 5 at 15:41










    • @hanshenrik, perhaps I didn't word that very well. I meant we should replace the sprintf into tmpStr with a printf directly to standard output - same (or nearly same) format string. I'll see if I can edit to prevent misunderstandings.
      – Toby Speight
      Dec 5 at 15:49






    • 1




      @Graham, have you compared the generated code using iterators and using indices? Decent optimising compilers will transform indexing operations into pointer operations quite routinely.
      – Toby Speight
      Dec 6 at 13:50















    up vote
    19
    down vote



    accepted










    Header files



    It's strange that this code uses the C header <string.h> but the C++ versions of <cmath>, <ctime> and <cstdlib>. I recommend sticking to the C++ headers except on the rare occasions that you need to compile the same code with a C compiler. In this case, I don't see anything using <cstring>, so we can probably just drop that, along with <iomanip>, <iostream> and <cmath>. And we need to add some missing includes: <cstdint> and <cstdio>.



    Avoid using namespace std



    The standard namespace is not one that's designed for wholesale import like this. Unexpected name collisions when you add another header or move to a newer C++ could even cause changes to the meaning of your program.



    Use the appropriate signature for main()



    Since we're ignoring the command-line arguments, we can use a main() that takes no parameters:



    int main()


    Remove pointless temporary string



    Instead of formatting into tmpStr and immediately printing its contents to standard output, we can eliminate that variable by formatting directly to standard output (using the same format string). For example, instead of:




    std::sprintf(tmpStr, "+-------+-------+");
    std::printf("%sn", tmpStr);
    tmpStr[0]='';

    std::sprintf(tmpStr, "| TABLE | STATE |");
    std::printf("%sn", tmpStr);
    tmpStr[0]='';

    std::sprintf(tmpStr, "+-------+-------+");
    std::printf("%sn", tmpStr);



    we could simply write:



    std::puts("+-------+-------+n"
    "| TABLE | STATE |n"
    "+-------+-------+");


    And instead of




        tmpStr[0]='';
    if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
    {
    std::sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
    } else {
    std::sprintf(tmpStr, "| %02d | USED |", tableNum );
    }

    printf("%sn", tmpStr);



    we would have:



        if (std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()) {
    std::printf("| %02d | EMPTY |n", tableNum);
    } else {
    std::printf("| %02d | USED |n", tableNum);
    }


    Reduce duplication



    Most of these statements are common:




            std::printf("|  %02d   | EMPTY |n", tableNum);
    } else {
    std::printf("| %02d | USED |n", tableNum);
    }



    The only bit that's different is the EMPTY or USED string. So let's decide that first:



        const char *status =
    std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()
    ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", tableNum, status);


    Prefer nullptr value to NULL macro



    The C++ null pointer has strong type, whereas NULL or 0 can be interpreted as integer.



    Reduce scope of variables



    randomCount doesn't need to be valid outside the first for loop, and we don't need to use the same tableNum for both loops. Also, we could follow convention and use a short name for a short-lived loop index; i is the usual choice:



    for (std::uint8_t i = 0;  i < 20;  ++i) {
    std::uint8_t randomCount = rand() % 80;
    m_availTableNums.push_back(randomCount);
    }




    for (std::uint8_t i = 0;  i < 80;  ++i) {


    Avoid magic numbers



    What's special about 80? Could we need a different range? Let's give the constant a name, and then we can be sure that the loop matches this range:



    constexpr std::uint8_t LIMIT = 80;
    ...
    std::uint8_t randomCount = rand() % LIMIT;
    ...
    for (std::uint8_t i = 0; i < LIMIT; ++i) {


    A departure from specification



    The description says




    I'm putting 20 random table numbers into the above vector such that, all the remaining 60 would be empty.




    That's not exactly what's happening, as we're sampling with replacement from the values 0..79. There's nothing to prevent duplicates being added (it's actually quite unlikely that there will be exactly 60 empty values).



    Reduce the algorithmic complexity



    Each time through the loop, we use std::find() to see whether we have any matching elements. This is a linear search, so it examines elements in turn until it finds a match. Since it only finds a match one-quarter of the time, the other three-quarters will examine every element in the list, and the time it takes will be proportional to the list length - we say it scales as O(n), where n is the size of the vector. The complete loop therefore scales as O(mn), where m is the value of LIMIT.



    We can reduce the complexity to O(m + n) if we use some extra storage to store the values in a way that makes them easy to test. For example, we could populate a vector that's indexed by the values from m_availTableNums:



    auto by_val = std::vector<bool>(LIMIT, false);
    for (auto value: m_availTableNums)
    by_val[value] = true;

    for (std::uint8_t i = 0; i < LIMIT; ++i) {
    const char *status = by_val[i] ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", i, status);
    }


    If the range were much larger, we might use an (unordered) set instead of vector<bool>. We might also choose vector<char> instead of vector<bool> for better speed at a cost of more space.





    Simplified code



    Here's my version, keeping to the spirit of the original (creating a list of indices, rather than changing to storing in the form we want to use them):



    #include <algorithm>
    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>
    #include <vector>

    int main()
    {
    constexpr std::uint8_t LIMIT = 80;
    std::vector<std::uint8_t> m_availTableNums;

    std::srand(std::time(nullptr));

    for (std::uint8_t i = 0; i < 20; ++i) {
    std::uint8_t randomCount = rand() % LIMIT;
    m_availTableNums.push_back(randomCount);
    }

    std::puts("+-------+-------+n"
    "| TABLE | STATE |n"
    "+-------+-------+");

    auto by_val = std::vector<bool>(LIMIT, false);
    for (auto value: m_availTableNums)
    by_val[value] = true;

    for (std::uint8_t i = 0; i < LIMIT; ++i) {
    const char *status = by_val[i] ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", i, status);
    }

    std::puts("+-------+-------+");
    }





    share|improve this answer























    • Excellent points! ++, I do have to let know that I'm not an expert yet in C++, just in the beginner stages. These points are really useful. One question before I accept this answer. Your opinion on dave's comment above to use sort and remove elements at head?
      – Inian
      Dec 5 at 10:03






    • 1




      That's a possibility - in complexity terms, that's similar to my suggestion. To some extent, the best choice may depend on whether modifying the values is acceptable. If you go with the sorting, don't forget that you may need to remove multiple elements from the head of the vector when there are duplicate values in it (oh, and don't actually remove - that's expensive; just advance over them and leave them in place).
      – Toby Speight
      Dec 5 at 10:27










    • The only thing tmpStr is ever used for is to immediately pass it to printf() with a %s format. So why not just printf() directly? - first make sure the string doesn't contain any characters with a special meaning for printf
      – hanshenrik
      Dec 5 at 15:41










    • @hanshenrik, perhaps I didn't word that very well. I meant we should replace the sprintf into tmpStr with a printf directly to standard output - same (or nearly same) format string. I'll see if I can edit to prevent misunderstandings.
      – Toby Speight
      Dec 5 at 15:49






    • 1




      @Graham, have you compared the generated code using iterators and using indices? Decent optimising compilers will transform indexing operations into pointer operations quite routinely.
      – Toby Speight
      Dec 6 at 13:50













    up vote
    19
    down vote



    accepted







    up vote
    19
    down vote



    accepted






    Header files



    It's strange that this code uses the C header <string.h> but the C++ versions of <cmath>, <ctime> and <cstdlib>. I recommend sticking to the C++ headers except on the rare occasions that you need to compile the same code with a C compiler. In this case, I don't see anything using <cstring>, so we can probably just drop that, along with <iomanip>, <iostream> and <cmath>. And we need to add some missing includes: <cstdint> and <cstdio>.



    Avoid using namespace std



    The standard namespace is not one that's designed for wholesale import like this. Unexpected name collisions when you add another header or move to a newer C++ could even cause changes to the meaning of your program.



    Use the appropriate signature for main()



    Since we're ignoring the command-line arguments, we can use a main() that takes no parameters:



    int main()


    Remove pointless temporary string



    Instead of formatting into tmpStr and immediately printing its contents to standard output, we can eliminate that variable by formatting directly to standard output (using the same format string). For example, instead of:




    std::sprintf(tmpStr, "+-------+-------+");
    std::printf("%sn", tmpStr);
    tmpStr[0]='';

    std::sprintf(tmpStr, "| TABLE | STATE |");
    std::printf("%sn", tmpStr);
    tmpStr[0]='';

    std::sprintf(tmpStr, "+-------+-------+");
    std::printf("%sn", tmpStr);



    we could simply write:



    std::puts("+-------+-------+n"
    "| TABLE | STATE |n"
    "+-------+-------+");


    And instead of




        tmpStr[0]='';
    if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
    {
    std::sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
    } else {
    std::sprintf(tmpStr, "| %02d | USED |", tableNum );
    }

    printf("%sn", tmpStr);



    we would have:



        if (std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()) {
    std::printf("| %02d | EMPTY |n", tableNum);
    } else {
    std::printf("| %02d | USED |n", tableNum);
    }


    Reduce duplication



    Most of these statements are common:




            std::printf("|  %02d   | EMPTY |n", tableNum);
    } else {
    std::printf("| %02d | USED |n", tableNum);
    }



    The only bit that's different is the EMPTY or USED string. So let's decide that first:



        const char *status =
    std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()
    ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", tableNum, status);


    Prefer nullptr value to NULL macro



    The C++ null pointer has strong type, whereas NULL or 0 can be interpreted as integer.



    Reduce scope of variables



    randomCount doesn't need to be valid outside the first for loop, and we don't need to use the same tableNum for both loops. Also, we could follow convention and use a short name for a short-lived loop index; i is the usual choice:



    for (std::uint8_t i = 0;  i < 20;  ++i) {
    std::uint8_t randomCount = rand() % 80;
    m_availTableNums.push_back(randomCount);
    }




    for (std::uint8_t i = 0;  i < 80;  ++i) {


    Avoid magic numbers



    What's special about 80? Could we need a different range? Let's give the constant a name, and then we can be sure that the loop matches this range:



    constexpr std::uint8_t LIMIT = 80;
    ...
    std::uint8_t randomCount = rand() % LIMIT;
    ...
    for (std::uint8_t i = 0; i < LIMIT; ++i) {


    A departure from specification



    The description says




    I'm putting 20 random table numbers into the above vector such that, all the remaining 60 would be empty.




    That's not exactly what's happening, as we're sampling with replacement from the values 0..79. There's nothing to prevent duplicates being added (it's actually quite unlikely that there will be exactly 60 empty values).



    Reduce the algorithmic complexity



    Each time through the loop, we use std::find() to see whether we have any matching elements. This is a linear search, so it examines elements in turn until it finds a match. Since it only finds a match one-quarter of the time, the other three-quarters will examine every element in the list, and the time it takes will be proportional to the list length - we say it scales as O(n), where n is the size of the vector. The complete loop therefore scales as O(mn), where m is the value of LIMIT.



    We can reduce the complexity to O(m + n) if we use some extra storage to store the values in a way that makes them easy to test. For example, we could populate a vector that's indexed by the values from m_availTableNums:



    auto by_val = std::vector<bool>(LIMIT, false);
    for (auto value: m_availTableNums)
    by_val[value] = true;

    for (std::uint8_t i = 0; i < LIMIT; ++i) {
    const char *status = by_val[i] ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", i, status);
    }


    If the range were much larger, we might use an (unordered) set instead of vector<bool>. We might also choose vector<char> instead of vector<bool> for better speed at a cost of more space.





    Simplified code



    Here's my version, keeping to the spirit of the original (creating a list of indices, rather than changing to storing in the form we want to use them):



    #include <algorithm>
    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>
    #include <vector>

    int main()
    {
    constexpr std::uint8_t LIMIT = 80;
    std::vector<std::uint8_t> m_availTableNums;

    std::srand(std::time(nullptr));

    for (std::uint8_t i = 0; i < 20; ++i) {
    std::uint8_t randomCount = rand() % LIMIT;
    m_availTableNums.push_back(randomCount);
    }

    std::puts("+-------+-------+n"
    "| TABLE | STATE |n"
    "+-------+-------+");

    auto by_val = std::vector<bool>(LIMIT, false);
    for (auto value: m_availTableNums)
    by_val[value] = true;

    for (std::uint8_t i = 0; i < LIMIT; ++i) {
    const char *status = by_val[i] ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", i, status);
    }

    std::puts("+-------+-------+");
    }





    share|improve this answer














    Header files



    It's strange that this code uses the C header <string.h> but the C++ versions of <cmath>, <ctime> and <cstdlib>. I recommend sticking to the C++ headers except on the rare occasions that you need to compile the same code with a C compiler. In this case, I don't see anything using <cstring>, so we can probably just drop that, along with <iomanip>, <iostream> and <cmath>. And we need to add some missing includes: <cstdint> and <cstdio>.



    Avoid using namespace std



    The standard namespace is not one that's designed for wholesale import like this. Unexpected name collisions when you add another header or move to a newer C++ could even cause changes to the meaning of your program.



    Use the appropriate signature for main()



    Since we're ignoring the command-line arguments, we can use a main() that takes no parameters:



    int main()


    Remove pointless temporary string



    Instead of formatting into tmpStr and immediately printing its contents to standard output, we can eliminate that variable by formatting directly to standard output (using the same format string). For example, instead of:




    std::sprintf(tmpStr, "+-------+-------+");
    std::printf("%sn", tmpStr);
    tmpStr[0]='';

    std::sprintf(tmpStr, "| TABLE | STATE |");
    std::printf("%sn", tmpStr);
    tmpStr[0]='';

    std::sprintf(tmpStr, "+-------+-------+");
    std::printf("%sn", tmpStr);



    we could simply write:



    std::puts("+-------+-------+n"
    "| TABLE | STATE |n"
    "+-------+-------+");


    And instead of




        tmpStr[0]='';
    if ( std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end() )
    {
    std::sprintf(tmpStr, "| %02d | EMPTY |", tableNum );
    } else {
    std::sprintf(tmpStr, "| %02d | USED |", tableNum );
    }

    printf("%sn", tmpStr);



    we would have:



        if (std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()) {
    std::printf("| %02d | EMPTY |n", tableNum);
    } else {
    std::printf("| %02d | USED |n", tableNum);
    }


    Reduce duplication



    Most of these statements are common:




            std::printf("|  %02d   | EMPTY |n", tableNum);
    } else {
    std::printf("| %02d | USED |n", tableNum);
    }



    The only bit that's different is the EMPTY or USED string. So let's decide that first:



        const char *status =
    std::find(m_availTableNums.begin(), m_availTableNums.end(), tableNum) != m_availTableNums.end()
    ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", tableNum, status);


    Prefer nullptr value to NULL macro



    The C++ null pointer has strong type, whereas NULL or 0 can be interpreted as integer.



    Reduce scope of variables



    randomCount doesn't need to be valid outside the first for loop, and we don't need to use the same tableNum for both loops. Also, we could follow convention and use a short name for a short-lived loop index; i is the usual choice:



    for (std::uint8_t i = 0;  i < 20;  ++i) {
    std::uint8_t randomCount = rand() % 80;
    m_availTableNums.push_back(randomCount);
    }




    for (std::uint8_t i = 0;  i < 80;  ++i) {


    Avoid magic numbers



    What's special about 80? Could we need a different range? Let's give the constant a name, and then we can be sure that the loop matches this range:



    constexpr std::uint8_t LIMIT = 80;
    ...
    std::uint8_t randomCount = rand() % LIMIT;
    ...
    for (std::uint8_t i = 0; i < LIMIT; ++i) {


    A departure from specification



    The description says




    I'm putting 20 random table numbers into the above vector such that, all the remaining 60 would be empty.




    That's not exactly what's happening, as we're sampling with replacement from the values 0..79. There's nothing to prevent duplicates being added (it's actually quite unlikely that there will be exactly 60 empty values).



    Reduce the algorithmic complexity



    Each time through the loop, we use std::find() to see whether we have any matching elements. This is a linear search, so it examines elements in turn until it finds a match. Since it only finds a match one-quarter of the time, the other three-quarters will examine every element in the list, and the time it takes will be proportional to the list length - we say it scales as O(n), where n is the size of the vector. The complete loop therefore scales as O(mn), where m is the value of LIMIT.



    We can reduce the complexity to O(m + n) if we use some extra storage to store the values in a way that makes them easy to test. For example, we could populate a vector that's indexed by the values from m_availTableNums:



    auto by_val = std::vector<bool>(LIMIT, false);
    for (auto value: m_availTableNums)
    by_val[value] = true;

    for (std::uint8_t i = 0; i < LIMIT; ++i) {
    const char *status = by_val[i] ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", i, status);
    }


    If the range were much larger, we might use an (unordered) set instead of vector<bool>. We might also choose vector<char> instead of vector<bool> for better speed at a cost of more space.





    Simplified code



    Here's my version, keeping to the spirit of the original (creating a list of indices, rather than changing to storing in the form we want to use them):



    #include <algorithm>
    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>
    #include <vector>

    int main()
    {
    constexpr std::uint8_t LIMIT = 80;
    std::vector<std::uint8_t> m_availTableNums;

    std::srand(std::time(nullptr));

    for (std::uint8_t i = 0; i < 20; ++i) {
    std::uint8_t randomCount = rand() % LIMIT;
    m_availTableNums.push_back(randomCount);
    }

    std::puts("+-------+-------+n"
    "| TABLE | STATE |n"
    "+-------+-------+");

    auto by_val = std::vector<bool>(LIMIT, false);
    for (auto value: m_availTableNums)
    by_val[value] = true;

    for (std::uint8_t i = 0; i < LIMIT; ++i) {
    const char *status = by_val[i] ? "EMPTY" : "USED";
    std::printf("| %02d | %-5s |n", i, status);
    }

    std::puts("+-------+-------+");
    }






    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Dec 6 at 8:57

























    answered Dec 5 at 9:52









    Toby Speight

    23.3k639111




    23.3k639111












    • Excellent points! ++, I do have to let know that I'm not an expert yet in C++, just in the beginner stages. These points are really useful. One question before I accept this answer. Your opinion on dave's comment above to use sort and remove elements at head?
      – Inian
      Dec 5 at 10:03






    • 1




      That's a possibility - in complexity terms, that's similar to my suggestion. To some extent, the best choice may depend on whether modifying the values is acceptable. If you go with the sorting, don't forget that you may need to remove multiple elements from the head of the vector when there are duplicate values in it (oh, and don't actually remove - that's expensive; just advance over them and leave them in place).
      – Toby Speight
      Dec 5 at 10:27










    • The only thing tmpStr is ever used for is to immediately pass it to printf() with a %s format. So why not just printf() directly? - first make sure the string doesn't contain any characters with a special meaning for printf
      – hanshenrik
      Dec 5 at 15:41










    • @hanshenrik, perhaps I didn't word that very well. I meant we should replace the sprintf into tmpStr with a printf directly to standard output - same (or nearly same) format string. I'll see if I can edit to prevent misunderstandings.
      – Toby Speight
      Dec 5 at 15:49






    • 1




      @Graham, have you compared the generated code using iterators and using indices? Decent optimising compilers will transform indexing operations into pointer operations quite routinely.
      – Toby Speight
      Dec 6 at 13:50


















    • Excellent points! ++, I do have to let know that I'm not an expert yet in C++, just in the beginner stages. These points are really useful. One question before I accept this answer. Your opinion on dave's comment above to use sort and remove elements at head?
      – Inian
      Dec 5 at 10:03






    • 1




      That's a possibility - in complexity terms, that's similar to my suggestion. To some extent, the best choice may depend on whether modifying the values is acceptable. If you go with the sorting, don't forget that you may need to remove multiple elements from the head of the vector when there are duplicate values in it (oh, and don't actually remove - that's expensive; just advance over them and leave them in place).
      – Toby Speight
      Dec 5 at 10:27










    • The only thing tmpStr is ever used for is to immediately pass it to printf() with a %s format. So why not just printf() directly? - first make sure the string doesn't contain any characters with a special meaning for printf
      – hanshenrik
      Dec 5 at 15:41










    • @hanshenrik, perhaps I didn't word that very well. I meant we should replace the sprintf into tmpStr with a printf directly to standard output - same (or nearly same) format string. I'll see if I can edit to prevent misunderstandings.
      – Toby Speight
      Dec 5 at 15:49






    • 1




      @Graham, have you compared the generated code using iterators and using indices? Decent optimising compilers will transform indexing operations into pointer operations quite routinely.
      – Toby Speight
      Dec 6 at 13:50
















    Excellent points! ++, I do have to let know that I'm not an expert yet in C++, just in the beginner stages. These points are really useful. One question before I accept this answer. Your opinion on dave's comment above to use sort and remove elements at head?
    – Inian
    Dec 5 at 10:03




    Excellent points! ++, I do have to let know that I'm not an expert yet in C++, just in the beginner stages. These points are really useful. One question before I accept this answer. Your opinion on dave's comment above to use sort and remove elements at head?
    – Inian
    Dec 5 at 10:03




    1




    1




    That's a possibility - in complexity terms, that's similar to my suggestion. To some extent, the best choice may depend on whether modifying the values is acceptable. If you go with the sorting, don't forget that you may need to remove multiple elements from the head of the vector when there are duplicate values in it (oh, and don't actually remove - that's expensive; just advance over them and leave them in place).
    – Toby Speight
    Dec 5 at 10:27




    That's a possibility - in complexity terms, that's similar to my suggestion. To some extent, the best choice may depend on whether modifying the values is acceptable. If you go with the sorting, don't forget that you may need to remove multiple elements from the head of the vector when there are duplicate values in it (oh, and don't actually remove - that's expensive; just advance over them and leave them in place).
    – Toby Speight
    Dec 5 at 10:27












    The only thing tmpStr is ever used for is to immediately pass it to printf() with a %s format. So why not just printf() directly? - first make sure the string doesn't contain any characters with a special meaning for printf
    – hanshenrik
    Dec 5 at 15:41




    The only thing tmpStr is ever used for is to immediately pass it to printf() with a %s format. So why not just printf() directly? - first make sure the string doesn't contain any characters with a special meaning for printf
    – hanshenrik
    Dec 5 at 15:41












    @hanshenrik, perhaps I didn't word that very well. I meant we should replace the sprintf into tmpStr with a printf directly to standard output - same (or nearly same) format string. I'll see if I can edit to prevent misunderstandings.
    – Toby Speight
    Dec 5 at 15:49




    @hanshenrik, perhaps I didn't word that very well. I meant we should replace the sprintf into tmpStr with a printf directly to standard output - same (or nearly same) format string. I'll see if I can edit to prevent misunderstandings.
    – Toby Speight
    Dec 5 at 15:49




    1




    1




    @Graham, have you compared the generated code using iterators and using indices? Decent optimising compilers will transform indexing operations into pointer operations quite routinely.
    – Toby Speight
    Dec 6 at 13:50




    @Graham, have you compared the generated code using iterators and using indices? Decent optimising compilers will transform indexing operations into pointer operations quite routinely.
    – Toby Speight
    Dec 6 at 13:50












    up vote
    8
    down vote













    I would seriously consider whether std::vector is the right data structure. Do you need the available table numbers to be in the order you got them, or can they be sorted? Do you need to allow duplicate table numbers? (I.e. if I add 42 twice, do I have one entry for 42 or two?)



    If the data can be sorted and de-duplicated, then the most efficient way to find an entry is to store the data in a std::set. Storing the data takes a little longer (O(log n) for the set vs. amortized O(1) for the vector), but finding the data makes up for it (O(log n) for the set vs. O(n) for the vector). If you use a set, make sure to use set's find function (e.g. mySet.find(tableNum) rather than std::find(mySet.begin(),mySet.end(),tableNum)), or there's no benefit.



    On a more general code-style note, since you're using C++ data structures anyway, you should probably use std::string and std::cout instead of char */char and printf.






    share|improve this answer



























      up vote
      8
      down vote













      I would seriously consider whether std::vector is the right data structure. Do you need the available table numbers to be in the order you got them, or can they be sorted? Do you need to allow duplicate table numbers? (I.e. if I add 42 twice, do I have one entry for 42 or two?)



      If the data can be sorted and de-duplicated, then the most efficient way to find an entry is to store the data in a std::set. Storing the data takes a little longer (O(log n) for the set vs. amortized O(1) for the vector), but finding the data makes up for it (O(log n) for the set vs. O(n) for the vector). If you use a set, make sure to use set's find function (e.g. mySet.find(tableNum) rather than std::find(mySet.begin(),mySet.end(),tableNum)), or there's no benefit.



      On a more general code-style note, since you're using C++ data structures anyway, you should probably use std::string and std::cout instead of char */char and printf.






      share|improve this answer

























        up vote
        8
        down vote










        up vote
        8
        down vote









        I would seriously consider whether std::vector is the right data structure. Do you need the available table numbers to be in the order you got them, or can they be sorted? Do you need to allow duplicate table numbers? (I.e. if I add 42 twice, do I have one entry for 42 or two?)



        If the data can be sorted and de-duplicated, then the most efficient way to find an entry is to store the data in a std::set. Storing the data takes a little longer (O(log n) for the set vs. amortized O(1) for the vector), but finding the data makes up for it (O(log n) for the set vs. O(n) for the vector). If you use a set, make sure to use set's find function (e.g. mySet.find(tableNum) rather than std::find(mySet.begin(),mySet.end(),tableNum)), or there's no benefit.



        On a more general code-style note, since you're using C++ data structures anyway, you should probably use std::string and std::cout instead of char */char and printf.






        share|improve this answer














        I would seriously consider whether std::vector is the right data structure. Do you need the available table numbers to be in the order you got them, or can they be sorted? Do you need to allow duplicate table numbers? (I.e. if I add 42 twice, do I have one entry for 42 or two?)



        If the data can be sorted and de-duplicated, then the most efficient way to find an entry is to store the data in a std::set. Storing the data takes a little longer (O(log n) for the set vs. amortized O(1) for the vector), but finding the data makes up for it (O(log n) for the set vs. O(n) for the vector). If you use a set, make sure to use set's find function (e.g. mySet.find(tableNum) rather than std::find(mySet.begin(),mySet.end(),tableNum)), or there's no benefit.



        On a more general code-style note, since you're using C++ data structures anyway, you should probably use std::string and std::cout instead of char */char and printf.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Dec 5 at 15:59









        Toby Speight

        23.3k639111




        23.3k639111










        answered Dec 5 at 15:24









        user186363

        812




        812






















            up vote
            3
            down vote













            A std::set or std::bitset might be the better containers to use for this, but assuming we're going to stick with std::vector, here's some improvements on this.



            See Toby Speight's answer for some of the improvements I used but did not cover.



            Avoid using unsigned numbers



            Don't use unsigned int or uint8_t. Counterintuitively, it's generally best to use signed int types even when a value should never be negative. Operations on mixed signed types come with a lot of problems. It's an unfortunate mistake the standard went with unsigned numbers for container sizes. Unsigned numbers are best kept to bitwise operations.



            Store whether an index is used at that index



            Instead of storing a list of indexes that have been used, let's just store a bool at each possible index. This uses more space, but then no extra searching has to be done, just a single loop. (Notice I've changed m_availTableNums to availTableNums. The prefix m_ is conventionally only used for class members).



            std::vector<bool> availTableNums (80);


            Putting the 80 here will initialize the vector with 80 false values. Now when we fill the table, we choose random indices to insert true which indicates a value of used.



            std::srand(std::time(nullptr));
            for(int i=0; i < 20; i++)
            {
            availTableNums[std::rand() % availTableNums.size()] = true;
            }


            Using modern RNG facilities



            std::rand has no guarantees of the "randomness" of the numbers it generates. It's fine if you want to run up a quick test, but in production code we could modify the previous example to make it a little more robust.



            std::default_random_engine engine {std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
            for(int i=0; i < 20; i++)
            {
            availTableNums[uniform_dist(e1)] = true;
            }


            Here we use std::size_t, even though it's an unsigned type, as it is better to not have to cast from one type to another if we could avoid it. The random number generator will automatically pick from the correct range so we no longer need the modulus operator.



            Use C++ string streams



            There is no need for a temporary string, and C++ streams are often a better choice than the C printf family of functions. We can write the header of the table as:



            std::cout << "+-------+-------+n"
            << "| TABLE | STATE |n"
            << "+-------+-------+" << std::endl;


            The extra <<s are left for style but not necessary. Now for the printing inside the loop, use std::setw(2) to make sure the printed index is within 2 columns. The value of this resets if string is sent to std::cout so it will not effect the rest of the line.



            The use of std::endl will cause the output to always be flushed from the buffer and will have a small performance penalty. However, I am usually quite generous with them unless it is shown they have a sizeable impact on performance. It can make code easier to debug if output is consistently flushed and you know exactly where something went wrong.



            for(auto i=0u; i < availTableNums.size(); i++)
            {
            std::cout << "| " << std::setw(2) << i << " | "
            << (availTableNums[i] ? "USED " : "EMPTY")
            << " |" << std::endl;
            }


            We again choose an unsigned index as comparing two unsigned types is better than comparing mixed signs or casting one to the other. The conditional expression ?: allows us to concisely choose between the two strings depending on whether the index is true or false.





            Putting it all together with only the relavent headers, we get:



            #include <ctime>
            #include <iomanip>
            #include <iostream>
            #include <random>
            #include <string>
            #include <vector>


            int main()
            {
            std::vector<bool> availTableNums (80);

            std::default_random_engine engine {std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
            for(int i=0; i < 20; i++)
            {
            availTableNums[uniform_dist(e1)] = true;
            }

            std::cout << "+-------+-------+n"
            << "| TABLE | STATE |n"
            << "+-------+-------+" << std::endl;

            for(auto i=0u; i < availTableNums.size(); i++)
            {
            std::cout << "| " << std::setw(2) << i << " | "
            << (availTableNums[i] ? "USED " : "EMPTY")
            << " |" << std::endl;
            }

            std::cout << "+-------+-------+" << std::endl;

            return 0;
            }





            share|improve this answer























            • +1 Avoid using unsigned numbers: Really good advice. Hard to get across especially with std::size_t being unsigned but you addressed that point well.
              – Martin York
              Dec 6 at 0:20






            • 1




              Avoid advising using std::endl. Do he really need to pay for a flush? Furthermore, in your TABLE | STATE you can safely keep the 1rst << and remove others.
              – Calak
              Dec 6 at 0:51






            • 2




              I'd consider the unsigned numbers a preference thing. There are some very strong reasons to use unsigned numbers. To call their use a "mistake" in the STL suggests you haven't thought those through.
              – Cort Ammon
              Dec 6 at 1:02










            • @Calak - I put in a note about the performance penalty, but haven't been convinced it's always that bad. I left the extra << in as a style choice but I could probably note that.
              – TheLoneMilkMan
              Dec 6 at 1:29










            • @CortAmmon - Well I'm basing that off of the opinion of Stroustrup and others involved in the STL design. I'm aware there's not a consensus but the issue isn't usually brought up enough.
              – TheLoneMilkMan
              Dec 6 at 1:42















            up vote
            3
            down vote













            A std::set or std::bitset might be the better containers to use for this, but assuming we're going to stick with std::vector, here's some improvements on this.



            See Toby Speight's answer for some of the improvements I used but did not cover.



            Avoid using unsigned numbers



            Don't use unsigned int or uint8_t. Counterintuitively, it's generally best to use signed int types even when a value should never be negative. Operations on mixed signed types come with a lot of problems. It's an unfortunate mistake the standard went with unsigned numbers for container sizes. Unsigned numbers are best kept to bitwise operations.



            Store whether an index is used at that index



            Instead of storing a list of indexes that have been used, let's just store a bool at each possible index. This uses more space, but then no extra searching has to be done, just a single loop. (Notice I've changed m_availTableNums to availTableNums. The prefix m_ is conventionally only used for class members).



            std::vector<bool> availTableNums (80);


            Putting the 80 here will initialize the vector with 80 false values. Now when we fill the table, we choose random indices to insert true which indicates a value of used.



            std::srand(std::time(nullptr));
            for(int i=0; i < 20; i++)
            {
            availTableNums[std::rand() % availTableNums.size()] = true;
            }


            Using modern RNG facilities



            std::rand has no guarantees of the "randomness" of the numbers it generates. It's fine if you want to run up a quick test, but in production code we could modify the previous example to make it a little more robust.



            std::default_random_engine engine {std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
            for(int i=0; i < 20; i++)
            {
            availTableNums[uniform_dist(e1)] = true;
            }


            Here we use std::size_t, even though it's an unsigned type, as it is better to not have to cast from one type to another if we could avoid it. The random number generator will automatically pick from the correct range so we no longer need the modulus operator.



            Use C++ string streams



            There is no need for a temporary string, and C++ streams are often a better choice than the C printf family of functions. We can write the header of the table as:



            std::cout << "+-------+-------+n"
            << "| TABLE | STATE |n"
            << "+-------+-------+" << std::endl;


            The extra <<s are left for style but not necessary. Now for the printing inside the loop, use std::setw(2) to make sure the printed index is within 2 columns. The value of this resets if string is sent to std::cout so it will not effect the rest of the line.



            The use of std::endl will cause the output to always be flushed from the buffer and will have a small performance penalty. However, I am usually quite generous with them unless it is shown they have a sizeable impact on performance. It can make code easier to debug if output is consistently flushed and you know exactly where something went wrong.



            for(auto i=0u; i < availTableNums.size(); i++)
            {
            std::cout << "| " << std::setw(2) << i << " | "
            << (availTableNums[i] ? "USED " : "EMPTY")
            << " |" << std::endl;
            }


            We again choose an unsigned index as comparing two unsigned types is better than comparing mixed signs or casting one to the other. The conditional expression ?: allows us to concisely choose between the two strings depending on whether the index is true or false.





            Putting it all together with only the relavent headers, we get:



            #include <ctime>
            #include <iomanip>
            #include <iostream>
            #include <random>
            #include <string>
            #include <vector>


            int main()
            {
            std::vector<bool> availTableNums (80);

            std::default_random_engine engine {std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
            for(int i=0; i < 20; i++)
            {
            availTableNums[uniform_dist(e1)] = true;
            }

            std::cout << "+-------+-------+n"
            << "| TABLE | STATE |n"
            << "+-------+-------+" << std::endl;

            for(auto i=0u; i < availTableNums.size(); i++)
            {
            std::cout << "| " << std::setw(2) << i << " | "
            << (availTableNums[i] ? "USED " : "EMPTY")
            << " |" << std::endl;
            }

            std::cout << "+-------+-------+" << std::endl;

            return 0;
            }





            share|improve this answer























            • +1 Avoid using unsigned numbers: Really good advice. Hard to get across especially with std::size_t being unsigned but you addressed that point well.
              – Martin York
              Dec 6 at 0:20






            • 1




              Avoid advising using std::endl. Do he really need to pay for a flush? Furthermore, in your TABLE | STATE you can safely keep the 1rst << and remove others.
              – Calak
              Dec 6 at 0:51






            • 2




              I'd consider the unsigned numbers a preference thing. There are some very strong reasons to use unsigned numbers. To call their use a "mistake" in the STL suggests you haven't thought those through.
              – Cort Ammon
              Dec 6 at 1:02










            • @Calak - I put in a note about the performance penalty, but haven't been convinced it's always that bad. I left the extra << in as a style choice but I could probably note that.
              – TheLoneMilkMan
              Dec 6 at 1:29










            • @CortAmmon - Well I'm basing that off of the opinion of Stroustrup and others involved in the STL design. I'm aware there's not a consensus but the issue isn't usually brought up enough.
              – TheLoneMilkMan
              Dec 6 at 1:42













            up vote
            3
            down vote










            up vote
            3
            down vote









            A std::set or std::bitset might be the better containers to use for this, but assuming we're going to stick with std::vector, here's some improvements on this.



            See Toby Speight's answer for some of the improvements I used but did not cover.



            Avoid using unsigned numbers



            Don't use unsigned int or uint8_t. Counterintuitively, it's generally best to use signed int types even when a value should never be negative. Operations on mixed signed types come with a lot of problems. It's an unfortunate mistake the standard went with unsigned numbers for container sizes. Unsigned numbers are best kept to bitwise operations.



            Store whether an index is used at that index



            Instead of storing a list of indexes that have been used, let's just store a bool at each possible index. This uses more space, but then no extra searching has to be done, just a single loop. (Notice I've changed m_availTableNums to availTableNums. The prefix m_ is conventionally only used for class members).



            std::vector<bool> availTableNums (80);


            Putting the 80 here will initialize the vector with 80 false values. Now when we fill the table, we choose random indices to insert true which indicates a value of used.



            std::srand(std::time(nullptr));
            for(int i=0; i < 20; i++)
            {
            availTableNums[std::rand() % availTableNums.size()] = true;
            }


            Using modern RNG facilities



            std::rand has no guarantees of the "randomness" of the numbers it generates. It's fine if you want to run up a quick test, but in production code we could modify the previous example to make it a little more robust.



            std::default_random_engine engine {std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
            for(int i=0; i < 20; i++)
            {
            availTableNums[uniform_dist(e1)] = true;
            }


            Here we use std::size_t, even though it's an unsigned type, as it is better to not have to cast from one type to another if we could avoid it. The random number generator will automatically pick from the correct range so we no longer need the modulus operator.



            Use C++ string streams



            There is no need for a temporary string, and C++ streams are often a better choice than the C printf family of functions. We can write the header of the table as:



            std::cout << "+-------+-------+n"
            << "| TABLE | STATE |n"
            << "+-------+-------+" << std::endl;


            The extra <<s are left for style but not necessary. Now for the printing inside the loop, use std::setw(2) to make sure the printed index is within 2 columns. The value of this resets if string is sent to std::cout so it will not effect the rest of the line.



            The use of std::endl will cause the output to always be flushed from the buffer and will have a small performance penalty. However, I am usually quite generous with them unless it is shown they have a sizeable impact on performance. It can make code easier to debug if output is consistently flushed and you know exactly where something went wrong.



            for(auto i=0u; i < availTableNums.size(); i++)
            {
            std::cout << "| " << std::setw(2) << i << " | "
            << (availTableNums[i] ? "USED " : "EMPTY")
            << " |" << std::endl;
            }


            We again choose an unsigned index as comparing two unsigned types is better than comparing mixed signs or casting one to the other. The conditional expression ?: allows us to concisely choose between the two strings depending on whether the index is true or false.





            Putting it all together with only the relavent headers, we get:



            #include <ctime>
            #include <iomanip>
            #include <iostream>
            #include <random>
            #include <string>
            #include <vector>


            int main()
            {
            std::vector<bool> availTableNums (80);

            std::default_random_engine engine {std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
            for(int i=0; i < 20; i++)
            {
            availTableNums[uniform_dist(e1)] = true;
            }

            std::cout << "+-------+-------+n"
            << "| TABLE | STATE |n"
            << "+-------+-------+" << std::endl;

            for(auto i=0u; i < availTableNums.size(); i++)
            {
            std::cout << "| " << std::setw(2) << i << " | "
            << (availTableNums[i] ? "USED " : "EMPTY")
            << " |" << std::endl;
            }

            std::cout << "+-------+-------+" << std::endl;

            return 0;
            }





            share|improve this answer














            A std::set or std::bitset might be the better containers to use for this, but assuming we're going to stick with std::vector, here's some improvements on this.



            See Toby Speight's answer for some of the improvements I used but did not cover.



            Avoid using unsigned numbers



            Don't use unsigned int or uint8_t. Counterintuitively, it's generally best to use signed int types even when a value should never be negative. Operations on mixed signed types come with a lot of problems. It's an unfortunate mistake the standard went with unsigned numbers for container sizes. Unsigned numbers are best kept to bitwise operations.



            Store whether an index is used at that index



            Instead of storing a list of indexes that have been used, let's just store a bool at each possible index. This uses more space, but then no extra searching has to be done, just a single loop. (Notice I've changed m_availTableNums to availTableNums. The prefix m_ is conventionally only used for class members).



            std::vector<bool> availTableNums (80);


            Putting the 80 here will initialize the vector with 80 false values. Now when we fill the table, we choose random indices to insert true which indicates a value of used.



            std::srand(std::time(nullptr));
            for(int i=0; i < 20; i++)
            {
            availTableNums[std::rand() % availTableNums.size()] = true;
            }


            Using modern RNG facilities



            std::rand has no guarantees of the "randomness" of the numbers it generates. It's fine if you want to run up a quick test, but in production code we could modify the previous example to make it a little more robust.



            std::default_random_engine engine {std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
            for(int i=0; i < 20; i++)
            {
            availTableNums[uniform_dist(e1)] = true;
            }


            Here we use std::size_t, even though it's an unsigned type, as it is better to not have to cast from one type to another if we could avoid it. The random number generator will automatically pick from the correct range so we no longer need the modulus operator.



            Use C++ string streams



            There is no need for a temporary string, and C++ streams are often a better choice than the C printf family of functions. We can write the header of the table as:



            std::cout << "+-------+-------+n"
            << "| TABLE | STATE |n"
            << "+-------+-------+" << std::endl;


            The extra <<s are left for style but not necessary. Now for the printing inside the loop, use std::setw(2) to make sure the printed index is within 2 columns. The value of this resets if string is sent to std::cout so it will not effect the rest of the line.



            The use of std::endl will cause the output to always be flushed from the buffer and will have a small performance penalty. However, I am usually quite generous with them unless it is shown they have a sizeable impact on performance. It can make code easier to debug if output is consistently flushed and you know exactly where something went wrong.



            for(auto i=0u; i < availTableNums.size(); i++)
            {
            std::cout << "| " << std::setw(2) << i << " | "
            << (availTableNums[i] ? "USED " : "EMPTY")
            << " |" << std::endl;
            }


            We again choose an unsigned index as comparing two unsigned types is better than comparing mixed signs or casting one to the other. The conditional expression ?: allows us to concisely choose between the two strings depending on whether the index is true or false.





            Putting it all together with only the relavent headers, we get:



            #include <ctime>
            #include <iomanip>
            #include <iostream>
            #include <random>
            #include <string>
            #include <vector>


            int main()
            {
            std::vector<bool> availTableNums (80);

            std::default_random_engine engine {std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, availTableNums.size()};
            for(int i=0; i < 20; i++)
            {
            availTableNums[uniform_dist(e1)] = true;
            }

            std::cout << "+-------+-------+n"
            << "| TABLE | STATE |n"
            << "+-------+-------+" << std::endl;

            for(auto i=0u; i < availTableNums.size(); i++)
            {
            std::cout << "| " << std::setw(2) << i << " | "
            << (availTableNums[i] ? "USED " : "EMPTY")
            << " |" << std::endl;
            }

            std::cout << "+-------+-------+" << std::endl;

            return 0;
            }






            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Dec 6 at 1:31

























            answered Dec 5 at 23:04









            TheLoneMilkMan

            1312




            1312












            • +1 Avoid using unsigned numbers: Really good advice. Hard to get across especially with std::size_t being unsigned but you addressed that point well.
              – Martin York
              Dec 6 at 0:20






            • 1




              Avoid advising using std::endl. Do he really need to pay for a flush? Furthermore, in your TABLE | STATE you can safely keep the 1rst << and remove others.
              – Calak
              Dec 6 at 0:51






            • 2




              I'd consider the unsigned numbers a preference thing. There are some very strong reasons to use unsigned numbers. To call their use a "mistake" in the STL suggests you haven't thought those through.
              – Cort Ammon
              Dec 6 at 1:02










            • @Calak - I put in a note about the performance penalty, but haven't been convinced it's always that bad. I left the extra << in as a style choice but I could probably note that.
              – TheLoneMilkMan
              Dec 6 at 1:29










            • @CortAmmon - Well I'm basing that off of the opinion of Stroustrup and others involved in the STL design. I'm aware there's not a consensus but the issue isn't usually brought up enough.
              – TheLoneMilkMan
              Dec 6 at 1:42


















            • +1 Avoid using unsigned numbers: Really good advice. Hard to get across especially with std::size_t being unsigned but you addressed that point well.
              – Martin York
              Dec 6 at 0:20






            • 1




              Avoid advising using std::endl. Do he really need to pay for a flush? Furthermore, in your TABLE | STATE you can safely keep the 1rst << and remove others.
              – Calak
              Dec 6 at 0:51






            • 2




              I'd consider the unsigned numbers a preference thing. There are some very strong reasons to use unsigned numbers. To call their use a "mistake" in the STL suggests you haven't thought those through.
              – Cort Ammon
              Dec 6 at 1:02










            • @Calak - I put in a note about the performance penalty, but haven't been convinced it's always that bad. I left the extra << in as a style choice but I could probably note that.
              – TheLoneMilkMan
              Dec 6 at 1:29










            • @CortAmmon - Well I'm basing that off of the opinion of Stroustrup and others involved in the STL design. I'm aware there's not a consensus but the issue isn't usually brought up enough.
              – TheLoneMilkMan
              Dec 6 at 1:42
















            +1 Avoid using unsigned numbers: Really good advice. Hard to get across especially with std::size_t being unsigned but you addressed that point well.
            – Martin York
            Dec 6 at 0:20




            +1 Avoid using unsigned numbers: Really good advice. Hard to get across especially with std::size_t being unsigned but you addressed that point well.
            – Martin York
            Dec 6 at 0:20




            1




            1




            Avoid advising using std::endl. Do he really need to pay for a flush? Furthermore, in your TABLE | STATE you can safely keep the 1rst << and remove others.
            – Calak
            Dec 6 at 0:51




            Avoid advising using std::endl. Do he really need to pay for a flush? Furthermore, in your TABLE | STATE you can safely keep the 1rst << and remove others.
            – Calak
            Dec 6 at 0:51




            2




            2




            I'd consider the unsigned numbers a preference thing. There are some very strong reasons to use unsigned numbers. To call their use a "mistake" in the STL suggests you haven't thought those through.
            – Cort Ammon
            Dec 6 at 1:02




            I'd consider the unsigned numbers a preference thing. There are some very strong reasons to use unsigned numbers. To call their use a "mistake" in the STL suggests you haven't thought those through.
            – Cort Ammon
            Dec 6 at 1:02












            @Calak - I put in a note about the performance penalty, but haven't been convinced it's always that bad. I left the extra << in as a style choice but I could probably note that.
            – TheLoneMilkMan
            Dec 6 at 1:29




            @Calak - I put in a note about the performance penalty, but haven't been convinced it's always that bad. I left the extra << in as a style choice but I could probably note that.
            – TheLoneMilkMan
            Dec 6 at 1:29












            @CortAmmon - Well I'm basing that off of the opinion of Stroustrup and others involved in the STL design. I'm aware there's not a consensus but the issue isn't usually brought up enough.
            – TheLoneMilkMan
            Dec 6 at 1:42




            @CortAmmon - Well I'm basing that off of the opinion of Stroustrup and others involved in the STL design. I'm aware there's not a consensus but the issue isn't usually brought up enough.
            – TheLoneMilkMan
            Dec 6 at 1:42










            up vote
            2
            down vote













            The above are good advice. Do you need exactly N values true? If so, the above loop might be better as (pieces left out for simplicity):



            const std::size_t N = 80;
            const std::size_t K = 20;

            std::unordered_set<std::size_t> exists;

            // Fill exists with K random values between [0 .. N], inclusive.
            std::default_random_engine rng{std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, N};
            while (exists.size() < K) {
            exists.insert(uniform_dist(rng));
            }

            // Print each value in order, in a table.
            for(auto i = uniform_dist.min(); i <= uniform_dist.max(); ++i) {
            std::cout << "| " << std::setw(2) << i << " | "
            << ((exists.count(i) == 0) ? "EMPTY" : "USED ")
            << " |" << std::endl;
            }





            share|improve this answer





















            • (Nice catch, with the exception of picking the symbolic name for the total number in your question about the requirements instead of that for the "true sub-set".)
              – greybeard
              Dec 8 at 7:56















            up vote
            2
            down vote













            The above are good advice. Do you need exactly N values true? If so, the above loop might be better as (pieces left out for simplicity):



            const std::size_t N = 80;
            const std::size_t K = 20;

            std::unordered_set<std::size_t> exists;

            // Fill exists with K random values between [0 .. N], inclusive.
            std::default_random_engine rng{std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, N};
            while (exists.size() < K) {
            exists.insert(uniform_dist(rng));
            }

            // Print each value in order, in a table.
            for(auto i = uniform_dist.min(); i <= uniform_dist.max(); ++i) {
            std::cout << "| " << std::setw(2) << i << " | "
            << ((exists.count(i) == 0) ? "EMPTY" : "USED ")
            << " |" << std::endl;
            }





            share|improve this answer





















            • (Nice catch, with the exception of picking the symbolic name for the total number in your question about the requirements instead of that for the "true sub-set".)
              – greybeard
              Dec 8 at 7:56













            up vote
            2
            down vote










            up vote
            2
            down vote









            The above are good advice. Do you need exactly N values true? If so, the above loop might be better as (pieces left out for simplicity):



            const std::size_t N = 80;
            const std::size_t K = 20;

            std::unordered_set<std::size_t> exists;

            // Fill exists with K random values between [0 .. N], inclusive.
            std::default_random_engine rng{std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, N};
            while (exists.size() < K) {
            exists.insert(uniform_dist(rng));
            }

            // Print each value in order, in a table.
            for(auto i = uniform_dist.min(); i <= uniform_dist.max(); ++i) {
            std::cout << "| " << std::setw(2) << i << " | "
            << ((exists.count(i) == 0) ? "EMPTY" : "USED ")
            << " |" << std::endl;
            }





            share|improve this answer












            The above are good advice. Do you need exactly N values true? If so, the above loop might be better as (pieces left out for simplicity):



            const std::size_t N = 80;
            const std::size_t K = 20;

            std::unordered_set<std::size_t> exists;

            // Fill exists with K random values between [0 .. N], inclusive.
            std::default_random_engine rng{std::time()};
            std::uniform_int_distribution<std::size_t> uniform_dist {0, N};
            while (exists.size() < K) {
            exists.insert(uniform_dist(rng));
            }

            // Print each value in order, in a table.
            for(auto i = uniform_dist.min(); i <= uniform_dist.max(); ++i) {
            std::cout << "| " << std::setw(2) << i << " | "
            << ((exists.count(i) == 0) ? "EMPTY" : "USED ")
            << " |" << std::endl;
            }






            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Dec 8 at 3:16









            Laramie

            211




            211












            • (Nice catch, with the exception of picking the symbolic name for the total number in your question about the requirements instead of that for the "true sub-set".)
              – greybeard
              Dec 8 at 7:56


















            • (Nice catch, with the exception of picking the symbolic name for the total number in your question about the requirements instead of that for the "true sub-set".)
              – greybeard
              Dec 8 at 7:56
















            (Nice catch, with the exception of picking the symbolic name for the total number in your question about the requirements instead of that for the "true sub-set".)
            – greybeard
            Dec 8 at 7:56




            (Nice catch, with the exception of picking the symbolic name for the total number in your question about the requirements instead of that for the "true sub-set".)
            – greybeard
            Dec 8 at 7:56


















            draft saved

            draft discarded




















































            Thanks for contributing an answer to Code Review Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209055%2fmost-efficient-way-to-find-an-entry-in-a-c-vector%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            flock() on closed filehandle LOCK_FILE at /usr/bin/apt-mirror

            Mangá

            Eduardo VII do Reino Unido