Trim NA values from left and right of Rcpp::NumericVector
up vote
2
down vote
favorite
I receive data with NA values. I would like to trim off the leading and trailing NA values, but to leave the interior NA values in place.
I've written the below in Rcpp, but it is very much a hand rolled solution -- and i'm alive to the fact that i'm going to make mistakes doing things this way.
What is the idiomatic cpp / Rcpp (sugar?) way of doing something like this?
#include<Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector X) {
// find first and last non-NA value
double * F = X.begin();
double * pBegin = X.begin();
double * pEnd = X.end();
double * L = X.end() - 1; // note you must decrement by one as .end() is AFTER
// look for the values
while( F < pEnd && NumericVector::is_na(*F)) F++;
while( L > pBegin && NumericVector::is_na(*L)) L--;
// create trimmed X vector
Rcpp::NumericVector Xtrim(L - F + 1);
for( int i = 0; i < Xtrim.size(); i++) {
Xtrim[i] = *(F + i);
}
return Xtrim;
}
/*** R
x <- c(NA, NA, NA, 1:4, NA, 6:8, NA, 10, NA, NA, NA)
trimNA(x)
*/
c++ rcpp
add a comment |
up vote
2
down vote
favorite
I receive data with NA values. I would like to trim off the leading and trailing NA values, but to leave the interior NA values in place.
I've written the below in Rcpp, but it is very much a hand rolled solution -- and i'm alive to the fact that i'm going to make mistakes doing things this way.
What is the idiomatic cpp / Rcpp (sugar?) way of doing something like this?
#include<Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector X) {
// find first and last non-NA value
double * F = X.begin();
double * pBegin = X.begin();
double * pEnd = X.end();
double * L = X.end() - 1; // note you must decrement by one as .end() is AFTER
// look for the values
while( F < pEnd && NumericVector::is_na(*F)) F++;
while( L > pBegin && NumericVector::is_na(*L)) L--;
// create trimmed X vector
Rcpp::NumericVector Xtrim(L - F + 1);
for( int i = 0; i < Xtrim.size(); i++) {
Xtrim[i] = *(F + i);
}
return Xtrim;
}
/*** R
x <- c(NA, NA, NA, 1:4, NA, 6:8, NA, 10, NA, NA, NA)
trimNA(x)
*/
c++ rcpp
I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday
add a comment |
up vote
2
down vote
favorite
up vote
2
down vote
favorite
I receive data with NA values. I would like to trim off the leading and trailing NA values, but to leave the interior NA values in place.
I've written the below in Rcpp, but it is very much a hand rolled solution -- and i'm alive to the fact that i'm going to make mistakes doing things this way.
What is the idiomatic cpp / Rcpp (sugar?) way of doing something like this?
#include<Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector X) {
// find first and last non-NA value
double * F = X.begin();
double * pBegin = X.begin();
double * pEnd = X.end();
double * L = X.end() - 1; // note you must decrement by one as .end() is AFTER
// look for the values
while( F < pEnd && NumericVector::is_na(*F)) F++;
while( L > pBegin && NumericVector::is_na(*L)) L--;
// create trimmed X vector
Rcpp::NumericVector Xtrim(L - F + 1);
for( int i = 0; i < Xtrim.size(); i++) {
Xtrim[i] = *(F + i);
}
return Xtrim;
}
/*** R
x <- c(NA, NA, NA, 1:4, NA, 6:8, NA, 10, NA, NA, NA)
trimNA(x)
*/
c++ rcpp
I receive data with NA values. I would like to trim off the leading and trailing NA values, but to leave the interior NA values in place.
I've written the below in Rcpp, but it is very much a hand rolled solution -- and i'm alive to the fact that i'm going to make mistakes doing things this way.
What is the idiomatic cpp / Rcpp (sugar?) way of doing something like this?
#include<Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector X) {
// find first and last non-NA value
double * F = X.begin();
double * pBegin = X.begin();
double * pEnd = X.end();
double * L = X.end() - 1; // note you must decrement by one as .end() is AFTER
// look for the values
while( F < pEnd && NumericVector::is_na(*F)) F++;
while( L > pBegin && NumericVector::is_na(*L)) L--;
// create trimmed X vector
Rcpp::NumericVector Xtrim(L - F + 1);
for( int i = 0; i < Xtrim.size(); i++) {
Xtrim[i] = *(F + i);
}
return Xtrim;
}
/*** R
x <- c(NA, NA, NA, 1:4, NA, 6:8, NA, 10, NA, NA, NA)
trimNA(x)
*/
c++ rcpp
c++ rcpp
edited yesterday
Incomputable
6,42621453
6,42621453
asked yesterday
ricardo
180210
180210
I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday
add a comment |
I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday
I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday
I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday
add a comment |
2 Answers
2
active
oldest
votes
up vote
2
down vote
accepted
I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:
- Accept the argument as a const reference, so that we don't copy in to the function, or
- Modify our copy of the argument in-place, then return it as result (rather than copying again to
Xtrim).
Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.
The former option is simplest, if we're allowed to change the signature:
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), last_real_value,
&Rcpp::NumericVector::is_na);
return {first_real_value, last_real_value};
}
Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.
For the latter option, we could use the vector's erase() method to remove the rightmost values, and then the leftmost ones (removing the leftmost first would work, but could require moving more data, making it less efficient).
Here's how that might work (untested):
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
x.erase(last_real_value, x.end());
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), x.end(),
&Rcpp::NumericVector::is_na);
x.erase(x.begin(), first_real_value);
return x;
}
Other review items
The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:
while( L > F && NumericVector::is_na(*L)) L--;
Use the standard indexing operator:
Xtrim[i] = F[i];
(But it's better to use std::copy() instead of the loop).
Use actual iterators, rather than assuming you'll get bare pointers.
Don't include all of <Rcpp.h>, as that makes compilation really slow. Unfortunately, Rcpp's headers are broken because they don't include all they need - see my examples for workaround (that's still much faster to compile than the original).
I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday
add a comment |
up vote
2
down vote
Code Review
Code is good, except the naming. There are some pieces which could be improved
Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.
Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.
Better (?) approach
The problem seems to be somewhat specific case of trimming in general. Note that trimming from the right and trimming from the left are symmetric, and std::reverse_iterator "normalizes" trimming from right (great improvements thanks to @Toby).
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).
Full code:
#include <utility>
#include <iterator>
#include <algorithm>
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
#include <vector>
#include <stdexcept>
int main() {
std::vector<int> initial_values{2, 2, 3, 4, 5, 6};
auto predicate = (int x) { return x % 2 == 0; };
auto [new_first, new_last] = trim(initial_values.begin(),
initial_values.end(),
predicate);
std::vector<int> resulting_values(new_first, new_last);
std::vector<int> correct_result{3, 4, 5};
if (resulting_values != correct_result) {
throw std::logic_error("incorrect trimming occured");
}
}
Demo on Wandbox.
To make trimming a little bit easier to use, one might write something like this:
template <typename Container, typename Predicate>
Container trim_copy(Container&& container, Predicate predicate) {
auto [new_first, new_last] = trim(container.begin(), container.end(),
predicate);
return Container(new_first, new_last);
}
and pass a lambda calling the NumericVector::is_na:
auto predicate = (auto x) { return NumericVector::is_na(x); };
or may be pointer to the function directly.
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
2
down vote
accepted
I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:
- Accept the argument as a const reference, so that we don't copy in to the function, or
- Modify our copy of the argument in-place, then return it as result (rather than copying again to
Xtrim).
Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.
The former option is simplest, if we're allowed to change the signature:
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), last_real_value,
&Rcpp::NumericVector::is_na);
return {first_real_value, last_real_value};
}
Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.
For the latter option, we could use the vector's erase() method to remove the rightmost values, and then the leftmost ones (removing the leftmost first would work, but could require moving more data, making it less efficient).
Here's how that might work (untested):
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
x.erase(last_real_value, x.end());
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), x.end(),
&Rcpp::NumericVector::is_na);
x.erase(x.begin(), first_real_value);
return x;
}
Other review items
The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:
while( L > F && NumericVector::is_na(*L)) L--;
Use the standard indexing operator:
Xtrim[i] = F[i];
(But it's better to use std::copy() instead of the loop).
Use actual iterators, rather than assuming you'll get bare pointers.
Don't include all of <Rcpp.h>, as that makes compilation really slow. Unfortunately, Rcpp's headers are broken because they don't include all they need - see my examples for workaround (that's still much faster to compile than the original).
I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday
add a comment |
up vote
2
down vote
accepted
I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:
- Accept the argument as a const reference, so that we don't copy in to the function, or
- Modify our copy of the argument in-place, then return it as result (rather than copying again to
Xtrim).
Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.
The former option is simplest, if we're allowed to change the signature:
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), last_real_value,
&Rcpp::NumericVector::is_na);
return {first_real_value, last_real_value};
}
Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.
For the latter option, we could use the vector's erase() method to remove the rightmost values, and then the leftmost ones (removing the leftmost first would work, but could require moving more data, making it less efficient).
Here's how that might work (untested):
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
x.erase(last_real_value, x.end());
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), x.end(),
&Rcpp::NumericVector::is_na);
x.erase(x.begin(), first_real_value);
return x;
}
Other review items
The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:
while( L > F && NumericVector::is_na(*L)) L--;
Use the standard indexing operator:
Xtrim[i] = F[i];
(But it's better to use std::copy() instead of the loop).
Use actual iterators, rather than assuming you'll get bare pointers.
Don't include all of <Rcpp.h>, as that makes compilation really slow. Unfortunately, Rcpp's headers are broken because they don't include all they need - see my examples for workaround (that's still much faster to compile than the original).
I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday
add a comment |
up vote
2
down vote
accepted
up vote
2
down vote
accepted
I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:
- Accept the argument as a const reference, so that we don't copy in to the function, or
- Modify our copy of the argument in-place, then return it as result (rather than copying again to
Xtrim).
Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.
The former option is simplest, if we're allowed to change the signature:
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), last_real_value,
&Rcpp::NumericVector::is_na);
return {first_real_value, last_real_value};
}
Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.
For the latter option, we could use the vector's erase() method to remove the rightmost values, and then the leftmost ones (removing the leftmost first would work, but could require moving more data, making it less efficient).
Here's how that might work (untested):
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
x.erase(last_real_value, x.end());
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), x.end(),
&Rcpp::NumericVector::is_na);
x.erase(x.begin(), first_real_value);
return x;
}
Other review items
The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:
while( L > F && NumericVector::is_na(*L)) L--;
Use the standard indexing operator:
Xtrim[i] = F[i];
(But it's better to use std::copy() instead of the loop).
Use actual iterators, rather than assuming you'll get bare pointers.
Don't include all of <Rcpp.h>, as that makes compilation really slow. Unfortunately, Rcpp's headers are broken because they don't include all they need - see my examples for workaround (that's still much faster to compile than the original).
I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:
- Accept the argument as a const reference, so that we don't copy in to the function, or
- Modify our copy of the argument in-place, then return it as result (rather than copying again to
Xtrim).
Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.
The former option is simplest, if we're allowed to change the signature:
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), last_real_value,
&Rcpp::NumericVector::is_na);
return {first_real_value, last_real_value};
}
Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.
For the latter option, we could use the vector's erase() method to remove the rightmost values, and then the leftmost ones (removing the leftmost first would work, but could require moving more data, making it less efficient).
Here's how that might work (untested):
// The Rcpp Vector header is incomplete; it needs these dependencies. :-(
#include <RcppCommon.h>
#include <Rcpp/RObject.h>
#include <Rcpp/Dimension.h>
#include <Rcpp/Vector.h>
#include <algorithm>
#include <iterator>
// [[Rcpp::export]]
Rcpp::NumericVector trimNA(Rcpp::NumericVector x)
{
// remove trailing NA values
auto rbegin = std::make_reverse_iterator(x.end());
auto rend = std::make_reverse_iterator(x.begin());
auto last_real_value = std::find_if_not(rbegin, rend,
&Rcpp::NumericVector::is_na).base();
x.erase(last_real_value, x.end());
// remove leading NA values
auto first_real_value = std::find_if_not(x.begin(), x.end(),
&Rcpp::NumericVector::is_na);
x.erase(x.begin(), first_real_value);
return x;
}
Other review items
The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:
while( L > F && NumericVector::is_na(*L)) L--;
Use the standard indexing operator:
Xtrim[i] = F[i];
(But it's better to use std::copy() instead of the loop).
Use actual iterators, rather than assuming you'll get bare pointers.
Don't include all of <Rcpp.h>, as that makes compilation really slow. Unfortunately, Rcpp's headers are broken because they don't include all they need - see my examples for workaround (that's still much faster to compile than the original).
edited yesterday
answered yesterday
Toby Speight
22k536108
22k536108
I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday
add a comment |
I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday
I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday
I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday
add a comment |
up vote
2
down vote
Code Review
Code is good, except the naming. There are some pieces which could be improved
Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.
Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.
Better (?) approach
The problem seems to be somewhat specific case of trimming in general. Note that trimming from the right and trimming from the left are symmetric, and std::reverse_iterator "normalizes" trimming from right (great improvements thanks to @Toby).
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).
Full code:
#include <utility>
#include <iterator>
#include <algorithm>
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
#include <vector>
#include <stdexcept>
int main() {
std::vector<int> initial_values{2, 2, 3, 4, 5, 6};
auto predicate = (int x) { return x % 2 == 0; };
auto [new_first, new_last] = trim(initial_values.begin(),
initial_values.end(),
predicate);
std::vector<int> resulting_values(new_first, new_last);
std::vector<int> correct_result{3, 4, 5};
if (resulting_values != correct_result) {
throw std::logic_error("incorrect trimming occured");
}
}
Demo on Wandbox.
To make trimming a little bit easier to use, one might write something like this:
template <typename Container, typename Predicate>
Container trim_copy(Container&& container, Predicate predicate) {
auto [new_first, new_last] = trim(container.begin(), container.end(),
predicate);
return Container(new_first, new_last);
}
and pass a lambda calling the NumericVector::is_na:
auto predicate = (auto x) { return NumericVector::is_na(x); };
or may be pointer to the function directly.
add a comment |
up vote
2
down vote
Code Review
Code is good, except the naming. There are some pieces which could be improved
Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.
Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.
Better (?) approach
The problem seems to be somewhat specific case of trimming in general. Note that trimming from the right and trimming from the left are symmetric, and std::reverse_iterator "normalizes" trimming from right (great improvements thanks to @Toby).
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).
Full code:
#include <utility>
#include <iterator>
#include <algorithm>
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
#include <vector>
#include <stdexcept>
int main() {
std::vector<int> initial_values{2, 2, 3, 4, 5, 6};
auto predicate = (int x) { return x % 2 == 0; };
auto [new_first, new_last] = trim(initial_values.begin(),
initial_values.end(),
predicate);
std::vector<int> resulting_values(new_first, new_last);
std::vector<int> correct_result{3, 4, 5};
if (resulting_values != correct_result) {
throw std::logic_error("incorrect trimming occured");
}
}
Demo on Wandbox.
To make trimming a little bit easier to use, one might write something like this:
template <typename Container, typename Predicate>
Container trim_copy(Container&& container, Predicate predicate) {
auto [new_first, new_last] = trim(container.begin(), container.end(),
predicate);
return Container(new_first, new_last);
}
and pass a lambda calling the NumericVector::is_na:
auto predicate = (auto x) { return NumericVector::is_na(x); };
or may be pointer to the function directly.
add a comment |
up vote
2
down vote
up vote
2
down vote
Code Review
Code is good, except the naming. There are some pieces which could be improved
Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.
Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.
Better (?) approach
The problem seems to be somewhat specific case of trimming in general. Note that trimming from the right and trimming from the left are symmetric, and std::reverse_iterator "normalizes" trimming from right (great improvements thanks to @Toby).
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).
Full code:
#include <utility>
#include <iterator>
#include <algorithm>
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
#include <vector>
#include <stdexcept>
int main() {
std::vector<int> initial_values{2, 2, 3, 4, 5, 6};
auto predicate = (int x) { return x % 2 == 0; };
auto [new_first, new_last] = trim(initial_values.begin(),
initial_values.end(),
predicate);
std::vector<int> resulting_values(new_first, new_last);
std::vector<int> correct_result{3, 4, 5};
if (resulting_values != correct_result) {
throw std::logic_error("incorrect trimming occured");
}
}
Demo on Wandbox.
To make trimming a little bit easier to use, one might write something like this:
template <typename Container, typename Predicate>
Container trim_copy(Container&& container, Predicate predicate) {
auto [new_first, new_last] = trim(container.begin(), container.end(),
predicate);
return Container(new_first, new_last);
}
and pass a lambda calling the NumericVector::is_na:
auto predicate = (auto x) { return NumericVector::is_na(x); };
or may be pointer to the function directly.
Code Review
Code is good, except the naming. There are some pieces which could be improved
Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.
Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.
Better (?) approach
The problem seems to be somewhat specific case of trimming in general. Note that trimming from the right and trimming from the left are symmetric, and std::reverse_iterator "normalizes" trimming from right (great improvements thanks to @Toby).
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).
Full code:
#include <utility>
#include <iterator>
#include <algorithm>
template <typename BidirIterator,
typename Predicate>
auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {
auto left_edge = std::find_if_not(first, last, predicate);
auto right_edge = std::find_if_not(std::reverse_iterator(last),
std::reverse_iterator(left_edge),
predicate).base();
return std::pair{left_edge, right_edge};
}
#include <vector>
#include <stdexcept>
int main() {
std::vector<int> initial_values{2, 2, 3, 4, 5, 6};
auto predicate = (int x) { return x % 2 == 0; };
auto [new_first, new_last] = trim(initial_values.begin(),
initial_values.end(),
predicate);
std::vector<int> resulting_values(new_first, new_last);
std::vector<int> correct_result{3, 4, 5};
if (resulting_values != correct_result) {
throw std::logic_error("incorrect trimming occured");
}
}
Demo on Wandbox.
To make trimming a little bit easier to use, one might write something like this:
template <typename Container, typename Predicate>
Container trim_copy(Container&& container, Predicate predicate) {
auto [new_first, new_last] = trim(container.begin(), container.end(),
predicate);
return Container(new_first, new_last);
}
and pass a lambda calling the NumericVector::is_na:
auto predicate = (auto x) { return NumericVector::is_na(x); };
or may be pointer to the function directly.
edited yesterday
Toby Speight
22k536108
22k536108
answered yesterday
Incomputable
6,42621453
6,42621453
add a comment |
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f207955%2ftrim-na-values-from-left-and-right-of-rcppnumericvector%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday