Trim NA values from left and right of Rcpp::NumericVector

up vote
2
down vote

favorite

I receive data with NA values. I would like to trim off the leading and trailing NA values, but to leave the interior NA values in place.

I've written the below in Rcpp, but it is very much a hand rolled solution -- and i'm alive to the fact that i'm going to make mistakes doing things this way.

What is the idiomatic cpp / Rcpp (sugar?) way of doing something like this?

#include<Rcpp.h>

using namespace Rcpp;



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector X) {



    // find first and last non-NA value



    double * F = X.begin();

    double * pBegin = X.begin();

    double * pEnd = X.end();

    double * L = X.end() - 1; // note you must decrement by one as .end() is AFTER



    // look for the values

    while( F < pEnd && NumericVector::is_na(*F)) F++;

    while( L > pBegin && NumericVector::is_na(*L)) L--;





    // create trimmed X vector

    Rcpp::NumericVector Xtrim(L - F + 1);



    for( int i = 0; i < Xtrim.size(); i++) {

        Xtrim[i] = *(F + i);

    }



    return Xtrim;

}





/*** R



x <- c(NA, NA, NA, 1:4, NA, 6:8, NA, 10, NA, NA, NA)

trimNA(x)

*/

edited yesterday

Incomputable

6,42621453

asked yesterday

ricardo

180210

I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday

add a comment |

up vote
2
down vote

favorite

I receive data with NA values. I would like to trim off the leading and trailing NA values, but to leave the interior NA values in place.

I've written the below in Rcpp, but it is very much a hand rolled solution -- and i'm alive to the fact that i'm going to make mistakes doing things this way.

What is the idiomatic cpp / Rcpp (sugar?) way of doing something like this?

#include<Rcpp.h>

using namespace Rcpp;



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector X) {



    // find first and last non-NA value



    double * F = X.begin();

    double * pBegin = X.begin();

    double * pEnd = X.end();

    double * L = X.end() - 1; // note you must decrement by one as .end() is AFTER



    // look for the values

    while( F < pEnd && NumericVector::is_na(*F)) F++;

    while( L > pBegin && NumericVector::is_na(*L)) L--;





    // create trimmed X vector

    Rcpp::NumericVector Xtrim(L - F + 1);



    for( int i = 0; i < Xtrim.size(); i++) {

        Xtrim[i] = *(F + i);

    }



    return Xtrim;

}





/*** R



x <- c(NA, NA, NA, 1:4, NA, 6:8, NA, 10, NA, NA, NA)

trimNA(x)

*/

edited yesterday

Incomputable

6,42621453

asked yesterday

ricardo

180210

I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday

add a comment |

up vote
2
down vote

favorite

I receive data with NA values. I would like to trim off the leading and trailing NA values, but to leave the interior NA values in place.

I've written the below in Rcpp, but it is very much a hand rolled solution -- and i'm alive to the fact that i'm going to make mistakes doing things this way.

What is the idiomatic cpp / Rcpp (sugar?) way of doing something like this?

#include<Rcpp.h>

using namespace Rcpp;



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector X) {



    // find first and last non-NA value



    double * F = X.begin();

    double * pBegin = X.begin();

    double * pEnd = X.end();

    double * L = X.end() - 1; // note you must decrement by one as .end() is AFTER



    // look for the values

    while( F < pEnd && NumericVector::is_na(*F)) F++;

    while( L > pBegin && NumericVector::is_na(*L)) L--;





    // create trimmed X vector

    Rcpp::NumericVector Xtrim(L - F + 1);



    for( int i = 0; i < Xtrim.size(); i++) {

        Xtrim[i] = *(F + i);

    }



    return Xtrim;

}





/*** R



x <- c(NA, NA, NA, 1:4, NA, 6:8, NA, 10, NA, NA, NA)

trimNA(x)

*/

edited yesterday

Incomputable

6,42621453

asked yesterday

ricardo

180210

I receive data with NA values. I would like to trim off the leading and trailing NA values, but to leave the interior NA values in place.

I've written the below in Rcpp, but it is very much a hand rolled solution -- and i'm alive to the fact that i'm going to make mistakes doing things this way.

What is the idiomatic cpp / Rcpp (sugar?) way of doing something like this?

#include<Rcpp.h>

using namespace Rcpp;



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector X) {



    // find first and last non-NA value



    double * F = X.begin();

    double * pBegin = X.begin();

    double * pEnd = X.end();

    double * L = X.end() - 1; // note you must decrement by one as .end() is AFTER



    // look for the values

    while( F < pEnd && NumericVector::is_na(*F)) F++;

    while( L > pBegin && NumericVector::is_na(*L)) L--;





    // create trimmed X vector

    Rcpp::NumericVector Xtrim(L - F + 1);



    for( int i = 0; i < Xtrim.size(); i++) {

        Xtrim[i] = *(F + i);

    }



    return Xtrim;

}





/*** R



x <- c(NA, NA, NA, 1:4, NA, 6:8, NA, 10, NA, NA, NA)

trimNA(x)

*/

c++ rcpp

edited yesterday

Incomputable

6,42621453

asked yesterday

ricardo

180210

edited yesterday

Incomputable

6,42621453

asked yesterday

ricardo

180210

edited yesterday

Incomputable

6,42621453

edited yesterday

Incomputable

6,42621453

edited yesterday

Incomputable

6,42621453

asked yesterday

ricardo

180210

asked yesterday

ricardo

180210

asked yesterday

ricardo

180210

I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday

add a comment |

I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday

I edited title to better reflect the intention. Please rollback if my edit is wrong.
– Incomputable
yesterday

add a comment |

2 Answers
2

active

oldest

votes

up vote
2
down vote

accepted

I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:

Accept the argument as a const reference, so that we don't copy in to the function, or

Modify our copy of the argument in-place, then return it as result (rather than copying again to Xtrim).

Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.

The former option is simplest, if we're allowed to change the signature:

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                            &Rcpp::NumericVector::is_na).base();



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), last_real_value,

                                             &Rcpp::NumericVector::is_na);



    return {first_real_value, last_real_value};

}

Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.

For the latter option, we could use the vector's erase() method to remove the rightmost values, and then the leftmost ones (removing the leftmost first would work, but could require moving more data, making it less efficient).

Here's how that might work (untested):

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                             &Rcpp::NumericVector::is_na).base();

    x.erase(last_real_value, x.end());



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), x.end(),

                                             &Rcpp::NumericVector::is_na);

    x.erase(x.begin(), first_real_value);



    return x;

}

Other review items

The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:

while( L > F && NumericVector::is_na(*L)) L--;

Use the standard indexing operator:

    Xtrim[i] = F[i];

(But it's better to use std::copy() instead of the loop).

Use actual iterators, rather than assuming you'll get bare pointers.

Don't include all of <Rcpp.h>, as that makes compilation really slow. Unfortunately, Rcpp's headers are broken because they don't include all they need - see my examples for workaround (that's still much faster to compile than the original).

edited yesterday

answered yesterday

Toby Speight

22k536108

I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday

add a comment |

up vote
2
down vote

Code Review

Code is good, except the naming. There are some pieces which could be improved

Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.

Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.

Better (?) approach

The problem seems to be somewhat specific case of trimming in general. Note that trimming from the right and trimming from the left are symmetric, and std::reverse_iterator "normalizes" trimming from right (great improvements thanks to @Toby).

template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}

When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).

Full code:

#include <utility>

#include <iterator>

#include <algorithm>



template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}



#include <vector>

#include <stdexcept>



int main() {

    std::vector<int> initial_values{2, 2, 3, 4, 5, 6};

    auto predicate = (int x) { return x % 2 == 0; };

    auto [new_first, new_last] = trim(initial_values.begin(),

                                      initial_values.end(),

                                      predicate);

    std::vector<int> resulting_values(new_first, new_last);

    std::vector<int> correct_result{3, 4, 5};

    if (resulting_values != correct_result) {

        throw std::logic_error("incorrect trimming occured");

    }

}

Demo on Wandbox.

To make trimming a little bit easier to use, one might write something like this:

template <typename Container, typename Predicate>

Container trim_copy(Container&& container, Predicate predicate) {

    auto [new_first, new_last] = trim(container.begin(), container.end(),

                                      predicate);

    return Container(new_first, new_last);

}

and pass a lambda calling the NumericVector::is_na:

auto predicate = (auto x) { return NumericVector::is_na(x); };

or may be pointer to the function directly.

edited yesterday

Toby Speight

22k536108

answered yesterday

Incomputable

6,42621453

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f207955%2ftrim-na-values-from-left-and-right-of-rcppnumericvector%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

up vote
2
down vote

accepted

I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:

Accept the argument as a const reference, so that we don't copy in to the function, or

Modify our copy of the argument in-place, then return it as result (rather than copying again to Xtrim).

Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.

The former option is simplest, if we're allowed to change the signature:

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                            &Rcpp::NumericVector::is_na).base();



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), last_real_value,

                                             &Rcpp::NumericVector::is_na);



    return {first_real_value, last_real_value};

}

Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.

Here's how that might work (untested):

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                             &Rcpp::NumericVector::is_na).base();

    x.erase(last_real_value, x.end());



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), x.end(),

                                             &Rcpp::NumericVector::is_na);

    x.erase(x.begin(), first_real_value);



    return x;

}

Other review items

The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:

while( L > F && NumericVector::is_na(*L)) L--;

Use the standard indexing operator:

    Xtrim[i] = F[i];

(But it's better to use std::copy() instead of the loop).

Use actual iterators, rather than assuming you'll get bare pointers.

edited yesterday

answered yesterday

Toby Speight

22k536108

I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday

add a comment |

up vote
2
down vote

accepted

I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:

Accept the argument as a const reference, so that we don't copy in to the function, or

Modify our copy of the argument in-place, then return it as result (rather than copying again to Xtrim).

Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.

The former option is simplest, if we're allowed to change the signature:

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                            &Rcpp::NumericVector::is_na).base();



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), last_real_value,

                                             &Rcpp::NumericVector::is_na);



    return {first_real_value, last_real_value};

}

Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.

Here's how that might work (untested):

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                             &Rcpp::NumericVector::is_na).base();

    x.erase(last_real_value, x.end());



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), x.end(),

                                             &Rcpp::NumericVector::is_na);

    x.erase(x.begin(), first_real_value);



    return x;

}

Other review items

The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:

while( L > F && NumericVector::is_na(*L)) L--;

Use the standard indexing operator:

    Xtrim[i] = F[i];

(But it's better to use std::copy() instead of the loop).

Use actual iterators, rather than assuming you'll get bare pointers.

edited yesterday

answered yesterday

Toby Speight

22k536108

I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday

add a comment |

up vote
2
down vote

accepted

I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:

Accept the argument as a const reference, so that we don't copy in to the function, or

Modify our copy of the argument in-place, then return it as result (rather than copying again to Xtrim).

Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.

The former option is simplest, if we're allowed to change the signature:

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                            &Rcpp::NumericVector::is_na).base();



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), last_real_value,

                                             &Rcpp::NumericVector::is_na);



    return {first_real_value, last_real_value};

}

Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.

Here's how that might work (untested):

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                             &Rcpp::NumericVector::is_na).base();

    x.erase(last_real_value, x.end());



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), x.end(),

                                             &Rcpp::NumericVector::is_na);

    x.erase(x.begin(), first_real_value);



    return x;

}

Other review items

The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:

while( L > F && NumericVector::is_na(*L)) L--;

Use the standard indexing operator:

    Xtrim[i] = F[i];

(But it's better to use std::copy() instead of the loop).

Use actual iterators, rather than assuming you'll get bare pointers.

edited yesterday

answered yesterday

Toby Speight

22k536108

I don't know Rcpp well enough to be sure, but we should be able to reduce copying in one of two ways:

Accept the argument as a const reference, so that we don't copy in to the function, or

Modify our copy of the argument in-place, then return it as result (rather than copying again to Xtrim).

Both of these eliminate a vector copy - in general, the first is to be preferred, as the input vector is bigger than the output vector. If we must copy something, prefer the smallest copy.

The former option is simplest, if we're allowed to change the signature:

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(const Rcpp::NumericVector& x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                            &Rcpp::NumericVector::is_na).base();



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), last_real_value,

                                             &Rcpp::NumericVector::is_na);



    return {first_real_value, last_real_value};

}

Since the Rcpp vector doesn't provide standard-collection-like rbegin()/rend() methods, I had to manually create reverse iterators. That wouldn't be necessary with a standard vector.

Here's how that might work (untested):

// The Rcpp Vector header is incomplete; it needs these dependencies.  :-(

#include <RcppCommon.h>

#include <Rcpp/RObject.h>

#include <Rcpp/Dimension.h>



#include <Rcpp/Vector.h>



#include <algorithm>

#include <iterator>



// [[Rcpp::export]]

Rcpp::NumericVector trimNA(Rcpp::NumericVector x)

{

    // remove trailing NA values

    auto rbegin = std::make_reverse_iterator(x.end());

    auto rend = std::make_reverse_iterator(x.begin());

    auto last_real_value = std::find_if_not(rbegin, rend,

                                             &Rcpp::NumericVector::is_na).base();

    x.erase(last_real_value, x.end());



    // remove leading NA values

    auto first_real_value = std::find_if_not(x.begin(), x.end(),

                                             &Rcpp::NumericVector::is_na);

    x.erase(x.begin(), first_real_value);



    return x;

}

Other review items

The code doesn't work correctly if every value is an NA - we should stop when we reach F rather than pBegin here:

while( L > F && NumericVector::is_na(*L)) L--;

Use the standard indexing operator:

    Xtrim[i] = F[i];

(But it's better to use std::copy() instead of the loop).

Use actual iterators, rather than assuming you'll get bare pointers.

edited yesterday

answered yesterday

Toby Speight

22k536108

edited yesterday

answered yesterday

Toby Speight

22k536108

answered yesterday

Toby Speight

22k536108

answered yesterday

Toby Speight

22k536108

I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday

add a comment |

I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday

I'm very new to c++, could you demonstrate how you'd use erase and explain the benefits?
– ricardo
yesterday

add a comment |

up vote
2
down vote

Code Review

Code is good, except the naming. There are some pieces which could be improved

Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.

Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.

Better (?) approach

template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}

When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).

Full code:

#include <utility>

#include <iterator>

#include <algorithm>



template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}



#include <vector>

#include <stdexcept>



int main() {

    std::vector<int> initial_values{2, 2, 3, 4, 5, 6};

    auto predicate = (int x) { return x % 2 == 0; };

    auto [new_first, new_last] = trim(initial_values.begin(),

                                      initial_values.end(),

                                      predicate);

    std::vector<int> resulting_values(new_first, new_last);

    std::vector<int> correct_result{3, 4, 5};

    if (resulting_values != correct_result) {

        throw std::logic_error("incorrect trimming occured");

    }

}

Demo on Wandbox.

To make trimming a little bit easier to use, one might write something like this:

template <typename Container, typename Predicate>

Container trim_copy(Container&& container, Predicate predicate) {

    auto [new_first, new_last] = trim(container.begin(), container.end(),

                                      predicate);

    return Container(new_first, new_last);

}

and pass a lambda calling the NumericVector::is_na:

auto predicate = (auto x) { return NumericVector::is_na(x); };

or may be pointer to the function directly.

edited yesterday

Toby Speight

22k536108

answered yesterday

Incomputable

6,42621453

add a comment |

up vote
2
down vote

Code Review

Code is good, except the naming. There are some pieces which could be improved

Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.

Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.

Better (?) approach

template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}

When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).

Full code:

#include <utility>

#include <iterator>

#include <algorithm>



template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}



#include <vector>

#include <stdexcept>



int main() {

    std::vector<int> initial_values{2, 2, 3, 4, 5, 6};

    auto predicate = (int x) { return x % 2 == 0; };

    auto [new_first, new_last] = trim(initial_values.begin(),

                                      initial_values.end(),

                                      predicate);

    std::vector<int> resulting_values(new_first, new_last);

    std::vector<int> correct_result{3, 4, 5};

    if (resulting_values != correct_result) {

        throw std::logic_error("incorrect trimming occured");

    }

}

Demo on Wandbox.

To make trimming a little bit easier to use, one might write something like this:

template <typename Container, typename Predicate>

Container trim_copy(Container&& container, Predicate predicate) {

    auto [new_first, new_last] = trim(container.begin(), container.end(),

                                      predicate);

    return Container(new_first, new_last);

}

and pass a lambda calling the NumericVector::is_na:

auto predicate = (auto x) { return NumericVector::is_na(x); };

or may be pointer to the function directly.

edited yesterday

Toby Speight

22k536108

answered yesterday

Incomputable

6,42621453

add a comment |

up vote
2
down vote

Code Review

Code is good, except the naming. There are some pieces which could be improved

Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.

Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.

Better (?) approach

template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}

When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).

Full code:

#include <utility>

#include <iterator>

#include <algorithm>



template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}



#include <vector>

#include <stdexcept>



int main() {

    std::vector<int> initial_values{2, 2, 3, 4, 5, 6};

    auto predicate = (int x) { return x % 2 == 0; };

    auto [new_first, new_last] = trim(initial_values.begin(),

                                      initial_values.end(),

                                      predicate);

    std::vector<int> resulting_values(new_first, new_last);

    std::vector<int> correct_result{3, 4, 5};

    if (resulting_values != correct_result) {

        throw std::logic_error("incorrect trimming occured");

    }

}

Demo on Wandbox.

To make trimming a little bit easier to use, one might write something like this:

template <typename Container, typename Predicate>

Container trim_copy(Container&& container, Predicate predicate) {

    auto [new_first, new_last] = trim(container.begin(), container.end(),

                                      predicate);

    return Container(new_first, new_last);

}

and pass a lambda calling the NumericVector::is_na:

auto predicate = (auto x) { return NumericVector::is_na(x); };

or may be pointer to the function directly.

edited yesterday

Toby Speight

22k536108

answered yesterday

Incomputable

6,42621453

Code Review

Code is good, except the naming. There are some pieces which could be improved

Use auto for iterators, as exact iterator type is not important, but it's properties are. Iterator properties are usually inherited from container it came from.

Use better constructor, there seems to be one which accepts pair of iterators, and in this case, pair of pointers denoting contiguous range.

Better (?) approach

template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}

When reversing a range, first becomes last, and vice versa, that is why std::reverse_iterator arguments were swapped. There is also C++17 feature in play (template class argument deduction).

Full code:

#include <utility>

#include <iterator>

#include <algorithm>



template <typename BidirIterator,

          typename Predicate>

auto trim(BidirIterator first, BidirIterator last, Predicate predicate) {

    auto left_edge = std::find_if_not(first, last, predicate);

    auto right_edge = std::find_if_not(std::reverse_iterator(last),

                                       std::reverse_iterator(left_edge),

                                       predicate).base();



    return std::pair{left_edge, right_edge};

}



#include <vector>

#include <stdexcept>



int main() {

    std::vector<int> initial_values{2, 2, 3, 4, 5, 6};

    auto predicate = (int x) { return x % 2 == 0; };

    auto [new_first, new_last] = trim(initial_values.begin(),

                                      initial_values.end(),

                                      predicate);

    std::vector<int> resulting_values(new_first, new_last);

    std::vector<int> correct_result{3, 4, 5};

    if (resulting_values != correct_result) {

        throw std::logic_error("incorrect trimming occured");

    }

}

Demo on Wandbox.

To make trimming a little bit easier to use, one might write something like this:

template <typename Container, typename Predicate>

Container trim_copy(Container&& container, Predicate predicate) {

    auto [new_first, new_last] = trim(container.begin(), container.end(),

                                      predicate);

    return Container(new_first, new_last);

}

and pass a lambda calling the NumericVector::is_na:

auto predicate = (auto x) { return NumericVector::is_na(x); };

or may be pointer to the function directly.

edited yesterday

Toby Speight

22k536108

answered yesterday

Incomputable

6,42621453

edited yesterday

Toby Speight

22k536108

edited yesterday

Toby Speight

22k536108

edited yesterday

Toby Speight

22k536108

answered yesterday

Incomputable

6,42621453

answered yesterday

Incomputable

6,42621453

answered yesterday

Incomputable

6,42621453

add a comment |

draft saved

draft discarded

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Bdtyktl

Trim NA values from left and right of Rcpp::NumericVector

2 Answers
2

Other review items

Code Review

Better (?) approach

Your Answer

Post as a guest

2 Answers
2

2 Answers
2

Other review items

Other review items

Other review items

Other review items

Code Review

Better (?) approach

Code Review

Better (?) approach

Code Review

Better (?) approach

Code Review

Better (?) approach

Post as a guest

Popular posts from this blog

VLC cannot play a non-UDF mastered DVD

Mouse cursor on multiple screens with different PPI

Agildo Ribeiro

Trim NA values from left and right of Rcpp::NumericVector

2 Answers 2

Other review items

Code Review

Better (?) approach

Your Answer

Sign up or log in

Post as a guest

Post as a guest

2 Answers 2

2 Answers 2

Other review items

Other review items

Other review items

Other review items

Code Review

Better (?) approach

Code Review

Better (?) approach

Code Review

Better (?) approach

Code Review

Better (?) approach

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

VLC cannot play a non-UDF mastered DVD

Mouse cursor on multiple screens with different PPI

Agildo Ribeiro

2 Answers
2

2 Answers
2

2 Answers
2