Table of Contents

Do you know how many ways we can implement a filter function in C++?

While the problem is relatively easy to understand - take a container, copy elements that match a predicate and the return a new container - it’s good to exercise with the Standard Library and check a few ideas. We can also apply some Modern C++ techniques.

Let’s start!

The Problem Statement  

To be precise by filter I mean a function with the following interface:

auto Filter(const Container& cont, UnaryPredicate p) {}

It takes a container and a predicate, and then it creates an output container with elements that satisfies the predicate.

We can use it like the following:

const std::vector<std::string> vec{ "Hello", "**txt", "World", "error", "warning", "C++", "****" };

auto filtered = FilterRaw(vec, [](auto& elem) { return !elem.starts_with('*'); });
// filtered should have "Hello", "World", "error", "warning", "C++"

Additionally, we can have a look at a definition from wikipedia and functional programming:

In functional programming, filter is a higher-order function that processes a data structure (usually a list) in some order to produce a new data structure containing exactly those elements of the original data structure for which a given predicate returns the boolean value true.

Writing such a function can be a good exercise with various options and algorithms in the Standard Library. What’s more, our function hides internal things like iterators, so it’s more like a range based version.

Let’s start with the first option:

Good old Raw Loops  

While it’s good to avoid raw loops, they might help us to fully understand the problem, especially for a simple problem like we have:

template <typename T, typename Pred>
auto FilterRaw(const std::vector<T>& vec, Pred p) {
    std::vector<T> out;
    for (auto&& elem : vec)
        if (p(elem))
            out.push_back(elem);
    return out;
}

Simple yet very effective.

Please notice some nice things about this straightforward implementation.

  • The code uses auto return type deduction, so there’s no need to write the explicit type.
  • It returns the output vector by value, but the compiler will leverage the copy elision (in most case), or move semantics at worse.

Since we’re at raw loops, we need can take a moment and appreciate range based for loops that we get with C++11. Without this functionality our code would look much worse:

template <typename T, typename Pred>
std::vector<T> FilterRawOld(const std::vector<T>& vec, Pred p) {
  std::vector<T> out;
  for (typename std::vector<T>::const_iterator it = begin(vec); it != end(vec); ++it)
    if (p(*it))
      out.push_back(*it);
  return out;
}

And now let’s move to something better and see some of the existing std:: algorithms that might help us with the implementation.

Filter by std::copy_if  

std::copy_if is probably the most natural choice. We can leverage back_inserter and then push matched elements into the output vector.

template <typename T, typename Pred>
auto FilterCopyIf(const std::vector<T>& vec, Pred p) {
    std::vector<T> out;
    std::copy_if(begin(vec), end(vec), std::back_inserter(out), p);
    return out;
}

std::remove_copy_if  

But we can also do the reverse:

template <typename T, typename Pred>
auto FilterRemoveCopyIf(const std::vector<T>& vec, Pred p) {
    std::vector<T> out;
    std::remove_copy_if(begin(vec), end(vec), 
                        std::back_inserter(out), std::not_fn(p));
    return out;
}

Depending on the requirements, we can also use remove_copy_if which copies elements that do not satisfy the predicate. For our implementation, I had to add std::not_fn to reverse the predicate.

One remark: std::not_fn is available since C++17.

The Famous Remove Erase Idiom  

template <typename T, typename Pred>
auto FilterRemoveErase(const std::vector<T>& vec, Pred p) {
    auto out = vec;
    out.erase(std::remove_if(begin(out), end(out), std::not_fn(p)), end(out));
    return out;
}

Here’s a little inconvenience. Because we don’t want to modify the input container, we had to copy it first. This might cause some extra processing and is less efficient than using back_inserter.

Adding Some C++20  

After seeing a few examples, we can finally see a convenient feature from C++20.

template <typename T, typename Pred>
auto FilterEraseIf(const std::vector<T>& vec, Pred p) {
    auto out = vec;
    std::erase_if(out, std::not_fn(p));
    return out;
}

One minor thing, this approach copies all elements first. So it might be slower than the approach with copy_if.

Adding Some C++20 Ranges  

And finally a solution with Ranges:

template <typename T, typename Pred>
auto FilterRangesCopyIf(const std::vector<T>& vec, Pred p) {
    std::vector<T> out;
    std::ranges::copy_if(vec, std::back_inserter(out), p);
    return out;
}

The code is super simple, and we might even say that our Filter function has no point here, since the Ranges interface is so easy to use in code directly.

Making it More Generic  

So far I showed you code that operates on std::vector. But how about other containers?

Let’s try and make our Filter function more generic. This is easy with std::erase_if which has overloads for many Standard containers:

template <typename TCont, typename Pred>
auto FilterEraseIfGen(const TCont& cont, Pred p) {
    auto out = cont;
    std::erase_if(out, std::not_fn(p));
    return out;
}

And another version for ranges.

template <typename TCont, typename Pred>
auto FilterRangesCopyIfGen(const TCont& vec, Pred p) {
    TCont out;
    std::ranges::copy_if(vec, std::back_inserter(out), p);
    return out;
}

Right now it can work with other containers, not only with std::vector:

std::set<std::string> mySet{ 
    "Hello", "**txt", "World", "error", "warning", "C++", "****" 
};
auto filtered = FilterEraseIfGen(mySet, [](auto& elem) { 
    return !elem.starts_with('*'); 
});

On the other hand, if you prefer not to copy all elements upfront, we might need more work.

Generic Copy If Approach  

The main problem is that we cannot use back_inserter on associative containers, or on containers that don’t support push_back() member function. In that case, we can fallback to std::inserter adapter.

That’s why one of a possible solution is to detect if a given container supports push_back :

template <typename T, typename = void>
struct has_push_back : std::false_type {};

template <typename T>
struct has_push_back<T,
  std::void_t<
    decltype(std::declval<T>().push_back(std::declval<typename T::value_type>()))
    >
  > : std::true_type {};

template <typename TCont, typename Pred>
auto FilterCopyIfGen(const TCont& cont, Pred p) {
    TCont out;
    if constexpr(has_push_back<TCont>::value)
        std::copy_if(begin(cont), end(cont), std::back_inserter(out), p);
    else
        std::copy_if(begin(cont), end(cont), std::inserter(out, out.begin()), p);

    return out;
}

This seems to work! But of course, I’m open to some better code and ideas :)

I took the approach from How To Detect Function Overloads in C++17, std::from_chars Example - C++ Stories.

June 2021 Update:

We can leverage concepts and make the code much simpler. Have a look (as commented by danesh110)

template <typename T> 
concept has_push_back = requires(T container, typename T::value_type v) { 
    container.push_back(v);
};

And see more in Simplify Code with if constexpr and Concepts in C++17/C++20 - C++ Stories.

More C++20 Concepts  

We can add more concepts, and restrict other template parameters.

For example, if I write:

auto filtered = FilterCopyIf(vec, [](auto& elem, int a) { 
    return !elem.starts_with('*'); 
});

So it’s two input arguments into an unary predicate I get the following in Visual Studio:

C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\include\algorithm(1713,13): error C2672: 'operator __surrogate_func': no matching overloaded function found
1>  C:\Users\Admin\Documents\GitHub\articles\filterElements\filters.cpp(38): message : see reference to function template instantiation '_OutIt std::copy_if<std::_Vector_const_iterator<std::_Vector_val<std::_Simple_types<_Ty>>>,std::back_insert_iterator<std::vector<_Ty,std::allocator<_Ty>>>,Pred>(_InIt,_InIt,_OutIt,_Pr)' being compiled
1>          with

but then after a few lines, we have

error C2780: 'auto main::<lambda_4>::operator ()(_T1 &,int) const': expects 2 arguments - 1 provided

We can experiment with concepts and restrict our predicate to be std::predicate, an existing concept from the Standard Library. In our case, we need a function that takes one argument and then returns a type convertible to bool.

template <typename T, std::predicate<const T&> Pred>   // <<
auto FilterCopyIfConcepts(const std::vector<T>& vec, Pred p) {
    std::vector<T> out;
    std::copy_if(begin(vec), end(vec), std::back_inserter(out), p);
    return out;
}

And then the problematic code:

auto filtered = FilterCopyIfConcepts(vec, [](auto& elem, int a) { 
    return !elem.starts_with('*'); 
});

Says the following:

1>  filters.cpp(143,19): error C2672: 'FilterCopyIfConcepts': no matching overloaded function found
1>  filters.cpp(143,101): error C7602: 'FilterCopyIfConcepts': the associated constraints are not satisfied

It’s a bit better, as we have messages about our top-level function and not some internals, but it would be great to see why and which constraint wasn’t satisfied.

Making it Parallel?  

Since C++17 we also have parallel algorithms, so why not add it to our list?

As it appears std::copy_if par is not supported in Visual Studio, and this problem is a bit more complicated. We’ll leave this topic for now and try to solve it some next time.

You can write a manual version:

std::mutex mut;
    std::for_each(std::execution::par, begin(vec), end(vec),
        [&out, &mut, p](auto&& elem) {
            if (p(elem))
            {
                std::unique_lock lock(mut);
                out.push_back(elem);
            }
        });

But this will often block, and it’s probably not the best approach. So stay tuned for our future experiments with this topic.

Here’s the latest update and experiment: Implementing Parallel copy_If in C++ - C++ Stories

Summary  

In this article, I’ve shown at least 12 possible ways to filter elements from various containers. We started from code that worked on std::vector, and you’ve also seen multiple ways to make it more generic and applicable to other container types. For example, we used std::erase_if from C++20, concepts, and even a custom type trait.

See my code in a separate Github Repo:

https://github.com/fenbf/articles/blob/master/filterElements/filters.cpp

Back To You  

  • What other options do you see?
  • What techniques do you prefer?

Let us know in comments below the article or join the discussion this @r/cpp thread.