Implementing Parallel copy_if in C++

In a blog post about a dozen ways to filter elements, I mentioned only serial versions of the code. But how about leveraging concurrency? Maybe we can throw some more threads and async tasks at the problem and complete the copy faster? For example, I have 6 cores on my machine, so it would be nice to see something like a 5x speedup over the sequential copy.
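To make the idea concrete, here is a minimal sketch of one possible approach, not necessarily the one used in the full post: split the input into chunks, filter each chunk in its own std::async task, and then join the partial results in order. The name parallel_copy_if_sketch and the numTasks parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <iterator>
#include <vector>

// Hypothetical helper (an assumption for illustration): filters `input` with
// `pred`, using at most `numTasks` async tasks. Element order is preserved.
template <typename T, typename Pred>
std::vector<T> parallel_copy_if_sketch(const std::vector<T>& input, Pred pred,
                                       std::size_t numTasks) {
    assert(numTasks > 0);
    // Round up so that every element belongs to exactly one chunk.
    const std::size_t chunk = (input.size() + numTasks - 1) / numTasks;

    std::vector<std::future<std::vector<T>>> tasks;
    for (std::size_t begin = 0; begin < input.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, input.size());
        tasks.push_back(std::async(std::launch::async, [&input, pred, begin, end] {
            // Each task writes only to its own local vector, so no locking is needed.
            std::vector<T> local;
            const auto first = input.begin() + static_cast<std::ptrdiff_t>(begin);
            const auto last = input.begin() + static_cast<std::ptrdiff_t>(end);
            std::copy_if(first, last, std::back_inserter(local), pred);
            return local;
        }));
    }

    // Joining the partial results is sequential and keeps the original order.
    std::vector<T> result;
    for (auto& task : tasks) {
        const std::vector<T> part = task.get();
        result.insert(result.end(), part.begin(), part.end());
    }
    return result;
}
```

The final merge step runs on a single thread, which is one reason the speedup from a scheme like this can stay below the number of cores.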


Vector of Objects vs Vector of Pointers

Memory access patterns are one of the key factors for writing efficient code that runs over large data sets. In this blog post, you’ll see why there might be a perf difference of almost 2.5x (in both directions!) when working with a vector of pointers versus a vector of value types.
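As a rough illustration of the two layouts being compared, here is a small sketch. The Particle type and the sum_positions functions are hypothetical and not taken from the post; the code only shows the difference in access pattern and makes no claim about which version is faster.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical value type, assumed here only for illustration.
struct Particle {
    float x;
    float v;
};

// Vector of objects: the elements sit next to each other in one allocation,
// so the loop walks through memory sequentially.
float sum_positions(const std::vector<Particle>& particles) {
    float sum = 0.0f;
    for (const Particle& p : particles) sum += p.x;
    return sum;
}

// Vector of pointers: only the pointers are contiguous. Every element access
// follows a pointer to a separately allocated object.
float sum_positions(const std::vector<std::unique_ptr<Particle>>& particles) {
    float sum = 0.0f;
    for (const auto& p : particles) {
        assert(p != nullptr);
        sum += p->x;
    }
    return sum;
}
```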


Preprocessing Phase for C++17's Searchers

Searchers from C++17 are a new way to perform efficient pattern lookups. The new standard offers three searchers: default_searcher, boyer_moore_searcher and boyer_moore_horspool_searcher. The last two implement algorithms that require some additional preprocessing for the input pattern. Is there a chance to separate preprocessing time from the search time?
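One way to see the two phases, sketched under the assumption that the same pattern is searched in several texts: the searcher object is constructed once (that is where the pattern is preprocessed) and then passed to std::search for each text. The helper name count_texts_containing is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: counts how many of `texts` contain `pattern`.
std::size_t count_texts_containing(const std::vector<std::string>& texts,
                                   const std::string& pattern) {
    // An empty pattern matches at the start of any text, so this sketch rejects it.
    assert(!pattern.empty());

    // Preprocessing phase: the lookup tables are built here, once.
    const std::boyer_moore_searcher searcher(pattern.begin(), pattern.end());

    // Search phase: the same searcher object is reused for every text.
    std::size_t hits = 0;
    for (const std::string& text : texts) {
        if (std::search(text.begin(), text.end(), searcher) != text.end()) ++hits;
    }
    return hits;
}
```

Timing the constructor call separately from the loop would be one simple way to measure the preprocessing cost on its own.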


How to Initialize a String Member

How do you initialise a string member in the constructor? By using const string&, string value and move, string_view or maybe something else? Let’s have a look at possible options.

Intro

Below there’s a simple class with one string member. We’d like to initialise it. For example:

    class UserName {
        std::string mName;
    public:
        UserName(const std::string& str) : mName(str) { }
    };

As you can see, the constructor takes const std::string& str.
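For comparison, here is a hedged sketch of two of the other options named in the opening question: passing by value and moving, and passing a string_view. The class names UserNameByValue and UserNameByView and the name() accessor are hypothetical, added only to keep the example self-contained.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical variant: take the string by value, then move it into the member.
class UserNameByValue {
    std::string mName;
public:
    explicit UserNameByValue(std::string str) : mName(std::move(str)) { }
    const std::string& name() const { return mName; }
};

// Hypothetical variant: take a non-owning view and copy its characters.
class UserNameByView {
    std::string mName;
public:
    explicit UserNameByView(std::string_view sv) : mName(sv) { }
    const std::string& name() const { return mName; }
};

inline void init_usage_example() {
    std::string temp = "Alice";
    UserNameByValue a(std::move(temp));  // the caller's string is moved, not copied
    UserNameByView b("Bob");             // the literal is viewed, then copied once into mName
    assert(a.name() == "Alice");
    assert(b.name() == "Bob");
}
```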


Performance of std::string_view vs std::string from C++17

How much faster is std::string_view than standard std::string operations? Have a look at a few examples where I compare std::string_view against std::string.

Intro

I was looking for some examples of string_view, and after a while, I got curious about the performance gain we might get. string_view is conceptually only a view of the string: usually implemented as [ptr, length].
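The [ptr, length] idea can be shown with a tiny sketch. The first_word helper is hypothetical: it returns a view into the caller's buffer, so no characters are copied and no new string is allocated.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: returns the part of `text` before the first space.
// The result points into the same buffer as `text`; nothing is copied.
std::string_view first_word(std::string_view text) {
    const auto pos = text.find(' ');
    // If there is no space, pos is npos and substr returns the whole view.
    return text.substr(0, pos);
}

inline void view_usage_example() {
    const std::string sentence = "hello world";
    const std::string_view word = first_word(sentence);
    assert(word == "hello");
    assert(word.data() == sentence.data());  // same memory, different length
}
```

Because the view does not own its data, it must not outlive the string it refers to.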


Please stop with performance optimizations!

As you might have noticed from reading this blog, I love doing performance optimizations. Let’s take some algorithm or some part of the app, understand it and then improve it, so it works 5x… or 100x faster! Doesn’t that sound awesome? I hope that you answered “Yes” to the question in the introduction.


Curious case of branch performance

When doing my last performance tests for bool packing, I got strange results sometimes. It appeared that one constant generated different results than the other. Why was that? Let’s have a quick look at branching performance.

The problem

Just to recall (first part, second part), I wanted to pack eight booleans (results of a condition) into one byte, 1 bit per condition result.
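As a reminder of what such packing can look like, here is a minimal sketch. It assumes the condition is values[i] > threshold and that bit i of the byte stores result i; both are assumptions made for illustration, and the function names are hypothetical. The two variants produce the same byte. Whether the second one really compiles to branch-free code depends on the compiler.

```cpp
#include <cassert>
#include <cstdint>

// Variant with an explicit `if`: a bit is set only when the condition holds.
std::uint8_t pack_with_branch(const int (&values)[8], int threshold) {
    std::uint8_t packed = 0;
    for (int i = 0; i < 8; ++i) {
        if (values[i] > threshold)
            packed = static_cast<std::uint8_t>(packed | (1u << i));
    }
    return packed;
}

// Variant written without `if`: the comparison result (0 or 1) is shifted
// into place and OR-ed in unconditionally.
std::uint8_t pack_arithmetic(const int (&values)[8], int threshold) {
    std::uint8_t packed = 0;
    for (int i = 0; i < 8; ++i) {
        const unsigned bit = static_cast<unsigned>(values[i] > threshold);
        packed = static_cast<std::uint8_t>(packed | (bit << i));
    }
    return packed;
}

inline void pack_usage_example() {
    const int values[8] = {1, 9, 2, 8, 3, 7, 4, 6};
    // Elements at indices 1, 3, 5 and 7 are greater than 5.
    assert(pack_with_branch(values, 5) == 0xAA);
    assert(pack_arithmetic(values, 5) == 0xAA);
}
```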
