
From dynamic container operations to compile-time constants, C++ offers a variety of initialization techniques (as in this famous meme :)). In this article, we’ll delve into advanced initialization methods, from reserve() and emplace_back for containers to tuples with piecewise_construct and forward_as_tuple. Thanks to those techniques, we can reduce the number of temporary objects and create variables more efficiently.

Let’s jump in.

Intro  

As background, we can use the following class, which reports when its special member functions are called. That way, we’ll be able to see extra temporary objects.

struct MyType {
    MyType() { std::cout << "MyType default\n"; }
    explicit MyType(std::string str) : str_(std::move(str)) { 
        std::cout << std::format("MyType {}\n", str_); 
    }
    ~MyType() { 
        std::cout << std::format("~MyType {}\n", str_);  
    }
    MyType(const MyType& other) : str_(other.str_) { 
        std::cout << std::format("MyType copy {}\n", str_); 
    }
    MyType(MyType&& other) noexcept : str_(std::move(other.str_)) { 
        std::cout << std::format("MyType move {}\n", str_);  
    }
    MyType& operator=(const MyType& other) { 
        if (this != &other)
            str_ = other.str_;
        std::cout << std::format("MyType = {}\n", str_);  
        return *this;
    }
    MyType& operator=(MyType&& other) noexcept { 
        if (this != &other)
            str_ = std::move(other.str_);
        std::cout << std::format("MyType = move {}\n", str_);  
        return *this; 
    }
    std::string str_;
};

I borrowed this type from my other article: Moved or Not Moved - That Is the Question! - C++ Stories

And we can now start with the relatively simple but essential element:

reserve and emplace_back: efficiently growing vectors  

Vectors in C++ are dynamic arrays that can grow as needed. However, each time a vector grows beyond its current capacity, it might need to reallocate memory, which can be costly. To optimize this, we can use the reserve() method combined with emplace_back.

The reserve method doesn’t change the size of the vector but ensures that the vector has enough allocated memory to store the specified number of elements. By reserving space ahead of time, you can prevent multiple reallocations as elements are added to the vector.
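The difference between size and capacity can be checked directly. Below is a minimal sketch; the helper names after_reserve and after_resize are made up for illustration, and resize() is shown only as a contrast, since it does create elements.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns {size, capacity} after reserve(n).
// reserve() only allocates storage; no elements are created.
std::pair<std::size_t, std::size_t> after_reserve(std::size_t n) {
    std::vector<std::string> v;
    v.reserve(n);
    return {v.size(), v.capacity()};
}

// Hypothetical helper: returns {size, capacity} after resize(n).
// resize() creates n value-initialized elements (empty strings here).
std::pair<std::size_t, std::size_t> after_resize(std::size_t n) {
    std::vector<std::string> v;
    v.resize(n);
    return {v.size(), v.capacity()};
}
```

After reserve(10) the size is still 0, while the capacity is at least 10; after resize(10) the size is 10.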

Here’s an example that compares the techniques:

#include <iostream>
#include <vector>
#include <string>
#include <format>

// ... [MyType definition here] ...

int main() {    
    {
        std::cout << "--- push_back\n";
        std::vector<MyType> vec;
        vec.push_back(MyType("First"));
        std::cout << std::format("capacity: {}\n", vec.capacity());
        vec.push_back(MyType("Second"));
    }
    {
        std::cout << "--- emplace_back\n";
        std::vector<MyType> vec;
        vec.emplace_back("First");
        std::cout << std::format("capacity: {}\n", vec.capacity());
        vec.emplace_back("Second");
    }
    {
        std::cout << "--- reserve() + emplace_\n";
        std::vector<MyType> vec;
        vec.reserve(2);  // Reserve space for 2 elements
        vec.emplace_back("First");
        vec.emplace_back("Second");
    }
}

And the output:

--- push_back
MyType First
MyType move First
~MyType 
capacity: 1
MyType Second
MyType move Second
MyType move First
~MyType 
~MyType 
~MyType First
~MyType Second
--- emplace_back
MyType First
capacity: 1
MyType Second
MyType move First
~MyType 
~MyType First
~MyType Second
--- reserve() + emplace_
MyType First
MyType Second
~MyType First
~MyType Second

Run at @Compiler Explorer

In the example, you can see a comparison between three insertion techniques:

  • just push_back()
  • just emplace_back()
  • reserve() with emplace_back

In the first case, we have to pass temporary objects to push_back, and they are moved to initialize the vector’s elements. But then there’s also a reallocation since the vector has to grow when you add a second element.

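The emplace_back() technique is a bit better and easier to write, as no temporary objects are created. The vector still has to reallocate when the second element is added, though, which moves the first element to the new buffer.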

But then, the third option is most efficient, as we can reserve space upfront and then just create elements in place.

By using reserve and then emplace_back, we ensure that the vector doesn’t need to reallocate memory as we add elements up to the reserved capacity. This combination is a powerful way to optimize performance, especially when adding multiple elements to a vector.
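To see the effect on a larger number of elements, we can count how often the capacity changes while appending. This is only a sketch: count_reallocations is a hypothetical helper, and the exact number of reallocations without reserve() depends on the growth strategy of the standard library implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts how many times capacity() changes
// while `count` elements are appended with emplace_back.
std::size_t count_reallocations(std::size_t count, bool reserve_first) {
    std::vector<std::string> vec;
    if (reserve_first)
        vec.reserve(count);  // one allocation up front

    std::size_t reallocations = 0;
    std::size_t last_capacity = vec.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        vec.emplace_back("item");
        if (vec.capacity() != last_capacity) {  // the buffer was replaced
            ++reallocations;
            last_capacity = vec.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set to true, the function returns 0, as the vector never grows past the reserved capacity. Without it, the count is greater than zero and implementation-specific.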

constinit: ensuring compile-time initialization in C++20  

constinit is a powerful tool to enforce constant initialization, particularly for static or thread-local variables. Introduced in C++20, this keyword addresses a longstanding challenge in C++: the static initialization order fiasco. By ensuring variables are initialized at compile-time, constinit provides a more predictable and safer initialization process.

At its core, constinit guarantees that the variable it qualifies is constant-initialized, that is, initialized at compile-time. If the initializer is not a constant expression, the program fails to compile. This is especially beneficial for global or static variables, ensuring they are free from dynamic initialization order issues.

Consider the following example:

#include <array>

// Initialize at compile time
constexpr int compute(int v) { return v*v*v; }
constinit int global = compute(10);

// This won't work:
// constinit int another = global;

int main() {
    // But allows changes later...
    global = 100;

    // global is not constant!
    // std::array<int, global> arr;
}

In the code above, the global variable is initialized at compile-time using the compute function. However, unlike const or constexpr, constinit doesn’t render the variable immutable. This means that while its initial value is set at compile-time, it can be modified during runtime, as demonstrated in the main function. Moreover, since a constinit variable is not constexpr, you cannot use it to initialize another constinit object (like int another).

See more in my other articles: const vs constexpr vs consteval vs constinit in C++20 - C++ Stories and Solving Undefined Behavior in Factories with constinit from C++20 - C++ Stories.

Lambda expression and initialization  

C++14 brought a significant update to lambda captures, introducing the ability to initialize new data members directly within the lambda’s capture clause. This feature, known as capture with an initializer or generalized lambda capture, offers us more flexibility and precision when working with lambdas.

Traditionally, lambda expressions could capture variables from their enclosing scope. With C++14, you can now create and initialize new data members directly in the capture clause, making lambdas even more versatile.

Consider this example:

#include <iostream>

int main() {
    int x = 30;
    int y = 12;
    const auto foo = [z = x + y]() { std::cout << z; };
    x = 0;
    y = 0;
    foo();
}

Output:

42

Here, a new data member, z, is created and initialized with the sum of x and y. This initialization occurs at the point of the lambda’s definition, not invocation. As a result, even if x and y are modified after the lambda’s definition, z retains its initial value.

To understand this feature better, let’s look at how the lambda translates to a callable type:

struct _unnamedLambda {
    void operator()() const {
        std::cout << z;
    }
    int z;
} someInstance;

The lambda essentially becomes an instance of an unnamed struct with an operator()() method and a data member z.
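Since the init-captured variable is a regular data member of the closure, it can also keep state between calls. Below is a small sketch with a hypothetical make_counter helper. Marking the lambda as mutable removes the const from operator(), so the member can be modified.

```cpp
// Hypothetical helper: returns a closure with its own `count` member,
// initialized once from `start` when the lambda is created.
auto make_counter(int start) {
    return [count = start]() mutable { return count++; };
}
```

Each call returns the current value and increments the member. Copying the closure copies the member too, so the copy continues counting independently from that point.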

Capture with an initializer isn’t just limited to simple types. You can also capture references.
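As a sketch of capturing references with an initializer, the example below binds one reference and one const reference to the same local variable. The function name bump_twice_and_read is made up for illustration, and std::as_const requires C++17.

```cpp
#include <utility>  // std::as_const (C++17)

// Hypothetical helper: modifies a local through a reference capture
// and reads it back through a const reference capture.
int bump_twice_and_read() {
    int x = 10;
    auto bump = [&ref = x] { ++ref; };                        // ref is int&
    auto read = [&cref = std::as_const(x)] { return cref; };  // cref is const int&
    bump();
    bump();
    return read();  // sees the updated x
}
```

The function returns 12, as both closures refer to the original x rather than to a copy.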

How can this technique be handy? There are at least two cases:

  • capturing move-only types by value
  • optimizations

Let’s consider the first scenario; here’s how you can capture std::unique_ptr:

#include <iostream>
#include <memory>

int main(){
    std::unique_ptr<int> p(new int{10});
    const auto bar = [ptr=std::move(p)] {
        std::cout << "pointer in lambda: " << ptr.get() << '\n';
    };
    std::cout << "pointer in main(): " << p.get() << '\n';
    bar();
}

Previously, in C++11, you couldn’t capture a unique pointer by value; only capturing by reference was possible. Since C++14, we can move an object into a member of the closure type.

Another use case might be an optimization:

If you capture a variable and then compute some temporary object:

auto result = std::find_if(vs.begin(), vs.end(),
        [&prefix](const std::string& s) {
            return s == prefix + "bar"s; 
        }
    );

Why not compute it once and store it inside the lambda object?

result = std::find_if(vs.begin(), vs.end(), 
        [savedString = prefix + "bar"s](const std::string& s) { 
            return s == savedString; 
        }
    );

That way, savedString is computed once and not every time the function is invoked.
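For completeness, here is a self-contained version of that idea. It is only a sketch: find_prefixed is a hypothetical helper, and the "bar" suffix follows the snippet above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using namespace std::string_literals;  // enables the "bar"s literal

// Hypothetical helper: returns the index of the first element equal to
// prefix + "bar", or -1 when there is no such element.
std::ptrdiff_t find_prefixed(const std::vector<std::string>& vs,
                             const std::string& prefix) {
    const auto it = std::find_if(vs.begin(), vs.end(),
        // the concatenation happens once, when the closure is created
        [savedString = prefix + "bar"s](const std::string& s) {
            return s == savedString;
        });
    return it == vs.end() ? -1 : it - vs.begin();
}
```

For a vector holding "foo", "foobar" and "bar", calling find_prefixed(vs, "foo") returns 1.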

make_unique_for_overwrite: optimizing memory initialization in C++20  

With smart pointers, we gained tools that significantly reduced the risks associated with dynamic memory allocations. However, as with any tool, there’s always room for improvement and optimization.

When using make_unique (or make_shared) to allocate arrays, the default behavior is to value-initialize each element. This means that for built-in types, each element is set to zero, and for custom types, their default constructors are called. While this ensures that the memory is initialized to a known state, it introduces a performance overhead, especially when the intention is to overwrite the allocated memory immediately.

Consider the following:

auto ptr = std::make_unique<int[]>(1000); 

This line not only allocates memory for 1000 integers but also initializes each of them to zero. If the next step is to fill this memory with data from a file or a network operation, the initial zeroing is unnecessary and wasteful.

To address this inefficiency, C++20 introduced make_unique_for_overwrite and make_shared_for_overwrite. These functions allocate memory without value-initializing it, making them faster when the immediate intention is to overwrite the memory.

auto ptr = std::make_unique_for_overwrite<int[]>(1000);

The _for_overwrite functions are most beneficial when the allocated memory is immediately overwritten with other data. Until an element is written, it holds an indeterminate value, and reading it leads to undefined behavior.

These new functions can lead to noticeable performance improvements for applications that perform heavy memory operations, such as data processing tools or game engines.

Would you like to see more?
You can see my benchmarks in a separate article. The new init can sometimes be 20x faster than the regular make_shared/make_unique versions. The text is available for C++ Stories Premium/Patreon members.

piecewise_construct and forward_as_tuple  

And finally, let’s see the fifth technique: direct initialization of pairs or tuples whose elements have multi-parameter constructors. This is where std::piecewise_construct and std::forward_as_tuple come into play.

When each element is constructed from a single argument, no helper is needed. For instance:

std::pair<MyType, MyType> p { "one", "two" };

The above code creates the pair without extra temporary MyType objects.

But how about the case where you have one additional constructor taking two arguments:

MyType(std::string str, int a)

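In that case, the following attempt fails to compile:

std::pair<MyType, MyType> p { "one", 1, "two", 2 };

std::pair has no constructor taking four arguments, and the compiler cannot know how to split them between the two elements.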

In those scenarios, std::piecewise_construct comes to the rescue. It’s a tag that instructs std::pair to perform piecewise construction. When combined with std::forward_as_tuple, which creates a tuple of lvalue or rvalue references, you can forward multiple arguments to the constructors of the pair’s elements.

{
    std::cout << "regular: \n";
    std::pair<MyType, MyType> p { MyType{"one", 1}, MyType{"two", 2}};
}
{
    std::cout << "piecewise + forward: \n";
    std::pair<MyType, MyType> p2(std::piecewise_construct,
                                 std::forward_as_tuple("one", 1),
                                 std::forward_as_tuple("two", 2));
}

If we run this program, we can see the following output:

regular: 
MyType one, 1
MyType two, 2
MyType move one
MyType move two
~MyType 
~MyType 
~MyType two
~MyType one
piecewise + forward: 
MyType one, 1
MyType two, 2
~MyType two
~MyType one

Run @Compiler Explorer

As you can see, we have two temporary objects created with the regular approach. With the piecewise option we can pass parameters to the pair’s elements directly.
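The difference can also be measured rather than read from the log. In the sketch below, Tracked is a hypothetical stand-in for MyType that counts its move constructions, and the two helper functions are made up for illustration.

```cpp
#include <string>
#include <tuple>
#include <utility>

// Hypothetical stand-in for MyType that counts move constructions.
struct Tracked {
    static inline int moves = 0;  // C++17 inline variable
    Tracked(std::string s, int v) : name(std::move(s)), value(v) {}
    Tracked(const Tracked&) = default;
    Tracked(Tracked&& other) noexcept
        : name(std::move(other.name)), value(other.value) { ++moves; }
    std::string name;
    int value;
};

// Builds the pair from two temporaries, which are then moved in.
int moves_regular() {
    Tracked::moves = 0;
    std::pair<Tracked, Tracked> p{Tracked{"one", 1}, Tracked{"two", 2}};
    return Tracked::moves;
}

// Forwards the constructor arguments, so both elements are built in place.
int moves_piecewise() {
    Tracked::moves = 0;
    std::pair<Tracked, Tracked> p(std::piecewise_construct,
                                  std::forward_as_tuple("one", 1),
                                  std::forward_as_tuple("two", 2));
    return Tracked::moves;
}
```

moves_regular() reports two move constructions, one per temporary, while moves_piecewise() reports none.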

std::piecewise_construct is particularly useful with containers like std::map and std::unordered_map that store key-value pairs (std::pair). It comes in handy when you want to insert elements into these containers, and either the key or the value (or both) has a multi-parameter constructor or is non-copyable.

See the example below:

#include <string>
#include <map>
#include <tuple>    // std::forward_as_tuple
#include <utility>  // std::piecewise_construct

struct Key {
    Key(int a, int b) : sum(a + b) {}
    int sum;
    bool operator<(const Key& other) const { 
        return sum < other.sum; 
    }
};

struct Value {
    Value(const std::string& s, double d) : name(s), data(d) {}
    std::string name;
    double data;
};

int main() {
    std::map<Key, Value> myMap;

    // doesn't compile: no way to tell which arguments build the key and which the value
    // myMap.emplace(3, 4, "example", 42.0);

    // works:
    myMap.emplace(
        std::piecewise_construct,
        std::forward_as_tuple(3, 4),  
        std::forward_as_tuple("example", 42.0) 
    );
}

Run @Compiler Explorer
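The non-copyable case can be sketched in a similar way. Session and add_session are hypothetical names. Deleting the copy operations also suppresses the implicit move operations, so the value can only be constructed in place inside the map node.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical value type that can be neither copied nor moved.
struct Session {
    Session(std::string u, int i) : user(std::move(u)), id(i) {}
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;
    std::string user;
    int id;
};

// Hypothetical helper: constructs the Session directly inside the map.
// Returns true when a new element was inserted.
bool add_session(std::map<int, Session>& sessions, int key,
                 std::string user, int id) {
    return sessions.emplace(
        std::piecewise_construct,
        std::forward_as_tuple(key),                  // arguments for the key
        std::forward_as_tuple(std::move(user), id)   // arguments for Session
    ).second;
}
```

A second call with the same key returns false and leaves the existing element untouched.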

Summary  

This article explored several techniques for initializing objects in C++. We looked at the efficiency of reserve and emplace_back, the compile-time guarantees of constinit, the flexibility of lambda init captures, the savings of make_unique_for_overwrite, and the capabilities of piecewise_construct and forward_as_tuple. These techniques demonstrate the evolution and strength of the C++ language and help developers write more expressive, efficient, and versatile code.

Some may consider such features an unnecessary complication in the language, but I have a different perspective. Consider the emplace() function, which can improve container insertions. If optimization isn’t required, you can still pass temporary objects and keep the code simpler. C++ provides a straightforward approach but also lets you delve into the internals and work “under the hood” when you need optimal code.

Back to you

The list of techniques presented here is not exhaustive. I am curious about any other useful, if more demanding, techniques for initializing objects efficiently. Please feel free to share your thoughts in the comments section.