Empty Base Class Optimisation, no_unique_address and unique_ptr


C++20 added a couple of new attributes in the form of [[attrib_name]]. One of them - [[no_unique_address]] - can have surprising effects on the code! In this blog post, you’ll learn how to optimize your classes’ layout and make some data members “disappear”. In most cases, it will be just one line of C++20 code.

Let’s go.

A motivating example

unique_ptr is one of the most useful smart pointers and is also easy to use. It’s very flexible as you can even control the way the deleter works.

I wrote an article on that topic some time ago:

Custom Deleters for C++ Smart Pointers - unique_ptr

As a quick refresher, let’s have a look at an example. There’s a legacy type, LegacyList, and its users are required to call ReleaseElements before the list is deleted:

#include <memory>

struct LegacyList {
    void ReleaseElements(); // needs to be called before delete
};

struct LegacyListDeleterFunctor {
    void operator()(LegacyList* p) const {
        p->ReleaseElements();
        delete p;
    }
};

using unique_legacylist_ptr =
    std::unique_ptr<LegacyList, LegacyListDeleterFunctor>;


As you can see, we can create a unique_ptr that holds the pointer to a LegacyList object and then properly destroys it in the custom deleter.

But there’s another nifty property of unique_ptr related to deleters:

Do you know what the size of unique_legacylist_ptr is? It holds the pointer and the deleter, so it should be at least twice the size of a pointer, right?

But it’s not.

For stateless deleters, the size of the unique_ptr is just one pointer! The Standard doesn’t strictly require this, but the major Standard Library implementations provide it, and it’s achieved through Empty Base Class Optimisation.
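A quick way to check this on your own toolchain is to compare the sizes directly. The sketch below reuses the types from the example above; the stub body of ReleaseElements, the free function DeleteLegacyList and the two aliases are assumptions made only for this illustration. The Standard doesn’t fix these sizes, so the assertions reflect what libstdc++, libc++ and the MSVC STL do in practice.

```cpp
#include <memory>

struct LegacyList {
    void ReleaseElements() {} // stub for illustration only
};

// Stateless functor deleter: an empty class, so it needs no storage.
struct LegacyListDeleterFunctor {
    void operator()(LegacyList* p) const {
        p->ReleaseElements();
        delete p;
    }
};

// Hypothetical free-function deleter: unique_ptr must store the function pointer.
inline void DeleteLegacyList(LegacyList* p) {
    p->ReleaseElements();
    delete p;
}

using functor_ptr  = std::unique_ptr<LegacyList, LegacyListDeleterFunctor>;
using function_ptr = std::unique_ptr<LegacyList, void (*)(LegacyList*)>;

// Not mandated by the Standard, but true on the mainstream implementations.
static_assert(sizeof(functor_ptr) == sizeof(LegacyList*),
              "stateless deleter should not enlarge unique_ptr");
static_assert(sizeof(function_ptr) > sizeof(functor_ptr),
              "a function-pointer deleter has to be stored somewhere");
```

The block compiles as a standalone translation unit, and the checks run at compile time. A function-pointer deleter typically doubles the size, which is one reason to prefer a functor or a lambda.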

To understand how it works, we need to open the hood and look at the internals.

Internals of `unique_ptr`

For this purpose, let’s go to the GitHub repository of the Microsoft STL and look at its implementation of unique_ptr:

STL/memory line 2435 · microsoft/STL

and then if we go to line 2558:

https://github.com/microsoft/STL/blob/master/stl/inc/memory#L2558

You can see the following helper type:

_Compressed_pair<_Dx, pointer> _Mypair;

The implementation stores the pointer and the deleter inside a compressed pair.

Throughout the class, unique_ptr uses the _Mypair object to access the pointer and the deleter. For example, in the destructor:

~unique_ptr() noexcept {
    if (_Mypair._Myval2) {
        _Mypair._Get_first()(_Mypair._Myval2); // call deleter
    }
}

Ok… but what’s that compressed pair?

The purpose of this class is to hold two objects, similarly to std::pair, but when one of the types is empty, the compressed pair doesn’t use any storage for it.

Wow, looks interesting!

But how does it work?

See below:

Empty Base Class Optimisation

In C++, there’s a requirement that even a type that has no data members must have a nonzero size.

For example:

struct Empty { };
std::cout << sizeof(Empty); // prints 1

However, there’s no such requirement for empty base class subobjects, so for example:

struct Empty { };
struct EmptyEmpty : Empty { };
std::cout << sizeof(EmptyEmpty);

It’s also 1… not 1 + 1!


That’s why, if you know that a class is empty, you can inherit from it, and the compiler won’t enlarge your derived class!
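To make the difference concrete, here’s a small sketch that stores the same empty type once as a data member and once as a base class. The names AsMember and AsBase are made up for this illustration.

```cpp
struct Empty { };

// Empty stored as a data member: it needs its own address,
// so it takes at least 1 byte (usually followed by padding).
struct AsMember {
    Empty e;
    int value;
};

// Empty used as a base class: the base subobject can share
// its address with `value`, so no extra space is needed.
struct AsBase : Empty {
    int value;
};

static_assert(sizeof(AsMember) > sizeof(int),
              "an empty member still occupies storage");
static_assert(sizeof(AsBase) == sizeof(int),
              "an empty base adds nothing to the size");
```

On a typical platform with a 4-byte int, AsMember ends up at 8 bytes because of alignment padding, while AsBase stays at 4.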

Empty classes can represent anything, like a stateless deleter (for example, for unique_ptr), a stateless allocator, or a class that implements some interface or policy with only member functions and no state. In fact, the Standard Library uses this technique in many places to save space.

Going back to the compressed pair:

Let’s have a look at the code:

This time we have to go into the xmemory header:

https://github.com/microsoft/STL/blob/master/stl/inc/xmemory#L1319

There are two variants: a primary template and a partial specialisation.

The primary template:

// store a pair of values, deriving from empty first
template <class _Ty1, class _Ty2, bool = is_empty_v<_Ty1> && 
                                         !is_final_v<_Ty1>>
class _Compressed_pair final : private _Ty1 {
public:
    _Ty2 _Myval2;
    
    // ... the rest of impl

And the partial specialisation:

// store a pair of values, not deriving from first
template <class _Ty1, class _Ty2>
class _Compressed_pair<_Ty1, _Ty2, false> final { 
public:
    _Ty1 _Myval1;
    _Ty2 _Myval2;
    
    // ... the rest of impl

The main trick here is to check whether the first type is empty (and not final, since you cannot derive from a final class). If it is, we don’t store it as a member, as that would take at least 1 byte; instead, we privately derive from it. The inheritance still gives us a way to call the member functions of the empty class.
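Here’s a heavily simplified sketch of the same idea, so you can see both variants side by side. This is not the MSVC code: the names compressed_pair, first(), first_ and second, as well as the two deleter types, are assumptions for this illustration, and constructors are omitted.

```cpp
#include <type_traits>

// Primary template: T1 is empty and not final, so derive from it (EBO).
template <class T1, class T2,
          bool = std::is_empty_v<T1> && !std::is_final_v<T1>>
class compressed_pair final : private T1 {
public:
    T2 second{};

    // The empty first element is the base class subobject itself.
    T1& first() noexcept { return *this; }
};

// Partial specialisation: T1 has state or is final, so store it as a member.
template <class T1, class T2>
class compressed_pair<T1, T2, false> final {
public:
    T1 first_{};
    T2 second{};

    T1& first() noexcept { return first_; }
};

// Hypothetical deleters used only to exercise both variants.
struct EmptyDeleter {
    void operator()(int* p) const { delete p; }
};
struct FinalDeleter final {
    void operator()(int* p) const { delete p; }
};

static_assert(sizeof(compressed_pair<EmptyDeleter, int*>) == sizeof(int*),
              "empty, non-final first type costs nothing");
static_assert(sizeof(compressed_pair<FinalDeleter, int*>) > sizeof(int*),
              "a final type cannot be a base, so it is stored as a member");
```

With a pair object p, a call such as p.first()(p.second) invokes the deleter in both variants, which mirrors what the unique_ptr destructor does with _Get_first().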

As you can see, the compressed pair is quite simple, as it only considers whether the first type is empty. You can also have a look at the compressed pair from the Boost library, where either the first or the second type can be empty: Compressed_Pair - Boost 1.73.0

Okay, but this article is part of the series about C++20 features, and clearly, EBO is not a new invention.

That’s why we have to look at proposal P0840:

The `no_unique_address` C++20 attribute

C++20 adds a new attribute that reduces the need for EBO: instead of inheriting from an empty type, we can mark a data member with [[no_unique_address]]!

Rather than inheriting and checking if a type is empty or not… we can just write:

template <typename T, typename U>
struct compressed_pair_cpp20 {
    [[no_unique_address]] T _val1;
    [[no_unique_address]] U _val2;
};

Much simpler!

There’s no need for any template magic here! The compiler can check whether the member’s type is empty, and if so, it’s allowed to place the member at the same address as other non-static data members. It will reuse the space.

The attribute can be applied to non-static data members that are not bit-fields.

For example:

struct Empty { };

compressed_pair_cpp20<int, Empty> p;
std::cout << std::addressof(p._val1) << '\n';
std::cout << std::addressof(p._val2) << '\n';

In both lines, you should see the same address, as _val1 and _val2 will occupy the same position in memory.
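The effect also shows up in sizeof. The sketch below repeats the compressed_pair_cpp20 template and adds a few compile-time checks. The first two checks are guarded because they assume a compiler that honours the attribute (GCC or Clang targeting the Itanium ABI); the last one shows a limitation: two members of the same empty type must still have distinct addresses.

```cpp
struct Empty { };

template <typename T, typename U>
struct compressed_pair_cpp20 {
    [[no_unique_address]] T _val1;
    [[no_unique_address]] U _val2;
};

#if !defined(_MSC_VER)
// Assumes the attribute is honoured (GCC 9+, Clang 9+ on non-Windows targets).
static_assert(sizeof(compressed_pair_cpp20<int, Empty>) == sizeof(int),
              "the empty second member shares storage with the int");
static_assert(sizeof(compressed_pair_cpp20<Empty, int>) == sizeof(int),
              "the order of the members does not matter here");
#endif

// Two objects of the same type cannot share an address,
// so this pair cannot shrink to a single byte.
static_assert(sizeof(compressed_pair_cpp20<Empty, Empty>) >= 2,
              "same-type empty members keep distinct addresses");
```

Note that, unlike the EBO version, this one handles an empty type in either position without any extra specialisation.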


Other uses

Thus far, we’ve learned that the new attribute can be used for things like stateless deleters. What are the other options?

If we go to the proposal we can see the following code:

template<typename Key, typename Value,
         typename Hash, typename Pred, typename Allocator>
class hash_map {
  [[no_unique_address]] Hash hasher;
  [[no_unique_address]] Pred pred;
  [[no_unique_address]] Allocator alloc;
  Bucket *buckets;
  // ...
public:
  // ...
};

As you can see, hasher, pred, and alloc have the attribute [[no_unique_address]] applied.

If those non-static data members are empty, they might have the same address as buckets.

It looks like the new attribute is handy for class templates that work with potentially empty data members. This covers stateless deleters, predicates, allocators, and other “custom” objects that live inside your class.
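To see what this buys in practice, here’s a layout-only sketch modelled on the snippet from the proposal. The names hash_map_layout, IntHash, IntEqual and SeededHash are hypothetical, Bucket is just a forward declaration, and the first size check assumes GCC or Clang honouring the attribute.

```cpp
#include <cstddef>
#include <memory>

struct Bucket; // hypothetical node type, only used through a pointer

// Layout-only skeleton; a struct keeps the members accessible for the demo.
template <typename Key, typename Hash, typename Pred, typename Allocator>
struct hash_map_layout {
    [[no_unique_address]] Hash hasher;
    [[no_unique_address]] Pred pred;
    [[no_unique_address]] Allocator alloc;
    Bucket* buckets = nullptr;
};

// Stateless helpers: empty classes.
struct IntHash {
    std::size_t operator()(int v) const { return static_cast<std::size_t>(v); }
};
struct IntEqual {
    bool operator()(int a, int b) const { return a == b; }
};

// Stateful hasher: the seed has to be stored.
struct SeededHash {
    std::size_t seed = 0;
    std::size_t operator()(int v) const { return seed ^ static_cast<std::size_t>(v); }
};

using stateless_map = hash_map_layout<int, IntHash, IntEqual, std::allocator<int>>;
using seeded_map    = hash_map_layout<int, SeededHash, IntEqual, std::allocator<int>>;

#if !defined(_MSC_VER)
// Assumes the attribute is honoured: only the bucket pointer takes space.
static_assert(sizeof(stateless_map) == sizeof(Bucket*),
              "three empty members add nothing");
#endif
static_assert(sizeof(seeded_map) > sizeof(Bucket*),
              "a stateful hasher still needs its storage");
```

The same class template works for both cases: you pay for the hasher, predicate or allocator only when it actually carries state.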


Wrap up

Ok… so we made a little journey inside the STL implementation!

To sum up:

unique_ptr has an optimization where a stateless deleter (a stateless function object or a captureless lambda) won’t take any space, so the size of the smart pointer is just that of a single raw pointer.

Internally, the MSVC implementation (other vendors have a similar approach) uses a compressed pair to store the pointer and the deleter. The compressed pair uses Empty Base Class Optimisation to save space when the deleter type is empty. EBO relies on inheritance, and some template magic is needed to build a proper specialization of the compressed pair class.

(For example, GCC’s libstdc++ uses std::tuple to store the pointer and the deleter. While there’s no requirement for std::tuple to be “compressed”, it seems that the GCC implementation takes this approach, see here: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/unique_ptr.h#L201)

This technique can be significantly simplified in C++20 thanks to the new attribute [[no_unique_address]].
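The first point of this summary also covers captureless lambdas, which is easy to check. In the sketch below, the factory function make_legacy_list and the alias lambda_ptr are hypothetical names, and ReleaseElements has a stub body. As before, the result is what mainstream implementations do, not a guarantee from the Standard.

```cpp
#include <memory>

struct LegacyList {
    void ReleaseElements() {} // stub for illustration only
};

// Hypothetical factory: the deleter is a captureless lambda.
inline auto make_legacy_list() {
    auto deleter = [](LegacyList* p) {
        p->ReleaseElements();
        delete p;
    };
    return std::unique_ptr<LegacyList, decltype(deleter)>(new LegacyList, deleter);
}

using lambda_ptr = decltype(make_legacy_list());

// The closure type of a captureless lambda is empty in practice,
// so the smart pointer stays the size of a raw pointer.
static_assert(sizeof(lambda_ptr) == sizeof(LegacyList*),
              "captureless lambda deleter should take no space");
```

A lambda that captures something would have state, and the unique_ptr would grow accordingly.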

Compiler support

GCC and Clang have supported this new attribute since version 9.0, and MSVC since VS 2019 16.9(*).

(*): MSVC seems to recognize this attribute but doesn’t apply it, due to ABI issues; you can use [[msvc::no_unique_address]] as a workaround. Please have a look at this bug report for more information: https://github.com/microsoft/STL/issues/1364. This was reported by a reader in the r/cpp comments.
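One common way to deal with this difference is a small wrapper macro. The sketch below is only an illustration: the macro name NO_UNIQUE_ADDRESS and the Holder type are made up, and the size check was reasoned about for GCC and Clang; on MSVC 16.9 or later the vendor attribute is expected to give the same result.

```cpp
// Pick the spelling that actually affects layout on the current compiler.
#if defined(_MSC_VER)
#  define NO_UNIQUE_ADDRESS [[msvc::no_unique_address]]
#else
#  define NO_UNIQUE_ADDRESS [[no_unique_address]]
#endif

struct Empty { };

// Hypothetical type using the portable spelling.
struct Holder {
    NO_UNIQUE_ADDRESS Empty tag;
    int value;
};

// Holds on GCC and Clang; expected on MSVC with the vendor attribute.
static_assert(sizeof(Holder) == sizeof(int),
              "the empty member should not enlarge Holder");
```

Keep in mind that a type’s layout then depends on the compiler and attribute support, so avoid this in structures that cross ABI boundaries between differently built binaries.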

And here’s another important message about the MSVC compiler: MSVC C++20 and the /std:c++20 Switch.
