Table of Contents

Would you like to gain 20…35 or even 50% speed improvements when searching in associative containers? In this blog post, we’ll explore a technique called “heterogenous access” that offers such impressive speedups. We’ll explore ordered containers, and the support for unordered collections added recently in C++20.

Recap on heterogeneous lookup in ordered containers  

Let’s bring the example and have a look at how this feature works for ordered containers.

If you have the following code:

std::map<std::string, int> intMap { 
    { "Hello Super Long String", 1 }, 
    { "Another Longish String", 2 }, 
    { "This cannot fall into SSO buffer", 3 }
};

std::cout << "Lookup in intMap with by const char*:\n";
std::cout << intMap.contains("Hello Super Long String") << '\n';

In the above code, although “Hello Super Long String” is a string literal, it has to be converted into a regular std::string (so a memory allocation is needed here), and then the search is performed.

On the other hand this code:

std::map<std::string, int, std::less<>> trIntMap { 
    { "Hello Super Long String", 1 }, 
    { "Another Longish String", 2 }, 
    {"This cannot fall into SSO buffer", 3 }
};

std::cout << "Lookup in trIntMap by const char*: \n";
std::cout << trIntMap.contains("Hello Super Long String") << '\n';

Won’t do any extra allocation for the contains() function call.

We can observe it with the following trick, where we hijack the global new operator:

void* operator new(std::size_t sz){
    std::cout << "Allocating: " << sz << '\n';
    return std::malloc(sz);
}

And here’s the result:

Allocating: 24
Allocating: 23
Allocating: 33
Allocating: 72
Allocating: 24
Allocating: 72
Allocating: 23
Allocating: 72
Allocating: 33
Allocating: 24
Allocating: 23
Allocating: 33
Allocating: 72
Allocating: 24
Allocating: 72
Allocating: 23
Allocating: 72
Allocating: 33
Lookup in intMap with by const char*:
Allocating: 24                         // << !
1
Lookup in trIntMap by const char*: 
1
Lookup in trIntMap by string_view: 
1

Play with code @Compiler Explorer

As you can see, at the top, we have lots of allocations for tree nodes, but then at the bottom, we have one allocation - 24 bytes for looking up the string in intMap, but there are no extra allocations for trInMap.

How it works?  

As you can see, it’s straightforward to enable the “Faster” lookup; all you have to do is use std::less<> for the comparator.

The magic happens inside.

The main idea is that heterogenous access is enabled for comparators that have is_transparent tag.

By default std::map is declared with the following template params:

template<class Key, class T,
    class Compare = std::less<Key>,
    class Allocator = std::allocator<std::pair<const Key, T> >
> class map;

Please notice std::less<Key>.

But on the other hand, std::less<> is a template specialization that works as follows:

template <>
struct less<void> { 
    using is_transparent = int;

    // simplified version...
    template <class _Ty1, class _Ty2>
    constexpr auto operator()(_Ty1&& _Left, _Ty2&& _Right) const
        return static_cast<_Ty1&&>(_Left) < static_cast<_Ty2&&>(_Right);
    }
};

As you can see, operator() uses two separate template parameters, so they don’t have to be the same. They only have to compare.

This is possible for our example: comparing const char* against std::string or string_view. There’s no need to have std::string.

Further on, inside std::map we have function overloads that enable us to use that other comparable type. For example:

bool contains( const Key& key ) const;

template<class K> 
bool contains( const K& x ) const; // enabled when 
                                   // Compare::is_transparent is present

How to enable it for unordered containers?  

How about unordered containers?

While ordered containers got this feature in C++14, we had to wait 6 years for unordered types, but it’s finally there in C++20.

Thanks to the following paper: P0919 and final refinements in P1690.

This time, we have to enable the lookup for the comparator and the hashing function.

See the example:

struct string_hash {
  using is_transparent = void;
  [[nodiscard]] size_t operator()(const char *txt) const {
    return std::hash<std::string_view>{}(txt);
  }
  [[nodiscard]] size_t operator()(std::string_view txt) const {
    return std::hash<std::string_view>{}(txt);
  }
  [[nodiscard]] size_t operator()(const std::string &txt) const {
    return std::hash<std::string>{}(txt);
  }
};

std::unordered_map<std::string, int, string_hash, std::equal_to<>>
      intMapTransparent { 
    { "Hello Super Long String", 1 }, 
    { "Another Longish String", 2 }, 
    {"This cannot fall into SSO buffer", 3 }
};

bool found = intMapNormal.contains("Hello Super Long String");
std::cout << "Found: " << std::boolalpha << found << '\n';

This time it’s a bit more verbose to create the container.

As you can see, I marked my custom hasher string_hash with is_transparent, and then I had to implement three different overloads for operator().

It’s important to be consistent with the hashing results. Ensure that if the data type is different, but the values are “considered the same,” they should result in the same hash value. In other words:

const std::string txt { "Hello World" };
const std::string_view sv { "Hello World" };

// if txt == sv then
// string_hash{}(txt) == string_hash{}(sv)

How is it implemented  

Similarly to ordered containers, the “search”-like functions inside containers have overloads:

For example contains():

// the container:
template<
    class Key,
    class Hash = std::hash<Key>,
    class KeyEqual = std::equal_to<Key>,
    class Allocator = std::allocator<Key>
> class unordered_set;

// two functions:
bool contains( const Key& key ) const;

template <class K> 
bool contains() const K& x ) const; // enabled when Hash::is_transparent and
                                    // KeyEqual::is_transparent is present

Function list  

The heterogeneous lookup, both for ordered and unordered containers, is available in the following member functions.

This includes: std::map, std::set, std::multimap, std::multiset, std::unordered_map, std::unoredered_set, std::unordered_multimap, std::unordered_multiset.

Function Notes
find()
count()
equal_range()
contains() Available since C++20
lower_bound Ordered containers only
upper_bound Ordered containers only

Additionally, in C++23, we’ll have more member functions aware of transparent searching:

Function Notes
erase in C++23, see P2077
extract in C++23, see P2077
insert_or_assign planned for C++23, see P2363
insert planned for C++23, for std::set and std::unordered_set, see P2363
operator[] planned for C++23, see P2363
bucket for unordered containers, see P2363

Additionally, in the summary for P2363 there’s a table that shows all functions supporting this lookup technique: P2364 - Summary & Table.

The Performance Gains with Heterogeneous Lookup  

Inside the paper P0919, the author - Mateusz - presents several experiments for unordered containers (Github repo here: mpusz/unordered_v2):

  • 20% performance gain for short text (SSO used in std::string temporary).
  • 35% performance gain for long text (dynamic memory allocation in std::stringtemporary).

If you want other benchmarks, look at this blog post: C++14/20 Heterogeneous Lookup Benchmark – Coding Tidbit where the author experimented with various containers and created a long list of tests.

My results on a four-core machine, VS 2019:

Short String Benchmark
======================
          Normal Map with string timing:  780ms
           Normal Map with char* timing:  821ms
            Trans Map with char* timing:  985ms
      Trans Map with string_view timing:  671ms
    Normal Unord Map with string timing:  227ms
     Normal Unord Map with char* timing:  602ms
      Trans Unord Map with char* timing:  347ms
Trans Unord Map with string_view timing:  243ms

Long String Benchmark
=====================
          Normal Map with string timing:  614ms
           Normal Map with char* timing: 2875ms
            Trans Map with char* timing: 3083ms
      Trans Map with string_view timing:  724ms
    Normal Unord Map with string timing:  924ms
     Normal Unord Map with char* timing: 3363ms
      Trans Unord Map with char* timing: 1947ms
Trans Unord Map with string_view timing:  997ms

In most cases, “Trans” access reaches the times for the case similar to the original key type. And it’s much faster than when an extra memory allocation has to happen (for long strings).

Summary  

In this article, we covered a technique called “heterogenous access” for searching inside associative containers. While the term may sound complicated, the idea is simple: to search with things different from the “key type” but comparable to it. For example, we allow searching with string literals or string views in a container composed of keyType=std::string.

What’s the main advantage?

We can avoid the cost of creating a temporary key object that would be used to compare or identify things.

The initial support for ordered containers has been present since C++14, and now in C++20, we got it for unordered collections. It works by adding a special is_transparent tag type to comparators or hashing function objects.

This technique is handy in situations where you need to look for strings and have different representations of them. Additionally, it’s convenient to store some larger object, and the key is just part of it. See my previous article with such an example: Heterogeneous Lookup in Ordered Containers, C++14 Feature - C++ Stories.

Ok, but why this feature is not enabled by default?

As we can read in abseil guideline abseil / Tip of the Week #144: Heterogeneous Lookup in Associative Containers:

Implicitly supporting heterogeneous lookup can be dangerous, as the relationship between values might not be maintained after conversions. For example, 1.0 < 1.1, but static_cast<int>(1.0) == static_cast<int>(1.1). Thus, using a double to look up a value in a std::set<int> could lead to incorrect results. These potential bugs are the reason this feature is opt-in.

Back to you

  • Have you tried heterogenous access?