Sometimes, if you mix different integer types in an expression, you can end up with tricky cases. For example, comparing long with size_t might give different results than comparing long with unsigned short. C++20 brings some help, and there’s no need to learn all the complex rules :)

Conversion and Ranks  

Let’s have a look at two comparisons:

#include <iostream>

int main() {
    long a = -100;
    unsigned short b = 100;
    std::cout << (a < b);   // 1
    size_t c = 100;
    std::cout << (a < c);   // 2
}   

If you run the code @Compiler Explorer (GCC 12, x86-64, default flags) you’ll see:

10

Why? Why not 11?

(By the way, I asked that question on Twitter, see https://twitter.com/fenbf/status/1568566458333990914 - thank you for all the answers and hints)

If we run C++Insights, we’ll see the following transformation:

long a = static_cast<long>(-100);
unsigned short b = 100;
std::cout.operator<<((a < static_cast<long>(b)));
size_t c = 100;
std::cout.operator<<((static_cast<unsigned long>(a) < c));

As you can see, in the first case, the compiler converted unsigned short to long, and then comparing -100 to 100 made sense. But in the second case, long was converted to unsigned long, so -100 wrapped around modulo 2^N (where N is the bit width of unsigned long) and became std::numeric_limits<size_t>::max() - 99, which is a very large positive number.
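To make the wrap-around concrete, here’s a minimal sketch. The helper name as_unsigned is my own (it’s not part of the Standard Library); it only spells out the conversion that the compiler applies implicitly to the signed operand.

```cpp
#include <limits>

// Hypothetical helper: the same conversion the compiler applies
// implicitly to `a` in the expression `a < c`.
constexpr unsigned long as_unsigned(long value) {
    return static_cast<unsigned long>(value);
}

// Conversion to an unsigned type is modulo 2^N, so -100 lands
// 99 below the maximum value of the unsigned type.
static_assert(as_unsigned(-100) == std::numeric_limits<unsigned long>::max() - 99);

// That is why the comparison against 100 yields false.
static_assert(!(as_unsigned(-100) < 100ul));
```

Both checks happen at compile time, so the snippet only needs to compile to confirm the arithmetic.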

In general, if you have a binary operation, the compiler needs both operands to have the same type; if the types differ, the compiler must perform some conversion. See the notes from C++ Reference:

For the binary operators (except shifts), if the promoted operands have different types, additional set of implicit conversions is applied, known as usual arithmetic conversions with the goal to produce the common type (also accessible via the std::common_type type trait)…

As for integral types:

  • If both operands are signed or both are unsigned, the operand with lesser conversion rank is converted to the operand with the greater integer conversion rank.
  • Otherwise, if the unsigned operand’s conversion rank is greater or equal to the conversion rank of the signed operand, the signed operand is converted to the unsigned operand’s type.
  • Otherwise, if the signed operand’s type can represent all values of the unsigned operand, the unsigned operand is converted to the signed operand’s type.
  • Otherwise, both operands are converted to the unsigned counterpart of the signed operand’s type.

And the conversion rank:

The conversion rank above increases in order bool, signed char, short, int, long, long long (since C++11). The rank of any unsigned type is equal to the rank of the corresponding signed type. The rank of char is equal to the rank of signed char and unsigned char. The ranks of char8_t (since C++20), char16_t, char32_t (since C++11), and wchar_t are equal to the ranks of their corresponding underlying types.

For our use case, the rank of unsigned short is smaller than the rank of long, and long can represent all of its values, so the unsigned short operand was converted to long. In the second case, the rank of size_t (which is unsigned long on this platform) is greater than or equal to the rank of long, so the long operand was converted to unsigned long.
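You can ask the compiler which common type it picks. The sketch below uses a small alias of my own, arithmetic_result_t (not a standard name), built on decltype. The comparison operators apply the same usual arithmetic conversions to their operands as operator+ does, so the result type of the addition reveals the type used for the comparison. The first assertion assumes a mainstream platform where long is wider than unsigned short.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical alias: the type of `T{} + U{}`, that is, the common type
// selected by the usual arithmetic conversions.
template <class T, class U>
using arithmetic_result_t = decltype(std::declval<T>() + std::declval<U>());

// long vs unsigned short: the signed type wins, so -100 < 100 works as expected.
static_assert(std::is_same_v<arithmetic_result_t<long, unsigned short>, long>);

// long vs size_t: the common type is unsigned, so a negative long wraps around.
static_assert(std::is_unsigned_v<arithmetic_result_t<long, std::size_t>>);
```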

If you compare signed with unsigned, make sure the signed value is non-negative to avoid unexpected conversions.

Use Cases  

In general, we should aim to use the same integral types to avoid various conversion warnings and bugs. For example, the following code:

std::vector numbers {42, 76, 2, 21, 98, 100 };
for (int i = 0; i < numbers.size(); ++i)
        std::cout << i << "(" << numbers[i] << "), ";

This code generates a GCC warning with -Wall (-Wsign-compare), because i is signed and numbers.size() is unsigned. However, it can be easily fixed by using unsigned int or size_t as the type for the loop counter.
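Here’s a minimal sketch of that fix. I wrapped the loop in a function called format_indexed (a name made up for this example) that returns the text instead of printing it, which makes the result easy to check.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds the same "index(value), " text as the loop
// above, but with an unsigned counter that matches the type of size().
std::string format_indexed(const std::vector<int>& numbers) {
    std::string out;
    for (std::size_t i = 0; i < numbers.size(); ++i)
        out += std::to_string(i) + "(" + std::to_string(numbers[i]) + "), ";
    return out;
}
```

Since both sides of i < numbers.size() are now unsigned, there is no sign conversion and no warning.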

What’s more, such code might also be improved by various C++ features, for example:

std::vector numbers {42, 76, 2, 21, 98, 100 };
for (int i = 0; auto &num : numbers)
    std::cout << "i: " << i++ << " - " << num << '\n';

The above example uses a range-based-for loop with an initializer (C++20). That way, there’s no need to compare the counter against the container size.

On the other hand, there are situations where you get integral numbers of different types:

long id = -1;
if (id >= 0 && id < container.size()) {

}

In the above sample, I used id, which can have some negative value (to indicate some other properties), and when it’s valid (in range), I can access elements of some container.

In this case, I don’t want to change the type of the id object, so I have to put static_cast<size_t>(id) to avoid warnings.

Putting casts here and there might not be the best idea, not to mention the code style.

Additionally, we should also follow the C++ Core Guideline Rule:

ES.100: Don’t mix signed and unsigned arithmetic:

Reason Avoid wrong results.

Fortunately, in C++20, we have a utility to handle such situations.

It’s called “Safe Integral Comparisons” - P0586 by Federico Kircheis.

Safe integral comparisons functions  

In the Standard Library we’ll have the following new functions that compare with the “mathematical” meaning:

// <utility> header:
template <class T, class U>
constexpr bool cmp_equal(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_not_equal(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_less(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_greater(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_less_equal(T t, U u) noexcept;
template <class T, class U>
constexpr bool cmp_greater_equal(T t, U u) noexcept;
template <class R, class T>
constexpr bool in_range(T t) noexcept;

T and U are required to be standard or extended integer types, so these functions cannot be used to compare std::byte, char, char8_t, char16_t, char32_t, wchar_t, or bool.

You can find those functions in the <utility> header file.

This article started as a preview for Patrons months ago. If you want to get exclusive content, early previews, bonus materials and access to the Discord server, join the C++ Stories Premium membership.

Examples  

We can rewrite our initial example into:

#include <iostream>
#include <utility>

int main() {
    long a = -100;
    unsigned short b = 100;
    std::cout << std::cmp_less(a, b);
    size_t c = 100;
    std::cout << std::cmp_less(a, c);
}   

See the code at @Compiler Explorer

And here’s another snippet:

#include <cstdint>
#include <iostream>
#include <utility>
 
int main() {
    std::cout << std::boolalpha;
    std::cout << 256 << "\tin uint8_t:\t" << std::in_range<std::uint8_t>(256) << '\n';
    std::cout << 256 << "\tin long:\t" << std::in_range<long>(256) << '\n';
    std::cout << -1 << "\tin unsigned:\t" << std::in_range<unsigned>(-1) << '\n';
}

Run @Compiler Explorer

Real code  

I also looked at some open-source code using codesearch.isocpp.org. I searched for static_cast<int> to find loop patterns or conditions that mix signed and unsigned values. Here’s one interesting result:

// actcd19/main/c/chromium/chromium_72.0.3626.121-1/chrome/browser/media/webrtc/window_icon_util_x11.cc:49:

int start = 0;
int i = 0;
while (i + 1 < static_cast<int>(size)) {
    if ((i == 0 || static_cast<int>(data[i] * data[i + 1]) > width * height) &&
        (i + 1 + data[i] * data[i + 1] < static_cast<int>(size))) {

size is probably unsigned, so they always have to convert it and compare it against int.

And searching for static_cast<size_t> shows: codesearch.isocpp.org

// actcd19/main/c/chromium/chromium_72.0.3626.121-
// 1/third_party/libwebm/source/common/vp9_level_stats_tests.cc:92:

for (int i = 0; i < frame_count; ++i) {
    const mkvparser::Block::Frame& frame = block->GetFrame(i);
    if (static_cast<size_t>(frame.len) > data.size()) {
        data.resize(frame.len);
        data_len = static_cast<size_t>(frame.len);
        // ...

This time frame.len has to be converted to size_t to allow safe comparisons.

Implementation Notes  

Since the MSVC STL is on GitHub, you can quickly see how the feature was developed: see this pull request, or read the code in STL/utility at master · Microsoft/STL.

Here’s the code for cmp_equal():

template <class _Ty1, class _Ty2>
_NODISCARD constexpr bool cmp_equal(const _Ty1 _Left, const _Ty2 _Right) noexcept {
  static_assert(_Is_standard_integer<_Ty1> && _Is_standard_integer<_Ty2>,
   "The integer comparison functions only "
   "accept standard and extended integer types.");
  if constexpr (is_signed_v<_Ty1> == is_signed_v<_Ty2>) {
    return _Left == _Right;
  } else if constexpr (is_signed_v<_Ty2>) {
    return _Left == static_cast<make_unsigned_t<_Ty2>>(_Right) && _Right >= 0;
  } else {
    return static_cast<make_unsigned_t<_Ty1>>(_Left) == _Right && _Left >= 0;
  }
}

And similar code for cmp_less():

template <class _Ty1, class _Ty2>
_NODISCARD constexpr bool cmp_less(const _Ty1 _Left, const _Ty2 _Right) noexcept {
    static_assert(_Is_standard_integer<_Ty1> && _Is_standard_integer<_Ty2>, "same...");
    if constexpr (is_signed_v<_Ty1> == is_signed_v<_Ty2>) {
        return _Left < _Right;
    } else if constexpr (is_signed_v<_Ty2>) {
        return _Right > 0 && _Left < static_cast<make_unsigned_t<_Ty2>>(_Right);
    } else {
        return _Left < 0 || static_cast<make_unsigned_t<_Ty1>>(_Left) < _Right;
    }
}

Notes:

  • The std:: namespace is omitted here, so is_signed_v is the standard type trait std::is_signed_v, and make_unsigned_t is std::make_unsigned_t.
  • Notice the excellent and expressive use of if constexpr; it makes metaprogramming code very easy to read.

The code fragments present cmp_equal() and cmp_less(). In both cases, the main idea is to work with the same sign. There are three cases to cover:

  • If both types have the same sign, then we can compare them directly
  • But when the signs differ (the two remaining cases), the code uses make_unsigned_t to convert the _Right or _Left operand and separately handles the case where the signed value is negative.

Help from the compiler  

When I asked the question on Twitter, I also got a helpful answer:

My example used only default GCC settings, but it’s best to turn on handy compiler warnings and avoid such conversion bugs at compile time.

Just adding -Wall generates the following warning:

<source>:8:21: warning: comparison of integer expressions of different signedness: 'long int' and 'size_t' {aka 'long unsigned int'} [-Wsign-compare]
    8 |     std::cout << (a < c);
      |                   ~~^~~

See at Compiler Explorer

You can also compile with -Werror -Wall -Wextra, and then the build fails on comparisons that mix signed and unsigned operands, so such code never gets to run.

Compiler Support  

As of September 2022, the feature is implemented in GCC 10, Clang 13, and MSVC (Visual Studio 2019 version 16.7).
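If you need to support older toolchains, you can check the feature-test macro __cpp_lib_integer_comparison_functions. Below is a sketch under my own assumptions: safe_less is a hypothetical wrapper, and the C++17 fallback mirrors the logic shown in the Implementation Notes but doesn’t reject character types or bool the way the standard functions do.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical wrapper: uses std::cmp_less when the library provides it,
// otherwise falls back to a hand-written C++17 version.
template <class T, class U>
constexpr bool safe_less(T t, U u) noexcept {
#if defined(__cpp_lib_integer_comparison_functions)
    return std::cmp_less(t, u);
#else
    if constexpr (std::is_signed_v<T> == std::is_signed_v<U>) {
        return t < u;                      // same signedness: compare directly
    } else if constexpr (std::is_signed_v<T>) {
        // signed left, unsigned right: negative values are always less
        return t < 0 || static_cast<std::make_unsigned_t<T>>(t) < u;
    } else {
        // unsigned left, signed right: right must be positive to be greater
        return u > 0 && t < static_cast<std::make_unsigned_t<U>>(u);
    }
#endif
}
```

For example, safe_less(-100L, std::size_t{100}) returns true on either path.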

Summary  

This post discussed some fundamental issues with integer conversions and comparisons. In short, if you have a binary arithmetic operation, the compiler must have the same types for both operands. Because of the conversion rules, some values might be converted from signed to unsigned and thus yield problematic results. C++20 offers a new set of comparison functions, cmp_*, which ensure the sign is correctly handled.

If you want to read more about integer conversions, look at this excellent blog post: The Usual Arithmetic Confusions by Shafik Yaghmour. And also this one: Summary of C/C++ integer rules by Nayuki.

Back to you

  • What’s your approach for working with different integer types?
  • How do you avoid conversion errors?

Share your feedback in the comments below.