In this blog post, we’ll show how to implement a custom pipe operator and apply it to a data processing example. Thanks to C++23 and `std::expected`, we can write a rather efficient framework that easily handles `unexpected` outcomes.

This is a collaborative guest post by prof. Bogusław Cyganek:

Prof. Cyganek is a researcher and lecturer at the Department of Electronics, AGH University of Science and Technology in Cracow, Poland. He has worked as a software engineer for a number of companies such as Nisus Writer USA, Compression Techniques USA, Manta Corp. USA, Visual Atoms UK, Wroclaw University in Poland, and Diagnostyka Inc. Poland. His research interests include computer vision and pattern recognition, as well as the development of embedded systems. See his recent book at Amazon and his home page. Prof. Cyganek also provides commercial training for Modern C++, Standard Library, and more.

If `A` is a range adaptor closure object (RACO) and `R` is a range, then the expression

``````A(R)
``````

is equivalent to

``````R | A
``````

Range adaptor closure objects can be chained with operator `|`. If `A` and `B` are RACOs, then

``````A | B
``````

is another RACO `C` that fulfills the following conditions:

• `C` stores copies of `A` and `B`, each directly initialized from `std::forward<decltype((T))>(T)`, for `T` being `A` or `B`, respectively.
• If `a` and `b` are those stored copies of `A` and `B`, respectively, and `R` is a range object, then the following expressions are equivalent:
``````b(a(R))
R | a | b
C(R)
R | C
R | (A | B)
``````

## A Basic Example

Below, there’s an example of an overloaded operator `|`. The example is inspired by the CppCon talk: Functional Composable Operations with Unix-Style Pipes in C++ - Ankur Satle - CppCon 2022. Its left operand is `s`, passed by the rvalue reference `std::string &&`, and its right operand is a function `f`. The `|` operator simply calls `f`, providing it with `s`, as done on line `[4]`. Let’s observe that `s` needs to be `std::move`’d, since it is a named object here and therefore an lvalue. Hence, the callable `f` must be able to accept `std::string &&` as its parameter and return `std::string`; for simplicity, exactly this is defined on line `[1]` as an alias `Function`:

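``````#include <functional>
#include <iostream>
#include <string>
#include <utility>

using Function = std::function<std::string(std::string &&)>;    // [1]

auto operator | (std::string &&s, Function f) -> std::string {  // [3]
    return f(std::move(s));                                      // [4]
}
``````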
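The remark about `std::move` can be verified in isolation. In the sketch below, `AppendMark`, `Relay`, and `MoveRemarkTest` are hypothetical names used only for this illustration; the point is that a parameter declared as `std::string &&` is an lvalue once it has a name, so it cannot be passed on to another `std::string &&` parameter without `std::move`.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Accepts only rvalues, like the callables stored in Function.
std::string AppendMark(std::string &&s) {
    s += "!";
    return s;
}

// Inside this function the parameter has a name, so the expression `s`
// is an lvalue even though its declared type is std::string &&.
std::string Relay(std::string &&s) {
    // return AppendMark(s);          // would not compile: lvalue -> std::string &&
    return AppendMark(std::move(s));  // std::move turns s back into an rvalue
}

// The same facts expressed as compile-time checks.
static_assert(!std::is_invocable_v<decltype(&AppendMark), std::string &>);
static_assert(std::is_invocable_v<decltype(&AppendMark), std::string &&>);

void MoveRemarkTest() {
    assert(Relay("abc") == "abc!");
}
```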

To see the pipeline in action, let’s define a number of functions, starting on line `[8]`, each processing `std::string`. That is, each of them extends the input string `s`, prints a diagnostic message, and finally returns the modified string, as follows:

``````std::string StringProc_1(std::string &&s) {    // [8]
    s += " proc by 1,";
    std::cout << "I'm in StringProc_1, s = " << s << "\n";
    return s;
}

std::string StringProc_2(std::string &&s) {
    s += " proc by 2,";
    std::cout << "I'm in StringProc_2, s = " << s << "\n";
    return s;
}

std::string StringProc_3(std::string &&s) {
    s += " proc by 3,";
    std::cout << "I'm in StringProc_3, s = " << s << "\n";
    return s;
}                                               // [24]
``````

The entire pipeline is called in the test function `SimplePipeTest`, defined on lines `[28-32]`.

``````void SimplePipeTest() {                                 // [28]
    std::string start_str("Start string ");
    std::cout << (std::move(start_str) | StringProc_1
                  | StringProc_2 | StringProc_3);
}                                                       // [32]
``````

The pipe operator is called in a series, starting with the initial string `start_str`. In other words, `start_str` is passed on to `StringProc_1`, its result is passed on to `StringProc_2`, and then to `StringProc_3`; then, finally, its result is streamed to `std::cout`.

The output is as follows:

``````I'm in StringProc_1, s = Start string  proc by 1,
I'm in StringProc_2, s = Start string  proc by 1, proc by 2,
I'm in StringProc_3, s = Start string  proc by 1, proc by 2, proc by 3,
Start string  proc by 1, proc by 2, proc by 3,
``````

## Making it more general

This is an easy way to organize a pipeline operation in C++. However, to make our operator `|` more generic, it can be re-coded to the following:

``````template <typename T, typename Function>
requires (std::invocable<Function, T>)
constexpr auto operator | (T &&t, Function &&f) -> typename std::invoke_result_t<Function, T> {
    return std::invoke(std::forward<Function>(f), std::forward<T>(t));
}
``````

The improvements we have added are as follows:

• A constraint has been added that requires the `Function` parameter to be invocable with an argument of type `T`.
• `constexpr`, so the operator `|` can be called and executed at compile time, if its arguments are also available at that time.
• The return type is defined with the helper `std::invoke_result_t<Function, T>`, which deduces the return type at compile time.
• The operator calls `std::invoke`, which invokes the callable object `f` with the parameter `t`. The benefit of using `std::invoke`, instead of a direct call `f(t)`, is that the former works with any callable, such as a function pointer, a reference to a function, a lambda function, a member function pointer, a function object (i.e., one with `operator()` on board), or a pointer to member data. In other words, the callable `f` has to satisfy the Callable named requirement.

Here’s an updated example that illustrates the benefits of our updated `operator |`:

``````void SimplePipeTest() {
    std::string start_str("Start string ");
    std::cout << (std::move(start_str) |
        StringProc_1 | StringProc_2 | [](std::string&& s) {
            s += " proc by 3,";
            std::cout << "I'm in StringProc_3, s = " << s << "\n";
            return s;
        });
}
``````

And the output:

``````I'm in StringProc_1, s = Start string  proc by 1,
I'm in StringProc_2, s = Start string  proc by 1, proc by 2,
I'm in StringProc_3, s = Start string  proc by 1, proc by 2, proc by 3,
Start string  proc by 1, proc by 2, proc by 3,
``````

## Handling errors

Everything would be fine, but what to do if one link in the above pipeline cannot complete its operation and transmit its result because an error occurred? Of course, it may throw an exception and interrupt the entire operation. But there is also another alternative.

We can use `std::optional` to express whether the operation succeeded and produced a result, or whether the calculations failed for some reason and no result can be provided. But if you have a C++23 compiler, an even better option is to use `std::expected`. Unlike `std::optional`, which has been available since C++17, `std::expected` lets you pass an error code in the event of a failure, not just signal that a failure occurred. We have implemented this idea in the new version of the pipe operator `|`, see below:

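``````using namespace std;                                                  // [1]
template <typename T, typename E, typename Function>                  // [2]
requires invocable<Function, T>                                       // [3]
         && is_expected<invoke_result_t<Function, T>>                 // [4]
constexpr auto operator | (std::expected<T, E> &&ex, Function &&f)    // [5]
    -> typename invoke_result_t<Function, T>                          // [6]
{                                                                     // [7]
    return ex ?                                                       // [8]
        invoke(forward<Function>(f), *forward<expected<T, E>>(ex))    // [9]
        : ex;                                                         // [10]
}                                                                     // [11]
``````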

The key part is the new input parameter on line `[5]`: it is no longer a `T` object, but a `std::expected<T, E>`, where `T` stands for an expected value, while `E` denotes an unexpected value that represents those cases where an expected value cannot be computed.

To verify that the callable fits this scheme, a new constraint is defined on lines `[3-4]`. Its new second part, `is_expected`, is responsible for verifying that invoking `Function f` with an argument of type `T` actually returns a `std::expected`. Its entire definition will be analyzed later.

Given the `ex` parameter, passed as an rvalue reference, line `[8]` checks whether it holds a valid object. If so, then on line `[9]`, as before, we call the action `f`, this time with the value stored in `ex`, obtained by dereferencing the forwarded `ex`. Otherwise, we simply return `ex` on line `[10]`; in this case, it only carries the error code. Other functions in the chain will behave the same way. This means that if an error occurs at some stage of the pipeline, it will be propagated to the end of the chain, and no other ‘worker’ function `f` will be called.

To see this pipeline in action, let’s build a more complex example:

``````// Some error types just for the example
enum class OpErrorType : unsigned char {
    kInvalidInput, kOverflow, kUnderflow
};

// The object processed in the pipeline
struct Payload {
    std::string fStr{};
    int fVal{};
};

// For the pipeline operation - the expected type is Payload,
// while the 'unexpected' is OpErrorType
using PayloadOrError = std::expected<Payload, OpErrorType>;
``````

`PayloadOrError` is simply `std::expected<Payload, OpErrorType>`, which has `Payload` as the expected type, and reports any errors in the form of `OpErrorType` error codes.

The `Payload` structure has two members: `fStr` of type `std::string` and `fVal` of type `int`. In practice, of course, it can be any object that we want to process in a pipeline.

The elements of the processing chain are the functions `Payload_Proc_1` and subsequent functions. A characteristic feature of each of them is the initial condition in which we check whether the object `s`, passed by rvalue reference, contains a valid object. If not, the function immediately ends its operation by returning the `s` object, which, in this case, carries the error code.

``````PayloadOrError Payload_Proc_1(PayloadOrError &&s) {
    if (!s)
        return s;
    ++s->fVal;
    s->fStr += " proc by 1,";
    std::cout << "I'm in Payload_Proc_1, s = " << s->fStr << "\n";
    return s;
}
``````

However, if we have a valid `Payload` object, we can freely process it. Finally, this processed object `s` is returned so that it can be processed by another function in the pipeline, and so on.

We introduced a slight variation only to the `Payload_Proc_2` function. This time we simulate an error – if the randomly drawn value is even, `std::unexpected` with a randomly drawn error code will be returned. This means that the calculations have failed and, as a result, the pipeline has been interrupted.

``````PayloadOrError Payload_Proc_2(PayloadOrError &&s) {
    if (!s)
        return s;
    ++s->fVal;
    s->fStr += " proc by 2,";
    std::cout << "I'm in Payload_Proc_2, s = " << s->fStr << "\n";
    // Emulate the error, at least once in a while ...
    std::mt19937 rand_gen( std::random_device {} () );
    return ( rand_gen() % 2 ) ? s :
        std::unexpected { rand_gen() % 2 ?
            OpErrorType::kOverflow : OpErrorType::kUnderflow };
}
``````

And the last one, `Payload_Proc_3`:

``````PayloadOrError Payload_Proc_3(PayloadOrError &&s) {
    if (!s)
        return s;
    ++s->fVal;
    s->fStr += " proc by 3,";
    std::cout << "I'm in Payload_Proc_3, s = " << s->fStr << "\n";
    return s;
}
``````

The entire pipeline component with `std::expected` is launched and tested in function `Payload_PipeTest`. If the pipeline operation was successful, then the resulting string and integer are printed. Otherwise, one of the error messages is displayed in one of the branches of the switch statement.

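``````void Payload_PipeTest() {
    // Initial payload inferred from the sample output below
    // (42 plus three increments gives 45).
    auto res = PayloadOrError { Payload { "Start string", 42 } }
        | Payload_Proc_1 | Payload_Proc_2 | Payload_Proc_3;

    // print_nl is a helper from the full example; it is assumed to print
    // its arguments followed by a newline.
    if (res)
        print_nl("Success! Result of the pipe: ", res->fStr, ", ", res->fVal);
    else
        switch (res.error()) {
            case OpErrorType::kInvalidInput:
                print_nl("Error: OpErrorType::kInvalidInput");
                break;
            case OpErrorType::kOverflow:
                print_nl("Error: OpErrorType::kOverflow");
                break;
            case OpErrorType::kUnderflow:
                print_nl("Error: OpErrorType::kUnderflow");
                break;
            default:
                print_nl("That's really an unexpected error ...");
                break;
        }
}
``````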

The last thing to explain is the `is_expected` concept. First, the parameter `t` of type `T` is introduced. Then, it is checked that type `T` defines `value_type`, as well as `error_type`. And then the series of three nested requirements begins.

``````template <typename T>
concept is_expected = requires(T t) {
    typename T::value_type;
    typename T::error_type;
    requires std::is_constructible_v<bool, T>;
    requires std::same_as<std::remove_cvref_t<decltype(*t)>, typename T::value_type>;
    requires std::constructible_from<T, std::unexpected<typename T::error_type>>;
};
``````

What characterizes them is the leading keyword `requires`. Compared with a simple requirement, a nested requirement forces the compiler to evaluate the expression: the concept is satisfied only if the expression is true, as in the following requirement:

``````requires std::is_constructible_v<bool, T>;
``````

However, the same expression without the `requires` keyword at the beginning only checks whether the expression compiles or not, without evaluating its logical value. Of course, the first approach is ‘stronger’.

``````std::is_constructible_v<bool, T>;
``````

The condition is used to ensure that the type `T` can be explicitly converted to `bool`. However, why don’t we simply write:

``````requires std::is_convertible_v<T, bool>;
``````

or

``````requires std::convertible_to<T, bool>;
``````

The thing is that the former is valid only if `T` is implicitly convertible to `bool`. On the other hand, the latter is valid only if `T` is implicitly and explicitly convertible to `bool`. However, `std::expected` defines only the explicit conversion to `bool`, that is:

``````constexpr explicit operator bool() const noexcept;
``````

Therefore neither of the two above will work in our case. Hence, a workaround is to use `std::is_constructible_v<bool, T>`, which is valid if an object of the `bool` type can be constructed out of `T`. In our case, this means that the following initialization:

``````bool test_b{(bool)PayloadOrError()};
``````

is possible. As explained earlier, if written without the `requires` keyword in front, the checks with `std::is_convertible_v` and `std::convertible_to` both compile; however, in that case only the syntax, not the value, is verified.

The other two requirements are a bit simpler. We require that the type of the dereferenced `*t`, with references and cv-qualifiers removed, is the same as `T::value_type`. Finally, it is verified that `T` can be constructed from a `std::unexpected` object. All these are fulfilled if `T` is compatible with the `std::expected` type.

The whole concept can be easily verified using `static_assert`, like below:

``````static_assert(is_expected<PayloadOrError>); // a short-cut to verify the concept
``````

The last thing is to observe and analyze the results of this code. After execution, we sometimes receive the following texts:

``````I'm in Payload_Proc_1, s = Start string proc by 1,
I'm in Payload_Proc_2, s = Start string proc by 1, proc by 2,
I'm in Payload_Proc_3, s = Start string proc by 1, proc by 2, proc by 3,
Success! Result of the pipe: Start string proc by 1, proc by 2, proc by 3,, 45
``````

but sometimes it happens that we get the below error message:

``````I'm in Payload_Proc_1, s = Start string proc by 1,
I'm in Payload_Proc_2, s = Start string proc by 1, proc by 2,
Error: OpErrorType::kOverflow
``````

or the other one:

``````I'm in Payload_Proc_1, s = Start string proc by 1,
I'm in Payload_Proc_2, s = Start string proc by 1, proc by 2,
Error: OpErrorType::kUnderflow
``````

All this is OK and as intended. And the above pipeline building techniques are a powerful programming tool that we can successfully use in many projects. They are also the basis for functional programming with the ranges library.

Let us observe that the presented pipeline framework is a kind of alternative to monadic processing in `std::expected`.

Finally, let’s notice that there are also proposals to provide users of the ranges library with mechanisms to create adaptor closure objects, so users can seamlessly implement their custom range adaptors any way they like.

Here’s the code to experiment: Run @Compiler Explorer

## Summary

What a ride! The article started with a simple notion of a pipe operator, and then we extended it with a generic calling code and `std::expected`. As you can see, thanks to `std::expected`, we can efficiently handle cases where something goes on the “else” path.

#### Back to you

• Do you use the pipe operator for functional composition?
• Do you compose ranges with the pipe operator, or do you prefer regular invocation?