# Introduction

std::variant is a new feature of C++17. It is C++’s take on a sum type: a type that holds exactly one of several alternatives, which may be of different types. Now, if you’re thinking, wait, C++ has union types, you’d be correct. Unfortunately, union types in C++ are absolutely horrible, for reasons we will discuss. To cut to the punchline: despite the shortcomings of std::variant, union types have no place in modern C++ (with the possible exception of code needing to interface with C code).

# Union Types: How they didn’t fit in with C++

I’ll start with some code that is valid C containing a union type. This is a pretty typical case of a sum type: I want to express that a value is either present or not, and if it is not, I’d like to get an error message describing why. In this case, we’ll just say the error is encoded as a string, but you could imagine something more complicated with an error code as well as some string or debug information.
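The original listing is not reproduced here, so what follows is only a minimal sketch of what such a union might look like. The names IntOrError, IntAndError, make_value and make_error are hypothetical, chosen for illustration, and the code is written so that it should compile as both C and C++.

```cpp
/* Hypothetical sum type: either an int result or an error message. */
typedef union {
    int value;                 /* meaningful when the operation succeeded */
    const char* error_message; /* meaningful when the operation failed */
} IntOrError;

/* The alternative layout for comparison: both fields are always present. */
typedef struct {
    int value;
    const char* error_message;
} IntAndError;

/* Hypothetical helpers that populate one member or the other. */
IntOrError make_value(int v) {
    IntOrError r;
    r.value = v;
    return r;
}

IntOrError make_error(const char* message) {
    IntOrError r;
    r.error_message = message;
    return r;
}
```

Nothing in IntOrError records which member was written last, so the caller has to track that separately.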

The advantages of this over a struct containing both a const char* and an int are twofold. Primarily, it guides the consumer to do the correct thing: as documented by the union type, it contains only one of the values, so the consumer knows the two fields are never meaningful together. Secondly, it is more space efficient, as the union only needs to be large enough to store its largest member. So, in this case, the union is likely 8 bytes on a 64-bit machine, while the struct would likely be 16 bytes once padding is counted.

Now, the example above is really meant to be C code. It will compile in C++, but it is not idiomatic C++. One reason is that the error_message field is a const char*, which isn’t really the type used for strings in C++: usually std::string is used. Can this be done in C++?

Well, not simply. std::string has a destructor, which cleans up allocated memory. Imagine the code that a C++ compiler would have to generate when a union variable goes out of scope. It must statically dispatch a destructor call to the std::string if one is present, but it doesn’t have this information at compile time. Even if you imagine a run-time dispatch to the right destructor, the union lacks that information, as it does not remember which field is populated. The union is just data. Until C++11 this would have been a compile-time error, because std::string has a non-trivial copy constructor and destructor. C++11 allows it, albeit with the caveat that you’re on your own as far as managing the memory it contains: you have to manually call the constructor and destructor.
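To make the C++11 caveat concrete, here is a minimal sketch of a union with a std::string member whose lifetime is managed by hand. The names IntOrString and message_length are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical union with a non-trivial member (allowed since C++11).
union IntOrString {
    int value;
    std::string message;

    IntOrString() : value(0) {}  // start out with the int active
    ~IntOrString() {}            // cannot know which member is alive, so does nothing
};

// Constructs a string inside the union, uses it, and destroys it by hand.
std::size_t message_length(const char* text) {
    IntOrString u;                       // value is the active member
    new (&u.message) std::string(text);  // begin the string's lifetime via placement new
    std::size_t length = u.message.size();
    u.message.~basic_string();           // end it by hand; ~IntOrString() will not
    return length;
}
```

Forgetting the explicit destructor call here would leak whatever the string allocated, and the compiler would not complain.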

Which means you need to put in the logic for handling the tagging yourself, writing a destructor that switches on the tag and dispatches to the correct destructor call. Once again, this is an example of having to think like a compiler to write code, which is an all too familiar experience in C++. Here is an implementation of the above in C++.
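The original implementation is not reproduced here. The following is a sketch of one way such a hand-written tagged union could look, under the assumption that it stores either an int or a std::string error; the class name IntOrError and all of its member names are hypothetical.

```cpp
#include <new>
#include <string>
#include <utility>

// Hypothetical hand-rolled tagged union: holds an int value or a string error.
class IntOrError {
public:
    explicit IntOrError(int v) : tag_(Tag::Value), value_(v) {}
    explicit IntOrError(std::string msg)
        : tag_(Tag::Error), error_(std::move(msg)) {}

    IntOrError(const IntOrError& other) : tag_(other.tag_) {
        if (tag_ == Tag::Error) {
            new (&error_) std::string(other.error_);  // construct in place
        } else {
            value_ = other.value_;
        }
    }

    IntOrError& operator=(const IntOrError& other) {
        if (this != &other) {
            if (other.tag_ == Tag::Error) set_error(other.error_);
            else set_value(other.value_);
        }
        return *this;
    }

    ~IntOrError() { destroy(); }

    void set_value(int v) {
        destroy();  // run the string destructor if a string is active
        tag_ = Tag::Value;
        value_ = v;
    }

    void set_error(std::string msg) {
        if (tag_ == Tag::Error) {
            error_ = std::move(msg);  // a string is already alive: assign
        } else {
            new (&error_) std::string(std::move(msg));  // start its lifetime
            tag_ = Tag::Error;
        }
    }

    bool has_value() const { return tag_ == Tag::Value; }
    int value() const { return value_; }                 // requires has_value()
    const std::string& error() const { return error_; }  // requires !has_value()

private:
    enum class Tag { Value, Error };

    void destroy() {
        if (tag_ == Tag::Error) error_.~basic_string();
    }

    Tag tag_;
    union {
        int value_;
        std::string error_;
    };
};
```

Move operations and checked accessors are left out to keep the sketch short; a production version would need them too.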

Now, there are probably some design choices above that were unnecessary, and some you would have made differently, but the bulk of the work was necessary. Namely, we need to keep track of which value is populated, and we need to be very careful when setting new values: the destructor of std::string must be called whenever we switch away from a string, and the constructor of std::string must be called whenever we switch to using a string.

Doing all of the above just to use sum types is pretty absurd, and a pretty high barrier to entry. Explicit constructor calls on particular memory (in the form of placement new) and explicit destructor calls violate modern sensibilities in C++.

# std::variant: Not as bad

Writing the above with std::variant is pretty easy.
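As a rough sketch of the same idea, here is an int-or-error type built on std::variant. The alias IntOrError and the functions parse_positive and describe are hypothetical names used only for illustration.

```cpp
#include <string>
#include <variant>

// Either an int result or a std::string error; no hand-written bookkeeping.
using IntOrError = std::variant<int, std::string>;

// Hypothetical producer: rejects negative input with an error message.
IntOrError parse_positive(int raw) {
    if (raw < 0) return std::string("negative input");
    return raw;
}

// Inspects which alternative is active and reads it with std::get.
std::string describe(const IntOrError& result) {
    if (std::holds_alternative<int>(result)) {
        return "value: " + std::to_string(std::get<int>(result));
    }
    return "error: " + std::get<std::string>(result);
}
```

The variant tracks the active alternative and runs the right constructor and destructor itself.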

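Note, if you’re wondering what happens if your variant has the same type more than once: you can get the index of the active alternative using std::variant::index(), and std::get lets you access by index or by type. Access by type only compiles when that type appears exactly once in the variant, so with repeated types you have to use the index. std::get throws std::bad_variant_access if the requested alternative is not the active one; you can instead use std::get_if(), which takes a pointer to the variant and returns a pointer that is null if the access fails.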
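A small sketch of index-based access, assuming a variant that deliberately repeats std::string. The alias Message and the helpers make_warning, make_error and render are hypothetical.

```cpp
#include <string>
#include <utility>
#include <variant>

// Same type twice: index 0 is a warning, index 1 is an error (by convention here).
using Message = std::variant<std::string, std::string>;

// With repeated types the alternative must be selected by index.
Message make_warning(std::string text) {
    return Message(std::in_place_index<0>, std::move(text));
}

Message make_error(std::string text) {
    return Message(std::in_place_index<1>, std::move(text));
}

std::string render(const Message& message) {
    // std::get_if returns nullptr when alternative 0 is not the active one.
    if (const std::string* warning = std::get_if<0>(&message)) {
        return "warning: " + *warning;
    }
    return "error: " + std::get<1>(message);
}
```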

So this code is clean-ish, but the accessing notation is a bit rough. The cleanest notation for this would be pattern matching. Unfortunately, C++ has no pattern matching in the language, and the library substitute, std::visit, is a bit rough around the edges. With it, you’re stuck doing one of two things: writing a functor that handles each case (that is, each possible type) explicitly, or writing a very generic function that handles each case uniformly using parametric polymorphism (templates). Of course mixing and matching is possible as well. Here’s an example:
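The original example is not reproduced here; the sketch below shows both styles with std::visit, assuming the int-or-string variant used earlier. The names IntOrError, Describe, describe and to_text are hypothetical.

```cpp
#include <sstream>
#include <string>
#include <variant>

using IntOrError = std::variant<int, std::string>;

// Style 1: a functor with one overload per alternative.
struct Describe {
    std::string operator()(int value) const {
        return "value: " + std::to_string(value);
    }
    std::string operator()(const std::string& error) const {
        return "error: " + error;
    }
};

std::string describe(const IntOrError& result) {
    return std::visit(Describe{}, result);
}

// Style 2: one generic lambda that treats every alternative uniformly.
std::string to_text(const IntOrError& result) {
    return std::visit(
        [](const auto& alternative) {
            std::ostringstream out;
            out << alternative;  // works for any streamable alternative
            return out.str();
        },
        result);
}
```

If Describe were missing an overload for one of the alternatives, the call to std::visit would fail to compile, which is about as close as this gets to an exhaustiveness check.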

You could actually accomplish this with lambdas and a generic helper class that wraps all the lambdas up into one functor.
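One common way to write such a helper in C++17 is sketched below. The name overloaded is a widespread convention rather than a standard library type, and IntOrError and describe are hypothetical names for this example.

```cpp
#include <string>
#include <variant>

// Inherits from every lambda and exposes all of their call operators.
template <typename... Fs>
struct overloaded : Fs... {
    using Fs::operator()...;
};

// Deduction guide so that overloaded{lambda1, lambda2} works in C++17.
template <typename... Fs>
overloaded(Fs...) -> overloaded<Fs...>;

using IntOrError = std::variant<int, std::string>;

std::string describe(const IntOrError& result) {
    return std::visit(
        overloaded{
            [](int value) { return "value: " + std::to_string(value); },
            [](const std::string& error) { return "error: " + error; }
        },
        result);
}
```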

This doesn’t work well if you can’t identify the case by type alone. You can always wrap the value in a struct to differentiate the cases, and this has other benefits (having types work for you in static analysis to prove correctness is nice). Perhaps, in general, if you keep running into sum types whose cases can’t be told apart by type, that is a sign you are using too many primitive types. I haven’t seen enough of this to know for sure, but I could imagine that being a symptom.
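As a sketch of the wrapping idea, two cases that would both be std::string can each get their own struct so that a visitor can tell them apart by type. Warning, Error, Diagnostic, Severity and severity are hypothetical names.

```cpp
#include <string>
#include <variant>

// Thin wrappers: same payload, distinct types.
struct Warning {
    std::string text;
};

struct Error {
    std::string text;
};

using Diagnostic = std::variant<Warning, Error>;

// The visitor can now dispatch on the wrapper type instead of an index.
struct Severity {
    int operator()(const Warning&) const { return 1; }
    int operator()(const Error&) const { return 2; }
};

int severity(const Diagnostic& diagnostic) {
    return std::visit(Severity{}, diagnostic);
}
```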

# Appendix: std::optional

Sum types are pretty useful, and there is one especially common use case: expressing that a value is either there or not there.

One could go through the typical construction in languages with sum types, and make it a std::variant of the type and the unit type. In C++, the closest thing to a unit type is void, but because of historical hackiness in early C language design, you can’t actually create instances of void (despite functions being able to return void). The standard library provides std::monostate as a unit type instead.
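A minimal sketch of that construction, using std::monostate as the "nothing here" alternative. The alias MaybeInt and the functions safe_divide and has_value are hypothetical.

```cpp
#include <variant>

// std::monostate plays the role of the unit type: "no value".
using MaybeInt = std::variant<std::monostate, int>;

// Hypothetical example: division that yields nothing when dividing by zero.
MaybeInt safe_divide(int numerator, int denominator) {
    if (denominator == 0) return std::monostate{};
    return numerator / denominator;
}

bool has_value(const MaybeInt& maybe) {
    return !std::holds_alternative<std::monostate>(maybe);
}
```

Putting std::monostate first also means a default-constructed MaybeInt starts out empty.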

Doing this construction, however, is unnecessary, as C++ provides std::optional, which has a more natural interface for optional values in C++. You can treat it by and large as a pointer which could be null. In fact, you can treat std::optional, raw pointers, std::unique_ptr, and std::shared_ptr uniformly in interfaces, as all of them support conversion to bool, dereference, and indirect member access (->) with the same semantics.
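The uniform treatment can be sketched with a single template that only relies on bool conversion and dereference. The names safe_divide and value_or_zero are hypothetical.

```cpp
#include <memory>
#include <optional>

// Hypothetical example: division that yields nothing when dividing by zero.
std::optional<int> safe_divide(int numerator, int denominator) {
    if (denominator == 0) return std::nullopt;
    return numerator / denominator;
}

// Accepts anything that is testable as bool and dereferences to an int:
// std::optional<int>, int*, std::unique_ptr<int>, std::shared_ptr<int>.
template <typename MaybeInt>
int value_or_zero(const MaybeInt& maybe) {
    return maybe ? *maybe : 0;
}
```

For std::optional alone, the member function value_or does the same job; the template is only meant to illustrate the shared pointer-like interface.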

# Conclusion

std::variant provides C++ with a much-needed sum type. It’s very evidently not perfect. Pattern matching is not a language feature, but just something shoved into the standard library that feels much like a hack, particularly in contrast with sum types in languages like ML and Rust. I look forward to more syntactic sugar being added to make std::variant feel like a first-class citizen of C++, as has continually happened for tuples, pairs, and function types.