Type Inspection
We designed CAF with distributed systems in mind. Hence, all message types must be serializable. Using a message type that is not serializable causes a compiler error unless the user explicitly lists it as an unsafe message type (see Unsafe Message Types). Unsafe message types may only be used for messages that remain local, i.e., never cross the wire.
Data Model
Type inspection in CAF uses a hierarchical data model with the following building blocks:
- built-in types
  Signed and unsigned integer types for 8, 16, 32 and 64 bit
  The floating point types float, double and long double
  Bytes, booleans, and strings
- lists
  Dynamically-sized container types such as std::vector.
- tuples
  Fixed-sized container types such as std::tuple or std::array as well as built-in C array types.
- maps
  Dynamically-sized container types with key/value pairs such as std::map.
- objects
  User-defined types. An object has one or more fields. Fields have a name and may be optional. Further, fields may take on a fixed number of different types.
To see how this maps to C++ types, consider the following type definition:
struct test {
  variant<string, double> x1;
  optional<tuple<double, double>> x2;
  vector<string> x3;
};
Here, field x1 is either a string or a double at runtime. The field
x2 is optional and may contain a fixed-size tuple with two elements
(built-in types). Lastly, field x3 contains any number of string values at
runtime.
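To make the mapping concrete, the sketch below spells out the same struct under the assumption that variant, optional, tuple, vector, and string refer to their std:: counterparts. The namespace example and the helper describe are hypothetical and not part of CAF; the helper only reports which shape each field has at runtime.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <tuple>
#include <variant>
#include <vector>

namespace example {

// Same layout as the struct above, assuming the std:: version of each type.
struct test {
  std::variant<std::string, double> x1;
  std::optional<std::tuple<double, double>> x2;
  std::vector<std::string> x3;
};

// Hypothetical helper (not part of CAF): summarizes the runtime shape of a
// test object in terms of the data model.
inline std::string describe(const test& x) {
  std::string result;
  // x1: a field that takes on one of a fixed number of types.
  result += std::holds_alternative<std::string>(x.x1) ? "x1=string"
                                                      : "x1=double";
  // x2: an optional field that holds a fixed-size tuple when present.
  result += x.x2.has_value() ? ", x2=tuple(2)" : ", x2=none";
  // x3: a list with any number of elements.
  result += ", x3=list(" + std::to_string(x.x3.size()) + ")";
  return result;
}

} // namespace example
```

Calling describe on an object whose x1 holds a string, whose x2 is empty, and whose x3 has no elements yields "x1=string, x2=none, x3=list(0)".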
Inspecting Objects
The inspection API allows CAF to deconstruct C++ objects. Users can either
provide free functions named inspect that CAF picks up via ADL or specialize
caf::inspector_access.
In both cases, users call members and member functions on an Inspector that
provides a domain-specific language (DSL) for describing the structure of a C++
object.
After listing a custom type T in a type ID block and either providing a free
inspect function overload or specializing inspector_access, CAF is able
to:
- Serialize and deserialize objects of type T to/from byte sequences.
- Render objects of type T as a human-readable string via caf::deep_to_string.
- Read objects of type T from a configuration file.
In the remainder of this section, we use the following Plain Old Data (POD) type
point_3d in our code examples. Since all member variables of point_3d are
public, writing custom inspection code is straightforward and we can focus on
the inspection API.
struct point_3d {
  int32_t x;
  int32_t y;
  int32_t z;
};
Note
We strongly recommend using the fixed-width integer types in all user-defined
messaging types. Consistently using these types over short, int,
long, etc. avoids bugs in heterogeneous environments that are hard to
debug.
Writing inspect Overloads
Adding overloads for inspect generally provides the simplest way to teach
CAF how to serialize and deserialize custom data types. We recommend this way of
adding inspection support whenever possible, since it adds the least amount of
boilerplate code.
For our POD type point_3d, we simply pass all member variables as fields to
the inspector:
template <class Inspector>
bool inspect(Inspector& f, point_3d& x) {
  return f.object(x).fields(f.field("x", x.x),
                            f.field("y", x.y),
                            f.field("z", x.z));
}
As mentioned in the section on the data model, objects are containers for fields that in turn
contain values. When providing an inspect overload, CAF recursively
traverses all fields.
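To illustrate how a single inspect overload drives the traversal, the sketch below pairs the point_3d overload with a toy inspector. The class toy_printer is an assumption made for this example only: it is not CAF's implementation and merely mimics the object(...).fields(...) call chain for int32_t fields, rendering them into a string.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

struct point_3d {
  int32_t x;
  int32_t y;
  int32_t z;
};

// Toy stand-in for an inspector (NOT CAF's implementation): it only mimics
// the object(...).fields(...) call chain and renders fields into a string.
class toy_printer {
public:
  // Helper returned by object(); fields() visits each field in order.
  struct object_t {
    toy_printer* self;

    template <class... Fields>
    bool fields(Fields&&... fs) {
      self->out_ += "{";
      bool ok = (fs(*self) && ...); // stops at the first failing field
      self->out_ += "}";
      return ok;
    }
  };

  template <class T>
  object_t object(T&) {
    return object_t{this};
  }

  // field() returns a callable that appends "name=value" when invoked.
  auto field(std::string_view name, int32_t& value) {
    return [name, &value](toy_printer& f) {
      if (f.out_.back() != '{')
        f.out_ += ", ";
      f.out_ += name;
      f.out_ += "=" + std::to_string(value);
      return true;
    };
  }

  const std::string& str() const {
    return out_;
  }

private:
  std::string out_;
};

// Identical to the overload above; it works with any type that offers
// object() and field() with matching signatures.
template <class Inspector>
bool inspect(Inspector& f, point_3d& x) {
  return f.object(x).fields(f.field("x", x.x),
                            f.field("y", x.y),
                            f.field("z", x.z));
}
```

Passing a point_3d with the values 1, 2, 3 through this toy inspector leaves "{x=1, y=2, z=3}" in str(). The inspect overload itself never names a concrete inspector type, which is what lets one overload serve serializers, deserializers, and printers alike.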
Not every type needs to expose itself as object, though. For example,
consider the following ID type that simply wraps a string:
struct id { std::string value; };

template <class Inspector>
bool inspect(Inspector& f, id& x) {
  return f.object(x).fields(f.field("value", x.value));
}
The type id is basically a strong typedef to improve type safety when
writing code. To a type inspector, id objects look as follows:
object(type: "id") {
  field(name: "value") {
    value(type: "string") {
      ...
    }
  }
}
Now, this type has little use on its own. Usually, we would use such a type to
compose other types such as the following type person:
struct person { std::string name; id key; };

template <class Inspector>
bool inspect(Inspector& f, person& x) {
  return f.object(x).fields(f.field("name", x.name), f.field("key", x.key));
}
By providing the inspect overload for id, inspectors can recursively visit
an id as an object. Hence, the above implementations work as expected. When
using person in human-readable data formats such as CAF configurations,
however, allowing CAF to look “inside” a strong typedef can simplify working
with such types.
With the current implementation, we could read the key manager.ceo from a
configuration file with this content:
manager {
  ceo {
    name = "Bob"
    key = {
      value = "TWFuIGlz"
    }
  }
}
This clearly appears more verbose than it needs to be. Users generally need not
care about such internal types like id that only exist as a safeguard during
programming.
Hence, we generally recommend making such types transparent to CAF inspectors.
For our id type, the inspect overload may instead look as follows:
template <class Inspector>
bool inspect(Inspector& f, id& x) {
  return f.apply(x.value);
}
In contrast to the previous implementation, inspectors now simply read or write
the strings as values whenever they encounter an id. This simplifies our
config file from before and thus gives a much cleaner interface to users:
manager {
  ceo {
    name = "Bob"
    key = "TWFuIGlz"
  }
}
Specializing inspector_access
Working with 3rd party libraries usually rules out adding free functions for
existing classes, because the namespace belongs to another project. Hence, CAF
also allows specializing inspector_access instead. This requires writing
more boilerplate code but allows customizing every step of the inspection
process.
The full interface of inspector_access looks as follows:
template <class T>
struct inspector_access {
  template <class Inspector>
  static bool apply(Inspector& f, T& x);

  template <class Inspector>
  static bool save_field(Inspector& f, string_view field_name, T& x);

  template <class Inspector, class IsPresent, class Get>
  static bool save_field(Inspector& f, string_view field_name,
                         IsPresent& is_present, Get& get);

  template <class Inspector, class IsValid, class SyncValue>
  static bool load_field(Inspector& f, string_view field_name, T& x,
                         IsValid& is_valid, SyncValue& sync_value);

  template <class Inspector, class IsValid, class SyncValue, class SetFallback>
  static bool load_field(Inspector& f, string_view field_name, T& x,
                         IsValid& is_valid, SyncValue& sync_value,
                         SetFallback& set_fallback);
};
The static member function apply has the same role as the free inspect
function. For most types, we can implement only apply and use a default
implementation for the other member functions. For example, specializing
inspector_access for our point_3d would look as follows:
namespace caf {

template <>
struct inspector_access<point_3d> : inspector_access_base<point_3d> {
  template <class Inspector>
  static bool apply(Inspector& f, point_3d& x) {
    return f.object(x).fields(f.field("x", x.x),
                              f.field("y", x.y),
                              f.field("z", x.z));
  }
};

} // namespace caf
By inheriting from inspector_access_base, we use the default implementations
for save_field and load_field. Customizing this set of functions only
becomes necessary when integrating custom types that have semantics similar to
tuple, variant, or optional.
Note
Please refer to the Doxygen documentation for more details on save_field
and load_field.
Types with Getter and Setter Access
Types that declare their fields private and only grant access via getters and setters cannot pass references to their member variables to the inspector. Instead, they can pass a pair of function objects to the inspector to read and write the field.
Consider the following non-POD type foobar:
class foobar {
public:
  // Getters are const-qualified so that they also work on const objects.
  const std::string& foo() const {
    return foo_;
  }

  void foo(std::string value) {
    foo_ = std::move(value);
  }

  const std::string& bar() const {
    return bar_;
  }

  void bar(std::string value) {
    bar_ = std::move(value);
  }

private:
  std::string foo_;
  std::string bar_;
};
Since foo_ and bar_ are not accessible from outside the class, the
inspector has to use the getter and setter functions. However, C++ has no
formalized API for getters and setters. Moreover, not all setters are as trivial
as in the example above. Setters may enforce invariants, for example, and thus
may fail.
In order to work with any flavor of getter and setter functions, CAF requires
users to wrap these member function calls into two function objects. The first
one wraps the getter, takes no arguments, and returns the underlying value
(either by reference or by value). The second one wraps the setter, takes
exactly one argument (the new value), and returns a bool that indicates
whether the operation succeeded (by returning true) or failed (by returning
false).
The example below shows a possible inspect implementation for the foobar
class shown before:
template <class Inspector>
bool inspect(Inspector& f, foobar& x) {
  auto get_foo = [&x]() -> decltype(auto) { return x.foo(); };
  auto set_foo = [&x](std::string value) {
    x.foo(std::move(value));
    return true;
  };
  auto get_bar = [&x]() -> decltype(auto) { return x.bar(); };
  auto set_bar = [&x](std::string value) {
    x.bar(std::move(value));
    return true;
  };
  return f.object(x).fields(f.field("foo", get_foo, set_foo),
                            f.field("bar", get_bar, set_bar));
}
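The setters above always succeed. As a sketch of a setter that enforces an invariant, assume a hypothetical class percentage (not part of CAF) whose setter rejects values outside of 0 to 100. The wrapping function object then simply forwards the result of the setter as the bool that the inspector expects.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class (not from CAF) whose setter enforces an invariant.
class percentage {
public:
  int32_t value() const {
    return value_;
  }

  // Returns false and leaves the object unchanged for out-of-range input.
  bool value(int32_t new_value) {
    if (new_value < 0 || new_value > 100)
      return false;
    value_ = new_value;
    return true;
  }

private:
  int32_t value_ = 0;
};

template <class Inspector>
bool inspect(Inspector& f, percentage& x) {
  auto get = [&x] { return x.value(); };
  // Forward the result of the setter: false signals failure to the inspector.
  auto set = [&x](int32_t value) { return x.value(value); };
  return f.object(x).fields(f.field("value", get, set));
}
```

With this wrapper, an out-of-range value in the input surfaces as a failed inspection rather than silently corrupting the object.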
Note
For classes that lie in the responsibility of the same developers that
implement the inspect function, implementing inspect as friend
function inside the class usually can avoid going through the getter and
setter functions.
Fallbacks and Invariants
For each field, we may provide a fallback value for optional fields or a
predicate that checks invariants on the data (or both). For example, consider
the following class duration and its implementation for inspect:
struct duration {
  string unit;
  double count;
};

bool valid_time_unit(const string& unit) {
  return unit == "seconds" || unit == "minutes";
}

template <class Inspector>
bool inspect(Inspector& f, duration& x) {
  return f.object(x).fields(
    f.field("unit", x.unit).fallback("seconds").invariant(valid_time_unit),
    f.field("count", x.count));
}
In “real code”, we probably would not use a string to store the time unit.
However, with the fallback, we have enabled CAF to use "seconds" whenever
the input contains no value for the unit field. Further, the invariant makes
sure that we verify our input before accepting it.
With this implementation for inspect, we could use duration in a
configuration file as follows (assuming a parameter named
example-app.request-timeout):
# example 1: ok, falls back to "seconds"
example-app {
  request-timeout {
    count = 1.3
  }
}

# example 2: ok, explicit definition of the time unit
example-app {
  request-timeout {
    count = 1.3
    unit = "minutes"
  }
}

# example 3: error, "parsecs" is not a time unit (invariant does not hold)
example-app {
  request-timeout {
    count = 12
    unit = "parsecs"
  }
}
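The three outcomes above can be pictured with plain C++. The function load_unit below is a hypothetical helper and not part of the CAF API. It assumes that a missing value is replaced by the fallback and that the invariant is checked on values that are actually present in the input.

```cpp
#include <cassert>
#include <optional>
#include <string>

bool valid_time_unit(const std::string& unit) {
  return unit == "seconds" || unit == "minutes";
}

// Hypothetical helper (not CAF API) that models loading the "unit" field:
// - the input has no value: store the fallback and succeed (example 1)
// - the input has a valid value: store it and succeed (example 2)
// - the input has an invalid value: fail and leave unit untouched (example 3)
bool load_unit(const std::optional<std::string>& input, std::string& unit) {
  if (!input) {
    unit = "seconds"; // fallback
    return true;
  }
  if (!valid_time_unit(*input))
    return false; // invariant does not hold
  unit = *input;
  return true;
}
```

For instance, load_unit(std::nullopt, unit) succeeds and stores "seconds", whereas load_unit(std::string{"parsecs"}, unit) returns false without modifying unit.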
Splitting Save and Load
When writing custom inspect functions, providing a single overload for all
inspectors may result in undesired tradeoffs or convoluted code. Sometimes,
inspection code can benefit from splitting it into a save and a load
function. For this reason, all inspectors provide a static constant called
is_loading. This allows delegating to custom functions via enable_if or
if constexpr:
template <class Inspector>
bool inspect(Inspector& f, my_class& x) {
  if constexpr (Inspector::is_loading)
    return load(f, x);
  else
    return save(f, x);
}
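To see the dispatch in action without CAF, the following sketch uses two toy inspectors. The types toy_sink and toy_source, the member text of my_class, and the bodies of save and load are assumptions made for this example only; the toy types carry nothing but the is_loading constant and a string buffer.

```cpp
#include <cassert>
#include <string>

// Toy inspectors (NOT CAF classes): each only carries the is_loading flag
// plus a string that acts as data sink or data source.
struct toy_sink {
  static constexpr bool is_loading = false;
  std::string buf;
};

struct toy_source {
  static constexpr bool is_loading = true;
  std::string buf;
};

struct my_class {
  std::string text;
};

// Only called for saving inspectors.
bool save(toy_sink& f, my_class& x) {
  f.buf = x.text;
  return true;
}

// Only called for loading inspectors; rejects empty input.
bool load(toy_source& f, my_class& x) {
  if (f.buf.empty())
    return false;
  x.text = f.buf;
  return true;
}

template <class Inspector>
bool inspect(Inspector& f, my_class& x) {
  // The discarded branch is never instantiated, so save() does not need to
  // compile for toy_source and load() does not need to compile for toy_sink.
  if constexpr (Inspector::is_loading)
    return load(f, x);
  else
    return save(f, x);
}
```

Because if constexpr discards the branch that does not apply, each of the two helper functions only has to support the kind of inspector it was written for.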
Specializing on the Data Format
Much like is_loading allows client code to dispatch based on the mode of an
inspector, the member function has_human_readable_format() allows client
code to dispatch based on the data format.
The canonical example for choosing a different data representation for
human-readable input and output is the enum type. When generating data for
machine-to-machine communication, using the underlying integer representation
gives the best performance. However, using the constant names results in a much
better user experience in all other cases.
The following code illustrates how to provide a string representation for
inspectors that operate on human-readable data representations while operating
directly on the underlying type of the enum class otherwise.
enum class weekday : uint8_t {
  monday,
  tuesday,
  wednesday,
  thursday,
  friday,
  saturday,
  sunday,
};

std::string to_string(weekday);

bool parse(std::string_view input, weekday& dest);

template <class Inspector>
bool inspect(Inspector& f, weekday& x) {
  if (f.has_human_readable_format()) {
    auto get = [&x] { return to_string(x); };
    auto set = [&x](std::string str) { return parse(str, x); };
    return f.apply(get, set);
  } else {
    auto get = [&x] { return static_cast<uint8_t>(x); };
    auto set = [&x](uint8_t val) {
      if (val < 7) {
        x = static_cast<weekday>(val);
        return true;
      } else {
        return false;
      }
    };
    return f.apply(get, set);
  }
}
When inspecting an object of type weekday, we treat it as if it were a
string for inspectors with human-readable data formats. Otherwise, we treat the
weekday as if it were an integer between 0 and 6.
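The example above only declares to_string and parse. One possible way to implement the two functions is a lookup table indexed by the underlying integer value. The table weekday_names is a hypothetical name chosen for this sketch, and other implementations (such as a switch statement) work just as well.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

enum class weekday : uint8_t {
  monday,
  tuesday,
  wednesday,
  thursday,
  friday,
  saturday,
  sunday,
};

// Names indexed by the underlying integer value of each enumerator.
constexpr std::array<std::string_view, 7> weekday_names{
  "monday", "tuesday",  "wednesday", "thursday",
  "friday", "saturday", "sunday",
};

std::string to_string(weekday x) {
  auto index = static_cast<uint8_t>(x);
  assert(index < weekday_names.size());
  return std::string{weekday_names[index]};
}

// Returns true and writes to dest on a match; leaves dest untouched otherwise.
bool parse(std::string_view input, weekday& dest) {
  for (std::size_t i = 0; i < weekday_names.size(); ++i) {
    if (weekday_names[i] == input) {
      dest = static_cast<weekday>(i);
      return true;
    }
  }
  return false;
}
```

Since parse reports unknown names by returning false, the setter in the human-readable branch fails cleanly on malformed input, mirroring the range check in the integer branch.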
Unsafe Message Types
Message types that do not provide serialization code cause compile-time errors
when used in actor communication. When using CAF for concurrency only, these
errors can be suppressed by explicitly allowing types via
CAF_ALLOW_UNSAFE_MESSAGE_TYPE. The macro is defined as follows.
#define CAF_ALLOW_UNSAFE_MESSAGE_TYPE(type_name)                       \
  namespace caf {                                                      \
  template <>                                                          \
  struct allowed_unsafe_message_type<type_name> : std::true_type {};   \
  }
Keep in mind that unsafe means that your program runs into undefined behavior (or segfaults) when you break your promise and try to serialize messages that contain unsafe message types.
Note
Even unsafe message types still require a type ID.