Type Inspection¶
We designed CAF with distributed systems in mind. Hence, all message types must be serializable. Using a message type that is not serializable causes a compiler error unless explicitly listed as unsafe message type by the user (see Unsafe Message Types). Any unsafe message type may be used only for messages that remain local, i.e., never cross the wire.
Data Model¶
Type inspection in CAF uses a hierarchical data model with the following building blocks:
- built-in types
Signed and unsigned integer types for 8, 16, 32 and 64 bit
The floating point types
float
,double
andlong double
Bytes, booleans, and strings
- lists
Dynamically-sized container types such as
std::vector
.- tuples
Fixed-sized container types such as
std::tuple
orstd::array
as well as built-in C array types.- maps
Dynamically-sized container types with key/value pairs such as
std::map
.- objects
User-defined types. An object has one or more fields. Fields have a name and may be optional. Further, fields may take on a fixed number of different types.
To see how this maps to C++ types, consider the following type definition:
struct test {
variant<string, double> x1;
optional<tuple<double, double>> x2;
vector<string> x3;
};
Here, field x1
is either a string
or a double
at runtime. The field
x2
is optional and may contain a fixed-size tuple with two elements
(built-in types). Lastly, field x3
contains any number of string values at
runtime.
Inspecting Objects¶
The inspection API allows CAF to deconstruct C++ objects. Users can either
provide free functions named inspect
that CAF picks up via ADL or specialize
caf::inspector_access
.
In both cases, users call members and member functions on an Inspector
that
provides a domain-specific language (DSL) for describing the structure of a C++
object.
After listing a custom type T
in a type ID block and either providing a free
inspect
function overload or specializing inspector_access
, CAF is able
to:
Serialize and deserialize objects of type
T
to/from Byte sequences.Render objects of type
T
as a human-readable string viacaf::deep_to_string
.Read objects of type
T
from a configuration file.
In the remainder of this section, we use the following Plain Old Data (POD) type
point_3d
in our code examples. Since all member variables of POD types are
public, writing custom inspection code is straightforward and we can focus on
the inspection API.
struct point_3d {
int32_t x;
int32_t y;
int32_t z;
};
Note
We strongly recommend using the fixed-width integer types in all user-defined
messaging types. Consistently using these types over short
, int
,
long
, etc. avoids bugs in heterogeneous environments that are hard to
debug.
Writing inspect
Overloads¶
Adding overloads for inspect
generally provides the simplest way to teach
CAF how to serialize and deserialize custom data types. We recommend this way of
adding inspection support whenever possible, since it adds the least amount of
boilerplate code.
For our POD type point_3d
, we simply pass all member variables as fields to
the inspector:
template <class Inspector>
bool inspect(Inspector& f, point_3d& x) {
return f.object(x).fields(f.field("x", x.x),
f.field("y", x.y),
f.field("z", x.z));
}
As mentioned in the section on the data model, objects are containers for fields that in turn
contain values. When providing an inspect
overload, CAF recursively
traverses all fields.
Not every type needs to expose itself as object
, though. For example,
consider the following ID type that simply wraps a string:
struct id { std::string value; };
template <class Inspector>
bool inspect(Inspector& f, id& x) {
return f.object(x).fields(f.field("value", x.value));
}
The type id
is basically a strong typedef to improve type safety when
writing code. To a type inspector, ID objects look as follows:
object(type: "id") {
field(name: "value") {
value(type: "string") {
...
}
}
}
Now, this type has little use on its own. Usually, we would use such a type to
compose other types such as the following type person
:
struct person { std::string name; id key; };
template <class Inspector>
bool inspect(Inspector& f, person& x) {
return f.object(x).fields(f.field("name", x.name), f.field("key", x.key));
}
By providing the inspect
overload for ID, inspectors can recursively visit
an id
as an object. Hence, the above implementations work as expected. When
using person
in human-readable data formats such as CAF configurations,
however, allowing CAF to look “inside” a strong typedef can simplify working
with such types.
With the current implementation, we could read the key manager.ceo
from a
configuration file with this content:
manager {
ceo {
name = "Bob"
key = {
value = "TWFuIGlz"
}
}
}
This clearly appears more verbose than it needs to be. Users generally need not
care about such internal types like id
that only exist as a safeguard during
programming.
Hence, we generally recommend making such types transparent to CAF inspectors.
For our id
type, the inspect
overload may instead look as follows:
template <class Inspector>
bool inspect(Inspector& f, id& x) {
return f.apply(x.value);
}
In contrast to the previous implementation, inspectors now simply read or write
the strings as values whenever they encounter an id
. This simplifies our
config file from before and thus gives a much cleaner interface to users:
manager {
ceo {
name = "Bob"
key = "TWFuIGlz"
}
}
Specializing inspector_access
¶
Working with 3rd party libraries usually rules out adding free functions for
existing classes, because the namespace belongs to a another project. Hence, CAF
also allows specializing inspector_access
instead. This requires writing
more boilerplate code but allows customizing every step of the inspection
process.
The full interface of inspector_access
looks as follows:
template <class T>
struct inspector_access {
template <class Inspector>
static bool apply(Inspector& f, T& x);
template <class Inspector>
static bool save_field(Inspector& f, string_view field_name, T& x);
template <class Inspector, class IsPresent, class Get>
static bool save_field(Inspector& f, string_view field_name,
IsPresent& is_present, Get& get);
template <class Inspector, class IsValid, class SyncValue>
static bool load_field(Inspector& f, string_view field_name, T& x,
IsValid& is_valid, SyncValue& sync_value);
template <class Inspector, class IsValid, class SyncValue, class SetFallback>
static bool load_field(Inspector& f, string_view field_name, T& x,
IsValid& is_valid, SyncValue& sync_value,
SetFallback& set_fallback);
};
The static member function apply
has the same role as the free inspect
function. For most types, we can implement only apply
and use a default
implementation for the other member functions. For example, specializing
inspector_access
for our point_3d
would look as follows:
namespace caf {
template <>
struct inspector_access<point_3d> : inspector_access_base<point_3d> {
template <class Inspector>
static bool apply(Inspector& f, point_3d& x) {
return f.object(x).fields(f.field("x", x.x),
f.field("y", x.y),
f.field("z", x.z));
}
};
} // namespace caf
By inheriting from inspector_access_base
, we use the default implementations
for save_field
and load_field
. Customizing this set of functions only
becomes necessary when integration custom types that have semantics similar to
tuple
, variant
, or optional
.
Note
Please refer to the Doxygen documentation for more details on save_field
and load_field
.
Types with Getter and Setter Access¶
Types that declare their fields private and only grant access via getter and setter cannot pass references to the member variables to the inspector. Instead, they can pass a pair of function objects to the inspector to read and write the field.
Consider the following non-POD type foobar
:
class foobar {
public:
const std::string& foo() {
return foo_;
}
void foo(std::string value) {
foo_ = std::move(value);
}
const std::string& bar() {
return bar_;
}
void bar(std::string value) {
bar_ = std::move(value);
}
private:
std::string foo_;
std::string bar_;
};
Since foo_
and bar_
are not accessible from outside the class, the
inspector has to use the getter and setter functions. However, C++ has no
formalized API for getters and setters. Moreover, not all setters are so trivial
as in the example above. Setters may enforce invariants, for example, and thus
may fail.
In order to work with any flair of getter and setter functions, CAF requires
users to wrap these member functions calls into two function objects. The first
one wraps the getter, takes no arguments, and returns the underlying value
(either by reference or by value). The second one wraps the setter, takes
exactly one argument (the new value), and returns a bool
that indicates
whether the operation succeeded (by returning true
) or failed (by returning
false
).
The example below shows a possible inspect
implementation for the fobar
class shown before:
template <class Inspector>
bool inspect(Inspector& f, foobar& x) {
auto get_foo = [&x]() -> decltype(auto) { return x.foo(); };
auto set_foo = [&x](std::string value) {
x.foo(std::move(value));
return true;
};
auto get_bar = [&x]() -> decltype(auto) { return x.bar(); };
auto set_bar = [&x](std::string value) {
x.bar(std::move(value));
return true;
};
return f.object(x).fields(f.field("foo", get_foo, set_foo),
f.field("bar", get_bar, set_bar));
}
Note
For classes that lie in the responsibility of the same developers that
implement the inspect
function, implementing inspect
as friend
function inside the class usually can avoid going through the getter and
setter functions.
Fallbacks and Invariants¶
For each field, we may provide a fallback value for optional fields or a
predicate that checks invariants on the data (or both). For example, consider
the following class duration
and its implementation for inspect
:
struct duration {
string unit;
double count;
};
bool valid_time_unit(const string& unit) {
return unit == "seconds" || unit == "minutes";
}
template <class Inspector>
bool inspect(Inspector& f, duration& x) {
return f.object(x).fields(
f.field("unit", x.unit).fallback("seconds").invariant(valid_time_unit),
f.field("count", x.count));
}
In “real code”, we probably would not use a string
to store the time unit.
However, with the fallback, we have enabled CAF to use "seconds"
whenever
the input contains no value for the unit
field. Further, the invariant makes
sure that we verify our input before accepting it.
With this implementation for inspect
, we could use duration
in a
configuration files as follows (assuming a parameter named
example-app.request-timeout
):
# example 1: ok, falls back to "seconds"
example-app {
request-timeout {
count = 1.3
}
}
# example 2: ok, explicit definition of the time unit
example-app {
request-timeout {
count = 1.3
unit = "minutes"
}
}
# example 3: error, "parsecs" is not a time unit (invariant does not hold)
example-app {
request-timeout {
count = 12
unit = "parsecs"
}
}
Splitting Save and Load¶
When writing custom inspect
functions, providing a single overload for all
inspectors may result in undesired tradeoffs or convoluted code. Sometimes,
inspection code can benefit from splitting it into a save
and a load
function. For this reason, all inspector provide a static constant called
is_loading
. This allows delegating to custom functions via enable_if
or
if constexpr
:
template <class Inspector>
bool inspect(Inspector& f, my_class& x) {
if constexpr (Inspector:is_loading)
return load(f, x);
else
return save(f, x);
}
Specializing on the Data Format¶
Much like is_loading
allows client code to dispatch based on the mode of an
inspector, the member function has_human_readable_format()
allows client
code to dispatch based on the data format.
The canonical example for choosing a different data representation for
human-readable input and output is the enum
type. When generating data for
machine-to-machine communication, using the underlying integer representation
gives the best performance. However, using the constant names results in a much
better user experience in all other cases.
The following code illustrates how to provide a string representation for
inspectors that operate on human-readable data representations while operating
directly on the underlying type of the enum class
otherwise.
enum class weekday : uint8_t {
monday,
tuesday,
wednesday,
thursday,
friday,
saturday,
sunday,
};
std::string to_string(weekday);
bool parse(std::string_view input, weekday& dest);
template <class Inspector>
bool inspect(Inspector& f, weekday& x) {
if (f.has_human_readable_format()) {
auto get = [&x] { return to_string(x); };
auto set = [&x](std::string str) { return parse(str, x); };
return f.apply(get, set);
} else {
auto get = [&x] { return static_cast<uint8_t>(x); };
auto set = [&x](uint8_t val) {
if (val < 7) {
x = static_cast<weekday>(val);
return true;
} else {
return false;
}
};
return f.apply(get, set);
}
}
When inspecting an object of type weekday
, we treat is as if it were a
string for inspectors with human-readable data formats. Otherwise, we treat the
weekday as if it were an integer between 0 and 6.
Unsafe Message Types¶
Message types that do not provide serialization code cause compile time errors
when used in actor communication. When using CAF for concurrency only, this
errors can be suppressed by explicitly allowing types via
CAF_ALLOW_UNSAFE_MESSAGE_TYPE
. The macro is defined as follows.
#define CAF_ALLOW_UNSAFE_MESSAGE_TYPE(type_name) \
namespace caf { \
template <> \
struct allowed_unsafe_message_type<type_name> : std::true_type {}; \
}
Keep in mind that unsafe means that your program runs into undefined behavior (or segfaults) when you break your promise and try to serialize messages that contain unsafe message types.
Note
Even unsafe messages types still require a type ID.