05 Data Types
By now, we've covered the absolute basics. As part of that foundation, we introduced fundamental primitive data types such as integers, characters, and floating-point numbers, along with the kinds of values they can represent.
In the next few chapters, we’ll start working more closely with memory. As we go deeper into the language, it becomes increasingly important to understand how memory behaves—especially when working with a lower-level language like C++. At this point, it’s not just about writing code that works; it’s about knowing what’s really happening behind the scenes.
As part of our earlier foundation, we introduced primitive data types, which act as the building blocks for all data structures. Now, it’s time to look more closely at how these basic types can be refined and extended into more specialized or optimized forms, depending on the needs of your data.
Static and Strong Typing
C++ is a statically typed language. This means that the data type of every variable must be known at compile time, whether you declare it explicitly or let the compiler deduce it. In other words, the compiler needs to know what kind of data each variable will hold before the program runs.
This requirement brings a few important benefits. First, it allows the compiler to catch many type-related errors early—before the code ever executes. Second, it enables more efficient memory usage and better optimization. After all, the data type of a variable determines how much memory it occupies and how that memory is interpreted during execution—which in turn affects how the program interacts with the system’s memory layout.
Compared to dynamically typed languages such as Python and JavaScript, where types are resolved at runtime, C++’s static typing provides stronger guarantees and performance benefits, though at the cost of some flexibility.
C++ is also considered a strongly typed language. This refers to how strictly the language enforces type rules. While C++ allows you to perform type conversions (also called casting), the compiler won't let you mix incompatible types without an explicit cast. This helps prevent unintended behavior or subtle bugs caused by incorrect assumptions about how data is represented in memory.
Primitive Data Types
In the introduction chapter, we introduced some of the primitive data types to kick-start your learning—but that was just the absolute core. To give you more control and flexibility, C++ actually offers a broader range of primitive types, enabling more precise management of memory and data.
Standard Integer Types
Type | Size | Signed Range |
---|---|---|
char | 1 byte | -128 to 127 |
short | 2 bytes | -32,768 to 32,767 |
int | 4 bytes | -2,147,483,648 to 2,147,483,647 |
long | 4 or 8 bytes | Depends on the system |
long long | 8 bytes | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
Note
The sizes above are typical rather than guaranteed. For example, on 64-bit Windows both int and long are 4 bytes, while on 64-bit Linux and macOS long is usually 8 bytes.
Fixed-Width Integer Types
These types guarantee their size across different systems and compilers.
To use the fixed-width integer types ending with _t, you need to include the <cstdint> header.
Type | Size | Signed Range / Purpose |
---|---|---|
int8_t | 1 byte | -128 to 127 |
int16_t | 2 bytes | -32,768 to 32,767 |
int32_t | 4 bytes | -2,147,483,648 to 2,147,483,647 |
int64_t | 8 bytes | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
size_t | 4 or 8 bytes | Represents the size of any object in memory (always unsigned) |
Note
Types like intptr_t and uintptr_t are integer types capable of storing pointer values.
They’re mostly useful in low-level or systems programming, such as writing memory managers or interfacing with hardware. Most C++ applications will never need them.
Note
Similarly, intmax_t and uintmax_t represent the largest signed and unsigned integer types available on a given platform.
While they sound powerful, they’re rarely used in everyday programming and are mainly helpful in highly portable libraries or template-heavy code.
For most use cases, fixed-width types like int64_t or size_t are a better fit.
Unsigned Integers
Signed integers divide their range between positive and negative numbers, effectively splitting their capacity in half.
However, in many cases, negative numbers are unnecessary or nonsensical. For example, color channels use values from 0 to 255, and array sizes or memory addresses are always non-negative.
In these scenarios, using a signed integer like int8_t would be wasteful because half of the range goes unused.
To allocate the entire bit range exclusively for non-negative values, you can use unsigned types.
For example, uint8_t ranges from 0 to 255, doubling the maximum value you can store in the same memory size.
Type | Unsigned Range |
---|---|
unsigned char | 0 to 255 |
unsigned short | 0 to 65,535 |
unsigned int | 0 to 4,294,967,295 |
unsigned long | Depends on system |
unsigned long long | 0 to 18,446,744,073,709,551,615 |
uint8_t | 0 to 255 |
uint16_t | 0 to 65,535 |
uint32_t | 0 to 4,294,967,295 |
uint64_t | 0 to 18,446,744,073,709,551,615 |
Floating-Point Types
Type | Size | Precision (Approx.) | Range |
---|---|---|---|
float | 4 bytes | ~6-7 decimal digits | ±1.5 × 10⁻⁴⁵ to ±3.4 × 10³⁸ |
double | 8 bytes | ~15-16 decimal digits | ±5.0 × 10⁻³²⁴ to ±1.8 × 10³⁰⁸ |
long double | 8-16 bytes | Varies | Higher precision than double |
Note
double is the default choice for floating-point arithmetic, providing more precision with minimal performance loss.
However, when memory usage is critical, float may be preferred.
To get a float literal, you must mark it explicitly by appending an f suffix (e.g., 3.14f).
Otherwise, the compiler treats floating-point literals as double by default.
Character Types
So far, we have introduced only ASCII character encoding, which supports 128 characters (or 256 in extended ASCII). However, ASCII is insufficient for representing the full range of characters used worldwide.
To address this, other encodings were developed, with UTF-8 becoming the most widely used due to its flexibility and backward compatibility with ASCII.
UTF-8 can represent a wide variety of characters, including:
- Latin characters with diacritics: é, ñ, ü, č, š
- Mathematical symbols: ∞, ∑, π, √
- Currency symbols: €, ¥, ₹, ₿
- Emoji: 😀, ❤️, 🚀, 🔥
- Non-Latin scripts: 你好 (Chinese), Привет (Russian), नमस्ते (Hindi), العربية (Arabic)
Type | Size | Prefix | Description |
---|---|---|---|
char | 1 byte | None | Holds a single character (typically ASCII) |
wchar_t | 2 or 4 bytes | L'A' | Wide character (Unicode, size platform-dependent) |
char8_t | 1 byte | u8'A' | UTF-8 character (C++20) |
char16_t | 2 bytes | u'A' | UTF-16 character (Unicode) |
char32_t | 4 bytes | U'A' | UTF-32 character (Unicode) |
These specialized types require prefixes before assigning literals to indicate their encoding.
Handling non-ASCII characters in the terminal can be tricky, especially on Windows.
To support UTF-8 output, you need to include the windows.h header and adjust settings via its functions.
On Linux and macOS, this usually works without additional setup.
Note
It’s best to avoid non-ASCII characters in your source code unless necessary (e.g., in GUI applications). Non-ASCII characters can complicate compatibility across different environments and are rarely required in typical C++ code.
Automatic Type Inference
Before diving into this section, it’s helpful to have a grasp of data structures, since type inference often involves more complex types like containers and iterators.
In C++, data types determine what kind of data a variable can hold and how much memory it uses. Manually specifying types can sometimes be tedious or verbose—especially when the type is obvious from the context. That’s where the auto keyword comes in: it lets the compiler automatically deduce the variable’s type based on its initializer, assignment, or the return value of a function.
Using auto can simplify your code and make it more concise, but it’s not without trade-offs. Overusing it may reduce readability, making it harder to quickly understand what a variable represents. It can also introduce subtle bugs if the inferred type doesn’t match your intentions, especially as your code evolves over time.
Here, name is correctly inferred as std::string.
However, if the return type of GetName() were changed to const char*, this code would no longer compile, because C-style strings don’t support the .size() method.
This is a good example of how type inference can become fragile when the underlying return type changes. In such cases, using an explicit type can help make your code more predictable.
One place where auto truly shines is in range-based for loops, where the element type is usually obvious and unlikely to change.
Here, auto makes the loop cleaner and more readable, without sacrificing clarity or safety.
Since the type of element is tightly bound to the container, there's little risk of confusion or error.
Determining the Byte Size of Variables
You can get the byte size of any variable using the sizeof operator.
Even compound structures like C-style arrays can be measured this way, which is useful because it lets you calculate the number of elements in the array using a simple formula: the total byte size of the array divided by the byte size of a single element.
The sizeof operator can also be used with data types directly. So, instead of dividing by sizeof(array[0]), you could divide by sizeof(int), as in sizeof(array) / sizeof(int).
This works the same, but it's a bit more error-prone. If you change the array's type, you'd need to update int in multiple places. By dividing by sizeof(array[0]), your code automatically adapts to whatever type the array holds, making it safer and more maintainable.
Type Casting
Sometimes, we need to reinterpret the data a variable holds by changing its type. This process is called type casting. Type casting is especially important in C++ because it's a statically typed language. As a result, conversions are often necessary when working with different data types together, whether to ensure compatibility or to match the expected type in a specific context.
But not all types in C++ are directly compatible, and trying to mix them without conversion can lead to errors or unexpected behavior. Fortunately, C++ provides two main ways to perform type conversions:
- Implicit Conversion (Automatic): In many cases, the compiler can handle type conversions for you; this is called implicit conversion. For example, assigning an int to a double works automatically, because the compiler knows how to safely widen the value without losing information.
- Explicit Conversion (Casting): When a conversion isn’t safe or obvious, C++ requires you to perform an explicit cast. This tells the compiler, “Yes, I know what I’m doing, so convert this anyway.” There are several ways to do this, and we’ll cover them.
Conversions
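In C++, the term conversion usually refers to an implicit conversion.
These occur when the compiler automatically converts one type to another, such as in arithmetic operations between different types or when assigning an int to a double. Conversions in the opposite direction, such as assigning a double to an int, are also performed implicitly but can lose data.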
C-Style Cast
Since C++ is built on top of C, it inherits nearly all of its features, including its casting techniques. The C-style cast is one of the most commonly used casting methods, even in other programming languages, due to its simplicity and ease of use. It is applied by prefixing a variable with a type in parentheses.
C++ Casts
C-style casts are simple but potentially dangerous.
They allow almost any kind of conversion without checks, which can easily lead to undefined behavior.
That’s why C++ introduced its own casting operators: static_cast, dynamic_cast, const_cast, and reinterpret_cast.
It’s important to understand that C++ casts don’t add new capabilities beyond what C-style casts can do. Instead, they provide safer and more controlled ways to perform conversions. By using explicit keywords, C++ casts make your intent clear and help the compiler enforce stricter type checks. This reduces the risk of accidental or unsafe conversions.
Another benefit is maintainability. Because each cast uses a specific keyword, it's easier to search for and reason about casting operations in your code. That makes tasks like debugging or refactoring much more manageable.
In short, C++ casts offer the same power as C-style casts, but with better safety and clarity. While C-style casts are still widely used, especially in older code, you should prefer C++ casts in modern C++ projects. They’re more explicit and more readable overall.
Static Cast
The closest C++ equivalent of a traditional C-style cast is static_cast.
Unlike C-style casts, it provides better type safety by rejecting certain conversions (e.g., removing const or casting between incompatible pointer types) that could lead to undefined behavior.
Dynamic Cast
Info
It is recommended to go through user-defined structures before proceeding further, as the upcoming examples will focus on class behavior and involve casting between classes.
dynamic_cast adds another layer of safety to static_cast by performing runtime validation.
If a pointer cast is unsuccessful, it returns nullptr (a failed reference cast throws std::bad_cast instead).
Unlike static_cast, which is evaluated at compile-time, dynamic_cast is evaluated at runtime, making it slightly slower but significantly safer when working with polymorphic class hierarchies.
Imagine we have three classes in our game: Entity (the base class), and Player and Enemy (both derived from Entity).
Casting a Player to Entity is straightforward, as Player already inherits all the properties of Entity, only losing access to its own unique members in the process.
The challenge arises when trying to cast an Entity to a Player.
The compiler has no way of knowing whether the Entity instance actually represents a Player, an Enemy, or just a base Entity.
With static_cast, the compiler trusts the programmer, but if the cast is incorrect, accessing Player-specific members can lead to a crash.
This is where dynamic_cast becomes valuable.
If we have an Entity instance that is actually an Enemy and try to cast it to Player using dynamic_cast, the cast will fail and return nullptr, allowing us to safely check the result before accessing Player-specific members.
Constant Cast
Constant cast serves a single purpose: adding or removing the const qualifier from a variable.
This might sound simple, but it has important use cases, especially when dealing with legacy code, function parameters, or APIs that don’t support const properly.
Imagine you have a const variable, but you need to modify its value.
Normally, C++ prevents this to ensure safety, but const_cast allows you to override this restriction.
However, this should only be done when you’re absolutely sure it’s safe: modifying an object that was originally declared const leads to undefined behavior.
Reinterpret Cast
Info
It is recommended to go through user-defined types before proceeding further, as the upcoming examples will focus on class behavior and involve casting between classes.
Reinterpret cast is the most dangerous of all C++ casts, as it allows converting any pointer type to any other pointer type, even if they are unrelated.
Unlike static_cast, which ensures some level of type safety, reinterpret_cast simply reinterprets the memory without any validation.
The reinterpret_cast is commonly used in low-level programming, such as bit manipulation, working with hardware registers, or handling network protocols.
It allows for type punning, which enables treating a block of memory as a different type.
However, this cast should be used with caution, as improper usage can lead to undefined behavior and memory corruption.
reinterpret_cast can also convert a pointer into an integer type (uintptr_t), which may be useful in low-level debugging or custom memory management, but should generally be avoided in high-level C++ code.
When to Cast
Casting should be avoided whenever possible. If explicit conversion is required, consider alternative approaches before resorting to a cast. C++ provides safer mechanisms, such as uniform initialization syntax and automatic type inference, which can help minimize unnecessary conversions.
General Guidelines for Casting:
- Prefer explicit casting over implicit casting whenever possible.
- Use uniform initialization (T2{T1}) and auto to reduce unnecessary implicit conversions.
- During code reviews, scrutinize casts carefully and require clear documentation explaining their necessity. Polymorphism is a valid justification.
- If casting is unavoidable, document its usage with well-written comments next to each cast or group of casts.
- Avoid C-style casts ((T)expression), as they are unsafe, bypass type checks, and make code harder to maintain.
- Do not use reinterpret_cast unless you fully understand the potential consequences.
- Do not remove const qualifiers unless absolutely necessary, as it can lead to unintended side effects.
Constants and Compile-Time Constructs
In the introduction chapter, we introduced the idea of constants—variables that are meant to remain unchanged throughout a program's execution.
However, we later saw that const can be bypassed with const_cast, which means it doesn't provide absolute immutability.
Bypassing const this way is generally a bad design choice, as it violates the principle that constants should truly be constant.
To ensure a variable or function is genuinely immutable and can be evaluated at compile time, modern C++ provides the constexpr keyword.
This is also a type-safe alternative to the old preprocessor macros like #define and offers a more reliable way to express intent in your code.
Constexpr
The constexpr keyword declares that a variable's value is computed at compile time, and that a function can be evaluated at compile time when it is called with constant arguments.
This is especially useful for constants, mathematical functions, and lookup tables that should not change during the program’s lifetime.
Unlike const, which merely promises not to modify a variable after initialization, a constexpr variable must have a value that is computed before runtime.
This allows the compiler to optimize your code by baking the result directly into the binary.
Consteval and Constinit
Warning
This is a feature of C++20, so make sure you are using a compatible compiler version to avoid build errors.
Modern compilers have become quite good at determining which expressions can be evaluated at compile time, making constexpr slightly less explicit than it once was.
However, if you need to guarantee that a function or variable is evaluated at compile time, you should use consteval for functions and constinit for variables.
- consteval ensures a function is always evaluated at compile time. It cannot accept runtime values as arguments, making it ideal for functions that should never execute at runtime.
- constinit guarantees that a variable is initialized at compile time. It can only be used with static or thread-local variables and requires a constant expression as its initializer, like a literal or a call to a consteval function. It is specifically intended for variables with static or thread storage duration.
Keep in mind that constinit cannot be used with automatic (local) variables, and trying to initialize it with a runtime value will result in a compile error.
Padding and Alignment
Info
This section covers advanced memory management concepts, including how data is organized in memory and how to optimize it. This knowledge is generally not required unless you are working in performance-critical or low-level systems.
Padding and alignment refer to how data is organized in memory. Every data type has specific alignment requirements, meaning it must be stored at memory addresses divisible by its size to ensure efficient access. If these requirements are not met, the compiler adds padding to fill the gaps, potentially wasting memory.
For example, an int typically requires 4-byte alignment, meaning it must start at a memory address divisible by 4.
We can retrieve the alignment requirement of any data type by including the <type_traits> header, which provides the std::alignment_of_v<T> variable template.
The _v suffix stands for "value" and is a shorthand for std::alignment_of<T>::value, giving you the result directly as a number.
This concept is especially important when creating structs. If the alignment is not correct, the compiler will automatically add padding to ensure proper alignment, potentially wasting otherwise usable memory.
Tip
To detect padding and alignment issues, you can use Clang's -Wpadded compile flag, which will emit warnings for potentially inefficient struct layouts.
Additionally, clang-tidy offers checks like clang-analyzer-optin.performance.Padding for deeper analysis.
GCC also accepts -Wpadded, and MSVC reports added padding through warning C4820, which is off by default; otherwise, manual inspection or static analysis tools may be necessary.
As we can see from this example, the original struct wastes 13 bytes of memory due to padding, resulting in approximately 75% overhead. This happens because each member must be aligned according to its alignment requirements, leaving gaps in memory.
We can significantly reduce padding by reordering the struct members from largest to smallest data types. This approach minimizes internal fragmentation, though some padding may still remain at the end, as it is platform-dependent and cannot be completely eliminated.
Difference Between Alignment and Size
While sizeof gives you the total memory size (in bytes) that a type occupies, including any internal padding, std::alignment_of_v<T> tells you the alignment requirement of the type T.
Expression | Description |
---|---|
sizeof(T) | How much space an object of type T takes up |
std::alignment_of_v<T> | How that object must be aligned in memory |
Alignment ensures that variables are placed in memory in a way that’s efficient for the CPU. On many systems, certain types must be aligned to memory addresses divisible by 4, 8, 16, etc. Misalignment can lead to performance penalties or even runtime errors on strict architectures.
Type Punning
Info
This section covers advanced memory management concepts, including how data is organized in memory and how to optimize it. This knowledge is generally not required unless you are working in performance-critical or low-level systems.
C++ is considered a strongly typed language, but it still allows for certain implicit conversions—such as converting integers to floating-point numbers or pointers between compatible types. While variables in C++ have explicitly defined data types, the language also provides low-level memory access, giving us the ability to reinterpret data in ways that would typically be restricted in stricter languages. This flexibility comes from the ability to directly access and manipulate memory, allowing a block of memory to be treated as a different type when necessary.
The reason why the output is so strange is that we pass the address of a, which is an integer (4 bytes), to a double pointer.
The pointer tells the compiler that the value under this address is a double (8 bytes).
Therefore, when we try to output it, the program reads 4 bytes past our value and accesses some unknown memory.
What is done in this example is called raw memory manipulation. This is also one of the reasons why C++ is such a powerful language: it can manipulate memory very easily and freely. While the code in this example should probably never exist in a real-world application, it demonstrates how data types are represented in memory and how they can be manipulated.
However, what can be useful is the representation and access to memory. In the last example, we moved through the struct's values using raw memory manipulation, but it could also be easily interpreted as a C-style array, for example.
Here, we just manipulate it using indices.
This is possible because the int pointer is pointing to the address of x, and if it moves by four bytes (which is specified by the [1]), it will then point to the variable y.
This works because structs are also organized in contiguous memory.
Dynamic Typing
C++ is known as a statically-typed language; however, it also provides mechanisms that allow for a certain level of dynamic typing, meaning the data types of variables can be determined at runtime.
Optional Data
Warning
This is a feature of C++17, so make sure you are using a compatible compiler version to avoid build errors.
A common situation when reading from a file or another dynamic source is that the function responsible may or may not find something to return. The common approach is to return a signal that nothing was found, for example an empty string if a file is empty or missing content. However, this isn't ideal, as it makes it hard to distinguish between missing data and intentionally empty data.
This is where std::optional becomes useful; to use it, you must include the <optional> header.
It is a type that holds a value that may or may not be present, and it stores that value inline rather than on the heap.
It provides a clear and type-safe way to represent optional data, making your code more expressive and reliable when handling uncertain results.
Returning an empty string isn’t necessarily wrong, but it’s not ideal either: it’s somewhat clunky and lacks clarity. It's similar to adding a separate flag variable to indicate whether the output was successful.
These kinds of workarounds are unnecessary in modern C++.
The cleaner solution is to use std::optional, which directly expresses the idea of a value that might not be present.
With std::optional as the return type, the function explicitly indicates that the returned data may or may not be present.
The check for the presence of data is clean and straightforward: we simply check if the std::optional contains a value.
If it does, we dereference the optional variable to access the data.
If the file reading process fails, the function returns an empty std::optional (for example, a default-constructed one), indicating the absence of data.
Using std::optional is often preferable to returning an empty string, as it avoids unnecessary string construction and copying, effectively representing a "no value" state without the overhead of a full string object.
If the file could not be read and you want to set the data to a default value, you can use the .value_or method. This ensures that if data contains a value, it will be used; otherwise, the provided default value will be assigned.
Beyond its primary use case of representing "no value" or a missing result, std::optional is also extremely useful when you need to delay initialization of a variable, especially in situations where immediate construction isn’t possible or desirable.
For example, in game development or GUI applications, you often have to wait until certain systems are initialized (like a rendering context or asset loader) before you can safely create certain objects.
Instead of using raw pointers, manual flags, or heap allocation, you can wrap your object in a std::optional and initialize it later with the .emplace() method when the conditions are right.
Multi-Type Data Holder
Warning
This is a feature of C++17, so make sure you are using a compatible compiler version to avoid build errors.
What often happens when handling inputs from external sources—like files, command-line arguments, or network responses—is that the data may come in different types. For example, a value might be a string in one case, a number in another, or something else entirely.
This is where std::variant becomes useful. It allows you to store a value that could be one of several types, all in a single, type-safe variable.
Think of it as a modern, type-safe union, making your code more flexible without sacrificing safety or clarity.
To use std::variant, you must include the <variant> header.
A plain union is still useful for low-level optimization but does not offer type safety. When absolute optimization is not required, std::variant should be preferred.
The value stored in a std::variant is retrieved with std::get<T>().
However, if the stored type doesn't match the requested type, it throws a std::bad_variant_access exception.
To avoid this, you can use std::get_if<T>(), which returns a pointer to the value if the variant holds the requested type and nullptr otherwise.
Any Data Type
Warning
This is a feature of C++17, so make sure you are using a compatible compiler version to avoid build errors.
What often happens in flexible systems—like plugin architectures, event dispatchers, or scripting layers—is the need to store and pass around values of any type, without knowing in advance what that type will be.
This is where std::any becomes useful. It is a type-safe alternative to void*, allowing you to store any value while preserving its type information.
To use std::any, include the <any> header.
At first glance, std::any may seem similar to std::variant, with the key difference being that std::any doesn't require you to specify possible types in advance.
However, this flexibility is also its biggest drawback: std::variant is often the better choice because its explicitly defined type list ensures type safety at compile time.
For example, to store a string in a std::any you have to assign it using std::string("String").
If you write just "String", it is stored as a const char*, which leads to a type mismatch when retrieving it with std::any_cast<std::string>(data).
With std::variant, this problem doesn't arise: because the type must be one of the predefined options, "String" would be implicitly converted to std::string when matched against the allowed types.
Additionally, std::variant can be more memory-efficient when working with large data types, especially if you're passing them by reference.
std::any, on the other hand, always stores a copy of the value and may allocate memory on the heap for larger types, which can introduce unnecessary overhead.
In short, std::any should only be used when you truly need type-erased behavior and cannot determine the set of possible types in advance.
Even then, the design should be reconsidered, since std::variant is in most cases the safer alternative.
RTTI (Run-Time Type Identification)
Run-Time Type Identification (RTTI) is a feature in C++ that allows you to determine the actual type of an object at runtime. This is useful in scenarios involving inheritance and polymorphism, particularly when working with base class pointers or references that may point to derived class objects.
RTTI supports two key tools:
- typeid operator
- dynamic_cast operator
Type Id
The typeid operator returns a reference to a std::type_info object, which contains information about the type of the expression passed to it.
To use typeid, include the <typeinfo> header.
Although the output may look odd at first glance, 7Derived is the mangled name of the Derived type.
In many C++ compilers (like GCC or Clang), typeid(...).name() returns an encoded string where:
- 7 indicates the number of characters in the class name (Derived has 7 letters).
- Derived is the actual class name.
This process is called name mangling, and it's used by compilers to encode type information for linking and overloading support.
Considerations When Using RTTI
- RTTI increases memory overhead because types need to store additional information about themselves.
- Each time a dynamic_cast is performed, a runtime check is required to validate whether the cast is correct.
It is possible to disable RTTI, but the process varies depending on the compiler (for example, -fno-rtti on GCC and Clang, or /GR- on MSVC).
When RTTI is turned off, the compiler will typically reject or warn about code that relies on it.
Where such code still compiles, dynamic_cast will exhibit unpredictable behavior, potentially leading to crashes or undefined results.
Chapter Summary
By completing this chapter, you should now have a solid understanding of data types in C++—from the basic building blocks like integers, floating-point numbers, and characters, to more advanced tools like RTTI and type punning.
You’ve also explored how data types affect memory layout, performance, and the overall type safety of your programs.
In particular, you've learned:
- How to safely convert between types using modern C++ casting operators like static_cast and dynamic_cast, which offer greater clarity and safety than traditional C-style casts.
- The flexibility of dynamic typing through std::optional, std::variant, and std::any, each suited for different scenarios where type uncertainty or variability is involved.
- How type punning allows you to reinterpret data at a low level, and how RTTI (Run-Time Type Information) can help identify types at runtime, while being mindful of their performance implications.