C#: Value and Reference Types
C# is a programming language developed by Microsoft and the main language of the .NET platform. With each new version, Microsoft introduces improvements that enhance the developer experience and make it easier to build robust software. The latest version of C# (v9) was recently released.
This post starts a series of publications about C#. Just as a writer needs to master a language in order to build coherent sentences with it, a software developer needs to explore the language that serves as the instrument for producing software. So, let’s practice!
Common Type System
Software written for the .NET platform is language-independent. This means two things:
The platform will be able to execute any program that can be compiled to the Intermediate Language. F# and Visual Basic are examples of languages that compile to the Intermediate Language.
You can reference libraries written in F# from a C# project (and vice versa), as long as they are compiled for .NET. For this to work, these languages must follow certain conventions.
The data types in C# are derived from the Common Type System (CTS), which is a system for specifying how languages that compile to the .NET platform should work with types.
The CTS defines, for example, various categories of types: classes, structures, enumerations, interfaces, and delegates. It also defines access modifiers, how inheritance should be defined, and operator overloading.
Data Types
In C#, when you create a variable and assign a value to it, the storage of that value can occur in different ways depending on the type. This is because the CTS defines value types and reference types.
Value Types
When you assign a value to a variable of a value type, the variable itself stores an instance of that type in memory. When you assign this variable to another variable of the same type, the value is copied. For example:
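As a minimal sketch of this copy (the class and method names are illustrative, and the final assignment to y is added only to make the two copies visibly independent):

```csharp
using System;

public static class ValueCopyDemo
{
    // Returns both variables after y has been changed, to show that x is unaffected.
    public static (int X, int Y) CopyThenChange()
    {
        int x = 20; // a memory space holding the value 20
        int y = x;  // a second memory space holding its own copy of 20
        y = 30;     // illustrative addition: only y's copy changes
        return (x, y);
    }

    public static void Main()
    {
        var (x, y) = CopyThenChange();
        Console.WriteLine($"x = {x}, y = {y}"); // x = 20, y = 30
    }
}
```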
When assigning the value 20 to the variable x, a space in memory is allocated to store the value 20. When assigning the variable x to the variable y, a new space in memory is allocated to store a copy of the value 20. That is, there are two separate memory spaces for storing this value.
In C#, value types derive from System.ValueType. They include integral numeric types such as int and byte; floating-point numeric types such as float and double; the types bool and char; and any struct, enum, or value tuple.
Reference Types
Unlike a value type variable, a reference type variable does not hold its content directly. When it is initialized, one space in memory is allocated to store the assigned content, and the variable itself occupies another memory space that holds a reference to the first memory position of that content. This means that when one reference type variable is assigned to another, only the reference is copied. For example:
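A minimal sketch of such an assignment, assuming a hypothetical Car class with a single Name property (the post does not show its definition, so the shape below is a guess):

```csharp
using System;

// Hypothetical class used for illustration only.
public class Car
{
    public string Name { get; set; }
    public Car(string name) { Name = name; }
}

public static class ReferenceCopyDemo
{
    public static void Main()
    {
        Car car = new Car("Gol"); // the object is allocated; car holds a reference to it
        Car car2 = car;           // only the reference is copied, not the object

        Console.WriteLine(object.ReferenceEquals(car, car2)); // True
    }
}
```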
When assigning the object resulting from the expression new Car("Gol") to the variable car, a space in memory is allocated to store all the values that the car object requires. In addition, a new space in memory is allocated to store the memory address (reference) of the stored object. Therefore, when assigning the content of the variable car to the variable car2, only the reference of the stored object is copied. This means in practice that these two variables point to the same location in memory.
Now consider the following situation:
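A sketch of that situation, again assuming a hypothetical Car class with a settable Name property; the new name "Uno" is an arbitrary choice:

```csharp
using System;

// Hypothetical class used for illustration only.
public class Car
{
    public string Name { get; set; }
    public Car(string name) { Name = name; }
}

public static class SharedObjectDemo
{
    public static void Main()
    {
        Car car = new Car("Gol");
        Car car2 = car;     // both variables refer to the same object

        car2.Name = "Uno";  // change made through car2

        Console.WriteLine(car.Name);  // Uno
        Console.WriteLine(car2.Name); // Uno
    }
}
```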
Notice that the Name property was changed through the variable car2. Since car2 points to the same memory position as car, printing this property through either variable returns the same value, reflecting the change made to the Name property.
Since reference type variables store only memory addresses, setting the variable car2 to null would have no impact on car.
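A small sketch of this, using the same hypothetical Car class:

```csharp
using System;

// Hypothetical class used for illustration only.
public class Car
{
    public string Name { get; set; }
    public Car(string name) { Name = name; }
}

public static class NullAssignmentDemo
{
    public static void Main()
    {
        Car car = new Car("Gol");
        Car car2 = car;

        car2 = null; // only the reference stored in car2 is cleared

        Console.WriteLine(car.Name);     // Gol
        Console.WriteLine(car2 == null); // True
    }
}
```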
The same happens when a new Car object is created and assigned to the variable car2. New memory spaces are allocated to store the content of this object, and car2 now contains the reference to this newly allocated memory space.
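Sketched with the same hypothetical Car class (the name "Uno" is arbitrary):

```csharp
using System;

// Hypothetical class used for illustration only.
public class Car
{
    public string Name { get; set; }
    public Car(string name) { Name = name; }
}

public static class ReassignmentDemo
{
    public static void Main()
    {
        Car car = new Car("Gol");
        Car car2 = car;

        car2 = new Car("Uno"); // car2 now refers to a newly allocated object

        Console.WriteLine(car.Name);                          // Gol
        Console.WriteLine(car2.Name);                         // Uno
        Console.WriteLine(object.ReferenceEquals(car, car2)); // False
    }
}
```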
In C#, classes, interfaces, and delegates are reference types. The built-in types string and dynamic are also reference types.
Going a bit further down…
All this behavior of value types and reference types is related to the way the operating system provides a running process with memory. Two of the memory areas involved are the abstractions known as the stack and the heap.
Stack
The stack memory area is much smaller than the heap and works as a LIFO (last in, first out) data structure. Every time a function is called, a block of memory is reserved on the stack for storing the function’s local variables. When the function returns, this area is deallocated. Because this area is small, an error called stack overflow can occur when the functions in progress need more space than the stack has available, for example in very deep recursion. Due to its size and the way it is navigated, retrieving data from the stack is very fast. Additionally, the memory spaces of the stack are not fragmented.
Heap
The heap memory area is intended for the dynamic allocation of variables. Its size varies according to use. It is not accessed directly like the stack. Instead, access to values stored in it goes through references, such as those held by variables in the stack, that point to the data in the heap. When no reference to a memory position in the heap remains, the garbage collector is responsible for deallocating it. Retrieving a value stored in the heap may not be as fast as in the stack due to its size and fragmentation.
Memory areas vs Data Types
Given the characteristics of each memory area, you might be wondering where each data type is stored. In C#, every instance of a reference type is stored in the heap memory area. Value types, in general, are stored in the stack, with two exceptions: (1) when a value type is declared as a member of a class, it is stored in the heap along with the rest of the class; (2) when a value type is declared as a member of a struct, it is stored wherever the struct is stored (in the stack if the struct is a local variable of a method, or in the heap if the struct is a member of a class).
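The rules above can be sketched as follows. Point and Shape are hypothetical types, and the comments describe the typical placement discussed here; the runtime ultimately decides where values live, so treat them as a mental model rather than a guarantee.

```csharp
using System;

// Hypothetical types used for illustration only.
public struct Point
{
    public int X;
    public int Y;
}

public class Shape
{
    public int Sides;    // value type member of a class: stored in the heap with the object
    public Point Origin; // struct member of a class: also stored inside the object
}

public static class StorageDemo
{
    public static void Main()
    {
        int count = 3;                        // local value type: typically in the stack
        Point p = new Point { X = 1, Y = 2 }; // local struct: typically in the stack, fields included
        Shape s = new Shape { Sides = 4 };    // object in the heap; s holds the reference

        Point copy = s.Origin; // copies the struct out of the heap object
        copy.X = 99;           // changes only the copy

        Console.WriteLine(s.Origin.X);                  // 0
        Console.WriteLine(copy.X);                      // 99
        Console.WriteLine(count + p.X + p.Y + s.Sides); // 10
    }
}
```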
Nullable Types
Due to the way value types are stored, it is not possible to assign null to a value type. However, there are situations where a null value of such a type is desired. C# 2.0 introduced the concept of nullable value types. A nullable value type T? allows a variable of the value type T to also hold the value null. Every nullable value type T? is an instance of the generic struct System.Nullable<T>. For example:
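A minimal sketch of a nullable int; the helper ValueOr is hypothetical and only shows one common way of reading such a value safely:

```csharp
using System;

public static class NullableValueDemo
{
    // Hypothetical helper: returns the wrapped value, or the fallback when there is none.
    public static int ValueOr(int? maybe, int fallback)
    {
        return maybe ?? fallback;
    }

    public static void Main()
    {
        int? x = null;                     // shorthand for Nullable<int>
        Console.WriteLine(x.HasValue);     // False
        Console.WriteLine(ValueOr(x, -1)); // -1

        x = 10;
        Console.WriteLine(x.HasValue);     // True
        Console.WriteLine(x.Value);        // 10
    }
}
```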
Every reference type can be assigned the value null. And this can be a problem. How many NullReferenceExceptions have you seen? Many, right? To try to address this, C# 8 introduced nullable reference types, a feature that is disabled by default. When it is enabled, all reference type variables declared without the T? syntax need to be initialized with non-null values. If this rule is violated, the compiler produces a warning during static analysis of the code. For example:
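A sketch of the feature, enabled here for a single file with the #nullable enable directive. The variable name comes from the text; nickname and SafeLength are hypothetical additions. The code still compiles and runs, since the violation is reported as a warning (CS8600 on current compilers) rather than an error.

```csharp
#nullable enable
using System;

public static class NullableReferenceDemo
{
    // Hypothetical helper: accepts a possibly-null string and never dereferences null.
    public static int SafeLength(string? text)
    {
        return text?.Length ?? 0;
    }

    public static void Main()
    {
        string name = null;      // warning: null assigned to a non-nullable reference type
        string? nickname = null; // no warning: explicitly declared as nullable

        Console.WriteLine(name ?? "(no name)");  // (no name)
        Console.WriteLine(SafeLength(nickname)); // 0
    }
}
```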
With nullable reference types enabled, for the variable name to be nullable, it must be declared as string? name.
Parameter passing
In C#, parameter passing to a method can be done by value or by reference.
Pass by value
When a parameter is passed to a method by value, a copy of the argument is made in memory. Changes made to the parameter within the method do not affect the original data stored in the argument variable. For example:
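A sketch using the names that appear in the explanation (Main, DoubleIt and x); the starting value 10 and the enclosing class are assumptions:

```csharp
using System;

public static class PassByValueDemo
{
    public static void DoubleIt(int x)
    {
        x = x * 2; // changes only this method's own copy
    }

    public static void Main()
    {
        int x = 10;
        DoubleIt(x);          // the value of x is copied into the parameter
        Console.WriteLine(x); // 10
    }
}
```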
The value of the variable x from the Main method is copied to the variable x within the scope of the DoubleIt method. Thus, the variable x from the Main method does not undergo any modification, as they are different variables.
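By default, arguments in C# are always passed to a method by value. When a reference type is passed to a method, a copy of its reference is made, so the method works on the same object but cannot make the caller's variable point to a different one.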
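To illustrate what copying the reference means in practice, here is a sketch with a hypothetical Car class and two hypothetical methods, Rename and Reassign:

```csharp
using System;

// Hypothetical class used for illustration only.
public class Car
{
    public string Name { get; set; }
    public Car(string name) { Name = name; }
}

public static class ReferenceByValueDemo
{
    public static void Rename(Car car)
    {
        car.Name = "Uno"; // follows the copied reference to the caller's object
    }

    public static void Reassign(Car car)
    {
        car = new Car("Fusca"); // replaces only the method's local copy of the reference
    }

    public static void Main()
    {
        Car car = new Car("Gol");

        Rename(car);
        Console.WriteLine(car.Name); // Uno

        Reassign(car);
        Console.WriteLine(car.Name); // Uno
    }
}
```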
Pass by reference
To pass a parameter by reference, one must use the ref or out modifier.
The ref modifier is declared alongside the parameter in the method declaration, and repeated on the argument at the call site, to pass a value type by reference. For example:
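A sketch of a ref parameter; the method name DoubleIt is borrowed from the pass-by-value discussion, and the value 10 is an assumption:

```csharp
using System;

public static class PassByRefDemo
{
    public static void DoubleIt(ref int x)
    {
        x = x * 2; // writes directly to the caller's variable
    }

    public static void Main()
    {
        int x = 10;           // must be initialized before being passed with ref
        DoubleIt(ref x);      // ref is required at the call site as well
        Console.WriteLine(x); // 20
    }
}
```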
When ref is used with a reference type, the method receives a reference to the caller's variable itself rather than a copy of the object reference that the variable stores. This makes it possible for the method to make the passed variable point to a different object. For example:
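A sketch with a hypothetical Car class and a hypothetical Replace method:

```csharp
using System;

// Hypothetical class used for illustration only.
public class Car
{
    public string Name { get; set; }
    public Car(string name) { Name = name; }
}

public static class RefReferenceTypeDemo
{
    public static void Replace(ref Car car)
    {
        car = new Car("Uno"); // the caller's variable now refers to a new object
    }

    public static void Main()
    {
        Car car = new Car("Gol");
        Car original = car; // keeps a reference to the first object

        Replace(ref car);

        Console.WriteLine(car.Name);                              // Uno
        Console.WriteLine(original.Name);                         // Gol
        Console.WriteLine(object.ReferenceEquals(car, original)); // False
    }
}
```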
The out modifier is used in a similar way to ref, but the argument does not need to be initialized before the call, and the method must assign a value to it before returning. For example:
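A sketch using int.TryParse with the variable number mentioned in the text. The input "42" is an assumption, and TryHalve is a hypothetical method added to show the assignment rule from the method's side.

```csharp
using System;

public static class OutDemo
{
    // Hypothetical method: every return path must assign the out parameter.
    public static bool TryHalve(int value, out int half)
    {
        if (value % 2 != 0)
        {
            half = 0; // still required, even when reporting failure
            return false;
        }
        half = value / 2;
        return true;
    }

    public static void Main()
    {
        int number; // not initialized before the call
        bool parsed = int.TryParse("42", out number);
        Console.WriteLine(parsed); // True
        Console.WriteLine(number); // 42

        Console.WriteLine(TryHalve(10, out int half)); // True
        Console.WriteLine(half);                       // 5
    }
}
```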
The variable number is passed to the method int.TryParse by reference, since the out modifier is used. It can be passed without being initialized, but the method int.TryParse will have to initialize it before returning.
Type conversion
A value type can be converted to a reference type and vice versa. These operations are relatively costly and can affect the performance of the application when performed frequently.
Boxing
When a value type is converted to the object type, the process is called boxing. A boxing conversion allocates an object instance in the heap and copies the value into the new object.
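A minimal sketch of boxing; the Box helper is hypothetical, and the later assignment to x is added to show that the boxed object holds its own copy:

```csharp
using System;

public static class BoxingDemo
{
    // Hypothetical helper: the implicit conversion from int to object boxes the value.
    public static object Box(int value)
    {
        return value;
    }

    public static void Main()
    {
        int x = 10;
        object boxed = x; // boxing: a new object in the heap receives a copy of 10

        x = 20;           // does not affect the boxed copy

        Console.WriteLine(boxed); // 10
        Console.WriteLine(x);     // 20
    }
}
```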
Unboxing
Unboxing is the opposite: the conversion from a reference type back to a value type. An unboxing conversion copies the value out of the boxed object into a value type variable. Attempting to unbox a reference to an incompatible value type causes an InvalidCastException.
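A minimal sketch of unboxing, including the failure case; CanUnboxAsLong is a hypothetical helper used to show the exception without ending the program:

```csharp
using System;

public static class UnboxingDemo
{
    // Hypothetical helper: reports whether the object can be unboxed directly to long.
    public static bool CanUnboxAsLong(object boxed)
    {
        try
        {
            _ = (long)boxed; // unboxing requires the exact boxed value type
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        object boxed = 10;  // a boxed int
        int y = (int)boxed; // unboxing: the value is copied out of the object

        Console.WriteLine(y);                     // 10
        Console.WriteLine(CanUnboxAsLong(boxed)); // False
    }
}
```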
The figure below shows what happens in the heap and the stack when the boxing and unboxing conversion processes occur.