Introduction

Abstractions are an essential part of programming. Almost everywhere you look, in one form or another, abstractions were introduced to simplify our work. This is especially true in game development, where different platforms have specific traits that you must support or use.

Unity solves the bulk of these issues (as any game engine should). Nevertheless, abstractions aren’t only about supporting different hardware or operating systems; they are also about supporting different behaviors, the ability to write unit tests, and much more.

As with almost any tool, abstractions have drawbacks. For C#, these were covered in depth, albeit long ago, in the following articles:

Although C# is the primary (and by now the only) programming language in Unity, that doesn’t mean it carries all the benefits and disadvantages of the CLR or Mono runtimes, because Unity uses its own translator named IL2CPP¹, which converts the compiled code to C++.

IL2CPP internals have been covered extensively before², but I want to emphasize the actual price of using abstractions and concentrate on two things:

A performance comparison on a real-world example (magnitude calculation), with charts and a comparison between various types of abstractions³
A look at the compiler side: which optimizations it performs for non-virtual calls and which can’t be done for virtual calls

Tests

Configuration

Standalone build with an Apple Silicon target.

The hardware was a MacBook Pro with the following specifications:

M3 Pro chip with 12 cores (6 performance and 6 efficiency)
36 GB of RAM
Plugged into power

Build settings:

API Compatibility Level: .NET Standard 2.1
IL2CPP Code generation: Faster Runtime

Preparation

Code

C# offers several ways to make virtual calls. I will use only two: abstract classes and interfaces⁴:

public abstract class AbstractClass
{
	public abstract void InvokeAbstract();
}

public interface PureInterface
{
	public void InvokeInterface();
}

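public class ImplementationClass : AbstractClass, PureInterface
{
	// Explicit interface implementation: reachable only through a PureInterface reference.
	void PureInterface.InvokeInterface()
	{
	}

	// Virtual call target when invoked through an AbstractClass reference.
	public override void InvokeAbstract()
	{
	}

	// Non-virtual method, used for the direct call.
	public void InvokeDirect()
	{
	}
}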

The explicit interface implementation was used only for better visibility; it makes no difference in the upcoming tests.

And the vector magnitude calculation:

private void CalculateMagnitude(double[] results)
{
    for (var i = 0; i < IterationCount; i++)
    {
        var vec = new UnityEngine.Vector3( i, i, i );
        
        results[i] = System.Math.Sqrt(vec.x * vec.x + vec.y * vec.y + vec.z * vec.z);
    }
}

A few comments about the code above:

Vector3.Magnitude from the Mathf lib wasn’t used because it contains a lot of unnecessary operations.
The results array was added to prevent the compiler from removing the code.
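
The measurement loop itself is not shown here. As a rough idea only, a head-to-head timing of the three call kinds could be sketched as below. CallTimer, Measure, and the iteration count of 1,000,000 are assumptions made for illustration, the three types are repeated only so the sketch compiles on its own, and plain System.Diagnostics.Stopwatch is used instead of any Unity-specific API.

```csharp
using System;
using System.Diagnostics;

// Same shape as the types above, repeated so the sketch is self-contained.
public abstract class AbstractClass { public abstract void InvokeAbstract(); }
public interface PureInterface { void InvokeInterface(); }

public class ImplementationClass : AbstractClass, PureInterface
{
    void PureInterface.InvokeInterface() { }
    public override void InvokeAbstract() { }
    public void InvokeDirect() { }
}

// Hypothetical harness: the names and the iteration count are assumptions.
public static class CallTimer
{
    public const int IterationCount = 1_000_000;

    // Returns elapsed milliseconds in the order: direct, abstract, interface.
    public static double[] Measure()
    {
        var implementation = new ImplementationClass();
        AbstractClass asAbstract = implementation;
        PureInterface asInterface = implementation;
        var timings = new double[3];

        var watch = Stopwatch.StartNew();
        for (var i = 0; i < IterationCount; i++) implementation.InvokeDirect();
        timings[0] = watch.Elapsed.TotalMilliseconds;

        watch.Restart();
        for (var i = 0; i < IterationCount; i++) asAbstract.InvokeAbstract();
        timings[1] = watch.Elapsed.TotalMilliseconds;

        watch.Restart();
        for (var i = 0; i < IterationCount; i++) asInterface.InvokeInterface();
        timings[2] = watch.Elapsed.TotalMilliseconds;

        return timings;
    }
}
```

Calling CallTimer.Measure() in a Debug build and then in a Release build is one way to observe differences of the kind discussed in the next sections. Absolute numbers will vary with hardware and build settings.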

Debug configuration

The Debug configuration is a decent representation of how abstractions behave when almost nothing is changed at compile time.

Take a look at the following chart:

Debug Abstractions Chart

We can see that a direct call is more than 3 times faster than an abstract call and more than 7 times faster than an interface call. More interesting is the magnitude calculation, which is 2.6 times slower than an interface call. Here you might say: “Of course it is slower, it is an expensive operation.” But remember that the magnitude code also has to save its result to the array, whereas for the interface we just call an empty method. Moreover, these interface calls never hit a cache miss; in real code such misses might occur, and the calls would be even more expensive⁵.

In the Debug configuration, three virtual calls through an interface require more resources than a naive vector magnitude calculation.

Release configuration

The Release configuration is more interesting because compiler optimizations kick in.

Take a look at the chart below:

Release Abstractions Chart

Here it is better to start from the end. The vector magnitude calculation took the same time as a method call through the interface. The abstract class call is roughly 25% faster than the interface call. Still, the most interesting part is the direct call: it took 0 time to complete.

That happened not because of some drastic optimization by the compiler, but because the operation simply never happened. The compiler removed the call entirely, since it does nothing in this context and removing it changes nothing.

In the Release configuration, the vector magnitude calculation took the same time as a method call through the interface.

Compile-time

Alright, we saw that the compiler greatly reduced the cost of non-virtual calls by removing redundant code, but does that even count? We don’t often leave useless code in our code base, although that depends on the size of the project.

The compiler can optimize code in many ways, and most importantly, the result depends on the context.

For example:

private void IterativeInvocation()
{
    for (var i = 0; i < 10000; i++)
    {
        // Processing takes an int, so pass the loop index.
        Processing(i);
    }
}

private void Processing(int f)
{
    int k, t, d;

    /*
    Expensive operations
    */
}

Here we can ask two questions⁶:

Where will the k, t, and d variables be stored?
Will the Processing method be inlined?

Answers:

Most likely in registers, because the compiler is aware of the context in which Processing is called (the number of iterations and how the variables are used).
If the method is not big and does not exceed a certain number of calls, it will most likely be inlined.
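
To make the inlining answer more concrete, here is a hand-written illustration of what inlining conceptually does. InliningSketch, Square, SumWithCalls, and SumInlined are hypothetical names, and the second method only imitates by hand what an optimizer may produce; whether a given compiler actually does this depends on the compiler and its optimization level.

```csharp
public static class InliningSketch
{
    // A small, non-virtual method: a typical inlining candidate.
    private static int Square(int f)
    {
        return f * f;
    }

    // As written in the source: one call per iteration.
    public static long SumWithCalls(int count)
    {
        long sum = 0;
        for (var i = 0; i < count; i++)
        {
            sum += Square(i);
        }
        return sum;
    }

    // Hand-inlined equivalent: the call is replaced by the method body,
    // so no call overhead remains and the loop can be optimized as a whole.
    public static long SumInlined(int count)
    {
        long sum = 0;
        for (var i = 0; i < count; i++)
        {
            sum += i * i;
        }
        return sum;
    }
}
```

Both methods return the same value for the same input; only the shape of the generated code differs.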

An example with an abstraction:

private void IterativeInvocation()
{
    Abstraction instance = new Implementation();

    for (var i = 0; i < 10000; i++)
    {
        // Processing takes an int, so pass the loop index.
        instance.Processing(i);
    }
}

public interface Abstraction
{
    public void Processing(int f);
}

public class Implementation : Abstraction
{
    public void Processing(int f)
    {
        int k, t, d;

        /*
        Expensive operations
        */
    }
}

The situation for the compiler has changed significantly, and so have the answers to the same two questions:

Most likely on the stack, because the compiler doesn’t know where the method will be called from or how often.
It won’t be inlined in any case, because which method will be called here is unknown at compile time.

This means that the cost of an abstraction isn’t only about “get the method address from the vtable and call it”. Using abstractions throws the compiler out of the code execution context, and a bulk of optimizations can’t be made.


Which optimizations can be impacted if you use abstractions?

Inlining⁷
Register usage instead of the stack
Unused code stripping
Unused call stripping
Compile-time branch evaluation

This is not a full list; it can be much longer.
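
As an illustration of the last item, consider the sketch below. All names in it (IFeatureFlag, AlwaysOff, BranchSketch, and its methods) are hypothetical. With the direct call, an optimizer that inlines IsEnabledDirect can see that the condition is always false and may drop the branch, or even the whole loop. Through the interface, the condition depends on whichever implementation arrives at run time, so the branch generally has to stay. Whether a particular compiler performs this folding is an assumption, not a guarantee.

```csharp
public interface IFeatureFlag
{
    bool IsEnabled();
}

public sealed class AlwaysOff : IFeatureFlag
{
    public bool IsEnabled()
    {
        return false;
    }
}

public static class BranchSketch
{
    // Non-virtual and constant: after inlining, the condition below is a known false.
    private static bool IsEnabledDirect()
    {
        return false;
    }

    public static int CountDirect(int count)
    {
        var hits = 0;
        for (var i = 0; i < count; i++)
        {
            if (IsEnabledDirect()) hits++; // candidate for compile-time evaluation
        }
        return hits;
    }

    public static int CountViaInterface(IFeatureFlag flag, int count)
    {
        var hits = 0;
        for (var i = 0; i < count; i++)
        {
            if (flag.IsEnabled()) hits++; // target unknown here, checked on every iteration
        }
        return hits;
    }
}
```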

One more thing

Compilers have an optimization even for calls through abstractions, known as devirtualization: when the target is known at compile time, a virtual call can be replaced with a non-virtual one. But here is the catch: C# code translated to C++ in Unity doesn’t support devirtualization.⁸

Why? Devirtualization is part of the C++ compiler itself, and to benefit from it Unity would have to use C++’s own abstractions (virtual methods). It doesn’t, because IL2CPP uses its own approach to calling virtual methods.

Translated C++ code in Unity doesn’t support devirtualization
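
To show what devirtualization means at the source level, here is a small sketch. Shape, Circle, and DevirtualizationSketch are hypothetical names. It assumes that a sealed class is one of the cases where a compiler can prove the call target, which is a general property of sealed types; whether IL2CPP takes advantage of it in a given Unity version should be checked against the article mentioned in the footnote.

```csharp
public abstract class Shape
{
    public abstract int Sides();
}

// sealed: no subclass can exist, so Circle.Sides is the only possible target
// for a call made through a Circle reference.
public sealed class Circle : Shape
{
    public override int Sides()
    {
        return 0;
    }
}

public static class DevirtualizationSketch
{
    // Any Shape subclass may arrive here, so the target is resolved at run time.
    public static int CallVirtual(Shape shape)
    {
        return shape.Sides();
    }

    // The static type is sealed, so a compiler is free to replace the virtual
    // call with a direct call to Circle.Sides (and then possibly inline it).
    public static int CallOnSealed(Circle circle)
    {
        return circle.Sides();
    }
}
```

Both methods return the same result; the difference is only in how much the compiler knows about the call target.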

¹ Why not talk about Mono? The answer is simple: you can no longer ship an application using the Mono backend on several platforms, e.g. Android and iOS, because of a lack of Arm64 support.
² An introduction to IL2CPP internals by Josh Peterson; IL2CPP Function and Boxing Costs by Jackson Dunstan.
³ Jackson already covered this a long time ago in his article IL2CPP Function and Boxing Costs, but he used instruction count as the performance criterion; I prefer a head-to-head measurement approach.
⁴ Delegates and anonymous methods can also be considered virtual methods.
⁵ IL2CPP Function and Boxing Costs by Jackson Dunstan.
⁶ The answers here are assumptions, because everything depends on the compiler and the optimization level passed to it.
⁷ From my perspective, this is the most important optimization, because inlined code also affects how operations are processed by the CPU, e.g. out-of-order execution.
⁸ In fact, it exists, but strictly for specific cases: IL2CPP optimizations: Devirtualization.
