C# Source Generators: A Comprehensive Guide

Unlocking Code Automation: Dive Deep into Source Generators in C#


Introduction

The concept of generating source code isn’t new. Historically, developers have used various tools and techniques like T4 templates or code generators to automate boilerplate or repetitive code. However, these methods often resulted in a disconnect between the original source and the generated code, making it a challenge to understand and maintain.

C# Source Generators are an exciting addition to the C# compiler’s toolchain. They allow developers to generate additional source code during the compilation process. Unlike traditional code generation tools, which write separate files that are then compiled along with the rest of your code, Source Generators produce code in memory, so it becomes a seamless part of the compilation and, by default, nothing is written to the filesystem.

At its core, a Source Generator is a piece of code that runs during compilation and can inspect your program to produce additional source files. Generators are strictly additive: they can add new code to the compilation, but they cannot modify or remove code you have written. Think of it as a metaprogramming tool; you’re writing code that writes code!

Code automation, particularly with tools like Source Generators, provides several benefits:

  • Reduced Boilerplate: Every developer knows the pain of writing repetitive code. Source Generators can automate the creation of boilerplate, ensuring consistency and reducing manual effort.
  • Performance Optimizations: With the ability to generate code at compile-time, developers can introduce specific optimizations tailored to the application’s needs, often leading to more performant and efficient code.
  • Dynamic Code Generation: Depending on the application’s context, Source Generators can produce different code, offering a level of dynamism that’s hard to achieve with traditional programming.
  • Enhanced Maintainability: By reducing manual code and automating repetitive tasks, the codebase becomes more maintainable. There’s less manual code to debug, and the generated code follows a consistent pattern, making it easier to understand and manage.
  • Consistency and Standardization: When code is generated automatically, it adheres to a set pattern or standard. This ensures that team members are always on the same page, leading to fewer discrepancies and conflicts.

How does it work?

Source Generators are deeply integrated into the Roslyn compilation pipeline. When you compile your C# code, the compiler first parses the source files into syntax trees and builds a compilation from them. The generator receives that compilation, can analyze it, and can add new source text, which the compiler parses into additional syntax trees. The final compilation includes both the original and the generated syntax trees.

https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/media/source-generators/source-generator-visualization.png

To truly harness the power of Source Generators, one must understand the Roslyn APIs, particularly Syntax Trees and Semantic Models. A Syntax Tree represents the syntactic structure of your code, while a Semantic Model provides richer information about the meaning or semantics of the code.

// In your SourceGeneratorLibrary

using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

[Generator]
public class SimpleGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context) { }

    public void Execute(GeneratorExecutionContext context)
    {
        var sourceCode = @"
        namespace Generated
        {
            public static class HelloWorld
            {
                public static void SayHello() => System.Console.WriteLine(""Hello from generated code!"");
            }
        }";

        context.AddSource("HelloWorldGenerated", SourceText.From(sourceCode, System.Text.Encoding.UTF8));
    }
}

In this simple example, the generator adds a new class HelloWorld that contains a method SayHello. When this generator runs during compilation, the method can be called from any other part of the application, even though it doesn't exist in any of the source files.

using Generated;

public class Program
{
    public static void Main(string[] args)
    {
        // HelloWorld lives in the Generated namespace emitted by the generator
        HelloWorld.SayHello();
    }
}
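The same output can also be produced with the newer incremental generator API, which Roslyn 4.0 and later expose as IIncrementalGenerator. The following is a minimal sketch, not part of the original sample: the class name HelloWorldIncrementalGenerator is hypothetical, and it is meant to be used instead of SimpleGenerator, not alongside it (both would emit the same HelloWorld class).

```csharp
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

// Sketch: the "hello world" output written against IIncrementalGenerator.
// The class name is hypothetical; it replaces SimpleGenerator rather than joining it.
[Generator]
public class HelloWorldIncrementalGenerator : IIncrementalGenerator
{
    // Constant source text; nothing here depends on the user's code.
    private const string SourceCode = @"
namespace Generated
{
    public static class HelloWorld
    {
        public static void SayHello() => System.Console.WriteLine(""Hello from generated code!"");
    }
}";

    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Post-initialization output is registered once and does not depend on user code.
        context.RegisterPostInitializationOutput(ctx =>
            ctx.AddSource("HelloWorldGenerated.g.cs", SourceText.From(SourceCode, Encoding.UTF8)));
    }
}
```

Because the output here is constant, no pipeline over syntax or symbols is needed; generators that react to user code would build one inside Initialize.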

A significant strength of Source Generators is the ability to analyze existing code. By applying custom attributes to methods, classes, or properties, developers can provide hints or metadata to the Source Generator. The generator can then act on these hints, producing code tailored to specific scenarios.

QuickStart

  • Create a .NET console application targeting .NET 7, with the following code:

namespace SourceGeneratorExample
{
    internal class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello, World!");
        }
    }
}
  • Add a .NET Standard class library project for the source generator to the solution. The source generator has to live in a separate library.

Give the project a name; this article uses MySourceGenerators.

Make sure the project targets netstandard2.0. ⚠️ This is required for the source generator to work.

  • Add the following NuGet packages to the MySourceGenerators project:
Install-Package Microsoft.CodeAnalysis.Analyzers
Install-Package Microsoft.CodeAnalysis.CSharp

⚠️ The installed version of `Microsoft.CodeAnalysis.CSharp` must not be newer than the compiler version used by Visual Studio; matching it exactly is the simplest choice. If the package is newer than the compiler, the generator will not load. To determine the compiler version that Visual Studio utilizes, follow these steps:

Given that Visual Studio utilizes version 4.6, I need to install Microsoft.CodeAnalysis.CSharp 4.6, even though the latest available version is 4.7.

  • Add the EnforceExtendedAnalyzerRules setting to the MySourceGenerators project file:
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
    <EnforceExtendedAnalyzerRules>true</EnforceExtendedAnalyzerRules>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.CodeAnalysis.Analyzers" Version="3.3.4">
      <PrivateAssets>all</PrivateAssets>
      <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
    </PackageReference>
    <PackageReference Include="Microsoft.CodeAnalysis.CSharp" Version="4.6.0" />
  </ItemGroup>

</Project>
  • Create a new C# file named SimpleSourceGenerator.cs that specifies your own Source Generator like so:
using Microsoft.CodeAnalysis.Text;
using Microsoft.CodeAnalysis;
using System.Text;

namespace MySourceGenerators
{
    [Generator]
    public class SimpleSourceGenerator : ISourceGenerator
    {
        public void Initialize(GeneratorInitializationContext context) { }

        public void Execute(GeneratorExecutionContext context)
        {
            var sourceCode = @"
        namespace generated
        {
            public static class HelloWorld
            {
                public static void SayHello() => System.Console.WriteLine(""Hello from generated code!"");
            }
        }";

            context.AddSource("HelloWorldGenerated", SourceText.From(sourceCode, Encoding.UTF8));
        }
    }

}
  • In project SourceGeneratorExample, add a project reference to MySourceGenerators.

Then modify the SourceGeneratorExample project file to add the OutputItemType and ReferenceOutputAssembly attributes to that reference:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net7.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <ProjectReference Include="..\MySourceGenerators\MySourceGenerators.csproj"
                      OutputItemType="Analyzer"
                      ReferenceOutputAssembly="false" />
  </ItemGroup>

</Project>
  • Modify Program.cs in project SourceGeneratorExample:
using generated;
namespace SourceGeneratorExample
{
    internal class Program
    {
        static void Main(string[] args)
        {
            HelloWorld.SayHello();
        }
    }
}

It worked!

You can actually see the generated source code in Solution Explorer, under Dependencies > Analyzers > MySourceGenerators.

The generated source code also serves debugging purposes, allowing you to trace into it. You can even set a breakpoint within this generated code.

However, there’s a caveat. Suppose we modify the source generator and simply rename SayHello to SayHello1:

public void Execute(GeneratorExecutionContext context)
{
    var sourceCode = @"
namespace generated
{
    public static class HelloWorld
    {
        public static void SayHello1() => System.Console.WriteLine(""Hello from generated code!"");
    }
}";

    context.AddSource("HelloWorldGenerated.g.cs", SourceText.From(sourceCode, System.Text.Encoding.UTF8));
}

If you attempt to rebuild the project now, the automatically generated source file won’t reflect the changes. To update the auto-generated file, you’ll need to restart Visual Studio.

Further discussions can be found here:

Visual Studio does not reload source generator assemblies when they change on disk · Issue #48083 · dotnet/roslyn

Although Visual Studio indicates in the source code editor that `SayHello1` doesn’t exist, you can still run and debug the program, revealing that it indeed executes the updated code.

Having grasped the process of creating a source generator, let’s delve into a more intricate example.

Example

A typical web application needs a way to efficiently serialize its many data models for API communication. Instead of relying on runtime reflection, which can be slow, we can use Source Generators to pre-generate serialization code for each model. This improves performance and reduces serialization overhead.

  • Add AutoSerializeAttribute to project SourceGeneratorExample
namespace SourceGeneratorExample
{
    [AttributeUsage(AttributeTargets.Class)]
    public class AutoSerializeAttribute : Attribute
    {
    }
}
  • Add data model Product to project SourceGeneratorExample
namespace SourceGeneratorExample
{
    // partial is required so the generated Serialize() half can merge into this class
    [AutoSerialize]
    public partial class Product
    {
        public int Id { get; set; }
        public string? Name { get; set; }
        public double Price { get; set; }
    }
}
  • Implement the Source Generator SerializationGenerator in project MySourceGenerators:
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Text;

namespace MySourceGenerators
{
    [Generator]
    public class SerializationGenerator : ISourceGenerator
    {
        public void Initialize(GeneratorInitializationContext context) { }

        public void Execute(GeneratorExecutionContext context)
        {
            StringBuilder sourceBuilder = new StringBuilder();

            foreach (var syntaxTree in context.Compilation.SyntaxTrees)
            {
                var model = context.Compilation.GetSemanticModel(syntaxTree);

                // Look for classes with the AutoSerialize attribute
                var classes = syntaxTree.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>();

                foreach (var classDecl in classes)
                {
                    // Resolve the declaration to its symbol so its attributes can be inspected
                    if (model.GetDeclaredSymbol(classDecl) is INamedTypeSymbol classSymbol)
                    {
                        // AttributeClass can be null when the attribute type does not resolve
                        var autoSerializeAttribute = classSymbol.GetAttributes().FirstOrDefault(ad => ad.AttributeClass?.Name == "AutoSerializeAttribute");

                        if (autoSerializeAttribute != null)
                        {
                            sourceBuilder.Append(GenerateSerializationCode(classSymbol));
                        }
                    }
                }
            }

            context.AddSource("SerializationGenerated", SourceText.From(sourceBuilder.ToString(), Encoding.UTF8));
        }

        private string GenerateSerializationCode(INamedTypeSymbol classSymbol)
        {
            StringBuilder sb = new StringBuilder();

            sb.AppendLine($"namespace {classSymbol.ContainingNamespace}");
            sb.AppendLine("{");
            sb.AppendLine($"    public partial class {classSymbol.Name}");
            sb.AppendLine("    {");
            sb.AppendLine("        public string Serialize()");
            sb.AppendLine("        {");
            sb.Append(@"            return $@""{{");

            var properties = classSymbol.GetMembers().OfType<IPropertySymbol>().ToList();
            for (int i = 0; i < properties.Count; i++)
            {
                var property = properties[i];
                if (i != properties.Count - 1)
                {
                    if (property.Type.SpecialType == SpecialType.System_String)
                    {
                        sb.Append($@"""""{property.Name}"""":""""{{{property.Name}}}"""",");
                    }
                    else
                    {
                        sb.Append($@"""""{property.Name}"""":{{{property.Name}}},");
                    }
                }
                else
                {
                    if (property.Type.SpecialType == SpecialType.System_String)
                    {
                        sb.Append($@"""""{property.Name}"""":""""{{{property.Name}}}""""");
                    }
                    else
                    {
                        sb.Append($@"""""{property.Name}"""":{{{property.Name}}}");
                    }
                }
            }
            sb.AppendLine(@"}}"";");
            sb.AppendLine("        }");
            sb.AppendLine("    }");
            sb.AppendLine("}");

            return sb.ToString();
        }

    }
}

The Execute method orchestrates the source generation process. It iterates over every syntax tree in the compilation, looking for classes adorned with the AutoSerializeAttribute. Once one is found, it invokes the GenerateSerializationCode method to produce the serialization code for that class.

The source generator generates the following code for class Product:
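Walking every syntax tree inside Execute is the simplest approach. As a hedged alternative, the ISourceGenerator model also lets the compiler hand nodes to a syntax receiver while it visits the trees. The sketch below is not part of the original sample, and the class name AttributedClassReceiver is hypothetical; it only pre-filters classes that carry some attribute, so the semantic check for AutoSerializeAttribute would still be needed afterwards.

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

namespace MySourceGenerators
{
    // Hypothetical receiver: collects class declarations that have at least one attribute list.
    // Registration (inside Initialize):
    //     context.RegisterForSyntaxNotifications(() => new AttributedClassReceiver());
    // Retrieval (inside Execute):
    //     var receiver = context.SyntaxReceiver as AttributedClassReceiver;
    public class AttributedClassReceiver : ISyntaxReceiver
    {
        public List<ClassDeclarationSyntax> Candidates { get; } = new List<ClassDeclarationSyntax>();

        public void OnVisitSyntaxNode(SyntaxNode syntaxNode)
        {
            // Purely syntactic filter; no semantic model is available at this stage.
            if (syntaxNode is ClassDeclarationSyntax classDecl && classDecl.AttributeLists.Count > 0)
            {
                Candidates.Add(classDecl);
            }
        }
    }
}
```

With a receiver in place, Execute can loop over receiver.Candidates and request a semantic model via context.Compilation.GetSemanticModel(candidate.SyntaxTree) only for those declarations.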

namespace SourceGeneratorExample
{
    public partial class Product
    {
        public string Serialize()
        {
            return $@"{{""Id"":{Id},""Name"":""{Name}"",""Price"":{Price}}}";
        }
    }
}
  • Now we can use it to convert a Product object to a JSON string:
using generated;
namespace SourceGeneratorExample
{
    internal class Program
    {
        static void Main(string[] args)
        {
            HelloWorld.SayHello();

            Product p = new Product();
            p.Id = 1;
            p.Name = "Test";
            p.Price = 100;
            
            var s = p.Serialize();
        }
    }
}

From this point onward, you can define any number of data model classes. The source generator will automatically craft a Serialize function for each one.
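One limitation is worth noting: the generated Serialize method interpolates values as-is, so a Name containing a double quote, or a Price formatted under a culture that uses a decimal comma, would yield invalid JSON. A minimal sketch of a runtime helper that generated code could call is shown below. JsonText, Escape and Number are hypothetical names, not part of the original sample, and special values such as NaN are not handled.

```csharp
using System.Globalization;
using System.Text;

namespace SourceGeneratorExample
{
    // Hypothetical runtime helper that a generated Serialize() method could call.
    public static class JsonText
    {
        // Escapes quotes, backslashes and control characters for use inside a JSON string.
        public static string Escape(string? value)
        {
            if (value == null) return string.Empty;

            var sb = new StringBuilder(value.Length);
            foreach (char c in value)
            {
                switch (c)
                {
                    case '"': sb.Append("\\\""); break;
                    case '\\': sb.Append("\\\\"); break;
                    case '\n': sb.Append("\\n"); break;
                    case '\r': sb.Append("\\r"); break;
                    case '\t': sb.Append("\\t"); break;
                    default:
                        // Remaining control characters become \uXXXX escapes.
                        if (c < ' ') sb.Append("\\u").Append(((int)c).ToString("x4"));
                        else sb.Append(c);
                        break;
                }
            }
            return sb.ToString();
        }

        // Formats a number with '.' as the decimal separator regardless of the current culture.
        public static string Number(double value) => value.ToString("R", CultureInfo.InvariantCulture);
    }
}
```

The generator would then emit {JsonText.Escape(Name)} and {JsonText.Number(Price)} inside the interpolated string instead of {Name} and {Price}.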

Here are several insights we can glean from the aforementioned example.

Syntax Trees and Semantic Models

For source generator programmers, mastering Syntax Trees and Semantic Models is fundamental. These tools provide the necessary insights to analyze, understand, and generate code effectively. Let’s explore the basics of how to utilize them:

A Syntax Tree represents the source code’s hierarchical structure. Every construct in the code, from classes to individual statements, becomes a node in this tree.

In the Source Generator context, the compiler provides the Syntax Trees. When the generator’s Execute method is invoked, the GeneratorExecutionContext contains the Compilation object, which holds all the Syntax Trees.

foreach (var syntaxTree in context.Compilation.SyntaxTrees)
{
    // Process each tree
}

Navigate the tree to find specific code constructs using methods like DescendantNodes().

var classDeclarations = syntaxTree.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>();

While Syntax Trees detail the code’s structure, Semantic Models dive into its meaning.

With a Syntax Tree at hand, you can get its corresponding Semantic Model via the Compilation object.

SemanticModel model = context.Compilation.GetSemanticModel(syntaxTree);

Symbols represent code elements. Use the Semantic Model to retrieve them and understand their context.

var classSymbol = model.GetDeclaredSymbol(classDeclaration);

For type safety and accurate code generation, fetch type information for expressions.

var typeInfo = model.GetTypeInfo(expression);

Syntax Trees and Semantic Models are like the eyes and brain of Source Generators. While the former lets the tool see and dissect the code, the latter empowers it to understand and interpret it.
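These calls can be tried outside a generator, which makes them easier to explore. The sketch below assumes a regular console or test project that references the Microsoft.CodeAnalysis.CSharp package; RoslynExplorer and DescribeProperties are hypothetical names. It parses a source string, builds a compilation, and lists the properties of each class through the semantic model.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class RoslynExplorer
{
    // Returns "ClassName.PropertyName: Type" for every property of every class in the source text.
    public static List<string> DescribeProperties(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        // A compilation is needed to obtain a semantic model. The core library is
        // referenced so that built-in types such as int and string resolve.
        CSharpCompilation compilation = CSharpCompilation.Create(
            "Exploration",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        SemanticModel model = compilation.GetSemanticModel(tree);
        var result = new List<string>();

        foreach (var classDecl in tree.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>())
        {
            // Syntax node -> symbol: the step that moves from structure to meaning.
            INamedTypeSymbol classSymbol = model.GetDeclaredSymbol(classDecl);
            if (classSymbol == null) continue;

            foreach (var property in classSymbol.GetMembers().OfType<IPropertySymbol>())
            {
                result.Add($"{classSymbol.Name}.{property.Name}: {property.Type.ToDisplayString()}");
            }
        }

        return result;
    }
}
```

Calling RoslynExplorer.DescribeProperties with the text of the Product class yields one entry per property, which is the same information the serialization generator relies on.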

Assemble the raw string of source code

Assembling raw strings of source code in source generators can be challenging. The process requires careful attention to syntax, structure, and the intricacies of the language.

Building source code using string concatenation, especially with the complexity of string interpolation, can indeed be cumbersome. This method is error-prone and can lead to hard-to-find issues in the generated code.

Special characters must be properly escaped, especially when dealing with strings within strings or special characters in interpolated strings. As the generated code becomes more complex, it becomes harder to read and understand the source generator code itself. Making changes or debugging issues in the generated code can be difficult, especially without clear separation or modularization.

A cleaner approach would be to utilize the StringBuilder more effectively and to refactor the code to make it more readable.
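One way to read that suggestion, as a sketch only: isolate the quoting rules in a small helper and join the members with string.Join, so the last-element special case disappears. SerializeTemplate, Member and Build are hypothetical names, and the properties are passed as plain name/is-string pairs so the snippet runs without Roslyn.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class SerializeTemplate
{
    // Builds the text of one JSON member as it must appear inside the generated
    // verbatim interpolated string, e.g. ""Name"":""{Name}"" or ""Id"":{Id}.
    private static string Member(string name, bool isString)
    {
        const string Q = "\"\"";          // one escaped quote inside a verbatim string
        string hole = "{" + name + "}";   // interpolation hole in the generated code
        return Q + name + Q + ":" + (isString ? Q + hole + Q : hole);
    }

    // Key = property name, Value = true when the property is a string.
    public static string Build(string ns, string className, IEnumerable<KeyValuePair<string, bool>> properties)
    {
        // string.Join places the commas, so no "is this the last property" branch is needed.
        string members = string.Join(",", properties.Select(p => Member(p.Key, p.Value)));

        var sb = new StringBuilder();
        sb.AppendLine("namespace " + ns);
        sb.AppendLine("{");
        sb.AppendLine("    public partial class " + className);
        sb.AppendLine("    {");
        sb.AppendLine("        public string Serialize()");
        sb.AppendLine("        {");
        sb.AppendLine("            return $@\"{{" + members + "}}\";");
        sb.AppendLine("        }");
        sb.AppendLine("    }");
        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

Inside the generator, the pairs would come from classSymbol.GetMembers().OfType<IPropertySymbol>(), with the flag computed from property.Type.SpecialType == SpecialType.System_String.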

Instead of generating raw strings, consider using Roslyn’s Syntax API to build up syntax trees programmatically. This approach can be more robust and less error-prone than raw string manipulation, but it comes with a steeper learning curve. With the Syntax API, you’re essentially constructing code using a fluent API, which ensures that the generated code is syntactically correct.

using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

private string GenerateSerializationCode(INamedTypeSymbol classSymbol)
{
    // Create the namespace and put the generated partial class inside it
    var namespaceDeclaration = SyntaxFactory.NamespaceDeclaration(SyntaxFactory.ParseName(classSymbol.ContainingNamespace.ToDisplayString()))
        .AddMembers(CreateClass(classSymbol));

    // Wrap it in a compilation unit and let NormalizeWhitespace insert spacing and
    // indentation; unlike Formatter/AdhocWorkspace, this needs no Workspaces package
    var compilationUnit = SyntaxFactory.CompilationUnit()
        .AddMembers(namespaceDeclaration)
        .NormalizeWhitespace();

    return compilationUnit.ToFullString();
}

private ClassDeclarationSyntax CreateClass(INamedTypeSymbol classSymbol)
{
    // Create the Serialize method
    var serializeMethod = SyntaxFactory.MethodDeclaration(SyntaxFactory.PredefinedType(SyntaxFactory.Token(SyntaxKind.StringKeyword)), "Serialize")
        .AddModifiers(SyntaxFactory.Token(SyntaxKind.PublicKeyword))
        .WithBody(SyntaxFactory.Block(
            SyntaxFactory.SingletonList<StatementSyntax>(
                SyntaxFactory.ReturnStatement(CreateSerializationExpression(classSymbol))
            )
        ));

    // Create the class declaration
    var classDeclaration = SyntaxFactory.ClassDeclaration(classSymbol.Name)
        .AddModifiers(SyntaxFactory.Token(SyntaxKind.PublicKeyword), SyntaxFactory.Token(SyntaxKind.PartialKeyword))
        .AddMembers(serializeMethod);

    return classDeclaration;
}        

private ExpressionSyntax CreateSerializationExpression(INamedTypeSymbol classSymbol)
{
    StringBuilder sb = new StringBuilder();
    sb.Append(@"$@""{{");
    var properties = classSymbol.GetMembers().OfType<IPropertySymbol>().ToList();
    for (int i = 0; i < properties.Count; i++)
    {
        var property = properties[i];
        if (i != properties.Count - 1)
        {
            if (property.Type.SpecialType == SpecialType.System_String)
            {
                sb.Append($@"""""{property.Name}"""":""""{{{property.Name}}}"""",");
            }
            else
            {
                sb.Append($@"""""{property.Name}"""":{{{property.Name}}},");
            }
        }
        else
        {
            if (property.Type.SpecialType == SpecialType.System_String)
            {
                sb.Append($@"""""{property.Name}"""":""""{{{property.Name}}}""""");
            }
            else
            {
                sb.Append($@"""""{property.Name}"""":{{{property.Name}}}");
            }
        }
    }
    // Close the interpolated string only; ReturnStatement adds the trailing semicolon
    sb.Append(@"}}""");

    return SyntaxFactory.ParseExpression(sb.ToString());
}

If you’re unfamiliar with the Roslyn API, you might resort to assembling raw strings. C# 11 introduced raw string literals to simplify this, but a generator project targeting netstandard2.0 defaults to C# 7.3. Because raw string literals are a compiler feature rather than a runtime one, you can still use them by setting LangVersion in the generator’s project file, provided your compiler supports C# 11.
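As an illustration of what raw string literals buy you, here is a minimal sketch that assumes the generator project sets LangVersion to 11 or later and is built with a compiler that supports C# 11. RawTemplate and Build are hypothetical names, and the template is deliberately small.

```csharp
public static class RawTemplate
{
    // Hypothetical helper: returns an empty partial class inside the given namespace.
    public static string Build(string ns, string className)
    {
        // With the $$ prefix, {{...}} is an interpolation hole and single braces are
        // literal text, so the braces of the generated code need no escaping.
        return $$"""
            namespace {{ns}}
            {
                public partial class {{className}}
                {
                }
            }
            """;
    }
}
```

The indentation of the closing quotes defines how much leading whitespace is stripped from every line, which keeps the template aligned with the surrounding code without leaking that indentation into the output.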

Debugging

Debugging the generated source code is naturally supported, but how do you debug the source generator itself?

Debugging the source generator itself is a bit more involved than debugging typical code because the source generator runs during the compilation process. Here’s how you can debug a source generator:

In your source generator code, insert the following line at the point where you want to start debugging:

System.Diagnostics.Debugger.Launch();

For example:

using Microsoft.CodeAnalysis.Text;
using Microsoft.CodeAnalysis;

namespace MySourceGenerators
{
    [Generator]
    public class SimpleSourceGenerator : ISourceGenerator
    {
        public void Initialize(GeneratorInitializationContext context) { }

        public void Execute(GeneratorExecutionContext context)
        {
            System.Diagnostics.Debugger.Launch();
            var sourceCode = @"
        namespace generated
        {
            public static class HelloWorld
            {
                public static void SayHello() => System.Console.WriteLine(""Hello from generated code!"");
            }
        }";

            context.AddSource("HelloWorldGenerated.g.cs", SourceText.From(sourceCode, System.Text.Encoding.UTF8));
        }
    }

}

When the code containing the source generator is compiled (e.g., when you build the project that uses the source generator), the Debugger.Launch() line will trigger a debugger selection dialog.

You can then choose an instance of Visual Studio to begin debugging.

Alternatively, if you do not want to use System.Diagnostics.Debugger.Launch(); you can manually attach the debugger:

  • Build and run your project normally without the debugger.
  • In Visual Studio, go to Debug -> Attach to Process.
  • Find the VBCSCompiler.exe process in the list and attach to it.
  • Set breakpoints in your source generator code.
  • Open the same solution in another Visual Studio instance. (This is required because you cannot build the project in the Visual Studio instance that is debugging.)
  • Trigger the code path that uses your source generator by building the project in the second Visual Studio instance.
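A further option, offered as a sketch under assumptions rather than as part of the workflow above, is to run the generator in-process from a small console or test project. That project would reference the Microsoft.CodeAnalysis.CSharp package and the generator assembly as an ordinary reference, so breakpoints in Execute are hit with a normal debug session. DemoGenerator and GeneratorHarness are hypothetical names; DemoGenerator merely stands in for your real generator.

```csharp
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Text;

// Stand-in generator so the sketch is self-contained; in practice, pass an
// instance of your real generator (for example SimpleSourceGenerator) instead.
public class DemoGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context) { }

    public void Execute(GeneratorExecutionContext context)
    {
        context.AddSource("Demo.g.cs",
            SourceText.From("namespace generated { public static class Demo { } }", Encoding.UTF8));
    }
}

public static class GeneratorHarness
{
    // Runs the generator against the given input source and returns the generated source texts.
    public static string[] Run(ISourceGenerator generator, string inputSource)
    {
        var compilation = CSharpCompilation.Create(
            "HarnessInput",
            new[] { CSharpSyntaxTree.ParseText(inputSource) },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // The driver executes the generator in this process, so a debugger attached
        // to the harness can step straight into Execute.
        GeneratorDriver driver = CSharpGeneratorDriver.Create(generator);
        driver = driver.RunGeneratorsAndUpdateCompilation(compilation, out _, out _);

        return driver.GetRunResult().GeneratedTrees.Select(t => t.ToString()).ToArray();
    }
}
```

Calling GeneratorHarness.Run(new DemoGenerator(), "class C { }") returns the generated text, which also makes the harness a convenient starting point for unit tests.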

Final thought

The serialization example given in this article serves merely as a demonstration of how to craft a source code generator. Numerous real-world scenarios could benefit from using a source code generator, such as:

  • ORM Mappings: Automatically generating data access layer code based on database schemas, streamlining the process of connecting application models to database tables.
  • API Clients: Given an OpenAPI or Swagger definition, generate client libraries in C# to interact with those APIs seamlessly.
  • Data Transfer Objects (DTOs): Create DTOs automatically based on domain models, ensuring that data sent between layers or systems is in the correct format.
  • Boilerplate Code Elimination: Any repetitive, standard code patterns, such as the implementation of certain design patterns (e.g., Singleton, Factory), can be auto-generated to reduce manual errors and ensure consistency.
  • Proxy Generation: For microservices or distributed systems, generate proxies that allow services to communicate with each other transparently.
  • Builder Pattern Implementation: For classes with many optional parameters or configurations, automatically generate builder classes to simplify object instantiation.
  • Stub and Mock Generation: For unit testing, generate stubs or mocks of interfaces or classes to simulate various behaviors and scenarios without implementing the actual business logic.
  • UI Code Generation: Based on metadata or configurations, automatically generate UI components or forms for data entry and display.
  • Attribute-driven Code: For classes or methods adorned with specific custom attributes, generate code to handle cross-cutting concerns like logging, caching, or validation.
  • Localization and Internationalization: Generate localized resource files or classes based on a master language file, streamlining the process of supporting multiple languages in an application.
  • Configuration Class Generation: From configuration files (e.g., YAML, JSON), generate strongly-typed classes to access configuration settings in a type-safe manner.

Interestingly, starting with .NET 6, the System.Text.Json library has incorporated support for source code generation. You can readily integrate this feature into your code.
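For comparison with the hand-written generator above, here is a minimal sketch of the built-in feature, assuming a project that targets .NET 6 or later. ProductDto, AppJsonContext and ProductJson are hypothetical names chosen for this example.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

namespace SourceGeneratorExample
{
    // Plain model; no custom attribute and no partial modifier is needed on it.
    public class ProductDto
    {
        public int Id { get; set; }
        public string? Name { get; set; }
        public double Price { get; set; }
    }

    // The System.Text.Json source generator fills in this partial context at compile time.
    [JsonSerializable(typeof(ProductDto))]
    internal partial class AppJsonContext : JsonSerializerContext
    {
    }

    public static class ProductJson
    {
        // Serializes through the generated metadata instead of runtime reflection.
        public static string ToJson(ProductDto product) =>
            JsonSerializer.Serialize(product, AppJsonContext.Default.ProductDto);
    }
}
```

Unlike the Serialize method generated earlier in this article, this path also takes care of string escaping and culture-independent number formatting.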

How to use source generation in System.Text.Json - .NET

GitHub - devedium/SourceGeneratorExample: C# Source Generators Example
