Distributed Tracing of .NET Core Applications on Kubernetes with the OpenTracing API

Distributed tracing is a great way to find and monitor where performance problems occur in applications that we have designed as a microservice architecture.

In other words, it helps us answer questions such as which request goes where and how much time a request spends end-to-end.

In this article, we will talk about distributed tracing of microservices that we develop with .NET Core and run on Kubernetes, using the OpenTracing API and the Jaeger tracer.

Scenario

Let’s assume we are working on an e-commerce system and we host our applications on Kubernetes. We have a User API that is responsible for user-related actions. When a new user registers in the system, an event called “UserRegisteredEvent” is published. One of the subscribers of this event is a service that sends an activation e-mail so that the user can activate their account.

We will trace this asynchronous user registration journey end-to-end on Kubernetes using the OpenTracing API and the Jaeger tracer.

What is OpenTracing and Jaeger?

Let me briefly introduce OpenTracing and Jaeger. OpenTracing is a specification that allows us to add instrumentation to our applications without depending on any vendor, just like OpenAPI.

Jaeger is a useful OpenTracing compatible tracer developed by Uber Technologies. It allows us to perform distributed tracing operations on our microservice architecture. You can find more detailed information about Jaeger here.

You can follow this documentation to install Jaeger on Kubernetes. I followed the development setup for this article.

kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-kubernetes/master/all-in-one/jaeger-all-in-one-template.yml

Jaeger is composed of three components: “Agent”, “Collector” and “Query”. The agent is a network daemon that listens for trace data over UDP and passes it to the collector. A unit of trace data is called a “Span”. The collector runs the trace data it receives through a pipeline (validation, indexing, transformation) and then stores it in the selected storage backend (Elasticsearch, Cassandra or Kafka).

Query, as its name suggests, provides a UI where we can query the relevant trace results.

Okay, Let’s Get Coding!

Before we start coding, we need to have some tools for the platform:

  • Message Broker (I will use RabbitMQ)
  • Docker
  • and Kubernetes

NOTE: Since setting up the platform is not our main topic, I will not cover the installation steps.

First, let’s develop the User API that will be responsible for registering users in the system. To do that, let’s create a project as below.

dotnet new webapi -n User.API

Then, create a class library called “User.Common.Contracts“.

dotnet new classlib -n User.Common.Contracts

In this library, we will define contracts that will be shared between our applications. Now let’s create an event that will be published when a new user is registered in the system.

using System.Collections.Generic;

namespace User.Common.Contracts
{
    public class UserRegisteredEvent
    {
        public string Email { get; set; }
        public Dictionary<string, string> TracingKeys { get; set; }
    }
}

After creating the event, let’s add the “User.Common.Contracts” library as a reference to the “User.API” project.

dotnet add reference ../User.Common.Contracts/User.Common.Contracts.csproj

Now let’s create a new folder called “Models” in the “User.API” project, and then also create “Requests” and “Responses” folders under the “Models” folder.

In the “Requests” folder, let’s define the model that will be used when a user registers in the system, as below.

namespace User.API.Models.Requests
{
    public class CreateUserRequest
    {
        public string Username { get; set; }
        public string Password { get; set; }
        public string Email { get; set; }
    }
}

Also in the “Responses” folder, we need to define an internal response wrapper class.

using System.Collections.Generic;
using System.Linq;

namespace User.API.Models.Responses
{
    public class BaseResponse<T>
    {
        public BaseResponse()
        {
            Errors = new List<string>();
        }

        public T Data { get; set; }
        public List<string> Errors { get; set; }
        public bool HasError { get { return Errors.Any(); } }
    }
}

Now we have to create a new folder called “Services” in the “User.API” project. After that, let’s define an interface called “IUserService” in this folder.

using System.Threading.Tasks;
using User.API.Models.Requests;
using User.API.Models.Responses;

namespace User.API.Services
{
    public interface IUserService
    {
        Task<BaseResponse<int>> CreateUserAsync(CreateUserRequest request);
    }
}

We will perform user-related business logic through this service.

At this point, we will add a service bus to our project via NuGet to perform the messaging operations in a reliable way. I will use the MetroBus library, which is a lightweight wrapper around the MassTransit library.

dotnet add package MetroBus
dotnet add package MetroBus.Microsoft.Extensions.DependencyInjection

Then we need to add OpenTracing and Jaeger packages to the project to be able to perform distributed tracing operations.

dotnet add package OpenTracing.Contrib.NetCore
dotnet add package Jaeger

Now we need another new folder. I love structured folder style. Anyway, let’s create a folder called “Implementations” under the “Services” folder and implement the “IUserService” interface as follows.

using System;
using System.Threading.Tasks;
using User.API.Models.Requests;
using User.API.Models.Responses;
using MassTransit;
using Microsoft.Extensions.Logging;
using User.Common.Contracts;
using OpenTracing;
using OpenTracing.Tag;
using OpenTracing.Propagation;
using System.Collections.Generic;

namespace User.API.Services.Implementations
{
    public class UserService : IUserService
    {
        private readonly ILogger<UserService> _logger;
        private readonly IBusControl _busControl;
        private readonly ITracer _tracer;

        public UserService(ILogger<UserService> logger, IBusControl busControl, ITracer tracer)
        {
            _logger = logger;
            _busControl = busControl;
            _tracer = tracer;
        }

        public async Task<BaseResponse<int>> CreateUserAsync(CreateUserRequest request)
        {
            BaseResponse<int> createUserResponse = new BaseResponse<int>();

            try
            {
                using (var scope = _tracer.BuildSpan("create-user-async").StartActive(finishSpanOnDispose: true))
                {
                    var span = scope.Span.SetTag(Tags.SpanKind, Tags.SpanKindClient);

                    var dictionary = new Dictionary<string, string>();
                    _tracer.Inject(span.Context, BuiltinFormats.TextMap, new TextMapInjectAdapter(dictionary));

                    //some user create business logics

                    createUserResponse.Data = 1; // User id

                    await _busControl.Publish(new UserRegisteredEvent
                    {
                        Email = request.Email,
                        TracingKeys = dictionary
                    });
                }
            }
            catch (Exception ex)
            {
                createUserResponse.Errors.Add(ex.Message);
                _logger.LogError(ex, ex.Message);
            }

            return createUserResponse;
        }
    }
}

Well, let’s take a look at what we did in the service class.

For api-to-api communication over HTTP, the trace context is already propagated to other services automatically. So, once you make the necessary configuration (the OpenTracing instrumentation together with the Jaeger tracer), you can trace a request end to end.

At this point, since our sample project uses event-based communication (api-to-subscriber), we have handled the trace context propagation manually. If we look at the trace scope, we can see that we created a span called “create-user-async”. After that, we added a client tag; tags let us attach additional metadata to a span. Then we injected the related trace context into the dictionary by using the “TextMapInjectAdapter”.

After completing the user registration process, we published the “UserRegisteredEvent” together with the tracing keys to the queue via the service bus. From this point on, whoever consumes this event, we will be able to trace the whole request flow as long as the tracing keys are used.
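To make the “tracing keys” a bit more concrete: with the Jaeger tracer, the injected dictionary is expected to contain an entry named “uber-trace-id” whose value has the form {trace-id}:{span-id}:{parent-span-id}:{flags}. The following is only a minimal sketch for inspecting that value while debugging. “TracingKeysInspector” is a hypothetical helper that is not part of the Jaeger client, and it assumes the default header name and a value that is not URL-encoded.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical debugging helper (not part of the Jaeger client).
// It only illustrates the "{trace-id}:{span-id}:{parent-span-id}:{flags}" layout.
public static class TracingKeysInspector
{
    // Assumption: the tracer uses Jaeger's default header name.
    public const string JaegerHeader = "uber-trace-id";

    public static (string TraceId, string SpanId, string ParentSpanId, string Flags)? Parse(
        IDictionary<string, string> tracingKeys)
    {
        if (tracingKeys == null
            || !tracingKeys.TryGetValue(JaegerHeader, out var raw)
            || raw == null)
        {
            return null; // no Jaeger trace context in this dictionary
        }

        var parts = raw.Split(':');
        if (parts.Length != 4)
        {
            return null; // not in the expected four-part form
        }

        return (parts[0], parts[1], parts[2], parts[3]);
    }
}
```

For example, a value such as “3f2a:1b:0:1” would be split into the trace id “3f2a”, the span id “1b”, the parent span id “0” and the flags “1”. The consumer never needs to parse this value itself, because the tracer's Extract call does that; the helper is only useful for logging what actually travels inside the event.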

Now, we can create a controller. Let’s create a controller called “UsersController” as follows.

using Microsoft.AspNetCore.Mvc;
using User.API.Services;
using User.API.Models.Requests;
using User.API.Models.Responses;
using System.Threading.Tasks;

namespace User.API.Controllers
{
    [Route("api/users")]
    [ApiController]
    public class UsersController : ControllerBase
    {
        private readonly IUserService _userService;

        public UsersController(IUserService userService)
        {
            _userService = userService;
        }

        [HttpPost]
        public async Task<IActionResult> Post([FromBody]CreateUserRequest request)
        {
            BaseResponse<int> createUserResponse = await _userService.CreateUserAsync(request);

            if(!createUserResponse.HasError)
            {
                return Created("users", createUserResponse.Data);
            }
            else
            {
                return BadRequest(createUserResponse.Errors);
            }
        }
    }
}

In the controller, we perform the user registration through the “IUserService” interface.

Now, let’s open the “Startup” class and register the services as follows.

public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc().SetCompatibilityVersion(CompatibilityVersion.Version_2_2);

    services.AddScoped<IUserService, UserService>();

    string rabbitMqUri = Configuration.GetValue<string>("RabbitMqUri");
    string rabbitMqUserName = Configuration.GetValue<string>("RabbitMqUserName");
    string rabbitMqPassword = Configuration.GetValue<string>("RabbitMqPassword");

    services.AddSingleton<IBusControl>(MetroBusInitializer.Instance.UseRabbitMq(rabbitMqUri, rabbitMqUserName, rabbitMqPassword).Build());

    services.AddOpenTracing();
    services.AddSingleton<ITracer>(serviceProvider =>
    {
        Environment.SetEnvironmentVariable("JAEGER_SERVICE_NAME", "User.API");
        Environment.SetEnvironmentVariable("JAEGER_AGENT_HOST", "localhost");
        Environment.SetEnvironmentVariable("JAEGER_AGENT_PORT", "6831");
        Environment.SetEnvironmentVariable("JAEGER_SAMPLER_TYPE", "const");
        
        var loggerFactory = new LoggerFactory();

        var config = Jaeger.Configuration.FromEnv(loggerFactory);
        var tracer = config.GetTracer();

        GlobalTracer.Register(tracer);

        return tracer;
    });

    services.AddHealthChecks();
}

public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
    //...
    
    app.UseHealthChecks("/health");
}

We initialized the service bus using RabbitMQ and registered it. Then we configured the tracer and registered it as well. I used the “const” sampler while configuring the tracer. There are also other sampling options such as “Probabilistic”, “Rate Limiting” and “Remote”. More detailed sampling information is available here. You can replace the agent host with the node IP in your Kubernetes environment. If you run the agent as a sidecar, you do not need to set anything; the tracer will reach the agent with the default settings.
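As a rough sketch of how the sampler and the agent host could be made configurable instead of hard-coded: the “JAEGER_*” variable names are the ones already used in this article plus “JAEGER_SAMPLER_PARAM”, while “JaegerEnvironment” and the “NODE_IP” variable are hypothetical names chosen for this illustration. “NODE_IP” is assumed to be supplied by your deployment, for example from the pod's host IP.

```csharp
using System;

// Hypothetical helper: prepares the JAEGER_* variables before
// Jaeger.Configuration.FromEnv(...) is called.
public static class JaegerEnvironment
{
    // samplerType / samplerParam examples:
    //   "const"         with "1" (sample everything) or "0" (sample nothing)
    //   "probabilistic" with a probability between 0 and 1, e.g. "0.1"
    //   "ratelimiting"  with the maximum number of traces per second, e.g. "2"
    public static void Apply(string serviceName, string samplerType, string samplerParam)
    {
        Environment.SetEnvironmentVariable("JAEGER_SERVICE_NAME", serviceName);
        Environment.SetEnvironmentVariable("JAEGER_SAMPLER_TYPE", samplerType);
        Environment.SetEnvironmentVariable("JAEGER_SAMPLER_PARAM", samplerParam);

        // Assumption: NODE_IP is injected by the deployment. When it is missing
        // (local run or sidecar agent), fall back to localhost.
        string nodeIp = Environment.GetEnvironmentVariable("NODE_IP");
        Environment.SetEnvironmentVariable(
            "JAEGER_AGENT_HOST",
            string.IsNullOrEmpty(nodeIp) ? "localhost" : nodeIp);
        Environment.SetEnvironmentVariable("JAEGER_AGENT_PORT", "6831");
    }
}
```

Calling JaegerEnvironment.Apply("User.API", "probabilistic", "0.1") right before the tracer is created would, under these assumptions, sample roughly 10% of the traces instead of all of them.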

Now the API is ready. Let’s go back to our scenario. When a user registers in the system, the API publishes an event. A service that is subscribed to this event then sends an activation e-mail so that the user can activate their account.

Now that the event is being published, we can start developing the service that sends the activation e-mail to users.

Developing the Subscriber

To do that, let’s create a new .NET Core console application.

dotnet new console -n User.Activation.Consumer

After creating it, let’s add the “User.Common.Contracts” library as a reference. Then we need to add MetroBus, OpenTracing and Jaeger to the project via NuGet.

dotnet add package MetroBus
dotnet add package MetroBus.Microsoft.Extensions.DependencyInjection
dotnet add package OpenTracing.Contrib.NetCore 
dotnet add package Jaeger

The console application will be a background service that will work as a daemon. Let’s include “Microsoft.Extensions.Hosting” and “Microsoft.Extensions.DependencyInjection” packages via NuGet to configure the app startup and lifetime management.

dotnet add package Microsoft.Extensions.Hosting
dotnet add package Microsoft.Extensions.DependencyInjection

Also, we need to include the “Microsoft.Extensions.Configuration” and “Microsoft.Extensions.Configuration.Json” packages in order to perform configuration management.

dotnet add package Microsoft.Extensions.Configuration
dotnet add package Microsoft.Extensions.Configuration.Json

First, let’s create a folder called “Common” and then a class called “TracingExtension” in this folder.

using System;
using System.Collections.Generic;
using OpenTracing;
using OpenTracing.Propagation;
using OpenTracing.Tag;

namespace User.Activation.Consumer.Common
{
    public static class TracingExtension
    {
        public static IScope StartServerSpan(ITracer tracer, IDictionary<string, string> headers, string operationName)
        {
            ISpanBuilder spanBuilder;
            try
            {
                ISpanContext parentSpanCtx = tracer.Extract(BuiltinFormats.TextMap, new TextMapExtractAdapter(headers));

                spanBuilder = tracer.BuildSpan(operationName);
                if (parentSpanCtx != null)
                {
                    spanBuilder = spanBuilder.AsChildOf(parentSpanCtx);
                }
            }
            catch (Exception)
            {
                spanBuilder = tracer.BuildSpan(operationName);
            }

            return spanBuilder.WithTag(Tags.SpanKind, Tags.SpanKindConsumer).StartActive(true);
        }
    }
}

If we recall the API side, we propagated the trace context manually by publishing the tracing keys in the event. Now that we want to create a span in the consumer, we use the “TracingExtension” class to extract the trace context from the tracing keys and start the new span as a child of it.

Next, let’s create another folder called “Consumers” in the root folder, then define a class named “UserActivationConsumer” in this folder.

-> User.Activation.Consumer.Common
--> Common
--> Consumers

using System.Threading.Tasks;
using MassTransit;
using OpenTracing;
using User.Activation.Consumer.Common;
using User.Common.Contracts;

namespace User.Activation.Consumer.Consumers
{
    public class UserActivationConsumer : IConsumer<UserRegisteredEvent>
    {
        private readonly ITracer _tracer;

        public UserActivationConsumer(ITracer tracer)
        {
            _tracer = tracer;
        }

        public async Task Consume(ConsumeContext<UserRegisteredEvent> context)
        {
            using (var scope = TracingExtension.StartServerSpan(_tracer, context.Message.TracingKeys, "user-activation-link-sender-consumer"))
            {
                //some user activation link send business logics

                await System.Console.Out.WriteLineAsync($"Activation link sent for {context.Message.Email}");
            }
        }
    }
}

At this point, we subscribe to the “UserRegisteredEvent” that we published in the API by implementing “IConsumer<UserRegisteredEvent>”. We also inject the “ITracer” interface through the constructor.

In the “Consume” method, we create a scope by using the “TracingExtension” class that we wrote to handle the trace context propagation. With the “user-activation-link-sender-consumer” span, which carries the propagated trace context, we are now able to trace our operations from the API to the subscriber.
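Tags are not limited to the span kind. As a small sketch, a failure inside a traced block could be recorded on the active span so that it stands out in the Jaeger UI. “SpanErrorExtensions” and “MarkFailed” are hypothetical names for this illustration; the sketch assumes the OpenTracing package that the projects in this article already reference.

```csharp
using System;
using System.Collections.Generic;
using OpenTracing;
using OpenTracing.Tag;

// Hypothetical extension method, shown only as an illustration.
public static class SpanErrorExtensions
{
    public static void MarkFailed(this ISpan span, Exception ex)
    {
        // Standard "error" tag, so the span is flagged as failed.
        Tags.Error.Set(span, true);

        // A structured log entry attached to the span with the failure details.
        span.Log(new Dictionary<string, object>
        {
            ["event"] = "error",
            ["error.kind"] = ex.GetType().Name,
            ["message"] = ex.Message
        });
    }
}
```

Inside the using block of the consumer, the business logic could then be wrapped in a try/catch that calls scope.Span.MarkFailed(ex) before rethrowing. The call has to happen while the scope is still open, because the span is finished when the scope is disposed.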

Since this service will run as a background service, let’s return to the root directory and create the “Services/Implementations” folders. Then, under the “Implementations” folder, we need to create a class called “BusService” and implement it as below.

using System.Threading;
using System.Threading.Tasks;
using MassTransit;
using Microsoft.Extensions.Hosting;

namespace User.Activation.Consumer.Services.Implementations
{
    public class BusService : IHostedService
    {
        private readonly IBusControl _busControl;

        public BusService(IBusControl busControl)
        {
            _busControl = busControl;
        }

        public Task StartAsync(CancellationToken cancellationToken)
        {
            return _busControl.StartAsync(cancellationToken);
        }

        public Task StopAsync(CancellationToken cancellationToken)
        {
            return _busControl.StopAsync(cancellationToken);
        }
    }
}

In the “BusService” class, we just implemented the start and stop methods.

Let’s edit the “Program” class as follows.

using System;
using System.IO;
using System.Threading.Tasks;
using Jaeger;
using Jaeger.Samplers;
using MassTransit;
using MetroBus;
using MetroBus.Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using OpenTracing;
using OpenTracing.Util;
using User.Activation.Consumer.Consumers;
using User.Activation.Consumer.Services.Implementations;

namespace User.Activation.Consumer
{
    class Program
    {
        static async Task Main(string[] args)
        {
            var host = new HostBuilder()
                .ConfigureAppConfiguration((hostingContext, config) =>
                {
                    config.SetBasePath(basePath: Directory.GetCurrentDirectory());
                    config.AddJsonFile("appsettings.json", optional : true);
                })
                .ConfigureServices((hostContext, services) =>
                {
                    //Init tracer
                    services.AddSingleton<ITracer>(t => InitTracer());

                    string rabbitMqUri = hostContext.Configuration.GetValue<string>("RabbitMqUri");
                    string rabbitMqUserName = hostContext.Configuration.GetValue<string>("RabbitMqUserName");
                    string rabbitMqPassword = hostContext.Configuration.GetValue<string>("RabbitMqPassword");

                    services.AddMetroBus(x =>
                    {
                        x.AddConsumer<UserActivationConsumer>();
                    });

                    services.AddSingleton<IBusControl>(provider => MetroBusInitializer.Instance
                        .UseRabbitMq(rabbitMqUri, rabbitMqUserName, rabbitMqPassword)
                        .RegisterConsumer<UserActivationConsumer>("user.activation.queue", provider)
                        .Build());

                    services.AddHostedService<BusService>();
                });

            await host.RunConsoleAsync();
        }

        private static ITracer InitTracer()
        {
            Environment.SetEnvironmentVariable("JAEGER_SERVICE_NAME", "User.Activation.Consumer");
            Environment.SetEnvironmentVariable("JAEGER_AGENT_HOST", "localhost");
            Environment.SetEnvironmentVariable("JAEGER_AGENT_PORT", "6831");
            Environment.SetEnvironmentVariable("JAEGER_SAMPLER_TYPE", "const");

            var loggerFactory = new LoggerFactory();

            var config = Jaeger.Configuration.FromEnv(loggerFactory);
            var tracer = config.GetTracer();

            GlobalTracer.Register(tracer);

            return tracer;
        }
    }
}

I think what we have done in the code block above is clear. We set up the configuration and dependency injection and register our services. We also initialize the tracer with the “const” sampler type and the service name “User.Activation.Consumer”.

We register the consumer on the queue named “user.activation.queue”. The consumer subscribes to the queue through the “UserRegisteredEvent” model.

Deployment

Well, we are ready to deploy now. I prepared a simple Dockerfile and Helm chart for the deployment. With this chart, I will deploy the applications to Azure Kubernetes Service. You can change the chart according to your own environment.

You can reach the chart and Dockerfile that I have prepared from here.

Test

Now we can move on to the test phase. First, let’s send a POST request to the “api/users” endpoint in order to create a new user in the system.
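Any HTTP client can be used for this request. As a minimal sketch in C#, the request could be built as follows. “CreateUserRequestFactory” is a hypothetical helper, the base address is an assumption that depends on where the API is exposed, and System.Text.Json assumes .NET Core 3.0 or later.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Hypothetical helper that builds the POST request for the "api/users" endpoint.
public static class CreateUserRequestFactory
{
    // baseAddress is an assumption, e.g. "http://localhost:5000".
    public static HttpRequestMessage Build(string baseAddress, string username, string password, string email)
    {
        // Property names mirror the CreateUserRequest model of the User API.
        string json = JsonSerializer.Serialize(new
        {
            Username = username,
            Password = password,
            Email = email
        });

        return new HttpRequestMessage(HttpMethod.Post, new Uri(new Uri(baseAddress), "api/users"))
        {
            Content = new StringContent(json, Encoding.UTF8, "application/json")
        };
    }
}
```

The request can then be sent with an HttpClient instance, for example await client.SendAsync(CreateUserRequestFactory.Build(...)). Based on the controller above, a successful registration should be answered with a “Created” response.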

With this request, we have started the user registration journey. As in our scenario, the event was published after the user registered.

After the event was published, the service (user-activation-consumer) that subscribed to it in order to send the activation e-mail to the user did its job.

So what happened in this process? Let’s have a look at Jaeger.

If we look at the flow in Jaeger, this process has a depth of 4 and contains 5 spans. The whole process took 29.54ms. If we look at the details of the POST operation, the request reaches the “create-user-async” method after passing through the related action, and the user registration is performed in this method. Then, in the “User.Activation.Consumer” service, the activation link is sent asynchronously.

Although this journey runs asynchronously, as we can see, we are able to answer questions such as where the process currently is and how much time is spent on each step.

Conclusion

As developers, we can use distributed tracing to debug and optimize our code or the life cycle of a request in a microservice architecture. With the OpenTracing API, our system can be traced flexibly with different tracers without falling into vendor lock-in. In this article, I also tried to show how trace information is propagated in a distributed system.

Projects: https://github.com/GokGokalp/OpenTracing-Jaeger-Sample

References

https://github.com/yurishkuro/opentracing-tutorial
https://www.jaegertracing.io/docs/1.11/architecture/

Gökhan Gökalp

Comments

  • Well done, nice article.
    We can also see the whole cycle in Application Insights.
    What is the advantage of this approach?

    • Thank you. Using Microsoft Azure's managed Application Insights, which is OpenTracing compatible, is also a great option. Each of them has its own capabilities. The choice depends entirely on you, on the platform you are using and on what you need (async support, open source or not, and so on). In my view, what matters is whether or not you can see the whole cycle.

  • Well done. Instead of putting the tracing keys in the payload, wouldn't it be a more general solution to put them in the headers of MassTransit's send context?

    • Thank you for your comment. That would definitely be better. I just wanted to show it explicitly. :)
