Tutorial: How To Implement Jaeger and OpenTracing As Tracing Middleware

Niki Manoledaki
When DevOps met Multi-Cloud


Are you new to distributed tracing, Jaeger, or OpenTracing, and would like to learn more? Are you wondering how to instrument your API with tracing to log useful metadata?

Tracing tools such as Jaeger and OpenTracing increase visibility across distributed applications during debugging, for example by measuring API response times and the performance of external services.

This tutorial is a guide on how to implement tracing as middleware for a Go API using OpenTracing and a Jaeger tracer. It goes over how to configure Jaeger, set it as the OpenTracing global tracer, add OpenTracing as middleware, create a parent span, and create child spans where there are calls to external services.

What is End-to-End Distributed Tracing?

Jaeger and OpenTracing are both open-source CNCF (Cloud Native Computing Foundation) projects that have gained popularity in the observability and cloud native spaces. Tracing enables us to monitor our system and make it more observable. End-to-end distributed tracing tools such as Jaeger are an important part of an observability stack because they make it easier to debug distributed programs, such as those that live in a Kubernetes cluster. For a thorough introduction to observability, I highly recommend this article by industry veteran Charity Majors.

With tracing, we can instrument an API to monitor its performance, log useful metadata, and monitor external service calls made by this API and each component in the distributed system. Any useful metadata logged by each trace can be viewed in the Jaeger UI in a centralised way. Other tools such as Grafana can also be used to intercept and visualise the traces emitted by Jaeger about our API.

Visualize end-to-end distributed tracing with Jaeger (source)

Traces are composed of spans. A span is the smallest unit of work that a component in a distributed system can contribute to a trace. For example, each time a request hits one of the API’s endpoints, the tracing middleware will start a “parent” span at the start of the request, log any useful metadata about the request, then end the span once the request has been handled. As we will see later in this tutorial, one or more “child” spans can be created as part of the parent span wherever a call to an external service is made in the process of handling this request. Child spans are typically a call to a database, a call to an authentication service, or a call to any other microservice in the distributed system that our API interacts with.

Retrieve useful metadata about your services from traces (source)

Part 1: Implement Jaeger

Step 1: Run the Jaeger all-in-one container locally

Before jumping into the API’s code, run the Jaeger all-in-one container with the necessary Jaeger components: the Jaeger UI, collector, query, and agent.

The agent listens for spans from the application that is being instrumented. The agent should be co-located with this application in the same host, which is why the Sidecar deployment strategy is often used to deploy Jaeger alongside a Deployment in Kubernetes. The agent also handles the connection to the collector.

The collector receives spans from one or more agents, places them in a queue for processing, and writes them to the storage backend. The all-in-one (AIO) container uses in-memory storage by default, but this can be swapped for a database if persistent storage is required.

The query service retrieves the traces from storage and serves them in the UI.

Step 2: Configure a Jaeger tracer

The next step is to configure and initialise the Jaeger tracer.

Create jaeger.go in the cmd directory at the root of your project. This file will contain the configuration for the Jaeger tracer.

There are many different strategies for configuring a Jaeger tracer, some of which are listed in the Jaeger documentation, here.

The following code prepares three kinds of Jaeger configurations based on three different use cases: deploying your API with Jaeger disabled, deploying Jaeger in your local development environment to test it, and deploying Jaeger in production.

Let’s break this down further below.

a) Jaeger configuration for when Jaeger is disabled

The configuration at lines 19 and 20 is for running the API with Jaeger disabled, which is useful if Jaeger is not present in the environment that the API is running in.

It might be necessary to disable Jaeger and run the API without tracing. It’s a good idea to allow for this, so that tracing stays an optional dependency that is decoupled from your API.

To hit this configuration, set the following environment variable in your environment.

export DISABLE_JAEGER=true

Make sure to add these environment variables to any Kubernetes Deployment manifests or Helm charts as well if necessary.

b) Jaeger configuration to test Jaeger locally

Lines 21 to 31 configure Jaeger for local testing. This configuration is useful, for example, to check that your API is working with Jaeger correctly in your development environment.

In order to hit this configuration, set this environment variable locally:

export DEV_ENV=true

The main difference between the production and testing configurations is the additional configuration for the Sampler (lines 24–27). The Sampler decides what fraction of the calls made to the API are recorded and displayed as traces. With the production configuration, the collector returns the default probabilistic sampling policy with a probability of 0.001 (0.1%) (docs).

A smaller sampling probability is useful when sampling from a large number of requests in production. However, it is not helpful when developing and testing Jaeger locally to check that it works correctly. If you hit the production configuration in your local development environment, odds are that nothing will be displayed on the Jaeger UI.

c) Jaeger configuration for production

Lines 32 to 34 prepare the configuration that is recommended for production by the Jaeger docs.

While this configuration appears to be an empty struct, it configures the tracer with the default values set by Jaeger, which are intended to be used in production and can be found here.

These values can be overridden by setting them in the configuration in a similar way to how the default Sampler value was overridden in the development environment configuration.

Create the Jaeger tracer

Lines 36 to 46 create the Jaeger tracer from one of the configuration options that has been prepared in the previous lines.

There are several functions that can be used to initialise a tracer, but cfg.InitGlobalTracer does precisely what we need. Here’s the code for what happens under the hood.

As you can see from the code, the function sets the Jaeger tracer as OpenTracing’s global tracer by calling opentracing.SetGlobalTracer(tracer). This makes the tracer globally accessible as a Singleton, which can be retrieved from anywhere in the API by calling opentracing.GlobalTracer(). This way we don't need to pass the tracer from one function to another and fill our code with more tracing code than necessary.

Step 3: Initialise Jaeger in main.go

Lastly, the tracer must be initialised as early as possible in your application, so a good place to do this is at the very top of main.go, before the server is initialised, as shown in the example code below.

Part 2: Implement OpenTracing

OpenTracing is a vendor-neutral interface that sits in front of the actual tracer implementation. It provides a collection of methods that you can call throughout your API, and they work with any tracer that supports OpenTracing.

If you were to swap out one tracer (Jaeger) for another (Zipkin), you would not have to make any changes to your code aside from the top-level configuration of the tracer.

Step 1: Create a parent (root) span

Let’s start implementing our tracing middleware with OpenTracing.

A middleware is a piece of code that performs a specific action whenever a request is sent to an endpoint in the API. Typical examples are actions that wrap the handling of a request, such as authentication, authorization, or in this case, tracing.

In the example code above in Step 3 of the Jaeger initialisation, lines 20 to 24 initialise an HTTP router with Gorilla Mux. At line 21, there is a function by Mux that is to be used for adding middleware to a router.

Let’s create a middleware function that creates the “parent” or “root” span. This span encompasses the entire request to our API and will calculate the time it took for our API to handle this request from start to finish. It will also tag any relevant metadata, such as which endpoint was hit, unique identifiers like user IDs, and the status code returned.

Read more about line 66 below, or jump to the next step.

A note about context.Context and opentracing.Context

We need a way to pass down the parent span’s context so that we can start a child span later on in the API. However, we also don’t want to be passing any OpenTracing-related variables from function to function. One solution that uses idiomatic Go is to use the context package.

Since we are creating a middleware, we can take the request’s context, r.Context(), and derive a new context from it once, so that the span is added to the request’s context.

r = r.WithContext(opentracing.ContextWithSpan(r.Context(), span))

This post from the Go blog is a great read about the context package and what it means to derive context in Go. In essence, we can’t just pass down the context as it is, because it would not contain the span. We first need to derive a new context that holds the span, then attach it to the request that is passed on to the handlers with r = r.WithContext()!

Also, bear in mind that OpenTracing's span context (opentracing.SpanContext) is not the same thing as Go's context.Context! In the code above, it might seem like we are adding the span context twice, but the two types of context are not the same.

Step 2: Add Tags and Logs

Tags are metadata that are constant throughout the entire span, such as the request URL, database statement, or whether this is a client or server span.

Logs are specific events that occur at specific moments in the span, such as errors. This tutorial explains tags and logs further. Let’s jump to the next step: how to create child spans where there are external service calls.

OpenTracing, being a vendor-neutral standardisation project, has some conventions regarding what to tag or log and how to do that here.

Step 3: Create one (or more) child span(s)

The helper functions for creating and finishing child spans should surround the external service call.

In the example code below, a child span is created where an endpoint makes a call to a database.

The helper function mw.StartChildSpan starts a child span with the function StartSpanFromContext (code for this function).

It creates a ChildSpan by making use of the context that has been passed from the request and also calls on the globally accessible tracer instance.

Lastly, the helper function mw.FinishChildSpan logs any useful metadata, such as the duration of the child span, then ends the span.

After sending a curl request to your API, open a browser and navigate to http://localhost:16686/ to see your service’s traces show up on the Jaeger UI.

The code for this tutorial can be found here.

Software engineer with a cloud native focus. Currently building backend services and maintaining eksctl @ WeaveWorks.