Tracker Case Study

This is the story of how Clearcode scoped and built the MVP for one of our internal projects — tracker.

Every AdTech and MarTech platform is different.

Demand-side platforms (DSPs), for example, help advertisers purchase inventory from publishers on an impression by impression basis via real-time bidding (RTB).

Customer data platforms (CDPs), on the other hand, collect first-party data from a range of sources, create single customer views (SCVs), and push audiences to other systems and tools.

Although the functionality and goal of AdTech and MarTech platforms vary, they all have one thing in common: they all need a component that collects and delivers data from different sources (e.g. websites) to different systems (e.g. DSPs).

This component is known as a tracker.

Key points:

  • The tracker is used to collect event data (e.g. impressions, clicks, and video metrics) from different sources.
  • We designed and built our own tracker system that can be used for future client projects.
  • It’s part of our AdTech Foundations — various components that can be used to help our clients save months of development time and tens of thousands of dollars when building AdTech and MarTech platforms.
  • We chose the programming languages and technology stacks by running benchmark tests.
  • The two best performing technology stacks were Go (aka Golang) and Nginx + Lua, but we chose Go because of its growing popularity.
  • By using our tracker in your AdTech or MarTech development project, you can save thousands in development costs and months of development time.
"Our tracker can be used to collect event data for AdTech & MarTech platforms."

Krzysiek Trębicki, project manager of Tracker



Here’s an overview of the development process we followed when building the tracker:

[Image: The development phases for building our tracker]

MVP Scoping Phase

The goal of the Minimum Viable Product (MVP) Scoping phase was to define the scope of the project and select the architecture and tech stack.

We achieved this by doing the following:

Creating a Story Map

We started the project by creating a story map to help us:

  • Define the key components and features of the tracker.
  • Identify and solve the main technical challenges, such as performance, speed, and scalability.
  • Decide what events the tracker should collect and how it will do it.
Here’s an overview of what the story map looked like:

[Image: The story map for the tracker's MVP]

Defining the Functional Requirements

Based on the results from the story mapping sessions, we created a list of functional requirements for the tracker.

The functional requirements relate to the features and processes of the tracker.

We identified that the tracker would need to:

  • Receive and handle multiple types of requests (impressions, clicks, conversions, etc.).
  • Process requests and generate events with proper dimensions and metrics.
  • Allow request types and preprocessing of events to be configured.
  • Extend its basic functionality with plugins by exposing its API.
  • Expose generated events to specified collectors (plugins) and allow them to be attached to specific event types by configuration.
  • Include built-in log and queue collectors.
  • Include a built-in plugin for our budget management component, banker.
  • Support request redirects.
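
To make these requirements more concrete, here is a minimal Go sketch of how a tracker along these lines could receive typed requests, turn them into events, and fan them out to pluggable collectors. The names used here (Event, Collector, the /impression and /click paths) are illustrative assumptions, not our actual implementation.

```go
package main

import (
	"log"
	"net/http"
	"time"
)

// Event is a generic tracking event with basic dimensions.
// The field names are illustrative, not the real schema.
type Event struct {
	Type      string            // "impression", "click", "conversion", ...
	Timestamp time.Time
	Params    map[string]string // e.g. campaign ID, creative ID, user agent
}

// Collector is the plugin interface: anything that can consume events
// (a log writer, a queue producer, the banker component, ...).
type Collector interface {
	Collect(Event) error
}

// Tracker routes incoming requests to the collectors configured
// for each event type.
type Tracker struct {
	collectors map[string][]Collector // event type -> attached collectors
}

func (t *Tracker) handle(eventType string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ev := Event{
			Type:      eventType,
			Timestamp: time.Now().UTC(),
			Params:    map[string]string{},
		}
		// Preprocess the request into event dimensions/metrics.
		for key, values := range r.URL.Query() {
			ev.Params[key] = values[0]
		}
		// Fan the event out to every collector attached to this type.
		for _, c := range t.collectors[eventType] {
			if err := c.Collect(ev); err != nil {
				log.Printf("collector error for %s: %v", eventType, err)
			}
		}
		w.WriteHeader(http.StatusNoContent)
	}
}

func main() {
	t := &Tracker{collectors: map[string][]Collector{}}
	// In the real tracker, request types and their collectors come from configuration.
	http.HandleFunc("/impression", t.handle("impression"))
	http.HandleFunc("/click", t.handle("click"))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

In the actual tracker, the set of request types and the collectors attached to each event type are driven by configuration rather than hard-coded, in line with the requirements above.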

Defining the Non-Functional Requirements

The non-functional requirements aren’t about the tracker’s features; instead, they cover its performance, scalability, security, delivery, and interoperability.

We identified the following non-functional requirements of tracker:

  • High availability (99.999%).
  • High requests per second (2000 requests/s).
  • Low request processing latency (15 ms on average).
  • Security
    • Privacy — e.g. ensuring we don’t expose personal data to third parties when we need to share it with other components.
    • Availability — ensuring the platform won’t be impacted by a distributed denial-of-service (DDoS) attack caused by bad requests.
  • Deployable to AWS, GCP, and Azure.
  • Ability to integrate with custom plugins.
  • Ability to integrate with other DSP components such as the banker.
  • Platform scalability — the tracker needs to be able to handle large increases in events.
  • Idempotency — the ability to process the same logs multiple times (e.g. after errors or temporary unavailability) without producing duplicate results (see the sketch after this list).
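
The idempotency requirement deserves a quick illustration: the tracker (or its collectors) must be able to safely re-process the same logs after an error or outage. One common way to achieve this is to deduplicate on an event ID. The sketch below is a simplified in-memory version of that idea; the ID field and the in-memory store are assumptions for illustration, and a production version would use a persistent or shared store.

```go
package main

import (
	"fmt"
	"sync"
)

// Event carries a unique ID so that re-processing the same logs
// does not produce duplicate downstream records. (Illustrative only.)
type Event struct {
	ID   string
	Type string
}

// DedupCollector wraps a processing step and skips events it has
// already seen, making re-delivery of the same logs safe.
type DedupCollector struct {
	mu      sync.Mutex
	seen    map[string]bool
	process func(Event) error
}

func NewDedupCollector(process func(Event) error) *DedupCollector {
	return &DedupCollector{seen: map[string]bool{}, process: process}
}

func (d *DedupCollector) Collect(ev Event) error {
	d.mu.Lock()
	already := d.seen[ev.ID]
	d.seen[ev.ID] = true
	d.mu.Unlock()
	if already {
		return nil // same log replayed; processing it again is a no-op
	}
	return d.process(ev)
}

func main() {
	c := NewDedupCollector(func(ev Event) error {
		fmt.Println("processed", ev.ID)
		return nil
	})
	c.Collect(Event{ID: "evt-1", Type: "click"})
	c.Collect(Event{ID: "evt-1", Type: "click"}) // replayed log line: silently skipped
}
```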

Selecting the Architecture and Tech Stack

We selected the architecture and tech stack for the tracker project by:

  • Researching benchmarking tools used for testing the performance of the programming languages.
  • Comparing different variations of the technology stack.

Benchmark tools research

We researched different benchmark tools to help us select the right programming language for the tracker.

The main metrics we wanted to test were:

  • Latency of requests
  • Requests per second
  • Error ratio
  • Latency percentiles
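
As a quick illustration of the last metric: a latency percentile (for example, p95) is the duration below which that share of sampled requests completed. Here is a generic sketch of how it could be computed from raw samples; it isn't tied to any of the tools below.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// percentile returns the latency below which p percent of the
// sampled requests completed (nearest-rank method).
func percentile(latencies []time.Duration, p float64) time.Duration {
	if len(latencies) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), latencies...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := int(float64(len(sorted))*p/100+0.5) - 1
	if rank < 0 {
		rank = 0
	}
	if rank >= len(sorted) {
		rank = len(sorted) - 1
	}
	return sorted[rank]
}

func main() {
	samples := []time.Duration{
		8 * time.Millisecond, 9 * time.Millisecond, 11 * time.Millisecond,
		12 * time.Millisecond, 14 * time.Millisecond, 40 * time.Millisecond,
	}
	fmt.Println("p95:", percentile(samples, 95)) // dominated by the slowest requests
}
```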
The ideal benchmark tool needed to be easily integrated with our continuous integration (CI) environment, either by running it from the command line or via a Jenkins plugin.

Below are the pros and cons of the benchmark tools that met our requirements.

Wrk

Wrk is an HTTP benchmarking tool capable of generating significant load.

Pros

  • It doesn’t use a domain-specific language (DSL), meaning we don’t have to learn a new programming language.
  • Test scripts are written in Lua, which provides speed during testing.
  • Easy to set up and use.

Cons

  • Although Lua provides speed during testing, it has some issues that can impact test results.
  • There’s no visualization of benchmark results.

Wrk would be the best choice for simple performance and load testing; however, more advanced scenarios require writing tests in Lua, which can be time consuming. It also doesn’t support metrics visualization out of the box.


k6

k6 is an open-source load testing tool.

Pros

  • It doesn’t use a domain-specific language (DSL).
  • Tests are written in JavaScript, so it’s easy for the team to use.
  • Easy to set up and use.
  • Easy to integrate with Grafana for metrics visualization.

Cons

  • Tests are written in JavaScript, which is not an ideal testing language.


Locust

Locust is an easy-to-use, distributed load testing tool. It’s written in Python and built on the Requests library.

Pros

  • It doesn’t use a DSL and tests are written in Python, so it’s easy for the team to use.
  • Easy to set up and use.
  • Optional web UI with charts.

Cons

  • We found during the benchmark testing phase that Locust is a bit slow and requires a lot of infrastructure to generate a decent amount of load.

k6 and Locust are comparably easy to set up and use, but Locust was chosen because it allows us to write tests in Python, a technology the team knows very well.


Gatling

Gatling is a powerful open-source load testing solution.

Pros

  • There’s a Jenkins plugin available, which allows us to view reports generated by Gatling in Jenkins.

Cons

  • It uses a Scala-based DSL for writing tests, meaning we’d have to learn how to use it before we could start running tests.
  • It takes time to get started and set up.

Gatling was discarded due to its complexity and the Scala-based DSL used for configuring tests.

Once we had chosen the benchmark tools, we moved on to selecting the technologies that we would test.

Technology Stack Comparison

We decided to test the following programming languages and technologies:

  • Golang (aka Go) because it’s growing in popularity and we were familiar with it.
  • Rust because other development teams have used it to build trackers in the past.
  • Python because we are very familiar with it.
  • OpenResty (Nginx + Lua) because it allows us to create a tracker using just an Nginx HTTP server.

We ran initial benchmark tests on all the technologies using the benchmarking tools listed above, but later focused on running more tests with wrk2, Gatling, and Locust.

All three tools were configured to allow for maximum elasticity.

The Results and the Chosen Programming Language

The two best performing technology stacks were Nginx + Lua and Golang. While they had similar results across all benchmarking tools, we chose Golang due to current market needs and its popularity.

The MVP Scoping Phase took 1 sprint (2 weeks) to complete.

The MVP Development Phase

With the MVP Scoping Phase completed and our architecture and tech stack selected, we began building the MVP of the tracker.

We built the tracker in 5 sprints (10 weeks).

Below is an overview of what we produced in each sprint.


Sprint 1

What we achieved and built in this sprint:

  • Implemented basic functionality.
  • Configured event types.
  • Created benchmarks as a continuous integration (CI) step.
  • Produced visual benchmark results.

Sprint 2

What we achieved and built in this sprint:

  • Created and tested request redirects (a brief sketch follows this list).
  • Performed tests on the tracker.
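
Request redirects are what make click tracking work: the tracker records the click event and then sends the browser on to the destination URL. Below is a rough Go sketch of that flow; the url and cid parameters are assumptions for illustration, not the tracker's real API.

```go
package main

import (
	"log"
	"net/http"
)

// handleClickRedirect records a click event and then redirects the
// browser to the destination passed in the (hypothetical) "url" parameter.
func handleClickRedirect(w http.ResponseWriter, r *http.Request) {
	destination := r.URL.Query().Get("url")
	if destination == "" {
		http.Error(w, "missing destination", http.StatusBadRequest)
		return
	}

	// In the real tracker this would build an event and hand it to the
	// configured collectors; here we just log it.
	log.Printf("click: campaign=%s destination=%s", r.URL.Query().Get("cid"), destination)

	// 302 sends the user on to the landing page once the click is recorded.
	http.Redirect(w, r, destination, http.StatusFound)
}

func main() {
	http.HandleFunc("/click", handleClickRedirect)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

A production tracker would also validate the destination (e.g. against an allow-list) rather than redirecting to any URL it receives.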

Sprint 3

What we achieved and built in this sprint:

  • Built a plugin for the log storage component (a rough sketch follows this list).
  • Enhanced the configuration to allow any tracking paths to be configured.
  • Produced the tracker’s documentation.
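
To give a feel for what a collector plugin such as the log storage one might look like, here is a sketch that implements the Collect method from the earlier example by appending each event to a file as one JSON object per line. The structure and field names are assumptions, not the actual plugin.

```go
package main

import (
	"encoding/json"
	"os"
	"time"
)

// Event mirrors the illustrative event shape used earlier in this article.
type Event struct {
	Type      string            `json:"type"`
	Timestamp time.Time         `json:"timestamp"`
	Params    map[string]string `json:"params"`
}

// LogCollector is a sketch of a log-storage plugin: it appends each
// event to a file as one JSON object per line.
type LogCollector struct {
	file *os.File
}

func NewLogCollector(path string) (*LogCollector, error) {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return nil, err
	}
	return &LogCollector{file: f}, nil
}

// Collect satisfies the Collector interface from the earlier sketch.
func (l *LogCollector) Collect(ev Event) error {
	return json.NewEncoder(l.file).Encode(ev) // Encode appends a trailing newline
}

func (l *LogCollector) Close() error {
	return l.file.Close()
}

func main() {
	lc, err := NewLogCollector("events.log")
	if err != nil {
		panic(err)
	}
	defer lc.Close()
	lc.Collect(Event{Type: "impression", Timestamp: time.Now().UTC(), Params: map[string]string{"cid": "123"}})
}
```

This lines up with the built-in log and queue collectors mentioned in the functional requirements; a queue collector would look similar, publishing events to a message broker instead of a file.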

Sprint 4

What we achieved and built in this sprint:

  • Built a plugin for our banker (a budget management component).
  • Configured event extraction.
  • Ran end-to-end tests.

Sprint 5

What we achieved and built in this sprint:

  • Produced auto-generated documentation.
  • Created a quickstart guide.
  • Built Docker images and pushed them to the internal registry.

How Can Our Tracker Help You?

  • Speed up the development phase of your custom-built AdTech or MarTech platform.
  • Implement it into your existing AdTech or MarTech platforms to collect data.