
[DOCS-11051] Add Azure Event Hub using Kafka source #30211

Open
wants to merge 14 commits into base: master
32 changes: 18 additions & 14 deletions config/_default/menus/main.en.yaml
@@ -5230,72 +5230,76 @@ menu:
url: observability_pipelines/sources/amazon_s3/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_amazon_s3
weight: 902
- name: Azure Event Hubs
url: observability_pipelines/sources/azure_event_hubs/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_azure_event_hubs
weight: 903
- name: Datadog Agent
url: observability_pipelines/sources/datadog_agent/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_datadog_agent
weight: 903
weight: 904
- name: Fluent
url: observability_pipelines/sources/fluent/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_fluent
weight: 904
weight: 905
- name: Google Pub/Sub
url: observability_pipelines/sources/google_pubsub/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_google_pubsub
weight: 905
weight: 906
- name: HTTP Client
url: observability_pipelines/sources/http_client/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_http_client
weight: 906
weight: 907
- name: HTTP Server
url: observability_pipelines/sources/http_server/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_http_server
weight: 907
weight: 908
- name: Lambda Forwarder
url: observability_pipelines/sources/lambda_forwarder/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_lambda_forwarder
weight: 908
weight: 909
- name: Kafka
url: observability_pipelines/sources/kafka/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_kafka
weight: 909
weight: 910
- name: Logstash
url: observability_pipelines/sources/logstash/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_logstash
weight: 910
weight: 911
- name: Socket
url: observability_pipelines/sources/socket/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_socket
weight: 911
weight: 912
- name: Splunk HEC
url: observability_pipelines/sources/splunk_hec/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_splunk_hec
weight: 912
weight: 913
- name: Splunk TCP
url: observability_pipelines/sources/splunk_tcp/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_splunk_tcp
weight: 913
weight: 914
- name: Sumo Logic Hosted Collector
url: observability_pipelines/sources/sumo_logic/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_sumo_logic
weight: 914
weight: 915
- name: Syslog
url: observability_pipelines/sources/syslog/
parent: observability_pipelines_sources
identifier: observability_pipelines_sources_syslog
weight: 915
weight: 916
- name: Processors
url: observability_pipelines/processors/
parent: observability_pipelines
112 changes: 112 additions & 0 deletions content/en/observability_pipelines/sources/azure_event_hubs.md
@@ -0,0 +1,112 @@
---
title: Send Azure Event Hubs Logs to Observability Pipelines
disable_toc: false
---

## Overview

This document walks through how to send Azure Event Hubs logs to Observability Pipelines using the Kafka source. First, set up Azure Event Hubs for the Kafka source:

- [Create an Event Hubs namespace](#create-an-azure-event-hubs-namespace)
- [Create an Event Hub (Kafka topic)](#create-an-event-hub-kafka-topic)
- [Configure shared access policy](#configure-shared-access-policy)
- [Set up diagnostic settings](#set-up-diagnostic-settings)
- [Configure Kafka-compatible connection for the event hub](#configure-kafka-compatible-connection-for-the-event-hub)

After Azure Event Hubs has been set up, you [set up a pipeline with the Kafka source](#set-up-a-pipeline-with-the-kafka-source) to send Azure Event Hubs logs to Observability Pipelines.

## Set up Azure Event Hubs for the Kafka source

### Create an Azure Event Hubs namespace

1. In the Azure Portal, navigate to [Event Hubs](https://portal.azure.com/#browse/Microsoft.EventHub%2Fnamespaces).
1. Click **Create**.
1. Fill in the **Project Details** (subscription, resource group) and **Instance Details** (namespace name, region, select Standard, Premium, or Dedicated tier).
1. Ensure the region matches your Azure resources (for example, `westus`).
1. Click **Review + create**.

**Note**: The Kafka endpoint is automatically enabled for the Standard tier and higher.

### Create an event hub (Kafka topic)

1. In the namespace you created, select **Event Hubs** and click **+ Event Hub**.
1. Enter a name (for example, `datadog-topic`) and configure the settings (for example, 4 partitions and a 7-day retention time).
1. Click **Review + create**. This Event Hub acts as a Kafka topic.

### Configure shared access policy

1. In the Event Hub you created, navigate to **Settings** > **Shared access policies**.
1. Click **+ Add**.
1. Enter a policy name (for example, `DatadogKafkaPolicy`).
1. Select the **Manage** checkbox, which should automatically select the **Send** and **Listen** checkboxes.
1. Click **Create**.
1. Copy the **Primary connection string** to use for Kafka authentication.

### Set up diagnostic settings

1. Configure Azure resources (for example, VMs, App Services) or subscription-level activity logs to stream logs to the Event Hub.
1. For resources:
    1. Navigate to the resource and then to **Monitoring** > **Diagnostic settings**.
    1. Click **+ Add diagnostic setting**.
    1. Select log categories you want (for example, AuditLogs, SignInLogs for Microsoft Entra ID).
    1. In **Destination details**:
        1. Check the **Stream to an event hub** box.
        1. Select the namespace and Event Hub (`datadog-topic`).
    1. Click **Save**.
1. For activity logs:
    1. Navigate to **Microsoft Entra ID** > **Monitoring** > **Audit logs** > **Export Data Settings**.
    1. Check the **Stream to the Event Hub** box.
1. Repeat for each region. Logs must stream to Event Hubs in the same region.

### Configure Kafka-compatible connection for the event hub

Azure Event Hubs exposes a Kafka endpoint at `NAMESPACE.servicebus.windows.net:9093`, which Observability Pipelines uses as the Kafka source.

#### Get the Kafka endpoint

1. In the Azure Portal, navigate to your Event Hubs Namespace (for example, `myeventhubns`).
1. On the **Overview** page, under the **Essentials** section, locate the **Host name** or **Fully Qualified Domain Name (FQDN)**. It is in the format: `<NAMESPACE>.servicebus.windows.net` (for example, `myeventhubns.servicebus.windows.net`).
1. Append the Kafka port `:9093` to form the Bootstrap Servers value: `<NAMESPACE>.servicebus.windows.net:9093`.
    - For example, if your namespace is `myeventhubns`, the Bootstrap Servers is `myeventhubns.servicebus.windows.net:9093`.
    - You need this information when you set up the Observability Pipelines Kafka source.
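
As an illustration of the steps above, the Bootstrap Servers value can be derived from the namespace name. This is a minimal Python sketch, not part of Observability Pipelines: the helper name `bootstrap_servers` is hypothetical, and the host suffix and port are the ones given on this page.

```python
def bootstrap_servers(namespace: str, port: int = 9093) -> str:
    """Build the Kafka Bootstrap Servers value for an Event Hubs namespace.

    `namespace` is the bare namespace name (for example, "myeventhubns"),
    not the fully qualified domain name.
    """
    namespace = namespace.strip()
    # Reject empty input and values that already contain a host suffix or port.
    if not namespace or "." in namespace or ":" in namespace:
        raise ValueError("pass the bare namespace name, for example 'myeventhubns'")
    return f"{namespace}.servicebus.windows.net:{port}"


print(bootstrap_servers("myeventhubns"))  # myeventhubns.servicebus.windows.net:9093
```

The helper rejects a value that already includes the domain or port, which catches the common mistake of pasting the full host name where only the namespace is expected.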

#### Set up authentication

1. Azure Event Hubs uses SASL_SSL with the PLAIN mechanism for Kafka authentication.
1. Format the credentials for Observability Pipelines as follows, using the primary connection string you copied from the shared access policy as the password:
```
Username: $$ConnectionString
Password: Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<PolicyName>;SharedAccessKey=<Key>
```
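
Before pasting the connection string into the SASL password field, it can help to confirm that it contains the parts shown above. The following is a rough Python sketch under stated assumptions: the function names are hypothetical, the username literal is copied from this page as written, and the sample key is a made-up placeholder.

```python
# Username literal as written in this guide (assumption: used verbatim).
SASL_USERNAME = "$$ConnectionString"

# Keys that appear in the connection string format shown above.
REQUIRED_KEYS = ("Endpoint", "SharedAccessKeyName", "SharedAccessKey")


def parse_connection_string(conn: str) -> dict:
    """Split a connection string into its key/value parts."""
    parts = {}
    for segment in conn.strip().split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: key values may themselves contain "=".
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key] = value
    missing = [key for key in REQUIRED_KEYS if key not in parts]
    if missing:
        raise ValueError(f"connection string is missing: {', '.join(missing)}")
    return parts


def sasl_credentials(conn: str) -> tuple:
    """Return (username, password) after validating the connection string."""
    parse_connection_string(conn)  # raises ValueError if malformed
    return SASL_USERNAME, conn.strip()


# Placeholder values for illustration only.
example = (
    "Endpoint=sb://myeventhubns.servicebus.windows.net/;"
    "SharedAccessKeyName=DatadogKafkaPolicy;SharedAccessKey=abc123="
)
username, password = sasl_credentials(example)
```

The password is the whole connection string, unchanged; the parsing step only checks that nothing was lost when copying it from the Azure Portal.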

## Set up a pipeline with the Kafka source

1. Navigate to [Observability Pipelines](https://app.datadoghq.com/observability-pipelines).
1. Select the Kafka source.
1. In the **Group ID** field, specify or create a unique consumer group (for example, `datadog-consumer-group`).
1. Enter `datadog-topic` in the **Topics** field.
1. Toggle the switch to enable SASL authentication.
1. In the **Mechanism** dropdown menu, select **PLAIN**.
1. Enable TLS.
1. Download the certificate from [https://curl.se/docs/caextract.html](https://curl.se/docs/caextract.html) and save it to `/var/lib/observability-pipelines-worker/config/cert.pem`.
Contributor

Maybe we should host this cert in docs? I don't think this third party site is something we can rely on. But I think certs expire, so I am not really sure what to do here. I found this out via: https://datadoghq.atlassian.net/wiki/spaces/PRODUCTSA/pages/5118492913/One+Oncology+2025#:~:text=Third%20TLS%20needs%20to%20be%20enabled.%20But%20I%20didn%27t%20know%20what%20cert%20to%20use.%20I%20found%20this%20really%20old%20github%20issue%20post

Contributor

We might also want to be specific saying "Save this cert to this path on your OP worker host/container"? If it is containerized they will likely have to mount a volume to load it.

This is a bit chicken and egg problem because this guide assumes they haven't deployed OPW yet, so that directory won't yet exist until OPW has been installed.

Contributor Author

Does this work?

Suggested change
1. Download the certificate from [https://curl.se/docs/caextract.html](https://curl.se/docs/caextract.html) and save it to `/var/lib/observability-pipelines-worker/config/cert.pem`.
1. Download the certificate from [https://curl.se/docs/caextract.html](https://curl.se/docs/caextract.html) and save it to this path on your Worker host or container: `/var/lib/observability-pipelines-worker/config/cert.pem`. If you are using a container, you might have to mount a volume to load it.

Contributor Author

Maybe we should host this cert in docs? I don't think this third party site is something we can rely on. But I think certs expire, so I am not really sure what to do here. I found this out via:

Hm I'm also not sure what's best here either..if it expires, how does someone find another one to use in this case?

Contributor

Suggested change looks good to me.

Regarding cert, I don't really know 😓 I could ask the broader PSA team and see if someone smarter than me on these things has any thoughts?

Contributor

I asked PSA team for suggestions, I am out of my depth here

Contributor

Someone Smarter than me helped me out :) @krlv (Thank you so much!)

His verbatim response below:

okay, I think I got it. Azure Events Hub supports SASL/PLAIN auth only over SSL encrypted connection (kinda make sense -- you're sending the password in a plain text over the wire). So in your case, OPW has to use TLS encryption for the authentication request.

Moreover, based on this comment, i have a hunch that any trusted root cert will work -- even the one from the ca-certificates/OpenSSL. if you're running OPW on a linux instance, you should probably use /etc/ssl/certs/ca-certificates.crt or /usr/lib/ssl/certs/ca-certificates.crt instead of downloading it from the cURL website.

Back to your original question in GH -- no, we shouldn't host the CA certs on our docs site. We don't host root certs; even when we changed the one used to sign Datadog certs, we just pointed users to download it directly from the DigiCert website: https://docs.datadoghq.com/data_security/guide/tls_cert_chain_of_trust/?tab=g1rootcertificateold#action-needed

My 2 cents on the doc: I'd highly recommend testing the ca-certificates package (using /etc/ssl/certs/ca-certificates.crt as the "Certificate path" in the OP UI) before suggesting customers download and mount root CA cert from the internet. Hopefully OPW can handle .crt files in addition to .pem. If this works, we can simply reference ca-certificates in the docs and provide /etc/ssl/certs/ca-certificates.crt as an example.

@emarsha94 and/or I will need to test this, I think we must do this before we proceed with publishing. I'll also check container land for these certs to see if they can be used. Hopefully our Azure Event Hubs environment is still around so we can test this without having to set everything up again 🤞

Contributor Author

Slacked this as well, but putting here for visibility:

If it's going to take some time to test, do you think we could just say something like:

Download a trusted root certificate and save it to this path on your Worker host or container: /var/lib/observability-pipelines-worker/config/cert.pem. If you're running the Worker on a Linux instance, you should use the certificate: /etc/ssl/certs/ca-certificates.crt or /usr/lib/ssl/certs/ca-certificates.crt. If you are using a container, you might have to mount a volume to load the certificate.

Contributor

I just don't know if those certs already exist on a standard distribution or in our container image and if they'll work, thus why I want to test again.

1. Enter `/cert.pem` in the **Certificate path** field.
{{< img src="observability_pipelines/sources/kafka_settings.png" alt="The Kafka source settings with example values" style="width:45%;" >}}
1. Click **Next: Select Destination**.
1. After you set up your destinations and processors, click **Next: Install**.
1. Select your platform in the **Choose your installation platform** dropdown menu.
1. Enter the environment variables for your Kafka source:
    1. For **Kafka Bootstrap Servers**, enter `<NAMESPACE>.servicebus.windows.net:9093` (for example, `myeventhubns.servicebus.windows.net:9093`).
    1. For **Kafka SASL Username**, enter `$$ConnectionString`.
    1. For **Kafka SASL Password**, enter the full connection string (for example, `Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<PolicyName>;SharedAccessKey=<Key>`).
    1. Enter your Kafka TLS passphrase.
Contributor

I think we left this blank in our test, but @emarsha94 will have to confirm -- I don't see in either of our notes where we denoted needing this.

Contributor @znbbz, Jul 10, 2025

No, this is the part we didn't give to the customer before in the notes, and we had to update. $$ConnectionString is definitely needed, for instance.

Contributor Author @maycmlee, Jul 10, 2025

@znbbz I think @ckelner is referring to this part?

  1. Enter your Kafka TLS passphrase.

Contributor

Correct

Contributor

Since I am going to have to set this up to test it again for the certificate, I'll make note if we leave this blank or not when we get it to a working state.

{{< img src="observability_pipelines/sources/kafka_env_vars.png" alt="The install page with example values for the kafka environment variables" style="width:100%;" >}}
1. Enter the environment variables for your destinations, if applicable.
1. Follow the rest of the instructions on the page to install the Worker based on your platform.
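
Because the Kafka source connects over TLS, one way to test a candidate CA bundle before entering it as the certificate path is to attempt a TLS handshake against the Event Hubs Kafka endpoint using only that bundle. The sketch below uses Python's standard `ssl` and `socket` modules. The function names are hypothetical, the two Linux bundle paths are examples that may not exist on every distribution or container image, and a successful handshake shows only that the bundle validates the endpoint's certificate chain, not that the Worker accepts the file.

```python
import os
import socket
import ssl

# Example Linux CA bundle locations (assumption: not present on every system).
CANDIDATE_BUNDLES = (
    "/etc/ssl/certs/ca-certificates.crt",
    "/usr/lib/ssl/certs/ca-certificates.crt",
)


def first_existing(paths):
    """Return the first path that is an existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


def tls_handshake_ok(host, cafile, port=9093, timeout=10.0):
    """Return True if a TLS handshake with host:port verifies against cafile."""
    # Passing cafile makes the context trust only the certificates in that file.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:  # includes ssl.SSLError and connection failures
        return False


# Example usage (requires network access to your namespace):
#   bundle = first_existing(CANDIDATE_BUNDLES)
#   if bundle:
#       print(tls_handshake_ok("myeventhubns.servicebus.windows.net", bundle))
```

If you run the Worker in a container, run the check inside that container so that it sees the same files the Worker does.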

#### Check your Observability Pipelines environment file

If you run into issues after installing the Worker, check your Observability Pipelines environment file (`/etc/default/observability-pipelines-worker`) to make sure the environment variables are correctly set:

- `DD_OP_SOURCE_KAFKA_SASL_USERNAME="$$ConnectionString"`
- `DD_OP_SOURCE_KAFKA_BOOTSTRAP_SERVERS=<NAMESPACE>.servicebus.windows.net:9093`
- `DD_OP_SOURCE_KAFKA_SASL_PASSWORD=<Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<PolicyName>;SharedAccessKey=<Key>>`
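
The check above can be partly automated. The following Python sketch reads the text of an environment file and reports obvious problems with the three variables listed. It assumes simple `KEY=VALUE` lines with optional surrounding quotes, which may not cover every syntax the file supports; the function names and the namespace pattern are illustrative assumptions based on the formats shown on this page.

```python
import re

EXPECTED = (
    "DD_OP_SOURCE_KAFKA_BOOTSTRAP_SERVERS",
    "DD_OP_SOURCE_KAFKA_SASL_USERNAME",
    "DD_OP_SOURCE_KAFKA_SASL_PASSWORD",
)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and comments (simplified)."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only; the connection string contains more.
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        env[key.strip()] = value
    return env


def check_env(text):
    """Return a list of human-readable problems; an empty list means none found."""
    env = parse_env(text)
    problems = [f"{name} is missing or empty" for name in EXPECTED if not env.get(name)]
    servers = env.get("DD_OP_SOURCE_KAFKA_BOOTSTRAP_SERVERS", "")
    if servers and not re.fullmatch(r"[A-Za-z0-9-]+\.servicebus\.windows\.net:9093", servers):
        problems.append("bootstrap servers should look like <NAMESPACE>.servicebus.windows.net:9093")
    password = env.get("DD_OP_SOURCE_KAFKA_SASL_PASSWORD", "")
    if password and not password.startswith("Endpoint=sb://"):
        problems.append("SASL password should be the full connection string starting with Endpoint=sb://")
    return problems


# Example usage on a Worker host:
#   with open("/etc/default/observability-pipelines-worker") as handle:
#       for problem in check_env(handle.read()):
#           print(problem)
```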
2 changes: 2 additions & 0 deletions content/en/observability_pipelines/sources/kafka.md
@@ -5,6 +5,8 @@ disable_toc: false

Use Observability Pipelines' Kafka source to receive logs from your Kafka topics. Select and set up this source when you [set up a pipeline][1]. The Kafka source uses [librdkafka][2].

You can also [send Azure Event Hubs logs to Observability Pipelines using the Kafka source](/observability_pipelines/sources/azure_event_hubs/).

## Prerequisites

{{% observability_pipelines/prerequisites/kafka %}}