
[DOCS-11051] Add Azure Event Hub using Kafka source #30211

**Open** · wants to merge 14 commits into `master`
31 changes: 18 additions & 13 deletions config/_default/menus/main.en.yaml
@@ -5731,71 +5731,76 @@ menu:
```
   parent: observability_pipelines_sources
   identifier: observability_pipelines_sources_amazon_s3
   weight: 802
+  - name: Azure Event Hubs
+    url: observability_pipelines/sources/azure_event_hubs/
+    parent: observability_pipelines_sources
+    identifier: observability_pipelines_sources_azure_event_hubs
+    weight: 803
   - name: Datadog Agent
     url: observability_pipelines/sources/datadog_agent/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_datadog_agent
-    weight: 803
+    weight: 804
   - name: Fluent
     url: observability_pipelines/sources/fluent/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_fluent
-    weight: 804
+    weight: 805
   - name: Google Pub/Sub
     url: observability_pipelines/sources/google_pubsub/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_google_pubsub
-    weight: 805
+    weight: 806
   - name: HTTP Client
     url: observability_pipelines/sources/http_client/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_http_client
-    weight: 806
+    weight: 807
   - name: HTTP Server
     url: observability_pipelines/sources/http_server/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_http_server
-    weight: 807
+    weight: 808
   - name: Lambda Forwarder
     url: observability_pipelines/sources/lambda_forwarder/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_lambda_forwarder
-    weight: 808
+    weight: 809
   - name: Kafka
     url: observability_pipelines/sources/kafka/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_kafka
-    weight: 809
+    weight: 810
   - name: Logstash
     url: observability_pipelines/sources/logstash/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_logstash
-    weight: 810
+    weight: 811
   - name: Socket
     url: observability_pipelines/sources/socket/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_socket
-    weight: 811
+    weight: 812
   - name: Splunk HEC
     url: observability_pipelines/sources/splunk_hec/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_splunk_hec
-    weight: 812
+    weight: 813
   - name: Splunk TCP
     url: observability_pipelines/sources/splunk_tcp/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_splunk_tcp
-    weight: 813
+    weight: 814
   - name: Sumo Logic Hosted Collector
     url: observability_pipelines/sources/sumo_logic/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_sumo_logic
-    weight: 814
+    weight: 815
   - name: Syslog
     url: observability_pipelines/sources/syslog/
     parent: observability_pipelines_sources
     identifier: observability_pipelines_sources_syslog
-    weight: 815
+    weight: 816
   - name: Processors
     url: observability_pipelines/processors/
     parent: observability_pipelines
```
110 changes: 110 additions & 0 deletions content/en/observability_pipelines/sources/azure_event_hubs.md
@@ -0,0 +1,110 @@
---
title: Send Azure Event Hubs Logs to Observability Pipelines
disable_toc: false
---

## Overview

This document walks through how to send Azure Event Hubs logs to Observability Pipelines using the Kafka source. Setting up Azure Event Hubs for the Kafka source involves the following steps:

- [Create an Event Hubs namespace](#create-an-azure-event-hubs-namespace)
- [Create an Event Hub (Kafka topic)](#create-an-event-hub-kafka-topic)
- [Configure shared access policy](#configure-shared-access-policy)
- [Set up diagnostic settings](#set-up-diagnostic-settings)
- [Configure Kafka-compatible connection for the event hub](#configure-kafka-compatible-connection-for-the-event-hub)

After Azure Event Hubs has been set up, you [set up a pipeline with the Kafka source](#set-up-a-pipeline-with-the-kafka-source) to send Azure Event Hubs logs to Observability Pipelines.

## Set up Azure Event Hubs for the Kafka source

### Create an Azure Event Hubs namespace

1. In the Azure Portal, navigate to [Event Hubs](https://portal.azure.com/#browse/Microsoft.EventHub%2Fnamespaces).
1. Click **Create**.
1. Fill in the **Project Details** (subscription and resource group) and **Instance Details** (namespace name, region, and pricing tier: Standard, Premium, or Dedicated).
1. Ensure the region matches your Azure resources (for example, `westus`).
1. Click **Review + create**.

**Note**: The Kafka endpoint is automatically enabled for Standard and higher tiers.
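
If you prefer the command line, the namespace setup can be sketched with the Azure CLI. This is a minimal sketch; the namespace name, resource group, and region below are example values, not requirements:

```
# Create an Event Hubs namespace on the Standard tier (example names).
az eventhubs namespace create \
  --name myeventhubns \
  --resource-group my-resource-group \
  --location westus \
  --sku Standard
```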

### Create an event hub (Kafka topic)

1. In the namespace you created, select **Event Hubs** and click **+ Event Hub**.
1. Enter a name (for example, `datadog-topic`) and configure the settings (for example, 4 partitions and a 7-day retention time).
1. Click **Review + create**. This Event Hub acts as a Kafka topic.
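
The same event hub can be created with the Azure CLI; a hedged sketch follows, using the example names from this guide. Note that the retention flag varies by CLI version (`--message-retention` takes days on older versions, while newer versions use `--retention-time-in-hours` instead):

```
# Create the event hub (Kafka topic) with 4 partitions and 7-day retention.
az eventhubs eventhub create \
  --name datadog-topic \
  --namespace-name myeventhubns \
  --resource-group my-resource-group \
  --partition-count 4 \
  --message-retention 7
```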

### Configure shared access policy

1. In the Event Hub you created, navigate to **Settings** > **Shared access policies**.
1. Click **+ Add**.
1. Enter a policy name (for example, `DatadogKafkaPolicy`).
1. Select the **Manage** checkbox, which should automatically select the **Send** and **Listen** checkboxes.
1. Click **Create**.
1. Copy the **Connection string-primary key** for Kafka authentication.
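
If you are scripting the setup, a sketch of the equivalent Azure CLI commands follows, reusing the example names from the steps above; the second command prints the connection string you copy in the last step:

```
# Create the shared access policy with Manage, Send, and Listen rights.
az eventhubs eventhub authorization-rule create \
  --name DatadogKafkaPolicy \
  --eventhub-name datadog-topic \
  --namespace-name myeventhubns \
  --resource-group my-resource-group \
  --rights Manage Send Listen

# Print the primary connection string for Kafka authentication.
az eventhubs eventhub authorization-rule keys list \
  --name DatadogKafkaPolicy \
  --eventhub-name datadog-topic \
  --namespace-name myeventhubns \
  --resource-group my-resource-group \
  --query primaryConnectionString --output tsv
```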

### Set up diagnostic settings

Configure Azure resources (for example, VMs and App Services) or subscription-level activity logs to stream logs to the Event Hub.

1. For resources:
   1. Navigate to the resource and then to **Monitoring** > **Diagnostic settings**.
   1. Click **+ Add diagnostic setting**.
   1. Select the log categories you want (for example, AuditLogs and SignInLogs for Microsoft Entra ID).
   1. In **Destination details**:
      1. Check the **Stream to an event hub** box.
      1. Select the namespace and Event Hub (`datadog-topic`).
   1. Click **Save**.
1. For activity logs:
   1. Navigate to **Microsoft Entra ID** > **Monitoring** > **Audit logs** > **Export Data Settings**.
   1. Check the **Stream to the Event Hub** box.
1. Repeat for each region. Logs must stream to Event Hubs in the same region.
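
For resources, the diagnostic setting can also be created with the Azure CLI. The sketch below is illustrative: `<RESOURCE_ID>` and `<EVENT_HUB_AUTH_RULE_ID>` are placeholders for full Azure resource IDs, and the log category depends on the resource type:

```
# Stream a resource's AuditLogs category to the datadog-topic event hub.
az monitor diagnostic-settings create \
  --name stream-to-datadog-topic \
  --resource "<RESOURCE_ID>" \
  --event-hub datadog-topic \
  --event-hub-rule "<EVENT_HUB_AUTH_RULE_ID>" \
  --logs '[{"category": "AuditLogs", "enabled": true}]'
```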

### Configure Kafka-compatible connection for the event hub

Azure Event Hubs exposes a Kafka endpoint at `NAMESPACE.servicebus.windows.net:9093`, which Observability Pipelines uses as the Kafka source.

#### Get the Kafka endpoint

1. In the Azure Portal, navigate to your Event Hubs Namespace (for example, `myeventhubns`).
1. On the **Overview** page, under the **Essentials** section, locate the **Host name** or **Fully Qualified Domain Name (FQDN)**. It is in the format: `<NAMESPACE>.servicebus.windows.net` (for example, `myeventhubns.servicebus.windows.net`).
1. Append the Kafka port `:9093` to form the Bootstrap Servers value: `<NAMESPACE>.servicebus.windows.net:9093`.
- For example, if your namespace is `myeventhubns`, the Bootstrap Servers is `myeventhubns.servicebus.windows.net:9093`.
- You need this information when you set up the Observability Pipelines Kafka source.
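
To confirm the endpoint is reachable and serving a TLS certificate before you configure the source, you can run a quick check with `openssl` (assuming it is installed on your host):

```
# Open a TLS connection to the Kafka endpoint; replace myeventhubns
# with your namespace. A certificate chain in the output means the
# endpoint is reachable.
openssl s_client -connect myeventhubns.servicebus.windows.net:9093 </dev/null
```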

#### Set up authentication

Azure Event Hubs uses SASL_SSL with the PLAIN mechanism for Kafka authentication. Use the following username and password format in Observability Pipelines:
```
Username: $ConnectionString
Password: Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<PolicyName>;SharedAccessKey=<Key>
```
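
Optionally, you can sanity-check these credentials with `kcat` (formerly kafkacat), which uses the same librdkafka client as the Observability Pipelines Kafka source. A sketch, assuming `kcat` is installed and using the example names from this guide:

```
# List broker and topic metadata over SASL_SSL/PLAIN. Single quotes
# keep the shell from expanding the literal $ConnectionString username.
kcat -L \
  -b myeventhubns.servicebus.windows.net:9093 \
  -X security.protocol=SASL_SSL \
  -X sasl.mechanisms=PLAIN \
  -X sasl.username='$ConnectionString' \
  -X sasl.password='Endpoint=sb://myeventhubns.servicebus.windows.net/;SharedAccessKeyName=DatadogKafkaPolicy;SharedAccessKey=<Key>'
```

If the credentials are correct, the output lists `datadog-topic` among the topics.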

## Set up a pipeline with the Kafka source

1. Navigate to [Observability Pipelines](https://app.datadoghq.com/observability-pipelines).
1. Select the Kafka source.
1. In the **Group ID** field, specify or create a unique consumer group (for example, `datadog-consumer-group`).
1. Enter `datadog-topic` in the **Topics** field.
1. Toggle the switch to enable SASL authentication.
1. In the **Mechanism** dropdown menu, select **PLAIN**.
1. Enable TLS.
1. Download the certificate from [https://curl.se/docs/caextract.html](https://curl.se/docs/caextract.html) and save it to `/var/lib/observability-pipelines-worker/config/cert.pem`.
**Contributor:**
Maybe we should host this cert in docs? I don't think this third party site is something we can rely on. But I think certs expire, so I am not really sure what to do here. I found this out via: https://datadoghq.atlassian.net/wiki/spaces/PRODUCTSA/pages/5118492913/One+Oncology+2025#:~:text=Third%20TLS%20needs%20to%20be%20enabled.%20But%20I%20didn%27t%20know%20what%20cert%20to%20use.%20I%20found%20this%20really%20old%20github%20issue%20post

**Contributor:**
We might also want to be specific saying "Save this cert to this path on your OP worker host/container"? If it is containerized they will likely have to mount a volume to load it.

This is a bit chicken and egg problem because this guide assumes they haven't deployed OPW yet, so that directory won't yet exist until OPW has been installed.

**Contributor (author):**
Does this work?

Suggested change:

```
- 1. Download the certificate from [https://curl.se/docs/caextract.html](https://curl.se/docs/caextract.html) and save it to `/var/lib/observability-pipelines-worker/config/cert.pem`.
+ 1. Download the certificate from [https://curl.se/docs/caextract.html](https://curl.se/docs/caextract.html) and save it to this path on your Worker host or container: `/var/lib/observability-pipelines-worker/config/cert.pem`. If you are using a container, you might have to mount a volume to load it.
```

**Contributor (author):**
> Maybe we should host this cert in docs? I don't think this third party site is something we can rely on. But I think certs expire, so I am not really sure what to do here. I found this out via:

Hm, I'm also not sure what's best here either. If it expires, how does someone find another one to use in this case?

**Contributor:**
Suggested change looks good to me.

Regarding cert, I don't really know 😓 I could ask the broader PSA team and see if someone smarter than me on these things has any thoughts?

**Contributor:**
I asked PSA team for suggestions, I am out of my depth here

**Contributor:**
Someone Smarter than me helped me out :) @krlv (Thank you so much!)

His verbatim response below:

> okay, I think I got it. Azure Events Hub supports SASL/PLAIN auth only over SSL encrypted connection (kinda make sense -- you're sending the password in a plain text over the wire). So in your case, OPW has to use TLS encryption for the authentication request.
>
> Moreover, based on this comment, i have a hunch that any trusted root cert will work -- even the one from the ca-certificates/OpenSSL. if you're running OPW on a linux instance, you should probably use /etc/ssl/certs/ca-certificates.crt or /usr/lib/ssl/certs/ca-certificates.crt instead of downloading it from the cURL website.
>
> Back to your original question in GH -- no, we shouldn't host the CA certs on our docs site. We don't host root certs; even when we changed the one used to sign Datadog certs, we just pointed users to download it directly from the DigiCert website: https://docs.datadoghq.com/data_security/guide/tls_cert_chain_of_trust/?tab=g1rootcertificateold#action-needed
>
> My 2 cents on the doc: I'd highly recommend testing the ca-certificates package (using /etc/ssl/certs/ca-certificates.crt as the "Certificate path" in the OP UI) before suggesting customers download and mount root CA cert from the internet. Hopefully OPW can handle .crt files in addition to .pem. If this works, we can simply reference ca-certificates in the docs and provide /etc/ssl/certs/ca-certificates.crt as an example.

@emarsha94 and/or I will need to test this, I think we must do this before we proceed with publishing. I'll also check container land for these certs to see if they can be used. Hopefully our Azure Event Hubs environment is still around so we can test this without having to set everything up again 🤞

**Contributor (author):**
Slacked this as well, but putting here for visibility:

If it's going to take some time to test, do you think we could just say something like:

> Download a trusted root certificate and save it to this path on your Worker host or container: /var/lib/observability-pipelines-worker/config/cert.pem. If you're running the Worker on a Linux instance, you should use the certificate /etc/ssl/certs/ca-certificates.crt or /usr/lib/ssl/certs/ca-certificates.crt. If you are using a container, you might have to mount a volume to load the certificate.

**Contributor:**
I just don't know if those certs already exist on a standard distribution or in our container image and if they'll work, thus why I want to test again.

1. Enter `/cert.pem` in the **Certificate path** field.
1. Click **Next: Select Destination**.
1. After you set up your destinations and processors, click **Next: Install**.
1. Select your platform in the **Choose your installation platform** dropdown menu.
1. Enter the environment variables for your Kafka source:
   1. For **Kafka Bootstrap Servers**, enter `<NAMESPACE>.servicebus.windows.net:9093` (for example, `myeventhubns.servicebus.windows.net:9093`).
   1. For **Kafka SASL Username**, enter `$ConnectionString`.
   1. For **Kafka SASL Password**, enter the full connection string (for example, `Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<PolicyName>;SharedAccessKey=<Key>`).
   1. Enter your Kafka TLS passphrase.
**Contributor:**
I think we left this blank in our test, but @emarsha94 will have to confirm -- I don't see in either of our notes where we denoted needing this.

**@znbbz (Contributor), Jul 10, 2025:**

No, this is the part we didn't give to the customer before in the notes, and we had to update. `$ConnectionString` is definitely needed, for instance.

**@maycmlee (Contributor, author), Jul 10, 2025:**

@znbbz I think @ckelner is referring to this part?

> 1. Enter your Kafka TLS passphrase.

**Contributor:**
Correct

**Contributor:**
Since I am going to have to set this up to test it again for the certificate, I'll make note if we leave this blank or not when we get it to a working state.

1. Enter the environment variables for your destinations, if applicable.
1. Follow the rest of the instructions on the page to install the Worker based on your platform.

#### Configure the Observability Pipelines environment file

In the Observability Pipelines environment file (`/etc/default/observability-pipelines-worker`), add the following connection variables:

- `DD_OP_SOURCE_KAFKA_SASL_USERNAME="$ConnectionString"`
- `DD_OP_SOURCE_KAFKA_BOOTSTRAP_SERVERS=<NAMESPACE>.servicebus.windows.net:9093`
- `DD_OP_SOURCE_KAFKA_SASL_PASSWORD="Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<PolicyName>;SharedAccessKey=<Key>"`
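
Filled in with the example values from this guide, the file might look like the following sketch (the shared access key is a placeholder). systemd reads this file without shell expansion, so the literal `$ConnectionString` username is passed through unchanged:

```
# /etc/default/observability-pipelines-worker (example values)
DD_OP_SOURCE_KAFKA_BOOTSTRAP_SERVERS=myeventhubns.servicebus.windows.net:9093
DD_OP_SOURCE_KAFKA_SASL_USERNAME="$ConnectionString"
DD_OP_SOURCE_KAFKA_SASL_PASSWORD="Endpoint=sb://myeventhubns.servicebus.windows.net/;SharedAccessKeyName=DatadogKafkaPolicy;SharedAccessKey=<Key>"
```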
2 changes: 2 additions & 0 deletions content/en/observability_pipelines/sources/kafka.md
@@ -5,6 +5,8 @@ disable_toc: false

Use Observability Pipelines' Kafka source to receive logs from your Kafka topics. Select and set up this source when you [set up a pipeline][1]. The Kafka source uses [librdkafka][2].

You can also [send Azure Event Hubs logs to Observability Pipelines using the Kafka source](/observability_pipelines/sources/azure_event_hubs/).

## Prerequisites

{{% observability_pipelines/prerequisites/kafka %}}
@@ -35,4 +35,52 @@ To set up the Microsoft Sentinel destination in Observability Pipelines:
1. Enter the Data Collection Rule (DCR) immutable ID, such as `dcr-000a00a000a00000a000000aa000a0aa`.

[10161]: https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-register-app?tabs=certificate%2Cexpose-a-web-api
[10162]: https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/data-collection-rule-overview

### Create an Event Hubs Namespace

1. In the Azure Portal, go to Event Hubs > Create.
1. Fill in Project Details (subscription, resource group) and Instance Details (namespace name, region, select Standard, Premium, or Dedicated tier).
1. Ensure the region matches your Azure resources (for example, westus).
1. Review and create the namespace.
Note: The Kafka endpoint is automatically enabled for Standard and higher tiers.

### Create an Event Hub (Kafka Topic)

1. Inside the namespace, select Event Hubs > + Event Hub.
1. Enter a name (for example, datadog-topic) and configure settings (for example, 4 partitions, 7-day retention). This Event Hub acts as a Kafka topic.

### Configure Shared Access Policy

1. In the Event Hub, go to Settings > Shared access policies > + Add.
1. Create a policy (for example, DatadogKafkaPolicy) with Listen, Send, and Manage permissions.
1. Copy the Connection string-primary key for Kafka authentication.

### Set Up Diagnostic Settings

1. Configure Azure resources (for example, VMs, App Services) or subscription-level activity logs to stream logs to the Event Hub.
1. Navigate to the resource > Monitoring > Diagnostic settings > Add diagnostic setting.
1. Select log categories (for example, AuditLogs, SignInLogs for Microsoft Entra ID).
1. Check Stream to an event hub, select the namespace and Event Hub (datadog-topic).
1. Save the settings.
1. For activity logs, go to Microsoft Entra ID > Monitoring > Audit logs > Export Data Settings, and stream to the Event Hub.
1. Repeat for each region, as logs must stream to Event Hubs in the same region.

### Configure Kafka-Compatible Connection for Azure Event Hub

Azure Event Hubs exposes a Kafka endpoint at NAMESPACE.servicebus.windows.net:9093, which Observability Pipelines uses as the Kafka source.

### Retrieve Kafka Connection Details

1. In the Azure Portal, navigate to your Event Hubs Namespace (for example, myeventhubns).
1. On the Overview page, under the Essentials section, locate the Host name or Fully Qualified Domain Name (FQDN). It is in the format <NAMESPACE>.servicebus.windows.net (for example, myeventhubns.servicebus.windows.net).
1. Append the Kafka port :9093 to form the Bootstrap Servers value: <NAMESPACE>.servicebus.windows.net:9093.
1. Example: If your namespace is myeventhubns, the Bootstrap Servers is myeventhubns.servicebus.windows.net:9093.

### Set Up Authentication

Azure Event Hubs uses SASL_SSL with the PLAIN mechanism for Kafka authentication. The connection string is formatted for Observability Pipelines as follows:

- Username: $ConnectionString
- Password: Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=<PolicyName>;SharedAccessKey=<Key>