Commit 345fdb0

feat: enhance Azure Data Explorer and Event Hubs configurations with compression and partitioning options
1 parent bba6f83 commit 345fdb0

2 files changed, +92 -42 lines changed

docs/configuration/targets/azure/azure-data-explorer.mdx

Lines changed: 23 additions & 1 deletion
@@ -23,6 +23,7 @@ Creates an Azure Data Explorer (Kusto) target that ingests data directly into Az
 table: <string>
 schema: <string>
 type: <string>
+compression: <string>
 flush_immediately: <boolean>
 timeout: <numeric>
 batch_size: <numeric>
@@ -31,6 +32,8 @@ Creates an Azure Data Explorer (Kusto) target that ingests data directly into Az
 tables:
 - name: <string>
 schema: <string>
+type: <string>
+compression: <string>
 interval: <string|numeric>
 cron: <string>
 debug:
@@ -61,7 +64,8 @@ The following fields are used to define the target:
 |`database`|Y|-|Target database name|
 |`table`|N(2)|-|Default/fallback table name (catch-all for unmatched events)|
 |`schema`|N(2)|-|Table schema definition for the default/fallback table|
-|`type`|N|`parquet`|Data format: `parquet`, `json`, `multijson`, or `avro`|
+|`type`|N|`parquet`|Data format. See [Formats](#formats) below|
+|`compression`|N|-|Compression algorithm. See [Compression](#compression) below|

 (1) = Conditionally required (see authentication methods above)
 (2) = Required if you want a catch-all table for unmatched events, or if not using the tables array
@@ -139,6 +143,17 @@ For tables not defined in the configuration, the target can automatically discov
 |`avro`|Apache Avro format with schema|
 |`parquet`|Apache Parquet columnar format with schema (default)|

+### Compression
+
+Data can use the following compression algorithms:
+
+|Format|Default|Compression Codecs|
+|---|---|---|
+|JSON|-|Not supported|
+|MultiJSON|-|Not supported|
+|Avro|`zstd`|`deflate`, `snappy`, `zstd`|
+|Parquet|`zstd`|`gzip`, `snappy`, `zstd`, `brotli`, `lz4`|
+
 :::warning
 Consider cluster capacity when setting batch sizes and timeouts.
 :::
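
Review note: the Compression table added in this hunk pairs each format with a default codec and a list of allowed codecs. The following is a minimal sketch of how a configuration could be checked against that table. The function name `resolve_compression` and the lookup structure are hypothetical and not part of the target. Falling back to the per-format default when `compression` is omitted is an assumption drawn from the table's Default column.

```python
# Hypothetical lookup mirroring the Compression table in this commit:
# format -> (default codec, allowed codecs).
SUPPORTED_CODECS = {
    "json": (None, ()),
    "multijson": (None, ()),
    "avro": ("zstd", ("deflate", "snappy", "zstd")),
    "parquet": ("zstd", ("gzip", "snappy", "zstd", "brotli", "lz4")),
}


def resolve_compression(fmt="parquet", codec=None):
    """Return the codec to use for fmt, or None if the format is uncompressed."""
    if fmt not in SUPPORTED_CODECS:
        raise ValueError(f"unknown format: {fmt!r}")
    default, allowed = SUPPORTED_CODECS[fmt]
    if codec is None:
        # Assumption: an omitted codec falls back to the format's default.
        return default
    if codec not in allowed:
        raise ValueError(f"{fmt!r} does not support compression {codec!r}")
    return codec
```

Under these assumptions, `resolve_compression("avro", "brotli")` raises `ValueError`, because `brotli` is listed only for Parquet.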
@@ -177,16 +192,23 @@ targets:
 endpoint: "https://cluster.region.kusto.windows.net"
 database: "logs"
 type: "parquet"
+compression: "zstd"
 # Catch-all table for unmatched events
 table: "general_logs"
 schema: "TimeGenerated:datetime,Message:string,Source:string"
 tables:
 - name: "security_events"
 schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
+type: "parquet"
+compression: "zstd"
 - name: "system_events"
 schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
+type: "avro"
+compression: "snappy"
 - name: "application_events"
 schema: "TimeGenerated:datetime,Computer:string,EventID:int,Message:string"
+type: "parquet"
+compression: "brotli"
 ```

 In this example, events with `SystemS3` set to "security_events", "system_events", or "application_events" will route to their respective tables. All other events will route to the "general_logs" catch-all table.
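
Review note: the routing rule described in the last context line can be sketched as below. This is an illustration only. The function `route_table` and its parameters are hypothetical; the field name `SystemS3` and the table names come from the example in the diff. What the target does when no catch-all table is configured is not covered here.

```python
def route_table(event, tables, fallback=None, field="SystemS3"):
    """Pick the destination table for one event.

    tables   -- set of table names declared under `tables:`
    fallback -- the catch-all `table` value, or None if not configured
    """
    name = event.get(field)
    # Only an exact string match against a declared table routes directly.
    if isinstance(name, str) and name in tables:
        return name
    return fallback


declared = {"security_events", "system_events", "application_events"}
print(route_table({"SystemS3": "system_events"}, declared, "general_logs"))
print(route_table({"SystemS3": "audit"}, declared, "general_logs"))
```

The first call matches a declared table by name; the second has no match and returns the catch-all name.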

docs/configuration/targets/azure/azure-event-hubs.mdx

Lines changed: 69 additions & 41 deletions
@@ -21,17 +21,17 @@ Creates a target that sends processed messages to _Azure Event Hubs_ with suppor
 client_secret: <string>
 namespace: <string>
 event_hub: <string>
-partition_key: <string>
-format: <string>
-batch_size: <numeric>
-max_retry: <numeric>
-retry_interval: <numeric>
-timeout: <numeric>
-compression: <string>
+partition:
+id: <string>
+key: <string>
+field_format: <string>
+max_bytes: <numeric>
+max_events: <numeric>
 tls:
 status: <boolean>
 cert_name: <string>
 key_name: <string>
+insecure_skip_verify: <boolean>
 interval: <string|numeric>
 cron: <string>
 debug:
@@ -74,37 +74,47 @@ EventHubs target supports two authentication methods:

 \* = Conditionally required (see authentication methods above)

+### Partition Configuration
+
+|Field|Required|Default|Description|
+|---|---|---|---|
+|`partition.id`|N*|-|Specific partition ID to send messages to|
+|`partition.key`|N*|-|Partition key for message routing|
+
+\* = Mutually exclusive - use either `partition.id` OR `partition.key`, not both
+
 ### Message Configuration

 |Field|Required|Default|Description|
 |---|---|---|---|
-|`partition_key`|N|-|Partition key for message routing|
-|`format`|N|`json`|Output format (`json`, `multijson`, `raw`)|
-|`compression`|N|-|Compression method (`gzip`, `lz4`)|
+|`field_format`|N|-|Data normalization format. See applicable <Topic id="normalization-mapping">Normalization</Topic> section|

 ### Performance

 |Field|Required|Default|Description|
 |---|---|---|---|
-|`batch_size`|N|`100`|Number of messages per batch|
-|`timeout`|N|`30`|Connection timeout in seconds|
-|`max_retry`|N|`3`|Maximum retry attempts|
-|`retry_interval`|N|`5`|Retry interval in seconds|
+|`max_bytes`|N|`0`|Maximum batch size in bytes (0 for unlimited)|
+|`max_events`|N|`1000`|Maximum number of events per batch|

 ### TLS

 |Field|Required|Default|Description|
 |---|---|---|---|
 |`tls.status`|N|`false`|Enable TLS encryption|
-|`tls.cert_name`|N*||TLS certificate file path (required if TLS enabled)|
-|`tls.key_name`|N*||TLS private key file path (required if TLS enabled)|
+|`tls.cert_name`|N*|-|TLS certificate file path (required if TLS enabled)|
+|`tls.key_name`|N*|-|TLS private key file path (required if TLS enabled)|
+|`tls.insecure_skip_verify`|N|`false`|Skip TLS certificate verification (NOT recommended for production)|

 \* = Conditionally required (only when `tls.status: true`)

 :::note
 TLS certificate and key files must be placed in the service root directory.
 :::

+:::warning
+Setting `insecure_skip_verify: true` disables certificate validation and should only be used for testing/development environments.
+:::
+
 ### Scheduler

 |Field|Required|Default|Description|
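
Review note: the Performance table in this hunk replaces `batch_size` with two limits, `max_events` (default `1000`) and `max_bytes` (default `0`, meaning unlimited). One plausible reading of how the two limits interact is sketched below. The generator `batch_events` is hypothetical, and sending an oversized single event in a batch of its own is an assumption, not documented behavior.

```python
def batch_events(events, max_events=1000, max_bytes=0):
    """Group encoded events (bytes) into batches that respect both limits.

    max_bytes == 0 disables the byte limit, as in the Performance table.
    """
    if max_events < 1:
        raise ValueError("max_events must be at least 1")
    batch, size = [], 0
    for payload in events:
        full_by_count = len(batch) >= max_events
        full_by_bytes = (
            max_bytes > 0 and bool(batch) and size + len(payload) > max_bytes
        )
        if full_by_count or full_by_bytes:
            yield batch
            batch, size = [], 0
        # Assumption: an event larger than max_bytes still goes out alone.
        batch.append(payload)
        size += len(payload)
    if batch:
        yield batch
```

With `max_bytes` left at `0`, only the event count closes a batch; with both set, whichever limit is reached first closes it.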
@@ -121,11 +131,18 @@ TLS certificate and key files must be placed in the service root directory.

 ## Details

-The EventHubs target sends processed messages to Azure Event Hubs for real-time event streaming and analytics. It supports automatic batching for optimal performance, configurable retry mechanisms for reliability, and multiple authentication methods for flexible deployment scenarios.
+The EventHubs target sends processed messages to Azure Event Hubs for real-time event streaming and analytics. It supports automatic batching for optimal performance and multiple authentication methods for flexible deployment scenarios.
+
+Messages are sent with automatic partition distribution unless a specific partition ID or key is provided. The target handles connection pooling and automatic reconnection on network failures.
+
+### Partition Management

-Messages are sent with automatic partition distribution unless a specific partition key is provided. The target handles connection pooling and automatic reconnection on network failures.
+You can control message routing to Event Hubs partitions using two mutually exclusive options:

-Format options include JSON for structured data, multijson for line-delimited JSON arrays, and raw format for preserving original message structure. Compression options help reduce network bandwidth for high-volume scenarios.
+- **`partition.id`**: Routes all messages to a specific partition by ID (0-based index)
+- **`partition.key`**: Uses a partition key for consistent hashing across partitions
+
+If neither is specified, Event Hubs automatically distributes messages across available partitions using round-robin distribution.

 ## Examples
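
Review note: the Partition Management text added in this hunk describes three routing modes: a fixed partition ID, a partition key, or automatic distribution when neither is set. A minimal sketch of validating that choice follows. The function `partition_mode` and its return values are hypothetical; the sketch checks only the mutual-exclusion rule and does not reproduce how Event Hubs hashes keys.

```python
def partition_mode(partition=None):
    """Classify a `partition:` mapping as ("id", v), ("key", v) or ("auto", None)."""
    partition = partition or {}
    pid = partition.get("id")
    key = partition.get("key")
    if pid is not None and key is not None:
        raise ValueError("partition.id and partition.key are mutually exclusive")
    if pid is not None:
        return ("id", pid)
    if key is not None:
        return ("key", key)
    # Neither set: leave distribution across partitions to Event Hubs.
    return ("auto", None)
```

For the two partition examples in this commit, `{"key": "source_system"}` would classify as key routing and `{"id": "0"}` as a fixed partition.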

@@ -144,12 +161,11 @@ The following are commonly used configuration types.
 properties:
 client_connection_string: "Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=mykey;SharedAccessKey=myvalue"
 event_hub: "processed-logs"
-format: json
-batch_size: 100
+max_events: 100
 ```
 </CodeCol>
 <CommentCol>
-Target sends JSON messages to Event Hubs in batches...
+Target sends messages to Event Hubs in batches...
 </CommentCol>
 <CodeCol>
 ```json
@@ -179,8 +195,7 @@ The following are commonly used configuration types.
 client_secret: "${AZURE_CLIENT_SECRET}"
 namespace: "production-namespace"
 event_hub: "security-events"
-format: json
-batch_size: 250
+max_events: 250
 ```
 </CodeCol>
 </ExampleGrid>
@@ -198,35 +213,51 @@ The following are commonly used configuration types.
 properties:
 client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
 event_hub: "high-volume-events"
-format: multijson
-batch_size: 500
-compression: gzip
-timeout: 60
-max_retry: 5
-retry_interval: 3
+max_events: 500
+max_bytes: 1048576
 ```
 </CodeCol>
 </ExampleGrid>

-### Partitioned Messages
+### Partition Key Configuration

 <ExampleGrid>
 <CommentCol>
-Using partition keys for message routing control...
+Using partition key for consistent message routing...
 </CommentCol>
 <CodeCol>
 ```yaml
-- name: partitioned_target
+- name: partitioned_key_target
 type: eventhubs
 properties:
 tenant_id: "${AZURE_TENANT_ID}"
 client_id: "${AZURE_CLIENT_ID}"
 client_secret: "${AZURE_CLIENT_SECRET}"
 namespace: "analytics-namespace"
 event_hub: "partitioned-logs"
-partition_key: "source_system"
-format: json
-batch_size: 200
+partition:
+key: "source_system"
+max_events: 200
+```
+</CodeCol>
+</ExampleGrid>
+
+### Partition ID Configuration
+
+<ExampleGrid>
+<CommentCol>
+Sending all messages to a specific partition...
+</CommentCol>
+<CodeCol>
+```yaml
+- name: partitioned_id_target
+type: eventhubs
+properties:
+client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
+event_hub: "specific-partition"
+partition:
+id: "0"
+max_events: 150
 ```
 </CodeCol>
 </ExampleGrid>
@@ -244,8 +275,7 @@ The following are commonly used configuration types.
 properties:
 client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
 event_hub: "secure-events"
-format: json
-batch_size: 150
+max_events: 150
 tls:
 status: true
 cert_name: "eventhubs.crt"
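
Review note: the TLS table in this commit marks `tls.cert_name` and `tls.key_name` as required only when `tls.status` is true, and adds `tls.insecure_skip_verify` with a warning against production use. A rough sketch of a pre-flight check for those rules follows. The function `check_tls` is hypothetical and not part of the target; it validates the mapping only and does not touch any certificate files.

```python
def check_tls(tls=None):
    """Raise if required TLS fields are missing; return a list of warnings."""
    tls = tls or {}
    warnings = []
    if tls.get("insecure_skip_verify", False):
        warnings.append(
            "insecure_skip_verify disables certificate validation; "
            "use only in testing/development"
        )
    if tls.get("status", False):
        # cert_name and key_name are conditionally required (tls.status: true).
        missing = [f for f in ("cert_name", "key_name") if not tls.get(f)]
        if missing:
            raise ValueError("tls.status is true but missing: " + ", ".join(missing))
    return warnings
```

A configuration that enables TLS with only `cert_name` set would fail this check with a message naming `key_name`.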
@@ -271,9 +301,7 @@ The following are commonly used configuration types.
 properties:
 client_connection_string: "${EVENTHUBS_CONNECTION_STRING}"
 event_hub: "processed-events"
-format: json
-batch_size: 100
-compression: lz4
+max_events: 100
 ```
 </CodeCol>
 </ExampleGrid>
