update(kafka integration): Add max_wait_time to Advanced Configurations
Meggielqk committed Oct 29, 2024
1 parent 316ca9a commit 7d72bad
Showing 2 changed files with 20 additions and 18 deletions.
31 changes: 16 additions & 15 deletions en_US/data-integration/data-bridge-kafka.md
@@ -385,26 +385,27 @@ This section describes some advanced configuration options that can optimize the
| Fields | Descriptions | Recommended Values |
| ----------------------------------------- | ------------------------------------------------------------ | ------------------ |
-| Min Metadata Refresh Interval | The minimum time interval the client must wait before refreshing Kafka broker and topic metadata. Setting this value too small may increase the load on the Kafka server unnecessarily. | `3` |
-| Metadata Request Timeout | The maximum duration to wait when the bridge requests metadata from Kafka. | `5` |
-| Connect Timeout | The maximum time to wait for TCP connection establishment, which includes the authentication time if enabled. | `5` |
-| Fetch Bytes (Source) | The byte size to pull from Kafka with each fetch request. Note that if the configured value is smaller than the message size in Kafka, it may negatively impact fetch performance. | `896` |
-| Max Batch Bytes (Sink) | The maximum size, in bytes, for collecting messages within a Kafka batch. Typically, Kafka brokers have a default batch size limit of 1 MB. However, EMQX's default value is intentionally set slightly lower than 1 MB to account for Kafka message encoding overheads, particularly when individual messages are very small. If a single message exceeds this limit, it will still be sent as a separate batch. | `896` |
-| Offset Commit Interval (Source) | The time interval between two offset commit requests sent for each consumer group. | `5` |
+| Min Metadata Refresh Interval | The minimum time interval the client must wait before refreshing Kafka broker and topic metadata. Setting this value too small may increase the load on the Kafka server unnecessarily. | `3` seconds |
+| Metadata Request Timeout | The maximum duration to wait when the bridge requests metadata from Kafka. | `5` seconds |
+| Connect Timeout | The maximum time to wait for TCP connection establishment, which includes the authentication time if enabled. | `5` seconds |
+| Max Wait Time (Source) | The maximum duration to wait for a fetch response from the Kafka broker. | `1` second |
+| Fetch Bytes (Source) | The byte size to pull from Kafka with each fetch request. Note that if the configured value is smaller than the message size in Kafka, it may negatively impact fetch performance. | `896` KB |
+| Max Batch Bytes (Sink) | The maximum size, in bytes, for collecting messages within a Kafka batch. Typically, Kafka brokers have a default batch size limit of 1 MB. However, EMQX's default value is intentionally set slightly lower than 1 MB to account for Kafka message encoding overheads, particularly when individual messages are very small. If a single message exceeds this limit, it will still be sent as a separate batch. | `896` KB |
+| Offset Commit Interval (Source) | The time interval between two offset commit requests sent for each consumer group. | `5` seconds |
| Required Acks (Sink) | Required acknowledgments for the Kafka partition leader to await from its followers before sending an acknowledgment back to the EMQX Kafka producer: <br />`all_isr`: Requires acknowledgment from all in-sync replicas.<br />`leader_only`: Requires acknowledgment only from the partition leader.<br />`none`: No acknowledgment from Kafka is needed. | `all_isr` |
-| Partition Count Refresh Interval (Source) | The time interval at which the Kafka producer detects an increased number of partitions. Once Kafka's partition count is augmented, EMQX will incorporate these newly discovered partitions into its message dispatching process, based on the specified `partition_strategy`. | `60` |
+| Partition Count Refresh Interval (Source) | The time interval at which the Kafka producer detects an increased number of partitions. Once Kafka's partition count is augmented, EMQX will incorporate these newly discovered partitions into its message dispatching process, based on the specified `partition_strategy`. | `60` seconds |
| Max Inflight (Sink) | The maximum number of batches the Kafka producer may send per partition before receiving acknowledgment from Kafka. A greater value typically means better throughput. However, there can be a risk of message reordering when this value is greater than 1.<br />This option controls the number of unacknowledged messages in transit, effectively balancing the load to prevent overburdening the system. | `10` |
| Query Mode (Source) | Allows you to choose asynchronous or synchronous query modes to optimize message transmission based on different requirements. In asynchronous mode, writing to Kafka does not block the MQTT message publish process. However, this might result in clients receiving messages ahead of their arrival in Kafka. | `Async` |
-| Synchronous Query Timeout (Sink) | In synchronous query mode, establishes a maximum wait time for confirmation. This ensures timely message transmission completion to avoid prolonged waits.<br />It applies only when the bridge query mode is configured to `Sync`. | `5` |
+| Synchronous Query Timeout (Sink) | In synchronous query mode, establishes a maximum wait time for confirmation. This ensures timely message transmission completion to avoid prolonged waits.<br />It applies only when the bridge query mode is configured to `Sync`. | `5` seconds |
| Buffer Mode (Sink) | Defines whether messages are stored in a buffer before being sent. Memory buffering can increase transmission speeds.<br />`memory`: Messages are buffered in memory. They will be lost in the event of an EMQX node restart.<br />`disk`: Messages are buffered on disk, ensuring they can survive an EMQX node restart.<br />`hybrid`: Messages are initially buffered in memory. When they reach a certain limit (refer to the `segment_bytes` configuration for more details), they are gradually offloaded to disk. Similar to the memory mode, messages will be lost if the EMQX node restarts. | `memory` |
-| Per-partition Buffer Limit (Sink) | Maximum allowed buffer size, in bytes, for each Kafka partition. When this limit is reached, older messages will be discarded to make room for new ones by reclaiming buffer space. <br />This option helps to balance memory usage and performance. | `2` |
-| Segment File Bytes (Sink) | This setting is applicable when the buffer mode is configured as `disk` or `hybrid`. It controls the size of segmented files used to store messages, influencing the optimization level of disk storage. | `100` |
-| Memory Overload Protection (Sink) | This setting applies when the buffer mode is configured as `memory`. EMQX will automatically discard older buffered messages when it encounters high memory pressure. It helps prevent system instability due to excessive memory usage, ensuring system reliability. <br />**Note**: The threshold for high memory usage is defined in the configuration parameter `sysmon.os.sysmem_high_watermark`. This configuration is effective only on Linux systems. | Disabled |
-| Socket Send / Receive Buffer Size | Manages the size of socket buffers to optimize network transmission performance. | `1024` |
+| Per-partition Buffer Limit (Sink) | Maximum allowed buffer size, in bytes, for each Kafka partition. When this limit is reached, older messages will be discarded to make room for new ones by reclaiming buffer space. <br />This option helps to balance memory usage and performance. | `2` GB |
+| Segment File Bytes (Sink) | This setting is applicable when the buffer mode is configured as `disk` or `hybrid`. It controls the size of segmented files used to store messages, influencing the optimization level of disk storage. | `100` MB |
+| Memory Overload Protection (Sink) | This setting applies when the buffer mode is configured as `memory`. EMQX will automatically discard older buffered messages when it encounters high memory pressure. It helps prevent system instability due to excessive memory usage, ensuring system reliability. <br />**Note**: The threshold for high memory usage is defined in the configuration parameter `sysmon.os.sysmem_high_watermark`. This configuration is effective only on Linux systems. | `Disabled` |
+| Socket Send / Receive Buffer Size | Manages the size of socket buffers to optimize network transmission performance. | `1024` KB |
| TCP Keepalive | This configuration enables the TCP keepalive mechanism for Kafka bridge connections to maintain ongoing connection validity, preventing connection disruptions caused by extended periods of inactivity. The value should be provided as a comma-separated list of three numbers in the format `Idle, Interval, Probes`:<br />Idle: The number of seconds a connection must remain idle before the server initiates keep-alive probes. The default value on Linux is 7200 seconds.<br />Interval: The number of seconds between each TCP keep-alive probe. On Linux, the default is 75 seconds.<br />Probes: The maximum number of TCP keep-alive probes to send before considering the connection closed if there is no response from the other end. The default on Linux is 9 probes.<br />For example, if you set the value to `240,30,5`, TCP keepalive probes will be sent after 240 seconds of idle time, with subsequent probes sent every 30 seconds. If there are no responses for 5 consecutive probe attempts, the connection will be marked as closed. | `none` |
-| Max Linger Time | Maximum duration for a per-partition producer to wait for messages in order to collect a batch to buffer. The default value `0` means no wait. For non-memory buffer mode, it's advised to configure at least `5ms` for less IOPS. | `0` |
-| Max Linger Bytes | Maximum number of bytes for a per-partition producer to wait for messages in order to collect a batch to buffer. | `10` |
-| Health Check Interval | The time interval for checking the running status of the connector. | `15` |
+| Max Linger Time | Maximum duration for a per-partition producer to wait for messages in order to collect a batch to buffer. The default value `0` means no wait. For non-memory buffer mode, it's advised to configure at least `5ms` for less IOPS. | `0` milliseconds |
+| Max Linger Bytes | Maximum number of bytes for a per-partition producer to wait for messages in order to collect a batch to buffer. | `10` MB |
+| Health Check Interval | The time interval for checking the running status of the connector. | `15` seconds |
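
The `Idle, Interval, Probes` format described in the TCP Keepalive row can be illustrated with a small Python sketch. This is not EMQX code: the function names `parse_tcp_keepalive` and `seconds_until_closed` are hypothetical, and the detection time is only approximated as idle time plus one interval per unanswered probe.

```python
def parse_tcp_keepalive(value: str) -> tuple[int, int, int]:
    """Parse an 'Idle,Interval,Probes' string into three positive integers."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 'Idle,Interval,Probes', got {value!r}")
    idle, interval, probes = (int(p) for p in parts)
    if min(idle, interval, probes) <= 0:
        raise ValueError("all three values must be positive")
    return idle, interval, probes


def seconds_until_closed(value: str) -> int:
    """Roughly estimate the time from last activity until a silent peer is
    considered closed (assumption: idle + interval * probes)."""
    idle, interval, probes = parse_tcp_keepalive(value)
    return idle + interval * probes


# The example value from the table: 240 s idle, then 5 probes 30 s apart.
print(seconds_until_closed("240,30,5"))  # 390
```

With the Linux defaults quoted in the table (`7200,75,9`), the same estimate gives 7875 seconds, which shows why a shorter setting is useful for detecting dead broker connections sooner.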

## More Information

7 changes: 4 additions & 3 deletions zh_CN/data-integration/data-bridge-kafka.md
@@ -383,9 +383,10 @@ EMQX v5.7.2 introduced a new feature that, during the SQL processing stage, can take from the configured

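| Fields | Descriptions | Recommended Values |
| ---------------------------- | ------------------------------------------------------------ | --------- |
-| Min Metadata Refresh Interval | The minimum time interval the client must wait before refreshing Kafka broker and topic metadata. Setting this value too small may unnecessarily increase the load on the Kafka server. | `3`seconds |
-| Metadata Request Timeout | The maximum duration to wait when the connector requests metadata from Kafka. | `5`seconds |
-| Connect Timeout | The maximum time to wait for TCP connection establishment, including the authentication time if enabled. | `5`seconds |
+| Min Metadata Refresh Interval | The minimum time interval the client must wait before refreshing Kafka broker and topic metadata. Setting this value too small may unnecessarily increase the load on the Kafka server. | `3` seconds |
+| Metadata Request Timeout | The maximum duration to wait when the connector requests metadata from Kafka. | `5` seconds |
+| Connect Timeout | The maximum time to wait for TCP connection establishment, including the authentication time if enabled. | `5` seconds |
+| Max Wait Time | The maximum time to wait for the Kafka broker to send a response. | `1` second |
| Fetch Bytes (Consumer) | The byte size to pull from Kafka with each fetch request. Note that if the configured value is smaller than the message size in Kafka, it may negatively impact fetch performance. | `896` KB |
| Max Batch Bytes (Producer) | The maximum size, in bytes, for collecting messages within a Kafka batch. Typically, Kafka brokers have a default batch size limit of 1 MB. However, EMQX's default value is intentionally set slightly lower than 1 MB to account for Kafka message encoding overhead, particularly when individual messages are very small. If a single message exceeds this limit, it will still be sent as a separate batch. | `896` KB |
| Offset Commit Interval (Consumer) | The time interval between two offset commit requests sent for each consumer group. | `5` seconds |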