Kafka users should closely monitor the storage space utilization of their Kafka instances, as excessive usage can impede the normal operation of Kafka. It is generally advisable to maintain the storage space utilization of the Kafka cluster within the range of 50-70%. If the usage rate continues to rise, appropriate measures should be taken to avoid the impact of insufficient space on the system.
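
The 50-70% guideline above can be turned into a simple check. The following is a minimal sketch, not an official Kafka tool: the function names `usage_status` and `check_log_dir` and the "ok/warning/critical" labels are hypothetical, and the path passed to `check_log_dir` is assumed to be the volume that holds the broker's log directory.

```python
import shutil

def usage_status(used_bytes: int, total_bytes: int,
                 warn_pct: float = 50.0, crit_pct: float = 70.0) -> str:
    """Classify disk utilization against the 50-70% guideline (hypothetical labels)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    pct = 100.0 * used_bytes / total_bytes
    if pct >= crit_pct:
        return "critical"   # at or above the upper bound of the guideline
    if pct >= warn_pct:
        return "warning"    # inside the 50-70% band
    return "ok"

def check_log_dir(path: str) -> str:
    """Check the volume that contains `path` (assumed to be a Kafka log directory)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage_status(usage.used, usage.total)

if __name__ == "__main__":
    # "." is a placeholder; point this at the broker's actual log directory.
    print(check_log_dir("."))
```
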
In Kafka, message data is stored in log files. Without controlling the number and size of log files, a significant amount of storage space can be occupied. Therefore, it is necessary to regularly clean up old data and logs, and appropriately configure parameters such as log file size.
By automatically cleaning up data in Kafka, we can effectively control the size of data on disk, remove unnecessary data, and reduce the maintenance costs of the Kafka cluster. Such measures can not only optimize the utilization of storage space but also ensure the efficient operation of Kafka.
| Parameter | Description | 
|---|---|
| log.cleanup.policy | delete (default): Removes old log segments based on retention time and maximum log size. compact: Retains at least the latest value for each key in a partition and discards older values for that key. | 
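
To make the compact policy concrete, the sketch below keeps only the most recent record per key. It is a simplified illustration rather than Kafka's actual log cleaner: the `compact` function and the `(offset, key, value)` tuple layout are assumptions made for this example.

```python
def compact(records):
    """Keep only the latest record per key, returned in offset order.

    `records` is a list of (offset, key, value) tuples in ascending offset
    order, a simplified stand-in for the records of one partition.
    """
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, key, value)  # a later record overwrites an earlier one
    return sorted(latest.values())          # offsets are unique, so this sorts by offset

if __name__ == "__main__":
    log = [(0, "user-1", "a"), (1, "user-2", "b"), (2, "user-1", "c")]
    print(compact(log))
```
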
In Kafka, the retention period of messages can be set per topic with the retention.ms parameter; the broker-wide defaults are log.retention.ms, log.retention.minutes, and log.retention.hours. Expired messages occupy storage space until they are cleaned up. To address this, Kafka cleans up data by segment: the active segment (the one currently being written) is never deleted, and closed segments are deleted only once they exceed the time limit set by log.retention.hours or the log exceeds the size limit set by log.retention.bytes.
| Parameter | Description | 
|---|---|
| log.retention.hours | The duration for which log segments are retained, in hours. Defaults to 168 (7 days). Segments whose newest message is older than this are deleted. | 
| log.retention.bytes | The maximum size a partition's log may reach before old segments are deleted. Defaults to -1 (no size limit). | 
| log.segment.bytes | The maximum size of a single log segment file, with a default value of 1073741824 (i.e., 1 GB). When the current segment reaches this size, Kafka rolls over to a new segment. | 
| log.roll.hours | The maximum time, in hours, before a new log segment is rolled even if the current one has not reached log.segment.bytes. Defaults to 168 (7 days). | 
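
The interaction between time-based and size-based retention can be sketched as follows. This is a simplified model under stated assumptions, not the broker's real implementation: the `Segment` class and `segments_to_delete` function are hypothetical, segments are assumed to be ordered oldest first, and the last one is treated as the active segment.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    base_offset: int
    size_bytes: int
    last_timestamp_ms: int  # timestamp of the newest message in the segment

def segments_to_delete(segments, now_ms, retention_ms=-1, retention_bytes=-1):
    """Return the closed segments a 'delete' policy would remove (simplified).

    A value of -1 disables the corresponding limit. The active (last)
    segment is never returned.
    """
    if not segments:
        return []
    closed = list(segments[:-1])
    active = segments[-1]
    deleted = []
    # Time-based: drop oldest segments whose newest message has expired.
    if retention_ms >= 0:
        while closed and now_ms - closed[0].last_timestamp_ms > retention_ms:
            deleted.append(closed.pop(0))
    # Size-based: drop oldest segments while the rest still meets the limit.
    if retention_bytes >= 0:
        total = sum(s.size_bytes for s in closed) + active.size_bytes
        while closed and total - closed[0].size_bytes >= retention_bytes:
            total -= closed[0].size_bytes
            deleted.append(closed.pop(0))
    return deleted

if __name__ == "__main__":
    segs = [Segment(0, 100, 1_000), Segment(10, 100, 5_000), Segment(20, 50, 9_000)]
    expired = segments_to_delete(segs, now_ms=10_000, retention_ms=6_000)
    print([s.base_offset for s in expired])
```

Deleting whole segments rather than individual messages is why a smaller log.segment.bytes or log.roll.hours value lets retention limits take effect sooner, at the cost of more files on disk.
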
In Kafka, message compression is a commonly used optimization strategy that can reduce storage space and bandwidth usage by compressing messages. Kafka offers several compression algorithms to choose from, such as zstd, LZ4, snappy, and GZIP. When using message compression, the following aspects should be considered:
Choice of Compression Algorithm
The choice of compression algorithm is a trade-off between compression ratio and performance, and each algorithm sits at a different point on that trade-off. For example, zstd usually achieves a higher compression ratio but requires more CPU resources. In contrast, Snappy and LZ4 offer higher throughput and lower latency but achieve lower compression ratios than zstd. Thus, when selecting a compression algorithm, it is essential to consider actual business needs and available hardware resources.
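
One way to evaluate this trade-off is to measure ratio and time on a sample of real messages. The sketch below is only an illustration of the measurement approach: because zstd, LZ4, and Snappy are not in the Python standard library, it uses zlib (the DEFLATE algorithm behind GZIP) at two compression levels as a stand-in. The `measure` function and the sample payload are assumptions made for this example.

```python
import time
import zlib

def measure(payload: bytes, level: int):
    """Return (compression_ratio, seconds) for zlib at the given level."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    # Round-trip check: decompression must restore the original bytes.
    assert zlib.decompress(compressed) == payload
    return len(payload) / len(compressed), elapsed

if __name__ == "__main__":
    # Hypothetical JSON-like messages; replace with a sample of real traffic.
    sample = b'{"user_id": 12345, "event": "page_view", "path": "/home"}\n' * 2000
    for level in (1, 9):  # 1 = fastest, 9 = best compression
        ratio, seconds = measure(sample, level)
        print(f"level={level} ratio={ratio:.1f} time={seconds * 1000:.2f} ms")
```

The same harness can be pointed at other codecs if their Python bindings are installed; the numbers depend heavily on message content and batch size.
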
Message Latency Before and After Compression
Using compression algorithms for compressing and decompressing messages will consume certain CPU resources and may introduce some message dispatch latency. Therefore, before deciding whether to enable message compression, it is necessary to assess the cluster's bandwidth, storage, and CPU resource situation comprehensively. If bandwidth and storage resources are limited while CPU resources are relatively abundant, enabling compression may be considered to reduce storage and network overhead. However, in cases where CPU resources are nearing saturation, careful evaluation is needed to determine whether to enable compression to avoid negative impacts on cluster performance.
Configuration of Compression Parameters
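When using Kafka message compression, the compression.type parameter specifies the compression algorithm. It is normally set in the producer configuration; consumers need no matching setting because they decompress messages automatically based on the codec recorded in each batch. The same parameter also exists at the broker and topic level, where it controls whether the broker keeps the producer's codec or recompresses the data.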
Typical throughput ranking: LZ4 > Snappy > zstd > GZIP.
Typical compression ratio ranking: zstd > LZ4 > GZIP > Snappy.
| Parameter | Description | 
|---|---|
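| compression.type | Broker- and topic-level values: uncompressed, zstd, lz4, snappy, gzip, producer (the default, which keeps the codec chosen by the producer). Producer-level values: none (default), gzip, snappy, lz4, zstd. | 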
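
Because the accepted values differ between producer and broker, a small validation step can catch misconfiguration early. The following is a minimal sketch: `validate_compression_type` is a hypothetical helper, and the broker address in the example configuration is a placeholder.

```python
PRODUCER_TYPES = {"none", "gzip", "snappy", "lz4", "zstd"}
BROKER_TYPES = {"uncompressed", "gzip", "snappy", "lz4", "zstd", "producer"}

def validate_compression_type(value: str, scope: str) -> str:
    """Return `value` if it is valid for `scope` ('producer' or 'broker')."""
    if scope == "producer":
        allowed = PRODUCER_TYPES
    elif scope == "broker":
        allowed = BROKER_TYPES  # topic-level settings accept the same values
    else:
        raise ValueError(f"unknown scope: {scope!r}")
    if value not in allowed:
        raise ValueError(
            f"{value!r} is not valid for {scope}; expected one of {sorted(allowed)}"
        )
    return value

if __name__ == "__main__":
    # Example producer settings using Kafka's property names.
    producer_config = {
        "bootstrap.servers": "localhost:9092",  # placeholder address
        "compression.type": validate_compression_type("lz4", "producer"),
    }
    print(producer_config)
```
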
Kafka provides several metrics to monitor the utilization of storage space. We need to regularly check these metrics and take appropriate measures upon discovering an ongoing increase in storage space utilization. Through monitoring and optimization measures for storage space, we can better manage the storage space of the Kafka cluster and prevent failures caused by storage space issues, thereby ensuring the stable operation of Kafka.
| Metric | Description | 
|---|---|
| log.log_end_offset | The offset of the last message in the current log. By comparing the log_end_offset of different topics and partitions, the relative message volume of each partition can be estimated. | 
| log.segment.bytes | The size of each log segment. | 
| log.segment.ms | The time duration for each log segment. This can be used to determine whether old data and logs need to be cleaned up. | 
| log.retention.bytes | The total size threshold of messages stored in Kafka. When the total size exceeds this threshold, Kafka will automatically delete old messages. | 
| log.retention.hours | The retention duration of messages in Kafka. Messages exceeding this time limit will be automatically deleted. |
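
As a complement to these metrics, the space each partition occupies can be measured directly on the broker's disk. The sketch below assumes the usual layout in which each partition is a subdirectory named topic-partition under the broker's log directory; the `partition_sizes` function and the example path are hypothetical.

```python
import os

def partition_sizes(log_dir: str) -> dict:
    """Map each partition directory name to the total size of its files in bytes."""
    sizes = {}
    for entry in os.scandir(log_dir):
        # Partition directories are assumed to be named "<topic>-<partition>".
        if not entry.is_dir() or "-" not in entry.name:
            continue
        total = 0
        for f in os.scandir(entry.path):
            if f.is_file():
                total += f.stat().st_size  # segment, index, and time index files
        sizes[entry.name] = total
    return sizes

if __name__ == "__main__":
    log_dir = "/var/lib/kafka/data"  # placeholder; use the broker's log.dirs value
    if os.path.isdir(log_dir):
        ranked = sorted(partition_sizes(log_dir).items(), key=lambda kv: kv[1], reverse=True)
        for name, size in ranked:
            print(f"{name}: {size} bytes")
```

Sorting by size highlights the partitions that would benefit most from tighter retention or compression settings.
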