This diagram illustrates the concept of offset in Kafka.
Kafka maintains a numerical offset for each record in a partition. This offset acts as a unique identifier of a record within that partition. It also denotes the position of the consumer in the partition.
There are two notions of position or offset relevant to the user of the consumer:
- Current offset - The position of the consumer is the offset of the next record that will be given out. It is one larger than the highest offset the consumer has read in that partition, and it advances automatically every time the consumer receives messages in a call to poll(Duration).
- Committed offset - The committed position is the last offset that has been stored securely. If the process fails and restarts, this is the offset that the consumer will recover to.
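To make the difference between the two positions concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not the Kafka client: the class OffsetModel and its methods are hypothetical names chosen for illustration, and it assumes a single partition whose committed offset starts at 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one consumer's offsets in one partition (hypothetical, not the Kafka API). */
final class OffsetModel {
    private long position = 0;   // current offset: the next record to hand out
    private long committed = 0;  // committed offset: the last position stored durably

    /** Hands out up to maxRecords offsets below logEndOffset and advances the position. */
    List<Long> poll(long logEndOffset, int maxRecords) {
        List<Long> records = new ArrayList<>();
        while (position < logEndOffset && records.size() < maxRecords) {
            records.add(position);
            position++;
        }
        return records;
    }

    /** Stores the current position as the committed offset. */
    void commit() {
        committed = position;
    }

    /** Simulates a crash and restart: the consumer resumes from the committed offset. */
    void restart() {
        position = committed;
    }

    long position() { return position; }

    long committed() { return committed; }

    public static void main(String[] args) {
        OffsetModel consumer = new OffsetModel();
        consumer.poll(10, 4);   // hands out offsets 0..3, position becomes 4
        consumer.commit();      // committed becomes 4
        consumer.poll(10, 3);   // hands out offsets 4..6, position becomes 7
        consumer.restart();     // uncommitted progress is lost, position is 4 again
        System.out.println("position=" + consumer.position()
                + " committed=" + consumer.committed());
    }
}
```

In this model, the records at offsets 4 to 6 are handed out a second time after the restart, because they were read but never committed.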
In the Kafka consumer client, offsets can be committed automatically at a periodic interval. This is the default behavior, and it is configurable via enable.auto.commit. Alternatively, offsets can be committed manually by calling one of the commit APIs (e.g. commitSync and commitAsync).
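As a rough illustration of the two modes, the sketch below builds consumer configuration with java.util.Properties only, so it runs without the Kafka client on the classpath. The class and method names are hypothetical, and the broker address and group id are placeholder values. The keys enable.auto.commit and auto.commit.interval.ms are standard consumer settings. A real consumer would also need key and value deserializers, which are left out here.

```java
import java.util.Properties;

/** Hypothetical helper that prepares consumer settings for each commit mode. */
final class ConsumerOffsetConfig {

    /** Settings for periodic automatic commits at the given interval. */
    static Properties autoCommitProps(String bootstrapServers, String groupId, long intervalMs) {
        Properties props = baseProps(bootstrapServers, groupId);
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", Long.toString(intervalMs));
        return props;
    }

    /** Settings for manual commits: the application decides when offsets are stored. */
    static Properties manualCommitProps(String bootstrapServers, String groupId) {
        Properties props = baseProps(bootstrapServers, groupId);
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    private static Properties baseProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" and "demo-group" are placeholders for illustration.
        Properties manual = manualCommitProps("localhost:9092", "demo-group");
        System.out.println(manual.getProperty("enable.auto.commit"));
    }
}
```

With the manual settings, the application would typically call commitSync or commitAsync after it has finished processing the records returned by a poll, so that the committed offset never runs ahead of the work actually done.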