Deinition

One naive approach for solving the event-time windowing problem would be to simply base our event-time windows on the current processing time. But data processing and transport is not instantaneous, so processing and event times are almost never equal. Any hiccup or spike in our pipeline might cause us to incorrectly assign messages to windows.

Another intuitive, but ultimately incorrect, approach would be to consider the rate of messages processed by the pipeline. Although this is an interesting metric, the rate may vary arbitrarily with changes in input, variability of expected results, resources available for processing, and so on. Even more important, rate does not help answer the fundamental questions of completeness. Specifically, rate does not tell us when we have seen all of the messages for a particular time interval. (A rate metric could never tell us that a single message is failing to make progress through our pipeline.)

We require a more robust measure of progress. To arrive there, we make one fundamental assumption about our streaming data: each message has an associated logical event timestamp.

Messages are ingested by the pipeline, processed, and eventually marked completed. Each message is either “in-flight,” meaning that it has been received but not yet completed, or “completed,” meaning that no more processing on behalf of this message is required.

Watermark have two fundamental usage:

  1. Completeness: If the watermark has advanced past some timestamp T, we are guaranteed by its monotonic property that no more processing will occur for on-time (nonlate data) events at or before T.
  2. Visibility: If a message is stuck in our pipeline for any reason, the watermark cannot advance.

Source Watermark Creation

Heuristic watermark creation, on the other hand, creates a watermark that is merely an estimate that no data with event times less than the watermark will ever be seen again. If the heuristic is a reasonably good one, the amount of late data might be very small, and the watermark remains useful as a completion estimate.

Method: By tracking the minimum event times of unprocessed data in the existing set of log files, monitoring growth rates, and utilizing external information like network topology and bandwidth availability, you can create a remarkably accurate watermark, even given the lack of perfect knowledge of all the inputs.

Watermark Propagation

We give the following definitions for the watermarks at the boundaries of stages:

  • An input watermark, which captures the progress of everything upstream of that stage
  • An output watermark, which captures the progress of the stage itself, and is essentially defined as the minimum of the stage’s input watermark and the event times of all nonlate data active messages within the stage.

Flink Example

Flink performs watermark tracking and aggregation in-band (no centralized watermark propagation monitoring, watermark flow through the pipeline).

A Flink pipeline with two sources and event-time watermarks propagating in-band.

In this pipeline data is generated at two sources. These sources also both generate watermark “checkpoints” that are sent synchronously in-band with the data stream. This means that when a watermark checkpoint from source A for timestamp “53” is emitted, it guarantees that no nonlate data messages will be emitted from source A with timestamp behind “53”. The downstream “keyBy” operators consume the input data and the watermark checkpoints. As new watermark checkpoints are consumed, the downstream operators’ view of the watermark is advanced, and a new watermark checkpoint for downstream operators can be emitted.

Streaming System Monitoring

Normally we would like to monitor event watermark. However, this is not enough to distinguish between old data and a delayed system. In other words, by only examining the event-timee watermark, we cannot distinguish between a system that is processing data from an hour ago quickly and without delay, and a system that is attempting to process real-time data and has been delayed for an hour while doing so.

To solve this problem, we have to look into another time domain - processing-time watermark. The processing-time watermark, therefore, provides a notion of processing delay separate from the data delay.

Event-time watermark increasing. It is not possible to know from this information whether this is due to data buffering or system processing delay.

We seed that the delay is monotonically increaing, but there is not enough information to distinguish between the cases of a stuck system and stuck data. Only by looking at the processing-time watermark, can we distinuighs the cases.

Processing-time watermark also increasing. This indicates that the system processing is delayed.

In first case, when we examine the processing-time watermark delay we see that it too is increasing. This tells us that an operation in our system is stuck, and the stuckness is also causing the data delay to fall behind.

In next case, the processing-time watermark delay is small. This tells us that there are no stuck operations. The event-time watermark delay is still increasing, which indicates that we have some buffered state that we are waiting to drain.

Event-time watermark delay increasing, processing-time watermark stable. This is an indication that data are buffered in the system and waiting to be processed, rather than an indication that a system operation is preventing data processing from completing.

This is possible, for example, if we are buffering some state while waiting for a window boundary to emit an aggregation, and corresponds to a normal operation of the pipeline:

Event-time watermark delay increasing, processing-time watermark stable. This is an indication that data are buffered in the system and waiting to be processed, rather than an indication that a system operation is preventing data processing from completing.

Therefore, the processing-time watermark is a useful tool in distinguishing system latency from data latency. In addition to visibility, we can use the processing-time watermark at the system-implementation level for tasks such as garbage collection of temporary state.

Leave a Reply

Your email address will not be published. Required fields are marked *