Felix Angell

Flink

  1. A variant of the chandley-lamport algorithm is used for snapshotting.
  2. The exactly once guarantee does not mean that every event is processed exactly once. It means that every event will affect the state managed by Flink exactly once.

Kafka

  1. To delete something from a log compacted topic you can produce a null payload (0 bytes) with the same key as a 'tombstone'. Log compaction will delete this record when it kicks in.

Simple Storage Service (S3)

  1. Deleting billions objects can be expensive if you call the delete APIs. The List call is what costs. In theory if you know where the sources are you can delete everything for free. Another option is to set the expiry time=now.
  2. Keys are case-sensitive
  3. Paths are fake. /builds/1/installer.exe is the name of the key there are no directories.