| 1 | +# JetStream Read-after-Write |
| 2 | + |
| 3 | +| Metadata | Value | |
| 4 | +|----------|--------------------------------------------------------------| |
| 5 | +| Date | 2025-07-11 | |
| 6 | +| Author | @MauriceVanVeen | |
| 7 | +| Status | Proposed | |
| 8 | +| Tags | jetstream, kv, objectstore, server, client, refinement, 2.12 | |
| 9 | +| Updates | ADR-8, ADR-17, ADR-20, ADR-31, ADR-37 | |
| 10 | + |
| 11 | +| Revision | Date | Author | Info | |
| 12 | +|----------|------------|-----------------|----------------| |
| 13 | +| 1 | 2025-07-11 | @MauriceVanVeen | Initial design | |
| 14 | + |
| 15 | +## Problem Statement |
| 16 | + |
| 17 | +JetStream does NOT support read-after-write or monotonic reads. This can be especially problematic when |
| 18 | +using [ADR-8 JetStream based Key-Value Stores](ADR-8.md), primarily but not limited to the use of _Direct Get_. |
| 19 | + |
| 20 | +Specifically, we have no way to guarantee a write like `kv.Put` can be observed by a subsequent `kv.Get` or `kv.Watch`, |
| 21 | +especially when the KV/stream is replicated or mirrored. |
| 22 | + |
| 23 | +## Context |
| 24 | + |
| 25 | +The topic of immediate consistency within NATS JetStream can be confusing. Our docs claim that we maintain |
| 26 | +immediate consistency (as opposed to eventual consistency) even in the face of failures. That is true but, as with |
| 27 | +anything, it depends. |
| 28 | + |
| 29 | +- **Monotonic writes**: all writes to a single stream (replicated or not) are monotonic. They are ordered by the |
| 30 | + stream sequence, regardless of publisher. |
| 31 | +- **Monotonic reads**, if you're using consumers: all reads for a consumer (replicated or not) are monotonic. They |
| 32 | + are ordered by consumer delivery sequence. (Messages can be redelivered on failure, but this also depends on which |
| 33 | + settings are used.) |
| 34 | + |
| 35 | +Each of those paths is immediately consistent on its own, but they are not immediately consistent with respect to |
| 36 | +each other. This is no problem for publishers and consumers of a stream, because they observe all operations to be |
| 37 | +monotonic. But if you use the KV abstraction, for example, you will more often use single-message gets through |
| 38 | +`kv.Get`. Since those rely on `DirectGet`, even followers can answer, which means we (by default) can't guarantee |
| 39 | +read-after-write or even monotonic reads. Such message GET requests are served randomly by any server within the |
| 40 | +peer group (or even by mirrors, if enabled). Those can't be made immediately consistent, since both replication and |
| 41 | +mirroring are async. |
| 42 | + |
| 43 | +Also, when following up a `kv.Create` with `kv.Keys`, you might expect the returned keys to contain the key you've |
| 44 | +just written. That, too, requires read-after-write. |
| 45 | + |
| 46 | +## Design |
| 47 | + |
| 48 | +Before sharing the proposed design, let's look at an alternative. Read-after-write could be achieved by having reads (on |
| 49 | +an opt-in basis) go through Raft replication first. This has several disadvantages: |
| 50 | + |
| 51 | +- Reads will become significantly slower, due to requiring replication first. |
| 52 | +- Reads require quorum, due to replication, disallowing any reads when there's downtime or temporarily no leader. |
| 53 | +- Only the stream leader can answer reads, as it is the first one to know that it can answer the request. (Followers |
| 54 | + replicate asynchronously, so letting them answer would make the response take even longer to return.) |
| 55 | +- Mirrors can still answer `DirectGet` requests. Because mirrors answer read requests transparently (the client |
| 56 | + will not know), they would violate any read-after-write guarantee. Mirrors must therefore not be enabled if this |
| 57 | + guarantee should be kept. |
| 58 | +- Read-after-write guarantees could temporarily be violated when scaling streams up or down. |
| 59 | +- This is not a compatible approach for consumers, meaning they could not have these guarantees based on this approach. |
| 60 | + |
| 61 | +Although having reads be served through Raft does (mostly) offer a strong guarantee of read-after-write and monotonic |
| 62 | +reads, the disadvantages outweigh the advantages. Ideally, the solution has the following advantages: |
| 63 | + |
| 64 | +- It's explicitly defined, either in configuration or in code. |
| 65 | +- Works for both replicated and non-replicated streams. (Scale up/down has no influence, and implementation is not |
| 66 | + replication-specific) |
| 67 | +- Incurs no slowdown, just as fast as reads that don't guarantee read-after-write (no prior replication required). |
| 68 | +- Let followers, and even mirrors, answer read requests as long as they can make the guarantee. |
| 69 | +- Let followers, and mirrors, inform the client when they can't make the guarantee. The guarantee is always kept, but |
| 70 | + an error is returned that can be retried (to get a successful read). This can be tuned by disabling reads on mirrors |
| 71 | + or followers. |
| 72 | + |
| 73 | +Now, on to the proposed design which has the above advantages. |
| 74 | + |
| 75 | +The write and read paths remain eventually consistent, as they are now. But one can opt in to immediate consistency |
| 76 | +to guarantee read-after-write and monotonic reads, for both direct/msg read requests and consumers. |
| 77 | + |
| 78 | +- **Read-after-write** is achieved because all writes through `js.Publish`, `kv.Put`, etc. return the sequence |
| 79 | + (inherently last sequence) of the stream. In `DirectGet` requests those observed last sequences can be used for read |
| 80 | + requests. |
| 81 | +- **Monotonic reads** are achieved by collecting the highest sequence seen in read requests and using that sequence |
| 82 | + for subsequent read requests. |
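
The client-side bookkeeping described in the two bullets above could be sketched as follows. This is only an illustration: `SeqTracker` and its methods are hypothetical names, not part of any NATS client. It keeps the highest sequence observed from write acknowledgements and reads, and that value is what a client would send as the minimum last sequence on its next read.

```go
package main

import (
	"fmt"
	"sync"
)

// SeqTracker is a hypothetical helper (not part of any NATS client) that
// remembers the highest stream sequence this client has observed.
type SeqTracker struct {
	mu      sync.Mutex
	highest uint64
}

// Observe records a sequence returned by a write ack or seen on a read.
// Lower sequences never reduce the stored value.
func (t *SeqTracker) Observe(seq uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if seq > t.highest {
		t.highest = seq
	}
}

// MinLastSeq returns the value to attach to the next read request.
func (t *SeqTracker) MinLastSeq() uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.highest
}

func main() {
	var t SeqTracker
	t.Observe(10) // sequence returned by a publish or put
	t.Observe(7)  // an older message read afterwards must not lower the floor
	fmt.Println(t.MinLastSeq()) // prints 10
}
```

Feeding write acknowledgements into the tracker gives read-after-write; feeding read results into it gives monotonic reads.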
| 83 | + |
| 84 | +This can be implemented with an additional `MinLastSeq` field in `JSApiMsgGetRequest` and `ConsumerConfig`. |
| 85 | + |
| 86 | +- This ensures the server only replies with data if it can actually guarantee immediate consistency. It does so by |
| 87 | + confirming that the `LastSeq` of its local stream is at least the `MinLastSeq` specified. |
| 88 | +- Side-note: although `MsgGet` is only answered by the leader, technically an old leader could still respond and serve |
| 89 | + stale reads. Although this shouldn't happen often in practice, until now we couldn't guarantee that it never |
| 90 | + happens. With `MinLastSeq`, the old leader can detect the condition and delay its error response, allowing the real |
| 91 | + leader to send the actual answer. |
| 92 | +- Followers/mirrors reject the read request if they can't satisfy the `MinLastSeq`, but can otherwise serve reads and |
| 93 | + share the load. |
| 94 | +- Consumers don't start delivering messages until the `MinLastSeq` is reached. (This ensures `pending` counts are |
| 95 | + correct when following up `kv.Create` with `kv.Keys`, for example.) |
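
As a rough model of the check described in the list above, the sketch below compares a replica's local last sequence with the requested minimum before answering. The `replica` type, `get` method, and both error values are assumptions made for illustration; the ADR does not define the actual error or the server internals.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors; the ADR does not specify the real error responses.
var (
	ErrNotCaughtUp = errors.New("replica has not reached MinLastSeq")
	ErrNotFound    = errors.New("message not found")
)

// replica models one server's local copy of a stream (leader, follower, or mirror).
type replica struct {
	lastSeq uint64
	msgs    map[uint64]string
}

// get answers a read only if the local stream has reached minLastSeq.
// A minLastSeq of 0 means the client did not opt in to the guarantee.
func (r *replica) get(seq, minLastSeq uint64) (string, error) {
	if r.lastSeq < minLastSeq {
		return "", ErrNotCaughtUp
	}
	msg, ok := r.msgs[seq]
	if !ok {
		return "", ErrNotFound
	}
	return msg, nil
}

func main() {
	leader := &replica{lastSeq: 5, msgs: map[uint64]string{4: "old", 5: "new"}}
	follower := &replica{lastSeq: 4, msgs: map[uint64]string{4: "old"}}

	// The client wrote sequence 5 and now asks for it with MinLastSeq = 5.
	_, err := follower.get(5, 5)
	fmt.Println("follower:", err)
	val, err := leader.get(5, 5)
	fmt.Println("leader:", val, err)
}
```

The lagging follower rejects the request instead of serving stale data, while any replica that has caught up can still share the read load.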
| 96 | + |
| 97 | +In terms of API, it can look like this: |
| 98 | + |
| 99 | +```go |
| 100 | +// Write: r is the revision (stream sequence) assigned to this put |
| 101 | +r, err := kv.Put(ctx, "key", []byte("value")) |
| 102 | + |
| 103 | +// Read request |
| 104 | +kve, err := kv.Get(ctx, "key", jetstream.MinLastRevision(r)) |
| 105 | + |
| 106 | +// Watch/consumer |
| 107 | +kl, err := kv.ListKeys(ctx, jetstream.MinLastRevision(r)) |
| 108 | +``` |
| 109 | + |
| 110 | +By specifying the `MinLastRevision` (or `MinLastSequence` when using a stream directly), you can be sure that a |
| 111 | +follower either rejects a read request it can't satisfy or, for a consumer, waits to deliver messages until it is |
| 112 | +up to date. |
| 113 | + |
| 114 | +This satisfies read-after-write and monotonic reads when combining the write and read paths. |
| 115 | + |
| 116 | +## Decision |
| 117 | + |
| 118 | +[Maybe this was just an architectural decision...] |
| 119 | + |
| 120 | +## Consequences |
| 121 | + |
| 122 | +Since this is opt-in per read request or consumer creation, this is not a breaking change. Depending on the client |
| 123 | +implementation, wiring it through could take some work. But given that it is just another field in |
| 124 | +`JSApiMsgGetRequest` and `ConsumerConfig`, each client should have no trouble supporting it. |