Commit d1906d5

ADR-52: JetStream Read-after-Write
Signed-off-by: Maurice van Veen <[email protected]>
1 parent 488f364 commit d1906d5

4 files changed

+155
-16
lines changed


README.md

Lines changed: 8 additions & 0 deletions
@@ -20,6 +20,7 @@ This repository captures Architecture, Design Specifications and Feature Guidanc
2020
|-----|----|-----------|
2121
|[ADR-49](adr/ADR-49.md)|jetstream, spec, 2.12|JetStream Distributed Counter CRDT|
2222
|[ADR-50](adr/ADR-50.md)|jetstream, server, client, 2.12|JetStream Batch Publishing|
23+
|[ADR-52](adr/ADR-52.md)|jetstream, kv, objectstore, server, client, spec, refinement, 2.12|JetStream Read-after-Write (updating [ADR-8](adr/ADR-8.md), [ADR-17](adr/ADR-17.md), [ADR-20](adr/ADR-20.md), [ADR-31](adr/ADR-31.md), [ADR-37](adr/ADR-37.md))|
2324

2425
## Client
2526

@@ -53,6 +54,7 @@ This repository captures Architecture, Design Specifications and Feature Guidanc
5354
|[ADR-47](adr/ADR-47.md)|client, spec, orbit|Request Many|
5455
|[ADR-48](adr/ADR-48.md)|jetstream, client, kv, refinement, 2.11|TTL Support for Key-Value Buckets (updating [ADR-8](adr/ADR-8.md))|
5556
|[ADR-50](adr/ADR-50.md)|jetstream, server, client, 2.12|JetStream Batch Publishing|
57+
|[ADR-52](adr/ADR-52.md)|jetstream, kv, objectstore, server, client, spec, refinement, 2.12|JetStream Read-after-Write (updating [ADR-8](adr/ADR-8.md), [ADR-17](adr/ADR-17.md), [ADR-20](adr/ADR-20.md), [ADR-31](adr/ADR-31.md), [ADR-37](adr/ADR-37.md))|
5658

5759
## Jetstream
5860

@@ -82,6 +84,7 @@ This repository captures Architecture, Design Specifications and Feature Guidanc
8284
|[ADR-48](adr/ADR-48.md)|jetstream, client, kv, refinement, 2.11|TTL Support for Key-Value Buckets (updating [ADR-8](adr/ADR-8.md))|
8385
|[ADR-49](adr/ADR-49.md)|jetstream, spec, 2.12|JetStream Distributed Counter CRDT|
8486
|[ADR-50](adr/ADR-50.md)|jetstream, server, client, 2.12|JetStream Batch Publishing|
87+
|[ADR-52](adr/ADR-52.md)|jetstream, kv, objectstore, server, client, spec, refinement, 2.12|JetStream Read-after-Write (updating [ADR-8](adr/ADR-8.md), [ADR-17](adr/ADR-17.md), [ADR-20](adr/ADR-20.md), [ADR-31](adr/ADR-31.md), [ADR-37](adr/ADR-37.md))|
8588

8689
## Kv
8790

@@ -90,13 +93,15 @@ This repository captures Architecture, Design Specifications and Feature Guidanc
9093
|[ADR-8](adr/ADR-8.md)|jetstream, client, kv, spec|JetStream based Key-Value Stores|
9194
|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views|
9295
|[ADR-48](adr/ADR-48.md)|jetstream, client, kv, refinement, 2.11|TTL Support for Key-Value Buckets (updating [ADR-8](adr/ADR-8.md))|
96+
|[ADR-52](adr/ADR-52.md)|jetstream, kv, objectstore, server, client, spec, refinement, 2.12|JetStream Read-after-Write (updating [ADR-8](adr/ADR-8.md), [ADR-17](adr/ADR-17.md), [ADR-20](adr/ADR-20.md), [ADR-31](adr/ADR-31.md), [ADR-37](adr/ADR-37.md))|
9397

9498
## Objectstore
9599

96100
|Index|Tags|Description|
97101
|-----|----|-----------|
98102
|[ADR-19](adr/ADR-19.md)|jetstream, client, kv, objectstore|API prefixes for materialized JetStream views|
99103
|[ADR-20](adr/ADR-20.md)|jetstream, client, objectstore, spec|JetStream based Object Stores|
104+
|[ADR-52](adr/ADR-52.md)|jetstream, kv, objectstore, server, client, spec, refinement, 2.12|JetStream Read-after-Write (updating [ADR-8](adr/ADR-8.md), [ADR-17](adr/ADR-17.md), [ADR-20](adr/ADR-20.md), [ADR-31](adr/ADR-31.md), [ADR-37](adr/ADR-37.md))|
100105

101106
## Observability
102107

@@ -116,6 +121,7 @@ This repository captures Architecture, Design Specifications and Feature Guidanc
116121
|Index|Tags|Description|
117122
|-----|----|-----------|
118123
|[ADR-48](adr/ADR-48.md)|jetstream, client, kv, refinement, 2.11|TTL Support for Key-Value Buckets (updating [ADR-8](adr/ADR-8.md))|
124+
|[ADR-52](adr/ADR-52.md)|jetstream, kv, objectstore, server, client, spec, refinement, 2.12|JetStream Read-after-Write (updating [ADR-8](adr/ADR-8.md), [ADR-17](adr/ADR-17.md), [ADR-20](adr/ADR-20.md), [ADR-31](adr/ADR-31.md), [ADR-37](adr/ADR-37.md))|
119125

120126
## Security
121127

@@ -153,6 +159,7 @@ This repository captures Architecture, Design Specifications and Feature Guidanc
153159
|[ADR-43](adr/ADR-43.md)|jetstream, client, server, 2.11|JetStream Per-Message TTL|
154160
|[ADR-44](adr/ADR-44.md)|jetstream, server, 2.11|Versioning for JetStream Assets|
155161
|[ADR-50](adr/ADR-50.md)|jetstream, server, client, 2.12|JetStream Batch Publishing|
162+
|[ADR-52](adr/ADR-52.md)|jetstream, kv, objectstore, server, client, spec, refinement, 2.12|JetStream Read-after-Write (updating [ADR-8](adr/ADR-8.md), [ADR-17](adr/ADR-17.md), [ADR-20](adr/ADR-20.md), [ADR-31](adr/ADR-31.md), [ADR-37](adr/ADR-37.md))|
156163

157164
## Spec
158165

@@ -165,6 +172,7 @@ This repository captures Architecture, Design Specifications and Feature Guidanc
165172
|[ADR-40](adr/ADR-40.md)|client, server, spec|NATS Connection|
166173
|[ADR-47](adr/ADR-47.md)|client, spec, orbit|Request Many|
167174
|[ADR-49](adr/ADR-49.md)|jetstream, spec, 2.12|JetStream Distributed Counter CRDT|
175+
|[ADR-52](adr/ADR-52.md)|jetstream, kv, objectstore, server, client, spec, refinement, 2.12|JetStream Read-after-Write (updating [ADR-8](adr/ADR-8.md), [ADR-17](adr/ADR-17.md), [ADR-20](adr/ADR-20.md), [ADR-31](adr/ADR-31.md), [ADR-37](adr/ADR-37.md))|
168176

169177
## Deprecated
170178

adr/ADR-31.md

Lines changed: 19 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -7,10 +7,11 @@
77
| Status | Implemented |
88
| Tags | jetstream, client, server, 2.11 |
99

10-
| Revision | Date | Author | Info |
11-
|----------|------------|------------|------------------------------------------------|
12-
| 1 | 2022-08-08 | @tbeets | Initial design |
13-
| 2 | 2024-03-06 | @ripienaar | Adds Multi and Batch behaviors for Server 2.11 |
10+
| Revision | Date | Author | Info | Refinement | Server Requirement |
11+
|----------|------------|-----------------|------------------------------------------------|------------|--------------------|
12+
| 1 | 2022-08-08 | @tbeets | Initial design | | |
13+
| 2 | 2024-03-06 | @ripienaar | Adds Multi and Batch behaviors for Server 2.11 | | |
14+
| 3 | 2025-07-11 | @MauriceVanVeen | Update on Read-after-Write guarantee | ADR-52 | |
1415

1516
## Context and motivation
1617

@@ -41,14 +42,20 @@ clients. Also, read availability can be enhanced as mirrors may be available to
4142

4243
###### A note on read-after-write coherency
4344

44-
The existing Get API `$JS.API.STREAM.MSG.GET.<stream>` provides read-after-write coherency by routing requests to a
45-
stream's current peer leader (R>1) or single server (R=1). A client that publishes a message to stream (with ACK) is
46-
assured that a subsequent call to the Get API will return that message as the read will go a server that defines
47-
_most current_.
45+
The existing Get API `$JS.API.STREAM.MSG.GET.<stream>` as well as _Direct Get_ do NOT provide any read-after-write
46+
guarantees by default. The existing Get API only guarantees read-after-write if the underlying stream is not
47+
replicated (R=1).
4848

49-
In contrast, _Direct Get_ does not assure read-after-write coherency as responders may be non-leader stream servers
50-
(that may not have yet applied the latest consensus writes) or MIRROR downstream servers that have not yet _consumed_
51-
the latest consensus writes from upstream.
49+
_Direct Get_ does not assure read-after-write coherency as responders may be non-leader stream servers (that may not
50+
have yet applied the latest consensus writes) or MIRROR downstream servers that have not yet _consumed_ the latest
51+
consensus writes from upstream.
52+
53+
The Get API routes requests to a stream's current peer leader (R>1). A client that publishes multiple messages to a
54+
stream (with ACK) is assured that they will be properly ordered by sequence, regardless of which peer leader was active
55+
at that time. However, during and after leader elections, calls to the Get API could still be served by a server that
56+
still thinks it's leader even if a new leader was elected in the meantime (but it doesn't know yet).
57+
58+
Read-after-write guarantees can be opted into with [ADR-52](adr/ADR-52.md).
5259

5360
## Implementation
5461

@@ -60,7 +67,7 @@ the latest consensus writes from upstream.
6067
based on `max_msgs_per_subject`
6168

6269
> Allow Direct is set automatically based on the inferred use case of the stream. Maximum messages per subject is a
63-
tell-tale of a stream that is a KV bucket.
70+
> tell-tale of a stream that is a KV bucket.
6471
6572
### Direct Get API
6673

adr/ADR-52.md

Lines changed: 123 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,123 @@
1+
# JetStream Read-after-Write
2+
3+
| Metadata | Value |
4+
|----------|--------------------------------------------------------------------|
5+
| Date | 2025-07-11 |
6+
| Author | @MauriceVanVeen |
7+
| Status | Proposed |
8+
| Tags | jetstream, kv, objectstore, server, client, spec, refinement, 2.12 |
9+
| Updates | ADR-8, ADR-17, ADR-20, ADR-31, ADR-37 |
10+
11+
| Revision | Date | Author | Info |
12+
|----------|------------|-----------------|----------------|
13+
| 1 | 2025-07-11 | @MauriceVanVeen | Initial design |
14+
15+
## Problem Statement
16+
17+
JetStream does NOT support read-after-write or monotonic reads. This can be especially problematic when
18+
using [ADR-8 JetStream based Key-Value Stores](ADR-8.md), primarily but not limited to the use of _Direct Get_.
19+
20+
Specifically, we have no way to guarantee a write like `kv.Put` can be observed by a subsequent `kv.Get` or `kv.Watch`,
21+
especially when the KV/stream is replicated or mirrored.
22+
23+
## Context
24+
25+
The topic of immediate consistency within NATS JetStream can sometimes be a bit confusing. In our docs we claim we
26+
maintain immediate consistency (as opposed to eventual consistency) even in the face of failures. That is true, but,
27+
as with anything, it depends.
28+
29+
- **Monotonic writes**, all writes to a single stream (replicated or not) are monotonic. It's ordered regardless of
30+
publisher by the stream sequence.
31+
- **Monotonic reads**, if you're using consumers. All reads for a consumer (replicated or not) are monotonic. It's
32+
ordered by consumer delivery sequence. (Messages can be redelivered on failure, but this also depends on which
33+
settings are used)
34+
35+
Those paths are immediately consistent, but they are not immediately consistent with respect to each other. This is no
36+
problem for publishers and consumers of a stream, because they observe all operations to be monotonic.
37+
But, if you use the KV abstraction for example, you're more often going to use single message gets through `kv.Get`.
38+
Since those rely on `DirectGet`, even followers can answer, which means we (by default) can't guarantee read-after-write
39+
or even monotonic reads. Such message gets get served randomly by all servers within the peer group (or even mirrors if
40+
enabled). Those obviously can't be made immediately consistent, since both replication and mirroring are async.
41+
42+
Also, when following up a `kv.Create` with `kv.Keys`, you might expect read-after-write such that the returned keys
43+
contain the key you've just written. This also requires read-after-write.
44+
45+
## Design
46+
47+
Before sharing the proposed design, let's look at an alternative. Read-after-write could be achieved by having reads (on
48+
an opt-in basis) go through Raft replication first. This has several disadvantages:
49+
50+
- Reads will become significantly slower, due to requiring replication first.
51+
- Reads require quorum, due to replication, disallowing any reads when there's downtime or temporarily no leader.
52+
- Only the stream leader can answer reads, as it is the first one to know that it can answer the request. (Followers
53+
replicate asynchronously, so letting them answer would make the response take even longer to return.)
54+
- Mirrors can still answer `DirectGet` requests, the transparency of mirrors answering read requests will violate any
55+
read-after-write guarantees (as the client will not know). This would mean mirrors must not be enabled if this
56+
guarantee should be kept.
57+
- Read-after-write guarantees could temporarily be violated when scaling streams up or down.
58+
- This is not a compatible approach for consumers, meaning they could not have these guarantees based on this approach.
59+
60+
Although having reads be served through Raft does (mostly) offer a strong guarantee of read-after-write and monotonic
61+
reads, the disadvantages outweigh the advantages. Ideally, the solution has the following advantages:
62+
63+
- It's explicitly defined, either in configuration or in code.
64+
- Works for both replicated and non-replicated streams. (Scale up/down has no influence, and implementation is not
65+
replication-specific)
66+
- Incurs no slowdown, just as fast as reads that don't guarantee read-after-write (no prior replication required).
67+
- Lets followers, and even mirrors, answer read requests as long as they can make the guarantee.
68+
- Lets followers, and mirrors, inform the client when they can't make the guarantee. The guarantee is always kept, but
69+
an error is returned that can be retried (to get a successful read). This can be tuned by disabling reads on mirrors
70+
or followers.
71+
72+
Now, on to the proposed design which has the above advantages.
73+
74+
The write and read paths remain eventually consistent as they are now. But one can opt in to immediate consistency to
75+
guarantee read-after-write and monotonic reads, for both direct/msg read requests as well as consumers.
76+
77+
- **Read-after-write** is achieved because all writes through `js.Publish`, `kv.Put`, etc. return the sequence
78+
(inherently last sequence) of the stream. In `DirectGet` requests those observed last sequences can be used for read
79+
requests.
80+
- **Monotonic reads** is achieved by collecting the highest sequence seen in read requests and using that sequence for
81+
subsequent read requests.
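
The bookkeeping described in the two points above can be sketched on the client side as follows. This is a minimal illustration only: `SeqTracker`, `Observe` and `MinLastSeq` are hypothetical names chosen for this sketch and are not part of any NATS client library.

```go
package main

import (
	"fmt"
	"sync"
)

// SeqTracker is a hypothetical client-side helper (an assumption of this
// sketch, not an existing API). It remembers the highest stream sequence
// this client has observed, from write acks as well as from reads.
type SeqTracker struct {
	mu      sync.Mutex
	highest uint64
}

// Observe records a sequence returned by a write ack or carried by a read.
// Older sequences never lower the recorded value.
func (t *SeqTracker) Observe(seq uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if seq > t.highest {
		t.highest = seq
	}
}

// MinLastSeq returns the lowest last sequence a server must have reached
// before it may answer this client's next read.
func (t *SeqTracker) MinLastSeq() uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.highest
}

func main() {
	var t SeqTracker
	t.Observe(10) // sequence from a write ack: enables read-after-write
	t.Observe(12) // a read returned a newer message: enables monotonic reads
	t.Observe(11) // an older sequence does not lower the floor
	fmt.Println(t.MinLastSeq())
}
```

Feeding both write acks and read results into the same tracker is what lets a single value cover both guarantees.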
82+
83+
This can be implemented with an additional `MinLastSeq` field in `JSApiMsgGetRequest` and `ConsumerConfig`.
84+
85+
- This ensures the server only replies with data if it can actually 100% guarantee immediate consistency. This is done
86+
by confirming the `LastSeq` it has for its local stream is at least the `MinLastSeq` specified.
87+
- Side-note: although `MsgGet` is only answered by the leader, technically an old leader could still respond and serve
88+
stale reads. Although this shouldn't happen often in practice, until now we couldn't guarantee it. The error can be
89+
detected on the old leader, and it can delay the error response, allowing for the real leader to send the actual
90+
answer.
91+
- Followers/mirrors reject the read request if they can't satisfy the `MinLastSeq`, but can serve reads and share the
92+
load otherwise.
93+
- Consumers don't start delivering messages until the `MinLastSeq` is reached. (To ensure `pending` counts are correct
94+
when following up `kv.Create` with `kv.Keys` for example)
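
The check a responder performs for read requests could look roughly like the sketch below. It models a server's local stream state with plain Go types; `replica`, `getRequest`, `errBehind` and `errNotFound` are assumptions made for illustration, and the actual error format is not defined in this ADR.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors; the real wire-level responses are not specified here.
var (
	errBehind   = errors.New("local stream is behind the requested MinLastSeq")
	errNotFound = errors.New("no message found")
)

// replica models one server's local copy of a stream (illustrative only).
type replica struct {
	lastSeq uint64            // last sequence applied to the local stream
	msgs    map[string]string // latest value per subject
}

// getRequest models a get request carrying the proposed MinLastSeq field.
type getRequest struct {
	Subject    string
	MinLastSeq uint64 // zero means no guarantee was requested
}

// get answers only if the local stream has reached MinLastSeq.
func (r *replica) get(req getRequest) (string, error) {
	if r.lastSeq < req.MinLastSeq {
		return "", errBehind
	}
	v, ok := r.msgs[req.Subject]
	if !ok {
		return "", errNotFound
	}
	return v, nil
}

func main() {
	follower := &replica{lastSeq: 4, msgs: map[string]string{"key": "old"}}
	leader := &replica{lastSeq: 5, msgs: map[string]string{"key": "new"}}
	req := getRequest{Subject: "key", MinLastSeq: 5}

	_, err := follower.get(req)
	fmt.Println("follower:", err)

	v, err := leader.get(req)
	fmt.Println("leader:", v, err)
}
```

A request that leaves `MinLastSeq` at zero is answered by any replica, which matches today's behavior.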
95+
96+
In terms of API, it can look like this:
97+
98+
```go
// Write
r, err := kv.Put(ctx, "key", []byte("value"))

// Read request
kve, err := kv.Get(ctx, "key", jetstream.MinLastRevision(r))

// Watch/consumer
kl, err := kv.ListKeys(ctx, jetstream.MinLastRevision(r))
```
108+
109+
By specifying the `MinLastRevision` (or `MinLastSequence` when using a stream normally), you can be sure your read
110+
request will be rejected by a follower if it can't be satisfied, or the follower will wait to deliver you messages from
111+
the consumer until it's up-to-date.
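
Because a rejected read is retryable, a client could wrap reads in a small retry loop. The sketch below assumes a hypothetical `errBehind` error and a `readFunc` callback standing in for one read attempt that may land on any server; neither is an existing client API.

```go
package main

import (
	"errors"
	"fmt"
)

// errBehind is a hypothetical error for a read rejected due to MinLastSeq.
var errBehind = errors.New("server is behind the requested MinLastSeq")

// readFunc stands in for a single read attempt (an assumption of this
// sketch). Each attempt may be answered by a different server.
type readFunc func(minLastSeq uint64) (string, error)

// readWithRetry retries only when the read was rejected for being behind.
// Any other error is returned immediately.
func readWithRetry(read readFunc, minLastSeq uint64, maxAttempts int) (string, error) {
	if maxAttempts < 1 {
		return "", errors.New("maxAttempts must be at least 1")
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		var v string
		v, err = read(minLastSeq)
		if err == nil {
			return v, nil
		}
		if !errors.Is(err, errBehind) {
			return "", err
		}
	}
	return "", fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}

func main() {
	// Simulated servers with different local last sequences.
	lastSeqs := []uint64{3, 4, 5}
	next := 0
	read := func(minLastSeq uint64) (string, error) {
		seq := lastSeqs[next%len(lastSeqs)]
		next++
		if seq < minLastSeq {
			return "", errBehind
		}
		return "value", nil
	}

	v, err := readWithRetry(read, 5, 3)
	fmt.Println(v, err)
}
```

A production client would likely add a backoff between attempts; it is omitted here to keep the sketch deterministic.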
112+
113+
This satisfies read-after-write and monotonic reads when combining the write and read paths.
114+
115+
## Decision
116+
117+
[Maybe this was just an architectural decision...]
118+
119+
## Consequences
120+
121+
Since this is an opt-in on a read request or consumer create basis, this is not a breaking change. Depending on client
122+
implementation, this could be harder to implement. But given it's just another field in the `JSApiMsgGetRequest` and
123+
`ConsumerConfig`, each client should have no trouble supporting it.

adr/ADR-8.md

Lines changed: 5 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@
2222
| 7 | 2025-01-23 | Add Max Age limit Markers, remove non direct gets | ADR-48 | 2.11.0 |
2323
| 8 | 2025-02-17 | Add Metadata | | 2.10.0 |
2424
| 9 | 2025-04-09 | Document max_age and duplicate_window requirements | | |
25+
| 10 | 2025-07-11 | Update on Read-after-Write guarantee | ADR-52 | |
2526

2627
## Context
2728

@@ -291,12 +292,12 @@ The features to support KV is in NATS Server 2.6.0.
291292

292293
#### Consistency Guarantees
293294

294-
We do not provide read-after-write consistency. Reads are performed directly to any replica, including out
295-
of date ones. If those replicas do not catch up multiple reads of the same key can give different values between
296-
reads. If the cluster is healthy and performing well most reads would result in consistent values, but this should not
295+
We do not provide read-after-write consistency by default. Reads are performed directly to any replica, including
296+
out-of-date ones. If those replicas do not catch up, multiple reads of the same key can give different values between
297+
reads. If the cluster is healthy and performing well, most reads would result in consistent values, but this should not
297298
be relied on to be true.
298299

299-
Historically we had read-after-write consistency, this has been deprecated and retained here for historical record only.
300+
Read-after-write guarantees can be opted into with [ADR-52](adr/ADR-52.md).
300301

301302
#### Buckets
302303
