Simple PUT Retry Logic
Problem
When a client initiates a PUT operation, if the selected target peer doesn't respond with SuccessfulPut, the operation fails permanently. This causes:
- River chat updates to fail silently
- Poor user experience when a single peer is unreachable
- Unnecessary failures when alternative peers are available
Current Behavior
// In request_put()
let target = op_manager
    .ring
    .closest_potentially_caching(&key, [&sender.peer].as_slice())
    .into_iter()
    .next()
    .ok_or(RingError::EmptyRing)?;
// Send RequestPut to target
// If no SuccessfulPut received → operation fails permanently
Proposed Solution
Add simple retry logic with alternative peers:
pub enum PutState {
    // ... existing variants ...
    AwaitingResponse {
        key: ContractKey,
        // ... other fields ...
        retry_count: usize,
        tried_peers: HashSet<PeerId>,
    },
}
// When a timeout occurs (no SuccessfulPut within ~500ms-2s):
if retry_count < MAX_RETRIES {
    // Add the peer that just timed out to tried_peers first,
    // so it is excluded from the candidates below
    // Get alternative peers
    let candidates = op_manager
        .ring
        .k_closest_potentially_caching(&key, &tried_peers, 5);
    if let Some(next_peer) = candidates.first() {
        // Send RequestPut to next_peer
        // Increment retry_count
    }
}
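The candidate selection above depends on excluding peers that were already tried. A minimal, self-contained sketch of that idea follows. It is not the actual ring API: `k_closest_excluding`, `ring_distance`, the `u32` peer id, and the assumption that peers sit at locations in [0, 1) on a wrap-around ring are all hypothetical stand-ins for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the real peer identifier type.
type PeerId = u32;

/// Distance between two locations on a ring of circumference 1.0
/// (an assumption about how peers are placed, for illustration only).
fn ring_distance(a: f64, b: f64) -> f64 {
    let d = (a - b).abs();
    d.min(1.0 - d)
}

/// Returns up to `k` peers nearest to `key_location`, closest first,
/// skipping every peer listed in `excluded`.
fn k_closest_excluding(
    peers: &[(PeerId, f64)],
    key_location: f64,
    excluded: &HashSet<PeerId>,
    k: usize,
) -> Vec<PeerId> {
    let mut candidates: Vec<(PeerId, f64)> = peers
        .iter()
        .filter(|(id, _)| !excluded.contains(id))
        .map(|&(id, loc)| (id, ring_distance(loc, key_location)))
        .collect();
    // Sort by distance to the key; total_cmp gives a total order on f64.
    candidates.sort_by(|a, b| a.1.total_cmp(&b.1));
    candidates.into_iter().take(k).map(|(id, _)| id).collect()
}
```

With the failed peer in `excluded`, the first element of the result is the natural retry target; an empty result means there is nobody left to try.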
Key Points
- Simple retry only: We only retry the initial PUT request to the first peer
- No propagation tracking: Once any peer sends SuccessfulPut, they have responsibility
- Fast timeout: 500ms-2s per attempt (not the 60-second operation TTL)
- Limited retries: ~5-10 attempts max to avoid infinite loops
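The distinction between the per-attempt timeout and the overall operation TTL can be sketched as a small pure function. This is only an illustration under assumptions: `ATTEMPT_TIMEOUT` is set to 1 second as an arbitrary value inside the 500ms-2s range, and the names `Deadline` and `check_deadline` are hypothetical, not existing code.

```rust
use std::time::{Duration, Instant};

/// Assumed per-attempt timeout, picked from the 500ms-2s range above.
const ATTEMPT_TIMEOUT: Duration = Duration::from_secs(1);
/// The 60-second operation TTL mentioned above.
const OPERATION_TTL: Duration = Duration::from_secs(60);

#[derive(Debug, PartialEq, Eq)]
enum Deadline {
    /// Still within the current attempt's window.
    Waiting,
    /// The current attempt timed out; try another peer.
    RetryAttempt,
    /// The whole operation exceeded its TTL; stop retrying.
    OperationExpired,
}

/// Classifies `now` relative to the operation start and the current attempt start.
fn check_deadline(op_started: Instant, attempt_started: Instant, now: Instant) -> Deadline {
    if now.duration_since(op_started) >= OPERATION_TTL {
        Deadline::OperationExpired
    } else if now.duration_since(attempt_started) >= ATTEMPT_TIMEOUT {
        Deadline::RetryAttempt
    } else {
        Deadline::Waiting
    }
}
```

Checking the operation TTL first means the retry budget can never extend an operation past its existing lifetime.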
Implementation Approach
- Add retry fields to PutState::AwaitingResponse
- Add timeout detection in PUT operation processing
- On timeout, select next peer and retry
- On SuccessfulPut, complete operation normally
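The steps above could fit together roughly as the following state transition. This is a simplified model, not the real operation code: the `PutState` here is a reduced, hypothetical version with only the fields needed for retries, `Event` and `step` are invented names, `MAX_RETRIES` is assumed to be 5, and `ranked` stands in for whatever closest-first candidate list the ring provides.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the real peer identifier type.
type PeerId = u32;

/// Assumed retry budget, taken from the ~5-10 range above.
const MAX_RETRIES: usize = 5;

#[derive(Debug, PartialEq, Eq)]
enum PutState {
    AwaitingResponse {
        target: PeerId,
        retry_count: usize,
        tried_peers: HashSet<PeerId>,
    },
    Finished,
    Failed,
}

enum Event {
    SuccessfulPut,
    Timeout,
}

/// Advances the PUT state; `ranked` lists candidate peers, closest first.
fn step(state: PutState, event: Event, ranked: &[PeerId]) -> PutState {
    match (state, event) {
        // Any SuccessfulPut completes the operation normally.
        (PutState::AwaitingResponse { .. }, Event::SuccessfulPut) => PutState::Finished,
        (PutState::AwaitingResponse { target, retry_count, mut tried_peers }, Event::Timeout) => {
            // Never pick the peer that just timed out again.
            tried_peers.insert(target);
            if retry_count >= MAX_RETRIES {
                return PutState::Failed;
            }
            let next = ranked.iter().copied().find(|p| !tried_peers.contains(p));
            match next {
                // A real implementation would send RequestPut to `next` here.
                Some(next) => PutState::AwaitingResponse {
                    target: next,
                    retry_count: retry_count + 1,
                    tried_peers,
                },
                None => PutState::Failed,
            }
        }
        // Terminal states ignore further events.
        (done, _) => done,
    }
}
```

Keeping the transition a pure function of state and event makes the retry path easy to unit test without a network.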
Success Criteria
- PUT operations succeed even when initial target peer is unreachable
- No changes to PUT propagation logic
- No protocol changes required
- Simple, minimal code changes
Priority
High - This directly impacts River chat reliability and user experience