Open
Description
Problem
The test test_gw_to_peer_outbound_conn_forwarded was failing in CI due to a race condition where connection establishment events are not properly synchronized. The test required a 500ms delay to work around timing issues.
Root Cause
- Non-atomic connection management: add_connection() may return before the connection is fully available for forwarding operations
- Asynchronous event handling without proper synchronization: The handshake handler processes events sequentially but doesn't ensure proper ordering/completion
- Event emission timing: Events may be emitted before the underlying operations are fully complete
Current Workaround
Added a 500ms delay in the test between establishing peer connection and joiner connection. This is not acceptable for a production system focused on efficiency.
Expected Behavior
- Connection establishment should be atomic and deterministic
- Once add_connection() returns, the connection should be immediately available for forwarding
- No time-based delays should be required for proper operation
- Connection establishment latency should be minimal and predictable
Proposed Solutions
- Synchronous connection management: Make add_connection() atomic and blocking until connection is fully ready
- Event completion signals: Only emit events when operations are fully complete
- Proper async coordination: Use async synchronization primitives (channels, barriers, etc.) instead of time delays
- Connection state tracking: Implement proper state machine for connection lifecycle
Impact
- Performance: 500ms delays are unacceptable for a distributed system
- Reliability: Race conditions can cause intermittent failures
- Scalability: Poor connection management affects network topology building
- Developer experience: Flaky tests waste CI resources and developer time
Files Involved
- crates/core/src/node/network_bridge/handshake.rs (test with workaround)
- crates/core/src/ring/connection_manager.rs (connection management)
- crates/core/src/operations/connect.rs (forwarding logic)
This issue should be prioritized as it affects core networking functionality and violates Freenet's efficiency design goals.
Metadata
Metadata
Assignees
Labels
No labels
Type
Projects
Status
Done