FilterIter API redesign #2000
Conversation
crates/bitcoind_rpc/src/bip158.rs (outdated)
```rust
/// Hard cap on how far to walk back when a reorg is detected.
const MAX_REORG_DEPTH: u32 = 100;
```
I understand that the main motivation for this is to protect against attacks from a malicious/faulty node?
This is not going to work as BDK does not check PoW (not currently). The node can just cheaply construct multiple <100 block reorgs and exhaust resources that way.
The only way to protect against attacks is to check PoW.
Note that I've already mentioned this here: #1985 (comment)
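For intuition on what "checking PoW" means here, below is a minimal, self-contained sketch; the byte-array representation and the `meets_target` helper are hypothetical stand-ins, not BDK or rust-bitcoin API:

```rust
/// Hypothetical helper: interpret a 32-byte block hash and target as
/// big-endian 256-bit integers and check hash <= target. For big-endian
/// byte arrays, lexicographic comparison equals numeric comparison.
fn meets_target(hash_be: &[u8; 32], target_be: &[u8; 32]) -> bool {
    hash_be <= target_be
}

fn main() {
    let mut target = [0u8; 32];
    target[2] = 0xff; // toy target: 0x0000ff00...00

    let mut good = [0u8; 32];
    good[3] = 0x01; // well below the target
    let mut bad = [0u8; 32];
    bad[0] = 0x01; // leading byte set: above the target

    assert!(meets_target(&good, &target));
    assert!(!meets_target(&bad, &target));
    println!("ok");
}
```

The point of the check is that forging many reorg headers that each satisfy the target is expensive, which is what makes the cheap multi-reorg attack described above infeasible.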
I still need to understand the rationale of this PR (since we already had #1985 going). If you can expand on this, it would be appreciated.
I can see how the reorg logic is simpler here (since it uses the pre-cached headers instead of having to check both the checkpoints and headers). Is this the main reasoning?
I'm assuming the main rationale for header-caching is that it will be easier/faster to emit `CheckPoint`s with `Header`s once that becomes available. Is this correct?
Force-pushed from 89b1584 to e368af5
I've demonstrated that the chain-update construction logic is unsound with the following test: 21fa815.
My intuition tells me that making the `FilterIter` checkpoint-centric would make it easier to ensure correctness here. Making the current approach work would mean copying everything in the `CheckPoint` into `FilterIter::headers`, which would be counter-intuitive.
crates/bitcoind_rpc/src/bip158.rs (outdated)
```rust
/// Number of recent blocks from the tip to be returned in a chain update.
const CHAIN_SUFFIX_LEN: u32 = 10;
```
Can we include a rationale for this?
From my understanding, this will make it faster to find the "point of agreement" if there is a reorg of depth < 10.
On second thought, I'm not sure how impactful this is: reorgs are relatively rare, and rescanning blocks should be pretty fast with filters?
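For intuition, the "point of agreement" search can be modeled with a toy chain (height mapped to a fake `u64` hash); the real code walks `CheckPoint`s and RPC headers, so everything below is illustrative only:

```rust
use std::collections::BTreeMap;

/// Toy model of finding the point of agreement: the highest height at which
/// the local view and the remote chain report the same (fake) hash.
fn point_of_agreement(local: &BTreeMap<u32, u64>, remote: &BTreeMap<u32, u64>) -> Option<u32> {
    local
        .iter()
        .rev() // walk down from the tip
        .find(|&(height, hash)| remote.get(height) == Some(hash))
        .map(|(height, _)| *height)
}

fn main() {
    // Local knows heights 90..=100; the remote reorged the last 3 blocks.
    let local: BTreeMap<u32, u64> = (90..=100).map(|h| (h, u64::from(h))).collect();
    let mut remote = local.clone();
    for h in 98..=100 {
        remote.insert(h, u64::from(h) + 1_000); // different hashes after the reorg
    }
    // A 10-block suffix is more than enough to land on the agreement point.
    assert_eq!(point_of_agreement(&local, &remote), Some(97));
    println!("agreement at 97");
}
```

This also illustrates the trade-off in the comment above: a longer cached suffix only helps when the reorg is deeper than the suffix, which is rare.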
FilterIter detects reorgs
This is how I propose we move forward with #2000/#1985:
Reorg logic: backtrack using … No need to cache multiple headers/blockhashes; just keep track of the last emitted header/blockhash. No need to have …
To address the claim that there is a bug in the chain-update logic, I updated the test to print the update chain and the original chain (and also added a block at height 50 to the original chain, to show that it's not some sort of "genesis block problem"): 4f55f2f
Note that the block hash of the block at height … To have the update connect, the chain source should include …
@ValuedMammal Sorry I pushed 4f55f2f on the wrong remote! Feel free to get rid of it.
You're right, I take back the claim. I was thinking along the lines of a "transitive invalidation with no point of agreement", but since we traverse over height 50 of the original (which is still valid), the update is ambiguous.
Good idea. The only thing is that the caller might eventually wish to collect the headers?
@ValuedMammal What would be the use case of these headers? Would the caller want headers from every single block or just the relevant blocks?
Still just the relevant ones. Maybe a better way to state the question is if …
Force-pushed from 4f55f2f to d8e916f
@ValuedMammal Yes, I think it should be considered later down the line. Since the blockhash is "stable" per height, we can take in a closure for constructing any checkpoint type.
Concept ACK. This is looking really good.
Force-pushed from d8e916f to 88e8d8e
Addresses some pain points in making `LocalChain` updates apply reliably.
P.S. Hopefully this highlights the value of the `BlockGraph` work you've been leading.
crates/bitcoind_rpc/src/bip158.rs (outdated)
Diff excerpt (old and new lines reconstructed from the side-by-side view). Removed:

```rust
// if we have a checkpoint we use a lookback of ten blocks
// to ensure consistency of the local chain
if let Some(cp) = self.cp.as_ref() {
    // adjust start height to point of agreement + 1
    let base = self.find_base_with(cp.clone())?;
    self.height = base.height + 1;

    for _ in 0..9 {
        let hash = match header.previous_block_hash {
            Some(hash) => hash,
            None => break,
        };
        header = self.client.get_block_header_info(&hash)?;
        let height = header.height as u32;
        if height < self.height {
            break;
        }
        self.blocks.insert(height, hash);
    }
}

self.stop = tip_height;

Ok(Some(BlockId {
    height: tip_height,
    hash: tip_hash,
}))
```

Added:

```rust
fn find_base(&self) -> Result<GetBlockHeaderResult, Error> {
    for cp in self.cp.iter() {
        let height = cp.height();
        let fetched_hash = self.client.get_block_hash(height as u64)?;
        if fetched_hash == cp.hash() {
            return Ok(self.client.get_block_header_info(&fetched_hash)?);
        }
    }
    Err(Error::ReorgDepthExceeded)
}
```
There is an unfortunate restriction with `LocalChain` (as it exists currently). The only safe way to ensure two `CheckPoint` chains connect is if the update chain contains all heights of the original chain from the point of agreement. We probably need to keep track of these must-be-included heights internally in the `FilterIter`.
We probably also need to change how we structure an `Event`, since non-matching blocks which are not tips may need to return a `CheckPoint` update. I propose the following:
```rust
enum SyncPoint {
    CheckPoint(CheckPoint),
    BlockId(BlockId),
}

struct Event {
    at: SyncPoint,
    match_block: Option<Block>,
}

// Suggestions for helper methods.
impl Event {
    pub fn checkpoint(&self) -> Option<CheckPoint> { /* TODO */ }
    pub fn block_id(&self) -> BlockId { /* TODO */ }
    pub fn height(&self) -> u32 { /* TODO */ }
    pub fn is_match(&self) -> bool { /* TODO */ }
}
```
Let me know what you think.
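To make the "must contain all heights of the original from the point of agreement" requirement concrete, here is a toy check with plain heights standing in for checkpoints; this is an illustration of the constraint, not the actual `LocalChain` merge rules:

```rust
use std::collections::BTreeSet;

/// Toy version of the connect requirement: given the point of agreement,
/// every original height at or above it must also appear in the update.
fn update_connects(original: &BTreeSet<u32>, update: &BTreeSet<u32>, agreement: u32) -> bool {
    original
        .iter()
        .filter(|&&h| h >= agreement)
        .all(|h| update.contains(h))
}

fn main() {
    let original: BTreeSet<u32> = [10, 20, 30, 40].into_iter().collect();
    // Missing height 30 from the point of agreement (20) onward: cannot connect.
    let sparse: BTreeSet<u32> = [20, 40, 41].into_iter().collect();
    assert!(!update_connects(&original, &sparse, 20));
    // Including every original height >= 20 makes the update connect.
    let full: BTreeSet<u32> = [20, 30, 40, 41].into_iter().collect();
    assert!(update_connects(&original, &full, 20));
    println!("ok");
}
```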
Additionally, what do you think about changing the `find_base` logic to call `get_block_header_info` against `cp.hash` (so there's no need for a separate `get_block_hash` call)? However, we may need to look at the RPC error code for when a block does not exist (instead of directly returning the error).
crates/bitcoind_rpc/src/bip158.rs (outdated)
```rust
cp = cp
    .range(..=next_height)
    .next()
    .ok_or(Error::ReorgDepthExceeded)?;
```
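The backtrack-with-a-hard-stop pattern in this snippet can be sketched over a plain descending list of heights (a hypothetical helper for illustration, not the `FilterIter` code):

```rust
/// Toy backtrack: given checkpoint heights in descending order, find the first
/// one at or below `next_height`, failing the way `Error::ReorgDepthExceeded`
/// does when we run off the end of what we kept.
fn backtrack(heights_desc: &[u32], next_height: u32) -> Result<u32, &'static str> {
    heights_desc
        .iter()
        .copied()
        .find(|&h| h <= next_height)
        .ok_or("ReorgDepthExceeded")
}

fn main() {
    let kept = [100, 99, 98, 97];
    assert_eq!(backtrack(&kept, 99), Ok(99));
    // A reorg deeper than what we kept has no recovery point.
    assert_eq!(backtrack(&kept, 96), Err("ReorgDepthExceeded"));
    println!("ok");
}
```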
We'll need to keep track of the heights we have removed to ensure the checkpoint update connects.
Will it be easier to just include the checkpoint with every event?

```rust
/// Event returned by [`FilterIter`].
#[derive(Debug, Clone)]
pub struct Event {
    /// Checkpoint
    pub cp: CheckPoint,
    /// Block, will be `Some(..)` for matching blocks.
    pub block: Option<Block>,
}
```
The API now consists of the methods `new` and `next`. The local checkpoint and SPK inventory are provided to the constructor. `next` is now responsible for locating the point of agreement. The next filter to fetch is determined by the `next_block_hash` field of `GetBlockHeaderResult`. If the next header has negative confirmations due to a reorg, we rewind the internal state until we find a header still in the best chain. Matched block events contain the most up-to-date checkpoint, so it can be applied directly to the local chain as events are processed. Added a `Tip` variant to the `Event` enum for emitting the tip checkpoint when all blocks have been scanned. Removed `EventInner`.
Change `Event` to a simple struct containing `cp` and optional `block`. The checkpoint is updated on each iteration whether or not it corresponds to a matching block. We use `CheckPoint::insert` which will also purge evicted blocks if needed. Change implementation of `find_base` to use `get_block_header_info` which helps to reduce the number of RPC calls. Add test `event_checkpoint_connects_to_local_chain` to check the expected events after a reorg, and check that intermediate updates can be applied to the local chain.
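The rewind-on-reorg step described in these commit messages can be modeled with a toy header table; the fields only loosely mirror `GetBlockHeaderResult`, and all names below are illustrative:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy)]
struct HeaderInfo {
    height: u32,
    confirmations: i64, // negative once the block is reorged out
    prev: Option<u64>,  // previous block "hash"
}

/// Toy rewind: starting from `hash`, walk back via `prev` until we reach a
/// header that is still in the best chain (non-negative confirmations).
fn rewind_to_best_chain(headers: &HashMap<u64, HeaderInfo>, mut hash: u64) -> Option<(u64, u32)> {
    loop {
        let info = headers.get(&hash)?;
        if info.confirmations >= 0 {
            return Some((hash, info.height));
        }
        hash = info.prev?;
    }
}

fn main() {
    let mut headers = HashMap::new();
    headers.insert(7, HeaderInfo { height: 97, confirmations: 4, prev: Some(6) });
    headers.insert(8, HeaderInfo { height: 98, confirmations: -1, prev: Some(7) });
    headers.insert(9, HeaderInfo { height: 99, confirmations: -1, prev: Some(8) });
    // Blocks 98 and 99 were reorged out; rewinding lands on height 97.
    assert_eq!(rewind_to_best_chain(&headers, 9), Some((7, 97)));
    println!("rewound to 97");
}
```

In Bitcoin Core's `getblockheader` output, a negative `confirmations` value is indeed how a stale header is reported, which is what the real implementation keys off.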
Force-pushed from 88e8d8e to ff92f30
I added a commit ff92f30 and rebased.
@ValuedMammal Yes, but the caller can't easily determine the height of non-matching events. In which case, maybe we should only emit matched blocks + tip? This ensures that the …
Previously, `FilterIter` did not detect or handle reorgs between `next` calls, meaning that if a reorg occurred, we might process blocks from a stale fork, potentially resulting in an invalid wallet state. This PR aims to fix that by adding logic to explicitly check for and respond to a reorg on every call to `next`.

Notes to the reviewers

The old implementation required storing block IDs of scanned blocks before creating a checkpoint update, but because the interface was split across different methods, it introduced a timing risk between method calls which, when we consider the possibility of reorgs, made the implementation somewhat brittle.

To address this, we make sure that 1) finding the start block and 2) updating the internal checkpoint are directly tied to the logic of `next`. Since the checkpoint in practice is derived from a clone of the local chain, this ensures that the checkpoint returned by `next` can always find a connection point with the receiver. Additionally, we now emit a checkpoint at every height to ensure that any "must-include" heights are not missing.

For example usage see `examples/filter_iter.rs`

fixes #1848