ar_data_sync skip local peer verification #650
shizzard wants to merge 16 commits into release/performance-2.8
Conversation
Prior to this change, ar_tx_blacklist_tests relied on every test node having pack_served_chunks set and being able to repack chunks on demand. To preserve that behavior we have to add each node to the local_peers config for the test nodes. Note: I haven't run this change against the other tests yet; it's possible it causes something else to fail (in which case we may want to clear local_peers in the failing test).
apps/arweave/src/ar_data_sync.erl
{ok, Config} = application:get_env(arweave, config),
case get_chunk(Start + 1, #{
        pack => lists:member(pack_served_chunks, Config#config.enable),
        bucket_based_offset => false }) of
I couldn't trace through all the code paths leading to get_tx_data. Some look like they start at an HTTP handler, so it makes sense that we'd only repack when the option is set for those.
But it looks like a bunch come from ar_storage:read_tx which I think is used by a lot of internal processes. I wonder if those processes might depend on the chunk being repacked? Were you able to confirm one way or another?
No, but I'll check what I can.
Worst case, you can revert this to pack => true - I don't think there's much risk there (and probably no new risk?)
Peer = ar_http_util:arweave_peer(Req),
IsLocalPeerAddr = lists:member(Peer, Config#config.local_peers),

case IsPackServedChunks orelse IsLocalPeerAddr of
Should this be andalso? I forget where we landed.
Like I could see it working like this:
- the server sets pack_served_chunks, in which case it only repacks chunks for members of its local_peers
- the client sets request_packed_chunks and sync_from_local_peers_only, in which case it requests packed chunks from its local_peers and does not validate them before storing.
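The split described in the two bullets above might look like the following node startup flags. This is a sketch only: the flag spellings (enable, local_peers, sync_from_local_peers_only, request_packed_chunks) and the peer addresses are taken from the discussion and are assumptions, not the final configuration surface.

```shell
# Server: only repack chunks for the listed local peers.
# Flag names and addresses are assumptions based on this thread.
./bin/start \
    enable pack_served_chunks \
    local_peers 10.0.0.2:1984

# Client: fetch packed chunks from trusted local peers only,
# skipping validation before storing.
./bin/start \
    enable request_packed_chunks \
    sync_from_local_peers_only \
    local_peers 10.0.0.1:1984
```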
We decided that we pack served chunks for local peers even when the option is not set.
I think we should always require the option to be set. Unless you and Lev discussed otherwise?
i.e. there are a lot of scenarios where local_peers is set for other reasons (e.g. to reduce rate limiting between CM cluster peers), and in those cases we probably don't want to also pack served chunks unless it's desired (e.g. we wouldn't want a CM Exit Node to get bogged down repacking accidentally).
We discussed this a few times:
- a CM exit node is unlikely to have any data at all;
- local_peers are considered trusted, and therefore any requested packing is served.
I honestly don't remember what we discussed. But I think we should require enable pack_served_chunks to be set. My understanding is that we use local_peers as a filter on top of that, so that nodes don't end up packing for peers they don't want to. But we still need enable pack_served_chunks to be set.
The client can trust all data it receives from one of its local_peers, though. That makes sense.
Just chatted on Slack. The current approach as implemented works: clients will only request packed chunks if they have fetch_packed_chunks set, but if they do request a packed chunk, the local_peer will repack.
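The agreed serving-side behavior can be sketched as a single predicate (a sketch assuming the #config{} record fields shown in the diff above; should_pack_for/2 is a hypothetical helper name, not code from this PR):

```erlang
%% Sketch: decide whether to serve a chunk packed for this requester.
%% Assumes the #config{} fields used in the diff (enable, local_peers);
%% should_pack_for/2 is a hypothetical name for illustration.
should_pack_for(Peer, Config) ->
    IsPackServedChunks = lists:member(pack_served_chunks, Config#config.enable),
    IsLocalPeer = lists:member(Peer, Config#config.local_peers),
    %% orelse: local peers get packed chunks even when the
    %% pack_served_chunks option is not set on the server.
    IsPackServedChunks orelse IsLocalPeer.
```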