
Conversation

@syszery
Contributor

@syszery syszery commented Sep 9, 2025

Motivation

Refers to issue #9048 — Zebra currently lacks support for the ping RPC method and its associated latency metrics (pingtime and pingwait) in the getpeerinfo response.

This draft PR begins implementing ping support.

Solution

This PR lays the foundation for ping support without introducing incomplete logic or placeholder hacks.

  • Stub implementation of the ping method added to the Rpc trait.
    • It currently returns Ok(()), matching zcashd’s behavior.
  • The PeerInfo struct (used in getpeerinfo) was extended with:
    • pingtime: Option<f64>
    • pingwait: Option<f64>
  • From<MetaAddr> implementation updated to initialize these fields as None for now.
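The shape of these changes can be sketched as follows. This is a trimmed, std-only stand-in for zebra-rpc's PeerInfo and zebra-network's MetaAddr, not the actual types; only the field names (pingtime, pingwait) come from the PR.

```rust
use std::net::SocketAddr;

// Simplified stand-in for zebra-network's MetaAddr.
#[derive(Debug)]
struct MetaAddr {
    addr: SocketAddr,
}

// Simplified stand-in for zebra-rpc's PeerInfo, extended as described above.
#[derive(Debug)]
struct PeerInfo {
    addr: SocketAddr,
    /// Round-trip time of the last completed ping, in seconds.
    pingtime: Option<f64>,
    /// Time waited so far for an outstanding ping, in seconds.
    pingwait: Option<f64>,
}

impl From<MetaAddr> for PeerInfo {
    fn from(meta: MetaAddr) -> Self {
        PeerInfo {
            addr: meta.addr,
            // Latency tracking is not wired up yet, so both start as None.
            pingtime: None,
            pingwait: None,
        }
    }
}

fn main() {
    let meta = MetaAddr { addr: "127.0.0.1:8233".parse().unwrap() };
    let info = PeerInfo::from(meta);
    assert!(info.pingtime.is_none() && info.pingwait.is_none());
    println!("{info:?}");
}
```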

Tests

  • Added a new test to verify JSON serialization of pingtime and pingwait with Some and None values.
  • All existing tests in zebra-rpc build and pass.
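The serialization behavior under test looks roughly like this. The real test uses serde in zebra-rpc; this std-only sketch hand-rolls the JSON just to show the expected shape, where a field is emitted when Some and omitted when None.

```rust
// Stand-in for zebra-rpc's PeerInfo; serialization is hand-rolled here
// to mirror serde's `skip_serializing_if = "Option::is_none"` behavior.
struct PeerInfo {
    addr: String,
    pingtime: Option<f64>,
    pingwait: Option<f64>,
}

impl PeerInfo {
    fn to_json(&self) -> String {
        let mut fields = vec![format!("\"addr\":\"{}\"", self.addr)];
        if let Some(t) = self.pingtime {
            fields.push(format!("\"pingtime\":{t}"));
        }
        if let Some(w) = self.pingwait {
            fields.push(format!("\"pingwait\":{w}"));
        }
        format!("{{{}}}", fields.join(","))
    }
}

fn main() {
    let info = PeerInfo {
        addr: "127.0.0.1:8233".into(),
        pingtime: Some(0.042),
        pingwait: None,
    };
    // Some values appear; None values are omitted from the output.
    let json = info.to_json();
    assert_eq!(json, "{\"addr\":\"127.0.0.1:8233\",\"pingtime\":0.042}");
    println!("{json}");
}
```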

Specifications & References

Follow-up Work

  • Trigger sending of ping message in the ping RPC method.
  • Implement a round-trip timer to measure latency.
  • Populate pingtime and pingwait in the PeerInfo response.

Feedback Requested

My idea is to use the existing logic in the zebra-network crate, specifically in connection.rs, to implement ping behavior.

Does this look like the right place to hook into for ping tracking, or would another approach be preferred?

Note:
The zebra-network crate already defines a Request::Ping(Nonce) variant (see request.rs, line 48). However, this variant is marked as internal and described as a "bit of a hack." As such, I did not consider using it here.

Acknowledgements

Thanks @oxarbitrage for outlining the relative complexity of this task in #9730 and suggesting it might be better to tackle later. That guidance was very helpful and is much appreciated.

I had already started this work over the weekend, so I wanted to clean up the scaffold and share what I learned. This draft PR can serve as a starting point for further discussion or future implementation. I plan to return to this at the end of September after my vacation.

Let me know if this direction makes sense and how best to proceed!

@github-actions github-actions bot added the C-feature Category: New features label Sep 9, 2025
conradoplg
conradoplg previously approved these changes Sep 24, 2025
Collaborator

@conradoplg conradoplg left a comment


This is looking good, feel free to move out of draft or to build on top of it, thanks!

@syszery syszery marked this pull request as ready for review September 25, 2025 11:58
@syszery syszery requested a review from a team as a code owner September 25, 2025 11:58
@syszery syszery requested review from upbqdn and removed request for a team September 25, 2025 11:58
@syszery
Contributor Author

syszery commented Sep 25, 2025

Hi @conradoplg, thanks for the early feedback! I've just rebased the branch onto the latest main and fixed all Clippy warnings. I also marked the PR as 'ready for review', but I do plan to continue working on it — just wanted to make sure it’s unblocked in case this is considered high priority, since it’s part of the Milestone: Replace zcashd dependencies.

@syszery
Contributor Author

syszery commented Sep 27, 2025

I've implemented rudimentary tracking of peer latency in connection.rs, using the existing Request::Ping(Nonce) variant to send pings and measure RTT via pong responses.

I'd like to hook this into the RPC layer next, and I'm looking for feedback on two points:

  1. Does this usage align with the intended purpose of Request::Ping(Nonce)?
  2. What's the recommended way to expose peer latency to the RPC crate?

Currently, RTT data is stored in Connection via a PeerLatency struct. The getpeerinfo RPC pulls data from the AddressBook, but it doesn't seem like the right place for live metrics like latency. Would a different approach be preferred?
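A minimal version of the PeerLatency tracking described above might look like this; the struct and method names are illustrative assumptions, not Zebra's actual API.

```rust
use std::time::{Duration, Instant};

// Hypothetical per-connection latency tracker: records when a Ping(Nonce)
// was sent and computes the RTT when the matching Pong arrives.
#[derive(Debug, Default)]
struct PeerLatency {
    ping_sent_at: Option<Instant>,
    last_rtt: Option<Duration>,
}

impl PeerLatency {
    /// Record the moment a Ping is sent to the peer.
    fn ping_sent(&mut self) {
        self.ping_sent_at = Some(Instant::now());
    }

    /// On a matching Pong, compute the round-trip time and clear the
    /// outstanding-ping marker.
    fn pong_received(&mut self) {
        if let Some(sent) = self.ping_sent_at.take() {
            self.last_rtt = Some(sent.elapsed());
        }
    }
}

fn main() {
    let mut latency = PeerLatency::default();
    latency.ping_sent();
    std::thread::sleep(Duration::from_millis(5));
    latency.pong_received();
    let rtt = latency.last_rtt.expect("pong recorded an RTT");
    assert!(rtt >= Duration::from_millis(5));
    println!("measured RTT: {rtt:?}");
}
```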

Happy to adjust direction based on feedback. Thanks!

Collaborator

@conradoplg conradoplg left a comment


Some adjustments are required but hopefully it should be simple, see comments.

Regarding propagating the measured latency, I think it makes sense to use the address book. You will need to propagate the result as if Pong were returning something:

  • In process_message(), after handling the Pong, change Handler::Finished(Ok(Response::Nil)) to return a new response type, which you will need to create, that includes the latency.
  • If you do that, then in send_periodic_heartbeats_run_loop(), heartbeat_timeout() should return the new Pong response. Extract the latency from it and pass it to the heartbeat_ts_collector inside MetaAddr::new_responded, which you will need to modify to pass the latency. Then check where that is handled and store the latency in the address book.
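The first step above can be sketched as follows; the Response variant and handler shape here are illustrative assumptions, not Zebra's actual definitions.

```rust
use std::time::Duration;

// Illustrative response type: instead of Response::Nil, the Pong handler
// returns a variant that carries the measured round-trip time.
#[derive(Debug, PartialEq)]
enum Response {
    Nil,
    /// Pong now returns the measured latency.
    Pong(Duration),
}

// Stand-in for the Pong arm of process_message(): previously this would
// produce Ok(Response::Nil); now the latency is propagated upward so the
// heartbeat loop can hand it to the address book.
fn handle_pong(rtt: Duration) -> Result<Response, ()> {
    Ok(Response::Pong(rtt))
}

fn main() {
    let resp = handle_pong(Duration::from_millis(30)).unwrap();
    assert_eq!(resp, Response::Pong(Duration::from_millis(30)));
    assert_ne!(resp, Response::Nil);
}
```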

I'm not 100% sure that will work, but it should give you some hints. As always, let us know if you prefer for us to work on it. Thank you for all the help!

@syszery
Contributor Author

syszery commented Oct 4, 2025

Cleaned up artifacts from the earlier commits and addressed all reviewer feedback related to the first point:

In process_message(), after handling the Pong, change Handler::Finished(Ok(Response::Nil)) to return a new response type that you will need to create which includes the latency.

Ping-pong latency tracking is now wired inside the handler, and a test was added for the round-trip. This is meant as an intermediate step.

Thank you @conradoplg for stepping in and pointing me in the right direction with such detailed feedback!

@syszery
Contributor Author

syszery commented Oct 5, 2025

Addressed the second part of the review feedback:

"If you do that then in send_periodic_heartbeats_run_loop(), the heartbeat_timeout() should return the new Pong response. Extract the latency from it and then pass it to the heartbeat_ts_collector inside the MetaAddr::new_responded, which you will need to modify to pass the latency."

Next steps:

"Then check where that is handled and store the latency in the address book."

@syszery
Contributor Author

syszery commented Oct 5, 2025

Some adjustments are required but hopefully it should be simple, see comments.

Regarding propagating the measured latency, I think it makes sense to use the address book. You will need to propagate the result as if Pong were returning something:

  • In process_message(), after handling the Pong, change Handler::Finished(Ok(Response::Nil)) to return a new response type, which you will need to create, that includes the latency.
  • If you do that, then in send_periodic_heartbeats_run_loop(), heartbeat_timeout() should return the new Pong response. Extract the latency from it and pass it to the heartbeat_ts_collector inside MetaAddr::new_responded, which you will need to modify to pass the latency. Then check where that is handled and store the latency in the address book.

I'm not 100% sure that will work, but it should give you some hints. As always, let us know if you prefer for us to work on it. Thank you for all the help!

The last commit now fully implements the suggested steps. I had to adapt several tests and decided to use some default values. Based on feedback, I can change these to more meaningful latencies.

Next steps:

  • Implement the ping RPC method to trigger a ping to all peers.
  • Track and update pingwait for each peer in the address book.

@conradoplg
Collaborator

It's looking great, thanks

@syszery
Contributor Author

syszery commented Oct 9, 2025

Hi @conradoplg, I'm sorry to bother you again, but I've gotten stuck here: I'm struggling to find a way to send the ping message to all connected peers.

What I found so far:

  • In zebra-network/src/peer_set/set.rs, the PeerSet handles Service::call. I added a Request::Ping(_) => self.broadcast_all(req) arm, and verified it reaches mocked peers in a simple test.
  • In RPC, methods like add_node and get_peer_info use the AddressBookPeers trait, but that’s only for peer metadata.
  • On the other hand, send_raw_transaction directly calls into mempool::Request via the RpcImpl.

My question:

  • Is this PeerSet::call(Request::Ping) even the right way to broadcast pings?
  • If yes, how should the RPC crate trigger it? Should I expose it via a new trait (similar to AddressBookPeers) — or is it better to pass the PeerSet directly into RpcImpl, like how the mempool is handled?

I’d really appreciate any tips or guidance!

@oxarbitrage oxarbitrage added the do-not-merge Tells Mergify not to merge this PR label Oct 14, 2025
@arya2
Contributor

arya2 commented Oct 15, 2025

I'm struggling to find a way to send the ping message to all connected peers.

I'm not sure if the method exists in zcashd alongside a heartbeat task, or if it was required because zcashd doesn't use a heartbeat task. If it's the latter, returning the pingwait field on the getpeerinfo's response type could be enough to close the issue, so this PR could merge as-is without it.

@str4d Do you know if a ping RPC method will be required in Zebra or if it's okay to always record the ping from the heartbeat task?

My question:

  • Is this PeerSet::call(Request::Ping) even the right way to broadcast pings?

Not currently, as it would only send the Ping request to a fraction of peers. Adding a new request variant similar to AdvertiseBlockToAll and a new method similar to broadcast_all() to measure the time between sending the request and getting a response in the peer set could work.

  • If yes, how should the RPC crate trigger it? Should I expose it via a new trait (similar to AddressBookPeers) — or is it better to pass the PeerSet directly into RpcImpl, like how the mempool is handled?

The peer set already implements tower::Service<zebra_network::Request>, so a field of any type that implements that trait would work. No new trait is required; AddressBookPeers was an unfortunate addition, since the address book isn't a service.
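A minimal illustration of this point, using a simplified stand-in trait for tower::Service<zebra_network::Request> (NetworkService and MockPeerSet are hypothetical names, not Zebra's types): RpcImpl holds the peer set as a generic service field, the same way the mempool handle is stored, with no new trait needed.

```rust
// Simplified stand-ins for zebra_network's Request and Response types.
#[derive(Debug, Clone, PartialEq)]
enum Request {
    Ping,
}

#[derive(Debug, PartialEq)]
enum Response {
    Nil,
}

// Stand-in for the tower::Service bound used in the real code.
trait NetworkService {
    fn call(&mut self, req: Request) -> Response;
}

// RpcImpl is generic over any service that handles network requests,
// so tests can inject a mock and production code passes the peer set.
struct RpcImpl<S: NetworkService> {
    peer_set: S,
}

impl<S: NetworkService> RpcImpl<S> {
    fn ping(&mut self) -> Response {
        self.peer_set.call(Request::Ping)
    }
}

// A mock peer set for testing, mirroring zcashd's ping returning null.
struct MockPeerSet;
impl NetworkService for MockPeerSet {
    fn call(&mut self, _req: Request) -> Response {
        Response::Nil
    }
}

fn main() {
    let mut rpc = RpcImpl { peer_set: MockPeerSet };
    assert_eq!(rpc.ping(), Response::Nil);
}
```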

@oxarbitrage oxarbitrage removed the do-not-merge Tells Mergify not to merge this PR label Oct 17, 2025
@syszery
Contributor Author

syszery commented Oct 18, 2025

Thanks @arya2 for the explanation and for steering me away from chasing the wrong "solution"!

I understand that the best next step is to fully implement support for the missing pingwait field, and then see whether Zebra needs a ping RPC method at all (or if the heartbeat logic is already sufficient).

@mpguerra mpguerra linked an issue Oct 28, 2025 that may be closed by this pull request
@conradoplg
Collaborator

This is looking good. I agree with @arya2 that we can leave the actual force-pinging-everyone for later (in the meantime, we could implement a dummy ping RPC that does nothing).

Also, feel free to do pingwait in a separate PR if you prefer, and we can get this merged (probably after the next release, though).

@syszery
Contributor Author

syszery commented Nov 13, 2025

Hi @conradoplg, thank you for your feedback! I’ve extended the PR and added a proposal for the missing pingwait:

  • Added ping_sent_at to MetaAddr
  • Updated MetaAddrChange::new_responded to include ping_sent_at
  • Implemented basic pingwait calculation
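A hedged sketch of the pingwait calculation described above: while a ping is outstanding, pingwait is how long we have been waiting for the pong; pingtime is the last completed round trip. The field names mirror the PR description, but the struct and method names are illustrative.

```rust
use std::time::{Duration, Instant};

// Illustrative per-peer ping state, not Zebra's actual MetaAddr.
struct PingState {
    ping_sent_at: Option<Instant>,
    last_rtt: Option<Duration>,
}

impl PingState {
    /// Seconds waited so far for an outstanding ping, if one is pending.
    fn pingwait(&self, now: Instant) -> Option<f64> {
        self.ping_sent_at
            .map(|sent| now.duration_since(sent).as_secs_f64())
    }

    /// Seconds of the last completed round trip, if any.
    fn pingtime(&self) -> Option<f64> {
        self.last_rtt.map(|rtt| rtt.as_secs_f64())
    }
}

fn main() {
    let sent = Instant::now();
    let state = PingState { ping_sent_at: Some(sent), last_rtt: None };
    // 250 ms after sending, with no pong yet: pingwait is 0.25 s and
    // pingtime is still absent.
    let wait = state.pingwait(sent + Duration::from_millis(250)).unwrap();
    assert!((wait - 0.25).abs() < 1e-9);
    assert!(state.pingtime().is_none());
}
```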

Happy to adjust based on your feedback.

@arya2 arya2 added do-not-merge Tells Mergify not to merge this PR and removed do-not-merge Tells Mergify not to merge this PR labels Nov 17, 2025
Successfully merging this pull request may close these issues: Implement ping RPC.