Skip to content

Conversation

@Frando
Copy link
Member

@Frando Frando commented Nov 6, 2025

Description

  • Adds ConnectionInfo, which is a weak info handle to a connection (does not keep the connection alive, cannot be upgraded to a full Connection). From a ConnectionInfo some static info is directly available, and as long as the connection is alive you can access the connection stats. You can also watch for path changes, and wait for the connection to be closed (without keeping it alive).
  • Adds functions remote_info and remotes to get a RemoteInfo or list of remote all remote infos.
  • The RemoteInfo allows to get a list of ConnectionInfo that are currently active
  • It also allows to watch the currently selected path, replacing Endpoint::conn_type
    • We have to think through if this is hindering a potential future where we send on multiple paths in parallel
    • If we remove this, you would have to go through all connections, call ConnectionInfo::paths_info, and check if there is a ip path (-> direct) or only a relay path (-> relayed).
    • I'm torn a bit if we want this or not. It is a much simpler API, but maybe we shouldn't expose it?
  • It also allows to get the RTT of the currently selected path
  • Removes Endpoint::conn_type and Endpoint::latency

Breaking Changes

Notes & open questions

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.
    • List all breaking changes in the above "Breaking Changes" section.
    • Open an issue or PR on any number0 repos that are affected by this breaking change. Give guidance on how the updates should be handled or do the actual updates themselves. The major ones are:

@github-actions
Copy link

github-actions bot commented Nov 6, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3614/docs/iroh/

Last updated: 2025-11-06T11:24:22Z

@Frando Frando changed the base branch from main to feat-multipath November 6, 2025 11:18
@Frando Frando force-pushed the Frando/remote-info branch 2 times, most recently from 0b74cec to 8fa7c37 Compare November 6, 2025 11:21
@Frando Frando force-pushed the Frando/remote-info branch from 8fa7c37 to 1bbdbf1 Compare November 6, 2025 11:23
@n0bot n0bot bot added this to iroh Nov 6, 2025
@github-project-automation github-project-automation bot moved this to 🏗 In progress in iroh Nov 6, 2025
Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my biggest problem is that I don't think we can allocate a vector with an item (whatever the item is) for each connection. I guess that gives two main options:

  • Not provide any API that needs this. I know you won't like this :)
  • Figure out how to make those this available as a liftime-disconnected iterator or something. I know I won't like this :)
  • ???

pub async fn latency(&self, endpoint_id: EndpointId) -> Option<Duration> {
self.msock.latency(endpoint_id).await
/// TODO: Expand docs.
pub fn remotes(&self) -> Vec<RemoteInfo> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider making this impl Iterator, it means we don't have to allocate a vec (even if we currently do internally). E.g. if you have a busy server this could be thousands really.

})
};
let info = conn.to_info();
// Register this connection with the magicsock.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this comment now, so we look a bit less like an AI 😅

Comment on lines +1535 to +1538
/// Returns the side of the connection (client or server).
pub fn side(&self) -> Side {
self.inner.side()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for adding this!

I'd write a lot more docs, like explaining why on the QUIC-level this is a thing and why you might want to know for a protocol despite that iroh always allows you to be both. Basically assume you get a user landing from hn or reddit and exploring iroh for the first time without having a clue.

Comment on lines +1540 to +1542
/// Returns a [`ConnectionInfo`], which is a weak handle to the connection
/// that does not keep the connection alive, but does allow to access some information
/// about the connection, and allows to wait for the connection to be closed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a stickler for following the rust doc style guide strictly :)

So re-phrasing this so you have a 1-line summary followed by a blank line and longer explanation will focus you on writing meaningful words in that one line, because that's the line you'll see on rustdoc summary pages and will most often read from ide help.

(I'll only leave this comment once)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully. Haven't done docs properly at all yet, wanna settle the approach first.

Comment on lines +1588 to +1591
// We could add such util methods here, not sure.
pub fn has_direct_path(&mut self) -> bool {
self.paths_info.get().keys().any(|addr| addr.is_ip())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think selected_path like RemoteInfo has is a better idea for now.

// is needed.
if let Some(conn) = handle.upgrade() {
// TODO(frando): The code here upgraded the handle again (which we already upgraded a couple of lines above)
// Was there a reason for that?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was an outright bug. No need to upgrade it twice. A sign the function is too long perhaps...

tx.send(rtt).ok();
}

fn handle_msg_connection_infos(&self, tx: oneshot::Sender<Vec<ConnectionInfo>>) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, this allocates a Vec for all connections again. I do think that is problematic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one here is much less problematic than the Vec for the remote infos. While busy machines will definitely have a lot of remotes, the number of connections to a single other endpoint will be smallish in the vast majority of uses. But yes, not allocating would definitely be better.

}
self.open_path(addr);
self.close_redundant_paths(addr);
self.pub_selected_path.set(Some(addr.clone().into())).ok();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow why we store selected path twice? Can't we use the watchable directly? Or is that too horrible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can likely, yes.

#[debug("Latency(..)")]
Latency(oneshot::Sender<Option<Duration>>),
/// Returns info about currently open connections to the remote endpoint.
#[debug("Latency(..)")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not latency :)

Comment on lines +1120 to +1127
/// The ALPN of the connection.
///
/// We store this to be able to create a [`ConnectionInfo`] from the [`ConnectionState`].
alpn: Vec<u8>,
/// The side (client or server)
///
/// We store this to be able to create a [`ConnectionInfo`] from the [`ConnectionState`].
side: Side,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So these exist purely for diagnostics right, nothing on the actor needs this info.

I'm not entirely happy about that, but can live with it. Would move them to their own section and clearly document them as such though.

@Frando
Copy link
Member Author

Frando commented Nov 6, 2025

So my biggest problem is that I don't think we can allocate a vector with an item (whatever the item is) for each connection. I guess that gives two main options:

* Not provide any API that needs this.  I know you won't like this :)

* Figure out how to make those this available as a liftime-disconnected iterator or something.  I know I won't like this :)

* ???

First: Sure, not doing it is always an option. For me the monitor API in #3610 is more important, and that doesn't have any of these issues, because we simply hand out a ConnectionInfo at time of creation, and don't care any further. If users need a map of all connections or all remotes, they can build it up from there.

So we can, I think, reasonably decide that the monitor callback is the only introspection mechanism.

(#3610 and this PR share the ConnectionInfo struct, otherwise they're mostly independent, which one is based on which is rather arbitrary)

If we do want to stick to a way to iterate over active remotes and connections, well, I think we can

a) allocate the full list: fine for smaller nodes, not fine for those with many connected nodes.
b) lock on each iteration step: will have to evaluate closer what the effects would be
c) look into fancier concurrent datastructures like dashmap or papaya, or persistent datastructures with cheap clones like those from rpds

dashmap can deadlock so we'd have to be very careful how to expose it in such a way that it cannot deadlock by users holding it wrong. papaya I recently stumbled upon and could be made to work, I think. However, we'd have again to be very careful that users holding it slightly wrong don't interfere with interior performance of the endpoint. I'll do some more thinking and will have a closer look at some concurrent maps maybe.

@flub flub mentioned this pull request Nov 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: 🏗 In progress

Development

Successfully merging this pull request may close these issues.

3 participants