netdev.vger.kernel.org archive mirror
* How to find I/F to destination
@ 2007-05-04 12:48 David Howells
  2007-05-04 12:54 ` Evgeniy Polyakov
  2007-05-04 13:00 ` Patrick McHardy
  0 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2007-05-04 12:48 UTC (permalink / raw)
  To: netdev; +Cc: Patrick McHardy


Hi,

I would like to determine which interface packets sent to a particular UDP
destination will go through, and so determine the MTU size for that
interface.  Can anyone suggest a good way of doing this from within the
kernel?

Doing this will permit AF_RXRPC to obtain a better initial guess as to the
maximum size of the packets that can be sent that way.

I could use the code Patrick gave me to allow AFS to iterate through all the
interfaces and then pick the smallest MTU, but that seems wrong somehow -
though it probably will result in the correct answer 99% of the time.

David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: How to find I/F to destination
  2007-05-04 12:48 How to find I/F to destination David Howells
@ 2007-05-04 12:54 ` Evgeniy Polyakov
  2007-05-04 13:04   ` David Howells
                     ` (2 more replies)
  2007-05-04 13:00 ` Patrick McHardy
  1 sibling, 3 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 12:54 UTC (permalink / raw)
  To: David Howells; +Cc: netdev, Patrick McHardy

On Fri, May 04, 2007 at 01:48:48PM +0100, David Howells (dhowells@redhat.com) wrote:
> 
> Hi,

Hi David.

> I would like to determine which interface packets sent to a particular UDP
> destination will go through, and so determine the MTU size for that
> interface.  Can anyone suggest a good way of doing this from within the
> kernel?

I used the following code in netchannels:

static int netchannel_ip_route_output_flow(struct rtable **rp, struct flowi *flp, int flags)
{
	int err;

	err = __ip_route_output_key(rp, flp);
	if (err)
		return err;

	if (flp->proto) {
		if (!flp->fl4_src)
			flp->fl4_src = (*rp)->rt_src;
		if (!flp->fl4_dst)
			flp->fl4_dst = (*rp)->rt_dst;
	}

	return 0;
}

struct dst_entry *route_get_raw(u32 saddr, u32 daddr, u16 sport, u16 dport, u8 proto)
{
	struct rtable *rt;
	struct flowi fl = { .oif = 0,
			    .nl_u = { .ip4_u =
				      { .saddr = saddr,
					.daddr = daddr,
					.tos = 0 } },
			    .proto = proto,
			    .uli_u = { .ports =
				       { .sport = sport,
					 .dport = dport } } };

	if (netchannel_ip_route_output_flow(&rt, &fl, 0))
		goto no_route;
	return dst_clone(&rt->u.dst);

no_route:
	return NULL;
}

This is basically copied from the route output code.
The dst entry obtained from route_get_raw() holds a pointer to the network
device.
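
For comparison only, here is a rough userspace analogue of the lookup above
(it is not the in-kernel path). Connecting a UDP socket makes the kernel pick
a route without sending anything, and the source address it chose can then be
matched against the local interface list. egress_ifname() is a hypothetical
helper name and port 7000 is an arbitrary choice. Matching on the source
address is a heuristic, and it can name the wrong device if the preferred
source address is configured on a different interface.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: copy the name of the interface that owns the source
 * address chosen for daddr (dotted quad) into ifname.
 * Returns 0 on success, -1 on any error. */
int egress_ifname(const char *daddr, char ifname[IF_NAMESIZE])
{
	struct sockaddr_in dst = { .sin_family = AF_INET, .sin_port = htons(7000) };
	struct sockaddr_in src;
	socklen_t len = sizeof(src);
	struct ifaddrs *ifs, *i;
	int fd, ret = -1;

	assert(daddr && ifname);
	if (inet_pton(AF_INET, daddr, &dst.sin_addr) != 1)
		return -1;
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	/* connect() on a UDP socket sends no packet; it only resolves a route. */
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&src, &len) < 0) {
		close(fd);
		return -1;
	}
	close(fd);

	if (getifaddrs(&ifs) < 0)
		return -1;
	for (i = ifs; i; i = i->ifa_next) {
		struct sockaddr_in *a;

		/* ifa_addr may be NULL for interfaces without an address */
		if (!i->ifa_addr || i->ifa_addr->sa_family != AF_INET)
			continue;
		a = (struct sockaddr_in *)i->ifa_addr;
		if (a->sin_addr.s_addr == src.sin_addr.s_addr) {
			strncpy(ifname, i->ifa_name, IF_NAMESIZE - 1);
			ifname[IF_NAMESIZE - 1] = '\0';
			ret = 0;
			break;
		}
	}
	freeifaddrs(ifs);
	return ret;
}
```

Calling egress_ifname("127.0.0.1", name) on a typical Linux host should fill
in the loopback device name, though the exact name depends on the system.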

-- 
	Evgeniy Polyakov


* Re: How to find I/F to destination
  2007-05-04 12:48 How to find I/F to destination David Howells
  2007-05-04 12:54 ` Evgeniy Polyakov
@ 2007-05-04 13:00 ` Patrick McHardy
  1 sibling, 0 replies; 18+ messages in thread
From: Patrick McHardy @ 2007-05-04 13:00 UTC (permalink / raw)
  To: David Howells; +Cc: netdev

David Howells wrote:
> I would like to determine which interface packets sent to a particular UDP
> destination will go through, and so determine the MTU size for that
> interface.  Can anyone suggest a good way of doing this from within the
> kernel?


Do a route lookup (ip_route_output_key), then either use dst_mtu to get
the PMTU value or dst->dev in case you really want the device's MTU.
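
As a userspace counterpart to the dst_mtu() suggestion, the sketch below
assumes Linux, where a connected UDP socket reports the kernel's current MTU
for its route through the IP_MTU socket option. route_mtu() is a hypothetical
name. This is not the in-kernel call sequence described above, only a way to
observe the same value from outside the kernel.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: return the kernel's current MTU for the route to
 * daddr:port, or -1 on error. Linux-specific (IP_MTU). */
int route_mtu(const char *daddr, unsigned short port)
{
	struct sockaddr_in dst = { .sin_family = AF_INET, .sin_port = htons(port) };
	int mtu = -1;
	socklen_t len = sizeof(mtu);
	int fd;

	assert(daddr);
	if (inet_pton(AF_INET, daddr, &dst.sin_addr) != 1)
		return -1;
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	/* IP_MTU is only readable once the socket is connected. */
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0 ||
	    getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) < 0)
		mtu = -1;
	close(fd);
	return mtu;
}
```

The value returned reflects the path MTU if the kernel has learned one for
that destination, otherwise the MTU of the outgoing device.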


* Re: How to find I/F to destination
  2007-05-04 12:54 ` Evgeniy Polyakov
@ 2007-05-04 13:04   ` David Howells
  2007-05-04 13:06     ` David Howells
  2007-05-04 13:08   ` David Howells
  2007-05-04 13:23   ` David Howells
  2 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2007-05-04 13:04 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy

Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:

> 	err = __ip_route_output_key(rp, flp);

Is there any way to get at this without having to link against the ipv4
module?

David


* Re: How to find I/F to destination
  2007-05-04 13:04   ` David Howells
@ 2007-05-04 13:06     ` David Howells
  0 siblings, 0 replies; 18+ messages in thread
From: David Howells @ 2007-05-04 13:06 UTC (permalink / raw)
  Cc: Evgeniy Polyakov, netdev, Patrick McHardy

David Howells <dhowells@redhat.com> wrote:

> > 	err = __ip_route_output_key(rp, flp);
> 
> Is there any way to get at this without having to link against the ipv4
> module?

Ah, never mind.  ipv4 can't be a module.

David


* Re: How to find I/F to destination
  2007-05-04 12:54 ` Evgeniy Polyakov
  2007-05-04 13:04   ` David Howells
@ 2007-05-04 13:08   ` David Howells
  2007-05-04 13:16     ` Evgeniy Polyakov
  2007-05-04 13:20     ` Evgeniy Polyakov
  2007-05-04 13:23   ` David Howells
  2 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2007-05-04 13:08 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy


Should route_get_raw() release the rtable it gets back?

David


* Re: How to find I/F to destination
  2007-05-04 13:08   ` David Howells
@ 2007-05-04 13:16     ` Evgeniy Polyakov
  2007-05-04 13:24       ` David Howells
  2007-05-04 13:20     ` Evgeniy Polyakov
  1 sibling, 1 reply; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 13:16 UTC (permalink / raw)
  To: David Howells; +Cc: netdev, Patrick McHardy

On Fri, May 04, 2007 at 02:08:15PM +0100, David Howells (dhowells@redhat.com) wrote:
> 
> Should route_get_raw() release the rtable it gets back?

Yes, dst entry should be released when not used anymore.

> David

-- 
	Evgeniy Polyakov


* Re: How to find I/F to destination
  2007-05-04 13:08   ` David Howells
  2007-05-04 13:16     ` Evgeniy Polyakov
@ 2007-05-04 13:20     ` Evgeniy Polyakov
  1 sibling, 0 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 13:20 UTC (permalink / raw)
  To: David Howells; +Cc: netdev, Patrick McHardy

On Fri, May 04, 2007 at 02:08:15PM +0100, David Howells (dhowells@redhat.com) wrote:
> 
> Should route_get_raw() release the rtable it gets back?

You can also cache the returned entry and then just clone it, checking
->obsolete and the ->check() callback first.

Something like this:
struct dst_entry *route_get(struct dst_entry *dst)
{
	if (dst && dst->obsolete && dst->ops->check(dst, 0) == NULL) {
		dst_release(dst);
		return NULL;
	}
	return dst_clone(dst);
}

Copied from route code too.
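
The behaviour of route_get() above can be tried out in a small userspace
model. This is only a sketch: struct model_dst and the model_* functions are
hypothetical stand-ins that keep just the fields the snippet touches, and they
are not the kernel's dst API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dst entry: only what route_get() looks at. */
struct model_dst {
	int refcnt;
	int obsolete;	/* non-zero: entry may be stale, ask check() */
	struct model_dst *(*check)(struct model_dst *);
};

/* Take one more reference (plays the role of dst_clone()). */
struct model_dst *model_hold(struct model_dst *d)
{
	if (d)
		d->refcnt++;
	return d;
}

/* Drop one reference (plays the role of dst_release()). */
void model_release(struct model_dst *d)
{
	if (d) {
		assert(d->refcnt > 0);
		d->refcnt--;
	}
}

/* A check callback that always reports the entry as no longer valid. */
struct model_dst *model_check_stale(struct model_dst *d)
{
	(void)d;
	return NULL;
}

/* Same shape as route_get(): a stale cached entry loses the cache's
 * reference and NULL is returned; otherwise the caller gets a new one. */
struct model_dst *model_route_get(struct model_dst *cached)
{
	if (cached && cached->obsolete && cached->check(cached) == NULL) {
		model_release(cached);
		return NULL;
	}
	return model_hold(cached);
}
```

In this model a caller that receives NULL has to redo the slow route lookup,
and a caller that receives an entry owns one reference that it must release.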

> David

-- 
	Evgeniy Polyakov


* Re: How to find I/F to destination
  2007-05-04 12:54 ` Evgeniy Polyakov
  2007-05-04 13:04   ` David Howells
  2007-05-04 13:08   ` David Howells
@ 2007-05-04 13:23   ` David Howells
  2007-05-04 13:25     ` Evgeniy Polyakov
  2 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2007-05-04 13:23 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy

Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:

> static int netchannel_ip_route_output_flow(struct rtable **rp, struct flowi *flp, int flags)

What's proto?  Should that be IPPROTO_UDP?

David


* Re: How to find I/F to destination
  2007-05-04 13:16     ` Evgeniy Polyakov
@ 2007-05-04 13:24       ` David Howells
  2007-05-04 13:29         ` Evgeniy Polyakov
  0 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2007-05-04 13:24 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy

Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:

> > Should route_get_raw() release the rtable it gets back?
> 
> Yes, dst entry should be released when not used anymore.

I meant the rtable returned by __ip_route_output_key(), not the dst.

David


* Re: How to find I/F to destination
  2007-05-04 13:23   ` David Howells
@ 2007-05-04 13:25     ` Evgeniy Polyakov
  0 siblings, 0 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 13:25 UTC (permalink / raw)
  To: David Howells; +Cc: netdev, Patrick McHardy

On Fri, May 04, 2007 at 02:23:24PM +0100, David Howells (dhowells@redhat.com) wrote:
> Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> 
> > static int netchannel_ip_route_output_flow(struct rtable **rp, struct flowi *flp, int flags)
> 
> What's proto?  Should that be IPPROTO_UDP?

Yep.

> David

-- 
	Evgeniy Polyakov


* Re: How to find I/F to destination
  2007-05-04 13:24       ` David Howells
@ 2007-05-04 13:29         ` Evgeniy Polyakov
  2007-05-04 13:33           ` David Howells
  0 siblings, 1 reply; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 13:29 UTC (permalink / raw)
  To: David Howells; +Cc: netdev, Patrick McHardy

On Fri, May 04, 2007 at 02:24:49PM +0100, David Howells (dhowells@redhat.com) wrote:
> Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> 
> > > Should route_get_raw() release the rtable it gets back?
> > 
> > Yes, dst entry should be released when not used anymore.
> 
> I meant the rtable returned by __ip_route_output_key(), not the dst.

That is the same, dst is dereferenced as rtable.
Cloned dst is returned, so it must be put back at the end of the usage.

> David

-- 
	Evgeniy Polyakov


* Re: How to find I/F to destination
  2007-05-04 13:29         ` Evgeniy Polyakov
@ 2007-05-04 13:33           ` David Howells
  2007-05-04 13:43             ` Patrick McHardy
  2007-05-05  9:15             ` Evgeniy Polyakov
  0 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2007-05-04 13:33 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy

Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:

> That is the same, dst is dereferenced as rtable.
> Cloned dst is returned, so it must be put back at the end of the usage.

If that's the case, then why do you bother to clone it in route_get_raw()?
Surely that'll give you *two* clones...

BTW, it seems to work.  The attached function gives me:

	[0mount ] <== rxrpc_assess_MTU_size() [if_mtu 1500]

Thanks!

David

---
static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
{
	struct rtable *rt;
	struct flowi fl;
	int ret;

	peer->if_mtu = 1500;

	memset(&fl, 0, sizeof(fl));

	switch (peer->srx.transport.family) {
	case AF_INET:
		fl.oif = 0;
		fl.proto = IPPROTO_UDP;
		fl.nl_u.ip4_u.saddr = 0;
		fl.nl_u.ip4_u.daddr = peer->srx.transport.sin.sin_addr.s_addr;
		fl.nl_u.ip4_u.tos = 0;
		/* assume AFS.CM talking to AFS.FS */
		fl.uli_u.ports.sport = htonl(7001);
		fl.uli_u.ports.dport = htonl(7000);
		break;
	default:
		BUG();
	}

	ret = ip_route_output_key(&rt, &fl);
	if (ret < 0) {
		kleave(" [route err %d]", ret);
		return;
	}

	peer->if_mtu = dst_mtu(&rt->u.dst);
	kleave(" [if_mtu %u]", peer->if_mtu);
}


* Re: How to find I/F to destination
  2007-05-04 13:33           ` David Howells
@ 2007-05-04 13:43             ` Patrick McHardy
  2007-05-04 13:55               ` David Howells
  2007-05-05  9:15             ` Evgeniy Polyakov
  1 sibling, 1 reply; 18+ messages in thread
From: Patrick McHardy @ 2007-05-04 13:43 UTC (permalink / raw)
  To: David Howells; +Cc: Evgeniy Polyakov, netdev

David Howells wrote:
> static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
> {
> 	struct rtable *rt;
> 	struct flowi fl;
> 	int ret;
> 
> 	peer->if_mtu = 1500;
> 
> 	memset(&fl, 0, sizeof(fl));
> 
> 	switch (peer->srx.transport.family) {
> 	case AF_INET:
> 		fl.oif = 0;
> 		fl.proto = IPPROTO_UDP,
> 		fl.nl_u.ip4_u.saddr = 0;
> 		fl.nl_u.ip4_u.daddr = peer->srx.transport.sin.sin_addr.s_addr;
> 		fl.nl_u.ip4_u.tos = 0;
> 		/* assume AFS.CM talking to AFS.FS */
> 		fl.uli_u.ports.sport = htonl(7001);
> 		fl.uli_u.ports.dport = htonl(7000);

htons()

> 		break;
> 	default:
> 		BUG();
> 	}
> 
> 	ret = ip_route_output_key(&rt, &fl);
> 	if (ret < 0) {
> 		kleave(" [route err %d]", ret);
> 		return;
> 	}
> 
> 	peer->if_mtu = dst_mtu(&rt->u.dst);


You need dst_release(&rt->u.dst) here.


* Re: How to find I/F to destination
  2007-05-04 13:43             ` Patrick McHardy
@ 2007-05-04 13:55               ` David Howells
  2007-05-04 13:59                 ` Patrick McHardy
  0 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2007-05-04 13:55 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Evgeniy Polyakov, netdev

Patrick McHardy <kaber@trash.net> wrote:

> htons()

Blech.  Thanks.  Does it actually matter what ports are specified?

> You need dst_release(&rt->u.dst) here.

Thanks.  I think Evgeniy's code may be wrong then.  He ends with a
dst_clone(), which I think is superfluous.

David


* Re: How to find I/F to destination
  2007-05-04 13:55               ` David Howells
@ 2007-05-04 13:59                 ` Patrick McHardy
  2007-05-05  9:13                   ` Evgeniy Polyakov
  0 siblings, 1 reply; 18+ messages in thread
From: Patrick McHardy @ 2007-05-04 13:59 UTC (permalink / raw)
  To: David Howells; +Cc: Evgeniy Polyakov, netdev

David Howells wrote:
> Patrick McHardy <kaber@trash.net> wrote:
> 
> 
>>htons()
> 
> 
> Blech.  Thanks.  Does it actually matter what ports are specified?


It matters when IPsec port selectors are used to find the correct
policy.

>>You need dst_release(&rt->u.dst) here.
> 
> 
> Thanks.  I think Evgeniy's code may be wrong then.  He ends with a
> dst_clone(), which I think is superfluous.


Yes, that looks wrong.
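
On the htons() point earlier in this message, the sketch below shows what goes
wrong when a 32-bit conversion is stored into a 16-bit port field. It assumes
the port fields are 16 bits wide, as the htons() correction implies, and the
helper names are hypothetical. On a little-endian machine the truncated
htonl() result is zero, and on a big-endian machine both conversions are
no-ops, so the mistake goes unnoticed there.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if `field` holds `port` in network byte order, that is, with
 * the high byte first in memory. */
int holds_port_in_network_order(uint16_t field, uint16_t port)
{
	unsigned char b[2];

	assert(sizeof(b) == sizeof(field));
	memcpy(b, &field, sizeof(b));
	return b[0] == (port >> 8) && b[1] == (port & 0xff);
}

/* Correct: a 16-bit conversion for a 16-bit field. */
uint16_t port_via_htons(uint16_t port)
{
	return htons(port);
}

/* What the posted code effectively stores: a 32-bit conversion whose
 * result is truncated to the low 16 bits on assignment. */
uint16_t port_via_htonl(uint16_t port)
{
	return (uint16_t)htonl(port);
}
```

With port 7001, port_via_htons() always yields the bytes 0x1b 0x59 in memory,
while port_via_htonl() yields 0 on little-endian hosts.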


* Re: How to find I/F to destination
  2007-05-04 13:59                 ` Patrick McHardy
@ 2007-05-05  9:13                   ` Evgeniy Polyakov
  0 siblings, 0 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-05  9:13 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: David Howells, netdev

On Fri, May 04, 2007 at 03:59:18PM +0200, Patrick McHardy (kaber@trash.net) wrote:
> >>You need dst_release(&rt->u.dst) here.
> > 
> > 
> > Thanks.  I think Evgeniy's code may be wrong then.  He ends with a
> > dst_clone(), which I think is superfluous.
> 
> 
> Yes, that looks wrong.

The main idea is to get a reference, and then clone it for each user.
Then each user drops its reference, and when the system is not used anymore,
the main reference is dropped too.
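
The two usage patterns discussed in this thread can be compared with a small
reference-counting model. It is a sketch only: struct model_route and the
model_* functions are hypothetical stand-ins, not the kernel API, and the
model assumes (as the dst_release() advice implies) that the lookup hands back
an entry with one reference already held.

```c
#include <assert.h>

/* Hypothetical stand-in for a route entry: only the refcount is modelled. */
struct model_route {
	int refcnt;
};

/* Models the route lookup: the entry comes back with one reference held. */
struct model_route *model_lookup(struct model_route *entry)
{
	entry->refcnt++;
	return entry;
}

/* Models dst_clone(): one more reference for an additional user. */
struct model_route *model_clone(struct model_route *r)
{
	r->refcnt++;
	return r;
}

/* Models dst_release(): drop one reference. */
void model_put(struct model_route *r)
{
	assert(r->refcnt > 0);
	r->refcnt--;
}

/* One-shot use: look up, read what is needed, release.
 * Returns the number of references left afterwards. */
int model_one_shot(struct model_route *entry)
{
	struct model_route *r = model_lookup(entry);

	/* ... the MTU would be read here ... */
	model_put(r);
	return entry->refcnt;
}

/* Cached use: keep the lookup reference as the main one, clone it for each
 * user, and drop the main one last. Returns the number of references held
 * just before the main reference is dropped. */
int model_cached(struct model_route *entry, int users)
{
	struct model_route *main_ref = model_lookup(entry);
	int i, held;

	for (i = 0; i < users; i++) {
		struct model_route *u = model_clone(main_ref);

		model_put(u);	/* each user drops its own reference */
	}
	held = entry->refcnt;	/* only the main reference remains */
	model_put(main_ref);
	return held;
}
```

In both patterns the count returns to zero once every holder has released.
A function that does a lookup and a clone but hands back only one pointer
leaves the caller responsible for two releases, which is the concern raised
about route_get_raw().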

-- 
	Evgeniy Polyakov


* Re: How to find I/F to destination
  2007-05-04 13:33           ` David Howells
  2007-05-04 13:43             ` Patrick McHardy
@ 2007-05-05  9:15             ` Evgeniy Polyakov
  1 sibling, 0 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-05  9:15 UTC (permalink / raw)
  To: David Howells; +Cc: netdev, Patrick McHardy

On Fri, May 04, 2007 at 02:33:45PM +0100, David Howells (dhowells@redhat.com) wrote:
> Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> 
> > That is the same, dst is dereferenced as rtable.
> > Cloned dst is returned, so it must be put back at the end of the usage.
> 
> If that's the case, then why do you bother to clone it in route_get_raw()?
> Surely that'll give you *two* clones...
> 
> BTW, it seems to work.  The attached function gives me:
> 
> 	[0mount ] <== rxrpc_assess_MTU_size() [if_mtu 1500]
> 
> Thanks!
> 
> David
> 
> ---
> static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
> {
> 	struct rtable *rt;
> 	struct flowi fl;
> 	int ret;
> 
> 	peer->if_mtu = 1500;
> 
> 	memset(&fl, 0, sizeof(fl));
> 
> 	switch (peer->srx.transport.family) {
> 	case AF_INET:
> 		fl.oif = 0;
> 		fl.proto = IPPROTO_UDP,
> 		fl.nl_u.ip4_u.saddr = 0;
> 		fl.nl_u.ip4_u.daddr = peer->srx.transport.sin.sin_addr.s_addr;
> 		fl.nl_u.ip4_u.tos = 0;
> 		/* assume AFS.CM talking to AFS.FS */
> 		fl.uli_u.ports.sport = htonl(7001);
> 		fl.uli_u.ports.dport = htonl(7000);
> 		break;
> 	default:
> 		BUG();
> 	}
> 
> 	ret = ip_route_output_key(&rt, &fl);

This lookup is quite slow compared to an atomic reference increment, so yes,
there are two references: one for the main 'route', which is then cloned
for each user in the fast path. When it is not needed anymore (the netchannel
is removed), the first one is dropped.

> 	if (ret < 0) {
> 		kleave(" [route err %d]", ret);
> 		return;
> 	}
> 
> 	peer->if_mtu = dst_mtu(&rt->u.dst);
> 	kleave(" [if_mtu %u]", peer->if_mtu);
> }

-- 
	Evgeniy Polyakov


