* How to find I/F to destination
@ 2007-05-04 12:48 David Howells
2007-05-04 12:54 ` Evgeniy Polyakov
2007-05-04 13:00 ` Patrick McHardy
0 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2007-05-04 12:48 UTC (permalink / raw)
To: netdev; +Cc: Patrick McHardy
Hi,
I would like to determine which interface packets sent to a particular
UDP destination will go through, and so determine the MTU of that
interface. Can anyone suggest a good way of doing this from within the
kernel?
Doing this will permit AF_RXRPC to obtain a better initial guess as to the
maximum size of the packets that can be sent that way.
I could use the code Patrick gave me to allow AFS to iterate through all the
interfaces and then pick the smallest MTU, but that seems wrong somehow -
though it probably will result in the correct answer 99% of the time.
David
* Re: How to find I/F to destination
2007-05-04 12:48 How to find I/F to destination David Howells
@ 2007-05-04 12:54 ` Evgeniy Polyakov
2007-05-04 13:04 ` David Howells
` (2 more replies)
2007-05-04 13:00 ` Patrick McHardy
1 sibling, 3 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 12:54 UTC (permalink / raw)
To: David Howells; +Cc: netdev, Patrick McHardy
On Fri, May 04, 2007 at 01:48:48PM +0100, David Howells (dhowells@redhat.com) wrote:
>
> Hi,
Hi David.
> I would like to determine which interface packets sent to a particular
> UDP destination will go through, and so determine the MTU of that
> interface. Can anyone suggest a good way of doing this from within the
> kernel?
I used the following code in netchannels:
static int netchannel_ip_route_output_flow(struct rtable **rp, struct flowi *flp, int flags)
{
	int err;

	err = __ip_route_output_key(rp, flp);
	if (err)
		return err;

	if (flp->proto) {
		if (!flp->fl4_src)
			flp->fl4_src = (*rp)->rt_src;
		if (!flp->fl4_dst)
			flp->fl4_dst = (*rp)->rt_dst;
	}

	return 0;
}

struct dst_entry *route_get_raw(u32 saddr, u32 daddr, u16 sport, u16 dport, u8 proto)
{
	struct rtable *rt;
	struct flowi fl = { .oif = 0,
			    .nl_u = { .ip4_u =
				      { .saddr = saddr,
					.daddr = daddr,
					.tos = 0 } },
			    .proto = proto,
			    .uli_u = { .ports =
				       { .sport = sport,
					 .dport = dport } } };

	if (netchannel_ip_route_output_flow(&rt, &fl, 0))
		goto no_route;
	return dst_clone(&rt->u.dst);

no_route:
	return NULL;
}
This is basically copied from the route code.
The dst entry obtained from route_get_raw() holds a pointer to the network device.
--
Evgeniy Polyakov
* Re: How to find I/F to destination
2007-05-04 12:48 How to find I/F to destination David Howells
2007-05-04 12:54 ` Evgeniy Polyakov
@ 2007-05-04 13:00 ` Patrick McHardy
1 sibling, 0 replies; 18+ messages in thread
From: Patrick McHardy @ 2007-05-04 13:00 UTC (permalink / raw)
To: David Howells; +Cc: netdev
David Howells wrote:
> I would like to determine which interface packets sent to a particular
> UDP destination will go through, and so determine the MTU of that
> interface. Can anyone suggest a good way of doing this from within the
> kernel?
Do a route lookup (ip_route_output_key), then either use dst_mtu to get
the PMTU value or dst->dev in case you really want the device's MTU.
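[Editor's sketch, not part of the original message.] For comparison only, the same question can be asked from userspace on Linux. This is a different technique from the in-kernel route lookup suggested above: connecting a UDP socket makes the kernel perform the route lookup without sending anything, and the Linux-specific IP_MTU socket option then reports the MTU the kernel currently associates with that route. The function name udp_route_mtu() is made up for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Userspace analogue, NOT the in-kernel API discussed in this thread.
 * connect() on a UDP socket sends no packets; it only binds the socket
 * to a route. IP_MTU (Linux-specific, valid on connected sockets) then
 * returns the MTU known for that route.
 * Returns the MTU, or -1 on any error.
 */
int udp_route_mtu(const char *dotted_quad, uint16_t port)
{
	struct sockaddr_in dst;
	socklen_t len;
	int mtu = -1;
	int fd;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(port);	/* 16-bit field, so htons() */
	if (inet_pton(AF_INET, dotted_quad, &dst.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	len = sizeof(mtu);
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0 ||
	    getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) < 0)
		mtu = -1;

	close(fd);
	return mtu;
}
```

A caller would use something like udp_route_mtu("192.0.2.1", 7000), where the address is a documentation placeholder. The value reported depends on the local routing table, so no expected output is given here.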
* Re: How to find I/F to destination
2007-05-04 12:54 ` Evgeniy Polyakov
@ 2007-05-04 13:04 ` David Howells
2007-05-04 13:06 ` David Howells
2007-05-04 13:08 ` David Howells
2007-05-04 13:23 ` David Howells
2 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2007-05-04 13:04 UTC (permalink / raw)
To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy
Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> err = __ip_route_output_key(rp, flp);
Is there any way to get at this without having to link against the ipv4
module?
David
* Re: How to find I/F to destination
2007-05-04 13:04 ` David Howells
@ 2007-05-04 13:06 ` David Howells
0 siblings, 0 replies; 18+ messages in thread
From: David Howells @ 2007-05-04 13:06 UTC (permalink / raw)
Cc: Evgeniy Polyakov, netdev, Patrick McHardy
David Howells <dhowells@redhat.com> wrote:
> > err = __ip_route_output_key(rp, flp);
>
> Is there any way to get at this without having to link against the ipv4
> module?
Ah, never mind. ipv4 can't be a module.
David
* Re: How to find I/F to destination
2007-05-04 12:54 ` Evgeniy Polyakov
2007-05-04 13:04 ` David Howells
@ 2007-05-04 13:08 ` David Howells
2007-05-04 13:16 ` Evgeniy Polyakov
2007-05-04 13:20 ` Evgeniy Polyakov
2007-05-04 13:23 ` David Howells
2 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2007-05-04 13:08 UTC (permalink / raw)
To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy
Should route_get_raw() release the rtable it gets back?
David
* Re: How to find I/F to destination
2007-05-04 13:08 ` David Howells
@ 2007-05-04 13:16 ` Evgeniy Polyakov
2007-05-04 13:24 ` David Howells
2007-05-04 13:20 ` Evgeniy Polyakov
1 sibling, 1 reply; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 13:16 UTC (permalink / raw)
To: David Howells; +Cc: netdev, Patrick McHardy
On Fri, May 04, 2007 at 02:08:15PM +0100, David Howells (dhowells@redhat.com) wrote:
>
> Should route_get_raw() release the rtable it gets back?
Yes, the dst entry should be released when it is no longer used.
> David
--
Evgeniy Polyakov
* Re: How to find I/F to destination
2007-05-04 13:08 ` David Howells
2007-05-04 13:16 ` Evgeniy Polyakov
@ 2007-05-04 13:20 ` Evgeniy Polyakov
1 sibling, 0 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 13:20 UTC (permalink / raw)
To: David Howells; +Cc: netdev, Patrick McHardy
On Fri, May 04, 2007 at 02:08:15PM +0100, David Howells (dhowells@redhat.com) wrote:
>
> Should route_get_raw() release the rtable it gets back?
You can also cache the returned entry, then clone it for each use after
checking ->obsolete and the ->check() callback.
Something like this:
struct dst_entry *route_get(struct dst_entry *dst)
{
	if (dst && dst->obsolete && dst->ops->check(dst, 0) == NULL) {
		dst_release(dst);
		return NULL;
	}
	return dst_clone(dst);
}
Copied from route code too.
> David
--
Evgeniy Polyakov
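[Editor's sketch, not part of the original message.] The control flow of route_get() above can be tried out in userspace with a small model. Everything below is hypothetical: struct entry stands in for struct dst_entry with only the fields the check needs, the counter is a plain int rather than an atomic, and the check callbacks are invented for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct dst_entry; only the fields used here. */
struct entry {
	int refcnt;				/* number of holders */
	int obsolete;				/* non-zero: must be revalidated */
	struct entry *(*check)(struct entry *e);	/* NULL result: entry is dead */
};

/* Take one more reference; tolerates NULL like dst_clone() does. */
static struct entry *entry_hold(struct entry *e)
{
	if (e)
		e->refcnt++;
	return e;
}

/* Drop one reference; tolerates NULL. */
static void entry_release(struct entry *e)
{
	if (e)
		e->refcnt--;
}

/*
 * Same shape as route_get(): if the cached entry is marked obsolete and
 * fails revalidation, drop the cache's reference and report a miss;
 * otherwise hand the caller a new reference to the same entry.
 */
struct entry *cache_get(struct entry *cached)
{
	if (cached && cached->obsolete && cached->check(cached) == NULL) {
		entry_release(cached);
		return NULL;
	}
	return entry_hold(cached);
}

/* Example revalidation callbacks for the model. */
struct entry *check_dead(struct entry *e)  { (void)e; return NULL; }
struct entry *check_alive(struct entry *e) { return e; }
```

On a miss the caller would fall back to a fresh lookup, which in the thread's code is route_get_raw().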
* Re: How to find I/F to destination
2007-05-04 12:54 ` Evgeniy Polyakov
2007-05-04 13:04 ` David Howells
2007-05-04 13:08 ` David Howells
@ 2007-05-04 13:23 ` David Howells
2007-05-04 13:25 ` Evgeniy Polyakov
2 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2007-05-04 13:23 UTC (permalink / raw)
To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy
Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> static int netchannel_ip_route_output_flow(struct rtable **rp, struct flowi *flp, int flags)
What's proto? Should that be IPPROTO_UDP?
David
* Re: How to find I/F to destination
2007-05-04 13:16 ` Evgeniy Polyakov
@ 2007-05-04 13:24 ` David Howells
2007-05-04 13:29 ` Evgeniy Polyakov
0 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2007-05-04 13:24 UTC (permalink / raw)
To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy
Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> > Should route_get_raw() release the rtable it gets back?
>
> Yes, the dst entry should be released when it is no longer used.
I meant the rtable returned by __ip_route_output_key(), not the dst.
David
* Re: How to find I/F to destination
2007-05-04 13:23 ` David Howells
@ 2007-05-04 13:25 ` Evgeniy Polyakov
0 siblings, 0 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 13:25 UTC (permalink / raw)
To: David Howells; +Cc: netdev, Patrick McHardy
On Fri, May 04, 2007 at 02:23:24PM +0100, David Howells (dhowells@redhat.com) wrote:
> Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
>
> > static int netchannel_ip_route_output_flow(struct rtable **rp, struct flowi *flp, int flags)
>
> What's proto? Should that be IPPROTO_UDP?
Yep.
> David
--
Evgeniy Polyakov
* Re: How to find I/F to destination
2007-05-04 13:24 ` David Howells
@ 2007-05-04 13:29 ` Evgeniy Polyakov
2007-05-04 13:33 ` David Howells
0 siblings, 1 reply; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-04 13:29 UTC (permalink / raw)
To: David Howells; +Cc: netdev, Patrick McHardy
On Fri, May 04, 2007 at 02:24:49PM +0100, David Howells (dhowells@redhat.com) wrote:
> Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
>
> > > Should route_get_raw() release the rtable it gets back?
> >
> > Yes, the dst entry should be released when it is no longer used.
>
> I meant the rtable returned by __ip_route_output_key(), not the dst.
That is the same thing: the dst is embedded in the rtable (rt->u.dst).
A cloned dst is returned, so it must be put back at the end of its use.
> David
--
Evgeniy Polyakov
* Re: How to find I/F to destination
2007-05-04 13:29 ` Evgeniy Polyakov
@ 2007-05-04 13:33 ` David Howells
2007-05-04 13:43 ` Patrick McHardy
2007-05-05 9:15 ` Evgeniy Polyakov
0 siblings, 2 replies; 18+ messages in thread
From: David Howells @ 2007-05-04 13:33 UTC (permalink / raw)
To: Evgeniy Polyakov; +Cc: netdev, Patrick McHardy
Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> That is the same thing: the dst is embedded in the rtable (rt->u.dst).
> A cloned dst is returned, so it must be put back at the end of its use.
If that's the case, then why do you bother to clone it in route_get_raw()?
Surely that'll give you *two* clones...
BTW, it seems to work. The attached function gives me:
[0mount ] <== rxrpc_assess_MTU_size() [if_mtu 1500]
Thanks!
David
---
static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
{
	struct rtable *rt;
	struct flowi fl;
	int ret;

	peer->if_mtu = 1500;

	memset(&fl, 0, sizeof(fl));

	switch (peer->srx.transport.family) {
	case AF_INET:
		fl.oif = 0;
		fl.proto = IPPROTO_UDP;
		fl.nl_u.ip4_u.saddr = 0;
		fl.nl_u.ip4_u.daddr = peer->srx.transport.sin.sin_addr.s_addr;
		fl.nl_u.ip4_u.tos = 0;
		/* assume AFS.CM talking to AFS.FS */
		fl.uli_u.ports.sport = htonl(7001);
		fl.uli_u.ports.dport = htonl(7000);
		break;
	default:
		BUG();
	}

	ret = ip_route_output_key(&rt, &fl);
	if (ret < 0) {
		kleave(" [route err %d]", ret);
		return;
	}

	peer->if_mtu = dst_mtu(&rt->u.dst);
	kleave(" [if_mtu %u]", peer->if_mtu);
}
* Re: How to find I/F to destination
2007-05-04 13:33 ` David Howells
@ 2007-05-04 13:43 ` Patrick McHardy
2007-05-04 13:55 ` David Howells
2007-05-05 9:15 ` Evgeniy Polyakov
1 sibling, 1 reply; 18+ messages in thread
From: Patrick McHardy @ 2007-05-04 13:43 UTC (permalink / raw)
To: David Howells; +Cc: Evgeniy Polyakov, netdev
David Howells wrote:
> static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
> {
> struct rtable *rt;
> struct flowi fl;
> int ret;
>
> peer->if_mtu = 1500;
>
> memset(&fl, 0, sizeof(fl));
>
> switch (peer->srx.transport.family) {
> case AF_INET:
> fl.oif = 0;
> fl.proto = IPPROTO_UDP;
> fl.nl_u.ip4_u.saddr = 0;
> fl.nl_u.ip4_u.daddr = peer->srx.transport.sin.sin_addr.s_addr;
> fl.nl_u.ip4_u.tos = 0;
> /* assume AFS.CM talking to AFS.FS */
> fl.uli_u.ports.sport = htonl(7001);
> fl.uli_u.ports.dport = htonl(7000);
htons()
> break;
> default:
> BUG();
> }
>
> ret = ip_route_output_key(&rt, &fl);
> if (ret < 0) {
> kleave(" [route err %d]", ret);
> return;
> }
>
> peer->if_mtu = dst_mtu(&rt->u.dst);
You need dst_release(&rt->u.dst) here.
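[Editor's sketch, not part of the original message.] The htons() remark above can be checked in userspace. The sketch assumes the port fields are 16 bits wide, which is what the htons() correction implies; the two helper names are made up for the illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Correct: a 16-bit port converted with the 16-bit helper. */
uint16_t port_to_be16(uint16_t port)
{
	return htons(port);
}

/*
 * The mistake: htonl() yields a 32-bit value that is truncated when it
 * is stored in a 16-bit field. On a little-endian host the low 16 bits
 * of htonl(7001) are zero, so the port silently becomes 0. On a
 * big-endian host htonl() changes nothing and the bug goes unnoticed.
 */
uint16_t port_to_be16_truncated(uint16_t port)
{
	return (uint16_t)htonl(port);
}
```

Converting back with ntohs() recovers 7001 only from the first helper on every host, which is why the mistake can pass unnoticed on big-endian machines and still break port matching on little-endian ones.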
* Re: How to find I/F to destination
2007-05-04 13:43 ` Patrick McHardy
@ 2007-05-04 13:55 ` David Howells
2007-05-04 13:59 ` Patrick McHardy
0 siblings, 1 reply; 18+ messages in thread
From: David Howells @ 2007-05-04 13:55 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Evgeniy Polyakov, netdev
Patrick McHardy <kaber@trash.net> wrote:
> htons()
Blech. Thanks. Does it actually matter what ports are specified?
> You need dst_release(&rt->u.dst) here.
Thanks. I think Evgeniy's code may be wrong then. He ends with a
dst_clone(), which I think is superfluous.
David
* Re: How to find I/F to destination
2007-05-04 13:55 ` David Howells
@ 2007-05-04 13:59 ` Patrick McHardy
2007-05-05 9:13 ` Evgeniy Polyakov
0 siblings, 1 reply; 18+ messages in thread
From: Patrick McHardy @ 2007-05-04 13:59 UTC (permalink / raw)
To: David Howells; +Cc: Evgeniy Polyakov, netdev
David Howells wrote:
> Patrick McHardy <kaber@trash.net> wrote:
>
>
>>htons()
>
>
> Blech. Thanks. Does it actually matter what ports are specified?
It matters when IPsec port selectors are used to find the correct
policy.
>>You need dst_release(&rt->u.dst) here.
>
>
> Thanks. I think Evgeniy's code may be wrong then. He ends with a
> dst_clone(), which I think is superfluous.
Yes, that looks wrong.
* Re: How to find I/F to destination
2007-05-04 13:59 ` Patrick McHardy
@ 2007-05-05 9:13 ` Evgeniy Polyakov
0 siblings, 0 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-05 9:13 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Howells, netdev
On Fri, May 04, 2007 at 03:59:18PM +0200, Patrick McHardy (kaber@trash.net) wrote:
> >>You need dst_release(&rt->u.dst) here.
> >
> >
> > Thanks. I think Evgeniy's code may be wrong then. He ends with a
> > dst_clone(), which I think is superfluous.
>
>
> Yes, that looks wrong.
The main idea is to get a reference and then clone it for each user.
Each user then drops its reference, and when the system is no longer used,
the main reference is dropped too.
--
Evgeniy Polyakov
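[Editor's sketch, not part of the original message.] The lifetime scheme described above, one long-lived main reference plus one short-lived reference per user, can be modelled in userspace with C11 atomics. All names are hypothetical, and the freed flag only marks the point where this model would free the entry; it makes no claim about how the kernel reclaims a dst.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached route; not a kernel structure. */
struct route_ref {
	atomic_int refcnt;
	bool freed;	/* set when the last reference is dropped */
};

/* Models the slow path: the lookup hands back the entry with one reference. */
void route_lookup(struct route_ref *r)
{
	atomic_init(&r->refcnt, 1);
	r->freed = false;
}

/* Models dst_clone(): the cheap fast-path step, a single atomic increment. */
struct route_ref *route_clone(struct route_ref *r)
{
	atomic_fetch_add(&r->refcnt, 1);
	return r;
}

/* Models dst_release(): in this model the last holder frees the entry. */
void route_put(struct route_ref *r)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&r->refcnt, 1) == 1)
		r->freed = true;
}
```

A one-shot caller such as rxrpc_assess_MTU_size() needs no clone at all in this model: it takes the reference returned by the lookup, reads the MTU, and puts that same reference back.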
* Re: How to find I/F to destination
2007-05-04 13:33 ` David Howells
2007-05-04 13:43 ` Patrick McHardy
@ 2007-05-05 9:15 ` Evgeniy Polyakov
1 sibling, 0 replies; 18+ messages in thread
From: Evgeniy Polyakov @ 2007-05-05 9:15 UTC (permalink / raw)
To: David Howells; +Cc: netdev, Patrick McHardy
On Fri, May 04, 2007 at 02:33:45PM +0100, David Howells (dhowells@redhat.com) wrote:
> Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
>
> > That is the same thing: the dst is embedded in the rtable (rt->u.dst).
> > A cloned dst is returned, so it must be put back at the end of its use.
>
> If that's the case, then why do you bother to clone it in route_get_raw()?
> Surely that'll give you *two* clones...
>
> BTW, it seems to work. The attached function gives me:
>
> [0mount ] <== rxrpc_assess_MTU_size() [if_mtu 1500]
>
> Thanks!
>
> David
>
> ---
> static void rxrpc_assess_MTU_size(struct rxrpc_peer *peer)
> {
> struct rtable *rt;
> struct flowi fl;
> int ret;
>
> peer->if_mtu = 1500;
>
> memset(&fl, 0, sizeof(fl));
>
> switch (peer->srx.transport.family) {
> case AF_INET:
> fl.oif = 0;
> fl.proto = IPPROTO_UDP;
> fl.nl_u.ip4_u.saddr = 0;
> fl.nl_u.ip4_u.daddr = peer->srx.transport.sin.sin_addr.s_addr;
> fl.nl_u.ip4_u.tos = 0;
> /* assume AFS.CM talking to AFS.FS */
> fl.uli_u.ports.sport = htonl(7001);
> fl.uli_u.ports.dport = htonl(7000);
> break;
> default:
> BUG();
> }
>
> ret = ip_route_output_key(&rt, &fl);
This one is quite slow compared to an atomic reference increment, so yes,
there are two clones: one for the main 'route', which is then cloned
for each use in the fast path. When it is no longer needed (the netchannel is
removed), the first one is dropped.
> if (ret < 0) {
> kleave(" [route err %d]", ret);
> return;
> }
>
> peer->if_mtu = dst_mtu(&rt->u.dst);
> kleave(" [if_mtu %u]", peer->if_mtu);
> }
--
Evgeniy Polyakov