* netlink recvmsg() and MSG_TRUNC
@ 2007-03-06 23:19 David Miller
2007-03-06 23:40 ` James Morris
2007-03-06 23:49 ` Herbert Xu
0 siblings, 2 replies; 9+ messages in thread
From: David Miller @ 2007-03-06 23:19 UTC (permalink / raw)
To: netdev
So if you don't give a large enough buffer to
recvmsg() for the netlink response a few things
happen:
1) MSG_TRUNC is set
2) The length returned and the amount of data copied are both
the buffer size given in the recvmsg() call
3) If enough other packets remain in the receive buffer,
nlk->cb is left at non-NULL for a partial dump. This
means that you can't just immediately resubmit the
original request else you'll get NLMSG_ERROR with error
set to -EBUSY. This is what netlink_dump_start() does
when it sees nlk->cb non-NULL.
Now the user is basically stuck, and there is no real
way to recover from this besides doing something like
opening up a new netlink socket and then doing the recvmsg()
with a larger buffer; wash, rinse, repeat.
I looked at how some of our standard userspace code handles
this and it's not pretty:
1) iproute2 basically just uses a 16K buffer, signals an error
when it sees MSG_TRUNC, and that's it, whoopee
2) Thomas's libnl believes that recvmsg() will return the
true length necessary to receive the whole message, he
signals on this to double the buffer size and try the
recvmsg() again. As mentioned recvmsg() never returns
a length larger than the given buffer size, so this code
never triggers, and if it did it would lose entries because
netlink_recvmsg() drops the SKB even when it signals
MSG_TRUNC.
The behavior of dropping the SKB matches what UDP does in
the case of MSG_TRUNC.
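The truncation semantics in points 1) and 2), and the drop of the rest of the datagram, can be observed on any datagram socket. A minimal sketch, assuming a Linux host and using an AF_UNIX datagram socketpair as a stand-in for a netlink socket (the helper name recv_truncated is hypothetical and is not taken from any of the libraries discussed here):

```python
import socket


def recv_truncated(payload_len=100, buf_len=16):
    """Send one datagram and read it with a buffer of buf_len bytes.

    Returns (bytes_received, truncated, queue_empty_after).
    """
    # Stand-in for a netlink socket: a connected pair of datagram sockets.
    tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        tx.send(b"x" * payload_len)

        # recvmsg() exposes msg_flags, which is where MSG_TRUNC shows up.
        data, _ancdata, flags, _addr = rx.recvmsg(buf_len)
        truncated = bool(flags & socket.MSG_TRUNC)

        # A second, non-blocking read shows whether any part of the
        # datagram is still queued.
        rx.setblocking(False)
        try:
            rx.recv(buf_len)
            queue_empty_after = False
        except BlockingIOError:
            queue_empty_after = True

        return len(data), truncated, queue_empty_after
    finally:
        tx.close()
        rx.close()
```

With a 100-byte datagram and a 16-byte buffer, the call reports 16 bytes received, MSG_TRUNC set, and nothing left to read, which is the same pattern described above for netlink.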
I guess one thing the user could do when it sees MSG_TRUNC
is keep calling recvmsg() until the receive queue is emptied
of packets, in order to get that pesky nlk->cb cleared to
NULL, then resubmit.
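The drain step could look roughly like the following. This is only a sketch under stated assumptions: the function name drain and the 4096-byte default are hypothetical, and on a real netlink socket the kernel may queue further parts of the dump while the queue is being read, so an empty queue alone may not prove that the dump has finished.

```python
import socket


def drain(sock, buf_len=4096):
    """Discard every datagram currently queued on sock.

    Returns the number of datagrams that were read and thrown away.
    """
    count = 0
    while True:
        try:
            # MSG_DONTWAIT makes this single call non-blocking without
            # changing the blocking mode of the socket itself.
            sock.recv(buf_len, socket.MSG_DONTWAIT)
        except BlockingIOError:
            # Nothing left in the receive queue.
            return count
        count += 1
```

After drain() returns, the caller would resubmit the original request and read the reply with a larger buffer.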
But that's ridiculous and complicated.
Any ideas?
* Re: netlink recvmsg() and MSG_TRUNC
2007-03-06 23:19 netlink recvmsg() and MSG_TRUNC David Miller
@ 2007-03-06 23:40 ` James Morris
2007-03-06 23:49 ` Herbert Xu
1 sibling, 0 replies; 9+ messages in thread
From: James Morris @ 2007-03-06 23:40 UTC (permalink / raw)
To: David Miller; +Cc: netdev
On Tue, 6 Mar 2007, David Miller wrote:
> I guess one thing the user could do when it sees MSG_TRUNC
> is keep calling recvmsg() until the receive queue is emptied
> of packets, in order to get that pesky nlk->cb cleared to
> NULL, then resubmit.
>
> But that's ridiculous and complicated.
>
> Any ideas?
Only slightly less complicated: user calls recvmsg() once with a new flag
MSG_FLUSH, which causes the queue to be flushed, then resubmits ?
- James
--
James Morris
<jmorris@namei.org>
* Re: netlink recvmsg() and MSG_TRUNC
2007-03-06 23:19 netlink recvmsg() and MSG_TRUNC David Miller
2007-03-06 23:40 ` James Morris
@ 2007-03-06 23:49 ` Herbert Xu
2007-03-06 23:57 ` Stephen Hemminger
2007-03-07 0:02 ` David Miller
1 sibling, 2 replies; 9+ messages in thread
From: Herbert Xu @ 2007-03-06 23:49 UTC (permalink / raw)
To: David Miller; +Cc: netdev
David Miller <davem@davemloft.net> wrote:
>
> I guess one thing the user could do when it sees MSG_TRUNC
> is keep calling recvmsg() until the receive queue is emptied
> of packets, in order to get that pesky nlk->cb cleared to
> NULL, then resubmit.
>
> But that's ridiculous and complicated.
>
> Any ideas?
Which netlink family generates (or needs to generate) unbounded
messages to user-space? Or indeed which ones generate messages
greater than 64K (or 4K for that matter)?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: netlink recvmsg() and MSG_TRUNC
2007-03-06 23:49 ` Herbert Xu
@ 2007-03-06 23:57 ` Stephen Hemminger
2007-03-07 0:04 ` Herbert Xu
2007-03-07 0:02 ` David Miller
1 sibling, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2007-03-06 23:57 UTC (permalink / raw)
To: Herbert Xu; +Cc: David Miller, netdev
On Wed, 07 Mar 2007 10:49:07 +1100
Herbert Xu <herbert@gondor.apana.org.au> wrote:
> David Miller <davem@davemloft.net> wrote:
> >
> > I guess one thing the user could do when it sees MSG_TRUNC
> > is keep calling recvmsg() until the receive queue is emptied
> > of packets, in order to get that pesky nlk->cb cleared to
> > NULL, then resubmit.
> >
> > But that's ridiculous and complicated.
> >
> > Any ideas?
>
> Which netlink family generates (or needs to generate) unbounded
> messages to user-space? Or indeed which ones generate messages
> greater than 64K (or 4K for that matter)?
>
> Cheers,
I know some commands send down big blocks of configuration information.
One example is netem statistical data, but there are others.
--
Stephen Hemminger <shemminger@linux-foundation.org>
* Re: netlink recvmsg() and MSG_TRUNC
2007-03-06 23:49 ` Herbert Xu
2007-03-06 23:57 ` Stephen Hemminger
@ 2007-03-07 0:02 ` David Miller
2007-03-07 0:04 ` Herbert Xu
1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2007-03-07 0:02 UTC (permalink / raw)
To: herbert; +Cc: netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 07 Mar 2007 10:49:07 +1100
> Which netlink family generates (or needs to generate) unbounded
> messages to user-space? Or indeed which ones generate messages
> greater than 64K (or 4K for that matter)?
Create a lot of interfaces, try to dump them :-)
GLIBC can even hit this via its ifaddrs.c code.
* Re: netlink recvmsg() and MSG_TRUNC
2007-03-07 0:02 ` David Miller
@ 2007-03-07 0:04 ` Herbert Xu
2007-03-07 0:05 ` David Miller
0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2007-03-07 0:04 UTC (permalink / raw)
To: David Miller; +Cc: netdev
On Tue, Mar 06, 2007 at 04:02:02PM -0800, David Miller wrote:
>
> Create a lot of interfaces, try to dump them :-)
Dumps should be done using 4K (NLMSG_GOODSIZE) skb's, where is the problem?
> GLIBC can even hit this via its ifaddrs.c code.
Do you have a simple test case that I can run?
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: netlink recvmsg() and MSG_TRUNC
2007-03-06 23:57 ` Stephen Hemminger
@ 2007-03-07 0:04 ` Herbert Xu
0 siblings, 0 replies; 9+ messages in thread
From: Herbert Xu @ 2007-03-07 0:04 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, netdev
On Tue, Mar 06, 2007 at 03:57:50PM -0800, Stephen Hemminger wrote:
>
> I know some commands send down big blocks of configuration information.
> One example is netem statistical data, but there are others.
You mean dumps? Unless someone is coalescing them I don't see a problem
there.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: netlink recvmsg() and MSG_TRUNC
2007-03-07 0:04 ` Herbert Xu
@ 2007-03-07 0:05 ` David Miller
2007-03-07 0:07 ` Herbert Xu
0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2007-03-07 0:05 UTC (permalink / raw)
To: herbert; +Cc: netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 7 Mar 2007 11:04:19 +1100
> On Tue, Mar 06, 2007 at 04:02:02PM -0800, David Miller wrote:
> >
> > Create a lot of interfaces, try to dump them :-)
>
> Dumps should be done using 4K (NLMSG_GOODSIZE) skb's, where is the problem?
Actually, more accurately it's using PAGE_SIZE. :)
> > GLIBC can even hit this via its ifaddrs.c code.
>
> Do you have a simple test case that I can run?
I see, so the better fix would be to make glibc's
netlink_request() function start with a getpagesize()'d
buffer.
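The idea of starting with a page-sized receive buffer can be sketched as follows. This is an illustration only and not glibc's actual netlink_request() code: the function name recv_with_page_buffer is hypothetical, mmap.PAGESIZE is used as the Python counterpart of getpagesize(), and any datagram socket can stand in for the netlink socket.

```python
import mmap
import socket


def recv_with_page_buffer(sock):
    """Read one datagram using a buffer of exactly one page.

    Raises OSError if the datagram did not fit, instead of silently
    returning a truncated message.
    """
    # Dump skbs are sized by PAGE_SIZE, per the discussion above, so a
    # page-sized buffer is the smallest one expected to hold each part.
    buf_len = mmap.PAGESIZE
    data, _ancdata, flags, _addr = sock.recvmsg(buf_len)
    if flags & socket.MSG_TRUNC:
        raise OSError("message truncated: larger than %d bytes" % buf_len)
    return data
```

Checking msg_flags for MSG_TRUNC is kept as a safety net, since the returned length alone cannot reveal that part of a message was lost.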
* Re: netlink recvmsg() and MSG_TRUNC
2007-03-07 0:05 ` David Miller
@ 2007-03-07 0:07 ` Herbert Xu
0 siblings, 0 replies; 9+ messages in thread
From: Herbert Xu @ 2007-03-07 0:07 UTC (permalink / raw)
To: David Miller; +Cc: netdev
On Tue, Mar 06, 2007 at 04:05:57PM -0800, David Miller wrote:
>
> Actually, more accurately it's using PAGE_SIZE. :)
Aha it's you non-i386 people :)
> I see, so the better fix would be to make glibc's
> netlink_request() function start with a getpagesize()'d
> buffer.
Yes that's a good idea.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt