From: Eric Dumazet <eric.dumazet@gmail.com>
To: Alban Crequy <alban.crequy@collabora.co.uk>
Cc: David Miller <davem@davemloft.net>,
shemminger@vyatta.com, gorcunov@openvz.org, adobriyan@gmail.com,
lennart@poettering.net, kay.sievers@vrfy.org,
ian.molton@collabora.co.uk, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/9] AF_UNIX: find the recipients for multicast messages
Date: Tue, 23 Nov 2010 17:08:37 +0100
Message-ID: <1290528517.3046.91.camel@edumazet-laptop>
In-Reply-To: <20101123150315.4e67a139@chocolatine.cbg.collabora.co.uk>
On Tuesday 23 November 2010 at 15:03 +0000, Alban Crequy wrote:
> On Mon, 22 Nov 2010 11:05:19 -0800 (PST),
> David Miller <davem@davemloft.net> wrote:
>
> > From: Alban Crequy <alban.crequy@collabora.co.uk>
> > Date: Mon, 22 Nov 2010 18:36:17 +0000
> >
> > > unix_find_multicast_recipients() builds an array of recipients. It
> > > can either find the peers of a specific multicast address, or find
> > > all the peers of all multicast groups the sender is part of.
> > >
> > > Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk>
> >
> > You really should use RCU to lock this stuff, this way sends run
> > lockless and have less worries wrt. the memory allocation. You'll
> > also only take a spinlock in the write paths which change the
> > multicast groups, which ought to be rare.
>
> I understand the benefit of using RCU in order to have lockless sends.
>
> But with RCU I will still have worries about the memory allocation:
>
> - I cannot allocate inside a rcu_read_lock()-rcu_read_unlock() block.
>
That's not true.

The same rules apply as inside a spin_lock() or write_lock() section.

We already allocate memory inside rcu_read_lock() in the network stack.

> - If I iterate locklessly over the multicast group members with
> hlist_for_each_entry_rcu(), new members can be added, so the
> array can be allocated with the wrong size and I have to try again
> ("goto try_again") when this rare case occurs.

You are allowed to allocate memory to add entries while doing your loop
iteration.

Nothing prevents you from using a chain of items, each item holding up to
128 sockets for example. If an item is full, allocate a new one.

We have such a scheme in poll()/select(), for example: see the function
poll_get_entry() in fs/select.c.

Use a small struct embedded on the stack, and allocate extra items if the
number of fds is too big.

(If you can't allocate memory to hold pointers, chances are you won't be
able to clone skbs anyway. One skb is about 400 bytes.)
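
A minimal userspace sketch of the chunked-list idea above, loosely in the
spirit of the poll()/select() scheme. The names recipient_chunk,
recipient_list and CHUNK_SLOTS are hypothetical, and malloc() stands in
for a non-sleeping kernel allocation; this is not code from the patch
series, only an illustration under those assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SLOTS 128 /* "up to 128 sockets" per item (illustrative) */

/* One item of the chain; void * stands in for a socket pointer. */
struct recipient_chunk {
	struct recipient_chunk *next;
	size_t used;
	void *socks[CHUNK_SLOTS];
};

/* The first chunk is embedded, so small groups need no allocation. */
struct recipient_list {
	struct recipient_chunk first;
	struct recipient_chunk *tail;
	size_t count;
};

static void recipient_list_init(struct recipient_list *l)
{
	l->first.next = NULL;
	l->first.used = 0;
	l->tail = &l->first;
	l->count = 0;
}

/* Append one recipient; returns 0 on success, -1 if a new chunk
 * could not be allocated. */
static int recipient_list_add(struct recipient_list *l, void *sock)
{
	struct recipient_chunk *c = l->tail;

	assert(c != NULL);
	if (c->used == CHUNK_SLOTS) {
		/* Assumption: kernel code would use a non-sleeping
		 * allocation here instead of malloc(). */
		struct recipient_chunk *n = malloc(sizeof(*n));

		if (!n)
			return -1;
		n->next = NULL;
		n->used = 0;
		c->next = n;
		l->tail = n;
		c = n;
	}
	c->socks[c->used++] = sock;
	l->count++;
	return 0;
}

/* Number of chunks in use, including the embedded one. */
static size_t recipient_list_chunks(const struct recipient_list *l)
{
	const struct recipient_chunk *c;
	size_t n = 0;

	for (c = &l->first; c; c = c->next)
		n++;
	return n;
}

/* Free the heap-allocated chunks and reset the list. */
static void recipient_list_free(struct recipient_list *l)
{
	struct recipient_chunk *c = l->first.next;

	while (c) {
		struct recipient_chunk *next = c->next;

		free(c);
		c = next;
	}
	recipient_list_init(l);
}
```

With the 90 members mentioned for the D-Bus case, everything fits in the
embedded chunk and no allocation happens at all. A real kernel version
would likely embed a smaller chunk, since 128 pointers is a lot of stack.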

If new members are added to the group while you are iterating the list,
they won't receive a copy of the message.

Or just chain the skbs while you clone them, and store the socket in
skb->sk... no need for extra memory allocations.

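
A userspace sketch of the "chain the clones" idea, under stated
assumptions: struct msg is a hypothetical stand-in for an skb, its sk
field plays the role of skb->sk, and limited_alloc() is a made-up
allocator that simulates memory pressure. It only shows that the chain of
clones can itself serve as the recipient list, and that a failed clone
can be handled by dropping the whole chain before anything is queued. It
does not address full receive queues.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: payload plus destination socket. */
struct msg {
	struct msg *next;
	void *sk;
	size_t len;
	char data[64];
};

typedef void *(*alloc_fn)(size_t);

/* Demonstration allocator: fails once alloc_budget is exhausted. */
static int alloc_budget;

static void *limited_alloc(size_t size)
{
	if (alloc_budget <= 0)
		return NULL;
	alloc_budget--;
	return malloc(size);
}

/* Copy the message; the clone starts unlinked and without a socket. */
static struct msg *msg_clone(const struct msg *orig, alloc_fn alloc)
{
	struct msg *c = alloc(sizeof(*c));

	if (c) {
		memcpy(c, orig, sizeof(*c));
		c->next = NULL;
		c->sk = NULL;
	}
	return c;
}

static void chain_free(struct msg *head)
{
	while (head) {
		struct msg *next = head->next;

		free(head);
		head = next;
	}
}

/* Build one clone per socket, linked in iteration order. On success
 * returns 0 and stores the chain in *out. If any clone fails, frees
 * the partial chain, sets *out to NULL and returns -1. */
static int clone_chain(const struct msg *orig, void *const *socks,
		       size_t n, alloc_fn alloc, struct msg **out)
{
	struct msg *head = NULL, **tailp = &head;
	size_t i;

	assert(out != NULL);
	*out = NULL;
	for (i = 0; i < n; i++) {
		struct msg *c = msg_clone(orig, alloc);

		if (!c) {
			chain_free(head);
			return -1;
		}
		c->sk = socks[i]; /* the clone remembers its destination */
		*tailp = c;
		tailp = &c->next;
	}
	*out = head;
	return 0;
}
```

A sender would call clone_chain() while walking the group members, then
walk the returned chain once more to queue each clone on the socket
recorded in its sk field.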
>
> - Another idea would be to avoid completely the allocation by inlining
> unix_find_multicast_recipients() inside unix_dgram_sendmsg() and
> delivering the messages to the recipients as long as the list is
> being iterated locklessly. But I want to provide atomicity of
> delivery: the message must be delivered with skb_queue_tail() either
> to all the recipients or to none of them in case of interruption or
> memory pressure. I don't see how I can achieve that without
> iterating several times on the list of recipients, hence the
> allocation and the copy in the array. I also want to guarantee the
> order of delivery as described in multicast-unix-sockets.txt and for
> this, I am taking lots of spinlocks anyway. I don't see how to avoid
> that, but I would be happy to be wrong and have a better solution.
>

So if one destination has a full receive queue, you want nobody to receive
the message? That seems a bit risky to me, if someone sends SIGSTOP to
one of your processes...

>
> To give an idea of the number of members in a multicast group for the
> D-Bus use case, I have 90 D-Bus connections on my session bus:
>
> $ dbus-send --print-reply --dest=org.freedesktop.DBus \
> /org/freedesktop/DBus org.freedesktop.DBus.ListNames | grep '":'|wc -l
> 90
>
> In common cases, there should be only a few real recipients (1 or 2?)
> after the socket filters eliminate most of them, but
> unix_find_multicast_recipients() will still allocate an array of
> about that size.
>

I am not sure that doing 90 skb clones and filtering them one by one is
going to be fast :-(