From: Stephen Hemminger <shemminger@linux-foundation.org>
To: Patrick McHardy <kaber@trash.net>
Cc: David Miller <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: [IPV4 3/5] fib_trie: dump doesnt use RCU
Date: Wed, 23 Jan 2008 22:45:02 -0800 [thread overview]
Message-ID: <479833EE.1090905@linux-foundation.org> (raw)
In-Reply-To: <47983304.3030309@trash.net>
Patrick McHardy wrote:
> David Miller wrote:
>> From: Stephen Hemminger <shemminger@linux-foundation.org>
>> Date: Wed, 23 Jan 2008 14:48:47 -0800
>>
>>
>>> Since fib dump (via netlink) holds the RTNL mutex, it is unnecessary
>>> to use RCU, and it is impossible to get truncated (-EBUSY) result.
>>>
>>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>>>
>>
>> You tested this patch, right? :-/
>>
>> The whole reason we need the nlk->cb[] state is to hold things across
>> multiple recvmsg() calls that might be necessary to obtain the full
>> dump.
>>
>> rtnetlink goes:
>>
>> rtnl_lock();
>> netlink_rcv_skb(skb, &rtnetlink_rcv_msg);
>> ...
>> static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>> {
>> ...
>> if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) {
>> struct sock *rtnl;
>> rtnl_dumpit_func dumpit;
>>
>> dumpit = rtnl_get_dumpit(family, type);
>> if (dumpit == NULL)
>> return -EOPNOTSUPP;
>>
>> __rtnl_unlock();
>> rtnl = net->rtnl;
>> err = netlink_dump_start(rtnl, skb, nlh, dumpit, NULL);
>> rtnl_lock();
>> return err;
>>
>> (NOTE: Drops RTNL semaphore for netlink_dump_start() call)
>>
>> ...
>> int netlink_dump_start(struct sock *ssk, struct sk_buff *skb,
>> struct nlmsghdr *nlh,
>> int (*dump)(struct sk_buff *skb,
>> struct netlink_callback *),
>> int (*done)(struct netlink_callback *))
>> {
>> ...
>> cb->dump = dump;
>> cb->done = done;
>> cb->nlh = nlh;
>> atomic_inc(&skb->users);
>> cb->skb = skb;
>> ...
>> mutex_lock(nlk->cb_mutex);
>> ...
>> nlk->cb = cb;
>> mutex_unlock(nlk->cb_mutex);
>>
>> netlink_dump(sk);
>> ...
>>
>> static int netlink_dump(struct sock *sk)
>> {
>> ...
>> mutex_lock(nlk->cb_mutex);
>> ...
>> len = cb->dump(skb, cb);
>>
>> if (len > 0) {
>> mutex_unlock(nlk->cb_mutex);
>> skb_queue_tail(&sk->sk_receive_queue, skb);
>> sk->sk_data_ready(sk, len);
>> return 0;
>> }
>>
>> (NOTE: Therefore cb->dump() runs without RTNL semaphore held)
>>
>> ...
>> static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
>> struct msghdr *msg, size_t len,
>> int flags)
>> {
>> ...
>> if (nlk->cb && atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2)
>> netlink_dump(sk);
>> ...
>>
>> Therefore, that RTNL assertion you added should have triggered on any
>> dump you may have tried since ->dump() is always invoked without the
>> RTNL semaphore since rtnetlink drops it around the ->dump() call and
>> the call chain for this fib_trie cause would be:
>>
>> inet_dump_fib()
>> fn_trie_dump()
>>
>> and nothing in that code path retakes the RTNL semaphore.
>
> Actually we're always holding the rtnl during dumps, nlk->cb_mutex points
> to rtnl_mutex in case of rtnetlink. It used to be held only during the
> first
> ->dump invocation and not on continuations, but I changed this a few
> versions
> ago.
>
>
Yes, I tested, no the assert didn't hit... Actually, the reason I went
down this path was
because I couldn't trigger -EBUSY with concurrent updates.
P.s: I checked and Quagga handles -EAGAIN, so if you need an error code that
would be the one to use.
next prev parent reply other threads:[~2008-01-24 6:44 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20080123224844.610730277@linux-foundation.org>
2008-01-23 22:48 ` [IPV4 1/5] fib_trie: more whitespace cleanup Stephen Hemminger
2008-01-24 4:37 ` David Miller
2008-01-23 22:48 ` [IPV4 2/5] fib_trie: remove unneeded NULL check Stephen Hemminger
2008-01-24 4:38 ` David Miller
2008-01-23 22:48 ` [IPV4 3/5] fib_trie: dump doesnt use RCU Stephen Hemminger
2008-01-24 4:50 ` David Miller
2008-01-24 6:41 ` Patrick McHardy
2008-01-24 6:43 ` David Miller
2008-01-24 6:47 ` Patrick McHardy
2008-01-24 7:26 ` David Miller
2008-01-24 6:45 ` Stephen Hemminger [this message]
2008-01-24 7:27 ` David Miller
2008-01-24 21:51 ` [PATCH] fib_trie: rescan if key is lost during dump Stephen Hemminger
2008-01-25 8:23 ` Jarek Poplawski
2008-01-25 16:13 ` Stephen Hemminger
2008-01-25 19:01 ` Jarek Poplawski
2008-02-01 0:45 ` David Miller
2008-01-23 22:48 ` [IPV4 4/5] fib_trie: version 0.410 Stephen Hemminger
2008-01-24 4:50 ` David Miller
2008-01-23 22:48 ` [IPV4 5/5] fib_semantics: sparse warnings Stephen Hemminger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=479833EE.1090905@linux-foundation.org \
--to=shemminger@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=kaber@trash.net \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.