From: Thomas Graf <tgraf@suug.ch>
To: Patrick McHardy <kaber@trash.net>
Cc: davem@davemloft.net, netdev@oss.sgi.com, spam@crocom.com.pl,
kuznet@ms2.inr.ac.ru, jmorris@redhat.com
Subject: Re: [PATCH] PKT_SCHED: Initialize list field in dummy qdiscs
Date: Sun, 7 Nov 2004 15:00:15 +0100
Message-ID: <20041107140015.GA31969@postel.suug.ch>
In-Reply-To: <418DE37E.2050504@trash.net>
> Before the RCU change destruction of the qdisc and all inner
> qdiscs happened immediately and under the rtnl semaphore. This
> made sure nothing holding the rtnl semaphore could end up with
> invalid memory. This is not true anymore, qdiscs found on
> dev->qdisc_list can be suddenly destroyed.
And we should switch back to this again if possible. I haven't
audited all paths to dev_activate yet, but at most there is
a list addition which might not be protected by the
rtnl semaphore. I'm not 100% sure about this.
> dev->qdisc_list is protected by qdisc_tree_lock everywhere but in
> qdisc_lookup, this is also the only structure that is consistently
> protected by this lock. To fix the list corruption we can either
> protect qdisc_lookup with qdisc_tree_lock or use rcu-list macros
> and remove all read_lock(&qdisc_tree_lock) calls (and replace
> the lock by a spinlock).
qdisc_lookup was the only one not yet protected by a preempt
disable.
> Unfortunately, since we can not rely on the rtnl protection for
> memory anymore, it seems we need to refcount all uses of
> dev->qdisc_list that before relied on this protection and can't
> use rcu_read_lock.
There is no list iteration not yet protected by the rtnl semaphore
and the only interruption is because of the rcu callback.
> To make this safe, we need to atomically
> atomic_dec_and_test/list_del in qdisc_destroy and atomically do
> list_for_each_entry/atomic_inc in qdisc_lookup, so we
> should simply keep the non-rcu lists and use qdisc_tree_lock
> in qdisc_lookup.
You mean before qdisc_lookup and until the reference is released
again? These are huge locking regions involving calls which might
sleep and possible qdisc_destroy calling paths. So this won't
work quite well.
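
For illustration, the scheme quoted above (lookup takes its reference
while holding the list lock, and the final put unlinks under the same
lock) can be sketched in userspace C. This is only a sketch under
stated assumptions: toy_qdisc, tree_lock and the toy_* functions are
hypothetical stand-ins, a pthread mutex replaces qdisc_tree_lock, and
a plain counter guarded by that mutex replaces the atomic refcnt. It
is not the kernel code under discussion.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a qdisc, not struct Qdisc. */
struct toy_qdisc {
    unsigned int handle;
    int refcnt;               /* guarded by tree_lock in this sketch */
    struct toy_qdisc *next;   /* stand-in for dev->qdisc_list linkage */
};

static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_qdisc *qdisc_list;

/* Allocate a qdisc holding one reference and link it into the list. */
struct toy_qdisc *toy_qdisc_create(unsigned int handle)
{
    struct toy_qdisc *q = calloc(1, sizeof(*q));

    if (!q)
        return NULL;
    q->handle = handle;
    q->refcnt = 1;
    pthread_mutex_lock(&tree_lock);
    q->next = qdisc_list;
    qdisc_list = q;
    pthread_mutex_unlock(&tree_lock);
    return q;
}

/* Find by handle and take a reference, both under the same lock. */
struct toy_qdisc *toy_qdisc_lookup(unsigned int handle)
{
    struct toy_qdisc *q;

    pthread_mutex_lock(&tree_lock);
    for (q = qdisc_list; q; q = q->next) {
        if (q->handle == handle) {
            q->refcnt++;
            break;
        }
    }
    pthread_mutex_unlock(&tree_lock);
    return q;
}

/* Drop a reference; the last put unlinks and frees. Returns 1 if freed. */
int toy_qdisc_put(struct toy_qdisc *q)
{
    struct toy_qdisc **pp;
    int freed = 0;

    pthread_mutex_lock(&tree_lock);
    assert(q->refcnt > 0);
    if (--q->refcnt == 0) {
        for (pp = &qdisc_list; *pp; pp = &(*pp)->next) {
            if (*pp == q) {
                *pp = q->next;
                break;
            }
        }
        freed = 1;
    }
    pthread_mutex_unlock(&tree_lock);
    if (freed)
        free(q);   /* no lookup can reach q any more */
    return freed;
}
```

In this sketch the lock is held only across the list walk plus the
increment, and across the decrement plus the unlink. The holder of a
reference is free to sleep between toy_qdisc_lookup() and
toy_qdisc_put().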
So in my opinion we should screw that call_rcu because it doesn't
make much sense. In case dev_activate is not synchronized with
the rtnl semaphore we have to make sure that qdisc_destroy always
locks on qdisc_tree_lock which is not the case for a few paths as
of now, although I'm not sure if any of those actually ever call
qdisc_destroy with refcnt==1.
If screwing call_rcu is not possible we can still increment a
refcnt before call_rcu in qdisc_destroy and have every base
caller of qdisc_destroy (excluding those in qdisc destroy routines)
sleep on it once it has invoked qdisc_destroy and reached a safe
place to sleep. That way we can make sure that the qdisc is really
gone after invoking qdisc_destroy. Otherwise we will always run into
trouble with new messages arriving while qdisc deletions are still
pending.
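
As a rough illustration of the "destroy, then sleep until it is
really gone" idea, here is a userspace C sketch. Assumptions, all
hypothetical: a detached pthread stands in for the deferred call_rcu
callback, and a small completion-like object (toy_done) stands in for
the refcnt the caller would sleep on. None of the toy_* names exist
in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical completion-like object owned by the destroying caller. */
struct toy_done {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool done;
};

/* Hypothetical stand-in for a qdisc. */
struct toy_qdisc {
    unsigned int handle;
    struct toy_done *done;   /* signalled once the deferred free ran */
};

int toy_freed_count;         /* updated under toy_done.lock */

/* Stand-in for the RCU callback: free the qdisc, then wake the waiter. */
static void *toy_deferred_free(void *arg)
{
    struct toy_qdisc *q = arg;
    struct toy_done *d = q->done;

    free(q);

    pthread_mutex_lock(&d->lock);
    toy_freed_count++;
    d->done = true;
    pthread_cond_signal(&d->cond);
    pthread_mutex_unlock(&d->lock);
    return NULL;
}

/* Stand-in for qdisc_destroy(): schedule the free without waiting.
 * Returns 0 on success. */
int toy_qdisc_destroy(struct toy_qdisc *q, struct toy_done *d)
{
    pthread_t t;

    assert(q && d);
    q->done = d;
    if (pthread_create(&t, NULL, toy_deferred_free, q) != 0)
        return -1;
    return pthread_detach(t);
}

/* Called from a place where sleeping is safe: block until the
 * deferred free has actually run. */
void toy_wait_destroyed(struct toy_done *d)
{
    pthread_mutex_lock(&d->lock);
    while (!d->done)
        pthread_cond_wait(&d->cond, &d->lock);
    pthread_mutex_unlock(&d->lock);
}
```

A caller would invoke toy_qdisc_destroy(), leave any region where
sleeping is not allowed, and then call toy_wait_destroyed() before
handling the next request, so that no deletion is still pending when
new messages arrive.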