netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Graf <tgraf@suug.ch>
To: Patrick McHardy <kaber@trash.net>
Cc: davem@davemloft.net, netdev@oss.sgi.com, spam@crocom.com.pl,
	kuznet@ms2.inr.ac.ru, jmorris@redhat.com
Subject: Re: [PATCH] PKT_SCHED: Initialize list field in dummy qdiscs
Date: Sun, 7 Nov 2004 15:00:15 +0100	[thread overview]
Message-ID: <20041107140015.GA31969@postel.suug.ch> (raw)
In-Reply-To: <418DE37E.2050504@trash.net>

> Before the RCU change distruction of the qdisc and all inner
> qdiscs happend immediately and under the rtnl semaphore. This
> made sure nothing holding the rtnl semaphore could end up with
> invalid memory. This is not true anymore, qdiscs found on
> dev->qdisc_list can be suddenly destroyed.

And we should switch back to this again if possible. I haven't
audited all paths to dev_activate but we have at most
a list addition which might not be protected with an
rtnl semaphore. I'm not 100% sure about this yet.

> dev->qdisc_list is protected by qdisc_tree_lock everywhere but in
> qdisc_lookup, this is also the only structure that is consistently
> protected by this lock. To fix the list corruption we can either
> protect qdisc_lookup with qdisc_tree_lock or use rcu-list macros
> and remove all read_lock(&qdisc_tree_locks) (and replace it by
> a spinlock).

qdisc_lookup was the only one not yet protected by a preempt
disable.

> Unfortunately, since we can not rely on the rtnl protection for
> memory anymore, it seems we need to refcount all uses of
> dev->qdisc_list that before relied on this protection and can't
> use rcu_read_lock.

There is no list iteration not yet protected by the rtnl semaphore
and the only interruption is because of the rcu callback.

> To make this safe, we need to atomically
> atomic_dec_and_test/list_del in qdisc_destroy and atomically do
> list_for_each_entry/atomic_inc in qdisc_lookup, so we should
> should simply keep the non-rcu lists and use qdisc_tree_lock
> in qdisc_lookup.

You mean before qdisc_lookup and until the reference is released
again? These are huge locking regions involving calls which might
sleep and possible qdisc_destroy calling paths.  So this won't
work quite well.

So in my opinion we should screw that call_rcu because it doesn't
make much sense. In case dev_activate is not synchronized with
rtnl sempaphore we have to make sure that qdisc_destroy always
locks on qdisc_tree_lock which is not the case for a few paths as
of now, although I'm not sure if any of those actually ever call
qdisc_destroy with refcnt==1.

If screwing call_rcu is not possible we can still do a refcnt
incremented before call_rcu in qdisc_destroy and every base
caller of qdisc_destroy (excluding those in qdisc destroy routines)
sleeps on it after it invoked qdisc_destroy and reached a safe
place to sleep. So we can make sure that the qdisc is really gone
after invoking qdisc_destroy. Otherwise we will always run into
troubles with new messages arriving and qdisc deletions still
pending.

  reply	other threads:[~2004-11-07 14:00 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-05  9:48 PROBLEM: IProute hangs after running traffic shaping scripts Szymon Miotk
2004-11-05 11:54 ` Thomas Graf
2004-11-05 14:16 ` [PATCH] PKT_SCHED: Initialize list field in dummy qdiscs Thomas Graf
2004-11-05 16:12   ` Patrick McHardy
2004-11-05 16:39     ` Thomas Graf
2004-11-05 17:26       ` Patrick McHardy
2004-11-05 17:58         ` Thomas Graf
2004-11-05 18:18           ` Patrick McHardy
2004-11-05 19:43             ` Thomas Graf
2004-11-06  1:18               ` Thomas Graf
2004-11-06  1:47                 ` Patrick McHardy
2004-11-06  1:59                   ` Thomas Graf
2004-11-06 14:50                     ` Thomas Graf
2004-11-07  8:57                       ` Patrick McHardy
2004-11-07 14:00                         ` Thomas Graf [this message]
2004-11-07 16:19                           ` Patrick McHardy
2004-11-07 16:33                             ` Thomas Graf
2004-11-07 17:02                               ` Patrick McHardy
2004-11-07 17:49                                 ` Thomas Graf
2004-11-07 18:22                                   ` Patrick McHardy
2004-11-07 19:08                                     ` Thomas Graf
2004-11-06  0:36           ` David S. Miller
2004-11-07 22:22 ` PROBLEM: IProute hangs after running traffic shaping scripts Patrick McHardy
2004-11-08  1:40   ` Patrick McHardy
2004-11-08 13:54     ` Thomas Graf
2004-11-08 16:12       ` Patrick McHardy
2004-11-08 18:33         ` Thomas Graf
2004-11-08 19:46           ` Patrick McHardy
2004-11-08 20:15             ` Thomas Graf
2004-11-10  0:18             ` David S. Miller
2004-11-10  0:40               ` Patrick McHardy
2004-11-10  0:55                 ` Patrick McHardy
2004-11-10  6:13                   ` David S. Miller
2004-11-10 12:08             ` Szymon Miotk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20041107140015.GA31969@postel.suug.ch \
    --to=tgraf@suug.ch \
    --cc=davem@davemloft.net \
    --cc=jmorris@redhat.com \
    --cc=kaber@trash.net \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=netdev@oss.sgi.com \
    --cc=spam@crocom.com.pl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).