From: Steven Rostedt <rostedt@goodmis.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: netdev@oss.sgi.com, "Paul E. McKenney" <paulmck@us.ibm.com>,
linux-kernel@vger.kernel.org
Subject: Re: 2.6.13-rc6-rt6
Date: Fri, 19 Aug 2005 17:22:28 -0400 [thread overview]
Message-ID: <1124486548.18408.18.camel@localhost.localdomain> (raw)
In-Reply-To: <20050817162324.GA24495@elte.hu>
On Wed, 2005-08-17 at 18:23 +0200, Ingo Molnar wrote:
>
> > And it goes on and on. This happens everytime. Without netconsole, I
> > only get the nonzero lock count error. Also, one of my lockups on SMP
> > had to do with the kernel_thread_helper:
> >
> > Using IPI Shortcut mode
> > khelper/794[CPU#0]: BUG in set_new_owner at kernel/rt.c:916
This was with netconsole and showed up after a bunch of other bugs. So
this is a side effect of what happened earlier.
>
> this is a 'must not happen'. Somehow lock->held list got non-empty.
> Maybe some use-after-free thing? Havent seen it myself.
>
I started debugging netconsole with the RT patch and found this
happening. After seeing what's wrong, I looked at the latest git
branch, and it seems to already have a similar solution that I was going
to make. Here's a description of what's wrong.
In net/core/dev.c the following code is in net_rx_action:
netpoll_poll_lock(dev);
if (dev->quota <= 0 || dev->poll(dev, &budget)) {
netpoll_poll_unlock(dev);
raw_local_irq_disable();
list_del(&dev->poll_list);
list_add_tail(&dev->poll_list, &queue->poll_list);
if (dev->quota < 0)
dev->quota += dev->weight;
else
dev->quota = dev->weight;
} else {
netpoll_poll_unlock(dev);
The netpoll_poll_lock and netpoll_poll_unlock look like this (in current RT):
static inline netpoll_poll_lock(struct net_device *dev)
{
if (dev->npinfo) {
spin_lock(&dev->npinfo->poll_lock);
dev->npinfo->poll_owner = smp_processor_id();
}
}
static inline void netpoll_poll_unlock(struct net_device *dev)
{
if (dev->npinfo) {
dev->npinfo->poll_owner = -1;
spin_unlock(&dev->npinfo->poll_lock);
}
}
The problem here is that between netpoll_poll_lock and
netpoll_poll_unlock the dev->npinfo gets assigned. So we unlock the
dev->npinfo->poll_lock without ever locking it.
Here's the port from the latest git to solve this. I've CCed the netdev,
since I'm not sure I got all the places for rcu_lock for the netpoll. At
least to solve this problem. I did boot up the kernel and this patch
did fix my bugs that I was getting using netconsole. (I have one more
patch to send to fix the illegal API messages).
-- Steve
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Index: linux_realtime_ernie/include/linux/netpoll.h
===================================================================
--- linux_realtime_ernie/include/linux/netpoll.h (revision 296)
+++ linux_realtime_ernie/include/linux/netpoll.h (working copy)
@@ -60,25 +60,31 @@
return ret;
}
-static inline void netpoll_poll_lock(struct net_device *dev)
+static inline void *netpoll_poll_lock(struct net_device *dev)
{
+ rcu_read_lock();
if (dev->npinfo) {
spin_lock(&dev->npinfo->poll_lock);
dev->npinfo->poll_owner = smp_processor_id();
+ return dev->npinfo;
}
+ return NULL;
}
-static inline void netpoll_poll_unlock(struct net_device *dev)
+static inline void netpoll_poll_unlock(void *have)
{
- if (dev->npinfo) {
- dev->npinfo->poll_owner = -1;
- spin_unlock(&dev->npinfo->poll_lock);
+ struct netpoll_info *npi = have;
+
+ if (npi) {
+ npi->poll_owner = -1;
+ spin_unlock(&npi->poll_lock);
}
+ rcu_read_unlock();
}
#else
#define netpoll_rx(a) 0
-#define netpoll_poll_lock(a)
+#define netpoll_poll_lock(a) 0
#define netpoll_poll_unlock(a)
#endif
Index: linux_realtime_ernie/net/core/netpoll.c
===================================================================
--- linux_realtime_ernie/net/core/netpoll.c (revision 296)
+++ linux_realtime_ernie/net/core/netpoll.c (working copy)
@@ -726,6 +726,9 @@
/* last thing to do is link it to the net device structure */
ndev->npinfo = npinfo;
+ /* avoid racing with NAPI reading npinfo */
+ synchronize_rcu();
+
return 0;
release:
Index: linux_realtime_ernie/net/core/dev.c
===================================================================
--- linux_realtime_ernie/net/core/dev.c (revision 296)
+++ linux_realtime_ernie/net/core/dev.c (working copy)
@@ -1723,6 +1723,7 @@
while (!list_empty(&queue->poll_list)) {
struct net_device *dev;
+ void *have;
if (budget <= 0 || jiffies - start_time > 1)
goto softnet_break;
@@ -1735,10 +1736,10 @@
dev = list_entry(queue->poll_list.next,
struct net_device, poll_list);
- netpoll_poll_lock(dev);
+ have = netpoll_poll_lock(dev);
if (dev->quota <= 0 || dev->poll(dev, &budget)) {
- netpoll_poll_unlock(dev);
+ netpoll_poll_unlock(have);
raw_local_irq_disable();
list_del(&dev->poll_list);
list_add_tail(&dev->poll_list, &queue->poll_list);
@@ -1747,7 +1748,7 @@
else
dev->quota = dev->weight;
} else {
- netpoll_poll_unlock(dev);
+ netpoll_poll_unlock(have);
dev_put(dev);
raw_local_irq_disable();
}
next prev parent reply other threads:[~2005-08-19 21:22 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-08-16 12:18 2.6.13-rc6-rt3 Ingo Molnar
2005-08-16 15:31 ` 2.6.13-rc6-rt5 Steven Rostedt
2005-08-16 15:44 ` 2.6.13-rc6-rt5 Steven Rostedt
2005-08-16 16:08 ` 2.6.13-rc6-rt5 Steven Rostedt
2005-08-16 16:16 ` 2.6.13-rc6-rt5 Ingo Molnar
2005-08-16 16:22 ` 2.6.13-rc6-rt5 Ingo Molnar
2005-08-16 16:32 ` 2.6.13-rc6-rt5 Ingo Molnar
2005-08-16 16:37 ` 2.6.13-rc6-rt5 Ingo Molnar
2005-08-16 16:52 ` 2.6.13-rc6-rt5 Ingo Molnar
2005-08-16 17:08 ` 2.6.13-rc6-rt6 Ingo Molnar
2005-08-16 17:50 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-16 18:07 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-16 18:50 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 4:20 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 5:46 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 6:47 ` 2.6.13-rc6-rt6 Ingo Molnar
2005-08-17 14:05 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 14:24 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 16:13 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 16:23 ` 2.6.13-rc6-rt6 Ingo Molnar
2005-08-17 17:10 ` 2.6.13-rc6-rt6 K.R. Foley
2005-08-17 18:31 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 19:31 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-18 0:02 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-18 2:44 ` 2.6.13-rc6-rt6 Steven Rostedt
[not found] ` <20050822075012.GB19386@elte.hu>
[not found] ` <1124704837.5208.22.camel@localhost.localdomain>
[not found] ` <20050822101632.GA28803@elte.hu>
[not found] ` <1124710309.5208.30.camel@localhost.localdomain>
[not found] ` <20050822113858.GA1160@elte.hu>
[not found] ` <1124715755.5647.4.camel@localhost.localdomain>
[not found] ` <20050822183355.GB13888@elte.hu>
2005-08-22 19:40 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-22 19:44 ` [RFC] RT-patch update to remove the global pi_lock Steven Rostedt
2005-08-22 22:19 ` Daniel Walker
2005-08-23 0:26 ` Steven Rostedt
2005-08-23 0:51 ` Daniel Walker
2005-08-23 1:32 ` Steven Rostedt
2005-08-23 3:38 ` Steven Rostedt
[not found] ` <1124908080.5604.22.camel@localhost.localdomain>
[not found] ` <1124917003.5711.8.camel@localhost.localdomain>
2005-08-24 21:05 ` Thomas Gleixner
2005-08-25 1:13 ` Steven Rostedt
2005-08-25 1:38 ` Daniel Walker
2005-08-25 1:48 ` Steven Rostedt
2005-08-25 6:31 ` Ingo Molnar
2005-08-25 6:35 ` Ingo Molnar
2005-08-25 16:15 ` Steven Rostedt
2005-08-25 19:34 ` Ingo Molnar
2005-08-25 19:46 ` Steven Rostedt
2005-08-23 5:29 ` Ingo Molnar
2005-08-25 14:47 ` Steven Rostedt
2005-08-25 15:06 ` Steven Rostedt
2005-08-25 17:47 ` Ingo Molnar
2005-08-25 20:09 ` Steven Rostedt
2005-08-25 21:32 ` Daniel Walker
2005-08-26 2:23 ` Steven Rostedt
2005-08-26 13:52 ` Steven Rostedt
2005-08-30 15:00 ` Steven Rostedt
2005-08-30 15:52 ` Steven Rostedt
2005-08-30 23:08 ` Steven Rostedt
2005-08-31 15:01 ` [FYI] 2.6.13-rt3 and a nanosleep jitter test Steven Rostedt
2005-08-31 15:12 ` Daniel Walker
2005-08-31 15:30 ` Steven Rostedt
2005-08-31 15:13 ` Daniel Walker
2005-08-31 15:19 ` Steven Rostedt
2005-08-31 15:30 ` Daniel Walker
2005-08-23 5:46 ` 2.6.13-rc6-rt6 Ingo Molnar
2005-08-19 21:22 ` Steven Rostedt [this message]
2005-08-19 22:47 ` 2.6.13-rc6-rt6 Paul E. McKenney
2005-08-19 23:02 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-19 23:12 ` 2.6.13-rc6-rt6 Paul E. McKenney
2005-08-19 23:20 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-19 23:44 ` 2.6.13-rc6-rt6 Paul E. McKenney
2005-08-22 7:53 ` 2.6.13-rc6-rt6 Ingo Molnar
2005-08-17 19:27 ` 2.6.13-rc6-rt6 Ingo Molnar
2005-08-17 19:39 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 17:32 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 19:34 ` 2.6.13-rc6-rt6 Steven Rostedt
2005-08-17 5:59 ` 2.6.13-rc6-rt6 Ingo Molnar
2005-08-17 20:01 ` 2.6.13-rc6-rt8 Peter Bortas
2005-08-23 6:14 ` 2.6.13-rc6-rt8 Ingo Molnar
2005-08-28 20:36 ` 2.6.13-rc6-rt8 Peter Bortas
2005-08-18 9:57 ` 2.6.13-rc6-rt3 Alistair John Strachan
2005-08-18 10:00 ` 2.6.13-rc6-rt3 Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1124486548.18408.18.camel@localhost.localdomain \
--to=rostedt@goodmis.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=netdev@oss.sgi.com \
--cc=paulmck@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox