From: Peter Zijlstra <peterz@infradead.org>
To: David Miller <davem@davemloft.net>
Cc: jarkao2@gmail.com, Larry.Finger@lwfinger.net, kaber@trash.net,
torvalds@linux-foundation.org, akpm@linux-foundation.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-wireless@vger.kernel.org, mingo@redhat.com,
Nick Piggin <nickpiggin@yahoo.com.au>,
Paul E McKenney <paulmck@linux.vnet.ibm.com>
Subject: Re: Kernel WARNING: at net/core/dev.c:1330 __netif_schedule+0x2c/0x98()
Date: Thu, 24 Jul 2008 11:10:48 +0200
Message-ID: <1216890648.7257.258.camel@twins>
In-Reply-To: <20080723.131607.79681752.davem@davemloft.net>
On Wed, 2008-07-23 at 13:16 -0700, David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Wed, 23 Jul 2008 11:49:14 +0000
>
> > On Wed, Jul 23, 2008 at 11:35:19AM +0000, Jarek Poplawski wrote:
> > > On Wed, Jul 23, 2008 at 12:58:16PM +0200, Peter Zijlstra wrote:
> > ...
> > > > When I look at the mac802.11 code in ieee80211_tx_pending() it looks
> > > > like it can do with just one lock at a time, instead of all - but I
> > > > might be missing some obvious details.
> > > >
> > > > So I guess my question is, is netif_tx_lock() here to stay, or is the
> > > > right fix to convert all those drivers to use __netif_tx_lock() which
> > > > locks only a single queue?
> > > >
> > >
> > > It's a new thing mainly for new hardware/drivers, and just after
> > > conversion (older drivers effectively use __netif_tx_lock()), so it'll
> > > probably stay for some time until something better is found. David,
> > > will tell the rest, I hope.
> >
> > ...And, of course, these new drivers should also lock a single queue
> > where possible.
>
> It isn't going away.
>
> There will always be a need for a "stop all the TX queues" operation.

OK, then how about something like this? The idea is to wrap the per-queue
TX lock in a read lock on the device and let netif_tx_lock() be the write
side. That excludes all the per-queue locks while the write side is held,
but does not incur cacheline bouncing on the read side, because the read
side uses per-cpu counters like RCU does.

This of course requires that netif_tx_lock() is rare, otherwise the
cachelines will bounce anyway...

I probably missed a few details, but I think the below ought to show the
idea:
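
struct tx_lock {
	int busy;			/* set while netif_tx_lock() is held */
	spinlock_t lock;		/* held by the write side; readers take it when busy */
	unsigned long *counters;	/* per-cpu count of read-side holders */
};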

int tx_lock_init(struct tx_lock *txl)
{
	txl->busy = 0;
	spin_lock_init(&txl->lock);
	txl->counters = alloc_percpu(unsigned long);
	if (!txl->counters)
		return -ENOMEM;

	return 0;
}

void __netif_tx_lock(struct netdev_queue *txq, int cpu)
{
	struct net_device *dev = txq->dev;

	if (rcu_dereference(dev->tx_lock.busy)) {
		/* write side is active: wait for it by taking its lock */
		spin_lock(&dev->tx_lock.lock);
		(*percpu_ptr(dev->tx_lock.counters, cpu))++;
		spin_unlock(&dev->tx_lock.lock);
	} else {
		(*percpu_ptr(dev->tx_lock.counters, cpu))++;
	}

	spin_lock(&txq->_xmit_lock);
	txq->xmit_lock_owner = cpu;
}

void __netif_tx_unlock(struct netdev_queue *txq)
{
	struct net_device *dev = txq->dev;

	/* decrement on the cpu that did the increment */
	(*percpu_ptr(dev->tx_lock.counters, txq->xmit_lock_owner))--;
	txq->xmit_lock_owner = -1;
	spin_unlock(&txq->_xmit_lock);
}

unsigned long tx_lock_read_counters(struct tx_lock *txl)
{
	int i;
	unsigned long counter = 0;

	/* can use online - the inc/dec are matched per cpu */
	for_each_online_cpu(i)
		counter += *percpu_ptr(txl->counters, i);

	return counter;
}

void netif_tx_lock(struct net_device *dev)
{
	spin_lock(&dev->tx_lock.lock);
	rcu_assign_pointer(dev->tx_lock.busy, 1);

	/* wait for all read-side holders to drain */
	while (tx_lock_read_counters(&dev->tx_lock))
		cpu_relax();
}

void netif_tx_unlock(struct net_device *dev)
{
	rcu_assign_pointer(dev->tx_lock.busy, 0);
	smp_wmb(); /* because rcu_assign_pointer is broken */
	spin_unlock(&dev->tx_lock.lock);
}
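
To make the counter bookkeeping in the sketch above easier to follow, here is a small single-threaded userspace model of it. It only illustrates why the unlock path decrements the slot of the recorded owner, so that each per-cpu slot returns to zero and the sum tells the write side when all holders are gone. Everything named model_* or MODEL_* is hypothetical and not kernel API, the fixed-size array is just a stand-in for per-cpu data, and the model deliberately says nothing about the locking or memory ordering a real implementation would need.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS   4	/* hypothetical cpu count for the model */
#define MODEL_NR_QUEUES 2	/* hypothetical number of TX queues */

struct model_queue {
	int xmit_lock_owner;	/* cpu holding the queue, or -1 if free */
};

struct model_dev {
	unsigned long counters[MODEL_NR_CPUS];	/* stand-in for per-cpu data */
	struct model_queue queues[MODEL_NR_QUEUES];
};

static void model_init(struct model_dev *dev)
{
	size_t i;

	for (i = 0; i < MODEL_NR_CPUS; i++)
		dev->counters[i] = 0;
	for (i = 0; i < MODEL_NR_QUEUES; i++)
		dev->queues[i].xmit_lock_owner = -1;
}

/* Read side: count the holder in its own cpu slot and record the owner. */
static void model_queue_lock(struct model_dev *dev, int q, int cpu)
{
	assert(dev->queues[q].xmit_lock_owner == -1);
	dev->counters[cpu]++;
	dev->queues[q].xmit_lock_owner = cpu;
}

/* Read side: decrement the slot that was incremented at lock time. */
static void model_queue_unlock(struct model_dev *dev, int q)
{
	int cpu = dev->queues[q].xmit_lock_owner;

	assert(cpu >= 0);
	dev->counters[cpu]--;
	dev->queues[q].xmit_lock_owner = -1;
}

/* Sum of all slots: the number of queues currently held. */
static unsigned long model_read_counters(const struct model_dev *dev)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < MODEL_NR_CPUS; i++)
		sum += dev->counters[i];

	return sum;
}

/* Write side may proceed only once every holder has dropped its queue. */
static int model_writer_may_enter(const struct model_dev *dev)
{
	return model_read_counters(dev) == 0;
}
```

Taking queue 0 on cpu 1 and queue 1 on cpu 3 gives a sum of 2; dropping both brings every slot back to zero, which is the condition the write side spins on in netif_tx_lock() above.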