From: Gautham R Shenoy <ego@in.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, a.p.zijlstra@chello.nl,
Dipankar Sarma <dipankar@in.ibm.com>
Subject: [PATCH] Add irq protection in the percpu-counters cpu-hotplug-callback path
Date: Mon, 15 Oct 2007 11:48:44 +0530
Message-ID: <20071015061844.GA15728@in.ibm.com>
In-Reply-To: <20071011213126.cf92efb7.akpm@linux-foundation.org>
Hi Andrew,
While running regular cpu-offline tests on 2.6.23-mm1, I
hit the following lockdep warning.
It was triggered because some of the per-cpu counters, and thus
their locks, are accessed from IRQ context.
This can deadlock if such an access interrupts a cpu-offline thread while
it holds a counter's lock to transfer a dead cpu's counts to the global counter.
The patch below fixes this. Tested on i386.
Thanks and Regards
gautham.
=====================Warning! ===========================================
[root@llm43]# ./all_hotplug_once
CPU 1 is now offline
=================================
[ INFO: inconsistent lock state ]
2.6.23-mm1 #3
---------------------------------
inconsistent {in-softirq-W} -> {softirq-on-W} usage.
sh/7103 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&percpu_counter_irqsafe){-+..}, at: [<c028e296>] percpu_counter_hotcpu_callback+0x22/0x67
{in-softirq-W} state was registered at:
[<c014126f>] __lock_acquire+0x40d/0xb4a
[<c0141966>] __lock_acquire+0xb04/0xb4a
[<c0141a0b>] lock_acquire+0x5f/0x79
[<c028e4b5>] __percpu_counter_add+0x62/0xad
[<c04d5e81>] _spin_lock+0x21/0x2c
[<c028e4b5>] __percpu_counter_add+0x62/0xad
[<c028e4b5>] __percpu_counter_add+0x62/0xad
[<c01531af>] test_clear_page_writeback+0x88/0xc5
[<c014d35e>] end_page_writeback+0x20/0x3c
[<c0188757>] end_buffer_async_write+0x133/0x181
[<c0141966>] __lock_acquire+0xb04/0xb4a
[<c0187eb4>] end_bio_bh_io_sync+0x21/0x29
[<c0187e93>] end_bio_bh_io_sync+0x0/0x29
[<c0189345>] bio_endio+0x27/0x29
[<c04358f8>] dec_pending+0x17d/0x199
[<c0435a13>] clone_endio+0x73/0x9f
[<c04359a0>] clone_endio+0x0/0x9f
[<c0189345>] bio_endio+0x27/0x29
[<c027ba83>] __end_that_request_first+0x150/0x2c0
[<c034a161>] scsi_end_request+0x1d/0xab
[<c014f5ed>] mempool_free+0x63/0x67
[<c034ac22>] scsi_io_completion+0x108/0x2c7
[<c027e03b>] blk_done_softirq+0x51/0x5c
[<c012b291>] __do_softirq+0x68/0xdb
[<c012b33a>] do_softirq+0x36/0x51
[<c012b4bf>] irq_exit+0x43/0x4e
[<c0106f60>] do_IRQ+0x73/0x83
[<c0105902>] common_interrupt+0x2e/0x34
[<c01600d8>] add_to_swap+0x23/0x66
[<c01031b4>] mwait_idle_with_hints+0x3b/0x3f
[<c01033a8>] mwait_idle+0x0/0xf
[<c01034d1>] cpu_idle+0x9a/0xc7
[<ffffffff>] 0xffffffff
irq event stamp: 4007
hardirqs last enabled at (4007): [<c04d4d9c>] __mutex_lock_slowpath+0x21d/0x241
hardirqs last disabled at (4006): [<c04d4bda>] __mutex_lock_slowpath+0x5b/0x241
softirqs last enabled at (2130): [<c0135ab7>] __rcu_offline_cpu+0x2f/0x5a
softirqs last disabled at (2128): [<c04d5e94>] _spin_lock_bh+0x8/0x31
other info that might help us debug this:
6 locks held by sh/7103:
#0: (&buffer->mutex){--..}, at: [<c019f414>] sysfs_write_file+0x22/0xdb
#1: (cpu_add_remove_lock){--..}, at: [<c01450fd>] cpu_down+0x13/0x36
#2: (sched_hotcpu_mutex){--..}, at: [<c01220db>] migration_call+0x26/0x36a
#3: (cache_chain_mutex){--..}, at: [<c0168289>] cpuup_callback+0x28/0x1f9
#4: (workqueue_mutex){--..}, at: [<c013456d>] workqueue_cpu_callback+0x26/0xca
#5: (percpu_counters_lock){--..}, at: [<c028e287>] percpu_counter_hotcpu_callback+0x13/0x67
stack backtrace:
[<c013febd>] print_usage_bug+0x101/0x10b
[<c01406fd>] mark_lock+0x249/0x3f0
[<c01412d6>] __lock_acquire+0x474/0xb4a
[<c0141a0b>] lock_acquire+0x5f/0x79
[<c028e296>] percpu_counter_hotcpu_callback+0x22/0x67
[<c04d5e81>] _spin_lock+0x21/0x2c
[<c028e296>] percpu_counter_hotcpu_callback+0x22/0x67
[<c028e296>] percpu_counter_hotcpu_callback+0x22/0x67
[<c04d7e3d>] notifier_call_chain+0x2a/0x47
[<c013aece>] raw_notifier_call_chain+0x9/0xc
[<c014503d>] _cpu_down+0x174/0x221
[<c014510f>] cpu_down+0x25/0x36
[<c02e7a66>] store_online+0x24/0x56
[<c02e7a42>] store_online+0x0/0x56
[<c02e5132>] sysdev_store+0x1e/0x22
[<c019f499>] sysfs_write_file+0xa7/0xdb
[<c019f3f2>] sysfs_write_file+0x0/0xdb
[<c016b882>] vfs_write+0x83/0xf6
[<c016bde3>] sys_write+0x3c/0x63
[<c0104e8e>] sysenter_past_esp+0x5f/0x99
=======================
--->
From: Gautham R Shenoy <ego@in.ibm.com>
Some of the per-cpu counters, and thus their locks,
are accessed from IRQ context. This can deadlock if such
an access interrupts a cpu-offline thread while it holds a
counter's lock to transfer a dead cpu's counts to the global counter.
Add appropriate IRQ protection in the cpu-hotplug callback path.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
lib/percpu_counter.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
Index: linux-2.6.23/lib/percpu_counter.c
===================================================================
--- linux-2.6.23.orig/lib/percpu_counter.c
+++ linux-2.6.23/lib/percpu_counter.c
@@ -124,12 +124,13 @@ static int __cpuinit percpu_counter_hotc
 	mutex_lock(&percpu_counters_lock);
 	list_for_each_entry(fbc, &percpu_counters, list) {
 		s32 *pcount;
+		unsigned long flags;
 
-		spin_lock(&fbc->lock);
+		spin_lock_irqsave(&fbc->lock, flags);
 		pcount = per_cpu_ptr(fbc->counters, cpu);
 		fbc->count += *pcount;
 		*pcount = 0;
-		spin_unlock(&fbc->lock);
+		spin_unlock_irqrestore(&fbc->lock, flags);
 	}
 	mutex_unlock(&percpu_counters_lock);
 	return NOTIFY_OK;
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"
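For readers who want to see why the irqsave variant matters, here is a
rough user-space analogy of the fixed callback. It is only a sketch and not
kernel code: a POSIX signal stands in for the interrupt, blocking that signal
with sigprocmask() stands in for spin_lock_irqsave()/spin_unlock_irqrestore(),
and a plain flag stands in for fbc->lock. All names in it (model_counter,
model_init, model_fold_dead_cpu, model_irq_handler, MODEL_NR_CPUS) are
hypothetical and do not appear in lib/percpu_counter.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for struct percpu_counter (not the kernel type). */
struct model_counter {
	long long count;                /* global count */
	int counters[MODEL_NR_CPUS];    /* per-cpu deltas */
	volatile sig_atomic_t locked;   /* models fbc->lock being held */
};

static struct model_counter fbc;
static volatile sig_atomic_t handler_ran;
static volatile sig_atomic_t handler_saw_lock_held;

/* Models the interrupt-side path that needs the same lock. */
void model_irq_handler(int sig)
{
	(void)sig;
	handler_ran = 1;
	if (fbc.locked)
		handler_saw_lock_held = 1; /* a real spinlock would spin forever here */
}

/* Install the handler that plays the role of the interrupt. */
int model_init(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = model_irq_handler;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR1, &sa, NULL);
}

/* Models the patched callback: fold one dead cpu's delta into the global count. */
void model_fold_dead_cpu(struct model_counter *c, int cpu)
{
	sigset_t block, saved;

	/* Analogue of spin_lock_irqsave(): mask the "interrupt", save old state. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	sigprocmask(SIG_BLOCK, &block, &saved);
	c->locked = 1;

	/* The "interrupt" fires inside the critical section but stays pending. */
	raise(SIGUSR1);

	c->count += c->counters[cpu];
	c->counters[cpu] = 0;

	/* Analogue of spin_unlock_irqrestore(): the pending signal runs only now. */
	c->locked = 0;
	sigprocmask(SIG_SETMASK, &saved, NULL);
}
```

Calling model_init() once and then model_fold_dead_cpu(&fbc, cpu) folds that
cpu's delta into fbc.count; the handler runs only after the lock flag has been
cleared. If the two sigprocmask() calls were dropped, the handler would run
while the flag is still set, which is the situation lockdep is warning about
in the trace above.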