From: Gautham R Shenoy <ego@in.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, a.p.zijlstra@chello.nl,
Dipankar Sarma <dipankar@in.ibm.com>
Subject: [PATCH] Add irq protection in the percpu-counters cpu-hotplug-callback path
Date: Mon, 15 Oct 2007 11:48:44 +0530
Message-ID: <20071015061844.GA15728@in.ibm.com>
In-Reply-To: <20071011213126.cf92efb7.akpm@linux-foundation.org>
Hi Andrew,
While running regular cpu-offline tests on 2.6.23-mm1, I
hit the lockdep warning below.
It was triggered because some of the per-cpu counters, and thus
their locks, are accessed from IRQ context (softirq, in the trace below).
This can deadlock if such an access interrupts the cpu-offline thread on
the same CPU while that thread holds the counter's lock to transfer a
dead CPU's counts to the global counter.
The patch below fixes this. Tested on i386.
Thanks and Regards
gautham.
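For readers unfamiliar with the code path, here is a minimal userspace sketch of what the hotplug callback does to each counter while it holds the lock. It is an illustration only, not the kernel code: `struct model_counter`, `model_sum()`, `model_fold_dead_cpu()` and `MODEL_NR_CPUS` are hypothetical names, and the locking itself is left out so the arithmetic stays visible.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4	/* illustrative value, not taken from the patch */

/* Hypothetical stand-in for a per-cpu counter: one global part plus
 * one small delta per CPU. */
struct model_counter {
	int64_t count;				/* global part */
	int32_t pcount[MODEL_NR_CPUS];		/* per-CPU deltas */
};

/* Exact value of the counter: global part plus every per-CPU delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += c->pcount[cpu];
	return sum;
}

/* What the callback does for a CPU that has gone offline: move its
 * delta into the global part and zero the per-CPU slot. In the kernel
 * these two stores happen under the counter's spinlock. */
static void model_fold_dead_cpu(struct model_counter *c, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	c->count += c->pcount[cpu];
	c->pcount[cpu] = 0;
}
```

The fold leaves the exact sum unchanged. The two stores have to appear atomic to other updaters of the global part, which is why the lock is needed, and why that lock must also be safe against the softirq-side users.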
=====================Warning! ===========================================
[root@llm43]# ./all_hotplug_once
CPU 1 is now offline
=================================
[ INFO: inconsistent lock state ]
2.6.23-mm1 #3
---------------------------------
inconsistent {in-softirq-W} -> {softirq-on-W} usage.
sh/7103 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&percpu_counter_irqsafe){-+..}, at: [<c028e296>] percpu_counter_hotcpu_callback+0x22/0x67
{in-softirq-W} state was registered at:
[<c014126f>] __lock_acquire+0x40d/0xb4a
[<c0141966>] __lock_acquire+0xb04/0xb4a
[<c0141a0b>] lock_acquire+0x5f/0x79
[<c028e4b5>] __percpu_counter_add+0x62/0xad
[<c04d5e81>] _spin_lock+0x21/0x2c
[<c028e4b5>] __percpu_counter_add+0x62/0xad
[<c028e4b5>] __percpu_counter_add+0x62/0xad
[<c01531af>] test_clear_page_writeback+0x88/0xc5
[<c014d35e>] end_page_writeback+0x20/0x3c
[<c0188757>] end_buffer_async_write+0x133/0x181
[<c0141966>] __lock_acquire+0xb04/0xb4a
[<c0187eb4>] end_bio_bh_io_sync+0x21/0x29
[<c0187e93>] end_bio_bh_io_sync+0x0/0x29
[<c0189345>] bio_endio+0x27/0x29
[<c04358f8>] dec_pending+0x17d/0x199
[<c0435a13>] clone_endio+0x73/0x9f
[<c04359a0>] clone_endio+0x0/0x9f
[<c0189345>] bio_endio+0x27/0x29
[<c027ba83>] __end_that_request_first+0x150/0x2c0
[<c034a161>] scsi_end_request+0x1d/0xab
[<c014f5ed>] mempool_free+0x63/0x67
[<c034ac22>] scsi_io_completion+0x108/0x2c7
[<c027e03b>] blk_done_softirq+0x51/0x5c
[<c012b291>] __do_softirq+0x68/0xdb
[<c012b33a>] do_softirq+0x36/0x51
[<c012b4bf>] irq_exit+0x43/0x4e
[<c0106f60>] do_IRQ+0x73/0x83
[<c0105902>] common_interrupt+0x2e/0x34
[<c01600d8>] add_to_swap+0x23/0x66
[<c01031b4>] mwait_idle_with_hints+0x3b/0x3f
[<c01033a8>] mwait_idle+0x0/0xf
[<c01034d1>] cpu_idle+0x9a/0xc7
[<ffffffff>] 0xffffffff
irq event stamp: 4007
hardirqs last enabled at (4007): [<c04d4d9c>] __mutex_lock_slowpath+0x21d/0x241
hardirqs last disabled at (4006): [<c04d4bda>] __mutex_lock_slowpath+0x5b/0x241
softirqs last enabled at (2130): [<c0135ab7>] __rcu_offline_cpu+0x2f/0x5a
softirqs last disabled at (2128): [<c04d5e94>] _spin_lock_bh+0x8/0x31
other info that might help us debug this:
6 locks held by sh/7103:
#0: (&buffer->mutex){--..}, at: [<c019f414>] sysfs_write_file+0x22/0xdb
#1: (cpu_add_remove_lock){--..}, at: [<c01450fd>] cpu_down+0x13/0x36
#2: (sched_hotcpu_mutex){--..}, at: [<c01220db>] migration_call+0x26/0x36a
#3: (cache_chain_mutex){--..}, at: [<c0168289>] cpuup_callback+0x28/0x1f9
#4: (workqueue_mutex){--..}, at: [<c013456d>] workqueue_cpu_callback+0x26/0xca
#5: (percpu_counters_lock){--..}, at: [<c028e287>] percpu_counter_hotcpu_callback+0x13/0x67
stack backtrace:
[<c013febd>] print_usage_bug+0x101/0x10b
[<c01406fd>] mark_lock+0x249/0x3f0
[<c01412d6>] __lock_acquire+0x474/0xb4a
[<c0141a0b>] lock_acquire+0x5f/0x79
[<c028e296>] percpu_counter_hotcpu_callback+0x22/0x67
[<c04d5e81>] _spin_lock+0x21/0x2c
[<c028e296>] percpu_counter_hotcpu_callback+0x22/0x67
[<c028e296>] percpu_counter_hotcpu_callback+0x22/0x67
[<c04d7e3d>] notifier_call_chain+0x2a/0x47
[<c013aece>] raw_notifier_call_chain+0x9/0xc
[<c014503d>] _cpu_down+0x174/0x221
[<c014510f>] cpu_down+0x25/0x36
[<c02e7a66>] store_online+0x24/0x56
[<c02e7a42>] store_online+0x0/0x56
[<c02e5132>] sysdev_store+0x1e/0x22
[<c019f499>] sysfs_write_file+0xa7/0xdb
[<c019f3f2>] sysfs_write_file+0x0/0xdb
[<c016b882>] vfs_write+0x83/0xf6
[<c016bde3>] sys_write+0x3c/0x63
[<c0104e8e>] sysenter_past_esp+0x5f/0x99
=======================
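The "inconsistent {in-softirq-W} -> {softirq-on-W}" line in the report above can be read with a toy model. The sketch below is a deliberately simplified assumption about the kind of bookkeeping involved, not lockdep's implementation. `model_lock_class`, `model_acquire()` and the `MODEL_*` bits are hypothetical names. A lock class becomes inconsistent once it has been taken from softirq context and also taken from a context where softirqs could still run.

```c
#include <stdbool.h>

/* Hypothetical usage bits, loosely named after the states in the report. */
enum model_usage {
	MODEL_IN_SOFTIRQ = 1 << 0,	/* lock was taken from softirq context */
	MODEL_SOFTIRQ_ON = 1 << 1,	/* lock was taken while softirqs could run */
};

struct model_lock_class {
	unsigned int usage;
};

/* Record one acquisition and report whether the class is still
 * consistent. Returns false once both bits are set: a softirq could
 * then interrupt a holder on the same CPU and spin on the lock forever. */
static bool model_acquire(struct model_lock_class *cls,
			  bool in_softirq, bool softirqs_can_run)
{
	const unsigned int both = MODEL_IN_SOFTIRQ | MODEL_SOFTIRQ_ON;

	if (in_softirq)
		cls->usage |= MODEL_IN_SOFTIRQ;
	else if (softirqs_can_run)
		cls->usage |= MODEL_SOFTIRQ_ON;

	return (cls->usage & both) != both;
}
```

In this model the writeback completion path corresponds to `model_acquire(&cls, true, false)`. The hotplug callback using plain spin_lock() corresponds to `model_acquire(&cls, false, true)`, which trips the check. With interrupts disabled around the lock, the second call becomes `model_acquire(&cls, false, false)` and the class stays consistent.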
--->
From: Gautham R Shenoy <ego@in.ibm.com>
Some of the per-cpu counters, and thus their locks,
are accessed from IRQ context. This can cause a deadlock
if such an access interrupts a cpu-offline thread that is
transferring a dead CPU's counts to the global counter.
Add appropriate IRQ protection in the cpu-hotplug callback path.
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
---
lib/percpu_counter.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
Index: linux-2.6.23/lib/percpu_counter.c
===================================================================
--- linux-2.6.23.orig/lib/percpu_counter.c
+++ linux-2.6.23/lib/percpu_counter.c
@@ -124,12 +124,13 @@ static int __cpuinit percpu_counter_hotc
 	mutex_lock(&percpu_counters_lock);
 	list_for_each_entry(fbc, &percpu_counters, list) {
 		s32 *pcount;
+		unsigned long flags;
 
-		spin_lock(&fbc->lock);
+		spin_lock_irqsave(&fbc->lock, flags);
 		pcount = per_cpu_ptr(fbc->counters, cpu);
 		fbc->count += *pcount;
 		*pcount = 0;
-		spin_unlock(&fbc->lock);
+		spin_unlock_irqrestore(&fbc->lock, flags);
 	}
 	mutex_unlock(&percpu_counters_lock);
 	return NOTIFY_OK;
--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"