* [PATCH -rt 1/5] hotplug: sync_unplug: No '\n' in task name
2011-10-16 10:56 [RFC] [PATCH -rt 0/5] patchset focusing on cpu hotplug Yong Zhang
@ 2011-10-16 10:56 ` Yong Zhang
2011-10-16 10:56 ` [PATCH -rt 2/5] hotplug: Call cpu_unplug_begin() a little early Yong Zhang
` (3 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Yong Zhang @ 2011-10-16 10:56 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-rt-users, tglx
Otherwise the output will look a little odd.
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
---
kernel/cpu.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 5f2382a..83fb084 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -139,7 +139,7 @@ static int cpu_unplug_begin(unsigned int cpu)
struct task_struct *tsk;
init_completion(&hp->synced);
- tsk = kthread_create(sync_unplug_thread, hp, "sync_unplug/%d\n", cpu);
+ tsk = kthread_create(sync_unplug_thread, hp, "sync_unplug/%d", cpu);
if (IS_ERR(tsk))
return (PTR_ERR(tsk));
kthread_bind(tsk, cpu);
--
1.7.1
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH -rt 2/5] hotplug: Call cpu_unplug_begin() a little early
2011-10-16 10:56 [RFC] [PATCH -rt 0/5] patchset focusing on cpu hotplug Yong Zhang
2011-10-16 10:56 ` [PATCH -rt 1/5] hotplug: sync_unplug: No '\n' in task name Yong Zhang
@ 2011-10-16 10:56 ` Yong Zhang
2011-10-16 10:56 ` [RFC] [PATCH -rt 3/5] printk: don't call printk_tick in printk_needs_cpu() Yong Zhang
` (2 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Yong Zhang @ 2011-10-16 10:56 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-rt-users, tglx
cpu_unplug_begin() should be called before CPU_DOWN_PREPARE,
because at CPU_DOWN_PREPARE cpu_active is cleared and sched_domain
is rebuilt, it will finally lead to thread 'sync_unplug' will
be running on the cpu on which it's created.
I found by an incorrect warning on smp_processor_id() called by
sync_unplug/1, and trace shows below:
(echo 1 > /sys/device/system/cpu/cpu1/online)
bash-1664 [000] 83.136620: _cpu_down: Bind sync_unplug to cpu 1
bash-1664 [000] 83.136623: sched_wait_task: comm=sync_unplug/1 pid=1724 prio=120
bash-1664 [000] 83.136624: _cpu_down: Wake sync_unplug
bash-1664 [000] 83.136629: sched_wakeup: comm=sync_unplug/1 pid=1724 prio=120 success=1 target_cpu=000
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
---
kernel/cpu.c | 16 +++++++---------
1 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 83fb084..13066a3 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -337,22 +337,20 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
return -EBUSY;
}
- err = __cpu_notify(CPU_DOWN_PREPARE | mod, hcpu, -1, &nr_calls);
+ cpu_hotplug_begin();
+ err = cpu_unplug_begin(cpu);
if (err) {
- nr_calls--;
- __cpu_notify(CPU_DOWN_FAILED | mod, hcpu, nr_calls, NULL);
- printk("%s: attempt to take down CPU %u failed\n",
- __func__, cpu);
+ printk("cpu_unplug_begin(%d) failed\n", cpu);
goto out_cancel;
}
- cpu_hotplug_begin();
- err = cpu_unplug_begin(cpu);
+ err = __cpu_notify(CPU_DOWN_PREPARE | mod, hcpu, -1, &nr_calls);
if (err) {
nr_calls--;
__cpu_notify(CPU_DOWN_FAILED | mod, hcpu, nr_calls, NULL);
- printk("cpu_unplug_begin(%d) failed\n", cpu);
- goto out_cancel;
+ printk("%s: attempt to take down CPU %u failed\n",
+ __func__, cpu);
+ goto out_release;
}
err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
--
1.7.1
^ permalink raw reply related [flat|nested] 15+ messages in thread* [RFC] [PATCH -rt 3/5] printk: don't call printk_tick in printk_needs_cpu()
2011-10-16 10:56 [RFC] [PATCH -rt 0/5] patchset focusing on cpu hotplug Yong Zhang
2011-10-16 10:56 ` [PATCH -rt 1/5] hotplug: sync_unplug: No '\n' in task name Yong Zhang
2011-10-16 10:56 ` [PATCH -rt 2/5] hotplug: Call cpu_unplug_begin() a little early Yong Zhang
@ 2011-10-16 10:56 ` Yong Zhang
2011-10-16 10:56 ` [PATCH -rt 4/5] workqueue: hotplug fix Yong Zhang
2011-10-16 10:56 ` [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus() Yong Zhang
4 siblings, 0 replies; 15+ messages in thread
From: Yong Zhang @ 2011-10-16 10:56 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-rt-users, tglx
printk_tick() can't be called in atomic context, otherwise below
warning will show:
[ 117.597095] BUG: sleeping function called from invalid context at kernel/rtmutex.c:645
[ 117.597102] in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: kworker/0:0
[ 117.597111] Pid: 0, comm: kworker/0:0 Not tainted 3.0.6-rt17-00284-gb76d419-dirty #7
[ 117.597116] Call Trace:
[ 117.597131] [<c06e3b61>] ? printk+0x1d/0x24
[ 117.597142] [<c01390b6>] __might_sleep+0xe6/0x110
[ 117.597151] [<c06e634c>] rt_spin_lock+0x1c/0x30
[ 117.597158] [<c0142f26>] __wake_up+0x26/0x60
[ 117.597166] [<c014c78e>] printk_tick+0x3e/0x40
[ 117.597173] [<c014c7b4>] printk_needs_cpu+0x24/0x30
[ 117.597181] [<c017ecc8>] tick_nohz_stop_sched_tick+0x2e8/0x410
[ 117.597191] [<c017305a>] ? sched_clock_idle_wakeup_event+0x1a/0x20
[ 117.597201] [<c010182a>] cpu_idle+0x4a/0xb0
[ 117.597209] [<c06e0b97>] start_secondary+0xd3/0xd7
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
---
kernel/printk.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/printk.c b/kernel/printk.c
index 0c9a0ab..a50af4e 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1282,8 +1282,8 @@ void printk_tick(void)
int printk_needs_cpu(int cpu)
{
- if (cpu_is_offline(cpu))
- printk_tick();
+ if (unlikely(cpu_is_offline(cpu)))
+ __this_cpu_write(printk_pending, 0);
return __this_cpu_read(printk_pending);
}
--
1.7.1
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH -rt 4/5] workqueue: hotplug fix
2011-10-16 10:56 [RFC] [PATCH -rt 0/5] patchset focusing on cpu hotplug Yong Zhang
` (2 preceding siblings ...)
2011-10-16 10:56 ` [RFC] [PATCH -rt 3/5] printk: don't call printk_tick in printk_needs_cpu() Yong Zhang
@ 2011-10-16 10:56 ` Yong Zhang
2011-10-19 7:14 ` Thomas Gleixner
2011-10-16 10:56 ` [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus() Yong Zhang
4 siblings, 1 reply; 15+ messages in thread
From: Yong Zhang @ 2011-10-16 10:56 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-rt-users, tglx
BUG: sleeping function called from invalid context at kernel/rtmutex.c:645
in_atomic(): 1, irqs_disabled(): 0, pid: 1739, name: bash
Pid: 1739, comm: bash Not tainted 3.0.6-rt17-00284-gb76d419 #3
Call Trace:
[<c06e3b5d>] ? printk+0x1d/0x20
[<c01390b6>] __might_sleep+0xe6/0x110
[<c06e633c>] rt_spin_lock+0x1c/0x30
[<c01655a6>] flush_gcwq+0x236/0x320
[<c021c651>] ? kfree+0xe1/0x1a0
[<c05b7178>] ? __cpufreq_remove_dev+0xf8/0x260
[<c0183fad>] ? rt_down_write+0xd/0x10
[<c06cd91e>] workqueue_cpu_down_callback+0x26/0x2d
[<c06e9d65>] notifier_call_chain+0x45/0x60
[<c0171cfe>] __raw_notifier_call_chain+0x1e/0x30
[<c014c9b4>] __cpu_notify+0x24/0x40
[<c06cbc6f>] _cpu_down+0xdf/0x330
[<c06cbef0>] cpu_down+0x30/0x50
[<c06cd6b0>] store_online+0x50/0xa7
[<c06cd660>] ? acpi_os_map_memory+0xec/0xec
[<c04f2faa>] sysdev_store+0x2a/0x40
[<c02887a4>] sysfs_write_file+0xa4/0x100
[<c0229ab2>] vfs_write+0xa2/0x170
[<c0288700>] ? sysfs_poll+0x90/0x90
[<c0229d92>] sys_write+0x42/0x70
[<c06ecedf>] sysenter_do_call+0x12/0x2d
CPU 1 is now offline
SMP alternatives: switching to UP code
SMP alternatives: switching to SMP code
Booting Node 0 Processor 1 APIC 0x1
smpboot cpu 1: start_ip = 9b000
Initializing CPU#1
BUG: sleeping function called from invalid context at kernel/rtmutex.c:645
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: kworker/0:0
Pid: 0, comm: kworker/0:0 Not tainted 3.0.6-rt17-00284-gb76d419 #3
Call Trace:
[<c06e3b5d>] ? printk+0x1d/0x20
[<c01390b6>] __might_sleep+0xe6/0x110
[<c06e633c>] rt_spin_lock+0x1c/0x30
[<c06cd85b>] workqueue_cpu_up_callback+0x56/0xf3
[<c06e9d65>] notifier_call_chain+0x45/0x60
[<c0171cfe>] __raw_notifier_call_chain+0x1e/0x30
[<c014c9b4>] __cpu_notify+0x24/0x40
[<c014c9ec>] cpu_notify+0x1c/0x20
[<c06e1d43>] notify_cpu_starting+0x1e/0x20
[<c06e0aad>] smp_callin+0xfb/0x10e
[<c06e0ad9>] start_secondary+0x19/0xd7
NMI watchdog enabled, takes one hw-pmu counter.
Switched to NOHz mode on CPU #1
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
---
kernel/workqueue.c | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index e262f6b..ddbf89c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3195,30 +3195,34 @@ static int __devinit workqueue_cpu_up_callback(struct notifier_block *nfb,
}
/* some are called w/ irq disabled, don't disturb irq status */
- spin_lock_irqsave(&gcwq->lock, flags);
switch (action) {
case CPU_UP_PREPARE:
+ spin_lock_irqsave(&gcwq->lock, flags);
BUG_ON(gcwq->first_idle);
gcwq->first_idle = new_worker;
+ spin_unlock_irqrestore(&gcwq->lock, flags);
break;
case CPU_UP_CANCELED:
+ spin_lock_irqsave(&gcwq->lock, flags);
destroy_worker(gcwq->first_idle);
gcwq->first_idle = NULL;
+ spin_unlock_irqrestore(&gcwq->lock, flags);
break;
case CPU_ONLINE:
+ spin_lock_irqsave(&gcwq->lock, flags);
spin_unlock_irq(&gcwq->lock);
kthread_bind(gcwq->first_idle->task, cpu);
spin_lock_irq(&gcwq->lock);
gcwq->flags |= GCWQ_MANAGE_WORKERS;
start_worker(gcwq->first_idle);
gcwq->first_idle = NULL;
+ spin_unlock_irqrestore(&gcwq->lock, flags);
break;
}
- spin_unlock_irqrestore(&gcwq->lock, flags);
return notifier_from_errno(0);
}
@@ -3274,14 +3278,14 @@ static void flush_gcwq(struct global_cwq *gcwq)
spin_unlock_irq(&gcwq->lock);
- gcwq = get_gcwq(get_cpu());
+ gcwq = get_gcwq(get_cpu_light());
spin_lock_irq(&gcwq->lock);
list_for_each_entry_safe(work, nw, &non_affine_works, entry) {
list_del_init(&work->entry);
___queue_work(get_work_cwq(work)->wq, gcwq, work);
}
spin_unlock_irq(&gcwq->lock);
- put_cpu();
+ put_cpu_light();
}
static int __devinit workqueue_cpu_down_callback(struct notifier_block *nfb,
--
1.7.1
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH -rt 4/5] workqueue: hotplug fix
2011-10-16 10:56 ` [PATCH -rt 4/5] workqueue: hotplug fix Yong Zhang
@ 2011-10-19 7:14 ` Thomas Gleixner
2011-10-24 2:26 ` Yong Zhang
0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2011-10-19 7:14 UTC (permalink / raw)
To: Yong Zhang; +Cc: linux-kernel, linux-rt-users
On Sun, 16 Oct 2011, Yong Zhang wrote:
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3195,30 +3195,34 @@ static int __devinit workqueue_cpu_up_callback(struct notifier_block *nfb,
> }
>
> /* some are called w/ irq disabled, don't disturb irq status */
> - spin_lock_irqsave(&gcwq->lock, flags);
>
> switch (action) {
> case CPU_UP_PREPARE:
> + spin_lock_irqsave(&gcwq->lock, flags);
> BUG_ON(gcwq->first_idle);
> gcwq->first_idle = new_worker;
> + spin_unlock_irqrestore(&gcwq->lock, flags);
> break;
>
> case CPU_UP_CANCELED:
> + spin_lock_irqsave(&gcwq->lock, flags);
> destroy_worker(gcwq->first_idle);
> gcwq->first_idle = NULL;
> + spin_unlock_irqrestore(&gcwq->lock, flags);
> break;
>
> case CPU_ONLINE:
> + spin_lock_irqsave(&gcwq->lock, flags);
> spin_unlock_irq(&gcwq->lock);
> kthread_bind(gcwq->first_idle->task, cpu);
> spin_lock_irq(&gcwq->lock);
> gcwq->flags |= GCWQ_MANAGE_WORKERS;
> start_worker(gcwq->first_idle);
> gcwq->first_idle = NULL;
> + spin_unlock_irqrestore(&gcwq->lock, flags);
> break;
> }
>
> - spin_unlock_irqrestore(&gcwq->lock, flags);
This part of the patch is pretty pointless.
> return notifier_from_errno(0);
> }
> @@ -3274,14 +3278,14 @@ static void flush_gcwq(struct global_cwq *gcwq)
>
> spin_unlock_irq(&gcwq->lock);
>
> - gcwq = get_gcwq(get_cpu());
> + gcwq = get_gcwq(get_cpu_light());
Right.
> spin_lock_irq(&gcwq->lock);
> list_for_each_entry_safe(work, nw, &non_affine_works, entry) {
> list_del_init(&work->entry);
> ___queue_work(get_work_cwq(work)->wq, gcwq, work);
> }
> spin_unlock_irq(&gcwq->lock);
> - put_cpu();
> + put_cpu_light();
> }
>
> static int __devinit workqueue_cpu_down_callback(struct notifier_block *nfb,
> --
> 1.7.1
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH -rt 4/5] workqueue: hotplug fix
2011-10-19 7:14 ` Thomas Gleixner
@ 2011-10-24 2:26 ` Yong Zhang
2011-10-24 9:25 ` Thomas Gleixner
0 siblings, 1 reply; 15+ messages in thread
From: Yong Zhang @ 2011-10-24 2:26 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
On Wed, Oct 19, 2011 at 09:14:28AM +0200, Thomas Gleixner wrote:
> On Sun, 16 Oct 2011, Yong Zhang wrote:
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -3195,30 +3195,34 @@ static int __devinit workqueue_cpu_up_callback(struct notifier_block *nfb,
> > }
> >
> > /* some are called w/ irq disabled, don't disturb irq status */
> > - spin_lock_irqsave(&gcwq->lock, flags);
> >
> > switch (action) {
> > case CPU_UP_PREPARE:
> > + spin_lock_irqsave(&gcwq->lock, flags);
> > BUG_ON(gcwq->first_idle);
> > gcwq->first_idle = new_worker;
> > + spin_unlock_irqrestore(&gcwq->lock, flags);
> > break;
> >
> > case CPU_UP_CANCELED:
> > + spin_lock_irqsave(&gcwq->lock, flags);
> > destroy_worker(gcwq->first_idle);
> > gcwq->first_idle = NULL;
> > + spin_unlock_irqrestore(&gcwq->lock, flags);
> > break;
> >
> > case CPU_ONLINE:
> > + spin_lock_irqsave(&gcwq->lock, flags);
> > spin_unlock_irq(&gcwq->lock);
> > kthread_bind(gcwq->first_idle->task, cpu);
> > spin_lock_irq(&gcwq->lock);
> > gcwq->flags |= GCWQ_MANAGE_WORKERS;
> > start_worker(gcwq->first_idle);
> > gcwq->first_idle = NULL;
> > + spin_unlock_irqrestore(&gcwq->lock, flags);
> > break;
> > }
> >
> > - spin_unlock_irqrestore(&gcwq->lock, flags);
>
> This part of the patch is pretty pointless.
But CPU_STARTING is called with irq disabled, and it will take
the lock unconditionally. Thus below warning is triggered:
BUG: sleeping function called from invalid context at kernel/rtmutex.c:645
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: kworker/0:0
Pid: 0, comm: kworker/0:0 Not tainted 3.0.6-rt17-00284-gb76d419 #3
Call Trace:
[<c06e3b5d>] ? printk+0x1d/0x20
[<c01390b6>] __might_sleep+0xe6/0x110
[<c06e633c>] rt_spin_lock+0x1c/0x30
[<c06cd85b>] workqueue_cpu_up_callback+0x56/0xf3
[<c06e9d65>] notifier_call_chain+0x45/0x60
[<c0171cfe>] __raw_notifier_call_chain+0x1e/0x30
[<c014c9b4>] __cpu_notify+0x24/0x40
[<c014c9ec>] cpu_notify+0x1c/0x20
[<c06e1d43>] notify_cpu_starting+0x1e/0x20
[<c06e0aad>] smp_callin+0xfb/0x10e
[<c06e0ad9>] start_secondary+0x19/0xd7
Thanks,
Yong
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH -rt 4/5] workqueue: hotplug fix
2011-10-24 2:26 ` Yong Zhang
@ 2011-10-24 9:25 ` Thomas Gleixner
0 siblings, 0 replies; 15+ messages in thread
From: Thomas Gleixner @ 2011-10-24 9:25 UTC (permalink / raw)
To: Yong Zhang; +Cc: linux-kernel, linux-rt-users
On Mon, 24 Oct 2011, Yong Zhang wrote:
> On Wed, Oct 19, 2011 at 09:14:28AM +0200, Thomas Gleixner wrote:
> > > - spin_unlock_irqrestore(&gcwq->lock, flags);
> >
> > This part of the patch is pretty pointless.
>
> But CPU_STARTING is called with irq disabled, and it will take
> the lock unconditionally. Thus below warning is triggered:
Aarg, yes.
Thanks,
tglx
^ permalink raw reply [flat|nested] 15+ messages in thread
* [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus()
2011-10-16 10:56 [RFC] [PATCH -rt 0/5] patchset focusing on cpu hotplug Yong Zhang
` (3 preceding siblings ...)
2011-10-16 10:56 ` [PATCH -rt 4/5] workqueue: hotplug fix Yong Zhang
@ 2011-10-16 10:56 ` Yong Zhang
2011-10-19 9:12 ` Thomas Gleixner
4 siblings, 1 reply; 15+ messages in thread
From: Yong Zhang @ 2011-10-16 10:56 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-rt-users, tglx
Fix below false positive (seems this is not a real deadlock scenario)
lockdep warning:
=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.6-rt18-00293-g6f698ae-dirty #24
-------------------------------------------------------
bash/1709 is trying to acquire lock:
(s_active#103){++++.+}, at: [<c02ab4a3>] sysfs_addrm_finish+0x33/0x60
but task is already holding lock:
(cpu_hotplug.lock){+.+.+.}, at: [<c0150530>] cpu_hotplug_begin+0x20/0x50
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (cpu_hotplug.lock){+.+.+.}:
[<c018ba31>] validate_chain.clone.20+0x701/0x810
[<c018e1c4>] __lock_acquire+0x3b4/0x8d0
[<c018eda8>] lock_acquire+0x88/0x1b0
[<c0736b48>] _mutex_lock+0x38/0x50
[<c01506a1>] get_online_cpus+0x31/0x50
[<c05ea533>] cpufreq_add_dev_interface+0x103/0x270
[<c05ea9ac>] cpufreq_add_dev+0x30c/0x410
[<c0524aa1>] sysdev_driver_register+0xb1/0x140
[<c05e988e>] cpufreq_register_driver+0x8e/0x150
[<c0a25e92>] acpi_cpufreq_init+0x78/0x8b
[<c0101225>] do_one_initcall+0x35/0x170
[<c09f5852>] kernel_init+0xb8/0x14e
[<c073e742>] kernel_thread_helper+0x6/0x10
-> #1 (&(*({ do { const void *__vpp_verify = (typeof((&(cpu_policy_rwsem))))((void *)0); (void)__vpp_verify; } while (0); ({ unsigned long __ptr; __asm__ ("" : "=r"(__ptr) : "0"((typeof(*(&(cpu_policy_rwsem))) *)(&(cpu_policy_rwsem)))); (typeof((typeof(*(&(cpu_policy_rwsem))) *)(&(cpu_policy_rwsem)))) (__ptr + (((__per_cpu_offset[cpu])))); }); }))){+++++.}:
[<c018ba31>] validate_chain.clone.20+0x701/0x810
[<c018e1c4>] __lock_acquire+0x3b4/0x8d0
[<c018eda8>] lock_acquire+0x88/0x1b0
[<c0195f66>] __rt_down_read+0x36/0x60
[<c0195faf>] rt_down_read+0xf/0x20
[<c05e975f>] lock_policy_rwsem_read+0x3f/0x80
[<c05ea3f4>] show+0x34/0x70
[<c02a9fb7>] sysfs_read_file+0x87/0x130
[<c024522f>] vfs_read+0x9f/0x170
[<c0245348>] sys_read+0x48/0x80
[<c073e15f>] sysenter_do_call+0x12/0x38
-> #0 (s_active#103){++++.+}:
[<c018b2fa>] check_prev_add+0x5fa/0x630
[<c018ba31>] validate_chain.clone.20+0x701/0x810
[<c018e1c4>] __lock_acquire+0x3b4/0x8d0
[<c018eda8>] lock_acquire+0x88/0x1b0
[<c02aac85>] sysfs_deactivate+0xc5/0x120
[<c02ab4a3>] sysfs_addrm_finish+0x33/0x60
[<c02aba0a>] sysfs_remove_dir+0x6a/0x80
[<c03bab3f>] kobject_del+0xf/0x30
[<c03babbf>] kobject_release+0x5f/0x80
[<c03bbf3d>] kref_put+0x2d/0x60
[<c03baa7d>] kobject_put+0x1d/0x50
[<c05eab6e>] __cpufreq_remove_dev+0xbe/0x260
[<c07339f3>] cpufreq_cpu_callback+0x70/0x83
[<c073ac5a>] notifier_call_chain+0x7a/0xa0
[<c017a3fe>] __raw_notifier_call_chain+0x1e/0x30
[<c0150494>] __cpu_notify+0x24/0x40
[<c071b7c1>] _cpu_down+0x151/0x350
[<c071b9f0>] cpu_down+0x30/0x50
[<c071d550>] store_online+0x50/0xa7
[<c052466a>] sysdev_store+0x2a/0x40
[<c02a9ed4>] sysfs_write_file+0xa4/0x100
[<c02450c2>] vfs_write+0xa2/0x170
[<c02453c8>] sys_write+0x48/0x80
[<c073e15f>] sysenter_do_call+0x12/0x38
other info that might help us debug this:
Chain exists of:
s_active --> &(*({ do { const void *__vpp_verify = (typeof((&(cpu_policy_rwsem))))((void *)0); (void)__vpp_verify; } while (0); ({ unsigned long __ptr; __asm__ ("" : "=r"(__ptr) : "0"((typeof(*(&(cpu_policy_rwsem))) *)(&(cpu_policy_rwsem)))); (typeof((typeof(*(&(cpu_policy_rwsem))) *)(&(cpu_policy_rwsem)))) (__ptr + (((__per_cpu_offset[cpu])))); }); })) --> cpu_hotplug.lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(cpu_hotplug.lock);
lock(&(*({ do { const void *__vpp_verify = (typeof((&(cpu_policy_rwsem))))((void *)0); (void)__vpp_verify; } while (0); ({ unsigned long __ptr; __asm__ ("" : "=r"(__ptr) : "0"((typeof(*(&(cpu_policy_rwsem))) *)(&(cpu_policy_rwsem)))); (typeof((typeof(*(&(cpu_policy_rwsem))) *)(&(cpu_policy_rwsem)))) (__ptr + (((__per_cpu_offset[cpu])))); }); })));
lock(cpu_hotplug.lock);
lock(s_active);
*** DEADLOCK ***
5 locks held by bash/1709:
#0: (&buffer->mutex){+.+.+.}, at: [<c02a9e57>] sysfs_write_file+0x27/0x100
#1: (s_active#119){.+.+.+}, at: [<c02a9eb9>] sysfs_write_file+0x89/0x100
#2: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<c011dc42>] cpu_hotplug_driver_lock+0x12/0x20
#3: (cpu_add_remove_lock){+.+.+.}, at: [<c01506d2>] cpu_maps_update_begin+0x12/0x20
#4: (cpu_hotplug.lock){+.+.+.}, at: [<c0150530>] cpu_hotplug_begin+0x20/0x50
stack backtrace:
Pid: 1709, comm: bash Not tainted 3.0.6-rt18-00293-g6f698ae-dirty #24
Call Trace:
[<c0733d94>] ? printk+0x1d/0x21
[<c018a28f>] print_circular_bug+0xcf/0xe0
[<c018b2fa>] check_prev_add+0x5fa/0x630
[<c018bfd9>] ? mark_lock_irq+0xc9/0x270
[<c018ba31>] validate_chain.clone.20+0x701/0x810
[<c018e1c4>] __lock_acquire+0x3b4/0x8d0
[<c018ef24>] ? lockdep_init_map+0x54/0x4e0
[<c018eda8>] lock_acquire+0x88/0x1b0
[<c02ab4a3>] ? sysfs_addrm_finish+0x33/0x60
[<c02aac85>] sysfs_deactivate+0xc5/0x120
[<c02ab4a3>] ? sysfs_addrm_finish+0x33/0x60
[<c0234382>] ? unlock_slab_and_free_delayed.clone.25+0x92/0xc0
[<c0234382>] ? unlock_slab_and_free_delayed.clone.25+0x92/0xc0
[<c02ab4a3>] sysfs_addrm_finish+0x33/0x60
[<c02aba0a>] sysfs_remove_dir+0x6a/0x80
[<c03bab3f>] kobject_del+0xf/0x30
[<c03babbf>] kobject_release+0x5f/0x80
[<c03bab60>] ? kobject_del+0x30/0x30
[<c03bbf3d>] kref_put+0x2d/0x60
[<c03baa7d>] kobject_put+0x1d/0x50
[<c073620d>] ? rt_mutex_unlock+0xd/0x10
[<c01961f2>] ? rt_up_write+0x22/0x30
[<c05e918d>] ? unlock_policy_rwsem_write+0x2d/0x40
[<c05eab6e>] __cpufreq_remove_dev+0xbe/0x260
[<c019614b>] ? rt_down_write_trylock+0x4b/0x60
[<c07339f3>] cpufreq_cpu_callback+0x70/0x83
[<c073ac5a>] notifier_call_chain+0x7a/0xa0
[<c017a3fe>] __raw_notifier_call_chain+0x1e/0x30
[<c0150494>] __cpu_notify+0x24/0x40
[<c071b7c1>] _cpu_down+0x151/0x350
[<c071b9f0>] cpu_down+0x30/0x50
[<c071d550>] store_online+0x50/0xa7
[<c071d500>] ? acpi_os_map_memory+0xf2/0xf2
[<c052466a>] sysdev_store+0x2a/0x40
[<c02a9ed4>] sysfs_write_file+0xa4/0x100
[<c02450c2>] vfs_write+0xa2/0x170
[<c02a9e30>] ? sysfs_poll+0x90/0x90
[<c02453c8>] sys_write+0x48/0x80
[<c073e15f>] sysenter_do_call+0x12/0x38
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
---
drivers/cpufreq/cpufreq.c | 20 +++++++-------------
1 files changed, 7 insertions(+), 13 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 7741f0f..8561182 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -821,16 +821,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
goto err_out_kobj_put;
}
- get_online_cpus();
+ raw_spin_lock_irqsave(&cpufreq_driver_lock, flags);
for_each_cpu(j, policy->cpus) {
if (!cpu_online(j))
continue;
- raw_spin_lock_irqsave(&cpufreq_driver_lock, flags);
per_cpu(cpufreq_cpu_data, j) = policy;
per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
- raw_spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
}
- put_online_cpus();
+ raw_spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
ret = cpufreq_add_dev_symlink(cpu, policy);
if (ret)
@@ -972,13 +970,11 @@ static int cpufreq_add_dev(struct sys_device *sys_dev)
err_out_unregister:
- get_online_cpus();
+ raw_spin_lock_irqsave(&cpufreq_driver_lock, flags);
for_each_cpu(j, policy->cpus) {
- raw_spin_lock_irqsave(&cpufreq_driver_lock, flags);
per_cpu(cpufreq_cpu_data, j) = NULL;
- raw_spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
}
- put_online_cpus();
+ raw_spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
kobject_put(&policy->kobj);
wait_for_completion(&policy->kobj_unregister);
@@ -1045,7 +1041,6 @@ static int __cpufreq_remove_dev(struct sys_device *sys_dev)
}
#endif
- raw_spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
#ifdef CONFIG_SMP
#ifdef CONFIG_HOTPLUG_CPU
@@ -1058,17 +1053,14 @@ static int __cpufreq_remove_dev(struct sys_device *sys_dev)
* per_cpu(cpufreq_cpu_data) while holding the lock, and remove
* the sysfs links afterwards.
*/
- get_online_cpus();
if (unlikely(cpumask_weight(data->cpus) > 1)) {
for_each_cpu(j, data->cpus) {
if (j == cpu)
continue;
- raw_spin_lock_irqsave(&cpufreq_driver_lock, flags);
per_cpu(cpufreq_cpu_data, j) = NULL;
- raw_spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
}
}
- put_online_cpus();
+ raw_spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
if (unlikely(cpumask_weight(data->cpus) > 1)) {
for_each_cpu(j, data->cpus) {
@@ -1087,6 +1079,8 @@ static int __cpufreq_remove_dev(struct sys_device *sys_dev)
cpufreq_cpu_put(data);
}
}
+#else
+ raw_spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
#endif
if (cpufreq_driver->target)
--
1.7.1
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus()
2011-10-16 10:56 ` [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus() Yong Zhang
@ 2011-10-19 9:12 ` Thomas Gleixner
2011-10-24 2:44 ` Yong Zhang
0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2011-10-19 9:12 UTC (permalink / raw)
To: Yong Zhang; +Cc: linux-kernel, linux-rt-users
On Sun, 16 Oct 2011, Yong Zhang wrote:
> Fix below false positive (seems this is not a real deadlock scenario)
> lockdep warning:
This looks like you caused it with patch 1. Both need a bit more
thought, but thanks for catching that.
Thanks,
tglx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus()
2011-10-19 9:12 ` Thomas Gleixner
@ 2011-10-24 2:44 ` Yong Zhang
2011-10-24 9:24 ` Thomas Gleixner
0 siblings, 1 reply; 15+ messages in thread
From: Yong Zhang @ 2011-10-24 2:44 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
On Wed, Oct 19, 2011 at 11:12:20AM +0200, Thomas Gleixner wrote:
> On Sun, 16 Oct 2011, Yong Zhang wrote:
>
> > Fix below false positive (seems this is not a real deadlock scenario)
> > lockdep warning:
>
> This looks like you caused it with patch 1.
Hmmm, yup.
CPU_DOWN_PREPARE will be under cpu_hotplug.lock.
> Both need a bit more
> thought, but thanks for catching that.
No sure whether it's big issue or not.
Mind showing more about your concern?
Thanks,
Yong
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus()
2011-10-24 2:44 ` Yong Zhang
@ 2011-10-24 9:24 ` Thomas Gleixner
2011-10-24 11:30 ` Yong Zhang
0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2011-10-24 9:24 UTC (permalink / raw)
To: Yong Zhang; +Cc: linux-kernel, linux-rt-users
On Mon, 24 Oct 2011, Yong Zhang wrote:
> On Wed, Oct 19, 2011 at 11:12:20AM +0200, Thomas Gleixner wrote:
> > On Sun, 16 Oct 2011, Yong Zhang wrote:
> >
> > > Fix below false positive (seems this is not a real deadlock scenario)
> > > lockdep warning:
> >
> > This looks like you caused it with patch 1.
>
> Hmmm, yup.
> CPU_DOWN_PREPARE will be under cpu_hotplug.lock.
>
> > Both need a bit more
> > thought, but thanks for catching that.
>
> No sure whether it's big issue or not.
> Mind showing more about your concern?
This is probably not the only place which will run into that issue and
I have not much interest to patch all those sites.
Thanks,
tglx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus()
2011-10-24 9:24 ` Thomas Gleixner
@ 2011-10-24 11:30 ` Yong Zhang
2011-10-24 11:45 ` Thomas Gleixner
0 siblings, 1 reply; 15+ messages in thread
From: Yong Zhang @ 2011-10-24 11:30 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
On Mon, Oct 24, 2011 at 11:24:51AM +0200, Thomas Gleixner wrote:
>
>
> On Mon, 24 Oct 2011, Yong Zhang wrote:
>
> > On Wed, Oct 19, 2011 at 11:12:20AM +0200, Thomas Gleixner wrote:
> > > On Sun, 16 Oct 2011, Yong Zhang wrote:
> > >
> > > > Fix below false positive (seems this is not a real deadlock scenario)
> > > > lockdep warning:
> > >
> > > This looks like you caused it with patch 1.
> >
> > Hmmm, yup.
> > CPU_DOWN_PREPARE will be under cpu_hotplug.lock.
> >
> > > Both need a bit more
> > > thought, but thanks for catching that.
> >
> > No sure whether it's big issue or not.
> > Mind showing more about your concern?
>
> This is probably not the only place which will run into that issue and
> I have not much interest to patch all those sites.
If so, I think that kind of violation has been caught in mainline,
because mainline has had CPU_DOWN_PREPARE called under
cpu_hotplug.lock, right?
Or am I missing something?
Thanks,
Yong
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus()
2011-10-24 11:30 ` Yong Zhang
@ 2011-10-24 11:45 ` Thomas Gleixner
2011-10-24 11:56 ` Yong Zhang
0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2011-10-24 11:45 UTC (permalink / raw)
To: Yong Zhang; +Cc: linux-kernel, linux-rt-users
On Mon, 24 Oct 2011, Yong Zhang wrote:
> On Mon, Oct 24, 2011 at 11:24:51AM +0200, Thomas Gleixner wrote:
> > > No sure whether it's big issue or not.
> > > Mind showing more about your concern?
> >
> > This is probably not the only place which will run into that issue and
> > I have not much interest to patch all those sites.
>
> If so, I think that kind of violation has been caught in mainline,
> because mainline has had CPU_DOWN_PREPARE called under
> cpu_hotplug.lock, right?
>
> Or am I missing something?
Hmm, no. I obviously forgot that I moved it in RT :(
So yes, it should be safe. Though the question is why cpufreq does not
suffer from that that problem in mainline. I'll stare at that code
some more.
Thanks,
tglx
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] [PATCH -rt 5/5] cpufreq: get rid of get_online_cpus()/put_online_cpus()
2011-10-24 11:45 ` Thomas Gleixner
@ 2011-10-24 11:56 ` Yong Zhang
0 siblings, 0 replies; 15+ messages in thread
From: Yong Zhang @ 2011-10-24 11:56 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
On Mon, Oct 24, 2011 at 01:45:16PM +0200, Thomas Gleixner wrote:
> On Mon, 24 Oct 2011, Yong Zhang wrote:
> > On Mon, Oct 24, 2011 at 11:24:51AM +0200, Thomas Gleixner wrote:
> > > > No sure whether it's big issue or not.
> > > > Mind showing more about your concern?
> > >
> > > This is probably not the only place which will run into that issue and
> > > I have not much interest to patch all those sites.
> >
> > If so, I think that kind of violation has been caught in mainline,
> > because mainline has had CPU_DOWN_PREPARE called under
> > cpu_hotplug.lock, right?
> >
> > Or am I missing something?
>
> Hmm, no. I obviously forgot that I moved it in RT :(
>
> So yes, it should be safe. Though the question is why cpufreq does not
> suffer from that that problem in mainline.
By this one: peter_zijlstra-fix-cpufreq.patch
Thanks,
Yong
^ permalink raw reply [flat|nested] 15+ messages in thread