linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/1] sched/autogroup: Fix race with task_groups list
@ 2013-05-24 16:07 Gerald Schaefer
  2013-05-24 16:07 ` [PATCH 1/1] " Gerald Schaefer
  0 siblings, 1 reply; 4+ messages in thread
From: Gerald Schaefer @ 2013-05-24 16:07 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Li Zefan
  Cc: linux-kernel, Martin Schwidefsky, Heiko Carstens, Gerald Schaefer

Below is the output of a panic that was triggered during CPU unplug.
__enable_runtime() accessed a freed and poisoned rt_rq that it got from
for_each_rt_rq() from the task_groups list. It seems to me that there
is a race with autogroup_create(), where tg->rt_rq is freed after the
tg was already added to the task_groups list.

A possible patch is attached, which moves the addition of the tg to the
list after the tg modification in autogroup_create(). I am currently not
able to reproduce the bug, so I could not test the patch against it.
Feedback is welcome, as I am not really familiar with the scheduler or
autogroup code.

[   47.256201] Unable to handle kernel pointer dereference at virtual kernel address 6b6b6b6b6b6b6000
[   47.256236] Oops: 0038 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[   47.256243] Modules linked in: dm_multipath scsi_dh eadm_sch dm_mod ctcm fsm ipv6 autofs4
[   47.256253] CPU: 0 Not tainted 3.9.2-60.x.20130514-s390xdefault #1
[   47.256255] Process cpuplugd (pid: 6542, task: 00000032710b4ae0, ksp: 0000003270dc77a8)
[   47.256258] Krnl PSW : 0404c00180000000 00000000001b71dc (__lock_acquire+0x14e8/0x16a4)
[   47.256265]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
               Krnl GPRS: 0000000000000001 0000000000000001 6b6b6b6b00000000 0000000000000000
[   47.256270]            0000000000000000 0000000000000000 0000000000000002 0000000000a54018
[   47.256272]            00000032710b4ae0 0000000000000000 0000000000000000 6b6b6b6b6b6b6c33
[   47.256275]            0000000000cb2708 00000000006e12a0 0000003270dc7798 0000003270dc76f0
[   47.256285] Krnl Code: 00000000001b71ce: e340f1300004        lg      %r4,304(%r15)
                          00000000001b71d4: eb6ff0f00004        lmg     %r6,%r15,240(%r15)
                         #00000000001b71da: 07f4                bcr     15,%r4
                         >00000000001b71dc: d507d000b000        clc     0(8,%r13),0(%r11)
                          00000000001b71e2: a774f5d8            brc     7,1b5d92
                          00000000001b71e6: a7f4f5d4            brc     15,1b5d8e
                          00000000001b71ea: e310f0a80004        lg      %r1,168(%r15)
                          00000000001b71f0: e310d0200009        sg      %r1,32(%r13)
[   47.256304] Call Trace:
[   47.256306] ([<0000000000000000>] 0x0)
[   47.256309]  [<00000000001b7b96>] lock_acquire+0x1be/0x234
[   47.256312]  [<00000000006ce794>] _raw_spin_lock+0x5c/0x98
[   47.256319]  [<0000000000190abc>] __enable_runtime+0x5c/0x16c
[   47.256323]  [<0000000000191cb0>] rq_online_rt+0xbc/0xe0
[   47.256326]  [<0000000000173b00>] set_rq_online+0xac/0xc8
[   47.256329]  [<0000000000178ae8>] rq_attach_root+0x1e4/0x220
[   47.256332]  [<0000000000179560>] cpu_attach_domain+0x1b8/0x40c
[   47.256335]  [<000000000018201e>] build_sched_domains+0x1896/0x1f58
[   47.256339]  [<0000000000182c6a>] partition_sched_domains+0x572/0x694
[   47.256341]  [<00000000001de8d6>] cpuset_update_active_cpus+0x2e/0x40
[   47.256345]  [<0000000000182e6a>] cpuset_cpu_inactive+0x3a/0x80
[   47.256348]  [<00000000006d27fa>] notifier_call_chain+0x11a/0x168
[   47.256352]  [<000000000016e5e2>] __raw_notifier_call_chain+0x22/0x30
[   47.256357]  [<000000000013a874>] __cpu_notify+0x44/0x70
[   47.256363]  [<00000000006b4bf6>] _cpu_down+0xd6/0x3bc
[   47.256367]  [<00000000006b4f1e>] cpu_down+0x42/0x60
[   47.256370]  [<00000000006b83ae>] store_online+0x4a/0xb4
[   47.256373]  [<00000000003291e2>] sysfs_write_file+0x116/0x174
[   47.256378]  [<000000000029cfd0>] vfs_write+0xa4/0x180
[   47.256382]  [<000000000029d4d4>] SyS_write+0x5c/0x98
[   47.256385]  [<00000000006d013c>] sysc_nr_ok+0x22/0x28
[   47.256388]  [<000000477ec0af28>] 0x477ec0af28
[   47.256390] INFO: lockdep is turned off.
[   47.256392] Last Breaking-Event-Address:
[   47.256393]  [<00000000001b5d64>] __lock_acquire+0x70/0x16a4
[   47.256396]
[   47.256398] Kernel panic - not syncing: Fatal exception: panic_on_oops


Gerald Schaefer (1):
  sched/autogroup: Fix race with task_groups list

 kernel/sched/auto_group.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

-- 
1.8.1.6



* [PATCH 1/1] sched/autogroup: Fix race with task_groups list
  2013-05-24 16:07 [PATCH 0/1] sched/autogroup: Fix race with task_groups list Gerald Schaefer
@ 2013-05-24 16:07 ` Gerald Schaefer
  2013-05-27  9:03   ` Peter Zijlstra
  2013-05-28 13:05   ` [tip:sched/core] " tip-bot for Gerald Schaefer
  0 siblings, 2 replies; 4+ messages in thread
From: Gerald Schaefer @ 2013-05-24 16:07 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Li Zefan
  Cc: linux-kernel, Martin Schwidefsky, Heiko Carstens, Gerald Schaefer

In autogroup_create(), a tg is allocated and added to the task_groups
list. If CONFIG_RT_GROUP_SCHED is set, this tg is then modified while on
the list, without locking. This can race with someone walking the list,
like __enable_runtime() during CPU unplug, and result in a use-after-free
bug.

To fix this, move sched_online_group(), which adds the tg to the list,
to the end of the autogroup_create() function after the modification.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
 kernel/sched/auto_group.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/sched/auto_group.c b/kernel/sched/auto_group.c
index 64de5f8..4a07353 100644
--- a/kernel/sched/auto_group.c
+++ b/kernel/sched/auto_group.c
@@ -77,8 +77,6 @@ static inline struct autogroup *autogroup_create(void)
 	if (IS_ERR(tg))
 		goto out_free;
 
-	sched_online_group(tg, &root_task_group);
-
 	kref_init(&ag->kref);
 	init_rwsem(&ag->lock);
 	ag->id = atomic_inc_return(&autogroup_seq_nr);
@@ -98,6 +96,7 @@ static inline struct autogroup *autogroup_create(void)
 #endif
 	tg->autogroup = ag;
 
+	sched_online_group(tg, &root_task_group);
 	return ag;
 
 out_free:
-- 
1.8.1.6



* Re: [PATCH 1/1] sched/autogroup: Fix race with task_groups list
  2013-05-24 16:07 ` [PATCH 1/1] " Gerald Schaefer
@ 2013-05-27  9:03   ` Peter Zijlstra
  2013-05-28 13:05   ` [tip:sched/core] " tip-bot for Gerald Schaefer
  1 sibling, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2013-05-27  9:03 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: Ingo Molnar, Li Zefan, linux-kernel, Martin Schwidefsky,
	Heiko Carstens

On Fri, May 24, 2013 at 06:07:49PM +0200, Gerald Schaefer wrote:
> In autogroup_create(), a tg is allocated and added to the task_groups
> list. If CONFIG_RT_GROUP_SCHED is set, this tg is then modified while on
> the list, without locking. This can race with someone walking the list,
> like __enable_runtime() during CPU unplug, and result in a use-after-free
> bug.
> 
> To fix this, move sched_online_group(), which adds the tg to the list,
> to the end of the autogroup_create() function after the modification.
> 
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

Ah indeed, nice catch. Thanks!


* [tip:sched/core] sched/autogroup: Fix race with task_groups list
  2013-05-24 16:07 ` [PATCH 1/1] " Gerald Schaefer
  2013-05-27  9:03   ` Peter Zijlstra
@ 2013-05-28 13:05   ` tip-bot for Gerald Schaefer
  1 sibling, 0 replies; 4+ messages in thread
From: tip-bot for Gerald Schaefer @ 2013-05-28 13:05 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, peterz, gerald.schaefer, tglx

Commit-ID:  41261b6a832ea0e788627f6a8707854423f9ff49
Gitweb:     http://git.kernel.org/tip/41261b6a832ea0e788627f6a8707854423f9ff49
Author:     Gerald Schaefer <gerald.schaefer@de.ibm.com>
AuthorDate: Fri, 24 May 2013 18:07:49 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 28 May 2013 09:40:22 +0200

sched/autogroup: Fix race with task_groups list

In autogroup_create(), a tg is allocated and added to the task_groups
list. If CONFIG_RT_GROUP_SCHED is set, this tg is then modified while on
the list, without locking. This can race with someone walking the list,
like __enable_runtime() during CPU unplug, and result in a use-after-free
bug.

To fix this, move sched_online_group(), which adds the tg to the list,
to the end of the autogroup_create() function after the modification.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1369411669-46971-2-git-send-email-gerald.schaefer@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/auto_group.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/sched/auto_group.c b/kernel/sched/auto_group.c
index 64de5f8..4a07353 100644
--- a/kernel/sched/auto_group.c
+++ b/kernel/sched/auto_group.c
@@ -77,8 +77,6 @@ static inline struct autogroup *autogroup_create(void)
 	if (IS_ERR(tg))
 		goto out_free;
 
-	sched_online_group(tg, &root_task_group);
-
 	kref_init(&ag->kref);
 	init_rwsem(&ag->lock);
 	ag->id = atomic_inc_return(&autogroup_seq_nr);
@@ -98,6 +96,7 @@ static inline struct autogroup *autogroup_create(void)
 #endif
 	tg->autogroup = ag;
 
+	sched_online_group(tg, &root_task_group);
 	return ag;
 
 out_free:
