sched: fair: NULL ptr deref in check_preempt_wakeup

public inbox for linux-kernel@vger.kernel.org

* sched: fair: NULL ptr deref in check_preempt_wakeup
@ 2014-02-15 23:27 Sasha Levin
  2014-02-15 23:32 ` Sasha Levin
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Sasha Levin @ 2014-02-15 23:27 UTC
  To: Ingo Molnar, Peter Zijlstra; +Cc: Dave Jones, LKML

Hi folks,

While fuzzing with trinity inside a KVM tools guest running latest -next kernel, I've
stumbled on the following:

[  522.645288] BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
[  522.646271] IP: [<ffffffff81186c6f>] check_preempt_wakeup+0x11f/0x210
[  522.646976] PGD b0a79067 PUD ae9cf067 PMD 0
[  522.647494] Oops: 0000 [#1] PREEMPT SMP
[  522.648000] Dumping ftrace buffer:
[  522.648380]    (ftrace buffer empty)
[  522.648775] Modules linked in:
[  522.649125] CPU: 0 PID: 11735 Comm: trinity-c50 Not tainted 3.14.0-rc2-next-20140214-sasha-00008-g95d9d16-dirty #85
[  522.650021] task: ffff8800c00bb000 ti: ffff88007fdb8000 task.ti: ffff88007fdb8000
[  522.650021] RIP: 0010:[<ffffffff81186c6f>]  [<ffffffff81186c6f>] check_preempt_wakeup+0x11f/0x210
[  522.650021] RSP: 0018:ffff880226e03ba8  EFLAGS: 00010046
[  522.650021] RAX: 0000000000000000 RBX: ffff880226fd79c0 RCX: 0000000000000008
[  522.650021] RDX: 0000000000000000 RSI: ffff880211313000 RDI: 000000000000000c
[  522.650021] RBP: ffff880226e03be8 R08: 0000000000000000 R09: 000000000000b4bb
[  522.650021] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  522.650021] R13: ffff880211313068 R14: ffff8800c00bb000 R15: 0000000000000000
[  522.650021] FS:  00007f435269f700(0000) GS:ffff880226e00000(0000) knlGS:0000000000000000
[  522.650021] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  522.650021] CR2: 0000000000000150 CR3: 00000000abd2c000 CR4: 00000000000006f0
[  522.650021] DR0: 0000000000995750 DR1: 0000000000000000 DR2: 0000000000000000
[  522.650021] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000600
[  522.650021] Stack:
[  522.650021]  ffff880211313000 01ff880226fd79c0 ffff880211313000 ffff880226fd79c0
[  522.650021]  ffff880226fd79c0 ffff880211313000 0000000000000000 ffff880226e00000
[  522.650021]  ffff880226e03c08 ffffffff8117361d ffff880226fd79c0 ffff880226fd79c0
[  522.650021] Call Trace:
[  522.650021]  <IRQ>
[  522.650021]  [<ffffffff8117361d>] check_preempt_curr+0x3d/0xb0
[  522.650021]  [<ffffffff81175d88>] ttwu_do_wakeup+0x18/0x130
[  522.650021]  [<ffffffff81175ee4>] T.2248+0x44/0x50
[  522.650021]  [<ffffffff81175f9e>] ttwu_queue+0xae/0xd0
[  522.650021]  [<ffffffff81180224>] ? try_to_wake_up+0x34/0x2a0
[  522.650021]  [<ffffffff81180454>] try_to_wake_up+0x264/0x2a0
[  522.650021]  [<ffffffff811a1672>] ? __lock_acquired+0x2a2/0x2e0
[  522.650021]  [<ffffffff8118049d>] default_wake_function+0xd/0x10
[  522.650021]  [<ffffffff811952f8>] autoremove_wake_function+0x18/0x40
[  522.650021]  [<ffffffff811951b2>] __wake_up_common+0x52/0x90
[  522.650021]  [<ffffffff8119550d>] ? __wake_up+0x2d/0x70
[  522.650021]  [<ffffffff81195523>] __wake_up+0x43/0x70
[  522.650021]  [<ffffffff843119a3>] p9_client_cb+0x43/0x70
[  522.650021]  [<ffffffff84319d05>] req_done+0x105/0x110
[  522.650021]  [<ffffffff81cafca6>] vring_interrupt+0x86/0xa0
[  522.650021]  [<ffffffff811b9a28>] ? handle_irq_event+0x38/0x70
[  522.650021]  [<ffffffff811b9779>] handle_irq_event_percpu+0x129/0x3a0
[  522.650021]  [<ffffffff811b9a33>] handle_irq_event+0x43/0x70
[  522.650021]  [<ffffffff811bd1e8>] handle_edge_irq+0xe8/0x120
[  522.650021]  [<ffffffff81070a34>] handle_irq+0x164/0x180
[  522.650021]  [<ffffffff811833c9>] ? vtime_account_system+0x79/0x90
[  522.650021]  [<ffffffff81183435>] ? vtime_common_account_irq_enter+0x55/0x60
[  522.650021]  [<ffffffff8106f629>] do_IRQ+0x59/0x100
[  522.650021]  [<ffffffff84395e72>] common_interrupt+0x72/0x72
[  522.650021]  <EOI>
[  522.650021]  [<ffffffff812510d5>] ? context_tracking_user_exit+0x1a5/0x1c0
[  522.650021]  [<ffffffff8107cfdd>] syscall_trace_enter+0x2d/0x280
[  522.650021]  [<ffffffff8439f081>] tracesys+0x7e/0xe2
[  522.650021] Code: 0f 1f 40 00 ff c8 4d 8b ad 48 01 00 00 39 d0 7f f3 eb 18 66 0f 1f 84 00 00 00 00 00 4d 8b a4 24 48 01 00 00 4d 8b ad 48 01 00 00 <49> 8b bc 24 50 01 00 00 49 3b bd 50 01 00 00 75 e0 48 85 ff 74
[  522.650021] RIP  [<ffffffff81186c6f>] check_preempt_wakeup+0x11f/0x210
[  522.650021]  RSP <ffff880226e03ba8>
[  522.650021] CR2: 0000000000000150
[  522.650021] ---[ end trace adce75aec8b1b32f ]---

Since it's pretty inlined, the code points to:

	check_preempt_wakeup()
		find_matching_se()
			is_same_group()


	static inline struct cfs_rq *
	is_same_group(struct sched_entity *se, struct sched_entity *pse)
	{
	        if (se->cfs_rq == pse->cfs_rq)	<=== HERE
	                return se->cfs_rq;
	
	        return NULL;
	}


Thanks,
Sasha


* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-15 23:27 sched: fair: NULL ptr deref in check_preempt_wakeup Sasha Levin
@ 2014-02-15 23:32 ` Sasha Levin
  2014-02-16 19:19 ` Peter Zijlstra
  2014-02-17  8:11 ` Michael wang
  2 siblings, 0 replies; 13+ messages in thread
From: Sasha Levin @ 2014-02-15 23:32 UTC
  To: Ingo Molnar, Peter Zijlstra; +Cc: Dave Jones, LKML

On 02/15/2014 06:27 PM, Sasha Levin wrote:
> Hi folks,
>
> While fuzzing with trinity inside a KVM tools guest running latest -next kernel, I've
> stumbled on the following:

As soon as I finished writing that mail I hit it again, with a different (but similar)
stack trace.

[  770.993016] BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
[  770.993865] IP: [<ffffffff8118ef99>] pick_next_task_fair+0x109/0x290
[  770.994531] PGD 1addee067 PUD 1addef067 PMD 0
[  770.995018] Oops: 0000 [#1] PREEMPT SMP
[  770.995573] Dumping ftrace buffer:
[  770.995928]    (ftrace buffer empty)
[  770.996304] Modules linked in:
[  770.996661] CPU: 0 PID: 13754 Comm: trinity-c155 Not tainted 3.14.0-rc2-next-20140214
[  770.997646] task: ffff88021151b000 ti: ffff88016b9f4000 task.ti: ffff88016b9f4000
[  770.998384] RIP: 0010:[<ffffffff8118ef99>]  [<ffffffff8118ef99>] pick_next_task_fair+
[  770.999254] RSP: 0018:ffff88016b9f5bc8  EFLAGS: 00010097
[  770.999787] RAX: 000000004caed01b RBX: ffff880226fd79c0 RCX: 000000000004ccca
[  771.000035] RDX: 0000000000a7076b RSI: ffff880060ff8028 RDI: ffff88008b998078
[  771.000035] RBP: ffff88016b9f5c08 R08: 0000000000000000 R09: 0000000000000001
[  771.000035] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88008b998000
[  771.000035] R13: 0000000000000000 R14: ffff880226fd7a88 R15: ffff880060ffb7c8
[  771.000035] FS:  00007f6e01002700(0000) GS:ffff880226e00000(0000) knlGS:0000000000000
[  771.000035] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  771.000035] CR2: 0000000000000150 CR3: 00000001feeef000 CR4: 00000000000006f0
[  771.000035] DR0: 00007f6e009b2000 DR1: 0000000000000000 DR2: 0000000000000000
[  771.000035] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000600
[  771.000035] Stack:
[  771.000035]  ffff880100000001 ffffffff00000001 ffff88016b9f5c08 ffff880226fd79c0
[  771.000035]  0000000000000000 ffff88021151b990 0000000000000282 00000000ffffffff
[  771.000035]  ffff88016b9f5c88 ffffffff8438ef35 ffff88016b9f5c78 ffffffff811a19a5
[  771.000035] Call Trace:
[  771.000035]  [<ffffffff8438ef35>] __schedule+0x2a5/0x840
[  771.000035]  [<ffffffff811a19a5>] ? __lock_contended+0x205/0x240
[  771.000035]  [<ffffffff8438f795>] schedule+0x65/0x70
[  771.000035]  [<ffffffff8438fb73>] schedule_preempt_disabled+0x13/0x20
[  771.000035]  [<ffffffff8439105d>] mutex_lock_nested+0x2ad/0x510
[  771.000035]  [<ffffffff812ed326>] ? lookup_slow+0x46/0xd0
[  771.000035]  [<ffffffff812ed70d>] ? unlazy_walk+0x16d/0x1e0
[  771.000035]  [<ffffffff812ed326>] ? lookup_slow+0x46/0xd0
[  771.000035]  [<ffffffff812ed326>] lookup_slow+0x46/0xd0
[  771.000035]  [<ffffffff812efbe5>] path_lookupat+0xe5/0x660
[  771.000035]  [<ffffffff812b97ea>] ? kmem_cache_alloc+0x1fa/0x300
[  771.000035]  [<ffffffff812eb497>] ? getname_flags+0x57/0x1c0
[  771.000035]  [<ffffffff812f018f>] filename_lookup+0x2f/0xd0
[  771.000035]  [<ffffffff812f155c>] user_path_at_empty+0x6c/0xb0
[  771.000035]  [<ffffffff812510b5>] ? context_tracking_user_exit+0x185/0x1c0
[  771.000035]  [<ffffffff811a3ccd>] ? trace_hardirqs_on+0xd/0x10
[  771.000035]  [<ffffffff812f15ac>] user_path_at+0xc/0x10
[  771.000035]  [<ffffffff812de913>] do_sys_truncate+0x43/0xc0
[  771.000035]  [<ffffffff812de9a9>] SyS_truncate+0x9/0x10
[  771.000035]  [<ffffffff8439f0e0>] tracesys+0xdd/0xe2
[  771.000035] Code: 4d 8b ad 48 01 00 00 39 c2 7c 19 4d 8b b7 50 01 00 00 4c 89 fe 4c 89 f7 e8 55 98 ff ff 4d 8b bf 48 01 00 00 4d 8b b7 50 01 00 00 <49> 8b bd 50 01 00 00 49 39 fe 75 a3 4d 85 f6 74 9e 4c 89 ee 4c
[  771.000035] RIP  [<ffffffff8118ef99>] pick_next_task_fair+0x109/0x290
[  771.000035]  RSP <ffff88016b9f5bc8>
[  771.000035] CR2: 0000000000000150
[  771.000035] ---[ end trace 408e14968ec7dd7a ]---


Thanks,
Sasha


* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-15 23:27 sched: fair: NULL ptr deref in check_preempt_wakeup Sasha Levin
  2014-02-15 23:32 ` Sasha Levin
@ 2014-02-16 19:19 ` Peter Zijlstra
  2014-02-17  8:11 ` Michael wang
  2 siblings, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2014-02-16 19:19 UTC
  To: Sasha Levin; +Cc: Ingo Molnar, Dave Jones, LKML

On Sat, Feb 15, 2014 at 06:27:52PM -0500, Sasha Levin wrote:
> Hi folks,
> 
> While fuzzing with trinity inside a KVM tools guest running latest -next kernel, I've
> stumbled on the following:
> 
> [  522.645288] BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
> [  522.646271] IP: [<ffffffff81186c6f>] check_preempt_wakeup+0x11f/0x210
> 
> Since it's pretty inlined, the code points to:
> 
> 	check_preempt_wakeup()
> 		find_matching_se()
> 			is_same_group()
> 
> 
> 	static inline struct cfs_rq *
> 	is_same_group(struct sched_entity *se, struct sched_entity *pse)
> 	{
> 	        if (se->cfs_rq == pse->cfs_rq)	<=== HERE
> 	                return se->cfs_rq;
> 	
> 	        return NULL;
> 	}

Hrm.. that means we got se->depth wrong. I'll have a poke tomorrow.
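
To illustrate why a wrong se->depth ends in a NULL dereference, here is a minimal userspace sketch modelled on the walk in find_matching_se(). The structs are hypothetical, stripped-down stand-ins for the kernel ones, and the function returns -1 where the kernel would fault instead of actually crashing. The first oops (CR2 of 0x150 with R12 zero) is consistent with reading a field at offset 0x150 from a NULL entity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct cfs_rq {
	int id;
};

struct sched_entity {
	int depth;                   /* cached distance from the top level */
	struct sched_entity *parent; /* NULL for a top-level entity */
	struct cfs_rq *cfs_rq;       /* runqueue this entity is queued on */
};

/*
 * Modelled on find_matching_se(): level both entities using the cached
 * depth, then climb in lockstep until they sit on the same cfs_rq.
 * Returns 0 when a common level is found, -1 at the point where the
 * kernel would dereference a NULL entity.
 */
int find_matching_se_model(struct sched_entity **se, struct sched_entity **pse)
{
	int se_depth = (*se)->depth;
	int pse_depth = (*pse)->depth;

	while (se_depth > pse_depth) {
		se_depth--;
		*se = (*se)->parent;
		if (!*se)
			return -1;
	}
	while (pse_depth > se_depth) {
		pse_depth--;
		*pse = (*pse)->parent;
		if (!*pse)
			return -1;
	}
	/* The is_same_group() comparison; the kernel has no NULL check here. */
	for (;;) {
		if (!*se || !*pse)
			return -1;
		if ((*se)->cfs_rq == (*pse)->cfs_rq)
			return 0;
		*se = (*se)->parent;
		*pse = (*pse)->parent;
	}
}

/*
 * Task A runs at the top level; task B lives one group down. The caller
 * picks the depth cached in B: 1 is correct, 0 is the stale value.
 */
int run_scenario(int task_b_depth)
{
	struct cfs_rq root_rq = { 0 }, group_rq = { 1 };
	struct sched_entity group_se = { 0, NULL, &root_rq };
	struct sched_entity task_a = { 0, NULL, &root_rq };
	struct sched_entity task_b = { task_b_depth, &group_se, &group_rq };
	struct sched_entity *se = &task_a, *pse = &task_b;

	assert(task_b.parent == &group_se);
	return find_matching_se_model(&se, &pse);
}
```

Calling run_scenario(1) finds the common level after one levelling step, while run_scenario(0) skips the levelling and walks task A past the top of the hierarchy, which is the point where the model reports -1.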


* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-15 23:27 sched: fair: NULL ptr deref in check_preempt_wakeup Sasha Levin
  2014-02-15 23:32 ` Sasha Levin
  2014-02-16 19:19 ` Peter Zijlstra
@ 2014-02-17  8:11 ` Michael wang
  2014-02-17  9:20   ` Peter Zijlstra
                     ` (2 more replies)
  2 siblings, 3 replies; 13+ messages in thread
From: Michael wang @ 2014-02-17  8:11 UTC
  To: Sasha Levin, Ingo Molnar, Peter Zijlstra; +Cc: Dave Jones, LKML

Hi, Sasha

On 02/16/2014 07:27 AM, Sasha Levin wrote:
> Hi folks,
> 
> While fuzzing with trinity inside a KVM tools guest running latest -next
> kernel, I've
> stumbled on the following:

I've reproduced the same issue with tip/master. The patch below fixed the
problem on my box, and some RCU stall messages disappeared along with it.
Would you like to give it a try?

BTW, I reproduced it with these steps:
1. change current to RT
2. move to a different depth cpu-cgroup
3. change it back to FAIR

It seems to be caused by RT having no task_move_group() implementation
to maintain depth, which leads to a wrong depth after switching back
to FAIR...
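
The three steps above could be driven from userspace roughly as follows. This is only a sketch under assumptions: the helper names are made up for illustration, the cgroup "tasks" file path is supplied by the caller (something like a cpu controller group directory on a cgroup v1 mount), and switching to SCHED_FIFO normally needs root. Depending on the RT group scheduling configuration, step 2 may be refused for an RT task.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Switch the calling task to `policy`; returns 0 or a negative errno. */
int set_policy(int policy, int priority)
{
	struct sched_param sp;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = priority;
	return sched_setscheduler(0, policy, &sp) == 0 ? 0 : -errno;
}

/* Write `pid` into a cgroup "tasks" file; returns 0 or a negative errno. */
int move_to_cgroup(const char *tasks_path, pid_t pid)
{
	char buf[32];
	ssize_t written;
	int len, fd, ret = 0;

	fd = open(tasks_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	len = snprintf(buf, sizeof(buf), "%ld\n", (long)pid);
	assert(len > 0 && (size_t)len < sizeof(buf));

	written = write(fd, buf, (size_t)len);
	if (written < 0)
		ret = -errno;
	else if (written != len)
		ret = -EIO;

	close(fd);
	return ret;
}

/* Hypothetical driver for the reported sequence: FAIR -> RT, move, RT -> FAIR. */
int reproduce(const char *tasks_path)
{
	int ret;

	ret = set_policy(SCHED_FIFO, 1);            /* 1. change current to RT */
	if (ret)
		return ret;

	ret = move_to_cgroup(tasks_path, getpid()); /* 2. move to a deeper group */
	if (ret) {
		set_policy(SCHED_OTHER, 0);         /* best-effort restore */
		return ret;
	}

	return set_policy(SCHED_OTHER, 0);          /* 3. change it back to FAIR */
}
```

On an affected kernel the crash would not come from reproduce() itself but from a later wakeup or pick_next_task involving the task, as in the traces earlier in the thread.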

Regards,
Michael Wang



diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 235cfa7..4445e56 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7317,7 +7317,11 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
  */
 static void switched_to_fair(struct rq *rq, struct task_struct *p)
 {
-	if (!p->se.on_rq)
+	struct sched_entity *se = &p->se;
+#ifdef CONFIG_FAIR_GROUP_SCHED
+	se->depth = se->parent ? se->parent->depth + 1 : 0;
+#endif
+	if (!se->on_rq)
 		return;
 
 	/*
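
As a rough userspace illustration of what the added assignment restores, the sketch below recomputes depth from the parent pointer and checks it against the real number of ancestors. The struct and the helper names are hypothetical; only the depth expression is taken from the patch. It assumes, as described above, that the parent pointer is updated by the group move while depth is left stale.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields the patch touches. */
struct sched_entity {
	int depth;
	struct sched_entity *parent;
};

/* Count ancestors by walking the parent chain. */
int count_ancestors(const struct sched_entity *se)
{
	int n = 0;

	for (se = se->parent; se; se = se->parent)
		n++;
	return n;
}

/* The assignment the patch adds to switched_to_fair(). */
void reset_depth(struct sched_entity *se)
{
	se->depth = se->parent ? se->parent->depth + 1 : 0;
}

/*
 * Model of the reported sequence: the entity is re-parented while depth
 * is left untouched, then depth is reset on the switch back to FAIR.
 * Returns the depth after the reset.
 */
int depth_after_switch_back(struct sched_entity *task,
			    struct sched_entity *new_parent)
{
	task->parent = new_parent; /* group move: depth is now stale */
	reset_depth(task);
	assert(task->depth == count_ancestors(task));
	return task->depth;
}
```

The reset only needs the parent's depth to be correct, so a single assignment is enough and no walk of the hierarchy is required in the kernel path.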





* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-17  8:11 ` Michael wang
@ 2014-02-17  9:20   ` Peter Zijlstra
  2014-02-18  2:26     ` Michael wang
  2014-02-17 21:07   ` Sasha Levin
  2014-02-19 16:16   ` Peter Zijlstra
  2 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2014-02-17  9:20 UTC
  To: Michael wang; +Cc: Sasha Levin, Ingo Molnar, Dave Jones, LKML

On Mon, Feb 17, 2014 at 04:11:09PM +0800, Michael wang wrote:
> BTW, I reproduced it by steps:
> 1. change current to RT
> 2. move to a different depth cpu-cgroup
> 3. change it back to FAIR
> 
> Seems like it was caused by that RT has no task_move_group() implemented
> which could maintain depth, and that lead to a wrong depth after switched
> back to FAIR...


> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 235cfa7..4445e56 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7317,7 +7317,11 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
>   */
>  static void switched_to_fair(struct rq *rq, struct task_struct *p)
>  {
> -	if (!p->se.on_rq)
> +	struct sched_entity *se = &p->se;
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> +	se->depth = se->parent ? se->parent->depth + 1 : 0;
> +#endif
> +	if (!se->on_rq)
>  		return;
>  
>  	/*

Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
absolutely sure we catch all; but if this is sufficient, it's better.

Thanks!


* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-17  9:20   ` Peter Zijlstra
@ 2014-02-18  2:26     ` Michael wang
  2014-02-19 18:10       ` Sasha Levin
  0 siblings, 1 reply; 13+ messages in thread
From: Michael wang @ 2014-02-18  2:26 UTC
  To: Peter Zijlstra; +Cc: Sasha Levin, Ingo Molnar, Dave Jones, LKML

On 02/17/2014 05:20 PM, Peter Zijlstra wrote:
[snip]
>>  static void switched_to_fair(struct rq *rq, struct task_struct *p)
>>  {
>> -	if (!p->se.on_rq)
>> +	struct sched_entity *se = &p->se;
>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>> +	se->depth = se->parent ? se->parent->depth + 1 : 0;
>> +#endif
>> +	if (!se->on_rq)
>>  		return;
>>  
>>  	/*
> 
> Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
> absolutely sure we catch all; but if this is sufficient its better.

Agree, let's wait for Sasha's testing result then :)

Regards,
Michael Wang

> 
> Thanks!



* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-18  2:26     ` Michael wang
@ 2014-02-19 18:10       ` Sasha Levin
  2014-02-19 18:37         ` Peter Zijlstra
  2014-02-20  2:22         ` Michael wang
  0 siblings, 2 replies; 13+ messages in thread
From: Sasha Levin @ 2014-02-19 18:10 UTC
  To: Michael wang, Peter Zijlstra; +Cc: Ingo Molnar, Dave Jones, LKML

On 02/17/2014 09:26 PM, Michael wang wrote:
> On 02/17/2014 05:20 PM, Peter Zijlstra wrote:
> [snip]
>>>  static void switched_to_fair(struct rq *rq, struct task_struct *p)
>>>  {
>>> -	if (!p->se.on_rq)
>>> +	struct sched_entity *se = &p->se;
>>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>>> +	se->depth = se->parent ? se->parent->depth + 1 : 0;
>>> +#endif
>>> +	if (!se->on_rq)
>>>  		return;
>>>
>>>  	/*
>>
>> Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
>> absolutely sure we catch all; but if this is sufficient its better.
>
> Agree, let's wait for Sasha's testing result then :)

I took my time with testing, since it seems I'm hitting new issues with both sched and mm, and I
wanted to confirm I don't see this one any more.

It does seem like this patch fixes the problem for me, so:

	Tested-by: Sasha Levin <sasha.levin@oracle.com>


Thanks,
Sasha


* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-19 18:10       ` Sasha Levin
@ 2014-02-19 18:37         ` Peter Zijlstra
  2014-02-20  2:22         ` Michael wang
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2014-02-19 18:37 UTC
  To: Sasha Levin; +Cc: Michael wang, Ingo Molnar, Dave Jones, LKML

On Wed, Feb 19, 2014 at 01:10:22PM -0500, Sasha Levin wrote:
> On 02/17/2014 09:26 PM, Michael wang wrote:
> > On 02/17/2014 05:20 PM, Peter Zijlstra wrote:
> > [snip]
> >>>  static void switched_to_fair(struct rq *rq, struct task_struct *p)
> >>>  {
> >>> -	if (!p->se.on_rq)
> >>> +	struct sched_entity *se = &p->se;
> >>> +#ifdef CONFIG_FAIR_GROUP_SCHED
> >>> +	se->depth = se->parent ? se->parent->depth + 1 : 0;
> >>> +#endif
> >>> +	if (!se->on_rq)
> >>>  		return;
> >>>
> >>>  	/*
> >>
> >> Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
> >> absolutely sure we catch all; but if this is sufficient its better.
> >
> > Agree, let's wait for Sasha's testing result then :)
> 
> I took my time with testing it seems I'm hitting new issues with both sched
> and mm, and I've wanted to confirm I don't see this one any more.
> 
> It does seem like this patch fixes the problem for me, so:
> 
> 	Tested-by: Sasha Levin <sasha.levin@oracle.com>
> 

Thanks!


* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-19 18:10       ` Sasha Levin
  2014-02-19 18:37         ` Peter Zijlstra
@ 2014-02-20  2:22         ` Michael wang
  1 sibling, 0 replies; 13+ messages in thread
From: Michael wang @ 2014-02-20  2:22 UTC
  To: Sasha Levin, Peter Zijlstra; +Cc: Ingo Molnar, Dave Jones, LKML

On 02/20/2014 02:10 AM, Sasha Levin wrote:
> On 02/17/2014 09:26 PM, Michael wang wrote:
>> On 02/17/2014 05:20 PM, Peter Zijlstra wrote:
>> [snip]
>>>>  static void switched_to_fair(struct rq *rq, struct task_struct *p)
>>>>  {
>>>> -    if (!p->se.on_rq)
>>>> +    struct sched_entity *se = &p->se;
>>>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>>>> +    se->depth = se->parent ? se->parent->depth + 1 : 0;
>>>> +#endif
>>>> +    if (!se->on_rq)
>>>>          return;
>>>>
>>>>      /*
>>>
>>> Yes indeed. My first idea yesterday was to put it in set_task_rq() to be
>>> absolutely sure we catch all; but if this is sufficient its better.
>>
>> Agree, let's wait for Sasha's testing result then :)
> 
> I took my time with testing it seems I'm hitting new issues with both
> sched and mm, and I've wanted to confirm I don't see this one any more.
> 
> It does seem like this patch fixes the problem for me, so:
> 
>     Tested-by: Sasha Levin <sasha.levin@oracle.com>

Thanks for the testing :) will post the patch later.

Regards,
Michael Wang

> 
> 
> Thanks,
> Sasha
> 



* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-17  8:11 ` Michael wang
  2014-02-17  9:20   ` Peter Zijlstra
@ 2014-02-17 21:07   ` Sasha Levin
  2014-02-18  2:28     ` Michael wang
  2014-02-19 16:16   ` Peter Zijlstra
  2 siblings, 1 reply; 13+ messages in thread
From: Sasha Levin @ 2014-02-17 21:07 UTC
  To: Michael wang, Ingo Molnar, Peter Zijlstra; +Cc: Dave Jones, LKML

On 02/17/2014 03:11 AM, Michael wang wrote:
> Hi, Sasha
>
> On 02/16/2014 07:27 AM, Sasha Levin wrote:
>> Hi folks,
>>
>> While fuzzing with trinity inside a KVM tools guest running latest -next
>> kernel, I've
>> stumbled on the following:
>
> I've reproduced the same issue with tip/master, and below patch fixed the
> problem on my box along with some rcu stall info disappeared, would you
> like to have a try?
>
> BTW, I reproduced it by steps:
> 1. change current to RT
> 2. move to a different depth cpu-cgroup
> 3. change it back to FAIR
>
> Seems like it was caused by that RT has no task_move_group() implemented
> which could maintain depth, and that lead to a wrong depth after switched
> back to FAIR...

I *think* it works. There seems to be another sched issue that causes lockups,
so I can't say for certain that this one doesn't occur anymore.

I'm still working on collecting data for the other issue, I'll mail about it soon.


Thanks,
Sasha



* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-17 21:07   ` Sasha Levin
@ 2014-02-18  2:28     ` Michael wang
  0 siblings, 0 replies; 13+ messages in thread
From: Michael wang @ 2014-02-18  2:28 UTC
  To: Sasha Levin, Ingo Molnar, Peter Zijlstra; +Cc: Dave Jones, LKML

On 02/18/2014 05:07 AM, Sasha Levin wrote:
[snip]
> 
> I *think* it works. There seems to be another sched issue that causes
> lockups,
> so I can't say for certain that this one doesn't occur anymore.
> 
> I'm still working on collecting data for the other issue, I'll mail
> about it soon.

Thanks for that, looking forward to the results :)

Regards,
Michael Wang

> 
> 
> Thanks,
> Sasha
> 



* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-17  8:11 ` Michael wang
  2014-02-17  9:20   ` Peter Zijlstra
  2014-02-17 21:07   ` Sasha Levin
@ 2014-02-19 16:16   ` Peter Zijlstra
  2014-02-20  2:18     ` Michael wang
  2 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2014-02-19 16:16 UTC
  To: Michael wang; +Cc: Sasha Levin, Ingo Molnar, Dave Jones, LKML

On Mon, Feb 17, 2014 at 04:11:09PM +0800, Michael wang wrote:
> > While fuzzing with trinity inside a KVM tools guest running latest -next
> > kernel, I've
> > stumbled on the following:
> 
> I've reproduced the same issue with tip/master, and below patch fixed the
> problem on my box along with some rcu stall info disappeared, would you
> like to have a try?
> 
> BTW, I reproduced it by steps:
> 1. change current to RT
> 2. move to a different depth cpu-cgroup
> 3. change it back to FAIR
> 
> Seems like it was caused by that RT has no task_move_group() implemented
> which could maintain depth, and that lead to a wrong depth after switched
> back to FAIR...
> 
> Regards,
> Michael Wang
> 
> 
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 235cfa7..4445e56 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7317,7 +7317,11 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
>   */
>  static void switched_to_fair(struct rq *rq, struct task_struct *p)
>  {
> -	if (!p->se.on_rq)
> +	struct sched_entity *se = &p->se;
> +#ifdef CONFIG_FAIR_GROUP_SCHED
> +	se->depth = se->parent ? se->parent->depth + 1 : 0;
> +#endif
> +	if (!se->on_rq)
>  		return;
>  
>  	/*


Michael, do you think you can send a proper patch for this?


* Re: sched: fair: NULL ptr deref in check_preempt_wakeup
  2014-02-19 16:16   ` Peter Zijlstra
@ 2014-02-20  2:18     ` Michael wang
  0 siblings, 0 replies; 13+ messages in thread
From: Michael wang @ 2014-02-20  2:18 UTC
  To: Peter Zijlstra; +Cc: Sasha Levin, Ingo Molnar, Dave Jones, LKML

On 02/20/2014 12:16 AM, Peter Zijlstra wrote:
[snip]
>>
>>
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 235cfa7..4445e56 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -7317,7 +7317,11 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
>>   */
>>  static void switched_to_fair(struct rq *rq, struct task_struct *p)
>>  {
>> -	if (!p->se.on_rq)
>> +	struct sched_entity *se = &p->se;
>> +#ifdef CONFIG_FAIR_GROUP_SCHED
>> +	se->depth = se->parent ? se->parent->depth + 1 : 0;
>> +#endif
>> +	if (!se->on_rq)
>>  		return;
>>  
>>  	/*
> 
> 
> Michael, do you think you can send a proper patch for this?

My pleasure :) will post it later.

Regards,
Michael Wang


