mlockall triggered rcu_preempt stall.

* mlockall triggered rcu_preempt stall.
@ 2013-07-19 14:53 Dave Jones
  2013-07-19 22:15 ` Paul E. McKenney
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Jones @ 2013-07-19 14:53 UTC (permalink / raw)
  To: Linux Kernel; +Cc: linux-mm, paulmck

My fuzz tester keeps hitting this. Every instance shows that the non-IRQ stack
came in from mlockall.  I'm only seeing this on one box, but it has more
RAM (8 GB) than my other machines, which might explain it.

	Dave

INFO: rcu_preempt self-detected stall on CPU { 3}  (t=6500 jiffies g=470344 c=470343 q=0)
sending NMI to all CPUs:
NMI backtrace for cpu 3
CPU: 3 PID: 29664 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #32
task: ffff88023e743fc0 ti: ffff88022f6f2000 task.ti: ffff88022f6f2000
RIP: 0010:[<ffffffff810bf7d1>]  [<ffffffff810bf7d1>] trace_hardirqs_off_caller+0x21/0xb0
RSP: 0018:ffff880244e03c30  EFLAGS: 00000046
RAX: ffff88023e743fc0 RBX: 0000000000000001 RCX: 000000000000003c
RDX: 000000000000000f RSI: 0000000000000004 RDI: ffffffff81033cab
RBP: ffff880244e03c38 R08: ffff880243288a80 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffff880243288a80
R13: ffff8802437eda40 R14: 0000000000080000 R15: 000000000000d010
FS:  00007f50ae33b740(0000) GS:ffff880244e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000097f000 CR3: 0000000240fa0000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
 ffffffff810bf86d ffff880244e03c98 ffffffff81033cab 0000000000000096
 000000000000d008 0000000300000002 0000000000000004 0000000000000003
 0000000000002710 ffffffff81c50d00 ffffffff81c50d00 ffff880244fcde00
Call Trace:
 <IRQ> 
 [<ffffffff810bf86d>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff81033cab>] __x2apic_send_IPI_mask+0x1ab/0x1c0
 [<ffffffff81033cdc>] x2apic_send_IPI_all+0x1c/0x20
 [<ffffffff81030115>] arch_trigger_all_cpu_backtrace+0x65/0xa0
 [<ffffffff811144b1>] rcu_check_callbacks+0x331/0x8e0
 [<ffffffff8108bfa0>] ? hrtimer_run_queues+0x20/0x180
 [<ffffffff8109e905>] ? sched_clock_cpu+0xb5/0x100
 [<ffffffff81069557>] update_process_times+0x47/0x80
 [<ffffffff810bd115>] tick_sched_handle.isra.16+0x25/0x60
 [<ffffffff810bd231>] tick_sched_timer+0x41/0x60
 [<ffffffff8108ace1>] __run_hrtimer+0x81/0x4e0
 [<ffffffff810bd1f0>] ? tick_sched_do_timer+0x60/0x60
 [<ffffffff8108b93f>] hrtimer_interrupt+0xff/0x240
 [<ffffffff8102de84>] local_apic_timer_interrupt+0x34/0x60
 [<ffffffff81718c5f>] smp_apic_timer_interrupt+0x3f/0x60
 [<ffffffff817178ef>] apic_timer_interrupt+0x6f/0x80
 [<ffffffff8170e8e0>] ? retint_restore_args+0xe/0xe
 [<ffffffff8105f101>] ? __do_softirq+0xb1/0x440
 [<ffffffff8105f64d>] irq_exit+0xcd/0xe0
 [<ffffffff81718c65>] smp_apic_timer_interrupt+0x45/0x60
 [<ffffffff817178ef>] apic_timer_interrupt+0x6f/0x80
 <EOI> 
 [<ffffffff8170e8e0>] ? retint_restore_args+0xe/0xe
 [<ffffffff8170b830>] ? wait_for_completion_killable+0x170/0x170
 [<ffffffff8170c853>] ? preempt_schedule_irq+0x53/0x90
 [<ffffffff8170e9f6>] retint_kernel+0x26/0x30
 [<ffffffff8107a523>] ? queue_work_on+0x43/0x90
 [<ffffffff8107c369>] schedule_on_each_cpu+0xc9/0x1a0
 [<ffffffff81167770>] ? lru_add_drain+0x50/0x50
 [<ffffffff811677c5>] lru_add_drain_all+0x15/0x20
 [<ffffffff81186965>] SyS_mlockall+0xa5/0x1a0
 [<ffffffff81716e94>] tracesys+0xdd/0xe2
Code: 5d c3 0f 1f 84 00 00 00 00 00 44 8b 1d 29 73 bd 00 65 48 8b 04 25 00 ba 00 00 45 85 db 74 69 44 8b 90 a4 06 00 00 45 85 d2 75 5d <44> 8b 0d a0 47 00 01 45 85 c9 74 33 44 8b 80 70 06 00 00 45 85 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: mlockall triggered rcu_preempt stall.
  2013-07-19 14:53 mlockall triggered rcu_preempt stall Dave Jones
@ 2013-07-19 22:15 ` Paul E. McKenney
  2013-07-20  0:32   ` Dave Jones
  0 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2013-07-19 22:15 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm
  Cc: kosaki.motohiro, walken, akpm, torvalds

On Fri, Jul 19, 2013 at 10:53:23AM -0400, Dave Jones wrote:
> My fuzz tester keeps hitting this. Every instance shows the non-irq stack
> came in from mlockall.  I'm only seeing this on one box, but that has more
> ram (8gb) than my other machines, which might explain it.

Are you building CONFIG_PREEMPT=n?  I don't see any preemption points in
do_mlockall(), so a range containing enough vmas might well stall the
CPU in that case.  
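
For illustration only, here is a minimal userspace sketch of what a
preemption point in a long per-range loop looks like.  It is not kernel
code: struct region, yield_point() and walk_regions() are hypothetical
names, and sched_yield() is assumed here as a rough userspace stand-in
for the kernel's cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma: one address range in a linked list. */
struct region {
	unsigned long start, end;
	struct region *next;
};

/* Assumption: sched_yield() plays the role of the kernel's cond_resched(). */
static void yield_point(void)
{
	sched_yield();
}

/*
 * Walk every region, do some per-region work, and offer the CPU to other
 * tasks after each one.  Returns the total number of bytes covered.
 */
static unsigned long walk_regions(struct region *head)
{
	unsigned long bytes = 0;

	for (struct region *r = head; r != NULL; r = r->next) {
		assert(r->end >= r->start);
		bytes += r->end - r->start;	/* placeholder for the real per-range work */
		yield_point();			/* the preemption point */
	}
	return bytes;
}
```

Without the yield_point() call, a list with enough entries would keep
the CPU for the whole walk on a build that never preempts this path.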

Does the patch below help?  If so, we probably need others, but let's
first see if this one helps.  ;-)

CCing the MM guys and those who have most recently touched do_mlockall()
for their insight as well.

							Thanx, Paul

> 	Dave

------------------------------------------------------------------------

mm: Place preemption point in do_mlockall() loop

There is a loop in do_mlockall() that lacks a preemption point, which
means that the following can happen on non-preemptible builds of the
kernel:

> My fuzz tester keeps hitting this. Every instance shows the non-irq stack
> came in from mlockall.  I'm only seeing this on one box, but that has more
> ram (8gb) than my other machines, which might explain it.
>
> 	Dave
>
> INFO: rcu_preempt self-detected stall on CPU { 3}  (t=6500 jiffies g=470344 c=470343 q=0)
> sending NMI to all CPUs:
> NMI backtrace for cpu 3
> CPU: 3 PID: 29664 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #32
> task: ffff88023e743fc0 ti: ffff88022f6f2000 task.ti: ffff88022f6f2000
> RIP: 0010:[<ffffffff810bf7d1>]  [<ffffffff810bf7d1>] trace_hardirqs_off_caller+0x21/0xb0
> RSP: 0018:ffff880244e03c30  EFLAGS: 00000046
> RAX: ffff88023e743fc0 RBX: 0000000000000001 RCX: 000000000000003c
> RDX: 000000000000000f RSI: 0000000000000004 RDI: ffffffff81033cab
> RBP: ffff880244e03c38 R08: ffff880243288a80 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000001 R12: ffff880243288a80
> R13: ffff8802437eda40 R14: 0000000000080000 R15: 000000000000d010
> FS:  00007f50ae33b740(0000) GS:ffff880244e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000000097f000 CR3: 0000000240fa0000 CR4: 00000000001407e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> Stack:
>  ffffffff810bf86d ffff880244e03c98 ffffffff81033cab 0000000000000096
>  000000000000d008 0000000300000002 0000000000000004 0000000000000003
>  0000000000002710 ffffffff81c50d00 ffffffff81c50d00 ffff880244fcde00
> Call Trace:
>  <IRQ>
>  [<ffffffff810bf86d>] ? trace_hardirqs_off+0xd/0x10
>  [<ffffffff81033cab>] __x2apic_send_IPI_mask+0x1ab/0x1c0
>  [<ffffffff81033cdc>] x2apic_send_IPI_all+0x1c/0x20
>  [<ffffffff81030115>] arch_trigger_all_cpu_backtrace+0x65/0xa0
>  [<ffffffff811144b1>] rcu_check_callbacks+0x331/0x8e0
>  [<ffffffff8108bfa0>] ? hrtimer_run_queues+0x20/0x180
>  [<ffffffff8109e905>] ? sched_clock_cpu+0xb5/0x100
>  [<ffffffff81069557>] update_process_times+0x47/0x80
>  [<ffffffff810bd115>] tick_sched_handle.isra.16+0x25/0x60
>  [<ffffffff810bd231>] tick_sched_timer+0x41/0x60
>  [<ffffffff8108ace1>] __run_hrtimer+0x81/0x4e0
>  [<ffffffff810bd1f0>] ? tick_sched_do_timer+0x60/0x60
>  [<ffffffff8108b93f>] hrtimer_interrupt+0xff/0x240
>  [<ffffffff8102de84>] local_apic_timer_interrupt+0x34/0x60
>  [<ffffffff81718c5f>] smp_apic_timer_interrupt+0x3f/0x60
>  [<ffffffff817178ef>] apic_timer_interrupt+0x6f/0x80
>  [<ffffffff8170e8e0>] ? retint_restore_args+0xe/0xe
>  [<ffffffff8105f101>] ? __do_softirq+0xb1/0x440
>  [<ffffffff8105f64d>] irq_exit+0xcd/0xe0
>  [<ffffffff81718c65>] smp_apic_timer_interrupt+0x45/0x60
>  [<ffffffff817178ef>] apic_timer_interrupt+0x6f/0x80
>  <EOI>
>  [<ffffffff8170e8e0>] ? retint_restore_args+0xe/0xe
>  [<ffffffff8170b830>] ? wait_for_completion_killable+0x170/0x170
>  [<ffffffff8170c853>] ? preempt_schedule_irq+0x53/0x90
>  [<ffffffff8170e9f6>] retint_kernel+0x26/0x30
>  [<ffffffff8107a523>] ? queue_work_on+0x43/0x90
>  [<ffffffff8107c369>] schedule_on_each_cpu+0xc9/0x1a0
>  [<ffffffff81167770>] ? lru_add_drain+0x50/0x50
>  [<ffffffff811677c5>] lru_add_drain_all+0x15/0x20
>  [<ffffffff81186965>] SyS_mlockall+0xa5/0x1a0
>  [<ffffffff81716e94>] tracesys+0xdd/0xe2

This commit addresses this problem by inserting the required preemption
point.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/mm/mlock.c b/mm/mlock.c
index 79b7cf7..92022eb 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -506,6 +506,7 @@ static int do_mlockall(int flags)
 
 		/* Ignore errors */
 		mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
+		cond_resched();
 	}
 out:
 	return 0;


* Re: mlockall triggered rcu_preempt stall.
  2013-07-19 22:15 ` Paul E. McKenney
@ 2013-07-20  0:32   ` Dave Jones
  2013-07-20 14:00     ` Paul E. McKenney
  2013-07-30 17:57     ` Paul E. McKenney
  0 siblings, 2 replies; 6+ messages in thread
From: Dave Jones @ 2013-07-20  0:32 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Linux Kernel, linux-mm, kosaki.motohiro, walken, akpm, torvalds

On Fri, Jul 19, 2013 at 03:15:39PM -0700, Paul E. McKenney wrote:
 > On Fri, Jul 19, 2013 at 10:53:23AM -0400, Dave Jones wrote:
 > > My fuzz tester keeps hitting this. Every instance shows the non-irq stack
 > > came in from mlockall.  I'm only seeing this on one box, but that has more
 > > ram (8gb) than my other machines, which might explain it.
 > 
 > Are you building CONFIG_PREEMPT=n?  I don't see any preemption points in
 > do_mlockall(), so a range containing enough vmas might well stall the
 > CPU in that case.  

That was with full preempt.

 > Does the patch below help?  If so, we probably need others, but let's
 > first see if this one helps.  ;-)

I'll try it on Monday.

	Dave


* Re: mlockall triggered rcu_preempt stall.
  2013-07-20  0:32   ` Dave Jones
@ 2013-07-20 14:00     ` Paul E. McKenney
  2013-07-30 17:57     ` Paul E. McKenney
  1 sibling, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2013-07-20 14:00 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm, kosaki.motohiro, walken, akpm,
	torvalds

On Fri, Jul 19, 2013 at 08:32:12PM -0400, Dave Jones wrote:
> On Fri, Jul 19, 2013 at 03:15:39PM -0700, Paul E. McKenney wrote:
>  > On Fri, Jul 19, 2013 at 10:53:23AM -0400, Dave Jones wrote:
>  > > My fuzz tester keeps hitting this. Every instance shows the non-irq stack
>  > > came in from mlockall.  I'm only seeing this on one box, but that has more
>  > > ram (8gb) than my other machines, which might explain it.
>  > 
>  > Are you building CONFIG_PREEMPT=n?  I don't see any preemption points in
>  > do_mlockall(), so a range containing enough vmas might well stall the
>  > CPU in that case.  
> 
> That was with full preempt.
> 
>  > Does the patch below help?  If so, we probably need others, but let's
>  > first see if this one helps.  ;-)
> 
> I'll try it on Monday.

Given full preempt, I wouldn't think that my patch would have any effect,
but look forward to hearing what happens.

Hmmm....  Were you running mlockall() concurrently from a bunch of
different processes sharing lots of memory via mmap() or some such?
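
A minimal sketch of that scenario, in case it helps to try it outside
the fuzzer.  This is an assumption about what such a workload might look
like, not what trinity actually does; run_concurrent_mlockall() is a
hypothetical helper, and mlockall() may simply fail with ENOMEM or EPERM
under RLIMIT_MEMLOCK, which the sketch tolerates.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork nproc children that all inherit one shared anonymous mapping of
 * len bytes, and have each child call mlockall().  Returns the number of
 * children that exited normally, or -1 if the mapping could not be set up.
 */
static int run_concurrent_mlockall(int nproc, size_t len)
{
	int reaped = 0;
	int status;
	void *shared;

	assert(nproc > 0 && len > 0);

	shared = mmap(NULL, len, PROT_READ | PROT_WRITE,
		      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (shared == MAP_FAILED)
		return -1;

	for (int i = 0; i < nproc; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;		/* could not fork; reap what we have */
		if (pid == 0) {
			/* Failure (ENOMEM, EPERM) is acceptable in this sketch. */
			if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0)
				munlockall();
			_exit(0);
		}
	}

	/* Reap every child and count the ones that exited normally. */
	while (wait(&status) > 0)
		if (WIFEXITED(status))
			reaped++;

	munmap(shared, len);
	return reaped;
}
```

Calling run_concurrent_mlockall(4, 1 << 20), for example, starts four
children over a 1 MiB shared mapping; larger counts and sizes would be
needed to approach the load a fuzzer generates.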

							Thanx, Paul


* Re: mlockall triggered rcu_preempt stall.
  2013-07-20  0:32   ` Dave Jones
  2013-07-20 14:00     ` Paul E. McKenney
@ 2013-07-30 17:57     ` Paul E. McKenney
  2013-07-30 18:05       ` Dave Jones
  1 sibling, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2013-07-30 17:57 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm, kosaki.motohiro, walken, akpm,
	torvalds

On Fri, Jul 19, 2013 at 08:32:12PM -0400, Dave Jones wrote:
> On Fri, Jul 19, 2013 at 03:15:39PM -0700, Paul E. McKenney wrote:
>  > On Fri, Jul 19, 2013 at 10:53:23AM -0400, Dave Jones wrote:
>  > > My fuzz tester keeps hitting this. Every instance shows the non-irq stack
>  > > came in from mlockall.  I'm only seeing this on one box, but that has more
>  > > ram (8gb) than my other machines, which might explain it.
>  > 
>  > Are you building CONFIG_PREEMPT=n?  I don't see any preemption points in
>  > do_mlockall(), so a range containing enough vmas might well stall the
>  > CPU in that case.  
> 
> That was with full preempt.
> 
>  > Does the patch below help?  If so, we probably need others, but let's
>  > first see if this one helps.  ;-)
> 
> I'll try it on Monday.

Any news?  If I don't hear otherwise, I will assume that the patch did
not help, and will therefore drop it.

							Thanx, Paul


* Re: mlockall triggered rcu_preempt stall.
  2013-07-30 17:57     ` Paul E. McKenney
@ 2013-07-30 18:05       ` Dave Jones
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Jones @ 2013-07-30 18:05 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Linux Kernel, linux-mm, kosaki.motohiro, walken, akpm, torvalds

On Tue, Jul 30, 2013 at 10:57:18AM -0700, Paul E. McKenney wrote:
 > On Fri, Jul 19, 2013 at 08:32:12PM -0400, Dave Jones wrote:
 > > On Fri, Jul 19, 2013 at 03:15:39PM -0700, Paul E. McKenney wrote:
 > >  > On Fri, Jul 19, 2013 at 10:53:23AM -0400, Dave Jones wrote:
 > >  > > My fuzz tester keeps hitting this. Every instance shows the non-irq stack
 > >  > > came in from mlockall.  I'm only seeing this on one box, but that has more
 > >  > > ram (8gb) than my other machines, which might explain it.
 > >  > 
 > >  > Are you building CONFIG_PREEMPT=n?  I don't see any preemption points in
 > >  > do_mlockall(), so a range containing enough vmas might well stall the
 > >  > CPU in that case.  
 > > 
 > > That was with full preempt.
 > > 
 > >  > Does the patch below help?  If so, we probably need others, but let's
 > >  > first see if this one helps.  ;-)
 > > 
 > > I'll try it on Monday.
 > 
 > Any news?  If I don't hear otherwise, I will assume that the patch did
 > not help, and will therefore drop it.

I wasn't able to do any tests yesterday, because I kept hitting other oopses.
I've got patches for the more obvious ones now, so I'll start testing rc3 this
afternoon.

	Dave

