public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* soft lockup in sysvipc code.
@ 2013-09-04 21:50 Dave Jones
  2013-09-06  3:34 ` Lin Ming
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Dave Jones @ 2013-09-04 21:50 UTC (permalink / raw)
  To: Linux Kernel; +Cc: manfred

Haven't seen this before.
Tree based on v3.11-3104-gf357a82

BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:25479]
Modules linked in: sctp snd_seq_dummy fuse dlci rfcomm tun bnep hidp ipt_ULOG nfnetlink can_raw can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs snd_hda_codec_realtek libcrc32c snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr usb_debug e1000e ptp pps_core
irq event stamp: 1143030
hardirqs last  enabled at (1143029): [<ffffffff81715d5c>] restore_args+0x0/0x30
hardirqs last disabled at (1143030): [<ffffffff8171f9aa>] apic_timer_interrupt+0x6a/0x80
softirqs last  enabled at (1143028): [<ffffffff8105a938>] __do_softirq+0x198/0x460
softirqs last disabled at (1143023): [<ffffffff8105ae55>] irq_exit+0x135/0x150
CPU: 0 PID: 25479 Comm: trinity-child0 Not tainted 3.11.0+ #44
task: ffff88022c013f90 ti: ffff88022bd8c000 task.ti: ffff88022bd8c000
RIP: 0010:[<ffffffff8131014b>]  [<ffffffff8131014b>] idr_find_slowpath+0x9b/0x150
RSP: 0018:ffff88022bd8dc88  EFLAGS: 00000206
RAX: 0000000000000006 RBX: ffff000a6c0a0000 RCX: 0000000000000008
RDX: 0000000000000008 RSI: ffffffff81c41040 RDI: ffff88022c014668
RBP: ffff88022bd8dca0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88023831a290
R13: 0000000000000001 R14: ffff88022bd8dbe8 R15: ffff8802449dffff
FS:  00007fcfcad2c740(0000) GS:ffff880244800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fcfc84cb968 CR3: 00000001de93f000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 0000000000000260 0000000000002dba ffffffff81c7e258 ffff88022bd8dcf8
 ffffffff812b1131 ffff88022c013f90 ffff8801d37174c0 ffff88022bd8dd38
 ffffffff81c7e2f0 ffff88022bd8dd38 ffff8801e065cec8 ffff880241d86ca8
Call Trace:
 [<ffffffff812b1131>] sysvipc_find_ipc+0x61/0x300
 [<ffffffff812b1416>] sysvipc_proc_next+0x46/0xd0
 [<ffffffff811e13b9>] traverse.isra.7+0xc9/0x260
 [<ffffffff810c4078>] ? lock_release_non_nested+0x308/0x350
 [<ffffffff811e1b61>] seq_read+0x3e1/0x450
 [<ffffffff81228f10>] ? proc_reg_write+0x80/0x80
 [<ffffffff81228f4d>] proc_reg_read+0x3d/0x80
 [<ffffffff811ba3a3>] do_loop_readv_writev+0x63/0x90
 [<ffffffff811bbccd>] do_readv_writev+0x21d/0x240
 [<ffffffff810994af>] ? local_clock+0x3f/0x50
 [<ffffffff81152376>] ? context_tracking_user_exit+0x46/0x1a0
 [<ffffffff811bbd25>] vfs_readv+0x35/0x60
 [<ffffffff811bbff2>] SyS_preadv+0xa2/0xd0
 [<ffffffff8171ef54>] tracesys+0xdd/0xe2
Code: 7e 6e 41 8b 84 24 2c 08 00 00 83 eb 08 c1 e0 03 39 c3 0f 85 c1 00 00 00 89 d9 44 89 e8 d3 f8 0f b6 c0 48 83 c0 04 4d 8b 64 c4 08 <e8> 80 b4 d6 ff 85 c0 74 c4 80 3d f7 2f 9d 00 00 75 bb e8 6e b4 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: soft lockup in sysvipc code.
  2013-09-04 21:50 soft lockup in sysvipc code Dave Jones
@ 2013-09-06  3:34 ` Lin Ming
  2013-09-07  9:11 ` Manfred Spraul
  2013-09-09  2:35 ` Ramkumar Ramachandra
  2 siblings, 0 replies; 6+ messages in thread
From: Lin Ming @ 2013-09-06  3:34 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, manfred

On Thu, Sep 5, 2013 at 5:50 AM, Dave Jones <davej@redhat.com> wrote:
> Haven't seen this before.
> Tree based on v3.11-3104-gf357a82
>
> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:25479]

Can't imagine how it could happen.
In my understanding, "soft lockup" happens when code stuck at
somewhere with preemption disabled.

Look at the code, preemption disabled at:
sysvipc_proc_next -> sysvipc_find_ipc -> ipc_lock_by_ptr

enabled at:
sysvipc_proc_next -> ipc_unlock
or
sysvipc_proc_stop -> ipc_unlock

And I didn't find code may stuck in the path.
I may miss something ......

Regards,
Lin Ming

> Modules linked in: sctp snd_seq_dummy fuse dlci rfcomm tun bnep hidp ipt_ULOG nfnetlink can_raw can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs snd_hda_codec_realtek libcrc32c snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr usb_debug e1000e ptp pps_core
> irq event stamp: 1143030
> hardirqs last  enabled at (1143029): [<ffffffff81715d5c>] restore_args+0x0/0x30
> hardirqs last disabled at (1143030): [<ffffffff8171f9aa>] apic_timer_interrupt+0x6a/0x80
> softirqs last  enabled at (1143028): [<ffffffff8105a938>] __do_softirq+0x198/0x460
> softirqs last disabled at (1143023): [<ffffffff8105ae55>] irq_exit+0x135/0x150
> CPU: 0 PID: 25479 Comm: trinity-child0 Not tainted 3.11.0+ #44
> task: ffff88022c013f90 ti: ffff88022bd8c000 task.ti: ffff88022bd8c000
> RIP: 0010:[<ffffffff8131014b>]  [<ffffffff8131014b>] idr_find_slowpath+0x9b/0x150
> RSP: 0018:ffff88022bd8dc88  EFLAGS: 00000206
> RAX: 0000000000000006 RBX: ffff000a6c0a0000 RCX: 0000000000000008
> RDX: 0000000000000008 RSI: ffffffff81c41040 RDI: ffff88022c014668
> RBP: ffff88022bd8dca0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffff88023831a290
> R13: 0000000000000001 R14: ffff88022bd8dbe8 R15: ffff8802449dffff
> FS:  00007fcfcad2c740(0000) GS:ffff880244800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fcfc84cb968 CR3: 00000001de93f000 CR4: 00000000001407f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
>  0000000000000260 0000000000002dba ffffffff81c7e258 ffff88022bd8dcf8
>  ffffffff812b1131 ffff88022c013f90 ffff8801d37174c0 ffff88022bd8dd38
>  ffffffff81c7e2f0 ffff88022bd8dd38 ffff8801e065cec8 ffff880241d86ca8
> Call Trace:
>  [<ffffffff812b1131>] sysvipc_find_ipc+0x61/0x300
>  [<ffffffff812b1416>] sysvipc_proc_next+0x46/0xd0
>  [<ffffffff811e13b9>] traverse.isra.7+0xc9/0x260
>  [<ffffffff810c4078>] ? lock_release_non_nested+0x308/0x350
>  [<ffffffff811e1b61>] seq_read+0x3e1/0x450
>  [<ffffffff81228f10>] ? proc_reg_write+0x80/0x80
>  [<ffffffff81228f4d>] proc_reg_read+0x3d/0x80
>  [<ffffffff811ba3a3>] do_loop_readv_writev+0x63/0x90
>  [<ffffffff811bbccd>] do_readv_writev+0x21d/0x240
>  [<ffffffff810994af>] ? local_clock+0x3f/0x50
>  [<ffffffff81152376>] ? context_tracking_user_exit+0x46/0x1a0
>  [<ffffffff811bbd25>] vfs_readv+0x35/0x60
>  [<ffffffff811bbff2>] SyS_preadv+0xa2/0xd0
>  [<ffffffff8171ef54>] tracesys+0xdd/0xe2
> Code: 7e 6e 41 8b 84 24 2c 08 00 00 83 eb 08 c1 e0 03 39 c3 0f 85 c1 00 00 00 89 d9 44 89 e8 d3 f8 0f b6 c0 48 83 c0 04 4d 8b 64 c4 08 <e8> 80 b4 d6 ff 85 c0 74 c4 80 3d f7 2f 9d 00 00 75 bb e8 6e b4

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: soft lockup in sysvipc code.
  2013-09-04 21:50 soft lockup in sysvipc code Dave Jones
  2013-09-06  3:34 ` Lin Ming
@ 2013-09-07  9:11 ` Manfred Spraul
  2013-09-09  1:20   ` Davidlohr Bueso
  2013-09-09  2:35 ` Ramkumar Ramachandra
  2 siblings, 1 reply; 6+ messages in thread
From: Manfred Spraul @ 2013-09-07  9:11 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, Davidlohr Bueso

Hi Dave,

On 09/04/2013 11:50 PM, Dave Jones wrote:
> Haven't seen this before.
> Tree based on v3.11-3104-gf357a82
>
> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:25479]
> Modules linked in: sctp snd_seq_dummy fuse dlci rfcomm tun bnep hidp ipt_ULOG nfnetlink can_raw can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs snd_hda_codec_realtek libcrc32c snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr usb_debug e1000e ptp pps_core
> irq event stamp: 1143030
> hardirqs last  enabled at (1143029): [<ffffffff81715d5c>] restore_args+0x0/0x30
> hardirqs last disabled at (1143030): [<ffffffff8171f9aa>] apic_timer_interrupt+0x6a/0x80
> softirqs last  enabled at (1143028): [<ffffffff8105a938>] __do_softirq+0x198/0x460
> softirqs last disabled at (1143023): [<ffffffff8105ae55>] irq_exit+0x135/0x150
> CPU: 0 PID: 25479 Comm: trinity-child0 Not tainted 3.11.0+ #44
> task: ffff88022c013f90 ti: ffff88022bd8c000 task.ti: ffff88022bd8c000
> RIP: 0010:[<ffffffff8131014b>]  [<ffffffff8131014b>] idr_find_slowpath+0x9b/0x150
> RSP: 0018:ffff88022bd8dc88  EFLAGS: 00000206
> RAX: 0000000000000006 RBX: ffff000a6c0a0000 RCX: 0000000000000008
> RDX: 0000000000000008 RSI: ffffffff81c41040 RDI: ffff88022c014668
> RBP: ffff88022bd8dca0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffff88023831a290
> R13: 0000000000000001 R14: ffff88022bd8dbe8 R15: ffff8802449dffff
> FS:  00007fcfcad2c740(0000) GS:ffff880244800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fcfc84cb968 CR3: 00000001de93f000 CR4: 00000000001407f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
>   0000000000000260 0000000000002dba ffffffff81c7e258 ffff88022bd8dcf8
>   ffffffff812b1131 ffff88022c013f90 ffff8801d37174c0 ffff88022bd8dd38
>   ffffffff81c7e2f0 ffff88022bd8dd38 ffff8801e065cec8 ffff880241d86ca8
> Call Trace:
>   [<ffffffff812b1131>] sysvipc_find_ipc+0x61/0x300
>   [<ffffffff812b1416>] sysvipc_proc_next+0x46/0xd0
>   [<ffffffff811e13b9>] traverse.isra.7+0xc9/0x260
>   [<ffffffff810c4078>] ? lock_release_non_nested+0x308/0x350
>   [<ffffffff811e1b61>] seq_read+0x3e1/0x450
>   [<ffffffff81228f10>] ? proc_reg_write+0x80/0x80
>   [<ffffffff81228f4d>] proc_reg_read+0x3d/0x80
Do you have any details about load?

I haven't seen it either - but I don't have a stress test that does
#while true;cat /proc/sysvipc/*>/dev/null;done
in parallel with create/remove/whatever operations.

Davidlohr: Have you done any stress tests for the /proc interface?

--
     Manfred

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: soft lockup in sysvipc code.
  2013-09-07  9:11 ` Manfred Spraul
@ 2013-09-09  1:20   ` Davidlohr Bueso
  2013-09-09  1:51     ` Dave Jones
  0 siblings, 1 reply; 6+ messages in thread
From: Davidlohr Bueso @ 2013-09-09  1:20 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Dave Jones, Linux Kernel, Davidlohr Bueso

On Sat, 2013-09-07 at 11:11 +0200, Manfred Spraul wrote:
> Hi Dave,
> 
> On 09/04/2013 11:50 PM, Dave Jones wrote:
> > Haven't seen this before.
> > Tree based on v3.11-3104-gf357a82
> >
> > BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:25479]
> > Modules linked in: sctp snd_seq_dummy fuse dlci rfcomm tun bnep hidp ipt_ULOG nfnetlink can_raw can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs snd_hda_codec_realtek libcrc32c snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr usb_debug e1000e ptp pps_core
> > irq event stamp: 1143030
> > hardirqs last  enabled at (1143029): [<ffffffff81715d5c>] restore_args+0x0/0x30
> > hardirqs last disabled at (1143030): [<ffffffff8171f9aa>] apic_timer_interrupt+0x6a/0x80
> > softirqs last  enabled at (1143028): [<ffffffff8105a938>] __do_softirq+0x198/0x460
> > softirqs last disabled at (1143023): [<ffffffff8105ae55>] irq_exit+0x135/0x150
> > CPU: 0 PID: 25479 Comm: trinity-child0 Not tainted 3.11.0+ #44
> > task: ffff88022c013f90 ti: ffff88022bd8c000 task.ti: ffff88022bd8c000
> > RIP: 0010:[<ffffffff8131014b>]  [<ffffffff8131014b>] idr_find_slowpath+0x9b/0x150
> > RSP: 0018:ffff88022bd8dc88  EFLAGS: 00000206
> > RAX: 0000000000000006 RBX: ffff000a6c0a0000 RCX: 0000000000000008
> > RDX: 0000000000000008 RSI: ffffffff81c41040 RDI: ffff88022c014668
> > RBP: ffff88022bd8dca0 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000001 R11: 0000000000000001 R12: ffff88023831a290
> > R13: 0000000000000001 R14: ffff88022bd8dbe8 R15: ffff8802449dffff
> > FS:  00007fcfcad2c740(0000) GS:ffff880244800000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fcfc84cb968 CR3: 00000001de93f000 CR4: 00000000001407f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Stack:
> >   0000000000000260 0000000000002dba ffffffff81c7e258 ffff88022bd8dcf8
> >   ffffffff812b1131 ffff88022c013f90 ffff8801d37174c0 ffff88022bd8dd38
> >   ffffffff81c7e2f0 ffff88022bd8dd38 ffff8801e065cec8 ffff880241d86ca8
> > Call Trace:
> >   [<ffffffff812b1131>] sysvipc_find_ipc+0x61/0x300
> >   [<ffffffff812b1416>] sysvipc_proc_next+0x46/0xd0
> >   [<ffffffff811e13b9>] traverse.isra.7+0xc9/0x260
> >   [<ffffffff810c4078>] ? lock_release_non_nested+0x308/0x350
> >   [<ffffffff811e1b61>] seq_read+0x3e1/0x450
> >   [<ffffffff81228f10>] ? proc_reg_write+0x80/0x80
> >   [<ffffffff81228f4d>] proc_reg_read+0x3d/0x80
> Do you have any details about load?
> 
> I haven't seen it either - but I don't have a stress test that does
> #while true;cat /proc/sysvipc/*>/dev/null;done
> in parallel with create/remove/whatever operations.
> 
> Davidlohr: Have you done any stress tests for the /proc interface?
> 

I don't. I tried reproducing the issue by a trivial multithreaded
program were each CPU does parallel reads to the
different /proc/sysvipc/ files, but no luck triggering any lockups.

Dave, are you able to reproduce this on demand? Any chance of bisecting
this?

Thanks,
Davidlohr

> --
>      Manfred



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: soft lockup in sysvipc code.
  2013-09-09  1:20   ` Davidlohr Bueso
@ 2013-09-09  1:51     ` Dave Jones
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Jones @ 2013-09-09  1:51 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: Manfred Spraul, Linux Kernel, Davidlohr Bueso

On Sun, Sep 08, 2013 at 06:20:18PM -0700, Davidlohr Bueso wrote:
 > On Sat, 2013-09-07 at 11:11 +0200, Manfred Spraul wrote:
 > > Hi Dave,
 > > 
 > > On 09/04/2013 11:50 PM, Dave Jones wrote:
 > > > Haven't seen this before.
 > > > Tree based on v3.11-3104-gf357a82
 > > >
 > > > BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:25479]
 > > > Modules linked in: sctp snd_seq_dummy fuse dlci rfcomm tun bnep hidp ipt_ULOG nfnetlink can_raw can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs snd_hda_codec_realtek libcrc32c snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr usb_debug e1000e ptp pps_core
 > > > irq event stamp: 1143030
 > > > hardirqs last  enabled at (1143029): [<ffffffff81715d5c>] restore_args+0x0/0x30
 > > > hardirqs last disabled at (1143030): [<ffffffff8171f9aa>] apic_timer_interrupt+0x6a/0x80
 > > > softirqs last  enabled at (1143028): [<ffffffff8105a938>] __do_softirq+0x198/0x460
 > > > softirqs last disabled at (1143023): [<ffffffff8105ae55>] irq_exit+0x135/0x150
 > > > CPU: 0 PID: 25479 Comm: trinity-child0 Not tainted 3.11.0+ #44
 > > > task: ffff88022c013f90 ti: ffff88022bd8c000 task.ti: ffff88022bd8c000
 > > > RIP: 0010:[<ffffffff8131014b>]  [<ffffffff8131014b>] idr_find_slowpath+0x9b/0x150
 > > > RSP: 0018:ffff88022bd8dc88  EFLAGS: 00000206
 > > > RAX: 0000000000000006 RBX: ffff000a6c0a0000 RCX: 0000000000000008
 > > > RDX: 0000000000000008 RSI: ffffffff81c41040 RDI: ffff88022c014668
 > > > RBP: ffff88022bd8dca0 R08: 0000000000000000 R09: 0000000000000000
 > > > R10: 0000000000000001 R11: 0000000000000001 R12: ffff88023831a290
 > > > R13: 0000000000000001 R14: ffff88022bd8dbe8 R15: ffff8802449dffff
 > > > FS:  00007fcfcad2c740(0000) GS:ffff880244800000(0000) knlGS:0000000000000000
 > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 > > > CR2: 00007fcfc84cb968 CR3: 00000001de93f000 CR4: 00000000001407f0
 > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 > > > Stack:
 > > >   0000000000000260 0000000000002dba ffffffff81c7e258 ffff88022bd8dcf8
 > > >   ffffffff812b1131 ffff88022c013f90 ffff8801d37174c0 ffff88022bd8dd38
 > > >   ffffffff81c7e2f0 ffff88022bd8dd38 ffff8801e065cec8 ffff880241d86ca8
 > > > Call Trace:
 > > >   [<ffffffff812b1131>] sysvipc_find_ipc+0x61/0x300
 > > >   [<ffffffff812b1416>] sysvipc_proc_next+0x46/0xd0
 > > >   [<ffffffff811e13b9>] traverse.isra.7+0xc9/0x260
 > > >   [<ffffffff810c4078>] ? lock_release_non_nested+0x308/0x350
 > > >   [<ffffffff811e1b61>] seq_read+0x3e1/0x450
 > > >   [<ffffffff81228f10>] ? proc_reg_write+0x80/0x80
 > > >   [<ffffffff81228f4d>] proc_reg_read+0x3d/0x80
 > > Do you have any details about load?
 > > 
 > > I haven't seen it either - but I don't have a stress test that does
 > > #while true;cat /proc/sysvipc/*>/dev/null;done
 > > in parallel with create/remove/whatever operations.
 > > 
 > > Davidlohr: Have you done any stress tests for the /proc interface?
 > > 
 > 
 > I don't. I tried reproducing the issue by a trivial multithreaded
 > program were each CPU does parallel reads to the
 > different /proc/sysvipc/ files, but no luck triggering any lockups.
 > 
 > Dave, are you able to reproduce this on demand? Any chance of bisecting
 > this?

I hit it 2-3 times last week. Tomorrow I'll try more focussed testing.
Hopefully I can come up with a trinity invocation that will repro it quickly.

I'm guessing this is just /proc/sysipc/*  right ?

	Dave


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: soft lockup in sysvipc code.
  2013-09-04 21:50 soft lockup in sysvipc code Dave Jones
  2013-09-06  3:34 ` Lin Ming
  2013-09-07  9:11 ` Manfred Spraul
@ 2013-09-09  2:35 ` Ramkumar Ramachandra
  2 siblings, 0 replies; 6+ messages in thread
From: Ramkumar Ramachandra @ 2013-09-09  2:35 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, manfred

Dave Jones wrote:
> Haven't seen this before.
> Tree based on v3.11-3104-gf357a82
>
> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:25479]

A combination of hard and soft locks is the solution to the problem, ultimately.

> Modules linked in: sctp snd_seq_dummy fuse dlci rfcomm tun bnep hidp ipt_ULOG nfnetlink can_raw can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs snd_hda_codec_realtek libcrc32c snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd soundcore pcspkr usb_debug e1000e ptp pps_core

Changing hardware results in some downtime, but has to be done every
once in a while.

> irq event stamp: 1143030
> hardirqs last  enabled at (1143029): [<ffffffff81715d5c>] restore_args+0x0/0x30
> hardirqs last disabled at (1143030): [<ffffffff8171f9aa>] apic_timer_interrupt+0x6a/0x80
> softirqs last  enabled at (1143028): [<ffffffff8105a938>] __do_softirq+0x198/0x460
> softirqs last disabled at (1143023): [<ffffffff8105ae55>] irq_exit+0x135/0x150
> CPU: 0 PID: 25479 Comm: trinity-child0 Not tainted 3.11.0+ #44

Don't taint in a hurry; that'll result in breakages.

> task: ffff88022c013f90 ti: ffff88022bd8c000 task.ti: ffff88022bd8c000

Error: ti is too extreme.

> RIP: 0010:[<ffffffff8131014b>]  [<ffffffff8131014b>] idr_find_slowpath+0x9b/0x150
> RSP: 0018:ffff88022bd8dc88  EFLAGS: 00000206
> RAX: 0000000000000006 RBX: ffff000a6c0a0000 RCX: 0000000000000008
> RDX: 0000000000000008 RSI: ffffffff81c41040 RDI: ffff88022c014668
> RBP: ffff88022bd8dca0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffff88023831a290
> R13: 0000000000000001 R14: ffff88022bd8dbe8 R15: ffff8802449dffff
> FS:  00007fcfcad2c740(0000) GS:ffff880244800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fcfc84cb968 CR3: 00000001de93f000 CR4: 00000000001407f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

That's the lifecycle of the process.

> Stack:
>  0000000000000260 0000000000002dba ffffffff81c7e258 ffff88022bd8dcf8
>  ffffffff812b1131 ffff88022c013f90 ffff8801d37174c0 ffff88022bd8dd38
>  ffffffff81c7e2f0 ffff88022bd8dd38 ffff8801e065cec8 ffff880241d86ca8

In the beginning, there was quite a bit of downtime: lately, it's
becoming highly consistent.

> Call Trace:
>  [<ffffffff812b1131>] sysvipc_find_ipc+0x61/0x300
>  [<ffffffff812b1416>] sysvipc_proc_next+0x46/0xd0
>  [<ffffffff811e13b9>] traverse.isra.7+0xc9/0x260
>  [<ffffffff810c4078>] ? lock_release_non_nested+0x308/0x350
>  [<ffffffff811e1b61>] seq_read+0x3e1/0x450
>  [<ffffffff81228f10>] ? proc_reg_write+0x80/0x80
>  [<ffffffff81228f4d>] proc_reg_read+0x3d/0x80
>  [<ffffffff811ba3a3>] do_loop_readv_writev+0x63/0x90
>  [<ffffffff811bbccd>] do_readv_writev+0x21d/0x240
>  [<ffffffff810994af>] ? local_clock+0x3f/0x50
>  [<ffffffff81152376>] ? context_tracking_user_exit+0x46/0x1a0
>  [<ffffffff811bbd25>] vfs_readv+0x35/0x60
>  [<ffffffff811bbff2>] SyS_preadv+0xa2/0xd0
>  [<ffffffff8171ef54>] tracesys+0xdd/0xe2
> Code: 7e 6e 41 8b 84 24 2c 08 00 00 83 eb 08 c1 e0 03 39 c3 0f 85 c1 00 00 00 89 d9 44 89 e8 d3 f8 0f b6 c0 48 83 c0 04 4d 8b 64 c4 08 <e8> 80 b4 d6 ff 85 c0 74 c4 80 3d f7 2f 9d 00 00 75 bb e8 6e b4

Sequential read after lock was released. vfs came before local clock
and context tracking, so there seems to be an ordering issue.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-09-09  2:35 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-04 21:50 soft lockup in sysvipc code Dave Jones
2013-09-06  3:34 ` Lin Ming
2013-09-07  9:11 ` Manfred Spraul
2013-09-09  1:20   ` Davidlohr Bueso
2013-09-09  1:51     ` Dave Jones
2013-09-09  2:35 ` Ramkumar Ramachandra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox