From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Andrew Morton <akpm@osdl.org>
Cc: Dave Olson <olson@unixfolk.com>,
	mingo@elte.hu, ccb@acm.org, linux-kernel@vger.kernel.org
Subject: Re: [patch] increase spinlock-debug looping timeouts (write_lock and NMI)
Date: Tue, 20 Jun 2006 17:37:32 +1000
Message-ID: <4497A5BC.4070005@yahoo.com.au>
In-Reply-To: <20060619233947.94f7e644.akpm@osdl.org>

Andrew Morton wrote:
> On Mon, 19 Jun 2006 22:35:46 -0700 (PDT)
> Dave Olson <olson@unixfolk.com> wrote:
> 
> 
>>| 
>>| I get that impression ;) If it takes 1-2 seconds to get this lock then it
>>| can take five seconds.  a) that's just gross and b) the NMI watchdog will
>>| nuke the box.
>>| 
>>| Why is it taking so long to get the lock?
>>| 
>>| Does it happen in non-debug mode?
>>| 
>>| What do we do about it?
>>
>>It seems possible that this might be the cause of problems we've had
>>with our InfiniPath hardware/software, and also Mellanox/OpenIB hardware/software
>>on some quad-socket/dual core opteron systems (8 cpu cores).
>>
>>We'll see very long delays when 8 MPI processes exit "simultaneously", and sometimes
>>get NMI, sometimes system hangs, and sometimes just hung up for many seconds (and
>>often in that state, doing sysrq-P or sysrq-T will make things happy again).
>>
> 
> 
> OK.  I assume these processes have done a mmap(MAP_SHARED) of a lot of
> memory?
> 
> 
>>A typical trace looks like this (on an fc4 2.6.16 kernel):
> 
> 
> fc4?  You seem to have an RH-FCx which doesn't enable
> CONFIG_DEBUG_SPINLOCK.  Or maybe we didn't have all that debug code in
> 2.6.16.  Doesn't matter, really.
> 
> 
>>[root@quad-00 ~]# NMI Watchdog detected LOCKUP on CPU 0
>>CPU 0                                                  
>>Modules linked in: nfs nfsd exportfs lockd nfs_acl ipv6 autofs4 sunrpc ib_sdp(U)
>>ib_cm(U) ib_umad(U) ib_uverbs(U) ib_ipoib(U) ib_sa(U) ib_ipath(U) ib_mad(U)
>>ib_core(U) video button battery ac i2c_nforce2 i2c_core ipath_core(U) e1000
>>floppy sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod sata_nv libata aic79xx
>>scsi_transport_spi sd_mod scsi_mod
>>Pid: 4239, comm: mpi_multibw Not tainted 2.6.16-1.2096_FC4.rootsmp #1
>>RIP: 0010:[<ffffffff80213a30>] <ffffffff80213a30>{_raw_write_lock+161}
>>RSP: 0018:ffff810078e07c18  EFLAGS: 00000086                          
>>RAX: 000000008f100300 RBX: ffff81007b7bea58 RCX: 00000000002dc5a0
>>RDX: 0000000000927efd RSI: 0000000000000001 RDI: ffff81007b7bea58
>>RBP: ffff81007b7bea40 R08: ffff810002e3ae80 R09: 00000000fffffffa
>>R10: 0000000000000003 R11: ffffffff801644e2 R12: ffff81007b7bea58
>>R13: 00002aaaad800000 R14: ffff810002e3aec0 R15: 00002aaabba6f000
>>FS:  0000000040a00960(0000) GS:ffffffff80514000(0000) knlGS:00000000f7fc86c0
>>CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b                           
>>CR2: 00000033f38bdaf0 CR3: 0000000000101000 CR4: 00000000000006e0
>>Process mpi_multibw (pid: 4239, threadinfo ffff810078e06000, task ffff810079d8a040)
>>Stack: ffff810002e3aec0 ffffffff8016452b 0000000078ebb067 00002aaaad757000 
>>       ffff810078dccab8 ffffffff8016b840 0000000000000000 ffff810078e07d38 
>>       ffffffffffffffff 0000000000000000                                   
>>Call Trace: <ffffffff8016452b>{__set_page_dirty_nobuffers+73}
>>       <ffffffff8016b840>{unmap_vmas+1042} <ffffffff8016e638>{exit_mmap+124}
>>       <ffffffff80132b07>{mmput+37} <ffffffff80138373>{do_exit+584}         
>>       <ffffffff801416dc>{__dequeue_signal+459} <ffffffff80138af0>{sys_exit_group+0}
>>       <ffffffff80142af3>{get_signal_to_deliver+1568} <ffffffff8010a14a>{do_signal+116}
>>       <ffffffff80195dc1>{__pollwait+0} <ffffffff80196b0c>{sys_select+934}
>>       <ffffffff8010aa87>{sysret_signal+28} <ffffffff8010ad73>{ptregscall_common+103}
>>     
>>Code: 84 c0 75 7f f0 81 03 00 00 00 01 f3 90 48 83 c1 01 48 8b 15 
>>Kernel panic - not syncing: nmi watchdog 

Any ideas what it might be waiting on?


> 
> 
> blam, dead box, that's the one, thanks.
> 
> With our current rwlock semantics I don't know if this is fixable. 
> Probably we need to go back to a spinlock on tree_lock.

Lockless pagecache makes most of the readside locks go away, so I have
converted tree_lock back to a spinlock in my tree. I've just started
working on it again with a view to submitting it (or at least the
RCU radix tree, to start with)... been having fun with a userspace RCU
for rtth ;)

Otherwise, a straight rwlock->spinlock conversion will have a few more
scalability issues, but I'd guess it wouldn't be a problem at all for
most workloads on most systems.
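
To illustrate why the rwlock semantics make this hard to fix, here is a
minimal userspace sketch, assuming a biased-counter rwlock of the kind
x86 used at the time, as far as I understand it: readers take one unit,
a writer must take the whole bias. All sketch_* names and the driver
function are hypothetical, and the "concurrency" is simulated in a
single thread so the outcome is deterministic. It is not the kernel
implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Counter value of an unlocked lock (assumed to mirror the kernel's bias). */
#define RW_LOCK_BIAS 0x01000000

/* Hypothetical userspace stand-in for an rwlock; not the kernel type. */
typedef struct { atomic_int count; } sketch_rwlock_t;

static void sketch_rwlock_init(sketch_rwlock_t *l)
{
    atomic_init(&l->count, RW_LOCK_BIAS);
}

/* Reader takes one unit; fails only while a writer holds the whole bias. */
static bool sketch_read_trylock(sketch_rwlock_t *l)
{
    if (atomic_fetch_sub(&l->count, 1) - 1 >= 0)
        return true;
    atomic_fetch_add(&l->count, 1);   /* back out */
    return false;
}

static void sketch_read_unlock(sketch_rwlock_t *l)
{
    atomic_fetch_add(&l->count, 1);
}

/* Writer takes the whole bias; succeeds only if nobody else holds the lock. */
static bool sketch_write_trylock(sketch_rwlock_t *l)
{
    if (atomic_fetch_sub(&l->count, RW_LOCK_BIAS) - RW_LOCK_BIAS == 0)
        return true;
    atomic_fetch_add(&l->count, RW_LOCK_BIAS);   /* back out and retry later */
    return false;
}

static void sketch_write_unlock(sketch_rwlock_t *l)
{
    atomic_fetch_add(&l->count, RW_LOCK_BIAS);
}

/*
 * Simulate `rounds` writer attempts while readers hand the lock to each
 * other with overlap, so the reader count never reaches zero.
 * Returns how many writer attempts succeeded.
 */
static int writer_successes_with_overlapping_readers(int rounds)
{
    sketch_rwlock_t l;
    int wins = 0;

    sketch_rwlock_init(&l);
    sketch_read_trylock(&l);              /* first reader holds the lock */
    for (int i = 0; i < rounds; i++) {
        sketch_read_trylock(&l);          /* next reader arrives first... */
        if (sketch_write_trylock(&l)) {   /* ...so the writer always loses */
            wins++;
            sketch_write_unlock(&l);
        }
        sketch_read_unlock(&l);           /* previous reader leaves */
    }
    sketch_read_unlock(&l);               /* last reader leaves */
    return wins;
}
```

Calling writer_successes_with_overlapping_readers() with any round count
returns 0 in this model: nothing gives the writer priority, so as long as
read-side holders overlap it can spin indefinitely. A plain spinlock has
no such shared mode, which is the trade-off against read-side scalability
mentioned above.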

-- 
SUSE Labs, Novell Inc.