Subject: Re: [PATCH/RFC 2/14] Reclaim Scalability: convert inode i_mmap_lock to reader/writer lock
From: Lee Schermerhorn
In-Reply-To: <20070920012441.GQ4608@v2.random>
References: <20070914205359.6536.98017.sendpatchset@localhost> <20070914205412.6536.34898.sendpatchset@localhost> <20070920012441.GQ4608@v2.random>
Content-Type: text/plain
Date: Thu, 20 Sep 2007 10:10:48 -0400
Message-Id: <1190297448.5326.8.camel@localhost>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Andrea Arcangeli
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mel@csn.ul.ie, clameter@sgi.com, riel@redhat.com, balbir@linux.vnet.ibm.com, a.p.zijlstra@chello.nl, eric.whitney@hp.com, npiggin@suse.de

On Thu, 2007-09-20 at 03:24 +0200, Andrea Arcangeli wrote:
> On Fri, Sep 14, 2007 at 04:54:12PM -0400, Lee Schermerhorn wrote:
> > Note: This patch is meant to address a situation I've seen
> > running a large Oracle OLTP workload--1000s of users--on a
> > large HP ia64 NUMA platform. The system hung, spitting out
> > "soft lockup" messages on the console. Stack traces showed
> > that all cpus were in page_referenced(), as mentioned above.
> > I let the system run overnight in this state--it never
> > recovered before I decided to reboot.
>
> Just to understand better, was that an oom condition? Can you press
> SYSRQ+M to check the RAM and swap levels? If it's an oom condition the
> problem may be quite different.

Actually, the system never went OOM. It didn't get that far. I was
trying to create an Oracle workload that would put me at the brink of
reclaim and then, by running some app that would eat page cache, push it
over the edge. But I apparently went too far--too many Oracle users for
this system--and it went into reclaim and got hung up with all cpus
spinning on the i_mmap_lock in page_referenced_file().

I just got this system back for testing.
As soon as I build a 23-rc6-mm1 kernel for it, I'll rerun the same
workload to demonstrate the problem. Then I'll try it with the rw_lock
patch to see if that helps.

>
> Still making those spinlocks rw sounds good to me.

Well, except for the concern about the extra overhead of rw_locks. I'm
more worried about this for the i_mmap_lock than for the anon_vma lock.
The only time we need to take the anon_vma lock for write is when adding
a new vma to the list or removing one [vma_link(), et al]. But the
i_mmap_lock is also used to protect the truncate_count, and must be
taken for write there. I expected that a kernel build might show some
overhead, with all the forks for parallel make, mapping of libc, the cc
executable, ... but it showed nothing.

Thanks,
Lee