From: Alex Lyashkov
Subject: [RFC] possible badness in prune_dcache()
Date: Fri, 04 Apr 2008 14:40:00 +0300
Message-ID: <1207309200.12346.25.camel@bear.shadowland>
To: linux-fsdevel@vger.kernel.org
Cc: Andrew Perepechko

Hello list,

While investigating a livelock in RHEL3, I found possible badness in
prune_dcache(), which also exists in 2.6.24.

The situation: the system has ~6 mounted filesystems. 5 of them are doing
I/O, and one starts to umount.

shrink_dcache_parent() collects unused dentries by calling select_parent(),
which puts these dentries at the end of the LRU to be killed. But between
the exit from select_parent() and the entry into prune_dcache(,sb), some
other processes add more unused dentries, and those are put at the end of
the LRU as well.

prune_dcache() then starts skipping dentries in this loop:

	while (skip && tmp != &dentry_unused &&
	    list_entry(tmp, struct dentry, d_lru)->d_sb != sb) {
		skip--;
		tmp = tmp->prev;
	}

but it does not find a dentry belonging to the superblock. The condition
that follows,

	if (tmp == &dentry_unused)
		break;

is not hit, because the LRU holds the additional dentries, so
prune_dcache(,sb) kills a dentry that is not related to the submitted
superblock. (This was the first strange thing for me.)

But that is not all. Because count != 0, prune_dcache() keeps running in
its loop and kills all the dentries not related to the submitted sb, which
can take a long time (in my situation ~15 min for ~200k dentries). After
the exit from prune_dcache(), shrink_dcache_parent() loops and tries again
to destroy the dentries, which also takes some time.

My tests show that changing the condition from

	if (tmp == &dentry_unused)
		break;

to

	if (tmp == &dentry_unused ||
	    (sb && list_entry(tmp, struct dentry, d_lru)->d_sb != sb))
		break;

helps with the livelock, because prune_dcache() exits the loop early and
the required dentries get moved to the end of the LRU, where they are easy
to kill.

Please comment on this investigation.

PS. Please send a copy to me, I am not subscribed to the list.

-- 
Alex Lyashkov
Lustre Group, Sun Microsystems
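
A minimal user-space sketch of the behaviour described above, for
illustration only. It is not kernel code: the LRU is modeled as a plain
array of superblock ids, and toy_lru, toy_lru_init(), toy_prune() and the
early_exit flag are hypothetical names invented here. It assumes, as in the
report, that foreign dentries land at the tail after the target's dentries.

```c
#include <assert.h>
#include <string.h>

#define LRU_MAX 64

/* Toy LRU: sb[0] is the head, sb[len-1] is the tail (pruned first).
 * Each slot holds only the id of the superblock owning that dentry. */
struct toy_lru {
    int sb[LRU_MAX];
    int len;
};

static void toy_lru_init(struct toy_lru *l, const int *sbs, int n)
{
    assert(n >= 0 && n <= LRU_MAX);
    memcpy(l->sb, sbs, (size_t)n * sizeof(int));
    l->len = n;
}

static void toy_lru_remove(struct toy_lru *l, int idx)
{
    assert(idx >= 0 && idx < l->len);
    memmove(&l->sb[idx], &l->sb[idx + 1],
            (size_t)(l->len - idx - 1) * sizeof(int));
    l->len--;
}

/* Model of the prune loop for a given target superblock.
 * Returns how many freed dentries did NOT belong to target_sb.
 * early_exit != 0 enables the extra break condition proposed above. */
static int toy_prune(struct toy_lru *l, int count, int target_sb,
                     int early_exit)
{
    int foreign_killed = 0;

    for (; count; count--) {
        int tmp = l->len - 1;   /* start at the tail */
        int skip = count;

        /* Skip at most `count` foreign entries, walking towards the head. */
        while (skip && tmp >= 0 && l->sb[tmp] != target_sb) {
            skip--;
            tmp--;
        }
        if (tmp < 0)            /* walked off the list */
            break;
        if (early_exit && l->sb[tmp] != target_sb)
            break;              /* proposed check: stop on a foreign dentry */
        if (l->sb[tmp] != target_sb)
            foreign_killed++;
        toy_lru_remove(l, tmp);
    }
    return foreign_killed;
}
```

With the LRU {2,2,1,1,1,2,2,2,2,2,2,2,2} (tail on the right), target_sb = 1
and count = 3, the original check frees three dentries of sb 2 and none of
sb 1. With early_exit set, the loop stops at once and frees nothing.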