From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1762598AbZATUBY (ORCPT );
	Tue, 20 Jan 2009 15:01:24 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1754589AbZATUBN (ORCPT );
	Tue, 20 Jan 2009 15:01:13 -0500
Received: from smtp-out.google.com ([216.239.33.17]:24900 "EHLO
	smtp-out.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753464AbZATUBM (ORCPT );
	Tue, 20 Jan 2009 15:01:12 -0500
DomainKey-Signature: a=rsa-sha1; s=beta; d=google.com; c=nofws; q=dns;
	h=message-id:date:from:user-agent:mime-version:to:cc:subject:
	references:in-reply-to:content-type:
	content-transfer-encoding:x-gmailtapped-by:x-gmailtapped;
	b=bSrUbEV5S4x9OC4pNdyyS01OZ8+QWWyMpTcOb3EIbnnonOYQgjFcK2xlm6V5ncVVT
	UI23kl/yPfS355W3HsMMw==
Message-ID: <49762D5D.6050705@google.com>
Date: Tue, 20 Jan 2009 12:00:29 -0800
From: Mike Waychison
User-Agent: Thunderbird 2.0.0.17 (X11/20080925)
MIME-Version: 1.0
To: Eric Dumazet
CC: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton
Subject: Re: [PATCH v1 0/8] Deferred dput() and iput() -- reducing lock contention
References: <20090117022936.20425.43248.stgit@crlf.corp.google.com> <49718304.90404@cosmosbay.com>
In-Reply-To: <49718304.90404@cosmosbay.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-GMailtapped-By: 172.28.16.144
X-GMailtapped: mikew
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Eric Dumazet wrote:
> Mike Waychison wrote:
>> We've noticed that at times it can become very easy to have a system begin to
>> livelock on dcache_lock/inode_lock (specifically in atomic_dec_and_lock()) when
>> a lot of dentries are getting finalized at the same time (massive delete and
>> large fdtable destructions are two paths I've seen cause problems).
>>
>> This patchset is an attempt to try and reduce the locking overheads associated
>> with final dput() and final iput(). This is done by batching dentries and
>> inodes into per-process queues and processing them in 'parallel' to consolidate
>> some of the locking.
>>
>> Besides various workload testing, I threw together a load (at the end of this
>> email) that causes massive fdtables (50K sockets by default) to get destroyed
>> on each cpu in the system. It also populates the dcache for procfs on those
>> tasks for good measure. Comparing lock_stat results (hardware is a Sun x4600
>> M2 populated with 8 4-core 2.3GHz packages (32 CPUs) + 128GiB RAM):
>>
>
> Hello Mike
>
> Seems quite a large/intrusive infrastructure for a well known problem.
> I even wasted some time on it.
> But it seems nobody cared too much or people were too busy.
>
> https://kerneltrap.org/mailarchive/linux-netdev/2008/12/11/4397594
>
> (patch 6 should be discarded as followups show it was wrong
> [PATCH 6/7] fs: struct file move from call_rcu() to SLAB_DESTROY_BY_RCU)
>
>
> sockets / pipes don't need dcache_lock or inode_lock at all, I am
> sure Google machines also use sockets :)

Yup :) I'll try to take a look at your patches this week. At a
minimum, the removal of the locks seems highly desirable.

>
> Your test/bench program is quite biased (populating dcache for procfs, using
> 50k filedesc on 32 cpu, not very realistic IMHO).

Yup, extremely biased. It was meant to hurt the dput/iput path
specifically, and I used it as a way to compare apples to apples
with/without the changes. It is still representative of a real-world
workload we see though (our frontend servers, when they are restarted,
have many TCP sockets, easily more than 50K each).

>
> I had a workload with processes using 1,000,000 file descriptors,
> (mainly sockets) and got some latency problems when they had to exit().
> This problem was addressed by one cond_resched() added in close_files()
> (commit 944be0b224724fcbf63c3a3fe3a5478c325a6547)
>

Yup. We pulled that change into our tree a while back for the same
reason. It doesn't help the lock contention issue though.
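
For readers following the thread, the batching idea in the quoted cover letter (queue final releases per task, then take the shared lock once per batch) can be illustrated with a small user-space sketch. This is not the patchset's code and not kernel code: `CountingLock`, `Cache`, `DeferredQueue`, `compare`, and the batch limit of 64 are all hypothetical names and values chosen for illustration. It only counts how often a single shared lock is taken in each scheme.

```python
import threading
from collections import deque


class CountingLock:
    """A mutex that records how many times it was acquired."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1  # updated while the lock is held
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


class Cache:
    """Hypothetical lock-protected cache (a stand-in, not the dcache)."""

    def __init__(self, names):
        self.lock = CountingLock()
        self.entries = set(names)

    def release_now(self, name):
        # Immediate final release: one lock round trip per object.
        with self.lock:
            self.entries.discard(name)

    def release_batch(self, names):
        # Deferred final release: one lock round trip for the whole batch.
        with self.lock:
            for name in names:
                self.entries.discard(name)


class DeferredQueue:
    """Hypothetical per-task queue that flushes once it holds `limit` items."""

    def __init__(self, cache, limit):
        self.cache = cache
        self.limit = limit
        self.pending = deque()

    def defer(self, name):
        self.pending.append(name)
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        if self.pending:
            self.cache.release_batch(list(self.pending))
            self.pending.clear()


def compare(n_objects=50_000, limit=64):
    """Return (eager, batched) lock acquisition counts for n_objects releases."""
    names = [f"fd{i}" for i in range(n_objects)]

    eager = Cache(names)
    for name in names:
        eager.release_now(name)

    batched = Cache(names)
    queue = DeferredQueue(batched, limit)
    for name in names:
        queue.defer(name)
    queue.flush()  # drain the partial final batch

    return eager.lock.acquisitions, batched.lock.acquisitions
```

With the defaults, `compare()` returns `(50000, 782)`: the eager path takes the lock once per object, while the batched path takes it once per 64 objects plus once for the remainder. The sketch says nothing about real contention or latency, which depend on how many CPUs are competing for the lock and how long each hold lasts.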