From mboxrd@z Thu Jan 1 00:00:00 1970
From: wangdi
Subject: Re: [LSF/MM TOPIC] Parallelize file operation (like creation, unlink) under large shared directory
Date: Sun, 22 Jan 2012 13:17:18 -0800
Message-ID: <4F1C7CDE.4010000@whamcloud.com>
References: <4F1B2612.7020001@whamcloud.com> <4F1C1556.7010908@panasas.com> <4F1C7223.3040301@whamcloud.com> <20120122203936.GE23916@ZenIV.linux.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Boaz Harrosh , lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, Jinshan Xiong
To: Al Viro
Return-path:
Received: from mail-iy0-f174.google.com ([209.85.210.174]:34933 "EHLO mail-iy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752409Ab2AVVRV (ORCPT ); Sun, 22 Jan 2012 16:17:21 -0500
Received: by iacb35 with SMTP id b35so2697196iac.19 for ; Sun, 22 Jan 2012 13:17:20 -0800 (PST)
In-Reply-To: <20120122203936.GE23916@ZenIV.linux.org.uk>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On 01/22/2012 12:39 PM, Al Viro wrote:
> On Sun, Jan 22, 2012 at 12:31:31PM -0800, wangdi wrote:
>
>> We actually already implemented this for ext4, and we saw a lot performance improvement(at least 30% improvements for open/create in a single directory)for lustre stack,
>> but we want to make this improvement accessible through the VFS. Probably XFS and Btrfs could also benefit from this.
> You do realize that i_mutex locking is relied upon for protection of a lot
> of stuff besides the obvious (i.e. on-disk directory contents)?
> I'm not saying that it's hopeless, but it's highly non-trivial; the things
> like rmdir/mount races, access to ->d_parent/->d_name in a lot of code,
> etc. need to be taken care of and it is a _lot_ of code review to deal
> with - just to verify the correctness of such changes.

Yes, I agree this is a non-trivial change.
What I mean is that the i_mutex lock may be too coarse in some cases, since it simply serializes everything. So it might be useful to refine this lock a bit. For example, we could define the lock with several modes (read, write, concurrent read, concurrent write, exclusive, etc.), and each code path could take the lock in the mode it requires, which might give us some concurrency.

Thanks
WangDi
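
To make the multi-mode idea a bit more concrete, here is a minimal userspace sketch of a mode-compatibility table. Everything in it is an assumption for illustration: the dir_lock_* names are hypothetical, and the compatibility rules follow the classic DLM-style convention (concurrent read, concurrent write, protected read, protected write, exclusive), which the message does not spell out. It is not code from the ext4 or Lustre patches, and it does not address the rmdir/mount or ->d_parent/->d_name concerns raised above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock modes, named after the list in the message. */
enum dir_lock_mode {
    DIR_LOCK_CR,        /* concurrent read */
    DIR_LOCK_CW,        /* concurrent write */
    DIR_LOCK_PR,        /* read (protected read) */
    DIR_LOCK_PW,        /* write (protected write) */
    DIR_LOCK_EX,        /* exclusive */
    DIR_LOCK_NR_MODES
};

/*
 * Assumed DLM-style compatibility table: entry [held][wanted] is true
 * when a holder in mode `held` can coexist with a new holder in mode
 * `wanted`. The table is symmetric.
 */
static const bool dir_lock_compat[DIR_LOCK_NR_MODES][DIR_LOCK_NR_MODES] = {
    /*           CR     CW     PR     PW     EX   */
    /* CR */ { true,  true,  true,  true,  false },
    /* CW */ { true,  true,  false, false, false },
    /* PR */ { true,  false, true,  false, false },
    /* PW */ { true,  false, false, false, false },
    /* EX */ { false, false, false, false, false },
};

/* Return true if the two modes may be held at the same time. */
static bool dir_lock_modes_compatible(enum dir_lock_mode held,
                                      enum dir_lock_mode wanted)
{
    assert((unsigned int)held < DIR_LOCK_NR_MODES);
    assert((unsigned int)wanted < DIR_LOCK_NR_MODES);
    return dir_lock_compat[held][wanted];
}

/*
 * Return true if `wanted` can be granted given the number of current
 * holders in each mode. Only the admission check is shown; waiting,
 * wakeups and fairness are left out.
 */
static bool dir_lock_can_grant(const unsigned int holders[DIR_LOCK_NR_MODES],
                               enum dir_lock_mode wanted)
{
    for (int m = 0; m < DIR_LOCK_NR_MODES; m++) {
        if (holders[m] > 0 && !dir_lock_compat[m][wanted])
            return false;
    }
    return true;
}
```

Under these assumed rules, two creators could both hold the directory in concurrent-write mode and then serialize only on whatever finer-grained lock covers the block they modify, while an operation such as rmdir would ask for exclusive mode and be refused until all other holders are gone.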