From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoph Hellwig
Subject: Re: [Joel.Becker@oracle.com: Re: [Linux-cluster] Re: [PATCH 1/3] dlm: use configfs]
Date: Thu, 25 Aug 2005 22:23:01 +0200
Message-ID: <20050825202301.GA15195@lst.de>
References: <20050822213220.GH19387@insight.us.oracle.com> <20050822144521.24494329.akpm@osdl.org> <20050822215049.GI19387@insight.us.oracle.com> <20050822150505.7978136d.akpm@osdl.org> <20050824071835.GA10235@lst.de> <20050824203352.GB30246@insight.us.oracle.com> <20050825095819.GA4785@lst.de> <430E11BA.4030603@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Christoph Hellwig, Joel Becker, Andrew Morton, mark.fasheh@oracle.com, linux-fsdevel@vger.kernel.org
Return-path:
Received: from verein.lst.de ([213.95.11.210]:42701 "EHLO mail.lst.de") by vger.kernel.org with ESMTP id S932549AbVHYUXK (ORCPT ); Thu, 25 Aug 2005 16:23:10 -0400
To: Zach Brown
Content-Disposition: inline
In-Reply-To: <430E11BA.4030603@oracle.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Thu, Aug 25, 2005 at 11:45:14AM -0700, Zach Brown wrote:
> Yeah, we aim to simplify this code.  For the record, it wasn't buffered
> aio that was the problem.  There were two naughty moving parts:
>
> First, trying not to block in the dlm when issuing aio ops and tracking
> state to restart after dlm ops returned eiocbqueued.  This was just
> overly aggressive.  This can behave like block mapping lookups in that
> it rarely blocks.  Most aio that people care about (direct io writes to
> already allocated regions) will simply be acquiring and releasing
> shared-read locks around each op -- trivial local operations.
>
> Second, trying to hold dlm locks around the entirety of aio ops.  This
> led to the mess of trying to tear down locks in the iocb dtor method.
> (which can then race with unmount, aio does __fput on the filp, dropping
> the vfsmount ref, before calling dtor.. bleh).  We can get around this
> by unlocking after performing the block mapping lookups and issuing the
> io and introducing a cluster DLM lock which behaves like i_alloc_sem.

You might want to look at XFS as a model for this.  While it's not
clustered, it has its own r/w semaphore to protect block allocations.
It doesn't use i_alloc_sem at all, but relies on some 'clever' behaviour
with downgrading the lock after the block allocations are done.

> So, how about a patch that lets the fs provide a callback to
> acquire/release i_alloc_sem at the current sites (dio, notify_change)
> that work with it?  Most file systems wouldn't provide a callback and
> the code would just use the sem as usual, but clustered guys could use
> dlm locking.

If we're going down that route I'd say provide the callback only for
filesystems that actually need locking, but there must be a better way
to do that.  Note that in any case you're doing lots of work for the
buffered path as well in aio.c that should be unnecessary with a bit of
refactoring.
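
For illustration, the "take the lock exclusively for block allocation,
then downgrade it for the I/O" pattern mentioned above for XFS could be
sketched roughly as follows.  This is a userspace toy built on pthreads,
not XFS or kernel code: every demo_* name, the struct layout and the
"blocks" counter are made up for the example, and the real locking rules
are more involved.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for a downgradable r/w semaphore. */
struct demo_rwsem {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    int readers;    /* number of active shared holders */
    int writer;     /* 1 while an exclusive holder exists */
};

/* Hypothetical inode: one allocation lock plus a block count. */
struct demo_inode {
    struct demo_rwsem alloc_lock;
    long blocks;
};

static void demo_inode_init(struct demo_inode *ip)
{
    pthread_mutex_init(&ip->alloc_lock.mtx, NULL);
    pthread_cond_init(&ip->alloc_lock.cond, NULL);
    ip->alloc_lock.readers = 0;
    ip->alloc_lock.writer = 0;
    ip->blocks = 0;
}

/* Take the lock exclusively: wait until nobody else holds it. */
static void demo_down_write(struct demo_rwsem *s)
{
    pthread_mutex_lock(&s->mtx);
    while (s->writer || s->readers > 0)
        pthread_cond_wait(&s->cond, &s->mtx);
    s->writer = 1;
    pthread_mutex_unlock(&s->mtx);
}

/* Atomically turn the exclusive hold into a shared one. */
static void demo_downgrade_write(struct demo_rwsem *s)
{
    pthread_mutex_lock(&s->mtx);
    assert(s->writer == 1);     /* caller must hold it exclusively */
    s->writer = 0;
    s->readers = 1;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->mtx);
}

/* Drop a shared hold; wake waiters when the last reader leaves. */
static void demo_up_read(struct demo_rwsem *s)
{
    pthread_mutex_lock(&s->mtx);
    if (--s->readers == 0)
        pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->mtx);
}

/*
 * Toy write path: "allocate" blocks under the exclusive lock, then
 * downgrade so that the I/O itself only needs the shared lock.
 * Returns the block count as seen under the shared lock.
 */
static long demo_direct_write(struct demo_inode *ip, long new_blocks)
{
    long total;

    demo_down_write(&ip->alloc_lock);
    ip->blocks += new_blocks;           /* allocation step */
    demo_downgrade_write(&ip->alloc_lock);

    total = ip->blocks;                 /* I/O would be issued here */

    demo_up_read(&ip->alloc_lock);
    return total;
}
```

The point of the downgrade is that other readers can proceed as soon as
the allocation is finished, without a window in which the lock is fully
dropped and another writer could slip in between allocation and I/O.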