From mboxrd@z Thu Jan  1 00:00:00 1970
From: Christoph Hellwig <hch@lst.de>
Subject: Re: [Joel.Becker@oracle.com: Re: [Linux-cluster] Re: [PATCH 1/3] dlm: use configfs]
Date: Thu, 25 Aug 2005 22:23:01 +0200
Message-ID: <20050825202301.GA15195@lst.de>
References: <20050822213220.GH19387@insight.us.oracle.com> <20050822144521.24494329.akpm@osdl.org> <20050822215049.GI19387@insight.us.oracle.com> <20050822150505.7978136d.akpm@osdl.org> <20050824071835.GA10235@lst.de> <20050824203352.GB30246@insight.us.oracle.com> <20050825095819.GA4785@lst.de> <430E11BA.4030603@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Christoph Hellwig <hch@lst.de>,
	Joel Becker <Joel.Becker@oracle.com>,
	Andrew Morton <akpm@osdl.org>, mark.fasheh@oracle.com,
	linux-fsdevel@vger.kernel.org
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from verein.lst.de ([213.95.11.210]:42701 "EHLO mail.lst.de")
	by vger.kernel.org with ESMTP id S932549AbVHYUXK (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Thu, 25 Aug 2005 16:23:10 -0400
To: Zach Brown <zach.brown@oracle.com>
Content-Disposition: inline
In-Reply-To: <430E11BA.4030603@oracle.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Thu, Aug 25, 2005 at 11:45:14AM -0700, Zach Brown wrote:
> Yeah, we aim to simplify this code.  For the record, it wasn't buffered
> aio that was the problem.  There were two naughty moving parts:
> 
> First, trying not to block in the dlm when issuing aio ops and tracking
> state to restart after dlm ops returned eiocbqueued.  This was just
> overly aggressive.  This can behave like block mapping lookups in that
> it rarely blocks.  Most aio that people care about (direct io writes to
> already allocated regions) will simply be acquiring and releasing
> shared-read locks around each op -- trivial local operations.
> 
> Second, trying to hold dlm locks around the entirety of aio ops. This
> led to the mess of trying to tear down locks in the iocb dtor method.
> (which can then race with unmount, aio does __fput on the filp, dropping
> the vfsmount ref, before calling dtor.. bleh). We can get around this
> by unlocking after performing the block mapping lookups and issuing the
> io and introducing a cluster DLM lock which behaves like i_alloc_sem.

You might want to look at XFS as a model for this.  While it's not
clustered it has its own r/w semaphore to protect block allocations.
It doesn't use i_alloc_sem at all; instead it relies on some 'clever'
behaviour, downgrading the lock after the block allocations are done.

> So, how about a patch that lets the fs provide a callback to
> acquire/release i_alloc_sem at the current sites (dio, notify_change)
> that work with it? Most file systems wouldn't provide a callback and
> the code would just use the sem as usual, but clustered guys could use
> dlm locking.

If we're going down that route I'd say provide the callback only for
filesystems that actually need locking, but there must be a better
way to do that.
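
For illustration only, here is a userspace sketch of the optional-callback
idea discussed above. It is not kernel code: struct toy_inode,
toy_alloc_lock_ops and every function name are hypothetical, and plain
counters stand in for i_alloc_sem and for a shared DLM lock. It only shows
the dispatch: use the filesystem's hook if one is set, otherwise fall back
to the local semaphore.

```c
#include <stddef.h>

struct toy_inode;  /* hypothetical stand-in, not the kernel's struct inode */

/* Hypothetical optional hooks; a filesystem that needs no special
 * locking simply leaves the ops pointer NULL. */
struct toy_alloc_lock_ops {
	void (*lock_shared)(struct toy_inode *inode);
	void (*unlock_shared)(struct toy_inode *inode);
};

struct toy_inode {
	int local_readers;    /* stands in for i_alloc_sem held shared */
	int cluster_readers;  /* stands in for a shared DLM lock */
	const struct toy_alloc_lock_ops *ops;
};

/* Call sites (dio, notify_change in the discussion) would use these
 * wrappers instead of touching the semaphore directly. */
static void toy_alloc_lock_shared(struct toy_inode *inode)
{
	if (inode->ops && inode->ops->lock_shared)
		inode->ops->lock_shared(inode);
	else
		inode->local_readers++;   /* default: the usual local sem */
}

static void toy_alloc_unlock_shared(struct toy_inode *inode)
{
	if (inode->ops && inode->ops->unlock_shared)
		inode->ops->unlock_shared(inode);
	else
		inode->local_readers--;
}

/* What a clustered filesystem might plug in (assumed, demo only). */
static void toy_dlm_lock_shared(struct toy_inode *inode)
{
	inode->cluster_readers++;
}

static void toy_dlm_unlock_shared(struct toy_inode *inode)
{
	inode->cluster_readers--;
}

static const struct toy_alloc_lock_ops toy_cluster_ops = {
	.lock_shared   = toy_dlm_lock_shared,
	.unlock_shared = toy_dlm_unlock_shared,
};
```

An inode with ops == NULL takes the local path; one pointing at
toy_cluster_ops takes the cluster path and never touches the local counter.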

Note that in any case you're doing lots of work for the buffered path
as well in aio.c that shouldn't be necessary with a bit of refactoring.