From: Zach Brown
To: Christoph Hellwig
Cc: Joel Becker, Andrew Morton, mark.fasheh@oracle.com, linux-fsdevel@vger.kernel.org
Subject: Re: [Joel.Becker@oracle.com: Re: [Linux-cluster] Re: [PATCH 1/3] dlm: use configfs]
Date: Thu, 25 Aug 2005 11:45:14 -0700
Message-ID: <430E11BA.4030603@oracle.com>
In-Reply-To: <20050825095819.GA4785@lst.de>

> Currently we don't support buffered aio on any filesystem in
> mainline, so adding crufty code to mainline sounds like a bad idea.
> Zab agreed on that and wants to remove it as much as it gets.

Yeah, we aim to simplify this code. For the record, it wasn't buffered
aio that was the problem. There were two naughty moving parts:

First, trying not to block in the dlm when issuing aio ops, and tracking
state to restart after dlm ops returned EIOCBQUEUED. This was just
overly aggressive. Taking dlm locks can be treated like block mapping
lookups, in that it rarely blocks. Most aio that people care about
(direct io writes to already allocated regions) will simply be acquiring
and releasing shared-read locks around each op -- trivial local
operations.

Second, trying to hold dlm locks around the entirety of aio ops. This
led to the mess of trying to tear down locks in the iocb dtor method
(which can then race with unmount: aio does __fput on the filp, dropping
the vfsmount ref, before calling the dtor.. bleh). We can get around
this by unlocking after performing the block mapping lookups and issuing
the io, and by introducing a cluster DLM lock which behaves like
i_alloc_sem.

So, how about a patch that lets the fs provide a callback to
acquire/release i_alloc_sem at the current sites (dio, notify_change)
that work with it? Most file systems wouldn't provide a callback and
the code would just use the sem as usual, but clustered guys could use
dlm locking.

- z
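
To make the callback idea a bit more concrete, here is a rough userspace
sketch in C. It is not a patch. Every name in it (alloc_lock_ops,
sketch_inode, alloc_lock, alloc_unlock, cluster_ops) is made up for
illustration, and plain counters stand in for both i_alloc_sem and the
cluster lock. It also assumes dio takes the sem shared and notify_change
takes it exclusive.

```c
/*
 * Userspace sketch only. Every name here is hypothetical; nothing
 * below is a real kernel interface.
 */
#include <assert.h>
#include <stddef.h>

struct sketch_inode;

/* Hypothetical hooks a clustered fs could supply in place of i_alloc_sem. */
struct alloc_lock_ops {
	void (*lock)(struct sketch_inode *inode, int exclusive);
	void (*unlock)(struct sketch_inode *inode, int exclusive);
};

struct sketch_inode {
	int sem_readers;	/* models i_alloc_sem held shared */
	int sem_writer;		/* models i_alloc_sem held exclusive */
	int dlm_readers;	/* models a cluster lock held shared */
	int dlm_writer;		/* models a cluster lock held exclusive */
	const struct alloc_lock_ops *ops;	/* NULL for most filesystems */
};

/* Would be called at the sites that take the sem today. */
static void alloc_lock(struct sketch_inode *inode, int exclusive)
{
	assert(inode != NULL);
	if (inode->ops) {
		/* Clustered fs: defer to its own locking. */
		inode->ops->lock(inode, exclusive);
		return;
	}
	/* Default: bookkeeping that models down_read()/down_write(). */
	if (exclusive)
		inode->sem_writer++;
	else
		inode->sem_readers++;
}

static void alloc_unlock(struct sketch_inode *inode, int exclusive)
{
	assert(inode != NULL);
	if (inode->ops) {
		inode->ops->unlock(inode, exclusive);
		return;
	}
	/* Default: bookkeeping that models up_read()/up_write(). */
	if (exclusive)
		inode->sem_writer--;
	else
		inode->sem_readers--;
}

/* Example hooks for a pretend clustered fs; they only count holders. */
static void cluster_lock(struct sketch_inode *inode, int exclusive)
{
	if (exclusive)
		inode->dlm_writer++;
	else
		inode->dlm_readers++;
}

static void cluster_unlock(struct sketch_inode *inode, int exclusive)
{
	if (exclusive)
		inode->dlm_writer--;
	else
		inode->dlm_readers--;
}

static const struct alloc_lock_ops cluster_ops = {
	.lock = cluster_lock,
	.unlock = cluster_unlock,
};
```

A file system that leaves ops NULL sees no change in behaviour, and the
callers never need to know which kind of lock they ended up taking.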