From mboxrd@z Thu Jan 1 00:00:00 1970
From: "J. Bruce Fields"
Subject: Re: [PATCH 0/2] fix nfsd stable write implementation
Date: Tue, 30 Oct 2012 10:07:25 -0400
Message-ID: <20121030140725.GC24618@fieldses.org>
References: <1351285617-20450-1-git-send-email-bfields@redhat.com> <20121030102833.306e833a@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "J. Bruce Fields", linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, Peter Staubach
To: NeilBrown
Return-path:
Received: from fieldses.org ([174.143.236.118]:32894 "EHLO fieldses.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1759048Ab2J3OH2 (ORCPT ); Tue, 30 Oct 2012 10:07:28 -0400
Content-Disposition: inline
In-Reply-To: <20121030102833.306e833a@notabene.brown>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Tue, Oct 30, 2012 at 10:28:33AM +1100, NeilBrown wrote:
> On Fri, 26 Oct 2012 17:06:55 -0400 "J. Bruce Fields"
> wrote:
>
> > From: "J. Bruce Fields"
> >
> > Peter pointed out to me that the nfs server is implementing stable
> > writes by setting the O_SYNC flag. I can't see why we couldn't write
> > and then sync instead, but I don't know this stuff as well as I should;
> > does the following look reasonable to people?
>
> Bruce changed the code to implement stable writes by calling
> vfs_fsync_range(). I can't see why we couldn't use O_SYNC instead.
>
> It seems like you are making a change just for the sake of making a change.
> Is there some reason that you think a separate 'sync' is more efficient than
> using O_SYNC ?

Oh, sorry, see the changelog on the second patch: the problem is that
the struct file can be shared across multiple writes in the NFSv4 case,
so a single stable write could make all subsequent writes synchronous.
(And some day people would like filehandle caching for v2/v3, in which
case we'll run into the same problem.)

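To make the distinction concrete, here is a rough userspace analogue of
"write, then sync". It is only a sketch under stated assumptions, not
the nfsd patch: write_maybe_stable() is a made-up helper name, and
fdatasync() flushes the whole file, whereas vfs_fsync_range() in the
kernel can be limited to the byte range just written. The point it
illustrates is that the sync is requested per write, so nothing sticky
is left behind on the shared open file.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (userspace analogy only): write len bytes at
 * offset off, and flush to stable storage only if this particular
 * write asked for it. The descriptor's flags are never modified, so a
 * later call with stable == 0 on the same fd stays asynchronous.
 */
static int write_maybe_stable(int fd, const void *buf, size_t len,
                              off_t off, int stable)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    /* pwrite() may write fewer bytes than requested; loop until done. */
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0) {
            errno = EIO;        /* no progress: treat as an error */
            return -1;
        }
        p += n;
        off += n;
        len -= (size_t)n;
    }

    /* The "stable" part: an explicit sync after the data is written. */
    if (stable && fdatasync(fd) < 0)
        return -1;

    return 0;
}
```

Calling it with stable set for one write and clear for the next, on the
same descriptor, syncs only after the first; with O_SYNC set on a shared
open file, both writes would have been synchronous.
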
> As a general principle, I think it is best to give the file system as much
> information as possible to that it can make whatever optimisation decisions
> that it wants to.
>
> Setting O_SYNC gives the filesystem more information than not, because it
> allows it to change the behaviour of the 'write' request... though I don't
> know if any filesystem actually uses the information.

I'm not sure how to figure out if that's a real problem or not.

--b.