From: Nikolaus Rath
To: fuse-devel@lists.sourceforge.net, linux-fsdevel
Subject: Re: [fuse-devel] [fuse] interaction between O_APPEND and writeback cache
References: <87mv7fup1n.fsf@vostro.rath.org>
Date: Sat, 05 Aug 2017 22:36:44 +0200
In-Reply-To: (Miklos Szeredi's message of "Fri, 4 Aug 2017 21:59:06 +0200")
Message-ID: <87wp6h4uqr.fsf@vostro.rath.org>

On Aug 04 2017, Miklos Szeredi wrote:
> On Fri, Aug 4, 2017 at 9:10 PM, Nikolaus Rath wrote:
>> Hello,
>>
>> I am confused about how O_APPEND is supposed to interact with the
>> writeback cache.
>>
>> As far as I can tell, the O_APPEND flag is currently passed to the
>> filesystem process, so my expectation is that the filesystem process is
>> responsible for ignoring any offset in write requests and instead write
>> at the current end of the file[1].
>>
>> However, with writeback cache enabled the filesystem process cannot tell
>> which data is "new" and came from userspace, should be appended, and
>> which data is old and just made a round-trip to the kernel. So it seems
>> to me that the filesystem process should probably leave the handling of
>> O_APPEND to the kernel. But then, shouldn't the kernel filter out this
>> flag when sending the open request?
>
> Indeed, when writing back the cache the kernel should definitely not
> set O_APPEND.

Well, 4.9 certainly does it though. Should I try to make a patch, or are
you or Maxim going to do that shortly anyway?

Do you think it makes sense to filter out O_APPEND in libfuse as well
(to work around the issue for present day kernels)?
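
A minimal sketch of what such filtering could look like, written as plain
C rather than against the real libfuse API. The helper name
`effective_open_flags` and the `writeback_cache` parameter are
assumptions for illustration; the parameter stands in for however the
library or filesystem knows that writeback caching is in effect.

```c
#include <fcntl.h>    /* O_APPEND */
#include <stdbool.h>

/*
 * Hypothetical helper, not part of libfuse.
 *
 * Returns the open flags the filesystem process should act on. With
 * writeback caching, the offsets in write requests are chosen by the
 * kernel, so O_APPEND is masked out and the filesystem honours those
 * offsets instead of appending at end of file.
 */
static int effective_open_flags(int flags, bool writeback_cache)
{
    if (writeback_cache)
        return flags & ~O_APPEND;
    return flags;
}
```

Without writeback caching the flags pass through unchanged, so the
filesystem keeps full responsibility for O_APPEND in that case.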

>> On the other hand, when the kernel handles O_APPEND, then it is no
>> longer atomic (think of a network fuse filesystem).
>
> Yes, network filesystem generally needs to handle consistency of
> caches across nodes and O_APPEND is no exception (i.e. you cannot have
> two nodes writing O_APPEND to cache at the same time, because that
> will not work).

This poses a bit of a problem though. So a network filesystem either
cannot use writeback caching or O_APPEND will (silently) not work.

With the current behavior (O_APPEND being passed to open() when
writeback is enabled) the filesystem would at least have a chance to
return an error, i.e. instead of a silent failure there would be a noisy
error.

With that in mind, maybe the current behavior isn't so bad? We'd just
have to document that if writeback cache is enabled and O_APPEND is
received, the filesystem has to decide if it is fine with the kernel
handling O_APPEND (and in that case ignore the flag for subsequent
writes) or return an error.

Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«
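
The decision described in the last paragraph of the message (accept
kernel-side O_APPEND and ignore the flag for writes, or fail the open)
could be sketched roughly as follows. This is plain C, not libfuse code:
`struct open_state`, `handle_open_flags` and the `kernel_append_ok`
policy switch are hypothetical names, and the choice of ENOTSUP as the
error is an assumption, since the thread does not name one.

```c
#include <errno.h>
#include <fcntl.h>    /* O_APPEND */
#include <stdbool.h>

/* Hypothetical per-open state kept by the filesystem process. */
struct open_state {
    bool append_at_eof;  /* true: ignore write offsets, write at EOF */
};

/*
 * Hypothetical open-time policy. Returns 0 on success or a negative
 * errno value.
 *
 * kernel_append_ok: the filesystem accepts non-atomic, kernel-side
 * O_APPEND handling (e.g. a purely local filesystem).
 */
static int handle_open_flags(int flags, bool writeback_cache,
                             bool kernel_append_ok, struct open_state *st)
{
    st->append_at_eof = false;

    if (!(flags & O_APPEND))
        return 0;                 /* nothing special to do */

    if (!writeback_cache) {
        st->append_at_eof = true; /* filesystem implements O_APPEND itself */
        return 0;
    }

    if (kernel_append_ok)
        return 0;                 /* kernel picks offsets; flag is ignored */

    return -ENOTSUP;              /* noisy error instead of silent failure */
}
```

A network filesystem that cannot tolerate non-atomic appends would pass
`kernel_append_ok = false`, so the open fails visibly rather than the
append silently misbehaving.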