From: Trond Myklebust <trondmy@primarydata.com>
To: "neilb@suse.com" <neilb@suse.com>,
"chuck.lever@oracle.com" <chuck.lever@oracle.com>
Cc: "Anna.Schumaker@netapp.com" <Anna.Schumaker@netapp.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH/RFC] NFS: add nostatflush mount option.
Date: Thu, 21 Dec 2017 15:54:51 +0000
Message-ID: <1513871689.11836.3.camel@primarydata.com>
In-Reply-To: <4B4DA4D4-8068-4C10-92BE-F03632522C75@oracle.com>
On Thu, 2017-12-21 at 10:39 -0500, Chuck Lever wrote:
> Hi Neil-
>
>
> > On Dec 20, 2017, at 9:57 PM, NeilBrown <neilb@suse.com> wrote:
> >
> >
> > When an i_op->getattr() call is made on an NFS file
> > (typically from a 'stat' family system call), NFS
> > will first flush any dirty data to the server.
> >
> > This ensures that the mtime reported is correct and stable,
> > but has a performance penalty. 'stat' is normally thought
> > to be a quick operation, and imposing this cost can be
> > surprising.
>
> To be clear, this behavior is a POSIX requirement.
>
>
> > I have seen problems when one process is writing a large
> > file and another process performs "ls -l" on the containing
> > directory and is blocked for as long as it takes to flush
> > all the dirty data to the server, which can be minutes.
>
> Yes, a well-known annoyance that cannot be addressed
> even with a write delegation.
>
>
> > I have also seen a legacy application which frequently calls
> > "fstat" on a file that it is writing to. On a local
> > filesystem (and in the Solaris implementation of NFS) this
> > fstat call is cheap. On Linux/NFS, this causes a noticeable
> > decrease in throughput.
>
> If the preceding write is small, Linux could be using
> a FILE_SYNC write, but Solaris could be using UNSTABLE.
>
>
> > The only circumstances where an application calling 'stat()'
> > might get an mtime which is not stable are times when some
> > other process is writing to the file and the two processes
> > are not using locking to ensure consistency, or when the one
> > process is both writing and stating. In neither of these
> > cases is it reasonable to expect the mtime to be stable.
>
> I'm not convinced this is a strong enough rationale
> for claiming it is safe to disable the existing
> behavior.
>
> You've explained cases where the new behavior is
> reasonable, but do you have any examples where the
> new behavior would be a problem? There must be a
> reason why POSIX explicitly requires an up-to-date
> mtime.
>
> What guidance would nfs(5) give on when it is safe
> to specify the new mount option?
>
>
> > In the most common cases where mtime is important
> > (e.g. make), no other process has the file open, so there
> > will be no dirty data and the mtime will be stable.
>
> Isn't it also the case that make is a multi-process
> workload where one process modifies a file, then
> closes it (which triggers a flush), and then another
> process stats the file? The new mount option does
> not change the behavior of close(2), does it?
>
>
> > Rather than unilaterally changing this behavior of 'stat',
> > this patch adds a "nostatflush" mount option so that
> > sysadmins whose applications are hurt by the current
> > behavior can disable it.
>
> IMO a mount option is at the wrong granularity. A
> mount point will be shared between applications that
> can tolerate the non-POSIX behavior and those that
> cannot, for instance.

Agreed.

The other thing to note here is that we now have an embryonic statx()
system call, which allows the application itself to decide whether or
not it needs up-to-date values for the atime/ctime/mtime. While we
haven't yet plumbed in the NFS side, the intention was always to use
that information to turn off the writeback flushing when possible.

Cheers
  Trond

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com