* Re: [nfsv4] layoutcommits and file layout
       [not found] <978693366.32.1292516428080.JavaMail.root@thunderbeast.private.linuxbox.com>
@ 2010-12-16 16:21 ` Matt W. Benjamin
       [not found]   ` <1740153586.34.1292516481789.JavaMail.root-DQa+Qhn4Z593Hjf6844flrbbgpPoC6wPvwx5bNz670MAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Matt W. Benjamin @ 2010-12-16 16:21 UTC (permalink / raw)
  To: Fred Isaman; +Cc: Benny Halevy, Boaz Harrosh, linux-nfs, nfsv4

Hi,

We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout.  It was my clear understanding from rfc5661 that we could expect this behavior.

I would have no issue at all with allowing the implementation to indicate to clients that it doesn't care to or need to receive LAYOUTCOMMIT, as I believe David Noveck proposed.  It creates a real problem if LAYOUTCOMMIT is simply rationalised out of the specification by filesystems which have other means to efficiently ensure semantics which may not be efficient or reasonable for ours.

Matt

----- "Fred Isaman" <iisaman@netapp.com> wrote:

> >>> On 12/10/2010 03:22 AM, Fred Isaman wrote:
> >>>> Since file does not need them, just turn them off for
> >>>> the moment. Non-file layouts will probably have to trigger them in
> >>>> some fashion at close.
> >>>>
> >>>
> >>> Rrrr. Are we back to this argument. We stand down win an argument
> >>> and 2 weeks later you are back on it has if we never talked about it.
> >>>
> >>> NO!!! only "coherent clustered filesystems" do not need them. It has
> >>> nothing to do with layout type. A none-clustered aggregated parallel
> >>> filesystem will need them just the same as blocks and objects.
> >>>
> >>> AND THE STD DOES NOT GIVE YOU A CHOICE!!!
> >>
> >> You keep saying this, but just repeating it does not convince me.
> >> Could you please take the time to explain *why* they are needed.

-- 

Matt Benjamin

The Linux Box
http://linuxbox.com
_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www.ietf.org/mailman/listinfo/nfsv4

^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <1740153586.34.1292516481789.JavaMail.root-DQa+Qhn4Z593Hjf6844flrbbgpPoC6wPvwx5bNz670MAvxtiuMwx3w@public.gmane.org>]
* Re: [nfsv4] layoutcommits and file layout
       [not found] ` <1740153586.34.1292516481789.JavaMail.root-DQa+Qhn4Z593Hjf6844flrbbgpPoC6wPvwx5bNz670MAvxtiuMwx3w@public.gmane.org>
@ 2010-12-16 23:07   ` Christoph Hellwig
  2011-01-03 14:21     ` Benny Halevy
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2010-12-16 23:07 UTC (permalink / raw)
  To: Matt W. Benjamin
  Cc: Fred Isaman, Benny Halevy, linux-nfs, nfsv4, Boaz Harrosh

On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
> Hi,
>
> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.

Care to post it to the list?

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [nfsv4] layoutcommits and file layout
  2010-12-16 23:07   ` Christoph Hellwig
@ 2011-01-03 14:21     ` Benny Halevy
  2011-01-03 14:40       ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Benny Halevy @ 2011-01-03 14:21 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nfs, nfsv4

On 2010-12-17 01:07, Christoph Hellwig wrote:
> On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
>> Hi,
>>
>> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
>
> Care to post it to the list?
>

I don't know what Matt's server is doing but the fundamental problem is
manifested with extending a file with parallel DS writes.
Assuming that the DS writes are executed in arbitrary order,
exposing the file length before LAYOUTCOMMIT can cause
a concurrent reader to read a hole. Although locking can
solve this case, day-to-day applications that work well over
local filesystem and legacy NFS may break because of this.

Benny

^ permalink raw reply [flat|nested] 7+ messages in thread
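A toy model may make the scenario in the message above concrete. This is a minimal in-memory sketch under stated assumptions, not pNFS code: `ToyFile`, `STRIPE`, and `reader_saw_hole` are hypothetical names, the "data servers" are a single byte array, and the only protocol detail assumed is that LAYOUTCOMMIT reports the offset of the last byte written, so the committed length is that offset plus one.

```python
STRIPE = 4  # bytes per write in this toy model (hypothetical value)


class ToyFile:
    """In-memory stand-in for a file whose data lands on data servers (DS)."""

    def __init__(self):
        self.data = bytearray()   # bytes stored by the toy "data servers"
        self.visible_size = 0     # length the toy "metadata server" exposes

    def ds_write(self, offset, payload, expose_size):
        end = offset + len(payload)
        if end > len(self.data):
            # Ranges that were never written read back as zeros (a hole).
            self.data.extend(b"\0" * (end - len(self.data)))
        self.data[offset:end] = payload
        if expose_size:
            self.visible_size = max(self.visible_size, end)

    def layoutcommit(self, last_write_offset):
        # Assumed semantics: committed length = last written offset + 1.
        self.visible_size = max(self.visible_size, last_write_offset + 1)

    def read_all(self):
        return bytes(self.data[: self.visible_size])


def reader_saw_hole(expose_size_early, order):
    """Apply stripe writes in `order` (a permutation of 0..n-1), polling a reader."""
    f = ToyFile()
    saw_hole = False
    for i in order:
        f.ds_write(i * STRIPE, b"x" * STRIPE, expose_size_early)
        saw_hole = saw_hole or b"\0" in f.read_all()   # reader polls after each write
    f.layoutcommit(len(order) * STRIPE - 1)
    return saw_hole or b"\0" in f.read_all()


order = [3, 1, 0, 2]                       # arbitrary DS completion order
early = reader_saw_hole(True, order)       # length exposed on every DS write
deferred = reader_saw_hole(False, order)   # length exposed only at LAYOUTCOMMIT
```

In this model the reader observes zero bytes only when the length is published as each write lands; deferring the length update to the commit hides the partially written range. Whether a real server should behave this way is exactly what the thread goes on to dispute.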
* Re: [nfsv4] layoutcommits and file layout
  2011-01-03 14:21     ` Benny Halevy
@ 2011-01-03 14:40       ` Trond Myklebust
  2011-01-05 19:01         ` Benny Halevy
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2011-01-03 14:40 UTC (permalink / raw)
  To: Benny Halevy; +Cc: Christoph Hellwig, linux-nfs, nfsv4

On Mon, 2011-01-03 at 16:21 +0200, Benny Halevy wrote:
> On 2010-12-17 01:07, Christoph Hellwig wrote:
> > On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
> >> Hi,
> >>
> >> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
> >
> > Care to post it to the list?
> >
>
> I don't know what Matt's server is doing but the fundamental problem is
> manifested with extending a file with parallel DS writes.
> Assuming that the DS writes are executed in arbitrary order,
> exposing the file length before LAYOUTCOMMIT can cause
> a concurrent reader to read a hole. Although locking can
> solve this case, day-to-day applications that work well over
> local filesystem and legacy NFS may break because of this.

...and this differs from ordinary NFS writes exactly how?

Both cached and uncached (i.e. O_DIRECT) writes can and will be flushed
to disk in entirely random order when writing to the MDS. If you have a
parallel reader on another client (or even on the same client in the
case of O_DIRECT), and want it to see accurate data, then use locking.
If not, you will see holes and other strangeness.

IOW: There are no 'day-to-day applications that work well over legacy
NFS' that rely on this behaviour.

^ permalink raw reply [flat|nested] 7+ messages in thread
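The "use locking" advice above can be sketched with POSIX byte-range locks. This is a hedged illustration only: `write_locked` and `read_locked` are hypothetical helper names, the demo runs both sides in one process against a local temporary file (so it shows the call pattern, not real contention), and it assumes a POSIX system where Python's `fcntl.lockf` is available.

```python
import fcntl
import os
import tempfile


def write_locked(path, chunks):
    """Write (offset, payload) pairs in the given order under an exclusive lock."""
    with open(path, "r+b") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)      # blocks while another process holds a lock
        try:
            for offset, payload in chunks:
                f.seek(offset)
                f.write(payload)
            f.flush()
            os.fsync(f.fileno())           # push the data out before unlocking
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


def read_locked(path):
    """Read the whole file under a shared lock."""
    with open(path, "rb") as f:
        fcntl.lockf(f, fcntl.LOCK_SH)      # blocks while a writer holds LOCK_EX
        try:
            return f.read()
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


fd, path = tempfile.mkstemp()
os.close(fd)
# The writer deliberately writes the second half first.
write_locked(path, [(4, b"BBBB"), (0, b"AAAA")])
result = read_locked(path)
os.remove(path)
```

A reader in another process that takes the shared lock cannot observe the intermediate state in which only the second half exists, because the exclusive lock is held across both writes.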
* Re: [nfsv4] layoutcommits and file layout
  2011-01-03 14:40       ` Trond Myklebust
@ 2011-01-05 19:01         ` Benny Halevy
  2011-01-05 19:04           ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Benny Halevy @ 2011-01-05 19:01 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Christoph Hellwig, linux-nfs, nfsv4

On 2011-01-03 16:40, Trond Myklebust wrote:
> On Mon, 2011-01-03 at 16:21 +0200, Benny Halevy wrote:
>> On 2010-12-17 01:07, Christoph Hellwig wrote:
>>> On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
>>>> Hi,
>>>>
>>>> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
>>>
>>> Care to post it to the list?
>>>
>>
>> I don't know what Matt's server is doing but the fundamental problem is
>> manifested with extending a file with parallel DS writes.
>> Assuming that the DS writes are executed in arbitrary order,
>> exposing the file length before LAYOUTCOMMIT can cause
>> a concurrent reader to read a hole. Although locking can
>> solve this case, day-to-day applications that work well over
>> local filesystem and legacy NFS may break because of this.
>
> ...and this differs from ordinary NFS writes exactly how?
>
> Both cached and uncached (i.e. O_DIRECT) writes can and will be flushed
> to disk in entirely random order when writing to the MDS. If you have a
> parallel reader on another client (or even on the same client in the
> case of O_DIRECT), and want it to see accurate data, then use locking.
> If not, you will see holes and other strangeness.
>
> IOW: There are no 'day-to-day applications that work well over legacy
> NFS' that rely on this behaviour.
>

Assuming the client writes sequentially (over tcp) the writes will
practically be processed in order into the server's cache so with
no crashes in the mix a parallel reader will see no holes.
I'd really like the following scenario to work over pNFS with
no hassles:
"some app >> foo" on one client, and
"tail -f foo" on another

^ permalink raw reply [flat|nested] 7+ messages in thread
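To see why the appender/follower scenario above is sensitive to when the file length becomes visible, consider what one polling step of a `tail -f`-style reader does. The sketch below is an assumption-laden illustration, not the actual `tail` implementation: `read_new_bytes` is a hypothetical helper, and the demo runs against a local temporary file rather than an NFS mount.

```python
import os
import tempfile


def read_new_bytes(path, last_offset):
    """Return (data, new_offset) for bytes appended since last_offset."""
    size = os.stat(path).st_size           # the reader trusts the reported length
    if size < last_offset:
        # The file looks shorter than before: treat it as truncated and start over.
        last_offset = 0
    with open(path, "rb") as f:
        f.seek(last_offset)
        data = f.read(size - last_offset)
    return data, last_offset + len(data)


fd, path = tempfile.mkstemp()
os.close(fd)
offset = 0

with open(path, "ab") as f:                # stands in for "some app >> foo"
    f.write(b"first\n")
first, offset = read_new_bytes(path, offset)

with open(path, "ab") as f:
    f.write(b"second\n")
second, offset = read_new_bytes(path, offset)

os.remove(path)
```

The follower reads everything between its last offset and the length it is told about. If that length ever covers a range whose data has not yet arrived, the follower consumes zeros and moves past them; if the length later shrinks, it concludes the file was truncated.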
* Re: [nfsv4] layoutcommits and file layout
  2011-01-05 19:01         ` Benny Halevy
@ 2011-01-05 19:04           ` Trond Myklebust
  2011-01-05 19:14             ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2011-01-05 19:04 UTC (permalink / raw)
  To: Benny Halevy; +Cc: Christoph Hellwig, linux-nfs, nfsv4

On Wed, 2011-01-05 at 21:01 +0200, Benny Halevy wrote:
> On 2011-01-03 16:40, Trond Myklebust wrote:
> > On Mon, 2011-01-03 at 16:21 +0200, Benny Halevy wrote:
> >> On 2010-12-17 01:07, Christoph Hellwig wrote:
> >>> On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
> >>>> Hi,
> >>>>
> >>>> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
> >>>
> >>> Care to post it to the list?
> >>>
> >>
> >> I don't know what Matt's server is doing but the fundamental problem is
> >> manifested with extending a file with parallel DS writes.
> >> Assuming that the DS writes are executed in arbitrary order,
> >> exposing the file length before LAYOUTCOMMIT can cause
> >> a concurrent reader to read a hole. Although locking can
> >> solve this case, day-to-day applications that work well over
> >> local filesystem and legacy NFS may break because of this.
> >
> > ...and this differs from ordinary NFS writes exactly how?
> >
> > Both cached and uncached (i.e. O_DIRECT) writes can and will be flushed
> > to disk in entirely random order when writing to the MDS. If you have a
> > parallel reader on another client (or even on the same client in the
> > case of O_DIRECT), and want it to see accurate data, then use locking.
> > If not, you will see holes and other strangeness.
> >
> > IOW: There are no 'day-to-day applications that work well over legacy
> > NFS' that rely on this behaviour.
> >
>
> Assuming the client writes sequentially (over tcp) the writes will
> practically be processed in order into the server's cache so with
> no crashes in the mix a parallel reader will see no holes.
> I'd really like the following scenario to work over pNFS with
> no hassles:
> "some app >> foo" on one client, and
> "tail -f foo" on another

No, that doesn't work today! Believe me, I get the "bug reports"...

There is no point in trying to add properties to pNFS that don't exist
with ordinary NFS.

Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [nfsv4] layoutcommits and file layout
  2011-01-05 19:04           ` Trond Myklebust
@ 2011-01-05 19:14             ` Trond Myklebust
  0 siblings, 0 replies; 7+ messages in thread
From: Trond Myklebust @ 2011-01-05 19:14 UTC (permalink / raw)
  To: Benny Halevy; +Cc: Christoph Hellwig, linux-nfs, nfsv4

On Wed, 2011-01-05 at 14:04 -0500, Trond Myklebust wrote:
> On Wed, 2011-01-05 at 21:01 +0200, Benny Halevy wrote:
> > On 2011-01-03 16:40, Trond Myklebust wrote:
> > > On Mon, 2011-01-03 at 16:21 +0200, Benny Halevy wrote:
> > >> On 2010-12-17 01:07, Christoph Hellwig wrote:
> > >>> On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
> > >>>> Hi,
> > >>>>
> > >>>> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
> > >>>
> > >>> Care to post it to the list?
> > >>>
> > >>
> > >> I don't know what Matt's server is doing but the fundamental problem is
> > >> manifested with extending a file with parallel DS writes.
> > >> Assuming that the DS writes are executed in arbitrary order,
> > >> exposing the file length before LAYOUTCOMMIT can cause
> > >> a concurrent reader to read a hole. Although locking can
> > >> solve this case, day-to-day applications that work well over
> > >> local filesystem and legacy NFS may break because of this.
> > >
> > > ...and this differs from ordinary NFS writes exactly how?
> > >
> > > Both cached and uncached (i.e. O_DIRECT) writes can and will be flushed
> > > to disk in entirely random order when writing to the MDS. If you have a
> > > parallel reader on another client (or even on the same client in the
> > > case of O_DIRECT), and want it to see accurate data, then use locking.
> > > If not, you will see holes and other strangeness.
> > >
> > > IOW: There are no 'day-to-day applications that work well over legacy
> > > NFS' that rely on this behaviour.
> > >
> >
> > Assuming the client writes sequentially (over tcp) the writes will
> > practically be processed in order into the server's cache so with
> > no crashes in the mix a parallel reader will see no holes.
> > I'd really like the following scenario to work over pNFS with
> > no hassles:
> > "some app >> foo" on one client, and
> > "tail -f foo" on another
>
> No, that doesn't work today! Believe me, I get the "bug reports"...
>
> There is no point in trying to add properties to pNFS that don't exist
> with ordinary NFS.

...and for the record: use of TCP does _not_ suffice to ensure writes
are processed in order. In the Linux kernel, we have all sorts of
parallelism going on before the writes even hit the socket on the
client. Everything from background flushing to queuing in the sunrpc
layer (e.g. for a session slot) conspires to destroy any hope of ever
achieving what you propose above.
That's not even counting what goes on with the server side. Think, for
instance, of the case where the server crashes before a COMMIT has been
successfully sent. Not only will your reader see holes, it will think
the file has been truncated...

Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply [flat|nested] 7+ messages in thread
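The crash-before-COMMIT case in the message above can be modelled in a few lines. This is a toy sketch under stated assumptions, not server code: `ToyServer` and its methods are hypothetical, unstable writes are modelled as living only in a volatile cache until a commit copies them to stable storage, and the write verifier is reduced to a counter that changes on reboot.

```python
class ToyServer:
    """Toy NFS-like server: unstable writes are lost if it crashes before COMMIT."""

    def __init__(self):
        self.stable = bytearray()   # data that has been committed
        self.cache = bytearray()    # what reads and size queries currently reflect
        self.verifier = 0           # changes on reboot so clients can detect loss

    def write_unstable(self, offset, payload):
        end = offset + len(payload)
        if end > len(self.cache):
            self.cache.extend(b"\0" * (end - len(self.cache)))
        self.cache[offset:end] = payload
        return self.verifier

    def commit(self):
        self.stable = bytearray(self.cache)
        return self.verifier

    def crash_and_reboot(self):
        self.cache = bytearray(self.stable)   # uncommitted data is gone
        self.verifier += 1

    def size(self):
        return len(self.cache)


server = ToyServer()
server.write_unstable(0, b"line1\n")
server.commit()
verifier_at_write = server.write_unstable(6, b"line2\n")   # not yet committed

size_before_crash = server.size()   # a concurrent reader sees both lines
server.crash_and_reboot()
size_after_crash = server.size()    # the same reader now sees a shorter file

# The writer notices the changed verifier and knows it must resend the data.
must_resend = server.commit() != verifier_at_write
```

From the reader's side the file shrinks between two observations, which is indistinguishable from a truncation until the writer resends the lost data.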
end of thread, other threads:[~2011-01-05 19:14 UTC | newest]
Thread overview: 7+ messages
[not found] <978693366.32.1292516428080.JavaMail.root@thunderbeast.private.linuxbox.com>
2010-12-16 16:21 ` [nfsv4] layoutcommits and file layout Matt W. Benjamin
[not found] ` <1740153586.34.1292516481789.JavaMail.root-DQa+Qhn4Z593Hjf6844flrbbgpPoC6wPvwx5bNz670MAvxtiuMwx3w@public.gmane.org>
2010-12-16 23:07 ` Christoph Hellwig
2011-01-03 14:21 ` Benny Halevy
2011-01-03 14:40 ` Trond Myklebust
2011-01-05 19:01 ` Benny Halevy
2011-01-05 19:04 ` Trond Myklebust
2011-01-05 19:14 ` Trond Myklebust