* Re: [nfsv4] layoutcommits and file layout
From: Matt W. Benjamin @ 2010-12-16 16:21 UTC
To: Fred Isaman; +Cc: Benny Halevy, Boaz Harrosh, linux-nfs, nfsv4
Hi,

We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout.  It was my clear understanding from rfc5661 that we could expect this behavior.

I would have no issue at all with allowing the implementation to indicate to clients that it doesn't care to or need to receive LAYOUTCOMMIT, as I believe David Noveck proposed.  It creates a real problem if LAYOUTCOMMIT is simply rationalised out of the specification by filesystems which have other means to efficiently ensure semantics which may not be efficient or reasonable for ours.

Matt

----- "Fred Isaman" <iisaman@netapp.com> wrote:

> >>> On 12/10/2010 03:22 AM, Fred Isaman wrote:
> >>>> Since file does not need them, just turn them off for
> >>>> the moment.  Non-file layouts will probably have to trigger them in
> >>>> some fashion at close.
> >>>>
> >>>
> >>> Rrrr. Are we back to this argument. We stand down win an argument
> >>> and 2 weeks later you are back on it as if we never talked about it.
> >>>
> >>> NO!!! only "coherent clustered filesystems" do not need them. It has
> >>> nothing to do with layout type. A non-clustered aggregated parallel
> >>> filesystem will need them just the same as blocks and objects.
> >>>
> >>> AND THE STD DOES NOT GIVE YOU A CHOICE!!!
> >>
> >> You keep saying this, but just repeating it does not convince me.
> >> Could you please take the time to explain *why* they are needed.

-- 

Matt Benjamin

The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI  48104

http://linuxbox.com

tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309
* Re: [nfsv4] layoutcommits and file layout
From: Christoph Hellwig @ 2010-12-16 23:07 UTC
To: Matt W. Benjamin
Cc: Fred Isaman, Benny Halevy, linux-nfs, nfsv4, Boaz Harrosh
On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
> Hi,
>
> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
Care to post it to the list?
* Re: [nfsv4] layoutcommits and file layout
From: Benny Halevy @ 2011-01-03 14:21 UTC
To: Christoph Hellwig; +Cc: linux-nfs, nfsv4
On 2010-12-17 01:07, Christoph Hellwig wrote:
> On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
>> Hi,
>>
>> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
>
> Care to post it to the list?
>
I don't know what Matt's server is doing but the fundamental problem is
manifested with extending a file with parallel DS writes.
Assuming that the DS writes are executed in arbitrary order,
exposing the file length before LAYOUTCOMMIT can cause
a concurrent reader to read a hole. Although locking can
solve this case, day-to-day applications that work well over
local filesystem and legacy NFS may break because of this.
Benny
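The scenario described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not code from any server or client discussed in this thread: the function name, the 4-byte stripe unit, and the in-memory "file" are all hypothetical. It contrasts publishing the file length as each DS write lands with publishing it only once, at a step standing in for LAYOUTCOMMIT.

```python
STRIPE_UNIT = 4  # bytes per stripe unit; deliberately tiny (assumption)


def reader_sees_hole(write_order, expose_size_on_write):
    """Extend an empty toy file with one write per stripe unit.

    write_order: the order in which the stripe-unit writes are applied.
    expose_size_on_write: if True, the visible length grows as soon as a
        write lands; if False, it is published only at the final step,
        which stands in for LAYOUTCOMMIT.
    Returns True if a reader polling after every write ever saw unwritten
    (zero) bytes inside the visible length.
    """
    data = bytearray(STRIPE_UNIT * len(write_order))  # zeros = not written yet
    visible_size = 0
    saw_hole = False
    for unit in write_order:
        start = unit * STRIPE_UNIT
        data[start:start + STRIPE_UNIT] = b"x" * STRIPE_UNIT
        if expose_size_on_write:
            visible_size = max(visible_size, start + STRIPE_UNIT)
        # A concurrent reader reads everything up to the visible length.
        if b"\0" in data[:visible_size]:
            saw_hole = True
    visible_size = len(data)  # stand-in for LAYOUTCOMMIT publishing the length
    return saw_hole or b"\0" in data[:visible_size]
```

In this model, an out-of-order sequence such as `[3, 1, 0, 2]` exposes a hole only when the length is published eagerly; deferring the length until the final step hides it.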
_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www.ietf.org/mailman/listinfo/nfsv4
* Re: [nfsv4] layoutcommits and file layout
From: Trond Myklebust @ 2011-01-03 14:40 UTC
To: Benny Halevy; +Cc: Christoph Hellwig, linux-nfs, nfsv4
On Mon, 2011-01-03 at 16:21 +0200, Benny Halevy wrote:
> On 2010-12-17 01:07, Christoph Hellwig wrote:
> > On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
> >> Hi,
> >>
> >> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
> >
> > Care to post it to the list?
> >
>
> I don't know what Matt's server is doing but the fundamental problem is
> manifested with extending a file with parallel DS writes.
> Assuming that the DS writes are executed in arbitrary order,
> exposing the file length before LAYOUTCOMMIT can cause
> a concurrent reader to read a hole. Although locking can
> solve this case, day-to-day applications that work well over
> local filesystem and legacy NFS may break because of this.
...and this differs from ordinary NFS writes exactly how?
Both cached and uncached (i.e. O_DIRECT) writes can and will be flushed
to disk in entirely random order when writing to the MDS. If you have a
parallel reader on another client (or even on the same client in the
case of O_DIRECT), and want it to see accurate data, then use locking.
If not, you will see holes and other strangeness.
IOW: There are no 'day-to-day applications that work well over legacy
NFS' that rely on this behaviour.
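As an illustration of the "use locking" advice above, here is a minimal sketch using POSIX byte-range locks through Python's standard `fcntl` module. The helper names `append_locked` and `read_locked` are hypothetical, and the sketch assumes a Unix system where the writer and the reader cooperate by taking the lock around every access; it makes no claim about how any particular NFS client or server implements locking.

```python
import fcntl
import os


def append_locked(path, payload):
    """Append payload while holding an exclusive lock on the whole file."""
    with open(path, "ab") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)      # blocks until the lock is granted
        try:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())           # push the data out before unlocking
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


def read_locked(path):
    """Read the whole file while holding a shared lock."""
    with open(path, "rb") as f:
        fcntl.lockf(f, fcntl.LOCK_SH)      # excludes writers holding LOCK_EX
        try:
            return f.read()
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
```

These locks are advisory: a reader that skips `read_locked` is not protected, which is the point made above about applications that do not lock.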
* Re: [nfsv4] layoutcommits and file layout
From: Benny Halevy @ 2011-01-05 19:01 UTC
To: Trond Myklebust; +Cc: Christoph Hellwig, linux-nfs, nfsv4
On 2011-01-03 16:40, Trond Myklebust wrote:
> On Mon, 2011-01-03 at 16:21 +0200, Benny Halevy wrote:
>> On 2010-12-17 01:07, Christoph Hellwig wrote:
>>> On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
>>>> Hi,
>>>>
>>>> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
>>>
>>> Care to post it to the list?
>>>
>>
>> I don't know what Matt's server is doing but the fundamental problem is
>> manifested with extending a file with parallel DS writes.
>> Assuming that the DS writes are executed in arbitrary order,
>> exposing the file length before LAYOUTCOMMIT can cause
>> a concurrent reader to read a hole. Although locking can
>> solve this case, day-to-day applications that work well over
>> local filesystem and legacy NFS may break because of this.
>
> ...and this differs from ordinary NFS writes exactly how?
>
> Both cached and uncached (i.e. O_DIRECT) writes can and will be flushed
> to disk in entirely random order when writing to the MDS. If you have a
> parallel reader on another client (or even on the same client in the
> case of O_DIRECT), and want it to see accurate data, then use locking.
> If not, you will see holes and other strangeness.
>
> IOW: There are no 'day-to-day applications that work well over legacy
> NFS' that rely on this behaviour.
>
Assuming the client writes sequentially (over tcp) the writes will
practically be processed in order into the server's cache so with
no crashes in the mix a parallel reader will see no holes.
I'd really like the following scenario to work over pNFS with
no hassles:
"some app >> foo" on one client, and
"tail -f foo" on another
* Re: [nfsv4] layoutcommits and file layout
From: Trond Myklebust @ 2011-01-05 19:04 UTC
To: Benny Halevy; +Cc: Christoph Hellwig, linux-nfs, nfsv4
On Wed, 2011-01-05 at 21:01 +0200, Benny Halevy wrote:
> On 2011-01-03 16:40, Trond Myklebust wrote:
> > On Mon, 2011-01-03 at 16:21 +0200, Benny Halevy wrote:
> >> On 2010-12-17 01:07, Christoph Hellwig wrote:
> >>> On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
> >>>> Hi,
> >>>>
> >>>> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
> >>>
> >>> Care to post it to the list?
> >>>
> >>
> >> I don't know what Matt's server is doing but the fundamental problem is
> >> manifested with extending a file with parallel DS writes.
> >> Assuming that the DS writes are executed in arbitrary order,
> >> exposing the file length before LAYOUTCOMMIT can cause
> >> a concurrent reader to read a hole. Although locking can
> >> solve this case, day-to-day applications that work well over
> >> local filesystem and legacy NFS may break because of this.
> >
> > ...and this differs from ordinary NFS writes exactly how?
> >
> > Both cached and uncached (i.e. O_DIRECT) writes can and will be flushed
> > to disk in entirely random order when writing to the MDS. If you have a
> > parallel reader on another client (or even on the same client in the
> > case of O_DIRECT), and want it to see accurate data, then use locking.
> > If not, you will see holes and other strangeness.
> >
> > IOW: There are no 'day-to-day applications that work well over legacy
> > NFS' that rely on this behaviour.
> >
>
> Assuming the client writes sequentially (over tcp) the writes will
> practically be processed in order into the server's cache so with
> no crashes in the mix a parallel reader will see no holes.
> I'd really like the following scenario to work over pNFS with
> no hassles:
> "some app >> foo" on one client, and
> "tail -f foo" on another
No, that doesn't work today! Believe me, I get the "bug reports"...
There is no point in trying to add properties to pNFS that don't exist
with ordinary NFS.
Trond
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com
* Re: [nfsv4] layoutcommits and file layout
From: Trond Myklebust @ 2011-01-05 19:14 UTC
To: Benny Halevy; +Cc: Christoph Hellwig, linux-nfs, nfsv4
On Wed, 2011-01-05 at 14:04 -0500, Trond Myklebust wrote:
> On Wed, 2011-01-05 at 21:01 +0200, Benny Halevy wrote:
> > On 2011-01-03 16:40, Trond Myklebust wrote:
> > > On Mon, 2011-01-03 at 16:21 +0200, Benny Halevy wrote:
> > >> On 2010-12-17 01:07, Christoph Hellwig wrote:
> > >>> On Thu, Dec 16, 2010 at 11:21:21AM -0500, Matt W. Benjamin wrote:
> > >>>> Hi,
> > >>>>
> > >>>> We have a files implementation which wants to receive LAYOUTCOMMIT when a client is finished with a layout. It was my clear understanding from rfc5661 that we could expect this behavior.
> > >>>
> > >>> Care to post it to the list?
> > >>>
> > >>
> > >> I don't know what Matt's server is doing but the fundamental problem is
> > >> manifested with extending a file with parallel DS writes.
> > >> Assuming that the DS writes are executed in arbitrary order,
> > >> exposing the file length before LAYOUTCOMMIT can cause
> > >> a concurrent reader to read a hole. Although locking can
> > >> solve this case, day-to-day applications that work well over
> > >> local filesystem and legacy NFS may break because of this.
> > >
> > > ...and this differs from ordinary NFS writes exactly how?
> > >
> > > Both cached and uncached (i.e. O_DIRECT) writes can and will be flushed
> > > to disk in entirely random order when writing to the MDS. If you have a
> > > parallel reader on another client (or even on the same client in the
> > > case of O_DIRECT), and want it to see accurate data, then use locking.
> > > If not, you will see holes and other strangeness.
> > >
> > > IOW: There are no 'day-to-day applications that work well over legacy
> > > NFS' that rely on this behaviour.
> > >
> >
> > Assuming the client writes sequentially (over tcp) the writes will
> > practically be processed in order into the server's cache so with
> > no crashes in the mix a parallel reader will see no holes.
> > I'd really like the following scenario to work over pNFS with
> > no hassles:
> > "some app >> foo" on one client, and
> > "tail -f foo" on another
>
> No, that doesn't work today! Believe me, I get the "bug reports"...
>
> There is no point in trying to add properties to pNFS that don't exist
> with ordinary NFS.
...and for the record: use of TCP does _not_ suffice to ensure writes
are processed in order.
In the Linux kernel, we have all sorts of parallelism going on before
the writes even hit the socket on the client. Everything from background
flushing to queuing in the sunrpc layer (e.g. for a session slot)
conspires to destroy any hope of ever achieving what you propose above.
That's not even counting what goes on with the server side. Think, for
instance, of the case where the server crashes before a COMMIT has been
successfully sent. Not only will your reader see holes, it will think
the file has been truncated...
Trond
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com
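The crash case mentioned above can be sketched with a toy model. Everything here is a hypothetical illustration (the `ToyServer` class and its methods are not any real server's API): writes sit in a volatile cache until a commit step makes them stable, and a crash discards whatever was not yet committed.

```python
class ToyServer:
    """Toy model of unstable writes that are lost if a crash precedes COMMIT."""

    def __init__(self):
        self.stable = bytearray()   # data that would survive a crash
        self.cache = {}             # offset -> bytes, not yet committed

    @staticmethod
    def _overlay(base, writes):
        buf = bytearray(base)
        for offset, payload in sorted(writes.items()):
            end = offset + len(payload)
            shortfall = end - len(buf)
            if shortfall > 0:
                buf.extend(b"\0" * shortfall)   # unwritten gap reads as zeros
            buf[offset:end] = payload
        return buf

    def write_unstable(self, offset, payload):
        self.cache[offset] = bytes(payload)

    def commit(self):
        self.stable = self._overlay(self.stable, self.cache)
        self.cache.clear()

    def crash(self):
        self.cache.clear()          # uncommitted data is gone

    def visible(self):
        """What a reader would see right now."""
        return bytes(self._overlay(self.stable, self.cache))
```

For example, after committing `b"aaaa"` at offset 0 and writing `b"cccc"` at offset 8 without a commit, a reader sees a 12-byte file with a 4-byte hole; after `crash()` the same reader sees only the first 4 bytes, so the file appears truncated.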