NFS_UNSTABLE vs. FILE and DATA sync.

* NFS_UNSTABLE vs. FILE and DATA sync.
@ 2007-08-06 16:02 Wim Colgate
  2007-08-06 16:37 ` Chuck Lever
  0 siblings, 1 reply; 12+ messages in thread
From: Wim Colgate @ 2007-08-06 16:02 UTC (permalink / raw)
  To: nfs

[-- Attachment #1.1: Type: text/plain, Size: 576 bytes --]

Hi,

My last question was a bit too nebulous, and I didn't really expect an
answer. Learning my lesson, I have a specific question.

If I have a soft mount, and open a file with O_DIRECT and O_SYNC, should
I ever expect a callback (nfs_writeback_done) with a successful
task->tk_status (i.e., >= 0) with the committed state
(resp->verf->committed) set to NFS_UNSTABLE?

A secondary question: if the above is expected, does this occur because
someone is caching the write and is there a mechanism to disable this
effect?

Regards,

Wim

[-- Attachment #1.2: Type: text/html, Size: 2935 bytes --]

[-- Attachment #2: Type: text/plain, Size: 315 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/

[-- Attachment #3: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 16:02 NFS_UNSTABLE vs. FILE and DATA sync Wim Colgate
@ 2007-08-06 16:37 ` Chuck Lever
  2007-08-06 17:10   ` Chuck Lever
  2007-08-06 17:33   ` Peter Staubach
  0 siblings, 2 replies; 12+ messages in thread
From: Chuck Lever @ 2007-08-06 16:37 UTC (permalink / raw)
  To: Wim Colgate; +Cc: nfs

[-- Attachment #1: Type: text/plain, Size: 618 bytes --]

Wim Colgate wrote:
> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, should 
> I ever expect a callback (nfs_writeback_done) with a successful 
> task->tk_status (i.e >= 0) with the committed state 
> (resp->verf->committed) set to NFS_UNSTABLE?

Yes, this can happen if the server decides to return NFS_UNSTABLE. 
Rare, but possible.

> A secondary question: if the above is expected, does this occur because 
> someone is caching the write and is there a mechanism to disable this 
> effect?

Servers can return NFS_UNSTABLE to any WRITE request, so I can't think 
of a way this might be disabled.


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 16:37 ` Chuck Lever
@ 2007-08-06 17:10   ` Chuck Lever
  2007-08-06 18:58     ` Trond Myklebust
  2007-08-06 17:33   ` Peter Staubach
  1 sibling, 1 reply; 12+ messages in thread
From: Chuck Lever @ 2007-08-06 17:10 UTC (permalink / raw)
  To: Wim Colgate; +Cc: nfs

[-- Attachment #1: Type: text/plain, Size: 1322 bytes --]



Chuck Lever wrote:
> Wim Colgate wrote:
>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, 
>> should I ever expect a callback (nfs_writeback_done) with a successful 
>> task->tk_status (i.e >= 0) with the committed state 
>> (resp->verf->committed) set to NFS_UNSTABLE?
> 
> Yes, this can happen if the server decides to return NFS_UNSTABLE. Rare, 
> but possible.

Let me be more clear about this.

O_DIRECT and O_SYNC determine client behavior only.  However, they don't 
necessarily force NFS_FILE_SYNC writes all the time.  For example, if an 
application issues a direct write request that is much larger than the 
current wsize, the Linux NFS client will send the write using 
NFS_UNSTABLE requests followed by a COMMIT.  That makes it easier for 
the server to schedule disk writes more efficiently.
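
To make the wsize point concrete, here is a minimal sketch of the decision
described above. It is not the kernel's code: the names `write_plan` and
`plan_direct_write` are made up for illustration, and the rule "anything that
needs more than one WRITE RPC goes out UNSTABLE" is an assumption standing in
for "much larger than wsize". The caller must pass a nonzero wsize.

```c
#include <stddef.h>

/* stable_how values as numbered in RFC 1813 (NFSv3). */
enum stable_how { UNSTABLE = 0, DATA_SYNC = 1, FILE_SYNC = 2 };

/* Hypothetical plan for one application write; not a kernel structure. */
struct write_plan {
    size_t nrequests;        /* WRITE RPCs needed to cover the request */
    enum stable_how stable;  /* stable_how sent with each WRITE */
    int needs_commit;        /* nonzero if a COMMIT must follow */
};

/*
 * Assumption: a request that fits in a single wsize-sized RPC is sent
 * FILE_SYNC; anything larger is split, sent UNSTABLE, and then committed.
 */
static struct write_plan plan_direct_write(size_t len, size_t wsize)
{
    struct write_plan plan;

    plan.nrequests = (len + wsize - 1) / wsize;   /* round up */
    if (plan.nrequests <= 1) {
        plan.stable = FILE_SYNC;
        plan.needs_commit = 0;
    } else {
        plan.stable = UNSTABLE;
        plan.needs_commit = 1;
    }
    return plan;
}
```

With a 32 KB wsize, a 4096-byte write comes out as one FILE_SYNC request,
while a 1 MB write comes out as 32 UNSTABLE requests plus a COMMIT.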

>> A secondary question: if the above is expected, does this occur 
>> because someone is caching the write and is there a mechanism to 
>> disable this effect?
> 
> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think 
> of a way this might be disabled.

Even though an NFS client requests an NFS_FILE_SYNC write, the server 
still has the choice of returning something less, even NFS_UNSTABLE.  In 
general that's a rare occurrence, but is something I've seen in practice.


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 16:37 ` Chuck Lever
  2007-08-06 17:10   ` Chuck Lever
@ 2007-08-06 17:33   ` Peter Staubach
  2007-08-06 17:40     ` Wim Colgate
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Staubach @ 2007-08-06 17:33 UTC (permalink / raw)
  To: chuck.lever; +Cc: nfs, Wim Colgate

Chuck Lever wrote:
> Wim Colgate wrote:
>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, 
>> should I ever expect a callback (nfs_writeback_done) with a 
>> successful task->tk_status (i.e >= 0) with the committed state 
>> (resp->verf->committed) set to NFS_UNSTABLE?
>
> Yes, this can happen if the server decides to return NFS_UNSTABLE. 
> Rare, but possible.
>
>> A secondary question: if the above is expected, does this occur 
>> because someone is caching the write and is there a mechanism to 
>> disable this effect?
>
> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think 
> of a way this might be disabled. 

Actually, it would be a protocol error for a server to return
a commitment level less than was requested by the client.  The
server can return a greater commitment level, but not less than.

       ps


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 17:33   ` Peter Staubach
@ 2007-08-06 17:40     ` Wim Colgate
  2007-08-06 19:16       ` Chuck Lever
  0 siblings, 1 reply; 12+ messages in thread
From: Wim Colgate @ 2007-08-06 17:40 UTC (permalink / raw)
  To: Peter Staubach, chuck.lever; +Cc: nfs

Interesting information.

Specifically I am trying to inject errors by manually (but politely)
bringing the NFS server down then up, then down (rinse and repeat ...)
while doing IO from a linux client. As mentioned the open file is
O_DIRECT and O_SYNC -- which I thought should mean either the data hits
the server's storage or I should get an error; and I'm more than happy
to deal with an IO error.

I'm confident the writes are less than wsize (4096 bytes to be precise).

Is there a 100% guaranteed method to get the behavior I thought O_DIRECT
and O_SYNC was providing?
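
For reference, the write path being exercised looks roughly like the sketch
below. The helper names (`write_block`, `write_one_block`) and the choice to
treat a short write as EIO are illustrative assumptions, not the actual test
program. O_DIRECT generally needs an aligned buffer, hence posix_memalign().

```c
#define _GNU_SOURCE          /* callers need this for O_DIRECT on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define WRITE_SIZE 4096      /* matches the 4096-byte writes described above */

/* Write one aligned block at offset; return 0 or a negative errno. */
static int write_block(int fd, const void *buf, off_t offset)
{
    ssize_t n = pwrite(fd, buf, WRITE_SIZE, offset);

    if (n < 0)
        return -errno;
    if (n != WRITE_SIZE)
        return -EIO;         /* assumption: treat a short write as an I/O error */
    return 0;
}

/* Open path with extra_flags, write one pattern-filled block, report status. */
static int write_one_block(const char *path, int extra_flags,
                           unsigned char pattern)
{
    void *buf;
    int fd, ret;

    /* Direct I/O wants the buffer aligned; use the block size as alignment. */
    if (posix_memalign(&buf, WRITE_SIZE, WRITE_SIZE) != 0)
        return -ENOMEM;
    memset(buf, pattern, WRITE_SIZE);

    fd = open(path, O_WRONLY | O_CREAT | extra_flags, 0644);
    if (fd < 0) {
        ret = -errno;
        free(buf);
        return ret;
    }

    ret = write_block(fd, buf, 0);
    if (close(fd) < 0 && ret == 0)
        ret = -errno;        /* errors can also surface at close() */
    free(buf);
    return ret;
}
```

A call such as `write_one_block("/mnt/nfs/testfile", O_DIRECT | O_SYNC, 0xAB)`
(the path is a placeholder) returns 0 or a negative errno, which is the
"either the data is written or I get an error" contract I was assuming.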

Thanks,

Wim

-----Original Message-----
From: Peter Staubach [mailto:staubach@redhat.com] 
Sent: Monday, August 06, 2007 10:33 AM
To: chuck.lever@oracle.com
Cc: Wim Colgate; nfs@lists.sourceforge.net
Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.

Chuck Lever wrote:
> Wim Colgate wrote:
>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, 
>> should I ever expect a callback (nfs_writeback_done) with a 
>> successful task->tk_status (i.e >= 0) with the committed state 
>> (resp->verf->committed) set to NFS_UNSTABLE?
>
> Yes, this can happen if the server decides to return NFS_UNSTABLE. 
> Rare, but possible.
>
>> A secondary question: if the above is expected, does this occur 
>> because someone is caching the write and is there a mechanism to 
>> disable this effect?
>
> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think
> of a way this might be disabled. 

Actually, it would be a protocol error for a server to return
a commitment level less than was requested by the client.  The
server can return a greater commitment level, but not less than.

       ps


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 17:10   ` Chuck Lever
@ 2007-08-06 18:58     ` Trond Myklebust
  2007-08-06 19:13       ` Chuck Lever
  0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2007-08-06 18:58 UTC (permalink / raw)
  To: chuck.lever; +Cc: nfs, Wim Colgate

On Mon, 2007-08-06 at 13:10 -0400, Chuck Lever wrote:
> Even though an NFS client requests an NFS_FILE_SYNC write, the server 
> still has the choice of returning something less, even NFS_UNSTABLE.  In 
> general that's a rare occurrence, but is something I've seen in practice.

As Peter said, a server that returns anything other than FILE_SYNC to a
FILE_SYNC write request would be in clear violation of the description
of WRITE semantics on page 51 of RFC1813:

      committed
         The server should return an indication of the level of
         commitment of the data and metadata via committed. If
         the server committed all data and metadata to stable
         storage, committed should be set to FILE_SYNC. If the
         level of commitment was at least as strong as
         DATA_SYNC, then committed should be set to DATA_SYNC.
         Otherwise, committed must be returned as UNSTABLE. If
         stable was FILE_SYNC, then committed must also be
         FILE_SYNC: anything else constitutes a protocol
         violation. If stable was DATA_SYNC, then committed may
         be FILE_SYNC or DATA_SYNC: anything else constitutes a
         protocol violation. If stable was UNSTABLE, then
         committed may be either FILE_SYNC, DATA_SYNC, or
         UNSTABLE.
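
Because RFC 1813 numbers the levels UNSTABLE = 0, DATA_SYNC = 1 and
FILE_SYNC = 2, the rule quoted above reduces to "committed must be at least
as strong as stable". A minimal sketch of that check follows; the function
name `committed_is_valid` is invented for illustration and is not a kernel
or pynfs interface.

```c
#include <stdbool.h>

/* stable_how values as numbered in RFC 1813 (NFSv3). */
enum stable_how { UNSTABLE = 0, DATA_SYNC = 1, FILE_SYNC = 2 };

/*
 * True if the committed level in a WRITE reply is permitted for the
 * stable level the client requested:
 *   FILE_SYNC requested -> only FILE_SYNC
 *   DATA_SYNC requested -> DATA_SYNC or FILE_SYNC
 *   UNSTABLE  requested -> any of the three
 */
static bool committed_is_valid(enum stable_how requested,
                               enum stable_how committed)
{
    return committed >= requested && committed <= FILE_SYNC;
}
```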

I see no reason why we should care about supporting such a server.

Trond



* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 18:58     ` Trond Myklebust
@ 2007-08-06 19:13       ` Chuck Lever
  2007-08-06 19:19         ` Trond Myklebust
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever @ 2007-08-06 19:13 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: nfs, Wim Colgate

[-- Attachment #1: Type: text/plain, Size: 1724 bytes --]

Trond Myklebust wrote:
> On Mon, 2007-08-06 at 13:10 -0400, Chuck Lever wrote:
>> Even though an NFS client requests an NFS_FILE_SYNC write, the server 
>> still has the choice of returning something less, even NFS_UNSTABLE.  In 
>> general that's a rare occurrence, but is something I've seen in practice.
> 
> As Peter said, a server that returns anything other than FILE_SYNC to a
> FILE_SYNC write request would be in clear violation of the description
> of WRITE semantics on page 51 of RFC1813:
> 
>       committed
>          The server should return an indication of the level of
>          commitment of the data and metadata via committed. If
>          the server committed all data and metadata to stable
>          storage, committed should be set to FILE_SYNC. If the
>          level of commitment was at least as strong as
>          DATA_SYNC, then committed should be set to DATA_SYNC.
>          Otherwise, committed must be returned as UNSTABLE. If
>          stable was FILE_SYNC, then committed must also be
>          FILE_SYNC: anything else constitutes a protocol
>          violation. If stable was DATA_SYNC, then committed may
>          be FILE_SYNC or DATA_SYNC: anything else constitutes a
>          protocol violation. If stable was UNSTABLE, then
>          committed may be either FILE_SYNC, DATA_SYNC, or
>          UNSTABLE.
> 
> I see no reason why we should care about supporting such a server.

I said nothing about whether the server should or should not return such 
a value.  I just said that it is a possibility, and that I have observed 
the behavior in the field.

The client, if it is a good implementation, needs to check for this 
possibility and throw an error in that case.


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 17:40     ` Wim Colgate
@ 2007-08-06 19:16       ` Chuck Lever
  2007-08-06 19:33         ` Wim Colgate
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever @ 2007-08-06 19:16 UTC (permalink / raw)
  To: Wim Colgate; +Cc: nfs

[-- Attachment #1: Type: text/plain, Size: 2111 bytes --]

Wim Colgate wrote:
> Specifically I am trying to inject errors by manually (but politely)
> bringing the NFS server down then up, then down (rinse and repeat ...)
> while doing IO from a linux client. As mentioned the open file is
> O_DIRECT and O_SYNC -- which I thought should mean either the data hits
> the server's storage or I should get an error; and I'm more than happy
> to deal with an IO error.
> 
> I'm confident the writes are less than wsize (4096 bytes to be precise).
> 
> 
> Is there a 100% guaranteed method to get the behavior I thought O_DIRECT
> and O_SYNC was providing?

What behavior did you expect O_DIRECT + O_SYNC to provide?  O_DIRECT 
means "don't cache data" and O_SYNC means "make sure the data is flushed 
to the server's disk before each write() system call returns." 
Technically, you don't need NFS_FILE_SYNC writes to do either of those.

Which kernel are you testing?  The client's use of NFS_FILE_SYNC writes 
changed over time.

> -----Original Message-----
> From: Peter Staubach [mailto:staubach@redhat.com] 
> Sent: Monday, August 06, 2007 10:33 AM
> To: chuck.lever@oracle.com
> Cc: Wim Colgate; nfs@lists.sourceforge.net
> Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.
> 
> Chuck Lever wrote:
>> Wim Colgate wrote:
>>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, 
>>> should I ever expect a callback (nfs_writeback_done) with a 
>>> successful task->tk_status (i.e >= 0) with the committed state 
>>> (resp->verf->committed) set to NFS_UNSTABLE?
>> Yes, this can happen if the server decides to return NFS_UNSTABLE. 
>> Rare, but possible.
>>
>>> A secondary question: if the above is expected, does this occur 
>>> because someone is caching the write and is there a mechanism to 
>>> disable this effect?
>> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think
>> of a way this might be disabled. 
> 
> Actually, it would be a protocol error for a server to return
> a commitment level less than was requested by the client.  The
> server can return a greater commitment level, but not less than.
> 
>        ps


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 19:13       ` Chuck Lever
@ 2007-08-06 19:19         ` Trond Myklebust
  2007-08-06 19:35           ` Chuck Lever
  0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2007-08-06 19:19 UTC (permalink / raw)
  To: chuck.lever; +Cc: nfs, Wim Colgate

On Mon, 2007-08-06 at 15:13 -0400, Chuck Lever wrote:
> Trond Myklebust wrote:
> > On Mon, 2007-08-06 at 13:10 -0400, Chuck Lever wrote:
> >> Even though an NFS client requests an NFS_FILE_SYNC write, the server 
> >> still has the choice of returning something less, even NFS_UNSTABLE.  In 
> >> general that's a rare occurrence, but is something I've seen in practice.
> > 
> > As Peter said, a server that returns anything other than FILE_SYNC to a
> > FILE_SYNC write request would be in clear violation of the description
> > of WRITE semantics on page 51 of RFC1813:
> > 
> >       committed
> >          The server should return an indication of the level of
> >          commitment of the data and metadata via committed. If
> >          the server committed all data and metadata to stable
> >          storage, committed should be set to FILE_SYNC. If the
> >          level of commitment was at least as strong as
> >          DATA_SYNC, then committed should be set to DATA_SYNC.
> >          Otherwise, committed must be returned as UNSTABLE. If
> >          stable was FILE_SYNC, then committed must also be
> >          FILE_SYNC: anything else constitutes a protocol
> >          violation. If stable was DATA_SYNC, then committed may
> >          be FILE_SYNC or DATA_SYNC: anything else constitutes a
> >          protocol violation. If stable was UNSTABLE, then
> >          committed may be either FILE_SYNC, DATA_SYNC, or
> >          UNSTABLE.
> > 
> > I see no reason why we should care about supporting such a server.
> 
> I said nothing about whether the server should or should not return such 
> a value.  I just said that it is a possibility, and that I have observed 
> the behavior in the field.
> 
> The client, if it is a good implementation, needs to check for this 
> possibility and throw an error in that case.

We have never supported servers that blatantly violate the protocol, and
I see no reason to be burdening the client with a whole load of checks
for server protocol violations either.

If you want a tool for testing servers, then use something like pynfs.

Trond



* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 19:16       ` Chuck Lever
@ 2007-08-06 19:33         ` Wim Colgate
  2007-08-06 19:42           ` Chuck Lever
  0 siblings, 1 reply; 12+ messages in thread
From: Wim Colgate @ 2007-08-06 19:33 UTC (permalink / raw)
  To: chuck.lever; +Cc: nfs

Hi Chuck,

The linux kernel I was using is 2.6.18-8.

To be fair, I was not trying to force NFS_FILE_SYNC; to make a long
story short, I started with O_DIRECT (please don't cache data). I moved
to add O_SYNC (don't return until my data is written safely). And when I
couldn't explain why I was missing some data (discrepancy between client
and server), I started investigating what was happening under the hood.
I didn't really want to start a controversy -- I just wanted to
understand what was happening.

My understanding of NFS is fairly pedestrian in that I merely get the
big picture. Which is why I posted here.

Thanks,

Wim

-----Original Message-----
From: Chuck Lever [mailto:chuck.lever@oracle.com] 
Sent: Monday, August 06, 2007 12:16 PM
To: Wim Colgate
Cc: nfs@lists.sourceforge.net
Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.

Wim Colgate wrote:
> Specifically I am trying to inject errors by manually (but politely)
> bringing the NFS server down then up, then down (rinse and repeat ...)
> while doing IO from a linux client. As mentioned the open file is
> O_DIRECT and O_SYNC -- which I thought should mean either the data hits
> the server's storage or I should get an error; and I'm more than happy
> to deal with an IO error.
> 
> I'm confident the writes are less than wsize (4096 bytes to be precise).
> 
> 
> Is there a 100% guaranteed method to get the behavior I thought O_DIRECT
> and O_SYNC was providing?

What behavior did you expect O_DIRECT + O_SYNC to provide?  O_DIRECT 
means "don't cache data" and O_SYNC means "make sure the data is flushed 
to the server's disk before each write() system call returns." 
Technically, you don't need NFS_FILE_SYNC writes to do either of those.

Which kernel are you testing?  The client's use of NFS_FILE_SYNC writes 
changed over time.

> -----Original Message-----
> From: Peter Staubach [mailto:staubach@redhat.com] 
> Sent: Monday, August 06, 2007 10:33 AM
> To: chuck.lever@oracle.com
> Cc: Wim Colgate; nfs@lists.sourceforge.net
> Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.
> 
> Chuck Lever wrote:
>> Wim Colgate wrote:
>>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, 
>>> should I ever expect a callback (nfs_writeback_done) with a 
>>> successful task->tk_status (i.e >= 0) with the committed state 
>>> (resp->verf->committed) set to NFS_UNSTABLE?
>> Yes, this can happen if the server decides to return NFS_UNSTABLE. 
>> Rare, but possible.
>>
>>> A secondary question: if the above is expected, does this occur 
>>> because someone is caching the write and is there a mechanism to 
>>> disable this effect?
>> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think
>> of a way this might be disabled. 
> 
> Actually, it would be a protocol error for a server to return
> a commitment level less than was requested by the client.  The
> server can return a greater commitment level, but not less than.
> 
>        ps


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 19:19         ` Trond Myklebust
@ 2007-08-06 19:35           ` Chuck Lever
  0 siblings, 0 replies; 12+ messages in thread
From: Chuck Lever @ 2007-08-06 19:35 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: nfs, Wim Colgate

[-- Attachment #1: Type: text/plain, Size: 1336 bytes --]

Trond Myklebust wrote:
>>> I see no reason why we should care about supporting such a server.
>> I said nothing about whether the server should or should not return such 
>> a value.  I just said that it is a possibility, and that I have observed 
>> the behavior in the field.
>>
>> The client, if it is a good implementation, needs to check for this 
>> possibility and throw an error in that case.
> 
> We have never supported servers that blatantly violate the protocol, and
> I see no reason to be burdening the client with a whole load of checks
> for server protocol violations either.

Take a look at nfs_writeback_done().  You'll see a specific check for a 
"known bug in Tru64 < 5.0" that is exactly the check to see if the 
server is violating this part of the protocol.  Tru64 is not the only 
server where this can happen.

And in fact, the behavior *is* actually "supported" by the client -- I 
believe Linux will attempt to post a COMMIT if the server returns 
NFS_UNSTABLE to *any* write the client has done.
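
To illustrate the two separate reactions I am describing (noticing the
violation, and still recovering with a COMMIT), here is a rough sketch. It
is not the code in nfs_writeback_done(); `write_verdict` and
`judge_write_reply` are invented names, and the sketch assumes a COMMIT is
posted only when the reply says UNSTABLE.

```c
#include <stdbool.h>

/* stable_how values as numbered in RFC 1813 (NFSv3). */
enum stable_how { UNSTABLE = 0, DATA_SYNC = 1, FILE_SYNC = 2 };

/* Hypothetical summary of what a client does with one WRITE reply. */
struct write_verdict {
    bool protocol_violation;  /* server committed less than was requested */
    bool send_commit;         /* data is not yet stable; follow up with COMMIT */
};

static struct write_verdict judge_write_reply(enum stable_how requested,
                                              enum stable_how committed)
{
    struct write_verdict v;

    /* A weaker level than requested is what RFC 1813 calls a violation. */
    v.protocol_violation = committed < requested;

    /* Assumption: an UNSTABLE reply is followed by a COMMIT whatever was
     * requested, so a misbehaving server still ends up flushing the data. */
    v.send_commit = (committed == UNSTABLE);
    return v;
}
```

So a FILE_SYNC request answered with UNSTABLE yields both a warning-worthy
violation and a COMMIT, which is the case Wim asked about.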

> If you want a tool for testing servers, then use something like pynfs.

No argument there.  Wim, you can precisely determine the request stream 
a server sees with a test client such as pynfs.  However, it appeared 
that you were looking at whole system client-server interaction during 
server instability.


* Re: NFS_UNSTABLE vs. FILE and DATA sync.
  2007-08-06 19:33         ` Wim Colgate
@ 2007-08-06 19:42           ` Chuck Lever
  0 siblings, 0 replies; 12+ messages in thread
From: Chuck Lever @ 2007-08-06 19:42 UTC (permalink / raw)
  To: Wim Colgate; +Cc: nfs

[-- Attachment #1: Type: text/plain, Size: 3545 bytes --]

Wim Colgate wrote:
> The linux kernel I was using is 2.6.18-8.
> 
> To be fair, I was not trying to force NFS_FILE_SYNC; to make a long
> story short, I started with O_DIRECT (please don't cache data). I moved
> to add O_SYNC (don't return until my data is written safely). And when I
> couldn't explain why I was missing some data (discrepancy between client
> and server), I started investigating what was happening under the hood.

In fact O_DIRECT also guarantees that the data is on the server's disk 
before the write() call returns.  In some older versions of the client, 
O_SYNC forced the direct I/O engine to use NFS_FILE_SYNC writes for 
everything.  I don't think that logic is there any more.

But what you describe above is a bug.  A network dump would be the next 
step to understand the true interaction between the client and the 
server during a server reboot.

There were some bugs in the client's direct I/O engine where server 
reboot recovery might result in data loss.  Trond fixed a couple of bugs 
in this area around 2.6.19 or 20.  It would be interesting if you tested 
a later kernel, just for behavioral comparison.

> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@oracle.com] 
> Sent: Monday, August 06, 2007 12:16 PM
> To: Wim Colgate
> Cc: nfs@lists.sourceforge.net
> Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.
> 
> Wim Colgate wrote:
>> Specifically I am trying to inject errors by manually (but politely)
>> bringing the NFS server down then up, then down (rinse and repeat ...)
>> while doing IO from a linux client. As mentioned the open file is
>> O_DIRECT and O_SYNC -- which I thought should mean either the data hits
>> the server's storage or I should get an error; and I'm more than happy
>> to deal with an IO error.
>>
>> I'm confident the writes are less than wsize (4096 bytes to be precise).
>>
>> Is there a 100% guaranteed method to get the behavior I thought O_DIRECT
>> and O_SYNC was providing?
> 
> What behavior did you expect O_DIRECT + O_SYNC to provide?  O_DIRECT 
> means "don't cache data" and O_SYNC means "make sure the data is flushed
> to the server's disk before each write() system call returns." 
> Technically, you don't need NFS_FILE_SYNC writes to do either of those.
> 
> Which kernel are you testing?  The client's use of NFS_FILE_SYNC writes 
> changed over time.
> 
>> -----Original Message-----
>> From: Peter Staubach [mailto:staubach@redhat.com] 
>> Sent: Monday, August 06, 2007 10:33 AM
>> To: chuck.lever@oracle.com
>> Cc: Wim Colgate; nfs@lists.sourceforge.net
>> Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.
>>
>> Chuck Lever wrote:
>>> Wim Colgate wrote:
>>>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, 
>>>> should I ever expect a callback (nfs_writeback_done) with a 
>>>> successful task->tk_status (i.e >= 0) with the committed state 
>>>> (resp->verf->committed) set to NFS_UNSTABLE?
>>> Yes, this can happen if the server decides to return NFS_UNSTABLE. 
>>> Rare, but possible.
>>>
>>>> A secondary question: if the above is expected, does this occur 
>>>> because someone is caching the write and is there a mechanism to 
>>>> disable this effect?
>>> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think
>>> of a way this might be disabled. 
>> Actually, it would be a protocol error for a server to return
>> a commitment level less than was requested by the client.  The
>> server can return a greater commitment level, but not less than.
>>
>>        ps


end of thread, other threads:[~2007-08-06 20:29 UTC | newest]

Thread overview: 12+ messages
2007-08-06 16:02 NFS_UNSTABLE vs. FILE and DATA sync Wim Colgate
2007-08-06 16:37 ` Chuck Lever
2007-08-06 17:10   ` Chuck Lever
2007-08-06 18:58     ` Trond Myklebust
2007-08-06 19:13       ` Chuck Lever
2007-08-06 19:19         ` Trond Myklebust
2007-08-06 19:35           ` Chuck Lever
2007-08-06 17:33   ` Peter Staubach
2007-08-06 17:40     ` Wim Colgate
2007-08-06 19:16       ` Chuck Lever
2007-08-06 19:33         ` Wim Colgate
2007-08-06 19:42           ` Chuck Lever
