* NFS_UNSTABLE vs. FILE and DATA sync.
From: Wim Colgate @ 2007-08-06 16:02 UTC
To: nfs
Hi,
My last question was a bit too nebulous, and I didn't really expect an
answer. Having learned my lesson, I have a specific question.
If I have a soft mount, and open a file with O_DIRECT and O_SYNC, should
I ever expect a callback (nfs_writeback_done) with a successful
task->tk_status (i.e., >= 0) with the committed state
(resp->verf->committed) set to NFS_UNSTABLE?
A secondary question: if the above is expected, does this occur because
someone is caching the write, and is there a mechanism to disable this
effect?
Regards,
Wim
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Chuck Lever @ 2007-08-06 16:37 UTC
To: Wim Colgate; +Cc: nfs
Wim Colgate wrote:
> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, should
> I ever expect a callback (nfs_writeback_done) with a successful
> task->tk_status (i.e >= 0) with the committed state
> (resp->verf->committed) set to NFS_UNSTABLE?
Yes, this can happen if the server decides to return NFS_UNSTABLE.
Rare, but possible.
> A secondary question: if the above is expected, does this occur because
> someone is caching the write and is there a mechanism to disable this
> effect?
Servers can return NFS_UNSTABLE to any WRITE request, so I can't think
of a way this might be disabled.
[-- Attachment #2: chuck.lever.vcf --]
[-- Type: text/x-vcard, Size: 290 bytes --]
begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture: Linux Projects Group
adr:;;1015 Granger Avenue;Ann Arbor;MI;48104;USA
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
url:http://oss.oracle.com/~cel
version:2.1
end:vcard
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Chuck Lever @ 2007-08-06 17:10 UTC
To: Wim Colgate; +Cc: nfs
Chuck Lever wrote:
> Wim Colgate wrote:
>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC,
>> should I ever expect a callback (nfs_writeback_done) with a successful
>> task->tk_status (i.e >= 0) with the committed state
>> (resp->verf->committed) set to NFS_UNSTABLE?
>
> Yes, this can happen if the server decides to return NFS_UNSTABLE. Rare,
> but possible.
Let me be more clear about this.
O_DIRECT and O_SYNC determine client behavior only. However, they don't
necessarily force NFS_FILE_SYNC writes all the time. For example, if an
application issues a direct write request that is much larger than the
current wsize, the Linux NFS client will send the write using
NFS_UNSTABLE requests followed by a COMMIT. That makes it easier for
the server to schedule disk writes more efficiently.
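As a rough illustration of the behavior described above (a simplified model, not the kernel's actual code), the following Python sketch plans the requests for one direct write. The function name plan_direct_write and the rule that a request fitting in a single wsize is sent as FILE_SYNC are assumptions made for the example; the enum values follow the stable_how definition in RFC 1813.

```python
from enum import IntEnum


class StableHow(IntEnum):
    """Commitment levels, numbered as in RFC 1813 (stable_how)."""
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


def plan_direct_write(length, wsize):
    """Return the (op, offset, count, stable) requests for one direct write.

    Hypothetical model: a write that fits in one wsize goes out as a single
    FILE_SYNC WRITE; a larger write is split into UNSTABLE WRITEs of at most
    wsize bytes, followed by one COMMIT covering the whole range.
    """
    if length <= 0 or wsize <= 0:
        raise ValueError("length and wsize must be positive")
    if length <= wsize:
        return [("WRITE", 0, length, StableHow.FILE_SYNC)]
    ops = []
    for offset in range(0, length, wsize):
        count = min(wsize, length - offset)
        ops.append(("WRITE", offset, count, StableHow.UNSTABLE))
    ops.append(("COMMIT", 0, length, None))
    return ops
```

In this model the data is only known to be on stable storage once the final COMMIT succeeds, which is why the write() call cannot return before then.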
>> A secondary question: if the above is expected, does this occur
>> because someone is caching the write and is there a mechanism to
>> disable this effect?
>
> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think
> of a way this might be disabled.
Even though an NFS client requests an NFS_FILE_SYNC write, the server
still has the choice of returning something less, even NFS_UNSTABLE. In
general that's a rare occurrence, but is something I've seen in practice.
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Peter Staubach @ 2007-08-06 17:33 UTC
To: chuck.lever; +Cc: nfs, Wim Colgate
Chuck Lever wrote:
> Wim Colgate wrote:
>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC,
>> should I ever expect a callback (nfs_writeback_done) with a
>> successful task->tk_status (i.e >= 0) with the committed state
>> (resp->verf->committed) set to NFS_UNSTABLE?
>
> Yes, this can happen if the server decides to return NFS_UNSTABLE.
> Rare, but possible.
>
>> A secondary question: if the above is expected, does this occur
>> because someone is caching the write and is there a mechanism to
>> disable this effect?
>
> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think
> of a way this might be disabled.
Actually, it would be a protocol error for a server to return
a commitment level lower than the one requested by the client. The
server can return a higher commitment level, but not a lower one.
ps
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Wim Colgate @ 2007-08-06 17:40 UTC
To: Peter Staubach, chuck.lever; +Cc: nfs
Interesting information.
Specifically, I am trying to inject errors by manually (but politely)
bringing the NFS server down, then up, then down (rinse and repeat ...)
while doing IO from a Linux client. As mentioned, the open file is
O_DIRECT and O_SYNC -- which I thought should mean either the data hits
the server's storage or I should get an error; and I'm more than happy
to deal with an IO error.
I'm confident the writes are no larger than wsize (4096 bytes, to be precise).
Is there a 100% guaranteed method to get the behavior I thought O_DIRECT
and O_SYNC were providing?
Thanks,
Wim
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Trond Myklebust @ 2007-08-06 18:58 UTC
To: chuck.lever; +Cc: nfs, Wim Colgate
On Mon, 2007-08-06 at 13:10 -0400, Chuck Lever wrote:
> Even though an NFS client requests an NFS_FILE_SYNC write, the server
> still has the choice of returning something less, even NFS_UNSTABLE. In
> general that's a rare occurrence, but is something I've seen in practice.
As Peter said, a server that returns anything other than FILE_SYNC to a
FILE_SYNC write request would be in clear violation of the description
of WRITE semantics on page 51 of RFC1813:
committed
The server should return an indication of the level of
commitment of the data and metadata via committed. If
the server committed all data and metadata to stable
storage, committed should be set to FILE_SYNC. If the
level of commitment was at least as strong as
DATA_SYNC, then committed should be set to DATA_SYNC.
Otherwise, committed must be returned as UNSTABLE. If
stable was FILE_SYNC, then committed must also be
FILE_SYNC: anything else constitutes a protocol
violation. If stable was DATA_SYNC, then committed may
be FILE_SYNC or DATA_SYNC: anything else constitutes a
protocol violation. If stable was UNSTABLE, then
committed may be either FILE_SYNC, DATA_SYNC, or
UNSTABLE.
I see no reason why we should care about supporting such a server.
Trond
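The rule in the RFC 1813 passage quoted above can be sketched as a small check. This is only an illustration in Python; the names StableHow and committed_is_valid are made up for the example. Because the three levels are ordered (UNSTABLE < DATA_SYNC < FILE_SYNC, numbered as in the RFC), the quoted cases reduce to "committed must be at least as strong as stable".

```python
from enum import IntEnum


class StableHow(IntEnum):
    """Commitment levels, numbered as in RFC 1813 (stable_how)."""
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


def committed_is_valid(stable, committed):
    """True if a WRITE reply's committed level is allowed for the request.

    stable is the level the client asked for, committed is what the
    server reported. A reply weaker than the request is a protocol
    violation according to the quoted text.
    """
    return StableHow(committed) >= StableHow(stable)
```

For example, committed_is_valid(StableHow.FILE_SYNC, StableHow.UNSTABLE) is False, which is the case under discussion in this thread.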
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Chuck Lever @ 2007-08-06 19:13 UTC
To: Trond Myklebust; +Cc: nfs, Wim Colgate
Trond Myklebust wrote:
> On Mon, 2007-08-06 at 13:10 -0400, Chuck Lever wrote:
>> Even though an NFS client requests an NFS_FILE_SYNC write, the server
>> still has the choice of returning something less, even NFS_UNSTABLE. In
>> general that's a rare occurrence, but is something I've seen in practice.
>
> As Peter said, a server that return anything other than FILE_SYNC to a
> FILE_SYNC write request would be in clear violation of the description
> of WRITE semantics on page 51 of RFC1813:
>
> committed
> The server should return an indication of the level of
> commitment of the data and metadata via committed. If
> the server committed all data and metadata to stable
> storage, committed should be set to FILE_SYNC. If the
> level of commitment was at least as strong as
> DATA_SYNC, then committed should be set to DATA_SYNC.
> Otherwise, committed must be returned as UNSTABLE. If
> stable was FILE_SYNC, then committed must also be
> FILE_SYNC: anything else constitutes a protocol
> violation. If stable was DATA_SYNC, then committed may
> be FILE_SYNC or DATA_SYNC: anything else constitutes a
> protocol violation. If stable was UNSTABLE, then
> committed may be either FILE_SYNC, DATA_SYNC, or
> UNSTABLE.
>
> I see no reason why we should care about supporting such a server.
I said nothing about whether the server should or should not return such
a value. I just said that it is a possibility, and that I have observed
the behavior in the field.
The client, if it is a good implementation, needs to check for this
possibility and throw an error in that case.
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Chuck Lever @ 2007-08-06 19:16 UTC
To: Wim Colgate; +Cc: nfs
Wim Colgate wrote:
> Specifically I am trying to inject errors by manually (but politely)
> bringing the NFS server down then up, then down (rinse and repeat ...)
> while doing IO from a linux client. As mentioned the open file is
> O_DIRECT and O_SYNC -- which I thought should mean either the data hits
> the server's storage or I should get an error; and I'm more than happy
> to deal with an IO error.
>
> I'm confident the writes are less than wsize (4096 bytes to be precise).
>
>
> Is there a 100% guaranteed method to get the behavior I thought O_DIRECT
> and O_SYNC was providing?
What behavior did you expect O_DIRECT + O_SYNC to provide? O_DIRECT
means "don't cache data" and O_SYNC means "make sure the data is flushed
to the server's disk before each write() system call returns."
Technically, you don't need NFS_FILE_SYNC writes to do either of those.
Which kernel are you testing? The client's use of NFS_FILE_SYNC writes
changed over time.
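For readers who want to reproduce the application side of this setup, here is a minimal sketch of a single 4096-byte write through a descriptor opened with O_DIRECT and O_SYNC. It is written in Python for brevity; the helper name write_block, the BLOCK constant, and the fallback to plain O_SYNC when the filesystem rejects O_DIRECT with EINVAL are all assumptions for illustration, not anything taken from Wim's test program.

```python
import errno
import mmap
import os

BLOCK = 4096  # assumed write size and alignment, matching the 4096-byte writes above


def write_block(path, payload, use_direct=True):
    """Write one zero-padded BLOCK at offset 0 and return the byte count.

    O_DIRECT generally needs an aligned buffer, so the data is staged in
    an anonymous mmap, which is page-aligned and zero-filled.
    """
    if len(payload) > BLOCK:
        raise ValueError("payload larger than one block")
    buf = mmap.mmap(-1, BLOCK)
    buf[:len(payload)] = payload
    flags = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    # os.O_DIRECT is not defined on every platform, hence getattr.
    direct = getattr(os, "O_DIRECT", 0) if use_direct else 0
    try:
        try:
            fd = os.open(path, flags | direct, 0o600)
        except OSError as exc:
            if exc.errno != errno.EINVAL or not direct:
                raise
            # Assumed fallback: the filesystem does not support O_DIRECT.
            fd = os.open(path, flags, 0o600)
        try:
            # With O_SYNC, a successful return means the data was flushed;
            # a failure surfaces here as OSError.
            return os.write(fd, buf)
        finally:
            os.close(fd)
    finally:
        buf.close()
```

A test harness would call write_block() in a loop while the server is cycled and record every OSError, then compare the blocks that reported success against what the server actually holds.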
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Trond Myklebust @ 2007-08-06 19:19 UTC
To: chuck.lever; +Cc: nfs, Wim Colgate
On Mon, 2007-08-06 at 15:13 -0400, Chuck Lever wrote:
> Trond Myklebust wrote:
> > On Mon, 2007-08-06 at 13:10 -0400, Chuck Lever wrote:
> >> Even though an NFS client requests an NFS_FILE_SYNC write, the server
> >> still has the choice of returning something less, even NFS_UNSTABLE. In
> >> general that's a rare occurrence, but is something I've seen in practice.
> >
> > As Peter said, a server that return anything other than FILE_SYNC to a
> > FILE_SYNC write request would be in clear violation of the description
> > of WRITE semantics on page 51 of RFC1813:
> >
> > committed
> > The server should return an indication of the level of
> > commitment of the data and metadata via committed. If
> > the server committed all data and metadata to stable
> > storage, committed should be set to FILE_SYNC. If the
> > level of commitment was at least as strong as
> > DATA_SYNC, then committed should be set to DATA_SYNC.
> > Otherwise, committed must be returned as UNSTABLE. If
> > stable was FILE_SYNC, then committed must also be
> > FILE_SYNC: anything else constitutes a protocol
> > violation. If stable was DATA_SYNC, then committed may
> > be FILE_SYNC or DATA_SYNC: anything else constitutes a
> > protocol violation. If stable was UNSTABLE, then
> > committed may be either FILE_SYNC, DATA_SYNC, or
> > UNSTABLE.
> >
> > I see no reason why we should care about supporting such a server.
>
> I said nothing about whether the server should or should not return such
> a value. I just said that it is a possibility, and that I have observed
> the behavior in the field.
>
> The client, if it is a good implementation, needs to check for this
> possibility and throw an error in that case.
We have never supported servers that blatantly violate the protocol, and
I see no reason to be burdening the client with a whole load of checks
for server protocol violations either.
If you want a tool for testing servers, then use something like pynfs.
Trond
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Wim Colgate @ 2007-08-06 19:33 UTC
To: chuck.lever; +Cc: nfs
Hi Chuck,
The Linux kernel I was using is 2.6.18-8.
To be fair, I was not trying to force NFS_FILE_SYNC. To make a long
story short, I started with O_DIRECT (please don't cache data), then
added O_SYNC (don't return until my data is written safely). When I
couldn't explain why I was missing some data (a discrepancy between client
and server), I started investigating what was happening under the hood.
I didn't really want to start a controversy -- I just wanted to
understand what was happening.
My understanding of NFS is fairly pedestrian in that I merely get the
big picture. Which is why I posted here.
Thanks,
Wim
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Chuck Lever @ 2007-08-06 19:35 UTC
To: Trond Myklebust; +Cc: nfs, Wim Colgate
Trond Myklebust wrote:
>>> I see no reason why we should care about supporting such a server.
>> I said nothing about whether the server should or should not return such
>> a value. I just said that it is a possibility, and that I have observed
>> the behavior in the field.
>>
>> The client, if it is a good implementation, needs to check for this
>> possibility and throw an error in that case.
>
> We have never supported servers that blatantly violate the protocol, and
> I see no reason to be burdening the client with a whole load of checks
> for server protocol violations either.
Take a look at nfs_writeback_done(). You'll see a specific check for a
"known bug in Tru64 < 5.0" that is exactly the check to see if the
server is violating this part of the protocol. Tru64 is not the only
server where this can happen.
And in fact, the behavior *is* actually "supported" by the client -- I
believe Linux will attempt to post a COMMIT if the server returns
NFS_UNSTABLE to *any* write the client has done.
> If you want a tool for testing servers, then use something like pynfs.
No argument there. Wim, you can precisely determine the request stream
a server sees with a test client such as pynfs. However, it appeared
that you were looking at whole system client-server interaction during
server instability.
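The client-side handling described in this message (notice a reply weaker than the request, and follow any NFS_UNSTABLE reply with a COMMIT) can be modeled roughly as below. This is a Python sketch of the behavior as stated in the thread, not a transcription of nfs_writeback_done(); the function name handle_write_reply and the action strings are hypothetical.

```python
from enum import IntEnum


class StableHow(IntEnum):
    """Commitment levels, numbered as in RFC 1813 (stable_how)."""
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


def handle_write_reply(requested, committed):
    """Return the follow-up actions for a successful WRITE reply.

    "warn":   the server committed less than was requested, which
              RFC 1813 calls a protocol violation.
    "commit": the data may still be only in the server's cache, so a
              COMMIT has to be sent before the write counts as stable.
    """
    actions = []
    if committed < requested:
        actions.append("warn")
    if committed == StableHow.UNSTABLE:
        actions.append("commit")
    return actions
```

Under this model a misbehaving server costs the client an extra COMMIT round trip, but the data is not silently treated as stable.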
* Re: NFS_UNSTABLE vs. FILE and DATA sync.
From: Chuck Lever @ 2007-08-06 19:42 UTC
To: Wim Colgate; +Cc: nfs
Wim Colgate wrote:
> The linux kernel I was using is 2.6.18-8.
>
> To be fair, I was not trying to force NFS_FILE_SYNC; to make a long
> story short, I started with O_DIRECT (please don't cache data). I moved
> to add O_SYNC (don't return until my data is written safely). And when I
> couldn't explain why I was missing some data (discrepancy between client
> and server), I started investigating what was happening under the hood.
In fact O_DIRECT also guarantees that the data is on the server's disk
before the write() call returns. In some older versions of the client,
O_SYNC forced the direct I/O engine to use NFS_FILE_SYNC writes for
everything. I don't think that logic is there any more.
But what you describe above is a bug. A network dump would be the next
step to understand the true interaction between the client and the
server during a server reboot.
There were some bugs in the client's direct I/O engine where server
reboot recovery might result in data loss. Trond fixed a couple of bugs
in this area around 2.6.19 or 2.6.20. It would be interesting if you tested
a later kernel, just for behavioral comparison.
Thread overview: 12+ messages
2007-08-06 16:02 NFS_UNSTABLE vs. FILE and DATA sync Wim Colgate
2007-08-06 16:37 ` Chuck Lever
2007-08-06 17:10   ` Chuck Lever
2007-08-06 18:58     ` Trond Myklebust
2007-08-06 19:13       ` Chuck Lever
2007-08-06 19:19         ` Trond Myklebust
2007-08-06 19:35           ` Chuck Lever
2007-08-06 17:33   ` Peter Staubach
2007-08-06 17:40     ` Wim Colgate
2007-08-06 19:16       ` Chuck Lever
2007-08-06 19:33         ` Wim Colgate
2007-08-06 19:42           ` Chuck Lever