Wim Colgate wrote:
> The Linux kernel I was using is 2.6.18-8.
> 
> To be fair, I was not trying to force NFS_FILE_SYNC; to make a long
> story short, I started with O_DIRECT (please don't cache data). I moved
> to add O_SYNC (don't return until my data is written safely). And when I
> couldn't explain why I was missing some data (discrepancy between client
> and server), I started investigating what was happening under the hood.

In fact, O_DIRECT also guarantees that the data is on the server's disk 
before the write() call returns.  In some older versions of the client, 
O_SYNC forced the direct I/O engine to use NFS_FILE_SYNC writes for 
everything.  I don't think that logic is there any more.
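
For reference, here is a minimal sketch of the kind of write being discussed: one 4096-byte write with caller-supplied open flags and every return value checked. It is not the original poster's test program. The helper name write_block, the BLOCK_SIZE constant, and the choice of 4096 for both the write size and the buffer alignment are assumptions made for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096      /* assumed write size and buffer alignment */

/*
 * Hypothetical helper: write one BLOCK_SIZE block filled with `fill`
 * to `path`.  `extra_flags` is OR-ed into the open() flags, for
 * example O_DIRECT | O_SYNC.  Returns 0 on success or a negative
 * errno value on failure.
 */
static int write_block(const char *path, int extra_flags, unsigned char fill)
{
    void *buf = NULL;
    int rc = posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE);
    if (rc != 0)
        return -rc;          /* posix_memalign returns the error directly */

    /* Direct I/O generally wants an aligned user buffer. */
    assert((uintptr_t)buf % BLOCK_SIZE == 0);
    memset(buf, fill, BLOCK_SIZE);

    int fd = open(path, O_WRONLY | O_CREAT | extra_flags, 0644);
    if (fd < 0) {
        rc = -errno;
        free(buf);
        return rc;
    }

    ssize_t n = write(fd, buf, BLOCK_SIZE);
    if (n < 0)
        rc = -errno;         /* the write itself failed */
    else if (n != BLOCK_SIZE)
        rc = -EIO;           /* treat a short write as an error here */
    else
        rc = 0;

    /* close() can also report an error, so do not ignore it. */
    if (close(fd) < 0 && rc == 0)
        rc = -errno;

    free(buf);
    return rc;
}
```

A caller reproducing the scenario above would pass O_DIRECT | O_SYNC as extra_flags and treat any nonzero return as an I/O error. A return of 0 only says what the client reported, so comparing against the file on the server is still needed to detect the discrepancy described.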

But what you describe above is a bug.  A network dump would be the next 
step to understand the true interaction between the client and the 
server during a server reboot.

There were some bugs in the client's direct I/O engine where server 
reboot recovery might result in data loss.  Trond fixed a couple of bugs 
in this area around 2.6.19 or 2.6.20.  It would be interesting if you 
tested a later kernel, just for behavioral comparison.

> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@oracle.com] 
> Sent: Monday, August 06, 2007 12:16 PM
> To: Wim Colgate
> Cc: nfs@lists.sourceforge.net
> Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.
> 
> Wim Colgate wrote:
>> Specifically I am trying to inject errors by manually (but politely)
>> bringing the NFS server down then up, then down (rinse and repeat ...)
>> while doing IO from a Linux client. As mentioned, the open file is
>> O_DIRECT and O_SYNC -- which I thought should mean either the data
>> hits the server's storage or I should get an error; and I'm more than
>> happy to deal with an IO error.
>>
>> I'm confident the writes are less than wsize (4096 bytes to be
>> precise).
>>
>> Is there a 100% guaranteed method to get the behavior I thought
>> O_DIRECT and O_SYNC were providing?
> 
> What behavior did you expect O_DIRECT + O_SYNC to provide?  O_DIRECT 
> means "don't cache data" and O_SYNC means "make sure the data is flushed
> to the server's disk before each write() system call returns." 
> Technically, you don't need NFS_FILE_SYNC writes to do either of those.
> 
> Which kernel are you testing?  The client's use of NFS_FILE_SYNC writes 
> changed over time.
> 
>> -----Original Message-----
>> From: Peter Staubach [mailto:staubach@redhat.com] 
>> Sent: Monday, August 06, 2007 10:33 AM
>> To: chuck.lever@oracle.com
>> Cc: Wim Colgate; nfs@lists.sourceforge.net
>> Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.
>>
>> Chuck Lever wrote:
>>> Wim Colgate wrote:
>>>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC, 
>>>> should I ever expect a callback (nfs_writeback_done) with a 
>>>> successful task->tk_status (i.e., >= 0) with the committed state 
>>>> (resp->verf->committed) set to NFS_UNSTABLE?
>>> Yes, this can happen if the server decides to return NFS_UNSTABLE. 
>>> Rare, but possible.
>>>
>>>> A secondary question: if the above is expected, does this occur 
>>>> because someone is caching the write and is there a mechanism to 
>>>> disable this effect?
>>> Servers can return NFS_UNSTABLE to any WRITE request, so I can't
>>> think of a way this might be disabled.
>> Actually, it would be a protocol error for a server to return
>> a commitment level lower than the one the client requested.  The
>> server can return a higher commitment level, but not a lower one.
>>
>>        ps
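
The commitment-level rule Peter describes can be sketched as a simple ordering check. The numeric values below follow the NFSv3 stable_how definition (UNSTABLE = 0, DATA_SYNC = 1, FILE_SYNC = 2). The function names reply_level_ok and needs_commit are hypothetical and are not taken from the Linux client.

```c
#include <assert.h>
#include <stdbool.h>

/* Commitment levels, ordered from weakest to strongest (NFSv3 stable_how). */
enum nfs_stable_how {
    NFS_UNSTABLE  = 0,   /* server may still hold the data in volatile cache */
    NFS_DATA_SYNC = 1,   /* file data is on stable storage */
    NFS_FILE_SYNC = 2    /* file data and metadata are on stable storage */
};

/* The comparison below relies on this ordering. */
static_assert(NFS_UNSTABLE < NFS_DATA_SYNC && NFS_DATA_SYNC < NFS_FILE_SYNC,
              "commitment levels must be ordered weakest to strongest");

/*
 * Hypothetical check: a WRITE reply is acceptable only if the server
 * committed at least as strongly as the client requested.
 */
static bool reply_level_ok(enum nfs_stable_how requested,
                           enum nfs_stable_how committed)
{
    return committed >= requested;
}

/*
 * Hypothetical check: data acknowledged as NFS_UNSTABLE is not durable
 * until the client follows up with a COMMIT.
 */
static bool needs_commit(enum nfs_stable_how committed)
{
    return committed == NFS_UNSTABLE;
}
```

Under this rule, an NFS_UNSTABLE reply is only legitimate when the client itself asked for NFS_UNSTABLE. If the client requested NFS_FILE_SYNC and the callback still shows NFS_UNSTABLE, reply_level_ok returns false, which is the protocol error described above.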