Wim Colgate wrote:
> The linux kernel I was using is 2.6.18-8.
>
> To be fair, I was not trying to force NFS_FILE_SYNC; to make a long
> story short, I started with O_DIRECT (please don't cache data). I moved
> to add O_SYNC (don't return until my data is written safely). And when I
> couldn't explain why I was missing some data (discrepancy between client
> and server), I started investigating what was happening under the hood.

In fact O_DIRECT also guarantees that the data is on the server's disk
before the write() call returns.

In some older versions of the client, O_SYNC forced the direct I/O engine
to use NFS_FILE_SYNC writes for everything. I don't think that logic is
there any more.

But what you describe above is a bug. A network dump would be the next
step to understand the true interaction between the client and the server
during a server reboot.

There were some bugs in the client's direct I/O engine where server reboot
recovery might result in data loss. Trond fixed a couple of bugs in this
area around 2.6.19 or 20. It would be interesting if you tested a later
kernel, just for behavioral comparison.

> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@oracle.com]
> Sent: Monday, August 06, 2007 12:16 PM
> To: Wim Colgate
> Cc: nfs@lists.sourceforge.net
> Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.
>
> Wim Colgate wrote:
>> Specifically I am trying to inject errors by manually (but politely)
>> bringing the NFS server down then up, then down (rinse and repeat ...)
>> while doing IO from a linux client. As mentioned the open file is
>> O_DIRECT and O_SYNC -- which I thought should mean either the data hits
>> the server's storage or I should get an error; and I'm more than happy
>> to deal with an IO error.
>>
>> I'm confident the writes are less than wsize (4096 bytes to be precise).
>>
>> Is there a 100% guaranteed method to get the behavior I thought O_DIRECT
>> and O_SYNC was providing?
>
> What behavior did you expect O_DIRECT + O_SYNC to provide? O_DIRECT
> means "don't cache data" and O_SYNC means "make sure the data is flushed
> to the server's disk before each write() system call returns."
> Technically, you don't need NFS_FILE_SYNC writes to do either of those.
>
> Which kernel are you testing? The client's use of NFS_FILE_SYNC writes
> changed over time.
>
>> -----Original Message-----
>> From: Peter Staubach [mailto:staubach@redhat.com]
>> Sent: Monday, August 06, 2007 10:33 AM
>> To: chuck.lever@oracle.com
>> Cc: Wim Colgate; nfs@lists.sourceforge.net
>> Subject: Re: [NFS] NFS_UNSTABLE vs. FILE and DATA sync.
>>
>> Chuck Lever wrote:
>>> Wim Colgate wrote:
>>>> If I have a soft mount, and open a file with O_DIRECT and O_SYNC,
>>>> should I ever expect a callback (nfs_writeback_done) with a
>>>> successful task->tk_status (i.e >= 0) with the committed state
>>>> (resp->verf->committed) set to NFS_UNSTABLE?
>>> Yes, this can happen if the server decides to return NFS_UNSTABLE.
>>> Rare, but possible.
>>>
>>>> A secondary question: if the above is expected, does this occur
>>>> because someone is caching the write and is there a mechanism to
>>>> disable this effect?
>>> Servers can return NFS_UNSTABLE to any WRITE request, so I can't think
>>> of a way this might be disabled.
>> Actually, it would be a protocol error for a server to return
>> a commitment level less than was requested by the client. The
>> server can return a greater commitment level, but not less than.
>>
>> ps
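
The rule Peter states at the end of the thread (a server may return a
stronger commitment level than the client requested, but never a weaker
one) can be sketched as a small check. This is only an illustrative model,
not kernel code: the numeric values follow the NFSv3 stable_how enumeration
in RFC 1813, while the names StableHow, reply_is_valid and needs_commit are
hypothetical helpers invented for this example.

```python
from enum import IntEnum


class StableHow(IntEnum):
    """NFSv3 stable_how values (RFC 1813); ordering reflects commitment strength."""
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


def reply_is_valid(requested: StableHow, committed: StableHow) -> bool:
    """Hypothetical helper: True if the server's reply honours the rule above,
    i.e. the committed level is at least as strong as the requested level."""
    return committed >= requested


def needs_commit(committed: StableHow) -> bool:
    """Hypothetical helper: an UNSTABLE reply means the data may still sit in
    the server's cache, so the client has to follow up with a COMMIT."""
    return committed == StableHow.UNSTABLE


if __name__ == "__main__":
    # A FILE_SYNC request answered with UNSTABLE violates the rule.
    print(reply_is_valid(StableHow.FILE_SYNC, StableHow.UNSTABLE))
    # An UNSTABLE request answered with FILE_SYNC is allowed.
    print(reply_is_valid(StableHow.UNSTABLE, StableHow.FILE_SYNC))
```

Under this model, an NFS_UNSTABLE reply in the scenario Wim describes is
legitimate only if the client itself sent the WRITE as unstable, which is
something a network dump would show directly.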