* O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Moazam Raja @ 2010-11-18 23:34 UTC
To: linux-nfs

Hi all,

I'm currently exporting a ZFS filesystem on Solaris 11 Express as NFS.
I have a Linux client mounting that NFS v3 filesystem with the
proto=tcp option.

My question is, what's the safest and most reliable way to write data
to this NFS mount on a Linux client? Should my application code use
O_DIRECT, or O_SYNC? Or should I be doing a write() and a fsync()? I
want to make sure that data is not lost and is truly committed, while
keeping decent performance (of course).

-Moazam

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Trond Myklebust @ 2010-11-19 19:24 UTC
To: Moazam Raja
Cc: linux-nfs

On Thu, 2010-11-18 at 15:34 -0800, Moazam Raja wrote:
> Hi all,
>
> I'm currently exporting a ZFS filesystem on Solaris 11 Express as NFS.
> I have a Linux client mounting that NFS v3 filesystem with the
> proto=tcp option.
>
> My question is, what's the safest and most reliable way to write data
> to this NFS mount on a Linux client? Should my application code use
> O_DIRECT, or O_SYNC? Or should I be doing a write() and a fsync()? I
> want to make sure that data is not lost and is truly committed, while
> keeping decent performance (of course).

Any one of the above methods will ensure that the data is synced to
disk. In addition, NFS also guarantees that your data is fully synced to
disk when taking/freeing POSIX locks, and when you close() the file.

The choice of one method over the other depends on your application
requirements. Not on your choice of underlying storage.

Trond
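
For readers comparing the options named in the question, here is a minimal sketch of two of them: a buffered write() followed by one fsync(), and a file opened with O_SYNC. It uses Python's thin wrappers over the same system calls. The function names are hypothetical, and nothing here is specific to NFS; it only shows where the blocking flush happens in each variant.

```python
import os
import tempfile


def _write_all(fd: int, data: bytes) -> None:
    """Loop until every byte is handed to the kernel (write() may be short)."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def write_then_fsync(path: str, data: bytes) -> None:
    """Buffered write() calls, then one explicit fsync() at the end."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        _write_all(fd, data)
        os.fsync(fd)  # blocks until the flush has completed or failed
    finally:
        os.close(fd)


def write_with_o_sync(path: str, data: bytes) -> None:
    """Open with O_SYNC so that each write() is itself synchronous."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        _write_all(fd, data)  # no separate fsync() call is needed here
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Demo on a scratch directory; point the path at an NFS mount to try it there.
    with tempfile.TemporaryDirectory() as tmp:
        write_then_fsync(os.path.join(tmp, "a.dat"), b"payload")
        write_with_o_sync(os.path.join(tmp, "b.dat"), b"payload")
```

The first variant pays the synchronization cost once per file, the second once per write() call, which is one reason the choice depends on the application's write pattern.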

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Chuck Lever @ 2010-11-19 19:55 UTC
To: Moazam Raja
Cc: Linux NFS Mailing List, Trond Myklebust

On Nov 19, 2010, at 2:24 PM, Trond Myklebust wrote:
> On Thu, 2010-11-18 at 15:34 -0800, Moazam Raja wrote:
>> Hi all,
>>
>> I'm currently exporting a ZFS filesystem on Solaris 11 Express as NFS.
>> I have a Linux client mounting that NFS v3 filesystem with the
>> proto=tcp option.
>>
>> My question is, what's the safest and most reliable way to write data
>> to this NFS mount on a Linux client? Should my application code use
>> O_DIRECT, or O_SYNC? Or should I be doing a write() and a fsync()? I
>> want to make sure that data is not lost and is truly committed, while
>> keeping decent performance (of course).
>
> Any one of the above methods will ensure that the data is synced to
> disk. In addition, NFS also guarantees that your data is fully synced to
> disk when taking/freeing POSIX locks, and when you close() the file.
>
> The choice of one method over the other depends on your application
> requirements. Not on your choice of underlying storage.

We should add that the synchronous methods (O_DIRECT and O_SYNC)
guarantee that the write will go immediately to the server and return an
immediate error code (like ENOSPC). That might be an improvement over an
async write, where an error report could be delayed until close(2).

But as was said above, it depends on your application's requirements.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
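
The point about delayed error reports can be illustrated with a small sketch that checks every step where an error might surface: write(), fsync(), and close(). The helper name `save_and_report` and its `sync_flag` parameter are hypothetical. With `sync_flag=os.O_SYNC` a failure such as ENOSPC would be expected at the write step; with the default buffered mode it may only appear at fsync() or close().

```python
import errno
import os
import tempfile


def save_and_report(path: str, data: bytes, sync_flag: int = 0):
    """Return None on success, else (step, errno_name) for the first failure."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag
    fd = os.open(path, flags, 0o644)
    failure = None
    step = "write"
    try:
        view = memoryview(data)
        while len(view) > 0:
            view = view[os.write(fd, view):]
        step = "fsync"
        os.fsync(fd)  # deferred write-back errors can surface here
    except OSError as exc:
        failure = (step, errno.errorcode.get(exc.errno, str(exc.errno)))
    try:
        os.close(fd)  # close() can also report an error, so check it too
    except OSError as exc:
        if failure is None:
            failure = ("close", errno.errorcode.get(exc.errno, str(exc.errno)))
    return failure


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(save_and_report(os.path.join(tmp, "out.dat"), b"payload"))
        print(save_and_report(os.path.join(tmp, "out.dat"), b"payload", os.O_SYNC))
```

An application that ignores the return value of close() on a buffered file can miss the only error report it will ever get, which is the practical argument for the synchronous variants.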

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: J. Bruce Fields @ 2010-11-19 20:04 UTC
To: Trond Myklebust
Cc: Moazam Raja, linux-nfs

On Fri, Nov 19, 2010 at 02:24:59PM -0500, Trond Myklebust wrote:
> On Thu, 2010-11-18 at 15:34 -0800, Moazam Raja wrote:
> > Hi all,
> >
> > I'm currently exporting a ZFS filesystem on Solaris 11 Express as NFS.
> > I have a Linux client mounting that NFS v3 filesystem with the
> > proto=tcp option.
> >
> > My question is, what's the safest and most reliable way to write data
> > to this NFS mount on a Linux client? Should my application code use
> > O_DIRECT, or O_SYNC? Or should I be doing a write() and a fsync()? I
> > want to make sure that data is not lost and is truly committed, while
> > keeping decent performance (of course).
>
> Any one of the above methods will ensure that the data is synced to
> disk. In addition, NFS also guarantees that your data is fully synced to
> disk when taking/freeing POSIX locks, and when you close() the file.

Is the client still doing that in the presence of a write delegation, by
the way?

--b.

>
> The choice of one method over the other depends on your application
> requirements. Not on your choice of underlying storage.
>
> Trond

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Trond Myklebust @ 2010-11-19 21:26 UTC
To: J. Bruce Fields
Cc: Moazam Raja, linux-nfs

On Fri, 2010-11-19 at 15:04 -0500, J. Bruce Fields wrote:
> On Fri, Nov 19, 2010 at 02:24:59PM -0500, Trond Myklebust wrote:
> > On Thu, 2010-11-18 at 15:34 -0800, Moazam Raja wrote:
> > > Hi all,
> > >
> > > I'm currently exporting a ZFS filesystem on Solaris 11 Express as NFS.
> > > I have a Linux client mounting that NFS v3 filesystem with the
> > > proto=tcp option.
> > >
> > > My question is, what's the safest and most reliable way to write data
> > > to this NFS mount on a Linux client? Should my application code use
> > > O_DIRECT, or O_SYNC? Or should I be doing a write() and a fsync()? I
> > > want to make sure that data is not lost and is truly committed, while
> > > keeping decent performance (of course).
> >
> > Any one of the above methods will ensure that the data is synced to
> > disk. In addition, NFS also guarantees that your data is fully synced to
> > disk when taking/freeing POSIX locks, and when you close() the file.
>
> Is the client still doing that in the presence of a write delegation, by
> the way?

If the application requests O_DIRECT/O_SYNC or calls fsync(), we are
required by POSIX to ensure the data is safe on disk. The presence of an
NFS delegation does not change that requirement.

We could potentially relax the sync-to-disk requirements when locking
and closing the file since those are only about ensuring close-to-open
cache consistency requirements (which is also ensured by the delegation)
but we do not do so today.

Cheers
Trond

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: J. Bruce Fields @ 2010-11-19 21:48 UTC
To: Trond Myklebust
Cc: Moazam Raja, linux-nfs

On Fri, Nov 19, 2010 at 04:26:35PM -0500, Trond Myklebust wrote:
> On Fri, 2010-11-19 at 15:04 -0500, J. Bruce Fields wrote:
> > On Fri, Nov 19, 2010 at 02:24:59PM -0500, Trond Myklebust wrote:
> > > On Thu, 2010-11-18 at 15:34 -0800, Moazam Raja wrote:
> > > > Hi all,
> > > >
> > > > I'm currently exporting a ZFS filesystem on Solaris 11 Express as NFS.
> > > > I have a Linux client mounting that NFS v3 filesystem with the
> > > > proto=tcp option.
> > > >
> > > > My question is, what's the safest and most reliable way to write data
> > > > to this NFS mount on a Linux client? Should my application code use
> > > > O_DIRECT, or O_SYNC? Or should I be doing a write() and a fsync()? I
> > > > want to make sure that data is not lost and is truly committed, while
> > > > keeping decent performance (of course).
> > >
> > > Any one of the above methods will ensure that the data is synced to
> > > disk. In addition, NFS also guarantees that your data is fully synced to
> > > disk when taking/freeing POSIX locks, and when you close() the file.
> >
> > Is the client still doing that in the presence of a write delegation, by
> > the way?
>
> If the application requests O_DIRECT/O_SYNC or calls fsync(), we are
> required by POSIX to ensure the data is safe on disk. The presence of an
> NFS delegation does not change that requirement.
>
> We could potentially relax the sync-to-disk requirements when locking
> and closing the file since those are only about ensuring close-to-open
> cache consistency requirements (which is also ensured by the delegation)
> but we do not do so today.

OK, that makes sense.

We probably shouldn't say in that case that we "guarantee" the sync on
close/free, if we consider it a detail of the current implementation
rather than a requirement.

--b.
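
Since the flush on unlock and close() is described here as a detail of the current client rather than a requirement, an application that needs durability may prefer to ask for it explicitly. The following is a minimal sketch under that assumption: it takes a POSIX lock, appends a record, calls fsync() itself, and only then releases the lock. The function name `append_record_locked` is hypothetical.

```python
import fcntl
import os
import tempfile


def append_record_locked(path: str, record: bytes) -> None:
    """Append under an exclusive POSIX lock and fsync() before unlocking."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # fcntl()-style lock on the whole file
        try:
            view = memoryview(record)
            while len(view) > 0:
                view = view[os.write(fd, view):]
            os.fsync(fd)  # explicit: do not rely on unlock or close() to flush
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "records.log")
        append_record_locked(log, b"first\n")
        append_record_locked(log, b"second\n")
```

If the client ever relaxes the flush on unlock, code written this way keeps the same durability behaviour, at the cost of one extra fsync() call that is redundant today.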

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Christoph Hellwig @ 2010-11-21 10:46 UTC
To: Trond Myklebust
Cc: J. Bruce Fields, Moazam Raja, linux-nfs

On Fri, Nov 19, 2010 at 04:26:35PM -0500, Trond Myklebust wrote:
> If the application requests O_DIRECT/O_SYNC or calls fsync(), we are
> required by POSIX to ensure the data is safe on disk. The presence of an
> NFS delegation does not change that requirement.

That's not quite correct. O_DIRECT for one is not actually specified in
POSIX at all, and the documented Linux semantics only say that the
pagecache should not be used (even if it sometimes is with various
filesystems). There is no guarantee that data actually is on disk or
reachable; for that you need to add the O_SYNC/O_DSYNC flag in addition
or use fsync/fdatasync.
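
To make the distinction concrete, here is a minimal sketch that combines O_DIRECT (bypass the page cache) with O_DSYNC (wait for the data to be stable), as suggested above. Several details are assumptions rather than anything stated in the thread: the 4096-byte alignment in `BLOCK`, the zero padding of the payload up to a whole block, the use of an anonymous mmap to obtain an aligned buffer, and the fallback to plain O_DSYNC when the filesystem rejects O_DIRECT with EINVAL. The function name is hypothetical.

```python
import errno
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; the real requirement depends on the filesystem


def write_direct_dsync(path: str, payload: bytes) -> bool:
    """Write payload zero-padded to BLOCK; return True if O_DIRECT was used."""
    length = max(1, -(-len(payload) // BLOCK)) * BLOCK  # round up to a block
    buf = mmap.mmap(-1, length)  # anonymous mapping: page-aligned, zero-filled
    try:
        buf[:len(payload)] = payload
        base = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DSYNC
        direct = getattr(os, "O_DIRECT", 0)  # 0 on platforms without O_DIRECT
        for extra in ([direct, 0] if direct else [0]):
            try:
                fd = os.open(path, base | extra, 0o644)
                try:
                    if os.write(fd, buf) != length:
                        raise OSError(errno.EIO, "short write")
                finally:
                    os.close(fd)
                return bool(extra)
            except OSError as exc:
                # Retry without O_DIRECT only when the filesystem rejects it.
                if not extra or exc.errno != errno.EINVAL:
                    raise
        raise AssertionError("unreachable: the last attempt returns or raises")
    finally:
        buf.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        used = write_direct_dsync(os.path.join(tmp, "direct.dat"), b"payload")
        print("O_DIRECT used:", used)
```

Dropping O_DSYNC from `base` would leave only the cache-bypass semantics, in which case an fsync() or fdatasync() call before close() would be the way to request durability.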

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Moazam Raja @ 2010-11-21 19:31 UTC
To: linux-nfs

This is my understanding as well, at least from reading the man pages.
O_DIRECT and O_SYNC have different characteristics, so I'm a bit
surprised at the responses here.

Furthermore, using O_DIRECT over a 1GbE connection yields 100-110MB/s
performance, whereas O_SYNC on the same connection gives me around
20-30MB/s at best. There is definitely a difference.

-Moazam

On Sun, Nov 21, 2010 at 2:46 AM, Christoph Hellwig <hch@infradead.org> wrote:
> On Fri, Nov 19, 2010 at 04:26:35PM -0500, Trond Myklebust wrote:
>> If the application requests O_DIRECT/O_SYNC or calls fsync(), we are
>> required by POSIX to ensure the data is safe on disk. The presence of an
>> NFS delegation does not change that requirement.
>
> That's not quite correct. O_DIRECT for one is not actually specified in
> POSIX at all, and the documented Linux semantics only say that the
> pagecache should not be used (even if it sometimes is with various
> filesystems). There is no guarantee that data actually is on disk or
> reachable; for that you need to add the O_SYNC/O_DSYNC flag in addition
> or use fsync/fdatasync.

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Trond Myklebust @ 2010-11-21 20:01 UTC
To: Christoph Hellwig
Cc: J. Bruce Fields, Moazam Raja, linux-nfs

On Sun, 2010-11-21 at 05:46 -0500, Christoph Hellwig wrote:
> On Fri, Nov 19, 2010 at 04:26:35PM -0500, Trond Myklebust wrote:
> > If the application requests O_DIRECT/O_SYNC or calls fsync(), we are
> > required by POSIX to ensure the data is safe on disk. The presence of an
> > NFS delegation does not change that requirement.
>
> That's not quite correct. O_DIRECT for one is not actually specified in
> POSIX at all, and the documented Linux semantics only say that the
> pagecache should not be used (even if it sometimes is with various
> filesystems). There is no guarantee that data actually is on disk or
> reachable; for that you need to add the O_SYNC/O_DSYNC flag in addition
> or use fsync/fdatasync.

True. We treat the O_DIRECT case as being the same as O_DIRECT|O_SYNC
because we don't currently have a way to locate and track outstanding
O_DIRECT RPC calls, and so fsync() has no effect.

We do, however, support aio/dio, and so people who want better writev()
syscall latency can use that...

Cheers
Trond
[parent not found: <AANLkTi=AV20AsUKOGfVg6M92T8LfPLuuyrG_hQESw_RU@mail.gmail.com>]

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Trond Myklebust @ 2010-11-20 23:54 UTC
To: Kums
Cc: Moazam Raja, linux-nfs

On Sat, 2010-11-20 at 09:12 -0700, Kums wrote:
> On Fri, Nov 19, 2010 at 12:24 PM, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> > On Thu, 2010-11-18 at 15:34 -0800, Moazam Raja wrote:
> > > Hi all,
> > >
> > > I'm currently exporting a ZFS filesystem on Solaris 11 Express as NFS.
> > > I have a Linux client mounting that NFS v3 filesystem with the
> > > proto=tcp option.
> > >
> > > My question is, what's the safest and most reliable way to write data
> > > to this NFS mount on a Linux client? Should my application code use
> > > O_DIRECT, or O_SYNC? Or should I be doing a write() and a fsync()? I
> > > want to make sure that data is not lost and is truly committed, while
> > > keeping decent performance (of course).
> >
> > Any one of the above methods will ensure that the data is synced to
> > disk. In addition, NFS also guarantees that your data is fully synced to
> > disk when taking/freeing POSIX locks, and when you close() the file.
>
> Instead of enforcing at the application side with O_DIRECT/O_SYNC, what if
> we mount nfs client with -o sync option as well as exportfs the nfs server
> with sync option? This way the data from all the applications can be
> guaranteed to be safe?

What is your definition of 'safe' here? Do you mean 'everything written
by my application is guaranteed to hit disk'? If so, then that would be
a much stronger guarantee than POSIX and local disk give you, and it
will seriously impact I/O performance (whether you use NFS, local disk
or whatever).

Why do you need this kind of guarantee in the first place? What
applications are you running?

Trond
[parent not found: <AANLkTikFfdMWs0b4V1doVYUx1T96+ef8-dMUZf3v8cW9@mail.gmail.com>]

* Re: O_DIRECT, O_SYNC, or fsync() on NFS mounts?
From: Trond Myklebust @ 2010-11-22 18:04 UTC
To: Kums
Cc: Moazam Raja, linux-nfs

On Mon, 2010-11-22 at 10:45 -0700, Kums wrote:
> On Sat, Nov 20, 2010 at 4:54 PM, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > If so, then that would be
> > a much stronger guarantee than POSIX and local disk give you, and it
> > will seriously impact I/O performance (whether you use NFS, local disk
> > or whatever).
>
> Yes, I understand. I am just throwing out a suggestion to see if "-o sync"
> nfs mount + sync exportfs option can be an alternative to using O_SYNC or
> O_DIRECT in the application (to guarantee everything written by the
> application hits the disk).

mount -osync under NFS works exactly the same as under any other
filesystem, so yes...

Trond
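
An application that relies on a mount-wide sync option instead of O_SYNC may want to verify that the option is really in effect. Here is a minimal sketch that scans text in the /proc/mounts format for a given mount point and reports whether `sync` is among its options. The function name, the sample mount table, and the paths in it are hypothetical and exist only for illustration.

```python
# Hypothetical sample in /proc/mounts format: device, mount point, type, options.
SAMPLE_MOUNTS = """\
server:/tank/export /mnt/nfs nfs rw,sync,relatime,vers=3,proto=tcp 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""


def has_sync_option(mounts_text: str, mount_point: str):
    """Return True/False for the mount point's sync option, None if not listed."""
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Mount points containing spaces are escaped in /proc/mounts (for
        # example as \\040); this sketch does not decode those escapes.
        if len(fields) >= 4 and fields[1] == mount_point:
            # The last matching entry wins, since later mounts shadow earlier ones.
            result = "sync" in fields[3].split(",")
    return result


if __name__ == "__main__":
    print(has_sync_option(SAMPLE_MOUNTS, "/mnt/nfs"))
    # On a live Linux system the same function can read the real table:
    # with open("/proc/mounts") as f:
    #     print(has_sync_option(f.read(), "/mnt/nfs"))
```

This only checks the client-side mount option; the server-side export setting is not visible from the client's mount table.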