* sync & asyck i/o
From: Anders Eriksson @ 2001-02-06 14:24 UTC
To: linux-kernel

According to the man page for fsync it copies in-core data to disk
prior to its return. Does that take async i/o to the media into account?
I.e. does it wait for completion of the async i/o to the disk?

/Anders

* Re: sync & asyck i/o
From: Alan Cox @ 2001-02-06 14:52 UTC
To: Anders Eriksson; +Cc: linux-kernel

> According to the man page for fsync it copies in-core data to disk
> prior to its return. Does that take async i/o to the media in account?
> I.e. does it wait for completion of the async i/o to the disk?

Undefined.

In theory, for a journalling file system it means the change is
committed to the log and the log to the media, and for other fs that
the change is committed to the final disk and recoverable by fsck in
the worst case.

In practice some IDE disks do write merging and small amounts of write
caching in the drive firmware, so you cannot trust it 100%. In addition
some higher end controllers will store to battery backed memory caches,
which is normally just fine since the reboot will play through the ram
cache.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

* Re: sync & asyck i/o
From: Stephen C. Tweedie @ 2001-02-06 17:34 UTC
To: Alan Cox; +Cc: Anders Eriksson, linux-kernel, Stephen Tweedie

Hi,

On Tue, Feb 06, 2001 at 02:52:40PM +0000, Alan Cox wrote:
> > According to the man page for fsync it copies in-core data to disk
> > prior to its return. Does that take async i/o to the media in account?
> > I.e. does it wait for completion of the async i/o to the disk?
>
> Undefined.

> In practice some IDE disks do write merging and small amounts of write
> caching in the drive firmware so you cannot trust it 100%.

It's worth noting that it *is* defined unambiguously in the standards:
fsync waits until all the data is hard on disk. Linux will obey that
if it possibly can: only in cases where the hardware is actively lying
about when the data has hit disk will the guarantee break down.

--Stephen

* Re: sync & asyck i/o
From: Ben LaHaise @ 2001-02-06 18:00 UTC
To: Stephen C. Tweedie; +Cc: Alan Cox, Anders Eriksson, linux-kernel

On Tue, 6 Feb 2001, Stephen C. Tweedie wrote:

> It's worth noting that it *is* defined unambiguously in the standards:
> fsync waits until all the data is hard on disk. Linux will obey that
> if it possibly can: only in cases where the hardware is actively lying
> about when the data has hit disk will the guarantee break down.

It is defined for writes that have begun before the fsync() started.
fsync has no bearing on aio writes until the async writes have
completed. If people are worried about the interaction between an
fsync in their app and an async write, they should be using
synchronous writes (which are perfectly usable with async io).

		-ben

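The two approaches discussed so far (an ordinary write followed by
fsync(), or synchronous writes as Ben suggests) can be sketched in C
roughly as follows. This is only an illustration under assumptions: the
function names, the 0644 mode and the truncate-on-open behaviour are
made up for the example, and it uses plain write() rather than any aio
interface. As the thread notes, neither variant can see through a drive
that reports completion before the data is on the platters.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper shared by both variants below. */
static int save(const char *path, const char *buf, size_t len,
                int extra_flags, int do_fsync)
{
    int fd;

    assert(path != NULL && buf != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0644);
    if (fd < 0)
        return -1;
    if (write_all(fd, buf, len) < 0 || (do_fsync && fsync(fd) < 0)) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Variant 1: buffered write, then fsync() before reporting success. */
int save_with_fsync(const char *path, const char *buf, size_t len)
{
    return save(path, buf, len, 0, 1);
}

/* Variant 2: O_SYNC, so each write() itself is a synchronous write. */
int save_with_osync(const char *path, const char *buf, size_t len)
{
    return save(path, buf, len, O_SYNC, 0);
}
```

Both functions return 0 on success and -1 on failure, leaving errno set
by the failing call. The second variant trades throughput for not
having to reason about which writes a later fsync() covers.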
* Re: sync & asyck i/o
From: Daniel Phillips @ 2001-02-06 18:21 UTC
To: Stephen C. Tweedie, linux-kernel

"Stephen C. Tweedie" wrote:
>
> Hi,
>
> On Tue, Feb 06, 2001 at 02:52:40PM +0000, Alan Cox wrote:
> > > According to the man page for fsync it copies in-core data to disk
> > > prior to its return. Does that take async i/o to the media in account?
> > > I.e. does it wait for completion of the async i/o to the disk?
> >
> > Undefined.
>
> > In practice some IDE disks do write merging and small amounts of write
> > caching in the drive firmware so you cannot trust it 100%.
>
> It's worth noting that it *is* defined unambiguously in the standards:
> fsync waits until all the data is hard on disk. Linux will obey that
> if it possibly can: only in cases where the hardware is actively lying
> about when the data has hit disk will the guarantee break down.

Sometimes I want to know that the write is safely on disk and sometimes
I only need to know that the io has gone over the bus and is on its way
to disk. In the latter case the buffer/page can be unlocked a lot
sooner.

Please correct me if I'm wrong, but I don't think the current API can
make that distinction for IDE, much less provide a uniform way of
controlling this behaviour across all types of block devices. We need
that, or else we have to choose between the following:

  1) slow
  2) risky.

I'd like to be able to set a bit in the buffer_head that says 'get back
to me when it's on disk' vs 'get back to me when it's hit the bus'.

--
Daniel

* Re: sync & asyck i/o
From: Josh Myer @ 2001-02-06 17:51 UTC
To: Alan Cox; +Cc: linux-kernel

Hello,

On Tue, 6 Feb 2001, Alan Cox wrote:

[snip]
> In theory for a journalling file system it means the change is committed to the
> log and the log to the media, and for other fs that the change is committed
> to the final disk and recoverable by fsck worst case
>
> In practice some IDE disks do write merging and small amounts of write
> caching in the drive firmware so you cannot trust it 100%. In addition some
> higher end controllers will store to battery backed memory caches which is
> normally just fine since the reboot will play through the ram cache.
>

Does this imply that in order to ensure my data hits the drives, I
should do a warm reboot and then shut down from the lilo: prompt or
similar?

Apologies for bugging you with a simple question, but I can see other
people worrying about data loss here too.

--
/jbm

* Re: sync & asyck i/o
From: Alan Cox @ 2001-02-06 17:56 UTC
To: Josh Myer; +Cc: Alan Cox, linux-kernel

> Does this imply that in order to ensure my data hits the drives, i should
> do a warm reboot and then shut down from the lilo: prompt or similiar?

As far as I can tell the IDE drives are write caching at most a second
or two of data. Andre may know more.

* Re: sync & asyck i/o
From: David Woodhouse @ 2001-02-06 17:54 UTC
To: Stephen C. Tweedie; +Cc: Alan Cox, Anders Eriksson, linux-kernel

sct@redhat.com said:
> Linux will obey that if it possibly can: only in cases where the
> hardware is actively lying about when the data has hit disk will the
> guarantee break down.

Do we attempt to ask SCSI disks nicely to flush their write caches in
this situation? cf. http://www.danbbs.dk/~dino/SCSI/SCSI2-09.html#9.2.18

Or do we instruct all SCSI disks not to do write caching in the first
place?

--
dwmw2

* Re: sync & asyck i/o
From: Stephen C. Tweedie @ 2001-02-06 18:18 UTC
To: David Woodhouse
Cc: Stephen C. Tweedie, Alan Cox, Anders Eriksson, linux-kernel

Hi,

On Tue, Feb 06, 2001 at 05:54:41PM +0000, David Woodhouse wrote:
>
> sct@redhat.com said:
> > Linux will obey that if it possibly can: only in cases where the
> > hardware is actively lying about when the data has hit disk will the
> > guarantee break down.
>
> Do we attempt to ask SCSI disks nicely to flush their write caches in this
> situation? cf. http://www.danbbs.dk/~dino/SCSI/SCSI2-09.html#9.2.18

No, we simply omit to instruct them to enable write-back caching.
Linux assumes that the WCE (write cache enable) bit in a disk's
caching mode page is zero.

--Stephen

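For readers unfamiliar with the bit Stephen mentions: in the SCSI
caching mode page (page code 08h), WCE is bit 2 of byte 2. A minimal C
sketch of checking it rather than assuming it is zero might look like
the following. It assumes the caller has already fetched the page with
MODE SENSE and skipped the mode parameter header and any block
descriptors; the function and macro names are invented for this
example and are not taken from the Linux SCSI stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODE_PAGE_CODE_MASK 0x3f  /* low six bits of byte 0 */
#define CACHING_PAGE_CODE   0x08  /* caching mode page */
#define CACHING_WCE         0x04  /* byte 2, bit 2: write cache enable */

/*
 * Inspect a raw caching mode page.
 * Returns 1 if WCE is set, 0 if it is clear, and -1 if the buffer is
 * too short or is not a caching page.
 */
int caching_page_wce(const uint8_t *page, size_t len)
{
    assert(page != NULL);
    if (len < 3)
        return -1;
    if ((page[0] & MODE_PAGE_CODE_MASK) != CACHING_PAGE_CODE)
        return -1;
    return (page[2] & CACHING_WCE) ? 1 : 0;
}
```

A return value of 1 means the drive may report a write as complete
while the data is still only in its cache, which is exactly the case
the rest of the thread argues about.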
* Re: sync & asyck i/o
From: Andre Hedrick @ 2001-02-06 19:25 UTC
To: Stephen C. Tweedie
Cc: David Woodhouse, Alan Cox, Anders Eriksson, linux-kernel

On Tue, 6 Feb 2001, Stephen C. Tweedie wrote:

> Hi,
>
> On Tue, Feb 06, 2001 at 05:54:41PM +0000, David Woodhouse wrote:
> >
> > sct@redhat.com said:
> > > Linux will obey that if it possibly can: only in cases where the
> > > hardware is actively lying about when the data has hit disk will the
> > > guarantee break down.
> >
> > Do we attempt to ask SCSI disks nicely to flush their write caches in this
> > situation? cf. http://www.danbbs.dk/~dino/SCSI/SCSI2-09.html#9.2.18
>
> No, we simply omit to instruct them to enable write-back caching.
> Linux assumes that the WCE (write cache enable) bit in a disk's
> caching mode page is zero.

Stephen,

You cannot be so blind as to omit the command.
You have to issue an active command to disable WCE.
All modern drives come with it enabled by default, especially ATA disks.

Andre Hedrick
Linux ATA Development
ASL Kernel Development
-----------------------------------------------------------------------------
ASL, Inc.                                     Toll free: 1-877-ASL-3535
1757 Houret Court                             Fax: 1-408-941-2071
Milpitas, CA 95035                            Web: www.aslab.com

* Re: sync & asyck i/o
From: Stephen C. Tweedie @ 2001-02-06 23:21 UTC
To: Andre Hedrick
Cc: Stephen C. Tweedie, David Woodhouse, Alan Cox, Anders Eriksson, linux-kernel

Hi,

On Tue, Feb 06, 2001 at 11:25:00AM -0800, Andre Hedrick wrote:
> On Tue, 6 Feb 2001, Stephen C. Tweedie wrote:
> > No, we simply omit to instruct them to enable write-back caching.
> > Linux assumes that the WCE (write cache enable) bit in a disk's
> > caching mode page is zero.
>
> You can not be so blind to omit the command.

Linux has traditionally ignored the issue. Don't ask me to defend it
--- the last advice I got from anybody who knew SCSI well was that SCSI
disks were defaulting to WCE-disabled.

Note that disabling SCSI WCE doesn't disable the cache, it just
enforces synchronous completion. With tagged command queuing,
writeback caching doesn't necessarily mean a huge performance
increase.

But if WCE is being enabled by default on modern SCSI drives, then
that's something which the scsi stack really does need to fix --- the
upper block layers will most definitely break if we have WCE enabled
and we don't set force-unit-access on the scsi commands.

The ll_rw_block interface is perfectly clear: it expects the data to
be written to persistent storage once the buffer_head end_io is
called. If that's not the case, somebody needs to fix the lower
layers.

Cheers,
 Stephen

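To make "set force-unit-access on the scsi commands" concrete: in a
WRITE(10) command descriptor block, FUA is bit 3 of byte 1. The C
sketch below builds such a CDB by hand. It is an illustration of the
command layout only, not code from the Linux SCSI stack; the function
and macro names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define WRITE_10_OPCODE 0x2a
#define WRITE_10_FUA    0x08  /* byte 1, bit 3: force unit access */

/*
 * Fill in a 10-byte WRITE(10) CDB.
 * lba and blocks are stored big-endian, as SCSI requires. With fua
 * set, the target must not report completion until the data is on
 * the medium, even if its write cache is enabled.
 */
void build_write10(uint8_t cdb[10], uint32_t lba, uint16_t blocks, bool fua)
{
    assert(cdb != NULL);
    memset(cdb, 0, 10);
    cdb[0] = WRITE_10_OPCODE;
    if (fua)
        cdb[1] |= WRITE_10_FUA;
    cdb[2] = (uint8_t)(lba >> 24);
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[7] = (uint8_t)(blocks >> 8);
    cdb[8] = (uint8_t)blocks;
}
```

Setting FUA per command is the alternative to clearing WCE globally:
the cache stays enabled for ordinary writes, and only the writes that
must be stable pay for synchronous completion.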
* Re: sync & asyck i/o
From: Andre Hedrick @ 2001-02-07 0:42 UTC
To: Stephen C. Tweedie
Cc: David Woodhouse, Alan Cox, Anders Eriksson, linux-kernel

On Tue, 6 Feb 2001, Stephen C. Tweedie wrote:

> The ll_rw_block interface is perfectly clear: it expects the data to
> be written to persistent storage once the buffer_head end_io is
> called. If that's not the case, somebody needs to fix the lower
> layers.

Sure, in 2.5, when I have a cleaner method of setting up hooks to allow
testing and changing of the mode. But you cannot assume that this
stuff is off by default and will stay that way.

At this time I am working to clean up an IBM mess of drives that do
random dumping of the drive cache to the platters when power is pulled.
This is a nice dirty erratum that I have heard about but have never
seen, but I can believe that it is real.

The painful part is that now that drives have these huge buffers of up
to 4MB, we have only a second or two to hit the platters before the
head float and spindle sync for writing depart from the allowable
range and the data does not get to disk....OOPS!

I suspect that with all of the new NVRAM HOSTS coming to market soon we
will see more fs death in the future until things settle.

Cheers,

Andre Hedrick
Linux ATA Development
