* Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: Nick Piggin @ 2008-10-29 10:30 UTC
To: Christoph Hellwig, linux-nfs
Cc: akpm, xfs, linux-fsdevel, Chris Mason

On Wed, Oct 29, 2008 at 05:44:17AM -0400, Christoph Hellwig wrote:
> On Wed, Oct 29, 2008 at 10:21:43AM +0100, Nick Piggin wrote:
> > Please do.
>
> Well, there's one stumbling block I haven't made progress on yet:
>
> I've changed the prototype of ->fsync to lose the dentry as we should
> always have a valid file struct. Except that nfsd doesn't on
> directories. So I either need to fake up a file there, or bail out
> and add a ->dir_sync export operation that needs just a dentry.

OK. I don't know much about that code, but I would think nfsd
should look as close to the syscall layer as possible. I guess
there must be something prohibitive (some protocol semantics?).

Is there anything that particularly makes it a file operation
as opposed to an inode operation?

* Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: Jamie Lokier @ 2008-10-29 12:22 UTC
To: Nick Piggin
Cc: Christoph Hellwig, linux-nfs, akpm, xfs, linux-fsdevel, Chris Mason

Nick Piggin wrote:
> On Wed, Oct 29, 2008 at 05:44:17AM -0400, Christoph Hellwig wrote:
> > On Wed, Oct 29, 2008 at 10:21:43AM +0100, Nick Piggin wrote:
> > > Please do.
> >
> > Well, there's one stumbling block I haven't made progress on yet:
> >
> > I've changed the prototype of ->fsync to lose the dentry as we should
> > always have a valid file struct. Except that nfsd doesn't on
> > directories. So I either need to fake up a file there, or bail out
> > and add a ->dir_sync export operation that needs just a dentry.
>
> OK. I don't know much about that code, but I would think nfsd
> should look as close to the syscall layer as possible. I guess
> there must be something prohibitive (some protocol semantics?).
>
> Is there anything that particularly makes it a file operation
> as opposed to an inode operation?

In principle, is fsync() required to flush all dirty data written
through any file descriptor ever, or just dirty data written through
the file descriptor used for fsync()?

-- Jamie

* Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: Ric Wheeler @ 2008-10-29 13:32 UTC
To: Jamie Lokier
Cc: Nick Piggin, Christoph Hellwig, linux-nfs, akpm, xfs, linux-fsdevel, Chris Mason

Jamie Lokier wrote:
> Nick Piggin wrote:
>> On Wed, Oct 29, 2008 at 05:44:17AM -0400, Christoph Hellwig wrote:
>>> On Wed, Oct 29, 2008 at 10:21:43AM +0100, Nick Piggin wrote:
>>>> Please do.
>>>
>>> Well, there's one stumbling block I haven't made progress on yet:
>>>
>>> I've changed the prototype of ->fsync to lose the dentry as we should
>>> always have a valid file struct. Except that nfsd doesn't on
>>> directories. So I either need to fake up a file there, or bail out
>>> and add a ->dir_sync export operation that needs just a dentry.
>>
>> OK. I don't know much about that code, but I would think nfsd
>> should look as close to the syscall layer as possible. I guess
>> there must be something prohibitive (some protocol semantics?).
>>
>> Is there anything that particularly makes it a file operation
>> as opposed to an inode operation?
>
> In principle, is fsync() required to flush all dirty data written
> through any file descriptor ever, or just dirty data written through
> the file descriptor used for fsync()?
>
> -- Jamie

http://www.opengroup.org/onlinepubs/009695399/functions/fsync.html

Is a pointer to what seems to be the official POSIX spec for this - it
is definitely per file descriptor, not per file system, etc...

What can happen by side effect (depending on the implementation) is that
you can actually force out all data for any file.

I found that you can approach non-fsync speeds for an fsync-per-file
workload by simply writing all of the files out, then going back and
fsync'ing them one at a time (last file first makes a bit of a
difference). With that technique, you do get the hard promise of full
data integrity and high speed.

This is useful when you want to do bulk writes (tar, etc.).

ric

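The write-everything-then-fsync technique Ric describes could be sketched
roughly as follows. This is a minimal illustration, not code from the
thread: the helper names write_full() and write_all_then_fsync(), the
MAX_BATCH limit, and the choice to write the same payload to every file are
all assumptions made for the example. It issues every write first, then
walks the descriptors in reverse order (last file first) calling fsync().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_BATCH 64 /* arbitrary cap chosen for this sketch */

/* Hypothetical helper: keep calling write() until all bytes are accepted. */
static int write_full(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before writing anything: retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: write `len` bytes of `data` to each of `n` paths,
 * deferring every fsync() until all writes have been issued.
 * Returns 0 on success, -1 on any error. */
int write_all_then_fsync(const char *const *paths, size_t n,
                         const char *data, size_t len)
{
    int fds[MAX_BATCH];
    size_t opened = 0;
    int rc = 0;

    if (n > MAX_BATCH)
        return -1;

    /* Pass 1: create and write every file without syncing. */
    for (size_t i = 0; i < n; i++) {
        int fd = open(paths[i], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            rc = -1;
            break;
        }
        fds[opened++] = fd;
        if (write_full(fd, data, len) != 0) {
            rc = -1;
            break;
        }
    }

    /* Pass 2: fsync in reverse order (last file first), then close. */
    for (size_t i = opened; i-- > 0; ) {
        if (rc == 0 && fsync(fds[i]) != 0)
            rc = -1;
        if (close(fds[i]) != 0)
            rc = -1;
    }
    return rc;
}
```

A caller would pass an array of paths and a buffer, and treat a return of 0
as "every file was written and fsync() reported success for each". Whether
this ordering actually approaches non-fsync speeds depends on the file
system and storage, as Ric's "depending on the implementation" suggests.
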
* Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: Chris Mason @ 2008-10-29 14:56 UTC
To: Ric Wheeler
Cc: Jamie Lokier, Nick Piggin, Christoph Hellwig, linux-nfs, akpm, xfs, linux-fsdevel

On Wed, 2008-10-29 at 09:32 -0400, Ric Wheeler wrote:
> Jamie Lokier wrote:
> >> Is there anything that particularly makes it a file operation
> >> as opposed to an inode operation?
> >
> > In principle, is fsync() required to flush all dirty data written
> > through any file descriptor ever, or just dirty data written through
> > the file descriptor used for fsync()?
> >
> > -- Jamie
>
> http://www.opengroup.org/onlinepubs/009695399/functions/fsync.html
>
> Is a pointer to what seems to be the official POSIX spec for this - it
> is definitely per file descriptor, not per file system, etc...

Maybe I'm reading Jamie's question wrong, but I think he's saying:

/* open exactly the same file twice */
fd = open("file");
fd2 = open("file");

write(fd, "stuff");
write(fd2, "more stuff");
fsync(fd);

Does the fsync promise "more stuff" will be on disk? I think the answer
should be yes.

-chris

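For anyone who wants to try Chris's scenario, a compilable version might
look like the sketch below. Several things here are assumptions that the
pseudocode above does not state: the function name
two_fd_write_and_sync(), the open flags and mode, and in particular
O_APPEND, which is added so the second write lands after the first instead
of overwriting it at offset 0. The program cannot show what reached stable
storage; it only shows the sequence of calls. The extra fsync(fd2) is the
conservative choice for a program that does not want to rely on fsync(fd)
also covering data written through the other descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open `path` twice, write through both descriptors,
 * then sync. Returns 0 on success, -1 on any error. */
int two_fd_write_and_sync(const char *path)
{
    static const char first[] = "stuff";
    static const char second[] = "more stuff";
    int rc = -1;

    /* open exactly the same file twice; O_APPEND is an assumption so that
     * the two writes do not overwrite each other */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0644);
    if (fd < 0)
        return -1;
    int fd2 = open(path, O_WRONLY | O_APPEND);
    if (fd2 < 0) {
        close(fd);
        return -1;
    }

    if (write(fd, first, strlen(first)) != (ssize_t)strlen(first))
        goto out;
    if (write(fd2, second, strlen(second)) != (ssize_t)strlen(second))
        goto out;

    /* The call from the example in the thread. */
    if (fsync(fd) != 0)
        goto out;

    /* Not in the original example: also sync through the second
     * descriptor rather than depend on fsync(fd) covering its data. */
    if (fsync(fd2) != 0)
        goto out;

    rc = 0;
out:
    close(fd2);
    close(fd);
    return rc;
}
```

After a successful call the file contains "stuffmore stuff" (15 bytes).
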
* Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: Nick Piggin @ 2008-10-30 2:16 UTC
To: Chris Mason
Cc: Ric Wheeler, Jamie Lokier, Christoph Hellwig, linux-nfs, akpm, xfs, linux-fsdevel

On Wed, Oct 29, 2008 at 10:56:36AM -0400, Chris Mason wrote:
> On Wed, 2008-10-29 at 09:32 -0400, Ric Wheeler wrote:
> > Jamie Lokier wrote:
> > >> Is there anything that particularly makes it a file operation
> > >> as opposed to an inode operation?
> > >
> > > In principle, is fsync() required to flush all dirty data written
> > > through any file descriptor ever, or just dirty data written through
> > > the file descriptor used for fsync()?
> > >
> > > -- Jamie
> >
> > http://www.opengroup.org/onlinepubs/009695399/functions/fsync.html
> >
> > Is a pointer to what seems to be the official POSIX spec for this - it
> > is definitely per file descriptor, not per file system, etc...
>
> Maybe I'm reading Jamie's question wrong, but I think he's saying:
>
> /* open exactly the same file twice */
> fd = open("file");
> fd2 = open("file");
>
> write(fd, "stuff");
> write(fd2, "more stuff");
> fsync(fd);
>
> Does the fsync promise "more stuff" will be on disk? I think the answer
> should be yes.

I think so. And this is in the context of making ->fsync an inode
operation and avoiding the NFS NULL-file problem... I don't think there
is any fd-specific metadata that fsync has to deal with? Any other
reasons it has to be a file operation?

* Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: jim owens @ 2008-10-30 12:51 UTC
To: Nick Piggin
Cc: Chris Mason, Ric Wheeler, Jamie Lokier, Christoph Hellwig, linux-nfs, akpm, xfs, linux-fsdevel

Nick Piggin wrote:
> On Wed, Oct 29, 2008 at 10:56:36AM -0400, Chris Mason wrote:
>> On Wed, 2008-10-29 at 09:32 -0400, Ric Wheeler wrote:
>>> Jamie Lokier wrote:
>>>>> Is there anything that particularly makes it a file operation
>>>>> as opposed to an inode operation?
>>>>
>>>> In principle, is fsync() required to flush all dirty data written
>>>> through any file descriptor ever, or just dirty data written through
>>>> the file descriptor used for fsync()?
>>>>
>>>> -- Jamie
>>>
>>> http://www.opengroup.org/onlinepubs/009695399/functions/fsync.html
>>>
>>> Is a pointer to what seems to be the official POSIX spec for this - it
>>> is definitely per file descriptor, not per file system, etc...
>>
>> Maybe I'm reading Jamie's question wrong, but I think he's saying:
>>
>> /* open exactly the same file twice */
>> fd = open("file");
>> fd2 = open("file");
>>
>> write(fd, "stuff");
>> write(fd2, "more stuff");
>> fsync(fd);
>>
>> Does the fsync promise "more stuff" will be on disk? I think the answer
>> should be yes.
>
> I think so. And this is in the context of making ->fsync an inode
> operation and avoiding the NFS NULL-file problem... I don't think there
> is any fd-specific metadata that fsync has to deal with? Any other
> reasons it has to be a file operation?

NO, or at least *not the POSIX definition*.

It is normal in unix-like operating systems to always
flush everything dirty on the inode no matter what
stream it arrived on.

Flushing everything is permitted but not required,
so applications must not expect that it is *promised*
or they will not be portable.

It is only guaranteed that "stuff" in this example
will be on disk.

AFAIK the fsync semantic comes from the days of
dinosaurs, mainframes, and minicomputers... when
a lot of operating systems had user-space libraries
that buffered the I/O. On fsync(fd), the "fd2" data
would still be in user-space.

jim

* Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: Jim Rees @ 2008-10-30 13:41 UTC
To: jim owens
Cc: Nick Piggin, Chris Mason, Ric Wheeler, Jamie Lokier, Christoph Hellwig, linux-nfs, akpm, xfs, linux-fsdevel

jim owens wrote:

  AFAIK the fsync semantic comes from the days of
  dinosaurs, mainframes, and minicomputers... when
  a lot of operating systems had user-space libraries
  that buffered the I/O. On fsync(fd), the "fd2" data
  would still be in user-space.

User space buffering happens in stdio, which is above the system call
level. It's been that way since fsync() was first introduced, and is still
that way today.

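The layering Jim Rees points out can be illustrated with a short sketch:
data written with stdio first has to be pushed out of the user-space buffer
with fflush() before fsync() on the underlying descriptor can do anything
with it. The function name write_durably() and its error handling are
assumptions made for this example only.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write `text` to `path` through stdio and ask the
 * kernel to commit it. Returns 0 on success, -1 on any error. */
int write_durably(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* After fputs() the bytes may still sit in the stdio buffer, in user
     * space, where fsync() cannot see them. */
    if (fputs(text, fp) == EOF)
        goto fail;

    /* Step 1: stdio buffer -> kernel (ends up in a write() call). */
    if (fflush(fp) != 0)
        goto fail;

    /* Step 2: kernel -> storage, via the descriptor behind the stream. */
    if (fsync(fileno(fp)) != 0)
        goto fail;

    return fclose(fp) == 0 ? 0 : -1;

fail:
    fclose(fp);
    return -1;
}
```

Calling fsync(fileno(fp)) without the preceding fflush(fp) would only sync
whatever stdio had already handed to the kernel, which is the situation jim
owens describes for the "fd2" data.
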
* Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: Dave Chinner @ 2008-10-29 21:43 UTC
To: Jamie Lokier
Cc: Nick Piggin, Christoph Hellwig, linux-nfs, akpm, xfs, linux-fsdevel, Chris Mason

On Wed, Oct 29, 2008 at 12:22:35PM +0000, Jamie Lokier wrote:
> Nick Piggin wrote:
> > On Wed, Oct 29, 2008 at 05:44:17AM -0400, Christoph Hellwig wrote:
> > > On Wed, Oct 29, 2008 at 10:21:43AM +0100, Nick Piggin wrote:
> > > > Please do.
> > >
> > > Well, there's one stumbling block I haven't made progress on yet:
> > >
> > > I've changed the prototype of ->fsync to lose the dentry as we should
> > > always have a valid file struct. Except that nfsd doesn't on
> > > directories. So I either need to fake up a file there, or bail out
> > > and add a ->dir_sync export operation that needs just a dentry.
> >
> > OK. I don't know much about that code, but I would think nfsd
> > should look as close to the syscall layer as possible. I guess
> > there must be something prohibitive (some protocol semantics?).
> >
> > Is there anything that particularly makes it a file operation
> > as opposed to an inode operation?
>
> In principle, is fsync() required to flush all dirty data written
> through any file descriptor ever, or just dirty data written through
> the file descriptor used for fsync()?

fsync() is required to flush the data that is dirty at the time of
the call, as well as any associated metadata needed to reference
that data. It doesn't matter who wrote the data in the first place.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com