* Directory fsync
@ 2011-09-23 15:12 Zhu Han
2011-09-23 16:33 ` Christoph Hellwig
0 siblings, 1 reply; 7+ messages in thread
From: Zhu Han @ 2011-09-23 15:12 UTC (permalink / raw)
To: xfs
I noticed the following in the fsync manual page:
Calling fsync() does not necessarily ensure that the entry in
the directory containing the file has also reached disk. For that an
explicit fsync() on a file descriptor for the directory is also needed.
I am wondering whether a directory sync is essential after the steps below
if I want to be sure the file can be retrieved after a system crash:
1) create file A
2) write file A
3) fsync(file A)
--------------------------------> fsync(parent directory) [Is this essential
to make sure the inode is linked into the parent directory?]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: Directory fsync
2011-09-23 15:12 Directory fsync Zhu Han
@ 2011-09-23 16:33 ` Christoph Hellwig
2011-09-23 23:09 ` Michael Monnerie
0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2011-09-23 16:33 UTC (permalink / raw)
To: Zhu Han; +Cc: xfs

On Fri, Sep 23, 2011 at 11:12:02PM +0800, Zhu Han wrote:
> I note below words in the manual of fsync:
> Calling fsync() does not necessarily ensure that the entry in
> the directory containing the file has also reached disk. For that an
> explicit fsync() on a file
> descriptor for the directory is also needed.
>
> I am wondering is directory sync is essential after below steps if I want to
> assure the file can be retrieved after system crash?
>
> 1) create file A
> 2) write file A
> 3) fsync(file A)
>
> --------------------------------> fsync(parent directory) [Is it essential
> to make the inode linked to parent directory?]

As far as standards are concerned it is. As far as the current XFS
implementation is concerned you don't need it as the file fsync will
also force out all transactions that belong to the create.
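The sequence asked about above (create, write, fsync the file, then fsync the parent directory) can be sketched as follows. This is a minimal illustration in Python rather than C, assuming a POSIX-like system where a directory can be opened read-only and its descriptor passed to fsync(). The helper name create_durably is hypothetical and does not come from the thread.

```python
import os
import tempfile


def create_durably(path: str, data: bytes) -> None:
    """Create `path`, write `data`, then fsync the file and its parent directory.

    Hypothetical helper for illustration only; assumes a POSIX-like system.
    """
    # Steps 1 and 2: create file A (failing if it already exists) and write it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # os.write may write fewer bytes than asked
            view = view[written:]
        # Step 3: flush the file's data and metadata.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Extra step: flush the directory entry that names the new file.
    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "A")
        create_durably(target, b"hello\n")
        with open(target, "rb") as f:
            print(f.read())
```

According to the reply above, the final directory fsync is what the standards call for, while on the current XFS implementation the file fsync already forces out the create.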
* Re: Directory fsync
2011-09-23 16:33 ` Christoph Hellwig
@ 2011-09-23 23:09 ` Michael Monnerie
2011-09-24 1:20 ` Zhu Han
2011-09-26 0:28 ` Dave Chinner
0 siblings, 2 replies; 7+ messages in thread
From: Michael Monnerie @ 2011-09-23 23:09 UTC (permalink / raw)
To: xfs; +Cc: Christoph Hellwig, Zhu Han

On Friday, 23 September 2011 Christoph Hellwig wrote:
> As far as standards are concerned it is. As far as the current XFS
> implementation is concerned you don't need it as the file fsync will
> also force out all transactions that belong to the create.

Aren't you giving O_PONIES to the users? ;-)

I understand your description, but we should always tell people to use a
directory fsync to be sure. Their applications might run on other
filesystems, or run for 10 years, and maybe XFS's implementation changes
in between. And maybe in historical kernels even XFS's implementation
wasn't like it is now?

@schumi: If your application should be able to run in a safe way on
other filesystems, or other kernel releases, or other unixes, it's best
to fsync the directory inode too. It's better to use it always, then
nothing will break.

--
With kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531

// House for sale: http://zmi.at/langegg/
* Re: Directory fsync
2011-09-23 23:09 ` Michael Monnerie
@ 2011-09-24 1:20 ` Zhu Han
2011-10-01 23:20 ` Peter Grandi
2011-09-26 0:28 ` Dave Chinner
1 sibling, 1 reply; 7+ messages in thread
From: Zhu Han @ 2011-09-24 1:20 UTC (permalink / raw)
To: Michael Monnerie; +Cc: Christoph Hellwig, xfs

On Sat, Sep 24, 2011 at 7:09 AM, Michael Monnerie
<michael.monnerie@is.it-management.at> wrote:
> On Friday, 23 September 2011 Christoph Hellwig wrote:
> > As far as standards are concerned it is. As far as the current XFS
> > implementation is concerned you don't need it as the file fsync will
> > also force out all transactions that belong to the create.
>
> Aren't you giving O_PONIES to the users? ;-)
>
> I understand your description, but we should always tell people to use a
> directory fsync to be sure. Their applications might run on other
> filesystems, or run for 10 years, and maybe XFS's implementation changes
> in between. And maybe in historical kernels even XFS's implementation
> wasn't like it's now?
>

Thank you all. I see the importance of following the standard, but I am
glad to know that the current XFS implementation enforces stricter fsync
semantics, just as every application developer wishes.

What worries me is that not many applications sync the directory after
new files are created, not even PostgreSQL[1] and many NoSQL databases.
If the current implementation enforces stricter semantics, that gives us
much more peace of mind.

Also, not many runtimes support syncing a directory; e.g. the Java
ecosystem does not have such support... So it is very hard to follow
this standard.

For God's sake, the right semantics of fsync should be "The user wants
to be sure the file is retrievable after a system crash or power failure
if fsync returned successfully".

[1] http://postgresql.1045698.n5.nabble.com/fsync-reliability-td4330289.html

>
> @schumi: If your application should be able to run in a safe way on
> other filesystems, or other kernel releases, or other unixes, it's best
> to fsync the directory inode too. It's better to use it always, then
> nothing won't break.
>
> --
> With kind regards,
> Michael Monnerie, Ing. BSc
>
> it-management Internet Services: Protéger
> http://proteger.at [pronounced: Prot-e-schee]
> Tel: +43 660 / 415 6531
>
> // House for sale: http://zmi.at/langegg/
>
* Re: Directory fsync
2011-09-24 1:20 ` Zhu Han
@ 2011-10-01 23:20 ` Peter Grandi
0 siblings, 0 replies; 7+ messages in thread
From: Peter Grandi @ 2011-10-01 23:20 UTC (permalink / raw)
To: Linux fs XFS

>>> As far as standards are concerned it is. As far as the
>>> current XFS implementation is concerned you don't need it as
>>> the file fsync will also force out all transactions that
>>> belong to the create.

>> Aren't you giving O_PONIES to the users? ;-) I understand
>> your description, but we should always tell people to use a
>> directory fsync to be sure.

Sometimes users wish for unicorns, not just ponies, and sometimes
they really want winged unicorns, not just unicorns...

> I see the importance of following the standard. But I am glad
> to know the current implementation of XFS enforce more strict
> fsync semantic, just as every application developer wishes.

Stricter semantics means potentially more expensive IO and a more
complicated kernel implementation with more chances for subtle
bugs. Unless you are arguing that application developers demand
O_PONIES and don't care that much about application performance or
portability or kernel bug opportunities.

It is a long time since I reminded anyone that the UNIX
filesystem semantics were designed when the whole kernel was
(well) under 64KiB, and that was an interesting constraint.

> What I worry is not much applications syncs the directory
> after new files are created, even if PostgreSQL[1] and many
> other NoSQL database. If the current implementation forces
> more strict semantic, it makes our mind much much more
> peaceful.

Probably the developer should be a lot less peaceful, because
the safer than required semantics could and perhaps should
disappear tomorrow, and then the application would be subtly buggy.

It is not a theoretical issue; there were a lot of problems and a
huge O_PONIES discussion when the 'ext4' developers went for an
implementation closer to the safety level mandated by the standard.

Never mind exceptionally silly application developers who tend
to forget that application files might reside on NFS or other
network file systems, which are extremely popular and cannot be
ignored, and which have semantics less safe than POSIX.

Relying on implementations that implement safer behavior than
POSIX seems to me a very bad, lazy (and common) idea.

> [ ... ] a right semantic of fsync should be "The users wants
> to assure the file is retrievable after system crash or power
> failure if fsync returned successfully".

Those would be really bad semantics, because UNIX/POSIX/Linux
filesystem semantics don't allow this silly definition to have a
useful meaning.

The definition seems to be based on ignorance of the really
important and big fact that UNIX/POSIX/Linux files have no names,
that only directory entries have names, that a file can be
linked to by zero or many directory entries, and that for the
kernel it can be very expensive to keep track of all the
directory entries (if any) that (hard) link to the file.

A process only needs to 'fsync' a directory if it modified the
directory (for example on entry, not necessarily file, creation
or modification), and it would be really stupid and against all
UNIX/POSIX/Linux logic to impose on the kernel the overhead of
finding and 'fsync'ing all the directories that have entries (if
any!) linking to a file being 'fsync'ed itself.

It is up to the user and/or the applications managing files and
named hard links to them to 'fsync' the file when appropriate,
and if needed (and not necessarily at the same time) any
directories containing the hard links to the file, because which
directory entries should link to a file and where they are can
only be part of the application/user data management logic.
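The point about hard links can be illustrated with a short sketch, again in Python and assuming a POSIX-like system whose filesystem supports hard links. The helper names fsync_dir and add_durable_link are hypothetical. The sketch gives one file two names in two different directories and fsyncs each directory only when that directory was modified, which is the bookkeeping the message above leaves to the application.

```python
import os
import tempfile


def fsync_dir(dir_path: str) -> None:
    """fsync a directory so that changes to its entries are flushed.

    Hypothetical helper; assumes directories can be opened read-only.
    """
    fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def add_durable_link(existing: str, new_name: str) -> None:
    """Add another name for an existing file and fsync only the directory
    that gained the new entry. The file's own data is not touched."""
    os.link(existing, new_name)
    fsync_dir(os.path.dirname(os.path.abspath(new_name)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dir_a = os.path.join(tmp, "a")
        dir_b = os.path.join(tmp, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)
        fsync_dir(tmp)  # the two mkdir calls modified tmp itself

        first = os.path.join(dir_a, "data")
        with open(first, "wb") as f:
            f.write(b"payload")
            f.flush()
            os.fsync(f.fileno())  # flush the file contents
        fsync_dir(dir_a)  # flush the entry "data" in dir_a

        second = os.path.join(dir_b, "alias")
        add_durable_link(first, second)  # only dir_b changed here

        print(os.stat(first).st_nlink)
        print(os.path.samefile(first, second))
```

Only the application knows that the entries in both directories refer to the same file, so only it can decide which directories need an fsync and when.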
* Re: Directory fsync
2011-09-23 23:09 ` Michael Monnerie
2011-09-24 1:20 ` Zhu Han
@ 2011-09-26 0:28 ` Dave Chinner
2011-09-26 0:51 ` Christoph Hellwig
1 sibling, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2011-09-26 0:28 UTC (permalink / raw)
To: Michael Monnerie; +Cc: Christoph Hellwig, Zhu Han, xfs

On Sat, Sep 24, 2011 at 01:09:44AM +0200, Michael Monnerie wrote:
> On Friday, 23 September 2011 Christoph Hellwig wrote:
> > As far as standards are concerned it is. As far as the current XFS
> > implementation is concerned you don't need it as the file fsync will
> > also force out all transactions that belong to the create.
>
> Aren't you giving O_PONIES to the users? ;-)
>
> I understand your description, but we should always tell people to use a
> directory fsync to be sure. Their applications might run on other
> filesystems, or run for 10 years, and maybe XFS's implementation changes
> in between. And maybe in historical kernels even XFS's implementation
> wasn't like it's now?

XFS's journalling has always behaved this way - *all* transactions
prior to the fsync() triggered log force are guaranteed to be on
disk once the fsync completes. There are no plans to change this
behaviour, either, because we rely on this architectural
characteristic to provide strong ordering of metadata operations in
many places.

All it means is that the directory fsync() is a no-op that only
costs CPU time.

> @schumi: If your application should be able to run in a safe way on
> other filesystems, or other kernel releases, or other unixes, it's best
> to fsync the directory inode too. It's better to use it always, then
> nothing won't break.

*nod*

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Directory fsync
2011-09-26 0:28 ` Dave Chinner
@ 2011-09-26 0:51 ` Christoph Hellwig
0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2011-09-26 0:51 UTC (permalink / raw)
To: Dave Chinner; +Cc: Michael Monnerie, Christoph Hellwig, Zhu Han, xfs

On Mon, Sep 26, 2011 at 10:28:11AM +1000, Dave Chinner wrote:
> All it means is that the directory fsync() is a no-op that only
> costs CPU time.

Currently it also causes a superfluous cache flush, but I have a patch
in my QA queue to fix that and reduce the (already tiny) CPU overhead a
bit more.