From: Eric Sandeen <sandeen@redhat.com>
To: Pavel Machek <pavel@ucw.cz>
Cc: Theodore Tso <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
kernel list <linux-kernel@vger.kernel.org>,
Jens Axboe <jens.axboe@oracle.com>,
fernando@kic.ac.jp, Ric Wheeler <rwheeler@redhat.com>
Subject: Re: vfs: Add MS_FLUSHONFSYNC mount flag
Date: Sun, 22 Feb 2009 14:51:56 -0600
Message-ID: <49A1BAEC.7080504@redhat.com>
In-Reply-To: <20090222141532.GC1586@ucw.cz>
Pavel Machek wrote:
> On Thu 2009-02-12 21:23:36, Theodore Tso wrote:
>> On Thu, Feb 12, 2009 at 03:30:10PM -0600, Eric Sandeen wrote:
>>>> Yes, but OTOH we should give sysadmin a possibility to enable / disable
>>>> it on just some partitions. I don't see a reasonable use for that but people
>>>> tend to do strange things ;) and there probably isn't a strong reason to not
>>>> allow them.
>>>>
>>> But nobody has asked for that, have they? So why offer it up at this point?
>>>
>>> They could use LD_PRELOAD to make fsync a no-op if they really don't
>>> care for it, I guess... though that's not easily per-fs either.
>> Actually, Bart Samwel at FOSDEM talked to me and asked for something
>> similar --- what we came up with, which met his request while still being
>> standards-compliant was a per-process personality flag which had three
>> options:
>>
>> *) Always honor fsync() calls (the default)
>> *) Never honor fsync() calls
>> *) Only honor fsync() calls if a global "honor fsync" flag
>> (which would be manipulated by the laptop mode scripts)
>> is set.
>>
>> The flag would be reset to the default across a setuid exec, but would
>> otherwise be inherited across fork()'s. It might be possible to
>> set/get the flag via a /proc interface.
>>
>> The basic idea is that laptop systems where the system administrator
>> wants longer battery life (and trusts the battery not to suddenly give
>> out) more than they care about fsync() guarantees can set up a pam
>> library which sets the flag at login time so that all of the
>> user's processes can be set up not to honor fsync() calls; however,
>> all of the system daemons would still function normally.
>
> Sounds like posix violation to
> me... '/sys/fsync_does_not_really_sync'?
>
> Perhaps it is better done at glibc level? Environment variables
> already mostly have semantics you want.....
>
> Pavel
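For illustration only, the LD_PRELOAD and environment-variable ideas
quoted above can be combined into one user-space sketch. Nothing here
is an existing tool or interface: the FSYNC_POLICY variable, its
values "never" and "global", and the /run/honor-fsync flag file are
all hypothetical names, the sketch is Linux-only, and it leaves
fdatasync() and sync_file_range() untouched.

```c
/* fsync_shim.c: hypothetical LD_PRELOAD shim (Linux-only sketch).
 * Assumed build: cc -shared -fPIC -o fsync_shim.so fsync_shim.c
 * Assumed use:   FSYNC_POLICY=never LD_PRELOAD=./fsync_shim.so some_program
 */
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for the global "honor fsync" flag. */
#define GLOBAL_HONOR_PATH "/run/honor-fsync"

/* The three options from the quoted proposal. */
enum fsync_policy { FSYNC_ALWAYS, FSYNC_NEVER, FSYNC_IF_GLOBAL };

/* Map the (hypothetical) FSYNC_POLICY value to a policy.
 * Unset or unknown values fall back to always honoring fsync(). */
enum fsync_policy parse_fsync_policy(const char *value)
{
    if (value == NULL)
        return FSYNC_ALWAYS;
    if (strcmp(value, "never") == 0)
        return FSYNC_NEVER;
    if (strcmp(value, "global") == 0)
        return FSYNC_IF_GLOBAL;
    return FSYNC_ALWAYS;
}

/* The global flag counts as set when the flag file exists. */
int global_honor_flag_set(const char *path)
{
    assert(path != NULL);
    return access(path, F_OK) == 0;
}

/* Return 1 if the real fsync should run, 0 if it should be skipped. */
int should_honor_fsync(enum fsync_policy policy, int global_honor)
{
    switch (policy) {
    case FSYNC_NEVER:
        return 0;
    case FSYNC_IF_GLOBAL:
        return global_honor != 0;
    case FSYNC_ALWAYS:
        break;
    }
    return 1;
}

/* Interposed fsync(): takes the place of libc's when preloaded. */
int fsync(int fd)
{
    enum fsync_policy policy = parse_fsync_policy(getenv("FSYNC_POLICY"));

    if (!should_honor_fsync(policy, global_honor_flag_set(GLOBAL_HONOR_PATH)))
        return 0; /* report success without asking the kernel to sync */

    /* Raw system call, so we do not recurse into this wrapper. */
    return (int)syscall(SYS_fsync, fd);
}
```

Environment variables are inherited across fork(), and the dynamic
linker restricts LD_PRELOAD for set-user-ID programs, so a shim like
this roughly approximates the inheritance rules described above. It
does nothing for statically linked programs or for callers that issue
the system call directly.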
One other thing that may be worth bringing up (just to muddy the waters
more) is OSX's handling of this stuff.
From the fsync(2) manpage:
> Note that while fsync() will flush all data from the host to the
> drive (i.e. the "permanent storage device"), the drive itself may not
> physically write the data to the platters for quite some time and it
> may be written in an out-of-order sequence.
>
> Specifically, if the drive loses power or the OS crashes, the
> application may find that only some or none of their data was written.
> The disk drive may also re-order the data so that later writes may be
> present, while earlier writes are not.
>
> This is not a theoretical edge case. This scenario is easily
> reproduced with real world workloads and drive power failures.
>
> For applications that require tighter guarantees about the integrity
> of their data, Mac OS X provides the F_FULLFSYNC fcntl. The
> F_FULLFSYNC fcntl asks the drive to flush all buffered data to permanent
> storage. Applications, such as databases, that require a strict
> ordering of writes should use F_FULLFSYNC to ensure that their data
> is written in the order they expect. Please see fcntl(2) for more
> detail.
and from fcntl(2):
> F_FULLFSYNC Does the same thing as fsync(2) then asks the drive to
> flush all buffered data to the permanent storage
> device (arg is ignored). This is currently
> implemented on HFS, MS-DOS (FAT), and Universal Disk Format
> (UDF) file systems. The operation may take quite a
> while to complete. Certain FireWire drives have also
> been known to ignore the request to flush their
> buffered data.
-Eric
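
As a rough illustration of how portable code might use the fcntl
described in the manpage excerpts above, here is a minimal sketch.
The helper names durable_sync() and write_durably() are made up for
this example. It assumes F_FULLFSYNC is visible as a macro in
<fcntl.h> on Mac OS X and simply absent elsewhere, and it assumes
that falling back to plain fsync() is acceptable when a file system
rejects the request.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Flush fd as strongly as the platform allows.
 * Returns 0 on success, -1 with errno set on failure. */
int durable_sync(int fd)
{
#ifdef F_FULLFSYNC
    /* Mac OS X: also ask the drive to flush its own buffers. */
    if (fcntl(fd, F_FULLFSYNC) == 0)
        return 0;
    /* Not every file system implements it; fall through to fsync(). */
#endif
    return fsync(fd);
}

/* Write len bytes to path and sync them before reporting success.
 * A short write is treated as a failure to keep the sketch small. */
int write_durably(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    int ok = (n == (ssize_t)len) && durable_sync(fd) == 0;

    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Where F_FULLFSYNC is not defined, durable_sync() reduces to a plain
fsync() call, so whether the drive's write cache is flushed depends
entirely on what the file system does on fsync.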
Thread overview: 74+ messages
2009-01-13 13:14 ext2 + -osync: not as easy as it seems Pavel Machek
2009-01-13 13:45 ` Alan Cox
2009-01-13 14:03 ` Theodore Tso
2009-01-13 14:07 ` Jens Axboe
2009-01-13 14:26 ` [PATCH] block: Fix documentation for blkdev_issue_flush() Theodore Ts'o
2009-01-13 14:28 ` Jens Axboe
2009-01-13 14:30 ` ext2 + -osync: not as easy as it seems Jan Kara
2009-01-13 14:46 ` Theodore Tso
2009-01-14 3:37 ` Fernando Luis Vázquez Cao
2009-01-14 10:35 ` Jan Kara
2009-01-14 13:21 ` Theodore Tso
2009-01-14 14:05 ` Jan Kara
2009-01-14 14:08 ` Jens Axboe
2009-01-14 14:34 ` Theodore Tso
2009-01-14 14:43 ` Jens Axboe
2009-02-12 16:43 ` Eric Sandeen
2009-02-16 12:09 ` Jens Axboe
2009-01-14 14:12 ` Theodore Tso
2009-01-14 14:37 ` Jan Kara
2009-01-14 16:59 ` Theodore Tso
2009-01-15 12:06 ` Fernando Luis Vázquez Cao
2009-01-15 23:45 ` Jan Kara
2009-01-16 12:31 ` Fernando Luis Vázquez Cao
2009-01-16 13:55 ` ext3: call blkdev_issue_flush on fsync Fernando Luis Vázquez Cao
2009-01-16 16:30 ` Jan Kara
2009-01-17 9:47 ` Fernando Luis Vázquez Cao
2009-01-17 10:00 ` Fernando Luis Vázquez Cao
2009-01-19 12:03 ` Jan Kara
2009-01-28 9:45 ` Fernando Luis Vázquez Cao
2009-01-28 9:55 ` Jan Kara
2009-02-12 10:33 ` Fernando Luis Vázquez Cao
2009-02-12 10:35 ` vfs: Improve readability off mount flag definitins by using offsets Fernando Luis Vázquez Cao
2009-02-12 10:36 ` vfs: Add MS_FLUSHONFSYNC mount flag Fernando Luis Vázquez Cao
2009-02-12 17:13 ` Eric Sandeen
2009-02-12 17:29 ` Jeff Garzik
2009-02-14 15:36 ` Christoph Hellwig
2009-02-15 7:23 ` Fernando Luis Vázquez Cao
2009-02-15 22:54 ` Theodore Tso
2009-02-16 4:29 ` Eric Sandeen
2009-02-16 7:47 ` Fernando Luis Vázquez Cao
2009-02-16 7:47 ` Fernando Luis Vázquez Cao
2009-02-12 21:23 ` Jan Kara
2009-02-12 21:30 ` Eric Sandeen
2009-02-13 1:47 ` Fernando Luis Vázquez Cao
2009-02-13 6:07 ` Eric Sandeen
2009-02-13 2:23 ` Theodore Tso
2009-02-22 14:15 ` Pavel Machek
2009-02-22 20:51 ` Eric Sandeen [this message]
2009-02-22 23:19 ` Theodore Tso
2009-02-22 23:42 ` Jeff Garzik
2009-02-22 23:46 ` Jeff Garzik
2009-02-23 1:23 ` Theodore Tso
2009-02-13 1:14 ` Fernando Luis Vázquez Cao
2009-02-13 6:20 ` Eric Sandeen
2009-02-13 10:36 ` Fernando Luis Vázquez Cao
2009-02-13 12:20 ` Dave Chinner
2009-02-13 16:29 ` Fernando Luis Vazquez Cao
2009-02-14 11:24 ` Dave Chinner
2009-02-14 13:03 ` Fernando Luis Vázquez Cao
2009-02-14 13:19 ` Fernando Luis Vázquez Cao
2009-02-15 2:48 ` Dave Chinner
2009-02-15 7:11 ` Fernando Luis Vázquez Cao
2009-02-12 10:37 ` util-linux: Add new mount options flushonfsync and noflushonfsync to mount Fernando Luis Vázquez Cao
2009-02-12 10:38 ` util-linux: Add explanation for new mount options flushonfsync and noflushonfsync to mount(8) man page Fernando Luis Vázquez Cao
2009-02-12 10:38 ` block: Add block_flush_device() Fernando Luis Vázquez Cao
2009-02-12 10:39 ` ext3: call blkdev_issue_flush on fsync Fernando Luis Vázquez Cao
2009-02-12 10:40 ` ext4: " Fernando Luis Vázquez Cao
2009-02-15 22:46 ` Theodore Tso
2009-02-16 7:09 ` Fernando Luis Vázquez Cao
2009-02-16 7:25 ` [PATCH 1/3] block: Add block_flush_device() Fernando Luis Vázquez Cao
2009-02-16 7:29 ` [2/3] ext3: call block_flush_device() on fsync Fernando Luis Vázquez Cao
2009-02-16 7:31 ` [PATCH 3/3] ext4: " Fernando Luis Vázquez Cao
2009-01-16 13:59 ` ext4: call blkdev_issue_flush " Fernando Luis Vázquez Cao
2009-01-13 14:42 ` ext2 + -osync: not as easy as it seems Pavel Machek