From: Christoph Hellwig <hch@infradead.org>
To: Alex Bligh <alex@alex.org.uk>
Cc: linux-kernel@vger.kernel.org
Subject: Re: REQ_FLUSH, REQ_FUA and open/close of block devices
Date: Sun, 22 May 2011 07:26:29 -0400
Message-ID: <20110522112629.GA26586@infradead.org>
In-Reply-To: <A0329D810FA7795CEDDA5C70@nimrod.local>
> So, the file in question is not mmap'd (it's an nbd disk). fsync() /
> fdatasync() is too expensive as it will sync everything. As far as I can
> tell, this is no more dangerous re metadata than fdatasync() which also
> does not sync metadata. I had read the last sentence as "this system
> call does not *necessarily* flush disk write caches" (meaning "if you
> haven't mounted e.g. ext3 with barriers=1, then you can't ensure write
> caches write through"), as opposed to "will not ever flush disk write
> caches", and given mounting ext3 without barriers=1 produces no FUA or
> FLUSH commands in normal operation anyway (as far as light debugging
> can see) that's not much of a loss.
ext3 without barriers does not guarantee any data integrity and will lose
your data in the blink of an eye if you have a large enough cache.
fdatasync is equivalent to fsync except that it does not flush
non-essential metadata (basically just timestamps in practice), but it
does flush metadata required to find the data again, e.g. allocation
information and extent maps. sync_file_range does nothing but flush
out pagecache content - it means you basically won't get your data
back in case of a crash if you either:
a) have a volatile write cache in your disk (e.g. any normal SATA disk)
b) are using a sparse file on a filesystem
c) are using a fallocate-preallocated file on a filesystem
d) use any file on a COW filesystem like btrfs
i.e. it only does anything useful for you if you do not have a volatile
write cache, and either use a raw block device node, or just overwrite
an already fully allocated (and not preallocated) file on a non-COW
filesystem.
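
To make the distinction above concrete, here is a minimal sketch in
Python of a write that is followed by fdatasync(). The helper name
durable_pwrite is hypothetical and not from this thread; it assumes a
Linux system where os.pwrite and os.fdatasync are available. Note that
fdatasync() covers the whole file, which is exactly the cost the quoted
question objects to; the sketch only shows the call that provides the
guarantee sync_file_range lacks.

```python
import os


def durable_pwrite(fd, data, offset):
    """Write data at offset, then ask the kernel to make it durable.

    Hypothetical helper for illustration. fdatasync() flushes the file's
    data plus the metadata needed to find it again (allocation info,
    extent maps), unlike sync_file_range, which only pushes out
    pagecache content.
    """
    view = memoryview(data)
    written = 0
    # pwrite may write fewer bytes than requested, so loop until done.
    while written < len(view):
        written += os.pwrite(fd, view[written:], offset + written)
    # Applies to the whole file, not just the range written above.
    os.fdatasync(fd)
    return written
```

Calling durable_pwrite(fd, b"payload", 4096) returns the number of bytes
written once fdatasync() has returned.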
> But rather than trying to justify myself: what is the best way to
> emulate FUA, i.e. ensure a specific portion of a file is synced before
> returning, without ensuring the whole lot is synced (which is far too
> slow)? The only other option I can see is to open the file with a second
> fd, mmap the chunk of the file (it may be larger than the available
> virtual address space), msync it with MS_SYNC, then fsync, then munmap
> and close, and hope the fsync doesn't spit anything else out. This
> seems a little excessive, and I don't even know whether it would work.
You can have a second FD open with O_DSYNC and write to that. But for
NBD and Linux guests that won't make any difference yet. While REQ_FUA
is a separate flag, so far it's only used in combination with REQ_FLUSH,
so the only pattern you'll see REQ_FUA used in is:
	REQ_FLUSH
	REQ_FUA
which means the cache holds no data other than what was just written.
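
The second-descriptor approach could look roughly like the following
Python sketch. The function names and the fua parameter are
hypothetical, chosen only for illustration; the sketch assumes Linux and
a file that already exists. As noted above, for NBD with a Linux guest
this would not behave differently from a flush at the time of writing.

```python
import os


def open_with_dsync_twin(path):
    """Return (fd, fd_dsync): two descriptors for the same existing file.

    fd is a normal read/write descriptor. fd_dsync is opened with
    O_DSYNC, so a write through it returns only after the data, and the
    metadata needed to read it back, has been written out.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        fd_dsync = os.open(path, os.O_RDWR | os.O_DSYNC)
    except OSError:
        os.close(fd)
        raise
    return fd, fd_dsync


def write_request(fd, fd_dsync, data, offset, fua=False):
    """Handle one write; route it via the O_DSYNC descriptor if fua is set."""
    target = fd_dsync if fua else fd
    view = memoryview(data)
    done = 0
    # pwrite may be partial, so loop until the whole buffer is written.
    while done < len(view):
        done += os.pwrite(target, view[done:], offset + done)
    return done
```

Ordinary writes go through fd and stay cheap; only requests flagged
with fua pay for the synchronous write.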
Thread overview: 9+ messages
2011-05-19 15:06 REQ_FLUSH, REQ_FUA and open/close of block devices Alex Bligh
2011-05-20 12:20 ` Christoph Hellwig
2011-05-21 8:42 ` Alex Bligh
2011-05-22 10:44 ` Christoph Hellwig
2011-05-22 11:17 ` Alex Bligh
2011-05-22 11:26 ` Christoph Hellwig [this message]
2011-05-22 12:00 ` Alex Bligh
2011-05-22 12:04 ` Christoph Hellwig
2011-05-22 16:56 ` Jeff Garzik