* Re: [RFC] failure atomic writes for file systems and block devices
@ 2017-03-01 11:21 ` Amir Goldstein
From: Amir Goldstein @ 2017-03-01 11:21 UTC
To: Christoph Hellwig
Cc: linux-fsdevel, linux-xfs, linux-block, linux-api,
Michael Kerrisk (man-pages)
On Tue, Feb 28, 2017 at 4:57 PM, Christoph Hellwig <hch@lst.de> wrote:
> Hi all,
>
> this series implements a new O_ATOMIC flag for failure atomic writes
> to files. It is based on and tries to unify two earlier proposals,
> the first one for block devices by Chris Mason:
>
> https://lwn.net/Articles/573092/
>
> and the second one for regular files, published by HP Research at
> Usenix FAST 2015:
>
> https://www.usenix.org/conference/fast15/technical-sessions/presentation/verma
>
> It adds a new O_ATOMIC flag for open, which requests writes to be
> failure-atomic, that is either the whole write makes it to persistent
> storage, or none of it, even in case of power or other failures.
>
> There are two implementation variants of this: on block devices O_ATOMIC
> must be combined with O_(D)SYNC so that storage devices that can handle
> large writes atomically can simply do that without any additional work.
> This case is supported by NVMe.
>
> The second case is for file systems, where we simply write new blocks
> out of place and then remap them into the file atomically on either
> completion of an O_(D)SYNC write or when fsync is called explicitly.
>
> The semantics of the latter case are explained in detail in the Usenix
> paper above.
>
> Last but not least a new fcntl is implemented to provide information
> about I/O restrictions such as alignment requirements and the maximum
> atomic write size.
>
> The implementation is simple and clean, but I'm rather unhappy about
> the interface as it has too many failure modes to bulletproof. For
> one, old kernels ignore unknown open flags silently, so applications
> have to check the F_IOINFO fcntl beforehand, which is a bit of a killer.
> Because of that I've also not implemented any other validity checks
> yet, as they might make things even worse when an open on an unsupported
> file system or device fails, but not on an old kernel. Maybe we need
> a new open version that checks arguments properly first?
>
[CC += linux-api@vger.kernel.org] for that question and for the new API
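To make the F_IOINFO point above a bit more concrete, here is a minimal userspace sketch in Python of the kind of pre-flight check an application might run before relying on an atomic write. It is only an illustration under assumptions: the `IoInfo` fields (`alignment`, `max_atomic_size`) and the helper name are hypothetical, since this mail does not spell out the layout of the fcntl result. Passing `None` stands for a kernel that reports nothing at all, i.e. the old-kernel case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IoInfo:
    # Hypothetical stand-in for what an F_IOINFO-style query might report;
    # the real structure is not described in this thread.
    alignment: int        # required offset/length alignment, in bytes
    max_atomic_size: int  # largest write promised to be failure-atomic


def can_write_atomically(info: Optional[IoInfo], offset: int, length: int) -> bool:
    """Return True only if the write fits the reported restrictions."""
    if info is None:
        # No information available (e.g. an old kernel): assume no guarantee.
        return False
    if info.alignment <= 0 or offset < 0 or length <= 0:
        return False
    # Both the start offset and the length must honour the alignment.
    if offset % info.alignment or length % info.alignment:
        return False
    return length <= info.max_atomic_size
```

The check deliberately fails closed: without a positive answer from the kernel, the caller should fall back to whatever crash-safety scheme it used before.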
> Also I'm really worried about the NVMe failure modes - devices simply
> advertise an atomic write size, with no way for the device to know
> that the host requested a given write to be atomic, and thus no
> error reporting. This is made worse by NVMe 1.2 adding per-namespace
> atomic I/O parameters that devices can use to introduce additional
> odd alignment quirks - while there is some language in the spec
> requiring them not to weaken the per-controller guarantees it all
> looks rather weak and I'm not too confident in all implementations
> getting everything right.
>
> Last but not least this depends on a few XFS patches, so to actually
> apply / run the patches please use this git tree:
>
> git://git.infradead.org/users/hch/vfs.git O_ATOMIC
>
> Gitweb:
>
> http://git.infradead.org/users/hch/vfs.git/shortlog/refs/heads/O_ATOMIC
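For readers less familiar with the "write out of place, then remap atomically" idea in the file system case, the closest thing available in userspace today is the classic write-temp-then-rename idiom. The sketch below is that idiom, not the O_ATOMIC implementation from this series, and it only covers replacing a whole file rather than updating a range in place. The function name is made up for illustration, and POSIX rename semantics are assumed.

```python
import os
import tempfile


def replace_file_atomically(path: str, data: bytes) -> None:
    """Replace the contents of `path` so that readers see old or new data, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents "out of place", in a temporary file on the
    # same file system, so the final rename does not cross devices.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # new data must be durable before the switch
        # The atomic "remap": the name now points at the new contents.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Persist the directory entry so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Compared with this, the proposal in the quoted mail would let an application keep writing to the same file descriptor and leave the out-of-place allocation and the remap to the file system.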
* Re: [RFC] failure atomic writes for file systems and block devices
@ 2017-03-01 15:07 ` Christoph Hellwig
From: Christoph Hellwig @ 2017-03-01 15:07 UTC
To: Amir Goldstein
Cc: Christoph Hellwig, linux-fsdevel,
linux-xfs, linux-block,
linux-api, Michael Kerrisk (man-pages)
On Wed, Mar 01, 2017 at 01:21:41PM +0200, Amir Goldstein wrote:
> [CC += linux-api@vger.kernel.org] for that question and for the new API
We'll need to iterate over the API a few more times first, I think.