From: Christoph Hellwig <hch@lst.de>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>,
linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-block@vger.kernel.org
Subject: Re: [PATCH 07/12] xfs: implement failure-atomic writes
Date: Wed, 1 Mar 2017 16:17:57 +0100
Message-ID: <20170301151757.GH12248@lst.de>
In-Reply-To: <20170228230940.GB26319@birch.djwong.org>
On Tue, Feb 28, 2017 at 03:09:40PM -0800, Darrick J. Wong wrote:
> By the way, the copy on write code remembers the extents it has
> allocated for CoW staging in the refcount btree so that it can free them
> after a crash, which means that O_ATOMIC requires reflink to be enabled.
Yeah.
> There doesn't seem to be any explicit checking that reflink is even
> enabled, which will probably just lead to weird crashes on a pre-reflink
> xfs.
True. I had this earlier when I had basic O_ATOMIC validity checking,
but that was dropped from the series I posted.
>
> FWIW I didn't see any checking anywhere (vfs or xfs) that the filesystem
> can actually support O_ATOMIC. If the FS doesn't support atomic writes,
> shouldn't the kernel send EINVAL or something back to userspace?
Older kernels can't check it, so having new ones check it creates even
more of a mess.
I'm still not feeling very well about O_ATOMIC - either we need an
open2 that checks for unknown flags, or I need to change this to
a per-op flag - RWF_ATOMIC for write (pwritev2 actually), and MAP_ATOMIC
for mmap. But given that pwritev2 isn't really supported in common
userland yet that might be rather painful.
> At the start of xfs_reflink.c is a long block comment describing how the
> copy on write mechanism works. Since O_ATOMIC is a variant on CoW (it's
> basically CoW with remapping deferred until fsync), please update the
> comment so that the comments capture the details of how atomic writes
> work.
>
> (IOWs: Dave asked me to leave the big comment, so I'm going to try to
> keep it fairly up to date.)
I'll add some information to it.
> I suppose it goes without saying that userspace will have to coordinate
> its O_ATOMIC writes to the file.
It does - but if you have multiple writers to a file they really need
to be coordinated anyway. If you have threads whose updates race
you'd need something like
	open(O_TMPFILE)
	clone file (or range) into tempfile
	update tempfile
	clone region you want atomically inserted back into the original file.
We can actually do that with existing primitives, but it's a bit more
heavyweight. We could optimize this a bit by checking if an extent
already points to the same physical blocks before replacing it in
clone_file_range.
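
For illustration, the four steps above could look roughly like this from
userspace. This is only a sketch under assumptions, not code from the
series: staged_update() and its parameters are made-up names, dir has to
be a directory on the same filesystem as fd, the filesystem has to
support O_TMPFILE and reflink, and off/len have to be aligned to the
filesystem block size (unless the range ends at EOF), as FICLONERANGE
requires. Whether the final remap is atomic across a crash is up to the
filesystem; the sketch only shows the sequence of calls.

```c
#define _GNU_SOURCE             /* for O_TMPFILE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>           /* FICLONE, FICLONERANGE, struct file_clone_range */
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace [off, off + len) of fd with buf by staging
 * the update in an unnamed temporary file and cloning the range back.
 * Returns 0 on success, -1 with errno set on failure.
 */
int staged_update(int fd, const char *dir, off_t off,
		  const void *buf, size_t len)
{
	struct file_clone_range fcr;
	ssize_t n;
	int tmp, saved;

	assert(buf != NULL && len > 0);

	/* Step 1: unnamed temporary file on the same filesystem. */
	tmp = open(dir, O_TMPFILE | O_RDWR, 0600);
	if (tmp < 0)
		return -1;

	/* Step 2: share all of the original file's extents with it. */
	if (ioctl(tmp, FICLONE, fd) < 0)
		goto fail;

	/* Step 3: update the staging copy; the original is untouched. */
	n = pwrite(tmp, buf, len, off);
	if (n < 0)
		goto fail;
	if ((size_t)n != len) {
		errno = EIO;	/* treat a short write as a failure */
		goto fail;
	}

	/* Step 4: clone the updated region back into the original file. */
	fcr.src_fd = tmp;
	fcr.src_offset = (__u64)off;
	fcr.src_length = (__u64)len;
	fcr.dest_offset = (__u64)off;
	if (ioctl(fd, FICLONERANGE, &fcr) < 0)
		goto fail;

	close(tmp);
	return 0;

fail:
	saved = errno;		/* close() may clobber errno */
	close(tmp);
	errno = saved;
	return -1;
}
```

A caller would do something like staged_update(fd, "/mnt/scratch", 0,
block, 4096) and fall back to a plain write if it fails with EOPNOTSUPP,
EINVAL or EXDEV, which is what filesystems without reflink tend to
return.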
> > + if (file->f_flags & O_ATOMIC)
> > + printk_ratelimited("O_ATOMIC!\n");
>
> Per above,
>
> if (file->f_flags & O_ATOMIC) {
> if (!xfs_sb_version_hasreflink(...))
> return -EPROTONOSUPPORT;
Yeah.
> printk_ratelimited("EXPERIMENTAL atomic writes feature in use!\n");
And that should just go away - it was a local debug aid :)