From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 5/5] xfs: introduce an always_cow mode
Date: Thu, 20 Sep 2018 13:17:02 -0700 [thread overview]
Message-ID: <20180920201702.GN20086@magnolia> (raw)
In-Reply-To: <20180920144220.2181-6-hch@lst.de>
On Thu, Sep 20, 2018 at 04:42:20PM +0200, Christoph Hellwig wrote:
> When the (debug-only) always_cow module parameter is set, we will always
> write out place, even if the file is not reflinked. In addition to being
> a useful debug aid this prepares for modes where we must always write
> out of place. An example for that is the upcoming support for atomic
> overwrites.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_aops.c | 2 +-
> fs/xfs/xfs_file.c | 2 +-
> fs/xfs/xfs_iomap.c | 8 +++----
> fs/xfs/xfs_reflink.c | 50 ++++++++++++++++++++++++++++++++------------
> fs/xfs/xfs_reflink.h | 13 ++++++++++++
> 5 files changed, 56 insertions(+), 19 deletions(-)
>
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 775cdcfe70c2..65352d52d24a 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -980,7 +980,7 @@ xfs_vm_bmap(
> * Since we don't pass back blockdev info, we can't return bmap
> * information for rt files either.
> */
> - if (xfs_is_reflink_inode(ip) || XFS_IS_REALTIME_INODE(ip))
> + if (xfs_is_cow_inode(ip) || XFS_IS_REALTIME_INODE(ip))
> return 0;
> return iomap_bmap(mapping, block, &xfs_iomap_ops);
> }
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index fa8fbc84eec8..2756fc583716 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -507,7 +507,7 @@ xfs_file_dio_aio_write(
> * We can't properly handle unaligned direct I/O to reflink
> * files yet, as we can't unshare a partial block.
> */
> - if (xfs_is_reflink_inode(ip)) {
> + if (xfs_is_cow_inode(ip)) {
> trace_xfs_reflink_bounce_dio_write(ip, iocb->ki_pos, count);
> return -EREMCHG;
> }
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 1bfc40ce581a..398c5b23a368 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -570,7 +570,7 @@ xfs_delalloc_iomap_begin(
> if (eof)
> got.br_startoff = maxbytes_fsb;
> if (got.br_startoff <= offset_fsb) {
> - if (xfs_is_reflink_inode(ip) &&
> + if (xfs_is_cow_inode(ip) &&
> ((flags & IOMAP_WRITE) ||
> got.br_state != XFS_EXT_UNWRITTEN)) {
> end_fsb = min(XFS_B_TO_FSB(mp, offset + count),
> @@ -1048,7 +1048,7 @@ xfs_ilock_for_iomap(
> * COW writes may allocate delalloc space or convert unwritten COW
> * extents, so we need to make sure to take the lock exclusively here.
> */
> - if (xfs_is_reflink_inode(ip) && is_write) {
> + if (xfs_is_cow_inode(ip) && is_write) {
> /*
> * FIXME: It could still overwrite on unshared extents and not
> * need allocation.
> @@ -1082,7 +1082,7 @@ xfs_ilock_for_iomap(
> * check, so if we got ILOCK_SHARED for a write and but we're now a
> * reflink inode we have to switch to ILOCK_EXCL and relock.
> */
> - if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_reflink_inode(ip)) {
> + if (mode == XFS_ILOCK_SHARED && is_write && xfs_is_cow_inode(ip)) {
> xfs_iunlock(ip, mode);
> mode = XFS_ILOCK_EXCL;
> goto relock;
> @@ -1151,7 +1151,7 @@ xfs_file_iomap_begin(
> * Break shared extents if necessary. Checks for non-blocking IO have
> * been done up front, so we don't need to do them here.
> */
> - if (xfs_is_reflink_inode(ip)) {
> + if (xfs_is_cow_inode(ip)) {
> error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode);
> if (error)
> goto out_unlock;
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index b25ea7a3a0e7..a893f6d65ff0 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -129,6 +129,11 @@
> * ioend, the better.
> */
>
> +#ifdef DEBUG
> +bool xfs_always_cow;
> +module_param_named(always_cow, xfs_always_cow, bool, 0644);
> +#endif
I would have expected this to end up in /proc/sys/fs/xfs/always_cow
along with all the other behavior knobs, but I guess this is a debugging
thing...?
> +
> /*
> * Given an AG extent, find the lowest-numbered run of shared blocks
> * within that range and return the range in fbno/flen. If
> @@ -192,7 +197,7 @@ xfs_reflink_trim_around_shared(
> int error = 0;
>
> /* Holes, unwritten, and delalloc extents cannot be shared */
> - if (!xfs_is_reflink_inode(ip) || !xfs_bmap_is_real_extent(irec)) {
> + if (!xfs_is_cow_inode(ip) || !xfs_bmap_is_real_extent(irec)) {
> *shared = false;
> return 0;
> }
> @@ -234,6 +239,22 @@ xfs_reflink_trim_around_shared(
> }
> }
>
> +static bool
> +xfs_inode_need_cow(
> + struct xfs_inode *ip,
> + struct xfs_bmbt_irec *imap,
> + bool *shared)
> +{
> + /* We can't update any real extents in always COW mode. */
/me struggled with this for a minute before figuring out what the
comment was trying to tell me. Is the following accurate?
"In always_cow mode, behave as if the incoming data fork imap is
completely shared." ?
--D
> + if (xfs_always_cow && !isnullstartblock(imap->br_startblock)) {
> + *shared = true;
> + return 0;
> + }
> +
> + /* Trim the mapping to the nearest shared extent boundary. */
> + return xfs_reflink_trim_around_shared(ip, imap, shared);
> +}
> +
> /*
> * Trim the passed in imap to the next shared/unshared extent boundary, and
> * if imap->br_startoff points to a shared extent reserve space for it in the
> @@ -248,13 +269,17 @@ xfs_reflink_reserve_cow(
> struct xfs_inode *ip,
> struct xfs_bmbt_irec *imap)
> {
> - struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
> struct xfs_bmbt_irec got;
> int error = 0;
> bool eof = false;
> struct xfs_iext_cursor icur;
> bool shared;
>
> + if (!ip->i_cowfp) {
> + ASSERT(!xfs_is_reflink_inode(ip));
> + xfs_ifork_init_cow(ip);
> + }
> +
> /*
> * Search the COW fork extent list first. This serves two purposes:
> * first this implement the speculative preallocation using cowextisze,
> @@ -263,8 +288,8 @@ xfs_reflink_reserve_cow(
> * extent list is generally faster than going out to the shared extent
> * tree.
> */
> -
> - if (!xfs_iext_lookup_extent(ip, ifp, imap->br_startoff, &icur, &got))
> + if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, imap->br_startoff, &icur,
> + &got))
> eof = true;
> if (!eof && got.br_startoff <= imap->br_startoff) {
> trace_xfs_reflink_cow_found(ip, imap);
> @@ -273,14 +298,10 @@ xfs_reflink_reserve_cow(
> }
>
> /* Trim the mapping to the nearest shared extent boundary. */
> - error = xfs_reflink_trim_around_shared(ip, imap, &shared);
> - if (error)
> + error = xfs_inode_need_cow(ip, imap, &shared);
> + if (error || !shared)
> return error;
>
> - /* Not shared? Just report the (potentially capped) extent. */
> - if (!shared)
> - return 0;
> -
> /*
> * Fork all the shared blocks from our write offset until the end of
> * the extent.
> @@ -374,7 +395,7 @@ xfs_find_trim_cow_extent(
> if (got.br_startoff > offset_fsb) {
> xfs_trim_extent(imap, imap->br_startoff,
> got.br_startoff - imap->br_startoff);
> - return xfs_reflink_trim_around_shared(ip, imap, shared);
> + return xfs_inode_need_cow(ip, imap, shared);
> }
>
> *shared = true;
> @@ -408,7 +429,10 @@ xfs_reflink_allocate_cow(
> xfs_extlen_t resblks = 0;
>
> ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> - ASSERT(xfs_is_reflink_inode(ip));
> + if (!ip->i_cowfp) {
> + ASSERT(!xfs_is_reflink_inode(ip));
> + xfs_ifork_init_cow(ip);
> + }
>
> error = xfs_find_trim_cow_extent(ip, imap, shared, &found);
> if (error || !*shared)
> @@ -585,7 +609,7 @@ xfs_reflink_cancel_cow_range(
> int error;
>
> trace_xfs_reflink_cancel_cow_range(ip, offset, count);
> - ASSERT(xfs_is_reflink_inode(ip));
> + ASSERT(ip->i_cowfp);
>
> offset_fsb = XFS_B_TO_FSBT(ip->i_mount, offset);
> if (count == NULLFILEOFF)
> diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h
> index 7f47202b5639..b89e07239b82 100644
> --- a/fs/xfs/xfs_reflink.h
> +++ b/fs/xfs/xfs_reflink.h
> @@ -6,6 +6,19 @@
> #ifndef __XFS_REFLINK_H
> #define __XFS_REFLINK_H 1
>
> +#ifdef DEBUG
> +extern bool xfs_always_cow;
> +#else
> +#define xfs_always_cow false
> +#endif
> +
> +static inline bool xfs_is_cow_inode(struct xfs_inode *ip)
> +{
> + return xfs_is_reflink_inode(ip) ||
> + (xfs_always_cow &&
> + xfs_sb_version_hasreflink(&ip->i_mount->m_sb));
> +}
> +
> extern int xfs_reflink_find_shared(struct xfs_mount *mp, struct xfs_trans *tp,
> xfs_agnumber_t agno, xfs_agblock_t agbno, xfs_extlen_t aglen,
> xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_maximal);
> --
> 2.18.0
>
next prev parent reply other threads:[~2018-09-21 2:02 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-20 14:42 more reflink fixes & debug Christoph Hellwig
2018-09-20 14:42 ` [PATCH 1/5] xfs: cancel COW blocks before swapext Christoph Hellwig
2018-09-20 20:04 ` Darrick J. Wong
2018-09-27 15:07 ` Brian Foster
2018-09-30 22:50 ` Christoph Hellwig
2018-10-01 11:03 ` Brian Foster
2018-09-20 14:42 ` [PATCH 2/5] xfs: skip delalloc COW blocks in xfs_reflink_end_cow Christoph Hellwig
2018-09-20 20:22 ` Darrick J. Wong
2018-09-20 14:42 ` [PATCH 3/5] xfs: fix fork selection in xfs_find_trim_cow_extent Christoph Hellwig
2018-09-20 14:42 ` [PATCH 4/5] xfs: print dangling delalloc extents Christoph Hellwig
2018-09-20 20:06 ` Darrick J. Wong
2018-09-27 15:07 ` Brian Foster
2018-09-20 14:42 ` [PATCH 5/5] xfs: introduce an always_cow mode Christoph Hellwig
2018-09-20 20:17 ` Darrick J. Wong [this message]
2018-09-20 21:23 ` Dave Chinner
2018-09-21 5:23 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180920201702.GN20086@magnolia \
--to=darrick.wong@oracle.com \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox