From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: david@fromorbit.com, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 10/63] xfs: create refcount update intent log items
Date: Wed, 28 Sep 2016 12:20:18 -0400
Message-ID: <20160928162017.GE8852@bfoster.bfoster>
In-Reply-To: <147503127360.30303.13509008550712587655.stgit@birch.djwong.org>
On Tue, Sep 27, 2016 at 07:54:33PM -0700, Darrick J. Wong wrote:
> Create refcount update intent/done log items to record redo
> information in the log. Because we need to roll transactions between
> updating the bmbt mapping and updating the reverse mapping, we also
> have to track the status of the metadata updates that will be recorded
> in the post-roll transactions, just in case we crash before committing
> the final transaction. This mechanism enables log recovery to finish
> what was already started.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/Makefile | 1
> fs/xfs/libxfs/xfs_log_format.h | 59 ++++++
> fs/xfs/xfs_refcount_item.c | 406 ++++++++++++++++++++++++++++++++++++++++
> fs/xfs/xfs_refcount_item.h | 102 ++++++++++
> fs/xfs/xfs_super.c | 18 ++
> 5 files changed, 584 insertions(+), 2 deletions(-)
> create mode 100644 fs/xfs/xfs_refcount_item.c
> create mode 100644 fs/xfs/xfs_refcount_item.h
>
>
...
> diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
> new file mode 100644
> index 0000000..ac52b02
> --- /dev/null
> +++ b/fs/xfs/xfs_refcount_item.c
> @@ -0,0 +1,406 @@
...
> +/*
> + * This is called to fill in the vector of log iovecs for the
> + * given cud log item. We use only 1 iovec, and we point that
> + * at the cud_log_format structure embedded in the cud item.
> + * It is at this point that we assert that all of the extent
> + * slots in the cud item have been filled.
> + */
> +STATIC void
> +xfs_cud_item_format(
> + struct xfs_log_item *lip,
> + struct xfs_log_vec *lv)
> +{
> + struct xfs_cud_log_item *cudp = CUD_ITEM(lip);
> + struct xfs_log_iovec *vecp = NULL;
> +
> + cudp->cud_format.cud_type = XFS_LI_CUD;
> + cudp->cud_format.cud_size = 1;
> +
> + xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_CUD_FORMAT, &cudp->cud_format,
> + sizeof(struct xfs_rud_log_format));
They're the same size, so this works as written, but it should be
sizeof(struct xfs_cud_log_format).
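To illustrate why this is harmless today but still worth fixing, here is a
minimal sketch. The two structs below (demo_rud_format, demo_cud_format) are
made-up stand-ins with matching layouts, not the real log format definitions.
Taking the size from the object being copied avoids naming a type at all, and
a compile-time check catches the layouts drifting apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the real rud/cud log format structures. */
struct demo_rud_format {
	uint16_t	type;
	uint16_t	size;
	uint32_t	pad;
	uint64_t	intent_id;
};

struct demo_cud_format {
	uint16_t	type;
	uint16_t	size;
	uint32_t	pad;
	uint64_t	intent_id;
};

/* Compile-time guard: the build fails if the two layouts ever diverge. */
_Static_assert(sizeof(struct demo_rud_format) ==
	       sizeof(struct demo_cud_format),
	       "demo rud/cud formats differ in size");

/* Length taken from the object itself, so it cannot name the wrong type. */
static size_t demo_cud_format_len(const struct demo_cud_format *fmt)
{
	return sizeof(*fmt);
}
```

With the wrong type name, the copy length would silently follow the other
structure if only one of them grew a field later.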
Brian
> +}
> +
> +/*
> + * Pinning has no meaning for a cud item, so just return.
> + */
> +STATIC void
> +xfs_cud_item_pin(
> + struct xfs_log_item *lip)
> +{
> +}
> +
> +/*
> + * Since pinning has no meaning for a cud item, unpinning does
> + * not either.
> + */
> +STATIC void
> +xfs_cud_item_unpin(
> + struct xfs_log_item *lip,
> + int remove)
> +{
> +}
> +
> +/*
> + * There isn't much you can do to push on a cud item. It is simply stuck
> + * waiting for the log to be flushed to disk.
> + */
> +STATIC uint
> +xfs_cud_item_push(
> + struct xfs_log_item *lip,
> + struct list_head *buffer_list)
> +{
> + return XFS_ITEM_PINNED;
> +}
> +
> +/*
> + * The CUD is either committed or aborted if the transaction is cancelled. If
> + * the transaction is cancelled, drop our reference to the CUI and free the
> + * CUD.
> + */
> +STATIC void
> +xfs_cud_item_unlock(
> + struct xfs_log_item *lip)
> +{
> + struct xfs_cud_log_item *cudp = CUD_ITEM(lip);
> +
> + if (lip->li_flags & XFS_LI_ABORTED) {
> + xfs_cui_release(cudp->cud_cuip);
> + kmem_zone_free(xfs_cud_zone, cudp);
> + }
> +}
> +
> +/*
> + * When the cud item is committed to disk, all we need to do is delete our
> + * reference to our partner cui item and then free ourselves. Since we're
> + * freeing ourselves we must return -1 to keep the transaction code from
> + * further referencing this item.
> + */
> +STATIC xfs_lsn_t
> +xfs_cud_item_committed(
> + struct xfs_log_item *lip,
> + xfs_lsn_t lsn)
> +{
> + struct xfs_cud_log_item *cudp = CUD_ITEM(lip);
> +
> + /*
> + * Drop the CUI reference regardless of whether the CUD has been
> + * aborted. Once the CUD transaction is constructed, it is the sole
> + * responsibility of the CUD to release the CUI (even if the CUI is
> + * aborted due to log I/O error).
> + */
> + xfs_cui_release(cudp->cud_cuip);
> + kmem_zone_free(xfs_cud_zone, cudp);
> +
> + return (xfs_lsn_t)-1;
> +}
> +
> +/*
> + * The CUD dependency tracking op doesn't do squat. It can't because
> + * it doesn't know where the free extent is coming from. The dependency
> + * tracking has to be handled by the "enclosing" metadata object. For
> + * example, for inodes, the inode is locked throughout the extent freeing
> + * so the dependency should be recorded there.
> + */
> +STATIC void
> +xfs_cud_item_committing(
> + struct xfs_log_item *lip,
> + xfs_lsn_t lsn)
> +{
> +}
> +
> +/*
> + * This is the ops vector shared by all cud log items.
> + */
> +static const struct xfs_item_ops xfs_cud_item_ops = {
> + .iop_size = xfs_cud_item_size,
> + .iop_format = xfs_cud_item_format,
> + .iop_pin = xfs_cud_item_pin,
> + .iop_unpin = xfs_cud_item_unpin,
> + .iop_unlock = xfs_cud_item_unlock,
> + .iop_committed = xfs_cud_item_committed,
> + .iop_push = xfs_cud_item_push,
> + .iop_committing = xfs_cud_item_committing,
> +};
> +
> +/*
> + * Allocate and initialize a cud item with the given number of extents.
> + */
> +struct xfs_cud_log_item *
> +xfs_cud_init(
> + struct xfs_mount *mp,
> + struct xfs_cui_log_item *cuip)
> +
> +{
> + struct xfs_cud_log_item *cudp;
> +
> + cudp = kmem_zone_zalloc(xfs_cud_zone, KM_SLEEP);
> + xfs_log_item_init(mp, &cudp->cud_item, XFS_LI_CUD, &xfs_cud_item_ops);
> + cudp->cud_cuip = cuip;
> + cudp->cud_format.cud_cui_id = cuip->cui_format.cui_id;
> +
> + return cudp;
> +}
> diff --git a/fs/xfs/xfs_refcount_item.h b/fs/xfs/xfs_refcount_item.h
> new file mode 100644
> index 0000000..7b8f56b
> --- /dev/null
> +++ b/fs/xfs/xfs_refcount_item.h
> @@ -0,0 +1,102 @@
> +/*
> + * Copyright (C) 2016 Oracle. All Rights Reserved.
> + *
> + * Author: Darrick J. Wong <darrick.wong@oracle.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it would be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write the Free Software Foundation,
> + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
> + */
> +#ifndef __XFS_REFCOUNT_ITEM_H__
> +#define __XFS_REFCOUNT_ITEM_H__
> +
> +/*
> + * There are (currently) two pairs of refcount btree redo item types:
> + * increase and decrease. The log items for these are CUI (refcount
> + * update intent) and CUD (refcount update done). The redo item type
> + * is encoded in the flags field of each xfs_map_extent.
> + *
> + * *I items should be recorded in the *first* of a series of rolled
> + * transactions, and the *D items should be recorded in the same
> + * transaction that records the associated refcountbt updates.
> + *
> + * Should the system crash after the commit of the first transaction
> + * but before the commit of the final transaction in a series, log
> + * recovery will use the redo information recorded by the intent items
> + * to replay the refcountbt metadata updates.
> + */
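As a rough illustration of the intent/done pairing described in the comment
above, here is a toy sketch. Everything in it (demo_log_rec,
demo_find_unfinished, the flat array standing in for the log) is hypothetical
and far simpler than real log recovery. It only shows the core idea: an intent
with no matching done record is work that recovery still has to finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_rec_type { DEMO_INTENT, DEMO_DONE };

/* Hypothetical log record; the id ties a done record to its intent. */
struct demo_log_rec {
	enum demo_rec_type	type;
	uint64_t		id;
};

/* Return true if a done record for 'id' appears in log[start..nr). */
static bool demo_has_done(const struct demo_log_rec *log, size_t nr,
			  size_t start, uint64_t id)
{
	for (size_t i = start; i < nr; i++)
		if (log[i].type == DEMO_DONE && log[i].id == id)
			return true;
	return false;
}

/*
 * Collect the ids of intents that have no later done record. These are
 * the updates a recovery pass would need to replay. Returns the number
 * of ids stored in 'out'.
 */
static size_t demo_find_unfinished(const struct demo_log_rec *log, size_t nr,
				   uint64_t *out, size_t out_max)
{
	size_t found = 0;

	for (size_t i = 0; i < nr; i++) {
		if (log[i].type != DEMO_INTENT)
			continue;
		if (demo_has_done(log, nr, i + 1, log[i].id))
			continue;
		assert(found < out_max);
		out[found++] = log[i].id;
	}
	return found;
}
```

For a log holding intent 1, intent 2, done 1, intent 3, the function reports
ids 2 and 3 as unfinished.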
> +
> +/* kernel only CUI/CUD definitions */
> +
> +struct xfs_mount;
> +struct kmem_zone;
> +
> +/*
> + * Max number of extents in fast allocation path.
> + */
> +#define XFS_CUI_MAX_FAST_EXTENTS 16
> +
> +/*
> + * Define CUI flag bits. Manipulated by set/clear/test_bit operators.
> + */
> +#define XFS_CUI_RECOVERED 1
> +
> +/*
> + * This is the "refcount update intent" log item. It is used to log
> + * the fact that some refcount records need to change. It is used in
> + * conjunction with the "refcount update done" log item described
> + * below.
> + *
> + * These log items follow the same rules as struct xfs_efi_log_item;
> + * see the comments about that structure (in xfs_extfree_item.h) for
> + * more details.
> + */
> +struct xfs_cui_log_item {
> + struct xfs_log_item cui_item;
> + atomic_t cui_refcount;
> + atomic_t cui_next_extent;
> + unsigned long cui_flags; /* misc flags */
> + struct xfs_cui_log_format cui_format;
> +};
> +
> +static inline size_t
> +xfs_cui_log_item_sizeof(
> + unsigned int nr)
> +{
> + return offsetof(struct xfs_cui_log_item, cui_format) +
> + xfs_cui_log_format_sizeof(nr);
> +}
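The helper above depends on xfs_cui_log_format_sizeof(), which is not part of
this excerpt. As a sketch of the general sizing idiom only, assuming a format
header followed by a variable number of extent slots, something like the
following computes the allocation size. The demo_* types and functions are
invented for illustration and do not match the real on-disk layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical extent record; not the real on-disk extent structure. */
struct demo_extent {
	uint64_t	startblock;
	uint32_t	len;
	uint32_t	flags;
};

/* Hypothetical log format with a trailing variable-length array. */
struct demo_format {
	uint16_t		type;
	uint16_t		size;
	uint32_t		nextents;
	uint64_t		id;
	struct demo_extent	extents[];	/* flexible array member */
};

/* Bytes needed for the format header followed by nr extent slots. */
static size_t demo_format_sizeof(unsigned int nr)
{
	return offsetof(struct demo_format, extents) +
	       (size_t)nr * sizeof(struct demo_extent);
}

/* Allocate a zeroed format with room for nr extents; NULL on failure. */
static struct demo_format *demo_format_alloc(unsigned int nr)
{
	struct demo_format *fmt = calloc(1, demo_format_sizeof(nr));

	if (fmt)
		fmt->nextents = nr;
	return fmt;
}
```

Sizing the zone for a fixed maximum (16 slots in the patch) lets the common
case come from a slab cache, with larger items presumably allocated
separately.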
> +
> +/*
> + * This is the "refcount update done" log item. It is used to log the
> + * fact that some refcountbt updates mentioned in an earlier cui item
> + * have been performed.
> + */
> +struct xfs_cud_log_item {
> + struct xfs_log_item cud_item;
> + struct xfs_cui_log_item *cud_cuip;
> + struct xfs_cud_log_format cud_format;
> +};
> +
> +extern struct kmem_zone *xfs_cui_zone;
> +extern struct kmem_zone *xfs_cud_zone;
> +
> +struct xfs_cui_log_item *xfs_cui_init(struct xfs_mount *, uint);
> +struct xfs_cud_log_item *xfs_cud_init(struct xfs_mount *,
> + struct xfs_cui_log_item *);
> +int xfs_cui_copy_format(struct xfs_log_iovec *buf,
> + struct xfs_cui_log_format *dst_cui_fmt);
> +void xfs_cui_item_free(struct xfs_cui_log_item *);
> +void xfs_cui_release(struct xfs_cui_log_item *);
> +
> +#endif /* __XFS_REFCOUNT_ITEM_H__ */
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 2d092f9..abe69c6 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -47,6 +47,7 @@
> #include "xfs_sysfs.h"
> #include "xfs_ondisk.h"
> #include "xfs_rmap_item.h"
> +#include "xfs_refcount_item.h"
>
> #include <linux/namei.h>
> #include <linux/init.h>
> @@ -1788,8 +1789,23 @@ xfs_init_zones(void)
> if (!xfs_rui_zone)
> goto out_destroy_rud_zone;
>
> + xfs_cud_zone = kmem_zone_init(sizeof(struct xfs_cud_log_item),
> + "xfs_cud_item");
> + if (!xfs_cud_zone)
> + goto out_destroy_rui_zone;
> +
> + xfs_cui_zone = kmem_zone_init(
> + xfs_cui_log_item_sizeof(XFS_CUI_MAX_FAST_EXTENTS),
> + "xfs_cui_item");
> + if (!xfs_cui_zone)
> + goto out_destroy_cud_zone;
> +
> return 0;
>
> + out_destroy_cud_zone:
> + kmem_zone_destroy(xfs_cud_zone);
> + out_destroy_rui_zone:
> + kmem_zone_destroy(xfs_rui_zone);
> out_destroy_rud_zone:
> kmem_zone_destroy(xfs_rud_zone);
> out_destroy_icreate_zone:
> @@ -1832,6 +1848,8 @@ xfs_destroy_zones(void)
> * destroy caches.
> */
> rcu_barrier();
> + kmem_zone_destroy(xfs_cui_zone);
> + kmem_zone_destroy(xfs_cud_zone);
> kmem_zone_destroy(xfs_rui_zone);
> kmem_zone_destroy(xfs_rud_zone);
> kmem_zone_destroy(xfs_icreate_zone);
>
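The zone setup in the hunk above follows the usual goto-unwind ladder: each
failed allocation jumps to a label that tears down everything created before
it, in reverse order. A stand-alone sketch of that pattern, using malloc-backed
dummy caches and an injected failure point (all demo_* names are hypothetical),
might look like this:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int demo_live;	/* number of demo caches currently allocated */
static void *demo_a, *demo_b, *demo_c;

/* Hypothetical cache constructor; returns NULL when 'fail' is nonzero. */
static void *demo_cache_create(int fail)
{
	void *p = fail ? NULL : malloc(1);

	if (p)
		demo_live++;
	return p;
}

static void demo_cache_destroy(void *p)
{
	if (p) {
		free(p);
		demo_live--;
	}
}

/* fail_at: 0 means no failure, 1..3 makes that allocation fail. */
static int demo_init_caches(int fail_at)
{
	demo_a = demo_cache_create(fail_at == 1);
	if (!demo_a)
		goto out;

	demo_b = demo_cache_create(fail_at == 2);
	if (!demo_b)
		goto out_destroy_a;

	demo_c = demo_cache_create(fail_at == 3);
	if (!demo_c)
		goto out_destroy_b;

	return 0;

 out_destroy_b:
	demo_cache_destroy(demo_b);
 out_destroy_a:
	demo_cache_destroy(demo_a);
 out:
	return -ENOMEM;
}

/* Normal teardown mirrors the ladder: newest cache first. */
static void demo_destroy_caches(void)
{
	demo_cache_destroy(demo_c);
	demo_cache_destroy(demo_b);
	demo_cache_destroy(demo_a);
}
```

Each label is named after what it destroys, so adding a new cache means adding
one allocation, one label, and retargeting the goto of the step that follows.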