From: Chandan Babu R <chandan.babu@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 07/16] xfs: reserve space and initialise xlog_op_header in item formatting
Date: Mon, 22 Nov 2021 17:09:06 +0530 [thread overview]
Message-ID: <87zgpwv2qt.fsf@debian-BULLSEYE-live-builder-AMD64> (raw)
In-Reply-To: <20211118231352.2051947-8-david@fromorbit.com>
On 19 Nov 2021 at 04:43, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Current xlog_write() adds op headers to the log manually for every
> log item region that is in the vector passed to it. While
> xlog_write() needs to stamp the transaction ID into the ophdr, we
> already know it's length, flags, clientid, etc at CIL commit time.
>
> This means the only time that xlog write really needs to format and
> reserve space for a new ophdr is when a region is split across two
> iclogs. Adding the opheader and accounting for it as part of the
> normal formatted item region means we simplify the accounting
> of space used by a transaction and we don't have to special case
> reserving of space in for the ophdrs in xlog_write(). It also means
> we can largely initialise the ophdr in transaction commit instead
> of xlog_write, making the xlog_write formatting inner loop much
> tighter.
>
> xlog_prepare_iovec() is now too large to stay as an inline function,
> so we move it out of line and into xfs_log.c.
>
> Object sizes:
> text data bss dec hex filename
> 1125934 305951 484 1432369 15db31 fs/xfs/built-in.a.before
> 1123360 305951 484 1429795 15d123 fs/xfs/built-in.a.after
>
> So the code is a roughly 2.5kB smaller with xlog_prepare_iovec() now
> out of line, even though it grew in size itself.
>
I verified the following,
1. For embedded op headers, xlog_cil_alloc_shadow_bufs() correctly computes
the space required to hold a region.
2. The space actually used by each iovec (plus space required
by an op header) is correctly updated by xlog_finish_iovec().
3. op header's oh_len is correctly updated in both partial and non-partial
writes of a region to an iclog.
Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_log.c | 115 +++++++++++++++++++++++++++++--------------
> fs/xfs/xfs_log.h | 42 +++-------------
> fs/xfs/xfs_log_cil.c | 25 +++++-----
> 3 files changed, 99 insertions(+), 83 deletions(-)
>
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index f09663d3664b..0923bee8d4e2 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -90,6 +90,62 @@ xlog_iclogs_empty(
> static int
> xfs_log_cover(struct xfs_mount *);
>
> +/*
> + * We need to make sure the buffer pointer returned is naturally aligned for the
> + * biggest basic data type we put into it. We have already accounted for this
> + * padding when sizing the buffer.
> + *
> + * However, this padding does not get written into the log, and hence we have to
> + * track the space used by the log vectors separately to prevent log space hangs
> + * due to inaccurate accounting (i.e. a leak) of the used log space through the
> + * CIL context ticket.
> + *
> + * We also add space for the xlog_op_header that describes this region in the
> + * log. This prepends the data region we return to the caller to copy their data
> + * into, so do all the static initialisation of the ophdr now. Because the ophdr
> + * is not 8 byte aligned, we have to be careful to ensure that we align the
> + * start of the buffer such that the region we return to the call is 8 byte
> + * aligned and packed against the tail of the ophdr.
> + */
> +void *
> +xlog_prepare_iovec(
> + struct xfs_log_vec *lv,
> + struct xfs_log_iovec **vecp,
> + uint type)
> +{
> + struct xfs_log_iovec *vec = *vecp;
> + struct xlog_op_header *oph;
> + uint32_t len;
> + void *buf;
> +
> + if (vec) {
> + ASSERT(vec - lv->lv_iovecp < lv->lv_niovecs);
> + vec++;
> + } else {
> + vec = &lv->lv_iovecp[0];
> + }
> +
> + len = lv->lv_buf_len + sizeof(struct xlog_op_header);
> + if (!IS_ALIGNED(len, sizeof(uint64_t))) {
> + lv->lv_buf_len = round_up(len, sizeof(uint64_t)) -
> + sizeof(struct xlog_op_header);
> + }
> +
> + vec->i_type = type;
> + vec->i_addr = lv->lv_buf + lv->lv_buf_len;
> +
> + oph = vec->i_addr;
> + oph->oh_clientid = XFS_TRANSACTION;
> + oph->oh_res2 = 0;
> + oph->oh_flags = 0;
> +
> + buf = vec->i_addr + sizeof(struct xlog_op_header);
> + ASSERT(IS_ALIGNED((unsigned long)buf, sizeof(uint64_t)));
> +
> + *vecp = vec;
> + return buf;
> +}
> +
> static void
> xlog_grant_sub_space(
> struct xlog *log,
> @@ -2246,9 +2302,9 @@ xlog_print_trans(
> }
>
> /*
> - * Calculate the potential space needed by the log vector. If this is a start
> - * transaction, the caller has already accounted for both opheaders in the start
> - * transaction, so we don't need to account for them here.
> + * Calculate the potential space needed by the log vector. All regions contain
> + * their own opheaders and they are accounted for in region space so we don't
> + * need to add them to the vector length here.
> */
> static int
> xlog_write_calc_vec_length(
> @@ -2275,18 +2331,7 @@ xlog_write_calc_vec_length(
> xlog_tic_add_region(ticket, vecp->i_len, vecp->i_type);
> }
> }
> -
> - /* Don't account for regions with embedded ophdrs */
> - if (optype && headers > 0) {
> - headers--;
> - if (optype & XLOG_START_TRANS) {
> - ASSERT(headers >= 1);
> - headers--;
> - }
> - }
> -
> ticket->t_res_num_ophdrs += headers;
> - len += headers * sizeof(struct xlog_op_header);
>
> return len;
> }
> @@ -2296,7 +2341,6 @@ xlog_write_setup_ophdr(
> struct xlog_op_header *ophdr,
> struct xlog_ticket *ticket)
> {
> - ophdr->oh_tid = cpu_to_be32(ticket->t_tid);
> ophdr->oh_clientid = XFS_TRANSACTION;
> ophdr->oh_res2 = 0;
> ophdr->oh_flags = 0;
> @@ -2514,21 +2558,25 @@ xlog_write(
> ASSERT((unsigned long)ptr % sizeof(int32_t) == 0);
>
> /*
> - * The XLOG_START_TRANS has embedded ophdrs for the
> - * start record and transaction header. They will always
> - * be the first two regions in the lv chain. Commit and
> - * unmount records also have embedded ophdrs.
> + * Regions always have their ophdr at the start of the
> + * region, except for:
> + * - a transaction start which has a start record ophdr
> + * before the first region ophdr; and
> + * - the previous region didn't fully fit into an iclog
> + * so needs a continuation ophdr to prepend the region
> + * in this new iclog.
> */
> - if (optype) {
> - ophdr = reg->i_addr;
> - if (index)
> - optype &= ~XLOG_START_TRANS;
> - } else {
> + ophdr = reg->i_addr;
> + if (optype && index) {
> + optype &= ~XLOG_START_TRANS;
> + } else if (partial_copy) {
> ophdr = xlog_write_setup_ophdr(ptr, ticket);
> xlog_write_adv_cnt(&ptr, &len, &log_offset,
> sizeof(struct xlog_op_header));
> added_ophdr = true;
> }
> + ophdr->oh_tid = cpu_to_be32(ticket->t_tid);
> +
> len += xlog_write_setup_copy(ticket, ophdr,
> iclog->ic_size-log_offset,
> reg->i_len,
> @@ -2546,20 +2594,11 @@ xlog_write(
> ophdr->oh_len = cpu_to_be32(copy_len -
> sizeof(struct xlog_op_header));
> }
> - /*
> - * Copy region.
> - *
> - * Commit records just log an opheader, so
> - * we can have empty payloads with no data region to
> - * copy. Hence we only copy the payload if the vector
> - * says it has data to copy.
> - */
> - ASSERT(copy_len >= 0);
> - if (copy_len > 0) {
> - memcpy(ptr, reg->i_addr + copy_off, copy_len);
> - xlog_write_adv_cnt(&ptr, &len, &log_offset,
> - copy_len);
> - }
> +
> + ASSERT(copy_len > 0);
> + memcpy(ptr, reg->i_addr + copy_off, copy_len);
> + xlog_write_adv_cnt(&ptr, &len, &log_offset, copy_len);
> +
> if (added_ophdr)
> copy_len += sizeof(struct xlog_op_header);
> record_cnt++;
> diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> index d1fc43476166..816f44d7dc81 100644
> --- a/fs/xfs/xfs_log.h
> +++ b/fs/xfs/xfs_log.h
> @@ -21,44 +21,18 @@ struct xfs_log_vec {
>
> #define XFS_LOG_VEC_ORDERED (-1)
>
> -/*
> - * We need to make sure the buffer pointer returned is naturally aligned for the
> - * biggest basic data type we put into it. We have already accounted for this
> - * padding when sizing the buffer.
> - *
> - * However, this padding does not get written into the log, and hence we have to
> - * track the space used by the log vectors separately to prevent log space hangs
> - * due to inaccurate accounting (i.e. a leak) of the used log space through the
> - * CIL context ticket.
> - */
> -static inline void *
> -xlog_prepare_iovec(struct xfs_log_vec *lv, struct xfs_log_iovec **vecp,
> - uint type)
> -{
> - struct xfs_log_iovec *vec = *vecp;
> -
> - if (vec) {
> - ASSERT(vec - lv->lv_iovecp < lv->lv_niovecs);
> - vec++;
> - } else {
> - vec = &lv->lv_iovecp[0];
> - }
> -
> - if (!IS_ALIGNED(lv->lv_buf_len, sizeof(uint64_t)))
> - lv->lv_buf_len = round_up(lv->lv_buf_len, sizeof(uint64_t));
> -
> - vec->i_type = type;
> - vec->i_addr = lv->lv_buf + lv->lv_buf_len;
> -
> - ASSERT(IS_ALIGNED((unsigned long)vec->i_addr, sizeof(uint64_t)));
> -
> - *vecp = vec;
> - return vec->i_addr;
> -}
> +void *xlog_prepare_iovec(struct xfs_log_vec *lv, struct xfs_log_iovec **vecp,
> + uint type);
>
> static inline void
> xlog_finish_iovec(struct xfs_log_vec *lv, struct xfs_log_iovec *vec, int len)
> {
> + struct xlog_op_header *oph = vec->i_addr;
> +
> + /* opheader tracks payload length, logvec tracks region length */
> + oph->oh_len = cpu_to_be32(len);
> +
> + len += sizeof(struct xlog_op_header);
> lv->lv_buf_len += len;
> lv->lv_bytes += len;
> vec->i_len = len;
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index 29cf2d5d0b96..90a0e9b9d3e0 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -181,13 +181,20 @@ xlog_cil_alloc_shadow_bufs(
> }
>
> /*
> - * We 64-bit align the length of each iovec so that the start
> - * of the next one is naturally aligned. We'll need to
> - * account for that slack space here. Then round nbytes up
> - * to 64-bit alignment so that the initial buffer alignment is
> - * easy to calculate and verify.
> + * We 64-bit align the length of each iovec so that the start of
> + * the next one is naturally aligned. We'll need to account for
> + * that slack space here.
> + *
> + * We also add the xlog_op_header to each region when
> + * formatting, but that's not accounted to the size of the item
> + * at this point. Hence we'll need an addition number of bytes
> + * for each vector to hold an opheader.
> + *
> + * Then round nbytes up to 64-bit alignment so that the initial
> + * buffer alignment is easy to calculate and verify.
> */
> - nbytes += niovecs * sizeof(uint64_t);
> + nbytes += niovecs *
> + (sizeof(uint64_t) + sizeof(struct xlog_op_header));
> nbytes = round_up(nbytes, sizeof(uint64_t));
>
> /*
> @@ -441,11 +448,6 @@ xlog_cil_insert_items(
>
> spin_lock(&cil->xc_cil_lock);
>
> - /* account for space used by new iovec headers */
> - iovhdr_res = diff_iovecs * sizeof(xlog_op_header_t);
> - len += iovhdr_res;
> - ctx->nvecs += diff_iovecs;
> -
> /* attach the transaction to the CIL if it has any busy extents */
> if (!list_empty(&tp->t_busy))
> list_splice_init(&tp->t_busy, &ctx->busy_extents);
> @@ -477,6 +479,7 @@ xlog_cil_insert_items(
> }
> tp->t_ticket->t_curr_res -= len;
> ctx->space_used += len;
> + ctx->nvecs += diff_iovecs;
>
> /*
> * If we've overrun the reservation, dump the tx details before we move
--
chandan
next prev parent reply other threads:[~2021-11-22 11:39 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-18 23:13 [PATCH 00/16 v7] xfs: rework xlog_write() Dave Chinner
2021-11-18 23:13 ` [PATCH 01/16] xfs: factor out the CIL transaction header building Dave Chinner
2021-11-22 10:48 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 02/16] xfs: only CIL pushes require a start record Dave Chinner
2021-11-22 11:29 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 03/16] xfs: embed the xlog_op_header in the unmount record Dave Chinner
2021-11-22 11:30 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 04/16] xfs: embed the xlog_op_header in the commit record Dave Chinner
2021-11-22 11:31 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 05/16] xfs: log tickets don't need log client id Dave Chinner
2021-11-22 11:37 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 06/16] xfs: move log iovec alignment to preparation function Dave Chinner
2021-11-22 11:37 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 07/16] xfs: reserve space and initialise xlog_op_header in item formatting Dave Chinner
2021-11-22 11:39 ` Chandan Babu R [this message]
2021-11-18 23:13 ` [PATCH 08/16] xfs: log ticket region debug is largely useless Dave Chinner
2021-11-22 11:39 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 09/16] xfs: pass lv chain length into xlog_write() Dave Chinner
2021-11-22 11:39 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 10/16] xfs: change the type of ic_datap Dave Chinner
2021-11-18 23:13 ` [PATCH 11/16] xfs: introduce xlog_write_full() Dave Chinner
2021-11-22 11:40 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 12/16] xfs: introduce xlog_write_partial() Dave Chinner
2021-11-23 16:13 ` Chandan Babu R
2021-11-23 21:47 ` Darrick J. Wong
2021-11-18 23:13 ` [PATCH 13/16] xfs: remove xlog_verify_dest_ptr Dave Chinner
2021-11-18 23:13 ` [PATCH 14/16] xfs: xlog_write() no longer needs contwr state Dave Chinner
2021-11-24 8:54 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 15/16] xfs: xlog_write() doesn't need optype anymore Dave Chinner
2021-11-24 8:55 ` Chandan Babu R
2021-11-18 23:13 ` [PATCH 16/16] xfs: CIL context doesn't need to count iovecs Dave Chinner
2021-11-24 8:55 ` Chandan Babu R
-- strict thread matches above, loose matches on Subject: below --
2022-03-09 5:29 [PATCH 00/16 v8] xfs: rework xlog_write() Dave Chinner
2022-03-09 5:29 ` [PATCH 07/16] xfs: reserve space and initialise xlog_op_header in item formatting Dave Chinner
2021-11-09 1:50 [PATCH 00/16 v6] xfs: rework xlog_write() Dave Chinner
2021-11-09 1:50 ` [PATCH 07/16] xfs: reserve space and initialise xlog_op_header in item formatting Dave Chinner
2021-11-11 8:10 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zgpwv2qt.fsf@debian-BULLSEYE-live-builder-AMD64 \
--to=chandan.babu@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox