From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 10/10] xfs: refactor INUMBERS to use iwalk functions
Date: Tue, 11 Jun 2019 09:21:28 -0700 [thread overview]
Message-ID: <20190611162127.GT1871505@magnolia> (raw)
In-Reply-To: <20190611150851.GC10942@bfoster>
On Tue, Jun 11, 2019 at 11:08:51AM -0400, Brian Foster wrote:
> On Tue, Jun 04, 2019 at 02:50:33PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Now that we have generic functions to walk inode records, refactor the
> > INUMBERS implementation to use it.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > fs/xfs/xfs_ioctl.c | 20 ++++--
> > fs/xfs/xfs_ioctl.h | 2 +
> > fs/xfs/xfs_ioctl32.c | 35 ++++------
> > fs/xfs/xfs_itable.c | 168 ++++++++++++++++++++------------------------------
> > fs/xfs/xfs_itable.h | 22 +------
> > fs/xfs/xfs_iwalk.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++--
> > fs/xfs/xfs_iwalk.h | 12 ++++
>
> It looks like there's a decent amount of xfs_iwalk code changes in this
> patch. That should probably be a separate patch, at minimum to have a
> separate changelog to document the changes/updates required for
> inumbers.
<nod> I'll break out the iwalk changes into a separate patch so that
this one only has the changes needed to wire up the ioctl.
> > 7 files changed, 262 insertions(+), 158 deletions(-)
> >
> >
> ...
> > diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> > index 06abe5c9c0ee..bade54d6ac64 100644
> > --- a/fs/xfs/xfs_itable.c
> > +++ b/fs/xfs/xfs_itable.c
> > @@ -259,121 +259,85 @@ xfs_bulkstat(
> > return error;
> > }
> >
> > -int
> > -xfs_inumbers_fmt(
> > - void __user *ubuffer, /* buffer to write to */
> > - const struct xfs_inogrp *buffer, /* buffer to read from */
> > - long count, /* # of elements to read */
> > - long *written) /* # of bytes written */
> > +struct xfs_inumbers_chunk {
> > + inumbers_fmt_pf formatter;
> > + struct xfs_ibulk *breq;
> > +};
> > +
> > +/*
> > + * INUMBERS
> > + * ========
> > + * This is how we export inode btree records to userspace, so that XFS tools
> > + * can figure out where inodes are allocated.
> > + */
> > +
> > +/*
> > + * Format the inode group structure and report it somewhere.
> > + *
> > + * Similar to xfs_bulkstat_one_int, lastino is the inode cursor as we walk
> > + * through the filesystem so we move it forward unless there was a runtime
> > + * error. If the formatter tells us the buffer is now full we also move the
> > + * cursor forward and abort the walk.
> > + */
> > +STATIC int
> > +xfs_inumbers_walk(
> > + struct xfs_mount *mp,
> > + struct xfs_trans *tp,
> > + xfs_agnumber_t agno,
> > + const struct xfs_inobt_rec_incore *irec,
> > + void *data)
> > {
> > - if (copy_to_user(ubuffer, buffer, count * sizeof(*buffer)))
> > - return -EFAULT;
> > - *written = count * sizeof(*buffer);
> > - return 0;
> > + struct xfs_inogrp inogrp = {
> > + .xi_startino = XFS_AGINO_TO_INO(mp, agno, irec->ir_startino),
> > + .xi_alloccount = irec->ir_count - irec->ir_freecount,
> > + .xi_allocmask = ~irec->ir_free,
> > + };
>
> Not related to this patch, but I'm wondering if we should be using
> xfs_inobt_irec_to_allocmask() here to mask off holes from the resulting
> allocation bitmap. Eh, I guess it's misleading either way..
Holes were supposed to be marked in ir_free also, right?
So (assuming the irec isn't corrupt) we should be protected against
reporting a hole as an "allocated" inode, right?
> > + struct xfs_inumbers_chunk *ic = data;
> > + xfs_agino_t agino;
> > + int error;
> > +
> > + error = ic->formatter(ic->breq, &inogrp);
> > + if (error && error != XFS_IBULK_BUFFER_FULL)
> > + return error;
> > + if (error == XFS_IBULK_BUFFER_FULL)
> > + error = XFS_INOBT_WALK_ABORT;
> > +
> > + agino = irec->ir_startino + XFS_INODES_PER_CHUNK;
> > + ic->breq->startino = XFS_AGINO_TO_INO(mp, agno, agino);
> > + return error;
> > }
> >
> ...
> > diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c
> > index c4a9c4c246b7..3a35d1cf7e14 100644
> > --- a/fs/xfs/xfs_iwalk.c
> > +++ b/fs/xfs/xfs_iwalk.c
> ...
> > @@ -286,7 +298,7 @@ xfs_iwalk_ag_start(
> > * have to deal with tearing down the cursor to walk the records.
> > */
> > error = xfs_iwalk_grab_ichunk(*curpp, agino, &icount,
> > - &iwag->recs[iwag->nr_recs]);
> > + &iwag->recs[iwag->nr_recs], trim);
>
> Hmm, it looks like we could lift the lookup from xfs_iwalk_grab_ichunk()
> up into xfs_iwalk_ag_start() and avoid needing to pass trim down
> multiple levels. In fact, if we're not trimming the record we don't need
> to grab the record at all in this path. We could do the lookup (setting
> has_more) then bounce right up to the core walker algorithm, right? If
> so, that seems a bit cleaner in terms of only using special cased code
> when it's actually necessary.
Right.
>
> > if (error)
> > return error;
> > if (icount)
> ...
> > @@ -561,3 +574,135 @@ xfs_iwalk_threaded(
> ...
> > +/*
> > + * Walk all inode btree records in a single AG, from @iwag->startino to the end
> > + * of the AG.
> > + */
> > +STATIC int
> > +xfs_inobt_walk_ag(
> > + struct xfs_iwalk_ag *iwag)
> > +{
> > + struct xfs_mount *mp = iwag->mp;
> > + struct xfs_trans *tp = iwag->tp;
> > + struct xfs_buf *agi_bp = NULL;
> > + struct xfs_btree_cur *cur = NULL;
> > + xfs_agnumber_t agno;
> > + xfs_agino_t agino;
> > + int has_more;
> > + int error = 0;
> > +
> > + /* Set up our cursor at the right place in the inode btree. */
> > + agno = XFS_INO_TO_AGNO(mp, iwag->startino);
> > + agino = XFS_INO_TO_AGINO(mp, iwag->startino);
> > + error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more,
> > + false);
> > +
> > + while (!error && has_more && !xfs_pwork_want_abort(&iwag->pwork)) {
> > + struct xfs_inobt_rec_incore *irec;
> > +
> > + cond_resched();
> > +
> > + /* Fetch the inobt record. */
> > + irec = &iwag->recs[iwag->nr_recs];
> > + error = xfs_inobt_get_rec(cur, irec, &has_more);
> > + if (error || !has_more)
> > + break;
> > +
> > + /*
> > + * If there's space in the buffer for more records, increment
> > + * the btree cursor and grab more.
> > + */
> > + if (++iwag->nr_recs < iwag->sz_recs) {
> > + error = xfs_btree_increment(cur, 0, &has_more);
> > + if (error || !has_more)
> > + break;
> > + continue;
> > + }
> > +
> > + /*
> > + * Otherwise, we need to save cursor state and run the callback
> > + * function on the cached records. The run_callbacks function
> > + * is supposed to return a cursor pointing to the record where
> > + * we would be if we had been able to increment like above.
> > + */
> > + error = xfs_iwalk_run_callbacks(iwag, xfs_inobt_walk_ag_recs,
> > + agno, &cur, &agi_bp, &has_more);
> > + }
> > +
> > + xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error);
> > +
> > + /* Walk any records left behind in the cache. */
> > + if (iwag->nr_recs == 0 || error || xfs_pwork_want_abort(&iwag->pwork))
> > + return error;
> > +
> > + return xfs_inobt_walk_ag_recs(iwag);
> > +}
>
> Similar comments apply here as for the previous xfs_iwalk_ag() patch.
> Though looking at it, the only differences here are the lack of free
> inode check, readahead and the callback function (particularly once you
> consider the partial completion refactoring we discussed earlier). I
> think this could all be generalized with a single flag such that there's
> no need for separate xfs_[inobt|iwalk]_ag() functions.
Yep. Already refactoring that. :)
> Hmmm.. perhaps we could use a flag or separate function pointers in
> struct xfs_iwalk_ag to accomplish the same thing all the way up through
> the record walker functions. IOW, xfs_iwalk_ag_recs() looks like it
> could call either callback based on which is defined in the
> xfs_iwalk_ag.
<nod>
> This could all be done as a separate patch of course, if that's easier.
I might just go back and remove the function pointer from run_callbacks
in the earlier patches...
>
> > +
> > +/*
> > + * Walk all inode btree records in the filesystem starting from @startino. The
> > + * @inobt_walk_fn will be called for each btree record, being passed the incore
> > + * record and @data. @max_prefetch controls how many inobt records we try to
> > + * cache ahead of time.
> > + */
> > +int
> > +xfs_inobt_walk(
> > + struct xfs_mount *mp,
> > + struct xfs_trans *tp,
> > + xfs_ino_t startino,
> > + xfs_inobt_walk_fn inobt_walk_fn,
> > + unsigned int max_prefetch,
> > + void *data)
> > +{
> > + struct xfs_iwalk_ag iwag = {
> > + .mp = mp,
> > + .tp = tp,
> > + .inobt_walk_fn = inobt_walk_fn,
> > + .data = data,
> > + .startino = startino,
> > + .pwork = XFS_PWORK_SINGLE_THREADED,
> > + };
> > + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino);
> > + int error;
> > +
> > + ASSERT(agno < mp->m_sb.sb_agcount);
> > +
> > + xfs_iwalk_set_prefetch(&iwag, max_prefetch * XFS_INODES_PER_CHUNK);
>
> A brief comment above this line would be helpful. Something like:
>
> /* translate inumbers record count to inode count */
Done. Thanks for the review!
--D
> Brian
>
> > + error = xfs_iwalk_alloc(&iwag);
> > + if (error)
> > + return error;
> > +
> > + for (; agno < mp->m_sb.sb_agcount; agno++) {
> > + error = xfs_inobt_walk_ag(&iwag);
> > + if (error)
> > + break;
> > + iwag.startino = XFS_AGINO_TO_INO(mp, agno + 1, 0);
> > + }
> > +
> > + xfs_iwalk_free(&iwag);
> > + return error;
> > +}
> > diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h
> > index 76d8f87a39ef..20bee93d4676 100644
> > --- a/fs/xfs/xfs_iwalk.h
> > +++ b/fs/xfs/xfs_iwalk.h
> > @@ -18,4 +18,16 @@ int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino,
> > xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, bool poll,
> > void *data);
> >
> > +/* Walk all inode btree records in the filesystem starting from @startino. */
> > +typedef int (*xfs_inobt_walk_fn)(struct xfs_mount *mp, struct xfs_trans *tp,
> > + xfs_agnumber_t agno,
> > + const struct xfs_inobt_rec_incore *irec,
> > + void *data);
> > +/* Return value (for xfs_inobt_walk_fn) that aborts the walk immediately. */
> > +#define XFS_INOBT_WALK_ABORT (XFS_IWALK_ABORT)
> > +
> > +int xfs_inobt_walk(struct xfs_mount *mp, struct xfs_trans *tp,
> > + xfs_ino_t startino, xfs_inobt_walk_fn inobt_walk_fn,
> > + unsigned int max_prefetch, void *data);
> > +
> > #endif /* __XFS_IWALK_H__ */
> >
next prev parent reply other threads:[~2019-06-11 16:21 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-04 21:49 [PATCH v2 00/10] xfs: refactor and improve inode iteration Darrick J. Wong
2019-06-04 21:49 ` [PATCH 01/10] xfs: create simplified inode walk function Darrick J. Wong
2019-06-10 13:58 ` Brian Foster
2019-06-10 16:59 ` Darrick J. Wong
2019-06-10 17:55 ` Brian Foster
2019-06-10 23:11 ` Darrick J. Wong
2019-06-11 22:33 ` Dave Chinner
2019-06-11 23:05 ` Darrick J. Wong
2019-06-12 12:13 ` Brian Foster
2019-06-12 16:53 ` Darrick J. Wong
2019-06-12 17:54 ` Darrick J. Wong
2019-06-04 21:49 ` [PATCH 02/10] xfs: convert quotacheck to use the new iwalk functions Darrick J. Wong
2019-06-10 13:58 ` Brian Foster
2019-06-10 17:10 ` Darrick J. Wong
2019-06-11 23:23 ` Dave Chinner
2019-06-12 0:32 ` Darrick J. Wong
2019-06-12 12:55 ` Brian Foster
2019-06-12 23:33 ` Dave Chinner
2019-06-13 18:34 ` Brian Foster
2019-06-04 21:49 ` [PATCH 03/10] xfs: bulkstat should copy lastip whenever userspace supplies one Darrick J. Wong
2019-06-10 13:59 ` Brian Foster
2019-06-04 21:49 ` [PATCH 04/10] xfs: convert bulkstat to new iwalk infrastructure Darrick J. Wong
2019-06-10 14:02 ` Brian Foster
2019-06-10 17:38 ` Darrick J. Wong
2019-06-10 18:29 ` Brian Foster
2019-06-10 23:42 ` Darrick J. Wong
2019-06-04 21:49 ` [PATCH 05/10] xfs: move bulkstat ichunk helpers to iwalk code Darrick J. Wong
2019-06-10 14:02 ` Brian Foster
2019-06-04 21:50 ` [PATCH 06/10] xfs: change xfs_iwalk_grab_ichunk to use startino, not lastino Darrick J. Wong
2019-06-10 19:32 ` Brian Foster
2019-06-04 21:50 ` [PATCH 07/10] xfs: clean up long conditionals in xfs_iwalk_ichunk_ra Darrick J. Wong
2019-06-10 19:32 ` Brian Foster
2019-06-04 21:50 ` [PATCH 08/10] xfs: multithreaded iwalk implementation Darrick J. Wong
2019-06-10 19:40 ` Brian Foster
2019-06-11 1:10 ` Darrick J. Wong
2019-06-11 13:13 ` Brian Foster
2019-06-11 15:29 ` Darrick J. Wong
2019-06-11 17:00 ` Brian Foster
2019-06-04 21:50 ` [PATCH 09/10] xfs: poll waiting for quotacheck Darrick J. Wong
2019-06-11 15:07 ` Brian Foster
2019-06-11 16:06 ` Darrick J. Wong
2019-06-11 17:01 ` Brian Foster
2019-06-04 21:50 ` [PATCH 10/10] xfs: refactor INUMBERS to use iwalk functions Darrick J. Wong
2019-06-11 15:08 ` Brian Foster
2019-06-11 16:21 ` Darrick J. Wong [this message]
2019-06-11 17:01 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190611162127.GT1871505@magnolia \
--to=darrick.wong@oracle.com \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox