From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 33/39] xfs: Add order IDs to log items in CIL
Date: Wed, 2 Jun 2021 20:02:33 -0700
Message-ID: <20210603030233.GU26380@locust>
In-Reply-To: <20210603021330.GL664593@dread.disaster.area>

On Thu, Jun 03, 2021 at 12:13:30PM +1000, Dave Chinner wrote:
> On Wed, Jun 02, 2021 at 05:49:14PM -0700, Darrick J. Wong wrote:
> > On Thu, Jun 03, 2021 at 10:16:22AM +1000, Dave Chinner wrote:
> > > On Thu, May 27, 2021 at 12:00:23PM -0700, Darrick J. Wong wrote:
> > > > On Wed, May 19, 2021 at 10:13:11PM +1000, Dave Chinner wrote:
> > > > > From: Dave Chinner <dchinner@redhat.com>
> > > > >
> > > > > Before we split the ordered CIL up into per cpu lists, we need a
> > > > > mechanism to track the order of the items in the CIL. We need to do
> > > > > this because there are rules around the order in which related items
> > > > > must physically appear in the log even inside a single checkpoint
> > > > > transaction.
> > > > >
> > > > > An example of this is intents - an intent must appear in the log
> > > > > > before its intent done record so taht log recovery can cancel the
> > > >
> > > > s/taht/that/
> > > >
> > > > > intent correctly. If we have these two records misordered in the
> > > > > CIL, then they will not be recovered correctly by journal replay.
> > > > >
> > > > > We also will not be able to move items to the tail of
> > > > > the CIL list when they are relogged, hence the log items will need
> > > > > some mechanism to allow the correct log item order to be recreated
> > > > > > before we write log items to the journal.
> > > > >
> > > > > Hence we need to have a mechanism for recording global order of
> > > > > transactions in the log items so that we can recover that order
> > > > > from un-ordered per-cpu lists.
> > > > >
> > > > > > Do this with a simple monotonically increasing commit counter in the CIL
> > > > > context. Each log item in the transaction gets stamped with the
> > > > > current commit order ID before it is added to the CIL. If the item
> > > > > is already in the CIL, leave it where it is instead of moving it to
> > > > > the tail of the list and instead sort the list before we start the
> > > > > push work.
> > > > >
> > > > > > XXX: list_sort() under the cil_ctx_lock held exclusive starts
> > > > > > hurting at >16 threads. Front end commits are waiting on the push
> > > > > > to switch contexts much longer. The item order id should likely be
> > > > > > moved into the logvecs when they are detached from the items, then
> > > > > > the sort can be done on the logvec after the cil_ctx_lock has been
> > > > > > released. logvecs will need to use a list_head for this rather than
> > > > > > a single linked list like they do now....
> > > >
> > > > ...which I guess happens in patch 35 now?
> > >
> > > Right. I'll just remove this from the commit message.
> > >
> > > > > @@ -780,6 +780,26 @@ xlog_cil_build_trans_hdr(
> > > > > tic->t_curr_res -= lvhdr->lv_bytes;
> > > > > }
> > > > >
> > > > > +/*
> > > > > + * CIL item reordering compare function. We want to order in ascending ID order,
> > > > > + * but we want to leave items with the same ID in the order they were added to
> > > >
> > > > When do we have items with the same id?
> > >
> > > All the items in a single transaction have the same id. The order id
> > > increments before we tag all the items in the transaction and insert
> > > them into the CIL.
> > >
> > > > I guess that happens if we have multiple transactions adding items to
> > > > the cil at the same time? I guess that's not a big deal since each of
> > > > those threads will hold a disjoint set of locks, so even if the order
> > > > ids are the same for a bunch of items, they're never going to be
> > > > touching the same AG/inode/metadata object, right?
> > > >
> > > > If that's correct, then:
> > > > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> > >
> > >
> > > While true, it's not the way this works so I won't immediately
> > > accept your RVB. The reason for not changing the ordering within a
> > > single transaction is actually intent logging. i.e. this:
> > >
> > > > > + * the list. This is important for operations like reflink where we log 4 order
> > > > > + * dependent intents in a single transaction when we overwrite an existing
> > > > > + * shared extent with a new shared extent. i.e. BUI(unmap), CUI(drop),
> > > > > + * CUI (inc), BUI(remap)...
> > >
> > > There's a specific order of operations that recovery must run these
> > > intents in, and so if we re-order them here in the CIL they'll be
> > > out of order in the log and recovery will replay the intents in the
> > > wrong order. Replaying the intents in the wrong order results in
> > > corruption warnings and assert failures during log recovery, hence
> > > the constraint of not re-ordering items within the same transaction.
> >
> > <ding> lightbulb comes on. I think I understood this better the last
> > time I read all these patches. :/
> >
> > Basically, for each item that can be attached to a transaction, you're
> > assigning it an "order id" that is a monotonically increasing counter
> > that (roughly) records the last time the item was committed. Certain
> > items (like inodes) can be relogged and committed multiple times in
> > rapid fire succession, in which case the order_id will get bumped
> > forward.
>
> Effectively, yes.
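
To make sure I have the stamping step straight, here is a minimal userspace
sketch of it. It is not the code from the patch: struct cil_ctx, struct
log_item, the in_cil flag and cil_commit_items() are hypothetical stand-ins,
and a C11 atomic stands in for whatever the kernel side uses. The only
properties it tries to show are that the counter is bumped once per
transaction, that every item in that transaction gets the same ID, and that a
relogged item has its ID moved forward without being moved in any list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CIL context. */
struct cil_ctx {
	_Atomic uint32_t order_id;	/* monotonically increasing commit counter */
};

/* Hypothetical stand-in for a log item. */
struct log_item {
	uint32_t order_id;	/* ID of the last commit that touched the item */
	int in_cil;		/* assumption: set once the item has been inserted */
};

/*
 * Stamp every item of one transaction with the same, freshly bumped ID.
 * An item that is already in the CIL is not moved, only restamped.
 * Returns the ID that was handed out.
 */
static uint32_t cil_commit_items(struct cil_ctx *ctx,
				 struct log_item **items, size_t nr)
{
	uint32_t id;

	assert(nr > 0);
	/* Increment first, then tag: the first transaction gets ID 1. */
	id = atomic_fetch_add(&ctx->order_id, 1) + 1;

	for (size_t i = 0; i < nr; i++) {
		items[i]->order_id = id;
		items[i]->in_cil = 1;
	}
	return id;
}
```

If transaction 1 commits items A and B and transaction 2 then commits B and
C, A ends up with ID 1 while B and C both carry ID 2.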
>
> > In the /next/ patch you'll change the cil item list to be per-cpu and
> > only splice the mess together at cil push time. For that to work
> > properly, you have to re-sort that resulting list in commit order (aka
> > the order_id) to keep the items in order of commit.
> >
> > For items *within* a transaction, you take advantage of the property
> > of list_sort that it won't reorder items with cmp(a, b) == 0, which
> > means that all the intents logged to a transaction will maintain the
> > same order that the author of higher level code wrote into the software.
>
> Correct.

Ok, good.

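
For my own notes, a self-contained sketch of the property we are relying on.
This is a hand-rolled stable merge sort over a singly linked list, standing in
for list_sort(); struct item, order_cmp(), merge() and sort_items() are names
made up for the illustration and are not the functions in the patch. The
comparison returns 0 for equal order IDs, and the merge takes from the left
half on ties, so items from one transaction keep their relative order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical log item: only the fields needed for the illustration. */
struct item {
	uint32_t order_id;	/* commit order ID stamped at commit time */
	const char *name;	/* label for illustration only */
	struct item *next;
};

/* Ascending by order ID; returns 0 for equal IDs. */
static int order_cmp(const struct item *a, const struct item *b)
{
	return (a->order_id > b->order_id) - (a->order_id < b->order_id);
}

/* Merge two sorted lists, preferring the left list on ties (stability). */
static struct item *merge(struct item *a, struct item *b)
{
	struct item head = { 0 }, *tail = &head;

	while (a && b) {
		if (order_cmp(a, b) <= 0) {
			tail->next = a;
			a = a->next;
		} else {
			tail->next = b;
			b = b->next;
		}
		tail = tail->next;
	}
	tail->next = a ? a : b;
	return head.next;
}

/* Stable top-down merge sort of a singly linked list. */
static struct item *sort_items(struct item *list)
{
	struct item *slow = list, *fast, *right;

	if (!list || !list->next)
		return list;
	/* Split in the middle: slow moves one step per two steps of fast. */
	for (fast = list->next; fast && fast->next; fast = fast->next->next)
		slow = slow->next;
	right = slow->next;
	slow->next = NULL;
	return merge(sort_items(list), sort_items(right));
}
```

Feed it a spliced list of BUI(unmap), CUI(drop), CUI(inc), BUI(remap) (all
ID 2) followed by an ID 1 item and an ID 3 item, and the ID 1 item moves to
the front while the four intents stay in the order they were logged.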
> > Question: xlog_cil_push_work zeroes the order_id of pushed log items.
> > Is there any potential problem here when ctx->order_id wraps around to
> > zero? I think the answer is that we'll move on to a new cil context
> > long before we hit 2^32-1 transactions?
>
> Yes. At the moment, the max transaction rate is about 800k/s, which
> means it'd take a couple of hours to run 4 billion transactions. So
> we're in no danger of overrunning the number of transactions in a CIL
> commit any time soon. And if we ever get near that, we can just bump
> the counter to a 64 bit value...

Ok.

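
Back-of-the-envelope check of that estimate, as a small sketch.
secs_to_wrap() is a hypothetical helper, and the 800k/s rate is simply the
figure quoted above. It divides the number of distinct counter values by the
commit rate.

```c
#include <stdint.h>

/* Seconds until a counter of the given width wraps at a given commit rate. */
static double secs_to_wrap(unsigned int bits, double commits_per_sec)
{
	double values = 1.0;

	/* 2^bits distinct values; computed in floating point to avoid overflow. */
	for (unsigned int i = 0; i < bits; i++)
		values *= 2.0;
	return values / commits_per_sec;
}
```

For a 32 bit counter at 800,000 commits per second this gives about 5369
seconds, i.e. roughly an hour and a half of commits into a single CIL context
before the ID could wrap. A 64 bit counter at the same rate works out to
roughly 730,000 years.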
With the "taht" in the commit message fixed,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com