public inbox for linux-xfs@vger.kernel.org
* [PATCH v5 0/4] xfs: fix some log stalling problems in defer ops
@ 2020-09-29 17:44 Darrick J. Wong
  2020-09-29 17:44 ` [PATCH 1/4] xfs: change the order in which child and parent defer ops are finished Darrick J. Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Darrick J. Wong @ 2020-09-29 17:44 UTC (permalink / raw)
  To: darrick.wong; +Cc: Dave Chinner, Brian Foster, linux-xfs, david, bfoster

Hi all,

This last series tries to fix some structural problems in the defer ops
code.  The defer ops code has been finishing items in the wrong order --
if a top level defer op creates items A and B, and finishing item A
creates more defer ops A1 and A2, we'll put the new items on the end of
the chain and process them in the order A B A1 A2.  This is kind of
weird, since it's convenient for programmers to be able to think of A
and B as an ordered sequence where all the work for A must finish before
we move on to B, i.e. A A1 A2 B.
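
The two orderings described above can be illustrated with a small toy model. This is only a sketch and not the kernel's implementation: the `CHILDREN` table and both function names are hypothetical, invented here to show how the completion order changes when child items are queued ahead of the remaining siblings instead of at the tail of the chain.

```python
from collections import deque

# Hypothetical toy model: maps an item to the child items that finishing
# it creates. This is NOT how the kernel represents defer ops.
CHILDREN = {"A": ["A1", "A2"]}


def finish_append_to_tail(pending):
    """Model of the existing behaviour: new children go to the end of the chain."""
    queue = deque(pending)
    order = []
    while queue:
        item = queue.popleft()
        order.append(item)
        # Children created by finishing `item` are queued after everything else.
        queue.extend(CHILDREN.get(item, []))
    return order


def finish_children_first(pending):
    """Model of the desired behaviour: children finish before the next sibling."""
    queue = deque(pending)
    order = []
    while queue:
        item = queue.popleft()
        order.append(item)
        # extendleft() pushes items one at a time onto the front, which
        # reverses them, so reverse first to keep A1 ahead of A2.
        queue.extendleft(reversed(CHILDREN.get(item, [])))
    return order


if __name__ == "__main__":
    print(" ".join(finish_append_to_tail(["A", "B"])))
    print(" ".join(finish_children_first(["A", "B"])))
```

In this model the second function also keeps the pending chain short, because each item's sub-work is drained before the next top-level item is started.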

That isn't how defer ops actually work, but so far we've been lucky
that this hasn't ever caused serious problems.  It /will/, however,
when we get to the atomic extent swap code, where for refcounting
purposes it actually /does/ matter that unmap and map child intents
execute in that order, and complete before we move on to the next extent
in the files.  This also causes a very long chain of intent items to
build up, which can exhaust memory.

We need to teach defer ops to finish all the sub-work associated with
each defer op that the caller gave us, to minimize the length of the
defer ops chains; and then we need to teach it to relog items
periodically to avoid pinning the log tail.

v2: combine all the relog patches into one, and base the decision to
relog an item on whether or not it's in an old checkpoint
v3: fix backwards logic, don't relog items in the same checkpoint,
and split up the changes
v4: fix some comments, split the log changes into a separate patch
v5: add a defer ops relog counter to the xfs stats file

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=defer-ops-stalls-5.10
---
 fs/xfs/libxfs/xfs_defer.c  |   69 +++++++++++++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_bmap_item.c     |   27 +++++++++++++++++
 fs/xfs/xfs_extfree_item.c  |   29 ++++++++++++++++++
 fs/xfs/xfs_log.c           |   40 +++++++++++++++++++-------
 fs/xfs/xfs_log.h           |    2 +
 fs/xfs/xfs_refcount_item.c |   27 +++++++++++++++++
 fs/xfs/xfs_rmap_item.c     |   27 +++++++++++++++++
 fs/xfs/xfs_stats.c         |    4 +++
 fs/xfs/xfs_stats.h         |    1 +
 fs/xfs/xfs_trace.h         |    1 +
 fs/xfs/xfs_trans.h         |   10 ++++++
 11 files changed, 226 insertions(+), 11 deletions(-)


* [PATCH v5 0/4] xfs: fix some log stalling problems in defer ops
@ 2020-10-05 18:20 Darrick J. Wong
  2020-10-05 18:21 ` [PATCH 4/4] xfs: only relog deferred intent items if free space in the log gets low Darrick J. Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2020-10-05 18:20 UTC (permalink / raw)
  To: darrick.wong; +Cc: Brian Foster, Dave Chinner, linux-xfs, david, bfoster, hch


* [PATCH v4 0/4] xfs: fix some log stalling problems in defer ops
@ 2020-09-27 23:41 Darrick J. Wong
  2020-09-27 23:42 ` [PATCH 4/4] xfs: only relog deferred intent items if free space in the log gets low Darrick J. Wong
  0 siblings, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2020-09-27 23:41 UTC (permalink / raw)
  To: darrick.wong; +Cc: Brian Foster, Dave Chinner, linux-xfs, david, bfoster

Hi all,

This last series tries to fix some structural problems in the defer ops
code.  The defer ops code has been finishing items in the wrong order --
if a top level defer op creates items A and B, and finishing item A
creates more defer ops A1 and A2, we'll put the new items on the end of
the chain and process them in the order A B A1 A2.  This is kind of
weird, since it's convenient for programmers to be able to think of A
and B as an ordered sequence where all the work for A must finish before
we move on to B, i.e. A A1 A2 B.

That isn't how defer ops actually work, but so far we've been lucky
that this hasn't ever caused serious problems.  It /will/, however,
when we get to the atomic extent swap code, where for refcounting
purposes it actually /does/ matter that unmap and map child intents
execute in that order, and complete before we move on to the next extent
in the files.  This also causes a very long chain of intent items to
build up, which can exhaust memory.

We need to teach defer ops to finish all the sub-work associated with
each defer op that the caller gave us, to minimize the length of the
defer ops chains; and then we need to teach it to relog items
periodically to avoid pinning the log tail.

v2: combine all the relog patches into one, and base the decision to
relog an item on whether or not it's in an old checkpoint
v3: fix backwards logic, don't relog items in the same checkpoint,
and split up the changes
v4: fix some comments, split the log changes into a separate patch

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=defer-ops-stalls-5.10
---
 fs/xfs/libxfs/xfs_defer.c  |   64 +++++++++++++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_bmap_item.c     |   27 +++++++++++++++++++
 fs/xfs/xfs_extfree_item.c  |   29 ++++++++++++++++++++
 fs/xfs/xfs_log.c           |   40 +++++++++++++++++++++-------
 fs/xfs/xfs_log.h           |    2 +
 fs/xfs/xfs_refcount_item.c |   27 +++++++++++++++++++
 fs/xfs/xfs_rmap_item.c     |   27 +++++++++++++++++++
 fs/xfs/xfs_trace.h         |    1 +
 fs/xfs/xfs_trans.h         |   10 +++++++
 9 files changed, 216 insertions(+), 11 deletions(-)




Thread overview: 12+ messages
2020-09-29 17:44 [PATCH v5 0/4] xfs: fix some log stalling problems in defer ops Darrick J. Wong
2020-09-29 17:44 ` [PATCH 1/4] xfs: change the order in which child and parent defer ops are finished Darrick J. Wong
2020-09-29 17:44 ` [PATCH 2/4] xfs: periodically relog deferred intent items Darrick J. Wong
2020-10-01 16:02   ` Brian Foster
2020-10-01 17:22     ` Darrick J. Wong
2020-10-01 17:50       ` Brian Foster
2020-09-29 17:44 ` [PATCH 3/4] xfs: expose the log push threshold Darrick J. Wong
2020-09-29 17:44 ` [PATCH 4/4] xfs: only relog deferred intent items if free space in the log gets low Darrick J. Wong
2020-10-01 16:02   ` Brian Foster
2020-10-01 17:33     ` Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2020-10-05 18:20 [PATCH v5 0/4] xfs: fix some log stalling problems in defer ops Darrick J. Wong
2020-10-05 18:21 ` [PATCH 4/4] xfs: only relog deferred intent items if free space in the log gets low Darrick J. Wong
2020-09-27 23:41 [PATCH v4 0/4] xfs: fix some log stalling problems in defer ops Darrick J. Wong
2020-09-27 23:42 ` [PATCH 4/4] xfs: only relog deferred intent items if free space in the log gets low Darrick J. Wong
