linux-xfs.vger.kernel.org archive mirror
* [Bug 202077] New: xfs transaction overruns on 4.14.67
@ 2018-12-26 17:19 bugzilla-daemon
  2018-12-26 17:19 ` [Bug 202077] " bugzilla-daemon
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: bugzilla-daemon @ 2018-12-26 17:19 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202077

            Bug ID: 202077
           Summary: xfs transaction overruns on 4.14.67
           Product: File System
           Version: 2.5
    Kernel Version: 4.14.67
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: XFS
          Assignee: filesystem_xfs@kernel-bugs.kernel.org
          Reporter: thomas.walker@twosigma.com
        Regression: No

Created attachment 280149
  --> https://bugzilla.kernel.org/attachment.cgi?id=280149&action=edit
xfs transaction overrun #1

We've encountered two recent examples of xfs transaction overruns on production
systems running 4.14.67 kernels.  Both systems in this case are running docker
with dozens of overlay mounts, using this xfs fs as both upper and lower.  In
both cases the filesystem was able to successfully recover when the filesystem
was unmounted and remounted again.

It looks like there has been a good bit of work in 4.16+ addressing similar
issues but none of it has made it back into the 4.14 LTS.  Any chance that any
of the attached debug output points to anything specific that might be a
candidate for backport?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

* [Bug 202077] xfs transaction overruns on 4.14.67
  2018-12-26 17:19 [Bug 202077] New: xfs transaction overruns on 4.14.67 bugzilla-daemon
@ 2018-12-26 17:19 ` bugzilla-daemon
  2019-01-01 22:30 ` [Bug 202077] New: " Dave Chinner
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2018-12-26 17:19 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202077

--- Comment #1 from Thomas Walker (thomas.walker@twosigma.com) ---
Created attachment 280151
  --> https://bugzilla.kernel.org/attachment.cgi?id=280151&action=edit
xfs transaction overrun #2

* Re: [Bug 202077] New: xfs transaction overruns on 4.14.67
  2018-12-26 17:19 [Bug 202077] New: xfs transaction overruns on 4.14.67 bugzilla-daemon
  2018-12-26 17:19 ` [Bug 202077] " bugzilla-daemon
@ 2019-01-01 22:30 ` Dave Chinner
  2019-01-01 22:30 ` [Bug 202077] xfs transaction log reservation " bugzilla-daemon
  2019-01-02 16:54 ` bugzilla-daemon
  3 siblings, 0 replies; 5+ messages in thread
From: Dave Chinner @ 2019-01-01 22:30 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-xfs

On Wed, Dec 26, 2018 at 05:19:12PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> We've encountered two recent examples of xfs transaction overruns on production
> systems running 4.14.67 kernels.  Both systems in this case are running docker
> with dozens of overlay mounts, using this xfs fs as both upper and lower.  In
> both cases the filesystem was able to successfully recover when the filesystem
> was unmounted and remounted again.

In both cases, it looks like there were two free space manipulations
in a single transaction, likely first modifying the free list
(pattern is EFD, XAGF, ABTB, ABTC, then AGFL) followed by freeing
the actual extent (more ABTB, ABTC buffers).
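
As a rough illustration of the pattern described above, the following
sketch checks whether a transaction's item list shows the free space
btrees being touched twice with an AGFL update in between. It is not
derived from the attached logs: the function names and the input format
(a flat list of item labels per transaction) are assumptions made for
this example.

```python
# Hypothetical helpers; only the item labels (EFD, XAGF, ABTB, ABTC, AGFL)
# come from the thread. The flat-list input format is an assumption.
FREE_SPACE_BTREES = {"ABTB", "ABTC"}


def count_free_space_passes(items):
    """Count separate runs of free space btree buffers in one transaction.

    A run is a maximal consecutive group of ABTB/ABTC items. Two or more
    runs suggest the free space btrees were modified more than once.
    """
    passes = 0
    in_run = False
    for name in items:
        if name in FREE_SPACE_BTREES:
            if not in_run:
                passes += 1
                in_run = True
        else:
            in_run = False
    return passes


def looks_like_double_free_space_update(items):
    """Return True if an AGFL item sits between two free space btree runs."""
    seen_btree = False
    seen_agfl_after_btree = False
    for name in items:
        if name in FREE_SPACE_BTREES:
            if seen_agfl_after_btree:
                return True
            seen_btree = True
        elif name == "AGFL" and seen_btree:
            seen_agfl_after_btree = True
    return False
```

Fed the sequence EFD, XAGF, ABTB, ABTC, AGFL, ABTB, ABTC, the first
helper counts two passes and the second reports a match; a transaction
with a single ABTB/ABTC run does not match.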

> It looks like there has been a good bit of work in 4.16+

The first fixes went into 4.18 with the deferred AGFL free
operations. Those were the commits associated with the patchset
titled "[PATCH v2 0/6] xfs: defer agfl block frees".

There were more fixes in 4.19 to always defer the AGFL free for all
operations. This was a much larger and more significant change, and
can be found from the series titled "[PATCH 00/24] xfs: broad
enablement of deferred agfl frees".
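
To show why deferring one of the two free space updates helps, here is
a toy model, not XFS code. It treats a reservation as a simple budget
and compares logging both steps in one transaction against running each
step in its own transaction with a fresh reservation. The class and
function names, the item sizes and the reservation value are all
invented for illustration.

```python
class ReservationOverrun(Exception):
    """Raised when logged items exceed the toy transaction's reservation."""


class ToyTransaction:
    """Tracks how much of a fixed reservation has been consumed."""

    def __init__(self, reservation):
        self.reservation = reservation
        self.used = 0

    def log(self, name, size):
        # Refuse the item if it would push usage past the reservation.
        if self.used + size > self.reservation:
            left = self.reservation - self.used
            raise ReservationOverrun(f"{name} needs {size}, only {left} left")
        self.used += size


def run_in_one_transaction(reservation, steps):
    """Log every step against a single shared reservation."""
    tx = ToyTransaction(reservation)
    for step in steps:
        for name, size in step:
            tx.log(name, size)
    return tx.used


def run_with_rolls(reservation, steps):
    """Log each step in its own transaction with a fresh reservation."""
    used = []
    for step in steps:
        tx = ToyTransaction(reservation)
        for name, size in step:
            tx.log(name, size)
        used.append(tx.used)
    return used


# Invented sizes: a free list fixup followed by the extent free itself.
EXAMPLE_STEPS = [
    [("XAGF", 100), ("ABTB", 100), ("ABTC", 100), ("AGFL", 100)],
    [("ABTB", 100), ("ABTC", 100)],
]
```

With a reservation of 500, run_in_one_transaction(500, EXAMPLE_STEPS)
raises ReservationOverrun on the final ABTC item, while
run_with_rolls(500, EXAMPLE_STEPS) completes because neither step
exceeds the budget on its own.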

> addressing similar issues but none of it has made it back into the
> 4.14 LTS.  Any chance that any of the attached debug output points
> to anything specific that might be a candidate for backport?

Backporting the first series might be sufficient to avoid your
problem (both are from the inode inactivation path) but it is no
guarantee. I also have no idea what dependencies that patchset has
on the rest of the code (e.g. is there enough deferred op
infrastructure in place in 4.14?), and seeing as it touches core
allocation algorithms it would require a substantial amount of QA
before release....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* [Bug 202077] xfs transaction log reservation overruns on 4.14.67
  2018-12-26 17:19 [Bug 202077] New: xfs transaction overruns on 4.14.67 bugzilla-daemon
                   ` (2 preceding siblings ...)
  2019-01-01 22:30 ` [Bug 202077] xfs transaction log reservation " bugzilla-daemon
@ 2019-01-02 16:54 ` bugzilla-daemon
  3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2019-01-02 16:54 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=202077

--- Comment #3 from Thomas Walker (thomas.walker@twosigma.com) ---
Thanks for the response.  That first patchset does appear to apply cleanly
(with a little fuzz) to 4.14 but, as you say, I don't know offhand how mature
the code it depends upon is in 4.14 and without a reliable reproducer it will
be hard to say whether it even addresses my issue.  I'll keep seeing if I can
reproduce the problem more consistently and see...

While I'm running 4.19 on a few test systems, I've been taking a wait-and-see
approach towards broader usage given the number of regressions that have
cropped up (and been fixed) thus far.  Good to know that this is likely
addressed there though.

Thanks,
Tom.
