* Re: [PATCH] xfs: explicitly call cond_resched in xfs_itruncate_extents_flags
[not found] ` <CAMXzGWJU+a2s-tbpzdmPTCg9Et7UpDdpdBEjkiUUvAV5kxTjig@mail.gmail.com>
@ 2024-01-11 20:27 ` Dave Chinner
2024-01-12 13:01 ` Thomas Gleixner
0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2024-01-11 20:27 UTC
To: Jian Wen
Cc: linux-xfs, djwong, hch, dchinner, Jian Wen, Thomas Gleixner,
linux-kernel
[cc Thomas, lkml]
On Thu, Jan 11, 2024 at 08:52:22PM +0800, Jian Wen wrote:
> On Thu, Jan 11, 2024 at 5:38 AM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Wed, Jan 10, 2024 at 03:13:47PM +0800, Jian Wen wrote:
> > > From: Jian Wen <wenjianhn@gmail.com>
> > >
> > > Deleting a file with lots of extents may cause a soft lockup if the
> > > preemption model is none (CONFIG_PREEMPT_NONE=y or preempt=none is set
> > > in the kernel cmdline). Alibaba cloud kernel and Oracle UEK container
> > > kernel are affected by the issue, since they select CONFIG_PREEMPT_NONE=y.
> >
> > Time for them to move to CONFIG_PREEMPT_DYNAMIC?
> I had asked one of them to support CONFIG_PREEMPT_DYNAMIC before
> sending the patch.
OK.
> > Also there has been recent action towards removing
> > CONFIG_PREEMPT_NONE/VOLUNTARY and cond_resched() altogether because
> > the lazy preemption model in the RTPREEMPT patchset solves the
> > performance issues with full preemption that PREEMPT_NONE works
> > around...
> >
> > https://lwn.net/Articles/944686/
> > https://lwn.net/Articles/945422/
> >
> > Further, Thomas Gleixner has stated in those discussions that:
> >
> > "Though definitely I'm putting a permanent NAK in place for
> > any attempts to duct tape the preempt=NONE model any
> > further by sprinkling more cond*() and whatever warts
> > around."
> >
> > https://lwn.net/ml/linux-kernel/87jzshhexi.ffs@tglx/
> >
> > > Explicitly call cond_resched in xfs_itruncate_extents_flags to avoid
> > > the below softlockup warning.
> >
> > IOWs, this is no longer considered an acceptable solution by core
> > kernel maintainers.
> Understood. I will only build a hotfix for our production kernel then.
Yeah, that may be your best short term fix. We'll need to clarify
what the current policy is on adding cond_resched points before we
go any further in this direction.
Thomas, any update on what is happening with cond_resched() - is
there an ETA on it going away/being unnecessary?
> > Regardless of these policy issues, the code change:
> >
> > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > index c0f1c89786c2..194381e10472 100644
> > > --- a/fs/xfs/xfs_inode.c
> > > +++ b/fs/xfs/xfs_inode.c
> > > @@ -4,6 +4,7 @@
> > > * All Rights Reserved.
> > > */
> > > #include <linux/iversion.h>
> > > +#include <linux/sched.h>
> >
> > Global includes like this go in fs/xfs/xfs_linux.h, but I don't
> > think that's even necessary because we have cond_resched() calls
> > elsewhere in XFS with the same include list as xfs_inode.c...
> >
> > > #include "xfs.h"
> > > #include "xfs_fs.h"
> > > @@ -1383,6 +1384,8 @@ xfs_itruncate_extents_flags(
> > > error = xfs_defer_finish(&tp);
> > > if (error)
> > > goto out;
> > > +
> > > + cond_resched();
> > > }
> >
> > Shouldn't this go in xfs_defer_finish() so that we capture all the
> > cases where we loop indefinitely over a range continually rolling a
> > permanent transaction via xfs_defer_finish()?
> It seems xfs_collapse_file_space and xfs_insert_file_space also need
> to yield CPU.
> I don't have use cases for them yet.
Yup, they do, but they also call xfs_defer_finish(), so having the
cond_resched() in that function will capture them as well.
Also, the current upstream tree has moved this code from
xfs_itruncate_extents_flags() to xfs_bunmapi_range(), so the
cond_resched() has to be moved, anyway. We may as well put it in
xfs_defer_finish() if we end up doing this.
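
To illustrate the placement argument, here is a minimal userspace sketch,
not the actual XFS code. The names demo_defer_finish(), demo_truncate(),
demo_shift() and yield_count are hypothetical, and sched_yield() is only a
stand-in for the kernel's cond_resched(). It shows that a single yield
point in the shared helper covers every loop that calls it:

```c
#include <assert.h>
#include <sched.h>	/* sched_yield(): userspace stand-in for cond_resched() */
#include <stddef.h>

/* Hypothetical counter so the sketch can show how often we yielded. */
size_t yield_count;

/*
 * Stand-in for xfs_defer_finish(): complete one unit of deferred work,
 * then offer the CPU to other runnable tasks. Returns 0 on success.
 */
int demo_defer_finish(size_t *pending)
{
	assert(pending != NULL);

	if (*pending > 0)
		(*pending)--;

	/* The only yield point in the whole sketch. */
	sched_yield();
	yield_count++;
	return 0;
}

/* Stand-in for a truncate-style caller: loop until nothing is left. */
int demo_truncate(size_t nr_extents)
{
	size_t pending = nr_extents;

	while (pending > 0) {
		int error = demo_defer_finish(&pending);

		if (error)
			return error;
	}
	return 0;
}

/* Stand-in for a collapse/insert-range caller: different loop, same helper. */
int demo_shift(size_t nr_extents)
{
	size_t pending = nr_extents;
	size_t i;

	for (i = 0; i < nr_extents; i++) {
		int error = demo_defer_finish(&pending);

		if (error)
			return error;
	}
	return 0;
}
```

Neither caller contains a yield of its own, yet both give up the CPU once
per iteration because the helper does it for them.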
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH] xfs: explicitly call cond_resched in xfs_itruncate_extents_flags
From: Thomas Gleixner @ 2024-01-12 13:01 UTC
To: Dave Chinner, Jian Wen
Cc: linux-xfs, djwong, hch, dchinner, Jian Wen, linux-kernel,
Ankur Arora
On Fri, Jan 12 2024 at 07:27, Dave Chinner wrote:
Cc: Ankur
> On Thu, Jan 11, 2024 at 08:52:22PM +0800, Jian Wen wrote:
>> On Thu, Jan 11, 2024 at 5:38 AM Dave Chinner <david@fromorbit.com> wrote:
>> > IOWs, this is no longer considered an acceptable solution by core
>> > kernel maintainers.
>> Understood. I will only build a hotfix for our production kernel then.
>
> Yeah, that may be your best short term fix. We'll need to clarify
> what the current policy is on adding cond_resched points before we
> go any further in this direction.
Well, right now until the scheduler situation is sorted there is no
other solution than to add the cond_resched() muck.
> Thomas, any update on what is happening with cond_resched() - is
> there an ETA on it going away/being unnecessary?
Ankur is working on that...
* Re: [PATCH] xfs: explicitly call cond_resched in xfs_itruncate_extents_flags
From: Ankur Arora @ 2024-01-23 7:01 UTC
To: tglx
Cc: ankur.a.arora, david, dchinner, djwong, hch, linux-kernel,
linux-xfs, wenjian1, wenjianhn
[ Missed this email until now. ]
Thomas Gleixner writes:
>On Fri, Jan 12 2024 at 07:27, Dave Chinner wrote:
>
>Cc: Ankur
>
> > On Thu, Jan 11, 2024 at 08:52:22PM +0800, Jian Wen wrote:
> >> On Thu, Jan 11, 2024 at 5:38 AM Dave Chinner <david@fromorbit.com> wrote:
> >> > IOWs, this is no longer considered an acceptable solution by core
> >> > kernel maintainers.
> >> Understood. I will only build a hotfix for our production kernel then.
> >
> > Yeah, that may be your best short term fix. We'll need to clarify
> > what the current policy is on adding cond_resched points before we
> > go any further in this direction.
>
> Well, right now until the scheduler situation is sorted there is no
> other solution than to add the cond_resched() muck.
>
> > Thomas, any update on what is happening with cond_resched() - is
> > there an ETA on it going away/being unnecessary?
>
> Ankur is working on that...
Yeah, running through a final round of tests before sending out the series.
Dave, on the status of cond_resched(): the work on this adds a new scheduling
model (as Thomas implemented in his PoC) under which cond_resched() would
basically be a stub.
However, given that other preemption models continue to use cond_resched(),
we would need to live with cond_resched() for a while, at least until
this model works well enough under a wide enough variety of loads.
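
As a rough illustration only (this is not code from the series), the
userspace sketch below shows what "basically a stub" means in practice. All
names are hypothetical: demo_model, demo_yields and demo_cond_resched() are
made up, DEMO_PREEMPT_LAZY is just a label for the new model, and
sched_yield() stands in for a real reschedule:

```c
#include <assert.h>
#include <sched.h>	/* sched_yield(): userspace stand-in for a reschedule */

/* Hypothetical labels for the two models being contrasted. */
enum demo_preempt_model {
	DEMO_PREEMPT_NONE,	/* relies on explicit yield points */
	DEMO_PREEMPT_LAZY,	/* scheduler preempts by itself */
};

enum demo_preempt_model demo_model = DEMO_PREEMPT_NONE;
unsigned long demo_yields;	/* how many yield points actually yielded */

/*
 * Returns 1 if the yield point was taken and 0 if it was a no-op.
 * Callers look the same under both models; only the body differs.
 */
int demo_cond_resched(void)
{
	switch (demo_model) {
	case DEMO_PREEMPT_LAZY:
		/* Stub: nothing to do, preemption happens elsewhere. */
		return 0;
	case DEMO_PREEMPT_NONE:
		sched_yield();
		demo_yields++;
		return 1;
	}
	assert(!"unknown preemption model");
	return 0;
}
```

Existing call sites would keep compiling and running unchanged in either
case, which is why they can stay in place while the other models still
depend on them.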
Thanks
Ankur