From: Peter Zijlstra <peterz@infradead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@redhat.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
linux-mm@kvack.org, Ingo Molnar <mingo@redhat.org>,
Shuah Khan <shuah@kernel.org>
Subject: Re: [PATCH v12 2/3] sched: Move task_mm_cid_work to mm work_struct
Date: Wed, 9 Apr 2025 21:08:34 +0200 [thread overview]
Message-ID: <20250409190834.GQ9833@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <e916f393-b18c-4641-ace7-cf23b7508e09@efficios.com>
On Wed, Apr 09, 2025 at 11:53:05AM -0400, Mathieu Desnoyers wrote:
> On 2025-04-09 11:20, Peter Zijlstra wrote:
> > On Wed, Apr 09, 2025 at 10:15:42AM -0400, Mathieu Desnoyers wrote:
> > > On 2025-04-09 10:03, Peter Zijlstra wrote:
> > > > On Tue, Mar 11, 2025 at 07:28:45AM +0100, Gabriele Monaco wrote:
> > > > > +static inline void rseq_preempt_from_tick(struct task_struct *t)
> > > > > +{
> > > > > + u64 rtime = t->se.sum_exec_runtime - t->se.prev_sum_exec_runtime;
> > > > > +
> > > > > + if (rtime > RSEQ_UNPREEMPTED_THRESHOLD)
> > > > > + rseq_preempt(t);
> > > > > +}
> > > >
> > > > This confused me.
> > > >
> > > > The goal seems to be to tickle __rseq_handle_notify_resume() so it'll
> > > > end up queueing that work thing. But why do we want to set PREEMPT_BIT
> > > > here?
> > >
> > > In that scenario, we trigger (from tick) the fact that we may recompact the
> > > mm_cid, and thus need to update the rseq mm_cid field before returning to
> > > userspace.
> > >
> > > Changing the value of the mm_cid field while userspace is within a rseq
> > > critical section should abort the critical section, because the rseq
> > > critical section should be able to expect the mm_cid to be invariant
> > > for the whole c.s..
> >
> > But, if we run that compaction in a worker, what guarantees the
> > compaction is done and mm_cid is stable, but the time this task returns
> > to userspace again?
>
> So let's say we have a task which is running and not preempted by any
> other task on a cpu for a long time.
>
> The idea is to have the tick do two things:
>
> A) trigger the mm_cid recompaction,
>
> B) trigger an update of the task's rseq->mm_cid field at some point
> after recompaction, so it can get a mm_cid value closer to 0.
>
> So in its current form this patch will indeed trigger rseq_preempt()
> for *every tick* after the task has run for more than 100ms, which
> I don't think is intended. This should be fixed.
>
> Also, doing just an rseq_preempt() is not the correct approach, as
> AFAIU it won't force the long running task to release the currently
> held mm_cid value.
>
> I think we need something that looks like the following based on the
> current patch:
>
> - rename rseq_preempt_from_tick() to rseq_tick(),
>
> - modify rseq_tick() to ensure it calls rseq_set_notify_resume(t)
> rather than rseq_preempt().
>
> - modify rseq_tick() to ensure it only calls it once every
> RSEQ_UNPREEMPTED_THRESHOLD, rather than every tick after
> RSEQ_UNPREEMPTED_THRESHOLD.
>
> - modify rseq_tick() so at some point after the work has
> compacted mm_cids, we do the same things as switch_mm_cid()
> does, namely to release the currently held cid and get a likely
> smaller one (closer to 0). If the value changes, then we should
> trigger rseq_preempt() so the task updates the mm_cid field before
> returning to userspace and restarts ongoing rseq critical section.
>
> Thoughts ?
Yes, that seems better. Also be sure there's a comment around there
somewhere that explains this. Because I'm sure I'll have forgotten all
about this in a few months time :-)
next prev parent reply other threads:[~2025-04-09 19:08 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-11 6:28 [PATCH v12 0/3] sched: Restructure task_mm_cid_work for predictability Gabriele Monaco
2025-03-11 6:28 ` [PATCH v12 1/3] sched: Add prev_sum_exec_runtime support for RT, DL and SCX classes Gabriele Monaco
2025-03-11 6:28 ` [PATCH v12 2/3] sched: Move task_mm_cid_work to mm work_struct Gabriele Monaco
2025-04-09 14:03 ` Peter Zijlstra
2025-04-09 14:15 ` Mathieu Desnoyers
2025-04-09 15:20 ` Peter Zijlstra
2025-04-09 15:53 ` Mathieu Desnoyers
2025-04-09 19:08 ` Peter Zijlstra [this message]
2025-04-10 12:50 ` [PATCH] fixup: " Gabriele Monaco
2025-04-10 14:04 ` Mathieu Desnoyers
2025-04-10 14:36 ` Gabriele Monaco
2025-03-11 6:28 ` [PATCH v12 3/3] selftests/rseq: Add test for mm_cid compaction Gabriele Monaco
2025-03-26 7:31 ` [PATCH v12 0/3] sched: Restructure task_mm_cid_work for predictability Gabriele Monaco
2025-03-26 14:33 ` Mathieu Desnoyers
2025-04-09 9:45 ` Gabriele Monaco
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250409190834.GQ9833@noisy.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=akpm@linux-foundation.org \
--cc=gmonaco@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@redhat.com \
--cc=mingo@redhat.org \
--cc=paulmck@kernel.org \
--cc=shuah@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.