public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/3] sched/core: Fix PSI inconsistent task state splats with DELAY_DEQUEUE
From: K Prateek Nayak @ 2024-10-10  8:28 UTC
  To: Peter Zijlstra, Ingo Molnar, Juri Lelli, Vincent Guittot,
	Johannes Weiner, Suren Baghdasaryan, linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Thomas Gleixner, Klaus Kudielka,
	Chris Bainbridge, Linux regression tracking (Thorsten Leemhuis),
	Gautham R. Shenoy, Youssef Esmat, Paul Menzel, Bert Karwatzki,
	regressions, K Prateek Nayak

After the introduction of DELAY_DEQUEUE, PSI started consistently
warning about inconsistent task state early during boot. This was
root-caused to three issues, each addressed by one patch in the series:

o PSI signals are not dequeued when a task is blocked but also delayed.
  psi_sched_switch() treats "!task_on_rq_queued()" as the task being
  blocked, but a delayed task remains queued on the runqueue until it
  is picked again and goes through a full dequeue.

o enqueue_task() does not pass ENQUEUE_WAKEUP alongside ENQUEUE_DELAYED
  in ttwu_runnable(). psi_enqueue() only considers (in terms of enqueue
  flags):

    (flags & ENQUEUE_WAKEUP) && !(flags & ENQUEUE_MIGRATED)

  ... as a wakeup, and it only clears the TSK_IOWAIT flag on wakeups,
  so the missing ENQUEUE_WAKEUP can mislead psi_enqueue().

o When a delayed task is migrated by the load balancer, the requeue or
  the wakeup context may not be aware that the task has migrated between
  blocking and waking up. The migration must be communicated to PSI,
  which forgoes clearing TSK_IOWAIT in that case since it expects
  psi_.*dequeue() to have cleared it during migration.
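
To make the first two points concrete, here is a minimal userspace
sketch of the two predicates involved. It is a toy model, not kernel
code: the struct, the toy_* helpers and the bit values are assumptions
made up for illustration, and only the shape of the conditions is taken
from the description above.

```c
#include <stdbool.h>

/* Illustrative bit values only; not copied from the kernel headers. */
#define TOY_ENQUEUE_WAKEUP   0x01u
#define TOY_ENQUEUE_MIGRATED 0x02u
#define TOY_ENQUEUE_DELAYED  0x04u

/* Hypothetical stand-in for the task state discussed above. */
struct toy_task {
	bool on_rq;         /* still queued on the runqueue */
	bool sched_delayed; /* blocked, but the dequeue was deferred */
};

/* "Not queued" is the only thing that counts as blocked. */
static bool toy_blocked_queued_only(const struct toy_task *t)
{
	return !t->on_rq;
}

/* Variant that also treats a delayed (still queued) task as blocked. */
static bool toy_blocked_with_delayed(const struct toy_task *t)
{
	return !t->on_rq || t->sched_delayed;
}

/* The wakeup test quoted above, expressed on the toy flag bits. */
static bool toy_is_wakeup(unsigned int flags)
{
	return (flags & TOY_ENQUEUE_WAKEUP) && !(flags & TOY_ENQUEUE_MIGRATED);
}
```

In this model a task with on_rq and sched_delayed both set is missed by
the first predicate and caught by the second, and a requeue carrying
only TOY_ENQUEUE_DELAYED is not classified as a wakeup until
TOY_ENQUEUE_WAKEUP is added to it.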

The series correctly communicates the blocked status of a delayed task
to psi_dequeue(), adds the ENQUEUE_WAKEUP flag during a requeue in
ttwu_runnable(), rearranges psi_enqueue() to be called after
"p->sched_class->enqueue_task()", and notifies psi_enqueue() of a
migration in the delayed state using "p->migration_flags" so that the
task state is maintained consistently.
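
As a rough illustration of the last step, the sketch below models how a
migration that happens while a task is delayed could be remembered on
the task and replayed as a "migrated" hint on the next requeue. Every
name here (toy_mig_task, TOY_MF_DELAYED, toy_migrate(),
toy_requeue_flags()) and every bit value is hypothetical; it is not the
code in the patches, only a model of the idea.

```c
#include <stdbool.h>

/* Illustrative bit values only; not copied from the kernel headers. */
#define TOY_ENQUEUE_WAKEUP   0x01u
#define TOY_ENQUEUE_MIGRATED 0x02u
#define TOY_MF_DELAYED       0x01u /* hypothetical "migrated while delayed" marker */

struct toy_mig_task {
	bool sched_delayed;           /* blocked, dequeue deferred */
	unsigned int migration_flags; /* per-task migration markers */
};

/* Load-balancer side: remember a migration only if the task was delayed. */
static void toy_migrate(struct toy_mig_task *t)
{
	if (t->sched_delayed)
		t->migration_flags |= TOY_MF_DELAYED;
}

/*
 * Requeue side: build the enqueue flags for a wakeup and, if a migration
 * was recorded while delayed, consume the marker and report it.
 */
static unsigned int toy_requeue_flags(struct toy_mig_task *t)
{
	unsigned int flags = TOY_ENQUEUE_WAKEUP;

	if (t->migration_flags & TOY_MF_DELAYED) {
		t->migration_flags &= ~TOY_MF_DELAYED;
		flags |= TOY_ENQUEUE_MIGRATED;
	}
	return flags;
}
```

The marker is cleared when it is consumed, so a later wakeup without an
intervening migration is reported as a plain wakeup again.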

This series was previously posted as one large diff at
https://lore.kernel.org/lkml/f82def74-a64a-4a05-c8d4-4eeb3e03d0c0@amd.com/
and was tested by Johannes. The tags on the diff have been carried
to this series.

This series is based on:

    git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core

at commit 7266f0a6d3bb ("fs/bcachefs: Fix
__wait_on_freeing_inode() definition of waitqueue entry")

Any and all feedback is greatly appreciated.

--
K Prateek Nayak (2):
  sched/core: Add ENQUEUE_WAKEUP flag alongside ENQUEUE_DELAYED
  sched/core: Indicate a sched_delayed task was migrated before wakeup

Peter Zijlstra (1):
  sched/core: Dequeue PSI signals for blocked tasks that are delayed

 kernel/sched/core.c  | 25 ++++++++++++++++++++++---
 kernel/sched/sched.h |  1 +
 kernel/sched/stats.h | 10 ++++++++++
 3 files changed, 33 insertions(+), 3 deletions(-)

-- 
2.34.1



Thread overview: 20+ messages
2024-10-10  8:28 [PATCH 0/3] sched/core: Fix PSI inconsistent task state splats with DELAY_DEQUEUE K Prateek Nayak
2024-10-10  8:28 ` [PATCH 1/3] sched/core: Dequeue PSI signals for blocked tasks that are delayed K Prateek Nayak
2024-10-10 19:23   ` Johannes Weiner
2024-10-10  8:28 ` [PATCH 2/3] sched/core: Add ENQUEUE_WAKEUP flag alongside ENQUEUE_DELAYED K Prateek Nayak
2024-10-10  8:28 ` [PATCH 3/3] sched/core: Indicate a sched_delayed task was migrated before wakeup K Prateek Nayak
2024-10-10 13:03   ` Johannes Weiner
2024-10-10 13:06     ` Peter Zijlstra
2024-10-10 19:37       ` Johannes Weiner
2024-10-11  3:31         ` K Prateek Nayak
2024-10-11  8:33         ` Peter Zijlstra
2024-10-11 10:08           ` Johannes Weiner
2024-10-11 10:39             ` Peter Zijlstra
2024-10-14 14:43               ` Johannes Weiner
2024-10-15  3:11                 ` K Prateek Nayak
2024-10-28 13:28                 ` [tip: sched/core] sched: psi: pass enqueue/dequeue flags to psi callbacks directly tip-bot2 for Johannes Weiner
2024-10-12 14:15         ` [tip: sched/urgent] Since sched_delayed tasks remain queued even after blocking, the load tip-bot2 for Johannes Weiner
2024-10-14  7:28         ` [tip: sched/urgent] sched/psi: Fix mistaken CPU pressure indication after corrupted task state bug tip-bot2 for Johannes Weiner
2024-10-10 15:59     ` [PATCH 3/3] sched/core: Indicate a sched_delayed task was migrated before wakeup K Prateek Nayak
2024-10-10 10:47 ` [PATCH 0/3] sched/core: Fix PSI inconsistent task state splats with DELAY_DEQUEUE Peter Zijlstra
2024-10-10 10:57   ` K Prateek Nayak
