Date: Tue, 12 May 2026 02:56:16 +0000
In-Reply-To: <20260512025635.2840817-1-jstultz@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References:
 <20260512025635.2840817-1-jstultz@google.com>
X-Mailer: git-send-email 2.54.0.563.g4f69b47b94-goog
Message-ID: <20260512025635.2840817-7-jstultz@google.com>
Subject: [PATCH v29 6/9] sched: Add is_blocked task flag
From: John Stultz
To: LKML
Cc: John Stultz, Vineeth Pillai, Peter Zijlstra, Joel Fernandes,
 Qais Yousef, Ingo Molnar, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Valentin Schneider, Steven Rostedt, Ben Segall,
 Zimuzo Ezeozue, Will Deacon, Waiman Long, Boqun Feng,
 "Paul E. McKenney", Metin Kaya, Xuewen Yan, K Prateek Nayak,
 Thomas Gleixner, Daniel Lezcano, Suleiman Souhlal, kuyo chang, hupu,
 kernel-team@android.com
Content-Type: text/plain; charset="UTF-8"

Add a new is_blocked flag to the task struct.

This flag is set by try_to_block_task() and cleared by
ttwu_do_wakeup(), and tracks whether the task is blocked.

Traditionally this would mirror !p->on_rq, however due to things
like DELAY_DEQUEUE and PROXY_EXEC, this can diverge, so it's
useful to manage separately.

Additionally, with this, we might be able to get rid of the
p->se.sched_delayed (ab)use in the core code (eventually).

Taken whole cloth from Peter's email:
  https://lore.kernel.org/lkml/20260501132143.GC1026330@noisy.programming.kicks-ass.net/

With a few additional p->is_blocked = 0 assignments in the cases
where we return current if blocked_on gets zeroed or there is no
owner. This may hint that these current special cases might be
dropped eventually.

This change also helps resolve wait-queue stalls seen with
proxy-execution. See previous patch attempts for details:
  https://lore.kernel.org/lkml/20260430215103.2978955-2-jstultz@google.com/

Reported-by: Vineeth Pillai
Suggested-by: Peter Zijlstra
Signed-off-by: John Stultz
---
Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: Daniel Lezcano
Cc: Suleiman Souhlal
Cc: kuyo chang
Cc: hupu
Cc: kernel-team@android.com
---
 include/linux/sched.h |  7 +++++--
 kernel/sched/core.c   | 16 +++++++++++++++-
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5b68a1c9eedcf..bbb183233855a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -846,7 +846,11 @@ struct task_struct {
 	struct alloc_tag		*alloc_tag;
 #endif
 
-	int				on_cpu;
+	u8				on_cpu;
+	u8				on_rq;
+	u8				is_blocked;
+	u8				__pad;
+
 	struct __call_single_node	wake_entry;
 	unsigned int			wakee_flips;
 	unsigned long			wakee_flip_decay_ts;
@@ -861,7 +865,6 @@ struct task_struct {
 	 */
 	int				recent_used_cpu;
 	int				wake_cpu;
-	int				on_rq;
 
 	int				prio;
 	int				static_prio;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 633dd5b8428e5..8a223555be2e9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -615,6 +615,12 @@ EXPORT_SYMBOL(__trace_set_current_state);
  *   [ The astute reader will observe that it is possible for two tasks on one
  *     CPU to have ->on_cpu = 1 at the same time. ]
  *
+ * p->is_blocked <- { 0, 1 }:
+ *
+ *   is set by try_to_block_task() and cleared by ttwu_do_wakeup() and tracks
+ *   if the task is blocked. Traditionally this would mirror !p->on_rq, however
+ *   due to things like DELAY_DEQUEUE and PROXY_EXEC, this can diverge.
+ *
  * task_cpu(p): is changed by set_task_cpu(), the rules are:
  *
  *  - Don't call set_task_cpu() on a blocked task:
@@ -3706,6 +3712,7 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
  */
 static inline void ttwu_do_wakeup(struct task_struct *p)
 {
+	p->is_blocked = 0;
 	WRITE_ONCE(p->__state, TASK_RUNNING);
 	trace_sched_wakeup(p);
 }
@@ -4239,6 +4246,7 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 		 *    it disabling IRQs (this allows not taking ->pi_lock).
 		 */
 		WARN_ON_ONCE(p->se.sched_delayed);
+		WARN_ON_ONCE(p->is_blocked);
 		/* If p is current, we know we can run here, so clear blocked_on */
 		clear_task_blocked_on(p, NULL);
 		if (!ttwu_state_match(p, state, &success))
@@ -4550,6 +4558,7 @@ static void __sched_fork(u64 clone_flags, struct task_struct *p)
 
 	/* A delayed task cannot be in clone(). */
 	WARN_ON_ONCE(p->se.sched_delayed);
+	WARN_ON_ONCE(p->is_blocked);
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	p->se.cfs_rq			= NULL;
@@ -6671,6 +6680,7 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
 	unsigned long task_state = *task_state_p;
 
 	if (signal_pending_state(task_state, p)) {
+		p->is_blocked = 0;
 		WRITE_ONCE(p->__state, TASK_RUNNING);
 		*task_state_p = TASK_RUNNING;
 		clear_task_blocked_on(p, NULL);
@@ -6678,6 +6688,8 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
 		return false;
 	}
 
+	p->is_blocked = 1;
+
 	/*
 	 * We check should_block after signal_pending because we
 	 * will want to wake the task in that case. But if
@@ -6837,6 +6849,7 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 		/* if its PROXY_WAKING, do return migration or run if current */
 		if (mutex == PROXY_WAKING) {
 			if (task_current(rq, p)) {
+				p->is_blocked = 0;
 				clear_task_blocked_on(p, PROXY_WAKING);
 				return p;
 			}
@@ -6872,6 +6885,7 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 		 * just run on this rq), or return-migrate the task.
 		 */
 		if (task_current(rq, p)) {
+			p->is_blocked = 0;
 			__clear_task_blocked_on(p, NULL);
 			return p;
 		}
@@ -7105,7 +7119,7 @@ static void __sched notrace __schedule(int sched_mode)
 		clear_task_blocked_on(prev, NULL);
 
 	rq_set_donor(rq, next);
-	if (unlikely(next->blocked_on)) {
+	if (unlikely(next->is_blocked && next->blocked_on)) {
 		next = find_proxy_task(rq, next, &rf);
 		if (!next) {
 			zap_balance_callbacks(rq);
-- 
2.54.0.563.g4f69b47b94-goog
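
Not part of the patch: a minimal user-space sketch of the flag
lifecycle described in the changelog. The toy_* names and the reduced
struct are hypothetical stand-ins, not kernel code. The model assumes
that a delayed dequeue simply leaves on_rq set, and it ignores locking
and memory ordering entirely. It is only meant to show is_blocked
diverging from !on_rq, and why the __schedule() check tests both
is_blocked and blocked_on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for the task_struct fields above. */
struct toy_task {
	uint8_t on_rq;         /* still queued on a runqueue */
	uint8_t is_blocked;    /* set when blocking, cleared on wakeup */
	uint8_t sched_delayed; /* models DELAY_DEQUEUE: dequeue was deferred */
	void *blocked_on;      /* models the lock the task is waiting for */
};

/* Models try_to_block_task(): a pending signal aborts the block. */
static inline bool toy_block(struct toy_task *p, bool signal_pending,
			     bool delay_dequeue)
{
	if (signal_pending) {
		p->is_blocked = 0;
		return false;
	}

	p->is_blocked = 1;
	if (delay_dequeue)
		p->sched_delayed = 1; /* assumption: task stays on the rq */
	else
		p->on_rq = 0;
	return true;
}

/* Models ttwu_do_wakeup(), plus putting the task back on the runqueue. */
static inline void toy_wakeup(struct toy_task *p)
{
	p->is_blocked = 0;
	p->sched_delayed = 0;
	p->on_rq = 1;
}

/* Models the gate in __schedule(): both conditions must hold. */
static inline bool toy_needs_proxy(const struct toy_task *next)
{
	return next->is_blocked && next->blocked_on;
}

/* Models the WARN_ON_ONCE(p->is_blocked) added to __sched_fork(). */
static inline void toy_fork_check(const struct toy_task *p)
{
	assert(!p->is_blocked);
}
```

In this model, calling toy_block(&t, false, true) on a queued task
leaves is_blocked == 1 while on_rq is still 1, which is the divergence
the changelog refers to. After toy_wakeup(&t), toy_needs_proxy(&t) is
false even if blocked_on has not been cleared yet.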