From: Tim Chen <tim.c.chen@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Vincent Guittot <vincent.guittot@linaro.org>
Cc: Chen Yu <yu.c.chen@intel.com>, Juri Lelli <juri.lelli@redhat.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Madadi Vineeth Reddy <vineethr@linux.ibm.com>,
Hillf Danton <hdanton@sina.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>,
Jianyong Wu <jianyong.wu@outlook.com>,
Yangyu Chen <cyy@cyyself.name>,
Tingyin Duan <tingyin.duan@gmail.com>,
Vern Hao <vernhao@tencent.com>, Vern Hao <haoxing990@gmail.com>,
Len Brown <len.brown@intel.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Aubrey Li <aubrey.li@intel.com>, Zhao Liu <zhao1.liu@intel.com>,
Chen Yu <yu.chen.surf@gmail.com>,
Adam Li <adamli@os.amperecomputing.com>,
Aaron Lu <ziqianlu@bytedance.com>,
Tim Chen <tim.c.chen@intel.com>, Josh Don <joshdon@google.com>,
Gavin Guo <gavinguo@igalia.com>,
Qais Yousef <qyousef@layalina.io>,
Libo Chen <libchen@purestorage.com>,
Luo Gengkun <luogengkun2@huawei.com>,
linux-kernel@vger.kernel.org
Subject: [Patch v4 10/16] sched/cache: Fix unpaired account_llc_enqueue/dequeue
Date: Wed, 13 May 2026 13:39:21 -0700 [thread overview]
Message-ID: <0c8c6a1571d66792a4d2ff0103ba3cc13e059046.1778703694.git.tim.c.chen@linux.intel.com> (raw)
In-Reply-To: <cover.1778703694.git.tim.c.chen@linux.intel.com>
From: Chen Yu <yu.c.chen@intel.com>
There is a race condition that, after a task is enqueued
on a runqueue, task_llc(p) may change due to CPU hotplug,
because the llc_id is dynamically allocated and adjusted
at runtime.
Therefore, checking task_llc(p) to determine whether the
task is being dequeued from its preferred LLC is unreliable
and can cause inconsistent values.
To fix this problem, record whether p is enqueued on its
preferred LLC, in order to pair with account_llc_dequeue()
to maintain a consistent nr_pref_llc_running per runqueue.
This bug was reported by sashiko, and the solution was once
suggested by Prateek.
Fixes: 46afe3af7ead ("sched/cache: Track LLC-preferred tasks per runqueue")
Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
include/linux/sched.h | 2 ++
init/init_task.c | 1 +
kernel/sched/fair.c | 31 ++++++++++++++++++++++++++++---
3 files changed, 31 insertions(+), 3 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 95729670929c..2c9e8e2edde1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1410,6 +1410,8 @@ struct task_struct {
#ifdef CONFIG_SCHED_CACHE
struct callback_head cache_work;
int preferred_llc;
+ /* 1: task was enqueued to its preferred LLC, 0 otherwise */
+ int pref_llc_queued;
#endif
struct rseq_data rseq;
diff --git a/init/init_task.c b/init/init_task.c
index 5d90db4ff1f8..3ecd66fbd563 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -217,6 +217,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
#endif
#ifdef CONFIG_SCHED_CACHE
.preferred_llc = -1,
+ .pref_llc_queued = 0,
#endif
#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
.kasan_depth = 1,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 73f185ba6e48..9e6edd40cd80 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1472,15 +1472,32 @@ static bool invalid_llc_nr(struct mm_struct *mm, struct task_struct *p,
static void account_llc_enqueue(struct rq *rq, struct task_struct *p)
{
+ int pref_llc, pref_llc_queued;
struct sched_domain *sd;
- int pref_llc;
pref_llc = p->preferred_llc;
if (pref_llc < 0)
return;
+ pref_llc_queued = (pref_llc == task_llc(p));
rq->nr_llc_running++;
- rq->nr_pref_llc_running += (pref_llc == task_llc(p));
+ rq->nr_pref_llc_running += pref_llc_queued;
+
+ /*
+ * Record whether p is enqueued on its preferred
+ * LLC, in order to pair with account_llc_dequeue()
+ * to maintain a consistent nr_pref_llc_running per
+ * runqueue.
+ * This is necessary because a race condition exists:
+ * after a task is enqueued on a runqueue, task_llc(p)
+ * may change due to CPU hotplug. Therefore, checking
+ * task_llc(p) to determine whether the task is being
+ * dequeued from its preferred LLC is unreliable and
+ * can cause inconsistent values - checking the
+ * p->pref_llc_queued in account_llc_dequeue() would
+ * be reliable.
+ */
+ p->pref_llc_queued = pref_llc_queued;
sd = rcu_dereference_all(rq->sd);
if (sd && (unsigned int)pref_llc < sd->llc_max)
@@ -1497,7 +1514,15 @@ static void account_llc_dequeue(struct rq *rq, struct task_struct *p)
return;
rq->nr_llc_running--;
- rq->nr_pref_llc_running -= (pref_llc == task_llc(p));
+ if (p->pref_llc_queued) {
+ rq->nr_pref_llc_running--;
+ /*
+ * Update the status in case
+ * other logic might query
+ * this.
+ */
+ p->pref_llc_queued = 0;
+ }
sd = rcu_dereference_all(rq->sd);
if (sd && (unsigned int)pref_llc < sd->llc_max) {
--
2.32.0
next prev parent reply other threads:[~2026-05-13 20:33 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-13 20:39 [Patch v4 00/16] Cache aware scheduling enhancements Tim Chen
2026-05-13 20:39 ` [Patch v4 01/16] sched/cache: Allow only 1 thread of the process to calculate the LLC occupancy Tim Chen
2026-05-13 20:39 ` [Patch v4 02/16] sched/cache: Disable cache aware scheduling for processes with high thread counts Tim Chen
2026-05-13 20:39 ` [Patch v4 03/16] sched/cache: Skip cache-aware scheduling for single-threaded processes Tim Chen
2026-05-13 20:39 ` [Patch v4 04/16] sched/cache: Calculate the LLC size and store it in sched_domain Tim Chen
2026-05-13 20:39 ` [Patch v4 05/16] sched/cache: Avoid cache-aware scheduling for memory-heavy processes Tim Chen
2026-05-13 20:39 ` [Patch v4 06/16] sched/cache: Add user control to adjust the aggressiveness of cache-aware scheduling Tim Chen
2026-05-13 20:39 ` [Patch v4 07/16] sched/cache: Fix rcu warning when accessing sd_llc domain Tim Chen
2026-05-13 20:39 ` [Patch v4 08/16] sched/cache: Fix potential NULL mm pointer access Tim Chen
2026-05-13 20:39 ` [Patch v4 09/16] sched/cache: Annotate lockless accesses to mm->sc_stat.cpu Tim Chen
2026-05-13 20:39 ` Tim Chen [this message]
2026-05-13 20:39 ` [Patch v4 11/16] sched/cache: Fix checking active load balance by only considering the CFS task Tim Chen
2026-05-13 20:39 ` [Patch v4 12/16] sched/cache: Fix race condition during sched domain rebuild Tim Chen
2026-05-13 20:39 ` [Patch v4 13/16] sched/cache: Fix cache aware scheduling enabling for multi LLCs system Tim Chen
2026-05-13 20:39 ` [Patch v4 14/16] sched/cache: Fix has_multi_llcs iff at least one partition has multiple LLCs Tim Chen
2026-05-13 20:39 ` [Patch v4 15/16] sched/cache: Fix possible overflow when invalidating the preferred CPU Tim Chen
2026-05-13 20:39 ` [Patch v4 16/16] sched/cache: Fix stale preferred_llc for a new task Tim Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0c8c6a1571d66792a4d2ff0103ba3cc13e059046.1778703694.git.tim.c.chen@linux.intel.com \
--to=tim.c.chen@linux.intel.com \
--cc=adamli@os.amperecomputing.com \
--cc=aubrey.li@intel.com \
--cc=bsegall@google.com \
--cc=cyy@cyyself.name \
--cc=dietmar.eggemann@arm.com \
--cc=gavinguo@igalia.com \
--cc=haoxing990@gmail.com \
--cc=hdanton@sina.com \
--cc=jianyong.wu@outlook.com \
--cc=joshdon@google.com \
--cc=juri.lelli@redhat.com \
--cc=kprateek.nayak@amd.com \
--cc=len.brown@intel.com \
--cc=libchen@purestorage.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luogengkun2@huawei.com \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=qyousef@layalina.io \
--cc=rostedt@goodmis.org \
--cc=sshegde@linux.ibm.com \
--cc=tim.c.chen@intel.com \
--cc=tingyin.duan@gmail.com \
--cc=vernhao@tencent.com \
--cc=vincent.guittot@linaro.org \
--cc=vineethr@linux.ibm.com \
--cc=vschneid@redhat.com \
--cc=yu.c.chen@intel.com \
--cc=yu.chen.surf@gmail.com \
--cc=zhao1.liu@intel.com \
--cc=ziqianlu@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox