From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, Vincent Guittot
Cc: Chen Yu, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
	Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
	Vern Hao, Len Brown, Tim Chen, Aubrey Li, Zhao Liu, Chen Yu,
	Adam Li, Aaron Lu, Tim Chen, Josh Don, Gavin Guo, Qais Yousef,
	Libo Chen, Luo Gengkun, linux-kernel@vger.kernel.org
Subject: [Patch v4 09/16] sched/cache: Annotate lockless accesses to mm->sc_stat.cpu
Date: Wed, 13 May 2026 13:39:20 -0700
Message-Id: <63ea494f12efcf265d7134400a06cd75d7f2c310.1778703694.git.tim.c.chen@linux.intel.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Chen Yu

mm->sc_stat.cpu is written
by task_cache_work() and can be read locklessly by several functions
on other CPUs. Use READ_ONCE()/WRITE_ONCE() for these accesses so that
compiler optimizations cannot generate multiple loads or stores that
observe inconsistent values. For example, in get_pref_llc(), if the
writer updated the field between two compiler-generated loads, the
validation (cpu != -1) and the subsequent use (llc_id(cpu)) could
operate on different values, allowing a negative CPU ID to be used as
an index. Leave the plain write in mm_init_sched(), where the mm is
not yet visible to other CPUs.

This bug was reported by sashiko.

Fixes: 47d8696b95f7 ("sched/cache: Assign preferred LLC ID to processes")
Signed-off-by: Chen Yu
Co-developed-by: Tim Chen
Signed-off-by: Tim Chen
---
 kernel/sched/fair.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 913b09254732..73f185ba6e48 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1598,13 +1598,14 @@ static unsigned long fraction_mm_sched(struct rq *rq,
 
 static int get_pref_llc(struct task_struct *p, struct mm_struct *mm)
 {
-	int mm_sched_llc = -1;
+	int mm_sched_llc = -1, mm_sched_cpu;
 
 	if (!mm)
 		return -1;
 
-	if (mm->sc_stat.cpu != -1) {
-		mm_sched_llc = llc_id(mm->sc_stat.cpu);
+	mm_sched_cpu = READ_ONCE(mm->sc_stat.cpu);
+	if (mm_sched_cpu != -1) {
+		mm_sched_llc = llc_id(mm_sched_cpu);
 
 #ifdef CONFIG_NUMA_BALANCING
 		/*
@@ -1619,7 +1620,7 @@ static int get_pref_llc(struct task_struct *p, struct mm_struct *mm)
 		 */
 		if (static_branch_likely(&sched_numa_balancing) &&
 		    p->numa_preferred_nid >= 0 &&
-		    cpu_to_node(mm->sc_stat.cpu) != p->numa_preferred_nid)
+		    cpu_to_node(mm_sched_cpu) != p->numa_preferred_nid)
 			mm_sched_llc = -1;
 #endif
 	}
@@ -1665,8 +1666,8 @@ void account_mm_sched(struct rq *rq, struct task_struct *p, s64 delta_exec)
 	if (epoch - READ_ONCE(mm->sc_stat.epoch) > llc_epoch_affinity_timeout ||
 	    invalid_llc_nr(mm, p, cpu_of(rq)) ||
 	    exceed_llc_capacity(mm, cpu_of(rq))) {
-		if (mm->sc_stat.cpu != -1)
-			mm->sc_stat.cpu = -1;
+		if (READ_ONCE(mm->sc_stat.cpu) != -1)
+			WRITE_ONCE(mm->sc_stat.cpu, -1);
 	}
 
 	mm_sched_llc = get_pref_llc(p, mm);
@@ -1714,7 +1715,7 @@ static void get_scan_cpumasks(cpumask_var_t cpus, struct task_struct *p)
 	if (!static_branch_likely(&sched_numa_balancing))
 		goto out;
 
-	cpu = p->mm->sc_stat.cpu;
+	cpu = READ_ONCE(p->mm->sc_stat.cpu);
 	if (cpu != -1)
 		nid = cpu_to_node(cpu);
 	curr_cpu = task_cpu(p);
@@ -1799,8 +1800,8 @@ static void task_cache_work(struct callback_head *work)
 	curr_cpu = task_cpu(p);
 	if (invalid_llc_nr(mm, p, curr_cpu) ||
 	    exceed_llc_capacity(mm, curr_cpu)) {
-		if (mm->sc_stat.cpu != -1)
-			mm->sc_stat.cpu = -1;
+		if (READ_ONCE(mm->sc_stat.cpu) != -1)
+			WRITE_ONCE(mm->sc_stat.cpu, -1);
 		return;
 	}
 
@@ -1857,7 +1858,7 @@ static void task_cache_work(struct callback_head *work)
 			m_a_cpu = m_cpu;
 		}
 
-		if (llc_id(cpu) == llc_id(mm->sc_stat.cpu))
+		if (llc_id(cpu) == llc_id(READ_ONCE(mm->sc_stat.cpu)))
 			curr_m_a_occ = a_occ;
 
 		cpumask_andnot(cpus, cpus, sched_domain_span(sd));
@@ -1875,7 +1876,7 @@ static void task_cache_work(struct callback_head *work)
 		 * 3. 2X is chosen based on test results, as it delivers
 		 *    the optimal performance gain so far.
 		 */
-		mm->sc_stat.cpu = m_a_cpu;
+		WRITE_ONCE(mm->sc_stat.cpu, m_a_cpu);
 	}
 
 	update_avg_scale(&mm->sc_stat.nr_running_avg, nr_running);
@@ -10441,15 +10442,15 @@ static enum llc_mig can_migrate_llc_task(int src_cpu, int dst_cpu,
 	if (!mm)
 		return mig_unrestricted;
 
-	cpu = mm->sc_stat.cpu;
+	cpu = READ_ONCE(mm->sc_stat.cpu);
 	if (cpu < 0 || cpus_share_cache(src_cpu, dst_cpu))
 		return mig_unrestricted;
 
 	/* skip cache aware load balance for too many threads */
 	if (invalid_llc_nr(mm, p, dst_cpu) ||
 	    exceed_llc_capacity(mm, dst_cpu)) {
-		if (mm->sc_stat.cpu != -1)
-			mm->sc_stat.cpu = -1;
+		if (READ_ONCE(mm->sc_stat.cpu) != -1)
+			WRITE_ONCE(mm->sc_stat.cpu, -1);
 		return mig_unrestricted;
 	}
-- 
2.32.0