public inbox for linux-kernel@vger.kernel.org
* [linus:master] [timers]  fe90c5ba88: BUG:KCSAN:data-race_in_timer_expire_remote/timer_recalc_next_expiry
From: kernel test robot @ 2024-10-30  5:38 UTC (permalink / raw)
  To: Anna-Maria Behnsen
  Cc: oe-lkp, lkp, linux-kernel, Thomas Gleixner, Frederic Weisbecker,
	oliver.sang



Hello,


We understand that this is a renaming commit, which is what causes the
difference in the KCSAN reports below.

d7b01b81bd2dad57 fe90c5ba88ad43d42acefb21b57
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
         20:300         -7%            :300   dmesg.BUG:KCSAN:data-race_in_next_expiry_recalc/run_timer_softirq
         11:300         -4%            :300   dmesg.BUG:KCSAN:data-race_in_next_expiry_recalc/timer_expire_remote
           :300          6%          18:300   dmesg.BUG:KCSAN:data-race_in_run_timer_softirq/timer_recalc_next_expiry
           :300          5%          14:300   dmesg.BUG:KCSAN:data-race_in_timer_expire_remote/timer_recalc_next_expiry

We are sending this report to make you aware of possible issues in the
related code. It is up to you to decide whether these issues need attention.

If you need us to run more tests or test a patch, please let us know. Thanks!

The full report is below, FYI.


kernel test robot noticed "BUG:KCSAN:data-race_in_timer_expire_remote/timer_recalc_next_expiry" on:

commit: fe90c5ba88ad43d42acefb21b57df837be86a61a ("timers: Rename next_expiry_recalc() to be unique")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      81983758430957d9a5cb3333fe324fd70cf63e7e]
[test failed on linux-next/master dec9255a128e19c5fcc3bdb18175d78094cc624d]

in testcase: trinity
version: 
with the following parameters:

	runtime: 300s
	group: group-03
	nr_groups: 5



config: x86_64-randconfig-073-20241025
compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202410301205.ef8e9743-lkp@intel.com


[  131.612941][    C0] ==================================================================
[  131.614221][    C0] BUG: KCSAN: data-race in timer_expire_remote / timer_recalc_next_expiry
[  131.615526][    C0]
[  131.615932][    C0] write (marked) to 0xffff88842fd1d4d0 of 8 bytes by interrupt on cpu 1:
[ 131.617229][ C0] timer_recalc_next_expiry (kernel/time/timer.c:1969 (discriminator 2)) 
[ 131.618104][ C0] __run_timers (kernel/time/timer.c:2399) 
[ 131.618818][ C0] run_timer_softirq (kernel/time/timer.c:2431 kernel/time/timer.c:2423 kernel/time/timer.c:2439 kernel/time/timer.c:2449) 
[ 131.619581][ C0] handle_softirqs (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:555) 
[ 131.620321][ C0] __irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637) 
[ 131.621036][ C0] irq_exit_rcu (kernel/softirq.c:651) 
[ 131.621719][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043 arch/x86/kernel/apic/apic.c:1043) 
[ 131.622596][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702) 
[ 131.623523][ C0] default_idle (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:743) 
[ 131.628342][ C0] default_idle_call (include/linux/cpuidle.h:143 kernel/sched/idle.c:118) 
[ 131.629091][ C0] cpuidle_idle_call (kernel/sched/idle.c:186) 
[ 131.629892][ C0] do_idle (kernel/sched/idle.c:328) 
[ 131.630536][ C0] cpu_startup_entry (kernel/sched/idle.c:423 (discriminator 1)) 
[ 131.631318][ C0] start_secondary (arch/x86/kernel/smpboot.c:224 arch/x86/kernel/smpboot.c:291) 
[ 131.632082][ C0] common_startup_64 (arch/x86/kernel/head_64.S:421) 
[  131.632864][    C0]
[  131.633255][    C0] read to 0xffff88842fd1d4d0 of 8 bytes by interrupt on cpu 0:
[ 131.634436][ C0] timer_expire_remote (kernel/time/timer.c:2425 kernel/time/timer.c:2182) 
[ 131.635247][ C0] tmigr_handle_remote_up (arch/x86/include/asm/irqflags.h:26 arch/x86/include/asm/irqflags.h:87 arch/x86/include/asm/irqflags.h:147 kernel/time/timer_migration.c:947 kernel/time/timer_migration.c:1021) 
[ 131.636092][ C0] tmigr_handle_remote (kernel/time/timer_migration.c:533 kernel/time/timer_migration.c:1080) 
[ 131.636875][ C0] run_timer_softirq (kernel/time/timer.c:2455) 
[ 131.637671][ C0] handle_softirqs (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:555) 
[ 131.638358][ C0] __irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637) 
[ 131.639083][ C0] irq_exit_rcu (kernel/softirq.c:651) 
[ 131.639737][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043 arch/x86/kernel/apic/apic.c:1043) 
[ 131.640600][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702) 
[ 131.641494][ C0] mtree_range_walk (lib/maple_tree.c:788 lib/maple_tree.c:2792) 
[ 131.642226][ C0] mas_walk (lib/maple_tree.c:265 lib/maple_tree.c:4907) 
[ 131.642874][ C0] lock_vma_under_rcu (mm/memory.c:5996) 
[ 131.643627][ C0] do_user_addr_fault (arch/x86/mm/fault.c:1330) 
[ 131.644422][ C0] exc_page_fault (arch/x86/include/asm/irqflags.h:26 arch/x86/include/asm/irqflags.h:87 arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) 
[ 131.645154][ C0] asm_exc_page_fault (arch/x86/include/asm/idtentry.h:623) 
[  131.645911][    C0]
[  131.646292][    C0] value changed: 0x00000000ffff5991 -> 0x00000000ffff5bc0
[  131.647367][    C0]
[  131.647755][    C0] Reported by Kernel Concurrency Sanitizer on:
[  131.648646][    C0] CPU: 0 UID: 0 PID: 518 Comm: run Not tainted 6.11.0-rc1-00042-gfe90c5ba88ad #1 cf6842c5d2875ed08b01af3196bb8a34c3713203
[  131.650536][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  131.652009][    C0] ==================================================================



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241030/202410301205.ef8e9743-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* timers: Add missing READ_ONCE() in __run_timer_base()
From: Thomas Gleixner @ 2024-10-30  7:53 UTC (permalink / raw)
  To: kernel test robot, Anna-Maria Behnsen
  Cc: oe-lkp, lkp, linux-kernel, Frederic Weisbecker, oliver.sang

__run_timer_base() checks base::next_expiry without holding
base::lock. That can race with a remote CPU updating next_expiry under the
lock. This is an intentional and harmless data race, but lacks a
READ_ONCE(), so KCSAN complains about this.

Add the missing READ_ONCE(). All other places are covered already.

Fixes: 79f8b28e85f8 ("timers: Annotate possible non critical data race of next_expiry")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Closes: https://lore.kernel.org/oe-lkp/202410301205.ef8e9743-lkp@intel.com
---
 kernel/time/timer.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -2422,7 +2422,8 @@ static inline void __run_timers(struct t
 
 static void __run_timer_base(struct timer_base *base)
 {
-	if (time_before(jiffies, base->next_expiry))
+	/* Can race against a remote CPU updating next_expiry under the lock */
+	if (time_before(jiffies, READ_ONCE(base->next_expiry)))
 		return;
 
 	timer_base_lock_expiry(base);
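
For readers outside the timer code, the pattern this patch annotates can be
sketched in user-space C. This is only an illustration under stated
assumptions, not the kernel implementation: struct demo_base,
demo_set_next_expiry() and demo_needs_run() are hypothetical names, the macros
are simplified stand-ins for the kernel's READ_ONCE(), WRITE_ONCE() and
time_before(), and a pthread mutex stands in for base::lock. It assumes GCC or
Clang, since it relies on __typeof__.

```c
#include <pthread.h>
#include <stdbool.h>

/* Simplified user-space stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define time_before(a, b)  ((long)((a) - (b)) < 0)

/* Hypothetical miniature of a timer base: a lock plus the next expiry. */
struct demo_base {
	pthread_mutex_t lock;
	unsigned long next_expiry;
};

/* Writer side: next_expiry is updated under the lock with a marked write. */
static void demo_set_next_expiry(struct demo_base *base, unsigned long expires)
{
	pthread_mutex_lock(&base->lock);
	WRITE_ONCE(base->next_expiry, expires);
	pthread_mutex_unlock(&base->lock);
}

/*
 * Reader side: lockless early-exit check. The read can race with the
 * writer above, so it is marked with READ_ONCE() to make the race explicit.
 */
static bool demo_needs_run(struct demo_base *base, unsigned long now)
{
	if (time_before(now, READ_ONCE(base->next_expiry)))
		return false;	/* nothing due yet, skip taking the lock */

	return true;		/* caller would now take the lock and expire timers */
}
```

The marked read does not remove the race. It tells the compiler to perform a
single access and tells KCSAN that the concurrent access is intended.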


* Re: timers: Add missing READ_ONCE() in __run_timer_base()
From: Frederic Weisbecker @ 2024-10-31  9:44 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: kernel test robot, Anna-Maria Behnsen, oe-lkp, lkp, linux-kernel

On Wed, Oct 30, 2024 at 08:53:51AM +0100, Thomas Gleixner wrote:
> __run_timer_base() checks base::next_expiry without holding
> base::lock. That can race with a remote CPU updating next_expiry under the
> lock. This is an intentional and harmless data race, but lacks a
> READ_ONCE(), so KCSAN complains about this.
> 
> Add the missing READ_ONCE(). All other places are covered already.
> 
> Fixes: 79f8b28e85f8 ("timers: Annotate possible non critical data race of next_expiry")
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Closes: https://lore.kernel.org/oe-lkp/202410301205.ef8e9743-lkp@intel.com

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>


* [tip: timers/core] timers: Add missing READ_ONCE() in __run_timer_base()
From: tip-bot2 for Thomas Gleixner @ 2024-10-31 10:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kernel test robot, Thomas Gleixner, Frederic Weisbecker, x86,
	linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     1d4199cbbe95efaba51304cfd844bd0ccd224e61
Gitweb:        https://git.kernel.org/tip/1d4199cbbe95efaba51304cfd844bd0ccd224e61
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 30 Oct 2024 08:53:51 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 31 Oct 2024 11:45:01 +01:00

timers: Add missing READ_ONCE() in __run_timer_base()

__run_timer_base() checks base::next_expiry without holding
base::lock. That can race with a remote CPU updating next_expiry under the
lock. This is an intentional and harmless data race, but lacks a
READ_ONCE(), so KCSAN complains about this.

Add the missing READ_ONCE(). All other places are covered already.

Fixes: 79f8b28e85f8 ("timers: Annotate possible non critical data race of next_expiry")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/all/87a5emyqk0.ffs@tglx
Closes: https://lore.kernel.org/oe-lkp/202410301205.ef8e9743-lkp@intel.com
---
 kernel/time/timer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 02355b2..a283e52 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -2421,7 +2421,8 @@ static inline void __run_timers(struct timer_base *base)
 
 static void __run_timer_base(struct timer_base *base)
 {
-	if (time_before(jiffies, base->next_expiry))
+	/* Can race against a remote CPU updating next_expiry under the lock */
+	if (time_before(jiffies, READ_ONCE(base->next_expiry)))
 		return;
 
 	timer_base_lock_expiry(base);
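
The early-exit check in the hunk above depends on time_before() comparing
jiffies values correctly across counter wraparound. A minimal user-space
sketch of that comparison follows. demo_time_before() is a hypothetical name
that mirrors the signed-difference idea behind the kernel macro without its
type checking, and it assumes the usual two's-complement behaviour when
converting unsigned long to long.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Returns true if a is before b, treating both as points on a wrapping
 * counter. The unsigned subtraction wraps modulo 2^N; reinterpreting the
 * result as signed yields a negative value when a lies behind b.
 */
static bool demo_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}
```

With the two next_expiry values from the KCSAN report,
demo_time_before(0xffff5991UL, 0xffff5bc0UL) is true, and so is
demo_time_before(ULONG_MAX - 5, 10), where the counter has wrapped between
the two points.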
