public inbox for linux-pm@vger.kernel.org
* Re: [cpuidle,intel_idle]  32d4fd5751: WARNING:at_kernel/rcu/tree.c:#rcu_eqs_exit
       [not found] <20220612160006.GB35020@xsang-OptiPlex-9020>
@ 2022-06-23 11:23 ` Shinichiro Kawasaki
  2022-09-14  8:26   ` Oliver Sang
  0 siblings, 1 reply; 2+ messages in thread
From: Shinichiro Kawasaki @ 2022-06-23 11:23 UTC (permalink / raw)
  To: kernel test robot
  Cc: Peter Zijlstra, Rafael J. Wysocki, LKML, linux-pm@vger.kernel.org,
	lkp@lists.01.org, lkp@intel.com, Damien Le Moal

On Jun 13, 2022 / 00:00, kernel test robot wrote:
> 
> 
> Greeting,
> 
> FYI, we noticed the following commit (built with gcc-11):
> 
> commit: 32d4fd5751eadbe1823a37eb38df85ec5c8e6207 ("cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: kernel-selftests
> version: kernel-selftests-x86_64-cef46213-1_20220609
> with following parameters:
> 
> 	group: resctrl
> 	ucode: 0x500320a
> 
> test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
> test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
> 
> 
> on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
> 
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> 
> 
> 
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <oliver.sang@intel.com>
> 
> 
> [   29.104402][    T0] WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:864 rcu_eqs_exit+0x4b/0xc0
> [   29.104417][    T0]
> [   29.104418][    T0] =============================
> [   29.104419][    T0] WARNING: suspicious RCU usage
> [   29.104421][    T0] 5.19.0-rc1-00001-g32d4fd5751ea #1 Not tainted
> [   29.104424][    T0] -----------------------------

FYI, I observe this WARNING on my test servers for fstests with kernel
v5.19-rc3. It appears at system boot and repeatedly during fstests runs.
After I reverted commit 32d4fd5751ea, the WARNING disappeared. It was
observed on systems with 20-thread CPUs, but not on systems with 8-thread
CPUs.

Looking at the commit, I'm not sure how it is related to the RCU warning.
If any further action on my system would help, please let me know.
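
For anyone who wants to check their own logs for the same signature, here is
a minimal sketch that pulls the file, line and symbol out of "WARNING: ... at"
lines in saved dmesg text. The function name and the regular expression are
illustrative assumptions based only on the line format quoted above, not part
of any existing tool.

```python
import re

# Matches lines such as:
#   WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:864 rcu_eqs_exit+0x4b/0xc0
# The pattern is an assumption derived from the dmesg excerpt in this thread.
WARNING_AT = re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+):(\d+) (\S+)")


def find_warnings(dmesg_text):
    """Return (source_file, line_number, symbol) for each WARNING line found."""
    hits = []
    for line in dmesg_text.splitlines():
        match = WARNING_AT.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2)), match.group(3)))
    return hits
```

Feeding it the saved output of dmesg from a boot with and without the commit
reverted should make it easy to compare the two runs.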

-- 
Shin'ichiro Kawasaki


* Re: [cpuidle,intel_idle]  32d4fd5751: WARNING:at_kernel/rcu/tree.c:#rcu_eqs_exit
  2022-06-23 11:23 ` [cpuidle,intel_idle] 32d4fd5751: WARNING:at_kernel/rcu/tree.c:#rcu_eqs_exit Shinichiro Kawasaki
@ 2022-09-14  8:26   ` Oliver Sang
  0 siblings, 0 replies; 2+ messages in thread
From: Oliver Sang @ 2022-09-14  8:26 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: Peter Zijlstra, Rafael J. Wysocki, LKML, linux-pm@vger.kernel.org,
	lkp@lists.01.org, lkp@intel.com, Damien Le Moal

Hi Shin'ichiro Kawasaki and Peter Zijlstra,

On Thu, Jun 23, 2022 at 11:23:59AM +0000, Shinichiro Kawasaki wrote:
> On Jun 13, 2022 / 00:00, kernel test robot wrote:
> > 
> > 
> > Greeting,
> > 
> > FYI, we noticed the following commit (built with gcc-11):
> > 
> > commit: 32d4fd5751eadbe1823a37eb38df85ec5c8e6207 ("cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > in testcase: kernel-selftests
> > version: kernel-selftests-x86_64-cef46213-1_20220609
> > with following parameters:
> > 
> > 	group: resctrl
> > 	ucode: 0x500320a
> > 
> > test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
> > test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
> > 
> > 
> > on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
> > 
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> > 
> > 
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > 
> > 
> > [   29.104402][    T0] WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:864 rcu_eqs_exit+0x4b/0xc0
> > [   29.104417][    T0]
> > [   29.104418][    T0] =============================
> > [   29.104419][    T0] WARNING: suspicious RCU usage
> > [   29.104421][    T0] 5.19.0-rc1-00001-g32d4fd5751ea #1 Not tainted
> > [   29.104424][    T0] -----------------------------
> 
> FYI, I observe this WARNING on my test servers for fstests with kernel
> v5.19-rc3. It appears at system boot and repeatedly during fstests runs.
> After I reverted commit 32d4fd5751ea, the WARNING disappeared. It was
> observed on systems with 20-thread CPUs, but not on systems with 8-thread
> CPUs.
> 
> Looking at the commit, I'm not sure how it is related to the RCU warning.
> If any further action on my system would help, please let me know.

We recently ran further tests and confirmed that the issue is present on this
commit but absent on its parent (v5.19-rc1), on the same test machine:
  88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory

=========================================================================================
compiler/group/kconfig/rootfs/tbox_group/testcase:
  gcc-11/resctrl/x86_64-rhel-8.3-kselftests/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/kernel-selftests

commit:
  v5.19-rc1
  32d4fd5751eadbe1823a37eb38df85ec5c8e6207

       v5.19-rc1 32d4fd5751eadbe1823a37eb38d
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :20         100%          20:20    dmesg.RIP:rcu_eqs_exit   <------
           :20          95%          19:20    dmesg.RIP:sched_clock_tick
           :20          90%          18:20    dmesg.WARNING:at_kernel/rcu/tree.c:#rcu_eqs_exit
           :20          90%          18:20    dmesg.WARNING:at_kernel/sched/clock.c:#sched_clock_tick
           :20         100%          20:20    dmesg.WARNING:suspicious_RCU_usage
           :20         100%          20:20    dmesg.boot_failures
           :20           5%           1:20    dmesg.include/linux/rcupdate.h:#rcu_read_lock()used_illegally_while_idle
           :20           5%           1:20    dmesg.include/linux/rcupdate.h:#rcu_read_unlock()used_illegally_while_idle
           :20          95%          19:20    dmesg.include/trace/events/error_report.h:#suspicious_rcu_dereference_check()usage
           :20         100%          20:20    dmesg.include/trace/events/lock.h:#suspicious_rcu_dereference_check()usage
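
To make the columns above easier to read, here is a minimal sketch of how the
%reproduction value relates to the two fail:runs cells. It assumes that an
empty fail count (":20") means zero failures and that %reproduction is the
difference between the two failure rates; both are assumptions inferred from
the table, and the function names are made up for illustration.

```python
def parse_fail_runs(cell):
    """Split a 'fail:runs' cell; an empty fail count is read as 0 (assumption)."""
    fails, runs = cell.strip().split(":")
    return (int(fails) if fails else 0), int(runs)


def reproduction_percent(parent_cell, commit_cell):
    """Difference in failure rate between commit and parent, in percent."""
    parent_fails, parent_runs = parse_fail_runs(parent_cell)
    commit_fails, commit_runs = parse_fail_runs(commit_cell)
    rate_delta = commit_fails / commit_runs - parent_fails / parent_runs
    return round(100 * rate_delta)


# Example using the rcu_eqs_exit WARNING row: parent ":20", commit "18:20".
print(reproduction_percent(":20", "18:20"))
```

Under these assumptions, the 18:20 row yields 90, matching the 90% shown in
the table.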


As Shin'ichiro Kawasaki mentioned, the issue does not seem to reproduce on
systems with a small number of CPU threads, so we also tested on a VM with
only 2 threads:
  qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

We confirmed that the issue cannot be reproduced there.

We don't have the related knowledge to analyze this further, but if extra
data or testing is needed, we can help.

> 
> -- 
> Shin'ichiro Kawasaki

