public inbox for linux-kernel@vger.kernel.org
* [tip:timers/urgent] [timers/migration]  8cdb61838e: WARNING:at_kernel/time/timer_migration.c:#tmigr_connect_child_parent
From: kernel test robot @ 2024-07-10  8:37 UTC
  To: Anna-Maria Behnsen
  Cc: oe-lkp, lkp, linux-kernel, x86, Thomas Gleixner,
	Frederic Weisbecker, oliver.sang



Hello,

kernel test robot noticed "WARNING:at_kernel/time/timer_migration.c:#tmigr_connect_child_parent" on:

commit: 8cdb61838ee5c63556773ea2eed24deab6b15257 ("timers/migration: Move hierarchy setup into cpuhotplug prepare callback")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git timers/urgent

[test failed on linux-next/master 82d01fe6ee52086035b201cfa1410a3b04384257]

in testcase: filebench
version: filebench-x86_64-22620e6-1_20240224
with the following parameters:

	disk: 1HDD
	fs: xfs
	test: randomwrite.f
	cpufreq_governor: performance



compiler: gcc-13
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202407101636.d9d4e8be-oliver.sang@intel.com


[   16.306955][  T209] ------------[ cut here ]------------
[ 16.312287][ T209] WARNING: CPU: 32 PID: 209 at kernel/time/timer_migration.c:1620 tmigr_connect_child_parent (kernel/time/timer_migration.c:1620 (discriminator 7)) 
[   16.323148][  T209] Modules linked in:
[   16.326901][  T209] CPU: 32 PID: 209 Comm: cpuhp/32 Not tainted 6.10.0-rc6-00002-g8cdb61838ee5 #1
[ 16.335766][ T209] RIP: 0010:tmigr_connect_child_parent (kernel/time/timer_migration.c:1620 (discriminator 7)) 
[ 16.341945][ T209] Code: c6 07 00 0f 1f 00 fb 66 90 0f b6 45 60 48 89 e2 48 89 ee 48 89 df 88 44 24 18 e8 ec fc ff ff 84 c0 75 09 48 83 7b 08 00 74 02 <0f> 0b 48 8b 44 24 20 65 48 2b 04 25 28 00 00 00 75 36 48 83 c4 28
All code
========
   0:	c6 07 00             	movb   $0x0,(%rdi)
   3:	0f 1f 00             	nopl   (%rax)
   6:	fb                   	sti    
   7:	66 90                	xchg   %ax,%ax
   9:	0f b6 45 60          	movzbl 0x60(%rbp),%eax
   d:	48 89 e2             	mov    %rsp,%rdx
  10:	48 89 ee             	mov    %rbp,%rsi
  13:	48 89 df             	mov    %rbx,%rdi
  16:	88 44 24 18          	mov    %al,0x18(%rsp)
  1a:	e8 ec fc ff ff       	callq  0xfffffffffffffd0b
  1f:	84 c0                	test   %al,%al
  21:	75 09                	jne    0x2c
  23:	48 83 7b 08 00       	cmpq   $0x0,0x8(%rbx)
  28:	74 02                	je     0x2c
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	48 8b 44 24 20       	mov    0x20(%rsp),%rax
  31:	65 48 2b 04 25 28 00 	sub    %gs:0x28,%rax
  38:	00 00 
  3a:	75 36                	jne    0x72
  3c:	48 83 c4 28          	add    $0x28,%rsp

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	48 8b 44 24 20       	mov    0x20(%rsp),%rax
   7:	65 48 2b 04 25 28 00 	sub    %gs:0x28,%rax
   e:	00 00 
  10:	75 36                	jne    0x48
  12:	48 83 c4 28          	add    $0x28,%rsp
[   16.361382][  T209] RSP: 0000:ffa0000007587da0 EFLAGS: 00010286
[   16.367302][  T209] RAX: 0000000000000000 RBX: ff11002000b56b00 RCX: 0000000000010101
[   16.375130][  T209] RDX: 0000000000010101 RSI: ff11002000b56b00 RDI: ff11002000b56b50
[   16.382955][  T209] RBP: ff11002000b57500 R08: 0000000000000101 R09: ffa0000007587da0
[   16.390782][  T209] R10: 0000000000000001 R11: 00000000c5672a10 R12: 0000000000000001
[   16.398608][  T209] R13: ff11001084311240 R14: 0000000000000020 R15: 0000000000000002
[   16.406433][  T209] FS:  0000000000000000(0000) GS:ff11001fffe00000(0000) knlGS:0000000000000000
[   16.415213][  T209] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.421650][  T209] CR2: 0000000000000000 CR3: 000000207de1c001 CR4: 0000000000771ef0
[   16.429477][  T209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   16.437301][  T209] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   16.445128][  T209] PKRU: 55555554
[   16.448535][  T209] Call Trace:
[   16.451680][  T209]  <TASK>
[ 16.454479][ T209] ? __warn (kernel/panic.c:693) 
[ 16.458405][ T209] ? tmigr_connect_child_parent (kernel/time/timer_migration.c:1620 (discriminator 7)) 
[ 16.463980][ T209] ? report_bug (lib/bug.c:180 lib/bug.c:219) 
[ 16.468338][ T209] ? handle_bug (arch/x86/kernel/traps.c:239) 
[ 16.472523][ T209] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) 
[ 16.477055][ T209] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) 
[ 16.481936][ T209] ? tmigr_connect_child_parent (kernel/time/timer_migration.c:1620 (discriminator 7)) 
[ 16.487507][ T209] tmigr_setup_groups+0x1e6/0x430 
[ 16.492993][ T209] ? __pfx_tmigr_cpu_prepare (kernel/time/timer_migration.c:1727) 
[ 16.498305][ T209] tmigr_cpu_prepare (kernel/time/timer_migration.c:1721 kernel/time/timer_migration.c:1737) 
[ 16.502927][ T209] cpuhp_invoke_callback (kernel/cpu.c:194) 
[ 16.507980][ T209] ? __pfx_smpboot_thread_fn (kernel/smpboot.c:107) 
[ 16.513291][ T209] cpuhp_thread_fun (kernel/cpu.c:1092 (discriminator 1)) 
[ 16.517910][ T209] smpboot_thread_fn (kernel/smpboot.c:164) 
[ 16.522615][ T209] kthread (kernel/kthread.c:389) 
[ 16.526367][ T209] ? __pfx_kthread (kernel/kthread.c:342) 
[ 16.530814][ T209] ret_from_fork (arch/x86/kernel/process.c:147) 
[ 16.535088][ T209] ? __pfx_kthread (kernel/kthread.c:342) 
[ 16.539533][ T209] ret_from_fork_asm (arch/x86/entry/entry_64.S:257) 
[   16.544153][  T209]  </TASK>
[   16.547039][  T209] ---[ end trace 0000000000000000 ]---
[   16.562459][    T1] Timer migration setup failed



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240710/202407101636.d9d4e8be-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [tip:timers/urgent] [timers/migration]  8cdb61838e: WARNING:at_kernel/time/timer_migration.c:#tmigr_connect_child_parent
From: Paul E. McKenney @ 2024-07-15 23:14 UTC
  To: kernel test robot
  Cc: Anna-Maria Behnsen, oe-lkp, lkp, linux-kernel, x86,
	Thomas Gleixner, Frederic Weisbecker

On Wed, Jul 10, 2024 at 04:37:00PM +0800, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed "WARNING:at_kernel/time/timer_migration.c:#tmigr_connect_child_parent" on:
> 
> commit: 8cdb61838ee5c63556773ea2eed24deab6b15257 ("timers/migration: Move hierarchy setup into cpuhotplug prepare callback")
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git timers/urgent

For whatever it is worth, I am also seeing this on refscale runs on
recent -next.

The reproducer is to clone perfbook [1] in your ~/git directory
(as in ~/git/perfbook), then run this from your Linux source tree,
preferably on a system with few CPUs:

bash ~/git/perfbook/CodeSamples/defer/rcuscale.sh

The output will have "FAIL" in it, which indicates that the corresponding
guest OS splatted.  If it would be useful, I would be happy to produce
a one-liner that runs the guest OS only once and leaves the console
output around.  Otherwise, I will continue being lazy.  ;-)

							Thanx, Paul

[1] https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git
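
The reproducer steps described in this message can be sketched as a small
script. This is only a sketch under stated assumptions: the function names
`reproduce` and `splatted`, the variable names, and the default log name
`rcuscale.log` are illustrative and do not come from perfbook or the kernel
tree. Only the clone URL [1], the ~/git/perfbook location, and the
rcuscale.sh path are taken from the text above. It assumes git is installed
and that whatever rcuscale.sh needs in order to run its guest OSes is
already set up.

```bash
#!/bin/bash
# Sketch only: PERFBOOK_URL/PERFBOOK_DIR, reproduce() and splatted() are
# illustrative names, not part of perfbook or the kernel tree.

PERFBOOK_URL=https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git
PERFBOOK_DIR="$HOME/git/perfbook"

# Succeed if the saved output file "$1" contains "FAIL", which per the
# message above indicates that a guest OS splatted.
splatted() {
    grep -q 'FAIL' "$1"
}

# Clone perfbook if it is not there yet, run rcuscale.sh from the Linux
# source tree given as $1, and keep the script's output in the log file
# given as $2 (default: rcuscale.log in the caller's directory).
reproduce() {
    local linux_tree=$1 log=${2:-rcuscale.log}

    if [ ! -d "$PERFBOOK_DIR" ]; then
        mkdir -p "$(dirname "$PERFBOOK_DIR")"
        git clone "$PERFBOOK_URL" "$PERFBOOK_DIR" || return 1
    fi

    (cd "$linux_tree" && bash "$PERFBOOK_DIR/CodeSamples/defer/rcuscale.sh") >"$log" 2>&1

    if splatted "$log"; then
        echo "FAIL seen, see $log"
    else
        echo "no FAIL in $log"
    fi
}
```

Nothing runs until `reproduce` is called with the path to a Linux source
tree as its first argument. Saving the output to a file rather than only
printing it is a deliberate choice here, so that the "FAIL" lines can be
inspected after the run.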



* Re: [tip:timers/urgent] [timers/migration]  8cdb61838e: WARNING:at_kernel/time/timer_migration.c:#tmigr_connect_child_parent
From: Anna-Maria Behnsen @ 2024-07-16  7:36 UTC
  To: paulmck, kernel test robot
  Cc: oe-lkp, lkp, linux-kernel, x86, Thomas Gleixner,
	Frederic Weisbecker

Hi Paul,

"Paul E. McKenney" <paulmck@kernel.org> writes:

> On Wed, Jul 10, 2024 at 04:37:00PM +0800, kernel test robot wrote:
>> 
>> 
>> Hello,
>> 
>> kernel test robot noticed "WARNING:at_kernel/time/timer_migration.c:#tmigr_connect_child_parent" on:
>> 
>> commit: 8cdb61838ee5c63556773ea2eed24deab6b15257 ("timers/migration: Move hierarchy setup into cpuhotplug prepare callback")
>> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git timers/urgent
>
> For whatever it is worth, I am also seeing this on refscale runs on
> recent -next.
>
> The reproducer is to clone perfbook [1] in your ~/git directory
> (as in ~/git/perfbook), then run this from your Linux source tree,
> preferably on a system with few CPUs:
>
> bash ~/git/perfbook/CodeSamples/defer/rcuscale.sh
>
> The output will have "FAIL" in it, which indicates that the corresponding
> guest OS splatted.  If it would be useful, I would be happy to produce
> a one-liner that runs the guest OS only once and leaves the console
> output around.  Otherwise, I will continue being lazy.  ;-)

Thanks for the report. I found the root cause for it and I am working on
a fix as the commit which triggers the warning also has another
problem... And I already requested to drop the tip timers/urgent patches
(at least my patches).

So, enjoy being lazy!

Thanks,

	Anna-Maria



* Re: [tip:timers/urgent] [timers/migration]  8cdb61838e: WARNING:at_kernel/time/timer_migration.c:#tmigr_connect_child_parent
From: Paul E. McKenney @ 2024-07-16 13:41 UTC
  To: Anna-Maria Behnsen
  Cc: kernel test robot, oe-lkp, lkp, linux-kernel, x86,
	Thomas Gleixner, Frederic Weisbecker

On Tue, Jul 16, 2024 at 09:36:47AM +0200, Anna-Maria Behnsen wrote:
> Hi Paul,
> 
> "Paul E. McKenney" <paulmck@kernel.org> writes:
> 
> > On Wed, Jul 10, 2024 at 04:37:00PM +0800, kernel test robot wrote:
> >> 
> >> 
> >> Hello,
> >> 
> >> kernel test robot noticed "WARNING:at_kernel/time/timer_migration.c:#tmigr_connect_child_parent" on:
> >> 
> >> commit: 8cdb61838ee5c63556773ea2eed24deab6b15257 ("timers/migration: Move hierarchy setup into cpuhotplug prepare callback")
> >> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git timers/urgent
> >
> > For whatever it is worth, I am also seeing this on refscale runs on
> > recent -next.
> >
> > The reproducer is to clone perfbook [1] in your ~/git directory
> > (as in ~/git/perfbook), then run this from your Linux source tree,
> > preferably on a system with few CPUs:
> >
> > bash ~/git/perfbook/CodeSamples/defer/rcuscale.sh
> >
> > The output will have "FAIL" in it, which indicates that the corresponding
> > guest OS splatted.  If it would be useful, I would be happy to produce
> > a one-liner that runs the guest OS only once and leaves the console
> > output around.  Otherwise, I will continue being lazy.  ;-)
> 
> Thanks for the report. I found the root cause for it and I am working on
> a fix as the commit which triggers the warning also has another
> problem... And I already requested to drop the tip timers/urgent patches
> (at least my patches).
> 
> So, enjoy being lazy!

Glad that you are on it, thank you!

And yes, the option of being lazy seems to be increasingly attractive
as the years rush by...  ;-)

							Thanx, Paul

