All of lore.kernel.org
 help / color / mirror / Atom feed
From: kernel test robot <oliver.sang@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>, <aubrey.li@linux.intel.com>,
	<yu.c.chen@intel.com>, <oliver.sang@intel.com>
Subject: [peterz-queue:sched/prep] [sched/fair]  124c8f4374: WARNING:at_kernel/sched/sched.h:#update_load_avg
Date: Mon, 29 Jul 2024 14:14:52 +0800	[thread overview]
Message-ID: <202407291328.761441a7-oliver.sang@intel.com> (raw)



Hello,

kernel test robot noticed "WARNING:at_kernel/sched/sched.h:#update_load_avg" on:

commit: 124c8f43740f3f1e3e2aeecd976bcf5ddd7bd435 ("sched/fair: Implement delayed dequeue")
https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git sched/prep

in testcase: boot

compiler: clang-18
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


we noticed the issue does not always happen, 60 times out of 150 runs as below.
but the parent keeps clean.

a1133ac6dc7c6bde 124c8f43740f3f1e3e2aeecd976
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :150         40%          60:150   dmesg.WARNING:at_kernel/sched/sched.h:#update_load_avg



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202407291328.761441a7-oliver.sang@intel.com


[   13.522527][    C0] ------------[ cut here ]------------
[   13.523046][    C0] rq->clock_update_flags < RQCF_ACT_SKIP
[ 13.523055][ C0] WARNING: CPU: 0 PID: 158 at kernel/sched/sched.h:1550 update_load_avg (kernel/sched/sched.h:1550 kernel/sched/pelt.h:67 kernel/sched/pelt.h:174 kernel/sched/fair.c:4747) 
[   13.524265][    C0] Modules linked in: drm fuse ip_tables
[   13.524722][    C0] CPU: 0 UID: 0 PID: 158 Comm: openipmi Not tainted 6.10.0-12745-g124c8f43740f #1
[   13.525464][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 13.526307][ C0] RIP: 0010:update_load_avg (kernel/sched/sched.h:1550 kernel/sched/pelt.h:67 kernel/sched/pelt.h:174 kernel/sched/fair.c:4747) 
[ 13.526793][ C0] Code: 00 4c 2b bb 88 01 00 00 40 f6 c5 02 0f 84 b7 fa ff ff e9 ca fa ff ff c6 05 b9 1b e4 01 01 48 c7 c7 34 74 d6 a6 e8 a1 c7 fa ff <0f> 0b e9 78 fa ff ff 65 8b 0d 8f 83 91 5a 89 c9 48 0f a3 0d e9 7b
All code
========
   0:	00 4c 2b bb          	add    %cl,-0x45(%rbx,%rbp,1)
   4:	88 01                	mov    %al,(%rcx)
   6:	00 00                	add    %al,(%rax)
   8:	40 f6 c5 02          	test   $0x2,%bpl
   c:	0f 84 b7 fa ff ff    	je     0xfffffffffffffac9
  12:	e9 ca fa ff ff       	jmpq   0xfffffffffffffae1
  17:	c6 05 b9 1b e4 01 01 	movb   $0x1,0x1e41bb9(%rip)        # 0x1e41bd7
  1e:	48 c7 c7 34 74 d6 a6 	mov    $0xffffffffa6d67434,%rdi
  25:	e8 a1 c7 fa ff       	callq  0xfffffffffffac7cb
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	e9 78 fa ff ff       	jmpq   0xfffffffffffffaa9
  31:	65 8b 0d 8f 83 91 5a 	mov    %gs:0x5a91838f(%rip),%ecx        # 0x5a9183c7
  38:	89 c9                	mov    %ecx,%ecx
  3a:	48                   	rex.W
  3b:	0f                   	.byte 0xf
  3c:	a3                   	.byte 0xa3
  3d:	0d                   	.byte 0xd
  3e:	e9                   	.byte 0xe9
  3f:	7b                   	.byte 0x7b

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	e9 78 fa ff ff       	jmpq   0xfffffffffffffa7f
   7:	65 8b 0d 8f 83 91 5a 	mov    %gs:0x5a91838f(%rip),%ecx        # 0x5a91839d
   e:	89 c9                	mov    %ecx,%ecx
  10:	48                   	rex.W
  11:	0f                   	.byte 0xf
  12:	a3                   	.byte 0xa3
  13:	0d                   	.byte 0xd
  14:	e9                   	.byte 0xe9
  15:	7b                   	.byte 0x7b
[   13.528234][    C0] RSP: 0000:ffffbaef40003d90 EFLAGS: 00010046
[   13.528723][    C0] RAX: b53834d2850b7b00 RBX: ffff9ef2efd34640 RCX: 0000000000000027
[   13.529379][    C0] RDX: 0000000000000002 RSI: 00000000ffff7fff RDI: ffff9ef2efc20b88
[   13.530029][    C0] RBP: 0000000000000001 R08: 0000000000007fff R09: ffffffffa7053610
[   13.530685][    C0] R10: 0000000000017ffd R11: 0000000000000004 R12: ffff9ef2efd345c0
[   13.531344][    C0] R13: ffff9ef2efd34640 R14: ffff9eefc0e26400 R15: 0000000000000000
[   13.532001][    C0] FS:  0000000000000000(0000) GS:ffff9ef2efc00000(0063) knlGS:00000000f7f5f180
[   13.532720][    C0] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[   13.533242][    C0] CR2: 00000000577c51ac CR3: 0000000162e6c000 CR4: 00000000000406f0
[   13.533891][    C0] Call Trace:
[   13.534215][    C0]  <IRQ>
[ 13.534504][ C0] ? __warn (kernel/panic.c:240 kernel/panic.c:735) 
[ 13.534863][ C0] ? update_load_avg (kernel/sched/sched.h:1550 kernel/sched/pelt.h:67 kernel/sched/pelt.h:174 kernel/sched/fair.c:4747) 
[ 13.535278][ C0] ? report_bug (lib/bug.c:? lib/bug.c:219) 
[ 13.535695][ C0] ? handle_bug (arch/x86/kernel/traps.c:239) 
[ 13.536079][ C0] ? exc_invalid_op (arch/x86/kernel/traps.c:260) 
[ 13.536498][ C0] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) 
[ 13.536920][ C0] ? update_load_avg (kernel/sched/sched.h:1550 kernel/sched/pelt.h:67 kernel/sched/pelt.h:174 kernel/sched/fair.c:4747) 
[ 13.537334][ C0] dequeue_entity (kernel/sched/sched.h:817 kernel/sched/fair.c:5429) 
[ 13.537729][ C0] dequeue_entities (kernel/sched/fair.c:6972) 
[ 13.538130][ C0] unregister_fair_sched_group (kernel/sched/fair.c:379 kernel/sched/fair.c:13159) 
[ 13.554647][ C0] ? __pfx_sched_unregister_group_rcu (kernel/sched/core.c:8759) 
[ 13.555167][ C0] sched_unregister_group_rcu (kernel/sched/core.c:8707 kernel/sched/core.c:8761) 
[ 13.555621][ C0] rcu_do_batch (arch/x86/include/asm/preempt.h:26 kernel/rcu/tree.c:2576) 
[ 13.556008][ C0] rcu_core (kernel/rcu/tree.c:2845) 
[ 13.556373][ C0] handle_softirqs (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:555) 
[ 13.556775][ C0] __irq_exit_rcu (kernel/softirq.c:617 kernel/softirq.c:639) 
[ 13.557166][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043 arch/x86/kernel/apic/apic.c:1043) 
[   13.557638][    C0]  </IRQ>
[   13.557920][    C0]  <TASK>
[ 13.560463][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702) 
[ 13.560957][ C0] RIP: 0010:irqentry_exit_to_user_mode (kernel/entry/common.c:? include/linux/entry-common.h:328 kernel/entry/common.c:231) 
[ 13.561465][ C0] Code: 90 41 57 41 56 53 48 89 fb e8 63 90 1f ff 65 4c 8b 34 25 40 37 03 00 49 8b 3e f7 c7 0e 30 02 00 74 7c 49 89 ff e8 97 8f 1f ff <4c> 89 f8 a8 08 75 38 a9 00 10 00 00 75 40 a9 00 20 00 00 75 4b a9
All code
========
   0:	90                   	nop
   1:	41 57                	push   %r15
   3:	41 56                	push   %r14
   5:	53                   	push   %rbx
   6:	48 89 fb             	mov    %rdi,%rbx
   9:	e8 63 90 1f ff       	callq  0xffffffffff1f9071
   e:	65 4c 8b 34 25 40 37 	mov    %gs:0x33740,%r14
  15:	03 00 
  17:	49 8b 3e             	mov    (%r14),%rdi
  1a:	f7 c7 0e 30 02 00    	test   $0x2300e,%edi
  20:	74 7c                	je     0x9e
  22:	49 89 ff             	mov    %rdi,%r15
  25:	e8 97 8f 1f ff       	callq  0xffffffffff1f8fc1
  2a:*	4c 89 f8             	mov    %r15,%rax		<-- trapping instruction
  2d:	a8 08                	test   $0x8,%al
  2f:	75 38                	jne    0x69
  31:	a9 00 10 00 00       	test   $0x1000,%eax
  36:	75 40                	jne    0x78
  38:	a9 00 20 00 00       	test   $0x2000,%eax
  3d:	75 4b                	jne    0x8a
  3f:	a9                   	.byte 0xa9

Code starting with the faulting instruction
===========================================
   0:	4c 89 f8             	mov    %r15,%rax
   3:	a8 08                	test   $0x8,%al
   5:	75 38                	jne    0x3f
   7:	a9 00 10 00 00       	test   $0x1000,%eax
   c:	75 40                	jne    0x4e
   e:	a9 00 20 00 00       	test   $0x2000,%eax
  13:	75 4b                	jne    0x60
  15:	a9                   	.byte 0xa9
[   13.562938][    C0] RSP: 0000:ffffbaef40613f38 EFLAGS: 00000202
[   13.563444][    C0] RAX: ffff9ef0287dd080 RBX: ffffbaef40613f58 RCX: 000000000000080b
[   13.564125][    C0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000008
[   13.564796][    C0] RBP: 0000000000000000 R08: 000000000000002b R09: 000000000000002b
[   13.565456][    C0] R10: 0000000000000296 R11: 00000000ff8772ac R12: 0000000000000000
[   13.566115][    C0] R13: 0000000000000000 R14: ffff9ef0287dd080 R15: 0000000020000008
[ 13.566782][ C0] ? irqentry_exit_to_user_mode (kernel/entry/common.c:? include/linux/entry-common.h:328 kernel/entry/common.c:231) 
[ 13.567254][ C0] asm_sysvec_reschedule_ipi (arch/x86/include/asm/idtentry.h:707) 
[   13.567709][    C0] RIP: 0023:0x56652630
[ 13.568072][ C0] Code: ff ff 83 c4 10 85 c0 74 05 83 c4 08 5b c3 83 ec 0c 8d 83 8d 6b ff ff 50 e8 cd 7c ff ff 8d b4 26 00 00 00 00 8d b6 00 00 00 00 <55> 57 56 53 e8 e7 61 ff ff 81 c3 13 38 01 00 83 ec 0c 8b 44 24 20
All code
========
   0:	ff                   	(bad)  
   1:	ff 83 c4 10 85 c0    	incl   -0x3f7aef3c(%rbx)
   7:	74 05                	je     0xe
   9:	83 c4 08             	add    $0x8,%esp
   c:	5b                   	pop    %rbx
   d:	c3                   	retq   
   e:	83 ec 0c             	sub    $0xc,%esp
  11:	8d 83 8d 6b ff ff    	lea    -0x9473(%rbx),%eax
  17:	50                   	push   %rax
  18:	e8 cd 7c ff ff       	callq  0xffffffffffff7cea
  1d:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
  24:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  2a:*	55                   	push   %rbp		<-- trapping instruction
  2b:	57                   	push   %rdi
  2c:	56                   	push   %rsi
  2d:	53                   	push   %rbx
  2e:	e8 e7 61 ff ff       	callq  0xffffffffffff621a
  33:	81 c3 13 38 01 00    	add    $0x13813,%ebx
  39:	83 ec 0c             	sub    $0xc,%esp
  3c:	8b 44 24 20          	mov    0x20(%rsp),%eax

Code starting with the faulting instruction
===========================================
   0:	55                   	push   %rbp
   1:	57                   	push   %rdi
   2:	56                   	push   %rsi
   3:	53                   	push   %rbx
   4:	e8 e7 61 ff ff       	callq  0xffffffffffff61f0
   9:	81 c3 13 38 01 00    	add    $0x13813,%ebx
   f:	83 ec 0c             	sub    $0xc,%esp
  12:	8b 44 24 20          	mov    0x20(%rsp),%eax


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240729/202407291328.761441a7-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


             reply	other threads:[~2024-07-29  6:15 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-29  6:14 kernel test robot [this message]
2024-07-31  6:16 ` [peterz-queue:sched/prep] [sched/fair] 124c8f4374: WARNING:at_kernel/sched/sched.h:#update_load_avg K Prateek Nayak
2024-07-31 11:54   ` Peter Zijlstra
2024-08-01  3:40     ` K Prateek Nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202407291328.761441a7-oliver.sang@intel.com \
    --to=oliver.sang@intel.com \
    --cc=aubrey.li@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=oe-lkp@lists.linux.dev \
    --cc=peterz@infradead.org \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.