From: kernel test robot <oliver.sang@intel.com>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>, <x86@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Chen Yu <yu.c.chen@intel.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	"K Prateek Nayak" <kprateek.nayak@amd.com>,
	Srikar Dronamraju <srikar@linux.ibm.com>,
	Mohini Narkhede <mohini.narkhede@intel.com>,
	<aubrey.li@linux.intel.com>, <oliver.sang@intel.com>
Subject: [tip:tmp.tmp] [sched/fair]  eb2db043ab: BUG:kernel_NULL_pointer_dereference,address
Date: Thu, 27 Nov 2025 16:35:37 +0800
Message-ID: <202511271605.bd46ddc3-lkp@intel.com>



Hello,

kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:

commit: eb2db043ab3a28ae76800f2a57e144420800d56d ("sched/fair: Skip sched_balance_running cmpxchg when balance is not due")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git tmp.tmp

in testcase: fio-basic
version: fio-x86_64-7c8dbca4-1_20251123
with the following parameters:

	runtime: 300s
	disk: 1SSD
	fs: btrfs
	nr_task: 100%
	test_size: 128G
	rw: randwrite
	bs: 4M
	ioengine: falloc
	cpufreq_governor: performance



config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz (Cascade Lake) with 176G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202511271605.bd46ddc3-lkp@intel.com


[    5.764008][    C0] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    5.764501][    T1] futex hash table entries: 16384 (1048576 bytes on 4 NUMA nodes, total 4096 KiB, linear).
[    5.764999][    C0] #PF: supervisor read access in kernel mode
[    5.764999][    C0] #PF: error_code(0x0000) - not-present page
[    5.764999][    T1] pinctrl core: initialized pinctrl subsystem
[    5.764999][    C0] PGD 0 P4D 0
[    5.764999][    C0] Oops: Oops: 0000 [#1] SMP NOPTI
[    5.764999][    C0] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G S                  6.18.0-rc6-00035-geb2db043ab3a #1 VOLUNTARY
[    5.764999][    C0] Tainted: [S]=CPU_OUT_OF_SPEC
[    5.764999][    C0] Hardware name: Intel Corporation ............/S9200WKBRD2, BIOS SE5C620.86B.0D.01.0552.060220191912 06/02/2019
[    5.764999][    C0] RIP: 0010:sched_balance_rq (arch/x86/include/asm/atomic.h:107 (discriminator 4) include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) kernel/sched/fair.c:11733 (discriminator 4))
[    5.764999][    C0] Code: b8 00 00 00 65 48 2b 15 c2 47 a7 02 0f 85 30 03 00 00 48 81 c4 c0 00 00 00 89 f0 5b 5d 41 5c 41 5d 41 5e 41 5f e9 bc fb f5 00 <8b> 04 25 00 00 00 00 ba 01 00 00 00 f0 0f b1 15 58 d6 af 02 0f 94
All code
========
   0:	b8 00 00 00 65       	mov    $0x65000000,%eax
   5:	48 2b 15 c2 47 a7 02 	sub    0x2a747c2(%rip),%rdx        # 0x2a747ce
   c:	0f 85 30 03 00 00    	jne    0x342
  12:	48 81 c4 c0 00 00 00 	add    $0xc0,%rsp
  19:	89 f0                	mov    %esi,%eax
  1b:	5b                   	pop    %rbx
  1c:	5d                   	pop    %rbp
  1d:	41 5c                	pop    %r12
  1f:	41 5d                	pop    %r13
  21:	41 5e                	pop    %r14
  23:	41 5f                	pop    %r15
  25:	e9 bc fb f5 00       	jmp    0xf5fbe6
  2a:*	8b 04 25 00 00 00 00 	mov    0x0,%eax		<-- trapping instruction
  31:	ba 01 00 00 00       	mov    $0x1,%edx
  36:	f0 0f b1 15 58 d6 af 	lock cmpxchg %edx,0x2afd658(%rip)        # 0x2afd696
  3d:	02 
  3e:	0f                   	.byte 0xf
  3f:	94                   	xchg   %eax,%esp

Code starting with the faulting instruction
===========================================
   0:	8b 04 25 00 00 00 00 	mov    0x0,%eax
   7:	ba 01 00 00 00       	mov    $0x1,%edx
   c:	f0 0f b1 15 58 d6 af 	lock cmpxchg %edx,0x2afd658(%rip)        # 0x2afd66c
  13:	02 
  14:	0f                   	.byte 0xf
  15:	94                   	xchg   %eax,%esp
[    5.764999][    C0] RSP: 0000:ffffc90000003e30 EFLAGS: 00010202
[    5.764999][    C0] RAX: 0000000000000001 RBX: ffff8881002c2ba0 RCX: 0000000000000000
[    5.764999][    C0] RDX: ffff8881002dbc01 RSI: 00000000000000c0 RDI: 00000000000000c0
[    5.764999][    C0] RBP: 0000000000000000 R08: ffff8881002dbcc0 R09: ffff8881002c2020
[    5.764999][    C0] R10: 00ffffff00000000 R11: 0000000000000000 R12: 0000000000000000
[    5.764999][    C0] R13: ffffc90000003ed8 R14: ffffc90000003e80 R15: ffffc90000003f4c
[    5.764999][    C0] FS:  0000000000000000(0000) GS:ffff888ccb7f2000(0000) knlGS:0000000000000000
[    5.764999][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.764999][    C0] CR2: 0000000000000000 CR3: 0000002c7de24001 CR4: 00000000007706f0
[    5.764999][    C0] PKRU: 55555554
[    5.764999][    C0] Call Trace:
[    5.764999][    C0]  <IRQ>
[    5.764999][    C0]  ? rcu_do_batch (kernel/rcu/tree.c:2612 (discriminator 1))
[    5.764999][    C0]  sched_balance_domains (kernel/sched/fair.c:12186 (discriminator 1))
[    5.764999][    C0]  ? sched_balance_update_blocked_averages (arch/x86/include/asm/irqflags.h:158 (discriminator 1) kernel/sched/sched.h:1577 (discriminator 1) kernel/sched/sched.h:1884 (discriminator 1) kernel/sched/fair.c:9857 (discriminator 1))
[    5.764999][    C0]  handle_softirqs (arch/x86/include/asm/jump_label.h:36 include/trace/events/irq.h:142 kernel/softirq.c:623)
[    5.764999][    C0]  __irq_exit_rcu (kernel/softirq.c:657 kernel/softirq.c:496 kernel/softirq.c:723)
[    5.764999][    C0]  sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052 (discriminator 35) arch/x86/kernel/apic/apic.c:1052 (discriminator 35))
[    5.764999][    C0]  </IRQ>
[    5.764999][    C0]  <TASK>
[    5.764999][    C0]  asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:697)
[    5.764999][    C0] RIP: 0010:mwait_idle (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:114 arch/x86/kernel/process.c:930)
[    5.764999][    C0] Code: 2d c0 8e 10 00 f0 80 0e 40 48 8b 06 a8 10 75 1b 48 89 f0 0f 1f 00 31 c9 89 ca 0f 01 c8 48 8b 06 a8 10 75 07 89 c8 fb 0f 01 c9 <fa> f0 80 26 bf e9 c5 e1 00 00 0f 1f 44 00 00 66 66 2e 0f 1f 84 00
All code
========
   0:	2d c0 8e 10 00       	sub    $0x108ec0,%eax
   5:	f0 80 0e 40          	lock orb $0x40,(%rsi)
   9:	48 8b 06             	mov    (%rsi),%rax
   c:	a8 10                	test   $0x10,%al
   e:	75 1b                	jne    0x2b
  10:	48 89 f0             	mov    %rsi,%rax
  13:	0f 1f 00             	nopl   (%rax)
  16:	31 c9                	xor    %ecx,%ecx
  18:	89 ca                	mov    %ecx,%edx
  1a:	0f 01 c8             	monitor %rax,%ecx,%edx
  1d:	48 8b 06             	mov    (%rsi),%rax
  20:	a8 10                	test   $0x10,%al
  22:	75 07                	jne    0x2b
  24:	89 c8                	mov    %ecx,%eax
  26:	fb                   	sti
  27:	0f 01 c9             	mwait  %eax,%ecx
  2a:*	fa                   	cli		<-- trapping instruction
  2b:	f0 80 26 bf          	lock andb $0xbf,(%rsi)
  2f:	e9 c5 e1 00 00       	jmp    0xe1f9
  34:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  39:	66                   	data16
  3a:	66                   	data16
  3b:	2e                   	cs
  3c:	0f                   	.byte 0xf
  3d:	1f                   	(bad)
  3e:	84 00                	test   %al,(%rax)

Code starting with the faulting instruction
===========================================
   0:	fa                   	cli
   1:	f0 80 26 bf          	lock andb $0xbf,(%rsi)
   5:	e9 c5 e1 00 00       	jmp    0xe1cf
   a:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
   f:	66                   	data16
  10:	66                   	data16
  11:	2e                   	cs
  12:	0f                   	.byte 0xf
  13:	1f                   	(bad)
  14:	84 00                	test   %al,(%rax)
[    5.764999][    C0] RSP: 0000:ffffffff82e03e90 EFLAGS: 00000246
[    5.764999][    C0] RAX: 0000000000000000 RBX: ffffffff82e12940 RCX: 0000000000000000
[    5.764999][    C0] RDX: 0000000000000000 RSI: ffffffff82e12940 RDI: 0000000001655ddc
[    5.764999][    C0] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff888105016728
[    5.764999][    C0] R10: 000000000000001d R11: 0000000000000011 R12: 0000000000000000
[    5.764999][    C0] R13: 0000000000000000 R14: 0000000000000000 R15: 0000002c7fff1000
[    5.764999][    C0]  default_idle_call (include/linux/cpuidle.h:144 kernel/sched/idle.c:123)
[    5.764999][    C0]  cpuidle_idle_call (kernel/sched/idle.c:191)
[    5.764999][    C0]  do_idle (kernel/sched/idle.c:332)
[    5.764999][    C0]  cpu_startup_entry (kernel/sched/idle.c:427)
[    5.764999][    C0]  rest_init (init/main.c:757)
[    5.764999][    C0]  start_kernel (init/main.c:1111)
[    5.764999][    C0]  x86_64_start_reservations (arch/x86/kernel/head64.c:310)
[    5.764999][    C0]  x86_64_start_kernel (??:?)
[    5.764999][    C0]  common_startup_64 (arch/x86/kernel/head_64.S:419)
[    5.764999][    C0]  </TASK>
[    5.764999][    C0] Modules linked in:
[    5.764999][    C0] CR2: 0000000000000000
[    5.764999][    C0] ---[ end trace 0000000000000000 ]---
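
A note for readers decoding the trapping instruction: a load from absolute address 0 (`mov 0x0,%eax`) immediately followed by `lock cmpxchg` is consistent with a try_cmpxchg-style helper being handed a NULL "expected value" pointer, because such a helper first reads the expected value through that pointer. This is only one reading of the disassembly and is not confirmed by this report. The sketch below reproduces the calling convention with C11 atomics in userspace; `balance_running`, `try_acquire()`, `balance_trylock()` and `balance_unlock()` are hypothetical names chosen for illustration, not the kernel's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a global "balance in progress" flag. */
static atomic_int balance_running;

/*
 * try_cmpxchg-style helper: *old is read as the expected value and, on
 * failure, overwritten with the value actually found in *v. Passing
 * old == NULL would therefore read address 0. The assert is added here
 * for illustration only.
 */
static bool try_acquire(atomic_int *v, int *old, int new_val)
{
	assert(old != NULL);
	return atomic_compare_exchange_strong_explicit(v, old, new_val,
			memory_order_acquire, memory_order_relaxed);
}

/* Correct usage: the expected value lives in a real local variable. */
static bool balance_trylock(void)
{
	int expected = 0;

	return try_acquire(&balance_running, &expected, 1);
}

/* Release the flag so that a later trylock can succeed again. */
static void balance_unlock(void)
{
	atomic_store_explicit(&balance_running, 0, memory_order_release);
}
```

In this sketch, a call such as `try_acquire(&balance_running, 0, 1)` would compile without error, since a literal 0 is a valid null pointer constant in C, and would fault at run time on the first read of `*old`.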


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251127/202511271605.bd46ddc3-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

