From: kernel test robot <oliver.sang@intel.com>
To: <kaiyang2@cs.cmu.edu>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
<cgroups@vger.kernel.org>, <roman.gushchin@linux.dev>,
<shakeel.butt@linux.dev>, <muchun.song@linux.dev>,
<akpm@linux-foundation.org>, <mhocko@kernel.org>,
<nehagholkar@meta.com>, <abhishekd@meta.com>,
<hannes@cmpxchg.org>, <weixugc@google.com>, <rientjes@google.com>,
Kaiyang Zhao <kaiyang2@cs.cmu.edu>, <oliver.sang@intel.com>
Subject: Re: [RFC PATCH 2/4] calculate memory.low for the local node and track its usage
Date: Sun, 22 Sep 2024 16:39:35 +0800
Message-ID: <202409221625.1e974ac-oliver.sang@intel.com>
In-Reply-To: <20240920221202.1734227-3-kaiyang2@cs.cmu.edu>
Hello,
kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
commit: 6f4c005a5f8b8ff1ce674731545b302af5f28f3f ("[RFC PATCH 2/4] calculate memory.low for the local node and track its usage")
url: https://github.com/intel-lab-lkp/linux/commits/kaiyang2-cs-cmu-edu/Add-get_cgroup_local_usage-for-estimating-the-top-tier-memory-usage/20240921-061404
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/all/20240920221202.1734227-3-kaiyang2@cs.cmu.edu/
patch subject: [RFC PATCH 2/4] calculate memory.low for the local node and track its usage
in testcase: boot
compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
+---------------------------------------------+------------+------------+
|                                             | 0af685cc17 | 6f4c005a5f |
+---------------------------------------------+------------+------------+
| boot_successes                              | 12         | 0          |
| boot_failures                               | 0          | 12         |
| BUG:kernel_NULL_pointer_dereference,address | 0          | 12         |
| Oops                                        | 0          | 12         |
| RIP:si_meminfo_node                         | 0          | 12         |
| Kernel_panic-not_syncing:Fatal_exception    | 0          | 12         |
+---------------------------------------------+------------+------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202409221625.1e974ac-oliver.sang@intel.com
[ 14.204830][ T1] BUG: kernel NULL pointer dereference, address: 0000000000000090
[ 14.206729][ T1] #PF: supervisor read access in kernel mode
[ 14.208090][ T1] #PF: error_code(0x0000) - not-present page
[ 14.209393][ T1] PGD 0 P4D 0
[ 14.210212][ T1] Oops: Oops: 0000 [#1] SMP PTI
[ 14.211269][ T1] CPU: 1 UID: 0 PID: 1 Comm: systemd Not tainted 6.11.0-rc6-00570-g6f4c005a5f8b #1
[ 14.213284][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 14.215290][ T1] RIP: 0010:si_meminfo_node (arch/x86/include/asm/atomic64_64.h:15 (discriminator 3) include/linux/atomic/atomic-arch-fallback.h:2583 (discriminator 3) include/linux/atomic/atomic-long.h:38 (discriminator 3) include/linux/atomic/atomic-instrumented.h:3189 (discriminator 3) include/linux/mmzone.h:1042 (discriminator 3) mm/show_mem.c:98 (discriminator 3))
[ 14.216523][ T1] Code: 90 90 66 0f 1f 00 0f 1f 44 00 00 48 63 c6 55 31 d2 4c 8b 04 c5 c0 a7 fb 8c 53 48 89 c5 48 89 fb 4c 89 c0 49 8d b8 00 1e 00 00 <48> 8b 88 90 00 00 00 48 05 00 06 00 00 48 01 ca 48 39 f8 75 eb 48
All code
========
0: 90 nop
1: 90 nop
2: 66 0f 1f 00 nopw (%rax)
6: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
b: 48 63 c6 movslq %esi,%rax
e: 55 push %rbp
f: 31 d2 xor %edx,%edx
11: 4c 8b 04 c5 c0 a7 fb mov -0x73045840(,%rax,8),%r8
18: 8c
19: 53 push %rbx
1a: 48 89 c5 mov %rax,%rbp
1d: 48 89 fb mov %rdi,%rbx
20: 4c 89 c0 mov %r8,%rax
23: 49 8d b8 00 1e 00 00 lea 0x1e00(%r8),%rdi
2a:* 48 8b 88 90 00 00 00 mov 0x90(%rax),%rcx <-- trapping instruction
31: 48 05 00 06 00 00 add $0x600,%rax
37: 48 01 ca add %rcx,%rdx
3a: 48 39 f8 cmp %rdi,%rax
3d: 75 eb jne 0x2a
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 8b 88 90 00 00 00 mov 0x90(%rax),%rcx
7: 48 05 00 06 00 00 add $0x600,%rax
d: 48 01 ca add %rcx,%rdx
10: 48 39 f8 cmp %rdi,%rax
13: 75 eb jne 0x0
15: 48 rex.W
[ 14.220364][ T1] RSP: 0018:ffffb14b40013d68 EFLAGS: 00010246
[ 14.221717][ T1] RAX: 0000000000000000 RBX: ffffb14b40013d88 RCX: 00000000003a19a2
[ 14.223496][ T1] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000001e00
[ 14.225170][ T1] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000008
[ 14.226964][ T1] R10: 0000000000000008 R11: 0fffffffffffffff R12: ffffb14b40013d88
[ 14.228774][ T1] R13: 00000000003e7ac3 R14: ffffb14b40013e88 R15: ffff98ab0434f7a0
[ 14.230421][ T1] FS: 00007f9569ae9940(0000) GS:ffff98adefd00000(0000) knlGS:0000000000000000
[ 14.234569][ T1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 14.235900][ T1] CR2: 0000000000000090 CR3: 0000000100072000 CR4: 00000000000006f0
[ 14.237620][ T1] Call Trace:
[ 14.238502][ T1] <TASK>
[ 14.239254][ T1] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 14.240189][ T1] ? page_fault_oops (arch/x86/mm/fault.c:715)
[ 14.241254][ T1] ? exc_page_fault (arch/x86/include/asm/irqflags.h:37 arch/x86/include/asm/irqflags.h:92 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
[ 14.242297][ T1] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:623)
[ 14.243313][ T1] ? si_meminfo_node (arch/x86/include/asm/atomic64_64.h:15 (discriminator 3) include/linux/atomic/atomic-arch-fallback.h:2583 (discriminator 3) include/linux/atomic/atomic-long.h:38 (discriminator 3) include/linux/atomic/atomic-instrumented.h:3189 (discriminator 3) include/linux/mmzone.h:1042 (discriminator 3) mm/show_mem.c:98 (discriminator 3))
[ 14.244443][ T1] ? si_meminfo_node (mm/show_mem.c:114)
[ 14.245460][ T1] memory_low_write (mm/memcontrol.c:4088)
[ 14.246547][ T1] kernfs_fop_write_iter (fs/kernfs/file.c:338)
[ 14.247804][ T1] vfs_write (fs/read_write.c:497 fs/read_write.c:590)
[ 14.248830][ T1] ksys_write (fs/read_write.c:643)
[ 14.249783][ T1] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 14.250800][ T1] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 14.252260][ T1] RIP: 0033:0x7f956a64b240
[ 14.253276][ T1] Code: 40 00 48 8b 15 c1 9b 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 23 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
All code
========
0: 40 00 48 8b add %cl,-0x75(%rax)
4: 15 c1 9b 0d 00 adc $0xd9bc1,%eax
9: f7 d8 neg %eax
b: 64 89 02 mov %eax,%fs:(%rdx)
e: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
15: eb b7 jmp 0xffffffffffffffce
17: 0f 1f 00 nopl (%rax)
1a: 80 3d a1 23 0e 00 00 cmpb $0x0,0xe23a1(%rip) # 0xe23c2
21: 74 17 je 0x3a
23: b8 01 00 00 00 mov $0x1,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 58 ja 0x8a
32: c3 retq
33: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
3a: 48 83 ec 28 sub $0x28,%rsp
3e: 48 rex.W
3f: 89 .byte 0x89
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 58 ja 0x60
8: c3 retq
9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
10: 48 83 ec 28 sub $0x28,%rsp
14: 48 rex.W
15: 89 .byte 0x89
[ 14.257195][ T1] RSP: 002b:00007ffcc66594e8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 14.259009][ T1] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f956a64b240
[ 14.260848][ T1] RDX: 0000000000000002 RSI: 00007ffcc6659740 RDI: 000000000000001b
[ 14.262500][ T1] RBP: 00007ffcc6659740 R08: 0000000000000000 R09: 0000000000000001
[ 14.264147][ T1] R10: 00007f956a6c4820 R11: 0000000000000202 R12: 0000000000000002
[ 14.265934][ T1] R13: 000055fd63872c10 R14: 0000000000000002 R15: 00007f956a7219e0
[ 14.267589][ T1] </TASK>
[ 14.268340][ T1] Modules linked in: ip_tables
[ 14.269410][ T1] CR2: 0000000000000090
[ 14.270478][ T1] ---[ end trace 0000000000000000 ]---
[ 14.271717][ T1] RIP: 0010:si_meminfo_node (arch/x86/include/asm/atomic64_64.h:15 (discriminator 3) include/linux/atomic/atomic-arch-fallback.h:2583 (discriminator 3) include/linux/atomic/atomic-long.h:38 (discriminator 3) include/linux/atomic/atomic-instrumented.h:3189 (discriminator 3) include/linux/mmzone.h:1042 (discriminator 3) mm/show_mem.c:98 (discriminator 3))
[ 14.272874][ T1] Code: 90 90 66 0f 1f 00 0f 1f 44 00 00 48 63 c6 55 31 d2 4c 8b 04 c5 c0 a7 fb 8c 53 48 89 c5 48 89 fb 4c 89 c0 49 8d b8 00 1e 00 00 <48> 8b 88 90 00 00 00 48 05 00 06 00 00 48 01 ca 48 39 f8 75 eb 48
All code
========
0: 90 nop
1: 90 nop
2: 66 0f 1f 00 nopw (%rax)
6: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
b: 48 63 c6 movslq %esi,%rax
e: 55 push %rbp
f: 31 d2 xor %edx,%edx
11: 4c 8b 04 c5 c0 a7 fb mov -0x73045840(,%rax,8),%r8
18: 8c
19: 53 push %rbx
1a: 48 89 c5 mov %rax,%rbp
1d: 48 89 fb mov %rdi,%rbx
20: 4c 89 c0 mov %r8,%rax
23: 49 8d b8 00 1e 00 00 lea 0x1e00(%r8),%rdi
2a:* 48 8b 88 90 00 00 00 mov 0x90(%rax),%rcx <-- trapping instruction
31: 48 05 00 06 00 00 add $0x600,%rax
37: 48 01 ca add %rcx,%rdx
3a: 48 39 f8 cmp %rdi,%rax
3d: 75 eb jne 0x2a
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 8b 88 90 00 00 00 mov 0x90(%rax),%rcx
7: 48 05 00 06 00 00 add $0x600,%rax
d: 48 01 ca add %rcx,%rdx
10: 48 39 f8 cmp %rdi,%rax
13: 75 eb jne 0x0
15: 48 rex.W
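A note on reading the dump above. The faulting load is `mov 0x90(%rax),%rcx` with RAX = 0, which matches CR2 = 0x90. RAX was copied from R08, which was loaded from an array indexed by the sign-extended RSI (RSI = 1, R08 = 0). The loop advances RAX by 0x600 until it reaches R08 + 0x1e00, i.e. five iterations. This is consistent with si_meminfo_node() summing a per-zone counter through a NULL node-data pointer for node 1, although the dump alone does not prove why that pointer was NULL. The userspace sketch below only models that arithmetic. The names `zone_model`, `pgdat_model`, `first_load_address` and `sum_managed_pages` are hypothetical, the sizes come from the disassembly rather than from kernel headers, and the NULL guard is an illustration, not a proposed fix for the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants read off the disassembly above; not taken from kernel headers. */
#define MODEL_ZONE_SIZE   0x600   /* add $0x600,%rax: stride between zones    */
#define MODEL_ZONES_SPAN  0x1e00  /* lea 0x1e00(%r8),%rdi: end of the array   */
#define MODEL_NR_ZONES    (MODEL_ZONES_SPAN / MODEL_ZONE_SIZE)
#define MODEL_MANAGED_OFF 0x90    /* mov 0x90(%rax),%rcx: counter offset      */

/* Hypothetical stand-in for one zone: only the counter offset matters. */
struct zone_model {
	unsigned char pad_before[MODEL_MANAGED_OFF];
	uint64_t managed_pages;
	unsigned char pad_after[MODEL_ZONE_SIZE - MODEL_MANAGED_OFF - sizeof(uint64_t)];
};

/* Hypothetical stand-in for per-node data: the zone array sits at offset 0. */
struct pgdat_model {
	struct zone_model node_zones[MODEL_NR_ZONES];
};

static_assert(sizeof(struct zone_model) == MODEL_ZONE_SIZE,
	      "model stride must match the disassembly");
static_assert(offsetof(struct zone_model, managed_pages) == MODEL_MANAGED_OFF,
	      "model counter offset must match the disassembly");

/* Address the first loop iteration would read for a given node pointer. */
static uintptr_t first_load_address(const struct pgdat_model *pgdat)
{
	return (uintptr_t)pgdat
	       + offsetof(struct pgdat_model, node_zones)
	       + offsetof(struct zone_model, managed_pages);
}

/*
 * Guarded sum over all zones. Returns 0 and fills *total on success,
 * or -1 without touching memory when the node pointer is missing.
 */
static int sum_managed_pages(const struct pgdat_model *pgdat, uint64_t *total)
{
	if (!pgdat)
		return -1;

	*total = 0;
	for (size_t i = 0; i < MODEL_NR_ZONES; i++)
		*total += pgdat->node_zones[i].managed_pages;
	return 0;
}
```

With a NULL node pointer, `first_load_address()` yields 0x90, the address reported in CR2, and the unguarded loop in the dump faults on its first iteration.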
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240922/202409221625.1e974ac-oliver.sang@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki