public inbox for linux-cxl@vger.kernel.org
From: "Li, Ming4" <ming4.li@intel.com>
To: Pengfei Xu <pengfei.xu@intel.com>, <rrichter@amd.com>
Cc: <linux-cxl@vger.kernel.org>, <dave.jiang@intel.com>,
	<Jonathan.Cameron@huawei.com>, <dan.j.williams@intel.com>
Subject: Re: [CXL] There is BUG: slab-out-of-bounds in cxl_setup_parent_dport in v6.10
Date: Tue, 6 Aug 2024 13:19:13 +0800
Message-ID: <7d1a47c8-4de5-44a9-b992-7f86d76366eb@intel.com>
In-Reply-To: <ZrGFWYKwxa1ld9Iz@xpf.sh.intel.com>

On 8/6/2024 10:07 AM, Pengfei Xu wrote:
> Hi Robert Richter and CXL experts,
>
> There is a BUG: slab-out-of-bounds in cxl_setup_parent_dport in v6.10 when
> booting a CXL QEMU environment.
>
> It occurs with the kernel's tools/testing/cxl tests in a QEMU-simulated CXL environment.
>
> The related kconfig and dmesg are attached at this link:
> https://bugzilla.kernel.org/show_bug.cgi?id=219127
>
> It seems to be related to commit:
> f05fd10d138d cxl/pci: Add RCH downstream port AER register discovery
>
> The following KASAN and CXL kconfig options can trigger this problem:
> "
> CONFIG_KASAN=y
> CONFIG_KASAN_GENERIC=y
> CONFIG_KASAN_INLINE=y
> CONFIG_KASAN_STACK=y
>
> CONFIG_CXL_BUS=m
> CONFIG_CXL_PCI=m
> CONFIG_CXL_MEM_RAW_COMMANDS=y
> CONFIG_CXL_ACPI=m
> CONFIG_CXL_PMEM=m
> CONFIG_CXL_MEM=m
> CONFIG_CXL_PORT=y
> CONFIG_CXL_SUSPEND=y
> CONFIG_CXL_REGION_INVALIDATION_TEST=y
> CONFIG_NVDIMM_SECURITY_TEST=y
> "
>
> Dmesg info:
> "
> [   24.413405] ==================================================================
> [   24.416332] BUG: KASAN: slab-out-of-bounds in cxl_setup_parent_dport+0x480/0x530 [cxl_core]
> [   24.419291] Read of size 1 at addr ff110000676014f8 by task (udev-worker)/676
>
> [   24.424403] CPU: 2 PID: 676 Comm: (udev-worker) Tainted: G           O     N 6.10.0-qemucxl #1
> [   24.427232] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20240214-2.el9 02/14/2024
> [   24.430089] Call Trace:
> [   24.432534]  <TASK>
> [   24.434891]  dump_stack_lvl+0xea/0x150
> [   24.438131]  print_report+0xce/0x610
> [   24.440498]  ? cxl_setup_parent_dport+0x480/0x530 [cxl_core]
> [   24.443129]  ? kasan_complete_mode_report_info+0x40/0x200
> [   24.445602]  ? cxl_setup_parent_dport+0x480/0x530 [cxl_core]
> [   24.448221]  kasan_report+0xcc/0x110
> [   24.450527]  ? cxl_setup_parent_dport+0x480/0x530 [cxl_core]
> [   24.453140]  __asan_report_load1_noabort+0x18/0x20
> [   24.455455]  cxl_setup_parent_dport+0x480/0x530 [cxl_core]
> [   24.457986]  cxl_mem_probe+0x49b/0xaa0 [cxl_mem]
> [   24.460285]  ? __pfx_cxl_mem_probe+0x10/0x10 [cxl_mem]
> [   24.462592]  ? sysfs_create_link+0x75/0xd0
> [   24.464775]  cxl_bus_probe+0x5e/0xc0 [cxl_core]
> [   24.467153]  ? __pfx_cxl_bus_probe+0x10/0x10 [cxl_core]
> [   24.469632]  really_probe+0x27c/0xac0
> [   24.471750]  ? __sanitizer_cov_trace_const_cmp1+0x1e/0x30
> [   24.474087]  __driver_probe_device+0x1f3/0x460
> [   24.476288]  ? parse_option_str+0x149/0x190
> [   24.478435]  driver_probe_device+0x56/0x1b0
> [   24.480543]  __device_attach_driver+0x1e7/0x300
> [   24.482682]  bus_for_each_drv+0x159/0x1e0
> [   24.484818]  ? __pfx___device_attach_driver+0x10/0x10
> [   24.486935]  ? __pfx_bus_for_each_drv+0x10/0x10
> [   24.489037]  ? _raw_spin_unlock_irqrestore+0x45/0x70
> [   24.491097]  __device_attach+0x215/0x4f0
> [   24.493055]  ? __pfx___device_attach+0x10/0x10
> [   24.495032]  ? do_raw_spin_unlock+0x15c/0x210
> [   24.497020]  device_initial_probe+0x24/0x30
> [   24.498922]  bus_probe_device+0x18e/0x1d0
> [   24.500732]  device_add+0x11b6/0x1b60
> [   24.502485]  ? __pfx_device_add+0x10/0x10
> [   24.504275]  ? __pfx_exact_lock+0x10/0x10
> [   24.506063]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
> [   24.507931]  ? kobject_get+0xc5/0x160
> [   24.509619]  cdev_device_add+0x13c/0x280
> [   24.511319]  devm_cxl_add_memdev+0x547/0x6f0 [cxl_core]
> [   24.513287]  cxl_mock_mem_probe+0xf1d/0x1d30 [cxl_mock_mem]
> [   24.515133]  ? __pfx_cxl_mock_mem_probe+0x10/0x10 [cxl_mock_mem]
> [   24.516998]  platform_probe+0x10a/0x200
> [   24.518813]  ? __pfx_platform_probe+0x10/0x10
> [   24.520638]  really_probe+0x27c/0xac0
> [   24.522340]  ? __sanitizer_cov_trace_const_cmp1+0x1e/0x30
> [   24.524194]  __driver_probe_device+0x1f3/0x460
> [   24.525972]  ? parse_option_str+0x149/0x190
> [   24.527763]  driver_probe_device+0x56/0x1b0
> [   24.529555]  __driver_attach+0x277/0x570
> [   24.531278]  ? __pfx___driver_attach+0x10/0x10
> [   24.532912]  bus_for_each_dev+0x142/0x1e0
> [   24.534474]  ? __pfx_bus_for_each_dev+0x10/0x10
> [   24.536094]  ? _raw_spin_unlock+0x31/0x60
> [   24.537676]  driver_attach+0x49/0x60
> [   24.539220]  bus_add_driver+0x2f3/0x6b0
> [   24.540781]  driver_register+0x170/0x4b0
> [   24.542334]  ? __pfx_cxl_mock_mem_driver_init+0x10/0x10 [cxl_mock_mem]
> [   24.544104]  __platform_driver_register+0x66/0x80
> [   24.545782]  ? __pfx_cxl_mock_mem_driver_init+0x10/0x10 [cxl_mock_mem]
> [   24.547579]  cxl_mock_mem_driver_init+0x25/0xff0 [cxl_mock_mem]
> [   24.549362]  do_one_initcall+0x114/0x5d0
> [   24.550991]  ? __pfx_do_one_initcall+0x10/0x10
> [   24.552593]  ? __kasan_kmalloc+0x88/0xa0
> [   24.554089]  ? kasan_poison+0x3e/0x60
> [   24.555511]  ? kasan_unpoison+0x2c/0x60
> [   24.557076]  ? kasan_poison+0x3e/0x60
> [   24.558573]  ? __asan_register_globals+0x62/0x80
> [   24.560188]  ? __pfx_cxl_mock_mem_driver_init+0x10/0x10 [cxl_mock_mem]
> [   24.562019]  do_init_module+0x277/0x750
> [   24.563556]  load_module+0x5d1d/0x74f0
> [   24.565124]  ? __pfx_load_module+0x10/0x10
> [   24.566656]  ? __pfx_ima_post_read_file+0x10/0x10
> [   24.568235]  ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
> [   24.569875]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
> [   24.571521]  ? security_kernel_post_read_file+0xa2/0xd0
> [   24.573189]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
> [   24.574851]  ? kernel_read_file+0x503/0x820
> [   24.576441]  ? __pfx_kernel_read_file+0x10/0x10
> [   24.577887]  ? __pfx___lock_acquire+0x10/0x10
> [   24.579390]  init_module_from_file+0x12c/0x1a0
> [   24.580988]  ? init_module_from_file+0x12c/0x1a0
> [   24.582575]  ? __pfx_init_module_from_file+0x10/0x10
> [   24.584234]  ? __this_cpu_preempt_check+0x21/0x30
> [   24.585831]  ? do_raw_spin_unlock+0x15c/0x210
> [   24.587460]  idempotent_init_module+0x3f1/0x690
> [   24.589126]  ? __pfx_idempotent_init_module+0x10/0x10
> [   24.590808]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
> [   24.592513]  ? __fget_light+0x17d/0x210
> [   24.594058]  __x64_sys_finit_module+0x10e/0x1a0
> [   24.595643]  x64_sys_call+0x137a/0x20d0
> [   24.597160]  do_syscall_64+0x6d/0x140
> [   24.598688]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   24.600350] RIP: 0033:0x7fbac6f3185d
> [   24.601923] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a3 45 0c 00 f7 d8 64 89 01 48
> [   24.606145] RSP: 002b:00007ffd13414db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> [   24.608143] RAX: ffffffffffffffda RBX: 0000564200192ca0 RCX: 00007fbac6f3185d
> [   24.610130] RDX: 0000000000000000 RSI: 00007fbac758707d RDI: 0000000000000006
> [   24.612106] RBP: 00007ffd13414e70 R08: 0000000000000000 R09: 00007ffd13414e00
> [   24.614112] R10: 0000000000000006 R11: 0000000000000246 R12: 00007fbac758707d
> [   24.616113] R13: 0000000000020000 R14: 0000564200159890 R15: 0000564200195a20
> [   24.618182]  </TASK>
>
> [   24.621370] Allocated by task 615:
> [   24.623062]  kasan_save_stack+0x2c/0x60
> [   24.624851]  kasan_save_track+0x18/0x40
> [   24.626603]  kasan_save_alloc_info+0x3c/0x50
> [   24.628411]  __kasan_kmalloc+0x88/0xa0
> [   24.630155]  __kmalloc_noprof+0x1dd/0x4a0
> [   24.631899]  platform_device_alloc+0x3a/0x230
> [   24.633595]  fq_codel_reset+0x6c/0x370 [sch_fq_codel]
> [   24.635477]  do_one_initcall+0x114/0x5d0
> [   24.637164]  do_init_module+0x277/0x750
> [   24.638818]  load_module+0x5d1d/0x74f0
> [   24.640483]  init_module_from_file+0x12c/0x1a0
> [   24.642214]  idempotent_init_module+0x3f1/0x690
> [   24.644013]  __x64_sys_finit_module+0x10e/0x1a0
> [   24.645772]  x64_sys_call+0x137a/0x20d0
> [   24.647459]  do_syscall_64+0x6d/0x140
> [   24.649117]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> [   24.652407] The buggy address belongs to the object at ff11000067601000
>                 which belongs to the cache kmalloc-2k of size 2048
> [   24.656105] The buggy address is located 23 bytes to the right of
>                 allocated 1249-byte region [ff11000067601000, ff110000676014e1)
>
> [   24.661503] The buggy address belongs to the physical page:
> [   24.663371] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x67600
> [   24.665505] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> [   24.667605] flags: 0xfffffc0000040(head|node=0|zone=1|lastcpupid=0x1fffff)
> [   24.669656] page_type: 0xffffefff(slab)
> [   24.671471] raw: 000fffffc0000040 ff1100000d83d200 dead000000000122 0000000000000000
> [   24.673568] raw: 0000000000000000 0000000000080008 00000001ffffefff 0000000000000000
> [   24.675767] head: 000fffffc0000040 ff1100000d83d200 dead000000000122 0000000000000000
> [   24.678014] head: 0000000000000000 0000000000080008 00000001ffffefff 0000000000000000
> [   24.680121] head: 000fffffc0000003 ffd40000019d8001 ffffffffffffffff 0000000000000000
> [   24.682313] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
> [   24.684465] page dumped because: kasan: bad access detected
>
> [   24.688127] Memory state around the buggy address:
> [   24.690046]  ff11000067601380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   24.692171]  ff11000067601400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   24.694384] >ff11000067601480: 00 00 00 00 00 00 00 00 00 00 00 00 01 fc fc fc
> [   24.696415]                                                                 ^
> [   24.698485]  ff11000067601500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   24.700609]  ff11000067601580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   24.702598] ==================================================================
> "
>
> I hope it's helpful.
>
> Best Regards,
> Thanks!
>
Hi Pengfei,

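I can reproduce it in my environment with your configuration, and I can confirm it is the same bug I hit recently. I have sent out a patch for review:
https://lore.kernel.org/linux-cxl/20240806041547.1958787-1-ming4.li@intel.com/T/#u

The root cause is that the cxl-test module creates an RCH topology and uses a platform_device to represent the RCH downstream port in that topology. As a result, to_pci_host_bridge(dport->dport_dev) in cxl_setup_parent_dport() returns a wrong pci_host_bridge pointer. to_pci_host_bridge() is a container_of() wrapper, so it is only valid when the device is embedded in a struct pci_host_bridge. Here the device belongs to a platform_device, so the 1-byte read through that pointer lands past the end of the object allocated by platform_device_alloc(), which matches the KASAN report above.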
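
To illustrate the mechanism, here is a minimal user-space sketch, not the kernel code and not the proposed patch. The struct layouts, the `_model` names, the `some_flag` field and the helper `bridge_field_out_of_bounds()` are all assumptions made up for this example; only the container_of() pattern itself mirrors what to_pci_host_bridge() does. The helper computes where a field of the assumed bridge would live and checks whether that address falls inside the object that actually contains the device, without dereferencing anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct device {
	int id;
};

/* Models a platform_device: the embedded device is NOT at offset 0. */
struct platform_device_model {
	const char *name;
	struct device dev;
};

/* Models a pci_host_bridge: the device is embedded, a flag follows later. */
struct pci_host_bridge_model {
	struct device dev;
	char other_state[64];
	unsigned char some_flag;	/* illustrative field, 1-byte read */
};

/*
 * Mimics container_of(dev, struct pci_host_bridge_model, dev) followed by a
 * read of ->some_flag, but only computes the address instead of reading it.
 * Returns true if that address lies outside [obj, obj + obj_size), i.e. the
 * read would be out of bounds for the object that really holds @dev.
 */
static bool bridge_field_out_of_bounds(const struct device *dev,
				       const void *obj, size_t obj_size)
{
	assert(dev != NULL && obj != NULL);

	uintptr_t bridge = (uintptr_t)dev -
			   offsetof(struct pci_host_bridge_model, dev);
	uintptr_t field = bridge +
			  offsetof(struct pci_host_bridge_model, some_flag);
	uintptr_t start = (uintptr_t)obj;

	return field < start || field >= start + obj_size;
}
```

Calling the helper with a device embedded in a `platform_device_model` reports an out-of-bounds field address, while a device embedded in a real `pci_host_bridge_model` does not. container_of() cannot tell the two cases apart on its own, so the caller has to know what kind of object the device lives in before converting.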



Thread overview (3+ messages):
2024-08-06  2:07 [CXL] There is BUG: slab-out-of-bounds in cxl_setup_parent_dport in v6.10 Pengfei Xu
2024-08-06  5:19 ` Li, Ming4 [this message]
2024-08-06  7:38   ` Pengfei Xu
