From: srinivas pandruvada <srinivas.pandruvada@linux.intel.com>
To: Zach Wade <zachwade.k@gmail.com>,
hdegoede@redhat.com, ilpo.jarvinen@linux.intel.com
Cc: platform-driver-x86@vger.kernel.org
Subject: Re: [PATCH] platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug
Date: Thu, 19 Sep 2024 14:37:03 -0400 [thread overview]
Message-ID: <422bfc19243e8d50fc3c847f0c1db01e703aff4e.camel@linux.intel.com> (raw)
In-Reply-To: <8b83c925-17d2-47e1-a278-998c229c02f1@gmail.com>
On Fri, 2024-09-20 at 00:22 +0800, Zach Wade wrote:
>
>
>
Hi Wade,
...
...
> This issue was discovered when the hard drive was passed directly to
> the
> virtual machine and the driver was automatically loaded.
>
> The virtual machine executes lsblk internally and can see the
> physical
> hard disk I am directly using:
> lsblk -S
> NAME HCTL TYPE VENDOR MODEL REV
> SERIAL
> TRAN
> sda 32:0:0:0 disk ATA SAMSUNG MZ7LH240HAHQ-00005 HXT7904Q
> S45RNC0T166451
> sdb 32:0:1:0 disk ATA INTEL SSDSC2KG480G8 XCV10100
> BTYG84910BR5480BGN
>
So seems these devices are using same PCI Intel vendor device ID as SST
device after emulation.
These are not SST devices.
What is
sudo lspci -vvv
Also cat /proc/cpuinfo?
Thanks,
Srinivas
>
> > I don't think lspci in VM will show this device.
> > Can you send lspci -k?
>
> lspci -k
> 00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX
> Host
> bridge (rev 01)
> Subsystem: VMware Virtual Machine Chipset
> 00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP
> bridge (rev 01)
> 00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev
> 08)
> Subsystem: VMware Virtual Machine Chipset
> 00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev
> 01)
> Subsystem: VMware Virtual Machine Chipset
> Kernel driver in use: ata_piix
> Kernel modules: pata_acpi, ata_generic
> 00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
> Subsystem: VMware Virtual Machine Chipset
> Kernel modules: i2c_piix4
> 00:07.7 System peripheral: VMware Virtual Machine Communication
> Interface (rev 10)
> Subsystem: VMware Virtual Machine Communication Interface
> Kernel driver in use: vmw_vmci
> Kernel modules: vmw_vmci
> 00:0f.0 VGA compatible controller: VMware SVGA II Adapter
> Subsystem: VMware SVGA II Adapter
> Kernel driver in use: vmwgfx
> Kernel modules: vmwgfx
> 02:00.0 Serial Attached SCSI controller: VMware PVSCSI SCSI
> Controller
> (rev 02)
> DeviceName: SCSI0
> Subsystem: VMware PVSCSI SCSI Controller
> Kernel driver in use: vmw_pvscsi
> Kernel modules: vmw_pvscsi
> 02:01.0 USB controller: VMware USB1.1 UHCI Controller
> DeviceName: usb
> Subsystem: VMware Device 1976
> Kernel driver in use: uhci_hcd
> 02:02.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev
> 01)
> DeviceName: Ethernet0
> Subsystem: VMware VMXNET3 Ethernet Controller
> Kernel driver in use: vmxnet3
> Kernel modules: vmxnet3
> 02:03.0 USB controller: VMware USB2 EHCI Controller
> DeviceName: ehci
> Subsystem: VMware USB2 EHCI Controller
> Kernel driver in use: ehci-pci
> 02:04.0 SATA controller: VMware SATA AHCI controller
> DeviceName: sata0
> Subsystem: VMware SATA AHCI controller
> Kernel driver in use: ahci
>
> >
This is not complete list.
> > I want to make sure somehow your other VM PCI device is using same
> > ID
> > as this device.
> >
>
> Are you referring to transferring this hard drive to another virtual
> machine? The action is too big, unfortunately I cannot do it this
> way.
> This physical hard drive is only directly connected to this virtual
> machine for use.
>
> >
> > > > > Here on the virtualized environment, seems the
> > > > > topology_physical_package_id() (from CPU APIC ID in non
> > > > > virtualized
> > > > > case) is assigned some big value, which is more than max
> > > > > packages
> > > > > in
> > > > > the system.
> > > > >
> > > > > But your fix is good as topology_logical_package_id() should
> > > > > be
> > > > > less
> > > > > than value returned by topology_max_packages() and hence
> > > > > avoid
> > > > > this
> > > > > issue.
> > > > >
> > > > > Can you confirm the value returned by
> > > > > topology_logical_package_id() and
> > > > > topology_physical_package_id()?
> > >
> > > cat /proc/cpuinfo | grep "physical id"
> > > physical id : 0
> > > physical id : 2
> > > physical id : 4
> > > ......
> > > physical id : 58
> > > physical id : 60
> > > physical id : 62
> >
> > >
> > > I calculated topology_max_packages() * sizeof (* isst_pkg.info)
> > > in
> > > isst_if_cpu_info_init, and focused on pkg_id and bus_no in
> > > _isst_if_get_pci_dev.
> > > The printk printed result is as follows:
> > > [ 51.879700] Allocated size: 512
> >
> > Here topology_max_packages() returned 32.
>
> yes.
>
> >
> > > [ 51.880148] pkg_id: 0, bus_no: 0
> > > [ 51.881242] pkg_id: 0, bus_no: 1
> > > [ 51.884209] pkg_id: 2, bus_no: 0
> > > [ 51.884571] pkg_id: 2, bus_no: 1
> > > [ 51.884931] pkg_id: 4, bus_no: 0
> > > [ 51.885313] pkg_id: 4, bus_no: 1
> > > ......
> > > [ 51.899134] pkg_id: 28, bus_no: 0
> > > [ 51.899511] pkg_id: 28, bus_no: 1
> > > [ 51.899909] pkg_id: 30, bus_no: 0
> > > [ 51.901012] pkg_id: 30, bus_no: 1
> > > [ 51.902160]
> > > =================================================================
> > > =
> > > [ 51.902936] BUG: KASAN: slab-out-of-bounds in
> > > _isst_if_get_pci_dev.cold+0xde/0xe4 [isst_if_common]
> > > [ 51.982707]
> > > =================================================================
> > > =
> > Package ID is 32, so it will overflow. There seems to be only 32
> > packages.
> > If you print topology_logical_package_id(), you will have no gaps,
> > and
> > will be 0-31.
> >
> > Can you also print topology_logical_package_id() to confirm.
>
> After adding printk to _isst_if_get_pci_dev, the changes in pkd_id
> can
> be seen as follows:
> [ 18.078652] pkg_id:0
> [ 18.078669] pkg_id:0
> [ 18.079215] pkg_id:1
> [ 18.080920] pkg_id:1
> [ 18.081847] pkg_id:2
> [ 18.082756] pkg_id:2
> [ 18.088928] pkg_id:3
> [ 18.089839] pkg_id:3
> ......
> [ 18.185462] pkg_id:30
> [ 18.185471] pkg_id:30
> [ 18.185561] pkg_id:31
> [ 18.185569] pkg_id:31
>
> >
> > topology_max_packages() returns max __max_logical_packages, so
> > topology_logical_package_id() will be better here.
> >
> > > [ 51.985453] pkg_id: 32, bus_no: 0
> > > [ 51.986569] pkg_id: 32, bus_no: 1
> > > [ 51.988501] pkg_id: 34, bus_no: 0
> > > [ 51.989616] pkg_id: 34, bus_no: 1
> > > ......
> > > [ 52.059749] pkg_id: 58, bus_no: 0
> > > [ 52.062331] pkg_id: 58, bus_no: 1
> > > [ 52.066039] pkg_id: 60, bus_no: 0
> > > [ 52.068503] pkg_id: 60, bus_no: 1
> > > [ 52.072018] pkg_id: 62, bus_no: 0
> > > [ 52.074375] pkg_id: 62, bus_no: 1
> > >
> > > > >
> > > > > We can change commit description based on that.
> > > > >
> > > > > Thanks,
> > > > > Srinivas
> > > > >
> > >
> > > I think the changes are minor, so no more content was added to
> > > the
> > > patch.
> > > If you think it needs to be added, I am happy for you to help
> > > supplement
> > > it.
> > I just want to be clear how to reproduce this issue.
> >
> > "
> > Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-
> > bounds".
> >
> > Then you can add the kasan bug report.
> >
> > The reason for this error is physical_package_ids assigned by VMM
> > have
> > holes. This will cause value returned by
> > topology_physical_package_id()
> > to be more than topology_max_packages(). The allocation uses
> > topology_max_packages() to allocate memory. topology_max_packages()
> > returns maximum logical package IDs. Hence use
> > topology_logical_package_id() instead of
> > topology_physical_package_id().
> > "
>
> Thanks, I will make the necessary modifications in v2.
>
> >
> > Also we should add a check
> >
> > pkg_id = topology_logical_package_id(cpu);
> > if (pkg_id >= topology_max_packages())
> > return NULL;
> >
> > May be VMM has holes in logical IDs also, then atleast it will not
> > cause BUG.
>
> Great suggestion, I will add it in the new v2.
>
> Thanks,
> Srinivas
>
> >
> > Thanks,
> > Srinivas
> >
> >
> > > Thanks,
> > > Zach
> > >
> > > > > after loading the
> > > > > isst_if_common and isst_if_mbox_msr modules on the 64 core,
> > > > > the
> > > > > kasan
> > > > > report was triggered.
> > > > > After consulting the kernel manual
> > > > > (Documentation/arch/x86/topology.rst),
> > > > > I think in _isst_if_get_pci_dev, topology_physical_package_id
> > > > > should
> > > > > be
> > > > > replaced with topology_logical_package_id.
> > > > >
> > > > > kasan bug report:
> > > > > [ 19.411889]
> > > > > =============================================================
> > > > > ====
> > > > > =
> > > > > [ 19.413702] BUG: KASAN: slab-out-of-bounds in
> > > > > _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
> > > > > [ 19.415634] Read of size 8 at addr ffff888829e65200 by
> > > > > task
> > > > > cpuhp/16/113
> > > > > [ 19.417368]
> > > > > [ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G
> > > > > E 6.9.0 #10
> > > > > [ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX
> > > > > Desktop
> > > > > Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713
> > > > > 07/28/2022
> > > > > [ 19.422687] Call Trace:
> > > > > [ 19.424091] <TASK>
> > > > > [ 19.425448] dump_stack_lvl+0x5d/0x80
> > > > > [ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400
> > > > > [isst_if_common]
> > > > > [ 19.428694] print_report+0x19d/0x52e
> > > > > [ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
> > > > > [ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400
> > > > > [isst_if_common]
> > > > > [ 19.433539] kasan_report+0xf0/0x170
> > > > > [ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400
> > > > > [isst_if_common]
> > > > > [ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400
> > > > > [isst_if_common]
> > > > > [ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10
> > > > > [ 19.439910] isst_if_cpu_online+0x406/0x58f
> > > > > [isst_if_common]
> > > > > [ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10
> > > > > [isst_if_common]
> > > > > [ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360
> > > > > [ 19.444797] cpuhp_invoke_callback+0x221/0xec0
> > > > > [ 19.446337] cpuhp_thread_fun+0x21b/0x610
> > > > > [ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10
> > > > > [ 19.449354] smpboot_thread_fn+0x2e7/0x6e0
> > > > > [ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10
> > > > > [ 19.452405] kthread+0x29c/0x350
> > > > > [ 19.453817] ? __pfx_kthread+0x10/0x10
> > > > > [ 19.455253] ret_from_fork+0x31/0x70
> > > > > [ 19.456685] ? __pfx_kthread+0x10/0x10
> > > > > [ 19.458114] ret_from_fork_asm+0x1a/0x30
> > > > > [ 19.459573] </TASK>
> > > > > [ 19.460853]
> > > > > [ 19.462055] Allocated by task 1198:
> > > > > [ 19.463410] kasan_save_stack+0x30/0x50
> > > > > [ 19.464788] kasan_save_track+0x14/0x30
> > > > > [ 19.466139] __kasan_kmalloc+0xaa/0xb0
> > > > > [ 19.467465] __kmalloc+0x1cd/0x470
> > > > > [ 19.468748] isst_if_cdev_register+0x1da/0x350
> > > > > [isst_if_common]
> > > > > [ 19.470233] isst_if_mbox_init+0x108/0xff0
> > > > > [isst_if_mbox_msr]
> > > > > [ 19.471670] do_one_initcall+0xa4/0x380
> > > > > [ 19.472903] do_init_module+0x238/0x760
> > > > > [ 19.474105] load_module+0x5239/0x6f00
> > > > > [ 19.475285] init_module_from_file+0xd1/0x130
> > > > > [ 19.476506] idempotent_init_module+0x23b/0x650
> > > > > [ 19.477725] __x64_sys_finit_module+0xbe/0x130
> > > > > [ 19.476506] idempotent_init_module+0x23b/0x650
> > > > > [ 19.477725] __x64_sys_finit_module+0xbe/0x130
> > > > > [ 19.478920] do_syscall_64+0x82/0x160
> > > > > [ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > > > [ 19.481292]
> > > > > [ 19.482205] The buggy address belongs to the object at
> > > > > ffff888829e65000
> > > > > which belongs to the cache kmalloc-512 of size 512
> > > > > [ 19.484818] The buggy address is located 0 bytes to the
> > > > > right
> > > > > of
> > > > > allocated 512-byte region [ffff888829e65000,
> > > > > ffff888829e65200)
> > > > > [ 19.487447]
> > > > > [ 19.488328] The buggy address belongs to the physical
> > > > > page:
> > > > > [ 19.489569] page: refcount:1 mapcount:0
> > > > > mapping:0000000000000000
> > > > > index:0xffff888829e60c00 pfn:0x829e60
> > > > > [ 19.491140] head: order:3 entire_mapcount:0
> > > > > nr_pages_mapped:0
> > > > > pincount:0
> > > > > [ 19.492466] anon flags:
> > > > > 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
> > > > > [ 19.493914] page_type: 0xffffffff()
> > > > > [ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80
> > > > > 0000000000000000 0000000000000001
> > > > > [ 19.496451] raw: ffff888829e60c00 0000000080200018
> > > > > 00000001ffffffff 0000000000000000
> > > > > [ 19.497906] head: 0057ffffc0000840 ffff88810004cc80
> > > > > 0000000000000000 0000000000000001
> > > > > [ 19.499379] head: ffff888829e60c00 0000000080200018
> > > > > 00000001ffffffff 0000000000000000
> > > > > [ 19.500844] head: 0057ffffc0000003 ffffea0020a79801
> > > > > ffffea0020a79848 00000000ffffffff
> > > > > [ 19.502316] head: 0000000800000000 0000000000000000
> > > > > 00000000ffffffff 0000000000000000
> > > > > [ 19.503784] page dumped because: kasan: bad access
> > > > > detected
> > > > > [ 19.505058]
> > > > > [ 19.505970] Memory state around the buggy address:
> > > > > [ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00
> > > > > 00
> > > > > 00 00
> > > > > 00 00 00 00
> > > > > [ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00
> > > > > 00
> > > > > 00 00
> > > > > 00 00 00 00
> > > > > [ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc
> > > > > fc
> > > > > fc fc
> > > > > fc fc fc fc
> > > > > [ 19.510014] ^
> > > > > [ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc
> > > > > fc
> > > > > fc fc
> > > > > fc fc fc fc
> > > > > [ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc
> > > > > fc
> > > > > fc fc
> > > > > fc fc fc fc
> > > > > [ 19.515367]
> > > > > =============================================================
> > > > > ====
> > > > > =
> > > > >
> > > > > Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device
> > > > > mapping
> > > > > with
> > > > > Sub-NUMA clustering")
> > > > > Signed-off-by: Zach Wade <zachwade.k@gmail.com>
> > > > > ---
> > > > > drivers/platform/x86/intel/speed_select_if/isst_if_common.
> > > > > c | 2
> > > > > +-
> > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git
> > > > > a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
> > > > > b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
> > > > > index 10e21563fa46..80654aacd5bd 100644
> > > > > ---
> > > > > a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
> > > > > +++
> > > > > b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
> > > > > @@ -316,7 +316,7 @@ static struct pci_dev
> > > > > *_isst_if_get_pci_dev(int
> > > > > cpu, int bus_no, int dev, int fn
> > > > > cpu >= nr_cpu_ids || cpu >=
> > > > > num_possible_cpus())
> > > > > return NULL;
> > > > >
> > > > > - pkg_id = topology_physical_package_id(cpu);
> > > > > + pkg_id = topology_logical_package_id(cpu);
> > > > >
> > > > > bus_number = isst_cpu_info[cpu].bus_info[bus_no];
> > > > > if (bus_number < 0)
> > > >
> > >
> >
>
next prev parent reply other threads:[~2024-09-19 18:37 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-17 18:03 [PATCH] platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug Zach Wade
2024-09-18 13:48 ` srinivas pandruvada
2024-09-18 16:37 ` Zach Wade
2024-09-18 17:41 ` srinivas pandruvada
2024-09-19 16:22 ` Zach Wade
2024-09-19 18:37 ` srinivas pandruvada [this message]
2024-09-20 16:16 ` Zach Wade
2024-09-19 16:37 ` [PATCH v2] " Zach Wade
2024-09-19 18:44 ` srinivas pandruvada
2024-09-20 16:19 ` Zach Wade
2024-09-20 19:37 ` srinivas pandruvada
2024-09-21 11:41 ` Zach Wade
2024-09-23 14:45 ` [PATCH v3] " Zach Wade
2024-09-23 17:51 ` srinivas pandruvada
2024-10-05 12:53 ` Hans de Goede
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=422bfc19243e8d50fc3c847f0c1db01e703aff4e.camel@linux.intel.com \
--to=srinivas.pandruvada@linux.intel.com \
--cc=hdegoede@redhat.com \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=platform-driver-x86@vger.kernel.org \
--cc=zachwade.k@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.