* [PATCH v4 0/4] Miscellaneous fixes for pci subsystem
@ 2026-01-16 8:17 Ziming Du
2026-01-16 8:17 ` [PATCH v4 1/4] PCI/sysfs: Prohibit unaligned access to I/O port Ziming Du
` (4 more replies)
0 siblings, 5 replies; 13+ messages in thread
From: Ziming Du @ 2026-01-16 8:17 UTC (permalink / raw)
To: bhelgaas, alex, chrisw, jbarnes; +Cc: linux-pci, linux-kernel, liuyongqiang13
Miscellaneous fixes for pci subsystem
Changes in v4:
- Remove the architecture-specific #ifdef to apply the alignment
check for all platforms (including x86), as device registers are
naturally aligned anyway.
- Fix a potential issue in proc_bus_pci_read() to make it consistent
with proc_bus_pci_write(), as suggested by Ilpo Järvinen.
- Link to v3: https://lore.kernel.org/linux-pci/20260108015944.3520719-1-duziming2@huawei.com/
Changes in v3:
- Check *ppos before assigning it to pos.
- Link to v2: https://lore.kernel.org/linux-pci/20251224092721.2034529-1-duziming2@huawei.com/
Changes in v2:
- Correct grammar and indentation.
- Remove unrelated stack traces from the commit message.
- Modify the handling of pos by adding a non-negative check to ensure
that the input value is valid.
- Use the existing IS_ALIGNED macro and ensure that after modification,
other cases still return -EINVAL as before.
- Link to v1: https://lore.kernel.org/linux-pci/20251216083912.758219-1-duziming2@huawei.com/
Yongqiang Liu (2):
PCI: Prevent overflow in proc_bus_pci_write()
PCI/sysfs: Prohibit unaligned access to I/O port
Ziming Du (2):
PCI/sysfs: Fix null pointer dereference during hotplug
PCI: Prevent overflow in proc_bus_pci_read()
drivers/pci/pci-sysfs.c | 7 +++++++
drivers/pci/proc.c | 11 +++++++++--
2 files changed, 16 insertions(+), 2 deletions(-)
--
2.43.0
* [PATCH v4 1/4] PCI/sysfs: Prohibit unaligned access to I/O port
2026-01-16 8:17 [PATCH v4 0/4] Miscellaneous fixes for pci subsystem Ziming Du
@ 2026-01-16 8:17 ` Ziming Du
2026-02-26 17:00 ` Bjorn Helgaas
2026-01-16 8:17 ` [PATCH v4 2/4] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
` (3 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Ziming Du @ 2026-01-16 8:17 UTC (permalink / raw)
To: bhelgaas, alex, chrisw, jbarnes; +Cc: linux-pci, linux-kernel, liuyongqiang13
Unaligned access to I/O ports is harmful on non-x86 architectures such as
arm64. When pwrite() or pread() is used to access an I/O port resource at
an unaligned offset, the system crashes as follows:
Unable to handle kernel paging request at virtual address fffffbfffe8010c1
Internal error: Oops: 0000000096000061 [#1] SMP
Call trace:
_outw include/asm-generic/io.h:594 [inline]
logic_outw+0x54/0x218 lib/logic_pio.c:305
pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
vfs_write+0x7bc/0xac8 fs/read_write.c:586
ksys_write+0x12c/0x270 fs/read_write.c:639
__arm64_sys_write+0x78/0xb8 fs/read_write.c:648
Although x86 might handle unaligned I/O accesses by splitting cycles,
this approach is still limited because PCI device registers typically
expect natural alignment. A global prohibition of unaligned accesses
ensures consistent behavior across all architectures and prevents
unexpected hardware side effects.
Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Ziming Du <duziming2@huawei.com>
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
drivers/pci/pci-sysfs.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c2df915ad2d29..18e5d4603b472 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -31,6 +31,7 @@
#include <linux/of.h>
#include <linux/aperture.h>
#include <linux/unaligned.h>
+#include <linux/align.h>
#include "pci.h"
#ifndef ARCH_PCI_DEV_GROUPS
@@ -1166,12 +1167,16 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj,
*(u8 *)buf = inb(port);
return 1;
case 2:
+ if (!IS_ALIGNED(port, count))
+ return -EINVAL;
if (write)
outw(*(u16 *)buf, port);
else
*(u16 *)buf = inw(port);
return 2;
case 4:
+ if (!IS_ALIGNED(port, count))
+ return -EINVAL;
if (write)
outl(*(u32 *)buf, port);
else
--
2.43.0
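For illustration only, the per-size check added above can be sketched in
user space. The IS_ALIGNED definition below is a simplified stand-in for
the kernel macro, and io_access_ok() is a hypothetical helper name, not a
function in pci-sysfs.c. The sketch assumes the access size is 1, 2, or 4
bytes, as in pci_resource_io().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's IS_ALIGNED(); 'a' must be a power of two. */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/*
 * Hypothetical helper: return 0 if an access of 'count' bytes at 'port'
 * is naturally aligned, -EINVAL otherwise. Sizes other than 1, 2 and 4
 * are rejected, matching the default case of the switch in the patch.
 */
int io_access_ok(unsigned long port, size_t count)
{
    switch (count) {
    case 1:
        return 0;               /* a byte access is always aligned */
    case 2:
    case 4:
        return IS_ALIGNED(port, count) ? 0 : -EINVAL;
    default:
        return -EINVAL;
    }
}

/* Usage example: a 2-byte access at an even and at an odd port. */
void io_access_ok_examples(void)
{
    assert(io_access_ok(0x10c0, 2) == 0);
    assert(io_access_ok(0x10c1, 2) == -EINVAL);
}
```

The port numbers in the example are arbitrary values chosen for the
sketch.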
* [PATCH v4 2/4] PCI/sysfs: Fix null pointer dereference during hotplug
2026-01-16 8:17 [PATCH v4 0/4] Miscellaneous fixes for pci subsystem Ziming Du
2026-01-16 8:17 ` [PATCH v4 1/4] PCI/sysfs: Prohibit unaligned access to I/O port Ziming Du
@ 2026-01-16 8:17 ` Ziming Du
2026-02-26 17:14 ` Bjorn Helgaas
2026-01-16 8:17 ` [PATCH v4 3/4] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
` (2 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Ziming Du @ 2026-01-16 8:17 UTC (permalink / raw)
To: bhelgaas, alex, chrisw, jbarnes; +Cc: linux-pci, linux-kernel, liuyongqiang13
When VF creation and a PCI rescan run concurrently, the resource files
for the same pci_dev may be created twice. The second creation attempt
fails, and the res_attr in pci_dev is released with kfree(), but the
pointer is not set to NULL. This subsequently leads to a NULL pointer
dereference when the device is removed.
When we perform the following operation:
echo $sriov_totalvfs > /sys/class/net/"$pfname"/device/sriov_numvfs &
sleep 0.5
echo 1 > /sys/bus/pci/rescan
pci_remove "$pfname"
the system crashes as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Call trace:
__pi_strlen+0x14/0x150
kernfs_find_ns+0x54/0x120
kernfs_remove_by_name_ns+0x58/0xf0
sysfs_remove_bin_file+0x24/0x38
pci_remove_resource_files+0x44/0x90
pci_remove_sysfs_dev_files+0x28/0x40
pci_stop_bus_device+0xb8/0x118
pci_stop_and_remove_bus_device+0x20/0x40
pci_iov_remove_virtfn+0xb8/0x138
sriov_disable+0xbc/0x190
pci_disable_sriov+0x30/0x48
hinic_pci_sriov_disable+0x54/0x138 [hinic]
hinic_remove+0x140/0x290 [hinic]
pci_device_remove+0x4c/0xf8
device_remove+0x54/0x90
device_release_driver_internal+0x1d4/0x238
device_release_driver+0x20/0x38
pci_stop_bus_device+0xa8/0x118
pci_stop_and_remove_bus_device_locked+0x28/0x50
remove_store+0x128/0x208
Fix this by setting the pointer to NULL immediately after releasing 'res_attr'.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
drivers/pci/pci-sysfs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 18e5d4603b472..fbcbf39232732 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1227,12 +1227,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
if (res_attr) {
sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
kfree(res_attr);
+ pdev->res_attr[i] = NULL;
}
res_attr = pdev->res_attr_wc[i];
if (res_attr) {
sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
kfree(res_attr);
+ pdev->res_attr_wc[i] = NULL;
}
}
}
--
2.43.0
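As a rough user-space sketch of the idiom this patch applies (free the
attribute, then clear the slot), the following uses malloc()/free() in
place of the kernel allocator. struct fake_dev, NUM_RES and
fake_remove_resource_files() are hypothetical names made up for the
example. The sketch only shows why clearing the pointer makes a repeated
removal harmless; it says nothing about locking between the two paths.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_RES 6   /* arbitrary slot count for the sketch */

struct fake_dev {
    char *res_attr[NUM_RES];
};

/*
 * Free every populated slot and clear it, so a later call sees NULL
 * instead of a stale pointer. Returns the number of slots freed.
 */
int fake_remove_resource_files(struct fake_dev *dev)
{
    int freed = 0;

    for (int i = 0; i < NUM_RES; i++) {
        if (dev->res_attr[i]) {
            free(dev->res_attr[i]);
            dev->res_attr[i] = NULL;    /* the step the patch adds */
            freed++;
        }
    }
    return freed;
}

/* Usage example: the second removal finds nothing left to free. */
void fake_remove_examples(void)
{
    struct fake_dev dev = { { NULL } };
    int populated;

    dev.res_attr[0] = malloc(8);
    dev.res_attr[2] = malloc(8);
    populated = (dev.res_attr[0] != NULL) + (dev.res_attr[2] != NULL);

    assert(fake_remove_resource_files(&dev) == populated);
    assert(fake_remove_resource_files(&dev) == 0);
}
```

Without the assignment to NULL, the second call would pass an already
freed pointer to free() again.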
* [PATCH v4 3/4] PCI: Prevent overflow in proc_bus_pci_write()
2026-01-16 8:17 [PATCH v4 0/4] Miscellaneous fixes for pci subsystem Ziming Du
2026-01-16 8:17 ` [PATCH v4 1/4] PCI/sysfs: Prohibit unaligned access to I/O port Ziming Du
2026-01-16 8:17 ` [PATCH v4 2/4] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
@ 2026-01-16 8:17 ` Ziming Du
2026-03-03 19:32 ` Bjorn Helgaas
2026-01-16 8:17 ` [PATCH v4 4/4] PCI: Prevent overflow in proc_bus_pci_read() Ziming Du
2026-01-30 7:53 ` [PATCH v4 0/4] Miscellaneous fixes for pci subsystem duziming
4 siblings, 1 reply; 13+ messages in thread
From: Ziming Du @ 2026-01-16 8:17 UTC (permalink / raw)
To: bhelgaas, alex, chrisw, jbarnes; +Cc: linux-pci, linux-kernel, liuyongqiang13
From: Yongqiang Liu <liuyongqiang13@huawei.com>
When the value of *ppos exceeds INT_MAX, pos is set to a negative value,
which is then passed to get_user() or pci_user_write_config_dword().
This results in unexpected behavior such as the following soft lockup:
watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [syz.3.109:3444]
RIP: 0010:_raw_spin_unlock_irq+0x17/0x30
Call Trace:
<TASK>
pci_user_write_config_dword+0x126/0x1f0
proc_bus_pci_write+0x273/0x470
proc_reg_write+0x1b6/0x280
do_iter_write+0x48e/0x790
vfs_writev+0x125/0x4a0
__x64_sys_pwritev+0x1e2/0x2a0
do_syscall_64+0x59/0x110
entry_SYSCALL_64_after_hwframe+0x78/0xe2
Fix this by rejecting *ppos values larger than INT_MAX before assigning *ppos to pos.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Ziming Du <duziming2@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
drivers/pci/proc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 9348a0fb80847..2d51b26edbe74 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -113,10 +113,14 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf,
{
struct inode *ino = file_inode(file);
struct pci_dev *dev = pde_data(ino);
- int pos = *ppos;
+ int pos;
int size = dev->cfg_size;
int cnt, ret;
+ if (*ppos > INT_MAX)
+ return -EINVAL;
+ pos = *ppos;
+
ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
if (ret)
return ret;
--
2.43.0
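To make the truncation concrete, here is a small user-space sketch. It
models "int pos = *ppos" as keeping the low 32 bits of a 64-bit offset,
and then shows a check shaped like the one in this patch. The helper
names truncate_to_int32() and checked_pos() are hypothetical, the offset
0x80800001 is just an example value above INT_MAX, and the sketch assumes
a 32-bit int and a non-negative file offset.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of assigning a 64-bit offset to a 32-bit signed variable:
 * keep the low 32 bits and reinterpret them as a signed value.
 */
int32_t truncate_to_int32(int64_t ppos)
{
    uint32_t low = (uint32_t)ppos;  /* conversion to unsigned is modulo 2^32 */
    int32_t pos;

    memcpy(&pos, &low, sizeof(pos));
    return pos;
}

/*
 * Hypothetical helper mirroring the added check: reject offsets above
 * INT_MAX before narrowing. Assumes ppos >= 0, as for a file offset.
 */
int checked_pos(int64_t ppos, int *pos)
{
    if (ppos > INT_MAX)
        return -EINVAL;
    *pos = (int)ppos;
    return 0;
}

/* Usage example: an offset just above INT_MAX turns negative when truncated. */
void pos_examples(void)
{
    int pos = 0;

    assert(truncate_to_int32(0x80800001LL) < 0);
    assert(checked_pos(0x80800001LL, &pos) == -EINVAL);
    assert(checked_pos(64, &pos) == 0 && pos == 64);
}
```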
* [PATCH v4 4/4] PCI: Prevent overflow in proc_bus_pci_read()
2026-01-16 8:17 [PATCH v4 0/4] Miscellaneous fixes for pci subsystem Ziming Du
` (2 preceding siblings ...)
2026-01-16 8:17 ` [PATCH v4 3/4] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
@ 2026-01-16 8:17 ` Ziming Du
2026-01-30 7:53 ` [PATCH v4 0/4] Miscellaneous fixes for pci subsystem duziming
4 siblings, 0 replies; 13+ messages in thread
From: Ziming Du @ 2026-01-16 8:17 UTC (permalink / raw)
To: bhelgaas, alex, chrisw, jbarnes; +Cc: linux-pci, linux-kernel, liuyongqiang13
proc_bus_pci_read() assigns *ppos directly to an unsigned integer variable.
For large offsets, this implicit conversion may truncate the value and
cause reads from an incorrect position.
proc_bus_pci_write() explicitly validates *ppos and rejects values larger
than INT_MAX, while proc_bus_pci_read() currently accepts them. This
difference in position handling is unjustified.
Fix this by validating *ppos in proc_bus_pci_read() and rejecting offsets
larger than INT_MAX before the assignment, matching proc_bus_pci_write().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ziming Du <duziming2@huawei.com>
Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
drivers/pci/proc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 2d51b26edbe74..f4ef7629dc78b 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -29,9 +29,12 @@ static ssize_t proc_bus_pci_read(struct file *file, char __user *buf,
size_t nbytes, loff_t *ppos)
{
struct pci_dev *dev = pde_data(file_inode(file));
- unsigned int pos = *ppos;
+ int pos;
unsigned int cnt, size;
+ if (*ppos > INT_MAX)
+ return -EINVAL;
+ pos = *ppos;
/*
* Normal users can read only the standardized portion of the
* configuration space as several chips lock up when trying to read
--
2.43.0
* Re: [PATCH v4 0/4] Miscellaneous fixes for pci subsystem
2026-01-16 8:17 [PATCH v4 0/4] Miscellaneous fixes for pci subsystem Ziming Du
` (3 preceding siblings ...)
2026-01-16 8:17 ` [PATCH v4 4/4] PCI: Prevent overflow in proc_bus_pci_read() Ziming Du
@ 2026-01-30 7:53 ` duziming
2026-02-06 22:29 ` Bjorn Helgaas
4 siblings, 1 reply; 13+ messages in thread
From: duziming @ 2026-01-30 7:53 UTC (permalink / raw)
To: duziming2
Cc: alex, bhelgaas, chrisw, jbarnes, linux-kernel, linux-pci,
liuyongqiang13, duziming2
Hi all,
Gentle ping on this patchset. Any feedback would be greatly appreciated.
On 2026/1/15 15:52, Ziming Du wrote:
> Miscellaneous fixes for pci subsystem
>
> Changes in v4:
> - Remove the architecture-specific #ifdef to apply the alignment
> check for all platforms (including x86), as device registers are
> naturally aligned anyway.
> - Fix a potential issue in proc_bus_pci_read() to make it consistent
> with proc_bus_pci_write(), as suggested by Ilpo Järvinen.
> - Link to v3:https://lore.kernel.org/linux-pci/20260108015944.3520719-1-duziming2@huawei.com/
>
> Changes in v3:
> - Check *ppos before assign it to pos.
> - Link to v2:https://lore.kernel.org/linux-pci/20251224092721.2034529-1-duziming2@huawei.com/
>
> Changes in v2:
> - Correct grammar and indentation.
> - Remove unrelated stack traces from the commit message.
> - Modify the handling of pos by adding a non-negative check to ensure
> that the input value is valid.
> - Use the existing IS_ALIGNED macro and ensure that after modification,
> other cases still return -EINVAL as before.
> - Link to v1:https://lore.kernel.org/linux-pci/20251216083912.758219-1-duziming2@huawei.com/
>
> Yongqiang Liu (2):
> PCI: Prevent overflow in proc_bus_pci_write()
> PCI/sysfs: Prohibit unaligned access to I/O port
> Ziming Du (2):
> PCI/sysfs: Fix null pointer dereference during hotplug
> PCI: Prevent overflow in proc_bus_pci_read()
>
> drivers/pci/pci-sysfs.c | 7 +++++++
> drivers/pci/proc.c | 11 +++++++++--
> 2 files changed, 16 insertions(+), 2 deletions(-)
Thanx du!
* Re: [PATCH v4 0/4] Miscellaneous fixes for pci subsystem
2026-01-30 7:53 ` [PATCH v4 0/4] Miscellaneous fixes for pci subsystem duziming
@ 2026-02-06 22:29 ` Bjorn Helgaas
2026-02-26 9:07 ` duziming
0 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2026-02-06 22:29 UTC (permalink / raw)
To: duziming
Cc: alex, bhelgaas, chrisw, jbarnes, linux-kernel, linux-pci,
liuyongqiang13
On Fri, Jan 30, 2026 at 03:53:10PM +0800, duziming wrote:
> Hi all,
>
> Gentle ping on this patchset. Any feedback would be greatly appreciated.
Sorry I didn't get to this. Poke again after v6.20-rc1.
> On 2026/1/15 15:52, Ziming Du wrote:
>
> > Miscellaneous fixes for pci subsystem
> >
> > Changes in v4:
> > - Remove the architecture-specific #ifdef to apply the alignment
> > check for all platforms (including x86), as device registers are
> > naturally aligned anyway.
> > - Fix a potential issue in proc_bus_pci_read() to make it consistent
> > with proc_bus_pci_write(), as suggested by Ilpo Järvinen.
> > - Link to v3:https://lore.kernel.org/linux-pci/20260108015944.3520719-1-duziming2@huawei.com/
> >
> > Changes in v3:
> > - Check *ppos before assign it to pos.
> > - Link to v2:https://lore.kernel.org/linux-pci/20251224092721.2034529-1-duziming2@huawei.com/
> >
> > Changes in v2:
> > - Correct grammar and indentation.
> > - Remove unrelated stack traces from the commit message.
> > - Modify the handling of pos by adding a non-negative check to ensure
> > that the input value is valid.
> > - Use the existing IS_ALIGNED macro and ensure that after modification,
> > other cases still return -EINVAL as before.
> > - Link to v1:https://lore.kernel.org/linux-pci/20251216083912.758219-1-duziming2@huawei.com/
> >
> > Yongqiang Liu (2):
> > PCI: Prevent overflow in proc_bus_pci_write()
> > PCI/sysfs: Prohibit unaligned access to I/O port
> > Ziming Du (2):
> > PCI/sysfs: Fix null pointer dereference during hotplug
> > PCI: Prevent overflow in proc_bus_pci_read()
> >
> > drivers/pci/pci-sysfs.c | 7 +++++++
> > drivers/pci/proc.c | 11 +++++++++--
> > 2 files changed, 16 insertions(+), 2 deletions(-)
>
> Thanx du!
>
* Re: [PATCH v4 0/4] Miscellaneous fixes for pci subsystem
2026-02-06 22:29 ` Bjorn Helgaas
@ 2026-02-26 9:07 ` duziming
0 siblings, 0 replies; 13+ messages in thread
From: duziming @ 2026-02-26 9:07 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: alex, bhelgaas, chrisw, jbarnes, linux-kernel, linux-pci,
liuyongqiang13
On 2026/2/7 6:29, Bjorn Helgaas wrote:
> On Fri, Jan 30, 2026 at 03:53:10PM +0800, duziming wrote:
>> Hi all,
>>
>> Gentle ping on this patchset. Any feedback would be greatly appreciated.
> Sorry I didn't get to this. Poke again after v6.20-rc1.
ping?
>> On 2026/1/15 15:52, Ziming Du wrote:
>>
>>> Miscellaneous fixes for pci subsystem
>>>
>>> Changes in v4:
>>> - Remove the architecture-specific #ifdef to apply the alignment
>>> check for all platforms (including x86), as device registers are
>>> naturally aligned anyway.
>>> - Fix a potential issue in proc_bus_pci_read() to make it consistent
>>> with proc_bus_pci_write(), as suggested by Ilpo Järvinen.
>>> - Link to v3:https://lore.kernel.org/linux-pci/20260108015944.3520719-1-duziming2@huawei.com/
>>>
>>> Changes in v3:
>>> - Check *ppos before assign it to pos.
>>> - Link to v2:https://lore.kernel.org/linux-pci/20251224092721.2034529-1-duziming2@huawei.com/
>>>
>>> Changes in v2:
>>> - Correct grammar and indentation.
>>> - Remove unrelated stack traces from the commit message.
>>> - Modify the handling of pos by adding a non-negative check to ensure
>>> that the input value is valid.
>>> - Use the existing IS_ALIGNED macro and ensure that after modification,
>>> other cases still return -EINVAL as before.
>>> - Link to v1:https://lore.kernel.org/linux-pci/20251216083912.758219-1-duziming2@huawei.com/
>>>
>>> Yongqiang Liu (2):
>>> PCI: Prevent overflow in proc_bus_pci_write()
>>> PCI/sysfs: Prohibit unaligned access to I/O port
>>> Ziming Du (2):
>>> PCI/sysfs: Fix null pointer dereference during hotplug
>>> PCI: Prevent overflow in proc_bus_pci_read()
>>>
>>> drivers/pci/pci-sysfs.c | 7 +++++++
>>> drivers/pci/proc.c | 11 +++++++++--
>>> 2 files changed, 16 insertions(+), 2 deletions(-)
>> Thanx du!
>>
* Re: [PATCH v4 1/4] PCI/sysfs: Prohibit unaligned access to I/O port
2026-01-16 8:17 ` [PATCH v4 1/4] PCI/sysfs: Prohibit unaligned access to I/O port Ziming Du
@ 2026-02-26 17:00 ` Bjorn Helgaas
0 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2026-02-26 17:00 UTC (permalink / raw)
To: Ziming Du
Cc: bhelgaas, alex, chrisw, jbarnes, linux-pci, linux-kernel,
liuyongqiang13
On Fri, Jan 16, 2026 at 04:17:18PM +0800, Ziming Du wrote:
> Unaligned access is harmful for non-x86 archs such as arm64. When we
> use pwrite or pread to access the I/O port resources with unaligned
> offset, system will crash as follows:
> @@ -1166,12 +1167,16 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj,
> *(u8 *)buf = inb(port);
> return 1;
> case 2:
> + if (!IS_ALIGNED(port, count))
> + return -EINVAL;
I assume "IS_ALIGNED(port, 1)" is *always* true, so can we just do
this once before the switch instead of adding it to the "case 2" and
"case 4"?
> if (write)
> outw(*(u16 *)buf, port);
> else
> *(u16 *)buf = inw(port);
> return 2;
> case 4:
> + if (!IS_ALIGNED(port, count))
> + return -EINVAL;
> if (write)
> outl(*(u32 *)buf, port);
> else
> --
> 2.43.0
>
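The suggestion above, a single alignment check ahead of the switch, can
be sketched in user space as follows. This is not the kernel code: the
fake_io array stands in for real I/O ports, fake_resource_io() is a
hypothetical analogue of pci_resource_io(), and IS_ALIGNED is a
simplified stand-in for the kernel macro. The point of the sketch is that
IS_ALIGNED(port, 1) holds for every port, so one check covers sizes 1, 2
and 4, and any other size still ends in -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the kernel's IS_ALIGNED(). */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Fake 16-byte I/O space standing in for real ports (assumption for the sketch). */
static uint8_t fake_io[16];

/*
 * Hypothetical analogue of pci_resource_io() with the alignment check
 * hoisted in front of the switch. Returns the number of bytes
 * transferred, or -EINVAL.
 */
int fake_resource_io(unsigned long port, void *buf, size_t count, int write)
{
    /* Bounds check for the fake array only; not part of the suggestion. */
    if (port > sizeof(fake_io) || count > sizeof(fake_io) - port)
        return -EINVAL;

    /* One check for all sizes: always true when count == 1. */
    if (!IS_ALIGNED(port, count))
        return -EINVAL;

    switch (count) {
    case 1:
    case 2:
    case 4:
        if (write)
            memcpy(&fake_io[port], buf, count);
        else
            memcpy(buf, &fake_io[port], count);
        return (int)count;
    }
    return -EINVAL;     /* unsupported size, as before */
}

/* Usage example: aligned 2-byte round trip, then an unaligned attempt. */
void fake_resource_io_examples(void)
{
    uint16_t val = 0x1234, back = 0;

    assert(fake_resource_io(2, &val, 2, 1) == 2);
    assert(fake_resource_io(2, &back, 2, 0) == 2);
    assert(back == 0x1234);
    assert(fake_resource_io(1, &val, 2, 1) == -EINVAL);
    assert(fake_resource_io(1, &val, 1, 0) == 1);
}
```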
* Re: [PATCH v4 2/4] PCI/sysfs: Fix null pointer dereference during hotplug
2026-01-16 8:17 ` [PATCH v4 2/4] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
@ 2026-02-26 17:14 ` Bjorn Helgaas
2026-02-27 2:30 ` duziming
2026-04-02 7:23 ` duziming
0 siblings, 2 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2026-02-26 17:14 UTC (permalink / raw)
To: Ziming Du
Cc: bhelgaas, alex, chrisw, jbarnes, linux-pci, linux-kernel,
liuyongqiang13
On Fri, Jan 16, 2026 at 04:17:19PM +0800, Ziming Du wrote:
> During the concurrent process of creating and rescanning in VF, the
> resource files for the same pci_dev may be created twice.
Where are the two resource file creations? This will help review the
patch.
> The second
> creation attempt fails, resulting the res_attr in pci_dev to kfree(),
> but the pointer is not set to NULL. This will subsequently lead to
> dereferencing a null pointer when removing the device.
>
> When we perform the following operation:
> echo $sriov_totalvfs > /sys/class/net/"$pfname"/device/sriov_numvfs &
I think it would be more informative to include an actual sample here.
We can easily substitute the device names and numbers, given a
concrete example. It's a little bit harder to intuit what $pfname and
$sriov_totalvfs should be. E.g.,
$ cat /sys/bus/pci/devices/0000:02:00.0/sriov_totalvfs
128
$ echo 128 > /sys/bus/pci/devices/0000:02:00.0/sriov_numvfs &
Unless it's important to use /sys/class/net/..., use
/sys/bus/pci/devices/... both places to make it simpler.
> sleep 0.5
> echo 1 > /sys/bus/pci/rescan
These look like shell commands ...
> pci_remove "$pfname"
but what is "pci_remove"? I guess it must be an echo into
/sys/bus/pci/devices/.../remove; expanding it here would be better.
> system will crash as follows:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Call trace:
> __pi_strlen+0x14/0x150
> kernfs_find_ns+0x54/0x120
> kernfs_remove_by_name_ns+0x58/0xf0
> sysfs_remove_bin_file+0x24/0x38
> pci_remove_resource_files+0x44/0x90
> pci_remove_sysfs_dev_files+0x28/0x40
> pci_stop_bus_device+0xb8/0x118
> pci_stop_and_remove_bus_device+0x20/0x40
> pci_iov_remove_virtfn+0xb8/0x138
> sriov_disable+0xbc/0x190
> pci_disable_sriov+0x30/0x48
> hinic_pci_sriov_disable+0x54/0x138 [hinic]
> hinic_remove+0x140/0x290 [hinic]
> pci_device_remove+0x4c/0xf8
> device_remove+0x54/0x90
> device_release_driver_internal+0x1d4/0x238
> device_release_driver+0x20/0x38
> pci_stop_bus_device+0xa8/0x118
> pci_stop_and_remove_bus_device_locked+0x28/0x50
> remove_store+0x128/0x208
>
> Fix this by set the pointer to NULL after releasing 'res_attr' immediately.
This *sounds* like it would still be racy unless there's a lock around
this. If there is a lock, please mention what it is and where it's
held.
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Ziming Du <duziming2@huawei.com>
> ---
> drivers/pci/pci-sysfs.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 18e5d4603b472..fbcbf39232732 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1227,12 +1227,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
> if (res_attr) {
> sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
> kfree(res_attr);
> + pdev->res_attr[i] = NULL;
> }
>
> res_attr = pdev->res_attr_wc[i];
> if (res_attr) {
> sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
> kfree(res_attr);
> + pdev->res_attr_wc[i] = NULL;
> }
> }
> }
> --
> 2.43.0
>
* Re: [PATCH v4 2/4] PCI/sysfs: Fix null pointer dereference during hotplug
2026-02-26 17:14 ` Bjorn Helgaas
@ 2026-02-27 2:30 ` duziming
2026-04-02 7:23 ` duziming
1 sibling, 0 replies; 13+ messages in thread
From: duziming @ 2026-02-27 2:30 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: bhelgaas, alex, chrisw, jbarnes, linux-pci, linux-kernel,
liuyongqiang13
On 2026/2/27 1:14, Bjorn Helgaas wrote:
> On Fri, Jan 16, 2026 at 04:17:19PM +0800, Ziming Du wrote:
>> During the concurrent process of creating and rescanning in VF, the
>> resource files for the same pci_dev may be created twice.
> Where are the two resource file creations? This will help review the
> patch.
The process of creating VFs:
sriov_numvfs_store
hinic_pci_sriov_configure
hinic_pci_sriov_enable
pci_enable_sriov
sriov_enable
sriov_add_vfs
pci_iov_add_virtfn
pci_bus_add_device
pci_create_sysfs_dev_files
pci_create_resource_files
The process of rescanning VFs:
rescan_store
pci_rescan_bus
pci_bus_add_devices
pci_bus_add_device
pci_create_sysfs_dev_files
pci_create_resource_files
>> The second
>> creation attempt fails, resulting the res_attr in pci_dev to kfree(),
>> but the pointer is not set to NULL. This will subsequently lead to
>> dereferencing a null pointer when removing the device.
>>
>> When we perform the following operation:
>> echo $sriov_totalvfs > /sys/class/net/"$pfname"/device/sriov_numvfs &
> I think it would be more informative to include an actual sample here.
> We can easily substitute the device names and numbers, given a
> concrete example. It's a little bit harder to intuit what $pfname and
> $sriov_totalvfs should be. E.g.,
>
> $ cat /sys/bus/pci/devices/0000:02:00.0/sriov_totalvfs
> 128
> $ echo 128 > /sys/bus/pci/devices/0000:02:00.0/sriov_numvfs &
>
> Unless it's important to use /sys/class/net/..., use
> /sys/bus/pci/devices/... both places to make it simpler.
>
>> sleep 0.5
>> echo 1 > /sys/bus/pci/rescan
> These look like shell commands ...
>
>> pci_remove "$pfname"
> but what is "pci_remove"? I guess it must be an echo into
> /sys/bus/pci/devices/.../remove; expanding it here would be better.
Yes, you are right, based on your comments, I have revised it as follows:
$ cat /sys/bus/pci/devices/0000:02:00.0/sriov_totalvfs
128
$ echo 128 > /sys/bus/pci/devices/0000:02:00.0/sriov_numvfs &
$ sleep 0.5
$ echo 1 > /sys/bus/pci/rescan
$ echo 1 > /sys/bus/pci/devices/0000:02:00.0/remove
>> system will crash as follows:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>> Call trace:
>> __pi_strlen+0x14/0x150
>> kernfs_find_ns+0x54/0x120
>> kernfs_remove_by_name_ns+0x58/0xf0
>> sysfs_remove_bin_file+0x24/0x38
>> pci_remove_resource_files+0x44/0x90
>> pci_remove_sysfs_dev_files+0x28/0x40
>> pci_stop_bus_device+0xb8/0x118
>> pci_stop_and_remove_bus_device+0x20/0x40
>> pci_iov_remove_virtfn+0xb8/0x138
>> sriov_disable+0xbc/0x190
>> pci_disable_sriov+0x30/0x48
>> hinic_pci_sriov_disable+0x54/0x138 [hinic]
>> hinic_remove+0x140/0x290 [hinic]
>> pci_device_remove+0x4c/0xf8
>> device_remove+0x54/0x90
>> device_release_driver_internal+0x1d4/0x238
>> device_release_driver+0x20/0x38
>> pci_stop_bus_device+0xa8/0x118
>> pci_stop_and_remove_bus_device_locked+0x28/0x50
>> remove_store+0x128/0x208
>>
>> Fix this by set the pointer to NULL after releasing 'res_attr' immediately.
> This *sounds* like it would still be racy unless there's a lock around
> this. If there is a lock, please mention what it is and where it's
> held.
The rescan process holds pci_lock_rescan_remove(), while the creation process
holds device_lock(&pdev->dev).
>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>> Signed-off-by: Ziming Du <duziming2@huawei.com>
>> ---
>> drivers/pci/pci-sysfs.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 18e5d4603b472..fbcbf39232732 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -1227,12 +1227,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
>> if (res_attr) {
>> sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>> kfree(res_attr);
>> + pdev->res_attr[i] = NULL;
>> }
>>
>> res_attr = pdev->res_attr_wc[i];
>> if (res_attr) {
>> sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>> kfree(res_attr);
>> + pdev->res_attr_wc[i] = NULL;
>> }
>> }
>> }
>> --
>> 2.43.0
>>
* Re: [PATCH v4 3/4] PCI: Prevent overflow in proc_bus_pci_write()
2026-01-16 8:17 ` [PATCH v4 3/4] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
@ 2026-03-03 19:32 ` Bjorn Helgaas
0 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2026-03-03 19:32 UTC (permalink / raw)
To: Ziming Du
Cc: bhelgaas, alex, chrisw, jbarnes, linux-pci, linux-kernel,
liuyongqiang13
On Fri, Jan 16, 2026 at 04:17:20PM +0800, Ziming Du wrote:
> From: Yongqiang Liu <liuyongqiang13@huawei.com>
>
> When the value of *ppos over the INT_MAX, the pos is over set to a
> negative value which will be passed to get_user() or
> pci_user_write_config_dword(). Unexpected behavior such as a soft lockup
> will happen as follows:
I think it's crazy to worry about offsets overflowing INT_MAX. We're
doing PCI config accesses. Config space is only 4K at most, so we
already have a much smaller upper bound on the offset.
The procfs proc_bus_pci_write() is essentially the same as the sysfs
pci_write_config(). They should share some common implementation.
> watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [syz.3.109:3444]
> RIP: 0010:_raw_spin_unlock_irq+0x17/0x30
> Call Trace:
> <TASK>
> pci_user_write_config_dword+0x126/0x1f0
> proc_bus_pci_write+0x273/0x470
> proc_reg_write+0x1b6/0x280
> do_iter_write+0x48e/0x790
> vfs_writev+0x125/0x4a0
> __x64_sys_pwritev+0x1e2/0x2a0
> do_syscall_64+0x59/0x110
> entry_SYSCALL_64_after_hwframe+0x78/0xe2
>
> Fix this by adding a non-negative check before assign *ppos to pos.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> Signed-off-by: Ziming Du <duziming2@huawei.com>
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
> drivers/pci/proc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 9348a0fb80847..2d51b26edbe74 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -113,10 +113,14 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf,
> {
> struct inode *ino = file_inode(file);
> struct pci_dev *dev = pde_data(ino);
> - int pos = *ppos;
> + int pos;
> int size = dev->cfg_size;
> int cnt, ret;
>
> + if (*ppos > INT_MAX)
> + return -EINVAL;
> + pos = *ppos;
The first issue here is that "*ppos" (loff_t) is long long, but "pos"
is int, so we're assigning a 64-bit value to a 32-bit container and
losing any high bits. So an offset of 0x100000000 is incorrectly
treated as valid.
A change like this should fix that and bring this closer to the
pci_write_config() implementation:
- int pos = *ppos;
+ loff_t pos = *ppos;
- if (pos >= size)
+ if (pos > dev->cfg_size)
return 0;
There's also a second issue here:
if (pos + nbytes > size)
nbytes = size - pos;
"pos" is a signed int, "nbytes" is size_t, which is an *unsigned* int,
so "pos" is implicitly converted to an unsigned value. I think this
is what causes the soft lockup you reported because an offset like the
0x80800001 in your test case is converted from signed -2139095039 to
unsigned 2155872257. "size" is dev->cfg_size, e.g., 4096, so
2155872257 + nbytes is certainly larger than 4096, so nbytes ends up
being set to some huge unsigned size_t value.
This issue would probably be avoided simply by returning early when
"pos" is out of range. But mixing signed and unsigned in that
"pos + nbytes" expression is just asking for trouble and we should
avoid it as pci_write_config() does.
So I'd like to see something that makes the procfs accessor
validations look like the sysfs accessors. It's a little messy
because they use different names, so the patches will be ugly. But I
think it's worth it to make them work the same way so we don't have to
analyze them separately.
Maybe this could be done in a couple of steps, e.g., one to simply rename
things and a second to make the functional changes.
Bjorn
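The signed/unsigned mixing described above can be reproduced with a
small user-space sketch. It uses fixed-width types so the result does not
depend on the platform: int32_t stands in for "int pos" and uint64_t for
a 64-bit size_t (an assumption; with a 64-bit size_t the converted value
is larger than the 32-bit figure quoted above, but the outcome is the
same). clamp_mixed() and clamp_checked() are hypothetical names; the
second is only a sketch of the sysfs-style validation, with an extra
"off < 0" guard that is not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mimics "if (pos + nbytes > size) nbytes = size - pos;" where pos is a
 * signed 32-bit value and nbytes is unsigned 64-bit. A negative pos
 * becomes a huge unsigned value in the comparison, so the clamp runs and
 * produces a byte count far beyond the config space size.
 */
uint64_t clamp_mixed(int32_t pos, uint64_t nbytes, int32_t size)
{
    if ((uint64_t)pos + nbytes > (uint64_t)size)
        nbytes = (uint64_t)((int64_t)size - (int64_t)pos);
    return nbytes;
}

/*
 * Sketch of the alternative: keep the offset 64-bit, return 0 for
 * anything outside [0, size], and clamp without an addition that could
 * wrap around.
 */
uint64_t clamp_checked(int64_t off, uint64_t count, int64_t size)
{
    if (off < 0 || off > size)
        return 0;
    if (count > (uint64_t)(size - off))
        count = (uint64_t)(size - off);
    return count;
}

/* Usage example with the truncated offset discussed above and a 4096-byte space. */
void clamp_examples(void)
{
    assert(clamp_mixed(-2139095039, 4, 4096) > 4096);
    assert(clamp_checked(0x80800001LL, 4, 4096) == 0);
    assert(clamp_checked(4092, 8, 4096) == 4);
}
```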
* Re: [PATCH v4 2/4] PCI/sysfs: Fix null pointer dereference during hotplug
2026-02-26 17:14 ` Bjorn Helgaas
2026-02-27 2:30 ` duziming
@ 2026-04-02 7:23 ` duziming
1 sibling, 0 replies; 13+ messages in thread
From: duziming @ 2026-04-02 7:23 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: bhelgaas, alex, chrisw, jbarnes, linux-pci, linux-kernel,
liuyongqiang13
On 2026/2/27 1:14, Bjorn Helgaas wrote:
> On Fri, Jan 16, 2026 at 04:17:19PM +0800, Ziming Du wrote:
>> During the concurrent process of creating and rescanning in VF, the
>> resource files for the same pci_dev may be created twice.
> Where are the two resource file creations? This will help review the
> patch.
>
>> The second
>> creation attempt fails, which results in the res_attr in pci_dev being
>> freed with kfree(), but the pointer is not set to NULL. This subsequently
>> leads to dereferencing a null pointer when removing the device.
>>
>> When we perform the following operation:
>> echo $sriov_totalvfs > /sys/class/net/"$pfname"/device/sriov_numvfs &
> I think it would be more informative to include an actual sample here.
> We can easily substitute the device names and numbers, given a
> concrete example. It's a little bit harder to intuit what $pfname and
> $sriov_totalvfs should be. E.g.,
>
> $ cat /sys/bus/pci/devices/0000:02:00.0/sriov_totalvfs
> 128
> $ echo 128 > /sys/bus/pci/devices/0000:02:00.0/sriov_numvfs &
>
> Unless it's important to use /sys/class/net/..., use
> /sys/bus/pci/devices/... both places to make it simpler.
>
>> sleep 0.5
>> echo 1 > /sys/bus/pci/rescan
> These look like shell commands ...
>
>> pci_remove "$pfname"
> but what is "pci_remove"? I guess it must be an echo into
> /sys/bus/pci/devices/.../remove; expanding it here would be better.
>
>> the system will crash as follows:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>> Call trace:
>> __pi_strlen+0x14/0x150
>> kernfs_find_ns+0x54/0x120
>> kernfs_remove_by_name_ns+0x58/0xf0
>> sysfs_remove_bin_file+0x24/0x38
>> pci_remove_resource_files+0x44/0x90
>> pci_remove_sysfs_dev_files+0x28/0x40
>> pci_stop_bus_device+0xb8/0x118
>> pci_stop_and_remove_bus_device+0x20/0x40
>> pci_iov_remove_virtfn+0xb8/0x138
>> sriov_disable+0xbc/0x190
>> pci_disable_sriov+0x30/0x48
>> hinic_pci_sriov_disable+0x54/0x138 [hinic]
>> hinic_remove+0x140/0x290 [hinic]
>> pci_device_remove+0x4c/0xf8
>> device_remove+0x54/0x90
>> device_release_driver_internal+0x1d4/0x238
>> device_release_driver+0x20/0x38
>> pci_stop_bus_device+0xa8/0x118
>> pci_stop_and_remove_bus_device_locked+0x28/0x50
>> remove_store+0x128/0x208
>>
>> Fix this by setting the pointer to NULL immediately after releasing 'res_attr'.
> This *sounds* like it would still be racy unless there's a lock around
> this. If there is a lock, please mention what it is and where it's
> held.
I found that the primary race condition between VF creation and PCI
rescan has been addressed by the pci_lock_rescan_remove() lock added in
commit a5338e365c45.
Given this, would setting the pointer to NULL after kfree() still be
considered a worthwhile defensive measure?
>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>> Signed-off-by: Ziming Du <duziming2@huawei.com>
>> ---
>> drivers/pci/pci-sysfs.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 18e5d4603b472..fbcbf39232732 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -1227,12 +1227,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
>> if (res_attr) {
>> sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>> kfree(res_attr);
>> + pdev->res_attr[i] = NULL;
>> }
>>
>> res_attr = pdev->res_attr_wc[i];
>> if (res_attr) {
>> sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>> kfree(res_attr);
>> + pdev->res_attr_wc[i] = NULL;
>> }
>> }
>> }
>> --
>> 2.43.0
>>
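For illustration only, the effect of clearing the pointer can be shown with a small userspace model (struct fake_dev and remove_resource_files() are hypothetical stand-ins, with free() in place of kfree()). It makes a second removal call harmless, but it does not by itself close a race between two concurrent callers; that still needs a lock such as the one mentioned above.

```c
#include <stdlib.h>

#define NUM_SLOTS 6	/* arbitrary size for the illustration */

/* Hypothetical stand-in for the per-device attribute pointers. */
struct fake_dev {
	char *res_attr[NUM_SLOTS];
};

/* Frees every populated slot and clears it, so a repeated call is a no-op. */
static void remove_resource_files(struct fake_dev *dev)
{
	for (int i = 0; i < NUM_SLOTS; i++) {
		if (dev->res_attr[i]) {
			free(dev->res_attr[i]);
			dev->res_attr[i] = NULL;	/* no dangling pointer left behind */
		}
	}
}
```

Without the assignment to NULL, a second call would see a non-NULL but already freed pointer and pass it to free() again.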
Thread overview: 13+ messages
2026-01-16 8:17 [PATCH v4 0/4] Miscellaneous fixes for pci subsystem Ziming Du
2026-01-16 8:17 ` [PATCH v4 1/4] PCI/sysfs: Prohibit unaligned access to I/O port Ziming Du
2026-02-26 17:00 ` Bjorn Helgaas
2026-01-16 8:17 ` [PATCH v4 2/4] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
2026-02-26 17:14 ` Bjorn Helgaas
2026-02-27 2:30 ` duziming
2026-04-02 7:23 ` duziming
2026-01-16 8:17 ` [PATCH v4 3/4] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
2026-03-03 19:32 ` Bjorn Helgaas
2026-01-16 8:17 ` [PATCH v4 4/4] PCI: Prevent overflow in proc_bus_pci_read() Ziming Du
2026-01-30 7:53 ` [PATCH v4 0/4] Miscellaneous fixes for pci subsystem duziming
2026-02-06 22:29 ` Bjorn Helgaas
2026-02-26 9:07 ` duziming