* [PATCH v3 0/3] Miscellaneous fixes for pci subsystem
@ 2026-01-08 1:59 Ziming Du
2026-01-08 1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Ziming Du @ 2026-01-08 1:59 UTC (permalink / raw)
To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2
Miscellaneous fixes for pci subsystem
Changes in v3:
- Check *ppos before assign it to pos.
- Link to v2: https://lore.kernel.org/linux-pci/20251224092721.2034529-1-duziming2@huawei.com/
Changes in v2:
- Correct grammer and indentation.
- Remove unrelated stack traces from the commit message.
- Modify the handling of pos by adding a non-negative check to ensure
that the input value is valid.
- Use the existing IS_ALIGNED macro and ensure that after modification,
other cases still retuen -EINVAL as before.
- Link to v1: https://lore.kernel.org/linux-pci/20251216083912.758219-1-duziming2@huawei.com/
Yongqiang Liu (2):
PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
PCI: Prevent overflow in proc_bus_pci_write()
Ziming Du (1):
PCI/sysfs: Fix null pointer dereference during hotplug
drivers/pci/pci-sysfs.c | 11 +++++++++++
drivers/pci/proc.c | 6 +++++-
2 files changed, 16 insertions(+), 1 deletion(-)
--
2.43.0
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug 2026-01-08 1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du @ 2026-01-08 1:59 ` Ziming Du 2026-01-08 1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du 2026-01-08 1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du 2 siblings, 0 replies; 12+ messages in thread From: Ziming Du @ 2026-01-08 1:59 UTC (permalink / raw) To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2 During the concurrent process of creating and rescanning in VF, the resource files for the same pci_dev may be created twice. The second creation attempt fails, resulting the res_attr in pci_dev to kfree(), but the pointer is not set to NULL. This will subsequently lead to dereferencing a null pointer when removing the device. When we perform the following operation: echo $sriov_totalvfs > /sys/class/net/"$pfname"/device/sriov_numvfs & sleep 0.5 echo 1 > /sys/bus/pci/rescan pci_remove "$pfname" system will crash as follows: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Call trace: __pi_strlen+0x14/0x150 kernfs_find_ns+0x54/0x120 kernfs_remove_by_name_ns+0x58/0xf0 sysfs_remove_bin_file+0x24/0x38 pci_remove_resource_files+0x44/0x90 pci_remove_sysfs_dev_files+0x28/0x40 pci_stop_bus_device+0xb8/0x118 pci_stop_and_remove_bus_device+0x20/0x40 pci_iov_remove_virtfn+0xb8/0x138 sriov_disable+0xbc/0x190 pci_disable_sriov+0x30/0x48 hinic_pci_sriov_disable+0x54/0x138 [hinic] hinic_remove+0x140/0x290 [hinic] pci_device_remove+0x4c/0xf8 device_remove+0x54/0x90 device_release_driver_internal+0x1d4/0x238 device_release_driver+0x20/0x38 pci_stop_bus_device+0xa8/0x118 pci_stop_and_remove_bus_device_locked+0x28/0x50 remove_store+0x128/0x208 Fix this by set the pointer to NULL after releasing 'res_attr' immediately. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ziming Du <duziming2@huawei.com> --- drivers/pci/pci-sysfs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index c2df915ad2d2..7e697b82c5e1 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -1222,12 +1222,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev) if (res_attr) { sysfs_remove_bin_file(&pdev->dev.kobj, res_attr); kfree(res_attr); + pdev->res_attr[i] = NULL; } res_attr = pdev->res_attr_wc[i]; if (res_attr) { sysfs_remove_bin_file(&pdev->dev.kobj, res_attr); kfree(res_attr); + pdev->res_attr_wc[i] = NULL; } } } -- 2.43.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() 2026-01-08 1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du 2026-01-08 1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du @ 2026-01-08 1:59 ` Ziming Du 2026-01-08 11:35 ` Ilpo Järvinen 2026-01-08 1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du 2 siblings, 1 reply; 12+ messages in thread From: Ziming Du @ 2026-01-08 1:59 UTC (permalink / raw) To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2 From: Yongqiang Liu <liuyongqiang13@huawei.com> When the value of *ppos over the INT_MAX, the pos is over set to a negative value which will be passed to get_user() or pci_user_write_config_dword(). Unexpected behavior such as a soft lockup will happen as follows: watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [syz.3.109:3444] RIP: 0010:_raw_spin_unlock_irq+0x17/0x30 Call Trace: <TASK> pci_user_write_config_dword+0x126/0x1f0 proc_bus_pci_write+0x273/0x470 proc_reg_write+0x1b6/0x280 do_iter_write+0x48e/0x790 vfs_writev+0x125/0x4a0 __x64_sys_pwritev+0x1e2/0x2a0 do_syscall_64+0x59/0x110 entry_SYSCALL_64_after_hwframe+0x78/0xe2 Fix this by using unsigned int for the pos. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> Signed-off-by: Ziming Du <duziming2@huawei.com> --- drivers/pci/proc.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c index 9348a0fb8084..2d51b26edbe7 100644 --- a/drivers/pci/proc.c +++ b/drivers/pci/proc.c @@ -113,10 +113,14 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf, { struct inode *ino = file_inode(file); struct pci_dev *dev = pde_data(ino); - int pos = *ppos; + int pos; int size = dev->cfg_size; int cnt, ret; + if (*ppos > INT_MAX) + return -EINVAL; + pos = *ppos; + ret = security_locked_down(LOCKDOWN_PCI_ACCESS); if (ret) return ret; -- 2.43.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() 2026-01-08 1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du @ 2026-01-08 11:35 ` Ilpo Järvinen 0 siblings, 0 replies; 12+ messages in thread From: Ilpo Järvinen @ 2026-01-08 11:35 UTC (permalink / raw) To: Ziming Du; +Cc: bhelgaas, linux-pci, LKML, liuyongqiang13 [-- Attachment #1: Type: text/plain, Size: 2037 bytes --] On Thu, 8 Jan 2026, Ziming Du wrote: > From: Yongqiang Liu <liuyongqiang13@huawei.com> > > When the value of *ppos over the INT_MAX, the pos is over set to a > negative value which will be passed to get_user() or > pci_user_write_config_dword(). Unexpected behavior such as a soft lockup > will happen as follows: > > watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [syz.3.109:3444] > RIP: 0010:_raw_spin_unlock_irq+0x17/0x30 > Call Trace: > <TASK> > pci_user_write_config_dword+0x126/0x1f0 > proc_bus_pci_write+0x273/0x470 > proc_reg_write+0x1b6/0x280 > do_iter_write+0x48e/0x790 > vfs_writev+0x125/0x4a0 > __x64_sys_pwritev+0x1e2/0x2a0 > do_syscall_64+0x59/0x110 > entry_SYSCALL_64_after_hwframe+0x78/0xe2 > > Fix this by using unsigned int for the pos. > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> > Signed-off-by: Ziming Du <duziming2@huawei.com> > --- > drivers/pci/proc.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c > index 9348a0fb8084..2d51b26edbe7 100644 > --- a/drivers/pci/proc.c > +++ b/drivers/pci/proc.c > @@ -113,10 +113,14 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf, > { > struct inode *ino = file_inode(file); > struct pci_dev *dev = pde_data(ino); > - int pos = *ppos; > + int pos; > int size = dev->cfg_size; > int cnt, ret; > > + if (*ppos > INT_MAX) > + return -EINVAL; > + pos = *ppos; > + > ret = security_locked_down(LOCKDOWN_PCI_ACCESS); > if (ret) > return ret; > Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> With the note that proc_bus_pci_read() and proc_bus_pci_write() diverge in handling > INT_MAX values and that feels unjustified (but there's not this same problem on the read side I guess so if the read side is made the same as the write side, it would be better to do that in another patch). -- i. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 2026-01-08 1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du 2026-01-08 1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du 2026-01-08 1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du @ 2026-01-08 1:59 ` Ziming Du 2026-01-08 8:56 ` David Laight 2 siblings, 1 reply; 12+ messages in thread From: Ziming Du @ 2026-01-08 1:59 UTC (permalink / raw) To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2 From: Yongqiang Liu <liuyongqiang13@huawei.com> Unaligned access is harmful for non-x86 archs such as arm64. When we use pwrite or pread to access the I/O port resources with unaligned offset, system will crash as follows: Unable to handle kernel paging request at virtual address fffffbfffe8010c1 Internal error: Oops: 0000000096000061 [#1] SMP Call trace: _outw include/asm-generic/io.h:594 [inline] logic_outw+0x54/0x218 lib/logic_pio.c:305 pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline] pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline] pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181 sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158 kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338 vfs_write+0x7bc/0xac8 fs/read_write.c:586 ksys_write+0x12c/0x270 fs/read_write.c:639 __arm64_sys_write+0x78/0xb8 fs/read_write.c:648 Powerpc seems affected as well, so prohibit the unaligned access on non-x86 archs. Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources") Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> Signed-off-by: Ziming Du <duziming2@huawei.com> --- drivers/pci/pci-sysfs.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index 7e697b82c5e1..11d8b7ec4263 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -31,6 +31,7 @@ #include <linux/of.h> #include <linux/aperture.h> #include <linux/unaligned.h> +#include <linux/align.h> #include "pci.h" #ifndef ARCH_PCI_DEV_GROUPS @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj, *(u8 *)buf = inb(port); return 1; case 2: + #if !defined(CONFIG_X86) + if (!IS_ALIGNED(port, count)) + return -EFAULT; + #endif if (write) outw(*(u16 *)buf, port); else *(u16 *)buf = inw(port); return 2; case 4: + #if !defined(CONFIG_X86) + if (!IS_ALIGNED(port, count)) + return -EFAULT; + #endif if (write) outl(*(u32 *)buf, port); else -- 2.43.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 2026-01-08 1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du @ 2026-01-08 8:56 ` David Laight 2026-01-08 11:49 ` duziming 2026-01-09 0:38 ` Maciej W. Rozycki 0 siblings, 2 replies; 12+ messages in thread From: David Laight @ 2026-01-08 8:56 UTC (permalink / raw) To: Ziming Du; +Cc: bhelgaas, linux-pci, linux-kernel, liuyongqiang13 On Thu, 8 Jan 2026 09:59:44 +0800 Ziming Du <duziming2@huawei.com> wrote: > From: Yongqiang Liu <liuyongqiang13@huawei.com> > > Unaligned access is harmful for non-x86 archs such as arm64. When we > use pwrite or pread to access the I/O port resources with unaligned > offset, system will crash as follows: > > Unable to handle kernel paging request at virtual address fffffbfffe8010c1 > Internal error: Oops: 0000000096000061 [#1] SMP > Call trace: > _outw include/asm-generic/io.h:594 [inline] > logic_outw+0x54/0x218 lib/logic_pio.c:305 > pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline] > pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline] > pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181 > sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158 > kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338 > vfs_write+0x7bc/0xac8 fs/read_write.c:586 > ksys_write+0x12c/0x270 fs/read_write.c:639 > __arm64_sys_write+0x78/0xb8 fs/read_write.c:648 > > Powerpc seems affected as well, so prohibit the unaligned access > on non-x86 archs. I'm not sure it makes any real sense for x86 either. IIRC io space is just like memory space, so a 16bit io access looks the same as two 8bit accesses to an 8bit device (some put the 'data fifo' on addresses 0 and 1 so the code could use 16bit io accesses to speed things up). The same will have applied to misaligned accesses. But, in reality, all device registers are aligned. I'm not sure EFAULT is the best error code though, EINVAL might be better. (EINVAL is returned for other address/size errors.) EFAULT is usually returned for errors accessing the user buffer, a least one unix system raises SIGSEGV whenever EFAULT is returned. David > > Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources") > Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> > Signed-off-by: Ziming Du <duziming2@huawei.com> > --- > drivers/pci/pci-sysfs.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index 7e697b82c5e1..11d8b7ec4263 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -31,6 +31,7 @@ > #include <linux/of.h> > #include <linux/aperture.h> > #include <linux/unaligned.h> > +#include <linux/align.h> > #include "pci.h" > > #ifndef ARCH_PCI_DEV_GROUPS > @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj, > *(u8 *)buf = inb(port); > return 1; > case 2: > + #if !defined(CONFIG_X86) > + if (!IS_ALIGNED(port, count)) > + return -EFAULT; > + #endif > if (write) > outw(*(u16 *)buf, port); > else > *(u16 *)buf = inw(port); > return 2; > case 4: > + #if !defined(CONFIG_X86) > + if (!IS_ALIGNED(port, count)) > + return -EFAULT; > + #endif > if (write) > outl(*(u32 *)buf, port); > else ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 2026-01-08 8:56 ` David Laight @ 2026-01-08 11:49 ` duziming 2026-01-09 7:21 ` Ilpo Järvinen 2026-01-09 0:38 ` Maciej W. Rozycki 1 sibling, 1 reply; 12+ messages in thread From: duziming @ 2026-01-08 11:49 UTC (permalink / raw) To: David Laight; +Cc: bhelgaas, linux-pci, linux-kernel, liuyongqiang13 在 2026/1/8 16:56, David Laight 写道: > On Thu, 8 Jan 2026 09:59:44 +0800 > Ziming Du <duziming2@huawei.com> wrote: > >> From: Yongqiang Liu <liuyongqiang13@huawei.com> >> >> Unaligned access is harmful for non-x86 archs such as arm64. When we >> use pwrite or pread to access the I/O port resources with unaligned >> offset, system will crash as follows: >> >> Unable to handle kernel paging request at virtual address fffffbfffe8010c1 >> Internal error: Oops: 0000000096000061 [#1] SMP >> Call trace: >> _outw include/asm-generic/io.h:594 [inline] >> logic_outw+0x54/0x218 lib/logic_pio.c:305 >> pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline] >> pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline] >> pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181 >> sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158 >> kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338 >> vfs_write+0x7bc/0xac8 fs/read_write.c:586 >> ksys_write+0x12c/0x270 fs/read_write.c:639 >> __arm64_sys_write+0x78/0xb8 fs/read_write.c:648 >> >> Powerpc seems affected as well, so prohibit the unaligned access >> on non-x86 archs. > I'm not sure it makes any real sense for x86 either. > IIRC io space is just like memory space, so a 16bit io access looks the > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on > addresses 0 and 1 so the code could use 16bit io accesses to speed things up). > The same will have applied to misaligned accesses. > But, in reality, all device registers are aligned. > > I'm not sure EFAULT is the best error code though, EINVAL might be better. > (EINVAL is returned for other address/size errors.) > EFAULT is usually returned for errors accessing the user buffer, a least > one unix system raises SIGSEGV whenever EFAULT is returned. > > David Just to confirm: should all architectures prohibit unaligned access to device registers? >> Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources") >> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> >> Signed-off-by: Ziming Du <duziming2@huawei.com> >> --- >> drivers/pci/pci-sysfs.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c >> index 7e697b82c5e1..11d8b7ec4263 100644 >> --- a/drivers/pci/pci-sysfs.c >> +++ b/drivers/pci/pci-sysfs.c >> @@ -31,6 +31,7 @@ >> #include <linux/of.h> >> #include <linux/aperture.h> >> #include <linux/unaligned.h> >> +#include <linux/align.h> >> #include "pci.h" >> >> #ifndef ARCH_PCI_DEV_GROUPS >> @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj, >> *(u8 *)buf = inb(port); >> return 1; >> case 2: >> + #if !defined(CONFIG_X86) >> + if (!IS_ALIGNED(port, count)) >> + return -EFAULT; >> + #endif >> if (write) >> outw(*(u16 *)buf, port); >> else >> *(u16 *)buf = inw(port); >> return 2; >> case 4: >> + #if !defined(CONFIG_X86) >> + if (!IS_ALIGNED(port, count)) >> + return -EFAULT; >> + #endif >> if (write) >> outl(*(u32 *)buf, port); >> else ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 2026-01-08 11:49 ` duziming @ 2026-01-09 7:21 ` Ilpo Järvinen 2026-01-09 9:40 ` duziming 0 siblings, 1 reply; 12+ messages in thread From: Ilpo Järvinen @ 2026-01-09 7:21 UTC (permalink / raw) To: duziming; +Cc: David Laight, bhelgaas, linux-pci, LKML, liuyongqiang13 [-- Attachment #1: Type: text/plain, Size: 3545 bytes --] On Thu, 8 Jan 2026, duziming wrote: > > 在 2026/1/8 16:56, David Laight 写道: > > On Thu, 8 Jan 2026 09:59:44 +0800 > > Ziming Du <duziming2@huawei.com> wrote: > > > > > From: Yongqiang Liu <liuyongqiang13@huawei.com> > > > > > > Unaligned access is harmful for non-x86 archs such as arm64. When we > > > use pwrite or pread to access the I/O port resources with unaligned > > > offset, system will crash as follows: > > > > > > Unable to handle kernel paging request at virtual address fffffbfffe8010c1 > > > Internal error: Oops: 0000000096000061 [#1] SMP > > > Call trace: > > > _outw include/asm-generic/io.h:594 [inline] > > > logic_outw+0x54/0x218 lib/logic_pio.c:305 > > > pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline] > > > pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline] > > > pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181 > > > sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158 > > > kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338 > > > vfs_write+0x7bc/0xac8 fs/read_write.c:586 > > > ksys_write+0x12c/0x270 fs/read_write.c:639 > > > __arm64_sys_write+0x78/0xb8 fs/read_write.c:648 > > > > > > Powerpc seems affected as well, so prohibit the unaligned access > > > on non-x86 archs. > > I'm not sure it makes any real sense for x86 either. > > IIRC io space is just like memory space, so a 16bit io access looks the > > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on > > addresses 0 and 1 so the code could use 16bit io accesses to speed things > > up). > > The same will have applied to misaligned accesses. > > But, in reality, all device registers are aligned. > > > > I'm not sure EFAULT is the best error code though, EINVAL might be better. > > (EINVAL is returned for other address/size errors.) > > EFAULT is usually returned for errors accessing the user buffer, a least > > one unix system raises SIGSEGV whenever EFAULT is returned. > > > Just to confirm: should all architectures prohibit unaligned access to device > registers? In my opinion, yes, also x86 should prohibit it (like I already expressed but you ignored that comment until now). -- i. > > > Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port > > > resources") > > > Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> > > > Signed-off-by: Ziming Du <duziming2@huawei.com> > > > --- > > > drivers/pci/pci-sysfs.c | 9 +++++++++ > > > 1 file changed, 9 insertions(+) > > > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > > index 7e697b82c5e1..11d8b7ec4263 100644 > > > --- a/drivers/pci/pci-sysfs.c > > > +++ b/drivers/pci/pci-sysfs.c > > > @@ -31,6 +31,7 @@ > > > #include <linux/of.h> > > > #include <linux/aperture.h> > > > #include <linux/unaligned.h> > > > +#include <linux/align.h> > > > #include "pci.h" > > > #ifndef ARCH_PCI_DEV_GROUPS > > > @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, > > > struct kobject *kobj, > > > *(u8 *)buf = inb(port); > > > return 1; > > > case 2: > > > + #if !defined(CONFIG_X86) > > > + if (!IS_ALIGNED(port, count)) > > > + return -EFAULT; > > > + #endif > > > if (write) > > > outw(*(u16 *)buf, port); > > > else > > > *(u16 *)buf = inw(port); > > > return 2; > > > case 4: > > > + #if !defined(CONFIG_X86) > > > + if (!IS_ALIGNED(port, count)) > > > + return -EFAULT; > > > + #endif > > > if (write) > > > outl(*(u32 *)buf, port); > > > else > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 2026-01-09 7:21 ` Ilpo Järvinen @ 2026-01-09 9:40 ` duziming 0 siblings, 0 replies; 12+ messages in thread From: duziming @ 2026-01-09 9:40 UTC (permalink / raw) To: Ilpo Järvinen Cc: David Laight, bhelgaas, linux-pci, LKML, liuyongqiang13 在 2026/1/9 15:21, Ilpo Järvinen 写道: > On Thu, 8 Jan 2026, duziming wrote: > >> 在 2026/1/8 16:56, David Laight 写道: >>> On Thu, 8 Jan 2026 09:59:44 +0800 >>> Ziming Du <duziming2@huawei.com> wrote: >>> >>>> From: Yongqiang Liu <liuyongqiang13@huawei.com> >>>> >>>> Unaligned access is harmful for non-x86 archs such as arm64. When we >>>> use pwrite or pread to access the I/O port resources with unaligned >>>> offset, system will crash as follows: >>>> >>>> Unable to handle kernel paging request at virtual address fffffbfffe8010c1 >>>> Internal error: Oops: 0000000096000061 [#1] SMP >>>> Call trace: >>>> _outw include/asm-generic/io.h:594 [inline] >>>> logic_outw+0x54/0x218 lib/logic_pio.c:305 >>>> pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline] >>>> pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline] >>>> pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181 >>>> sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158 >>>> kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338 >>>> vfs_write+0x7bc/0xac8 fs/read_write.c:586 >>>> ksys_write+0x12c/0x270 fs/read_write.c:639 >>>> __arm64_sys_write+0x78/0xb8 fs/read_write.c:648 >>>> >>>> Powerpc seems affected as well, so prohibit the unaligned access >>>> on non-x86 archs. >>> I'm not sure it makes any real sense for x86 either. >>> IIRC io space is just like memory space, so a 16bit io access looks the >>> same as two 8bit accesses to an 8bit device (some put the 'data fifo' on >>> addresses 0 and 1 so the code could use 16bit io accesses to speed things >>> up). >>> The same will have applied to misaligned accesses. >>> But, in reality, all device registers are aligned. >>> >>> I'm not sure EFAULT is the best error code though, EINVAL might be better. >>> (EINVAL is returned for other address/size errors.) >>> EFAULT is usually returned for errors accessing the user buffer, a least >>> one unix system raises SIGSEGV whenever EFAULT is returned. >>> >> Just to confirm: should all architectures prohibit unaligned access to device >> registers? > In my opinion, yes, also x86 should prohibit it (like I already > expressed but you ignored that comment until now). Oops, I didn’t quite understand your opinion earlier :( > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 2026-01-08 8:56 ` David Laight 2026-01-08 11:49 ` duziming @ 2026-01-09 0:38 ` Maciej W. Rozycki 2026-01-09 12:53 ` David Laight 1 sibling, 1 reply; 12+ messages in thread From: Maciej W. Rozycki @ 2026-01-09 0:38 UTC (permalink / raw) To: David Laight Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13 On Thu, 8 Jan 2026, David Laight wrote: > I'm not sure it makes any real sense for x86 either. FWIW I agree. > IIRC io space is just like memory space, so a 16bit io access looks the > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on > addresses 0 and 1 so the code could use 16bit io accesses to speed things up). Huh? A 16-bit port I/O access will have the byte enables set accordingly on PCI and the target device's data lines are driven accordingly in the data cycle. Just as with MMIO; it's just a different bus command (or TLP type for PCIe). There's no data FIFO or anything as exotic in normal hardware to drive or collect data for port I/O accesses wider than 8 bits. Some peripheral hardware may ignore byte enables though to simplify logic and e.g. assume that all port I/O or MMIO accesses are of a certain width, such as 16-bit or 32-bit. > The same will have applied to misaligned accesses. Misaligned accesses may or may not have to be split depending on whether they span the data bus width boundary or not. E.g. a 16-bit access to port I/O location 1 won't be split on 32-bit PCI as it fits on the bus: byte enables #1 and #2 will be driven active and byte enables #0 and #3 will be left inactive. Conversely such an access to location 3 needs to be split into two cycles, with byte enables #3 and #0 only driven active respectively in the first and the second cycle. The x86 BIU will do the split automatically for port I/O instructions as will some other CPU architectures that use memory access instructions to reach the PCI port I/O decoding window in their memory address space (this is a simplified view, as the split may have to be done in the chipset when passing the boundary between data buses of a different width each). With other architectures such as MIPS designated instructions need to be used to drive the byte enables by hand for individual partial accesses in a split access, and the remaining architectures cannot drive some of the byte-enable patterns needed for such split accesses at all (and do masking in software instead for unaligned accesses to regular memory). > But, in reality, all device registers are aligned. True, sometimes beyond their width too. Maciej ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 2026-01-09 0:38 ` Maciej W. Rozycki @ 2026-01-09 12:53 ` David Laight 2026-01-09 15:25 ` Maciej W. Rozycki 0 siblings, 1 reply; 12+ messages in thread From: David Laight @ 2026-01-09 12:53 UTC (permalink / raw) To: Maciej W. Rozycki Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13 On Fri, 9 Jan 2026 00:38:05 +0000 (GMT) "Maciej W. Rozycki" <macro@orcam.me.uk> wrote: > On Thu, 8 Jan 2026, David Laight wrote: > > > I'm not sure it makes any real sense for x86 either. > > FWIW I agree. The interface could have allowed arbitrary transfers and split them into aligned bus cycles. That would let you hexdump an io bar (useful for diagnostics, but some reads end up being destructive - caveat emptor). But it doesn't.... It is also of limited use because (IIRC) you can't access the device this way once a driver has mapped the bar. The driver can, of course, support applications using mmap() to directly access device memory over PCIe. (The difficulty is mmap() of kernel memory allocated with dma_alloc_coherent().) > > IIRC io space is just like memory space, so a 16bit io access looks the > > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on > > addresses 0 and 1 so the code could use 16bit io accesses to speed things up). > > Huh? A 16-bit port I/O access will have the byte enables set accordingly > on PCI and the target device's data lines are driven accordingly in the > data cycle. Just as with MMIO; it's just a different bus command (or TLP > type for PCIe). I was going back to the historic implementations - like 8086 and 8088. For pcie I know it is all different and the cpu just generates read/write TLP with the required byte enables and the tlp type marked 'io'. (I can't remember whether IO writes end up 'not posted' - failed to find it in a 5cm thick book on my shelf.) > There's no data FIFO or anything as exotic in normal hardware to drive or > collect data for port I/O accesses wider than 8 bits. Some peripheral > hardware may ignore byte enables though to simplify logic and e.g. assume > that all port I/O or MMIO accesses are of a certain width, such as 16-bit > or 32-bit. Not to mention an fpga fabric that converted the TLP for an (aligned) 32bit access into a pair of 32bit accesses - one of which had no byte enables set! Confused the hw engineers who had ignored the byte enables... > > The same will have applied to misaligned accesses. > > Misaligned accesses may or may not have to be split depending on whether > they span the data bus width boundary or not. E.g. a 16-bit access to > port I/O location 1 won't be split on 32-bit PCI as it fits on the bus: > byte enables #1 and #2 will be driven active and byte enables #0 and #3 > will be left inactive. Conversely such an access to location 3 needs to > be split into two cycles, with byte enables #3 and #0 only driven active > respectively in the first and the second cycle. And on PCIe (which is 64bit) a misaligned transfer that crosses a 64bit boundary generates a single 16 byte TLP (not sure about page boundaries). > The x86 BIU will do the split automatically for port I/O instructions as > will some other CPU architectures that use memory access instructions to > reach the PCI port I/O decoding window in their memory address space (this > is a simplified view, as the split may have to be done in the chipset when > passing the boundary between data buses of a different width each). Yes, and I don't understand the HAS_IOPORT option. Pretty much only x86 has separate instructions, but a lot of others will have PCI/PCIe interface logic that can convert cpu memory accesses into 'io' accesses - so the pci_map_bar() should be able to transparently map an io bar into kernel address space. So x86 should be the outlier because it can't do that! Even the strongarm system I used years ago has an address window that generated 'io' cycles on a pcmcia bus. I think a host PCI/PCIe interface could do io accesses for the bottom 64k of its memory window - but I don't know any that work that way. > > With other architectures such as MIPS designated instructions need to be > used to drive the byte enables by hand for individual partial accesses in > a split access, and the remaining architectures cannot drive some of the > byte-enable patterns needed for such split accesses at all (and do masking > in software instead for unaligned accesses to regular memory). > > > But, in reality, all device registers are aligned. > > True, sometimes beyond their width too. Which means you don't want the multiple cycles that happen when someone marks a structure as 'packed' even though it is completely aligned. David > > Maciej > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 2026-01-09 12:53 ` David Laight @ 2026-01-09 15:25 ` Maciej W. Rozycki 0 siblings, 0 replies; 12+ messages in thread From: Maciej W. Rozycki @ 2026-01-09 15:25 UTC (permalink / raw) To: David Laight Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13 On Fri, 9 Jan 2026, David Laight wrote: > > > IIRC io space is just like memory space, so a 16bit io access looks the > > > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on > > > addresses 0 and 1 so the code could use 16bit io accesses to speed things up). > > > > Huh? A 16-bit port I/O access will have the byte enables set accordingly > > on PCI and the target device's data lines are driven accordingly in the > > data cycle. Just as with MMIO; it's just a different bus command (or TLP > > type for PCIe). > > I was going back to the historic implementations - like 8086 and 8088. Historical systems had various odd bus designs that are out of scope for our consideration. Even the 80386, but that was never interfaced to PCI, perhaps for this very reason. > > Misaligned accesses may or may not have to be split depending on whether > > they span the data bus width boundary or not. E.g. a 16-bit access to > > port I/O location 1 won't be split on 32-bit PCI as it fits on the bus: > > byte enables #1 and #2 will be driven active and byte enables #0 and #3 > > will be left inactive. Conversely such an access to location 3 needs to > > be split into two cycles, with byte enables #3 and #0 only driven active > > respectively in the first and the second cycle. > > And on PCIe (which is 64bit) a misaligned transfer that crosses a 64bit > boundary generates a single 16 byte TLP (not sure about page boundaries). Which is also what is covered (without diving into exact details) by my observation that a transfer may have to be split downstream, e.g. by a PCIe-to-PCI bridge, or internally by the target device. > > The x86 BIU will do the split automatically for port I/O instructions as > > will some other CPU architectures that use memory access instructions to > > reach the PCI port I/O decoding window in their memory address space (this > > is a simplified view, as the split may have to be done in the chipset when > > passing the boundary between data buses of a different width each). > > Yes, and I don't understand the HAS_IOPORT option. > Pretty much only x86 has separate instructions, but a lot of others will > have PCI/PCIe interface logic that can convert cpu memory accesses into > 'io' accesses - so the pci_map_bar() should be able to transparently map > an io bar into kernel address space. > So x86 should be the outlier because it can't do that! > > Even the strongarm system I used years ago has an address window that > generated 'io' cycles on a pcmcia bus. > > I think a host PCI/PCIe interface could do io accesses for the bottom > 64k of its memory window - but I don't know any that work that way. Only x86 (and its predecessors such as 8080, Z80, etc.) has the port I/O space defined at the CPU bus level. All the other architectures do define the port I/O space, but only in the chipset at the PCI/e bus/interconnect level where implemented. It is where the HAS_IOPORT option is concerned. Several contemporary systems have no way to produce port I/O cycles on PCIe owing to the lack of support for the required TLP types in the host bridge. There's simply no way to produce a transaction that would match an I/O BAR in a target device. I own such a system myself, it's a POWER9 machine[1][2]. I'm told numerous ARM systems also have such a limitation. FWIW port I/O TLP types have been deprecated from the beginning of PCIe. References: [1] "Power Systems Host Bridge 4 (PHB4) Specification", Version 1.0, International Business Machines Corporation, 27 July 2018, Table 3-2. "PCIe TLP command summary", p. 29. [2] "pcmcia: add HAS_IOPORT dependencies", <https://lore.kernel.org/r/alpine.DEB.2.21.2205041311280.9548@angie.orcam.me.uk/>. Maciej ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-01-09 15:25 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-01-08 1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du 2026-01-08 1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du 2026-01-08 1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du 2026-01-08 11:35 ` Ilpo Järvinen 2026-01-08 1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du 2026-01-08 8:56 ` David Laight 2026-01-08 11:49 ` duziming 2026-01-09 7:21 ` Ilpo Järvinen 2026-01-09 9:40 ` duziming 2026-01-09 0:38 ` Maciej W. Rozycki 2026-01-09 12:53 ` David Laight 2026-01-09 15:25 ` Maciej W. Rozycki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox