* [PATCH v3 0/3] Miscellaneous fixes for pci subsystem
@ 2026-01-08 1:59 Ziming Du
2026-01-08 1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Ziming Du @ 2026-01-08 1:59 UTC (permalink / raw)
To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2
Miscellaneous fixes for pci subsystem
Changes in v3:
- Check *ppos before assign it to pos.
- Link to v2: https://lore.kernel.org/linux-pci/20251224092721.2034529-1-duziming2@huawei.com/
Changes in v2:
- Correct grammer and indentation.
- Remove unrelated stack traces from the commit message.
- Modify the handling of pos by adding a non-negative check to ensure
that the input value is valid.
- Use the existing IS_ALIGNED macro and ensure that after modification,
other cases still retuen -EINVAL as before.
- Link to v1: https://lore.kernel.org/linux-pci/20251216083912.758219-1-duziming2@huawei.com/
Yongqiang Liu (2):
PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
PCI: Prevent overflow in proc_bus_pci_write()
Ziming Du (1):
PCI/sysfs: Fix null pointer dereference during hotplug
drivers/pci/pci-sysfs.c | 11 +++++++++++
drivers/pci/proc.c | 6 +++++-
2 files changed, 16 insertions(+), 1 deletion(-)
--
2.43.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug
2026-01-08 1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du
@ 2026-01-08 1:59 ` Ziming Du
2026-01-08 1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
2026-01-08 1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du
2 siblings, 0 replies; 12+ messages in thread
From: Ziming Du @ 2026-01-08 1:59 UTC (permalink / raw)
To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2
During the concurrent process of creating and rescanning in VF, the
resource files for the same pci_dev may be created twice. The second
creation attempt fails, resulting the res_attr in pci_dev to kfree(),
but the pointer is not set to NULL. This will subsequently lead to
dereferencing a null pointer when removing the device.
When we perform the following operation:
echo $sriov_totalvfs > /sys/class/net/"$pfname"/device/sriov_numvfs &
sleep 0.5
echo 1 > /sys/bus/pci/rescan
pci_remove "$pfname"
system will crash as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Call trace:
__pi_strlen+0x14/0x150
kernfs_find_ns+0x54/0x120
kernfs_remove_by_name_ns+0x58/0xf0
sysfs_remove_bin_file+0x24/0x38
pci_remove_resource_files+0x44/0x90
pci_remove_sysfs_dev_files+0x28/0x40
pci_stop_bus_device+0xb8/0x118
pci_stop_and_remove_bus_device+0x20/0x40
pci_iov_remove_virtfn+0xb8/0x138
sriov_disable+0xbc/0x190
pci_disable_sriov+0x30/0x48
hinic_pci_sriov_disable+0x54/0x138 [hinic]
hinic_remove+0x140/0x290 [hinic]
pci_device_remove+0x4c/0xf8
device_remove+0x54/0x90
device_release_driver_internal+0x1d4/0x238
device_release_driver+0x20/0x38
pci_stop_bus_device+0xa8/0x118
pci_stop_and_remove_bus_device_locked+0x28/0x50
remove_store+0x128/0x208
Fix this by set the pointer to NULL after releasing 'res_attr' immediately.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
drivers/pci/pci-sysfs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c2df915ad2d2..7e697b82c5e1 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1222,12 +1222,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
if (res_attr) {
sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
kfree(res_attr);
+ pdev->res_attr[i] = NULL;
}
res_attr = pdev->res_attr_wc[i];
if (res_attr) {
sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
kfree(res_attr);
+ pdev->res_attr_wc[i] = NULL;
}
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write()
2026-01-08 1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du
2026-01-08 1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
@ 2026-01-08 1:59 ` Ziming Du
2026-01-08 11:35 ` Ilpo Järvinen
2026-01-08 1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du
2 siblings, 1 reply; 12+ messages in thread
From: Ziming Du @ 2026-01-08 1:59 UTC (permalink / raw)
To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2
From: Yongqiang Liu <liuyongqiang13@huawei.com>
When the value of *ppos over the INT_MAX, the pos is over set to a
negative value which will be passed to get_user() or
pci_user_write_config_dword(). Unexpected behavior such as a soft lockup
will happen as follows:
watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [syz.3.109:3444]
RIP: 0010:_raw_spin_unlock_irq+0x17/0x30
Call Trace:
<TASK>
pci_user_write_config_dword+0x126/0x1f0
proc_bus_pci_write+0x273/0x470
proc_reg_write+0x1b6/0x280
do_iter_write+0x48e/0x790
vfs_writev+0x125/0x4a0
__x64_sys_pwritev+0x1e2/0x2a0
do_syscall_64+0x59/0x110
entry_SYSCALL_64_after_hwframe+0x78/0xe2
Fix this by using unsigned int for the pos.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
drivers/pci/proc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 9348a0fb8084..2d51b26edbe7 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -113,10 +113,14 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf,
{
struct inode *ino = file_inode(file);
struct pci_dev *dev = pde_data(ino);
- int pos = *ppos;
+ int pos;
int size = dev->cfg_size;
int cnt, ret;
+ if (*ppos > INT_MAX)
+ return -EINVAL;
+ pos = *ppos;
+
ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
if (ret)
return ret;
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
2026-01-08 1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du
2026-01-08 1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
2026-01-08 1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
@ 2026-01-08 1:59 ` Ziming Du
2026-01-08 8:56 ` David Laight
2 siblings, 1 reply; 12+ messages in thread
From: Ziming Du @ 2026-01-08 1:59 UTC (permalink / raw)
To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2
From: Yongqiang Liu <liuyongqiang13@huawei.com>
Unaligned access is harmful for non-x86 archs such as arm64. When we
use pwrite or pread to access the I/O port resources with unaligned
offset, system will crash as follows:
Unable to handle kernel paging request at virtual address fffffbfffe8010c1
Internal error: Oops: 0000000096000061 [#1] SMP
Call trace:
_outw include/asm-generic/io.h:594 [inline]
logic_outw+0x54/0x218 lib/logic_pio.c:305
pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
vfs_write+0x7bc/0xac8 fs/read_write.c:586
ksys_write+0x12c/0x270 fs/read_write.c:639
__arm64_sys_write+0x78/0xb8 fs/read_write.c:648
Powerpc seems affected as well, so prohibit the unaligned access
on non-x86 archs.
Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
drivers/pci/pci-sysfs.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7e697b82c5e1..11d8b7ec4263 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -31,6 +31,7 @@
#include <linux/of.h>
#include <linux/aperture.h>
#include <linux/unaligned.h>
+#include <linux/align.h>
#include "pci.h"
#ifndef ARCH_PCI_DEV_GROUPS
@@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj,
*(u8 *)buf = inb(port);
return 1;
case 2:
+ #if !defined(CONFIG_X86)
+ if (!IS_ALIGNED(port, count))
+ return -EFAULT;
+ #endif
if (write)
outw(*(u16 *)buf, port);
else
*(u16 *)buf = inw(port);
return 2;
case 4:
+ #if !defined(CONFIG_X86)
+ if (!IS_ALIGNED(port, count))
+ return -EFAULT;
+ #endif
if (write)
outl(*(u32 *)buf, port);
else
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
2026-01-08 1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du
@ 2026-01-08 8:56 ` David Laight
2026-01-08 11:49 ` duziming
2026-01-09 0:38 ` Maciej W. Rozycki
0 siblings, 2 replies; 12+ messages in thread
From: David Laight @ 2026-01-08 8:56 UTC (permalink / raw)
To: Ziming Du; +Cc: bhelgaas, linux-pci, linux-kernel, liuyongqiang13
On Thu, 8 Jan 2026 09:59:44 +0800
Ziming Du <duziming2@huawei.com> wrote:
> From: Yongqiang Liu <liuyongqiang13@huawei.com>
>
> Unaligned access is harmful for non-x86 archs such as arm64. When we
> use pwrite or pread to access the I/O port resources with unaligned
> offset, system will crash as follows:
>
> Unable to handle kernel paging request at virtual address fffffbfffe8010c1
> Internal error: Oops: 0000000096000061 [#1] SMP
> Call trace:
> _outw include/asm-generic/io.h:594 [inline]
> logic_outw+0x54/0x218 lib/logic_pio.c:305
> pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
> pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
> pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
> sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
> kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
> vfs_write+0x7bc/0xac8 fs/read_write.c:586
> ksys_write+0x12c/0x270 fs/read_write.c:639
> __arm64_sys_write+0x78/0xb8 fs/read_write.c:648
>
> Powerpc seems affected as well, so prohibit the unaligned access
> on non-x86 archs.
I'm not sure it makes any real sense for x86 either.
IIRC io space is just like memory space, so a 16bit io access looks the
same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
addresses 0 and 1 so the code could use 16bit io accesses to speed things up).
The same will have applied to misaligned accesses.
But, in reality, all device registers are aligned.
I'm not sure EFAULT is the best error code though, EINVAL might be better.
(EINVAL is returned for other address/size errors.)
EFAULT is usually returned for errors accessing the user buffer, a least
one unix system raises SIGSEGV whenever EFAULT is returned.
David
>
> Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources")
> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> Signed-off-by: Ziming Du <duziming2@huawei.com>
> ---
> drivers/pci/pci-sysfs.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 7e697b82c5e1..11d8b7ec4263 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -31,6 +31,7 @@
> #include <linux/of.h>
> #include <linux/aperture.h>
> #include <linux/unaligned.h>
> +#include <linux/align.h>
> #include "pci.h"
>
> #ifndef ARCH_PCI_DEV_GROUPS
> @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj,
> *(u8 *)buf = inb(port);
> return 1;
> case 2:
> + #if !defined(CONFIG_X86)
> + if (!IS_ALIGNED(port, count))
> + return -EFAULT;
> + #endif
> if (write)
> outw(*(u16 *)buf, port);
> else
> *(u16 *)buf = inw(port);
> return 2;
> case 4:
> + #if !defined(CONFIG_X86)
> + if (!IS_ALIGNED(port, count))
> + return -EFAULT;
> + #endif
> if (write)
> outl(*(u32 *)buf, port);
> else
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write()
2026-01-08 1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
@ 2026-01-08 11:35 ` Ilpo Järvinen
0 siblings, 0 replies; 12+ messages in thread
From: Ilpo Järvinen @ 2026-01-08 11:35 UTC (permalink / raw)
To: Ziming Du; +Cc: bhelgaas, linux-pci, LKML, liuyongqiang13
[-- Attachment #1: Type: text/plain, Size: 2037 bytes --]
On Thu, 8 Jan 2026, Ziming Du wrote:
> From: Yongqiang Liu <liuyongqiang13@huawei.com>
>
> When the value of *ppos over the INT_MAX, the pos is over set to a
> negative value which will be passed to get_user() or
> pci_user_write_config_dword(). Unexpected behavior such as a soft lockup
> will happen as follows:
>
> watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [syz.3.109:3444]
> RIP: 0010:_raw_spin_unlock_irq+0x17/0x30
> Call Trace:
> <TASK>
> pci_user_write_config_dword+0x126/0x1f0
> proc_bus_pci_write+0x273/0x470
> proc_reg_write+0x1b6/0x280
> do_iter_write+0x48e/0x790
> vfs_writev+0x125/0x4a0
> __x64_sys_pwritev+0x1e2/0x2a0
> do_syscall_64+0x59/0x110
> entry_SYSCALL_64_after_hwframe+0x78/0xe2
>
> Fix this by using unsigned int for the pos.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> Signed-off-by: Ziming Du <duziming2@huawei.com>
> ---
> drivers/pci/proc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 9348a0fb8084..2d51b26edbe7 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -113,10 +113,14 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf,
> {
> struct inode *ino = file_inode(file);
> struct pci_dev *dev = pde_data(ino);
> - int pos = *ppos;
> + int pos;
> int size = dev->cfg_size;
> int cnt, ret;
>
> + if (*ppos > INT_MAX)
> + return -EINVAL;
> + pos = *ppos;
> +
> ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
> if (ret)
> return ret;
>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
With the note that proc_bus_pci_read() and proc_bus_pci_write() diverge in
handling > INT_MAX values and that feels unjustified (but there's not this
same problem on the read side I guess so if the read side is made the same
as the write side, it would be better to do that in another patch).
--
i.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
2026-01-08 8:56 ` David Laight
@ 2026-01-08 11:49 ` duziming
2026-01-09 7:21 ` Ilpo Järvinen
2026-01-09 0:38 ` Maciej W. Rozycki
1 sibling, 1 reply; 12+ messages in thread
From: duziming @ 2026-01-08 11:49 UTC (permalink / raw)
To: David Laight; +Cc: bhelgaas, linux-pci, linux-kernel, liuyongqiang13
在 2026/1/8 16:56, David Laight 写道:
> On Thu, 8 Jan 2026 09:59:44 +0800
> Ziming Du <duziming2@huawei.com> wrote:
>
>> From: Yongqiang Liu <liuyongqiang13@huawei.com>
>>
>> Unaligned access is harmful for non-x86 archs such as arm64. When we
>> use pwrite or pread to access the I/O port resources with unaligned
>> offset, system will crash as follows:
>>
>> Unable to handle kernel paging request at virtual address fffffbfffe8010c1
>> Internal error: Oops: 0000000096000061 [#1] SMP
>> Call trace:
>> _outw include/asm-generic/io.h:594 [inline]
>> logic_outw+0x54/0x218 lib/logic_pio.c:305
>> pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
>> pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
>> pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
>> sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
>> kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
>> vfs_write+0x7bc/0xac8 fs/read_write.c:586
>> ksys_write+0x12c/0x270 fs/read_write.c:639
>> __arm64_sys_write+0x78/0xb8 fs/read_write.c:648
>>
>> Powerpc seems affected as well, so prohibit the unaligned access
>> on non-x86 archs.
> I'm not sure it makes any real sense for x86 either.
> IIRC io space is just like memory space, so a 16bit io access looks the
> same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> addresses 0 and 1 so the code could use 16bit io accesses to speed things up).
> The same will have applied to misaligned accesses.
> But, in reality, all device registers are aligned.
>
> I'm not sure EFAULT is the best error code though, EINVAL might be better.
> (EINVAL is returned for other address/size errors.)
> EFAULT is usually returned for errors accessing the user buffer, a least
> one unix system raises SIGSEGV whenever EFAULT is returned.
>
> David
Just to confirm: should all architectures prohibit unaligned access to
device registers?
>> Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources")
>> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
>> Signed-off-by: Ziming Du <duziming2@huawei.com>
>> ---
>> drivers/pci/pci-sysfs.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 7e697b82c5e1..11d8b7ec4263 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -31,6 +31,7 @@
>> #include <linux/of.h>
>> #include <linux/aperture.h>
>> #include <linux/unaligned.h>
>> +#include <linux/align.h>
>> #include "pci.h"
>>
>> #ifndef ARCH_PCI_DEV_GROUPS
>> @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj,
>> *(u8 *)buf = inb(port);
>> return 1;
>> case 2:
>> + #if !defined(CONFIG_X86)
>> + if (!IS_ALIGNED(port, count))
>> + return -EFAULT;
>> + #endif
>> if (write)
>> outw(*(u16 *)buf, port);
>> else
>> *(u16 *)buf = inw(port);
>> return 2;
>> case 4:
>> + #if !defined(CONFIG_X86)
>> + if (!IS_ALIGNED(port, count))
>> + return -EFAULT;
>> + #endif
>> if (write)
>> outl(*(u32 *)buf, port);
>> else
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
2026-01-08 8:56 ` David Laight
2026-01-08 11:49 ` duziming
@ 2026-01-09 0:38 ` Maciej W. Rozycki
2026-01-09 12:53 ` David Laight
1 sibling, 1 reply; 12+ messages in thread
From: Maciej W. Rozycki @ 2026-01-09 0:38 UTC (permalink / raw)
To: David Laight
Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13
On Thu, 8 Jan 2026, David Laight wrote:
> I'm not sure it makes any real sense for x86 either.
FWIW I agree.
> IIRC io space is just like memory space, so a 16bit io access looks the
> same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> addresses 0 and 1 so the code could use 16bit io accesses to speed things up).
Huh? A 16-bit port I/O access will have the byte enables set accordingly
on PCI and the target device's data lines are driven accordingly in the
data cycle. Just as with MMIO; it's just a different bus command (or TLP
type for PCIe).
There's no data FIFO or anything as exotic in normal hardware to drive or
collect data for port I/O accesses wider than 8 bits. Some peripheral
hardware may ignore byte enables though to simplify logic and e.g. assume
that all port I/O or MMIO accesses are of a certain width, such as 16-bit
or 32-bit.
> The same will have applied to misaligned accesses.
Misaligned accesses may or may not have to be split depending on whether
they span the data bus width boundary or not. E.g. a 16-bit access to
port I/O location 1 won't be split on 32-bit PCI as it fits on the bus:
byte enables #1 and #2 will be driven active and byte enables #0 and #3
will be left inactive. Conversely such an access to location 3 needs to
be split into two cycles, with byte enables #3 and #0 only driven active
respectively in the first and the second cycle.
The x86 BIU will do the split automatically for port I/O instructions as
will some other CPU architectures that use memory access instructions to
reach the PCI port I/O decoding window in their memory address space (this
is a simplified view, as the split may have to be done in the chipset when
passing the boundary between data buses of a different width each).
With other architectures such as MIPS designated instructions need to be
used to drive the byte enables by hand for individual partial accesses in
a split access, and the remaining architectures cannot drive some of the
byte-enable patterns needed for such split accesses at all (and do masking
in software instead for unaligned accesses to regular memory).
> But, in reality, all device registers are aligned.
True, sometimes beyond their width too.
Maciej
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
2026-01-08 11:49 ` duziming
@ 2026-01-09 7:21 ` Ilpo Järvinen
2026-01-09 9:40 ` duziming
0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2026-01-09 7:21 UTC (permalink / raw)
To: duziming; +Cc: David Laight, bhelgaas, linux-pci, LKML, liuyongqiang13
[-- Attachment #1: Type: text/plain, Size: 3545 bytes --]
On Thu, 8 Jan 2026, duziming wrote:
>
> 在 2026/1/8 16:56, David Laight 写道:
> > On Thu, 8 Jan 2026 09:59:44 +0800
> > Ziming Du <duziming2@huawei.com> wrote:
> >
> > > From: Yongqiang Liu <liuyongqiang13@huawei.com>
> > >
> > > Unaligned access is harmful for non-x86 archs such as arm64. When we
> > > use pwrite or pread to access the I/O port resources with unaligned
> > > offset, system will crash as follows:
> > >
> > > Unable to handle kernel paging request at virtual address fffffbfffe8010c1
> > > Internal error: Oops: 0000000096000061 [#1] SMP
> > > Call trace:
> > > _outw include/asm-generic/io.h:594 [inline]
> > > logic_outw+0x54/0x218 lib/logic_pio.c:305
> > > pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
> > > pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
> > > pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
> > > sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
> > > kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
> > > vfs_write+0x7bc/0xac8 fs/read_write.c:586
> > > ksys_write+0x12c/0x270 fs/read_write.c:639
> > > __arm64_sys_write+0x78/0xb8 fs/read_write.c:648
> > >
> > > Powerpc seems affected as well, so prohibit the unaligned access
> > > on non-x86 archs.
> > I'm not sure it makes any real sense for x86 either.
> > IIRC io space is just like memory space, so a 16bit io access looks the
> > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> > addresses 0 and 1 so the code could use 16bit io accesses to speed things
> > up).
> > The same will have applied to misaligned accesses.
> > But, in reality, all device registers are aligned.
> >
> > I'm not sure EFAULT is the best error code though, EINVAL might be better.
> > (EINVAL is returned for other address/size errors.)
> > EFAULT is usually returned for errors accessing the user buffer, a least
> > one unix system raises SIGSEGV whenever EFAULT is returned.
> >
> Just to confirm: should all architectures prohibit unaligned access to device
> registers?
In my opinion, yes, also x86 should prohibit it (like I already
expressed but you ignored that comment until now).
--
i.
> > > Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port
> > > resources")
> > > Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> > > Signed-off-by: Ziming Du <duziming2@huawei.com>
> > > ---
> > > drivers/pci/pci-sysfs.c | 9 +++++++++
> > > 1 file changed, 9 insertions(+)
> > >
> > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > > index 7e697b82c5e1..11d8b7ec4263 100644
> > > --- a/drivers/pci/pci-sysfs.c
> > > +++ b/drivers/pci/pci-sysfs.c
> > > @@ -31,6 +31,7 @@
> > > #include <linux/of.h>
> > > #include <linux/aperture.h>
> > > #include <linux/unaligned.h>
> > > +#include <linux/align.h>
> > > #include "pci.h"
> > > #ifndef ARCH_PCI_DEV_GROUPS
> > > @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp,
> > > struct kobject *kobj,
> > > *(u8 *)buf = inb(port);
> > > return 1;
> > > case 2:
> > > + #if !defined(CONFIG_X86)
> > > + if (!IS_ALIGNED(port, count))
> > > + return -EFAULT;
> > > + #endif
> > > if (write)
> > > outw(*(u16 *)buf, port);
> > > else
> > > *(u16 *)buf = inw(port);
> > > return 2;
> > > case 4:
> > > + #if !defined(CONFIG_X86)
> > > + if (!IS_ALIGNED(port, count))
> > > + return -EFAULT;
> > > + #endif
> > > if (write)
> > > outl(*(u32 *)buf, port);
> > > else
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
2026-01-09 7:21 ` Ilpo Järvinen
@ 2026-01-09 9:40 ` duziming
0 siblings, 0 replies; 12+ messages in thread
From: duziming @ 2026-01-09 9:40 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: David Laight, bhelgaas, linux-pci, LKML, liuyongqiang13
在 2026/1/9 15:21, Ilpo Järvinen 写道:
> On Thu, 8 Jan 2026, duziming wrote:
>
>> 在 2026/1/8 16:56, David Laight 写道:
>>> On Thu, 8 Jan 2026 09:59:44 +0800
>>> Ziming Du <duziming2@huawei.com> wrote:
>>>
>>>> From: Yongqiang Liu <liuyongqiang13@huawei.com>
>>>>
>>>> Unaligned access is harmful for non-x86 archs such as arm64. When we
>>>> use pwrite or pread to access the I/O port resources with unaligned
>>>> offset, system will crash as follows:
>>>>
>>>> Unable to handle kernel paging request at virtual address fffffbfffe8010c1
>>>> Internal error: Oops: 0000000096000061 [#1] SMP
>>>> Call trace:
>>>> _outw include/asm-generic/io.h:594 [inline]
>>>> logic_outw+0x54/0x218 lib/logic_pio.c:305
>>>> pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
>>>> pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
>>>> pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
>>>> sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
>>>> kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
>>>> vfs_write+0x7bc/0xac8 fs/read_write.c:586
>>>> ksys_write+0x12c/0x270 fs/read_write.c:639
>>>> __arm64_sys_write+0x78/0xb8 fs/read_write.c:648
>>>>
>>>> Powerpc seems affected as well, so prohibit the unaligned access
>>>> on non-x86 archs.
>>> I'm not sure it makes any real sense for x86 either.
>>> IIRC io space is just like memory space, so a 16bit io access looks the
>>> same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
>>> addresses 0 and 1 so the code could use 16bit io accesses to speed things
>>> up).
>>> The same will have applied to misaligned accesses.
>>> But, in reality, all device registers are aligned.
>>>
>>> I'm not sure EFAULT is the best error code though, EINVAL might be better.
>>> (EINVAL is returned for other address/size errors.)
>>> EFAULT is usually returned for errors accessing the user buffer, a least
>>> one unix system raises SIGSEGV whenever EFAULT is returned.
>>>
>> Just to confirm: should all architectures prohibit unaligned access to device
>> registers?
> In my opinion, yes, also x86 should prohibit it (like I already
> expressed but you ignored that comment until now).
Oops, I didn’t quite understand your opinion earlier :(
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
2026-01-09 0:38 ` Maciej W. Rozycki
@ 2026-01-09 12:53 ` David Laight
2026-01-09 15:25 ` Maciej W. Rozycki
0 siblings, 1 reply; 12+ messages in thread
From: David Laight @ 2026-01-09 12:53 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13
On Fri, 9 Jan 2026 00:38:05 +0000 (GMT)
"Maciej W. Rozycki" <macro@orcam.me.uk> wrote:
> On Thu, 8 Jan 2026, David Laight wrote:
>
> > I'm not sure it makes any real sense for x86 either.
>
> FWIW I agree.
The interface could have allowed arbitrary transfers and split them into
aligned bus cycles.
That would let you hexdump an io bar (useful for diagnostics, but some
reads end up being destructive - caveat emptor).
But it doesn't....
It is also of limited use because (IIRC) you can't access the device this
way once a driver has mapped the bar.
The driver can, of course, support applications using mmap() to directly
access device memory over PCIe.
(The difficulty is mmap() of kernel memory allocated with dma_alloc_coherent().)
> > IIRC io space is just like memory space, so a 16bit io access looks the
> > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> > addresses 0 and 1 so the code could use 16bit io accesses to speed things up).
>
> Huh? A 16-bit port I/O access will have the byte enables set accordingly
> on PCI and the target device's data lines are driven accordingly in the
> data cycle. Just as with MMIO; it's just a different bus command (or TLP
> type for PCIe).
I was going back to the historic implementations - like 8086 and 8088.
For pcie I know it is all different and the cpu just generates read/write TLP
with the required byte enables and the tlp type marked 'io'.
(I can't remember whether IO writes end up 'not posted' - failed to find it in
a 5cm thick book on my shelf.)
> There's no data FIFO or anything as exotic in normal hardware to drive or
> collect data for port I/O accesses wider than 8 bits. Some peripheral
> hardware may ignore byte enables though to simplify logic and e.g. assume
> that all port I/O or MMIO accesses are of a certain width, such as 16-bit
> or 32-bit.
Not to mention an fpga fabric that converted the TLP for an (aligned) 32bit
access into a pair of 32bit accesses - one of which had no byte enables set!
Confused the hw engineers who had ignored the byte enables...
> > The same will have applied to misaligned accesses.
>
> Misaligned accesses may or may not have to be split depending on whether
> they span the data bus width boundary or not. E.g. a 16-bit access to
> port I/O location 1 won't be split on 32-bit PCI as it fits on the bus:
> byte enables #1 and #2 will be driven active and byte enables #0 and #3
> will be left inactive. Conversely such an access to location 3 needs to
> be split into two cycles, with byte enables #3 and #0 only driven active
> respectively in the first and the second cycle.
And on PCIe (which is 64bit) a misaligned transfer that crosses a 64bit
boundary generates a single 16 byte TLP (not sure about page boundaries).
> The x86 BIU will do the split automatically for port I/O instructions as
> will some other CPU architectures that use memory access instructions to
> reach the PCI port I/O decoding window in their memory address space (this
> is a simplified view, as the split may have to be done in the chipset when
> passing the boundary between data buses of a different width each).
Yes, and I don't understand the HAS_IOPORT option.
Pretty much only x86 has separate instructions, but a lot of others will
have PCI/PCIe interface logic that can convert cpu memory accesses into
'io' accesses - so the pci_map_bar() should be able to transparently map
an io bar into kernel address space.
So x86 should be the outlier because it can't do that!
Even the strongarm system I used years ago has an address window that
generated 'io' cycles on a pcmcia bus.
I think a host PCI/PCIe interface could do io accesses for the bottom
64k of its memory window - but I don't know any that work that way.
>
> With other architectures such as MIPS designated instructions need to be
> used to drive the byte enables by hand for individual partial accesses in
> a split access, and the remaining architectures cannot drive some of the
> byte-enable patterns needed for such split accesses at all (and do masking
> in software instead for unaligned accesses to regular memory).
>
> > But, in reality, all device registers are aligned.
>
> True, sometimes beyond their width too.
Which means you don't want the multiple cycles that happen when someone
marks a structure as 'packed' even though it is completely aligned.
David
>
> Maciej
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
2026-01-09 12:53 ` David Laight
@ 2026-01-09 15:25 ` Maciej W. Rozycki
0 siblings, 0 replies; 12+ messages in thread
From: Maciej W. Rozycki @ 2026-01-09 15:25 UTC (permalink / raw)
To: David Laight
Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13
On Fri, 9 Jan 2026, David Laight wrote:
> > > IIRC io space is just like memory space, so a 16bit io access looks the
> > > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> > > addresses 0 and 1 so the code could use 16bit io accesses to speed things up).
> >
> > Huh? A 16-bit port I/O access will have the byte enables set accordingly
> > on PCI and the target device's data lines are driven accordingly in the
> > data cycle. Just as with MMIO; it's just a different bus command (or TLP
> > type for PCIe).
>
> I was going back to the historic implementations - like 8086 and 8088.
Historical systems had various odd bus designs that are out of scope for
our consideration. Even the 80386, but that was never interfaced to PCI,
perhaps for this very reason.
> > Misaligned accesses may or may not have to be split depending on whether
> > they span the data bus width boundary or not. E.g. a 16-bit access to
> > port I/O location 1 won't be split on 32-bit PCI as it fits on the bus:
> > byte enables #1 and #2 will be driven active and byte enables #0 and #3
> > will be left inactive. Conversely such an access to location 3 needs to
> > be split into two cycles, with byte enables #3 and #0 only driven active
> > respectively in the first and the second cycle.
>
> And on PCIe (which is 64bit) a misaligned transfer that crosses a 64bit
> boundary generates a single 16 byte TLP (not sure about page boundaries).
Which is also what is covered (without diving into exact details) by my
observation that a transfer may have to be split downstream, e.g. by a
PCIe-to-PCI bridge, or internally by the target device.
> > The x86 BIU will do the split automatically for port I/O instructions as
> > will some other CPU architectures that use memory access instructions to
> > reach the PCI port I/O decoding window in their memory address space (this
> > is a simplified view, as the split may have to be done in the chipset when
> > passing the boundary between data buses of a different width each).
>
> Yes, and I don't understand the HAS_IOPORT option.
> Pretty much only x86 has separate instructions, but a lot of others will
> have PCI/PCIe interface logic that can convert cpu memory accesses into
> 'io' accesses - so the pci_map_bar() should be able to transparently map
> an io bar into kernel address space.
> So x86 should be the outlier because it can't do that!
>
> Even the strongarm system I used years ago has an address window that
> generated 'io' cycles on a pcmcia bus.
>
> I think a host PCI/PCIe interface could do io accesses for the bottom
> 64k of its memory window - but I don't know any that work that way.
Only x86 (and its predecessors such as 8080, Z80, etc.) has the port I/O
space defined at the CPU bus level. All the other architectures do define
the port I/O space, but only in the chipset at the PCI/e bus/interconnect
level where implemented. It is where the HAS_IOPORT option is concerned.
Several contemporary systems have no way to produce port I/O cycles on
PCIe owing to the lack of support for the required TLP types in the host
bridge. There's simply no way to produce a transaction that would match
an I/O BAR in a target device. I own such a system myself, it's a POWER9
machine[1][2]. I'm told numerous ARM systems also have such a limitation.
FWIW port I/O TLP types have been deprecated from the beginning of PCIe.
References:
[1] "Power Systems Host Bridge 4 (PHB4) Specification", Version 1.0,
International Business Machines Corporation, 27 July 2018, Table 3-2.
"PCIe TLP command summary", p. 29.
[2] "pcmcia: add HAS_IOPORT dependencies",
<https://lore.kernel.org/r/alpine.DEB.2.21.2205041311280.9548@angie.orcam.me.uk/>.
Maciej
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-01-09 15:25 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-08 1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du
2026-01-08 1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
2026-01-08 1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
2026-01-08 11:35 ` Ilpo Järvinen
2026-01-08 1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du
2026-01-08 8:56 ` David Laight
2026-01-08 11:49 ` duziming
2026-01-09 7:21 ` Ilpo Järvinen
2026-01-09 9:40 ` duziming
2026-01-09 0:38 ` Maciej W. Rozycki
2026-01-09 12:53 ` David Laight
2026-01-09 15:25 ` Maciej W. Rozycki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox