public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] Miscellaneous fixes for pci subsystem
@ 2026-01-08  1:59 Ziming Du
  2026-01-08  1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ziming Du @ 2026-01-08  1:59 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2

Miscellaneous fixes for pci subsystem

Changes in v3:
- Check *ppos before assign it to pos.
- Link to v2: https://lore.kernel.org/linux-pci/20251224092721.2034529-1-duziming2@huawei.com/

Changes in v2:
- Correct grammer and indentation.
- Remove unrelated stack traces from the commit message.
- Modify the handling of pos by adding a non-negative check to ensure
  that the input value is valid.
- Use the existing IS_ALIGNED macro and ensure that after modification,
  other cases still retuen -EINVAL as before.
- Link to v1: https://lore.kernel.org/linux-pci/20251216083912.758219-1-duziming2@huawei.com/

Yongqiang Liu (2):
  PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  PCI: Prevent overflow in proc_bus_pci_write()

Ziming Du (1):
  PCI/sysfs: Fix null pointer dereference during hotplug

 drivers/pci/pci-sysfs.c | 11 +++++++++++
 drivers/pci/proc.c      |  6 +++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug
  2026-01-08  1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du
@ 2026-01-08  1:59 ` Ziming Du
  2026-01-08  1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
  2026-01-08  1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du
  2 siblings, 0 replies; 12+ messages in thread
From: Ziming Du @ 2026-01-08  1:59 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2

During the concurrent process of creating and rescanning in VF, the
resource files for the same pci_dev may be created twice. The second
creation attempt fails, resulting the res_attr in pci_dev to kfree(),
but the pointer is not set to NULL. This will subsequently lead to
dereferencing a null pointer when removing the device.

When we perform the following operation:
  echo $sriov_totalvfs > /sys/class/net/"$pfname"/device/sriov_numvfs &
  sleep 0.5
  echo 1 > /sys/bus/pci/rescan
  pci_remove "$pfname"
system will crash as follows:

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
  Call trace:
   __pi_strlen+0x14/0x150
   kernfs_find_ns+0x54/0x120
   kernfs_remove_by_name_ns+0x58/0xf0
   sysfs_remove_bin_file+0x24/0x38
   pci_remove_resource_files+0x44/0x90
   pci_remove_sysfs_dev_files+0x28/0x40
   pci_stop_bus_device+0xb8/0x118
   pci_stop_and_remove_bus_device+0x20/0x40
   pci_iov_remove_virtfn+0xb8/0x138
   sriov_disable+0xbc/0x190
   pci_disable_sriov+0x30/0x48
   hinic_pci_sriov_disable+0x54/0x138 [hinic]
   hinic_remove+0x140/0x290 [hinic]
   pci_device_remove+0x4c/0xf8
   device_remove+0x54/0x90
   device_release_driver_internal+0x1d4/0x238
   device_release_driver+0x20/0x38
   pci_stop_bus_device+0xa8/0x118
   pci_stop_and_remove_bus_device_locked+0x28/0x50
   remove_store+0x128/0x208

Fix this by set the pointer to NULL after releasing 'res_attr' immediately.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
 drivers/pci/pci-sysfs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c2df915ad2d2..7e697b82c5e1 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1222,12 +1222,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
 		if (res_attr) {
 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
 			kfree(res_attr);
+			pdev->res_attr[i] = NULL;
 		}
 
 		res_attr = pdev->res_attr_wc[i];
 		if (res_attr) {
 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
 			kfree(res_attr);
+			pdev->res_attr_wc[i] = NULL;
 		}
 	}
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write()
  2026-01-08  1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du
  2026-01-08  1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
@ 2026-01-08  1:59 ` Ziming Du
  2026-01-08 11:35   ` Ilpo Järvinen
  2026-01-08  1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du
  2 siblings, 1 reply; 12+ messages in thread
From: Ziming Du @ 2026-01-08  1:59 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2

From: Yongqiang Liu <liuyongqiang13@huawei.com>

When the value of *ppos over the INT_MAX, the pos is over set to a
negative value which will be passed to get_user() or
pci_user_write_config_dword(). Unexpected behavior such as a soft lockup
will happen as follows:

 watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [syz.3.109:3444]
 RIP: 0010:_raw_spin_unlock_irq+0x17/0x30
 Call Trace:
  <TASK>
  pci_user_write_config_dword+0x126/0x1f0
  proc_bus_pci_write+0x273/0x470
  proc_reg_write+0x1b6/0x280
  do_iter_write+0x48e/0x790
  vfs_writev+0x125/0x4a0
  __x64_sys_pwritev+0x1e2/0x2a0
  do_syscall_64+0x59/0x110
  entry_SYSCALL_64_after_hwframe+0x78/0xe2

Fix this by using unsigned int for the pos.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
 drivers/pci/proc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 9348a0fb8084..2d51b26edbe7 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -113,10 +113,14 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf,
 {
 	struct inode *ino = file_inode(file);
 	struct pci_dev *dev = pde_data(ino);
-	int pos = *ppos;
+	int pos;
 	int size = dev->cfg_size;
 	int cnt, ret;
 
+	if (*ppos > INT_MAX)
+		return -EINVAL;
+	pos = *ppos;
+
 	ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
 	if (ret)
 		return ret;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  2026-01-08  1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du
  2026-01-08  1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
  2026-01-08  1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
@ 2026-01-08  1:59 ` Ziming Du
  2026-01-08  8:56   ` David Laight
  2 siblings, 1 reply; 12+ messages in thread
From: Ziming Du @ 2026-01-08  1:59 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, liuyongqiang13, duziming2

From: Yongqiang Liu <liuyongqiang13@huawei.com>

Unaligned access is harmful for non-x86 archs such as arm64. When we
use pwrite or pread to access the I/O port resources with unaligned
offset, system will crash as follows:

Unable to handle kernel paging request at virtual address fffffbfffe8010c1
Internal error: Oops: 0000000096000061 [#1] SMP
Call trace:
 _outw include/asm-generic/io.h:594 [inline]
 logic_outw+0x54/0x218 lib/logic_pio.c:305
 pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
 pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
 pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
 sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
 kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
 vfs_write+0x7bc/0xac8 fs/read_write.c:586
 ksys_write+0x12c/0x270 fs/read_write.c:639
 __arm64_sys_write+0x78/0xb8 fs/read_write.c:648

Powerpc seems affected as well, so prohibit the unaligned access
on non-x86 archs.

Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
 drivers/pci/pci-sysfs.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7e697b82c5e1..11d8b7ec4263 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -31,6 +31,7 @@
 #include <linux/of.h>
 #include <linux/aperture.h>
 #include <linux/unaligned.h>
+#include <linux/align.h>
 #include "pci.h"
 
 #ifndef ARCH_PCI_DEV_GROUPS
@@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj,
 			*(u8 *)buf = inb(port);
 		return 1;
 	case 2:
+		#if !defined(CONFIG_X86)
+			if (!IS_ALIGNED(port, count))
+				return -EFAULT;
+		#endif
 		if (write)
 			outw(*(u16 *)buf, port);
 		else
 			*(u16 *)buf = inw(port);
 		return 2;
 	case 4:
+		#if !defined(CONFIG_X86)
+			if (!IS_ALIGNED(port, count))
+				return -EFAULT;
+		#endif
 		if (write)
 			outl(*(u32 *)buf, port);
 		else
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  2026-01-08  1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du
@ 2026-01-08  8:56   ` David Laight
  2026-01-08 11:49     ` duziming
  2026-01-09  0:38     ` Maciej W. Rozycki
  0 siblings, 2 replies; 12+ messages in thread
From: David Laight @ 2026-01-08  8:56 UTC (permalink / raw)
  To: Ziming Du; +Cc: bhelgaas, linux-pci, linux-kernel, liuyongqiang13

On Thu, 8 Jan 2026 09:59:44 +0800
Ziming Du <duziming2@huawei.com> wrote:

> From: Yongqiang Liu <liuyongqiang13@huawei.com>
> 
> Unaligned access is harmful for non-x86 archs such as arm64. When we
> use pwrite or pread to access the I/O port resources with unaligned
> offset, system will crash as follows:
> 
> Unable to handle kernel paging request at virtual address fffffbfffe8010c1
> Internal error: Oops: 0000000096000061 [#1] SMP
> Call trace:
>  _outw include/asm-generic/io.h:594 [inline]
>  logic_outw+0x54/0x218 lib/logic_pio.c:305
>  pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
>  pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
>  pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
>  sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
>  kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
>  vfs_write+0x7bc/0xac8 fs/read_write.c:586
>  ksys_write+0x12c/0x270 fs/read_write.c:639
>  __arm64_sys_write+0x78/0xb8 fs/read_write.c:648
> 
> Powerpc seems affected as well, so prohibit the unaligned access
> on non-x86 archs.

I'm not sure it makes any real sense for x86 either.
IIRC io space is just like memory space, so a 16bit io access looks the
same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
addresses 0 and 1 so the code could use 16bit io accesses to speed things up).
The same will have applied to misaligned accesses.
But, in reality, all device registers are aligned.

I'm not sure EFAULT is the best error code though, EINVAL might be better.
(EINVAL is returned for other address/size errors.)
EFAULT is usually returned for errors accessing the user buffer, a least
one unix system raises SIGSEGV whenever EFAULT is returned.

	David

> 
> Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources")
> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> Signed-off-by: Ziming Du <duziming2@huawei.com>
> ---
>  drivers/pci/pci-sysfs.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 7e697b82c5e1..11d8b7ec4263 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -31,6 +31,7 @@
>  #include <linux/of.h>
>  #include <linux/aperture.h>
>  #include <linux/unaligned.h>
> +#include <linux/align.h>
>  #include "pci.h"
>  
>  #ifndef ARCH_PCI_DEV_GROUPS
> @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj,
>  			*(u8 *)buf = inb(port);
>  		return 1;
>  	case 2:
> +		#if !defined(CONFIG_X86)
> +			if (!IS_ALIGNED(port, count))
> +				return -EFAULT;
> +		#endif
>  		if (write)
>  			outw(*(u16 *)buf, port);
>  		else
>  			*(u16 *)buf = inw(port);
>  		return 2;
>  	case 4:
> +		#if !defined(CONFIG_X86)
> +			if (!IS_ALIGNED(port, count))
> +				return -EFAULT;
> +		#endif
>  		if (write)
>  			outl(*(u32 *)buf, port);
>  		else


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write()
  2026-01-08  1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
@ 2026-01-08 11:35   ` Ilpo Järvinen
  0 siblings, 0 replies; 12+ messages in thread
From: Ilpo Järvinen @ 2026-01-08 11:35 UTC (permalink / raw)
  To: Ziming Du; +Cc: bhelgaas, linux-pci, LKML, liuyongqiang13

[-- Attachment #1: Type: text/plain, Size: 2037 bytes --]

On Thu, 8 Jan 2026, Ziming Du wrote:

> From: Yongqiang Liu <liuyongqiang13@huawei.com>
> 
> When the value of *ppos over the INT_MAX, the pos is over set to a
> negative value which will be passed to get_user() or
> pci_user_write_config_dword(). Unexpected behavior such as a soft lockup
> will happen as follows:
> 
>  watchdog: BUG: soft lockup - CPU#0 stuck for 130s! [syz.3.109:3444]
>  RIP: 0010:_raw_spin_unlock_irq+0x17/0x30
>  Call Trace:
>   <TASK>
>   pci_user_write_config_dword+0x126/0x1f0
>   proc_bus_pci_write+0x273/0x470
>   proc_reg_write+0x1b6/0x280
>   do_iter_write+0x48e/0x790
>   vfs_writev+0x125/0x4a0
>   __x64_sys_pwritev+0x1e2/0x2a0
>   do_syscall_64+0x59/0x110
>   entry_SYSCALL_64_after_hwframe+0x78/0xe2
> 
> Fix this by using unsigned int for the pos.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> Signed-off-by: Ziming Du <duziming2@huawei.com>
> ---
>  drivers/pci/proc.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 9348a0fb8084..2d51b26edbe7 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -113,10 +113,14 @@ static ssize_t proc_bus_pci_write(struct file *file, const char __user *buf,
>  {
>  	struct inode *ino = file_inode(file);
>  	struct pci_dev *dev = pde_data(ino);
> -	int pos = *ppos;
> +	int pos;
>  	int size = dev->cfg_size;
>  	int cnt, ret;
>  
> +	if (*ppos > INT_MAX)
> +		return -EINVAL;
> +	pos = *ppos;
> +
>  	ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
>  	if (ret)
>  		return ret;
> 

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

With the note that proc_bus_pci_read() and proc_bus_pci_write() diverge in 
handling > INT_MAX values and that feels unjustified (but there's not this
same problem on the read side I guess so if the read side is made the same 
as the write side, it would be better to do that in another patch).

-- 
 i.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  2026-01-08  8:56   ` David Laight
@ 2026-01-08 11:49     ` duziming
  2026-01-09  7:21       ` Ilpo Järvinen
  2026-01-09  0:38     ` Maciej W. Rozycki
  1 sibling, 1 reply; 12+ messages in thread
From: duziming @ 2026-01-08 11:49 UTC (permalink / raw)
  To: David Laight; +Cc: bhelgaas, linux-pci, linux-kernel, liuyongqiang13


在 2026/1/8 16:56, David Laight 写道:
> On Thu, 8 Jan 2026 09:59:44 +0800
> Ziming Du <duziming2@huawei.com> wrote:
>
>> From: Yongqiang Liu <liuyongqiang13@huawei.com>
>>
>> Unaligned access is harmful for non-x86 archs such as arm64. When we
>> use pwrite or pread to access the I/O port resources with unaligned
>> offset, system will crash as follows:
>>
>> Unable to handle kernel paging request at virtual address fffffbfffe8010c1
>> Internal error: Oops: 0000000096000061 [#1] SMP
>> Call trace:
>>   _outw include/asm-generic/io.h:594 [inline]
>>   logic_outw+0x54/0x218 lib/logic_pio.c:305
>>   pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
>>   pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
>>   pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
>>   sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
>>   kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
>>   vfs_write+0x7bc/0xac8 fs/read_write.c:586
>>   ksys_write+0x12c/0x270 fs/read_write.c:639
>>   __arm64_sys_write+0x78/0xb8 fs/read_write.c:648
>>
>> Powerpc seems affected as well, so prohibit the unaligned access
>> on non-x86 archs.
> I'm not sure it makes any real sense for x86 either.
> IIRC io space is just like memory space, so a 16bit io access looks the
> same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> addresses 0 and 1 so the code could use 16bit io accesses to speed things up).
> The same will have applied to misaligned accesses.
> But, in reality, all device registers are aligned.
>
> I'm not sure EFAULT is the best error code though, EINVAL might be better.
> (EINVAL is returned for other address/size errors.)
> EFAULT is usually returned for errors accessing the user buffer, a least
> one unix system raises SIGSEGV whenever EFAULT is returned.
>
> 	David
Just to confirm: should all architectures prohibit unaligned access to 
device registers?
>> Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port resources")
>> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
>> Signed-off-by: Ziming Du <duziming2@huawei.com>
>> ---
>>   drivers/pci/pci-sysfs.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 7e697b82c5e1..11d8b7ec4263 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -31,6 +31,7 @@
>>   #include <linux/of.h>
>>   #include <linux/aperture.h>
>>   #include <linux/unaligned.h>
>> +#include <linux/align.h>
>>   #include "pci.h"
>>   
>>   #ifndef ARCH_PCI_DEV_GROUPS
>> @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp, struct kobject *kobj,
>>   			*(u8 *)buf = inb(port);
>>   		return 1;
>>   	case 2:
>> +		#if !defined(CONFIG_X86)
>> +			if (!IS_ALIGNED(port, count))
>> +				return -EFAULT;
>> +		#endif
>>   		if (write)
>>   			outw(*(u16 *)buf, port);
>>   		else
>>   			*(u16 *)buf = inw(port);
>>   		return 2;
>>   	case 4:
>> +		#if !defined(CONFIG_X86)
>> +			if (!IS_ALIGNED(port, count))
>> +				return -EFAULT;
>> +		#endif
>>   		if (write)
>>   			outl(*(u32 *)buf, port);
>>   		else

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  2026-01-08  8:56   ` David Laight
  2026-01-08 11:49     ` duziming
@ 2026-01-09  0:38     ` Maciej W. Rozycki
  2026-01-09 12:53       ` David Laight
  1 sibling, 1 reply; 12+ messages in thread
From: Maciej W. Rozycki @ 2026-01-09  0:38 UTC (permalink / raw)
  To: David Laight
  Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13

On Thu, 8 Jan 2026, David Laight wrote:

> I'm not sure it makes any real sense for x86 either.

 FWIW I agree.

> IIRC io space is just like memory space, so a 16bit io access looks the
> same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> addresses 0 and 1 so the code could use 16bit io accesses to speed things up).

 Huh?  A 16-bit port I/O access will have the byte enables set accordingly 
on PCI and the target device's data lines are driven accordingly in the 
data cycle.  Just as with MMIO; it's just a different bus command (or TLP 
type for PCIe).

 There's no data FIFO or anything as exotic in normal hardware to drive or 
collect data for port I/O accesses wider than 8 bits.  Some peripheral 
hardware may ignore byte enables though to simplify logic and e.g. assume 
that all port I/O or MMIO accesses are of a certain width, such as 16-bit 
or 32-bit.

> The same will have applied to misaligned accesses.

 Misaligned accesses may or may not have to be split depending on whether 
they span the data bus width boundary or not.  E.g. a 16-bit access to 
port I/O location 1 won't be split on 32-bit PCI as it fits on the bus: 
byte enables #1 and #2 will be driven active and byte enables #0 and #3 
will be left inactive.  Conversely such an access to location 3 needs to 
be split into two cycles, with byte enables #3 and #0 only driven active 
respectively in the first and the second cycle.

 The x86 BIU will do the split automatically for port I/O instructions as 
will some other CPU architectures that use memory access instructions to 
reach the PCI port I/O decoding window in their memory address space (this 
is a simplified view, as the split may have to be done in the chipset when 
passing the boundary between data buses of a different width each).

 With other architectures such as MIPS designated instructions need to be 
used to drive the byte enables by hand for individual partial accesses in 
a split access, and the remaining architectures cannot drive some of the 
byte-enable patterns needed for such split accesses at all (and do masking 
in software instead for unaligned accesses to regular memory).

> But, in reality, all device registers are aligned.

 True, sometimes beyond their width too.

  Maciej

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  2026-01-08 11:49     ` duziming
@ 2026-01-09  7:21       ` Ilpo Järvinen
  2026-01-09  9:40         ` duziming
  0 siblings, 1 reply; 12+ messages in thread
From: Ilpo Järvinen @ 2026-01-09  7:21 UTC (permalink / raw)
  To: duziming; +Cc: David Laight, bhelgaas, linux-pci, LKML, liuyongqiang13

[-- Attachment #1: Type: text/plain, Size: 3545 bytes --]

On Thu, 8 Jan 2026, duziming wrote:

> 
> 在 2026/1/8 16:56, David Laight 写道:
> > On Thu, 8 Jan 2026 09:59:44 +0800
> > Ziming Du <duziming2@huawei.com> wrote:
> > 
> > > From: Yongqiang Liu <liuyongqiang13@huawei.com>
> > > 
> > > Unaligned access is harmful for non-x86 archs such as arm64. When we
> > > use pwrite or pread to access the I/O port resources with unaligned
> > > offset, system will crash as follows:
> > > 
> > > Unable to handle kernel paging request at virtual address fffffbfffe8010c1
> > > Internal error: Oops: 0000000096000061 [#1] SMP
> > > Call trace:
> > >   _outw include/asm-generic/io.h:594 [inline]
> > >   logic_outw+0x54/0x218 lib/logic_pio.c:305
> > >   pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
> > >   pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
> > >   pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
> > >   sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
> > >   kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
> > >   vfs_write+0x7bc/0xac8 fs/read_write.c:586
> > >   ksys_write+0x12c/0x270 fs/read_write.c:639
> > >   __arm64_sys_write+0x78/0xb8 fs/read_write.c:648
> > > 
> > > Powerpc seems affected as well, so prohibit the unaligned access
> > > on non-x86 archs.
> > I'm not sure it makes any real sense for x86 either.
> > IIRC io space is just like memory space, so a 16bit io access looks the
> > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> > addresses 0 and 1 so the code could use 16bit io accesses to speed things
> > up).
> > The same will have applied to misaligned accesses.
> > But, in reality, all device registers are aligned.
> > 
> > I'm not sure EFAULT is the best error code though, EINVAL might be better.
> > (EINVAL is returned for other address/size errors.)
> > EFAULT is usually returned for errors accessing the user buffer, a least
> > one unix system raises SIGSEGV whenever EFAULT is returned.
> > 
> Just to confirm: should all architectures prohibit unaligned access to device
> registers?

In my opinion, yes, also x86 should prohibit it (like I already 
expressed but you ignored that comment until now).

-- 
 i.

> > > Fixes: 8633328be242 ("PCI: Allow read/write access to sysfs I/O port
> > > resources")
> > > Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> > > Signed-off-by: Ziming Du <duziming2@huawei.com>
> > > ---
> > >   drivers/pci/pci-sysfs.c | 9 +++++++++
> > >   1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > > index 7e697b82c5e1..11d8b7ec4263 100644
> > > --- a/drivers/pci/pci-sysfs.c
> > > +++ b/drivers/pci/pci-sysfs.c
> > > @@ -31,6 +31,7 @@
> > >   #include <linux/of.h>
> > >   #include <linux/aperture.h>
> > >   #include <linux/unaligned.h>
> > > +#include <linux/align.h>
> > >   #include "pci.h"
> > >     #ifndef ARCH_PCI_DEV_GROUPS
> > > @@ -1166,12 +1167,20 @@ static ssize_t pci_resource_io(struct file *filp,
> > > struct kobject *kobj,
> > >   			*(u8 *)buf = inb(port);
> > >   		return 1;
> > >   	case 2:
> > > +		#if !defined(CONFIG_X86)
> > > +			if (!IS_ALIGNED(port, count))
> > > +				return -EFAULT;
> > > +		#endif
> > >   		if (write)
> > >   			outw(*(u16 *)buf, port);
> > >   		else
> > >   			*(u16 *)buf = inw(port);
> > >   		return 2;
> > >   	case 4:
> > > +		#if !defined(CONFIG_X86)
> > > +			if (!IS_ALIGNED(port, count))
> > > +				return -EFAULT;
> > > +		#endif
> > >   		if (write)
> > >   			outl(*(u32 *)buf, port);
> > >   		else
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  2026-01-09  7:21       ` Ilpo Järvinen
@ 2026-01-09  9:40         ` duziming
  0 siblings, 0 replies; 12+ messages in thread
From: duziming @ 2026-01-09  9:40 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: David Laight, bhelgaas, linux-pci, LKML, liuyongqiang13


在 2026/1/9 15:21, Ilpo Järvinen 写道:
> On Thu, 8 Jan 2026, duziming wrote:
>
>> 在 2026/1/8 16:56, David Laight 写道:
>>> On Thu, 8 Jan 2026 09:59:44 +0800
>>> Ziming Du <duziming2@huawei.com> wrote:
>>>
>>>> From: Yongqiang Liu <liuyongqiang13@huawei.com>
>>>>
>>>> Unaligned access is harmful for non-x86 archs such as arm64. When we
>>>> use pwrite or pread to access the I/O port resources with unaligned
>>>> offset, system will crash as follows:
>>>>
>>>> Unable to handle kernel paging request at virtual address fffffbfffe8010c1
>>>> Internal error: Oops: 0000000096000061 [#1] SMP
>>>> Call trace:
>>>>    _outw include/asm-generic/io.h:594 [inline]
>>>>    logic_outw+0x54/0x218 lib/logic_pio.c:305
>>>>    pci_resource_io drivers/pci/pci-sysfs.c:1157 [inline]
>>>>    pci_write_resource_io drivers/pci/pci-sysfs.c:1191 [inline]
>>>>    pci_write_resource_io+0x208/0x260 drivers/pci/pci-sysfs.c:1181
>>>>    sysfs_kf_bin_write+0x188/0x210 fs/sysfs/file.c:158
>>>>    kernfs_fop_write_iter+0x2e8/0x4b0 fs/kernfs/file.c:338
>>>>    vfs_write+0x7bc/0xac8 fs/read_write.c:586
>>>>    ksys_write+0x12c/0x270 fs/read_write.c:639
>>>>    __arm64_sys_write+0x78/0xb8 fs/read_write.c:648
>>>>
>>>> Powerpc seems affected as well, so prohibit the unaligned access
>>>> on non-x86 archs.
>>> I'm not sure it makes any real sense for x86 either.
>>> IIRC io space is just like memory space, so a 16bit io access looks the
>>> same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
>>> addresses 0 and 1 so the code could use 16bit io accesses to speed things
>>> up).
>>> The same will have applied to misaligned accesses.
>>> But, in reality, all device registers are aligned.
>>>
>>> I'm not sure EFAULT is the best error code though, EINVAL might be better.
>>> (EINVAL is returned for other address/size errors.)
>>> EFAULT is usually returned for errors accessing the user buffer, a least
>>> one unix system raises SIGSEGV whenever EFAULT is returned.
>>>
>> Just to confirm: should all architectures prohibit unaligned access to device
>> registers?
> In my opinion, yes, also x86 should prohibit it (like I already
> expressed but you ignored that comment until now).
Oops,  I didn’t quite understand your opinion earlier :(
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  2026-01-09  0:38     ` Maciej W. Rozycki
@ 2026-01-09 12:53       ` David Laight
  2026-01-09 15:25         ` Maciej W. Rozycki
  0 siblings, 1 reply; 12+ messages in thread
From: David Laight @ 2026-01-09 12:53 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13

On Fri, 9 Jan 2026 00:38:05 +0000 (GMT)
"Maciej W. Rozycki" <macro@orcam.me.uk> wrote:

> On Thu, 8 Jan 2026, David Laight wrote:
> 
> > I'm not sure it makes any real sense for x86 either.  
> 
>  FWIW I agree.

The interface could have allowed arbitrary transfers and split them into
aligned bus cycles.
That would let you hexdump an io bar (useful for diagnostics, but some
reads end up being destructive - caveat emptor).
But it doesn't....

It is also of limited use because (IIRC) you can't access the device this
way once a driver has mapped the bar.
The driver can, of course, support applications using mmap() to directly
access device memory over PCIe.
(The difficulty is mmap() of kernel memory allocated with dma_alloc_coherent().)

> > IIRC io space is just like memory space, so a 16bit io access looks the
> > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> > addresses 0 and 1 so the code could use 16bit io accesses to speed things up).  
> 
>  Huh?  A 16-bit port I/O access will have the byte enables set accordingly 
> on PCI and the target device's data lines are driven accordingly in the 
> data cycle.  Just as with MMIO; it's just a different bus command (or TLP 
> type for PCIe).

I was going back to the historic implementations - like 8086 and 8088.

For pcie I know it is all different and the cpu just generates read/write TLP
with the required byte enables and the tlp type marked 'io'.
(I can't remember whether IO writes end up 'not posted' - failed to find it in
a 5cm thick book on my shelf.) 

>  There's no data FIFO or anything as exotic in normal hardware to drive or 
> collect data for port I/O accesses wider than 8 bits.  Some peripheral 
> hardware may ignore byte enables though to simplify logic and e.g. assume 
> that all port I/O or MMIO accesses are of a certain width, such as 16-bit 
> or 32-bit.

Not to mention an fpga fabric that converted the TLP for an (aligned) 32bit
access into a pair of 32bit accesses - one of which had no byte enables set!
Confused the hw engineers who had ignored the byte enables...

> > The same will have applied to misaligned accesses.  
> 
>  Misaligned accesses may or may not have to be split depending on whether 
> they span the data bus width boundary or not.  E.g. a 16-bit access to 
> port I/O location 1 won't be split on 32-bit PCI as it fits on the bus: 
> byte enables #1 and #2 will be driven active and byte enables #0 and #3 
> will be left inactive.  Conversely such an access to location 3 needs to 
> be split into two cycles, with byte enables #3 and #0 only driven active 
> respectively in the first and the second cycle.

And on PCIe (which is 64bit) a misaligned transfer that crosses a 64bit
boundary generates a single 16 byte TLP (not sure about page boundaries).

>  The x86 BIU will do the split automatically for port I/O instructions as 
> will some other CPU architectures that use memory access instructions to 
> reach the PCI port I/O decoding window in their memory address space (this 
> is a simplified view, as the split may have to be done in the chipset when 
> passing the boundary between data buses of a different width each).

Yes, and I don't understand the HAS_IOPORT option.
Pretty much only x86 has separate instructions, but a lot of others will
have PCI/PCIe interface logic that can convert cpu memory accesses into
'io' accesses - so the pci_map_bar() should be able to transparently map
an io bar into kernel address space.
So x86 should be the outlier because it can't do that!

Even the strongarm system I used years ago has an address window that
generated 'io' cycles on a pcmcia bus.

I think a host PCI/PCIe interface could do io accesses for the bottom
64k of its memory window - but I don't know any that work that way.

> 
>  With other architectures such as MIPS designated instructions need to be 
> used to drive the byte enables by hand for individual partial accesses in 
> a split access, and the remaining architectures cannot drive some of the 
> byte-enable patterns needed for such split accesses at all (and do masking 
> in software instead for unaligned accesses to regular memory).
> 
> > But, in reality, all device registers are aligned.  
> 
>  True, sometimes beyond their width too.

Which means you don't want the multiple cycles that happen when someone
marks a structure as 'packed' even though it is completely aligned.

	David

> 
>   Maciej
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86
  2026-01-09 12:53       ` David Laight
@ 2026-01-09 15:25         ` Maciej W. Rozycki
  0 siblings, 0 replies; 12+ messages in thread
From: Maciej W. Rozycki @ 2026-01-09 15:25 UTC (permalink / raw)
  To: David Laight
  Cc: Ziming Du, Bjorn Helgaas, linux-pci, linux-kernel, liuyongqiang13

On Fri, 9 Jan 2026, David Laight wrote:

> > > IIRC io space is just like memory space, so a 16bit io access looks the
> > > same as two 8bit accesses to an 8bit device (some put the 'data fifo' on
> > > addresses 0 and 1 so the code could use 16bit io accesses to speed things up).  
> > 
> >  Huh?  A 16-bit port I/O access will have the byte enables set accordingly 
> > on PCI and the target device's data lines are driven accordingly in the 
> > data cycle.  Just as with MMIO; it's just a different bus command (or TLP 
> > type for PCIe).
> 
> I was going back to the historic implementations - like 8086 and 8088.

 Historical systems had various odd bus designs that are out of scope for 
our consideration.  Even the 80386, but that was never interfaced to PCI, 
perhaps for this very reason.

> >  Misaligned accesses may or may not have to be split depending on whether 
> > they span the data bus width boundary or not.  E.g. a 16-bit access to 
> > port I/O location 1 won't be split on 32-bit PCI as it fits on the bus: 
> > byte enables #1 and #2 will be driven active and byte enables #0 and #3 
> > will be left inactive.  Conversely such an access to location 3 needs to 
> > be split into two cycles, with byte enables #3 and #0 only driven active 
> > respectively in the first and the second cycle.
> 
> And on PCIe (which is 64bit) a misaligned transfer that crosses a 64bit
> boundary generates a single 16 byte TLP (not sure about page boundaries).

 Which is also what is covered (without diving into exact details) by my 
observation that a transfer may have to be split downstream, e.g. by a 
PCIe-to-PCI bridge, or internally by the target device.

> >  The x86 BIU will do the split automatically for port I/O instructions as 
> > will some other CPU architectures that use memory access instructions to 
> > reach the PCI port I/O decoding window in their memory address space (this 
> > is a simplified view, as the split may have to be done in the chipset when 
> > passing the boundary between data buses of a different width each).
> 
> Yes, and I don't understand the HAS_IOPORT option.
> Pretty much only x86 has separate instructions, but a lot of others will
> have PCI/PCIe interface logic that can convert cpu memory accesses into
> 'io' accesses - so the pci_map_bar() should be able to transparently map
> an io bar into kernel address space.
> So x86 should be the outlier because it can't do that!
> 
> Even the strongarm system I used years ago has an address window that
> generated 'io' cycles on a pcmcia bus.
> 
> I think a host PCI/PCIe interface could do io accesses for the bottom
> 64k of its memory window - but I don't know any that work that way.

 Only x86 (and its predecessors such as 8080, Z80, etc.) has the port I/O
space defined at the CPU bus level.  All the other architectures do define 
the port I/O space, but only in the chipset at the PCI/e bus/interconnect 
level where implemented.  It is where the HAS_IOPORT option is concerned.

 Several contemporary systems have no way to produce port I/O cycles on 
PCIe owing to the lack of support for the required TLP types in the host 
bridge.  There's simply no way to produce a transaction that would match 
an I/O BAR in a target device.  I own such a system myself, it's a POWER9 
machine[1][2].  I'm told numerous ARM systems also have such a limitation.  
FWIW port I/O TLP types have been deprecated from the beginning of PCIe.

References:

[1] "Power Systems Host Bridge 4 (PHB4) Specification", Version 1.0,
    International Business Machines Corporation, 27 July 2018, Table 3-2. 
    "PCIe TLP command summary", p. 29.

[2] "pcmcia: add HAS_IOPORT dependencies", 
    <https://lore.kernel.org/r/alpine.DEB.2.21.2205041311280.9548@angie.orcam.me.uk/>.

  Maciej

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-01-09 15:25 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-08  1:59 [PATCH v3 0/3] Miscellaneous fixes for pci subsystem Ziming Du
2026-01-08  1:59 ` [PATCH v3 1/3] PCI/sysfs: Fix null pointer dereference during hotplug Ziming Du
2026-01-08  1:59 ` [PATCH v3 2/3] PCI: Prevent overflow in proc_bus_pci_write() Ziming Du
2026-01-08 11:35   ` Ilpo Järvinen
2026-01-08  1:59 ` [PATCH v3 3/3] PCI/sysfs: Prohibit unaligned access to I/O port on non-x86 Ziming Du
2026-01-08  8:56   ` David Laight
2026-01-08 11:49     ` duziming
2026-01-09  7:21       ` Ilpo Järvinen
2026-01-09  9:40         ` duziming
2026-01-09  0:38     ` Maciej W. Rozycki
2026-01-09 12:53       ` David Laight
2026-01-09 15:25         ` Maciej W. Rozycki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox