public inbox for kvm@vger.kernel.org
* [PATCH 1/2] vfio/xe: Reorganize the init to decouple migration from reset
@ 2026-04-08 16:03 Michał Winiarski
  2026-04-08 16:03 ` [PATCH 2/2] vfio/xe: Add a missing vfio_pci_core_release_dev() Michał Winiarski
  2026-04-09  9:28 ` [PATCH 1/2] vfio/xe: Reorganize the init to decouple migration from reset Tian, Kevin
  0 siblings, 2 replies; 5+ messages in thread
From: Michał Winiarski @ 2026-04-08 16:03 UTC (permalink / raw)
  To: Alex Williamson, intel-xe, linux-kernel, kvm
  Cc: Jason Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian,
	Michał Winiarski

Attempting to issue a reset on VF devices that don't support migration
leads to the following:

  BUG: unable to handle page fault for address: 00000000000011f8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0000 [#1] SMP NOPTI
  CPU: 2 UID: 0 PID: 7443 Comm: xe_sriov_flr Tainted: G S   U              7.0.0-rc1-lgci-xe-xe-4588-cec43d5c2696af219-nodebug+ #1 PREEMPT(lazy)
  Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
  Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
  RIP: 0010:xe_sriov_vfio_wait_flr_done+0xc/0x80 [xe]
  Code: ff c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 54 53 <83> bf f8 11 00 00 02 75 61 41 89 f4 85 f6 74 52 48 8b 47 08 48 89
  RSP: 0018:ffffc9000f7c39b8 EFLAGS: 00010202
  RAX: ffffffffa04d8660 RBX: ffff88813e3e4000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffffc9000f7c39c8 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff888101a48800
  R13: ffff88813e3e4150 R14: ffff888130d0d008 R15: ffff88813e3e40d0
  FS:  00007877d3d0d940(0000) GS:ffff88890b6d3000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000011f8 CR3: 000000015a762000 CR4: 0000000000f52ef0
  PKRU: 55555554
  Call Trace:
   <TASK>
   xe_vfio_pci_reset_done+0x49/0x120 [xe_vfio_pci]
   pci_dev_restore+0x3b/0x80
   pci_reset_function+0x109/0x140
   reset_store+0x5c/0xb0
   dev_attr_store+0x17/0x40
   sysfs_kf_write+0x72/0x90
   kernfs_fop_write_iter+0x161/0x1f0
   vfs_write+0x261/0x440
   ksys_write+0x69/0xf0
   __x64_sys_write+0x19/0x30
   x64_sys_call+0x259/0x26e0
   do_syscall_64+0xcb/0x1500
   ? __fput+0x1a2/0x2d0
   ? fput_close_sync+0x3d/0xa0
   ? __x64_sys_close+0x3e/0x90
   ? x64_sys_call+0x1b7c/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? __task_pid_nr_ns+0x68/0x100
   ? __do_sys_getpid+0x1d/0x30
   ? x64_sys_call+0x10b5/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? putname+0x41/0x90
   ? do_faccessat+0x1e8/0x300
   ? __x64_sys_access+0x1c/0x30
   ? x64_sys_call+0x1822/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? tick_program_event+0x43/0xa0
   ? hrtimer_interrupt+0x126/0x260
   ? irqentry_exit+0xb2/0x710
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7877d5f1c5a4
  Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d a5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
  RSP: 002b:00007fff48e5f908 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007877d5f1c5a4
  RDX: 0000000000000001 RSI: 00007877d621b0c9 RDI: 0000000000000009
  RBP: 0000000000000001 R08: 00005fb49113b010 R09: 0000000000000007
  R10: 0000000000000000 R11: 0000000000000202 R12: 00007877d621b0c9
  R13: 0000000000000009 R14: 00007fff48e5fac0 R15: 00007fff48e5fac0
   </TASK>

This happens because some of the xe_vfio_pci_core_device members
needed for handling reset are only initialized as part of
migration init.
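
The oops supports this reading: RDI (the first argument) is 0, CR2 is
0x11f8, and the faulting bytes "83 bf f8 11 00 00 02" decode to a 32-bit
compare against 0x11f8(%rdi). That is what a read through a NULL pointer
at member offset 0x11f8 looks like. The userspace sketch below only
illustrates that arithmetic; the struct layout and all demo_* names are
hypothetical, and only the 0x11f8 offset comes from the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: only the 0x11f8 offset is taken from the oops. */
struct demo_xe_device {
	unsigned char other_state[0x11f8];
	int mode_field;	/* placeholder for whatever member lives at 0x11f8 */
};

/* Address the CPU touches when reading mode_field through 'base'. */
static uintptr_t demo_fault_address(uintptr_t base)
{
	return base + offsetof(struct demo_xe_device, mode_field);
}
```

With base == 0 the computed address equals the CR2 value reported above,
which suggests the pointer passed to xe_sriov_vfio_wait_flr_done() was
never set.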

Fix the problem by reorganizing the code to decouple VF init from
migration init.

Fixes: 1f5556ec8b9ef ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7352
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/vfio/pci/xe/main.c | 39 +++++++++++++++++++++++++++-----------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/vfio/pci/xe/main.c b/drivers/vfio/pci/xe/main.c
index 88acfcf840fcc..40ea73234dcf2 100644
--- a/drivers/vfio/pci/xe/main.c
+++ b/drivers/vfio/pci/xe/main.c
@@ -468,39 +468,56 @@ static const struct vfio_migration_ops xe_vfio_pci_migration_ops = {
 static void xe_vfio_pci_migration_init(struct xe_vfio_pci_core_device *xe_vdev)
 {
 	struct vfio_device *core_vdev = &xe_vdev->core_device.vdev;
-	struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
-	struct xe_device *xe = xe_sriov_vfio_get_pf(pdev);
 
-	if (!xe)
-		return;
-	if (!xe_sriov_vfio_migration_supported(xe))
+	if (!xe_sriov_vfio_migration_supported(xe_vdev->xe))
 		return;
 
 	mutex_init(&xe_vdev->state_mutex);
 	spin_lock_init(&xe_vdev->reset_lock);
 
-	/* PF internal control uses vfid index starting from 1 */
-	xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
-	xe_vdev->xe = xe;
-
 	core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P;
 	core_vdev->mig_ops = &xe_vfio_pci_migration_ops;
 }
 
 static void xe_vfio_pci_migration_fini(struct xe_vfio_pci_core_device *xe_vdev)
 {
-	if (!xe_vdev->vfid)
+	struct vfio_device *core_vdev = &xe_vdev->core_device.vdev;
+
+	if (!core_vdev->mig_ops)
 		return;
 
 	mutex_destroy(&xe_vdev->state_mutex);
 }
 
+static int xe_vfio_pci_vf_init(struct xe_vfio_pci_core_device *xe_vdev)
+{
+	struct vfio_device *core_vdev = &xe_vdev->core_device.vdev;
+	struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
+	struct xe_device *xe = xe_sriov_vfio_get_pf(pdev);
+
+	if (!pdev->is_virtfn)
+		return 0;
+	if (!xe)
+		return -ENODEV;
+	xe_vdev->xe = xe;
+
+	/* PF internal control uses vfid index starting from 1 */
+	xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
+
+	xe_vfio_pci_migration_init(xe_vdev);
+
+	return 0;
+}
+
 static int xe_vfio_pci_init_dev(struct vfio_device *core_vdev)
 {
 	struct xe_vfio_pci_core_device *xe_vdev =
 		container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+	int ret;
 
-	xe_vfio_pci_migration_init(xe_vdev);
+	ret = xe_vfio_pci_vf_init(xe_vdev);
+	if (ret)
+		return ret;
 
 	return vfio_pci_core_init_dev(core_vdev);
 }
-- 
2.53.0



* [PATCH 2/2] vfio/xe: Add a missing vfio_pci_core_release_dev()
  2026-04-08 16:03 [PATCH 1/2] vfio/xe: Reorganize the init to decouple migration from reset Michał Winiarski
@ 2026-04-08 16:03 ` Michał Winiarski
  2026-04-09  9:05   ` Niklas Schnelle
  2026-04-09  9:28   ` Tian, Kevin
  2026-04-09  9:28 ` [PATCH 1/2] vfio/xe: Reorganize the init to decouple migration from reset Tian, Kevin
  1 sibling, 2 replies; 5+ messages in thread
From: Michał Winiarski @ 2026-04-08 16:03 UTC (permalink / raw)
  To: Alex Williamson, intel-xe, linux-kernel, kvm
  Cc: Jason Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian,
	Michał Winiarski, Niklas Schnelle

The driver is implementing its own .release(), which means that it needs
to call vfio_pci_core_release_dev().
Add the missing call.
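
As a rough userspace illustration of the pattern (not the real vfio
code), a variant that overrides a release hook has to chain to the core
release itself, otherwise the core's resources are never dropped. All
demo_* names below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the core device state. */
struct demo_core_dev {
	bool core_resources_held;
};

/* Hypothetical variant device embedding the core as its first member. */
struct demo_variant_dev {
	struct demo_core_dev core;
	bool variant_resources_held;
};

/* Core cleanup that every release path must eventually reach. */
static void demo_core_release(struct demo_core_dev *core)
{
	core->core_resources_held = false;
}

/* Variant release: drop variant state first, then chain to the core. */
static void demo_variant_release(struct demo_core_dev *core)
{
	struct demo_variant_dev *vdev;

	assert(core != NULL);
	/* Valid because 'core' is the first member of the variant struct. */
	vdev = (struct demo_variant_dev *)core;
	vdev->variant_resources_held = false;
	demo_core_release(core);
}
```

Dropping the final demo_core_release() call in this sketch would leave
core_resources_held set, which is the kind of leak this patch addresses.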

Reported-by: Niklas Schnelle <schnelle@linux.ibm.com>
Closes: https://lore.kernel.org/kvm/408e262c507e8fd628a71e39904fedd99fa0ee8e.camel@linux.ibm.com/
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/vfio/pci/xe/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vfio/pci/xe/main.c b/drivers/vfio/pci/xe/main.c
index 40ea73234dcf2..af95a011ed111 100644
--- a/drivers/vfio/pci/xe/main.c
+++ b/drivers/vfio/pci/xe/main.c
@@ -528,6 +528,7 @@ static void xe_vfio_pci_release_dev(struct vfio_device *core_vdev)
 		container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
 
 	xe_vfio_pci_migration_fini(xe_vdev);
+	vfio_pci_core_release_dev(core_vdev);
 }
 
 static const struct vfio_device_ops xe_vfio_pci_ops = {
-- 
2.53.0



* Re: [PATCH 2/2] vfio/xe: Add a missing vfio_pci_core_release_dev()
  2026-04-08 16:03 ` [PATCH 2/2] vfio/xe: Add a missing vfio_pci_core_release_dev() Michał Winiarski
@ 2026-04-09  9:05   ` Niklas Schnelle
  2026-04-09  9:28   ` Tian, Kevin
  1 sibling, 0 replies; 5+ messages in thread
From: Niklas Schnelle @ 2026-04-09  9:05 UTC (permalink / raw)
  To: Michał Winiarski, Alex Williamson, intel-xe, linux-kernel,
	kvm
  Cc: Jason Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian

On Wed, 2026-04-08 at 18:03 +0200, Michał Winiarski wrote:
> The driver is implementing its own .release(), which means that it needs
> to call vfio_pci_core_release_dev().
> Add the missing call.
> 
> Reported-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Closes: https://lore.kernel.org/kvm/408e262c507e8fd628a71e39904fedd99fa0ee8e.camel@linux.ibm.com/
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/vfio/pci/xe/main.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/vfio/pci/xe/main.c b/drivers/vfio/pci/xe/main.c
> index 40ea73234dcf2..af95a011ed111 100644
> --- a/drivers/vfio/pci/xe/main.c
> +++ b/drivers/vfio/pci/xe/main.c
> @@ -528,6 +528,7 @@ static void xe_vfio_pci_release_dev(struct vfio_device *core_vdev)
>  		container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
>  
>  	xe_vfio_pci_migration_fini(xe_vdev);
> +	vfio_pci_core_release_dev(core_vdev);
>  }
>  
>  static const struct vfio_device_ops xe_vfio_pci_ops = {

Thanks for picking this up! Feel free to add my:

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>


* RE: [PATCH 1/2] vfio/xe: Reorganize the init to decouple migration from reset
  2026-04-08 16:03 [PATCH 1/2] vfio/xe: Reorganize the init to decouple migration from reset Michał Winiarski
  2026-04-08 16:03 ` [PATCH 2/2] vfio/xe: Add a missing vfio_pci_core_release_dev() Michał Winiarski
@ 2026-04-09  9:28 ` Tian, Kevin
  1 sibling, 0 replies; 5+ messages in thread
From: Tian, Kevin @ 2026-04-09  9:28 UTC (permalink / raw)
  To: Winiarski, Michal, Alex Williamson,
	intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
  Cc: Jason Gunthorpe, Yishai Hadas, Shameer Kolothum

> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Thursday, April 9, 2026 12:04 AM
> 
> xe_vfio_pci_migration_ops = {
>  static void xe_vfio_pci_migration_init(struct xe_vfio_pci_core_device
> *xe_vdev)
>  {
>  	struct vfio_device *core_vdev = &xe_vdev->core_device.vdev;
> -	struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> -	struct xe_device *xe = xe_sriov_vfio_get_pf(pdev);
> 
> -	if (!xe)
> -		return;
> -	if (!xe_sriov_vfio_migration_supported(xe))
> +	if (!xe_sriov_vfio_migration_supported(xe_vdev->xe))
>  		return;
> 
>  	mutex_init(&xe_vdev->state_mutex);
>  	spin_lock_init(&xe_vdev->reset_lock);
> 

Those two belong in xe_vfio_pci_vf_init(), since a non-migratable
VF still requires them, e.g. in xe_vfio_pci_reset_done()?
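
If the reset path does take these locks for non-migratable VFs, as the
question suggests, the split could look roughly like the userspace model
below. This is only a sketch under that assumption: every demo_* name is
hypothetical, the lock is a plain flag struct rather than a kernel mutex
or spinlock, and the real driver may need a different arrangement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel lock; tracks only init/held state. */
struct demo_lock {
	bool initialized;
	bool held;
};

/* Hypothetical, simplified stand-in for xe_vfio_pci_core_device. */
struct demo_vf_dev {
	struct demo_lock state_mutex;
	struct demo_lock reset_lock;
	unsigned int vfid;
	bool migration_enabled;
};

static void demo_lock_init(struct demo_lock *lock)
{
	lock->initialized = true;
	lock->held = false;
}

/* Migration-only setup: nothing the reset path depends on lives here. */
static void demo_migration_init(struct demo_vf_dev *dev, bool supported)
{
	if (!supported)
		return;
	dev->migration_enabled = true;
}

/* Common VF setup: locks and vfid are prepared for every VF. */
static void demo_vf_init(struct demo_vf_dev *dev, unsigned int vf_index,
			 bool migration_supported)
{
	demo_lock_init(&dev->state_mutex);
	demo_lock_init(&dev->reset_lock);
	dev->vfid = vf_index + 1;	/* 1-based, as in the patch */
	demo_migration_init(dev, migration_supported);
}

/* Reset path: refuses to run if the lock it needs was never initialized. */
static bool demo_reset_done(struct demo_vf_dev *dev)
{
	assert(dev != NULL);
	if (!dev->reset_lock.initialized)
		return false;
	dev->reset_lock.held = true;	/* take the lock */
	dev->reset_lock.held = false;	/* drop it again */
	return true;
}
```

In this model a VF initialized with migration_supported == false still
gets usable locks, so demo_reset_done() succeeds regardless of migration
support.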


* RE: [PATCH 2/2] vfio/xe: Add a missing vfio_pci_core_release_dev()
  2026-04-08 16:03 ` [PATCH 2/2] vfio/xe: Add a missing vfio_pci_core_release_dev() Michał Winiarski
  2026-04-09  9:05   ` Niklas Schnelle
@ 2026-04-09  9:28   ` Tian, Kevin
  1 sibling, 0 replies; 5+ messages in thread
From: Tian, Kevin @ 2026-04-09  9:28 UTC (permalink / raw)
  To: Winiarski, Michal, Alex Williamson,
	intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org
  Cc: Jason Gunthorpe, Yishai Hadas, Shameer Kolothum, Niklas Schnelle

> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Thursday, April 9, 2026 12:04 AM
> 
> The driver is implementing its own .release(), which means that it needs
> to call vfio_pci_core_release_dev().
> Add the missing call.
> 
> Reported-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Closes: https://lore.kernel.org/kvm/408e262c507e8fd628a71e39904fedd99fa0ee8e.camel@linux.ibm.com/
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>

This needs a Fixes: tag and Cc: stable.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

