All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev,
	"Michał Winiarski" <michal.winiarski@intel.com>,
	"Kevin Tian" <kevin.tian@intel.com>,
	"Alex Williamson" <alex@shazbot.org>
Subject: [PATCH 7.0 42/76] vfio/xe: Reorganize the init to decouple migration from reset
Date: Mon, 20 Apr 2026 17:41:53 +0200	[thread overview]
Message-ID: <20260420153912.358168480@linuxfoundation.org> (raw)
In-Reply-To: <20260420153910.810034134@linuxfoundation.org>

7.0-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michał Winiarski <michal.winiarski@intel.com>

commit 1b81ed612e12ea9df8c5cb6f0ddd4419fd0b8ac8 upstream.

Attempting to issue reset on VF devices that don't support migration
leads to the following:

  BUG: unable to handle page fault for address: 00000000000011f8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0000 [#1] SMP NOPTI
  CPU: 2 UID: 0 PID: 7443 Comm: xe_sriov_flr Tainted: G S   U              7.0.0-rc1-lgci-xe-xe-4588-cec43d5c2696af219-nodebug+ #1 PREEMPT(lazy)
  Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
  Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
  RIP: 0010:xe_sriov_vfio_wait_flr_done+0xc/0x80 [xe]
  Code: ff c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 54 53 <83> bf f8 11 00 00 02 75 61 41 89 f4 85 f6 74 52 48 8b 47 08 48 89
  RSP: 0018:ffffc9000f7c39b8 EFLAGS: 00010202
  RAX: ffffffffa04d8660 RBX: ffff88813e3e4000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffffc9000f7c39c8 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff888101a48800
  R13: ffff88813e3e4150 R14: ffff888130d0d008 R15: ffff88813e3e40d0
  FS:  00007877d3d0d940(0000) GS:ffff88890b6d3000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000011f8 CR3: 000000015a762000 CR4: 0000000000f52ef0
  PKRU: 55555554
  Call Trace:
   <TASK>
   xe_vfio_pci_reset_done+0x49/0x120 [xe_vfio_pci]
   pci_dev_restore+0x3b/0x80
   pci_reset_function+0x109/0x140
   reset_store+0x5c/0xb0
   dev_attr_store+0x17/0x40
   sysfs_kf_write+0x72/0x90
   kernfs_fop_write_iter+0x161/0x1f0
   vfs_write+0x261/0x440
   ksys_write+0x69/0xf0
   __x64_sys_write+0x19/0x30
   x64_sys_call+0x259/0x26e0
   do_syscall_64+0xcb/0x1500
   ? __fput+0x1a2/0x2d0
   ? fput_close_sync+0x3d/0xa0
   ? __x64_sys_close+0x3e/0x90
   ? x64_sys_call+0x1b7c/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? __task_pid_nr_ns+0x68/0x100
   ? __do_sys_getpid+0x1d/0x30
   ? x64_sys_call+0x10b5/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? putname+0x41/0x90
   ? do_faccessat+0x1e8/0x300
   ? __x64_sys_access+0x1c/0x30
   ? x64_sys_call+0x1822/0x26e0
   ? do_syscall_64+0x109/0x1500
   ? tick_program_event+0x43/0xa0
   ? hrtimer_interrupt+0x126/0x260
   ? irqentry_exit+0xb2/0x710
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7877d5f1c5a4
  Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d a5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
  RSP: 002b:00007fff48e5f908 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007877d5f1c5a4
  RDX: 0000000000000001 RSI: 00007877d621b0c9 RDI: 0000000000000009
  RBP: 0000000000000001 R08: 00005fb49113b010 R09: 0000000000000007
  R10: 0000000000000000 R11: 0000000000000202 R12: 00007877d621b0c9
  R13: 0000000000000009 R14: 00007fff48e5fac0 R15: 00007fff48e5fac0
   </TASK>

This is caused by the fact that some of the xe_vfio_pci_core_device
members needed for handling reset are only initialized as part of
migration init.

Fix the problem by reorganizing the code to decouple VF init from
migration init.

Fixes: 1f5556ec8b9ef ("vfio/xe: Add device specific vfio_pci driver variant for Intel graphics")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7352
Cc: stable@vger.kernel.org
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20260410224948.900550-1-michal.winiarski@intel.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/vfio/pci/xe/main.c |   43 +++++++++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 18 deletions(-)

--- a/drivers/vfio/pci/xe/main.c
+++ b/drivers/vfio/pci/xe/main.c
@@ -454,39 +454,46 @@ static const struct vfio_migration_ops x
 static void xe_vfio_pci_migration_init(struct xe_vfio_pci_core_device *xe_vdev)
 {
 	struct vfio_device *core_vdev = &xe_vdev->core_device.vdev;
-	struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
-	struct xe_device *xe = xe_sriov_vfio_get_pf(pdev);
 
-	if (!xe)
-		return;
-	if (!xe_sriov_vfio_migration_supported(xe))
+	if (!xe_sriov_vfio_migration_supported(xe_vdev->xe))
 		return;
 
-	mutex_init(&xe_vdev->state_mutex);
-	spin_lock_init(&xe_vdev->reset_lock);
-
-	/* PF internal control uses vfid index starting from 1 */
-	xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
-	xe_vdev->xe = xe;
-
 	core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P;
 	core_vdev->mig_ops = &xe_vfio_pci_migration_ops;
 }
 
-static void xe_vfio_pci_migration_fini(struct xe_vfio_pci_core_device *xe_vdev)
+static int xe_vfio_pci_vf_init(struct xe_vfio_pci_core_device *xe_vdev)
 {
-	if (!xe_vdev->vfid)
-		return;
+	struct vfio_device *core_vdev = &xe_vdev->core_device.vdev;
+	struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
+	struct xe_device *xe = xe_sriov_vfio_get_pf(pdev);
 
-	mutex_destroy(&xe_vdev->state_mutex);
+	if (!pdev->is_virtfn)
+		return 0;
+	if (!xe)
+		return -ENODEV;
+	xe_vdev->xe = xe;
+
+	/* PF internal control uses vfid index starting from 1 */
+	xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
+
+	xe_vfio_pci_migration_init(xe_vdev);
+
+	return 0;
 }
 
 static int xe_vfio_pci_init_dev(struct vfio_device *core_vdev)
 {
 	struct xe_vfio_pci_core_device *xe_vdev =
 		container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+	int ret;
 
-	xe_vfio_pci_migration_init(xe_vdev);
+	mutex_init(&xe_vdev->state_mutex);
+	spin_lock_init(&xe_vdev->reset_lock);
+
+	ret = xe_vfio_pci_vf_init(xe_vdev);
+	if (ret)
+		return ret;
 
 	return vfio_pci_core_init_dev(core_vdev);
 }
@@ -496,7 +503,7 @@ static void xe_vfio_pci_release_dev(stru
 	struct xe_vfio_pci_core_device *xe_vdev =
 		container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
 
-	xe_vfio_pci_migration_fini(xe_vdev);
+	mutex_destroy(&xe_vdev->state_mutex);
 }
 
 static const struct vfio_device_ops xe_vfio_pci_ops = {



  parent reply	other threads:[~2026-04-20 15:45 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-20 15:41 [PATCH 7.0 00/76] 7.0.1-rc1 review Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 01/76] nfc: llcp: add missing return after LLCP_CLOSED checks Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 02/76] x86/CPU: Fix FPDSS on Zen1 Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 03/76] can: raw: fix ro->uniq use-after-free in raw_rcv() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 04/76] i2c: s3c24xx: check the size of the SMBUS message before using it Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 05/76] staging: rtl8723bs: initialize le_tmp64 in rtw_BIP_verify() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 06/76] HID: alps: fix NULL pointer dereference in alps_raw_event() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 07/76] HID: core: clamp report_size in s32ton() to avoid undefined shift Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 08/76] net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 09/76] NFC: digital: Bounds check NFC-A cascade depth in SDD response handler Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 10/76] drm/vc4: platform_get_irq_byname() returns an int Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 11/76] bnge: return after auxiliary_device_uninit() in error path Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 12/76] ALSA: usx2y: us144mkii: fix NULL deref on missing interface 0 Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 13/76] ALSA: fireworks: bound device-supplied status before string array lookup Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 14/76] fbdev: tdfxfb: avoid divide-by-zero on FBIOPUT_VSCREENINFO Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 15/76] usb: gadget: f_ncm: validate minimum block_len in ncm_unwrap_ntb() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 16/76] usb: gadget: f_phonet: fix skb frags[] overflow in pn_rx_complete() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 17/76] usb: gadget: renesas_usb3: validate endpoint index in standard request handlers Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 18/76] smb: client: fix off-by-8 bounds check in check_wsl_eas() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 19/76] smb: client: fix OOB reads parsing symlink error response Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 20/76] ksmbd: validate EaNameLength in smb2_get_ea() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 21/76] ksmbd: require 3 sub-authorities before reading sub_auth[2] Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 22/76] ksmbd: fix mechToken leak when SPNEGO decode fails after token alloc Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 23/76] smb: client: avoid double-free in smbd_free_send_io() after smbd_send_batch_flush() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 24/76] smb: server: avoid double-free in smb_direct_free_sendmsg after smb_direct_flush_send_list() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 25/76] usbip: validate number_of_packets in usbip_pack_ret_submit() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 26/76] usb: typec: fusb302: Switch to threaded IRQ handler Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 27/76] usb: storage: Expand range of matched versions for VL817 quirks entry Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 28/76] USB: cdc-acm: Add quirks for Yoga Book 9 14IAH10 INGENIC touchscreen Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 29/76] usb: gadget: f_hid: dont call cdev_init while cdev in use Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 30/76] usb: port: add delay after usb_hub_set_port_power() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 31/76] fbdev: udlfb: avoid divide-by-zero on FBIOPUT_VSCREENINFO Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 32/76] scripts/gdb/symbols: handle module path parameters Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 33/76] scripts: generate_rust_analyzer.py: avoid FD leak Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 34/76] wifi: rtw88: fix device leak on probe failure Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 35/76] staging: sm750fb: fix division by zero in ps_to_hz() Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 36/76] selftests/mm: hmm-tests: dont hardcode THP size to 2MB Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 37/76] USB: serial: option: add Telit Cinterion FN990A MBIM composition Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 38/76] Docs/admin-guide/mm/damon/reclaim: warn commit_inputs vs param updates race Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 39/76] Docs/admin-guide/mm/damon/lru_sort: " Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 40/76] ALSA: ctxfi: Limit PTP to a single page Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 41/76] dcache: Limit the minimal number of bucket to two Greg Kroah-Hartman
2026-04-20 15:41 ` Greg Kroah-Hartman [this message]
2026-04-20 15:41 ` [PATCH 7.0 43/76] arm64: mm: Handle invalid large leaf mappings correctly Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 44/76] media: vidtv: fix NULL pointer dereference in vidtv_channel_pmt_match_sections Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 45/76] ocfs2: fix possible deadlock between unlink and dio_end_io_write Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 46/76] ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 47/76] ocfs2: handle invalid dinode in ocfs2_group_extend Greg Kroah-Hartman
2026-04-20 15:41 ` [PATCH 7.0 48/76] PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in epf_ntb_epc_cleanup Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 49/76] PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 50/76] KVM: selftests: Remove duplicate LAUNCH_UPDATE_VMSA call in SEV-ES migrate test Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 51/76] KVM: SEV: Reject attempts to sync VMSA of an already-launched/encrypted vCPU Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 52/76] KVM: SEV: Protect *all* of sev_mem_enc_register_region() with kvm->lock Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 53/76] KVM: SEV: Disallow LAUNCH_FINISH if vCPUs are actively being created Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 54/76] KVM: SEV: Lock all vCPUs when synchronzing VMSAs for SNP launch finish Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 55/76] KVM: SEV: Drop WARN on large size for KVM_MEMORY_ENCRYPT_REG_REGION Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 56/76] mm: call ->free_folio() directly in folio_unmap_invalidate() Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 57/76] checkpatch: add support for Assisted-by tag Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 58/76] x86-64: rename misleadingly named __copy_user_nocache() function Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 59/76] x86: rename and clean up __copy_from_user_inatomic_nocache() Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 60/76] x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 61/76] KVM: x86: Use scratch field in MMIO fragment to hold small write values Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 62/76] ASoC: qcom: q6apm: move component registration to unmanaged version Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 63/76] mm/kasan: fix double free for kasan pXds Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 64/76] mm: blk-cgroup: fix use-after-free in cgwb_release_workfn() Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 65/76] media: vidtv: fix nfeeds state corruption on start_streaming failure Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 66/76] media: mediatek: vcodec: fix use-after-free in encoder release path Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 67/76] media: em28xx: fix use-after-free in em28xx_v4l2_open() Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 68/76] hwmon: (powerz) Fix use-after-free on USB disconnect Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 69/76] ALSA: 6fire: fix use-after-free on disconnect Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 70/76] bcache: fix cached_dev.sb_bio use-after-free and crash Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 71/76] wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 72/76] media: as102: fix to not free memory after the device is registered in as102_usb_probe() Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 73/76] nilfs2: fix NULL i_assoc_inode dereference in nilfs_mdt_save_to_shadow_map Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 74/76] media: vidtv: fix pass-by-value structs causing MSAN warnings Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 75/76] media: hackrf: fix to not free memory after the device is registered in hackrf_probe() Greg Kroah-Hartman
2026-04-20 15:42 ` [PATCH 7.0 76/76] mm/userfaultfd: fix hugetlb fault mutex hash calculation Greg Kroah-Hartman
2026-04-20 17:32 ` [PATCH 7.0 00/76] 7.0.1-rc1 review Ronald Warsow
2026-04-20 18:17 ` Florian Fainelli
2026-04-20 21:58   ` Luna Jernberg
2026-04-20 22:28 ` Peter Schneider
2026-04-20 23:08 ` Takeshi Ogasawara
2026-04-21  6:52 ` Ron Economos
2026-04-21  8:09 ` Brett A C Sheffield
2026-04-21 10:22 ` Miguel Ojeda
2026-04-21 16:45 ` Shuah Khan
2026-04-21 16:47 ` Josh Law
2026-04-21 20:04 ` Mark Brown
2026-04-22  6:12 ` Barry K. Nathan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260420153912.358168480@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=alex@shazbot.org \
    --cc=kevin.tian@intel.com \
    --cc=michal.winiarski@intel.com \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.