Message-ID: <52e4f2e61ab7df9b46c1c16ac4dfde929732629a.camel@linux.intel.com>
Subject: Re: [PATCH v2] drm/xe: Drop all mappings for wedged device
From: Thomas Hellström
To: Matthew Auld, Matthew Brost, Raag Jadav
Cc: intel-xe@lists.freedesktop.org, himal.prasad.ghimiray@intel.com, matthew.d.roper@intel.com
Date: Mon, 30 Mar 2026 09:45:22 +0200
In-Reply-To: <156d287e-117f-4575-a90c-3aaa233ed670@intel.com>
References: <20260326132816.739363-1-raag.jadav@intel.com> <9099f0ef-87a9-42f6-888f-57bb73f6d6ae@intel.com> <2759679af38d84c75e43b19ef5a93681f789ff28.camel@linux.intel.com> <156d287e-117f-4575-a90c-3aaa233ed670@intel.com>
Organization: Intel Sweden AB, Registration Number: 556189-6027
List-Id: Intel Xe graphics driver

On Fri, 2026-03-27 at 15:03 +0000, Matthew Auld wrote:
> On 27/03/2026 10:41, Thomas Hellström wrote:
> > On Fri, 2026-03-27 at 10:18 +0000, Matthew Auld wrote:
> > > On 26/03/2026 21:19, Matthew Brost wrote:
> > > > On Thu, Mar 26, 2026 at 06:58:16PM +0530, Raag Jadav wrote:
> > > > > As per uapi documentation[1], the prerequisite for wedged device is to
> > > > > drop all memory mappings. Follow it.
> > > > >
> > > > > [1] Documentation/gpu/drm-uapi.rst
> > > > >
> > > > > v2: Also drop CPU mappings (Matthew Auld)
> > > > >
> > > > > Fixes: 7bc00751f877 ("drm/xe: Use device wedged event")
> > > > > Signed-off-by: Raag Jadav
> > > > > ---
> > > > >  drivers/gpu/drm/xe/xe_bo_evict.c | 8 +++++++-
> > > > >  drivers/gpu/drm/xe/xe_bo_evict.h | 1 +
> > > > >  drivers/gpu/drm/xe/xe_device.c   | 5 +++++
> > > > >  3 files changed, 13 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
> > > > > index 7661fca7f278..f741cda50b2d 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_bo_evict.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_bo_evict.c
> > > > > @@ -270,7 +270,13 @@ int xe_bo_restore_late(struct xe_device *xe)
> > > > >  	return ret;
> > > > >  }
> > > > >
> > > > > -static void xe_bo_pci_dev_remove_pinned(struct xe_device *xe)
> > > > > +/**
> > > > > + * xe_bo_pci_dev_remove_pinned() - Unmap external bos
> > > > > + * @xe: xe device
> > > > > + *
> > > > > + * Drop dma mappings of all external pinned bos.
> > > > > + */
> > > > > +void xe_bo_pci_dev_remove_pinned(struct xe_device *xe)
> > > > >  {
> > > > >  	struct xe_tile *tile;
> > > > >  	unsigned int id;
> > > > > diff --git a/drivers/gpu/drm/xe/xe_bo_evict.h b/drivers/gpu/drm/xe/xe_bo_evict.h
> > > > > index e8385cb7f5e9..6ce27e272780 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_bo_evict.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_bo_evict.h
> > > > > @@ -15,6 +15,7 @@ void xe_bo_notifier_unprepare_all_pinned(struct xe_device *xe);
> > > > >  int xe_bo_restore_early(struct xe_device *xe);
> > > > >  int xe_bo_restore_late(struct xe_device *xe);
> > > > >
> > > > > +void xe_bo_pci_dev_remove_pinned(struct xe_device *xe);
> > > > >  void xe_bo_pci_dev_remove_all(struct xe_device *xe);
> > > > >
> > > > >  int xe_bo_pinned_init(struct xe_device *xe);
> > > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > > > index b17d4a878686..4c0097f3aefb 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > > > @@ -1347,6 +1347,11 @@ void xe_device_declare_wedged(struct xe_device *xe)
> > > > >  	for_each_gt(gt, xe, id)
> > > > >  		xe_gt_declare_wedged(gt);
> > > > >
> > > > > +	/* Drop dma mappings of external bos */
> > > > > +	xe_bo_pci_dev_remove_pinned(xe);
> > > >
> > > > Do we even need the part above? unmap_mapping_range() should drop all
> > > > DMA mappings for the device being wedged, right? In other words, the device
> > > > should no longer be able to access system memory or other devices’ memory
> > > > via PCIe P2P. I'm not 100% sure about this, though.
> > >
> > > AFAIK unmap_mapping_range() is just for the CPU mmap side.
> > > It should
> > > ensure ~everything is refaulted on the next CPU access, so we can point
> > > to dummy page.
> > >
> > > For dma mapping side, I'm still not completely sure what the best
> > > approach is. On the one hand, device is wedged so we should not really
> > > be doing new GPU access? Ioctls are all blocked, and with below, CPU
> > > access will be re-directed to dummy page. So perhaps doing nothing for
> > > dma mapping side is OK? If we want to actually remove all dma mappings
> > > for extra safety, I think closest thing is maybe purge all BOs? Similar
> > > to what we do for an unplug.
> >
> > It sounds like, when the device is wedged beyond recovery, not even the
> > unplug / pcie device unbind path should be doing any hardware
> > accesses. So if that path is fixed up to avoid that, then perhaps we
> > can just unbind the pcie device just after wedging? That is, of course
> > if it's acceptable that the drm_device <-> pcie device association is
> > broken.
>
> Yeah, it does seem kind of similar to the unplug flow, just that device
> is potentially inaccessible through that. All of the devm cleanup, plus
> already nuking CPU mappings and the purge stuff is relevant for wedge,
> I think.
>
> Do we know if coredump is maybe interesting after wedge? Maybe that is
> one potential issue, since it looks like that is attached to the
> physical device? AFAICT that would currently get nuked on unbind/unplug?
> Perhaps if we get into a wedge state, the user might want to collect any
> logs/coredump first, assuming there are some?
>
> Also thinking some more, there is also the dma-buf exported to another
> device edge case, like with VRAM? With unplug, I think we move it to
> system memory and notify the importer. But here, if device is cooked, we
> maybe can't actually move it with hw?
> Do we know if this is handled in
> some way already?

For dma-buf, there is a new interface landing to "revoke" a dma-buf
completely. I think it's already in the code, but I'm not fully sure.
We should take a look at that. Otherwise it sounds like, for dynamic
importers at least, we need to reject mapping and run move_notify() on
all exported dma-bufs.

Apart from that, following
https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#device-wedging
and from discussions with Rodrigo, IMO we need to:

* Block user-space mappings of device memory - unmap_mapping_range().
* Figure out how to handle device-private SVM pages. At the very least
  we need to invalidate the mm ranges, error on new migration attempts
  and block dma-setup for peer devices.
* Kill all dma-mappings.
* Signal all fences.
* Block and drain user-space accesses (ioctl, read, write, fault).
  Looks like much is blocked today but not drained. To drain we need to
  be using drm_dev_enter() and drm_dev_exit() and add wedge
  synchronization with those.
* Figure out which remaining user-space accesses are needed for
  post-wedge access to debug / telemetry data.
* Ensure this all doesn't make unplug / devres / dmres cleanup trip.

/Thomas

>
> >
> > /Thomas
> >
> >
> >
> > >
> > > So perhaps xe_bo_pci_dev_remove_all() is better here?
> > > Also I guess would
> > > need:
> > >
> > > @@ -349,7 +349,8 @@ static void xe_evict_flags(struct ttm_buffer_object *tbo,
> > >                  return;
> > >          }
> > >
> > > -       if (device_unplugged && !tbo->base.dma_buf) {
> > > +       if ((device_unplugged || xe_device_wedged(xe)) &&
> > > +           !tbo->base.dma_buf) {
> > >                  *placement = purge_placement;
> > >                  return;
> > >          }
> > >
> > > >
> > > > Matt
> > > >
> > > > > +	/* Drop all CPU mappings pointing to this device */
> > > > > +	unmap_mapping_range(xe->drm.anon_inode->i_mapping, 0, 0, 1);
> > > > > +
> > > > >  	if (xe_device_wedged(xe)) {
> > > > >  		/*
> > > > >  		 * XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET is intended for debugging
> > > > > --
> > > > > 2.43.0
> > > > >
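
For illustration only, the "block and drain" item in the list above can be modelled in plain userspace C. This is a sketch under stated assumptions, not xe or DRM code: struct dev_model and every model_*() function are hypothetical names, and a simple in-flight counter stands in for whatever synchronization drm_dev_enter() / drm_dev_exit() provide in the kernel. It only shows the distinction raised in the thread: "blocked" means new accesses are refused, "drained" means accesses that got in before the wedge have also finished.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in xe or DRM. */
struct dev_model {
	atomic_bool wedged;	/* set once on wedge, never cleared */
	atomic_int in_flight;	/* accesses currently between enter and exit */
};

static void model_init(struct dev_model *d)
{
	atomic_init(&d->wedged, false);
	atomic_init(&d->in_flight, 0);
}

/*
 * Stand-in for an enter-style guard around an ioctl/read/write/fault path.
 * The counter is raised before the flag is checked, so a wedger that sets
 * the flag and then reads a zero counter cannot miss a concurrent entry.
 */
static bool model_enter(struct dev_model *d)
{
	atomic_fetch_add(&d->in_flight, 1);
	if (atomic_load(&d->wedged)) {
		atomic_fetch_sub(&d->in_flight, 1);
		return false;	/* blocked: caller must not touch the device */
	}
	return true;
}

/* Stand-in for the matching exit; must pair with a successful enter. */
static void model_exit(struct dev_model *d)
{
	int prev = atomic_fetch_sub(&d->in_flight, 1);

	assert(prev > 0);
}

/* "Block": every model_enter() that starts after this returns false. */
static void model_wedge_block(struct dev_model *d)
{
	atomic_store(&d->wedged, true);
}

/*
 * "Drain": true once no access that entered before the wedge is still
 * running. A real implementation would sleep until this holds rather
 * than poll it.
 */
static bool model_drained(struct dev_model *d)
{
	return atomic_load(&d->in_flight) == 0;
}
```

In this model, an access that entered before model_wedge_block() keeps model_drained() false until it calls model_exit(), which is the window that is "blocked but not drained". Only after draining would it be safe to kill dma-mappings or tear down state those accesses might still be using.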