From: Matthew Auld <matthew.auld@intel.com>
To: Raag Jadav <raag.jadav@intel.com>, intel-xe@lists.freedesktop.org
Cc: matthew.brost@intel.com, thomas.hellstrom@linux.intel.com,
	himal.prasad.ghimiray@intel.com, matthew.d.roper@intel.com
Subject: Re: [PATCH v1] drm/xe: Drop dma mappings for wedged device
Date: Tue, 24 Mar 2026 12:52:30 +0000
Message-ID: <019ee8de-9268-4706-841b-25d9b0818f1a@intel.com>
In-Reply-To: <20260324071529.447319-1-raag.jadav@intel.com>

On 24/03/2026 07:13, Raag Jadav wrote:
> As per uapi documentation[1], the prerequisite for wedged device is to
> drop all dma mappings. Reuse xe_bo_pci_dev_remove_pinned() for this,
> which iterates over external bo list and removes all dma mappings.
> 
> [1] Documentation/gpu/drm-uapi.rst

Can you point to where it says that? Do you just mean the "disabling
DMA to system memory" part?

One other thing that jumps out is:

"All existing mmaps should be invalidated and
page faults should be redirected to a dummy page"

Are we also missing that? We have the dummy page flow, but do we 
actually force everything to be refaulted?

Something like:

  /* Clear all CPU mappings pointing to this device */
  unmap_mapping_range(dev->anon_inode->i_mapping, 0, 0, 1);
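
To spell out what the arguments in that call mean, here is a minimal
sketch. It is not the actual xe change: the struct definitions and the
unmap_mapping_range() stub are user-space stand-ins so the snippet
compiles outside the kernel tree, and the helper name
wedged_zap_cpu_mappings() is hypothetical.

```c
/*
 * User-space stand-ins for the kernel types, only so that this sketch
 * compiles outside the kernel tree. The bookkeeping fields below are
 * assumptions for illustration and do not exist in the kernel structs.
 */
struct address_space {
	int zap_calls;
	long long last_holebegin;
	long long last_holelen;
	int last_even_cows;
};

struct inode {
	struct address_space *i_mapping;
};

struct drm_device {
	struct inode *anon_inode;
};

/*
 * Stub with the same parameter order as the kernel function:
 * (mapping, holebegin, holelen, even_cows). It only records its
 * arguments; the real function tears down the CPU page table entries.
 */
static void unmap_mapping_range(struct address_space *mapping,
				long long holebegin, long long holelen,
				int even_cows)
{
	mapping->zap_calls++;
	mapping->last_holebegin = holebegin;
	mapping->last_holelen = holelen;
	mapping->last_even_cows = even_cows;
}

/* Hypothetical helper: zap every CPU mapping of the DRM device. */
static void wedged_zap_cpu_mappings(struct drm_device *dev)
{
	/*
	 * holebegin = 0, holelen = 0: start at offset 0 and cover
	 * everything up to the end of the mapping.
	 * even_cows = 1: also drop private copy-on-write pages.
	 */
	unmap_mapping_range(dev->anon_inode->i_mapping, 0, 0, 1);
}
```

After such a zap, the next CPU access to any of those mappings has to
fault again, which is the point where the existing dummy page flow
could take over.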

> 
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
> PS: This is pretty much uncharted territory for me, so please consider
> this an RFC.
> 
>   drivers/gpu/drm/xe/xe_bo_evict.c | 8 +++++++-
>   drivers/gpu/drm/xe/xe_bo_evict.h | 1 +
>   drivers/gpu/drm/xe/xe_device.c   | 2 ++
>   3 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
> index 7661fca7f278..f741cda50b2d 100644
> --- a/drivers/gpu/drm/xe/xe_bo_evict.c
> +++ b/drivers/gpu/drm/xe/xe_bo_evict.c
> @@ -270,7 +270,13 @@ int xe_bo_restore_late(struct xe_device *xe)
>   	return ret;
>   }
>   
> -static void xe_bo_pci_dev_remove_pinned(struct xe_device *xe)
> +/**
> + * xe_bo_pci_dev_remove_pinned() - Unmap external bos
> + * @xe: xe device
> + *
> + * Drop dma mappings of all external pinned bos.
> + */
> +void xe_bo_pci_dev_remove_pinned(struct xe_device *xe)
>   {
>   	struct xe_tile *tile;
>   	unsigned int id;
> diff --git a/drivers/gpu/drm/xe/xe_bo_evict.h b/drivers/gpu/drm/xe/xe_bo_evict.h
> index e8385cb7f5e9..6ce27e272780 100644
> --- a/drivers/gpu/drm/xe/xe_bo_evict.h
> +++ b/drivers/gpu/drm/xe/xe_bo_evict.h
> @@ -15,6 +15,7 @@ void xe_bo_notifier_unprepare_all_pinned(struct xe_device *xe);
>   int xe_bo_restore_early(struct xe_device *xe);
>   int xe_bo_restore_late(struct xe_device *xe);
>   
> +void xe_bo_pci_dev_remove_pinned(struct xe_device *xe);
>   void xe_bo_pci_dev_remove_all(struct xe_device *xe);
>   
>   int xe_bo_pinned_init(struct xe_device *xe);
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 207ad2eea412..ac51b04560df 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -1351,6 +1351,8 @@ void xe_device_declare_wedged(struct xe_device *xe)
>   	for_each_gt(gt, xe, id)
>   		xe_gt_declare_wedged(gt);
>   
> +	xe_bo_pci_dev_remove_pinned(xe);

AFAIK this just removes the IOMMU mappings for kernel BOs (a small
subset of BOs), if there are any. Also, if you are not using an IOMMU,
then DMA between the GPU and system memory is still possible. And for
userspace BOs nothing changes. But I guess this is still better than
nothing and will maybe catch some misuse?

> +
>   	if (xe_device_wedged(xe)) {
>   		/*
>   		 * XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET is intended for debugging


