Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	Nareshkumar Gollakoti <naresh.kumar.g@intel.com>,
	Christoph Manszewski <christoph.manszewski@intel.com>,
	Maciej Patelczyk <maciej.patelczyk@intel.com>
Subject: Re: [PATCH] drm/xe/pf: Allow to lock/unlock the PF
Date: Tue, 28 Oct 2025 18:31:09 -0700	[thread overview]
Message-ID: <aQFuXW5WUeUu1f6r@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <20251028200521.184592-1-michal.wajdeczko@intel.com>

On Tue, Oct 28, 2025 at 09:05:21PM +0100, Michal Wajdeczko wrote:
> Some driver functionalities, like eudebug or ccs-mode, can't
> be used when VFs are enabled.  Add functions to allow locking
> the PF functionality for exclusive usage (either for enabling
> VFs or to enable those other features, or simply for testing).
> Also add debugfs attributes to explicitly call those functions
> if needed.
> 

Hmm, I'm not sure about this. Why not just lock the SR-IOV master mutex
in pf_enable_vfs? If the reason is that lockdep blows up — for example,
if the master mutex is annotated with __reclaim and pf_enable_vfs
allocates memory — then you still have a potential deadlock; you've just
silenced lockdep. I'm not certain that's the case, just using it as an
example.

Given that, I'd lean toward saying no — this really, really looks
unsafe. If you'd like, get a second opinion from a locking expert (e.g.,
Thomas), but I think this is a no from me.

Matt

> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Nareshkumar Gollakoti <naresh.kumar.g@intel.com>
> Cc: Christoph Manszewski <christoph.manszewski@intel.com>
> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_pci_sriov.c        |  7 +++++
>  drivers/gpu/drm/xe/xe_sriov_pf.c         | 38 ++++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_sriov_pf.h         |  4 +++
>  drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 15 ++++++++++
>  drivers/gpu/drm/xe/xe_sriov_pf_types.h   |  3 ++
>  5 files changed, 67 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pci_sriov.c b/drivers/gpu/drm/xe/xe_pci_sriov.c
> index 735f51effc7a..e1d34860b064 100644
> --- a/drivers/gpu/drm/xe/xe_pci_sriov.c
> +++ b/drivers/gpu/drm/xe/xe_pci_sriov.c
> @@ -120,6 +120,10 @@ static int pf_enable_vfs(struct xe_device *xe, int num_vfs)
>  	if (err)
>  		goto out;
>  
> +	err = xe_sriov_pf_try_lock(xe);
> +	if (err)
> +		goto out;
> +
>  	/*
>  	 * We must hold additional reference to the runtime PM to keep PF in D0
>  	 * during VFs lifetime, as our VFs do not implement the PM capability.
> @@ -157,6 +161,7 @@ static int pf_enable_vfs(struct xe_device *xe, int num_vfs)
>  failed:
>  	xe_sriov_pf_unprovision_vfs(xe, num_vfs);
>  	xe_pm_runtime_put(xe);
> +	xe_sriov_pf_unlock(xe);
>  out:
>  	xe_sriov_notice(xe, "Failed to enable %u VF%s (%pe)\n",
>  			num_vfs, str_plural(num_vfs), ERR_PTR(err));
> @@ -186,6 +191,8 @@ static int pf_disable_vfs(struct xe_device *xe)
>  	/* not needed anymore - see pf_enable_vfs() */
>  	xe_pm_runtime_put(xe);
>  
> +	xe_sriov_pf_unlock(xe);
> +
>  	xe_sriov_info(xe, "Disabled %u VF%s\n", num_vfs, str_plural(num_vfs));
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
> index bc1ab9ee31d9..8cdd25db2cf9 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
> @@ -157,6 +157,44 @@ int xe_sriov_pf_wait_ready(struct xe_device *xe)
>  	return 0;
>  }
>  
> +/**
> + * xe_sriov_pf_try_lock() - Try to lock the PF.
> + * @xe: the PF &xe_device
> + *
> + * This function can only be called on PF.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_try_lock(struct xe_device *xe)
> +{
> +	guard(mutex)(xe_sriov_pf_master_mutex(xe));
> +
> +	if (xe->sriov.pf.owner) {
> +		xe_sriov_dbg(xe, "already locked by %ps\n", xe->sriov.pf.owner);
> +		return -EBUSY;
> +	}
> +
> +	xe->sriov.pf.owner = __builtin_return_address(0);
> +	xe_sriov_dbg_verbose(xe, "locked by %ps\n", xe->sriov.pf.owner);
> +
> +	return 0;
> +}
> +
> +/**
> + * xe_sriov_pf_unlock() - Unlock the PF.
> + * @xe: the PF &xe_device
> + *
> + * This function can only be called on PF.
> + */
> +void xe_sriov_pf_unlock(struct xe_device *xe)
> +{
> +	guard(mutex)(xe_sriov_pf_master_mutex(xe));
> +
> +	xe_assert(xe, xe->sriov.pf.owner);
> +	xe_sriov_dbg_verbose(xe, "unlocked by %ps\n", __builtin_return_address(0));
> +	xe->sriov.pf.owner = NULL;
> +}
> +
>  /**
>   * xe_sriov_pf_print_vfs_summary - Print SR-IOV PF information.
>   * @xe: the &xe_device to print info from
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.h b/drivers/gpu/drm/xe/xe_sriov_pf.h
> index cba3fde9581f..2261596bb4fe 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf.h
> @@ -17,11 +17,15 @@ bool xe_sriov_pf_readiness(struct xe_device *xe);
>  int xe_sriov_pf_init_early(struct xe_device *xe);
>  int xe_sriov_pf_init_late(struct xe_device *xe);
>  int xe_sriov_pf_wait_ready(struct xe_device *xe);
> +int xe_sriov_pf_try_lock(struct xe_device *xe);
> +void xe_sriov_pf_unlock(struct xe_device *xe);
>  void xe_sriov_pf_print_vfs_summary(struct xe_device *xe, struct drm_printer *p);
>  #else
>  static inline bool xe_sriov_pf_readiness(struct xe_device *xe) { return false; }
>  static inline int xe_sriov_pf_init_early(struct xe_device *xe) { return 0; }
>  static inline int xe_sriov_pf_init_late(struct xe_device *xe) { return 0; }
> +static inline int xe_sriov_pf_try_lock(struct xe_device *xe) { return 0; }
> +static inline void xe_sriov_pf_unlock(struct xe_device *xe) { }
>  #endif
>  
>  #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> index a81aa05c5532..7c011462244d 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> @@ -96,12 +96,27 @@ static inline int xe_sriov_pf_restore_auto_provisioning(struct xe_device *xe)
>  	return xe_sriov_pf_provision_set_mode(xe, XE_SRIOV_PROVISIONING_MODE_AUTO);
>  }
>  
> +static inline int xe_sriov_pf_try_lock_pf(struct xe_device *xe)
> +{
> +	return xe_sriov_pf_try_lock(xe);
> +}
> +
> +static inline int xe_sriov_pf_force_unlock_pf(struct xe_device *xe)
> +{
> +	xe_sriov_pf_unlock(xe);
> +	return 0;
> +}
> +
>  DEFINE_SRIOV_ATTRIBUTE(restore_auto_provisioning);
> +DEFINE_SRIOV_ATTRIBUTE(try_lock_pf);
> +DEFINE_SRIOV_ATTRIBUTE(force_unlock_pf);
>  
>  static void pf_populate_root(struct xe_device *xe, struct dentry *dent)
>  {
>  	debugfs_create_file("restore_auto_provisioning", 0200, dent, xe,
>  			    &restore_auto_provisioning_fops);
> +	debugfs_create_file("try_lock_pf", 0200, dent, xe, &try_lock_pf_fops);
> +	debugfs_create_file("force_unlock_pf", 0200, dent, xe, &force_unlock_pf_fops);
>  }
>  
>  static int simple_show(struct seq_file *m, void *data)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> index c753cd59aed2..91da3c979922 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> @@ -36,6 +36,9 @@ struct xe_device_pf {
>  	/** @master_lock: protects all VFs configurations across GTs */
>  	struct mutex master_lock;
>  
> +	/** @owner: the RET_IP of the owner who locked the PF */
> +	void *owner;
> +
>  	/** @provision: device level provisioning data. */
>  	struct xe_sriov_pf_provision provision;
>  
> -- 
> 2.47.1
> 


Thread overview: 8+ messages
2025-10-28 20:05 [PATCH] drm/xe/pf: Allow to lock/unlock the PF Michal Wajdeczko
2025-10-28 22:19 ` ✓ CI.KUnit: success for " Patchwork
2025-10-28 22:57 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-29  1:31 ` Matthew Brost [this message]
2025-10-29  8:14   ` [PATCH] " Michal Wajdeczko
2025-10-29 11:02     ` Manszewski, Christoph
2025-10-29  1:34 ` Matthew Brost
2025-10-29 10:28 ` ✗ Xe.CI.Full: failure for " Patchwork
