From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
	Badal Nilawar <badal.nilawar@intel.com>,
	 Lucas De Marchi <lucas.demarchi@intel.com>,
	Nirmoy Das <nirmoy.das@intel.com>
Subject: Re: [RFC 1/9] drm/xe: Error handling in xe_force_wake_get()
Date: Thu, 5 Sep 2024 15:29:42 -0400
Message-ID: <ZtoGpjVmZbwLsD1s@intel.com>
In-Reply-To: <20240830052326.3707019-2-himal.prasad.ghimiray@intel.com>

On Fri, Aug 30, 2024 at 10:53:18AM +0530, Himal Prasad Ghimiray wrote:
> If an acknowledgment timeout occurs for a domain awake request, put to
> sleep all domains awakened by the caller and decrease the reference
> count for all requested domains. This prevents xe_force_wake_get() from
> leaving an unhandled reference count in case of failure.
> While at it, add simple kernel-doc for xe_force_wake_get() and
> xe_force_wake_put() functions.
> 
> Cc: Badal Nilawar <badal.nilawar@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: Nirmoy Das <nirmoy.das@intel.com>
> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_force_wake.c | 52 +++++++++++++++++++++++++++---
>  1 file changed, 47 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_force_wake.c b/drivers/gpu/drm/xe/xe_force_wake.c
> index b263fff15273..8aa8d9b41052 100644
> --- a/drivers/gpu/drm/xe/xe_force_wake.c
> +++ b/drivers/gpu/drm/xe/xe_force_wake.c
> @@ -150,31 +150,73 @@ static int domain_sleep_wait(struct xe_gt *gt,
>  					 (ffs(tmp__) - 1))) && \
>  					 domain__->reg_ctl.addr)
>  
> +/**
> + * xe_force_wake_get : Increase the domain refcount; if it was 0 initially, wake the domain
> + * @fw: struct xe_force_wake
> + * @domains: forcewake domains to get refcount on
> + *
> + * Increment refcount for the force-wake domain. If the domain is
> + * asleep, awaken it and wait for acknowledgment within the specified
> + * timeout. If a timeout occurs, decrement the refcount and put the
> + * caller awaken domains to sleep.
> + *
> + * Return: 0 on success or 1 on ack timeout from domains.

 * Return: 0 on success, negative error code otherwise.

> + */
>  int xe_force_wake_get(struct xe_force_wake *fw,
>  		      enum xe_force_wake_domains domains)
>  {
>  	struct xe_gt *gt = fw->gt;
>  	struct xe_force_wake_domain *domain;
> -	enum xe_force_wake_domains tmp, woken = 0;
> +	enum xe_force_wake_domains tmp, awake_rqst = 0, awake_ack = 0;
>  	unsigned long flags;
>  	int ret = 0;
>  
>  	spin_lock_irqsave(&fw->lock, flags);
>  	for_each_fw_domain_masked(domain, domains, fw, tmp) {
>  		if (!domain->ref++) {
> -			woken |= BIT(domain->id);
> +			awake_rqst |= BIT(domain->id);
>  			domain_wake(gt, domain);
>  		}
>  	}
> -	for_each_fw_domain_masked(domain, woken, fw, tmp) {
> -		ret |= domain_wake_wait(gt, domain);

Now you suppress the mmio error code...
It would be better to find a way to propagate it.
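
To illustrate the kind of propagation meant here, below is a minimal
user-space sketch, not the driver code. It assumes the wait helper returns
0 or a negative errno. struct fake_domain, fake_wake_wait() and
fake_fw_get() are hypothetical stand-ins made up for this example. Locking
and the actual register writes are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xe_force_wake_domain. */
struct fake_domain {
	unsigned int id;
	unsigned int ref;
	int wait_result;	/* value the simulated ack wait returns */
};

/* Simulated ack wait: returns 0 or a negative errno (assumption). */
static int fake_wake_wait(const struct fake_domain *d)
{
	return d->wait_result;
}

/*
 * Take a reference on every domain selected by @mask and wait for the
 * ones that were asleep. The mask of acknowledged domains is stored in
 * @acked. Returns 0 on success, or the first negative error code
 * reported by a wait, instead of collapsing every failure into 1.
 */
static int fake_fw_get(struct fake_domain *doms, size_t n,
		       unsigned int mask, unsigned int *acked)
{
	unsigned int rqst = 0;
	int ret = 0;
	size_t i;

	assert(acked != NULL);
	*acked = 0;

	for (i = 0; i < n; i++) {
		if (!(mask & (1u << doms[i].id)))
			continue;
		if (!doms[i].ref++)
			rqst |= 1u << doms[i].id;
	}

	for (i = 0; i < n; i++) {
		int err;

		if (!(rqst & (1u << doms[i].id)))
			continue;

		err = fake_wake_wait(&doms[i]);
		if (!err)
			*acked |= 1u << doms[i].id;
		else if (!ret)
			ret = err;	/* remember the first failure only */
	}

	return ret;
}
```

The caller can still compare the acked mask against what it asked for, but
the return value now carries the original error.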

> +	for_each_fw_domain_masked(domain, awake_rqst, fw, tmp) {
> +		if (domain_wake_wait(gt, domain) == 0)
> +			awake_ack |= BIT(domain->id);
> +	}
> +
> +	ret = (awake_ack == awake_rqst) ? 0 : 1;

s/1/-EIO/ ?

> +
> +	/*
> +	 * If @domains is XE_FORCEWAKE_ALL and an acknowledgment times out
> +	 * for any domain, decrease the reference count and put the awake
> +	 * domains to sleep. For individual domains, just decrement the
> +	 * reference count.
> +	 */
> +	if (ret) {
> +		for_each_fw_domain_masked(domain, awake_rqst, fw, tmp) {
> +			if (!--domain->ref && (awake_ack & BIT(domain->id)))
> +				domain_sleep(gt, domain);

I wonder if it would help to extract this into a separate function to be
used both here and in the _put function.
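
Roughly what I have in mind, as a user-space sketch only. The helper name
fake_fw_release(), struct fake_domain and the sleep_ok parameter are all
hypothetical and made up for this example. The real put path also waits
for the sleep ack, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xe_force_wake_domain. */
struct fake_domain {
	unsigned int id;
	unsigned int ref;
	bool awake;
};

/*
 * Hypothetical shared helper: drop one reference from every domain in
 * @mask. A domain whose refcount reaches zero is put to sleep only if
 * it is also set in @sleep_ok. Returns the mask of domains that were
 * put to sleep.
 */
static unsigned int fake_fw_release(struct fake_domain *doms, size_t n,
				    unsigned int mask, unsigned int sleep_ok)
{
	unsigned int slept = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int bit = 1u << doms[i].id;

		if (!(mask & bit))
			continue;

		assert(doms[i].ref > 0);	/* catch refcount underflow */
		if (!--doms[i].ref && (sleep_ok & bit)) {
			doms[i].awake = false;	/* models domain_sleep() */
			slept |= bit;
		}
	}

	return slept;
}
```

The error path in _get would call it as fake_fw_release(doms, n,
awake_rqst, awake_ack), and _put would pass the same mask for both
arguments.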

But more than that, I have a question here...
Do we really need to put the other domains to sleep if we are not getting an
ack from a certain domain? Doesn't that generally mean that we are busted anyway?

Also, if we really need to sleep, shouldn't we call the sleep function
for the domains that didn't ack as well? Perhaps the ack timed out but the
domain really did wake up. How sure are we that this is not possible?

> +		}
> +		awake_ack = 0;
>  	}
> -	fw->awake_domains |= woken;
> +
> +	fw->awake_domains |= awake_ack;
>  	spin_unlock_irqrestore(&fw->lock, flags);
>  
>  	return ret;
>  }
>  
> +/**
> + * xe_force_wake_put - Decrement the refcount and put domain to sleep if refcount becomes 0
> + * @fw: Pointer to the force wake structure
> + * @domains: forcewake domains to put reference
> + *
> + * This function reduces the reference counts for specified domains. If
> + * refcount for any of the specified domain reaches 0, it puts the domain to sleep
> + * and waits for acknowledgment for domain to sleep within specified timeout.
> + * Ensure this function is called only in case of successful xe_force_wake_get().
> + *
> + * Returns 0 in case of success or non-zero in case of timeout of ack
> + */
>  int xe_force_wake_put(struct xe_force_wake *fw,
>  		      enum xe_force_wake_domains domains)
>  {
> -- 
> 2.34.1
> 
