All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: Matthew Auld <matthew.auld@intel.com>
Cc: "Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	intel-xe@lists.freedesktop.org,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Subject: Re: [PATCH] drm/xe: Kill missing outer runtime PM protection warning
Date: Wed, 4 Sep 2024 14:53:08 +0000	[thread overview]
Message-ID: <Zth0VCj7ab0bOOG1@DUT025-TGLU.fm.intel.com> (raw)
In-Reply-To: <23623c0f-f3e5-4227-9a84-7490ae3978ec@intel.com>

On Wed, Sep 04, 2024 at 01:49:47PM +0100, Matthew Auld wrote:
> On 03/09/2024 23:38, Rodrigo Vivi wrote:
> > This message was very useful to ensure that Xe was taking all
> > the needed outer runtime pm references. However, at this point
> > it is only a false positive. So, remove it.
> > 
> > False positive cases:
> > 
> > 1:
> > [184.983389] xe ...: [drm] Missing outer runtime PM protection
> > [snip]
> > [184.984096]  drm_ioctl+0x2cf/0x580 [drm]
> > [snip]
> > [184.984710] xe 0000:00:02.0: Runtime PM usage count underflow!
> > 
> > In this case the underflow is the problem since we are sure that
> > the ioctl is protected. But something else is abusing the 'put'
> > calls.
> > 
> > 2:
> > rpm_status:           0000:03:00.0 status=RPM_SUSPENDING
> > console:              xe_bo_evict_all (called from suspend)
> > xe_sched_job_create:  dev=0000:03:00.0, ...
> > xe_sched_job_exec:    dev=0000:03:00.0, ...
> > xe_pm_runtime_put:    dev=0000:03:00.0, ...
> > xe_sched_job_run:     dev=0000:03:00.0, ...
> > rpm_usage:            0000:03:00.0 flags-0 cnt-2  ...
> > rpm_usage:            0000:03:00.0 flags-0 cnt-2  ...
> > rpm_usage:            0000:03:00.0 flags-0 cnt-2  ...
> > console:              xe 0000:03:00.0: [drm] Missing outer runtime
> > 		      	 	       	     	     PM protection
> > console:               xe_guc_ct_send+0x15/0x50 [xe]
> > console:               guc_exec_queue_run_job+0x1509/0x3950 [xe]
> > [snip]
> > console:               drm_sched_run_job_work+0x649/0xc20
> > 
> > At this point, BOs are getting evicted from VRAM with rpm
> > usage-counter = 2, but rpm status = SUSPENDING.
> > The xe->pm_callback_task won't be equal 'current' because this call is
> > coming from a work queue.
> > 
> > So, pm_runtime_get_if_active() will be called and return 0 because rpm
> > status != ACTIVE (but equal SUSPENDING).
> > 
> > The only way out is to just grab the reference and move on.
> > 
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_pm.c | 10 ++--------
> >   1 file changed, 2 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > index da68cd689a96..e1a5e43b0f34 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.c
> > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > @@ -592,20 +592,14 @@ bool xe_pm_runtime_get_if_in_use(struct xe_device *xe)
> >    * xe_pm_runtime_get_noresume - Bump runtime PM usage counter without resuming
> >    * @xe: xe device instance
> >    *
> > - * This function should be used in inner places where it is surely already
> > + * This function should *only* be used in inner places where it is surely already
> >    * protected by outer-bound callers of `xe_pm_runtime_get`.
> > - * It will warn if not protected.
> >    * The reference should be put back after this function regardless, since it
> >    * will always bump the usage counter, regardless.
> >    */
> >   void xe_pm_runtime_get_noresume(struct xe_device *xe)
> >   {
> > -	bool ref;
> > -
> > -	ref = xe_pm_runtime_get_if_in_use(xe);
> > -
> > -	if (drm_WARN(&xe->drm, !ref, "Missing outer runtime PM protection\n"))
> > -		pm_runtime_get_noresume(xe->drm.dev);
> 
> This has proven to find real bugs in the past, right? If so, it seems
> unfortunate to drop this completely? What about making it slightly more

I quickly looked and had the same reservations about dropping completely
as it has proved useful in the past. 

> fuzzy with something like:
> 
> drm_WARN(!pm_read_callback_task() && !pm_runtime_active(), ...
> 
> That should avoid the false positive, at the cost of not finding some real
> bugs, but at least gives us something?
> 

Haven't completely wrapped my head around pm_read_callback_task() usage
so can't comment it works but in general agree with something a little
more fuzzy is better than nothing.

Matt

> > +	pm_runtime_get_noresume(xe->drm.dev);
> >   }
> >   /**

  reply	other threads:[~2024-09-04 14:55 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-03 22:38 [PATCH] drm/xe: Kill missing outer runtime PM protection warning Rodrigo Vivi
2024-09-03 23:25 ` ✓ CI.Patch_applied: success for " Patchwork
2024-09-03 23:25 ` ✓ CI.checkpatch: " Patchwork
2024-09-03 23:26 ` ✓ CI.KUnit: " Patchwork
2024-09-03 23:38 ` ✓ CI.Build: " Patchwork
2024-09-03 23:40 ` ✓ CI.Hooks: " Patchwork
2024-09-03 23:41 ` ✓ CI.checksparse: " Patchwork
2024-09-04  0:00 ` ✓ CI.BAT: " Patchwork
2024-09-04  4:19 ` ✗ CI.FULL: failure " Patchwork
2024-09-04 12:49 ` [PATCH] " Matthew Auld
2024-09-04 14:53   ` Matthew Brost [this message]
2024-09-04 15:03   ` Rodrigo Vivi
2024-09-04 15:42     ` Matthew Auld
2024-09-04 16:06       ` Rodrigo Vivi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zth0VCj7ab0bOOG1@DUT025-TGLU.fm.intel.com \
    --to=matthew.brost@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.auld@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.