From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: "Cavitt, Jonathan" <jonathan.cavitt@intel.com>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 3/3] drm/xe/pf: Don't resume device from restart worker
Date: Fri, 1 Aug 2025 16:45:53 -0400
Message-ID: <aI0ngX8onl4C8_3x@intel.com>
In-Reply-To: <e2df3e6b-c377-4c91-b107-518ae7e0c3b6@intel.com>

On Wed, Jul 30, 2025 at 11:48:01PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 7/30/2025 11:10 PM, Cavitt, Jonathan wrote:
> > -----Original Message-----
> > From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of Michal Wajdeczko
> > Sent: Wednesday, July 30, 2025 10:49 AM
> > To: intel-xe@lists.freedesktop.org
> > Cc: Wajdeczko, Michal <Michal.Wajdeczko@intel.com>
> > Subject: [PATCH 3/3] drm/xe/pf: Don't resume device from restart worker
> >>
> >> The PF's restart worker shouldn't attempt to resume the device on
> >> its own, since its goal is to finish PF and VFs reprovisioning on
> >> the recently reset GuC. Take extra RPM reference while scheduling
> >> a work and release it from the worker or when we cancel a work.
> >>
> >> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >> ---
> >>  drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 23 +++++++++++++++++++----
> >>  1 file changed, 19 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> >> index 8bc7d7f9f47a..0c9012fb625d 100644
> >> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> >> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> >> @@ -53,7 +53,11 @@ static void pf_init_workers(struct xe_gt *gt)
> >>  
> >>  static void pf_fini_workers(struct xe_gt *gt)
> >>  {
> >> -	disable_work_sync(&gt->sriov.pf.workers.restart);
> >> +	if (disable_work_sync(&gt->sriov.pf.workers.restart)) {
> >> +		xe_gt_sriov_dbg_verbose(gt, "pending restart disabled!\n");
> >> +		/* release a rpm reference taken on the worker behalf */
> >> +		xe_pm_runtime_put(gt_to_xe(gt));
> >> +	}
> >>  }
> >>  
> >>  /**
> >> @@ -205,8 +209,11 @@ static void pf_cancel_restart(struct xe_gt *gt)
> >>  {
> >>  	xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> >>  
> >> -	if (cancel_work_sync(&gt->sriov.pf.workers.restart))
> >> +	if (cancel_work_sync(&gt->sriov.pf.workers.restart)) {
> >>  		xe_gt_sriov_dbg_verbose(gt, "pending restart canceled!\n");
> >> +		/* release a rpm reference taken on the worker behalf */
> >> +		xe_pm_runtime_put(gt_to_xe(gt));
> >> +	}
> >>  }
> >>  
> >>  /**
> >> @@ -224,9 +231,12 @@ static void pf_restart(struct xe_gt *gt)
> >>  {
> >>  	struct xe_device *xe = gt_to_xe(gt);
> >>  
> >> -	xe_pm_runtime_get(xe);
> >> +	xe_gt_assert(gt, !xe_pm_runtime_suspended(xe));
> >> +
> >>  	xe_gt_sriov_pf_config_restart(gt);
> >>  	xe_gt_sriov_pf_control_restart(gt);
> >> +
> >> +	/* release a rpm reference taken on our behalf */
> > 
> > NIT:
> > For consistency with the other two comments, maybe:
> > s/our/the worker
> > Or is the pm reference taken in this instance different from the pm reference
> > taken in pf_cancel_restart and pf_fini_workers?
> 
> this is the worker context, hence 'our'

I honestly prefer impersonal statements, but it's not an issue.

> 
> > 
> > There're also some other minor grammar things ("s/a rpm/an rpm" and 
> > "s/worker behalf/worker's behalf", for example) that can be applied more
> > generally to the whole patch.
> > 
> > I'm not going to block on minor grammatical fixups, though, so:

My English is broken, so I usually miss things like this. I only just learned
that 'an rpm' is the right one...

But anyway, if you notice grammatical issues, please refrain from giving
your Reviewed-by. Otherwise we will need to keep accepting small patches
later with small grammar fixes :/

Please ask the author for a new revision and only give your r-b when you are
comfortable that we are not leaving things behind.

> > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> > -Jonathan Cavitt
> > 
> >>  	xe_pm_runtime_put(xe);
> >>  
> >>  	xe_gt_sriov_dbg(gt, "restart completed\n");
> >> @@ -245,8 +255,13 @@ static void pf_queue_restart(struct xe_gt *gt)
> >>  
> >>  	xe_gt_assert(gt, IS_SRIOV_PF(xe));
> >>  
> >> -	if (!queue_work(xe->sriov.wq, &gt->sriov.pf.workers.restart))
> >> +	/* take a rpm reference on behalf of the worker */
> >> +	xe_pm_runtime_get_noresume(xe);
> >> +
> >> +	if (!queue_work(xe->sriov.wq, &gt->sriov.pf.workers.restart)) {
> >>  		xe_gt_sriov_dbg(gt, "restart already in queue!\n");
> >> +		xe_pm_runtime_put(xe);

As for the patch itself, from the power management perspective the approach
looks good to me. It is the right thing to do if we don't want to let the
device suspend and possibly lose power between the time the work is scheduled
and the time it is done. However, the hard part is to ensure that every work
cancellation is taken care of properly, which I believe you did right.

But it is important that the reviewers check for other possibly missed
spots here. I hope this is the case.

Thanks,
Rodrigo.

> >> +	}
> >>  }
> >>  
> >>  /**
> >> -- 
> >> 2.47.1
> >>
> >>
> 
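
For anyone checking the get/put balance discussed above, here is a minimal
userspace sketch of the pattern, not the driver code. Everything named
model_* is hypothetical: rpm_refs stands in for the runtime PM usage counter,
and work_pending stands in for the return values of queue_work() and
cancel_work_sync(). It assumes a simplified workqueue where an item is either
pending or not, and it ignores requeueing while the worker is running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not a real xe structure. */
struct model_dev {
	int rpm_refs;       /* models the runtime PM usage counter */
	bool work_pending;  /* models the restart work sitting in the queue */
	int restarts_done;  /* how many times the worker body actually ran */
};

/* Models queue_work(): returns false if the work was already pending. */
static bool model_queue_work(struct model_dev *d)
{
	if (d->work_pending)
		return false;
	d->work_pending = true;
	return true;
}

/* Models cancel_work_sync(): returns true if a pending work was canceled. */
static bool model_cancel_work(struct model_dev *d)
{
	bool was_pending = d->work_pending;

	d->work_pending = false;
	return was_pending;
}

/* Take a reference on the worker's behalf before queueing. */
static void model_queue_restart(struct model_dev *d)
{
	d->rpm_refs++;
	if (!model_queue_work(d))
		d->rpm_refs--;  /* already queued: drop the extra reference */
}

/* The worker will never run, so release its reference here. */
static void model_cancel_restart(struct model_dev *d)
{
	if (model_cancel_work(d))
		d->rpm_refs--;
}

/* Models the workqueue executing the pending item, if there is one. */
static void model_run_worker(struct model_dev *d)
{
	if (!d->work_pending)
		return;
	d->work_pending = false;

	assert(d->rpm_refs > 0);  /* the device must be held awake here */
	d->restarts_done++;
	d->rpm_refs--;            /* release the reference taken for us */
}

/* Runs queue, double queue, run, queue, cancel; returns leftover refs. */
static int model_demo(void)
{
	struct model_dev d = { 0 };

	model_queue_restart(&d);
	model_queue_restart(&d);  /* second call must not leak a reference */
	model_run_worker(&d);
	model_queue_restart(&d);
	model_cancel_restart(&d);
	return d.rpm_refs;
}
```

In this model every path that takes a reference has exactly one matching
release (worker, cancel, or the "already in queue" branch), so model_demo()
is expected to end with no references held. A path that takes a reference
without one of those three releases is the kind of missed spot a reviewer
would be looking for.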


Thread overview: 21+ messages
2025-07-30 17:49 [PATCH 0/3] Misc PF improvements Michal Wajdeczko
2025-07-30 17:49 ` [PATCH 1/3] drm/xe/pf: Disable PF restart worker on device removal Michal Wajdeczko
2025-07-30 21:08   ` Cavitt, Jonathan
2025-07-30 21:34     ` Michal Wajdeczko
2025-07-30 21:40       ` Cavitt, Jonathan
2025-08-01 20:48         ` Rodrigo Vivi
2025-08-01 10:23   ` Piotr Piórkowski
2025-07-30 17:49 ` [PATCH 2/3] drm/xe/pf: Make sure PF is ready to configure VFs Michal Wajdeczko
2025-07-30 21:09   ` Cavitt, Jonathan
2025-07-30 21:44     ` Michal Wajdeczko
2025-08-01 20:51       ` Rodrigo Vivi
2025-08-01 11:25   ` Piotr Piórkowski
2025-07-30 17:49 ` [PATCH 3/3] drm/xe/pf: Don't resume device from restart worker Michal Wajdeczko
2025-07-30 21:10   ` Cavitt, Jonathan
2025-07-30 21:48     ` Michal Wajdeczko
2025-08-01 20:45       ` Rodrigo Vivi [this message]
2025-08-01 22:55         ` Cavitt, Jonathan
2025-08-01 14:04   ` Piotr Piórkowski
2025-07-30 19:12 ` ✓ CI.KUnit: success for Misc PF improvements Patchwork
2025-07-30 20:14 ` ✓ Xe.CI.BAT: " Patchwork
2025-07-30 21:29 ` ✓ Xe.CI.Full: " Patchwork
