* [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back
@ 2023-02-21 14:52 Matthew Auld
  2023-02-21 21:16 ` Lucas De Marchi
  0 siblings, 1 reply; 4+ messages in thread
From: Matthew Auld @ 2023-02-21 14:52 UTC
  To: intel-xe; +Cc: Lucas De Marchi, Rodrigo Vivi

In local_pci_probe() the core kernel increments the rpm for the device,
just before calling into the probe hook. If the driver/device supports
runtime pm it is then meant to drop this ref during probe (like we do in
xe_pm_runtime_init()). However when removing the device we then also need
to give the reference back, otherwise the ref that is dropped in
pci_device_remove() will be unbalanced when for example unloading the
driver, leading to warnings like:

    [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!

Fix this by incrementing the rpm ref when removing the device.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/xe/xe_pci.c | 1 +
 drivers/gpu/drm/xe/xe_pm.c  | 7 +++++++
 drivers/gpu/drm/xe/xe_pm.h  | 1 +
 3 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 25598de3a1fc..85d337cd8fbe 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
 		return;
 
 	xe_device_remove(xe);
+	xe_pm_runtime_fini(xe);
 	pci_set_drvdata(pdev, NULL);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 44c38e670587..73d81621d960 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
 	pm_runtime_put_autosuspend(dev);
 }
 
+void xe_pm_runtime_fini(struct xe_device *xe)
+{
+	struct device *dev = xe->drm.dev;
+
+	pm_runtime_get_sync(dev);
+}
+
 int xe_pm_runtime_suspend(struct xe_device *xe)
 {
 	struct xe_gt *gt;
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index b8c5f9558e26..6a885585f653 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
 int xe_pm_resume(struct xe_device *xe);
 
 void xe_pm_runtime_init(struct xe_device *xe);
+void xe_pm_runtime_fini(struct xe_device *xe);
 int xe_pm_runtime_suspend(struct xe_device *xe);
 int xe_pm_runtime_resume(struct xe_device *xe);
 int xe_pm_runtime_get(struct xe_device *xe);
-- 
2.39.1
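
The imbalance described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct model_dev`, `model_get()`, `model_put()` and `model_bind_unbind()` are hypothetical names invented for illustration, not kernel APIs, and the counter is a simplified stand-in for the runtime PM usage count. It walks one probe/remove cycle with and without the extra get on remove.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the runtime PM usage counter (not the kernel's struct device). */
struct model_dev {
	int usage_count;
	bool underflow;	/* set where the kernel would warn about an underflow */
};

/* Hypothetical helper: take one usage reference. */
static void model_get(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->usage_count++;
}

/* Hypothetical helper: put one usage reference, flagging an underflow. */
static void model_put(struct model_dev *dev)
{
	assert(dev != NULL);
	if (dev->usage_count == 0) {
		dev->underflow = true;
		return;
	}
	dev->usage_count--;
}

/*
 * One bind/unbind cycle. driver_gets_on_remove selects the behaviour
 * with this patch (true) or without it (false).
 * Returns true if the count stayed balanced.
 */
static bool model_bind_unbind(struct model_dev *dev, bool driver_gets_on_remove)
{
	model_get(dev);		/* core: just before calling the probe hook */
	model_put(dev);		/* driver: during probe, as in xe_pm_runtime_init() */

	if (driver_gets_on_remove)
		model_get(dev);	/* driver: on remove, as in xe_pm_runtime_fini() */
	model_put(dev);		/* core: the ref dropped in pci_device_remove() */

	return !dev->underflow;
}
```

In this model the final put only finds a reference to drop when the driver took one back on remove; the real PCI core paths contain additional get/put pairs that are left out here.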



* Re: [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back
  2023-02-21 14:52 [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back Matthew Auld
@ 2023-02-21 21:16 ` Lucas De Marchi
  2023-02-21 21:39   ` Rodrigo Vivi
  2023-02-22 12:01   ` Matthew Auld
  0 siblings, 2 replies; 4+ messages in thread
From: Lucas De Marchi @ 2023-02-21 21:16 UTC
  To: Matthew Auld; +Cc: intel-xe, Rodrigo Vivi

On Tue, Feb 21, 2023 at 02:52:21PM +0000, Matthew Auld wrote:
>In local_pci_probe() the core kernel increments the rpm for the device,
>just before calling into the probe hook. If the driver/device supports
>runtime pm it is then meant to drop this ref during probe (like we do in

s/drop/put/ to be consistent with the terminology?

>xe_pm_runtime_init()). However when removing the device we then also need
>to give the reference back, otherwise the ref that is dropped in

give? we are calling pm_runtime_get_sync(), which would be "take".

>pci_device_remove() will be unbalanced when for example unloading the
>driver, leading to warnings like:
>
>    [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!
>
>Fix this by incrementing the rpm ref when removing the device.
>
>Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193
>Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>Cc: Matthew Brost <matthew.brost@intel.com>
>Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>---
> drivers/gpu/drm/xe/xe_pci.c | 1 +
> drivers/gpu/drm/xe/xe_pm.c  | 7 +++++++
> drivers/gpu/drm/xe/xe_pm.h  | 1 +
> 3 files changed, 9 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index 25598de3a1fc..85d337cd8fbe 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
> 		return;
>
> 	xe_device_remove(xe);
>+	xe_pm_runtime_fini(xe);

after xe_device_remove()? Wouldn't that end up calling the last
drm_dev_put() and thus triggering all the drmm_* releases?

Lucas De Marchi

> 	pci_set_drvdata(pdev, NULL);
> }
>
>diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>index 44c38e670587..73d81621d960 100644
>--- a/drivers/gpu/drm/xe/xe_pm.c
>+++ b/drivers/gpu/drm/xe/xe_pm.c
>@@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
> 	pm_runtime_put_autosuspend(dev);
> }
>
>+void xe_pm_runtime_fini(struct xe_device *xe)
>+{
>+	struct device *dev = xe->drm.dev;
>+
>+	pm_runtime_get_sync(dev);
>+}
>+
> int xe_pm_runtime_suspend(struct xe_device *xe)
> {
> 	struct xe_gt *gt;
>diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>index b8c5f9558e26..6a885585f653 100644
>--- a/drivers/gpu/drm/xe/xe_pm.h
>+++ b/drivers/gpu/drm/xe/xe_pm.h
>@@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
> int xe_pm_resume(struct xe_device *xe);
>
> void xe_pm_runtime_init(struct xe_device *xe);
>+void xe_pm_runtime_fini(struct xe_device *xe);
> int xe_pm_runtime_suspend(struct xe_device *xe);
> int xe_pm_runtime_resume(struct xe_device *xe);
> int xe_pm_runtime_get(struct xe_device *xe);
>-- 
>2.39.1
>


* Re: [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back
  2023-02-21 21:16 ` Lucas De Marchi
@ 2023-02-21 21:39   ` Rodrigo Vivi
  2023-02-22 12:01   ` Matthew Auld
  1 sibling, 0 replies; 4+ messages in thread
From: Rodrigo Vivi @ 2023-02-21 21:39 UTC
  To: Lucas De Marchi; +Cc: Matthew Auld, intel-xe

On Tue, Feb 21, 2023 at 01:16:49PM -0800, Lucas De Marchi wrote:
> On Tue, Feb 21, 2023 at 02:52:21PM +0000, Matthew Auld wrote:
> > In local_pci_probe() the core kernel increments the rpm for the device,
> > just before calling into the probe hook. If the driver/device supports
> > runtime pm it is then meant to drop this ref during probe (like we do in
> 
> s/drop/put/ to be consistent with the terminology?

yeap, put is better...

> 
> > xe_pm_runtime_init()). However when removing the device we then also need
> > to give the reference back, otherwise the ref that is dropped in
> 
> give? we are calling pm_runtime_get_sync(), which would be "take".

and get here...

> 
> > pci_device_remove() will be unbalanced when for example unloading the
> > driver, leading to warnings like:
> > 
> >    [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!
> > 
> > Fix this by incrementing the rpm ref when removing the device.
> > 
> > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193

Thank you!

> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_pci.c | 1 +
> > drivers/gpu/drm/xe/xe_pm.c  | 7 +++++++
> > drivers/gpu/drm/xe/xe_pm.h  | 1 +
> > 3 files changed, 9 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index 25598de3a1fc..85d337cd8fbe 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
> > 		return;
> > 
> > 	xe_device_remove(xe);
> > +	xe_pm_runtime_fini(xe);
> 
> after xe_device_remove()? Wouldn't that end up calling the last
> drm_dev_put() and thus triggering all the drmm_* releases?
> 
> Lucas De Marchi
> 
> > 	pci_set_drvdata(pdev, NULL);
> > }
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > index 44c38e670587..73d81621d960 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.c
> > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > @@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
> > 	pm_runtime_put_autosuspend(dev);
> > }
> > 
> > +void xe_pm_runtime_fini(struct xe_device *xe)
> > +{
> > +	struct device *dev = xe->drm.dev;
> > +
> > +	pm_runtime_get_sync(dev);

please also add:
pm_runtime_forbid(dev);

then

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


> > +}
> > +
> > int xe_pm_runtime_suspend(struct xe_device *xe)
> > {
> > 	struct xe_gt *gt;
> > diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> > index b8c5f9558e26..6a885585f653 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.h
> > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > @@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
> > int xe_pm_resume(struct xe_device *xe);
> > 
> > void xe_pm_runtime_init(struct xe_device *xe);
> > +void xe_pm_runtime_fini(struct xe_device *xe);
> > int xe_pm_runtime_suspend(struct xe_device *xe);
> > int xe_pm_runtime_resume(struct xe_device *xe);
> > int xe_pm_runtime_get(struct xe_device *xe);
> > -- 
> > 2.39.1
> > 


* Re: [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back
  2023-02-21 21:16 ` Lucas De Marchi
  2023-02-21 21:39   ` Rodrigo Vivi
@ 2023-02-22 12:01   ` Matthew Auld
  1 sibling, 0 replies; 4+ messages in thread
From: Matthew Auld @ 2023-02-22 12:01 UTC
  To: Lucas De Marchi; +Cc: intel-xe, Rodrigo Vivi

On 21/02/2023 21:16, Lucas De Marchi wrote:
> On Tue, Feb 21, 2023 at 02:52:21PM +0000, Matthew Auld wrote:
>> In local_pci_probe() the core kernel increments the rpm for the device,
>> just before calling into the probe hook. If the driver/device supports
>> runtime pm it is then meant to drop this ref during probe (like we do in
> 
> s/drop/put/ to be consistent with the terminology?
> 
>> xe_pm_runtime_init()). However when removing the device we then also need
>> to give the reference back, otherwise the ref that is dropped in
> 
> give? we are calling pm_runtime_get_sync(), which would be "take".
> 
>> pci_device_remove() will be unbalanced when for example unloading the
>> driver, leading to warnings like:
>>
>>    [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!
>>
>> Fix this by incrementing the rpm ref when removing the device.
>>
>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_pci.c | 1 +
>> drivers/gpu/drm/xe/xe_pm.c  | 7 +++++++
>> drivers/gpu/drm/xe/xe_pm.h  | 1 +
>> 3 files changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>> index 25598de3a1fc..85d337cd8fbe 100644
>> --- a/drivers/gpu/drm/xe/xe_pci.c
>> +++ b/drivers/gpu/drm/xe/xe_pci.c
>> @@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
>>         return;
>>
>>     xe_device_remove(xe);
>> +    xe_pm_runtime_fini(xe);
> 
> after xe_device_remove()? Wouldn't that end up calling the last
> drm_dev_put() and thus triggering all the drmm_* releases?

__device_release_driver() calls device_remove() first, which
eventually calls our xe_pci_remove() hook. A little further down it then
calls device_unbind_cleanup(), which in turn calls devres_release_all(),
which eventually calls into drm_managed_release() and handles all the
drmm_* stuff, AFAICT. So the drmm_* releases should only run after
xe_pci_remove() has returned.
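
The ordering argument can be sketched as a tiny userspace model. This is an illustration only, under the assumption that the call order is as described above: `struct model_state` and the `model_*` functions are hypothetical names, not kernel code. They stand in for the remove hook and the later managed-resource cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_state {
	bool managed_released;	/* stands in for the drmm_* / devres cleanup */
	bool remove_saw_live;	/* did the remove hook run before that cleanup? */
};

/* Stand-in for device_remove() -> xe_pci_remove(). */
static void model_remove_hook(struct model_state *s)
{
	assert(s != NULL);
	s->remove_saw_live = !s->managed_released;
}

/* Stand-in for device_unbind_cleanup() -> devres_release_all(). */
static void model_unbind_cleanup(struct model_state *s)
{
	assert(s != NULL);
	s->managed_released = true;
}

/* Stand-in for the order of calls in __device_release_driver(). */
static void model_release_driver(struct model_state *s)
{
	model_remove_hook(s);		/* first: the driver's remove hook */
	model_unbind_cleanup(s);	/* later: managed resources are released */
}
```

If the order holds, the remove hook always observes the managed resources as still alive, which is the situation xe_pm_runtime_fini() would rely on when it dereferences xe.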

> 
> Lucas De Marchi
> 
>>     pci_set_drvdata(pdev, NULL);
>> }
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>> index 44c38e670587..73d81621d960 100644
>> --- a/drivers/gpu/drm/xe/xe_pm.c
>> +++ b/drivers/gpu/drm/xe/xe_pm.c
>> @@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
>>     pm_runtime_put_autosuspend(dev);
>> }
>>
>> +void xe_pm_runtime_fini(struct xe_device *xe)
>> +{
>> +    struct device *dev = xe->drm.dev;
>> +
>> +    pm_runtime_get_sync(dev);
>> +}
>> +
>> int xe_pm_runtime_suspend(struct xe_device *xe)
>> {
>>     struct xe_gt *gt;
>> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>> index b8c5f9558e26..6a885585f653 100644
>> --- a/drivers/gpu/drm/xe/xe_pm.h
>> +++ b/drivers/gpu/drm/xe/xe_pm.h
>> @@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
>> int xe_pm_resume(struct xe_device *xe);
>>
>> void xe_pm_runtime_init(struct xe_device *xe);
>> +void xe_pm_runtime_fini(struct xe_device *xe);
>> int xe_pm_runtime_suspend(struct xe_device *xe);
>> int xe_pm_runtime_resume(struct xe_device *xe);
>> int xe_pm_runtime_get(struct xe_device *xe);
>> -- 
>> 2.39.1
>>

