* [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back
@ 2023-02-21 14:52 Matthew Auld
2023-02-21 21:16 ` Lucas De Marchi
0 siblings, 1 reply; 4+ messages in thread
From: Matthew Auld @ 2023-02-21 14:52 UTC
To: intel-xe; +Cc: Lucas De Marchi, Rodrigo Vivi
In local_pci_probe() the core kernel increments the rpm for the device,
just before calling into the probe hook. If the driver/device supports
runtime pm it is then meant to drop this ref during probe (like we do in
xe_pm_runtime_init()). However when removing the device we then also need
to give the reference back, otherwise the ref that is dropped in
pci_device_remove() will be unbalanced when for example unloading the
driver, leading to warnings like:

  [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!

Fix this by incrementing the rpm ref when removing the device.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
drivers/gpu/drm/xe/xe_pci.c | 1 +
drivers/gpu/drm/xe/xe_pm.c | 7 +++++++
drivers/gpu/drm/xe/xe_pm.h | 1 +
3 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 25598de3a1fc..85d337cd8fbe 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
 		return;
 
 	xe_device_remove(xe);
+	xe_pm_runtime_fini(xe);
 	pci_set_drvdata(pdev, NULL);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 44c38e670587..73d81621d960 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
 	pm_runtime_put_autosuspend(dev);
 }
 
+void xe_pm_runtime_fini(struct xe_device *xe)
+{
+	struct device *dev = xe->drm.dev;
+
+	pm_runtime_get_sync(dev);
+}
+
 int xe_pm_runtime_suspend(struct xe_device *xe)
 {
 	struct xe_gt *gt;
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index b8c5f9558e26..6a885585f653 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
 int xe_pm_resume(struct xe_device *xe);
 
 void xe_pm_runtime_init(struct xe_device *xe);
+void xe_pm_runtime_fini(struct xe_device *xe);
 int xe_pm_runtime_suspend(struct xe_device *xe);
 int xe_pm_runtime_resume(struct xe_device *xe);
 int xe_pm_runtime_get(struct xe_device *xe);
--
2.39.1
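
Editor's sketch (not part of the patch): the imbalance described in the
commit message can be modelled in plain userspace C. Everything below is a
hypothetical toy, assuming a simple counter that refuses to go below zero;
the toy_* names are invented for illustration and are not kernel APIs. The
comments only map each step to the function the commit message names.

```c
#include <assert.h>
#include <stdio.h>

/* Toy userspace model of a runtime-PM usage counter; NOT kernel code. */
struct toy_dev {
	int usage_count;
	int underflows;
};

static void toy_rpm_get(struct toy_dev *d)
{
	assert(d != NULL);
	d->usage_count++;
}

static void toy_rpm_put(struct toy_dev *d)
{
	assert(d != NULL);
	if (d->usage_count == 0) {
		/* Stand-in for the "usage count underflow" warning. */
		d->underflows++;
		fprintf(stderr, "toy: usage count underflow!\n");
		return;
	}
	d->usage_count--;
}

/*
 * Model one bind/unbind cycle and return how many underflows were seen.
 * take_ref_on_remove selects the behaviour with (1) or without (0) the fix.
 */
static int toy_bind_unbind(int take_ref_on_remove)
{
	struct toy_dev d = { 0, 0 };

	toy_rpm_get(&d);		/* core: before calling the probe hook */
	toy_rpm_put(&d);		/* driver: put during probe */
	if (take_ref_on_remove)
		toy_rpm_get(&d);	/* driver: get again on remove (the fix) */
	toy_rpm_put(&d);		/* core: put after the remove hook */

	return d.underflows;
}
```

Calling toy_bind_unbind(0) models the unpatched driver, where the final put
has no matching get; toy_bind_unbind(1) models the driver with the extra get
on remove, where every put is balanced.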
* Re: [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back
2023-02-21 14:52 [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back Matthew Auld
@ 2023-02-21 21:16 ` Lucas De Marchi
2023-02-21 21:39 ` Rodrigo Vivi
2023-02-22 12:01 ` Matthew Auld
0 siblings, 2 replies; 4+ messages in thread
From: Lucas De Marchi @ 2023-02-21 21:16 UTC
To: Matthew Auld; +Cc: intel-xe, Rodrigo Vivi
On Tue, Feb 21, 2023 at 02:52:21PM +0000, Matthew Auld wrote:
>In local_pci_probe() the core kernel increments the rpm for the device,
>just before calling into the probe hook. If the driver/device supports
>runtime pm it is then meant to drop this ref during probe (like we do in
s/drop/put/ to be consistent with the terminology?
>xe_pm_runtime_init()). However when removing the device we then also need
>to give the reference back, otherwise the ref that is dropped in
give? we are calling pm_runtime_get_sync(), which would be "take".
>pci_device_remove() will be unbalanced when for example unloading the
>driver, leading to warnings like:
>
> [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!
>
>Fix this by incrementing the rpm ref when removing the device.
>
>Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193
>Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>Cc: Matthew Brost <matthew.brost@intel.com>
>Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>---
> drivers/gpu/drm/xe/xe_pci.c | 1 +
> drivers/gpu/drm/xe/xe_pm.c | 7 +++++++
> drivers/gpu/drm/xe/xe_pm.h | 1 +
> 3 files changed, 9 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index 25598de3a1fc..85d337cd8fbe 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
> return;
>
> xe_device_remove(xe);
>+ xe_pm_runtime_fini(xe);
after xe_device_remove()? Wouldn't that end up calling the last
drm_dev_put() and thus triggering all the drmm_* releases?
Lucas De Marchi
> pci_set_drvdata(pdev, NULL);
> }
>
>diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>index 44c38e670587..73d81621d960 100644
>--- a/drivers/gpu/drm/xe/xe_pm.c
>+++ b/drivers/gpu/drm/xe/xe_pm.c
>@@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
> pm_runtime_put_autosuspend(dev);
> }
>
>+void xe_pm_runtime_fini(struct xe_device *xe)
>+{
>+ struct device *dev = xe->drm.dev;
>+
>+ pm_runtime_get_sync(dev);
>+}
>+
> int xe_pm_runtime_suspend(struct xe_device *xe)
> {
> struct xe_gt *gt;
>diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>index b8c5f9558e26..6a885585f653 100644
>--- a/drivers/gpu/drm/xe/xe_pm.h
>+++ b/drivers/gpu/drm/xe/xe_pm.h
>@@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
> int xe_pm_resume(struct xe_device *xe);
>
> void xe_pm_runtime_init(struct xe_device *xe);
>+void xe_pm_runtime_fini(struct xe_device *xe);
> int xe_pm_runtime_suspend(struct xe_device *xe);
> int xe_pm_runtime_resume(struct xe_device *xe);
> int xe_pm_runtime_get(struct xe_device *xe);
>--
>2.39.1
>
* Re: [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back
2023-02-21 21:16 ` Lucas De Marchi
@ 2023-02-21 21:39 ` Rodrigo Vivi
2023-02-22 12:01 ` Matthew Auld
1 sibling, 0 replies; 4+ messages in thread
From: Rodrigo Vivi @ 2023-02-21 21:39 UTC
To: Lucas De Marchi; +Cc: Matthew Auld, intel-xe
On Tue, Feb 21, 2023 at 01:16:49PM -0800, Lucas De Marchi wrote:
> On Tue, Feb 21, 2023 at 02:52:21PM +0000, Matthew Auld wrote:
> > In local_pci_probe() the core kernel increments the rpm for the device,
> > just before calling into the probe hook. If the driver/device supports
> > runtime pm it is then meant to drop this ref during probe (like we do in
>
> s/drop/put/ to be consistent with the terminology?
yeap, put is better...
>
> > xe_pm_runtime_init()). However when removing the device we then also need
> > to give the reference back, otherwise the ref that is dropped in
>
> give? we are calling pm_runtime_get_sync(), which would be "take".
and get here...
>
> > pci_device_remove() will be unbalanced when for example unloading the
> > driver, leading to warnings like:
> >
> > [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!
> >
> > Fix this by incrementing the rpm ref when removing the device.
> >
> > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193
Thank you!
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_pci.c | 1 +
> > drivers/gpu/drm/xe/xe_pm.c | 7 +++++++
> > drivers/gpu/drm/xe/xe_pm.h | 1 +
> > 3 files changed, 9 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> > index 25598de3a1fc..85d337cd8fbe 100644
> > --- a/drivers/gpu/drm/xe/xe_pci.c
> > +++ b/drivers/gpu/drm/xe/xe_pci.c
> > @@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
> > return;
> >
> > xe_device_remove(xe);
> > + xe_pm_runtime_fini(xe);
>
> after xe_device_remove()? Wouldn't that end up calling the last
> drm_dev_put() and thus triggering all the drmm_* releases?
>
> Lucas De Marchi
>
> > pci_set_drvdata(pdev, NULL);
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> > index 44c38e670587..73d81621d960 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.c
> > +++ b/drivers/gpu/drm/xe/xe_pm.c
> > @@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
> > pm_runtime_put_autosuspend(dev);
> > }
> >
> > +void xe_pm_runtime_fini(struct xe_device *xe)
> > +{
> > + struct device *dev = xe->drm.dev;
> > +
> > + pm_runtime_get_sync(dev);
please also add:
pm_runtime_forbid(dev);
then
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > +}
> > +
> > int xe_pm_runtime_suspend(struct xe_device *xe)
> > {
> > struct xe_gt *gt;
> > diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> > index b8c5f9558e26..6a885585f653 100644
> > --- a/drivers/gpu/drm/xe/xe_pm.h
> > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > @@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
> > int xe_pm_resume(struct xe_device *xe);
> >
> > void xe_pm_runtime_init(struct xe_device *xe);
> > +void xe_pm_runtime_fini(struct xe_device *xe);
> > int xe_pm_runtime_suspend(struct xe_device *xe);
> > int xe_pm_runtime_resume(struct xe_device *xe);
> > int xe_pm_runtime_get(struct xe_device *xe);
> > --
> > 2.39.1
> >
* Re: [Intel-xe] [PATCH] drm/xe/pm: give the core kernel its rpm ref back
2023-02-21 21:16 ` Lucas De Marchi
2023-02-21 21:39 ` Rodrigo Vivi
@ 2023-02-22 12:01 ` Matthew Auld
1 sibling, 0 replies; 4+ messages in thread
From: Matthew Auld @ 2023-02-22 12:01 UTC
To: Lucas De Marchi; +Cc: intel-xe, Rodrigo Vivi
On 21/02/2023 21:16, Lucas De Marchi wrote:
> On Tue, Feb 21, 2023 at 02:52:21PM +0000, Matthew Auld wrote:
>> In local_pci_probe() the core kernel increments the rpm for the device,
>> just before calling into the probe hook. If the driver/device supports
>> runtime pm it is then meant to drop this ref during probe (like we do in
>
> s/drop/put/ to be consistent with the terminology?
>
>> xe_pm_runtime_init()). However when removing the device we then also need
>> to give the reference back, otherwise the ref that is dropped in
>
> give? we are calling pm_runtime_get_sync(), which would be "take".
>
>> pci_device_remove() will be unbalanced when for example unloading the
>> driver, leading to warnings like:
>>
>> [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow!
>>
>> Fix this by incrementing the rpm ref when removing the device.
>>
>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_pci.c | 1 +
>> drivers/gpu/drm/xe/xe_pm.c | 7 +++++++
>> drivers/gpu/drm/xe/xe_pm.h | 1 +
>> 3 files changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>> index 25598de3a1fc..85d337cd8fbe 100644
>> --- a/drivers/gpu/drm/xe/xe_pci.c
>> +++ b/drivers/gpu/drm/xe/xe_pci.c
>> @@ -441,6 +441,7 @@ static void xe_pci_remove(struct pci_dev *pdev)
>> return;
>>
>> xe_device_remove(xe);
>> + xe_pm_runtime_fini(xe);
>
> after xe_device_remove()? Wouldn't that end up calling the last
> drm_dev_put() and thus triggering all the drmm_* releases?
In __device_release_driver() it will call device_remove() first, which
eventually calls our xe_pci_remove() hook. A little further down it then
calls device_unbind_cleanup(), which in turn calls devres_release_all(),
which eventually calls into drm_managed_release() and handles all the
drmm_* stuff, AFAICT.
>
> Lucas De Marchi
>
>> pci_set_drvdata(pdev, NULL);
>> }
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
>> index 44c38e670587..73d81621d960 100644
>> --- a/drivers/gpu/drm/xe/xe_pm.c
>> +++ b/drivers/gpu/drm/xe/xe_pm.c
>> @@ -128,6 +128,13 @@ void xe_pm_runtime_init(struct xe_device *xe)
>> pm_runtime_put_autosuspend(dev);
>> }
>>
>> +void xe_pm_runtime_fini(struct xe_device *xe)
>> +{
>> + struct device *dev = xe->drm.dev;
>> +
>> + pm_runtime_get_sync(dev);
>> +}
>> +
>> int xe_pm_runtime_suspend(struct xe_device *xe)
>> {
>> struct xe_gt *gt;
>> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
>> index b8c5f9558e26..6a885585f653 100644
>> --- a/drivers/gpu/drm/xe/xe_pm.h
>> +++ b/drivers/gpu/drm/xe/xe_pm.h
>> @@ -14,6 +14,7 @@ int xe_pm_suspend(struct xe_device *xe);
>> int xe_pm_resume(struct xe_device *xe);
>>
>> void xe_pm_runtime_init(struct xe_device *xe);
>> +void xe_pm_runtime_fini(struct xe_device *xe);
>> int xe_pm_runtime_suspend(struct xe_device *xe);
>> int xe_pm_runtime_resume(struct xe_device *xe);
>> int xe_pm_runtime_get(struct xe_device *xe);
>> --
>> 2.39.1
>>
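
Editor's sketch (not part of the thread): the ordering argument in the reply
above, that the driver's remove hook runs before the managed-resource
release, can be written as a small userspace trace. This is a hypothetical
model only; the toy_* names are invented here, and the real call chain
(__device_release_driver(), device_unbind_cleanup(), devres_release_all(),
drm_managed_release()) is taken from the reply, which itself hedges with
"AFAICT".

```c
#include <assert.h>
#include <stdio.h>

/* Steps recorded by the toy model; NOT kernel code. */
enum toy_step {
	TOY_REMOVE_HOOK = 1,	/* stands in for xe_pci_remove() */
	TOY_DEVRES_RELEASE = 2	/* stands in for the drmm_* releases */
};

struct toy_trace {
	enum toy_step steps[4];
	int n;
};

static void toy_record(struct toy_trace *t, enum toy_step s)
{
	assert(t->n < 4);
	t->steps[t->n++] = s;
}

/* The driver's remove hook: this is where the rpm ref would be taken. */
static void toy_remove_hook(struct toy_trace *t)
{
	toy_record(t, TOY_REMOVE_HOOK);
}

/* Managed resources are released only after the remove hook returned. */
static void toy_devres_release_all(struct toy_trace *t)
{
	toy_record(t, TOY_DEVRES_RELEASE);
}

/* Model of the unbind path in the order the reply describes. */
static void toy_release_driver(struct toy_trace *t)
{
	toy_remove_hook(t);
	toy_devres_release_all(t);
}
```

Under this model, anything done inside the remove hook, including taking the
rpm reference after xe_device_remove(), still happens before the managed
releases run.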
end of thread, other threads:[~2023-02-22 12:02 UTC | newest]