* [PATCH] drm/xe/display: Fix memleak in display initialization
@ 2024-01-25 6:36 wangxiaoming321
2024-01-25 9:41 ` Jani Nikula
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: wangxiaoming321 @ 2024-01-25 6:36 UTC (permalink / raw)
To: lucas.demarchi, ogabbay, thomas.hellstrom, maarten.lankhorst,
mripard, tzimmermann, airlied, daniel, intel-xe, dri-devel,
linux-kernel
Cc: wangxiaoming321
In the call stack xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init
Power_domains hasn't been cleaned up if return error,
which has do the clean in i915_driver_late_release call from i915_driver_probe.
unreferenced object 0xffff88811150ee00 (size 512):
comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
hex dump (first 32 bytes):
10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
[<ffffffff812c98b2>] __kmalloc+0x52/0x150
[<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
[<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
[<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
[<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
[<ffffffff817f2187>] local_pci_probe+0x47/0xa0
[<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
[<ffffffff8192f2a2>] really_probe+0x1a2/0x410
[<ffffffff8192f598>] __driver_probe_device+0x78/0x160
[<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
[<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
[<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
[<ffffffff8192e159>] bus_add_driver+0x119/0x220
[<ffffffff81930d00>] driver_register+0x60/0x120
[<ffffffffa05e50a0>] 0xffffffffa05e50a0
Signed-off-by: wangxiaoming321 <xiaoming.wang@intel.com>
---
drivers/gpu/drm/xe/xe_display.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_display.c b/drivers/gpu/drm/xe/xe_display.c
index 74391d9b11ae..2725afba4afb 100644
--- a/drivers/gpu/drm/xe/xe_display.c
+++ b/drivers/gpu/drm/xe/xe_display.c
@@ -146,8 +146,10 @@ int xe_display_init_nommio(struct xe_device *xe)
intel_detect_pch(xe);
err = intel_power_domains_init(xe);
- if (err)
+ if (err) {
+ intel_power_domains_cleanup(xe);
return err;
+ }
return drmm_add_action_or_reset(&xe->drm, xe_display_fini_nommio, xe);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/xe/display: Fix memleak in display initialization
2024-01-25 6:36 [PATCH] drm/xe/display: Fix memleak in display initialization wangxiaoming321
@ 2024-01-25 9:41 ` Jani Nikula
2024-01-26 14:34 ` wangxiaoming321
2024-01-26 15:34 ` wangxiaoming321
2 siblings, 0 replies; 10+ messages in thread
From: Jani Nikula @ 2024-01-25 9:41 UTC (permalink / raw)
To: wangxiaoming321, lucas.demarchi, ogabbay, thomas.hellstrom,
maarten.lankhorst, mripard, tzimmermann, airlied, daniel,
intel-xe, dri-devel, linux-kernel
Cc: wangxiaoming321
On Thu, 25 Jan 2024, wangxiaoming321 <xiaoming.wang@intel.com> wrote:
> In the call stack xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init
> Power_domains hasn't been cleaned up if return error,
> which has do the clean in i915_driver_late_release call from i915_driver_probe.
This has nothing to do with i915_*.
If intel_power_domains_init() returns an error, it should have cleaned
up after itself, not force its caller to do that. If there's an issue,
please fix it in intel_power_domains_init().
BR,
Jani.
>
> unreferenced object 0xffff88811150ee00 (size 512):
> comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
> hex dump (first 32 bytes):
> 10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
> ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
> [<ffffffff812c98b2>] __kmalloc+0x52/0x150
> [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
> [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
> [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
> [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
> [<ffffffff817f2187>] local_pci_probe+0x47/0xa0
> [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
> [<ffffffff8192f2a2>] really_probe+0x1a2/0x410
> [<ffffffff8192f598>] __driver_probe_device+0x78/0x160
> [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
> [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
> [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
> [<ffffffff8192e159>] bus_add_driver+0x119/0x220
> [<ffffffff81930d00>] driver_register+0x60/0x120
> [<ffffffffa05e50a0>] 0xffffffffa05e50a0
>
> Signed-off-by: wangxiaoming321 <xiaoming.wang@intel.com>
> ---
> drivers/gpu/drm/xe/xe_display.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_display.c b/drivers/gpu/drm/xe/xe_display.c
> index 74391d9b11ae..2725afba4afb 100644
> --- a/drivers/gpu/drm/xe/xe_display.c
> +++ b/drivers/gpu/drm/xe/xe_display.c
> @@ -146,8 +146,10 @@ int xe_display_init_nommio(struct xe_device *xe)
> intel_detect_pch(xe);
>
> err = intel_power_domains_init(xe);
> - if (err)
> + if (err) {
> + intel_power_domains_cleanup(xe);
> return err;
> + }
>
> return drmm_add_action_or_reset(&xe->drm, xe_display_fini_nommio, xe);
> }
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] drm/xe/display: Fix memleak in display initialization
2024-01-25 6:36 [PATCH] drm/xe/display: Fix memleak in display initialization wangxiaoming321
2024-01-25 9:41 ` Jani Nikula
@ 2024-01-26 14:34 ` wangxiaoming321
2024-01-26 14:44 ` Ville Syrjälä
2024-01-26 15:34 ` wangxiaoming321
2 siblings, 1 reply; 10+ messages in thread
From: wangxiaoming321 @ 2024-01-26 14:34 UTC (permalink / raw)
To: lucas.demarchi, ogabbay, thomas.hellstrom, maarten.lankhorst,
mripard, tzimmermann, airlied, daniel, intel-xe, dri-devel,
linux-kernel
Cc: wangxiaoming321
intel_power_domains_init has been called twice in xe_device_probe:
xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init(xe)
xe_device_probe -> xe_display_init_noirq -> intel_display_driver_probe_noirq
-> intel_power_domains_init(i915)
It needs add a flag to avoid power_domains->power_wells double malloc.
unreferenced object 0xffff88811150ee00 (size 512):
comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
hex dump (first 32 bytes):
10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
[<ffffffff812c98b2>] __kmalloc+0x52/0x150
[<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
[<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
[<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
[<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
[<ffffffff817f2187>] local_pci_probe+0x47/0xa0
[<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
[<ffffffff8192f2a2>] really_probe+0x1a2/0x410
[<ffffffff8192f598>] __driver_probe_device+0x78/0x160
[<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
[<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
[<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
[<ffffffff8192e159>] bus_add_driver+0x119/0x220
[<ffffffff81930d00>] driver_register+0x60/0x120
[<ffffffffa05e50a0>] 0xffffffffa05e50a0
Signed-off-by: wangxiaoming321 <xiaoming.wang@intel.com>
---
drivers/gpu/drm/i915/display/intel_display_power.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
index bf9685acf75a..3b48a1cb7c54 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -36,6 +36,8 @@
for_each_power_well_reverse(__dev_priv, __power_well) \
for_each_if(test_bit((__domain), (__power_well)->domains.bits))
+static int intel_power_domains_init_flag = 0;
+
const char *
intel_display_power_domain_str(enum intel_display_power_domain domain)
{
@@ -1016,6 +1018,11 @@ int intel_power_domains_init(struct drm_i915_private *dev_priv)
{
struct i915_power_domains *power_domains = &dev_priv->display.power.domains;
+ if(intel_power_domains_init_flag == 1)
+ return 0;
+
+ intel_power_domains_init_flag++;
+
dev_priv->display.params.disable_power_well =
sanitize_disable_power_well_option(dev_priv,
dev_priv->display.params.disable_power_well);
@@ -1041,6 +1048,7 @@ int intel_power_domains_init(struct drm_i915_private *dev_priv)
*/
void intel_power_domains_cleanup(struct drm_i915_private *dev_priv)
{
+ intel_power_domains_init_flag = 0;
intel_display_power_map_cleanup(&dev_priv->display.power.domains);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/xe/display: Fix memleak in display initialization
2024-01-26 14:34 ` wangxiaoming321
@ 2024-01-26 14:44 ` Ville Syrjälä
0 siblings, 0 replies; 10+ messages in thread
From: Ville Syrjälä @ 2024-01-26 14:44 UTC (permalink / raw)
To: wangxiaoming321
Cc: daniel, lucas.demarchi, linux-kernel, mripard, dri-devel,
tzimmermann, airlied, intel-xe
On Fri, Jan 26, 2024 at 10:34:33PM +0800, wangxiaoming321 wrote:
> intel_power_domains_init has been called twice in xe_device_probe:
> xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init(xe)
> xe_device_probe -> xe_display_init_noirq -> intel_display_driver_probe_noirq
> -> intel_power_domains_init(i915)
Why are you calling it twice?
>
> It needs add a flag to avoid power_domains->power_wells double malloc.
>
> unreferenced object 0xffff88811150ee00 (size 512):
> comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
> hex dump (first 32 bytes):
> 10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
> ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
> [<ffffffff812c98b2>] __kmalloc+0x52/0x150
> [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
> [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
> [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
> [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
> [<ffffffff817f2187>] local_pci_probe+0x47/0xa0
> [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
> [<ffffffff8192f2a2>] really_probe+0x1a2/0x410
> [<ffffffff8192f598>] __driver_probe_device+0x78/0x160
> [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
> [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
> [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
> [<ffffffff8192e159>] bus_add_driver+0x119/0x220
> [<ffffffff81930d00>] driver_register+0x60/0x120
> [<ffffffffa05e50a0>] 0xffffffffa05e50a0
>
> Signed-off-by: wangxiaoming321 <xiaoming.wang@intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_display_power.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
> index bf9685acf75a..3b48a1cb7c54 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> @@ -36,6 +36,8 @@
> for_each_power_well_reverse(__dev_priv, __power_well) \
> for_each_if(test_bit((__domain), (__power_well)->domains.bits))
>
> +static int intel_power_domains_init_flag = 0;
> +
> const char *
> intel_display_power_domain_str(enum intel_display_power_domain domain)
> {
> @@ -1016,6 +1018,11 @@ int intel_power_domains_init(struct drm_i915_private *dev_priv)
> {
> struct i915_power_domains *power_domains = &dev_priv->display.power.domains;
>
> + if(intel_power_domains_init_flag == 1)
> + return 0;
> +
> + intel_power_domains_init_flag++;
Consider what happens when you have multiple Intel GPUs in the system...
> +
> dev_priv->display.params.disable_power_well =
> sanitize_disable_power_well_option(dev_priv,
> dev_priv->display.params.disable_power_well);
> @@ -1041,6 +1048,7 @@ int intel_power_domains_init(struct drm_i915_private *dev_priv)
> */
> void intel_power_domains_cleanup(struct drm_i915_private *dev_priv)
> {
> + intel_power_domains_init_flag = 0;
> intel_display_power_map_cleanup(&dev_priv->display.power.domains);
> }
>
> --
> 2.25.1
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] drm/xe/display: Fix memleak in display initialization
2024-01-25 6:36 [PATCH] drm/xe/display: Fix memleak in display initialization wangxiaoming321
2024-01-25 9:41 ` Jani Nikula
2024-01-26 14:34 ` wangxiaoming321
@ 2024-01-26 15:34 ` wangxiaoming321
2024-01-31 14:54 ` Lucas De Marchi
2 siblings, 1 reply; 10+ messages in thread
From: wangxiaoming321 @ 2024-01-26 15:34 UTC (permalink / raw)
To: lucas.demarchi, ogabbay, thomas.hellstrom, maarten.lankhorst,
mripard, tzimmermann, airlied, daniel, intel-xe, dri-devel,
linux-kernel
Cc: wangxiaoming321
intel_power_domains_init has been called twice in xe_device_probe:
xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init(xe)
xe_device_probe -> xe_display_init_noirq -> intel_display_driver_probe_noirq
-> intel_power_domains_init(i915)
It needs remove one to avoid power_domains->power_wells double malloc.
unreferenced object 0xffff88811150ee00 (size 512):
comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
hex dump (first 32 bytes):
10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
[<ffffffff812c98b2>] __kmalloc+0x52/0x150
[<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
[<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
[<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
[<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
[<ffffffff817f2187>] local_pci_probe+0x47/0xa0
[<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
[<ffffffff8192f2a2>] really_probe+0x1a2/0x410
[<ffffffff8192f598>] __driver_probe_device+0x78/0x160
[<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
[<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
[<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
[<ffffffff8192e159>] bus_add_driver+0x119/0x220
[<ffffffff81930d00>] driver_register+0x60/0x120
[<ffffffffa05e50a0>] 0xffffffffa05e50a0
Signed-off-by: wangxiaoming321 <xiaoming.wang@intel.com>
---
drivers/gpu/drm/xe/xe_display.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_display.c b/drivers/gpu/drm/xe/xe_display.c
index 74391d9b11ae..e4db069f0db3 100644
--- a/drivers/gpu/drm/xe/xe_display.c
+++ b/drivers/gpu/drm/xe/xe_display.c
@@ -134,8 +134,6 @@ static void xe_display_fini_nommio(struct drm_device *dev, void *dummy)
int xe_display_init_nommio(struct xe_device *xe)
{
- int err;
-
if (!xe->info.enable_display)
return 0;
@@ -145,10 +143,6 @@ int xe_display_init_nommio(struct xe_device *xe)
/* This must be called before any calls to HAS_PCH_* */
intel_detect_pch(xe);
- err = intel_power_domains_init(xe);
- if (err)
- return err;
-
return drmm_add_action_or_reset(&xe->drm, xe_display_fini_nommio, xe);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/xe/display: Fix memleak in display initialization
2024-01-26 15:34 ` wangxiaoming321
@ 2024-01-31 14:54 ` Lucas De Marchi
2024-01-31 15:07 ` Jani Nikula
0 siblings, 1 reply; 10+ messages in thread
From: Lucas De Marchi @ 2024-01-31 14:54 UTC (permalink / raw)
To: wangxiaoming321
Cc: daniel, linux-kernel, mripard, dri-devel, tzimmermann, airlied,
intel-xe
+Jani
On Fri, Jan 26, 2024 at 11:34:53PM +0800, wangxiaoming321 wrote:
>intel_power_domains_init has been called twice in xe_device_probe:
>xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init(xe)
>xe_device_probe -> xe_display_init_noirq -> intel_display_driver_probe_noirq
>-> intel_power_domains_init(i915)
ok, once upon a time intel_power_domains_init() was called by the driver
initialization code and not initialized inside the display. I think.
Now it's part of the display probe and we never updated the xe side.
>
>It needs remove one to avoid power_domains->power_wells double malloc.
>
>unreferenced object 0xffff88811150ee00 (size 512):
> comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
> hex dump (first 32 bytes):
> 10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
> ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
> [<ffffffff812c98b2>] __kmalloc+0x52/0x150
> [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
> [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
> [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
> [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
> [<ffffffff817f2187>] local_pci_probe+0x47/0xa0
> [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
> [<ffffffff8192f2a2>] really_probe+0x1a2/0x410
> [<ffffffff8192f598>] __driver_probe_device+0x78/0x160
> [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
> [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
> [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
> [<ffffffff8192e159>] bus_add_driver+0x119/0x220
> [<ffffffff81930d00>] driver_register+0x60/0x120
> [<ffffffffa05e50a0>] 0xffffffffa05e50a0
>
This will need a Fixes trailer. This seems to be a suitable one:
Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
>Signed-off-by: wangxiaoming321 <xiaoming.wang@intel.com>
>---
> drivers/gpu/drm/xe/xe_display.c | 6 ------
> 1 file changed, 6 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_display.c b/drivers/gpu/drm/xe/xe_display.c
>index 74391d9b11ae..e4db069f0db3 100644
>--- a/drivers/gpu/drm/xe/xe_display.c
>+++ b/drivers/gpu/drm/xe/xe_display.c
>@@ -134,8 +134,6 @@ static void xe_display_fini_nommio(struct drm_device *dev, void *dummy)
>
> int xe_display_init_nommio(struct xe_device *xe)
> {
>- int err;
>-
> if (!xe->info.enable_display)
> return 0;
>
>@@ -145,10 +143,6 @@ int xe_display_init_nommio(struct xe_device *xe)
> /* This must be called before any calls to HAS_PCH_* */
> intel_detect_pch(xe);
>
>- err = intel_power_domains_init(xe);
>- if (err)
>- return err;
xe_display_init_nommio() has xe_display_fini_nommio() as its destructor
counter part. Unfortunately display side looks wrong as it does:
init:
intel_display_driver_probe_noirq() -> intel_power_domains_init()
destroy:
i915_driver_late_release() -> intel_power_domains_cleanup()
I think leaving intel_power_domains_cleanup() as is for now so it's
called by xe works, but this needs to go through CI, which apparently
this series didn't go. I re-triggered it.
+Jani if he thinks this can be changed in another way or already have
the complete solution.
Lucas De Marchi
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/xe/display: Fix memleak in display initialization
2024-01-31 14:54 ` Lucas De Marchi
@ 2024-01-31 15:07 ` Jani Nikula
2024-02-01 14:19 ` Maarten Lankhorst
0 siblings, 1 reply; 10+ messages in thread
From: Jani Nikula @ 2024-01-31 15:07 UTC (permalink / raw)
To: Lucas De Marchi, wangxiaoming321
Cc: daniel, linux-kernel, mripard, dri-devel, tzimmermann, airlied,
intel-xe
On Wed, 31 Jan 2024, Lucas De Marchi <lucas.demarchi@intel.com> wrote:
> +Jani
>
> On Fri, Jan 26, 2024 at 11:34:53PM +0800, wangxiaoming321 wrote:
>>intel_power_domains_init has been called twice in xe_device_probe:
>>xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init(xe)
>>xe_device_probe -> xe_display_init_noirq -> intel_display_driver_probe_noirq
>>-> intel_power_domains_init(i915)
>
> ok, once upon a time intel_power_domains_init() was called by the driver
> initialization code and not initialized inside the display. I think.
> Now it's part of the display probe and we never updated the xe side.
>
>>
>>It needs remove one to avoid power_domains->power_wells double malloc.
>>
>>unreferenced object 0xffff88811150ee00 (size 512):
>> comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
>> hex dump (first 32 bytes):
>> 10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
>> ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
>> backtrace:
>> [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
>> [<ffffffff812c98b2>] __kmalloc+0x52/0x150
>> [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
>> [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
>> [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
>> [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
>> [<ffffffff817f2187>] local_pci_probe+0x47/0xa0
>> [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
>> [<ffffffff8192f2a2>] really_probe+0x1a2/0x410
>> [<ffffffff8192f598>] __driver_probe_device+0x78/0x160
>> [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
>> [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
>> [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
>> [<ffffffff8192e159>] bus_add_driver+0x119/0x220
>> [<ffffffff81930d00>] driver_register+0x60/0x120
>> [<ffffffffa05e50a0>] 0xffffffffa05e50a0
>>
>
> This will need a Fixes trailer. This seems to be a suitable one:
>
> Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
>
>>Signed-off-by: wangxiaoming321 <xiaoming.wang@intel.com>
>>---
>> drivers/gpu/drm/xe/xe_display.c | 6 ------
>> 1 file changed, 6 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_display.c b/drivers/gpu/drm/xe/xe_display.c
>>index 74391d9b11ae..e4db069f0db3 100644
>>--- a/drivers/gpu/drm/xe/xe_display.c
>>+++ b/drivers/gpu/drm/xe/xe_display.c
>>@@ -134,8 +134,6 @@ static void xe_display_fini_nommio(struct drm_device *dev, void *dummy)
>>
>> int xe_display_init_nommio(struct xe_device *xe)
>> {
>>- int err;
>>-
>> if (!xe->info.enable_display)
>> return 0;
>>
>>@@ -145,10 +143,6 @@ int xe_display_init_nommio(struct xe_device *xe)
>> /* This must be called before any calls to HAS_PCH_* */
>> intel_detect_pch(xe);
>>
>>- err = intel_power_domains_init(xe);
>>- if (err)
>>- return err;
>
> xe_display_init_nommio() has xe_display_fini_nommio() as its destructor
> counter part. Unfortunately display side looks wrong as it does:
>
> init:
> intel_display_driver_probe_noirq() -> intel_power_domains_init()
>
> destroy:
> i915_driver_late_release() -> intel_power_domains_cleanup()
>
> I think leaving intel_power_domains_cleanup() as is for now so it's
> called by xe works, but this needs to go through CI, which apparently
> this series didn't go. I re-triggered it.
>
> +Jani if he thinks this can be changed in another way or already have
> the complete solution.
I don't. But it is and will be a recurring problem. i915 and xe core
drivers should handle display init and cleanup the same way. But
currently i915 goes on to call e.g. intel_power_domains_cleanup()
directly from top level driver code. There are other examples.
And we seem to have recently added *more*. See e.g. bd738d859e71
("drm/i915: Prevent modesets during driver init/shutdown").
BR,
Jani.
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/xe/display: Fix memleak in display initialization
2024-01-31 15:07 ` Jani Nikula
@ 2024-02-01 14:19 ` Maarten Lankhorst
0 siblings, 0 replies; 10+ messages in thread
From: Maarten Lankhorst @ 2024-02-01 14:19 UTC (permalink / raw)
To: Jani Nikula, Lucas De Marchi, wangxiaoming321
Cc: ogabbay, thomas.hellstrom, mripard, tzimmermann, airlied, daniel,
intel-xe, dri-devel, linux-kernel
On 2024-01-31 16:07, Jani Nikula wrote:
> On Wed, 31 Jan 2024, Lucas De Marchi <lucas.demarchi@intel.com> wrote:
>> +Jani
>>
>> On Fri, Jan 26, 2024 at 11:34:53PM +0800, wangxiaoming321 wrote:
>>> intel_power_domains_init has been called twice in xe_device_probe:
>>> xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init(xe)
>>> xe_device_probe -> xe_display_init_noirq -> intel_display_driver_probe_noirq
>>> -> intel_power_domains_init(i915)
>>
>> ok, once upon a time intel_power_domains_init() was called by the driver
>> initialization code and not initialized inside the display. I think.
>> Now it's part of the display probe and we never updated the xe side.
>>
>>>
>>> It needs remove one to avoid power_domains->power_wells double malloc.
>>>
>>> unreferenced object 0xffff88811150ee00 (size 512):
>>> comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
>>> hex dump (first 32 bytes):
>>> 10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
>>> ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
>>> backtrace:
>>> [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
>>> [<ffffffff812c98b2>] __kmalloc+0x52/0x150
>>> [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
>>> [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
>>> [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
>>> [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
>>> [<ffffffff817f2187>] local_pci_probe+0x47/0xa0
>>> [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
>>> [<ffffffff8192f2a2>] really_probe+0x1a2/0x410
>>> [<ffffffff8192f598>] __driver_probe_device+0x78/0x160
>>> [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
>>> [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
>>> [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
>>> [<ffffffff8192e159>] bus_add_driver+0x119/0x220
>>> [<ffffffff81930d00>] driver_register+0x60/0x120
>>> [<ffffffffa05e50a0>] 0xffffffffa05e50a0
>>>
>>
>> This will need a Fixes trailer. This seems to be a suitable one:
>>
>> Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
>>
>>> Signed-off-by: wangxiaoming321 <xiaoming.wang@intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_display.c | 6 ------
>>> 1 file changed, 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_display.c b/drivers/gpu/drm/xe/xe_display.c
>>> index 74391d9b11ae..e4db069f0db3 100644
>>> --- a/drivers/gpu/drm/xe/xe_display.c
>>> +++ b/drivers/gpu/drm/xe/xe_display.c
>>> @@ -134,8 +134,6 @@ static void xe_display_fini_nommio(struct drm_device *dev, void *dummy)
>>>
>>> int xe_display_init_nommio(struct xe_device *xe)
>>> {
>>> - int err;
>>> -
>>> if (!xe->info.enable_display)
>>> return 0;
>>>
>>> @@ -145,10 +143,6 @@ int xe_display_init_nommio(struct xe_device *xe)
>>> /* This must be called before any calls to HAS_PCH_* */
>>> intel_detect_pch(xe);
>>>
>>> - err = intel_power_domains_init(xe);
>>> - if (err)
>>> - return err;
>>
>> xe_display_init_nommio() has xe_display_fini_nommio() as its destructor
>> counter part. Unfortunately display side looks wrong as it does:
>>
>> init:
>> intel_display_driver_probe_noirq() -> intel_power_domains_init()
>>
>> destroy:
>> i915_driver_late_release() -> intel_power_domains_cleanup()
>>
>> I think leaving intel_power_domains_cleanup() as is for now so it's
>> called by xe works, but this needs to go through CI, which apparently
>> this series didn't go. I re-triggered it.
>>
>> +Jani if he thinks this can be changed in another way or already have
>> the complete solution.
>
> I don't. But it is and will be a recurring problem. i915 and xe core
> drivers should handle display init and cleanup the same way. But
> currently i915 goes on to call e.g. intel_power_domains_cleanup()
> directly from top level driver code. There are other examples.
>
> And we seem to have recently added *more*. See e.g. bd738d859e71
> ("drm/i915: Prevent modesets during driver init/shutdown").
That commit seems terrible Should we instead not only enable any code
that can cause modesets after it's safe to do so?
Cheers,
~Maarten
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] drm/xe/display: Fix memleak in display initialization
@ 2024-02-02 21:56 Lucas De Marchi
2024-02-03 6:05 ` Lucas De Marchi
0 siblings, 1 reply; 10+ messages in thread
From: Lucas De Marchi @ 2024-02-02 21:56 UTC (permalink / raw)
To: intel-xe; +Cc: Xiaoming Wang, Lucas De Marchi
From: Xiaoming Wang <xiaoming.wang@intel.com>
intel_power_domains_init is called twice in xe_device_probe:
xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init()
xe_device_probe -> xe_display_init_noirq -> intel_display_driver_probe_noirq
-> intel_power_domains_init(i915)
It needs remove one to avoid power_domains->power_wells double malloc.
unreferenced object 0xffff88811150ee00 (size 512):
comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
hex dump (first 32 bytes):
10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
[<ffffffff812c98b2>] __kmalloc+0x52/0x150
[<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
[<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
[<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
[<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
[<ffffffff817f2187>] local_pci_probe+0x47/0xa0
[<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
[<ffffffff8192f2a2>] really_probe+0x1a2/0x410
[<ffffffff8192f598>] __driver_probe_device+0x78/0x160
[<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
[<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
[<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
[<ffffffff8192e159>] bus_add_driver+0x119/0x220
[<ffffffff81930d00>] driver_register+0x60/0x120
[<ffffffffa05e50a0>] 0xffffffffa05e50a0
The call to intel_power_domains_cleanup() needs to stay where it is for
now. The main issue is that while the init is called by the display
side, shared by i915 and xe, the cleanup is called by a non-shared code
path. Fixing that will be done as a separate commit.
Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
Signed-off-by: Xiaoming Wang <xiaoming.wang@intel.com>
[ reword commit message and explain why the fini needs to stay
where it is ]
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
drivers/gpu/drm/xe/display/xe_display.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
index 74391d9b11ae..e4db069f0db3 100644
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@ -134,8 +134,6 @@ static void xe_display_fini_nommio(struct drm_device *dev, void *dummy)
int xe_display_init_nommio(struct xe_device *xe)
{
- int err;
-
if (!xe->info.enable_display)
return 0;
@@ -145,10 +143,6 @@ int xe_display_init_nommio(struct xe_device *xe)
/* This must be called before any calls to HAS_PCH_* */
intel_detect_pch(xe);
- err = intel_power_domains_init(xe);
- if (err)
- return err;
-
return drmm_add_action_or_reset(&xe->drm, xe_display_fini_nommio, xe);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/xe/display: Fix memleak in display initialization
2024-02-02 21:56 Lucas De Marchi
@ 2024-02-03 6:05 ` Lucas De Marchi
0 siblings, 0 replies; 10+ messages in thread
From: Lucas De Marchi @ 2024-02-03 6:05 UTC (permalink / raw)
To: intel-xe; +Cc: Xiaoming Wang
On Fri, Feb 02, 2024 at 01:56:58PM -0800, Lucas De Marchi wrote:
>From: Xiaoming Wang <xiaoming.wang@intel.com>
>
>intel_power_domains_init is called twice in xe_device_probe:
>xe_device_probe -> xe_display_init_nommio -> intel_power_domains_init()
>xe_device_probe -> xe_display_init_noirq -> intel_display_driver_probe_noirq
>-> intel_power_domains_init(i915)
>
>It needs remove one to avoid power_domains->power_wells double malloc.
>
>unreferenced object 0xffff88811150ee00 (size 512):
> comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
> hex dump (first 32 bytes):
> 10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................
> ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................
> backtrace:
> [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
> [<ffffffff812c98b2>] __kmalloc+0x52/0x150
> [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
> [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
> [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
> [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
> [<ffffffff817f2187>] local_pci_probe+0x47/0xa0
> [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
> [<ffffffff8192f2a2>] really_probe+0x1a2/0x410
> [<ffffffff8192f598>] __driver_probe_device+0x78/0x160
> [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
> [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
> [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
> [<ffffffff8192e159>] bus_add_driver+0x119/0x220
> [<ffffffff81930d00>] driver_register+0x60/0x120
> [<ffffffffa05e50a0>] 0xffffffffa05e50a0
>
>The call to intel_power_domains_cleanup() needs to stay where it is for
>now. The main issue is that while the init is called by the display
>side, shared by i915 and xe, the cleanup is called by a non-shared code
>path. Fixing that will be done as a separate commit.
>
>Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
>Signed-off-by: Xiaoming Wang <xiaoming.wang@intel.com>
>[ reword commit message and explain why the fini needs to stay
> where it is ]
>Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
>Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
applied, thanks
Lucas De Marchi
>---
> drivers/gpu/drm/xe/display/xe_display.c | 6 ------
> 1 file changed, 6 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
>index 74391d9b11ae..e4db069f0db3 100644
>--- a/drivers/gpu/drm/xe/display/xe_display.c
>+++ b/drivers/gpu/drm/xe/display/xe_display.c
>@@ -134,8 +134,6 @@ static void xe_display_fini_nommio(struct drm_device *dev, void *dummy)
>
> int xe_display_init_nommio(struct xe_device *xe)
> {
>- int err;
>-
> if (!xe->info.enable_display)
> return 0;
>
>@@ -145,10 +143,6 @@ int xe_display_init_nommio(struct xe_device *xe)
> /* This must be called before any calls to HAS_PCH_* */
> intel_detect_pch(xe);
>
>- err = intel_power_domains_init(xe);
>- if (err)
>- return err;
>-
> return drmm_add_action_or_reset(&xe->drm, xe_display_fini_nommio, xe);
> }
>
>--
>2.43.0
>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-02-03 6:05 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-25 6:36 [PATCH] drm/xe/display: Fix memleak in display initialization wangxiaoming321
2024-01-25 9:41 ` Jani Nikula
2024-01-26 14:34 ` wangxiaoming321
2024-01-26 14:44 ` Ville Syrjälä
2024-01-26 15:34 ` wangxiaoming321
2024-01-31 14:54 ` Lucas De Marchi
2024-01-31 15:07 ` Jani Nikula
2024-02-01 14:19 ` Maarten Lankhorst
-- strict thread matches above, loose matches on Subject: below --
2024-02-02 21:56 Lucas De Marchi
2024-02-03 6:05 ` Lucas De Marchi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox