Intel-XE Archive on lore.kernel.org
* [REGRESSION][RFC] memleak on xe load & unload on PTL
@ 2025-11-18 12:35 Michał Grzelak
  2025-11-19 13:39 ` Arunpravin Paneer Selvam
  2025-11-19 17:26 ` Lucas De Marchi
  0 siblings, 2 replies; 4+ messages in thread
From: Michał Grzelak @ 2025-11-18 12:35 UTC
  To: intel-xe, dri-devel
  Cc: Lucas De Marchi, Rodrigo Vivi, Jani Nikula,
	Arunpravin Paneer Selvam, Thomas Hellström,
	Michał Grzelak

[-- Attachment #1: Type: text/plain, Size: 1205 bytes --]

Hi,

I just hit a memory leak on xe module load and unload:

unreferenced object 0xffff88811b047d10 (size 16):
   comm "modprobe", pid 1058, jiffies 4297578480
   hex dump (first 16 bytes):
     00 6b 4b 2d 81 88 ff ff 80 7e 4b 2d 81 88 ff ff  .kK-.....~K-....
   backtrace (crc 4f169eaf):
     kmemleak_alloc+0x4a/0x90
     __kmalloc_cache_noprof+0x488/0x800
     drm_buddy_init+0xc2/0x330 [drm_buddy]
     __xe_ttm_vram_mgr_init+0xc3/0x190 [xe]
     xe_ttm_stolen_mgr_init+0xf5/0x9d0 [xe]
     xe_device_probe+0x326/0x9e0 [xe]
     xe_pci_probe+0x39a/0x610 [xe]
     local_pci_probe+0x47/0xb0
     pci_device_probe+0xf3/0x260
     really_probe+0xf1/0x3c0
     __driver_probe_device+0x8c/0x180
     driver_probe_device+0x24/0xd0
     __driver_attach+0x10f/0x220
     bus_for_each_dev+0x7f/0xe0
     driver_attach+0x1e/0x30
     bus_add_driver+0x151/0x290

The issue reproduces on PTL and BMG booted with the latest drm-tip
kernel. The leak appears to have been introduced by commit d4cd665c9
("drm/buddy: Separate clear and dirty free block trees"), since reverting it
makes the leak disappear. I have attached an RFC patch that, at first
glance, could fix the issue.

I have added the xe maintainers and the commit author to Cc.

BR,
Michał

[-- Attachment #2: rfc.patch --]
[-- Type: text/x-diff; name=0001-drm-buddy-release-free_trees-array-on-buddy-mm-teard.patch, Size: 1429 bytes --]

From 914aa53c18a88834310b8560323b63bae98fb29d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Grzelak?= <michal.grzelak@intel.com>
Date: Tue, 18 Nov 2025 11:34:11 +0100
Subject: [PATCH] drm/buddy: release free_trees array on buddy mm teardown
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Organization: Intel Technology Poland sp. z o.o. - ul. Slowackiego 173, 80-298 Gdansk - KRS 101882 - NIP 957-07-52-316

drm_buddy_init() allocates the mm->free_trees array, which holds one
pointer each for the clear and dirty RB trees. drm_buddy_fini() frees the
entries of that array but never the array itself, which leaks memory on
every xe module load and unload cycle.

Free the mm->free_trees array when tearing down the buddy memory manager.

Fixes: d4cd665c ("drm/buddy: Separate clear and dirty free block trees")
Signed-off-by: Michał Grzelak <michal.grzelak@intel.com>
---
 drivers/gpu/drm/drm_buddy.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 2f279b46bd2c..8308116058cc 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -420,6 +420,7 @@ void drm_buddy_fini(struct drm_buddy *mm)
 
 	for_each_free_tree(i)
 		kfree(mm->free_trees[i]);
+	kfree(mm->free_trees);
 	kfree(mm->roots);
 }
 EXPORT_SYMBOL(drm_buddy_fini);
-- 
2.45.2



* Re: [REGRESSION][RFC] memleak on xe load & unload on PTL
  2025-11-18 12:35 [REGRESSION][RFC] memleak on xe load & unload on PTL Michał Grzelak
@ 2025-11-19 13:39 ` Arunpravin Paneer Selvam
  2025-11-19 17:26 ` Lucas De Marchi
  1 sibling, 0 replies; 4+ messages in thread
From: Arunpravin Paneer Selvam @ 2025-11-19 13:39 UTC
  To: Michał Grzelak, intel-xe, dri-devel
  Cc: Lucas De Marchi, Rodrigo Vivi, Jani Nikula, Thomas Hellström,
	Christian König

Hi Michal,

Please send the patch for review.

Regards,
Arun.



* Re: [REGRESSION][RFC] memleak on xe load & unload on PTL
  2025-11-18 12:35 [REGRESSION][RFC] memleak on xe load & unload on PTL Michał Grzelak
  2025-11-19 13:39 ` Arunpravin Paneer Selvam
@ 2025-11-19 17:26 ` Lucas De Marchi
  2025-11-20  8:56   ` Michał Grzelak
  1 sibling, 1 reply; 4+ messages in thread
From: Lucas De Marchi @ 2025-11-19 17:26 UTC
  To: Michał Grzelak
  Cc: intel-xe, dri-devel, Rodrigo Vivi, Jani Nikula,
	Arunpravin Paneer Selvam, Thomas Hellström

On Tue, Nov 18, 2025 at 01:35:53PM +0100, Michał Grzelak wrote:
>Hi,
>
>just hit memory leak on xe module load & unload:
>
>unreferenced object 0xffff88811b047d10 (size 16):
>  comm "modprobe", pid 1058, jiffies 4297578480
>  hex dump (first 16 bytes):
>    00 6b 4b 2d 81 88 ff ff 80 7e 4b 2d 81 88 ff ff  .kK-.....~K-....
>  backtrace (crc 4f169eaf):
>    kmemleak_alloc+0x4a/0x90
>    __kmalloc_cache_noprof+0x488/0x800
>    drm_buddy_init+0xc2/0x330 [drm_buddy]
>    __xe_ttm_vram_mgr_init+0xc3/0x190 [xe]
>    xe_ttm_stolen_mgr_init+0xf5/0x9d0 [xe]
>    xe_device_probe+0x326/0x9e0 [xe]
>    xe_pci_probe+0x39a/0x610 [xe]
>    local_pci_probe+0x47/0xb0
>    pci_device_probe+0xf3/0x260
>    really_probe+0xf1/0x3c0
>    __driver_probe_device+0x8c/0x180
>    driver_probe_device+0x24/0xd0
>    __driver_attach+0x10f/0x220
>    bus_for_each_dev+0x7f/0xe0
>    driver_attach+0x1e/0x30
>    bus_add_driver+0x151/0x290
>
>Issue was reproduced on PTL & BMG, booted with latest kernel from
>drm-tip. Looks like fault was introduced in commit d4cd665c9
>("drm/buddy: Separate clear and dirty free block trees"), since reverting it
>makes the leak disappear. Also attached RFC patch, which at first
>glance could fix the issue.
>
>Added xe maintainers and the author to Cc.

The backtrace above could be folded into the commit message below, so
that the two are merged together.

>
>BR,
>Michał

>From 914aa53c18a88834310b8560323b63bae98fb29d Mon Sep 17 00:00:00 2001
>From: =?UTF-8?q?Micha=C5=82=20Grzelak?= <michal.grzelak@intel.com>
>Date: Tue, 18 Nov 2025 11:34:11 +0100
>Subject: [PATCH] drm/buddy: release free_trees array on buddy mm teardown
>MIME-Version: 1.0
>Content-Type: text/plain; charset=UTF-8
>Content-Transfer-Encoding: 8bit
>Organization: Intel Technology Poland sp. z o.o. - ul. Slowackiego 173, 80-298 Gdansk - KRS 101882 - NIP 957-07-52-316
>
>During initialization of DRM buddy memory manager at drm_buddy_init,
>mm->free_trees array is allocated for both clear and dirty RB trees.
>During cleanup happening at drm_buddy_fini it is never freed, leading to
>memory leaks observed on xe module load & unload cycles.
>
>Deallocate array for free trees when cleaning up buddy memory manager.
>
>Fixes: d4cd665c ("drm/buddy: Separate clear and dirty free block trees")
>Signed-off-by: Michał Grzelak <michal.grzelak@intel.com>
>---
> drivers/gpu/drm/drm_buddy.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>index 2f279b46bd2c..8308116058cc 100644
>--- a/drivers/gpu/drm/drm_buddy.c
>+++ b/drivers/gpu/drm/drm_buddy.c
>@@ -420,6 +420,7 @@ void drm_buddy_fini(struct drm_buddy *mm)
> 
> 	for_each_free_tree(i)
> 		kfree(mm->free_trees[i]);
>+	kfree(mm->free_trees);

Looks correct to me; it also matches the out_free_tree label in
drm_buddy_init().

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

thanks,
Lucas De Marchi

> 	kfree(mm->roots);
> }
> EXPORT_SYMBOL(drm_buddy_fini);
>-- 
>2.45.2
>



* Re: [REGRESSION][RFC] memleak on xe load & unload on PTL
  2025-11-19 17:26 ` Lucas De Marchi
@ 2025-11-20  8:56   ` Michał Grzelak
  0 siblings, 0 replies; 4+ messages in thread
From: Michał Grzelak @ 2025-11-20  8:56 UTC
  To: Lucas De Marchi
  Cc: Michał Grzelak, intel-xe, dri-devel, Rodrigo Vivi,
	Jani Nikula, Arunpravin Paneer Selvam, Thomas Hellström

[-- Attachment #1: Type: text/plain, Size: 3136 bytes --]

On Wed, 19 Nov 2025, Lucas De Marchi wrote:
> On Tue, Nov 18, 2025 at 01:35:53PM +0100, Michał Grzelak wrote:
>> Hi,
>> 
>> just hit memory leak on xe module load & unload:
>> 
>> unreferenced object 0xffff88811b047d10 (size 16):
>>  comm "modprobe", pid 1058, jiffies 4297578480
>>  hex dump (first 16 bytes):
>>    00 6b 4b 2d 81 88 ff ff 80 7e 4b 2d 81 88 ff ff  .kK-.....~K-....
>>  backtrace (crc 4f169eaf):
>>    kmemleak_alloc+0x4a/0x90
>>    __kmalloc_cache_noprof+0x488/0x800
>>    drm_buddy_init+0xc2/0x330 [drm_buddy]
>>    __xe_ttm_vram_mgr_init+0xc3/0x190 [xe]
>>    xe_ttm_stolen_mgr_init+0xf5/0x9d0 [xe]
>>    xe_device_probe+0x326/0x9e0 [xe]
>>    xe_pci_probe+0x39a/0x610 [xe]
>>    local_pci_probe+0x47/0xb0
>>    pci_device_probe+0xf3/0x260
>>    really_probe+0xf1/0x3c0
>>    __driver_probe_device+0x8c/0x180
>>    driver_probe_device+0x24/0xd0
>>    __driver_attach+0x10f/0x220
>>    bus_for_each_dev+0x7f/0xe0
>>    driver_attach+0x1e/0x30
>>    bus_add_driver+0x151/0x290
>> 
>> Issue was reproduced on PTL & BMG, booted with latest kernel from
>> drm-tip. Looks like fault was introduced in commit d4cd665c9
>> ("drm/buddy: Separate clear and dirty free block trees"), since reverting it
>> makes the leak disappear. Also attached RFC patch, which at first
>> glance could fix the issue.
>> 
>> Added xe maintainers and the author to Cc.
>
> the backtrace above and the commit message below could be merged
> together
>
>> From 914aa53c18a88834310b8560323b63bae98fb29d Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Micha=C5=82=20Grzelak?= <michal.grzelak@intel.com>
>> Date: Tue, 18 Nov 2025 11:34:11 +0100
>> Subject: [PATCH] drm/buddy: release free_trees array on buddy mm teardown
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>> Organization: Intel Technology Poland sp. z o.o. - ul. Slowackiego 173, 80-298 Gdansk - KRS 101882 - NIP 957-07-52-316
>> 
>> During initialization of DRM buddy memory manager at drm_buddy_init,
>> mm->free_trees array is allocated for both clear and dirty RB trees.
>> During cleanup happening at drm_buddy_fini it is never freed, leading to
>> memory leaks observed on xe module load & unload cycles.
>> 
>> Deallocate array for free trees when cleaning up buddy memory manager.
>> 
>> Fixes: d4cd665c ("drm/buddy: Separate clear and dirty free block trees")
>> Signed-off-by: Michał Grzelak <michal.grzelak@intel.com>
>> ---
>> drivers/gpu/drm/drm_buddy.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>> index 2f279b46bd2c..8308116058cc 100644
>> --- a/drivers/gpu/drm/drm_buddy.c
>> +++ b/drivers/gpu/drm/drm_buddy.c
>> @@ -420,6 +420,7 @@ void drm_buddy_fini(struct drm_buddy *mm)
>>
>> 	for_each_free_tree(i)
>> 		kfree(mm->free_trees[i]);
>> +	kfree(mm->free_trees);
>
> looks correct to me and also matches the out_free_tree label in
> drm_buddy_init()
>
> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

Thank you for the review, Lucas. I will add your R-B to the commit
message when I resend the patch.

BR,
Michał

