* [BUG] drm/tegra: DMA buffers are not always freed
From: Aaron Kling @ 2026-05-12  5:29 UTC
To: linux-tegra, dri-devel

There is an issue with tegra-drm where some buffers get created, then
freed, but the DMA buffer never gets freed, causing display controller
memory allocations to start failing after the leaks fill up CMA.

I created an issue on the freedesktop issue tracker [0] with a patch
with some debug logs I added, then a log from Android that contains
these logs. CMA is set to 512MB, and when allocations start to fail,
the unfreed allocations add up to just shy of 500MB, where it's
reasonable to expect that 8MB contiguous is no longer available. The
log was generated on a Jetson TX2 NX, but I have seen this leak on
other archs as well; this also does not appear to be limited to SoCs
with nvdisplay.

This does not appear to be a userspace issue. The graphics allocator
works as expected for other SoC vendors. And as the logs show, the
delete dumb buffer ioctl is called, but is not always followed by the
DMA buffer getting freed. I have also observed this issue with a
gralloc that uses the tegra gem create and such; this is not unique to
dumb buffers, that's just the last log I had when deciding to post the
issue to lkml.

What I primarily intend to ask here is how to further debug this
issue. I'm not finding any direct path between the delete dumb ioctl
handling and gem release or tegra bo free. Can someone point me to the
pieces in the middle I'm missing, where the logic is to decide if a
buffer should be freed?

Aaron

[0] https://gitlab.freedesktop.org/drm/tegra/-/work_items/9
* Re: [BUG] drm/tegra: DMA buffers are not always freed
From: Mikko Perttunen @ 2026-05-13  3:26 UTC
To: linux-tegra, dri-devel, Aaron Kling

On Tuesday, May 12, 2026 2:29 PM Aaron Kling wrote:
> There is an issue with tegra-drm where some buffers get created, then
> freed, but the DMA buffer never gets freed, causing display controller
> memory allocations to start failing after the leaks fill up CMA.
>
> I created an issue on the freedesktop issue tracker [0] with a patch
> with some debug logs I added, then a log from Android that contains
> these logs. CMA is set to 512MB, and when allocations start to fail,
> the unfreed allocations add up to just shy of 500MB, where it's
> reasonable to expect that 8MB contiguous is no longer available. The
> log was generated on a Jetson TX2 NX, but I have seen this leak on
> other archs as well; this also does not appear to be limited to SoCs
> with nvdisplay.
>
> This does not appear to be a userspace issue. The graphics allocator
> works as expected for other SoC vendors. And as the logs show, the
> delete dumb buffer ioctl is called, but is not always followed by the
> DMA buffer getting freed. I have also observed this issue with a
> gralloc that uses the tegra gem create and such; this is not unique to
> dumb buffers, that's just the last log I had when deciding to post the
> issue to lkml.
>
> What I primarily intend to ask here is how to further debug this
> issue. I'm not finding any direct path between the delete dumb ioctl
> handling and gem release or tegra bo free. Can someone point me to the
> pieces in the middle I'm missing, where the logic is to decide if a
> buffer should be freed?
>
> Aaron
>
> [0] https://gitlab.freedesktop.org/drm/tegra/-/work_items/9
>

If the issue is specific to buffers that get used with display, I have
an idea of what the issue is -- there is some circular reference
counting with the BO cache in the host1x driver, and that means that
BOs that end up in the cache never get released.

Let me do some testing locally and I'll send out a patch once ready.

Thanks!
Mikko
* Re: [BUG] drm/tegra: DMA buffers are not always freed
From: Aaron Kling @ 2026-05-13  4:26 UTC
To: Mikko Perttunen; +Cc: linux-tegra, dri-devel

On Tue, May 12, 2026 at 10:26 PM Mikko Perttunen <mperttunen@nvidia.com> wrote:
>
> On Tuesday, May 12, 2026 2:29 PM Aaron Kling wrote:
> > There is an issue with tegra-drm where some buffers get created, then
> > freed, but the DMA buffer never gets freed, causing display controller
> > memory allocations to start failing after the leaks fill up CMA.
> >
> > I created an issue on the freedesktop issue tracker [0] with a patch
> > with some debug logs I added, then a log from Android that contains
> > these logs. CMA is set to 512MB, and when allocations start to fail,
> > the unfreed allocations add up to just shy of 500MB, where it's
> > reasonable to expect that 8MB contiguous is no longer available. The
> > log was generated on a Jetson TX2 NX, but I have seen this leak on
> > other archs as well; this also does not appear to be limited to SoCs
> > with nvdisplay.
> >
> > This does not appear to be a userspace issue. The graphics allocator
> > works as expected for other SoC vendors. And as the logs show, the
> > delete dumb buffer ioctl is called, but is not always followed by the
> > DMA buffer getting freed. I have also observed this issue with a
> > gralloc that uses the tegra gem create and such; this is not unique to
> > dumb buffers, that's just the last log I had when deciding to post the
> > issue to lkml.
> >
> > What I primarily intend to ask here is how to further debug this
> > issue. I'm not finding any direct path between the delete dumb ioctl
> > handling and gem release or tegra bo free. Can someone point me to the
> > pieces in the middle I'm missing, where the logic is to decide if a
> > buffer should be freed?
> >
> > Aaron
> >
> > [0] https://gitlab.freedesktop.org/drm/tegra/-/work_items/9
> >
>
> If the issue is specific to buffers that get used with display, I have
> an idea of what the issue is -- there is some circular reference
> counting with the BO cache in the host1x driver, and that means that
> BOs that end up in the cache never get released.

As far as I know, this only affects display controller buffers. Though
unfortunately, I have limited ways to test the media engines right
now.

> Let me do some testing locally and I'll send out a patch once ready.

Sounds good, thanks.

Aaron
* Re: [BUG] drm/tegra: DMA buffers are not always freed
From: Mikko Perttunen @ 2026-05-15  2:39 UTC
To: Aaron Kling; +Cc: linux-tegra, dri-devel

On Wednesday, May 13, 2026 1:26 PM Aaron Kling wrote:
> On Tue, May 12, 2026 at 10:26 PM Mikko Perttunen <mperttunen@nvidia.com> wrote:
> >
> > On Tuesday, May 12, 2026 2:29 PM Aaron Kling wrote:
> > > There is an issue with tegra-drm where some buffers get created, then
> > > freed, but the DMA buffer never gets freed, causing display controller
> > > memory allocations to start failing after the leaks fill up CMA.
> > >
> > > I created an issue on the freedesktop issue tracker [0] with a patch
> > > with some debug logs I added, then a log from Android that contains
> > > these logs. CMA is set to 512MB, and when allocations start to fail,
> > > the unfreed allocations add up to just shy of 500MB, where it's
> > > reasonable to expect that 8MB contiguous is no longer available. The
> > > log was generated on a Jetson TX2 NX, but I have seen this leak on
> > > other archs as well; this also does not appear to be limited to SoCs
> > > with nvdisplay.
> > >
> > > This does not appear to be a userspace issue. The graphics allocator
> > > works as expected for other SoC vendors. And as the logs show, the
> > > delete dumb buffer ioctl is called, but is not always followed by the
> > > DMA buffer getting freed. I have also observed this issue with a
> > > gralloc that uses the tegra gem create and such; this is not unique to
> > > dumb buffers, that's just the last log I had when deciding to post the
> > > issue to lkml.
> > >
> > > What I primarily intend to ask here is how to further debug this
> > > issue. I'm not finding any direct path between the delete dumb ioctl
> > > handling and gem release or tegra bo free. Can someone point me to the
> > > pieces in the middle I'm missing, where the logic is to decide if a
> > > buffer should be freed?
> > >
> > > Aaron
> > >
> > > [0] https://gitlab.freedesktop.org/drm/tegra/-/work_items/9
> > >
> >
> > If the issue is specific to buffers that get used with display, I have
> > an idea of what the issue is -- there is some circular reference
> > counting with the BO cache in the host1x driver, and that means that
> > BOs that end up in the cache never get released.
>
> As far as I know, this only affects display controller buffers. Though
> unfortunately, I have limited ways to test the media engines right
> now.

I've been working on some more userspace for the media engines.
Hopefully I can get that in shape soon.

>
> > Let me do some testing locally and I'll send out a patch once ready.
>
> Sounds good, thanks.

I posted a fix; please give it a try. Incidentally, on my side I don't
have that much testing set up for the display :)

Mikko

>
> Aaron
* Re: [BUG] drm/tegra: DMA buffers are not always freed
From: Aaron Kling @ 2026-05-15  4:30 UTC
To: Mikko Perttunen; +Cc: linux-tegra, dri-devel

On Thu, May 14, 2026 at 9:39 PM Mikko Perttunen <mperttunen@nvidia.com> wrote:
>
> On Wednesday, May 13, 2026 1:26 PM Aaron Kling wrote:
> > On Tue, May 12, 2026 at 10:26 PM Mikko Perttunen <mperttunen@nvidia.com> wrote:
> > >
> > > On Tuesday, May 12, 2026 2:29 PM Aaron Kling wrote:
> > > > There is an issue with tegra-drm where some buffers get created, then
> > > > freed, but the DMA buffer never gets freed, causing display controller
> > > > memory allocations to start failing after the leaks fill up CMA.
> > > >
> > > > I created an issue on the freedesktop issue tracker [0] with a patch
> > > > with some debug logs I added, then a log from Android that contains
> > > > these logs. CMA is set to 512MB, and when allocations start to fail,
> > > > the unfreed allocations add up to just shy of 500MB, where it's
> > > > reasonable to expect that 8MB contiguous is no longer available. The
> > > > log was generated on a Jetson TX2 NX, but I have seen this leak on
> > > > other archs as well; this also does not appear to be limited to SoCs
> > > > with nvdisplay.
> > > >
> > > > This does not appear to be a userspace issue. The graphics allocator
> > > > works as expected for other SoC vendors. And as the logs show, the
> > > > delete dumb buffer ioctl is called, but is not always followed by the
> > > > DMA buffer getting freed. I have also observed this issue with a
> > > > gralloc that uses the tegra gem create and such; this is not unique to
> > > > dumb buffers, that's just the last log I had when deciding to post the
> > > > issue to lkml.
> > > >
> > > > What I primarily intend to ask here is how to further debug this
> > > > issue. I'm not finding any direct path between the delete dumb ioctl
> > > > handling and gem release or tegra bo free. Can someone point me to the
> > > > pieces in the middle I'm missing, where the logic is to decide if a
> > > > buffer should be freed?
> > > >
> > > > Aaron
> > > >
> > > > [0] https://gitlab.freedesktop.org/drm/tegra/-/work_items/9
> > > >
> > >
> > > If the issue is specific to buffers that get used with display, I have
> > > an idea of what the issue is -- there is some circular reference
> > > counting with the BO cache in the host1x driver, and that means that
> > > BOs that end up in the cache never get released.
> >
> > As far as I know, this only affects display controller buffers. Though
> > unfortunately, I have limited ways to test the media engines right
> > now.
>
> I've been working on some more userspace for the media engines.
> Hopefully I can get that in shape soon.

Great to hear. My Android use case unfortunately has some very specific
requirements, namely a C2 AIDL HAL. But maybe with more examples of the
UAPI in action, I can try looking at one again. Though my last attempt
using the existing nvdec example had my head spinning in about 3
seconds flat between that and the C2 API.

> >
> > > Let me do some testing locally and I'll send out a patch once ready.
> >
> > Sounds good, thanks.
>
> I posted a fix; please give it a try. Incidentally, on my side I don't
> have that much testing set up for the display :)

My initial test run on p2972 using swiftshader is looking good for
this specific issue at least. I'm partway through a VTS run and haven't
hit any allocation failures, far past the point where I got them
previously. However, this may have peeled the onion back to another
problem: I'm getting stack traces from shared plane atomics, and a lot
of MMU faults during the graphics tests. I'll see if I can narrow down
a simple reproduction and track down the cause.

And I'll check the BO caching patch on a few other devices, then post
a Tested-by on there if they work.

Aaron