* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: Joel Fernandes @ 2026-01-30 21:14 UTC
To: John Hubbard
Cc: Danilo Krummrich, Zhi Wang, linux-kernel, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Jonathan Corbet, Alex Deucher, Christian Koenig, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Rui Huang,
Matthew Auld, Matthew Brost, Lucas De Marchi, Thomas Hellstrom,
Helge Deller, Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Bjorn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Alexey Ivanov,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, Daniel Almeida,
nouveau, dri-devel, rust-for-linux, linux-doc, amd-gfx, intel-gfx,
intel-xe, linux-fbdev
In-Reply-To: <c064fbdc-9202-437d-80ff-6134d2a33778@nvidia.com>
On 1/29/2026 10:38 PM, John Hubbard wrote:
> On 1/29/26 5:59 PM, Joel Fernandes wrote:
>> On 1/29/26 8:12 PM, John Hubbard wrote:
>>> On 1/29/26 4:26 PM, Joel Fernandes wrote:
>>>> Based on the below discussion and research, I came up with some deadlock
>>>> scenarios that we need to handle in the v6 series of these patches.
>>>> [...]
>>>> memory allocations under locks that we need in the dma-fence signaling
>>>> critical path (when doing the virtual memory map/unmap)
>>>
>>> unmap? Are you seeing any allocations happening during unmap? I don't
>>> immediately see any, but that sounds surprising.
>>
>> Not allocations but we are acquiring locks during unmap. My understanding
>> is (at least some) unmaps have to also be done in the dma fence signaling
>> critical path (the run stage), but Danilo/you can correct me if I am wrong
>> on that. We cannot avoid all locking but those same locks cannot be held in
>> any other paths which do a memory allocation (as mentioned in one of the
>> deadlock scenarios), that is probably the main thing to check for unmap.
>>
>
> Right, OK we are on the same page now: no allocations happening on unmap,
> but it can still deadlock, because the driver is typically going to
> use a single lock to protect both map- and unmap-related calls
> to the buddy allocator.
Yes exactly!
>
> For the deadlock above, I think a good way to break that deadlock is
> to not allow taking that lock in a fence signaling calling path.
>
> So during an unmap, instead of "lock, unmap/free, unlock" it should
> move the item to a deferred-free list, which is processed separately.
> Of course, this is a little complex, because the allocation and reclaim
> have to be aware of such lists if they get large.
Yes, and we also need to avoid GFP_KERNEL allocations while holding any of
these mm locks (whichever we take during map). The GPU buddy allocator actually
does GFP_KERNEL allocations internally, which is problematic.
Some solutions / next steps:
1. allocating (VRAM and system memory) outside mm locks just before acquiring them.
2. pre-allocating both VRAM and system memory needed, before the DMA fence
critical paths (The issue is also to figure out how much memory to pre-allocate
for the page table pages based on the VM_BIND request. I think we can analyze
the page tables in the submit stage to make an estimate).
3. Unfortunately, I am using gpu-buddy when allocating a VA range in the Vmm
(called virt_buddy), which itself does GFP_KERNEL memory allocations in the
allocate path. I am not sure what to do about this yet. ISTR the maple tree also
has similar issues.
4. Using non-reclaimable memory allocations where pre-allocation or
pre-allocated memory pools are not possible (I'd like to avoid this #4 so we
don't fail allocations when memory is scarce).
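As a rough illustration of steps 1 and 2 (not nova-core code), the userspace
Python sketch below moves allocation out of the locked region: memory is
reserved before the lock is taken, and the locked section only consumes what
was reserved. `PageTablePool`, `reserve`, `take`, `submit_stage` and
`run_stage` are hypothetical names; a `threading.Lock` stands in for the mm
lock and a dict for the page table.

```python
import threading


class PageTablePool:
    """Hypothetical pool of pre-allocated page-table pages (illustration only)."""

    def __init__(self):
        self._pages = []

    def reserve(self, count):
        # In a real driver this may sleep or trigger reclaim, so it has to
        # run in the submit stage, before any mm lock is taken.
        self._pages.extend(bytearray(4096) for _ in range(count))

    def take(self):
        # Never allocates: fails instead if the estimate was too low.
        if not self._pages:
            raise RuntimeError("pre-allocation estimate too small")
        return self._pages.pop()

    def remaining(self):
        return len(self._pages)


mm_lock = threading.Lock()   # stands in for the lock taken during map
page_table = {}              # stands in for the GPU page table


def submit_stage(pool, estimated_pages):
    # Allocation happens here, outside mm_lock and outside the
    # fence signaling critical path.
    pool.reserve(estimated_pages)


def run_stage(pool, virt_addrs):
    # Fence signaling critical path: lock, consume reserved pages, unlock.
    with mm_lock:
        for va in virt_addrs:
            page_table[va] = pool.take()


pool = PageTablePool()
submit_stage(pool, estimated_pages=3)
run_stage(pool, [0x0000, 0x1000])
```

In this sketch an over-estimate simply leaves pages in the pool, which a
cleanup stage could hand back; an under-estimate fails loudly rather than
allocating under the lock.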
Will work on these issues for v7. Thanks,
--
Joel Fernandes
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: John Hubbard @ 2026-01-30 3:38 UTC
To: Joel Fernandes
Cc: Danilo Krummrich, Zhi Wang, linux-kernel, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Jonathan Corbet, Alex Deucher, Christian Koenig, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Rui Huang,
Matthew Auld, Matthew Brost, Lucas De Marchi, Thomas Hellstrom,
Helge Deller, Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Bjorn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Alexey Ivanov,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, Daniel Almeida,
nouveau, dri-devel, rust-for-linux, linux-doc, amd-gfx, intel-gfx,
intel-xe, linux-fbdev
In-Reply-To: <20260130015901.GA301119@joelbox2>
On 1/29/26 5:59 PM, Joel Fernandes wrote:
> On 1/29/26 8:12 PM, John Hubbard wrote:
>> On 1/29/26 4:26 PM, Joel Fernandes wrote:
>>> Based on the below discussion and research, I came up with some deadlock
>>> scenarios that we need to handle in the v6 series of these patches.
>>> [...]
>>> memory allocations under locks that we need in the dma-fence signaling
>>> critical path (when doing the virtual memory map/unmap)
>>
>> unmap? Are you seeing any allocations happening during unmap? I don't
>> immediately see any, but that sounds surprising.
>
> Not allocations but we are acquiring locks during unmap. My understanding
> is (at least some) unmaps have to also be done in the dma fence signaling
> critical path (the run stage), but Danilo/you can correct me if I am wrong
> on that. We cannot avoid all locking but those same locks cannot be held in
> any other paths which do a memory allocation (as mentioned in one of the
> deadlock scenarios), that is probably the main thing to check for unmap.
>
Right, OK we are on the same page now: no allocations happening on unmap,
but it can still deadlock, because the driver is typically going to
use a single lock to protect both map- and unmap-related calls
to the buddy allocator.
For the deadlock above, I think a good way to break that deadlock is
to not allow taking that lock in a fence signaling calling path.
So during an unmap, instead of "lock, unmap/free, unlock" it should
move the item to a deferred-free list, which is processed separately.
Of course, this is a little complex, because the allocation and reclaim
have to be aware of such lists if they get large.
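A minimal userspace sketch of that deferred-free idea, in Python rather than
driver code. All names here (`buddy_lock`, `deferred_free`,
`unmap_in_signaling_path`, `drain_deferred`, `alloc_block`) are made up for
illustration, and a thread-safe `collections.deque` stands in for whatever
lock-free list a real driver would use.

```python
import threading
from collections import deque

buddy_lock = threading.Lock()    # stands in for the buddy allocator lock
free_blocks = set()              # blocks currently owned by the allocator
deferred_free = deque()          # blocks waiting to be returned


def unmap_in_signaling_path(block):
    # No buddy_lock here: deque.append is thread-safe, so the fence
    # signaling path never has to wait on the allocator lock.
    deferred_free.append(block)


def drain_deferred():
    # Runs outside the signaling path, e.g. from a worker or cleanup stage.
    drained = 0
    with buddy_lock:
        while deferred_free:
            free_blocks.add(deferred_free.popleft())
            drained += 1
    return drained


def alloc_block():
    # Allocation has to know about the deferred list: drain it first so
    # blocks parked there are not invisible when memory is tight.
    drain_deferred()
    with buddy_lock:
        return free_blocks.pop() if free_blocks else None


unmap_in_signaling_path(0x2000)
unmap_in_signaling_path(0x3000)
```

Calling `drain_deferred()` from `alloc_block()` is one way to make the
allocator aware of the list; a real implementation would also need a policy
for when the list grows large.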
thanks,
--
John Hubbard
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: Joel Fernandes @ 2026-01-30 1:59 UTC
To: John Hubbard
Cc: Danilo Krummrich, Zhi Wang, linux-kernel, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Jonathan Corbet, Alex Deucher, Christian Koenig, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Rui Huang,
Matthew Auld, Matthew Brost, Lucas De Marchi, Thomas Hellstrom,
Helge Deller, Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Bjorn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Alexey Ivanov,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, Daniel Almeida,
nouveau, dri-devel, rust-for-linux, linux-doc, amd-gfx, intel-gfx,
intel-xe, linux-fbdev
In-Reply-To: <97af2d85-a905-44d4-951f-e56a40f4312e@nvidia.com>
On 1/29/26 8:12 PM, John Hubbard wrote:
> On 1/29/26 4:26 PM, Joel Fernandes wrote:
>> Based on the below discussion and research, I came up with some deadlock
>> scenarios that we need to handle in the v6 series of these patches.
>> [...]
>> memory allocations under locks that we need in the dma-fence signaling
>> critical path (when doing the virtual memory map/unmap)
>
> unmap? Are you seeing any allocations happening during unmap? I don't
> immediately see any, but that sounds surprising.
Not allocations but we are acquiring locks during unmap. My understanding
is (at least some) unmaps have to also be done in the dma fence signaling
critical path (the run stage), but Danilo/you can correct me if I am wrong
on that. We cannot avoid all locking but those same locks cannot be held in
any other paths which do a memory allocation (as mentioned in one of the
deadlock scenarios), that is probably the main thing to check for unmap.
Thanks,
--
Joel Fernandes
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: Joel Fernandes @ 2026-01-30 1:45 UTC
To: Gary Guo
Cc: Danilo Krummrich, Zhi Wang, linux-kernel, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Jonathan Corbet, Alex Deucher, Christian Koenig, Jani Nikula,
Joonas Lahtinen, Vivi Rodrigo, Tvrtko Ursulin, Rui Huang,
Matthew Auld, Matthew Brost, Lucas De Marchi, Thomas Hellstrom,
Helge Deller, Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Bjorn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross,
John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Alexey Ivanov,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, Daniel Almeida,
nouveau, dri-devel, rust-for-linux, linux-doc, amd-gfx, intel-gfx,
intel-xe, linux-fbdev, Gary Guo
In-Reply-To: <DG1IZ8T0FFM2.2WTUZ3AESF9RD@garyguo.net>
> On Jan 29, 2026, at 8:16 PM, Gary Guo <gary@garyguo.net> wrote:
>
> On Fri Jan 30, 2026 at 12:26 AM GMT, Joel Fernandes wrote:
>> Hi, Danilo, all,
>>
>> Based on the below discussion and research, I came up with some deadlock
>> scenarios that we need to handle in the v6 series of these patches. Please let
>> me know if I missed something below. At the moment, off the top I identified
>> that we are doing GFP_KERNEL memory allocations inside GPU buddy allocator
>> during map/unmap. I will work on solutions for that. Thanks.
>>
>> All deadlock scenarios
>> ----------------------
>> The gist is, in the DMA fence signaling critical path we cannot acquire
>> resources (locks or memory allocation etc) that are already acquired when a
>> fence is being waited on to be signaled. So we have to be careful which resources
>> we acquire, and also we need to be careful which paths in the driver we do any
>> memory allocations under locks that we need in the dma-fence signaling critical
>> path (when doing the virtual memory map/unmap)
>
> When thinking about deadlocks it usually helps if you think without detailed
> scenarios (which would be hard to enumerate and easy to miss), but rather in
> terms of relative order of resource acquisition. All resources that you wait on
> would need to form a partial order. Any violation could result in deadlocks.
> This is also how lockdep checks.
>
> So to me, the cases you listed are all the same...
Hmm, I am quite familiar with lockdep internals, but I don't see how all the
cases are the same when different resources are being acquired (locks versus
memory allocation, for instance). I think it helps to visualize the different
cases as separate scenarios for a complete understanding of the issues, and mild
repetition is a good thing IMO; the goal is to not miss anything. But agreed,
that is how lockdep works: it just needs those relationships in its graph to
know the ordering well enough to flag issues. Speaking of lockdep, I have not
checked, but we should probably add support for fence signal/wait and resource
dependencies to catch any potential issues as well.
Thanks for taking a look,
--
Joel Fernandes
>
> Best,
> Gary
>
>>
>> 1. deadlock scenario 1: allocator deadlock (no locking needed to trigger it)
>>
>> Fence Signal start (A) -> Alloc -> MMU notifier/Shrinker (B) -> Fence Wait (A)
>>
>> ABA deadlock.
>>
>> 2. deadlock scenario 2: Same as 1, but ABBA scenario (2 CPUs).
>>
>> CPU 0: Fence Signal start (A) -> Alloc (B)
>>
>> CPU 1: Alloc -> MMU notifier or Shrinker (B) -> Fence Wait (A)
>>
>> 3. deadlock scenario 3: Same ABBA deadlock as 2, but with a lock instead of an allocation.
>>
>> CPU 0: Fence Signal start (A) -> Lock (B)
>>
>> CPU 1: Lock (B) -> Fence Wait (A)
>>
>> 4. deadlock scenario 4: Same as scenario 3, but the fence wait comes from
>> allocation path.
>>
>> rule: We cannot try to acquire locks in the DMA fence signaling critical path if
>> those locks were already acquired in paths that do reclaimable memory allocations.
>>
>> CPU 0: Fence Signal (A) -> Lock (B)
>>
>> CPU 1: Lock (B) -> Alloc -> Fence Wait (A)
>>
>> 5. deadlock scenario 5: Transitive locking:
>>
>> rule: We cannot try to acquire locks in the DMA fence signaling critical path
>> that are transitively waiting on the same DMA fence.
>>
>> Fence Signal (A) -> Lock (B)
>>
>> Lock (B) -> Lock (C)
>>
>> Lock (C) -> Alloc -> Fence Wait (A)
>>
>> ABBCCA deadlock.
>>
>>
>> --
>> Joel Fernandes
>>
>>> On 1/28/2026 7:04 AM, Danilo Krummrich wrote:
>>> On Fri Jan 23, 2026 at 12:16 AM CET, Joel Fernandes wrote:
>>>> My plan is to make TLB and PRAMIN use immutable references in their function
>>>> calls and then implement internal locking. I've already done this for the GPU
>>>> buddy functions, so it should be doable, and we'll keep it consistent. As a
>>>> result, we will have finer-grain locking on the memory management objects
>>>> instead of requiring to globally lock a common GpuMm object. I'll plan on
>>>> doing this for v7.
>>>>
>>>> Also, the PTE allocation race you mentioned is already handled by PRAMIN
>>>> serialization. Since threads must hold the PRAMIN lock to write page table
>>>> entries, concurrent writers are not possible:
>>>>
>>>> Thread A: acquire PRAMIN lock
>>>> Thread A: read PDE (via PRAMIN) -> NULL
>>>> Thread A: alloc PT page, write PDE
>>>> Thread A: release PRAMIN lock
>>>>
>>>> Thread B: acquire PRAMIN lock
>>>> Thread B: read PDE (via PRAMIN) -> sees A's pointer
>>>> Thread B: uses existing PT page, no allocation needed
>>>
>>> This won't work unfortunately.
>>>
>>> We have to separate allocations and modifications of the page table. Or in other
>>> words, we must not allocate new PDEs or PTEs while holding the lock protecting
>>> the page table from modifications.
>>>
>>> Once we have VM_BIND in nova-drm, we will have the situation that userspace
>>> passes jobs to modify the GPU's virtual address space and hence the page tables.
>>>
>>> Such a job has mainly three stages.
>>>
>>> (1) The submit stage.
>>>
>>> This is where the job is initialized, dependencies are set up and the
>>> driver has to pre-allocate all kinds of structures that are required
>>> throughout the subsequent stages of the job.
>>>
>>> (2) The run stage.
>>>
>>> This is the stage where the job is staged for execution and its DMA fence
>>> has been made public (i.e. it is accessible by userspace).
>>>
>>> This is the stage where we are in the DMA fence signalling critical
>>> section, hence we can't do any non-atomic allocations, since otherwise we
>>> could deadlock in MMU notifier callbacks for instance.
>>>
>>> This is the stage where the page table is actually modified. Hence, we
>>> can't acquire any locks that might be held elsewhere while doing
>>> non-atomic allocations. Also note that this is transitive, e.g. if you
>>> take lock A and somewhere else a lock B is taken while A is already held
>>> and we do non-atomic allocations while holding B, then A can't be held in
>>> the DMA fence signalling critical path either.
>>>
>>> It is also worth noting that this is the stage where we know the exact
>>> operations we have to execute based on the VM_BIND request from userspace.
>>>
>>> For instance, in the submit stage we may only know that userspace wants
>>> us to map a BO with a certain offset in the GPU's virtual address space
>>> at [0x0, 0x1000000]. What we don't know is what exact operations this does
>>> require, i.e. "What do we have to unmap first?", "Are there any
>>> overlapping mappings that we have to truncate?", etc.
>>>
>>> So, we have to consider this when we pre-allocate in the submit stage.
>>>
>>> (3) The cleanup stage.
>>>
>>> This is where the job has been signaled and hence left the DMA fence
>>> signalling critical section.
>>>
>>> In this stage the job is cleaned up, which includes freeing data that is
>>> not required anymore, such as PTEs and PDEs.
>
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: Gary Guo @ 2026-01-30 1:16 UTC
To: Joel Fernandes, Danilo Krummrich
Cc: Zhi Wang, linux-kernel, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Bjorn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross,
John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Alexey Ivanov,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, Daniel Almeida,
nouveau, dri-devel, rust-for-linux, linux-doc, amd-gfx, intel-gfx,
intel-xe, linux-fbdev
In-Reply-To: <20e04a3e-8d7d-47bc-9299-deadf8b9e992@nvidia.com>
On Fri Jan 30, 2026 at 12:26 AM GMT, Joel Fernandes wrote:
> Hi, Danilo, all,
>
> Based on the below discussion and research, I came up with some deadlock
> scenarios that we need to handle in the v6 series of these patches. Please let
> me know if I missed something below. At the moment, off the top I identified
> that we are doing GFP_KERNEL memory allocations inside GPU buddy allocator
> during map/unmap. I will work on solutions for that. Thanks.
>
> All deadlock scenarios
> ----------------------
> The gist is, in the DMA fence signaling critical path we cannot acquire
> resources (locks or memory allocation etc) that are already acquired when a
> fence is being waited on to be signaled. So we have to be careful which resources
> we acquire, and also we need to be careful which paths in the driver we do any
> memory allocations under locks that we need in the dma-fence signaling critical
> path (when doing the virtual memory map/unmap)
When thinking about deadlocks it usually helps if you think without detailed
scenarios (which would be hard to enumerate and easy to miss), but rather in
terms of relative order of resource acquisition. All resources that you wait on
would need to form a partial order. Any violation could result in deadlocks.
This is also how lockdep checks.
So to me, the cases you listed are all the same...
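To make the ordering argument concrete, here is a small Python sketch (an
illustration only, not how lockdep is implemented). It records "Y is waited on
while X is held" as a directed edge and looks for a cycle. `find_cycle` and
the node names are hypothetical; the edges are taken from scenarios 3 and 5
in the quoted mail, with the fence treated as just another resource.

```python
def find_cycle(edges):
    """Return one cycle in the 'held-before' graph as a list, or None."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, []).append(wanted)
        graph.setdefault(wanted, [])

    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:      # back edge: the ordering is violated
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Each edge (X, Y) means "Y is waited on while X is held".
scenario_3 = [("fence", "lock_b"), ("lock_b", "fence")]
scenario_5 = [("fence", "lock_b"), ("lock_b", "lock_c"),
              ("lock_c", "alloc"), ("alloc", "fence")]
safe_order = [("lock_b", "lock_c"), ("lock_c", "fence")]
```

Both scenarios come out as a cycle through `fence`, while `safe_order` has
none, which is the sense in which the cases reduce to one ordering rule.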
Best,
Gary
>
> 1. deadlock scenario 1: allocator deadlock (no locking needed to trigger it)
>
> Fence Signal start (A) -> Alloc -> MMU notifier/Shrinker (B) -> Fence Wait (A)
>
> ABA deadlock.
>
> 2. deadlock scenario 2: Same as 1, but ABBA scenario (2 CPUs).
>
> CPU 0: Fence Signal start (A) -> Alloc (B)
>
> CPU 1: Alloc -> MMU notifier or Shrinker (B) -> Fence Wait (A)
>
> 3. deadlock scenario 3: Same ABBA deadlock as 2, but with a lock instead of an allocation.
>
> CPU 0: Fence Signal start (A) -> Lock (B)
>
> CPU 1: Lock (B) -> Fence Wait (A)
>
> 4. deadlock scenario 4: Same as scenario 3, but the fence wait comes from
> allocation path.
>
> rule: We cannot try to acquire locks in the DMA fence signaling critical path if
> those locks were already acquired in paths that do reclaimable memory allocations.
>
> CPU 0: Fence Signal (A) -> Lock (B)
>
> CPU 1: Lock (B) -> Alloc -> Fence Wait (A)
>
> 5. deadlock scenario 5: Transitive locking:
>
> rule: We cannot try to acquire locks in the DMA fence signaling critical path
> that are transitively waiting on the same DMA fence.
>
> Fence Signal (A) -> Lock (B)
>
> Lock (B) -> Lock (C)
>
> Lock (C) -> Alloc -> Fence Wait (A)
>
> ABBCCA deadlock.
>
>
> --
> Joel Fernandes
>
> On 1/28/2026 7:04 AM, Danilo Krummrich wrote:
>> On Fri Jan 23, 2026 at 12:16 AM CET, Joel Fernandes wrote:
>>> My plan is to make TLB and PRAMIN use immutable references in their function
>>> calls and then implement internal locking. I've already done this for the GPU
>>> buddy functions, so it should be doable, and we'll keep it consistent. As a
>>> result, we will have finer-grain locking on the memory management objects
>>> instead of requiring to globally lock a common GpuMm object. I'll plan on
>>> doing this for v7.
>>>
>>> Also, the PTE allocation race you mentioned is already handled by PRAMIN
>>> serialization. Since threads must hold the PRAMIN lock to write page table
>>> entries, concurrent writers are not possible:
>>>
>>> Thread A: acquire PRAMIN lock
>>> Thread A: read PDE (via PRAMIN) -> NULL
>>> Thread A: alloc PT page, write PDE
>>> Thread A: release PRAMIN lock
>>>
>>> Thread B: acquire PRAMIN lock
>>> Thread B: read PDE (via PRAMIN) -> sees A's pointer
>>> Thread B: uses existing PT page, no allocation needed
>>
>> This won't work unfortunately.
>>
>> We have to separate allocations and modifications of the page table. Or in other
>> words, we must not allocate new PDEs or PTEs while holding the lock protecting
>> the page table from modifications.
>>
>> Once we have VM_BIND in nova-drm, we will have the situation that userspace
>> passes jobs to modify the GPU's virtual address space and hence the page tables.
>>
>> Such a job has mainly three stages.
>>
>> (1) The submit stage.
>>
>> This is where the job is initialized, dependencies are set up and the
>> driver has to pre-allocate all kinds of structures that are required
>> throughout the subsequent stages of the job.
>>
>> (2) The run stage.
>>
>> This is the stage where the job is staged for execution and its DMA fence
>> has been made public (i.e. it is accessible by userspace).
>>
>> This is the stage where we are in the DMA fence signalling critical
>> section, hence we can't do any non-atomic allocations, since otherwise we
>> could deadlock in MMU notifier callbacks for instance.
>>
>> This is the stage where the page table is actually modified. Hence, we
>> can't acquire any locks that might be held elsewhere while doing
>> non-atomic allocations. Also note that this is transitive, e.g. if you
>> take lock A and somewhere else a lock B is taken while A is already held
>> and we do non-atomic allocations while holding B, then A can't be held in
>> the DMA fence signalling critical path either.
>>
>> It is also worth noting that this is the stage where we know the exact
>> operations we have to execute based on the VM_BIND request from userspace.
>>
>> For instance, in the submit stage we may only know that userspace wants
>> us to map a BO with a certain offset in the GPU's virtual address space
>> at [0x0, 0x1000000]. What we don't know is what exact operations this does
>> require, i.e. "What do we have to unmap first?", "Are there any
>> overlapping mappings that we have to truncate?", etc.
>>
>> So, we have to consider this when we pre-allocate in the submit stage.
>>
>> (3) The cleanup stage.
>>
>> This is where the job has been signaled and hence left the DMA fence
>> signalling critical section.
>>
>> In this stage the job is cleaned up, which includes freeing data that is
>> not required anymore, such as PTEs and PDEs.
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: John Hubbard @ 2026-01-30 1:11 UTC
To: Joel Fernandes, Danilo Krummrich
Cc: Zhi Wang, linux-kernel, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Bjorn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross,
Alistair Popple, Timur Tabi, Edwin Peer, Alexandre Courbot,
Andrea Righi, Andy Ritger, Alexey Ivanov, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, Daniel Almeida, nouveau, dri-devel,
rust-for-linux, linux-doc, amd-gfx, intel-gfx, intel-xe,
linux-fbdev
In-Reply-To: <20e04a3e-8d7d-47bc-9299-deadf8b9e992@nvidia.com>
On 1/29/26 4:26 PM, Joel Fernandes wrote:
> Hi, Danilo, all,
>
> Based on the below discussion and research, I came up with some deadlock
> scenarios that we need to handle in the v6 series of these patches. Please let
> me know if I missed something below. At the moment, off the top I identified
> that we are doing GFP_KERNEL memory allocations inside GPU buddy allocator
> during map/unmap. I will work on solutions for that. Thanks.
>
> All deadlock scenarios
> ----------------------
> The gist is, in the DMA fence signaling critical path we cannot acquire
> resources (locks or memory allocation etc) that are already acquired when a
> fence is being waited on to be signaled. So we have to be careful which resources
> we acquire, and also we need to be careful which paths in the driver we do any
> memory allocations under locks that we need in the dma-fence signaling critical
> path (when doing the virtual memory map/unmap)
unmap? Are you seeing any allocations happening during unmap? I don't
immediately see any, but that sounds surprising.
thanks,
--
John Hubbard
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: Joel Fernandes @ 2026-01-30 0:26 UTC
To: Danilo Krummrich
Cc: Zhi Wang, linux-kernel, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Bjorn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross,
John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Alexey Ivanov,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, Daniel Almeida,
nouveau, dri-devel, rust-for-linux, linux-doc, amd-gfx, intel-gfx,
intel-xe, linux-fbdev
In-Reply-To: <DG07HZN0PL87.X5MKDCVVYIRE@kernel.org>
Hi, Danilo, all,
Based on the below discussion and research, I came up with some deadlock
scenarios that we need to handle in the v6 series of these patches. Please let
me know if I missed something below. At the moment, off the top I identified
that we are doing GFP_KERNEL memory allocations inside GPU buddy allocator
during map/unmap. I will work on solutions for that. Thanks.
All deadlock scenarios
----------------------
The gist is, in the DMA fence signaling critical path we cannot acquire
resources (locks or memory allocation etc) that are already acquired when a
fence is being waited on to be signaled. So we have to be careful which resources
we acquire, and also we need to be careful which paths in the driver we do any
memory allocations under locks that we need in the dma-fence signaling critical
path (when doing the virtual memory map/unmap)
1. deadlock scenario 1: allocator deadlock (no locking needed to trigger it)
Fence Signal start (A) -> Alloc -> MMU notifier/Shrinker (B) -> Fence Wait (A)
ABA deadlock.
2. deadlock scenario 2: Same as 1, but ABBA scenario (2 CPUs).
CPU 0: Fence Signal start (A) -> Alloc (B)
CPU 1: Alloc -> MMU notifier or Shrinker (B) -> Fence Wait (A)
3. deadlock scenario 3: Same ABBA deadlock as 2, but with a lock instead of an allocation.
CPU 0: Fence Signal start (A) -> Lock (B)
CPU 1: Lock (B) -> Fence Wait (A)
4. deadlock scenario 4: Same as scenario 3, but the fence wait comes from
allocation path.
rule: We cannot try to acquire locks in the DMA fence signaling critical path if
those locks were already acquired in paths that do reclaimable memory allocations.
CPU 0: Fence Signal (A) -> Lock (B)
CPU 1: Lock (B) -> Alloc -> Fence Wait (A)
5. deadlock scenario 5: Transitive locking:
rule: We cannot try to acquire locks in the DMA fence signaling critical path
that are transitively waiting on the same DMA fence.
Fence Signal (A) -> Lock (B)
Lock (B) -> Lock (C)
Lock (C) -> Alloc -> Fence Wait (A)
ABBCCA deadlock.
--
Joel Fernandes
On 1/28/2026 7:04 AM, Danilo Krummrich wrote:
> On Fri Jan 23, 2026 at 12:16 AM CET, Joel Fernandes wrote:
>> My plan is to make TLB and PRAMIN use immutable references in their function
>> calls and then implement internal locking. I've already done this for the GPU
>> buddy functions, so it should be doable, and we'll keep it consistent. As a
>> result, we will have finer-grain locking on the memory management objects
>> instead of requiring to globally lock a common GpuMm object. I'll plan on
>> doing this for v7.
>>
>> Also, the PTE allocation race you mentioned is already handled by PRAMIN
>> serialization. Since threads must hold the PRAMIN lock to write page table
>> entries, concurrent writers are not possible:
>>
>> Thread A: acquire PRAMIN lock
>> Thread A: read PDE (via PRAMIN) -> NULL
>> Thread A: alloc PT page, write PDE
>> Thread A: release PRAMIN lock
>>
>> Thread B: acquire PRAMIN lock
>> Thread B: read PDE (via PRAMIN) -> sees A's pointer
>> Thread B: uses existing PT page, no allocation needed
>
> This won't work unfortunately.
>
> We have to separate allocations and modifications of the page table. Or in other
> words, we must not allocate new PDEs or PTEs while holding the lock protecting
> the page table from modifications.
>
> Once we have VM_BIND in nova-drm, we will have the situation that userspace
> passes jobs to modify the GPU's virtual address space and hence the page tables.
>
> Such a job has mainly three stages.
>
> (1) The submit stage.
>
> This is where the job is initialized, dependencies are set up and the
> driver has to pre-allocate all kinds of structures that are required
> throughout the subsequent stages of the job.
>
> (2) The run stage.
>
> This is the stage where the job is staged for execution and its DMA fence
> has been made public (i.e. it is accessible by userspace).
>
> This is the stage where we are in the DMA fence signalling critical
> section, hence we can't do any non-atomic allocations, since otherwise we
> could deadlock in MMU notifier callbacks for instance.
>
> This is the stage where the page table is actually modified. Hence, we
> can't acquire any locks that might be held elsewhere while doing
> non-atomic allocations. Also note that this is transitive, e.g. if you
> take lock A and somewhere else a lock B is taken while A is already held
> and we do non-atomic allocations while holding B, then A can't be held in
> the DMA fence signalling critical path either.
>
> It is also worth noting that this is the stage where we know the exact
> operations we have to execute based on the VM_BIND request from userspace.
>
> For instance, in the submit stage we may only know that userspace wants
> us to map a BO with a certain offset in the GPU's virtual address space
> at [0x0, 0x1000000]. What we don't know is what exact operations this does
> require, i.e. "What do we have to unmap first?", "Are there any
> overlapping mappings that we have to truncate?", etc.
>
> So, we have to consider this when we pre-allocate in the submit stage.
>
> (3) The cleanup stage.
>
> This is where the job has been signaled and hence left the DMA fence
> signalling critical section.
>
> In this stage the job is cleaned up, which includes freeing data that is
> not required anymore, such as PTEs and PDEs.
--
Joel Fernandes
* Re: [PATCH v7 1/4] dt-bindings: backlight: Add max25014 support
From: Rob Herring @ 2026-01-29 16:04 UTC
To: Maud Spierings
Cc: Lee Jones, Daniel Thompson, Jingoo Han, Pavel Machek,
Krzysztof Kozlowski, Conor Dooley, Helge Deller, Shawn Guo,
Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam,
Liam Girdwood, Mark Brown, dri-devel, linux-leds, devicetree,
linux-kernel, linux-fbdev, imx, linux-arm-kernel
In-Reply-To: <20260123-max25014-v7-1-15e504b9acc7@gocontroll.com>
On Fri, Jan 23, 2026 at 12:31:30PM +0100, Maud Spierings wrote:
> The Maxim MAX25014 is a 4-channel automotive grade backlight driver IC
> with integrated boost controller.
>
> Signed-off-by: Maud Spierings <maudspierings@gocontroll.com>
>
> ---
>
> In the current implementation the control registers for channel 1,
> control all channels. So only one led subnode with led-sources is
> supported right now. If at some point the driver functionality is
> expanded the bindings can be easily extended with it.
> ---
> .../bindings/leds/backlight/maxim,max25014.yaml | 91 ++++++++++++++++++++++
> MAINTAINERS | 5 ++
> 2 files changed, 96 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/leds/backlight/maxim,max25014.yaml b/Documentation/devicetree/bindings/leds/backlight/maxim,max25014.yaml
> new file mode 100644
> index 000000000000..c499e6224a8f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/leds/backlight/maxim,max25014.yaml
> @@ -0,0 +1,91 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/leds/backlight/maxim,max25014.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Maxim max25014 backlight controller
> +
> +maintainers:
> + - Maud Spierings <maudspierings@gocontroll.com>
> +
> +properties:
> + compatible:
> + enum:
> + - maxim,max25014
> +
> + reg:
> + maxItems: 1
> +
> + "#address-cells":
> + const: 1
> +
> + "#size-cells":
> + const: 0
No child nodes (with addresses), so these should be dropped. And in the
example.
With that,
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
^ permalink raw reply
* Re: [PATCH v5] fbtft: limit dirty rows based on damage range
From: Andy Shevchenko @ 2026-01-29 10:59 UTC (permalink / raw)
To: Dan Carpenter
Cc: ChanSoo Shin, andy, gregkh, dri-devel, linux-fbdev, linux-staging
In-Reply-To: <aXr-RhUXwIvMHYZI@stanley.mountain>
On Thu, Jan 29, 2026 at 09:29:26AM +0300, Dan Carpenter wrote:
> On Thu, Jan 29, 2026 at 05:39:38AM +0900, ChanSoo Shin wrote:
> > Instead of marking the entire display as dirty, calculate the start
> > and end rows based on the damage offset and length and only mark the
> > affected rows dirty. This reduces unnecessary full framebuffer updates
> > for partial writes.
> >
> > Signed-off-by: ChanSoo Shin <csshin9928@gmail.com>
> > ---
>
> TL/DR: I suck as a reviewer so I would be nervous to apply this
> without testing. Andy is an expert here and we trust him so if he's
> okay with it then great. Or if some other expert could sign off, but
> I don't know enough to sign off myself.
The rule of thumb for _this_ driver (or set of drivers under FBTFT) is
that we are in maintenance mode and only accept bugfixes or treewide
changes. Anything else can be accepted, but that is unlikely. Either way,
we really want to see this kind of change tested on real HW. It's not as
simple as renaming variable 'i' to 'j'.
> The problem for me is how do I review something like this? Staging
> is a grab bag of different modules and I'm not an expert in any of
> the subsystems. Normally, it's easy to review staging patches
> because they are clean up work which doesn't change how the code works
> so I just look for unintentional side effects.
>
> It's trickier to review a patch like this which changes runtime. If
> it were fixing a bug, then I could verify the bug is real and say
> well, "Maybe the fix is wrong, but we were going to corrupt memory
> anyway, so the worst case is that it is as bad as before. It can't
> make the problem worse."
>
> This is your first kernel patch. You don't work for a company that
> makes the hardware. You said earlier in a private email that this
> hasn't been tested.
Unfortunately this is not the best driver for that. At some point I might
be able to test this when I set up my fbtft minilab at home; I have a few I²C,
SPI, and parallel panels.
> The patch looks reasonable to me, but it also looks simple. If it
> were that easy why didn't the original author do it?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* Re: [PATCH] staging: fbtft: use guard() to simplify code
From: Paul Retourne @ 2026-01-29 9:18 UTC (permalink / raw)
To: Andy Shevchenko
Cc: andy, gregkh, dri-devel, linux-fbdev, linux-staging, linux-kernel,
Paul Retourne
In-Reply-To: <aXqI7qbxZEulU_GO@smile.fi.intel.com>
On 1/28/26 23:08, Andy Shevchenko wrote:
> Sorry, but I don't see much value in this change.
Understood, I see the problem. I'll try to make changes that are actually
useful in the future.
Thank you for looking at it and taking the time to answer.
--
Paul Retourné
^ permalink raw reply
* Re: [PATCH v8 1/2] staging: fbtft: Fix build failure when CONFIG_FB_DEVICE=n
From: Thomas Zimmermann @ 2026-01-29 8:36 UTC (permalink / raw)
To: Chintan Patel, linux-fbdev, linux-staging, linux-omap
Cc: linux-kernel, dri-devel, andy, deller, gregkh, kernel test robot,
Andy Shevchenko
In-Reply-To: <20260122031635.11414-1-chintanlike@gmail.com>
Hi,
we've seen reports about this bug from linux-next.
Did anyone merge this series?
Best regards
Thomas
Am 22.01.26 um 04:16 schrieb Chintan Patel:
> When CONFIG_FB_DEVICE is disabled, struct fb_info does
> not provide a valid dev pointer. Direct dereferences of
> fb_info->dev therefore result in build failures.
>
> Fix this by avoiding direct accesses to fb_info->dev and
> switching the affected debug logging to framebuffer helpers
> that do not rely on a device pointer.
>
> This fixes the following build failure reported by the
> kernel test robot.
>
> Fixes: a06d03f9f238 ("staging: fbtft: Make FB_DEVICE dependency optional")
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202601110740.Y9XK5HtN-lkp@intel.com
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Signed-off-by: Chintan Patel <chintanlike@gmail.com>
>
> ---
> Changes in v8:
> - Add Reviewed-by tag from Andy Shevchenko.
>
> Changes in v7:
> - Split logging cleanups into a separate patch
> - Limit this patch to the CONFIG_FB_DEVICE=n build fix only
>
> Changes in v6:
> - Switch debug/info logging to fb_dbg() and fb_info()(suggested by Thomas Zimmermann)
> - Drop dev_of_fbinfo() usage in favor of framebuffer helpers that implicitly
> handle the debug/info context.
> - Drop __func__ usage per review feedback(suggested by greg k-h)
> - Add Fixes tag for a06d03f9f238 ("staging: fbtft: Make FB_DEVICE dependency optional")
> (suggested by Andy Shevchenko)
>
> Changes in v5:
> - Initial attempt to replace info->dev accesses using
> dev_of_fbinfo() helper
>
> drivers/staging/fbtft/fbtft-core.c | 19 +++++++++----------
> 1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/staging/fbtft/fbtft-core.c b/drivers/staging/fbtft/fbtft-core.c
> index 8a5ccc8ae0a1..1b3b62950205 100644
> --- a/drivers/staging/fbtft/fbtft-core.c
> +++ b/drivers/staging/fbtft/fbtft-core.c
> @@ -365,9 +365,9 @@ static int fbtft_fb_setcolreg(unsigned int regno, unsigned int red,
> unsigned int val;
> int ret = 1;
>
> - dev_dbg(info->dev,
> - "%s(regno=%u, red=0x%X, green=0x%X, blue=0x%X, trans=0x%X)\n",
> - __func__, regno, red, green, blue, transp);
> + fb_dbg(info,
> + "regno=%u, red=0x%X, green=0x%X, blue=0x%X, trans=0x%X\n",
> + regno, red, green, blue, transp);
>
> switch (info->fix.visual) {
> case FB_VISUAL_TRUECOLOR:
> @@ -391,8 +391,7 @@ static int fbtft_fb_blank(int blank, struct fb_info *info)
> struct fbtft_par *par = info->par;
> int ret = -EINVAL;
>
> - dev_dbg(info->dev, "%s(blank=%d)\n",
> - __func__, blank);
> + fb_dbg(info, "blank=%d\n", blank);
>
> if (!par->fbtftops.blank)
> return ret;
> @@ -793,11 +792,11 @@ int fbtft_register_framebuffer(struct fb_info *fb_info)
> if (spi)
> sprintf(text2, ", spi%d.%d at %d MHz", spi->controller->bus_num,
> spi_get_chipselect(spi, 0), spi->max_speed_hz / 1000000);
> - dev_info(fb_info->dev,
> - "%s frame buffer, %dx%d, %d KiB video memory%s, fps=%lu%s\n",
> - fb_info->fix.id, fb_info->var.xres, fb_info->var.yres,
> - fb_info->fix.smem_len >> 10, text1,
> - HZ / fb_info->fbdefio->delay, text2);
> + fb_info(fb_info,
> + "%s frame buffer, %dx%d, %d KiB video memory%s, fps=%lu%s\n",
> + fb_info->fix.id, fb_info->var.xres, fb_info->var.yres,
> + fb_info->fix.smem_len >> 10, text1,
> + HZ / fb_info->fbdefio->delay, text2);
>
> /* Turn on backlight if available */
> if (fb_info->bl_dev) {
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
^ permalink raw reply
* Re: [PATCH v2] fbdev: au1100fb: Check return value of clk_enable() in .resume()
From: Helge Deller @ 2026-01-29 7:45 UTC (permalink / raw)
To: Uwe Kleine-König, Chen Ni
Cc: dri-devel, elfring, linux-fbdev, linux-kernel
In-Reply-To: <u7a7owvizacghl3kpk5zxrf6iurmvvvjnjnzqa43xgafxcmb7x@jsihky4phvko>
On 1/29/26 08:13, Uwe Kleine-König wrote:
> On Thu, Jan 29, 2026 at 12:07:14PM +0800, Chen Ni wrote:
>> Check the return value of clk_enable() in au1100fb_drv_resume() and
>> return the error on failure.
>> This ensures the system is aware of the resume failure and can track
>> its state accurately.
>>
>> Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
>
> Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Applied to fbdev git tree.
Uwe, thanks for feedback & double-checking!
Helge
^ permalink raw reply
* Re: [PATCH v2] fbdev: au1100fb: Check return value of clk_enable() in .resume()
From: Markus Elfring @ 2026-01-29 7:36 UTC (permalink / raw)
To: Chen Ni, linux-fbdev, dri-devel; +Cc: LKML, Helge Deller, Uwe Kleine-König
In-Reply-To: <20260129040714.2772522-1-nichen@iscas.ac.cn>
> Check the return value of clk_enable() in au1100fb_drv_resume() and
> return the error on failure.
Were any source code analysis tools involved here?
> This ensures the system is aware of the resume failure and can track
Ensure?
> its state accurately.
Did anything prevent adding tags (like “Fixes” and “Cc”) accordingly?
Regards,
Markus
^ permalink raw reply
* Re: [PATCH v2] fbdev: au1100fb: Check return value of clk_enable() in .resume()
From: Uwe Kleine-König @ 2026-01-29 7:13 UTC (permalink / raw)
To: Chen Ni; +Cc: deller, dri-devel, elfring, linux-fbdev, linux-kernel
In-Reply-To: <20260129040714.2772522-1-nichen@iscas.ac.cn>
On Thu, Jan 29, 2026 at 12:07:14PM +0800, Chen Ni wrote:
> Check the return value of clk_enable() in au1100fb_drv_resume() and
> return the error on failure.
> This ensures the system is aware of the resume failure and can track
> its state accurately.
>
> Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Thanks
Uwe
^ permalink raw reply
* Re: [PATCH v5] fbtft: limit dirty rows based on damage range
From: Dan Carpenter @ 2026-01-29 6:29 UTC (permalink / raw)
To: ChanSoo Shin, andy; +Cc: gregkh, dri-devel, linux-fbdev, linux-staging
In-Reply-To: <20260128203938.962414-1-csshin9928@gmail.com>
On Thu, Jan 29, 2026 at 05:39:38AM +0900, ChanSoo Shin wrote:
> Instead of marking the entire display as dirty, calculate the start
> and end rows based on the damage offset and length and only mark the
> affected rows dirty. This reduces unnecessary full framebuffer updates
> for partial writes.
>
> Signed-off-by: ChanSoo Shin <csshin9928@gmail.com>
> ---
TL/DR: I suck as a reviewer so I would be nervous to apply this
without testing. Andy is an expert here and we trust him so if he's
okay with it then great. Or if some other expert could sign off, but
I don't know enough to sign off myself.
The problem for me is how do I review something like this? Staging
is a grab bag of different modules and I'm not an expert in any of
the subsystems. Normally, it's easy to review staging patches
because they are clean up work which doesn't change how the code works
so I just look for unintentional side effects.
It's trickier to review a patch like this which changes runtime. If
it were fixing a bug, then I could verify the bug is real and say
well, "Maybe the fix is wrong, but we were going to corrupt memory
anyway, so the worst case is that it is as bad as before. It can't
make the problem worse."
This is your first kernel patch. You don't work for a company that
makes the hardware. You said earlier in a private email that this
hasn't been tested.
The patch looks reasonable to me, but it also looks simple. If it
were that easy why didn't the original author do it?
regards,
dan carpenter
^ permalink raw reply
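For readers following the review, the arithmetic that the commit message describes (turning a damage offset and length into a row range) can be sketched as below. This is not the code from the patch, which is not shown in this thread; `damage_to_rows()` and `struct row_range` are hypothetical names, and the sketch assumes a linear framebuffer where each row occupies `line_length` bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of rows touched by a write (hypothetical helper type). */
struct row_range {
	unsigned int start;
	unsigned int end;
};

/*
 * Map a byte range [offset, offset + len) of a linear framebuffer onto the
 * rows it touches. Returns 0 and fills *out on success, or -1 when the
 * range is empty or lies entirely outside the visible rows.
 */
int damage_to_rows(size_t offset, size_t len, size_t line_length,
		   unsigned int yres, struct row_range *out)
{
	size_t first, last, last_byte;

	if (len == 0 || line_length == 0 || yres == 0)
		return -1;

	first = offset / line_length;
	if (first >= yres)
		return -1;

	/* Saturate instead of wrapping if offset + len - 1 would overflow. */
	last_byte = (len - 1 > SIZE_MAX - offset) ? SIZE_MAX : offset + len - 1;
	last = last_byte / line_length;
	if (last >= yres)
		last = yres - 1;	/* clamp to the last visible row */

	out->start = (unsigned int)first;
	out->end = (unsigned int)last;
	return 0;
}
```

For a 64-row display with 256 bytes per row, a 2-byte write at offset 255 straddles a row boundary and yields rows 0 to 1. Whether partial updates are actually safe on a given panel still has to be confirmed on real hardware, as the reviewers point out.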
* Re: [PATCH v2 1/2] dt-bindings: backlight: gpio-backlight: allow multiple GPIOs
From: tessolveupstream @ 2026-01-29 5:41 UTC (permalink / raw)
To: Daniel Thompson, Krzysztof Kozlowski
Cc: lee, danielt, jingoohan1, deller, pavel, robh, krzk+dt, conor+dt,
dri-devel, linux-fbdev, linux-leds, devicetree, linux-kernel
In-Reply-To: <aXnxGPNtk5BwoJOu@aspen.lan>
On 28-01-2026 16:50, Daniel Thompson wrote:
> On Wed, Jan 28, 2026 at 11:11:33AM +0100, Krzysztof Kozlowski wrote:
>> On 23/01/2026 12:11, tessolveupstream@gmail.com wrote:
>>>
>>>
>>> On 20-01-2026 20:01, Krzysztof Kozlowski wrote:
>>>> On 20/01/2026 13:50, Sudarshan Shetty wrote:
>>>>> Update the gpio-backlight binding to support configurations that require
>>>>> more than one GPIO for enabling/disabling the backlight.
>>>>
>>>>
>>>> Why? Which devices need it? How would a backlight have three enable
>>>> GPIOs? I really do not believe it, so you need to write a proper
>>>> hardware justification.
>>>>
>>>
>>> To clarify our hardware setup:
>>> the panel requires one GPIO for the backlight enable signal, and it
>>> also has a PWM input. Since the QCS615 does not provide a PWM controller
>>> for this use case, the PWM input is connected to a GPIO that is driven
>>> high to provide a constant 100% duty cycle, as explained in the link
>>> below.
>>> https://lore.kernel.org/all/20251028061636.724667-1-tessolveupstream@gmail.com/T/#m93ca4e5c7bf055715ed13316d91f0cd544244cf5
>>
>> That's not an enable gpio, but PWM.
>>
>> You write bindings for this device, not for something else - like your
>> board.
>
> Sudarshan: I believe at one point the intent was to model this hardware
> as a pwm-backlight (using enable GPIOs to drive the enable pin)
> attached to a pwm-gpio (to drive the PWM pin). Did this approach work?
>
Yes, the original plan was to model this using pwm-gpio, and that
setup worked. But on the SoC there's no actual PWM controller available
for this path: the LED_PWM line is just tied to a GPIO that's driven
high (effectively a fixed 100% duty cycle). Because of that, describing
it as a PWM in DT was flagged as incorrect.
As pointed out during the SoC DTS review, the correct path forward is
to extend gpio-backlight to handle multiple GPIOs, rather than
representing them as multiple separate backlight devices.
>
> Daniel.
^ permalink raw reply
* Re: [PATCH] staging: fbtft: use guard() to simplify code
From: Greg KH @ 2026-01-29 4:38 UTC (permalink / raw)
To: Paul Retourné
Cc: andy, dri-devel, linux-fbdev, linux-staging, linux-kernel
In-Reply-To: <20260128212644.1170970-1-paul.retourne@orange.fr>
On Wed, Jan 28, 2026 at 10:26:42PM +0100, Paul Retourné wrote:
> Use guard() to simplify mutex locking. No functional change.
It's best to use guard() for new code, not touching existing code as:
> 3 files changed, 8 insertions(+), 8 deletions(-)
This made no change overall at all :(
thanks,
greg k-h
^ permalink raw reply
* [PATCH v2] fbdev: au1100fb: Check return value of clk_enable() in .resume()
From: Chen Ni @ 2026-01-29 4:07 UTC (permalink / raw)
To: u.kleine-koenig
Cc: deller, dri-devel, elfring, linux-fbdev, linux-kernel, Chen Ni
In-Reply-To: <zytpnyodschvn4mmpllxp62yg3o77hjl7l5nyckoxyuvucjyaj@xsxbybnyzd44>
Check the return value of clk_enable() in au1100fb_drv_resume() and
return the error on failure.
This ensures the system is aware of the resume failure and can track
its state accurately.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
---
Changes in v2:
- Update commit message
- Clean up extraneous whitespace in the code
---
drivers/video/fbdev/au1100fb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/au1100fb.c b/drivers/video/fbdev/au1100fb.c
index 6251a6b07b3a..feaa1061c436 100644
--- a/drivers/video/fbdev/au1100fb.c
+++ b/drivers/video/fbdev/au1100fb.c
@@ -567,13 +567,16 @@ int au1100fb_drv_suspend(struct platform_device *dev, pm_message_t state)
int au1100fb_drv_resume(struct platform_device *dev)
{
struct au1100fb_device *fbdev = platform_get_drvdata(dev);
+ int ret;
if (!fbdev)
return 0;
memcpy(fbdev->regs, &fbregs, sizeof(struct au1100fb_regs));
- clk_enable(fbdev->lcdclk);
+ ret = clk_enable(fbdev->lcdclk);
+ if (ret)
+ return ret;
/* Unblank the LCD */
au1100fb_fb_blank(VESA_NO_BLANKING, &fbdev->info);
--
2.25.1
^ permalink raw reply related
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: Joel Fernandes @ 2026-01-29 1:49 UTC (permalink / raw)
To: John Hubbard, Danilo Krummrich
Cc: Zhi Wang, linux-kernel, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Bjorn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross,
Alistair Popple, Timur Tabi, Edwin Peer, Alexandre Courbot,
Andrea Righi, Andy Ritger, Alexey Ivanov, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, Daniel Almeida, nouveau, dri-devel,
rust-for-linux, linux-doc, amd-gfx, intel-gfx, intel-xe,
linux-fbdev
In-Reply-To: <bd6fcda9-0d76-4208-b6c1-8df6f9f4616e@nvidia.com>
On 1/28/2026 8:02 PM, John Hubbard wrote:
> On 1/28/26 4:09 PM, Danilo Krummrich wrote:
>> On Wed Jan 28, 2026 at 4:27 PM CET, Joel Fernandes wrote:
>>> I will go over these concerns, just to clarify - do you mean forbidding
>>> *any* lock or do you mean only forbidding non-atomic locks? I believe we
>>> can avoid non-atomic locks completely - actually I just wrote a patch
>>> before I read this email to do just that. If we are to forbid any locking at
>>> all, that might require some careful redesign to handle the above race
>>> afaics.
>>
>> It's not about the locks themselves, sleeping locks are fine too. It's about
>> holding locks that are held elsewhere when doing memory allocations that can
>> call back into MMU notifiers or the shrinker.
>
> If you look at core kernel mm, you'll find a similar constraint: avoid
> holding any locks while allocating--unless you are in the reclaim code
> itself.
>
> Especially when dealing with page tables.
>
> So this is looking familiar to me and I agree with the constraint, fwiw.
Right, so similar to core kernel mm, we need to separate PT allocation from the
lock needed for PT writing. Essentially never allocating PT pages in the
dma-fence critical paths. We already have separate locks for both these (buddy
versus vmm), so it should be doable with some adjustments. I will study the
dma-fence further and revise patches. Thanks.
--
Joel Fernandes
^ permalink raw reply
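The separation of page-table allocation from page-table writing that this message describes can be sketched in plain userspace C. This is a minimal illustration, not nova-core code: `struct pt_pool` and the `pt_pool_*()` helpers are hypothetical names, `calloc()` stands in for an allocation that may sleep or trigger reclaim, and the VMM lock itself is left out so the sketch stays self-contained.

```c
#include <stddef.h>
#include <stdlib.h>

#define PT_ENTRIES  512
#define PT_POOL_MAX 8

struct page_table {
	unsigned long entries[PT_ENTRIES];
};

/* Page tables reserved up front for one job (hypothetical structure). */
struct pt_pool {
	struct page_table *pages[PT_POOL_MAX];
	size_t count;
};

/*
 * Submit stage: no page-table lock is held and we are outside the fence
 * signalling critical path, so a sleeping allocation is acceptable here.
 * Reserve the worst-case number of page tables the job might need.
 */
int pt_pool_fill(struct pt_pool *pool, size_t needed)
{
	if (needed > PT_POOL_MAX)
		return -1;

	while (pool->count < needed) {
		struct page_table *pt = calloc(1, sizeof(*pt));

		if (!pt)
			return -1;
		pool->pages[pool->count++] = pt;
	}
	return 0;
}

/*
 * Run stage: called with the page-table lock held. It only hands out
 * memory reserved earlier and never allocates.
 */
struct page_table *pt_pool_take(struct pt_pool *pool)
{
	if (pool->count == 0)
		return NULL;
	return pool->pages[--pool->count];
}

/* Cleanup stage: the fence has signalled, so unused pages can be freed. */
void pt_pool_release(struct pt_pool *pool)
{
	while (pool->count)
		free(pool->pages[--pool->count]);
}
```

The cost of this design is over-allocation: the submit stage has to reserve for the worst case because the exact operations are only known in the run stage.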
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: Joel Fernandes @ 2026-01-29 1:28 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Zhi Wang, linux-kernel, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Bjorn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross,
John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Alexey Ivanov,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, Daniel Almeida,
nouveau, dri-devel, rust-for-linux, linux-doc, amd-gfx, intel-gfx,
intel-xe, linux-fbdev
In-Reply-To: <DG0MXC1R8IRS.Y5X6XDUBOGR5@kernel.org>
Hi Danilo,
On 1/28/2026 7:09 PM, Danilo Krummrich wrote:
> On Wed Jan 28, 2026 at 4:27 PM CET, Joel Fernandes wrote:
>> I will go over these concerns, just to clarify - do you mean forbidding
>> *any* lock or do you mean only forbidding non-atomic locks? I believe we
>> can avoid non-atomic locks completely - actually I just wrote a patch
>> before I read this email to do just that. If we are to forbid any locking at
>> all, that might require some careful redesign to handle the above race
>> afaics.
>
> It's not about the locks themselves, sleeping locks are fine too.
Ah, so in your last email when you said "non-atomic", you meant an allocation
that can cause memory reclaim etc., right? I got confused by "non-atomic" because
I thought you were referring to acquiring a sleeping lock in a non-atomic
context (I also work on CPU scheduling/RCU, so the word atomic sometimes means
different things to me - my fault not yours :P).
I believe we may have to use "try lock" on a mutex if we have to use these in the
future, in a path that cannot wait (such as a page fault handler), but yes I
agree with you we can use mutexes for these, with a combination of try_lock +
bottom half deferrals. See the additional comment in [1].
Coming to the dma-fence deadlocks you mention, this sounds very similar to my
experiences with reclaim-deadlocks when I worked on the Ashmem Android driver.
Deja-vu :-D. The issue there was the memory shrinker would take a lock in the
ashmem driver during reclaim, which is a disaster if the lock was already held
and a memory allocation request triggered reclaim. I believe the DMA fence
usecase is also similar based on your description.
> It's about
> holding locks that are held elsewhere when doing memory allocations that can
> call back into MMU notifiers or the shrinker.
>
> I.e. if in the fence signalling critical path you wait for a mutex that is held
> elsewhere while allocating memory and the memory allocation calls back into the
> shrinker, you may end up waiting for your own DMA fence to be signaled, which
> causes a deadlock.
Got it, I will spend the next day or so studying the DMA fence architecture but I
mostly got the idea now. We need to be careful with reclaim locking as you
stressed. I will analyze all the requirements to properly address this. I will
reach out if I have any questions. Thanks for sharing your knowledge on this!
--
Joel Fernandes
[1]
I can confirm for completeness, that both Nouveau and OpenRM use mutexes for
PT/VMM related locking. In interrupt contexts, OpenRM does a "try lock" AFAICS
on its mutex. This is similar to how Linux kernel mm page fault handling
acquires mmap_sem (via try-locking).
The Linux kernel does have per-PT spinlocks to handle the "two paths try to
install a PDE/PTE" race, but I don't think we need that at the moment for our
use cases as we can keep it simple and rely on the VMM mutex. We can perhaps add
that later if needed (or use finer-grained block-level locking), but let
me know if anyone disagrees with that.
^ permalink raw reply
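The "try lock plus bottom-half deferral" pattern mentioned in this message can be sketched as follows. This is a userspace illustration under loud assumptions: a C11 `atomic_flag` stands in for the VMM mutex, a counter stands in for a real work queue, and `struct vmm` and all `vmm_*()` functions are hypothetical names rather than Nouveau, OpenRM, or nova-core APIs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy VMM state; the flag is a stand-in for a mutex (illustration only). */
struct vmm {
	atomic_flag busy;
	atomic_uint deferred;	/* requests handed to the bottom half */
	unsigned int handled;	/* protected by the lock */
};

void vmm_init(struct vmm *v)
{
	atomic_flag_clear(&v->busy);
	atomic_init(&v->deferred, 0);
	v->handled = 0;
}

bool vmm_trylock(struct vmm *v)
{
	return !atomic_flag_test_and_set_explicit(&v->busy, memory_order_acquire);
}

void vmm_unlock(struct vmm *v)
{
	atomic_flag_clear_explicit(&v->busy, memory_order_release);
}

/*
 * Path that cannot wait (for example a fault or interrupt handler): either
 * take the lock immediately or hand the request to the bottom half.
 */
bool vmm_handle_fault_nowait(struct vmm *v)
{
	if (!vmm_trylock(v)) {
		atomic_fetch_add(&v->deferred, 1);
		return false;
	}
	v->handled++;
	vmm_unlock(v);
	return true;
}

/*
 * Bottom half: runs later in a context that is allowed to wait. The spin
 * loop stands in for a sleeping mutex_lock() in real code.
 */
unsigned int vmm_run_deferred(struct vmm *v)
{
	unsigned int n;

	while (!vmm_trylock(v))
		;
	n = atomic_exchange(&v->deferred, 0);
	v->handled += n;
	vmm_unlock(v);
	return n;
}
```

The non-waiting path never blocks on the lock, so it cannot inherit whatever the current lock holder is waiting for; the deferred work absorbs that wait instead.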
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: John Hubbard @ 2026-01-29 1:02 UTC (permalink / raw)
To: Danilo Krummrich, Joel Fernandes
Cc: Zhi Wang, linux-kernel, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Bjorn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross,
Alistair Popple, Timur Tabi, Edwin Peer, Alexandre Courbot,
Andrea Righi, Andy Ritger, Alexey Ivanov, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, Daniel Almeida, nouveau, dri-devel,
rust-for-linux, linux-doc, amd-gfx, intel-gfx, intel-xe,
linux-fbdev
In-Reply-To: <DG0MXC1R8IRS.Y5X6XDUBOGR5@kernel.org>
On 1/28/26 4:09 PM, Danilo Krummrich wrote:
> On Wed Jan 28, 2026 at 4:27 PM CET, Joel Fernandes wrote:
>> I will go over these concerns, just to clarify - do you mean forbidding
>> *any* lock or do you mean only forbidding non-atomic locks? I believe we
>> can avoid non-atomic locks completely - actually I just wrote a patch
>> before I read this email to do just that. If we are to forbid any locking at
>> all, that might require some careful redesign to handle the above race
>> afaics.
>
> It's not about the locks themselves, sleeping locks are fine too. It's about
> holding locks that are held elsewhere when doing memory allocations that can
> call back into MMU notifiers or the shrinker.
If you look at core kernel mm, you'll find a similar constraint: avoid
holding any locks while allocating--unless you are in the reclaim code
itself.
Especially when dealing with page tables.
So this is looking familiar to me and I agree with the constraint, fwiw.
>
> I.e. if in the fence signalling critical path you wait for a mutex that is held
> elsewhere while allocating memory and the memory allocation calls back into the
> shrinker, you may end up waiting for your own DMA fence to be signaled, which
> causes a deadlock.
Right, and the list of pitfalls such as this is basically limited only
by your imagination--it's long. :)
thanks,
--
John Hubbard
^ permalink raw reply
* Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
From: Danilo Krummrich @ 2026-01-29 0:09 UTC (permalink / raw)
To: Joel Fernandes
Cc: Zhi Wang, linux-kernel, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Jonathan Corbet,
Alex Deucher, Christian Koenig, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Huang Rui, Matthew Auld,
Matthew Brost, Lucas De Marchi, Thomas Hellstrom, Helge Deller,
Alice Ryhl, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Bjorn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross,
John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Alexey Ivanov,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, Daniel Almeida,
nouveau, dri-devel, rust-for-linux, linux-doc, amd-gfx, intel-gfx,
intel-xe, linux-fbdev
In-Reply-To: <c0a3ac65-e2e5-4b62-bc75-49b1599e160f@nvidia.com>
On Wed Jan 28, 2026 at 4:27 PM CET, Joel Fernandes wrote:
> I will go over these concerns, just to clarify - do you mean forbidding
> *any* lock or do you mean only forbidding non-atomic locks? I believe we
> can avoid non-atomic locks completely - actually I just wrote a patch
> before I read this email to do just that. If we are to forbid any locking at
> all, that might require some careful redesign to handle the above race
> afaics.
It's not about the locks themselves, sleeping locks are fine too. It's about
holding locks that are held elsewhere when doing memory allocations that can
call back into MMU notifiers or the shrinker.
I.e. if in the fence signalling critical path you wait for a mutex that is held
elsewhere while allocating memory and the memory allocation calls back into the
shrinker, you may end up waiting for your own DMA fence to be signaled, which
causes a deadlock.
^ permalink raw reply
* Re: [PATCH RFC v6 00/26] nova-core: Memory management infrastructure (v6)
From: Danilo Krummrich @ 2026-01-29 0:01 UTC (permalink / raw)
To: Joel Fernandes
Cc: linux-kernel, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Simona Vetter, Jonathan Corbet, Alex Deucher,
Christian Koenig, Jani Nikula, Joonas Lahtinen, Vivi Rodrigo,
Tvrtko Ursulin, Rui Huang, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellstrom, Helge Deller, Alice Ryhl,
Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo, Bjorn Roy Baron,
Benno Lossin, Andreas Hindborg, Trevor Gross, John Hubbard,
Alistair Popple, Timur Tabi, Edwin Peer, Alexandre Courbot,
Andrea Righi, Andy Ritger, Zhi Wang, Alexey Ivanov, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, Daniel Almeida, nouveau, dri-devel,
rust-for-linux, linux-doc, amd-gfx, intel-gfx, intel-xe,
linux-fbdev
In-Reply-To: <4540DD73-77BA-45F0-B686-32EB96402717@nvidia.com>
On Wed Jan 28, 2026 at 1:44 PM CET, Joel Fernandes wrote:
> I will split into CList, GPU buddy, and Nova MM as you suggest.
Thanks, together with a proper changelog this will help a lot.
> One question: what version numbers should each split series use? CList was at
> v3 before being combined, and similar story for GPU buddy and Nova MM. Should
> I continue from the last version number they were posted with, or continue
> from v6?
I'd say from the last version is probably best. Maybe you also want to move out
of the RFC stage for some of them.
Thanks,
Danilo
^ permalink raw reply
* Re: [PATCH v5] fbtft: limit dirty rows based on damage range
From: Andy Shevchenko @ 2026-01-28 22:35 UTC (permalink / raw)
To: ChanSoo Shin; +Cc: andy, gregkh, dri-devel, linux-fbdev, linux-staging
In-Reply-To: <20260128203938.962414-1-csshin9928@gmail.com>
On Thu, Jan 29, 2026 at 05:39:38AM +0900, ChanSoo Shin wrote:
> Instead of marking the entire display as dirty, calculate the start
> and end rows based on the damage offset and length and only mark the
> affected rows dirty. This reduces unnecessary full framebuffer updates
> for partial writes.
>
> Signed-off-by: ChanSoo Shin <csshin9928@gmail.com>
> ---
This is v5, with no changelog and no answers to the reviewers' questions.
Please read everything people replied to your previous attempts.
Until that is addressed properly, NAK to this one.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* Re: [PATCH] staging: fbtft: use guard() to simplify code
From: Andy Shevchenko @ 2026-01-28 22:08 UTC (permalink / raw)
To: Paul Retourné
Cc: andy, gregkh, dri-devel, linux-fbdev, linux-staging, linux-kernel
In-Reply-To: <20260128212644.1170970-1-paul.retourne@orange.fr>
On Wed, Jan 28, 2026 at 10:26:42PM +0100, Paul Retourné wrote:
> Use guard() to simplify mutex locking. No functional change.
...
> #include <linux/init.h>
> #include <linux/gpio/consumer.h>
> #include <linux/delay.h>
> +#include <linux/cleanup.h>
Try to squeeze the new inclusion into the longest ordered chain with some
pieces left unordered. With the given context the least is to put it before
delay.h, but maybe there is a better place.
...
> if (par->gamma.curves[0] == 0) {
> - mutex_lock(&par->gamma.lock);
> + guard(mutex)(&par->gamma.lock);
> if (par->info->var.yres == 64)
> par->gamma.curves[0] = 0xCF;
> else
> par->gamma.curves[0] = 0x8F;
> - mutex_unlock(&par->gamma.lock);
> }
This has close to 0 added value. Don't do conversion just for fun.
...
> --- a/drivers/staging/fbtft/fb_ssd1306.c
> +++ b/drivers/staging/fbtft/fb_ssd1306.c
Ditto.
...
Sorry, but I don't see much value in this change.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
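As background for the guard() discussion: the kernel's guard() helpers are built on the compiler's cleanup attribute, which runs a function when a variable goes out of scope. The userspace sketch below mimics that mechanism with a toy lock so the scoping in the quoted hunk is easier to see. `struct toy_lock`, `toy_guard()` and `set_default_curve()` are hypothetical names for illustration, not kernel APIs, and the code assumes GCC or Clang.

```c
#include <stdbool.h>

/* Toy lock that only records its state (stand-in for a mutex). */
struct toy_lock {
	bool held;
	unsigned int releases;
};

static inline struct toy_lock *toy_lock_acquire(struct toy_lock *l)
{
	l->held = true;
	return l;
}

/* Called automatically when the guard variable goes out of scope. */
static inline void toy_lock_release(struct toy_lock **l)
{
	(*l)->held = false;
	(*l)->releases++;
}

/* Scope-based guard built on the GCC/Clang cleanup attribute. */
#define toy_guard(lock)                                                   \
	struct toy_lock *toy_guard_var                                    \
		__attribute__((cleanup(toy_lock_release), unused)) =      \
			toy_lock_acquire(lock)

/*
 * Mirrors the shape of the quoted hunk: the lock is held only inside the
 * if-block and is dropped at its closing brace. Returns 0 if the lock is
 * no longer held once the block has ended.
 */
int set_default_curve(struct toy_lock *lock, int *curve, int yres)
{
	if (*curve == 0) {
		toy_guard(lock);
		*curve = (yres == 64) ? 0xCF : 0x8F;
	}	/* toy_lock_release() runs here */

	return lock->held ? -1 : 0;
}
```

In a block this short with a single exit, the guard saves one line at most, which is the reviewers' point; the pattern pays off mainly in functions with several early returns.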