* Re: [PATCH 0/2] iommu: Allow passing custom allocators to pgtable drivers
2023-11-07 11:52 ` [PATCH 0/2] iommu: Allow passing custom allocators to pgtable drivers Gaurav Kohli
@ 2023-11-07 12:01 ` Gaurav Kohli
2023-11-10 9:47 ` Boris Brezillon
1 sibling, 0 replies; 3+ messages in thread
From: Gaurav Kohli @ 2023-11-07 12:01 UTC (permalink / raw)
To: Rob Clark, Steven Price, Will Deacon
Cc: Boris Brezillon, Joerg Roedel, iommu, Will Deacon, Robin Murphy,
linux-arm-kernel, linux-arm-msm
On 11/7/2023 5:22 PM, Gaurav Kohli wrote:
>
>
> On 10/24/2023 2:32 AM, Rob Clark wrote:
>> On Wed, Sep 20, 2023 at 6:12 AM Steven Price <steven.price@arm.com>
>> wrote:
>>>
>>> On 09/08/2023 13:17, Boris Brezillon wrote:
>>>> Hello,
>>>>
>>>> This patchset is an attempt at making page table allocation
>>>> customizable. This is useful to some GPU drivers for various reasons:
>>>>
>>>> - speed-up upcoming page table allocations by managing a pool of free
>>>> pages
>>>> - batch page table allocation instead of allocating one page at a time
>>>> - pre-reserve pages for page tables needed for map/unmap operations and
>>>> return the unused page tables to some pool
>>>>
>>>> The first and last reasons are particularly important for GPU drivers
>>>> wanting to implement asynchronous VM_BIND. Asynchronous VM_BIND
>>>> requires
>>>> that any page table needed for a map/unmap operation to succeed be
>>>> allocated at VM_BIND job creation time. At the time of the job
>>>> creation,
>>>> we don't know what the VM will look like when we get to execute the
>>>> map/unmap, and can't guess how many page tables we will need. Because
>>>> of that, we have to over-provision page tables for the worst case
>>>> scenario (page table tree is empty), which means we will allocate/free
>>>> a lot. Having pool a pool of free pages is crucial if we want to
>>>> speed-up VM_BIND requests.
>>>>
>>>> A real example of how such custom allocators can be used is available
>>>> here[1]. v2 of the Panthor driver is approaching submission, and I
>>>> figured I'd try to upstream the dependencies separately, which is
>>>> why I submit this series now, even though the user of this new API
>>>> will come afterwards. If you'd prefer to have those patches submitted
>>>> along with the Panthor driver, let me know.
>>>>
>>>> This approach has been discussed with Robin, and is hopefully not too
>>>> far from what he had in mind.
>>>
>>> The alternative would be to embed a cache of pages into the IOMMU
>>> framework, however kmem_cache sadly doesn't seem to support the
>>> 'reserve' of pages concept that we need. mempools could be a solution
>>> but the mempool would need to be created by the IOMMU framework as the
>>> alloc/free functions are specified when creating the pool. So it would
>>> be a much bigger change (to drivers/iommu).
>>>
>>> So, given that so far it's just Panthor this seems like the right
>>> approach for now - when/if other drivers want the same functionality
>>> then it might make sense to revisit the idea of doing the caching within
>>> the IOMMU framework.
>>
>> I have some plans to use this as well for drm/msm.. but the reasons
>> and requirements are basically the same as for panthor. I think I
>> prefer the custom allocator approach, rather than tying this to IOMMU
>> framework. (But ofc custom allocators, I guess, does not prevent the
>> iommu driver from doing it's own caching.)
>>
>> BR,
>> -R
>>
>
> We have also posted one RFC patch series which is based on this current
> patches by Boris and helping us to define our custom alloc and free
> pgtable call. For our side usecase we have a requirement to create
> pgtable from HLOS and then share it to different entity(VMID) and
> basically that also requires few smc calls and for that we need
> custom alloc/free callbacks.
>
> https://lore.kernel.org/all/20231101071144.16309-1-quic_gkohli@quicinc.com/
>
>
> So custom allocator and free ops is helping for us also. Is there any
> plan to merge these patches from Boris.
>
>
>
>
Please use below tested by tag, if you are planning to merge patches by
Boris:
Tested-by: Gaurav Kohli <quic_gkohli@quicinc.com>
>>> Robin: Does this approach sound sensible?
>>>
>>> FWIW:
>>>
>>> Reviewed-by: Steven Price <steven.price@arm.com>
>>>
>>> Steve
>>>
>>>> Regards,
>>>>
>>>> Boris
>>>>
>>>> [1]https://gitlab.freedesktop.org/panfrost/linux/-/blob/panthor/drivers/gpu/drm/panthor/panthor_mmu.c#L441
>>>>
>>>> Boris Brezillon (2):
>>>> iommu: Allow passing custom allocators to pgtable drivers
>>>> iommu: Extend LPAE page table format to support custom allocators
>>>>
>>>> drivers/iommu/io-pgtable-arm.c | 50
>>>> +++++++++++++++++++++++-----------
>>>> drivers/iommu/io-pgtable.c | 31 +++++++++++++++++++++
>>>> include/linux/io-pgtable.h | 21 ++++++++++++++
>>>> 3 files changed, 86 insertions(+), 16 deletions(-)
>>>>
>>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH 0/2] iommu: Allow passing custom allocators to pgtable drivers
2023-11-07 11:52 ` [PATCH 0/2] iommu: Allow passing custom allocators to pgtable drivers Gaurav Kohli
2023-11-07 12:01 ` Gaurav Kohli
@ 2023-11-10 9:47 ` Boris Brezillon
1 sibling, 0 replies; 3+ messages in thread
From: Boris Brezillon @ 2023-11-10 9:47 UTC (permalink / raw)
To: Gaurav Kohli
Cc: Rob Clark, Steven Price, Joerg Roedel, iommu, Will Deacon,
Robin Murphy, linux-arm-kernel, linux-arm-msm
Hi Gaurav,
On Tue, 7 Nov 2023 17:22:39 +0530
Gaurav Kohli <quic_gkohli@quicinc.com> wrote:
> On 10/24/2023 2:32 AM, Rob Clark wrote:
> > On Wed, Sep 20, 2023 at 6:12 AM Steven Price <steven.price@arm.com> wrote:
> >>
> >> On 09/08/2023 13:17, Boris Brezillon wrote:
> >>> Hello,
> >>>
> >>> This patchset is an attempt at making page table allocation
> >>> customizable. This is useful to some GPU drivers for various reasons:
> >>>
> >>> - speed-up upcoming page table allocations by managing a pool of free
> >>> pages
> >>> - batch page table allocation instead of allocating one page at a time
> >>> - pre-reserve pages for page tables needed for map/unmap operations and
> >>> return the unused page tables to some pool
> >>>
> >>> The first and last reasons are particularly important for GPU drivers
> >>> wanting to implement asynchronous VM_BIND. Asynchronous VM_BIND requires
> >>> that any page table needed for a map/unmap operation to succeed be
> >>> allocated at VM_BIND job creation time. At the time of the job creation,
> >>> we don't know what the VM will look like when we get to execute the
> >>> map/unmap, and can't guess how many page tables we will need. Because
> >>> of that, we have to over-provision page tables for the worst case
> >>> scenario (page table tree is empty), which means we will allocate/free
> >>> a lot. Having pool a pool of free pages is crucial if we want to
> >>> speed-up VM_BIND requests.
> >>>
> >>> A real example of how such custom allocators can be used is available
> >>> here[1]. v2 of the Panthor driver is approaching submission, and I
> >>> figured I'd try to upstream the dependencies separately, which is
> >>> why I submit this series now, even though the user of this new API
> >>> will come afterwards. If you'd prefer to have those patches submitted
> >>> along with the Panthor driver, let me know.
> >>>
> >>> This approach has been discussed with Robin, and is hopefully not too
> >>> far from what he had in mind.
> >>
> >> The alternative would be to embed a cache of pages into the IOMMU
> >> framework, however kmem_cache sadly doesn't seem to support the
> >> 'reserve' of pages concept that we need. mempools could be a solution
> >> but the mempool would need to be created by the IOMMU framework as the
> >> alloc/free functions are specified when creating the pool. So it would
> >> be a much bigger change (to drivers/iommu).
> >>
> >> So, given that so far it's just Panthor this seems like the right
> >> approach for now - when/if other drivers want the same functionality
> >> then it might make sense to revisit the idea of doing the caching within
> >> the IOMMU framework.
> >
> > I have some plans to use this as well for drm/msm.. but the reasons
> > and requirements are basically the same as for panthor. I think I
> > prefer the custom allocator approach, rather than tying this to IOMMU
> > framework. (But ofc custom allocators, I guess, does not prevent the
> > iommu driver from doing it's own caching.)
> >
> > BR,
> > -R
> >
>
> We have also posted one RFC patch series which is based on this current
> patches by Boris and helping us to define our custom alloc and free
> pgtable call. For our side usecase we have a requirement to create
> pgtable from HLOS and then share it to different entity(VMID) and
> basically that also requires few smc calls and for that we need
> custom alloc/free callbacks.
>
> https://lore.kernel.org/all/20231101071144.16309-1-quic_gkohli@quicinc.com/
>
>
> So custom allocator and free ops is helping for us also. Is there any
> plan to merge these patches from Boris.
Sorry for the late reply. I just sent a v2, but I forgot to add your
Tested-by :-/. Feel free to add it back.
Regards,
Boris
^ permalink raw reply [flat|nested] 3+ messages in thread