public inbox for linuxppc-dev@ozlabs.org
 help / color / mirror / Atom feed
From: Shivaprasad G Bhat <sbhat@linux.ibm.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	kvm@vger.kernel.org, iommu@lists.linux.dev, chleroy@kernel.org,
	mpe@ellerman.id.au, maddy@linux.ibm.com, npiggin@gmail.com,
	alex@shazbot.org, joerg.roedel@amd.com, kevin.tian@intel.com,
	gbatra@linux.ibm.com, clg@kaod.org, vaibhav@linux.ibm.com,
	brking@linux.vnet.ibm.com, nnmlinux@linux.ibm.com,
	amachhiw@linux.ibm.com, tpearson@raptorengineering.com
Subject: Re: [RFC PATCH] powerpc: iommu: Initial IOMMUFD support for PPC64
Date: Tue, 3 Feb 2026 21:22:13 +0530	[thread overview]
Message-ID: <2127b181-2c3a-4470-9b79-b508a18275c9@linux.ibm.com> (raw)
In-Reply-To: <20260127191643.GQ1134360@nvidia.com>

Hi Jason,


Thanks for reviewing this patch.


On 1/28/26 12:46 AM, Jason Gunthorpe wrote:
> On Tue, Jan 27, 2026 at 06:35:56PM +0000, Shivaprasad G Bhat wrote:
>> The RFC attempts to implement the IOMMUFD support on PPC64 by
>> adding new iommu_ops for paging domain. The existing platform
>> domain continues to be the default domain for in-kernel use.
> It would be nice to see the platform domain go away and ppc use the
> normal dma-iommu.c stuff, but I don't think it is critical to making
> it work with iommufd.


I agree. I have started on this. I will send incremental changes

as follow-up after this.


>> On PPC64, IOVA ranges are based on the type of the DMA window
>> and their properties. Currently, there is no way to expose the
>> attributes of the non-default 64-bit DMA window, which the platform
>> supports. The platform allows the operating system to select the
>> starting offset(at 4GiB or 512PiB default offset), pagesize and
>> window size for the non-default 64-bit DMA window. For example,
>> with VFIO, this is handled via VFIO_IOMMU_SPAPR_TCE_GET_INFO
>> and VFIO_IOMMU_SPAPR_TCE_CREATE|REMOVE ioctls. While I am exploring
>> the ways to expose and configure these DMA window attributes as
>> per user input, any suggestions in this regard will be very helpful.
> You can pass in driver specific information during HWPT creation, so
> any properties you need can be specified there.


Sure. I think IOMMU_GET_HW_INFO would be useful for getting the

platform supported configuration in this case.


> Then you'd want to introduce a new domain op to get the apertures
> instead of the single range hard coded into the domain struct. The new
> op would be able to return a list. We can use this op to return
> apertures for sign extension page tables too.
>
> Update iommufd to calculate the reserved regions by evaluating the
> whole list.
>
> I think you'll find this pretty straight forward, I'd do it as a
> followup patch to this one.


Thanks. I will wait for that patch.


>
>> Currently existing vfio type1 specific vfio-compat driver even
>> with this patch will not work for PPC64. I believe we need to have
>> a separate "vfio-spapr-compat" driver to make it work.
> Yes, vfio-compat doesn't support the special spapr ioctls.
>
> I don't think you need a new driver, just implement whatever they do
> with the existing interfaces, probably in its own .c file though.

There are ioctl number conflicts like

# grep -n "VFIO_BASE + 1[89]" include/uapi/linux/vfio.h | grep define
940:#defineVFIO_DEVICE_BIND_IOMMUFD_IO(VFIO_TYPE, VFIO_BASE + 18)
976:#defineVFIO_DEVICE_ATTACH_IOMMUFD_PT_IO(VFIO_TYPE, VFIO_BASE + 19)
1833:#defineVFIO_IOMMU_SPAPR_UNREGISTER_MEMORY_IO(VFIO_TYPE, VFIO_BASE + 18)
1856:#defineVFIO_IOMMU_SPAPR_TCE_CREATE_IO(VFIO_TYPE, VFIO_BASE + 19)
# grep -n "VFIO_BASE + 20" include/uapi/linux/vfio.h | grep define
999:#defineVFIO_DEVICE_DETACH_IOMMUFD_PT_IO(VFIO_TYPE, VFIO_BASE + 20)
1870:#defineVFIO_IOMMU_SPAPR_TCE_REMOVE_IO(VFIO_TYPE, VFIO_BASE + 20)

> However, I have no idea what is required to implement those ops, or if
> it is even possible.. It may be easier to just leave the old vfio
> stuff around instead of trying to compat it. The purpose of compat was
> to be able to build kernels without type1 at all. It isn't necessary
> to start using iommufd in new apps with the new interfaces.
>
> Given you are mainly looking at a VMM that already will have iommufd
> support it may not be worthwhile.


You are right. We do have some use cases beyond VMM, I will consider 
compat driver

only if it is helpful there.


>> @@ -1201,7 +1201,15 @@ spapr_tce_blocked_iommu_attach_dev(struct iommu_domain *platform_domain,
>>   	 * also sets the dma_api ops
>>   	 */
>>   	table_group = iommu_group_get_iommudata(grp);
>> +
>> +	if (old && old->type == IOMMU_DOMAIN_DMA) {
> I'm trying to delete IOMMU_DOMAIN_DMA please don't use it in
> drivers.
Sure.
>>   static const struct iommu_ops spapr_tce_iommu_ops = {
>>   	.default_domain = &spapr_tce_platform_domain,
>>   	.blocked_domain = &spapr_tce_blocked_domain,
>> @@ -1267,6 +1436,14 @@ static const struct iommu_ops spapr_tce_iommu_ops = {
>>   	.probe_device = spapr_tce_iommu_probe_device,
>>   	.release_device = spapr_tce_iommu_release_device,
>>   	.device_group = spapr_tce_iommu_device_group,
>> +	.domain_alloc_paging = spapr_tce_domain_alloc_paging,
>> +	.default_domain_ops = &(const struct iommu_domain_ops) {
>> +		.attach_dev     = spapr_tce_iommu_attach_device,
>> +		.map_pages      = spapr_tce_iommu_map_pages,
>> +		.unmap_pages    = spapr_tce_iommu_unmap_pages,
>> +		.iova_to_phys   = spapr_tce_iommu_iova_to_phys,
>> +		.free           = spapr_tce_domain_free,
>> +	}
> Please don't use default_domain_ops in a driver that is supporting
> multiple domain types and platform, it becomes confusing to guess
> which domain type those ops are linked to.


Sure.


> You should also implement the BLOCKING domain type to make VFIO work
> better


I am not sure how this could help making VFIO better. May be, I am not able

to imagine the advantages with the current platform domain approach

in place. Could you please elaborate more on this?


> I wouldn't try to guess if this is right or not, but it looks pretty
> reasonable as a first start.


Thanks, I will iterate this as RFC till i get to reasonable shape.


Regards,

Shivaprasad

> Jason


  reply	other threads:[~2026-02-03 15:52 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-27 18:35 [RFC PATCH] powerpc: iommu: Initial IOMMUFD support for PPC64 Shivaprasad G Bhat
2026-01-27 19:16 ` Jason Gunthorpe
2026-02-03 15:52   ` Shivaprasad G Bhat [this message]
2026-02-03 18:07     ` Jason Gunthorpe
2026-02-04 15:23       ` Shivaprasad G Bhat
2026-01-28 10:53 ` Christophe Leroy (CS GROUP)
2026-02-03 15:54   ` Shivaprasad G Bhat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2127b181-2c3a-4470-9b79-b508a18275c9@linux.ibm.com \
    --to=sbhat@linux.ibm.com \
    --cc=alex@shazbot.org \
    --cc=amachhiw@linux.ibm.com \
    --cc=brking@linux.vnet.ibm.com \
    --cc=chleroy@kernel.org \
    --cc=clg@kaod.org \
    --cc=gbatra@linux.ibm.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joerg.roedel@amd.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maddy@linux.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=nnmlinux@linux.ibm.com \
    --cc=npiggin@gmail.com \
    --cc=tpearson@raptorengineering.com \
    --cc=vaibhav@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox