From: Joao Martins <joao.m.martins@oracle.com>
To: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
iommu@lists.linux-foundation.org
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
Kevin Tian <kevin.tian@intel.com>,
Yishai Hadas <yishaih@nvidia.com>,
kvm@vger.kernel.org, Will Deacon <will@kernel.org>,
Cornelia Huck <cohuck@redhat.com>,
Alex Williamson <alex.williamson@redhat.com>,
Jason Gunthorpe <jgg@nvidia.com>,
David Woodhouse <dwmw2@infradead.org>,
Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH RFC 10/19] iommu/amd: Add unmap_read_dirty() support
Date: Tue, 31 May 2022 16:51:12 +0100 [thread overview]
Message-ID: <9f002dae-dd47-5a2a-0d3c-a4bf47a2695c@oracle.com> (raw)
In-Reply-To: <3c1186fe-0fa8-7329-c7a1-64ec0bd644c4@amd.com>
On 5/31/22 13:39, Suravee Suthikulpanit wrote:
> On 4/29/22 4:09 AM, Joao Martins wrote:
>> AMD implementation of unmap_read_dirty() is pretty simple as
>> mostly reuses unmap code with the extra addition of marshalling
>> the dirty bit into the bitmap as it walks the to-be-unmapped
>> IOPTE.
>>
>> Extra care is taken though, to switch over to cmpxchg as opposed
>> to a non-serialized store to the PTE and testing the dirty bit
>> only set until cmpxchg succeeds to set to 0.
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> ---
>> drivers/iommu/amd/io_pgtable.c | 44 +++++++++++++++++++++++++++++-----
>> drivers/iommu/amd/iommu.c | 22 +++++++++++++++++
>> 2 files changed, 60 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
>> index 8325ef193093..1868c3b58e6d 100644
>> --- a/drivers/iommu/amd/io_pgtable.c
>> +++ b/drivers/iommu/amd/io_pgtable.c
>> @@ -355,6 +355,16 @@ static void free_clear_pte(u64 *pte, u64 pteval, struct list_head *freelist)
>> free_sub_pt(pt, mode, freelist);
>> }
>>
>> +static bool free_pte_dirty(u64 *pte, u64 pteval)
>
> Nitpick: Since we free and clearing the dirty bit, should we change
> the function name to free_clear_pte_dirty()?
>
We free and *read* the dirty bit. It just so happens that we clear dirty
bit and every other one in the process. Just to make sure that I am not
clear the dirty bit explicitly (like the read_and_clear_dirty())
>> +{
>> + bool dirty = false;
>> +
>> + while (IOMMU_PTE_DIRTY(cmpxchg64(pte, pteval, 0)))
>
> We should use 0ULL instead of 0.
>
ack.
>> + dirty = true;
>> +
>> + return dirty;
>> +}
>> +
>
> Actually, what do you think if we enhance the current free_clear_pte()
> to also handle the check dirty as well?
>
See further below, about dropping this patch.
>> /*
>> * Generic mapping functions. It maps a physical address into a DMA
>> * address space. It allocates the page table pages if necessary.
>> @@ -428,10 +438,11 @@ static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
>> return ret;
>> }
>>
>> -static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
>> - unsigned long iova,
>> - size_t size,
>> - struct iommu_iotlb_gather *gather)
>> +static unsigned long __iommu_v1_unmap_page(struct io_pgtable_ops *ops,
>> + unsigned long iova,
>> + size_t size,
>> + struct iommu_iotlb_gather *gather,
>> + struct iommu_dirty_bitmap *dirty)
>> {
>> struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
>> unsigned long long unmapped;
>> @@ -445,11 +456,15 @@ static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
>> while (unmapped < size) {
>> pte = fetch_pte(pgtable, iova, &unmap_size);
>> if (pte) {
>> - int i, count;
>> + unsigned long i, count;
>> + bool pte_dirty = false;
>>
>> count = PAGE_SIZE_PTE_COUNT(unmap_size);
>> for (i = 0; i < count; i++)
>> - pte[i] = 0ULL;
>> + pte_dirty |= free_pte_dirty(&pte[i], pte[i]);
>> +
>
> Actually, what if we change the existing free_clear_pte() to free_and_clear_dirty_pte(),
> and incorporate the logic for
>
Likewise, but otherwise it would be a good idea.
>> ...
>> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
>> index 0a86392b2367..a8fcb6e9a684 100644
>> --- a/drivers/iommu/amd/iommu.c
>> +++ b/drivers/iommu/amd/iommu.c
>> @@ -2144,6 +2144,27 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
>> return r;
>> }
>>
>> +static size_t amd_iommu_unmap_read_dirty(struct iommu_domain *dom,
>> + unsigned long iova, size_t page_size,
>> + struct iommu_iotlb_gather *gather,
>> + struct iommu_dirty_bitmap *dirty)
>> +{
>> + struct protection_domain *domain = to_pdomain(dom);
>> + struct io_pgtable_ops *ops = &domain->iop.iop.ops;
>> + size_t r;
>> +
>> + if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
>> + (domain->iop.mode == PAGE_MODE_NONE))
>> + return 0;
>> +
>> + r = (ops->unmap_read_dirty) ?
>> + ops->unmap_read_dirty(ops, iova, page_size, gather, dirty) : 0;
>> +
>> + amd_iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
>> +
>> + return r;
>> +}
>> +
>
> Instead of creating a new function, what if we enhance the current amd_iommu_unmap()
> to also handle read dirty part as well (e.g. __amd_iommu_unmap_read_dirty()), and
> then both amd_iommu_unmap() and amd_iommu_unmap_read_dirty() can call
> the __amd_iommu_unmap_read_dirty()?
Yes, if we were to keep this one.
I am actually dropping this patch (and the whole unmap_read_dirty additions).
The unmap_read_dirty() will be replaced but having userspace do get_dirty_iova() before
the unmap() or still keep the uAPI in iommufd while being a read_dirty() followed by unmap
without the special IOMMU unmap path. See this thread that starts here:
https://lore.kernel.org/linux-iommu/20220502185239.GR8364@nvidia.com/
But essentially, the proposed unmap_read_dirty primitive isn't fully race free as it only
tackle races against IOMMU updating the IOPTE. DMA could be happening between the time I
clear the PTE and when I do the IOMMU TLB flush. Think vIOMMU usecases. Eliminating the
race fully is expensive requiring an extra TLB flush + IOPT walk in addition to the unmap
one (we would essentially double the cost). The thinking is that an alternative new
primitive would instead wrprotect the IOVA (i.e. thus blocking DMA), flush the IOTLB and
then we would read out a dirty bit and the unmap would be a regular unmap. For now I won't
be adding this as it is not clear if any use case really needs this.
Joao
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
next prev parent reply other threads:[~2022-05-31 15:51 UTC|newest]
Thread overview: 101+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-28 21:09 [PATCH RFC 00/19] IOMMUFD Dirty Tracking Joao Martins
2022-04-28 21:09 ` [PATCH RFC 01/19] iommu: Add iommu_domain ops for dirty tracking Joao Martins
2022-04-29 7:54 ` Tian, Kevin
2022-04-29 10:44 ` Joao Martins
2022-04-29 12:08 ` Jason Gunthorpe via iommu
2022-04-29 14:26 ` Joao Martins
2022-04-29 14:35 ` Jason Gunthorpe via iommu
2022-04-29 13:40 ` Baolu Lu
2022-04-29 15:27 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 02/19] iommufd: Dirty tracking for io_pagetable Joao Martins
2022-04-29 8:07 ` Tian, Kevin
2022-04-29 10:48 ` Joao Martins
2022-04-29 11:56 ` Jason Gunthorpe via iommu
2022-04-29 14:28 ` Joao Martins
2022-04-29 23:51 ` Baolu Lu
2022-05-02 11:57 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 03/19] iommufd: Dirty tracking data support Joao Martins
2022-04-29 8:12 ` Tian, Kevin
2022-04-29 10:54 ` Joao Martins
2022-04-29 12:09 ` Jason Gunthorpe via iommu
2022-04-29 14:33 ` Joao Martins
2022-04-30 4:11 ` Baolu Lu
2022-05-02 12:06 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 04/19] iommu: Add an unmap API that returns dirtied IOPTEs Joao Martins
2022-04-30 5:12 ` Baolu Lu
2022-05-02 12:22 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 05/19] iommufd: Add a dirty bitmap to iopt_unmap_iova() Joao Martins
2022-04-29 12:14 ` Jason Gunthorpe via iommu
2022-04-29 14:36 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 06/19] iommufd: Dirty tracking IOCTLs for the hw_pagetable Joao Martins
2022-04-28 21:09 ` [PATCH RFC 07/19] iommufd/vfio-compat: Dirty tracking IOCTLs compatibility Joao Martins
2022-04-29 12:19 ` Jason Gunthorpe via iommu
2022-04-29 14:27 ` Joao Martins
2022-04-29 14:36 ` Jason Gunthorpe via iommu
2022-04-29 14:52 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 08/19] iommufd: Add a test for dirty tracking ioctls Joao Martins
2022-04-28 21:09 ` [PATCH RFC 09/19] iommu/amd: Access/Dirty bit support in IOPTEs Joao Martins
2022-05-31 11:34 ` Suravee Suthikulpanit via iommu
2022-05-31 12:15 ` Baolu Lu
2022-05-31 15:22 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 10/19] iommu/amd: Add unmap_read_dirty() support Joao Martins
2022-05-31 12:39 ` Suravee Suthikulpanit via iommu
2022-05-31 15:51 ` Joao Martins [this message]
2022-04-28 21:09 ` [PATCH RFC 11/19] iommu/amd: Print access/dirty bits if supported Joao Martins
2022-04-28 21:09 ` [PATCH RFC 12/19] iommu/arm-smmu-v3: Add feature detection for HTTU Joao Martins
2022-04-28 21:09 ` [PATCH RFC 13/19] iommu/arm-smmu-v3: Add feature detection for BBML Joao Martins
2022-04-29 11:11 ` Robin Murphy
2022-04-29 11:54 ` Joao Martins
2022-04-29 12:26 ` Robin Murphy
2022-04-29 14:34 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 14/19] iommu/arm-smmu-v3: Add read_and_clear_dirty() support Joao Martins
2022-04-28 21:09 ` [PATCH RFC 15/19] iommu/arm-smmu-v3: Add set_dirty_tracking_range() support Joao Martins
2022-04-29 8:28 ` Tian, Kevin
2022-04-29 11:05 ` Joao Martins
2022-04-29 11:19 ` Robin Murphy
2022-04-29 12:06 ` Joao Martins
2022-04-29 12:23 ` Jason Gunthorpe via iommu
2022-04-29 14:45 ` Joao Martins
2022-04-29 16:11 ` Jason Gunthorpe via iommu
2022-04-29 16:40 ` Joao Martins
2022-04-29 16:46 ` Jason Gunthorpe via iommu
2022-04-29 19:20 ` Robin Murphy
2022-05-02 11:52 ` Joao Martins
2022-05-02 11:57 ` Joao Martins
2022-05-05 7:25 ` Shameerali Kolothum Thodi via iommu
2022-05-05 9:52 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 16/19] iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping Joao Martins
2022-04-29 11:35 ` Robin Murphy
2022-04-29 12:10 ` Joao Martins
2022-04-29 12:46 ` Robin Murphy
2022-04-28 21:09 ` [PATCH RFC 17/19] iommu/arm-smmu-v3: Add unmap_read_dirty() support Joao Martins
2022-04-29 11:53 ` Robin Murphy
2022-04-28 21:09 ` [PATCH RFC 18/19] iommu/intel: Access/Dirty bit support for SL domains Joao Martins
2022-04-29 9:03 ` Tian, Kevin
2022-04-29 11:20 ` Joao Martins
2022-04-30 6:12 ` Baolu Lu
2022-05-02 12:24 ` Joao Martins
2022-04-28 21:09 ` [PATCH RFC 19/19] iommu/intel: Add unmap_read_dirty() support Joao Martins
2022-04-29 5:45 ` [PATCH RFC 00/19] IOMMUFD Dirty Tracking Tian, Kevin
2022-04-29 10:27 ` Joao Martins
2022-04-29 12:38 ` Jason Gunthorpe via iommu
2022-04-29 15:20 ` Joao Martins
2022-05-05 7:40 ` Tian, Kevin
2022-05-05 14:07 ` Jason Gunthorpe via iommu
2022-05-06 3:51 ` Tian, Kevin
2022-05-06 11:46 ` Jason Gunthorpe via iommu
2022-05-10 1:38 ` Tian, Kevin
2022-05-10 11:50 ` Joao Martins
2022-05-11 1:17 ` Tian, Kevin
2022-05-10 13:46 ` Jason Gunthorpe via iommu
2022-05-11 1:10 ` Tian, Kevin
2022-05-02 18:11 ` Alex Williamson
2022-05-02 18:52 ` Jason Gunthorpe via iommu
2022-05-03 10:48 ` Joao Martins
2022-05-05 7:42 ` Tian, Kevin
2022-05-05 10:06 ` Joao Martins
2022-05-05 11:03 ` Tian, Kevin
2022-05-05 11:50 ` Joao Martins
2022-05-06 3:14 ` Tian, Kevin
2022-05-05 13:55 ` Jason Gunthorpe via iommu
2022-05-06 3:17 ` Tian, Kevin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9f002dae-dd47-5a2a-0d3c-a4bf47a2695c@oracle.com \
--to=joao.m.martins@oracle.com \
--cc=alex.williamson@redhat.com \
--cc=cohuck@redhat.com \
--cc=dwmw2@infradead.org \
--cc=iommu@lists.linux-foundation.org \
--cc=jean-philippe@linaro.org \
--cc=jgg@nvidia.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=robin.murphy@arm.com \
--cc=suravee.suthikulpanit@amd.com \
--cc=will@kernel.org \
--cc=yishaih@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox