public inbox for kvm@vger.kernel.org
From: Baolu Lu <baolu.lu@linux.intel.com>
To: Joao Martins <joao.m.martins@oracle.com>,
	Jason Gunthorpe <jgg@nvidia.com>
Cc: baolu.lu@linux.intel.com, iommu@lists.linux.dev,
	Kevin Tian <kevin.tian@intel.com>,
	Shameerali Kolothum Thodi  <shameerali.kolothum.thodi@huawei.com>,
	Yi Liu <yi.l.liu@intel.com>, Yi Y Sun <yi.y.sun@intel.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Joerg Roedel <joro@8bytes.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Zhenzhong Duan <zhenzhong.duan@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: [PATCH v4 11/18] iommu/amd: Access/Dirty bit support in IOPTEs
Date: Fri, 20 Oct 2023 10:21:36 +0800
Message-ID: <31612252-e6e1-4bfc-8b82-620e79422cbc@linux.intel.com>
In-Reply-To: <f2109ca9-b194-43f2-bed0-077d03242d1a@oracle.com>

On 10/19/23 7:58 PM, Joao Martins wrote:
> On 19/10/2023 01:17, Joao Martins wrote:
>> On 19/10/2023 00:11, Jason Gunthorpe wrote:
>>> On Wed, Oct 18, 2023 at 09:27:08PM +0100, Joao Martins wrote:
>>>> +static int iommu_v1_read_and_clear_dirty(struct io_pgtable_ops *ops,
>>>> +					 unsigned long iova, size_t size,
>>>> +					 unsigned long flags,
>>>> +					 struct iommu_dirty_bitmap *dirty)
>>>> +{
>>>> +	struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
>>>> +	unsigned long end = iova + size - 1;
>>>> +
>>>> +	do {
>>>> +		unsigned long pgsize = 0;
>>>> +		u64 *ptep, pte;
>>>> +
>>>> +		ptep = fetch_pte(pgtable, iova, &pgsize);
>>>> +		if (ptep)
>>>> +			pte = READ_ONCE(*ptep);
>>> It is fine for now, but this is so slow for something that is such a
>>> fast path. We are optimizing away a TLB invalidation but leaving
>>> this???
>>>
>> More obvious reason is that I'm still working towards the 'faster' page table
>> walker. Then map/unmap code needs to do similar lookups so thought of reusing
>> the same functions as map/unmap initially. And improve it afterwards or when
>> introducing the splitting.
>>
>>> It is a radix tree, you walk trees by retaining your position at each
>>> level as you go (eg in a function per-level call chain or something)
>>> then ++ is cheap. Re-searching the entire tree every time is madness.
>> I'm aware -- I have an improved page-table walker for AMD[0] (not yet for Intel;
>> still in the works),
> Sigh, I realized that Intel's pfn_to_dma_pte() (main lookup function for
> map/unmap/iova_to_phys) does something a little off when it finds a non-present
> PTE. It allocates a page table to it; which is not OK in this specific case (I
> would argue it's neither for iova_to_phys but well maybe I misunderstand the
> expectation of that API).

pfn_to_dma_pte() doesn't allocate a page table for a non-present PTE if
the target_level parameter is set to 0. See line 932 below.

  913 static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain,
  914                           unsigned long pfn, int *target_level,
  915                           gfp_t gfp)
  916 {

[...]

  927         while (1) {
  928                 void *tmp_page;
  929
  930                 offset = pfn_level_offset(pfn, level);
  931                 pte = &parent[offset];
  932                 if (!*target_level && (dma_pte_superpage(pte) ||
                                             !dma_pte_present(pte)))
  933                         break;

So both iova_to_phys() and read_and_clear_dirty() are doing things
right:

	struct dma_pte *pte;
	int level = 0;

	pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT,
                              &level, GFP_KERNEL);
	if (pte && dma_pte_present(pte)) {
		/* The PTE is valid, check anything you want! */
		... ...
	}
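
To make the lookup-only behaviour concrete, here is a minimal standalone
sketch. It is a toy model, not the intel-iommu code: the two-level table,
the `toy_*` names, the present bit and the allocation counter are all
assumptions made up for illustration. It only mirrors the shape of the
check at line 932: with a target level of 0 the walk stops at the first
non-present entry and never allocates, while a non-zero target level
allocates the missing intermediate table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy two-level table. NOT the VT-d format; bit 0 means "present". */
#define TOY_PRESENT 1ULL
#define TOY_BITS    2
#define TOY_ENTRIES (1u << TOY_BITS)
#define TOY_LEVELS  2

struct toy_table {
	uint64_t entry[TOY_ENTRIES];
	struct toy_table *child[TOY_ENTRIES];
};

/* Counts table allocations so a caller can see whether a walk allocated. */
static int toy_allocs;

static struct toy_table *toy_alloc_table(void)
{
	toy_allocs++;
	return calloc(1, sizeof(struct toy_table));
}

/*
 * Hypothetical walker. *target_level == 0 is a pure lookup: stop at the
 * first non-present entry. A non-zero *target_level allocates missing
 * tables on the way down. Tables are never freed in this toy.
 */
static uint64_t *toy_walk(struct toy_table *root, unsigned int pfn,
			  int *target_level)
{
	struct toy_table *parent = root;
	int level = TOY_LEVELS;
	uint64_t *pte;

	assert(root && target_level);

	for (;;) {
		unsigned int offset =
			(pfn >> ((level - 1) * TOY_BITS)) & (TOY_ENTRIES - 1);

		pte = &parent->entry[offset];
		if (!*target_level && !(*pte & TOY_PRESENT))
			break;	/* lookup only: do not allocate */
		if (level == *target_level || level == 1)
			break;
		if (!(*pte & TOY_PRESENT)) {
			parent->child[offset] = toy_alloc_table();
			if (!parent->child[offset])
				return NULL;
			*pte = TOY_PRESENT;
		}
		parent = parent->child[offset];
		level--;
	}

	if (!*target_level)
		*target_level = level;	/* report where the walk stopped */
	return pte;
}

/* Same calling pattern as the snippet above: level 0, then test present. */
static int toy_lookup(struct toy_table *root, unsigned int pfn)
{
	int level = 0;
	uint64_t *pte = toy_walk(root, pfn, &level);

	return pte && (*pte & TOY_PRESENT);
}

/* Map path: ask for level 1, so the intermediate table gets allocated. */
static int toy_map(struct toy_table *root, unsigned int pfn)
{
	int level = 1;
	uint64_t *pte = toy_walk(root, pfn, &level);

	if (!pte)
		return -1;
	*pte |= TOY_PRESENT;
	return 0;
}
```

In this toy, calling toy_lookup() on an empty root leaves toy_allocs at 0,
and only toy_map() bumps it. Note that the lookup still returns a non-NULL
pointer to the non-present entry, which is why the caller has to test the
present bit as well.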

Or am I overlooking something else?

Best regards,
baolu
