From: Baolu Lu <baolu.lu@linux.intel.com>
To: Yi Liu <yi.l.liu@intel.com>, "Tian, Kevin" <kevin.tian@intel.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"jgg@nvidia.com" <jgg@nvidia.com>
Cc: baolu.lu@linux.intel.com,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"eric.auger@redhat.com" <eric.auger@redhat.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
"vasant.hegde@amd.com" <vasant.hegde@amd.com>,
"willy@infradead.org" <willy@infradead.org>
Subject: Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
Date: Thu, 7 Nov 2024 16:04:05 +0800
Message-ID: <27c2acfb-a428-486a-bd10-7d34a8cae4ed@linux.intel.com>
In-Reply-To: <9cc98d30-6257-4d9c-8735-f1147bd1d966@intel.com>
On 2024/11/7 15:57, Yi Liu wrote:
> On 2024/11/7 14:53, Baolu Lu wrote:
>> On 2024/11/7 14:46, Yi Liu wrote:
>>> On 2024/11/7 13:46, Tian, Kevin wrote:
>>>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>>>> Sent: Thursday, November 7, 2024 12:21 PM
>>>>>
>>>>> On 2024/11/7 10:52, Baolu Lu wrote:
>>>>>> On 11/6/24 23:45, Yi Liu wrote:
>>>>>>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>>>>>>> + struct device *dev, pgd_t *pgd,
>>>>>>> + u32 pasid, u16 did, u16 old_did,
>>>>>>> + int flags)
>>>>>>> +{
>>>>>>> + struct pasid_entry *pte;
>>>>>>> +
>>>>>>> + if (!ecap_flts(iommu->ecap)) {
>>>>>>> + pr_err("No first level translation support on %s\n",
>>>>>>> + iommu->name);
>>>>>>> + return -EINVAL;
>>>>>>> + }
>>>>>>> +
>>>>>>> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
>>>>>>> + pr_err("No 5-level paging support for first-level on %s\n",
>>>>>>> + iommu->name);
>>>>>>> + return -EINVAL;
>>>>>>> + }
>>>>>>> +
>>>>>>> + spin_lock(&iommu->lock);
>>>>>>> + pte = intel_pasid_get_entry(dev, pasid);
>>>>>>> + if (!pte) {
>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>> + return -ENODEV;
>>>>>>> + }
>>>>>>> +
>>>>>>> + if (!pasid_pte_is_present(pte)) {
>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>> + return -EINVAL;
>>>>>>> + }
>>>>>>> +
>>>>>>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>>>>>>> +
>>>>>>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>> +
>>>>>>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>>>>>>> + intel_iommu_drain_pasid_prq(dev, pasid);
>>>>>>> +
>>>>>>> + return 0;
>>>>>>> +}
>>>>>>
>>>>>> pasid_pte_config_first_level() causes the pasid entry to transition from
>>>>>> present to non-present and then to present. In this case, calling
>>>>>> intel_pasid_flush_present() is not accurate, as it is only intended for
>>>>>> pasid entries transitioning from present to present, according to the
>>>>>> specification.
>>>>>>
>>>>>> It's recommended to move pasid_clear_entry(pte) and
>>>>>> pasid_set_present(pte) out to the caller, so ...
>>>>>>
>>>>>> For setup case (pasid from non-present to present):
>>>>>>
>>>>>> - pasid_clear_entry(pte)
>>>>>> - pasid_pte_config_first_level(pte)
>>>>>> - pasid_set_present(pte)
>>>>>> - cache invalidations
>>>>>>
>>>>>> For replace case (pasid from present to present)
>>>>>>
>>>>>> - pasid_pte_config_first_level(pte)
>>>>>> - cache invalidations
>>>>>>
>>>>>> The same applies to other types of setup and replace.
>>>>>
>>>>> hmmm. Here is the reason I did it in the way of this patch:
>>>>> 1) pasid_clear_entry() can clear all the fields that are not supposed to
>>>>>    be used by the new domain. For example, converting a nested domain to
>>>>>    SS only domain, if no pasid_clear_entry() then the FSPTR would be there.
>>>>>    Although spec seems not enforce it, it might be good to clear it.
>>>>> 2) We don't support atomic replace yet, so the whole pasid entry transition
>>>>>    is not done in one shot, so it looks to be ok to do this stepping
>>>>>    transition.
>>>>> 3) It seems to be even worse if keep the Present bit during the transition.
>>>>>    The pasid entry might be broken while the Present bit indicates this is
>>>>>    a valid pasid entry. Say if there is in-flight DMA, the result may be
>>>>>    unpredictable.
>>>>>
>>>>> Based on the above, I chose the current way. But I admit if we are going to
>>>>> support atomic replace, then we should refactor a bit. I believe at that
>>>>> time we need to construct the new pasid entry first and try to exchange it
>>>>> to the pasid table. I can see some transition can be done in that way as we
>>>>> can do atomic exchange with 128bits. thoughts? 🙂
>>>>>
>>>>
>>>> yes 128bit cmpxchg is necessary to support atomic replacement.
>>>>
>>>> Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
>>>> together in a present entry to not break in-flight DMA.
>>>>
>>>> but... your current way (clear entry then update it) also break
>>>> in-flight DMA. So let's admit that as the 1st step it's not aimed to support
>>>> atomic replacement. With that Baolu's suggestion makes more sense
>>>> toward future extension with less refactoring required (otherwise
>>>> you should not use intel_pasid_flush_present() then the earlier
>>>> refactoring for that helper is also meaningless).
>>>
>>> I see. The pasid entry might have some fields that are not supposed to be
>>> used after replacement. Should we have a comment about it?
>>
>> I guess all fields except SSADE and P of a pasid table entry should be
>> cleared in pasid_pte_config_first_level()?
>
> perhaps we can take one more step forward. We can construct the new pte
> in a local variable first and then push it to the pte in the pasid
> table. :)
>
That sounds better! The entry is composed on the stack and then copied
over to the pasid table as a whole.
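
To make the idea concrete, here is a rough userspace sketch of "compose on
the stack, then copy as a whole". It is only an illustration under stated
assumptions: struct demo_pasid_entry, DEMO_PRESENT and the demo_* helpers
are hypothetical names, the field placement is illustrative rather than a
statement of the VT-d bit layout, and the plain struct copy is not atomic
with respect to hardware, so this does not give atomic replacement.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 512-bit scalable-mode PASID table entry. */
struct demo_pasid_entry {
	uint64_t val[8];
};

/* Hypothetical Present flag, kept in bit 0 of val[0] for this sketch. */
#define DEMO_PRESENT	(1ULL << 0)

/*
 * Compose a complete first-level entry in caller-provided storage.
 * Field placement is illustrative only; check the spec for the real layout.
 */
static void demo_compose_first_level(struct demo_pasid_entry *e,
				     uint64_t pgd_pa, uint16_t did)
{
	memset(e, 0, sizeof(*e));	/* no stale fields from the old domain */
	e->val[2] = pgd_pa & ~0xfffULL;	/* page-table pointer, 4K aligned */
	e->val[1] = did;		/* domain ID */
	e->val[0] = DEMO_PRESENT;	/* entry is complete, mark it present */
}

/*
 * Replace case (present -> present): build the new entry on the stack and
 * copy it over the table slot in one assignment. The caller is expected to
 * do the cache invalidations afterwards. Returns 0 on success, -1 if the
 * slot is not present.
 */
static int demo_replace_first_level(struct demo_pasid_entry *slot,
				    uint64_t pgd_pa, uint16_t did)
{
	struct demo_pasid_entry new_pte;

	if (!(slot->val[0] & DEMO_PRESENT))
		return -1;

	demo_compose_first_level(&new_pte, pgd_pa, did);
	*slot = new_pte;	/* whole-entry copy; NOT atomic w.r.t. hardware */
	return 0;
}
```

Because the new entry starts from zero, fields that only the old domain used
are dropped without a separate pasid_clear_entry() on the live slot, and the
same compose step could later feed a 128-bit cmpxchg if atomic replace is
ever added.
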
With these two issues addressed, do you mind sending a new version? Let's
try to catch the pull request window. There are other series (iommufd
and vfio) for user-space PASID support that depend on this.
--
baolu