From: Baolu Lu <baolu.lu@linux.intel.com>
To: Yi Liu <yi.l.liu@intel.com>, "Tian, Kevin" <kevin.tian@intel.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"jgg@nvidia.com" <jgg@nvidia.com>
Cc: baolu.lu@linux.intel.com,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"eric.auger@redhat.com" <eric.auger@redhat.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
"vasant.hegde@amd.com" <vasant.hegde@amd.com>,
"willy@infradead.org" <willy@infradead.org>
Subject: Re: [PATCH v5 04/13] iommu/vt-d: Add pasid replace helpers
Date: Thu, 7 Nov 2024 16:41:41 +0800
Message-ID: <d7d1d655-2ddd-4867-a5c6-af50e280ec57@linux.intel.com>
In-Reply-To: <f73869de-38b1-41cf-bd20-d91523e4fd08@intel.com>
On 2024/11/7 16:39, Yi Liu wrote:
> On 2024/11/7 16:04, Baolu Lu wrote:
>> On 2024/11/7 15:57, Yi Liu wrote:
>>> On 2024/11/7 14:53, Baolu Lu wrote:
>>>> On 2024/11/7 14:46, Yi Liu wrote:
>>>>> On 2024/11/7 13:46, Tian, Kevin wrote:
>>>>>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>>>>>> Sent: Thursday, November 7, 2024 12:21 PM
>>>>>>>
>>>>>>> On 2024/11/7 10:52, Baolu Lu wrote:
>>>>>>>> On 11/6/24 23:45, Yi Liu wrote:
>>>>>>>>> +int intel_pasid_replace_first_level(struct intel_iommu *iommu,
>>>>>>>>> + struct device *dev, pgd_t *pgd,
>>>>>>>>> + u32 pasid, u16 did, u16 old_did,
>>>>>>>>> + int flags)
>>>>>>>>> +{
>>>>>>>>> + struct pasid_entry *pte;
>>>>>>>>> +
>>>>>>>>> + if (!ecap_flts(iommu->ecap)) {
>>>>>>>>> + pr_err("No first level translation support on %s\n",
>>>>>>>>> + iommu->name);
>>>>>>>>> + return -EINVAL;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
>>>>>>>>> + pr_err("No 5-level paging support for first-level on %s\n",
>>>>>>>>> + iommu->name);
>>>>>>>>> + return -EINVAL;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + spin_lock(&iommu->lock);
>>>>>>>>> + pte = intel_pasid_get_entry(dev, pasid);
>>>>>>>>> + if (!pte) {
>>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>>> + return -ENODEV;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + if (!pasid_pte_is_present(pte)) {
>>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>>> + return -EINVAL;
>>>>>>>>> + }
>>>>>>>>> +
>>>>>>>>> + WARN_ON(old_did != pasid_get_domain_id(pte));
>>>>>>>>> +
>>>>>>>>> + pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
>>>>>>>>> + spin_unlock(&iommu->lock);
>>>>>>>>> +
>>>>>>>>> + intel_pasid_flush_present(iommu, dev, pasid, old_did, pte);
>>>>>>>>> + intel_iommu_drain_pasid_prq(dev, pasid);
>>>>>>>>> +
>>>>>>>>> + return 0;
>>>>>>>>> +}
>>>>>>>>
>>>>>>>> pasid_pte_config_first_level() causes the pasid entry to transition from
>>>>>>>> present to non-present and then to present. In this case, calling
>>>>>>>> intel_pasid_flush_present() is not accurate, as it is only intended for
>>>>>>>> pasid entries transitioning from present to present, according to the
>>>>>>>> specification.
>>>>>>>>
>>>>>>>> It's recommended to move pasid_clear_entry(pte) and
>>>>>>>> pasid_set_present(pte) out to the caller, so ...
>>>>>>>>
>>>>>>>> For setup case (pasid from non-present to present):
>>>>>>>>
>>>>>>>> - pasid_clear_entry(pte)
>>>>>>>> - pasid_pte_config_first_level(pte)
>>>>>>>> - pasid_set_present(pte)
>>>>>>>> - cache invalidations
>>>>>>>>
>>>>>>>> For replace case (pasid from present to present)
>>>>>>>>
>>>>>>>> - pasid_pte_config_first_level(pte)
>>>>>>>> - cache invalidations
>>>>>>>>
>>>>>>>> The same applies to other types of setup and replace.
>>>>>>>
>>>>>>> hmmm. Here is the reason I did it in the way of this patch:
>>>>>>> 1) pasid_clear_entry() can clear all the fields that are not supposed to
>>>>>>>    be used by the new domain. For example, when converting a nested
>>>>>>>    domain to an SS-only domain, without pasid_clear_entry() the FSPTR
>>>>>>>    would still be there. Although the spec does not seem to enforce it,
>>>>>>>    it might be good to clear it.
>>>>>>> 2) We don't support atomic replace yet, so the whole pasid entry
>>>>>>>    transition is not done in one shot, so it looks to be ok to do this
>>>>>>>    stepping transition.
>>>>>>> 3) It seems to be even worse to keep the Present bit set during the
>>>>>>>    transition. The pasid entry might be broken while the Present bit
>>>>>>>    indicates this is a valid pasid entry. Say there is in-flight DMA;
>>>>>>>    the result may be unpredictable.
>>>>>>>
>>>>>>> Based on the above, I chose the current way. But I admit if we are going
>>>>>>> to support atomic replace, then we should refactor a bit. I believe at
>>>>>>> that time we need to construct the new pasid entry first and try to
>>>>>>> exchange it into the pasid table. I can see some transitions can be done
>>>>>>> in that way as we can do atomic exchange with 128 bits. thoughts? 🙂
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> yes 128bit cmpxchg is necessary to support atomic replacement.
>>>>>>
>>>>>> Actually vt-d spec clearly says so e.g. SSPTPTR/DID must be updated
>>>>>> together in a present entry to not break in-flight DMA.
>>>>>>
>>>>>> but... your current way (clear entry then update it) also breaks in-flight
>>>>>> DMA. So let's admit that as the 1st step it's not aimed to support
>>>>>> atomic replacement. With that Baolu's suggestion makes more sense
>>>>>> toward future extension with less refactoring required (otherwise
>>>>>> you should not use intel_pasid_flush_present() then the earlier
>>>>>> refactoring for that helper is also meaningless).
>>>>>
>>>>> I see. The pasid entry might have some fields that are not supposed to be
>>>>> used after replacement. Should we have a comment about it?
>>>>
>>>> I guess all fields except SSADE and P of a pasid table entry should be
>>>> cleared in pasid_pte_config_first_level()?
>>>
>>> perhaps we can take one more step forward. We can construct the new
>>> pte in a local variable first and then push it to the pte in the
>>> pasid table. 🙂
>>>
>>
>> That sounds better! The entry is composed on the stack and then copied
>> over to the pasid table as a whole.
>
> that's it. Like the below.
Looks good to me now.
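
For readers following the thread, here is a minimal userspace sketch of the
approach agreed above: compose the new entry in a local variable, then copy it
over the table slot as a whole. It is an illustration under stated
assumptions, not the code from the patch. The struct, the model_* helpers and
the bit positions are all hypothetical and simplified, the final copy is a
plain struct assignment (so it is not atomic), and the cache invalidations
that follow a real update are left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified model of a 512-bit PASID table entry. */
struct pasid_entry_model {
	uint64_t val[8];
};

/* Illustrative field positions only; not taken from the patch. */
#define MODEL_PRESENT	(1ULL << 0)	/* assumed Present bit in val[0] */
#define MODEL_DID_MASK	0xffffULL	/* assumed domain ID field in val[1] */
#define MODEL_PTR_MASK	(~0xfffULL)	/* assumed page-aligned pointer in val[2] */

static int model_is_present(const struct pasid_entry_model *pte)
{
	return (pte->val[0] & MODEL_PRESENT) != 0;
}

static uint16_t model_get_did(const struct pasid_entry_model *pte)
{
	return (uint16_t)(pte->val[1] & MODEL_DID_MASK);
}

/*
 * Build a complete first-level entry in caller-provided storage.
 * Starting from an all-zero entry means no field of a previous
 * domain can leak into the new one.
 */
static void model_setup_first_level(struct pasid_entry_model *pte,
				    uint64_t pgd_pa, uint16_t did)
{
	memset(pte, 0, sizeof(*pte));
	pte->val[2] = pgd_pa & MODEL_PTR_MASK;
	pte->val[1] = did;
	pte->val[0] = MODEL_PRESENT;
}

/*
 * Replace case (present to present): compose the new entry on the
 * stack, then copy it over the table slot in one assignment.
 * Returns 0 on success, -1 if the slot is not present.
 */
static int model_replace_first_level(struct pasid_entry_model *slot,
				     uint64_t pgd_pa, uint16_t did)
{
	struct pasid_entry_model new_pte;

	if (!model_is_present(slot))
		return -1;

	model_setup_first_level(&new_pte, pgd_pa, did);
	*slot = new_pte;	/* whole-entry copy; NOT atomic in this sketch */
	return 0;
}
```

In this model the slot never passes through a cleared, non-present state, and
stale fields from the old domain are dropped because the new entry starts from
zero. A copy of this kind can still be observed half-written, which is why the
discussion above treats 128-bit cmpxchg as a prerequisite for truly atomic
replacement.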