From: Gavin Shan <gshan@redhat.com>
To: Oliver Upton <oliver.upton@linux.dev>
Cc: kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org,
maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com,
yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org,
qperret@google.com, ricarkol@google.com, tabba@google.com,
bgardon@google.com, zhenyzha@redhat.com, yihyu@redhat.com,
shan.gavin@gmail.com
Subject: Re: [PATCH] KVM: arm64: Fix soft-lockup on relaxing PTE permission
Date: Wed, 6 Sep 2023 08:26:24 +1000 [thread overview]
Message-ID: <bfdafdc5-4abf-a387-0857-e8cb84e4b3d7@redhat.com> (raw)
In-Reply-To: <ZPduJ08GKaKXwIhM@linux.dev>
On 9/6/23 04:06, Oliver Upton wrote:
> On Tue, Sep 05, 2023 at 10:06:14AM +1000, Gavin Shan wrote:
>
> [...]
>
>>> static inline void __invalidate_icache_guest_page(void *va, size_t size)
>>> {
>>> + size_t nr_lines = size / __icache_line_size();
>>> +
>>> if (icache_is_aliasing()) {
>>> /* any kind of VIPT cache */
>>> icache_inval_all_pou();
>>> } else if (read_sysreg(CurrentEL) != CurrentEL_EL1 ||
>>> !icache_is_vpipt()) {
>>> /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
>>> - icache_inval_pou((unsigned long)va, (unsigned long)va + size);
>>> + if (nr_lines > MAX_TLBI_OPS)
>>> + icache_inval_all_pou();
>>> + else
>>> + icache_inval_pou((unsigned long)va,
>>> + (unsigned long)va + size);
>>> }
>>> }
>>
>> I'm not sure it's worthwhile to pull the @iminline from CTR_EL0 since it's
>> almost always fixed at 64 bytes.
>
> I firmly disagree. The architecture allows implementers to select a
> different minimum line size, and non-64b systems _do_ exist in the wild.
> Furthermore, some implementers have decided to glue together cores with
> mismatched line sizes too...
>
> Though we could avoid some headache by normalizing on 64b, the cold
> reality of the ecosystem requires that we go out of our way to
> accommodate ~any design choice allowed by the architecture.
>
It seems I didn't make it clear enough. My concern about reading ctr_el0
is that we would read it twice in the following path, though I doubt anybody
cares. Since this is a hot path, every bit of performance gain counts.
invalidate_icache_guest_page
  __invalidate_icache_guest_page        // first read of ctr_el0, with your changes
    icache_inval_pou(va, va + size)
      invalidate_icache_by_line
        icache_line_size                // second read of ctr_el0
>> @size is guaranteed to be PAGE_SIZE or PMD_SIZE aligned. Maybe we can just
>> aggressively do something like below, disregarding the icache thrashing.
>> In this way, the code is further simplified.
>>
>> if (size > PAGE_SIZE) {
>>         icache_inval_all_pou();
>> } else {
>>         icache_inval_pou((unsigned long)va,
>>                          (unsigned long)va + size);
>> }        // braces are still needed
>
> This could work too but we already have a kernel heuristic for limiting
> the amount of broadcast invalidations, which is MAX_TLBI_OPS. I don't
> want to introduce a second, KVM-specific hack to address the exact same
> thing.
>
Ok. I was confused at first glance since the TLB isn't relevant to the icache.
I think it's fine to reuse MAX_TLBI_OPS here, but a comment may be needed.
Oliver, could you please send a formal patch with your changes?
>> I'm leveraging the chance to ask one question, which isn't related to this issue.
>> It seems we're doing the icache/dcache coherence differently for stage-1 and
>> stage-2 page table entries. The question is why we needn't clean the dcache
>> for stage-2, as we do for the stage-1 case?
>
> KVM always does its required dcache maintenance (if any) on the first
> translation abort to a given IPA. On systems w/o FEAT_DIC, we lazily
> grant execute permissions as an optimization to avoid unnecessary icache
> invalidations, which as you've seen tends to be a bit of a sore spot.
>
> Between the two faults, we've effectively guaranteed that any
> host-initiated writes to the PA are visible to the guest on both the I
> and D side. Any CMOs for making guest-initiated writes coherent after
> the translation fault are the sole responsibility of the guest.
>
Nice, thanks a lot for the explanation.
Thanks,
Gavin
2023-09-04 7:28 [PATCH] KVM: arm64: Fix soft-lockup on relaxing PTE permission Gavin Shan
2023-09-04 8:04 ` Oliver Upton
2023-09-05 0:06 ` Gavin Shan
2023-09-05 18:06 ` Oliver Upton
2023-09-05 22:26 ` Gavin Shan [this message]
2023-09-06 16:29 ` Oliver Upton
2023-09-19 6:41 ` Gavin Shan
2023-09-04 8:22 ` Marc Zyngier
2023-09-05 1:44 ` Gavin Shan