From: Marc Zyngier <maz@kernel.org>
To: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: catalin.marinas@arm.com, will@kernel.org,
linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
scott@os.amperecomputing.com, keyur@os.amperecomputing.com
Subject: Re: [PATCH 2/3] KVM: arm64: nv: Emulate ISTATUS when emulated timers are fired.
Date: Tue, 10 Jan 2023 10:46:42 +0000
Message-ID: <86eds2oeel.wl-maz@kernel.org>
In-Reply-To: <31e49612-443b-888a-9730-f4e017251130@os.amperecomputing.com>

On Tue, 10 Jan 2023 08:41:44 +0000,
Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
>
>
> On 02-01-2023 05:16 pm, Marc Zyngier wrote:
> > On Thu, 29 Dec 2022 13:53:15 +0000,
> > Marc Zyngier <maz@kernel.org> wrote:
> >>
> >> On Wed, 24 Aug 2022 07:03:03 +0100,
> >> Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
> >>>
> >>> The Guest-Hypervisor forwards a timer interrupt to the Guest-Guest only
> >>> if the timer is enabled, unmasked, and the ISTATUS bit of CNTV_CTL_EL0
> >>> is set for a loaded timer.
> >>>
> >>> In the NV2 implementation, the Host-Hypervisor does not emulate the
> >>> ISTATUS bit when forwarding the emulated vtimer interrupt to the
> >>> Guest-Hypervisor. As a result, the Guest-Hypervisor drops the interrupt,
> >>> whereas the Host-Hypervisor has marked it as active and expects the
> >>> Guest-Guest to consume and acknowledge it. Because of this, some
> >>> Guest-Guest vCPUs get stuck in the idle thread and RCU soft lockups are
> >>> seen.
> >>>
> >>> This issue is not seen in the NV1 case, since the CNTV_CTL_EL0 read
> >>> trap handler emulates the ISTATUS bit.
> >>>
> >>> Add code to set/emulate ISTATUS when the emulated timers fire.
> >>>
> >>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
> >>> ---
> >>> arch/arm64/kvm/arch_timer.c | 5 +++++
> >>> 1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> >>> index 27a6ec46803a..0b32d943d2d5 100644
> >>> --- a/arch/arm64/kvm/arch_timer.c
> >>> +++ b/arch/arm64/kvm/arch_timer.c
> >>> @@ -63,6 +63,7 @@ static u64 kvm_arm_timer_read(struct kvm_vcpu *vcpu,
> >>>  			      struct arch_timer_context *timer,
> >>>  			      enum kvm_arch_timer_regs treg);
> >>>  static bool kvm_arch_timer_get_input_level(int vintid);
> >>> +static u64 read_timer_ctl(struct arch_timer_context *timer);
> >>>  
> >>>  static struct irq_ops arch_timer_irq_ops = {
> >>>  	.get_input_level = kvm_arch_timer_get_input_level,
> >>> @@ -356,6 +357,8 @@ static enum hrtimer_restart kvm_hrtimer_expire(struct hrtimer *hrt)
> >>>  		return HRTIMER_RESTART;
> >>>  	}
> >>>  
> >>> +	/* Timer emulated, emulate ISTATUS also */
> >>> +	timer_set_ctl(ctx, read_timer_ctl(ctx));
> >>
> >> Why should we do that for non-NV2 configurations?
> >>
> >>>  	kvm_timer_update_irq(vcpu, true, ctx);
> >>>  	return HRTIMER_NORESTART;
> >>>  }
> >>> @@ -458,6 +461,8 @@ static void timer_emulate(struct arch_timer_context *ctx)
> >>>  	trace_kvm_timer_emulate(ctx, should_fire);
> >>>  
> >>>  	if (should_fire != ctx->irq.level) {
> >>> +		/* Timer emulated, emulate ISTATUS also */
> >>> +		timer_set_ctl(ctx, read_timer_ctl(ctx));
> >>>  		kvm_timer_update_irq(ctx->vcpu, should_fire, ctx);
> >>>  		return;
> >>>  	}
> >>
> >> I'm not overly keen on this. Yes, we can set the status bit there. But
> >> conversely, the bit will not get cleared when the guest reprograms the
> >> timer, and will take a full exit/entry cycle for it to appear.
> >>
> >> Ergo, the architecture is buggy as memory (the VNCR page) cannot be
> >> used to emulate something as dynamic as a timer.
> >>
> >> It is only with FEAT_ECV that we can solve this correctly by trapping
> >> the counter/timer accesses and emulating them for the guest hypervisor.
> >> I'd rather we add support for that, as I expect all the FEAT_NV2
> >> implementations to have it (and hopefully FEAT_FGT as well).
> >
> > So I went ahead and implemented some very basic FEAT_ECV support to
> > correctly emulate the timers (trapping the CTL/CVAL accesses).
> >
> > Performance dropped like a rock (~30% extra overhead) for L2
> > exit-heavy workloads that are terminated in userspace, such as virtio.
> > For those workloads, vcpu_{load,put}() in L1 now generate extra traps,
> > as we save/restore the timer context, and this is enough to make
> > things visibly slower, even on a pretty fast machine.
> >
> > I managed to get *some* performance back by satisfying CTL/CVAL reads
> > very early on the exit path (a pretty common theme with NV). Which
> > means we end up needing something like what you have -- only a bit
> > more complete. I came up with the following:
>
> Yes, this is more appropriate; it moves the ISTATUS update to a single place.
> >
> > diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> > index 4945c5b96f05..a198a6211e2a 100644
> > --- a/arch/arm64/kvm/arch_timer.c
> > +++ b/arch/arm64/kvm/arch_timer.c
> > @@ -450,6 +450,25 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
> >  {
> >  	int ret;
> >  
> > +	/*
> > +	 * Paper over NV2 brokenness by publishing the interrupt status
> > +	 * bit. This still results in a poor quality of emulation (guest
> > +	 * writes will have no effect until the next exit).
> > +	 *
> > +	 * But hey, it's fast, right?
> > +	 */
> > +	if (vcpu_has_nv2(vcpu) && is_hyp_ctxt(vcpu) &&
> > +	    (timer_ctx == vcpu_vtimer(vcpu) || timer_ctx == vcpu_ptimer(vcpu))) {
> > +		u32 ctl = timer_get_ctl(timer_ctx);
> > +
> > +		if (new_level)
> > +			ctl |= ARCH_TIMER_CTRL_IT_STAT;
> > +		else
> > +			ctl &= ~ARCH_TIMER_CTRL_IT_STAT;
> > +
> > +		timer_set_ctl(timer_ctx, ctl);
> > +	}
> > +
> >  	timer_ctx->irq.level = new_level;
> >  	trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_ctx->irq.irq,
> >  				   timer_ctx->irq.level);
> >
> > which reports the interrupt state in all cases.
> >
> > Does this work for you?
>
> This works.
> Are you going to pull this diff/patch into your 6.2-nv tree, or do you
> want me to send an updated patch?

I already have this in the patch titled:

  KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state

and the result gets used by:

  KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV is on

(not pasting the SHA1s as I'm still fixing a few nits here and there,
and the commit IDs will change).

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.
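
As a rough illustration of the behaviour discussed in this thread, here is a
minimal standalone C sketch. It is not kernel code: struct toy_timer and all
toy_* functions are hypothetical names made up for this example, and only the
ENABLE/IMASK/ISTATUS bit positions are taken from the architecture. It assumes
that the host publishes the interrupt level into an in-memory copy of the
control register, that the guest hypervisor reads that copy without trapping,
and that a guest write to the compare value likewise lands in memory without
the host noticing until the next exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit layout of CNTV_CTL_EL0 / CNTP_CTL_EL0. */
#define TIMER_CTL_ENABLE  (UINT32_C(1) << 0)
#define TIMER_CTL_IMASK   (UINT32_C(1) << 1)
#define TIMER_CTL_ISTATUS (UINT32_C(1) << 2)

/* Hypothetical stand-in for one timer's in-memory (VNCR-like) state. */
struct toy_timer {
	uint32_t ctl;    /* control register image read by the guest hypervisor */
	uint64_t cval;   /* compare value written by the guest hypervisor */
	bool irq_level;  /* interrupt line level as tracked by the host */
};

/* Host side: the timer condition is met when enabled and counter >= cval. */
bool toy_should_fire(const struct toy_timer *t, uint64_t now)
{
	return (t->ctl & TIMER_CTL_ENABLE) && now >= t->cval;
}

/*
 * Host side: publish the new line level in the in-memory ISTATUS bit,
 * in the spirit of the kvm_timer_update_irq() change quoted above.
 */
void toy_update_irq(struct toy_timer *t, bool new_level)
{
	assert(t != NULL);

	if (new_level)
		t->ctl |= TIMER_CTL_ISTATUS;
	else
		t->ctl &= ~TIMER_CTL_ISTATUS;

	t->irq_level = new_level;
}

/*
 * Guest hypervisor side: forward the interrupt only if the timer is
 * enabled, unmasked and ISTATUS is set in the image it reads.
 */
bool toy_guest_forwards(const struct toy_timer *t)
{
	uint32_t mask = TIMER_CTL_ENABLE | TIMER_CTL_IMASK | TIMER_CTL_ISTATUS;

	return (t->ctl & mask) == (TIMER_CTL_ENABLE | TIMER_CTL_ISTATUS);
}

/*
 * Guest hypervisor side: reprogram the timer. This only touches memory,
 * so ISTATUS is NOT recomputed here; it stays stale until the host
 * re-evaluates the timer (modelled by calling toy_update_irq() again).
 */
void toy_guest_reprogram(struct toy_timer *t, uint64_t new_cval)
{
	t->cval = new_cval;
}
```

With an enabled, unmasked timer whose compare value has passed, calling
toy_update_irq(&t, toy_should_fire(&t, now)) makes toy_guest_forwards() return
true. If the guest then calls toy_guest_reprogram() with a deadline in the
future, toy_guest_forwards() keeps returning true until the host runs
toy_update_irq() again, which is the "guest writes will have no effect until
the next exit" limitation noted in the patch comment.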