From: Mark Rutland <mark.rutland@arm.com>
To: Oliver Upton <oliver.upton@linux.dev>
Cc: Huang Shijie <shijie@os.amperecomputing.com>,
	maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com,
	yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org,
	pbonzini@redhat.com, peterz@infradead.org, ingo@redhat.com,
	acme@kernel.org, alexander.shishkin@linux.intel.com,
	jolsa@kernel.org, namhyung@kernel.org, irogers@google.com,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	linux-perf-users@vger.kernel.org, patches@amperecomputing.com,
	zwang@amperecomputing.com
Subject: Re: [PATCH] perf/core: fix the bug in the event multiplexing
Date: Wed, 9 Aug 2023 10:22:25 +0100
Message-ID: <ZNNa0abhS53cMNcK@FVFF77S0Q05N>
In-Reply-To: <ZNNNY3MlokEIj4y8@linux.dev>

On Wed, Aug 09, 2023 at 08:25:07AM +0000, Oliver Upton wrote:
> Hi Huang,
> 
> On Wed, Aug 09, 2023 at 09:39:53AM +0800, Huang Shijie wrote:
> > 2.) Root cause.
> > 	There is only 7 counters in my arm64 platform:
> > 	  (one cycle counter) + (6 normal counters)
> > 
> > 	In 1.3 above, we will use 10 event counters.
> > 	Since we only have 7 counters, the perf core will trigger
> >        	event multiplexing in hrtimer:
> > 	     merge_sched_in() -->perf_mux_hrtimer_restart() -->
> > 	     perf_rotate_context().
> > 
> >        In the perf_rotate_context(), it does not restore some PMU registers
> >        as context_switch() does.  In context_switch():
> >              kvm_sched_in()  --> kvm_vcpu_pmu_restore_guest()
> >              kvm_sched_out() --> kvm_vcpu_pmu_restore_host()
> > 
> >        So we got wrong result.
> 
> This is a rather vague description of the problem. AFAICT, the
> issue here is on VHE systems we wind up getting the EL0 count
> enable/disable bits backwards when entering the guest, which is
> corroborated by the data you have below.

Yep; IIUC the issue here is that when we take an IRQ from a guest and reprogram
the PMU in the IRQ handler, the IRQ handler will program the PMU with
appropriate host/guest/user/etc filters for a *host* context, and then we'll
return into the guest without reconfiguring the event filtering for a
*guest* context.

That can happen for perf_rotate_context(), or when we install an event into a
running context, as that'll happen via an IPI.

> > +void arch_perf_rotate_pmu_set(void)
> > +{
> > +	if (is_guest())
> > +		kvm_vcpu_pmu_restore_guest(NULL);
> > +	else
> > +		kvm_vcpu_pmu_restore_host(NULL);
> > +}
> > +
> 
> This sort of hook is rather nasty, and I'd strongly prefer a solution
> that's confined to KVM. I don't think the !is_guest() branch is
> necessary at all. Regardless of how the pmu context is changed, we need
> to go through vcpu_put() before getting back out to userspace.
> 
> We can check for a running vCPU (ick) from kvm_set_pmu_events() and either
> do the EL0 bit flip there or make a request on the vCPU to call
> kvm_vcpu_pmu_restore_guest() immediately before reentering the guest.
> I'm slightly leaning towards the latter, unless anyone has a better idea
> here.

The latter sounds reasonable to me.

I suspect we need to take special care here to make sure we leave *all* events
in a good state when re-entering the guest or if we get to kvm_sched_out()
after *removing* an event via an IPI -- it'd be easy to mess either case up and
leave some events in a bad state.

Thanks,
Mark.
