All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Liang, Kan" <kan.liang@linux.intel.com>
To: "Wang, Wei W" <wei.w.wang@intel.com>,
	"Li, Xiaoyao" <xiaoyao.li@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	"Christopherson,, Sean" <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Liang, Kan" <kan.liang@intel.com>,
	"Kleen, Andi" <andi.kleen@intel.com>
Cc: "linux-perf-users@vger.kernel.org"
	<linux-perf-users@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [RFC PATCH 0/2] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT
Date: Fri, 16 Sep 2022 09:27:07 -0400	[thread overview]
Message-ID: <da8fe1d8-8cce-7378-18e8-41fd7009ff8d@linux.intel.com> (raw)
In-Reply-To: <CY5PR11MB6365EE3B1C4BCED02A1E40D8DC489@CY5PR11MB6365.namprd11.prod.outlook.com>



On 2022-09-15 10:30 p.m., Wang, Wei W wrote:
> On Thursday, September 15, 2022 11:43 PM, Liang, Kan wrote:
>> On 2022-09-15 10:39 a.m., Wang, Wei W wrote:
>>> On Thursday, September 15, 2022 9:55 PM Liang, Kan wrote:
>>>> On 2022-09-14 10:46 p.m., Wang, Wei W wrote:
>>>>> On Thursday, September 15, 2022 4:26 AM, Liang, Kan wrote:
>>>>>> The perf_event_disable() eventually invokes the intel_pt_stop().
>>>>>> We already expose the intel_pt_stop()/cpu_emergency_stop_pt() to
>>>>>> other modules. I don't think we have to use the perf_event_disable().
>>>>>> Also, the
>>>>>> perf_event_disable() requires extra codes.
>>>>>>
>>>>>> I went through the discussions. I agree with Sean's suggestion.
>>>>>> We should only put the logic in the KVM but all the MSR access
>>>>>> details into the PT driver.
>>>>>
>>>>> Even the driver itself doesn’t drive the save/restore of the MSRs,
>>>>> it is drived
>>>> by perf.
>>>>
>>>> It through perf_event, not driven by perf_event. The perf_event
>>>> generic code never knows when should invokes each driver to
>>>> save/restore information. It should be driven by the other subsystem e.g.,
>> scheduler.
>>>
>>> Yes. The cpu scheduler does this via the perf subsystem, though.
>>>
>>>>
>>>> For this case, KVM should drive the save/restore, and the PT driver
>>>> eventually does all the MSR access details.
>>>>
>>>>> 1. If we make KVM a user of perf, we should do this via
>>>> perf_event_disable/enable_*.
>>>>> 2. If we make KVM an alternative to perf (i.e. have direct control
>>>>> over PMU HW), we can do this via driver interfaces like perf.
>>>>> Per my experience, we should go for 1. Probably need Peter's
>>>>> opinions on
>>>> this.
>>>>>
>>>>
>>>> For 1, the perf_event_disable/enable_* are not enough. They don't
>>>> save/restore MSRs.
>>>
>>> perf_event_disable will go through perf to call pt_event_stop which saves
>> the related MSRs, right?
>>
>> I don't think so. The pt_event_stop() doesn't save all the
>> MSR_IA32_RTIT_* MSRs.
> 
> Not all the MSRs are required to be saved. In general, pt_event_stop should have
> saved all the MSRs required for an event switching. Otherwise the general usages
> of PT have been broken. To be more precise, the following MSRs are not saved by
> pt_event_stop, but I don’t see they are required to be saved:
> 
> - MSR_IA32_RTIT_CR3_MATCH: I don’t see it is used by perf. 
> Seems like KVM saved an MSR that's not even used by the host.
> 
> - Address range MSRs (MSR_IA32_RTIT_ADDR0_A etc.): Those are provided by s/w and not updated by h/w.
> So they're just set to MSRs when event gets scheduled in. There is no need to save. 
>

OK. I think you need a clean-up patch to fix them first.


>>
>>> (if so, what large changes did you mean?)
>>>
>>>> If we go to this way, we have to introduce a new generic interface to
>>>> ask each driver to save/restore their MSRs when the guest is
>>>> entering/exiting. We'd better combine the new interface with the
>>>> existing
>>>> perf_guest_get_msrs() of the core driver.
>>>> I think that's an ideal solution, but requires big changes in the code.
>>>>
>>>> 2 is the current KVM implementation. See pt_save_msr()/pt_load_msr().
>>>> I don't think it's a right way. We'd better fix it.
>>>>
>>>> The suggestion should be 3. The KVM notify the PT driver via the
>>>> interface provided by PT. The PT driver save/restore all the registers.
>>>> I think it's an acceptable solution with small code changes.
>>>
>>> This looks like we just relocate the save/restore functions to the PT driver
>> and KVM still directly call them - still not going through perf's management.
>> Imagine every user operates on the pmu h/w directly like this, things would be
>> a mess.
>>>
>>
>>
>> The pt_event_stop() and the proposed interface still manipulate the PT event
>> pt->handle.event. The event status is updated as well. It's still under control of
>> the perf_event.
> 
> Did you mean to handle the PT event in the proposed driver API? Event status is just
> one of the things. There are other things if we want to make it complete for this,
> e.g. event->oncpu = -1, and eventually seems we will re-implement perf_event_disable_*.
>

As my understand, perf always check the status first. If it's a stopped
or inactivated event, I don't think event->oncpu will be touched. That's
why I think the proposed driver API should be acceptable.

> Btw, Xiaoyao has made it work with perf_event_disable_local, and don’t have that many changes.
> If necessary, we can post the 2nd version out to double check.
>

I'm not worry about which ways (either perf_event_disable_local() or the
proposed PT driver API) are chosen to stop the PT. If the existing
perf_event interfaces can meet your requirement, that's perfect.

My real concern is the pt_save_msr()/pt_load_msr(). I don't think it's a
job for KVM. See atomic_switch_perf_msrs(). It is the perf core driver
rather than KVM that tells which MSRs should be saved/restored in VMCS.
We should do the same thing for PT. (Actually, I think we already
encounter issues with the current KVM-dominated method. KVM
saves/restores unnecessary MSRs. Right?)

To do so, I think there may be two ways.
- Since MSRs have to be switched for both PT and core drivers, it sounds
reasonable to provide a new generic interface in the perf_event. The new
interface is to tell KVM which MSRs should be saved/restored. Then KVM
can decide to save/restore via VMCS or direct MSR access. I suspect this
way requires big change, but it will benefit all the drivers which have
similar requirements.
- The proposed driver API. The MSRs are saved/restored in the PT driver.


Thanks,
Kan

  reply	other threads:[~2022-09-16 13:27 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-25  8:56 [RFC PATCH 0/2] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT Xiaoyao Li
2022-08-25  8:56 ` [RFC PATCH 1/2] perf/x86/intel/pt: Introduce intel_pt_{stop,resume}() Xiaoyao Li
2022-08-25 15:23   ` Sean Christopherson
2022-08-25  8:56 ` [RFC PATCH 2/2] KVM: VMX: Stop/resume host PT before/after VM entry when PT_MODE_HOST_GUEST Xiaoyao Li
2022-08-25 15:34   ` Sean Christopherson
2022-08-25 15:45     ` Xiaoyao Li
2022-08-25 15:59       ` Sean Christopherson
2022-08-26  6:32         ` Xiaoyao Li
2022-08-26 15:08           ` Sean Christopherson
2022-08-29  7:49 ` [RFC PATCH 0/2] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT Wang, Wei W
2022-08-29  7:49   ` Wang, Wei W
2022-08-29 17:33   ` Sean Christopherson
2022-08-30  6:02     ` Wang, Wei W
2022-09-08  7:25   ` Xiaoyao Li
2022-09-08  8:53     ` Wang, Wei W
2022-09-14  4:15   ` Xiaoyao Li
2022-09-14  6:16     ` Wang, Wei W
2022-09-14 20:25       ` Liang, Kan
2022-09-15  2:46         ` Wang, Wei W
2022-09-15 13:54           ` Liang, Kan
2022-09-15 14:39             ` Wang, Wei W
2022-09-15 15:42               ` Liang, Kan
2022-09-16  2:30                 ` Wang, Wei W
2022-09-16 13:27                   ` Liang, Kan [this message]
2022-09-19 13:46                     ` Wang, Wei W
2022-09-19 14:41                       ` Liang, Kan
2022-09-19 15:22                         ` Wang, Wei W
2022-09-19 15:55                           ` Liang, Kan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=da8fe1d8-8cce-7378-18e8-41fd7009ff8d@linux.intel.com \
    --to=kan.liang@linux.intel.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=andi.kleen@intel.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=namhyung@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=seanjc@google.com \
    --cc=wei.w.wang@intel.com \
    --cc=xiaoyao.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.