Message-ID: <655de93b-26cf-4588-aec5-9d0eba997c4e@linux.intel.com>
Date: Thu, 7 Nov 2024 14:46:24 -0500
Subject: Re: [PATCH v7 5/5] perf: Correct perf sampling with guest VMs
To: Colton Lewis, kvm@vger.kernel.org
Cc: Oliver Upton, Sean Christopherson, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin,
 Jiri Olsa, Ian Rogers, Adrian Hunter, Will Deacon, Russell King,
 Catalin Marinas, Michael Ellerman, Nicholas Piggin, Christophe Leroy,
 Naveen N Rao, Heiko Carstens,
 Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
 Thomas Gleixner, Borislav Petkov, Dave Hansen, x86@kernel.org,
 "H . Peter Anvin", linux-perf-users@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
References: <20241107190336.2963882-1-coltonlewis@google.com>
 <20241107190336.2963882-6-coltonlewis@google.com>
From: "Liang, Kan"
In-Reply-To: <20241107190336.2963882-6-coltonlewis@google.com>

On 2024-11-07 2:03 p.m., Colton Lewis wrote:
> Previously any PMU overflow interrupt that fired while a VCPU was
> loaded was recorded as a guest event whether it truly was or not. This
> resulted in nonsense perf recordings that did not honor
> perf_event_attr.exclude_guest and recorded guest IPs where it should
> have recorded host IPs.
>
> Rework the sampling logic to only record guest samples for events with
> exclude_guest = 0. This way any host-only events with exclude_guest
> set will never see unexpected guest samples. The behaviour of events
> with exclude_guest = 0 is unchanged.
>
> Note that events configured to sample both host and guest may still
> misattribute a PMI that arrived in the host as a guest event depending
> on KVM arch and vendor behavior.
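The rule described in the commit message can be modelled outside the
kernel. What follows is a minimal userspace sketch, not the kernel code:
every model_* and MODEL_* name is a hypothetical stand-in, the numeric
values are arbitrary, and guest_state is passed in as a parameter instead
of being read through perf_guest_state(). It only illustrates that an
event with exclude_guest set always gets host flags, while an event with
exclude_guest = 0 gets guest flags whenever guest state is reported.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's guest-state bits. */
#define MODEL_GUEST_ACTIVE 0x01u
#define MODEL_GUEST_USER   0x02u

/* Hypothetical stand-ins for the PERF_RECORD_MISC_* values. */
enum model_misc {
	MODEL_MISC_KERNEL       = 1,
	MODEL_MISC_USER         = 2,
	MODEL_MISC_GUEST_KERNEL = 4,
	MODEL_MISC_GUEST_USER   = 5,
};

/* Only the single attribute bit that the sampling decision consults. */
struct model_event {
	bool exclude_guest;
};

/* Guest samples require exclude_guest = 0 and a non-zero guest state. */
static bool model_should_sample_guest(const struct model_event *event,
				      unsigned int guest_state)
{
	return !event->exclude_guest && guest_state;
}

/* Guest flags: 0 if the guest is not active, else guest user or kernel. */
static unsigned int model_guest_misc_flags(unsigned int guest_state)
{
	if (!(guest_state & MODEL_GUEST_ACTIVE))
		return 0;

	if (guest_state & MODEL_GUEST_USER)
		return MODEL_MISC_GUEST_USER;

	return MODEL_MISC_GUEST_KERNEL;
}

/* Host flags unless the event is allowed to see guest samples. */
unsigned int model_misc_flags(const struct model_event *event,
			      unsigned int guest_state, bool host_user_mode)
{
	if (model_should_sample_guest(event, guest_state))
		return model_guest_misc_flags(guest_state);

	return host_user_mode ? MODEL_MISC_USER : MODEL_MISC_KERNEL;
}

/* Usage: a host-only event never reports guest flags. */
void model_self_check(void)
{
	struct model_event host_only = { .exclude_guest = true };
	struct model_event host_and_guest = { .exclude_guest = false };

	assert(model_misc_flags(&host_only, MODEL_GUEST_ACTIVE, false) ==
	       MODEL_MISC_KERNEL);
	assert(model_misc_flags(&host_and_guest, MODEL_GUEST_ACTIVE, false) ==
	       MODEL_MISC_GUEST_KERNEL);
}
```

Passing the event into the decision is the point of the sketch: the
guest state alone cannot tell whether a given event asked for guest
samples.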
>
> Signed-off-by: Colton Lewis
> Acked-by: Mark Rutland
> Reviewed-by: Oliver Upton
> ---

Acked-by: Kan Liang

Thanks,
Kan

>  arch/arm64/include/asm/perf_event.h |  4 ----
>  arch/arm64/kernel/perf_callchain.c  | 28 ----------------------------
>  arch/x86/events/core.c              | 16 ++++------------
>  include/linux/perf_event.h          | 21 +++++++++++++++++++--
>  kernel/events/core.c                | 21 +++++++++++++++++----
>  5 files changed, 40 insertions(+), 50 deletions(-)
>
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index 31a5584ed423..ee45b4e77347 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -10,10 +10,6 @@
>  #include
>
>  #ifdef CONFIG_PERF_EVENTS
> -struct pt_regs;
> -extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
> -extern unsigned long perf_arch_misc_flags(struct pt_regs *regs);
> -#define perf_arch_misc_flags(regs)	perf_misc_flags(regs)
>  #define perf_arch_bpf_user_pt_regs(regs) &regs->user_regs
>  #endif
>
> diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
> index 01a9d08fc009..9b7f26b128b5 100644
> --- a/arch/arm64/kernel/perf_callchain.c
> +++ b/arch/arm64/kernel/perf_callchain.c
> @@ -38,31 +38,3 @@ void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
>
>  	arch_stack_walk(callchain_trace, entry, current, regs);
>  }
> -
> -unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
> -{
> -	if (perf_guest_state())
> -		return perf_guest_get_ip();
> -
> -	return instruction_pointer(regs);
> -}
> -
> -unsigned long perf_arch_misc_flags(struct pt_regs *regs)
> -{
> -	unsigned int guest_state = perf_guest_state();
> -	int misc = 0;
> -
> -	if (guest_state) {
> -		if (guest_state & PERF_GUEST_USER)
> -			misc |= PERF_RECORD_MISC_GUEST_USER;
> -		else
> -			misc |= PERF_RECORD_MISC_GUEST_KERNEL;
> -	} else {
> -		if (user_mode(regs))
> -			misc |= PERF_RECORD_MISC_USER;
> -		else
> -			misc |= PERF_RECORD_MISC_KERNEL;
> -	}
> -
> -	return misc;
> -}
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 9fdc5fa22c66..d85e12ca4263 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -3005,9 +3005,6 @@ static unsigned long code_segment_base(struct pt_regs *regs)
>
>  unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
>  {
> -	if (perf_guest_state())
> -		return perf_guest_get_ip();
> -
>  	return regs->ip + code_segment_base(regs);
>  }
>
> @@ -3035,17 +3032,12 @@ unsigned long perf_arch_guest_misc_flags(struct pt_regs *regs)
>
>  unsigned long perf_arch_misc_flags(struct pt_regs *regs)
>  {
> -	unsigned int guest_state = perf_guest_state();
>  	unsigned long misc = common_misc_flags(regs);
>
> -	if (guest_state) {
> -		misc |= perf_arch_guest_misc_flags(regs);
> -	} else {
> -		if (user_mode(regs))
> -			misc |= PERF_RECORD_MISC_USER;
> -		else
> -			misc |= PERF_RECORD_MISC_KERNEL;
> -	}
> +	if (user_mode(regs))
> +		misc |= PERF_RECORD_MISC_USER;
> +	else
> +		misc |= PERF_RECORD_MISC_KERNEL;
>
>  	return misc;
>  }
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 772ad352856b..368ea0e9577c 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1655,8 +1655,9 @@ extern void perf_tp_event(u16 event_type, u64 count, void *record,
>  			  struct task_struct *task);
>  extern void perf_bp_event(struct perf_event *event, void *data);
>
> -extern unsigned long perf_misc_flags(struct pt_regs *regs);
> -extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
> +extern unsigned long perf_misc_flags(struct perf_event *event, struct pt_regs *regs);
> +extern unsigned long perf_instruction_pointer(struct perf_event *event,
> +					      struct pt_regs *regs);
>
>  #ifndef perf_arch_misc_flags
>  # define perf_arch_misc_flags(regs) \
> @@ -1667,6 +1668,22 @@ extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
>  # define perf_arch_bpf_user_pt_regs(regs) regs
>  #endif
>
> +#ifndef perf_arch_guest_misc_flags
> +static inline unsigned long perf_arch_guest_misc_flags(struct pt_regs *regs)
> +{
> +	unsigned long guest_state = perf_guest_state();
> +
> +	if (!(guest_state & PERF_GUEST_ACTIVE))
> +		return 0;
> +
> +	if (guest_state & PERF_GUEST_USER)
> +		return PERF_RECORD_MISC_GUEST_USER;
> +	else
> +		return PERF_RECORD_MISC_GUEST_KERNEL;
> +}
> +# define perf_arch_guest_misc_flags(regs)	perf_arch_guest_misc_flags(regs)
> +#endif
> +
>  static inline bool has_branch_stack(struct perf_event *event)
>  {
>  	return event->attr.sample_type & PERF_SAMPLE_BRANCH_STACK;
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2c44ffd6f4d8..c62164a2ff23 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7022,13 +7022,26 @@ void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
>  EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks);
>  #endif
>
> -unsigned long perf_misc_flags(struct pt_regs *regs)
> +static bool should_sample_guest(struct perf_event *event)
>  {
> +	return !event->attr.exclude_guest && perf_guest_state();
> +}
> +
> +unsigned long perf_misc_flags(struct perf_event *event,
> +			      struct pt_regs *regs)
> +{
> +	if (should_sample_guest(event))
> +		return perf_arch_guest_misc_flags(regs);
> +
>  	return perf_arch_misc_flags(regs);
>  }
>
> -unsigned long perf_instruction_pointer(struct pt_regs *regs)
> +unsigned long perf_instruction_pointer(struct perf_event *event,
> +				       struct pt_regs *regs)
>  {
> +	if (should_sample_guest(event))
> +		return perf_guest_get_ip();
> +
>  	return perf_arch_instruction_pointer(regs);
>  }
>
> @@ -7849,7 +7862,7 @@ void perf_prepare_sample(struct perf_sample_data *data,
>  	__perf_event_header__init_id(data, event, filtered_sample_type);
>
>  	if (filtered_sample_type & PERF_SAMPLE_IP) {
> -		data->ip = perf_instruction_pointer(regs);
> +		data->ip = perf_instruction_pointer(event, regs);
>  		data->sample_flags |= PERF_SAMPLE_IP;
>  	}
>
> @@ -8013,7 +8026,7 @@ void perf_prepare_header(struct perf_event_header *header,
>  {
>  	header->type = PERF_RECORD_SAMPLE;
>  	header->size = perf_sample_data_size(data, event);
> -	header->misc = perf_misc_flags(regs);
> +	header->misc = perf_misc_flags(event, regs);
>
>  	/*
>  	 * If you're adding more sample types here, you likely need to do