Message-ID: <655de93b-26cf-4588-aec5-9d0eba997c4e@linux.intel.com>
Date: Thu, 7 Nov 2024 14:46:24 -0500
X-Mailing-List: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v7 5/5] perf: Correct perf sampling with guest VMs
To: Colton Lewis, kvm@vger.kernel.org
Cc: Oliver Upton, Sean Christopherson, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Will Deacon,
 Russell King, Catalin Marinas, Michael Ellerman, Nicholas Piggin,
 Christophe Leroy, Naveen N Rao, Heiko Carstens, Vasily Gorbik,
 Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
 Thomas Gleixner, Borislav Petkov, Dave Hansen, x86@kernel.org,
 "H . Peter Anvin", linux-perf-users@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
References: <20241107190336.2963882-1-coltonlewis@google.com>
 <20241107190336.2963882-6-coltonlewis@google.com>
From: "Liang, Kan"
In-Reply-To: <20241107190336.2963882-6-coltonlewis@google.com>

On 2024-11-07 2:03 p.m., Colton Lewis wrote:
> Previously any PMU overflow interrupt that fired while a VCPU was
> loaded was recorded as a guest event whether it truly was or not. This
> resulted in nonsense perf recordings that did not honor
> perf_event_attr.exclude_guest and recorded guest IPs where it should
> have recorded host IPs.
>
> Rework the sampling logic to only record guest samples for events with
> exclude_guest = 0. This way any host-only events with exclude_guest
> set will never see unexpected guest samples. The behaviour of events
> with exclude_guest = 0 is unchanged.
>
> Note that events configured to sample both host and guest may still
> misattribute a PMI that arrived in the host as a guest event depending
> on KVM arch and vendor behavior.
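
[Illustration, not part of the patch] The rule described in the commit
message above can be sketched as a small userspace model. This is only a
sketch under stated assumptions: struct model_event, struct model_cpu and
the model_*() helpers are hypothetical stand-ins invented for this note,
and the constant values are copied from the kernel headers as understood
at the time of writing. The arch-specific extras (for example whatever
x86's common_misc_flags() contributes) are deliberately left out.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's values; double-check against your tree. */
#define PERF_GUEST_ACTIVE		0x01
#define PERF_GUEST_USER			0x02

#define PERF_RECORD_MISC_KERNEL		1
#define PERF_RECORD_MISC_USER		2
#define PERF_RECORD_MISC_GUEST_KERNEL	4
#define PERF_RECORD_MISC_GUEST_USER	5

/* Hypothetical: the one perf_event attribute the rule consults. */
struct model_event {
	bool exclude_guest;
};

/* Hypothetical: state the kernel would read from KVM callbacks and pt_regs. */
struct model_cpu {
	unsigned int guest_state;	/* what perf_guest_state() would return */
	bool host_user_mode;		/* what user_mode(regs) would return */
	unsigned long host_ip;		/* host IP taken from regs */
	unsigned long guest_ip;		/* what perf_guest_get_ip() would return */
};

/* Guest data is recorded only if the event allows it and a guest is loaded. */
static bool model_should_sample_guest(const struct model_event *ev,
				      const struct model_cpu *cpu)
{
	return !ev->exclude_guest && cpu->guest_state;
}

/* Models the generic guest-flags fallback added to include/linux/perf_event.h. */
static unsigned long model_guest_misc_flags(const struct model_cpu *cpu)
{
	if (!(cpu->guest_state & PERF_GUEST_ACTIVE))
		return 0;

	if (cpu->guest_state & PERF_GUEST_USER)
		return PERF_RECORD_MISC_GUEST_USER;

	return PERF_RECORD_MISC_GUEST_KERNEL;
}

/* Guest flags for guest-capable events, host flags for everything else. */
static unsigned long model_misc_flags(const struct model_event *ev,
				      const struct model_cpu *cpu)
{
	if (model_should_sample_guest(ev, cpu))
		return model_guest_misc_flags(cpu);

	return cpu->host_user_mode ? PERF_RECORD_MISC_USER
				   : PERF_RECORD_MISC_KERNEL;
}

/* The sampled IP follows the same decision as the misc flags. */
static unsigned long model_instruction_pointer(const struct model_event *ev,
					       const struct model_cpu *cpu)
{
	if (model_should_sample_guest(ev, cpu))
		return cpu->guest_ip;

	return cpu->host_ip;
}
```

In this model an event with exclude_guest set always gets host flags and
the host IP, even while guest_state is non-zero. An event with
exclude_guest clear gets guest data whenever guest_state is non-zero,
which is why the misattribution mentioned in the last paragraph of the
commit message is still possible for such events.
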
>
> Signed-off-by: Colton Lewis
> Acked-by: Mark Rutland
> Reviewed-by: Oliver Upton
> ---

Acked-by: Kan Liang

Thanks,
Kan

>  arch/arm64/include/asm/perf_event.h |  4 ----
>  arch/arm64/kernel/perf_callchain.c  | 28 ----------------------------
>  arch/x86/events/core.c              | 16 ++++------------
>  include/linux/perf_event.h          | 21 +++++++++++++++++++--
>  kernel/events/core.c                | 21 +++++++++++++++++----
>  5 files changed, 40 insertions(+), 50 deletions(-)
>
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index 31a5584ed423..ee45b4e77347 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -10,10 +10,6 @@
>  #include
>
>  #ifdef CONFIG_PERF_EVENTS
> -struct pt_regs;
> -extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
> -extern unsigned long perf_arch_misc_flags(struct pt_regs *regs);
> -#define perf_arch_misc_flags(regs)	perf_misc_flags(regs)
>  #define perf_arch_bpf_user_pt_regs(regs) &regs->user_regs
>  #endif
>
> diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
> index 01a9d08fc009..9b7f26b128b5 100644
> --- a/arch/arm64/kernel/perf_callchain.c
> +++ b/arch/arm64/kernel/perf_callchain.c
> @@ -38,31 +38,3 @@ void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
>
>  	arch_stack_walk(callchain_trace, entry, current, regs);
>  }
> -
> -unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
> -{
> -	if (perf_guest_state())
> -		return perf_guest_get_ip();
> -
> -	return instruction_pointer(regs);
> -}
> -
> -unsigned long perf_arch_misc_flags(struct pt_regs *regs)
> -{
> -	unsigned int guest_state = perf_guest_state();
> -	int misc = 0;
> -
> -	if (guest_state) {
> -		if (guest_state & PERF_GUEST_USER)
> -			misc |= PERF_RECORD_MISC_GUEST_USER;
> -		else
> -			misc |= PERF_RECORD_MISC_GUEST_KERNEL;
> -	} else {
> -		if (user_mode(regs))
> -			misc |= PERF_RECORD_MISC_USER;
> -		else
> -			misc |= PERF_RECORD_MISC_KERNEL;
> -	}
> -
> -	return misc;
> -}
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 9fdc5fa22c66..d85e12ca4263 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -3005,9 +3005,6 @@ static unsigned long code_segment_base(struct pt_regs *regs)
>
>  unsigned long perf_arch_instruction_pointer(struct pt_regs *regs)
>  {
> -	if (perf_guest_state())
> -		return perf_guest_get_ip();
> -
>  	return regs->ip + code_segment_base(regs);
>  }
>
> @@ -3035,17 +3032,12 @@ unsigned long perf_arch_guest_misc_flags(struct pt_regs *regs)
>
>  unsigned long perf_arch_misc_flags(struct pt_regs *regs)
>  {
> -	unsigned int guest_state = perf_guest_state();
>  	unsigned long misc = common_misc_flags(regs);
>
> -	if (guest_state) {
> -		misc |= perf_arch_guest_misc_flags(regs);
> -	} else {
> -		if (user_mode(regs))
> -			misc |= PERF_RECORD_MISC_USER;
> -		else
> -			misc |= PERF_RECORD_MISC_KERNEL;
> -	}
> +	if (user_mode(regs))
> +		misc |= PERF_RECORD_MISC_USER;
> +	else
> +		misc |= PERF_RECORD_MISC_KERNEL;
>
>  	return misc;
>  }
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 772ad352856b..368ea0e9577c 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1655,8 +1655,9 @@ extern void perf_tp_event(u16 event_type, u64 count, void *record,
>  			  struct task_struct *task);
>  extern void perf_bp_event(struct perf_event *event, void *data);
>
> -extern unsigned long perf_misc_flags(struct pt_regs *regs);
> -extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
> +extern unsigned long perf_misc_flags(struct perf_event *event, struct pt_regs *regs);
> +extern unsigned long perf_instruction_pointer(struct perf_event *event,
> +					      struct pt_regs *regs);
>
>  #ifndef perf_arch_misc_flags
>  # define perf_arch_misc_flags(regs) \
> @@ -1667,6 +1668,22 @@ extern unsigned long perf_instruction_pointer(struct pt_regs *regs);
>  # define perf_arch_bpf_user_pt_regs(regs) regs
>  #endif
>
> +#ifndef perf_arch_guest_misc_flags
> +static inline unsigned long perf_arch_guest_misc_flags(struct pt_regs *regs)
> +{
> +	unsigned long guest_state = perf_guest_state();
> +
> +	if (!(guest_state & PERF_GUEST_ACTIVE))
> +		return 0;
> +
> +	if (guest_state & PERF_GUEST_USER)
> +		return PERF_RECORD_MISC_GUEST_USER;
> +	else
> +		return PERF_RECORD_MISC_GUEST_KERNEL;
> +}
> +# define perf_arch_guest_misc_flags(regs)	perf_arch_guest_misc_flags(regs)
> +#endif
> +
>  static inline bool has_branch_stack(struct perf_event *event)
>  {
>  	return event->attr.sample_type & PERF_SAMPLE_BRANCH_STACK;
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2c44ffd6f4d8..c62164a2ff23 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7022,13 +7022,26 @@ void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
>  EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks);
>  #endif
>
> -unsigned long perf_misc_flags(struct pt_regs *regs)
> +static bool should_sample_guest(struct perf_event *event)
>  {
> +	return !event->attr.exclude_guest && perf_guest_state();
> +}
> +
> +unsigned long perf_misc_flags(struct perf_event *event,
> +			      struct pt_regs *regs)
> +{
> +	if (should_sample_guest(event))
> +		return perf_arch_guest_misc_flags(regs);
> +
>  	return perf_arch_misc_flags(regs);
>  }
>
> -unsigned long perf_instruction_pointer(struct pt_regs *regs)
> +unsigned long perf_instruction_pointer(struct perf_event *event,
> +				       struct pt_regs *regs)
>  {
> +	if (should_sample_guest(event))
> +		return perf_guest_get_ip();
> +
>  	return perf_arch_instruction_pointer(regs);
>  }
>
> @@ -7849,7 +7862,7 @@ void perf_prepare_sample(struct perf_sample_data *data,
>  	__perf_event_header__init_id(data, event, filtered_sample_type);
>
>  	if (filtered_sample_type & PERF_SAMPLE_IP) {
> -		data->ip = perf_instruction_pointer(regs);
> +		data->ip = perf_instruction_pointer(event, regs);
>  		data->sample_flags |= PERF_SAMPLE_IP;
>  	}
>
> @@ -8013,7 +8026,7 @@ void perf_prepare_header(struct perf_event_header *header,
>  {
>  	header->type = PERF_RECORD_SAMPLE;
>  	header->size = perf_sample_data_size(data, event);
> -	header->misc = perf_misc_flags(regs);
> +	header->misc = perf_misc_flags(event, regs);
>
>  	/*
>  	 * If you're adding more sample types here, you likely need to do