From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21548CA9EA0 for ; Fri, 25 Oct 2019 09:00:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EEAE121D7B for ; Fri, 25 Oct 2019 09:00:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2408627AbfJYJAl (ORCPT ); Fri, 25 Oct 2019 05:00:41 -0400 Received: from mga06.intel.com ([134.134.136.31]:14367 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2407738AbfJYJAl (ORCPT ); Fri, 25 Oct 2019 05:00:41 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 25 Oct 2019 02:00:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,228,1569308400"; d="scan'208";a="201754021" Received: from linux.intel.com ([10.54.29.200]) by orsmga003.jf.intel.com with ESMTP; 25 Oct 2019 02:00:40 -0700 Received: from [10.125.252.238] (abudanko-mobl.ccr.corp.intel.com [10.125.252.238]) by linux.intel.com (Postfix) with ESMTP id 013D958042B; Fri, 25 Oct 2019 02:00:37 -0700 (PDT) Subject: [PATCH v5 4/4] perf/core,x86: synchronize PMU task contexts on optimized context switches From: Alexey Budankov To: Peter Zijlstra Cc: Arnaldo Carvalho de Melo , Ingo Molnar , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Andi Kleen , Kan Liang , Stephane Eranian , Ian Rogers , Song Liu , linux-kernel References: <6fa20503-b5ad-16c7-260e-5243509176bc@linux.intel.com> Organization: Intel Corp. Message-ID: Date: Fri, 25 Oct 2019 12:00:36 +0300 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <6fa20503-b5ad-16c7-260e-5243509176bc@linux.intel.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Install Intel specific PMU task context synchronization adapter and extend optimized context switch path with PMU specific task context synchronization to fix LBR callstack virtualization on context switches. Signed-off-by: Alexey Budankov --- arch/x86/events/intel/core.c | 7 +++++++ kernel/events/core.c | 13 ++++++++++++- 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index bbf6588d47ee..dc64b16e6b71 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3820,6 +3820,12 @@ static void intel_pmu_sched_task(struct perf_event_context *ctx, intel_pmu_lbr_sched_task(ctx, sched_in); } +static void intel_pmu_swap_task_ctx(struct perf_event_context *prev, + struct perf_event_context *next) +{ + intel_pmu_lbr_swap_task_ctx(prev, next); +} + static int intel_pmu_check_period(struct perf_event *event, u64 value) { return intel_pmu_has_bts_period(event, value) ? -EINVAL : 0; @@ -3955,6 +3961,7 @@ static __initconst const struct x86_pmu intel_pmu = { .guest_get_msrs = intel_guest_get_msrs, .sched_task = intel_pmu_sched_task, + .swap_task_ctx = intel_pmu_swap_task_ctx, .check_period = intel_pmu_check_period, diff --git a/kernel/events/core.c b/kernel/events/core.c index f9a5d4356562..ed31aa849161 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -3204,10 +3204,21 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn, raw_spin_lock(&ctx->lock); raw_spin_lock_nested(&next_ctx->lock, SINGLE_DEPTH_NESTING); if (context_equiv(ctx, next_ctx)) { + struct pmu *pmu = ctx->pmu; + WRITE_ONCE(ctx->task, next); WRITE_ONCE(next_ctx->task, task); - swap(ctx->task_ctx_data, next_ctx->task_ctx_data); + /* + * PMU specific parts of task perf context can require + * additional synchronization. As an example of such + * synchronization see implementation details of Intel + * LBR call stack data profiling; + */ + if (pmu->swap_task_ctx) + pmu->swap_task_ctx(ctx, next_ctx); + else + swap(ctx->task_ctx_data, next_ctx->task_ctx_data); /* * RCU_INIT_POINTER here is safe because we've not -- 2.20.1