Subject: Re: [PATCH v3 0/2] perf report: Implement visual marker for macro fusion in annotate
To: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org, mingo@redhat.com, alexander.shishkin@linux.intel.com
Cc: Linux-kernel@vger.kernel.org, ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com
References: <1497961330-3666-1-git-send-email-yao.jin@linux.intel.com>
In-Reply-To: <1497961330-3666-1-git-send-email-yao.jin@linux.intel.com>
From: "Jin, Yao"
Message-ID: <66e5087a-fb11-6970-e48c-edec3017f7ea@linux.intel.com>
Date: Thu, 6 Jul 2017 08:42:54 +0800

Hi Arnaldo,

Is this series OK?

Thanks
Jin Yao

On 6/20/2017 8:22 PM, Jin Yao wrote:
> Macro fusion merges two instructions into a single micro-op. Intel
> Core platforms perform this hardware optimization under limited
> circumstances. For example, CMP + JCC can be "fused" and
> executed/retired together. With sampling, however, a sample may land
> sometimes on the JCC and sometimes on the CMP. So the two
> instructions of a fused pair should be considered together.
>
> On Nehalem, fused instruction pairs:
> cmp/test + jcc.
>
> On other, newer CPUs:
> cmp/test/add/sub/and/inc/dec + jcc.
>
> This patch series marks the case clearly by joining the fused
> instruction pair in the arrow of the jump.
>
> For example:
>
>        │   ┌──cmpl   $0x0,argp_program_version_hook
>  81.93 │   ├──je     20
>        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
>        │   │↓ jne    29
>        │   │↓ jmp    43
>  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)
>
> Change-log:
> -----------
> v3: 1. Add a check for Nehalem (CMP, TEST). For other, newer
>        Intel CPUs, check the full set by default (CMP, TEST, ADD,
>        SUB, AND, INC, DEC).
>
>     2. Use Arnaldo's fix to improve the display.
>
> v2: Following Arnaldo's comments, remove the weak function and
>     use an arch-specific function instead to check for a fused
>     instruction pair.
>
> v1: Initial post
>
> Jin Yao (2):
>   perf util: Check for fused instruction
>   perf report: Implement visual marker for macro fusion in annotate
>
>  tools/perf/arch/x86/annotate/instructions.c | 37 +++++++++++++++++++++++++++++
>  tools/perf/ui/browser.c                     | 29 ++++++++++++++++++++++
>  tools/perf/ui/browser.h                     |  2 ++
>  tools/perf/ui/browsers/annotate.c           | 32 +++++++++++++++++++++++++
>  tools/perf/util/annotate.c                  | 17 +++++++++++++
>  tools/perf/util/annotate.h                  |  3 +++
>  6 files changed, 120 insertions(+)
>
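
For readers following the thread, the pairing rule quoted above (cmp/test + jcc on Nehalem, cmp/test/add/sub/and/inc/dec + jcc on newer CPUs) could be sketched roughly as below. This is only an illustration, not the code from the series: the names mnemonic_is(), is_cond_jump(), ins_pair_is_fused() and the is_nehalem flag are hypothetical, and the sketch assumes AT&T-style mnemonics with an optional b/w/l/q size suffix.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if 'ins' is exactly 'base', or 'base' plus a
 * single AT&T operand-size suffix (b, w, l, q). Matching this way keeps
 * "cmpl" but rejects look-alikes such as "cmpxchg" or "andn".
 */
static bool mnemonic_is(const char *ins, const char *base)
{
	size_t n = strlen(base);

	if (strncmp(ins, base, n) != 0)
		return false;
	if (ins[n] == '\0')
		return true;
	return strchr("bwlq", ins[n]) != NULL && ins[n + 1] == '\0';
}

/*
 * Hypothetical helper: treat any 'j...' mnemonic as a conditional jump,
 * except the unconditional jmp and the j*cxz family (assumed not to fuse).
 */
static bool is_cond_jump(const char *ins)
{
	return ins[0] == 'j' &&
	       strncmp(ins, "jmp", 3) != 0 &&
	       strstr(ins, "cxz") == NULL;
}

/*
 * Sketch of the rule from the cover letter: ins1 is the first instruction
 * of the candidate pair, ins2 the one immediately after it.
 */
static bool ins_pair_is_fused(bool is_nehalem, const char *ins1, const char *ins2)
{
	static const char *const ops[] = {
		"cmp", "test",                    /* fusible on Nehalem and newer */
		"add", "sub", "and", "inc", "dec" /* fusible on newer CPUs only */
	};
	size_t count = is_nehalem ? 2 : sizeof(ops) / sizeof(ops[0]);

	assert(ins1 != NULL && ins2 != NULL);

	if (!is_cond_jump(ins2))
		return false;

	for (size_t i = 0; i < count; i++)
		if (mnemonic_is(ins1, ops[i]))
			return true;
	return false;
}
```

With this sketch, ins_pair_is_fused(false, "cmpl", "je") reports a fused pair, matching the cmpl/je lines joined in the example output, while the lock cmpxchg + jne lines are not treated as fused.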