From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8685E15AE0; Thu, 11 Jul 2024 04:48:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.9 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720673294; cv=none; b=ieKdA24koh3C16Z7vECxbtmaWQ5T31zEt3vqjHQ3pqMhXvnmA9e6tgBU6nCd3mpVRFi41GGwvTSfWKN7weuVvq0J0yfC+qOHQ4yG5lFjsLIzFveJP3tFhU5LnAWtnzPOyifix5+vwlmSLW/6JZktHWbHFhd8bQ9D2TSCpu1uZjQ= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720673294; c=relaxed/simple; bh=u2dTTjeoQNgtN4DfnFCRkWsTxvIwIy6d1aj8rzgjFqA=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=RkLJrDNqeGVWzkxv8FcUladYnHDRMwFfYMsNcsVELzRy8jb/A2fov/YMGRX8l+ovQcq4z6guCFk2GNKVAmtGJKUBmhUW96gIvTvgsbupuo9/NpXjY0mJtrC6zFx2MIr0wzjvOElI9+X0TaIY05bjp8xsd8PMkYHTF4bj1FWNBUg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=kWbyB+B+; arc=none smtp.client-ip=192.198.163.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="kWbyB+B+" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1720673292; x=1752209292; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=u2dTTjeoQNgtN4DfnFCRkWsTxvIwIy6d1aj8rzgjFqA=; b=kWbyB+B+NyXlikvyp1HfxMTGNsKpPSP338aPyy9L5GbSbSiK/Z22O7rV kfF/LM37nHvmAYABJrCvA5qdwb8naHsJJ8Iw6s/cZG85COn1z54RDubDh rCb1nzgWgVfHahjdFh3ANAoYIS2R5Mz0diyyC8jHsSDkrvkUPoV61QuDr RNvi6qvB7Z4OYzxB+dAHspiTuXvk2hkq1B9u4JACaqKfGajYadr8QrLpQ FSp4Vjtnwa2lGMN/9zNcMyh/MZldiGFgLF0I8C1ToB63AcyU3s8hGxT+7 dMeDKM1iYPqy7aHW/NXAcGXHVMnV2P7LQZRqFf/D57lTYH8lLjtpmtelU g==; X-CSE-ConnectionGUID: 1M3RMorkSi+h7M6kWZtw/Q== X-CSE-MsgGUID: gRWBfwReTcWJ8JGYUPy6xA== X-IronPort-AV: E=McAfee;i="6700,10204,11129"; a="28696907" X-IronPort-AV: E=Sophos;i="6.09,199,1716274800"; d="scan'208";a="28696907" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by fmvoesa103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Jul 2024 21:48:10 -0700 X-CSE-ConnectionGUID: C2ue7uEWT8SEPqjFMH8zLw== X-CSE-MsgGUID: +AqMfFkeTI23ZWeW64V9Uw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,199,1716274800"; d="scan'208";a="48430937" Received: from unknown (HELO [10.238.4.150]) ([10.238.4.150]) by fmviesa009-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Jul 2024 21:48:08 -0700 Message-ID: Date: Thu, 11 Jul 2024 12:48:05 +0800 Precedence: bulk X-Mailing-List: linux-perf-users@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [Patch v2 3/5] perf x86/topdown: Don't move topdown metrics events when sorting events To: Ian Rogers Cc: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Adrian Hunter , Alexander Shishkin , Kan Liang , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Yongwei Ma , Dapeng Mi References: <20240708144204.839486-1-dapeng1.mi@linux.intel.com> <20240708144204.839486-4-dapeng1.mi@linux.intel.com> <4d39856e-396d-4a48-9ca3-2e1a574f50d7@linux.intel.com> Content-Language: en-US From: "Mi, Dapeng" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 7/10/2024 11:07 PM, Ian Rogers wrote: > On Wed, Jul 10, 2024 at 2:40 AM Mi, Dapeng wrote: >> >> On 7/10/2024 6:37 AM, Ian Rogers wrote: >>> On Mon, Jul 8, 2024 at 9:18 PM Mi, Dapeng wrote: >>>> On 7/8/2024 11:08 PM, Ian Rogers wrote: >>>>> On Mon, Jul 8, 2024 at 12:40 AM Dapeng Mi wrote: >>>>>> when running below perf command, we say error is reported. >>>>>> >>>>>> perf record -e "{slots,instructions,topdown-retiring}:S" -vv -C0 sleep 1 >>>>>> >>>>>> ------------------------------------------------------------ >>>>>> perf_event_attr: >>>>>> type 4 (cpu) >>>>>> size 168 >>>>>> config 0x400 (slots) >>>>>> sample_type IP|TID|TIME|READ|CPU|PERIOD|IDENTIFIER >>>>>> read_format ID|GROUP|LOST >>>>>> disabled 1 >>>>>> sample_id_all 1 >>>>>> exclude_guest 1 >>>>>> ------------------------------------------------------------ >>>>>> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 >>>>>> ------------------------------------------------------------ >>>>>> perf_event_attr: >>>>>> type 4 (cpu) >>>>>> size 168 >>>>>> config 0x8000 (topdown-retiring) >>>>>> { sample_period, sample_freq } 4000 >>>>>> sample_type IP|TID|TIME|READ|CPU|PERIOD|IDENTIFIER >>>>>> read_format ID|GROUP|LOST >>>>>> freq 1 >>>>>> sample_id_all 1 >>>>>> exclude_guest 1 >>>>>> ------------------------------------------------------------ >>>>>> sys_perf_event_open: pid -1 cpu 0 group_fd 5 flags 0x8 >>>>>> sys_perf_event_open failed, error -22 >>>>>> >>>>>> Error: >>>>>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-retiring). >>>>>> >>>>>> The reason of error is that the events are regrouped and >>>>>> topdown-retiring event is moved to closely after the slots event and >>>>>> topdown-retiring event needs to do the sampling, but Intel PMU driver >>>>>> doesn't support to sample topdown metrics events. >>>>>> >>>>>> For topdown metrics events, it just requires to be in a group which has >>>>>> slots event as leader. It doesn't require topdown metrics event must be >>>>>> closely after slots event. Thus it's a overkill to move topdown metrics >>>>>> event closely after slots event in events regrouping and furtherly cause >>>>>> the above issue. >>>>>> >>>>>> Thus delete the code that moving topdown metrics events to fix the >>>>>> issue. >>>>> I think this is wrong. The topdown events may not be in a group, such >>>>> cases can come from metrics due to grouping constraints, and so they >>>>> must be sorted together so that they may be gathered into a group to >>>>> avoid the perf event opens failing for ungrouped topdown events. I'm >>>>> not understanding what these patches are trying to do, if you want to >>>>> prioritize the event for leader sampling why not modify it to compare >>>> Per my understanding, this change doesn't break anything. The events >>>> regrouping can be divided into below several cases. >>>> >>>> a. all events in a group >>>> >>>> perf stat -e "{instructions,topdown-retiring,slots}" -C0 sleep 1 >>>> WARNING: events were regrouped to match PMUs >>>> >>>> Performance counter stats for 'CPU(s) 0': >>>> >>>> 15,066,240 slots >>>> 1,899,760 instructions >>>> 2,126,998 topdown-retiring >>>> >>>> 1.045783464 seconds time elapsed >>>> >>>> In this case, slots event would be adjusted as the leader event and all >>>> events are still in same group. >>>> >>>> b. all events not in a group >>>> >>>> perf stat -e "instructions,topdown-retiring,slots" -C0 sleep 1 >>>> WARNING: events were regrouped to match PMUs >>>> >>>> Performance counter stats for 'CPU(s) 0': >>>> >>>> 2,045,561 instructions >>>> 17,108,370 slots >>>> 2,281,116 topdown-retiring >>>> >>>> 1.045639284 seconds time elapsed >>>> >>>> In this case, slots and topdown-retiring are placed into a group and slots >>>> is the group leader. instructions event is outside the group. >>>> >>>> c. slots event in group but topdown metric events outside the group >>>> >>>> perf stat -e "{instructions,slots},topdown-retiring" -C0 sleep 1 >>>> WARNING: events were regrouped to match PMUs >>>> >>>> Performance counter stats for 'CPU(s) 0': >>>> >>>> 20,323,878 slots >>>> 2,634,884 instructions >>>> 3,028,656 topdown-retiring >>>> >>>> 1.045076380 seconds time elapsed >>>> >>>> In this case, topdown-retiring event is placed into previous group and >>>> slots is adjusted to leader event. >>>> >>>> d. multiple event groups >>>> >>>> perf stat -e "{instructions,slots},{topdown-retiring}" -C0 sleep 1 >>>> WARNING: events were regrouped to match PMUs >>>> >>>> Performance counter stats for 'CPU(s) 0': >>>> >>>> 26,319,024 slots >>>> 2,427,791 instructions >>>> 2,683,508 topdown-retiring >>>> >>>> 1.045495830 seconds time elapsed >>>> >>>> In this case, the two groups are merged to one group and slots event is >>>> adjusted as leader. >>>> >>>> The key point of this patch is that it's unnecessary to move topdown >>>> metrics events closely after slots event. It's a overkill since Intel core >>>> PMU driver doesn't require that. Intel PMU driver just requires topdown >>>> metrics events are in a group where slots event is the group leader, and >>>> worse the movement for topdown metrics events causes the issue in the >>>> commit message mentioned. >>>> >>>> This patch doesn't block to regroup topdown metrics event. It just removes >>>> the unnecessary movement for topdown metrics events. >>> But you will get the same behavior because of the non-arch dependent >>> force group index - I guess you don't care as the sample read only >>> happens when you have a group. >>> >>> I'm thinking of cases like (which admittedly is broken): >>> ``` >>> $ perf stat -e "{slots,instructions},cycles,topdown-fe-bound" -a sleep 0.1 >>> [sudo] password for irogers: >>> >>> Performance counter stats for 'system wide': >>> >>> 2,589,345,900 slots >>> 852,492,838 instructions >>> 583,525,372 cycles >>> topdown-fe-bound >>> >>> 0.103930790 seconds time elapsed >>> ``` >> I run the upstream code (commit 73e931504f8e0d42978bfcda37b323dbbd1afc08) >> without this patchset, I see same issue. >> >> perf stat -e "{slots,instructions},cycles,topdown-fe-bound" -a sleep 0.1 >> >> Performance counter stats for 'system wide': >> >> 262,448,922 slots >> 29,630,373 instructions >> 43,891,902 cycles >> topdown-fe-bound >> >> 0.150369560 seconds time elapsed >> >> #perf -v >> perf version 6.10.rc6.g73e931504f8e >> >> This issue is not caused by this patchset. > I agree, but I think what is broken above was caused by the forced > grouping change that I did for Andi. The point of your change here is > to say that topdown events don't need to be moved while sorting, but > what should be happening here is just that. topdown-fe-bound should be > moved into the group with slots and instructions so it isn't " supported>". So yes the current code has issues, but that's not the > same as saying we don't need to move topdown events, we do so that we > may group them better. > > Thanks, > Ian I see your point. I think the key is to ensure the topdown metrics events in a group which has slots as the leader. As for where the topdown metrics event is in the group, it doesn't matter. I would see if there is a better method to fix this issue. Thanks. > >>> As the slots event is grouped there's no force group index on it, we >>> want to shuffle the topdown-fe-bound into the group so we want it to >>> compare as less than cycles - ie we're comparing topdown events with >>> non topdown events and trying to shuffle the topdown events first. >> Current evlist__cmp() won't really swap the order of cycles and >> topdown-fe-bound. >> >> if (lhs_sort_idx != rhs_sort_idx) >> return lhs_sort_idx - rhs_sort_idx; >> >> When comparing cycles and topdown-fe-bound events, lhs_sort_idx is 2 and >> rhs_sort_idx is 3, so the swap won't happen. >> >> So the event sequence after sorting is still "slots, instructions ,cycles, >> topdown-fe-bound". Both cycles and topdown-fe-bound events won't be placed >> into the group. >> >> >>> Thanks, >>> Ian >>> >>> >>> >>>>> first? >>>>> >>>>> Thanks, >>>>> Ian >>>>> >>>>>> Signed-off-by: Dapeng Mi >>>>>> --- >>>>>> tools/perf/arch/x86/util/evlist.c | 5 ----- >>>>>> 1 file changed, 5 deletions(-) >>>>>> >>>>>> diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c >>>>>> index 332e8907f43e..6046981d61cf 100644 >>>>>> --- a/tools/perf/arch/x86/util/evlist.c >>>>>> +++ b/tools/perf/arch/x86/util/evlist.c >>>>>> @@ -82,11 +82,6 @@ int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs) >>>>>> return -1; >>>>>> if (arch_is_topdown_slots(rhs)) >>>>>> return 1; >>>>>> - /* Followed by topdown events. */ >>>>>> - if (arch_is_topdown_metrics(lhs) && !arch_is_topdown_metrics(rhs)) >>>>>> - return -1; >>>>>> - if (!arch_is_topdown_metrics(lhs) && arch_is_topdown_metrics(rhs)) >>>>>> - return 1; >>>>>> } >>>>>> >>>>>> /* Default ordering by insertion index. */ >>>>>> -- >>>>>> 2.40.1 >>>>>>