From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C8240C4332F for ; Thu, 9 Sep 2021 12:59:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A89E1611F2 for ; Thu, 9 Sep 2021 12:59:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1357256AbhIINAD (ORCPT ); Thu, 9 Sep 2021 09:00:03 -0400 Received: from ozlabs.org ([203.11.71.1]:36395 "EHLO ozlabs.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1354153AbhIIMrR (ORCPT ); Thu, 9 Sep 2021 08:47:17 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ellerman.id.au; s=201909; t=1631191563; bh=+HVI0XghzcJKkgiW87QzoZzI0ZQ7j9m77j9OMcB3aro=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=C2InCR5mGc1TG/bT5ZgNme4RPUeuOzrkaEQxsyWUaEs1qnzjuOFtd0L+biCl0slwl fIrDvqD6ZvXUBI/7MWk1eH7MO33l+QcFMeY/imehorEQe0RMrBl0IkItYV/bRIdvja GVkphSFtzmQrlncOXtLRqHnlmKJey4mKwHugBJ2q42Hi2VYxWNf4wQAFW2EXZhILUs qvBhYfMbBSGUVjI/hneDEjfdBjOqGFPUTTkNwjJCJUqExiii+CGSp+8l4b9dleVf0L o6ZzXTzstX+fyHxeC8wllyrRHvnstabNMve1/K22KImd4g1lMJk3kXBLIixoquxayT EOJFMf7wBCT6g== Received: from authenticated.ozlabs.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mail.ozlabs.org (Postfix) with ESMTPSA id 4H4zJS6KZGz9vDS; Thu, 9 Sep 2021 22:46:00 +1000 (AEST) From: Michael Ellerman To: Peter Zijlstra Cc: Kajol Jain , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, mingo@redhat.com, acme@kernel.org, jolsa@kernel.org, namhyung@kernel.org, linux-perf-users@vger.kernel.org, ak@linux.intel.com, maddy@linux.ibm.com, atrajeev@linux.vnet.ibm.com, rnsastry@linux.ibm.com, yao.jin@linux.intel.com, ast@kernel.org, daniel@iogearbox.net, songliubraving@fb.com, kan.liang@linux.intel.com, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, paulus@samba.org Subject: Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses In-Reply-To: References: <20210904064932.307610-1-kjain@linux.ibm.com> <87ilzbmt7i.fsf@mpe.ellerman.id.au> Date: Thu, 09 Sep 2021 22:45:54 +1000 Message-ID: <87czphnchp.fsf@mpe.ellerman.id.au> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Peter Zijlstra writes: > On Wed, Sep 08, 2021 at 05:17:53PM +1000, Michael Ellerman wrote: >> Kajol Jain writes: > >> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h >> > index f92880a15645..030b3e990ac3 100644 >> > --- a/include/uapi/linux/perf_event.h >> > +++ b/include/uapi/linux/perf_event.h >> > @@ -1265,7 +1265,9 @@ union perf_mem_data_src { >> > #define PERF_MEM_LVLNUM_L2 0x02 /* L2 */ >> > #define PERF_MEM_LVLNUM_L3 0x03 /* L3 */ >> > #define PERF_MEM_LVLNUM_L4 0x04 /* L4 */ >> > -/* 5-0xa available */ >> > +#define PERF_MEM_LVLNUM_OC_L2 0x05 /* On Chip L2 */ >> > +#define PERF_MEM_LVLNUM_OC_L3 0x06 /* On Chip L3 */ >> >> The obvious use for 5 is for "L5" and so on. >> >> I'm not sure adding new levels is the best idea, because these don't fit >> neatly into the hierarchy, they are off to the side. >> >> >> I wonder if we should use the remote field. >> >> ie. for another core's L2 we set: >> >> mem_lvl = PERF_MEM_LVL_L2 >> mem_remote = 1 > > This mixes APIs (see below), IIUC the correct usage would be something > like: lvl_num=L2 remote=1 Aha, I was wondering how lvl and lvl_num were supposed to interact. >> Which would mean "remote L2", but not remote enough to be >> lvl = PERF_MEM_LVL_REM_CCE1. >> >> It would be printed by the existing tools/perf code as "Remote L2", vs >> "Remote cache (1 hop)", which seems OK. >> >> >> ie. we'd be able to express: >> >> Current core's L2: LVL_L2 >> Other core's L2: LVL_L2 | REMOTE >> Other chip's L2: LVL_REM_CCE1 | REMOTE >> >> And similarly for L3. >> >> I think that makes sense? Unless people think remote should be reserved >> to mean on another chip, though we already have REM_CCE1 for that. > > IIRC the PERF_MEM_LVL_* namespace is somewhat depricated in favour of > the newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_} fields. Of > course, ABIs being what they are, we get to support both :/ But I'm not > sure mixing them is a great idea. OK. > Also, clearly this could use a comment... > > The 'new' composite doesnt have a hops field because the hardware that > nessecitated that change doesn't report it, but we could easily add a > field there. > > Suppose we add, mem_hops:3 (would 6 hops be too small?) and the > corresponding PERF_MEM_HOPS_{NA, 0..6} It's really 7 if we use remote && hop = 0 to mean the first hop. If we're wanting to use some of the hop levels to represent intra-chip/package hops then we could possibly use them all on a really big system. eg. you could imagine something like: L2 | - local L2 L2 | REMOTE | HOPS_0 - L2 of neighbour core L2 | REMOTE | HOPS_1 - L2 of near core on same chip (same 1/2 of chip) L2 | REMOTE | HOPS_2 - L2 of far core on same chip (other 1/2 of chip) L2 | REMOTE | HOPS_3 - L2 of sibling chip in same package L2 | REMOTE | HOPS_4 - L2 on separate package 1 hop away L2 | REMOTE | HOPS_5 - L2 on separate package 2 hops away L2 | REMOTE | HOPS_6 - L2 on separate package 3 hops away Whether it's useful to represent all those levels I'm not sure, but it's probably good if we have the ability. I guess I'm 50/50 on whether that's enough levels, or whether we want another bit to allow for future growth. > Then I suppose you can encode things like: > > L2 - local L2 > L2 | REMOTE - remote L2 at an unspecified distance (NA) > L2 | REMOTE | HOPS_0 - remote L2 on the same node > L2 | REMOTE | HOPS_1 - remote L2 on a node 1 removed > > Would that work? Yeah that looks good to me. cheers