From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 208B4C433E2 for ; Thu, 21 May 2020 04:38:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 04A13206D4 for ; Thu, 21 May 2020 04:38:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728077AbgEUEiN (ORCPT ); Thu, 21 May 2020 00:38:13 -0400 Received: from mga12.intel.com ([192.55.52.136]:15625 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726394AbgEUEiM (ORCPT ); Thu, 21 May 2020 00:38:12 -0400 IronPort-SDR: 33plZz4AEPVRtPgYDZgdZeWJhFzy+5ca1oZremrF+FOxUc2+DbWvGJ7JnEuGkGW0cQNkzYEXdD sATbX+t+/4PQ== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 May 2020 21:38:11 -0700 IronPort-SDR: 5LdwVbLtwg1PZMsTXRNHRfjAt8JK4ziS4qH2szjXi4opCJL6aeS6QCrDBckiM0qGvyUjtZMogg jvQNAri8k5Cw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,416,1583222400"; d="scan'208";a="300172439" Received: from yjin15-mobl1.ccr.corp.intel.com (HELO [10.238.5.239]) ([10.238.5.239]) by fmsmga002.fm.intel.com with ESMTP; 20 May 2020 21:38:09 -0700 Subject: Re: [PATCH] perf evsel: Get group fd from CPU0 for system wide event To: Jiri Olsa Cc: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org, mingo@redhat.com, alexander.shishkin@linux.intel.com, Linux-kernel@vger.kernel.org, ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com References: <20200430013451.17196-1-yao.jin@linux.intel.com> <20200501102337.GA1761222@krava> <20200505000352.GH1916255@krava> <3e813227-4954-0d4b-bc7a-ca272b18454a@linux.intel.com> <68e53765-6f45-9483-7543-0a2f961cdc62@linux.intel.com> <20200515083312.GB3511648@krava> <5fe2efe4-f8a3-04ef-f5e8-7b9c433d4142@linux.intel.com> <82da23c8-94dd-4096-a987-a17087e2642f@linux.intel.com> <20200520075018.GE110644@krava> From: "Jin, Yao" Message-ID: <0dc72312-5464-3f71-2fa2-ea604d33a1c4@linux.intel.com> Date: Thu, 21 May 2020 12:38:08 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0 MIME-Version: 1.0 In-Reply-To: <20200520075018.GE110644@krava> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Jiri, On 5/20/2020 3:50 PM, Jiri Olsa wrote: > On Wed, May 20, 2020 at 01:36:40PM +0800, Jin, Yao wrote: >> Hi Jiri, >> >> On 5/18/2020 11:28 AM, Jin, Yao wrote: >>> Hi Jiri, >>> >>> On 5/15/2020 4:33 PM, Jiri Olsa wrote: >>>> On Fri, May 15, 2020 at 02:04:57PM +0800, Jin, Yao wrote: >>>> >>>> SNIP >>>> >>>>> I think I get the root cause. That should be a serious bug in get_group_fd, access violation! >>>>> >>>>> For a group mixed with system-wide event and per-core event and the group >>>>> leader is system-wide event, access violation will happen. >>>>> >>>>> perf_evsel__alloc_fd allocates one FD member for system-wide event (only FD(evsel, 0, 0) is valid). >>>>> >>>>> But for per core event, perf_evsel__alloc_fd allocates N FD members (N = >>>>> ncpus). For example, for ncpus is 8, FD(evsel, 0, 0) to FD(evsel, 7, 0) are >>>>> valid. >>>>> >>>>> get_group_fd(struct evsel *evsel, int cpu, int thread) >>>>> { >>>>>      struct evsel *leader = evsel->leader; >>>>> >>>>>      fd = FD(leader, cpu, thread);    /* access violation may happen here */ >>>>> } >>>>> >>>>> If leader is system-wide event, only the FD(leader, 0, 0) is valid. >>>>> >>>>> When get_group_fd accesses FD(leader, 1, 0), access violation happens. >>>>> >>>>> My fix is: >>>>> >>>>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c >>>>> index 28683b0eb738..db05b8a1e1a8 100644 >>>>> --- a/tools/perf/util/evsel.c >>>>> +++ b/tools/perf/util/evsel.c >>>>> @@ -1440,6 +1440,9 @@ static int get_group_fd(struct evsel *evsel, int cpu, int thread) >>>>>          if (evsel__is_group_leader(evsel)) >>>>>                  return -1; >>>>> >>>>> +       if (leader->core.system_wide && !evsel->core.system_wide) >>>>> +               return -2; >>>> >>>> so this effectively stops grouping system_wide events with others, >>>> and I think it's correct, how about events that differ in cpumask? >>>> >>> >>> My understanding for the events that differ in cpumaks is, if the >>> leader's cpumask is not fully matched with the evsel's cpumask then we >>> stop the grouping. Is this understanding correct? >>> >>> I have done some tests and get some conclusions: >>> >>> 1. If the group is mixed with core and uncore events, the system_wide checking can distinguish them. >>> >>> 2. If the group is mixed with core and uncore events and "-a" is >>> specified, the system_wide for core event is also false. So system_wide >>> checking can distinguish them too >>> >>> 3. In my test, the issue only occurs when we collect the metric which is >>> mixed with uncore event and core event, so maybe checking the >>> system_wide is OK. >>> >>>> should we perhaps ensure this before we call open? go throught all >>>> groups and check they are on the same cpus? >>>> >>> >>> The issue doesn't happen at most of the time (only for the metric >>> consisting of uncore event and core event), so fallback to stop grouping >>> if call open is failed looks reasonable. >>> >>> Thanks >>> Jin Yao >>> >>>> thanks, >>>> jirka >>>> >>>> >>>>> + >>>>>          /* >>>>>           * Leader must be already processed/open, >>>>>           * if not it's a bug. >>>>> @@ -1665,6 +1668,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, >>>>>                                  pid = perf_thread_map__pid(threads, thread); >>>>> >>>>>                          group_fd = get_group_fd(evsel, cpu, thread); >>>>> +                       if (group_fd == -2) { >>>>> +                               errno = EINVAL; >>>>> +                               err = -EINVAL; >>>>> +                               goto out_close; >>>>> +                       } >>>>>   retry_open: >>>>>                          test_attr__ready(); >>>>> >>>>> It enables the perf_evlist__reset_weak_group. And in the second_pass (in >>>>> __run_perf_stat), the events will be opened successfully. >>>>> >>>>> I have tested OK for this fix on cascadelakex. >>>>> >>>>> Thanks >>>>> Jin Yao >>>>> >>>> >> >> Is this fix OK? >> >> Another thing is, do you think if we need to rename >> "evsel->core.system_wide" to "evsel->core.has_cpumask". >> >> The "system_wide" may misleading. >> >> evsel->core.system_wide = pmu ? pmu->is_uncore : false; >> >> "pmu->is_uncore" is true if PMU has a "cpumask". But it's not just uncore >> PMU which has cpumask. Some other PMUs, e.g. cstate_pkg, also have cpumask. >> So for this case, "has_cpumask" should be better. > > so those flags are checked in many places in the code so I don't > think it's wise to mess with them > > what I meant before was that the cpumask could be different for > different events so even when both events are 'system_wide' the > leader 'fd' might not exist for the groupped events and vice versa > > so maybe we should ensure that we are groupping events with same > cpu maps before we go for open, so the get_group_fd stays simple > Thanks for the comments. I'm preparing the patch according to this idea. >> >> But I'm not sure if the change is OK for other case, e.g. PT, which also >> uses "evsel->core.system_wide". > > plz CC Adrian Hunter on next patches > if you are touching this > I will not touch "evsel->core.system_wide" in the new patch. Thanks Jin Yao > thanks, > jirka >