Subject: Re: [PATCH 1/2] perf stat: Fix segfault when counting armv8_pmu events
To: Andi Kleen
References: <20200922031346.15051-1-liwei391@huawei.com> <20200922031346.15051-2-liwei391@huawei.com> <20200922192321.GL13818@tassilo.jf.intel.com> <20200922195035.GA42577@tassilo.jf.intel.com>
From: "liwei (GF)"
Date: Thu, 24 Sep 2020 22:14:17 +0800
In-Reply-To: <20200922195035.GA42577@tassilo.jf.intel.com>
Cc: Mark Rutland, Alexander Shishkin, Alexey Budankov, Adrian Hunter, Arnaldo Carvalho de Melo, linux-kernel@vger.kernel.org, Peter Zijlstra, Ingo
Molnar, huawei.libin@huawei.com, Namhyung Kim, Jiri Olsa, linux-arm-kernel@lists.infradead.org

Hi Andi,

On 2020/9/23 3:50, Andi Kleen wrote:
> On Tue, Sep 22, 2020 at 12:23:21PM -0700, Andi Kleen wrote:
>>> After debugging, i found the root reason is that the xyarray fd is created
>>> by evsel__open_per_thread() ignoring the cpu passed in
>>> create_perf_stat_counter(), while the evsel' cpumap is assigned as the
>>> corresponding PMU's cpumap in __add_event(). Thus, the xyarray fd is created
>>> with ncpus of dummy cpumap and an out of bounds 'cpu' index will be used in
>>> perf_evsel__close_fd_cpu().
>>>
>>> To address this, add a flag to mark this situation and avoid using the
>>> affinity technique when closing/enabling/disabling events.
>>
>> The flag seems like a hack. How about figuring out the correct number of
>> CPUs and using that?
>
> Also would like to understand what's different on ARM64 than other architectures.
> Or could this happen on x86 too?
>

The problem is that when the user requests per-task events, the cpumask is
expected to be NULL (dummy), while the armv8_pmu does have a cpumask, which is
inherited by the evsel. The armv8_pmu's cpumask was added for heterogeneous
systems, so this issue cannot happen on x86.

In fact, the cpumask itself is correct, but it shouldn't be used when we are
requesting per-task events. As these events should be installed on all cores,
I doubt how much we can benefit from the affinity technique, so I chose to add
a flag.

I also did a test on a HiSilicon arm64 D06 board, with 2 sockets and 128 cores.
Testing the following command 3 times, with/without the affinity technique:

time tools/perf/perf stat -ddd -C 0-127 --per-core --timeout=5000 2> /dev/null

* (HEAD detached at 7074674e7338) perf cpumap: Maintain cpumaps ordered and without dups

real    0m8.039s
user    0m0.402s
sys     0m2.582s

real    0m7.939s
user    0m0.360s
sys     0m2.560s

real    0m7.997s
user    0m0.358s
sys     0m2.586s

* (HEAD detached at 704e2f5b700d) perf stat: Use affinity for enabling/disabling events

real    0m7.954s
user    0m0.308s
sys     0m2.590s

real    0m12.959s
user    0m0.332s
sys     0m2.582s

real    0m18.009s
user    0m0.346s
sys     0m2.562s

The off-CPU time is much longer when using affinity; I think that is the cost
of migration. Could you please share your test case with me?

Thanks,
Wei
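
For reference, the off-CPU comparison can be reproduced from the timings above. This is only a rough sketch: it assumes off-CPU time can be approximated as real - (user + sys), which is reasonable only if the perf stat process is mostly single-threaded. The names RUNS, off_cpu and summarize are made up for illustration; the numbers are copied from the six runs listed in this message.

```python
# Timings as (real, user, sys) in seconds, copied from the runs in this message.
RUNS = {
    "without affinity (7074674e7338)": [
        (8.039, 0.402, 2.582),
        (7.939, 0.360, 2.560),
        (7.997, 0.358, 2.586),
    ],
    "with affinity (704e2f5b700d)": [
        (7.954, 0.308, 2.590),
        (12.959, 0.332, 2.582),
        (18.009, 0.346, 2.562),
    ],
}


def off_cpu(real, user, sys_time):
    """Approximate off-CPU time: wall-clock time minus CPU time (assumption)."""
    return real - (user + sys_time)


def summarize(samples):
    """Return the per-run off-CPU times and their mean, rounded to milliseconds."""
    values = [round(off_cpu(*sample), 3) for sample in samples]
    mean = round(sum(values) / len(values), 3)
    return values, mean


for label, samples in RUNS.items():
    values, mean = summarize(samples)
    print(f"{label}: off-cpu per run = {values}, mean = {mean}")
```

Under that approximation, every run without affinity spends roughly 5.0 s off-CPU, close to the 5000 ms timeout, while the three runs with affinity spend about 5.1 s, 10.0 s and 15.1 s.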