From mboxrd@z Thu Jan 1 00:00:00 1970
From: Valentin Schneider
To: Barry Song, vincent.guittot@linaro.org, mgorman@suse.de, mingo@kernel.org,
	peterz@infradead.org, dietmar.eggemann@arm.com, morten.rasmussen@arm.com,
	linux-kernel@vger.kernel.org
Cc: linuxarm@openeuler.org, xuwei5@huawei.com, liguozhu@hisilicon.com,
	tiantao6@hisilicon.com, wanghuiqiang@huawei.com, prime.zeng@hisilicon.com,
	jonathan.cameron@huawei.com, guodong.xu@linaro.org, Barry Song, Meelis Roos
Subject: Re: [PATCH] sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2
In-Reply-To: <20210201033830.15040-1-song.bao.hua@hisilicon.com>
References:
 <20210201033830.15040-1-song.bao.hua@hisilicon.com>
User-Agent: Notmuch/0.21 (http://notmuchmail.org) Emacs/26.3 (x86_64-pc-linux-gnu)
Date: Mon, 01 Feb 2021 18:11:26 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 01/02/21 16:38, Barry Song wrote:
> A tricky thing is that we shouldn't use the sgc of the 1st CPU of node2
> for the sched_group generated by grandchild; otherwise, when this CPU
> becomes the balance_cpu of another sched_group of CPUs other than node0,
> our sched_group generated by grandchild will access the same sgc as
> the sched_group generated by child of another CPU.
>
> So in init_overlap_sched_group(), the sgc's capacity will be overwritten:
>
> 	build_balance_mask(sd, sg, mask);
> 	cpu = cpumask_first_and(sched_group_span(sg), mask);
>
> 	sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
>
> And WARN_ON_ONCE(!cpumask_equal(group_balance_mask(sg), mask)) will
> also be triggered:
>
> 	static void init_overlap_sched_group(struct sched_domain *sd,
> 					     struct sched_group *sg)
> 	{
> 		if (atomic_inc_return(&sg->sgc->ref) == 1)
> 			cpumask_copy(group_balance_mask(sg), mask);
> 		else
> 			WARN_ON_ONCE(!cpumask_equal(group_balance_mask(sg), mask));
> 	}
>
> So here we move to using the sgc of the 2nd CPU. For the corner case
> where a NUMA node has only one CPU, we will still trigger this
> WARN_ON_ONCE. But it is really unlikely for a NUMA node to have only
> one CPU in a real system.
>

Well, it's trivial to boot this with QEMU, and it's actually the example
the comment atop that WARN_ON_ONCE() is based on. Also, you could end up
with a single CPU on a node during hotplug operations...

I am not entirely sure whether having more than one CPU per node is a
sufficient condition. I'm starting to *think* it is, but I'm not entirely
convinced yet - and now I need a new notebook.