Date: Mon, 19 Oct 2020 15:27:15 +0100
From: Jonathan Cameron
To: Valentin Schneider
Subject: Re: [RFC PATCH] topology: Represent clusters of CPUs within a die.
Message-ID: <20201019142715.00005fb1@huawei.com>
References: <20201016152702.1513592-1-Jonathan.Cameron@huawei.com>
 <20201019103522.GK2628@hirez.programming.kicks-ass.net>
 <20201019123226.00006705@Huawei.com>
 <20201019131052.GC8004@e123083-lin>
Organization: Huawei tech. R&D (UK) Ltd.
Cc: Len Brown, Peter Zijlstra, Greg Kroah-Hartman, x86@kernel.org,
 guohanjun@huawei.com, linux-kernel@vger.kernel.org, Jeremy Linton,
 linuxarm@huawei.com, Brice Goglin, linux-acpi@vger.kernel.org,
 Jerome Glisse, Sudeep Holla, Will Deacon, Morten Rasmussen,
 linux-arm-kernel@lists.infradead.org

On Mon, 19 Oct 2020 14:48:02 +0100
Valentin Schneider wrote:

> +Cc Jeremy
>
> On 19/10/20 14:10, Morten Rasmussen wrote:
> > Hi Jonathan,
> >
> > The problem I see is that the benefit of keeping tasks together due to
> > the interconnect layout might vary significantly between systems. So if
> > we introduce a new cpumask for cluster it has to represent roughly the
> > same system properties, otherwise generic software consuming this
> > information could be tricked.
> >
> > If there is a provable benefit of having interconnect grouping
> > information, I think it would be better represented by a distance matrix
> > like we have for NUMA.
> >
> > Morten
>
> That's my cue to paste some of that stuff I've been rambling on and off
> about!
>
> With regards to cache / interconnect layout, I do believe that if we want
> to support it in the scheduler itself then we should leverage some
> distance table rather than create X extra scheduler topology levels.
>
> I had a chat with Jeremy on the ACPI side of that some time ago. IIRC,
> given that SLIT gives us a distance value between any two PXMs, we could
> directly express core-to-core distance in that table. With that (and if
> that still lets us properly discover NUMA node spans), we could let the
> scheduler build dynamic NUMA-like topology levels representing the inner
> quirks of the cache / interconnect layout.

You would rapidly run into the problem SLIT had for NUMA node description:
there is no consistent definition of distance, so except in the vaguest
sense of 'nearer' it wasn't any use for anything. That is why HMAT came
along. It's far from perfect, but it is a step up.

I can't see how you'd generalize those particular tables to do anything for
inter-core communication without breaking their use for NUMA, but something
a bit similar might work.

A lot of thought (and meeting time) has gone into trying to improve the
situation for complex topology around NUMA. Whilst there are differences in
representing the internal interconnects and caches, it seems like a
somewhat similar problem: it is really hard to describe this stuff with
enough detail to be useful while keeping it simple enough to be usable.

https://lore.kernel.org/linux-mm/20181203233509.20671-1-jglisse@redhat.com/
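
For reference, the NUMA distance matrix Morten mentions is already exported
to userspace via sysfs, and it shows the limitation nicely. The two-node
values below are purely illustrative:

  $ cat /sys/devices/system/node/node0/distance
  10 20
  $ cat /sys/devices/system/node/node1/distance
  20 10

The numbers come straight from SLIT: unitless and relative, with 10 meaning
local by convention. That is enough to order nodes by 'nearer' vs 'further',
but says nothing about how much further or in what way, which is the gap
HMAT's latency and bandwidth figures are meant to fill.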
> It's mostly pipe dreams for now, but there seems to be more and more
> hardware where that would make sense; somewhat recently the PowerPC guys
> added something to their arch-specific code in that regard.

Pipe dream == something to work on ;)

ACPI now has a nice code-first model of updating the spec, so we can
discuss this one in public and only propose spec changes once we have a
proven implementation.

Note I'm not proposing we put the cluster stuff in the scheduler, just
provide it as a hint to userspace (see the P.S. for a sketch of what that
looks like).

Jonathan
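
P.S. The userspace hint is just another per-CPU topology attribute in
sysfs, sitting alongside the existing core_cpus / package_cpus files. The
file names below follow the RFC's naming, but treat both the names and the
values as illustrative rather than a settled interface:

  $ cat /sys/devices/system/cpu/cpu0/topology/cluster_cpus_list
  0-3
  $ cat /sys/devices/system/cpu/cpu0/topology/cluster_id
  36

A thread-placement library or a pinning tool can use that to co-locate
tasks which share data within one cluster, without the scheduler having to
know anything about clusters at all.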