Date: Fri, 17 Apr 2026 14:59:41 +0800
From: Huang Shijie
To: Mateusz Guzik
Subject: Re: [PATCH 0/3] mm: split the file's i_mmap tree for NUMA
References: <20260413062042.804-1-huangsj@hygon.cn>
 <76pfiwabdgsej6q2yxfh3efuqvsyg7mt7rvl5itzzjyhdrto5r@53viaxsackzv>
In-Reply-To: <76pfiwabdgsej6q2yxfh3efuqvsyg7mt7rvl5itzzjyhdrto5r@53viaxsackzv>

On Mon, Apr 13, 2026 at 05:33:21PM +0200, Mateusz Guzik wrote:
> On Mon, Apr 13, 2026 at 02:20:39PM +0800, Huang Shijie wrote:
> > In NUMA, there may be many NUMA nodes and many CPUs.
> > For example, a Hygon server has 12 NUMA nodes, and 384 CPUs.
> > In the UnixBench tests, there is a test "execl" which tests
> > the execve system call.
> >
> > When we test our server with "./Run -c 384 execl",
> > the test result is not good enough. The i_mmap locks contended heavily on
> > "libc.so" and "ld.so".
> > For example, the i_mmap tree for "libc.so" can have
> > over 6000 VMAs, and the VMAs can be on different NUMA nodes.
> > The insert/remove operations do not run quickly enough.
> >
> > patch 1 & patch 2 try to hide the direct access of i_mmap.
> > patch 3 splits the i_mmap into sibling trees, and we can get better
> > performance with this patch set:
> >     we can get 77% performance improvement (10 times average)
> >
>
> To my reading you kept the lock as-is and only distributed the protected
> state.
>
> While I don't doubt the improvement, I'm confident should you take a
> look at the profile you are going to find this still does not scale with
> rwsem being one of the problems (there are other global locks, some of
> which have experimental patches for).
>
> Apart from that this does nothing to help high core systems which are
> all one node, which imo puts another question mark on this specific
> proposal.
>
> Of course one may question whether a RB tree is the right choice here,
> it may be the lock-protected cost can go way down with merely a better
> data structure.
>
> Regardless of that, for actual scalability, there will be no way around
> decentralizing locking around this and partitioning per some core count
> (not just by numa awareness).
>
> Decentralizing locking is definitely possible, but I have not looked
> into specifics of how problematic it is. Best case scenario it will
> merely work with separate locks. Worst case scenario something needs a fully
> stabilized state for traversal, in that case another rw lock can be
> slapped around this, creating locking order read lock -> per-subset
> write lock -- this will suffer scalability due to the read locking, but
> it will still scale drastically better as apart from that there will be
> no serialization. In this setting the problematic consumer will write
> lock the new thing to stabilize the state.

As for your proposal for the non-NUMA case, I hope you can create a patch
set for it. I can test it on our machine.

Thanks
Huang Shijie