Date: Wed, 7 Feb 2024 12:44:47 +0000
From: Catalin Marinas
To: Nanyong Sun
Cc: will@kernel.org, mike.kravetz@oracle.com, muchun.song@linux.dev,
    akpm@linux-foundation.org, anshuman.khandual@arm.com, willy@infradead.org,
    wangkefeng.wang@huawei.com, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
References: <20240113094436.2506396-1-sunnanyong@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jan 27, 2024 at 01:04:15PM +0800, Nanyong Sun wrote:
> On 2024/1/26 2:06, Catalin Marinas wrote:
> > On Sat, Jan 13, 2024 at 05:44:33PM +0800, Nanyong Sun wrote:
> > > HVO was previously disabled on arm64 [1] due to the lack of necessary
> > > BBM(break-before-make) logic when changing page tables.
> > > This set of patches fix this by adding necessary BBM sequence when
> > > changing page table, and supporting vmemmap page fault handling to
> > > fixup kernel address translation fault if vmemmap is concurrently accessed.
[...]
> > How often is this code path called? I wonder whether a stop_machine()
> > approach would be simpler.
> As long as allocating or releasing hugetlb is called.  We cannot limit users
> to only allocate or release hugetlb when booting or not running any
> workload on all other cpus, so if use stop_machine(), it will be triggered
> 8 times every 2M and 4096 times every 1G, which is probably too expensive.

I'm hoping this can be batched somehow and not do a stop_machine() (or 8)
for every 2MB huge page.

Just to make sure I understand - is the goal to be able to free struct
pages corresponding to hugetlbfs pages? Can we not leave the vmemmap in
place and just release that memory to the page allocator? The physical
RAM for those struct pages isn't going anywhere, we just have a vmemmap
alias to it (cacheable).

-- 
Catalin