Date: Thu, 10 Apr 2025 17:51:51 +0100
From: Matthew Wilcox
To: Jason Gunthorpe
Cc: Mike Rapoport, Pratyush Yadav, Changyuan Lyu,
    linux-kernel@vger.kernel.org, graf@amazon.com,
    akpm@linux-foundation.org, luto@kernel.org, anthony.yznaga@oracle.com,
    arnd@arndb.de, ashish.kalra@amd.com, benh@kernel.crashing.org,
    bp@alien8.de, catalin.marinas@arm.com, dave.hansen@linux.intel.com,
    dwmw2@infradead.org, ebiederm@xmission.com, mingo@redhat.com,
    jgowans@amazon.com, corbet@lwn.net, krzk@kernel.org,
    mark.rutland@arm.com, pbonzini@redhat.com, pasha.tatashin@soleen.com,
    hpa@zytor.com, peterz@infradead.org, robh+dt@kernel.org,
    robh@kernel.org, saravanak@google.com,
    skinsburskii@linux.microsoft.com, rostedt@goodmis.org,
    tglx@linutronix.de, thomas.lendacky@amd.com, usama.arif@bytedance.com,
    will@kernel.org, devicetree@vger.kernel.org,
    kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
    linux-doc@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: Re: [PATCH v5 09/16] kexec: enable KHO support for memory preservation
References: <20250407141626.GB1557073@nvidia.com>
    <20250407170305.GI1557073@nvidia.com>
    <20250409125630.GI1778492@nvidia.com>
    <20250409153714.GK1778492@nvidia.com>
    <20250409162837.GN1778492@nvidia.com>
In-Reply-To: <20250409162837.GN1778492@nvidia.com>

On Wed, Apr 09, 2025 at 01:28:37PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 09, 2025 at 07:19:30PM +0300, Mike Rapoport wrote:
> > But we have memdesc today, it's struct page.
>
> No, I don't think it is. struct page seems to be turning into
> something legacy that indicates the code has not been converted to the
> new stuff yet.

No, struct page will be with us for a while. Possibly forever. I have
started reluctantly talking about a future in which there aren't struct
pages, but it's really premature at this point. That's a 2030 kind
of future.

For 2025-2029, we will still have alloc_page(s)(). It's just that
the size of struct page will be gradually shrinking over that time.

> > And when the data structure that memdesc points to will be allocated
> > separately folios won't make sense for order-0 allocations.
>
> At that point the lowest level allocator function will be allocating
> the memdesc along with the struct page. Then folio will become
> restricted to only actual folio memdescs and alot of the type punning
> should go away. We are not there yet.

We'll have a few allocator functions. There'll be a slab_alloc(),
folio_alloc(), pt_alloc() and so on. I sketched out how these might
work last year:
https://kernelnewbies.org/MatthewWilcox/FolioAlloc

> > > The lowest allocator primitive returns folios, which can represent any
> > > order, and the caller casts to their own memdesc.
> >
> > The lowest allocation primitive returns pages.
>
> Yes, but as I understand things, we should not be calling that
> interface in new code because we are trying to make 'struct page' go
> away.
>
> Instead you should use the folio interfaces and cast to your own
> memdesc, or use an allocator interface that returns void * (ie slab)
> and never touch the struct page area.
>
> AFAICT, and I just wrote one of these..

Casting is the best you can do today because I haven't provided a better
interface yet.

> > And I don't think folio will be a lowest primitive buddy returns anytime
> > soon if ever.
>
> Maybe not internally, but driver facing, I think it should be true.
>
> Like I just completely purged all struct page from the iommu code:
>
> https://lore.kernel.org/linux-iommu/0-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com/
>
> I don't want some weird KHO interface that doesn't align with using
> __folio_alloc_node() and folio_put() as the lowest level allocator
> interface.

I think it's fine to say "the KHO interface doesn't support bare pages;
you must have a memdesc". But I'm not sure that's the right approach.
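
To make the contrast between "cast to your own memdesc" and "a few
allocator functions" concrete, here is a minimal userspace sketch in C.
It is not kernel code and is not taken from the FolioAlloc page linked
above. Every mock_ name, the shared header and its fields are invented
for illustration. The only point is that a typed allocator hands each
caller its own descriptor type, so the caller has nothing to cast.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented tag recording which kind of descriptor owns an allocation. */
enum mock_desc_type { MOCK_DESC_FOLIO, MOCK_DESC_PT };

/* Invented common header; a stand-in for whatever state stays shared. */
struct mock_desc_hdr {
	enum mock_desc_type type;
	unsigned int order;	/* allocation covers 2^order pages */
};

/* Mock folio descriptor: shared header first, then folio-only state. */
struct mock_folio {
	struct mock_desc_hdr hdr;
	long refcount;
};

/* Mock page-table descriptor: shared header, then page-table-only state. */
struct mock_ptdesc {
	struct mock_desc_hdr hdr;
	void *pt_owner;
};

/* Typed allocator (hypothetical): returns a folio, no cast needed. */
static struct mock_folio *mock_folio_alloc(unsigned int order)
{
	struct mock_folio *folio = calloc(1, sizeof(*folio));

	if (!folio)
		return NULL;
	folio->hdr.type = MOCK_DESC_FOLIO;
	folio->hdr.order = order;
	folio->refcount = 1;
	return folio;
}

/* Typed allocator (hypothetical): page tables are always order 0 here. */
static struct mock_ptdesc *mock_pt_alloc(void *owner)
{
	struct mock_ptdesc *pt = calloc(1, sizeof(*pt));

	if (!pt)
		return NULL;
	pt->hdr.type = MOCK_DESC_PT;
	pt->hdr.order = 0;
	pt->pt_owner = owner;
	return pt;
}

/* Generic helper that only needs the shared header. */
static unsigned long mock_desc_nr_pages(const struct mock_desc_hdr *hdr)
{
	assert(hdr->order < 8 * sizeof(unsigned long));
	return 1UL << hdr->order;
}
```

A caller in this model writes mock_pt_alloc(owner) and gets a
struct mock_ptdesc * back directly. Code that only cares about the size
takes &pt->hdr, which is an ordinary member access rather than a cast
between unrelated descriptor types.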