From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 9 Jan 2023 17:01:09 +0000
From: Matthew Wilcox <willy@infradead.org>
To: Dave Hansen
Cc: Yin Fengwei, linux-mm@kvack.org, akpm@linux-foundation.org,
	jack@suse.cz, hughd@google.com, kirill.shutemov@linux.intel.com,
	mhocko@suse.com, ak@linux.intel.com, aarcange@redhat.com,
	npiggin@gmail.com, mgorman@techsingularity.net, rppt@kernel.org,
	ying.huang@intel.com, tim.c.chen@intel.com
Subject: Re: [RFC PATCH 1/4] mcpage: add size/mask/shift definition for multiple consecutive page
Message-ID: References:
 <20230109072232.2398464-1-fengwei.yin@intel.com>
 <20230109072232.2398464-2-fengwei.yin@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Jan 09, 2023 at 08:30:43AM -0800, Dave Hansen wrote:
> On 1/9/23 05:24, Matthew Wilcox wrote:
> > On Mon, Jan 09, 2023 at 03:22:29PM +0800, Yin Fengwei wrote:
> >> The idea of the multiple consecutive page (abbr as "mcpage") is using
> >> collection of physical contiguous 4K page other than huge page for
> >> anonymous mapping.
> >
> > This is what folios are for.  You have an interesting demonstration
> > here that shows that moving to larger folios for anonymous memory
> > is worth doing (thank you!)
> > but you're missing several of the advantages
> > of folios by going off and doing your own thing.
>
> It might not have come across in the changelog and cover letter, but
> Fengwei and the rest of us *totally* agree with you on this.  "Doing
> your own thing" just isn't going to cut it and if this is going to go
> anywhere, it needs to use folios.
>
> This series is _pure_ RFC and the comments we're interested in is
> whether this demonstration warrants going back and doing it the right
> way (with folios).

Ah, yes, that didn't come across.  In fact, the opposite came across
with this paragraph:

: This series is the first step of mcpage. The furture work can be
: enable mcpage for more components like page cache, swapping etc.
: Finally, most pages in system will be allocated/free/reclaimed
: with mcpage order.

Since the page cache has been using multipage folios in mainline since
March (and in various trees of mine since Feb 2020!), that indicated to
me either a lack of knowledge of folios, or a rejection of the folio
approach.  Happy to hear that's not true!

Most of the results here validate my experience and/or assumptions,
which is good.  I'm more than happy for someone else to take on the
hard work of folio-ising the anon VMAs.  I see the problems to be
solved as:

 - Determining (on a page fault) what the correct allocation size is
   for this process at this time.  We have the readahead code to
   leverage for files, but I don't think we have anything similar for
   anon memory.
 - Inserting multiple PTEs when a multi-page folio is found.  This also
   needs to be done for file pages, and maybe that's a good place to
   start.
 - Finding all the places in the anon memory code that assume that
   PageCompound() / PageTransHuge() is the same thing as
   folio_test_pmd_mappable().

There are probably other things that are going to come up, but I think
starting is the important part.  Not everything needs to be done
immediately (#3 before #1, I would think ;-).