Date: Sun, 29 Aug 2021 10:06:10 +0300
From: Mike Rapoport
To: Vlastimil Babka
Cc: linux-mm@kvack.org, Andrew Morton, Andy Lutomirski, Dave Hansen,
    Ira Weiny, Kees Cook, Mike Rapoport, Peter Zijlstra, Rick Edgecombe,
    x86@kernel.org, linux-kernel@vger.kernel.org, Brijesh Singh
Subject: Re: [RFC PATCH 0/4] mm/page_alloc: cache pte-mapped allocations
References: <20210823132513.15836-1-rppt@kernel.org>
    <9d61b4f7-82d0-5caf-88fa-ff1b78704eea@suse.cz>
In-Reply-To: <9d61b4f7-82d0-5caf-88fa-ff1b78704eea@suse.cz>

On Tue, Aug 24, 2021 at 06:09:44PM +0200, Vlastimil Babka wrote:
> On 8/23/21 15:25, Mike Rapoport wrote:
> >
> > The idea is to use a gfp flag that will instruct the page allocator to use
> > the cache of pte-mapped pages because the caller needs to remove them from
> > the direct map or change their attributes.
>
> Like Dave, I don't like much the idea of a new GFP flag that all page
> allocations now have to check, and freeing that has to check a new pageblock
> flag, although I can see some of the benefits this brings...
>
> > When the cache is empty there is an attempt to refill it using PMD-sized
> > allocation so that once the direct map is split we'll be able to use all 4K
> > pages made available by the split.
> >
> > If the high order allocation fails, we fall back to order-0 and mark the
>
> Yeah, this fallback is where we benefit from the page allocator implementation,
> because of the page freeing hook that will recognize page from such fallback
> blocks and free them to the cache. But does that prevent so much fragmentation
> to be worth it? I'd see first if we can do without it.

I've run 'stress-ng --mmapfork 20 -t 30' in a VM with 4G of RAM and then
checked the splits reported in /proc/vmstat to get some idea of what the
benefit may be. I've compared Rick's implementation of grouped alloc (rebased
on v5.14-rc6) with this set. For that simple test there were ~30% fewer
splits.

                      | grouped alloc | pte-mapped
----------------------+---------------+------------
PMD splits after boot |            16 |         14
PMD splits after test |            49 |         34

(there were no PUD splits at all).
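For anyone who wants to repeat the comparison, here is a minimal sketch of how
such numbers could be collected. It is not the script used for the table
above. It assumes the x86 counters direct_map_level2_splits (PMD) and
direct_map_level3_splits (PUD) are present in /proc/vmstat, and the helper
names are made up for illustration.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def direct_map_splits(counters):
    """Return (PMD splits, PUD splits); a missing counter reads as 0.

    Assumption: the counters are named direct_map_level2_splits and
    direct_map_level3_splits, as on x86 kernels that export them.
    """
    return (counters.get("direct_map_level2_splits", 0),
            counters.get("direct_map_level3_splits", 0))


def relative_reduction(baseline, candidate):
    """Fraction by which candidate is lower than baseline."""
    return (baseline - candidate) / baseline


if __name__ == "__main__":
    try:
        with open("/proc/vmstat") as f:
            pmd, pud = direct_map_splits(parse_vmstat(f.read()))
        print(f"PMD splits: {pmd}, PUD splits: {pud}")
    except OSError:
        print("/proc/vmstat not available")
    # Figures from the table above: 49 vs 34 PMD splits after the test.
    print(f"reduction after test: {relative_reduction(49, 34):.1%}")
```

Sampling the counters once after boot and once after the stress-ng run gives
the two rows of the table. The ~30% figure corresponds to the totals after the
test (49 vs 34); counting only the splits that happened during the test
(33 vs 20) gives a somewhat larger reduction.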

I think the closer such a cache is to the buddy allocator, the better the
memory utilization would be. The downside is that it will be harder to reclaim
2M blocks than with separate caches, because at the page allocator level we
don't have enough information to make the pages allocated from the cache
movable.

-- 
Sincerely yours,
Mike.