From: Matthew Wilcox <willy@infradead.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Huaisheng Ye <yehs2007@gmail.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org, vbabka@suse.cz,
	mgorman@techsingularity.net, kstewart@linuxfoundation.org,
	alexander.levin@verizon.com, gregkh@linuxfoundation.org,
	colyli@suse.de, chengnt@lenovo.com, hehy1@lenovo.com,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	xen-devel@lists.xenproject.org, linux-btrfs@vger.kernel.org,
	Huaisheng Ye <yehs1@lenovo.com>
Subject: Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
Date: Wed, 23 May 2018 22:19:19 -0700
Message-ID: <20180524051919.GA9819@bombadil.infradead.org>
In-Reply-To: <20180522183728.GB20441@dhcp22.suse.cz>

On Tue, May 22, 2018 at 08:37:28PM +0200, Michal Hocko wrote:
> So why is this any better than the current code. Sure I am not a great
> fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> doesn't look too much better, yet we are losing a check for incompatible
> gfp flags. The diffstat looks really sound but then you just look and
> see that the large part is the comment that at least explained the gfp
> zone modifiers somehow and the debugging code. So what is the selling
> point?

I have a plan, but it's not exactly fully-formed yet.

One of the big problems we have today is that we have a lot of users
who have constraints on the physical memory they want to allocate,
but we have very limited abilities to provide them with what they're
asking for.  The various ZONEs have different meanings on
different architectures and are generally a mess.

If we had eight ZONEs, we could offer:

ZONE_16M	// 24 bit
ZONE_256M	// 28 bit
ZONE_LOWMEM	// CONFIG_32BIT only
ZONE_4G		// 32 bit
ZONE_64G	// 36 bit
ZONE_1T		// 40 bit
ZONE_ALL	// everything larger
ZONE_MOVABLE	// movable allocations; no physical address guarantees

#ifdef CONFIG_64BIT
#define ZONE_NORMAL	ZONE_ALL
#else
#define ZONE_NORMAL	ZONE_LOWMEM
#endif

This would cover most driver DMA mask allocations; we could tweak the
offered zones based on analysis of what people need.
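
To make the DMA mask case concrete, here is a rough userspace sketch of
how a device's address width could pick one of the zones listed above.
The enum and zone_for_dma_bits() are hypothetical names invented for
illustration; nothing like this exists in the tree, and the thresholds
are simply the bit widths from the list.

```c
#include <assert.h>

/* Hypothetical zone ids in the order of the list above; not kernel names. */
enum zone_id {
	ZONE_16M,	/* 24 bit */
	ZONE_256M,	/* 28 bit */
	ZONE_LOWMEM,	/* CONFIG_32BIT only; never chosen by this sketch */
	ZONE_4G,	/* 32 bit */
	ZONE_64G,	/* 36 bit */
	ZONE_1T,	/* 40 bit */
	ZONE_ALL,	/* everything larger */
	ZONE_MOVABLE,	/* no physical address guarantees */
};

/*
 * Return the highest zone that a device able to drive `bits` address
 * bits can always reach.  A mask wider than 40 bits but narrower than
 * 64 still gets ZONE_1T, because ZONE_ALL promises no upper bound.
 */
static enum zone_id zone_for_dma_bits(unsigned int bits)
{
	assert(bits >= 24);	/* the list offers nothing below 16MB */

	if (bits >= 64)
		return ZONE_ALL;
	if (bits >= 40)
		return ZONE_1T;
	if (bits >= 36)
		return ZONE_64G;
	if (bits >= 32)
		return ZONE_4G;
	if (bits >= 28)
		return ZONE_256M;
	return ZONE_16M;
}
```

A driver with a 30-bit mask would land in ZONE_256M under this scheme,
which wastes some of its reachable range; that is the kind of case the
analysis of real masks would have to settle.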

#define GFP_HIGHUSER		(GFP_USER | ZONE_ALL)
#define GFP_HIGHUSER_MOVABLE	(GFP_USER | ZONE_MOVABLE)

One other thing I want to see is that fallback from zones happens from
highest to lowest normally (i.e. if you fail to allocate in 1T, then you
try to allocate from 64G), but movable allocations happen from lowest
to highest.  So ZONE_16M ends up full of page cache pages which are
readily evictable for the rare occasions when we need to allocate memory
below 16MB.
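
The two fallback directions could look roughly like the following
userspace sketch.  It is an assumption-laden illustration, not a patch:
zone_fallback_order() and the enum are made-up names, and ZONE_LOWMEM
and ZONE_MOVABLE are left out because I have not worked out where they
would sit in the order.

```c
#include <assert.h>
#include <stddef.h>

/* Address-bounded zones from the list above, lowest first (illustrative). */
enum zone_id {
	ZONE_16M,
	ZONE_256M,
	ZONE_4G,
	ZONE_64G,
	ZONE_1T,
	ZONE_ALL,
	NR_ZONES,
};

/*
 * Fill `order` with the zones to try, first choice first, and return
 * how many entries were written.  `highest` is the highest zone the
 * caller can accept.  Normal allocations walk down from it; movable
 * allocations walk up towards it, so the low zones fill with
 * evictable pages first.
 */
static size_t zone_fallback_order(enum zone_id highest, int movable,
				  enum zone_id order[NR_ZONES])
{
	size_t n = 0;
	int z;

	assert(highest < NR_ZONES);

	if (movable) {
		for (z = ZONE_16M; z <= (int)highest; z++)
			order[n++] = (enum zone_id)z;
	} else {
		for (z = (int)highest; z >= (int)ZONE_16M; z--)
			order[n++] = (enum zone_id)z;
	}
	return n;
}
```

With highest set to ZONE_1T, a normal allocation tries 1T, 64G, 4G,
256M, 16M in that order, and a movable one tries the same zones in
reverse.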

I'm sure there are lots of good reasons why this won't work, which is
why I've been hesitant to propose it before now.
