All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	muchun.song@linux.dev, souravpanda@google.com
Subject: Re: [PATCH v2] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees
Date: Mon, 17 Apr 2023 11:51:31 -0700	[thread overview]
Message-ID: <20230417185131.GB6389@monkey> (raw)
In-Reply-To: <ZD0ESXd1mGW7BAZ+@dhcp22.suse.cz>

On 04/17/23 10:33, Michal Hocko wrote:
> On Fri 14-04-23 17:47:28, David Rientjes wrote:
> > On Thu, 13 Apr 2023, Michal Hocko wrote:
> > 
> > > [...]
> > > > > > This is a theoretical concern. Freeing a 1G page requires 16M of free
> > > > > > memory. A machine might need to be reconfigured from one task to
> > > > > > another, and release a large number of 1G pages back to the system if
> > > > > > allocating 16M fails, the release won't work.
> > > > >
> > > > > This is really an important "detail" changelog should mention. While I
> > > > > am not really against that change I would much rather see that as a
> > > > > result of a real world fix rather than a theoretical concern. Mostly
> > > > > because a real life scenario would allow us to test the
> > > > > __GFP_RETRY_MAYFAIL effectivness. As that request might fail as well we
> > > > > just end up with a theoretical fix for a theoretical problem. Something
> > > > > that is easy to introduce but much harder to get rid of should we ever
> > > > > need to change __GFP_RETRY_MAYFAIL implementation for example.
> > > > 
> > > > I will add this to changelog in v3. If  __GFP_RETRY_MAYFAIL is
> > > > ineffective we will receive feedback once someone hits this problem.
> > > 
> > > I do not remember anybody hitting this with the current __GFP_NORETRY.
> > > So arguably there is nothing to be fixed ATM.
> > > 
> > 
> > I think we should still at least clear __GFP_NORETRY in this allocation: 
> > to be able to free 1GB hugepages back to the system we'd like the page 
> > allocator to at least exercise its normal order-0 allocation logic rather 
> > than exempting it from retrying reclaim by opting into __GFP_NORETRY.
> > 
> > I'd agree with the analysis in 
> > https://lore.kernel.org/linux-mm/YCafit5ruRJ+SL8I@dhcp22.suse.cz/ that 
> > either a cleared __GFP_NORETRY or a __GFP_RETRY_MAYFAIL makes logical 
> > sense.
> > 
> > We really *do* want to free these hugepages back to the system and the 
> > amount of memory freeing will always be more than the allocation for 
> > struct page.  The net result is more free memory.
> > 
> > If the allocation fails, we can't free 1GB back to the system on a 
> > saturated node if our first reclaim attempt didn't allow these struct 
> > pages to be allocated.  Stranding 1GB in the hugetlb pool that no 
> > userspace on the system can make use of at the time isn't very useful.
> 
> I do not think there is any dispute in the theoretical concern. The question is
> whether this is something that really needs a fix in practice. Have we
> ever seen workloads which rely on GB pages to fail freeing them?

Since I have never seen a failure allocating vmemmmap, I agree that this
is all a theoretical concern.

However, to me it seems that replacing __GFP_NORETRY with __GFP_RETRY_MAYFAIL
would lessen that theoretical concern just a little.  That is simply because
an allocation with __GFP_RETRY_MAYFAIL would be a little more likely to
succeed.

Again, I know this is all theoretical but if switching to __GFP_RETRY_MAYFAIL
would prevent one allocation/hugetlb page freeing failure I think it is worth
it.  Because, as soon as we see one failure we may need to look into addressing
this now theoretical concern.
-- 
Mike Kravetz


  reply	other threads:[~2023-04-17 18:51 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-12 19:59 [PATCH v2] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees Pasha Tatashin
2023-04-12 20:13 ` Andrew Morton
2023-04-12 20:18   ` Michal Hocko
2023-04-13 15:05     ` Pasha Tatashin
2023-04-13 15:25       ` Michal Hocko
2023-04-13 17:11         ` Pasha Tatashin
2023-04-13 18:12           ` Michal Hocko
2023-04-15  0:47             ` David Rientjes
2023-04-17  8:33               ` Michal Hocko
2023-04-17 18:51                 ` Mike Kravetz [this message]
2023-04-13 14:59   ` Pasha Tatashin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230417185131.GB6389@monkey \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=pasha.tatashin@soleen.com \
    --cc=rientjes@google.com \
    --cc=souravpanda@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.