Re: Propagating GFP_NOFS inside __vmalloc()

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: "Ricardo M. Correia" <ricardo.correia@oracle.com>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Brian Behlendorf <behlendorf1@llnl.gov>,
	Andreas Dilger <andreas.dilger@oracle.com>
Subject: Re: Propagating GFP_NOFS inside __vmalloc()
Date: Tue, 16 Nov 2010 00:30:57 +0100	[thread overview]
Message-ID: <1289863857.13446.199.camel@oralap> (raw)
In-Reply-To: <alpine.DEB.2.00.1011151426360.20468@chino.kir.corp.google.com>

On Mon, 2010-11-15 at 14:50 -0800, David Rientjes wrote:
> Instead of extending the __*() functions with 
> more underscores like other places in the kernel (see mm/slab.c, for 
> instance), I'd suggest just appending _gfp() to their name so 
> __pmd_alloc() uses a new __pmd_alloc_gfp().

Sounds good to me.

> > For our case, I'd think it's better to either handle failure or somehow
> > retry until the allocation succeeds (if we know for sure that it will,
> > eventually).
> > 
> 
> If your use-case is going to block until this memory is available, there's 
> a serious problem that you'll need to address because nothing is going to 
> guarantee that memory will be freed unless something else is trying to 
> allocate memory and pages get written back or something gets killed as a 
> result.

In our use case, this code is only used on servers that are used for
serving a Lustre filesystem and nothing else, so we don't have to worry
about things like run-away memory hogs / user applications.

Currently we do block until this memory is available. I'd rather not go
much into this, but the amount of memory that can be allocated by this
method at any point in time is huge but it's bounded.

Also, we have a slab reclaim callback that signals a dedicated thread,
which asynchronously frees memory (it would free synchronously if
possible, but unfortunately it's not).

This thread is able to potentially free GBs of memory if necessary, and
therefore allow the vmalloc allocations in the I/O path to succeed
eventually. We know this because we limit the amount of memory that can
be allocated and nothing else can use a significant amount of memory on
our systems.

I know this is not how you'd typically do this, but we also have other
constraints (which again, I'd rather not go into) which makes this our
preferred solution.

>   Strictly relying on that behavior is concerning, but it's not 
> something that can be fixed in the VM.
>
> > Not sure what do you mean by this.. I don't see a typical vmalloc()
> > using __GFP_REPEAT anywhere (apart from functions such as
> > pmd_alloc_one(), which in the code above you suggested to keep passing
> > __GFP_REPEAT).. am I missing something?
> > 
> 
> __GFP_REPEAT will retry the allocation indefinitely until the needed 
> amount of memory is reclaimed without considering the order of the 
> allocation; all orders of interest in your case are order-0, so it will 
> loop indefinitely until a single page is reclaimed which won't happen with 
> GFP_NOFS.  Thus, passing the flag is the equivalent of asking the 
> allocator to loop forever until memory is available rather than failing 
> and returning to your error handling.

When you say loop forever, you don't mean in a busy loop, right?
Assuming we sleep in this loop (which AFAICS it does), then it's OK for
us because memory will be freed asynchronously.

If it didn't sleep then it'd be more concerning because all CPUs could
enter this loop and we'd deadlock..

Anyway, I will try the approach that you suggested and send out a new
patch. 

Thanks!

- Ricardo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2010-11-15 23:32 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-10 20:42 Propagating GFP_NOFS inside __vmalloc() Ricardo M. Correia
2010-11-10 21:35 ` Ricardo M. Correia
2010-11-10 22:10   ` Dave Chinner
2010-11-11 20:06 ` Andrew Morton
2010-11-11 22:02   ` Ricardo M. Correia
2010-11-11 22:25     ` Andrew Morton
2010-11-11 22:45       ` Ricardo M. Correia
2010-11-11 23:19         ` Ricardo M. Correia
2010-11-11 23:27           ` Andrew Morton
2010-11-11 23:29             ` Ricardo M. Correia
2010-11-15 17:01       ` Ricardo M. Correia
2010-11-15 21:28         ` David Rientjes
2010-11-15 22:19           ` Ricardo M. Correia
2010-11-15 22:50             ` David Rientjes
2010-11-15 23:30               ` Ricardo M. Correia [this message]
2010-11-15 23:55                 ` David Rientjes
2010-11-16 22:11           ` Andrew Morton
2010-11-17  7:18             ` Andreas Dilger
2010-11-17  7:24               ` Andrew Morton
2010-11-17  7:37               ` David Rientjes
2010-11-17  9:04                 ` Christoph Hellwig
2010-11-17 21:24                   ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1289863857.13446.199.camel@oralap \
    --to=ricardo.correia@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreas.dilger@oracle.com \
    --cc=behlendorf1@llnl.gov \
    --cc=linux-mm@kvack.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).