From: Andrew Morton <akpm@linux-foundation.org>
To: Pekka J Enberg <penberg@cs.helsinki.fi>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Ingo Molnar <mingo@elte.hu>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	npiggin@suse.de, cl@linux-foundation.org,
	torvalds@linux-foundation.org
Subject: Re: [PATCH v2] slab,slub: ignore __GFP_WAIT if we're booting or suspending
Date: Fri, 12 Jun 2009 08:22:52 -0700
Message-ID: <20090612082252.519061c3.akpm@linux-foundation.org>
In-Reply-To: <Pine.LNX.4.64.0906121244020.30911@melkki.cs.Helsinki.FI>

On Fri, 12 Jun 2009 12:45:21 +0300 (EEST) Pekka J Enberg <penberg@cs.helsinki.fi> wrote:

> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Fri, 12 Jun 2009 12:39:58 +0300
> Subject: [PATCH] Sanitize "gfp" flags during boot
> 
> With the recent shuffle of initialization order to move memory related
> inits earlier, various subtle breakage was introduced in archs like
> powerpc due to code somewhat assuming that GFP_KERNEL can be used as
> soon as the allocators are up. This is not true because any __GFP_WAIT
> allocation will cause interrupts to be enabled, which can be fatal if
> it happens too early.
> 
> This isn't trivial to fix on every call site. For example, powerpc's
> ioremap implementation needs to be called early. For that, it uses two
> different mechanisms to carve out virtual space. Before memory init,
> by moving down VMALLOC_END, and then, by calling get_vm_area().
> Unfortunately, the latter does GFP_KERNEL allocations. But we can't do
> anything else because once vmalloc's been initialized, we can no longer
> safely move VMALLOC_END to carve out space.
> 
> There are other examples, where code can be called either very early
> or later on when devices are hot-plugged. It would be a major pain for
> such code to have to "know" whether it's in a context where it should
> use GFP_KERNEL or GFP_NOWAIT.
> 
> Finally, by having the ability to silently remove __GFP_WAIT from
> allocations, we pave the way for suspend-to-RAM to use that feature
> to also remove __GFP_IO from allocations done after suspending devices
> has started. This is important because such allocations may hang if
> devices on the swap-out path have been suspended, but not-yet suspended
> drivers don't know about it, and may deadlock themselves by being hung
> into a kmalloc somewhere while holding a mutex for example.
> 
> ...
>
> +/*
> + * We set up the page allocator and the slab allocator early on with interrupts
> + * disabled. Therefore, make sure that we sanitize GFP flags accordingly before
> + * everything is up and running.
> + */
> +gfp_t gfp_allowed_bits = ~(__GFP_WAIT|__GFP_FS | __GFP_IO);

This should be __read_mostly.

> +void mm_late_init(void)
> +{
> +	/*
> +	 * Interrupts are enabled now so all GFP allocations are safe.
> +	 */
> +	gfp_allowed_bits = __GFP_BITS_MASK;
> +}

Using plain old -1 here would be a more obviously-correct change.
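
To make the masking concrete, here is a minimal userspace sketch of the
idea, not the kernel code itself. The bit values, the GFP_BITS_MASK
stand-in, and the helper names sanitize_gfp(), late_init_mask() and
late_init_all_ones() are assumptions made up for illustration; only
gfp_allowed_bits and the shape of the two assignments come from the
patch and the comment above.

```c
#include <stdint.h>

typedef uint32_t gfp_t;

/* Hypothetical bit values, chosen for illustration only. */
#define GFP_WAIT_BIT  ((gfp_t)0x10u)
#define GFP_IO_BIT    ((gfp_t)0x40u)
#define GFP_FS_BIT    ((gfp_t)0x80u)
#define GFP_BITS_MASK ((gfp_t)0x1fffffu) /* stand-in for __GFP_BITS_MASK */

/* Early boot: strip the flags that could sleep or start I/O. */
static gfp_t gfp_allowed_bits = ~(GFP_WAIT_BIT | GFP_FS_BIT | GFP_IO_BIT);

/* What an allocator entry point would apply to the caller's flags. */
static gfp_t sanitize_gfp(gfp_t flags)
{
	return flags & gfp_allowed_bits;
}

/* Variant from the patch: allow exactly the defined GFP bits. */
static void late_init_mask(void)
{
	gfp_allowed_bits = GFP_BITS_MASK;
}

/* Variant suggested above: all ones, so the AND becomes a no-op. */
static void late_init_all_ones(void)
{
	gfp_allowed_bits = (gfp_t)-1;
}
```

With the initial mask, a request carrying all three bits comes out of
sanitize_gfp() with none of them set. After either late-init variant the
same request passes through unchanged; the all-ones form differs only in
that it can never strip a bit, whatever the mask constant happens to be.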




