public inbox for linux-kernel@vger.kernel.org
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, cl@linux-foundation.org,
	kamezawa.hiroyu@jp.fujitsu.com, lizf@cn.fujitsu.com,
	mingo@elte.hu, yinghai@kernel.org
Subject: Re: [GIT PULL v2] Early SLAB fixes for 2.6.31
Date: Tue, 16 Jun 2009 07:37:22 +1000
Message-ID: <1245101842.12400.43.camel@pasglop>
In-Reply-To: <20090615113848.GA23377@wotan.suse.de>

On Mon, 2009-06-15 at 13:38 +0200, Nick Piggin wrote:
> On Mon, Jun 15, 2009 at 01:28:28PM +0200, Nick Piggin wrote:
> > On Mon, Jun 15, 2009 at 01:22:05PM +0200, Nick Piggin wrote:
> > > On Mon, Jun 15, 2009 at 08:39:48PM +1000, Benjamin Herrenschmidt wrote:
> > > But I won't live with having it shit in our nice core code...
> > > Well, at least I won't throw up my hands and give up this
> > > early.
> > 
> > Just the principle, btw.
> 
> I have the same opinion for suspend/resume too, although
> in that case I know less about the issues and if we
> found that it indeed does make a random driver writer's
> life easier[*] then it might be a reason to do this. But
> I still don't think that would give boot code a license to
> just revert back to "I don't know or care, GFP_KERNEL please"
> 
> [*] and note that being unaware of your context I don't
> think is making life easier automatically.

The suspend/resume case is even worse ... because drivers don't know,
and don't have to.

I.e., we are talking here about pretty much -any- kmalloc in the kernel;
you don't seem to understand that.

The problem here is that driver A has been suspended and happens to be on
the swapout path. Driver B hasn't been suspended yet, and potentially
doesn't even know there's a suspend/resume cycle in progress.

Now, driver B, while holding for example one of its internal mutexes,
calls something that calls something that does a kmalloc(GFP_KERNEL) ...
The latter will potentially block forever (or at least until resume)
because the allocator may try to swap something out to devices driven by
driver A while it's suspended.

Now, driver B's suspend() is called, which tries to take the above
mutex... kaboom.
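
To make the chain of waits above explicit, here is a minimal userspace
sketch in plain C (not kernel code). It models each party as a node that
is blocked on at most one other node and checks whether the chain loops
back on itself. All names (the enum values and the model_* functions) are
hypothetical and exist only for this illustration; the assumption that
driver A is only resumed once the whole suspend sequence has finished is
how the scenario is read here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical actors in the scenario; names are for illustration only. */
enum model_actor {
    MODEL_SUSPEND_SEQUENCE,   /* the suspend pass walking the drivers */
    MODEL_DRIVER_B_SUSPEND,   /* driver B's suspend() callback */
    MODEL_DRIVER_B_HOLDER,    /* driver B code holding the mutex, in kmalloc */
    MODEL_SWAP_OUT_IO,        /* swap-out I/O started by the allocator */
    MODEL_DRIVER_A_RESUME,    /* driver A becoming usable again */
    MODEL_NUM_ACTORS
};

#define MODEL_NOT_BLOCKED (-1)

/*
 * Fill waits_for[] so that waits_for[x] is the actor x is blocked on.
 * alloc_may_do_io selects between GFP_KERNEL-like behaviour (true) and
 * GFP_NOIO-like behaviour (false) for the allocation made under the mutex.
 */
void model_build_scenario(int waits_for[MODEL_NUM_ACTORS], bool alloc_may_do_io)
{
    /* The suspend pass cannot continue until B's suspend() returns. */
    waits_for[MODEL_SUSPEND_SEQUENCE] = MODEL_DRIVER_B_SUSPEND;
    /* B's suspend() needs the mutex that the allocating path holds. */
    waits_for[MODEL_DRIVER_B_SUSPEND] = MODEL_DRIVER_B_HOLDER;
    /* With I/O allowed, the allocation may wait on swap-out; without
     * I/O it never waits on a suspended device (it may still fail). */
    waits_for[MODEL_DRIVER_B_HOLDER] =
        alloc_may_do_io ? MODEL_SWAP_OUT_IO : MODEL_NOT_BLOCKED;
    /* Swap-out targets a device driven by the suspended driver A. */
    waits_for[MODEL_SWAP_OUT_IO] = MODEL_DRIVER_A_RESUME;
    /* Assumption: A is resumed only after the suspend pass completes. */
    waits_for[MODEL_DRIVER_A_RESUME] = MODEL_SUSPEND_SEQUENCE;
}

/* Return true if following the wait chain from start leads back to start. */
bool model_has_wait_cycle(const int waits_for[], int n, int start)
{
    int cur = start;

    for (int step = 0; step < n; step++) {
        cur = waits_for[cur];
        if (cur < 0)
            return false;   /* chain ends: somebody can make progress */
        if (cur == start)
            return true;    /* chain closed on itself: deadlock */
    }
    return false;
}
```

Building the scenario with alloc_may_do_io set to true and starting from
MODEL_SUSPEND_SEQUENCE finds a cycle; with false, the chain stops at the
mutex holder, which is the whole point of not letting that allocation
start I/O.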

Yes, we -could- probably try to invent some scheme for block devices to
"teach" upper layers that they are being suspended. That would cover
some of the cases and would probably not be done properly for 10 kernel
versions to come... Or we can make all kmalloc() degrade automatically
to GFP_NOIO when suspend is started.

Which one is more likely to actually work ? :-)
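
As a rough illustration of the second option, here is a minimal userspace
sketch in plain C of what "degrade automatically to GFP_NOIO" could look
like: a global mask that every allocation request is filtered through,
narrowed when suspend starts and restored on resume. This is not kernel
code; the flag values and every model_* / MODEL_* name are made up for
the example and do not match the kernel's real definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; these are NOT the kernel's real values. */
typedef unsigned int model_gfp_t;

#define MODEL_GFP_WAIT 0x1u   /* caller may sleep */
#define MODEL_GFP_IO   0x2u   /* allocator may start block I/O (swap-out) */
#define MODEL_GFP_FS   0x4u   /* allocator may call into filesystems */

#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOIO   (MODEL_GFP_WAIT)

/* Hypothetical global: which flags allocations are currently allowed. */
static model_gfp_t model_allowed_mask = MODEL_GFP_KERNEL;

/* Called when suspend starts: forbid I/O and FS activity from reclaim. */
void model_suspend_begin(void)
{
    model_allowed_mask &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
}

/* Called once resume has finished: allow everything again. */
void model_resume_end(void)
{
    model_allowed_mask = MODEL_GFP_KERNEL;
}

/* What the allocator would actually honour for a given request. */
model_gfp_t model_effective_gfp(model_gfp_t requested)
{
    return requested & model_allowed_mask;
}

/* True if an allocation with these flags could end up issuing I/O. */
bool model_may_do_io(model_gfp_t requested)
{
    return (model_effective_gfp(requested) & MODEL_GFP_IO) != 0;
}
```

The caller keeps asking for MODEL_GFP_KERNEL and does not need to know
that a suspend is in progress; after model_suspend_begin() the request is
silently treated as MODEL_GFP_NOIO until model_resume_end() is called.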

Cheers,
Ben.



