* Re: [PATCH] dm: Fix deadlock under high i/o load in raid1 setup.
@ 2007-08-13 11:33 ` Heiko Carstens
  0 siblings, 0 replies; 13+ messages in thread
From: Heiko Carstens @ 2007-08-13 11:33 UTC (permalink / raw)
  To: linux-mm, dm-devel
  Cc: Stefan Weinhuber, Stefan Bader, Daniel Kobras, Andrew Morton,
	Linus Torvalds, Alasdair G Kergon

Hi,

The patch below went into 2.6.18. My question is: why doesn't it check
whether kmalloc(..., GFP_NOIO) returns a NULL pointer?
Did I miss anything that guarantees that this will always succeed, or is it
just a bug?

commit c06aad854fdb9da38fcc22dccfe9d72919453e43
Author: Daniel Kobras <kobras@linux.de>
Date:   Sun Aug 27 01:23:24 2006 -0700

    [PATCH] dm: Fix deadlock under high i/o load in raid1 setup.
    
    On an nForce4-equipped machine with two SATA disks in raid1 setup using dmraid,
    we experienced frequent deadlocks of the system under high i/o load.  'cat
    /dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly
    after a few GB, 'cp' would be left in 'D' state along with kjournald and
    kmirrord.  The functions in which cp and kjournald were blocked varied, but
    kmirrord's wchan always pointed to 'mempool_alloc()'.  We've seen this pattern
    on 2.6.15 and 2.6.17 kernels.  http://lkml.org/lkml/2005/4/20/142 indicates
    that this problem has been around even before.
    
    So much for the facts, here's my interpretation: mempool_alloc() first tries
    to atomically allocate the requested memory, or falls back to hand out
    preallocated chunks from the mempool.  If both fail, it puts the calling
    process (kmirrord in this case) on a private waitqueue until somebody refills
    the pool.  Here the only 'somebody' is kmirrord itself, so we have a
    deadlock.
    
    I worked around this problem by falling back to a (blocking) kmalloc where
    kmirrord would previously have ended up on the waitqueue.  This defeats part of
    the benefits of using the mempool, but at least keeps the system running.  And
    it could be done with a two-line change.  Note that mempool_alloc() clears the
    GFP_NOIO flag internally, and only uses it to decide whether to wait or return
    an error if immediate allocation fails, so the attached patch doesn't change
    behaviour in the non-deadlocking case.  Patch is against current git
    (2.6.18-rc4), but should apply to earlier versions as well.  I've tested on
    2.6.15, where this patch makes the difference between random lockup and a
    stable system.
    
    Signed-off-by: Daniel Kobras <kobras@linux.de>
    Acked-by: Alasdair G Kergon <agk@redhat.com>
    Cc: <stable@kernel.org>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index be48ced..c54de98 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -255,7 +255,9 @@ static struct region *__rh_alloc(struct region_hash *rh, region_t region)
 	struct region *reg, *nreg;
 
 	read_unlock(&rh->hash_lock);
-	nreg = mempool_alloc(rh->region_pool, GFP_NOIO);
+	nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC);
+	if (unlikely(!nreg))
+		nreg = kmalloc(sizeof(struct region), GFP_NOIO);
 	nreg->state = rh->log->type->in_sync(rh->log, region, 1) ?
 		RH_CLEAN : RH_NOSYNC;
 	nreg->rh = rh;
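
For illustration, the defensive pattern the question asks about might look like the sketch below. It is a userspace model under stated assumptions, not a proposed kernel patch: `toy_region`, `toy_region_alloc`, and `always_fail` are hypothetical names, the allocators are passed in so that failure can be simulated, and the caller is assumed to cope with a NULL return, which the callers of __rh_alloc() are not shown to do in the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct region; only the fields needed here. */
struct toy_region {
	int state;
	void *owner;
};

/* Allocator signature, so tests can inject an allocator that fails. */
typedef void *(*alloc_fn)(size_t size);

/*
 * Try the pool first, then the fallback allocator, and check the result
 * of BOTH before touching the object.  Returns NULL if both fail.
 */
static struct toy_region *toy_region_alloc(alloc_fn try_pool,
					   alloc_fn try_fallback,
					   void *owner)
{
	struct toy_region *nreg;

	assert(try_pool && try_fallback);

	nreg = try_pool(sizeof(*nreg));
	if (!nreg)
		nreg = try_fallback(sizeof(*nreg));
	if (!nreg)
		return NULL;	/* the check the patch above leaves out */

	nreg->state = 0;
	nreg->owner = owner;
	return nreg;
}

/* Simulates an allocator under memory pressure: it never succeeds. */
static void *always_fail(size_t size)
{
	(void)size;
	return NULL;
}
```

Whether a small GFP_NOIO allocation can actually fail in the kernel is the open question in this message; the sketch only shows where a check would go if it can.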

* [PATCH] dm: Fix deadlock under high i/o load in raid1 setup.
@ 2006-08-09 16:44 Daniel Kobras
  2006-08-12 20:02 ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Kobras @ 2006-08-09 16:44 UTC (permalink / raw)
  To: dm-devel; +Cc: linux-kernel

Implement private fallback if immediate allocation from mempool fails.
Standard mempool_alloc() fallback can yield a deadlock when only the
calling process is able to refill the pool. In out-of-memory situations,
instead of waiting for itself, kmirrord now waits for someone else to
free some space, using a standard blocking allocation.

Signed-off-by: Daniel Kobras <kobras@linux.de>
---
[Resending with Cc to l-k. First attempt apparently hasn't made it through to 
dm-devel.]

Hi!

On an nForce4-equipped machine with two SATA disks in raid1 setup using
dmraid, we experienced frequent deadlocks of the system under high i/o
load. 'cat /dev/zero > ~/zero' was the most reliable way to reproduce
them: Randomly after a few GB, 'cp' would be left in 'D' state along
with kjournald and kmirrord. The functions in which cp and kjournald
were blocked varied, but kmirrord's wchan always pointed to 'mempool_alloc()'.
We've seen this pattern on 2.6.15 and 2.6.17 kernels.
http://lkml.org/lkml/2005/4/20/142 indicates that this problem has been
around even before.

So much for the facts, here's my interpretation: mempool_alloc() first
tries to atomically allocate the requested memory, or falls back to hand
out preallocated chunks from the mempool. If both fail, it puts the
calling process (kmirrord in this case) on a private waitqueue until
somebody refills the pool. Here the only 'somebody' is kmirrord itself,
so we have a deadlock.
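
The allocation order described above can be modelled in userspace roughly as follows. This is a toy sketch, not the kernel's mempool code: `toy_pool`, `toy_pool_alloc`, and the `POOL_*` results are hypothetical names, and the blocking step is reduced to a return value instead of a real sleep on a waitqueue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum pool_result { POOL_OK, POOL_WOULD_FAIL, POOL_WOULD_SLEEP };

struct toy_pool {
	void *reserve[4];	/* preallocated elements */
	int nr_reserved;	/* number of reserve slots still filled */
	int allocator_ok;	/* simulates whether normal allocation succeeds */
	size_t elem_size;
};

static enum pool_result toy_pool_alloc(struct toy_pool *p, int may_wait,
				       void **out)
{
	assert(p && out);
	*out = NULL;

	/* Step 1: opportunistic, non-blocking attempt on the normal allocator. */
	if (p->allocator_ok) {
		*out = malloc(p->elem_size);
		if (*out)
			return POOL_OK;
	}

	/* Step 2: hand out a preallocated element from the reserve. */
	if (p->nr_reserved > 0) {
		*out = p->reserve[--p->nr_reserved];
		return POOL_OK;
	}

	/*
	 * Step 3: nothing left.  A caller that may wait would now sleep
	 * until somebody returns an element; a caller that may not wait
	 * gets a failure back immediately.
	 */
	return may_wait ? POOL_WOULD_SLEEP : POOL_WOULD_FAIL;
}
```

If the only thread that ever returns elements to the pool is the one that receives `POOL_WOULD_SLEEP`, nothing can wake it, which is the deadlock described above.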

I worked around this problem by falling back to a (blocking) kmalloc
where kmirrord would previously have ended up on the waitqueue. This defeats
part of the benefits of using the mempool, but at least keeps the system
running. And it could be done with a two-line change. Note that
mempool_alloc() clears the GFP_NOIO flag internally, and only uses it to
decide whether to wait or return an error if immediate allocation fails,
so the attached patch doesn't change behaviour in the non-deadlocking case.
Patch is against current git (2.6.18-rc4), but should apply to earlier
versions as well. I've tested on 2.6.15, where this patch makes the
difference between random lockup and a stable system.

Regards,

Daniel.

diff -r dcc321d1340a -r d52bb3a14d60 drivers/md/dm-raid1.c
--- a/drivers/md/dm-raid1.c	Sun Aug 06 19:00:05 2006 +0000
+++ b/drivers/md/dm-raid1.c	Mon Aug 07 23:16:44 2006 +0200
@@ -255,7 +255,9 @@ static struct region *__rh_alloc(struct 
 	struct region *reg, *nreg;
 
 	read_unlock(&rh->hash_lock);
-	nreg = mempool_alloc(rh->region_pool, GFP_NOIO);
+	nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC);
+	if (unlikely(!nreg))
+		nreg = kmalloc(sizeof(struct region), GFP_NOIO);
 	nreg->state = rh->log->type->in_sync(rh->log, region, 1) ?
 		RH_CLEAN : RH_NOSYNC;
 	nreg->rh = rh;

* [PATCH] dm: Fix deadlock under high i/o load in raid1 setup.
@ 2006-08-07 22:36 Daniel Kobras
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Kobras @ 2006-08-07 22:36 UTC (permalink / raw)
  To: dm-devel

Implement private fallback if immediate allocation from mempool fails.
Standard mempool_alloc() fallback can yield a deadlock when only the
calling process is able to refill the pool. In out-of-memory situations,
instead of waiting for itself, kmirrord now waits for someone else to
free some space, using a standard blocking allocation.

Signed-off-by: Daniel Kobras <kobras@linux.de>
