linux-raid.vger.kernel.org archive mirror
* 2.6.24-rc6 reproducible raid5 hang
@ 2007-12-27 17:06 dean gaudet
  2007-12-27 17:39 ` dean gaudet
  2007-12-27 19:52 ` Justin Piszcz
  0 siblings, 2 replies; 30+ messages in thread
From: dean gaudet @ 2007-12-27 17:06 UTC (permalink / raw)
  To: linux-raid

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1093 bytes --]

hey neil -- remember that raid5 hang which me and only one or two others 
ever experienced and which was hard to reproduce?  we were debugging it 
well over a year ago (that box has 400+ day uptime now so at least that 
long ago :)  the workaround was to increase stripe_cache_size... i seem to 
have a way to reproduce something which looks much the same.

setup:

- 2.6.24-rc6
- system has 8GiB RAM but no swap
- 8x750GB in a raid5 with one spare, chunksize 1024KiB.
- mkfs.xfs default options
- mount -o noatime
- dd if=/dev/zero of=/mnt/foo bs=4k count=2621440

that sequence hangs for me within 10 seconds... and i can unhang / rehang 
it by toggling between stripe_cache_size 256 and 1024.  i detect the hang 
by watching "iostat -kx /dev/sd? 5".
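
for reference, the toggling above amounts to something like the following
(sketch only -- md2 is assumed here, substitute the right md device):

  cat /sys/block/md2/md/stripe_cache_active         # stripes currently in use
  cat /sys/block/md2/md/stripe_cache_size           # default is 256
  echo 1024 > /sys/block/md2/md/stripe_cache_size   # unhang
  echo 256 > /sys/block/md2/md/stripe_cache_size    # back to the hang-prone setting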

i've attached the kernel log where i dumped task and timer state while it 
was hung... note that you'll see at some point i did an xfs mount with 
external journal but it happens with internal journal as well.

looks like it's using the raid456 module and async api.

anyhow let me know if you need more info / have any suggestions.

-dean

[-- Attachment #2: Type: APPLICATION/octet-stream, Size: 19281 bytes --]

[-- Attachment #3: Type: APPLICATION/octet-stream, Size: 25339 bytes --]


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-27 17:06 dean gaudet
@ 2007-12-27 17:39 ` dean gaudet
  2007-12-29 16:48   ` dean gaudet
  2007-12-27 19:52 ` Justin Piszcz
  1 sibling, 1 reply; 30+ messages in thread
From: dean gaudet @ 2007-12-27 17:39 UTC (permalink / raw)
  To: linux-raid

hmm this seems more serious... i just ran into it with chunksize 64KiB and 
while just untarring a bunch of linux kernels in parallel... increasing 
stripe_cache_size did the trick again.

-dean

On Thu, 27 Dec 2007, dean gaudet wrote:

> hey neil -- remember that raid5 hang which me and only one or two others 
> ever experienced and which was hard to reproduce?  we were debugging it 
> well over a year ago (that box has 400+ day uptime now so at least that 
> long ago :)  the workaround was to increase stripe_cache_size... i seem to 
> have a way to reproduce something which looks much the same.
> 
> setup:
> 
> - 2.6.24-rc6
> - system has 8GiB RAM but no swap
> - 8x750GB in a raid5 with one spare, chunksize 1024KiB.
> - mkfs.xfs default options
> - mount -o noatime
> - dd if=/dev/zero of=/mnt/foo bs=4k count=2621440
> 
> that sequence hangs for me within 10 seconds... and i can unhang / rehang 
> it by toggling between stripe_cache_size 256 and 1024.  i detect the hang 
> by watching "iostat -kx /dev/sd? 5".
> 
> i've attached the kernel log where i dumped task and timer state while it 
> was hung... note that you'll see at some point i did an xfs mount with 
> external journal but it happens with internal journal as well.
> 
> looks like it's using the raid456 module and async api.
> 
> anyhow let me know if you need more info / have any suggestions.
> 
> -dean


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-27 17:06 dean gaudet
  2007-12-27 17:39 ` dean gaudet
@ 2007-12-27 19:52 ` Justin Piszcz
  2007-12-28  0:08   ` dean gaudet
  1 sibling, 1 reply; 30+ messages in thread
From: Justin Piszcz @ 2007-12-27 19:52 UTC (permalink / raw)
  To: dean gaudet; +Cc: linux-raid



On Thu, 27 Dec 2007, dean gaudet wrote:

> hey neil -- remember that raid5 hang which me and only one or two others
> ever experienced and which was hard to reproduce?  we were debugging it
> well over a year ago (that box has 400+ day uptime now so at least that
> long ago :)  the workaround was to increase stripe_cache_size... i seem to
> have a way to reproduce something which looks much the same.
>
> setup:
>
> - 2.6.24-rc6
> - system has 8GiB RAM but no swap
> - 8x750GB in a raid5 with one spare, chunksize 1024KiB.
> - mkfs.xfs default options
> - mount -o noatime
> - dd if=/dev/zero of=/mnt/foo bs=4k count=2621440
>
> that sequence hangs for me within 10 seconds... and i can unhang / rehang
> it by toggling between stripe_cache_size 256 and 1024.  i detect the hang
> by watching "iostat -kx /dev/sd? 5".
>
> i've attached the kernel log where i dumped task and timer state while it
> was hung... note that you'll see at some point i did an xfs mount with
> external journal but it happens with internal journal as well.
>
> looks like it's using the raid456 module and async api.
>
> anyhow let me know if you need more info / have any suggestions.
>
> -dean

With that high of a stripe size the stripe_cache_size needs to be greater 
than the default to handle it.

Justin.


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-27 19:52 ` Justin Piszcz
@ 2007-12-28  0:08   ` dean gaudet
  0 siblings, 0 replies; 30+ messages in thread
From: dean gaudet @ 2007-12-28  0:08 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-raid

[-- Attachment #1: Type: TEXT/PLAIN, Size: 474 bytes --]

On Thu, 27 Dec 2007, Justin Piszcz wrote:

> With that high of a stripe size the stripe_cache_size needs to be greater than
> the default to handle it.

i'd argue that any deadlock is a bug...

regardless i'm still seeing deadlocks with the default chunk_size of 64k 
and stripe_cache_size of 256... in this case it's with a workload which is 
untarring 34 copies of the linux kernel at the same time.  it's a variant 
of doug ledford's memtest, and i've attached it.

-dean

[-- Attachment #2: Type: TEXT/PLAIN, Size: 4046 bytes --]

#!/usr/bin/perl

# Copyright (c) 2007 dean gaudet <dean@arctic.org>
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
# OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
# ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.

# this idea shamelessly stolen from doug ledford

use warnings;
use strict;

# ensure stdout is not buffered
select(STDOUT); $| = 1;

my $usage = "usage: $0 linux.tar.gz /path1 [/path2 ...]\n";
defined(my $tarball = shift) or die $usage;
-f $tarball or die "$tarball does not exist or is not a file\n";

my @paths = @ARGV;
$#paths >= 0 or die "$usage";

# determine size of uncompressed tarball
open(GZIP, "-|") || exec "gzip", "--quiet", "--list", $tarball;
my $line = <GZIP>;
my ($tarball_size) = $line =~ m#^\s*\d+\s*(\d+)#;
defined($tarball_size) or die "unexpected result from gzip --quiet --list $tarball\n";
close(GZIP);

# determine amount of memory
open(MEMINFO, "</proc/meminfo")
        or die "unable to open /proc/meminfo for read: $!\n";
my $total_mem;
while (<MEMINFO>) {
  if (/^MemTotal:\s*(\d+)\s*kB/) {
    $total_mem = $1;
    last;
  }
}
defined($total_mem) or die "did not find MemTotal line in /proc/meminfo\n";
close(MEMINFO);
$total_mem *= 1024;

print "total memory: $total_mem\n";
print "uncompressed tarball: $tarball_size\n";
my $nr_simultaneous = int(1.2 * $total_mem / $tarball_size);
print "nr simultaneous processes: $nr_simultaneous\n";

sub system_or_die {
  my @args = @_;
  system(@args);
  if ($? == -1) {
    # system() sets $? to -1 when the command could not be started at all
    my $msg = sprintf("%s failed to exec %s: $!\n", scalar(localtime), $args[0]);
    die $msg;
  }
  elsif ($? & 127) {
    my $msg = sprintf("%s %s died with signal %d, %s coredump\n",
        scalar(localtime), $args[0], ($? & 127), ($? & 128) ? "with" : "without");
    die $msg;
  }
  elsif (($? >> 8) != 0) {
    my $msg = sprintf("%s %s exited with non-zero exit code %d\n",
        scalar(localtime), $args[0], $? >> 8);
    die $msg;
  }
}

sub untar($) {
  mkdir($_[0]) or die localtime()." unable to mkdir($_[0]): $!\n";
  system_or_die("tar", "-xzf", $tarball, "-C", $_[0]);
}

print localtime()." untarring golden copy\n";
my $golden = $paths[0]."/dma_tmp.$$.gold";
untar($golden);

my $pass_no = 0;
while (1) {
  print localtime()." pass $pass_no: extracting\n";
  my @outputs;
  foreach my $n (1..$nr_simultaneous) {
    # treat paths in a round-robin manner
    my $dir = shift(@paths);
    push(@paths, $dir);

    $dir .= "/dma_tmp.$$.$n";
    push(@outputs, $dir);

    my $pid = fork;
    defined($pid) or die localtime()." unable to fork: $!\n";
    if ($pid == 0) {
      untar($dir);
      exit(0);
    }
  }

  # wait for the children
  while (wait != -1) {}

  print localtime()." pass $pass_no: diffing\n";
  foreach my $dir (@outputs) {
    my $pid = fork;
    defined($pid) or die localtime()." unable to fork: $!\n";
    if ($pid == 0) {
      system_or_die("diff", "-U", "3", "-rN", $golden, $dir);
      system_or_die("rm", "-fr", $dir);
      exit(0);
    }
  }

  # wait for the children
  while (wait != -1) {}

  ++$pass_no;
}


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-27 17:39 ` dean gaudet
@ 2007-12-29 16:48   ` dean gaudet
  2007-12-29 20:47     ` Dan Williams
  0 siblings, 1 reply; 30+ messages in thread
From: dean gaudet @ 2007-12-29 16:48 UTC (permalink / raw)
  To: linux-raid

hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on 
the same 64k chunk array and had raised the stripe_cache_size to 1024... 
and got a hang.  this time i grabbed stripe_cache_active before bumping 
the size again -- it was only 905 active.  as i recall the bug we were 
debugging a year+ ago the active was at the size when it would hang.  so 
this is probably something new.

anyhow raising it to 2048 got it unstuck, but i'm guessing i'll be able to 
hit that limit too if i try harder :)

btw what units are stripe_cache_size/active in?  is the memory consumed 
equal to (chunk_size * raid_disks * stripe_cache_size) or (chunk_size * 
raid_disks * stripe_cache_active)?

-dean

On Thu, 27 Dec 2007, dean gaudet wrote:

> hmm this seems more serious... i just ran into it with chunksize 64KiB and 
> while just untarring a bunch of linux kernels in parallel... increasing 
> stripe_cache_size did the trick again.
> 
> -dean
> 
> On Thu, 27 Dec 2007, dean gaudet wrote:
> 
> > hey neil -- remember that raid5 hang which me and only one or two others 
> > ever experienced and which was hard to reproduce?  we were debugging it 
> > well over a year ago (that box has 400+ day uptime now so at least that 
> > long ago :)  the workaround was to increase stripe_cache_size... i seem to 
> > have a way to reproduce something which looks much the same.
> > 
> > setup:
> > 
> > - 2.6.24-rc6
> > - system has 8GiB RAM but no swap
> > - 8x750GB in a raid5 with one spare, chunksize 1024KiB.
> > - mkfs.xfs default options
> > - mount -o noatime
> > - dd if=/dev/zero of=/mnt/foo bs=4k count=2621440
> > 
> > that sequence hangs for me within 10 seconds... and i can unhang / rehang 
> > it by toggling between stripe_cache_size 256 and 1024.  i detect the hang 
> > by watching "iostat -kx /dev/sd? 5".
> > 
> > i've attached the kernel log where i dumped task and timer state while it 
> > was hung... note that you'll see at some point i did an xfs mount with 
> > external journal but it happens with internal journal as well.
> > 
> > looks like it's using the raid456 module and async api.
> > 
> > anyhow let me know if you need more info / have any suggestions.
> > 
> > -dean


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-29 16:48   ` dean gaudet
@ 2007-12-29 20:47     ` Dan Williams
  2007-12-29 20:58       ` dean gaudet
  0 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2007-12-29 20:47 UTC (permalink / raw)
  To: dean gaudet; +Cc: linux-raid

On Dec 29, 2007 9:48 AM, dean gaudet <dean@arctic.org> wrote:
> hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on
> the same 64k chunk array and had raised the stripe_cache_size to 1024...
> and got a hang.  this time i grabbed stripe_cache_active before bumping
> the size again -- it was only 905 active.  as i recall the bug we were
> debugging a year+ ago the active was at the size when it would hang.  so
> this is probably something new.

I believe I am seeing the same issue and am trying to track down
whether XFS is doing something unexpected, i.e. I have not been able
to reproduce the problem with EXT3.  MD tries to increase throughput
by letting some stripe work build up in batches.  It looks like every
time your system has hung it has been in the 'inactive_blocked' state
i.e. > 3/4 of stripes active.  This state should automatically
clear...
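
A rough way to watch for that state from userspace is to compare the two
sysfs counters against the 3/4 threshold (just a sketch; md2 and the 5
second poll interval are arbitrary):

  while sleep 5; do
    size=$(cat /sys/block/md2/md/stripe_cache_size)
    active=$(cat /sys/block/md2/md/stripe_cache_active)
    echo "active=$active size=$size blocked_threshold=$((size * 3 / 4))"
  done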

>
> anyhow raising it to 2048 got it unstuck, but i'm guessing i'll be able to
> hit that limit too if i try harder :)

Once you hang if 'stripe_cache_size' is increased such that
stripe_cache_active < 3/4 * stripe_cache_size things will start
flowing again.

>
> btw what units are stripe_cache_size/active in?  is the memory consumed
> equal to (chunk_size * raid_disks * stripe_cache_size) or (chunk_size *
> raid_disks * stripe_cache_active)?
>

memory_consumed = PAGE_SIZE * raid_disks * stripe_cache_size
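
As a worked example for the 7-active-disk (plus one spare) array described
earlier, assuming 4 KiB pages:

  echo $((4096 * 7 * 1024))   # 29360128 bytes, i.e. 28 MiB at stripe_cache_size=1024
  echo $((4096 * 7 * 256))    # 7340032 bytes, i.e. 7 MiB at the default of 256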

>
> -dean
>

--
Dan


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-29 20:47     ` Dan Williams
@ 2007-12-29 20:58       ` dean gaudet
  2007-12-29 21:50         ` Justin Piszcz
  2007-12-29 22:06         ` Dan Williams
  0 siblings, 2 replies; 30+ messages in thread
From: dean gaudet @ 2007-12-29 20:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-raid

On Sat, 29 Dec 2007, Dan Williams wrote:

> On Dec 29, 2007 9:48 AM, dean gaudet <dean@arctic.org> wrote:
> > hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on
> > the same 64k chunk array and had raised the stripe_cache_size to 1024...
> > and got a hang.  this time i grabbed stripe_cache_active before bumping
> > the size again -- it was only 905 active.  as i recall the bug we were
> > debugging a year+ ago the active was at the size when it would hang.  so
> > this is probably something new.
> 
> I believe I am seeing the same issue and am trying to track down
> whether XFS is doing something unexpected, i.e. I have not been able
> to reproduce the problem with EXT3.  MD tries to increase throughput
> by letting some stripe work build up in batches.  It looks like every
> time your system has hung it has been in the 'inactive_blocked' state
> i.e. > 3/4 of stripes active.  This state should automatically
> clear...

cool, glad you can reproduce it :)

i have a bit more data... i'm seeing the same problem on debian's 
2.6.22-3-amd64 kernel, so it's not new in 2.6.24.

i'm doing some more isolation but just grabbing kernels i have precompiled 
so far -- a 2.6.19.7 kernel doesn't show the problem, and early 
indications are a 2.6.21.7 kernel also doesn't have the problem but i'm 
giving it longer to show its head.

i'll try a stock 2.6.22 next depending on how the 2.6.21 test goes, just 
so we get the debian patches out of the way.

i was tempted to blame async api because it's newish :)  but according to 
the dmesg output it doesn't appear the 2.6.22-3-amd64 kernel used async 
API, and it still hung, so async is probably not to blame.

anyhow the test case i'm using is the dma_thrasher script i attached... it 
takes about an hour to give me confidence there's no problems so this will 
take a while.

-dean


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-29 20:58       ` dean gaudet
@ 2007-12-29 21:50         ` Justin Piszcz
  2007-12-29 22:11           ` dean gaudet
  2007-12-29 22:06         ` Dan Williams
  1 sibling, 1 reply; 30+ messages in thread
From: Justin Piszcz @ 2007-12-29 21:50 UTC (permalink / raw)
  To: dean gaudet; +Cc: Dan Williams, linux-raid



On Sat, 29 Dec 2007, dean gaudet wrote:

> On Sat, 29 Dec 2007, Dan Williams wrote:
>
>> On Dec 29, 2007 9:48 AM, dean gaudet <dean@arctic.org> wrote:
>>> hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on
>>> the same 64k chunk array and had raised the stripe_cache_size to 1024...
>>> and got a hang.  this time i grabbed stripe_cache_active before bumping
>>> the size again -- it was only 905 active.  as i recall the bug we were
>>> debugging a year+ ago the active was at the size when it would hang.  so
>>> this is probably something new.
>>
>> I believe I am seeing the same issue and am trying to track down
>> whether XFS is doing something unexpected, i.e. I have not been able
>> to reproduce the problem with EXT3.  MD tries to increase throughput
>> by letting some stripe work build up in batches.  It looks like every
>> time your system has hung it has been in the 'inactive_blocked' state
>> i.e. > 3/4 of stripes active.  This state should automatically
>> clear...
>
> cool, glad you can reproduce it :)
>
> i have a bit more data... i'm seeing the same problem on debian's
> 2.6.22-3-amd64 kernel, so it's not new in 2.6.24.
>
> i'm doing some more isolation but just grabbing kernels i have precompiled
> so far -- a 2.6.19.7 kernel doesn't show the problem, and early
> indications are a 2.6.21.7 kernel also doesn't have the problem but i'm
> giving it longer to show its head.
>
> i'll try a stock 2.6.22 next depending on how the 2.6.21 test goes, just
> so we get the debian patches out of the way.
>
> i was tempted to blame async api because it's newish :)  but according to
> the dmesg output it doesn't appear the 2.6.22-3-amd64 kernel used async
> API, and it still hung, so async is probably not to blame.
>
> anyhow the test case i'm using is the dma_thrasher script i attached... it
> takes about an hour to give me confidence there's no problems so this will
> take a while.
>
> -dean

Dean,

Curious btw what kind of filesystem size/raid type (5, but defaults 
I assume, nothing special right? (right-symmetric vs. 
left-symmetric, etc?)/cache size/chunk size(s) are you using/testing with?

The script you sent out earlier, you are able to reproduce it easily with 
31 or so kernel tar decompressions?

Justin.


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-29 20:58       ` dean gaudet
  2007-12-29 21:50         ` Justin Piszcz
@ 2007-12-29 22:06         ` Dan Williams
  2007-12-30 17:58           ` dean gaudet
  1 sibling, 1 reply; 30+ messages in thread
From: Dan Williams @ 2007-12-29 22:06 UTC (permalink / raw)
  To: dean gaudet; +Cc: linux-raid

On Dec 29, 2007 1:58 PM, dean gaudet <dean@arctic.org> wrote:
> On Sat, 29 Dec 2007, Dan Williams wrote:
>
> > On Dec 29, 2007 9:48 AM, dean gaudet <dean@arctic.org> wrote:
> > > hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on
> > > the same 64k chunk array and had raised the stripe_cache_size to 1024...
> > > and got a hang.  this time i grabbed stripe_cache_active before bumping
> > > the size again -- it was only 905 active.  as i recall the bug we were
> > > debugging a year+ ago the active was at the size when it would hang.  so
> > > this is probably something new.
> >
> > I believe I am seeing the same issue and am trying to track down
> > whether XFS is doing something unexpected, i.e. I have not been able
> > to reproduce the problem with EXT3.  MD tries to increase throughput
> > by letting some stripe work build up in batches.  It looks like every
> > time your system has hung it has been in the 'inactive_blocked' state
> > i.e. > 3/4 of stripes active.  This state should automatically
> > clear...
>
> cool, glad you can reproduce it :)
>
> i have a bit more data... i'm seeing the same problem on debian's
> 2.6.22-3-amd64 kernel, so it's not new in 2.6.24.
>

This is just brainstorming at this point, but it looks like xfs can
submit more requests in the bi_end_io path such that it can lock
itself out of the RAID array.  The sequence that concerns me is:

return_io->xfs_buf_end_io->xfs_buf_io_end->xfs_buf_iodone_work->xfs_buf_iorequest->make_request-><hang>

I need to verify whether this path is actually triggering, but if we are
in an inactive_blocked condition this new request will be put on a
wait queue and we'll never get to the release_stripe() call after
return_io().  It would be interesting to see if this is new XFS
behavior in recent kernels.

--
Dan


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-29 21:50         ` Justin Piszcz
@ 2007-12-29 22:11           ` dean gaudet
  2007-12-29 22:21             ` dean gaudet
  0 siblings, 1 reply; 30+ messages in thread
From: dean gaudet @ 2007-12-29 22:11 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: Dan Williams, linux-raid

On Sat, 29 Dec 2007, Justin Piszcz wrote:

> Curious btw what kind of filesystem size/raid type (5, but defaults I assume,
> nothing special right? (right-symmetric vs. left-symmetric, etc?)/cache
> size/chunk size(s) are you using/testing with?

mdadm --create --level=5 --chunk=64 -n7 -x1 /dev/md2 /dev/sd[a-h]1
mkfs.xfs -f /dev/md2

otherwise defaults

> The script you sent out earlier, you are able to reproduce it easily with 31
> or so kernel tar decompressions?

not sure, the point of the script is to untar more than there is RAM.  it 
happened with a single rsync running though -- 3.5M inodes from a remote 
box.  it also happens with the single 10GB dd write... although i've been 
using the tar method for testing different kernel revs.
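
as a ballpark of how many untars that works out to on this box (the ~250MB
uncompressed tarball size is only a guess here):

  mem=$((8 * 1024 * 1024 * 1024))    # 8GiB of RAM
  tar=$((250 * 1024 * 1024))         # assumed uncompressed linux tarball size
  echo $((12 * mem / tar / 10))      # 1.2 * mem / tar ~= 39 simultaneous untars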

-dean


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-29 22:11           ` dean gaudet
@ 2007-12-29 22:21             ` dean gaudet
  0 siblings, 0 replies; 30+ messages in thread
From: dean gaudet @ 2007-12-29 22:21 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: Dan Williams, linux-raid


On Sat, 29 Dec 2007, dean gaudet wrote:

> On Sat, 29 Dec 2007, Justin Piszcz wrote:
> 
> > Curious btw what kind of filesystem size/raid type (5, but defaults I assume,
> > nothing special right? (right-symmetric vs. left-symmetric, etc?)/cache
> > size/chunk size(s) are you using/testing with?
> 
> mdadm --create --level=5 --chunk=64 -n7 -x1 /dev/md2 /dev/sd[a-h]1
> mkfs.xfs -f /dev/md2
> 
> otherwise defaults

hmm i missed a few things, here's exactly how i created the array:

mdadm --create --level=5 --chunk=64 -n7 -x1 --assume-clean /dev/md2 /dev/sd[a-h]1

it's reassembled automagically each reboot, but i do this each reboot:

mkfs.xfs -f /dev/md2
mount -o noatime /dev/md2 /mnt/new
./dma_thrasher linux.tar.gz /mnt/new

the --assume-clean and noatime probably make no difference though...

on the bisection front it looks like it's new behaviour between 2.6.21.7 
and 2.6.22.15 (stock kernels now, not debian).

i've got to step out for a while, but i'll go at it again later, probably 
with git bisect unless someone has some cherry picked changes to suggest.
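
the bisect itself would look roughly like this (v2.6.21 / v2.6.22 being the
mainline tags that bracket the stable releases above):

  git bisect start
  git bisect bad v2.6.22
  git bisect good v2.6.21
  # build + boot each commit git offers, run the dma_thrasher workload,
  # then mark it with "git bisect good" or "git bisect bad"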

-dean


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-29 22:06         ` Dan Williams
@ 2007-12-30 17:58           ` dean gaudet
  2008-01-09 18:28             ` Dan Williams
  0 siblings, 1 reply; 30+ messages in thread
From: dean gaudet @ 2007-12-30 17:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-raid

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3103 bytes --]

On Sat, 29 Dec 2007, Dan Williams wrote:

> On Dec 29, 2007 1:58 PM, dean gaudet <dean@arctic.org> wrote:
> > On Sat, 29 Dec 2007, Dan Williams wrote:
> >
> > > On Dec 29, 2007 9:48 AM, dean gaudet <dean@arctic.org> wrote:
> > > > hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on
> > > > the same 64k chunk array and had raised the stripe_cache_size to 1024...
> > > > and got a hang.  this time i grabbed stripe_cache_active before bumping
> > > > the size again -- it was only 905 active.  as i recall the bug we were
> > > > debugging a year+ ago the active was at the size when it would hang.  so
> > > > this is probably something new.
> > >
> > > I believe I am seeing the same issue and am trying to track down
> > > whether XFS is doing something unexpected, i.e. I have not been able
> > > to reproduce the problem with EXT3.  MD tries to increase throughput
> > > by letting some stripe work build up in batches.  It looks like every
> > > time your system has hung it has been in the 'inactive_blocked' state
> > > i.e. > 3/4 of stripes active.  This state should automatically
> > > clear...
> >
> > cool, glad you can reproduce it :)
> >
> > i have a bit more data... i'm seeing the same problem on debian's
> > 2.6.22-3-amd64 kernel, so it's not new in 2.6.24.
> >
> 
> This is just brainstorming at this point, but it looks like xfs can
> submit more requests in the bi_end_io path such that it can lock
> itself out of the RAID array.  The sequence that concerns me is:
> 
> return_io->xfs_buf_end_io->xfs_buf_io_end->xfs_buf_iodone_work->xfs_buf_iorequest->make_request-><hang>
> 
> I need to verify whether this path is actually triggering, but if we are 
> in an inactive_blocked condition this new request will be put on a
> wait queue and we'll never get to the release_stripe() call after
> return_io().  It would be interesting to see if this is new XFS
> behavior in recent kernels.


i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1

which was Neil's change in 2.6.22 for deferring generic_make_request
until there's enough stack space for it.

with my git tree sync'd to that commit my test cases fail in under 20
minutes uptime (i rebooted and tested 3x).  sync'd to the commit previous
to it i've got 8h of run-time now without the problem.

this isn't definitive of course since it does seem to be timing
dependent, but since all failures have occurred much earlier than that 
for me so far i think this indicates this change is either the cause of
the problem or exacerbates an existing raid5 problem.

given that this problem looks like a very rare problem i saw with 2.6.18
(raid5+xfs there too) i'm thinking Neil's commit may just exacerbate an
existing problem... not that i have evidence either way.

i've attached a new kernel log with a hang at d89d87965d... and the
reduced config file i was using for the bisect.  hopefully the hang
looks the same as what we were seeing at 2.6.24-rc6.  let me know.

-dean

[-- Attachment #2: Type: APPLICATION/octet-stream, Size: 13738 bytes --]

[-- Attachment #3: Type: APPLICATION/octet-stream, Size: 7117 bytes --]


* Re: 2.6.24-rc6 reproducible raid5 hang
  2007-12-30 17:58           ` dean gaudet
@ 2008-01-09 18:28             ` Dan Williams
  2008-01-10  0:09               ` Neil Brown
  0 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2008-01-09 18:28 UTC (permalink / raw)
  To: dean gaudet; +Cc: linux-raid, neilb

On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote:
> On Sat, 29 Dec 2007, Dan Williams wrote:
> 
> > On Dec 29, 2007 1:58 PM, dean gaudet <dean@arctic.org> wrote: 
> > > On Sat, 29 Dec 2007, Dan Williams wrote: 
> > > 
> > > > On Dec 29, 2007 9:48 AM, dean gaudet <dean@arctic.org> wrote: 
> > > > > hmm bummer, i'm doing another test (rsync 3.5M inodes from another box) on 
> > > > > the same 64k chunk array and had raised the stripe_cache_size to 1024... 
> > > > > and got a hang.  this time i grabbed stripe_cache_active before bumping 
> > > > > the size again -- it was only 905 active.  as i recall the bug we were 
> > > > > debugging a year+ ago the active was at the size when it would hang.  so 
> > > > > this is probably something new. 
> > > > 
> > > > I believe I am seeing the same issue and am trying to track down 
> > > > whether XFS is doing something unexpected, i.e. I have not been able 
> > > > to reproduce the problem with EXT3.  MD tries to increase throughput 
> > > > by letting some stripe work build up in batches.  It looks like every 
> > > > time your system has hung it has been in the 'inactive_blocked' state 
> > > > i.e. > 3/4 of stripes active.  This state should automatically 
> > > > clear... 
> > > 
> > > cool, glad you can reproduce it :) 
> > > 
> > > i have a bit more data... i'm seeing the same problem on debian's 
> > > 2.6.22-3-amd64 kernel, so it's not new in 2.6.24. 
> > > 
> > 
> > This is just brainstorming at this point, but it looks like xfs can 
> > submit more requests in the bi_end_io path such that it can lock 
> > itself out of the RAID array.  The sequence that concerns me is: 
> > 
> > return_io->xfs_buf_end_io->xfs_buf_io_end->xfs_buf_iodone_work->xfs_buf_iorequest->make_request-><hang> 
> > 
> > I need to verify whether this path is actually triggering, but if we are 
> > in an inactive_blocked condition this new request will be put on a 
> > wait queue and we'll never get to the release_stripe() call after 
> > return_io().  It would be interesting to see if this is new XFS 
> > behavior in recent kernels.
> 
> 
> i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> 
> which was Neil's change in 2.6.22 for deferring generic_make_request 
> until there's enough stack space for it.
> 
> with my git tree sync'd to that commit my test cases fail in under 20 
> minutes uptime (i rebooted and tested 3x).  sync'd to the commit previous 
> to it i've got 8h of run-time now without the problem.
> 
> this isn't definitive of course since it does seem to be timing 
> dependent, but since all failures have occurred much earlier than that 
> for me so far i think this indicates this change is either the cause of 
> the problem or exacerbates an existing raid5 problem.
> 
> given that this problem looks like a very rare problem i saw with 2.6.18 
> (raid5+xfs there too) i'm thinking Neil's commit may just exacerbate an 
> existing problem... not that i have evidence either way.
> 
> i've attached a new kernel log with a hang at d89d87965d... and the 
> reduced config file i was using for the bisect.  hopefully the hang 
> looks the same as what we were seeing at 2.6.24-rc6.  let me know.
> 

Dean could you try the below patch to see if it fixes your failure
scenario?  It passes my test case.

Thanks,
Dan

------->
md: add generic_make_request_immed to prevent raid5 hang

From: Dan Williams <dan.j.williams@intel.com>

Commit d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 reduced stack utilization
by preventing recursive calls to generic_make_request.  However the
following conditions can cause raid5 to hang until 'stripe_cache_size' is
increased:

1/ stripe_cache_active is N stripes away from the 'inactive_blocked' limit
   (3/4 * stripe_cache_size)
2/ a bio is submitted that requires M stripes to be processed where M > N
3/ stripes 1 through N are up-to-date and ready for immediate processing,
   i.e. no trip through raid5d required

This results in the calling thread hanging while waiting for resources to
process stripes N through M.  This means we never return from make_request.
All other raid5 users pile up in get_active_stripe.  Increasing
stripe_cache_size temporarily resolves the blockage by allowing the blocked
make_request to return to generic_make_request.

Another way to solve this is to move all i/o submission to raid5d context.

Thanks to Dean Gaudet for bisecting this down to d89d8796.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---

 block/ll_rw_blk.c      |   16 +++++++++++++---
 drivers/md/raid5.c     |    4 ++--
 include/linux/blkdev.h |    1 +
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 8b91994..bff40c2 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -3287,16 +3287,26 @@ end_io:
 }
 
 /*
- * We only want one ->make_request_fn to be active at a time,
- * else stack usage with stacked devices could be a problem.
+ * In the general case we only want one ->make_request_fn to be active
+ * at a time, else stack usage with stacked devices could be a problem.
  * So use current->bio_{list,tail} to keep a list of requests
  * submited by a make_request_fn function.
  * current->bio_tail is also used as a flag to say if
  * generic_make_request is currently active in this task or not.
  * If it is NULL, then no make_request is active.  If it is non-NULL,
  * then a make_request is active, and new requests should be added
- * at the tail
+ * at the tail.
+ * However, some stacking drivers, like md-raid5, need to submit
+ * the bio without delay when it may not have the resources to
+ * complete its q->make_request_fn.  generic_make_request_immed is
+ * provided for this explicit purpose.
  */
+void generic_make_request_immed(struct bio *bio)
+{
+	__generic_make_request(bio);
+}
+EXPORT_SYMBOL(generic_make_request_immed);
+
 void generic_make_request(struct bio *bio)
 {
 	if (current->bio_tail) {
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c857b5a..ffa2be4 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -450,7 +450,7 @@ static void ops_run_io(struct stripe_head *sh)
 			    test_bit(R5_ReWrite, &sh->dev[i].flags))
 				atomic_add(STRIPE_SECTORS,
 					&rdev->corrected_errors);
-			generic_make_request(bi);
+			generic_make_request_immed(bi);
 		} else {
 			if (rw == WRITE)
 				set_bit(STRIPE_DEGRADED, &sh->state);
@@ -3124,7 +3124,7 @@ static void handle_stripe6(struct stripe_head *sh, struct page *tmp_page)
 			if (rw == WRITE &&
 			    test_bit(R5_ReWrite, &sh->dev[i].flags))
 				atomic_add(STRIPE_SECTORS, &rdev->corrected_errors);
-			generic_make_request(bi);
+			generic_make_request_immed(bi);
 		} else {
 			if (rw == WRITE)
 				set_bit(STRIPE_DEGRADED, &sh->state);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d18ee67..774a3a0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -642,6 +642,7 @@ extern int blk_register_queue(struct gendisk *disk);
 extern void blk_unregister_queue(struct gendisk *disk);
 extern void register_disk(struct gendisk *dev);
 extern void generic_make_request(struct bio *bio);
+extern void generic_make_request_immed(struct bio *bio);
 extern void blk_put_request(struct request *);
 extern void __blk_put_request(struct request_queue *, struct request *);
 extern void blk_end_sync_rq(struct request *rq, int error);




* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-09 18:28             ` Dan Williams
@ 2008-01-10  0:09               ` Neil Brown
  2008-01-10  3:07                 ` Dan Williams
                                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Neil Brown @ 2008-01-10  0:09 UTC (permalink / raw)
  To: Dan Williams; +Cc: dean gaudet, linux-raid

On Wednesday January 9, dan.j.williams@intel.com wrote:
> On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote:
> > i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > 
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > 
> > which was Neil's change in 2.6.22 for deferring generic_make_request 
> > until there's enough stack space for it.
> > 
> 
> Commit d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 reduced stack utilization
> by preventing recursive calls to generic_make_request.  However the
> following conditions can cause raid5 to hang until 'stripe_cache_size' is
> increased:
> 

Thanks for pursuing this guys.  That explanation certainly sounds very
credible.

The generic_make_request_immed is a good way to confirm that we have
found the bug,  but I don't like it as a long term solution, as it
just reintroduced the problem that we were trying to solve with the
problematic commit.

As you say, we could arrange that all request submission happens in
raid5d and I think this is the right way to proceed.  However we can
still take some of the work into the thread that is submitting the
IO by calling "raid5d()" at the end of make_request, like this.

Can you test it please?  Does it seem reasonable?

Thanks,
NeilBrown


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c    |    2 +-
 ./drivers/md/raid5.c |    4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2008-01-07 13:32:10.000000000 +1100
+++ ./drivers/md/md.c	2008-01-10 11:08:02.000000000 +1100
@@ -5774,7 +5774,7 @@ void md_check_recovery(mddev_t *mddev)
 	if (mddev->ro)
 		return;
 
-	if (signal_pending(current)) {
+	if (current == mddev->thread->tsk && signal_pending(current)) {
 		if (mddev->pers->sync_request) {
 			printk(KERN_INFO "md: %s in immediate safe mode\n",
 			       mdname(mddev));

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2008-01-07 13:32:10.000000000 +1100
+++ ./drivers/md/raid5.c	2008-01-10 11:06:54.000000000 +1100
@@ -3432,6 +3432,7 @@ static int chunk_aligned_read(struct req
 	}
 }
 
+static void raid5d (mddev_t *mddev);
 
 static int make_request(struct request_queue *q, struct bio * bi)
 {
@@ -3547,7 +3548,7 @@ static int make_request(struct request_q
 				goto retry;
 			}
 			finish_wait(&conf->wait_for_overlap, &w);
-			handle_stripe(sh, NULL);
+			set_bit(STRIPE_HANDLE, &sh->state);
 			release_stripe(sh);
 		} else {
 			/* cannot get stripe for read-ahead, just give-up */
@@ -3569,6 +3570,7 @@ static int make_request(struct request_q
 			      test_bit(BIO_UPTODATE, &bi->bi_flags)
 			        ? 0 : -EIO);
 	}
+	raid5d(mddev);
 	return 0;
 }
 


* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-10  0:09               ` Neil Brown
@ 2008-01-10  3:07                 ` Dan Williams
  2008-01-10  3:57                   ` Neil Brown
  2008-01-10  7:13                 ` dean gaudet
  2008-01-10 17:59                 ` dean gaudet
  2 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2008-01-10  3:07 UTC (permalink / raw)
  To: Neil Brown; +Cc: dean gaudet, linux-raid

On Jan 9, 2008 5:09 PM, Neil Brown <neilb@suse.de> wrote:
> On Wednesday January 9, dan.j.williams@intel.com wrote:
> > On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote:
> > > i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > >
> > > which was Neil's change in 2.6.22 for deferring generic_make_request
> > > until there's enough stack space for it.
> > >
> >
> > Commit d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 reduced stack utilization
> > by preventing recursive calls to generic_make_request.  However the
> > following conditions can cause raid5 to hang until 'stripe_cache_size' is
> > increased:
> >
>
> Thanks for pursuing this guys.  That explanation certainly sounds very
> credible.
>
> The generic_make_request_immed is a good way to confirm that we have
> found the bug,  but I don't like it as a long term solution, as it
> just reintroduced the problem that we were trying to solve with the
> problematic commit.
>
> As you say, we could arrange that all request submission happens in
> raid5d and I think this is the right way to proceed.  However we can
> still take some of the work into the thread that is submitting the
> IO by calling "raid5d()" at the end of make_request, like this.
>
> Can you test it please?

This passes my failure case.

However, my test is different from Dean's in that I am using tiobench
and the latest rev of my 'get_priority_stripe' patch. I believe the
failure mechanism is the same, but it would be good to get
confirmation from Dean.  get_priority_stripe has the effect of
increasing the frequency of
make_request->handle_stripe->generic_make_request sequences.

> Does it seem reasonable?

What do you think about limiting the number of stripes the submitting
thread handles to be equal to what it submitted?  If I'm a thread that
only submits 1 stripe worth of work, should I get stuck handling the
rest of the cache?

Regards,
Dan


* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-10  3:07                 ` Dan Williams
@ 2008-01-10  3:57                   ` Neil Brown
  2008-01-10  4:56                     ` Dan Williams
  2008-01-10 20:28                     ` Bill Davidsen
  0 siblings, 2 replies; 30+ messages in thread
From: Neil Brown @ 2008-01-10  3:57 UTC (permalink / raw)
  To: Dan Williams; +Cc: dean gaudet, linux-raid

On Wednesday January 9, dan.j.williams@intel.com wrote:
> On Jan 9, 2008 5:09 PM, Neil Brown <neilb@suse.de> wrote:
> > On Wednesday January 9, dan.j.williams@intel.com wrote:
> >
> > Can you test it please?
> 
> This passes my failure case.

Thanks!

> 
> > Does it seem reasonable?
> 
> What do you think about limiting the number of stripes the submitting
> thread handles to be equal to what it submitted?  If I'm a thread that
> only submits 1 stripe worth of work, should I get stuck handling the
> rest of the cache?

Dunno....
Someone has to do the work, and leaving it all to raid5d means that it
all gets done on one CPU.
I expect that most of the time the queue of ready stripes is empty so
make_request will mostly only handle its own stripes anyway.
The times that it handles other threads' stripes will probably balance
out with the times that other threads handle this thread's stripes.

So I'm inclined to leave it as "do as much work as is available to be
done" as that is simplest.  But I can probably be talked out of it
with a convincing argument....

NeilBrown


* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-10  3:57                   ` Neil Brown
@ 2008-01-10  4:56                     ` Dan Williams
  2008-01-10 20:28                     ` Bill Davidsen
  1 sibling, 0 replies; 30+ messages in thread
From: Dan Williams @ 2008-01-10  4:56 UTC (permalink / raw)
  To: Neil Brown; +Cc: dean gaudet, linux-raid

On Wed, 2008-01-09 at 20:57 -0700, Neil Brown wrote:
> So I'm incline to leave it as "do as much work as is available to be
> done" as that is simplest.  But I can probably be talked out of it
> with a convincing argument....

Well, in an age of CFS and CFQ it smacks of 'unfairness'.  But does that
trump KISS...? Probably not.

--
Dan



* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-10  0:09               ` Neil Brown
  2008-01-10  3:07                 ` Dan Williams
@ 2008-01-10  7:13                 ` dean gaudet
  2008-01-10 18:49                   ` Dan Williams
  2008-01-10 17:59                 ` dean gaudet
  2 siblings, 1 reply; 30+ messages in thread
From: dean gaudet @ 2008-01-10  7:13 UTC (permalink / raw)
  To: Neil Brown; +Cc: Dan Williams, linux-raid

On Thu, 10 Jan 2008, Neil Brown wrote:

> On Wednesday January 9, dan.j.williams@intel.com wrote:
> > On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote:
> > > i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > > 
> > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > > 
> > > which was Neil's change in 2.6.22 for deferring generic_make_request 
> > > until there's enough stack space for it.
> > > 
> > 
> > Commit d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 reduced stack utilization
> > by preventing recursive calls to generic_make_request.  However the
> > following conditions can cause raid5 to hang until 'stripe_cache_size' is
> > increased:
> > 
> 
> Thanks for pursuing this guys.  That explanation certainly sounds very
> credible.
> 
> The generic_make_request_immed is a good way to confirm that we have
> found the bug,  but I don't like it as a long term solution, as it
> just reintroduced the problem that we were trying to solve with the
> problematic commit.
> 
> As you say, we could arrange that all request submission happens in
> raid5d and I think this is the right way to proceed.  However we can
> still take some of the work into the thread that is submitting the
> IO by calling "raid5d()" at the end of make_request, like this.
> 
> Can you test it please?  Does it seem reasonable?


i've got this running now (against 2.6.24-rc6)... it has passed ~25 
minutes of testing so far, which is a good sign.  i'll report back 
tomorrow and hopefully we'll have survived 8h+ of testing.

thanks!

w.r.t. dan's cfq comments -- i really don't know the details, but does 
this mean cfq will misattribute the IO to the wrong user/process?  or is 
it just a concern that CPU time will be spent on someone's IO?  the latter 
is fine to me... the former seems sucky because with today's multicore 
systems CPU time seems cheap compared to IO.

-dean


* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-10  0:09               ` Neil Brown
  2008-01-10  3:07                 ` Dan Williams
  2008-01-10  7:13                 ` dean gaudet
@ 2008-01-10 17:59                 ` dean gaudet
  2 siblings, 0 replies; 30+ messages in thread
From: dean gaudet @ 2008-01-10 17:59 UTC (permalink / raw)
  To: Neil Brown; +Cc: Dan Williams, linux-raid

On Thu, 10 Jan 2008, Neil Brown wrote:

> On Wednesday January 9, dan.j.williams@intel.com wrote:
> > On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote:
> > > i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > > 
> > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > > 
> > > which was Neil's change in 2.6.22 for deferring generic_make_request 
> > > until there's enough stack space for it.
> > > 
> > 
> > Commit d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 reduced stack utilization
> > by preventing recursive calls to generic_make_request.  However the
> > following conditions can cause raid5 to hang until 'stripe_cache_size' is
> > increased:
> > 
> 
> Thanks for pursuing this guys.  That explanation certainly sounds very
> credible.
> 
> The generic_make_request_immed is a good way to confirm that we have
> found the bug,  but I don't like it as a long term solution, as it
> just reintroduced the problem that we were trying to solve with the
> problematic commit.
> 
> As you say, we could arrange that all request submission happens in
> raid5d and I think this is the right way to proceed.  However we can
> still take some of the work into the thread that is submitting the
> IO by calling "raid5d()" at the end of make_request, like this.
> 
> Can you test it please?  Does it seem reasonable?
> 
> Thanks,
> NeilBrown
> 
> 
> Signed-off-by: Neil Brown <neilb@suse.de>

it has passed 11h of the untar/diff/rm linux.tar.gz workload... that's 
pretty good evidence it works for me.  thanks!

Tested-by: dean gaudet <dean@arctic.org>

> 
> ### Diffstat output
>  ./drivers/md/md.c    |    2 +-
>  ./drivers/md/raid5.c |    4 +++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff .prev/drivers/md/md.c ./drivers/md/md.c
> --- .prev/drivers/md/md.c	2008-01-07 13:32:10.000000000 +1100
> +++ ./drivers/md/md.c	2008-01-10 11:08:02.000000000 +1100
> @@ -5774,7 +5774,7 @@ void md_check_recovery(mddev_t *mddev)
>  	if (mddev->ro)
>  		return;
>  
> -	if (signal_pending(current)) {
> +	if (current == mddev->thread->tsk && signal_pending(current)) {
>  		if (mddev->pers->sync_request) {
>  			printk(KERN_INFO "md: %s in immediate safe mode\n",
>  			       mdname(mddev));
> 
> diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
> --- .prev/drivers/md/raid5.c	2008-01-07 13:32:10.000000000 +1100
> +++ ./drivers/md/raid5.c	2008-01-10 11:06:54.000000000 +1100
> @@ -3432,6 +3432,7 @@ static int chunk_aligned_read(struct req
>  	}
>  }
>  
> +static void raid5d (mddev_t *mddev);
>  
>  static int make_request(struct request_queue *q, struct bio * bi)
>  {
> @@ -3547,7 +3548,7 @@ static int make_request(struct request_q
>  				goto retry;
>  			}
>  			finish_wait(&conf->wait_for_overlap, &w);
> -			handle_stripe(sh, NULL);
> +			set_bit(STRIPE_HANDLE, &sh->state);
>  			release_stripe(sh);
>  		} else {
>  			/* cannot get stripe for read-ahead, just give-up */
> @@ -3569,6 +3570,7 @@ static int make_request(struct request_q
>  			      test_bit(BIO_UPTODATE, &bi->bi_flags)
>  			        ? 0 : -EIO);
>  	}
> +	raid5d(mddev);
>  	return 0;
>  }
>  


* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-10  7:13                 ` dean gaudet
@ 2008-01-10 18:49                   ` Dan Williams
  2008-01-11  1:46                     ` Neil Brown
  0 siblings, 1 reply; 30+ messages in thread
From: Dan Williams @ 2008-01-10 18:49 UTC (permalink / raw)
  To: dean gaudet; +Cc: Neil Brown, linux-raid

On Jan 10, 2008 12:13 AM, dean gaudet <dean@arctic.org> wrote:
> w.r.t. dan's cfq comments -- i really don't know the details, but does
> this mean cfq will misattribute the IO to the wrong user/process?  or is
> it just a concern that CPU time will be spent on someone's IO?  the latter
> is fine to me... the former seems sucky because with today's multicore
> systems CPU time seems cheap compared to IO.
>

I do not see this affecting the time slicing feature of cfq, because
as Neil says the work has to get done at some point.   If I give up
some of my slice working on someone else's I/O chances are the favor
will be returned in kind since the code does not discriminate.  The
io-priority capability of cfq currently does not work as advertised
with current MD since the priority is tied to the current thread and
the thread that actually submits the i/o on a stripe is
non-deterministic.  So I do not see this change making the situation
any worse.  In fact, it may make it a bit better since there is a
higher chance for the thread submitting i/o to MD to do its own i/o to
the backing disks.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-10  3:57                   ` Neil Brown
  2008-01-10  4:56                     ` Dan Williams
@ 2008-01-10 20:28                     ` Bill Davidsen
  1 sibling, 0 replies; 30+ messages in thread
From: Bill Davidsen @ 2008-01-10 20:28 UTC (permalink / raw)
  To: Neil Brown; +Cc: Dan Williams, dean gaudet, linux-raid

Neil Brown wrote:
> On Wednesday January 9, dan.j.williams@intel.com wrote:
>   
>> On Jan 9, 2008 5:09 PM, Neil Brown <neilb@suse.de> wrote:
>>     
>>> On Wednesday January 9, dan.j.williams@intel.com wrote:
>>>
>>> Can you test it please?
>>>       
>> This passes my failure case.
>>     
>
> Thanks!
>
>   
>>> Does it seem reasonable?
>>>       
>> What do you think about limiting the number of stripes the submitting
>> thread handles to be equal to what it submitted?  If I'm a thread that
>> only submits 1 stripe worth of work, should I get stuck handling the
>> rest of the cache?
>>     
>
> Dunno....
> Someone has to do the work, and leaving it all to raid5d means that it
> all gets done on one CPU.
> I expect that most of the time the queue of ready stripes is empty so
> make_request will mostly only handle its own stripes anyway.
> The times that it handles other threads' stripes will probably balance
> out with the times that other threads handle this thread's stripes.
>
> So I'm inclined to leave it as "do as much work as is available to be
> done" as that is simplest.  But I can probably be talked out of it
> with a convincing argument....
>   

How about "it will perform better (defined as faster) during conditions 
of unusual i/o activity?" Is that a convincing argument to use your 
solution as offered? How about "complexity and maintainability are a 
zero-sum problem?"

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 




* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-10 18:49                   ` Dan Williams
@ 2008-01-11  1:46                     ` Neil Brown
  2008-01-11  2:14                       ` dean gaudet
  0 siblings, 1 reply; 30+ messages in thread
From: Neil Brown @ 2008-01-11  1:46 UTC (permalink / raw)
  To: Dan Williams; +Cc: dean gaudet, linux-raid

On Thursday January 10, dan.j.williams@gmail.com wrote:
> On Jan 10, 2008 12:13 AM, dean gaudet <dean@arctic.org> wrote:
> > w.r.t. dan's cfq comments -- i really don't know the details, but does
> > this mean cfq will misattribute the IO to the wrong user/process?  or is
> > it just a concern that CPU time will be spent on someone's IO?  the latter
> > is fine to me... the former seems sucky because with today's multicore
> > systems CPU time seems cheap compared to IO.
> >
> 
> I do not see this affecting the time slicing feature of cfq, because
> as Neil says the work has to get done at some point.   If I give up
> some of my slice working on someone else's I/O chances are the favor
> will be returned in kind since the code does not discriminate.  The
> io-priority capability of cfq currently does not work as advertised
> with current MD since the priority is tied to the current thread and
> the thread that actually submits the i/o on a stripe is
> non-deterministic.  So I do not see this change making the situation
> any worse.  In fact, it may make it a bit better since there is a
> higher chance for the thread submitting i/o to MD to do its own i/o to
> the backing disks.
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Thanks.
But I suspect you didn't test it with a bitmap :-)
I ran the mdadm test suite and it hit a problem - easy enough to fix.

I'll look out for any other possible related problem (due to raid5d
running in different processes) and then submit it.

Thanks,
NeilBrown


* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-11  1:46                     ` Neil Brown
@ 2008-01-11  2:14                       ` dean gaudet
  0 siblings, 0 replies; 30+ messages in thread
From: dean gaudet @ 2008-01-11  2:14 UTC (permalink / raw)
  To: Neil Brown; +Cc: Dan Williams, linux-raid

On Fri, 11 Jan 2008, Neil Brown wrote:

> Thanks.
> But I suspect you didn't test it with a bitmap :-)
> I ran the mdadm test suite and it hit a problem - easy enough to fix.

damn -- i "lost" my bitmap 'cause it was external and i didn't have things 
set up properly to pick it up after a reboot :)

if you send an updated patch i'll give it another spin...

-dean


* Re: 2.6.24-rc6 reproducible raid5 hang
@ 2008-01-23 13:37 Tim Southerwood
  2008-01-23 17:43 ` Carlos Carvalho
  0 siblings, 1 reply; 30+ messages in thread
From: Tim Southerwood @ 2008-01-23 13:37 UTC (permalink / raw)
  To: linux-raid

Sorry if this breaks threaded mail readers, I only just subscribed to 
the list so don't have the original post to reply to.

I believe I'm having the same problem.

Regarding XFS on a raid5 md array:

Kernels 2.6.22-14 (Ubuntu Gutsy generic and server builds) *and* 
2.6.24-rc8 (pure build from virgin sources) compiled for amd64 arch.

RAID5 configured across 4 x 500GB SATA disks (NForce sata_nv driver, 
Asus M2N-E mobo, Athlon X64, 4GB RAM).

MD Chunk size is 1024k. This is allocated to an LVM2 PV, then sliced up.
Taking one sample logical volume of 150GB I ran

mkfs.xfs -d su=1024k,sw=3 -L vol_linux /dev/vg00/vol_linux
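
For context, the layering described above corresponds roughly to a
sequence like this (the member device names below are assumptions; the
chunk size, vg00, the 150GB volume and the su/sw values are taken from
the description):

  # 4-disk raid5 with a 1024k chunk; /dev/sd[b-e]1 is illustrative
  mdadm --create /dev/md1 --level=5 --raid-devices=4 --chunk=1024 /dev/sd[b-e]1
  pvcreate /dev/md1
  vgcreate vg00 /dev/md1
  lvcreate -L 150G -n vol_linux vg00
  # su matches the md chunk size; sw is the number of data disks,
  # i.e. 4 drives minus 1 parity = 3
  mkfs.xfs -d su=1024k,sw=3 -L vol_linux /dev/vg00/vol_linux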

I then found that putting a high write load on that filesystem caused a 
hang. High load could be as little as a single rsync of a mirror of 
Ubuntu Gutsy (many tens of GB) from my old server to here. The hang would 
typically happen within a few hours.

I could generate relatively quick hangs by running xfs_fsr (defragger) 
in parallel.

Trying the workaround of upping /sys/block/md1/md/stripe_cache_size to 
4096 seems (fingers crossed) to have helped. I've been running the rsync 
again, plus xfs_fsr and a few 11 GB dd's to the same filesystem.

I also noticed that the write speed increased dramatically with a 
bigger stripe_cache_size.
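
The stripe cache is allocated per member device in pages, so the memory
cost of that setting is easy to estimate; a rough sketch (md1 is the
array above, the arithmetic assumes 4k pages):

  # raise the cache at runtime
  echo 4096 > /sys/block/md1/md/stripe_cache_size
  # approximate memory use: entries x page size x member disks,
  # here 4096 x 4k x 4 = 64 MiB
  cat /sys/block/md1/md/stripe_cache_active   # stripes currently in use
  # the setting does not survive a reboot; re-apply it from rc.local or
  # an equivalent boot script (exact path is distribution-dependent)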

A more detailed analysis of the problem indicated that, after the hang:

I could log in;

One CPU core was stuck in 100% IO wait.
The other core was usable, with care, so I managed to get a SysRq-T dump, 
and one place the system appeared to be blocked was via this path:

[ 2039.466258] xfs_fsr       D 0000000000000000     0  7324   7308
[ 2039.466260]  ffff810119399858 0000000000000082 0000000000000000 
0000000000000046
[ 2039.466263]  ffff810110d6c680 ffff8101102ba998 ffff8101102ba770 
ffffffff8054e5e0
[ 2039.466265]  ffff8101102ba998 000000010014a1e6 ffffffffffffffff 
ffff810110ddcb30
[ 2039.466268] Call Trace:
[ 2039.466277]  [<ffffffff8808a26b>] :raid456:get_active_stripe+0x1cb/0x610
[ 2039.466282]  [<ffffffff80234000>] default_wake_function+0x0/0x10
[ 2039.466289]  [<ffffffff88090ff8>] :raid456:make_request+0x1f8/0x610
[ 2039.466293]  [<ffffffff80251c20>] autoremove_wake_function+0x0/0x30
[ 2039.466295]  [<ffffffff80331121>] __up_read+0x21/0xb0
[ 2039.466300]  [<ffffffff8031f336>] generic_make_request+0x1d6/0x3d0
[ 2039.466303]  [<ffffffff80280bad>] vm_normal_page+0x3d/0xc0
[ 2039.466307]  [<ffffffff8031f59f>] submit_bio+0x6f/0xf0
[ 2039.466311]  [<ffffffff802c98cc>] dio_bio_submit+0x5c/0x90
[ 2039.466313]  [<ffffffff802c9943>] dio_send_cur_page+0x43/0xa0
[ 2039.466316]  [<ffffffff802c99ee>] submit_page_section+0x4e/0x150
[ 2039.466319]  [<ffffffff802ca2e2>] __blockdev_direct_IO+0x742/0xb50
[ 2039.466342]  [<ffffffff8832e9a2>] :xfs:xfs_vm_direct_IO+0x182/0x190
[ 2039.466357]  [<ffffffff8832edb0>] :xfs:xfs_get_blocks_direct+0x0/0x20
[ 2039.466370]  [<ffffffff8832e350>] :xfs:xfs_end_io_direct+0x0/0x80
[ 2039.466375]  [<ffffffff80444fb5>] __wait_on_bit_lock+0x65/0x80
[ 2039.466380]  [<ffffffff80272883>] generic_file_direct_IO+0xe3/0x190
[ 2039.466385]  [<ffffffff802729a4>] generic_file_direct_write+0x74/0x150
[ 2039.466402]  [<ffffffff88336db2>] :xfs:xfs_write+0x492/0x8f0
[ 2039.466421]  [<ffffffff883099bc>] :xfs:xfs_iunlock+0x2c/0xb0
[ 2039.466437]  [<ffffffff88336866>] :xfs:xfs_read+0x186/0x240
[ 2039.466443]  [<ffffffff8029e5b9>] do_sync_write+0xd9/0x120
[ 2039.466448]  [<ffffffff80251c20>] autoremove_wake_function+0x0/0x30
[ 2039.466457]  [<ffffffff8029eead>] vfs_write+0xdd/0x190
[ 2039.466461]  [<ffffffff8029f5b3>] sys_write+0x53/0x90
[ 2039.466465]  [<ffffffff8020c29e>] system_call+0x7e/0x83


However, I'm of the opinion that the system should not deadlock, even if 
tunable parameters are unfavourable. I'm happy with the workaround 
(indeed the system performs better).

However, it will take me a week's worth of testing before I'm willing to 
commission this as my new fileserver.

So, if there is anything anyone would like me to try, I'm happy to 
volunteer as a guinea pig :)

Yes, I can build and patch kernels. But I'm not hot at debugging kernels 
so if kernel core dumps or whatever are needed, please point me at the 
right document or hint as to which commands I need to read about.

Cheers

Tim
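
For reference, task dumps like the one above are usually captured with
the magic SysRq interface; a minimal sketch, assuming the kernel was
built with CONFIG_MAGIC_SYSRQ:

  # enable magic SysRq if the distribution ships it disabled
  echo 1 > /proc/sys/kernel/sysrq
  # dump the state of all tasks (the "T" dump shown above)
  echo t > /proc/sysrq-trigger
  # or just the blocked (D state) tasks, often enough for a hang
  echo w > /proc/sysrq-trigger
  # the output lands in the kernel log
  dmesg | tail -n 200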

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-23 13:37 2.6.24-rc6 reproducible raid5 hang Tim Southerwood
@ 2008-01-23 17:43 ` Carlos Carvalho
  2008-01-24 20:30   ` Tim Southerwood
  0 siblings, 1 reply; 30+ messages in thread
From: Carlos Carvalho @ 2008-01-23 17:43 UTC (permalink / raw)
  To: Tim Southerwood; +Cc: linux-raid

Tim Southerwood (ts@dionic.net) wrote on 23 January 2008 13:37:
 >Sorry if this breaks threaded mail readers, I only just subscribed to 
 >the list so don't have the original post to reply to.
 >
 >I believe I'm having the same problem.
 >
 >Regarding XFS on a raid5 md array:
 >
 >Kernels 2.6.22-14 (Ubuntu Gutsy generic and server builds) *and* 
 >2.6.24-rc8 (pure build from virgin sources) compiled for amd64 arch.

This has been corrected already, install Neil's patches. It worked for
several people under high stress, including us.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-23 17:43 ` Carlos Carvalho
@ 2008-01-24 20:30   ` Tim Southerwood
  2008-01-28 17:29     ` Tim Southerwood
  0 siblings, 1 reply; 30+ messages in thread
From: Tim Southerwood @ 2008-01-24 20:30 UTC (permalink / raw)
  To: linux-raid

Carlos Carvalho wrote:
> Tim Southerwood (ts@dionic.net) wrote on 23 January 2008 13:37:
>  >Sorry if this breaks threaded mail readers, I only just subscribed to 
>  >the list so don't have the original post to reply to.
>  >
>  >I believe I'm having the same problem.
>  >
>  >Regarding XFS on a raid5 md array:
>  >
>  >Kernels 2.6.22-14 (Ubuntu Gutsy generic and server builds) *and* 
>  >2.6.24-rc8 (pure build from virgin sources) compiled for amd64 arch.
> 
> This has been corrected already, install Neil's patches. It worked for
> several people under high stress, including us.

Hi

I just coerced the patch into 2.6.23.14, reset 
/sys/block/md1/md/stripe_cache_size to default (256) and rebooted.

I can confirm that after 2 hours of heavy bashing[1] the system has not 
hung. Looks good - many thanks. But I will run with a stripe_cache_size 
of 4096 in practice, as it improves write speed on my configuration by 
about 2.5 times.

Cheers

Tim



[1] Rsync of > 50GB to the raid plus xfs_fsr + dd of 11GB of /dev/zero to 
the same filesystem.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-24 20:30   ` Tim Southerwood
@ 2008-01-28 17:29     ` Tim Southerwood
  2008-01-29 14:16       ` Carlos Carvalho
  0 siblings, 1 reply; 30+ messages in thread
From: Tim Southerwood @ 2008-01-28 17:29 UTC (permalink / raw)
  To: linux-raid

Subtitle: Patch to mainline yet?

Hi

I don't see evidence of Neil's patch in 2.6.24, so I applied it by hand
on my server.

Was that the correct thing to do, or did this issue get fixed in a 
different way that I wouldn't have spotted? I had a look at the git logs 
but it was not obvious - please pardon my ignorance, I'm not familiar 
enough with the code.

Many thanks,

Tim

Tim Southerwood wrote:
> Carlos Carvalho wrote:
>> Tim Southerwood (ts@dionic.net) wrote on 23 January 2008 13:37:
>>  >Sorry if this breaks threaded mail readers, I only just subscribed to
>>  >the list so don't have the original post to reply to.
>>  >
>>  >I believe I'm having the same problem.
>>  >
>>  >Regarding XFS on a raid5 md array:
>>  >
>>  >Kernels 2.6.22-14 (Ubuntu Gutsy generic and server builds) *and* 
>>  >2.6.24-rc8 (pure build from virgin sources) compiled for amd64 arch.
>>
>> This has been corrected already, install Neil's patches. It worked for
>> several people under high stress, including us.
> 
> Hi
> 
> I just coerced the patch into 2.6.23.14, reset 
> /sys/block/md1/md/stripe_cache_size to default (256) and rebooted.
> 
> I can confirm that after 2 hours of heavy bashing[1] the system has not 
> hung. Looks good - many thanks. But I will run with a stripe_cache_size 
> of 4096 in practice, as it improves write speed on my configuration by 
> about 2.5 times.
> 
> Cheers
> 
> Tim
> 
> 
> 
> [1] Rsync of > 50GB to the raid plus xfs_fsr + dd of 11GB of /dev/zero to 
> the same filesystem.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-28 17:29     ` Tim Southerwood
@ 2008-01-29 14:16       ` Carlos Carvalho
  2008-01-29 22:58         ` Bill Davidsen
  0 siblings, 1 reply; 30+ messages in thread
From: Carlos Carvalho @ 2008-01-29 14:16 UTC (permalink / raw)
  To: linux-raid

Tim Southerwood (ts@dionic.net) wrote on 28 January 2008 17:29:
 >Subtitle: Patch to mainline yet?
 >
 >Hi
 >
 >I don't see evidence of Neil's patch in 2.6.24, so I applied it by hand
 >on my server.

I applied all 4 pending patches to .24. It's been better than .22 and
.23... Unfortunately the bitmap and raid1 patches don't go into .22.16.
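
For anyone repeating this, applying the pending patches by hand is the
usual routine; a sketch, with the patch file names assumed:

  # patches saved from the list as raid5-fix-*.patch (names illustrative)
  cd linux-2.6.24
  for p in ../raid5-fix-*.patch; do patch -p1 --dry-run < "$p"; done
  for p in ../raid5-fix-*.patch; do patch -p1 < "$p"; done
  make oldconfig && make && make modules_install install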

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-29 14:16       ` Carlos Carvalho
@ 2008-01-29 22:58         ` Bill Davidsen
  2008-02-14 10:13           ` Burkhard Carstens
  0 siblings, 1 reply; 30+ messages in thread
From: Bill Davidsen @ 2008-01-29 22:58 UTC (permalink / raw)
  To: Carlos Carvalho; +Cc: linux-raid, Neil Brown

Carlos Carvalho wrote:
> Tim Southerwood (ts@dionic.net) wrote on 28 January 2008 17:29:
>  >Subtitle: Patch to mainline yet?
>  >
>  >Hi
>  >
>  >I don't see evidence of Neil's patch in 2.6.24, so I applied it by hand
>  >on my server.
>
> I applied all 4 pending patches to .24. It's been better than .22 and
> .23... Unfortunately the bitmap and rai1 patch don't go in .22.16.

Neil, have these been sent up against 24-stable and 23-stable?

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismarck 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 2.6.24-rc6 reproducible raid5 hang
  2008-01-29 22:58         ` Bill Davidsen
@ 2008-02-14 10:13           ` Burkhard Carstens
  0 siblings, 0 replies; 30+ messages in thread
From: Burkhard Carstens @ 2008-02-14 10:13 UTC (permalink / raw)
  To: linux-raid

On Tuesday 29 January 2008 23:58, Bill Davidsen wrote:
> Carlos Carvalho wrote:
> > Tim Southerwood (ts@dionic.net) wrote on 28 January 2008 17:29:
> >  >Subtitle: Patch to mainline yet?
> >  >
> >  >Hi
> >  >
> >  >I don't see evidence of Neil's patch in 2.6.24, so I applied it
> >  > by hand on my server.
> >
> > I applied all 4 pending patches to .24. It's been better than .22
> > and .23... Unfortunately the bitmap and rai1 patch don't go in
> > .22.16.
>
> Neil, have these been sent up against 24-stable and 23-stable?

... and .22-stable?

Also, is this an xfs-on-raid5 bug, or would it also happen with 
ext3-on-raid5?

regards
 Burkhard


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2008-02-14 10:13 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-23 13:37 2.6.24-rc6 reproducible raid5 hang Tim Southerwood
2008-01-23 17:43 ` Carlos Carvalho
2008-01-24 20:30   ` Tim Southerwood
2008-01-28 17:29     ` Tim Southerwood
2008-01-29 14:16       ` Carlos Carvalho
2008-01-29 22:58         ` Bill Davidsen
2008-02-14 10:13           ` Burkhard Carstens
  -- strict thread matches above, loose matches on Subject: below --
2007-12-27 17:06 dean gaudet
2007-12-27 17:39 ` dean gaudet
2007-12-29 16:48   ` dean gaudet
2007-12-29 20:47     ` Dan Williams
2007-12-29 20:58       ` dean gaudet
2007-12-29 21:50         ` Justin Piszcz
2007-12-29 22:11           ` dean gaudet
2007-12-29 22:21             ` dean gaudet
2007-12-29 22:06         ` Dan Williams
2007-12-30 17:58           ` dean gaudet
2008-01-09 18:28             ` Dan Williams
2008-01-10  0:09               ` Neil Brown
2008-01-10  3:07                 ` Dan Williams
2008-01-10  3:57                   ` Neil Brown
2008-01-10  4:56                     ` Dan Williams
2008-01-10 20:28                     ` Bill Davidsen
2008-01-10  7:13                 ` dean gaudet
2008-01-10 18:49                   ` Dan Williams
2008-01-11  1:46                     ` Neil Brown
2008-01-11  2:14                       ` dean gaudet
2008-01-10 17:59                 ` dean gaudet
2007-12-27 19:52 ` Justin Piszcz
2007-12-28  0:08   ` dean gaudet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).