From: Roman Mamedov <rm@romanrm.ru>
To: NeilBrown <neilb@suse.de>
Cc: Jes Sorensen <Jes.Sorensen@redhat.com>, linux-raid@vger.kernel.org
Subject: Re: tests/03r5assemV1 issues
Date: Wed, 11 Jul 2012 10:28:53 +0600
Message-ID: <20120711102853.2a8bbd95@natsu>
In-Reply-To: <20120711142053.36d33f61@notabene.brown>

On Wed, 11 Jul 2012 14:20:53 +1000
NeilBrown <neilb@suse.de> wrote:

> On Fri, 06 Jul 2012 11:59:13 +0200 Jes Sorensen <Jes.Sorensen@redhat.com>
> wrote:
> 
> > NeilBrown <neilb@suse.de> writes:
> > > On Tue, 03 Jul 2012 18:07:02 +0200 Jes Sorensen <Jes.Sorensen@redhat.com>
> > > wrote:
> > >
> > >> NeilBrown <neilb@suse.de> writes:
> > >> > On Mon, 02 Jul 2012 15:24:43 +0200 Jes Sorensen <Jes.Sorensen@redhat.com>
> > >> > wrote:
> > >> >
> > >> >> Hi Neil,
> > >> >> 
> > >> >> I am trying to get the test suite stable on RHEL, but I see a lot of
> > >> >> failures in 03r5assemV1, in particular between these two cases:
> > >> >> 
> > >> >> mdadm -A $md1 -u $uuid $devlist
> > >> >> check state U_U
> > >> >> eval $tst
> > >> >> 
> > >> >> mdadm -A $md1 --name=one $devlist
> > >> >> check state U_U
> > >> >> check spares 1
> > >> >> eval $tst
> > >> >> 
> > >> >> I have tested it with the latest upstream kernel as well and see the
> > >> >> same problems. I suspect it is simply the box that is too fast, ending
> > >> >> up with the raid check completing in between the two test cases?
> > >> >> 
> > >> >> Are you seeing the same thing there? I tried playing with the max speed
> > >> >> variable but it doesn't really seem to make any difference.
> > >> >> 
> > >> >> Any ideas for what can be done to make this case more resilient to
> > >> >> false positives? I guess one option would be to re-create the array
> > >> >> between each test?
> > >> >
> > >> > Maybe it really is a bug?
> > >> > The test harness sets the resync speed to be very slow.  A fast box will get
> > >> > through the test more quickly and be more likely to see the array still
> > >> > syncing.
> > >> >
> > >> > I'll try to make time to look more closely.
> > >> > But I wouldn't discount the possibility that the second "mdadm -A" is
> > >> > short-circuiting the recovery somehow.
> > >> 
> > >> That could certainly explain what I am seeing. I noticed it doesn't
> > >> happen every single time in the same place (from memory), but it is
> > >> mostly in that spot in my case.
> > >> 
> > >> Even if I trimmed the max speed down to 50 it still happens.
> > >
> > > I cannot easily reproduce this.
> > > Exactly which kernel and which mdadm do you find it with - just to make sure
> > > I'm testing the same thing as you?
> > 
> > Hi Neil,
> > 
> > Odd - I see it with
> > mdadm:  721b662b5b33830090c220bbb04bf1904d4b7eed
> > kernel: ca24a145573124732152daff105ba68cc9a2b545
> > 
> > I've seen this happen for a while fwiw.
> > 
> > Note the box has a number of external drives with a number of my scratch
> > raid arrays on it. It shouldn't affect this, but just in case.
> > 
> > The system installed mdadm is a 3.2.3 derivative, but I checked running
> > with PATH=. as well.
> 
> Thanks.
> I think I figured out what is happening.
> 
> It seems that setting the max_speed down to 1000 is often enough, but not
> always.  So we need to set it lower.
> But setting max_speed lower is not effective unless you also set min_speed
> lower.  This is the tricky bit that took me way too long to realise.
> 
> So with this patch, it is quite reliable.
> 
> NeilBrown
> 
> diff --git a/tests/03r5assemV1 b/tests/03r5assemV1
> index 52b1107..bca0c58 100644
> --- a/tests/03r5assemV1
> +++ b/tests/03r5assemV1
> @@ -60,7 +60,8 @@ eval $tst
>  ### Now with a missing device
>  # We don't want the recovery to complete while we are
>  # messing about here.
> -echo 1000 > /proc/sys/dev/raid/speed_limit_max
> +echo 100 > /proc/sys/dev/raid/speed_limit_max
> +echo 100 > /proc/sys/dev/raid/speed_limit_min

Purely from an armchair perspective, don't you need to reduce 'min' first, and
only then lower 'max'? As it stands, depending on the kernel side, the first
"echo" has every right to fail with "Invalid argument" (or something similar)
if there is a check that max cannot be lower than min.
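
To illustrate the ordering, here is a minimal sketch of a helper that lowers
'min' before 'max', so max never ends up below min at any point. The function
name lower_raid_speed and the RAID_SYSCTL_DIR override are hypothetical (they
are not part of the mdadm test suite), and whether the kernel actually enforces
such a check is an open question here, not something this sketch confirms.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the mdadm test suite.
# RAID_SYSCTL_DIR is an assumed override so the sketch can be tried
# against a scratch directory instead of the real sysctl files.
RAID_SYSCTL_DIR=${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}

# Lower both limits to the same value.
# 'min' is written first so that 'max' is never below 'min' in between.
lower_raid_speed() {
    limit=$1
    echo "$limit" > "$RAID_SYSCTL_DIR/speed_limit_min" || return 1
    echo "$limit" > "$RAID_SYSCTL_DIR/speed_limit_max" || return 1
}
```

In the test this would be called as "lower_raid_speed 100" in place of the two
echo lines; writing to the real files under /proc needs root.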

>  
>  mdadm -AR $md1 $dev0 $dev2 $dev3 $dev4 #
>  check state U_U
> @@ -124,3 +125,4 @@ mdadm -I -c $conf $dev1
>  mdadm -I -c $conf $dev2
>  eval $tst
>  echo 2000 > /proc/sys/dev/raid/speed_limit_max
> +echo 1000 > /proc/sys/dev/raid/speed_limit_min
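
By the same armchair reasoning, raising the limits again should go the other
way round (max first, then min), which is the order this last hunk already
uses. A rough sketch that saves the original values and restores them in that
order follows; save_raid_speed, restore_raid_speed and RAID_SYSCTL_DIR are
hypothetical names of mine, not something the test suite provides.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the mdadm test suite.
# RAID_SYSCTL_DIR is an assumed override for trying this on a scratch directory.
RAID_SYSCTL_DIR=${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}

# Remember the current limits so they can be put back later.
save_raid_speed() {
    saved_min=$(cat "$RAID_SYSCTL_DIR/speed_limit_min") || return 1
    saved_max=$(cat "$RAID_SYSCTL_DIR/speed_limit_max") || return 1
}

# Put the saved limits back.
# 'max' is written first so that 'min' never exceeds 'max' in between.
restore_raid_speed() {
    echo "$saved_max" > "$RAID_SYSCTL_DIR/speed_limit_max" || return 1
    echo "$saved_min" > "$RAID_SYSCTL_DIR/speed_limit_min" || return 1
}
```

Restoring saved values instead of hard-coding 2000 and 1000 would also leave
the box as it was found, in case the admin had tuned the limits beforehand.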


-- 
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."
