From: Neil Brown <neilb@suse.de>
To: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: linux-raid <linux-raid@vger.kernel.org>, Xiao Ni <xni@redhat.com>
Subject: Re: 4.1-rc6 radi5 OOPS
Date: Wed, 10 Jun 2015 10:19:42 +1000
Message-ID: <20150610101942.0bc26a25@home.neil.brown.name>
In-Reply-To: <wrfjzj4glajs.fsf@jes.lga.redhat.com>

On Wed, 03 Jun 2015 17:57:43 -0400
Jes Sorensen <Jes.Sorensen@redhat.com> wrote:
> NeilBrown <neilb@suse.de> writes:
> > On Wed, 03 Jun 2015 16:20:21 -0400 Jes Sorensen
> > <Jes.Sorensen@redhat.com> wrote:
> >
> >> Neil,
> >>
> >> I was running testing on the current 4.1-rc6 tree (Linus' top of
> >> trunk 8cd9234c64c584432f6992fe944ca9e46ca8ea76) and I am seeing
> >> the following OOPS which is reproducible.
> >>
> >> It shows up when running the mdadm test suite, 07changelevelintr
> >> to be specific.
> >>
> >> Is this something you have seen?
> >>
> >> Cheers,
> >> Jes
> >>
> >> ------------[ cut here ]------------
> >> kernel BUG at drivers/md/raid5.c:5391!
> >
> > No, I haven't seen that. And I've been running the test suite
> > quite a bit lately.
> >
> > Can you get it to print out the relevant numbers? Include
> > readpos/writepos/safepos too.
>
> This enough? Let me know if you need more.
>
> I suspect this started happening with the changes that went in between
> 4.1-rc5 and 4.1-rc6. I will try to bisect it tomorrow.
>
> Cheers,
> Jes
>
> mddev->dev_sectors: 0x9800, reshape_sectors: 0x0200 stripe_addr:
> fffffffffffffdff, sector_nr 0, readpos 511, writepos -513, safepos
> 512

These numbers suggest that conf->reshape_progress divided by
"data_disks" or "new_data_disks" is -1, or really the unsigned
equivalent, which is MaxSector.
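
As a rough check of that reading, the sketch below redoes the arithmetic with 64-bit unsigned wraparound. It is illustrative only: the adjustment rule for a backward reshape (writepos reduced by up to reshape_sectors, readpos and safepos increased by reshape_sectors) is an assumption rather than a copy of raid5.c, the helper names are hypothetical, and the safepos quotient of 0 is inferred from the reported value, not taken from the report.

```python
U64_MASK = (1 << 64) - 1  # assumes sector_t is an unsigned 64-bit type


def backward_positions(write_q, read_q, safe_q, reshape_sectors):
    """Apply the assumed backward-reshape adjustments with u64 wraparound.

    write_q, read_q and safe_q are the already-divided positions
    (progress / disks). The rule itself is an assumption for
    illustration, not the kernel's code.
    """
    writepos = (write_q - min(reshape_sectors, write_q)) & U64_MASK
    readpos = (read_q + reshape_sectors) & U64_MASK
    safepos = (safe_q + reshape_sectors) & U64_MASK
    return writepos, readpos, safepos


def as_signed(value):
    """Reinterpret a u64 value as a signed 64-bit integer."""
    return value - (1 << 64) if value >> 63 else value


all_ones = U64_MASK  # the "-1" quotient, i.e. the unsigned all-ones value
writepos, readpos, safepos = backward_positions(all_ones, all_ones, 0, 0x200)

# The reported stripe_addr has the same bit pattern as writepos here.
print(f"stripe_addr {writepos:016x}, readpos {readpos}, "
      f"writepos {as_signed(writepos)}, safepos {safepos}")
```

Under those assumptions, an all-ones quotient with reshape_sectors of 0x200 gives readpos 511, writepos -513 and a stripe_addr of fffffffffffffdff, which is what the debug output above shows.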

But unless data_disks is 1, ->reshape_progress must really be -2 or -3
or something.

So if you could confirm the values of ->reshape_progress,
data_disks, and new_data_disks, that might help.

I don't think ->reshape_progress could get a negative value in any way
except by being assigned MaxSector. And that only happens when the
reshape has really completely finished.

So it looks like some sort of race. I have other evidence of a race
with the resync/reshape thread starting/stopping. If I track that
down it'll probably fix this issue too.

Thanks,
NeilBrown