From: NeilBrown <neilb@suse.de>
To: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: linux-raid@vger.kernel.org, Shaohua Li <shli@kernel.org>
Subject: Re: raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65
Date: Wed, 6 Mar 2013 13:18:04 +1100 [thread overview]
Message-ID: <20130306131804.0b39752a@notabene.brown> (raw)
In-Reply-To: <wrfjlia2fd6h.fsf@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 5944 bytes --]
On Tue, 05 Mar 2013 09:44:54 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
wrote:
> NeilBrown <neilb@suse.de> writes:
> > On Mon, 04 Mar 2013 14:50:54 +0100 Jes Sorensen <Jes.Sorensen@redhat.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I have been hitting raid5 lockups with recent kernels. A bunch of
> >> bisecting narrowed it down to be caused by this commit:
> >>
> >> ca64cae96037de16e4af92678814f5d4bf0c1c65
> >>
> >> So far I can only reproduce the problem when running a test script
> >> creating raid5 arrays on top of loop devices and then running mkfs on
> >> those. I haven't managed to reproduce it on real disk devices yet, but I
> >> suspect it is possible too.
> >>
> >> Basically it looks like a race condition where R5_LOCKED doesn't get
> >> cleared for the device, however it is unclear to me how we get to that
> >> point. Since I am not really deeply familiar with the discard related
> >> changes, I figured someone might have a better idea what could go wrong.
> >>
> >> Cheers,
> >> Jes
> >>
> >>
> >>
> >> [ 4799.312280] sector=97f8 i=1 (null) (null) (null) ffff88022f5963c0
> >> 0
> >> [ 4799.322174] ------------[ cut here ]------------
> >> [ 4799.327330] WARNING: at drivers/md/raid5.c:352
> >> init_stripe+0x2d2/0x360 [raid456]()
> >> [ 4799.335775] Hardware name: S1200BTL
> >> [ 4799.339668] Modules linked in: raid456 async_raid6_recov
> >> async_memcpy async_pq raid6_pq async_xor xor async_tx lockd sunrpc
> >> bnep bluetooth rfkill sg coretemp e1000e raid1 dm_mirror kvm_intel
> >> kvm crc32c_intel iTCO_wdt iTCO_vendor_support dm_region_hash
> >> ghash_clmulni_intel lpc_ich dm_log dm_mod mfd_core i2c_i801 video
> >> pcspkr microcode uinput xfs usb_storage mgag200 i2c_algo_bit
> >> drm_kms_helper ttm drm i2c_core mpt2sas raid_class
> >> scsi_transport_sas [last unloaded: raid456]
> >> [ 4799.386633] Pid: 8204, comm: mkfs.ext4 Not tainted 3.7.0-rc1+ #17
> >> [ 4799.393431] Call Trace:
> >> [ 4799.396163] [<ffffffff810602ff>] warn_slowpath_common+0x7f/0xc0
> >> [ 4799.402868] [<ffffffff8106035a>] warn_slowpath_null+0x1a/0x20
> >> [ 4799.409375] [<ffffffffa0423b92>] init_stripe+0x2d2/0x360 [raid456]
> >> [ 4799.416368] [<ffffffffa042400b>] get_active_stripe+0x3eb/0x480 [raid456]
> >> [ 4799.423944] [<ffffffffa0427beb>] make_request+0x3eb/0x6b0 [raid456]
> >> [ 4799.431037] [<ffffffff81084210>] ? wake_up_bit+0x40/0x40
> >> [ 4799.437062] [<ffffffff814a6633>] md_make_request+0xc3/0x200
> >> [ 4799.443379] [<ffffffff81134655>] ? mempool_alloc_slab+0x15/0x20
> >> [ 4799.450082] [<ffffffff812c70d2>] generic_make_request+0xc2/0x110
> >> [ 4799.456881] [<ffffffff812c7199>] submit_bio+0x79/0x160
> >> [ 4799.462714] [<ffffffff811ca625>] ? bio_alloc_bioset+0x65/0x120
> >> [ 4799.469321] [<ffffffff812ce234>] blkdev_issue_discard+0x184/0x240
> >> [ 4799.476218] [<ffffffff812cef76>] blkdev_ioctl+0x3b6/0x810
> >> [ 4799.482338] [<ffffffff811cb971>] block_ioctl+0x41/0x50
> >> [ 4799.488170] [<ffffffff811a6aa9>] do_vfs_ioctl+0x99/0x580
> >> [ 4799.494185] [<ffffffff8128a19a>] ?
> >> inode_has_perm.isra.30.constprop.60+0x2a/0x30
> >> [ 4799.502535] [<ffffffff8128b6d7>] ? file_has_perm+0x97/0xb0
> >> [ 4799.508755] [<ffffffff811a7021>] sys_ioctl+0x91/0xb0
> >> [ 4799.514384] [<ffffffff810de9dc>] ? __audit_syscall_exit+0x3ec/0x450
> >> [ 4799.521475] [<ffffffff8161e759>] system_call_fastpath+0x16/0x1b
> >> [ 4799.528177] ---[ end trace 583fffce97b9ddd9 ]---
> >> [ 4799.533327] sector=97f8 i=0 (null) (null) (null) ffff88022f5963c0
> >> 0
> >> [ 4799.543227] ------------[ cut here ]------------
> >
> > Does this fix it?
> >
> > NeilBrown
>
> Unfortunately no, I still see these crashes with this one applied :(
>
Thanks - the symptom looked similar, but now that I look more closely I can
see it is quite different.
How about this then? I can't really see what is happening, but based on the
patch that you identified it must be related to these flags.
It seems that handle_stripe_clean_event() is being called to early, and it
doesn't clear out the ->written bios because they are still locked or
something. But it does clear R5_Discard on the parity block, so
handle_stripe_clean_event doesn't get called again.
This makes the handling of the various flags somewhat more uniform, which is
probably a good thing.
Thanks for testing,
NeilBrown
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 277d9c2..a005dcc 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1246,8 +1246,7 @@ static void ops_complete_reconstruct(void *stripe_head_ref)
struct r5dev *dev = &sh->dev[i];
if (dev->written || i == pd_idx || i == qd_idx) {
- if (!discard)
- set_bit(R5_UPTODATE, &dev->flags);
+ set_bit(R5_UPTODATE, &dev->flags);
if (fua)
set_bit(R5_WantFUA, &dev->flags);
if (sync)
@@ -2784,8 +2783,7 @@ static void handle_stripe_clean_event(struct r5conf *conf,
if (sh->dev[i].written) {
dev = &sh->dev[i];
if (!test_bit(R5_LOCKED, &dev->flags) &&
- (test_bit(R5_UPTODATE, &dev->flags) ||
- test_bit(R5_Discard, &dev->flags))) {
+ test_bit(R5_UPTODATE, &dev->flags)) {
/* We can return any write requests */
struct bio *wbi, *wbi2;
pr_debug("Return write for disc %d\n", i);
@@ -2808,8 +2806,11 @@ static void handle_stripe_clean_event(struct r5conf *conf,
!test_bit(STRIPE_DEGRADED, &sh->state),
0);
}
- } else if (test_bit(R5_Discard, &sh->dev[i].flags))
- clear_bit(R5_Discard, &sh->dev[i].flags);
+ } else if (!test_bit(R5_LOCKED, &sh->dev[i].flags) &&
+ test_bit(R5_UPTODATE, &sh->dev[i].flags)) {
+ if (test_and_clear_bit(R5_Discard, &dev->flags))
+ clear_bit(R5_UPTODATE, &dev->flags);
+ }
if (test_and_clear_bit(STRIPE_FULL_WRITE, &sh->state))
if (atomic_dec_and_test(&conf->pending_full_writes))
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
next prev parent reply other threads:[~2013-03-06 2:18 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-03-04 13:50 raid5 lockups post ca64cae96037de16e4af92678814f5d4bf0c1c65 Jes Sorensen
2013-03-04 21:00 ` NeilBrown
2013-03-05 8:44 ` Jes Sorensen
2013-03-06 2:18 ` NeilBrown [this message]
2013-03-06 9:31 ` Jes Sorensen
2013-03-11 22:32 ` NeilBrown
2013-03-12 1:32 ` NeilBrown
2013-03-12 11:12 ` joystick
2013-03-20 0:54 ` NeilBrown
2013-03-12 13:45 ` Jes Sorensen
2013-03-12 23:35 ` NeilBrown
2013-03-13 7:32 ` Jes Sorensen
2013-03-14 7:35 ` Jes Sorensen
2013-03-20 0:55 ` NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130306131804.0b39752a@notabene.brown \
--to=neilb@suse.de \
--cc=Jes.Sorensen@redhat.com \
--cc=linux-raid@vger.kernel.org \
--cc=shli@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).