* [GIT PULL REQUEST] late md/raid1 bug fixes for 3.17
@ 2014-09-24 2:18 NeilBrown
2014-09-26 19:08 ` BillStuff
0 siblings, 1 reply; 3+ messages in thread
From: NeilBrown @ 2014-09-24 2:18 UTC (permalink / raw)
To: Linus Torvalds
Cc: linux RAID, lkml, Alexander Lyakas, Bassow Jonathan, majianpeng
Hi Linus,
it is amazing how much easier it is to find bugs when you know one is there.
Two bug reports resulted in finding 7 bugs!!
All are tagged for -stable. Those that can't cause (rare) data corruption,
cause lockups.
Thanks,
NeilBrown
The following changes since commit d030671f3f261e528dc6e396a13f10859a74ae7c:
Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup (2014-09-07 20:20:16 -0700)
are available in the git repository at:
git://git.neil.brown.name/md/ tags/md/3.17-more-fixes
for you to fetch changes up to b8cb6b4c121e1bf1963c16ed69e7adcb1bc301cd:
md/raid1: fix_read_error should act on all non-faulty devices. (2014-09-22 11:26:01 +1000)
----------------------------------------------------------------
Bugfixes for md/raid1
particularly, but not only, fixing new "resync" code.
----------------------------------------------------------------
NeilBrown (8):
md/raid1: intialise start_next_window for READ case to avoid hang
md/raid1: be more cautious where we read-balance during resync.
md/raid1: clean up request counts properly in close_sync()
md/raid1: make sure resync waits for conflicting writes to complete.
md/raid1: Don't use next_resync to determine how far resync has progressed
md/raid1: update next_resync under resync_lock.
md/raid1: count resync requests in nr_pending.
md/raid1: fix_read_error should act on all non-faulty devices.
drivers/md/raid1.c | 40 ++++++++++++++++++++++------------------
1 file changed, 22 insertions(+), 18 deletions(-)
* Re: [GIT PULL REQUEST] late md/raid1 bug fixes for 3.17
From: BillStuff @ 2014-09-26 19:08 UTC (permalink / raw)
To: NeilBrown; +Cc: linux RAID
On 09/23/2014 09:18 PM, NeilBrown wrote:
[snip]
> md/raid1: intialise start_next_window for READ case to avoid hang
>
Neil, I've been testing these patches for the past week or two to see if
they help a raid1 "check" hang I had.
They seem to help, but I noticed the above patch is different from what
you originally sent on the list.
The original patch has an extra chunk:
@@ -1444,6 +1445,7 @@ read_again:
 		r1_bio->state = 0;
 		r1_bio->mddev = mddev;
 		r1_bio->sector = bio->bi_iter.bi_sector + sectors_handled;
+		start_next_window = wait_barrier(conf, bio);
 		goto retry_write;
 	}
Is the correct patch with or without this chunk?
Thanks,
Bill
* Re: [GIT PULL REQUEST] late md/raid1 bug fixes for 3.17
From: NeilBrown @ 2014-09-27 0:09 UTC (permalink / raw)
To: BillStuff; +Cc: linux RAID
On Fri, 26 Sep 2014 14:08:08 -0500 BillStuff <billstuff2001@sbcglobal.net>
wrote:
> On 09/23/2014 09:18 PM, NeilBrown wrote:
> [snip]
> > md/raid1: intialise start_next_window for READ case to avoid hang
> >
>
> Neil, I've been testing these patches for the past week or two to see if
> they help a raid1 "check" hang I had.
>
> They seem to help, but I noticed the above patch is different from what
> you originally sent on the list.
>
> The original patch has an extra chunk:
>
> @@ -1444,6 +1445,7 @@ read_again:
>  		r1_bio->state = 0;
>  		r1_bio->mddev = mddev;
>  		r1_bio->sector = bio->bi_iter.bi_sector + sectors_handled;
> +		start_next_window = wait_barrier(conf, bio);
>  		goto retry_write;
>  	}
>
> Is the correct patch with or without this chunk?
>
> Thanks,
> Bill
That hunk was wrong.
This new r1_bio is attached to the previous one and they all complete (and
particularly all "allow_barrier") as a unit. So only one wait_barrier is
needed.
That chunk only has any effect if you have a bad-blocks list with bad blocks
in it, and try to write a range of the device which includes the bad block.
Thanks,
NeilBrown