* [PATCH] raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang
@ 2016-02-29 15:43 Nate Dailey
2016-03-06 23:33 ` Shaohua Li
2016-03-17 20:29 ` Joe Lawrence
0 siblings, 2 replies; 5+ messages in thread
From: Nate Dailey @ 2016-02-29 15:43 UTC
To: linux-raid; +Cc: Nate Dailey
If raid1d is handling a mix of read and write errors, handle_read_error's
call to freeze_array can get stuck.
This can happen because, though the bio_end_io_list is initially drained,
writes can be added to it via handle_write_finished as the retry_list
is processed. These writes contribute to nr_pending but are not included
in nr_queued.
If a later entry on the retry_list triggers a call to handle_read_error,
freeze_array hangs waiting for nr_pending == nr_queued+extra. The writes
on the bio_end_io_list aren't included in nr_queued so the condition will
never be satisfied.
To prevent the hang, include bio_end_io_list writes in nr_queued.
There's probably a better way to handle decrementing nr_queued, but this
seemed like the safest way to avoid breaking surrounding code.
I'm happy to supply the script I used to repro this hang.
Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
---
drivers/md/raid1.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 4e3843f..bb5bce0 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2274,6 +2274,7 @@ static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
 	if (fail) {
 		spin_lock_irq(&conf->device_lock);
 		list_add(&r1_bio->retry_list, &conf->bio_end_io_list);
+		conf->nr_queued++;
 		spin_unlock_irq(&conf->device_lock);
 		md_wakeup_thread(conf->mddev->thread);
 	} else {
@@ -2391,8 +2392,10 @@ static void raid1d(struct md_thread *thread)
 		LIST_HEAD(tmp);
 		spin_lock_irqsave(&conf->device_lock, flags);
 		if (!test_bit(MD_CHANGE_PENDING, &mddev->flags)) {
-			list_add(&tmp, &conf->bio_end_io_list);
-			list_del_init(&conf->bio_end_io_list);
+			while (!list_empty(&conf->bio_end_io_list)) {
+				list_move(conf->bio_end_io_list.prev, &tmp);
+				conf->nr_queued--;
+			}
 		}
 		spin_unlock_irqrestore(&conf->device_lock, flags);
 		while (!list_empty(&tmp)) {
--
1.8.3.1
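
For readers following the counter arithmetic, here is a minimal userspace
sketch of the condition described in the commit message. It is not kernel
code. The struct model_conf type and the helpers freeze_would_proceed() and
run_scenario() are hypothetical names made up for illustration. The scenario
assumes one failed write and one failed read both start on the retry_list,
that raid1d decrements nr_queued as it takes each entry off that list, and
that handle_read_error calls freeze_array with extra == 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two r1conf counters that freeze_array compares. */
struct model_conf {
	int nr_pending;	/* requests still in flight */
	int nr_queued;	/* requests parked on a list for raid1d */
};

/* The wait condition from the commit message: nr_pending == nr_queued+extra. */
static bool freeze_would_proceed(const struct model_conf *c, int extra)
{
	assert(extra >= 0);
	return c->nr_pending == c->nr_queued + extra;
}

/*
 * Walk the assumed sequence and report whether the freeze condition can
 * be met. "patched" selects whether entries added to bio_end_io_list are
 * counted in nr_queued, as the patch does.
 */
static bool run_scenario(bool patched)
{
	/* One write and one read, both failed, both on the retry_list. */
	struct model_conf c = { .nr_pending = 2, .nr_queued = 2 };

	/* raid1d takes the write off the retry_list. */
	c.nr_queued--;

	/* The failed write is moved to bio_end_io_list; it is still pending. */
	if (patched)
		c.nr_queued++;

	/* raid1d takes the read off the retry_list. */
	c.nr_queued--;

	/* The read error path now tries to freeze the array with extra == 1. */
	return freeze_would_proceed(&c, 1);
}
```

Under these assumptions run_scenario(false) returns false (2 != 0 + 1, and
only raid1d itself could complete the parked write, so the wait never ends),
while run_scenario(true) returns true (2 == 1 + 1).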
* Re: [PATCH] raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang
2016-02-29 15:43 [PATCH] raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang Nate Dailey
@ 2016-03-06 23:33 ` Shaohua Li
2016-03-14 18:59 ` Shaohua Li
2016-03-17 20:29 ` Joe Lawrence
1 sibling, 1 reply; 5+ messages in thread
From: Shaohua Li @ 2016-03-06 23:33 UTC
To: Nate Dailey; +Cc: linux-raid
On Mon, Feb 29, 2016 at 10:43:58AM -0500, Nate Dailey wrote:
> If raid1d is handling a mix of read and write errors, handle_read_error's
> call to freeze_array can get stuck.
>
> This can happen because, though the bio_end_io_list is initially drained,
> writes can be added to it via handle_write_finished as the retry_list
> is processed. These writes contribute to nr_pending but are not included
> in nr_queued.
>
> If a later entry on the retry_list triggers a call to handle_read_error,
> freeze_array hangs waiting for nr_pending == nr_queued+extra. The writes
> on the bio_end_io_list aren't included in nr_queued so the condition will
> never be satisfied.
>
> To prevent the hang, include bio_end_io_list writes in nr_queued.
>
> There's probably a better way to handle decrementing nr_queued, but this
> seemed like the safest way to avoid breaking surrounding code.
>
> I'm happy to supply the script I used to repro this hang.
Looks good. Could you please also fix raid10?
Thanks,
Shaohua
* Re: [PATCH] raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang
2016-03-06 23:33 ` Shaohua Li
@ 2016-03-14 18:59 ` Shaohua Li
0 siblings, 0 replies; 5+ messages in thread
From: Shaohua Li @ 2016-03-14 18:59 UTC
To: Nate Dailey; +Cc: linux-raid
On Sun, Mar 06, 2016 at 03:33:04PM -0800, Shaohua Li wrote:
> On Mon, Feb 29, 2016 at 10:43:58AM -0500, Nate Dailey wrote:
> > If raid1d is handling a mix of read and write errors, handle_read_error's
> > call to freeze_array can get stuck.
> >
> > This can happen because, though the bio_end_io_list is initially drained,
> > writes can be added to it via handle_write_finished as the retry_list
> > is processed. These writes contribute to nr_pending but are not included
> > in nr_queued.
> >
> > If a later entry on the retry_list triggers a call to handle_read_error,
> > freeze_array hangs waiting for nr_pending == nr_queued+extra. The writes
> > on the bio_end_io_list aren't included in nr_queued so the condition will
> > never be satisfied.
> >
> > To prevent the hang, include bio_end_io_list writes in nr_queued.
> >
> > There's probably a better way to handle decrementing nr_queued, but this
> > seemed like the safest way to avoid breaking surrounding code.
> >
> > I'm happy to supply the script I used to repro this hang.
>
> Looks good. Could you please also fix raid10?
Alright, I applied the patch and added the raid10 part, so this can be applied to 4.6.
* Re: [PATCH] raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang
2016-02-29 15:43 [PATCH] raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang Nate Dailey
2016-03-06 23:33 ` Shaohua Li
@ 2016-03-17 20:29 ` Joe Lawrence
2016-03-17 21:24 ` Shaohua Li
1 sibling, 1 reply; 5+ messages in thread
From: Joe Lawrence @ 2016-03-17 20:29 UTC
To: Nate Dailey, linux-raid, shli
On 02/29/2016 10:43 AM, Nate Dailey wrote:
> If raid1d is handling a mix of read and write errors, handle_read_error's
> call to freeze_array can get stuck.
>
> This can happen because, though the bio_end_io_list is initially drained,
> writes can be added to it via handle_write_finished as the retry_list
> is processed. These writes contribute to nr_pending but are not included
> in nr_queued.
>
> If a later entry on the retry_list triggers a call to handle_read_error,
> freeze_array hangs waiting for nr_pending == nr_queued+extra. The writes
> on the bio_end_io_list aren't included in nr_queued so the condition will
> never be satisfied.
>
> To prevent the hang, include bio_end_io_list writes in nr_queued.
>
> There's probably a better way to handle decrementing nr_queued, but this
> seemed like the safest way to avoid breaking surrounding code.
>
> I'm happy to supply the script I used to repro this hang.
>
> Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
> ---
> drivers/md/raid1.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 4e3843f..bb5bce0 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -2274,6 +2274,7 @@ static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
> if (fail) {
> spin_lock_irq(&conf->device_lock);
> list_add(&r1_bio->retry_list, &conf->bio_end_io_list);
> + conf->nr_queued++;
> spin_unlock_irq(&conf->device_lock);
> md_wakeup_thread(conf->mddev->thread);
> } else {
> @@ -2391,8 +2392,10 @@ static void raid1d(struct md_thread *thread)
> LIST_HEAD(tmp);
> spin_lock_irqsave(&conf->device_lock, flags);
> if (!test_bit(MD_CHANGE_PENDING, &mddev->flags)) {
> - list_add(&tmp, &conf->bio_end_io_list);
> - list_del_init(&conf->bio_end_io_list);
> + while (!list_empty(&conf->bio_end_io_list)) {
> + list_move(conf->bio_end_io_list.prev, &tmp);
> + conf->nr_queued--;
> + }
> }
> spin_unlock_irqrestore(&conf->device_lock, flags);
> while (!list_empty(&tmp)) {
>
Nate, Shaohua,
It looks like bio_end_io_list was added in 55ce74d4bfe1 "md/raid1:
ensure device failure recorded before write request returns", which
dates back a ways:
% git tag --contains 55ce74d4bfe1b | grep -v 'rc' | sort -V
v4.3
v4.4
v4.5
Should these patches have 'Fixes' tags for stable backporting?
Regards,
-- Joe
* Re: [PATCH] raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang
2016-03-17 20:29 ` Joe Lawrence
@ 2016-03-17 21:24 ` Shaohua Li
0 siblings, 0 replies; 5+ messages in thread
From: Shaohua Li @ 2016-03-17 21:24 UTC
To: Joe Lawrence; +Cc: Nate Dailey, linux-raid
On Thu, Mar 17, 2016 at 04:29:58PM -0400, Joe Lawrence wrote:
> On 02/29/2016 10:43 AM, Nate Dailey wrote:
> > If raid1d is handling a mix of read and write errors, handle_read_error's
> > call to freeze_array can get stuck.
> >
> > This can happen because, though the bio_end_io_list is initially drained,
> > writes can be added to it via handle_write_finished as the retry_list
> > is processed. These writes contribute to nr_pending but are not included
> > in nr_queued.
> >
> > If a later entry on the retry_list triggers a call to handle_read_error,
> > > freeze_array hangs waiting for nr_pending == nr_queued+extra. The writes
> > on the bio_end_io_list aren't included in nr_queued so the condition will
> > never be satisfied.
> >
> > To prevent the hang, include bio_end_io_list writes in nr_queued.
> >
> > There's probably a better way to handle decrementing nr_queued, but this
> > seemed like the safest way to avoid breaking surrounding code.
> >
> > I'm happy to supply the script I used to repro this hang.
> >
> > Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
> > ---
> > drivers/md/raid1.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > index 4e3843f..bb5bce0 100644
> > --- a/drivers/md/raid1.c
> > +++ b/drivers/md/raid1.c
> > @@ -2274,6 +2274,7 @@ static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
> > if (fail) {
> > spin_lock_irq(&conf->device_lock);
> > list_add(&r1_bio->retry_list, &conf->bio_end_io_list);
> > + conf->nr_queued++;
> > spin_unlock_irq(&conf->device_lock);
> > md_wakeup_thread(conf->mddev->thread);
> > } else {
> > @@ -2391,8 +2392,10 @@ static void raid1d(struct md_thread *thread)
> > LIST_HEAD(tmp);
> > spin_lock_irqsave(&conf->device_lock, flags);
> > if (!test_bit(MD_CHANGE_PENDING, &mddev->flags)) {
> > - list_add(&tmp, &conf->bio_end_io_list);
> > - list_del_init(&conf->bio_end_io_list);
> > + while (!list_empty(&conf->bio_end_io_list)) {
> > + list_move(conf->bio_end_io_list.prev, &tmp);
> > + conf->nr_queued--;
> > + }
> > }
> > spin_unlock_irqrestore(&conf->device_lock, flags);
> > while (!list_empty(&tmp)) {
> >
>
> Nate, Shaohua,
>
> It looks like bio_end_io_list was added in 55ce74d4bfe1 "md/raid1:
> ensure device failure recorded before write request returns", which
> dates back a ways:
>
> % git tag --contains 55ce74d4bfe1b | grep -v 'rc' | sort -V
> v4.3
> v4.4
> v4.5
>
> Should these patches have 'Fixes' tags for stable backporting?
I'll add it.