From: Mike Snitzer <snitzer@redhat.com>
To: "Merla, ShivaKrishna" <ShivaKrishna.Merla@netapp.com>
Cc: "dm-devel@redhat.com" <dm-devel@redhat.com>,
Mikulas Patocka <mpatocka@redhat.com>,
"agk@redhat.com" <agk@redhat.com>
Subject: Re: [PATCH]dm-mpath: fix for race condition between multipath_dtr and pg_init_done.
Date: Thu, 17 Oct 2013 14:53:06 -0400 [thread overview]
Message-ID: <20131017185306.GA29909@redhat.com> (raw)
In-Reply-To: <FB8A412F33166C4A930D37BD63236C902AD2A073@SACEXCMBX02-PRD.hq.netapp.com>
Thanks for reporting this. Much appreciated. More comments below.
On Thu, Oct 17 2013 at 1:31pm -0400,
Merla, ShivaKrishna <ShivaKrishna.Merla@netapp.com> wrote:
> Whenever multipath_dtr is happening, we should prevent queueing any further path
> activation work. There was a kernel panic where after pg_init_done() decrements
> pg_init_in_progress to 0, wait_for_pg_init_completion call assumes there are no
> more pending path management commands. But if pg_init_required is set by
> pg_init_done call due to retriable mode_select errors , then process_queued_ios()
> will again queue the path activation work. If free_multipath call has been
> completed by the time activate_path work is called, kernel panic was seen on
> accessing multipath members.
Your locking looks suspect to me, see comment inlined below multipath_dtr
But shouldn't we just train multipath_wait_for_pg_init_completion() to
look at m->pg_init_required? Have it wait for both pg_init_required and
pg_init_in_progress to be zero? We'd also have to audit that
pg_init_required cannot be set while pg_init_in_progress.
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
> RIP: 0010:[<ffffffffa003db1b>] [<ffffffffa003db1b>] activate_path+0x1b/0x30 [dm_multipath]
> [<ffffffff81090ac0>] worker_thread+0x170/0x2a0
> [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
>
> This patch will fix the issue by preventing any further path activations when
> multipath structures are being freed during table suspend/reload.
>
> Signed-off-by: Shiva Krishna Merla<shivakrishna.merla@netapp.com>
> Reviewed-by: Krishnasamy Somasundaram<somasundaram.krishnasamy@netapp.com>
> Tested-by: Speagle Andy<Andy.Speagle@netapp.com>
>
> ---
> --- a/drivers/md/dm-mpath.c 2013-01-29 10:12:10.000000000 -0600
> +++ b/drivers/md/dm-mpath.c 2013-10-17 09:23:21.896062928 -0500
> @@ -73,6 +73,7 @@ struct multipath {
>
> wait_queue_head_t pg_init_wait; /* Wait for pg_init completion */
>
> + unsigned dtr_in_progress; /* multipath destroy in progress */
> unsigned pg_init_required; /* pg_init needs calling? */
> unsigned pg_init_in_progress; /* Only one pg_init allowed at once */
> unsigned pg_init_delay_retry; /* Delay pg_init retry? */
> @@ -498,7 +499,8 @@ static void process_queued_ios(struct wo
> (!pgpath && !m->queue_if_no_path))
> must_queue = 0;
>
> - if (m->pg_init_required && !m->pg_init_in_progress && pgpath)
> + if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
> + !m->dtr_in_progress)
> __pg_init_all_paths(m);
>
> spin_unlock_irqrestore(&m->lock, flags);
> @@ -952,6 +954,7 @@ static void multipath_dtr(struct dm_targ
> {
> struct multipath *m = ti->private;
>
> + m->dtr_in_progress = 1;
> flush_multipath_work(m);
> free_multipath(m);
> }
Don't we need synchronization in multipath_dtr? Otherwise isn't there
still potential for a narrow race when checking m->dtr_in_progress from
process_queued_ios() or pg_init_limit_reached()?
Anyway, my concerns should be moot if multipath_wait_for_pg_init_completion()
is updated to look at pg_init_required. But I could be missing
something.
Mikulas (or Hannes or NEC guys), would welcome your take on this.
> @@ -1164,7 +1167,7 @@ static int pg_init_limit_reached(struct
>
> spin_lock_irqsave(&m->lock, flags);
>
> - if (m->pg_init_count <= m->pg_init_retries)
> + if (m->pg_init_count <= m->pg_init_retries && !m->dtr_in_progress)
> m->pg_init_required = 1;
> else
> limit_reached = 1;
> --
next prev parent reply other threads:[~2013-10-17 18:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-17 17:31 [PATCH]dm-mpath: fix for race condition between multipath_dtr and pg_init_done Merla, ShivaKrishna
2013-10-17 18:53 ` Mike Snitzer [this message]
2013-10-17 21:47 ` Hannes Reinecke
2013-10-17 21:10 ` Mike Snitzer
2013-10-17 22:03 ` Merla, ShivaKrishna
2013-10-30 0:57 ` Mike Snitzer
2013-10-30 13:32 ` Merla, ShivaKrishna
2013-10-30 13:42 ` Mike Snitzer
2013-10-30 13:44 ` Merla, ShivaKrishna
2013-11-01 18:21 ` Mikulas Patocka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131017185306.GA29909@redhat.com \
--to=snitzer@redhat.com \
--cc=ShivaKrishna.Merla@netapp.com \
--cc=agk@redhat.com \
--cc=dm-devel@redhat.com \
--cc=mpatocka@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.