From: Mike Snitzer <snitzer@redhat.com>
To: Junichi Nomura <j-nomura@ce.jp.nec.com>
Cc: device-mapper development <dm-devel@redhat.com>,
Bart Van Assche <bvanassche@acm.org>
Subject: Re: v3.15 dm-mpath regression: cable pull test causes I/O hang
Date: Tue, 8 Jul 2014 12:33:23 -0400
Message-ID: <20140708163323.GB23439@redhat.com>
In-Reply-To: <11AF7C027C4C02408624617A498607840132038A@BPXM12GP.gisp.nec.co.jp>

On Mon, Jul 07 2014 at 8:55pm -0400,
Junichi Nomura <j-nomura@ce.jp.nec.com> wrote:
> On 07/07/14 22:40, Bart Van Assche wrote:
> > Thanks for looking into this issue. I have attached the requested
> > information to this e-mail for a test run with kernel 3.16-rc4 at the
> > initiator side.
>
> Thank you, Bart. The information is helpful.
>
> From the "dmsetup table" output, the hardware handler is not used
> in your setup, so pg_init is not involved.
>
> > # cat /proc/diskstats
> ..
> > 253 1 dm-1 127 0 1016 1070 0 0 0 0 1 278360 278820
>
> In-flight IO remains.
>
> > # dmsetup info -c
> > Name Maj Min Stat Open Targ Event UUID
> ...
> > 26466363537333266 253 1 L--w 1 1 0 mpath-26466363537333266
>
> The typical reason for remaining IO is the device staying suspended.
> But the device state is ok here.
>
> > # dmsetup status
> ...
> > 26466363537333266: 0 256000 multipath 2 1 0 0 1 1 E 0 2 2 8:48 A 0 0 1 8:160 A 0 0 1
>
> Single path group with both paths being active on dm-1.
> But the path group is not active.
>
> I suspect what's happening here is nobody clears m->queue_io:
> multipath_busy() returns busy when queue_io=1 while clearing queue_io
> needs __choose_pgpath(), which won't be called if multipath_busy() is true.
>
> I think if you run 'sg_inq /dev/dm-1' for example in this case,
> the device will start working again.
> Since ioctl is not affected by multipath_busy(), somehow the problem
> was hidden in many cases by udev activities, for example.
>
> Attached patch should fix the problem.
> Could you give it a try?
>
> --
> Jun'ichi Nomura, NEC Corporation
>
>
> pg_ready() checks the current state of the multipath and may return
> false even if a new IO is needed to change the state.
>
> OTOH, if multipath_busy() returns busy, a new IO will not be sent
> to the multipath target and the state change won't happen. That results
> in a lock-up.
>
> The intent of multipath_busy() is to avoid unnecessary cycles of
> dequeue + request_fn + requeue if it is known that the multipath device
> will requeue.
>
> Such situations would be:
> - path group is being activated
> - there is no path and the multipath is set up to requeue if no path
>
> This patch should fix the problem introduced as a part of this commit:
> commit e809917735ebf1b9a56c24e877ce0d320baee2ec
> dm mpath: push back requests instead of queueing

Thanks Jun'ichi! I knew multipath_busy() wasn't quite right ;)

I've merged your fix with a revised header, if you see anything wrong
with the header please feel free to let me know! See:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=75c76c45b76e53b7c2f025d30e7e308bfe331004

This will likely go to Linus for 3.16-rc5 by this Friday.

BTW, I also staged a related cleanup for 3.17, see:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=2122e57c913cd842e5061137a7093bd06e728677

Thanks again,
Mike
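
For readers following the thread, the lock-up Jun'ichi describes can be illustrated with a small user-space model. This is only a sketch, not the kernel code or the merged patch: the class, its field names (pg_init_in_progress, nr_valid_paths, queue_if_no_path) and the dispatch loop are assumptions made for illustration. It shows that a busy check keyed on queue_io never lets a request reach the map step that would clear queue_io, that an ioctl bypassing the busy check unblocks the device, and that a busy check limited to the two situations listed above ("path group is being activated" and "no path, set up to requeue") avoids the hang.

```python
from dataclasses import dataclass


@dataclass
class Multipath:
    # Hypothetical, simplified state; names are illustrative only.
    queue_io: bool = True              # set until a path group has been chosen
    pg_init_in_progress: bool = False  # path group is being activated
    nr_valid_paths: int = 2            # both paths are active in the report
    queue_if_no_path: bool = False


def choose_pgpath(m: Multipath) -> None:
    """Stand-in for __choose_pgpath(): choosing a path group clears queue_io
    (no hardware handler in this setup, so there is no pg_init step)."""
    m.queue_io = False


def busy_before_fix(m: Multipath) -> bool:
    """Model of the problematic check: busy whenever queue_io is set."""
    return m.queue_io


def busy_after_fix(m: Multipath) -> bool:
    """Model of the intent stated in the patch description: busy only when
    the device is already known to requeue."""
    return m.pg_init_in_progress or (
        m.nr_valid_paths == 0 and m.queue_if_no_path
    )


def ioctl(m: Multipath) -> None:
    """ioctl is not gated by the busy check, so it can still pick a path."""
    choose_pgpath(m)


def dispatch(m: Multipath, busy, pending: int, max_rounds: int = 10) -> int:
    """Try to map `pending` requests; return how many were mapped."""
    mapped = 0
    for _ in range(max_rounds):
        if pending == 0:
            break
        if busy(m):
            continue  # request stays queued; the map step never runs
        if m.queue_io:
            choose_pgpath(m)  # the only place queue_io gets cleared
        pending -= 1
        mapped += 1
    return mapped
```

In this model, dispatch(Multipath(), busy_before_fix, 3) maps nothing until ioctl() is called on the same object, which matches the observation that 'sg_inq /dev/dm-1' or udev activity hides the problem, while dispatch(Multipath(), busy_after_fix, 3) maps all three requests.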