dm-devel.redhat.com archive mirror
From: Mike Snitzer <snitzer@redhat.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
	device-mapper development <dm-devel@redhat.com>,
	"hch@lst.de" <hch@lst.de>
Subject: Re: [PATCH 8/9] dm: Fix two race conditions related to stopping and starting queues
Date: Thu, 1 Sep 2016 19:47:54 -0400
Message-ID: <20160901234754.GA13653@redhat.com>
In-Reply-To: <938609b9-3a55-0ed3-ffeb-de27e1c1e864@sandisk.com>

On Thu, Sep 01 2016 at  7:17pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 09/01/2016 03:27 PM, Mike Snitzer wrote:
> >On Thu, Sep 01 2016 at  6:22pm -0400,
> >Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> >
> >>On 09/01/2016 03:18 PM, Mike Snitzer wrote:
> >>>FYI I get the same 'dmsetup suspend --nolockfs --noflush mp' hang,
> >>>running mptest's test_02_sdev_delete, when I try your unmodified
> >>>patchset, see:
> >>>
> >>>http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=devel.bart
> >>
> >>Hello Mike,
> >>
> >>Are you aware that the code on that branch is a *modified* version
> >>of my patch series? The following patch is not present on that
> >>branch: "dm path selector: Avoid that device removal triggers an
> >>infinite loop". There are also other (smaller) differences.
> >
> >No, you're obviously talking about the 'devel' branch and not the
> >'devel.bart' branch I pointed to.  The 'devel.bart' branch is the
> >_exact_ patchset you sent.  It has the same problem as the 'devel'
> >branch.
> 
> Hello Mike,
> 
> Sorry that I misread your previous e-mail. After I received your
> latest e-mail I rebased my tree on top of the devel.bart branch
> mentioned above. My tests still pass. The only two patches in my
> tree that are relevant and that are not in the devel.bart branch
> have been attached to this e-mail. Did your test involve the sd
> driver? If so, do the attached two patches help? If the sd driver
> was not involved, can you provide more information about the hang
> you ran into? The output and log messages generated by the following
> commands after the hang has been reproduced would be very welcome:
> * echo w > /proc/sysrq-trigger
> * (cd /sys/block && grep -a '' dm*/mq/*/{pending,cpu*/rq_list})
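
The second command requested above can be wrapped in a small helper so that
it is easy to rerun after each reproduction of the hang. This is only a
sketch under stated assumptions: dump_mq_state is a hypothetical name, not
part of mptest or of the patch series, and it assumes the running kernel
exposes the blk-mq "pending" and per-CPU "rq_list" files under
/sys/block/dm*/mq/ as the command above expects. It uses a plain loop rather
than brace expansion so that it also runs under a POSIX sh.

```shell
# Print "file:line" for every line of the blk-mq pending and per-CPU rq_list
# files of each dm device. The first argument is the sysfs block directory
# (defaults to /sys/block); dump_mq_state is a hypothetical helper name.
dump_mq_state() {
    sysfs_block=${1:-/sys/block}
    (
        cd "$sysfs_block" || exit 1
        for f in dm*/mq/*/pending dm*/mq/*/cpu*/rq_list; do
            # Skip glob patterns that matched nothing.
            [ -e "$f" ] || continue
            # -a treats the file as text and the empty pattern matches every
            # line; the extra /dev/null argument makes grep print the file
            # name in front of each line. Empty files produce no output.
            grep -a '' "$f" /dev/null || true
        done
    )
}
```

A possible way to use it, as root, once the hang has been reproduced: run
"echo w > /proc/sysrq-trigger" so that blocked tasks are dumped to the kernel
log, then save the output of "dmesg" and of "dump_mq_state".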

sd is used.  I'll apply those patches and test tomorrow, but I'm pretty
skeptical.

I haven't had any problems with these tests for quite a while.  The tests
I'm running are just those in the mptest testsuite, see:
https://github.com/snitm/mptest

Running them should be as simple as:

git clone git://github.com/snitm/mptest.git
cd mptest
./runtest

The default is to use dm-mq on scsi-mq on top of tcmloop.

multipath -ll shows:

mp () dm-4 LIO-ORG ,rd
size=1.0G features='4 queue_if_no_path retain_attached_hw_handler queue_mode mq' hwhandler='1 alua' wp=rw
|-+- policy='queue-length 0' prio=-1 status=active
| |- 7:0:1:0  sdj   8:144 active ready running
| `- 8:0:1:0  sdk   8:160 active ready running
`-+- policy='queue-length 0' prio=-1 status=enabled
  |- 9:0:1:0  sdl   8:176 active ready running
  `- 10:0:1:0 sdm   8:192 active ready running
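
To cross-reference the path devices in the topology above with the kernel
log, the H:C:T:L address and device name of each path can be pulled out of
the "multipath -ll" output. A minimal sketch, assuming the output layout
shown above; list_paths is a hypothetical helper name, not a multipath-tools
command:

```shell
# Read "multipath -ll" output on stdin and print "<device> <H:C:T:L>" for
# each path line, e.g. "| |- 7:0:1:0  sdj   8:144 active ready running".
# list_paths is a hypothetical helper name.
list_paths() {
    awk '{
        for (i = 1; i < NF; i++) {
            # A path line contains a four-part SCSI address; the field that
            # follows it is the block device name.
            if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                print $(i + 1), $i
                break
            }
        }
    }'
}
```

For the map shown above, "multipath -ll mp | list_paths" would be one way to
run it.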

[ 4839.452237] scsi host7: TCM_Loopback
[ 4839.472788] scsi host8: TCM_Loopback
[ 4839.492867] scsi host9: TCM_Loopback
[ 4839.512841] scsi host10: TCM_Loopback
[ 4839.549430] scsi 7:0:1:0: Direct-Access     LIO-ORG  rd               4.0  PQ: 0 ANSI: 5
[ 4839.570556] scsi 7:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.577562] scsi 7:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 1
[ 4839.587810] sd 7:0:1:0: [sdj] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.587830] sd 7:0:1:0: Attached scsi generic sg10 type 0
[ 4839.593569] sd 7:0:1:0: alua: transition timeout set to 60 seconds
[ 4839.593572] sd 7:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.608254] scsi 8:0:1:0: Direct-Access     LIO-ORG  rd               4.0  PQ: 0 ANSI: 5
[ 4839.626620] sd 7:0:1:0: [sdj] Write Protect is off
[ 4839.631974] sd 7:0:1:0: [sdj] Mode Sense: 43 00 00 08
[ 4839.631999] sd 7:0:1:0: [sdj] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.642209] loopback/naa.50014056fcae4fb4: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.652646] sd 7:0:1:0: [sdj] Attached SCSI disk
[ 4839.673568] scsi 8:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.680573] scsi 8:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 2
[ 4839.690814] sd 8:0:1:0: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.690888] sd 8:0:1:0: Attached scsi generic sg11 type 0
[ 4839.696543] sd 8:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.711419] scsi 9:0:1:0: Direct-Access     LIO-ORG  rd               4.0  PQ: 0 ANSI: 5
[ 4839.722730] sd 8:0:1:0: [sdk] Write Protect is off
[ 4839.728076] sd 8:0:1:0: [sdk] Mode Sense: 43 00 00 08
[ 4839.728094] sd 8:0:1:0: [sdk] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.738298] loopback/naa.500140553365fbe6: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.748700] sd 8:0:1:0: [sdk] Attached SCSI disk
[ 4839.771561] scsi 9:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.778567] scsi 9:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 3
[ 4839.788794] sd 9:0:1:0: [sdl] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.788823] sd 9:0:1:0: Attached scsi generic sg12 type 0
[ 4839.794546] sd 9:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.809308] scsi 10:0:1:0: Direct-Access     LIO-ORG  rd               4.0  PQ: 0 ANSI: 5
[ 4839.820806] sd 9:0:1:0: [sdl] Write Protect is off
[ 4839.826161] sd 9:0:1:0: [sdl] Mode Sense: 43 00 00 08
[ 4839.826181] sd 9:0:1:0: [sdl] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.836379] loopback/naa.5001405631dca816: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.846762] sd 9:0:1:0: [sdl] Attached SCSI disk
[ 4839.856572] scsi 10:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.863673] scsi 10:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 4
[ 4839.874002] sd 10:0:1:0: [sdm] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.874033] sd 10:0:1:0: Attached scsi generic sg13 type 0
[ 4839.879549] sd 10:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.897162] sd 10:0:1:0: [sdm] Write Protect is off
[ 4839.902613] sd 10:0:1:0: [sdm] Mode Sense: 43 00 00 08
[ 4839.902632] sd 10:0:1:0: [sdm] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.912935] loopback/naa.5001405afca06b48: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.923291] sd 10:0:1:0: [sdm] Attached SCSI disk
[ 4841.065972] device-mapper: multipath queue-length: version 0.2.0 loaded

