From: Mike Snitzer <snitzer@redhat.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: "axboe@kernel.dk" <axboe@kernel.dk>,
	device-mapper development <dm-devel@redhat.com>,
	"hch@lst.de" <hch@lst.de>
Subject: Re: [PATCH 8/9] dm: Fix two race conditions related to stopping and starting queues
Date: Thu, 1 Sep 2016 19:47:54 -0400
Message-ID: <20160901234754.GA13653@redhat.com>
In-Reply-To: <938609b9-3a55-0ed3-ffeb-de27e1c1e864@sandisk.com>

On Thu, Sep 01 2016 at  7:17pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 09/01/2016 03:27 PM, Mike Snitzer wrote:
> >On Thu, Sep 01 2016 at  6:22pm -0400,
> >Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> >
> >>On 09/01/2016 03:18 PM, Mike Snitzer wrote:
> >>>FYI I get the same 'dmsetup suspend --nolockfs --noflush mp' hang,
> >>>running mptest's test_02_sdev_delete, when I try your unmodified
> >>>patchset, see:
> >>>
> >>>http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=devel.bart
> >>
> >>Hello Mike,
> >>
> >>Are you aware that the code on that branch is a *modified* version
> >>of my patch series? The following patch is not present on that
> >>branch: "dm path selector: Avoid that device removal triggers an
> >>infinite loop". There are also other (smaller) differences.
> >
> >No, you're obviously talking about the 'devel' branch and not the
> >'devel.bart' branch I pointed to.  The 'devel.bart' branch is the
> >_exact_ patchset you sent.  It has the same problem as the 'devel'
> >branch.
> 
> Hello Mike,
> 
> Sorry that I misread your previous e-mail. After I received your
> latest e-mail I rebased my tree on top of the devel.bart branch
> mentioned above. My tests still pass. The only two patches in my
> tree that are relevant and that are not in the devel.bart branch
> have been attached to this e-mail. Did your test involve the sd
> driver? If so, do the attached two patches help? If the sd driver
> was not involved, can you provide more information about the hang
> you ran into? The output and log messages generated by the following
> commands after the hang has been reproduced would be very welcome:
> * echo w > /proc/sysrq-trigger
> * (cd /sys/block && grep -a '' dm*/mq/*/{pending,cpu*/rq_list})

Yes, sd is used.  I'll apply those patches and test tomorrow, but I'm
pretty skeptical.
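
For anyone else reproducing the hang, the two commands Bart asks for above
could be wrapped roughly as follows.  This is only a sketch:
collect_mq_state and the SYSFS_BLOCK override are hypothetical names, not
part of mptest or the patch series, and it assumes the kernel exposes the
blk-mq pending and rq_list files under /sys/block, as the quoted grep
command expects.

```shell
# Hypothetical helper (not from the thread): save the blk-mq state that
# Bart asked for into one file. SYSFS_BLOCK can be overridden for dry runs.
collect_mq_state() {
    sysfs_block="${SYSFS_BLOCK:-/sys/block}"
    out="${1:-mq-state.txt}"
    # Same files as dm*/mq/*/{pending,cpu*/rq_list}, written without brace
    # expansion so the function also works in a plain POSIX shell.
    (cd "$sysfs_block" &&
        grep -a '' dm*/mq/*/pending dm*/mq/*/cpu*/rq_list 2>/dev/null) > "$out"
    # Succeed only if something was actually captured.
    [ -s "$out" ]
}
```

Once the hang has been reproduced, run "echo w > /proc/sysrq-trigger" as
root so the blocked-task traces land in the kernel log, then call
"collect_mq_state hang-mq-state.txt" and send that file along with the
dmesg output.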

I haven't had any problems with these tests for quite a while.  The tests
I'm running are just those in the mptest testsuite, see:
https://github.com/snitm/mptest

Running them should be as simple as:

git clone git://github.com/snitm/mptest.git
cd mptest
./runtest

The default is to use dm-mq on scsi-mq on top of tcmloop.

multipath -ll shows:

mp () dm-4 LIO-ORG ,rd
size=1.0G features='4 queue_if_no_path retain_attached_hw_handler queue_mode mq' hwhandler='1 alua' wp=rw
|-+- policy='queue-length 0' prio=-1 status=active
| |- 7:0:1:0  sdj   8:144 active ready running
| `- 8:0:1:0  sdk   8:160 active ready running
`-+- policy='queue-length 0' prio=-1 status=enabled
  |- 9:0:1:0  sdl   8:176 active ready running
  `- 10:0:1:0 sdm   8:192 active ready running
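
When comparing setups, it can help to reduce a listing like the one above
to just its path devices.  A minimal sketch, assuming the path lines keep
the "H:C:T:L  dev" layout shown here; list_paths is a hypothetical helper,
not a multipath-tools command:

```shell
# Hypothetical helper: read `multipath -ll` output on stdin and print
# "H:C:T:L dev" for every path line.
list_paths() {
    awk '{
        # A path line contains a four-part SCSI address followed by the
        # block device name; group and map header lines do not.
        for (i = 1; i < NF; i++)
            if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                print $i, $(i + 1)
                break
            }
    }'
}
```

For example, "multipath -ll mp | list_paths" prints one line per path,
which is easy to diff against the sd devices reported in the kernel log.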

[ 4839.452237] scsi host7: TCM_Loopback
[ 4839.472788] scsi host8: TCM_Loopback
[ 4839.492867] scsi host9: TCM_Loopback
[ 4839.512841] scsi host10: TCM_Loopback
[ 4839.549430] scsi 7:0:1:0: Direct-Access     LIO-ORG  rd               4.0  PQ: 0 ANSI: 5
[ 4839.570556] scsi 7:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.577562] scsi 7:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 1
[ 4839.587810] sd 7:0:1:0: [sdj] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.587830] sd 7:0:1:0: Attached scsi generic sg10 type 0
[ 4839.593569] sd 7:0:1:0: alua: transition timeout set to 60 seconds
[ 4839.593572] sd 7:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.608254] scsi 8:0:1:0: Direct-Access     LIO-ORG  rd               4.0  PQ: 0 ANSI: 5
[ 4839.626620] sd 7:0:1:0: [sdj] Write Protect is off
[ 4839.631974] sd 7:0:1:0: [sdj] Mode Sense: 43 00 00 08
[ 4839.631999] sd 7:0:1:0: [sdj] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.642209] loopback/naa.50014056fcae4fb4: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.652646] sd 7:0:1:0: [sdj] Attached SCSI disk
[ 4839.673568] scsi 8:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.680573] scsi 8:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 2
[ 4839.690814] sd 8:0:1:0: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.690888] sd 8:0:1:0: Attached scsi generic sg11 type 0
[ 4839.696543] sd 8:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.711419] scsi 9:0:1:0: Direct-Access     LIO-ORG  rd               4.0  PQ: 0 ANSI: 5
[ 4839.722730] sd 8:0:1:0: [sdk] Write Protect is off
[ 4839.728076] sd 8:0:1:0: [sdk] Mode Sense: 43 00 00 08
[ 4839.728094] sd 8:0:1:0: [sdk] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.738298] loopback/naa.500140553365fbe6: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.748700] sd 8:0:1:0: [sdk] Attached SCSI disk
[ 4839.771561] scsi 9:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.778567] scsi 9:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 3
[ 4839.788794] sd 9:0:1:0: [sdl] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.788823] sd 9:0:1:0: Attached scsi generic sg12 type 0
[ 4839.794546] sd 9:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.809308] scsi 10:0:1:0: Direct-Access     LIO-ORG  rd               4.0  PQ: 0 ANSI: 5
[ 4839.820806] sd 9:0:1:0: [sdl] Write Protect is off
[ 4839.826161] sd 9:0:1:0: [sdl] Mode Sense: 43 00 00 08
[ 4839.826181] sd 9:0:1:0: [sdl] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.836379] loopback/naa.5001405631dca816: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.846762] sd 9:0:1:0: [sdl] Attached SCSI disk
[ 4839.856572] scsi 10:0:1:0: alua: supports implicit and explicit TPGS
[ 4839.863673] scsi 10:0:1:0: alua: device naa.600140559050dd34f6e46deb7e0e9f24 port group 0 rel port 4
[ 4839.874002] sd 10:0:1:0: [sdm] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 4839.874033] sd 10:0:1:0: Attached scsi generic sg13 type 0
[ 4839.879549] sd 10:0:1:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 4839.897162] sd 10:0:1:0: [sdm] Write Protect is off
[ 4839.902613] sd 10:0:1:0: [sdm] Mode Sense: 43 00 00 08
[ 4839.902632] sd 10:0:1:0: [sdm] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 4839.912935] loopback/naa.5001405afca06b48: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 4839.923291] sd 10:0:1:0: [sdm] Attached SCSI disk
[ 4841.065972] device-mapper: multipath queue-length: version 0.2.0 loaded
