From: Jens Axboe <axboe@kernel.dk>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [Regression x2, 3.13-git] virtio block mq hang, iostat busted on virtio devices
Date: Tue, 19 Nov 2013 15:51:27 -0700	[thread overview]
Message-ID: <528BEB6F.8040704@kernel.dk> (raw)
In-Reply-To: <528BE967.9070506@kernel.dk>

On 11/19/2013 03:42 PM, Jens Axboe wrote:
> On 11/19/2013 02:43 PM, Jens Axboe wrote:
>> On 11/19/2013 02:34 PM, Dave Chinner wrote:
>>> On Tue, Nov 19, 2013 at 02:20:42PM -0700, Jens Axboe wrote:
>>>> On Tue, Nov 19 2013, Jens Axboe wrote:
>>>>> On Tue, Nov 19 2013, Dave Chinner wrote:
>>>>>> Hi Jens,
>>>>>>
>>>>>> I was just running xfstests on a 3.13 kernel that has had the block
>>>>>> layer changes merged into it. generic/269 on XFS is hanging on a 2
>>>>>> CPU VM using virtio,cache=none for the block devices under test,
>>>>>> with many (130+) threads stuck below submit_bio() like this:
>>>>>>
>>>>>>  Call Trace:
>>>>>>   [<ffffffff81adb1c9>] schedule+0x29/0x70
>>>>>>   [<ffffffff817833ee>] percpu_ida_alloc+0x16e/0x330
>>>>>>   [<ffffffff81759bef>] blk_mq_wait_for_tags+0x1f/0x40
>>>>>>   [<ffffffff81758bee>] blk_mq_alloc_request_pinned+0x4e/0xf0
>>>>>>   [<ffffffff8175931b>] blk_mq_make_request+0x3bb/0x4a0
>>>>>>   [<ffffffff8174d2b2>] generic_make_request+0xc2/0x110
>>>>>>   [<ffffffff8174e40c>] submit_bio+0x6c/0x120
>>>>>>
>>>>>> Reads and writes are hung, both data (direct and buffered) and
>>>>>> metadata.
>>>>>>
>>>>>> Some IOs are sitting in io_schedule, waiting for IO completion (both
>>>>>> buffered and direct IO, both reads and writes) so it looks like IO
>>>>>> completion has stalled in some manner, too.
>>>>>
>>>>> Can I get a recipe to reproduce this? I haven't had any luck so far.
>>>>
>>>> OK, I reproduced it. It looks weird: basically all 64 commands are in
>>>> flight but haven't completed, so the next one that comes in just sits
>>>> there forever. I can't find any sysfs debug entries for virtio; it
>>>> would be nice to inspect its queue as well...
>>>
>>> Does it have anything to do with the fact that the request queue
>>> depth is 128 entries and the tag pool only has 66 tags in it? i.e.:
>>>
>>> /sys/block/vdb/queue/nr_requests
>>> 128
>>>
>>> /sys/block/vdb/mq/0/tags
>>> nr_tags=66, reserved_tags=2, batch_move=16, max_cache=32
>>> nr_free=0, nr_reserved=1
>>>   cpu00: nr_free=0
>>>   cpu01: nr_free=0
>>>
>>> Seems to imply that if we queue up more than 66 IOs without
>>> dispatching them, we'll run out of tags. And without another IO
>>> coming through, the "none" scheduler that virtio uses will never
>>> get a trigger to push out the currently queued IO?
>>
>> No, the nr_requests value isn't actually relevant in the blk-mq context;
>> the driver sets its own depth. For the above, it's 64 normal commands and
>> 2 reserved. The reserved tags would be for a flush, for instance. If
>> someone attempts to queue more than the allocated number of requests,
>> blk-mq stops the queue and kicks things into gear on the virtio side.
>> Then, when requests complete, we start the queue again.
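
For context, that depth comes from the driver's own blk-mq registration
rather than from nr_requests. A rough sketch of the 3.13-era setup in
virtio_blk (struct, field names and values here are approximate, not the
literal driver source):

static struct blk_mq_reg virtio_mq_reg = {
    .ops            = &virtio_mq_ops,
    .nr_hw_queues   = 1,
    .queue_depth    = 64,   /* the 64 normal tags seen in sysfs */
    .reserved_tags  = 2,    /* e.g. reserved for a flush */
    .numa_node      = NUMA_NO_NODE,
    .flags          = BLK_MQ_F_SHOULD_MERGE,
};

/* in virtblk_probe(): */
q = blk_mq_init_queue(&virtio_mq_reg, vblk);

The nr_tags=66 in the sysfs output above is simply queue_depth plus
reserved_tags.
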
>>
>> If you look at virtio_queue_rq(), it handles a single request, which is
>> already tagged at that point. If we can't add it to the ring, we simply
>> stop the queue and kick off whatever pending requests we might have. We
>> return BLK_MQ_RQ_QUEUE_BUSY to blk-mq, which tells it to back off on
>> sending us more. When we get the virtblk_done() callback from virtio, we
>> end the requests on the blk-mq side and restart the queue.
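
A schematic of the submission/completion flow just described -- this is a
paraphrase for illustration, not the literal drivers/block/virtio_blk.c
source, and the helper names and locking details are approximations:

static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req)
{
    struct virtio_blk *vblk = hctx->queue->queuedata;
    struct virtblk_req *vbr = req->special;  /* request is already tagged */
    unsigned long flags;
    int num;

    num = blk_rq_map_sg(hctx->queue, req, vbr->sg);

    spin_lock_irqsave(&vblk->vq_lock, flags);
    if (__virtblk_add_req(vblk->vq, vbr, vbr->sg, num) < 0) {
        /* Ring full: push out what is queued, tell blk-mq to back off. */
        virtqueue_kick(vblk->vq);
        blk_mq_stop_hw_queue(hctx);
        spin_unlock_irqrestore(&vblk->vq_lock, flags);
        return BLK_MQ_RQ_QUEUE_BUSY;
    }
    virtqueue_kick(vblk->vq);
    spin_unlock_irqrestore(&vblk->vq_lock, flags);
    return BLK_MQ_RQ_QUEUE_OK;
}

static void virtblk_done(struct virtqueue *vq)
{
    struct virtio_blk *vblk = vq->vdev->priv;
    struct virtblk_req *vbr;
    bool req_done = false;
    unsigned int len;
    unsigned long flags;

    spin_lock_irqsave(&vblk->vq_lock, flags);
    while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
        blk_mq_end_io(vbr->req, virtblk_result(vbr));
        req_done = true;
    }
    spin_unlock_irqrestore(&vblk->vq_lock, flags);

    /* Completions free ring slots and tags; restart any stopped hw queues. */
    if (req_done)
        blk_mq_start_stopped_hw_queues(vblk->disk->queue);
}

If virtblk_done() never sees those buffers, nothing ends the requests or
restarts a stopped queue, which would match the missing-completions symptom
below.
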
> 
> I added some debug code to see if we had anything pending on the blk-mq
> side, and it's all empty. It really just looks like we are missing
> completions on the virtio side. Very odd.

Patching in the old rq path works, however, so...

-- 
Jens Axboe

