linux-block.vger.kernel.org archive mirror
From: Jens Axboe <axboe@kernel.dk>
To: Christian Borntraeger <borntraeger@de.ibm.com>,
	Bart Van Assche <Bart.VanAssche@wdc.com>,
	"virtualization@lists.linux-foundation.org"
	<virtualization@lists.linux-foundation.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"mst@redhat.com" <mst@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)
Date: Tue, 21 Nov 2017 13:14:09 -0700
Message-ID: <d8dbde03-f528-1cac-03dd-bcc724fcd5c8@kernel.dk>
In-Reply-To: <c438db5f-f4f1-69f8-37f3-e91eae29fa25@de.ibm.com>

On 11/21/2017 01:12 PM, Christian Borntraeger wrote:
> 
> 
> On 11/21/2017 08:30 PM, Jens Axboe wrote:
>> On 11/21/2017 12:15 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 11/21/2017 07:39 PM, Jens Axboe wrote:
>>>> On 11/21/2017 11:27 AM, Jens Axboe wrote:
>>>>> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>>>>>>
>>>>>>
>>>>>> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>>>>>>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>>>>>>>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>>>>>>>>> Bisect points to
>>>>>>>>>
>>>>>>>>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>>>>>>>>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>>>>>>>>> Author: Christoph Hellwig <hch@lst.de>
>>>>>>>>> Date:   Mon Jun 26 12:20:57 2017 +0200
>>>>>>>>>
>>>>>>>>>     blk-mq: Create hctx for each present CPU
>>>>>>>>>     
>>>>>>>>>     commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>>>>>>>>>     
>>>>>>>>>     Currently we only create hctx for online CPUs, which can lead to a lot
>>>>>>>>>     of churn due to frequent soft offline / online operations.  Instead
>>>>>>>>>     allocate one for each present CPU to avoid this and dramatically simplify
>>>>>>>>>     the code.
>>>>>>>>>     
>>>>>>>>>     Signed-off-by: Christoph Hellwig <hch@lst.de>
>>>>>>>>>     Reviewed-by: Jens Axboe <axboe@kernel.dk>
>>>>>>>>>     Cc: Keith Busch <keith.busch@intel.com>
>>>>>>>>>     Cc: linux-block@vger.kernel.org
>>>>>>>>>     Cc: linux-nvme@lists.infradead.org
>>>>>>>>>     Link: http://lkml.kernel.org/r/20170626102058.10200-3-hch@lst.de
>>>>>>>>>     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>>>>>>>>>     Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
>>>>>>>>>     Cc: Mike Galbraith <efault@gmx.de>
>>>>>>>>>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>>>>>>
>>>>>>>> I wonder if we're simply not getting the masks updated correctly. I'll
>>>>>>>> take a look.
>>>>>>>
>>>>>>> Can't make it trigger here. We do init for each present CPU, which means
>>>>>>> that if I offline a few CPUs here and register a queue, those still show
>>>>>>> up as present (just offline) and get mapped accordingly.
>>>>>>>
>>>>>>> From the looks of it, your setup is different. If the CPU doesn't show
>>>>>>> up as present and it gets hotplugged, then I can see how this condition
>>>>>>> would trigger. What environment are you running this in? We might have
>>>>>>> to re-introduce the CPU hotplug notifier; right now we just monitor
>>>>>>> for a dead CPU and handle that.
>>>>>>
>>>>>> I am not doing a hot unplug and then a replug; I use KVM and add a
>>>>>> previously unavailable CPU.
>>>>>>
>>>>>> in libvirt/virsh speak:
>>>>>>   <vcpu placement='static' current='1'>4</vcpu>
>>>>>
>>>>> So that's why we run into problems. It's not present when we load the device,
>>>>> but becomes present and online afterwards.
>>>>>
>>>>> Christoph, we used to handle this just fine, your patch broke it.
>>>>>
>>>>> I'll see if I can come up with an appropriate fix.
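
[Editor's sketch] The distinction above, between a CPU that is present but offline and one that is not present at all when the device is loaded, can be checked from userspace. The following is a minimal sketch, assuming the standard sysfs CPU list files under /sys/devices/system/cpu (possible, present, online) and their usual "0-3,5" range format. The helper names parse_cpu_list, read_cpu_mask and not_yet_present are hypothetical and not part of the thread or of any kernel interface.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list string such as "0-3,5" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file or trailing comma yields no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_cpu_mask(name, root="/sys/devices/system/cpu"):
    """Read one of the sysfs CPU list files (e.g. "possible", "present")."""
    with open(f"{root}/{name}") as f:
        return parse_cpu_list(f.read())


def not_yet_present(possible, present):
    """CPUs that could still be added later but are not present right now."""
    return sorted(possible - present)
```

On a guest like the one described, calling not_yet_present(read_cpu_mask("possible"), read_cpu_mask("present")) before the hotplug would be expected to list the CPUs that can still appear after a queue has been registered. The exact values depend on the architecture and hypervisor, so treat them as something to verify rather than assume.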
>>>>
>>>> Can you try the below?
>>>
>>>
>>> It does prevent the crash, but it seems that the new CPU is not "used" after the hotplug for mq:
>>>
>>>
>>> Output with 2 CPUs:
>>> /sys/kernel/debug/block/vda
>>> /sys/kernel/debug/block/vda/hctx0
>>> /sys/kernel/debug/block/vda/hctx0/cpu0
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/completed
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/merged
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
>>> /sys/kernel/debug/block/vda/hctx0/active
>>> /sys/kernel/debug/block/vda/hctx0/run
>>> /sys/kernel/debug/block/vda/hctx0/queued
>>> /sys/kernel/debug/block/vda/hctx0/dispatched
>>> /sys/kernel/debug/block/vda/hctx0/io_poll
>>> /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
>>> /sys/kernel/debug/block/vda/hctx0/sched_tags
>>> /sys/kernel/debug/block/vda/hctx0/tags_bitmap
>>> /sys/kernel/debug/block/vda/hctx0/tags
>>> /sys/kernel/debug/block/vda/hctx0/ctx_map
>>> /sys/kernel/debug/block/vda/hctx0/busy
>>> /sys/kernel/debug/block/vda/hctx0/dispatch
>>> /sys/kernel/debug/block/vda/hctx0/flags
>>> /sys/kernel/debug/block/vda/hctx0/state
>>> /sys/kernel/debug/block/vda/sched
>>> /sys/kernel/debug/block/vda/sched/dispatch
>>> /sys/kernel/debug/block/vda/sched/starved
>>> /sys/kernel/debug/block/vda/sched/batching
>>> /sys/kernel/debug/block/vda/sched/write_next_rq
>>> /sys/kernel/debug/block/vda/sched/write_fifo_list
>>> /sys/kernel/debug/block/vda/sched/read_next_rq
>>> /sys/kernel/debug/block/vda/sched/read_fifo_list
>>> /sys/kernel/debug/block/vda/write_hints
>>> /sys/kernel/debug/block/vda/state
>>> /sys/kernel/debug/block/vda/requeue_list
>>> /sys/kernel/debug/block/vda/poll_stat
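
[Editor's sketch] The listing above shows only a cpu0 directory under hctx0, which is what indicates that the second CPU is not mapped. A small helper can pull that mapping out of such a listing instead of reading it by eye. This is a minimal sketch that assumes the debugfs layout shown in the listing (block/<dev>/hctxN/cpuM). The function name hctx_cpu_map is hypothetical.

```python
import re
from collections import defaultdict

# Matches only the per-CPU directories themselves, e.g.
# /sys/kernel/debug/block/vda/hctx0/cpu0 (not the files below them).
CPU_DIR = re.compile(
    r"^/sys/kernel/debug/block/(?P<dev>[^/]+)/(?P<hctx>hctx\d+)/cpu(?P<cpu>\d+)$"
)


def hctx_cpu_map(paths):
    """Map (device, hctx) to the sorted list of CPU numbers listed under it."""
    mapping = defaultdict(set)
    for path in paths:
        m = CPU_DIR.match(path.strip())
        if m:
            mapping[(m["dev"], m["hctx"])].add(int(m["cpu"]))
    return {key: sorted(cpus) for key, cpus in mapping.items()}
```

Fed the lines of the listing above, this returns a single entry for ("vda", "hctx0") containing only CPU 0. A hotplugged CPU that had been mapped would show up as an additional number in that list or under another hctx.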
>>
>> Try this, basically just a revert.
> 
> Yes, seems to work.
> 
> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>

Great, thanks for testing.

> Do you know why the original commit made it into 4.12 stable? After all,
> it has no Fixes tag and no Cc: stable.

I was wondering the same thing when you said it was in 4.12 stable and
not in 4.12 release. That patch should absolutely not have gone into
stable, it's not marked as such and it's not fixing a problem that is
stable worthy. In fact, it's causing a regression...

Greg? The upstream commit is mentioned higher up, at the start of the email.

-- 
Jens Axboe
