public inbox for linux-kernel@vger.kernel.org
From: Ming Lei <ming.lei@redhat.com>
To: "jianchao.wang" <jianchao.w.wang@oracle.com>
Cc: Jens Axboe <axboe@fb.com>,
	linux-block@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Stefan Haberland <sth@linux.vnet.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 2/2] blk-mq: simplify queue mapping & schedule with each possisble CPU
Date: Wed, 17 Jan 2018 11:52:05 +0800
Message-ID: <20180117035159.GA9487@ming.t460p>
In-Reply-To: <7f5bad86-febc-06fc-67c0-393777d172e4@oracle.com>

Hi Jianchao,

On Wed, Jan 17, 2018 at 10:56:13AM +0800, jianchao.wang wrote:
> Hi Ming
> 
> Thanks for your patch and kind response.

You are welcome!

> 
> On 01/16/2018 11:32 PM, Ming Lei wrote:
> > OK, I got it. This should be the only corner case: all CPUs mapped
> > to this hctx have gone offline. I believe the following patch
> > addresses it; could you give it a test?
> > 
> > ---
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index c376d1b6309a..23f0f3ddffcf 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1416,21 +1416,44 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
> >   */
> >  static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
> >  {
> > +	bool tried = false;
> > +
> >  	if (hctx->queue->nr_hw_queues == 1)
> >  		return WORK_CPU_UNBOUND;
> >  
> >  	if (--hctx->next_cpu_batch <= 0) {
> >  		int next_cpu;
> > +select_cpu:
> >  
> >  		next_cpu = cpumask_next_and(hctx->next_cpu, hctx->cpumask,
> >  				cpu_online_mask);
> >  		if (next_cpu >= nr_cpu_ids)
> >  			next_cpu = cpumask_first_and(hctx->cpumask,cpu_online_mask);
> >  
> > -		hctx->next_cpu = next_cpu;
> > +		/*
> > +		 * No online CPU can be found here when running from
> > +		 * blk_mq_hctx_notify_dead(), so make sure hctx->next_cpu
> > +		 * is set correctly.
> > +		 */
> > +		if (next_cpu >= nr_cpu_ids)
> > +			hctx->next_cpu = cpumask_first_and(hctx->cpumask,
> > +					cpu_possible_mask);
> > +		else
> > +			hctx->next_cpu = next_cpu;
> >  		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
> >  	}
> >  
> > +	/*
> > +	 * Do unbound schedule if we can't find a online CPU for this hctx,
> > +	 * and it should happen only if hctx->next_cpu is becoming DEAD.
> > +	 */
> > +	if (!cpu_online(hctx->next_cpu)) {
> > +		if (!tried) {
> > +			tried = true;
> > +			goto select_cpu;
> > +		}
> > +		return WORK_CPU_UNBOUND;
> > +	}
> >  	return hctx->next_cpu;
> >  }
> 
> I have tested this patch. The panic was gone, but I got the following:
> 
> [  231.674464] WARNING: CPU: 0 PID: 263 at /home/will/u04/source_code/linux-block/block/blk-mq.c:1315 __blk_mq_run_hw_queue+0x92/0xa0
>

......

> It is here.
> __blk_mq_run_hw_queue()
> ....
>     WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
>         cpu_online(hctx->next_cpu));

I think this warning is triggered after the CPU in hctx->next_cpu comes
back online. The following trick should handle that case. This patch
applies on top of the previous one; please test it and see whether the
warning goes away.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 23f0f3ddffcf..0620ccb65e4e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1452,6 +1452,9 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 			tried = true;
 			goto select_cpu;
 		}
+
+		/* handle after this CPU of hctx->next_cpu becomes online again */
+		hctx->next_cpu_batch = 1;
 		return WORK_CPU_UNBOUND;
 	}
 	return hctx->next_cpu;
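
For readers following the selection logic, here is a minimal user-space
sketch of how the two hunks above behave together. It is a model under
stated assumptions, not the kernel code: CPU masks are plain 64-bit
integers, every model_* and MODEL_* name is hypothetical, the
single-queue shortcut is left out, and MODEL_CPU_UNBOUND merely stands
in for WORK_CPU_UNBOUND.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS     8     /* hypothetical CPU count */
#define MODEL_CPU_UNBOUND (-1)  /* stand-in for WORK_CPU_UNBOUND */
#define MODEL_WORK_BATCH  8     /* stand-in for BLK_MQ_CPU_WORK_BATCH */

/* Hypothetical, reduced view of the hctx fields the patch touches. */
struct model_hctx {
	uint64_t cpumask;       /* CPUs mapped to this hardware queue */
	int next_cpu;
	int next_cpu_batch;
};

/* First CPU after 'prev' set in 'mask'; MODEL_NR_CPUS if there is none. */
static int model_next(int prev, uint64_t mask)
{
	for (int cpu = prev + 1; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

static int model_next_cpu(struct model_hctx *h, uint64_t online,
			  uint64_t possible)
{
	bool tried = false;

	/* The model assumes the hctx maps at least one possible CPU. */
	assert((h->cpumask & possible) != 0);

	if (--h->next_cpu_batch <= 0) {
		int next;
select_cpu:
		/* Round-robin over online CPUs of this hctx, with wrap-around. */
		next = model_next(h->next_cpu, h->cpumask & online);
		if (next >= MODEL_NR_CPUS)
			next = model_next(-1, h->cpumask & online);

		/* No online CPU: keep next_cpu valid via the possible mask. */
		if (next >= MODEL_NR_CPUS)
			h->next_cpu = model_next(-1, h->cpumask & possible);
		else
			h->next_cpu = next;
		h->next_cpu_batch = MODEL_WORK_BATCH;
	}

	if (!(online & (UINT64_C(1) << h->next_cpu))) {
		if (!tried) {
			tried = true;
			goto select_cpu;
		}
		/* Re-select on the very next call, in case a CPU came back. */
		h->next_cpu_batch = 1;
		return MODEL_CPU_UNBOUND;
	}
	return h->next_cpu;
}
```

In this model, setting the batch to 1 before the unbound return means
the next call runs the selection again right away, so a CPU that has
come back online is picked up immediately rather than after a full
batch of unbound runs.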

> ....
> 
> To eliminate this risk completely, we could have blk_mq_hctx_next_cpu return the CPU even if it is offline, and change the cpu_online above to cpu_active.
> The kworkers of the per-cpu pool must have been migrated back by the time the CPU is set active.
> But that seems to cause issues on DASD, as in your previous comment.

Yes, we can't break DASD.

> >>>>
> That is the original version of this patch, and both Christian and Stefan
> reported that their systems can't boot from DASD this way[2], so I changed
> it to AND with cpu_online_mask, after which their systems boot fine
> >>>>
> 
> On the other hand, there is also a risk in
> 
> @@ -440,7 +440,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
>  		blk_queue_exit(q);
>  		return ERR_PTR(-EXDEV);
>  	}
> -	cpu = cpumask_first(alloc_data.hctx->cpumask);
> +	cpu = cpumask_first_and(alloc_data.hctx->cpumask, cpu_online_mask);
>  	alloc_data.ctx = __blk_mq_get_ctx(q, cpu);
> 
> What if the CPUs in alloc_data.hctx->cpumask are all offline?

This one is crazy, and is used by NVMe only. It should be fine if the
passed 'hctx_idx' is derived from the currently running CPU, as
blk_mq_map_queue() does. If not, bad things may happen.
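
To make the risk concrete, here is a small user-space sketch of the
case raised above, where the hctx mask and the online mask do not
intersect. It is only a model: masks are 64-bit integers, all sketch_*
names are hypothetical, and returning -1 is just one possible way for a
caller to notice the condition, not what the patch does.

```c
#include <stdint.h>

#define SKETCH_NR_CPUS 8        /* hypothetical CPU count */

/* Lowest CPU set in both masks, or SKETCH_NR_CPUS if they do not intersect. */
static int sketch_first_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	return SKETCH_NR_CPUS;
}

/* Returns a usable CPU, or -1 when every CPU of the hctx is offline. */
static int sketch_pick_ctx_cpu(uint64_t hctx_cpumask, uint64_t online)
{
	int cpu = sketch_first_and(hctx_cpumask, online);

	/* Without this check, an out-of-range value would be used as a CPU. */
	if (cpu >= SKETCH_NR_CPUS)
		return -1;
	return cpu;
}
```

With an hctx mask of 0xC (CPUs 2 and 3) and only CPUs 0 and 1 online,
sketch_first_and() returns SKETCH_NR_CPUS, which is not a valid CPU
number; the check in sketch_pick_ctx_cpu() is what turns that into an
explicit failure.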

Thanks,
Ming

