From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756993AbbJVIsG (ORCPT ); Thu, 22 Oct 2015 04:48:06 -0400
Received: from mail-yk0-f177.google.com ([209.85.160.177]:34639 "EHLO mail-yk0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756799AbbJVIrk (ORCPT ); Thu, 22 Oct 2015 04:47:40 -0400
Date: Thu, 22 Oct 2015 17:47:33 +0900
From: Tejun Heo
To: Zhangqing Luo
Cc: axboe@kernel.dk, Guru Anbalagane , Feng Jin , linux-kernel@vger.kernel.org
Subject: Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs
Message-ID: <20151022084733.GA24379@mtj.duckdns.org>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On Mon, Oct 19, 2015 at 07:40:13AM -0700, Zhangqing Luo wrote:
....
> So every time blk_mq_freeze_queue_start, it runs in this way
>
> blk_mq_freeze_queue_start
> ->percpu_ref_kill->percpu_ref_kill_and_confirm
> ->__percpu_ref_switch_to_atomic
> ->call_rcu_sched(&ref->rcu,percpu_ref_switch_to_atomic_rcu)
>
> and blk_mq_freeze_queue_wait blocks on queue->mq_usage_counter
> as it is not zero, and wake up by percpu_ref_switch_to_atomic_rcu
> after a grace period
>
>
> My question here is why should we change ref to PERCPU at blk_mq_finish_init?
> because of this changing, delay appears.

Because percpu operations are way cheaper than atomic ones and we want
to optimize hot paths (request issue and completion) over cold paths
(init and config changes).  That's the whole point of percpu
refcnting.

The reason why percpu ref starts in atomic mode is to avoid expensive
percpu freezing if the queue is created and abandoned in quick
succession as SCSI does during LUN scanning.
If percpu freezing is happening during that, the right solution is
moving finish_init to a late enough point so that percpu switching
happens only after it's known that the queue won't be abandoned.

Thanks.

-- 
tejun