From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH v1 5/9] block: loop: convert to blk-mq
Date: Wed, 20 Aug 2014 21:58:36 -0500
Message-ID: <53F5605C.2010304@kernel.dk>
References: <1408031441-31156-1-git-send-email-ming.lei@canonical.com>	<1408031441-31156-6-git-send-email-ming.lei@canonical.com>	<20140815163111.GA16652@infradead.org>	<53EE370D.1060106@kernel.dk>	<53EE3966.60609@kernel.dk>	<CACVXFVO+3SHy0=zESJqG3ZB68AZaeK5_QK2CVmZWzZ7oBWDGwg@mail.gmail.com>	<53F0EAEC.9040505@kernel.dk>	<CACVXFVO3CrD2CoHiBOZ8GNMXrT2pJ=t2BzmAHksqRTmgrpsaaQ@mail.gmail.com>	<CACVXFVN6W-AAQ6g4LXHOyvMzU+EAmj_YYBEuYRnoSB+VqCJg8A@mail.gmail.com>	<53F3B89D.6070703@kernel.dk>	<CACVXFVP_q2MfZtjPAgXrjMJS2K6H2fTFtAe3ZJXBW83uEovqkQ@mail.gmail.com>	<53F4C835.7030407@kernel.dk> <CACVXFVPxXrYi+m0bC7tEcfvDzhQ=Xnapkd+yGRXbKCktgi3Ofw@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Christoph Hellwig <hch@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Kleikamp <dave.kleikamp@oracle.com>, Zach Brown <zab@zabbo.net>,
	Benjamin LaHaise <bcrl@kvack.org>,
	Kent Overstreet <kmo@daterainc.com>,
	"open list:AIO" <linux-aio@kvack.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	Dave Chinner <david@fromorbit.com>, Tejun Heo <tj@kernel.org>
To: Ming Lei <ming.lei@canonical.com>
Return-path: <owner-linux-aio@kvack.org>
In-Reply-To: <CACVXFVPxXrYi+m0bC7tEcfvDzhQ=Xnapkd+yGRXbKCktgi3Ofw@mail.gmail.com>
Sender: owner-linux-aio@kvack.org
List-Id: linux-fsdevel.vger.kernel.org

On 2014-08-20 21:54, Ming Lei wrote:
>>>   From my investigation, context switch increases almost 50% with
>>> workqueue compared with kthread in loop in a quad-core VM. With
>>> kthread, requests may be handled as batch in cases which won't be
>>> blocked in read()/write() (like null_blk, tmpfs, ...), but it is impossible
>>> with workqueue any more.  Also block plug&unplug should have been used
>>> with kthread to optimize the case, especially when kernel AIO is applied,
>>> still impossible with work queue too.
>>
>>
>> OK, that one is actually a good point, since one need not do per-item
>> queueing. We could handle different units, though. And we should have proper
>> marking of the last item in a chain of stuff, so we might even be able to
>> offload based on that instead of doing single items. It won't help the sync
>> case, but for that, workqueue and kthread would be identical.
>
> We may do that by introducing callback of queue_rq_list in blk_mq_ops,
> and I will figure out one patch today to see if it can help the case.

I don't think we should add to the interface, I prefer keeping it clean 
like it is right now. At least not if we can get around it. My point is 
that the driver already knows when the chain is complete: that is when 
REQ_LAST is set. So before that event triggers, it need not kick off IO, 
or at least it could do it in batches before that. That may not be fully 
reliable in case of queueing errors, but if REQ_LAST or 'error return' 
is used as the way to kick off pending IO, then that should be good 
enough. Haven't audited this in a while, but at least that is the intent 
of REQ_LAST.
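
To illustrate the idea, here is a rough userspace sketch, not kernel 
code and not a proposed patch. Every name in it (sketch_queue, 
sketch_queue_rq, sketch_kick, MAX_PENDING, the 'fail' flag) is made up 
for the example; 'last' merely stands in for REQ_LAST. It holds requests 
back and kicks off the pending batch when the last item arrives, when 
the batch fills up, or when queueing fails:

```c
/* Userspace sketch only. All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16 /* assumed batch limit, picked arbitrarily */

struct sketch_queue {
	int pending[MAX_PENDING]; /* tags held back, not yet issued */
	size_t npending;          /* number of tags currently held back */
	size_t nissued;           /* total requests handed to the backend */
	size_t nkicks;            /* number of batches kicked off */
};

/* Issue everything accumulated so far as one batch. */
static void sketch_kick(struct sketch_queue *q)
{
	if (q->npending == 0)
		return;
	q->nissued += q->npending;
	q->npending = 0;
	q->nkicks++;
}

/*
 * Queue one request. 'last' plays the role of REQ_LAST, 'fail'
 * simulates a queueing error. Returns 0 on success, -1 on error.
 * On error the failing request is rejected, but whatever is already
 * pending is kicked off so that nothing is left stranded.
 */
static int sketch_queue_rq(struct sketch_queue *q, int tag,
			   bool last, bool fail)
{
	if (fail) {
		sketch_kick(q);
		return -1;
	}

	/* Holds because a full batch is always kicked below. */
	assert(q->npending < MAX_PENDING);
	q->pending[q->npending++] = tag;

	if (last || q->npending == MAX_PENDING)
		sketch_kick(q);
	return 0;
}
```

With this shape, three requests where only the third has 'last' set 
result in a single kick, and a failed queue attempt flushes whatever 
was queued before it.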

-- 
Jens Axboe
