From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59911) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UzOSH-0004WW-Hu for qemu-devel@nongnu.org; Wed, 17 Jul 2013 05:55:31 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UzOSF-0003uz-VY for qemu-devel@nongnu.org; Wed, 17 Jul 2013 05:55:29 -0400 Received: from mx1.redhat.com ([209.132.183.28]:34224) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UzOSF-0003up-NK for qemu-devel@nongnu.org; Wed, 17 Jul 2013 05:55:27 -0400 Message-ID: <51E669FE.7020706@redhat.com> Date: Wed, 17 Jul 2013 11:55:10 +0200 From: Paolo Bonzini MIME-Version: 1.0 References: <1373899382-13514-1-git-send-email-stefanha@redhat.com> <51E42BEE.6030501@redhat.com> <20130717092216.GA25428@stefanha-thinkpad.redhat.com> In-Reply-To: <20130717092216.GA25428@stefanha-thinkpad.redhat.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2 00/13] dataplane: use block layer List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Stefan Hajnoczi Cc: Kevin Wolf , qemu-devel@nongnu.org, Stefan Hajnoczi Il 17/07/2013 11:22, Stefan Hajnoczi ha scritto: >> Stopping the dataplane event loop is not really necessary for >> correctness; that only requires stopping the ioeventfd, and indeed this >> series already includes patches for this. So you definitely have my >> "approval" (in quotes because you're the maintainer...) for patches 1-2-3. > > The TLS AioContext approach is compatible with the future improvements > that you mentioned. Yes, it is. But I'd rather try and understand whether we need it, before committing it. > The point of TLS AioContext is to avoid explicitly passing AioContext > everywhere. Since BH is sometimes used deep down it would be very > tedious to start passing around AioContext. qemu_bh_*() should just do > the right thing. Note that only aio_bh_new() requires the AioContext. Scheduling the BH or deleting doesn't. > BTW, besides stopping ioeventfd it is also necessary to either complete > pending I/O requests (drain) or to at least postpone callbacks for > pending I/O requests while another thread is accessing the BDS. Yes, draining is cool, I omitted it for simplicity because it happens also for non-dataplane. In fact, this is not dataplane-specific, no? So let's go with what we've been using so far---I agree that postponing risks getting thorny. >> However, I'm not sure it is feasible to have a single AioContext per >> thread. Consider a backup job whose target is not running in the same >> AioContext as the source; for optimization it might use bdrv_aio_* >> instead of coroutine-based I/O. >> >> So, once you have stopped the ioeventfd to correctly support >> bdrv_drain_all(), I would like to see code that makes other threads use >> bottom halves to do a context switch. Ping Fan's patches for a >> thread-safe BH list are useful for this. > > If I understand correctly this is means a block job wraps I/O calls so > that they are scheduled in a remote AioContext/thread when necessary. > This is necessary when a block job touches multiple BDS which belong to > different threads. Yes. I didn't consider that at this point you don't even need to put block jobs in the AioContext's thread, but you're right. They can run in the iothread. >> I think there are two ways to implement synchronous operations, there >> are two approaches: >> >> * A lock, just like Kevin mentioned (though I would place it on the >> AioContext, not the BlockDriverState). Then you can call aio_wait under >> the lock in bdrv_read and friends. > > This sounds clever. So block jobs don't need to explicitly wrap I/O > calls, it happens automatically in block.c:bdrv_read() and friends. > >> * A "complementary" bottom half that is scheduled in the >> qemu_aio_context. This second bottom half terminates processing in the >> dataplane thread and restarts the thread starting the synchronous >> operation. I'm not sure how easy it would be to write infrastructure >> for this, but basically it'd mean adding a third path ("in coroutine >> context, but in the wrong AioContext") to the current "if >> (qemu_in_coroutine())" tests that block.c has. > > Not sure I understand this approach but maybe something like moving a > coroutine from one AioContext to another and back again. Yes. I'm not sure which is best, of course. > FWIW I'm also looking at how much (little) effort it would take to > disable dataplane temporarily while block jobs are running. Since we're > discussing switching between threads already, it's not clear that we > gain much performance by trying to use dataplane. It clearly adds > complexity. Long term I'd like to remove the non-dataplane code altogether. I.e. the ioeventfd is started at the first kick and stopped at drain, but apart from that all virtio-blk I/O uses in the dataplane thread's event loop, always. It needs ioeventfd support in TCG, though. Paolo