Message-ID: <5390ABB3.9050509@redhat.com>
Date: Thu, 05 Jun 2014 19:41:07 +0200
From: Max Reitz <mreitz@redhat.com>
References: <1401561792-13410-1-git-send-email-mreitz@redhat.com>
 <20140603143800.GG723@stefanha-thinkpad.muc.redhat.com>
In-Reply-To: <20140603143800.GG723@stefanha-thinkpad.muc.redhat.com>
Subject: Re: [Qemu-devel] [RFC 0/5] nbd: Adapt for dataplane
To: Stefan Hajnoczi
Cc: Kevin Wolf, Paolo Bonzini, Fam Zheng, qemu-devel@nongnu.org,
 Stefan Hajnoczi

On 03.06.2014 16:38, Stefan Hajnoczi wrote:
> On Sat, May 31, 2014 at 08:43:07PM +0200, Max Reitz wrote:
>> For the NBD server to work with dataplane, it needs to access the
>> exported BDS correctly. It makes the most sense to run both in the
>> same AioContext, so this series implements methods for tracking a
>> BDS's AioContext and uses them in NBD to keep the clients connected
>> to that BDS in the same AioContext.
>>
>> The reason this is an RFC and not a PATCH is my inexperience with
>> AIO, coroutines and the like. Also, I'm not sure what to do about
>> the coroutines. The NBD server has up to two coroutines per client:
>> one for receiving and one for sending. Theoretically, both would
>> have to be "transferred" to the new AioContext when it changes;
>> however, as far as I can see, coroutines are not really bound to an
>> AioContext, they are simply run in the AioContext entering them.
>> Therefore, I think a transfer is unnecessary. All coroutines are
>> entered from nbd_read() and nbd_restart_write(), both of which are
>> AIO routines registered via aio_set_fd_handler2().
>>
>> As bs_aio_detach() unregisters all of these routines, the coroutines
>> can no longer be entered until bs_aio_attach() is called again. When
>> the handlers then fire, they enter the coroutines in the new
>> AioContext. Therefore, I think an explicit transfer is unnecessary.
> This reasoning sounds correct.
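
For concreteness, the pattern described above would look roughly like
the following (a simplified, untested sketch for a single client; the
field names client->sock, client->ctx and client->send_coroutine, as
well as the exact aio_set_fd_handler2() signature, are assumptions of
mine and not necessarily what the series really does):

  static void bs_aio_attach(AioContext *ctx, void *opaque)
  {
      NBDClient *client = opaque;

      /* Install the fd handlers in the new context; from now on,
       * nbd_read() and nbd_restart_write() are invoked from that
       * event loop and thus enter the receive/send coroutines
       * there. */
      aio_set_fd_handler2(ctx, client->sock, nbd_can_read, nbd_read,
                          client->send_coroutine ? nbd_restart_write
                                                 : NULL,
                          client);
      client->ctx = ctx;
  }

  static void bs_aio_detach(void *opaque)
  {
      NBDClient *client = opaque;

      /* Remove the fd handlers; until bs_aio_attach() has run again,
       * the coroutines cannot be entered at all. */
      aio_set_fd_handler2(client->ctx, client->sock,
                          NULL, NULL, NULL, NULL);
      client->ctx = NULL;
  }
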
>> However, if bs_aio_detach() is called from a different thread than
>> the one the old AioContext is running in, we may still have running
>> coroutines which we should wait for before returning from
>> bs_aio_detach().
> The bdrv_attach/detach_aio_context() APIs have rules regarding where
> these functions are called from:
>
> /**
>  * bdrv_set_aio_context:
>  *
>  * Changes the #AioContext used for fd handlers, timers, and BHs by
>  * this BlockDriverState and all its children.
>  *
>  * This function must be called from the old #AioContext or with a
>  * lock held so the old #AioContext is not executing.

Oh, that makes things easier. *g*

>  */
> void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context);
>
> and:
>
> /* Remove fd handlers, timers, and other event loop callbacks so the
>  * event loop is no longer in use.  Called with no in-flight requests
>  * and in depth-first traversal order with parents before child nodes.
>  */
> void (*bdrv_detach_aio_context)(BlockDriverState *bs);
>
> /* Add fd handlers, timers, and other event loop callbacks so I/O
>  * requests can be processed again.  Called with no in-flight requests
>  * and in depth-first traversal order with child nodes before parent
>  * nodes.
>  */
> void (*bdrv_attach_aio_context)(BlockDriverState *bs,
>                                 AioContext *new_context);
>
> These rules ensure that it's safe to perform these operations. You
> don't have to support arbitrary callers in NBD either.
>
>> But because of my inexperience with coroutines, I'm not sure. I have
>> had these patches nearly unchanged here for about a week now and have
>> been looking for ways to test them; so far, I could only check that
>> the old use cases still work, but not whether the patches do what
>> they are intended to do: deal with BDSs changing their AioContext.
>>
>> So, because I'm not sure what else to do and because I don't know
>> how to test multiple AIO threads (how do I move a BDS into another
>> iothread?), I'm just sending this out as an RFC.
> Use a Linux guest with virtio-blk:
>
>   qemu -drive if=none,file=test.img,id=drive0 \
>        -object iothread,id=iothread0 \
>        -device virtio-blk-pci,drive=drive0,x-iothread=iothread0 \
>        ...
>
> Once the guest has booted, the virtio-blk device will be in dataplane
> mode. That means drive0's BlockDriverState ->aio_context will be the
> IOThread's AioContext and not the global qemu_aio_context.

Ah, thank you.

> Now you can exercise the run-time NBD server over QMP and check that
> things still work. For example, try running a few instances of
> "dd if=/dev/vdb of=/dev/null iflag=direct" inside the guest to stress
> guest I/O.
>
> Typically, what happens when code is not dataplane-aware is that a
> deadlock or crash occurs due to race conditions between the QEMU main
> loop and the IOThread for this virtio-blk device.
>
> For an overview of dataplane programming concepts, see:
> https://lists.gnu.org/archive/html/qemu-devel/2014-05/msg01436.html

Yes, I took this email as a reference; however, it only says how to
create a new iothread, not how to use it. :-)

Max

> Stefan
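
P.S.: For reference, the QMP commands I would use to exercise the
run-time NBD server against drive0 look something like this (host and
port are arbitrary choices of mine):

  { "execute": "nbd-server-start",
    "arguments": { "addr": { "type": "inet",
                             "data": { "host": "127.0.0.1",
                                       "port": "10809" } } } }
  { "execute": "nbd-server-add",
    "arguments": { "device": "drive0" } }

The export should then be reachable from the outside, e.g. as
nbd://127.0.0.1:10809/drive0, while the dd loops are stressing the
image from inside the guest.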