From: Stefan Hajnoczi
Date: Sat, 14 May 2011 07:33:24 +0100
Subject: Re: [Qemu-devel] [0/25] Async threading for VirtFS using glib threads & coroutines.
To: Venkateswararao Jujjuri
Cc: qemu-devel@nongnu.org

On Sat, May 14, 2011 at 2:29 AM, Venkateswararao Jujjuri wrote:
> On 05/13/2011 05:18 PM, Venkateswararao Jujjuri wrote:
>>
>> On 05/13/2011 12:26 PM, Aneesh Kumar K.V wrote:
>>>
>>> On Fri, 13 May 2011 09:55:03 +0100, Stefan Hajnoczi wrote:
>>>>
>>>> On Thu, May 12, 2011 at 01:57:22PM -0700, Venkateswararao Jujjuri (JV)
>>>> wrote:
>>>>>
>>>>> VirtFS (a file server based on 9P) performs many blocking system
>>>>> calls in the vCPU context. This effort moves the blocking calls out
>>>>> of the vCPU/IO thread context and into asynchronous threads.
>>>>>
>>>>> Anthony's "Add hard build dependency on glib" patch and
>>>>> Kevin/Stefan's coroutine effort are prerequisites.
>>>>>
>>>>> This patch set contains:
>>>>>  - Converting all 9pfs calls into coroutines.
>>>>>  - Each 9P operation will be modified to:
>>>>>    - Remove the post* functions. These are our callback functions,
>>>>>      which make the code very hard to read. Now with coroutines, we
>>>>>      can achieve the same state machine model with a nice sequential
>>>>>      code flow.
>>>>>    - Move the errno access next to the local_syscall().
>>>>>    - Introduce asynchronous threading.
>>>>>
>>>>> This series has the basic infrastructure and a few routines like
>>>>> mkdir, mknod, unlink, readdir, xattr, lstat, etc. converted.
>>>>> We are currently converting and testing the remaining 9P operations
>>>>> in this model; those patches will follow shortly.
>>>>>
>>>>> Removing the callback functions made some of the patches a little
>>>>> lengthy.
>>>>
>>>> This long patch series adds temporary structs and marshalling code for
>>>> each file system operation - I think none of this is necessary.
>>>> Instead we can exploit coroutines more:
>>>>
>>>> The point of coroutines is that you can suspend a thread of control (a
>>>> call stack, not an OS-level thread) and re-enter it later. We should
>>>> make coroutines thread-safe (i.e. work outside of the global mutex)
>>>> and then allow switching a coroutine from a QEMU thread to a worker
>>>> thread and back again:
>>>>
>>>> int coroutine_fn v9fs_co_readdir(V9fsState *s, V9fsFidState *fidp,
>>>>                                  struct dirent **dent)
>>>> {
>>>>     int ret = 0;
>>>>
>>>>     v9fs_co_run_in_worker({
>>>>         errno = 0;
>>>>         *dent = s->ops->readdir(&s->ctx, fidp->fs.dir);
>>>>         if (!*dent && errno) {
>>>>             ret = -errno;
>>>>         }
>>>>     });
>>>>     return ret;
>>>> }
>>>>
>>>> v9fs_co_readdir() can be called from a QEMU thread. The block of code
>>>> inside v9fs_co_run_in_worker() will be executed in a worker thread.
>>>> Notice that no marshalling of variables is necessary at all; we can
>>>> use the function arguments and local variables because this is still
>>>> the same function!
>>>>
>>>> When control reaches the end of the v9fs_co_run_in_worker() block,
>>>> execution resumes in a QEMU thread and the function then returns ret.
>>>> It would be incorrect to return inside the v9fs_co_run_in_worker()
>>>> block because at that point we're still inside the worker thread.
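>>>>
>>>> A 9P request handler that is already running in coroutine context can
>>>> then call it as plain sequential code. For example (a hypothetical
>>>> handler purely to illustrate the calling convention; like the rest of
>>>> this, it is sketch code and has not been compiled):
>>>>
>>>> static void coroutine_fn v9fs_readdir_handler(V9fsState *s,
>>>>                                               V9fsFidState *fidp)
>>>> {
>>>>     struct dirent *dent;
>>>>     int err;
>>>>
>>>>     /* blocks this coroutine, not the QEMU thread */
>>>>     err = v9fs_co_readdir(s, fidp, &dent);
>>>>     if (err < 0) {
>>>>         /* complete the 9P request with the error */
>>>>         return;
>>>>     }
>>>>     /* use dent to build the Rreaddir reply */
>>>> }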
>>>>
>>>> Here is how v9fs_co_run_in_worker() does its magic:
>>>>
>>>> #define v9fs_co_run_in_worker(block) \
>>>> { \
>>>>     QEMUBH *co_bh; \
>>>> \
>>>>     co_bh = qemu_bh_new(co_run_in_worker_bh, qemu_coroutine_self()); \
>>>>     qemu_bh_schedule(co_bh); \
>>>>     qemu_coroutine_yield(); /* re-entered in worker thread */ \
>>>>     qemu_bh_delete(co_bh); \
>>>> \
>>>>     block; \
>>>> \
>>>>     qemu_coroutine_yield(); /* re-entered in QEMU thread */ \
>>>> }
>>>>
>>>> void co_run_in_worker_bh(void *opaque)
>>>> {
>>>>     Coroutine *co = opaque;
>>>>
>>>>     g_thread_pool_push(v9fs_pool.pool, co, NULL);
>>>> }
>>>>
>>>> void worker_thread_fn(gpointer data, gpointer user_data)
>>>> {
>>>>     Coroutine *co = data;
>>>>     char byte = 0;
>>>>     ssize_t len;
>>>>
>>>>     qemu_coroutine_enter(co, NULL);
>>>>
>>>>     g_async_queue_push(v9fs_pool.completed, co);
>>>>     do {
>>>>         len = write(v9fs_pool.wfd, &byte, sizeof(byte));
>>>>     } while (len == -1 && errno == EINTR);
>>>> }
>>>>
>>>> void process_req_done(void *arg)
>>>> {
>>>>     Coroutine *co;
>>>>     char byte;
>>>>     ssize_t len;
>>>>
>>>>     do {
>>>>         len = read(v9fs_pool.rfd, &byte, sizeof(byte));
>>>>     } while (len == -1 && errno == EINTR);
>>>>
>>>>     while ((co = g_async_queue_try_pop(v9fs_pool.completed)) != NULL) {
>>>>         qemu_coroutine_enter(co, NULL);
>>>>     }
>>>> }
>>>>
>>>> I typed this code out in the email; it has not been compiled or
>>>> tested.
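>>>>
>>>> For completeness, the v9fs_pool state used above would need to be set
>>>> up once at initialization time. Roughly like this (the struct, the
>>>> init function name, and the thread count are all made up for the
>>>> sketch):
>>>>
>>>> typedef struct {
>>>>     GThreadPool *pool;      /* runs worker_thread_fn() */
>>>>     GAsyncQueue *completed; /* coroutines ready to be re-entered */
>>>>     int rfd;                /* read end of the wakeup pipe */
>>>>     int wfd;                /* write end of the wakeup pipe */
>>>> } V9fsPool;
>>>>
>>>> static V9fsPool v9fs_pool;
>>>>
>>>> static void v9fs_pool_init(void)
>>>> {
>>>>     int fds[2];
>>>>
>>>>     if (pipe(fds) < 0) {
>>>>         abort();
>>>>     }
>>>>     v9fs_pool.rfd = fds[0];
>>>>     v9fs_pool.wfd = fds[1];
>>>>     v9fs_pool.completed = g_async_queue_new();
>>>>     v9fs_pool.pool = g_thread_pool_new(worker_thread_fn, NULL, 4,
>>>>                                        FALSE, NULL);
>>>>     /* run process_req_done() in the QEMU thread whenever a worker
>>>>      * writes to the pipe */
>>>>     qemu_set_fd_handler(v9fs_pool.rfd, process_req_done, NULL, NULL);
>>>> }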
>>>>
>>>> If you decide to eliminate coroutines entirely in the future and use
>>>> worker threads exclusively to process requests, then there are clearly
>>>> marked sections in the code: anything inside v9fs_co_run_in_worker()
>>>> must be thread-safe already, and anything outside it needs to be
>>>> audited and made thread-safe. The changes required would be smaller
>>>> than if your current patch series were applied. I wanted to mention
>>>> this point to show that this doesn't paint virtfs into a corner.
>>>>
>>>> So where does this leave virtfs? No marshalling is necessary, and
>>>> blocking operations can be performed inline using
>>>> v9fs_co_run_in_worker() blocks. The codebase will be a lot smaller.
>>>>
>>>> Does this seem reasonable?
>>>>
>>> Do we really need bottom halves here? Can't we achieve the same thing
>>> with v9fs_qemu_submit_request() and making the glib thread function
>>> callback (request.func()) do qemu_coroutine_enter()?
>>
>> I had the same question. :) Tested without the BH, and touch testing
>> worked fine.
>>
>> #define v9fs_co_run_in_worker(block) \
>> { \
>>     g_thread_pool_push(v9fs_pool.pool, qemu_coroutine_self(), NULL); \
>>     qemu_coroutine_yield(); /* re-entered in worker thread */ \
>> \
>>     block; \
>> \
>>     qemu_coroutine_yield(); /* re-entered in QEMU thread */ \
>> }
>
> I guess there is a need for the BH after all. :)
> Without it, a little stress with smp=2 causes:
>
> Co-routine re-entered recursively
> Aborted

Right. It's for synchronization:

1. Yield the coroutine in the QEMU thread.
2. Submit the coroutine to a worker thread.
3. Enter the coroutine in the worker thread.

It's not safe to swap steps 1 and 2: if the coroutine is submitted before
it has yielded, the worker thread can enter it while it is still running
in the QEMU thread.

Stefan
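P.S. To make that race concrete, here is the BH-less macro again with the
failure mode annotated (illustration only, same untested caveat as before):

#define v9fs_co_run_in_worker(block) \
{ \
    /* UNSAFE: a worker may pop this coroutine and qemu_coroutine_enter() \
     * it as soon as it is pushed, i.e. possibly before the yield below \
     * has suspended it.  Entering a coroutine that is still running is \
     * exactly the "Co-routine re-entered recursively" abort above. */ \
    g_thread_pool_push(v9fs_pool.pool, qemu_coroutine_self(), NULL); \
    qemu_coroutine_yield(); /* may lose the race against the worker */ \
\
    block; \
\
    qemu_coroutine_yield(); /* re-entered in QEMU thread */ \
}

The bottom half closes this window: co_run_in_worker_bh() only runs once
the main loop regains control, and that cannot happen until the coroutine
has yielded, so step 2 is guaranteed to come after step 1.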