Message-ID: <4DCDCA6F.9000503@linux.vnet.ibm.com>
Date: Fri, 13 May 2011 17:18:55 -0700
From: Venkateswararao Jujjuri
References: <1305233867-4367-1-git-send-email-jvrao@linux.vnet.ibm.com> <20110513085503.GA23158@stefanha-thinkpad.localdomain> <87sjsio0zi.fsf@linux.vnet.ibm.com>
In-Reply-To: <87sjsio0zi.fsf@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [0/25] Async threading for VirtFS using glib threads & coroutines.
To: "Aneesh Kumar K.V"
Cc: Stefan Hajnoczi, aliguori@us.ibm.com, qemu-devel@nongnu.org, stefanha@linux.vnet.ibm.com

On 05/13/2011 12:26 PM, Aneesh Kumar K.V wrote:
> On Fri, 13 May 2011 09:55:03 +0100, Stefan Hajnoczi wrote:
>> On Thu, May 12, 2011 at 01:57:22PM -0700, Venkateswararao Jujjuri (JV) wrote:
>>> VirtFS (a fileserver based on 9P) performs many blocking system calls in
>>> the vCPU context. This effort moves the blocking calls out of the vCPU/IO
>>> thread context into asynchronous threads.
>>>
>>> Anthony's "Add hard build dependency on glib" patch and Kevin/Stefan's
>>> coroutine effort are prerequisites.
>>>
>>> This patch set contains:
>>> - Converting all 9pfs calls into coroutines.
>>> - Each 9P operation will be modified to:
>>>   - Remove the post* functions. These are our callback functions, which
>>>     make the code very hard to read. With coroutines we can achieve the
>>>     same state-machine model with a clean sequential code flow.
>>>   - Move errno access next to the local_syscall().
>>> - Introduce asynchronous threading.
>>>
>>> This series has the basic infrastructure and a few routines like mkdir,
>>> mknod, unlink, readdir, xattr, lstat, etc. converted. We are currently
>>> converting and testing the other 9P operations as well, and those patches
>>> will follow shortly.
>>>
>>> Removing the callback functions made some of the patches a little lengthy.
>> This long patch series adds temporary structs and marshalling code for
>> each file system operation - I think none of this is necessary. Instead
>> we can exploit coroutines more:
>>
>> The point of coroutines is that you can suspend a thread of control (a
>> call-stack, not an OS-level thread) and can re-enter it later.
>> We should make coroutines thread-safe (i.e. work outside of the global
>> mutex) and then allow switching a coroutine from a QEMU thread to a
>> worker thread and back again:
>>
>> int coroutine_fn v9fs_co_readdir(V9fsState *s, V9fsFidState *fidp,
>>                                  struct dirent **dent)
>> {
>>     int ret = 0;
>>
>>     v9fs_co_run_in_worker({
>>         errno = 0;
>>         *dent = s->ops->readdir(&s->ctx, fidp->fs.dir);
>>         if (!*dent && errno) {
>>             ret = -errno;
>>         }
>>     });
>>     return ret;
>> }
>>
>> v9fs_co_readdir() can be called from a QEMU thread. The block of code
>> inside v9fs_co_run_in_worker() will be executed in a worker thread.
>> Notice that no marshalling of variables is necessary at all; we can use
>> the function arguments and local variables because this is still the
>> same function!
>>
>> When control reaches the end of the v9fs_co_run_in_worker() block,
>> execution is resumed in a QEMU thread and the function then returns ret.
>> It would be incorrect to return inside the v9fs_co_run_in_worker() block
>> because at that point we're still inside the worker thread.
>>
>> Here is how v9fs_co_run_in_worker() does its magic:
>>
>> #define v9fs_co_run_in_worker(block) \
>> { \
>>     BH *co_bh; \
>> \
>>     co_bh = qemu_bh_new(co_run_in_worker_bh, qemu_coroutine_self()); \
>>     qemu_bh_schedule(co_bh); \
>>     qemu_coroutine_yield(); /* re-entered in worker thread */ \
>>     qemu_bh_delete(co_bh); \
>> \
>>     block; \
>> \
>>     qemu_coroutine_yield(); /* re-entered in QEMU thread */ \
>> }
>>
>> void co_run_in_worker_bh(void *opaque)
>> {
>>     Coroutine *co = opaque;
>>
>>     g_thread_pool_push(pool, co, NULL);
>> }
>>
>> void worker_thread_fn(gpointer data, gpointer user_data)
>> {
>>     Coroutine *co = user_data;
>>     char byte = 0;
>>     ssize_t len;
>>
>>     qemu_coroutine_enter(co, NULL);
>>
>>     g_async_queue_push(v9fs_pool.completed, co);
>>     do {
>>         len = write(v9fs_pool.wfd, &byte, sizeof(byte));
>>     } while (len == -1 && errno == EINTR);
>> }
>>
>> void process_req_done(void *arg)
>> {
>>     Coroutine *co;
>>     char byte;
>>     ssize_t len;
>>
>>     do {
>>         len = read(v9fs_pool.rfd, &byte, sizeof(byte));
>>     } while (len == -1 && errno == EINTR);
>>
>>     while ((co = g_async_queue_try_pop(v9fs_pool.completed)) != NULL) {
>>         qemu_coroutine_enter(co, NULL);
>>     }
>> }
>>
>> I typed this code out in the email; it has not been compiled or tested.
>>
>> If you decide to eliminate coroutines entirely in the future and use
>> worker threads exclusively to process requests, then there are clearly
>> marked sections in the code: anything inside v9fs_co_run_in_worker()
>> must be thread-safe already, and anything outside it needs to be audited
>> and made thread-safe. The changes required are smaller than if your
>> current patch series were applied. I wanted to mention this point to
>> show that this doesn't paint virtfs into a corner.
>>
>> So where does this leave virtfs? No marshalling is necessary and
>> blocking operations can be performed inline using
>> v9fs_co_run_in_worker() blocks. The codebase will be a lot smaller.
>>
>> Does this seem reasonable?
>>
> Do we really need bottom halves here? Can't we achieve the same with
> v9fs_qemu_submit_request() and by making the glib thread function
> callback (request.func()) do qemu_coroutine_enter()

I had the same question. :) Tested without BH, and touch testing worked fine.
#define v9fs_co_run_in_worker(block) \
{ \
    g_thread_pool_push(v9fs_pool.pool, qemu_coroutine_self(), NULL); \
    qemu_coroutine_yield(); /* re-entered in worker thread */ \
\
    block; \
\
    qemu_coroutine_yield(); /* re-entered in QEMU thread */ \
}

void v9fs_qemu_process_req_done(void *arg)
{
    Coroutine *co;
    char byte;
    ssize_t len;

    do {
        len = read(v9fs_pool.rfd, &byte, sizeof(byte));
    } while (len == -1 && errno == EINTR);

    while ((co = g_async_queue_try_pop(v9fs_pool.completed)) != NULL) {
        qemu_coroutine_enter(co, NULL);
    }
}

void v9fs_thread_routine(gpointer data, gpointer user_data)
{
    Coroutine *co = data;
    char byte = 0;
    ssize_t len;

    qemu_coroutine_enter(co, NULL);

    g_async_queue_push(v9fs_pool.completed, co);
    do {
        len = write(v9fs_pool.wfd, &byte, sizeof(byte));
    } while (len == -1 && errno == EINTR);
}

This model keeps the code simple, and in one shot we can convert all of
the v9fs_do_syscalls into asynchronous threads. But, as Aneesh raised,
will there be any additional overhead from the extra jumps? We can
quickly test that too.

For this to work, we first need to consolidate errno handling and then
convert the v9fs_do_syscalls with v9fs_co_run_in_worker(). After that we
can post patches folding in the Post* functions.

Just a thought.

Thanks,
JV

> like:
>
> int v9fs_co_readdir(V9fsState *s, V9fsFidState *fidp, struct dirent **dent)
> {
>     v9fs_wthread_enter();
>     s->ops->readdir(&s->ctx, fidp->fs.dir);
>     v9fs_wthread_exit();
> }
>
> v9fs_worker_thread_enter()
> {
>     v9fs_qemu_submit_request(v9fs_worker_request);
>     qemu_coroutine_yield();
> }
>
> v9fs_coroutine_worker_func()
> {
>     qemu_coroutine_enter(coroutine, NULL);
> }
>
> I also wonder whether the additional bottom halves and the additional
> setcontext/setjmp that we end up with will have a performance impact
> compared to what we have currently?
>
> -aneesh
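P.S. To make the conversion step above concrete, here is a rough,
untested sketch of how one v9fs_do_syscall could end up looking once its
Post* callback is folded in and errno is consolidated next to the
syscall. The v9fs_co_mkdir name and the ops->mkdir(ctx, path, mode)
signature are just placeholders for illustration, not the real
FileOperations prototype:

int coroutine_fn v9fs_co_mkdir(V9fsState *s, const char *path, mode_t mode)
{
    int err = 0;

    v9fs_co_run_in_worker({
        /* runs in the glib worker thread, so the blocking call is fine */
        err = s->ops->mkdir(&s->ctx, path, mode);
        if (err < 0) {
            /* errno is read right next to the syscall, in the same thread */
            err = -errno;
        }
    });
    /* back in the QEMU thread here; the caller's coroutine just sees err */
    return err;
}

The caller would then simply do err = v9fs_co_mkdir(s, path, mode) from
its coroutine instead of scheduling a Post* callback.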