From: Venkateswararao Jujjuri
Date: Fri, 13 May 2011 18:29:07 -0700
Message-ID: <4DCDDAE3.2030201@linux.vnet.ibm.com>
In-Reply-To: <4DCDCA6F.9000503@linux.vnet.ibm.com>
References: <1305233867-4367-1-git-send-email-jvrao@linux.vnet.ibm.com> <20110513085503.GA23158@stefanha-thinkpad.localdomain> <87sjsio0zi.fsf@linux.vnet.ibm.com> <4DCDCA6F.9000503@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [0/25] Async threading for VirtFS using glib threads & coroutines.
To: qemu-devel@nongnu.org

On 05/13/2011 05:18 PM, Venkateswararao Jujjuri wrote:
> On 05/13/2011 12:26 PM, Aneesh Kumar K.V wrote:
>> On Fri, 13 May 2011 09:55:03 +0100, Stefan Hajnoczi wrote:
>>> On Thu, May 12, 2011 at 01:57:22PM -0700, Venkateswararao Jujjuri (JV) wrote:
>>>> VirtFS (a file server based on 9P) performs many blocking system calls
>>>> in the vCPU context. This effort moves the blocking calls out of the
>>>> vCPU/IO thread context and into asynchronous threads.
>>>>
>>>> Anthony's "Add hard build dependency on glib" patch and Kevin/Stefan's
>>>> coroutine effort are prerequisites.
>>>>
>>>> This patch set contains:
>>>> - Converting all 9pfs calls into coroutines.
>>>> - Each 9P operation will be modified to:
>>>>   - Remove the post* functions. These are our callback functions, which
>>>>     make the code very hard to read. With coroutines we can achieve the
>>>>     same state machine model with a nice sequential code flow.
>>>>   - Move errno access near to the local_syscall().
>>>>   - Introduce asynchronous threading.
>>>>
>>>> This series has the basic infrastructure and a few routines like
>>>> mkdir, mknod, unlink, readdir, xattr, lstat, etc. converted.
>>>> We are currently converting and testing the other 9P operations in this
>>>> model as well, and those patches will follow shortly.
>>>>
>>>> Removing the callback functions made some of the patches a little lengthy.
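(To make the post*-removal point above concrete: today each blocking call
completes in a separate post* callback, so one 9P request is scattered across
several functions, while with coroutines the same request becomes a single
sequential function. A rough before/after sketch -- the type, function names
and signatures below are simplified illustrations, not the exact identifiers
in hw/9pfs:)

/* Before: each blocking step finishes in yet another post* callback
 * (illustrative names; errno is read far away from the syscall). */
static void v9fs_mkdir_post_mkdir(V9fsState *s, V9fsMkState *vs, int err)
{
    if (err < 0) {
        err = -errno;
        goto out;
    }
    /* the next blocking step is issued here and completes in yet
     * another post* callback ... */
out:
    complete_pdu(s, vs->pdu, err);
    qemu_free(vs);
}

/* After: the same request is one coroutine; each v9fs_co_* helper yields
 * while an asynchronous thread performs the blocking syscall. */
static void coroutine_fn v9fs_mkdir(V9fsState *s, V9fsPDU *pdu,
                                    V9fsString *name, mode_t mode)
{
    struct stat stbuf;
    int err;

    err = v9fs_co_mkdir(s, name, mode);
    if (!err) {
        err = v9fs_co_lstat(s, name, &stbuf);
    }
    complete_pdu(s, pdu, err);
}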
>>> This long patch series adds temporary structs and marshalling code for
>>> each file system operation - I think none of this is necessary. Instead
>>> we can exploit coroutines more:
>>>
>>> The point of coroutines is that you can suspend a thread of control (a
>>> call-stack, not an OS-level thread) and can re-enter it later. We
>>> should make coroutines thread-safe (i.e. work outside of the global
>>> mutex) and then allow switching a coroutine from a QEMU thread to a
>>> worker thread and back again:
>>>
>>> int coroutine_fn v9fs_co_readdir(V9fsState *s, V9fsFidState *fidp,
>>>                                  struct dirent **dent)
>>> {
>>>     int ret = 0;
>>>
>>>     v9fs_co_run_in_worker({
>>>         errno = 0;
>>>         *dent = s->ops->readdir(&s->ctx, fidp->fs.dir);
>>>         if (!*dent && errno) {
>>>             ret = -errno;
>>>         }
>>>     });
>>>     return ret;
>>> }
>>>
>>> v9fs_co_readdir() can be called from a QEMU thread. The block of code
>>> inside v9fs_co_run_in_worker() will be executed in a worker thread.
>>> Notice that no marshalling of variables is necessary at all; we can use
>>> the function arguments and local variables because this is still the
>>> same function!
>>>
>>> When control reaches the end of the v9fs_co_run_in_worker() block,
>>> execution is resumed in a QEMU thread and the function then returns ret.
>>> It would be incorrect to return inside the v9fs_co_run_in_worker() block
>>> because at that point we're still inside the worker thread.
>>>
>>> Here is how v9fs_co_run_in_worker() does its magic:
>>>
>>> #define v9fs_co_run_in_worker(block) \
>>> { \
>>>     QEMUBH *co_bh; \
>>> \
>>>     co_bh = qemu_bh_new(co_run_in_worker_bh, qemu_coroutine_self()); \
>>>     qemu_bh_schedule(co_bh); \
>>>     qemu_coroutine_yield(); /* re-entered in worker thread */ \
>>>     qemu_bh_delete(co_bh); \
>>> \
>>>     block; \
>>> \
>>>     qemu_coroutine_yield(); /* re-entered in QEMU thread */ \
>>> }
>>>
>>> void co_run_in_worker_bh(void *opaque)
>>> {
>>>     Coroutine *co = opaque;
>>>
>>>     g_thread_pool_push(pool, co, NULL);
>>> }
>>>
>>> void worker_thread_fn(gpointer data, gpointer user_data)
>>> {
>>>     Coroutine *co = data;
>>>     char byte = 0;
>>>     ssize_t len;
>>>
>>>     qemu_coroutine_enter(co, NULL);
>>>
>>>     g_async_queue_push(v9fs_pool.completed, co);
>>>     do {
>>>         len = write(v9fs_pool.wfd, &byte, sizeof(byte));
>>>     } while (len == -1 && errno == EINTR);
>>> }
>>>
>>> void process_req_done(void *arg)
>>> {
>>>     Coroutine *co;
>>>     char byte;
>>>     ssize_t len;
>>>
>>>     do {
>>>         len = read(v9fs_pool.rfd, &byte, sizeof(byte));
>>>     } while (len == -1 && errno == EINTR);
>>>
>>>     while ((co = g_async_queue_try_pop(v9fs_pool.completed)) != NULL) {
>>>         qemu_coroutine_enter(co, NULL);
>>>     }
>>> }
>>>
>>> I typed this code out in the email; it has not been compiled or tested.
>>>
>>> If you decide to eliminate coroutines entirely in the future and use
>>> worker threads exclusively to process requests, then there are clearly
>>> marked sections in the code: anything inside v9fs_co_run_in_worker()
>>> must be thread-safe already, and anything outside it needs to be audited
>>> and made thread-safe. The changes required would be smaller than if
>>> your current patch series were applied. I wanted to mention this point
>>> to show that this doesn't paint virtfs into a corner.
>>>
>>> So where does this leave virtfs? No marshalling is necessary and
>>> blocking operations can be performed inline using
>>> v9fs_co_run_in_worker() blocks. The codebase will be a lot smaller.
>>>
>>> Does this seem reasonable?
>>>
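(Side note for anyone following the snippets above: they assume some plumbing
that is not shown -- a glib thread pool, a completion queue, and a pipe whose
read end is watched from the QEMU event loop. A rough sketch of what that
setup could look like follows; the names mirror the snippets, but the setup
itself is an assumption, not code from the series:)

void worker_thread_fn(gpointer data, gpointer user_data);  /* from the mail above */
void process_req_done(void *arg);                          /* from the mail above */

/* Assumed plumbing: a pipe to wake the QEMU thread, a glib thread pool
 * (the bare 'pool' above would be v9fs_pool.pool), and a queue of
 * finished coroutines. Needs <unistd.h>, <fcntl.h>, glib.h and the
 * QEMU event-loop helpers. */
typedef struct V9fsThPool {
    int rfd;                /* read end, watched in the QEMU event loop */
    int wfd;                /* write end, written to by worker threads */
    GThreadPool *pool;      /* runs worker_thread_fn() */
    GAsyncQueue *completed; /* coroutines ready to be re-entered */
} V9fsThPool;

static V9fsThPool v9fs_pool;

static int v9fs_init_worker_threads(void)
{
    int fds[2];

    if (pipe(fds) < 0) {
        return -1;
    }
    v9fs_pool.rfd = fds[0];
    v9fs_pool.wfd = fds[1];
    fcntl(v9fs_pool.rfd, F_SETFL, O_NONBLOCK);
    fcntl(v9fs_pool.wfd, F_SETFL, O_NONBLOCK);

    v9fs_pool.completed = g_async_queue_new();
    v9fs_pool.pool = g_thread_pool_new(worker_thread_fn, NULL, -1, FALSE, NULL);

    /* process_req_done() re-enters completed coroutines in the QEMU thread */
    qemu_set_fd_handler(v9fs_pool.rfd, process_req_done, NULL, NULL);
    return 0;
}

With that in place, a worker writes a byte to wfd after pushing the finished
coroutine onto the completed queue, and process_req_done() drains the pipe and
re-enters everything on the queue, so the coroutine always resumes on the QEMU
thread.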
>> Do we really need bottom halves here? Can't we achieve the same with
>> v9fs_qemu_submit_request() and making the glib thread function
>> callback (request.func()) do qemu_coroutine_enter()?
>
>
> I had the same question. :) Tested without the BH, and basic touch testing
> worked fine.
>
>
> #define v9fs_co_run_in_worker(block) \
> { \
>     g_thread_pool_push(v9fs_pool.pool, qemu_coroutine_self(), NULL); \
>     qemu_coroutine_yield(); /* re-entered in worker thread */ \
> \
>     block; \
> \
>     qemu_coroutine_yield(); /* re-entered in QEMU thread */ \
> }

I guess there is a need for the BH after all. :) Without it, a little stress
with smp=2 causes:

    Co-routine re-entered recursively
    Aborted

- JV

>
> void v9fs_qemu_process_req_done(void *arg)
> {
>     Coroutine *co;
>     char byte;
>     ssize_t len;
>
>     do {
>         len = read(v9fs_pool.rfd, &byte, sizeof(byte));
>     } while (len == -1 && errno == EINTR);
>
>     while ((co = g_async_queue_try_pop(v9fs_pool.completed)) != NULL) {
>         qemu_coroutine_enter(co, NULL);
>     }
> }
>
> void v9fs_thread_routine(gpointer data, gpointer user_data)
> {
>     Coroutine *co = data;
>     char byte = 0;
>     ssize_t len;
>
>     qemu_coroutine_enter(co, NULL);
>
>     g_async_queue_push(v9fs_pool.completed, co);
>     do {
>         len = write(v9fs_pool.wfd, &byte, sizeof(byte));
>     } while (len == -1 && errno == EINTR);
> }
>
> This model makes the code simple, and in one shot we can convert all
> v9fs_do_syscalls into asynchronous threads. But, as Aneesh raised, will
> there be any additional overhead for the additional jumps? We can quickly
> test that out too.
>
> For this to work, first we need to consolidate errno and then convert the
> v9fs_do_syscalls with v9fs_co_run_in_worker(). After that we can post
> patches folding in the post* functions.
> Just a thought.
>
> Thanks,
> JV
>
>> like:
>>
>> int v9fs_co_readdir(V9fsState *s, V9fsFidState *fidp, struct dirent **dent)
>> {
>>     v9fs_wthread_enter();
>>     s->ops->readdir(&s->ctx, fidp->fs.dir);
>>     v9fs_wthread_exit();
>> }
>>
>> v9fs_worker_thread_enter()
>> {
>>     v9fs_qemu_submit_request(v9fs_worker_request);
>>     qemu_coroutine_yield();
>> }
>>
>> v9fs_coroutine_worker_func()
>> {
>>     qemu_coroutine_enter(coroutine, NULL);
>> }
>>
>>
>> I also wonder whether the additional bottom halves and the additional
>> setcontext/setjmp that we end up with will have a performance impact
>> compared to what we have currently?
>>
>> -aneesh
>
>
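(For what it's worth, a rough sketch of what one converted v9fs_do_syscall
could look like once errno is consolidated, following the run_in_worker
pattern discussed above -- the blocking syscall and the errno read sit
together inside the worker block; the exact name and signature here are an
assumption, not code from the series:)

int coroutine_fn v9fs_co_lstat(V9fsState *s, V9fsString *path,
                               struct stat *stbuf)
{
    int err = 0;

    v9fs_co_run_in_worker({
        /* runs in a worker thread: keep the errno read right next to
         * the blocking call so nothing can clobber it in between */
        err = s->ops->lstat(&s->ctx, path->data, stbuf);
        if (err < 0) {
            err = -errno;
        }
    });
    /* back in the QEMU thread */
    return err;
}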