From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=44556 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1P8qhY-0000hB-E0 for qemu-devel@nongnu.org; Thu, 21 Oct 2010 04:40:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1P8qhT-000856-Jn for qemu-devel@nongnu.org; Thu, 21 Oct 2010 04:40:44 -0400 Received: from e23smtp09.au.ibm.com ([202.81.31.142]:60136) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1P8qhS-00084N-Ol for qemu-devel@nongnu.org; Thu, 21 Oct 2010 04:40:39 -0400 Received: from d23relay03.au.ibm.com (d23relay03.au.ibm.com [202.81.31.245]) by e23smtp09.au.ibm.com (8.14.4/8.13.1) with ESMTP id o9L8eUpH013584 for ; Thu, 21 Oct 2010 19:40:30 +1100 Received: from d23av03.au.ibm.com (d23av03.au.ibm.com [9.190.234.97]) by d23relay03.au.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o9L8eTfw958634 for ; Thu, 21 Oct 2010 19:40:29 +1100 Received: from d23av03.au.ibm.com (loopback [127.0.0.1]) by d23av03.au.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id o9L8eTkx026698 for ; Thu, 21 Oct 2010 19:40:29 +1100 Date: Thu, 21 Oct 2010 14:10:25 +0530 From: Arun R Bharadwaj Subject: Re: [Qemu-devel] [PATCH 2/3] Make paio subsystem use threadlets Message-ID: <20101021084025.GC9481@linux.vnet.ibm.com> References: <20101019173946.16514.62027.stgit@localhost6.localdomain6> <20101019174320.16514.92329.stgit@localhost6.localdomain6> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: Content-Transfer-Encoding: quoted-printable Reply-To: arun@linux.vnet.ibm.com List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Stefan Hajnoczi Cc: qemu-devel@nongnu.org * Stefan Hajnoczi [2010-10-20 10:30:38]: > On Tue, Oct 19, 2010 at 6:43 PM, Arun R Bharadwaj > wrote: > > From: Gautham R Shenoy > > > > This patch makes the paio subsystem use the threadlet framework there= by > > decoupling asynchronous threading framework portion out of > > posix-aio-compat.c > > > > The patch has been tested with fstress. > > > > Signed-off-by: Gautham R Shenoy > > Signed-off-by: Sripathi Kodi > > --- > > =A0posix-aio-compat.c | =A0166 +++++++++-----------------------------= -------------- > > =A01 files changed, 30 insertions(+), 136 deletions(-) > > > > diff --git a/posix-aio-compat.c b/posix-aio-compat.c > > index 7b862b5..6977c18 100644 > > --- a/posix-aio-compat.c > > +++ b/posix-aio-compat.c > > @@ -29,6 +29,7 @@ > > =A0#include "block_int.h" > > > > =A0#include "block/raw-posix-aio.h" > > +#include "qemu-threadlets.h" > > > > > > =A0struct qemu_paiocb { > > @@ -51,6 +52,7 @@ struct qemu_paiocb { > > =A0 =A0 struct qemu_paiocb *next; > > > > =A0 =A0 int async_context_id; > > + =A0 =A0ThreadletWork work; >=20 > The QTAILQ_ENTRY(qemu_paiocb) node field is no longer used, please remo= ve it. >=20 > > =A0}; > > > > =A0typedef struct PosixAioState { > > @@ -59,15 +61,6 @@ typedef struct PosixAioState { > > =A0} PosixAioState; > > > > > > -static pthread_mutex_t lock =3D PTHREAD_MUTEX_INITIALIZER; > > -static pthread_cond_t cond =3D PTHREAD_COND_INITIALIZER; > > -static pthread_t thread_id; > > -static pthread_attr_t attr; > > -static int max_threads =3D 64; > > -static int cur_threads =3D 0; > > -static int idle_threads =3D 0; > > -static QTAILQ_HEAD(, qemu_paiocb) request_list; > > - > > =A0#ifdef CONFIG_PREADV > > =A0static int preadv_present =3D 1; > > =A0#else > > @@ -85,39 +78,6 @@ static void die(const char *what) > > =A0 =A0 die2(errno, what); > > =A0} > > > > -static void mutex_lock(pthread_mutex_t *mutex) > > -{ > > - =A0 =A0int ret =3D pthread_mutex_lock(mutex); > > - =A0 =A0if (ret) die2(ret, "pthread_mutex_lock"); > > -} > > - > > -static void mutex_unlock(pthread_mutex_t *mutex) > > -{ > > - =A0 =A0int ret =3D pthread_mutex_unlock(mutex); > > - =A0 =A0if (ret) die2(ret, "pthread_mutex_unlock"); > > -} > > - > > -static int cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mut= ex, > > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 struct timespec= *ts) > > -{ > > - =A0 =A0int ret =3D pthread_cond_timedwait(cond, mutex, ts); > > - =A0 =A0if (ret && ret !=3D ETIMEDOUT) die2(ret, "pthread_cond_timed= wait"); > > - =A0 =A0return ret; > > -} > > - > > -static void cond_signal(pthread_cond_t *cond) > > -{ > > - =A0 =A0int ret =3D pthread_cond_signal(cond); > > - =A0 =A0if (ret) die2(ret, "pthread_cond_signal"); > > -} > > - > > -static void thread_create(pthread_t *thread, pthread_attr_t *attr, > > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0void *(*start_ro= utine)(void*), void *arg) > > -{ > > - =A0 =A0int ret =3D pthread_create(thread, attr, start_routine, arg)= ; > > - =A0 =A0if (ret) die2(ret, "pthread_create"); > > -} > > - > > =A0static ssize_t handle_aiocb_ioctl(struct qemu_paiocb *aiocb) > > =A0{ > > =A0 =A0 int ret; > > @@ -301,106 +261,51 @@ static ssize_t handle_aiocb_rw(struct qemu_pai= ocb *aiocb) > > =A0 =A0 return nbytes; > > =A0} > > > > -static void *aio_thread(void *unused) > > +static void aio_thread(ThreadletWork *work) > > =A0{ > > =A0 =A0 pid_t pid; > > + =A0 =A0struct qemu_paiocb *aiocb =3D container_of(work, struct qemu= _paiocb, work); > > + =A0 =A0ssize_t ret =3D 0; > > > > =A0 =A0 pid =3D getpid(); > > + =A0 =A0aiocb->active =3D 1; > > > > - =A0 =A0while (1) { > > - =A0 =A0 =A0 =A0struct qemu_paiocb *aiocb; > > - =A0 =A0 =A0 =A0ssize_t ret =3D 0; > > - =A0 =A0 =A0 =A0qemu_timeval tv; > > - =A0 =A0 =A0 =A0struct timespec ts; > > - > > - =A0 =A0 =A0 =A0qemu_gettimeofday(&tv); > > - =A0 =A0 =A0 =A0ts.tv_sec =3D tv.tv_sec + 10; > > - =A0 =A0 =A0 =A0ts.tv_nsec =3D 0; > > - > > - =A0 =A0 =A0 =A0mutex_lock(&lock); > > - > > - =A0 =A0 =A0 =A0while (QTAILQ_EMPTY(&request_list) && > > - =A0 =A0 =A0 =A0 =A0 =A0 =A0 !(ret =3D=3D ETIMEDOUT)) { > > - =A0 =A0 =A0 =A0 =A0 =A0ret =3D cond_timedwait(&cond, &lock, &ts); > > - =A0 =A0 =A0 =A0} > > - > > - =A0 =A0 =A0 =A0if (QTAILQ_EMPTY(&request_list)) > > - =A0 =A0 =A0 =A0 =A0 =A0break; > > - > > - =A0 =A0 =A0 =A0aiocb =3D QTAILQ_FIRST(&request_list); > > - =A0 =A0 =A0 =A0QTAILQ_REMOVE(&request_list, aiocb, node); > > - =A0 =A0 =A0 =A0aiocb->active =3D 1; > > - =A0 =A0 =A0 =A0idle_threads--; > > - =A0 =A0 =A0 =A0mutex_unlock(&lock); > > - > > - =A0 =A0 =A0 =A0switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) { > > - =A0 =A0 =A0 =A0case QEMU_AIO_READ: > > - =A0 =A0 =A0 =A0case QEMU_AIO_WRITE: > > - =A0 =A0 =A0 =A0 =A0 =A0ret =3D handle_aiocb_rw(aiocb); > > - =A0 =A0 =A0 =A0 =A0 =A0break; > > - =A0 =A0 =A0 =A0case QEMU_AIO_FLUSH: > > - =A0 =A0 =A0 =A0 =A0 =A0ret =3D handle_aiocb_flush(aiocb); > > - =A0 =A0 =A0 =A0 =A0 =A0break; > > - =A0 =A0 =A0 =A0case QEMU_AIO_IOCTL: > > - =A0 =A0 =A0 =A0 =A0 =A0ret =3D handle_aiocb_ioctl(aiocb); > > - =A0 =A0 =A0 =A0 =A0 =A0break; > > - =A0 =A0 =A0 =A0default: > > - =A0 =A0 =A0 =A0 =A0 =A0fprintf(stderr, "invalid aio request (0x%x)\= n", aiocb->aio_type); > > - =A0 =A0 =A0 =A0 =A0 =A0ret =3D -EINVAL; > > - =A0 =A0 =A0 =A0 =A0 =A0break; > > - =A0 =A0 =A0 =A0} > > - > > - =A0 =A0 =A0 =A0mutex_lock(&lock); > > - =A0 =A0 =A0 =A0aiocb->ret =3D ret; > > - =A0 =A0 =A0 =A0idle_threads++; > > - =A0 =A0 =A0 =A0mutex_unlock(&lock); > > - > > - =A0 =A0 =A0 =A0if (kill(pid, aiocb->ev_signo)) die("kill failed"); > > + =A0 =A0switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) { > > + =A0 =A0case QEMU_AIO_READ: > > + =A0 =A0case QEMU_AIO_WRITE: > > + =A0 =A0 =A0 =A0ret =3D handle_aiocb_rw(aiocb); > > + =A0 =A0 =A0 =A0break; > > + =A0 =A0case QEMU_AIO_FLUSH: > > + =A0 =A0 =A0 =A0ret =3D handle_aiocb_flush(aiocb); > > + =A0 =A0 =A0 =A0break; > > + =A0 =A0case QEMU_AIO_IOCTL: > > + =A0 =A0 =A0 =A0ret =3D handle_aiocb_ioctl(aiocb); > > + =A0 =A0 =A0 =A0break; > > + =A0 =A0default: > > + =A0 =A0 =A0 =A0fprintf(stderr, "invalid aio request (0x%x)\n", aioc= b->aio_type); > > + =A0 =A0 =A0 =A0ret =3D -EINVAL; > > + =A0 =A0 =A0 =A0break; > > =A0 =A0 } > > > > - =A0 =A0idle_threads--; > > - =A0 =A0cur_threads--; > > - =A0 =A0mutex_unlock(&lock); > > + =A0 =A0aiocb->ret =3D ret; > > > > - =A0 =A0return NULL; > > -} > > - > > -static void spawn_thread(void) > > -{ > > - =A0 =A0sigset_t set, oldset; > > - > > - =A0 =A0cur_threads++; > > - =A0 =A0idle_threads++; > > - > > - =A0 =A0/* block all signals */ > > - =A0 =A0if (sigfillset(&set)) die("sigfillset"); > > - =A0 =A0if (sigprocmask(SIG_SETMASK, &set, &oldset)) die("sigprocmas= k"); > > - > > - =A0 =A0thread_create(&thread_id, &attr, aio_thread, NULL); > > - > > - =A0 =A0if (sigprocmask(SIG_SETMASK, &oldset, NULL)) die("sigprocmas= k restore"); > > + =A0 =A0if (kill(pid, aiocb->ev_signo)) die("kill failed"); > > =A0} > > > > =A0static void qemu_paio_submit(struct qemu_paiocb *aiocb) > > =A0{ > > =A0 =A0 aiocb->ret =3D -EINPROGRESS; > > =A0 =A0 aiocb->active =3D 0; > > - =A0 =A0mutex_lock(&lock); > > - =A0 =A0if (idle_threads =3D=3D 0 && cur_threads < max_threads) > > - =A0 =A0 =A0 =A0spawn_thread(); > > - =A0 =A0QTAILQ_INSERT_TAIL(&request_list, aiocb, node); > > - =A0 =A0mutex_unlock(&lock); > > - =A0 =A0cond_signal(&cond); > > + > > + =A0 =A0aiocb->work.func =3D aio_thread; > > + =A0 =A0submit_threadletwork(&aiocb->work); > > =A0} > > > > =A0static ssize_t qemu_paio_return(struct qemu_paiocb *aiocb) > > =A0{ > > =A0 =A0 ssize_t ret; > > > > - =A0 =A0mutex_lock(&lock); > > =A0 =A0 ret =3D aiocb->ret; > > - =A0 =A0mutex_unlock(&lock); > > - > > =A0 =A0 return ret; > > =A0} > > > > @@ -536,14 +441,14 @@ static void paio_cancel(BlockDriverAIOCB *block= acb) > > =A0 =A0 struct qemu_paiocb *acb =3D (struct qemu_paiocb *)blockacb; > > =A0 =A0 int active =3D 0; > > > > - =A0 =A0mutex_lock(&lock); > > =A0 =A0 if (!acb->active) { >=20 > I'm not sure the active field serves any purpose. No memory barriers > are used so the value of active is 0 before the work is executed and 0 > *or* 1 while the work is executed. >=20 > The cancel_threadletwork() function already indicates whether > cancellation succeeded. Why not just try to cancel instead of using > the active field? >=20 This series does not touch the active field anywhere. So I feel we can implement this as a separate patch instead of clubbing it with this. -arun > > - =A0 =A0 =A0 =A0QTAILQ_REMOVE(&request_list, acb, node); > > - =A0 =A0 =A0 =A0acb->ret =3D -ECANCELED; > > + =A0 =A0 =A0 =A0if (!cancel_threadletwork(&acb->work)) > > + =A0 =A0 =A0 =A0 =A0 =A0acb->ret =3D -ECANCELED; > > + =A0 =A0 =A0 =A0 else > > + =A0 =A0 =A0 =A0 =A0 =A0active =3D 1; >=20 > The 0 and 1 return value from cancel_threadletwork() is inverted. See > also my comment on patch 1/3 in this series. >=20 > > =A0 =A0 } else if (acb->ret =3D=3D -EINPROGRESS) { > > =A0 =A0 =A0 =A0 active =3D 1; > > =A0 =A0 } > > - =A0 =A0mutex_unlock(&lock); > > > > =A0 =A0 if (active) { > > =A0 =A0 =A0 =A0 /* fail safe: if the aio could not be canceled, we wa= it for >=20 > while (qemu_paio_error(acb) =3D=3D EINPROGRESS) > ; >=20 > Tight loop with no memory barrier reading a memory location that is > updated by another thread. We shouldn't communicate between threads > without barriers. >=20 > Stefan >=20