Date: Wed, 10 Nov 2010 23:24:42 +0530
From: Arun R Bharadwaj
Reply-To: arun@linux.vnet.ibm.com
To: Stefan Hajnoczi
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/3] Make paio subsystem use threadlets infrastructure
Message-ID: <20101110175442.GA2300@linux.vnet.ibm.com>
References: <20101110131822.29714.34137.stgit@localhost6.localdomain6> <20101110131946.29714.98218.stgit@localhost6.localdomain6>
List-Id: qemu-devel.nongnu.org

* Stefan Hajnoczi [2010-11-10 13:45:29]:

> On Wed, Nov 10, 2010 at 1:19 PM, Arun R Bharadwaj
> wrote:
> > @@ -301,106 +431,58 @@ static ssize_t handle_aiocb_rw(struct qemu_paiocb *aiocb)
> >      return nbytes;
> >  }
> >
> > -static void *aio_thread(void *unused)
> > +static void aio_thread(ThreadletWork *work)
> >  {
> >      pid_t pid;
> > +    struct qemu_paiocb *aiocb = container_of(work, struct qemu_paiocb, work);
> > +    ssize_t ret = 0;
> >
> >      pid = getpid();
> >
> > -    while (1) {
> > -        struct qemu_paiocb *aiocb;
> > -        ssize_t ret = 0;
> > -        qemu_timeval tv;
> > -        struct timespec ts;
> > -
> > -        qemu_gettimeofday(&tv);
> > -        ts.tv_sec = tv.tv_sec + 10;
> > -        ts.tv_nsec = 0;
> > -
> > -        mutex_lock(&lock);
> > -
> > -        while (QTAILQ_EMPTY(&request_list) &&
> > -               !(ret == ETIMEDOUT)) {
> > -            ret = cond_timedwait(&cond, &lock, &ts);
> > -        }
> > -
> > -        if (QTAILQ_EMPTY(&request_list))
> > -            break;
> > -
> > -        aiocb = QTAILQ_FIRST(&request_list);
> > -        QTAILQ_REMOVE(&request_list, aiocb, node);
> > -        aiocb->active = 1;
> > -        idle_threads--;
> > -        mutex_unlock(&lock);
> > -
> > -        switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
> > -        case QEMU_AIO_READ:
> > -        case QEMU_AIO_WRITE:
> > -            ret = handle_aiocb_rw(aiocb);
> > -            break;
> > -        case QEMU_AIO_FLUSH:
> > -            ret = handle_aiocb_flush(aiocb);
> > -            break;
> > -        case QEMU_AIO_IOCTL:
> > -            ret = handle_aiocb_ioctl(aiocb);
> > -            break;
> > -        default:
> > -            fprintf(stderr, "invalid aio request (0x%x)\n", aiocb->aio_type);
> > -            ret = -EINVAL;
> > -            break;
> > -        }
> > -
> > -        mutex_lock(&lock);
> > -        aiocb->ret = ret;
> > -        idle_threads++;
> > -        mutex_unlock(&lock);
> > -
> > -        if (kill(pid, aiocb->ev_signo)) die("kill failed");
> > +    switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
> > +    case QEMU_AIO_READ:
> > +    case QEMU_AIO_WRITE:
> > +        ret = handle_aiocb_rw(aiocb);
> > +        break;
> > +    case QEMU_AIO_FLUSH:
> > +        ret = handle_aiocb_flush(aiocb);
> > +        break;
> > +    case QEMU_AIO_IOCTL:
> > +        ret = handle_aiocb_ioctl(aiocb);
> > +        break;
> > +    default:
> > +        fprintf(stderr, "invalid aio request (0x%x)\n", aiocb->aio_type);
> > +        ret = -EINVAL;
> > +        break;
> >      }
> >
> > -    idle_threads--;
> > -    cur_threads--;
> > -    mutex_unlock(&lock);
> > -
> > -    return NULL;
> > -}
> > -
> > -static void spawn_thread(void)
> > -{
> > -    sigset_t set, oldset;
> > -
> > -    cur_threads++;
> > -    idle_threads++;
> > +    qemu_mutex_lock(&aiocb_mutex);
> > +    aiocb->ret = ret;
>
> This is where qemu_cond_broadcast() is needed.
>
> > +    qemu_mutex_unlock(&aiocb_mutex);
> >
> > -    /* block all signals */
> > -    if (sigfillset(&set)) die("sigfillset");
> > -    if (sigprocmask(SIG_SETMASK, &set, &oldset)) die("sigprocmask");
> > -
> > -    thread_create(&thread_id, &attr, aio_thread, NULL);
> > -
> > -    if (sigprocmask(SIG_SETMASK, &oldset, NULL)) die("sigprocmask restore");
> > +    if (kill(pid, aiocb->ev_signo)) {
> > +        die("kill failed");
> > +    }
> >  }
> >
>
> I wonder if the condition variable has a measurable performance
> overhead. We unconditionally broadcast on paiocb completion.
> One idea would be to keep a counter of waiters (it should only ever be
> 0 or 1) protected by aiocb_mutex and broadcast only when there is a
> waiter. I just want to share this idea; I don't know if it's necessary
> to implement it or if it could even work without a race condition.
>

I did not understand exactly why we would see a performance hit. We do
the broadcast only after aio_thread has finished the work, right? So how
does a useless broadcast affect performance?

-arun

> >      }
> >
> >      paio_remove(acb);
> > @@ -618,11 +692,12 @@ int paio_init(void)
> >      struct sigaction act;
> >      PosixAioState *s;
> >      int fds[2];
> > -    int ret;
> >
> >      if (posix_aio_state)
> >          return 0;
> >
> > +    qemu_mutex_init(&aiocb_mutex);
>
> Also needs qemu_cond_init(&aiocb_completion).
>
> Stefan