Date: Wed, 20 Oct 2010 07:52:17 +0530
From: Balbir Singh
Subject: Re: [Qemu-devel] [PATCH 1/3] Introduce threadlets
Message-ID: <20101020022217.GL15844@balbir.in.ibm.com>
References: <20101019173946.16514.62027.stgit@localhost6.localdomain6>
 <20101019174245.16514.14542.stgit@localhost6.localdomain6>
 <20101019183644.GI15844@balbir.in.ibm.com>
 <4CBE0F5F.3020204@codemonkey.ws>
In-Reply-To: <4CBE0F5F.3020204@codemonkey.ws>
Reply-To: balbir@linux.vnet.ibm.com
To: Anthony Liguori
Cc: Arun R Bharadwaj, qemu-devel@nongnu.org

* Anthony Liguori [2010-10-19 16:36:31]:

> On 10/19/2010 01:36 PM, Balbir Singh wrote:
> >>+    qemu_mutex_lock(&(queue->lock));
> >>+    while (1) {
> >>+        ThreadletWork *work;
> >>+        int ret = 0;
> >>+
> >>+        while (QTAILQ_EMPTY(&(queue->request_list)) &&
> >>+               (ret != ETIMEDOUT)) {
> >>+            ret = qemu_cond_timedwait(&(queue->cond),
> >>+                                      &(queue->lock), 10*100000);
> >Ewww... what is 10*100000, can we use something more meaningful
> >please?
> 
> A define is fine but honestly, it's pretty darn obvious what it means...
> 
> >>+        }
> >>+
> >>+        assert(queue->idle_threads != 0);
> >This assertion holds because we believe one of the idle_threads
> >actually did the dequeuing, right?
> 
> An idle thread is a thread that is not doing work. At this
> point in the code, we are not doing any work (yet) so if
> idle_threads count is zero, something is horribly wrong. We're also
> going to unconditionally decrement in the future code path which
> means that if idle_threads is 0, it's going to become -1.
> 
> The use of idle_threads is to detect whether it's necessary to spawn
> an additional thread.
> 
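As an aside, combining the two points above (the define Anthony says is
fine, and what idle_threads actually counts), the top of the loop could be
annotated roughly as below -- the constant's name is mine, and its unit is
whatever this series' qemu_cond_timedwait() expects:

/* Illustrative name only: how long an idle worker waits in
 * qemu_cond_timedwait() before re-checking whether it should exit. */
#define THREADLET_IDLE_TIMEOUT  (10 * 100000)

    qemu_mutex_lock(&(queue->lock));
    while (1) {
        ThreadletWork *work;
        int ret = 0;

        /* idle_threads counts workers sitting in this loop with no work
         * in hand; submit_threadletwork_to_queue() reads it to decide
         * whether another worker needs to be spawned. */
        while (QTAILQ_EMPTY(&(queue->request_list)) &&
               (ret != ETIMEDOUT)) {
            ret = qemu_cond_timedwait(&(queue->cond), &(queue->lock),
                                      THREADLET_IDLE_TIMEOUT);
        }

        /* This thread is still counted as idle here, so a zero count
         * means the bookkeeping is broken. */
        assert(queue->idle_threads != 0);
        ...
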
We can hit this assert if pthread_cond_signal() is called outside of the
mutex, let me try and explain below

> >>+        if (QTAILQ_EMPTY(&(queue->request_list))) {
> >>+            if (queue->cur_threads > queue->min_threads) {
> >>+                /* We retain the minimum number of threads */
> >>+                break;
> >>+            }
> >>+        } else {
> >>+            work = QTAILQ_FIRST(&(queue->request_list));
> >>+            QTAILQ_REMOVE(&(queue->request_list), work, node);
> >>+
> >>+            queue->idle_threads--;
> >>+            qemu_mutex_unlock(&(queue->lock));
> >>+
> >>+            /* execute the work function */
> >>+            work->func(work);
> >>+
> >>+            qemu_mutex_lock(&(queue->lock));
> >>+            queue->idle_threads++;
> >>+        }
> >>+    }
> >>+
> >>+    queue->idle_threads--;
> >>+    queue->cur_threads--;
> >>+    qemu_mutex_unlock(&(queue->lock));
> >>+
> >>+    return NULL;
> >Does anybody do a join on the exiting thread from the pool?
> 
> No. The thread is created in a detached state.
> 

That makes sense, thanks for clarifying

> >>+}
> >>+
> >>+static void spawn_threadlet(ThreadletQueue *queue)
> >>+{
> >>+    QemuThread thread;
> >>+
> >>+    queue->cur_threads++;
> >>+    queue->idle_threads++;
> >>+
> >>+    qemu_thread_create(&thread, threadlet_worker, queue);
> >>+}
> >>+
> >>+/**
> >>+ * submit_threadletwork_to_queue: Submit a new task to a private queue to be
> >>+ *                                executed asynchronously.
> >>+ * @queue: Per-subsystem private queue to which the new task needs
> >>+ *         to be submitted.
> >>+ * @work: Contains information about the task that needs to be submitted.
> >>+ */
> >>+void submit_threadletwork_to_queue(ThreadletQueue *queue, ThreadletWork *work)
> >>+{
> >>+    qemu_mutex_lock(&(queue->lock));
> >>+    if (queue->idle_threads == 0 && queue->cur_threads < queue->max_threads) {
> >>+        spawn_threadlet(queue);
> >So we hold queue->lock, spawn the thread, the spawned thread tries to
> >acquire queue->lock
> 
> Yup.
> 
> >>+    }
> >>+    QTAILQ_INSERT_TAIL(&(queue->request_list), work, node);
> >>+    qemu_mutex_unlock(&(queue->lock));
> >>+    qemu_cond_signal(&(queue->cond));
> >In the case that we just spawned the threadlet, the cond_signal is
> >spurious. If we need predictable scheduling behaviour,
> >qemu_cond_signal needs to happen with queue->lock held.
> 
> It doesn't really affect predictability..
> 
> >I'd rewrite the function as
> >
> >/**
> > * submit_threadletwork_to_queue: Submit a new task to a private queue to be
> > *                                executed asynchronously.
> > * @queue: Per-subsystem private queue to which the new task needs
> > *         to be submitted.
> > * @work: Contains information about the task that needs to be submitted.
> > */
> >void submit_threadletwork_to_queue(ThreadletQueue *queue, ThreadletWork *work)
> >{
> >    qemu_mutex_lock(&(queue->lock));
> >    if (queue->idle_threads == 0 && (queue->cur_threads < queue->max_threads)) {
> >        spawn_threadlet(queue);
> >    } else {
> >        qemu_cond_signal(&(queue->cond));
> >    }
> >    QTAILQ_INSERT_TAIL(&(queue->request_list), work, node);
> >    qemu_mutex_unlock(&(queue->lock));
> >}
> 
> I think this is a lot more fragile. You're relying on the fact that
> signal will not cause the signalled thread to actually awaken until
> we release the lock, and you're doing work after signalling that needs
> to be completed before the signalled thread wakes up.
> 
> I think you're a lot more robust in the long term if you treat
> condition signalling as a hand off point because it makes the code a
> lot more explicit about what's happening.
> 

OK, here is a situation that can happen

T1                              T2
---                             ---
threadlet                       submit_threadletwork_to_queue
(sees condition as no work)
mutex_lock
qemu_cond_timedwait             add_work
...                             mutex_unlock

T3
--
cancel_threadlet_work_on_queue
mutex_lock (grabs it before T1 can)
cancels the work
qemu_cond_signal

T1
--
Grabs mutex_lock (from within cond_timedwait)
Now there is no work to do, the condition has changed before the
thread wakes up

The man page also states "however, if predictable scheduling behavior is
required, then that mutex shall be locked by the thread calling
pthread_cond_broadcast() or pthread_cond_signal()"

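If it helps, here is a tiny standalone pthread version of the same ordering
-- the names and the boolean stand-in for the request list are mine, and it
uses raw pthreads rather than the qemu wrappers, so treat it as a sketch
only. With the signal issued while the lock is held the hand-off is
explicit; with it issued after the unlock, a third thread can change the
condition in between and the worker wakes up only to find nothing to do
(harmless here purely because of the re-check loop):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool have_work;            /* stands in for !QTAILQ_EMPTY(request_list) */

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!have_work) {
        /* Like the threadlet loop, the condition is re-checked after every
         * wakeup, so a wakeup without work is wasted but not fatal. */
        pthread_cond_wait(&cond, &lock);
    }
    have_work = false;            /* "dequeue" the work */
    pthread_mutex_unlock(&lock);
    printf("worker: picked up work\n");
    return NULL;
}

/* T2 as in my rewrite above: signal with the lock held, so nothing can
 * change the condition between the state change and the signal. */
static void submit_signal_inside_lock(void)
{
    pthread_mutex_lock(&lock);
    have_work = true;             /* "add_work" */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

/* T2 as in the patch: unlock first, then signal.  In the window between
 * the unlock and the signal another thread (T3) can take the lock and
 * cancel the work, so the signalled worker may wake to an empty queue. */
static void submit_signal_outside_lock(void)
{
    pthread_mutex_lock(&lock);
    have_work = true;
    pthread_mutex_unlock(&lock);
    pthread_cond_signal(&cond);
}

int main(void)
{
    pthread_t t;

    pthread_create(&t, NULL, worker, NULL);
    submit_signal_inside_lock();  /* or submit_signal_outside_lock() */
    pthread_join(t, NULL);
    return 0;
}
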
> >>+/**
> >>+ * submit_threadletwork: Submit to the global queue a new task to be executed
> >>+ *                       asynchronously.
> >>+ * @work: Contains information about the task that needs to be submitted.
> >>+ */
> >>+void submit_threadletwork(ThreadletWork *work)
> >>+{
> >>+    if (unlikely(!globalqueue_init)) {
> >>+        threadlet_queue_init(&globalqueue, MAX_GLOBAL_THREADS,
> >>+                             MIN_GLOBAL_THREADS);
> >>+        globalqueue_init = 1;
> >>+    }
> >What protects globalqueue_init?
> 
> qemu_mutex, and that unlikely is almost certainly a premature optimization.
> 
> Regards,
> 
> Anthony Liguori
> 

-- 
        Three Cheers,
        Balbir