Message-ID: <4CBE661B.1010205@linux.vnet.ibm.com>
Date: Tue, 19 Oct 2010 20:46:35 -0700
From: "Venkateswararao Jujjuri (JV)"
Subject: Re: [Qemu-devel] [PATCH 1/3] Introduce threadlets
References: <20101019173946.16514.62027.stgit@localhost6.localdomain6>
 <20101019174245.16514.14542.stgit@localhost6.localdomain6>
 <20101019183644.GI15844@balbir.in.ibm.com>
 <4CBE0F5F.3020204@codemonkey.ws>
 <20101020022217.GL15844@balbir.in.ibm.com>
In-Reply-To: <20101020022217.GL15844@balbir.in.ibm.com>
List-Id: qemu-devel.nongnu.org
To: balbir@linux.vnet.ibm.com
Cc: Arun R Bharadwaj , qemu-devel@nongnu.org

On 10/19/2010 7:22 PM, Balbir Singh wrote:
> * Anthony Liguori [2010-10-19
> 16:36:31]:
>
>> On 10/19/2010 01:36 PM, Balbir Singh wrote:
>>>> +    qemu_mutex_lock(&(queue->lock));
>>>> +    while (1) {
>>>> +        ThreadletWork *work;
>>>> +        int ret = 0;
>>>> +
>>>> +        while (QTAILQ_EMPTY(&(queue->request_list)) &&
>>>> +               (ret != ETIMEDOUT)) {
>>>> +            ret = qemu_cond_timedwait(&(queue->cond),
>>>> +                                      &(queue->lock), 10*100000);
>>> Ewww... what is 10*100000, can we use something more meaningful
>>> please?
>>
>> A define is fine but honestly, it's pretty darn obvious what it means...
>>
>>>> +        }
>>>> +
>>>> +        assert(queue->idle_threads != 0);
>>> This assertion holds because we believe one of the idle_threads
>>> actually did the dequeuing, right?
>>
>> An idle thread is one that is not doing work.  At this
>> point in the code, we are not doing any work (yet), so if the
>> idle_threads count is zero, something is horribly wrong.  We're also
>> going to unconditionally decrement it in the future code path, which
>> means that if idle_threads is 0, it's going to become -1.
>>
>> The use of idle_threads is to detect whether it's necessary to spawn
>> an additional thread.
>>
>
> We can hit this assert if pthread_cond_signal() is called outside of
> the mutex; let me try and explain below.
>
>>>> +        if (QTAILQ_EMPTY(&(queue->request_list))) {
>>>> +            if (queue->cur_threads > queue->min_threads) {
>>>> +                /* We retain the minimum number of threads */
>>>> +                break;
>>>> +            }
>>>> +        } else {
>>>> +            work = QTAILQ_FIRST(&(queue->request_list));
>>>> +            QTAILQ_REMOVE(&(queue->request_list), work, node);
>>>> +
>>>> +            queue->idle_threads--;
>>>> +            qemu_mutex_unlock(&(queue->lock));
>>>> +
>>>> +            /* execute the work function */
>>>> +            work->func(work);
>>>> +
>>>> +            qemu_mutex_lock(&(queue->lock));
>>>> +            queue->idle_threads++;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    queue->idle_threads--;
>>>> +    queue->cur_threads--;
>>>> +    qemu_mutex_unlock(&(queue->lock));
>>>> +
>>>> +    return NULL;
>>> Does anybody do a join on the exiting thread from the pool?
>>
>> No.
>> The thread is created in a detached state.
>>
>
> That makes sense, thanks for clarifying.
>
>>>> +}
>>>> +
>>>> +static void spawn_threadlet(ThreadletQueue *queue)
>>>> +{
>>>> +    QemuThread thread;
>>>> +
>>>> +    queue->cur_threads++;
>>>> +    queue->idle_threads++;
>>>> +
>>>> +    qemu_thread_create(&thread, threadlet_worker, queue);
>>>> +}
>>>> +
>>>> +/**
>>>> + * submit_threadletwork_to_queue: Submit a new task to a private queue to be
>>>> + *                                executed asynchronously.
>>>> + * @queue: Per-subsystem private queue to which the new task needs
>>>> + *         to be submitted.
>>>> + * @work: Contains information about the task that needs to be submitted.
>>>> + */
>>>> +void submit_threadletwork_to_queue(ThreadletQueue *queue, ThreadletWork *work)
>>>> +{
>>>> +    qemu_mutex_lock(&(queue->lock));
>>>> +    if (queue->idle_threads == 0 && queue->cur_threads < queue->max_threads) {
>>>> +        spawn_threadlet(queue);
>>> So we hold queue->lock, spawn the thread, the spawned thread tries to
>>> acquire queue->lock.
>>
>> Yup.
>>
>>>> +    }
>>>> +    QTAILQ_INSERT_TAIL(&(queue->request_list), work, node);
>>>> +    qemu_mutex_unlock(&(queue->lock));
>>>> +    qemu_cond_signal(&(queue->cond));
>>> In the case that we just spawned the threadlet, the cond_signal is
>>> spurious.  If we need predictable scheduling behaviour,
>>> qemu_cond_signal needs to happen with queue->lock held.
>>
>> It doesn't really affect predictability...
>>
>>> I'd rewrite the function as
>>>
>>> /**
>>>  * submit_threadletwork_to_queue: Submit a new task to a private queue to be
>>>  *                                executed asynchronously.
>>>  * @queue: Per-subsystem private queue to which the new task needs
>>>  *         to be submitted.
>>>  * @work: Contains information about the task that needs to be submitted.
>>>  */
>>> void submit_threadletwork_to_queue(ThreadletQueue *queue, ThreadletWork *work)
>>> {
>>>     qemu_mutex_lock(&(queue->lock));
>>>     if (queue->idle_threads == 0 && (queue->cur_threads < queue->max_threads)) {
>>>         spawn_threadlet(queue);
>>>     } else {
>>>         qemu_cond_signal(&(queue->cond));
>>>     }
>>>     QTAILQ_INSERT_TAIL(&(queue->request_list), work, node);
>>>     qemu_mutex_unlock(&(queue->lock));
>>> }
>>
>> I think this is a lot more fragile.  You're relying on the fact that
>> the signal will not cause the signalled thread to actually awaken until
>> we release the lock, and that the work done after signalling completes
>> before the signalled thread wakes up.
>>
>> I think you're a lot more robust in the long term if you treat
>> condition signalling as a hand-off point, because it makes the code a
>> lot more explicit about what's happening.
>>
>
> OK, here is a situation that can happen:
>
> T1                              T2
> ---                             ---
> threadlet                       submit_threadletwork_to_queue
> (sees condition as no work)     mutex_lock
> qemu_cond_timedwait             add_work
> ...                             mutex_unlock
>
> T3
> --
> cancel_threadlet_work_on_queue
> mutex_lock (grabs it) before T1 can
> cancels the work
>
> qemu_cond_signal
>
> T1
> --
> Grabs mutex_lock (from within cond_timedwait)
> Now there is no work to do; the condition
> has changed before the thread wakes up.

So what? It won't find any work and will go back to sleep or exit.
idle_threads is decremented only in threadlet_worker(). Given that we
have a threadlet that is not doing any work, the assert should never
hit unless something is horribly wrong.

- JV

>
> The man page also states:
>
> "however, if predictable scheduling behavior is required, then that
> mutex shall be locked by the thread calling pthread_cond_broadcast()
> or pthread_cond_signal()"
>
>>>> +/**
>>>> + * submit_threadletwork: Submit to the global queue a new task to be executed
>>>> + *                       asynchronously.
>>>> + * @work: Contains information about the task that needs to be submitted.
>>>> + */
>>>> +void submit_threadletwork(ThreadletWork *work)
>>>> +{
>>>> +    if (unlikely(!globalqueue_init)) {
>>>> +        threadlet_queue_init(&globalqueue, MAX_GLOBAL_THREADS,
>>>> +                             MIN_GLOBAL_THREADS);
>>>> +        globalqueue_init = 1;
>>>> +    }
>>> What protects globalqueue_init?
>>
>> qemu_mutex, and that unlikely is almost certainly a premature optimization.
>>
>> Regards,
>>
>> Anthony Liguori
>>
>