Message-ID: <4FB2FAD0.603@codemonkey.ws>
Date: Tue, 15 May 2012 19:54:40 -0500
From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 08/13] iommu: Introduce IOMMU emulation infrastructure
To: Benjamin Herrenschmidt
Cc: Alex Williamson, Richard Henderson, "Michael S. Tsirkin", qemu-devel@nongnu.org, Eduard - Gabriel Munteanu
In-Reply-To: <1337128915.6727.112.camel@pasglop>

On 05/15/2012 07:41 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2012-05-15 at 18:58 -0500, Anthony Liguori wrote:
>
>> Even ancient PIO devices really don't block indefinitely.
>>
>>> In our case (TCEs) it's a hypervisor call, not an MMIO op, so to some
>>> extent it's even more likely to do "blocking" things.
>>
>> Yes, so I think the right thing to do is not model hypercalls for sPAPR as
>> synchronous calls but rather as asynchronous calls. Obviously, simple ones
>> can use a synchronous implementation...
>>
>> This is a matter of setting hlt=1 before dispatching the hypercall and
>> passing a continuation to the call that, when executed, prepares the
>> CPUState for the hypercall return and then sets hlt=0 to resume the CPU.
>
> Is there any reason not to set that hlt after the dispatch? I.e. from
> within the hypercall, for the very few that want to do asynchronous
> completion, do something like spapr_hcall_suspend() before returning?

You certainly could do that, but it may get a little weird dealing with the
return path. You'd have to return something like -EWOULDBLOCK and make sure
you handle that in the dispatch code appropriately.

>>> It would have been possible to implement a "busy" return status with the
>>> guest having to try again; unfortunately that's not how Linux has
>>> implemented it, so we are stuck with the current semantics.
>>>
>>> Now, if you think that dropping the lock isn't good, what do you reckon
>>> I should do?
>>
>> Add a reference count to dma map calls and a flush_pending flag. If
>> flush_pending && ref > 0, return NULL for all map calls.
>>
>> Decrement ref on unmap, and if ref == 0 and flush_pending, clear
>> flush_pending. You could add a flush_notifier too for this event.
>>
>> dma_flush() sets flush_pending if ref > 0. Your TCE flush hypercall would
>> register for flush notifications and squirrel away the hypercall
>> completion continuation.
>
> Ok, I'll look into it, thanks. Any good example to look at for how that
> continuation stuff works?

Just a callback and an opaque. You could look at the AIOCBs in the block
layer.
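The map-reference counting and flush_pending gating described above can be sketched as a small state machine. This is a minimal model only; `DMAContext`, `dma_map()`, `dma_unmap()`, and `dma_flush()` are illustrative names here, not the actual QEMU API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-IOMMU state; field names are illustrative. */
typedef struct DMAContext {
    int map_refs;                        /* outstanding map calls */
    bool flush_pending;                  /* flush requested while maps live */
    void (*flush_notify)(void *opaque);  /* continuation run when drained */
    void *notify_opaque;
} DMAContext;

/* Returns a mapping token, or NULL while a flush is draining. */
static void *dma_map(DMAContext *dma)
{
    if (dma->flush_pending) {
        return NULL;          /* refuse new maps until the flush completes */
    }
    dma->map_refs++;
    return dma;               /* stand-in for a real host pointer */
}

static void dma_unmap(DMAContext *dma)
{
    assert(dma->map_refs > 0);
    if (--dma->map_refs == 0 && dma->flush_pending) {
        dma->flush_pending = false;
        if (dma->flush_notify) {
            dma->flush_notify(dma->notify_opaque);  /* e.g. resume the vcpu */
        }
    }
}

/* Returns true if the flush completed immediately; otherwise the
 * caller's notifier fires once the last outstanding map is dropped. */
static bool dma_flush(DMAContext *dma, void (*notify)(void *), void *opaque)
{
    if (dma->map_refs > 0) {
        dma->flush_pending = true;
        dma->flush_notify = notify;
        dma->notify_opaque = opaque;
        return false;
    }
    return true;
}
```

A TCE flush hypercall would pass its completion continuation as the notifier and suspend the guest vcpu until it fires.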
Regards,

Anthony Liguori

>> VT-d actually has a concept of an invalidation completion queue which
>> delivers interrupt-based notification of invalidation completion events.
>> The above flush_notify would be the natural way to support this, since in
>> this case there is no VCPU event that's directly involved in the
>> completion event.
>
> Cheers,
> Ben.
>
>> Regards,
>>
>> Anthony Liguori
>>
>>> Cheers,
>>> Ben.
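[Editorial sketch] The asynchronous hypercall model discussed in this thread, halting the vcpu at dispatch and resuming it from a continuation, could look roughly like the following. The `CPUState` here is a toy stand-in (real QEMU CPUState is far larger), and `hcall_dispatch`/`hcall_complete` are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy vcpu state; gpr3 stands in for the sPAPR hypercall return register. */
typedef struct CPUState {
    bool halted;
    uint64_t gpr3;
} CPUState;

typedef void HCallContinuation(CPUState *cpu, uint64_t ret);

/* Completion continuation: stash the return value and let the vcpu run. */
static void hcall_complete(CPUState *cpu, uint64_t ret)
{
    cpu->gpr3 = ret;
    cpu->halted = false;       /* hlt=0: resume the CPU */
}

/* Dispatch wrapper: halt the vcpu before calling the handler.  Handlers
 * that finish synchronously simply invoke the continuation before
 * returning; asynchronous ones squirrel it away for later. */
static void hcall_dispatch(CPUState *cpu,
                           void (*handler)(CPUState *, HCallContinuation *))
{
    cpu->halted = true;        /* hlt=1: vcpu sleeps until completion */
    handler(cpu, hcall_complete);
}

/* Example synchronous handler: completes immediately. */
static void h_noop(CPUState *cpu, HCallContinuation *done)
{
    done(cpu, 0 /* success */);
}
```

An asynchronous handler (e.g. a TCE flush waiting on outstanding maps) would store the continuation and call it from its flush notifier instead of before returning.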