From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:52087) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TZLwt-000093-8P for qemu-devel@nongnu.org; Fri, 16 Nov 2012 08:27:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1TZLwq-00054w-62 for qemu-devel@nongnu.org; Fri, 16 Nov 2012 08:27:11 -0500 Received: from mx1.redhat.com ([209.132.183.28]:13113) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TZLwp-00054q-UC for qemu-devel@nongnu.org; Fri, 16 Nov 2012 08:27:08 -0500 Message-ID: <50A63F9F.7000309@redhat.com> Date: Fri, 16 Nov 2012 14:29:03 +0100 From: Hans de Goede MIME-Version: 1.0 References: <50A50806.5020309@redhat.com> <50A50CD7.4070104@siemens.com> In-Reply-To: <50A50CD7.4070104@siemens.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] "usb: uhci: Look up queue by address, not token" List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jan Kiszka Cc: Gerd Hoffmann , "qemu-devel@nongnu.org" Hi, On 11/15/2012 04:40 PM, Jan Kiszka wrote: > Hi Hans, > > On 2012-11-15 16:19, Hans de Goede wrote: >> Hi Jan, >> >> I just saw your $subject patch in Gerd's usb-next tree, and I've a question >> about it. The token should be enough to uniquely identify a device + ep, >> and unless a guest uses multiple qhs for a singe ep, that _should_ be enough. > > But what disallows that the guest issues multiple requests (QH + series > of TDs) for a single endpoint? I'm not finding any trace in the spec > that disallows this. It may not be explicitly forbidden in the spec, but the whole text of the spec implies a 1 on 1 relationship between qhs and eps, and this is how all major guests do it. > And my special guest is stumbling over that limitation in QEMU. I guess you're not at liberty to disclose what guest this is ? >> The reason why I want to know, is that identifying the queue by qh has a >> disadvantage, to be precise I believe the following can then happen: >> >> 1) The guest queues up multiple requests for a queue >> 2) We execute one, go async >> 3) The guest changes it mind an unlinks the qh >> 4) The guest will think the queue is cancelled after frindex has >> changed by 1, but we keep it around for 64 frames (which sucks, >> and I want to improve on this, but we need to for various reasons) >> 5) The guests submits some new requests to the queue, using a >> new qh (which possibly has a different address). >> 6) We see the new qh, and execute a request from there >> 7) The 1st request on the old qh completes, and we execute the next >> 8) Now things are a mess as we're executing requests from the old >> (cancelled from the guest pov) and new queue intermixed... >> >> Using the token to identify the queue fixes this, cause we will >> find the same queue for the old and new qh, and uhci_queue_verify() >> will then fail because of the qh addr change, and then we cancel >> the old queue. IOW then we do the right thing. > > I was afraid of triggering some conflict but, as pointed out above, the > current code also does something wrong. It associates two different > active QH with the same queue and then triggers a failure for the first > QH as it wrongly thinks the guest has unlinked it. > >> >> So I'm wondering if there is another, potentially better fix for >> what you are seeing? > > /me too. I'm not fully understanding your scenario above yet, > specifically when and how the guest can correctly remove an active QH, A guest can unlink an active qh when it wants to cancel the requests in there (ie a driver level timeout kicks in). This is a quite normal thing to do (unlike using multiple qhs for a single ep). For ehci we can reliable detect this happening (except for interrupt endpoints, sigh) thanks to the "doorbell" mechanism. But for uhci it is harder, and things are further complicated by the fact that Linux' full speed recovery code for bulk endpoints temporarily unlinks and then relinks a pending requests to make some changes to the qh while requests are in flight ... (bad Linux, bad!). So we rely on a combination of measures to detect cancels in the qemu uhci code, of which the qh validation is one, and this gets broken by your patch. > so I cannot provide good suggestions at this point. I see 2 solutions at this point: 1) You modify your special guest if you can 2) We add a property to the uhci emulation to search for queues by address rather then by token, which would of course default to false. Regards, Hans