From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:47430)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <js@sig21.net>) id 1TLEe5-00080r-E1
	for qemu-devel@nongnu.org; Mon, 08 Oct 2012 10:49:30 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <js@sig21.net>) id 1TLEdz-00005S-9t
	for qemu-devel@nongnu.org; Mon, 08 Oct 2012 10:49:25 -0400
Received: from bar.sig21.net ([80.81.252.164]:59690)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <js@sig21.net>)
	id 1TLEdz-000051-3K
	for qemu-devel@nongnu.org; Mon, 08 Oct 2012 10:49:19 -0400
Date: Mon, 8 Oct 2012 16:49:09 +0200
From: Johannes Stezenbach <js@sig21.net>
Message-ID: <20121008144909.GA31171@sig21.net>
References: <3321480.8UDes0xfFC@segfault.sh0n.net>
	<50606FDF.3070408@redhat.com>
	<10559125.MRDnL6POYS@segfault.sh0n.net>
	<2581372.ig9fx04ALR@segfault.sh0n.net>
	<5072B8A0.9060700@redhat.com> <20121008130125.GA3622@sig21.net>
	<5072DA4C.8050708@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5072DA4C.8050708@redhat.com>
Subject: Re: [Qemu-devel] EHCI USB regression in 1.2.0 -
 ehci_state_fetchqtd() asserting
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Hans de Goede <hdegoede@redhat.com>
Cc: Shawn Starr <shawn.starr@rogers.com>, qemu-devel@nongnu.org, gerd@kraxel.org

Hi,

On Mon, Oct 08, 2012 at 03:51:08PM +0200, Hans de Goede wrote:
> On 10/08/2012 03:01 PM, Johannes Stezenbach wrote:
> >
> >There will always be a race between the call to USBDEVFS_DISCARDURB
> >and the URB completing.  IMHO the handling in usb_host_stop_n_free_iso()
> >is buggy.  How about dropping the "killed" and "free" variables and
> >calling async_complete() and g_free() unconditionally?
> 
> This race is well known and already handled correctly,

You mean the message about "leaking iso urbs" is wrong?
(since it will be freed later in async_complete, right?)

> the real problem is the
> "ehci warning: guest updated active QH" message, which most likely indicates
> that the guest has hit the doorbell (IAAD) in the EHCI controller, and then
> has not gotten an IAA interrupt within
> a certain amount of time triggering its IAAD watchdog (some real EHCI
> hardware is broken wrt delivering IAA interrupt) causing us to not see
> an unlinked qh as unlinked, and then later on triggering the
> "warning: guest updated active QH" message.
> 
> This is unavoidable when we get too large latencies, the ehci hardware
> simply was not designed to be virtualized, anything but actually.

OK, thanks for this explanation.
I haven't much clue about qemu but isn't the issue that qemu
delivers timer irqs to the guest (for EHCI_HRTIMER_IAA_WATCHDOG) while
failing to handle the IAAD -> IAA interrupt generation?
(via qemu_bh_schedule -> ehci_advance_async_state -> ehci_raise_irq,
why does ehci_raise_irq() not call ehci_update_irq() for USBSTS_IAA?)

If that cannot be fixed, have you tried asking the Linux
EHCI driver maintainer whether the EHCI_HRTIMER_IAA_WATCHDOG
timeout (10ms) can be increased or skipped entirely for non-broken hw?
(Linux commit 26f953fd884ea4879 suggests it's only for VIA chips)
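
To make the timing argument concrete, here is a toy model of the race as
I understand it. It is only a sketch, not QEMU or Linux code: the names
IAA_WATCHDOG_US, iaa_outcome and ring_doorbell() are made up for
illustration, and I am assuming the watchdog wins when both events fall
on the same microsecond. The only number taken from the discussion is
the 10ms watchdog timeout.

```c
#include <stdint.h>

/* Hypothetical toy model, not QEMU or Linux code. */
#define IAA_WATCHDOG_US 10000u  /* 10 ms guest watchdog timeout */

enum iaa_outcome {
    IAA_DELIVERED,      /* IAA interrupt reached the guest in time */
    IAA_WATCHDOG_FIRED  /* guest stopped waiting and ran its watchdog path */
};

/*
 * The guest rings the doorbell (IAAD) at t = 0 and arms its watchdog.
 * The emulated controller raises IAA after emu_latency_us microseconds.
 * Whichever event comes first decides the outcome; a tie is assumed to
 * go to the watchdog.
 */
static enum iaa_outcome ring_doorbell(uint32_t emu_latency_us)
{
    if (emu_latency_us < IAA_WATCHDOG_US) {
        return IAA_DELIVERED;
    }
    return IAA_WATCHDOG_FIRED;
}
```

In this model a latency of 1ms gives IAA_DELIVERED and a latency of 20ms
gives IAA_WATCHDOG_FIRED, which is the case where the guest would go on
to touch a QH that the emulated controller still considers linked.
Raising the timeout only moves the threshold; it does not remove the race.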


Thanks,
Johannes