* [Qemu-devel] xhci migration breakage
@ 2017-03-30 12:23 Dr. David Alan Gilbert
2017-03-30 14:10 ` Gerd Hoffmann
0 siblings, 1 reply; 5+ messages in thread
From: Dr. David Alan Gilbert @ 2017-03-30 12:23 UTC (permalink / raw)
To: kraxel; +Cc: qemu-devel
Hi Gerd,
2.9 seems to have an xhci migration breakage, I've bisected it
down to:
commit ddb603ab6c981c1d67cb42266fc700c33e5b2d8f
Author: Gerd Hoffmann <kraxel@redhat.com>
Date: Mon Jan 30 16:36:46 2017 +0100
xhci: don't kick in xhci_submit and xhci_fire_ctl_transfer
I guess maybe there's a kick needed pre/post migrate?
(it's https://bugzilla.redhat.com/show_bug.cgi?id=1436616 )
The symptoms on the destination are that the qemu prints:
qemu-kvm: usb-msd: Bad signature 53425300
and the guest sees:
xhci_hcd 0000:00:06.0: WARN Event TRB for slot 1 ep 0 with no TDs queued?
xhci_hcd 0000:00:06.0: WARN Event TRB for slot 1 ep 2 with no TDs queued?
xhci_hcd 0000:00:06.0: WARN Event TRB for slot 1 ep 3 with no TDs queued?
and never gets any more IO on the USB disk.
Any ideas?
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] xhci migration breakage
2017-03-30 12:23 [Qemu-devel] xhci migration breakage Dr. David Alan Gilbert
@ 2017-03-30 14:10 ` Gerd Hoffmann
2017-03-30 17:42 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 5+ messages in thread
From: Gerd Hoffmann @ 2017-03-30 14:10 UTC (permalink / raw)
To: Dr. David Alan Gilbert; +Cc: qemu-devel
Hi,
> I guess maybe there's a kick needed pre/post migrate?
Sure, but it is there already (post_load arms kick_timer for active
endpoints), so it isn't *that* simple.
Looking at this I have no idea offhand, guess I have to schedule some
debugging for tomorrow. Keep me posted in case you continue digging.
cheers,
Gerd
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] xhci migration breakage
2017-03-30 14:10 ` Gerd Hoffmann
@ 2017-03-30 17:42 ` Dr. David Alan Gilbert
2017-03-31 5:45 ` Gerd Hoffmann
0 siblings, 1 reply; 5+ messages in thread
From: Dr. David Alan Gilbert @ 2017-03-30 17:42 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: qemu-devel
* Gerd Hoffmann (kraxel@redhat.com) wrote:
> Hi,
>
> > I guess maybe there's a kick needed pre/post migrate?
>
> Sure, but it is there already (post_load arms kick_timer for active
> endpoints), so it isn't *that* simple.
>
> Looking at this I have no idea offhand, guess I have to schedule some
> debugging for tomorrow. Keep me posted in case you continue digging.
OK, I don't know any of this code, but my poking about so far is:
source
....
xhci: setup packet pid 0x69 addr 1 ep 1
20095@1490894271.183259:usb_xhci_ep_kick slotid 1, epid 4, streamid 0
xhci: set epctx: 35497080 state=1 dequeue=0000000230cf2f61
20095@1490894271.183284:usb_xhci_xfer_start 0x7f302931fef0: slotid 1, epid 4, streamid 0
xhci_submit(slotid=1,epid=4)
xhci: setup packet pid 0xe1 addr 1 ep 2
20095@1490894271.183404:usb_xhci_ep_kick slotid 1, epid 3, streamid 0
xhci: set epctx: 35497060 state=1 dequeue=0000000230cf3331
20095@1490894271.183420:usb_xhci_xfer_start 0x7f302931fef0: slotid 1, epid 3, streamid 0
xhci_submit(slotid=1,epid=3)
xhci: setup packet pid 0x69 addr 1 ep 1
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer:
(qemu) migra
migrate migrate_cancel migrate_incoming
migrate_set_cache_size migrate_set_capability migrate_set_downtime
migrate_set_parameter migrate_set_speed migrate_start_postcopy
(qemu) migrate_set_speed 1G
(qemu) migrate tcp:0:4444
usb-storage msd pre_save 0x7f3044627960 sig=0 tag=0
(qemu) q
dest:
(qemu) red_dispatcher_loadvm_commands:
usb-storage msd post_load 0x7f45f510ee50 sig=0 tag=0
I added these as debug on the 'usb-storage' vmsd element; but
the signature/tag being 0 I assume they're not the culprit.
(qemu) c
(qemu) 20107@1490894287.479732:usb_xhci_ep_kick slotid 1, epid 1, streamid 0
xhci: set epctx: 35497020 state=1 dequeue=0000000035499231
20107@1490894287.479763:usb_xhci_xfer_start 0x7f45f556a800: slotid 1, epid 1, streamid 0
xhci: setup packet pid 0x69 addr 1 ep 0
20107@1490894287.479799:usb_xhci_ep_kick slotid 1, epid 3, streamid 0
xhci: set epctx: 35497060 state=1 dequeue=0000000230cf3331
20107@1490894287.479812:usb_xhci_xfer_start 0x7f45f556a800: slotid 1, epid 3, streamid 0
xhci_submit(slotid=1,epid=3)
xhci: setup packet pid 0x69 addr 1 ep 1
xhci: set epctx: 35497060 state=2 dequeue=0000000230cf3331
20107@1490894287.479837:usb_xhci_ep_kick slotid 1, epid 4, streamid 0
xhci: set epctx: 35497080 state=1 dequeue=0000000230cf2f61
20107@1490894287.479848:usb_xhci_xfer_start 0x7f45f556a800: slotid 1, epid 4, streamid 0
xhci_submit(slotid=1,epid=4)
xhci: setup packet pid 0xe1 addr 1 ep 2
qemu-system-x86_64: usb-msd: Bad signature 53425300
given the bottom byte is 0 - is anything supposed to poison these things when
they're finished with them?
The backtrace (where I added an assert) shows it is coming off the timer
(Note I think this is a different run from that above):
#3 0x00007feba72c91f2 in __GI___assert_fail (assertion=assertion@entry=0x7febabcc1f3a "0", file=file@entry=0x7febabcb1f90 "/root/1436616/qemu/hw/usb/dev-storage.c", line=line@entry=429, function=function@entry=0x7febabcb20f0 <__PRETTY_FUNCTION__.30660> "usb_msd_handle_data") at assert.c:101
#4 0x00007febabad00e1 in usb_msd_handle_data (dev=0x7febae693e50, p=0x7febaeaef808)
at /root/1436616/qemu/hw/usb/dev-storage.c:429
#5 0x00007febabab2b58 in usb_handle_packet (dev=<optimized out>, p=p@entry=0x7febaeaef808)
at /root/1436616/qemu/hw/usb/core.c:412
#6 0x00007febabacae27 in xhci_kick_epctx (epctx=<optimized out>, xfer=<optimized out>, xhci=<optimized out>)
at /root/1436616/qemu/hw/usb/hcd-xhci.c:2103 <- right at the end of xhci_submit
#7 0x00007febabacae27 in xhci_kick_epctx (epctx=<optimized out>, xfer=<optimized out>, xhci=<optimized out>)
at /root/1436616/qemu/hw/usb/hcd-xhci.c:2112 right at end of xhci_fire_transfer
#8 0x00007febabacae27 in xhci_kick_epctx (epctx=0x7febaec395a0, streamid=0)
at /root/1436616/qemu/hw/usb/hcd-xhci.c:2240
#9 0x00007febabb4c8b9 in timerlist_run_timers (timer_list=0x7febad2ad790) at /root/1436616/qemu/qemu-timer.c:540
#10 0x00007febabb4cb20 in qemu_clock_run_all_timers (type=<optimized out>) at /root/1436616/qemu/qemu-timer.c:551
#11 0x00007febabb4cb20 in qemu_clock_run_all_timers () at /root/1436616/qemu/qemu-timer.c:665
#12 0x00007febabb4bae7 in main_loop_wait (nonblocking=<optimized out>) at /root/1436616/qemu/main-loop.c:516
#13 0x00007febab8855e2 in main () at /root/1436616/qemu/vl.c:1888
#14 0x00007febab8855e2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
at /root/1436616/qemu/vl.c:4630
Dave
> cheers,
> Gerd
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] xhci migration breakage
2017-03-30 17:42 ` Dr. David Alan Gilbert
@ 2017-03-31 5:45 ` Gerd Hoffmann
2017-03-31 8:54 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 5+ messages in thread
From: Gerd Hoffmann @ 2017-03-31 5:45 UTC (permalink / raw)
To: Dr. David Alan Gilbert; +Cc: qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1021 bytes --]
Hi,
> dest:
> (qemu) red_dispatcher_loadvm_commands:
> usb-storage msd post_load 0x7f45f510ee50 sig=0 tag=0
>
> I added these as debug on the 'usb-storage' vmsd element; but
> the signature/tag being 0 I assume they're not the culprit.
Yep, no inflight transfer, so highly unlikely (also note that ehci works
fine according to bugzilla).
> #6 0x00007febabacae27 in xhci_kick_epctx (epctx=<optimized out>, xfer=<optimized out>, xhci=<optimized out>)
> at /root/1436616/qemu/hw/usb/hcd-xhci.c:2103 <- right at the end of xhci_submit
> #7 0x00007febabacae27 in xhci_kick_epctx (epctx=<optimized out>, xfer=<optimized out>, xhci=<optimized out>)
> at /root/1436616/qemu/hw/usb/hcd-xhci.c:2112 right at end of xhci_fire_transfer
> #8 0x00007febabacae27 in xhci_kick_epctx (epctx=0x7febaec395a0, streamid=0)
> at /root/1436616/qemu/hw/usb/hcd-xhci.c:2240
Recursive xhci_kick_epctx calls, not good.
Does the attached patch (not tested yet) make any difference?
cheers,
Gerd
[-- Attachment #2: 0001-xhci-set-kick_active-for-retried-xfers-too.patch --]
[-- Type: text/x-patch, Size: 2779 bytes --]
From 1b10b10445ab438c3fa1107f9962add3ae0716c6 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kraxel@redhat.com>
Date: Fri, 31 Mar 2017 07:40:45 +0200
Subject: [PATCH] xhci: set kick_active for retried xfers too
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
hw/usb/hcd-xhci.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index f0af852..8958f95 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -2083,6 +2083,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
return;
}
+ epctx->kick_active++;
if (epctx->retry) {
XHCITransfer *xfer = epctx->retry;
@@ -2093,6 +2094,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
mfindex = xhci_mfindex_get(xhci);
xhci_check_intr_iso_kick(xhci, xfer, epctx, mfindex);
if (xfer->running_retry) {
+ epctx->kick_active--;
return;
}
xfer->timed_xfer = 0;
@@ -2101,6 +2103,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
if (xfer->iso_xfer) {
/* retry iso transfer */
if (xhci_setup_packet(xfer) < 0) {
+ epctx->kick_active--;
return;
}
usb_handle_packet(xfer->packet.ep->dev, &xfer->packet);
@@ -2109,10 +2112,12 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
} else {
/* retry nak'ed transfer */
if (xhci_setup_packet(xfer) < 0) {
+ epctx->kick_active--;
return;
}
usb_handle_packet(xfer->packet.ep->dev, &xfer->packet);
if (xfer->packet.status == USB_RET_NAK) {
+ epctx->kick_active--;
return;
}
xhci_try_complete_packet(xfer);
@@ -2126,6 +2131,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
if (epctx->state == EP_HALTED) {
DPRINTF("xhci: ep halted, not running schedule\n");
+ epctx->kick_active--;
return;
}
@@ -2134,6 +2140,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
uint32_t err;
stctx = xhci_find_stream(epctx, streamid, &err);
if (stctx == NULL) {
+ epctx->kick_active--;
return;
}
ring = &stctx->ring;
@@ -2145,7 +2152,6 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
}
assert(ring->dequeue != 0);
- epctx->kick_active++;
while (1) {
length = xhci_ring_chain_length(xhci, ring);
if (length <= 0) {
--
1.8.3.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [Qemu-devel] xhci migration breakage
2017-03-31 5:45 ` Gerd Hoffmann
@ 2017-03-31 8:54 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 5+ messages in thread
From: Dr. David Alan Gilbert @ 2017-03-31 8:54 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: qemu-devel
* Gerd Hoffmann (kraxel@redhat.com) wrote:
> Hi,
>
> > dest:
> > (qemu) red_dispatcher_loadvm_commands:
> > usb-storage msd post_load 0x7f45f510ee50 sig=0 tag=0
> >
> > I added these as debug on the 'usb-storage' vmsd element; but
> > the signature/tag being 0 I assume they're not the culprit.
>
> Yep, no inflight transfer, so highly unlikely (also note that ehci works
> fine according to bugzilla).
>
> > #6 0x00007febabacae27 in xhci_kick_epctx (epctx=<optimized out>, xfer=<optimized out>, xhci=<optimized out>)
> > at /root/1436616/qemu/hw/usb/hcd-xhci.c:2103 <- right at the end of xhci_submit
> > #7 0x00007febabacae27 in xhci_kick_epctx (epctx=<optimized out>, xfer=<optimized out>, xhci=<optimized out>)
> > at /root/1436616/qemu/hw/usb/hcd-xhci.c:2112 right at end of xhci_fire_transfer
> > #8 0x00007febabacae27 in xhci_kick_epctx (epctx=0x7febaec395a0, streamid=0)
> > at /root/1436616/qemu/hw/usb/hcd-xhci.c:2240
>
> Recursive xhci_kick_epctx calls, not good.
>
> Does the attached patch (not tested yet) make any difference?
Nope.
Source:
xhci_submit(slotid=1,epid=3)
xhci: setup packet pid 0x69 addr 1 ep 1
30724@1490950388.025424:usb_xhci_ep_kick slotid 1, epid 4, streamid 0
xhci: set epctx: 23fe22080 state=1 dequeue=0000000035460f41
30724@1490950388.025449:usb_xhci_xfer_start 0x7fbf642facc0: slotid 1, epid 4, streamid 0
xhci_submit(slotid=1,epid=4)
xhci: setup packet pid 0xe1 addr 1 ep 2
30724@1490950388.025649:usb_xhci_ep_kick slotid 1, epid 3, streamid 0
xhci: set epctx: 23fe22060 state=1 dequeue=000000003550e421
30724@1490950388.025674:usb_xhci_xfer_start 0x7fbf642facc0: slotid 1, epid 3, streamid 0
xhci_submit(slotid=1,epid=3)
xhci: setup packet pid 0x69 addr 1 ep 1
30724@1490950388.025790:usb_xhci_ep_kick slotid 1, epid 3, streamid 0
xhci: set epctx: 23fe22060 state=1 dequeue=000000003550e431
30724@1490950388.025807:usb_xhci_xfer_start 0x7fbf642facc0: slotid 1, epid 3, streamid 0
xhci_submit(slotid=1,epid=3)
xhci: setup packet pid 0x69 addr 1 ep 1
30724@1490950388.026700:usb_xhci_ep_kick slotid 1, epid 4, streamid 0
xhci: set epctx: 23fe22080 state=1 dequeue=0000000035460f51
30724@1490950388.026739:usb_xhci_xfer_start 0x7fbf680008f0: slotid 1, epid 4, streamid 0
xhci_submit(slotid=1,epid=4)
xhci: setup packet pid 0xe1 addr 1 ep 2
30724@1490950388.026910:usb_xhci_ep_kick slotid 1, epid 3, streamid 0
xhci: set epctx: 23fe22060 state=1 dequeue=000000003550e441
30724@1490950388.026936:usb_xhci_xfer_start 0x7fbf680008f0: slotid 1, epid 3, streamid 0
xhci_submit(slotid=1,epid=3)
xhci: setup packet pid 0x69 addr 1 ep 1
Dest:
QEMU 2.8.92 monitor - type 'help' for more information
(qemu) red_dispatcher_loadvm_commands:
usb-storage msd post_load 0x7fbcde35def0 sig=0 tag=0
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 210.332000 ms, bitrate 1656610 bps (1.579866 Mbps) LOW BANDWIDTH
red_dispatcher_set_cursor_peer:
inputs_connect: inputs channel client create
(qemu) c
(qemu) 30736@1490950422.138659:usb_xhci_ep_kick slotid 1, epid 1, streamid 0
xhci: set epctx: 23fe22020 state=1 dequeue=0000000230c87231
30736@1490950422.138695:usb_xhci_xfer_start 0x7fbcdd8cdee0: slotid 1, epid 1, streamid 0
xhci: setup packet pid 0x69 addr 1 ep 0
30736@1490950422.138731:usb_xhci_ep_kick slotid 1, epid 3, streamid 0
xhci: set epctx: 23fe22060 state=1 dequeue=000000003550e441
30736@1490950422.138746:usb_xhci_xfer_start 0x7fbcddaf8f20: slotid 1, epid 3, streamid 0
xhci_submit(slotid=1,epid=3)
xhci: setup packet pid 0x69 addr 1 ep 1
xhci: set epctx: 23fe22060 state=2 dequeue=000000003550e441
30736@1490950422.138769:usb_xhci_ep_kick slotid 1, epid 4, streamid 0
xhci: set epctx: 23fe22080 state=1 dequeue=0000000035460f51
30736@1490950422.138781:usb_xhci_xfer_start 0x7fbcdd76aff0: slotid 1, epid 4, streamid 0
xhci_submit(slotid=1,epid=4)
xhci: setup packet pid 0xe1 addr 1 ep 2
qemu-system-x86_64: usb-msd: Bad signature 53425300
Dave
> cheers,
> Gerd
>
> From 1b10b10445ab438c3fa1107f9962add3ae0716c6 Mon Sep 17 00:00:00 2001
> From: Gerd Hoffmann <kraxel@redhat.com>
> Date: Fri, 31 Mar 2017 07:40:45 +0200
> Subject: [PATCH] xhci: set kick_active for retried xfers too
>
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
> hw/usb/hcd-xhci.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
> index f0af852..8958f95 100644
> --- a/hw/usb/hcd-xhci.c
> +++ b/hw/usb/hcd-xhci.c
> @@ -2083,6 +2083,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
> return;
> }
>
> + epctx->kick_active++;
> if (epctx->retry) {
> XHCITransfer *xfer = epctx->retry;
>
> @@ -2093,6 +2094,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
> mfindex = xhci_mfindex_get(xhci);
> xhci_check_intr_iso_kick(xhci, xfer, epctx, mfindex);
> if (xfer->running_retry) {
> + epctx->kick_active--;
> return;
> }
> xfer->timed_xfer = 0;
> @@ -2101,6 +2103,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
> if (xfer->iso_xfer) {
> /* retry iso transfer */
> if (xhci_setup_packet(xfer) < 0) {
> + epctx->kick_active--;
> return;
> }
> usb_handle_packet(xfer->packet.ep->dev, &xfer->packet);
> @@ -2109,10 +2112,12 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
> } else {
> /* retry nak'ed transfer */
> if (xhci_setup_packet(xfer) < 0) {
> + epctx->kick_active--;
> return;
> }
> usb_handle_packet(xfer->packet.ep->dev, &xfer->packet);
> if (xfer->packet.status == USB_RET_NAK) {
> + epctx->kick_active--;
> return;
> }
> xhci_try_complete_packet(xfer);
> @@ -2126,6 +2131,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
>
> if (epctx->state == EP_HALTED) {
> DPRINTF("xhci: ep halted, not running schedule\n");
> + epctx->kick_active--;
> return;
> }
>
> @@ -2134,6 +2140,7 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
> uint32_t err;
> stctx = xhci_find_stream(epctx, streamid, &err);
> if (stctx == NULL) {
> + epctx->kick_active--;
> return;
> }
> ring = &stctx->ring;
> @@ -2145,7 +2152,6 @@ static void xhci_kick_epctx(XHCIEPContext *epctx, unsigned int streamid)
> }
> assert(ring->dequeue != 0);
>
> - epctx->kick_active++;
> while (1) {
> length = xhci_ring_chain_length(xhci, ring);
> if (length <= 0) {
> --
> 1.8.3.1
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-03-31 8:55 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-30 12:23 [Qemu-devel] xhci migration breakage Dr. David Alan Gilbert
2017-03-30 14:10 ` Gerd Hoffmann
2017-03-30 17:42 ` Dr. David Alan Gilbert
2017-03-31 5:45 ` Gerd Hoffmann
2017-03-31 8:54 ` Dr. David Alan Gilbert
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).