* Xen pci-passthrough problem with pci-detach and pci-assignable-remove
@ 2014-01-10 14:51 Sander Eikelenboom
  2014-01-10 15:12 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-01-10 14:51 UTC
To: Konrad Rzeszutek Wilk; +Cc: xen-devel

Hi Konrad,

Normally I never reattach PCI devices to dom0, but at the moment I have some use for it.

But it seems pci-detach isn't completely detaching the device from the guest.

- Say I have an HVM guest with domid=2 and a passed-through PCI device with BDF 00:19.0; the device is hidden at boot with xen-pciback.hide=(00:19.0) in grub.

- Now I do "xl pci-assignable-list".
  This returns nothing, which is correct since all hidden devices have already been assigned to guests.

- Then I do "xl -v pci-detach 2 00:19.0",
  which also returns nothing ...

- Now I do "xl pci-assignable-list" again.
  This returns:
  "0000:00:19.0"
  So the pci-detach does seem to have done *something* :-)

- But when I now try to remove the device from pciback to dom0 with "xl pci-assignable-remove 00:19.0", it gives an error
  and later it gives some stacktraces:

  xen_pciback: ****** removing device 0000:00:19.0 while still in-use! ******
  xen_pciback: ****** driver domain may still access this device's i/o resources!
  xen_pciback: ****** shutdown driver domain before binding device
  xen_pciback: ****** to other drivers or domains

When I shut the guest down instead of using pci-detach, "xl pci-assignable-remove" works fine and I can rebind the device to its driver in dom0.

So am I misreading the wiki, and is it not possible to detach a device from a running domain, or ... ?

Oh yes, I'm running xen-unstable and a 3.13-rc7 kernel.

--
Sander
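The reproduction steps in the report above can be sketched as a small shell helper. This is only an illustration: the xl commands, the domid and the BDF are taken from the report, while the `run` wrapper and the `DRY_RUN` switch are hypothetical additions (not part of xl) so that the sequence can be printed without touching a real host.

```shell
#!/bin/sh
# Sketch of the reported sequence. DOMID and BDF come from the report;
# DRY_RUN and run() are illustrative helpers, not part of xl.
DOMID="${DOMID:-2}"
BDF="${BDF:-00:19.0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

reproduce() {
    run xl pci-assignable-list             # reported: prints nothing
    run xl -v pci-detach "$DOMID" "$BDF"   # hot-unplug from the running guest
    run xl pci-assignable-list             # reported: prints 0000:00:19.0
    run xl pci-assignable-remove "$BDF"    # reported: warns, then hangs
}
```

Calling `reproduce` with the default `DRY_RUN=1` only prints the four commands; setting `DRY_RUN=0` on a dom0 would actually run them.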
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove
  2014-01-10 14:51 Xen pci-passthrough problem with pci-detach and pci-assignable-remove Sander Eikelenboom
@ 2014-01-10 15:12 ` Konrad Rzeszutek Wilk
  2014-01-10 15:57 ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-01-10 15:12 UTC
To: Sander Eikelenboom; +Cc: xen-devel

On Fri, Jan 10, 2014 at 03:51:57PM +0100, Sander Eikelenboom wrote:
> Hi Konrad,
>
> Normally I never reattach PCI devices to dom0, but at the moment I have some use for it.
>
> But it seems pci-detach isn't completely detaching the device from the guest.
>
> - Say I have an HVM guest with domid=2 and a passed-through PCI device with BDF 00:19.0; the device is hidden at boot with xen-pciback.hide=(00:19.0) in grub.
>
> - Now I do "xl pci-assignable-list".
>   This returns nothing, which is correct since all hidden devices have already been assigned to guests.
>
> - Then I do "xl -v pci-detach 2 00:19.0",
>   which also returns nothing ...
>
> - Now I do "xl pci-assignable-list" again.
>   This returns:
>   "0000:00:19.0"
>   So the pci-detach does seem to have done *something* :-)

Or it thinks it has :-)

>
> - But when I now try to remove the device from pciback to dom0 with "xl pci-assignable-remove 00:19.0", it gives an error
>   and later it gives some stacktraces:
>
>   xen_pciback: ****** removing device 0000:00:19.0 while still in-use! ******
>   xen_pciback: ****** driver domain may still access this device's i/o resources!
>   xen_pciback: ****** shutdown driver domain before binding device
>   xen_pciback: ****** to other drivers or domains

What about /var/log/xen/qemu-dm* and the 'lspci' in the guest? Is the PCI device
removed from there?

>
> When I shut the guest down instead of using pci-detach, "xl pci-assignable-remove" works fine and I can rebind the device to its driver in dom0.
>
> So am I misreading the wiki, and is it not possible to detach a device from a running domain, or ... ?
>
> Oh yes, I'm running xen-unstable and a 3.13-rc7 kernel.

Do you see the same issue with 'xend'?

>
> --
> Sander
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove
  2014-01-10 15:12 ` Konrad Rzeszutek Wilk
@ 2014-01-10 15:57 ` Sander Eikelenboom
  2014-01-10 16:12 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-01-10 15:57 UTC
To: Konrad Rzeszutek Wilk; +Cc: xen-devel

Friday, January 10, 2014, 4:12:18 PM, you wrote:

> On Fri, Jan 10, 2014 at 03:51:57PM +0100, Sander Eikelenboom wrote:

..snip..

>> - Now I do "xl pci-assignable-list" again.
>>   This returns:
>>   "0000:00:19.0"
>>   So the pci-detach does seem to have done *something* :-)

> Or it thinks it has :-)

Well, it has .. but probably not enough ;-)

>> - But when I now try to remove the device from pciback to dom0 with "xl pci-assignable-remove 00:19.0", it gives an error
>>   and later it gives some stacktraces:
>>
>>   xen_pciback: ****** removing device 0000:00:19.0 while still in-use! ******
>>   xen_pciback: ****** driver domain may still access this device's i/o resources!
>>   xen_pciback: ****** shutdown driver domain before binding device
>>   xen_pciback: ****** to other drivers or domains

> What about /var/log/xen/qemu-dm* and the 'lspci' in the guest? Is the PCI device
> removed from there?

Ooh, I should have thought of that ...

In the guest I get "e1000e 0000:00:06.0 removed PHC" and the device is gone from lspci.
In /var/log/xen/qemu-dm* I get nothing, but I was using qemu-xen, which is totally non-verbose.

So let's try with qemu-xen-traditional, which I also forgot to test ...

That gives exactly the same error/warning as above, but it has some output in /var/log/xen/qemu-dm*:

pt_msgctrl_reg_write: setup msi for dev 30
pt_msi_setup: pt_msi_setup requested pirq = 54
pt_msi_setup: msi mapped with pirq 36
pt_msi_update: Update msi with pirq 36 gvec 0 gflags 3036
pt_msgctrl_reg_write: setup msi for dev 28
pt_msi_setup: pt_msi_setup requested pirq = 53
pt_msi_setup: msi mapped with pirq 35
pt_msi_update: Update msi with pirq 35 gvec 0 gflags 3035
pt_msi_update: Update msi with pirq 36 gvec 0 gflags 3034
dm-command: hot remove pass-through pci dev
generate a sci for PHP.
deassert due to disable GPE bit.
ACPI:debug: write addr=0xb044, val=0x30.
ACPI:debug: write addr=0xb045, val=0x3.
ACPI:debug: write addr=0xb044, val=0x30.
ACPI:debug: write addr=0xb045, val=0x88.
ACPI PCI hotplug: write devfn=0x30.
pci_intx: intx=1
pci_intx: intx=1
pt_msi_disable: Unbind msi with pirq 36, gvec 0
pt_msi_disable: Unmap msi with pirq 36

Also worth mentioning is that the console on which the "xl pci-assignable-remove 00:19.0" command is given keeps hanging, and eventually the hung-task stacktrace appears.

..snip..

>> Oh yes, I'm running xen-unstable and a 3.13-rc7 kernel.

> Do you see the same issue with 'xend'?

Erhmmm, I haven't used that for what seems to be ages .. :-)

Hmm, I also forgot the hung-task stacktrace I get some time after the "xl pci-assignable-remove 00:19.0" ...

It seems to be the pci_reset_function ...

[ 52.099144] xen_bridge: port 4(vif2.0-emu) entered forwarding state
[ 55.683141] xen_bridge: port 1(vif1.0) entered forwarding state
[ 59.861385] xen-blkback:ring-ref 8, event-channel 22, protocol 1 (x86_64-abi) persistent grants
[ 66.043965] xen_bridge: port 3(vif2.0) entered forwarding state
[ 66.044549] xen_bridge: port 3(vif2.0) entered forwarding state
[ 81.091149] xen_bridge: port 3(vif2.0) entered forwarding state
[ 227.441191] xen_pciback: ****** removing device 0000:00:19.0 while still in-use! ******
[ 227.443482] xen_pciback: ****** driver domain may still access this device's i/o resources!
[ 227.445811] xen_pciback: ****** shutdown driver domain before binding device
[ 227.447811] xen_pciback: ****** to other drivers or domains
[ 368.859343] INFO: task xl:3675 blocked for more than 120 seconds.
[ 368.860447] Not tainted 3.13.0-rc7-20140110-creabox-nuc+ #1
[ 368.860990] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 368.861682] xl D ffff88003fd93f00 0 3675 3489 0x00000000
[ 368.862319] ffff880038c0e880 0000000000000282 0000000000000000 ffff880038fd03d0
[ 368.863035] 0000000000013f00 0000000000013f00 ffff880038c0e880 ffff880036abffd8
[ 368.863802] ffffffff81087ac6 ffff88003a0f00f8 ffff88003a0f00fc ffff880038c0e880
[ 368.864514] Call Trace:
[ 368.864744] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
[ 368.865273] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
[ 368.865851] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
[ 368.866409] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
[ 368.866892] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e
[ 368.867430] [<ffffffff818e7dc1>] ? _raw_spin_lock_irqsave+0x14/0x36
[ 368.867996] [<ffffffff818e7238>] ? down_write+0x9/0x26
[ 368.868467] [<ffffffff813f1863>] ? pcistub_put_pci_dev+0x7b/0xe0
[ 368.868991] [<ffffffff813f14a7>] ? pcistub_remove+0xd0/0x127
[ 368.869506] [<ffffffff8135b5b8>] ? pci_device_remove+0x38/0x83
[ 368.870017] [<ffffffff814cb37f>] ? __device_release_driver+0x82/0xdb
[ 368.870593] [<ffffffff814cb602>] ? device_release_driver+0x1a/0x25
[ 368.871152] [<ffffffff814ca993>] ? unbind_store+0x59/0x89
[ 368.871659] [<ffffffff81178aa0>] ? sysfs_write_file+0x13f/0x18f
[ 368.872173] [<ffffffff81122aa6>] ? vfs_write+0x95/0xfb
[ 368.872641] [<ffffffff81122d8a>] ? SyS_write+0x51/0x85
[ 368.873087] [<ffffffff818ed179>] ? system_call_fastpath+0x16/0x1b
[ 488.871331] INFO: task xl:3675 blocked for more than 120 seconds.
[ 488.913929] Not tainted 3.13.0-rc7-20140110-creabox-nuc+ #1
[ 488.937031] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 488.960945] xl D ffff88003fd93f00 0 3675 3489 0x00000004
[ 488.986090] ffff880038c0e880 0000000000000282 0000000000000000 ffff880038fd03d0
[ 489.010383] 0000000000013f00 0000000000013f00 ffff880038c0e880 ffff880036abffd8
[ 489.034456] ffffffff81087ac6 ffff88003a0f00f8 ffff88003a0f00fc ffff880038c0e880
[ 489.058621] Call Trace:
[ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
[ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
[ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
[ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
[ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e
[ 489.200927] [<ffffffff818e7dc1>] ? _raw_spin_lock_irqsave+0x14/0x36
[ 489.224076] [<ffffffff818e7238>] ? down_write+0x9/0x26
[ 489.246898] [<ffffffff813f1863>] ? pcistub_put_pci_dev+0x7b/0xe0
[ 489.270086] [<ffffffff813f14a7>] ? pcistub_remove+0xd0/0x127
[ 489.293053] [<ffffffff8135b5b8>] ? pci_device_remove+0x38/0x83
[ 489.316068] [<ffffffff814cb37f>] ? __device_release_driver+0x82/0xdb
[ 489.338896] [<ffffffff814cb602>] ? device_release_driver+0x1a/0x25
[ 489.362459] [<ffffffff814ca993>] ? unbind_store+0x59/0x89
[ 489.385396] [<ffffffff81178aa0>] ? sysfs_write_file+0x13f/0x18f
[ 489.408605] [<ffffffff81122aa6>] ? vfs_write+0x95/0xfb
[ 489.431407] [<ffffffff81122d8a>] ? SyS_write+0x51/0x85
[ 489.454251] [<ffffffff818ed179>] ? system_call_fastpath+0x16/0x1b

--
Sander
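To read a hung-task report like the one above more quickly, the symbol names can be pulled out of the call-trace frames. The following is a minimal sketch, assuming the dmesg text has one frame per line; the function name `frames` is made up for this example. Note that the "?" in front of each symbol means the kernel's unwinder does not consider that frame fully reliable, so the resulting list is a hint about where the task is blocked rather than an exact call chain.

```shell
#!/bin/sh
# frames: read dmesg-style text on stdin and print the symbol name of every
# call-trace frame, one per line. Lines without a "[<address>] symbol+0x..."
# pattern (for example "Call Trace:") are skipped.
frames() {
    sed -n 's/.*\[<[0-9a-f]*>\][ ?]*\([A-Za-z_][A-Za-z0-9_.]*\)+0x.*/\1/p'
}
```

With a saved log, something like `frames < hung-task.txt` (the file name is hypothetical) lists mutex_lock and pci_reset_function near the top, which matches the observation that xl is waiting on a mutex inside pci_reset_function.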
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove
  2014-01-10 15:57 ` Sander Eikelenboom
@ 2014-01-10 16:12 ` Konrad Rzeszutek Wilk
  2014-01-10 16:16 ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-01-10 16:12 UTC
To: Sander Eikelenboom; +Cc: xen-devel

On Fri, Jan 10, 2014 at 04:57:29PM +0100, Sander Eikelenboom wrote:

..snip..

> So let's try with qemu-xen-traditional, which I also forgot to test ...
>
> That gives exactly the same error/warning as above, but it has some output in /var/log/xen/qemu-dm*:
>
> pt_msgctrl_reg_write: setup msi for dev 30

..snip..

> dm-command: hot remove pass-through pci dev
> generate a sci for PHP.
> deassert due to disable GPE bit.
> ACPI:debug: write addr=0xb044, val=0x30.
> ACPI:debug: write addr=0xb045, val=0x3.
> ACPI:debug: write addr=0xb044, val=0x30.
> ACPI:debug: write addr=0xb045, val=0x88.
> ACPI PCI hotplug: write devfn=0x30.
> pci_intx: intx=1
> pci_intx: intx=1
> pt_msi_disable: Unbind msi with pirq 36, gvec 0
> pt_msi_disable: Unmap msi with pirq 36

Good, so the device is safely removed from the guest.
QEMU acted on the 'libxl' command to remove it.

..snip..

> > Do you see the same issue with 'xend'?
>
> Erhmmm, I haven't used that for what seems to be ages .. :-)

Heh.

> Hmm, I also forgot the hung-task stacktrace I get some time after the "xl pci-assignable-remove 00:19.0" ...

Wow. You just walked into a pile of bugs, didn't you? And on a Friday,
nonetheless.

> It seems to be the pci_reset_function ...

..snip..

> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e

Yeah, my RFC patchset (the one that does the slot/bus reset) should also fix that bug.
I totally forgot about it!

I hope.

..snip..
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove
  2014-01-10 16:12 ` Konrad Rzeszutek Wilk
@ 2014-01-10 16:16 ` Sander Eikelenboom
  2014-01-10 17:38 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-01-10 16:16 UTC
To: Konrad Rzeszutek Wilk; +Cc: xen-devel

Friday, January 10, 2014, 5:12:48 PM, you wrote:

> On Fri, Jan 10, 2014 at 04:57:29PM +0100, Sander Eikelenboom wrote:

..snip..

>> Hmm, I also forgot the hung-task stacktrace I get some time after the "xl pci-assignable-remove 00:19.0" ...

> Wow. You just walked into a pile of bugs, didn't you? And on a Friday,
> nonetheless.

As usual ;-)

..snip..

>> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
>> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
>> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
>> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
>> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e

> Yeah, my RFC patchset (the one that does the slot/bus reset) should also fix that bug.
> I totally forgot about it!

Got a link to that patchset?
I could at least give it a spin .. you never know when fortune is on your side :-)

> I hope.

..snip..
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove
  2014-01-10 16:16 ` Sander Eikelenboom
@ 2014-01-10 17:38 ` Konrad Rzeszutek Wilk
  2014-01-10 18:21 ` Sander Eikelenboom
  ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-01-10 17:38 UTC
To: Sander Eikelenboom; +Cc: xen-devel

> > Wow. You just walked into a pile of bugs, didn't you? And on a Friday,
> > nonetheless.
>
> As usual ;-)

Ha!

..snip..

> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e
>
> > Yeah, my RFC patchset (the one that does the slot/bus reset) should also fix that bug.
> > I totally forgot about it!
>
> Got a link to that patchset?

https://lkml.org/lkml/2013/12/13/315

> I could at least give it a spin .. you never know when fortune is on your side :-)

It is also in this git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
and the branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely
want to merge it into your current Linus tree.

Thank you!
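Merging the branch mentioned above into a local kernel tree could look roughly like the sketch below. The repository URL and branch name are the ones given in the message; the remote name `konrad-xen`, the `run` wrapper and the `DRY_RUN` switch are assumptions made for this example, and the merge is assumed to be started from inside an existing checkout of Linus' tree.

```shell
#!/bin/sh
# Sketch: fetch the RFC branch and merge it into the currently checked-out tree.
# REMOTE_NAME, DRY_RUN and run() are illustrative, not from the original mail.
REMOTE_URL="git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git"
BRANCH="devel/xen-pciback.slot_and_bus.v0"
REMOTE_NAME="${REMOTE_NAME:-konrad-xen}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

fetch_and_merge() {
    run git remote add "$REMOTE_NAME" "$REMOTE_URL"
    run git fetch "$REMOTE_NAME" "$BRANCH"   # leaves the branch tip in FETCH_HEAD
    run git merge FETCH_HEAD
}
```

Running `fetch_and_merge` with the default `DRY_RUN=1` prints the three git commands so they can be reviewed before anything is changed.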
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove
  2014-01-10 17:38 ` Konrad Rzeszutek Wilk
@ 2014-01-10 18:21 ` Sander Eikelenboom
  2014-01-10 18:22 ` Sander Eikelenboom
  2014-01-24 13:36 ` Sander Eikelenboom
  2 siblings, 0 replies; 20+ messages in thread
From: Sander Eikelenboom @ 2014-01-10 18:21 UTC
To: Konrad Rzeszutek Wilk; +Cc: xen-devel

Friday, January 10, 2014, 6:38:10 PM, you wrote:

>> > Wow. You just walked into a pile of bugs, didn't you? And on a Friday,
>> > nonetheless.
>>
>> As usual ;-)

> Ha!

> ..snip..

>> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
>> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
>> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
>> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
>> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e
>>
>> > Yeah, my RFC patchset (the one that does the slot/bus reset) should also fix that bug.
>> > I totally forgot about it!
>>
>> Got a link to that patchset?

> https://lkml.org/lkml/2013/12/13/315

>> I could at least give it a spin .. you never know when fortune is on your side :-)

> It is also in this git tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
> and the branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely
> want to merge it into your current Linus tree.

> Thank you!

Hmm, it was worth a shot, but it didn't cut it: still the same error/warning, and the console hangs.
I don't seem to get the hung-task stacktrace I did get before.

However, it is stuck: when I switch to another console and try to do an "lspci" there, that also hangs.
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-01-10 17:38 ` Konrad Rzeszutek Wilk 2014-01-10 18:21 ` Sander Eikelenboom @ 2014-01-10 18:22 ` Sander Eikelenboom 2014-01-24 13:36 ` Sander Eikelenboom 2 siblings, 0 replies; 20+ messages in thread From: Sander Eikelenboom @ 2014-01-10 18:22 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: xen-devel Friday, January 10, 2014, 6:38:10 PM, you wrote: >> > Wow. You just walked in a pile of bugs didn't you? And on Friday >> > nonethless. >> >> As usual ;-) > Ha! > ..snip.. >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >> > I totally forgot about it ! >> >> Got a link to that patchset ? > https://lkml.org/lkml/2013/12/13/315 >> I at least could give it a spin .. you never know when fortune is on your side :-) > It is also at this git tree: > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely > want to merge it in your current Linus tree. > Thank you! Ah and there is the same stacktrace as well .. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-01-10 17:38 ` Konrad Rzeszutek Wilk 2014-01-10 18:21 ` Sander Eikelenboom 2014-01-10 18:22 ` Sander Eikelenboom @ 2014-01-24 13:36 ` Sander Eikelenboom 2014-01-24 17:48 ` Konrad Rzeszutek Wilk 2014-01-27 16:29 ` George Dunlap 2 siblings, 2 replies; 20+ messages in thread From: Sander Eikelenboom @ 2014-01-24 13:36 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: xen-devel Friday, January 10, 2014, 6:38:10 PM, you wrote: >> > Wow. You just walked in a pile of bugs didn't you? And on Friday >> > nonethless. >> >> As usual ;-) > Ha! > ..snip.. >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >> > I totally forgot about it ! >> >> Got a link to that patchset ? > https://lkml.org/lkml/2013/12/13/315 >> I at least could give it a spin .. you never know when fortune is on your side :-) > It is also at this git tree: > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely > want to merge it in your current Linus tree. > Thank you! Hi Konrad, Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to help with my problem,i'm no capable of using: - xl pci-detach - xl pci-assignable-remove - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) So the first 4 seem to be an improvement. 
That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to cause trouble of its own. -- Sander ^ permalink raw reply [flat|nested] 20+ messages in thread
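The three steps listed in the message above can be sketched as a small script. The domid (2) and device address (00:19.0) are the values from the opening message of the thread; the dom0 driver name e1000e and the `run` dry-run helper are assumptions for illustration only, since the thread writes the driver directory just as <devicename>. By default the script only prints the commands; set RUN=1 to execute them as root in dom0.

```shell
#!/bin/sh
# Sketch: detach one device from a running guest and give it back to dom0.
DOMID="${DOMID:-2}"            # guest domain id, from the first message of the thread
BDF="${BDF:-0000:00:19.0}"     # device address, with the PCI domain prefix sysfs expects
DRIVER="${DRIVER:-e1000e}"     # ASSUMPTION: dom0 driver name, not stated in the thread

# Hypothetical helper: print each command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

return_device_to_dom0() {
    # 1. Hot-unplug the device from the running guest.
    run xl pci-detach "$DOMID" "$BDF"
    # 2. Release the device from pciback.
    run xl pci-assignable-remove "$BDF"
    # 3. Bind it to its dom0 driver; the redirection needs a shell.
    run sh -c "echo $BDF > /sys/bus/pci/drivers/$DRIVER/bind"
}

return_device_to_dom0
```

If step 2 still logs the "removing device ... while still in-use!" warning, the guest has not fully let go of the device, which is the failure this thread is about.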
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-01-24 13:36 ` Sander Eikelenboom @ 2014-01-24 17:48 ` Konrad Rzeszutek Wilk 2014-01-24 18:53 ` Sander Eikelenboom 2014-02-20 8:53 ` Sander Eikelenboom 2014-01-27 16:29 ` George Dunlap 1 sibling, 2 replies; 20+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-01-24 17:48 UTC (permalink / raw) To: Sander Eikelenboom; +Cc: xen-devel On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: > > Friday, January 10, 2014, 6:38:10 PM, you wrote: > > >> > Wow. You just walked in a pile of bugs didn't you? And on Friday > >> > nonethless. > >> > >> As usual ;-) > > > Ha! > > ..snip.. > >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 > >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 > >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 > >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 > >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e > >> > >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. > >> > I totally forgot about it ! > >> > >> Got a link to that patchset ? > > > https://lkml.org/lkml/2013/12/13/315 > > >> I at least could give it a spin .. you never know when fortune is on your side :-) > > > It is also at this git tree: > > > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the > > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely > > want to merge it in your current Linus tree. > > > Thank you! 
> > > Hi Konrad, > > Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) > seems to help with my problem,i'm no capable of using: > - xl pci-detach > - xl pci-assignable-remove > - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind > > to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) > So the first 4 seem to be an improvement. > > That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. Could you email me your lspci output and also which devices you move/switch etc? Thanks! > > -- > Sander > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-01-24 17:48 ` Konrad Rzeszutek Wilk @ 2014-01-24 18:53 ` Sander Eikelenboom 2014-02-20 8:53 ` Sander Eikelenboom 1 sibling, 0 replies; 20+ messages in thread From: Sander Eikelenboom @ 2014-01-24 18:53 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: xen-devel Friday, January 24, 2014, 6:48:06 PM, you wrote: > On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: >> >> Friday, January 10, 2014, 6:38:10 PM, you wrote: >> >> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday >> >> > nonethless. >> >> >> >> As usual ;-) >> >> > Ha! >> > ..snip.. >> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >> >> >> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >> >> > I totally forgot about it ! >> >> >> >> Got a link to that patchset ? >> >> > https://lkml.org/lkml/2013/12/13/315 >> >> >> I at least could give it a spin .. you never know when fortune is on your side :-) >> >> > It is also at this git tree: >> >> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the >> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely >> > want to merge it in your current Linus tree. >> >> > Thank you! 
>> >> >> Hi Konrad, >> >> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) >> seems to help with my problem,i'm no capable of using: >> - xl pci-detach >> - xl pci-assignable-remove >> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind >> >> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) >> So the first 4 seem to be an improvement. >> >> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. > Could you email me your lspci output and also which devices you move/switch etc? Hmm, hope you didn't misunderstand :-) I now spot a missing "w" .. i am noW capable of .. :-) So it works when the first 4 patches of that branch are applied. I tried with both a NIC and a wireless NIC and had no problems. The problems with that last commit don't seem to be related to the moving/switching of devices; they also occur on a regular create or shutdown of a guest with a device passed through to it. Just to be complete, here is a stacktrace of the hung task i encounter then: [ 968.600248] INFO: task xenwatch:29 blocked for more than 120 seconds. [ 968.601885] Not tainted 3.13.020140123-pcireset-p5+ #1 [ 968.603086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 968.605298] xenwatch D ffff88003fb6c1d0 0 29 2 0x00000000 [ 968.607007] ffff88003fb6c1d0 0000000000000246 ffffffff81a0af60 ffff880038c8a980 [ 968.608773] 0000000000013f00 0000000000013f00 ffff88003fb6c1d0 ffff88003fb9bfd8 [ 968.609515] ffff88003fb6c1d0 7fffffffffffffff 7fffffffffffffff 0000000000000002 [ 968.610260] Call Trace: [ 968.610493] [<ffffffff818eda68>] ? genl_set_err.isra.15.constprop.21+0x51/0x51 [ 968.611163] [<ffffffff818eda94>] ? schedule_timeout+0x2c/0x123 [ 968.641854] [<ffffffff813e8bc5>] ? read_reply+0xcc/0xd8 [ 968.666733] [<ffffffff81117c2f>] ?
kfree+0x50/0x6f [ 968.691593] [<ffffffff81116d5b>] ? arch_local_irq_restore+0x7/0x8 [ 968.716408] [<ffffffff813e8bc5>] ? read_reply+0xcc/0xd8 [ 968.740956] [<ffffffff81087ecf>] ? arch_local_irq_disable+0x7/0x8 [ 968.765631] [<ffffffff818eebc3>] ? __wait_for_common+0x123/0x15e [ 968.790148] [<ffffffff8107d188>] ? try_to_wake_up+0x198/0x198 [ 968.814917] [<ffffffff818f061e>] ? _raw_spin_unlock_irqrestore+0xb/0xc [ 968.839656] [<ffffffff813f18c1>] ? pcistub_get_pci_dev_by_slot+0xc3/0xd5 [ 968.864307] [<ffffffff813f27f4>] ? xen_pcibk_export_device+0x27/0xfe [ 968.888824] [<ffffffff813f2a10>] ? xen_pcibk_setup_backend+0x145/0x265 [ 968.913094] [<ffffffff813f300d>] ? xen_pcibk_xenbus_probe+0xeb/0x12a [ 968.937093] [<ffffffff813ea6b8>] ? xenbus_dev_probe+0x56/0xb5 [ 968.961314] [<ffffffff814cbd9b>] ? __driver_attach+0x73/0x73 [ 968.985750] [<ffffffff814cbc07>] ? driver_probe_device+0x92/0x1b3 [ 969.010282] [<ffffffff814ca434>] ? bus_for_each_drv+0x46/0x80 [ 969.034570] [<ffffffff814cbb40>] ? device_attach+0x68/0x86 [ 969.058685] [<ffffffff814cb22b>] ? bus_probe_device+0x2c/0x9d [ 969.082670] [<ffffffff814c99b3>] ? device_add+0x371/0x51c [ 969.106462] [<ffffffff8106339e>] ? init_timer_key+0xe/0x5a [ 969.130155] [<ffffffff813ea39c>] ? xenbus_probe_node+0x121/0x160 [ 969.153793] [<ffffffff813e9369>] ? xenbus_dev_request_and_reply+0x75/0x75 [ 969.177396] [<ffffffff813ea550>] ? xenbus_dev_changed+0x175/0x1a4 [ 969.200938] [<ffffffff813e9431>] ? xenwatch_thread+0xc8/0xf2 [ 969.224678] [<ffffffff813e9369>] ? xenbus_dev_request_and_reply+0x75/0x75 [ 969.248560] [<ffffffff81085072>] ? bit_waitqueue+0x82/0x82 [ 969.272249] [<ffffffff810728e6>] ? kthread+0x99/0xa1 [ 969.296128] [<ffffffff8100384f>] ? xen_mc_issue.constprop.20+0x27/0x4d [ 969.320029] [<ffffffff81070000>] ? get_task_pid+0x2a/0x2c [ 969.343706] [<ffffffff8107284d>] ? __kthread_parkme+0x59/0x59 [ 969.367233] [<ffffffff818f598c>] ? ret_from_fork+0x7c/0xb0 [ 969.390736] [<ffffffff8107284d>] ? 
__kthread_parkme+0x59/0x59 > Thanks! >> >> -- >> Sander >> ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-01-24 17:48 ` Konrad Rzeszutek Wilk 2014-01-24 18:53 ` Sander Eikelenboom @ 2014-02-20 8:53 ` Sander Eikelenboom 2014-02-20 16:18 ` Sander Eikelenboom 1 sibling, 1 reply; 20+ messages in thread From: Sander Eikelenboom @ 2014-02-20 8:53 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: xen-devel [-- Attachment #1: Type: text/plain, Size: 3643 bytes --] Friday, January 24, 2014, 6:48:06 PM, you wrote: > On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: >> >> Friday, January 10, 2014, 6:38:10 PM, you wrote: >> >> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday >> >> > nonethless. >> >> >> >> As usual ;-) >> >> > Ha! >> > ..snip.. >> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >> >> >> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >> >> > I totally forgot about it ! >> >> >> >> Got a link to that patchset ? >> >> > https://lkml.org/lkml/2013/12/13/315 >> >> >> I at least could give it a spin .. you never know when fortune is on your side :-) >> >> > It is also at this git tree: >> >> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the >> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely >> > want to merge it in your current Linus tree. >> >> > Thank you! 
>> >> >> Hi Konrad, >> >> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) >> seems to help with my problem,i'm no capable of using: >> - xl pci-detach >> - xl pci-assignable-remove >> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind >> >> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) >> So the first 4 seem to be an improvement. >> >> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. > Could you email me your lspci output and also which devices you move/switch etc? Hi Konrad, i have now found some time to figure out what goes wrong with the xl pci-detach and xl pci-assignable-remove, and i have been able to narrow it down a bit: The problem only occurs when you: - pass through 2 (or more?) pci devices to a guest .. - and only remove 1 of those devices with "xl pci-detach" followed by a "xl pci-assignable-remove" - when you first detach both devices with "xl pci-detach" before doing the "xl pci-assignable-remove" it works ok. In my case i'm passing through 2 devices (02:00.0 and 00:19.0) I added some printk's and what i found out is that: - after doing the pci-detach of 02:00.0, it doesn't call pcistub_put_pci_dev for that device ... - but when i subsequently pci-detach the second (and last) device 00:19.0 .. it does call it for both 02:00.0 and 00:19.0 ... - so somehow that call for the first detached device gets deferred .. but since they are different devices and not functions of the same device i don't see any reason for it to wait until all other devices would have been detached ...
I tried to capture the console output but somehow that didn't work out, so i attached a screenshot of what happens when: - doing a xl pci-list for the guest - doing a xl pci-assignable-list - doing the xl pci-detach for 02:00.0 - doing a xl pci-list for the guest - doing a xl pci-assignable-list - waiting some time ... - doing the xl pci-detach for 00:19.0 - doing a xl pci-list for the guest - doing a xl pci-assignable-list There you can see this strange sequence of events :-) But i haven't been able to spot the culprit. attached: screenshot.jpg -- Sander > Thanks! >> >> -- >> Sander >> [-- Attachment #2: screenshot.jpg --] [-- Type: image/jpeg, Size: 691527 bytes --] [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 20+ messages in thread
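The reproduction sequence described in the message above can be written down as a script, which makes it easier to repeat while capturing a serial console or dmesg. The two device addresses are the ones from the message; the domid default, the wait time and the `run` dry-run helper are placeholders assumed for this sketch. It only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: detach two passed-through devices one at a time, checking state in between.
DOMID="${DOMID:-1}"        # PLACEHOLDER: the guest's domid is not given in the message
DEV_FIRST="02:00.0"        # first device detached in the report
DEV_LAST="00:19.0"         # second (and last) device detached in the report
WAIT="${WAIT:-30}"         # ASSUMPTION: "waiting some time" taken as 30 seconds

# Hypothetical helper: print each command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Show what the guest still holds and what pciback considers assignable.
show_state() {
    run xl pci-list "$DOMID"
    run xl pci-assignable-list
}

reproduce() {
    show_state
    run xl pci-detach "$DOMID" "$DEV_FIRST"
    show_state
    run sleep "$WAIT"
    run xl pci-detach "$DOMID" "$DEV_LAST"
    show_state
}

reproduce
```

Per the report, the interesting part is the middle show_state: the first device is listed as assignable again although pcistub_put_pci_dev has not yet been called for it.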
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-02-20 8:53 ` Sander Eikelenboom @ 2014-02-20 16:18 ` Sander Eikelenboom 2014-04-01 16:13 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 20+ messages in thread From: Sander Eikelenboom @ 2014-02-20 16:18 UTC (permalink / raw) To: Konrad Rzeszutek Wilk, Ian Campbell; +Cc: xen-devel [-- Attachment #1: Type: text/plain, Size: 4208 bytes --] Thursday, February 20, 2014, 9:53:59 AM, you wrote: > Friday, January 24, 2014, 6:48:06 PM, you wrote: >> On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: >>> >>> Friday, January 10, 2014, 6:38:10 PM, you wrote: >>> >>> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday >>> >> > nonethless. >>> >> >>> >> As usual ;-) >>> >>> > Ha! >>> > ..snip.. >>> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >>> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >>> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >>> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >>> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >>> >> >>> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >>> >> > I totally forgot about it ! >>> >> >>> >> Got a link to that patchset ? >>> >>> > https://lkml.org/lkml/2013/12/13/315 >>> >>> >> I at least could give it a spin .. you never know when fortune is on your side :-) >>> >>> > It is also at this git tree: >>> >>> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the >>> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely >>> > want to merge it in your current Linus tree. >>> >>> > Thank you! 
>>> >>> >>> Hi Konrad, >>> >>> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) >>> seems to help with my problem,i'm no capable of using: >>> - xl pci-detach >>> - xl pci-assignable-remove >>> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind >>> >>> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) >>> So the first 4 seem to be an improvement. >>> >>> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. >> Could you email me your lspci output and also which devices you move/switch etc? > Hi Konrad, > At the moment i found some time to figure out what goes wrong with the xl pci-detach and xl pci-assignable-remove, i have been > able to narrow it down a bit: > The problem only occurs when you: > - passthrough 2 (or more?) pci devices assigned to a guest .. > - and only remove 1 of those devices with "xl pci-detach" followed by a "xl pci-assignable-remove" > - when you first detach both devices with "xl pci-detach" before doing the "xl pci-assignable-remove" it works ok. > In my case i'm passingthrough 2 devices (02:00.0 and 00:19.0) > I added some printk's and what i found out is that: > - after doing the pci-detach of 02:00.0, it doesn't call pcistub_put_pci_dev for that device ... > - but when i subsequently pci-detach the second (and last) device 00:19.0 .. it does call it for both 02:00.0 and 00:19.0 ... > - so somehow that call for the first detached device gets deferred .. but since it are different devices and not functions of the same device i don't > see any reason for it to wait until all other devices would have been detached ... 
> I tried to capture the console output but some how that didn't work out, so i attached a screenshot of what happens when: > - doing a xl pci-list for the guest > - doing a xl pci-assignable-list > - doing the xl pci-detach for 02:00.0 > - doing a xl pci-list for the guest > - doing a xl pci-assignable-list > - waiting some time ... > - doing the xl pci-detach for 00:19.0 > - doing a xl pci-list for the guest > - doing a xl pci-assignable-list > There you can see this strange sequence of events :-) > But i haven't been able to spot the culprit Enabled some extra debugging and added some more printk's .. (see new screenshot) From what it seems .. the frontend state for the first device isn't changed on the first pci-detach .. Is the signaling on pci-detach the guests (pcifront) responsibility or the toolstacks (libxl) ? > attached: screenshot.jpg > -- > Sander >> Thanks! >>> >>> -- >>> Sander >>> [-- Attachment #2: screenshot2.jpg --] [-- Type: image/jpeg, Size: 304050 bytes --] [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-02-20 16:18 ` Sander Eikelenboom @ 2014-04-01 16:13 ` Konrad Rzeszutek Wilk 2014-04-02 10:43 ` Sander Eikelenboom 0 siblings, 1 reply; 20+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-04-01 16:13 UTC (permalink / raw) To: Sander Eikelenboom; +Cc: xen-devel, Ian Campbell On Thu, Feb 20, 2014 at 05:18:46PM +0100, Sander Eikelenboom wrote: > > Thursday, February 20, 2014, 9:53:59 AM, you wrote: > > > > Friday, January 24, 2014, 6:48:06 PM, you wrote: > > >> On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: > >>> > >>> Friday, January 10, 2014, 6:38:10 PM, you wrote: > >>> > >>> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday > >>> >> > nonethless. > >>> >> > >>> >> As usual ;-) > >>> > >>> > Ha! > >>> > ..snip.. > >>> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 > >>> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 > >>> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 > >>> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 > >>> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e > >>> >> > >>> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. > >>> >> > I totally forgot about it ! > >>> >> > >>> >> Got a link to that patchset ? > >>> > >>> > https://lkml.org/lkml/2013/12/13/315 > >>> > >>> >> I at least could give it a spin .. you never know when fortune is on your side :-) > >>> > >>> > It is also at this git tree: > >>> > >>> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the > >>> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely > >>> > want to merge it in your current Linus tree. > >>> > >>> > Thank you! 
> >>> > >>> > >>> Hi Konrad, > >>> > >>> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) > >>> seems to help with my problem,i'm no capable of using: > >>> - xl pci-detach > >>> - xl pci-assignable-remove > >>> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind > >>> > >>> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) > >>> So the first 4 seem to be an improvement. > >>> > >>> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. > > >> Could you email me your lspci output and also which devices you move/switch etc? > > > Hi Konrad, > > > At the moment i found some time to figure out what goes wrong with the xl pci-detach and xl pci-assignable-remove, i have been > > able to narrow it down a bit: > > > The problem only occurs when you: > > - passthrough 2 (or more?) pci devices assigned to a guest .. > > - and only remove 1 of those devices with "xl pci-detach" followed by a "xl pci-assignable-remove" > > - when you first detach both devices with "xl pci-detach" before doing the "xl pci-assignable-remove" it works ok. > > > In my case i'm passingthrough 2 devices (02:00.0 and 00:19.0) > > > I added some printk's and what i found out is that: > > - after doing the pci-detach of 02:00.0, it doesn't call pcistub_put_pci_dev for that device ... > > - but when i subsequently pci-detach the second (and last) device 00:19.0 .. it does call it for both 02:00.0 and 00:19.0 ... > > - so somehow that call for the first detached device gets deferred .. but since it are different devices and not functions of the same device i don't > > see any reason for it to wait until all other devices would have been detached ... 
> > > > I tried to capture the console output but some how that didn't work out, so i attached a screenshot of what happens when: > > - doing a xl pci-list for the guest > > - doing a xl pci-assignable-list > > > - doing the xl pci-detach for 02:00.0 > > > - doing a xl pci-list for the guest > > - doing a xl pci-assignable-list > > > - waiting some time ... > > > - doing the xl pci-detach for 00:19.0 > > > - doing a xl pci-list for the guest > > - doing a xl pci-assignable-list > > > There you can see this strange sequence of events :-) > > > But i haven't been able to spot the culprit > > Enabled some extra debugging and added some more printk's .. (see new screenshot) > > From what it seems .. the frontend state for the first device isn't changed on the first pci-detach .. > > Is the signaling on pci-detach the guests (pcifront) responsibility or the toolstacks (libxl) ? It usually is pcifront. And in the screenshot I see: .. frontend is gone! unregister device which should trigger the process. And it does look to do that. Hm, I am wondering what the toolstack is waiting for. Time to debug. > > > > > attached: screenshot.jpg and thanks for the screenshot (didn't have copy-n-paste option handy :-)) ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-04-01 16:13 ` Konrad Rzeszutek Wilk @ 2014-04-02 10:43 ` Sander Eikelenboom 2014-04-16 15:30 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 20+ messages in thread From: Sander Eikelenboom @ 2014-04-02 10:43 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Ian Campbell Tuesday, April 1, 2014, 6:13:09 PM, you wrote: > On Thu, Feb 20, 2014 at 05:18:46PM +0100, Sander Eikelenboom wrote: >> >> Thursday, February 20, 2014, 9:53:59 AM, you wrote: >> >> >> > Friday, January 24, 2014, 6:48:06 PM, you wrote: >> >> >> On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: >> >>> >> >>> Friday, January 10, 2014, 6:38:10 PM, you wrote: >> >>> >> >>> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday >> >>> >> > nonethless. >> >>> >> >> >>> >> As usual ;-) >> >>> >> >>> > Ha! >> >>> > ..snip.. >> >>> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >> >>> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >> >>> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >> >>> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >> >>> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >> >>> >> >> >>> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >> >>> >> > I totally forgot about it ! >> >>> >> >> >>> >> Got a link to that patchset ? >> >>> >> >>> > https://lkml.org/lkml/2013/12/13/315 >> >>> >> >>> >> I at least could give it a spin .. you never know when fortune is on your side :-) >> >>> >> >>> > It is also at this git tree: >> >>> >> >>> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the >> >>> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely >> >>> > want to merge it in your current Linus tree. >> >>> >> >>> > Thank you! 
>> >>> >> >>> >> >>> Hi Konrad, >> >>> >> >>> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) >> >>> seems to help with my problem,i'm no capable of using: >> >>> - xl pci-detach >> >>> - xl pci-assignable-remove >> >>> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind >> >>> >> >>> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) >> >>> So the first 4 seem to be an improvement. >> >>> >> >>> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. >> >> >> Could you email me your lspci output and also which devices you move/switch etc? >> >> > Hi Konrad, >> >> > At the moment i found some time to figure out what goes wrong with the xl pci-detach and xl pci-assignable-remove, i have been >> > able to narrow it down a bit: >> >> > The problem only occurs when you: >> > - passthrough 2 (or more?) pci devices assigned to a guest .. >> > - and only remove 1 of those devices with "xl pci-detach" followed by a "xl pci-assignable-remove" >> > - when you first detach both devices with "xl pci-detach" before doing the "xl pci-assignable-remove" it works ok. >> >> > In my case i'm passingthrough 2 devices (02:00.0 and 00:19.0) >> >> > I added some printk's and what i found out is that: >> > - after doing the pci-detach of 02:00.0, it doesn't call pcistub_put_pci_dev for that device ... >> > - but when i subsequently pci-detach the second (and last) device 00:19.0 .. it does call it for both 02:00.0 and 00:19.0 ... >> > - so somehow that call for the first detached device gets deferred .. but since it are different devices and not functions of the same device i don't >> > see any reason for it to wait until all other devices would have been detached ... 
>> >> >> > I tried to capture the console output but some how that didn't work out, so i attached a screenshot of what happens when: >> > - doing a xl pci-list for the guest >> > - doing a xl pci-assignable-list >> >> > - doing the xl pci-detach for 02:00.0 >> >> > - doing a xl pci-list for the guest >> > - doing a xl pci-assignable-list >> >> > - waiting some time ... >> >> > - doing the xl pci-detach for 00:19.0 >> >> > - doing a xl pci-list for the guest >> > - doing a xl pci-assignable-list >> >> > There you can see this strange sequence of events :-) >> >> > But i haven't been able to spot the culprit >> >> Enabled some extra debugging and added some more printk's .. (see new screenshot) >> >> From what it seems .. the frontend state for the first device isn't changed on the first pci-detach .. >> >> Is the signaling on pci-detach the guests (pcifront) responsibility or the toolstacks (libxl) ? > It usually is pcifront. And in the screenshot I see: > .. frontend is gone! unregister device > which should trigger the process. And it does look to do that. > Hm, I am wondering what the toolstack is waiting for. > Time to debug. Ok thx :-) >> >> >> >> > attached: screenshot.jpg > and thanks for the screenshot (didn't have copy-n-paste option handy :-)) Well i didn't have KVM/SOL working on the intel NUC .. busy with that today .. it's a nifty little machine .. just got to get the AMT/vPro stuff working. So although second best, the screenshot was all i had at that moment. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-04-02 10:43 ` Sander Eikelenboom @ 2014-04-16 15:30 ` Konrad Rzeszutek Wilk 2014-04-16 15:44 ` Sander Eikelenboom 2014-04-16 16:22 ` Sander Eikelenboom 0 siblings, 2 replies; 20+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-04-16 15:30 UTC (permalink / raw) To: Sander Eikelenboom; +Cc: xen-devel, Ian Campbell On Wed, Apr 02, 2014 at 12:43:12PM +0200, Sander Eikelenboom wrote: > > Tuesday, April 1, 2014, 6:13:09 PM, you wrote: > > > On Thu, Feb 20, 2014 at 05:18:46PM +0100, Sander Eikelenboom wrote: > >> > >> Thursday, February 20, 2014, 9:53:59 AM, you wrote: > >> > >> > >> > Friday, January 24, 2014, 6:48:06 PM, you wrote: > >> > >> >> On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: > >> >>> > >> >>> Friday, January 10, 2014, 6:38:10 PM, you wrote: > >> >>> > >> >>> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday > >> >>> >> > nonethless. > >> >>> >> > >> >>> >> As usual ;-) > >> >>> > >> >>> > Ha! > >> >>> > ..snip.. > >> >>> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 > >> >>> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 > >> >>> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 > >> >>> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 > >> >>> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e > >> >>> >> > >> >>> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. > >> >>> >> > I totally forgot about it ! > >> >>> >> > >> >>> >> Got a link to that patchset ? > >> >>> > >> >>> > https://lkml.org/lkml/2013/12/13/315 > >> >>> > >> >>> >> I at least could give it a spin .. 
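The question above distinguishes two ways of handing a device to pciback: hiding it at boot with xen-pciback.hide, or writing the BDF into sysfs at runtime. A rough sketch of both checks follows. The sysfs paths under /sys/bus/pci/drivers/pciback are an assumption based on the usual xen-pciback layout and are not spelled out in this thread; the `run` helper and function names are hypothetical. Nothing is written to sysfs unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: tell the boot-time hide parameter apart from the runtime sysfs route.

# Hypothetical helper: print each command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Succeeds if the given kernel command line carries the hide parameter.
# The kernel treats "-" and "_" alike in module names, so both spellings match.
has_hide_param() {
    case " $1 " in
        *" xen-pciback.hide="*|*" xen_pciback.hide="*) return 0 ;;
        *) return 1 ;;
    esac
}

# The "sysfs dance" for one device, e.g. 0000:00:19.0 (assumed paths).
sysfs_assign() {
    bdf="$1"
    run sh -c "echo $bdf > /sys/bus/pci/devices/$bdf/driver/unbind"
    run sh -c "echo $bdf > /sys/bus/pci/drivers/pciback/new_slot"
    run sh -c "echo $bdf > /sys/bus/pci/drivers/pciback/bind"
}

cmdline=$(cat /proc/cmdline 2>/dev/null)
if has_hide_param "$cmdline"; then
    echo "xen-pciback.hide is set on the kernel command line"
else
    echo "xen-pciback.hide is not set; devices would be handed over via sysfs"
fi
```

The opening message of the thread does use xen-pciback.hide=(00:19.0) in grub, so on that setup the first branch would be expected.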
you never know when fortune is on your side :-) > >> >>> > >> >>> > It is also at this git tree: > >> >>> > >> >>> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the > >> >>> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely > >> >>> > want to merge it in your current Linus tree. > >> >>> > >> >>> > Thank you! > >> >>> > >> >>> > >> >>> Hi Konrad, > >> >>> > >> >>> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) > >> >>> seems to help with my problem,i'm no capable of using: > >> >>> - xl pci-detach > >> >>> - xl pci-assignable-remove > >> >>> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind > >> >>> > >> >>> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) > >> >>> So the first 4 seem to be an improvement. > >> >>> > >> >>> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. > >> > >> >> Could you email me your lspci output and also which devices you move/switch etc? > >> > >> > Hi Konrad, > >> > >> > At the moment i found some time to figure out what goes wrong with the xl pci-detach and xl pci-assignable-remove, i have been > >> > able to narrow it down a bit: > >> > >> > The problem only occurs when you: > >> > - passthrough 2 (or more?) pci devices assigned to a guest .. > >> > - and only remove 1 of those devices with "xl pci-detach" followed by a "xl pci-assignable-remove" > >> > - when you first detach both devices with "xl pci-detach" before doing the "xl pci-assignable-remove" it works ok. > >> > >> > In my case i'm passingthrough 2 devices (02:00.0 and 00:19.0) > >> > >> > I added some printk's and what i found out is that: > >> > - after doing the pci-detach of 02:00.0, it doesn't call pcistub_put_pci_dev for that device ... > >> > - but when i subsequently pci-detach the second (and last) device 00:19.0 .. 
it does call it for both 02:00.0 and 00:19.0 ...
> >> >   - so somehow that call for the first detached device gets deferred .. but since it are different devices and not functions of the same device i don't
> >> >     see any reason for it to wait until all other devices would have been detached ...
> >>
> >>
> >> > I tried to capture the console output but some how that didn't work out, so i attached a screenshot of what happens when:
> >> >   - doing a xl pci-list for the guest
> >> >   - doing a xl pci-assignable-list
> >>
> >> >   - doing the xl pci-detach for 02:00.0
> >>
> >> >   - doing a xl pci-list for the guest
> >> >   - doing a xl pci-assignable-list
> >>
> >> >   - waiting some time ...
> >>
> >> >   - doing the xl pci-detach for 00:19.0
> >>
> >> >   - doing a xl pci-list for the guest
> >> >   - doing a xl pci-assignable-list
> >>
> >> > There you can see this strange sequence of events :-)
> >>
> >> > But i haven't been able to spot the culprit
> >>
> >> Enabled some extra debugging and added some more printk's .. (see new screenshot)
> >>
> >> From what it seems .. the frontend state for the first device isn't changed on the first pci-detach ..
> >>
> >> Is the signaling on pci-detach the guests (pcifront) responsibility or the toolstacks (libxl) ?
>
> > It usually is pcifront. And in the screenshot I see:
> > .. frontend is gone! unregister device
> > which should trigger the process. And it does look to do that.
> > Hm, I am wondering what the toolstack is waiting for.
> > Time to debug.
>
> Ok thx :-)

Just to make sure: you are not using the xen-pciback.hide parameter, right?
Just doing the sysfs dance of 'echo BDF >' to various places?
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-04-16 15:30 ` Konrad Rzeszutek Wilk @ 2014-04-16 15:44 ` Sander Eikelenboom 2014-04-16 16:22 ` Sander Eikelenboom 1 sibling, 0 replies; 20+ messages in thread From: Sander Eikelenboom @ 2014-04-16 15:44 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Ian Campbell Wednesday, April 16, 2014, 5:30:57 PM, you wrote: > On Wed, Apr 02, 2014 at 12:43:12PM +0200, Sander Eikelenboom wrote: >> >> Tuesday, April 1, 2014, 6:13:09 PM, you wrote: >> >> > On Thu, Feb 20, 2014 at 05:18:46PM +0100, Sander Eikelenboom wrote: >> >> >> >> Thursday, February 20, 2014, 9:53:59 AM, you wrote: >> >> >> >> >> >> > Friday, January 24, 2014, 6:48:06 PM, you wrote: >> >> >> >> >> On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: >> >> >>> >> >> >>> Friday, January 10, 2014, 6:38:10 PM, you wrote: >> >> >>> >> >> >>> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday >> >> >>> >> > nonethless. >> >> >>> >> >> >> >>> >> As usual ;-) >> >> >>> >> >> >>> > Ha! >> >> >>> > ..snip.. >> >> >>> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >> >> >>> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >> >> >>> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >> >> >>> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >> >> >>> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >> >> >>> >> >> >> >>> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >> >> >>> >> > I totally forgot about it ! >> >> >>> >> >> >> >>> >> Got a link to that patchset ? >> >> >>> >> >> >>> > https://lkml.org/lkml/2013/12/13/315 >> >> >>> >> >> >>> >> I at least could give it a spin .. 
you never know when fortune is on your side :-) >> >> >>> >> >> >>> > It is also at this git tree: >> >> >>> >> >> >>> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the >> >> >>> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely >> >> >>> > want to merge it in your current Linus tree. >> >> >>> >> >> >>> > Thank you! >> >> >>> >> >> >>> >> >> >>> Hi Konrad, >> >> >>> >> >> >>> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) >> >> >>> seems to help with my problem,i'm no capable of using: >> >> >>> - xl pci-detach >> >> >>> - xl pci-assignable-remove >> >> >>> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind >> >> >>> >> >> >>> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) >> >> >>> So the first 4 seem to be an improvement. >> >> >>> >> >> >>> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. >> >> >> >> >> Could you email me your lspci output and also which devices you move/switch etc? >> >> >> >> > Hi Konrad, >> >> >> >> > At the moment i found some time to figure out what goes wrong with the xl pci-detach and xl pci-assignable-remove, i have been >> >> > able to narrow it down a bit: >> >> >> >> > The problem only occurs when you: >> >> > - passthrough 2 (or more?) pci devices assigned to a guest .. >> >> > - and only remove 1 of those devices with "xl pci-detach" followed by a "xl pci-assignable-remove" >> >> > - when you first detach both devices with "xl pci-detach" before doing the "xl pci-assignable-remove" it works ok. >> >> >> >> > In my case i'm passingthrough 2 devices (02:00.0 and 00:19.0) >> >> >> >> > I added some printk's and what i found out is that: >> >> > - after doing the pci-detach of 02:00.0, it doesn't call pcistub_put_pci_dev for that device ... 
>> >> > - but when i subsequently pci-detach the second (and last) device 00:19.0 .. it does call it for both 02:00.0 and 00:19.0 ... >> >> > - so somehow that call for the first detached device gets deferred .. but since it are different devices and not functions of the same device i don't >> >> > see any reason for it to wait until all other devices would have been detached ... >> >> >> >> >> >> > I tried to capture the console output but some how that didn't work out, so i attached a screenshot of what happens when: >> >> > - doing a xl pci-list for the guest >> >> > - doing a xl pci-assignable-list >> >> >> >> > - doing the xl pci-detach for 02:00.0 >> >> >> >> > - doing a xl pci-list for the guest >> >> > - doing a xl pci-assignable-list >> >> >> >> > - waiting some time ... >> >> >> >> > - doing the xl pci-detach for 00:19.0 >> >> >> >> > - doing a xl pci-list for the guest >> >> > - doing a xl pci-assignable-list >> >> >> >> > There you can see this strange sequence of events :-) >> >> >> >> > But i haven't been able to spot the culprit >> >> >> >> Enabled some extra debugging and added some more printk's .. (see new screenshot) >> >> >> >> From what it seems .. the frontend state for the first device isn't changed on the first pci-detach .. >> >> >> >> Is the signaling on pci-detach the guests (pcifront) responsibility or the toolstacks (libxl) ? >> >> > It usually is pcifront. And in the screenshot I see: >> > .. frontend is gone! unregister device >> > which should trigger the process. And it does look to do that. >> > Hm, I am wondering what the toolstack is waiting for. >> > Time to debug. >> >> Ok thx :-) > Just to make sure - you are not using the xen-pciback.hide parameter right? > Just doing the /sysfs dance of 'echo BDF'> to various places. Nope, i always use xen-pciback.hide .. And normally i only create, shutdown or destroy guests .. and all goes well. As said .. 
it only crashes the host when you detach only some of the devices from a guest rather than all of them (so with a single passed-through device it is never a problem).
Somehow the detach isn't completed while other devices are still attached.

(What I'm trying to do is give the only Ethernet NIC the machine has back to dom0 and leave the wireless NIC passed through to the OpenWrt router guest. The whole detach and rebind works perfectly as long as it's the only device passed through to the guest, so yes, this is going to work ;-) )
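The ordering that avoids the crash, as reported earlier in this thread (detach every passed-through device before running pci-assignable-remove), can be sketched as a small dry-run helper. This is only an illustration: the function name plan_release is made up, the driver name e1000e in the usage line is an assumed example that the thread never names, and the helper only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch: print, but do not execute, the command order that was
# reported to work in this thread. Nothing here touches xl or sysfs.
#
# plan_release DOMID DRIVER TARGET_BDF [OTHER_BDF...]
#   DOMID      - guest domain id
#   DRIVER     - dom0 driver to rebind to (hypothetical, caller-supplied)
#   TARGET_BDF - device to hand back to dom0, e.g. 00:19.0
#   OTHER_BDF  - the remaining devices passed through to the same guest
plan_release() {
    domid=$1; driver=$2; target=$3
    shift 3
    # Detach every passed-through device first, the target included.
    for bdf in "$target" "$@"; do
        echo "xl pci-detach $domid $bdf"
    done
    # Only then take the target away from pciback ...
    echo "xl pci-assignable-remove $target"
    # ... and bind it to its dom0 driver through sysfs.
    echo "echo 0000:$target > /sys/bus/pci/drivers/$driver/bind"
}

# Example with the two devices from this thread; "e1000e" is an assumed
# driver name used purely for illustration.
plan_release 2 e1000e 00:19.0 02:00.0
```

Devices that should stay with the guest would have to be attached again afterwards, which this sketch leaves out.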
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-04-16 15:30 ` Konrad Rzeszutek Wilk 2014-04-16 15:44 ` Sander Eikelenboom @ 2014-04-16 16:22 ` Sander Eikelenboom 1 sibling, 0 replies; 20+ messages in thread From: Sander Eikelenboom @ 2014-04-16 16:22 UTC (permalink / raw) To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Ian Campbell Wednesday, April 16, 2014, 5:30:57 PM, you wrote: > On Wed, Apr 02, 2014 at 12:43:12PM +0200, Sander Eikelenboom wrote: >> >> Tuesday, April 1, 2014, 6:13:09 PM, you wrote: >> >> > On Thu, Feb 20, 2014 at 05:18:46PM +0100, Sander Eikelenboom wrote: >> >> >> >> Thursday, February 20, 2014, 9:53:59 AM, you wrote: >> >> >> >> >> >> > Friday, January 24, 2014, 6:48:06 PM, you wrote: >> >> >> >> >> On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote: >> >> >>> >> >> >>> Friday, January 10, 2014, 6:38:10 PM, you wrote: >> >> >>> >> >> >>> >> > Wow. You just walked in a pile of bugs didn't you? And on Friday >> >> >>> >> > nonethless. >> >> >>> >> >> >> >>> >> As usual ;-) >> >> >>> >> >> >>> > Ha! >> >> >>> > ..snip.. >> >> >>> >> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >> >> >>> >> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >> >> >>> >> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >> >> >>> >> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >> >> >>> >> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >> >> >>> >> >> >> >>> >> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >> >> >>> >> > I totally forgot about it ! >> >> >>> >> >> >> >>> >> Got a link to that patchset ? >> >> >>> >> >> >>> > https://lkml.org/lkml/2013/12/13/315 >> >> >>> >> >> >>> >> I at least could give it a spin .. 
you never know when fortune is on your side :-) >> >> >>> >> >> >>> > It is also at this git tree: >> >> >>> >> >> >>> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the >> >> >>> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely >> >> >>> > want to merge it in your current Linus tree. >> >> >>> >> >> >>> > Thank you! >> >> >>> >> >> >>> >> >> >>> Hi Konrad, >> >> >>> >> >> >>> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) >> >> >>> seems to help with my problem,i'm no capable of using: >> >> >>> - xl pci-detach >> >> >>> - xl pci-assignable-remove >> >> >>> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind >> >> >>> >> >> >>> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-) >> >> >>> So the first 4 seem to be an improvement. >> >> >>> >> >> >>> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to give troubles of it's own. >> >> >> >> >> Could you email me your lspci output and also which devices you move/switch etc? >> >> >> >> > Hi Konrad, >> >> >> >> > At the moment i found some time to figure out what goes wrong with the xl pci-detach and xl pci-assignable-remove, i have been >> >> > able to narrow it down a bit: >> >> >> >> > The problem only occurs when you: >> >> > - passthrough 2 (or more?) pci devices assigned to a guest .. >> >> > - and only remove 1 of those devices with "xl pci-detach" followed by a "xl pci-assignable-remove" >> >> > - when you first detach both devices with "xl pci-detach" before doing the "xl pci-assignable-remove" it works ok. >> >> >> >> > In my case i'm passingthrough 2 devices (02:00.0 and 00:19.0) >> >> >> >> > I added some printk's and what i found out is that: >> >> > - after doing the pci-detach of 02:00.0, it doesn't call pcistub_put_pci_dev for that device ... 
>> >> > - but when i subsequently pci-detach the second (and last) device 00:19.0 .. it does call it for both 02:00.0 and 00:19.0 ... >> >> > - so somehow that call for the first detached device gets deferred .. but since it are different devices and not functions of the same device i don't >> >> > see any reason for it to wait until all other devices would have been detached ... >> >> >> >> >> >> > I tried to capture the console output but some how that didn't work out, so i attached a screenshot of what happens when: >> >> > - doing a xl pci-list for the guest >> >> > - doing a xl pci-assignable-list >> >> >> >> > - doing the xl pci-detach for 02:00.0 >> >> >> >> > - doing a xl pci-list for the guest >> >> > - doing a xl pci-assignable-list >> >> >> >> > - waiting some time ... >> >> >> >> > - doing the xl pci-detach for 00:19.0 >> >> >> >> > - doing a xl pci-list for the guest >> >> > - doing a xl pci-assignable-list >> >> >> >> > There you can see this strange sequence of events :-) >> >> >> >> > But i haven't been able to spot the culprit >> >> >> >> Enabled some extra debugging and added some more printk's .. (see new screenshot) >> >> >> >> From what it seems .. the frontend state for the first device isn't changed on the first pci-detach .. >> >> >> >> Is the signaling on pci-detach the guests (pcifront) responsibility or the toolstacks (libxl) ? >> >> > It usually is pcifront. And in the screenshot I see: >> > .. frontend is gone! unregister device >> > which should trigger the process. And it does look to do that. >> > Hm, I am wondering what the toolstack is waiting for. >> > Time to debug. >> >> Ok thx :-) > Just to make sure - you are not using the xen-pciback.hide parameter right? > Just doing the /sysfs dance of 'echo BDF'> to various places. I just took some dancing lessons .. and there is no difference by using xen-pciback.hide and doing the sysfs-dance before starting the domain. 
It still gives the warning that the device being pci-assignable-removed is "in use", followed by lock-ups (after a pci-detach followed by a pci-assignable-remove of only some, *not all*, of the passed-through devices).
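For readers unfamiliar with the "sysfs dance" mentioned above, the sequence usually meant is roughly the following. The thread itself does not spell these paths out, so treat them as an assumption based on how pciback is commonly driven through sysfs; the helper name plan_hide is made up, and it only prints the writes rather than performing them.

```shell
#!/bin/sh
# Dry-run sketch: print the sysfs writes commonly used to hand a device
# to pciback at runtime instead of using xen-pciback.hide at boot.
# Assumption: the driver directory is /sys/bus/pci/drivers/pciback.
#
# plan_hide BDF   (full form with domain, e.g. 0000:00:19.0)
plan_hide() {
    bdf=$1
    pciback=/sys/bus/pci/drivers/pciback
    # 1. Release the device from whatever dom0 driver currently owns it.
    echo "echo $bdf > /sys/bus/pci/devices/$bdf/driver/unbind"
    # 2. Tell pciback it may claim this slot.
    echo "echo $bdf > $pciback/new_slot"
    # 3. Bind the device to pciback.
    echo "echo $bdf > $pciback/bind"
}

plan_hide 0000:00:19.0
```

The first step only applies when a dom0 driver is bound to the device at that point.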
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-01-24 13:36 ` Sander Eikelenboom 2014-01-24 17:48 ` Konrad Rzeszutek Wilk @ 2014-01-27 16:29 ` George Dunlap 2014-01-27 16:42 ` Sander Eikelenboom 1 sibling, 1 reply; 20+ messages in thread From: George Dunlap @ 2014-01-27 16:29 UTC (permalink / raw) To: Sander Eikelenboom; +Cc: xen-devel On Fri, Jan 24, 2014 at 1:36 PM, Sander Eikelenboom <linux@eikelenboom.it> wrote: > > Friday, January 10, 2014, 6:38:10 PM, you wrote: > >>> > Wow. You just walked in a pile of bugs didn't you? And on Friday >>> > nonethless. >>> >>> As usual ;-) > >> Ha! >> ..snip.. >>> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >>> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >>> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >>> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >>> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >>> >>> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >>> > I totally forgot about it ! >>> >>> Got a link to that patchset ? > >> https://lkml.org/lkml/2013/12/13/315 > >>> I at least could give it a spin .. you never know when fortune is on your side :-) > >> It is also at this git tree: > >> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the >> branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely >> want to merge it in your current Linus tree. > >> Thank you! > > > Hi Konrad, > > Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) > seems to help with my problem,i'm no capable of using: > - xl pci-detach > - xl pci-assignable-remove > - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind Out of curiosity, have you tried adding the -r option to pci-assignable-remove? 
xl pci-assignable-add stores the driver to which the device was originally bound in xenstore; if you then run "xl pci-assignable-remove -r", it will attempt to re-bind the device to that driver.

See more information here:
http://blog.xen.org/index.php/2012/06/04/xen-4-2-preview-xl-and-pci-pass-through/

 -George
* Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove 2014-01-27 16:29 ` George Dunlap @ 2014-01-27 16:42 ` Sander Eikelenboom 0 siblings, 0 replies; 20+ messages in thread From: Sander Eikelenboom @ 2014-01-27 16:42 UTC (permalink / raw) To: George Dunlap; +Cc: xen-devel Monday, January 27, 2014, 5:29:27 PM, you wrote: > On Fri, Jan 24, 2014 at 1:36 PM, Sander Eikelenboom > <linux@eikelenboom.it> wrote: >> >> Friday, January 10, 2014, 6:38:10 PM, you wrote: >> >>>> > Wow. You just walked in a pile of bugs didn't you? And on Friday >>>> > nonethless. >>>> >>>> As usual ;-) >> >>> Ha! >>> ..snip.. >>>> >> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45 >>>> >> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9 >>>> >> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5 >>>> >> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25 >>>> >> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e >>>> >>>> > Yeah, that bug my RFC patchset (the one that does the slot/bus reset) should also fix. >>>> > I totally forgot about it ! >>>> >>>> Got a link to that patchset ? >> >>> https://lkml.org/lkml/2013/12/13/315 >> >>>> I at least could give it a spin .. you never know when fortune is on your side :-) >> >>> It is also at this git tree: >> >>> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the >>> branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely >>> want to merge it in your current Linus tree. >> >>> Thank you! >> >> >> Hi Konrad, >> >> Just got time to test this some more, when merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) >> seems to help with my problem,i'm no capable of using: >> - xl pci-detach >> - xl pci-assignable-remove >> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind > Out of curiosity, have you tried adding the -r option to pci-assignable-remove? 
> xl pci-assignable-add will store the original driver to which the
> device was bound in xenstore; if you do "xl pci-assignable-remove -r"
> it will attempt to re-bind it to that driver.

> See more information here:
> http://blog.xen.org/index.php/2012/06/04/xen-4-2-preview-xl-and-pci-pass-through/

Hi George,

Yes, I did, but since I seize the device at boot it was never bound to a driver in dom0, so that isn't useful in this case (it doesn't know what to rebind to).

But this problem seems fixed after applying the first 4 commits in Konrad's pci reset branch at:
http://git.kernel.org/cgit/linux/kernel/git/konrad/xen.git devel/xen-pciback.slot_and_bus.v0

Applying the last commit creates more problems than it fixes at the moment (hung tasks), so that one clearly needs more polishing.

So it's clearly a problem in the kernel, and in pciback specifically.

--
Sander

>  -George
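Whether a device has any driver bound at a given moment can be read from the driver symlink in its sysfs directory, which helps when checking if there is anything for the -r option to restore. A minimal sketch, assuming a readlink utility is available; the function name bound_driver and its sysfs-root argument (there so it can be pointed at a test tree) are made up for illustration.

```shell
#!/bin/sh
# Sketch: report which driver, if any, a PCI device is bound to by
# reading the "driver" symlink in its sysfs directory.
#
# bound_driver SYSFS_ROOT BDF
#   SYSFS_ROOT - normally /sys; a parameter so a fake tree can be used
#   BDF        - full form with domain, e.g. 0000:00:19.0
bound_driver() {
    link="$1/bus/pci/devices/$2/driver"
    if [ -L "$link" ]; then
        # The link target ends in the driver's name, e.g. .../drivers/pciback
        basename "$(readlink "$link")"
    else
        # No symlink: no driver is bound (or the device does not exist).
        echo "none"
    fi
}
```

On a live system this would be called as `bound_driver /sys 0000:00:19.0`.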
Thread overview: 20+ messages
2014-01-10 14:51 Xen pci-passthrough problem with pci-detach and pci-assignable-remove Sander Eikelenboom
2014-01-10 15:12 ` Konrad Rzeszutek Wilk
2014-01-10 15:57 ` Sander Eikelenboom
2014-01-10 16:12 ` Konrad Rzeszutek Wilk
2014-01-10 16:16 ` Sander Eikelenboom
2014-01-10 17:38 ` Konrad Rzeszutek Wilk
2014-01-10 18:21 ` Sander Eikelenboom
2014-01-10 18:22 ` Sander Eikelenboom
2014-01-24 13:36 ` Sander Eikelenboom
2014-01-24 17:48 ` Konrad Rzeszutek Wilk
2014-01-24 18:53 ` Sander Eikelenboom
2014-02-20 8:53 ` Sander Eikelenboom
2014-02-20 16:18 ` Sander Eikelenboom
2014-04-01 16:13 ` Konrad Rzeszutek Wilk
2014-04-02 10:43 ` Sander Eikelenboom
2014-04-16 15:30 ` Konrad Rzeszutek Wilk
2014-04-16 15:44 ` Sander Eikelenboom
2014-04-16 16:22 ` Sander Eikelenboom
2014-01-27 16:29 ` George Dunlap
2014-01-27 16:42 ` Sander Eikelenboom