From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Sander Eikelenboom <linux@eikelenboom.it>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove
Date: Fri, 10 Jan 2014 11:12:48 -0500
Message-ID: <20140110161248.GE21360@phenom.dumpdata.com>
In-Reply-To: <1087166993.20140110165729@eikelenboom.it>

On Fri, Jan 10, 2014 at 04:57:29PM +0100, Sander Eikelenboom wrote:
>
> Friday, January 10, 2014, 4:12:18 PM, you wrote:
>
> > On Fri, Jan 10, 2014 at 03:51:57PM +0100, Sander Eikelenboom wrote:
> >> Hi Konrad,
> >>
> >> Normally i'm never reattaching pci devices to dom0, but at the moment i have some use for it.
> >>
> >> But it seems pci-detach isn't completely detaching the device from the guest.
> >>
> >> - Say i have a guest (HVM) with domid=2 and a pci device passed through with bdf 00:19.0; the device is hidden on boot with xen-pciback.hide=(00:19.0) in grub.
> >>
> >> - Now i do a "xl pci-assignable-list"
> >> This returns nothing, which is correct since all hidden devices have already been assigned to guests.
> >>
> >> - Then i do "xl -v pci-detach 2 00:19.0"
> >> Which also returns nothing ...
> >>
> >> - Now i do a "xl pci-assignable-list" again ..
> >> This returns:
> >> "0000:00:19.0"
> >> So the pci-detach does seem to have done *something* :-)
>
> > Or it thinks it has :-)
>
> Well it has .. but probably not enough ;-)
>
> >>
> >> - But when i now try to remove the device from pciback to dom0 with "xl pci-assignable-remove 00:19.0" it gives an error
> >> and later it gives some stacktraces ..
> >>
> >> xen_pciback: ****** removing device 0000:00:19.0 while still in-use! ******
> >> xen_pciback: ****** driver domain may still access this device's i/o resources!
> >> xen_pciback: ****** shutdown driver domain before binding device
> >> xen_pciback: ****** to other drivers or domains
>
> > What about /var/log/xen/qemu-dm* and the 'lspci' in the guest? Is the PCI device
> > removed from there?
>
> Oeh i should have thought of that ...
>
> in the guest i get a "e1000e 0000:00:06.0 removed PHC" and it's gone from lspci ..
> in /var/log/xen/qemu-dm* .. i get nothing .. but i was using qemu-xen .. which is totally non verbose ..
>
> So let's try with qemu-xen-traditional .. which i also forgot to test ...
>
> Which gives exactly the same error / warning as above, but it has some output in /var/log/xen/qemu-dm*:
>
> pt_msgctrl_reg_write: setup msi for dev 30
> pt_msi_setup: pt_msi_setup requested pirq = 54
> pt_msi_setup: msi mapped with pirq 36
> pt_msi_update: Update msi with pirq 36 gvec 0 gflags 3036
> pt_msgctrl_reg_write: setup msi for dev 28
> pt_msi_setup: pt_msi_setup requested pirq = 53
> pt_msi_setup: msi mapped with pirq 35
> pt_msi_update: Update msi with pirq 35 gvec 0 gflags 3035
> pt_msi_update: Update msi with pirq 36 gvec 0 gflags 3034
> dm-command: hot remove pass-through pci dev
> generate a sci for PHP.
> deassert due to disable GPE bit.
> ACPI:debug: write addr=0xb044, val=0x30.
> ACPI:debug: write addr=0xb045, val=0x3.
> ACPI:debug: write addr=0xb044, val=0x30.
> ACPI:debug: write addr=0xb045, val=0x88.
> ACPI PCI hotplug: write devfn=0x30.
> pci_intx: intx=1
> pci_intx: intx=1
> pt_msi_disable: Unbind msi with pirq 36, gvec 0
> pt_msi_disable: Unmap msi with pirq 36
Good, so the device is safely removed from the guest.
QEMU acted on the 'libxl' command to remove it.
>
>
>
> Also worth mentioning is that the console on which the "xl pci-assignable-remove 00:19.0" command is given keeps hanging, and eventually the hungtask stacktrace will appear.
>
> >>
> >>
> >> When i shut the guest down instead of using pci-detach, the "xl pci-assignable-remove" works fine and i can rebind the device to its driver in dom0.
> >>
> >> So am i misreading the wiki .. and is it not possible to detach a device from a running domain or ... ?
> >>
> >> Oh yes running xen-unstable and a 3.13-rc7 kernel
>
> > Do you see the same issue with 'xend'?
>
> Erhmmm haven't used that for what seems to be ages .. :-)
Heh.
>
> Hmm i also forgot the hungtask stacktrace i get sometime after the "xl pci-assignable-remove 00:19.0" ...
Wow. You just walked into a pile of bugs, didn't you? And on a Friday,
nonetheless.
>
> It seems to be the pci_reset_function ...
>
> [ 52.099144] xen_bridge: port 4(vif2.0-emu) entered forwarding state
> [ 55.683141] xen_bridge: port 1(vif1.0) entered forwarding state
> [ 59.861385] xen-blkback:ring-ref 8, event-channel 22, protocol 1 (x86_64-abi) persistent grants
> [ 66.043965] xen_bridge: port 3(vif2.0) entered forwarding state
> [ 66.044549] xen_bridge: port 3(vif2.0) entered forwarding state
> [ 81.091149] xen_bridge: port 3(vif2.0) entered forwarding state
> [ 227.441191] xen_pciback: ****** removing device 0000:00:19.0 while still in-use! ******
> [ 227.443482] xen_pciback: ****** driver domain may still access this device's i/o resources!
> [ 227.445811] xen_pciback: ****** shutdown driver domain before binding device
> [ 227.447811] xen_pciback: ****** to other drivers or domains
> [ 368.859343] INFO: task xl:3675 blocked for more than 120 seconds.
> [ 368.860447] Not tainted 3.13.0-rc7-20140110-creabox-nuc+ #1
> [ 368.860990] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 368.861682] xl D ffff88003fd93f00 0 3675 3489 0x00000000
> [ 368.862319] ffff880038c0e880 0000000000000282 0000000000000000 ffff880038fd03d0
> [ 368.863035] 0000000000013f00 0000000000013f00 ffff880038c0e880 ffff880036abffd8
> [ 368.863802] ffffffff81087ac6 ffff88003a0f00f8 ffff88003a0f00fc ffff880038c0e880
> [ 368.864514] Call Trace:
> [ 368.864744] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
> [ 368.865273] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
> [ 368.865851] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
> [ 368.866409] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
> [ 368.866892] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e
> [ 368.867430] [<ffffffff818e7dc1>] ? _raw_spin_lock_irqsave+0x14/0x36
> [ 368.867996] [<ffffffff818e7238>] ? down_write+0x9/0x26
> [ 368.868467] [<ffffffff813f1863>] ? pcistub_put_pci_dev+0x7b/0xe0
> [ 368.868991] [<ffffffff813f14a7>] ? pcistub_remove+0xd0/0x127
> [ 368.869506] [<ffffffff8135b5b8>] ? pci_device_remove+0x38/0x83
> [ 368.870017] [<ffffffff814cb37f>] ? __device_release_driver+0x82/0xdb
> [ 368.870593] [<ffffffff814cb602>] ? device_release_driver+0x1a/0x25
> [ 368.871152] [<ffffffff814ca993>] ? unbind_store+0x59/0x89
> [ 368.871659] [<ffffffff81178aa0>] ? sysfs_write_file+0x13f/0x18f
> [ 368.872173] [<ffffffff81122aa6>] ? vfs_write+0x95/0xfb
> [ 368.872641] [<ffffffff81122d8a>] ? SyS_write+0x51/0x85
> [ 368.873087] [<ffffffff818ed179>] ? system_call_fastpath+0x16/0x1b
> [ 488.871331] INFO: task xl:3675 blocked for more than 120 seconds.
> [ 488.913929] Not tainted 3.13.0-rc7-20140110-creabox-nuc+ #1
> [ 488.937031] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 488.960945] xl D ffff88003fd93f00 0 3675 3489 0x00000004
> [ 488.986090] ffff880038c0e880 0000000000000282 0000000000000000 ffff880038fd03d0
> [ 489.010383] 0000000000013f00 0000000000013f00 ffff880038c0e880 ffff880036abffd8
> [ 489.034456] ffffffff81087ac6 ffff88003a0f00f8 ffff88003a0f00fc ffff880038c0e880
> [ 489.058621] Call Trace:
> [ 489.082358] [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
> [ 489.106272] [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
> [ 489.130158] [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
> [ 489.154147] [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
> [ 489.177890] [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e
Yeah, my RFC patchset (the one that does the slot/bus reset) should also
fix that bug, I hope.
I totally forgot about it!
> [ 489.200927] [<ffffffff818e7dc1>] ? _raw_spin_lock_irqsave+0x14/0x36
> [ 489.224076] [<ffffffff818e7238>] ? down_write+0x9/0x26
> [ 489.246898] [<ffffffff813f1863>] ? pcistub_put_pci_dev+0x7b/0xe0
> [ 489.270086] [<ffffffff813f14a7>] ? pcistub_remove+0xd0/0x127
> [ 489.293053] [<ffffffff8135b5b8>] ? pci_device_remove+0x38/0x83
> [ 489.316068] [<ffffffff814cb37f>] ? __device_release_driver+0x82/0xdb
> [ 489.338896] [<ffffffff814cb602>] ? device_release_driver+0x1a/0x25
> [ 489.362459] [<ffffffff814ca993>] ? unbind_store+0x59/0x89
> [ 489.385396] [<ffffffff81178aa0>] ? sysfs_write_file+0x13f/0x18f
> [ 489.408605] [<ffffffff81122aa6>] ? vfs_write+0x95/0xfb
> [ 489.431407] [<ffffffff81122d8a>] ? SyS_write+0x51/0x85
> [ 489.454251] [<ffffffff818ed179>] ? system_call_fastpath+0x16/0x1b
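To make the blocked call chain in the trace above easier to read, here is a minimal sketch that pulls the symbol names out of hung-task trace lines and reports which function called `mutex_lock`. The helper names `trace_symbols` and `caller_of` are hypothetical, and it assumes a `grep` that supports `-o` (GNU and BSD grep do). Note that every frame in this trace is marked `?`, so the chain comes from a stack scan and should be read as a hint rather than proof.

```shell
#!/bin/sh
# Print the symbol names found in kernel stack-trace lines on stdin,
# one per line, in the order dmesg printed them (innermost frame first).
# Quote markers ("> "), timestamps and addresses are ignored.
trace_symbols() {
    grep -Eo '[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+/0x[0-9a-f]+' |
        sed 's/+.*//'
}

# Given one symbol per line on stdin, print the symbol that directly
# follows the first occurrence of "$1", i.e. its apparent caller.
caller_of() {
    awk -v sym="$1" 'prev == sym { print; exit } { prev = $0 }'
}
```

Run as `dmesg | trace_symbols | caller_of mutex_lock`, this should name `pci_reset_function` for the trace quoted above, which matches the observation that the hang is in the reset path.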
>
>
> >>
> >> --
> >> Sander
> >>
> >>
>
>
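For anyone wanting to retrace the report, the sequence of commands described in the quoted message can be sketched as a small script. The `xl` subcommands, the domid (2) and the BDF (00:19.0) are taken from the report; the `reproduce` function and the `XL` override variable are hypothetical additions for illustration. Per the report, the last step printed the xen_pciback "still in-use" warning and then hung, so this is best tried on a test box only.

```shell
#!/bin/sh
# XL can be overridden (e.g. with a stub) to dry-run the sequence
# without touching a real Xen host.
XL=${XL:-xl}

# reproduce DOMID BDF: run the four steps from the report in order.
reproduce() {
    domid=$1
    bdf=$2
    $XL pci-assignable-list              # reported: prints nothing
    $XL -v pci-detach "$domid" "$bdf"    # reported: prints nothing
    $XL pci-assignable-list              # reported: now lists 0000:00:19.0
    $XL pci-assignable-remove "$bdf"     # reported: warning, then hang
}
```

Calling `reproduce 2 00:19.0` runs the steps against the guest from the report. According to the same report, shutting the guest down instead of using pci-detach lets the final pci-assignable-remove succeed.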