From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Sander Eikelenboom <linux@eikelenboom.it>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Ian Campbell <Ian.Campbell@citrix.com>
Subject: Re: Xen pci-passthrough problem with pci-detach and pci-assignable-remove
Date: Tue, 1 Apr 2014 12:13:09 -0400
Message-ID: <20140401161309.GA10072@phenom.dumpdata.com>
In-Reply-To: <929649832.20140220171846@eikelenboom.it>

On Thu, Feb 20, 2014 at 05:18:46PM +0100, Sander Eikelenboom wrote:
> 
> Thursday, February 20, 2014, 9:53:59 AM, you wrote:
> 
> 
> > Friday, January 24, 2014, 6:48:06 PM, you wrote:
> 
> >> On Fri, Jan 24, 2014 at 02:36:02PM +0100, Sander Eikelenboom wrote:
> >>> 
> >>> Friday, January 10, 2014, 6:38:10 PM, you wrote:
> >>> 
> >>> >> > Wow. You just walked in a pile of bugs, didn't you? And on Friday
> >>> >> > nonetheless.
> >>> >> 
> >>> >> As usual ;-)
> >>> 
> >>> > Ha!
> >>> > ..snip..
> >>> >> >> [  489.082358]  [<ffffffff81087ac6>] ? mutex_spin_on_owner+0x38/0x45
> >>> >> >> [  489.106272]  [<ffffffff818e5e22>] ? schedule_preempt_disabled+0x6/0x9
> >>> >> >> [  489.130158]  [<ffffffff818e7034>] ? __mutex_lock_slowpath+0x159/0x1b5
> >>> >> >> [  489.154147]  [<ffffffff818e70a6>] ? mutex_lock+0x16/0x25
> >>> >> >> [  489.177890]  [<ffffffff8135972d>] ? pci_reset_function+0x26/0x4e
> >>> >> 
> >>> >> > Yeah, my RFC patchset (the one that does the slot/bus reset) should also fix that bug.
> >>> >> > I totally forgot about it!
> >>> >> 
> >>> >> Got a link to that patchset ?
> >>> 
> >>> > https://lkml.org/lkml/2013/12/13/315
> >>> 
> >>> >> I at least could give it a spin .. you never know when fortune is on your side :-)
> >>> 
> >>> > It is also at this git tree:
> >>> 
> >>> > git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git and the
> >>> > branch name is "devel/xen-pciback.slot_and_bus.v0". You will likely
> >>> > want to merge it in your current Linus tree.
> >>> 
> >>> > Thank you!
> >>> 
> >>> 
> >>> Hi Konrad,
> >>> 
> >>> Just got time to test this some more. Merging this branch *except* the last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd)
> >>> seems to help with my problem; I'm now capable of using:
> >>> - xl pci-detach
> >>> - xl pci-assignable-remove
> >>> - echo "BDF" > /sys/bus/pci/drivers/<devicename>/bind
> >>> 
> >>> to remove a pci device from a running HVM guest and rebinding it to a driver in dom0 without those nasty stacktraces :-)
> >>> So the first 4 seem to be an improvement.
> >>> 
> >>> That last commit (9599a5ad38a3bb250e996ccb2cdaab6fb68aaacd) seems to cause trouble of its own.
> 
> >> Could you email me your lspci output and also which devices you move/switch etc?
> 
> > Hi Konrad,
> 
> > I finally found some time to figure out what goes wrong with the xl pci-detach and xl pci-assignable-remove, and I have been
> > able to narrow it down a bit:
> 
> > The problem only occurs when you:
> > - pass through 2 (or more?) PCI devices to a guest ..
> > - and only remove 1 of those devices with "xl pci-detach" followed by a "xl pci-assignable-remove"
> > - when you first detach both devices with "xl pci-detach" before doing the "xl pci-assignable-remove" it works ok.
> 
> > In my case I'm passing through 2 devices (02:00.0 and 00:19.0)
> 
> > I added some printk's and what i found out is that:
> > - after doing the pci-detach of 02:00.0, it doesn't call pcistub_put_pci_dev for that device ...
> > - but when i subsequently pci-detach the second (and last) device 00:19.0 .. it does call it for both 02:00.0 and 00:19.0 ...
> > - so somehow that call for the first detached device gets deferred .. but since they are different devices and not functions of the same device I don't
> >   see any reason for it to wait until all other devices have been detached ...
> 
> 
> > I tried to capture the console output but somehow that didn't work out, so I attached a screenshot of what happens when:
> > - doing a xl pci-list for the guest
> > - doing a xl pci-assignable-list
> 
> > - doing the xl pci-detach for 02:00.0
> 
> > - doing a xl pci-list for the guest
> > - doing a xl pci-assignable-list
> 
> > - waiting some time ...
> 
> > - doing the xl pci-detach for 00:19.0
> 
> > - doing a xl pci-list for the guest
> > - doing a xl pci-assignable-list
> 
> > There you can see this strange sequence of events :-)
> 
> > But i haven't been able to spot the culprit
> 
> Enabled some extra debugging and added some more printk's .. (see new screenshot)
> 
> From what it seems .. the frontend state for the first device isn't changed on the first pci-detach ..
> 
> Is the signaling on pci-detach the guest's (pcifront) responsibility or the toolstack's (libxl)?

It usually is pcifront. And in the screenshot I see:

.. frontend is gone! unregister device

which should trigger the process. And it does appear to do that.

Hm, I am wondering what the toolstack is waiting for.

Time to debug.
> 
> 
> 
> > attached: screenshot.jpg

and thanks for the screenshot (didn't have copy-n-paste option handy :-))
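
For reference, the test sequence described in the quoted report (list, detach 02:00.0, list, detach 00:19.0, list) can be sketched as a small dry-run helper. This is only an illustration, not something from the thread: the domain name "guest" and the function name are hypothetical, the commands are printed rather than executed, and the "waiting some time" step is left out.

```python
from typing import List


def detach_sequence(domain: str, bdfs: List[str]) -> List[List[str]]:
    """Build the xl command sequence from the report: list the guest's
    devices and the assignable pool, then detach each device in turn,
    re-listing after every detach."""
    cmds = [
        ["xl", "pci-list", domain],
        ["xl", "pci-assignable-list"],
    ]
    for bdf in bdfs:
        cmds.append(["xl", "pci-detach", domain, bdf])
        cmds.append(["xl", "pci-list", domain])
        cmds.append(["xl", "pci-assignable-list"])
    return cmds


if __name__ == "__main__":
    # "guest" is a placeholder domain name; the BDFs are the two from the report.
    for cmd in detach_sequence("guest", ["02:00.0", "00:19.0"]):
        print(" ".join(cmd))
```

Printing the commands instead of running them keeps the sketch usable on a machine without Xen; the output can be pasted into a dom0 shell one line at a time while watching the pciback messages.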
