All of lore.kernel.org
 help / color / mirror / Atom feed
From: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wl@xen.org>,
	George Dunlap <george.dunlap@citrix.com>,
	Julien Grall <julien@xen.org>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Paul Durrant <paul@xen.org>, Kevin Tian <kevin.tian@intel.com>
Subject: Re: [PATCH v3 2/6] xen: pci: introduce reference counting for pdev
Date: Fri, 21 Apr 2023 14:13:43 +0000	[thread overview]
Message-ID: <87o7nh727c.fsf@epam.com> (raw)
In-Reply-To: <ZEKLN8AlzDUckorU@Air-de-Roger>


Hi Roger,

Roger Pau Monné <roger.pau@citrix.com> writes:

> On Fri, Apr 21, 2023 at 11:00:23AM +0000, Volodymyr Babchuk wrote:
>> 
>> Hello Roger,
>> 
>> Roger Pau Monné <roger.pau@citrix.com> writes:
>> 
>> > On Mon, Apr 17, 2023 at 12:34:31PM +0200, Jan Beulich wrote:
>> >> On 17.04.2023 12:17, Roger Pau Monné wrote:
>> >> > On Fri, Apr 14, 2023 at 01:30:39AM +0000, Volodymyr Babchuk wrote:
>> >> >> Above I have proposed another view on this. I hope, it will work for
>> >> >> you. Just to reiterate, idea is to allow "harmless" refcounts to be left
>> >> >> after returning from pci_remove_device(). By "harmless" I mean that
>> >> >> owners of those refcounts will not try to access the physical PCI
>> >> >> device if pci_remove_device() is already finished.
>> >> > 
>> >> > I'm not strictly a maintainer of this piece code, albeit I have an
>> >> > opinion.  I will like to also hear Jans opinion, since he is the
>> >> > maintainer.
>> >> 
>> >> I'm afraid I can't really appreciate the term "harmless refcounts". Whoever
>> >> holds a ref is entitled to access the device. As stated before, I see only
>> >> two ways of getting things consistent: Either pci_remove_device() is
>> >> invoked upon dropping of the last ref,
>> >
>> > With this approach, what would be the implementation of
>> > PHYSDEVOP_manage_pci_remove?  Would it just check whether the pdev
>> > exist and either return 0 or -EBUSY?
>> >
>> 
>> Okay, I am preparing patches with the behavior you proposed. To test it,
>> I artificially set refcount to 2 and indeed PHYSDEVOP_manage_pci_remove
>> returned -EBUSY, which propagated to the linux driver. Problem is that
>> Linux driver can't do anything with this. It just displayed an error
>> message and removed device anyways. This is because Linux sends
>> PHYSDEVOP_manage_pci_remove in device_remove() call path and there is no
>> way to prevent the device removal. So, admin is not capable to try this
>> again.
>
> Ideally Linux won't remove the device, and then the admin would get to
> retry.  Maybe the way the Linux hook is placed is not the best one?
> The hypervisor should be authoritative on whether a device can be
> removed or not, and hence PHYSDEVOP_manage_pci_remove returning an
> error (EBUSY or otherwise) shouldn't allow the device unplug in Linux
> to continue.

Yes, it would be ideally, but Linux driver/device model is written in a
such way, that PCI subsystem tracks all the PCI device usage, so it can
be certain that it can remove the device. Thus, functions in the device
removal path either return void or 0. Of course, kernel does not know that
hypervisor has additional uses for the device, so there is no mechanisms
to prevent removal.

> We could add a PHYSDEVOP_manage_pci_test or similar that could be
> programmatically used to check whether a device has a matching pdev in
> the hypervisor, but I have no idea how that could be used by Linux so
> it's exposed to the user, and it seems to just make the interface more
> complicated for noo real benefit, when the same could be accomplished
> by PHYSDEVOP_manage_pci_remove.

We can ignore the kernel behavior and just call
PHYSDEVOP_manage_pci_remove from toolstack. Something like "xl
pci-hotunplug SBFD". But yes, this will make interface more complicated.

> Maybe the only feasible solution is for pci_remove_device() to drop a
> reference expecting it would be the last one, and print a warning
> message if it's not and return -EBUSY.  Expecting any remaining
> references to be dropped and the backing pdev to be freed.

So, basically in the same way as I proposed initially?

-- 
WBR, Volodymyr

  reply	other threads:[~2023-04-21 14:14 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-14 20:56 [PATCH v3 0/6] vpci: first series in preparation for vpci on ARM Volodymyr Babchuk
2023-03-14 20:56 ` [PATCH v3 2/6] xen: pci: introduce reference counting for pdev Volodymyr Babchuk
2023-03-16 16:16   ` Roger Pau Monné
2023-03-29  9:55     ` Jan Beulich
2023-03-29 10:48       ` Roger Pau Monné
2023-03-29 11:58         ` Jan Beulich
2023-04-11 23:41     ` Volodymyr Babchuk
2023-04-12  9:13       ` Roger Pau Monné
2023-04-12 21:54         ` Volodymyr Babchuk
2023-04-13 15:00           ` Roger Pau Monné
2023-04-14  1:30             ` Volodymyr Babchuk
2023-04-17 10:17               ` Roger Pau Monné
2023-04-17 10:34                 ` Jan Beulich
2023-04-17 10:51                   ` Roger Pau Monné
2023-04-17 11:02                     ` Jan Beulich
2023-04-21 11:00                     ` Volodymyr Babchuk
2023-04-21 12:24                       ` Jan Beulich
2023-04-21 13:02                         ` Volodymyr Babchuk
2023-04-21 13:10                       ` Roger Pau Monné
2023-04-21 14:13                         ` Volodymyr Babchuk [this message]
2023-04-24  7:46                           ` Jan Beulich
2023-04-24 14:15                             ` Volodymyr Babchuk
2023-04-24 14:27                               ` Jan Beulich
2023-03-29 10:04   ` Jan Beulich
2023-03-14 20:56 ` [PATCH v3 1/6] xen: add reference counter support Volodymyr Babchuk
2023-03-16 13:54   ` Roger Pau Monné
2023-03-16 14:03     ` Jan Beulich
2023-03-16 16:21       ` Roger Pau Monné
2023-04-11 22:27     ` Volodymyr Babchuk
2023-04-12 10:12       ` Roger Pau Monné
2023-03-16 16:19   ` Roger Pau Monné
2023-03-16 16:32     ` Jan Beulich
2023-03-16 16:39       ` Roger Pau Monné
2023-03-16 16:43         ` Jan Beulich
2023-03-16 16:48           ` Roger Pau Monné
2023-03-16 16:56             ` Jan Beulich
2023-03-17 10:05               ` Roger Pau Monné
2023-03-17 14:46                 ` Jan Beulich
2023-03-16 17:01   ` Jan Beulich
2023-04-11 22:38     ` Volodymyr Babchuk
2023-04-17  6:47       ` Jan Beulich
2023-03-14 20:56 ` [PATCH v3 5/6] vpci: use reference counter to protect vpci state Volodymyr Babchuk
2023-03-17  8:43   ` Roger Pau Monné
2023-03-29  9:31     ` Jan Beulich
2023-03-14 20:56 ` [PATCH v3 4/6] vpci: restrict unhandled read/write operations for guests Volodymyr Babchuk
2023-03-17  8:37   ` Roger Pau Monné
2023-03-14 20:56 ` [PATCH v3 6/6] xen: pci: print reference counter when dumping pci_devs Volodymyr Babchuk
2023-03-17  8:46   ` Roger Pau Monné
2023-03-14 20:56 ` [PATCH v3 3/6] vpci: crash domain if we wasn't able to (un) map vPCI regions Volodymyr Babchuk
2023-03-16 16:32   ` Roger Pau Monné

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o7nh727c.fsf@epam.com \
    --to=volodymyr_babchuk@epam.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=julien@xen.org \
    --cc=kevin.tian@intel.com \
    --cc=paul@xen.org \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.