From: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: "Julien Grall" <julien@xen.org>,
	"Stewart Hildebrand" <stewart.hildebrand@amd.com>,
	"Oleksandr Andrushchenko" <Oleksandr_Andrushchenko@epam.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"George Dunlap" <george.dunlap@citrix.com>,
	"Jan Beulich" <jbeulich@suse.com>, "Wei Liu" <wl@xen.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v10 13/17] vpci: add initial support for virtual PCI bus topology
Date: Fri, 17 Nov 2023 00:21:41 +0000
Message-ID: <87o7ft44bv.fsf@epam.com>
In-Reply-To: <alpine.DEB.2.22.394.2311161513210.773207@ubuntu-linux-20-04-desktop>


Hi Stefano,

Stefano Stabellini <sstabellini@kernel.org> writes:

> On Thu, 16 Nov 2023, Julien Grall wrote:
>> IIUC, this means that Xen will allocate the BDF. I think this will become a
>> problem quite quickly as some of the PCI may need to be assigned at a specific
>> vBDF (I have the intel graphic card in mind).
>> 
>> Also, xl allows you to specify the slot (e.g. <bdf>@<vslot>) which would not
>> work with this approach.
>> 
>> For dom0less passthrough, I feel the virtual BDF should always be specified in
>> device-tree. When a domain is created after boot, then I think you want to
>> support <bdf>@<vslot> where <vslot> is optional.
>
> Hi Julien,
>
> I also think there should be a way to specify the virtual BDF, but if
> possible (meaning: it is not super difficult to implement) I think it
> would be very convenient if we could let Xen pick whatever virtual BDF
> Xen wants when the user doesn't specify the virtual BDF. That's
> because it would make it easier to specify the configuration for the
> user. Typically the user doesn't care about the virtual BDF, only to
> expose a specific host device to the VM. There are exceptions of course
> and that's why I think we should also have a way for the user to
> request a specific virtual BDF. One of these exceptions are integrated
> GPUs: the OS drivers used to have hardcoded BDFs. So it wouldn't work if
> the device shows up at a different virtual BDF compared to the host.
>
> Thinking more about this, one way to simplify the problem would be if we
> always reuse the physical BDF as virtual BDF for passthrough devices. I
> think that would solve the problem and make it much less likely to
> run into driver bugs.

I'm not sure that this is possible. AFAIK, if we have a device with B>0,
we need to have a bridge device for it. So, if I want to pass through
device 08:00.0, I need to provide a virtual bridge with BDF 0:NN.0. This
unnecessarily complicates things.
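
To make the bus-number point concrete, here is a minimal sketch using the
standard 16-bit BDF layout (bus in bits 15:8, device in bits 7:3,
function in bits 2:0). The helper names, including
vbdf_needs_virtual_bridge(), are hypothetical and are not existing Xen
functions. The check only encodes the assumption stated above: anything
placed on a bus other than 0 would need a bridge in front of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard 16-bit PCI BDF layout: bus[15:8], device[7:3], function[2:0]. */
static inline uint16_t bdf_pack(uint8_t bus, uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);
    return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

static inline uint8_t bdf_bus(uint16_t bdf) { return (uint8_t)(bdf >> 8); }
static inline uint8_t bdf_dev(uint16_t bdf) { return (bdf >> 3) & 0x1f; }
static inline uint8_t bdf_fn(uint16_t bdf)  { return bdf & 0x7; }

/*
 * Hypothetical helper (not a Xen API): under the assumption that only
 * bus 0 hangs directly off the emulated root complex, a virtual BDF on
 * any other bus implies that a virtual bridge must be exposed as well.
 */
static inline bool vbdf_needs_virtual_bridge(uint16_t vbdf)
{
    return bdf_bus(vbdf) != 0;
}
```

With these helpers, reusing physical 08:00.0 as the virtual BDF reports
that a bridge is needed, while a device at 00:11.5 does not.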

Also, there can be a funny situation with conflicting BDF numbers exposed
by different domains. I know that this is not your typical setup, but
imagine that Domain A acts as a driver domain for PCI controller A and
Domain B acts as a driver domain for PCI controller B. They may expose
devices with the same BDFs but with different segments.
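
A small sketch of that clash, assuming the guest sees a single virtual
segment so that only the 16-bit BDF survives. struct phys_sbdf and the
functions below are hypothetical names made up for illustration, not
Xen types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Physical PCI address including the segment (hypothetical type). */
struct phys_sbdf {
    uint16_t seg;
    uint8_t bus;
    uint8_t dev; /* 0..31 */
    uint8_t fn;  /* 0..7 */
};

/* Drop the segment, keeping the 16-bit bus/device/function value. */
static inline uint16_t sbdf_to_bdf(struct phys_sbdf s)
{
    return (uint16_t)((s.bus << 8) | ((s.dev & 0x1f) << 3) | (s.fn & 0x7));
}

/* Two addresses name the same physical device only if the segment matches too. */
static inline bool same_device(struct phys_sbdf a, struct phys_sbdf b)
{
    return a.seg == b.seg && sbdf_to_bdf(a) == sbdf_to_bdf(b);
}

/*
 * Hypothetical check: if the physical BDF is reused as the virtual BDF,
 * two distinct devices clash when they differ only in their segment.
 */
static inline bool vbdf_reuse_would_clash(struct phys_sbdf a, struct phys_sbdf b)
{
    return !same_device(a, b) && sbdf_to_bdf(a) == sbdf_to_bdf(b);
}
```

For example, 0000:01:00.0 behind controller A and 0001:01:00.0 behind
controller B are different devices, yet both would map to virtual
01:00.0.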

> And we allocate a "special" virtual BDF space for emulated devices, with
> the Root Complex still emulated in Xen. For instance, we could reserve
> ff:xx:xx and in case of clashes we could refuse to continue. Or we could
> allocate the first free virtual BDF, after all the passthrough devices.

Again, I may be wrong here, but we need an emulated PCI bridge device if we
want to use bus numbers > 0.

>
> Example:
> - the user wants to assign physical 00:11.5 and b3:00.1 to the guest
> - Xen creates virtual BDFs 00:11.5 and b3:00.1 for the passthrough devices
> - Xen allocates the next virtual BDF for emulated devices: b4:xx.x
> - If more virtual BDFs are needed for emulated devices, Xen allocates
>   b5:xx.x
>
> I still think, no matter the BDF allocation scheme, that we should try
> to avoid as much as possible to have two different PCI Root Complex
> emulators. Ideally we would have only one PCI Root Complex emulated by
> Xen. Having 2 PCI Root Complexes both of them emulated by Xen would be
> tolerable but not ideal.

But what exactly is wrong with this setup?

> The worst case I would like to avoid is to have
> two PCI Root Complexes, one emulated by Xen and one emulated by QEMU.

This is how our setup works right now.

I agree that we need some way to provide static vBDF numbers, but I am
wondering what the best way to do this is. We need some entity that
manages and assigns those vBDFs. It should reside in Xen, because of the
dom0less use case. We probably need to extend
xen_domctl_assign_device so we can request either a free vBDF or a
specific vBDF. In the first case, Xen should return the assigned vBDF so
the toolstack can give it to a backend if the PCI device is purely virtual.
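
Roughly, the "free or specific vBDF" request could behave like the
sketch below. Everything in it is an assumption for illustration:
VBDF_ANY, struct vbus0 and vbdf_assign() are hypothetical and do not
exist in Xen, and the sketch does not show how xen_domctl_assign_device
would actually be extended. It also only handles function 0 of the 32
slots on virtual bus 0, which sidesteps the bridge question.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "pick any free vBDF for me". */
#define VBDF_ANY 0xffffu

/* Hypothetical per-domain state: one bit per device slot on virtual bus 0. */
struct vbus0 {
    uint32_t used_slots;
};

/*
 * Assign a virtual BDF. If 'requested' is VBDF_ANY, take the first free
 * slot; otherwise try to claim exactly the requested one. On success the
 * chosen vBDF is written to '*assigned' so the caller (e.g. a toolstack)
 * can pass it on. Returns 0 or a negative errno value.
 */
static int vbdf_assign(struct vbus0 *bus, uint16_t requested, uint16_t *assigned)
{
    unsigned int slot;

    assert(bus && assigned);

    if (requested == VBDF_ANY) {
        for (slot = 0; slot < 32; slot++)
            if (!(bus->used_slots & (UINT32_C(1) << slot)))
                break;
        if (slot == 32)
            return -ENOSPC;          /* every slot on bus 0 is taken */
    } else {
        /* This sketch only supports bus 0, function 0. */
        if ((requested >> 8) != 0 || (requested & 0x7) != 0)
            return -EINVAL;
        slot = (requested >> 3) & 0x1f;
        if (bus->used_slots & (UINT32_C(1) << slot))
            return -EBUSY;           /* the specific vBDF is already in use */
    }

    bus->used_slots |= UINT32_C(1) << slot;
    *assigned = (uint16_t)(slot << 3);
    return 0;
}
```

A caller that does not care passes VBDF_ANY and reads back the result; a
caller that needs a fixed address (the integrated GPU case) passes that
address and gets an error if it is already taken.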

-- 
WBR, Volodymyr
