From: Don Slutz <dslutz@verizon.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
Paul Durrant <Paul.Durrant@citrix.com>
Cc: Stefano Stabellini <Stefano.Stabellini@citrix.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Slutz, Donald Christopher" <dslutz@verizon.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Qemu-devel] [PATCH v2 0/4] Fix device listener interface for PCI to PCI bridges
Date: Tue, 09 Jun 2015 11:10:31 -0400 [thread overview]
Message-ID: <557701E7.9030409@Verizon.com> (raw)
In-Reply-To: <20150609161937-mutt-send-email-mst@redhat.com>
On 06/09/15 10:27, Michael S. Tsirkin wrote:
> On Tue, Jun 09, 2015 at 02:14:29PM +0000, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Michael S. Tsirkin [mailto:mst@redhat.com]
>>> Sent: 09 June 2015 13:30
>>> To: Paul Durrant
>>> Cc: Don Slutz; qemu-devel@nongnu.org; xen-devel@lists.xen.org; Stefano
>>> Stabellini
>>> Subject: Re: [PATCH v2 0/4] Fix device listener interface for PCI to PCI bridges
>>>
>>> On Tue, Jun 09, 2015 at 10:58:26AM +0000, Paul Durrant wrote:
>>>>> -----Original Message-----
>>>>> From: Michael S. Tsirkin [mailto:mst@redhat.com]
>>>>> Sent: 09 June 2015 11:52
>>>>> To: Paul Durrant
>>>>> Cc: Don Slutz; qemu-devel@nongnu.org; xen-devel@lists.xen.org;
>>> Stefano
>>>>> Stabellini
>>>>> Subject: Re: [PATCH v2 0/4] Fix device listener interface for PCI to PCI
>>> bridges
>>>>>
>>>>> On Tue, Jun 09, 2015 at 09:18:49AM +0000, Paul Durrant wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Michael S. Tsirkin [mailto:mst@redhat.com]
>>>>>>> Sent: 09 June 2015 10:13
>>>>>>> To: Don Slutz
>>>>>>> Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org; Paul Durrant;
>>>>>>> Stefano Stabellini
>>>>>>> Subject: Re: [PATCH v2 0/4] Fix device listener interface for PCI to PCI
>>>>> bridges
>>>>>>>
>>>>>>> On Mon, Jun 08, 2015 at 05:18:48PM -0400, Don Slutz wrote:
>>>>>>>> changes v1 to v2:
>>>>>>>> Split v1 patch into 3.
>>>>>>>>
>>>>>>>> Added a BUG FIX patch (#1: "exec: Do not use MemoryRegion
>>> after
>>>>>>>> free").
>>>>>>>>
>>>>>>>> Technically this bug fix should be a separate patch, however this
>>>>>>>> issue only seems to reproduce when this patch set is applied.
>>>>>>>>
>>>>>>>>
>>>>>>>> Michael S. Tsirkin:
>>>>>>>> "You need some other API that makes sense, probably PCI
>>> specific."
>>>>>>>> This is basically patch #2: "Extend device listener interface..."
>>>>>>>>
>>>>>>>> "This is relying on undocumented assumptions and how specific
>>>>>>>> firmware works. There's nothing special about bus number 255,
>>>>>>>> and 0 is not very special either (except it happens to be the
>>>>>>>> reset value)."
>>>>>>>> Dropped all checking of 0 and 255.
>>>>>>>>
>>>>>>>>
>>>>>>>> Open question by Michael S. Tsirkin:
>>>>>>>>
>>>>>>>>>>>> On Thu, May 28, 2015 at 07:25:50AM -0400, Don Slutz wrote:
>>>>>>>> ...
>>>>>>>>>>>> It is not clear to me that the complexity of tracking bus
>>>>>>>>>>>> visibility make sense. Clearly you do.
>>>>>>>>>>> Well what was the point of the change?
>>>>>>>>>
>>>>>>>>> To get config cycles that are valid that you do not get today.
>>>>>>>>>
>>>>>>>>>>> If you don't care that we get irrelevant config cycles why not
>>>>>>>>>>> just send them all to QEMU?
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That would need to be answered by Paul Durrant and/or other
>>>>> people
>>>>>>> (see
>>>>>>>>> below)
>>>>>>>>>
>>>>>>>>
>>>>>>>> We could broadcast config space ioreqs to all emulators, the
>>> problem
>>>>>>>> is how do we know (in the case of a read) which emulator is actually
>>>>>>>> the one supplying the data? At some level Xen needs to know who
>>> is
>>>>>>>> implementing what.
>>>>>>>>
>>>>>>>> Paul Durrant
>>>>>>>
>>>>>>> Can irrelevant emulators all respond with some kind of nack?
>>>>>>> Xen would pick the one that did respond correctly.
>>>>>>>
>>>>>>
>>>>>> I guess that's possible if we add an extra bit to the ioreq_t, but QEMU
>>>>> would still need to know when to nack and when to ack.
>>>>>
>>>>> It's simple: ack if we find a device handling the specific BDF.
>>>>> The result would at least be contained.
>>>>> OTOH detecting master aborts in core is useful since it would
>>>>> let us implement error reporting correctly.
>>>>>
>>>>>
>>>>>> It's still much simpler if qemu updates Xen with exact set of (S)BDFs it's
>>>>> handling.
>>>>>>
>>>>>> Paul
>>>>>
>>>>>
>>>>> I suspect this calls for a bigger change, you need to re-scan
>>>>> all of the tree to detect visible devices.
>>>>> Maybe just write some xen-specific code to do this on each
>>>>> config access.
>>>>
>>>> Well, that's the thing really... what triggers the re-scan. Do we really need
>>> to scan on each access or is there a way to know when the topology is
>>> changed? Doing the former doesn't really sound wonderfully efficient and, if
>>> the answer to the second is yes, then that's the time to update Xen with the
>>> currently valid set of BDFs.
>>>>
>>>> Paul
>>>
>>>
>>> Several things can trigger a topology change.
>>
>> Well, IMO those need to be enumerated and suitable hooks need to be put in place so that Xen can be informed of the changes.
>>
This patch adds one of the hooks (the one that lets Xen know of the
change to BDF).
The hook that can report on the accessibility of a given PCI device is
much more complex since there is no simple way to say "PCI device pvscsi
#2 is not accessible" since there can be multiple PCI devices with the
same name and they can start out all having the BDF. One would need to
invent a way to describe the PCI tree and pass that around.
>> Paul
>
> pci_data_write and pci_data_read already do bus lookups.
> It seems that returning a nack there would just require
> hooks in these two places.
>
Yes, the answer of "can QEMU can reach a BDF right now" is simple. And
it may make sense to add.
Returning a set of reachable BDFs is much harder. I went with returning
the set of BDFs that may be reachable. I.E. every BDF registered with
Xen should be sent to this QEMU. At startup there will be ones that
either generate nacks or are reached.
Proper PCI bridge scanning code (which is missing from hvmloader) will
attempt to assign a non-overlapping set of BDF for all PCI devices
found. With enough PCI to PCI bridges (greater then 255) this is not
possible and I have no idea how many things would not work in this case.
I would hope that QEMU would work, but I do not know if this has ever be
tested.
However guest code can (and will) at times do things like hiding a
secondary PCI bus and all PCI devices that can be reached via it.
The current interface with Xen does not support the root bus vs non-root
bus information. Nor does it have PCI device tree structure.
All you have to work with is a set of BDFs.
-Don Slutz
>>> One other option is switching to a memory API
>>> for config accesses, then using a memory listener to detect
>>> topology changes. That would be a lot of work I'm afraid.
>>>
>>> --
>>> MST
WARNING: multiple messages have this Message-ID (diff)
From: Don Slutz <dslutz@verizon.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
Paul Durrant <Paul.Durrant@citrix.com>
Cc: Stefano Stabellini <Stefano.Stabellini@citrix.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Slutz, Donald Christopher" <dslutz@verizon.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2 0/4] Fix device listener interface for PCI to PCI bridges
Date: Tue, 09 Jun 2015 11:10:31 -0400 [thread overview]
Message-ID: <557701E7.9030409@Verizon.com> (raw)
In-Reply-To: <20150609161937-mutt-send-email-mst@redhat.com>
On 06/09/15 10:27, Michael S. Tsirkin wrote:
> On Tue, Jun 09, 2015 at 02:14:29PM +0000, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Michael S. Tsirkin [mailto:mst@redhat.com]
>>> Sent: 09 June 2015 13:30
>>> To: Paul Durrant
>>> Cc: Don Slutz; qemu-devel@nongnu.org; xen-devel@lists.xen.org; Stefano
>>> Stabellini
>>> Subject: Re: [PATCH v2 0/4] Fix device listener interface for PCI to PCI bridges
>>>
>>> On Tue, Jun 09, 2015 at 10:58:26AM +0000, Paul Durrant wrote:
>>>>> -----Original Message-----
>>>>> From: Michael S. Tsirkin [mailto:mst@redhat.com]
>>>>> Sent: 09 June 2015 11:52
>>>>> To: Paul Durrant
>>>>> Cc: Don Slutz; qemu-devel@nongnu.org; xen-devel@lists.xen.org;
>>> Stefano
>>>>> Stabellini
>>>>> Subject: Re: [PATCH v2 0/4] Fix device listener interface for PCI to PCI
>>> bridges
>>>>>
>>>>> On Tue, Jun 09, 2015 at 09:18:49AM +0000, Paul Durrant wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Michael S. Tsirkin [mailto:mst@redhat.com]
>>>>>>> Sent: 09 June 2015 10:13
>>>>>>> To: Don Slutz
>>>>>>> Cc: qemu-devel@nongnu.org; xen-devel@lists.xen.org; Paul Durrant;
>>>>>>> Stefano Stabellini
>>>>>>> Subject: Re: [PATCH v2 0/4] Fix device listener interface for PCI to PCI
>>>>> bridges
>>>>>>>
>>>>>>> On Mon, Jun 08, 2015 at 05:18:48PM -0400, Don Slutz wrote:
>>>>>>>> changes v1 to v2:
>>>>>>>> Split v1 patch into 3.
>>>>>>>>
>>>>>>>> Added a BUG FIX patch (#1: "exec: Do not use MemoryRegion
>>> after
>>>>>>>> free").
>>>>>>>>
>>>>>>>> Technically this bug fix should be a separate patch, however this
>>>>>>>> issue only seems to reproduce when this patch set is applied.
>>>>>>>>
>>>>>>>>
>>>>>>>> Michael S. Tsirkin:
>>>>>>>> "You need some other API that makes sense, probably PCI
>>> specific."
>>>>>>>> This is basically patch #2: "Extend device listener interface..."
>>>>>>>>
>>>>>>>> "This is relying on undocumented assumptions and how specific
>>>>>>>> firmware works. There's nothing special about bus number 255,
>>>>>>>> and 0 is not very special either (except it happens to be the
>>>>>>>> reset value)."
>>>>>>>> Dropped all checking of 0 and 255.
>>>>>>>>
>>>>>>>>
>>>>>>>> Open question by Michael S. Tsirkin:
>>>>>>>>
>>>>>>>>>>>> On Thu, May 28, 2015 at 07:25:50AM -0400, Don Slutz wrote:
>>>>>>>> ...
>>>>>>>>>>>> It is not clear to me that the complexity of tracking bus
>>>>>>>>>>>> visibility make sense. Clearly you do.
>>>>>>>>>>> Well what was the point of the change?
>>>>>>>>>
>>>>>>>>> To get config cycles that are valid that you do not get today.
>>>>>>>>>
>>>>>>>>>>> If you don't care that we get irrelevant config cycles why not
>>>>>>>>>>> just send them all to QEMU?
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That would need to be answered by Paul Durrant and/or other
>>>>> people
>>>>>>> (see
>>>>>>>>> below)
>>>>>>>>>
>>>>>>>>
>>>>>>>> We could broadcast config space ioreqs to all emulators, the
>>> problem
>>>>>>>> is how do we know (in the case of a read) which emulator is actually
>>>>>>>> the one supplying the data? At some level Xen needs to know who
>>> is
>>>>>>>> implementing what.
>>>>>>>>
>>>>>>>> Paul Durrant
>>>>>>>
>>>>>>> Can irrelevant emulators all respond with some kind of nack?
>>>>>>> Xen would pick the one that did respond correctly.
>>>>>>>
>>>>>>
>>>>>> I guess that's possible if we add an extra bit to the ioreq_t, but QEMU
>>>>> would still need to know when to nack and when to ack.
>>>>>
>>>>> It's simple: ack if we find a device handling the specific BDF.
>>>>> The result would at least be contained.
>>>>> OTOH detecting master aborts in core is useful since it would
>>>>> let us implement error reporting correctly.
>>>>>
>>>>>
>>>>>> It's still much simpler if qemu updates Xen with exact set of (S)BDFs it's
>>>>> handling.
>>>>>>
>>>>>> Paul
>>>>>
>>>>>
>>>>> I suspect this calls for a bigger change, you need to re-scan
>>>>> all of the tree to detect visible devices.
>>>>> Maybe just write some xen-specific code to do this on each
>>>>> config access.
>>>>
>>>> Well, that's the thing really... what triggers the re-scan. Do we really need
>>> to scan on each access or is there a way to know when the topology is
>>> changed? Doing the former doesn't really sound wonderfully efficient and, if
>>> the answer to the second is yes, then that's the time to update Xen with the
>>> currently valid set of BDFs.
>>>>
>>>> Paul
>>>
>>>
>>> Several things can trigger a topology change.
>>
>> Well, IMO those need to be enumerated and suitable hooks need to be put in place so that Xen can be informed of the changes.
>>
This patch adds one of the hooks (the one that lets Xen know of the
change to BDF).
The hook that can report on the accessibility of a given PCI device is
much more complex since there is no simple way to say "PCI device pvscsi
#2 is not accessible" since there can be multiple PCI devices with the
same name and they can start out all having the BDF. One would need to
invent a way to describe the PCI tree and pass that around.
>> Paul
>
> pci_data_write and pci_data_read already do bus lookups.
> It seems that returning a nack there would just require
> hooks in these two places.
>
Yes, the answer of "can QEMU can reach a BDF right now" is simple. And
it may make sense to add.
Returning a set of reachable BDFs is much harder. I went with returning
the set of BDFs that may be reachable. I.E. every BDF registered with
Xen should be sent to this QEMU. At startup there will be ones that
either generate nacks or are reached.
Proper PCI bridge scanning code (which is missing from hvmloader) will
attempt to assign a non-overlapping set of BDF for all PCI devices
found. With enough PCI to PCI bridges (greater then 255) this is not
possible and I have no idea how many things would not work in this case.
I would hope that QEMU would work, but I do not know if this has ever be
tested.
However guest code can (and will) at times do things like hiding a
secondary PCI bus and all PCI devices that can be reached via it.
The current interface with Xen does not support the root bus vs non-root
bus information. Nor does it have PCI device tree structure.
All you have to work with is a set of BDFs.
-Don Slutz
>>> One other option is switching to a memory API
>>> for config accesses, then using a memory listener to detect
>>> topology changes. That would be a lot of work I'm afraid.
>>>
>>> --
>>> MST
next prev parent reply other threads:[~2015-06-09 15:10 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-08 21:18 [Qemu-devel] [PATCH v2 0/4] Fix device listener interface for PCI to PCI bridges Don Slutz
2015-06-08 21:18 ` [Qemu-devel] [BUGFIX][PATCH v2 1/4] exec: Do not use MemoryRegion after free Don Slutz
2015-06-08 21:18 ` Don Slutz
2015-06-08 21:18 ` [Qemu-devel] [PATCH v2 2/4] Extend device listener interface for PCI to PCI bridges Don Slutz
2015-06-08 21:18 ` Don Slutz
2015-06-09 9:08 ` Paul Durrant
2015-06-09 9:08 ` [Qemu-devel] " Paul Durrant
2015-06-09 13:53 ` Don Slutz
2015-06-09 13:55 ` Paul Durrant
2015-06-09 14:00 ` Don Slutz
2015-06-09 14:00 ` [Qemu-devel] " Don Slutz
2015-06-09 13:55 ` Paul Durrant
2015-06-09 13:53 ` Don Slutz
2015-06-08 21:18 ` [Qemu-devel] [PATCH v2 3/4] xen: Add usage of " Don Slutz
2015-06-08 21:18 ` Don Slutz
2015-06-08 21:18 ` [Qemu-devel] [PATCH v2 4/4] xen: Fix map/unmap of pcidev to ioreq server Don Slutz
2015-06-08 21:18 ` Don Slutz
2015-06-09 9:05 ` Paul Durrant
2015-06-09 9:05 ` [Qemu-devel] " Paul Durrant
2015-06-09 13:51 ` Don Slutz
2015-06-09 13:51 ` [Qemu-devel] " Don Slutz
2015-06-09 14:03 ` Paul Durrant
2015-06-09 14:03 ` [Qemu-devel] " Paul Durrant
2015-06-09 14:28 ` Don Slutz
2015-06-09 15:11 ` Paul Durrant
2015-06-09 15:11 ` Paul Durrant
2015-06-09 16:40 ` Don Slutz
2015-06-09 16:40 ` [Qemu-devel] " Don Slutz
2015-06-09 14:28 ` Don Slutz
2015-06-09 9:13 ` [Qemu-devel] [PATCH v2 0/4] Fix device listener interface for PCI to PCI bridges Michael S. Tsirkin
2015-06-09 9:18 ` Paul Durrant
2015-06-09 9:18 ` [Qemu-devel] " Paul Durrant
2015-06-09 10:52 ` Michael S. Tsirkin
2015-06-09 10:52 ` Michael S. Tsirkin
2015-06-09 10:58 ` [Qemu-devel] " Paul Durrant
2015-06-09 12:29 ` Michael S. Tsirkin
2015-06-09 12:29 ` Michael S. Tsirkin
2015-06-09 14:14 ` [Qemu-devel] " Paul Durrant
2015-06-09 14:14 ` Paul Durrant
2015-06-09 14:27 ` [Qemu-devel] " Michael S. Tsirkin
2015-06-09 15:10 ` Don Slutz [this message]
2015-06-09 15:10 ` Don Slutz
2015-06-09 14:27 ` Michael S. Tsirkin
2015-06-09 10:58 ` Paul Durrant
2015-06-09 9:13 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=557701E7.9030409@Verizon.com \
--to=dslutz@verizon.com \
--cc=Paul.Durrant@citrix.com \
--cc=Stefano.Stabellini@citrix.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.