* Stubdom breakage in 4.5
From: Wei Liu @ 2015-02-03 12:22 UTC
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, Stefano Stabellini, Andrew Cooper,
      Ian Jackson, Paul Durrant, Jan Beulich

Hi all

I recently found out that stubdom in 4.5 is broken.

A proper fix to that issue is likely to alter the start up protocol,
which is not acceptable for backport.

Providing a backport that doesn't alter the protocol used to start up
stubdom is difficult.

Paul, do you have any insight how we can fix stubdom in 4.5? Even a
backportable workaround is better than just have stubdom broken.

Wei.

* Re: Stubdom breakage in 4.5
From: Paul Durrant @ 2015-02-03 13:42 UTC
  To: xen-devel@lists.xen.org
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Stefano Stabellini,
      Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 03 February 2015 12:22
> To: xen-devel@lists.xen.org
> Cc: Wei Liu; Ian Campbell; Ian Jackson; Paul Durrant; Jan Beulich; Andrew
> Cooper; Stefano Stabellini
> Subject: Stubdom breakage in 4.5
>
> Hi all
>
> I recently found out that stubdom in 4.5 is broken.
>
> A proper fix to that issue is likely to alter the start up protocol,
> which is not acceptable for backport.
>
> Providing a backport that doesn't alter the protocol used to start up
> stubdom is difficult.
>
> Paul, do you have any insight how we can fix stubdom in 4.5? Even a
> backportable workaround is better than just have stubdom broken.
>

The minimal fix from your PoV, I guess, would be something that tells
Xen not to complete I/O in the absence of a matching IOREQ but to wait
until one is there, IIUC? It's a bit icky, and I don't know what form
that something would be in.

  Paul

> Wei.

* Re: Stubdom breakage in 4.5
From: Ian Campbell @ 2015-02-03 13:47 UTC
  To: Paul Durrant
  Cc: Wei Liu, Andrew Cooper, xen-devel@lists.xen.org,
      Stefano Stabellini, Jan Beulich, Ian Jackson

On Tue, 2015-02-03 at 13:42 +0000, Paul Durrant wrote:
> The minimal fix from your PoV, I guess, would be something that tells
> Xen not to complete I/O in the absence of a matching IOREQ but to wait
> until one is there, IIUC? It's a bit icky, and I don't know what form
> that something would be in.

"wait until one is there" == "the default ioreq is registered", i.e. put
it on the default ring in anticipation of something eventually consuming
it? This was the behaviour in 4.4 and earlier AIUI.

I think reverting to that behaviour (nb, not by actually reverting the
feature) in the 4.5 branch would be the best compromise, since as Wei
says the proper fix for 4.6 will likely involve too much to backport
(since it will involve fixing the startup interlock between toolstack
and multiple qemus, and protocol changes like that aren't really stable
backport candidates).

Either way, this regression certainly needs fixing in 4.5 as well as
unstable/4.6. It's my understanding that the stuff Don is doing is (at
least partially) addressing the latter?

Paul, can you take care of fixing, or ensuring someone else is fixing,
the issue, please.

Ian.

* Re: Stubdom breakage in 4.5
From: Paul Durrant @ 2015-02-03 14:00 UTC
  To: Ian Campbell
  Cc: Wei Liu, Andrew Cooper, xen-devel@lists.xen.org,
      Stefano Stabellini, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Ian Campbell
> Sent: 03 February 2015 13:48
> To: Paul Durrant
> Cc: Wei Liu; xen-devel@lists.xen.org; Ian Jackson; Jan Beulich; Andrew
> Cooper; Stefano Stabellini
> Subject: Re: Stubdom breakage in 4.5
>
> "wait until one is there" == "the default ioreq is registered", i.e. put
> it on the default ring in anticipation of something eventually consuming
> it? This was the behaviour in 4.4 and earlier AIUI.
>
> I think reverting to that behaviour (nb, not by actually reverting the
> feature) in the 4.5 branch would be the best compromise, since as Wei
> says the proper fix for 4.6 will likely involve too much to backport
> (since it will involve fixing the startup interlock between toolstack
> and multiple qemus, and protocol changes like that aren't really stable
> backport candidates).
>
> Either way, this regression certainly needs fixing in 4.5 as well as
> unstable/4.6. It's my understanding that the stuff Don is doing is (at
> least partially) addressing the latter?
>

No, I don't think the stuff Don is doing will help this. He has need to
steer his emulation requests, which are new and distinct. The case here
is that you need an emulator for existent types of IOREQ to be present
before the guest gets going and the toolstack is not ensuring this, so
yes, forcibly creating the default emulator during domain build would
solve that problem. However it does introduce another problem...

Upstream QEMU now no longer hooks into Xen as the default emulator and
therefore will not get emulation requests for the TPM probe done by
hvmloader; those are now completed by Xen but would end up wedging the
VM if Xen thought that a default emulator would eventually turn up. So,
forcible creation of the default emulator would still need to be
something that could be turned off if the latest upstream QEMU were in
use.

> Paul, can you take care of fixing, or ensuring someone else is fixing,
> the issue, please.
>

I'm happy to fix once the best course of action is agreed.

  Paul

> Ian.

* Re: Stubdom breakage in 4.5
From: Don Slutz @ 2015-02-04 21:52 UTC
  To: Paul Durrant, Ian Campbell
  Cc: Wei Liu, Andrew Cooper, xen-devel@lists.xen.org,
      Stefano Stabellini, Jan Beulich, Ian Jackson

On 02/03/15 09:00, Paul Durrant wrote:
>> Either way, this regression certainly needs fixing in 4.5 as well as
>> unstable/4.6. It's my understanding that the stuff Don is doing is (at
>> least partially) addressing the latter?
>>
>
> No, I don't think the stuff Don is doing will help this. He has need to
> steer his emulation requests, which are new and distinct. The case here
> is that you need an emulator for existent types of IOREQ to be present
> before the guest gets going and the toolstack is not ensuring this, so
> yes, forcibly creating the default emulator during domain build would
> solve that problem. However it does introduce another problem...
> Upstream QEMU now no longer hooks into Xen as the default emulator and
> therefore will not get emulation requests for the TPM probe done by
> hvmloader; those are now completed by Xen but would end up wedging the
> VM if Xen thought that a default emulator would eventually turn up. So,
> forcible creation of the default emulator would still need to be
> something that could be turned off if the latest upstream QEMU were in
> use.
>

Most of what I have posted does not apply. The only possible one that
comes to mind is about using QEMU master (of the newer 4.6 QEMU) with
4.5 which I am assuming is not the case here (for reference it is about
creating a default_ioreq_server to the QEMU that did call
hvm_ioreq_server_enable() 1st, and then a 2nd time as the default one.
(Message-ID: <54CAEF19.1030205@terremark.com>; Subject: Re: [Qemu-devel]
[PATCH v5 2/2] Xen: Use the ioreq-server API when available)).

   -Don Slutz

* Re: Stubdom breakage in 4.5
From: Paul Durrant @ 2015-02-03 14:11 UTC
  To: Ian Campbell
  Cc: Wei Liu, Andrew Cooper, xen-devel@lists.xen.org,
      Stefano Stabellini, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Paul Durrant
> Sent: 03 February 2015 14:00
> To: Ian Campbell
> Cc: Wei Liu; xen-devel@lists.xen.org; Ian Jackson; Jan Beulich; Andrew
> Cooper; Stefano Stabellini
> Subject: RE: Stubdom breakage in 4.5
>
> No, I don't think the stuff Don is doing will help this. He has need to steer his
> emulation requests, which are new and distinct. The case here is that you
> need an emulator for existent types of IOREQ to be present before the guest
> gets going and the toolstack is not ensuring this, so yes, forcibly creating the
> default emulator during domain build would solve that problem. However it
> does introduce another problem...
> Upstream QEMU now no longer hooks into Xen as the default emulator and
> therefore will not get emulation requests for the TPM probe done by
> hvmloader; those are now completed by Xen but would end up wedging the
> VM if Xen thought that a default emulator would eventually turn up. So,
> forcible creation of the default emulator would still need to be something
> that could be turned off if the latest upstream QEMU were in use.
>
> I'm happy to fix once the best course of action is agreed.
>

How about this as a slightly hacky solution that I think may work in
both cases?

If Xen finds no emulator at all for an HVM guest then it waits around
for at least one to show up before processing an emulation request.
Until one does it stalls the vcpu in question indefinitely, but on the
first emulator attach (i.e. ioreq server creations) then the IO will
always be processed, even if it doesn't match the ioreq server.

If no-one shouts I'll proceed along those lines.

  Paul

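To make the behaviour proposed in the message above concrete, here is a minimal Python sketch. It is a toy model and not Xen code: the `Domain` and `IoreqServer` classes, the port ranges, and the outcome strings are hypothetical names chosen for illustration. It assumes that an emulation request is stalled while no emulator exists, replayed when the first emulator attaches, and from then on either routed to a matching server or completed by Xen.

```python
from dataclasses import dataclass, field


@dataclass
class IoreqServer:
    """Hypothetical stand-in for an emulator that claims a range of I/O ports."""
    name: str
    ports: range

    def handles(self, port: int) -> bool:
        return port in self.ports


@dataclass
class Domain:
    """Toy HVM domain: stalls I/O until the first emulator has attached."""
    servers: list = field(default_factory=list)
    stalled: list = field(default_factory=list)  # (vcpu, port) pairs still waiting
    log: list = field(default_factory=list)      # (vcpu, port, outcome) once processed

    def emulate(self, vcpu: int, port: int) -> str:
        if not self.servers:
            # No emulator at all yet: park the request, the vCPU stays stalled.
            self.stalled.append((vcpu, port))
            return "stalled"
        return self._dispatch(vcpu, port)

    def _dispatch(self, vcpu: int, port: int) -> str:
        for server in self.servers:
            if server.handles(port):
                outcome = f"sent to {server.name}"
                break
        else:
            # At least one emulator exists but none matches: complete it here.
            outcome = "completed by Xen"
        self.log.append((vcpu, port, outcome))
        return outcome

    def attach(self, server: IoreqServer) -> None:
        first = not self.servers
        self.servers.append(server)
        if first:
            # First attach: replay everything that was stalled.
            pending, self.stalled = self.stalled, []
            for vcpu, port in pending:
                self._dispatch(vcpu, port)
```

In this model a request issued before any emulator exists is held back rather than completed, which is the stubdom case, while a request that no attached emulator claims is still completed, which is the upstream QEMU TPM-probe case described earlier in the thread.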
* Re: Stubdom breakage in 4.5
From: Ian Campbell @ 2015-02-04 12:30 UTC
  To: Paul Durrant
  Cc: Wei Liu, Andrew Cooper, xen-devel@lists.xen.org,
      Stefano Stabellini, Jan Beulich, Ian Jackson

On Tue, 2015-02-03 at 14:11 +0000, Paul Durrant wrote:
> How about this as a slightly hacky solution that I think may work in
> both cases?
>
> If Xen finds no emulator at all for an HVM guest then it waits around
> for at least one to show up before processing an emulation request.
> Until one does it stalls the vcpu in question indefinitely, but on the
> first emulator attach (i.e. ioreq server creations) then the IO will
> always be processed, even if it doesn't match the ioreq server.

It sounds plausible to me and seems like it would probably be
backportable.

Longer term I think we still need to fix the domain creation interlock
for launching multiple qemu's, ioreq servers and any other type of
service thing we might launch (whether in a stub dom or not), at which
point we may be able to remove the above workaround too.

Ian.

* Re: Stubdom breakage in 4.5
From: Paul Durrant @ 2015-02-04 12:58 UTC
  To: Ian Campbell
  Cc: Wei Liu, Andrew Cooper, xen-devel@lists.xen.org,
      Stefano Stabellini, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Ian Campbell
> Sent: 04 February 2015 12:30
> To: Paul Durrant
> Cc: Wei Liu; xen-devel@lists.xen.org; Ian Jackson; Jan Beulich; Andrew
> Cooper; Stefano Stabellini
> Subject: Re: Stubdom breakage in 4.5
>
> On Tue, 2015-02-03 at 14:11 +0000, Paul Durrant wrote:
> > How about this as a slightly hacky solution that I think may work in
> > both cases?
> >
> > If Xen finds no emulator at all for an HVM guest then it waits around
> > for at least one to show up before processing an emulation request.
> > Until one does it stalls the vcpu in question indefinitely, but on the
> > first emulator attach (i.e. ioreq server creations) then the IO will
> > always be processed, even if it doesn't match the ioreq server.
>
> It sounds plausible to me and seems like it would probably be
> backportable.
>

Actually I think it may be even simpler, although I've not tried it. If
hvm_domain_initialise() pauses the domain and the first
hvm_ioreq_server_enable() unpauses it, I'm hoping that may be enough.

> Longer term I think we still need to fix the domain creation interlock
> for launching multiple qemu's, ioreq servers and any other type of
> service thing we might launch (whether in a stub dom or not), at which
> point we may be able to remove the above workaround too.
>

Yes. Once the toolstack is aware it can keep the domain paused until all
emulators report readiness.

  Paul

> Ian.

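The simpler pause-based idea in the message above can be sketched as a reference-counted pause. This is an illustrative Python model only, under the assumption that a domain's vCPUs run only while its pause count is zero; the `Domain` class and its fields are hypothetical, and the two functions are Python stand-ins that merely borrow the names `hvm_domain_initialise()` and `hvm_ioreq_server_enable()` from the message, not the real Xen functions.

```python
class Domain:
    """Toy domain whose vCPUs may run only while pause_count is zero."""

    def __init__(self):
        self.pause_count = 0
        self.enabled_ioreq_servers = 0

    @property
    def runnable(self) -> bool:
        return self.pause_count == 0

    def pause(self) -> None:
        self.pause_count += 1

    def unpause(self) -> None:
        if self.pause_count == 0:
            raise RuntimeError("unbalanced unpause")
        self.pause_count -= 1


def hvm_domain_initialise(d: Domain) -> None:
    # Assumption: HVM setup takes one extra pause reference on the domain.
    d.pause()


def hvm_ioreq_server_enable(d: Domain) -> None:
    # Assumption: only the first emulator to be enabled drops that reference.
    d.enabled_ioreq_servers += 1
    if d.enabled_ioreq_servers == 1:
        d.unpause()
```

Because the extra pause is just one more reference, the toolstack's own pause and unpause can happen in any order relative to the emulator: in this model the guest does not run until both the toolstack has unpaused it and the first emulator has been enabled.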
* Re: Stubdom breakage in 4.5
From: Don Slutz @ 2015-02-04 21:42 UTC
  To: Paul Durrant, Ian Campbell
  Cc: Wei Liu, Andrew Cooper, xen-devel@lists.xen.org,
      Stefano Stabellini, Jan Beulich, Ian Jackson

On 02/04/15 07:58, Paul Durrant wrote:
>> It sounds plausible to me and seems like it would probably be
>> backportable.
>>
>
> Actually I think it may be even simpler, although I've not tried it. If
> hvm_domain_initialise() pauses the domain and the first
> hvm_ioreq_server_enable() unpauses it, I'm hoping that may be enough.
>

You do need to keep the PVH case in mind here.

   -Don Slutz

* Re: Stubdom breakage in 4.5
From: Paul Durrant @ 2015-02-05 11:04 UTC
  To: Don Slutz, Ian Campbell
  Cc: Wei Liu, Andrew Cooper, xen-devel@lists.xen.org,
      Stefano Stabellini, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Don Slutz [mailto:dslutz@verizon.com]
> Sent: 04 February 2015 21:42
> To: Paul Durrant; Ian Campbell
> Cc: Wei Liu; Andrew Cooper; xen-devel@lists.xen.org; Stefano Stabellini; Jan
> Beulich; Ian Jackson
> Subject: Re: [Xen-devel] Stubdom breakage in 4.5
>
> On 02/04/15 07:58, Paul Durrant wrote:
> > Actually I think it may be even simpler, although I've not tried it. If
> > hvm_domain_initialise() pauses the domain and the first
> > hvm_ioreq_server_enable() unpauses it, I'm hoping that may be enough.
> >
>
> You do need to keep the PVH case in mind here.

Yes indeed. My test patch does not do the extra pause for PVH.

Actually one thing that's in your patch series does 'help' a bit. The
folding together of the has_dm check and the IO completion means that at
least hvmloader gets back f-s rather than 0-s for the emulation that
doesn't hit :-/

  Paul

>    -Don Slutz

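The "f-s rather than 0-s" remark can be illustrated with a small Python helper. This is a sketch under an assumption, not Xen code: it takes "f-s" to mean that a read which no emulator claims completes with every bit set for the access width. The function name `unclaimed_read_value` and the accepted sizes are hypothetical.

```python
def unclaimed_read_value(size_bytes: int) -> int:
    """Value a guest would see for an unclaimed read of the given width.

    Assumption: the read completes with all bits set ("f-s"), not zero.
    """
    if size_bytes not in (1, 2, 4, 8):
        raise ValueError(f"unsupported access size: {size_bytes}")
    return (1 << (8 * size_bytes)) - 1


if __name__ == "__main__":
    for size in (1, 2, 4):
        print(size, hex(unclaimed_read_value(size)))
```

An all-ones result is what a probe of an absent device commonly sees on real hardware, so a prober such as hvmloader is more likely to conclude "nothing there" than it would be from a read of zero.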