* xl: pci completion error
@ 2010-10-05 11:40 Sergey Tovpeko
2010-10-06 10:39 ` Stefano Stabellini
0 siblings, 1 reply; 11+ messages in thread
From: Sergey Tovpeko @ 2010-10-05 11:40 UTC (permalink / raw)
To: xen-devel@lists.xensource.com
Hello, list!
xl reports the error on passthrough-ed pci device removing.
do_pci_remove device 01:00.0
libxl: error: libxl_device.c:448:libxl__wait_for_device_model Device
Model not ready
libxl: error: libxl_pci.c:858:do_pci_remove Device Model didn't respond
in time
libxl: error: libxl.c:944:libxl_domain_destroy pci shutdown failed for
domid 1
libxl: error: libxl.c:896:libxl_destroy_device_model Couldn't find
device model's pid: No such file or directory
libxl: error: libxl.c:956:libxl_domain_destroy
libxl_destroy_device_model failed for 1
libxl: error: libxl_device.c:307:libxl__devices_destroy
/local/domain/1/device is empty
It seems that libxl_pci didn't get the 'pci-removed' status from
qemu-dm. Please, have a look who should set this status in qemu-dm. As
for me I added xenstore_record_dm_state("pci-removed");
after do_pci_del(par);
in xenstore_process_dm_command_event function.
It fixed up my issue of removing pci devices.
Sergey.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-05 11:40 xl: pci completion error Sergey Tovpeko
@ 2010-10-06 10:39 ` Stefano Stabellini
2010-10-06 13:08 ` Sergey Tovpeko
2010-10-06 14:05 ` Gianni Tedesco
0 siblings, 2 replies; 11+ messages in thread
From: Stefano Stabellini @ 2010-10-06 10:39 UTC (permalink / raw)
To: Sergey Tovpeko; +Cc: xen-devel@lists.xensource.com
On Tue, 5 Oct 2010, Sergey Tovpeko wrote:
> Hello, list!
>
> xl reports the error on passthrough-ed pci device removing.
>
> do_pci_remove device 01:00.0
> libxl: error: libxl_device.c:448:libxl__wait_for_device_model Device
> Model not ready
> libxl: error: libxl_pci.c:858:do_pci_remove Device Model didn't respond
> in time
> libxl: error: libxl.c:944:libxl_domain_destroy pci shutdown failed for
> domid 1
> libxl: error: libxl.c:896:libxl_destroy_device_model Couldn't find
> device model's pid: No such file or directory
> libxl: error: libxl.c:956:libxl_domain_destroy
> libxl_destroy_device_model failed for 1
> libxl: error: libxl_device.c:307:libxl__devices_destroy
> /local/domain/1/device is empty
>
>
>
> It seems that libxl_pci didn't get the 'pci-removed' status from
> qemu-dm. Please, have a look who should set this status in qemu-dm. As
> for me I added xenstore_record_dm_state("pci-removed");
> after do_pci_del(par);
> in xenstore_process_dm_command_event function.
>
> It fixed up my issue of removing pci devices.
What guest OS are you using?
Currently "pci-removed" is only written in response of an eject command
from the guest OS, that means that if the guest doesn't support pci
hotplug the value won't be written.
If you are using Linux you should make sure that the acpiphp module is
loaded, if you are using Windows, I think the only version that supports
pci hotplug is Windows Server 2008 but I might be wrong.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-06 10:39 ` Stefano Stabellini
@ 2010-10-06 13:08 ` Sergey Tovpeko
2010-10-06 13:54 ` Stefano Stabellini
2010-10-06 14:05 ` Gianni Tedesco
1 sibling, 1 reply; 11+ messages in thread
From: Sergey Tovpeko @ 2010-10-06 13:08 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: xen-devel@lists.xensource.com
Stefano Stabellini wrote:
> On Tue, 5 Oct 2010, Sergey Tovpeko wrote:
>
>> Hello, list!
>>
>> xl reports the error on passthrough-ed pci device removing.
>>
>> do_pci_remove device 01:00.0
>> libxl: error: libxl_device.c:448:libxl__wait_for_device_model Device
>> Model not ready
>> libxl: error: libxl_pci.c:858:do_pci_remove Device Model didn't respond
>> in time
>> libxl: error: libxl.c:944:libxl_domain_destroy pci shutdown failed for
>> domid 1
>> libxl: error: libxl.c:896:libxl_destroy_device_model Couldn't find
>> device model's pid: No such file or directory
>> libxl: error: libxl.c:956:libxl_domain_destroy
>> libxl_destroy_device_model failed for 1
>> libxl: error: libxl_device.c:307:libxl__devices_destroy
>> /local/domain/1/device is empty
>>
>>
>>
>> It seems that libxl_pci didn't get the 'pci-removed' status from
>> qemu-dm. Please, have a look who should set this status in qemu-dm. As
>> for me I added xenstore_record_dm_state("pci-removed");
>> after do_pci_del(par);
>> in xenstore_process_dm_command_event function.
>>
>> It fixed up my issue of removing pci devices.
>>
>
> What guest OS are you using?
> Currently "pci-removed" is only written in response of an eject command
> from the guest OS, that means that if the guest doesn't support pci
> hotplug the value won't be written.
> If you are using Linux you should make sure that the acpiphp module is
> loaded, if you are using Windows, I think the only version that supports
> pci hotplug is Windows Server 2008 but I might be wrong.
>
The guest is Windows XP sp2 32 bit.
When I do pci hot-unplug on the fly
xl pci-detach 11:0b.0
the guest detaches the pci card correctly with 'pci-removed' signal, but
on the shutdown, it doesn't.
In the code the flow is the following:
qemu-dm receives the 'pci-rem' command and sends the SCI interrupt to
the guest. I think guest should catch this interrupt and write to the
magic ACPI port that causes qemu-dm send 'pci-removed' signal.
On guest shutdown, libxl sends 'pci-rem' command to qemu-dm and waits
for the answer.
I suppose, by the time 'pci-rem' signal my guest have finished its work,
and doesn't react on SCI interrupt. As a result libxl doesn't receive
pci-removed signal.
It's my observation.
Sergey.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-06 13:08 ` Sergey Tovpeko
@ 2010-10-06 13:54 ` Stefano Stabellini
2010-10-06 14:29 ` Gianni Tedesco
0 siblings, 1 reply; 11+ messages in thread
From: Stefano Stabellini @ 2010-10-06 13:54 UTC (permalink / raw)
To: Sergey Tovpeko; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini
On Wed, 6 Oct 2010, Sergey Tovpeko wrote:
> The guest is Windows XP sp2 32 bit.
>
> When I do pci hot-unplug on the fly
> xl pci-detach 11:0b.0
> the guest detaches the pci card correctly with 'pci-removed' signal, but
> on the shutdown, it doesn't.
>
> In the code the flow is the following:
> qemu-dm receives the 'pci-rem' command and sends the SCI interrupt to
> the guest. I think guest should catch this interrupt and write to the
> magic ACPI port that causes qemu-dm send 'pci-removed' signal.
>
> On guest shutdown, libxl sends 'pci-rem' command to qemu-dm and waits
> for the answer.
> I suppose, by the time 'pci-rem' signal my guest have finished its work,
> and doesn't react on SCI interrupt. As a result libxl doesn't receive
> pci-removed signal.
>
> It's my observation.
>
We could probably benefit from a boolean "force" parameter to do_pci_remove
and libxl_device_pci_remove, that would cause do_pci_remove not to
return if libxl__wait_for_device_model returns error.
Of course libxl_device_pci_remove would be called with force=1 by
libxl_device_pci_shutdown.
Gianni, what do you think?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-06 10:39 ` Stefano Stabellini
2010-10-06 13:08 ` Sergey Tovpeko
@ 2010-10-06 14:05 ` Gianni Tedesco
1 sibling, 0 replies; 11+ messages in thread
From: Gianni Tedesco @ 2010-10-06 14:05 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Sergey Tovpeko, xen-devel@lists.xensource.com
On Wed, 2010-10-06 at 11:39 +0100, Stefano Stabellini wrote:
> On Tue, 5 Oct 2010, Sergey Tovpeko wrote:
> > Hello, list!
> >
> > xl reports the error on passthrough-ed pci device removing.
> >
> > do_pci_remove device 01:00.0
> > libxl: error: libxl_device.c:448:libxl__wait_for_device_model Device
> > Model not ready
> > libxl: error: libxl_pci.c:858:do_pci_remove Device Model didn't respond
> > in time
> > libxl: error: libxl.c:944:libxl_domain_destroy pci shutdown failed for
> > domid 1
> > libxl: error: libxl.c:896:libxl_destroy_device_model Couldn't find
> > device model's pid: No such file or directory
> > libxl: error: libxl.c:956:libxl_domain_destroy
> > libxl_destroy_device_model failed for 1
> > libxl: error: libxl_device.c:307:libxl__devices_destroy
> > /local/domain/1/device is empty
> >
> >
> >
> > It seems that libxl_pci didn't get the 'pci-removed' status from
> > qemu-dm. Please, have a look who should set this status in qemu-dm. As
> > for me I added xenstore_record_dm_state("pci-removed");
> > after do_pci_del(par);
> > in xenstore_process_dm_command_event function.
> >
> > It fixed up my issue of removing pci devices.
>
> What guest OS are you using?
> Currently "pci-removed" is only written in response of an eject command
> from the guest OS, that means that if the guest doesn't support pci
> hotplug the value won't be written.
> If you are using Linux you should make sure that the acpiphp module is
> loaded, if you are using Windows, I think the only version that supports
> pci hotplug is Windows Server 2008 but I might be wrong.
I am pretty sure I tested this with such a guest operating system and
got the same failure but have not had time to confirm and track it down.
There is another issue here which is that when xl bails after such a
missing device removal we are in an inconsistent state where the device
has been removed from xenstore but the guest still clings to it.
If you recall, Stefano, we discussed this issue a few weeks ago. I was
arguing in favour of a force-removal option which I still think is
viable. However, a minimal, and more safety conscious fix would be to
keep the device assigned until the domain shuts down in this instance.
Following patch implements the 'force' method but if we keep a
RETURN_FAIL after the libxl__wait_for_device_model it would be the safe
variant.
Does this solve the problem for you Sergey?
---
diff -r 36420e35c65a tools/libxl/libxl_pci.c
--- a/tools/libxl/libxl_pci.c Wed Sep 22 16:57:12 2010 +0100
+++ b/tools/libxl/libxl_pci.c Wed Sep 22 18:03:40 2010 +0100
@@ -858,8 +858,6 @@ static int do_pci_remove(libxl__gc *gc,
}
}
- libxl_device_pci_remove_xenstore(gc, domid, pcidev);
-
hvm = libxl__domain_is_hvm(ctx, domid);
if (hvm) {
if (libxl__wait_for_device_model(ctx, domid, "running", NULL, NULL) < 0) {
@@ -878,7 +876,9 @@ static int do_pci_remove(libxl__gc *gc,
xs_write(ctx->xsh, XBT_NULL, path, "pci-rem", strlen("pci-rem"));
if (libxl__wait_for_device_model(ctx, domid, "pci-removed", NULL, NULL) < 0) {
LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Device Model didn't respond in time");
- return ERROR_FAIL;
+ /* This depends on guest operating system acknowledging the SCI, if it doesn't
+ * respond in time then force the remove.
+ */
}
}
path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/state", domid);
@@ -924,7 +924,7 @@ skip1:
if ((fscanf(f, "%u", &irq) == 1) && irq) {
rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
if (rc < 0) {
- LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_physdev_map_pirq irq=%d", irq);
+ LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_physdev_unmap_pirq irq=%d", irq);
}
rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
if (rc < 0) {
@@ -951,6 +951,8 @@ out:
libxl_device_pci_remove(ctx, stubdomid, &pcidev_s);
}
+ libxl_device_pci_remove_xenstore(gc, domid, pcidev);
+
return 0;
}
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-06 13:54 ` Stefano Stabellini
@ 2010-10-06 14:29 ` Gianni Tedesco
2010-10-06 16:36 ` Stefano Stabellini
2010-10-07 12:47 ` Sergey Tovpeko
0 siblings, 2 replies; 11+ messages in thread
From: Gianni Tedesco @ 2010-10-06 14:29 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Sergey Tovpeko, xen-devel@lists.xensource.com
On Wed, 2010-10-06 at 14:54 +0100, Stefano Stabellini wrote:
> On Wed, 6 Oct 2010, Sergey Tovpeko wrote:
> > The guest is Windows XP sp2 32 bit.
> >
> > When I do pci hot-unplug on the fly
> > xl pci-detach 11:0b.0
> > the guest detaches the pci card correctly with 'pci-removed' signal, but
> > on the shutdown, it doesn't.
> >
> > In the code the flow is the following:
> > qemu-dm receives the 'pci-rem' command and sends the SCI interrupt to
> > the guest. I think guest should catch this interrupt and write to the
> > magic ACPI port that causes qemu-dm send 'pci-removed' signal.
> >
> > On guest shutdown, libxl sends 'pci-rem' command to qemu-dm and waits
> > for the answer.
> > I suppose, by the time 'pci-rem' signal my guest have finished its work,
> > and doesn't react on SCI interrupt. As a result libxl doesn't receive
> > pci-removed signal.
> >
> > It's my observation.
> >
>
> We could probably benefit from a boolean "force" parameter to do_pci_remove
> and libxl_device_pci_remove, that would cause do_pci_remove not to
> return if libxl__wait_for_device_model returns error.
> Of course libxl_device_pci_remove would be called with force=1 by
> libxl_device_pci_shutdown.
> Gianni, what do you think?
Sergeys analysis sounds very plausible to me actually. The co-ordination
required between qemu and libxl for PCI passthrough is very complicated
and one ought to have fairly low confidence in it's correctness :P
Below is a less hacky version of the patch I just sent. Stefano, please
consider this for inclusion.
---
xl: Implement PCI passthrough force removal
This fixes two errors with removing PCI devices from HVM domains. The
first error is that the handling of "pci-rem" device-model command is
erroneously implemented in qemu and difficult (impossible?) to get
right.
For example, during domain shutdown there can be a race where the guest
OS unloads it's drivers and perhaps even shuts down PCI subsystem before
the pci-rem command has been received by qemu. This means that no OS is
present to write to the port which causes the dm command to be
acknowledged.
We fix this by implementing a 'force removal' option to
libxl_device_pci_remove which is always set to 1 during guest shutdown.
It can be optionally enabled on the xl command line for other occasions.
The second error is that if a guest OS doesn't respond to the SCI
interrupt and therefore the pci-rem dm command, which can happen if the
guest OS has no ACPI PCI hotplug support, then device removal bails with
an error but only AFTER removing the device from xenstore. This means
that xenstore gets in to an inconsistent state where an assigned device
also appears to be assignable.
This is fixed by moving xenstore device removal to occur only after the
device has really been removed.
Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com>
diff -r 02e199c96ece tools/libxl/libxl.h
--- a/tools/libxl/libxl.h Wed Oct 06 11:00:19 2010 +0100
+++ b/tools/libxl/libxl.h Wed Oct 06 15:29:12 2010 +0100
@@ -406,7 +406,7 @@ int libxl_device_vfb_clean_shutdown(libx
int libxl_device_vfb_hard_shutdown(libxl_ctx *ctx, uint32_t domid);
int libxl_device_pci_add(libxl_ctx *ctx, uint32_t domid, libxl_device_pci *pcidev);
-int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid, libxl_device_pci *pcidev);
+int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid, libxl_device_pci *pcidev, int force);
int libxl_device_pci_shutdown(libxl_ctx *ctx, uint32_t domid);
int libxl_device_pci_list_assigned(libxl_ctx *ctx, libxl_device_pci **list, uint32_t domid, int *num);
int libxl_device_pci_list_assignable(libxl_ctx *ctx, libxl_device_pci **list, int *num);
diff -r 02e199c96ece tools/libxl/libxl_pci.c
--- a/tools/libxl/libxl_pci.c Wed Oct 06 11:00:19 2010 +0100
+++ b/tools/libxl/libxl_pci.c Wed Oct 06 15:29:12 2010 +0100
@@ -841,7 +841,8 @@ out:
return rc;
}
-static int do_pci_remove(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev)
+static int do_pci_remove(libxl__gc *gc, uint32_t domid,
+ libxl_device_pci *pcidev, int force)
{
libxl_ctx *ctx = libxl__gc_owner(gc);
libxl_device_pci *assigned;
@@ -858,8 +859,6 @@ static int do_pci_remove(libxl__gc *gc,
}
}
- libxl_device_pci_remove_xenstore(gc, domid, pcidev);
-
hvm = libxl__domain_is_hvm(ctx, domid);
if (hvm) {
if (libxl__wait_for_device_model(ctx, domid, "running", NULL, NULL) < 0) {
@@ -878,7 +877,12 @@ static int do_pci_remove(libxl__gc *gc,
xs_write(ctx->xsh, XBT_NULL, path, "pci-rem", strlen("pci-rem"));
if (libxl__wait_for_device_model(ctx, domid, "pci-removed", NULL, NULL) < 0) {
LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Device Model didn't respond in time");
- return ERROR_FAIL;
+ /* This depends on guest operating system acknowledging the
+ * SCI, if it doesn't respond in time then we may wish to
+ * force the removal.
+ */
+ if ( !force )
+ return ERROR_FAIL;
}
}
path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/state", domid);
@@ -924,7 +928,7 @@ skip1:
if ((fscanf(f, "%u", &irq) == 1) && irq) {
rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
if (rc < 0) {
- LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_physdev_map_pirq irq=%d", irq);
+ LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_physdev_unmap_pirq irq=%d", irq);
}
rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
if (rc < 0) {
@@ -948,13 +952,16 @@ out:
stubdomid = libxl_get_stubdom_id(ctx, domid);
if (stubdomid != 0) {
libxl_device_pci pcidev_s = *pcidev;
- libxl_device_pci_remove(ctx, stubdomid, &pcidev_s);
+ libxl_device_pci_remove(ctx, stubdomid, &pcidev_s, force);
}
+ libxl_device_pci_remove_xenstore(gc, domid, pcidev);
+
return 0;
}
-int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid, libxl_device_pci *pcidev)
+int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid,
+ libxl_device_pci *pcidev, int force)
{
libxl__gc gc = LIBXL_INIT_GC(ctx);
unsigned int orig_vdev, pfunc_mask;
@@ -980,7 +987,7 @@ int libxl_device_pci_remove(libxl_ctx *c
}else{
pcidev->vdevfn = orig_vdev;
}
- if ( do_pci_remove(&gc, domid, pcidev) )
+ if ( do_pci_remove(&gc, domid, pcidev, force) )
rc = ERROR_FAIL;
}
}
@@ -1049,7 +1056,11 @@ int libxl_device_pci_shutdown(libxl_ctx
if ( rc )
return rc;
for (i = 0; i < num; i++) {
- if (libxl_device_pci_remove(ctx, domid, pcidevs + i) < 0)
+ /* Force remove on shutdown since, on HVM, qemu will not always
+ * respond to SCI interrupt because the guest kernel has shut down the
+ * devices by the time we even get here!
+ */
+ if (libxl_device_pci_remove(ctx, domid, pcidevs + i, 1) < 0)
return ERROR_FAIL;
}
free(pcidevs);
diff -r 02e199c96ece tools/libxl/xl_cmdimpl.c
--- a/tools/libxl/xl_cmdimpl.c Wed Oct 06 11:00:19 2010 +0100
+++ b/tools/libxl/xl_cmdimpl.c Wed Oct 06 15:29:12 2010 +0100
@@ -2262,7 +2262,7 @@ int main_pcilist(int argc, char **argv)
return 0;
}
-static void pcidetach(char *dom, char *bdf)
+static void pcidetach(char *dom, char *bdf, int force)
{
libxl_device_pci pcidev;
@@ -2273,20 +2273,24 @@ static void pcidetach(char *dom, char *b
fprintf(stderr, "pci-detach: malformed BDF specification \"%s\"\n", bdf);
exit(2);
}
- libxl_device_pci_remove(&ctx, domid, &pcidev);
+ libxl_device_pci_remove(&ctx, domid, &pcidev, force);
libxl_device_pci_destroy(&pcidev);
}
int main_pcidetach(int argc, char **argv)
{
int opt;
+ int force = 0;
char *domname = NULL, *bdf = NULL;
- while ((opt = getopt(argc, argv, "h")) != -1) {
+ while ((opt = getopt(argc, argv, "hf")) != -1) {
switch (opt) {
case 'h':
help("pci-detach");
return 0;
+ case 'f':
+ force = 1;
+ break;
default:
fprintf(stderr, "option not supported\n");
break;
@@ -2300,7 +2304,7 @@ int main_pcidetach(int argc, char **argv
domname = argv[optind];
bdf = argv[optind + 1];
- pcidetach(domname, bdf);
+ pcidetach(domname, bdf, force);
return 0;
}
static void pciattach(char *dom, char *bdf, char *vs)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-06 14:29 ` Gianni Tedesco
@ 2010-10-06 16:36 ` Stefano Stabellini
2010-10-07 12:47 ` Sergey Tovpeko
1 sibling, 0 replies; 11+ messages in thread
From: Stefano Stabellini @ 2010-10-06 16:36 UTC (permalink / raw)
To: Gianni Tedesco
Cc: Sergey Tovpeko, xen-devel@lists.xensource.com, Stefano Stabellini
> > We could probably benefit from a boolean "force" parameter to do_pci_remove
> > and libxl_device_pci_remove, that would cause do_pci_remove not to
> > return if libxl__wait_for_device_model returns error.
> > Of course libxl_device_pci_remove would be called with force=1 by
> > libxl_device_pci_shutdown.
> > Gianni, what do you think?
>
> Sergeys analysis sounds very plausible to me actually. The co-ordination
> required between qemu and libxl for PCI passthrough is very complicated
> and one ought to have fairly low confidence in it's correctness :P
>
> Below is a less hacky version of the patch I just sent. Stefano, please
> consider this for inclusion.
>
I like the patch, I am going to apply it.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-06 14:29 ` Gianni Tedesco
2010-10-06 16:36 ` Stefano Stabellini
@ 2010-10-07 12:47 ` Sergey Tovpeko
2010-10-07 14:10 ` Gianni Tedesco
1 sibling, 1 reply; 11+ messages in thread
From: Sergey Tovpeko @ 2010-10-07 12:47 UTC (permalink / raw)
To: Gianni Tedesco; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini
Hello, Gianni!
I was interested, how your patch would behave with many pci devices
given to a domain.
It seemed that each device removal takes 10 seconds for the timeout.
pci=[ '01:00.0','00:1d.0', '00:1d.1', '00:1d.2','00:1d.3', '00:1d.7' ]
The result was a bit different. After the first pci removal, the others
somehow were destroyed.
So the second do_pci_remove run into the failure that there were no pci
devices at all.
So I got the following log.
Waiting for domain workxp (domid 1) to die [pid 3900]
Domain 1 is dead
Action for shutdown reason code 0 is destroy
Domain 1 needs to be cleaned up: destroying the domain
do_pci_remove device 01:00.0
libxl: error: libxl_device.c:448:libxl__wait_for_device_model Device
Model not ready
libxl: error: libxl_pci.c:861:do_pci_remove Device Model didn't respond
in time
do_pci_remove device 00:1d.0
libxl: error: libxl_pci.c:839:do_pci_remove PCI device not attached to
this domain
libxl: error: libxl.c:944:libxl_domain_destroy pci shutdown failed for
domid 1
libxl: error: libxl.c:896:libxl_destroy_device_model Couldn't find
device model's pid: No such file or directory
libxl: error: libxl.c:956:libxl_domain_destroy
libxl_destroy_device_model failed for 1
libxl: error: libxl_device.c:307:libxl__devices_destroy
/local/domain/1/device is empty
Sergey.
> Sergeys analysis sounds very plausible to me actually. The co-ordination
> required between qemu and libxl for PCI passthrough is very complicated
> and one ought to have fairly low confidence in it's correctness :P
>
> Below is a less hacky version of the patch I just sent. Stefano, please
> consider this for inclusion.
>
> ---
> xl: Implement PCI passthrough force removal
>
> This fixes two errors with removing PCI devices from HVM domains. The
> first error is that the handling of "pci-rem" device-model command is
> erroneously implemented in qemu and difficult (impossible?) to get
> right.
>
> For example, during domain shutdown there can be a race where the guest
> OS unloads it's drivers and perhaps even shuts down PCI subsystem before
> the pci-rem command has been received by qemu. This means that no OS is
> present to write to the port which causes the dm command to be
> acknowledged.
>
> We fix this by implementing a 'force removal' option to
> libxl_device_pci_remove which is always set to 1 during guest shutdown.
> It can be optionally enabled on the xl command line for other occasions.
>
> The second error is that if a guest OS doesn't respond to the SCI
> interrupt and therefore the pci-rem dm command, which can happen if the
> guest OS has no ACPI PCI hotplug support, then device removal bails with
> an error but only AFTER removing the device from xenstore. This means
> that xenstore gets in to an inconsistent state where an assigned device
> also appears to be assignable.
>
> This is fixed by moving xenstore device removal to occur only after the
> device has really been removed.
>
> Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com>
>
> diff -r 02e199c96ece tools/libxl/libxl.h
> --- a/tools/libxl/libxl.h Wed Oct 06 11:00:19 2010 +0100
> +++ b/tools/libxl/libxl.h Wed Oct 06 15:29:12 2010 +0100
> @@ -406,7 +406,7 @@ int libxl_device_vfb_clean_shutdown(libx
> int libxl_device_vfb_hard_shutdown(libxl_ctx *ctx, uint32_t domid);
>
> int libxl_device_pci_add(libxl_ctx *ctx, uint32_t domid, libxl_device_pci *pcidev);
> -int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid, libxl_device_pci *pcidev);
> +int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid, libxl_device_pci *pcidev, int force);
> int libxl_device_pci_shutdown(libxl_ctx *ctx, uint32_t domid);
> int libxl_device_pci_list_assigned(libxl_ctx *ctx, libxl_device_pci **list, uint32_t domid, int *num);
> int libxl_device_pci_list_assignable(libxl_ctx *ctx, libxl_device_pci **list, int *num);
> diff -r 02e199c96ece tools/libxl/libxl_pci.c
> --- a/tools/libxl/libxl_pci.c Wed Oct 06 11:00:19 2010 +0100
> +++ b/tools/libxl/libxl_pci.c Wed Oct 06 15:29:12 2010 +0100
> @@ -841,7 +841,8 @@ out:
> return rc;
> }
>
> -static int do_pci_remove(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev)
> +static int do_pci_remove(libxl__gc *gc, uint32_t domid,
> + libxl_device_pci *pcidev, int force)
> {
> libxl_ctx *ctx = libxl__gc_owner(gc);
> libxl_device_pci *assigned;
> @@ -858,8 +859,6 @@ static int do_pci_remove(libxl__gc *gc,
> }
> }
>
> - libxl_device_pci_remove_xenstore(gc, domid, pcidev);
> -
> hvm = libxl__domain_is_hvm(ctx, domid);
> if (hvm) {
> if (libxl__wait_for_device_model(ctx, domid, "running", NULL, NULL) < 0) {
> @@ -878,7 +877,12 @@ static int do_pci_remove(libxl__gc *gc,
> xs_write(ctx->xsh, XBT_NULL, path, "pci-rem", strlen("pci-rem"));
> if (libxl__wait_for_device_model(ctx, domid, "pci-removed", NULL, NULL) < 0) {
> LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Device Model didn't respond in time");
> - return ERROR_FAIL;
> + /* This depends on guest operating system acknowledging the
> + * SCI, if it doesn't respond in time then we may wish to
> + * force the removal.
> + */
> + if ( !force )
> + return ERROR_FAIL;
> }
> }
> path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/state", domid);
> @@ -924,7 +928,7 @@ skip1:
> if ((fscanf(f, "%u", &irq) == 1) && irq) {
> rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
> if (rc < 0) {
> - LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_physdev_map_pirq irq=%d", irq);
> + LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_physdev_unmap_pirq irq=%d", irq);
> }
> rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
> if (rc < 0) {
> @@ -948,13 +952,16 @@ out:
> stubdomid = libxl_get_stubdom_id(ctx, domid);
> if (stubdomid != 0) {
> libxl_device_pci pcidev_s = *pcidev;
> - libxl_device_pci_remove(ctx, stubdomid, &pcidev_s);
> + libxl_device_pci_remove(ctx, stubdomid, &pcidev_s, force);
> }
>
> + libxl_device_pci_remove_xenstore(gc, domid, pcidev);
> +
> return 0;
> }
>
> -int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid, libxl_device_pci *pcidev)
> +int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid,
> + libxl_device_pci *pcidev, int force)
> {
> libxl__gc gc = LIBXL_INIT_GC(ctx);
> unsigned int orig_vdev, pfunc_mask;
> @@ -980,7 +987,7 @@ int libxl_device_pci_remove(libxl_ctx *c
> }else{
> pcidev->vdevfn = orig_vdev;
> }
> - if ( do_pci_remove(&gc, domid, pcidev) )
> + if ( do_pci_remove(&gc, domid, pcidev, force) )
> rc = ERROR_FAIL;
> }
> }
> @@ -1049,7 +1056,11 @@ int libxl_device_pci_shutdown(libxl_ctx
> if ( rc )
> return rc;
> for (i = 0; i < num; i++) {
> - if (libxl_device_pci_remove(ctx, domid, pcidevs + i) < 0)
> + /* Force remove on shutdown since, on HVM, qemu will not always
> + * respond to SCI interrupt because the guest kernel has shut down the
> + * devices by the time we even get here!
> + */
> + if (libxl_device_pci_remove(ctx, domid, pcidevs + i, 1) < 0)
> return ERROR_FAIL;
> }
> free(pcidevs);
> diff -r 02e199c96ece tools/libxl/xl_cmdimpl.c
> --- a/tools/libxl/xl_cmdimpl.c Wed Oct 06 11:00:19 2010 +0100
> +++ b/tools/libxl/xl_cmdimpl.c Wed Oct 06 15:29:12 2010 +0100
> @@ -2262,7 +2262,7 @@ int main_pcilist(int argc, char **argv)
> return 0;
> }
>
> -static void pcidetach(char *dom, char *bdf)
> +static void pcidetach(char *dom, char *bdf, int force)
> {
> libxl_device_pci pcidev;
>
> @@ -2273,20 +2273,24 @@ static void pcidetach(char *dom, char *b
> fprintf(stderr, "pci-detach: malformed BDF specification \"%s\"\n", bdf);
> exit(2);
> }
> - libxl_device_pci_remove(&ctx, domid, &pcidev);
> + libxl_device_pci_remove(&ctx, domid, &pcidev, force);
> libxl_device_pci_destroy(&pcidev);
> }
>
> int main_pcidetach(int argc, char **argv)
> {
> int opt;
> + int force = 0;
> char *domname = NULL, *bdf = NULL;
>
> - while ((opt = getopt(argc, argv, "h")) != -1) {
> + while ((opt = getopt(argc, argv, "hf")) != -1) {
> switch (opt) {
> case 'h':
> help("pci-detach");
> return 0;
> + case 'f':
> + force = 1;
> + break;
> default:
> fprintf(stderr, "option not supported\n");
> break;
> @@ -2300,7 +2304,7 @@ int main_pcidetach(int argc, char **argv
> domname = argv[optind];
> bdf = argv[optind + 1];
>
> - pcidetach(domname, bdf);
> + pcidetach(domname, bdf, force);
> return 0;
> }
> static void pciattach(char *dom, char *bdf, char *vs)
>
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-07 12:47 ` Sergey Tovpeko
@ 2010-10-07 14:10 ` Gianni Tedesco
2010-10-08 10:15 ` Sergey Tovpeko
0 siblings, 1 reply; 11+ messages in thread
From: Gianni Tedesco @ 2010-10-07 14:10 UTC (permalink / raw)
To: Sergey Tovpeko; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini
On Thu, 2010-10-07 at 13:47 +0100, Sergey Tovpeko wrote:
> Hello, Gianni!
>
> I was interested, how your patch would behave with many pci devices
> given to a domain.
> It seemed that each device removal takes 10 seconds for the timeout.
>
> pci=[ '01:00.0','00:1d.0', '00:1d.1', '00:1d.2','00:1d.3', '00:1d.7' ]
Heh. you probably mean 00:1d.* which would have a higher chance of
working - It's a USB controller, am I right?
> The result was a bit different. After the first pci removal, the others
> somehow were destroyed.
> So the second do_pci_remove run into the failure that there were no pci
> devices at all.
Passing through multiple functions of one device as separate devices
causes some unspeakable badness. Therefore the above configuration is
not supported in libxl. Unless it is for an SR-IOV card, but even then
it is only supported 'in principle' and not in code - I have no hardware
to test this.
> So I got the following log.
>
> Waiting for domain workxp (domid 1) to die [pid 3900]
> Domain 1 is dead
> Action for shutdown reason code 0 is destroy
> Domain 1 needs to be cleaned up: destroying the domain
> do_pci_remove device 01:00.0
> libxl: error: libxl_device.c:448:libxl__wait_for_device_model Device
> Model not ready
> libxl: error: libxl_pci.c:861:do_pci_remove Device Model didn't respond
> in time
> do_pci_remove device 00:1d.0
> libxl: error: libxl_pci.c:839:do_pci_remove PCI device not attached to
> this domain
I am not sure how we can get to this state since we only call shutdown
on a device that is assigned (from checking xenstore) but then the
'attached-ness' check shows it's no longer assigned. That should
definitely not be the case...
Can you reproduce this without yesterdays patch??
> libxl: error: libxl.c:944:libxl_domain_destroy pci shutdown failed for
> domid 1
> libxl: error: libxl.c:896:libxl_destroy_device_model Couldn't find
> device model's pid: No such file or directory
> libxl: error: libxl.c:956:libxl_domain_destroy
> libxl_destroy_device_model failed for 1
> libxl: error: libxl_device.c:307:libxl__devices_destroy
> /local/domain/1/device is empty
Looks like unrelated xenstore corruption perhaps? I am not sure.
Gianni
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-07 14:10 ` Gianni Tedesco
@ 2010-10-08 10:15 ` Sergey Tovpeko
2010-10-08 10:36 ` Gianni Tedesco
0 siblings, 1 reply; 11+ messages in thread
From: Sergey Tovpeko @ 2010-10-08 10:15 UTC (permalink / raw)
To: Gianni Tedesco; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini
[-- Attachment #1: Type: text/plain, Size: 1063 bytes --]
> Heh. you probably mean 00:1d.* which would have a higher chance of
> working - It's a USB controller, am I right?
>
>
> Passing through multiple functions of one device as separate devices
> causes some unspeakable badness. Therefore the above configuration is
> not supported in libxl. Unless it is for an SR-IOV card, but even then
> it is only supported 'in principle' and not in code - I have no hardware
> to test this.
>
00:1d.* is the USB host controller. You're right.
You pointed me to the way how PCI devices are enumerated inside the
domain. I specify each pci function separately in config file, so I get
different virtual devices in the HVM domain.
And by the now, I havn't noticed any problem with separated devices,
which are the one multi-function pci device in real life. Perhaps, the
moment of the badness doesn't come yet. :-) And it's not a SRIOV card.
I attached the screenshot, in which there are some separate pci devices
in the HVM domain.
Well, I'm going to see what happens with shutdown case in more detail
Sergey.
[-- Attachment #2: pci1.jpg --]
[-- Type: image/jpeg, Size: 175359 bytes --]
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: xl: pci completion error
2010-10-08 10:15 ` Sergey Tovpeko
@ 2010-10-08 10:36 ` Gianni Tedesco
0 siblings, 0 replies; 11+ messages in thread
From: Gianni Tedesco @ 2010-10-08 10:36 UTC (permalink / raw)
To: Sergey Tovpeko; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini
On Fri, 2010-10-08 at 11:15 +0100, Sergey Tovpeko wrote:
> > Heh. you probably mean 00:1d.* which would have a higher chance of
> > working - It's a USB controller, am I right?
> >
> >
> > Passing through multiple functions of one device as separate devices
> > causes some unspeakable badness. Therefore the above configuration is
> > not supported in libxl. Unless it is for an SR-IOV card, but even then
> > it is only supported 'in principle' and not in code - I have no hardware
> > to test this.
> >
>
> 00:1d.* is the USB host controller. You're right.
>
> You pointed me to the way how PCI devices are enumerated inside the
> domain. I specify each pci function separately in config file, so I get
> different virtual devices in the HVM domain.
> And by the now, I havn't noticed any problem with separated devices,
> which are the one multi-function pci device in real life. Perhaps, the
> moment of the badness doesn't come yet. :-) And it's not a SRIOV card.
> I attached the screenshot, in which there are some separate pci devices
> in the HVM domain.
Yeha, the badness is either if the guest driver freaks (not sure if you
have attached some usb devices and tried it? but that didn't work for me
in linux if i recall correctly), the other problem is that if you remove
any other function than 0 then the whole device gets reset behind the
guest OS's back which may not be anticipated!
> Well, I'm going to see what happens with shutdown case in more detail
Yes if you find anything, do let know.
Thanks
Gianni
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-10-08 10:36 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-05 11:40 xl: pci completion error Sergey Tovpeko
2010-10-06 10:39 ` Stefano Stabellini
2010-10-06 13:08 ` Sergey Tovpeko
2010-10-06 13:54 ` Stefano Stabellini
2010-10-06 14:29 ` Gianni Tedesco
2010-10-06 16:36 ` Stefano Stabellini
2010-10-07 12:47 ` Sergey Tovpeko
2010-10-07 14:10 ` Gianni Tedesco
2010-10-08 10:15 ` Sergey Tovpeko
2010-10-08 10:36 ` Gianni Tedesco
2010-10-06 14:05 ` Gianni Tedesco
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.