xen-devel.lists.xenproject.org archive mirror
* test report for Xen 4.3 RC1
@ 2013-05-27  3:49 Ren, Yongjie
  2013-05-28 15:15 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 11+ messages in thread
From: Ren, Yongjie @ 2013-05-27  3:49 UTC (permalink / raw)
  To: xen-devel@lists.xen.org; +Cc: Xu, YongweiX, Liu, SongtaoX, Tian, Yongxue

Hi All,
This is a report based on our testing for Xen 4.3.0 RC1 on Intel platforms.
(Sorry it's a little late. :-)  If the status changes, I'll have an update later.)

Test environment:
Xen: Xen 4.3 RC1 with qemu-upstream-unstable.git
Dom0: Linux kernel 3.9.3
Hardware: Intel Sandy Bridge, Ivy Bridge, Haswell systems

Below are the features we tested.
- PV and HVM guest booting (HVM: Ubuntu, Fedora, RHEL, Windows)
- Save/Restore and live migration
- PCI device assignment and SR-IOV
- power management: C-state/P-state, Dom0 S3, HVM S3
- AVX and XSAVE instruction set
- MCE
- CPU online/offline for Dom0
- vCPU hot-plug
- Nested Virtualization  (Please look at my report in the following link.)
 http://lists.xen.org/archives/html/xen-devel/2013-05/msg01145.html

New bugs (4): (some of which are not regressions)
1. sometimes failed to online cpu in Dom0
  http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1851
2. dom0 call trace when running sriov hvm guest with igbvf
  http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1852
  -- a regression in Linux kernel (Dom0).
3. Booting multiple guests will lead Dom0 call trace
  http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1853
4. After live migration, guest console continuously prints "Clocksource tsc unstable"
  http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1854

Old bugs: (11)
1. [ACPI] Dom0 can't resume from S3 sleep
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1707
2. [XL]"xl vcpu-set" causes dom0 crash or panic
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1730
3. Sometimes Xen panic on ia32pae Sandybridge when restore guest
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1747
4. 'xl vcpu-set' can't decrease the vCPU number of a HVM guest
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1822
5. Dom0 cannot be shutdown before PCI device detachment from guest
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1826
6. xl pci-list shows one PCI device (PF or VF) could be assigned to two different guests
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1834
7. [upstream qemu] Guest free memory with upstream qemu is 14MB lower than that with qemu-xen-unstable.git
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1836
8. [upstream qemu]'maxvcpus=NUM' item is not supported in upstream QEMU
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1837
9. [upstream qemu] Guest console hangs after save/restore or live-migration when setting 'hpet=0' in guest config file
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1838
10. [upstream qemu] 'xen_platform_pci=0' setting cannot make the guest use emulated PCI devices by default
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1839
11. Live migration fail when migrating the same guest for more than 2 times
  http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1845

Best Regards,
     Yongjie (Jay)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: test report for Xen 4.3 RC1
  2013-05-27  3:49 test report for Xen 4.3 RC1 Ren, Yongjie
@ 2013-05-28 15:15 ` Konrad Rzeszutek Wilk
  2013-05-28 15:21   ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-28 15:15 UTC (permalink / raw)
  To: Ren, Yongjie, george.dunlap
  Cc: Xu, YongweiX, Liu, SongtaoX, Tian, Yongxue,
	xen-devel@lists.xen.org

On Mon, May 27, 2013 at 03:49:27AM +0000, Ren, Yongjie wrote:
> Hi All,
> This is a report based on our testing for Xen 4.3.0 RC1 on Intel platforms.
> (Sorry it's a little late. :-)  If the status changes, I'll have an update later.)

OK, I've some updates and ideas that can help with narrowing some of these
issues down. Thank you for doing this.

> 
> Test environment:
> Xen: Xen 4.3 RC1 with qemu-upstream-unstable.git
> Dom0: Linux kernel 3.9.3

Could you please test v3.10-rc3? There have been some changes
for VCPU hotplug added in v3.10 that I am not sure made it
into v3.9.
> Hardware: Intel Sandy Bridge, Ivy Bridge, Haswell systems
> 
> Below are the features we tested.
> - PV and HVM guest booting (HVM: Ubuntu, Fedora, RHEL, Windows)
> - Save/Restore and live migration
> - PCI device assignment and SR-IOV
> - power management: C-state/P-state, Dom0 S3, HVM S3
> - AVX and XSAVE instruction set
> - MCE
> - CPU online/offline for Dom0
> - vCPU hot-plug
> - Nested Virtualization  (Please look at my report in the following link.)
>  http://lists.xen.org/archives/html/xen-devel/2013-05/msg01145.html
> 
> New bugs (4): (some of which are not regressions)
> 1. sometimes failed to online cpu in Dom0
>   http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1851

That looks like you are hitting the udev race. 

Could you verify that these patches:
https://lkml.org/lkml/2013/5/13/520

fix the issue? (They are destined for v3.11.)
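
To double-check it really is the udev race, a quick manual test via the
standard sysfs CPU hotplug path (the CPU number below is only an
example) would be:

  echo 0 > /sys/devices/system/cpu/cpu1/online
  echo 1 > /sys/devices/system/cpu/cpu1/online

in a loop; with the race, the re-online step is the one that fails
intermittently.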

> 2. dom0 call trace when running sriov hvm guest with igbvf
>   http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1852
>   -- a regression in Linux kernel (Dom0).

Hm, the call trace you refer to:

[   68.404440] Already setup the GSI :37

[   68.405105] igb 0000:04:00.0: Enabling SR-IOV VFs using the module parameter is deprecated - please use the pci sysfs interface.

[   68.506230] ------------[ cut here ]------------

[   68.506265] WARNING: at /home/www/builds_xen_unstable/xen-src-27009-20130509/linux-2.6-pvops.git/fs/sysfs/dir.c:536 sysfs_add_one+0xcc/0xf0()

[   68.506279] Hardware name: S2600CP

is a deprecation warning. Did you try the 'pci sysfs' interface instead?

Looking at da36b64736cf2552e7fb5109c0255d4af804f5e7
    ixgbe: Implement PCI SR-IOV sysfs callback operation
it says it is using this:

commit 1789382a72a537447d65ea4131d8bcc1ad85ce7b
Author: Donald Dutile <ddutile@redhat.com>
Date:   Mon Nov 5 15:20:36 2012 -0500

    PCI: SRIOV control and status via sysfs
    
    Provide files under sysfs to determine the maximum number of VFs
    an SR-IOV-capable PCIe device supports, and methods to enable and
    disable the VFs on a per-device basis.
    
    Currently, VF enablement by SR-IOV-capable PCIe devices is done
    via driver-specific module parameters.  If not setup in modprobe files,
    it requires admin to unload & reload PF drivers with number of desired
    VFs to enable.  Additionally, the enablement is system wide: all
    devices controlled by the same driver have the same number of VFs
    enabled.  Although the latter is probably desired, there are PCI
    configurations setup by system BIOS that may not enable that to occur.
    
    Two files are created for the PF of PCIe devices with SR-IOV support:
    
        sriov_totalvfs  Contains the maximum number of VFs the device
                        could support as reported by the TotalVFs register
                        in the SR-IOV extended capability.
    
        sriov_numvfs    Contains the number of VFs currently enabled on
                        this device as reported by the NumVFs register in
                        the SR-IOV extended capability.
    
                        Writing zero to this file disables all VFs.
    
                        Writing a positive number to this file enables that
                        number of VFs.
    
    These files are readable for all SR-IOV PF devices.  Writes to the
    sriov_numvfs file are effective only if a driver that supports the
    sriov_configure() method is attached.
    
    Signed-off-by: Donald Dutile <ddutile@redhat.com>


Can you try that please?
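
As a rough sketch of that flow (the BDF below is the igb PF from your
log, and the VF count is only an example):

  # how many VFs the PF can expose
  cat /sys/bus/pci/devices/0000:04:00.0/sriov_totalvfs
  # enable two VFs (instead of the deprecated max_vfs= module parameter)
  echo 2 > /sys/bus/pci/devices/0000:04:00.0/sriov_numvfs
  # disable them again
  echo 0 > /sys/bus/pci/devices/0000:04:00.0/sriov_numvfs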


> 3. Booting multiple guests will lead Dom0 call trace
>   http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1853

That one worries me. Did you do a git bisect to figure out which
commit is causing this?
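
If not, something along these lines against the dom0 kernel tree would
be a good start (the good/bad points below are only guesses at a range):

  git bisect start
  git bisect bad v3.9        # a dom0 kernel that shows the call trace
  git bisect good v3.8       # a dom0 kernel believed to be fine
  # rebuild/boot dom0 and mark each step good or bad until it converges
  # on the offending commit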

> 4. After live migration, guest console continuously prints "Clocksource tsc unstable"
>   http://bugzilla-archived.xenproject.org//bugzilla/show_bug.cgi?id=1854

This looks like a current bug with QEMU unstable missing an ACPI table?

Did you try booting the guest with the old QEMU?

device_model_version = 'qemu-xen-traditional'

> 
> Old bugs: (11)
> 1. [ACPI] Dom0 can't resume from S3 sleep
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1707

That should be fixed in v3.11 (now that we have the fixes).
Could you try v3.10 with Rafael's ACPI tree merged in?
(That is, the patches he intends to submit for v3.11.)
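
Roughly, and assuming his v3.11 material is on the usual linux-next
branch of linux-pm.git, something like:

  git checkout -b acpi-s3-test v3.10-rc3
  git pull git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next

should give you that combination (the local branch name is just an
example, and linux-next is my guess at where his queue lives).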

> 2. [XL]"xl vcpu-set" causes dom0 crash or panic
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1730

That I think is fixed in v3.10. Could you please check v3.10-rc3?

> 3. Sometimes Xen panic on ia32pae Sandybridge when restore guest
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1747

That looks to be with v2.6.32. Is the issue present with v3.9
or v3.10-rc3?

> 4. 'xl vcpu-set' can't decrease the vCPU number of a HVM guest
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1822

That I believe was a QEMU bug:
http://lists.xen.org/archives/html/xen-devel/2013-05/msg01054.html

which should be in qemu-xen-traditional now (it went into the tree
on 05-21).

> 5. Dom0 cannot be shutdown before PCI device detachment from guest
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1826

Ok, I can reproduce that too.

> 6. xl pci-list shows one PCI device (PF or VF) could be assigned to two different guests
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1834

OK, I can reproduce that too:

> xl create  /vm-pv.cfg 
Parsing config from /vm-pv.cfg
libxl: error: libxl_pci.c:1043:libxl__device_pci_add: PCI device 0:1:0.0 is not assignable
Daemon running with PID 3933

15:11:17 # 16 :/mnt/lab/latest/ 
> xl pci-list 1
Vdev Device
05.0 0000:01:00.0

> xl list
Name                                        ID   Mem VCPUs	State	Time(s)
Domain-0                                     0  2047     4     r-----      26.7
latest                                       1  2043     1     -b----       5.3
latestadesa                                  4  1024     3     -b----       5.1

15:11:20 # 20 :/mnt/lab/latest/ 
> xl pci-list 4
Vdev Device
00.0 0000:01:00.0


The rest I haven't had a chance to look at. George, have you seen
these issues?

> 7. [upstream qemu] Guest free memory with upstream qemu is 14MB lower than that with qemu-xen-unstable.git
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1836
> 8. [upstream qemu]'maxvcpus=NUM' item is not supported in upstream QEMU
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1837
> 9. [upstream qemu] Guest console hangs after save/restore or live-migration when setting 'hpet=0' in guest config file
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1838
> 10. [upstream qemu] 'xen_platform_pci=0' setting cannot make the guest use emulated PCI devices by default
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1839
> 11. Live migration fail when migrating the same guest for more than 2 times
>   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1845
> 
> Best Regards,
>      Yongjie (Jay)
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: test report for Xen 4.3 RC1
  2013-05-28 15:15 ` Konrad Rzeszutek Wilk
@ 2013-05-28 15:21   ` Konrad Rzeszutek Wilk
  2013-05-28 15:24     ` George Dunlap
  2013-11-08 16:21     ` Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: " Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-28 15:21 UTC (permalink / raw)
  To: Ren, Yongjie, george.dunlap, xen
  Cc: Xu, YongweiX, Liu, SongtaoX, Tian, Yongxue,
	xen-devel@lists.xen.org

> > 5. Dom0 cannot be shutdown before PCI device detachment from guest
> >   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1826
> 
> Ok, I can reproduce that too.

This is what dom0 tells me:

[  483.586675] INFO: task init:4163 blocked for more than 120 seconds.
[  483.603675] "echo 0 > /proc/sys/kernel/hung_task_timG^G[  483.620747] init            D ffff880062b59c78  5904  4163      1 0x00000000
[  483.637699]  ffff880062b59bc8 0000000000000^G[  483.655189]  ffff880062b58000 ffff880062b58000 ffff880062b58010 ffff880062b58000
[  483.672505]  ffff880062b59fd8 ffff880062b58000 ffff880062f20180 ffff880078bca500
[  483.689527] Call Trace:
[  483.706298]  [<ffffffff816a0814>] schedule+0x24/0x70
[  483.723604]  [<ffffffff813bb0dd>] read_reply+0xad/0x160
[  483.741162]  [<ffffffff810b6b10>] ? wake_up_bit+0x40/0x40
[  483.758572]  [<ffffffff813bb274>] xs_talkv+0xe4/0x1f0
[  483.775741]  [<ffffffff813bb3c6>] xs_single+0x46/0x60
[  483.792791]  [<ffffffff813bbab4>] xenbus_transaction_start+0x24/0x60
[  483.809929]  [<ffffffff813ba202>] __xenbus_switch_ste+0x32/0x120
^G[  483.826947]  [<ffffffff8142df39>] ? __dev_printk+0x39/0x90
[  483.843792]  [<ffffffff8142dfde>] ? _dev_info+0x4e/0x50
[  483.860412]  [<ffffffff813ba2fb>] xenbus_switch_state+0xb/0x10
[  483.877312]  [<ffffffff813bd487>] xenbus_dev_shutdown+0x37/0xa0
[  483.894036]  [<ffffffff8142e275>] device_shutdown+0x15/0x180
[  483.910605]  [<ffffffff810a8841>] kernel_restart_prepare+0x31/0x40
[  483.927100]  [<ffffffff810a88a1>] kernel_restart+0x11^G[  483.943262]  [<ffffffff810a8ab5>] SYSC_reboot+0x1b5/0x260
[  483.959480]  [<ffffffff810ed52d>] ? trace_hardirqs_on_caller+0^G[  483.975786]  [<ffffffff810ed5fd>] ? trace_hardirqs_on+0xd/0x10
[  483.991819]  [<ffffffff8119db03>] ? kmem_cache_free+0x123/0x360
[  484.007675]  [<ffffffff8115c725>] ? __free_pages+0x25/0x^G[  484.023336]  [<ffffffff8115c9ac>] ? free_pages+0x4c/0x50
[  484.039176]  [<ffffffff8108b527>] ? __mmdrop+0x67/0xd0
[  484.055174]  [<ffffffff816aae95>] ? sysret_check+0x22/0x5d
[  484.070747]  [<ffffffff810ed52d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
[  484.086121]  [<ffffffff810a8b69>] SyS_reboot+0x9/0x10
[  484.101318]  [<ffffffff816aae69>] system_call_fastpath+0x16/0x1b
[  484.116585] 3 locks held by init/4163:
[  484.131650]+.+.+.}, at: [<ffffffff810a89e0>] SYSC_reboot+0xe0/0x260
^G^G^G^G^G^G[  484.147704]  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffff8142e323>] device_shutdown+0xc3/0x180
[  484.164359]  #2:  (&xs_state.request_mutex){+.+...}, at: [<ffffffff813bb1fb>] xs_talkv+0x6b/0x1f0

create !
title -1 "linux, xenbus mutex hangs when rebooting dom0 and guests hung."

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: test report for Xen 4.3 RC1
  2013-05-28 15:21   ` Konrad Rzeszutek Wilk
@ 2013-05-28 15:24     ` George Dunlap
  2013-11-11 10:22       ` Ian Campbell
  2013-11-08 16:21     ` Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: " Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 11+ messages in thread
From: George Dunlap @ 2013-05-28 15:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Ren, Yongjie, Tian, Yongxue, xen-devel@lists.xen.org,
	Xu, YongweiX, xen, Liu, SongtaoX

On 28/05/13 16:21, Konrad Rzeszutek Wilk wrote:
>>> 5. Dom0 cannot be shutdown before PCI device detachment from guest
>>>    http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1826
>> Ok, I can reproduce that too.
> This is what dom0 tells me:
>
> [  483.586675] INFO: task init:4163 blocked for more than 120 seconds.
> [  483.603675] "echo 0 > /proc/sys/kernel/hung_task_timG^G[  483.620747] init            D ffff880062b59c78  5904  4163      1 0x00000000
> [  483.637699]  ffff880062b59bc8 0000000000000^G[  483.655189]  ffff880062b58000 ffff880062b58000 ffff880062b58010 ffff880062b58000
> [  483.672505]  ffff880062b59fd8 ffff880062b58000 ffff880062f20180 ffff880078bca500
> [  483.689527] Call Trace:
> [  483.706298]  [<ffffffff816a0814>] schedule+0x24/0x70
> [  483.723604]  [<ffffffff813bb0dd>] read_reply+0xad/0x160
> [  483.741162]  [<ffffffff810b6b10>] ? wake_up_bit+0x40/0x40
> [  483.758572]  [<ffffffff813bb274>] xs_talkv+0xe4/0x1f0
> [  483.775741]  [<ffffffff813bb3c6>] xs_single+0x46/0x60
> [  483.792791]  [<ffffffff813bbab4>] xenbus_transaction_start+0x24/0x60
> [  483.809929]  [<ffffffff813ba202>] __xenbus_switch_ste+0x32/0x120
> ^G[  483.826947]  [<ffffffff8142df39>] ? __dev_printk+0x39/0x90
> [  483.843792]  [<ffffffff8142dfde>] ? _dev_info+0x4e/0x50
> [  483.860412]  [<ffffffff813ba2fb>] xenbus_switch_state+0xb/0x10
> [  483.877312]  [<ffffffff813bd487>] xenbus_dev_shutdown+0x37/0xa0
> [  483.894036]  [<ffffffff8142e275>] device_shutdown+0x15/0x180
> [  483.910605]  [<ffffffff810a8841>] kernel_restart_prepare+0x31/0x40
> [  483.927100]  [<ffffffff810a88a1>] kernel_restart+0x11^G[  483.943262]  [<ffffffff810a8ab5>] SYSC_reboot+0x1b5/0x260
> [  483.959480]  [<ffffffff810ed52d>] ? trace_hardirqs_on_caller+0^G[  483.975786]  [<ffffffff810ed5fd>] ? trace_hardirqs_on+0xd/0x10
> [  483.991819]  [<ffffffff8119db03>] ? kmem_cache_free+0x123/0x360
> [  484.007675]  [<ffffffff8115c725>] ? __free_pages+0x25/0x^G[  484.023336]  [<ffffffff8115c9ac>] ? free_pages+0x4c/0x50
> [  484.039176]  [<ffffffff8108b527>] ? __mmdrop+0x67/0xd0
> [  484.055174]  [<ffffffff816aae95>] ? sysret_check+0x22/0x5d
> [  484.070747]  [<ffffffff810ed52d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> [  484.086121]  [<ffffffff810a8b69>] SyS_reboot+0x9/0x10
> [  484.101318]  [<ffffffff816aae69>] system_call_fastpath+0x16/0x1b
> [  484.116585] 3 locks held by init/4163:
> [  484.131650]+.+.+.}, at: [<ffffffff810a89e0>] SYSC_reboot+0xe0/0x260
> ^G^G^G^G^G^G[  484.147704]  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffff8142e323>] device_shutdown+0xc3/0x180
> [  484.164359]  #2:  (&xs_state.request_mutex){+.+...}, at: [<ffffffff813bb1fb>] xs_talkv+0x6b/0x1f0
>
> create !
> title -1 "linux, xenbus mutex hangs when rebooting dom0 and guests hung."

1. I think that these commands have to come at the top
2. You don't need quotes in the title
3. You need to be polite and say "thanks" at the end so it knows it can 
stop paying attention. :-)
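
So, putting those together, a sketch of the control mail (reusing your
commands, just rearranged per the points above) would be:

  create !
  title -1 linux, xenbus mutex hangs when rebooting dom0 and guests hung.
  thanks

with the discussion for humans below the "thanks".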

  -George

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: test report for Xen 4.3 RC1
  2013-05-28 15:21   ` Konrad Rzeszutek Wilk
  2013-05-28 15:24     ` George Dunlap
@ 2013-11-08 16:21     ` Konrad Rzeszutek Wilk
  2013-11-08 16:30       ` Processed: " xen
                         ` (2 more replies)
  1 sibling, 3 replies; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-11-08 16:21 UTC (permalink / raw)
  To: Ren, Yongjie, george.dunlap, xen
  Cc: Xu, YongweiX, Liu, SongtaoX, Tian, Yongxue,
	xen-devel@lists.xen.org

On Tue, May 28, 2013 at 11:21:56AM -0400, Konrad Rzeszutek Wilk wrote:
> > > 5. Dom0 cannot be shutdown before PCI device detachment from guest
> > >   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1826
> > 
> > Ok, I can reproduce that too.
> 
> This is what dom0 tells me:
> 
> [  483.586675] INFO: task init:4163 blocked for more than 120 seconds.
> [  483.603675] "echo 0 > /proc/sys/kernel/hung_task_timG^G[  483.620747] init            D ffff880062b59c78  5904  4163      1 0x00000000
> [  483.637699]  ffff880062b59bc8 0000000000000^G[  483.655189]  ffff880062b58000 ffff880062b58000 ffff880062b58010 ffff880062b58000
> [  483.672505]  ffff880062b59fd8 ffff880062b58000 ffff880062f20180 ffff880078bca500
> [  483.689527] Call Trace:
> [  483.706298]  [<ffffffff816a0814>] schedule+0x24/0x70
> [  483.723604]  [<ffffffff813bb0dd>] read_reply+0xad/0x160
> [  483.741162]  [<ffffffff810b6b10>] ? wake_up_bit+0x40/0x40
> [  483.758572]  [<ffffffff813bb274>] xs_talkv+0xe4/0x1f0
> [  483.775741]  [<ffffffff813bb3c6>] xs_single+0x46/0x60
> [  483.792791]  [<ffffffff813bbab4>] xenbus_transaction_start+0x24/0x60
> [  483.809929]  [<ffffffff813ba202>] __xenbus_switch_ste+0x32/0x120
> ^G[  483.826947]  [<ffffffff8142df39>] ? __dev_printk+0x39/0x90
> [  483.843792]  [<ffffffff8142dfde>] ? _dev_info+0x4e/0x50
> [  483.860412]  [<ffffffff813ba2fb>] xenbus_switch_state+0xb/0x10
> [  483.877312]  [<ffffffff813bd487>] xenbus_dev_shutdown+0x37/0xa0
> [  483.894036]  [<ffffffff8142e275>] device_shutdown+0x15/0x180
> [  483.910605]  [<ffffffff810a8841>] kernel_restart_prepare+0x31/0x40
> [  483.927100]  [<ffffffff810a88a1>] kernel_restart+0x11^G[  483.943262]  [<ffffffff810a8ab5>] SYSC_reboot+0x1b5/0x260
> [  483.959480]  [<ffffffff810ed52d>] ? trace_hardirqs_on_caller+0^G[  483.975786]  [<ffffffff810ed5fd>] ? trace_hardirqs_on+0xd/0x10
> [  483.991819]  [<ffffffff8119db03>] ? kmem_cache_free+0x123/0x360
> [  484.007675]  [<ffffffff8115c725>] ? __free_pages+0x25/0x^G[  484.023336]  [<ffffffff8115c9ac>] ? free_pages+0x4c/0x50
> [  484.039176]  [<ffffffff8108b527>] ? __mmdrop+0x67/0xd0
> [  484.055174]  [<ffffffff816aae95>] ? sysret_check+0x22/0x5d
> [  484.070747]  [<ffffffff810ed52d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> [  484.086121]  [<ffffffff810a8b69>] SyS_reboot+0x9/0x10
> [  484.101318]  [<ffffffff816aae69>] system_call_fastpath+0x16/0x1b
> [  484.116585] 3 locks held by init/4163:
> [  484.131650]+.+.+.}, at: [<ffffffff810a89e0>] SYSC_reboot+0xe0/0x260
> ^G^G^G^G^G^G[  484.147704]  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffff8142e323>] device_shutdown+0xc3/0x180
> [  484.164359]  #2:  (&xs_state.request_mutex){+.+...}, at: [<ffffffff813bb1fb>] xs_talkv+0x6b/0x1f0
> 

A bit of debugging shows that when we are in this state:


MSent SIGKILL to[  100.454603] xen-pciback pci-1-0: shutdown

telnet> send brk 
[  110.134554] SysRq : HELP : loglevel(0-9) reboot(b) crash(c) terminate-all-tasks(e) memory-full-oom-kill(f) debug(g) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) force-fb(V) show-blocked-tasks(w) dump-ftrace-buffer(z) 

... snip..

 xenstored       x 0000000000000002  5504  3437      1 0x00000006
  ffff88006b6efc88 0000000000000246 0000000000000d6d ffff88006b6ee000
  ffff88006b6effd8 ffff88006b6ee000 ffff88006b6ee010 ffff88006b6ee000
  ffff88006b6effd8 ffff88006b6ee000 ffff88006bc39500 ffff8800788b5480
 Call Trace:
  [<ffffffff8110fede>] ? cgroup_exit+0x10e/0x130
  [<ffffffff816b1594>] schedule+0x24/0x70
  [<ffffffff8109c43d>] do_exit+0x79d/0xbc0
  [<ffffffff8109c981>] do_group_exit+0x51/0x140
  [<ffffffff810ae6f4>] get_signal_to_deliver+0x264/0x760
  [<ffffffff8104c49f>] do_signal+0x4f/0x610
  [<ffffffff811c62ce>] ? __sb_end_write+0x2e/0x60
  [<ffffffff811c3d39>] ? vfs_write+0x129/0x170
  [<ffffffff8104cabd>] do_notify_resume+0x5d/0x80
  [<ffffffff816bc372>] int_signal+0x12/0x17


The 'x' means that the task has been killed.

(The other two threads 'xenbus' and 'xenwatch' are sleeping).

Since xenstored can nowadays run in a separate domain and not
just in the initial domain, and xenstored can be restarted at any time,
we can't depend on the task PID. Nor can we depend on the other
domain telling us that it is dead.

The best we can do is to get out of the way of the shutdown
process and not hang on forever.

This patch should solve it:
From 228bb2fcde1267ed2a0b0d386f54d79ecacd0eb4 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 8 Nov 2013 10:48:58 -0500
Subject: [PATCH] xen/xenbus: Avoid synchronous wait on XenBus stalling
 shutdown/restart.

'read_reply' works with 'process_msg' to read off a reply from XenBus.
'process_msg' runs from within the 'xenbus' thread. Whenever
a message shows up on XenBus it is put on the xs_state.reply_list list
and 'read_reply' picks it up.

The problem is if the backend domain or the xenstored process is killed.
In that case 'xenbus' keeps waiting - and 'read_reply', if called, is
stuck forever waiting for the reply_list to have some contents.

This is normally not a problem - as the backend domain can come back
or the xenstored process can be restarted. However, if the domain
is in the process of being powered off/restarted/halted there is no
point in waiting for it to come back - we are effectively being
terminated and should not impede the progress.

This patch solves the problem by checking the 'system_state' value
to see if we are heading towards death. We also make the wait
mechanism a bit more asynchronous.

Fixes-Bug: http://bugs.xenproject.org/xen/bug/8
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xenbus/xenbus_xs.c |   24 +++++++++++++++++++++---
 1 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
index b6d5fff..177fb19 100644
--- a/drivers/xen/xenbus/xenbus_xs.c
+++ b/drivers/xen/xenbus/xenbus_xs.c
@@ -148,9 +148,24 @@ static void *read_reply(enum xsd_sockmsg_type *type, unsigned int *len)
 
 	while (list_empty(&xs_state.reply_list)) {
 		spin_unlock(&xs_state.reply_lock);
-		/* XXX FIXME: Avoid synchronous wait for response here. */
-		wait_event(xs_state.reply_waitq,
-			   !list_empty(&xs_state.reply_list));
+		wait_event_timeout(xs_state.reply_waitq,
+				   !list_empty(&xs_state.reply_list),
+				   msecs_to_jiffies(500));
+
+		/*
+		 * If we are in the process of being shut-down there is
+		 * no point of trying to contact XenBus - it is either
+		 * killed (xenstored application) or the other domain
+		 * has been killed or is unreachable.
+		 */
+		switch (system_state) {
+			case SYSTEM_POWER_OFF:
+			case SYSTEM_RESTART:
+			case SYSTEM_HALT:
+				return ERR_PTR(-EIO);
+			default:
+				break;
+		}
 		spin_lock(&xs_state.reply_lock);
 	}
 
@@ -215,6 +230,9 @@ void *xenbus_dev_request_and_reply(struct xsd_sockmsg *msg)
 
 	mutex_unlock(&xs_state.request_mutex);
 
+	if (IS_ERR(ret))
+		return ret;
+
 	if ((msg->type == XS_TRANSACTION_END) ||
 	    ((req_msg.type == XS_TRANSACTION_START) &&
 	     (msg->type == XS_ERROR)))
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Processed: Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: test report for Xen 4.3 RC1
  2013-11-08 16:21     ` Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: " Konrad Rzeszutek Wilk
@ 2013-11-08 16:30       ` xen
  2013-11-10 20:20       ` Matt Wilson
  2013-11-11  2:40       ` Liu, SongtaoX
  2 siblings, 0 replies; 11+ messages in thread
From: xen @ 2013-11-08 16:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel

Processing commands for xen@bugs.xenproject.org:

> On Tue, May 28, 2013 at 11:21:56AM -0400, Konrad Rzeszutek Wilk wrote:
Command failed: Unknown command `On'. at /srv/xen-devel-bugs/lib/emesinae/control.pl line 437, <M> line 45.
Stop processing here.

---
Xen Hypervisor Bug Tracker
See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on reporting bugs
Contact xen-bugs-owner@bugs.xenproject.org with any infrastructure issues

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: test report for Xen 4.3 RC1
  2013-11-08 16:21     ` Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: " Konrad Rzeszutek Wilk
  2013-11-08 16:30       ` Processed: " xen
@ 2013-11-10 20:20       ` Matt Wilson
  2013-11-10 20:30         ` Processed: " xen
  2013-11-11  2:40       ` Liu, SongtaoX
  2 siblings, 1 reply; 11+ messages in thread
From: Matt Wilson @ 2013-11-10 20:20 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Ren, Yongjie, Tian, Yongxue, george.dunlap, xen, Xu, YongweiX,
	xen-devel@lists.xen.org, Liu, SongtaoX

On Fri, Nov 08, 2013 at 11:21:21AM -0500, Konrad Rzeszutek Wilk wrote:
[...]
> This patch should solve it:
> From 228bb2fcde1267ed2a0b0d386f54d79ecacd0eb4 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Fri, 8 Nov 2013 10:48:58 -0500
> Subject: [PATCH] xen/xenbus: Avoid synchronous wait on XenBus stalling
>  shutdown/restart.
> 
> The 'read_reply' works with 'process_msg' to read of a reply in XenBus.
> 'process_msg' is running from within the 'xenbus' thread. Whenever
> a message shows up in XenBus it is put on a xs_state.reply_list list
> and 'read_reply' picks it up.
> 
> The problem is if the backend domain or the xenstored process is killed.
> In which case 'xenbus' is still awaiting - and 'read_reply' if called -
> stuck forever waiting for the reply_list to have some contents.
> 
> This is normally not a problem - as the backend domain can come back
> or the xenstored process can be restarted. However if the domain
> is in process of being powered off/restarted/halted - there is no
> point of waiting on it coming back - as we are effectively being
> terminated and should not impede the progress.
> 
> This patch solves this problem by checking the 'system_state' value
> to see if we are in heading towards death. We also make the wait
> mechanism a bit more asynchronous.
> 
> Fixes-Bug: http://bugs.xenproject.org/xen/bug/8
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Makes sense to me.

Acked-by: Matt Wilson <msw@amazon.com>

> ---
>  drivers/xen/xenbus/xenbus_xs.c |   24 +++++++++++++++++++++---
>  1 files changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
> index b6d5fff..177fb19 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -148,9 +148,24 @@ static void *read_reply(enum xsd_sockmsg_type *type, unsigned int *len)
>  
>  	while (list_empty(&xs_state.reply_list)) {
>  		spin_unlock(&xs_state.reply_lock);
> -		/* XXX FIXME: Avoid synchronous wait for response here. */
> -		wait_event(xs_state.reply_waitq,
> -			   !list_empty(&xs_state.reply_list));
> +		wait_event_timeout(xs_state.reply_waitq,
> +				   !list_empty(&xs_state.reply_list),
> +				   msecs_to_jiffies(500));
> +
> +		/*
> +		 * If we are in the process of being shut-down there is
> +		 * no point of trying to contact XenBus - it is either
> +		 * killed (xenstored application) or the other domain
> +		 * has been killed or is unreachable.
> +		 */
> +		switch (system_state) {
> +			case SYSTEM_POWER_OFF:
> +			case SYSTEM_RESTART:
> +			case SYSTEM_HALT:
> +				return ERR_PTR(-EIO);
> +			default:
> +				break;
> +		}
>  		spin_lock(&xs_state.reply_lock);
>  	}
>  
> @@ -215,6 +230,9 @@ void *xenbus_dev_request_and_reply(struct xsd_sockmsg *msg)
>  
>  	mutex_unlock(&xs_state.request_mutex);
>  
> +	if (IS_ERR(ret))
> +		return ret;
> +
>  	if ((msg->type == XS_TRANSACTION_END) ||
>  	    ((req_msg.type == XS_TRANSACTION_START) &&
>  	     (msg->type == XS_ERROR)))

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Processed: Re:  Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: test report for Xen 4.3 RC1
  2013-11-10 20:20       ` Matt Wilson
@ 2013-11-10 20:30         ` xen
  0 siblings, 0 replies; 11+ messages in thread
From: xen @ 2013-11-10 20:30 UTC (permalink / raw)
  To: Matt Wilson, xen-devel

Processing commands for xen@bugs.xenproject.org:

> On Fri, Nov 08, 2013 at 11:21:21AM -0500, Konrad Rzeszutek Wilk wrote:
Command failed: Unknown command `On'. at /srv/xen-devel-bugs/lib/emesinae/control.pl line 437, <M> line 51.
Stop processing here.

---
Xen Hypervisor Bug Tracker
See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on reporting bugs
Contact xen-bugs-owner@bugs.xenproject.org with any infrastructure issues

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: test report for Xen 4.3 RC1
  2013-11-08 16:21     ` Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: " Konrad Rzeszutek Wilk
  2013-11-08 16:30       ` Processed: " xen
  2013-11-10 20:20       ` Matt Wilson
@ 2013-11-11  2:40       ` Liu, SongtaoX
  2013-11-11  2:45         ` Processed: " xen
  2 siblings, 1 reply; 11+ messages in thread
From: Liu, SongtaoX @ 2013-11-11  2:40 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, george.dunlap@eu.citrix.com,
	xen@bugs.xenproject.org, Zhou, Chao, Xu, Jiajun, Zhang, Yang Z
  Cc: Xu, YongweiX, xen-devel@lists.xen.org

Yes, the patch fixes the dom0 hang during reboot when a PCI device is still attached to a guest.
Thanks.


Regards
Songtao

> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Saturday, November 09, 2013 12:21 AM
> To: Ren, Yongjie; george.dunlap@eu.citrix.com; xen@bugs.xenproject.org
> Cc: Xu, YongweiX; Liu, SongtaoX; Tian, Yongxue; xen-devel@lists.xen.org
> Subject: Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung."
> Was:Re: [Xen-devel] test report for Xen 4.3 RC1
> 
> On Tue, May 28, 2013 at 11:21:56AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > 5. Dom0 cannot be shutdown before PCI device detachment from guest
> > > >   http://bugzilla.xen.org/bugzilla/show_bug.cgi?id=1826
> > >
> > > Ok, I can reproduce that too.
> >
> > This is what dom0 tells me:
> >
> > [  483.586675] INFO: task init:4163 blocked for more than 120 seconds.
> > [  483.603675] "echo 0 >
> /proc/sys/kernel/hung_task_timG^G[  483.620747] init            D
> ffff880062b59c78  5904  4163      1 0x00000000
> > [  483.637699]  ffff880062b59bc8 0000000000000^G[  483.655189]
> > ffff880062b58000 ffff880062b58000 ffff880062b58010 ffff880062b58000 [
> > 483.672505]  ffff880062b59fd8 ffff880062b58000 ffff880062f20180
> ffff880078bca500 [  483.689527] Call Trace:
> > [  483.706298]  [<ffffffff816a0814>] schedule+0x24/0x70 [  483.723604]
> > [<ffffffff813bb0dd>] read_reply+0xad/0x160 [  483.741162]
> > [<ffffffff810b6b10>] ? wake_up_bit+0x40/0x40 [  483.758572]
> > [<ffffffff813bb274>] xs_talkv+0xe4/0x1f0 [  483.775741]
> > [<ffffffff813bb3c6>] xs_single+0x46/0x60 [  483.792791]
> > [<ffffffff813bbab4>] xenbus_transaction_start+0x24/0x60
> > [  483.809929]  [<ffffffff813ba202>] __xenbus_switch_ste+0x32/0x120
> > ^G[  483.826947]  [<ffffffff8142df39>] ? __dev_printk+0x39/0x90 [
> > 483.843792]  [<ffffffff8142dfde>] ? _dev_info+0x4e/0x50 [  483.860412]
> > [<ffffffff813ba2fb>] xenbus_switch_state+0xb/0x10 [  483.877312]
> > [<ffffffff813bd487>] xenbus_dev_shutdown+0x37/0xa0 [  483.894036]
> > [<ffffffff8142e275>] device_shutdown+0x15/0x180 [  483.910605]
> > [<ffffffff810a8841>] kernel_restart_prepare+0x31/0x40 [  483.927100]
> > [<ffffffff810a88a1>] kernel_restart+0x11^G[  483.943262]
> > [<ffffffff810a8ab5>] SYSC_reboot+0x1b5/0x260 [  483.959480]
> > [<ffffffff810ed52d>] ? trace_hardirqs_on_caller+0^G[  483.975786]
> > [<ffffffff810ed5fd>] ? trace_hardirqs_on+0xd/0x10 [  483.991819]
> > [<ffffffff8119db03>] ? kmem_cache_free+0x123/0x360 [  484.007675]
> > [<ffffffff8115c725>] ? __free_pages+0x25/0x^G[  484.023336]
> > [<ffffffff8115c9ac>] ? free_pages+0x4c/0x50 [  484.039176]
> > [<ffffffff8108b527>] ? __mmdrop+0x67/0xd0 [  484.055174]
> > [<ffffffff816aae95>] ? sysret_check+0x22/0x5d [  484.070747]
> > [<ffffffff810ed52d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
> > [  484.086121]  [<ffffffff810a8b69>] SyS_reboot+0x9/0x10 [
> > 484.101318]  [<ffffffff816aae69>] system_call_fastpath+0x16/0x1b [
> > 484.116585] 3 locks held by init/4163:
> > [  484.131650]+.+.+.}, at: [<ffffffff810a89e0>] SYSC_reboot+0xe0/0x260
> > ^G^G^G^G^G^G[  484.147704]  #1:  (&__lockdep_no_validate__){......},
> > at: [<ffffffff8142e323>] device_shutdown+0xc3/0x180 [  484.164359]
> > #2:  (&xs_state.request_mutex){+.+...}, at: [<ffffffff813bb1fb>]
> > xs_talkv+0x6b/0x1f0
> >
> 
> A bit of debugging shows that when we are in this state:
> 
> 
> MSent SIGKILL to[  100.454603] xen-pciback pci-1-0: shutdown
> 
> telnet> send brk
> [  110.134554] SysRq : HELP : loglevel(0-9) reboot(b) crash(c)
> terminate-all-tasks(e) memory-full-oom-kill(f) debug(g) kill-all-tasks(i)
> thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l)
> show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p)
> show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) force-fb(V)
> show-blocked-tasks(w) dump-ftrace-buffer(z)
> 
> ... snip..
> 
>  xenstored       x 0000000000000002  5504  3437      1 0x00000006
>   ffff88006b6efc88 0000000000000246 0000000000000d6d ffff88006b6ee000
>   ffff88006b6effd8 ffff88006b6ee000 ffff88006b6ee010 ffff88006b6ee000
>   ffff88006b6effd8 ffff88006b6ee000 ffff88006bc39500 ffff8800788b5480
>  Call Trace:
>   [<ffffffff8110fede>] ? cgroup_exit+0x10e/0x130
>   [<ffffffff816b1594>] schedule+0x24/0x70
>   [<ffffffff8109c43d>] do_exit+0x79d/0xbc0
>   [<ffffffff8109c981>] do_group_exit+0x51/0x140
>   [<ffffffff810ae6f4>] get_signal_to_deliver+0x264/0x760
>   [<ffffffff8104c49f>] do_signal+0x4f/0x610
>   [<ffffffff811c62ce>] ? __sb_end_write+0x2e/0x60
>   [<ffffffff811c3d39>] ? vfs_write+0x129/0x170
>   [<ffffffff8104cabd>] do_notify_resume+0x5d/0x80
>   [<ffffffff816bc372>] int_signal+0x12/0x17
> 
> 
> The 'x' means that the task has been killed.
> 
> (The other two threads 'xenbus' and 'xenwatch' are sleeping).
> 
> Since the xenstored can actually be in a domain nowadays and not
> just in the initial domain and xenstored can be restarted anytime - we
> can't depend on the task pid. Nor can we depend on the other
> domain telling us that it is dead.
> 
> The best we can do is to get out of the way of the shutdown
> process and not hang on forever.
> 
> This patch should solve it:
> From 228bb2fcde1267ed2a0b0d386f54d79ecacd0eb4 Mon Sep 17 00:00:00
> 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Fri, 8 Nov 2013 10:48:58 -0500
> Subject: [PATCH] xen/xenbus: Avoid synchronous wait on XenBus stalling
>  shutdown/restart.
> 
> The 'read_reply' works with 'process_msg' to read of a reply in XenBus.
> 'process_msg' is running from within the 'xenbus' thread. Whenever
> a message shows up in XenBus it is put on a xs_state.reply_list list
> and 'read_reply' picks it up.
> 
> The problem is if the backend domain or the xenstored process is killed.
> In which case 'xenbus' is still awaiting - and 'read_reply' if called -
> stuck forever waiting for the reply_list to have some contents.
> 
> This is normally not a problem - as the backend domain can come back
> or the xenstored process can be restarted. However if the domain
> is in process of being powered off/restarted/halted - there is no
> point of waiting on it coming back - as we are effectively being
> terminated and should not impede the progress.
> 
> This patch solves this problem by checking the 'system_state' value
> to see if we are in heading towards death. We also make the wait
> mechanism a bit more asynchronous.
> 
> Fixes-Bug: http://bugs.xenproject.org/xen/bug/8
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  drivers/xen/xenbus/xenbus_xs.c |   24 +++++++++++++++++++++---
>  1 files changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
> index b6d5fff..177fb19 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -148,9 +148,24 @@ static void *read_reply(enum xsd_sockmsg_type
> *type, unsigned int *len)
> 
>  	while (list_empty(&xs_state.reply_list)) {
>  		spin_unlock(&xs_state.reply_lock);
> -		/* XXX FIXME: Avoid synchronous wait for response here. */
> -		wait_event(xs_state.reply_waitq,
> -			   !list_empty(&xs_state.reply_list));
> +		wait_event_timeout(xs_state.reply_waitq,
> +				   !list_empty(&xs_state.reply_list),
> +				   msecs_to_jiffies(500));
> +
> +		/*
> +		 * If we are in the process of being shut-down there is
> +		 * no point of trying to contact XenBus - it is either
> +		 * killed (xenstored application) or the other domain
> +		 * has been killed or is unreachable.
> +		 */
> +		switch (system_state) {
> +			case SYSTEM_POWER_OFF:
> +			case SYSTEM_RESTART:
> +			case SYSTEM_HALT:
> +				return ERR_PTR(-EIO);
> +			default:
> +				break;
> +		}
>  		spin_lock(&xs_state.reply_lock);
>  	}
> 
> @@ -215,6 +230,9 @@ void *xenbus_dev_request_and_reply(struct
> xsd_sockmsg *msg)
> 
>  	mutex_unlock(&xs_state.request_mutex);
> 
> +	if (IS_ERR(ret))
> +		return ret;
> +
>  	if ((msg->type == XS_TRANSACTION_END) ||
>  	    ((req_msg.type == XS_TRANSACTION_START) &&
>  	     (msg->type == XS_ERROR)))
> --
> 1.7.7.6

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Processed: RE: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: test report for Xen 4.3 RC1
  2013-11-11  2:40       ` Liu, SongtaoX
@ 2013-11-11  2:45         ` xen
  0 siblings, 0 replies; 11+ messages in thread
From: xen @ 2013-11-11  2:45 UTC (permalink / raw)
  To: Liu, SongtaoX, xen-devel

Processing commands for xen@bugs.xenproject.org:

> Yes, the patch fixed the dom0 hang issue during rebooting with guest pci de=
Command failed: Unknown command `Yes,'. at /srv/xen-devel-bugs/lib/emesinae/control.pl line 437, <M> line 50.
Stop processing here.

---
Xen Hypervisor Bug Tracker
See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on reporting bugs
Contact xen-bugs-owner@bugs.xenproject.org with any infrastructure issues

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: test report for Xen 4.3 RC1
  2013-05-28 15:24     ` George Dunlap
@ 2013-11-11 10:22       ` Ian Campbell
  0 siblings, 0 replies; 11+ messages in thread
From: Ian Campbell @ 2013-11-11 10:22 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel@lists.xen.org

On Tue, 2013-05-28 at 16:24 +0100, George Dunlap wrote:
> > create !
> > title -1 "linux, xenbus mutex hangs when rebooting dom0 and guests hung."
> 
> 1. I think that these commands have to come at the top
> 2. You don't need quotes in the title
> 3. You need to be polite and say "thanks" at the end so it knows it can 
> stop paying attention. :-)

4. Use Bcc and not Cc so that the entire subsequent thread doesn't get
sent to the bot when folks reply-all.

Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread

Thread overview: 11+ messages
2013-05-27  3:49 test report for Xen 4.3 RC1 Ren, Yongjie
2013-05-28 15:15 ` Konrad Rzeszutek Wilk
2013-05-28 15:21   ` Konrad Rzeszutek Wilk
2013-05-28 15:24     ` George Dunlap
2013-11-11 10:22       ` Ian Campbell
2013-11-08 16:21     ` Is: linux, xenbus mutex hangs when rebooting dom0 and guests hung." Was:Re: " Konrad Rzeszutek Wilk
2013-11-08 16:30       ` Processed: " xen
2013-11-10 20:20       ` Matt Wilson
2013-11-10 20:30         ` Processed: " xen
2013-11-11  2:40       ` Liu, SongtaoX
2013-11-11  2:45         ` Processed: " xen
