* Hypervisor error messages after xl block-detach with linux 3.18-rc5
@ 2014-11-21  8:42 Juergen Gross
  2014-11-21 13:57 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 14+ messages in thread
From: Juergen Gross @ 2014-11-21  8:42 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com

Hi,

while testing my "linear p2m list" patches I saw the following
problem (even without my patches in place):

In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
disk image of a guest by attaching it to dom0:

xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
mount /dev/xvda2 /mnt
...
umount /mnt
xl block-detach 0 xvda

This worked without any problem, but after some seconds the following
messages were issued on the console:

(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61110 (pfn 1f3f21c)
(XEN) mm.c:2995:d0 Error while pinning mfn 61110
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61110 (pfn 1f3f21c)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61111 (pfn 1f3f21d)
(XEN) mm.c:2995:d0 Error while pinning mfn 61111
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61111 (pfn 1f3f21d)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61120 (pfn 1f3f22c)
(XEN) mm.c:2995:d0 Error while pinning mfn 61120
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61120 (pfn 1f3f22c)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61121 (pfn 1f3f22d)
(XEN) mm.c:2995:d0 Error while pinning mfn 61121
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61121 (pfn 1f3f22d)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61102 (pfn 1f3f20e)
(XEN) mm.c:2995:d0 Error while pinning mfn 61102
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61102 (pfn 1f3f20e)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61103 (pfn 1f3f20f)
(XEN) mm.c:2995:d0 Error while pinning mfn 61103
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61103 (pfn 1f3f20f)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
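
For reading the "saw" and "exp" values in the messages above, a rough decoder may help. This is only a sketch: it assumes the 64-bit x86 type_info layout (PG_mask()/PG_shift() and the PGT_* definitions in xen/include/asm-x86/mm.h) as recalled for the Xen 4.x series, and the function name decode_type_info is made up for illustration. Check the constants against the tree you are actually running.

```python
# Sketch: decode a 64-bit Xen x86 page type_info word.
# ASSUMPTION: the bit layout matches PG_mask()/PG_shift() in
# xen/include/asm-x86/mm.h for Xen 4.x; verify before relying on it.

BITS_PER_LONG = 64

# Top four bits hold the (mutually exclusive) page type.
PAGE_TYPES = {
    0: "PGT_none",
    1: "PGT_l1_page_table",
    2: "PGT_l2_page_table",
    3: "PGT_l3_page_table",
    4: "PGT_l4_page_table",
    5: "PGT_seg_desc_page",
    7: "PGT_writable_page",
    8: "PGT_shared_page",
}

# Single-bit flags, keyed by the index that would be passed to PG_shift().
FLAGS = {5: "pinned", 6: "validated", 7: "pae_xen_l2", 8: "partial", 9: "locked"}

# Everything below the flag bits is the type reference count.
COUNT_MASK = (1 << (BITS_PER_LONG - 9)) - 1


def decode_type_info(value):
    """Split a type_info word into page type, flag names and use count."""
    ptype = value >> (BITS_PER_LONG - 4)
    flags = [name for idx, name in sorted(FLAGS.items())
             if value & (1 << (BITS_PER_LONG - idx))]
    return {
        "type": PAGE_TYPES.get(ptype, "unknown(%d)" % ptype),
        "flags": flags,
        "count": value & COUNT_MASK,
    }


if __name__ == "__main__":
    for label, raw in (("saw", "7400000000000002"), ("exp", "1000000000000000")):
        print(label, decode_type_info(int(raw, 16)))
```

If the assumed layout holds, the frames were in use as validated writable pages with a type count of 2 at the moment dom0 asked to pin them as L1 page tables, which would be consistent with the "Error while pinning" lines.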

Is this a known issue?


Juergen

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-21  8:42 Hypervisor error messages after xl block-detach with linux 3.18-rc5 Juergen Gross
@ 2014-11-21 13:57 ` Konrad Rzeszutek Wilk
  2014-11-24  9:55   ` Juergen Gross
  0 siblings, 1 reply; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-11-21 13:57 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
> Hi,
> 
> while testing my "linear p2m list" patches I saw the following
> problem (even without my patches in place):
> 
> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
> disk image of a guest by attaching it to dom0:
> 
> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
> mount /dev/xvda2 /mnt
> ...
> umount /mnt
> xl block-detach 0 xvda
> 
> Worked without any problem. After some seconds the following messages
> were issued on the console:
> 
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61110 (pfn 1f3f21c)
> (XEN) mm.c:2995:d0 Error while pinning mfn 61110
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61110 (pfn 1f3f21c)
> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61111 (pfn 1f3f21d)
> (XEN) mm.c:2995:d0 Error while pinning mfn 61111
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61111 (pfn 1f3f21d)
> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61120 (pfn 1f3f22c)
> (XEN) mm.c:2995:d0 Error while pinning mfn 61120
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61120 (pfn 1f3f22c)
> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61121 (pfn 1f3f22d)
> (XEN) mm.c:2995:d0 Error while pinning mfn 61121
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61121 (pfn 1f3f22d)
> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61102 (pfn 1f3f20e)
> (XEN) mm.c:2995:d0 Error while pinning mfn 61102
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61102 (pfn 1f3f20e)
> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61103 (pfn 1f3f20f)
> (XEN) mm.c:2995:d0 Error while pinning mfn 61103
> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
> for mfn 61103 (pfn 1f3f20f)
> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
> 
> Is this a known issue?

No, this is the first time I have seen it.
> 
> 
> Juergen
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-21 13:57 ` Konrad Rzeszutek Wilk
@ 2014-11-24  9:55   ` Juergen Gross
  2014-11-24 10:02     ` Ian Campbell
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Juergen Gross @ 2014-11-24  9:55 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xensource.com

On 11/21/2014 02:57 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
>> Hi,
>>
>> while testing my "linear p2m list" patches I saw the following
>> problem (even without my patches in place):
>>
>> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
>> disk image of a guest by attaching it to dom0:
>>
>> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
>> mount /dev/xvda2 /mnt
>> ...
>> umount /mnt
>> xl block-detach 0 xvda

Did some more testing:
- It seems to occur only when attaching a block device to dom0, and the
   problem shows up only after doing the block-detach.
- Sometimes I see only NMI watchdog messages. Looking into the hanging cpu
   state via xen debug keys, I can see the cpu(s) in question are spinning
   in _raw_spin_lock():
   __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
   The hanging cpus were executing some random user processes (cron,
   bash, xargs); cr2 contained user addresses.
- Up to now I've seen the problem only on a rather huge machine (128GB,
   120 cpus). I just did a quick test on my laptop and nothing bad
   happened.
- Even on the huge machine, just doing block-attach and block-detach
   without mounting the filesystem on the disk was harmless.
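
The two variants compared in the last point (attach/detach with and without the mount in between) could be scripted roughly as follows. This is a sketch only: the helper names repro_commands and run are hypothetical, the work done on the image between mount and umount is elided in the report and therefore left out, and the default is a dry run that just prints the commands.

```python
import subprocess

# Disk image path exactly as given in the report.
IMAGE = "file:/var/lib/libvirt/images/opensuse13-1/xvda"


def repro_commands(with_mount, domid=0, vdev="xvda",
                   partition="/dev/xvda2", mountpoint="/mnt"):
    """Build the command sequence for one reproduction attempt."""
    cmds = [["xl", "block-attach", str(domid), "%s,%s,w" % (IMAGE, vdev)]]
    if with_mount:
        # The report modifies the image between these two steps; that part
        # is not specified there, so nothing is done here.
        cmds.append(["mount", partition, mountpoint])
        cmds.append(["umount", mountpoint])
    cmds.append(["xl", "block-detach", str(domid), vdev])
    return cmds


def run(cmds, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in cmds:
        print("+", " ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)


if __name__ == "__main__":
    run(repro_commands(with_mount=True))    # variant that showed the problem
    run(repro_commands(with_mount=False))   # variant reported as harmless
```

Keeping the two sequences side by side makes it easier to loop one of them while watching the Xen console for the messages.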

Any ideas how to find the problem?


Juergen

>>
>> Worked without any problem. After some seconds the following messages
>> were issued on the console:
>>
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61110 (pfn 1f3f21c)
>> (XEN) mm.c:2995:d0 Error while pinning mfn 61110
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61110 (pfn 1f3f21c)
>> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61111 (pfn 1f3f21d)
>> (XEN) mm.c:2995:d0 Error while pinning mfn 61111
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61111 (pfn 1f3f21d)
>> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61120 (pfn 1f3f22c)
>> (XEN) mm.c:2995:d0 Error while pinning mfn 61120
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61120 (pfn 1f3f22c)
>> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61121 (pfn 1f3f22d)
>> (XEN) mm.c:2995:d0 Error while pinning mfn 61121
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61121 (pfn 1f3f22d)
>> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61102 (pfn 1f3f20e)
>> (XEN) mm.c:2995:d0 Error while pinning mfn 61102
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61102 (pfn 1f3f20e)
>> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61103 (pfn 1f3f20f)
>> (XEN) mm.c:2995:d0 Error while pinning mfn 61103
>> (XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000)
>> for mfn 61103 (pfn 1f3f20f)
>> (XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
>>
>> Is this a known issue?
>
> No. First time I see it.
>>
>>
>> Juergen
>>

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-24  9:55   ` Juergen Gross
@ 2014-11-24 10:02     ` Ian Campbell
  2014-11-25 16:10       ` Juergen Gross
  2014-11-25 17:01       ` Juergen Gross
  2014-11-24 10:20     ` Jan Beulich
       [not found]     ` <5473147C020000780004A3D5@suse.com>
  2 siblings, 2 replies; 14+ messages in thread
From: Ian Campbell @ 2014-11-24 10:02 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On Mon, 2014-11-24 at 10:55 +0100, Juergen Gross wrote:
> On 11/21/2014 02:57 PM, Konrad Rzeszutek Wilk wrote:
> > On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
> >> Hi,
> >>
> >> while testing my "linear p2m list" patches I saw the following
> >> problem (even without my patches in place):
> >>
> >> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
> >> disk image of a guest by attaching it to dom0:
> >>
> >> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
> >> mount /dev/xvda2 /mnt
> >> ...
> >> umount /mnt
> >> xl block-detach 0 xvda
> 
> Did some more testing:
> - Seems to occur only when attaching a block device to dom0

That means a qdisk-backed situation then, I think.

Is your qemu up to date?

http://xenbits.xen.org/gitweb/?p=qemu-upstream-unstable.git;a=commit;h=abbbc2f09a53f8f9ee565356ab11a78af006e45e resulted in slightly different messages, but could be related?

Ian.

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-24  9:55   ` Juergen Gross
  2014-11-24 10:02     ` Ian Campbell
@ 2014-11-24 10:20     ` Jan Beulich
       [not found]     ` <5473147C020000780004A3D5@suse.com>
  2 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2014-11-24 10:20 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

>>> On 24.11.14 at 10:55, <JGross@suse.com> wrote:
> - Sometimes I see only NMI watchdog messages, looking into hanging cpu
>    state via xen debug keys I can see the cpu(s) in question are spinning
>    in _raw_spin_lock():
>    __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
>    The hanging cpus were executing some random user processes (cron,
>    bash, xargs), cr2 contained user addresses.

Is this perhaps what
http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
appears to be about?

Jan

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
       [not found]     ` <5473147C020000780004A3D5@suse.com>
@ 2014-11-24 10:59       ` Juergen Gross
  2014-11-24 15:09         ` Juergen Gross
  0 siblings, 1 reply; 14+ messages in thread
From: Juergen Gross @ 2014-11-24 10:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xensource.com

On 11/24/2014 11:20 AM, Jan Beulich wrote:
>>>> On 24.11.14 at 10:55, <JGross@suse.com> wrote:
>> - Sometimes I see only NMI watchdog messages, looking into hanging cpu
>>     state via xen debug keys I can see the cpu(s) in question are spinning
>>     in _raw_spin_lock():
>>     __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
>>     The hanging cpus were executing some random user processes (cron,
>>     bash, xargs), cr2 contained user addresses.
>
> Is this perhaps what
> http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
> appears to be about?

Hmm, I'm not sure.

I'll try a 3.17 kernel to verify.


Juergen

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-24 10:59       ` Juergen Gross
@ 2014-11-24 15:09         ` Juergen Gross
  2014-11-24 17:27           ` Juergen Gross
  2014-11-25  4:47           ` Juergen Gross
  0 siblings, 2 replies; 14+ messages in thread
From: Juergen Gross @ 2014-11-24 15:09 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xensource.com

On 11/24/2014 11:59 AM, Juergen Gross wrote:
> On 11/24/2014 11:20 AM, Jan Beulich wrote:
>>>>> On 24.11.14 at 10:55, <JGross@suse.com> wrote:
>>> - Sometimes I see only NMI watchdog messages, looking into hanging cpu
>>>     state via xen debug keys I can see the cpu(s) in question are
>>> spinning
>>>     in _raw_spin_lock():
>>>     __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
>>>     The hanging cpus were executing some random user processes (cron,
>>>     bash, xargs), cr2 contained user addresses.
>>
>> Is this perhaps what
>> http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
>> appears to be about?
>
> Hmm, I'm not sure.
>
> I'll try a 3.17 kernel to verify.

Still seeing the issue, but less frequently. OTOH I just found in the
above thread on lkml that 3.17 is showing that issue, too. :-(

I'll try to set up a pv-variant of Linus' patch and test it...


Juergen

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-24 15:09         ` Juergen Gross
@ 2014-11-24 17:27           ` Juergen Gross
  2014-11-25  7:50             ` Jan Beulich
  2014-11-25  4:47           ` Juergen Gross
  1 sibling, 1 reply; 14+ messages in thread
From: Juergen Gross @ 2014-11-24 17:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xensource.com

On 11/24/2014 04:09 PM, Juergen Gross wrote:
> On 11/24/2014 11:59 AM, Juergen Gross wrote:
>> On 11/24/2014 11:20 AM, Jan Beulich wrote:
>>>>>> On 24.11.14 at 10:55, <JGross@suse.com> wrote:
>>>> - Sometimes I see only NMI watchdog messages, looking into hanging cpu
>>>>     state via xen debug keys I can see the cpu(s) in question are
>>>> spinning
>>>>     in _raw_spin_lock():
>>>>     __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
>>>>     The hanging cpus were executing some random user processes (cron,
>>>>     bash, xargs), cr2 contained user addresses.
>>>
>>> Is this perhaps what
>>> http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
>>>
>>> appears to be about?
>>
>> Hmm, I'm not sure.
>>
>> I'll try a 3.17 kernel to verify.
>
> Still seeing the issue, but less frequent. OTOH I just found in above
> thread in lkml that 3.17 is showing that issue, too. :-(
>
> I'll try to setup a pv-variant of Linus' patch and test it...

First test seems to be okay, no immediate NMI message...

Any idea why the block-attach/detach would trigger this problem so
easily? I can see the dependency on the high cpu count, but fail to do
so for the xl actions.


Juergen

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-24 15:09         ` Juergen Gross
  2014-11-24 17:27           ` Juergen Gross
@ 2014-11-25  4:47           ` Juergen Gross
  1 sibling, 0 replies; 14+ messages in thread
From: Juergen Gross @ 2014-11-25  4:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xensource.com, Ian.Campbell@citrix.com

On 11/24/2014 04:09 PM, Juergen Gross wrote:
> On 11/24/2014 11:59 AM, Juergen Gross wrote:
>> On 11/24/2014 11:20 AM, Jan Beulich wrote:
>>>>>> On 24.11.14 at 10:55, <JGross@suse.com> wrote:
>>>> - Sometimes I see only NMI watchdog messages, looking into hanging cpu
>>>>     state via xen debug keys I can see the cpu(s) in question are
>>>> spinning
>>>>     in _raw_spin_lock():
>>>>     __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
>>>>     The hanging cpus were executing some random user processes (cron,
>>>>     bash, xargs), cr2 contained user addresses.
>>>
>>> Is this perhaps what
>>> http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
>>>
>>> appears to be about?
>>
>> Hmm, I'm not sure.
>>
>> I'll try a 3.17 kernel to verify.
>
> Still seeing the issue, but less frequent. OTOH I just found in above
> thread in lkml that 3.17 is showing that issue, too. :-(
>
> I'll try to setup a pv-variant of Linus' patch and test it...

Okay, the test survived the night. It really seems to be the same issue.

I think I'm now seeing the qemu issue Ian mentioned:

[  140.182849] xen:grant_table: WARNING: g.e. 0x10 still in use!
[  140.182859] deferring g.e. 0x10 (pfn 0xffffffffffffffff)
[  140.182864] xen:grant_table: WARNING: g.e. 0xf still in use!
[  140.182866] deferring g.e. 0xf (pfn 0xffffffffffffffff)
...
[  140.183128] xen:grant_table: WARNING: g.e. 0x2a still in use!
[  140.183129] deferring g.e. 0x2a (pfn 0xffffffffffffffff)
[  142.182274] xen:grant_table: freeing g.e. 0x9
[  145.182284] xen:grant_table: freeing g.e. 0x44
[  147.182272] xen:grant_table: freeing g.e. 0x43
[  501.182282] xen:grant_table: g.e. 0x10 still pending
[  501.182315] xen:grant_table: g.e. 0xf still pending
...

I'll update qemu and try again...
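
To see at a glance which grant entries were deferred and never reported as freed, the dmesg lines above can be summarized with a small script. This is a sketch under stated assumptions: the regular expressions are derived only from the message wording in the excerpt, the function name summarize is made up, and because the excerpt elides lines with "..." the resulting sets are necessarily incomplete.

```python
import re

# Patterns taken from the wording of the messages quoted in the thread.
DEFER = re.compile(r"deferring g\.e\. (0x[0-9a-f]+)")
FREE = re.compile(r"freeing g\.e\. (0x[0-9a-f]+)")
PENDING = re.compile(r"g\.e\. (0x[0-9a-f]+) still pending")

# Lines copied from the excerpt (elided parts are simply missing).
SAMPLE = """\
[  140.182849] xen:grant_table: WARNING: g.e. 0x10 still in use!
[  140.182859] deferring g.e. 0x10 (pfn 0xffffffffffffffff)
[  140.182864] xen:grant_table: WARNING: g.e. 0xf still in use!
[  140.182866] deferring g.e. 0xf (pfn 0xffffffffffffffff)
[  140.183128] xen:grant_table: WARNING: g.e. 0x2a still in use!
[  140.183129] deferring g.e. 0x2a (pfn 0xffffffffffffffff)
[  142.182274] xen:grant_table: freeing g.e. 0x9
[  145.182284] xen:grant_table: freeing g.e. 0x44
[  147.182272] xen:grant_table: freeing g.e. 0x43
[  501.182282] xen:grant_table: g.e. 0x10 still pending
[  501.182315] xen:grant_table: g.e. 0xf still pending
"""


def summarize(lines):
    """Collect grant entry numbers by the kind of message they appear in."""
    deferred, freed, pending = set(), set(), set()
    for line in lines:
        for pattern, bucket in ((DEFER, deferred), (FREE, freed),
                                (PENDING, pending)):
            match = pattern.search(line)
            if match:
                bucket.add(int(match.group(1), 16))
    return {
        "deferred": deferred,
        "freed": freed,
        "never_freed": deferred - freed,
        "reported_pending": pending,
    }


if __name__ == "__main__":
    for key, entries in summarize(SAMPLE.splitlines()).items():
        print(key, sorted(hex(e) for e in entries))
```

Entries that stay in never_freed long after the detach are the ones worth correlating with the backend, since something is presumably still holding a mapping of them.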


Juergen

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-24 17:27           ` Juergen Gross
@ 2014-11-25  7:50             ` Jan Beulich
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2014-11-25  7:50 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

>>> On 24.11.14 at 18:27, <JGross@suse.com> wrote:
> On 11/24/2014 04:09 PM, Juergen Gross wrote:
>> Still seeing the issue, but less frequent. OTOH I just found in above
>> thread in lkml that 3.17 is showing that issue, too. :-(
>>
>> I'll try to setup a pv-variant of Linus' patch and test it...
> 
> First test seems to be okay, no immediate NMI message...
> 
> Any idea why the block-attach/detach would trigger this problem so
> easily? I can see the dependency on the high cpu count, but fail to do
> so for the xl actions.

No, no idea. It was only the similar symptoms that made me consider
this a possibility.

Jan

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-24 10:02     ` Ian Campbell
@ 2014-11-25 16:10       ` Juergen Gross
  2014-11-25 16:18         ` Ian Campbell
  2014-11-25 17:01       ` Juergen Gross
  1 sibling, 1 reply; 14+ messages in thread
From: Juergen Gross @ 2014-11-25 16:10 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On 11/24/2014 11:02 AM, Ian Campbell wrote:
> On Mon, 2014-11-24 at 10:55 +0100, Juergen Gross wrote:
>> On 11/21/2014 02:57 PM, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
>>>> Hi,
>>>>
>>>> while testing my "linear p2m list" patches I saw the following
>>>> problem (even without my patches in place):
>>>>
>>>> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
>>>> disk image of a guest by attaching it to dom0:
>>>>
>>>> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
>>>> mount /dev/xvda2 /mnt
>>>> ...
>>>> umount /mnt
>>>> xl block-detach 0 xvda
>>
>> Did some more testing:
>> - Seems to occur only when attaching a block device to dom0
>
> That means a qdisk backed situation then, I think.
>
> Is your qemu up to date?
>
> http://xenbits.xen.org/gitweb/?p=qemu-upstream-unstable.git;a=commit;h=abbbc2f09a53f8f9ee565356ab11a78af006e45e resulted in slightly different messages, but could be related?

Tried it with xen-unstable and the newest qemu. Now I have another problem:

xl -vvv block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
libxl: debug: libxl.c:4178:libxl_device_disk_add: ao 0xbac0f0: create: how=(nil) callback=(nil) poller=0xbac1d0
libxl: error: libxl.c:1897:device_addrm_aocomplete: unable to add device
libxl: debug: libxl_event.c:1739:libxl__ao_complete: ao 0xbac0f0: complete, rc=-16
libxl: debug: libxl.c:4178:libxl_device_disk_add: ao 0xbac0f0: inprogress: poller=0xbac1d0, flags=ic
libxl: debug: libxl_event.c:1711:libxl__ao__destroy: ao 0xbac0f0: destroy
libxl_device_disk_add failed.


rc=-16 means ERROR_JSON_CONFIG_EMPTY. Could it be that the newest tools
won't let me attach a block device to dom0?

IMHO that's a severe regression!
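
For picking the failing return code out of "xl -vvv" output like the above, something along these lines could be used. It is a sketch: the function name failing_rcs is hypothetical, and the lookup table contains only the one mapping stated in this thread (-16), so any other code is reported as unknown rather than guessed. The remaining names would have to come from libxl's own headers.

```python
import re

# Only this entry is taken from the thread; other libxl error names are
# deliberately not filled in here.
KNOWN_RC = {-16: "ERROR_JSON_CONFIG_EMPTY"}

# Matches "rc=<number>" as printed by libxl__ao_complete in the log above.
RC_RE = re.compile(r"\brc=(-?\d+)")


def failing_rcs(log_text):
    """Return (rc, name) pairs for every non-zero rc found in the log."""
    results = []
    for match in RC_RE.finditer(log_text):
        rc = int(match.group(1))
        if rc != 0:
            results.append((rc, KNOWN_RC.get(rc, "unknown")))
    return results


if __name__ == "__main__":
    sample = (
        "libxl: error: libxl.c:1897:device_addrm_aocomplete: unable to add device\n"
        "libxl: debug: libxl_event.c:1739:libxl__ao_complete: ao 0xbac0f0: \n"
        "complete, rc=-16\n"
    )
    print(failing_rcs(sample))
```

The pattern does not depend on line breaks, so it also works on output that a mail client has wrapped after the "ao 0x...:" prefix.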


Juergen

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-25 16:10       ` Juergen Gross
@ 2014-11-25 16:18         ` Ian Campbell
  2014-11-25 16:48           ` Juergen Gross
  0 siblings, 1 reply; 14+ messages in thread
From: Ian Campbell @ 2014-11-25 16:18 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On Tue, 2014-11-25 at 17:10 +0100, Juergen Gross wrote:
> On 11/24/2014 11:02 AM, Ian Campbell wrote:
> > On Mon, 2014-11-24 at 10:55 +0100, Juergen Gross wrote:
> >> On 11/21/2014 02:57 PM, Konrad Rzeszutek Wilk wrote:
> >>> On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
> >>>> Hi,
> >>>>
> >>>> while testing my "linear p2m list" patches I saw the following
> >>>> problem (even without my patches in place):
> >>>>
> >>>> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
> >>>> disk image of a guest by attaching it to dom0:
> >>>>
> >>>> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
> >>>> mount /dev/xvda2 /mnt
> >>>> ...
> >>>> umount /mnt
> >>>> xl block-detach 0 xvda
> >>
> >> Did some more testing:
> >> - Seems to occur only when attaching a block device to dom0
> >
> > That means a qdisk backed situation then, I think.
> >
> > Is your qemu up to date?
> >
> > http://xenbits.xen.org/gitweb/?p=qemu-upstream-unstable.git;a=commit;h=abbbc2f09a53f8f9ee565356ab11a78af006e45e resulted in slightly different messages, but could be related?
> 
> Tried it with xen-unstable and newest qemu. Now I have another problem:
> 
> xl -vvv block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
> libxl: debug: libxl.c:4178:libxl_device_disk_add: ao 0xbac0f0: create: 
> how=(nil) callback=(nil) poller=0xbac1d0
> libxl: error: libxl.c:1897:device_addrm_aocomplete: unable to add device
> libxl: debug: libxl_event.c:1739:libxl__ao_complete: ao 0xbac0f0: 
> complete, rc=-16
> libxl: debug: libxl.c:4178:libxl_device_disk_add: ao 0xbac0f0: 
> inprogress: poller=0xbac1d0, flags=ic
> libxl: debug: libxl_event.c:1711:libxl__ao__destroy: ao 0xbac0f0: destroy
> libxl_device_disk_add failed.
> 
> 
> rc=-16 means ERROR_JSON_CONFIG_EMPTY.

Have you updated your initscripts to run the xen-init-dom0 helper? This
replaced the manual xenstore-write stuff a little while back and added
some other init, including creating the necessary json description of
dom0.

Ian.

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-25 16:18         ` Ian Campbell
@ 2014-11-25 16:48           ` Juergen Gross
  0 siblings, 0 replies; 14+ messages in thread
From: Juergen Gross @ 2014-11-25 16:48 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On 11/25/2014 05:18 PM, Ian Campbell wrote:
> On Tue, 2014-11-25 at 17:10 +0100, Juergen Gross wrote:
>> On 11/24/2014 11:02 AM, Ian Campbell wrote:
>>> On Mon, 2014-11-24 at 10:55 +0100, Juergen Gross wrote:
>>>> On 11/21/2014 02:57 PM, Konrad Rzeszutek Wilk wrote:
>>>>> On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
>>>>>> Hi,
>>>>>>
>>>>>> while testing my "linear p2m list" patches I saw the following
>>>>>> problem (even without my patches in place):
>>>>>>
>>>>>> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
>>>>>> disk image of a guest by attaching it to dom0:
>>>>>>
>>>>>> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
>>>>>> mount /dev/xvda2 /mnt
>>>>>> ...
>>>>>> umount /mnt
>>>>>> xl block-detach 0 xvda
>>>>
>>>> Did some more testing:
>>>> - Seems to occur only when attaching a block device to dom0
>>>
>>> That means a qdisk backed situation then, I think.
>>>
>>> Is your qemu up to date?
>>>
>>> http://xenbits.xen.org/gitweb/?p=qemu-upstream-unstable.git;a=commit;h=abbbc2f09a53f8f9ee565356ab11a78af006e45e resulted in slightly different messages, but could be related?
>>
>> Tried it with xen-unstable and newest qemu. Now I have another problem:
>>
>> xl -vvv block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
>> libxl: debug: libxl.c:4178:libxl_device_disk_add: ao 0xbac0f0: create:
>> how=(nil) callback=(nil) poller=0xbac1d0
>> libxl: error: libxl.c:1897:device_addrm_aocomplete: unable to add device
>> libxl: debug: libxl_event.c:1739:libxl__ao_complete: ao 0xbac0f0:
>> complete, rc=-16
>> libxl: debug: libxl.c:4178:libxl_device_disk_add: ao 0xbac0f0:
>> inprogress: poller=0xbac1d0, flags=ic
>> libxl: debug: libxl_event.c:1711:libxl__ao__destroy: ao 0xbac0f0: destroy
>> libxl_device_disk_add failed.
>>
>>
>> rc=-16 means ERROR_JSON_CONFIG_EMPTY.
>
> Have you updated your initscripts to run the xen-init-dom0 helper? This
> replaced the manual xenstore-write stuff a little while back and added
> some other init, including creating the necessary json description of
> dom0.

Bingo! Now it works.

Thanks,

Juergen

* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
  2014-11-24 10:02     ` Ian Campbell
  2014-11-25 16:10       ` Juergen Gross
@ 2014-11-25 17:01       ` Juergen Gross
  1 sibling, 0 replies; 14+ messages in thread
From: Juergen Gross @ 2014-11-25 17:01 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On 11/24/2014 11:02 AM, Ian Campbell wrote:
> On Mon, 2014-11-24 at 10:55 +0100, Juergen Gross wrote:
>> On 11/21/2014 02:57 PM, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
>>>> Hi,
>>>>
>>>> while testing my "linear p2m list" patches I saw the following
>>>> problem (even without my patches in place):
>>>>
>>>> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
>>>> disk image of a guest by attaching it to dom0:
>>>>
>>>> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
>>>> mount /dev/xvda2 /mnt
>>>> ...
>>>> umount /mnt
>>>> xl block-detach 0 xvda
>>
>> Did some more testing:
>> - Seems to occur only when attaching a block device to dom0
>
> That means a qdisk backed situation then, I think.
>
> Is your qemu up to date?
>
> http://xenbits.xen.org/gitweb/?p=qemu-upstream-unstable.git;a=commit;h=abbbc2f09a53f8f9ee565356ab11a78af006e45e resulted in slightly different messages, but could be related?

The newest qemu and xen-unstable together with my dom0 kernel patch made
all issues go away at first glance.


Juergen
