From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: DomU: kernel BUG at arch/x86/xen/enlighten.c:425
Date: Tue, 15 Sep 2015 17:09:11 +0100
Message-ID: <55F842A7.4080903@citrix.com>
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3500229004951187840=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Thomas DEBESSE , xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

--===============3500229004951187840==
Content-Type: multipart/alternative; boundary="------------010208060404000106030308"

--------------010208060404000106030308
Content-Type: text/html; charset="windows-1252"
Content-Length: 8013
Content-Transfer-Encoding: quoted-printable

On 15/09/15 17:03, Thomas DEBESSE wrote:
Hi, I'm replying to this thread from 2013: http://lists.xen.org/archives/html/xen-devel/2013-03/threads.html#00649

Like James Sinclair, all I could find is a closed Debian bug from Dec 2010 with no resolution:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=60770

Do you have some news about this bug?

I got it too with a 3.16 kernel on Debian:

Sep 15 16:57:14 server kernel: [   19.844447] ------------[ cut here ]------------
Sep 15 16:57:14 server kernel: [   19.844468] kernel BUG at /build/linux-sPqfgd/linux-3.16.7-ckt11/arch/x86/xen/enlighten.c:494!
Sep 15 16:57:14 server kernel: [   19.844479] invalid opcode: 0000 [#1] SMP
Sep 15 16:57:14 server kernel: [   19.844487] Modules linked in: fuse nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc evdev coretemp pcspkr ext4 crc16 mbcache jbd2 dm_mod md_mod xen_netfront xen_blkfront
Sep 15 16:57:14 server kernel: [   19.844519] CPU: 1 PID: 930 Comm: cmd Not tainted 3.16.0-4-686-pae #1 Debian 3.16.7-ckt11-1
Sep 15 16:57:14 server kernel: [   19.844529] task: e8ba4560 ti: c29f8000 task.ti: c29f8000
Sep 15 16:57:14 server kernel: [   19.844535] EIP: 0061:[<c100373d>] EFLAGS: 00010282 CPU: 1
Sep 15 16:57:14 server kernel: [   19.844545] EIP is at set_aliased_prot+0x10d/0x120
Sep 15 16:57:14 server kernel: [   19.844551] EAX: ffffffea EBX: ede01000 ECX: cc5ae063 EDX: 80000000
Sep 15 16:57:14 server kernel: [   19.844558] ESI: 00000000 EDI: 80000001 EBP: c29f9dbc ESP: c29f9d98
Sep 15 16:57:14 server kernel: [   19.844564]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
Sep 15 16:57:14 server kernel: [   19.844570] CR0: 8005003b CR2: 00111484 CR3: 029ab000 CR4: 00002660
Sep 15 16:57:14 server kernel: [   19.844578] Stack:
Sep 15 16:57:14 server kernel: [   19.844582]  80000000 cc5ae063 001f3c8a ede01000 ecac2140 00000001 ede02000 00000400
Sep 15 16:57:14 server kernel: [   19.844594]  00000000 c29f9dd0 c1003781 c2831ac0 e8892010 c2831ac0 c29f9ddc c10122be
Sep 15 16:57:14 server kernel: [   19.844606]  00000000 c29f9e00 c1053fa6 c29f9df0 c1002e90 e8ba4560 ecdcf8c0 00000000
Sep 15 16:57:14 server kernel: [   19.844618] Call Trace:
Sep 15 16:57:14 server kernel: [   19.844628]  [<c1003781>] ? xen_free_ldt+0x31/0x40
Sep 15 16:57:14 server kernel: [   19.844640]  [<c10122be>] ? destroy_context+0x2e/0x90
Sep 15 16:57:14 server kernel: [   19.844651]  [<c1053fa6>] ? __mmdrop+0x26/0x90
Sep 15 16:57:14 server kernel: [   19.844659]  [<c1002e90>] ? xen_end_context_switch+0x10/0x20
Sep 15 16:57:14 server kernel: [   19.844668]  [<c107c59f>] ? finish_task_switch+0x9f/0xd0
Sep 15 16:57:14 server kernel: [   19.844677]  [<c1478e60>] ? __schedule+0x230/0x6e0
Sep 15 16:57:14 server kernel: [   19.844685]  [<c116e381>] ? __sb_end_write+0x31/0x70
Sep 15 16:57:14 server kernel: [   19.844694]  [<c117361c>] ? pipe_write+0x34c/0x3d0
Sep 15 16:57:14 server kernel: [   19.844703]  [<c147be59>] ? _raw_spin_lock_irqsave+0x19/0x40
Sep 15 16:57:14 server kernel: [   19.844713]  [<c147baa3>] ? _raw_spin_unlock_irqrestore+0x13/0x20
Sep 15 16:57:14 server kernel: [   19.844723]  [<c1090398>] ? prepare_to_wait+0x48/0x70
Sep 15 16:57:14 server kernel: [   19.844732]  [<c117324d>] ? pipe_wait+0x4d/0x80
Sep 15 16:57:14 server kernel: [   19.844740]  [<c1090680>] ? prepare_to_wait_event+0xd0/0xd0
Sep 15 16:57:14 server kernel: [   19.844749]  [<c11737f1>] ? pipe_read+0x151/0x260
Sep 15 16:57:14 server kernel: [   19.844758]  [<c116bd96>] ? new_sync_read+0x66/0xa0
Sep 15 16:57:14 server kernel: [   19.844766]  [<c116bd30>] ? default_llseek+0x170/0x170
Sep 15 16:57:14 server kernel: [   19.844774]  [<c116c620>] ? vfs_read+0x80/0x150
Sep 15 16:57:14 server kernel: [   19.844780]  [<c116cdc6>] ? SyS_read+0x46/0x90
Sep 15 16:57:14 server kernel: [   19.844789]  [<c147c2df>] ? sysenter_do_call+0x12/0x12
Sep 15 16:57:14 server kernel: [   19.844794] Code: 2e 83 c4 18 5b 5e 5f 5d c3 90 8d 74 26 00 83 3d d4 92 76 c1 02 75 c8 8d b4 26 00 00 00 00 e8 2b 5e 13 00 83 c4 18 5b 5e 5f 5d c3 <0f> 0b 0f 0b 0f 0b 8d b6 00 00 00 00 8d bc 27 00 00 00 00 55 89
Sep 15 16:57:14 server kernel: [   19.844868] EIP: [<c100373d>] set_aliased_prot+0x10d/0x120 SS:ESP 0069:c29f9d98
Sep 15 16:57:14 server kernel: [   19.844882] ---[ end trace 5b8a5a9c639bac8c ]---

The message above is from the DomU kernel. In fact, when I get this message, I'm lucky: it means the error was handled without crashing. Most of the time the VM just reboots itself before logging or printing any message at all.
On Dom0 side, `xl dmesg` shows nothing.

I downgraded my DomU kernel to 3.2 and it seems to work for now but it's not a fix.

I was running xen 4.4.1-9 and linux 3.16.7-ckt11-1 (686-pae) from Debian.

I don't have more information, at all.

The instantiation of HYPERVISOR_update_va_mapping() in set_aliased_prot() has always been buggy in pvops kernels.

This bug should be fixed by c/s 0b0e55 "x86/xen: Probe target addresses in set_aliased_prot before the hypercall" which is in the process of being backported to #stable as a prerequisite for the recent LDT CVE fixes.

~Andrew
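
Editorial aside on reading the register dump in the trace above: the EAX value can be decoded as a signed 32-bit integer and looked up as a negative errno. This is only a sketch. It assumes, without confirmation from the thread, that EAX still holds the return value of the failed hypercall at the point where BUG() fired. The helper names `to_signed` and `errno_name` are hypothetical and exist only for this illustration.

```python
import errno


def to_signed(value_hex: str, bits: int = 32) -> int:
    """Interpret a hex register value as a two's-complement signed integer."""
    value = int(value_hex, 16)
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def errno_name(value_hex: str, bits: int = 32) -> str:
    """Map a negative return code to its symbolic errno name, if there is one."""
    signed = to_signed(value_hex, bits)
    if signed >= 0:
        return "not a negative errno"
    return errno.errorcode.get(-signed, "unknown errno")


# EAX as printed in the oops above (assumed to be the hypercall's return value).
eax = "ffffffea"
print(to_signed(eax), errno_name(eax))
```

Under that assumption, the value reads as a negative EINVAL, which would fit a hypercall that rejected its arguments rather than a random memory fault.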
--------------010208060404000106030308--
--===============3500229004951187840==--