From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: Alignment check on domU (2.6.32)
Date: Tue, 30 Mar 2010 09:53:49 -0700
Message-ID: <4BB22C9D.7030502@goop.org>
References: <32209efe1003292008y5880e9bfib238c089377b4ba7@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <32209efe1003292008y5880e9bfib238c089377b4ba7@mail.gmail.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Natalie Protasevich
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On 03/29/2010 08:08 PM, Natalie Protasevich wrote:
> Hello,
> We are getting alignment check with 2.6.32 kernel running as a domU on
> an AMD system,

Which 2.6.32 is it? Is it stock kernel.org, from xen.git, a distro,
elsewhere?

> while dom0 is a 2.6.18 kernel.
> As far as I know we should not have run into such problem, since this
> is x86_64 kernel. I am aware of the fact that for alignment check trap
> AC bit needs to be set in eflags and AM should be set in CR0. I
> tracked cr0 and AM was getting set, and problem was occurring when
> something was setting AC flag at the time of calling memcpy_c(). I
> cheated and cleared the AM flag in cr0 (as one can see in this trace)
> but this didn't help. I haven't figured out what sets the AM flag...

Do you have any other domains running at the time? What CPU is this?
Does it run the same kernel native OK?

    J

>
> Here is the trace:
>
> [ 80.342300] alignment check: 0000 [#1] SMP
> [ 80.342323] last sysfs file: /sys/devices/virtual/vc/vcsa7/dev
> [ 80.342330] CPU 1
> [ 80.342339] Pid: 3875, comm: loas_check Not tainted 2.6.32.10+drm33.1 #12
> [ 80.342347] RIP: e030:[] [] memcpy_c+0xb/0x20
> [ 80.342365] RSP: e02b:ffff88015556d9b0 EFLAGS: 00050246
> [ 80.342371] RAX: ffff88017360cc8c RBX: ffff880176d91900 RCX: 0000000000000002
> [ 80.342379] RDX: 0000000000000000 RSI: ffff880176d91958 RDI: ffff88017360cc8c
> [ 80.342388] RBP: ffff88015556d9e8 R08: ffffffff81570260 R09: ffffffff81ae8840
> [ 80.342395] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [ 80.342403] R13: 000000000000000e R14: ffff880173f3fc00 R15: ffff880176d91958
> [ 80.342417] FS: 00007f0f4dafd6e0(0000) GS:ffff880028047000(0000) knlGS:0000000000000000
> [ 80.342425] CS: e033 DS: 002b ES: 002b CR0: 000000008001003b
> [ 80.342432] CR2: 00007f0f4db1a000 CR3: 000000017362d000 CR4: 0000000000000660
> [ 80.342440] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 80.342448] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
> [ 80.342457] Process loas_check (pid: 3875, threadinfo ffff88015556c000, task ffff8801556cada0)
> [ 80.342465] Stack:
> [ 80.342469] ffffffff8157038f ffffffff81ae8840 ffff880173f3fc00 0000000000000000
> [ 80.342483] <0> ffff880173f3fc00 ffff880161a36400 0000000000000000 ffff88015556da08
> [ 80.342500] <0> ffffffff815705d6 ffff880180000000 ffff880173f3fc00 ffff88015556da28
> [ 80.342518] Call Trace:
> [ 80.342528] [] ? ip_finish_output+0x12f/0x2f0
> [ 80.342538] [] ip_output+0x86/0xd0
> [ 80.342546] [] ip_local_out+0x20/0x30
> [ 80.342555] [] ip_queue_xmit+0x223/0x3f0
> [ 80.342565] [] ? tcp_send_active_reset+0x24/0x180
> [ 80.342576] [] ? xen_force_evtchn_callback+0xd/0x10
> [ 80.342586] [] ? check_events+0x12/0x20
> [ 80.342595] [] tcp_transmit_skb+0x402/0x780
> [ 80.342604] [] tcp_send_active_reset+0x89/0x180
> [ 80.342614] [] tcp_disconnect+0x6c/0x3c0
> [ 80.342622] [] tcp_close+0x3e4/0x480
> [ 80.342632] [] inet_release+0x42/0x70
> [ 80.342643] [] sock_release+0x18/0x60
> [ 80.342652] [] sock_close+0x12/0x30
> [ 80.342663] [] __fput+0xee/0x200
> [ 80.342671] [] ? xen_force_evtchn_callback+0xd/0x10
> [ 80.342681] [] fput+0x17/0x20
> [ 80.342690] [] filp_close+0x58/0x90
> [ 80.342698] [] ? xen_restore_fl_direct_end+0x0/0x1
> [ 80.342709] [] put_files_struct+0xcc/0xe0
> [ 80.342718] [] exit_files+0x50/0x60
> [ 80.342726] [] do_exit+0x1b7/0x7f0
> [ 80.342735] [] ? __dequeue_signal+0x16/0x160
> [ 80.342745] [] do_group_exit+0x3c/0xa0
> [ 80.342754] [] get_signal_to_deliver+0x1b8/0x380
> [ 80.342764] [] do_notify_resume+0xc9/0x8a0
> [ 80.342775] [] ? xen_mc_flush+0x11b/0x1d0
> [ 80.342786] [] ? paravirt_end_context_switch+0x12/0x30
> [ 80.342798] [] ? finish_task_switch+0x5b/0xb0
> [ 80.342808] [] int_signal+0x12/0x17
> [ 80.342815] Code: 81 ea d8 1f 00 00 48 3b 42 20 73 07 48 8b 50 f9 31 c0 c3 31 d2 48 c7 c0 f2 ff ff ff c3 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 48 a5 89 d1 f3 a4 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
> [ 80.342952] RIP [] memcpy_c+0xb/0x20
> [ 80.342962] RSP
> [ 80.342969] ---[ end trace 1442aa6e9e3d337d ]---
> [ 80.342976] Fixing recursive fault but reboot is needed!
>
> This happens 2 out of 3 times.
> I don't seem to find any similar recent reports and relevant commits
> so far, and we haven't had such problem running 2.6.24 domU (Ubuntu
> hardy) on the 2.6.18 dom0. I'm hoping someone can give a hand.
> Thanks,
> --Natalie
> P.S. Just in case - here is the "original" trace before I tried to
> modify the cr0:
>
> [ 64.544616] alignment check: 0000 [#1] SMP
> [ 64.544640] last sysfs file: /sys/devices/virtual/vc/vcsa7/dev
> [ 64.544647] CPU 1
> [ 64.544655] Pid: 3737, comm: loas_check Not tainted 2.6.32.10+drm33.1 #8
> [ 64.544663] RIP: e030:[] [] memcpy_c+0xb/0x20
> [ 64.544681] RSP: e02b:ffff880152e7d9b0 EFLAGS: 00050246
> [ 64.544687] RAX: ffff8801731e4c8c RBX: ffff880178185400 RCX: 0000000000000002
> [ 64.544696] RDX: 0000000000000000 RSI: ffff880178185458 RDI: ffff8801731e4c8c
> [ 64.544703] RBP: ffff880152e7d9e8 R08: ffffffff81570110 R09: ffffffff81ae6840
> [ 64.544711] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [ 64.544718] R13: 000000000000000e R14: ffff88017332d800 R15: ffff880178185458
> [ 64.544732] FS: 00007fa8de9336e0(0000) GS:ffff880028047000(0000) knlGS:0000000000000000
> [ 64.544741] CS: e033 DS: 002b ES: 002b CR0: 000000008005003b
> [ 64.544748] CR2: 00000000081f1320 CR3: 0000000001001000 CR4: 0000000000000660
> [ 64.544756] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 64.544764] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
> [ 64.544772] Process loas_check (pid: 3737, threadinfo ffff880152e7c000, task ffff880152e72da0)
> [ 64.544781] Stack:
> [ 64.544785] ffffffff81570214 ffffffff81ae6840 ffff88017332d800 0000000000000000
> [ 64.544798] <0> ffff88017332d800 ffff880152e13200 0000000000000000 ffff880152e7da08
> [ 64.544814] <0> ffffffff81570486 ffff880180000000 ffff88017332d800 ffff880152e7da28
> [ 64.544832] Call Trace:
> [ 64.544842] [] ? ip_finish_output+0x104/0x2f0
> [ 64.544853] [] ip_output+0x86/0xd0
> [ 64.544862] [] ip_local_out+0x20/0x30
> [ 64.544870] [] ip_queue_xmit+0x223/0x3f0
> [ 64.544880] [] ? xen_force_evtchn_callback+0xd/0x10
> [ 64.544889] [] ? check_events+0x12/0x20
> [ 64.544900] [] tcp_transmit_skb+0x402/0x780
> [ 64.544909] [] tcp_send_active_reset+0x89/0x180
> [ 64.544920] [] ? __d_free+0x3a/0x60
> [ 64.544929] [] tcp_disconnect+0x6c/0x3c0
> [ 64.544938] [] tcp_close+0x3e4/0x480
> [ 64.544946] [] inet_release+0x42/0x70
> [ 64.544956] [] sock_release+0x18/0x60
> [ 64.544964] [] sock_close+0x12/0x30
> [ 64.544974] [] __fput+0xee/0x200
> [ 64.544982] [] ? xen_force_evtchn_callback+0xd/0x10
> [ 64.544991] [] fput+0x17/0x20
> [ 64.545000] [] filp_close+0x58/0x90
> [ 64.545009] [] ? xen_restore_fl_direct_end+0x0/0x1
> [ 64.545019] [] put_files_struct+0xcc/0xe0
> [ 64.545028] [] exit_files+0x50/0x60
> [ 64.545036] [] do_exit+0x1b7/0x7f0
> [ 64.545046] [] ? __dequeue_signal+0x16/0x160
> [ 64.545055] [] do_group_exit+0x3c/0xa0
> [ 64.545064] [] get_signal_to_deliver+0x1b8/0x380
> [ 64.545073] [] do_notify_resume+0xc9/0x880
> [ 64.545084] [] ? xen_mc_flush+0x11b/0x1d0
> [ 64.545095] [] ? paravirt_end_context_switch+0x12/0x30
> [ 64.545106] [] ? finish_task_switch+0x5b/0xb0
> [ 64.545115] [] int_signal+0x12/0x17
> [ 64.545121] Code: 81 ea d8 1f 00 00 48 3b 42 20 73 07 48 8b 50 f9 31 c0 c3 31 d2 48 c7 c0 f2 ff ff ff c3 90 90 90 48 89 f8 89 d1 c1 e9 03 83 e2 07 48 a5 89 d1 f3 a4 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
> [ 64.545252] RIP [] memcpy_c+0xb/0x20
> [ 64.545262] RSP
> [ 64.545269] ---[ end trace 11cf940a2c626919 ]---
> [ 64.545276] Fixing recursive fault but reboot is needed!
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
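
The register values in the two quoted reports can be checked against the conditions Natalie describes (AC set in EFLAGS, AM set in CR0). The following is a minimal sketch, not part of the original thread: the names `TRACES`, `decode` and `width` are hypothetical, and the 8-byte width assumes the faulting instruction at memcpy_c+0xb is a quadword string move, which the `48 a5` bytes in the Code: dump suggest but the oops does not state outright.

```python
# Architectural bit positions: CR0.AM and EFLAGS.AC are both bit 18.
CR0_AM = 1 << 18
EFLAGS_AC = 1 << 18

# Register values copied verbatim from the two oops reports quoted above.
TRACES = {
    "original (#8)": {
        "cr0": 0x8005003B,
        "eflags": 0x00050246,
        "rsi": 0xFFFF880178185458,  # copy source
        "rdi": 0xFFFF8801731E4C8C,  # copy destination
    },
    "AM cleared (#12)": {
        "cr0": 0x8001003B,
        "eflags": 0x00050246,
        "rsi": 0xFFFF880176D91958,
        "rdi": 0xFFFF88017360CC8C,
    },
}


def decode(regs, width=8):
    """Summarise the alignment-check-related state of one oops.

    width is the assumed operand size in bytes (8 for a quadword move).
    """
    return {
        "cr0_am": bool(regs["cr0"] & CR0_AM),
        "eflags_ac": bool(regs["eflags"] & EFLAGS_AC),
        "src_offset": regs["rsi"] % width,
        "dst_offset": regs["rdi"] % width,
    }


if __name__ == "__main__":
    for name, regs in TRACES.items():
        print(name, decode(regs))
```

Under those assumptions, both reports show EFLAGS.AC set and a destination pointer 4 bytes past an 8-byte boundary, while the source pointer is aligned. The CR0 value printed in the #12 report has AM clear, which matches the description that clearing AM did not make the fault go away.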