From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: LVM userspace causing dom0 crash
Date: Fri, 11 May 2012 11:39:31 -0400
Message-ID: <20120511153931.GA21486@phenom.dumpdata.com>
References: <4FA7EBF6.6040204@theshore.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <4FA7EBF6.6040204@theshore.net>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Christopher S. Aker"
Cc: xen devel
List-Id: xen-devel@lists.xenproject.org

On Mon, May 07, 2012 at 11:36:22AM -0400, Christopher S. Aker wrote:
> Xen: 4.1.3-rc1-pre (xenbits @ 23285)
> Dom0: 3.2.6 PAE and 3.3.4 PAE
>
> We are seeing the below crash on 3.x dom0s. A simple lvcreate/lvremove
> loop deployed to a few dozen boxes will hit it quite reliably within
> a short time. This happens on both an older LVM userspace and
> newest, and in production we have seen this hit on lvremove,
> lvrename, and lvdelete.
>
> #!/bin/bash
> while true; do
>    lvcreate -L 256M -n test1 vg1; lvremove -f vg1/test1
> done

So I tried this with 3.4-rc6 and could not reproduce the crash. The
machine isn't that powerful - just an Intel(R) Core(TM) i5-2500 CPU @
3.30GHz, so four CPUs are visible.

Let me try with 3.2.x shortly.
>
> BUG: unable to handle kernel paging request at bffff628
> IP: [] __page_check_address+0xb8/0x170
> *pdpt = 0000000003cfb027 *pde = 0000000013873067 *pte = 0000000000000000
> Oops: 0000 [#1] SMP
> Modules linked in: ebt_comment ebt_arp ebt_set ebt_limit ebt_ip6
> ebt_ip ip_set_hash_net ip_set ebtable_nat xen_gntdev e1000e
> Pid: 27902, comm: lvremove Not tainted 3.2.6-1 #1 Supermicro X8DT6/X8DT6
> EIP: 0061:[] EFLAGS: 00010246 CPU: 6
> EIP is at __page_check_address+0xb8/0x170
> EAX: bffff000 EBX: cbf76dd8 ECX: 00000000 EDX: 00000000
> ESI: bffff628 EDI: e49ed900 EBP: c80ffe60 ESP: c80ffe4c
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> Process lvremove (pid: 27902, ti=c80fe000 task=d29adca0 task.ti=c80fe000)
> Stack:
> e4205000 00000fff da9b6bc0 d0068dc0 e49ed900 c80ffe94 c10ec769 c80ffe84
> 00000000 00000129 00000125 b76c5000 00000001 00000000 d0068c08 d0068dc0
> b76c5000 e49ed900 c80fff24 c10ecb73 00000002 00000005 35448025 c80ffec4
> Call Trace:
> [] try_to_unmap_one+0x29/0x310
> [] try_to_unmap_file+0x83/0x560
> [] ? xen_pte_val+0xb9/0x140
> [] ? __raw_callee_save_xen_pte_val+0x6/0x8
> [] ? vm_normal_page+0x28/0xc0
> [] ? kmap_atomic_prot+0x45/0x110
> [] try_to_munlock+0x1c/0x40
> [] munlock_vma_page+0x49/0x90
> [] munlock_vma_pages_range+0x57/0xa0
> [] mlock_fixup+0xc2/0x130
> [] do_mlockall+0x6c/0x80
> [] sys_munlockall+0x29/0x50
> [] sysenter_do_call+0x12/0x28
> Code: ff c1 ee 09 81 e6 f8 0f 00 00 81 e1 ff 0f 00 00 0f ac ca 0c c1
> e2 05 03 55 ec 89 d0 e8 12 d3 f4 ff 8b 4d 0c 85 c9 8d 34 30 75 0c
> 06 01 01 00 00 0f 84 84 00 00 00 8b 0d 00 0e 9b c1 89 4d f0
> EIP: [] __page_check_address+0xb8/0x170 SS:ESP 0069:c80ffe4c
> CR2: 00000000bffff628
> ---[ end trace 8039aeca9c19f5ab ]---
> note: lvremove[27902] exited with preempt_count 1
> BUG: scheduling while atomic: lvremove/27902/0x00000001
> Modules linked in: ebt_comment ebt_arp ebt_set ebt_limit ebt_ip6
> ebt_ip ip_set_hash_net ip_set ebtable_nat xen_gntdev e1000e
> Pid: 27902, comm: lvremove Tainted: G D 3.2.6-1 #1
> Call Trace:
> [] __schedule_bug+0x5d/0x70
> [] __schedule+0x679/0x830
> [] ? xen_restore_fl_direct_reloc+0x4/0x4
> [] ? rcu_enter_nohz+0x3c/0x60
> [] ? xen_evtchn_do_upcall+0x20/0x30
> [] ? hypercall_page+0x227/0x1000
> [] ? xen_force_evtchn_callback+0x1a/0x30
> [] schedule+0x30/0x50
> [] rwsem_down_failed_common+0x9d/0xf0
> [] rwsem_down_read_failed+0x12/0x14
> [] call_rwsem_down_read_failed+0x7/0xc
> [] ? down_read+0xd/0x10
> [] acct_collect+0x3a/0x170
> [] do_exit+0x62a/0x7d0
> [] ? kmsg_dump+0x37/0xc0
> [] oops_end+0x90/0xd0
> [] no_context+0xbe/0x190
> [] __bad_area_nosemaphore+0x98/0x140
> [] ? xen_clocksource_read+0x19/0x20
> [] ? xen_vcpuop_set_next_event+0x47/0x80
> [] bad_area_nosemaphore+0x12/0x20
> [] do_page_fault+0x2d2/0x3f0
> [] ? hrtimer_interrupt+0x1a9/0x2b0
> [] ? xen_force_evtchn_callback+0x1a/0x30
> [] ? check_events+0x8/0xc
> [] ? xen_restore_fl_direct_reloc+0x4/0x4
> [] ? _raw_spin_unlock_irqrestore+0x14/0x20
> [] ? spurious_fault+0x130/0x130
> [] error_code+0x5a/0x60
> [] ? spurious_fault+0x130/0x130
> [] ? __page_check_address+0xb8/0x170
> [] try_to_unmap_one+0x29/0x310
> [] try_to_unmap_file+0x83/0x560
> [] ? xen_pte_val+0xb9/0x140
> [] ? __raw_callee_save_xen_pte_val+0x6/0x8
> [] ? vm_normal_page+0x28/0xc0
> [] ? kmap_atomic_prot+0x45/0x110
> [] try_to_munlock+0x1c/0x40
> [] munlock_vma_page+0x49/0x90
> [] munlock_vma_pages_range+0x57/0xa0
> [] mlock_fixup+0xc2/0x130
> [] do_mlockall+0x6c/0x80
> [] sys_munlockall+0x29/0x50
> [] sysenter_do_call+0x12/0x28
>
> Thanks,
> -Chris
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel