From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kip Macy
Subject: Re: xm pause causing lockup
Date: Fri, 15 Apr 2005 14:04:00 -0700
Reply-To: Kip Macy
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Pratt
Cc: xen-devel
List-Id: xen-devel@lists.xenproject.org

> Does this happen if you boot with 'nosmp'? I don't really believe it's a
> race, but might be worth checking.

Yes, it still happens. I would have found it quite astonishing if it
were a race.

(XEN) EIP: 0808:[]
(gdb) x/i 0xfc52d5a3
0xfc52d5a3 : mov 0x14(%eax),%eax
(gdb) info line *0xfc52d5a3
Line 1236 of "mm.c" starts at address 0xfc52d5a0 and ends at 0xfc52d5b0 .
(gdb)

Lines 1236-1240 of my local mm.c:

    while ( (y = page->u.inuse.type_info) == x )
        cpu_relax();
    counter++;
    printk("page was not validated");
    goto again;

> Also, it's worth adding a printk into this loop just to check that that
> is where you're getting caught.

Obviously I wasn't thinking and stuck it in the wrong place. Nonetheless,
even without the printk I think I've proven my point.

>
>     /* Someone else is updating validation of this page. Wait... */
>     while ( (y = page->u.inuse.type_info) == x )
>         cpu_relax();
>     goto again;

Yep.

>
> We need to figure out how the type count managed to get to one without
> the page being validated. I presume you're doing a debug=y build of Xen?

Correct. Nothing comes out on the console apart from debug output from
FreeBSD.

> Do you get any warnings about illegal mmu_update attempts when you boot
> FreeBSD?

No, I don't.

This is the offending code snippet from pmap_pinit:

    /* install self-referential address mapping entry(s) */
    for (i = 0; i < NPGPTD; i++) {
        ma = xpmap_ptom(VM_PAGE_TO_PHYS(ptdpg[i]));
        pmap->pm_pdir[PTDPTDI + i] = ma | PG_V | PG_A | PG_M;
#ifdef PAE
        pmap->pm_pdpt[i] = ma | PG_V;
#endif
        /* re-map page directory read-only */
        PT_SET_MA(pmap->pm_pdir, *vtopte((vm_offset_t)pmap->pm_pdir) & ~PG_RW);
        xen_pgd_pin(ma);
    }

PT_SET_MA is just a wrapper for update_va_mapping. Have there been any
recent changes to the page typing code that would cause it to get
confused by a self-referential mapping?

-Kip