From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kip Macy
Subject: buggy linear page table handling Re: xm pause causing lockup
Date: Sat, 16 Apr 2005 12:59:01 -0700
Reply-To: Kip Macy
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Sender: xen-devel-bounces@lists.xensource.com
To: xen-devel
List-Id: xen-devel@lists.xenproject.org

I went through a few quick iterations to test page table reference
counting. In short, if I L2 pin a zeroed page that I've re-mapped
read-only, the pin succeeds. If the page has a self-referential mapping
before it is remapped read-only, the pin never returns. It is probably
safe to conclude that the type count is not correctly changed when the
page is re-mapped if there is a self-referential entry. This used to
work, thus it is also safe to say that this is a regression introduced
some time between 3/22 and 4/11.

Test code from pmap_pinit below.

-Kip

    /* ***** TEMP \/ ********** */
    ma = xpmap_ptom(VM_PAGE_TO_PHYS(ptdpg[0]));
#if 0
    /* works */
    pmap_qremove((vm_offset_t)pmap->pm_pdir, NPGPTD);
#elif 0
    /* works */
    PT_SET_MA(pmap->pm_pdir, 0);
#elif 0
    /* works */
    PT_SET_MA(pmap->pm_pdir, ma | PG_V | PG_A);
#else

    /* causes lockup on pin */
    pmap->pm_pdir[PTDPTDI + i] = ma | PG_V | PG_A | PG_M;
    PT_SET_MA(pmap->pm_pdir, ma | PG_V | PG_A);
#endif

    printk("pinning %p - pass 0\n", ma);
    xen_pgd_pin(xpmap_ptom(VM_PAGE_TO_PHYS(ptdpg[0])));
    printk("pinned %p - pass 0\n", ma);
    /* ***** TEMP ^ ********** */

On 4/15/05, Kip Macy wrote:
> > Does this happen if you boot with 'nosmp'? I don't really believe it's a
> > race, but might be worth checking.
>
> Yes, it still happens. I would have found it quite astonishing if it
> were a race.
> (XEN) EIP: 0808:[]
> (gdb) x/i 0xfc52d5a3
> 0xfc52d5a3 : mov 0x14(%eax),%eax
> (gdb) info line *0xfc52d5a3
> Line 1236 of "mm.c" starts at address 0xfc52d5a0
> and ends at 0xfc52d5b0 .
> (gdb)
>
> Line 1236-1240 of local mm.c:
>     while ( (y = page->u.inuse.type_info) == x )
>         cpu_relax();
>     counter++;
>     printk("page was not validated");
>     goto again;
>
> > Also, it's worth adding a printk into this loop just to check that that
> > is where you're getting caught.
>
> Obviously wasn't thinking and stuck it in the wrong place.
> Nonetheless, even without the printk I think I've proven my point.
>
>
> >
> > /* Someone else is updating validation of this page. Wait... */
> > while ( (y = page->u.inuse.type_info) == x )
> >     cpu_relax();
> > goto again;
>
> Yep.
>
> >
> > We need to figure out how the type count managed to get to one without
> > the page being validated. I presume you're doing a debug=y build of Xen?
>
> Correct. Nothing comes out on the console apart from debug output from FreeBSD.
>
> > Do you get any warnings about illegal mmu_update attempts when you boot
> > FreeBSD?
>
> No, I don't. This is the offending code snippet from pmap_pinit:
>
> /* install self-referential address mapping entry(s) */
> for (i = 0; i < NPGPTD; i++) {
>     ma = xpmap_ptom(VM_PAGE_TO_PHYS(ptdpg[i]));
>     pmap->pm_pdir[PTDPTDI + i] = ma | PG_V | PG_A | PG_M;
> #ifdef PAE
>     pmap->pm_pdpt[i] = ma | PG_V;
> #endif
>     /* re-map page directory read-only */
>     PT_SET_MA(pmap->pm_pdir, *vtopte((vm_offset_t)pmap->pm_pdir) & ~PG_RW);
>     xen_pgd_pin(ma);
> }
>
> PT_SET_MA is just a wrapper for update_va_mapping. Have there been any
> recent changes to the page typing code that would cause it to get
> confused by a self-referential mapping?
>
> -Kip
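
For anyone following the thread, the wait loop quoted above can be modeled in isolation. What follows is only a toy sketch, not Xen's code: the struct, the bit layout (TOY_COUNT_MASK, TOY_VALIDATED) and the function name are all made up for illustration, and the spin is bounded so the program terminates. It shows the condition discussed in the thread: if a page's type count has reached one while its validated flag is still clear, and nothing else is going to set that flag, a waiter polling type_info can never make progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch only; NOT Xen's real PGT_* values. */
#define TOY_COUNT_MASK 0x0000ffffu  /* low bits: type reference count */
#define TOY_VALIDATED  0x00010000u  /* set once the page has been validated */

/* Minimal stand-in for the per-page type_info word. */
struct toy_page {
    volatile uint32_t type_info;
};

/*
 * Bounded stand-in for the quoted wait loop. Polls type_info up to
 * max_spins times. Returns true if the validated flag was observed,
 * false if the caller gave up (the real loop would keep spinning).
 */
static bool toy_wait_validated(const struct toy_page *page, unsigned max_spins)
{
    assert(page != NULL);
    for (unsigned spin = 0; spin < max_spins; spin++) {
        if (page->type_info & TOY_VALIDATED)
            return true;
        /* The real loop calls cpu_relax() here and re-reads type_info. */
    }
    return false;
}
```

Calling toy_wait_validated() on a page whose type_info is 1 (count of one, flag clear) gives up after max_spins polls, while a page with type_info set to 1 | TOY_VALIDATED returns true on the first poll. Whether the self-referential entry is really what leaves the page in the first state is exactly the open question in this thread.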