From: Bob Breuer
Date: Tue, 11 Jan 2011 00:49:59 -0600
Subject: Re: [Qemu-devel] Re: phys_page_find bug?
Message-ID: <4D2BFD97.1070002@mc.net>
To: Blue Swirl
Cc: qemu-devel, Artyom Tarasenko

Blue Swirl wrote:
> On Mon, Jan 10, 2011 at 3:57 AM, Bob Breuer wrote:
>> Blue Swirl wrote:
>>> On Mon, Nov 8, 2010 at 6:55 PM, Artyom Tarasenko wrote:
>>>> On Fri, May 7, 2010 at 6:26 PM, Artyom Tarasenko wrote:
>>>>> phys_page_find (exec.c) sometimes returns a page for addresses
>>>>> where nothing is connected.
>>>>>
>>>>> One example, done with qemu-system-sparc -M SS-20:
>>>>>
>>>>> ok f13ffff0 2f spacec@ .
>>>>>
>>>>> // The address translates correctly: in cpu_physical_memory_rw,
>>>>> // addr == 0xff13ffff0 (where nothing is connected),
>>>>> // but then phys_page_find returns nonzero and produces:
>>>>>
>>>>> Unassigned mem read access of 1 byte to 0000000ff15ffff0 from xxxxx
>>>>>
>>>>> (note the "5" in the line above where "3" is expected)
>>>>>
>>>>> I wonder if this is only true for non-wired addresses, or whether
>>>>> phys_page_find can also find wrong pages for addresses where
>>>>> something is connected?
>>>>>
>>>>> Or is my assumption wrong, and phys_page_find may legitimately
>>>>> return a page for not-connected addresses, with the bug actually
>>>>> in cpu_physical_memory_rw?
>>>>>
>>>>> Is the qemu algorithm for working with the physical address space
>>>>> described somewhere?
>>>>
>>>> I tried to switch devices off and found that the bug is triggered
>>>> by registering escc. It's harder to debug without escc, so I can't
>>>> tell whether something else is causing the problem too.
>>>>
>>>> Is escc addressing somehow special?
>>>
>>> I don't think so, except that it lies close to the top of the
>>> physical address space.
>>>
>>>>> Is the qemu algorithm for working with the physical address space
>>>>> described somewhere?
>>>>
>>>> I guess no one knows it anymore, since no one cared to answer
>>>> within half a year :-/
>>>
>>> There's of course good old exec.c, plenty of code and even some
>>> comments. ;-)
>>
>> You can also see this in SS-20 when OBP probes all the sbus slots.
>> Slot 2 with the tcx graphics shows an unexpected address:
>>
>> Unassigned mem read access of 1 byte to 0000000e00000000 from ffd3f5e4
>> Unassigned mem read access of 1 byte to 0000000e10000000 from ffd3f5e4
>> Unassigned mem read access of 1 byte to 0000000020200000 from ffd3f5e4
>> Unassigned mem read access of 1 byte to 0000000e30000000 from ffd3f5e4
>>
>> The 0202 should be e200 instead.
>>
>> There are two bugs in phys_page_find_alloc(). When the bottom-level
>> L2 table is populated with IO_MEM_UNASSIGNED, region_offset is then
>> used for reporting the physical address. First, region_offset may
>> not be aligned to the base address of the L2 region. And second,
>> region_offset won't hold the full 36-bit address on a 32-bit host.
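(To illustrate the second bug with the slot 2 address above -- a rough
standalone sketch, not code from exec.c: on a 32-bit host ram_addr_t is
only 32 bits wide, so storing a 36-bit SBus address in region_offset
silently drops the top nibble.)

    /* Rough illustration of the truncation, assuming ram_addr_t is a
     * 32-bit type on a 32-bit host (an unsigned long there).  Not
     * exec.c code; paddr and the typedef exist only for this demo. */
    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t ram_addr_t;            /* 32-bit host */

    int main(void)
    {
        uint64_t paddr = 0xe20000000ULL;    /* SBus slot 2, 36-bit address */
        ram_addr_t region_offset = paddr;   /* top bits silently lost */

        printf("stored region_offset = 0x%08x\n", (unsigned)region_offset);
        /* prints 0x20000000: the leading "e" is gone, which is why the
           probe output above shows 0202... instead of e200... */
        return 0;
    }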
> I see, the bug is only visible on 32-bit hosts with a guest address
> space larger than 32 bits. Also, the effect seems to be that the
> physical address for unassigned memory accesses is reported
> incorrectly. This may make some difference for guest fault handlers.
>
>> It seems that both can be fixed by returning NULL for unassigned
>> addresses from phys_page_find(). All callers already handle a NULL
>> return value. Would this allow any further optimizations to be made?
>>
>> Here's a patch to try:
>>
>> diff --git a/exec.c b/exec.c
>> index 49c28b1..77b49c8 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -434,7 +434,11 @@ static PhysPageDesc *phys_page_find_alloc(target_phys_addr_t index, int alloc)
>>
>>  static inline PhysPageDesc *phys_page_find(target_phys_addr_t index)
>>  {
>> -    return phys_page_find_alloc(index, 0);
>> +    PhysPageDesc *pd = phys_page_find_alloc(index, 0);
>> +    if (pd && pd->phys_offset == IO_MEM_UNASSIGNED) {
>> +        return NULL;
>> +    }
>> +    return pd;
>>  }
>
> This is repeated quite often:
>
>     p = phys_page_find(paddr >> TARGET_PAGE_BITS);
>     if (!p) {
>         pd = IO_MEM_UNASSIGNED;
>     } else {
>         pd = p->phys_offset;
>     }
>
> Then we could refactor:
>
>     static inline ram_addr_t phys_page_get_offset(target_phys_addr_t index)
>     {
>         PhysPageDesc *pd = phys_page_find_alloc(index, 0);
>
>         if (!pd || pd->phys_offset == IO_MEM_UNASSIGNED) {
>             return IO_MEM_UNASSIGNED;
>         }
>         return pd->phys_offset;
>     }

Might work, but most callers also need region_offset for valid IO
devices. Maybe try:

    pd = phys_page_get_offset(paddr >> TARGET_PAGE_BITS, &p);

in the caller, so they could still get p->region_offset later.

On second thought, when gcc inlines phys_page_find(), can it optimize
away the extra check for NULL?
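For concreteness, an untested sketch of that two-result variant (the
out-parameter is just an idea; it assumes the existing PhysPageDesc
fields and the IO_MEM_UNASSIGNED convention):

    /* Untested sketch: like phys_page_get_offset() above, but it also
     * hands back the descriptor, so callers that hit a real IO page
     * can still read p->region_offset afterwards. */
    static inline ram_addr_t phys_page_get_offset(target_phys_addr_t index,
                                                  PhysPageDesc **pdp)
    {
        PhysPageDesc *pd = phys_page_find_alloc(index, 0);

        if (pdp) {
            *pdp = pd;
        }
        if (!pd || pd->phys_offset == IO_MEM_UNASSIGNED) {
            return IO_MEM_UNASSIGNED;
        }
        return pd->phys_offset;
    }

    /* Caller side, e.g. in cpu_physical_memory_rw: */
    PhysPageDesc *p;
    ram_addr_t pd = phys_page_get_offset(paddr >> TARGET_PAGE_BITS, &p);
    /* ...for valid IO pages, p->region_offset is still available... */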