From: Bob Breuer
Date: Tue, 11 Jan 2011 00:49:59 -0600
Subject: Re: [Qemu-devel] Re: phys_page_find bug?
Message-ID: <4D2BFD97.1070002@mc.net>
To: Blue Swirl
Cc: qemu-devel, Artyom Tarasenko

Blue Swirl wrote:
> On Mon, Jan 10, 2011 at 3:57 AM, Bob Breuer wrote:
>> Blue Swirl wrote:
>>> On Mon, Nov 8, 2010 at 6:55 PM, Artyom Tarasenko wrote:
>>>> On Fri, May 7, 2010 at 6:26 PM, Artyom Tarasenko wrote:
>>>>> phys_page_find (exec.c) sometimes returns a page for addresses
>>>>> where nothing is connected.
>>>>>
>>>>> One example, done with qemu-system-sparc -M SS-20:
>>>>>
>>>>> ok f13ffff0 2f spacec@ .
>>>>>
>>>>> // The address translates correctly: in cpu_physical_memory_rw,
>>>>> // addr == 0xff13ffff0 (where nothing is connected),
>>>>> // but then phys_page_find returns nonzero and produces:
>>>>>
>>>>> Unassigned mem read access of 1 byte to 0000000ff15ffff0 from xxxxx
>>>>>
>>>>> (note the "5" in the line above where "3" is expected)
>>>>>
>>>>> I wonder if this is only true for non-wired addresses, or whether
>>>>> phys_page_find can also find wrong pages for addresses where
>>>>> something is connected?
>>>>>
>>>>> Or is my assumption wrong, and phys_page_find may legitimately
>>>>> return a page for not-connected addresses, with the bug actually
>>>>> in cpu_physical_memory_rw?
>>>>>
>>>>> Is the qemu algorithm for working with the physical address space
>>>>> described somewhere?
>>>>
>>>> I tried to switch devices off and found that the bug is triggered
>>>> by registering escc. It's harder to debug without escc, so I can't
>>>> tell whether something else is causing the problem too.
>>>>
>>>> Is escc addressing somehow special?
>>>
>>> I don't think so, except that it lies close to the top of the
>>> physical address space.
>>>
>>>>> Is the qemu algorithm for working with the physical address space
>>>>> described somewhere?
>>>>
>>>> I guess no one knows it anymore, since no one cared to answer
>>>> within half a year :-/
>>>
>>> There's of course good old exec.c, plenty of code and even some
>>> comments. ;-)
>>
>> You can also see this in SS-20 when OBP probes all the sbus slots.
>> Slot 2 with the tcx graphics shows an unexpected address:
>>
>> Unassigned mem read access of 1 byte to 0000000e00000000 from ffd3f5e4
>> Unassigned mem read access of 1 byte to 0000000e10000000 from ffd3f5e4
>> Unassigned mem read access of 1 byte to 0000000020200000 from ffd3f5e4
>> Unassigned mem read access of 1 byte to 0000000e30000000 from ffd3f5e4
>>
>> The 0202 should be e200 instead.
>>
>> There are two bugs in phys_page_find_alloc(). When the bottom-level
>> L2 table is populated with IO_MEM_UNASSIGNED, region_offset is then
>> used for reporting the physical address. First, region_offset may
>> not be aligned to the base address of the L2 region. And second,
>> region_offset won't hold the full 36-bit address on a 32-bit host.
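(To illustrate the second bug with the slot 2 address above -- a rough
standalone sketch, not code from exec.c: on a 32-bit host ram_addr_t is
only 32 bits wide, so storing a 36-bit SBus address in region_offset
silently drops the top nibble.)

    /* Rough illustration of the truncation, assuming ram_addr_t is a
     * 32-bit type on a 32-bit host (an unsigned long there).  Not
     * exec.c code; paddr and the typedef exist only for this demo. */
    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t ram_addr_t;            /* 32-bit host */

    int main(void)
    {
        uint64_t paddr = 0xe20000000ULL;    /* SBus slot 2, 36-bit address */
        ram_addr_t region_offset = paddr;   /* top bits silently lost */

        printf("stored region_offset = 0x%08x\n", (unsigned)region_offset);
        /* prints 0x20000000: the leading "e" is gone, which is why the
           probe output above shows 0202... instead of e200... */
        return 0;
    }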
> I see, the bug is only visible on 32-bit hosts with a guest address
> space larger than 32 bits. Also, the effect seems to be that the
> physical address for unassigned memory accesses is reported
> incorrectly. This may make some difference for guest fault handlers.
>
>> It seems that both can be fixed by returning NULL for unassigned
>> addresses from phys_page_find(). All callers already handle a NULL
>> return value. Would this allow any further optimizations to be made?
>>
>> Here's a patch to try:
>>
>> diff --git a/exec.c b/exec.c
>> index 49c28b1..77b49c8 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -434,7 +434,11 @@ static PhysPageDesc *phys_page_find_alloc(target_phys_addr_t index, int alloc)
>>
>>  static inline PhysPageDesc *phys_page_find(target_phys_addr_t index)
>>  {
>> -    return phys_page_find_alloc(index, 0);
>> +    PhysPageDesc *pd = phys_page_find_alloc(index, 0);
>> +    if (pd && pd->phys_offset == IO_MEM_UNASSIGNED) {
>> +        return NULL;
>> +    }
>> +    return pd;
>>  }
>
> This is repeated quite often:
>
>     p = phys_page_find(paddr >> TARGET_PAGE_BITS);
>     if (!p) {
>         pd = IO_MEM_UNASSIGNED;
>     } else {
>         pd = p->phys_offset;
>     }
>
> Then we could refactor:
>
>     static inline ram_addr_t phys_page_get_offset(target_phys_addr_t index)
>     {
>         PhysPageDesc *pd = phys_page_find_alloc(index, 0);
>
>         if (!pd || pd->phys_offset == IO_MEM_UNASSIGNED) {
>             return IO_MEM_UNASSIGNED;
>         }
>         return pd->phys_offset;
>     }

Might work, but most callers also need region_offset for valid IO
devices. Maybe try:

    pd = phys_page_get_offset(paddr >> TARGET_PAGE_BITS, &p);

in the caller, so they could still get p->region_offset later.

On second thought, when gcc inlines phys_page_find(), can it optimize
away the extra check for NULL?
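For concreteness, an untested sketch of that two-result variant (the
out-parameter is just an idea; it assumes the existing PhysPageDesc
fields and the IO_MEM_UNASSIGNED convention):

    /* Untested sketch: like phys_page_get_offset() above, but it also
     * hands back the descriptor, so callers that hit a real IO page
     * can still read p->region_offset afterwards. */
    static inline ram_addr_t phys_page_get_offset(target_phys_addr_t index,
                                                  PhysPageDesc **pdp)
    {
        PhysPageDesc *pd = phys_page_find_alloc(index, 0);

        if (pdp) {
            *pdp = pd;
        }
        if (!pd || pd->phys_offset == IO_MEM_UNASSIGNED) {
            return IO_MEM_UNASSIGNED;
        }
        return pd->phys_offset;
    }

    /* Caller side, e.g. in cpu_physical_memory_rw: */
    PhysPageDesc *p;
    ram_addr_t pd = phys_page_get_offset(paddr >> TARGET_PAGE_BITS, &p);
    /* ...for valid IO pages, p->region_offset is still available... */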