From mboxrd@z Thu Jan  1 00:00:00 1970
From: Joel Schopp <joel.schopp@amd.com>
Subject: Re: [PATCH] kvm: arm64: vgic: fix hyp panic with 64k pages on juno
 platform
Date: Thu, 24 Jul 2014 14:59:13 -0500
Message-ID: <53D16591.6070904@amd.com>
References: <1406230067-926-1-git-send-email-will.deacon@arm.com>
 <CAFEAcA-+0Bo1EjQrHjNDBgobafOE++kj5eaXvDVYgJMtu5W0aA@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: Paolo Bonzini <pbonzini@redhat.com>, <gleb@kernel.org>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	kvm-devel <kvm@vger.kernel.org>,
	Christoffer Dall <christoffer.dall@linaro.org>,
	"Marc Zyngier" <marc.zyngier@arm.com>,
	Don Dutile <ddutile@redhat.com>, <stable@vger.kernel.org>
To: Peter Maydell <peter.maydell@linaro.org>,
	Will Deacon <will.deacon@arm.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-by2lp0239.outbound.protection.outlook.com ([207.46.163.239]:14074
	"EHLO na01-by2-obe.outbound.protection.outlook.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S934176AbaGXUBe (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 24 Jul 2014 16:01:34 -0400
In-Reply-To: <CAFEAcA-+0Bo1EjQrHjNDBgobafOE++kj5eaXvDVYgJMtu5W0aA@mail.gmail.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>


On 07/24/2014 02:47 PM, Peter Maydell wrote:
> On 24 July 2014 20:27, Will Deacon <will.deacon@arm.com> wrote:
>> If the physical address of GICV isn't page-aligned, then we end up
>> creating a stage-2 mapping of the page containing it, which causes us to
>> map neighbouring memory locations directly into the guest.
>>
>> As an example, consider a platform with GICV at physical 0x2c02f000
>> running a 64k-page host kernel. If qemu maps this into the guest at
>> 0x80010000, then guest physical addresses 0x80010000 - 0x8001efff will
>> map host physical region 0x2c020000 - 0x2c02efff. Accesses to these
>> physical regions may cause UNPREDICTABLE behaviour, for example, on the
>> Juno platform this will cause an SError exception to EL3, which brings
>> down the entire physical CPU resulting in RCU stalls / HYP panics / host
>> crashing / wasted weeks of debugging.
> This seems to me like a specific problem with Juno rather than an
> issue with having the GICV at a non-page-aligned start. The
> requirement to be able to expose host GICV as the guest GICC
> in a 64K pages system is just "nothing else in that 64K page
> (or pages, if the GICV runs across two pages) is allowed to be
> unsafe for the guest to touch", which remains true whether the
> GICV starts at 0K in the 64K page or 60K.
>
>> SBSA recommends that systems alias the 4k GICV across the bounding 64k
>> region, in which case GICV physical could be described as 0x2c020000 in
>> the above scenario.
> The SBSA "make every 4K region in the 64K page be the same thing"
> recommendation is one way of satisfying the requirement that the
> whole 64K page is safe for the guest to touch. (Making the rest of
> the page RAZ/WI would be another option I guess.) If your system
> actually implements the SBSA recommendation then in fact
> describing the GICV-phys-base as the 64K-aligned address is wrong,
> because then the register at GICV-base + 4K would not be
> the first register in the 2nd page of the GICV, it would be another
> copy of the 1st page. This happens to work on Linux guests
> currently because they don't touch anything in the 2nd page,
> but for cases like device passthrough IIRC we might well like
> the guest to use some of the 2nd page registers. So the only
> correct choice on those systems is to specify the +60K address
> as the GICV physaddr in the device tree, and use Marc's patchset
> to allow QEMU/kvmtool to determine the page offset within the 64K
> page so it can reflect that in the guest's device tree.
I have one of those systems specifying +60K address as the GICV physaddr
and it works well for me with 64K pages and kvm with both QEMU and kvmtool.

>
> I can't think of any way of determining whether a particular
> system gets this right or wrong automatically, which suggests
> perhaps we need to allow the device tree to specify that the
> GICV is 64k-page-safe...
I don't have a better solution, despite my lack of enthusiasm for yet
another device tree property.