From mboxrd@z Thu Jan  1 00:00:00 1970
From: Julien Grall <julien.grall@linaro.org>
Subject: Re: [PATCH RFC 0/N] xen: arm: rework early bring up
Date: Thu, 19 Sep 2013 15:49:08 +0100
Message-ID: <523B0EE4.1090304@linaro.org>
References: <1379381846.11304.73.camel@hastur.hellion.org.uk>
	<5239E3E0.3050306@linaro.org>
	<1379532131.11304.289.camel@hastur.hellion.org.uk>
	<1379532275.11304.291.camel@hastur.hellion.org.uk>
	<523A0658.6070007@linaro.org>
	<1379535773.11304.305.camel@hastur.hellion.org.uk>
	<523A17CE.3020107@linaro.org>
	<1379539699.11304.318.camel@hastur.hellion.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <1379539699.11304.318.camel@hastur.hellion.org.uk>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell <ian.campbell@citrix.com>
Cc: Tim Deegan <tim@xen.org>, Stefano Stabellini <stefano.stabellini@citrix.com>, Andre Przywara <andre.przywara@linaro.org>, xen-devel <xen-devel@lists.xen.org>
List-Id: xen-devel@lists.xenproject.org

On 09/18/2013 10:28 PM, Ian Campbell wrote:
> On Wed, 2013-09-18 at 22:14 +0100, Julien Grall wrote:
>> On 09/18/2013 09:22 PM, Ian Campbell wrote:
>>> On Wed, 2013-09-18 at 21:00 +0100, Julien Grall wrote:
>>>> On 09/18/2013 08:24 PM, Ian Campbell wrote:
>>>>> On Wed, 2013-09-18 at 20:22 +0100, Ian Campbell wrote:
>>>>>> On Wed, 2013-09-18 at 18:33 +0100, Julien Grall wrote:
>>>>>>> On 09/17/2013 02:37 AM, Ian Campbell wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> The following reworks early bring up on ARM to use a separate set of
>>>>>>>> boot pagetables. This simplifies things by avoiding the need to bring up
>>>>>>>> all CPUs in lock step, which in turn allows us to do secondary CPU
>>>>>>>> bringup in C code.
>>>>>>>>
>>>>>>>> Unfortunately the main bulk of this change is a single large patch which
>>>>>>>> is hard to decompose any further since it is basically pulling on the
>>>>>>>> thread and then knitting a new jumper from it.
>>>>>>>>
>>>>>>>> With these changes Xen now absolutely requires that the bootloader calls
>>>>>>>> the hypervisor in HYP mode, the previous workarounds have been removed.
>>>>>>>> For use on models a bootwrapper is now required. See
>>>>>>>>           git://xenbits.xen.org/people/ianc/boot-wrapper.git xen-arm32
>>>>>>>>           git://xenbits.xen.org/people/ianc/boot-wrapper-aarch64.git xen-arm64
>>>>>>>>
>>>>>>>> I have implemented support for CPU bringup on the fastmodel vexpress
>>>>>>>> platforms (v7and v8) here, I suppose it should work OK on a real
>>>>>>>> vexpress too but I've not tried it.
>>>>>>>
>>>>>>> I have just tried the patch series on the versatile express and it
>>>>>>> doesn't work. I'm using the u-boot from Andre and all the cpus already
>>>>>>> boot in hyp mode (checked without this patch series).
>>>>>>
>>>>>> :-(
>>>>>>
>>>>>> The code should essentially the same as the old kick cpus, but
>>>>>> reimplemented in C. The only difference I can spot is that the C version
>>>>>> is kicking a single CPU at a time while the ASM version is kicking
>>>>>> everyone at once. Does using send_SGI_allbutself instead work as a hack?
>>>>>>
>>>>>> Perhaps our idea of the CPUID is wrong or something? Your patches to
>>>>>> fixup the logical CPU id stuff might help?
>>>>
>>>> I have found the issue, in send_SGI_mask we remove the offline CPUs with
>>>> the following line:
>>>>     mask &= cpumask_bits(&cpu_online_map);
>>>>
>>>> As the secondary cpu (assuming we have a board with only 2 cores) is not
>>>> yet online, we can't send SGI with our code.
>>>> I'm very surprised that the code works on the fast models.
>>>
>>> So am I, now.
>>>
>>> OK. So online mask == 0x1 and we are trying to wake mask 0x2, so the
>>> actual target list we end up using is 0...
>>>
>>> The gic manual says "When TargetListFilter is 0b00, if the CPUTargetList
>>> field is 0x00 the Distributor does not forward the interrupt to any CPU
>>> interface." (footnote to table 4-21), so it shouldn't be waking
>>> anything... It may be a model bug
>>>
>>>> To solve the issue, I think we need to create another helper which will
>>>> send an SGI wihout removing offline CPUs.
>>>
>>> Looking briefly at the callchains leading to send_SGI_mask I don't see
>>> any which deliberately pass an offline CPU (which would be pretty
>>> non-sensical apart from this one case). So we could perhaps just remove
>>> this check and make it the caller's problem.
>>> Or maybe the right check is for cpu_possible mask or something similar
>>> instead of online?
>>
>> You are right, cpu_possible mask is better.
>> I can modify this patch http://patches.linaro.org/20437/ to add the 
>> check in gic_cpu_mask.
> 
> Sounds reasonable.

Hmm ... in fact this solution is not enough :/.
With my patch series "handle non-contiguous CPU ID" (mainly the patch #3
http://patches.linaro.org/20437/), the gic cpu ID of the target cpu is
not yet initialize.

Linux solves this issue by setting each gic cpu mask (our gic_cpu_id) to
0xff and refine each time a new cpu boot. If we choose this solution
that would mean that all secondaries cpus will boot at the same time.
As I understand with your new boot code (patch #5), we are only able to
handle one cpu simultaneously.

Perhaps, we need to reintroduce an holding pen which contains the MPIDR.

-- 
Julien Grall