From mboxrd@z Thu Jan 1 00:00:00 1970 From: Julien Grall Subject: Re: [PATCH RFC 0/N] xen: arm: rework early bring up Date: Thu, 19 Sep 2013 15:49:08 +0100 Message-ID: <523B0EE4.1090304@linaro.org> References: <1379381846.11304.73.camel@hastur.hellion.org.uk> <5239E3E0.3050306@linaro.org> <1379532131.11304.289.camel@hastur.hellion.org.uk> <1379532275.11304.291.camel@hastur.hellion.org.uk> <523A0658.6070007@linaro.org> <1379535773.11304.305.camel@hastur.hellion.org.uk> <523A17CE.3020107@linaro.org> <1379539699.11304.318.camel@hastur.hellion.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <1379539699.11304.318.camel@hastur.hellion.org.uk> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Campbell Cc: Tim Deegan , Stefano Stabellini , Andre Przywara , xen-devel List-Id: xen-devel@lists.xenproject.org On 09/18/2013 10:28 PM, Ian Campbell wrote: > On Wed, 2013-09-18 at 22:14 +0100, Julien Grall wrote: >> On 09/18/2013 09:22 PM, Ian Campbell wrote: >>> On Wed, 2013-09-18 at 21:00 +0100, Julien Grall wrote: >>>> On 09/18/2013 08:24 PM, Ian Campbell wrote: >>>>> On Wed, 2013-09-18 at 20:22 +0100, Ian Campbell wrote: >>>>>> On Wed, 2013-09-18 at 18:33 +0100, Julien Grall wrote: >>>>>>> On 09/17/2013 02:37 AM, Ian Campbell wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> The following reworks early bring up on ARM to use a separate set of >>>>>>>> boot pagetables. This simplifies things by avoiding the need to bring up >>>>>>>> all CPUs in lock step, which in turn allows us to do secondary CPU >>>>>>>> bringup in C code. >>>>>>>> >>>>>>>> Unfortunately the main bulk of this change is a single large patch which >>>>>>>> is hard to decompose any further since it is basically pulling on the >>>>>>>> thread and then knitting a new jumper from it. >>>>>>>> >>>>>>>> With these changes Xen now absolutely requires that the bootloader calls >>>>>>>> the hypervisor in HYP mode, the previous workarounds have been removed. >>>>>>>> For use on models a bootwrapper is now required. See >>>>>>>> git://xenbits.xen.org/people/ianc/boot-wrapper.git xen-arm32 >>>>>>>> git://xenbits.xen.org/people/ianc/boot-wrapper-aarch64.git xen-arm64 >>>>>>>> >>>>>>>> I have implemented support for CPU bringup on the fastmodel vexpress >>>>>>>> platforms (v7and v8) here, I suppose it should work OK on a real >>>>>>>> vexpress too but I've not tried it. >>>>>>> >>>>>>> I have just tried the patch series on the versatile express and it >>>>>>> doesn't work. I'm using the u-boot from Andre and all the cpus already >>>>>>> boot in hyp mode (checked without this patch series). >>>>>> >>>>>> :-( >>>>>> >>>>>> The code should essentially the same as the old kick cpus, but >>>>>> reimplemented in C. The only difference I can spot is that the C version >>>>>> is kicking a single CPU at a time while the ASM version is kicking >>>>>> everyone at once. Does using send_SGI_allbutself instead work as a hack? >>>>>> >>>>>> Perhaps our idea of the CPUID is wrong or something? Your patches to >>>>>> fixup the logical CPU id stuff might help? >>>> >>>> I have found the issue, in send_SGI_mask we remove the offline CPUs with >>>> the following line: >>>> mask &= cpumask_bits(&cpu_online_map); >>>> >>>> As the secondary cpu (assuming we have a board with only 2 cores) is not >>>> yet online, we can't send SGI with our code. >>>> I'm very surprised that the code works on the fast models. >>> >>> So am I, now. >>> >>> OK. So online mask == 0x1 and we are trying to wake mask 0x2, so the >>> actual target list we end up using is 0... >>> >>> The gic manual says "When TargetListFilter is 0b00, if the CPUTargetList >>> field is 0x00 the Distributor does not forward the interrupt to any CPU >>> interface." (footnote to table 4-21), so it shouldn't be waking >>> anything... It may be a model bug >>> >>>> To solve the issue, I think we need to create another helper which will >>>> send an SGI wihout removing offline CPUs. >>> >>> Looking briefly at the callchains leading to send_SGI_mask I don't see >>> any which deliberately pass an offline CPU (which would be pretty >>> non-sensical apart from this one case). So we could perhaps just remove >>> this check and make it the caller's problem. >>> Or maybe the right check is for cpu_possible mask or something similar >>> instead of online? >> >> You are right, cpu_possible mask is better. >> I can modify this patch http://patches.linaro.org/20437/ to add the >> check in gic_cpu_mask. > > Sounds reasonable. Hmm ... in fact this solution is not enough :/. With my patch series "handle non-contiguous CPU ID" (mainly the patch #3 http://patches.linaro.org/20437/), the gic cpu ID of the target cpu is not yet initialize. Linux solves this issue by setting each gic cpu mask (our gic_cpu_id) to 0xff and refine each time a new cpu boot. If we choose this solution that would mean that all secondaries cpus will boot at the same time. As I understand with your new boot code (patch #5), we are only able to handle one cpu simultaneously. Perhaps, we need to reintroduce an holding pen which contains the MPIDR. -- Julien Grall