* [PATCH] help guest boot up on AArch64 host with GICv2
@ 2016-01-15 20:02 Chris Metcalf
  2016-01-18  9:28 ` Marc Zyngier
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Metcalf @ 2016-01-15 20:02 UTC
To: linux-arm-kernel

We are using GICv2 compatibility mode in the Fast Models/Foundation
Models simulations we are running because the boot code (ATF/UEFI)
doesn't support GICv3 in our system at the moment.

However, starting with kernel 4.2, the guest couldn't boot up because it
wasn't getting timer interrupts. I tracked this down to a kernel commit
that switched to using the "alternatives" mechanism -- rather than
seeing either a GICv2 or GICv3 and configuring appropriately, the KVM
code just configured the code that saves/restores the vgic state based
on the presence of the system register interface to the GIC CPU
interface. See the attached patch for a fix that manages this
differently and allows me to boot up the guest in this configuration.

However, even assuming this patch can be taken into an upstream tree, I
still have a couple of additional problems:

- I can boot up with the Foundation Models using this change, but not
  with the Fast Models (again, using a v3 GIC but in v2 compatibility
  mode in the device tree). The Fast Models dts looks like it has the
  same configuration for the GIC and the timers so I'm not sure what's
  going on here. Any suggestions appreciated.

- Without this change, I could only boot kernels up to 4.1. With the
  change, I can boot kernels up to 4.3. But 4.4 won't boot for me
  either; I haven't bisected it down yet. So any suggestions on what
  might be going wrong here would also be appreciated.

We are planning to eventually use GICv3 mode in our software stack but
for the time being I assume it is interesting to resolve issues with GIC
v2 compatibility mode on GIC v3.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-gic-update-save-restore-pointers-only-when-gic-v3-de.patch
Type: text/x-patch
Size: 4775 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20160115/51d99c9a/attachment.bin>

* [PATCH] help guest boot up on AArch64 host with GICv2
From: Marc Zyngier @ 2016-01-18  9:28 UTC
To: linux-arm-kernel

Hi Chris,

On 15/01/16 20:02, Chris Metcalf wrote:
[...]
> We are planning to eventually use GICv3 mode in our software stack but
> for the time being I assume it is interesting to resolve issues with GIC
> v2 compatibility mode on GIC v3.

I'm afraid that this is the wrong approach.

Whilst 4.2 was a bit too eager to use GICv3 (only checking the CPU
capability and ignoring the actual state of the EL2/EL3 SRE bits), the
fact that 4.4 doesn't boot is probably the sign of a broken firmware that
enables the system register interface at EL3, letting the rest of the
software stack use GICv3 in native mode, and yet providing a GICv2 DT.

This combination is unpredictable, and is likely to cause issues on some
HW implementations.

Could you please point me to the firmware you're using?

Also, please check the following patches:

6d32ab2 arm64: Update booting requirements for GICv3 in GICv2 mode
76e52dd irqchip/gic: Warn if GICv3 system registers are enabled
963fcd4 arm64: cpufeatures: Check ICC_EL1_SRE.SRE before enabling ARM64_HAS_SYSREG_GIC_CPUIF
7cabd00 irqchip/gic-v3: Make gic_enable_sre an inline function
d271976 arm64: el2_setup: Make sure ICC_SRE_EL2.SRE sticks before using GICv3 sysregs

Can you point me to the one that prevents you from booting?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
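
The difference between checking only the CPU capability and also checking the SRE bits can be sketched in plain C. This is an illustration, not the kernel's code: the helper names are hypothetical, and the register values are passed in as integers rather than read with MRS. The field positions used (ID_AA64PFR0_EL1.GIC in bits [27:24], ICC_SRE_EL1.SRE in bit 0) are the architectural ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* ID_AA64PFR0_EL1.GIC, bits [27:24]: non-zero means the CPU implements
 * the GIC system register interface. */
#define ID_AA64PFR0_GIC_SHIFT 24
#define ID_AA64PFR0_GIC_MASK  0xfULL

/* ICC_SRE_EL1.SRE, bit 0: the system register interface is enabled. */
#define ICC_SRE_EL1_SRE (1U << 0)

/* Capability-only check (hypothetical helper): looks at the ID register
 * and nothing else, which is the "too eager" behaviour described above. */
static inline bool sysreg_cpuif_capable(uint64_t id_aa64pfr0)
{
    return ((id_aa64pfr0 >> ID_AA64PFR0_GIC_SHIFT) & ID_AA64PFR0_GIC_MASK) != 0;
}

/* Stricter check (hypothetical helper): the CPU must be capable AND the
 * SRE bit must actually read back as set after the enable attempt. */
static inline bool sysreg_cpuif_usable(uint64_t id_aa64pfr0, uint32_t icc_sre_el1)
{
    return sysreg_cpuif_capable(id_aa64pfr0) &&
           (icc_sre_el1 & ICC_SRE_EL1_SRE) != 0;
}
```

With a capable CPU whose ICC_SRE_EL1.SRE reads back as 0, the first helper answers yes and the second answers no, which is the GICv2-compatibility case discussed in this thread.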

* [PATCH] help guest boot up on AArch64 host with GICv2
From: Chris Metcalf @ 2016-01-26 20:43 UTC
To: linux-arm-kernel

On 01/18/2016 04:28 AM, Marc Zyngier wrote:
[...]
> Can you point me to the one that prevents you from booting?

The problematic commit is 963fcd4, because it calls gic_enable_sre() in
the host kernel even with a GICv2 DT specified, and this seems to put
things in a state such that we don't receive virtual timer interrupts in
the guest when we boot it up. (I'm not that familiar with the QEMU DT but
it is providing a GIC v2 to the guest.)

With a v4.5-rc1 host, if I "return false" before the code in
gic_enable_sre() that tries to actually enable the SRE, and then hardcode
the __vgic_v2_XXX_state() save/restore calls into the __vgic_XXX_state()
routines, then my guest boots up OK.

We are using a modified ARM version of EDK v3.0-rc0, and a modified ARM
Trusted Firmware based on commit 963fcd4 (between v1.1 and 1.2).
We certainly haven't touched any of the GIC code in either one.

I tried to modify the host DT to enable GICv3, but then the host itself
hangs on boot, so clearly more is needed. (To be fair I've only tested
v4.4 in that configuration, not v4.5-rc1.) The firmware isn't yet using
GICv3 so perhaps that is part of the problem.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com

* [PATCH] help guest boot up on AArch64 host with GICv2
From: Marc Zyngier @ 2016-01-27  9:12 UTC
To: linux-arm-kernel

On 26/01/16 20:43, Chris Metcalf wrote:
[...]
> With a v4.5-rc1 host, if I "return false" before the code in
> gic_enable_sre() that tries to actually enable the SRE, and then
> hardcode the __vgic_v2_XXX_state() save/restore calls into the
> __vgic_XXX_state() routines, then my guest boots up OK.

What if you just do the "return false"? I bet that it will work as
well...
> We are using a modified ARM version of EDK v3.0-rc0, and a modified
> ARM Trusted Firmware based on commit 963fcd4 (between v1.1 and 1.2).

Are you sure of that commit? It looks suspiciously like the ID from the
kernel tree...

> We certainly haven't touched any of the GIC code in either one.
>
> I tried to modify the host DT to enable GICv3, but then the host itself
> hangs on boot, so clearly more is needed. (To be fair I've only tested
> v4.4 in that configuration, not v4.5-rc1.) The firmware isn't yet using
> GICv3 so perhaps that is part of the problem.

That's indeed part of the problem. The firmware running at EL3 insists on
using GICv2, but still lets EL2 (and EL1) use GICv3 system registers.
Could you please dump the content of ICC_SRE_EL3 just before entering the
kernel at EL2? If you see ICC_SRE_EL3.SRE being set, then this would
indicate a firmware bug (and leave the system in an unpredictable
configuration).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
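
For anyone reading a dumped ICC_SRE_EL3 value, the following C sketch decodes it and flags the combination described above (SRE set at EL3 while the DT describes a GICv2). The struct and function names are hypothetical and the raw value is assumed to have been captured by other means. The bit positions are the architectural ones for ICC_SRE_EL3.

```c
#include <stdbool.h>
#include <stdint.h>

/* ICC_SRE_EL3 bit layout per the GICv3 architecture. */
#define ICC_SRE_EL3_SRE    (1U << 0) /* system register interface enabled */
#define ICC_SRE_EL3_DFB    (1U << 1) /* disable FIQ bypass */
#define ICC_SRE_EL3_DIB    (1U << 2) /* disable IRQ bypass */
#define ICC_SRE_EL3_ENABLE (1U << 3) /* lower ELs may access ICC_SRE_EL1/EL2 */

/* Hypothetical container for the decoded fields. */
struct icc_sre_el3_fields {
    bool sre;
    bool dfb;
    bool dib;
    bool enable;
};

/* Split a raw register dump into its four defined bits. */
static inline struct icc_sre_el3_fields decode_icc_sre_el3(uint32_t raw)
{
    struct icc_sre_el3_fields f = {
        .sre    = (raw & ICC_SRE_EL3_SRE) != 0,
        .dfb    = (raw & ICC_SRE_EL3_DFB) != 0,
        .dib    = (raw & ICC_SRE_EL3_DIB) != 0,
        .enable = (raw & ICC_SRE_EL3_ENABLE) != 0,
    };
    return f;
}

/* The suspect combination: firmware turned on the system register
 * interface at EL3, yet the device tree handed to the kernel says GICv2. */
static inline bool firmware_dt_mismatch(uint32_t raw, bool dt_describes_gicv2)
{
    return dt_describes_gicv2 && decode_icc_sre_el3(raw).sre;
}
```

A dump of 0x9 (Enable and SRE set) together with a GICv2 DT would be reported as a mismatch by this check.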

* [PATCH] help guest boot up on AArch64 host with GICv2
From: Chris Metcalf @ 2016-01-28 20:12 UTC
To: linux-arm-kernel

On 01/27/2016 04:12 AM, Marc Zyngier wrote:
[...]
> What if you just do the "return false"? I bet that it will work as
> well...

Yes, that also works for my case.

>> We are using a modified ARM version of EDK v3.0-rc0, and a modified
>> ARM Trusted Firmware based on commit 963fcd4 (between v1.1 and 1.2).
> Are you sure of that commit? It looks suspiciously like the ID from the
> kernel tree...

Hah, good catch! The double-click-to-copy behavior is kind of flakey on
RHEL 6's default terminal, and I bet that bit me. It's 41099f4e.

> That's indeed part of the problem. The firmware running at EL3 insists
> on using GICv2, but still lets EL2 (and EL1) use GICv3 system registers.
> Could you please dump the content of ICC_SRE_EL3 just before entering
> the kernel at EL2? If you see ICC_SRE_EL3.SRE being set, then this would
> indicate a firmware bug (and leave the system in an unpredictable
> configuration).

Well, the firmware clearly does this intentionally. In ATF's
drivers/arm/gic/arm_gic.c, the gicv3_cpuif_setup() function has a comment
that reads:

/*******************************************************************************
 * This function does some minimal GICv3 configuration. The Firmware itself does
 * not fully support GICv3 at this time and relies on GICv2 emulation as
 * provided by GICv3. This function allows software (like Linux) in later stages
 * to use full GICv3 features.
 ******************************************************************************/

and the function ends with:

    val = read_icc_sre_el3();
    write_icc_sre_el3(val | ICC_SRE_EN | ICC_SRE_SRE);

In our build environment, if I comment out those two lines, that fixes
the guest boot problem (without any hacking on the Linux side), so that's
good anyway. With this change it works for me in the Fast Models as well
as Foundation Models, too.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
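
The effect of those two firmware lines on the register value can be modelled in plain C. This is a sketch, not ATF code: ICC_SRE_EN and ICC_SRE_SRE are defined locally with the bit positions the GICv3 architecture gives for ICC_SRE_EL3 (Enable is bit 3, SRE is bit 0), and the `allow_sysregs` parameter is a hypothetical switch standing in for commenting the two lines out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local definitions for illustration; assumed to match the architectural
 * bit positions in ICC_SRE_EL3. */
#define ICC_SRE_SRE (1U << 0)
#define ICC_SRE_EN  (1U << 3)

/* Compute the value that would be written back to ICC_SRE_EL3.
 * allow_sysregs == true  models the code as quoted above.
 * allow_sysregs == false models the two lines being commented out, so the
 * register is left untouched and lower ELs stay on the GICv2 interface. */
static inline uint32_t next_icc_sre_el3(uint32_t current, bool allow_sysregs)
{
    if (!allow_sysregs)
        return current;
    return current | ICC_SRE_EN | ICC_SRE_SRE;
}
```

Starting from a register value of 0, the quoted code would produce 0x9, while the commented-out variant leaves it at 0. Which of the two is right for a given platform depends on whether the DT handed to the kernel describes a GICv2 or a GICv3.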

* [PATCH] help guest boot up on AArch64 host with GICv2
From: Ard Biesheuvel @ 2016-01-29  7:24 UTC
To: linux-arm-kernel

On 28 January 2016 at 21:12, Chris Metcalf <cmetcalf@ezchip.com> wrote:
[...]
>>> We are using a modified ARM version of EDK v3.0-rc0, and a modified
>>> ARM Trusted Firmware based on commit 963fcd4 (between v1.1 and 1.2).

What does 'EDK v3.0-rc0' mean? We don't do any versioned releases afaik.

I recently fixed a GIC issue in the FVP EDK2 code, which prevented it
from running the GICv3 in native mode rather than in GICv2 compatibility
mode.

33ed33f ArmPkg/ArmGic: fix bug in GICv3 distributor configuration

> Well, the firmware clearly does this intentionally. In ATF's
> drivers/arm/gic/arm_gic.c, the gicv3_cpuif_setup() function has
> a comment that reads:
>
> /*******************************************************************************
>  * This function does some minimal GICv3 configuration. The Firmware itself does
>  * not fully support GICv3 at this time and relies on GICv2 emulation as
>  * provided by GICv3.
>  * This function allows software (like Linux) in later stages
>  * to use full GICv3 features.
>  ******************************************************************************/

This is deliberate, since running the GIC in v3 mode on the secure side
would remove the ability on the non-secure side to use the v2 legacy
mode. It does not limit the utility of the GICv3 on the non-secure side.

> and the function ends with:
>
>     val = read_icc_sre_el3();
>     write_icc_sre_el3(val | ICC_SRE_EN | ICC_SRE_SRE);
>
> In our build environment, if I comment out those two lines, that
> fixes the guest boot problem (without any hacking on the Linux side),
> so that's good anyway. With this change it works for me in the
> Fast Models as well as Foundation Models, too.

For historical reasons, the EDK2 GIC driver infers the presence of a
GICv3 from the ability to use the system register interface, and ignores
the ID registers completely. Without the patch above, or the
PcdArmGicV3WithV2Legacy set, the symptoms you are seeing on the firmware
side are not entirely unexpected.

Also note that, on the Foundation model, the GICv2 and the GICv3 live at
different memory addresses.

-- 
Ard.
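
The detection behaviour described above can be sketched as a small decision function. This is not EDK2 source: the enum, the function name, and both parameters are hypothetical, with `v3_with_v2_legacy` standing in for the PcdArmGicV3WithV2Legacy feature flag and `sysreg_if_usable` for the driver's probe of the system register interface.

```c
#include <stdbool.h>

/* Hypothetical result type for the driver's choice of GIC flavour. */
enum gic_arch {
    GIC_ARCH_V2,
    GIC_ARCH_V3,
};

/* Pick the GIC flavour the way the message describes: the ID registers
 * are never consulted; the only inputs are whether the system register
 * interface can be used and whether legacy mode has been forced. */
static inline enum gic_arch select_gic_arch(bool sysreg_if_usable,
                                            bool v3_with_v2_legacy)
{
    if (v3_with_v2_legacy)
        return GIC_ARCH_V2;   /* legacy mode forced by platform config */
    return sysreg_if_usable ? GIC_ARCH_V3 : GIC_ARCH_V2;
}
```

Under this model, firmware that leaves the system register interface usable and does not force legacy mode ends up on the GICv3 path even when the rest of the stack expects GICv2.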

* [PATCH] help guest boot up on AArch64 host with GICv2
From: Chris Metcalf @ 2016-01-29 17:49 UTC
To: linux-arm-kernel

On 01/29/2016 02:24 AM, Ard Biesheuvel wrote:
[...]
>>>> >>>> With a v4.5-rc1 host, if I "return false" before the code in >>>> gic_enable_sre() >>>> that tries to actually enable the SRE, and then hardcode the >>>> __vgic_v2_XXX_state() save/restore calls into the __vgic_XXX_state() >>>> routines, then my guest boots up OK. >>> What if you just do the "return false"? I bet that it will work as well... >> >> Yes, that also works for my case. >> >>>> We are using a modified ARM version of EDK v3.0-rc0, and a modified >>>> ARM Trusted Firmware based on commit 963fcd4 (between v1.1 and 1.2). > > What does 'EDK v3.0-rc0' mean? We don't do any versioned releases afaik, It's a git tag from the repo at git://github.com/ARM-software/edk2 . In fact I'm not quite sure we are at that exact tag, since it seems like some fixes present in v3.0-rc0 are missing from our code base. But it's an early 2015 drop in any case. > I recently fixed a GIC issue in the FVP EDK2 code, which prevented it > from running the GICv3 in native mode rather than in GICv2 > compatibility mode. > > 33ed33f ArmPkg/ArmGic: fix bug in GICv3 distributor configuration Looks like an alternate version of that fix is present in the ARM repo as commit 152ac4, and we have that fix in our repo too. >>>> We certainly haven't touched any of the GIC code in either one. >>>> >>>> I tried to modify the host DT to enable GICv3, but then the host itself >>>> hangs on boot, so clearly more is needed. (To be fair I've only tested >>>> v4.4 in that configuration, not v4.5-rc1.) The firmware isn't yet using >>>> GICv3 so perhaps that is part of the problem. >>> That's indeed part of the problem. The firmware running at EL3 insists >>> on using GICv2, but still let EL2 (and EL1) use GICv3 system registers. >>> Could you please dump the content of ICC_SRE_EL3 just before entering >>> the kernel at EL2? If you see ICC_SRE_EL3.SRE being set, then this would >>> indicate a firmware bug (and leave the system in an unpredictable >>> configuration). 
>> >> Well, the firmware clearly does this intentionally. In ATF's >> drivers/arm/giv/arm_gic.c, the gicv3_cpuif_setup() function has >> a comment that reads: >> >> /******************************************************************************* >> * This function does some minimal GICv3 configuration. The Firmware itself >> does >> * not fully support GICv3 at this time and relies on GICv2 emulation as >> * provided by GICv3. This function allows software (like Linux) in later >> stages >> * to use full GICv3 features. >> >> ******************************************************************************/ >> > This is deliberate, since running the GIC in v3 mode on the secure > side would remove the ability on the non-secure side to use the v2 > legacy mode. It does not limit the utility of the GICv3 on the > non-secure side It does seem that it conflicts with trying to use a GIC v2 in the DT for tip Linux, though. >> and the function ends with: >> >> val = read_icc_sre_el3(); >> write_icc_sre_el3(val | ICC_SRE_EN | ICC_SRE_SRE); >> >> In our build environment, if I comment out those two lines, that >> fixes the guest boot problem (without any hacking on the Linux side), >> so that's good anyway. With this change it works for me in the >> Fast Models as well as Foundation Models, too. >> > For historical reasons, the EDK2 GIC driver infers the presence of a > GICv3 from the ability to use the system register interface, and > ignores the ID registers completely. Without the patch above, or the > PcdArmGicV3WithV2Legacy set, the symptoms you are seeing on the > firmware side are not entirely unexpected. I believe we do set PcdArmGicV3WithV2Legacy to TRUE for our platform, but we did require my patch above in addition: [PcdsFeatureFlag.common] # Force the UEFI GIC driver to use GICv2 legacy mode. 
To use # GICv3 without GICv2 legacy in UEFI, the ARM Trusted Firmware needs # to configure the Non-Secure interrupts in the GIC Redistributors # which is not supported at the moment. gArmTokenSpaceGuid.PcdArmGicV3WithV2Legacy|TRUE > Also note that, on the > Foundation model, the GICv2 and the GICv3 live at different memory > addresses. We have the GIC at different addresses in any case, but I will check with our hardware folks to see if we should be using different addresses if we try to use a GIC v3. Thanks! -- Chris Metcalf, EZChip Semiconductor http://www.ezchip.com ^ permalink raw reply [flat|nested] 10+ messages in thread
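[Editorial sketch, not part of the original message.] The register dump
Marc asks for, and the two ATF lines Chris comments out, can be modelled
in plain C. This is only an illustration under stated assumptions: the bit
positions follow the GICv3 architecture definition of ICC_SRE_EL3, the
macro names mirror the ones visible in the quoted ATF snippet, and the
three helper functions are hypothetical names introduced here, not ATF or
kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* ICC_SRE_EL3 bit positions (GICv3 architecture). */
#define ICC_SRE_SRE (UINT64_C(1) << 0) /* system register interface in use */
#define ICC_SRE_DFB (UINT64_C(1) << 1) /* disable FIQ bypass */
#define ICC_SRE_DIB (UINT64_C(1) << 2) /* disable IRQ bypass */
#define ICC_SRE_EN  (UINT64_C(1) << 3) /* lower ELs may access ICC_SRE_EL1/EL2 */

/* Hypothetical helper: is ICC_SRE_EL3.SRE set in a dumped register value? */
bool icc_sre_el3_sre_set(uint64_t val)
{
    return (val & ICC_SRE_SRE) != 0;
}

/* Hypothetical helper: is ICC_SRE_EL3.Enable set in a dumped register value? */
bool icc_sre_el3_lower_el_enabled(uint64_t val)
{
    return (val & ICC_SRE_EN) != 0;
}

/*
 * Hypothetical helper: the value the quoted ATF code would write back,
 * i.e. the old value with both Enable and SRE set.
 */
uint64_t icc_sre_el3_after_setup(uint64_t val)
{
    return val | ICC_SRE_EN | ICC_SRE_SRE;
}
```

Feeding a dumped value into icc_sre_el3_sre_set() answers the question
raised in the thread: if it returns true while the kernel is handed a
GICv2 DT, the system is in the mixed configuration being discussed.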
* [PATCH] help guest boot up on AArch64 host with GICv2
  2016-01-28 20:12       ` Chris Metcalf
  2016-01-29  7:24         ` Ard Biesheuvel
@ 2016-01-29 17:54         ` Marc Zyngier
  2016-01-29 18:29           ` Chris Metcalf
  1 sibling, 1 reply; 10+ messages in thread
From: Marc Zyngier @ 2016-01-29 17:54 UTC
To: linux-arm-kernel

On 28/01/16 20:12, Chris Metcalf wrote:
> On 01/27/2016 04:12 AM, Marc Zyngier wrote:
>> On 26/01/16 20:43, Chris Metcalf wrote:
>>> On 01/18/2016 04:28 AM, Marc Zyngier wrote:
>>>> Hi Chris,
>>>>
>>>> On 15/01/16 20:02, Chris Metcalf wrote:
>>>>> We are using GICv2 compatibility mode in the Fast Models/Foundation
>>>>> Models simulations we are running because the boot code (ATF/UEFI)
>>>>> doesn't support GICv3 in our system at the moment.
>>>>>
>>>>> However, starting with kernel 4.2, the guest couldn't boot up because it
>>>>> wasn't getting timer interrupts. I tracked this down to a kernel commit
>>>>> that switched to using the "alternatives" mechanism -- rather than
>>>>> seeing either a GICv2 or GICv3 and configuring appropriately, the KVM
>>>>> code just configured the code that saves/restores the vgic state based
>>>>> on the presence of the system register interface to the GIC CPU
>>>>> interface. See the attached patch for a fix that manages this
>>>>> differently and allows me to boot up the guest in this configuration.
>>>>>
>>>>> However, even assuming this patch can be taken into an upstream tree, I
>>>>> still have a couple of additional problems:
>>>>>
>>>>> - I can boot up with the Foundation Models using this change, but not
>>>>> with the Fast Models (again, using a v3 GIC but in v2 compatibility mode
>>>>> in the device tree). The Fast Models dts looks like it has the same
>>>>> configuration for the GIC and the timers so I'm not sure what's going on
>>>>> here. Any suggestions appreciated.
>>>>>
>>>>> - Without this change, I could only boot kernels up to 4.1. With the
>>>>> change, I can boot kernels up to 4.3. But 4.4 won't boot for me either;
>>>>> I haven't bisected it down yet. So any suggestions on what might be
>>>>> going wrong here would also be appreciated.
>>>>>
>>>>> We are planning to eventually use GICv3 mode in our software stack but
>>>>> for the time being I assume it is interesting to resolve issues with GIC
>>>>> v2 compatibility mode on GIC v3.
>>>>>
>>>> I'm afraid that this is the wrong approach. Whilst 4.2 was a bit too
>>>> eager to use GICv3 (only checking the CPU capability and ignoring the
>>>> actual state of the EL2/EL3 SRE bits), the fact that 4.4 doesn't boot is
>>>> probably the sign of a broken firmware that enables the system register
>>>> interface at EL3, letting the rest of the software stack to use GICv3 in
>>>> native mode, and yet providing a GICv2 DT.
>>>>
>>>> This combination is unpredictable, and is likely to cause issues on
>>>> some HW implementations.
>>>>
>>>> Could you please point me to the firmware you're using?
>>>>
>>>> Also, please check the following patches:
>>>>
>>>> 6d32ab2 arm64: Update booting requirements for GICv3 in GICv2 mode
>>>> 76e52dd irqchip/gic: Warn if GICv3 system registers are enabled
>>>> 963fcd4 arm64: cpufeatures: Check ICC_EL1_SRE.SRE before enabling
>>>> ARM64_HAS_SYSREG_GIC_CPUIF
>>>> 7cabd00 irqchip/gic-v3: Make gic_enable_sre an inline function
>>>> d271976 arm64: el2_setup: Make sure ICC_SRE_EL2.SRE sticks before using
>>>> GICv3 sysregs
>>>>
>>>> Can you point me to the one that prevents you from booting?
>>> The problematic commit is 963fcd4, because it calls gic_enable_sre()
>>> in the host kernel even with a GICv2 DT specified, and this seems to
>>> put things in a state such that we don't receive virtual timer
>>> interrupts in the guest when we boot it up. (I'm not that familiar with
>>> the QEMU DT but it is providing a GIC v2 to the guest.)
>>>
>>> With a v4.5-rc1 host, if I "return false" before the code in gic_enable_sre()
>>> that tries to actually enable the SRE, and then hardcode the
>>> __vgic_v2_XXX_state() save/restore calls into the __vgic_XXX_state()
>>> routines, then my guest boots up OK.
>> What if you just do the "return false"? I bet that it will work as well...
>
> Yes, that also works for my case.
>
>>> We are using a modified ARM version of EDK v3.0-rc0, and a modified
>>> ARM Trusted Firmware based on commit 963fcd4 (between v1.1 and 1.2).
>> Are you sure of that commit? It looks suspiciously like the ID from the
>> kernel tree...
>
> Hah, good catch! The double-click-to-copy behavior is kind of flakey
> on RHEL 6's default terminal, and I bet that bit me. It's 41099f4e.
>
>>> We certainly haven't touched any of the GIC code in either one.
>>>
>>> I tried to modify the host DT to enable GICv3, but then the host itself
>>> hangs on boot, so clearly more is needed. (To be fair I've only tested
>>> v4.4 in that configuration, not v4.5-rc1.) The firmware isn't yet using
>>> GICv3 so perhaps that is part of the problem.
>> That's indeed part of the problem. The firmware running at EL3 insists
>> on using GICv2, but still let EL2 (and EL1) use GICv3 system registers.
>> Could you please dump the content of ICC_SRE_EL3 just before entering
>> the kernel at EL2? If you see ICC_SRE_EL3.SRE being set, then this would
>> indicate a firmware bug (and leave the system in an unpredictable
>> configuration).
>
> Well, the firmware clearly does this intentionally. In ATF's
> drivers/arm/gic/arm_gic.c, the gicv3_cpuif_setup() function has
> a comment that reads:
>
> /*******************************************************************************
>  * This function does some minimal GICv3 configuration. The Firmware itself does
>  * not fully support GICv3 at this time and relies on GICv2 emulation as
>  * provided by GICv3. This function allows software (like Linux) in later stages
>  * to use full GICv3 features.
>  ******************************************************************************/
>
> and the function ends with:
>
>     val = read_icc_sre_el3();
>     write_icc_sre_el3(val | ICC_SRE_EN | ICC_SRE_SRE);
>
> In our build environment, if I comment out those two lines, that
> fixes the guest boot problem (without any hacking on the Linux side),
> so that's good anyway. With this change it works for me in the
> Fast Models as well as Foundation Models, too.

By the look of it, you're trying to use a GICv3 firmware, and pass a
GICv2 DT to the kernel. Do not do that. Either you use a GICv2 firmware
(having spoken to the ATF guys, there is a GICv2 driver in there that
should work for your case) and pass a GICv2 DT, or you go GICv3 all the way.

A mix of the two things is completely unsupported on the model, and
solidly places you in the UNPREDICTABLE category when running that on
actual HW...

Thanks,

	M.
--
Jazz is not dead. It just smells funny...
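[Editorial sketch, not part of the original message.] The rule Marc
states (GICv2 firmware with a GICv2 DT, or GICv3 throughout, never a mix)
can be written down as a small consistency check. This is a rough sketch
under assumptions: the function and enum names are hypothetical, only a
few well-known device tree "compatible" strings are recognised, and
"firmware uses GICv3" is approximated by ICC_SRE_EL3.SRE being set when
the kernel is entered.

```c
#include <stdbool.h>
#include <string.h>

/* GIC flavour advertised to the kernel by the device tree. */
enum gic_dt_version { GIC_DT_UNKNOWN = 0, GIC_DT_V2 = 2, GIC_DT_V3 = 3 };

/*
 * Hypothetical helper: classify a GIC node by its "compatible" string.
 * Only a handful of common strings are handled; anything else is UNKNOWN.
 */
enum gic_dt_version gic_dt_version_from_compatible(const char *compat)
{
    if (strcmp(compat, "arm,gic-v3") == 0)
        return GIC_DT_V3;
    if (strcmp(compat, "arm,gic-400") == 0 ||
        strcmp(compat, "arm,cortex-a15-gic") == 0)
        return GIC_DT_V2;
    return GIC_DT_UNKNOWN;
}

/*
 * Hypothetical helper: el3_sre is the state of ICC_SRE_EL3.SRE left by the
 * firmware. A GICv2 DT is only consistent with SRE clear, a GICv3 DT only
 * with SRE set; an unrecognised DT is treated as inconsistent.
 */
bool gic_config_is_consistent(bool el3_sre, enum gic_dt_version dt)
{
    switch (dt) {
    case GIC_DT_V2:
        return !el3_sre;
    case GIC_DT_V3:
        return el3_sre;
    default:
        return false;
    }
}
```

With the setup described in this thread (SRE set by the firmware, GICv2
node in the DT) the check returns false; commenting out the two ATF
lines, or moving to a GICv3 DT, would make it return true.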
* [PATCH] help guest boot up on AArch64 host with GICv2
  2016-01-29 17:54         ` Marc Zyngier
@ 2016-01-29 18:29           ` Chris Metcalf
  2016-01-29 18:55             ` Marc Zyngier
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Metcalf @ 2016-01-29 18:29 UTC
To: linux-arm-kernel

On 01/29/2016 12:54 PM, Marc Zyngier wrote:
> By the look of it, you're trying to use a GICv3 firmware, and pass a
> GICv2 DT to the kernel. Do not do that. Either you use a GICv2 firmware
> (having spoken to the ATF guys, there is a GICv2 driver in there that
> should work for your case) and pass a GICv2 DT, or you go GICv3 all the way.
>
> A mix of the two things is completely unsupported on the model, and
> solidly places you in the UNPREDICTABLE category when running that on
> actual HW...

Once we upgrade to ATF 1.2 we will remove all of our GICv2
stuff and see if everything works smoothly; till then, at least, we
seem to have a workaround in place for development that will
let us keep moving forward.

From the ATF docs it does seem that using a v3 GIC in v2
compatibility mode should be supported, but it's not pressing
from our side to drill any deeper to try to see why it's not
actually working correctly for us (though I'd be happy to try
any further testing in this configuration if that's helpful).

Thanks!

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
* [PATCH] help guest boot up on AArch64 host with GICv2
  2016-01-29 18:29           ` Chris Metcalf
@ 2016-01-29 18:55             ` Marc Zyngier
  0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2016-01-29 18:55 UTC
To: linux-arm-kernel

On 29/01/16 18:29, Chris Metcalf wrote:
> On 01/29/2016 12:54 PM, Marc Zyngier wrote:
>> By the look of it, you're trying to use a GICv3 firmware, and pass a
>> GICv2 DT to the kernel. Do not do that. Either you use a GICv2 firmware
>> (having spoken to the ATF guys, there is a GICv2 driver in there that
>> should work for your case) and pass a GICv2 DT, or you go GICv3 all the way.
>>
>> A mix of the two things is completely unsupported on the model, and
>> solidly places you in the UNPREDICTABLE category when running that on
>> actual HW...
>
> Once we upgrade to ATF 1.2 we will remove all of our GICv2
> stuff and see if everything works smoothly; till then, at least, we
> seem to have a workaround in place for development that will
> let us keep moving forward.
>
> From the ATF docs it does seem that using a v3 GIC in v2
> compatibility mode should be supported, but it's not pressing
> from our side to drill any deeper to try to see why it's not
> actually working correctly for us (though I'd be happy to try
> any further testing in this configuration if that's helpful).

That's what I've been told as well - the GICv2 code in ATF 1.2 should do
the trick out of the box.

Thanks,

	M.
--
Jazz is not dead. It just smells funny...
end of thread, newest message ~2016-01-29 18:55 UTC

Thread overview: 10+ messages
2016-01-15 20:02 [PATCH] help guest boot up on AArch64 host with GICv2 Chris Metcalf
2016-01-18  9:28 ` Marc Zyngier
2016-01-26 20:43   ` Chris Metcalf
2016-01-27  9:12     ` Marc Zyngier
2016-01-28 20:12       ` Chris Metcalf
2016-01-29  7:24         ` Ard Biesheuvel
2016-01-29 17:49           ` Chris Metcalf
2016-01-29 17:54         ` Marc Zyngier
2016-01-29 18:29           ` Chris Metcalf
2016-01-29 18:55             ` Marc Zyngier