* Re: ddr devfreq on H3 - possible?
       [not found] <CAKAF0m9DqPjB6C39ZbrRHFrJOodm7WQGTL0x1jduQjNU=JpQ2g@mail.gmail.com>
@ 2022-12-29 17:29 ` Samuel Holland
  2022-12-31 11:15   ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Samuel Holland @ 2022-12-29 17:29 UTC (permalink / raw)
To: linux-sunxi; +Cc: Kirill

Hi Kirill,

On 12/28/22 07:10, Kirill wrote:
> I'm trying to use your driver with the H3, but get this result:
> ```
> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit
> DDRx with ODT
> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750
> (6%) at 1248 MHz
> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750
> (0%) at 1248 MHz
> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to
> 156 MHz, tREFI=19, tRFC=28, ODT=disabled
> ```
>
> After this, the CPU hangs and stops responding.
> Is it possible (at least theoretically) to use this driver with the H3?

Yes, although it will need some help from firmware. If you look at the
vendor driver[1] (pick any random Allwinner 4.9 tree), you will see
there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3).
The driver always calls mdfs_main(), which is a standalone program
loaded to SRAM. The reason seems to be that the MDFS hardware is broken,
as you found out.

Something like this standalone MDFS application is not upstreamable, but
conveniently we already have some firmware running from SRAM, namely
U-Boot's PSCI/secure monitor implementation. And Allwinner already has
some chips where they call an SMC to do this MDFS procedure[2]. So we can
reuse that SMC function ID and put the code in the secure monitor.
I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept.
It works, though it did lock up once after playing with the devfreq
sysfs for several minutes. The contents of sunxi_dram_dvfs_req() are
just copied from the vendor driver; they could surely be improved.
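[Editor's note: the "reuse an SMC function ID and handle it in the secure monitor" idea above can be sketched as a plain C dispatch model. The function ID value, the handler names, and the PSCI version returned are hypothetical placeholders for illustration, not the actual values from Allwinner's sunxi-mdfs.h or the patches.]

```c
#include <assert.h>
#include <stdint.h>

/* Standard SMCCC "unknown function" return value */
#define SMCCC_RET_NOT_SUPPORTED  (-1)

/* Hypothetical SiP function ID for the DRAM DVFS request (placeholder,
 * not the ID from Allwinner's sunxi-mdfs.h) */
#define SMC_SIP_DRAM_DVFS  0x82000008u

/* Platform hook: handles function IDs the generic PSCI code doesn't know */
static int32_t platform_smc_handler(uint32_t fid, uint32_t freq)
{
    if (fid == SMC_SIP_DRAM_DVFS) {
        /* a real monitor would run sunxi_dram_dvfs_req(freq) from SRAM */
        (void)freq;
        return 0;
    }
    return SMCCC_RET_NOT_SUPPORTED;
}

/* Generic monitor dispatch: known standard IDs first, then the platform hook */
static int32_t monitor_dispatch(uint32_t fid, uint32_t arg)
{
    switch (fid) {
    case 0x84000000u:           /* PSCI_VERSION */
        return 0x00010001;      /* PSCI v1.1 */
    default:
        return platform_smc_handler(fid, arg);
    }
}
```

Routing unknown IDs through a default case like this is one way the "platform callback for unknown function IDs" mentioned later in this message could look.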
The U-Boot patch is based on my series adding Crust support for H3, so I
could have interactive peek/poke from the AR100 even when the DRAM
controller is dead. It shouldn't be too hard to rebase that out and move
the code to psci.c.

I'm not sure of the best way to upstream the changes to psci.S. Probably
we need some platform callback to handle unknown function IDs.

Regards,
Samuel

[1]: https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411
[2]: https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38
[3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq
[4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
  2022-12-29 17:29 ` ddr devfreq on H3 - possible? Samuel Holland
@ 2022-12-31 11:15   ` Kirill
  2022-12-31 20:45     ` Samuel Holland
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2022-12-31 11:15 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

Hi!

I ported your patches to the Armbian 6.1 kernel / U-Boot, and it works!

> It works, though it did lock up once after playing with the devfreq
> sysfs for several minutes

Yes, I see hangs too. And the main reason for the problem is SMP. :(

By calling the SMC we put only one CPU into SRAM, but the other CPUs
keep running and using DRAM! I don't see any hangs if I disable all
the other CPUs:
```
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online
```

The original sunxi 3.4 kernel uses a dirty hack[2] to get around this
problem: it calls mdfs_pause_cpu()[1] on each CPU core (except the
current one). That function is located in SRAM and locks the CPU in an
infinite loop until `set_paused(false)` is called on CPU0. Legacy
Rockchip kernels use the same hack[3].

But this method is not ideal. Before changing the DDR frequency, we
must make sure that every core is stuck in the SRAM function, and that
can take a long time. In my proof-of-concept implementation, sometimes
a few seconds elapse between calling smp_call_function() and all cores
spinning in the SRAM function. This method may be suitable for manual
frequency changes, but not for an automatic governor mode; there is no
chance to change the frequency on heavily loaded CPUs.

We need another way. I think the code in PSCI must stop/suspend all the
other running CPU cores before updating the frequency. Is this possible?
[1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638
[2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088
[3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168

P.S. sorry for the duplicate, the previous message was declined by mlmmj

Thu, 29 Dec 2022 at 19:29, Samuel Holland <samuel@sholland.org>:
> [...]
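[Editor's note: the mdfs_pause_cpu() scheme described above (parking every other core in a spin loop until CPU0 releases it) can be modelled with ordinary threads. This is a minimal sketch of the synchronization pattern only, not the vendor code; the names and the three-core count mirror the thread's description.]

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_bool paused;        /* analogue of the driver's set_paused() flag */
static atomic_int  cores_parked;  /* how many secondary "CPUs" reached the loop */

/* Analogue of mdfs_pause_cpu(): spin until CPU0 clears the flag */
static void *pause_cpu(void *arg)
{
    (void)arg;
    atomic_fetch_add(&cores_parked, 1);
    while (atomic_load(&paused))
        ;  /* the core sits here, generating no DRAM traffic */
    return NULL;
}

/* CPU0's side: park the other three cores, reclock, then release them */
static int reclock_with_paused_cpus(void)
{
    pthread_t cpus[3];

    atomic_store(&paused, 1);
    for (int i = 0; i < 3; i++)
        pthread_create(&cpus[i], NULL, pause_cpu, NULL);

    /* the slow part Kirill measured: waiting until every core is parked */
    while (atomic_load(&cores_parked) < 3)
        ;
    /* ...change the DRAM frequency here... */
    atomic_store(&paused, 0);

    for (int i = 0; i < 3; i++)
        pthread_join(cpus[i], NULL);
    return atomic_load(&cores_parked);
}
```

The "waiting until every core is parked" loop is exactly the latency problem raised in the message: with loaded CPUs, reaching the parked state can take seconds.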
* Re: ddr devfreq on H3 - possible?
  2022-12-31 11:15 ` Kirill
@ 2022-12-31 20:45   ` Samuel Holland
  2023-01-01 21:40     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Samuel Holland @ 2022-12-31 20:45 UTC (permalink / raw)
To: Kirill; +Cc: linux-sunxi

Hi Kirill,

On 12/31/22 05:15, Kirill wrote:
> Hi!
>
> I ported your patches to the Armbian 6.1 kernel / U-Boot, and it works!
>
>> It works, though it did lock up once after playing with the devfreq
>> sysfs for several minutes
>
> Yes, I see hangs too. And the main reason for the problem is SMP. :(
>
> By calling the SMC we put only one CPU into SRAM, but the other CPUs
> keep running and using DRAM! I don't see any hangs if I disable all
> the other CPUs:
> ```
> echo 0 > /sys/devices/system/cpu/cpu1/online
> echo 0 > /sys/devices/system/cpu/cpu2/online
> echo 0 > /sys/devices/system/cpu/cpu3/online
> ```

Thanks for investigating. That's good information.

> The original sunxi 3.4 kernel uses a dirty hack[2] to get around this
> problem: it calls mdfs_pause_cpu()[1] on each CPU core (except the
> current one). That function is located in SRAM and locks the CPU in an
> infinite loop until `set_paused(false)` is called on CPU0. Legacy
> Rockchip kernels use the same hack[3].
>
> But this method is not ideal. Before changing the DDR frequency, we
> must make sure that every core is stuck in the SRAM function, and that
> can take a long time. In my proof-of-concept implementation, sometimes
> a few seconds elapse between calling smp_call_function() and all cores
> spinning in the SRAM function. This method may be suitable for manual
> frequency changes, but not for an automatic governor mode; there is no
> chance to change the frequency on heavily loaded CPUs.
>
> We need another way.
>
> I think the code in PSCI must stop/suspend all the other running CPU
> cores before updating the frequency. Is this possible?

It _shouldn't_ be necessary.
Setting bit 8 in PWRCTL blocks the DRAM
controller's host interface, which should cause the L2 cache subsystem
(and thus the other CPUs) to stall when trying to access DRAM. This is
what the MDFS hardware does on A64/H5, and I have seen no hangs there.

Possibly the issue is that such a stall sometimes affects the CPU that
is running from SRAM, even though it should not. (On the other hand,
when using the MDFS hardware, it is okay if all four CPUs temporarily
stall at the same time.)

One thing to check is if sunxi_dram_dvfs_req() completes successfully.
That function contains some unbounded loops, so it is possible to get
stuck. You could toggle a GPIO or something at the end of the function.
That would distinguish between "the secure monitor hung" and "we left
the DRAM controller in a bad state and hung when switching back to code
in DRAM" or even "we trashed the contents of DRAM".

We do use the architectural timer inside sunxi_dram_dvfs_req(), but
those registers are banked between secure/non-secure states, so that
should not interfere with Linux's use of the timer.

However, your test with offlining the other CPUs suggests we may really
need some synchronization. I would suggest doing this inside U-Boot as
well. You can send an SGI IPI to the other three CPUs and force them to
trap into the secure monitor. Not only will this be immediate, but it
will also ensure the other CPUs are running from SRAM during the
reclocking. You can take inspiration from the existing IPI code in psci.c.

It is quite convenient to be truly in control, so you can do things
behind the OS's back, and keep it blissfully unaware. :)

Regards,
Samuel

> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638
> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088
> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168
>
> P.S. sorry for the duplicate, the previous message was declined by mlmmj
>
> Thu, 29 Dec 2022 at 19:29, Samuel Holland <samuel@sholland.org>:
>> [...]
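[Editor's note: the PWRCTL manipulation discussed above (bit 0 requesting self-refresh, bit 8 blocking the host interface) amounts to a single read-modify-write, as a later message in the thread shows. A standalone model of that step; the bit names are invented here for illustration, and the register is modelled as plain memory.]

```c
#include <assert.h>
#include <stdint.h>

/* Bit names invented for illustration; the thread only identifies them as
 * "enter self-refresh" (bit 0) and "disable all master access" (bit 8) */
#define PWRCTL_SELFREF_ENTER  (1u << 0)
#define PWRCTL_MASTER_DIS     (1u << 8)

/* Stand-in for the memory-mapped PWRCTL register */
static uint32_t fake_pwrctl;

/* The read-modify-write that begins the reclocking sequence */
static uint32_t dram_enter_selfrefresh(void)
{
    uint32_t val = fake_pwrctl;                       /* readl(PWRCTL)       */
    val |= PWRCTL_SELFREF_ENTER | PWRCTL_MASTER_DIS;  /* set bits 0 and 8    */
    fake_pwrctl = val;                                /* writel(val, PWRCTL) */
    return fake_pwrctl;
}
```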
* Re: ddr devfreq on H3 - possible?
  2022-12-31 20:45 ` Samuel Holland
@ 2023-01-01 21:40   ` Kirill
  2023-01-01 22:20     ` Samuel Holland
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-01 21:40 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

I did a little debugging. When the hang happens, the CPU is always
stuck on `writel(reg_val, PWRCTL);`.

Example of my debug instrumentation:
```
/* 1. enter self-refresh and disable all master access */
reg_val = readl(PWRCTL);
reg_val |= (0x1<<0);
reg_val |= (0x1<<8);
__gpio_debug(2);
writel(reg_val, PWRCTL);
__gpio_debug(3);
__udelay(1);
__gpio_debug(4);
```
__gpio_debug() should not have any effect on the process; it just
toggles GPIOs.

Before hanging, it can work for a few minutes. Hangs occur randomly,
and do not depend on the number of frequency changes.

I tried something like this for the fastest bug reproduction:
```
while true
do
	echo "simple_ondemand" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
	sleep 5
	echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
	sleep 5
done
```

No effect, it still happens randomly :)

> You can send an SGI IPI to the other three CPUs and force them to
> trap into the secure monitor.

Thanks, I will try this later.

P.S. Sorry for the duplicate, I forgot "Reply to All" in Gmail :(

Sat, 31 Dec 2022 at 22:45, Samuel Holland <samuel@sholland.org>:
> [...]
* Re: ddr devfreq on H3 - possible?
  2023-01-01 21:40 ` Kirill
@ 2023-01-01 22:20   ` Samuel Holland
  2023-01-02  2:20     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Samuel Holland @ 2023-01-01 22:20 UTC (permalink / raw)
To: Kirill; +Cc: linux-sunxi

On 1/1/23 15:40, Kirill wrote:
> I did a little debugging. When the hang happens, the CPU is always
> stuck on `writel(reg_val, PWRCTL);`.
>
> Example of my debug instrumentation:
> ```
> /* 1. enter self-refresh and disable all master access */
> reg_val = readl(PWRCTL);
> reg_val |= (0x1<<0);
> reg_val |= (0x1<<8);
> __gpio_debug(2);
> writel(reg_val, PWRCTL);
> __gpio_debug(3);
> __udelay(1);
> __gpio_debug(4);
> ```
> __gpio_debug() should not have any effect on the process; it just
> toggles GPIOs.

Just to make sure -- is __gpio_debug declared with __secure?

> Before hanging, it can work for a few minutes. Hangs occur randomly,
> and do not depend on the number of frequency changes.

Hmm, the CPU hanging right when blocking access to DRAM suggests there
is some asynchronous cache operation (prefetch or writeback) that
causes some DRAM traffic.

I checked the U-Boot source code, and it disables both caches in the
secure copy of SCTLR before setting up PSCI*. So there should not be any
prefetching occurring while you are in monitor mode. But maybe there is
some writeback from cache lines dirtied in non-secure state. You could
try calling psci_v7_flush_dcache_all() before setting PWRCTL.

If that does not help, I am not sure what else could be happening.
Regards,
Samuel

* boot_jump_linux() -> announce_and_cleanup() -> cleanup_before_linux()
  -> cleanup_before_linux_select()

> I tried something like this for the fastest bug reproduction:
> ```
> while true
> do
> 	echo "simple_ondemand" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> 	sleep 5
> 	echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> 	sleep 5
> done
> ```
>
> No effect, it still happens randomly :)
>
>> You can send an SGI IPI to the other three CPUs and force them to
>> trap into the secure monitor.
>
> Thanks, I will try this later.
>
> P.S. Sorry for the duplicate, I forgot "Reply to All" in Gmail :(
>
> Sat, 31 Dec 2022 at 22:45, Samuel Holland <samuel@sholland.org>:
>> [...]
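[Editor's note: on the H3's GICv2, the suggested SGI to the other cores boils down to one write to the distributor's GICD_SGIR register. A sketch of the value encoding follows; the register layout is per the GICv2 architecture, but the choice of SGI 15 and the CPU target list are assumptions for illustration, not taken from the actual patches.]

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 distributor: Software Generated Interrupt Register offset */
#define GICD_SGIR  0xF00u

/* Encode a GICD_SGIR value: TargetListFilter 0b00 (bits 25:24, use the
 * explicit list), CPUTargetList in bits 23:16, SGI INTID in bits 3:0 */
static uint32_t gic_sgir_value(uint8_t target_list, uint8_t sgi_id)
{
    return ((uint32_t)target_list << 16) | (sgi_id & 0xFu);
}

/* From CPU0: signal a hypothetical SGI 15 to CPUs 1-3 so they trap into
 * the secure monitor for the duration of the reclocking */
static uint32_t park_other_cores(void)
{
    /* a real monitor would do: writel(value, gicd_base + GICD_SGIR); */
    return gic_sgir_value(0x0e /* CPUs 1, 2, 3 */, 15);
}
```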
* Re: ddr devfreq on H3 - possible?
  2023-01-01 22:20 ` Samuel Holland
@ 2023-01-02  2:20   ` Kirill
  2023-01-04 23:26     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-02  2:20 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

> Just to make sure -- is __gpio_debug declared with __secure?

Yes

> try calling psci_v7_flush_dcache_all() before setting PWRCTL.

Hm... I'm not sure yet, but it seems to be working: my OPi Lite has run
for about an hour without any hangs! Of course, I need more time and
more tests to confirm that completely. I will continue testing. Many
thanks!!!

Also, I did some extra work:
- rewrote the code to use U-Boot registers and constants
- implemented enabling/disabling ODT
- implemented enabling/disabling self-refresh
- implemented custom PSCI tables for platform-specific functions in U-Boot

All of these also seem to work. My fork with the changes:
https://github.com/Azq2/u-boot/commits/allwinner-h3-dram-dvfs

Mon, 2 Jan 2023 at 00:20, Samuel Holland <samuel@sholland.org>:
> [...]
> >> That function contains some unbounded loops, so it is possible to get > >> stuck. You could toggle a GPIO or something at the end of the function. > >> That would distinguish between "the secure monitor hung" and "we left > >> the DRAM controller in a bad state and hung when switching back to code > >> in DRAM" or even "we trashed the contents of DRAM". > >> > >> We do use the architectural timer inside sunxi_dram_dvfs_req(), but > >> those registers are banked between secure/non-secure states, so that > >> should not interfere with Linux's use of the timer. > >> > >> However, your test with offlining the other CPUs suggests we may really > >> need some synchronization. I would suggest doing this inside U-Boot as > >> well. You can send a SGI IPI to the other three CPUs and force them to > >> trap into the secure monitor. Not only will this be immediate, but it > >> will also ensure the other CPUs are running from SRAM during the > >> reclocking. You can take inspiration from the existing IPI code in psci.c. > >> > >> It is quite convenient to be truly in control, so you can do things > >> behind the OS's back, and keep it blissfully unaware. :) > >> > >> Regards, > >> Samuel > >> > >>> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638 > >>> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088 > >>> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168 > >>> > >>> P.S. sorry for duplicate, previous message declined by mlmmj > >>> > >>> > >>> чт, 29 дек. 2022 г. 
в 19:29, Samuel Holland <samuel@sholland.org>: > >>>> > >>>> Hi Kirill, > >>>> > >>>> On 12/28/22 07:10, Kirill wrote: > >>>>> I'm trying to use your driver with h3, but have this result: > >>>>> ``` > >>>>> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit > >>>>> DDRx with ODT > >>>>> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750 > >>>>> (6%) at 1248 MHz > >>>>> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750 > >>>>> (0%) at 1248 MHz > >>>>> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to > >>>>> 156 MHz, tREFI=19, tRFC=28, ODT=disabled > >>>>> ``` > >>>>> > >>>>> After this CPU hangs and not responding. > >>>>> Is possible (at least theoretically) to use this driver with H3? > >>>> > >>>> Yes, although it will need some help from firmware. If you look at the > >>>> vendor driver[1] (pick any random Allwinner 4.9 tree), you will see > >>>> there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3). > >>>> The driver always calls mdfs_main(), which is a standalone program > >>>> loaded to SRAM. The reason seems to be that the MDFS hardware is broken, > >>>> as you found out. > >>>> > >>>> Something like this standalone MDFS application is not upstreamable, but > >>>> conveniently we already have some firmware running from SRAM, namely > >>>> U-Boot's PSCI/secure monitor implementation. And Allwinner already has > >>>> some chips where they call a SMC to do this MDFS procedure[2]. So we can > >>>> reuse that SMC function ID, and put the code in in the secure monitor. > >>>> I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept. > >>>> > >>>> It works, though it did lock up once after playing with the devfreq > >>>> sysfs for several minutes. The contents of sunxi_dram_dvfs_req() are > >>>> just copied from the vendor driver; they could surely be improved. 
> >>>> > >>>> The U-Boot patch is based on my series adding Crust support for H3, so I > >>>> could have interactive peek/poke from the AR100 even when the DRAM > >>>> controller is dead. It shouldn't be too hard to rebase that out and move > >>>> the code to psci.c. > >>>> > >>>> I'm not sure the best way to upstream the changes to psci.S. Probably we > >>>> need some platform callback to handle unknown function IDs. > >>>> > >>>> Regards, > >>>> Samuel > >>>> > >>>> [1]: > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411 > >>>> [2]: > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38 > >>>> [3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq > >>>> [4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t > >>>> > >> > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
2023-01-02 2:20 ` Kirill
@ 2023-01-04 23:26 ` Kirill
2023-01-05 2:47 ` Kirill
0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-04 23:26 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

Hi Samuel!

Sadly, it does not work stably :(

I found a good way to stress-test the DDR DVFS process: just switching
between the max/min freq every 0.5s.
(Previously I tested with performance <-> simple_ondemand, before I
understood why the other governors were not available. And that wasn't
enough.)

My script for testing:
```
#!/bin/bash

while true
do
date
echo "powersave" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
sleep 0.5
echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
sleep 0.5
done
```

Test cases:
1. Without the sun8i_a33_mbus module: the system is stable, all works fine
I checked my setup:
- Every power source looks fine on the oscilloscope
- Kernel memtest (memtest=10) found no problems
- The userspace tool memtester found no problems

2. Single CPU + sun8i_a33_mbus + script: works fine for a long time

3. 2 or 4 CPUs: page fault! RAM seems to be getting corrupted...
```
[ 3890.960776] Unable to handle kernel paging request at virtual address 00044581
[ 3890.968017] [00044581] *pgd=00000000
[ 3890.971626] Internal error: Oops: 5 [#1] SMP THUMB2
```
(full report: https://gist.github.com/Azq2/b2875ee228d1e68922d7c1f4f4e3f3df)

I tried to implement the "CPU locking in monitor mode" using SGI, but the
page faults still occur. It is also too slow... I can see my input
lagging over ssh...
The code I used:
https://github.com/u-boot/u-boot/compare/master...Azq2:u-boot:new_test

Maybe I implemented the SGI incorrectly.

On Mon, Jan 2, 2023 at 04:20, Kirill <kirill.zhumarin@gmail.com> wrote:
>
> > Just to make sure -- is __gpio_debug declared with __secure?
>
> Yes
>
> > try calling psci_v7_flush_dcache_all() before setting PWRCTL.
>
> Hm...
I'm not sure, but it seems to be working.... My OPi Lite works > ~one hour without any hangs! > > Of course, I need more time and tests to completely confirm that. > I will continue testing. > > Great thanks!!! > > Also, I did some extra work: > - Rewrited code to use u-boot registers and constants > - implemented enabling/disabling ODT > - implemented enabling/disabling self-refresh > - implemented custom PSCI tables for platform-specific functions in u-boot > > And all of these also seem to be works. > My fork with changes: > https://github.com/Azq2/u-boot/commits/allwinner-h3-dram-dvfs > > пн, 2 янв. 2023 г. в 00:20, Samuel Holland <samuel@sholland.org>: > > > > On 1/1/23 15:40, Kirill wrote: > > > I did a little debugging. > > > When hangs happens, CPU is always stuck on `writel(reg_val, PWRCTL);` > > > > > > Example of my debug: > > > ``` > > > /* 1. enter self-refresh and disable all master access */ > > > reg_val = readl(PWRCTL); > > > reg_val |= (0x1<<0); > > > reg_val |= (0x1<<8); > > > __gpio_debug(2); > > > writel(reg_val, PWRCTL); > > > __gpio_debug(3); > > > __udelay(1); > > > __gpio_debug(4); > > > ``` > > > __gpio_debug should not take any effect on the process, just switching GPIO's. > > > > Just to make sure -- is __gpio_debug declared with __secure? > > > > > Before hanging it can be worked for a few minutes. Hangs occur at > > > random manners. > > > Also hangs not depend on count of freq change. > > > > Hmm, the CPU hanging right when blocking access to DRAM suggests there > > is some asynchronous cache operation (prefetch or writeback) that is > > causes some DRAM traffic. > > > > I checked the U-Boot source code, and it disables both caches in the > > secure copy of SCTLR before setting up PSCI*. So there should not be any > > prefetching occurring while you are in monitor mode. But maybe there is > > some writeback from cache lines dirtied in non-secure state. You could > > try calling psci_v7_flush_dcache_all() before setting PWRCTL. 
> > > > If that does not help, I am not sure what else could be happening. > > > > Regards, > > Samuel > > > > * boot_jump_linux() -> announce_and_cleanup() -> cleanup_before_linux() > > -> cleanup_before_linux_select() > > > > > I tried something like this for fastest bug reproduction: > > > ``` > > > while true > > > do > > > echo "simple_ondemand" > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > sleep 5 > > > echo "performance" > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > sleep 5 > > > done > > > ``` > > > > > > No effect, it still happens randomly :) > > > > > >> You can send a SGI IPI to the other three CPUs and force them to > > >> trap into the secure monitor. > > > > > > Thanks, I will try this later. > > > > > > P.S. Sorry for the duplicate, I forgot "Answer to All" in gmail :( > > > > > > сб, 31 дек. 2022 г. в 22:45, Samuel Holland <samuel@sholland.org>: > > >> > > >> Hi Kirill, > > >> > > >> On 12/31/22 05:15, Kirill wrote: > > >>> Hi! > > >>> > > >>> I ported your patches for armbian kernel 6.1 / u-boot and it works! > > >>> > > >>>> It works, though it did lock up once after playing with the devfreq > > >>>> sysfs for several minutes > > >>> > > >>> Yes, I have hangs too. And the main reason for this problem - SMP. :( > > >>> > > >>> By calling SMC we put only one CPU into SRAM. But other CPUs still > > >>> work and use DRAM! > > >>> I don't see any hangs, if I disable all other CPUs: > > >>> ``` > > >>> echo 0 > /sys/devices/system/cpu/cpu1/online > > >>> echo 0 > /sys/devices/system/cpu/cpu2/online > > >>> echo 0 > /sys/devices/system/cpu/cpu3/online > > >>> ``` > > >> > > >> Thanks for investigating. That's good information. > > >> > > >>> Original sunxi 3.4 kernel uses some dirty hack[2] to get around this problem. 
> > >>> > > >>> They call mdfs_pause_cpu[1] for each CPU core (except current) > > >>> This function located in the SRAM and locks CPU in infinity loop, > > >>> until `set_paused(false)` called on CPU0 > > >>> Also, legacy rockchip kernels use same hack[3]. > > >>> > > >>> But this method is not ideal... > > >>> Before changing DDR freq we must make sure which each kernel is stuck > > >>> on the SRAM function. But this is a very long process. > > >>> In my proof-of-concept implementation sometimes elapses *few seconds* > > >>> between I call smp_call_function and all cores stucking in SRAM > > >>> function. > > >>> This method may be suitable for manual freq change. But not for > > >>> automatic governor mode. No chance to change frequency on heavy loaded > > >>> CPUs. > > >>> > > >>> We need another way. > > >>> > > >>> I think, code on PSCI must stop/suspend any other working cpu cores > > >>> before update freq. This is possible? > > >> > > >> It _shouldn't_ be necessary. Setting bit 8 in PWRCTL blocks the DRAM > > >> controller's host interface, which should cause the L2 cache subsystem > > >> (and thus the other CPUs) to stall when trying to access DRAM. This is > > >> what the MDFS hardware does on A64/H5, and I have seen no hangs there. > > >> > > >> Possibly the issue is that such a stall sometimes affects the CPU that > > >> is running from SRAM, even though it should not. (On the other hand, > > >> when using the MDFS hardware, it is okay if all four CPUs temporarily > > >> stall at the same time.) > > >> > > >> One thing to check is if sunxi_dram_dvfs_req() completes successfully. > > >> That function contains some unbounded loops, so it is possible to get > > >> stuck. You could toggle a GPIO or something at the end of the function. > > >> That would distinguish between "the secure monitor hung" and "we left > > >> the DRAM controller in a bad state and hung when switching back to code > > >> in DRAM" or even "we trashed the contents of DRAM". 
> > >> > > >> We do use the architectural timer inside sunxi_dram_dvfs_req(), but > > >> those registers are banked between secure/non-secure states, so that > > >> should not interfere with Linux's use of the timer. > > >> > > >> However, your test with offlining the other CPUs suggests we may really > > >> need some synchronization. I would suggest doing this inside U-Boot as > > >> well. You can send a SGI IPI to the other three CPUs and force them to > > >> trap into the secure monitor. Not only will this be immediate, but it > > >> will also ensure the other CPUs are running from SRAM during the > > >> reclocking. You can take inspiration from the existing IPI code in psci.c. > > >> > > >> It is quite convenient to be truly in control, so you can do things > > >> behind the OS's back, and keep it blissfully unaware. :) > > >> > > >> Regards, > > >> Samuel > > >> > > >>> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638 > > >>> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088 > > >>> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168 > > >>> > > >>> P.S. sorry for duplicate, previous message declined by mlmmj > > >>> > > >>> > > >>> чт, 29 дек. 2022 г. 
в 19:29, Samuel Holland <samuel@sholland.org>: > > >>>> > > >>>> Hi Kirill, > > >>>> > > >>>> On 12/28/22 07:10, Kirill wrote: > > >>>>> I'm trying to use your driver with h3, but have this result: > > >>>>> ``` > > >>>>> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit > > >>>>> DDRx with ODT > > >>>>> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750 > > >>>>> (6%) at 1248 MHz > > >>>>> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750 > > >>>>> (0%) at 1248 MHz > > >>>>> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to > > >>>>> 156 MHz, tREFI=19, tRFC=28, ODT=disabled > > >>>>> ``` > > >>>>> > > >>>>> After this CPU hangs and not responding. > > >>>>> Is possible (at least theoretically) to use this driver with H3? > > >>>> > > >>>> Yes, although it will need some help from firmware. If you look at the > > >>>> vendor driver[1] (pick any random Allwinner 4.9 tree), you will see > > >>>> there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3). > > >>>> The driver always calls mdfs_main(), which is a standalone program > > >>>> loaded to SRAM. The reason seems to be that the MDFS hardware is broken, > > >>>> as you found out. > > >>>> > > >>>> Something like this standalone MDFS application is not upstreamable, but > > >>>> conveniently we already have some firmware running from SRAM, namely > > >>>> U-Boot's PSCI/secure monitor implementation. And Allwinner already has > > >>>> some chips where they call a SMC to do this MDFS procedure[2]. So we can > > >>>> reuse that SMC function ID, and put the code in in the secure monitor. > > >>>> I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept. > > >>>> > > >>>> It works, though it did lock up once after playing with the devfreq > > >>>> sysfs for several minutes. The contents of sunxi_dram_dvfs_req() are > > >>>> just copied from the vendor driver; they could surely be improved. 
> > >>>> > > >>>> The U-Boot patch is based on my series adding Crust support for H3, so I > > >>>> could have interactive peek/poke from the AR100 even when the DRAM > > >>>> controller is dead. It shouldn't be too hard to rebase that out and move > > >>>> the code to psci.c. > > >>>> > > >>>> I'm not sure the best way to upstream the changes to psci.S. Probably we > > >>>> need some platform callback to handle unknown function IDs. > > >>>> > > >>>> Regards, > > >>>> Samuel > > >>>> > > >>>> [1]: > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411 > > >>>> [2]: > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38 > > >>>> [3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq > > >>>> [4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t > > >>>> > > >> > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
2023-01-04 23:26 ` Kirill
@ 2023-01-05 2:47 ` Kirill
2023-01-05 20:57 ` Kirill
0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-05 2:47 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

Oh, sorry for the irrelevant report. I was really surprised when I got a
page fault even without the SMC call (:
Previously I was using the unstable Armbian kernel (6.1.1). On the stable
Armbian kernel (5.15.x86) the page fault does not reproduce in any of the
cases. It was just an Armbian bug...

I'll try to be more careful when preparing the test environment. :)

I will continue testing.

On Thu, Jan 5, 2023 at 01:26, Kirill <kirill.zhumarin@gmail.com> wrote:
>
> Hi Samel!
>
> Sadly, it does not work stable :(
>
> I found a good way to stress-test the ddr dvfs process. Just switching
> between max/min freq every 0.5s.
> (Previously I tested with performance <-> simple ondemand, before I
> understood why not present other governors. And it wasn't enough.)
>
> My script for testing:
> ```
> #!/bin/bash
>
> while true
> do
> date
> echo "powersave" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> sleep 0.5
> echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> sleep 0.5
> done
> ```
>
> Test cases:
> 1. Without module sun8i_a33_mbus: system stable, all works fine
> I checked my setup:
> - Every power source looks fine on oscilloscope
> - Kernel memtest (memtest=10) don't find any problem
> - Userspace tool memtester don't find any problem
>
> 2. Single CPU + sun8i_a33_mbus + script: Works fine for a long time
>
> 3. 2 or 4 CPUs: Page Fault! Seems to be RAM corrupting...
> ``` > [ 3890.960776] Unable to handle kernel paging request at virtual > address 00044581 > [ 3890.968017] [00044581] *pgd=00000000 > [ 3890.971626] Internal error: Oops: 5 [#1] SMP THUMB2 > ``` > (full report: https://gist.github.com/Azq2/b2875ee228d1e68922d7c1f4f4e3f3df) > > I tried to implement "CPU locking in monitor mode" using SGI. But > pagefaults still occur. > And, also, it is too slow... I see how my input hangs on ssh... > Code which I use: > https://github.com/u-boot/u-boot/compare/master...Azq2:u-boot:new_test > > Maybe I incorrectly implemented SGI. > > пн, 2 янв. 2023 г. в 04:20, Kirill <kirill.zhumarin@gmail.com>: > > > > > Just to make sure -- is __gpio_debug declared with __secure? > > > > Yes > > > > > try calling psci_v7_flush_dcache_all() before setting PWRCTL. > > > > Hm... I'm not sure, but it seems to be working.... My OPi Lite works > > ~one hour without any hangs! > > > > Of course, I need more time and tests to completely confirm that. > > I will continue testing. > > > > Great thanks!!! > > > > Also, I did some extra work: > > - Rewrited code to use u-boot registers and constants > > - implemented enabling/disabling ODT > > - implemented enabling/disabling self-refresh > > - implemented custom PSCI tables for platform-specific functions in u-boot > > > > And all of these also seem to be works. > > My fork with changes: > > https://github.com/Azq2/u-boot/commits/allwinner-h3-dram-dvfs > > > > пн, 2 янв. 2023 г. в 00:20, Samuel Holland <samuel@sholland.org>: > > > > > > On 1/1/23 15:40, Kirill wrote: > > > > I did a little debugging. > > > > When hangs happens, CPU is always stuck on `writel(reg_val, PWRCTL);` > > > > > > > > Example of my debug: > > > > ``` > > > > /* 1. 
enter self-refresh and disable all master access */ > > > > reg_val = readl(PWRCTL); > > > > reg_val |= (0x1<<0); > > > > reg_val |= (0x1<<8); > > > > __gpio_debug(2); > > > > writel(reg_val, PWRCTL); > > > > __gpio_debug(3); > > > > __udelay(1); > > > > __gpio_debug(4); > > > > ``` > > > > __gpio_debug should not take any effect on the process, just switching GPIO's. > > > > > > Just to make sure -- is __gpio_debug declared with __secure? > > > > > > > Before hanging it can be worked for a few minutes. Hangs occur at > > > > random manners. > > > > Also hangs not depend on count of freq change. > > > > > > Hmm, the CPU hanging right when blocking access to DRAM suggests there > > > is some asynchronous cache operation (prefetch or writeback) that is > > > causes some DRAM traffic. > > > > > > I checked the U-Boot source code, and it disables both caches in the > > > secure copy of SCTLR before setting up PSCI*. So there should not be any > > > prefetching occurring while you are in monitor mode. But maybe there is > > > some writeback from cache lines dirtied in non-secure state. You could > > > try calling psci_v7_flush_dcache_all() before setting PWRCTL. > > > > > > If that does not help, I am not sure what else could be happening. 
> > > > > > Regards, > > > Samuel > > > > > > * boot_jump_linux() -> announce_and_cleanup() -> cleanup_before_linux() > > > -> cleanup_before_linux_select() > > > > > > > I tried something like this for fastest bug reproduction: > > > > ``` > > > > while true > > > > do > > > > echo "simple_ondemand" > > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > > sleep 5 > > > > echo "performance" > > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > > sleep 5 > > > > done > > > > ``` > > > > > > > > No effect, it still happens randomly :) > > > > > > > >> You can send a SGI IPI to the other three CPUs and force them to > > > >> trap into the secure monitor. > > > > > > > > Thanks, I will try this later. > > > > > > > > P.S. Sorry for the duplicate, I forgot "Answer to All" in gmail :( > > > > > > > > сб, 31 дек. 2022 г. в 22:45, Samuel Holland <samuel@sholland.org>: > > > >> > > > >> Hi Kirill, > > > >> > > > >> On 12/31/22 05:15, Kirill wrote: > > > >>> Hi! > > > >>> > > > >>> I ported your patches for armbian kernel 6.1 / u-boot and it works! > > > >>> > > > >>>> It works, though it did lock up once after playing with the devfreq > > > >>>> sysfs for several minutes > > > >>> > > > >>> Yes, I have hangs too. And the main reason for this problem - SMP. :( > > > >>> > > > >>> By calling SMC we put only one CPU into SRAM. But other CPUs still > > > >>> work and use DRAM! > > > >>> I don't see any hangs, if I disable all other CPUs: > > > >>> ``` > > > >>> echo 0 > /sys/devices/system/cpu/cpu1/online > > > >>> echo 0 > /sys/devices/system/cpu/cpu2/online > > > >>> echo 0 > /sys/devices/system/cpu/cpu3/online > > > >>> ``` > > > >> > > > >> Thanks for investigating. That's good information. > > > >> > > > >>> Original sunxi 3.4 kernel uses some dirty hack[2] to get around this problem. 
> > > >>> > > > >>> They call mdfs_pause_cpu[1] for each CPU core (except current) > > > >>> This function located in the SRAM and locks CPU in infinity loop, > > > >>> until `set_paused(false)` called on CPU0 > > > >>> Also, legacy rockchip kernels use same hack[3]. > > > >>> > > > >>> But this method is not ideal... > > > >>> Before changing DDR freq we must make sure which each kernel is stuck > > > >>> on the SRAM function. But this is a very long process. > > > >>> In my proof-of-concept implementation sometimes elapses *few seconds* > > > >>> between I call smp_call_function and all cores stucking in SRAM > > > >>> function. > > > >>> This method may be suitable for manual freq change. But not for > > > >>> automatic governor mode. No chance to change frequency on heavy loaded > > > >>> CPUs. > > > >>> > > > >>> We need another way. > > > >>> > > > >>> I think, code on PSCI must stop/suspend any other working cpu cores > > > >>> before update freq. This is possible? > > > >> > > > >> It _shouldn't_ be necessary. Setting bit 8 in PWRCTL blocks the DRAM > > > >> controller's host interface, which should cause the L2 cache subsystem > > > >> (and thus the other CPUs) to stall when trying to access DRAM. This is > > > >> what the MDFS hardware does on A64/H5, and I have seen no hangs there. > > > >> > > > >> Possibly the issue is that such a stall sometimes affects the CPU that > > > >> is running from SRAM, even though it should not. (On the other hand, > > > >> when using the MDFS hardware, it is okay if all four CPUs temporarily > > > >> stall at the same time.) > > > >> > > > >> One thing to check is if sunxi_dram_dvfs_req() completes successfully. > > > >> That function contains some unbounded loops, so it is possible to get > > > >> stuck. You could toggle a GPIO or something at the end of the function. 
> > > >> That would distinguish between "the secure monitor hung" and "we left > > > >> the DRAM controller in a bad state and hung when switching back to code > > > >> in DRAM" or even "we trashed the contents of DRAM". > > > >> > > > >> We do use the architectural timer inside sunxi_dram_dvfs_req(), but > > > >> those registers are banked between secure/non-secure states, so that > > > >> should not interfere with Linux's use of the timer. > > > >> > > > >> However, your test with offlining the other CPUs suggests we may really > > > >> need some synchronization. I would suggest doing this inside U-Boot as > > > >> well. You can send a SGI IPI to the other three CPUs and force them to > > > >> trap into the secure monitor. Not only will this be immediate, but it > > > >> will also ensure the other CPUs are running from SRAM during the > > > >> reclocking. You can take inspiration from the existing IPI code in psci.c. > > > >> > > > >> It is quite convenient to be truly in control, so you can do things > > > >> behind the OS's back, and keep it blissfully unaware. :) > > > >> > > > >> Regards, > > > >> Samuel > > > >> > > > >>> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638 > > > >>> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088 > > > >>> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168 > > > >>> > > > >>> P.S. sorry for duplicate, previous message declined by mlmmj > > > >>> > > > >>> > > > >>> чт, 29 дек. 2022 г. 
в 19:29, Samuel Holland <samuel@sholland.org>: > > > >>>> > > > >>>> Hi Kirill, > > > >>>> > > > >>>> On 12/28/22 07:10, Kirill wrote: > > > >>>>> I'm trying to use your driver with h3, but have this result: > > > >>>>> ``` > > > >>>>> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit > > > >>>>> DDRx with ODT > > > >>>>> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750 > > > >>>>> (6%) at 1248 MHz > > > >>>>> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750 > > > >>>>> (0%) at 1248 MHz > > > >>>>> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to > > > >>>>> 156 MHz, tREFI=19, tRFC=28, ODT=disabled > > > >>>>> ``` > > > >>>>> > > > >>>>> After this CPU hangs and not responding. > > > >>>>> Is possible (at least theoretically) to use this driver with H3? > > > >>>> > > > >>>> Yes, although it will need some help from firmware. If you look at the > > > >>>> vendor driver[1] (pick any random Allwinner 4.9 tree), you will see > > > >>>> there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3). > > > >>>> The driver always calls mdfs_main(), which is a standalone program > > > >>>> loaded to SRAM. The reason seems to be that the MDFS hardware is broken, > > > >>>> as you found out. > > > >>>> > > > >>>> Something like this standalone MDFS application is not upstreamable, but > > > >>>> conveniently we already have some firmware running from SRAM, namely > > > >>>> U-Boot's PSCI/secure monitor implementation. And Allwinner already has > > > >>>> some chips where they call a SMC to do this MDFS procedure[2]. So we can > > > >>>> reuse that SMC function ID, and put the code in in the secure monitor. > > > >>>> I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept. > > > >>>> > > > >>>> It works, though it did lock up once after playing with the devfreq > > > >>>> sysfs for several minutes. 
The contents of sunxi_dram_dvfs_req() are > > > >>>> just copied from the vendor driver; they could surely be improved. > > > >>>> > > > >>>> The U-Boot patch is based on my series adding Crust support for H3, so I > > > >>>> could have interactive peek/poke from the AR100 even when the DRAM > > > >>>> controller is dead. It shouldn't be too hard to rebase that out and move > > > >>>> the code to psci.c. > > > >>>> > > > >>>> I'm not sure the best way to upstream the changes to psci.S. Probably we > > > >>>> need some platform callback to handle unknown function IDs. > > > >>>> > > > >>>> Regards, > > > >>>> Samuel > > > >>>> > > > >>>> [1]: > > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411 > > > >>>> [2]: > > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38 > > > >>>> [3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq > > > >>>> [4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t > > > >>>> > > > >> > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
  2023-01-05  2:47 ` Kirill
@ 2023-01-05 20:57   ` Kirill
  2023-01-06  1:18     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-05 20:57 UTC (permalink / raw)
  To: Samuel Holland; +Cc: linux-sunxi

Hi, Samuel!

Now I can confidently say: psci_v7_flush_dcache_all() only partially helps.
With lightly loaded RAM it can run for a long time, but usually after a few
hours it still hangs on PWRCTL.

Previously this bug was hard to test because of its random nature, but now
I have found a method that reproduces the problem 100% of the time.

First ssh:
```
#!/bin/bash

while true
do
    date
    echo "powersave" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
    sleep 0.5
    echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
    sleep 0.5
done
```

Second ssh:
```
$ sudo memtester 300M
```

With 2-4 CPUs online: it hangs on PWRCTL after every run of the `memtester`
tool (within a few seconds).
With 1 CPU online: it never hangs and works fine.

On Thu, Jan 5, 2023 at 4:47 AM Kirill <kirill.zhumarin@gmail.com> wrote:
>
> Oh, sorry for the irrelevant report. I was really surprised when I got a
> page fault even without the SMC call (:
> Previously I used the Armbian unstable kernel (6.1.1). On the stable
> Armbian kernel (5.15.x86) the page fault is not reproduced at all.
> It's just Armbian's bugs...
>
> I'll try to be more careful when preparing the test environment. :)
>
> I will continue testing.
>
> On Thu, Jan 5, 2023 at 01:26, Kirill <kirill.zhumarin@gmail.com> wrote:
> >
> > Hi Samuel!
> >
> > Sadly, it does not work stably :(
> >
> > I found a good way to stress-test the DDR DVFS process: just switching
> > between max/min frequency every 0.5s.
> > (Previously I tested with performance <-> simple_ondemand, before I
> > understood why the other governors are not present. And it wasn't enough.)

> >
> > My script for testing:
> > ```
> > #!/bin/bash
> >
> > while true
> > do
> >     date
> >     echo "powersave" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> >     sleep 0.5
> >     echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> >     sleep 0.5
> > done
> > ```
> >
> > Test cases:
> > 1. Without the sun8i_a33_mbus module: the system is stable, everything works fine.
> > I checked my setup:
> > - Every power source looks fine on the oscilloscope
> > - The kernel memtest (memtest=10) doesn't find any problems
> > - The userspace tool memtester doesn't find any problems
> >
> > 2. Single CPU + sun8i_a33_mbus + script: works fine for a long time.
> >
> > 3. 2 or 4 CPUs: page fault! RAM seems to be getting corrupted...
> > ```
> > [ 3890.960776] Unable to handle kernel paging request at virtual
> > address 00044581
> > [ 3890.968017] [00044581] *pgd=00000000
> > [ 3890.971626] Internal error: Oops: 5 [#1] SMP THUMB2
> > ```
> > (full report: https://gist.github.com/Azq2/b2875ee228d1e68922d7c1f4f4e3f3df)
> >
> > I tried to implement "CPU locking in monitor mode" using SGI, but
> > page faults still occur.
> > It is also too slow... I can see my input lagging over ssh...
> > The code I use:
> > https://github.com/u-boot/u-boot/compare/master...Azq2:u-boot:new_test
> >
> > Maybe I implemented SGI incorrectly.
> >
> > On Mon, Jan 2, 2023 at 04:20, Kirill <kirill.zhumarin@gmail.com> wrote:
> > >
> > > > Just to make sure -- is __gpio_debug declared with __secure?
> > >
> > > Yes
> > >
> > > > try calling psci_v7_flush_dcache_all() before setting PWRCTL.
> > >
> > > Hm... I'm not sure, but it seems to be working... My OPi Lite has worked
> > > ~one hour without any hangs!
> > >
> > > Of course, I need more time and tests to confirm that completely.
> > > I will continue testing.
> > >
> > > Great thanks!!!
> > >
> > > Also, I did some extra work:
> > > - Rewrote the code to use U-Boot registers and constants
> > > - Implemented enabling/disabling ODT
> > > - Implemented enabling/disabling self-refresh
> > > - Implemented custom PSCI tables for platform-specific functions in U-Boot
> > >
> > > And all of these also seem to work.
> > > My fork with the changes:
> > > https://github.com/Azq2/u-boot/commits/allwinner-h3-dram-dvfs
> > >
> > > On Mon, Jan 2, 2023 at 00:20, Samuel Holland <samuel@sholland.org> wrote:
> > > >
> > > > On 1/1/23 15:40, Kirill wrote:
> > > > > I did a little debugging.
> > > > > When the hang happens, the CPU is always stuck on `writel(reg_val, PWRCTL);`
> > > > >
> > > > > Example of my debugging:
> > > > > ```
> > > > > /* 1. enter self-refresh and disable all master access */
> > > > > reg_val = readl(PWRCTL);
> > > > > reg_val |= (0x1<<0);
> > > > > reg_val |= (0x1<<8);
> > > > > __gpio_debug(2);
> > > > > writel(reg_val, PWRCTL);
> > > > > __gpio_debug(3);
> > > > > __udelay(1);
> > > > > __gpio_debug(4);
> > > > > ```
> > > > > __gpio_debug should not have any effect on the process; it just toggles GPIOs.
> > > >
> > > > Just to make sure -- is __gpio_debug declared with __secure?
> > > >
> > > > > Before hanging, it can work for a few minutes. The hangs occur in a
> > > > > random manner.
> > > > > Also, the hangs do not depend on the number of frequency changes.
> > > >
> > > > Hmm, the CPU hanging right when blocking access to DRAM suggests there
> > > > is some asynchronous cache operation (prefetch or writeback) that
> > > > causes some DRAM traffic.
> > > >
> > > > I checked the U-Boot source code, and it disables both caches in the
> > > > secure copy of SCTLR before setting up PSCI*. So there should not be any
> > > > prefetching occurring while you are in monitor mode. But maybe there is
> > > > some writeback from cache lines dirtied in non-secure state. You could
> > > > try calling psci_v7_flush_dcache_all() before setting PWRCTL.
> > > > > > > > If that does not help, I am not sure what else could be happening. > > > > > > > > Regards, > > > > Samuel > > > > > > > > * boot_jump_linux() -> announce_and_cleanup() -> cleanup_before_linux() > > > > -> cleanup_before_linux_select() > > > > > > > > > I tried something like this for fastest bug reproduction: > > > > > ``` > > > > > while true > > > > > do > > > > > echo "simple_ondemand" > > > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > > > sleep 5 > > > > > echo "performance" > > > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > > > sleep 5 > > > > > done > > > > > ``` > > > > > > > > > > No effect, it still happens randomly :) > > > > > > > > > >> You can send a SGI IPI to the other three CPUs and force them to > > > > >> trap into the secure monitor. > > > > > > > > > > Thanks, I will try this later. > > > > > > > > > > P.S. Sorry for the duplicate, I forgot "Answer to All" in gmail :( > > > > > > > > > > сб, 31 дек. 2022 г. в 22:45, Samuel Holland <samuel@sholland.org>: > > > > >> > > > > >> Hi Kirill, > > > > >> > > > > >> On 12/31/22 05:15, Kirill wrote: > > > > >>> Hi! > > > > >>> > > > > >>> I ported your patches for armbian kernel 6.1 / u-boot and it works! > > > > >>> > > > > >>>> It works, though it did lock up once after playing with the devfreq > > > > >>>> sysfs for several minutes > > > > >>> > > > > >>> Yes, I have hangs too. And the main reason for this problem - SMP. :( > > > > >>> > > > > >>> By calling SMC we put only one CPU into SRAM. But other CPUs still > > > > >>> work and use DRAM! > > > > >>> I don't see any hangs, if I disable all other CPUs: > > > > >>> ``` > > > > >>> echo 0 > /sys/devices/system/cpu/cpu1/online > > > > >>> echo 0 > /sys/devices/system/cpu/cpu2/online > > > > >>> echo 0 > /sys/devices/system/cpu/cpu3/online > > > > >>> ``` > > > > >> > > > > >> Thanks for investigating. 
That's good information. > > > > >> > > > > >>> Original sunxi 3.4 kernel uses some dirty hack[2] to get around this problem. > > > > >>> > > > > >>> They call mdfs_pause_cpu[1] for each CPU core (except current) > > > > >>> This function located in the SRAM and locks CPU in infinity loop, > > > > >>> until `set_paused(false)` called on CPU0 > > > > >>> Also, legacy rockchip kernels use same hack[3]. > > > > >>> > > > > >>> But this method is not ideal... > > > > >>> Before changing DDR freq we must make sure which each kernel is stuck > > > > >>> on the SRAM function. But this is a very long process. > > > > >>> In my proof-of-concept implementation sometimes elapses *few seconds* > > > > >>> between I call smp_call_function and all cores stucking in SRAM > > > > >>> function. > > > > >>> This method may be suitable for manual freq change. But not for > > > > >>> automatic governor mode. No chance to change frequency on heavy loaded > > > > >>> CPUs. > > > > >>> > > > > >>> We need another way. > > > > >>> > > > > >>> I think, code on PSCI must stop/suspend any other working cpu cores > > > > >>> before update freq. This is possible? > > > > >> > > > > >> It _shouldn't_ be necessary. Setting bit 8 in PWRCTL blocks the DRAM > > > > >> controller's host interface, which should cause the L2 cache subsystem > > > > >> (and thus the other CPUs) to stall when trying to access DRAM. This is > > > > >> what the MDFS hardware does on A64/H5, and I have seen no hangs there. > > > > >> > > > > >> Possibly the issue is that such a stall sometimes affects the CPU that > > > > >> is running from SRAM, even though it should not. (On the other hand, > > > > >> when using the MDFS hardware, it is okay if all four CPUs temporarily > > > > >> stall at the same time.) > > > > >> > > > > >> One thing to check is if sunxi_dram_dvfs_req() completes successfully. > > > > >> That function contains some unbounded loops, so it is possible to get > > > > >> stuck. 
You could toggle a GPIO or something at the end of the function. > > > > >> That would distinguish between "the secure monitor hung" and "we left > > > > >> the DRAM controller in a bad state and hung when switching back to code > > > > >> in DRAM" or even "we trashed the contents of DRAM". > > > > >> > > > > >> We do use the architectural timer inside sunxi_dram_dvfs_req(), but > > > > >> those registers are banked between secure/non-secure states, so that > > > > >> should not interfere with Linux's use of the timer. > > > > >> > > > > >> However, your test with offlining the other CPUs suggests we may really > > > > >> need some synchronization. I would suggest doing this inside U-Boot as > > > > >> well. You can send a SGI IPI to the other three CPUs and force them to > > > > >> trap into the secure monitor. Not only will this be immediate, but it > > > > >> will also ensure the other CPUs are running from SRAM during the > > > > >> reclocking. You can take inspiration from the existing IPI code in psci.c. > > > > >> > > > > >> It is quite convenient to be truly in control, so you can do things > > > > >> behind the OS's back, and keep it blissfully unaware. :) > > > > >> > > > > >> Regards, > > > > >> Samuel > > > > >> > > > > >>> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638 > > > > >>> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088 > > > > >>> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168 > > > > >>> > > > > >>> P.S. sorry for duplicate, previous message declined by mlmmj > > > > >>> > > > > >>> > > > > >>> чт, 29 дек. 2022 г. 
в 19:29, Samuel Holland <samuel@sholland.org>: > > > > >>>> > > > > >>>> Hi Kirill, > > > > >>>> > > > > >>>> On 12/28/22 07:10, Kirill wrote: > > > > >>>>> I'm trying to use your driver with h3, but have this result: > > > > >>>>> ``` > > > > >>>>> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit > > > > >>>>> DDRx with ODT > > > > >>>>> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750 > > > > >>>>> (6%) at 1248 MHz > > > > >>>>> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750 > > > > >>>>> (0%) at 1248 MHz > > > > >>>>> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to > > > > >>>>> 156 MHz, tREFI=19, tRFC=28, ODT=disabled > > > > >>>>> ``` > > > > >>>>> > > > > >>>>> After this CPU hangs and not responding. > > > > >>>>> Is possible (at least theoretically) to use this driver with H3? > > > > >>>> > > > > >>>> Yes, although it will need some help from firmware. If you look at the > > > > >>>> vendor driver[1] (pick any random Allwinner 4.9 tree), you will see > > > > >>>> there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3). > > > > >>>> The driver always calls mdfs_main(), which is a standalone program > > > > >>>> loaded to SRAM. The reason seems to be that the MDFS hardware is broken, > > > > >>>> as you found out. > > > > >>>> > > > > >>>> Something like this standalone MDFS application is not upstreamable, but > > > > >>>> conveniently we already have some firmware running from SRAM, namely > > > > >>>> U-Boot's PSCI/secure monitor implementation. And Allwinner already has > > > > >>>> some chips where they call a SMC to do this MDFS procedure[2]. So we can > > > > >>>> reuse that SMC function ID, and put the code in in the secure monitor. > > > > >>>> I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept. > > > > >>>> > > > > >>>> It works, though it did lock up once after playing with the devfreq > > > > >>>> sysfs for several minutes. 
The contents of sunxi_dram_dvfs_req() are > > > > >>>> just copied from the vendor driver; they could surely be improved. > > > > >>>> > > > > >>>> The U-Boot patch is based on my series adding Crust support for H3, so I > > > > >>>> could have interactive peek/poke from the AR100 even when the DRAM > > > > >>>> controller is dead. It shouldn't be too hard to rebase that out and move > > > > >>>> the code to psci.c. > > > > >>>> > > > > >>>> I'm not sure the best way to upstream the changes to psci.S. Probably we > > > > >>>> need some platform callback to handle unknown function IDs. > > > > >>>> > > > > >>>> Regards, > > > > >>>> Samuel > > > > >>>> > > > > >>>> [1]: > > > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411 > > > > >>>> [2]: > > > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38 > > > > >>>> [3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq > > > > >>>> [4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t > > > > >>>> > > > > >> > > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
  2023-01-05 20:57 ` Kirill
@ 2023-01-06  1:18   ` Kirill
  2023-01-07 16:40     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-06  1:18 UTC (permalink / raw)
  To: Samuel Holland; +Cc: linux-sunxi

In addition to my previous message... I think we must make sure this bug
is really caused by SMP.

I did some experiments with parking all CPUs in SRAM. I tried the following:

1. A custom SGI. But I got random page faults. I don't know what is going
wrong, but I will try to debug it in the future.
Example code:
https://github.com/u-boot/u-boot/compare/master...Azq2:u-boot:new_test

2. Disabling the CPUs before the frequency change and enabling them after.
I tried the kernel API `cpu_remove()` / `cpu_add()` and a simple bash
script using the userspace API `/sys/devices/system/cpu/cpuX/online`.
But I get random page faults if I disable and enable the CPUs too
frequently, even without sun8i_a33_mbus at all. It seems CPU hotplug is
broken on both the 5.15 and 6.1 kernels. %) Perhaps this is related to the
SGI page faults. Of course, this is off-topic for this thread. Just a sad
fact. :(

3. Locking using SMC calls and smp_call_function.
Locking code in psci.c:
https://gist.github.com/Azq2/5c2bf3855f9546aeb63b57f5aa042621#file-psci-c-L378
Changes to sun8i-a33-mbus.c:
https://gist.github.com/Azq2/54713f2aa68c89bda0c8792f6b760908

The main idea is to park all secondary cores in their SMC handlers. Yes,
this is a totally bad and strange implementation, but it is only an
experiment. It is similar to the original Allwinner driver.
I have tested this for some time and have not had any hangs on PWRCTL,
even with memtester.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
  2023-01-06  1:18 ` Kirill
@ 2023-01-07 16:40   ` Kirill
  0 siblings, 0 replies; 11+ messages in thread
From: Kirill @ 2023-01-07 16:40 UTC (permalink / raw)
  To: Samuel Holland; +Cc: linux-sunxi

Now I use something like this. A "not great, not terrible" solution.
But, of course, not upstreamable. :)

https://github.com/Azq2/armbian_build/blob/h3-ddr-dvfs-patch/patch/kernel/archive/sunxi-5.15/patches.armbian/PM-devfreq-Fix-h3-support-to-MBUS-driver.patch
https://github.com/Azq2/armbian_build/blob/h3-ddr-dvfs-patch/patch/u-boot/u-boot-sunxi/allwinner-h3-ddr-dvfs-in-psci.patch

^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-01-07 16:40 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <CAKAF0m9DqPjB6C39ZbrRHFrJOodm7WQGTL0x1jduQjNU=JpQ2g@mail.gmail.com>
2022-12-29 17:29 ` ddr devfreq on H3 - possible? Samuel Holland
2022-12-31 11:15 ` Kirill
2022-12-31 20:45 ` Samuel Holland
2023-01-01 21:40 ` Kirill
2023-01-01 22:20 ` Samuel Holland
2023-01-02 2:20 ` Kirill
2023-01-04 23:26 ` Kirill
2023-01-05 2:47 ` Kirill
2023-01-05 20:57 ` Kirill
2023-01-06 1:18 ` Kirill
2023-01-07 16:40 ` Kirill