* Re: ddr devfreq on H3 - possible?
       [not found] <CAKAF0m9DqPjB6C39ZbrRHFrJOodm7WQGTL0x1jduQjNU=JpQ2g@mail.gmail.com>
@ 2022-12-29 17:29 ` Samuel Holland
  2022-12-31 11:15   ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Samuel Holland @ 2022-12-29 17:29 UTC (permalink / raw)
To: linux-sunxi; +Cc: Kirill

Hi Kirill,

On 12/28/22 07:10, Kirill wrote:
> I'm trying to use your driver with the H3, but get this result:
> ```
> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit
> DDRx with ODT
> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750
> (6%) at 1248 MHz
> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750
> (0%) at 1248 MHz
> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to
> 156 MHz, tREFI=19, tRFC=28, ODT=disabled
> ```
>
> After this, the CPU hangs and stops responding.
> Is it possible (at least theoretically) to use this driver with the H3?

Yes, although it will need some help from firmware. If you look at the
vendor driver[1] (pick any random Allwinner 4.9 tree), you will see
there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3).
The driver always calls mdfs_main(), which is a standalone program
loaded to SRAM. The reason seems to be that the MDFS hardware is broken,
as you found out.

Something like this standalone MDFS application is not upstreamable, but
conveniently we already have some firmware running from SRAM, namely
U-Boot's PSCI/secure monitor implementation. And Allwinner already has
some chips where they call an SMC to do this MDFS procedure[2]. So we can
reuse that SMC function ID and put the code in the secure monitor.
I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept.
It works, though it did lock up once after playing with the devfreq
sysfs for several minutes. The contents of sunxi_dram_dvfs_req() are
just copied from the vendor driver; they could surely be improved.
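[Editor's note: the "reuse an SMC function ID and handle it in the secure monitor" idea above can be sketched as a plain C dispatch model. The function ID value, the handler names, and the PSCI version returned are hypothetical placeholders for illustration, not the actual values from Allwinner's sunxi-mdfs.h or the patches.]

```c
#include <assert.h>
#include <stdint.h>

/* Standard SMCCC "unknown function" return value */
#define SMCCC_RET_NOT_SUPPORTED  (-1)

/* Hypothetical SiP function ID for the DRAM DVFS request (placeholder,
 * not the ID from Allwinner's sunxi-mdfs.h) */
#define SMC_SIP_DRAM_DVFS  0x82000008u

/* Platform hook: handles function IDs the generic PSCI code doesn't know */
static int32_t platform_smc_handler(uint32_t fid, uint32_t freq)
{
    if (fid == SMC_SIP_DRAM_DVFS) {
        /* a real monitor would run sunxi_dram_dvfs_req(freq) from SRAM */
        (void)freq;
        return 0;
    }
    return SMCCC_RET_NOT_SUPPORTED;
}

/* Generic monitor dispatch: known standard IDs first, then the platform hook */
static int32_t monitor_dispatch(uint32_t fid, uint32_t arg)
{
    switch (fid) {
    case 0x84000000u:           /* PSCI_VERSION */
        return 0x00010001;      /* PSCI v1.1 */
    default:
        return platform_smc_handler(fid, arg);
    }
}
```

Routing unknown IDs through a default case like this is one way the "platform callback for unknown function IDs" mentioned later in this message could look.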
The U-Boot patch is based on my series adding Crust support for H3, so I
could have interactive peek/poke from the AR100 even when the DRAM
controller is dead. It shouldn't be too hard to rebase that out and move
the code to psci.c.

I'm not sure of the best way to upstream the changes to psci.S. Probably
we need some platform callback to handle unknown function IDs.

Regards,
Samuel

[1]: https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411
[2]: https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38
[3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq
[4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
  2022-12-29 17:29 ` ddr devfreq on H3 - possible? Samuel Holland
@ 2022-12-31 11:15   ` Kirill
  2022-12-31 20:45     ` Samuel Holland
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2022-12-31 11:15 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

Hi!

I ported your patches to the Armbian 6.1 kernel / U-Boot, and it works!

> It works, though it did lock up once after playing with the devfreq
> sysfs for several minutes

Yes, I see hangs too. And the main reason for the problem is SMP. :(

By calling the SMC we put only one CPU into SRAM, but the other CPUs
keep running and using DRAM! I don't see any hangs if I disable all
the other CPUs:
```
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online
```

The original sunxi 3.4 kernel uses a dirty hack[2] to get around this
problem: it calls mdfs_pause_cpu()[1] on each CPU core (except the
current one). That function is located in SRAM and locks the CPU in an
infinite loop until `set_paused(false)` is called on CPU0. Legacy
Rockchip kernels use the same hack[3].

But this method is not ideal. Before changing the DDR frequency, we
must make sure that every core is stuck in the SRAM function, and that
can take a long time. In my proof-of-concept implementation, sometimes
a few seconds elapse between calling smp_call_function() and all cores
spinning in the SRAM function. This method may be suitable for manual
frequency changes, but not for an automatic governor mode; there is no
chance to change the frequency on heavily loaded CPUs.

We need another way. I think the code in PSCI must stop/suspend all the
other running CPU cores before updating the frequency. Is this possible?
[1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638
[2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088
[3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168

P.S. sorry for the duplicate, the previous message was declined by mlmmj

Thu, 29 Dec 2022 at 19:29, Samuel Holland <samuel@sholland.org>:
> [...]
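[Editor's note: the mdfs_pause_cpu() scheme described above (parking every other core in a spin loop until CPU0 releases it) can be modelled with ordinary threads. This is a minimal sketch of the synchronization pattern only, not the vendor code; the names and the three-core count mirror the thread's description.]

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_bool paused;        /* analogue of the driver's set_paused() flag */
static atomic_int  cores_parked;  /* how many secondary "CPUs" reached the loop */

/* Analogue of mdfs_pause_cpu(): spin until CPU0 clears the flag */
static void *pause_cpu(void *arg)
{
    (void)arg;
    atomic_fetch_add(&cores_parked, 1);
    while (atomic_load(&paused))
        ;  /* the core sits here, generating no DRAM traffic */
    return NULL;
}

/* CPU0's side: park the other three cores, reclock, then release them */
static int reclock_with_paused_cpus(void)
{
    pthread_t cpus[3];

    atomic_store(&paused, 1);
    for (int i = 0; i < 3; i++)
        pthread_create(&cpus[i], NULL, pause_cpu, NULL);

    /* the slow part Kirill measured: waiting until every core is parked */
    while (atomic_load(&cores_parked) < 3)
        ;
    /* ...change the DRAM frequency here... */
    atomic_store(&paused, 0);

    for (int i = 0; i < 3; i++)
        pthread_join(cpus[i], NULL);
    return atomic_load(&cores_parked);
}
```

The "waiting until every core is parked" loop is exactly the latency problem raised in the message: with loaded CPUs, reaching the parked state can take seconds.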
* Re: ddr devfreq on H3 - possible?
  2022-12-31 11:15 ` Kirill
@ 2022-12-31 20:45   ` Samuel Holland
  2023-01-01 21:40     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Samuel Holland @ 2022-12-31 20:45 UTC (permalink / raw)
To: Kirill; +Cc: linux-sunxi

Hi Kirill,

On 12/31/22 05:15, Kirill wrote:
> Hi!
>
> I ported your patches to the Armbian 6.1 kernel / U-Boot, and it works!
>
>> It works, though it did lock up once after playing with the devfreq
>> sysfs for several minutes
>
> Yes, I see hangs too. And the main reason for the problem is SMP. :(
>
> By calling the SMC we put only one CPU into SRAM, but the other CPUs
> keep running and using DRAM! I don't see any hangs if I disable all
> the other CPUs:
> ```
> echo 0 > /sys/devices/system/cpu/cpu1/online
> echo 0 > /sys/devices/system/cpu/cpu2/online
> echo 0 > /sys/devices/system/cpu/cpu3/online
> ```

Thanks for investigating. That's good information.

> The original sunxi 3.4 kernel uses a dirty hack[2] to get around this
> problem: it calls mdfs_pause_cpu()[1] on each CPU core (except the
> current one). That function is located in SRAM and locks the CPU in an
> infinite loop until `set_paused(false)` is called on CPU0. Legacy
> Rockchip kernels use the same hack[3].
>
> But this method is not ideal. Before changing the DDR frequency, we
> must make sure that every core is stuck in the SRAM function, and that
> can take a long time. In my proof-of-concept implementation, sometimes
> a few seconds elapse between calling smp_call_function() and all cores
> spinning in the SRAM function. This method may be suitable for manual
> frequency changes, but not for an automatic governor mode; there is no
> chance to change the frequency on heavily loaded CPUs.
>
> We need another way.
>
> I think the code in PSCI must stop/suspend all the other running CPU
> cores before updating the frequency. Is this possible?

It _shouldn't_ be necessary.
Setting bit 8 in PWRCTL blocks the DRAM
controller's host interface, which should cause the L2 cache subsystem
(and thus the other CPUs) to stall when trying to access DRAM. This is
what the MDFS hardware does on A64/H5, and I have seen no hangs there.

Possibly the issue is that such a stall sometimes affects the CPU that
is running from SRAM, even though it should not. (On the other hand,
when using the MDFS hardware, it is okay if all four CPUs temporarily
stall at the same time.)

One thing to check is if sunxi_dram_dvfs_req() completes successfully.
That function contains some unbounded loops, so it is possible to get
stuck. You could toggle a GPIO or something at the end of the function.
That would distinguish between "the secure monitor hung" and "we left
the DRAM controller in a bad state and hung when switching back to code
in DRAM" or even "we trashed the contents of DRAM".

We do use the architectural timer inside sunxi_dram_dvfs_req(), but
those registers are banked between secure/non-secure states, so that
should not interfere with Linux's use of the timer.

However, your test with offlining the other CPUs suggests we may really
need some synchronization. I would suggest doing this inside U-Boot as
well. You can send an SGI IPI to the other three CPUs and force them to
trap into the secure monitor. Not only will this be immediate, but it
will also ensure the other CPUs are running from SRAM during the
reclocking. You can take inspiration from the existing IPI code in psci.c.

It is quite convenient to be truly in control, so you can do things
behind the OS's back, and keep it blissfully unaware. :)

Regards,
Samuel

> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638
> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088
> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168
>
> P.S. sorry for the duplicate, the previous message was declined by mlmmj
>
> Thu, 29 Dec 2022 at 19:29, Samuel Holland <samuel@sholland.org>:
>> [...]
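[Editor's note: the PWRCTL manipulation discussed above (bit 0 requesting self-refresh, bit 8 blocking the host interface) amounts to a single read-modify-write, as a later message in the thread shows. A standalone model of that step; the bit names are invented here for illustration, and the register is modelled as plain memory.]

```c
#include <assert.h>
#include <stdint.h>

/* Bit names invented for illustration; the thread only identifies them as
 * "enter self-refresh" (bit 0) and "disable all master access" (bit 8) */
#define PWRCTL_SELFREF_ENTER  (1u << 0)
#define PWRCTL_MASTER_DIS     (1u << 8)

/* Stand-in for the memory-mapped PWRCTL register */
static uint32_t fake_pwrctl;

/* The read-modify-write that begins the reclocking sequence */
static uint32_t dram_enter_selfrefresh(void)
{
    uint32_t val = fake_pwrctl;                       /* readl(PWRCTL)       */
    val |= PWRCTL_SELFREF_ENTER | PWRCTL_MASTER_DIS;  /* set bits 0 and 8    */
    fake_pwrctl = val;                                /* writel(val, PWRCTL) */
    return fake_pwrctl;
}
```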
* Re: ddr devfreq on H3 - possible?
  2022-12-31 20:45 ` Samuel Holland
@ 2023-01-01 21:40   ` Kirill
  2023-01-01 22:20     ` Samuel Holland
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-01 21:40 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

I did a little debugging. When the hang happens, the CPU is always
stuck on `writel(reg_val, PWRCTL);`.

Example of my debug instrumentation:
```
/* 1. enter self-refresh and disable all master access */
reg_val = readl(PWRCTL);
reg_val |= (0x1<<0);
reg_val |= (0x1<<8);
__gpio_debug(2);
writel(reg_val, PWRCTL);
__gpio_debug(3);
__udelay(1);
__gpio_debug(4);
```
__gpio_debug() should not have any effect on the process; it just
toggles GPIOs.

Before hanging, it can work for a few minutes. Hangs occur randomly,
and do not depend on the number of frequency changes.

I tried something like this for the fastest bug reproduction:
```
while true
do
	echo "simple_ondemand" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
	sleep 5
	echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
	sleep 5
done
```

No effect, it still happens randomly :)

> You can send an SGI IPI to the other three CPUs and force them to
> trap into the secure monitor.

Thanks, I will try this later.

P.S. Sorry for the duplicate, I forgot "Reply to All" in Gmail :(

Sat, 31 Dec 2022 at 22:45, Samuel Holland <samuel@sholland.org>:
> [...]
* Re: ddr devfreq on H3 - possible?
  2023-01-01 21:40 ` Kirill
@ 2023-01-01 22:20   ` Samuel Holland
  2023-01-02  2:20     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Samuel Holland @ 2023-01-01 22:20 UTC (permalink / raw)
To: Kirill; +Cc: linux-sunxi

On 1/1/23 15:40, Kirill wrote:
> I did a little debugging. When the hang happens, the CPU is always
> stuck on `writel(reg_val, PWRCTL);`.
>
> Example of my debug instrumentation:
> ```
> /* 1. enter self-refresh and disable all master access */
> reg_val = readl(PWRCTL);
> reg_val |= (0x1<<0);
> reg_val |= (0x1<<8);
> __gpio_debug(2);
> writel(reg_val, PWRCTL);
> __gpio_debug(3);
> __udelay(1);
> __gpio_debug(4);
> ```
> __gpio_debug() should not have any effect on the process; it just
> toggles GPIOs.

Just to make sure -- is __gpio_debug declared with __secure?

> Before hanging, it can work for a few minutes. Hangs occur randomly,
> and do not depend on the number of frequency changes.

Hmm, the CPU hanging right when blocking access to DRAM suggests there
is some asynchronous cache operation (prefetch or writeback) that
causes some DRAM traffic.

I checked the U-Boot source code, and it disables both caches in the
secure copy of SCTLR before setting up PSCI*. So there should not be any
prefetching occurring while you are in monitor mode. But maybe there is
some writeback from cache lines dirtied in non-secure state. You could
try calling psci_v7_flush_dcache_all() before setting PWRCTL.

If that does not help, I am not sure what else could be happening.
Regards,
Samuel

* boot_jump_linux() -> announce_and_cleanup() -> cleanup_before_linux()
  -> cleanup_before_linux_select()

> I tried something like this for the fastest bug reproduction:
> ```
> while true
> do
> 	echo "simple_ondemand" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> 	sleep 5
> 	echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> 	sleep 5
> done
> ```
>
> No effect, it still happens randomly :)
>
>> You can send an SGI IPI to the other three CPUs and force them to
>> trap into the secure monitor.
>
> Thanks, I will try this later.
>
> P.S. Sorry for the duplicate, I forgot "Reply to All" in Gmail :(
>
> Sat, 31 Dec 2022 at 22:45, Samuel Holland <samuel@sholland.org>:
>> [...]
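[Editor's note: on the H3's GICv2, the suggested SGI to the other cores boils down to one write to the distributor's GICD_SGIR register. A sketch of the value encoding follows; the register layout is per the GICv2 architecture, but the choice of SGI 15 and the CPU target list are assumptions for illustration, not taken from the actual patches.]

```c
#include <assert.h>
#include <stdint.h>

/* GICv2 distributor: Software Generated Interrupt Register offset */
#define GICD_SGIR  0xF00u

/* Encode a GICD_SGIR value: TargetListFilter 0b00 (bits 25:24, use the
 * explicit list), CPUTargetList in bits 23:16, SGI INTID in bits 3:0 */
static uint32_t gic_sgir_value(uint8_t target_list, uint8_t sgi_id)
{
    return ((uint32_t)target_list << 16) | (sgi_id & 0xFu);
}

/* From CPU0: signal a hypothetical SGI 15 to CPUs 1-3 so they trap into
 * the secure monitor for the duration of the reclocking */
static uint32_t park_other_cores(void)
{
    /* a real monitor would do: writel(value, gicd_base + GICD_SGIR); */
    return gic_sgir_value(0x0e /* CPUs 1, 2, 3 */, 15);
}
```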
* Re: ddr devfreq on H3 - possible?
  2023-01-01 22:20 ` Samuel Holland
@ 2023-01-02  2:20   ` Kirill
  2023-01-04 23:26     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-02  2:20 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

> Just to make sure -- is __gpio_debug declared with __secure?

Yes

> try calling psci_v7_flush_dcache_all() before setting PWRCTL.

Hm... I'm not sure yet, but it seems to be working: my OPi Lite has run
for about an hour without any hangs! Of course, I need more time and
more tests to confirm that completely. I will continue testing. Many
thanks!!!

Also, I did some extra work:
- rewrote the code to use U-Boot registers and constants
- implemented enabling/disabling ODT
- implemented enabling/disabling self-refresh
- implemented custom PSCI tables for platform-specific functions in U-Boot

All of these also seem to work. My fork with the changes:
https://github.com/Azq2/u-boot/commits/allwinner-h3-dram-dvfs

Mon, 2 Jan 2023 at 00:20, Samuel Holland <samuel@sholland.org>:
> [...]
> >> That function contains some unbounded loops, so it is possible to get > >> stuck. You could toggle a GPIO or something at the end of the function. > >> That would distinguish between "the secure monitor hung" and "we left > >> the DRAM controller in a bad state and hung when switching back to code > >> in DRAM" or even "we trashed the contents of DRAM". > >> > >> We do use the architectural timer inside sunxi_dram_dvfs_req(), but > >> those registers are banked between secure/non-secure states, so that > >> should not interfere with Linux's use of the timer. > >> > >> However, your test with offlining the other CPUs suggests we may really > >> need some synchronization. I would suggest doing this inside U-Boot as > >> well. You can send a SGI IPI to the other three CPUs and force them to > >> trap into the secure monitor. Not only will this be immediate, but it > >> will also ensure the other CPUs are running from SRAM during the > >> reclocking. You can take inspiration from the existing IPI code in psci.c. > >> > >> It is quite convenient to be truly in control, so you can do things > >> behind the OS's back, and keep it blissfully unaware. :) > >> > >> Regards, > >> Samuel > >> > >>> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638 > >>> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088 > >>> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168 > >>> > >>> P.S. sorry for duplicate, previous message declined by mlmmj > >>> > >>> > >>> чт, 29 дек. 2022 г. 
в 19:29, Samuel Holland <samuel@sholland.org>: > >>>> > >>>> Hi Kirill, > >>>> > >>>> On 12/28/22 07:10, Kirill wrote: > >>>>> I'm trying to use your driver with h3, but have this result: > >>>>> ``` > >>>>> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit > >>>>> DDRx with ODT > >>>>> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750 > >>>>> (6%) at 1248 MHz > >>>>> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750 > >>>>> (0%) at 1248 MHz > >>>>> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to > >>>>> 156 MHz, tREFI=19, tRFC=28, ODT=disabled > >>>>> ``` > >>>>> > >>>>> After this CPU hangs and not responding. > >>>>> Is possible (at least theoretically) to use this driver with H3? > >>>> > >>>> Yes, although it will need some help from firmware. If you look at the > >>>> vendor driver[1] (pick any random Allwinner 4.9 tree), you will see > >>>> there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3). > >>>> The driver always calls mdfs_main(), which is a standalone program > >>>> loaded to SRAM. The reason seems to be that the MDFS hardware is broken, > >>>> as you found out. > >>>> > >>>> Something like this standalone MDFS application is not upstreamable, but > >>>> conveniently we already have some firmware running from SRAM, namely > >>>> U-Boot's PSCI/secure monitor implementation. And Allwinner already has > >>>> some chips where they call a SMC to do this MDFS procedure[2]. So we can > >>>> reuse that SMC function ID, and put the code in in the secure monitor. > >>>> I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept. > >>>> > >>>> It works, though it did lock up once after playing with the devfreq > >>>> sysfs for several minutes. The contents of sunxi_dram_dvfs_req() are > >>>> just copied from the vendor driver; they could surely be improved. 
> >>>> > >>>> The U-Boot patch is based on my series adding Crust support for H3, so I > >>>> could have interactive peek/poke from the AR100 even when the DRAM > >>>> controller is dead. It shouldn't be too hard to rebase that out and move > >>>> the code to psci.c. > >>>> > >>>> I'm not sure the best way to upstream the changes to psci.S. Probably we > >>>> need some platform callback to handle unknown function IDs. > >>>> > >>>> Regards, > >>>> Samuel > >>>> > >>>> [1]: > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411 > >>>> [2]: > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38 > >>>> [3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq > >>>> [4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t > >>>> > >> > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
2023-01-02 2:20 ` Kirill
@ 2023-01-04 23:26 ` Kirill
2023-01-05 2:47 ` Kirill
0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-04 23:26 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

Hi Samuel!

Sadly, it does not work stably :(

I found a good way to stress-test the DDR DVFS process: just switching
between the max/min freq every 0.5s.
(Previously I tested with performance <-> simple_ondemand, before I
understood why the other governors were not available. And that wasn't
enough.)

My script for testing:
```
#!/bin/bash

while true
do
date
echo "powersave" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
sleep 0.5
echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
sleep 0.5
done
```

Test cases:
1. Without the sun8i_a33_mbus module: the system is stable, all works fine
I checked my setup:
- Every power source looks fine on the oscilloscope
- Kernel memtest (memtest=10) found no problems
- The userspace tool memtester found no problems

2. Single CPU + sun8i_a33_mbus + script: works fine for a long time

3. 2 or 4 CPUs: page fault! RAM seems to be getting corrupted...
```
[ 3890.960776] Unable to handle kernel paging request at virtual address 00044581
[ 3890.968017] [00044581] *pgd=00000000
[ 3890.971626] Internal error: Oops: 5 [#1] SMP THUMB2
```
(full report: https://gist.github.com/Azq2/b2875ee228d1e68922d7c1f4f4e3f3df)

I tried to implement the "CPU locking in monitor mode" using SGI, but the
page faults still occur. It is also too slow... I can see my input
lagging over ssh...
The code I used:
https://github.com/u-boot/u-boot/compare/master...Azq2:u-boot:new_test

Maybe I implemented the SGI incorrectly.

On Mon, Jan 2, 2023 at 04:20, Kirill <kirill.zhumarin@gmail.com> wrote:
>
> > Just to make sure -- is __gpio_debug declared with __secure?
>
> Yes
>
> > try calling psci_v7_flush_dcache_all() before setting PWRCTL.
>
> Hm...
I'm not sure, but it seems to be working.... My OPi Lite works > ~one hour without any hangs! > > Of course, I need more time and tests to completely confirm that. > I will continue testing. > > Great thanks!!! > > Also, I did some extra work: > - Rewrited code to use u-boot registers and constants > - implemented enabling/disabling ODT > - implemented enabling/disabling self-refresh > - implemented custom PSCI tables for platform-specific functions in u-boot > > And all of these also seem to be works. > My fork with changes: > https://github.com/Azq2/u-boot/commits/allwinner-h3-dram-dvfs > > пн, 2 янв. 2023 г. в 00:20, Samuel Holland <samuel@sholland.org>: > > > > On 1/1/23 15:40, Kirill wrote: > > > I did a little debugging. > > > When hangs happens, CPU is always stuck on `writel(reg_val, PWRCTL);` > > > > > > Example of my debug: > > > ``` > > > /* 1. enter self-refresh and disable all master access */ > > > reg_val = readl(PWRCTL); > > > reg_val |= (0x1<<0); > > > reg_val |= (0x1<<8); > > > __gpio_debug(2); > > > writel(reg_val, PWRCTL); > > > __gpio_debug(3); > > > __udelay(1); > > > __gpio_debug(4); > > > ``` > > > __gpio_debug should not take any effect on the process, just switching GPIO's. > > > > Just to make sure -- is __gpio_debug declared with __secure? > > > > > Before hanging it can be worked for a few minutes. Hangs occur at > > > random manners. > > > Also hangs not depend on count of freq change. > > > > Hmm, the CPU hanging right when blocking access to DRAM suggests there > > is some asynchronous cache operation (prefetch or writeback) that is > > causes some DRAM traffic. > > > > I checked the U-Boot source code, and it disables both caches in the > > secure copy of SCTLR before setting up PSCI*. So there should not be any > > prefetching occurring while you are in monitor mode. But maybe there is > > some writeback from cache lines dirtied in non-secure state. You could > > try calling psci_v7_flush_dcache_all() before setting PWRCTL. 
> > > > If that does not help, I am not sure what else could be happening. > > > > Regards, > > Samuel > > > > * boot_jump_linux() -> announce_and_cleanup() -> cleanup_before_linux() > > -> cleanup_before_linux_select() > > > > > I tried something like this for fastest bug reproduction: > > > ``` > > > while true > > > do > > > echo "simple_ondemand" > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > sleep 5 > > > echo "performance" > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > sleep 5 > > > done > > > ``` > > > > > > No effect, it still happens randomly :) > > > > > >> You can send a SGI IPI to the other three CPUs and force them to > > >> trap into the secure monitor. > > > > > > Thanks, I will try this later. > > > > > > P.S. Sorry for the duplicate, I forgot "Answer to All" in gmail :( > > > > > > сб, 31 дек. 2022 г. в 22:45, Samuel Holland <samuel@sholland.org>: > > >> > > >> Hi Kirill, > > >> > > >> On 12/31/22 05:15, Kirill wrote: > > >>> Hi! > > >>> > > >>> I ported your patches for armbian kernel 6.1 / u-boot and it works! > > >>> > > >>>> It works, though it did lock up once after playing with the devfreq > > >>>> sysfs for several minutes > > >>> > > >>> Yes, I have hangs too. And the main reason for this problem - SMP. :( > > >>> > > >>> By calling SMC we put only one CPU into SRAM. But other CPUs still > > >>> work and use DRAM! > > >>> I don't see any hangs, if I disable all other CPUs: > > >>> ``` > > >>> echo 0 > /sys/devices/system/cpu/cpu1/online > > >>> echo 0 > /sys/devices/system/cpu/cpu2/online > > >>> echo 0 > /sys/devices/system/cpu/cpu3/online > > >>> ``` > > >> > > >> Thanks for investigating. That's good information. > > >> > > >>> Original sunxi 3.4 kernel uses some dirty hack[2] to get around this problem. 
> > >>> > > >>> They call mdfs_pause_cpu[1] for each CPU core (except current) > > >>> This function located in the SRAM and locks CPU in infinity loop, > > >>> until `set_paused(false)` called on CPU0 > > >>> Also, legacy rockchip kernels use same hack[3]. > > >>> > > >>> But this method is not ideal... > > >>> Before changing DDR freq we must make sure which each kernel is stuck > > >>> on the SRAM function. But this is a very long process. > > >>> In my proof-of-concept implementation sometimes elapses *few seconds* > > >>> between I call smp_call_function and all cores stucking in SRAM > > >>> function. > > >>> This method may be suitable for manual freq change. But not for > > >>> automatic governor mode. No chance to change frequency on heavy loaded > > >>> CPUs. > > >>> > > >>> We need another way. > > >>> > > >>> I think, code on PSCI must stop/suspend any other working cpu cores > > >>> before update freq. This is possible? > > >> > > >> It _shouldn't_ be necessary. Setting bit 8 in PWRCTL blocks the DRAM > > >> controller's host interface, which should cause the L2 cache subsystem > > >> (and thus the other CPUs) to stall when trying to access DRAM. This is > > >> what the MDFS hardware does on A64/H5, and I have seen no hangs there. > > >> > > >> Possibly the issue is that such a stall sometimes affects the CPU that > > >> is running from SRAM, even though it should not. (On the other hand, > > >> when using the MDFS hardware, it is okay if all four CPUs temporarily > > >> stall at the same time.) > > >> > > >> One thing to check is if sunxi_dram_dvfs_req() completes successfully. > > >> That function contains some unbounded loops, so it is possible to get > > >> stuck. You could toggle a GPIO or something at the end of the function. > > >> That would distinguish between "the secure monitor hung" and "we left > > >> the DRAM controller in a bad state and hung when switching back to code > > >> in DRAM" or even "we trashed the contents of DRAM". 
> > >> > > >> We do use the architectural timer inside sunxi_dram_dvfs_req(), but > > >> those registers are banked between secure/non-secure states, so that > > >> should not interfere with Linux's use of the timer. > > >> > > >> However, your test with offlining the other CPUs suggests we may really > > >> need some synchronization. I would suggest doing this inside U-Boot as > > >> well. You can send a SGI IPI to the other three CPUs and force them to > > >> trap into the secure monitor. Not only will this be immediate, but it > > >> will also ensure the other CPUs are running from SRAM during the > > >> reclocking. You can take inspiration from the existing IPI code in psci.c. > > >> > > >> It is quite convenient to be truly in control, so you can do things > > >> behind the OS's back, and keep it blissfully unaware. :) > > >> > > >> Regards, > > >> Samuel > > >> > > >>> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638 > > >>> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088 > > >>> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168 > > >>> > > >>> P.S. sorry for duplicate, previous message declined by mlmmj > > >>> > > >>> > > >>> чт, 29 дек. 2022 г. 
в 19:29, Samuel Holland <samuel@sholland.org>: > > >>>> > > >>>> Hi Kirill, > > >>>> > > >>>> On 12/28/22 07:10, Kirill wrote: > > >>>>> I'm trying to use your driver with h3, but have this result: > > >>>>> ``` > > >>>>> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit > > >>>>> DDRx with ODT > > >>>>> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750 > > >>>>> (6%) at 1248 MHz > > >>>>> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750 > > >>>>> (0%) at 1248 MHz > > >>>>> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to > > >>>>> 156 MHz, tREFI=19, tRFC=28, ODT=disabled > > >>>>> ``` > > >>>>> > > >>>>> After this CPU hangs and not responding. > > >>>>> Is possible (at least theoretically) to use this driver with H3? > > >>>> > > >>>> Yes, although it will need some help from firmware. If you look at the > > >>>> vendor driver[1] (pick any random Allwinner 4.9 tree), you will see > > >>>> there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3). > > >>>> The driver always calls mdfs_main(), which is a standalone program > > >>>> loaded to SRAM. The reason seems to be that the MDFS hardware is broken, > > >>>> as you found out. > > >>>> > > >>>> Something like this standalone MDFS application is not upstreamable, but > > >>>> conveniently we already have some firmware running from SRAM, namely > > >>>> U-Boot's PSCI/secure monitor implementation. And Allwinner already has > > >>>> some chips where they call a SMC to do this MDFS procedure[2]. So we can > > >>>> reuse that SMC function ID, and put the code in in the secure monitor. > > >>>> I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept. > > >>>> > > >>>> It works, though it did lock up once after playing with the devfreq > > >>>> sysfs for several minutes. The contents of sunxi_dram_dvfs_req() are > > >>>> just copied from the vendor driver; they could surely be improved. 
> > >>>> > > >>>> The U-Boot patch is based on my series adding Crust support for H3, so I > > >>>> could have interactive peek/poke from the AR100 even when the DRAM > > >>>> controller is dead. It shouldn't be too hard to rebase that out and move > > >>>> the code to psci.c. > > >>>> > > >>>> I'm not sure the best way to upstream the changes to psci.S. Probably we > > >>>> need some platform callback to handle unknown function IDs. > > >>>> > > >>>> Regards, > > >>>> Samuel > > >>>> > > >>>> [1]: > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411 > > >>>> [2]: > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38 > > >>>> [3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq > > >>>> [4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t > > >>>> > > >> > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
2023-01-04 23:26 ` Kirill
@ 2023-01-05 2:47 ` Kirill
2023-01-05 20:57 ` Kirill
0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-05 2:47 UTC (permalink / raw)
To: Samuel Holland; +Cc: linux-sunxi

Oh, sorry for the irrelevant report. I was really surprised when I got a
page fault even without the SMC call (:
Previously I was using the unstable Armbian kernel (6.1.1). On the stable
Armbian kernel (5.15.x86) the page fault does not reproduce in any of the
cases. It was just an Armbian bug...

I'll try to be more careful when preparing the test environment. :)

I will continue testing.

On Thu, Jan 5, 2023 at 01:26, Kirill <kirill.zhumarin@gmail.com> wrote:
>
> Hi Samel!
>
> Sadly, it does not work stable :(
>
> I found a good way to stress-test the ddr dvfs process. Just switching
> between max/min freq every 0.5s.
> (Previously I tested with performance <-> simple ondemand, before I
> understood why not present other governors. And it wasn't enough.)
>
> My script for testing:
> ```
> #!/bin/bash
>
> while true
> do
> date
> echo "powersave" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> sleep 0.5
> echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> sleep 0.5
> done
> ```
>
> Test cases:
> 1. Without module sun8i_a33_mbus: system stable, all works fine
> I checked my setup:
> - Every power source looks fine on oscilloscope
> - Kernel memtest (memtest=10) don't find any problem
> - Userspace tool memtester don't find any problem
>
> 2. Single CPU + sun8i_a33_mbus + script: Works fine for a long time
>
> 3. 2 or 4 CPUs: Page Fault! Seems to be RAM corrupting...
> ``` > [ 3890.960776] Unable to handle kernel paging request at virtual > address 00044581 > [ 3890.968017] [00044581] *pgd=00000000 > [ 3890.971626] Internal error: Oops: 5 [#1] SMP THUMB2 > ``` > (full report: https://gist.github.com/Azq2/b2875ee228d1e68922d7c1f4f4e3f3df) > > I tried to implement "CPU locking in monitor mode" using SGI. But > pagefaults still occur. > And, also, it is too slow... I see how my input hangs on ssh... > Code which I use: > https://github.com/u-boot/u-boot/compare/master...Azq2:u-boot:new_test > > Maybe I incorrectly implemented SGI. > > пн, 2 янв. 2023 г. в 04:20, Kirill <kirill.zhumarin@gmail.com>: > > > > > Just to make sure -- is __gpio_debug declared with __secure? > > > > Yes > > > > > try calling psci_v7_flush_dcache_all() before setting PWRCTL. > > > > Hm... I'm not sure, but it seems to be working.... My OPi Lite works > > ~one hour without any hangs! > > > > Of course, I need more time and tests to completely confirm that. > > I will continue testing. > > > > Great thanks!!! > > > > Also, I did some extra work: > > - Rewrited code to use u-boot registers and constants > > - implemented enabling/disabling ODT > > - implemented enabling/disabling self-refresh > > - implemented custom PSCI tables for platform-specific functions in u-boot > > > > And all of these also seem to be works. > > My fork with changes: > > https://github.com/Azq2/u-boot/commits/allwinner-h3-dram-dvfs > > > > пн, 2 янв. 2023 г. в 00:20, Samuel Holland <samuel@sholland.org>: > > > > > > On 1/1/23 15:40, Kirill wrote: > > > > I did a little debugging. > > > > When hangs happens, CPU is always stuck on `writel(reg_val, PWRCTL);` > > > > > > > > Example of my debug: > > > > ``` > > > > /* 1. 
enter self-refresh and disable all master access */ > > > > reg_val = readl(PWRCTL); > > > > reg_val |= (0x1<<0); > > > > reg_val |= (0x1<<8); > > > > __gpio_debug(2); > > > > writel(reg_val, PWRCTL); > > > > __gpio_debug(3); > > > > __udelay(1); > > > > __gpio_debug(4); > > > > ``` > > > > __gpio_debug should not take any effect on the process, just switching GPIO's. > > > > > > Just to make sure -- is __gpio_debug declared with __secure? > > > > > > > Before hanging it can be worked for a few minutes. Hangs occur at > > > > random manners. > > > > Also hangs not depend on count of freq change. > > > > > > Hmm, the CPU hanging right when blocking access to DRAM suggests there > > > is some asynchronous cache operation (prefetch or writeback) that is > > > causes some DRAM traffic. > > > > > > I checked the U-Boot source code, and it disables both caches in the > > > secure copy of SCTLR before setting up PSCI*. So there should not be any > > > prefetching occurring while you are in monitor mode. But maybe there is > > > some writeback from cache lines dirtied in non-secure state. You could > > > try calling psci_v7_flush_dcache_all() before setting PWRCTL. > > > > > > If that does not help, I am not sure what else could be happening. 
> > > > > > Regards, > > > Samuel > > > > > > * boot_jump_linux() -> announce_and_cleanup() -> cleanup_before_linux() > > > -> cleanup_before_linux_select() > > > > > > > I tried something like this for fastest bug reproduction: > > > > ``` > > > > while true > > > > do > > > > echo "simple_ondemand" > > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > > sleep 5 > > > > echo "performance" > > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > > sleep 5 > > > > done > > > > ``` > > > > > > > > No effect, it still happens randomly :) > > > > > > > >> You can send a SGI IPI to the other three CPUs and force them to > > > >> trap into the secure monitor. > > > > > > > > Thanks, I will try this later. > > > > > > > > P.S. Sorry for the duplicate, I forgot "Answer to All" in gmail :( > > > > > > > > сб, 31 дек. 2022 г. в 22:45, Samuel Holland <samuel@sholland.org>: > > > >> > > > >> Hi Kirill, > > > >> > > > >> On 12/31/22 05:15, Kirill wrote: > > > >>> Hi! > > > >>> > > > >>> I ported your patches for armbian kernel 6.1 / u-boot and it works! > > > >>> > > > >>>> It works, though it did lock up once after playing with the devfreq > > > >>>> sysfs for several minutes > > > >>> > > > >>> Yes, I have hangs too. And the main reason for this problem - SMP. :( > > > >>> > > > >>> By calling SMC we put only one CPU into SRAM. But other CPUs still > > > >>> work and use DRAM! > > > >>> I don't see any hangs, if I disable all other CPUs: > > > >>> ``` > > > >>> echo 0 > /sys/devices/system/cpu/cpu1/online > > > >>> echo 0 > /sys/devices/system/cpu/cpu2/online > > > >>> echo 0 > /sys/devices/system/cpu/cpu3/online > > > >>> ``` > > > >> > > > >> Thanks for investigating. That's good information. > > > >> > > > >>> Original sunxi 3.4 kernel uses some dirty hack[2] to get around this problem. 
> > > >>> > > > >>> They call mdfs_pause_cpu[1] for each CPU core (except current) > > > >>> This function located in the SRAM and locks CPU in infinity loop, > > > >>> until `set_paused(false)` called on CPU0 > > > >>> Also, legacy rockchip kernels use same hack[3]. > > > >>> > > > >>> But this method is not ideal... > > > >>> Before changing DDR freq we must make sure which each kernel is stuck > > > >>> on the SRAM function. But this is a very long process. > > > >>> In my proof-of-concept implementation sometimes elapses *few seconds* > > > >>> between I call smp_call_function and all cores stucking in SRAM > > > >>> function. > > > >>> This method may be suitable for manual freq change. But not for > > > >>> automatic governor mode. No chance to change frequency on heavy loaded > > > >>> CPUs. > > > >>> > > > >>> We need another way. > > > >>> > > > >>> I think, code on PSCI must stop/suspend any other working cpu cores > > > >>> before update freq. This is possible? > > > >> > > > >> It _shouldn't_ be necessary. Setting bit 8 in PWRCTL blocks the DRAM > > > >> controller's host interface, which should cause the L2 cache subsystem > > > >> (and thus the other CPUs) to stall when trying to access DRAM. This is > > > >> what the MDFS hardware does on A64/H5, and I have seen no hangs there. > > > >> > > > >> Possibly the issue is that such a stall sometimes affects the CPU that > > > >> is running from SRAM, even though it should not. (On the other hand, > > > >> when using the MDFS hardware, it is okay if all four CPUs temporarily > > > >> stall at the same time.) > > > >> > > > >> One thing to check is if sunxi_dram_dvfs_req() completes successfully. > > > >> That function contains some unbounded loops, so it is possible to get > > > >> stuck. You could toggle a GPIO or something at the end of the function. 
> > > >> That would distinguish between "the secure monitor hung" and "we left > > > >> the DRAM controller in a bad state and hung when switching back to code > > > >> in DRAM" or even "we trashed the contents of DRAM". > > > >> > > > >> We do use the architectural timer inside sunxi_dram_dvfs_req(), but > > > >> those registers are banked between secure/non-secure states, so that > > > >> should not interfere with Linux's use of the timer. > > > >> > > > >> However, your test with offlining the other CPUs suggests we may really > > > >> need some synchronization. I would suggest doing this inside U-Boot as > > > >> well. You can send a SGI IPI to the other three CPUs and force them to > > > >> trap into the secure monitor. Not only will this be immediate, but it > > > >> will also ensure the other CPUs are running from SRAM during the > > > >> reclocking. You can take inspiration from the existing IPI code in psci.c. > > > >> > > > >> It is quite convenient to be truly in control, so you can do things > > > >> behind the OS's back, and keep it blissfully unaware. :) > > > >> > > > >> Regards, > > > >> Samuel > > > >> > > > >>> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638 > > > >>> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088 > > > >>> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168 > > > >>> > > > >>> P.S. sorry for duplicate, previous message declined by mlmmj > > > >>> > > > >>> > > > >>> чт, 29 дек. 2022 г. 
в 19:29, Samuel Holland <samuel@sholland.org>: > > > >>>> > > > >>>> Hi Kirill, > > > >>>> > > > >>>> On 12/28/22 07:10, Kirill wrote: > > > >>>>> I'm trying to use your driver with h3, but have this result: > > > >>>>> ``` > > > >>>>> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit > > > >>>>> DDRx with ODT > > > >>>>> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750 > > > >>>>> (6%) at 1248 MHz > > > >>>>> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750 > > > >>>>> (0%) at 1248 MHz > > > >>>>> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to > > > >>>>> 156 MHz, tREFI=19, tRFC=28, ODT=disabled > > > >>>>> ``` > > > >>>>> > > > >>>>> After this CPU hangs and not responding. > > > >>>>> Is possible (at least theoretically) to use this driver with H3? > > > >>>> > > > >>>> Yes, although it will need some help from firmware. If you look at the > > > >>>> vendor driver[1] (pick any random Allwinner 4.9 tree), you will see > > > >>>> there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3). > > > >>>> The driver always calls mdfs_main(), which is a standalone program > > > >>>> loaded to SRAM. The reason seems to be that the MDFS hardware is broken, > > > >>>> as you found out. > > > >>>> > > > >>>> Something like this standalone MDFS application is not upstreamable, but > > > >>>> conveniently we already have some firmware running from SRAM, namely > > > >>>> U-Boot's PSCI/secure monitor implementation. And Allwinner already has > > > >>>> some chips where they call a SMC to do this MDFS procedure[2]. So we can > > > >>>> reuse that SMC function ID, and put the code in in the secure monitor. > > > >>>> I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept. > > > >>>> > > > >>>> It works, though it did lock up once after playing with the devfreq > > > >>>> sysfs for several minutes. 
The contents of sunxi_dram_dvfs_req() are > > > >>>> just copied from the vendor driver; they could surely be improved. > > > >>>> > > > >>>> The U-Boot patch is based on my series adding Crust support for H3, so I > > > >>>> could have interactive peek/poke from the AR100 even when the DRAM > > > >>>> controller is dead. It shouldn't be too hard to rebase that out and move > > > >>>> the code to psci.c. > > > >>>> > > > >>>> I'm not sure the best way to upstream the changes to psci.S. Probably we > > > >>>> need some platform callback to handle unknown function IDs. > > > >>>> > > > >>>> Regards, > > > >>>> Samuel > > > >>>> > > > >>>> [1]: > > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411 > > > >>>> [2]: > > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38 > > > >>>> [3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq > > > >>>> [4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t > > > >>>> > > > >> > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
  2023-01-05  2:47 ` Kirill
@ 2023-01-05 20:57   ` Kirill
  2023-01-06  1:18     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-05 20:57 UTC (permalink / raw)
  To: Samuel Holland; +Cc: linux-sunxi

Hi, Samuel!

Now I can confidently say: psci_v7_flush_dcache_all() only partially helps.
With lightly loaded RAM it can run for a long time, but usually after a few
hours it still hangs on PWRCTL.

Previously this bug was hard to test because of its random nature, but now
I have found a method that reproduces the problem 100% of the time.

First ssh:
```
#!/bin/bash

while true
do
    date
    echo "powersave" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
    sleep 0.5
    echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
    sleep 0.5
done
```

Second ssh:
```
$ sudo memtester 300M
```

With 2-4 CPUs online: it hangs on PWRCTL after every run of the `memtester`
tool (within a few seconds).
With 1 CPU online: it never hangs and works fine.

On Thu, Jan 5, 2023 at 4:47 AM Kirill <kirill.zhumarin@gmail.com> wrote:
>
> Oh, sorry for the irrelevant report. I was really surprised when I got a
> page fault even without the SMC call (:
> Previously I used the Armbian unstable kernel (6.1.1). On the stable
> Armbian kernel (5.15.x86) the page fault is not reproduced at all.
> It's just Armbian's bugs...
>
> I'll try to be more careful when preparing the test environment. :)
>
> I will continue testing.
>
> On Thu, Jan 5, 2023 at 01:26, Kirill <kirill.zhumarin@gmail.com> wrote:
> >
> > Hi Samuel!
> >
> > Sadly, it does not work stably :(
> >
> > I found a good way to stress-test the DDR DVFS process: just switching
> > between max/min frequency every 0.5s.
> > (Previously I tested with performance <-> simple_ondemand, before I
> > understood why the other governors are not present. And it wasn't enough.)

> >
> > My script for testing:
> > ```
> > #!/bin/bash
> >
> > while true
> > do
> >     date
> >     echo "powersave" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> >     sleep 0.5
> >     echo "performance" > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor
> >     sleep 0.5
> > done
> > ```
> >
> > Test cases:
> > 1. Without the sun8i_a33_mbus module: the system is stable, everything works fine.
> > I checked my setup:
> > - Every power source looks fine on the oscilloscope
> > - The kernel memtest (memtest=10) doesn't find any problems
> > - The userspace tool memtester doesn't find any problems
> >
> > 2. Single CPU + sun8i_a33_mbus + script: works fine for a long time.
> >
> > 3. 2 or 4 CPUs: page fault! RAM seems to be getting corrupted...
> > ```
> > [ 3890.960776] Unable to handle kernel paging request at virtual
> > address 00044581
> > [ 3890.968017] [00044581] *pgd=00000000
> > [ 3890.971626] Internal error: Oops: 5 [#1] SMP THUMB2
> > ```
> > (full report: https://gist.github.com/Azq2/b2875ee228d1e68922d7c1f4f4e3f3df)
> >
> > I tried to implement "CPU locking in monitor mode" using SGI, but
> > page faults still occur.
> > It is also too slow... I can see my input lagging over ssh...
> > The code I use:
> > https://github.com/u-boot/u-boot/compare/master...Azq2:u-boot:new_test
> >
> > Maybe I implemented SGI incorrectly.
> >
> > On Mon, Jan 2, 2023 at 04:20, Kirill <kirill.zhumarin@gmail.com> wrote:
> > >
> > > > Just to make sure -- is __gpio_debug declared with __secure?
> > >
> > > Yes
> > >
> > > > try calling psci_v7_flush_dcache_all() before setting PWRCTL.
> > >
> > > Hm... I'm not sure, but it seems to be working... My OPi Lite has worked
> > > ~one hour without any hangs!
> > >
> > > Of course, I need more time and tests to confirm that completely.
> > > I will continue testing.
> > >
> > > Great thanks!!!
> > >
> > > Also, I did some extra work:
> > > - Rewrote the code to use U-Boot registers and constants
> > > - Implemented enabling/disabling ODT
> > > - Implemented enabling/disabling self-refresh
> > > - Implemented custom PSCI tables for platform-specific functions in U-Boot
> > >
> > > And all of these also seem to work.
> > > My fork with the changes:
> > > https://github.com/Azq2/u-boot/commits/allwinner-h3-dram-dvfs
> > >
> > > On Mon, Jan 2, 2023 at 00:20, Samuel Holland <samuel@sholland.org> wrote:
> > > >
> > > > On 1/1/23 15:40, Kirill wrote:
> > > > > I did a little debugging.
> > > > > When the hang happens, the CPU is always stuck on `writel(reg_val, PWRCTL);`
> > > > >
> > > > > Example of my debugging:
> > > > > ```
> > > > > /* 1. enter self-refresh and disable all master access */
> > > > > reg_val = readl(PWRCTL);
> > > > > reg_val |= (0x1<<0);
> > > > > reg_val |= (0x1<<8);
> > > > > __gpio_debug(2);
> > > > > writel(reg_val, PWRCTL);
> > > > > __gpio_debug(3);
> > > > > __udelay(1);
> > > > > __gpio_debug(4);
> > > > > ```
> > > > > __gpio_debug should not have any effect on the process; it just toggles GPIOs.
> > > >
> > > > Just to make sure -- is __gpio_debug declared with __secure?
> > > >
> > > > > Before hanging, it can work for a few minutes. The hangs occur in a
> > > > > random manner.
> > > > > Also, the hangs do not depend on the number of frequency changes.
> > > >
> > > > Hmm, the CPU hanging right when blocking access to DRAM suggests there
> > > > is some asynchronous cache operation (prefetch or writeback) that
> > > > causes some DRAM traffic.
> > > >
> > > > I checked the U-Boot source code, and it disables both caches in the
> > > > secure copy of SCTLR before setting up PSCI*. So there should not be any
> > > > prefetching occurring while you are in monitor mode. But maybe there is
> > > > some writeback from cache lines dirtied in non-secure state. You could
> > > > try calling psci_v7_flush_dcache_all() before setting PWRCTL.
> > > > > > > > If that does not help, I am not sure what else could be happening. > > > > > > > > Regards, > > > > Samuel > > > > > > > > * boot_jump_linux() -> announce_and_cleanup() -> cleanup_before_linux() > > > > -> cleanup_before_linux_select() > > > > > > > > > I tried something like this for fastest bug reproduction: > > > > > ``` > > > > > while true > > > > > do > > > > > echo "simple_ondemand" > > > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > > > sleep 5 > > > > > echo "performance" > > > > > > /sys/devices/platform/soc/1c62000.dram-controller/devfreq/1c62000.dram-controller/governor > > > > > sleep 5 > > > > > done > > > > > ``` > > > > > > > > > > No effect, it still happens randomly :) > > > > > > > > > >> You can send a SGI IPI to the other three CPUs and force them to > > > > >> trap into the secure monitor. > > > > > > > > > > Thanks, I will try this later. > > > > > > > > > > P.S. Sorry for the duplicate, I forgot "Answer to All" in gmail :( > > > > > > > > > > сб, 31 дек. 2022 г. в 22:45, Samuel Holland <samuel@sholland.org>: > > > > >> > > > > >> Hi Kirill, > > > > >> > > > > >> On 12/31/22 05:15, Kirill wrote: > > > > >>> Hi! > > > > >>> > > > > >>> I ported your patches for armbian kernel 6.1 / u-boot and it works! > > > > >>> > > > > >>>> It works, though it did lock up once after playing with the devfreq > > > > >>>> sysfs for several minutes > > > > >>> > > > > >>> Yes, I have hangs too. And the main reason for this problem - SMP. :( > > > > >>> > > > > >>> By calling SMC we put only one CPU into SRAM. But other CPUs still > > > > >>> work and use DRAM! > > > > >>> I don't see any hangs, if I disable all other CPUs: > > > > >>> ``` > > > > >>> echo 0 > /sys/devices/system/cpu/cpu1/online > > > > >>> echo 0 > /sys/devices/system/cpu/cpu2/online > > > > >>> echo 0 > /sys/devices/system/cpu/cpu3/online > > > > >>> ``` > > > > >> > > > > >> Thanks for investigating. 
That's good information. > > > > >> > > > > >>> Original sunxi 3.4 kernel uses some dirty hack[2] to get around this problem. > > > > >>> > > > > >>> They call mdfs_pause_cpu[1] for each CPU core (except current) > > > > >>> This function located in the SRAM and locks CPU in infinity loop, > > > > >>> until `set_paused(false)` called on CPU0 > > > > >>> Also, legacy rockchip kernels use same hack[3]. > > > > >>> > > > > >>> But this method is not ideal... > > > > >>> Before changing DDR freq we must make sure which each kernel is stuck > > > > >>> on the SRAM function. But this is a very long process. > > > > >>> In my proof-of-concept implementation sometimes elapses *few seconds* > > > > >>> between I call smp_call_function and all cores stucking in SRAM > > > > >>> function. > > > > >>> This method may be suitable for manual freq change. But not for > > > > >>> automatic governor mode. No chance to change frequency on heavy loaded > > > > >>> CPUs. > > > > >>> > > > > >>> We need another way. > > > > >>> > > > > >>> I think, code on PSCI must stop/suspend any other working cpu cores > > > > >>> before update freq. This is possible? > > > > >> > > > > >> It _shouldn't_ be necessary. Setting bit 8 in PWRCTL blocks the DRAM > > > > >> controller's host interface, which should cause the L2 cache subsystem > > > > >> (and thus the other CPUs) to stall when trying to access DRAM. This is > > > > >> what the MDFS hardware does on A64/H5, and I have seen no hangs there. > > > > >> > > > > >> Possibly the issue is that such a stall sometimes affects the CPU that > > > > >> is running from SRAM, even though it should not. (On the other hand, > > > > >> when using the MDFS hardware, it is okay if all four CPUs temporarily > > > > >> stall at the same time.) > > > > >> > > > > >> One thing to check is if sunxi_dram_dvfs_req() completes successfully. > > > > >> That function contains some unbounded loops, so it is possible to get > > > > >> stuck. 
You could toggle a GPIO or something at the end of the function. > > > > >> That would distinguish between "the secure monitor hung" and "we left > > > > >> the DRAM controller in a bad state and hung when switching back to code > > > > >> in DRAM" or even "we trashed the contents of DRAM". > > > > >> > > > > >> We do use the architectural timer inside sunxi_dram_dvfs_req(), but > > > > >> those registers are banked between secure/non-secure states, so that > > > > >> should not interfere with Linux's use of the timer. > > > > >> > > > > >> However, your test with offlining the other CPUs suggests we may really > > > > >> need some synchronization. I would suggest doing this inside U-Boot as > > > > >> well. You can send a SGI IPI to the other three CPUs and force them to > > > > >> trap into the secure monitor. Not only will this be immediate, but it > > > > >> will also ensure the other CPUs are running from SRAM during the > > > > >> reclocking. You can take inspiration from the existing IPI code in psci.c. > > > > >> > > > > >> It is quite convenient to be truly in control, so you can do things > > > > >> behind the OS's back, and keep it blissfully unaware. :) > > > > >> > > > > >> Regards, > > > > >> Samuel > > > > >> > > > > >>> [1] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L638 > > > > >>> [2] https://github.com/Hasiergo/Allwinner-A33-linux-3.4.113/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1088 > > > > >>> [3] https://github.com/bbelos/rk3188-kernel/blob/master/arch/arm/plat-rk/ddr_freq.c#L168 > > > > >>> > > > > >>> P.S. sorry for duplicate, previous message declined by mlmmj > > > > >>> > > > > >>> > > > > >>> чт, 29 дек. 2022 г. 
в 19:29, Samuel Holland <samuel@sholland.org>: > > > > >>>> > > > > >>>> Hi Kirill, > > > > >>>> > > > > >>>> On 12/28/22 07:10, Kirill wrote: > > > > >>>>> I'm trying to use your driver with h3, but have this result: > > > > >>>>> ``` > > > > >>>>> [ 387.306429] sun8i-h3-mbus 1c62000.dram-controller: Detected 32-bit > > > > >>>>> DDRx with ODT > > > > >>>>> [ 388.450262] sun8i-h3-mbus 1c62000.dram-controller: Using 15554/243750 > > > > >>>>> (6%) at 1248 MHz > > > > >>>>> [ 389.906319] sun8i-h3-mbus 1c62000.dram-controller: Using 1079/243750 > > > > >>>>> (0%) at 1248 MHz > > > > >>>>> [ 389.914314] sun8i-h3-mbus 1c62000.dram-controller: Setting DRAM to > > > > >>>>> 156 MHz, tREFI=19, tRFC=28, ODT=disabled > > > > >>>>> ``` > > > > >>>>> > > > > >>>>> After this CPU hangs and not responding. > > > > >>>>> Is possible (at least theoretically) to use this driver with H3? > > > > >>>> > > > > >>>> Yes, although it will need some help from firmware. If you look at the > > > > >>>> vendor driver[1] (pick any random Allwinner 4.9 tree), you will see > > > > >>>> there is no mdfs_dfs() implementation for CONFIG_ARCH_SUN8IW7P1 (H3). > > > > >>>> The driver always calls mdfs_main(), which is a standalone program > > > > >>>> loaded to SRAM. The reason seems to be that the MDFS hardware is broken, > > > > >>>> as you found out. > > > > >>>> > > > > >>>> Something like this standalone MDFS application is not upstreamable, but > > > > >>>> conveniently we already have some firmware running from SRAM, namely > > > > >>>> U-Boot's PSCI/secure monitor implementation. And Allwinner already has > > > > >>>> some chips where they call a SMC to do this MDFS procedure[2]. So we can > > > > >>>> reuse that SMC function ID, and put the code in in the secure monitor. > > > > >>>> I've thrown together U-Boot[3] and Linux[4] patches as a proof of concept. > > > > >>>> > > > > >>>> It works, though it did lock up once after playing with the devfreq > > > > >>>> sysfs for several minutes. 
The contents of sunxi_dram_dvfs_req() are > > > > >>>> just copied from the vendor driver; they could surely be improved. > > > > >>>> > > > > >>>> The U-Boot patch is based on my series adding Crust support for H3, so I > > > > >>>> could have interactive peek/poke from the AR100 even when the DRAM > > > > >>>> controller is dead. It shouldn't be too hard to rebase that out and move > > > > >>>> the code to psci.c. > > > > >>>> > > > > >>>> I'm not sure the best way to upstream the changes to psci.S. Probably we > > > > >>>> need some platform callback to handle unknown function IDs. > > > > >>>> > > > > >>>> Regards, > > > > >>>> Samuel > > > > >>>> > > > > >>>> [1]: > > > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/dramfreq/sunxi-ddrfreq.c#L1411 > > > > >>>> [2]: > > > > >>>> https://github.com/Tina-Linux/tina-v83x-linux-4.9/blob/master/drivers/devfreq/sunxi-mdfs.h#L38 > > > > >>>> [3]: https://github.com/smaeul/u-boot/commits/patch/h3-dram-devfreq > > > > >>>> [4]: https://github.com/smaeul/linux/commits/wip/devfreq-a83t > > > > >>>> > > > > >> > > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
  2023-01-05 20:57 ` Kirill
@ 2023-01-06  1:18   ` Kirill
  2023-01-07 16:40     ` Kirill
  0 siblings, 1 reply; 11+ messages in thread
From: Kirill @ 2023-01-06  1:18 UTC (permalink / raw)
  To: Samuel Holland; +Cc: linux-sunxi

In addition to my previous message... I think we must make sure this bug
is really caused by SMP.

I did some experiments with parking all CPUs in SRAM. I tried the following:

1. A custom SGI. But I got random page faults. I don't know what is going
wrong, but I will try to debug it in the future.
Example code:
https://github.com/u-boot/u-boot/compare/master...Azq2:u-boot:new_test

2. Disabling the CPUs before the frequency change and enabling them after.
I tried the kernel API `cpu_remove()` / `cpu_add()` and a simple bash
script using the userspace API `/sys/devices/system/cpu/cpuX/online`.
But I get random page faults if I disable and enable the CPUs too
frequently, even without sun8i_a33_mbus at all. It seems CPU hotplug is
broken on both the 5.15 and 6.1 kernels. %) Perhaps this is related to the
SGI page faults. Of course, this is off-topic for this thread. Just a sad
fact. :(

3. Locking using SMC calls and smp_call_function.
Locking code in psci.c:
https://gist.github.com/Azq2/5c2bf3855f9546aeb63b57f5aa042621#file-psci-c-L378
Changes to sun8i-a33-mbus.c:
https://gist.github.com/Azq2/54713f2aa68c89bda0c8792f6b760908

The main idea is to park all secondary cores in their SMC handlers. Yes,
this is a totally bad and strange implementation, but it is only an
experiment. It is similar to the original Allwinner driver.
I have tested this for some time and have not had any hangs on PWRCTL,
even with memtester.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ddr devfreq on H3 - possible?
  2023-01-06  1:18 ` Kirill
@ 2023-01-07 16:40   ` Kirill
  0 siblings, 0 replies; 11+ messages in thread
From: Kirill @ 2023-01-07 16:40 UTC (permalink / raw)
  To: Samuel Holland; +Cc: linux-sunxi

Now I use something like this. A "not great, not terrible" solution.
But, of course, not upstreamable. :)

https://github.com/Azq2/armbian_build/blob/h3-ddr-dvfs-patch/patch/kernel/archive/sunxi-5.15/patches.armbian/PM-devfreq-Fix-h3-support-to-MBUS-driver.patch
https://github.com/Azq2/armbian_build/blob/h3-ddr-dvfs-patch/patch/u-boot/u-boot-sunxi/allwinner-h3-ddr-dvfs-in-psci.patch

^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-01-07 16:40 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <CAKAF0m9DqPjB6C39ZbrRHFrJOodm7WQGTL0x1jduQjNU=JpQ2g@mail.gmail.com>
2022-12-29 17:29 ` ddr devfreq on H3 - possible? Samuel Holland
2022-12-31 11:15 ` Kirill
2022-12-31 20:45 ` Samuel Holland
2023-01-01 21:40 ` Kirill
2023-01-01 22:20 ` Samuel Holland
2023-01-02 2:20 ` Kirill
2023-01-04 23:26 ` Kirill
2023-01-05 2:47 ` Kirill
2023-01-05 20:57 ` Kirill
2023-01-06 1:18 ` Kirill
2023-01-07 16:40 ` Kirill