public inbox for linux-kernel@vger.kernel.org
From: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>,
	Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>,
	Boris Brezillon <boris.brezillon@collabora.com>
Cc: Steven Price <steven.price@arm.com>,
	tzimmermann@suse.de, linux-kernel@vger.kernel.org,
	mripard@kernel.org, dri-devel@lists.freedesktop.org,
	wenst@chromium.org, kernel@collabora.com,
	"linux-samsung-soc@vger.kernel.org" 
	<linux-samsung-soc@vger.kernel.org>
Subject: Re: [PATCH] drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()
Date: Mon, 27 Nov 2023 12:26:52 +0100
Message-ID: <ac36d1e2-36a4-473c-9acf-e0a1fc7d3bfb@collabora.com>
In-Reply-To: <054f6a93-8911-40bb-b677-ccdfd27d132b@samsung.com>

On 27/11/23 12:24, Marek Szyprowski wrote:
> On 24.11.2023 13:45, Marek Szyprowski wrote:
>> On 22.11.2023 10:29, Krzysztof Kozlowski wrote:
>>> On 22/11/2023 10:06, AngeloGioacchino Del Regno wrote:
>>>>>>> Hey Krzysztof,
>>>>>>>
>>>>>>> This is interesting. It might be about the cores that are missing
>>>>>>> from the partial core_mask raising interrupts, but an external
>>>>>>> abort on non-linefetch is strange to see here.
>>>>>> I've seen such external aborts in the past, and the fault type has
>>>>>> often been misleading. It's unlikely to have anything to do with a
>>>>> Yeah, often accessing device with power or clocks gated.
>>>>>
>>>> Except my commit does *not* gate SoC power, nor SoC clocks 🙂
>>> It could be that something (like clocks or power supplies) was missing
>>> on this board/SoC, which was not critical till your patch came.
>>>
>>>> What the "Really power off ..." commit does is ask the GPU to
>>>> internally power off the shaders, tilers and L2, which is why I say
>>>> that it is strange to see that kind of abort.
>>>>
>>>> The GPU_INT_CLEAR, GPU_INT_STAT, GPU_FAULT_STATUS and
>>>> GPU_FAULT_ADDRESS_{HI/LO} registers should still be accessible even
>>>> with shaders, tilers and cache OFF.
>>>>
>>>> Anyway, yes, synchronizing IRQs before calling the poweroff sequence
>>>> would also work, but that would add quite a bit of latency to the
>>>> runtime_suspend() call, so in this case I'd rather avoid executing
>>>> any register r/w in the handler, either by checking whether the GPU
>>>> is supposed to be OFF or by clearing interrupts, which may not work
>>>> if those are generated after the execution of the poweroff function.
>>>> Or we could simply disable the irq after power_off, but that'd be
>>>> hacky (as well).
>>>>
>>>>
>>>> Let's see if asking to poweroff *everything* works:
>>> Worked.
>>
>> Yes, I also ran into this issue some time ago, but I didn't report it
>> because I also had some power-supply-related problems on my test farm
>> and everything was a bit unstable. I wasn't 100% sure that the
>> $subject patch was responsible for the observed issues. Now, after
>> fixing the power supply, I confirm that the issue was revealed by the
>> $subject patch and that the above-mentioned change fixes the problem.
>> Feel free to add:
>>
>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> 
> 
> I must revoke my Tested-by tag for the above fix alone. Although it
> fixed the boot issue and the system stability issue, it looks like
> something is still missing: opening the panfrost DRI device causes a
> system crash:
> 
> root@target:~# ./modetest -C
> trying to open device 'i915'...failed
> trying to open device 'amdgpu'...failed
> trying to open device 'radeon'...failed
> trying to open device 'nouveau'...failed
> trying to open device 'vmwgfx'...failed
> trying to open device 'omapdrm'...failed
> trying to open device 'exynos'...done
> root@target:~#
> 
> 8<--- cut here ---
> Unhandled fault: external abort on non-linefetch (0x1008) at 0xf0c6803c
> [f0c6803c] *pgd=42d87811, *pte=11800653, *ppte=11800453
> Internal error: : 1008 [#1] PREEMPT SMP ARM
> Modules linked in: exynos_gsc s5p_mfc s5p_jpeg v4l2_mem2mem
> videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common
> videodev mc s5p_cec
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc2-next-20231127-00055-ge14abcb527d6 #7649
> Hardware name: Samsung Exynos (Flattened Device Tree)
> PC is at panfrost_gpu_irq_handler+0x18/0xfc
> LR is at __handle_irq_event_percpu+0xcc/0x31c
> ...
> Process swapper/0 (pid: 0, stack limit = 0x0e2875ff)
> Stack: (0xc1301e48 to 0xc1302000)
> ...
>    panfrost_gpu_irq_handler from __handle_irq_event_percpu+0xcc/0x31c
>    __handle_irq_event_percpu from handle_irq_event+0x38/0x80
>    handle_irq_event from handle_fasteoi_irq+0x9c/0x250
>    handle_fasteoi_irq from generic_handle_domain_irq+0x24/0x34
>    generic_handle_domain_irq from gic_handle_irq+0x88/0xa8
>    gic_handle_irq from generic_handle_arch_irq+0x34/0x44
>    generic_handle_arch_irq from __irq_svc+0x8c/0xd0
> Exception stack(0xc1301f10 to 0xc1301f58)
> ...
>    __irq_svc from default_idle_call+0x20/0x2c4
>    default_idle_call from do_idle+0x244/0x2b4
>    do_idle from cpu_startup_entry+0x28/0x2c
>    cpu_startup_entry from rest_init+0xec/0x190
>    rest_init from arch_post_acpi_subsys_init+0x0/0x8
> Code: e591300c e593402c f57ff04f e591300c (e593903c)
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> CPU2: stopping
> 
> 
> It looks like the panfrost interrupts must somehow be synchronized with
> turning the power off, as has already been discussed. Let me know if you
> want me to test any patch.
> 

The new series, which contains the complete interrupt synchronization code,
is almost ready; I'm currently testing it on my machines here.

I should be able to send it today or tomorrow.
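
For illustration only, here is a rough userspace C sketch of the "check
whether the GPU is supposed to be OFF" idea discussed above. It is not the
code from the upcoming series: struct gpu_model, gpu_irq_handler(),
gpu_power_off() and the *_SIM return values are hypothetical stand-ins, and
the register access is only modelled with plain struct fields. In a real
driver the wait for in-flight handlers would presumably be done with
something like synchronize_irq(), which this model leaves as a comment.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum irq_result { IRQ_NONE_SIM, IRQ_HANDLED_SIM };

/* Hypothetical device model, NOT the real struct panfrost_device. */
struct gpu_model {
	atomic_bool suspended;   /* set before the power-off sequence starts */
	bool powered;            /* models whether register access is safe */
	uint32_t int_stat;       /* models the GPU_INT_STAT register */
	unsigned int bad_reads;  /* counts accesses that would have aborted */
};

/* Models a GPU_INT_STAT read; on real hardware an unpowered read may abort. */
static uint32_t gpu_read_int_stat(struct gpu_model *gpu)
{
	if (!gpu->powered) {
		gpu->bad_reads++;
		return 0;
	}
	return gpu->int_stat;
}

/* Handler that refuses to touch registers once the device is flagged off. */
static enum irq_result gpu_irq_handler(struct gpu_model *gpu)
{
	if (atomic_load(&gpu->suspended))
		return IRQ_NONE_SIM;

	uint32_t stat = gpu_read_int_stat(gpu);
	if (!stat)
		return IRQ_NONE_SIM;

	gpu->int_stat = 0;       /* models a write to GPU_INT_CLEAR */
	return IRQ_HANDLED_SIM;
}

/* Power-off path: flag first, then (conceptually) wait, then cut power. */
static void gpu_power_off(struct gpu_model *gpu)
{
	atomic_store(&gpu->suspended, true);
	/* A real driver would wait here for handlers that are already running. */
	gpu->powered = false;
}
```

With this ordering, an interrupt that arrives after gpu_power_off() returns
IRQ_NONE_SIM without reading any modelled register, so bad_reads stays at 0.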

Cheers,
Angelo

