public inbox for linux-kernel@vger.kernel.org
From: Chunyou Tang <tangchunyou@163.com>
To: Steven Price <steven.price@arm.com>
Cc: tomeu.vizoso@collabora.com, airlied@linux.ie,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	alyssa.rosenzweig@collabora.com,
	ChunyouTang <tangchunyou@icubecorp.cn>
Subject: Re: [PATCH v2] drm/panfrost:report the full raw fault information instead
Date: Fri, 25 Jun 2021 17:49:37 +0800	[thread overview]
Message-ID: <20210625174937.0000183f@163.com> (raw)
In-Reply-To: <04bc1306-f8a3-2e3c-b55d-030d1448fad2@arm.com>


Hi Steve,
	Thanks for your reply.
	When I set only pte |= ARM_LPAE_PTE_SH_NS, there is no "GPU
Fault". When I set pte |= ARM_LPAE_PTE_SH_IS (or
ARM_LPAE_PTE_SH_OS), the "GPU Fault" appears. I don't understand how
the PTE affects this issue.
	Can you give me some more suggestions?

Thanks.

Chunyou
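
For reference, the three settings above differ only in the two-bit SH
field of the page-table entry. The sketch below is a user-space
illustration, not driver code: the three ARM_LPAE_PTE_SH_* values are
written as I understand them from drivers/iommu/io-pgtable-arm.c (check
them against your tree), with uint64_t standing in for arm_lpae_iopte.
PTE_SH_MASK and pte_shareability() are hypothetical helpers introduced
only for this example.

```c
#include <stdint.h>

/* SH[1:0] occupies PTE bits [9:8] in the LPAE descriptor format. */
#define ARM_LPAE_PTE_SH_NS (((uint64_t)0) << 8) /* non-shareable   */
#define ARM_LPAE_PTE_SH_OS (((uint64_t)2) << 8) /* outer-shareable */
#define ARM_LPAE_PTE_SH_IS (((uint64_t)3) << 8) /* inner-shareable */

/* Hypothetical helper: mask covering the whole SH field. */
#define PTE_SH_MASK (((uint64_t)3) << 8)

/* Hypothetical helper: name the shareability encoded in a PTE. */
static const char *pte_shareability(uint64_t pte)
{
	switch (pte & PTE_SH_MASK) {
	case ARM_LPAE_PTE_SH_NS: return "non-shareable";
	case ARM_LPAE_PTE_SH_OS: return "outer-shareable";
	case ARM_LPAE_PTE_SH_IS: return "inner-shareable";
	default:                 return "reserved";
	}
}
```

Dumping the SH field of every PTE that maps a given buffer is one way
to look for a page that is mapped with two different shareability
settings.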

On Thu, 24 Jun 2021 14:22:04 +0100
Steven Price <steven.price@arm.com> wrote:

> On 22/06/2021 02:40, Chunyou Tang wrote:
> > Hi Steve,
> > 	I will send a new patch with a suitable subject/commit
> > message. Should I send it as a V3 or as a new patch?
> 
> Send a V3 - it is a new version of this patch.
> 
> > 	I have hit a GPU bug and have no idea how to fix it. Any
> > suggestions would be much appreciated.
> > 
> > You can see such kernel log:
> > 
> > Jun 20 10:20:13 icube kernel: [  774.566760] mvp_gpu 0000:05:00.0: GPU Fault 0x00000088 (SHAREABILITY_FAULT) at 0x000000000310fd00
> > Jun 20 10:20:13 icube kernel: [  774.566764] mvp_gpu 0000:05:00.0: There were multiple GPU faults - some have not been reported
> > Jun 20 10:20:13 icube kernel: [  774.667542] mvp_gpu 0000:05:00.0: AS_ACTIVE bit stuck
> > Jun 20 10:20:13 icube kernel: [  774.767900] mvp_gpu 0000:05:00.0: AS_ACTIVE bit stuck
> > Jun 20 10:20:13 icube kernel: [  774.868546] mvp_gpu 0000:05:00.0: AS_ACTIVE bit stuck
> > Jun 20 10:20:13 icube kernel: [  774.968910] mvp_gpu 0000:05:00.0: AS_ACTIVE bit stuck
> > Jun 20 10:20:13 icube kernel: [  775.069251] mvp_gpu 0000:05:00.0: AS_ACTIVE bit stuck
> > Jun 20 10:20:22 icube kernel: [  783.693971] mvp_gpu 0000:05:00.0: gpu sched timeout, js=1, config=0x7300, status=0x8, head=0x362c900, tail=0x362c100, sched_job=000000003252fb84
> > 
> > In
> > https://lore.kernel.org/dri-devel/20200510165538.19720-1-peron.clem@gmail.com/
> > the same bug was reported, and I saw you on that mailing-list thread,
> > but I don't know how it was fixed.
> 
> The GPU_SHAREABILITY_FAULT error means that a cache line has been
> accessed both as shareable and non-shareable and therefore coherency
> cannot be guaranteed. Although the "multiple GPU faults" means that
> this may not be the underlying cause.
> 
> The fact that your dmesg log has PCI style identifiers
> ("0000:05:00.0") suggests this is an unusual platform - I've not
> previously been aware of a Mali device behind PCI. Is this device
> working with the kbase/DDK proprietary driver? It would be worth
> looking at the kbase kernel code for the platform to see if there is
> anything special done for the platform.
> 
> From the dmesg logs all I can really tell is that the GPU seems
> unhappy about the memory system.
> 
> Steve
> 
> > I need your help!
> > 
> > Thanks very much!
> > 
> > Chunyou
> > 
> > On Mon, 21 Jun 2021 11:45:20 +0100
> > Steven Price <steven.price@arm.com> wrote:
> > 
> >> On 19/06/2021 04:18, Chunyou Tang wrote:
> >>> Hi Steve,
> >>> 	1. I now know how to write the subject.
> >>> 	2. Per the spec, the low 8 bits are the exception type,
> >>>
> >>> as you can see in panfrost_exception_name():
> >>>
> >>> switch (exception_code) {
> >>>                 /* Non-Fault Status code */
> >>> case 0x00: return "NOT_STARTED/IDLE/OK";
> >>> case 0x01: return "DONE";
> >>> case 0x02: return "INTERRUPTED";
> >>> case 0x03: return "STOPPED";
> >>> case 0x04: return "TERMINATED";
> >>> case 0x08: return "ACTIVE";
> >>> ........
> >>> ........
> >>> case 0xD8: return "ACCESS_FLAG";
> >>> case 0xD9 ... 0xDF: return "ACCESS_FLAG";
> >>> case 0xE0 ... 0xE7: return "ADDRESS_SIZE_FAULT";
> >>> case 0xE8 ... 0xEF: return "MEMORY_ATTRIBUTES_FAULT";
> >>> }
> >>> return "UNKNOWN";
> >>> }
> >>>
> >>> The exception_code in each case label is only 8 bits, so if
> >>> fault_status in panfrost_gpu_irq_handler() is not masked with 0xFF,
> >>> the lookup cannot return the correct exception reason; it will
> >>> always be UNKNOWN.
> >>
> >> Yes, I'm happy with the change - I just need a patch that I can
> >> apply. At the moment this patch only changes the first '0x%08x'
> >> output rather than the call to panfrost_exception_name() as well.
> >> So we just need a patch which does:
> >>
> >> - fault_status & 0xFF, panfrost_exception_name(pfdev, fault_status),
> >> + fault_status, panfrost_exception_name(pfdev, fault_status & 0xFF),
> >>
> >> along with a suitable subject/commit message describing the
> >> change. If you can send me that I can apply it.
> >>
> >> Thanks,
> >>
> >> Steve
> >>
> >> PS. Sorry for going round in circles here - I'm trying to help you
> >> get setup so you'll be able to contribute patches easily in
> >> future. An important part of that is ensuring you can send a
> >> properly formatted patch to the list.
> >>
> >> PPS. I'm still not receiving your emails directly. I don't think
> >> it's a problem at my end because I'm receiving other emails, but
> >> if you can somehow fix the problem you're likely to receive a
> >> faster response.
> >>
> >>> On Fri, 18 Jun 2021 13:43:24 +0100
> >>> Steven Price <steven.price@arm.com> wrote:
> >>>
> >>>> On 17/06/2021 07:20, ChunyouTang wrote:
> >>>>> From: ChunyouTang <tangchunyou@icubecorp.cn>
> >>>>>
> >>>>> of the low 8 bits.
> >>>>
> >>>> Please don't split the subject like this. The first line of the
> >>>> commit should be a (very short) summary of the patch. Then a
> >>>> blank line and then a longer description of what the purpose of
> >>>> the patch is and why it's needed.
> >>>>
> >>>> Also you previously had this as part of a series (the first part
> >>>> adding the "& 0xFF" in the panfrost_exception_name() call). I'm
> >>>> not sure we need two patches for the single line, but as it
> >>>> stands this patch doesn't apply.
> >>>>
> >>>> Also I'm still not receiving any emails from you directly (only
> >>>> via the list), so it's possible I might have missed something
> >>>> you sent.
> >>>>
> >>>> Steve
> >>>>
> >>>>>
> >>>>> Signed-off-by: ChunyouTang <tangchunyou@icubecorp.cn>
> >>>>> ---
> >>>>>  drivers/gpu/drm/panfrost/panfrost_gpu.c | 2 +-
> >>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>>> index 1fffb6a0b24f..d2d287bbf4e7 100644
> >>>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>>> @@ -33,7 +33,7 @@ static irqreturn_t panfrost_gpu_irq_handler(int irq, void *data)
> >>>>>  		address |= gpu_read(pfdev, GPU_FAULT_ADDRESS_LO);
> >>>>>  
> >>>>>  		dev_warn(pfdev->dev, "GPU Fault 0x%08x (%s) at 0x%016llx\n",
> >>>>> -			 fault_status & 0xFF, panfrost_exception_name(pfdev, fault_status & 0xFF),
> >>>>> +			 fault_status, panfrost_exception_name(pfdev, fault_status & 0xFF),
> >>>>>  			 address);
> >>>>>  
> >>>>>  		if (state & GPU_IRQ_MULTIPLE_FAULT)
> >>>>>
> >>>
> >>>
> > 
> > 
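
The change discussed in the quoted thread (print the full raw
fault_status, but look up the name from its low 8 bits only) can be
tried outside the kernel. This is only a sketch: exception_name() is a
cut-down stand-in for panfrost_exception_name() that knows just a few
codes mentioned in this thread, format_gpu_fault() is a hypothetical
helper replacing dev_warn(), and 0x88 is mapped to SHAREABILITY_FAULT
because that is what the log above shows.

```c
#include <stdint.h>
#include <stdio.h>

/* Cut-down stand-in for panfrost_exception_name(): a few codes only. */
static const char *exception_name(uint32_t exception_code)
{
	switch (exception_code) {
	case 0x00: return "NOT_STARTED/IDLE/OK";
	case 0x01: return "DONE";
	case 0x08: return "ACTIVE";
	case 0x88: return "SHAREABILITY_FAULT";
	default:   return "UNKNOWN";
	}
}

/*
 * Hypothetical helper: build the message with the full raw status in
 * the hex field and the name taken from the low 8 bits only.
 */
static int format_gpu_fault(char *buf, size_t len,
			    uint32_t fault_status, uint64_t address)
{
	return snprintf(buf, len, "GPU Fault 0x%08x (%s) at 0x%016llx",
			(unsigned int)fault_status,
			exception_name(fault_status & 0xFF),
			(unsigned long long)address);
}
```

With a status whose upper bits are set (for example a made-up value
such as 0x00010288), the unmasked lookup falls through to UNKNOWN while
the masked one still resolves to SHAREABILITY_FAULT, and the raw value
stays visible in the message.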




Thread overview: 14+ messages
2021-06-17  6:20 [PATCH v2] drm/panfrost:report the full raw fault information instead ChunyouTang
2021-06-18 12:43 ` Steven Price
2021-06-19  3:18   ` Chunyou Tang
2021-06-21 10:45     ` Steven Price
2021-06-22  1:40       ` Chunyou Tang
2021-06-24 13:22         ` Steven Price
2021-06-25  9:49           ` Chunyou Tang [this message]
2021-06-28 10:48             ` Steven Price
2021-06-28 14:17               ` Robin Murphy
2021-06-29  3:08                 ` Chunyou Tang
2021-06-29  3:04               ` Chunyou Tang
2021-07-01 10:15                 ` Steven Price
2021-07-02  1:40                   ` Chunyou Tang
2021-07-05 13:50                     ` Steven Price
