From: Akhil P Oommen <akhilpo@codeaurora.org>
To: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v5 3/5] drm/msm: Improve the a6xx page fault handler
Date: Tue, 15 Jun 2021 20:21:54 +0530 [thread overview]
Message-ID: <172f92ca-7a50-c19d-bf4d-e06c1502336c@codeaurora.org> (raw)
In-Reply-To: <20210610214431.539029-4-robdclark@gmail.com>
On 6/11/2021 3:14 AM, Rob Clark wrote:
> From: Jordan Crouse <jcrouse@codeaurora.org>
>
> Use the new adreno-smmu-priv fault info function to get more SMMU
> debug registers and print the current TTBR0 to debug per-instance
> pagetables and figure out which GPU block generated the request.
>
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
> drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 4 +-
> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 76 +++++++++++++++++++++++++--
> drivers/gpu/drm/msm/msm_iommu.c | 11 +++-
> drivers/gpu/drm/msm/msm_mmu.h | 4 +-
> 4 files changed, 87 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index f46562c12022..eb030b00bff4 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -1075,7 +1075,7 @@ bool a5xx_idle(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
> return true;
> }
>
> -static int a5xx_fault_handler(void *arg, unsigned long iova, int flags)
> +static int a5xx_fault_handler(void *arg, unsigned long iova, int flags, void *data)
> {
> struct msm_gpu *gpu = arg;
> pr_warn_ratelimited("*** gpu fault: iova=%08lx, flags=%d (%u,%u,%u,%u)\n",
> @@ -1085,7 +1085,7 @@ static int a5xx_fault_handler(void *arg, unsigned long iova, int flags)
> gpu_read(gpu, REG_A5XX_CP_SCRATCH_REG(6)),
> gpu_read(gpu, REG_A5XX_CP_SCRATCH_REG(7)));
>
> - return -EFAULT;
> + return 0;
Shall we return -ENOSYS here to let the smmu handler print the fault
debugging info? Due to the change here: iommu/arm-smmu: Add support for
driver IOMMU fault handlers - arm_smmu_context_fault()
> }
>
> static void a5xx_cp_err_irq(struct msm_gpu *gpu)
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c7f0ddb12d8f..fc19db10bff1 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -1032,18 +1032,88 @@ static void a6xx_recover(struct msm_gpu *gpu)
> msm_gpu_hw_init(gpu);
> }
>
> -static int a6xx_fault_handler(void *arg, unsigned long iova, int flags)
> +static const char *a6xx_uche_fault_block(struct msm_gpu *gpu, u32 mid)
> +{
> + static const char *uche_clients[7] = {
> + "VFD", "SP", "VSC", "VPC", "HLSQ", "PC", "LRZ",
> + };
> + u32 val;
> +
> + if (mid < 1 || mid > 3)
> + return "UNKNOWN";
> +
> + /*
> + * The source of the data depends on the mid ID read from FSYNR1.
> + * and the client ID read from the UCHE block
> + */
> + val = gpu_read(gpu, REG_A6XX_UCHE_CLIENT_PF);
Are you sure gx rail is up at this point?
> +
> + /* mid = 3 is most precise and refers to only one block per client */
> + if (mid == 3)
> + return uche_clients[val & 7];
> +
> + /* For mid=2 the source is TP or VFD except when the client id is 0 */
> + if (mid == 2)
> + return ((val & 7) == 0) ? "TP" : "TP|VFD";
> +
> + /* For mid=1 just return "UCHE" as a catchall for everything else */
> + return "UCHE";
> +}
UCHE decoding scheme is different for a660. I am not sure who should
handle that. Jonathan?
> +
> +static const char *a6xx_fault_block(struct msm_gpu *gpu, u32 id)
> +{
> + if (id == 0)
> + return "CP";
> + else if (id == 4)
> + return "CCU";
> + else if (id == 6)
> + return "CDP Prefetch";
I guess id == 7 is not possible here.
> +
> + return a6xx_uche_fault_block(gpu, id);
> +}
> +
> +#define ARM_SMMU_FSR_TF BIT(1)
> +#define ARM_SMMU_FSR_PF BIT(3)
> +#define ARM_SMMU_FSR_EF BIT(4)
> +
> +static int a6xx_fault_handler(void *arg, unsigned long iova, int flags, void *data)
> {
> struct msm_gpu *gpu = arg;
> + struct adreno_smmu_fault_info *info = data;
> + const char *type = "UNKNOWN";
>
> - pr_warn_ratelimited("*** gpu fault: iova=%08lx, flags=%d (%u,%u,%u,%u)\n",
> + /*
> + * Print a default message if we couldn't get the data from the
> + * adreno-smmu-priv
> + */
> + if (!info) {
> + pr_warn_ratelimited("*** gpu fault: iova=%.16lx flags=%d (%u,%u,%u,%u)\n",
> iova, flags,
> gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)),
> gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)),
> gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)),
> gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7)));
>
> - return -EFAULT;
> + return 0;
return -ENOSYS here so that the smmu driver prints some extra debug data?
> + }
> +
> + if (info->fsr & ARM_SMMU_FSR_TF)
> + type = "TRANSLATION";
> + else if (info->fsr & ARM_SMMU_FSR_PF)
> + type = "PERMISSION";
> + else if (info->fsr & ARM_SMMU_FSR_EF)
> + type = "EXTERNAL";
> +
> + pr_warn_ratelimited("*** gpu fault: ttbr0=%.16llx iova=%.16lx dir=%s type=%s source=%s (%u,%u,%u,%u)\n",
> + info->ttbr0, iova,
> + flags & IOMMU_FAULT_WRITE ? "WRITE" : "READ", type,
> + a6xx_fault_block(gpu, info->fsynr1 & 0xff),
> + gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(4)),
> + gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(5)),
> + gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(6)),
> + gpu_read(gpu, REG_A6XX_CP_SCRATCH_REG(7)));
> +
> + return 0;
> }
>
> static void a6xx_cp_hw_err_irq(struct msm_gpu *gpu)
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> index 50d881794758..6975b95c3c29 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -211,8 +211,17 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
> unsigned long iova, int flags, void *arg)
> {
> struct msm_iommu *iommu = arg;
> + struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(iommu->base.dev);
> + struct adreno_smmu_fault_info info, *ptr = NULL;
> +
> + if (adreno_smmu->get_fault_info) {
> + adreno_smmu->get_fault_info(adreno_smmu->cookie, &info);
> + ptr = &info;
> + }
> +
> if (iommu->base.handler)
> - return iommu->base.handler(iommu->base.arg, iova, flags);
> + return iommu->base.handler(iommu->base.arg, iova, flags, ptr);
> +
> pr_warn_ratelimited("*** fault: iova=%16lx, flags=%d\n", iova, flags);
> return 0;
> }
> diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
> index 61ade89d9e48..a88f44c3268d 100644
> --- a/drivers/gpu/drm/msm/msm_mmu.h
> +++ b/drivers/gpu/drm/msm/msm_mmu.h
> @@ -26,7 +26,7 @@ enum msm_mmu_type {
> struct msm_mmu {
> const struct msm_mmu_funcs *funcs;
> struct device *dev;
> - int (*handler)(void *arg, unsigned long iova, int flags);
> + int (*handler)(void *arg, unsigned long iova, int flags, void *data);
> void *arg;
> enum msm_mmu_type type;
> };
> @@ -43,7 +43,7 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
> struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu);
>
> static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
> - int (*handler)(void *arg, unsigned long iova, int flags))
> + int (*handler)(void *arg, unsigned long iova, int flags, void *data))
Just curious, why hide the datatype of 'data' using 'void *' here?
> {
> mmu->arg = arg;
> mmu->handler = handler;
>
next prev parent reply other threads:[~2021-06-15 14:52 UTC|newest]
Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-10 21:44 [PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` [PATCH v5 1/5] iommu/arm-smmu: Add support for driver IOMMU fault handlers Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-14 17:26 ` Bjorn Andersson
2021-06-14 17:26 ` Bjorn Andersson
2021-06-14 17:26 ` Bjorn Andersson
2021-06-14 17:26 ` Bjorn Andersson
2021-06-10 21:44 ` [PATCH v5 2/5] iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-14 17:30 ` Bjorn Andersson
2021-06-14 17:30 ` Bjorn Andersson
2021-06-14 17:30 ` Bjorn Andersson
2021-06-14 17:30 ` Bjorn Andersson
2021-06-10 21:44 ` [PATCH v5 3/5] drm/msm: Improve the a6xx page fault handler Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-14 17:46 ` Bjorn Andersson
2021-06-14 17:46 ` Bjorn Andersson
2021-06-14 17:46 ` Bjorn Andersson
2021-06-15 14:51 ` Akhil P Oommen [this message]
2021-06-25 3:39 ` Bjorn Andersson
2021-06-25 3:39 ` Bjorn Andersson
2021-06-25 3:39 ` Bjorn Andersson
2021-06-25 15:42 ` Rob Clark
2021-06-25 15:42 ` Rob Clark
2021-06-25 15:42 ` Rob Clark
2021-06-10 21:44 ` [PATCH v5 4/5] iommu/arm-smmu-qcom: Add stall support Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-11 13:49 ` Jordan Crouse
2021-06-11 13:49 ` Jordan Crouse
2021-06-11 13:49 ` Jordan Crouse
2021-06-11 13:49 ` Jordan Crouse
2021-06-14 17:54 ` Bjorn Andersson
2021-06-14 17:54 ` Bjorn Andersson
2021-06-14 17:54 ` Bjorn Andersson
2021-06-14 17:54 ` Bjorn Andersson
2021-06-10 21:44 ` [PATCH v5 5/5] drm/msm: devcoredump iommu fault support Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-10 21:44 ` Rob Clark
2021-06-11 13:49 ` Jordan Crouse
2021-06-11 13:49 ` Jordan Crouse
2021-06-11 13:49 ` Jordan Crouse
2021-07-04 12:53 ` [PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling Dmitry Baryshkov
2021-07-04 12:53 ` Dmitry Baryshkov
2021-07-04 12:53 ` Dmitry Baryshkov
2021-07-04 12:53 ` Dmitry Baryshkov
2021-07-04 18:20 ` Rob Clark
2021-07-04 18:20 ` Rob Clark
2021-07-04 18:20 ` Rob Clark
2021-07-04 18:20 ` Rob Clark
2021-07-06 21:36 ` Bjorn Andersson
2021-07-06 21:36 ` Bjorn Andersson
2021-07-06 21:36 ` Bjorn Andersson
2021-07-06 21:36 ` Bjorn Andersson
2021-07-07 5:12 ` John Stultz
2021-07-07 5:12 ` John Stultz
2021-07-07 5:12 ` John Stultz
2021-07-07 5:12 ` John Stultz
2021-07-07 17:38 ` Rob Clark
2021-07-07 17:38 ` Rob Clark
2021-07-07 17:38 ` Rob Clark
2021-07-07 17:38 ` Rob Clark
-- strict thread matches above, loose matches on Subject: below --
2021-07-06 14:27 [PATCH v5 3/5] drm/msm: Improve the a6xx page fault handler Yassine Oudjana
2021-07-06 14:27 ` Yassine Oudjana
2021-07-06 14:27 ` Yassine Oudjana via iommu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=172f92ca-7a50-c19d-bf4d-e06c1502336c@codeaurora.org \
--to=akhilpo@codeaurora.org \
--cc=dri-devel@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.