From: Robin Murphy <robin.murphy@arm.com>
To: Aravind Vijayakumar <quic_aprasann@quicinc.com>,
	Rob Clark <robdclark@chromium.org>
Cc: will@kernel.org, joro@8bytes.org, dmitry.baryshkov@linaro.org,
	quic_bjorande@quicinc.com, konrad.dybcio@linaro.org,
	quic_eberman@quicinc.com, quic_psodagud@quicinc.com,
	quic_rvishwak@quicinc.com, quic_saipraka@quicinc.com,
	quic_molvera@quicinc.com, marijn.suijten@somainline.org,
	mani@kernel.org, linux-arm-kernel@lists.infradead.org,
	iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org
Subject: Re: [PATCH] iommu/arm-smmu-qcom: NULL pointer check for driver data
Date: Fri, 15 Sep 2023 11:08:09 +0100
Message-ID: <39a08615-d05c-3835-faaf-df9fc56c8f3f@arm.com>
In-Reply-To: <7bff908a-f4f2-6a57-23ea-ab5cc82cde09@quicinc.com>

On 2023-09-15 01:20, Aravind Vijayakumar wrote:
> 
> On 9/8/2023 5:21 AM, Robin Murphy wrote:
>> On 2023-09-08 06:17, Aravind Vijayakumar wrote:
>>>
>>> On 8/29/2023 7:30 AM, Rob Clark wrote:
>>>> On Mon, Aug 28, 2023 at 2:35 PM Aravind Vijayakumar
>>>> <quic_aprasann@quicinc.com> wrote:
>>>>>
>>>>> On 8/16/2023 6:01 PM, Rob Clark wrote:
>>>>>> On Wed, Aug 16, 2023 at 3:55 PM Aravind Vijayakumar
>>>>>> <quic_aprasann@quicinc.com> wrote:
>>>>>>> The driver_data is NULL when qcom_adreno_smmu_init_context()
>>>>>>> is called before the dev_set_drvdata() from the client driver,
>>>>>>> which results in a kernel crash.
>>>>>>>
>>>>>>> So add a NULL pointer check to handle the scenario
>>>>>>> where the client driver for the GPU SMMU device sets
>>>>>>> the driver data after the SMMU client device
>>>>>>> probe is done and not necessarily before that. The function
>>>>>>> qcom_adreno_smmu_init_context() assumes that the client
>>>>>>> driver always sets the driver data using dev_set_drvdata()
>>>>>>> before the SMMU client device probe, but this assumption
>>>>>>> is not always true.
>>>>>>>
>>>>>>> Signed-off-by: Aravind Vijayakumar <quic_aprasann@quicinc.com>
>>>>>>> ---
>>>>>>>    drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 3 +++
>>>>>>>    1 file changed, 3 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>>> index c71afda79d64..5323f82264ca 100644
>>>>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>>> @@ -231,6 +231,9 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>>>>>>>            */
>>>>>>>
>>>>>>>           priv = dev_get_drvdata(dev);
>>>>>>> +       if (!priv)
>>>>>>> +               return 0;
>>>>>> Could this return -EPROBE_DEFER instead, or something like that? I think
>>>>>> your patch as proposed would result in per-process GPU pgtables silently
>>>>>> failing.
>>>>>>
>>>>>> BR,
>>>>>> -R
>>>>> Thanks for the review comments. Returning -EPROBE_DEFER won't work
>>>>> because the probe of the client driver (which sets the driver data)
>>>>> will never get triggered. However, the probe of the client driver
>>>>> succeeds if we return -ENODATA. Would that be acceptable?
>>>> I _think_ so.. I need to page back in the sequence of how this works,
>>>> but I do have some warn_on's in drm/msm to complain loudly if we don't
>>>> get per-process pgtables.  I'd be interested to see the callstack
>>>> where you hit this issue.  From what I remember the sequence should
>>>> be:
>>>>
>>>> 1) before the client dev probes, arm-smmu probes and attaches the
>>>> dma-api managed iommu_domain (which IIRC should be an identity domain,
>>>> and is otherwise unused).. at this point drvdata is NULL
>>>> 2) the drm/msm can probe
>>>> 3) at some point later when GPU fw is available the GPU is loaded, drvdata
>>>> is set, and we start creating and attaching the iommu_domains that
>>>> are actually used (one for kernel context and one each for userspace
>>>> processes using the GPU)
>>>>
>>>> I guess maybe if you are hitting this case of NULL drvdata, then you
>>>> aren't getting an identity context for the dma-api managed
>>>> iommu_domain?
>>>>
>>>> BR,
>>>> -R
>>>>
>>> Yes, there are some warn_ons in io-pgtable.c, which have helped a lot 
>>> during debugging. The following is the call stack when we are hitting 
>>> the issue:
>>>
>>>    qcom_adreno_smmu_init_context+0x28/0x100
>>>    arm_smmu_init_domain_context+0x1fc/0x4cc
>>>    arm_smmu_attach_dev+0x7c/0x410
>>>    __iommu_attach_device+0x28/0x110
>>>    iommu_probe_device+0x98/0x144
>>>    of_iommu_configure+0x1f0/0x278
>>>    of_dma_configure_id+0x15c/0x320
>>>    platform_dma_configure+0x24/0x90
>>>    really_probe+0x138/0x39c
>>>    __driver_probe_device+0x114/0x190
>>>    device_driver_attach+0x4c/0xac
>>>    bind_store+0xb8/0x110
>>
>> OK, so it looks like you are indeed getting a non-identity default 
>> domain as Rob suspected. I guess that means qcom_smmu_client_of_match 
>> needs updating for this platform? (In which case, maybe a WARN() here 
>> to point in that direction might be handy as well?)
>>
>> Thanks,
>> Robin.
>>
> Hi Rob and Robin,
> 
> With the NULL check and return -ENODATA, we get the dma-api managed
> iommu domain and reattach the device to the iommu domain by calling
> "of_dma_configure" from the driver probe; without the NULL check the
> kernel crashes.

Um, what? Drivers should absolutely not be calling of_dma_configure() on 
their own devices; that is a massively egregious abuse of the API.

> Also, with the NULL check we don't have to update the 
> qcom_smmu_client_of_match and we can use the existing function - 
> "qcom_adreno_smmu_init_context" itself.

So making a bunch of conceptually-wrong code changes "saves" making the 
correct 1-line data change for the mechanism to work as designed? That 
hardly sounds like a good justification to me :/

Thanks,
Robin.
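
For illustration only, here is a rough userspace sketch of the mechanism
discussed above: a client device whose compatible string appears in the
match data gets an identity default domain, so the attach done at probe
time never reaches the code that needs drvdata. Everything prefixed with
mock_ is hypothetical, "vendor,new-gpu" is a made-up compatible, and the
assumption that an identity attach skips the context-init hook is a
simplification; this is not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace mock only: none of these are the kernel's real types. */
enum mock_domain_type {
	MOCK_DOMAIN_DEFAULT,	/* translating default domain */
	MOCK_DOMAIN_IDENTITY,	/* identity (bypass) default domain */
};

struct mock_device {
	const char *compatible;	/* stand-in for the DT compatible string */
	void *drvdata;		/* stand-in for what dev_get_drvdata() returns */
};

/* Match data before the change: the hypothetical platform is absent. */
static const char *const mock_match_before[] = {
	"qcom,adreno",
	NULL,
};

/* Match data after the hypothetical one-line addition. */
static const char *const mock_match_after[] = {
	"qcom,adreno",
	"vendor,new-gpu",	/* made-up compatible, for illustration */
	NULL,
};

/* Pick the default domain type: identity if the device is in the table. */
static enum mock_domain_type
mock_def_domain_type(const char *const *table, const struct mock_device *dev)
{
	assert(table && dev && dev->compatible);

	for (; *table; table++) {
		if (strcmp(*table, dev->compatible) == 0)
			return MOCK_DOMAIN_IDENTITY;
	}
	return MOCK_DOMAIN_DEFAULT;
}

/*
 * Models the attach done at probe time, before the client driver has set
 * drvdata. Returns 0 when the attach is safe, or -1 at the point where
 * the real code would dereference a NULL drvdata pointer.
 */
static int mock_attach_at_probe(const char *const *table,
				const struct mock_device *dev)
{
	/* Assumption: an identity attach never runs the context-init hook. */
	if (mock_def_domain_type(table, dev) == MOCK_DOMAIN_IDENTITY)
		return 0;

	return dev->drvdata ? 0 : -1;
}
```

In this model, a device with the made-up compatible and NULL drvdata
yields -1 against mock_match_before and 0 against mock_match_after, which
is the sense in which a one-line data change can replace the NULL check.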

Thread overview: 8+ messages
2023-08-16 22:55 [PATCH] iommu/arm-smmu-qcom: NULL pointer check for driver data Aravind Vijayakumar
2023-08-17  1:01 ` Rob Clark
2023-08-28 21:35   ` Aravind Vijayakumar
2023-08-29 14:30     ` Rob Clark
2023-09-08  5:17       ` Aravind Vijayakumar
2023-09-08 12:21         ` Robin Murphy
2023-09-15  0:20           ` Aravind Vijayakumar
2023-09-15 10:08             ` Robin Murphy [this message]
