linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Baolu Lu <baolu.lu@linux.intel.com>
To: Pranjal Shrivastava <praan@google.com>, Will Deacon <will@kernel.org>
Cc: baolu.lu@linux.intel.com, Kunkun Jiang <jiangkunkun@huawei.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Michael Shavit <mshavit@google.com>,
	Mostafa Saleh <smostafa@google.com>,
	"moderated list:ARM SMMU DRIVERS"
	<linux-arm-kernel@lists.infradead.org>,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	wanghaibin.wang@huawei.com, yuzenghui@huawei.com,
	tangnianyao@huawei.com
Subject: Re: [bug report] iommu/arm-smmu-v3: Event cannot be printed in some scenarios
Date: Tue, 6 Aug 2024 08:09:26 +0800	[thread overview]
Message-ID: <bc992d69-da3a-41c7-bee9-81c469fb592d@linux.intel.com> (raw)
In-Reply-To: <ZrDwolC6oXN44coq@google.com>

On 2024/8/5 23:32, Pranjal Shrivastava wrote:
> On Mon, Aug 05, 2024 at 01:30:01PM +0100, Will Deacon wrote:
>> On Mon, Aug 05, 2024 at 08:13:09PM +0800, Kunkun Jiang wrote:
>>> On 2024/8/2 22:38, Pranjal Shrivastava wrote:
>>>> Hey,
>>>> On Mon, Jul 29, 2024 at 11:02 AM Baolu Lu<baolu.lu@linux.intel.com>  wrote:
>>>>> On 2024/7/24 18:24, Will Deacon wrote:
>>>>>> On Wed, Jul 24, 2024 at 05:22:59PM +0800, Kunkun Jiang wrote:
>>>>>>> On 2024/7/24 9:42, Kunkun Jiang wrote:
>>>>>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>>>>> 1797                 while (!queue_remove_raw(q, evt)) {
>>>>>>>> 1798                         u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
>>>>>>>> 1799
>>>>>>>> 1800                         ret = arm_smmu_handle_evt(smmu, evt);
>>>>>>>> 1801                         if (!ret || !__ratelimit(&rs))
>>>>>>>> 1802                                 continue;
>>>>>>>> 1803
>>>>>>>> 1804                         dev_info(smmu->dev, "event 0x%02x
>>>>>>>> received:\n", id);
>>>>>>>> 1805                         for (i = 0; i < ARRAY_SIZE(evt); ++i)
>>>>>>>> 1806                                 dev_info(smmu->dev, "\t0x%016llx\n",
>>>>>>>> 1807                                          (unsigned long
>>>>>>>> long)evt[i]);
>>>>>>>> 1808
>>>>>>>> 1809                         cond_resched();
>>>>>>>> 1810                 }
>>>>>>>>
>>>>>>>> The smmu-v3 driver cannot print event information when "ret" is 0.
>>>>>>>> Unfortunately due to commit 3dfa64aecbaf
>>>>>>>> ("iommu: Make iommu_report_device_fault() return void"), the default
>>>>>>>> return value in arm_smmu_handle_evt() is 0. Maybe a trace should
>>>>>>>> be added here?
>>>>>>> Additional explanation. Background introduction:
>>>>>>> 1.A device(VF) is passthrough(VFIO-PCI) to a VM.
>>>>>>> 2.The SMMU has the stall feature.
>>>>>>> 3.Modified guest device driver to generate an event.
>>>>>>>
>>>>>>> This event handling process is as follows:
>>>>>>> arm_smmu_evtq_thread
>>>>>>>        ret = arm_smmu_handle_evt
>>>>>>>            iommu_report_device_fault
>>>>>>>                iopf_param = iopf_get_dev_fault_param(dev);
>>>>>>>                // iopf is not enabled.
>>>>>>> // No RESUME will be sent!
>>>>>>>                if (WARN_ON(!iopf_param))
>>>>>>>                    return;
>>>>>>>        if (!ret || !__ratelimit(&rs))
>>>>>>>            continue;
>>>>>>>
>>>>>>> In this scenario, the io page-fault capability is not enabled.
>>>>>>> There are two problems here:
>>>>>>> 1. The event information is not printed.
>>>>>>> 2. The entire device(PF level) is stalled,not just the current
>>>>>>> VF. This affects other normal VFs.
>>>>>> Oh, so that stall is probably also due to b554e396e51c ("iommu: Make
>>>>>> iopf_group_response() return void"). I agree that we need a way to
>>>>>> propagate error handling back to the driver in the case that
>>>>>> 'iopf_param' is NULL, otherwise we're making the unexpected fault
>>>>>> considerably more problematic than it needs to be.
>>>>>>
>>>>>> Lu -- can we add the -ENODEV return back in the case that
>>>>>> iommu_report_device_fault() doesn't even find a 'iommu_fault_param' for
>>>>>> the device?
>>>>> Yes, of course. The commit b554e396e51c was added to consolidate the
>>>>> drivers' auto response code in the core with the assumption that driver
>>>>> only needs to call iommu_report_device_fault() for reporting an iopf.
>>>>>
>>>> I had a go at taking Jason's diff and implementing the suggestions in
>>>> this thread.
>>>> Kunkun -- please can you see if this fixes the problem for you?
>>> Okay, I'll test it as soon as I can.
>> It looks like the diff sent by Pranjal has whitespace mangling, so I
>> don't think you'll be able to apply it.
>>
>> Pranjal -- please can you send an unmangled version? If you want to test
>> out your mail setup, I'm happy to be a guinea pig so you don't spam the
>> mailing lists!
> Ugh, apologies for that, something went wrong with my client.
> Kunkun -- Please let me know if this fixes the problem.
> Lu -- It looks like the intel->page_response callback doesn't expect a
> NULL event, so, for now, I immediately return in that case. LMK what you
> think?

That's okay. We had such check there before the refactoring.

Thanks,
baolu


  parent reply	other threads:[~2024-08-06  0:10 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-24  1:42 [bug report] iommu/arm-smmu-v3: Event cannot be printed in some scenarios Kunkun Jiang
2024-07-24  9:15 ` Mostafa Saleh
2024-07-24  9:22 ` Kunkun Jiang
2024-07-24 10:24   ` Will Deacon
2024-07-24 13:03     ` Jason Gunthorpe
2024-07-25  7:35       ` Tian, Kevin
2024-07-25 12:58         ` Jason Gunthorpe
2024-07-26  0:04           ` Tian, Kevin
2024-07-29  5:29     ` Baolu Lu
2024-08-02 14:38       ` Pranjal Shrivastava
2024-08-05 12:13         ` Kunkun Jiang
2024-08-05 12:30           ` Will Deacon
     [not found]             ` <ZrDwolC6oXN44coq@google.com>
2024-08-06  0:09               ` Baolu Lu [this message]
2024-08-06 12:49               ` Jason Gunthorpe
2024-08-06 15:58                 ` Pranjal Shrivastava
2024-08-07  5:35                   ` Baolu Lu
2024-08-08 13:50                     ` Pranjal Shrivastava
2024-08-13 17:56                       ` Jason Gunthorpe
2024-08-14  9:02                         ` Pranjal Shrivastava

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bc992d69-da3a-41c7-bee9-81c469fb592d@linux.intel.com \
    --to=baolu.lu@linux.intel.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@ziepe.ca \
    --cc=jiangkunkun@huawei.com \
    --cc=joro@8bytes.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mshavit@google.com \
    --cc=nicolinc@nvidia.com \
    --cc=praan@google.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=tangnianyao@huawei.com \
    --cc=wanghaibin.wang@huawei.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).