All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baolu Lu <baolu.lu@linux.intel.com>
To: Pranjal Shrivastava <praan@google.com>, Will Deacon <will@kernel.org>
Cc: baolu.lu@linux.intel.com, Kunkun Jiang <jiangkunkun@huawei.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Michael Shavit <mshavit@google.com>,
	Mostafa Saleh <smostafa@google.com>,
	"moderated list:ARM SMMU DRIVERS"
	<linux-arm-kernel@lists.infradead.org>,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	wanghaibin.wang@huawei.com, yuzenghui@huawei.com,
	tangnianyao@huawei.com
Subject: Re: [bug report] iommu/arm-smmu-v3: Event cannot be printed in some scenarios
Date: Tue, 6 Aug 2024 08:09:26 +0800	[thread overview]
Message-ID: <bc992d69-da3a-41c7-bee9-81c469fb592d@linux.intel.com> (raw)
In-Reply-To: <ZrDwolC6oXN44coq@google.com>

On 2024/8/5 23:32, Pranjal Shrivastava wrote:
> On Mon, Aug 05, 2024 at 01:30:01PM +0100, Will Deacon wrote:
>> On Mon, Aug 05, 2024 at 08:13:09PM +0800, Kunkun Jiang wrote:
>>> On 2024/8/2 22:38, Pranjal Shrivastava wrote:
>>>> Hey,
>>>> On Mon, Jul 29, 2024 at 11:02 AM Baolu Lu<baolu.lu@linux.intel.com>  wrote:
>>>>> On 2024/7/24 18:24, Will Deacon wrote:
>>>>>> On Wed, Jul 24, 2024 at 05:22:59PM +0800, Kunkun Jiang wrote:
>>>>>>> On 2024/7/24 9:42, Kunkun Jiang wrote:
>>>>>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>>>>> 1797                 while (!queue_remove_raw(q, evt)) {
>>>>>>>> 1798                         u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
>>>>>>>> 1799
>>>>>>>> 1800                         ret = arm_smmu_handle_evt(smmu, evt);
>>>>>>>> 1801                         if (!ret || !__ratelimit(&rs))
>>>>>>>> 1802                                 continue;
>>>>>>>> 1803
>>>>>>>> 1804                         dev_info(smmu->dev, "event 0x%02x
>>>>>>>> received:\n", id);
>>>>>>>> 1805                         for (i = 0; i < ARRAY_SIZE(evt); ++i)
>>>>>>>> 1806                                 dev_info(smmu->dev, "\t0x%016llx\n",
>>>>>>>> 1807                                          (unsigned long
>>>>>>>> long)evt[i]);
>>>>>>>> 1808
>>>>>>>> 1809                         cond_resched();
>>>>>>>> 1810                 }
>>>>>>>>
>>>>>>>> The smmu-v3 driver cannot print event information when "ret" is 0.
>>>>>>>> Unfortunately due to commit 3dfa64aecbaf
>>>>>>>> ("iommu: Make iommu_report_device_fault() return void"), the default
>>>>>>>> return value in arm_smmu_handle_evt() is 0. Maybe a trace should
>>>>>>>> be added here?
>>>>>>> Additional explanation. Background introduction:
>>>>>>> 1.A device(VF) is passthrough(VFIO-PCI) to a VM.
>>>>>>> 2.The SMMU has the stall feature.
>>>>>>> 3.Modified guest device driver to generate an event.
>>>>>>>
>>>>>>> This event handling process is as follows:
>>>>>>> arm_smmu_evtq_thread
>>>>>>>        ret = arm_smmu_handle_evt
>>>>>>>            iommu_report_device_fault
>>>>>>>                iopf_param = iopf_get_dev_fault_param(dev);
>>>>>>>                // iopf is not enabled.
>>>>>>> // No RESUME will be sent!
>>>>>>>                if (WARN_ON(!iopf_param))
>>>>>>>                    return;
>>>>>>>        if (!ret || !__ratelimit(&rs))
>>>>>>>            continue;
>>>>>>>
>>>>>>> In this scenario, the io page-fault capability is not enabled.
>>>>>>> There are two problems here:
>>>>>>> 1. The event information is not printed.
>>>>>>> 2. The entire device(PF level) is stalled,not just the current
>>>>>>> VF. This affects other normal VFs.
>>>>>> Oh, so that stall is probably also due to b554e396e51c ("iommu: Make
>>>>>> iopf_group_response() return void"). I agree that we need a way to
>>>>>> propagate error handling back to the driver in the case that
>>>>>> 'iopf_param' is NULL, otherwise we're making the unexpected fault
>>>>>> considerably more problematic than it needs to be.
>>>>>>
>>>>>> Lu -- can we add the -ENODEV return back in the case that
>>>>>> iommu_report_device_fault() doesn't even find a 'iommu_fault_param' for
>>>>>> the device?
>>>>> Yes, of course. The commit b554e396e51c was added to consolidate the
>>>>> drivers' auto response code in the core with the assumption that driver
>>>>> only needs to call iommu_report_device_fault() for reporting an iopf.
>>>>>
>>>> I had a go at taking Jason's diff and implementing the suggestions in
>>>> this thread.
>>>> Kunkun -- please can you see if this fixes the problem for you?
>>> Okay, I'll test it as soon as I can.
>> It looks like the diff sent by Pranjal has whitespace mangling, so I
>> don't think you'll be able to apply it.
>>
>> Pranjal -- please can you send an unmangled version? If you want to test
>> out your mail setup, I'm happy to be a guinea pig so you don't spam the
>> mailing lists!
> Ugh, apologies for that, something went wrong with my client.
> Kunkun -- Please let me know if this fixes the problem.
> Lu -- It looks like the intel->page_response callback doesn't expect a
> NULL event, so, for now, I immediately return in that case. LMK what you
> think?

That's okay. We had such check there before the refactoring.

Thanks,
baolu

  reply	other threads:[~2024-08-06  0:09 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-24  1:42 [bug report] iommu/arm-smmu-v3: Event cannot be printed in some scenarios Kunkun Jiang
2024-07-24  9:15 ` Mostafa Saleh
2024-07-24  9:22 ` Kunkun Jiang
2024-07-24 10:24   ` Will Deacon
2024-07-24 13:03     ` Jason Gunthorpe
2024-07-25  7:35       ` Tian, Kevin
2024-07-25 12:58         ` Jason Gunthorpe
2024-07-26  0:04           ` Tian, Kevin
2024-07-29  5:29     ` Baolu Lu
2024-08-02 14:38       ` Pranjal Shrivastava
2024-08-05 12:13         ` Kunkun Jiang
2024-08-05 12:30           ` Will Deacon
2024-08-05 15:32             ` Pranjal Shrivastava
2024-08-06  0:09               ` Baolu Lu [this message]
2024-08-06 12:49               ` Jason Gunthorpe
2024-08-06 15:58                 ` Pranjal Shrivastava
2024-08-07  5:35                   ` Baolu Lu
2024-08-08 13:50                     ` Pranjal Shrivastava
2024-08-13 17:56                       ` Jason Gunthorpe
2024-08-14  9:02                         ` Pranjal Shrivastava

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bc992d69-da3a-41c7-bee9-81c469fb592d@linux.intel.com \
    --to=baolu.lu@linux.intel.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@ziepe.ca \
    --cc=jiangkunkun@huawei.com \
    --cc=joro@8bytes.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mshavit@google.com \
    --cc=nicolinc@nvidia.com \
    --cc=praan@google.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=tangnianyao@huawei.com \
    --cc=wanghaibin.wang@huawei.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.