From: Baolu Lu <baolu.lu@linux.intel.com>
To: "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com>
Cc: "intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"Kurmi, Suresh Kumar" <suresh.kumar.kurmi@intel.com>,
	"Saarinen, Jani" <jani.saarinen@intel.com>,
	"De Marchi, Lucas" <lucas.demarchi@intel.com>
Subject: Re: Regression on drm-tip
Date: Sun, 16 Mar 2025 16:03:21 +0800
Message-ID: <7db3b702-51e1-4c0d-8e0a-578239247587@linux.intel.com>
In-Reply-To: <SJ1PR11MB61299D9421F7B3DEA6424389B9DC2@SJ1PR11MB6129.namprd11.prod.outlook.com>

On 3/16/25 15:27, Borah, Chaitanya Kumar wrote:
> 
>> -----Original Message-----
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Sunday, March 16, 2025 8:04 AM
>> To: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>
>> Cc: intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org;
>> iommu@lists.linux.dev
>> Subject: Re: Regression on drm-tip
>>
>> On 3/14/25 17:04, Borah, Chaitanya Kumar wrote:
>>>
>>>> -----Original Message-----
>>>> From: Baolu Lu <baolu.lu@linux.intel.com>
>>>> Sent: Thursday, March 13, 2025 7:53 PM
>>>> To: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>
>>>> Cc: baolu.lu@linux.intel.com; intel-gfx@lists.freedesktop.org;
>>>> intel-xe@lists.freedesktop.org; iommu@lists.linux.dev
>>>> Subject: Re: Regression on drm-tip
>>>>
>>>> On 2025/3/13 16:51, Borah, Chaitanya Kumar wrote:
>>>>> Hello Lu,
>>>>>
>>>>> Hope you are doing well. I am Chaitanya from the Linux graphics team
>>>>> at Intel.
>>>>>
>>>>> This mail is regarding a regression we are seeing in our CI runs [1]
>>>>> on the drm-tip repository.
>>>>>
>>>>> ````````````````````````````````````````````````````````````````````
>>>>> <4>[    2.856622] WARNING: possible circular locking dependency detected
>>>>> <4>[    2.856631] 6.14.0-rc5-CI_DRM_16217-gc55ef90b69d3+ #1 Tainted: G          I
>>>>> <4>[    2.856642] ------------------------------------------------------
>>>>> <4>[    2.856650] swapper/0/1 is trying to acquire lock:
>>>>> <4>[    2.856657] ffffffff8360ecc8 (iommu_probe_device_lock){+.+.}-{3:3}, at: iommu_probe_device+0x1d/0x70
>>>>> <4>[    2.856679]
>>>>>                      but task is already holding lock:
>>>>> <4>[    2.856686] ffff888102ab6fa8 (&device->physical_node_lock){+.+.}-{3:3}, at: intel_iommu_init+0xea1/0x1220
>>>>> ````````````````````````````````````````````````````````````````````
>>>>> The detailed log can be found in [2].
>>>>>
>>>>> After bisecting the tree, the following patch [3] seems to be the
>>>>> first "bad" commit
>>>>>
>>>>> ````````````````````````````````````````````````````````````````````
>>>>> commit b150654f74bf0df8e6a7936d5ec51400d9ec06d8
>>>>> Author: Lu Baolu <baolu.lu@linux.intel.com>
>>>>> Date:   Fri Feb 28 18:27:26 2025 +0800
>>>>>
>>>>>        iommu/vt-d: Fix suspicious RCU usage
>>>>>
>>>>> ````````````````````````````````````````````````````````````````````
>>>>>
>>>>> We also verified that if we revert the patch the issue is not seen.
>>>>>
>>>>> Could you please check why the patch causes this regression and
>>>>> provide a fix if necessary?
>>>>
>>>> Could you please run a quick test to check whether the following fix works?
>>>>
>>>> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
>>>> index e540092d664d..06debeaec643 100644
>>>> --- a/drivers/iommu/intel/dmar.c
>>>> +++ b/drivers/iommu/intel/dmar.c
>>>> @@ -2051,8 +2051,13 @@ int enable_drhd_fault_handling(unsigned int cpu)
>>>>                    if (iommu->irq || iommu->node != cpu_to_node(cpu))
>>>>                            continue;
>>>>
>>>> +               /*
>>>> +                * Call dmar_alloc_hwirq() with dmar_global_lock held,
>>>> +                * could cause possible lock race condition.
>>>> +                */
>>>> +               up_read(&dmar_global_lock);
>>>>                    ret = dmar_set_interrupt(iommu);
>>>> -
>>>> +               down_read(&dmar_global_lock);
>>>>                    if (ret) {
>>>>                            pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
>>>>                                   (unsigned long long)drhd->reg_base_addr, ret);
>>>>
>>>> Thanks,
>>>> baolu
>>> We still see the issue with this change.
>> I am attempting to reproduce this issue on my MTL machine. I pulled the
>> test branch from:
>>
>> https://anongit.freedesktop.org/git/drm-tip.git
>>
>> and built the test kernel image using the configuration file from:
>>
>> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16217/kconfig.txt
>>
>> But I did not observe the lockdep splat mentioned above after booting.
>>
>> Is there anything I might have missed?
>>
> +Suresh, Jani, Lucas
> 
> We are seeing this only on the Skylake and Kaby Lake machines in our CI runs.

If so, does the change below make any difference?

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 85aa66ef4d61..ec2f385ae25b 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3049,6 +3049,7 @@ static int __init probe_acpi_namespace_devices(void)
                         if (dev->bus != &acpi_bus_type)
                                 continue;

+                       up_read(&dmar_global_lock);
                         adev = to_acpi_device(dev);
                         mutex_lock(&adev->physical_node_lock);
                         list_for_each_entry(pn,
@@ -3058,6 +3059,7 @@ static int __init probe_acpi_namespace_devices(void)
                                         break;
                         }
                         mutex_unlock(&adev->physical_node_lock);
+                       down_read(&dmar_global_lock);

                         if (ret)
                                 return ret;

Thanks,
baolu
