Intel-XE Archive on lore.kernel.org
* Regression on drm-tip
@ 2025-03-13  8:51 Borah, Chaitanya Kumar
  2025-03-13  9:30 ` Baolu Lu
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-03-13  8:51 UTC (permalink / raw)
  To: Baolu Lu
  Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	iommu@lists.linux.dev

Hello Lu,

Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.

This mail is regarding a regression we are seeing in our CI runs [1] on the drm-tip repository.

`````````````````````````````````````````````````````````````````````````````````
<4>[    2.856622] WARNING: possible circular locking dependency detected
<4>[    2.856631] 6.14.0-rc5-CI_DRM_16217-gc55ef90b69d3+ #1 Tainted: G          I       
<4>[    2.856642] ------------------------------------------------------
<4>[    2.856650] swapper/0/1 is trying to acquire lock:
<4>[    2.856657] ffffffff8360ecc8 (iommu_probe_device_lock){+.+.}-{3:3}, at: iommu_probe_device+0x1d/0x70
<4>[    2.856679] 
                  but task is already holding lock:
<4>[    2.856686] ffff888102ab6fa8 (&device->physical_node_lock){+.+.}-{3:3}, at: intel_iommu_init+0xea1/0x1220
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [2].
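
For readers unfamiliar with this class of warning: a "circular locking dependency" report means two locks have been observed being taken in conflicting orders. The sketch below is a simplified Python model of that idea, not the kernel's lockdep implementation. The first ordering comes from the excerpt above; the reverse ordering is an assumption added for illustration, since the excerpt does not show the rest of the dependency chain.

```python
from collections import defaultdict


def find_lock_cycle(observed_orders):
    """Return one cycle of lock names if the orderings conflict, else None.

    observed_orders is an iterable of (held, acquired) pairs, meaning
    "acquired" was taken while "held" was already held.
    """
    graph = defaultdict(set)
    for held, acquired in observed_orders:
        graph[held].add(acquired)

    def dfs(node, path, on_path):
        # Walk every lock that was taken while `node` was held.
        for nxt in sorted(graph.get(node, ())):
            if nxt in on_path:
                # We came back to a lock already on the current path.
                return path[path.index(nxt):] + [nxt]
            found = dfs(nxt, path + [nxt], on_path | {nxt})
            if found:
                return found
        return None

    for start in sorted(graph):
        cycle = dfs(start, [start], {start})
        if cycle:
            return cycle
    return None


orders = [
    # From the excerpt: physical_node_lock is held, iommu_probe_device_lock is wanted.
    ("&device->physical_node_lock", "iommu_probe_device_lock"),
    # ASSUMPTION (not shown in the excerpt): the opposite order seen on another path.
    ("iommu_probe_device_lock", "&device->physical_node_lock"),
]
cycle = find_lock_cycle(orders)
```

With only the first ordering recorded, the function returns None; the report is triggered once the opposite ordering is also known.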

After bisecting the tree, the following patch [3] appears to be the first "bad" commit:

`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit b150654f74bf0df8e6a7936d5ec51400d9ec06d8
Author: Lu Baolu <baolu.lu@linux.intel.com>
Date:   Fri Feb 28 18:27:26 2025 +0800

    iommu/vt-d: Fix suspicious RCU usage

`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that if we revert the patch the issue is not seen.
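
The bisection mentioned above can be pictured as a binary search over the commit history. The following is a minimal sketch under stated assumptions: the history is linear, the oldest commit is known good, the newest is known bad, and every commit after the first bad one is also bad. The commit names and the `is_bad` callback are hypothetical; this is not the CI's actual tooling.

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    commits is ordered oldest to newest; commits[0] is assumed good and
    commits[-1] is assumed bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression was introduced at mid or earlier
        else:
            good = mid  # the regression was introduced after mid
    return commits[bad]


# Hypothetical history of eight commits where "c5" introduces the warning.
history = [f"c{i}" for i in range(8)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 5)
```

Reverting the commit found this way and re-running the test, as described above, is an independent check that the search landed on the right change.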

Could you please check why the patch causes this regression and provide a fix if necessary?

Gitlab issue for the regression is [4].

Thank you.

Regards

Chaitanya

[1] https://intel-gfx-ci.01.org/tree/drm-tip/combined-alt.html?
[2] https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16276/fi-kbl-8809g/boot0.txt
[3] https://cgit.freedesktop.org/drm-tip/commit/?id=b150654f74bf0df8e6a7936d5ec51400d9ec06d8
[4] https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13818



* Regression on drm-tip
@ 2025-04-28  6:02 Borah, Chaitanya Kumar
  0 siblings, 0 replies; 22+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-28  6:02 UTC (permalink / raw)
  To: Hall, Christopher S
  Cc: intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	Keller,  Jacob E, intel-wired-lan@lists.osuosl.org,
	Saarinen, Jani, Kurmi, Suresh Kumar, De Marchi, Lucas

Hello Christopher,

This mail is regarding a regression we are seeing in our CI runs [1] on the drm-tip [2] repository.

`````````````````````````````````````````````````````````````````````````````````
<4>[    7.891028] =============================
<4>[    7.891293] [ BUG: Invalid wait context ]
<4>[    7.891526] 6.15.0-rc3-CI_DRM_16443-gdc80d6a10c1c+ #1 Tainted: G        W          
<4>[    7.891792] -----------------------------
<4>[    7.892070] (udev-worker)/286 is trying to lock:
<4>[    7.892349] ffff88811671bcc8 (&adapter->ptm_lock){....}-{3:3}, at: igc_ptp_reset+0x155/0x320 [igc]
<4>[    7.892660] other info that might help us debug this:
<4>[    7.892943] context-{4:4}
<4>[    7.893226] 2 locks held by (udev-worker)/286:
<4>[    7.893515]  #0: ffff888103bd41b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x104/0x220
<4>[    7.893823]  #1: ffff88811671bb70 (&adapter->tmreg_lock){....}-{2:2}, at: igc_ptp_reset+0x53/0x320 [igc]
<4>[    7.894134] stack backtrace:
<4>[    7.894439] CPU: 2 UID: 0 PID: 286 Comm: (udev-worker) Tainted: G        W           6.15.0-rc3-CI_DRM_16443-gdc80d6a10c1c+ #1 PREEMPT(voluntary) 
<4>[    7.894442] Tainted: [W]=WARN
<4>[    7.894443] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, BIOS TNTGL357.0067.2022.0718.1742 07/18/2022
`````````````````````````````````````````````````````````````````````````````````
Detailed log can be found in [3].
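
As a reading aid for the report above: the `{3:3}` and `{2:2}` suffixes are the lock's outer and inner wait types as printed by lockdep, and `context-{4:4}` is the limit imposed by the current execution context. The sketch below is a simplified Python model of the comparison, assuming a lock is rejected when its outer wait type is greater than the smallest inner wait type among the locks already held. It parses the lines quoted above and is not the kernel's implementation; the helper names are hypothetical.

```python
import re

# Matches the "(name){usage}-{outer:inner}" part of a lockdep lock line.
LOCK_RE = re.compile(
    r"\((?P<name>[^()]+)\)\{[^}]*\}-\{(?P<outer>\d+):(?P<inner>\d+)\}"
)
CONTEXT_RE = re.compile(r"context-\{(\d+):(\d+)\}")


def parse_lock(line):
    """Return (name, outer, inner) for one lock line of the report."""
    m = LOCK_RE.search(line)
    if m is None:
        raise ValueError(f"no lock description in: {line!r}")
    return m.group("name"), int(m.group("outer")), int(m.group("inner"))


def invalid_wait_context(wanted_line, held_lines, context_line):
    """ASSUMED rule: invalid if wanted outer > min(context, held inner types)."""
    _, wanted_outer, _ = parse_lock(wanted_line)
    limit = int(CONTEXT_RE.search(context_line).group(2))
    for line in held_lines:
        _, _, inner = parse_lock(line)
        limit = min(limit, inner)
    return wanted_outer > limit


wanted = "ffff88811671bcc8 (&adapter->ptm_lock){....}-{3:3}, at: igc_ptp_reset+0x155/0x320 [igc]"
held = [
    " #0: ffff888103bd41b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x104/0x220",
    " #1: ffff88811671bb70 (&adapter->tmreg_lock){....}-{2:2}, at: igc_ptp_reset+0x53/0x320 [igc]",
]
context = "context-{4:4}"
is_invalid = invalid_wait_context(wanted, held, context)
```

Under this model, `&adapter->ptm_lock` (wait type 3) is rejected because `&adapter->tmreg_lock` (wait type 2) is already held, which matches both acquisitions being reported inside `igc_ptp_reset`.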

After bisecting the tree, the following patch [4] seems to be the first "bad"
commit

`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit 1a931c4f5e6862e61a4b130cb76b422e1415f644
Author: Christopher S M Hall <christopher.s.hall@intel.com>
Date:   Tue Apr 1 16:35:34 2025 -0700

    igc: add lock preventing multiple simultaneous PTM transactions
`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that if we revert the patch the issue is not seen.

Could you please check why the patch causes this regression and provide a fix if necessary?

Thank you.

Regards

Chaitanya

[1] https://intel-gfx-ci.01.org/tree/drm-tip/shard-tglu.html
[2] https://cgit.freedesktop.org/drm-tip/tree/
[3] https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16443/fi-tgl-1115g4/boot0.txt
[4] https://cgit.freedesktop.org/drm-tip/commit/?id=1a931c4f5e6862e61a4b130cb76b422e1415f644


* REGRESSION on drm-tip
@ 2025-11-27  6:25 Borah, Chaitanya Kumar
  2025-11-27 16:01 ` Saarinen, Jani
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-11-27  6:25 UTC (permalink / raw)
  To: brauner
  Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Saarinen,  Jani, Kurmi, Suresh Kumar, Lucas De Marchi

Hello Christian,

This is Chaitanya (again!).

This mail is regarding another regression we are seeing in our CI 
runs[1] on drm-tip (with both xe and i915).

`````````````````````````````````````````````````````````````````````````````````
<4> [157.687644] ------------[ cut here ]------------
<4> [157.687768] WARNING: CPU: 5 PID: 2277 at kernel/freezer.c:139 __set_task_frozen+0x7f/0xb0
...
<4> [157.687923] PKRU: 55555554
<4> [157.687924] Call Trace:
<4> [157.687925]  <TASK>
<4> [157.687926]  ? __pfx___set_task_frozen+0x10/0x10
<4> [157.687929]  task_call_func+0x6d/0x120
<4> [157.687932]  ? cgroup_freezing+0x89/0x200
<4> [157.687937]  freeze_task+0x98/0x100
<4> [157.687940]  try_to_freeze_tasks+0xd2/0x440
<4> [157.687946]  freeze_processes+0x56/0xd0
<4> [157.687948]  hibernate+0x129/0x4a0
<4> [157.687951]  state_store+0xd3/0xe0
<4> [157.687954]  kobj_attr_store+0x12/0x40
<4> [157.687959]  sysfs_kf_write+0x4d/0x80
<4> [157.687963]  kernfs_fop_write_iter+0x188/0x240
<4> [157.687967]  vfs_write+0x283/0x540
<4> [157.687969]  ? free_to_partial_list+0x46d/0x640
<4> [157.687976]  ksys_write+0x6f/0xf0
<4> [157.687980]  __x64_sys_write+0x19/0x30
<4> [157.687982]  x64_sys_call+0x79/0x26a0
<4> [157.687984]  do_syscall_64+0x93/0xd60
<4> [157.687987]  ? putname+0x65/0x90
<4> [157.687990]  ? kmem_cache_free+0x553/0x680
<4> [157.687995]  ? putname+0x65/0x90
<4> [157.687997]  ? putname+0x65/0x90
<4> [157.687999]  ? do_sys_openat2+0x8b/0xd0
<4> [157.688003]  ? __x64_sys_openat+0x54/0xa0
<4> [157.688007]  ? do_syscall_64+0x1b7/0xd60
<4> [157.688009]  ? __fput+0x1bf/0x2f0
<4> [157.688012]  ? fput_close_sync+0x3d/0xa0
<4> [157.688015]  ? __x64_sys_close+0x3e/0x90
<4> [157.688017]  ? do_syscall_64+0x1b7/0xd60
<4> [157.688019]  ? putname+0x65/0x90
<4> [157.688021]  ? putname+0x65/0x90
<4> [157.688023]  ? do_sys_openat2+0x8b/0xd0
<4> [157.688024]  ? __fput+0x1bf/0x2f0
<4> [157.688028]  ? __x64_sys_openat+0x54/0xa0
<4> [157.688032]  ? do_syscall_64+0x1b7/0xd60
<4> [157.688034]  ? do_syscall_64+0x1b7/0xd60
<4> [157.688036]  ? irqentry_exit+0x77/0xb0
<4> [157.688038]  ? exc_page_fault+0xbd/0x2c0
<4> [157.688042]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
<4> [157.688044] RIP: 0033:0x72523c91c574
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [2].

After bisecting the tree, the following patch [3] seems to be the first 
"bad" commit

`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit a3f8f8662771285511ae26c4c8d3ba1cd22159b9
Author: Christian Brauner <brauner@kernel.org>
Date:   Wed Nov 5 14:39:45 2025 +0100

     power: always freeze efivarfs
`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that if we revert the patch the issue is not seen.

Could you please check why the patch causes this regression and provide 
a fix if necessary?

Thank you.

Regards

Chaitanya

[1] https://intel-gfx-ci.01.org/tree/drm-tip/index.html?testfilter=suspend
[2] https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_17595/shard-mtlp-6/igt@gem_exec_suspend@basic-s4-devices.html
[3] https://gitlab.com/freedesktop-mirror/drm-tip/-/commit/a3f8f8662771285511ae26c4c8d3ba1cd22159b9



Thread overview: 22+ messages
2025-03-13  8:51 Regression on drm-tip Borah, Chaitanya Kumar
2025-03-13  9:30 ` Baolu Lu
2025-03-13 14:23 ` Baolu Lu
2025-03-14  9:04   ` Borah, Chaitanya Kumar
2025-03-16  2:33     ` Baolu Lu
2025-03-16  7:27       ` Borah, Chaitanya Kumar
2025-03-16  8:03         ` Baolu Lu
2025-03-16 10:01           ` Borah, Chaitanya Kumar
2025-03-17  4:04             ` Baolu Lu
2025-03-22 20:59               ` Lucas De Marchi
2025-03-13 14:28 ` ✗ CI.Patch_applied: failure for " Patchwork
2025-03-16  8:15 ` ✗ CI.Patch_applied: failure for Regression on drm-tip (rev2) Patchwork
2025-03-18 10:15 ` Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2025-04-28  6:02 Regression on drm-tip Borah, Chaitanya Kumar
2025-11-27  6:25 REGRESSION " Borah, Chaitanya Kumar
2025-11-27 16:01 ` Saarinen, Jani
2025-11-27 16:06   ` Saarinen, Jani
2025-11-27 23:04 ` Ville Syrjälä
2025-11-28  7:46   ` Borah, Chaitanya Kumar
2025-12-05 10:14     ` Christian Brauner
2025-12-01 16:13   ` Saarinen, Jani
2025-12-05 10:14 ` Christian Brauner
