From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: "José Roberto de Souza" <jose.souza@intel.com>
Cc: <intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH] drm/xe: Use xe_pm_runtime_get() in xe_ggtt_remove_node()
Date: Thu, 11 Jul 2024 16:14:43 -0400
Message-ID: <ZpA9M3dz2Kfj5yNM@intel.com>
In-Reply-To: <20240711200031.49798-1-jose.souza@intel.com>

On Thu, Jul 11, 2024 at 01:00:31PM -0700, José Roberto de Souza wrote:
> I don't see a relationship between drm_dev_enter() and pm_runtime.
> A device can still be plugged in with no one holding a PM refcount.
> 
> This is being triggered from ttm_bo_delayed_delete(), and I can't
> see anyone in the call chain taking a runtime PM reference before
> xe_ggtt_remove_node(), so replace xe_pm_runtime_get_noresume()
> with xe_pm_runtime_get() here.
> 
> This change will probably fix the kernel warning below:

It will remove this warning, but it will create a lockdep splat instead.

The right solution is this series:

https://lore.kernel.org/intel-xe/20240711171155.173717-12-rodrigo.vivi@intel.com/T/#u

Please help with review there.
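
For anyone following along, here is a rough userspace sketch of the
distinction being discussed. It is a toy model, not the xe code: every
name in it (toy_dev, toy_get, toy_get_noresume, toy_put) is hypothetical.
It assumes the "noresume" variant only bumps a usage count and expects
an outer caller to already hold the device awake, while the plain "get"
variant wakes the device itself. It does not model locking, which is
where the lockdep concern comes from.

```c
/* Userspace toy model only. None of these names are the xe driver API. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_dev {
	int usage;     /* runtime PM usage count */
	bool active;   /* true while the device is resumed */
	int warnings;  /* how many times outer protection was missing */
};

/* Take a reference and resume the device if it was suspended. */
static void toy_get(struct toy_dev *d)
{
	d->usage++;
	d->active = true;
}

/*
 * Take a reference without resuming. Only valid when some outer
 * caller already holds a reference, so warn when that is not true.
 */
static void toy_get_noresume(struct toy_dev *d)
{
	if (!(d->active && d->usage > 0)) {
		d->warnings++;
		fprintf(stderr, "Missing outer runtime PM protection\n");
	}
	d->usage++;
}

/* Drop a reference; the device may suspend once the count hits zero. */
static void toy_put(struct toy_dev *d)
{
	assert(d->usage > 0);
	if (--d->usage == 0)
		d->active = false;
}
```

In this model, calling toy_get_noresume() from a worker with no outer
toy_get() trips the warning, which mirrors the trace quoted below.
Switching to toy_get() silences it, but in the real driver a resuming
get can take locks, so the outer caller is the safer place for it.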

> 
> ------------[ cut here ]------------
> xe 0000:4d:00.0: [drm] Missing outer runtime PM protection
> WARNING: CPU: 100 PID: 3524 at drivers/gpu/drm/xe/xe_pm.c:551 xe_pm_runtime_get_noresume+0x48/0x60 [xe]
> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore mei_gsc xe drm_gpuvm video drm_ttm_helper ttm gpu_sched drm_suballoc_helper drm_exec drm_display_helper drm_kunit_helpers kunit drm_buddy intel_rapl_msr intel_rapl_common cmdlinepart spi_nor mtd intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nls_iso8859_1 nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rndis_host ast cdc_ether i2c_algo_bit dm_multipath dax_hmem i40e ixgbe scsi_dh_rdac drm_shmem_helper usbnet mei_me cxl_acpi scsi_dh_emc rapl scsi_dh_alua mii drm_kms_helper intel_cstate mdio cxl_core e1000e libie efi_pstore i2c_i801 intel_pch_thermal spi_intel_pci mei isst_if_mbox_pci i2c_smbus isst_if_mmio spi_intel isst_if_common intel_th_gth intel_th_pci ipmi_ssif ioatdma intel_vsec intel_th dca wmi ipmi_si
>  acpi_power_meter acpi_ipmi ipmi_devintf acpi_pad ipmi_msghandler mac_hid sch_fq_codel msr parport_pc ppdev lp parport drm ip_tables x_tables autofs4
> CPU: 100 PID: 3524 Comm: kworker/u580:4 Not tainted 6.10.0-rc5-xe #1
> Hardware name: Intel Corporation WHITLEY/WHITLEY, BIOS SE5C6200.86B.0027.P15.2205121306 05/12/2022
> Workqueue: ttm ttm_bo_delayed_delete [ttm]
> RIP: 0010:xe_pm_runtime_get_noresume+0x48/0x60 [xe]
> Code: cc cc cc 48 8b 7b 08 4c 8b 67 50 4d 85 e4 75 03 4c 8b 27 e8 aa bd f4 e0 4c 89 e2 48 c7 c7 d8 1a 03 a1 48 89 c6 e8 08 b7 32 e0 <0f> 0b 48 8b 43 08 f0 ff 80 f8 02 00 00 5b 41 5c 5d c3 cc cc cc cc
> RSP: 0000:ffa00000225afc00 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ff1100014c510000 RCX: 0000000000000027
> RDX: 0000000000000027 RSI: 0000000000000000 RDI: ff1100103fe31a48
> RBP: ffa00000225afc10 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000001 R11: 632da25ec9e647d2 R12: ff1100011d93b710
> R13: ff1100016cf6c448 R14: 0000000000000001 R15: ff1100014c510000
> FS:  0000000000000000(0000) GS:ff1100103fe00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4d34e18198 CR3: 000000000aa54006 CR4: 0000000000771ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
>  <TASK>
>  ? show_regs+0x67/0x70
>  ? __warn+0x94/0x1b0
>  ? xe_pm_runtime_get_noresume+0x48/0x60 [xe]
>  ? report_bug+0x1b7/0x1d0
>  ? handle_bug+0x46/0x80
>  ? exc_invalid_op+0x19/0x70
>  ? asm_exc_invalid_op+0x1b/0x20
>  ? xe_pm_runtime_get_noresume+0x48/0x60 [xe]
>  xe_ggtt_remove_node+0x99/0x110 [xe]
>  xe_ggtt_remove_bo+0x59/0x1d0 [xe]
>  ? _raw_write_unlock+0x23/0x50
>  ? drm_vma_offset_remove+0x66/0x80 [drm]
>  xe_ttm_bo_destroy+0x135/0x230 [xe]
>  ttm_bo_release+0x6e/0x320 [ttm]
>  ttm_bo_delayed_delete+0x82/0xa0 [ttm]
>  process_scheduled_works+0x3aa/0x750
>  worker_thread+0x14f/0x2f0
>  ? __pfx_worker_thread+0x10/0x10
>  kthread+0xf5/0x130
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork+0x39/0x60
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork_asm+0x1a/0x30
>  </TASK>
> irq event stamp: 26249
> hardirqs last  enabled at (26255): [<ffffffff811b8f51>] vprintk_emit+0x351/0x360
> hardirqs last disabled at (26260): [<ffffffff811b8f33>] vprintk_emit+0x333/0x360
> softirqs last  enabled at (25326): [<ffffffff810f04bf>] handle_softirqs+0x30f/0x430
> softirqs last disabled at (25319): [<ffffffff810f0e09>] irq_exit_rcu+0x89/0xb0
> ---[ end trace 0000000000000000 ]---
> 
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_ggtt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> index 0cdbc1296e885..13ce0f51f517a 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.c
> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> @@ -489,7 +489,7 @@ void xe_ggtt_remove_node(struct xe_ggtt *ggtt, struct drm_mm_node *node,
>  
>  	bound = drm_dev_enter(&xe->drm, &idx);
>  	if (bound)
> -		xe_pm_runtime_get_noresume(xe);
> +		xe_pm_runtime_get(xe);
>  
>  	mutex_lock(&ggtt->lock);
>  	if (bound)
> -- 
> 2.45.2
> 
