* [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO
@ 2025-11-05 23:21 Sultan Alsawaf
2025-11-06 18:46 ` Mario Limonciello
0 siblings, 1 reply; 9+ messages in thread
From: Sultan Alsawaf @ 2025-11-05 23:21 UTC (permalink / raw)
To: bin.du, pratap.nirujogi, mario.limonciello, amd-gfx, linux-kernel
Cc: alexander.deucher, benjamin.chan, christian.koenig, dantony,
gjorgji.rosikopulos, king.li, phil.jawich
From: Sultan Alsawaf <sultan@kerneltoast.com>
When the BO pointer provided to amdgpu_bo_create_kernel() points to
non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address
rather than allocate a new BO.
This functionality is never desired for allocating ISP buffers. A new BO
should always be created when isp_kernel_buffer_alloc() is called, per the
description for isp_kernel_buffer_alloc().
Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call.
Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers")
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c
index 9cddbf50442a..37270c4dab8d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c
@@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size,
if (ret)
return ret;
+ /* Ensure *bo is NULL so a new BO will be created */
+ *bo = NULL;
ret = amdgpu_bo_create_kernel(adev,
size,
ISP_MC_ADDR_ALIGN,
--
2.51.2
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO 2025-11-05 23:21 [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO Sultan Alsawaf @ 2025-11-06 18:46 ` Mario Limonciello 2025-11-06 18:58 ` Sultan Alsawaf 0 siblings, 1 reply; 9+ messages in thread From: Mario Limonciello @ 2025-11-06 18:46 UTC (permalink / raw) To: Sultan Alsawaf, bin.du, pratap.nirujogi, amd-gfx, linux-kernel Cc: alexander.deucher, benjamin.chan, christian.koenig, dantony, gjorgji.rosikopulos, king.li, phil.jawich On 11/5/25 5:21 PM, Sultan Alsawaf wrote: > From: Sultan Alsawaf <sultan@kerneltoast.com> > > When the BO pointer provided to amdgpu_bo_create_kernel() points to > non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address > rather than allocate a new BO. > > This functionality is never desired for allocating ISP buffers. A new BO > should always be created when isp_kernel_buffer_alloc() is called, per the > description for isp_kernel_buffer_alloc(). Are you just going off descriptions, or is there a problem with reuse? > > Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. > > Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") > Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > index 9cddbf50442a..37270c4dab8d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size, > if (ret) > return ret; > > + /* Ensure *bo is NULL so a new BO will be created */ > + *bo = NULL; > ret = amdgpu_bo_create_kernel(adev, > size, > ISP_MC_ADDR_ALIGN, ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO 2025-11-06 18:46 ` Mario Limonciello @ 2025-11-06 18:58 ` Sultan Alsawaf 2025-11-06 19:00 ` Mario Limonciello 2025-11-06 20:08 ` Nirujogi, Pratap 0 siblings, 2 replies; 9+ messages in thread From: Sultan Alsawaf @ 2025-11-06 18:58 UTC (permalink / raw) To: Mario Limonciello Cc: bin.du, pratap.nirujogi, amd-gfx, linux-kernel, alexander.deucher, benjamin.chan, christian.koenig, dantony, gjorgji.rosikopulos, king.li, phil.jawich On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote: > On 11/5/25 5:21 PM, Sultan Alsawaf wrote: > > From: Sultan Alsawaf <sultan@kerneltoast.com> > > > > When the BO pointer provided to amdgpu_bo_create_kernel() points to > > non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address > > rather than allocate a new BO. > > > > This functionality is never desired for allocating ISP buffers. A new BO > > should always be created when isp_kernel_buffer_alloc() is called, per the > > description for isp_kernel_buffer_alloc(). > > Are you just going off descriptions, or is there a problem with reuse? I am going based off the ISP4 driver's usage of isp_kernel_buffer_alloc(). This fixes the following crash I experienced on v5 of the ISP4 patchset: [ 175.485627] BUG: unable to handle page fault for address: 00007f6b1092e888 [ 175.485799] #PF: supervisor read access in kernel mode [ 175.485840] #PF: error_code(0x0000) - not-present page [ 175.485869] PGD 0 P4D 0 [ 175.485889] Oops: Oops: 0000 [#1] SMP [ 175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese Tainted: G U W 6.17.7 #1 PREEMPT [ 175.485921] Tainted: [U]=USER, [W]=WARN [ 175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025 [ 175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] [ 175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 [ 175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 [ 175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 [ 175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 [ 175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 [ 175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 [ 175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 [ 175.486037] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 [ 175.486046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 [ 175.486067] PKRU: 55555554 [ 175.486072] Call Trace: [ 175.486081] <TASK> [ 175.486092] ? smu_cmn_send_smc_msg_with_param+0xc0/0x360 [amdgpu] [ 175.486102] amdgpu_bo_create_kernel+0x15/0x70 [amdgpu] [ 175.486110] isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu] [ 175.486118] isp4if_gpu_mem_alloc.isra.0+0x45/0x70 [amd_capture] [ 175.486126] isp4if_start+0x3a/0x320 [amd_capture] [ 175.486141] isp4sd_set_power+0x96/0x1e0 [amd_capture] [ 175.486148] pipeline_pm_power_one+0xf2/0x110 [videodev] [ 175.486155] pipeline_pm_power+0x51/0x90 [videodev] [ 175.486161] v4l2_pipeline_pm_use+0x3b/0x60 [videodev] [ 175.486169] isp4vid_qops_start_streaming+0x22/0x140 [amd_capture] [ 175.486176] ? isp4vid_qops_buffer_queue+0xb1/0x140 [amd_capture] [ 175.486185] vb2_start_streaming+0x79/0x140 [videobuf2_common] [ 175.486192] vb2_core_qbuf+0x3b5/0x480 [videobuf2_common] [ 175.486200] vb2_qbuf+0x98/0x100 [videobuf2_v4l2] [ 175.486208] __video_do_ioctl+0x480/0x4b0 [videodev] [ 175.486219] video_usercopy+0x1e5/0x760 [videodev] [ 175.486226] ? v4l_s_output+0x50/0x50 [videodev] [ 175.486237] v4l2_ioctl+0x45/0x50 [videodev] [ 175.486245] __x64_sys_ioctl+0x77/0xc0 [ 175.486251] do_syscall_64+0x48/0xbf0 [ 175.486260] entry_SYSCALL_64_after_hwframe+0x50/0x58 [ 175.486268] RIP: 0033:0x7fb03371674d [ 175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 [ 175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX: 00007fb03371674d [ 175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI: 000000000000002c [ 175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09: 0000000000000210 [ 175.486313] R10: 0000000000000215 R11: 0000000000000246 R12: 00007faf9801c4b0 [ 175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15: 0000000000000001 [ 175.486324] </TASK> [ 175.486330] Modules linked in: ccm hid_sensor_prox hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63 snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi snd_hda_codec_atihdmi soundwire_amd soundwire_generic_allocation snd_hda_codec_hdmi [ 175.486373] soundwire_bus snd_soc_sdca snd_soc_core snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211 libphy amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee i2c_hid_acpi serial_multi_instantiate [ 175.486384] i2c_hid amd_sfh thunderbolt wireless_hotkey amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg crypto_user loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring serio_raw ccp nvme_auth [ 175.486394] CR2: 00007f6b1092e888 [ 175.486402] ---[ end trace 0000000000000000 ]--- [ 175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] [ 175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 [ 175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 [ 175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 [ 175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 [ 175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 [ 175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 [ 175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 [ 175.486455] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 [ 175.486460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 [ 175.486470] PKRU: 55555554 Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer argument (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1]. But since this issue slipped past Bin and presumably others who reviewed the code, it highlights that isp_kernel_buffer_alloc() is not working as expected in this respect, and the description for isp_kernel_buffer_alloc() reinforces this. > > > > Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. > > > > Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") > > Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > > index 9cddbf50442a..37270c4dab8d 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > > @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size, > > if (ret) > > return ret; > > + /* Ensure *bo is NULL so a new BO will be created */ > > + *bo = NULL; > > ret = amdgpu_bo_create_kernel(adev, > > size, > > ISP_MC_ADDR_ALIGN, > [1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/ Sultan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO 2025-11-06 18:58 ` Sultan Alsawaf @ 2025-11-06 19:00 ` Mario Limonciello 2025-11-06 20:08 ` Nirujogi, Pratap 1 sibling, 0 replies; 9+ messages in thread From: Mario Limonciello @ 2025-11-06 19:00 UTC (permalink / raw) To: Sultan Alsawaf Cc: bin.du, pratap.nirujogi, amd-gfx, linux-kernel, alexander.deucher, benjamin.chan, christian.koenig, dantony, gjorgji.rosikopulos, king.li, phil.jawich On 11/6/25 12:58 PM, Sultan Alsawaf wrote: > On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote: >> On 11/5/25 5:21 PM, Sultan Alsawaf wrote: >>> From: Sultan Alsawaf <sultan@kerneltoast.com> >>> >>> When the BO pointer provided to amdgpu_bo_create_kernel() points to >>> non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address >>> rather than allocate a new BO. >>> >>> This functionality is never desired for allocating ISP buffers. A new BO >>> should always be created when isp_kernel_buffer_alloc() is called, per the >>> description for isp_kernel_buffer_alloc(). >> >> Are you just going off descriptions, or is there a problem with reuse? > > I am going based off the ISP4 driver's usage of isp_kernel_buffer_alloc(). > > This fixes the following crash I experienced on v5 of the ISP4 patchset: > > [ 175.485627] BUG: unable to handle page fault for address: 00007f6b1092e888 > [ 175.485799] #PF: supervisor read access in kernel mode > [ 175.485840] #PF: error_code(0x0000) - not-present page > [ 175.485869] PGD 0 P4D 0 > [ 175.485889] Oops: Oops: 0000 [#1] SMP > [ 175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese Tainted: G U W 6.17.7 #1 PREEMPT > [ 175.485921] Tainted: [U]=USER, [W]=WARN > [ 175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025 > [ 175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] > [ 175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 > [ 175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 > [ 175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 > [ 175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 > [ 175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 > [ 175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 > [ 175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 > [ 175.486037] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 > [ 175.486046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 > [ 175.486067] PKRU: 55555554 > [ 175.486072] Call Trace: > [ 175.486081] <TASK> > [ 175.486092] ? smu_cmn_send_smc_msg_with_param+0xc0/0x360 [amdgpu] > [ 175.486102] amdgpu_bo_create_kernel+0x15/0x70 [amdgpu] > [ 175.486110] isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu] > [ 175.486118] isp4if_gpu_mem_alloc.isra.0+0x45/0x70 [amd_capture] > [ 175.486126] isp4if_start+0x3a/0x320 [amd_capture] > [ 175.486141] isp4sd_set_power+0x96/0x1e0 [amd_capture] > [ 175.486148] pipeline_pm_power_one+0xf2/0x110 [videodev] > [ 175.486155] pipeline_pm_power+0x51/0x90 [videodev] > [ 175.486161] v4l2_pipeline_pm_use+0x3b/0x60 [videodev] > [ 175.486169] isp4vid_qops_start_streaming+0x22/0x140 [amd_capture] > [ 175.486176] ? isp4vid_qops_buffer_queue+0xb1/0x140 [amd_capture] > [ 175.486185] vb2_start_streaming+0x79/0x140 [videobuf2_common] > [ 175.486192] vb2_core_qbuf+0x3b5/0x480 [videobuf2_common] > [ 175.486200] vb2_qbuf+0x98/0x100 [videobuf2_v4l2] > [ 175.486208] __video_do_ioctl+0x480/0x4b0 [videodev] > [ 175.486219] video_usercopy+0x1e5/0x760 [videodev] > [ 175.486226] ? v4l_s_output+0x50/0x50 [videodev] > [ 175.486237] v4l2_ioctl+0x45/0x50 [videodev] > [ 175.486245] __x64_sys_ioctl+0x77/0xc0 > [ 175.486251] do_syscall_64+0x48/0xbf0 > [ 175.486260] entry_SYSCALL_64_after_hwframe+0x50/0x58 > [ 175.486268] RIP: 0033:0x7fb03371674d > [ 175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 > [ 175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > [ 175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX: 00007fb03371674d > [ 175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI: 000000000000002c > [ 175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09: 0000000000000210 > [ 175.486313] R10: 0000000000000215 R11: 0000000000000246 R12: 00007faf9801c4b0 > [ 175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15: 0000000000000001 > [ 175.486324] </TASK> > [ 175.486330] Modules linked in: ccm hid_sensor_prox hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63 snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi snd_hda_codec_atihdmi soundwire_amd soundwire_generic_allocation snd_hda_codec_hdmi > [ 175.486373] soundwire_bus snd_soc_sdca snd_soc_core snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211 libphy amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee i2c_hid_acpi serial_multi_instantiate > [ 175.486384] i2c_hid amd_sfh thunderbolt wireless_hotkey amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg crypto_user loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring serio_raw ccp nvme_auth > [ 175.486394] CR2: 00007f6b1092e888 > [ 175.486402] ---[ end trace 0000000000000000 ]--- > [ 175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] > [ 175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 > [ 175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 > [ 175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 > [ 175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 > [ 175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 > [ 175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 > [ 175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 > [ 175.486455] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 > [ 175.486460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 > [ 175.486470] PKRU: 55555554 > > Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer argument > (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1]. But since this > issue slipped past Bin and presumably others who reviewed the code, it > highlights that isp_kernel_buffer_alloc() is not working as expected in this > respect, and the description for isp_kernel_buffer_alloc() reinforces this. Got it. Make sense to me now. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> > >>> >>> Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. >>> >>> Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") >>> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>> index 9cddbf50442a..37270c4dab8d 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>> @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size, >>> if (ret) >>> return ret; >>> + /* Ensure *bo is NULL so a new BO will be created */ >>> + *bo = NULL; >>> ret = amdgpu_bo_create_kernel(adev, >>> size, >>> ISP_MC_ADDR_ALIGN, >> > > [1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/ > > Sultan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO 2025-11-06 18:58 ` Sultan Alsawaf 2025-11-06 19:00 ` Mario Limonciello @ 2025-11-06 20:08 ` Nirujogi, Pratap 2025-11-06 20:31 ` Sultan Alsawaf 1 sibling, 1 reply; 9+ messages in thread From: Nirujogi, Pratap @ 2025-11-06 20:08 UTC (permalink / raw) To: Sultan Alsawaf, Mario Limonciello Cc: bin.du, pratap.nirujogi, amd-gfx, linux-kernel, alexander.deucher, benjamin.chan, christian.koenig, dantony, gjorgji.rosikopulos, king.li, phil.jawich On 11/6/2025 1:58 PM, Sultan Alsawaf wrote: > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote: >> On 11/5/25 5:21 PM, Sultan Alsawaf wrote: >>> From: Sultan Alsawaf <sultan@kerneltoast.com> >>> >>> When the BO pointer provided to amdgpu_bo_create_kernel() points to >>> non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address >>> rather than allocate a new BO. >>> >>> This functionality is never desired for allocating ISP buffers. A new BO >>> should always be created when isp_kernel_buffer_alloc() is called, per the >>> description for isp_kernel_buffer_alloc(). >> >> Are you just going off descriptions, or is there a problem with reuse? > > I am going based off the ISP4 driver's usage of isp_kernel_buffer_alloc(). > > This fixes the following crash I experienced on v5 of the ISP4 patchset: > > [ 175.485627] BUG: unable to handle page fault for address: 00007f6b1092e888 > [ 175.485799] #PF: supervisor read access in kernel mode > [ 175.485840] #PF: error_code(0x0000) - not-present page > [ 175.485869] PGD 0 P4D 0 > [ 175.485889] Oops: Oops: 0000 [#1] SMP > [ 175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese Tainted: G U W 6.17.7 #1 PREEMPT > [ 175.485921] Tainted: [U]=USER, [W]=WARN > [ 175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025 > [ 175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] > [ 175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 > [ 175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 > [ 175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 > [ 175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 > [ 175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 > [ 175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 > [ 175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 > [ 175.486037] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 > [ 175.486046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 > [ 175.486067] PKRU: 55555554 > [ 175.486072] Call Trace: > [ 175.486081] <TASK> > [ 175.486092] ? smu_cmn_send_smc_msg_with_param+0xc0/0x360 [amdgpu] > [ 175.486102] amdgpu_bo_create_kernel+0x15/0x70 [amdgpu] > [ 175.486110] isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu] > [ 175.486118] isp4if_gpu_mem_alloc.isra.0+0x45/0x70 [amd_capture] > [ 175.486126] isp4if_start+0x3a/0x320 [amd_capture] > [ 175.486141] isp4sd_set_power+0x96/0x1e0 [amd_capture] > [ 175.486148] pipeline_pm_power_one+0xf2/0x110 [videodev] > [ 175.486155] pipeline_pm_power+0x51/0x90 [videodev] > [ 175.486161] v4l2_pipeline_pm_use+0x3b/0x60 [videodev] > [ 175.486169] isp4vid_qops_start_streaming+0x22/0x140 [amd_capture] > [ 175.486176] ? isp4vid_qops_buffer_queue+0xb1/0x140 [amd_capture] > [ 175.486185] vb2_start_streaming+0x79/0x140 [videobuf2_common] > [ 175.486192] vb2_core_qbuf+0x3b5/0x480 [videobuf2_common] > [ 175.486200] vb2_qbuf+0x98/0x100 [videobuf2_v4l2] > [ 175.486208] __video_do_ioctl+0x480/0x4b0 [videodev] > [ 175.486219] video_usercopy+0x1e5/0x760 [videodev] > [ 175.486226] ? v4l_s_output+0x50/0x50 [videodev] > [ 175.486237] v4l2_ioctl+0x45/0x50 [videodev] > [ 175.486245] __x64_sys_ioctl+0x77/0xc0 > [ 175.486251] do_syscall_64+0x48/0xbf0 > [ 175.486260] entry_SYSCALL_64_after_hwframe+0x50/0x58 > [ 175.486268] RIP: 0033:0x7fb03371674d > [ 175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 > [ 175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > [ 175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX: 00007fb03371674d > [ 175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI: 000000000000002c > [ 175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09: 0000000000000210 > [ 175.486313] R10: 0000000000000215 R11: 0000000000000246 R12: 00007faf9801c4b0 > [ 175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15: 0000000000000001 > [ 175.486324] </TASK> > [ 175.486330] Modules linked in: ccm hid_sensor_prox hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63 snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi snd_hda_codec_atihdmi soundwire_amd soundwire_generic_allocation snd_hda_codec_hdmi > [ 175.486373] soundwire_bus snd_soc_sdca snd_soc_core snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211 libphy amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee i2c_hid_acpi serial_multi_instantiate > [ 175.486384] i2c_hid amd_sfh thunderbolt wireless_hotkey amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg crypto_user loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring serio_raw ccp nvme_auth > [ 175.486394] CR2: 00007f6b1092e888 > [ 175.486402] ---[ end trace 0000000000000000 ]--- > [ 175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] > [ 175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 > [ 175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 > [ 175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 > [ 175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 > [ 175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 > [ 175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 > [ 175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 > [ 175.486455] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 > [ 175.486460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 > [ 175.486470] PKRU: 55555554 > > Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer argument > (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1]. But since this > issue slipped past Bin and presumably others who reviewed the code, it > highlights that isp_kernel_buffer_alloc() is not working as expected in this > respect, and the description for isp_kernel_buffer_alloc() reinforces this. > Thanks Sultan for suggesting this fix. Yes, its hard to believe that this slipped through until now. Instead of initializing *bo=NULL, I suggest ensuring *bo is actually NULL before calling amdgpu_bo_create_kernel(). int isp_kernel_buffer_alloc(...) { ... if (WARN_ON(*bo)) return -EINVAL; ... ret = amdgpu_bo_create_kernel(..) ... } This way the caller will get to know parameters passed are invalid and can take care of initializing the handles appropriately. Thanks, Pratap >>> >>> Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. >>> >>> Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") >>> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>> index 9cddbf50442a..37270c4dab8d 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>> @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size, >>> if (ret) >>> return ret; >>> + /* Ensure *bo is NULL so a new BO will be created */ >>> + *bo = NULL; >>> ret = amdgpu_bo_create_kernel(adev, >>> size, >>> ISP_MC_ADDR_ALIGN, >> > > [1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/ > > Sultan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO 2025-11-06 20:08 ` Nirujogi, Pratap @ 2025-11-06 20:31 ` Sultan Alsawaf 2025-11-06 20:51 ` Nirujogi, Pratap 0 siblings, 1 reply; 9+ messages in thread From: Sultan Alsawaf @ 2025-11-06 20:31 UTC (permalink / raw) To: Nirujogi, Pratap Cc: Mario Limonciello, bin.du, pratap.nirujogi, amd-gfx, linux-kernel, alexander.deucher, benjamin.chan, christian.koenig, dantony, gjorgji.rosikopulos, king.li, phil.jawich On Thu, Nov 06, 2025 at 03:08:44PM -0500, Nirujogi, Pratap wrote: > > > On 11/6/2025 1:58 PM, Sultan Alsawaf wrote: > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > > > > On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote: > > > On 11/5/25 5:21 PM, Sultan Alsawaf wrote: > > > > From: Sultan Alsawaf <sultan@kerneltoast.com> > > > > > > > > When the BO pointer provided to amdgpu_bo_create_kernel() points to > > > > non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address > > > > rather than allocate a new BO. > > > > > > > > This functionality is never desired for allocating ISP buffers. A new BO > > > > should always be created when isp_kernel_buffer_alloc() is called, per the > > > > description for isp_kernel_buffer_alloc(). > > > > > > Are you just going off descriptions, or is there a problem with reuse? > > > > I am going based off the ISP4 driver's usage of isp_kernel_buffer_alloc(). > > > > This fixes the following crash I experienced on v5 of the ISP4 patchset: > > > > [ 175.485627] BUG: unable to handle page fault for address: 00007f6b1092e888 > > [ 175.485799] #PF: supervisor read access in kernel mode > > [ 175.485840] #PF: error_code(0x0000) - not-present page > > [ 175.485869] PGD 0 P4D 0 > > [ 175.485889] Oops: Oops: 0000 [#1] SMP > > [ 175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese Tainted: G U W 6.17.7 #1 PREEMPT > > [ 175.485921] Tainted: [U]=USER, [W]=WARN > > [ 175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025 > > [ 175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] > > [ 175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 > > [ 175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 > > [ 175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 > > [ 175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 > > [ 175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 > > [ 175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 > > [ 175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 > > [ 175.486037] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 > > [ 175.486046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 > > [ 175.486067] PKRU: 55555554 > > [ 175.486072] Call Trace: > > [ 175.486081] <TASK> > > [ 175.486092] ? smu_cmn_send_smc_msg_with_param+0xc0/0x360 [amdgpu] > > [ 175.486102] amdgpu_bo_create_kernel+0x15/0x70 [amdgpu] > > [ 175.486110] isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu] > > [ 175.486118] isp4if_gpu_mem_alloc.isra.0+0x45/0x70 [amd_capture] > > [ 175.486126] isp4if_start+0x3a/0x320 [amd_capture] > > [ 175.486141] isp4sd_set_power+0x96/0x1e0 [amd_capture] > > [ 175.486148] pipeline_pm_power_one+0xf2/0x110 [videodev] > > [ 175.486155] pipeline_pm_power+0x51/0x90 [videodev] > > [ 175.486161] v4l2_pipeline_pm_use+0x3b/0x60 [videodev] > > [ 175.486169] isp4vid_qops_start_streaming+0x22/0x140 [amd_capture] > > [ 175.486176] ? isp4vid_qops_buffer_queue+0xb1/0x140 [amd_capture] > > [ 175.486185] vb2_start_streaming+0x79/0x140 [videobuf2_common] > > [ 175.486192] vb2_core_qbuf+0x3b5/0x480 [videobuf2_common] > > [ 175.486200] vb2_qbuf+0x98/0x100 [videobuf2_v4l2] > > [ 175.486208] __video_do_ioctl+0x480/0x4b0 [videodev] > > [ 175.486219] video_usercopy+0x1e5/0x760 [videodev] > > [ 175.486226] ? v4l_s_output+0x50/0x50 [videodev] > > [ 175.486237] v4l2_ioctl+0x45/0x50 [videodev] > > [ 175.486245] __x64_sys_ioctl+0x77/0xc0 > > [ 175.486251] do_syscall_64+0x48/0xbf0 > > [ 175.486260] entry_SYSCALL_64_after_hwframe+0x50/0x58 > > [ 175.486268] RIP: 0033:0x7fb03371674d > > [ 175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 > > [ 175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 > > [ 175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX: 00007fb03371674d > > [ 175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI: 000000000000002c > > [ 175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09: 0000000000000210 > > [ 175.486313] R10: 0000000000000215 R11: 0000000000000246 R12: 00007faf9801c4b0 > > [ 175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15: 0000000000000001 > > [ 175.486324] </TASK> > > [ 175.486330] Modules linked in: ccm hid_sensor_prox hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63 snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi snd_hda_codec_atihdmi soundwire_amd soundwire_generic_allocation snd_hda_codec_hdmi > > [ 175.486373] soundwire_bus snd_soc_sdca snd_soc_core snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211 libphy amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee i2c_hid_acpi serial_multi_instantiate > > [ 175.486384] i2c_hid amd_sfh thunderbolt wireless_hotkey amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg crypto_user loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring serio_raw ccp nvme_auth > > [ 175.486394] CR2: 00007f6b1092e888 > > [ 175.486402] ---[ end trace 0000000000000000 ]--- > > [ 175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] > > [ 175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 > > [ 175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 > > [ 175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 > > [ 175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 > > [ 175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 > > [ 175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 > > [ 175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 > > [ 175.486455] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 > > [ 175.486460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 > > [ 175.486470] PKRU: 55555554 > > > > Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer argument > > (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1]. But since this > > issue slipped past Bin and presumably others who reviewed the code, it > > highlights that isp_kernel_buffer_alloc() is not working as expected in this > > respect, and the description for isp_kernel_buffer_alloc() reinforces this. > > > Thanks Sultan for suggesting this fix. Yes, its hard to believe that this > slipped through until now. > > Instead of initializing *bo=NULL, I suggest ensuring *bo is actually NULL > before calling amdgpu_bo_create_kernel(). > > int isp_kernel_buffer_alloc(...) > { > ... > if (WARN_ON(*bo)) > return -EINVAL; > ... > ret = amdgpu_bo_create_kernel(..) > ... > } > > This way the caller will get to know parameters passed are invalid and can > take care of initializing the handles appropriately. Hi Pratap, I am opposed to this idea for multiple reasons: 1. *bo is an output parameter from isp_kernel_buffer_alloc(), so the input value should not matter. 2. The only correct value for *bo is NULL when isp_kernel_buffer_alloc() passes it to amdgpu_bo_create_kernel(). Since there is only one correct value, there is no reason to burden callers of isp_kernel_buffer_alloc() with intializing *bo to NULL. 3. This adds more code, another WARN_ON(), and another error case to isp_kernel_buffer_alloc(). All of that can be eliminated entirely by just initializing *bo to NULL in isp_kernel_buffer_alloc(), as I've done. 4. The reason *bo needs to be NULL is an implementation detail that is internal to isp_kernel_buffer_alloc(), because amdgpu_bo_create_kernel() needs it to be NULL in order to allocate a new buffer. isp_kernel_buffer_alloc() should handle its own internal implementation details instead of punting the responsibility onto callers. Sultan > > Thanks, > Pratap > > > > > > > > > Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. > > > > > > > > Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") > > > > Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> > > > > --- > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ > > > > 1 file changed, 2 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > > > > index 9cddbf50442a..37270c4dab8d 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c > > > > @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size, > > > > if (ret) > > > > return ret; > > > > + /* Ensure *bo is NULL so a new BO will be created */ > > > > + *bo = NULL; > > > > ret = amdgpu_bo_create_kernel(adev, > > > > size, > > > > ISP_MC_ADDR_ALIGN, > > > > > > > [1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/ > > > > Sultan > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO 2025-11-06 20:31 ` Sultan Alsawaf @ 2025-11-06 20:51 ` Nirujogi, Pratap 2025-11-10 3:12 ` Du, Bin 0 siblings, 1 reply; 9+ messages in thread From: Nirujogi, Pratap @ 2025-11-06 20:51 UTC (permalink / raw) To: Sultan Alsawaf Cc: Mario Limonciello, bin.du, pratap.nirujogi, amd-gfx, linux-kernel, alexander.deucher, benjamin.chan, christian.koenig, dantony, gjorgji.rosikopulos, king.li, phil.jawich On 11/6/2025 3:31 PM, Sultan Alsawaf wrote: > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > On Thu, Nov 06, 2025 at 03:08:44PM -0500, Nirujogi, Pratap wrote: >> >> >> On 11/6/2025 1:58 PM, Sultan Alsawaf wrote: >>> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. >>> >>> >>> On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote: >>>> On 11/5/25 5:21 PM, Sultan Alsawaf wrote: >>>>> From: Sultan Alsawaf <sultan@kerneltoast.com> >>>>> >>>>> When the BO pointer provided to amdgpu_bo_create_kernel() points to >>>>> non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address >>>>> rather than allocate a new BO. >>>>> >>>>> This functionality is never desired for allocating ISP buffers. A new BO >>>>> should always be created when isp_kernel_buffer_alloc() is called, per the >>>>> description for isp_kernel_buffer_alloc(). >>>> >>>> Are you just going off descriptions, or is there a problem with reuse? >>> >>> I am going based off the ISP4 driver's usage of isp_kernel_buffer_alloc(). >>> >>> This fixes the following crash I experienced on v5 of the ISP4 patchset: >>> >>> [ 175.485627] BUG: unable to handle page fault for address: 00007f6b1092e888 >>> [ 175.485799] #PF: supervisor read access in kernel mode >>> [ 175.485840] #PF: error_code(0x0000) - not-present page >>> [ 175.485869] PGD 0 P4D 0 >>> [ 175.485889] Oops: Oops: 0000 [#1] SMP >>> [ 175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese Tainted: G U W 6.17.7 #1 PREEMPT >>> [ 175.485921] Tainted: [U]=USER, [W]=WARN >>> [ 175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025 >>> [ 175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] >>> [ 175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 >>> [ 175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 >>> [ 175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 >>> [ 175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 >>> [ 175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 >>> [ 175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 >>> [ 175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 >>> [ 175.486037] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 >>> [ 175.486046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 >>> [ 175.486067] PKRU: 55555554 >>> [ 175.486072] Call Trace: >>> [ 175.486081] <TASK> >>> [ 175.486092] ? smu_cmn_send_smc_msg_with_param+0xc0/0x360 [amdgpu] >>> [ 175.486102] amdgpu_bo_create_kernel+0x15/0x70 [amdgpu] >>> [ 175.486110] isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu] >>> [ 175.486118] isp4if_gpu_mem_alloc.isra.0+0x45/0x70 [amd_capture] >>> [ 175.486126] isp4if_start+0x3a/0x320 [amd_capture] >>> [ 175.486141] isp4sd_set_power+0x96/0x1e0 [amd_capture] >>> [ 175.486148] pipeline_pm_power_one+0xf2/0x110 [videodev] >>> [ 175.486155] pipeline_pm_power+0x51/0x90 [videodev] >>> [ 175.486161] v4l2_pipeline_pm_use+0x3b/0x60 [videodev] >>> [ 175.486169] isp4vid_qops_start_streaming+0x22/0x140 [amd_capture] >>> [ 175.486176] ? isp4vid_qops_buffer_queue+0xb1/0x140 [amd_capture] >>> [ 175.486185] vb2_start_streaming+0x79/0x140 [videobuf2_common] >>> [ 175.486192] vb2_core_qbuf+0x3b5/0x480 [videobuf2_common] >>> [ 175.486200] vb2_qbuf+0x98/0x100 [videobuf2_v4l2] >>> [ 175.486208] __video_do_ioctl+0x480/0x4b0 [videodev] >>> [ 175.486219] video_usercopy+0x1e5/0x760 [videodev] >>> [ 175.486226] ? v4l_s_output+0x50/0x50 [videodev] >>> [ 175.486237] v4l2_ioctl+0x45/0x50 [videodev] >>> [ 175.486245] __x64_sys_ioctl+0x77/0xc0 >>> [ 175.486251] do_syscall_64+0x48/0xbf0 >>> [ 175.486260] entry_SYSCALL_64_after_hwframe+0x50/0x58 >>> [ 175.486268] RIP: 0033:0x7fb03371674d >>> [ 175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 >>> [ 175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 >>> [ 175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX: 00007fb03371674d >>> [ 175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI: 000000000000002c >>> [ 175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09: 0000000000000210 >>> [ 175.486313] R10: 0000000000000215 R11: 0000000000000246 R12: 00007faf9801c4b0 >>> [ 175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15: 0000000000000001 >>> [ 175.486324] </TASK> >>> [ 175.486330] Modules linked in: ccm hid_sensor_prox hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63 snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi snd_hda_codec_atihdmi soundwire_amd soundwire_generic_allocation snd_hda_codec_hdmi >>> [ 175.486373] soundwire_bus snd_soc_sdca snd_soc_core snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211 libphy amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee i2c_hid_acpi serial_multi_instantiate >>> [ 175.486384] i2c_hid amd_sfh thunderbolt wireless_hotkey amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg crypto_user loop nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring serio_raw ccp nvme_auth >>> [ 175.486394] CR2: 00007f6b1092e888 >>> [ 175.486402] ---[ end trace 0000000000000000 ]--- >>> [ 175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 [amdgpu] >>> [ 175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 89 04 24 e8 d6 >>> [ 175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 >>> [ 175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: 0000000000000000 >>> [ 175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: ffff97b14e097b20 >>> [ 175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: ffff97b085af04c8 >>> [ 175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12: 0000000000000002 >>> [ 175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: ffff97b0a0f00000 >>> [ 175.486455] FS: 00007faf60ffe6c0(0000) GS:ffff97cfa7133000(0000) knlGS:0000000000000000 >>> [ 175.486460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: 0000000000f50ef0 >>> [ 175.486470] PKRU: 55555554 >>> >>> Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer argument >>> (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1]. But since this >>> issue slipped past Bin and presumably others who reviewed the code, it >>> highlights that isp_kernel_buffer_alloc() is not working as expected in this >>> respect, and the description for isp_kernel_buffer_alloc() reinforces this. >>> >> Thanks Sultan for suggesting this fix. Yes, its hard to believe that this >> slipped through until now. >> >> Instead of initializing *bo=NULL, I suggest ensuring *bo is actually NULL >> before calling amdgpu_bo_create_kernel(). >> >> int isp_kernel_buffer_alloc(...) >> { >> ... >> if (WARN_ON(*bo)) >> return -EINVAL; >> ... >> ret = amdgpu_bo_create_kernel(..) >> ... >> } >> >> This way the caller will get to know parameters passed are invalid and can >> take care of initializing the handles appropriately. > > Hi Pratap, > > I am opposed to this idea for multiple reasons: > > 1. *bo is an output parameter from isp_kernel_buffer_alloc(), so the input value > should not matter. > > 2. The only correct value for *bo is NULL when isp_kernel_buffer_alloc() passes > it to amdgpu_bo_create_kernel(). Since there is only one correct value, there > is no reason to burden callers of isp_kernel_buffer_alloc() with intializing > *bo to NULL. > > 3. This adds more code, another WARN_ON(), and another error case to > isp_kernel_buffer_alloc(). All of that can be eliminated entirely by just > initializing *bo to NULL in isp_kernel_buffer_alloc(), as I've done. > > 4. The reason *bo needs to be NULL is an implementation detail that is internal > to isp_kernel_buffer_alloc(), because amdgpu_bo_create_kernel() needs it to > be NULL in order to allocate a new buffer. isp_kernel_buffer_alloc() should > handle its own internal implementation details instead of punting the > responsibility onto callers. > > Sultan > Either approach works for me. My main intention is to ensure the caller passes the BO handle initialized to NULL in this case. That said, initializing *bo = NULL as you explained is perfectly fine too. Reviewed-by: Pratap Nirujogi <pratap.nirujogi@amd.com> >> >> Thanks, >> Pratap >> >>>>> >>>>> Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call. >>>>> >>>>> Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp buffers") >>>>> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ >>>>> 1 file changed, 2 insertions(+) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>> index 9cddbf50442a..37270c4dab8d 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>> @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device *dev, u64 size, >>>>> if (ret) >>>>> return ret; >>>>> + /* Ensure *bo is NULL so a new BO will be created */ >>>>> + *bo = NULL; >>>>> ret = amdgpu_bo_create_kernel(adev, >>>>> size, >>>>> ISP_MC_ADDR_ALIGN, >>>> >>> >>> [1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/ >>> >>> Sultan >> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO 2025-11-06 20:51 ` Nirujogi, Pratap @ 2025-11-10 3:12 ` Du, Bin 2025-11-10 16:26 ` Nirujogi, Pratap 0 siblings, 1 reply; 9+ messages in thread From: Du, Bin @ 2025-11-10 3:12 UTC (permalink / raw) To: Nirujogi, Pratap, Sultan Alsawaf Cc: Mario Limonciello, pratap.nirujogi, amd-gfx, linux-kernel, alexander.deucher, benjamin.chan, christian.koenig, dantony, gjorgji.rosikopulos, king.li, phil.jawich Thans Sultan & Pratap, So, based on the discussion, the ISP driver will retain its current implementation and won’t do unnecessary init to *bo before calling isp_kernel_buffer_alloc(). On 11/7/2025 4:51 AM, Nirujogi, Pratap wrote: > > > On 11/6/2025 3:31 PM, Sultan Alsawaf wrote: >> Caution: This message originated from an External Source. Use proper >> caution when opening attachments, clicking links, or responding. >> >> >> On Thu, Nov 06, 2025 at 03:08:44PM -0500, Nirujogi, Pratap wrote: >>> >>> >>> On 11/6/2025 1:58 PM, Sultan Alsawaf wrote: >>>> Caution: This message originated from an External Source. Use proper >>>> caution when opening attachments, clicking links, or responding. >>>> >>>> >>>> On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote: >>>>> On 11/5/25 5:21 PM, Sultan Alsawaf wrote: >>>>>> From: Sultan Alsawaf <sultan@kerneltoast.com> >>>>>> >>>>>> When the BO pointer provided to amdgpu_bo_create_kernel() points to >>>>>> non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that >>>>>> address >>>>>> rather than allocate a new BO. >>>>>> >>>>>> This functionality is never desired for allocating ISP buffers. A >>>>>> new BO >>>>>> should always be created when isp_kernel_buffer_alloc() is called, >>>>>> per the >>>>>> description for isp_kernel_buffer_alloc(). >>>>> >>>>> Are you just going off descriptions, or is there a problem with reuse? >>>> >>>> I am going based off the ISP4 driver's usage of >>>> isp_kernel_buffer_alloc(). >>>> >>>> This fixes the following crash I experienced on v5 of the ISP4 >>>> patchset: >>>> >>>> [ 175.485627] BUG: unable to handle page fault for address: >>>> 00007f6b1092e888 >>>> [ 175.485799] #PF: supervisor read access in kernel mode >>>> [ 175.485840] #PF: error_code(0x0000) - not-present page >>>> [ 175.485869] PGD 0 P4D 0 >>>> [ 175.485889] Oops: Oops: 0000 [#1] SMP >>>> [ 175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese >>>> Tainted: G U W 6.17.7 #1 PREEMPT >>>> [ 175.485921] Tainted: [U]=USER, [W]=WARN >>>> [ 175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch >>>> Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025 >>>> [ 175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 >>>> [amdgpu] >>>> [ 175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 >>>> 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f >>>> 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 >>>> 89 04 24 e8 d6 >>>> [ 175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 >>>> [ 175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: >>>> 0000000000000000 >>>> [ 175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: >>>> ffff97b14e097b20 >>>> [ 175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: >>>> ffff97b085af04c8 >>>> [ 175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12: >>>> 0000000000000002 >>>> [ 175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: >>>> ffff97b0a0f00000 >>>> [ 175.486037] FS: 00007faf60ffe6c0(0000) >>>> GS:ffff97cfa7133000(0000) knlGS:0000000000000000 >>>> [ 175.486046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>> [ 175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: >>>> 0000000000f50ef0 >>>> [ 175.486067] PKRU: 55555554 >>>> [ 175.486072] Call Trace: >>>> [ 175.486081] <TASK> >>>> [ 175.486092] ? smu_cmn_send_smc_msg_with_param+0xc0/0x360 >>>> [amdgpu] >>>> [ 175.486102] amdgpu_bo_create_kernel+0x15/0x70 [amdgpu] >>>> [ 175.486110] isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu] >>>> [ 175.486118] isp4if_gpu_mem_alloc.isra.0+0x45/0x70 [amd_capture] >>>> [ 175.486126] isp4if_start+0x3a/0x320 [amd_capture] >>>> [ 175.486141] isp4sd_set_power+0x96/0x1e0 [amd_capture] >>>> [ 175.486148] pipeline_pm_power_one+0xf2/0x110 [videodev] >>>> [ 175.486155] pipeline_pm_power+0x51/0x90 [videodev] >>>> [ 175.486161] v4l2_pipeline_pm_use+0x3b/0x60 [videodev] >>>> [ 175.486169] isp4vid_qops_start_streaming+0x22/0x140 >>>> [amd_capture] >>>> [ 175.486176] ? isp4vid_qops_buffer_queue+0xb1/0x140 >>>> [amd_capture] >>>> [ 175.486185] vb2_start_streaming+0x79/0x140 [videobuf2_common] >>>> [ 175.486192] vb2_core_qbuf+0x3b5/0x480 [videobuf2_common] >>>> [ 175.486200] vb2_qbuf+0x98/0x100 [videobuf2_v4l2] >>>> [ 175.486208] __video_do_ioctl+0x480/0x4b0 [videodev] >>>> [ 175.486219] video_usercopy+0x1e5/0x760 [videodev] >>>> [ 175.486226] ? v4l_s_output+0x50/0x50 [videodev] >>>> [ 175.486237] v4l2_ioctl+0x45/0x50 [videodev] >>>> [ 175.486245] __x64_sys_ioctl+0x77/0xc0 >>>> [ 175.486251] do_syscall_64+0x48/0xbf0 >>>> [ 175.486260] entry_SYSCALL_64_after_hwframe+0x50/0x58 >>>> [ 175.486268] RIP: 0033:0x7fb03371674d >>>> [ 175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d >>>> 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 >>>> 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 >>>> 25 28 00 00 00 >>>> [ 175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246 >>>> ORIG_RAX: 0000000000000010 >>>> [ 175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX: >>>> 00007fb03371674d >>>> [ 175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI: >>>> 000000000000002c >>>> [ 175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09: >>>> 0000000000000210 >>>> [ 175.486313] R10: 0000000000000215 R11: 0000000000000246 R12: >>>> 00007faf9801c4b0 >>>> [ 175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15: >>>> 0000000000000001 >>>> [ 175.486324] </TASK> >>>> [ 175.486330] Modules linked in: ccm hid_sensor_prox >>>> hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer >>>> kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm >>>> snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture >>>> videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc >>>> pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash >>>> algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat >>>> snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn >>>> snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm >>>> mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63 >>>> snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led >>>> snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 >>>> snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp >>>> snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp >>>> snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps >>>> snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi snd_hda_codec_atihdmi >>>> soundwire_amd soundwire_generic_allocation snd_hda_codec_hdmi >>>> [ 175.486373] soundwire_bus snd_soc_sdca snd_soc_core >>>> snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e >>>> drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x >>>> mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci >>>> cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach >>>> mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel >>>> snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg >>>> mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper >>>> btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm >>>> snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd >>>> mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi >>>> snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper >>>> snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211 libphy >>>> amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco >>>> hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video >>>> gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee >>>> i2c_hid_acpi serial_multi_instantiate >>>> [ 175.486384] i2c_hid amd_sfh thunderbolt wireless_hotkey >>>> amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg >>>> crypto_user loop nfnetlink ip_tables x_tables dm_crypt >>>> encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni >>>> ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring >>>> serio_raw ccp nvme_auth >>>> [ 175.486394] CR2: 00007f6b1092e888 >>>> [ 175.486402] ---[ end trace 0000000000000000 ]--- >>>> [ 175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 >>>> [amdgpu] >>>> [ 175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 >>>> 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 0f >>>> 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 48 >>>> 89 04 24 e8 d6 >>>> [ 175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 >>>> [ 175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: >>>> 0000000000000000 >>>> [ 175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: >>>> ffff97b14e097b20 >>>> [ 175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: >>>> ffff97b085af04c8 >>>> [ 175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12: >>>> 0000000000000002 >>>> [ 175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: >>>> ffff97b0a0f00000 >>>> [ 175.486455] FS: 00007faf60ffe6c0(0000) >>>> GS:ffff97cfa7133000(0000) knlGS:0000000000000000 >>>> [ 175.486460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>> [ 175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: >>>> 0000000000f50ef0 >>>> [ 175.486470] PKRU: 55555554 >>>> >>>> Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer >>>> argument >>>> (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1]. But >>>> since this >>>> issue slipped past Bin and presumably others who reviewed the code, it >>>> highlights that isp_kernel_buffer_alloc() is not working as expected >>>> in this >>>> respect, and the description for isp_kernel_buffer_alloc() >>>> reinforces this. >>>> >>> Thanks Sultan for suggesting this fix. Yes, its hard to believe that >>> this >>> slipped through until now. >>> >>> Instead of initializing *bo=NULL, I suggest ensuring *bo is actually >>> NULL >>> before calling amdgpu_bo_create_kernel(). >>> >>> int isp_kernel_buffer_alloc(...) >>> { >>> ... >>> if (WARN_ON(*bo)) >>> return -EINVAL; >>> ... >>> ret = amdgpu_bo_create_kernel(..) >>> ... >>> } >>> >>> This way the caller will get to know parameters passed are invalid >>> and can >>> take care of initializing the handles appropriately. >> >> Hi Pratap, >> >> I am opposed to this idea for multiple reasons: >> >> 1. *bo is an output parameter from isp_kernel_buffer_alloc(), so the >> input value >> should not matter. >> >> 2. The only correct value for *bo is NULL when >> isp_kernel_buffer_alloc() passes >> it to amdgpu_bo_create_kernel(). Since there is only one correct >> value, there >> is no reason to burden callers of isp_kernel_buffer_alloc() with >> intializing >> *bo to NULL. >> >> 3. This adds more code, another WARN_ON(), and another error case to >> isp_kernel_buffer_alloc(). All of that can be eliminated entirely >> by just >> initializing *bo to NULL in isp_kernel_buffer_alloc(), as I've done. >> >> 4. The reason *bo needs to be NULL is an implementation detail that is >> internal >> to isp_kernel_buffer_alloc(), because amdgpu_bo_create_kernel() >> needs it to >> be NULL in order to allocate a new buffer. >> isp_kernel_buffer_alloc() should >> handle its own internal implementation details instead of punting the >> responsibility onto callers. >> >> Sultan >> > Either approach works for me. My main intention is to ensure the caller > passes the BO handle initialized to NULL in this case. That said, > initializing *bo = NULL as you explained is perfectly fine too. > > Reviewed-by: Pratap Nirujogi <pratap.nirujogi@amd.com> > >>> >>> Thanks, >>> Pratap >>> >>>>>> >>>>>> Ensure this by zeroing *bo right before the >>>>>> amdgpu_bo_create_kernel() call. >>>>>> >>>>>> Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for isp >>>>>> buffers") >>>>>> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> >>>>>> --- >>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ >>>>>> 1 file changed, 2 insertions(+) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/ >>>>>> gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>>> index 9cddbf50442a..37270c4dab8d 100644 >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>>> @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device >>>>>> *dev, u64 size, >>>>>> if (ret) >>>>>> return ret; >>>>>> + /* Ensure *bo is NULL so a new BO will be created */ >>>>>> + *bo = NULL; >>>>>> ret = amdgpu_bo_create_kernel(adev, >>>>>> size, >>>>>> ISP_MC_ADDR_ALIGN, >>>>> >>>> >>>> [1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/ >>>> >>>> Sultan >>> > -- Regards, Bin ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO 2025-11-10 3:12 ` Du, Bin @ 2025-11-10 16:26 ` Nirujogi, Pratap 0 siblings, 0 replies; 9+ messages in thread From: Nirujogi, Pratap @ 2025-11-10 16:26 UTC (permalink / raw) To: Du, Bin, Sultan Alsawaf Cc: Mario Limonciello, pratap.nirujogi, amd-gfx, linux-kernel, alexander.deucher, benjamin.chan, christian.koenig, dantony, gjorgji.rosikopulos, king.li, phil.jawich Hi Bin, Yes, with this change applied, its not necessary to explicitly initialize the kernel bo handles in ISP driver. We can continue using kmalloc() in isp4if_gpu_mem_alloc() of ISP driver. Thanks, Pratap On 11/9/2025 10:12 PM, Du, Bin wrote: > Thans Sultan & Pratap, > > So, based on the discussion, the ISP driver will retain its current > implementation and won’t do unnecessary init to *bo before calling > isp_kernel_buffer_alloc(). > > On 11/7/2025 4:51 AM, Nirujogi, Pratap wrote: >> >> >> On 11/6/2025 3:31 PM, Sultan Alsawaf wrote: >>> Caution: This message originated from an External Source. Use proper >>> caution when opening attachments, clicking links, or responding. >>> >>> >>> On Thu, Nov 06, 2025 at 03:08:44PM -0500, Nirujogi, Pratap wrote: >>>> >>>> >>>> On 11/6/2025 1:58 PM, Sultan Alsawaf wrote: >>>>> Caution: This message originated from an External Source. Use >>>>> proper caution when opening attachments, clicking links, or >>>>> responding. >>>>> >>>>> >>>>> On Thu, Nov 06, 2025 at 12:46:51PM -0600, Mario Limonciello wrote: >>>>>> On 11/5/25 5:21 PM, Sultan Alsawaf wrote: >>>>>>> From: Sultan Alsawaf <sultan@kerneltoast.com> >>>>>>> >>>>>>> When the BO pointer provided to amdgpu_bo_create_kernel() points to >>>>>>> non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin >>>>>>> that address >>>>>>> rather than allocate a new BO. >>>>>>> >>>>>>> This functionality is never desired for allocating ISP buffers. A >>>>>>> new BO >>>>>>> should always be created when isp_kernel_buffer_alloc() is >>>>>>> called, per the >>>>>>> description for isp_kernel_buffer_alloc(). >>>>>> >>>>>> Are you just going off descriptions, or is there a problem with >>>>>> reuse? >>>>> >>>>> I am going based off the ISP4 driver's usage of >>>>> isp_kernel_buffer_alloc(). >>>>> >>>>> This fixes the following crash I experienced on v5 of the ISP4 >>>>> patchset: >>>>> >>>>> [ 175.485627] BUG: unable to handle page fault for address: >>>>> 00007f6b1092e888 >>>>> [ 175.485799] #PF: supervisor read access in kernel mode >>>>> [ 175.485840] #PF: error_code(0x0000) - not-present page >>>>> [ 175.485869] PGD 0 P4D 0 >>>>> [ 175.485889] Oops: Oops: 0000 [#1] SMP >>>>> [ 175.485908] CPU: 15 UID: 1000 PID: 57022 Comm: cheese >>>>> Tainted: G U W 6.17.7 #1 PREEMPT >>>>> [ 175.485921] Tainted: [U]=USER, [W]=WARN >>>>> [ 175.485933] Hardware name: HP HP ZBook Ultra G1a 14 inch >>>>> Mobile Workstation PC/8D01, BIOS X89 Ver. 01.03.00 04/25/2025 >>>>> [ 175.485943] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 >>>>> [amdgpu] >>>>> [ 175.485961] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 >>>>> 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 >>>>> 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 >>>>> 48 89 04 24 e8 d6 >>>>> [ 175.485976] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 >>>>> [ 175.485988] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: >>>>> 0000000000000000 >>>>> [ 175.486002] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: >>>>> ffff97b14e097b20 >>>>> [ 175.486012] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: >>>>> ffff97b085af04c8 >>>>> [ 175.486023] R10: 0000000000ffffff R11: ffff97b085af0560 R12: >>>>> 0000000000000002 >>>>> [ 175.486031] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: >>>>> ffff97b0a0f00000 >>>>> [ 175.486037] FS: 00007faf60ffe6c0(0000) >>>>> GS:ffff97cfa7133000(0000) knlGS:0000000000000000 >>>>> [ 175.486046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>> [ 175.486058] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: >>>>> 0000000000f50ef0 >>>>> [ 175.486067] PKRU: 55555554 >>>>> [ 175.486072] Call Trace: >>>>> [ 175.486081] <TASK> >>>>> [ 175.486092] ? smu_cmn_send_smc_msg_with_param+0xc0/0x360 >>>>> [amdgpu] >>>>> [ 175.486102] amdgpu_bo_create_kernel+0x15/0x70 [amdgpu] >>>>> [ 175.486110] isp_kernel_buffer_alloc+0x56/0xa0 [amdgpu] >>>>> [ 175.486118] isp4if_gpu_mem_alloc.isra.0+0x45/0x70 >>>>> [amd_capture] >>>>> [ 175.486126] isp4if_start+0x3a/0x320 [amd_capture] >>>>> [ 175.486141] isp4sd_set_power+0x96/0x1e0 [amd_capture] >>>>> [ 175.486148] pipeline_pm_power_one+0xf2/0x110 [videodev] >>>>> [ 175.486155] pipeline_pm_power+0x51/0x90 [videodev] >>>>> [ 175.486161] v4l2_pipeline_pm_use+0x3b/0x60 [videodev] >>>>> [ 175.486169] isp4vid_qops_start_streaming+0x22/0x140 >>>>> [amd_capture] >>>>> [ 175.486176] ? isp4vid_qops_buffer_queue+0xb1/0x140 >>>>> [amd_capture] >>>>> [ 175.486185] vb2_start_streaming+0x79/0x140 [videobuf2_common] >>>>> [ 175.486192] vb2_core_qbuf+0x3b5/0x480 [videobuf2_common] >>>>> [ 175.486200] vb2_qbuf+0x98/0x100 [videobuf2_v4l2] >>>>> [ 175.486208] __video_do_ioctl+0x480/0x4b0 [videodev] >>>>> [ 175.486219] video_usercopy+0x1e5/0x760 [videodev] >>>>> [ 175.486226] ? v4l_s_output+0x50/0x50 [videodev] >>>>> [ 175.486237] v4l2_ioctl+0x45/0x50 [videodev] >>>>> [ 175.486245] __x64_sys_ioctl+0x77/0xc0 >>>>> [ 175.486251] do_syscall_64+0x48/0xbf0 >>>>> [ 175.486260] entry_SYSCALL_64_after_hwframe+0x50/0x58 >>>>> [ 175.486268] RIP: 0033:0x7fb03371674d >>>>> [ 175.486275] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d >>>>> 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 >>>>> 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b >>>>> 04 25 28 00 00 00 >>>>> [ 175.486282] RSP: 002b:00007faf60ffc9d0 EFLAGS: 00000246 >>>>> ORIG_RAX: 0000000000000010 >>>>> [ 175.486292] RAX: ffffffffffffffda RBX: 00007fafb40050b0 RCX: >>>>> 00007fb03371674d >>>>> [ 175.486301] RDX: 00007faf60ffca70 RSI: 00000000c058560f RDI: >>>>> 000000000000002c >>>>> [ 175.486306] RBP: 00007faf60ffca20 R08: 13455f273d2513b9 R09: >>>>> 0000000000000210 >>>>> [ 175.486313] R10: 0000000000000215 R11: 0000000000000246 R12: >>>>> 00007faf9801c4b0 >>>>> [ 175.486318] R13: 0000000000000001 R14: 00007faf60ffcad0 R15: >>>>> 0000000000000001 >>>>> [ 175.486324] </TASK> >>>>> [ 175.486330] Modules linked in: ccm hid_sensor_prox >>>>> hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer >>>>> kfifo_buf hid_sensor_iio_common industrialio hid_sensor_hub rfcomm >>>>> snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amd_capture >>>>> videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc >>>>> pinctrl_amdisp i2c_designware_amdisp uhid cmac algif_hash >>>>> algif_skcipher af_alg bnep uinput nls_iso8859_1 vfat fat >>>>> snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn >>>>> snd_acp70 snd_acp_i2s snd_acp_pdm joydev snd_soc_dmic snd_acp_pcm >>>>> mousedev intel_rapl_msr snd_sof_amd_acp70 snd_sof_amd_acp63 >>>>> snd_hda_scodec_cs35l56_spi intel_rapl_common snd_ctl_led >>>>> snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_hda_codec_alc269 >>>>> snd_sof_amd_renoir snd_hda_scodec_component snd_sof_amd_acp >>>>> snd_hda_codec_realtek_lib snd_sof_pci snd_sof_xtensa_dsp >>>>> snd_hda_codec_generic snd_sof snd_sof_utils snd_pci_ps >>>>> snd_soc_acpi_amd_match amdgpu snd_amd_sdw_acpi >>>>> snd_hda_codec_atihdmi soundwire_amd soundwire_generic_allocation >>>>> snd_hda_codec_hdmi >>>>> [ 175.486373] soundwire_bus snd_soc_sdca snd_soc_core >>>>> snd_compress snd_hda_intel ac97_bus snd_pcm_dmaengine mt7925e >>>>> drm_panel_backlight_quirks amdxcp snd_hda_codec snd_rpl_pci_acp6x >>>>> mt7925_common btusb drm_buddy cdc_mbim mt792x_lib snd_acp_pci >>>>> cdc_wdm btrtl drm_exec snd_hda_core snd_amd_acpi_mach >>>>> mt76_connac_lib snd_hda_scodec_cs35l56_i2c btintel >>>>> snd_acp_legacy_common drm_suballoc_helper cdc_ncm snd_intel_dspcfg >>>>> mt76 snd_hda_scodec_cs35l56 snd_pci_acp6x cdc_ether drm_ttm_helper >>>>> btbcm snd_intel_sdw_acpi snd_hda_cirrus_scodec snd_hwdep usbnet ttm >>>>> snd_pci_acp5x btmtk snd_soc_cs35l56_shared ucsi_acpi kvm_amd >>>>> mac80211 snd_pcm r8152 i2c_algo_bit snd_rn_pci_acp3x typec_ucsi >>>>> snd_soc_cs_amp_lib libarc4 spd5118 bluetooth mii drm_display_helper >>>>> snd_timer cs_dsp kvm typec snd_acp_config hp_wmi snd cfg80211 >>>>> libphy amdxdna roles cec snd_soc_acpi ecdh_generic sp5100_tco >>>>> hid_multitouch irqbypass sparse_keymap rfkill rapl mdio_bus video >>>>> gpu_sched amd_pmf wmi_bmof snd_pci_acp3x soundcore amdtee >>>>> i2c_hid_acpi serial_multi_instantiate >>>>> [ 175.486384] i2c_hid amd_sfh thunderbolt wireless_hotkey >>>>> amd_pmc platform_profile wmi mac_hid i2c_piix4 i2c_smbus i2c_dev sg >>>>> crypto_user loop nfnetlink ip_tables x_tables dm_crypt >>>>> encrypted_keys trusted asn1_encoder tee dm_mod polyval_clmulni >>>>> ghash_clmulni_intel aesni_intel nvme nvme_core nvme_keyring >>>>> serio_raw ccp nvme_auth >>>>> [ 175.486394] CR2: 00007f6b1092e888 >>>>> [ 175.486402] ---[ end trace 0000000000000000 ]--- >>>>> [ 175.486409] RIP: 0010:amdgpu_bo_create_reserved+0xb1/0x1c0 >>>>> [amdgpu] >>>>> [ 175.486416] Code: 8b 30 44 89 64 24 20 48 89 44 24 28 c7 44 >>>>> 24 30 01 00 00 00 c7 44 24 1c b8 02 00 00 c6 44 24 08 00 4d 85 f6 >>>>> 0f 84 a9 00 00 00 <49> 8b 86 a8 01 00 00 49 8b be 40 01 00 00 31 f6 >>>>> 48 89 04 24 e8 d6 >>>>> [ 175.486422] RSP: 0018:ffff97b14e097ad0 EFLAGS: 00010202 >>>>> [ 175.486429] RAX: 0000000000000021 RBX: ffff97b085af04d0 RCX: >>>>> 0000000000000000 >>>>> [ 175.486433] RDX: 0000000000008000 RSI: ffff97b14e097ae0 RDI: >>>>> ffff97b14e097b20 >>>>> [ 175.486439] RBP: ffff97b085af04c8 R08: ffff97b085af04d8 R09: >>>>> ffff97b085af04c8 >>>>> [ 175.486444] R10: 0000000000ffffff R11: ffff97b085af0560 R12: >>>>> 0000000000000002 >>>>> [ 175.486448] R13: ffff97b085af04d8 R14: 00007f6b1092e6e0 R15: >>>>> ffff97b0a0f00000 >>>>> [ 175.486455] FS: 00007faf60ffe6c0(0000) >>>>> GS:ffff97cfa7133000(0000) knlGS:0000000000000000 >>>>> [ 175.486460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>> [ 175.486464] CR2: 00007f6b1092e888 CR3: 0000000101c3f000 CR4: >>>>> 0000000000f50ef0 >>>>> [ 175.486470] PKRU: 55555554 >>>>> >>>>> Admittedly, it's my fault that ISP4 stopped zeroing the BO pointer >>>>> argument >>>>> (&mem_info->mem_handle) passed to isp_kernel_buffer_alloc() [1]. >>>>> But since this >>>>> issue slipped past Bin and presumably others who reviewed the code, it >>>>> highlights that isp_kernel_buffer_alloc() is not working as >>>>> expected in this >>>>> respect, and the description for isp_kernel_buffer_alloc() >>>>> reinforces this. >>>>> >>>> Thanks Sultan for suggesting this fix. Yes, its hard to believe that >>>> this >>>> slipped through until now. >>>> >>>> Instead of initializing *bo=NULL, I suggest ensuring *bo is actually >>>> NULL >>>> before calling amdgpu_bo_create_kernel(). >>>> >>>> int isp_kernel_buffer_alloc(...) >>>> { >>>> ... >>>> if (WARN_ON(*bo)) >>>> return -EINVAL; >>>> ... >>>> ret = amdgpu_bo_create_kernel(..) >>>> ... >>>> } >>>> >>>> This way the caller will get to know parameters passed are invalid >>>> and can >>>> take care of initializing the handles appropriately. >>> >>> Hi Pratap, >>> >>> I am opposed to this idea for multiple reasons: >>> >>> 1. *bo is an output parameter from isp_kernel_buffer_alloc(), so the >>> input value >>> should not matter. >>> >>> 2. The only correct value for *bo is NULL when >>> isp_kernel_buffer_alloc() passes >>> it to amdgpu_bo_create_kernel(). Since there is only one correct >>> value, there >>> is no reason to burden callers of isp_kernel_buffer_alloc() with >>> intializing >>> *bo to NULL. >>> >>> 3. This adds more code, another WARN_ON(), and another error case to >>> isp_kernel_buffer_alloc(). All of that can be eliminated entirely >>> by just >>> initializing *bo to NULL in isp_kernel_buffer_alloc(), as I've done. >>> >>> 4. The reason *bo needs to be NULL is an implementation detail that >>> is internal >>> to isp_kernel_buffer_alloc(), because amdgpu_bo_create_kernel() >>> needs it to >>> be NULL in order to allocate a new buffer. >>> isp_kernel_buffer_alloc() should >>> handle its own internal implementation details instead of punting >>> the >>> responsibility onto callers. >>> >>> Sultan >>> >> Either approach works for me. My main intention is to ensure the >> caller passes the BO handle initialized to NULL in this case. That >> said, initializing *bo = NULL as you explained is perfectly fine too. >> >> Reviewed-by: Pratap Nirujogi <pratap.nirujogi@amd.com> >> >>>> >>>> Thanks, >>>> Pratap >>>> >>>>>>> >>>>>>> Ensure this by zeroing *bo right before the >>>>>>> amdgpu_bo_create_kernel() call. >>>>>>> >>>>>>> Fixes: 55d42f616976 ("drm/amd/amdgpu: Add helper functions for >>>>>>> isp buffers") >>>>>>> Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> >>>>>>> --- >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c | 2 ++ >>>>>>> 1 file changed, 2 insertions(+) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c b/drivers/ >>>>>>> gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>>>> index 9cddbf50442a..37270c4dab8d 100644 >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_isp.c >>>>>>> @@ -280,6 +280,8 @@ int isp_kernel_buffer_alloc(struct device >>>>>>> *dev, u64 size, >>>>>>> if (ret) >>>>>>> return ret; >>>>>>> + /* Ensure *bo is NULL so a new BO will be created */ >>>>>>> + *bo = NULL; >>>>>>> ret = amdgpu_bo_create_kernel(adev, >>>>>>> size, >>>>>>> ISP_MC_ADDR_ALIGN, >>>>>> >>>>> >>>>> [1] https://lore.kernel.org/all/aNB0P18ytI1KopWI@sultan-box/ >>>>> >>>>> Sultan >>>> >> > ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-11-10 16:26 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-11-05 23:21 [PATCH] drm/amd/amdgpu: Ensure isp_kernel_buffer_alloc() creates a new BO Sultan Alsawaf 2025-11-06 18:46 ` Mario Limonciello 2025-11-06 18:58 ` Sultan Alsawaf 2025-11-06 19:00 ` Mario Limonciello 2025-11-06 20:08 ` Nirujogi, Pratap 2025-11-06 20:31 ` Sultan Alsawaf 2025-11-06 20:51 ` Nirujogi, Pratap 2025-11-10 3:12 ` Du, Bin 2025-11-10 16:26 ` Nirujogi, Pratap
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox