[PATCH 0/3] NVKM GSP RPC fixes

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/3] NVKM GSP RPC fixes
@ 2024-10-17  7:19 Zhi Wang
  2024-10-17  7:19 ` [PATCH 1/3] nvkm/gsp: correctly advance the read pointer of GSP message queue Zhi Wang
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Zhi Wang @ 2024-10-17  7:19 UTC (permalink / raw)
  To: nouveau
  Cc: airlied, daniel, dakr, bskeggs, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiw, zhiwang

Hi folks:

Here are some fixes of weird bugs I encountered when I was enabling vGPU [1] on
core-driver aka NVKM. They are exposed because of the new RPCs required by
vGPU.

For testing, I tried to run Uniengine Heaven[2] on my RTX 4060 for 8 hours and
the GL CTS runner[3] (commandline: ./cts-runner --type-gl40) from Khronos
without any problem.

v2:

- Remove the Fixes: tags as the vanilla nouveau aren't going to hit these bugs. (Danilo)
- Test the patchset on VK-GL-CTS. (Danilo)
- Remove the ambiguous empty line in PATCH 2. (Danilo)
- Rename the r535_gsp_msgq_wait to gsp_msgq_recv. (Danilo)
- Introduce a data structure to hold the params of gsp_msgq_recv(). (Danilo)
- Document the params and the states they are related to. (Danilo)

[1] https://lore.kernel.org/kvm/20240922124951.1946072-1-zhiw@nvidia.com/T/#t
[2] https://benchmark.unigine.com/heaven
[3] https://github.com/KhronosGroup/VK-GL-CTS

Zhi Wang (3):
  nvkm/gsp: correctly advance the read pointer of GSP message queue
  nvkm: correctly calculate the available space of the GSP cmdq buffer
  nvkm: handle the return of large RPC

 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 251 +++++++++++++-----
 1 file changed, 184 insertions(+), 67 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] nvkm/gsp: correctly advance the read pointer of GSP message queue
  2024-10-17  7:19 [PATCH 0/3] NVKM GSP RPC fixes Zhi Wang
@ 2024-10-17  7:19 ` Zhi Wang
  2024-10-17  7:19 ` [PATCH 2/3] nvkm: correctly calculate the available space of the GSP cmdq buffer Zhi Wang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Zhi Wang @ 2024-10-17  7:19 UTC (permalink / raw)
  To: nouveau
  Cc: airlied, daniel, dakr, bskeggs, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiw, zhiwang

A GSP event message consists three parts: message header, RPC header,
message body. GSP calculates the number of pages to write from the
total size of a GSP message. This behavior can be observed from the
movement of the write pointer.

However, nvkm takes only the size of RPC header and message body as
the message size when advancing the read pointer. When handling a
two-page GSP message in the non rollback case, It wrongly takes the
message body of the previous message as the message header of the next
message. As the "message length" tends to be zero, in the calculation of
size needs to be copied (0 - size of (message header)), the size needs to
be copied will be "0xffffffxx". It also triggers a kernel panic due to a
NULL pointer error.

[  547.614102] msg: 00000f90: ff ff ff ff ff ff ff ff 40 d7 18 fb 8b 00 00 00  ........@.......
[  547.622533] msg: 00000fa0: 00 00 00 00 ff ff ff ff ff ff ff ff 00 00 00 00  ................
[  547.630965] msg: 00000fb0: ff ff ff ff ff ff ff ff 00 00 00 00 ff ff ff ff  ................
[  547.639397] msg: 00000fc0: ff ff ff ff 00 00 00 00 ff ff ff ff ff ff ff ff  ................
[  547.647832] nvkm 0000:c1:00.0: gsp: peek msg rpc fn:0 len:0x0/0xffffffffffffffe0
[  547.655225] nvkm 0000:c1:00.0: gsp: get msg rpc fn:0 len:0x0/0xffffffffffffffe0
[  547.662532] BUG: kernel NULL pointer dereference, address: 0000000000000020
[  547.669485] #PF: supervisor read access in kernel mode
[  547.674624] #PF: error_code(0x0000) - not-present page
[  547.679755] PGD 0 P4D 0
[  547.682294] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  547.686643] CPU: 22 PID: 322 Comm: kworker/22:1 Tainted: G            E      6.9.0-rc6+ #1
[  547.694893] Hardware name: ASRockRack 1U1G-MILAN/N/ROMED8-NL, BIOS L3.12E 09/06/2022
[  547.702626] Workqueue: events r535_gsp_msgq_work [nvkm]
[  547.707921] RIP: 0010:r535_gsp_msg_recv+0x87/0x230 [nvkm]
[  547.713375] Code: 00 8b 70 08 48 89 e1 31 d2 4c 89 f7 e8 12 f5 ff ff 48 89 c5 48 85 c0 0f 84 cf 00 00 00 48 81 fd 00 f0 ff ff 0f 87 c4 00 00 00 <8b> 55 10 41 8b 46 30 85 d2 0f 85 f6 00 00 00 83 f8 04 76 10 ba 05
[  547.732119] RSP: 0018:ffffabe440f87e10 EFLAGS: 00010203
[  547.737335] RAX: 0000000000000010 RBX: 0000000000000008 RCX: 000000000000003f
[  547.744461] RDX: 0000000000000000 RSI: ffffabe4480a8030 RDI: 0000000000000010
[  547.751585] RBP: 0000000000000010 R08: 0000000000000000 R09: ffffabe440f87bb0
[  547.758707] R10: ffffabe440f87dc8 R11: 0000000000000010 R12: 0000000000000000
[  547.765834] R13: 0000000000000000 R14: ffff9351df1e5000 R15: 0000000000000000
[  547.772958] FS:  0000000000000000(0000) GS:ffff93708eb00000(0000) knlGS:0000000000000000
[  547.781035] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  547.786771] CR2: 0000000000000020 CR3: 00000003cc220002 CR4: 0000000000770ef0
[  547.793896] PKRU: 55555554
[  547.796600] Call Trace:
[  547.799046]  <TASK>
[  547.801152]  ? __die+0x20/0x70
[  547.804211]  ? page_fault_oops+0x75/0x170
[  547.808221]  ? print_hex_dump+0x100/0x160
[  547.812226]  ? exc_page_fault+0x64/0x150
[  547.816152]  ? asm_exc_page_fault+0x22/0x30
[  547.820341]  ? r535_gsp_msg_recv+0x87/0x230 [nvkm]
[  547.825184]  r535_gsp_msgq_work+0x42/0x50 [nvkm]
[  547.829845]  process_one_work+0x196/0x3d0
[  547.833861]  worker_thread+0x2fc/0x410
[  547.837613]  ? __pfx_worker_thread+0x10/0x10
[  547.841885]  kthread+0xdf/0x110
[  547.845031]  ? __pfx_kthread+0x10/0x10
[  547.848775]  ret_from_fork+0x30/0x50
[  547.852354]  ? __pfx_kthread+0x10/0x10
[  547.856097]  ret_from_fork_asm+0x1a/0x30
[  547.860019]  </TASK>
[  547.862208] Modules linked in: nvkm(E) gsp_log(E) snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) snd_timer(E) snd_seq_device(E) snd(E) soundcore(E) rfkill(E) qrtr(E) vfat(E) fat(E) ipmi_ssif(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) amd64_edac(E) mlx5_ib(E) edac_mce_amd(E) kvm_amd(E) ib_uverbs(E) kvm(E) ib_core(E) acpi_ipmi(E) ipmi_si(E) ipmi_devintf(E) mxm_wmi(E) joydev(E) rapl(E) ptdma(E) i2c_piix4(E) acpi_cpufreq(E) wmi_bmof(E) pcspkr(E) k10temp(E) ipmi_msghandler(E) xfs(E) libcrc32c(E) ast(E) i2c_algo_bit(E) drm_shmem_helper(E) crct10dif_pclmul(E) drm_kms_helper(E) ahci(E) crc32_pclmul(E) nvme_tcp(E) libahci(E) nvme(E) crc32c_intel(E) nvme_fabrics(E) cdc_ether(E) nvme_core(E) usbnet(E) mlx5_core(E) ghash_clmulni_intel(E) drm(E) libata(E) ccp(E) mii(E) t10_pi(E) mlxfw(E) sp5100_tco(E) psample(E) pci_hyperv_intf(E) wmi(E) dm_multipath(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) be2iscsi(E) bnx2i(E) cnic(E) uio(E) cxgb4i(E) cxgb4(E) tls(E) libcxgbi(E) libcxgb(E) qla4xxx(E)
[  547.862283]  iscsi_boot_sysfs(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) fuse(E) [last unloaded: gsp_log(E)]
[  547.962691] CR2: 0000000000000020
[  547.966003] ---[ end trace 0000000000000000 ]---
[  549.012012] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1370499158 wd_nsec: 1370498904
[  549.043676] pstore: backend (erst) writing error (-28)
[  549.050924] RIP: 0010:r535_gsp_msg_recv+0x87/0x230 [nvkm]
[  549.056389] Code: 00 8b 70 08 48 89 e1 31 d2 4c 89 f7 e8 12 f5 ff ff 48 89 c5 48 85 c0 0f 84 cf 00 00 00 48 81 fd 00 f0 ff ff 0f 87 c4 00 00 00 <8b> 55 10 41 8b 46 30 85 d2 0f 85 f6 00 00 00 83 f8 04 76 10 ba 05
[  549.075138] RSP: 0018:ffffabe440f87e10 EFLAGS: 00010203
[  549.080361] RAX: 0000000000000010 RBX: 0000000000000008 RCX: 000000000000003f
[  549.087484] RDX: 0000000000000000 RSI: ffffabe4480a8030 RDI: 0000000000000010
[  549.094609] RBP: 0000000000000010 R08: 0000000000000000 R09: ffffabe440f87bb0
[  549.101733] R10: ffffabe440f87dc8 R11: 0000000000000010 R12: 0000000000000000
[  549.108857] R13: 0000000000000000 R14: ffff9351df1e5000 R15: 0000000000000000
[  549.115982] FS:  0000000000000000(0000) GS:ffff93708eb00000(0000) knlGS:0000000000000000
[  549.124061] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  549.129807] CR2: 0000000000000020 CR3: 00000003cc220002 CR4: 0000000000770ef0
[  549.136940] PKRU: 55555554
[  549.139653] Kernel panic - not syncing: Fatal exception
[  549.145054] Kernel Offset: 0x18c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  549.165074] ---[ end Kernel panic - not syncing: Fatal exception ]---

Also, nvkm wrongly advances the read pointer when handling a two-page GSP
message in the rollback case. In the rollback case, the GSP message will
be copied in two rounds. When handling a two-page GSP message, nvkm first
copies amount of (GSP_PAGE_SIZE - header) data into the buffer, then
advances the read pointer by the result of DIV_ROUND_UP(size,
GSP_PAGE_SIZE). Thus, the read pointer is advanced by 1.

Next, nvkm copies the amount of (total size - (GSP_PAGE_SIZE -
header)) data into the buffer. The left amount of the data will be always
larger than one page since the message header is not taken into account
in the first copy. Thus, the read pointer is advanced by DIV_ROUND_UP(
size(larger than one page), GSP_PAGE_SIZE) = 2.

In the end, the read pointer is wrongly advanced by 3 when handling a
two-page GSP message in the rollback case.

Fix the problems by taking the total size of the message into account
when advancing the read pointer and calculate the read pointer in the end
of the all copies for the rollback case.

BTW: the two-page GSP message can be observed in the msgq when vGPU is
enabled.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index cf58f9da9139..736cde1987d0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -121,6 +121,8 @@ r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
 		return mqe->data;
 	}
 
+	size = ALIGN(repc + GSP_MSG_HDR_SIZE, GSP_PAGE_SIZE);
+
 	msg = kvmalloc(repc, GFP_KERNEL);
 	if (!msg)
 		return ERR_PTR(-ENOMEM);
@@ -129,19 +131,15 @@ r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
 	len = min_t(u32, repc, len);
 	memcpy(msg, mqe->data, len);
 
-	rptr += DIV_ROUND_UP(len, GSP_PAGE_SIZE);
-	if (rptr == gsp->msgq.cnt)
-		rptr = 0;
-
 	repc -= len;
 
 	if (repc) {
 		mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000);
 		memcpy(msg + len, mqe, repc);
-
-		rptr += DIV_ROUND_UP(repc, GSP_PAGE_SIZE);
 	}
 
+	rptr = (rptr + DIV_ROUND_UP(size, GSP_PAGE_SIZE)) % gsp->msgq.cnt;
+
 	mb();
 	(*gsp->msgq.rptr) = rptr;
 	return msg;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] nvkm: correctly calculate the available space of the GSP cmdq buffer
  2024-10-17  7:19 [PATCH 0/3] NVKM GSP RPC fixes Zhi Wang
  2024-10-17  7:19 ` [PATCH 1/3] nvkm/gsp: correctly advance the read pointer of GSP message queue Zhi Wang
@ 2024-10-17  7:19 ` Zhi Wang
  2024-10-17  7:19 ` [PATCH 3/3] nvkm: handle the return of large RPC Zhi Wang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Zhi Wang @ 2024-10-17  7:19 UTC (permalink / raw)
  To: nouveau
  Cc: airlied, daniel, dakr, bskeggs, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiw, zhiwang

r535_gsp_cmdq_push() waits for the available page in the GSP cmdq
buffer when handling a large RPC request. When it sees at least one
available page in the cmdq, it quits the waiting with the amount of
free buffer pages in the queue.

Unfortunately, it always takes the [write pointer, buf_size) as
available buffer pages before rolling back and wrongly calculates the
size of the data should be copied. Thus, it can overwrite the RPC
request that GSP is currently reading, which causes GSP hang due
to corrupted RPC request:

[  549.209389] ------------[ cut here ]------------
[  549.214010] WARNING: CPU: 8 PID: 6314 at drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:116 r535_gsp_msgq_wait+0xd0/0x190 [nvkm]
[  549.225678] Modules linked in: nvkm(E+) gsp_log(E) snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) snd_timer(E) snd_seq_device(E) snd(E) soundcore(E) rfkill(E) qrtr(E) vfat(E) fat(E) ipmi_ssif(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) mlx5_ib(E) amd64_edac(E) edac_mce_amd(E) kvm_amd(E) ib_uverbs(E) kvm(E) ib_core(E) acpi_ipmi(E) ipmi_si(E) mxm_wmi(E) ipmi_devintf(E) rapl(E) i2c_piix4(E) wmi_bmof(E) joydev(E) ptdma(E) acpi_cpufreq(E) k10temp(E) pcspkr(E) ipmi_msghandler(E) xfs(E) libcrc32c(E) ast(E) i2c_algo_bit(E) crct10dif_pclmul(E) drm_shmem_helper(E) nvme_tcp(E) crc32_pclmul(E) ahci(E) drm_kms_helper(E) libahci(E) nvme_fabrics(E) crc32c_intel(E) nvme(E) cdc_ether(E) mlx5_core(E) nvme_core(E) usbnet(E) drm(E) libata(E) ccp(E) ghash_clmulni_intel(E) mii(E) t10_pi(E) mlxfw(E) sp5100_tco(E) psample(E) pci_hyperv_intf(E) wmi(E) dm_multipath(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) be2iscsi(E) bnx2i(E) cnic(E) uio(E) cxgb4i(E) cxgb4(E) tls(E) libcxgbi(E) libcxgb(E) qla4xxx(E)
[  549.225752]  iscsi_boot_sysfs(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) fuse(E) [last unloaded: gsp_log(E)]
[  549.326293] CPU: 8 PID: 6314 Comm: insmod Tainted: G            E      6.9.0-rc6+ #1
[  549.334039] Hardware name: ASRockRack 1U1G-MILAN/N/ROMED8-NL, BIOS L3.12E 09/06/2022
[  549.341781] RIP: 0010:r535_gsp_msgq_wait+0xd0/0x190 [nvkm]
[  549.347343] Code: 08 00 00 89 da c1 e2 0c 48 8d ac 11 00 10 00 00 48 8b 0c 24 48 85 c9 74 1f c1 e0 0c 4c 8d 6d 30 83 e8 30 89 01 e9 68 ff ff ff <0f> 0b 49 c7 c5 92 ff ff ff e9 5a ff ff ff ba ff ff ff ff be c0 0c
[  549.366090] RSP: 0018:ffffacbccaaeb7d0 EFLAGS: 00010246
[  549.371315] RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000923e28
[  549.378451] RDX: 0000000000000000 RSI: 0000000055555554 RDI: ffffacbccaaeb730
[  549.385590] RBP: 0000000000000001 R08: ffff8bd14d235f70 R09: ffff8bd14d235f70
[  549.392721] R10: 0000000000000002 R11: ffff8bd14d233864 R12: 0000000000000020
[  549.399854] R13: ffffacbccaaeb818 R14: 0000000000000020 R15: ffff8bb298c67000
[  549.406988] FS:  00007f5179244740(0000) GS:ffff8bd14d200000(0000) knlGS:0000000000000000
[  549.415076] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  549.420829] CR2: 00007fa844000010 CR3: 00000001567dc005 CR4: 0000000000770ef0
[  549.427963] PKRU: 55555554
[  549.430672] Call Trace:
[  549.433126]  <TASK>
[  549.435233]  ? __warn+0x7f/0x130
[  549.438473]  ? r535_gsp_msgq_wait+0xd0/0x190 [nvkm]
[  549.443426]  ? report_bug+0x18a/0x1a0
[  549.447098]  ? handle_bug+0x3c/0x70
[  549.450589]  ? exc_invalid_op+0x14/0x70
[  549.454430]  ? asm_exc_invalid_op+0x16/0x20
[  549.458619]  ? r535_gsp_msgq_wait+0xd0/0x190 [nvkm]
[  549.463565]  r535_gsp_msg_recv+0x46/0x230 [nvkm]
[  549.468257]  r535_gsp_rpc_push+0x106/0x160 [nvkm]
[  549.473033]  r535_gsp_rpc_rm_ctrl_push+0x40/0x130 [nvkm]
[  549.478422]  nvidia_grid_init_vgpu_types+0xbc/0xe0 [nvkm]
[  549.483899]  nvidia_grid_init+0xb1/0xd0 [nvkm]
[  549.488420]  ? srso_alias_return_thunk+0x5/0xfbef5
[  549.493213]  nvkm_device_pci_probe+0x305/0x420 [nvkm]
[  549.498338]  local_pci_probe+0x46/0xa0
[  549.502096]  pci_call_probe+0x56/0x170
[  549.505851]  pci_device_probe+0x79/0xf0
[  549.509690]  ? driver_sysfs_add+0x59/0xc0
[  549.513702]  really_probe+0xd9/0x380
[  549.517282]  __driver_probe_device+0x78/0x150
[  549.521640]  driver_probe_device+0x1e/0x90
[  549.525746]  __driver_attach+0xd2/0x1c0
[  549.529594]  ? __pfx___driver_attach+0x10/0x10
[  549.534045]  bus_for_each_dev+0x78/0xd0
[  549.537893]  bus_add_driver+0x112/0x210
[  549.541750]  driver_register+0x5c/0x120
[  549.545596]  ? __pfx_nvkm_init+0x10/0x10 [nvkm]
[  549.550224]  do_one_initcall+0x44/0x300
[  549.554063]  ? do_init_module+0x23/0x240
[  549.557989]  do_init_module+0x64/0x240

Calculate the available buffer page before rolling back based on
the result from the waiting.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 736cde1987d0..50ae56013344 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -161,7 +161,7 @@ r535_gsp_cmdq_push(struct nvkm_gsp *gsp, void *argv)
 	u64 *end;
 	u64 csum = 0;
 	int free, time = 1000000;
-	u32 wptr, size;
+	u32 wptr, size, step;
 	u32 off = 0;
 
 	argc = ALIGN(GSP_MSG_HDR_SIZE + argc, GSP_PAGE_SIZE);
@@ -195,7 +195,9 @@ r535_gsp_cmdq_push(struct nvkm_gsp *gsp, void *argv)
 		}
 
 		cqe = (void *)((u8 *)gsp->shm.cmdq.ptr + 0x1000 + wptr * 0x1000);
-		size = min_t(u32, argc, (gsp->cmdq.cnt - wptr) * GSP_PAGE_SIZE);
+		step = min_t(u32, free, (gsp->cmdq.cnt - wptr));
+		size = min_t(u32, argc, step * GSP_PAGE_SIZE);
+
 		memcpy(cqe, (u8 *)cmd + off, size);
 
 		wptr += DIV_ROUND_UP(size, 0x1000);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] nvkm: handle the return of large RPC
  2024-10-17  7:19 [PATCH 0/3] NVKM GSP RPC fixes Zhi Wang
  2024-10-17  7:19 ` [PATCH 1/3] nvkm/gsp: correctly advance the read pointer of GSP message queue Zhi Wang
  2024-10-17  7:19 ` [PATCH 2/3] nvkm: correctly calculate the available space of the GSP cmdq buffer Zhi Wang
@ 2024-10-17  7:19 ` Zhi Wang
  2024-10-17 11:29   ` Danilo Krummrich
  2024-10-17 11:32 ` [PATCH 0/3] NVKM GSP RPC fixes Danilo Krummrich
  2024-11-25 16:50 ` Danilo Krummrich
  4 siblings, 1 reply; 9+ messages in thread
From: Zhi Wang @ 2024-10-17  7:19 UTC (permalink / raw)
  To: nouveau
  Cc: airlied, daniel, dakr, bskeggs, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiw, zhiwang

The max RPC size is 16 pages (including the RPC header). To send an RPC
larger than 16 pages, nvkm should split it into multiple RPCs and send
it accordingly. The first of the split RPCs has the expected function
number, while the rest of the split RPCs are sent with function number
as NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD. GSP will consume the split
RPCs from the cmdq and always write the result back to the msgq. The
result is also formed as split RPCs.

However, NVKM is able to send split RPC when dealing with large RPCs,
but totally not aware of handling the return of the large RPCs, which
are the split RPC in the msgq. Thus, it keeps dumping the unknown RPC
messages from msgq, which is actually CONTINUATION_RECORD message,
discard them unexpectly. Thus, the caller will not be able to consume
the result from GSP.

Introduce the handling of split RPCs on the msgq path. Slightly
re-factor the low-level part of receiving RPCs from the msgq, RPC
vehicle handling to merge the split RPCs back into a large RPC before
handling it to the upper level. Thus, the upper-level of RPC APIs don't
need to be heavily changed.

Signed-off-by: Zhi Wang <zhiw@nvidia.com>
---
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 237 +++++++++++++-----
 1 file changed, 177 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 50ae56013344..9c422644c9e7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -72,6 +72,21 @@ struct r535_gsp_msg {
 
 #define GSP_MSG_HDR_SIZE offsetof(struct r535_gsp_msg, data)
 
+struct nvfw_gsp_rpc {
+	u32 header_version;
+	u32 signature;
+	u32 length;
+	u32 function;
+	u32 rpc_result;
+	u32 rpc_result_private;
+	u32 sequence;
+	union {
+		u32 spare;
+		u32 cpuRmGfid;
+	};
+	u8  data[];
+};
+
 static int
 r535_rpc_status_to_errno(uint32_t rpc_status)
 {
@@ -86,16 +101,34 @@ r535_rpc_status_to_errno(uint32_t rpc_status)
 	}
 }
 
+struct gsp_msgq_recv_args {
+	/* timeout in us */
+	int time;
+	/* if set, peek the msgq, otherwise copy it */
+	u32 *prepc;
+	/*
+	 * the size (without message header) of message to
+	 * wait(when peek)/copy from the msgq
+	 */
+	u32 repc;
+	/* the message buffer */
+	u8 *msg;
+	/*
+	 * skip copying the rpc header, used when handling a large RPC.
+	 * rpc header only shows up in the first segment of a large RPC.
+	 */
+	bool skip_copy_rpc_header;
+};
+
 static void *
-r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
+gsp_msgq_recv(struct nvkm_gsp *gsp, struct gsp_msgq_recv_args *args)
 {
 	struct r535_gsp_msg *mqe;
-	u32 size, rptr = *gsp->msgq.rptr;
+	u32 rptr = *gsp->msgq.rptr;
 	int used;
-	u8 *msg;
-	u32 len;
+	u32 size, len, repc;
 
-	size = DIV_ROUND_UP(GSP_MSG_HDR_SIZE + repc, GSP_PAGE_SIZE);
+	size = DIV_ROUND_UP(GSP_MSG_HDR_SIZE + args->repc, GSP_PAGE_SIZE);
 	if (WARN_ON(!size || size >= gsp->msgq.cnt))
 		return ERR_PTR(-EINVAL);
 
@@ -109,46 +142,149 @@ r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
 			break;
 
 		usleep_range(1, 2);
-	} while (--(*ptime));
+	} while (--(args->time));
 
-	if (WARN_ON(!*ptime))
+	if (WARN_ON(!args->time))
 		return ERR_PTR(-ETIMEDOUT);
 
 	mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + rptr * 0x1000);
 
-	if (prepc) {
-		*prepc = (used * GSP_PAGE_SIZE) - sizeof(*mqe);
+	if (args->prepc) {
+		*args->prepc = (used * GSP_PAGE_SIZE) - sizeof(*mqe);
 		return mqe->data;
 	}
 
+	repc = args->repc;
 	size = ALIGN(repc + GSP_MSG_HDR_SIZE, GSP_PAGE_SIZE);
 
-	msg = kvmalloc(repc, GFP_KERNEL);
-	if (!msg)
-		return ERR_PTR(-ENOMEM);
-
 	len = ((gsp->msgq.cnt - rptr) * GSP_PAGE_SIZE) - sizeof(*mqe);
 	len = min_t(u32, repc, len);
-	memcpy(msg, mqe->data, len);
+	if (!args->skip_copy_rpc_header)
+		memcpy(args->msg, mqe->data, len);
+	else
+		memcpy(args->msg, mqe->data + sizeof(struct nvfw_gsp_rpc),
+		       len - sizeof(struct nvfw_gsp_rpc));
 
 	repc -= len;
 
 	if (repc) {
 		mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000);
-		memcpy(msg + len, mqe, repc);
+		memcpy(args->msg + len, mqe, repc);
 	}
 
 	rptr = (rptr + DIV_ROUND_UP(size, GSP_PAGE_SIZE)) % gsp->msgq.cnt;
 
 	mb();
 	(*gsp->msgq.rptr) = rptr;
-	return msg;
+	return args->msg;
+}
+
+static void
+r535_gsp_msg_dump(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg, int lvl)
+{
+	if (gsp->subdev.debug >= lvl) {
+		nvkm_printk__(&gsp->subdev, lvl, info,
+			      "msg fn:%d len:0x%x/0x%zx res:0x%x resp:0x%x\n",
+			      msg->function, msg->length, msg->length - sizeof(*msg),
+			      msg->rpc_result, msg->rpc_result_private);
+		print_hex_dump(KERN_INFO, "msg: ", DUMP_PREFIX_OFFSET, 16, 1,
+			       msg->data, msg->length - sizeof(*msg), true);
+	}
+}
+
+static void *
+r535_gsp_msgq_recv_continuation(struct nvkm_gsp *gsp, u32 *payload_size,
+				u8 *buf, int time)
+{
+	struct nvkm_subdev *subdev = &gsp->subdev;
+	struct nvfw_gsp_rpc *msg;
+	struct gsp_msgq_recv_args args = { 0 };
+	u32 size;
+
+	/* Peek the header of message */
+	args.time = time;
+	args.repc = sizeof(*msg);
+	args.prepc = &size;
+
+	msg = gsp_msgq_recv(gsp, &args);
+	if (IS_ERR_OR_NULL(msg))
+		return msg;
+
+	if (msg->function != NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD) {
+		nvkm_error(subdev, "Not a continuation of a large RPC\n");
+		r535_gsp_msg_dump(gsp, msg, NV_DBG_ERROR);
+		return ERR_PTR(-EIO);
+	}
+
+	*payload_size = msg->length - sizeof(*msg);
+
+	/* Recv the continuation message */
+	args.time = time;
+	args.repc = msg->length;
+	args.prepc = NULL;
+	args.msg = buf;
+	args.skip_copy_rpc_header = true;
+
+	return gsp_msgq_recv(gsp, &args);
 }
 
 static void *
-r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 repc, int *ptime)
+r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 msg_repc, u32 total_repc,
+		   int time)
 {
-	return r535_gsp_msgq_wait(gsp, repc, NULL, ptime);
+	struct gsp_msgq_recv_args args = { 0 };
+	struct nvfw_gsp_rpc *msg;
+	const u32 max_msg_size = (16 * 0x1000) - sizeof(struct r535_gsp_msg);
+	const u32 max_rpc_size = max_msg_size - sizeof(*msg);
+	u32 repc = total_repc;
+	u8 *buf, *next;
+
+	if (WARN_ON(msg_repc > max_msg_size))
+		return NULL;
+
+	buf = kvmalloc(max_t(u32, msg_repc, total_repc + sizeof(*msg)), GFP_KERNEL);
+	if (!buf)
+		return ERR_PTR(-ENOMEM);
+
+	/* Recv the message */
+	args.time = time;
+	args.repc = msg_repc;
+	args.prepc = NULL;
+	args.msg = buf;
+	args.skip_copy_rpc_header = false;
+
+	msg = gsp_msgq_recv(gsp, &args);
+	if (IS_ERR_OR_NULL(msg)) {
+		kfree(buf);
+		return msg;
+	}
+
+	if (total_repc <= max_rpc_size)
+		return buf;
+
+	/* Gather the message from the following continuation messages. */
+	next = buf;
+
+	next += msg_repc;
+	repc -= msg_repc - sizeof(*msg);
+
+	while (repc) {
+		struct nvfw_gsp_rpc *cont_msg;
+		u32 size;
+
+		cont_msg = r535_gsp_msgq_recv_continuation(gsp, &size, next,
+						      time);
+		if (IS_ERR_OR_NULL(cont_msg)) {
+			kfree(buf);
+			return cont_msg;
+		}
+		repc -= size;
+		next += size;
+	}
+
+	/* Patch the message length. The caller sees a consolidated message */
+	msg->length = total_repc + sizeof(*msg);
+	return buf;
 }
 
 static int
@@ -234,54 +370,33 @@ r535_gsp_cmdq_get(struct nvkm_gsp *gsp, u32 argc)
 	return cmd->data;
 }
 
-struct nvfw_gsp_rpc {
-	u32 header_version;
-	u32 signature;
-	u32 length;
-	u32 function;
-	u32 rpc_result;
-	u32 rpc_result_private;
-	u32 sequence;
-	union {
-		u32 spare;
-		u32 cpuRmGfid;
-	};
-	u8  data[];
-};
-
 static void
 r535_gsp_msg_done(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg)
 {
 	kvfree(msg);
 }
 
-static void
-r535_gsp_msg_dump(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg, int lvl)
-{
-	if (gsp->subdev.debug >= lvl) {
-		nvkm_printk__(&gsp->subdev, lvl, info,
-			      "msg fn:%d len:0x%x/0x%zx res:0x%x resp:0x%x\n",
-			      msg->function, msg->length, msg->length - sizeof(*msg),
-			      msg->rpc_result, msg->rpc_result_private);
-		print_hex_dump(KERN_INFO, "msg: ", DUMP_PREFIX_OFFSET, 16, 1,
-			       msg->data, msg->length - sizeof(*msg), true);
-	}
-}
-
 static struct nvfw_gsp_rpc *
 r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 repc)
 {
 	struct nvkm_subdev *subdev = &gsp->subdev;
+	struct gsp_msgq_recv_args args = { 0 };
 	struct nvfw_gsp_rpc *msg;
 	int time = 4000000, i;
 	u32 size;
 
 retry:
-	msg = r535_gsp_msgq_wait(gsp, sizeof(*msg), &size, &time);
+	/* Peek the header of message */
+	args.time = time;
+	args.repc = sizeof(*msg);
+	args.prepc = &size;
+
+	msg = gsp_msgq_recv(gsp, &args);
 	if (IS_ERR_OR_NULL(msg))
 		return msg;
 
-	msg = r535_gsp_msgq_recv(gsp, msg->length, &time);
+	/* Recv the message */
+	msg = r535_gsp_msgq_recv(gsp, msg->length, repc, time);
 	if (IS_ERR_OR_NULL(msg))
 		return msg;
 
@@ -734,6 +849,7 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
 	mutex_lock(&gsp->cmdq.mutex);
 	if (rpc_size > max_rpc_size) {
 		const u32 fn = rpc->function;
+		u32 remain_rpc_size = rpc_size;
 
 		/* Adjust length, and send initial RPC. */
 		rpc->length = sizeof(*rpc) + max_rpc_size;
@@ -744,11 +860,11 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
 			goto done;
 
 		argv += max_rpc_size;
-		rpc_size -= max_rpc_size;
+		remain_rpc_size -= max_rpc_size;
 
 		/* Remaining chunks sent as CONTINUATION_RECORD RPCs. */
-		while (rpc_size) {
-			u32 size = min(rpc_size, max_rpc_size);
+		while (remain_rpc_size) {
+			u32 size = min(remain_rpc_size, max_rpc_size);
 			void *next;
 
 			next = r535_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD, size);
@@ -764,19 +880,20 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
 				goto done;
 
 			argv += size;
-			rpc_size -= size;
+			remain_rpc_size -= size;
 		}
 
 		/* Wait for reply. */
-		if (wait) {
-			rpc = r535_gsp_msg_recv(gsp, fn, repc);
-			if (!IS_ERR_OR_NULL(rpc))
+		rpc = r535_gsp_msg_recv(gsp, fn, rpc_size);
+		if (!IS_ERR_OR_NULL(rpc)) {
+			if (wait)
 				repv = rpc->data;
-			else
-				repv = rpc;
-		} else {
-			repv = NULL;
-		}
+			else {
+				nvkm_gsp_rpc_done(gsp, rpc);
+				repv = NULL;
+			}
+		} else
+			repv = wait ? rpc : NULL;
 	} else {
 		repv = r535_gsp_rpc_send(gsp, argv, wait, repc);
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] nvkm: handle the return of large RPC
  2024-10-17  7:19 ` [PATCH 3/3] nvkm: handle the return of large RPC Zhi Wang
@ 2024-10-17 11:29   ` Danilo Krummrich
  2024-10-17 19:50     ` Zhi Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Danilo Krummrich @ 2024-10-17 11:29 UTC (permalink / raw)
  To: Zhi Wang
  Cc: nouveau, airlied, daniel, bskeggs, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiwang

On Thu, Oct 17, 2024 at 12:19:22AM -0700, Zhi Wang wrote:
> The max RPC size is 16 pages (including the RPC header). To send an RPC
> larger than 16 pages, nvkm should split it into multiple RPCs and send
> it accordingly. The first of the split RPCs has the expected function
> number, while the rest of the split RPCs are sent with function number
> as NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD. GSP will consume the split
> RPCs from the cmdq and always write the result back to the msgq. The
> result is also formed as split RPCs.
> 
> However, NVKM is able to send split RPC when dealing with large RPCs,
> but totally not aware of handling the return of the large RPCs, which
> are the split RPC in the msgq. Thus, it keeps dumping the unknown RPC
> messages from msgq, which is actually CONTINUATION_RECORD message,
> discard them unexpectly. Thus, the caller will not be able to consume
> the result from GSP.
> 
> Introduce the handling of split RPCs on the msgq path. Slightly
> re-factor the low-level part of receiving RPCs from the msgq, RPC
> vehicle handling to merge the split RPCs back into a large RPC before
> handling it to the upper level. Thus, the upper-level of RPC APIs don't
> need to be heavily changed.
> 
> Signed-off-by: Zhi Wang <zhiw@nvidia.com>
> ---
>  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 237 +++++++++++++-----
>  1 file changed, 177 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> index 50ae56013344..9c422644c9e7 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> @@ -72,6 +72,21 @@ struct r535_gsp_msg {
>  
>  #define GSP_MSG_HDR_SIZE offsetof(struct r535_gsp_msg, data)
>  
> +struct nvfw_gsp_rpc {
> +	u32 header_version;
> +	u32 signature;
> +	u32 length;
> +	u32 function;
> +	u32 rpc_result;
> +	u32 rpc_result_private;
> +	u32 sequence;
> +	union {
> +		u32 spare;
> +		u32 cpuRmGfid;
> +	};
> +	u8  data[];
> +};
> +
>  static int
>  r535_rpc_status_to_errno(uint32_t rpc_status)
>  {
> @@ -86,16 +101,34 @@ r535_rpc_status_to_errno(uint32_t rpc_status)
>  	}
>  }
>  
> +struct gsp_msgq_recv_args {

Since we're adding the necessary documentation to this structure please make
them proper kernel doc comments [1].

Please also make sure to explain things a bit more in detail.

For you NVIDIA folks it's likely way easier to understand the code, since you
probably have existing documentation of the GSP interface, are involved in the
design of the interface in the first place, etc.

Try to think about what's necessary for new people joining the project to
understand the code.

[1] https://docs.kernel.org/doc-guide/kernel-doc.html

> +	/* timeout in us */
> +	int time;

This is misleading, it's not really the time, but the number of sleep cycles.
It's just that the current cycle is set to usleep_range(1, 2).

Maybe we want to name it "retries", "retry_count", or just "retry" instead?

Besides that, do you see a need to continuously decrease the retry count at all?

To me it looks like we could just make it a constant and let the function that
needs the retry loop deal with it internally instead of passing it around
everywhere. Do I miss anything?

> +	/* if set, peek the msgq, otherwise copy it */
> +	u32 *prepc;

This sounds like if `prepc` would be a bool. Please add a description that
properly explains what the field actually represents.

> +	/*
> +	 * the size (without message header) of message to
> +	 * wait(when peek)/copy from the msgq
> +	 */
> +	u32 repc;

Since the term "repc" is used everywhere in this file, but is never explained,
maybe this would be a good place?

Do we know what it means? I mean, the semantics in this context is clear, but
otherwise it looks like some generic term like "argc"?

Otherwise, feel free to replace it with a less generic and more meaningful name.

> +	/* the message buffer */
> +	u8 *msg;

Please expand this a bit; the following questions might help:
  - What kind of message?
  - Is it always the same kind of message?
  - Who is supposed to allocate it?
  - How does it get freed?

> +	/*
> +	 * skip copying the rpc header, used when handling a large RPC.
> +	 * rpc header only shows up in the first segment of a large RPC.
> +	 */
> +	bool skip_copy_rpc_header;
> +};
> +
>  static void *
> -r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
> +gsp_msgq_recv(struct nvkm_gsp *gsp, struct gsp_msgq_recv_args *args)

For static functions it's probably fine to omit the "r535" prefix, but since
it's been used everywhere else, we may want to keep it for consistency.

Also please add a proper kernel doc comment to this function explaining its
semantics as well.

>  {
>  	struct r535_gsp_msg *mqe;
> -	u32 size, rptr = *gsp->msgq.rptr;
> +	u32 rptr = *gsp->msgq.rptr;
>  	int used;
> -	u8 *msg;
> -	u32 len;
> +	u32 size, len, repc;
>  
> -	size = DIV_ROUND_UP(GSP_MSG_HDR_SIZE + repc, GSP_PAGE_SIZE);
> +	size = DIV_ROUND_UP(GSP_MSG_HDR_SIZE + args->repc, GSP_PAGE_SIZE);
>  	if (WARN_ON(!size || size >= gsp->msgq.cnt))
>  		return ERR_PTR(-EINVAL);
>  
> @@ -109,46 +142,149 @@ r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
>  			break;
>  
>  		usleep_range(1, 2);
> -	} while (--(*ptime));
> +	} while (--(args->time));
>  
> -	if (WARN_ON(!*ptime))
> +	if (WARN_ON(!args->time))
>  		return ERR_PTR(-ETIMEDOUT);
>  
>  	mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + rptr * 0x1000);
>  
> -	if (prepc) {
> -		*prepc = (used * GSP_PAGE_SIZE) - sizeof(*mqe);
> +	if (args->prepc) {
> +		*args->prepc = (used * GSP_PAGE_SIZE) - sizeof(*mqe);
>  		return mqe->data;
>  	}
>  
> +	repc = args->repc;
>  	size = ALIGN(repc + GSP_MSG_HDR_SIZE, GSP_PAGE_SIZE);
>  
> -	msg = kvmalloc(repc, GFP_KERNEL);
> -	if (!msg)
> -		return ERR_PTR(-ENOMEM);
> -
>  	len = ((gsp->msgq.cnt - rptr) * GSP_PAGE_SIZE) - sizeof(*mqe);
>  	len = min_t(u32, repc, len);
> -	memcpy(msg, mqe->data, len);
> +	if (!args->skip_copy_rpc_header)
> +		memcpy(args->msg, mqe->data, len);
> +	else
> +		memcpy(args->msg, mqe->data + sizeof(struct nvfw_gsp_rpc),
> +		       len - sizeof(struct nvfw_gsp_rpc));
>  
>  	repc -= len;
>  
>  	if (repc) {
>  		mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000);
> -		memcpy(msg + len, mqe, repc);
> +		memcpy(args->msg + len, mqe, repc);
>  	}
>  
>  	rptr = (rptr + DIV_ROUND_UP(size, GSP_PAGE_SIZE)) % gsp->msgq.cnt;
>  
>  	mb();
>  	(*gsp->msgq.rptr) = rptr;
> -	return msg;
> +	return args->msg;
> +}
> +
> +static void
> +r535_gsp_msg_dump(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg, int lvl)
> +{
> +	if (gsp->subdev.debug >= lvl) {
> +		nvkm_printk__(&gsp->subdev, lvl, info,
> +			      "msg fn:%d len:0x%x/0x%zx res:0x%x resp:0x%x\n",
> +			      msg->function, msg->length, msg->length - sizeof(*msg),
> +			      msg->rpc_result, msg->rpc_result_private);
> +		print_hex_dump(KERN_INFO, "msg: ", DUMP_PREFIX_OFFSET, 16, 1,
> +			       msg->data, msg->length - sizeof(*msg), true);
> +	}
> +}
> +
> +static void *
> +r535_gsp_msgq_recv_continuation(struct nvkm_gsp *gsp, u32 *payload_size,
> +				u8 *buf, int time)

This looks like a great place to add an explanation about continuation records.

Someone who wants to join the project likely has no idea about RPCs and
continuation records.

Please add some documentation explaining what continuation records are, when
they are expected to be sent and what this function does to deal with them.

> +{
> +	struct nvkm_subdev *subdev = &gsp->subdev;
> +	struct nvfw_gsp_rpc *msg;
> +	struct gsp_msgq_recv_args args = { 0 };
> +	u32 size;
> +
> +	/* Peek the header of message */
> +	args.time = time;
> +	args.repc = sizeof(*msg);
> +	args.prepc = &size;
> +
> +	msg = gsp_msgq_recv(gsp, &args);
> +	if (IS_ERR_OR_NULL(msg))
> +		return msg;
> +
> +	if (msg->function != NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD) {
> +		nvkm_error(subdev, "Not a continuation of a large RPC\n");
> +		r535_gsp_msg_dump(gsp, msg, NV_DBG_ERROR);
> +		return ERR_PTR(-EIO);
> +	}
> +
> +	*payload_size = msg->length - sizeof(*msg);
> +
> +	/* Recv the continuation message */
> +	args.time = time;
> +	args.repc = msg->length;
> +	args.prepc = NULL;
> +	args.msg = buf;
> +	args.skip_copy_rpc_header = true;
> +
> +	return gsp_msgq_recv(gsp, &args);
>  }
>  
>  static void *
> -r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 repc, int *ptime)
> +r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 msg_repc, u32 total_repc,
> +		   int time)

Please add a kernel doc comment as well and explain the arguments. All the
different sizes quickly become very confusing.

It would also be good to establish a common naming scheme for all the different
sizes. Would you mind adding a separate patch that does that?

You can add comment at the beginning of the file explaining all the different
sizes and assign them a unique name that we then can use throughout the driver.

Also, I still find the "repc" thing confusing. If it's just a size why not just
call it "size"?

>  {
> -	return r535_gsp_msgq_wait(gsp, repc, NULL, ptime);
> +	struct gsp_msgq_recv_args args = { 0 };
> +	struct nvfw_gsp_rpc *msg;

Please try to avoid name collisions, such as naming this `msg`, which has
nothing to do with the below `args.msg`.

> +	const u32 max_msg_size = (16 * 0x1000) - sizeof(struct r535_gsp_msg);
> +	const u32 max_rpc_size = max_msg_size - sizeof(*msg);
> +	u32 repc = total_repc;
> +	u8 *buf, *next;
> +
> +	if (WARN_ON(msg_repc > max_msg_size))
> +		return NULL;
> +
> +	buf = kvmalloc(max_t(u32, msg_repc, total_repc + sizeof(*msg)), GFP_KERNEL);
> +	if (!buf)
> +		return ERR_PTR(-ENOMEM);
> +
> +	/* Recv the message */
> +	args.time = time;
> +	args.repc = msg_repc;
> +	args.prepc = NULL;
> +	args.msg = buf;
> +	args.skip_copy_rpc_header = false;
> +
> +	msg = gsp_msgq_recv(gsp, &args);
> +	if (IS_ERR_OR_NULL(msg)) {
> +		kfree(buf);
> +		return msg;
> +	}
> +
> +	if (total_repc <= max_rpc_size)
> +		return buf;
> +
> +	/* Gather the message from the following continuation messages. */
> +	next = buf;
> +
> +	next += msg_repc;
> +	repc -= msg_repc - sizeof(*msg);
> +
> +	while (repc) {
> +		struct nvfw_gsp_rpc *cont_msg;
> +		u32 size;
> +
> +		cont_msg = r535_gsp_msgq_recv_continuation(gsp, &size, next,
> +						      time);
> +		if (IS_ERR_OR_NULL(cont_msg)) {
> +			kfree(buf);
> +			return cont_msg;
> +		}
> +		repc -= size;
> +		next += size;
> +	}
> +
> +	/* Patch the message length. The caller sees a consolidated message */
> +	msg->length = total_repc + sizeof(*msg);
> +	return buf;
>  }
>  
>  static int
> @@ -234,54 +370,33 @@ r535_gsp_cmdq_get(struct nvkm_gsp *gsp, u32 argc)
>  	return cmd->data;
>  }
>  
> -struct nvfw_gsp_rpc {
> -	u32 header_version;
> -	u32 signature;
> -	u32 length;
> -	u32 function;
> -	u32 rpc_result;
> -	u32 rpc_result_private;
> -	u32 sequence;
> -	union {
> -		u32 spare;
> -		u32 cpuRmGfid;
> -	};
> -	u8  data[];
> -};
> -
>  static void
>  r535_gsp_msg_done(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg)
>  {
>  	kvfree(msg);
>  }
>  
> -static void
> -r535_gsp_msg_dump(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg, int lvl)
> -{
> -	if (gsp->subdev.debug >= lvl) {
> -		nvkm_printk__(&gsp->subdev, lvl, info,
> -			      "msg fn:%d len:0x%x/0x%zx res:0x%x resp:0x%x\n",
> -			      msg->function, msg->length, msg->length - sizeof(*msg),
> -			      msg->rpc_result, msg->rpc_result_private);
> -		print_hex_dump(KERN_INFO, "msg: ", DUMP_PREFIX_OFFSET, 16, 1,
> -			       msg->data, msg->length - sizeof(*msg), true);
> -	}
> -}
> -
>  static struct nvfw_gsp_rpc *
>  r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 repc)
>  {
>  	struct nvkm_subdev *subdev = &gsp->subdev;
> +	struct gsp_msgq_recv_args args = { 0 };
>  	struct nvfw_gsp_rpc *msg;
>  	int time = 4000000, i;
>  	u32 size;
>  
>  retry:
> -	msg = r535_gsp_msgq_wait(gsp, sizeof(*msg), &size, &time);
> +	/* Peek the header of message */
> +	args.time = time;
> +	args.repc = sizeof(*msg);
> +	args.prepc = &size;
> +
> +	msg = gsp_msgq_recv(gsp, &args);
>  	if (IS_ERR_OR_NULL(msg))
>  		return msg;
>  
> -	msg = r535_gsp_msgq_recv(gsp, msg->length, &time);
> +	/* Recv the message */
> +	msg = r535_gsp_msgq_recv(gsp, msg->length, repc, time);
>  	if (IS_ERR_OR_NULL(msg))
>  		return msg;
>  
> @@ -734,6 +849,7 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
>  	mutex_lock(&gsp->cmdq.mutex);
>  	if (rpc_size > max_rpc_size) {
>  		const u32 fn = rpc->function;
> +		u32 remain_rpc_size = rpc_size;
>  
>  		/* Adjust length, and send initial RPC. */
>  		rpc->length = sizeof(*rpc) + max_rpc_size;
> @@ -744,11 +860,11 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
>  			goto done;
>  
>  		argv += max_rpc_size;
> -		rpc_size -= max_rpc_size;
> +		remain_rpc_size -= max_rpc_size;
>  
>  		/* Remaining chunks sent as CONTINUATION_RECORD RPCs. */
> -		while (rpc_size) {
> -			u32 size = min(rpc_size, max_rpc_size);
> +		while (remain_rpc_size) {
> +			u32 size = min(remain_rpc_size, max_rpc_size);
>  			void *next;
>  
>  			next = r535_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD, size);
> @@ -764,19 +880,20 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
>  				goto done;
>  
>  			argv += size;
> -			rpc_size -= size;
> +			remain_rpc_size -= size;
>  		}
>  
>  		/* Wait for reply. */
> -		if (wait) {
> -			rpc = r535_gsp_msg_recv(gsp, fn, repc);
> -			if (!IS_ERR_OR_NULL(rpc))
> +		rpc = r535_gsp_msg_recv(gsp, fn, rpc_size);
> +		if (!IS_ERR_OR_NULL(rpc)) {
> +			if (wait)
>  				repv = rpc->data;
> -			else
> -				repv = rpc;
> -		} else {
> -			repv = NULL;
> -		}
> +			else {
> +				nvkm_gsp_rpc_done(gsp, rpc);
> +				repv = NULL;
> +			}
> +		} else
> +			repv = wait ? rpc : NULL;
>  	} else {
>  		repv = r535_gsp_rpc_send(gsp, argv, wait, repc);
>  	}
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] NVKM GSP RPC fixes
  2024-10-17  7:19 [PATCH 0/3] NVKM GSP RPC fixes Zhi Wang
                   ` (2 preceding siblings ...)
  2024-10-17  7:19 ` [PATCH 3/3] nvkm: handle the return of large RPC Zhi Wang
@ 2024-10-17 11:32 ` Danilo Krummrich
  2024-11-25 16:50 ` Danilo Krummrich
  4 siblings, 0 replies; 9+ messages in thread
From: Danilo Krummrich @ 2024-10-17 11:32 UTC (permalink / raw)
  To: Zhi Wang
  Cc: nouveau, airlied, daniel, bskeggs, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiwang

On Thu, Oct 17, 2024 at 12:19:19AM -0700, Zhi Wang wrote:
> Hi folks:
> 
> Here are some fixes of weird bugs I encountered when I was enabling vGPU [1] on
> core-driver aka NVKM. They are exposed because of the new RPCs required by
> vGPU.
> 
> For testing, I tried to run Uniengine Heaven[2] on my RTX 4060 for 8 hours and
> the GL CTS runner[3] (commandline: ./cts-runner --type-gl40) from Khronos
> without any problem.
> 
> v2:
> 
> - Remove the Fixes: tags as the vanilla nouveau aren't going to hit these bugs. (Danilo)
> - Test the patchset on VK-GL-CTS. (Danilo)
> - Remove the ambiguous empty line in PATCH 2. (Danilo)
> - Rename the r535_gsp_msgq_wait to gsp_msgq_recv. (Danilo)
> - Introduce a data structure to hold the params of gsp_msgq_recv(). (Danilo)
> - Document the params and the states they are related to. (Danilo)

When you send a v2, please make sure to pass `-v2` to `git format-patch`,
otherwise it's hard to distinguish patch versions from their subject.

> 
> [1] https://lore.kernel.org/kvm/20240922124951.1946072-1-zhiw@nvidia.com/T/#t
> [2] https://benchmark.unigine.com/heaven
> [3] https://github.com/KhronosGroup/VK-GL-CTS
> 
> Zhi Wang (3):
>   nvkm/gsp: correctly advance the read pointer of GSP message queue
>   nvkm: correctly calculate the available space of the GSP cmdq buffer
>   nvkm: handle the return of large RPC
> 
>  .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 251 +++++++++++++-----
>  1 file changed, 184 insertions(+), 67 deletions(-)
> 
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] nvkm: handle the return of large RPC
  2024-10-17 11:29   ` Danilo Krummrich
@ 2024-10-17 19:50     ` Zhi Wang
  2024-10-17 20:53       ` Danilo Krummrich
  0 siblings, 1 reply; 9+ messages in thread
From: Zhi Wang @ 2024-10-17 19:50 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: nouveau@lists.freedesktop.org, airlied@gmail.com, daniel@ffwll.ch,
	Ben Skeggs, Andy Currid, Neo Jia, Surath Mitra, Ankit Agrawal,
	Aniket Agashe, Kirti Wankhede, Tarun Gupta (SW-GPU),
	zhiwang@kernel.org

On 17/10/2024 14.29, Danilo Krummrich wrote:
> External email: Use caution opening links or attachments
> 
> 

Hi Danilo:

Thanks for the reply. See my comments below. If there is no more comment 
on PATCH 1/2, I would like to send a seperate series for large RPC 
support in msgq (PATCH 3) in the next spin.

> On Thu, Oct 17, 2024 at 12:19:22AM -0700, Zhi Wang wrote:
>> The max RPC size is 16 pages (including the RPC header). To send an RPC
>> larger than 16 pages, nvkm should split it into multiple RPCs and send
>> it accordingly. The first of the split RPCs has the expected function
>> number, while the rest of the split RPCs are sent with function number
>> as NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD. GSP will consume the split
>> RPCs from the cmdq and always write the result back to the msgq. The
>> result is also formed as split RPCs.
>>
>> However, NVKM is able to send split RPC when dealing with large RPCs,
>> but totally not aware of handling the return of the large RPCs, which
>> are the split RPC in the msgq. Thus, it keeps dumping the unknown RPC
>> messages from msgq, which is actually CONTINUATION_RECORD message,
>> discard them unexpectly. Thus, the caller will not be able to consume
>> the result from GSP.
>>
>> Introduce the handling of split RPCs on the msgq path. Slightly
>> re-factor the low-level part of receiving RPCs from the msgq, RPC
>> vehicle handling to merge the split RPCs back into a large RPC before
>> handling it to the upper level. Thus, the upper-level of RPC APIs don't
>> need to be heavily changed.
>>
>> Signed-off-by: Zhi Wang <zhiw@nvidia.com>
>> ---
>>   .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 237 +++++++++++++-----
>>   1 file changed, 177 insertions(+), 60 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
>> index 50ae56013344..9c422644c9e7 100644
>> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
>> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
>> @@ -72,6 +72,21 @@ struct r535_gsp_msg {
>>
>>   #define GSP_MSG_HDR_SIZE offsetof(struct r535_gsp_msg, data)
>>
>> +struct nvfw_gsp_rpc {
>> +     u32 header_version;
>> +     u32 signature;
>> +     u32 length;
>> +     u32 function;
>> +     u32 rpc_result;
>> +     u32 rpc_result_private;
>> +     u32 sequence;
>> +     union {
>> +             u32 spare;
>> +             u32 cpuRmGfid;
>> +     };
>> +     u8  data[];
>> +};
>> +
>>   static int
>>   r535_rpc_status_to_errno(uint32_t rpc_status)
>>   {
>> @@ -86,16 +101,34 @@ r535_rpc_status_to_errno(uint32_t rpc_status)
>>        }
>>   }
>>
>> +struct gsp_msgq_recv_args {
> 
> Since we're adding the necessary documentation to this structure please make
> them proper kernel doc comments [1].
> 
> Please also make sure to explain things a bit more in detail.
> 
> For you NVIDIA folks it's likely way easier to understand the code, since you
> probably have existing documentation of the GSP interface, are involved in the
> design of the interface in the first place, etc.
> 
Oops. What I can only say is (:p) : I learned almost all the knowledge 
of GSP procotols from GSP RPC operation code in OpenRM [1] and Ben's GSP 
enabling patches [2] plus learning in debug.

I can write a doc based on my notes and ask some one can help on review.

[1] 
https://github.com/NVIDIA/open-gpu-kernel-modules/blob/ed4be649623435ebb04f5e93f859bf46d977daa4/src/nvidia/src/kernel/gpu/gsp/message_queue_cpu.c#L4

[2] 
https://lore.kernel.org/nouveau/CAPM=9tyW=YuDQrRwrYK_ayuvEnp+9irTuze=MP-zkowm3CFJ9A@mail.gmail.com/T/

> Try to think about what's necessary for new people joining the project to
> understand the code.
> 

When writting v2, I was thinking if it would be better to just chop this 
wait functions into several. It actually contains several functions now:

- wait
- wait for the GSP message header when peek the msgq.
- wait for the complete GSP message there can copy it into the vehicle.
	- skip the RPC header for continuation message.
	- non skip the RPC header for ordinary messages.

> [1] https://docs.kernel.org/doc-guide/kernel-doc.html
> 
>> +     /* timeout in us */
>> +     int time;
> 
> This is misleading, it's not really the time, but the number of sleep cycles.
> It's just that the current cycle is set to usleep_range(1, 2).
> 
> Maybe we want to name it "retries", "retry_count", or just "retry" instead?
> 
> Besides that, do you see a need to continuously decrease the retry count at all?
> 
> To me it looks like we could just make it a constant and let the function that
> needs the retry loop deal with it internally instead of passing it around
> everywhere. Do I miss anything?
> 

Totally agree.

>> +     /* if set, peek the msgq, otherwise copy it */
>> +     u32 *prepc;
> 
> This sounds like if `prepc` would be a bool. Please add a description that
> properly explains what the field actually represents.
> 

According to what I understand, prepc represents the actual message 
length when peeking the msgq. (first, wait for the GSP RPC message 
header to be there, then passing the actual GSP RPC message length to 
the caller.)

>> +     /*
>> +      * the size (without message header) of message to
>> +      * wait(when peek)/copy from the msgq
>> +      */
>> +     u32 repc;
> 
> Since the term "repc" is used everywhere in this file, but is never explained,
> maybe this would be a good place?
> 

It would be nice to refine repc, argc and argv, since they were quite 
confusing to me when I ramped up nouveau code.

> Do we know what it means? I mean, the semantics in this context is clear, but
> otherwise it looks like some generic term like "argc"?
> 
> Otherwise, feel free to replace it with a less generic and more meaningful name.
> ()
>> +     /* the message buffer */
>> +     u8 *msg;
> 
> Please expand this a bit; the following questions might help:
>    - What kind of message?
>    - Is it always the same kind of message?
>    - Who is supposed to allocate it?
>    - How does it get freed?
> 

I think this will look clearer if we chopped the function into several.

Back to the questions:

- What kind of message?
GSP RPC message without message header

- Is it always the same kind of message?
Yes

- Who is supposed to allocate it?
r535_gsp_msgq_recv().

- How does it get freed?
nvkm_gsp_rpc_done().

An example of lifecycle of RPC vehicle can be found in 
r535_gsp_rpc_rm_alloc_push().

>> +     /*
>> +      * skip copying the rpc header, used when handling a large RPC.
>> +      * rpc header only shows up in the first segment of a large RPC.
>> +      */
>> +     bool skip_copy_rpc_header;
>> +};
>> +
>>   static void *
>> -r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
>> +gsp_msgq_recv(struct nvkm_gsp *gsp, struct gsp_msgq_recv_args *args)
> 
> For static functions it's probably fine to omit the "r535" prefix, but since
> it's been used everywhere else, we may want to keep it for consistency.
> 

Was thinking about this. Was it a good idea to have the version number 
in the function name everywhere?

> Also please add a proper kernel doc comment to this function explaining its
> semantics as well.
> 
Sure.
>>   {
>>        struct r535_gsp_msg *mqe;
>> -     u32 size, rptr = *gsp->msgq.rptr;
>> +     u32 rptr = *gsp->msgq.rptr;
>>        int used;
>> -     u8 *msg;
>> -     u32 len;
>> +     u32 size, len, repc;
>>
>> -     size = DIV_ROUND_UP(GSP_MSG_HDR_SIZE + repc, GSP_PAGE_SIZE);
>> +     size = DIV_ROUND_UP(GSP_MSG_HDR_SIZE + args->repc, GSP_PAGE_SIZE);
>>        if (WARN_ON(!size || size >= gsp->msgq.cnt))
>>                return ERR_PTR(-EINVAL);
>>
>> @@ -109,46 +142,149 @@ r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
>>                        break;
>>
>>                usleep_range(1, 2);
>> -     } while (--(*ptime));
>> +     } while (--(args->time));
>>
>> -     if (WARN_ON(!*ptime))
>> +     if (WARN_ON(!args->time))
>>                return ERR_PTR(-ETIMEDOUT);
>>
>>        mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + rptr * 0x1000);
>>
>> -     if (prepc) {
>> -             *prepc = (used * GSP_PAGE_SIZE) - sizeof(*mqe);
>> +     if (args->prepc) {
>> +             *args->prepc = (used * GSP_PAGE_SIZE) - sizeof(*mqe);
>>                return mqe->data;
>>        }
>>
>> +     repc = args->repc;
>>        size = ALIGN(repc + GSP_MSG_HDR_SIZE, GSP_PAGE_SIZE);
>>
>> -     msg = kvmalloc(repc, GFP_KERNEL);
>> -     if (!msg)
>> -             return ERR_PTR(-ENOMEM);
>> -
>>        len = ((gsp->msgq.cnt - rptr) * GSP_PAGE_SIZE) - sizeof(*mqe);
>>        len = min_t(u32, repc, len);
>> -     memcpy(msg, mqe->data, len);
>> +     if (!args->skip_copy_rpc_header)
>> +             memcpy(args->msg, mqe->data, len);
>> +     else
>> +             memcpy(args->msg, mqe->data + sizeof(struct nvfw_gsp_rpc),
>> +                    len - sizeof(struct nvfw_gsp_rpc));
>>
>>        repc -= len;
>>
>>        if (repc) {
>>                mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000);
>> -             memcpy(msg + len, mqe, repc);
>> +             memcpy(args->msg + len, mqe, repc);
>>        }
>>
>>        rptr = (rptr + DIV_ROUND_UP(size, GSP_PAGE_SIZE)) % gsp->msgq.cnt;
>>
>>        mb();
>>        (*gsp->msgq.rptr) = rptr;
>> -     return msg;
>> +     return args->msg;
>> +}
>> +
>> +static void
>> +r535_gsp_msg_dump(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg, int lvl)
>> +{
>> +     if (gsp->subdev.debug >= lvl) {
>> +             nvkm_printk__(&gsp->subdev, lvl, info,
>> +                           "msg fn:%d len:0x%x/0x%zx res:0x%x resp:0x%x\n",
>> +                           msg->function, msg->length, msg->length - sizeof(*msg),
>> +                           msg->rpc_result, msg->rpc_result_private);
>> +             print_hex_dump(KERN_INFO, "msg: ", DUMP_PREFIX_OFFSET, 16, 1,
>> +                            msg->data, msg->length - sizeof(*msg), true);
>> +     }
>> +}
>> +
>> +static void *
>> +r535_gsp_msgq_recv_continuation(struct nvkm_gsp *gsp, u32 *payload_size,
>> +                             u8 *buf, int time)
> 
> This looks like a great place to add an explanation about continuation records.
> 
> Someone who wants to join the project likely has no idea about RPCs and
> continuation records.
> 
> Please add some documentation explaining what continuation records are, when
> they are expected to be sent and what this function does to deal with them.
> 

Sure.

>> +{
>> +     struct nvkm_subdev *subdev = &gsp->subdev;
>> +     struct nvfw_gsp_rpc *msg;
>> +     struct gsp_msgq_recv_args args = { 0 };
>> +     u32 size;
>> +
>> +     /* Peek the header of message */
>> +     args.time = time;
>> +     args.repc = sizeof(*msg);
>> +     args.prepc = &size;
>> +
>> +     msg = gsp_msgq_recv(gsp, &args);
>> +     if (IS_ERR_OR_NULL(msg))
>> +             return msg;
>> +
>> +     if (msg->function != NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD) {
>> +             nvkm_error(subdev, "Not a continuation of a large RPC\n");
>> +             r535_gsp_msg_dump(gsp, msg, NV_DBG_ERROR);
>> +             return ERR_PTR(-EIO);
>> +     }
>> +
>> +     *payload_size = msg->length - sizeof(*msg);
>> +
>> +     /* Recv the continuation message */
>> +     args.time = time;
>> +     args.repc = msg->length;
>> +     args.prepc = NULL;
>> +     args.msg = buf;
>> +     args.skip_copy_rpc_header = true;
>> +
>> +     return gsp_msgq_recv(gsp, &args);
>>   }
>>
>>   static void *
>> -r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 repc, int *ptime)
>> +r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 msg_repc, u32 total_repc,
>> +                int time)
> 
> Please add a kernel doc comment as well and explain the arguments. All the
> different sizes quickly become very confusing.
> 
> It would also be good to establish a common naming scheme for all the different
> sizes. Would you mind adding a separate patch that does that?
> 
> You can add comment at the beginning of the file explaining all the different
> sizes and assign them a unique name that we then can use throughout the driver.
>

I think maybe we can define the terms of the size after the doc, where 
just explained the GSP RPC message layout. Should be much eaiser then.

> Also, I still find the "repc" thing confusing. If it's just a size why not just
> call it "size"?
> 
>>   {
>> -     return r535_gsp_msgq_wait(gsp, repc, NULL, ptime);
>> +     struct gsp_msgq_recv_args args = { 0 };
>> +     struct nvfw_gsp_rpc *msg;
> 
> Please try to avoid name collisions, such as naming this `msg`, which has
> nothing to do with the below `args.msg`.
> 
>> +     const u32 max_msg_size = (16 * 0x1000) - sizeof(struct r535_gsp_msg);
>> +     const u32 max_rpc_size = max_msg_size - sizeof(*msg);
>> +     u32 repc = total_repc;
>> +     u8 *buf, *next;
>> +
>> +     if (WARN_ON(msg_repc > max_msg_size))
>> +             return NULL;
>> +
>> +     buf = kvmalloc(max_t(u32, msg_repc, total_repc + sizeof(*msg)), GFP_KERNEL);
>> +     if (!buf)
>> +             return ERR_PTR(-ENOMEM);
>> +
>> +     /* Recv the message */
>> +     args.time = time;
>> +     args.repc = msg_repc;
>> +     args.prepc = NULL;
>> +     args.msg = buf;
>> +     args.skip_copy_rpc_header = false;
>> +
>> +     msg = gsp_msgq_recv(gsp, &args);
>> +     if (IS_ERR_OR_NULL(msg)) {
>> +             kfree(buf);
>> +             return msg;
>> +     }
>> +
>> +     if (total_repc <= max_rpc_size)
>> +             return buf;
>> +
>> +     /* Gather the message from the following continuation messages. */
>> +     next = buf;
>> +
>> +     next += msg_repc;
>> +     repc -= msg_repc - sizeof(*msg);
>> +
>> +     while (repc) {
>> +             struct nvfw_gsp_rpc *cont_msg;
>> +             u32 size;
>> +
>> +             cont_msg = r535_gsp_msgq_recv_continuation(gsp, &size, next,
>> +                                                   time);
>> +             if (IS_ERR_OR_NULL(cont_msg)) {
>> +                     kfree(buf);
>> +                     return cont_msg;
>> +             }
>> +             repc -= size;
>> +             next += size;
>> +     }
>> +
>> +     /* Patch the message length. The caller sees a consolidated message */
>> +     msg->length = total_repc + sizeof(*msg);
>> +     return buf;
>>   }
>>
>>   static int
>> @@ -234,54 +370,33 @@ r535_gsp_cmdq_get(struct nvkm_gsp *gsp, u32 argc)
>>        return cmd->data;
>>   }
>>
>> -struct nvfw_gsp_rpc {
>> -     u32 header_version;
>> -     u32 signature;
>> -     u32 length;
>> -     u32 function;
>> -     u32 rpc_result;
>> -     u32 rpc_result_private;
>> -     u32 sequence;
>> -     union {
>> -             u32 spare;
>> -             u32 cpuRmGfid;
>> -     };
>> -     u8  data[];
>> -};
>> -
>>   static void
>>   r535_gsp_msg_done(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg)
>>   {
>>        kvfree(msg);
>>   }
>>
>> -static void
>> -r535_gsp_msg_dump(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg, int lvl)
>> -{
>> -     if (gsp->subdev.debug >= lvl) {
>> -             nvkm_printk__(&gsp->subdev, lvl, info,
>> -                           "msg fn:%d len:0x%x/0x%zx res:0x%x resp:0x%x\n",
>> -                           msg->function, msg->length, msg->length - sizeof(*msg),
>> -                           msg->rpc_result, msg->rpc_result_private);
>> -             print_hex_dump(KERN_INFO, "msg: ", DUMP_PREFIX_OFFSET, 16, 1,
>> -                            msg->data, msg->length - sizeof(*msg), true);
>> -     }
>> -}
>> -
>>   static struct nvfw_gsp_rpc *
>>   r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 repc)
>>   {
>>        struct nvkm_subdev *subdev = &gsp->subdev;
>> +     struct gsp_msgq_recv_args args = { 0 };
>>        struct nvfw_gsp_rpc *msg;
>>        int time = 4000000, i;
>>        u32 size;
>>
>>   retry:
>> -     msg = r535_gsp_msgq_wait(gsp, sizeof(*msg), &size, &time);
>> +     /* Peek the header of message */
>> +     args.time = time;
>> +     args.repc = sizeof(*msg);
>> +     args.prepc = &size;
>> +
>> +     msg = gsp_msgq_recv(gsp, &args);
>>        if (IS_ERR_OR_NULL(msg))
>>                return msg;
>>
>> -     msg = r535_gsp_msgq_recv(gsp, msg->length, &time);
>> +     /* Recv the message */
>> +     msg = r535_gsp_msgq_recv(gsp, msg->length, repc, time);
>>        if (IS_ERR_OR_NULL(msg))
>>                return msg;
>>
>> @@ -734,6 +849,7 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
>>        mutex_lock(&gsp->cmdq.mutex);
>>        if (rpc_size > max_rpc_size) {
>>                const u32 fn = rpc->function;
>> +             u32 remain_rpc_size = rpc_size;
>>
>>                /* Adjust length, and send initial RPC. */
>>                rpc->length = sizeof(*rpc) + max_rpc_size;
>> @@ -744,11 +860,11 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
>>                        goto done;
>>
>>                argv += max_rpc_size;
>> -             rpc_size -= max_rpc_size;
>> +             remain_rpc_size -= max_rpc_size;
>>
>>                /* Remaining chunks sent as CONTINUATION_RECORD RPCs. */
>> -             while (rpc_size) {
>> -                     u32 size = min(rpc_size, max_rpc_size);
>> +             while (remain_rpc_size) {
>> +                     u32 size = min(remain_rpc_size, max_rpc_size);
>>                        void *next;
>>
>>                        next = r535_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD, size);
>> @@ -764,19 +880,20 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
>>                                goto done;
>>
>>                        argv += size;
>> -                     rpc_size -= size;
>> +                     remain_rpc_size -= size;
>>                }
>>
>>                /* Wait for reply. */
>> -             if (wait) {
>> -                     rpc = r535_gsp_msg_recv(gsp, fn, repc);
>> -                     if (!IS_ERR_OR_NULL(rpc))
>> +             rpc = r535_gsp_msg_recv(gsp, fn, rpc_size);
>> +             if (!IS_ERR_OR_NULL(rpc)) {
>> +                     if (wait)
>>                                repv = rpc->data;
>> -                     else
>> -                             repv = rpc;
>> -             } else {
>> -                     repv = NULL;
>> -             }
>> +                     else {
>> +                             nvkm_gsp_rpc_done(gsp, rpc);
>> +                             repv = NULL;
>> +                     }
>> +             } else
>> +                     repv = wait ? rpc : NULL;
>>        } else {
>>                repv = r535_gsp_rpc_send(gsp, argv, wait, repc);
>>        }
>> --
>> 2.34.1
>>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] nvkm: handle the return of large RPC
  2024-10-17 19:50     ` Zhi Wang
@ 2024-10-17 20:53       ` Danilo Krummrich
  0 siblings, 0 replies; 9+ messages in thread
From: Danilo Krummrich @ 2024-10-17 20:53 UTC (permalink / raw)
  To: Zhi Wang
  Cc: nouveau@lists.freedesktop.org, airlied@gmail.com, daniel@ffwll.ch,
	Ben Skeggs, Andy Currid, Neo Jia, Surath Mitra, Ankit Agrawal,
	Aniket Agashe, Kirti Wankhede, Tarun Gupta (SW-GPU),
	zhiwang@kernel.org

On Thu, Oct 17, 2024 at 07:50:24PM +0000, Zhi Wang wrote:
> On 17/10/2024 14.29, Danilo Krummrich wrote:
> > External email: Use caution opening links or attachments
> > 
> > 
> 
> Hi Danilo:
> 
> Thanks for the reply. See my comments below. If there is no more comment 
> on PATCH 1/2, I would like to send a seperate series for large RPC 
> support in msgq (PATCH 3) in the next spin.

Sounds good, I'll apply patch 1 and 2 tomorrow then.

Please make sure to still send the "new" series as v3.

> 
> > On Thu, Oct 17, 2024 at 12:19:22AM -0700, Zhi Wang wrote:
> >> The max RPC size is 16 pages (including the RPC header). To send an RPC
> >> larger than 16 pages, nvkm should split it into multiple RPCs and send
> >> it accordingly. The first of the split RPCs has the expected function
> >> number, while the rest of the split RPCs are sent with function number
> >> as NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD. GSP will consume the split
> >> RPCs from the cmdq and always write the result back to the msgq. The
> >> result is also formed as split RPCs.
> >>
> >> However, NVKM is able to send split RPC when dealing with large RPCs,
> >> but totally not aware of handling the return of the large RPCs, which
> >> are the split RPC in the msgq. Thus, it keeps dumping the unknown RPC
> >> messages from msgq, which is actually CONTINUATION_RECORD message,
> >> discard them unexpectly. Thus, the caller will not be able to consume
> >> the result from GSP.
> >>
> >> Introduce the handling of split RPCs on the msgq path. Slightly
> >> re-factor the low-level part of receiving RPCs from the msgq, RPC
> >> vehicle handling to merge the split RPCs back into a large RPC before
> >> handling it to the upper level. Thus, the upper-level of RPC APIs don't
> >> need to be heavily changed.
> >>
> >> Signed-off-by: Zhi Wang <zhiw@nvidia.com>
> >> ---
> >>   .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 237 +++++++++++++-----
> >>   1 file changed, 177 insertions(+), 60 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> >> index 50ae56013344..9c422644c9e7 100644
> >> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> >> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> >> @@ -72,6 +72,21 @@ struct r535_gsp_msg {
> >>
> >>   #define GSP_MSG_HDR_SIZE offsetof(struct r535_gsp_msg, data)
> >>
> >> +struct nvfw_gsp_rpc {
> >> +     u32 header_version;
> >> +     u32 signature;
> >> +     u32 length;
> >> +     u32 function;
> >> +     u32 rpc_result;
> >> +     u32 rpc_result_private;
> >> +     u32 sequence;
> >> +     union {
> >> +             u32 spare;
> >> +             u32 cpuRmGfid;
> >> +     };
> >> +     u8  data[];
> >> +};
> >> +
> >>   static int
> >>   r535_rpc_status_to_errno(uint32_t rpc_status)
> >>   {
> >> @@ -86,16 +101,34 @@ r535_rpc_status_to_errno(uint32_t rpc_status)
> >>        }
> >>   }
> >>
> >> +struct gsp_msgq_recv_args {
> > 
> > Since we're adding the necessary documentation to this structure please make
> > them proper kernel doc comments [1].
> > 
> > Please also make sure to explain things a bit more in detail.
> > 
> > For you NVIDIA folks it's likely way easier to understand the code, since you
> > probably have existing documentation of the GSP interface, are involved in the
> > design of the interface in the first place, etc.
> > 
> Oops. What I can only say is (:p) : I learned almost all the knowledge 
> of GSP procotols from GSP RPC operation code in OpenRM [1] and Ben's GSP 
> enabling patches [2] plus learning in debug.
> 
> I can write a doc based on my notes and ask some one can help on review.

I'm not asking you to write a full doc (even though it would be very appreciated
:p), but I really would like to get everything we keep changing to a point where
new people are able to get a grasp of what's going on with reasonable effort.

> 
> [1] 
> https://github.com/NVIDIA/open-gpu-kernel-modules/blob/ed4be649623435ebb04f5e93f859bf46d977daa4/src/nvidia/src/kernel/gpu/gsp/message_queue_cpu.c#L4
> 
> [2] 
> https://lore.kernel.org/nouveau/CAPM=9tyW=YuDQrRwrYK_ayuvEnp+9irTuze=MP-zkowm3CFJ9A@mail.gmail.com/T/
> 
> > Try to think about what's necessary for new people joining the project to
> > understand the code.
> > 
> 
> When writting v2, I was thinking if it would be better to just chop this 
> wait functions into several. It actually contains several functions now:
> 
> - wait
> - wait for the GSP message header when peek the msgq.
> - wait for the complete GSP message there can copy it into the vehicle.
> 	- skip the RPC header for continuation message.
> 	- non skip the RPC header for ordinary messages.

Indeed -- that's what I had in mind too when I said that r535_gsp_msgq_wait()
has quite some more semantics than what the name implies and that we should
"build a properly documented state machine around it".

> 
> > [1] https://docs.kernel.org/doc-guide/kernel-doc.html
> > 
> >> +     /* timeout in us */
> >> +     int time;
> > 
> > This is misleading, it's not really the time, but the number of sleep cycles.
> > It's just that the current cycle is set to usleep_range(1, 2).
> > 
> > Maybe we want to name it "retries", "retry_count", or just "retry" instead?
> > 
> > Besides that, do you see a need to continuously decrease the retry count at all?
> > 
> > To me it looks like we could just make it a constant and let the function that
> > needs the retry loop deal with it internally instead of passing it around
> > everywhere. Do I miss anything?
> > 
> 
> Totally agree.

Great, then let's get this simplified.

> 
> >> +     /* if set, peek the msgq, otherwise copy it */
> >> +     u32 *prepc;
> > 
> > This sounds like if `prepc` would be a bool. Please add a description that
> > properly explains what the field actually represents.
> > 
> 
> According to what I understand, prepc represents the actual message 
> length when peeking the msgq. (first, wait for the GSP RPC message 
> header to be there, then passing the actual GSP RPC message length to 
> the caller.)
> 
> >> +     /*
> >> +      * the size (without message header) of message to
> >> +      * wait(when peek)/copy from the msgq
> >> +      */
> >> +     u32 repc;
> > 
> > Since the term "repc" is used everywhere in this file, but is never explained,
> > maybe this would be a good place?
> > 
> 
> It would be nice to refine repc, argc and argv, since they were quite 
> confusing to me when I ramped up nouveau code.

Fully agree, I think they are indeed confusing. I think within this file those
arguments are never that generic such that we can't come up with a more
descriptive and defined naming scheme.

> 
> > Do we know what it means? I mean, the semantics in this context is clear, but
> > otherwise it looks like some generic term like "argc"?
> > 
> > Otherwise, feel free to replace it with a less generic and more meaningful name.
> > ()
> >> +     /* the message buffer */
> >> +     u8 *msg;
> > 
> > Please expand this a bit; the following questions might help:
> >    - What kind of message?
> >    - Is it always the same kind of message?
> >    - Who is supposed to allocate it?
> >    - How does it get freed?
> > 
> 
> I think this will look clearer if we chopped the function into several.

Agreed!

> 
> Back to the questions:
> 
> - What kind of message?
> GSP RPC message without message header
> 
> - Is it always the same kind of message?
> Yes
> 
> - Who is supposed to allocate it?
> r535_gsp_msgq_recv().
> 
> - How does it get freed?
> nvkm_gsp_rpc_done().
> 
> An example of lifecycle of RPC vehicle can be found in 
> r535_gsp_rpc_rm_alloc_push().

I'm aware. :) I asked those questions, because I think the answers for those
should be made obvious to the reader of the code. Ideally, through both, design
and documentation.

> 
> >> +     /*
> >> +      * skip copying the rpc header, used when handling a large RPC.
> >> +      * rpc header only shows up in the first segment of a large RPC.
> >> +      */
> >> +     bool skip_copy_rpc_header;
> >> +};
> >> +
> >>   static void *
> >> -r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
> >> +gsp_msgq_recv(struct nvkm_gsp *gsp, struct gsp_msgq_recv_args *args)
> > 
> > For static functions it's probably fine to omit the "r535" prefix, but since
> > it's been used everywhere else, we may want to keep it for consistency.
> > 
> 
> Was thinking about this. Was it a good idea to have the version number 
> in the function name everywhere?

For the public ones they have to be, since we somehow need to deal with multiple
firmware versions.

For the private ones, it's probably nice to have too, because otherwise it may
be annoying to grep for things specific to a given firmware version.

> 
> > Also please add a proper kernel doc comment to this function explaining its
> > semantics as well.
> > 
> Sure.
> >>   {
> >>        struct r535_gsp_msg *mqe;
> >> -     u32 size, rptr = *gsp->msgq.rptr;
> >> +     u32 rptr = *gsp->msgq.rptr;
> >>        int used;
> >> -     u8 *msg;
> >> -     u32 len;
> >> +     u32 size, len, repc;
> >>
> >> -     size = DIV_ROUND_UP(GSP_MSG_HDR_SIZE + repc, GSP_PAGE_SIZE);
> >> +     size = DIV_ROUND_UP(GSP_MSG_HDR_SIZE + args->repc, GSP_PAGE_SIZE);
> >>        if (WARN_ON(!size || size >= gsp->msgq.cnt))
> >>                return ERR_PTR(-EINVAL);
> >>
> >> @@ -109,46 +142,149 @@ r535_gsp_msgq_wait(struct nvkm_gsp *gsp, u32 repc, u32 *prepc, int *ptime)
> >>                        break;
> >>
> >>                usleep_range(1, 2);
> >> -     } while (--(*ptime));
> >> +     } while (--(args->time));
> >>
> >> -     if (WARN_ON(!*ptime))
> >> +     if (WARN_ON(!args->time))
> >>                return ERR_PTR(-ETIMEDOUT);
> >>
> >>        mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + rptr * 0x1000);
> >>
> >> -     if (prepc) {
> >> -             *prepc = (used * GSP_PAGE_SIZE) - sizeof(*mqe);
> >> +     if (args->prepc) {
> >> +             *args->prepc = (used * GSP_PAGE_SIZE) - sizeof(*mqe);
> >>                return mqe->data;
> >>        }
> >>
> >> +     repc = args->repc;
> >>        size = ALIGN(repc + GSP_MSG_HDR_SIZE, GSP_PAGE_SIZE);
> >>
> >> -     msg = kvmalloc(repc, GFP_KERNEL);
> >> -     if (!msg)
> >> -             return ERR_PTR(-ENOMEM);
> >> -
> >>        len = ((gsp->msgq.cnt - rptr) * GSP_PAGE_SIZE) - sizeof(*mqe);
> >>        len = min_t(u32, repc, len);
> >> -     memcpy(msg, mqe->data, len);
> >> +     if (!args->skip_copy_rpc_header)
> >> +             memcpy(args->msg, mqe->data, len);
> >> +     else
> >> +             memcpy(args->msg, mqe->data + sizeof(struct nvfw_gsp_rpc),
> >> +                    len - sizeof(struct nvfw_gsp_rpc));
> >>
> >>        repc -= len;
> >>
> >>        if (repc) {
> >>                mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000);
> >> -             memcpy(msg + len, mqe, repc);
> >> +             memcpy(args->msg + len, mqe, repc);
> >>        }
> >>
> >>        rptr = (rptr + DIV_ROUND_UP(size, GSP_PAGE_SIZE)) % gsp->msgq.cnt;
> >>
> >>        mb();
> >>        (*gsp->msgq.rptr) = rptr;
> >> -     return msg;
> >> +     return args->msg;
> >> +}
> >> +
> >> +static void
> >> +r535_gsp_msg_dump(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg, int lvl)
> >> +{
> >> +     if (gsp->subdev.debug >= lvl) {
> >> +             nvkm_printk__(&gsp->subdev, lvl, info,
> >> +                           "msg fn:%d len:0x%x/0x%zx res:0x%x resp:0x%x\n",
> >> +                           msg->function, msg->length, msg->length - sizeof(*msg),
> >> +                           msg->rpc_result, msg->rpc_result_private);
> >> +             print_hex_dump(KERN_INFO, "msg: ", DUMP_PREFIX_OFFSET, 16, 1,
> >> +                            msg->data, msg->length - sizeof(*msg), true);
> >> +     }
> >> +}
> >> +
> >> +static void *
> >> +r535_gsp_msgq_recv_continuation(struct nvkm_gsp *gsp, u32 *payload_size,
> >> +                             u8 *buf, int time)
> > 
> > This looks like a great place to add an explanation about continuation records.
> > 
> > Someone who wants to join the project likely has no idea about RPCs and
> > continuation records.
> > 
> > Please add some documentation explaining what continuation records are, when
> > they are expected to be sent and what this function does to deal with them.
> > 
> 
> Sure.
> 
> >> +{
> >> +     struct nvkm_subdev *subdev = &gsp->subdev;
> >> +     struct nvfw_gsp_rpc *msg;
> >> +     struct gsp_msgq_recv_args args = { 0 };
> >> +     u32 size;
> >> +
> >> +     /* Peek the header of message */
> >> +     args.time = time;
> >> +     args.repc = sizeof(*msg);
> >> +     args.prepc = &size;
> >> +
> >> +     msg = gsp_msgq_recv(gsp, &args);
> >> +     if (IS_ERR_OR_NULL(msg))
> >> +             return msg;
> >> +
> >> +     if (msg->function != NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD) {
> >> +             nvkm_error(subdev, "Not a continuation of a large RPC\n");
> >> +             r535_gsp_msg_dump(gsp, msg, NV_DBG_ERROR);
> >> +             return ERR_PTR(-EIO);
> >> +     }
> >> +
> >> +     *payload_size = msg->length - sizeof(*msg);
> >> +
> >> +     /* Recv the continuation message */
> >> +     args.time = time;
> >> +     args.repc = msg->length;
> >> +     args.prepc = NULL;
> >> +     args.msg = buf;
> >> +     args.skip_copy_rpc_header = true;
> >> +
> >> +     return gsp_msgq_recv(gsp, &args);
> >>   }
> >>
> >>   static void *
> >> -r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 repc, int *ptime)
> >> +r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 msg_repc, u32 total_repc,
> >> +                int time)
> > 
> > Please add a kernel doc comment as well and explain the arguments. All the
> > different sizes quickly become very confusing.
> > 
> > It would also be good to establish a common naming scheme for all the different
> > sizes. Would you mind adding a separate patch that does that?
> > 
> > You can add comment at the beginning of the file explaining all the different
> > sizes and assign them a unique name that we then can use throughout the driver.
> >
> 
> I think maybe we can define the terms of the size after the doc, where 
> just explained the GSP RPC message layout. Should be much eaiser then.

Sure, sounds good.

> 
> > Also, I still find the "repc" thing confusing. If it's just a size why not just
> > call it "size"?
> > 
> >>   {
> >> -     return r535_gsp_msgq_wait(gsp, repc, NULL, ptime);
> >> +     struct gsp_msgq_recv_args args = { 0 };
> >> +     struct nvfw_gsp_rpc *msg;
> > 
> > Please try to avoid name collisions, such as naming this `msg`, which has
> > nothing to do with the below `args.msg`.
> > 
> >> +     const u32 max_msg_size = (16 * 0x1000) - sizeof(struct r535_gsp_msg);
> >> +     const u32 max_rpc_size = max_msg_size - sizeof(*msg);
> >> +     u32 repc = total_repc;
> >> +     u8 *buf, *next;
> >> +
> >> +     if (WARN_ON(msg_repc > max_msg_size))
> >> +             return NULL;
> >> +
> >> +     buf = kvmalloc(max_t(u32, msg_repc, total_repc + sizeof(*msg)), GFP_KERNEL);
> >> +     if (!buf)
> >> +             return ERR_PTR(-ENOMEM);
> >> +
> >> +     /* Recv the message */
> >> +     args.time = time;
> >> +     args.repc = msg_repc;
> >> +     args.prepc = NULL;
> >> +     args.msg = buf;
> >> +     args.skip_copy_rpc_header = false;
> >> +
> >> +     msg = gsp_msgq_recv(gsp, &args);
> >> +     if (IS_ERR_OR_NULL(msg)) {
> >> +             kfree(buf);
> >> +             return msg;
> >> +     }
> >> +
> >> +     if (total_repc <= max_rpc_size)
> >> +             return buf;
> >> +
> >> +     /* Gather the message from the following continuation messages. */
> >> +     next = buf;
> >> +
> >> +     next += msg_repc;
> >> +     repc -= msg_repc - sizeof(*msg);
> >> +
> >> +     while (repc) {
> >> +             struct nvfw_gsp_rpc *cont_msg;
> >> +             u32 size;
> >> +
> >> +             cont_msg = r535_gsp_msgq_recv_continuation(gsp, &size, next,
> >> +                                                   time);
> >> +             if (IS_ERR_OR_NULL(cont_msg)) {
> >> +                     kfree(buf);
> >> +                     return cont_msg;
> >> +             }
> >> +             repc -= size;
> >> +             next += size;
> >> +     }
> >> +
> >> +     /* Patch the message length. The caller sees a consolidated message */
> >> +     msg->length = total_repc + sizeof(*msg);
> >> +     return buf;
> >>   }
> >>
> >>   static int
> >> @@ -234,54 +370,33 @@ r535_gsp_cmdq_get(struct nvkm_gsp *gsp, u32 argc)
> >>        return cmd->data;
> >>   }
> >>
> >> -struct nvfw_gsp_rpc {
> >> -     u32 header_version;
> >> -     u32 signature;
> >> -     u32 length;
> >> -     u32 function;
> >> -     u32 rpc_result;
> >> -     u32 rpc_result_private;
> >> -     u32 sequence;
> >> -     union {
> >> -             u32 spare;
> >> -             u32 cpuRmGfid;
> >> -     };
> >> -     u8  data[];
> >> -};
> >> -
> >>   static void
> >>   r535_gsp_msg_done(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg)
> >>   {
> >>        kvfree(msg);
> >>   }
> >>
> >> -static void
> >> -r535_gsp_msg_dump(struct nvkm_gsp *gsp, struct nvfw_gsp_rpc *msg, int lvl)
> >> -{
> >> -     if (gsp->subdev.debug >= lvl) {
> >> -             nvkm_printk__(&gsp->subdev, lvl, info,
> >> -                           "msg fn:%d len:0x%x/0x%zx res:0x%x resp:0x%x\n",
> >> -                           msg->function, msg->length, msg->length - sizeof(*msg),
> >> -                           msg->rpc_result, msg->rpc_result_private);
> >> -             print_hex_dump(KERN_INFO, "msg: ", DUMP_PREFIX_OFFSET, 16, 1,
> >> -                            msg->data, msg->length - sizeof(*msg), true);
> >> -     }
> >> -}
> >> -
> >>   static struct nvfw_gsp_rpc *
> >>   r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 repc)
> >>   {
> >>        struct nvkm_subdev *subdev = &gsp->subdev;
> >> +     struct gsp_msgq_recv_args args = { 0 };
> >>        struct nvfw_gsp_rpc *msg;
> >>        int time = 4000000, i;
> >>        u32 size;
> >>
> >>   retry:
> >> -     msg = r535_gsp_msgq_wait(gsp, sizeof(*msg), &size, &time);
> >> +     /* Peek the header of message */
> >> +     args.time = time;
> >> +     args.repc = sizeof(*msg);
> >> +     args.prepc = &size;
> >> +
> >> +     msg = gsp_msgq_recv(gsp, &args);
> >>        if (IS_ERR_OR_NULL(msg))
> >>                return msg;
> >>
> >> -     msg = r535_gsp_msgq_recv(gsp, msg->length, &time);
> >> +     /* Recv the message */
> >> +     msg = r535_gsp_msgq_recv(gsp, msg->length, repc, time);
> >>        if (IS_ERR_OR_NULL(msg))
> >>                return msg;
> >>
> >> @@ -734,6 +849,7 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
> >>        mutex_lock(&gsp->cmdq.mutex);
> >>        if (rpc_size > max_rpc_size) {
> >>                const u32 fn = rpc->function;
> >> +             u32 remain_rpc_size = rpc_size;
> >>
> >>                /* Adjust length, and send initial RPC. */
> >>                rpc->length = sizeof(*rpc) + max_rpc_size;
> >> @@ -744,11 +860,11 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
> >>                        goto done;
> >>
> >>                argv += max_rpc_size;
> >> -             rpc_size -= max_rpc_size;
> >> +             remain_rpc_size -= max_rpc_size;
> >>
> >>                /* Remaining chunks sent as CONTINUATION_RECORD RPCs. */
> >> -             while (rpc_size) {
> >> -                     u32 size = min(rpc_size, max_rpc_size);
> >> +             while (remain_rpc_size) {
> >> +                     u32 size = min(remain_rpc_size, max_rpc_size);
> >>                        void *next;
> >>
> >>                        next = r535_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD, size);
> >> @@ -764,19 +880,20 @@ r535_gsp_rpc_push(struct nvkm_gsp *gsp, void *argv, bool wait, u32 repc)
> >>                                goto done;
> >>
> >>                        argv += size;
> >> -                     rpc_size -= size;
> >> +                     remain_rpc_size -= size;
> >>                }
> >>
> >>                /* Wait for reply. */
> >> -             if (wait) {
> >> -                     rpc = r535_gsp_msg_recv(gsp, fn, repc);
> >> -                     if (!IS_ERR_OR_NULL(rpc))
> >> +             rpc = r535_gsp_msg_recv(gsp, fn, rpc_size);
> >> +             if (!IS_ERR_OR_NULL(rpc)) {
> >> +                     if (wait)
> >>                                repv = rpc->data;
> >> -                     else
> >> -                             repv = rpc;
> >> -             } else {
> >> -                     repv = NULL;
> >> -             }
> >> +                     else {
> >> +                             nvkm_gsp_rpc_done(gsp, rpc);
> >> +                             repv = NULL;
> >> +                     }
> >> +             } else
> >> +                     repv = wait ? rpc : NULL;
> >>        } else {
> >>                repv = r535_gsp_rpc_send(gsp, argv, wait, repc);
> >>        }
> >> --
> >> 2.34.1
> >>
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] NVKM GSP RPC fixes
  2024-10-17  7:19 [PATCH 0/3] NVKM GSP RPC fixes Zhi Wang
                   ` (3 preceding siblings ...)
  2024-10-17 11:32 ` [PATCH 0/3] NVKM GSP RPC fixes Danilo Krummrich
@ 2024-11-25 16:50 ` Danilo Krummrich
  4 siblings, 0 replies; 9+ messages in thread
From: Danilo Krummrich @ 2024-11-25 16:50 UTC (permalink / raw)
  To: Zhi Wang
  Cc: nouveau, airlied, daniel, bskeggs, acurrid, cjia, smitra, ankita,
	aniketa, kwankhede, targupta, zhiwang

On 10/17/24 9:19 AM, Zhi Wang wrote:
> Hi folks:
> 
> Here are some fixes of weird bugs I encountered when I was enabling vGPU [1] on
> core-driver aka NVKM. They are exposed because of the new RPCs required by
> vGPU.
> 
> For testing, I tried to run Uniengine Heaven[2] on my RTX 4060 for 8 hours and
> the GL CTS runner[3] (commandline: ./cts-runner --type-gl40) from Khronos
> without any problem.
> 
> v2:
> 
> - Remove the Fixes: tags as the vanilla nouveau aren't going to hit these bugs. (Danilo)
> - Test the patchset on VK-GL-CTS. (Danilo)
> - Remove the ambiguous empty line in PATCH 2. (Danilo)
> - Rename the r535_gsp_msgq_wait to gsp_msgq_recv. (Danilo)
> - Introduce a data structure to hold the params of gsp_msgq_recv(). (Danilo)
> - Document the params and the states they are related to. (Danilo)
> 
> [1] https://lore.kernel.org/kvm/20240922124951.1946072-1-zhiw@nvidia.com/T/#t
> [2] https://benchmark.unigine.com/heaven
> [3] https://github.com/KhronosGroup/VK-GL-CTS
> 
> Zhi Wang (3):
>    nvkm/gsp: correctly advance the read pointer of GSP message queue
>    nvkm: correctly calculate the available space of the GSP cmdq buffer

Applied patches 1 and 2 to drm-misc-next, thanks!

Sorry for the delay,
Danilo

>    nvkm: handle the return of large RPC
> 
>   .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 251 +++++++++++++-----
>   1 file changed, 184 insertions(+), 67 deletions(-)
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-11-25 16:50 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-17  7:19 [PATCH 0/3] NVKM GSP RPC fixes Zhi Wang
2024-10-17  7:19 ` [PATCH 1/3] nvkm/gsp: correctly advance the read pointer of GSP message queue Zhi Wang
2024-10-17  7:19 ` [PATCH 2/3] nvkm: correctly calculate the available space of the GSP cmdq buffer Zhi Wang
2024-10-17  7:19 ` [PATCH 3/3] nvkm: handle the return of large RPC Zhi Wang
2024-10-17 11:29   ` Danilo Krummrich
2024-10-17 19:50     ` Zhi Wang
2024-10-17 20:53       ` Danilo Krummrich
2024-10-17 11:32 ` [PATCH 0/3] NVKM GSP RPC fixes Danilo Krummrich
2024-11-25 16:50 ` Danilo Krummrich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.