From: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
To: "Rob Herring" <robh@kernel.org>,
	"Krzysztof Kozlowski" <krzk+dt@kernel.org>,
	"Conor Dooley" <conor+dt@kernel.org>,
	"Heiko Stuebner" <heiko@sntech.de>,
	"Oded Gabbay" <ogabbay@kernel.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Sebastian Reichel" <sebastian.reichel@collabora.com>,
	"Jeffrey Hugo" <quic_jhugo@quicinc.com>,
	linux-rockchip@lists.infradead.org,
	"Tomeu Vizoso" <tomeu@tomeuvizoso.net>
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	Tomeu Vizoso <tomeu@tomeuvizoso.net>
Subject: Re: [PATCH v2 6/7] accel/rocket: Add job submission IOCTL
Date: Tue, 29 Apr 2025 12:05:28 +0200	[thread overview]
Message-ID: <2365360.ElGaqSPkdT@workhorse> (raw)
In-Reply-To: <20250225-6-10-rocket-v2-6-d4dbcfafc141@tomeuvizoso.net>

On Tuesday, 25 February 2025 08:55:52 Central European Summer Time Tomeu Vizoso wrote:
> Using the DRM GPU scheduler infrastructure, with a scheduler for each
> core.
> 
> Userspace can decide for a series of tasks to be executed sequentially
> in the same core, so SRAM locality can be taken advantage of.
> 
> The job submission code was initially based on Panfrost.
> 
> v2:
> - Remove hardcoded number of cores
> - Misc. style fixes (Jeffrey Hugo)
> - Repack IOCTL struct (Jeffrey Hugo)
> 
> Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
> ---
>  drivers/accel/rocket/Makefile        |   3 +-
>  drivers/accel/rocket/rocket_core.c   |   6 +
>  drivers/accel/rocket/rocket_core.h   |  14 +
>  drivers/accel/rocket/rocket_device.c |   2 +
>  drivers/accel/rocket/rocket_device.h |   2 +
>  drivers/accel/rocket/rocket_drv.c    |  15 +
>  drivers/accel/rocket/rocket_drv.h    |   4 +
>  drivers/accel/rocket/rocket_job.c    | 710 +++++++++++++++++++++++++++++++++++
>  drivers/accel/rocket/rocket_job.h    |  50 +++
>  include/uapi/drm/rocket_accel.h      |  55 +++
>  10 files changed, 860 insertions(+), 1 deletion(-)

Hi Tomeu,

some more power management things I've noticed here.

>
> [...]
>
> diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..25b31f28e932aaee86173b9a0962932c9c640c03
> --- /dev/null
> +++ b/drivers/accel/rocket/rocket_job.c
>
> [...]
>
> +static void
> +rocket_reset(struct rocket_core *core, struct drm_sched_job *bad)
> +{
> +	bool cookie;
> +
> +	if (!atomic_read(&core->reset.pending))
> +		return;
> +
> +	/*
> +	 * Stop the scheduler.
> +	 *
> +	 * FIXME: We temporarily get out of the dma_fence_signalling section
> +	 * because the cleanup path generate lockdep splats when taking locks
> +	 * to release job resources. We should rework the code to follow this
> +	 * pattern:
> +	 *
> +	 *	try_lock
> +	 *	if (locked)
> +	 *		release
> +	 *	else
> +	 *		schedule_work_to_release_later
> +	 */
> +	drm_sched_stop(&core->sched, bad);
> +
> +	cookie = dma_fence_begin_signalling();
> +
> +	if (bad)
> +		drm_sched_increase_karma(bad);
> +
> +	/*
> +	 * Mask job interrupts and synchronize to make sure we won't be
> +	 * interrupted during our reset.
> +	 */
> +	rocket_write(core, REG_PC_INTERRUPT_MASK, 0x0);
> +	synchronize_irq(core->irq);
> +
> +	/* Handle the remaining interrupts before we reset. */
> +	rocket_job_handle_irq(core);
> +
> +	/*
> +	 * Remaining interrupts have been handled, but we might still have
> +	 * stuck jobs. Let's make sure the PM counters stay balanced by
> +	 * manually calling pm_runtime_put_noidle() and
> +	 * rocket_devfreq_record_idle() for each stuck job.
> +	 * Let's also make sure the cycle counting register's refcnt is
> +	 * kept balanced to prevent it from running forever
> +	 */
> +	spin_lock(&core->job_lock);
> +	if (core->in_flight_job)
> +		pm_runtime_put_noidle(core->dev);

This particular line of code caused me issues when I was experimenting with
the driver on RK3576. My current situation is that every job that gets
submitted times out because some IRQs never arrive, which is beside the
point, but it did expose a power management bug here that I suspect may
affect RK3588 as well. After about three timeouts, we reset again and hang
in the interrupt mask write just a few lines above. This is because we
somehow managed to get into a situation where this function is called with
pclk disabled, and this manual balancing act may be part of it. Not doing
the manual balancing here at all doesn't fix it either; I had to explicitly
wrap the reset function in pm_runtime_get_sync() and pm_runtime_put_noidle()
to avoid having it execute with the clocks disabled. That seems like the
"wrong solution" though, because it means something already powered off our
hardware too eagerly before we got the reset done.

My gut instinct tells me that there's a bug here with multiple timed-out
jobs being in flight, but I'm not 100% sure. Maybe you'll be able to
reproduce it on an RK3588 by artificially forcing all your jobs to time out,
either with a very low timeout value or by masking the interrupts.
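
Something as crude as this is what I had in mind for the timeout route (the
define name is hypothetical, i.e. whatever constant ends up in
msecs_to_jiffies() for the scheduler timeout in rocket_job.c):

	/* Hypothetical: make every job "time out" almost immediately. */
	#define ROCKET_JOB_TIMEOUT_MS	1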

> +
> +	core->in_flight_job = NULL;
> +	spin_unlock(&core->job_lock);
> +
> +	/* Proceed with reset now. */
> +	pm_runtime_force_suspend(core->dev);
> +	pm_runtime_force_resume(core->dev);

Do we need to guarantee some sort of exclusivity here so that nothing tries to
concurrently use the device while we're forcing it off and back on again? I'm
unfamiliar with the drm fence stuff, so I may be overthinking this.
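
If it does turn out to be necessary, something like a per-core lock around
the power cycle is what I had in mind (sketch only, core->reset.lock doesn't
exist in this patch):

	/* Hypothetical: keep other users away while we power-cycle the core. */
	mutex_lock(&core->reset.lock);
	pm_runtime_force_suspend(core->dev);
	pm_runtime_force_resume(core->dev);
	mutex_unlock(&core->reset.lock);

with the same lock taken around job submission, unless drm_sched_stop()
already guarantees nothing else touches the hardware at this point.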

> +
> +	/* GPU has been reset, we can clear the reset pending bit. */
> +	atomic_set(&core->reset.pending, 0);

Nitpick: should probably be "NPU" ;)

> +
> +	/*
> +	 * Now resubmit jobs that were previously queued but didn't have a
> +	 * chance to finish.
> +	 * FIXME: We temporarily get out of the DMA fence signalling section
> +	 * while resubmitting jobs because the job submission logic will
> +	 * allocate memory with the GFP_KERNEL flag which can trigger memory
> +	 * reclaim and exposes a lock ordering issue.
> +	 */
> +	dma_fence_end_signalling(cookie);
> +	drm_sched_resubmit_jobs(&core->sched);
> +	cookie = dma_fence_begin_signalling();
> +
> +	/* Restart the scheduler */
> +	drm_sched_start(&core->sched, 0);
> +
> +	dma_fence_end_signalling(cookie);
> +}
>
> [...]
>

Kind regards,
Nicolas Frattaroli


