From: Arnd Bergmann <arnd@kernel.org>
To: Detlev Casanova <detlev.casanova@collabora.com>,
Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>,
Mauro Carvalho Chehab <mchehab@kernel.org>,
Heiko Stuebner <heiko@sntech.de>,
Nathan Chancellor <nathan@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>,
Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
Bill Wendling <morbo@google.com>,
Justin Stitt <justinstitt@google.com>,
Nicolas Dufresne <nicolas.dufresne@collabora.com>,
Hans Verkuil <hverkuil+cisco@kernel.org>,
Alex Bee <knaerzche@gmail.com>, Jonas Karlman <jonas@kwiboo.se>,
Kees Cook <kees@kernel.org>,
linux-media@vger.kernel.org, linux-rockchip@lists.infradead.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, llvm@lists.linux.dev
Subject: [PATCH 2/2] [v2] media: rkvdec: reduce stack usage in rkvdec_init_v4l2_vp9_count_tbl()
Date: Thu, 5 Mar 2026 16:26:17 +0100
Message-ID: <20260305152644.791897-2-arnd@kernel.org>
In-Reply-To: <20260305152644.791897-1-arnd@kernel.org>

From: Arnd Bergmann <arnd@arndb.de>

The deeply nested loop in rkvdec_init_v4l2_vp9_count_tbl() needs a lot
of registers, so when the clang register allocator runs out, it ends up
spilling countless temporaries to the stack:

drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c:966:12: error: stack frame size (1472) exceeds limit (1280) in 'rkvdec_vp9_start' [-Werror,-Wframe-larger-than]

Split out the innermost loop into a separate function that is marked
noinline_for_stack. I tried out all combinations of having some of
the inner loops inside of the separate function, but this was the only
variant that creates reasonable code with clang-22 on arm64.

Link: https://lore.kernel.org/linux-media/20260202094804.1231706-1-arnd@kernel.org/T/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
v2: rework after seeing more of the same warning with v1 applied.
My earlier version was much simpler but still exceeded 1280 bytes of
stack space in some configurations because of unnecessary variable spills.
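
For readers unfamiliar with the pattern, the split described in the
changelog can be sketched in userspace roughly as follows. This is only
an illustration under stated assumptions: the names demo_ctx, demo_hw_tbl,
demo_init(), demo_init_loop() and demo_count_mismatches(), as well as the
table dimensions, are made up, and ARRAY_SIZE()/noinline_for_stack are
redefined locally as stand-ins for the kernel macros. It does not
reproduce the stack frame warning; it only shows the shape of the code,
including the [i][j][k] to [k][i][j] index permutation.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel macros (assumptions for this sketch). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define noinline_for_stack __attribute__((noinline))

/* Hypothetical entry, far smaller than the real VP9 count tables. */
struct demo_hw_entry {
	unsigned int coeff;
	unsigned int eob[2];
};

/* Hypothetical "hardware" layout, indexed as [k][i][j][l][m]. */
struct demo_hw_tbl {
	struct demo_hw_entry ref_cnt[4][2][3][5][6];
};

/* Software view: pointer tables indexed as [i][j][k][l][m]. */
struct demo_ctx {
	struct demo_hw_tbl hw;
	unsigned int *coeff[2][3][4][5][6];
	unsigned int *eob[2][3][4][5][6][2];
};

/*
 * Only the innermost loop lives here. Keeping it out of line means the
 * compiler allocates registers for this loop separately from the caller.
 */
static noinline_for_stack void
demo_init_loop(struct demo_ctx *ctx, size_t i, size_t j, size_t k, size_t l)
{
	assert(k < ARRAY_SIZE(ctx->hw.ref_cnt));

	for (size_t m = 0; m < ARRAY_SIZE(ctx->coeff[0][0][0][0]); ++m) {
		ctx->coeff[i][j][k][l][m] = &ctx->hw.ref_cnt[k][i][j][l][m].coeff;
		ctx->eob[i][j][k][l][m][0] = &ctx->hw.ref_cnt[k][i][j][l][m].eob[0];
		ctx->eob[i][j][k][l][m][1] = &ctx->hw.ref_cnt[k][i][j][l][m].eob[1];
	}
}

/* The four outer loops stay in the caller and only pass indices down. */
static void demo_init(struct demo_ctx *ctx)
{
	for (size_t i = 0; i < ARRAY_SIZE(ctx->coeff); ++i)
		for (size_t j = 0; j < ARRAY_SIZE(ctx->coeff[0]); ++j)
			for (size_t k = 0; k < ARRAY_SIZE(ctx->coeff[0][0]); ++k)
				for (size_t l = 0; l < ARRAY_SIZE(ctx->coeff[0][0][0]); ++l)
					demo_init_loop(ctx, i, j, k, l);
}

/* Count coeff pointers that do not alias the permuted hardware slot. */
static size_t demo_count_mismatches(struct demo_ctx *ctx)
{
	size_t bad = 0;

	for (size_t i = 0; i < ARRAY_SIZE(ctx->coeff); ++i)
		for (size_t j = 0; j < ARRAY_SIZE(ctx->coeff[0]); ++j)
			for (size_t k = 0; k < ARRAY_SIZE(ctx->coeff[0][0]); ++k)
				for (size_t l = 0; l < ARRAY_SIZE(ctx->coeff[0][0][0]); ++l)
					for (size_t m = 0; m < ARRAY_SIZE(ctx->coeff[0][0][0][0]); ++m)
						if (ctx->coeff[i][j][k][l][m] !=
						    &ctx->hw.ref_cnt[k][i][j][l][m].coeff)
							++bad;
	return bad;
}
```

A caller would zero-initialise a struct demo_ctx, call demo_init(&ctx),
and could then write through ctx.coeff[i][j][k][l][m] to reach
hw.ref_cnt[k][i][j][l][m].coeff. Passing the four outer indices by value
means the helper only has a handful of values live at any time.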
---
.../platform/rockchip/rkvdec/rkvdec-vp9.c | 48 ++++++++++---------
1 file changed, 26 insertions(+), 22 deletions(-)
diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c
index e4cdd2122873..ecb2819bd566 100644
--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c
+++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c
@@ -893,12 +893,36 @@ static void rkvdec_vp9_done(struct rkvdec_ctx *ctx,
 	update_ctx_last_info(vp9_ctx);
 }
 
+/* noinline to ensure clang's register allocator doesn't run out of registers */
+static noinline void
+rkvdec_init_v4l2_vp9_count_tbl_loop(struct rkvdec_vp9_ctx *vp9_ctx, int i, int j, int k, int l)
+{
+	struct rkvdec_vp9_intra_frame_symbol_counts *intra_cnts = vp9_ctx->count_tbl.cpu;
+	struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts = vp9_ctx->count_tbl.cpu;
+
+	for (int m = 0; m < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0][0]); ++m) {
+		vp9_ctx->inter_cnts.coeff[i][j][k][l][m] =
+			&inter_cnts->ref_cnt[k][i][j][l][m].coeff;
+		vp9_ctx->inter_cnts.eob[i][j][k][l][m][0] =
+			&inter_cnts->ref_cnt[k][i][j][l][m].eob[0];
+		vp9_ctx->inter_cnts.eob[i][j][k][l][m][1] =
+			&inter_cnts->ref_cnt[k][i][j][l][m].eob[1];
+
+		vp9_ctx->intra_cnts.coeff[i][j][k][l][m] =
+			&intra_cnts->ref_cnt[k][i][j][l][m].coeff;
+		vp9_ctx->intra_cnts.eob[i][j][k][l][m][0] =
+			&intra_cnts->ref_cnt[k][i][j][l][m].eob[0];
+		vp9_ctx->intra_cnts.eob[i][j][k][l][m][1] =
+			&intra_cnts->ref_cnt[k][i][j][l][m].eob[1];
+	}
+}
+
 static void rkvdec_init_v4l2_vp9_count_tbl(struct rkvdec_ctx *ctx)
 {
 	struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv;
 	struct rkvdec_vp9_intra_frame_symbol_counts *intra_cnts = vp9_ctx->count_tbl.cpu;
 	struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts = vp9_ctx->count_tbl.cpu;
-	int i, j, k, l, m;
+	int i, j, k, l;
 
 	vp9_ctx->inter_cnts.partition = &inter_cnts->partition;
 	vp9_ctx->inter_cnts.skip = &inter_cnts->skip;
@@ -936,31 +960,11 @@ static void rkvdec_init_v4l2_vp9_count_tbl(struct rkvdec_ctx *ctx)
 	vp9_ctx->inter_cnts.class0_hp = &inter_cnts->class0_hp;
 	vp9_ctx->inter_cnts.hp = &inter_cnts->hp;
 
-#define INNERMOST_LOOP \
-	do { \
-		for (m = 0; m < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0][0]); ++m) {\
-			vp9_ctx->inter_cnts.coeff[i][j][k][l][m] = \
-				&inter_cnts->ref_cnt[k][i][j][l][m].coeff; \
-			vp9_ctx->inter_cnts.eob[i][j][k][l][m][0] = \
-				&inter_cnts->ref_cnt[k][i][j][l][m].eob[0]; \
-			vp9_ctx->inter_cnts.eob[i][j][k][l][m][1] = \
-				&inter_cnts->ref_cnt[k][i][j][l][m].eob[1]; \
-			\
-			vp9_ctx->intra_cnts.coeff[i][j][k][l][m] = \
-				&intra_cnts->ref_cnt[k][i][j][l][m].coeff; \
-			vp9_ctx->intra_cnts.eob[i][j][k][l][m][0] = \
-				&intra_cnts->ref_cnt[k][i][j][l][m].eob[0]; \
-			vp9_ctx->intra_cnts.eob[i][j][k][l][m][1] = \
-				&intra_cnts->ref_cnt[k][i][j][l][m].eob[1]; \
-		} \
-	} while (0)
-
 	for (i = 0; i < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff); ++i)
 		for (j = 0; j < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0]); ++j)
 			for (k = 0; k < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0]); ++k)
 				for (l = 0; l < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0]); ++l)
-					INNERMOST_LOOP;
-#undef INNERMOST_LOOP
+					rkvdec_init_v4l2_vp9_count_tbl_loop(vp9_ctx, i, j, k, l);
 }
 
 static int rkvdec_vp9_start(struct rkvdec_ctx *ctx)
--
2.39.5