public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@kernel.org>
To: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Nathan Chancellor <nathan@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Alexandre Chartre <alexandre.chartre@oracle.com>,
	David Laight <david.laight.linux@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] objtool: Fix stack overflow in validate_branch()
Date: Wed, 3 Dec 2025 20:11:54 +0100
Message-ID: <aTCLevOLZ69EpXNF@gmail.com>
In-Reply-To: <7i2v6lkl7pd2jzk57omos6pqkgwooewrrztsvi5weibvod2f5b@3mkwqwzslyl4>

* Josh Poimboeuf <jpoimboe@kernel.org> wrote:

> On Wed, Dec 03, 2025 at 10:25:34AM +0100, Ingo Molnar wrote:
> > * Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> > > On Tue, Dec 02, 2025 at 05:20:22PM +0100, Ingo Molnar wrote:
> > > > * Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> > > > > On an allmodconfig kernel compiled with Clang, objtool is
> > > > > segfaulting in drivers/scsi/qla2xxx/qla2xxx.o due to a stack
> > > > > overflow in validate_branch().
> > > > >
> > > > > Due in part to KASAN being enabled, the qla2xxx code has a large
> > > > > number of conditional jumps, causing objtool to go quite deep in
> > > > > its recursion.
> > > > >
> > > > > By far the biggest offender of stack usage is the recently added
> > > > > 'prev_state' stack variable in validate_insn(), coming in at 328
> > > > > bytes.
> > > >
> > > > That's weird - how can a user-space tool run into stack limits, are
> > > > they set particularly conservatively?
> > >
> > > On my Fedora system, "ulimit -s" is 8MB.  You'd think that would be
> > > enough :-)
> > >
> > > In this case, objtool had over 20,000 stack frames caused by
> > > recursively following over 7,000(!) conditional jumps in a single
> > > function.
> >
> > BTW., I just instrumented it, and it's even worse: on current upstream,
> > the allmodconfig qla2xxx.o code built with clang-20.1.8 has a worst-case
> > recursion depth of 50,944 (!), for the qla83xx_fw_dump() function.
>
> Is that number of loops or total stack frames?

So I tracked the depth of validate_insn() recursion directly:

                ret = validate_insn(file, func, insn, &state, prev_insn, next_insn,
-                                   &dead_end);
+                                   &dead_end, depth++);

                if (!insn->trace) {
                        if (ret)

And in validate_insn():

	if (depth > max_depth) {
		max_depth = depth;
		printf("# objtool new max depth: %ld for %s()\n", max_depth, func->name);
	}

Actual function recursion depth may be deeper, if any of the helper
functions get uninlined.
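
For reference, the measurement technique in isolation looks roughly like
the sketch below. It is a toy, not objtool code: walk() and
measure_max_depth() are hypothetical names standing in for the real
validate_insn() call chain, and the only thing it shows is passing
'depth' down explicitly and recording a high-water mark.

```c
#include <stdio.h>

/* High-water mark of the recursion depth seen so far. */
static long max_depth;

/*
 * Hypothetical stand-in for the recursive instruction walk; it is NOT
 * objtool's validate_insn(). 'depth' is passed down explicitly.
 */
static void walk(int remaining, long depth)
{
	if (depth > max_depth)
		max_depth = depth;	/* record a new maximum */

	if (remaining > 0)
		walk(remaining - 1, depth + 1);
}

/* Run the toy walk 'n' levels deep and return the deepest level seen. */
long measure_max_depth(int n)
{
	max_depth = 0;
	walk(n, 0);
	return max_depth;
}
```

Calling measure_max_depth(100) walks 100 levels down and reports that
as the maximum. As with the real instrumentation, this counts logical
recursion levels, not actual stack frames.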

> kernel and clang 20.1.8 I'm getting a max recursion depth of 7,165 loops
> (not frames).  See the below patch for how I measured that.

Your patch seems to be similar, except that I passed in 'depth'
directly, because as a kernel developer I don't trust globals :-)

But it should measure the same thing AFAICS, right?

> You may be underestimating the amount of memory usage objtool needs.
> Running objtool on that binary with "/usr/bin/time -v" shows the maximum
> resident set size is 140M.  So the stack usage of 5.5MB is only about
> 4.4% of the total memory usage.

Still, the stack is among the cache-hottest memory in that
workload, and the biggest negative impact of the current
recursion pattern comes from the sparse parsing order, which
hurts even more with a 140MB working set.

> > One relatively simple method to 'straighten out' the parsing flow would
> > be to add an internal 'branch queue' with a limited size of say 16 or 32
> > entries, and defer the parsing of these branch targets and continue with
> > the next instruction, until one of these conditions is true:
> >
> >   - 'branch queue' is full
> >
> >   - JMP, CALL, RET or any other branching/trapping instruction is found
> >
> >   - already validated instruction is found
> >
> >   - end of symbol/section/file/etc.
> >
> > At which point the current 'branch queue' is flushed. (It might even be
> > implemented as a branch-target stack, which may have a bit better
> > locality.)
>
> Objtool tracks a considerable amount of state across branches.  The
> recursion works well for keeping that state at hand.  So there is a
> certain level of dependency there which I have a feeling might be
> difficult to extricate.  I haven't really looked at it though.

I'm not against recursion for branches at all, I just suggest
changing the order in which the recursion is fed: instead of parsing
the two instruction streams of a branch point in this order:

	verify target recursively
	verify next instruction

(Which is arguably the simplest.)

I suggest the following recursion pattern:

	verify a serial batch of instruction(s), saving conditional branch targets (if any)
	verify saved branch targets, recursively

This change to the recursion pattern should substantially reduce
max recursion depth, and it should also give substantially better
cache locality.
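
To make the two orderings concrete, here is a rough toy model. Nothing
in it is objtool code: struct toy_insn, QUEUE_SIZE, the walk_*()
helpers and toy_max_depth() are all made-up names, the "function" is
just an array where every instruction conditionally jumps over the next
one, and none of the state objtool tracks across branches is modeled,
which is the hard part in practice.

```c
#include <stdio.h>

#define MAX_INSNS	64
#define QUEUE_SIZE	16	/* assumed batch size ("16 or 32 entries") */

/* Toy instruction: an optional conditional branch target. */
struct toy_insn {
	int target;	/* index of the branch target, or -1 for none */
	int visited;
};

static int max_depth;

static void note_depth(int depth)
{
	if (depth > max_depth)
		max_depth = depth;
}

/* Current order: verify the target recursively, then the next insn. */
static void walk_target_first(struct toy_insn *code, int n, int i, int depth)
{
	note_depth(depth);
	for (; i < n && !code[i].visited; i++) {
		int t = code[i].target;

		code[i].visited = 1;
		if (t >= 0 && !code[t].visited)
			walk_target_first(code, n, t, depth + 1);
	}
}

static void walk_batched(struct toy_insn *code, int n, int i, int depth);

/* Verify all saved targets that are still unvisited, then empty the queue. */
static void flush_queue(struct toy_insn *code, int n, int *queue, int *nr,
			int depth)
{
	for (int q = 0; q < *nr; q++)
		if (!code[queue[q]].visited)
			walk_batched(code, n, queue[q], depth + 1);
	*nr = 0;
}

/* Suggested order: walk serially, save targets, recurse on flush. */
static void walk_batched(struct toy_insn *code, int n, int i, int depth)
{
	int queue[QUEUE_SIZE];
	int nr = 0;

	note_depth(depth);
	for (; i < n && !code[i].visited; i++) {
		code[i].visited = 1;
		if (code[i].target >= 0)
			queue[nr++] = code[i].target;
		if (nr == QUEUE_SIZE)	/* queue full: flush, then continue */
			flush_queue(code, n, queue, &nr, depth);
	}
	flush_queue(code, n, queue, &nr, depth);	/* end of the serial run */
}

/* Build the toy function and return the max depth of the chosen walk. */
int toy_max_depth(int n, int batched)
{
	struct toy_insn code[MAX_INSNS];

	if (n > MAX_INSNS)
		n = MAX_INSNS;
	for (int i = 0; i < n; i++) {
		code[i].target = (i + 2 < n) ? i + 2 : -1;
		code[i].visited = 0;
	}
	max_depth = 0;
	if (batched)
		walk_batched(code, n, 0, 0);
	else
		walk_target_first(code, n, 0, 0);
	return max_depth;
}
```

On this 64-instruction toy input, toy_max_depth(64, 0) reaches depth 31
with the target-first order, while toy_max_depth(64, 1) reaches depth 3
with the batched order. The exact ratio depends on the queue size and
on the shape of the code, so treat this as an illustration of the
ordering only.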

Thanks,

	Ingo


