From: David Laight <david.laight.linux@gmail.com>
To: Willy Tarreau <w@1wt.eu>
Cc: "Thomas Weißschuh" <linux@weissschuh.net>,
linux-kernel@vger.kernel.org, "Cheng Li" <lechain@gmail.com>
Subject: Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf()
Date: Sun, 8 Feb 2026 16:54:25 +0000
Message-ID: <20260208165425.3ffd67dc@pumpkin>
In-Reply-To: <aYihTXM3titVomKc@1wt.eu>
On Sun, 8 Feb 2026 15:44:29 +0100
Willy Tarreau <w@1wt.eu> wrote:
> Hi David,
>
> On Sun, Feb 08, 2026 at 12:20:31PM +0000, David Laight wrote:
> > On Sat, 7 Feb 2026 23:50:19 +0000
> > David Laight <david.laight.linux@gmail.com> wrote:
> >
> > > On Sat, 7 Feb 2026 21:05:42 +0100
> > > Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > > On Fri, Feb 06, 2026 at 07:11:15PM +0000, david.laight.linux@gmail.com wrote:
> > > > > From: David Laight <david.laight.linux@gmail.com>
> > > > >
> > > > > Move the check for the length modifiers into the format processing
> > > > > between the field width and conversion specifier.
> > > > > This lets the loop be simplified and a 'fast scan' for a format start
> > > > > used.
> > > > >
> > > > > If an error is detected (eg an invalid conversion specifier) then
> > > > > copy the invalid format to the output buffer.
> > > > >
> > > > > Reduces code size by about 10% on x86-64.
> > > >
> > > > I'm surprised, because for me it's the opposite:
> > > >
> > > > $ size hello-patch*
> > > >    text    data     bss     dec     hex filename
> > > >    1859      48      24    1931     78b hello-patch1
> > > >    2071      48      24    2143     85f hello-patch2
> > > >    2091      48      24    2163     873 hello-patch3
> > > >    2422      48      24    2494     9be hello-patch4
> > > >
> > > > The whole program grew by almost 16%, and that's a 30% increase since
> > > > the first patch. This is with gcc 15 -Oz. aarch64 however decreased by
> > > > 15 bytes since previous patch.
> > > >
> > > > I have not figured out what causes this change yet, I'm still digging.
> > >
> > > Running scripts/bloat-o-meter will give more detail.
> > >
> > > > Willy
> > >
> > > I'm using gcc 12.2 and just running 'make O=xxx' for the test program.
> > > The object looks like what I'd expect, so might be -O2.
> > >
> > > Is it constant folding the #defines?
> > > For me it is generating the (1 << (c & 31)) & 0xxxxx as you might hope.
> >
> > Further thoughts:
> >
> > On some of the builds I've done gcc duplicated the code following an 'if'
> > into both the 'then' and 'else' clauses.
> > This isn't good for code size.
>
> That's common in loops for example. That's also one reason for avoiding
> "else" statements in compact code.
>
> However here I finally found what inflates the code, when disassembling
> the whole function: with the move of the multiple "if" statements,
> recent compilers managed to turn it into a jump table, that considerably
> inflates .rodata and the function as well. By passing -fno-jump-tables,
> the size drops by ~500 bytes:
That is just insane...
That might go away with the patch that changes it all to bit-masks.
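As a rough illustration of the bit-mask idea (a minimal sketch only; CH_BIT, CONV_MASK and is_conv_char are hypothetical names and the character set is made up, this is not the code from the later patch), a set of lower-case conversion characters can be tested with one shift and one AND, which leaves nothing for the compiler to turn into a jump table:

```c
/* Hypothetical helper: one bit per character, indexed by its low 5 bits. */
#define CH_BIT(c) (1u << ((c) & 31))

/* Hypothetical set of accepted conversion characters: c d i s u x */
#define CONV_MASK (CH_BIT('c') | CH_BIT('d') | CH_BIT('i') | \
		   CH_BIT('s') | CH_BIT('u') | CH_BIT('x'))

/* Return 1 if c is one of the characters in CONV_MASK, else 0. */
static int is_conv_char(int c)
{
	/*
	 * Within 'a'..'z' the value (c & 31) is unique (1..26), so the
	 * range check plus a single AND is a complete membership test.
	 */
	if (c < 'a' || c > 'z')
		return 0;
	return (CH_BIT(c) & CONV_MASK) != 0;
}
```

With these definitions is_conv_char('d') is 1, while is_conv_char('q') and is_conv_char('D') are both 0.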
I'd done some full disassembly comparisons myself to see why changes
made the code larger.
I had an OPTIMIZER_HIDE_VAR(sign) in there to help, but the final
version didn't need it.
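For anyone not familiar with it, a sketch of what that looks like (HIDE_VAR here is a stand-in modelled on the kernel's OPTIMIZER_HIDE_VAR(), and sign_char is a made-up function, not the printf code): an empty asm statement claims to read and rewrite the variable, so the compiler cannot carry knowledge of its value past that point.

```c
/*
 * Assumption: same shape as the kernel's OPTIMIZER_HIDE_VAR() macro.
 * The empty asm takes 'var' as input and output in the same register.
 */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/* Hypothetical example: pick the sign character to print for v. */
static char sign_char(long v, char plus)
{
	char sign = plus;

	if (v < 0)
		sign = '-';
	/*
	 * After this point the compiler no longer knows which of the two
	 * values 'sign' holds, which may discourage it from duplicating
	 * the following code into both arms of the 'if'.
	 */
	HIDE_VAR(sign);
	return sign;
}
```

The asm emits no instructions, so the only cost is that 'sign' has to live in a register at that point. Whether it actually shrinks the code depends on the compiler version and options.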
What this sort of code needs is something to force the compiler to
only have one copy of something - I found a proposal for an attribute
(or similar) for an asm block to do that, but nothing came of it.
>
>    text    data     bss     dec     hex filename
>    2422      48      24    2494     9be hello-patch4
>    1917      48      24    1989     7c5 hello-patch4-alt <---
>
> Building with gcc before 13 also avoids this table and explains why
> you had better code with gcc-12.
>
> I also noticed that we can reduce the loop by ~40 bytes by moving the
> literal copy after the block that deals with format sequences,
> because it eases comparisons, but that's no big deal for now since your
> subsequent patches are going to change all that.
Some of the early patches are carefully arranged to reduce churn
later on.
I might add the 'if (v == 0)' clause much earlier to avoid the churn
caused by the extra indent when it is added.
I'll add some extra comments as you suggested in the other patches.
I do know all about optimising for size, and for the 'worst case path'.
The latter was some embedded hdlc code that had to finish in 196 clocks.
David
>
> At least I wanted to understand what was causing this difference for
> us both, and whether it risked remaining definitive or not, so now
> this patch is OK to me.
>
> Acked-by: Willy Tarreau <w@1wt.eu>
>
> Willy