From: David Laight <david.laight.linux@gmail.com>
To: "Thomas Weißschuh" <linux@weissschuh.net>
Cc: Willy Tarreau <w@1wt.eu>,
linux-kernel@vger.kernel.org, Cheng Li <lechain@gmail.com>
Subject: Re: [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
Date: Mon, 16 Feb 2026 22:47:36 +0000 [thread overview]
Message-ID: <20260216224736.518df26e@pumpkin> (raw)
In-Reply-To: <74f30662-2481-47eb-bfef-2195b170f4af@t-8ch.de>
On Mon, 16 Feb 2026 20:52:49 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:
> On 2026-02-06 19:11:16+0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> >
> > Use flags bits (1u << (ch & 31)) for the flags, length modifiers, and
> > conversion specifiers.
> > This makes it easy to test for multiple values at once.
> >
> > Detect the conversion flags " #+-0" although they are currently all ignored.
> >
> > Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
> > (both long long).
> >
> > Add support for "%i" (the same as %d").
>
> Would it be much work to split the new functionality into its own patch?
> >
> > Unconditionally generate the signed values (for %d) to remove a second
> > set of checks for the size.
> >
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> >
> > Changes for v2:
> > - Use #defines to make the code a lot more readable.
> > - Include the changes from the old patch 10 that used masks for the
> > conversion specifiers.
> > - Detect all the valid flag characters even though they are not implemented.
> > - Support for left justifying field is moved to patch 7.
> >
> > tools/include/nolibc/stdio.h | 151 ++++++++++++++++++++++++-----------
> > 1 file changed, 103 insertions(+), 48 deletions(-)
> >
> > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > index bb54f488c228..b14cf8224403 100644
> > --- a/tools/include/nolibc/stdio.h
> > +++ b/tools/include/nolibc/stdio.h
> > @@ -240,19 +240,44 @@ char *fgets(char *s, int size, FILE *stream)
> > }
> >
> >
> > -/* minimal printf(). It supports the following formats:
> > - * - %[l*]{d,u,c,x,p}
> > - * - %s
> > - * - unknown modifiers are ignored.
> > +/* simple printf(). It supports the following formats:
> > + * - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
> > + * - %%
> > + * - invalid formats are copied to the output buffer
> > */
> > +
> > +/* This code uses 'flag' variables that are indexed by the low 6 bits
> > + * of characters to optimise checks for multiple characters.
> > + *
> > + * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b'. ...)
> > + * returns non-zero if the bit for any of the specified characters is set.
> > + *
> > + * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b'. ...)
> > + * returns the flag bit for ch if it is one of the specified characters.
> > + * All the characters must be in the same 32 character block (non-alphabetic,
> > + * upper case, or lower case) of the ASCII character set.)
> > + */
> > +#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
> > +#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
> > +#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
> > + (_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
> > + _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
> > + _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
> > + _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
> > +#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
> > + ((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
> > +#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
> > + (ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
> > + _NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))
>
> With signed chars I get:
>
> sysroot/i386/include/stdio.h:321:37: error: comparison is always false due to limited range of data type [-Werror=type-limits]
> 321 | (ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
Stupid type-limits warning.
It is almost impossible to get rid of without making the code unreadable.
> | ^
> sysroot/i386/include/stdio.h:389:35: note: in expansion of macro '_NOLIBC_PF_CHAR_IS_ONE_OF'
> 389 | ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
> |
>
> This can be fixed by switching 'ch' to be always unsigned.
That's likely to provoke another error elsewhere.
In any case optimising that test away makes the code smaller!
>
> You can run the the builtin test suite like this:
> -p triggers the download of the toolchains
> -l uses LLVM/clang instead of the downloaded toolchain
>
> $ cd tools/testing/selftests/nolibc
> $ ./run-tests.sh -m user -p
> $ ./run-tests.sh -m user -l
I've just been running:
rm nolibc-test; make -O /path/.../dir && ./nolibc-test printf
>
> > +
> > typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
> >
> > static __attribute__((unused, format(printf, 3, 0)))
> > int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
> > {
> > - char lpref, ch;
> > - unsigned long long v;
> > + char ch;
> > unsigned int written, width;
> > + unsigned int flags, ch_flag;
> > size_t len;
> > char tmpbuf[21];
> > const char *outstr;
> > @@ -265,6 +290,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> > break;
> >
> > width = 0;
> > + flags = 0;
> > if (ch != '%') {
> > while (*fmt && *fmt != '%')
> > fmt++;
> > @@ -274,6 +300,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >
> > ch = *fmt++;
> >
> > + /* Conversion flag characters */
> > + for (;; ch = *fmt++) {
> > + ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
> > + if (!ch_flag)
> > + break;
> > + flags |= ch_flag;
> > + }
>
> What is the advantage of this over:
>
> while (1) {
> /* ... */
>
> ch = *fmt++;
> }
One line shorter :-)
>
> Or combine it with the 'ch = *fmt++' from above and do:
>
> while (1) {
> ch = *fmt++;
>
> /* ... */
> }
>
> These look much simpler to me.
The code always needs to get a new character after using one.
So you come out of each bit of code with the 'next char to check' in ch.
Which means you need a new character at the end of the loop for the next
iteration.
>
> > +
> > /* width */
> > while (ch >= '0' && ch <= '9') {
> > width *= 10;
>
> (...)
>
> > +do_output:
> > written += len;
> >
> > + /* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
> > + * code into one of the conditionals above.
> > + */
> > + __asm__ volatile("" : "=r"(len) : "0"(len));
>
> Can we make a definition in nolibc/compiler.h out of this?
I've added _NOLIBC_OPTIMIZER_HIDE_VAR() to compiler.h in the draft of v3.
> Why the 'volatile'? Wo don't have that in the kernel.
I probably wrote that before looking up the kernel definition.
> Why separate input and ouput arguments instead of one combined one ('+r')?
> I have been wondering the same about the kernel definition, too.
I'm not sure either.
Certainly "+r" is more modern - which is why it isn't used in a lot of places.
They may be identical (indeed "+r" might get converted to the "0" form),
but maybe it gives better separation - I just copied the same version.
David
>
> > +
> > while (width > len) {
> > unsigned int pad_len = ((width - len - 1) & 15) + 1;
> > width -= pad_len;
> > --
> > 2.39.5
> >
next prev parent reply other threads:[~2026-02-16 22:47 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
2026-02-06 19:11 ` [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch' david.laight.linux
2026-02-07 18:51 ` Willy Tarreau
2026-02-16 18:52 ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback david.laight.linux
2026-02-07 19:12 ` Willy Tarreau
2026-02-07 23:28 ` David Laight
2026-02-08 15:12 ` Willy Tarreau
2026-02-08 22:49 ` David Laight
2026-02-06 19:11 ` [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback david.laight.linux
2026-02-07 19:29 ` Willy Tarreau
2026-02-07 23:36 ` David Laight
2026-02-16 19:07 ` Thomas Weißschuh
2026-02-17 11:51 ` David Laight
2026-02-18 17:52 ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks david.laight.linux
2026-02-07 19:38 ` Willy Tarreau
2026-02-07 23:43 ` David Laight
2026-02-08 15:14 ` Willy Tarreau
2026-02-16 19:30 ` Thomas Weißschuh
2026-02-16 22:29 ` David Laight
2026-02-18 17:30 ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf() david.laight.linux
2026-02-07 20:05 ` Willy Tarreau
2026-02-07 23:50 ` David Laight
2026-02-08 12:20 ` David Laight
2026-02-08 14:44 ` Willy Tarreau
2026-02-08 16:54 ` David Laight
2026-02-08 17:06 ` Willy Tarreau
2026-02-06 19:11 ` [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars david.laight.linux
2026-02-08 15:22 ` Willy Tarreau
2026-02-16 19:52 ` Thomas Weißschuh
2026-02-16 22:47 ` David Laight [this message]
2026-02-18 17:36 ` Thomas Weißschuh
2026-02-18 22:57 ` David Laight
2026-02-06 19:11 ` [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X" david.laight.linux
2026-02-08 15:47 ` Willy Tarreau
2026-02-08 17:14 ` David Laight
2026-02-08 16:06 ` Willy Tarreau
2026-02-16 19:57 ` Thomas Weißschuh
2026-02-16 22:50 ` David Laight
2026-02-18 17:39 ` Thomas Weißschuh
2026-02-16 20:11 ` Thomas Weißschuh
2026-02-16 22:52 ` David Laight
2026-02-06 19:11 ` [PATCH v2 next 08/11] tools/nolibc/printf: Add support for zero padding and field precision david.laight.linux
2026-02-08 16:16 ` Willy Tarreau
2026-02-08 17:31 ` David Laight
2026-02-06 19:11 ` [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors david.laight.linux
2026-02-16 20:05 ` Thomas Weißschuh
2026-02-17 10:48 ` David Laight
2026-02-18 17:48 ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests david.laight.linux
2026-02-16 20:14 ` Thomas Weißschuh
2026-02-16 20:23 ` Thomas Weißschuh
2026-02-16 22:54 ` David Laight
2026-02-18 17:41 ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output david.laight.linux
2026-02-08 16:20 ` Willy Tarreau
2026-02-16 20:22 ` Thomas Weißschuh
2026-02-06 21:36 ` [PATCH v2 next 00/11] tools/nolibc: Enhance printf() David Laight
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260216224736.518df26e@pumpkin \
--to=david.laight.linux@gmail.com \
--cc=lechain@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@weissschuh.net \
--cc=w@1wt.eu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.