From: david.laight.linux@gmail.com
To: "Willy Tarreau" <w@1wt.eu>,
"Thomas Weißschuh" <linux@weissschuh.net>,
linux-kernel@vger.kernel.org, "Cheng Li" <lechain@gmail.com>
Cc: David Laight <david.laight.linux@gmail.com>
Subject: [PATCH v3 next 11/17] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
Date: Mon, 23 Feb 2026 10:17:29 +0000
Message-ID: <20260223101735.2922-12-david.laight.linux@gmail.com>
In-Reply-To: <20260223101735.2922-1-david.laight.linux@gmail.com>
From: David Laight <david.laight.linux@gmail.com>
Use flag bits (1u << (ch & 31)) for the flags, length modifiers, and
conversion specifiers.
This makes it easy to test for multiple values at once.
Detect the conversion flags " #+-0" although they are currently all ignored.
Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
(both long long).
Add support for "%i" (the same as "%d") and "%X" (treated as "%x").
Unconditionally generate the signed values (for %d) to remove a second
set of checks for the size.
Separate out the formatting of single characters from numbers.
Output the sign for negative values then negate and treat as unsigned.
Change/add tests to use conversions i and X, and length modifiers L and ll.
Use the correct minimum value for "%Li".
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
Changes for v3:
- Patch 6 in v2.
- Move all the variable definitions to the top of the function.
The loop body is a bit long to hide definitions at its top.
- Avoid -Wtype-limits warnings when validating format characters.
- Include changes to the selftests.
Changes for v2:
- Use #defines to make the code a lot more readable.
- Include the changes from the old patch 10 that used masks for the
conversion specifiers.
- Detect all the valid flag characters even though they are not implemented.
- Support for left justifying field is moved to patch 7.
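
For readers new to the technique, here is a minimal standalone sketch of the
mask idea described in the commit message. It is not part of the patch. The
names PF_BIT, PF_CONV_FLAGS, pf_conv_flag(), pf_collect_flags() and pf_demo()
are hypothetical and only illustrate the approach: one bit per character,
taken from the low 5 bits, plus a range check that confines the test to a
single 32-character ASCII block.

```c
#include <assert.h>

/* Hypothetical names: these mirror the idea, not the patch's exact macros. */
#define PF_BIT(ch) (1u << ((ch) & 0x1f))

/* Mask for the conversion flag characters " #+-0" (all in 0x20..0x3f). */
#define PF_CONV_FLAGS \
	(PF_BIT(' ') | PF_BIT('#') | PF_BIT('+') | PF_BIT('-') | PF_BIT('0'))

/* Return the bit for ch if ch is one of " #+-0", otherwise 0. */
static unsigned int pf_conv_flag(int ch)
{
	/* Reject anything outside the 32-character block that starts at ' '. */
	if ((unsigned int)ch - 0x20 > 0x1f)
		return 0;
	return PF_BIT(ch) & PF_CONV_FLAGS;
}

/* Collect the leading flag characters of a format fragment such as "-0d". */
static unsigned int pf_collect_flags(const char *fmt)
{
	unsigned int flags = 0, bit;

	while ((bit = pf_conv_flag((unsigned char)*fmt++)) != 0)
		flags |= bit;
	return flags;
}

void pf_demo(void)
{
	unsigned int flags = pf_collect_flags("-0d");

	assert(flags == (PF_BIT('-') | PF_BIT('0')));
	/* A single AND tests for several characters at once. */
	assert(flags & (PF_BIT('+') | PF_BIT('-')));
	/* 'M' has the same low 5 bits as '-', so the block check is required. */
	assert(pf_conv_flag('M') == 0);
	/* Bits are shared across blocks: 'X' aliases 'x', and 'p' aliases '0'. */
	assert(PF_BIT('x') == PF_BIT('X'));
	assert(PF_BIT('p') == PF_BIT('0'));
}
```

Calling pf_demo() from a test program runs the checks. The last two
assertions show why the patch lets 'X' through with an explicit comparison
and masks the 'p' bit out of flags before testing for 'long' arguments.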
tools/include/nolibc/stdio.h | 162 +++++++++++++------
tools/testing/selftests/nolibc/nolibc-test.c | 14 +-
2 files changed, 124 insertions(+), 52 deletions(-)
diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index ae96b7bebbfe..6cb106367e3b 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -291,10 +291,15 @@ int fseek(FILE *stream, long offset, int whence)
}
-/* minimal printf(). It supports the following formats:
- * - %[l*]{d,u,c,x,p}
- * - %s
- * - unknown modifiers are ignored.
+/* printf(). Supports most of the normal integer and string formats.
+ * - %[#-+ 0][width][{l,t,z,ll,L,j,q}]{c,d,i,u,x,X,p,s,m,%}
+ * - %% generates a single %
+ * - %m outputs strerror(errno).
+ * - %X outputs a..f the same as %x.
+ * - The modifiers [#-+ 0] are currently ignored.
+ * - No support for precision or variable widths.
+ * - No support for floating point or wide characters.
+ * - Invalid formats are copied to the output buffer.
*
* Called by vfprintf() and snprintf() to do the actual formatting.
* The callers provide a callback function to save the formatted data.
@@ -305,15 +310,43 @@ int fseek(FILE *stream, long offset, int whence)
* - with (NULL, 0) at the end of the __nolibc_printf.
* If the callback returns non-zero __nolibc_printf() immediately returns -1.
*/
+
typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
+/* This code uses 'flag' variables that are indexed by the low 5 bits
+ * of characters to optimise checks for multiple characters.
+ *
+ * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b', ...)
+ * returns non-zero if the bit for any of the specified characters is set.
+ *
+ * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b', ...)
+ * returns the flag bit for ch if it is one of the specified characters.
+ * All the characters must be in the same 32 character block (non-alphabetic,
+ * upper case, or lower case) of the ASCII character set.
+ */
+#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
+#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
+#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
+ (_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
+ _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
+ _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
+ _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
+#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
+ ((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
+#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
+ ((unsigned int)(ch) - (cmp_1 & 0xe0) > 0x1f ? 0 : \
+ _NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))
+
static __attribute__((unused, format(printf, 3, 0)))
int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
{
- char lpref, ch;
+ char ch;
unsigned long long v;
+ long long signed_v;
int written, width, len;
+ unsigned int flags, ch_flag;
char outbuf[21];
+ char *out;
const char *outstr;
written = 0;
@@ -324,6 +357,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
break;
width = 0;
+ flags = 0;
if (ch != '%') {
while (*fmt && *fmt != '%')
fmt++;
@@ -334,7 +368,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
/* we're in a format sequence */
- ch = *fmt++;
+ /* Conversion flag characters */
+ while (1) {
+ ch = *fmt++;
+ ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
+ if (!ch_flag)
+ break;
+ flags |= ch_flag;
+ }
/* width */
while (ch >= '0' && ch <= '9') {
@@ -344,62 +385,82 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
ch = *fmt++;
}
- /* Length modifiers */
- if (ch == 'l') {
- lpref = 1;
- ch = *fmt++;
- if (ch == 'l') {
- lpref = 2;
- ch = *fmt++;
+ /* Length modifier.
+ * Their bits do not clash with the flag characters " #+-0" so can go into flags.
+ * Change both L and ll to q.
+ */
+ if (ch == 'L')
+ ch = 'q';
+ ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'l', 't', 'z', 'j', 'q');
+ if (ch_flag != 0) {
+ if (ch == 'l' && fmt[0] == 'l') {
+ fmt++;
+ ch_flag = _NOLIBC_PF_FLAG('q');
}
- } else if (ch == 'j') {
- /* intmax_t is long long */
- lpref = 2;
+ flags |= ch_flag;
ch = *fmt++;
- } else {
- lpref = 0;
}
- if (ch == 'c' || ch == 'd' || ch == 'u' || ch == 'x' || ch == 'p') {
- char *out = outbuf;
+ /* Conversion specifiers. */
- if (ch == 'p')
+ /* Numeric and pointer conversion specifiers.
+ *
+ * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
+ * so that 'X' can be allowed through.
+ * 'X' gets treated as 'x' because _NOLIBC_PF_FLAG() returns the same
+ * value for both.
+ */
+ ch_flag = _NOLIBC_PF_FLAG(ch);
+ if (((ch >= 'a' && ch <= 'z') || ch == 'X') &&
+ _NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p')) {
+ /* 'long' is needed for pointer conversions and ltz lengths.
+ * A single test can be used provided 'p' (the same bit as '0')
+ * is masked from flags.
+ */
+ if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
+ 'p', 'l', 't', 'z')) {
v = va_arg(args, unsigned long);
- else if (lpref) {
- if (lpref > 1)
- v = va_arg(args, unsigned long long);
- else
- v = va_arg(args, unsigned long);
- } else
+ signed_v = (long)v;
+ } else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
+ v = va_arg(args, unsigned long long);
+ signed_v = v;
+ } else {
v = va_arg(args, unsigned int);
+ signed_v = (int)v;
+ }
- if (ch == 'd') {
- /* sign-extend the value */
- if (lpref == 0)
- v = (long long)(int)v;
- else if (lpref == 1)
- v = (long long)(long)v;
+ if (ch == 'c') {
+ /* "%c" - single character. */
+ outbuf[0] = v;
+ len = 1;
+ outstr = outbuf;
+ goto do_output;
}
- switch (ch) {
- case 'c':
- out[0] = v;
- out[1] = 0;
- break;
- case 'd':
- i64toa_r(v, out);
- break;
- case 'u':
+ out = outbuf;
+
+ if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i')) {
+ /* "%d" and "%i" - signed decimal numbers. */
+ if (signed_v < 0) {
+ *out++ = '-';
+ v = -(signed_v + 1);
+ v++;
+ }
+ }
+
+ /* Convert the number to ascii in the required base. */
+ if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
+ /* Base 10 */
u64toa_r(v, out);
- break;
- case 'p':
- *(out++) = '0';
- *(out++) = 'x';
- __nolibc_fallthrough;
- default: /* 'x' and 'p' above */
+ } else {
+ /* Base 16 */
+ if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
+ *(out++) = '0';
+ *(out++) = 'x';
+ }
u64toh_r(v, out);
- break;
}
+
outstr = outbuf;
goto do_strlen_output;
}
@@ -438,6 +499,9 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
do_output:
written += len;
+ /* Stop gcc back-merging this code into one of the conditionals above. */
+ _NOLIBC_OPTIMIZER_HIDE_VAR(len);
+
width -= len;
while (width > 0) {
/* Output pad in 16 byte blocks with the small block first. */
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 61968fdfeec0..498d3125eb24 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -1833,11 +1833,19 @@ static int run_printf(int min, int max)
CASE_TEST(number); EXPECT_VFPRINTF(1, "1234", "%d", 1234); break;
CASE_TEST(negnumber); EXPECT_VFPRINTF(1, "-1234", "%d", -1234); break;
CASE_TEST(unsigned); EXPECT_VFPRINTF(1, "12345", "%u", 12345); break;
+ CASE_TEST(signed_max); EXPECT_VFPRINTF(1, "2147483647", "%i", ~0u >> 1); break;
+ CASE_TEST(signed_min); EXPECT_VFPRINTF(1, "-2147483648", "%i", (~0u >> 1) + 1); break;
+ CASE_TEST(unsigned_max); EXPECT_VFPRINTF(1, "4294967295", "%u", ~0u); break;
CASE_TEST(char); EXPECT_VFPRINTF(1, "c", "%c", 'c'); break;
- CASE_TEST(hex); EXPECT_VFPRINTF(1, "f", "%x", 0xf); break;
+ CASE_TEST(hex_nolibc); EXPECT_VFPRINTF(is_nolibc, "|f|d|", "|%x|%X|", 0xf, 0xd); break;
+ CASE_TEST(hex_libc); EXPECT_VFPRINTF(!is_nolibc, "|f|D|", "|%x|%X|", 0xf, 0xd); break;
CASE_TEST(pointer); EXPECT_VFPRINTF(1, "0x1", "%p", (void *) 0x1); break;
- CASE_TEST(uintmax_t); EXPECT_VFPRINTF(1, "18446744073709551615", "%ju", 0xffffffffffffffffULL); break;
- CASE_TEST(intmax_t); EXPECT_VFPRINTF(1, "-9223372036854775807", "%jd", 0x8000000000000001LL); break;
+ CASE_TEST(percent); EXPECT_VFPRINTF(1, "a%d42%69%", "a%%d%d%%%d%%", 42, 69); break;
+ CASE_TEST(perc_qual); EXPECT_VFPRINTF(1, "a%d2", "a%-14l%d%d", 2); break;
+ CASE_TEST(invalid); EXPECT_VFPRINTF(1, "a%12yx3%y42%P", "a%12yx%d%y%d%P", 3, 42); break;
+ CASE_TEST(intmax_max); EXPECT_VFPRINTF(1, "9223372036854775807", "%lld", ~0ULL >> 1); break;
+ CASE_TEST(intmax_min); EXPECT_VFPRINTF(1, "-9223372036854775808", "%Li", (~0ULL >> 1) + 1); break;
+ CASE_TEST(uintmax_max); EXPECT_VFPRINTF(1, "18446744073709551615", "%ju", ~0ULL); break;
CASE_TEST(truncation); EXPECT_VFPRINTF(1, "012345678901234567890123456789", "%s", "012345678901234567890123456789"); break;
CASE_TEST(string_width); EXPECT_VFPRINTF(1, " 1", "%10s", "1"); break;
CASE_TEST(number_width); EXPECT_VFPRINTF(1, " 1", "%10d", 1); break;
--
2.39.5
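
A note on the "%d"/"%i" path, again not part of the patch: the sequence
"v = -(signed_v + 1); v++;" exists so that the most negative value can be
negated without signed overflow. The sketch below shows the same arithmetic
in isolation. The helper names pf_magnitude() and pf_magnitude_demo() are
hypothetical, and the expected constants assume a 64-bit long long.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: return the magnitude of signed_v as an unsigned
 * value and report the sign separately, as the formatter does when it
 * emits '-' and then treats the number as unsigned.
 */
unsigned long long pf_magnitude(long long signed_v, int *negative)
{
	unsigned long long v = (unsigned long long)signed_v;

	*negative = signed_v < 0;
	if (signed_v < 0) {
		/* signed_v + 1 is at least LLONG_MIN + 1, so negation cannot overflow. */
		v = (unsigned long long)-(signed_v + 1);
		/* The final increment is unsigned arithmetic and always well defined. */
		v++;
	}
	return v;
}

void pf_magnitude_demo(void)
{
	int neg;

	assert(pf_magnitude(-1234, &neg) == 1234ULL && neg);
	/* Assumes a 64-bit long long. */
	assert(pf_magnitude(LLONG_MIN, &neg) == 9223372036854775808ULL && neg);
	assert(pf_magnitude(LLONG_MAX, &neg) == 9223372036854775807ULL && !neg);
}
```

Writing "-signed_v" directly would be undefined behaviour for LLONG_MIN,
which is the value the intmax_min selftest exercises.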