From: david.laight.linux@gmail.com
To: "Willy Tarreau" <w@1wt.eu>,
"Thomas Weißschuh" <linux@weissschuh.net>,
linux-kernel@vger.kernel.org, "Cheng Li" <lechain@gmail.com>
Cc: David Laight <david.laight.linux@gmail.com>
Subject: [PATCH 14/23] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
Date: Mon, 2 Mar 2026 10:18:06 +0000
Message-ID: <20260302101815.3043-15-david.laight.linux@gmail.com>
In-Reply-To: <20260302101815.3043-1-david.laight.linux@gmail.com>
From: David Laight <david.laight.linux@gmail.com>
Use flag bits (1u << (ch & 31)) for the flags, length modifiers, and
conversion specifiers.
This makes it easy to test for multiple values at once.
Detect the conversion flags " #+-0" although they are currently all ignored.
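
As a rough illustration of the flag-bit idea (not the code in this patch), the
sketch below tests a character against " #+-0" with one range check and one
mask test. CH_BIT, is_flag_char() and same_bit() are hypothetical names chosen
for the example; the patch itself uses the _NOLIBC_PF_* macros.

```c
/* Hypothetical names for illustration only. */
#define CH_BIT(ch) (1u << ((ch) & 0x1f))

/* Return the flag bit for ch if it is one of " #+-0", otherwise 0.
 * All five characters lie in the 0x20..0x3f block, so a single unsigned
 * range check plus a single mask test replaces five comparisons.
 */
static unsigned int is_flag_char(int ch)
{
	const unsigned int mask = CH_BIT(' ') | CH_BIT('#') | CH_BIT('+') |
				  CH_BIT('-') | CH_BIT('0');

	/* Reject anything outside the 32 character block first. */
	if ((unsigned int)ch - 0x20 > 0x1f)
		return 0;
	return CH_BIT(ch) & mask;
}

/* Characters 32 apart share a bit, for example '0' (0x30) and 'p' (0x70). */
static int same_bit(int a, int b)
{
	return CH_BIT(a) == CH_BIT(b);
}
```

Because only the low 5 bits select the bit, the range check is what stops 'p'
from being mistaken for '0'. Without it the mask test alone would accept both.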
Unconditionally generate the signed values (for %d) to remove a second
set of checks for the size.
Separate out the formatting of single characters from numbers.
Output the sign for negative values then negate and treat as unsigned.
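
A minimal sketch of the "sign, then negate and treat as unsigned" step,
assuming a two's complement long long. The helper name magnitude() and its
'neg' out-parameter are hypothetical; the patch does this inline.

```c
/* Hypothetical helper: return the magnitude of sv as an unsigned value
 * and set *neg to 1 if a '-' sign would need to be output.
 */
static unsigned long long magnitude(long long sv, int *neg)
{
	unsigned long long v = sv;

	*neg = sv < 0;
	if (sv < 0) {
		/* -(sv + 1) cannot overflow, even for the most negative
		 * value; the final increment is done on the unsigned value.
		 */
		v = -(sv + 1);
		v++;
	}
	return v;
}
```

Negating the signed value directly would be undefined for the most negative
long long, which is why the sketch adds one before negating and then
increments the unsigned result.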
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
Changes for v4:
- Move the support for length modifiers t, z, q, L and formats
i and X to the next patch.
- Convert ll to j (not q) since q isn't added until the next patch.
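
For reference, folding "ll" into 'j' could look roughly like the sketch below.
CH_BIT and parse_length() are hypothetical names for the example, and it
assumes (as the code comment in the patch states) that "ll" and "j" are both
64 bits.

```c
#define CH_BIT(ch) (1u << ((ch) & 0x1f))

/* Hypothetical helper: parse an optional length modifier at *fmt.
 * Returns the flag bit for 'l' or 'j' (0 if there is none) and advances
 * *fmt past the modifier. "ll" is reported as 'j'.
 */
static unsigned int parse_length(const char **fmt)
{
	const char *p = *fmt;
	unsigned int bit;

	if (*p != 'l' && *p != 'j')
		return 0;
	bit = CH_BIT(*p++);
	if (p[-1] == 'l' && *p == 'l') {
		/* Second 'l': consume it and switch to the 'j' bit. */
		p++;
		bit = CH_BIT('j');
	}
	*fmt = p;
	return bit;
}
```

With this shape the later size selection only needs to look for the 'l' and
'j' bits and never has to count how many 'l' characters were seen.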
Changes for v3:
- Patch 6 in v2.
- Move all the variable definitions to the top of the function.
The loop body is a bit long to hide definitions at its top.
- Avoid -Wtype-limits validating format characters.
- Include changes to the selftests.
Changes for v2:
- Use #defines to make the code a lot more readable.
- Include the changes from the old patch 10 that used masks for the
conversion specifiers.
- Detect all the valid flag characters even though they are not implemented.
- Support for left justifying field is moved to patch 7.
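
The #define approach mentioned above can be sketched in reduced form as
follows. This is a cut-down illustration with four slots rather than eight,
and CH_BIT, CH_BIT_NZ, CH_MASK4, CH_MASK and mask_has() are hypothetical
names, not the macros added by the patch.

```c
#define CH_BIT(ch)	(1u << ((ch) & 0x1f))
/* A padding 0 must contribute no bits: plain CH_BIT(0) would be bit 0,
 * which is also the bit for ' '.
 */
#define CH_BIT_NZ(ch)	((ch) ? CH_BIT(ch) : 0u)
#define CH_MASK4(a, b, c, d, ...) \
	(CH_BIT_NZ(a) | CH_BIT_NZ(b) | CH_BIT_NZ(c) | CH_BIT_NZ(d))
/* Accepts 1 to 4 characters; the trailing zeros fill the unused slots. */
#define CH_MASK(...)	CH_MASK4(__VA_ARGS__, 0, 0, 0, 0)

/* Non-zero if the bit for ch is set in mask. The caller is expected to
 * have checked that ch is in the right 32 character block.
 */
static unsigned int mask_has(unsigned int mask, int ch)
{
	return mask & CH_BIT(ch);
}
```

A call such as mask_has(CH_MASK('x', 'p'), ch) then reads much like the list
of characters being tested, and the whole mask folds to a compile-time
constant.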
tools/include/nolibc/stdio.h | 157 ++++++++++++++++++++++++-----------
1 file changed, 108 insertions(+), 49 deletions(-)
diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index 13fe6c4d7f58..ea1288d87eea 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -291,10 +291,14 @@ int fseek(FILE *stream, long offset, int whence)
}
-/* minimal printf(). It supports the following formats:
- * - %[l*]{d,u,c,x,p}
- * - %s
- * - unknown modifiers are ignored.
+/* printf(). Supports the following integer and string formats.
+ * - %[#-+ 0][width][{l,ll,j}]{c,d,u,x,p,s,m,%}
+ * - %% generates a single %
+ * - %m outputs strerror(errno).
+ * - The modifiers [#-+ 0] are currently ignored.
+ * - No support for precision or variable widths.
+ * - No support for floating point or wide characters.
+ * - Invalid formats are copied to the output buffer.
*
* Called by vfprintf() and snprintf() to do the actual formatting.
* The callers provide a callback function to save the formatted data.
@@ -305,15 +309,43 @@ int fseek(FILE *stream, long offset, int whence)
* - with (NULL, 0) at the end of the __nolibc_printf.
* If the callback returns non-zero __nolibc_printf() immediately returns -1.
*/
+
typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
+/* This code uses 'flag' variables that are indexed by the low 5 bits
+ * of characters to optimise checks for multiple characters.
+ *
+ * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b', ...)
+ * returns non-zero if the bit for any of the specified characters is set.
+ *
+ * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b', ...)
+ * returns the flag bit for ch if it is one of the specified characters.
+ * All the characters must be in the same 32 character block (non-alphabetic,
+ * upper case, or lower case) of the ASCII character set.
+ */
+#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
+#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
+#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
+ (_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
+ _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
+ _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
+ _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
+#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
+ ((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
+#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
+ ((unsigned int)(ch) - (cmp_1 & 0xe0) > 0x1f ? 0 : \
+ _NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))
+
static __attribute__((unused, format(printf, 3, 0)))
int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
{
- char lpref, ch;
+ char ch;
unsigned long long v;
+ long long signed_v;
int written, width, len;
+ unsigned int flags, ch_flag;
char outbuf[21];
+ char *out;
const char *outstr;
written = 0;
@@ -324,6 +356,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
break;
width = 0;
+ flags = 0;
if (ch != '%') {
while (*fmt && *fmt != '%')
fmt++;
@@ -334,7 +367,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
/* we're in a format sequence */
- ch = *fmt++;
+ /* Conversion flag characters */
+ while (1) {
+ ch = *fmt++;
+ ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
+ if (!ch_flag)
+ break;
+ flags |= ch_flag;
+ }
/* width */
while (ch >= '0' && ch <= '9') {
@@ -344,62 +384,78 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
ch = *fmt++;
}
- /* Length modifiers */
- if (ch == 'l') {
- lpref = 1;
- ch = *fmt++;
- if (ch == 'l') {
- lpref = 2;
- ch = *fmt++;
+ /* Length modifier.
+ * They miss the conversion flags characters " #+-0" so can go into flags.
+ * Change ll to j (both always 64bits).
+ */
+ ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'l', 'j');
+ if (ch_flag != 0) {
+ if (ch == 'l' && fmt[0] == 'l') {
+ fmt++;
+ ch_flag = _NOLIBC_PF_FLAG('j');
}
- } else if (ch == 'j') {
- /* intmax_t is long long */
- lpref = 2;
+ flags |= ch_flag;
ch = *fmt++;
- } else {
- lpref = 0;
}
- if (ch == 'c' || ch == 'd' || ch == 'u' || ch == 'x' || ch == 'p') {
- char *out = outbuf;
+ /* Conversion specifiers. */
- if (ch == 'p')
+ /* Numeric and pointer conversion specifiers.
+ *
+ * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
+ * so ch_flag can be used later.
+ */
+ ch_flag = _NOLIBC_PF_FLAG(ch);
+ if ((ch >= 'a' && ch <= 'z') &&
+ _NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'u', 'x', 'p')) {
+ /* 'long' is needed for pointer conversions and ltz lengths.
+ * A single test can be used provided 'p' (the same bit as '0')
+ * is masked from flags.
+ */
+ if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
+ 'p', 'l')) {
v = va_arg(args, unsigned long);
- else if (lpref) {
- if (lpref > 1)
- v = va_arg(args, unsigned long long);
- else
- v = va_arg(args, unsigned long);
- } else
+ signed_v = (long)v;
+ } else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j')) {
+ v = va_arg(args, unsigned long long);
+ signed_v = v;
+ } else {
v = va_arg(args, unsigned int);
+ signed_v = (int)v;
+ }
- if (ch == 'd') {
- /* sign-extend the value */
- if (lpref == 0)
- v = (long long)(int)v;
- else if (lpref == 1)
- v = (long long)(long)v;
+ if (ch == 'c') {
+ /* "%c" - single character. */
+ outbuf[0] = v;
+ len = 1;
+ outstr = outbuf;
+ goto do_output;
}
- switch (ch) {
- case 'c':
- out[0] = v;
- out[1] = 0;
- break;
- case 'd':
- i64toa_r(v, out);
- break;
- case 'u':
+ out = outbuf;
+
+ if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd')) {
+ /* "%d" and "%i" - signed decimal numbers. */
+ if (signed_v < 0) {
+ *out++ = '-';
+ v = -(signed_v + 1);
+ v++;
+ }
+ }
+
+ /* Convert the number to ascii in the required base. */
+ if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'u')) {
+ /* Base 10 */
u64toa_r(v, out);
- break;
- case 'p':
- *(out++) = '0';
- *(out++) = 'x';
- __nolibc_fallthrough;
- default: /* 'x' and 'p' above */
+ } else {
+ /* Base 16 */
+ if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
+ *(out++) = '0';
+ *(out++) = 'x';
+ }
u64toh_r(v, out);
- break;
}
+
outstr = outbuf;
goto do_strlen_output;
}
@@ -442,6 +498,9 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
do_output:
written += len;
+ /* Stop gcc back-merging this code into one of the conditionals above. */
+ _NOLIBC_OPTIMIZER_HIDE_VAR(len);
+
width -= len;
while (width > 0) {
/* Output pad in 16 byte blocks with the small block first. */
--
2.39.5