public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: david.laight.linux@gmail.com
To: "Willy Tarreau" <w@1wt.eu>,
	"Thomas Weißschuh" <linux@weissschuh.net>,
	linux-kernel@vger.kernel.org, "Cheng Li" <lechain@gmail.com>
Cc: David Laight <david.laight.linux@gmail.com>
Subject: [PATCH v3 next 11/17] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
Date: Mon, 23 Feb 2026 10:17:29 +0000	[thread overview]
Message-ID: <20260223101735.2922-12-david.laight.linux@gmail.com> (raw)
In-Reply-To: <20260223101735.2922-1-david.laight.linux@gmail.com>

From: David Laight <david.laight.linux@gmail.com>

Use flags bits (1u << (ch & 31)) for the flags, length modifiers, and
conversion specifiers.
This makes it easy to test for multiple values at once.

Detect the conversion flags " #+-0" although they are currently all ignored.

Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
(both long long).

Add support for "%i" (the same as %d") and "%X" (treated at "%x").

Unconditionally generate the signed values (for %d) to remove a second
set of checks for the size.

Separate out the formatting of single characters from numbers.
Output the sign for negative values then negate and treat as unsigned.

Change/add tests to use conversions i and X, and length modifiers L and ll.
Use the correct minimum value for "%Li".

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
                        
Changes for v3: 
- Patch 6 in v2.        
- Move all the variable definitions to the top of the function.
  The loop body is a bit long to hide definitions at its top.
- Avoid -Wtype-limits validating format characters.
- Include changes to the selftests.

Changes for v2:
- Use #defines to make the code a lot more readable.
- Include the changes from the old patch 10 that used masks for the
  conversion specifiers.
- Detect all the valid flag characters even though they are not implemented.
- Support for left justifying field is moved to patch 7.
 
 tools/include/nolibc/stdio.h                 | 162 +++++++++++++------
 tools/testing/selftests/nolibc/nolibc-test.c |  14 +-
 2 files changed, 124 insertions(+), 52 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index ae96b7bebbfe..6cb106367e3b 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -291,10 +291,15 @@ int fseek(FILE *stream, long offset, int whence)
 }
 
 
-/* minimal printf(). It supports the following formats:
- *  - %[l*]{d,u,c,x,p}
- *  - %s
- *  - unknown modifiers are ignored.
+/* printf(). Supports most of the normal integer and string formats.
+ *  - %[#-+ 0][width][{l,t,z,ll,L,j,q}]{c,d,i,u,x,X,p,s,m,%}
+ *  - %% generates a single %
+ *  - %m outputs strerror(errno).
+ *  - %X outputs a..f the same as %x.
+ *  - The modifiers [#-+ 0] are currently ignored.
+ *  - No support for precision or variable widths.
+ *  - No support for floating point or wide characters.
+ *  - Invalid formats are copied to the output buffer.
  *
  * Called by vfprintf() and snprintf() to do the actual formatting.
  * The callers provide a callback function to save the formatted data.
@@ -305,15 +310,43 @@ int fseek(FILE *stream, long offset, int whence)
  *  - with (NULL, 0) at the end of the __nolibc_printf.
  * If the callback returns non-zero __nolibc_printf() immediately returns -1.
  */
+
 typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
 
+/* This code uses 'flag' variables that are indexed by the low 6 bits
+ * of characters to optimise checks for multiple characters.
+ *
+ * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b'. ...)
+ * returns non-zero if the bit for any of the specified characters is set.
+ *
+ * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b'. ...)
+ * returns the flag bit for ch if it is one of the specified characters.
+ * All the characters must be in the same 32 character block (non-alphabetic,
+ * upper case, or lower case) of the ASCII character set.
+ */
+#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
+#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
+#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
+	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
+#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
+	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
+#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
+	((unsigned int)(ch) - (cmp_1 & 0xe0) > 0x1f ? 0 : \
+		_NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))
+
 static __attribute__((unused, format(printf, 3, 0)))
 int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
 {
-	char lpref, ch;
+	char ch;
 	unsigned long long v;
+	long long signed_v;
 	int written, width, len;
+	unsigned int flags, ch_flag;
 	char outbuf[21];
+	char *out;
 	const char *outstr;
 
 	written = 0;
@@ -324,6 +357,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			break;
 
 		width = 0;
+		flags = 0;
 		if (ch != '%') {
 			while (*fmt && *fmt != '%')
 				fmt++;
@@ -334,7 +368,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 
 		/* we're in a format sequence */
 
-		ch = *fmt++;
+		/* Conversion flag characters */
+		while (1) {
+			ch = *fmt++;
+			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
+			if (!ch_flag)
+				break;
+			flags |= ch_flag;
+		}
 
 		/* width */
 		while (ch >= '0' && ch <= '9') {
@@ -344,62 +385,82 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			ch = *fmt++;
 		}
 
-		/* Length modifiers */
-		if (ch == 'l') {
-			lpref = 1;
-			ch = *fmt++;
-			if (ch == 'l') {
-				lpref = 2;
-				ch = *fmt++;
+		/* Length modifier.
+		 * They miss the conversion flags characters " #+-0" so can go into flags.
+		 * Change both L and ll to q.
+		 */
+		if (ch == 'L')
+			ch = 'q';
+		ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'l', 't', 'z', 'j', 'q');
+		if (ch_flag != 0) {
+			if (ch == 'l' && fmt[0] == 'l') {
+				fmt++;
+				ch_flag = _NOLIBC_PF_FLAG('q');
 			}
-		} else if (ch == 'j') {
-			/* intmax_t is long long */
-			lpref = 2;
+			flags |= ch_flag;
 			ch = *fmt++;
-		} else {
-			lpref = 0;
 		}
 
-		if (ch == 'c' || ch == 'd' || ch == 'u' || ch == 'x' || ch == 'p') {
-			char *out = outbuf;
+		/* Conversion specifiers. */
 
-			if (ch == 'p')
+		/* Numeric and pointer conversion specifiers.
+		 *
+		 * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
+		 * so that 'X' can be allowed through.
+		 * 'X' gets treated and 'x' because _NOLIBC_PF_FLAG() returns the same
+		 * value for both.
+		 */
+		ch_flag = _NOLIBC_PF_FLAG(ch);
+		if (((ch >= 'a' && ch <= 'z') || ch == 'X') &&
+		    _NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p')) {
+			/* 'long' is needed for pointer conversions and ltz lengths.
+			 * A single test can be used provided 'p' (the same bit as '0')
+			 * is masked from flags.
+			 */
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
+						     'p', 'l', 't', 'z')) {
 				v = va_arg(args, unsigned long);
-			else if (lpref) {
-				if (lpref > 1)
-					v = va_arg(args, unsigned long long);
-				else
-					v = va_arg(args, unsigned long);
-			} else
+				signed_v = (long)v;
+			} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
+				v = va_arg(args, unsigned long long);
+				signed_v = v;
+			} else {
 				v = va_arg(args, unsigned int);
+				signed_v = (int)v;
+			}
 
-			if (ch == 'd') {
-				/* sign-extend the value */
-				if (lpref == 0)
-					v = (long long)(int)v;
-				else if (lpref == 1)
-					v = (long long)(long)v;
+			if (ch == 'c') {
+				/* "%c" - single character. */
+				outbuf[0] = v;
+				len = 1;
+				outstr = outbuf;
+				goto do_output;
 			}
 
-			switch (ch) {
-			case 'c':
-				out[0] = v;
-				out[1] = 0;
-				break;
-			case 'd':
-				i64toa_r(v, out);
-				break;
-			case 'u':
+			out = outbuf;
+
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i')) {
+				/* "%d" and "%i" - signed decimal numbers. */
+				if (signed_v < 0) {
+					*out++ = '-';
+					v = -(signed_v + 1);
+					v++;
+				}
+			}
+
+			/* Convert the number to ascii in the required base. */
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
+				/* Base 10 */
 				u64toa_r(v, out);
-				break;
-			case 'p':
-				*(out++) = '0';
-				*(out++) = 'x';
-				__nolibc_fallthrough;
-			default: /* 'x' and 'p' above */
+			} else {
+				/* Base 16 */
+				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
+					*(out++) = '0';
+					*(out++) = 'x';
+				}
 				u64toh_r(v, out);
-				break;
 			}
+
 			outstr = outbuf;
 			goto do_strlen_output;
 		}
@@ -438,6 +499,9 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 do_output:
 		written += len;
 
+		/* Stop gcc back-merging this code into one of the conditionals above. */
+		_NOLIBC_OPTIMIZER_HIDE_VAR(len);
+
 		width -= len;
 		while (width > 0) {
 			/* Output pad in 16 byte blocks with the small block first. */
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 61968fdfeec0..498d3125eb24 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -1833,11 +1833,19 @@ static int run_printf(int min, int max)
 		CASE_TEST(number);       EXPECT_VFPRINTF(1, "1234", "%d", 1234); break;
 		CASE_TEST(negnumber);    EXPECT_VFPRINTF(1, "-1234", "%d", -1234); break;
 		CASE_TEST(unsigned);     EXPECT_VFPRINTF(1, "12345", "%u", 12345); break;
+		CASE_TEST(signed_max);   EXPECT_VFPRINTF(1, "2147483647", "%i", ~0u >> 1); break;
+		CASE_TEST(signed_min);   EXPECT_VFPRINTF(1, "-2147483648", "%i", (~0u >> 1) + 1); break;
+		CASE_TEST(unsigned_max); EXPECT_VFPRINTF(1, "4294967295", "%u", ~0u); break;
 		CASE_TEST(char);         EXPECT_VFPRINTF(1, "c", "%c", 'c'); break;
-		CASE_TEST(hex);          EXPECT_VFPRINTF(1, "f", "%x", 0xf); break;
+		CASE_TEST(hex_nolibc);   EXPECT_VFPRINTF(is_nolibc, "|f|d|", "|%x|%X|", 0xf, 0xd); break;
+		CASE_TEST(hex_libc);     EXPECT_VFPRINTF(!is_nolibc, "|f|D|", "|%x|%X|", 0xf, 0xd); break;
 		CASE_TEST(pointer);      EXPECT_VFPRINTF(1, "0x1", "%p", (void *) 0x1); break;
-		CASE_TEST(uintmax_t);    EXPECT_VFPRINTF(1, "18446744073709551615", "%ju", 0xffffffffffffffffULL); break;
-		CASE_TEST(intmax_t);     EXPECT_VFPRINTF(1, "-9223372036854775807", "%jd", 0x8000000000000001LL); break;
+		CASE_TEST(percent);      EXPECT_VFPRINTF(1, "a%d42%69%", "a%%d%d%%%d%%", 42, 69); break;
+		CASE_TEST(perc_qual);    EXPECT_VFPRINTF(1, "a%d2", "a%-14l%d%d", 2); break;
+		CASE_TEST(invalid);      EXPECT_VFPRINTF(1, "a%12yx3%y42%P", "a%12yx%d%y%d%P", 3, 42); break;
+		CASE_TEST(intmax_max);   EXPECT_VFPRINTF(1, "9223372036854775807", "%lld", ~0ULL >> 1); break;
+		CASE_TEST(intmax_min);   EXPECT_VFPRINTF(1, "-9223372036854775808", "%Li", (~0ULL >> 1) + 1); break;
+		CASE_TEST(uintmax_max);  EXPECT_VFPRINTF(1, "18446744073709551615", "%ju", ~0ULL); break;
 		CASE_TEST(truncation);   EXPECT_VFPRINTF(1, "012345678901234567890123456789", "%s", "012345678901234567890123456789"); break;
 		CASE_TEST(string_width); EXPECT_VFPRINTF(1, "         1", "%10s", "1"); break;
 		CASE_TEST(number_width); EXPECT_VFPRINTF(1, "         1", "%10d", 1); break;
-- 
2.39.5


  parent reply	other threads:[~2026-02-23 10:48 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-23 10:17 [PATCH v3 next 00/17] Enhance printf() david.laight.linux
2026-02-23 10:17 ` [PATCH v3 next 01/17] tools/nolibc: Add _NOLIBC_OPTIMIZER_HIDE_VAR() to compiler.h david.laight.linux
2026-02-25 21:25   ` Thomas Weißschuh
2026-02-25 22:17     ` David Laight
2026-02-25 22:24       ` Thomas Weißschuh
2026-02-23 10:17 ` [PATCH v3 next 02/17] tools/nolibc: Optimise and common up the number to ascii functions david.laight.linux
2026-02-25 21:40   ` Thomas Weißschuh
2026-02-25 22:09     ` David Laight
2026-02-23 10:17 ` [PATCH v3 next 03/17] selftests/nolibc: Fix build with host headers and libc david.laight.linux
2026-02-25 21:24   ` Thomas Weißschuh
2026-02-23 10:17 ` [PATCH v3 next 04/17] selftests/nolibc: Improve reporting of vfprintf() errors david.laight.linux
2026-02-25 21:56   ` Thomas Weißschuh
2026-02-26 10:12     ` David Laight
2026-02-26 21:39       ` Thomas Weißschuh
2026-02-23 10:17 ` [PATCH v3 next 05/17] tools/nolibc: Implement strerror() in terms of strerror_r() david.laight.linux
2026-02-25 22:09   ` Thomas Weißschuh
2026-02-25 22:58     ` David Laight
2026-02-23 10:17 ` [PATCH v3 next 06/17] tools/nolibc/printf: Change variables 'c' to 'ch' and 'tmpbuf[]' to 'outbuf[]' david.laight.linux
2026-02-25 22:23   ` Thomas Weißschuh
2026-02-23 10:17 ` [PATCH v3 next 07/17] tools/nolibc/printf: Move snprintf length check to callback david.laight.linux
2026-02-25 22:37   ` Thomas Weißschuh
2026-02-25 23:12     ` David Laight
2026-02-26 21:29       ` Thomas Weißschuh
2026-02-26 22:11         ` David Laight
2026-02-23 10:17 ` [PATCH v3 next 08/17] tools/nolibc/printf: Output pad characters in 16 byte chunks david.laight.linux
2026-02-23 10:17 ` [PATCH v3 next 09/17] tools/nolibc/printf: Simplify __nolibc_printf() david.laight.linux
2026-02-23 10:17 ` [PATCH v3 next 10/17] tools/nolibc/printf: Use goto and reduce indentation david.laight.linux
2026-02-23 10:17 ` david.laight.linux [this message]
2026-02-23 10:17 ` [PATCH v3 next 12/17] tools/nolibc/printf: Handle "%s" with the numeric formats david.laight.linux
2026-02-23 10:17 ` [PATCH v3 next 13/17] tools/nolibc/printf: Add support for conversion flags david.laight.linux
2026-02-23 10:17 ` [PATCH v3 next 14/17] tools/nolibc/printf: Add support for left aligning fields david.laight.linux
2026-02-23 10:17 ` [PATCH v3 next 15/17] tools/nolibc/printf: Add support for zero padding and field precision david.laight.linux
2026-02-23 10:17 ` [PATCH v3 next 16/17] tools/nolibc/printf: Add support for octal output david.laight.linux
2026-02-23 10:17 ` [PATCH v3 next 17/17] selftests/nolibc: Use printf variable field widths and precisions david.laight.linux

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260223101735.2922-12-david.laight.linux@gmail.com \
    --to=david.laight.linux@gmail.com \
    --cc=lechain@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@weissschuh.net \
    --cc=w@1wt.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox