From: david.laight.linux@gmail.com
To: "Willy Tarreau" <w@1wt.eu>,
	"Thomas Weißschuh" <linux@weissschuh.net>,
	linux-kernel@vger.kernel.org, "Cheng Li" <lechain@gmail.com>
Cc: David Laight <david.laight.linux@gmail.com>
Subject: [PATCH 14/23] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
Date: Mon,  2 Mar 2026 10:18:06 +0000
Message-ID: <20260302101815.3043-15-david.laight.linux@gmail.com>
In-Reply-To: <20260302101815.3043-1-david.laight.linux@gmail.com>

From: David Laight <david.laight.linux@gmail.com>

Use flag bits (1u << (ch & 31)) for the flags, length modifiers, and
conversion specifiers.
This makes it easy to test for multiple values at once.
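
As a rough illustration of the idea (not the code in this patch): each
character selects bit (ch & 31) of a 32-bit mask, so a single AND tests
for several characters at once. PF_BIT, PF_CONV_MASK and pf_is_conv are
hypothetical names used only for this sketch. Characters that are 32
apart share a bit, so a range check is still needed.

```c
/* Hypothetical helper: bit for a character, indexed by its low 5 bits. */
#define PF_BIT(ch) (1u << ((ch) & 31))

/* Mask of the conversion characters handled by this sketch. */
#define PF_CONV_MASK \
	(PF_BIT('c') | PF_BIT('d') | PF_BIT('u') | PF_BIT('x') | PF_BIT('p'))

/* Returns non-zero if ch is one of c, d, u, x, p.
 * 'C' (0x43) and '#' (0x23) share bit 3 with 'c' (0x63), so the
 * lower-case range check is needed as well as the mask test.
 */
static unsigned int pf_is_conv(char ch)
{
	if (ch < 'a' || ch > 'z')
		return 0;
	return PF_BIT(ch) & PF_CONV_MASK;
}
```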

Detect the conversion flags " #+-0" although they are currently all ignored.
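
A minimal sketch of how the flag characters could be collected into a
mask, assuming the same bit layout. pf_parse_flags and PF_FLAG_CHARS are
hypothetical names; the patch itself uses _NOLIBC_PF_CHAR_IS_ONE_OF()
inside the main loop.

```c
/* Hypothetical helper: bit for a character, indexed by its low 5 bits. */
#define PF_BIT(ch) (1u << ((ch) & 31))

#define PF_FLAG_CHARS \
	(PF_BIT(' ') | PF_BIT('#') | PF_BIT('+') | PF_BIT('-') | PF_BIT('0'))

/* Consume leading flag characters from *fmt and return the mask of
 * those seen. *fmt is left pointing at the first non-flag character.
 * All five flag characters lie in 0x20..0x3f so one range check is
 * enough to exclude letters that share the same bits.
 */
static unsigned int pf_parse_flags(const char **fmt)
{
	unsigned int flags = 0;

	for (;;) {
		unsigned int ch = (unsigned char)**fmt;
		unsigned int bit;

		if (ch - 0x20 > 0x1f)
			break;
		bit = PF_BIT(ch) & PF_FLAG_CHARS;
		if (!bit)
			break;
		flags |= bit;
		(*fmt)++;
	}
	return flags;
}
```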

Unconditionally generate the signed values (for %d) to remove a second
set of checks for the size.
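
Roughly, the argument is fetched once as unsigned and the sign-extended
value is saved at the same point, so the later %d code needs no second
size test. The sketch below assumes hypothetical names (pf_fetch,
struct pf_val, PF_LEN_*) and, like the patch, reads signed arguments
through the corresponding unsigned type.

```c
#include <stdarg.h>

/* Hypothetical length selectors for this sketch only. */
enum { PF_LEN_INT, PF_LEN_LONG, PF_LEN_LLONG };

struct pf_val {
	unsigned long long u;	/* value for %u, %x */
	long long s;		/* sign-extended value for %d */
};

/* Fetch one integer argument of the requested size and produce both
 * the unsigned and the sign-extended interpretation in one place.
 */
static struct pf_val pf_fetch(int len, ...)
{
	struct pf_val r;
	va_list args;

	va_start(args, len);
	if (len == PF_LEN_LONG) {
		r.u = va_arg(args, unsigned long);
		r.s = (long)r.u;
	} else if (len == PF_LEN_LLONG) {
		r.u = va_arg(args, unsigned long long);
		r.s = (long long)r.u;
	} else {
		r.u = va_arg(args, unsigned int);
		r.s = (int)r.u;
	}
	va_end(args);
	return r;
}
```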

Separate out the formatting of single characters from numbers.
Output the sign for negative values then negate and treat as unsigned.
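
The negation is done as -(signed_v + 1) followed by an unsigned
increment so that the most negative value does not overflow. A
stand-alone sketch of that step, with pf_split_sign as a hypothetical
name:

```c
/* Store the magnitude of val in *mag and return non-zero if val was
 * negative. -(val + 1) cannot overflow, even for the most negative
 * long long; the final increment is done on the unsigned value.
 */
static int pf_split_sign(long long val, unsigned long long *mag)
{
	if (val >= 0) {
		*mag = val;
		return 0;
	}
	*mag = -(val + 1);
	*mag += 1;
	return 1;
}
```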

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v4:
- Move the support for length modifiers t, j, q, L and formats
  i and X to the next patch.
- Convert ll to j (not q) since q isn't added until the next patch.
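
For reference, the ll to j folding can be sketched as below. The helper
name pf_parse_length is hypothetical; the patch does this inline in
__nolibc_printf().

```c
/* Hypothetical helper: bit for a character, indexed by its low 5 bits. */
#define PF_BIT(ch) (1u << ((ch) & 31))

/* Parse an optional length modifier (l, ll or j) at *fmt.
 * Returns the flag bit to record, with "ll" reported as 'j', or 0 if
 * there is no modifier. *fmt is advanced past whatever was consumed.
 */
static unsigned int pf_parse_length(const char **fmt)
{
	const char *p = *fmt;
	unsigned int bit;

	if (*p != 'l' && *p != 'j')
		return 0;
	bit = PF_BIT(*p);
	if (p[0] == 'l' && p[1] == 'l') {
		p++;
		bit = PF_BIT('j');
	}
	*fmt = p + 1;
	return bit;
}
```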

Changes for v3:
- Patch 6 in v2.
- Move all the variable definitions to the top of the function.
  The loop body is a bit long to hide definitions at its top.
- Avoid -Wtype-limits validating format characters.
- Include changes to the selftests.

Changes for v2:
- Use #defines to make the code a lot more readable.
- Include the changes from the old patch 10 that used masks for the
  conversion specifiers.
- Detect all the valid flag characters even though they are not implemented.
- Support for left-justifying fields is moved to patch 7.

 tools/include/nolibc/stdio.h | 157 ++++++++++++++++++++++++-----------
 1 file changed, 108 insertions(+), 49 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index 13fe6c4d7f58..ea1288d87eea 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -291,10 +291,14 @@ int fseek(FILE *stream, long offset, int whence)
 }
 
 
-/* minimal printf(). It supports the following formats:
- *  - %[l*]{d,u,c,x,p}
- *  - %s
- *  - unknown modifiers are ignored.
+/* printf(). Supports the following integer and string formats.
+ *  - %[#-+ 0][width][{l,ll,j}]{c,d,u,x,p,s,m,%}
+ *  - %% generates a single %
+ *  - %m outputs strerror(errno).
+ *  - The modifiers [#-+ 0] are currently ignored.
+ *  - No support for precision or variable widths.
+ *  - No support for floating point or wide characters.
+ *  - Invalid formats are copied to the output buffer.
  *
  * Called by vfprintf() and snprintf() to do the actual formatting.
  * The callers provide a callback function to save the formatted data.
@@ -305,15 +309,43 @@ int fseek(FILE *stream, long offset, int whence)
  *  - with (NULL, 0) at the end of the __nolibc_printf.
  * If the callback returns non-zero __nolibc_printf() immediately returns -1.
  */
+
 typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
 
+/* This code uses 'flag' variables that are indexed by the low 5 bits
+ * of characters to optimise checks for multiple characters.
+ *
+ * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b', ...)
+ * returns non-zero if the bit for any of the specified characters is set.
+ *
+ * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b', ...)
+ * returns the flag bit for ch if it is one of the specified characters.
+ * All the characters must be in the same 32 character block (non-alphabetic,
+ * upper case, or lower case) of the ASCII character set.
+ */
+#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
+#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
+#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
+	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
+#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
+	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
+#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
+	((unsigned int)(ch) - (cmp_1 & 0xe0) > 0x1f ? 0 : \
+		_NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))
+
 static __attribute__((unused, format(printf, 3, 0)))
 int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
 {
-	char lpref, ch;
+	char ch;
 	unsigned long long v;
+	long long signed_v;
 	int written, width, len;
+	unsigned int flags, ch_flag;
 	char outbuf[21];
+	char *out;
 	const char *outstr;
 
 	written = 0;
@@ -324,6 +356,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			break;
 
 		width = 0;
+		flags = 0;
 		if (ch != '%') {
 			while (*fmt && *fmt != '%')
 				fmt++;
@@ -334,7 +367,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 
 		/* we're in a format sequence */
 
-		ch = *fmt++;
+		/* Conversion flag characters */
+		while (1) {
+			ch = *fmt++;
+			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
+			if (!ch_flag)
+				break;
+			flags |= ch_flag;
+		}
 
 		/* width */
 		while (ch >= '0' && ch <= '9') {
@@ -344,62 +384,78 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			ch = *fmt++;
 		}
 
-		/* Length modifiers */
-		if (ch == 'l') {
-			lpref = 1;
-			ch = *fmt++;
-			if (ch == 'l') {
-				lpref = 2;
-				ch = *fmt++;
+		/* Length modifier.
+		 * Their bits do not clash with the flag characters " #+-0" so they can go into flags.
+		 * Change ll to j (both always 64bits).
+		 */
+		ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'l', 'j');
+		if (ch_flag != 0) {
+			if (ch == 'l' && fmt[0] == 'l') {
+				fmt++;
+				ch_flag = _NOLIBC_PF_FLAG('j');
 			}
-		} else if (ch == 'j') {
-			/* intmax_t is long long */
-			lpref = 2;
+			flags |= ch_flag;
 			ch = *fmt++;
-		} else {
-			lpref = 0;
 		}
 
-		if (ch == 'c' || ch == 'd' || ch == 'u' || ch == 'x' || ch == 'p') {
-			char *out = outbuf;
+		/* Conversion specifiers. */
 
-			if (ch == 'p')
+		/* Numeric and pointer conversion specifiers.
+		 *
+		 * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
+		 * so ch_flag can be used later.
+		 */
+		ch_flag = _NOLIBC_PF_FLAG(ch);
+		if ((ch >= 'a' && ch <= 'z') &&
+		    _NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'u', 'x', 'p')) {
+			/* 'long' is needed for pointer conversions and ltz lengths.
+			 * A single test can be used provided 'p' (the same bit as '0')
+			 * is masked from flags.
+			 */
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
+						     'p', 'l')) {
 				v = va_arg(args, unsigned long);
-			else if (lpref) {
-				if (lpref > 1)
-					v = va_arg(args, unsigned long long);
-				else
-					v = va_arg(args, unsigned long);
-			} else
+				signed_v = (long)v;
+			} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j')) {
+				v = va_arg(args, unsigned long long);
+				signed_v = v;
+			} else {
 				v = va_arg(args, unsigned int);
+				signed_v = (int)v;
+			}
 
-			if (ch == 'd') {
-				/* sign-extend the value */
-				if (lpref == 0)
-					v = (long long)(int)v;
-				else if (lpref == 1)
-					v = (long long)(long)v;
+			if (ch == 'c') {
+				/* "%c" - single character. */
+				outbuf[0] = v;
+				len = 1;
+				outstr = outbuf;
+				goto do_output;
 			}
 
-			switch (ch) {
-			case 'c':
-				out[0] = v;
-				out[1] = 0;
-				break;
-			case 'd':
-				i64toa_r(v, out);
-				break;
-			case 'u':
+			out = outbuf;
+
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd')) {
+				/* "%d" and "%i" - signed decimal numbers. */
+				if (signed_v < 0) {
+					*out++ = '-';
+					v = -(signed_v + 1);
+					v++;
+				}
+			}
+
+			/* Convert the number to ascii in the required base. */
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'u')) {
+				/* Base 10 */
 				u64toa_r(v, out);
-				break;
-			case 'p':
-				*(out++) = '0';
-				*(out++) = 'x';
-				__nolibc_fallthrough;
-			default: /* 'x' and 'p' above */
+			} else {
+				/* Base 16 */
+				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
+					*(out++) = '0';
+					*(out++) = 'x';
+				}
 				u64toh_r(v, out);
-				break;
 			}
+
 			outstr = outbuf;
 			goto do_strlen_output;
 		}
@@ -442,6 +498,9 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 do_output:
 		written += len;
 
+		/* Stop gcc back-merging this code into one of the conditionals above. */
+		_NOLIBC_OPTIMIZER_HIDE_VAR(len);
+
 		width -= len;
 		while (width > 0) {
 			/* Output pad in 16 byte blocks with the small block first. */
-- 
2.39.5
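
Note on the "'p' (the same bit as '0')" comment in the hunk above: 'p'
is 0x70 and '0' is 0x30, so both map to bit 16. A small sketch of the
combined test, assuming the same bit layout and using the hypothetical
names PF_BIT and pf_wants_long:

```c
/* Hypothetical helper: bit for a character, indexed by its low 5 bits. */
#define PF_BIT(ch) (1u << ((ch) & 31))

/* Returns non-zero when the argument should be fetched as 'long':
 * either the conversion is 'p' or the 'l' length modifier was given.
 * flags holds flag-character and length-modifier bits, conv is one of
 * c, d, u, x, p. The '0' flag shares bit 16 with 'p', so that bit is
 * cleared from flags before the two masks are merged.
 */
static unsigned int pf_wants_long(unsigned int flags, char conv)
{
	unsigned int bits = PF_BIT(conv) | (flags & ~PF_BIT('p'));

	return bits & (PF_BIT('p') | PF_BIT('l'));
}
```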

