public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 next 00/11] tools/nolibc: Enhance printf()
@ 2026-02-06 19:11 david.laight.linux
  2026-02-06 19:11 ` [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch' david.laight.linux
                   ` (11 more replies)
  0 siblings, 12 replies; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Update printf() so that it handles almost all the non-fp formats.
In particular:
- Left alignment.
- Zero padding.
- Field precision.
- Variable field width and precision.
- Width modifiers q, L, t and z.
- Conversion specifiers i and X (X generates lower case).
About the only things that are missing are octal and floating point.

The tests are updated to match.

There is a slight increase in code size, but it is minimalised
by the the heavy use of bit-pattern matches.

vfprintf() is modified to buffer data for each call and do a single
write system call at the end.
This improves performance somewhat, but is actually most useful when
using strace.

Final bloat-o-meter output for nolibc-test using gcc 12.2 on x86-64:
add/remove: 2/2 grow/shrink: 8/2 up/down: 2189/-320 (1869)
Function                                     old     new   delta
run_printf                                  1688    3121   +1433
utoa_r                                         -     144    +144
u64toa_r                                       -     144    +144
__nolibc_fprintf_cb                           58     202    +144
expect_vfprintf                              362     435     +73
__nolibc_printf                             1081    1148     +67
prepare                                      600     657     +57
__nolibc_sprintf_cb                           58     101     +43
printf                                       199     241     +42
dprintf.constprop                            215     257     +42
snprintf.constprop                           229     204     -25
result                                       157     107     -50
puts.isra                                    101       -    -101
utoa_r.isra                                  144       -    -144
Total: Before=32966, After=34835, chg +5.67%

u64toa_r() was previously inlined into __nolibc_printf() so
__nolibc_printf() actually increases by about 200 bytes.

The largest increases come from supporting variable field widths ("%*d")
(mostly the extra va_arg() call), the subtle difference between zero pad
("%06d") and field precision ("%6.6d"), and the special rules for zero.

The 50 bytes saved from result() are from changing it to use "%.*s".
This offsets most of the cost of supporting variable widths.
(It is also removes the only call to puts().)

Changes for v2:
Mostly changes to improve the readability of the code.
- New patch #1 inserted to rename the variable 'c' to 'ch'.
- Use #define 'magic' for the bit-masks that check multiple characters.
  The check for the conversion flag characters is then based on:
	ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
- Re-order the changes so that the old patch 10 (Use bit-pattern for
  integral formats) is done at the same time as bit-masks are used for
  the flags characters and length modifiers.
  This means the restructuring changes are done before new features are
  added.
- Put all the changes to the selftest together at the end.
  There is one extra test for ("%#01x", 0x1234) (should be "0x1234")
  which is problematic because once you've removed the length of the "0x"
  from the field width there are -1 character postions for the digits.

David Laight (11):
  tools/nolibc/printf: Change variable used for format chars from 'c' to
    'ch'
  tools/nolibc/printf: Move snprintf length check to callback
  tools/nolibc/printf: Add buffering to vfprintf() callback.
  tools/nolibc/printf: Output pad characters in 16 byte chunks
  tools/nolibc/printf: Simplify __nolibc_printf()
  tools/nolibc/printf: Use bit-masks to hold requested flag, length and
    conversion chars
  tools/nolibc/printf: Add support for conversion flags "#- +" and
    format "%X"
  tools/nolibc/printf: Add support for zero padding and field precision
  selftests/nolibc: Improve reporting of vfprintf() errors
  selftests/nolibc: Increase coverage of printf format tests
  selftests/nolibc: Use printf("%.*s", n, "") to align test output

 tools/include/nolibc/stdio.h                 | 445 ++++++++++++++-----
 tools/testing/selftests/nolibc/nolibc-test.c |  97 ++--
 2 files changed, 386 insertions(+), 156 deletions(-)

-- 
2.39.5


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch'
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-07 18:51   ` Willy Tarreau
  2026-02-16 18:52   ` Thomas Weißschuh
  2026-02-06 19:11 ` [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback david.laight.linux
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

This makes the code slightly easier to read because the variable
stands out from the single character literals (especially 'c').

The following patches pretty much rewrite the function so the
churn is limited.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

New patch for v2.

 tools/include/nolibc/stdio.h | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index 1f16dab2ac88..f162cc697a73 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -250,7 +250,7 @@ typedef int (*__nolibc_printf_cb)(intptr_t state, const char *buf, size_t size);
 static __attribute__((unused, format(printf, 4, 0)))
 int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char *fmt, va_list args)
 {
-	char escape, lpref, c;
+	char escape, lpref, ch;
 	unsigned long long v;
 	unsigned int written, width;
 	size_t len, ofs, w;
@@ -259,7 +259,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 
 	written = ofs = escape = lpref = 0;
 	while (1) {
-		c = fmt[ofs++];
+		ch = fmt[ofs++];
 		width = 0;
 
 		if (escape) {
@@ -267,17 +267,17 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 			escape = 0;
 
 			/* width */
-			while (c >= '0' && c <= '9') {
+			while (ch >= '0' && ch <= '9') {
 				width *= 10;
-				width += c - '0';
+				width += ch - '0';
 
-				c = fmt[ofs++];
+				ch = fmt[ofs++];
 			}
 
-			if (c == 'c' || c == 'd' || c == 'u' || c == 'x' || c == 'p') {
+			if (ch == 'c' || ch == 'd' || ch == 'u' || ch == 'x' || ch == 'p') {
 				char *out = tmpbuf;
 
-				if (c == 'p')
+				if (ch == 'p')
 					v = va_arg(args, unsigned long);
 				else if (lpref) {
 					if (lpref > 1)
@@ -287,7 +287,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 				} else
 					v = va_arg(args, unsigned int);
 
-				if (c == 'd') {
+				if (ch == 'd') {
 					/* sign-extend the value */
 					if (lpref == 0)
 						v = (long long)(int)v;
@@ -295,7 +295,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 						v = (long long)(long)v;
 				}
 
-				switch (c) {
+				switch (ch) {
 				case 'c':
 					out[0] = v;
 					out[1] = 0;
@@ -316,28 +316,28 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 				}
 				outstr = tmpbuf;
 			}
-			else if (c == 's') {
+			else if (ch == 's') {
 				outstr = va_arg(args, char *);
 				if (!outstr)
 					outstr="(null)";
 			}
-			else if (c == 'm') {
+			else if (ch == 'm') {
 #ifdef NOLIBC_IGNORE_ERRNO
 				outstr = "unknown error";
 #else
 				outstr = strerror(errno);
 #endif /* NOLIBC_IGNORE_ERRNO */
 			}
-			else if (c == '%') {
+			else if (ch == '%') {
 				/* queue it verbatim */
 				continue;
 			}
 			else {
 				/* modifiers or final 0 */
-				if (c == 'l') {
+				if (ch == 'l') {
 					/* long format prefix, maintain the escape */
 					lpref++;
-				} else if (c == 'j') {
+				} else if (ch == 'j') {
 					lpref = 2;
 				}
 				escape = 1;
@@ -348,7 +348,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 		}
 
 		/* not an escape sequence */
-		if (c == 0 || c == '%') {
+		if (ch == 0 || ch == '%') {
 			/* flush pending data on escape or end */
 			escape = 1;
 			lpref = 0;
@@ -369,7 +369,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 
 			written += len;
 		do_escape:
-			if (c == 0)
+			if (ch == 0)
 				break;
 			fmt += ofs;
 			ofs = 0;
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
  2026-02-06 19:11 ` [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch' david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-07 19:12   ` Willy Tarreau
  2026-02-06 19:11 ` [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback david.laight.linux
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Move output truncation to the snprintf() callback.
This simplifies the main code and ensures the truncation will be
correct when left-alignment is added.

Add a zero length callback to 'finalise' the buffer rather than
doing it in snprintf() itself.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
- Formally patch 1
- Add comments about the final callback.

 tools/include/nolibc/stdio.h | 68 ++++++++++++++++++++++--------------
 1 file changed, 41 insertions(+), 27 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index f162cc697a73..36733ecd4261 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -245,15 +245,15 @@ char *fgets(char *s, int size, FILE *stream)
  *  - %s
  *  - unknown modifiers are ignored.
  */
-typedef int (*__nolibc_printf_cb)(intptr_t state, const char *buf, size_t size);
+typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
 
-static __attribute__((unused, format(printf, 4, 0)))
-int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char *fmt, va_list args)
+static __attribute__((unused, format(printf, 3, 0)))
+int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
 {
 	char escape, lpref, ch;
 	unsigned long long v;
 	unsigned int written, width;
-	size_t len, ofs, w;
+	size_t len, ofs;
 	char tmpbuf[21];
 	const char *outstr;
 
@@ -355,17 +355,13 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 			outstr = fmt;
 			len = ofs - 1;
 		flush_str:
-			if (n) {
-				w = len < n ? len : n;
-				n -= w;
-				while (width-- > w) {
-					if (cb(state, " ", 1) != 0)
-						return -1;
-					written += 1;
-				}
-				if (cb(state, outstr, w) != 0)
+			while (width-- > len) {
+				if (cb(state, " ", 1) != 0)
 					return -1;
+				written += 1;
 			}
+			if (cb(state, outstr, len) != 0)
+				return -1;
 
 			written += len;
 		do_escape:
@@ -378,18 +374,23 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
 
 		/* literal char, just queue it */
 	}
+
+	/* Flush/terminate any buffer. */
+	if (cb(state, NULL, 0) != 0)
+		return -1;
+
 	return written;
 }
 
-static int __nolibc_fprintf_cb(intptr_t state, const char *buf, size_t size)
+static int __nolibc_fprintf_cb(void *stream, const char *buf, size_t size)
 {
-	return _fwrite(buf, size, (FILE *)state);
+	return size ? _fwrite(buf, size, stream) : 0;
 }
 
 static __attribute__((unused, format(printf, 2, 0)))
 int vfprintf(FILE *stream, const char *fmt, va_list args)
 {
-	return __nolibc_printf(__nolibc_fprintf_cb, (intptr_t)stream, SIZE_MAX, fmt, args);
+	return __nolibc_printf(__nolibc_fprintf_cb, stream, fmt, args);
 }
 
 static __attribute__((unused, format(printf, 1, 0)))
@@ -447,26 +448,39 @@ int dprintf(int fd, const char *fmt, ...)
 	return ret;
 }
 
-static int __nolibc_sprintf_cb(intptr_t _state, const char *buf, size_t size)
+struct __nolibc_sprintf_cb_state {
+	char *buf;
+	size_t size;
+};
+
+static int __nolibc_sprintf_cb(void *v_state, const char *buf, size_t size)
 {
-	char **state = (char **)_state;
+	struct __nolibc_sprintf_cb_state *state = v_state;
+	char *tgt;
 
-	memcpy(*state, buf, size);
-	*state += size;
+	if (size >= state->size) {
+		if (state->size <= 1)
+			return 0;
+		size = state->size - 1;
+	}
+	tgt = state->buf;
+	if (size) {
+		state->size -= size;
+		state->buf = tgt + size;
+		memcpy(tgt, buf, size);
+	} else {
+		/* In particular from cb(NULL, 0) at the end of __nolibc_printf(). */
+		*tgt = '\0';
+	}
 	return 0;
 }
 
 static __attribute__((unused, format(printf, 3, 0)))
 int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
 {
-	char *state = buf;
-	int ret;
+	struct __nolibc_sprintf_cb_state state = { .buf = buf, .size = size };
 
-	ret = __nolibc_printf(__nolibc_sprintf_cb, (intptr_t)&state, size, fmt, args);
-	if (ret < 0)
-		return ret;
-	buf[(size_t)ret < size ? (size_t)ret : size - 1] = '\0';
-	return ret;
+	return __nolibc_printf(__nolibc_sprintf_cb, &state, fmt, args);
 }
 
 static __attribute__((unused, format(printf, 3, 4)))
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback.
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
  2026-02-06 19:11 ` [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch' david.laight.linux
  2026-02-06 19:11 ` [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-07 19:29   ` Willy Tarreau
  2026-02-06 19:11 ` [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks david.laight.linux
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Add per-call buffering to the vprintf() callback.
While this adds some extra code it will speed things up and
makes a massive difference to anyone looking at strace output.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
Formally patch 2, unchanged.

 tools/include/nolibc/stdio.h | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index 36733ecd4261..552f09d51d82 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -382,15 +382,41 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 	return written;
 }
 
-static int __nolibc_fprintf_cb(void *stream, const char *buf, size_t size)
+struct __nolibc_fprintf_cb_state {
+	FILE *stream;
+	unsigned int buf_offset;
+	char buf[128];
+};
+
+static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
 {
-	return size ? _fwrite(buf, size, stream) : 0;
+	struct __nolibc_fprintf_cb_state *state = v_state;
+	unsigned int off = state->buf_offset;
+
+	if (off + size > sizeof(state->buf) || buf == NULL) {
+		state->buf_offset = 0;
+		if (off && _fwrite(state->buf, off, state->stream))
+			return -1;
+		if (size > sizeof(state->buf))
+			return _fwrite(buf, size, state->stream);
+		off = 0;
+	}
+
+	if (size) {
+		state->buf_offset = off + size;
+		memcpy(state->buf + off, buf, size);
+	}
+	return 0;
 }
 
 static __attribute__((unused, format(printf, 2, 0)))
 int vfprintf(FILE *stream, const char *fmt, va_list args)
 {
-	return __nolibc_printf(__nolibc_fprintf_cb, stream, fmt, args);
+	struct __nolibc_fprintf_cb_state state;
+
+	state.stream = stream;
+	state.buf_offset = 0;
+	return __nolibc_printf(__nolibc_fprintf_cb, &state, fmt, args);
 }
 
 static __attribute__((unused, format(printf, 1, 0)))
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (2 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-07 19:38   ` Willy Tarreau
  2026-02-16 19:30   ` Thomas Weißschuh
  2026-02-06 19:11 ` [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf() david.laight.linux
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Simple to do and saves calls to the callback function.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
Formally patch 3, unchanged.

 tools/include/nolibc/stdio.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index 552f09d51d82..c044a6b3babe 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -355,10 +355,12 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			outstr = fmt;
 			len = ofs - 1;
 		flush_str:
-			while (width-- > len) {
-				if (cb(state, " ", 1) != 0)
+			while (width > len) {
+				unsigned int pad_len = ((width - len - 1) & 15) + 1;
+				width -= pad_len;
+				written += pad_len;
+				if (cb(state, "                ", pad_len) != 0)
 					return -1;
-				written += 1;
 			}
 			if (cb(state, outstr, len) != 0)
 				return -1;
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf()
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (3 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-07 20:05   ` Willy Tarreau
  2026-02-06 19:11 ` [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars david.laight.linux
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Move the check for the length modifiers into the format processing
between the field width and conversion specifier.
This lets the loop be simplified and a 'fast scan' for a format start
used.

If an error is detected (eg an invalid conversion specifier) then
copy the invalid format to the output buffer.

Reduces code size by about 10% on x86-64.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Unchanged from v1.

 tools/include/nolibc/stdio.h | 100 ++++++++++++++++++-----------------
 1 file changed, 51 insertions(+), 49 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index c044a6b3babe..bb54f488c228 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -250,28 +250,52 @@ typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
 static __attribute__((unused, format(printf, 3, 0)))
 int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
 {
-	char escape, lpref, ch;
+	char lpref, ch;
 	unsigned long long v;
 	unsigned int written, width;
-	size_t len, ofs;
+	size_t len;
 	char tmpbuf[21];
 	const char *outstr;
 
-	written = ofs = escape = lpref = 0;
+	written = 0;
 	while (1) {
-		ch = fmt[ofs++];
+		outstr = fmt;
+		ch = *fmt++;
+		if (!ch)
+			break;
+
 		width = 0;
+		if (ch != '%') {
+			while (*fmt && *fmt != '%')
+				fmt++;
+			len = fmt - outstr;
+		} else {
+			/* we're in a format sequence */
 
-		if (escape) {
-			/* we're in an escape sequence, ofs == 1 */
-			escape = 0;
+			ch = *fmt++;
 
 			/* width */
 			while (ch >= '0' && ch <= '9') {
 				width *= 10;
 				width += ch - '0';
 
-				ch = fmt[ofs++];
+				ch = *fmt++;
+			}
+
+			/* Length modifiers */
+			if (ch == 'l') {
+				lpref = 1;
+				ch = *fmt++;
+				if (ch == 'l') {
+					lpref = 2;
+					ch = *fmt++;
+				}
+			} else if (ch == 'j') {
+				/* intmax_t is long long */
+				lpref = 2;
+				ch = *fmt++;
+			} else {
+				lpref = 0;
 			}
 
 			if (ch == 'c' || ch == 'd' || ch == 'u' || ch == 'x' || ch == 'p') {
@@ -327,54 +351,32 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 #else
 				outstr = strerror(errno);
 #endif /* NOLIBC_IGNORE_ERRNO */
-			}
-			else if (ch == '%') {
-				/* queue it verbatim */
-				continue;
-			}
-			else {
-				/* modifiers or final 0 */
-				if (ch == 'l') {
-					/* long format prefix, maintain the escape */
-					lpref++;
-				} else if (ch == 'j') {
-					lpref = 2;
+			} else {
+				if (ch != '%') {
+					/* Invalid format: back up to output the format characters */
+					fmt = outstr + 1;
+					/* and output a '%' now. */
 				}
-				escape = 1;
-				goto do_escape;
+				/* %% is documented as a 'conversion specifier'.
+				 * Any flags, precision or length modifier are ignored.
+				 */
+				width = 0;
+				outstr = "%";
 			}
 			len = strlen(outstr);
-			goto flush_str;
 		}
 
-		/* not an escape sequence */
-		if (ch == 0 || ch == '%') {
-			/* flush pending data on escape or end */
-			escape = 1;
-			lpref = 0;
-			outstr = fmt;
-			len = ofs - 1;
-		flush_str:
-			while (width > len) {
-				unsigned int pad_len = ((width - len - 1) & 15) + 1;
-				width -= pad_len;
-				written += pad_len;
-				if (cb(state, "                ", pad_len) != 0)
-					return -1;
-			}
-			if (cb(state, outstr, len) != 0)
-				return -1;
+		written += len;
 
-			written += len;
-		do_escape:
-			if (ch == 0)
-				break;
-			fmt += ofs;
-			ofs = 0;
-			continue;
+		while (width > len) {
+			unsigned int pad_len = ((width - len - 1) & 15) + 1;
+			width -= pad_len;
+			written += pad_len;
+			if (cb(state, "                ", pad_len) != 0)
+				return -1;
 		}
-
-		/* literal char, just queue it */
+		if (cb(state, outstr, len) != 0)
+			return -1;
 	}
 
 	/* Flush/terminate any buffer. */
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (4 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf() david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-08 15:22   ` Willy Tarreau
  2026-02-16 19:52   ` Thomas Weißschuh
  2026-02-06 19:11 ` [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X" david.laight.linux
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Use flags bits (1u << (ch & 31)) for the flags, length modifiers, and
conversion specifiers.
This makes it easy to test for multiple values at once.

Detect the conversion flags " #+-0" although they are currently all ignored.

Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
(both long long).

Add support for "%i" (the same as %d").

Unconditionally generate the signed values (for %d) to remove a second
set of checks for the size.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
- Use #defines to make the code a lot more readable.
- Include the changes from the old patch 10 that used masks for the
  conversion specifiers.
- Detect all the valid flag characters even though they are not implemented.
- Support for left justifying field is moved to patch 7.

 tools/include/nolibc/stdio.h | 151 ++++++++++++++++++++++++-----------
 1 file changed, 103 insertions(+), 48 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index bb54f488c228..b14cf8224403 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -240,19 +240,44 @@ char *fgets(char *s, int size, FILE *stream)
 }
 
 
-/* minimal printf(). It supports the following formats:
- *  - %[l*]{d,u,c,x,p}
- *  - %s
- *  - unknown modifiers are ignored.
+/* simple printf(). It supports the following formats:
+ *  - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
+ *  - %%
+ *  - invalid formats are copied to the output buffer
  */
+
+/* This code uses 'flag' variables that are indexed by the low 6 bits
+ * of characters to optimise checks for multiple characters.
+ *
+ * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b'. ...)
+ * returns non-zero if the bit for any of the specified characters is set.
+ *
+ * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b'. ...)
+ * returns the flag bit for ch if it is one of the specified characters.
+ * All the characters must be in the same 32 character block (non-alphabetic,
+ * upper case, or lower case) of the ASCII character set.)
+ */
+#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
+#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
+#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
+	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
+#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
+	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
+#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
+	(ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
+		_NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))
+
 typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
 
 static __attribute__((unused, format(printf, 3, 0)))
 int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
 {
-	char lpref, ch;
-	unsigned long long v;
+	char ch;
 	unsigned int written, width;
+	unsigned int flags, ch_flag;
 	size_t len;
 	char tmpbuf[21];
 	const char *outstr;
@@ -265,6 +290,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			break;
 
 		width = 0;
+		flags = 0;
 		if (ch != '%') {
 			while (*fmt && *fmt != '%')
 				fmt++;
@@ -274,6 +300,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 
 			ch = *fmt++;
 
+			/* Conversion flag characters */
+			for (;; ch = *fmt++) {
+				ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
+				if (!ch_flag)
+					break;
+				flags |= ch_flag;
+			}
+
 			/* width */
 			while (ch >= '0' && ch <= '9') {
 				width *= 10;
@@ -282,62 +316,77 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 				ch = *fmt++;
 			}
 
-			/* Length modifiers */
-			if (ch == 'l') {
-				lpref = 1;
-				ch = *fmt++;
-				if (ch == 'l') {
-					lpref = 2;
-					ch = *fmt++;
+			/* Length modifier.
+			 * They miss the conversion flags characters " #+-0" so can go into flags.
+			 * Change both L and ll to q.
+			 */
+			if (ch == 'L')
+				ch = 'q';
+			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'l', 't', 'z', 'j', 'q');
+			if (ch_flag != 0) {
+				if (ch == 'l' && fmt[0] == 'l') {
+					fmt++;
+					ch_flag = _NOLIBC_PF_FLAG('q');
 				}
-			} else if (ch == 'j') {
-				/* intmax_t is long long */
-				lpref = 2;
+				flags |= ch_flag;
 				ch = *fmt++;
-			} else {
-				lpref = 0;
 			}
 
-			if (ch == 'c' || ch == 'd' || ch == 'u' || ch == 'x' || ch == 'p') {
+			/* Conversion specifiers. */
+
+			/* Numeric conversion specifiers. */
+			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
+			if (ch_flag != 0) {
+				unsigned long long v;
+				long long signed_v;
 				char *out = tmpbuf;
 
-				if (ch == 'p')
+				/* 'long' is needed for pointer/string conversions and ltz lengths.
+				 * A single test can be used provided 'p' (the same bit as '0')
+				 * is masked from flags.
+				 */
+				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
+							     'p', 'l', 't', 'z')) {
 					v = va_arg(args, unsigned long);
-				else if (lpref) {
-					if (lpref > 1)
-						v = va_arg(args, unsigned long long);
-					else
-						v = va_arg(args, unsigned long);
-				} else
+					signed_v = (long)v;
+				} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
+					v = va_arg(args, unsigned long long);
+					signed_v = v;
+				} else {
 					v = va_arg(args, unsigned int);
+					signed_v = (int)v;
+				}
 
-				if (ch == 'd') {
-					/* sign-extend the value */
-					if (lpref == 0)
-						v = (long long)(int)v;
-					else if (lpref == 1)
-						v = (long long)(long)v;
+				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c')) {
+					/* "%c" - single character. */
+					tmpbuf[0] = v;
+					len = 1;
+					outstr = tmpbuf;
+					goto do_output;
 				}
 
-				switch (ch) {
-				case 'c':
-					out[0] = v;
-					out[1] = 0;
-					break;
-				case 'd':
-					i64toa_r(v, out);
-					break;
-				case 'u':
+				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i')) {
+					/* "%d" and "%i" - signed decimal numbers. */
+					if (signed_v < 0) {
+						*out++ = '-';
+						v = -(signed_v + 1);
+						v++;
+					}
+				}
+
+				/* Convert the number to ascii in the required base. */
+				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
+					/* Base 10 */
 					u64toa_r(v, out);
-					break;
-				case 'p':
-					*(out++) = '0';
-					*(out++) = 'x';
-					__nolibc_fallthrough;
-				default: /* 'x' and 'p' above */
+				} else {
+					/* Base 16 */
+					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
+						*(out++) = '0';
+						*(out++) = 'x';
+					}
 					u64toh_r(v, out);
-					break;
 				}
+
 				outstr = tmpbuf;
 			}
 			else if (ch == 's') {
@@ -366,8 +415,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			len = strlen(outstr);
 		}
 
+do_output:
 		written += len;
 
+		/* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
+		 * code into one of the conditionals above.
+		 */
+		__asm__ volatile("" : "=r"(len) : "0"(len));
+
 		while (width > len) {
 			unsigned int pad_len = ((width - len - 1) & 15) + 1;
 			width -= pad_len;
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (5 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-08 15:47   ` Willy Tarreau
                     ` (3 more replies)
  2026-02-06 19:11 ` [PATCH v2 next 08/11] tools/nolibc/printf: Add support for zero padding and field precision david.laight.linux
                   ` (4 subsequent siblings)
  11 siblings, 4 replies; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Add support for all the normal flag chacacters except '0' (zero pad).
  '-' left alignment.
  '+' and ' ' Sign characters for non-negative numbers.
  '#' adds 0x to hex numbers.
Partially support "%X", outputs lower case a..f the same as "%x".

Move the "%s" code in with the numeric formats to save a va_arg() call.

Prepend the sign (or "0x") after conversion to ascii and use the length
returned by u64toh_r() and u64toa_r().
Both needed for precision and zero-padding in the next patch.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
- Add support for left justifying fields (removed from patch 6).
- Merge in the changes from the old patch 8.
- Add support for "%#x" (formally part of patch 9).

 tools/include/nolibc/stdio.h | 97 ++++++++++++++++++++++++++----------
 1 file changed, 71 insertions(+), 26 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index b14cf8224403..032416e900ee 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -240,10 +240,16 @@ char *fgets(char *s, int size, FILE *stream)
 }
 
 
-/* simple printf(). It supports the following formats:
- *  - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
- *  - %%
- *  - invalid formats are copied to the output buffer
+/* printf(). Supports most of the normal integer and string formats.
+ *  - %[#-+ ][width][{l,t,z,ll,L,j,q}]{d,i,u,c,x,X,p,s,m,%}
+ *  - %% generates a single %
+ *  - %m outputs strerror(errno).
+ *  - # only affects %x and prepends 0x to non-zero values.
+ *  - %o (octal) isn't supported.
+ *  - %X outputs a..f the same as %x.
+ *  - No support for zero padding, precision or variable widths.
+ *  - No support for wide characters.
+ *  - invalid formats are copied to the output buffer.
  */
 
 /* This code uses 'flag' variables that are indexed by the low 6 bits
@@ -279,7 +285,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 	unsigned int written, width;
 	unsigned int flags, ch_flag;
 	size_t len;
-	char tmpbuf[21];
+	char tmpbuf[32 + 24];
 	const char *outstr;
 
 	written = 0;
@@ -334,19 +340,32 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 
 			/* Conversion specifiers. */
 
-			/* Numeric conversion specifiers. */
-			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
-			if (ch_flag != 0) {
+			/* Numeric and pointer conversion specifiers.
+			 *
+			 * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
+			 * so that 'X' can be allowed through.
+			 * 'X' gets treated and 'x' because _NOLIBC_PF_FLAG() returns the same
+			 * value for both.
+			 */
+			if ((ch < 'a' || ch > 'z') && ch != 'X')
+				goto non_numeric_conversion;
+
+			/* We need to check for "%p" or "%#x" later, merging here gives better code.
+			 * But '#' collides with 'c' so shift right.
+			 */
+			ch_flag = _NOLIBC_PF_FLAG(ch) | (flags & _NOLIBC_PF_FLAG('#')) >> 1;
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p', 's')) {
 				unsigned long long v;
 				long long signed_v;
-				char *out = tmpbuf;
+				char *out = tmpbuf + 32;
+				int sign = 0;
 
 				/* 'long' is needed for pointer/string conversions and ltz lengths.
 				 * A single test can be used provided 'p' (the same bit as '0')
 				 * is masked from flags.
 				 */
 				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
-							     'p', 'l', 't', 'z')) {
+							     'p', 's', 'l', 't', 'z')) {
 					v = va_arg(args, unsigned long);
 					signed_v = (long)v;
 				} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
@@ -365,40 +384,62 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 					goto do_output;
 				}
 
+				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 's')) {
+					/* "%s" - character string. */
+					if (!v) {
+						outstr = "(null)";
+						len = 6;
+						goto do_output;
+					}
+					outstr = (void *)v;
+do_strnlen_output:
+					len = strnlen(outstr, INT_MAX);
+					goto do_output;
+				}
+
 				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i')) {
 					/* "%d" and "%i" - signed decimal numbers. */
 					if (signed_v < 0) {
-						*out++ = '-';
+						sign = '-';
 						v = -(signed_v + 1);
 						v++;
+					} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '+')) {
+						sign = '+';
+					} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, ' ')) {
+						sign = ' ';
 					}
 				}
 
 				/* Convert the number to ascii in the required base. */
 				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
 					/* Base 10 */
-					u64toa_r(v, out);
+					len = u64toa_r(v, out);
 				} else {
 					/* Base 16 */
-					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
-						*(out++) = '0';
-						*(out++) = 'x';
+					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p', '#' - 1)) {
+						/* "%p" and "%#x" need "0x" prepending. */
+						sign = 'x' | '0' << 8;
 					}
-					u64toh_r(v, out);
+					len = u64toh_r(v, out);
 				}
 
-				outstr = tmpbuf;
-			}
-			else if (ch == 's') {
-				outstr = va_arg(args, char *);
-				if (!outstr)
-					outstr="(null)";
+				/* Add 0, 1 or 2 ("0x") sign characters left of any zero padding */
+				for (; sign; sign >>= 8) {
+					len++;
+					*--out = sign;
+				}
+				outstr = out;
+				goto do_output;
 			}
-			else if (ch == 'm') {
+
+non_numeric_conversion:
+			if (ch == 'm') {
 #ifdef NOLIBC_IGNORE_ERRNO
 				outstr = "unknown error";
+				len = __builtin_strlen(outstr);
 #else
 				outstr = strerror(errno);
+				goto do_strnlen_output;
 #endif /* NOLIBC_IGNORE_ERRNO */
 			} else {
 				if (ch != '%') {
@@ -409,10 +450,10 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 				/* %% is documented as a 'conversion specifier'.
 				 * Any flags, precision or length modifier are ignored.
 				 */
+				len = 1;
 				width = 0;
-				outstr = "%";
+				outstr = fmt - 1;
 			}
-			len = strlen(outstr);
 		}
 
 do_output:
@@ -423,6 +464,10 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 		 */
 		__asm__ volatile("" : "=r"(len) : "0"(len));
 
+		/* Output 'left pad', 'value' then 'right pad'. */
+		flags = _NOLIBC_PF_FLAGS_CONTAIN(flags, '-');
+		if (flags && cb(state, outstr, len) != 0)
+			return -1;
 		while (width > len) {
 			unsigned int pad_len = ((width - len - 1) & 15) + 1;
 			width -= pad_len;
@@ -430,7 +475,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			if (cb(state, "                ", pad_len) != 0)
 				return -1;
 		}
-		if (cb(state, outstr, len) != 0)
+		if (!flags && cb(state, outstr, len) != 0)
 			return -1;
 	}
 
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 08/11] tools/nolibc/printf: Add support for zero padding and field precision
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (6 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X" david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-08 16:16   ` Willy Tarreau
  2026-02-06 19:11 ` [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors david.laight.linux
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Include support for variable field widths (eg "%*.*d") and the special
processing required for zero especially when the precision is 0.

Zero padding is limited to 30 zero characters.
This is wider than the largest numeric field so shouldn't be a problem.

When processing "%#01x" subtracting the number of 'sign' characters
from the field width generates a negative precision.
Fix by making all the variables 'signed int'.
This also makes the code smaller.
The space padding can then optimised as well (saves another 60 bytes of
code).

All the standard printf formats are now supported except octal
and floating point.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
- These changes were previously in patch 9.
  However you need to apply the old patch 10 to get anything like
  the same source. The files then more of less match apart from 'c'
  being renamed 'ch' and the 'magic' #defines.

 tools/include/nolibc/stdio.h | 103 +++++++++++++++++++++++++++--------
 1 file changed, 80 insertions(+), 23 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index 032416e900ee..d85607942784 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -241,13 +241,12 @@ char *fgets(char *s, int size, FILE *stream)
 
 
 /* printf(). Supports most of the normal integer and string formats.
- *  - %[#-+ ][width][{l,t,z,ll,L,j,q}]{d,i,u,c,x,X,p,s,m,%}
+ *  - %[#0-+ ][width|*[.precision|*]][{l,t,z,ll,L,j,q}]{d,i,u,c,x,X,p,s,m,%}
  *  - %% generates a single %
  *  - %m outputs strerror(errno).
  *  - # only affects %x and prepends 0x to non-zero values.
  *  - %o (octal) isn't supported.
  *  - %X outputs a..f the same as %x.
- *  - No support for zero padding, precision or variable widths.
  *  - No support for wide characters.
  *  - invalid formats are copied to the output buffer.
  */
@@ -282,9 +281,8 @@ static __attribute__((unused, format(printf, 3, 0)))
 int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
 {
 	char ch;
-	unsigned int written, width;
+	int len, written, width, precision;
 	unsigned int flags, ch_flag;
-	size_t len;
 	char tmpbuf[32 + 24];
 	const char *outstr;
 
@@ -314,12 +312,24 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 				flags |= ch_flag;
 			}
 
-			/* width */
-			while (ch >= '0' && ch <= '9') {
-				width *= 10;
-				width += ch - '0';
-
-				ch = *fmt++;
+                        /* Width and precision */
+			for (;; ch = *fmt++) {
+				if (ch == '*') {
+					precision = va_arg(args, unsigned int);
+					ch = *fmt++;
+				} else {
+					for (precision = 0; ch >= '0' && ch <= '9'; ch = *fmt++)
+						precision = precision * 10 + (ch - '0');
+				}
+				if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '.'))
+					break;
+				width = precision;
+				if (ch != '.') {
+					/* Default precision for strings */
+					precision = INT_MAX;
+					break;
+				}
+				flags |= _NOLIBC_PF_FLAG('.');
 			}
 
 			/* Length modifier.
@@ -388,12 +398,13 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 					/* "%s" - character string. */
 					if (!v) {
 						outstr = "(null)";
-						len = 6;
+						/* Match glibc, nothing output if precision too small */
+						len = precision >= 6 ? 6 : 0;
 						goto do_output;
 					}
 					outstr = (void *)v;
 do_strnlen_output:
-					len = strnlen(outstr, INT_MAX);
+					len = strnlen(outstr, precision);
 					goto do_output;
 				}
 
@@ -410,19 +421,64 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 					}
 				}
 
-				/* Convert the number to ascii in the required base. */
-				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
-					/* Base 10 */
-					len = u64toa_r(v, out);
+				if (v == 0) {
+					/* There are special rules for zero. */
+					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
+						/* "%p" match glibc, precision is ignored */
+						outstr = "(nil)";
+						len = 5;
+						goto do_output;
+					}
+					if (!precision) {
+						/* Explicit %nn.0d, no digits output */
+						len = 0;
+						goto prepend_sign;
+					}
+					/* All formats (including "%#x") just output "0". */
+					*out = '0';
+					len = 1;
 				} else {
-					/* Base 16 */
-					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p', '#' - 1)) {
-						/* "%p" and "%#x" need "0x" prepending. */
-						sign = 'x' | '0' << 8;
+					/* Convert the number to ascii in the required base. */
+					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
+						/* Base 10 */
+						len = u64toa_r(v, out);
+					} else {
+						/* Base 16 */
+						if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p', '#' - 1)) {
+							/* "%p" and "%#x" need "0x" prepending. */
+							sign = 'x' | '0' << 8;
+						}
+						len = u64toh_r(v, out);
+					}
+				}
+
+				/* Add zero padding */
+				if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '0', '.')) {
+					if (!_NOLIBC_PF_FLAGS_CONTAIN(flags, '.')) {
+						if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '-'))
+							/* Left justify overrides zero pad */
+							goto prepend_sign;
+						/* eg "%05d", Zero pad to field width less sign */
+						precision = width;
+						if (sign) {
+							precision--;
+							if (sign >= 256)
+								precision--;
+						}
+					}
+					if (precision > 30)
+						/* Don't run off the start of tmpbuf[] */
+						precision = 30;
+					for (; len < precision; len++) {
+						/* Stop gcc generating horrid code and memset().
+						 * This is OPTIMIZER_HIDE_VAR() from compiler.h.
+						 */
+						__asm__ volatile("" : "=r"(len) : "0"(len));
+						*--out = '0';
 					}
-					len = u64toh_r(v, out);
 				}
 
+prepend_sign:
 				/* Add 0, 1 or 2 ("0x") sign characters left of any zero padding */
 				for (; sign; sign >>= 8) {
 					len++;
@@ -465,11 +521,12 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 		__asm__ volatile("" : "=r"(len) : "0"(len));
 
 		/* Output 'left pad', 'value' then 'right pad'. */
+		width -= len;
 		flags = _NOLIBC_PF_FLAGS_CONTAIN(flags, '-');
 		if (flags && cb(state, outstr, len) != 0)
 			return -1;
-		while (width > len) {
-			unsigned int pad_len = ((width - len - 1) & 15) + 1;
+		while (width > 0) {
+			int pad_len = ((width - 1) & 15) + 1;
 			width -= pad_len;
 			written += pad_len;
 			if (cb(state, "                ", pad_len) != 0)
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (7 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 08/11] tools/nolibc/printf: Add support for zero padding and field precision david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-16 20:05   ` Thomas Weißschuh
  2026-02-06 19:11 ` [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests david.laight.linux
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Check the string matches before checking the returned length.
Only print the string once when it matches.
Normally the length is that of the expected string, make it easier
to write tests by treating a length of zero as being that of the
expected output.
Additionally check that nothing beyond the end is written.

This also correctly returns 1 (the number of errors) when strcmp()
fails rather than the return value from strncmp() which is the
signed difference between the mismatching characters.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
- Formally patch 5, otherwise unchanged.

 tools/testing/selftests/nolibc/nolibc-test.c | 27 ++++++++++++++++----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 3c5a226dad3a..9378a1f26c34 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -1567,28 +1567,45 @@ int run_stdlib(int min, int max)
 
 static int expect_vfprintf(int llen, int c, const char *expected, const char *fmt, ...)
 {
+	unsigned int i;
 	char buf[100];
 	va_list args;
 	ssize_t w;
 	int ret;
 
+	for (i = 0; i < sizeof(buf); i++)
+		buf[i] = i;
 
 	va_start(args, fmt);
-	/* Only allow writing 21 bytes, to test truncation */
+	/* Only allow writing 20 bytes, to test truncation */
 	w = vsnprintf(buf, 21, fmt, args);
 	va_end(args);
 
+	llen += printf(" \"%s\"", buf);
+	ret = strcmp(expected, buf);
+	if (ret) {
+		llen += printf(" should be \"%s\"", expected);
+		result(llen, FAIL);
+		return 1;
+	}
+	if (!c)
+		c = strlen(expected);
 	if (w != c) {
 		llen += printf(" written(%d) != %d", (int)w, c);
 		result(llen, FAIL);
 		return 1;
 	}
 
-	llen += printf(" \"%s\" = \"%s\"", expected, buf);
-	ret = strncmp(expected, buf, c);
+	for (i = c + 1; i < sizeof(buf); i++) {
+		if (buf[i] - i) {
+			llen += printf(" overwrote buf[%d] with 0x%x", i, buf[i]);
+			result(llen, FAIL);
+			return 1;
+		}
+	}
 
-	result(llen, ret ? FAIL : OK);
-	return ret;
+	result(llen, OK);
+	return 0;
 }
 
 static int test_scanf(void)
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (8 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-16 20:14   ` Thomas Weißschuh
  2026-02-16 20:23   ` Thomas Weißschuh
  2026-02-06 19:11 ` [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output david.laight.linux
  2026-02-06 21:36 ` [PATCH v2 next 00/11] tools/nolibc: Enhance printf() David Laight
  11 siblings, 2 replies; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Extra tests include:
- %%, including ignored modifiers.
- Invalid formats copied to output buffer (matches glibc).
- Left aligned output "%-..."
- Zero padding "%0...".
- "(nil)" for NULL pointers (matches glibc).
- Alternate form "%#x" (prepends 0x in non-zero).
- Field precision as well as width, printf("%.0d", 0) is "".
- Variable length width and precision "%*.*d".
- Length qualifiers L and ll.
- Conversion specifiers i and X.
- More 'corner' cases.

There are no explicit tests of long (l, t or z) because they would
have to differ between 32bit and 64bit.

Set the expected length to zero for all the non-truncating tests.
(Annoying when a test is changed.)

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
- Formally patch 11.
- Remove broken __attribute__ on expect_vfprintf().
- Add extra test for "#01x".

 tools/testing/selftests/nolibc/nolibc-test.c | 49 +++++++++++++++-----
 1 file changed, 37 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 9378a1f26c34..10ae6dcd71b8 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -1727,20 +1727,45 @@ static int run_printf(int min, int max)
 		 */
 		switch (test + __LINE__ + 1) {
 		CASE_TEST(empty);        EXPECT_VFPRINTF(0, "", ""); break;
-		CASE_TEST(simple);       EXPECT_VFPRINTF(3, "foo", "foo"); break;
-		CASE_TEST(string);       EXPECT_VFPRINTF(3, "foo", "%s", "foo"); break;
-		CASE_TEST(number);       EXPECT_VFPRINTF(4, "1234", "%d", 1234); break;
-		CASE_TEST(negnumber);    EXPECT_VFPRINTF(5, "-1234", "%d", -1234); break;
-		CASE_TEST(unsigned);     EXPECT_VFPRINTF(5, "12345", "%u", 12345); break;
-		CASE_TEST(char);         EXPECT_VFPRINTF(1, "c", "%c", 'c'); break;
-		CASE_TEST(hex);          EXPECT_VFPRINTF(1, "f", "%x", 0xf); break;
-		CASE_TEST(pointer);      EXPECT_VFPRINTF(3, "0x1", "%p", (void *) 0x1); break;
-		CASE_TEST(uintmax_t);    EXPECT_VFPRINTF(20, "18446744073709551615", "%ju", 0xffffffffffffffffULL); break;
-		CASE_TEST(intmax_t);     EXPECT_VFPRINTF(20, "-9223372036854775807", "%jd", 0x8000000000000001LL); break;
+		CASE_TEST(simple);       EXPECT_VFPRINTF(0, "foo", "foo"); break;
+		CASE_TEST(string);       EXPECT_VFPRINTF(0, "foo", "%s", "foo"); break;
+		CASE_TEST(number);       EXPECT_VFPRINTF(0, "1234", "%d", 1234); break;
+		CASE_TEST(negnumber);    EXPECT_VFPRINTF(0, "-1234", "%d", -1234); break;
+		CASE_TEST(unsigned);     EXPECT_VFPRINTF(0, "12345", "%u", 12345); break;
+		CASE_TEST(signed_max);   EXPECT_VFPRINTF(0, "2147483647", "%i", ~0u >> 1); break;
+		CASE_TEST(signed_min);   EXPECT_VFPRINTF(0, "-2147483648", "%i", (~0u >> 1) + 1); break;
+		CASE_TEST(unsigned_max); EXPECT_VFPRINTF(0, "4294967295", "%u", ~0u); break;
+		CASE_TEST(char);         EXPECT_VFPRINTF(0, "|c|d|   e|", "|%c|%.0c|%4c|", 'c', 'd', 'e'); break;
+		CASE_TEST(hex);          EXPECT_VFPRINTF(0, "|f|d|", "|%x|%X|", 0xf, 0xd); break;
+		CASE_TEST(pointer);      EXPECT_VFPRINTF(0, "0x1", "%p", (void *) 0x1); break;
+		CASE_TEST(pointer_NULL); EXPECT_VFPRINTF(0, "|(nil)|(nil)|", "|%p|%.4p|", (void *)0, (void *)0); break;
+		CASE_TEST(string_NULL);  EXPECT_VFPRINTF(0, "|(null)||(null)|", "|%s|%.5s|%.6s|", (void *)0, (void *)0, (void *)0); break;
+		CASE_TEST(percent);      EXPECT_VFPRINTF(0, "a%d42%69%", "a%%d%d%%%d%%", 42, 69); break;
+		CASE_TEST(perc_qual);    EXPECT_VFPRINTF(0, "a%d2", "a%-14l%d%d", 2); break;
+		CASE_TEST(invalid);      EXPECT_VFPRINTF(0, "a%12yx3%y42%y", "a%12yx%d%y%d%y", 3, 42); break;
+		CASE_TEST(intmax_max);   EXPECT_VFPRINTF(0, "9223372036854775807", "%lld", ~0ULL >> 1); break;
+		CASE_TEST(intmax_min);   EXPECT_VFPRINTF(0, "-9223372036854775808", "%Li", (~0ULL >> 1) + 1); break;
+		CASE_TEST(uintmax_max);  EXPECT_VFPRINTF(0, "18446744073709551615", "%ju", ~0ULL); break;
 		CASE_TEST(truncation);   EXPECT_VFPRINTF(25, "01234567890123456789", "%s", "0123456789012345678901234"); break;
-		CASE_TEST(string_width); EXPECT_VFPRINTF(10, "         1", "%10s", "1"); break;
-		CASE_TEST(number_width); EXPECT_VFPRINTF(10, "         1", "%10d", 1); break;
+		CASE_TEST(string_width); EXPECT_VFPRINTF(0, "         1", "%10s", "1"); break;
+		CASE_TEST(string_trunc); EXPECT_VFPRINTF(0, "     12345", "%10.5s", "1234567890"); break;
+		CASE_TEST(number_width); EXPECT_VFPRINTF(0, "         1", "%10d", 1); break;
+		CASE_TEST(number_left);  EXPECT_VFPRINTF(0, "|-5      |", "|%-8d|", -5); break;
+		CASE_TEST(string_align); EXPECT_VFPRINTF(0, "|foo     |", "|%-8s|", "foo"); break;
 		CASE_TEST(width_trunc);  EXPECT_VFPRINTF(25, "                    ", "%25d", 1); break;
+		CASE_TEST(width_tr_lft); EXPECT_VFPRINTF(25, "1                   ", "%-25d", 1); break;
+		CASE_TEST(number_pad);   EXPECT_VFPRINTF(0, "0000000005", "%010d", 5); break;
+		CASE_TEST(number_pad);   EXPECT_VFPRINTF(0, "|0000000005|0x1234|", "|%010d|%#01x|", 5, 0x1234); break;
+		CASE_TEST(num_pad_neg);  EXPECT_VFPRINTF(0, "-000000005", "%010d", -5); break;
+		CASE_TEST(num_pad_hex);  EXPECT_VFPRINTF(0, "00fffffffb", "%010x", -5); break;
+		CASE_TEST(num_pad_trunc);EXPECT_VFPRINTF(40, "          0000000000", "%040d", 5); break;  /* max 30 '0' can be added */
+		CASE_TEST(number_prec);  EXPECT_VFPRINTF(0, "     00005", "%10.5d", 5); break;
+		CASE_TEST(num_prec_neg); EXPECT_VFPRINTF(0, "    -00005", "%10.5d", -5); break;
+		CASE_TEST(num_prec_var); EXPECT_VFPRINTF(0, "    -00005", "%*.*d", 10, 5, -5); break;
+		CASE_TEST(num_0_prec_0); EXPECT_VFPRINTF(0, "|| |+||||", "|%.0d|% .0d|%+.0d|%.0u|%.0x|%#.0x|", 0, 0, 0, 0, 0, 0); break;
+		CASE_TEST(hex_alt);      EXPECT_VFPRINTF(0, "|0x1|0x01| 0x02|", "|%#x|%#04x|%#5.2x|", 1, 1, 2); break;
+		CASE_TEST(hex_0_alt);    EXPECT_VFPRINTF(0, "|0|0000|   00|", "|%#x|%#04x|%#5.2x|", 0, 0, 0); break;
+		CASE_TEST(dec_alt);      EXPECT_VFPRINTF(0, "|1|0001|   02|", "|%#d|%#04d|%#5.2d|", 1, 1, 2); break; /* '#' is ignored */
 		CASE_TEST(scanf);        EXPECT_ZR(1, test_scanf()); break;
 		CASE_TEST(strerror);     EXPECT_ZR(1, test_strerror()); break;
 		CASE_TEST(printf_error); EXPECT_ZR(1, test_printf_error()); break;
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (9 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests david.laight.linux
@ 2026-02-06 19:11 ` david.laight.linux
  2026-02-08 16:20   ` Willy Tarreau
  2026-02-16 20:22   ` Thomas Weißschuh
  2026-02-06 21:36 ` [PATCH v2 next 00/11] tools/nolibc: Enhance printf() David Laight
  11 siblings, 2 replies; 60+ messages in thread
From: david.laight.linux @ 2026-02-06 19:11 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li; +Cc: David Laight

From: David Laight <david.laight.linux@gmail.com>

Now that printf supports '*' for field widths it can be used to
align the "[OK]" strings in the output.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---

Changes for v2:
- Formally patch 12, unchanged.

 tools/testing/selftests/nolibc/nolibc-test.c | 21 ++++----------------
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 10ae6dcd71b8..416d0fd8d85e 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -141,21 +141,6 @@ static const char *errorname(int err)
 	}
 }
 
-static void align_result(size_t llen)
-{
-	const size_t align = 64;
-	char buf[align];
-	size_t n;
-
-	if (llen >= align)
-		return;
-
-	n = align - llen;
-	memset(buf, ' ', n);
-	buf[n] = '\0';
-	fputs(buf, stdout);
-}
-
 enum RESULT {
 	OK,
 	FAIL,
@@ -173,8 +158,10 @@ static void result(int llen, enum RESULT r)
 	else
 		msg = " [FAIL]";
 
-	align_result(llen);
-	puts(msg);
+	llen = 64 - llen;
+	if (llen < 0)
+		llen = 0;
+	printf("%*s%s\n", llen, "", msg);
 }
 
 /* The tests below are intended to be used by the macroes, which evaluate
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 00/11] tools/nolibc: Enhance printf()
  2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
                   ` (10 preceding siblings ...)
  2026-02-06 19:11 ` [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output david.laight.linux
@ 2026-02-06 21:36 ` David Laight
  11 siblings, 0 replies; 60+ messages in thread
From: David Laight @ 2026-02-06 21:36 UTC (permalink / raw)
  To: Willy Tarreau, Thomas Weißschuh, linux-kernel, Cheng Li

On Fri,  6 Feb 2026 19:11:10 +0000
david.laight.linux@gmail.com wrote:

> From: David Laight <david.laight.linux@gmail.com>
> 
> Update printf() so that it handles almost all the non-fp formats.
> In particular:
> - Left alignment.
> - Zero padding.
> - Field precision.
> - Variable field width and precision.
> - Width modifiers q, L, t and z.
> - Conversion specifiers i and X (X generates lower case).
> About the only things that are missing are octal and floating point.

Since it is pretty much a re-write, a copy of the new version:

/* printf(). Supports most of the normal integer and string formats.
 *  - %[#0-+ ][width|*[.precision|*]][{l,t,z,ll,L,j,q}]{d,i,u,c,x,X,p,s,m,%}
 *  - %% generates a single %
 *  - %m outputs strerror(errno).
 *  - # only affects %x and prepends 0x to non-zero values.
 *  - %o (octal) isn't supported.
 *  - %X outputs a..f the same as %x.
 *  - No support for wide characters.
 *  - invalid formats are copied to the output buffer.
 */

/* This code uses 'flag' variables that are indexed by the low 6 bits
 * of characters to optimise checks for multiple characters.
 *
 * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b'. ...)
 * returns non-zero if the bit for any of the specified characters is set.
 *
 * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b'. ...)
 * returns the flag bit for ch if it is one of the specified characters.
 * All the characters must be in the same 32 character block (non-alphabetic,
 * upper case, or lower case) of the ASCII character set.)
 */
#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
	(ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
		_NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))

typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);

static __attribute__((unused, format(printf, 3, 0)))
int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
{
	char ch;
	int len, written, width, precision;
	unsigned int flags, ch_flag;
	char tmpbuf[32 + 24];
	const char *outstr;

	written = 0;
	while (1) {
		outstr = fmt;
		ch = *fmt++;
		if (!ch)
			break;

		width = 0;
		flags = 0;
		if (ch != '%') {
			while (*fmt && *fmt != '%')
				fmt++;
			len = fmt - outstr;
		} else {
			/* we're in a format sequence */

			ch = *fmt++;

			/* Conversion flag characters */
			for (;; ch = *fmt++) {
				ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
				if (!ch_flag)
					break;
				flags |= ch_flag;
			}

                        /* Width and precision */
			for (;; ch = *fmt++) {
				if (ch == '*') {
					precision = va_arg(args, unsigned int);
					ch = *fmt++;
				} else {
					for (precision = 0; ch >= '0' && ch <= '9'; ch = *fmt++)
						precision = precision * 10 + (ch - '0');
				}
				if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '.'))
					break;
				width = precision;
				if (ch != '.') {
					/* Default precision for strings */
					precision = INT_MAX;
					break;
				}
				flags |= _NOLIBC_PF_FLAG('.');
			}

			/* Length modifier.
			 * They miss the conversion flags characters " #+-0" so can go into flags.
			 * Change both L and ll to q.
			 */
			if (ch == 'L')
				ch = 'q';
			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'l', 't', 'z', 'j', 'q');
			if (ch_flag != 0) {
				if (ch == 'l' && fmt[0] == 'l') {
					fmt++;
					ch_flag = _NOLIBC_PF_FLAG('q');
				}
				flags |= ch_flag;
				ch = *fmt++;
			}

			/* Conversion specifiers. */

			/* Numeric and pointer conversion specifiers.
			 *
			 * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
			 * so that 'X' can be allowed through.
			 * 'X' gets treated and 'x' because _NOLIBC_PF_FLAG() returns the same
			 * value for both.
			 */
			if ((ch < 'a' || ch > 'z') && ch != 'X')
				goto non_numeric_conversion;

			/* We need to check for "%p" or "%#x" later, merging here gives better code.
			 * But '#' collides with 'c' so shift right.
			 */
			ch_flag = _NOLIBC_PF_FLAG(ch) | (flags & _NOLIBC_PF_FLAG('#')) >> 1;
			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p', 's')) {
				unsigned long long v;
				long long signed_v;
				char *out = tmpbuf + 32;
				int sign = 0;

				/* 'long' is needed for pointer/string conversions and ltz lengths.
				 * A single test can be used provided 'p' (the same bit as '0')
				 * is masked from flags.
				 */
				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
							     'p', 's', 'l', 't', 'z')) {
					v = va_arg(args, unsigned long);
					signed_v = (long)v;
				} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
					v = va_arg(args, unsigned long long);
					signed_v = v;
				} else {
					v = va_arg(args, unsigned int);
					signed_v = (int)v;
				}

				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c')) {
					/* "%c" - single character. */
					tmpbuf[0] = v;
					len = 1;
					outstr = tmpbuf;
					goto do_output;
				}

				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 's')) {
					/* "%s" - character string. */
					if (!v) {
						outstr = "(null)";
						/* Match glibc, nothing output if precision too small */
						len = precision >= 6 ? 6 : 0;
						goto do_output;
					}
					outstr = (void *)v;
do_strnlen_output:
					len = strnlen(outstr, precision);
					goto do_output;
				}

				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i')) {
					/* "%d" and "%i" - signed decimal numbers. */
					if (signed_v < 0) {
						sign = '-';
						v = -(signed_v + 1);
						v++;
					} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '+')) {
						sign = '+';
					} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, ' ')) {
						sign = ' ';
					}
				}

				if (v == 0) {
					/* There are special rules for zero. */
					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
						/* "%p" match glibc, precision is ignored */
						outstr = "(nil)";
						len = 5;
						goto do_output;
					}
					if (!precision) {
						/* Explicit %nn.0d, no digits output */
						len = 0;
						goto prepend_sign;
					}
					/* All formats (including "%#x") just output "0". */
					*out = '0';
					len = 1;
				} else {
					/* Convert the number to ascii in the required base. */
					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
						/* Base 10 */
						len = u64toa_r(v, out);
					} else {
						/* Base 16 */
						if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p', '#' - 1)) {
							/* "%p" and "%#x" need "0x" prepending. */
							sign = 'x' | '0' << 8;
						}
						len = u64toh_r(v, out);
					}
				}

				/* Add zero padding */
				if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '0', '.')) {
					if (!_NOLIBC_PF_FLAGS_CONTAIN(flags, '.')) {
						if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '-'))
							/* Left justify overrides zero pad */
							goto prepend_sign;
						/* eg "%05d", Zero pad to field width less sign */
						precision = width;
						if (sign) {
							precision--;
							if (sign >= 256)
								precision--;
						}
					}
					if (precision > 30)
						/* Don't run off the start of tmpbuf[] */
						precision = 30;
					for (; len < precision; len++) {
						/* Stop gcc generating horrid code and memset().
						 * This is OPTIMIZER_HIDE_VAR() from compiler.h.
						 */
						__asm__ volatile("" : "=r"(len) : "0"(len));
						*--out = '0';
					}
				}

prepend_sign:
				/* Add 0, 1 or 2 ("0x") sign characters left of any zero padding */
				for (; sign; sign >>= 8) {
					len++;
					*--out = sign;
				}
				outstr = out;
				goto do_output;
			}

non_numeric_conversion:
			if (ch == 'm') {
#ifdef NOLIBC_IGNORE_ERRNO
				outstr = "unknown error";
				len = __builtin_strlen(outstr);
#else
				outstr = strerror(errno);
				goto do_strnlen_output;
#endif /* NOLIBC_IGNORE_ERRNO */
			} else {
				if (ch != '%') {
					/* Invalid format: back up to output the format characters */
					fmt = outstr + 1;
					/* and output a '%' now. */
				}
				/* %% is documented as a 'conversion specifier'.
				 * Any flags, precision or length modifier are ignored.
				 */
				len = 1;
				width = 0;
				outstr = fmt - 1;
			}
		}

do_output:
		written += len;

		/* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
		 * code into one of the conditionals above.
		 */
		__asm__ volatile("" : "=r"(len) : "0"(len));

		/* Output 'left pad', 'value' then 'right pad'. */
		width -= len;
		flags = _NOLIBC_PF_FLAGS_CONTAIN(flags, '-');
		if (flags && cb(state, outstr, len) != 0)
			return -1;
		while (width > 0) {
			int pad_len = ((width - 1) & 15) + 1;
			width -= pad_len;
			written += pad_len;
			if (cb(state, "                ", pad_len) != 0)
				return -1;
		}
		if (!flags && cb(state, outstr, len) != 0)
			return -1;
	}

	/* Flush/terminate any buffer. */
	if (cb(state, NULL, 0) != 0)
		return -1;

	return written;
}

struct __nolibc_fprintf_cb_state {
	FILE *stream;
	unsigned int buf_offset;
	char buf[128];
};

static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
{
	struct __nolibc_fprintf_cb_state *state = v_state;
	unsigned int off = state->buf_offset;

	if (off + size > sizeof(state->buf) || buf == NULL) {
		state->buf_offset = 0;
		if (off && _fwrite(state->buf, off, state->stream))
			return -1;
		if (size > sizeof(state->buf))
			return _fwrite(buf, size, state->stream);
		off = 0;
	}

	if (size) {
		state->buf_offset = off + size;
		memcpy(state->buf + off, buf, size);
	}
	return 0;
}

...

struct __nolibc_sprintf_cb_state {
	char *buf;
	size_t size;
};

static int __nolibc_sprintf_cb(void *v_state, const char *buf, size_t size)
{
	struct __nolibc_sprintf_cb_state *state = v_state;
	char *tgt;

	if (size >= state->size) {
		if (state->size <= 1)
			return 0;
		size = state->size - 1;
	}
	tgt = state->buf;
	if (size) {
		state->size -= size;
		state->buf = tgt + size;
		memcpy(tgt, buf, size);
	} else {
		/* In particular from cb(NULL, 0) at the end of __nolibc_printf(). */
		*tgt = '\0';
	}
	return 0;
}

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch'
  2026-02-06 19:11 ` [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch' david.laight.linux
@ 2026-02-07 18:51   ` Willy Tarreau
  2026-02-16 18:52   ` Thomas Weißschuh
  1 sibling, 0 replies; 60+ messages in thread
From: Willy Tarreau @ 2026-02-07 18:51 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

Hi David,

On Fri, Feb 06, 2026 at 07:11:11PM +0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> This makes the code slightly easier to read because the variable
> stands out from the single character literals (especially 'c').

While I thought it was purely cosmetic and useless, I have to
admit that it eases reading of certain large "if" conditions.

> The following patches pretty much rewrite the function so the
> churn is limited.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>

Acked-by: Willy Tarreau <w@1wt.eu>

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback
  2026-02-06 19:11 ` [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback david.laight.linux
@ 2026-02-07 19:12   ` Willy Tarreau
  2026-02-07 23:28     ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Willy Tarreau @ 2026-02-07 19:12 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Fri, Feb 06, 2026 at 07:11:12PM +0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Move output truncation to the snprintf() callback.
> This simplifies the main code and ensures the truncation will be
> correct when left-alignment is added.
> 
> Add a zero length callback to 'finalise' the buffer rather than
> doing it in snprintf() itself.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> 
> Changes for v2:
> - Formally patch 1
> - Add comments about the final callback.
> 
>  tools/include/nolibc/stdio.h | 68 ++++++++++++++++++++++--------------
>  1 file changed, 41 insertions(+), 27 deletions(-)
> 
> diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> index f162cc697a73..36733ecd4261 100644
> --- a/tools/include/nolibc/stdio.h
> +++ b/tools/include/nolibc/stdio.h
> @@ -245,15 +245,15 @@ char *fgets(char *s, int size, FILE *stream)
>   *  - %s
>   *  - unknown modifiers are ignored.
>   */
> -typedef int (*__nolibc_printf_cb)(intptr_t state, const char *buf, size_t size);
> +typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
>  
> -static __attribute__((unused, format(printf, 4, 0)))
> -int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char *fmt, va_list args)
> +static __attribute__((unused, format(printf, 3, 0)))
> +int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
>  {
>  	char escape, lpref, ch;
>  	unsigned long long v;
>  	unsigned int written, width;
> -	size_t len, ofs, w;
> +	size_t len, ofs;
>  	char tmpbuf[21];
>  	const char *outstr;
>  
> @@ -355,17 +355,13 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
>  			outstr = fmt;
>  			len = ofs - 1;
>  		flush_str:
> -			if (n) {
> -				w = len < n ? len : n;
> -				n -= w;
> -				while (width-- > w) {
> -					if (cb(state, " ", 1) != 0)
> -						return -1;
> -					written += 1;
> -				}
> -				if (cb(state, outstr, w) != 0)
> +			while (width-- > len) {
> +				if (cb(state, " ", 1) != 0)
>  					return -1;
> +				written += 1;
>  			}
> +			if (cb(state, outstr, len) != 0)
> +				return -1;
>  
>  			written += len;
>  		do_escape:
> @@ -378,18 +374,23 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
>  
>  		/* literal char, just queue it */
>  	}
> +
> +	/* Flush/terminate any buffer. */
> +	if (cb(state, NULL, 0) != 0)
> +		return -1;
> +

I suspect this hunk is in fact part of the next patch which adds
buffering, because I don't see what there is to flush here without
any buffer. If really needed, then I think that it's about time to
start adding a comment about the __nolibc_printf() function to explain
how it's supposed to work, because callback-driven code is unreadable,
there are hidden expectations everywhere that are super hard to guess
or verify.

>  	return written;
>  }
>  
> -static int __nolibc_fprintf_cb(intptr_t state, const char *buf, size_t size)
> +static int __nolibc_fprintf_cb(void *stream, const char *buf, size_t size)

I must confess I'm not a big fan of the void* here. I've seen that you're
having one state for snprintf() and another one for fprintf(), maybe they
could be efficiently merged into a common printf_state ? Note that I'm not
vetoing this, I just want to be convinced that it's the best choice, and
neither the code, comments nor commit messages for now suggest so.

>  {
> -	return _fwrite(buf, size, (FILE *)state);
> +	return size ? _fwrite(buf, size, stream) : 0;
>  }
>  
>  static __attribute__((unused, format(printf, 2, 0)))
>  int vfprintf(FILE *stream, const char *fmt, va_list args)
>  {
> -	return __nolibc_printf(__nolibc_fprintf_cb, (intptr_t)stream, SIZE_MAX, fmt, args);
> +	return __nolibc_printf(__nolibc_fprintf_cb, stream, fmt, args);
>  }
>  
>  static __attribute__((unused, format(printf, 1, 0)))
> @@ -447,26 +448,39 @@ int dprintf(int fd, const char *fmt, ...)
>  	return ret;
>  }
>  
> -static int __nolibc_sprintf_cb(intptr_t _state, const char *buf, size_t size)
> +struct __nolibc_sprintf_cb_state {
> +	char *buf;
> +	size_t size;
> +};
> +
> +static int __nolibc_sprintf_cb(void *v_state, const char *buf, size_t size)
>  {
> -	char **state = (char **)_state;
> +	struct __nolibc_sprintf_cb_state *state = v_state;
> +	char *tgt;
>  
> -	memcpy(*state, buf, size);
> -	*state += size;
> +	if (size >= state->size) {
> +		if (state->size <= 1)
> +			return 0;

I failed to understand that one. Don't we want to at least write the
trailing zero when there's one byte left ? A short comment explaining
that case would help.

> +		size = state->size - 1;
> +	}
> +	tgt = state->buf;
> +	if (size) {
> +		state->size -= size;
> +		state->buf = tgt + size;
> +		memcpy(tgt, buf, size);
> +	} else {
> +		/* In particular from cb(NULL, 0) at the end of __nolibc_printf(). */
> +		*tgt = '\0';
> +	}

Usually, "if/else" constructs result in larger code due to jumps. Here
we certainly can unconditionally write the trailing zero. Bingo, we're
saving 9 bytes on x86_64 by moving it above. And even 17 bytes by dropping
the test on size and updating the state after the memcpy:

	if (size >= state->size) {
		if (state->size <= 1)
			return 0;
		size = state->size - 1;
	}
	*state->buf = '\0';
	memcpy(state->buf, buf, size);
	state->buf += size;
	state->size -= size;

  snprintf-patch1:000000000000003e t __nolibc_sprintf_cb
  snprintf-patch1-alt1:0000000000000033 t __nolibc_sprintf_cb
  snprintf-patch1-alt2:000000000000002d t __nolibc_sprintf_cb

(not tested but worth a try).

>  	return 0;
>  }
>  
>  static __attribute__((unused, format(printf, 3, 0)))
>  int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
>  {
> -	char *state = buf;
> -	int ret;
> +	struct __nolibc_sprintf_cb_state state = { .buf = buf, .size = size };
>  
> -	ret = __nolibc_printf(__nolibc_sprintf_cb, (intptr_t)&state, size, fmt, args);
> -	if (ret < 0)
> -		return ret;
> -	buf[(size_t)ret < size ? (size_t)ret : size - 1] = '\0';
> -	return ret;
> +	return __nolibc_printf(__nolibc_sprintf_cb, &state, fmt, args);
>  }
>  
>  static __attribute__((unused, format(printf, 3, 4)))

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback.
  2026-02-06 19:11 ` [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback david.laight.linux
@ 2026-02-07 19:29   ` Willy Tarreau
  2026-02-07 23:36     ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Willy Tarreau @ 2026-02-07 19:29 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Fri, Feb 06, 2026 at 07:11:13PM +0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Add per-call buffering to the vprintf() callback.
> While this adds some extra code it will speed things up and
> makes a massive difference to anyone looking at strace output.

This patch alone adds more than 200 extra bytes to the smallest binary
for something that was never expressed as a need by users:

 $ size hello-patch*
   text    data     bss     dec     hex filename
   1859      48      24    1931     78b hello-patch1
   2071      48      24    2143     85f hello-patch2

I doubt it would make sense to have a build option to choose this.
Or alternately one could decide do disable it when __OPTIMIZE_SIZE__
is defined. I just tried quickly and it does the job:

  @@ -390,6 +390,7 @@ struct __nolibc_fprintf_cb_state {
   
   static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
   {
  +#if !defined(__OPTIMIZE_SIZE__)
          struct __nolibc_fprintf_cb_state *state = v_state;
          unsigned int off = state->buf_offset;
   
  @@ -407,16 +408,24 @@ static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
                  memcpy(state->buf + off, buf, size);
          }
          return 0;
  +#else
  +       /* v_state is the stream */
  +       return size ? _fwrite(buf, size, v_state) : 0;
  +#endif
   }
   
   static __attribute__((unused, format(printf, 2, 0)))
   int vfprintf(FILE *stream, const char *fmt, va_list args)
   {
  +#if !defined(__OPTIMIZE_SIZE__)
          struct __nolibc_fprintf_cb_state state;
   
          state.stream = stream;
          state.buf_offset = 0;
          return __nolibc_printf(__nolibc_fprintf_cb, &state, fmt, args);
  +#else
  +       return __nolibc_printf(__nolibc_fprintf_cb, stream, fmt, args);
  +#endif
   }
   
   static __attribute__((unused, format(printf, 1, 0)))

> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> 
> Changes for v2:
> Formally patch 2, unchanged.
> 
>  tools/include/nolibc/stdio.h | 32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> index 36733ecd4261..552f09d51d82 100644
> --- a/tools/include/nolibc/stdio.h
> +++ b/tools/include/nolibc/stdio.h
> @@ -382,15 +382,41 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  	return written;
>  }
>  
> -static int __nolibc_fprintf_cb(void *stream, const char *buf, size_t size)
> +struct __nolibc_fprintf_cb_state {
> +	FILE *stream;
> +	unsigned int buf_offset;
> +	char buf[128];
> +};

So that's the other state I was wondering if we could merge with the first
one.

> +static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
>  {
> -	return size ? _fwrite(buf, size, stream) : 0;
> +	struct __nolibc_fprintf_cb_state *state = v_state;
> +	unsigned int off = state->buf_offset;
> +
> +	if (off + size > sizeof(state->buf) || buf == NULL) {

Please mention that special case of buf==NULL in a comment above the
function. That's an internal API choice.

> +		state->buf_offset = 0;
> +		if (off && _fwrite(state->buf, off, state->stream))
> +			return -1;
> +		if (size > sizeof(state->buf))
> +			return _fwrite(buf, size, state->stream);
> +		off = 0;
> +	}
> +
> +	if (size) {
> +		state->buf_offset = off + size;
> +		memcpy(state->buf + off, buf, size);
> +	}
> +	return 0;
>  }
>  
>  static __attribute__((unused, format(printf, 2, 0)))
>  int vfprintf(FILE *stream, const char *fmt, va_list args)
>  {
> -	return __nolibc_printf(__nolibc_fprintf_cb, stream, fmt, args);
> +	struct __nolibc_fprintf_cb_state state;
> +
> +	state.stream = stream;
> +	state.buf_offset = 0;
> +	return __nolibc_printf(__nolibc_fprintf_cb, &state, fmt, args);
>  }
>  
>  static __attribute__((unused, format(printf, 1, 0)))

Thanks,
Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks
  2026-02-06 19:11 ` [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks david.laight.linux
@ 2026-02-07 19:38   ` Willy Tarreau
  2026-02-07 23:43     ` David Laight
  2026-02-16 19:30   ` Thomas Weißschuh
  1 sibling, 1 reply; 60+ messages in thread
From: Willy Tarreau @ 2026-02-07 19:38 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Fri, Feb 06, 2026 at 07:11:14PM +0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Simple to do and saves calls to the callback function.

+20 bytes here but OK for me.

> Signed-off-by: David Laight <david.laight.linux@gmail.com>

Acked-by: Willy Tarreau <w@1wt.eu>

> ---
> 
> Changes for v2:
> Formally patch 3, unchanged.
> 
>  tools/include/nolibc/stdio.h | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> index 552f09d51d82..c044a6b3babe 100644
> --- a/tools/include/nolibc/stdio.h
> +++ b/tools/include/nolibc/stdio.h
> @@ -355,10 +355,12 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  			outstr = fmt;
>  			len = ofs - 1;
>  		flush_str:
> -			while (width-- > len) {
> -				if (cb(state, " ", 1) != 0)
> +			while (width > len) {
> +				unsigned int pad_len = ((width - len - 1) & 15) + 1;
> +				width -= pad_len;
> +				written += pad_len;
> +				if (cb(state, "                ", pad_len) != 0)
>  					return -1;
> -				written += 1;
>  			}
>  			if (cb(state, outstr, len) != 0)
>  				return -1;

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf()
  2026-02-06 19:11 ` [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf() david.laight.linux
@ 2026-02-07 20:05   ` Willy Tarreau
  2026-02-07 23:50     ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Willy Tarreau @ 2026-02-07 20:05 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Fri, Feb 06, 2026 at 07:11:15PM +0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Move the check for the length modifiers into the format processing
> between the field width and conversion specifier.
> This lets the loop be simplified and a 'fast scan' for a format start
> used.
> 
> If an error is detected (eg an invalid conversion specifier) then
> copy the invalid format to the output buffer.
> 
> Reduces code size by about 10% on x86-64.

I'm surprised, because for me it's the opposite:

  $ size hello-patch*
     text    data     bss     dec     hex filename
     1859      48      24    1931     78b hello-patch1
     2071      48      24    2143     85f hello-patch2
     2091      48      24    2163     873 hello-patch3
     2422      48      24    2494     9be hello-patch4

The whole program grew by almost 16%, and that's a 30% increase since
the first patch. This is with gcc 15 -Oz. aarch64 however decreased by
15 bytes since previous patch.

I have not figured what makes this change yet, I'm still digging.

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback
  2026-02-07 19:12   ` Willy Tarreau
@ 2026-02-07 23:28     ` David Laight
  2026-02-08 15:12       ` Willy Tarreau
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-07 23:28 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sat, 7 Feb 2026 20:12:30 +0100
Willy Tarreau <w@1wt.eu> wrote:

> On Fri, Feb 06, 2026 at 07:11:12PM +0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Move output truncation to the snprintf() callback.
> > This simplifies the main code and ensures the truncation will be
> > correct when left-alignment is added.
> > 
> > Add a zero length callback to 'finalise' the buffer rather than
> > doing it in snprintf() itself.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > 
> > Changes for v2:
> > - Formally patch 1
> > - Add comments about the final callback.
> > 
> >  tools/include/nolibc/stdio.h | 68 ++++++++++++++++++++++--------------
> >  1 file changed, 41 insertions(+), 27 deletions(-)
> > 
> > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > index f162cc697a73..36733ecd4261 100644
> > --- a/tools/include/nolibc/stdio.h
> > +++ b/tools/include/nolibc/stdio.h
> > @@ -245,15 +245,15 @@ char *fgets(char *s, int size, FILE *stream)
> >   *  - %s
> >   *  - unknown modifiers are ignored.
> >   */
> > -typedef int (*__nolibc_printf_cb)(intptr_t state, const char *buf, size_t size);
> > +typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
> >  
> > -static __attribute__((unused, format(printf, 4, 0)))
> > -int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char *fmt, va_list args)
> > +static __attribute__((unused, format(printf, 3, 0)))
> > +int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
> >  {
> >  	char escape, lpref, ch;
> >  	unsigned long long v;
> >  	unsigned int written, width;
> > -	size_t len, ofs, w;
> > +	size_t len, ofs;
> >  	char tmpbuf[21];
> >  	const char *outstr;
> >  
> > @@ -355,17 +355,13 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
> >  			outstr = fmt;
> >  			len = ofs - 1;
> >  		flush_str:
> > -			if (n) {
> > -				w = len < n ? len : n;
> > -				n -= w;
> > -				while (width-- > w) {
> > -					if (cb(state, " ", 1) != 0)
> > -						return -1;
> > -					written += 1;
> > -				}
> > -				if (cb(state, outstr, w) != 0)
> > +			while (width-- > len) {
> > +				if (cb(state, " ", 1) != 0)
> >  					return -1;
> > +				written += 1;
> >  			}
> > +			if (cb(state, outstr, len) != 0)
> > +				return -1;
> >  
> >  			written += len;
> >  		do_escape:
> > @@ -378,18 +374,23 @@ int __nolibc_printf(__nolibc_printf_cb cb, intptr_t state, size_t n, const char
> >  
> >  		/* literal char, just queue it */
> >  	}
> > +
> > +	/* Flush/terminate any buffer. */
> > +	if (cb(state, NULL, 0) != 0)
> > +		return -1;
> > +  
> 
> I suspect this hunk is in fact part of the next patch which adds
> buffering, because I don't see what there is to flush here without
> any buffer.

Ok, it could be 'terminate' here, and I could change the line again
in the next patch.
If the next patch goes in (it can be missed out) then maybe a different
comment would be better.
Or something more 'wooly' line 'finalise the output'.

> If really needed, then I think that it's about time to
> start adding a comment about the __nolibc_printf() function to explain
> how it's supposed to work, because callback-driven code is unreadable,
> there are hidden expectations everywhere that are super hard to guess
> or verify.

Something has to add the terminating '\0'.
The callback function has the rest of the logic for writing to the buffer.
So it saves replicating the length test (as was done before).
It also lets vsnprintf() directly return the value returned from
__nolibc_fprintf_cb().
The '\0' could be added after every memcpy(), but that has hard code paths
when the buffer length is 0 or 1.

It also helps if the compiler decides to inline vsnprintf.

But yes, some of the motivation does come from the following patch.
There you really do want all the write system calls to be in one function.

> 
> >  	return written;
> >  }
> >  
> > -static int __nolibc_fprintf_cb(intptr_t state, const char *buf, size_t size)
> > +static int __nolibc_fprintf_cb(void *stream, const char *buf, size_t size)  
> 
> I must confess I'm not a big fan of the void* here. I've seen that you're
> having one state for snprintf() and another one for fprintf(), maybe they
> could be efficiently merged into a common printf_state ? Note that I'm not
> vetoing this, I just want to be convinced that it's the best choice, and
> neither the code, comments nor commit messages for now suggest so.

'void *' has to be better than intptr_t.
A common state wouldn't work - all the fields are different
(before and after the next patch).
At least here all the functions are together and it is a straight callback.
The usual problem with registering callback for later is the extreme
difficulty in safely de-registering them.

> 
> >  {
> > -	return _fwrite(buf, size, (FILE *)state);
> > +	return size ? _fwrite(buf, size, stream) : 0;
> >  }
> >  
> >  static __attribute__((unused, format(printf, 2, 0)))
> >  int vfprintf(FILE *stream, const char *fmt, va_list args)
> >  {
> > -	return __nolibc_printf(__nolibc_fprintf_cb, (intptr_t)stream, SIZE_MAX, fmt, args);
> > +	return __nolibc_printf(__nolibc_fprintf_cb, stream, fmt, args);
> >  }
> >  
> >  static __attribute__((unused, format(printf, 1, 0)))
> > @@ -447,26 +448,39 @@ int dprintf(int fd, const char *fmt, ...)
> >  	return ret;
> >  }
> >  
> > -static int __nolibc_sprintf_cb(intptr_t _state, const char *buf, size_t size)
> > +struct __nolibc_sprintf_cb_state {
> > +	char *buf;
> > +	size_t size;
> > +};
> > +
> > +static int __nolibc_sprintf_cb(void *v_state, const char *buf, size_t size)
> >  {
> > -	char **state = (char **)_state;
> > +	struct __nolibc_sprintf_cb_state *state = v_state;
> > +	char *tgt;
> >  
> > -	memcpy(*state, buf, size);
> > -	*state += size;
> > +	if (size >= state->size) {
> > +		if (state->size <= 1)
> > +			return 0;  
> 
> I failed to understand that one. Don't we want to at least write the
> trailing zero when there's one byte left ? A short comment explaining
> that case would help.

state->size is normally greater than zero.
So when size if zero the first condition is false, and the *tgt = 0
line is executed later.
However it is valid to call vsnprintf(NULL, 0, fmt, args).
That is the only way state->size can be zero - in that case the code
needs to return without writing to the buffer.

> 
> > +		size = state->size - 1;
> > +	}
> > +	tgt = state->buf;
> > +	if (size) {
> > +		state->size -= size;
> > +		state->buf = tgt + size;
> > +		memcpy(tgt, buf, size);
> > +	} else {
> > +		/* In particular from cb(NULL, 0) at the end of __nolibc_printf(). */
> > +		*tgt = '\0';
> > +	}  
> 
> Usually, "if/else" constructs result in larger code due to jumps.

Size doesn't usually matter, what makes a big difference is the branches
being mispredicted - I think that is ~20 clocks on my zen-5 (getting close
the the P4-netburst values).

I think the nolibc is optimised for size, but only within reason.
That if/else does help document what is going on.

> Here we certainly can unconditionally write the trailing zero.

Writing the zero then overwriting it with 'no bytes' is too subtle (even for me).
I'm sure you'd have wanted a big comment :-)

> Bingo, we're
> saving 9 bytes on x86_64 by moving it above. And even 17 bytes by dropping
> the test on size and updating the state after the memcpy:
> 
> 	if (size >= state->size) {
> 		if (state->size <= 1)
> 			return 0;
> 		size = state->size - 1;
> 	}
> 	*state->buf = '\0';
> 	memcpy(state->buf, buf, size);
> 	state->buf += size;
> 	state->size -= size;

That starts relying on the compiler assuming that the memcpy() can't
be writing to state->buf and state->size.
I'm not certain it can assume that in the callback function
(With or without 'strict aliasing').
So the explicit local for tgt helps a lot - and does no harm.
In the past I've done:
	x->buf += size;
	memcpy(x->buf - size, src, size)
knowing that the compiler will just use the old value.


Note that something has to write the '\0' for:	
	snprintf(buf, 1, fmt, args);
and
	snprintf(buf, len, "");
but not for
	snprintf(buf, 0, fmt, args);
so just writing the the byte following the memcpy() isn't enough.

	David


> 
>   snprintf-patch1:000000000000003e t __nolibc_sprintf_cb
>   snprintf-patch1-alt1:0000000000000033 t __nolibc_sprintf_cb
>   snprintf-patch1-alt2:000000000000002d t __nolibc_sprintf_cb
> 
> (not tested but worth a try).
> 
> >  	return 0;
> >  }
> >  
> >  static __attribute__((unused, format(printf, 3, 0)))
> >  int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
> >  {
> > -	char *state = buf;
> > -	int ret;
> > +	struct __nolibc_sprintf_cb_state state = { .buf = buf, .size = size };
> >  
> > -	ret = __nolibc_printf(__nolibc_sprintf_cb, (intptr_t)&state, size, fmt, args);
> > -	if (ret < 0)
> > -		return ret;
> > -	buf[(size_t)ret < size ? (size_t)ret : size - 1] = '\0';
> > -	return ret;
> > +	return __nolibc_printf(__nolibc_sprintf_cb, &state, fmt, args);
> >  }
> >  
> >  static __attribute__((unused, format(printf, 3, 4)))  
> 
> Willy


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback.
  2026-02-07 19:29   ` Willy Tarreau
@ 2026-02-07 23:36     ` David Laight
  2026-02-16 19:07       ` Thomas Weißschuh
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-07 23:36 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sat, 7 Feb 2026 20:29:34 +0100
Willy Tarreau <w@1wt.eu> wrote:

> On Fri, Feb 06, 2026 at 07:11:13PM +0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Add per-call buffering to the vprintf() callback.
> > While this adds some extra code it will speed things up and
> > makes a massive difference to anyone looking at strace output.  
> 
> This patch alone adds more than 200 extra bytes to the smallest binary
> for something that was never expressed as a need by users:
> 
>  $ size hello-patch*
>    text    data     bss     dec     hex filename
>    1859      48      24    1931     78b hello-patch1
>    2071      48      24    2143     85f hello-patch2
> 
> I doubt it would make sense to have a build option to choose this.
> Or alternately one could decide do disable it when __OPTIMIZE_SIZE__
> is defined. I just tried quickly and it does the job:

That probably makes sense.
For anything non-trivial the extra size is noise.
Actually is would be easier to read with a single #if covering the
whole lot.

> 
>   @@ -390,6 +390,7 @@ struct __nolibc_fprintf_cb_state {
>    
>    static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
>    {
>   +#if !defined(__OPTIMIZE_SIZE__)
>           struct __nolibc_fprintf_cb_state *state = v_state;
>           unsigned int off = state->buf_offset;
>    
>   @@ -407,16 +408,24 @@ static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
>                   memcpy(state->buf + off, buf, size);
>           }
>           return 0;
>   +#else
>   +       /* v_state is the stream */
>   +       return size ? _fwrite(buf, size, v_state) : 0;
>   +#endif
>    }
>    
>    static __attribute__((unused, format(printf, 2, 0)))
>    int vfprintf(FILE *stream, const char *fmt, va_list args)
>    {
>   +#if !defined(__OPTIMIZE_SIZE__)
>           struct __nolibc_fprintf_cb_state state;
>    
>           state.stream = stream;
>           state.buf_offset = 0;
>           return __nolibc_printf(__nolibc_fprintf_cb, &state, fmt, args);
>   +#else
>   +       return __nolibc_printf(__nolibc_fprintf_cb, stream, fmt, args);
>   +#endif
>    }
>    
>    static __attribute__((unused, format(printf, 1, 0)))
> 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > 
> > Changes for v2:
> > Formally patch 2, unchanged.
> > 
> >  tools/include/nolibc/stdio.h | 32 +++++++++++++++++++++++++++++---
> >  1 file changed, 29 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > index 36733ecd4261..552f09d51d82 100644
> > --- a/tools/include/nolibc/stdio.h
> > +++ b/tools/include/nolibc/stdio.h
> > @@ -382,15 +382,41 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  	return written;
> >  }
> >  
> > -static int __nolibc_fprintf_cb(void *stream, const char *buf, size_t size)
> > +struct __nolibc_fprintf_cb_state {
> > +	FILE *stream;
> > +	unsigned int buf_offset;
> > +	char buf[128];
> > +};  
> 
> So that's the other state I was wondering if we could merge with the first
> one.

The snprintf state is the 'address to write the next character to'
and 'the amount of space remaining'.
No real overlap at all.

> 
> > +static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
> >  {
> > -	return size ? _fwrite(buf, size, stream) : 0;
> > +	struct __nolibc_fprintf_cb_state *state = v_state;
> > +	unsigned int off = state->buf_offset;
> > +
> > +	if (off + size > sizeof(state->buf) || buf == NULL) {  
> 
> Please mention that special case of buf==NULL in a comment above the
> function. That's an internal API choice.

ok - it might just go above this line though.

	David

> 
> > +		state->buf_offset = 0;
> > +		if (off && _fwrite(state->buf, off, state->stream))
> > +			return -1;
> > +		if (size > sizeof(state->buf))
> > +			return _fwrite(buf, size, state->stream);
> > +		off = 0;
> > +	}
> > +
> > +	if (size) {
> > +		state->buf_offset = off + size;
> > +		memcpy(state->buf + off, buf, size);
> > +	}
> > +	return 0;
> >  }
> >  
> >  static __attribute__((unused, format(printf, 2, 0)))
> >  int vfprintf(FILE *stream, const char *fmt, va_list args)
> >  {
> > -	return __nolibc_printf(__nolibc_fprintf_cb, stream, fmt, args);
> > +	struct __nolibc_fprintf_cb_state state;
> > +
> > +	state.stream = stream;
> > +	state.buf_offset = 0;
> > +	return __nolibc_printf(__nolibc_fprintf_cb, &state, fmt, args);
> >  }
> >  
> >  static __attribute__((unused, format(printf, 1, 0)))  
> 
> Thanks,
> Willy


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks
  2026-02-07 19:38   ` Willy Tarreau
@ 2026-02-07 23:43     ` David Laight
  2026-02-08 15:14       ` Willy Tarreau
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-07 23:43 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sat, 7 Feb 2026 20:38:36 +0100
Willy Tarreau <w@1wt.eu> wrote:

> On Fri, Feb 06, 2026 at 07:11:14PM +0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Simple to do and saves calls to the callback function.  
> 
> +20 bytes here but OK for me.

I think some of those come back when I change all the variables to 'int'.
Some, but not all, is because width is 32bit but len is 64bit.
The final change that did:
	width -= len;
	...
	while (width > 0)
saved a surprising amount provided the ... contained some code.
Breath on the code (or compiler version) and you easily get +/-60 bytes.

	David

> 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>  
> 
> Acked-by: Willy Tarreau <w@1wt.eu>
> 
> > ---
> > 
> > Changes for v2:
> > Formally patch 3, unchanged.
> > 
> >  tools/include/nolibc/stdio.h | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > index 552f09d51d82..c044a6b3babe 100644
> > --- a/tools/include/nolibc/stdio.h
> > +++ b/tools/include/nolibc/stdio.h
> > @@ -355,10 +355,12 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  			outstr = fmt;
> >  			len = ofs - 1;
> >  		flush_str:
> > -			while (width-- > len) {
> > -				if (cb(state, " ", 1) != 0)
> > +			while (width > len) {
> > +				unsigned int pad_len = ((width - len - 1) & 15) + 1;
> > +				width -= pad_len;
> > +				written += pad_len;
> > +				if (cb(state, "                ", pad_len) != 0)
> >  					return -1;
> > -				written += 1;
> >  			}
> >  			if (cb(state, outstr, len) != 0)
> >  				return -1;  
> 
> Willy


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf()
  2026-02-07 20:05   ` Willy Tarreau
@ 2026-02-07 23:50     ` David Laight
  2026-02-08 12:20       ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-07 23:50 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sat, 7 Feb 2026 21:05:42 +0100
Willy Tarreau <w@1wt.eu> wrote:

> On Fri, Feb 06, 2026 at 07:11:15PM +0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Move the check for the length modifiers into the format processing
> > between the field width and conversion specifier.
> > This lets the loop be simplified and a 'fast scan' for a format start
> > used.
> > 
> > If an error is detected (eg an invalid conversion specifier) then
> > copy the invalid format to the output buffer.
> > 
> > Reduces code size by about 10% on x86-64.  
> 
> I'm surprised, because for me it's the opposite:
> 
>   $ size hello-patch*
>      text    data     bss     dec     hex filename
>      1859      48      24    1931     78b hello-patch1
>      2071      48      24    2143     85f hello-patch2
>      2091      48      24    2163     873 hello-patch3
>      2422      48      24    2494     9be hello-patch4
> 
> The whole program grew by almost 16%, and that's a 30% increase since
> the first patch. This is with gcc 15 -Oz. aarch64 however decreased by
> 15 bytes since previous patch.
> 
> I have not figured what makes this change yet, I'm still digging.

Running scripts/bloat-o-meter will give more detail.

> Willy

I'm using gcc 12.2 and just running 'make O=xxx' for the test program.
The object looks like what I'd expect, so might be -O2.

Is it constant folding the #defines.
For me it generating the (1 << (c & 31)) & 0xxxxx as you might hope.

	David


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf()
  2026-02-07 23:50     ` David Laight
@ 2026-02-08 12:20       ` David Laight
  2026-02-08 14:44         ` Willy Tarreau
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-08 12:20 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sat, 7 Feb 2026 23:50:19 +0000
David Laight <david.laight.linux@gmail.com> wrote:

> On Sat, 7 Feb 2026 21:05:42 +0100
> Willy Tarreau <w@1wt.eu> wrote:
> 
> > On Fri, Feb 06, 2026 at 07:11:15PM +0000, david.laight.linux@gmail.com wrote:  
> > > From: David Laight <david.laight.linux@gmail.com>
> > > 
> > > Move the check for the length modifiers into the format processing
> > > between the field width and conversion specifier.
> > > This lets the loop be simplified and a 'fast scan' for a format start
> > > used.
> > > 
> > > If an error is detected (eg an invalid conversion specifier) then
> > > copy the invalid format to the output buffer.
> > > 
> > > Reduces code size by about 10% on x86-64.    
> > 
> > I'm surprised, because for me it's the opposite:
> > 
> >   $ size hello-patch*
> >      text    data     bss     dec     hex filename
> >      1859      48      24    1931     78b hello-patch1
> >      2071      48      24    2143     85f hello-patch2
> >      2091      48      24    2163     873 hello-patch3
> >      2422      48      24    2494     9be hello-patch4
> > 
> > The whole program grew by almost 16%, and that's a 30% increase since
> > the first patch. This is with gcc 15 -Oz. aarch64 however decreased by
> > 15 bytes since previous patch.
> > 
> > I have not figured what makes this change yet, I'm still digging.  
> 
> Running scripts/bloat-o-meter will give more detail.
> 
> > Willy  
> 
> I'm using gcc 12.2 and just running 'make O=xxx' for the test program.
> The object looks like what I'd expect, so might be -O2.
> 
> Is it constant folding the #defines.
> For me it generating the (1 << (c & 31)) & 0xxxxx as you might hope.

Further thoughts:

On some of the builds I've done gcc duplicated the code following an 'if'
into both the 'then' and 'else' clauses.
This isn't good for code size.
At one point I had an OPTIMIZER_HIDE_VAR(sign) before the u64to...()
block which did help, but that wasn't needed after the last patch or in
the patch sequence I posted.
(Maybe the 'if (v == 0) ...' block makes a difference.)

It might also be worth including the patch that changes u64to...()
before doing the size checks.
gcc should inline the wrappers - so it definitely changes the way the
code is generated.
To add octal support (for completeness) I'd explicitly generate the
three sets of constants in the printf() code and then call 
_nolibc_utoa_base().
The octal support is (approx):
	else if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'o') {
		base = 8;
		recip = _NOLIBC_U64TOA_RECIP(8);
		if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, '#' - 1)
			sign = '0';
	}
The last bit could be:
	sign = ((ch_flags >> n) & 1) * '0';
gcc might be persuaded to do that, but probably needs help.

	David

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf()
  2026-02-08 12:20       ` David Laight
@ 2026-02-08 14:44         ` Willy Tarreau
  2026-02-08 16:54           ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 14:44 UTC (permalink / raw)
  To: David Laight; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

Hi David,

On Sun, Feb 08, 2026 at 12:20:31PM +0000, David Laight wrote:
> On Sat, 7 Feb 2026 23:50:19 +0000
> David Laight <david.laight.linux@gmail.com> wrote:
> 
> > On Sat, 7 Feb 2026 21:05:42 +0100
> > Willy Tarreau <w@1wt.eu> wrote:
> > 
> > > On Fri, Feb 06, 2026 at 07:11:15PM +0000, david.laight.linux@gmail.com wrote:  
> > > > From: David Laight <david.laight.linux@gmail.com>
> > > > 
> > > > Move the check for the length modifiers into the format processing
> > > > between the field width and conversion specifier.
> > > > This lets the loop be simplified and a 'fast scan' for a format start
> > > > used.
> > > > 
> > > > If an error is detected (eg an invalid conversion specifier) then
> > > > copy the invalid format to the output buffer.
> > > > 
> > > > Reduces code size by about 10% on x86-64.    
> > > 
> > > I'm surprised, because for me it's the opposite:
> > > 
> > >   $ size hello-patch*
> > >      text    data     bss     dec     hex filename
> > >      1859      48      24    1931     78b hello-patch1
> > >      2071      48      24    2143     85f hello-patch2
> > >      2091      48      24    2163     873 hello-patch3
> > >      2422      48      24    2494     9be hello-patch4
> > > 
> > > The whole program grew by almost 16%, and that's a 30% increase since
> > > the first patch. This is with gcc 15 -Oz. aarch64 however decreased by
> > > 15 bytes since previous patch.
> > > 
> > > I have not figured what makes this change yet, I'm still digging.  
> > 
> > Running scripts/bloat-o-meter will give more detail.
> > 
> > > Willy  
> > 
> > I'm using gcc 12.2 and just running 'make O=xxx' for the test program.
> > The object looks like what I'd expect, so might be -O2.
> > 
> > Is it constant folding the #defines.
> > For me it generating the (1 << (c & 31)) & 0xxxxx as you might hope.
> 
> Further thoughts:
> 
> On some of the builds I've done gcc duplicated the code following an 'if'
> into both the 'then' and 'else' clauses.
> This isn't good for code size.

That's common in loops for example. That's also one reason for avoiding
"else" statements in compact code.

However here I finally found what inflates the code, when disassembling
the whole function: with the move of the multiple "if" statements,
recent compilers managed to turn it into a jump table, that considerably
inflates .rodata and the function as well. By passing -fno-jump-tables,
the size drops by ~500 bytes:

   text    data     bss     dec     hex filename
   2422      48      24    2494     9be hello-patch4
   1917      48      24    1989     7c5 hello-patch4-alt   <---

Building with gcc before 13 also avoids this table and explains why
you had better code with gcc-12.

I also noticed that we can reduce the loop by ~40 bytes by moving the
literal copy after after the block that deals with format sequences,
because it eases comparisons, but that's no big deal for now since your
subsequent patches are going to change all that.

At least I wanted to understand what was causing this difference for
us both, and whether it risked remaining definitive or not, so now
this patch is OK to me.

Acked-by: Willy Tarreau <w@1wt.eu>

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback
  2026-02-07 23:28     ` David Laight
@ 2026-02-08 15:12       ` Willy Tarreau
  2026-02-08 22:49         ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 15:12 UTC (permalink / raw)
  To: David Laight; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sat, Feb 07, 2026 at 11:28:25PM +0000, David Laight wrote:
> > > +
> > > +	/* Flush/terminate any buffer. */
> > > +	if (cb(state, NULL, 0) != 0)
> > > +		return -1;
> > > +  
> > 
> > I suspect this hunk is in fact part of the next patch which adds
> > buffering, because I don't see what there is to flush here without
> > any buffer.
> 
> Ok, it could be 'terminate' here, and I could change the line again
> in the next patch.
> If the next patch goes in (it can be missed out) then maybe a different
> comment would be better.
> Or something more 'wooly' line 'finalise the output'.
> 
> > If really needed, then I think that it's about time to
> > start adding a comment about the __nolibc_printf() function to explain
> > how it's supposed to work, because callback-driven code is unreadable,
> > there are hidden expectations everywhere that are super hard to guess
> > or verify.
> 
> Something has to add the terminating '\0'.

OK that's the point that wasn't obvious when reading the code.
Maybe just mention "finish and append \0" or something like this
so that it's clear it wasn't yet done.

> The callback function has the rest of the logic for writing to the buffer.
> So it saves replicating the length test (as was done before).
> It also lets vsnprintf() directly return the value returned from
> __nolibc_fprintf_cb().
> The '\0' could be added after every memcpy(), but that has hard code paths
> when the buffer length is 0 or 1.

Yeah that's what I assumed (especially since it's often more convenient
when trying to print possibly truncated strings), but I agree that it
would add complicated logic everywhere.

> It also helps if the compiler decides to inline vsnprintf.
> 
> But yes, some of the motivation does come from the following patch.
> There you really do want all the write system calls to be in one function.

It's another good point that when writing to a file you don't want to
emit the zero so that can simplify the processing.

> > > -static int __nolibc_fprintf_cb(intptr_t state, const char *buf, size_t size)
> > > +static int __nolibc_fprintf_cb(void *stream, const char *buf, size_t size)  
> > 
> > I must confess I'm not a big fan of the void* here. I've seen that you're
> > having one state for snprintf() and another one for fprintf(), maybe they
> > could be efficiently merged into a common printf_state ? Note that I'm not
> > vetoing this, I just want to be convinced that it's the best choice, and
> > neither the code, comments nor commit messages for now suggest so.
> 
> 'void *' has to be better than intptr_t.
> A common state wouldn't work - all the fields are different
> (before and after the next patch).

That's what I wanted to double-check with you.

> At least here all the functions are together and it is a straight callback.
> The usual problem with registering callback for later is the extreme
> difficulty in safely de-registering them.

Yep, and also following the code :-)

I really think that a longer comment before the main function explaining
the call sequences with states will significantly help the occasional
reader (i.e.  vsnprintf() -> __nolibc_printf() -> sprintf_cb()).

> > > +static int __nolibc_sprintf_cb(void *v_state, const char *buf, size_t size)
> > >  {
> > > -	char **state = (char **)_state;
> > > +	struct __nolibc_sprintf_cb_state *state = v_state;
> > > +	char *tgt;
> > >  
> > > -	memcpy(*state, buf, size);
> > > -	*state += size;
> > > +	if (size >= state->size) {
> > > +		if (state->size <= 1)
> > > +			return 0;  
> > 
> > I failed to understand that one. Don't we want to at least write the
> > trailing zero when there's one byte left ? A short comment explaining
> > that case would help.
> 
> state->size is normally greater than zero.
> So when size if zero the first condition is false, and the *tgt = 0
> line is executed later.

I meant, not what the code does but what situation this corresponds to :-)
However, I think I get it now, I think you just want to reserve the last
char for the final call. Mentioning this would help. Maybe just:

  /* defer trailing \0 to final call (with size=0). Final call with
   * zero-size output also gets here.
   */

> However it is valid to call vsnprintf(NULL, 0, fmt, args).
> That is the only way state->size can be zero - in that case the code
> needs to return without writing to the buffer.

Yep.

> > > +	tgt = state->buf;
> > > +	if (size) {
> > > +		state->size -= size;
> > > +		state->buf = tgt + size;
> > > +		memcpy(tgt, buf, size);
> > > +	} else {
> > > +		/* In particular from cb(NULL, 0) at the end of __nolibc_printf(). */
> > > +		*tgt = '\0';
> > > +	}  
> > 
> > Usually, "if/else" constructs result in larger code due to jumps.
> 
> Size doesn't usually matter, what makes a big difference is the branches
> being mispredicted - I think that is ~20 clocks on my zen-5 (getting close
> the the P4-netburst values).

Let me please insist a lot here: we DO NOT CARE about performance in
this lib. We do care about size and maintainability. Branch prediction
has no place at all in that code, and the entirety of the lib would
be written differently if it targetted performance. For example please
have a look at memcpy().

> I think the nolibc is optimised for size, but only within reason.
> That if/else does help document what is going on.

But I am concerned about not needlessly inflating its size. In just
a few patches the simplest hello world grew by 500 bytes. That's almost
the size of my entire very first init program where this project
started. It does matter when you start to place lots of small programs
in an initrd, as all of them are static and the size is multiplied by
the number of programs. And optimizing for branch prediction is the
last valid reason for increasing the code size here.

> > Here we certainly can unconditionally write the trailing zero.
> 
> Writing the zero then overwriting it with 'no bytes' is too subtle (even for me).
> I'm sure you'd have wanted a big comment :-)

Yes definitely!

> > Bingo, we're
> > saving 9 bytes on x86_64 by moving it above. And even 17 bytes by dropping
> > the test on size and updating the state after the memcpy:
> > 
> > 	if (size >= state->size) {
> > 		if (state->size <= 1)
> > 			return 0;
> > 		size = state->size - 1;
> > 	}
> > 	*state->buf = '\0';
> > 	memcpy(state->buf, buf, size);
> > 	state->buf += size;
> > 	state->size -= size;
> 
> That starts relying on the compiler assuming that the memcpy() can't
> be writing to state->buf and state->size.
> I'm not certain it can assume that in the callback function
> (With or without 'strict aliasing').

There's nothing to assume, the code is absolutely valid. We developers
have to do the correct thing and not suspect the compiler could do bad
things in our back, but as long as we're not passing state->buf pointing
to state, there's no such risk of aliasing.

> So the explicit local for tgt helps a lot - and does no harm.
> In the past I've done:
> 	x->buf += size;
> 	memcpy(x->buf - size, src, size)
> knowing that the compiler will just use the old value.

Yeah tried it as well with no success.

> Note that something has to write the '\0' for:	
> 	snprintf(buf, 1, fmt, args);
> and
> 	snprintf(buf, len, "");
> but not for
> 	snprintf(buf, 0, fmt, args);
> so just writing the the byte following the memcpy() isn't enough.

But I didn't change anything to that. I'm not contesting the usefulness
of the memcpy(), just asking to avoid adding "else" blocks everywhere
that add tens of bytes for each patch when we can simply perform the
else branch unconditionally before the "if".

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks
  2026-02-07 23:43     ` David Laight
@ 2026-02-08 15:14       ` Willy Tarreau
  0 siblings, 0 replies; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 15:14 UTC (permalink / raw)
  To: David Laight; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sat, Feb 07, 2026 at 11:43:05PM +0000, David Laight wrote:
> On Sat, 7 Feb 2026 20:38:36 +0100
> Willy Tarreau <w@1wt.eu> wrote:
> 
> > On Fri, Feb 06, 2026 at 07:11:14PM +0000, david.laight.linux@gmail.com wrote:
> > > From: David Laight <david.laight.linux@gmail.com>
> > > 
> > > Simple to do and saves calls to the callback function.  
> > 
> > +20 bytes here but OK for me.
> 
> I think some of those come back when I change all the variables to 'int'.
> Some, but not all, is because width is 32bit but len is 64bit.
> The final change that did:
> 	width -= len;
> 	...
> 	while (width > 0)
> saved a surprising amount provided the ... contained some code.
> Breath on the code (or compiler version) and you easily get +/-60 bytes.

Yeah I tried as well with size_t (since compilers generally do not like
mixing data types and tend to place conversions every few instructions),
but some constants become larger and the code inflates as well.

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
  2026-02-06 19:11 ` [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars david.laight.linux
@ 2026-02-08 15:22   ` Willy Tarreau
  2026-02-16 19:52   ` Thomas Weißschuh
  1 sibling, 0 replies; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 15:22 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Fri, Feb 06, 2026 at 07:11:16PM +0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Use flags bits (1u << (ch & 31)) for the flags, length modifiers, and
> conversion specifiers.
> This makes it easy to test for multiple values at once.
> 
> Detect the conversion flags " #+-0" although they are currently all ignored.
> 
> Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
> (both long long).
> 
> Add support for "%i" (the same as %d").
> 
> Unconditionally generate the signed values (for %d) to remove a second
> set of checks for the size.

OK, looks good to me, and I confirm that the inflation caused by the
jump table at patch 4 is now gone.

Acked-by: Willy Tarreau <w@1wt.eu>

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-06 19:11 ` [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X" david.laight.linux
@ 2026-02-08 15:47   ` Willy Tarreau
  2026-02-08 17:14     ` David Laight
  2026-02-08 16:06   ` Willy Tarreau
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 15:47 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Fri, Feb 06, 2026 at 07:11:17PM +0000, david.laight.linux@gmail.com wrote:
> -/* simple printf(). It supports the following formats:
> - *  - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
> - *  - %%
> - *  - invalid formats are copied to the output buffer
> +/* printf(). Supports most of the normal integer and string formats.
> + *  - %[#-+ ][width][{l,t,z,ll,L,j,q}]{d,i,u,c,x,X,p,s,m,%}
> + *  - %% generates a single %
> + *  - %m outputs strerror(errno).
> + *  - # only affects %x and prepends 0x to non-zero values.
> + *  - %o (octal) isn't supported.
> + *  - %X outputs a..f the same as %x.
> + *  - No support for zero padding, precision or variable widths.
> + *  - No support for wide characters.
> + *  - invalid formats are copied to the output buffer.
>   */

Thanks for updating this one, it does help quite a bit.

>  /* This code uses 'flag' variables that are indexed by the low 6 bits
> @@ -279,7 +285,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  	unsigned int written, width;
>  	unsigned int flags, ch_flag;
>  	size_t len;
> -	char tmpbuf[21];
> +	char tmpbuf[32 + 24];

The previous buffer was sized to store a 64-bit int. I couldn't figure
what these 32 and 24 correspond to with the new supported specifiers.
Maybe please add a short comment on the line to hint about what they
correspond to ?

>  	const char *outstr;
>  
>  	written = 0;
> @@ -334,19 +340,32 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  
>  			/* Conversion specifiers. */
>  
> -			/* Numeric conversion specifiers. */
> -			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
> -			if (ch_flag != 0) {
> +			/* Numeric and pointer conversion specifiers.
> +			 *
> +			 * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
> +			 * so that 'X' can be allowed through.
> +			 * 'X' gets treated and 'x' because _NOLIBC_PF_FLAG() returns the same
> +			 * value for both.
> +			 */
> +			if ((ch < 'a' || ch > 'z') && ch != 'X')
> +				goto non_numeric_conversion;
> +
> +			/* We need to check for "%p" or "%#x" later, merging here gives better code.
> +			 * But '#' collides with 'c' so shift right.
> +			 */
> +			ch_flag = _NOLIBC_PF_FLAG(ch) | (flags & _NOLIBC_PF_FLAG('#')) >> 1;
> +			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p', 's')) {
>  				unsigned long long v;
>  				long long signed_v;
> -				char *out = tmpbuf;
> +				char *out = tmpbuf + 32;

OK so you seem to be reserving a part of the buffer for certain uses ?

> +				int sign = 0;
>  
>  				/* 'long' is needed for pointer/string conversions and ltz lengths.
>  				 * A single test can be used provided 'p' (the same bit as '0')
>  				 * is masked from flags.
>  				 */
>  				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
> -							     'p', 'l', 't', 'z')) {
> +							     'p', 's', 'l', 't', 'z')) {
>  					v = va_arg(args, unsigned long);
>  					signed_v = (long)v;
>  				} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
> @@ -365,40 +384,62 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  					goto do_output;
>  				}
>  
> +				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 's')) {
> +					/* "%s" - character string. */
> +					if (!v) {
> +						outstr = "(null)";
> +						len = 6;
> +						goto do_output;
> +					}
> +					outstr = (void *)v;
> +do_strnlen_output:
> +					len = strnlen(outstr, INT_MAX);

I get why you turned strlen() to strnlen(INT_MAX)  (result being an int)
but this will not change anything IMHO in that the rest of a 2GB+ string
will be written in multiple passes and will overflow the output anyway.
Thus I think that sticking to strlen() remains simpler and less confusing.

(...)
> -			else if (ch == 'm') {
> +
> +non_numeric_conversion:
> +			if (ch == 'm') {
>  #ifdef NOLIBC_IGNORE_ERRNO
>  				outstr = "unknown error";
> +				len = __builtin_strlen(outstr);
>  #else
>  				outstr = strerror(errno);
> +				goto do_strnlen_output;
>  #endif /* NOLIBC_IGNORE_ERRNO */

It's simlper (and smaller) to use the common label for both here:

   #ifdef NOLIBC_IGNORE_ERRNO
   				outstr = "unknown error";
   #else
   				outstr = strerror(errno);
   #endif /* NOLIBC_IGNORE_ERRNO */
  +				goto do_strnlen_output;

Overall OK to me.

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-06 19:11 ` [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X" david.laight.linux
  2026-02-08 15:47   ` Willy Tarreau
@ 2026-02-08 16:06   ` Willy Tarreau
  2026-02-16 19:57   ` Thomas Weißschuh
  2026-02-16 20:11   ` Thomas Weißschuh
  3 siblings, 0 replies; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 16:06 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

Another point I didn't notice, which confused me while reading next
patch:

On Fri, Feb 06, 2026 at 07:11:17PM +0000, david.laight.linux@gmail.com wrote:
> +			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p', 's')) {
>  				unsigned long long v;
>  				long long signed_v;
> -				char *out = tmpbuf;
> +				char *out = tmpbuf + 32;
> +				int sign = 0;
(...)
> +					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p', '#' - 1)) {
> +						/* "%p" and "%#x" need "0x" prepending. */
> +						sign = 'x' | '0' << 8;
(...)
> +				/* Add 0, 1 or 2 ("0x") sign characters left of any zero padding */
> +				for (; sign; sign >>= 8) {
> +					len++;
> +					*--out = sign;
> +				}

That's fine, but please mention in front of "sign" declaration something
like "/* may contain more than one char, starting with LSB */".

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 08/11] tools/nolibc/printf: Add support for zero padding and field precision
  2026-02-06 19:11 ` [PATCH v2 next 08/11] tools/nolibc/printf: Add support for zero padding and field precision david.laight.linux
@ 2026-02-08 16:16   ` Willy Tarreau
  2026-02-08 17:31     ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 16:16 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Fri, Feb 06, 2026 at 07:11:18PM +0000, david.laight.linux@gmail.com wrote:
> +				/* Add zero padding */
> +				if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '0', '.')) {
> +					if (!_NOLIBC_PF_FLAGS_CONTAIN(flags, '.')) {
> +						if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '-'))
> +							/* Left justify overrides zero pad */
> +							goto prepend_sign;
> +						/* eg "%05d", Zero pad to field width less sign */
> +						precision = width;
> +						if (sign) {
> +							precision--;
> +							if (sign >= 256)

This is the one that confused me with previous patch. As long as we cannot
have 4 chars that's OK, otherwise the 4th char can flip the sign. We could
also add and unsigned cast here to clarify this, though admittedly this
should not change over time.

Just for the record, this is the patch that increases the code the most
(+265 bytes for me, no jump tables). But I think the feature is worth it
and I'm not seeing low hanging fruits to reduce it. I honestly find the
code particularly complex to follow now but that's related to the
multiplicity of output formats and I doubt we can do much about this.

Acked-by: Willy Tarreau <w@1wt.eu>

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output
  2026-02-06 19:11 ` [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output david.laight.linux
@ 2026-02-08 16:20   ` Willy Tarreau
  2026-02-16 20:22   ` Thomas Weißschuh
  1 sibling, 0 replies; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 16:20 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Fri, Feb 06, 2026 at 07:11:21PM +0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Now that printf supports '*' for field widths it can be used to
> align the "[OK]" strings in the output.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>

The tests look good to me. For 9,10,11: 

Acked-by: Willy Tarreau <w@1wt.eu>

Thanks,
Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf()
  2026-02-08 14:44         ` Willy Tarreau
@ 2026-02-08 16:54           ` David Laight
  2026-02-08 17:06             ` Willy Tarreau
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-08 16:54 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sun, 8 Feb 2026 15:44:29 +0100
Willy Tarreau <w@1wt.eu> wrote:

> Hi David,
> 
> On Sun, Feb 08, 2026 at 12:20:31PM +0000, David Laight wrote:
> > On Sat, 7 Feb 2026 23:50:19 +0000
> > David Laight <david.laight.linux@gmail.com> wrote:
> >   
> > > On Sat, 7 Feb 2026 21:05:42 +0100
> > > Willy Tarreau <w@1wt.eu> wrote:
> > >   
> > > > On Fri, Feb 06, 2026 at 07:11:15PM +0000, david.laight.linux@gmail.com wrote:    
> > > > > From: David Laight <david.laight.linux@gmail.com>
> > > > > 
> > > > > Move the check for the length modifiers into the format processing
> > > > > between the field width and conversion specifier.
> > > > > This lets the loop be simplified and a 'fast scan' for a format start
> > > > > used.
> > > > > 
> > > > > If an error is detected (eg an invalid conversion specifier) then
> > > > > copy the invalid format to the output buffer.
> > > > > 
> > > > > Reduces code size by about 10% on x86-64.      
> > > > 
> > > > I'm surprised, because for me it's the opposite:
> > > > 
> > > >   $ size hello-patch*
> > > >      text    data     bss     dec     hex filename
> > > >      1859      48      24    1931     78b hello-patch1
> > > >      2071      48      24    2143     85f hello-patch2
> > > >      2091      48      24    2163     873 hello-patch3
> > > >      2422      48      24    2494     9be hello-patch4
> > > > 
> > > > The whole program grew by almost 16%, and that's a 30% increase since
> > > > the first patch. This is with gcc 15 -Oz. aarch64 however decreased by
> > > > 15 bytes since previous patch.
> > > > 
> > > > I have not figured what makes this change yet, I'm still digging.    
> > > 
> > > Running scripts/bloat-o-meter will give more detail.
> > >   
> > > > Willy    
> > > 
> > > I'm using gcc 12.2 and just running 'make O=xxx' for the test program.
> > > The object looks like what I'd expect, so might be -O2.
> > > 
> > > Is it constant folding the #defines.
> > > For me it generating the (1 << (c & 31)) & 0xxxxx as you might hope.  
> > 
> > Further thoughts:
> > 
> > On some of the builds I've done gcc duplicated the code following an 'if'
> > into both the 'then' and 'else' clauses.
> > This isn't good for code size.  
> 
> That's common in loops for example. That's also one reason for avoiding
> "else" statements in compact code.
> 
> However here I finally found what inflates the code, when disassembling
> the whole function: with the move of the multiple "if" statements,
> recent compilers managed to turn it into a jump table, that considerably
> inflates .rodata and the function as well. By passing -fno-jump-tables,
> the size drops by ~500 bytes:

That is just insane...
That might go away with the patch that changes is all to bit-masks.

I'd done some full disassembly comparisons myself to see why changes
made the code larger.
I had an OPTIMIZER_HIDE_VAR(sign) in there to help, but the final
version didn't need it.
What this sort of code needs is something to force the compiler to
only have one copy of something - I found a proposal for an attribute
(or similar) for an asm block to do that, but nothing came of it.

> 
>    text    data     bss     dec     hex filename
>    2422      48      24    2494     9be hello-patch4
>    1917      48      24    1989     7c5 hello-patch4-alt   <---
> 
> Building with gcc before 13 also avoids this table and explains why
> you had better code with gcc-12.
> 
> I also noticed that we can reduce the loop by ~40 bytes by moving the
> literal copy after after the block that deals with format sequences,
> because it eases comparisons, but that's no big deal for now since your
> subsequent patches are going to change all that.

Some of the early patches are carefully arranged to reduce churn
later on.

I might add the 'if (v == 0)' clause much earlier to avoid the churn
cause by the extra indent when it is added.

I'll add some extra comments as you suggested in the other patches.

I do know all about optimising for size, and for the 'worst case path'.
The latter was some embedded hdlc code that had to finish in 196 clocks.

	David

> 
> At least I wanted to understand what was causing this difference for
> us both, and whether it risked remaining definitive or not, so now
> this patch is OK to me.
> 
> Acked-by: Willy Tarreau <w@1wt.eu>
> 
> Willy


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf()
  2026-02-08 16:54           ` David Laight
@ 2026-02-08 17:06             ` Willy Tarreau
  0 siblings, 0 replies; 60+ messages in thread
From: Willy Tarreau @ 2026-02-08 17:06 UTC (permalink / raw)
  To: David Laight; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sun, Feb 08, 2026 at 04:54:25PM +0000, David Laight wrote:
> > However here I finally found what inflates the code, when disassembling
> > the whole function: with the move of the multiple "if" statements,
> > recent compilers managed to turn it into a jump table, that considerably
> > inflates .rodata and the function as well. By passing -fno-jump-tables,
> > the size drops by ~500 bytes:
> 
> That is just insane...
> That might go away with the patch that changes is all to bit-masks.

Yes, as mentioned later, it does.

> I'd done some full disassembly comparisons myself to see why changes
> made the code larger.
> I had an OPTIMIZER_HIDE_VAR(sign) in there to help, but the final
> version didn't need it.
> What this sort of code needs is something to force the compiler to
> only have one copy of something - I found a proposal for an attribute
> (or similar) for an asm block to do that, but nothing came of it.

Yeah I'm using similar hacks against the optimizer sometimes. That's
no big deal as there will always be variations between compilers, what
matters to me is that we can explain them (and indeed often when we
can we're also able to prevent the compiler from acting against us).

> > 
> >    text    data     bss     dec     hex filename
> >    2422      48      24    2494     9be hello-patch4
> >    1917      48      24    1989     7c5 hello-patch4-alt   <---
> > 
> > Building with gcc before 13 also avoids this table and explains why
> > you had better code with gcc-12.
> > 
> > I also noticed that we can reduce the loop by ~40 bytes by moving the
> > literal copy after after the block that deals with format sequences,
> > because it eases comparisons, but that's no big deal for now since your
> > subsequent patches are going to change all that.
> 
> Some of the early patches are carefully arranged to reduce churn
> later on.

Yes I noticed that. But the whole function is changed in the end so
we cannot avoid a number of complicated changes anyway.

> I might add the 'if (v == 0)' clause much earlier to avoid the churn
> cause by the extra indent when it is added.
> 
> I'll add some extra comments as you suggested in the other patches.

Yes, that's what is the most needed (and I don't deny that there are
already quite a bunch). When optimizing code, often the code ends up
being write-only. You're doing something while having the data flow in
your head and it turns into code (like size>=256), but when you don't
know the initial assumptions and you face this, you think "WTF?". Here
the comments need to indicate the developer's design choices (e.g.
"sign can hold up to two chars starting from LSB") and some of the
assumptions that become complicated to establish due to the long list
of if/else dealing with the multiple variants of specifiers.

> I do know all about optimising for size, and for the 'worst case path'.
> The latter was some embedded hdlc code that had to finish in 196 clocks.

Rest assured that it's quite visible, we're using the same tricks to save
every possible resource (making bitmaps from words etc), it's just that
doing this requires an amazing amount of comments. I'm used to saying
that each source or object byte saved offers more budget for comments :-)

Willy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-08 15:47   ` Willy Tarreau
@ 2026-02-08 17:14     ` David Laight
  0 siblings, 0 replies; 60+ messages in thread
From: David Laight @ 2026-02-08 17:14 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sun, 8 Feb 2026 16:47:23 +0100
Willy Tarreau <w@1wt.eu> wrote:

> On Fri, Feb 06, 2026 at 07:11:17PM +0000, david.laight.linux@gmail.com wrote:
> > -/* simple printf(). It supports the following formats:
> > - *  - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
> > - *  - %%
> > - *  - invalid formats are copied to the output buffer
> > +/* printf(). Supports most of the normal integer and string formats.
> > + *  - %[#-+ ][width][{l,t,z,ll,L,j,q}]{d,i,u,c,x,X,p,s,m,%}
> > + *  - %% generates a single %
> > + *  - %m outputs strerror(errno).
> > + *  - # only affects %x and prepends 0x to non-zero values.
> > + *  - %o (octal) isn't supported.
> > + *  - %X outputs a..f the same as %x.
> > + *  - No support for zero padding, precision or variable widths.
> > + *  - No support for wide characters.
> > + *  - invalid formats are copied to the output buffer.
> >   */  
> 
> Thanks for updating this one, it does help quite a bit.
> 
> >  /* This code uses 'flag' variables that are indexed by the low 6 bits
> > @@ -279,7 +285,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  	unsigned int written, width;
> >  	unsigned int flags, ch_flag;
> >  	size_t len;
> > -	char tmpbuf[21];
> > +	char tmpbuf[32 + 24];  
> 
> The previous buffer was sized to store a 64-bit int. I couldn't figure
> what these 32 and 24 correspond to with the new supported specifiers.
> Maybe please add a short comment on the line to hint about what they
> correspond to ?

The could just be 2 here, but the next patch needs extra space for
all the '0' bytes.
I do like to make char[] arrays multiple of 4 bytes as well.
That is one reason for rounding the 21 up to 24.
It does also mean the line doesn't need changing to support octal.

> 
> >  	const char *outstr;
> >  
> >  	written = 0;
> > @@ -334,19 +340,32 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  
> >  			/* Conversion specifiers. */
> >  
> > -			/* Numeric conversion specifiers. */
> > -			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
> > -			if (ch_flag != 0) {
> > +			/* Numeric and pointer conversion specifiers.
> > +			 *
> > +			 * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
> > +			 * so that 'X' can be allowed through.
> > +			 * 'X' gets treated and 'x' because _NOLIBC_PF_FLAG() returns the same
> > +			 * value for both.
> > +			 */
> > +			if ((ch < 'a' || ch > 'z') && ch != 'X')
> > +				goto non_numeric_conversion;
> > +
> > +			/* We need to check for "%p" or "%#x" later, merging here gives better code.
> > +			 * But '#' collides with 'c' so shift right.
> > +			 */
> > +			ch_flag = _NOLIBC_PF_FLAG(ch) | (flags & _NOLIBC_PF_FLAG('#')) >> 1;
> > +			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p', 's')) {
> >  				unsigned long long v;
> >  				long long signed_v;
> > -				char *out = tmpbuf;
> > +				char *out = tmpbuf + 32;  
> 
> OK so you seem to be reserving a part of the buffer for certain uses ?

Sign and zero pad (next patch).
Ok, make a comment...

> 
> > +				int sign = 0;
> >  
> >  				/* 'long' is needed for pointer/string conversions and ltz lengths.
> >  				 * A single test can be used provided 'p' (the same bit as '0')
> >  				 * is masked from flags.
> >  				 */
> >  				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
> > -							     'p', 'l', 't', 'z')) {
> > +							     'p', 's', 'l', 't', 'z')) {
> >  					v = va_arg(args, unsigned long);
> >  					signed_v = (long)v;
> >  				} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
> > @@ -365,40 +384,62 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  					goto do_output;
> >  				}
> >  
> > +				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 's')) {
> > +					/* "%s" - character string. */
> > +					if (!v) {
> > +						outstr = "(null)";
> > +						len = 6;
> > +						goto do_output;
> > +					}
> > +					outstr = (void *)v;
> > +do_strnlen_output:
> > +					len = strnlen(outstr, INT_MAX);  
> 
> I get why you turned strlen() to strnlen(INT_MAX)  (result being an int)
> but this will not change anything IMHO in that the rest of a 2GB+ string
> will be written in multiple passes and will overflow the output anyway.
> Thus I think that sticking to strlen() remains simpler and less confusing.

Wrong reason.
The next patch changes it to:
					len = strnlen(outstr, precision);
and I didn't want to change the name of the label as well as this line.

> 
> (...)
> > -			else if (ch == 'm') {
> > +
> > +non_numeric_conversion:
> > +			if (ch == 'm') {
> >  #ifdef NOLIBC_IGNORE_ERRNO
> >  				outstr = "unknown error";
> > +				len = __builtin_strlen(outstr);
> >  #else
> >  				outstr = strerror(errno);
> > +				goto do_strnlen_output;
> >  #endif /* NOLIBC_IGNORE_ERRNO */  
> 
> It's simlper (and smaller) to use the common label for both here:
> 
>    #ifdef NOLIBC_IGNORE_ERRNO
>    				outstr = "unknown error";
>    #else
>    				outstr = strerror(errno);
>    #endif /* NOLIBC_IGNORE_ERRNO */
>   +				goto do_strnlen_output;

If you look closely strerror() gets inlined here (at least with the makefile
that builds the test program).

I think you'll get smaller code by marking some of the functions noinline.

There is an 'interesting' option for 64bit.
Put "errno=" into 'sign', set 'v = errno', clear '.' (or set precision to 1)
and jump to the 'output in decimal' code.

strerror() could then be implemented using snprintf(buf, n, "%m");

Related is inlining the 'divide by reciprocal' u64toa() code into sprintf()
- that might save it spilling to stack and be smaller.
Then implement atoi() (etc) using snprintf().
Might generate smaller code overall - but it isn't a serious suggestion.

	David

> 
> Overall OK to me.
	Ta
> 
> Willy


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 08/11] tools/nolibc/printf: Add support for zero padding and field precision
  2026-02-08 16:16   ` Willy Tarreau
@ 2026-02-08 17:31     ` David Laight
  0 siblings, 0 replies; 60+ messages in thread
From: David Laight @ 2026-02-08 17:31 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sun, 8 Feb 2026 17:16:24 +0100
Willy Tarreau <w@1wt.eu> wrote:

> On Fri, Feb 06, 2026 at 07:11:18PM +0000, david.laight.linux@gmail.com wrote:
> > +				/* Add zero padding */
> > +				if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '0', '.')) {
> > +					if (!_NOLIBC_PF_FLAGS_CONTAIN(flags, '.')) {
> > +						if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '-'))
> > +							/* Left justify overrides zero pad */
> > +							goto prepend_sign;
> > +						/* eg "%05d", Zero pad to field width less sign */
> > +						precision = width;
> > +						if (sign) {
> > +							precision--;
> > +							if (sign >= 256)  
> 
> This is the one that confused me with previous patch. As long as we cannot
> have 4 chars that's OK, otherwise the 4th char can flip the sign. We could
> also add and unsigned cast here to clarify this, though admittedly this
> should not change over time.

The idea is that sign is 0/1/2 characters, a comment at the top might help. 

> Just for the record, this is the patch that increases the code the most
> (+265 bytes for me, no jump tables). But I think the feature is worth it
> and I'm not seeing low hanging fruits to reduce it. I honestly find the
> code particularly complex to follow now but that's related to the
> multiplicity of output formats and I doubt we can do much about this.

The final complexity is why I posted the full final version.
Trying to follow all the patches through is hard.
Some of the early changes are arranged to make the later ones smaller.

Yes it steals back all the gains from the previous patches and a bit more.
If you add the 'divide by reciprocal' [iu]64to[ah]_r() patch I think you
gain back at least two copies of the conversion code.
That might make it a net win.
The original code called u64toa_r(), u64toh_r() and i64toa_r() from
the numeric conversion code and utoa() from strerror().
Each copy is about 150 bytes on x86-64.
One copy is gone already, but I think there are two left.

That might give octal conversion at no net cost,

	David

> 
> Acked-by: Willy Tarreau <w@1wt.eu>
> 
> Willy


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback
  2026-02-08 15:12       ` Willy Tarreau
@ 2026-02-08 22:49         ` David Laight
  0 siblings, 0 replies; 60+ messages in thread
From: David Laight @ 2026-02-08 22:49 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Thomas Weißschuh, linux-kernel, Cheng Li

On Sun, 8 Feb 2026 16:12:59 +0100
Willy Tarreau <w@1wt.eu> wrote:

....
> > > Here we certainly can unconditionally write the trailing zero.  
> > 
> > Writing the zero then overwriting it with 'no bytes' is too subtle (even for me).
> > I'm sure you'd have wanted a big comment :-)  
> 
> Yes definitely!
> 
> > > Bingo, we're
> > > saving 9 bytes on x86_64 by moving it above. And even 17 bytes by dropping
> > > the test on size and updating the state after the memcpy:
> > > 
> > > 	if (size >= state->size) {
> > > 		if (state->size <= 1)
> > > 			return 0;
> > > 		size = state->size - 1;
> > > 	}
> > > 	*state->buf = '\0';
> > > 	memcpy(state->buf, buf, size);
> > > 	state->buf += size;
> > > 	state->size -= size;  
> > 
> > That starts relying on the compiler assuming that the memcpy() can't
> > be writing to state->buf and state->size.
> > I'm not certain it can assume that in the callback function
> > (With or without 'strict aliasing').  
> 
> There's nothing to assume, the code is absolutely valid. We developers
> have to do the correct thing and not suspect the compiler could do bad
> things in our back, but as long as we're not passing state->buf pointing
> to state, there's no such risk of aliasing.

Right, there won't be aliasing, the compiler might be able to detect
that it can't happen here because it can see all of the references to
the relevant structures.
But I think that if the functions weren't all static the compiler wouldn't
be able to make that assumption.
That would mean it couldn't use CSE for state->buf and state->size which
would make the code larger (esp. on non-x86 where it can't do add-to-memory).

Thinks (bad sign)...
Was the saving on x86 because it did add/sub to memory rather than a
register add/sub followed by a memory write?
So for pretty much everything else (except m68k) you get a read/add/write
sequences after the memcpy() which are larger (and slower).
So you win on x86 and lose everywhere else.
(m68k may not hace any callee saved register - so will have to spill
to stack across a real memcpy() call.)

There is another issue that (IIRC) technically memcpy(buf, NULL, 0) is UB.
(This might be for some cpu (vax?) that trap when a NULL pointer is loaded
into an address register.)
I can image clang detecting that path and just deleting all the code.
(Or doing something else equally unexpected.)

If you care about that the 'if (size)' has to stay.
(Or other changes done so that the pointer isn't NULL.)

	David


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch'
  2026-02-06 19:11 ` [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch' david.laight.linux
  2026-02-07 18:51   ` Willy Tarreau
@ 2026-02-16 18:52   ` Thomas Weißschuh
  1 sibling, 0 replies; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 18:52 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:11+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> This makes the code slightly easier to read because the variable
> stands out from the single character literals (especially 'c').
> 
> The following patches pretty much rewrite the function so the
> churn is limited.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>

Suggested-by: Willy Tarreau <w@1wt.eu>
Acked-by: Thomas Weißschuh <linux@weissschuh.net>

(...)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback.
  2026-02-07 23:36     ` David Laight
@ 2026-02-16 19:07       ` Thomas Weißschuh
  2026-02-17 11:51         ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 19:07 UTC (permalink / raw)
  To: David Laight; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-07 23:36:51+0000, David Laight wrote:
> On Sat, 7 Feb 2026 20:29:34 +0100
> Willy Tarreau <w@1wt.eu> wrote:
> 
> > On Fri, Feb 06, 2026 at 07:11:13PM +0000, david.laight.linux@gmail.com wrote:
> > > From: David Laight <david.laight.linux@gmail.com>
> > > 
> > > Add per-call buffering to the vprintf() callback.
> > > While this adds some extra code it will speed things up and
> > > makes a massive difference to anyone looking at strace output.  
> > 
> > This patch alone adds more than 200 extra bytes to the smallest binary
> > for something that was never expressed as a need by users:
> > 
> >  $ size hello-patch*
> >    text    data     bss     dec     hex filename
> >    1859      48      24    1931     78b hello-patch1
> >    2071      48      24    2143     85f hello-patch2
> > 
> > I doubt it would make sense to have a build option to choose this.
> > Or alternately one could decide do disable it when __OPTIMIZE_SIZE__
> > is defined. I just tried quickly and it does the job:
> 
> That probably makes sense.
> For anything non-trivial the extra size is noise.
> Actually is would be easier to read with a single #if covering the
> whole lot.

While I like the slimmer strace output, the added complexity, including
a different implementation for __OPTIMIZE_SIZE__ is a bit scary.
Could we leave this out for now and work on it on its own?


Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks
  2026-02-06 19:11 ` [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks david.laight.linux
  2026-02-07 19:38   ` Willy Tarreau
@ 2026-02-16 19:30   ` Thomas Weißschuh
  2026-02-16 22:29     ` David Laight
  1 sibling, 1 reply; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 19:30 UTC (permalink / raw)
  To: david.laight.linux, g; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:14+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Simple to do and saves calls to the callback function.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> 
> Changes for v2:
> Formally patch 3, unchanged.
> 
>  tools/include/nolibc/stdio.h | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> index 552f09d51d82..c044a6b3babe 100644
> --- a/tools/include/nolibc/stdio.h
> +++ b/tools/include/nolibc/stdio.h
> @@ -355,10 +355,12 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  			outstr = fmt;
>  			len = ofs - 1;
>  		flush_str:
> -			while (width-- > len) {
> -				if (cb(state, " ", 1) != 0)
> +			while (width > len) {
> +				unsigned int pad_len = ((width - len - 1) & 15) + 1;

So this is a modulo, which does not return 0.
While I am sure you and Willy recognize this instantly, I had to squint a bit.
While I understand that a regular modulo operator might be problematic
on some architectures, does it also apply to constant module values?
The compiler could do this on its own.  If not, we should have a macro.

I am wondering why it is a modulo and not some sort of max()...

Could you move the tests for the new functionality into the patches that
introduce that functionality? That would make it easier to review and
play with the code. Also it would allow to selectively apply the patches.
For example I get compiler warnings/errors in the later patches, so I
the full array of new tests will fail if I skip these later patches.

> +				width -= pad_len;
> +				written += pad_len;
> +				if (cb(state, "                ", pad_len) != 0)
>  					return -1;
> -				written += 1;
>  			}
>  			if (cb(state, outstr, len) != 0)
>  				return -1;
> -- 
> 2.39.5
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
  2026-02-06 19:11 ` [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars david.laight.linux
  2026-02-08 15:22   ` Willy Tarreau
@ 2026-02-16 19:52   ` Thomas Weißschuh
  2026-02-16 22:47     ` David Laight
  1 sibling, 1 reply; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 19:52 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:16+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Use flags bits (1u << (ch & 31)) for the flags, length modifiers, and
> conversion specifiers.
> This makes it easy to test for multiple values at once.
> 
> Detect the conversion flags " #+-0" although they are currently all ignored.
> 
> Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
> (both long long).
> 
> Add support for "%i" (the same as %d").

Would it be much work to split the new functionality into its own patch?
> 
> Unconditionally generate the signed values (for %d) to remove a second
> set of checks for the size.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> 
> Changes for v2:
> - Use #defines to make the code a lot more readable.
> - Include the changes from the old patch 10 that used masks for the
>   conversion specifiers.
> - Detect all the valid flag characters even though they are not implemented.
> - Support for left justifying field is moved to patch 7.
> 
>  tools/include/nolibc/stdio.h | 151 ++++++++++++++++++++++++-----------
>  1 file changed, 103 insertions(+), 48 deletions(-)
> 
> diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> index bb54f488c228..b14cf8224403 100644
> --- a/tools/include/nolibc/stdio.h
> +++ b/tools/include/nolibc/stdio.h
> @@ -240,19 +240,44 @@ char *fgets(char *s, int size, FILE *stream)
>  }
>  
>  
> -/* minimal printf(). It supports the following formats:
> - *  - %[l*]{d,u,c,x,p}
> - *  - %s
> - *  - unknown modifiers are ignored.
> +/* simple printf(). It supports the following formats:
> + *  - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
> + *  - %%
> + *  - invalid formats are copied to the output buffer
>   */
> +
> +/* This code uses 'flag' variables that are indexed by the low 6 bits
> + * of characters to optimise checks for multiple characters.
> + *
> + * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b'. ...)
> + * returns non-zero if the bit for any of the specified characters is set.
> + *
> + * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b'. ...)
> + * returns the flag bit for ch if it is one of the specified characters.
> + * All the characters must be in the same 32 character block (non-alphabetic,
> + * upper case, or lower case) of the ASCII character set.)
> + */
> +#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
> +#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
> +#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
> +	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
> +	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
> +	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
> +	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
> +#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
> +	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
> +#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
> +	(ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
> +		_NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))

With signed chars I get:

sysroot/i386/include/stdio.h:321:37: error: comparison is always false due to limited range of data type [-Werror=type-limits]
  321 |         (ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
      |                                     ^
sysroot/i386/include/stdio.h:389:35: note: in expansion of macro '_NOLIBC_PF_CHAR_IS_ONE_OF'
  389 |                         ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
      |

This can be fixed by switching 'ch' to be always unsigned.

You can run the the builtin test suite like this:
-p triggers the download of the toolchains
-l uses LLVM/clang instead of the downloaded toolchain

$ cd tools/testing/selftests/nolibc
$ ./run-tests.sh -m user -p
$ ./run-tests.sh -m user -l

> +
>  typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
>  
>  static __attribute__((unused, format(printf, 3, 0)))
>  int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
>  {
> -	char lpref, ch;
> -	unsigned long long v;
> +	char ch;
>  	unsigned int written, width;
> +	unsigned int flags, ch_flag;
>  	size_t len;
>  	char tmpbuf[21];
>  	const char *outstr;
> @@ -265,6 +290,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  			break;
>  
>  		width = 0;
> +		flags = 0;
>  		if (ch != '%') {
>  			while (*fmt && *fmt != '%')
>  				fmt++;
> @@ -274,6 +300,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  
>  			ch = *fmt++;
>  
> +			/* Conversion flag characters */
> +			for (;; ch = *fmt++) {
> +				ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
> +				if (!ch_flag)
> +					break;
> +				flags |= ch_flag;
> +			}

What is the advantage of this over:

while (1) {
	/* ... */

	ch = *fmt++;
}

Or combine it with the 'ch = *fmt++' from above and do:

while (1) {
	ch = *fmt++;

	/* ... */
}

These look much simpler to me.

> +
>  			/* width */
>  			while (ch >= '0' && ch <= '9') {
>  				width *= 10;

(...)

> +do_output:
>  		written += len;
>  
> +		/* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
> +		 * code into one of the conditionals above.
> +		 */
> +		__asm__ volatile("" : "=r"(len) : "0"(len));

Can we make a definition in nolibc/compiler.h out of this?
Why the 'volatile'? Wo don't have that in the kernel.
Why separate input and ouput arguments instead of one combined one ('+r')?
I have been wondering the same about the kernel definition, too.

> +
>  		while (width > len) {
>  			unsigned int pad_len = ((width - len - 1) & 15) + 1;
>  			width -= pad_len;
> -- 
> 2.39.5
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-06 19:11 ` [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X" david.laight.linux
  2026-02-08 15:47   ` Willy Tarreau
  2026-02-08 16:06   ` Willy Tarreau
@ 2026-02-16 19:57   ` Thomas Weißschuh
  2026-02-16 22:50     ` David Laight
  2026-02-16 20:11   ` Thomas Weißschuh
  3 siblings, 1 reply; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 19:57 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:17+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Add support for all the normal flag chacacters except '0' (zero pad).
>   '-' left alignment.
>   '+' and ' ' Sign characters for non-negative numbers.
>   '#' adds 0x to hex numbers.
> Partially support "%X", outputs lower case a..f the same as "%x".
> 
> Move the "%s" code in with the numeric formats to save a va_arg() call.
> 
> Prepend the sign (or "0x") after conversion to ascii and use the length
> returned by u64toh_r() and u64toa_r().
> Both needed for precision and zero-padding in the next patch.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> 
> Changes for v2:
> - Add support for left justifying fields (removed from patch 6).
> - Merge in the changes from the old patch 8.
> - Add support for "%#x" (formally part of patch 9).
> 
>  tools/include/nolibc/stdio.h | 97 ++++++++++++++++++++++++++----------
>  1 file changed, 71 insertions(+), 26 deletions(-)

(...)

> +non_numeric_conversion:
> +			if (ch == 'm') {
>  #ifdef NOLIBC_IGNORE_ERRNO
>  				outstr = "unknown error";
> +				len = __builtin_strlen(outstr);

Why the builtin and not the regular one?

>  #else
>  				outstr = strerror(errno);
> +				goto do_strnlen_output;
>  #endif /* NOLIBC_IGNORE_ERRNO */
>  			} else {
>  				if (ch != '%') {

(...)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors
  2026-02-06 19:11 ` [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors david.laight.linux
@ 2026-02-16 20:05   ` Thomas Weißschuh
  2026-02-17 10:48     ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 20:05 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:19+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Check the string matches before checking the returned length.
> Only print the string once when it matches.
> Normally the length is that of the expected string, make it easier
> to write tests by treating a length of zero as being that of the
> expected output.
> Additionally check that nothing beyond the end is written.
> 
> This also correctly returns 1 (the number of errors) when strcmp()
> fails rather than the return value from strncmp() which is the
> signed difference between the mismatching characters.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> 
> Changes for v2:
> - Formally patch 5, otherwise unchanged.
> 
>  tools/testing/selftests/nolibc/nolibc-test.c | 27 ++++++++++++++++----
>  1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
> index 3c5a226dad3a..9378a1f26c34 100644
> --- a/tools/testing/selftests/nolibc/nolibc-test.c
> +++ b/tools/testing/selftests/nolibc/nolibc-test.c
> @@ -1567,28 +1567,45 @@ int run_stdlib(int min, int max)
>  
>  static int expect_vfprintf(int llen, int c, const char *expected, const char *fmt, ...)
>  {
> +	unsigned int i;
>  	char buf[100];
>  	va_list args;
>  	ssize_t w;
>  	int ret;
>  
> +	for (i = 0; i < sizeof(buf); i++)
> +		buf[i] = i;

Some of these values are valid ascii characters.
An out-of-bounds write may be missed if the test happens to print the
correct character. Another poison value would avoid this issue.

>  
>  	va_start(args, fmt);
> -	/* Only allow writing 21 bytes, to test truncation */
> +	/* Only allow writing 20 bytes, to test truncation */
>  	w = vsnprintf(buf, 21, fmt, args);
>  	va_end(args);
>  
> +	llen += printf(" \"%s\"", buf);
> +	ret = strcmp(expected, buf);
> +	if (ret) {
> +		llen += printf(" should be \"%s\"", expected);
> +		result(llen, FAIL);
> +		return 1;
> +	}
> +	if (!c)
> +		c = strlen(expected);

This patch does multiple different things again. Please split them up.
Specifically I am not a fan of allowing to pass '0' here.
While it is slightly more convenient when writing the tests,
it increases the opportunity for errors and makes the tests less clear
when reading them.

>  	if (w != c) {
>  		llen += printf(" written(%d) != %d", (int)w, c);
>  		result(llen, FAIL);
>  		return 1;
>  	}
>  
> -	llen += printf(" \"%s\" = \"%s\"", expected, buf);
> -	ret = strncmp(expected, buf, c);
> +	for (i = c + 1; i < sizeof(buf); i++) {
> +		if (buf[i] - i) {

With a fixed poison value the check above wouldn't look this weird.

> +			llen += printf(" overwrote buf[%d] with 0x%x", i, buf[i]);
> +			result(llen, FAIL);
> +			return 1;
> +		}
> +	}
>  
> -	result(llen, ret ? FAIL : OK);
> -	return ret;
> +	result(llen, OK);
> +	return 0;
>  }
>  
>  static int test_scanf(void)
> -- 
> 2.39.5
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-06 19:11 ` [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X" david.laight.linux
                     ` (2 preceding siblings ...)
  2026-02-16 19:57   ` Thomas Weißschuh
@ 2026-02-16 20:11   ` Thomas Weißschuh
  2026-02-16 22:52     ` David Laight
  3 siblings, 1 reply; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 20:11 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:17+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Add support for all the normal flag chacacters except '0' (zero pad).
>   '-' left alignment.
>   '+' and ' ' Sign characters for non-negative numbers.
>   '#' adds 0x to hex numbers.
> Partially support "%X", outputs lower case a..f the same as "%x".
> 
> Move the "%s" code in with the numeric formats to save a va_arg() call.
> 
> Prepend the sign (or "0x") after conversion to ascii and use the length
> returned by u64toh_r() and u64toa_r().
> Both needed for precision and zero-padding in the next patch.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> 
> Changes for v2:
> - Add support for left justifying fields (removed from patch 6).
> - Merge in the changes from the old patch 8.
> - Add support for "%#x" (formally part of patch 9).
> 
>  tools/include/nolibc/stdio.h | 97 ++++++++++++++++++++++++++----------
>  1 file changed, 71 insertions(+), 26 deletions(-)

(...)

> @@ -365,40 +384,62 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
>  					goto do_output;
>  				}
>  
> +				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 's')) {
> +					/* "%s" - character string. */
> +					if (!v) {
> +						outstr = "(null)";
> +						len = 6;
> +						goto do_output;
> +					}
> +					outstr = (void *)v;

When building for i386:

In file included from ./nolibc.h:123,
                 from ./ctype.h:8,
                 from <command-line>:
./stdio.h: In function '__nolibc_printf':
./stdio.h:445:50: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
  445 |                                         outstr = (void *)v;
      |

Needs a cast through (uintptr_t). Also instead of casting through
(void *), please specify the actual target type that is expected.

> +do_strnlen_output:
> +					len = strnlen(outstr, INT_MAX);
> +					goto do_output;
> +				}

(...)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests
  2026-02-06 19:11 ` [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests david.laight.linux
@ 2026-02-16 20:14   ` Thomas Weißschuh
  2026-02-16 20:23   ` Thomas Weißschuh
  1 sibling, 0 replies; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 20:14 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:20+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Extra tests include:
> - %%, including ignored modifiers.
> - Invalid formats copied to output buffer (matches glibc).
> - Left aligned output "%-..."
> - Zero padding "%0...".
> - "(nil)" for NULL pointers (matches glibc).
> - Alternate form "%#x" (prepends 0x in non-zero).
> - Field precision as well as width, printf("%.0d", 0) is "".
> - Variable length width and precision "%*.*d".
> - Length qualifiers L and ll.
> - Conversion specifiers i and X.
> - More 'corner' cases.
> 
> There are no explicit tests of long (l, t or z) because they would
> have to differ between 32bit and 64bit.
> 
> Set the expected length to zero for all the non-truncating tests.
> (Annoying when a test is changed.)

New tests, yeah!

But as mentioned in my other responses I'd like the new tests to be part
of the patches introducing the new features. (Or of course a patch
introducing new tests for the existing features).

Also all tests should explicitly specify the return values.

So in the end this patch should probably be gone away completely.


Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output
  2026-02-06 19:11 ` [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output david.laight.linux
  2026-02-08 16:20   ` Willy Tarreau
@ 2026-02-16 20:22   ` Thomas Weißschuh
  1 sibling, 0 replies; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 20:22 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:21+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Now that printf supports '*' for field widths it can be used to
> align the "[OK]" strings in the output.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>

Acked-by: Thomas Weißschuh <linux@weissschuh.net>

> ---
> 
> Changes for v2:
> - Formally patch 12, unchanged.
> 
>  tools/testing/selftests/nolibc/nolibc-test.c | 21 ++++----------------
>  1 file changed, 4 insertions(+), 17 deletions(-)

(...)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests
  2026-02-06 19:11 ` [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests david.laight.linux
  2026-02-16 20:14   ` Thomas Weißschuh
@ 2026-02-16 20:23   ` Thomas Weißschuh
  2026-02-16 22:54     ` David Laight
  1 sibling, 1 reply; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-16 20:23 UTC (permalink / raw)
  To: david.laight.linux; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-06 19:11:20+0000, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Extra tests include:
> - %%, including ignored modifiers.
> - Invalid formats copied to output buffer (matches glibc).
> - Left aligned output "%-..."
> - Zero padding "%0...".
> - "(nil)" for NULL pointers (matches glibc).
> - Alternate form "%#x" (prepends 0x in non-zero).
> - Field precision as well as width, printf("%.0d", 0) is "".
> - Variable length width and precision "%*.*d".
> - Length qualifiers L and ll.
> - Conversion specifiers i and X.
> - More 'corner' cases.
> 
> There are no explicit tests of long (l, t or z) because they would
> have to differ between 32bit and 64bit.
> 
> Set the expected length to zero for all the non-truncating tests.
> (Annoying when a test is changed.)
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> 
> Changes for v2:
> - Formally patch 11.
> - Remove broken __attribute__ on expect_vfprintf().
> - Add extra test for "#01x".
> 
>  tools/testing/selftests/nolibc/nolibc-test.c | 49 +++++++++++++++-----
>  1 file changed, 37 insertions(+), 12 deletions(-)

(...)

> +		CASE_TEST(num_pad_trunc);EXPECT_VFPRINTF(40, "          0000000000", "%040d", 5); break;  /* max 30 '0' can be added */

This fails in 'libc-test' with glibc and musl:

32 num_pad_trunc "00000000000000000000" should be "          0000000000" [FAIL]

> +		CASE_TEST(number_prec);  EXPECT_VFPRINTF(0, "     00005", "%10.5d", 5); break;

(...)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks
  2026-02-16 19:30   ` Thomas Weißschuh
@ 2026-02-16 22:29     ` David Laight
  2026-02-18 17:30       ` Thomas Weißschuh
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-16 22:29 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: g, Willy Tarreau, linux-kernel, Cheng Li

On Mon, 16 Feb 2026 20:30:27 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:

> On 2026-02-06 19:11:14+0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Simple to do and saves calls to the callback function.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > 
> > Changes for v2:
> > Formally patch 3, unchanged.
> > 
> >  tools/include/nolibc/stdio.h | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > index 552f09d51d82..c044a6b3babe 100644
> > --- a/tools/include/nolibc/stdio.h
> > +++ b/tools/include/nolibc/stdio.h
> > @@ -355,10 +355,12 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  			outstr = fmt;
> >  			len = ofs - 1;
> >  		flush_str:
> > -			while (width-- > len) {
> > -				if (cb(state, " ", 1) != 0)
> > +			while (width > len) {
> > +				unsigned int pad_len = ((width - len - 1) & 15) + 1;  
> 
> So this is a modulo, which does not return 0.
> While I am sure you and Willy recognize this instantly, I had to squint a bit.
> While I understand that a regular modulo operator might be problematic
> on some architectures, does it also apply to constant module values?

The modulo would return 0..15, I wanted 1..16 hence the -1 and +1.
Perhaps another comment.

> The compiler could do this on its own.  If not, we should have a macro.
> 
> I am wondering why it is a modulo and not some sort of max()...

It doesn't require a conditional of any sort.
Basically it does the short transfer first followed by the full length ones.
Whereas a min/max would do the large ones first and the remainder at the end.
The effect is basically the same.

> 
> Could you move the tests for the new functionality into the patches that
> introduce that functionality?

Can do, I was thinking you'd prefer separate patches for the code changes
and test changes and didn't want to generate a stupid number of patches.
It is bad enough already :-)

> That would make it easier to review and
> play with the code. Also it would allow to selectively apply the patches.
> For example I get compiler warnings/errors in the later patches, so I
> the full array of new tests will fail if I skip these later patches.

I've been applying the patch so the tests and then doing a 'git reset' on
it and not worrying about the tests that fail.

	David

> 
> > +				width -= pad_len;
> > +				written += pad_len;
> > +				if (cb(state, "                ", pad_len) != 0)
> >  					return -1;
> > -				written += 1;
> >  			}
> >  			if (cb(state, outstr, len) != 0)
> >  				return -1;
> > -- 
> > 2.39.5
> >   


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
  2026-02-16 19:52   ` Thomas Weißschuh
@ 2026-02-16 22:47     ` David Laight
  2026-02-18 17:36       ` Thomas Weißschuh
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-16 22:47 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On Mon, 16 Feb 2026 20:52:49 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:

> On 2026-02-06 19:11:16+0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Use flags bits (1u << (ch & 31)) for the flags, length modifiers, and
> > conversion specifiers.
> > This makes it easy to test for multiple values at once.
> > 
> > Detect the conversion flags " #+-0" although they are currently all ignored.
> > 
> > Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
> > (both long long).
> > 
> > Add support for "%i" (the same as %d").  
> 
> Would it be much work to split the new functionality into its own patch?
> > 
> > Unconditionally generate the signed values (for %d) to remove a second
> > set of checks for the size.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > 
> > Changes for v2:
> > - Use #defines to make the code a lot more readable.
> > - Include the changes from the old patch 10 that used masks for the
> >   conversion specifiers.
> > - Detect all the valid flag characters even though they are not implemented.
> > - Support for left justifying field is moved to patch 7.
> > 
> >  tools/include/nolibc/stdio.h | 151 ++++++++++++++++++++++++-----------
> >  1 file changed, 103 insertions(+), 48 deletions(-)
> > 
> > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > index bb54f488c228..b14cf8224403 100644
> > --- a/tools/include/nolibc/stdio.h
> > +++ b/tools/include/nolibc/stdio.h
> > @@ -240,19 +240,44 @@ char *fgets(char *s, int size, FILE *stream)
> >  }
> >  
> >  
> > -/* minimal printf(). It supports the following formats:
> > - *  - %[l*]{d,u,c,x,p}
> > - *  - %s
> > - *  - unknown modifiers are ignored.
> > +/* simple printf(). It supports the following formats:
> > + *  - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
> > + *  - %%
> > + *  - invalid formats are copied to the output buffer
> >   */
> > +
> > +/* This code uses 'flag' variables that are indexed by the low 6 bits
> > + * of characters to optimise checks for multiple characters.
> > + *
> > + * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b'. ...)
> > + * returns non-zero if the bit for any of the specified characters is set.
> > + *
> > + * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b'. ...)
> > + * returns the flag bit for ch if it is one of the specified characters.
> > + * All the characters must be in the same 32 character block (non-alphabetic,
> > + * upper case, or lower case) of the ASCII character set.)
> > + */
> > +#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
> > +#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
> > +#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
> > +	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
> > +	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
> > +	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
> > +	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
> > +#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
> > +	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
> > +#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
> > +	(ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
> > +		_NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))  
> 
> With signed chars I get:
> 
> sysroot/i386/include/stdio.h:321:37: error: comparison is always false due to limited range of data type [-Werror=type-limits]
>   321 |         (ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \

Stupid type-limits warning.
It is almost impossible to get rid of without making the code unreadable.

>       |                                     ^
> sysroot/i386/include/stdio.h:389:35: note: in expansion of macro '_NOLIBC_PF_CHAR_IS_ONE_OF'
>   389 |                         ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
>       |
> 
> This can be fixed by switching 'ch' to be always unsigned.

That's likely to provoke another error elsewhere.
In any case optimising that test away makes the code smaller!

> 
> You can run the the builtin test suite like this:
> -p triggers the download of the toolchains
> -l uses LLVM/clang instead of the downloaded toolchain
> 
> $ cd tools/testing/selftests/nolibc
> $ ./run-tests.sh -m user -p
> $ ./run-tests.sh -m user -l

I've just been running:
	rm nolibc-test; make -O /path/.../dir && ./nolibc-test printf

> 
> > +
> >  typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
> >  
> >  static __attribute__((unused, format(printf, 3, 0)))
> >  int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
> >  {
> > -	char lpref, ch;
> > -	unsigned long long v;
> > +	char ch;
> >  	unsigned int written, width;
> > +	unsigned int flags, ch_flag;
> >  	size_t len;
> >  	char tmpbuf[21];
> >  	const char *outstr;
> > @@ -265,6 +290,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  			break;
> >  
> >  		width = 0;
> > +		flags = 0;
> >  		if (ch != '%') {
> >  			while (*fmt && *fmt != '%')
> >  				fmt++;
> > @@ -274,6 +300,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  
> >  			ch = *fmt++;
> >  
> > +			/* Conversion flag characters */
> > +			for (;; ch = *fmt++) {
> > +				ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
> > +				if (!ch_flag)
> > +					break;
> > +				flags |= ch_flag;
> > +			}  
> 
> What is the advantage of this over:
> 
> while (1) {
> 	/* ... */
> 
> 	ch = *fmt++;
> }

One line shorter :-)

> 
> Or combine it with the 'ch = *fmt++' from above and do:
> 
> while (1) {
> 	ch = *fmt++;
> 
> 	/* ... */
> }
> 
> These look much simpler to me.

The code always needs to get a new character after using one.
So you come out of each bit of code with the 'next char to check' in ch.
Which means you need a new character at the end of the loop for the next
iteration.

> 
> > +
> >  			/* width */
> >  			while (ch >= '0' && ch <= '9') {
> >  				width *= 10;  
> 
> (...)
> 
> > +do_output:
> >  		written += len;
> >  
> > +		/* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
> > +		 * code into one of the conditionals above.
> > +		 */
> > +		__asm__ volatile("" : "=r"(len) : "0"(len));  
> 
> Can we make a definition in nolibc/compiler.h out of this?

I've added _NOLIBC_OPTIMIZER_HIDE_VAR() to compiler.h in the draft of v3.

> Why the 'volatile'? Wo don't have that in the kernel.

I probably wrote that before looking up the kernel definition.

> Why separate input and ouput arguments instead of one combined one ('+r')?
> I have been wondering the same about the kernel definition, too.

I'm not sure either.
Certainly "+r" is more modern - which is why it isn't used in a lot of places.
They may be identical (indeed "+r" might get converted to the "0" form),
but maybe it gives better separation - I just copied the same version.

	David

> 
> > +
> >  		while (width > len) {
> >  			unsigned int pad_len = ((width - len - 1) & 15) + 1;
> >  			width -= pad_len;
> > -- 
> > 2.39.5
> >   


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-16 19:57   ` Thomas Weißschuh
@ 2026-02-16 22:50     ` David Laight
  2026-02-18 17:39       ` Thomas Weißschuh
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-16 22:50 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On Mon, 16 Feb 2026 20:57:37 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:

> On 2026-02-06 19:11:17+0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Add support for all the normal flag chacacters except '0' (zero pad).
> >   '-' left alignment.
> >   '+' and ' ' Sign characters for non-negative numbers.
> >   '#' adds 0x to hex numbers.
> > Partially support "%X", outputs lower case a..f the same as "%x".
> > 
> > Move the "%s" code in with the numeric formats to save a va_arg() call.
> > 
> > Prepend the sign (or "0x") after conversion to ascii and use the length
> > returned by u64toh_r() and u64toa_r().
> > Both needed for precision and zero-padding in the next patch.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > 
> > Changes for v2:
> > - Add support for left justifying fields (removed from patch 6).
> > - Merge in the changes from the old patch 8.
> > - Add support for "%#x" (formally part of patch 9).
> > 
> >  tools/include/nolibc/stdio.h | 97 ++++++++++++++++++++++++++----------
> >  1 file changed, 71 insertions(+), 26 deletions(-)  
> 
> (...)
> 
> > +non_numeric_conversion:
> > +			if (ch == 'm') {
> >  #ifdef NOLIBC_IGNORE_ERRNO
> >  				outstr = "unknown error";
> > +				len = __builtin_strlen(outstr);  
> 
> Why the builtin and not the regular one?

In that case the builtin one will generate a compile-time constant.
But that code gets changed in later patches.
The #if belongs in strerror().

	David

> 
> >  #else
> >  				outstr = strerror(errno);
> > +				goto do_strnlen_output;
> >  #endif /* NOLIBC_IGNORE_ERRNO */
> >  			} else {
> >  				if (ch != '%') {  
> 
> (...)


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-16 20:11   ` Thomas Weißschuh
@ 2026-02-16 22:52     ` David Laight
  0 siblings, 0 replies; 60+ messages in thread
From: David Laight @ 2026-02-16 22:52 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On Mon, 16 Feb 2026 21:11:18 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:

> On 2026-02-06 19:11:17+0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Add support for all the normal flag chacacters except '0' (zero pad).
> >   '-' left alignment.
> >   '+' and ' ' Sign characters for non-negative numbers.
> >   '#' adds 0x to hex numbers.
> > Partially support "%X", outputs lower case a..f the same as "%x".
> > 
> > Move the "%s" code in with the numeric formats to save a va_arg() call.
> > 
> > Prepend the sign (or "0x") after conversion to ascii and use the length
> > returned by u64toh_r() and u64toa_r().
> > Both needed for precision and zero-padding in the next patch.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > 
> > Changes for v2:
> > - Add support for left justifying fields (removed from patch 6).
> > - Merge in the changes from the old patch 8.
> > - Add support for "%#x" (formally part of patch 9).
> > 
> >  tools/include/nolibc/stdio.h | 97 ++++++++++++++++++++++++++----------
> >  1 file changed, 71 insertions(+), 26 deletions(-)  
> 
> (...)
> 
> > @@ -365,40 +384,62 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  					goto do_output;
> >  				}
> >  
> > +				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 's')) {
> > +					/* "%s" - character string. */
> > +					if (!v) {
> > +						outstr = "(null)";
> > +						len = 6;
> > +						goto do_output;
> > +					}
> > +					outstr = (void *)v;  
> 
> When building for i386:
> 
> In file included from ./nolibc.h:123,
>                  from ./ctype.h:8,
>                  from <command-line>:
> ./stdio.h: In function '__nolibc_printf':
> ./stdio.h:445:50: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
>   445 |                                         outstr = (void *)v;
>       |
> 
> Needs a cast through (uintptr_t). Also instead of casting through
> (void *), please specify the actual target type that is expected.

Ok.

> 
> > +do_strnlen_output:
> > +					len = strnlen(outstr, INT_MAX);
> > +					goto do_output;
> > +				}  
> 
> (...)


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests
  2026-02-16 20:23   ` Thomas Weißschuh
@ 2026-02-16 22:54     ` David Laight
  2026-02-18 17:41       ` Thomas Weißschuh
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-16 22:54 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On Mon, 16 Feb 2026 21:23:50 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:

> On 2026-02-06 19:11:20+0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Extra tests include:
> > - %%, including ignored modifiers.
> > - Invalid formats copied to output buffer (matches glibc).
> > - Left aligned output "%-..."
> > - Zero padding "%0...".
> > - "(nil)" for NULL pointers (matches glibc).
> > - Alternate form "%#x" (prepends 0x in non-zero).
> > - Field precision as well as width, printf("%.0d", 0) is "".
> > - Variable length width and precision "%*.*d".
> > - Length qualifiers L and ll.
> > - Conversion specifiers i and X.
> > - More 'corner' cases.
> > 
> > There are no explicit tests of long (l, t or z) because they would
> > have to differ between 32bit and 64bit.
> > 
> > Set the expected length to zero for all the non-truncating tests.
> > (Annoying when a test is changed.)
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > 
> > Changes for v2:
> > - Formally patch 11.
> > - Remove broken __attribute__ on expect_vfprintf().
> > - Add extra test for "#01x".
> > 
> >  tools/testing/selftests/nolibc/nolibc-test.c | 49 +++++++++++++++-----
> >  1 file changed, 37 insertions(+), 12 deletions(-)  
> 
> (...)
> 
> > +		CASE_TEST(num_pad_trunc);EXPECT_VFPRINTF(40, "          0000000000", "%040d", 5); break;  /* max 30 '0' can be added */  
> 
> This fails in 'libc-test' with glibc and musl:
> 
> 32 num_pad_trunc "00000000000000000000" should be "          0000000000" [FAIL]

I've a plan for that - the test needs skipping.
I can use a -40 to avoid another parameter.

	David

> 
> > +		CASE_TEST(number_prec);  EXPECT_VFPRINTF(0, "     00005", "%10.5d", 5); break;  
> 
> (...)


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors
  2026-02-16 20:05   ` Thomas Weißschuh
@ 2026-02-17 10:48     ` David Laight
  2026-02-18 17:48       ` Thomas Weißschuh
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-17 10:48 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On Mon, 16 Feb 2026 21:05:40 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:

> On 2026-02-06 19:11:19+0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Check the string matches before checking the returned length.
> > Only print the string once when it matches.
> > Normally the length is that of the expected string, make it easier
> > to write tests by treating a length of zero as being that of the
> > expected output.
> > Additionally check that nothing beyond the end is written.
> > 
> > This also correctly returns 1 (the number of errors) when strcmp()
> > fails rather than the return value from strncmp() which is the
> > signed difference between the mismatching characters.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> > 
> > Changes for v2:
> > - Formally patch 5, otherwise unchanged.
> > 
> >  tools/testing/selftests/nolibc/nolibc-test.c | 27 ++++++++++++++++----
> >  1 file changed, 22 insertions(+), 5 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
> > index 3c5a226dad3a..9378a1f26c34 100644
> > --- a/tools/testing/selftests/nolibc/nolibc-test.c
> > +++ b/tools/testing/selftests/nolibc/nolibc-test.c
> > @@ -1567,28 +1567,45 @@ int run_stdlib(int min, int max)
> >  
> >  static int expect_vfprintf(int llen, int c, const char *expected, const char *fmt, ...)
> >  {
> > +	unsigned int i;
> >  	char buf[100];
> >  	va_list args;
> >  	ssize_t w;
> >  	int ret;
> >  
> > +	for (i = 0; i < sizeof(buf); i++)
> > +		buf[i] = i;  
> 
> Some of these values are valid ascii characters.
> An out-of-bounds write may be missed if the test happens to print the
> correct character. Another poison value would avoid this issue.

I wanted to use different values just in case the test wrote the
value I'd picked to two adjacent locations.
I could use a (bad) random value (eg the time(NULL) & 0xff) so that
an overwrite of the chosen value would be picked up by different run.

> 
> >  
> >  	va_start(args, fmt);
> > -	/* Only allow writing 21 bytes, to test truncation */
> > +	/* Only allow writing 20 bytes, to test truncation */
> >  	w = vsnprintf(buf, 21, fmt, args);
> >  	va_end(args);
> >  
> > +	llen += printf(" \"%s\"", buf);
> > +	ret = strcmp(expected, buf);
> > +	if (ret) {
> > +		llen += printf(" should be \"%s\"", expected);
> > +		result(llen, FAIL);
> > +		return 1;
> > +	}
> > +	if (!c)
> > +		c = strlen(expected);  
> 
> This patch does multiple different things again. Please split them up.
> Specifically I am not a fan of allowing to pass '0' here.
> While it is slightly more convenient when writing the tests,
> it increases the opportunity for errors and makes the tests less clear
> when reading them.

I definitely wanted the invalid string output on failure.
Not being able to see why a test failed is a PITA.
Reporting the length error only makes sense when the output is truncated,
and the 'truncation length' is a property of expect_vfprintf() not
the test itself (and I had to increase it for the octal tests).

One option is to remove the 'c' parameter completely and require the
test include the un-truncated output for the truncation tests.
There is no reason for any of them to be horribly long.

Then add a parameter for 'run this test' (like all the other tests)
so that some tests can be excluded/changed when is_nolibc isn't set.

	David

> 
> >  	if (w != c) {
> >  		llen += printf(" written(%d) != %d", (int)w, c);
> >  		result(llen, FAIL);
> >  		return 1;
> >  	}
> >  
> > -	llen += printf(" \"%s\" = \"%s\"", expected, buf);
> > -	ret = strncmp(expected, buf, c);
> > +	for (i = c + 1; i < sizeof(buf); i++) {
> > +		if (buf[i] - i) {  
> 
> With a fixed poison value the check above wouldn't look this weird.
> 
> > +			llen += printf(" overwrote buf[%d] with 0x%x", i, buf[i]);
> > +			result(llen, FAIL);
> > +			return 1;
> > +		}
> > +	}
> >  
> > -	result(llen, ret ? FAIL : OK);
> > -	return ret;
> > +	result(llen, OK);
> > +	return 0;
> >  }
> >  
> >  static int test_scanf(void)
> > -- 
> > 2.39.5
> >   


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback.
  2026-02-16 19:07       ` Thomas Weißschuh
@ 2026-02-17 11:51         ` David Laight
  2026-02-18 17:52           ` Thomas Weißschuh
  0 siblings, 1 reply; 60+ messages in thread
From: David Laight @ 2026-02-17 11:51 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On Mon, 16 Feb 2026 20:07:35 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:

> On 2026-02-07 23:36:51+0000, David Laight wrote:
> > On Sat, 7 Feb 2026 20:29:34 +0100
> > Willy Tarreau <w@1wt.eu> wrote:
> >   
> > > On Fri, Feb 06, 2026 at 07:11:13PM +0000, david.laight.linux@gmail.com wrote:  
> > > > From: David Laight <david.laight.linux@gmail.com>
> > > > 
> > > > Add per-call buffering to the vprintf() callback.
> > > > While this adds some extra code it will speed things up and
> > > > makes a massive difference to anyone looking at strace output.    
> > > 
> > > This patch alone adds more than 200 extra bytes to the smallest binary
> > > for something that was never expressed as a need by users:
> > > 
> > >  $ size hello-patch*
> > >    text    data     bss     dec     hex filename
> > >    1859      48      24    1931     78b hello-patch1
> > >    2071      48      24    2143     85f hello-patch2
> > > 
> > > I doubt it would make sense to have a build option to choose this.
> > > Or alternately one could decide do disable it when __OPTIMIZE_SIZE__
> > > is defined. I just tried quickly and it does the job:  
> > 
> > That probably makes sense.
> > For anything non-trivial the extra size is noise.
> > Actually is would be easier to read with a single #if covering the
> > whole lot.  
> 
> While I like the slimmer strace output, the added complexity, including
> a different implementation for __OPTIMIZE_SIZE__ is a bit scary.
> Could we leave this out for now and work on it on its own?

I might move it later in the patch series.

I plan to add a #define that can select some of the extra features.
Basically making bits from 'flags' makes the compiler optimise the
associated code away - so 0 => no options and -1 => all options.
But you also set '-' (with the quotes) to get just left alignment
or '-', '.' to get left alignment and zero pad.

I can use a character (that misses a flag -eg 'b') to enable buffering.

I thought about being able to disable features, but the requirement
is only really to make small programs very small (but they may want
one extra bit), the 400(ish) bytes that everything adds isn't that
significant for anything non-trivial.
(Oh and that is 400 for this version, with 'nothing' it is smaller
that the old version).

	David

> 
> 
> Thomas
> 


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks
  2026-02-16 22:29     ` David Laight
@ 2026-02-18 17:30       ` Thomas Weißschuh
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-18 17:30 UTC (permalink / raw)
  To: David Laight; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-16 22:29:43+0000, David Laight wrote:
> On Mon, 16 Feb 2026 20:30:27 +0100
> Thomas Weißschuh <linux@weissschuh.net> wrote:
> 
> > On 2026-02-06 19:11:14+0000, david.laight.linux@gmail.com wrote:
> > > From: David Laight <david.laight.linux@gmail.com>
> > > 
> > > Simple to do and saves calls to the callback function.
> > > 
> > > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > > ---
> > > 
> > > Changes for v2:
> > > Formally patch 3, unchanged.
> > > 
> > >  tools/include/nolibc/stdio.h | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > > index 552f09d51d82..c044a6b3babe 100644
> > > --- a/tools/include/nolibc/stdio.h
> > > +++ b/tools/include/nolibc/stdio.h
> > > @@ -355,10 +355,12 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> > >  			outstr = fmt;
> > >  			len = ofs - 1;
> > >  		flush_str:
> > > -			while (width-- > len) {
> > > -				if (cb(state, " ", 1) != 0)
> > > +			while (width > len) {
> > > +				unsigned int pad_len = ((width - len - 1) & 15) + 1;  
> > 
> > So this is a modulo, which does not return 0.
> > While I am sure you and Willy recognize this instantly, I had to squint a bit.
> > While I understand that a regular modulo operator might be problematic
> > on some architectures, does it also apply to constant module values?
> 
> The modulo would return 0..15, I wanted 1..16 hence the -1 and +1.
> Perhaps another comment.

Yes please.

> > The compiler could do this on its own.  If not, we should have a macro.
> > 
> > I am wondering why it is a modulo and not some sort of max()...
> 
> It doesn't require a conditional of any sort.
> Basically it does the short transfer first followed by the full length ones.
> Whereas a min/max would do the large ones first and the remainder at the end.
> The effect is basically the same.
> 
> > 
> > Could you move the tests for the new functionality into the patches that
> > introduce that functionality?
> 
> Can do, I was thinking you'd prefer separate patches for the code changes
> and test changes and didn't want to generate a stupid number of patches.

We don't have a hard rule for that. If the test addition is complex, it
makes sense to have a dedicated patch. Here the test cases are single
lines and keeping the functionality and patches together makes them
easier to handle.

> It is bad enough already :-)

Agreed :-)


Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
  2026-02-16 22:47     ` David Laight
@ 2026-02-18 17:36       ` Thomas Weißschuh
  2026-02-18 22:57         ` David Laight
  0 siblings, 1 reply; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-18 17:36 UTC (permalink / raw)
  To: David Laight; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-16 22:47:36+0000, David Laight wrote:
> On Mon, 16 Feb 2026 20:52:49 +0100
> Thomas Weißschuh <linux@weissschuh.net> wrote:
> 
> > On 2026-02-06 19:11:16+0000, david.laight.linux@gmail.com wrote:
> > > From: David Laight <david.laight.linux@gmail.com>
> > > 
> > > Use flags bits (1u << (ch & 31)) for the flags, length modifiers, and
> > > conversion specifiers.
> > > This makes it easy to test for multiple values at once.
> > > 
> > > Detect the conversion flags " #+-0" although they are currently all ignored.
> > > 
> > > Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
> > > (both long long).
> > > 
> > > Add support for "%i" (the same as %d").  
> > 
> > Would it be much work to split the new functionality into its own patch?
> > > 
> > > Unconditionally generate the signed values (for %d) to remove a second
> > > set of checks for the size.
> > > 
> > > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > > ---
> > > 
> > > Changes for v2:
> > > - Use #defines to make the code a lot more readable.
> > > - Include the changes from the old patch 10 that used masks for the
> > >   conversion specifiers.
> > > - Detect all the valid flag characters even though they are not implemented.
> > > - Support for left justifying field is moved to patch 7.
> > > 
> > >  tools/include/nolibc/stdio.h | 151 ++++++++++++++++++++++++-----------
> > >  1 file changed, 103 insertions(+), 48 deletions(-)
> > > 
> > > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > > index bb54f488c228..b14cf8224403 100644
> > > --- a/tools/include/nolibc/stdio.h
> > > +++ b/tools/include/nolibc/stdio.h
> > > @@ -240,19 +240,44 @@ char *fgets(char *s, int size, FILE *stream)
> > >  }
> > >  
> > >  
> > > -/* minimal printf(). It supports the following formats:
> > > - *  - %[l*]{d,u,c,x,p}
> > > - *  - %s
> > > - *  - unknown modifiers are ignored.
> > > +/* simple printf(). It supports the following formats:
> > > + *  - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
> > > + *  - %%
> > > + *  - invalid formats are copied to the output buffer
> > >   */
> > > +
> > > +/* This code uses 'flag' variables that are indexed by the low 6 bits
> > > + * of characters to optimise checks for multiple characters.
> > > + *
> > > + * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b'. ...)
> > > + * returns non-zero if the bit for any of the specified characters is set.
> > > + *
> > > + * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b'. ...)
> > > + * returns the flag bit for ch if it is one of the specified characters.
> > > + * All the characters must be in the same 32 character block (non-alphabetic,
> > > + * upper case, or lower case) of the ASCII character set.)
> > > + */
> > > +#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
> > > +#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
> > > +#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
> > > +	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
> > > +	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
> > > +	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
> > > +	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
> > > +#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
> > > +	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
> > > +#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
> > > +	(ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
> > > +		_NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))  
> > 
> > With signed chars I get:
> > 
> > sysroot/i386/include/stdio.h:321:37: error: comparison is always false due to limited range of data type [-Werror=type-limits]
> >   321 |         (ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
> 
> Stupid type-limits warning.
> It is almost impossible to get rid of without making the code unreadable.
> 
> >       |                                     ^
> > sysroot/i386/include/stdio.h:389:35: note: in expansion of macro '_NOLIBC_PF_CHAR_IS_ONE_OF'
> >   389 |                         ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
> >       |
> > 
> > This can be fixed by switching 'ch' to be always unsigned.
> 
> That's likely to provoke another error elsewhere.
> In any case optimising that test away makes the code smaller!

We will need to get rid of this warning in some way.
Using pragmas in nolibc proper is something we don't want to do.

> > You can run the the builtin test suite like this:
> > -p triggers the download of the toolchains
> > -l uses LLVM/clang instead of the downloaded toolchain
> > 
> > $ cd tools/testing/selftests/nolibc
> > $ ./run-tests.sh -m user -p
> > $ ./run-tests.sh -m user -l
> 
> I've just been running:
> 	rm nolibc-test; make -O /path/.../dir && ./nolibc-test printf

This will miss weird things going on between different architectures.
Probably not directly relevant for changes to printf, but still useful.
And it shows the type limits warning.

> > > +
> > >  typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
> > >  
> > >  static __attribute__((unused, format(printf, 3, 0)))
> > >  int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
> > >  {
> > > -	char lpref, ch;
> > > -	unsigned long long v;
> > > +	char ch;
> > >  	unsigned int written, width;
> > > +	unsigned int flags, ch_flag;
> > >  	size_t len;
> > >  	char tmpbuf[21];
> > >  	const char *outstr;
> > > @@ -265,6 +290,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> > >  			break;
> > >  
> > >  		width = 0;
> > > +		flags = 0;
> > >  		if (ch != '%') {
> > >  			while (*fmt && *fmt != '%')
> > >  				fmt++;
> > > @@ -274,6 +300,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> > >  
> > >  			ch = *fmt++;
> > >  
> > > +			/* Conversion flag characters */
> > > +			for (;; ch = *fmt++) {
> > > +				ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
> > > +				if (!ch_flag)
> > > +					break;
> > > +				flags |= ch_flag;
> > > +			}  
> > 
> > What is the advantage of this over:
> > 
> > while (1) {
> > 	/* ... */
> > 
> > 	ch = *fmt++;
> > }
> 
> One line shorter :-)

Meh. Let's not optimize for that.

> > Or combine it with the 'ch = *fmt++' from above and do:
> > 
> > while (1) {
> > 	ch = *fmt++;
> > 
> > 	/* ... */
> > }
> > 
> > These look much simpler to me.
> 
> The code always needs to get a new character after using one.
> So you come out of each bit of code with the 'next char to check' in ch.
> Which means you need a new character at the end of the loop for the next
> iteration.

Fair enough, but let's use some clearer logic.

> > > +
> > >  			/* width */
> > >  			while (ch >= '0' && ch <= '9') {
> > >  				width *= 10;  
> > 
> > (...)
> > 
> > > +do_output:
> > >  		written += len;
> > >  
> > > +		/* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
> > > +		 * code into one of the conditionals above.
> > > +		 */
> > > +		__asm__ volatile("" : "=r"(len) : "0"(len));  

(...)

> > Why separate input and ouput arguments instead of one combined one ('+r')?
> > I have been wondering the same about the kernel definition, too.
> 
> I'm not sure either.
> Certainly "+r" is more modern - which is why it isn't used in a lot of places.
> They may be identical (indeed "+r" might get converted to the "0" form),
> but maybe it gives better separation - I just copied the same version.

Okay. We use '+r' in the syscall wrappers, so availability should not be
an issue. But I don't really care on way or another.


Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X"
  2026-02-16 22:50     ` David Laight
@ 2026-02-18 17:39       ` Thomas Weißschuh
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-18 17:39 UTC (permalink / raw)
  To: David Laight; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-16 22:50:25+0000, David Laight wrote:
> On Mon, 16 Feb 2026 20:57:37 +0100
> Thomas Weißschuh <linux@weissschuh.net> wrote:
> 
> > On 2026-02-06 19:11:17+0000, david.laight.linux@gmail.com wrote:
> > > From: David Laight <david.laight.linux@gmail.com>
> > > 
> > > Add support for all the normal flag chacacters except '0' (zero pad).
> > >   '-' left alignment.
> > >   '+' and ' ' Sign characters for non-negative numbers.
> > >   '#' adds 0x to hex numbers.
> > > Partially support "%X", outputs lower case a..f the same as "%x".
> > > 
> > > Move the "%s" code in with the numeric formats to save a va_arg() call.
> > > 
> > > Prepend the sign (or "0x") after conversion to ascii and use the length
> > > returned by u64toh_r() and u64toa_r().
> > > Both needed for precision and zero-padding in the next patch.
> > > 
> > > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > > ---
> > > 
> > > Changes for v2:
> > > - Add support for left justifying fields (removed from patch 6).
> > > - Merge in the changes from the old patch 8.
> > > - Add support for "%#x" (formally part of patch 9).
> > > 
> > >  tools/include/nolibc/stdio.h | 97 ++++++++++++++++++++++++++----------
> > >  1 file changed, 71 insertions(+), 26 deletions(-)  
> > 
> > (...)
> > 
> > > +non_numeric_conversion:
> > > +			if (ch == 'm') {
> > >  #ifdef NOLIBC_IGNORE_ERRNO
> > >  				outstr = "unknown error";
> > > +				len = __builtin_strlen(outstr);  
> > 
> > Why the builtin and not the regular one?
> 
> In that case the builtin one will generate a compile-time constant.

Then please also add a comment.

> But that code gets changed in later patches.
> The #if belongs in strerror().


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests
  2026-02-16 22:54     ` David Laight
@ 2026-02-18 17:41       ` Thomas Weißschuh
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-18 17:41 UTC (permalink / raw)
  To: David Laight; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-16 22:54:15+0000, David Laight wrote:
> On Mon, 16 Feb 2026 21:23:50 +0100
> Thomas Weißschuh <linux@weissschuh.net> wrote:
> 
> > On 2026-02-06 19:11:20+0000, david.laight.linux@gmail.com wrote:
> > > From: David Laight <david.laight.linux@gmail.com>
> > > 
> > > Extra tests include:
> > > - %%, including ignored modifiers.
> > > - Invalid formats copied to output buffer (matches glibc).
> > > - Left aligned output "%-..."
> > > - Zero padding "%0...".
> > > - "(nil)" for NULL pointers (matches glibc).
> > > - Alternate form "%#x" (prepends 0x in non-zero).
> > > - Field precision as well as width, printf("%.0d", 0) is "".
> > > - Variable length width and precision "%*.*d".
> > > - Length qualifiers L and ll.
> > > - Conversion specifiers i and X.
> > > - More 'corner' cases.
> > > 
> > > There are no explicit tests of long (l, t or z) because they would
> > > have to differ between 32bit and 64bit.
> > > 
> > > Set the expected length to zero for all the non-truncating tests.
> > > (Annoying when a test is changed.)
> > > 
> > > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > > ---
> > > 
> > > Changes for v2:
> > > - Formally patch 11.
> > > - Remove broken __attribute__ on expect_vfprintf().
> > > - Add extra test for "#01x".
> > > 
> > >  tools/testing/selftests/nolibc/nolibc-test.c | 49 +++++++++++++++-----
> > >  1 file changed, 37 insertions(+), 12 deletions(-)  
> > 
> > (...)
> > 
> > > +		CASE_TEST(num_pad_trunc);EXPECT_VFPRINTF(40, "          0000000000", "%040d", 5); break;  /* max 30 '0' can be added */  
> > 
> > This fails in 'libc-test' with glibc and musl:
> > 
> > 32 num_pad_trunc "00000000000000000000" should be "          0000000000" [FAIL]
> 
> I've a plan for that - the test needs skipping.
> I can use a -40 to avoid another parameter.

No magic negative numbers please. Introduce EXPECT_VFPRINTF_COND() which
takes an additional condition parameter.


Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors
  2026-02-17 10:48     ` David Laight
@ 2026-02-18 17:48       ` Thomas Weißschuh
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-18 17:48 UTC (permalink / raw)
  To: David Laight; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-17 10:48:07+0000, David Laight wrote:
> On Mon, 16 Feb 2026 21:05:40 +0100
> Thomas Weißschuh <linux@weissschuh.net> wrote:
> 
> > On 2026-02-06 19:11:19+0000, david.laight.linux@gmail.com wrote:
> > > From: David Laight <david.laight.linux@gmail.com>
> > > 
> > > Check the string matches before checking the returned length.
> > > Only print the string once when it matches.
> > > Normally the length is that of the expected string, make it easier
> > > to write tests by treating a length of zero as being that of the
> > > expected output.
> > > Additionally check that nothing beyond the end is written.
> > > 
> > > This also correctly returns 1 (the number of errors) when strcmp()
> > > fails rather than the return value from strncmp() which is the
> > > signed difference between the mismatching characters.
> > > 
> > > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > > ---
> > > 
> > > Changes for v2:
> > > - Formally patch 5, otherwise unchanged.
> > > 
> > >  tools/testing/selftests/nolibc/nolibc-test.c | 27 ++++++++++++++++----
> > >  1 file changed, 22 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
> > > index 3c5a226dad3a..9378a1f26c34 100644
> > > --- a/tools/testing/selftests/nolibc/nolibc-test.c
> > > +++ b/tools/testing/selftests/nolibc/nolibc-test.c
> > > @@ -1567,28 +1567,45 @@ int run_stdlib(int min, int max)
> > >  
> > >  static int expect_vfprintf(int llen, int c, const char *expected, const char *fmt, ...)
> > >  {
> > > +	unsigned int i;
> > >  	char buf[100];
> > >  	va_list args;
> > >  	ssize_t w;
> > >  	int ret;
> > >  
> > > +	for (i = 0; i < sizeof(buf); i++)
> > > +		buf[i] = i;  
> > 
> > Some of these values are valid ascii characters.
> > An out-of-bounds write may be missed if the test happens to print the
> > correct character. Another poison value would avoid this issue.
> 
> I wanted to use different values just in case the test wrote the
> value I'd picked to two adjacent locations.
> I could use a (bad) random value (eg the time(NULL) & 0xff) so that
> an overwrite of the chosen value would be picked up by different run.

What about 'i & 0x0F'?

> > >  	va_start(args, fmt);
> > > -	/* Only allow writing 21 bytes, to test truncation */
> > > +	/* Only allow writing 20 bytes, to test truncation */
> > >  	w = vsnprintf(buf, 21, fmt, args);
> > >  	va_end(args);
> > >  
> > > +	llen += printf(" \"%s\"", buf);
> > > +	ret = strcmp(expected, buf);
> > > +	if (ret) {
> > > +		llen += printf(" should be \"%s\"", expected);
> > > +		result(llen, FAIL);
> > > +		return 1;
> > > +	}
> > > +	if (!c)
> > > +		c = strlen(expected);  
> > 
> > This patch does multiple different things again. Please split them up.
> > Specifically I am not a fan of allowing to pass '0' here.
> > While it is slightly more convenient when writing the tests,
> > it increases the opportunity for errors and makes the tests less clear
> > when reading them.
> 
> I definitely wanted the invalid string output on failure.
> Not being able to see why a test failed is a PITA.
> Reporting the length error only makes sense when the output is truncated,
> and the 'truncation length' is a property of expect_vfprintf() not
> the test itself (and I had to increase it for the octal tests).

Printing the full invalid string on failure is great.

My objections was only about the 'if (!c) c = strlen();' shorthand.

> One option is to remove the 'c' parameter completely and require the
> test include the un-truncated output for the truncation tests.
> There is no reason for any of them to be horribly long.

That would work, too. I prefer the return value slightly, as it can be
understood easily to correspond to the function return value.

> Then add a parameter for 'run this test' (like all the other tests)
> so that some tests can be excluded/changed when is_nolibc isn't set.

I left it out intentionally to reduce the amount of parameter soup.
To you expect more conditional testcases to be necessary?
If not a TEST_VFPRINTF_COND() macro used only here looks better to me.


Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback.
  2026-02-17 11:51         ` David Laight
@ 2026-02-18 17:52           ` Thomas Weißschuh
  0 siblings, 0 replies; 60+ messages in thread
From: Thomas Weißschuh @ 2026-02-18 17:52 UTC (permalink / raw)
  To: David Laight; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On 2026-02-17 11:51:43+0000, David Laight wrote:
> On Mon, 16 Feb 2026 20:07:35 +0100
> Thomas Weißschuh <linux@weissschuh.net> wrote:
> 
> > On 2026-02-07 23:36:51+0000, David Laight wrote:
> > > On Sat, 7 Feb 2026 20:29:34 +0100
> > > Willy Tarreau <w@1wt.eu> wrote:
> > >   
> > > > On Fri, Feb 06, 2026 at 07:11:13PM +0000, david.laight.linux@gmail.com wrote:  
> > > > > From: David Laight <david.laight.linux@gmail.com>
> > > > > 
> > > > > Add per-call buffering to the vprintf() callback.
> > > > > While this adds some extra code it will speed things up and
> > > > > makes a massive difference to anyone looking at strace output.    
> > > > 
> > > > This patch alone adds more than 200 extra bytes to the smallest binary
> > > > for something that was never expressed as a need by users:
> > > > 
> > > >  $ size hello-patch*
> > > >    text    data     bss     dec     hex filename
> > > >    1859      48      24    1931     78b hello-patch1
> > > >    2071      48      24    2143     85f hello-patch2
> > > > 
> > > > I doubt it would make sense to have a build option to choose this.
> > > > Or alternately one could decide do disable it when __OPTIMIZE_SIZE__
> > > > is defined. I just tried quickly and it does the job:  
> > > 
> > > That probably makes sense.
> > > For anything non-trivial the extra size is noise.
> > > Actually is would be easier to read with a single #if covering the
> > > whole lot.  
> > 
> > While I like the slimmer strace output, the added complexity, including
> > a different implementation for __OPTIMIZE_SIZE__ is a bit scary.
> > Could we leave this out for now and work on it on its own?
> 
> I might move it later in the patch series.

Which advantage would such a move have?
Please leave it out for now.

> I plan to add a #define that can select some of the extra features.
> Basically making bits from 'flags' makes the compiler optimise the
> associated code away - so 0 => no options and -1 => all options.
> But you also set '-' (with the quotes) to get just left alignment
> or '-', '.' to get left alignment and zero pad.

While I like the idea, it seems impossible to test properly.
This will also require a whole other discussion. Let's also skip this for now.

> I can use a character (that misses a flag -eg 'b') to enable buffering.
> 
> I thought about being able to disable features, but the requirement
> is only really to make small programs very small (but they may want
> one extra bit), the 400(ish) bytes that everything adds isn't that
> significant for anything non-trivial.

Agreed.


Thomas

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
  2026-02-18 17:36       ` Thomas Weißschuh
@ 2026-02-18 22:57         ` David Laight
  0 siblings, 0 replies; 60+ messages in thread
From: David Laight @ 2026-02-18 22:57 UTC (permalink / raw)
  To: Thomas Weißschuh; +Cc: Willy Tarreau, linux-kernel, Cheng Li

On Wed, 18 Feb 2026 18:36:28 +0100
Thomas Weißschuh <linux@weissschuh.net> wrote:

...
> > > With signed chars I get:
> > > 
> > > sysroot/i386/include/stdio.h:321:37: error: comparison is always false due to limited range of data type [-Werror=type-limits]
> > >   321 |         (ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \  
> > 
> > Stupid type-limits warning.
> > It is almost impossible to get rid of without making the code unreadable.
> >   
> > >       |                                     ^
> > > sysroot/i386/include/stdio.h:389:35: note: in expansion of macro '_NOLIBC_PF_CHAR_IS_ONE_OF'
> > >   389 |                         ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
> > >       |
> > > 
> > > This can be fixed by switching 'ch' to be always unsigned.  
> > 
> > That's likely to provoke another error elsewhere.
> > In any case optimising that test away makes the code smaller!  
> 
> We will need to get rid of this warning in some way.
> Using pragmas in nolibc proper is something we don't want to do.

I think I've a version that won't generate the warning.
Not that I get it from the version of gcc I'm using.
Just changing the ch to +ch might be enough, and the optimiser
will still optimise to a single signed compare.

> 
> > > You can run the the builtin test suite like this:
> > > -p triggers the download of the toolchains
> > > -l uses LLVM/clang instead of the downloaded toolchain
> > > 
> > > $ cd tools/testing/selftests/nolibc
> > > $ ./run-tests.sh -m user -p
> > > $ ./run-tests.sh -m user -l  
> > 
> > I've just been running:
> > 	rm nolibc-test; make -O /path/.../dir && ./nolibc-test printf  
> 
> This will miss weird things going on between different architectures.
> Probably not directly relevant for changes to printf, but still useful.
> And it shows the type limits warning.
> 
> > > > +
> > > >  typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
> > > >  
> > > >  static __attribute__((unused, format(printf, 3, 0)))
> > > >  int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
> > > >  {
> > > > -	char lpref, ch;
> > > > -	unsigned long long v;
> > > > +	char ch;
> > > >  	unsigned int written, width;
> > > > +	unsigned int flags, ch_flag;
> > > >  	size_t len;
> > > >  	char tmpbuf[21];
> > > >  	const char *outstr;
> > > > @@ -265,6 +290,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> > > >  			break;
> > > >  
> > > >  		width = 0;
> > > > +		flags = 0;
> > > >  		if (ch != '%') {
> > > >  			while (*fmt && *fmt != '%')
> > > >  				fmt++;
> > > > @@ -274,6 +300,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> > > >  
> > > >  			ch = *fmt++;
> > > >  
> > > > +			/* Conversion flag characters */
> > > > +			for (;; ch = *fmt++) {
> > > > +				ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
> > > > +				if (!ch_flag)
> > > > +					break;
> > > > +				flags |= ch_flag;
> > > > +			}    
> > > 
> > > What is the advantage of this over:
> > > 
> > > while (1) {
> > > 	/* ... */
> > > 
> > > 	ch = *fmt++;
> > > }  
> > 
> > One line shorter :-)  
> 
> Meh. Let's not optimize for that.

Actually I looked at that code again.
The outer 'ch = *fmt++' can be put inside the loop.
It was outside the 'width' code and I inserted that bit in the gap.

...
> > > Why separate input and ouput arguments instead of one combined one ('+r')?
> > > I have been wondering the same about the kernel definition, too.  
> > 
> > I'm not sure either.
> > Certainly "+r" is more modern - which is why it isn't used in a lot of places.
> > They may be identical (indeed "+r" might get converted to the "0" form),
> > but maybe it gives better separation - I just copied the same version.  
> 
> Okay. We use '+r' in the syscall wrappers, so availability should not be
> an issue. But I don't really care on way or another.

"+r" seems to work - I will use it.

	David

> 
> 
> Thomas


^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2026-02-18 22:57 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-06 19:11 [PATCH v2 next 00/11] tools/nolibc: Enhance printf() david.laight.linux
2026-02-06 19:11 ` [PATCH v2 next 01/11] tools/nolibc/printf: Change variable used for format chars from 'c' to 'ch' david.laight.linux
2026-02-07 18:51   ` Willy Tarreau
2026-02-16 18:52   ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 02/11] tools/nolibc/printf: Move snprintf length check to callback david.laight.linux
2026-02-07 19:12   ` Willy Tarreau
2026-02-07 23:28     ` David Laight
2026-02-08 15:12       ` Willy Tarreau
2026-02-08 22:49         ` David Laight
2026-02-06 19:11 ` [PATCH v2 next 03/11] tools/nolibc/printf: Add buffering to vfprintf() callback david.laight.linux
2026-02-07 19:29   ` Willy Tarreau
2026-02-07 23:36     ` David Laight
2026-02-16 19:07       ` Thomas Weißschuh
2026-02-17 11:51         ` David Laight
2026-02-18 17:52           ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 04/11] tools/nolibc/printf: Output pad characters in 16 byte chunks david.laight.linux
2026-02-07 19:38   ` Willy Tarreau
2026-02-07 23:43     ` David Laight
2026-02-08 15:14       ` Willy Tarreau
2026-02-16 19:30   ` Thomas Weißschuh
2026-02-16 22:29     ` David Laight
2026-02-18 17:30       ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 05/11] tools/nolibc/printf: Simplify __nolibc_printf() david.laight.linux
2026-02-07 20:05   ` Willy Tarreau
2026-02-07 23:50     ` David Laight
2026-02-08 12:20       ` David Laight
2026-02-08 14:44         ` Willy Tarreau
2026-02-08 16:54           ` David Laight
2026-02-08 17:06             ` Willy Tarreau
2026-02-06 19:11 ` [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars david.laight.linux
2026-02-08 15:22   ` Willy Tarreau
2026-02-16 19:52   ` Thomas Weißschuh
2026-02-16 22:47     ` David Laight
2026-02-18 17:36       ` Thomas Weißschuh
2026-02-18 22:57         ` David Laight
2026-02-06 19:11 ` [PATCH v2 next 07/11] tools/nolibc/printf: Add support for conversion flags "#- +" and format "%X" david.laight.linux
2026-02-08 15:47   ` Willy Tarreau
2026-02-08 17:14     ` David Laight
2026-02-08 16:06   ` Willy Tarreau
2026-02-16 19:57   ` Thomas Weißschuh
2026-02-16 22:50     ` David Laight
2026-02-18 17:39       ` Thomas Weißschuh
2026-02-16 20:11   ` Thomas Weißschuh
2026-02-16 22:52     ` David Laight
2026-02-06 19:11 ` [PATCH v2 next 08/11] tools/nolibc/printf: Add support for zero padding and field precision david.laight.linux
2026-02-08 16:16   ` Willy Tarreau
2026-02-08 17:31     ` David Laight
2026-02-06 19:11 ` [PATCH v2 next 09/11] selftests/nolibc: Improve reporting of vfprintf() errors david.laight.linux
2026-02-16 20:05   ` Thomas Weißschuh
2026-02-17 10:48     ` David Laight
2026-02-18 17:48       ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 10/11] selftests/nolibc: Increase coverage of printf format tests david.laight.linux
2026-02-16 20:14   ` Thomas Weißschuh
2026-02-16 20:23   ` Thomas Weißschuh
2026-02-16 22:54     ` David Laight
2026-02-18 17:41       ` Thomas Weißschuh
2026-02-06 19:11 ` [PATCH v2 next 11/11] selftests/nolibc: Use printf("%.*s", n, "") to align output david.laight.linux
2026-02-08 16:20   ` Willy Tarreau
2026-02-16 20:22   ` Thomas Weißschuh
2026-02-06 21:36 ` [PATCH v2 next 00/11] tools/nolibc: Enhance printf() David Laight

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox