From: david.laight.linux@gmail.com
To: Willy Tarreau, Thomas Weißschuh, linux-kernel@vger.kernel.org, Cheng Li
Cc: David Laight
Subject: [PATCH v3 next 11/17] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars
Date: Mon, 23 Feb 2026 10:17:29 +0000
Message-Id: <20260223101735.2922-12-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20260223101735.2922-1-david.laight.linux@gmail.com>
References: <20260223101735.2922-1-david.laight.linux@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: David Laight

Use flag bits (1u << (ch & 31)) for the flags, length modifiers, and
conversion specifiers.
This makes it easy to test for multiple values at once.

Detect the conversion flags " #+-0" although they are currently all ignored.

Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
(both long long).
Add support for "%i" (the same as "%d") and "%X" (treated as "%x").

Unconditionally generate the signed values (for %d) to remove a second
set of checks for the size.

Separate out the formatting of single characters from numbers.
Output the sign for negative values then negate and treat as unsigned.

Change/add tests to use conversions i and X, and length modifiers L and ll.
Use the correct minimum value for "%Li".

Acked-by: Willy Tarreau
Signed-off-by: David Laight
---

Changes for v3:
- Patch 6 in v2.
- Move all the variable definitions to the top of the function.
  The loop body is a bit long to hide definitions at its top.
- Avoid -Wtype-limits validating format characters.
- Include changes to the selftests.

Changes for v2:
- Use #defines to make the code a lot more readable.
- Include the changes from the old patch 10 that used masks for the
  conversion specifiers.
- Detect all the valid flag characters even though they are not implemented.
- Support for left justifying field is moved to patch 7.

 tools/include/nolibc/stdio.h                 | 162 +++++++++++++------
 tools/testing/selftests/nolibc/nolibc-test.c |  14 +-
 2 files changed, 124 insertions(+), 52 deletions(-)

diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
index ae96b7bebbfe..6cb106367e3b 100644
--- a/tools/include/nolibc/stdio.h
+++ b/tools/include/nolibc/stdio.h
@@ -291,10 +291,15 @@ int fseek(FILE *stream, long offset, int whence)
 }
 
 
-/* minimal printf(). It supports the following formats:
- *  - %[l*]{d,u,c,x,p}
- *  - %s
- *  - unknown modifiers are ignored.
+/* printf(). Supports most of the normal integer and string formats.
+ *  - %[#-+ 0][width][{l,t,z,ll,L,j,q}]{c,d,i,u,x,X,p,s,m,%}
+ *  - %% generates a single %
+ *  - %m outputs strerror(errno).
+ *  - %X outputs a..f the same as %x.
+ *  - The modifiers [#-+ 0] are currently ignored.
+ *  - No support for precision or variable widths.
+ *  - No support for floating point or wide characters.
+ *  - Invalid formats are copied to the output buffer.
  *
  * Called by vfprintf() and snprintf() to do the actual formatting.
  * The callers provide a callback function to save the formatted data.
@@ -305,15 +310,43 @@ int fseek(FILE *stream, long offset, int whence)
  *  - with (NULL, 0) at the end of the __nolibc_printf.
  * If the callback returns non-zero __nolibc_printf() immediately returns -1.
  */
+
 typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
 
+/* This code uses 'flag' variables that are indexed by the low 5 bits
+ * of characters to optimise checks for multiple characters.
+ *
+ * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b', ...)
+ * returns non-zero if the bit for any of the specified characters is set.
+ *
+ * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b', ...)
+ * returns the flag bit for ch if it is one of the specified characters.
+ * All the characters must be in the same 32 character block (non-alphabetic,
+ * upper case, or lower case) of the ASCII character set.
+ */
+#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
+#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
+#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
+	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
+	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
+#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
+	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
+#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
+	((unsigned int)(ch) - (cmp_1 & 0xe0) > 0x1f ? 0 : \
+	 _NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))
+
 static __attribute__((unused, format(printf, 3, 0)))
 int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
 {
-	char lpref, ch;
+	char ch;
 	unsigned long long v;
+	long long signed_v;
 	int written, width, len;
+	unsigned int flags, ch_flag;
 	char outbuf[21];
+	char *out;
 	const char *outstr;
 
 	written = 0;
@@ -324,6 +357,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			break;
 
 		width = 0;
+		flags = 0;
 		if (ch != '%') {
 			while (*fmt && *fmt != '%')
 				fmt++;
@@ -334,7 +368,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 
 		/* we're in a format sequence */
 
-		ch = *fmt++;
+		/* Conversion flag characters */
+		while (1) {
+			ch = *fmt++;
+			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
+			if (!ch_flag)
+				break;
+			flags |= ch_flag;
+		}
 
 		/* width */
 		while (ch >= '0' && ch <= '9') {
@@ -344,62 +385,82 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 			ch = *fmt++;
 		}
 
-		/* Length modifiers */
-		if (ch == 'l') {
-			lpref = 1;
-			ch = *fmt++;
-			if (ch == 'l') {
-				lpref = 2;
-				ch = *fmt++;
+		/* Length modifier.
+		 * They miss the conversion flags characters " #+-0" so can go into flags.
+		 * Change both L and ll to q.
+		 */
+		if (ch == 'L')
+			ch = 'q';
+		ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'l', 't', 'z', 'j', 'q');
+		if (ch_flag != 0) {
+			if (ch == 'l' && fmt[0] == 'l') {
+				fmt++;
+				ch_flag = _NOLIBC_PF_FLAG('q');
 			}
-		} else if (ch == 'j') {
-			/* intmax_t is long long */
-			lpref = 2;
+			flags |= ch_flag;
 			ch = *fmt++;
-		} else {
-			lpref = 0;
 		}
 
-		if (ch == 'c' || ch == 'd' || ch == 'u' || ch == 'x' || ch == 'p') {
-			char *out = outbuf;
+		/* Conversion specifiers. */
 
-			if (ch == 'p')
+		/* Numeric and pointer conversion specifiers.
+		 *
+		 * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
+		 * so that 'X' can be allowed through.
+		 * 'X' gets treated as 'x' because _NOLIBC_PF_FLAG() returns the same
+		 * value for both.
+		 */
+		ch_flag = _NOLIBC_PF_FLAG(ch);
+		if (((ch >= 'a' && ch <= 'z') || ch == 'X') &&
+		    _NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p')) {
+			/* 'long' is needed for pointer conversions and ltz lengths.
+			 * A single test can be used provided 'p' (the same bit as '0')
+			 * is masked from flags.
+			 */
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
+						     'p', 'l', 't', 'z')) {
 				v = va_arg(args, unsigned long);
-			else if (lpref) {
-				if (lpref > 1)
-					v = va_arg(args, unsigned long long);
-				else
-					v = va_arg(args, unsigned long);
-			} else
+				signed_v = (long)v;
+			} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
+				v = va_arg(args, unsigned long long);
+				signed_v = v;
+			} else {
 				v = va_arg(args, unsigned int);
+				signed_v = (int)v;
+			}
 
-			if (ch == 'd') {
-				/* sign-extend the value */
-				if (lpref == 0)
-					v = (long long)(int)v;
-				else if (lpref == 1)
-					v = (long long)(long)v;
+			if (ch == 'c') {
+				/* "%c" - single character. */
+				outbuf[0] = v;
+				len = 1;
+				outstr = outbuf;
+				goto do_output;
 			}
 
-			switch (ch) {
-			case 'c':
-				out[0] = v;
-				out[1] = 0;
-				break;
-			case 'd':
-				i64toa_r(v, out);
-				break;
-			case 'u':
+			out = outbuf;
+
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i')) {
+				/* "%d" and "%i" - signed decimal numbers. */
+				if (signed_v < 0) {
+					*out++ = '-';
+					v = -(signed_v + 1);
+					v++;
+				}
+			}
+
+			/* Convert the number to ascii in the required base. */
+			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
+				/* Base 10 */
 				u64toa_r(v, out);
-				break;
-			case 'p':
-				*(out++) = '0';
-				*(out++) = 'x';
-				__nolibc_fallthrough;
-			default: /* 'x' and 'p' above */
+			} else {
+				/* Base 16 */
+				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
+					*(out++) = '0';
+					*(out++) = 'x';
+				}
 				u64toh_r(v, out);
-				break;
 			}
+			outstr = outbuf;
 			goto do_strlen_output;
 		}
 
@@ -438,6 +499,9 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
 do_output:
 		written += len;
 
+		/* Stop gcc back-merging this code into one of the conditionals above. */
+		_NOLIBC_OPTIMIZER_HIDE_VAR(len);
+
 		width -= len;
 		while (width > 0) {
 			/* Output pad in 16 byte blocks with the small block first. */
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 61968fdfeec0..498d3125eb24 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -1833,11 +1833,19 @@ static int run_printf(int min, int max)
 		CASE_TEST(number); EXPECT_VFPRINTF(1, "1234", "%d", 1234); break;
 		CASE_TEST(negnumber); EXPECT_VFPRINTF(1, "-1234", "%d", -1234); break;
 		CASE_TEST(unsigned); EXPECT_VFPRINTF(1, "12345", "%u", 12345); break;
+		CASE_TEST(signed_max); EXPECT_VFPRINTF(1, "2147483647", "%i", ~0u >> 1); break;
+		CASE_TEST(signed_min); EXPECT_VFPRINTF(1, "-2147483648", "%i", (~0u >> 1) + 1); break;
+		CASE_TEST(unsigned_max); EXPECT_VFPRINTF(1, "4294967295", "%u", ~0u); break;
 		CASE_TEST(char); EXPECT_VFPRINTF(1, "c", "%c", 'c'); break;
-		CASE_TEST(hex); EXPECT_VFPRINTF(1, "f", "%x", 0xf); break;
+		CASE_TEST(hex_nolibc); EXPECT_VFPRINTF(is_nolibc, "|f|d|", "|%x|%X|", 0xf, 0xd); break;
+		CASE_TEST(hex_libc); EXPECT_VFPRINTF(!is_nolibc, "|f|D|", "|%x|%X|", 0xf, 0xd); break;
 		CASE_TEST(pointer); EXPECT_VFPRINTF(1, "0x1", "%p", (void *) 0x1); break;
-		CASE_TEST(uintmax_t); EXPECT_VFPRINTF(1, "18446744073709551615", "%ju", 0xffffffffffffffffULL); break;
-		CASE_TEST(intmax_t); EXPECT_VFPRINTF(1, "-9223372036854775807", "%jd", 0x8000000000000001LL); break;
+		CASE_TEST(percent); EXPECT_VFPRINTF(1, "a%d42%69%", "a%%d%d%%%d%%", 42, 69); break;
+		CASE_TEST(perc_qual); EXPECT_VFPRINTF(1, "a%d2", "a%-14l%d%d", 2); break;
+		CASE_TEST(invalid); EXPECT_VFPRINTF(1, "a%12yx3%y42%P", "a%12yx%d%y%d%P", 3, 42); break;
+		CASE_TEST(intmax_max); EXPECT_VFPRINTF(1, "9223372036854775807", "%lld", ~0ULL >> 1); break;
+		CASE_TEST(intmax_min); EXPECT_VFPRINTF(1, "-9223372036854775808", "%Li", (~0ULL >> 1) + 1); break;
+		CASE_TEST(uintmax_max); EXPECT_VFPRINTF(1, "18446744073709551615", "%ju", ~0ULL); break;
 		CASE_TEST(truncation); EXPECT_VFPRINTF(1, "012345678901234567890123456789", "%s", "012345678901234567890123456789"); break;
 		CASE_TEST(string_width); EXPECT_VFPRINTF(1, "         1", "%10s", "1"); break;
 		CASE_TEST(number_width); EXPECT_VFPRINTF(1, "         1", "%10d", 1); break;
-- 
2.39.5
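
For readers who want to try the two ideas from the commit message outside nolibc, the following is a minimal standalone sketch, not part of the patch. It shows the one-bit-per-character mask built from the low 5 bits of a character, and the "output the sign, then negate as unsigned" step. The helper names pf_flag(), pf_is_conv_flag() and pf_magnitude() are hypothetical; they only mirror the idea behind the _NOLIBC_PF_* macros and the signed_v handling.

```c
#include <limits.h>

/* Hypothetical helper: map a character to one bit of a 32-bit mask,
 * selected by the character's low 5 bits (same idea as 1u << (ch & 31)).
 */
static unsigned int pf_flag(unsigned char ch)
{
	return 1u << (ch & 0x1f);
}

/* Hypothetical helper: return the flag bit for ch if it is one of the
 * conversion flag characters " #+-0", otherwise 0.
 */
static unsigned int pf_is_conv_flag(unsigned char ch)
{
	const unsigned int mask = pf_flag(' ') | pf_flag('#') | pf_flag('+') |
				  pf_flag('-') | pf_flag('0');

	/* All five characters sit in the ASCII block 0x20..0x3f; anything
	 * outside that block is rejected before the mask is consulted.
	 */
	if ((unsigned int)ch - 0x20 > 0x1f)
		return 0;
	return pf_flag(ch) & mask;
}

/* Hypothetical helper: magnitude of a signed value as an unsigned one.
 * Adding 1 before negating avoids signed overflow for LLONG_MIN.
 */
static unsigned long long pf_magnitude(long long signed_v)
{
	unsigned long long v;

	if (signed_v >= 0)
		return signed_v;
	v = -(signed_v + 1);
	return v + 1;
}
```

Because 'x' and 'X' differ only in bit 5, pf_flag() gives both the same bit, and 'p' shares a bit with '0'. The range check in pf_is_conv_flag() is what keeps 'p' from being mistaken for the '0' flag.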