Date: Fri, 6 Feb 2026 21:36:08 +0000
From: David Laight
To: Willy Tarreau, Thomas Weißschuh, linux-kernel@vger.kernel.org, Cheng Li
Subject: Re: [PATCH v2 next 00/11] tools/nolibc: Enhance printf()
Message-ID: <20260206213608.1bbad591@pumpkin>
In-Reply-To: <20260206191121.3602-1-david.laight.linux@gmail.com>
References: <20260206191121.3602-1-david.laight.linux@gmail.com>

On Fri, 6 Feb 2026 19:11:10 +0000
david.laight.linux@gmail.com wrote:

> From: David Laight
>
> Update printf() so that it handles almost all the non-fp formats.
> In particular:
> - Left alignment.
> - Zero padding.
> - Field precision.
> - Variable field width and precision.
> - Width modifiers q, L, t and z.
> - Conversion specifiers i and X (X generates lower case).
> About the only things that are missing are octal and floating point.

Since it is pretty much a re-write, a copy of the new version:

/* printf(). Supports most of the normal integer and string formats.
 *  - %[#0-+ ][width|*[.precision|*]][{l,t,z,ll,L,j,q}]{d,i,u,c,x,X,p,s,m,%}
 *  - %% generates a single %
 *  - %m outputs strerror(errno).
 *  - # only affects %x and prepends 0x to non-zero values.
 *  - %o (octal) isn't supported.
 *  - %X outputs a..f the same as %x.
 *  - No support for wide characters.
 *  - invalid formats are copied to the output buffer.
 */

/* This code uses 'flag' variables that are indexed by the low 5 bits
 * of characters to optimise checks for multiple characters.
 *
 * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b', ...)
 *	returns non-zero if the bit for any of the specified characters is set.
 *
 * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b', ...)
 *	returns the flag bit for ch if it is one of the specified characters.
 *	All the characters must be in the same 32 character block (non-alphabetic,
 *	upper case, or lower case) of the ASCII character set.
 */
#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
	(_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
	 _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
	 _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
	 _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
	((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
	(ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
		_NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))

typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);

static __attribute__((unused, format(printf, 3, 0)))
int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
{
	char ch;
	int len, written, width, precision;
	unsigned int flags, ch_flag;
	char tmpbuf[32 + 24];
	const char *outstr;

	written = 0;
	while (1) {
		outstr = fmt;
		ch = *fmt++;
		if (!ch)
			break;

		width = 0;
		flags = 0;
		if (ch != '%') {
			while (*fmt && *fmt != '%')
				fmt++;
			len = fmt - outstr;
		} else {
			/* we're in a format sequence */

			ch = *fmt++;

			/* Conversion flag characters */
			for (;; ch = *fmt++) {
				ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
				if (!ch_flag)
					break;
				flags |= ch_flag;
			}

			/* Width and precision */
			for (;; ch = *fmt++) {
				if (ch == '*') {
					precision = va_arg(args, unsigned int);
					ch = *fmt++;
				} else {
					for (precision = 0; ch >= '0' && ch <= '9'; ch = *fmt++)
						precision = precision * 10 + (ch - '0');
				}
				if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '.'))
					break;
				width = precision;
				if (ch != '.') {
					/* Default precision for strings */
					precision = INT_MAX;
					break;
				}
				flags |= _NOLIBC_PF_FLAG('.');
			}

			/* Length modifier.
			 * They miss the conversion flags characters " #+-0" so can go into flags.
			 * Change both L and ll to q.
			 */
			if (ch == 'L')
				ch = 'q';
			ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'l', 't', 'z', 'j', 'q');
			if (ch_flag != 0) {
				if (ch == 'l' && fmt[0] == 'l') {
					fmt++;
					ch_flag = _NOLIBC_PF_FLAG('q');
				}
				flags |= ch_flag;
				ch = *fmt++;
			}

			/* Conversion specifiers. */

			/* Numeric and pointer conversion specifiers.
			 *
			 * Use an explicit bound check (rather than _NOLIBC_PF_CHAR_IS_ONE_OF())
			 * so that 'X' can be allowed through.
			 * 'X' gets treated as 'x' because _NOLIBC_PF_FLAG() returns the same
			 * value for both.
			 */
			if ((ch < 'a' || ch > 'z') && ch != 'X')
				goto non_numeric_conversion;

			/* We need to check for "%p" or "%#x" later, merging here gives better code.
			 * But '#' collides with 'c' so shift right.
			 */
			ch_flag = _NOLIBC_PF_FLAG(ch) | (flags & _NOLIBC_PF_FLAG('#')) >> 1;
			if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c', 'd', 'i', 'u', 'x', 'p', 's')) {
				unsigned long long v;
				long long signed_v;
				char *out = tmpbuf + 32;
				int sign = 0;

				/* 'long' is needed for pointer/string conversions and ltz lengths.
				 * A single test can be used provided 'p' (the same bit as '0')
				 * is masked from flags.
				 */
				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag | (flags & ~_NOLIBC_PF_FLAG('p')),
							     'p', 's', 'l', 't', 'z')) {
					v = va_arg(args, unsigned long);
					signed_v = (long)v;
				} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, 'j', 'q')) {
					v = va_arg(args, unsigned long long);
					signed_v = v;
				} else {
					v = va_arg(args, unsigned int);
					signed_v = (int)v;
				}

				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'c')) {
					/* "%c" - single character. */
					tmpbuf[0] = v;
					len = 1;
					outstr = tmpbuf;
					goto do_output;
				}

				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 's')) {
					/* "%s" - character string. */
					if (!v) {
						outstr = "(null)";
						/* Match glibc, nothing output if precision too small */
						len = precision >= 6 ? 6 : 0;
						goto do_output;
					}
					outstr = (void *)v;
do_strnlen_output:
					len = strnlen(outstr, precision);
					goto do_output;
				}

				if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i')) {
					/* "%d" and "%i" - signed decimal numbers. */
					if (signed_v < 0) {
						sign = '-';
						v = -(signed_v + 1);
						v++;
					} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '+')) {
						sign = '+';
					} else if (_NOLIBC_PF_FLAGS_CONTAIN(flags, ' ')) {
						sign = ' ';
					}
				}

				if (v == 0) {
					/* There are special rules for zero. */
					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p')) {
						/* "%p" match glibc, precision is ignored */
						outstr = "(nil)";
						len = 5;
						goto do_output;
					}
					if (!precision) {
						/* Explicit %nn.0d, no digits output */
						len = 0;
						goto prepend_sign;
					}
					/* All formats (including "%#x") just output "0". */
					*out = '0';
					len = 1;
				} else {
					/* Convert the number to ascii in the required base. */
					if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'd', 'i', 'u')) {
						/* Base 10 */
						len = u64toa_r(v, out);
					} else {
						/* Base 16 */
						if (_NOLIBC_PF_FLAGS_CONTAIN(ch_flag, 'p', '#' - 1)) {
							/* "%p" and "%#x" need "0x" prepending. */
							sign = 'x' | '0' << 8;
						}
						len = u64toh_r(v, out);
					}
				}

				/* Add zero padding */
				if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '0', '.')) {
					if (!_NOLIBC_PF_FLAGS_CONTAIN(flags, '.')) {
						if (_NOLIBC_PF_FLAGS_CONTAIN(flags, '-'))
							/* Left justify overrides zero pad */
							goto prepend_sign;
						/* eg "%05d", Zero pad to field width less sign */
						precision = width;
						if (sign) {
							precision--;
							if (sign >= 256)
								precision--;
						}
					}
					if (precision > 30)
						/* Don't run off the start of tmpbuf[] */
						precision = 30;
					for (; len < precision; len++) {
						/* Stop gcc generating horrid code and memset().
						 * This is OPTIMIZER_HIDE_VAR() from compiler.h.
						 */
						__asm__ volatile("" : "=r"(len) : "0"(len));
						*--out = '0';
					}
				}

prepend_sign:
				/* Add 0, 1 or 2 ("0x") sign characters left of any zero padding */
				for (; sign; sign >>= 8) {
					len++;
					*--out = sign;
				}
				outstr = out;
				goto do_output;
			}

non_numeric_conversion:
			if (ch == 'm') {
#ifdef NOLIBC_IGNORE_ERRNO
				outstr = "unknown error";
				len = __builtin_strlen(outstr);
#else
				outstr = strerror(errno);
				goto do_strnlen_output;
#endif /* NOLIBC_IGNORE_ERRNO */
			} else {
				if (ch != '%') {
					/* Invalid format: back up to output the format characters */
					fmt = outstr + 1;
					/* and output a '%' now. */
				}
				/* %% is documented as a 'conversion specifier'.
				 * Any flags, precision or length modifier are ignored.
				 */
				len = 1;
				width = 0;
				outstr = fmt - 1;
			}
		}

do_output:
		written += len;

		/* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
		 * code into one of the conditionals above.
		 */
		__asm__ volatile("" : "=r"(len) : "0"(len));

		/* Output 'left pad', 'value' then 'right pad'. */
		width -= len;
		flags = _NOLIBC_PF_FLAGS_CONTAIN(flags, '-');
		if (flags && cb(state, outstr, len) != 0)
			return -1;
		while (width > 0) {
			int pad_len = ((width - 1) & 15) + 1;
			width -= pad_len;
			written += pad_len;
			if (cb(state, "                ", pad_len) != 0)
				return -1;
		}
		if (!flags && cb(state, outstr, len) != 0)
			return -1;
	}

	/* Flush/terminate any buffer. */
	if (cb(state, NULL, 0) != 0)
		return -1;

	return written;
}

struct __nolibc_fprintf_cb_state {
	FILE *stream;
	unsigned int buf_offset;
	char buf[128];
};

static int __nolibc_fprintf_cb(void *v_state, const char *buf, size_t size)
{
	struct __nolibc_fprintf_cb_state *state = v_state;
	unsigned int off = state->buf_offset;

	if (off + size > sizeof(state->buf) || buf == NULL) {
		state->buf_offset = 0;
		if (off && _fwrite(state->buf, off, state->stream))
			return -1;
		if (size > sizeof(state->buf))
			return _fwrite(buf, size, state->stream);
		off = 0;
	}

	if (size) {
		state->buf_offset = off + size;
		memcpy(state->buf + off, buf, size);
	}
	return 0;
}

...

struct __nolibc_sprintf_cb_state {
	char *buf;
	size_t size;
};

static int __nolibc_sprintf_cb(void *v_state, const char *buf, size_t size)
{
	struct __nolibc_sprintf_cb_state *state = v_state;
	char *tgt;

	if (size >= state->size) {
		if (state->size <= 1)
			return 0;
		size = state->size - 1;
	}
	tgt = state->buf;
	if (size) {
		state->size -= size;
		state->buf = tgt + size;
		memcpy(tgt, buf, size);
	} else {
		/* In particular from cb(NULL, 0) at the end of __nolibc_printf(). */
		*tgt = '\0';
	}
	return 0;
}
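
For anyone who wants to poke at the flag-bit scheme outside nolibc, a minimal standalone sketch follows. The DEMO_PF_* macros are renamed copies of the _NOLIBC_PF_* ones above; the helpers conv_flag_bit() and shares_bit() are hypothetical and exist only for this illustration.

```c
/* Renamed copies of the _NOLIBC_PF_* macros from the code above. */
#define DEMO_PF_FLAG(ch) (1u << ((ch) & 0x1f))
#define DEMO_PF_FLAG_NZ(ch) ((ch) ? DEMO_PF_FLAG(ch) : 0)
#define DEMO_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
	(DEMO_PF_FLAG_NZ(cmp_1) | DEMO_PF_FLAG_NZ(cmp_2) | \
	 DEMO_PF_FLAG_NZ(cmp_3) | DEMO_PF_FLAG_NZ(cmp_4) | \
	 DEMO_PF_FLAG_NZ(cmp_5) | DEMO_PF_FLAG_NZ(cmp_6) | \
	 DEMO_PF_FLAG_NZ(cmp_7) | DEMO_PF_FLAG_NZ(cmp_8))
#define DEMO_PF_FLAGS_CONTAIN(flags, ...) \
	((flags) & DEMO_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
#define DEMO_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
	(ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
		DEMO_PF_FLAGS_CONTAIN(DEMO_PF_FLAG(ch), cmp_1, __VA_ARGS__))

/* Hypothetical helper: flag bit for a conversion flag character, else 0. */
static unsigned int conv_flag_bit(int ch)
{
	return DEMO_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
}

/* Hypothetical helper: non-zero if both characters map to the same bit. */
static int shares_bit(int a, int b)
{
	return DEMO_PF_FLAG(a) == DEMO_PF_FLAG(b);
}
```

With these, shares_bit('x', 'X'), shares_bit('#', 'c') and shares_bit('p', '0') are all true, which is why the code above can treat %X as %x for free but has to shift the '#' bit and mask the '0' bit before merging them with the conversion character.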
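
The padding loop at the end of __nolibc_printf() can also be tried on its own. The sketch below assumes a 16-space pad string, which is the most the ((width - 1) & 15) + 1 expression can ask for; struct demo_sink, demo_sink_cb() and demo_pad() are hypothetical names used only here.

```c
#include <stddef.h>
#include <string.h>

typedef int (*demo_cb)(void *state, const char *buf, size_t size);

/* Hypothetical sink: appends to a fixed buffer and counts the calls. */
struct demo_sink {
	char buf[128];
	size_t used;
	int calls;
};

static int demo_sink_cb(void *v_state, const char *buf, size_t size)
{
	struct demo_sink *s = v_state;

	if (s->used + size > sizeof(s->buf))
		return -1;
	memcpy(s->buf + s->used, buf, size);
	s->used += size;
	s->calls++;
	return 0;
}

/* Emit 'width' spaces in chunks of at most 16; the first chunk takes
 * the remainder so that every later chunk is exactly 16 bytes.
 */
static int demo_pad(demo_cb cb, void *state, int width)
{
	int written = 0;

	while (width > 0) {
		int pad_len = ((width - 1) & 15) + 1;

		width -= pad_len;
		written += pad_len;
		if (cb(state, "                ", pad_len) != 0)
			return -1;
	}
	return written;
}
```

A width of 37 ends up as three callbacks of 5, 16 and 16 bytes, so the number of callbacks stays small without needing a pad buffer as wide as the field.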
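
The two-step negation used for "%d" is there to avoid signed overflow on the most negative value. A small sketch of just that step, with the hypothetical helper name demo_magnitude():

```c
#include <limits.h>

/* Hypothetical helper: magnitude of a signed value as an unsigned one. */
static unsigned long long demo_magnitude(long long signed_v)
{
	unsigned long long v;

	if (signed_v >= 0)
		return (unsigned long long)signed_v;

	/* -LLONG_MIN would overflow, but -(LLONG_MIN + 1) is LLONG_MAX,
	 * and the final increment is done on the unsigned value.
	 */
	v = -(signed_v + 1);
	v++;
	return v;
}
```

demo_magnitude(LLONG_MIN) then gives LLONG_MAX + 1 as an unsigned long long, which is the value u64toa_r() needs to see.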