Re: [PATCH next 06/12] tools/nolibc/printf: Add support for left alignment and %[tzLq]d"

All of lore.kernel.org
 help / color / mirror / Atom feed

From: David Laight <david.laight.linux@gmail.com>
To: Willy Tarreau <w@1wt.eu>
Cc: "Thomas Weißschuh" <linux@weissschuh.net>,
	linux-kernel@vger.kernel.org, "Cheng Li" <lechain@gmail.com>
Subject: Re: [PATCH next 06/12] tools/nolibc/printf: Add support for left alignment and %[tzLq]d"
Date: Wed, 4 Feb 2026 10:17:05 +0000	[thread overview]
Message-ID: <20260204101705.11d2d99b@pumpkin> (raw)
In-Reply-To: <aYLHuzt5AGO6h2yW@1wt.eu>

On Wed, 4 Feb 2026 05:14:51 +0100
Willy Tarreau <w@1wt.eu> wrote:

> Hi David,
> 
> On Tue, Feb 03, 2026 at 10:29:54AM +0000, david.laight.linux@gmail.com wrote:
> > From: David Laight <david.laight.linux@gmail.com>
> > 
> > Use a single 'flags' variable to hold both format flags and length modifiers.
> > Use (1u << (c & 31)) for the flag bits to reduce code complexity.
> > 
> > Add support for left justifying fields.
> > 
> > Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
> > (both long long).
> > 
> > Unconditionall generate the signed values (for %d) to remove a second
> > set of checks for the size.
> > 
> > Use 'signed int' for the lengths to make the pad test simpler.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> >  tools/include/nolibc/stdio.h | 88 ++++++++++++++++++++----------------
> >  1 file changed, 50 insertions(+), 38 deletions(-)
> > 
> > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > index 164d2384978e..1ce4d357a802 100644
> > --- a/tools/include/nolibc/stdio.h
> > +++ b/tools/include/nolibc/stdio.h
> > @@ -240,20 +240,20 @@ char *fgets(char *s, int size, FILE *stream)
> >  }
> >  
> >  
> > -/* minimal printf(). It supports the following formats:
> > - *  - %[l*]{d,u,c,x,p}
> > - *  - %s
> > - *  - unknown modifiers are ignored.
> > +/* simple printf(). It supports the following formats:
> > + *  - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m}
> > + *  - %%
> > + *  - invalid formats are copied to the output buffer
> >   */
> >  typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
> >  
> > +#define __PF_FLAG(c) (1u << ((c) & 0x1f))  
> 
> This flag will be exposed to user code, you'll have to previx it with
> _NOLIBC_.

The lines are long enough already, something shorter would be ideal.
The #undef at the bottom stops it being exposed.

> 
> >  static __attribute__((unused, format(printf, 3, 0)))
> >  int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
> >  {
> > -	char lpref, c;
> > -	unsigned long long v;
> > -	unsigned int written, width;
> > -	size_t len;
> > +	char c;
> > +	int len, written, width;
> > +	unsigned int flags;
> >  	char tmpbuf[21];
> >  	const char *outstr;
> >  
> > @@ -265,6 +265,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  			break;
> >  
> >  		width = 0;
> > +		flags = 0;
> >  		if (c != '%') {
> >  			while (*fmt && *fmt != '%')
> >  				fmt++;
> > @@ -274,6 +275,13 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  
> >  			c = *fmt++;
> >  
> > +			/* Flag characters */
> > +			for (; c >= 0x20 && c <= 0x3f; c = *fmt++) {
> > +				if ((__PF_FLAG(c) & (__PF_FLAG('-'))) == 0)
> > +					break;
> > +				flags |= __PF_FLAG(c);
> > +			}  
> 
> Honestly I don't find that it improves readability here and makes one
> keep doubts in background as "what if c == 'm' which will match '-'?".
> I think that one would better be written as the usual:
> 
> 			for (; c >= 0x20 && c <= 0x3f; c = *fmt++) {
> 				if (c == '-')
> 					break;
> 				flags |= __PF_FLAG(c);
> 			}
> 
> Or even simpler since there's already a condition in the for() loop:
> 
> 			for (; c >= 0x20 && c != '-' && c <= 0x3f; c = *fmt++)
> 				flags |= __PF_FLAG(c);

It is all written that way for when more flags get added - look at the later
patches which check for any of "#-+ 0".
At that point the line get long and unreadable - as below :-)

I didn't want to add the flags here before supporting them later.
But they could all be accepted and ignored until implemented.
That might be better anyway.

Actually it might be worth s/c/ch/ to make the brain see the difference
between c and 'c' more easily.

Perhaps I'm expand the comment a bit.
It is all a hint as to what is happening later on with the character tests.

> 
> >  			/* width */
> >  			while (c >= '0' && c <= '9') {
> >  				width *= 10;
> > @@ -282,41 +290,34 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  				c = *fmt++;
> >  			}
> >  
> > -			/* Length modifiers */
> > -			if (c == 'l') {
> > -				lpref = 1;
> > -				c = *fmt++;
> > -				if (c == 'l') {
> > -					lpref = 2;
> > +			/* Length modifiers are lower case except 'L' which is the same a 'q' */
> > +			if ((c >= 'a' && c <= 'z') || (c == 'L' && (c = 'q'))) {  
> 
> Then please say "... which we replace by 'q'" so that we don't first read
> that (c = 'q') as a possible typo. Also maybe add a mention to the fact
> that "flags" will exclusively represent lowercase modifiers from now on?

I'll clarify it.

> 
> > +				if (__PF_FLAG(c) & (__PF_FLAG('l') | __PF_FLAG('t') | __PF_FLAG('z') |
> > +						    __PF_FLAG('j') | __PF_FLAG('q'))) {  
> 
> Even though I understand the value in checking bit positions (I use that
> all the time as well), above this is just unreadable. Maybe you need a
> different macro, maybe define another macro _NOLIBC_PF_LEN_MOD made of
> the addition of all the flags to test against, I don't know, but the
> construct, the line break in the middle of the expression and the
> parenthesis needed for the macro just requires a lot of effort to
> understand what's being tested. Alternately another possibility would
> be to have another macro taking 4-5 char args and composing the flags
> in one call, passing 0 or -1 for unused ones. This would also make
> several parenthesis disappear which would help.

Hmmm, some macro magic might work, loosely:
#define FLNZ(q) (q ? 1 << (q & 31) ? 0)
#define FLM3(q1, q2, q3, ...) ((FLNZ(q1) | FLNZ(q2) | FLNZ(q3))
#define FLT(fl, ...) (fl & FLM3(__VA_ARGS__, 0, 0, 0))
#define CT(c, ...) FLT(1 << (c & 31), __VA_ARGS__) 
Then the above would be:
	if (CT(c, 'l', 't', 'z', 'j', 'q')) {
Clearly needs some better and longer names and a big comment block.

> 
> > +					if (c == 'l' && fmt[0] == 'l') {
> > +						fmt++;
> > +						c = 'q';
> > +					}
> > +					/* These all miss "# -0+" */
> > +					flags |= __PF_FLAG(c);
> >  					c = *fmt++;
> >  				}
> > -			} else if (c == 'j') {
> > -				/* intmax_t is long long */
> > -				lpref = 2;
> > -				c = *fmt++;
> > -			} else {
> > -				lpref = 0;
> >  			}
> >  
> >  			if (c == 'c' || c == 'd' || c == 'u' || c == 'x' || c == 'p') {
> > +				unsigned long long v;
> > +				long long signed_v;
> >  				char *out = tmpbuf;
> >  
> > -				if (c == 'p')
> > +				if ((c == 'p') || (flags & (__PF_FLAG('l') | __PF_FLAG('t') | __PF_FLAG('z')))) {
> >  					v = va_arg(args, unsigned long);
> > -				else if (lpref) {
> > -					if (lpref > 1)
> > -						v = va_arg(args, unsigned long long);
> > -					else
> > -						v = va_arg(args, unsigned long);
> > -				} else
> > +					signed_v = (long)v;
> > +				} else if (flags & (__PF_FLAG('j') | __PF_FLAG('q'))) {
> > +					v = va_arg(args, unsigned long long);
> > +					signed_v = v;
> > +				} else {
> >  					v = va_arg(args, unsigned int);
> > -
> > -				if (c == 'd') {
> > -					/* sign-extend the value */
> > -					if (lpref == 0)
> > -						v = (long long)(int)v;
> > -					else if (lpref == 1)
> > -						v = (long long)(long)v;
> > +					signed_v = (int)v;
> >  				}
> >  
> >  				switch (c) {
> > @@ -325,7 +326,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  					out[1] = 0;
> >  					break;
> >  				case 'd':
> > -					i64toa_r(v, out);
> > +					i64toa_r(signed_v, out);
> >  					break;
> >  				case 'u':
> >  					u64toa_r(v, out);
> > @@ -366,14 +367,24 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  
> >  		written += len;
> >  
> > -		while (width > len) {
> > -			unsigned int pad_len = ((width - len - 1) & 15) + 1;
> > +                /* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
> > +                 * code into one of the conditionals above.  
> 
> Be careful, space indentation above.
> 
> > +		 */
> > +                __asm__ volatile("" : "=r"(len) : "0"(len));  
> 
> and here.

Oops...

> 
> > +
> > +		/* Output 'left pad', 'value' then 'right pad'. */
> > +		flags &= __PF_FLAG('-');
> > +		width -= len;
> > +		if (flags && cb(state, outstr, len) != 0)
> > +			return -1;
> > +		while (width > 0) {
> > +			int pad_len = ((width - 1) & 15) + 1;
> >  			width -= pad_len;
> >  			written += pad_len;
> >  			if (cb(state, "                ", pad_len) != 0)
> >  				return -1;
> >  		}
> > -		if (cb(state, outstr, len) != 0)
> > +		if (!flags && cb(state, outstr, len) != 0)
> >  			return -1;
> >  	}
> >  
> > @@ -382,6 +393,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> >  
> >  	return written;
> >  }
> > +#undef _PF_FLAG  
> 
> This one can be dropped once named as _NOLIBC_xxx

I'll see if I can get a max of 2 expansions on a line.
Otherwise the lines get horribly long.

	David

> 
> Willy

next prev parent reply	other threads:[~2026-02-04 10:17 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-03 10:29 [PATCH next 00/12] tools/nolibc: Enhance printf() david.laight.linux
2026-02-03 10:29 ` [PATCH next 01/12] tools/nolibc/printf: Move length check to snprintf callback david.laight.linux
2026-02-03 10:29 ` [PATCH next 02/12] tools/nolibc/printf: Add buffering to vfprintf() callback david.laight.linux
2026-02-03 10:29 ` [PATCH next 03/12] tools/nolibc/printf: output pad spaces in 16 byte chunks david.laight.linux
2026-02-03 10:29 ` [PATCH next 04/12] selftests/nolibc: Improve reporting of vfprintf() errors david.laight.linux
2026-02-03 10:29 ` [PATCH next 05/12] tools/nolibc/printf: Simplify __nolibc_printf() david.laight.linux
2026-02-03 10:29 ` [PATCH next 06/12] tools/nolibc/printf: Add support for left alignment and %[tzLq]d" david.laight.linux
2026-02-04  4:14   ` Willy Tarreau
2026-02-04 10:17     ` David Laight [this message]
2026-02-04 10:40       ` Willy Tarreau
2026-02-04 15:39         ` David Laight
2026-02-03 10:29 ` [PATCH next 07/12] tools/nolibc/printf: Prepend the sign after a numeric conversion david.laight.linux
2026-02-03 10:29 ` [PATCH next 08/12] tools/nolibc/printf: use bit-match to detect valid conversion characters david.laight.linux
2026-02-03 10:29 ` [PATCH next 09/12] tools/nolibc/printf: support precision and zero padding david.laight.linux
2026-02-03 10:29 ` [PATCH next 10/12] tools/nolibc/printf: Use bit-pattern for integral formats david.laight.linux
2026-02-03 10:29 ` [PATCH 11/12] selftests/nolibc: Increase coverage of printf format tests david.laight.linux
2026-02-03 18:22   ` kernel test robot
2026-02-03 22:05     ` David Laight
2026-02-03 23:23   ` kernel test robot
2026-02-03 10:30 ` [PATCH 12/12] selftests/nolibc: Use printf("%.*s", n, "") to align output david.laight.linux
2026-02-03 17:39 ` [PATCH next 00/12] tools/nolibc: Enhance printf() David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260204101705.11d2d99b@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=lechain@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@weissschuh.net \
    --cc=w@1wt.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.