BPF List
 help / color / mirror / Atom feed
* Re: [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer
       [not found]                       ` <CAHk-=wg9x1Xt2cmiBbCz5XTppDQ=RNkjkmegwaF6=QghG6kBtA@mail.gmail.com>
@ 2024-12-18  0:47                         ` Alexei Starovoitov
  2024-12-18  1:26                           ` Linus Torvalds
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2024-12-18  0:47 UTC (permalink / raw)
  To: Linus Torvalds, Florent Revest, Daniel Borkmann, bpf
  Cc: Steven Rostedt, Alexei Starovoitov, LKML, linux-trace-kernel,
	Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
	stable

On Tue, Dec 17, 2024 at 3:32 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> If the *only* thing that accesses that word array is vbin_printf and
> bstr_printf, then I could just change the packing to act like va_list
> does (ie the word array would *actually* be a word array, and char and
> short values would get individual words).
>
> But at least the bpf cde seems to know about the crazy packing, and
> also does that
>
>         tmp_buf = PTR_ALIGN(tmp_buf, sizeof(u32));
>
> in bpf_bprintf_prepare() because it knows that it's not *actually* an
> array of words despite it being documented as such.
>
> Of course, the bpf code only does the packed access thing for '%c',
> and doesn't seem to know that the same is true of '%hd' and '%hhd',
> presumably because nobody actually uses that.
>
> Let's add Alexei to the participants. I think bpf too would actually
> prefer that the odd char/short packing *not* be done, if only because
> it clearly does the wrong thing as-is for non-%c argument (ie those
> %hd/%hhd cases).

We reject %hd case as EINVAL and do byte copy for %c.
All that was done as part of
commit 48cac3f4a96d ("bpf: Implement formatted output helpers with bstr_printf")
that cleaned things up greatly.
The byte copy for %c wasn't deliberate to save space.
Just happen to work with bstr_printf().
We can totally switch to u32 if that's the direction for bstr_printf.
To handle %s we use bpf_trace_copy_string(tmp_buf, )
which does _nofault() underneath.
Since the tmp_buf is byte packed because of strings the %c case
just adds a byte too. Strings and %c can be made u32 aligned.

Since we're on this topic, Daniel is looking to reuse format_decode()
in bpf_bprintf_prepare() to get rid of our manual format validation.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer
  2024-12-18  0:47                         ` [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer Alexei Starovoitov
@ 2024-12-18  1:26                           ` Linus Torvalds
  2024-12-18  1:32                             ` [PATCH] vsprintf: simplify number handling Linus Torvalds
  2024-12-18  1:39                             ` [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer Linus Torvalds
  0 siblings, 2 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-12-18  1:26 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Florent Revest, Daniel Borkmann, bpf, Steven Rostedt,
	Alexei Starovoitov, LKML, linux-trace-kernel, Masami Hiramatsu,
	Mark Rutland, Mathieu Desnoyers, Andrew Morton, stable

On Tue, 17 Dec 2024 at 16:47, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> Since we're on this topic, Daniel is looking to reuse format_decode()
> in bpf_bprintf_prepare() to get rid of our manual format validation.

That was literally why I started looking into this - the many separate
type formats actually end up causing format_decode() (and the callers)
to have to generate multiple different cases, which then in turn
either cause a jump table, or - more commonly due to the CPU indirect
branch mitigations - a chain of conditionals that are fairly ugly.

Compressing the state table for the types from 11 down to 4 types
helps a bit, but then also dealing with the "smaller than int" things
as just 'int' (with the formatting flags that are separate) also ends
up avoiding some unnecessary and extra cases.

Because in the end, 'size_t' and 'long' are the same thing, even on
architectures like 32-bit x86 where 'size_t' really is 'unsigned int'
- simply because the only thing that matters for fetching the value is
the size, which is 32-bit.

(The whole "is it signed" and the truncation to smaller-than-int etc
is then something we have to handle anyway in by the 'printf_spec'
thing).

So I have a patch series to clean some of this up and avoid the extra
states. I'm not entirely happy with it, though, and I've been going
back and forth on some of the code, so I'm not ready to post it or
have anybody use it as a basis for some "real" cleanups.

I guess I could at least post the "turn 11 different types into 4"
part. I have other things in there, but that part seems fairly
unambiguously good.

Let me go separate that part out and maybe people can point out where
I've done something silly.

               Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] vsprintf: simplify number handling
  2024-12-18  1:26                           ` Linus Torvalds
@ 2024-12-18  1:32                             ` Linus Torvalds
  2024-12-18  9:19                               ` Rasmus Villemoes
                                                 ` (2 more replies)
  2024-12-18  1:39                             ` [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer Linus Torvalds
  1 sibling, 3 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-12-18  1:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Florent Revest,
	Steven Rostedt
  Cc: LKML, bpf, Linus Torvalds

Instead of dealing with all the different special types (size_t,
unsigned char, ptrdiff_t..) just deal with the size of the integer type
and the sign.

This avoids a lot of unnecessary case statements, and the games we play
with the value of the 'SIGN' flags value

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

NOTE! Only very lightly tested.  Also meant to be purely preparatory. 
It might be broken. 

I started doing this in the hope that the vsnprintf() core code could
maybe be further cleaned up enough that it would actually be something
we could use more generally and export to other users that want to
basically do the printk format handling. 

The code is *not* there yet, though. Small steps.

 lib/vsprintf.c | 139 +++++++++++++------------------------------------
 1 file changed, 37 insertions(+), 102 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 9d3dac38a3f4..ad57b43bb9ab 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -407,7 +407,7 @@ int num_to_str(char *buf, int size, unsigned long long num, unsigned int width)
 	return len + width;
 }
 
-#define SIGN	1		/* unsigned/signed, must be 1 */
+#define SIGN	1		/* unsigned/signed */
 #define LEFT	2		/* left justified */
 #define PLUS	4		/* show plus */
 #define SPACE	8		/* space if plus */
@@ -415,12 +415,15 @@ int num_to_str(char *buf, int size, unsigned long long num, unsigned int width)
 #define SMALL	32		/* use lowercase in hex (must be 32 == 0x20) */
 #define SPECIAL	64		/* prefix hex with "0x", octal with "0" */
 
-static_assert(SIGN == 1);
 static_assert(ZEROPAD == ('0' - ' '));
 static_assert(SMALL == ('a' ^ 'A'));
 
 enum format_type {
 	FORMAT_TYPE_NONE, /* Just a string part */
+	FORMAT_TYPE_1BYTE = 1, /* char/short/int are their own sizes */
+	FORMAT_TYPE_2BYTE = 2,
+	FORMAT_TYPE_8BYTE = 3,
+	FORMAT_TYPE_4BYTE = 4,
 	FORMAT_TYPE_WIDTH,
 	FORMAT_TYPE_PRECISION,
 	FORMAT_TYPE_CHAR,
@@ -428,19 +431,10 @@ enum format_type {
 	FORMAT_TYPE_PTR,
 	FORMAT_TYPE_PERCENT_CHAR,
 	FORMAT_TYPE_INVALID,
-	FORMAT_TYPE_LONG_LONG,
-	FORMAT_TYPE_ULONG,
-	FORMAT_TYPE_LONG,
-	FORMAT_TYPE_UBYTE,
-	FORMAT_TYPE_BYTE,
-	FORMAT_TYPE_USHORT,
-	FORMAT_TYPE_SHORT,
-	FORMAT_TYPE_UINT,
-	FORMAT_TYPE_INT,
-	FORMAT_TYPE_SIZE_T,
-	FORMAT_TYPE_PTRDIFF
 };
 
+#define FORMAT_TYPE_SIZE(type) (sizeof(type) <= 4 ? sizeof(type) : FORMAT_TYPE_8BYTE)
+
 struct printf_spec {
 	unsigned int	type:8;		/* format_type enum */
 	signed int	field_width:24;	/* width of output field */
@@ -2707,23 +2701,19 @@ int format_decode(const char *fmt, struct printf_spec *spec)
 	}
 
 	if (qualifier == 'L')
-		spec->type = FORMAT_TYPE_LONG_LONG;
+		spec->type = FORMAT_TYPE_SIZE(long long);
 	else if (qualifier == 'l') {
-		BUILD_BUG_ON(FORMAT_TYPE_ULONG + SIGN != FORMAT_TYPE_LONG);
-		spec->type = FORMAT_TYPE_ULONG + (spec->flags & SIGN);
+		spec->type = FORMAT_TYPE_SIZE(long);
 	} else if (qualifier == 'z') {
-		spec->type = FORMAT_TYPE_SIZE_T;
+		spec->type = FORMAT_TYPE_SIZE(size_t);
 	} else if (qualifier == 't') {
-		spec->type = FORMAT_TYPE_PTRDIFF;
+		spec->type = FORMAT_TYPE_SIZE(ptrdiff_t);
 	} else if (qualifier == 'H') {
-		BUILD_BUG_ON(FORMAT_TYPE_UBYTE + SIGN != FORMAT_TYPE_BYTE);
-		spec->type = FORMAT_TYPE_UBYTE + (spec->flags & SIGN);
+		spec->type = FORMAT_TYPE_SIZE(char);
 	} else if (qualifier == 'h') {
-		BUILD_BUG_ON(FORMAT_TYPE_USHORT + SIGN != FORMAT_TYPE_SHORT);
-		spec->type = FORMAT_TYPE_USHORT + (spec->flags & SIGN);
+		spec->type = FORMAT_TYPE_SIZE(short);
 	} else {
-		BUILD_BUG_ON(FORMAT_TYPE_UINT + SIGN != FORMAT_TYPE_INT);
-		spec->type = FORMAT_TYPE_UINT + (spec->flags & SIGN);
+		spec->type = FORMAT_TYPE_SIZE(int);
 	}
 
 	return ++fmt - start;
@@ -2747,6 +2737,17 @@ set_precision(struct printf_spec *spec, int prec)
 	}
 }
 
+/* Turn a 1/2/4-byte value into a 64-bit one with sign handling */
+static unsigned long long get_num(unsigned int val, struct printf_spec spec)
+{
+	unsigned int shift = 32 - spec.type*8;
+
+	val <<= shift;
+	if (!(spec.flags & SIGN))
+		return val >> shift;
+	return (int)val >> shift;
+}
+
 /**
  * vsnprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
@@ -2873,43 +2874,10 @@ int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
 			goto out;
 
 		default:
-			switch (spec.type) {
-			case FORMAT_TYPE_LONG_LONG:
+			if (spec.type == FORMAT_TYPE_8BYTE)
 				num = va_arg(args, long long);
-				break;
-			case FORMAT_TYPE_ULONG:
-				num = va_arg(args, unsigned long);
-				break;
-			case FORMAT_TYPE_LONG:
-				num = va_arg(args, long);
-				break;
-			case FORMAT_TYPE_SIZE_T:
-				if (spec.flags & SIGN)
-					num = va_arg(args, ssize_t);
-				else
-					num = va_arg(args, size_t);
-				break;
-			case FORMAT_TYPE_PTRDIFF:
-				num = va_arg(args, ptrdiff_t);
-				break;
-			case FORMAT_TYPE_UBYTE:
-				num = (unsigned char) va_arg(args, int);
-				break;
-			case FORMAT_TYPE_BYTE:
-				num = (signed char) va_arg(args, int);
-				break;
-			case FORMAT_TYPE_USHORT:
-				num = (unsigned short) va_arg(args, int);
-				break;
-			case FORMAT_TYPE_SHORT:
-				num = (short) va_arg(args, int);
-				break;
-			case FORMAT_TYPE_INT:
-				num = (int) va_arg(args, int);
-				break;
-			default:
-				num = va_arg(args, unsigned int);
-			}
+			else
+				num = get_num(va_arg(args, int), spec);
 
 			str = number(str, end, num, spec);
 		}
@@ -3183,26 +3151,13 @@ int vbin_printf(u32 *bin_buf, size_t size, const char *fmt, va_list args)
 
 		default:
 			switch (spec.type) {
-
-			case FORMAT_TYPE_LONG_LONG:
+			case FORMAT_TYPE_8BYTE:
 				save_arg(long long);
 				break;
-			case FORMAT_TYPE_ULONG:
-			case FORMAT_TYPE_LONG:
-				save_arg(unsigned long);
-				break;
-			case FORMAT_TYPE_SIZE_T:
-				save_arg(size_t);
-				break;
-			case FORMAT_TYPE_PTRDIFF:
-				save_arg(ptrdiff_t);
-				break;
-			case FORMAT_TYPE_UBYTE:
-			case FORMAT_TYPE_BYTE:
+			case FORMAT_TYPE_1BYTE:
 				save_arg(char);
 				break;
-			case FORMAT_TYPE_USHORT:
-			case FORMAT_TYPE_SHORT:
+			case FORMAT_TYPE_2BYTE:
 				save_arg(short);
 				break;
 			default:
@@ -3375,37 +3330,17 @@ int bstr_printf(char *buf, size_t size, const char *fmt, const u32 *bin_buf)
 			unsigned long long num;
 
 			switch (spec.type) {
-
-			case FORMAT_TYPE_LONG_LONG:
+			case FORMAT_TYPE_8BYTE:
 				num = get_arg(long long);
 				break;
-			case FORMAT_TYPE_ULONG:
-			case FORMAT_TYPE_LONG:
-				num = get_arg(unsigned long);
+			case FORMAT_TYPE_2BYTE:
+				num = get_num(get_arg(short), spec);
 				break;
-			case FORMAT_TYPE_SIZE_T:
-				num = get_arg(size_t);
-				break;
-			case FORMAT_TYPE_PTRDIFF:
-				num = get_arg(ptrdiff_t);
-				break;
-			case FORMAT_TYPE_UBYTE:
-				num = get_arg(unsigned char);
-				break;
-			case FORMAT_TYPE_BYTE:
-				num = get_arg(signed char);
-				break;
-			case FORMAT_TYPE_USHORT:
-				num = get_arg(unsigned short);
-				break;
-			case FORMAT_TYPE_SHORT:
-				num = get_arg(short);
-				break;
-			case FORMAT_TYPE_UINT:
-				num = get_arg(unsigned int);
+			case FORMAT_TYPE_1BYTE:
+				num = get_num(get_arg(char), spec);
 				break;
 			default:
-				num = get_arg(int);
+				num = get_num(get_arg(int), spec);
 			}
 
 			str = number(str, end, num, spec);
-- 
2.47.1.402.g41f6bc43b7


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer
  2024-12-18  1:26                           ` Linus Torvalds
  2024-12-18  1:32                             ` [PATCH] vsprintf: simplify number handling Linus Torvalds
@ 2024-12-18  1:39                             ` Linus Torvalds
  2024-12-18  1:53                               ` Alexei Starovoitov
  1 sibling, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2024-12-18  1:39 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Florent Revest, Daniel Borkmann, bpf, Steven Rostedt,
	Alexei Starovoitov, LKML, linux-trace-kernel, Masami Hiramatsu,
	Mark Rutland, Mathieu Desnoyers, Andrew Morton, stable

On Tue, 17 Dec 2024 at 17:26, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Let me go separate that part out and maybe people can point out where
> I've done something silly.

Ok, that part I had actually already locally separated out better than
some of my later patches in the series, so I sent it out as

  https://lore.kernel.org/all/20241218013620.1679088-1-torvalds@linux-foundation.org/

but I'm not guaranteeing it's right. Consider it a WIP thing, and only
a first step.

           Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer
  2024-12-18  1:39                             ` [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer Linus Torvalds
@ 2024-12-18  1:53                               ` Alexei Starovoitov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2024-12-18  1:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Florent Revest, Daniel Borkmann, bpf, Steven Rostedt,
	Alexei Starovoitov, LKML, linux-trace-kernel, Masami Hiramatsu,
	Mark Rutland, Mathieu Desnoyers, Andrew Morton, stable

On Tue, Dec 17, 2024 at 5:39 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Tue, 17 Dec 2024 at 17:26, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Let me go separate that part out and maybe people can point out where
> > I've done something silly.
>
> Ok, that part I had actually already locally separated out better than
> some of my later patches in the series, so I sent it out as
>
>   https://lore.kernel.org/all/20241218013620.1679088-1-torvalds@linux-foundation.org/

lgtm.
Since bstr_printf() converts 1/2/4 to unsigned long long num
with a sign according to the format specifier it's good
from the calling convention perspective.
Doesn't matter here, but anyone passing 32-bit ints around
needs to be aware of odd riscv abi promotion rules.
x86-64 and arm64 zero extend 32-bit ints while riscv does sign
extension when s32 and u32 are passed.
Quote from the spec:
"In RV64, 32-bit types, such as int, are stored in integer registers
as proper sign extensions of their 32-bit values; that is, bits 63..31
are all equal. This restriction holds even for unsigned 32-bit types."

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vsprintf: simplify number handling
  2024-12-18  1:32                             ` [PATCH] vsprintf: simplify number handling Linus Torvalds
@ 2024-12-18  9:19                               ` Rasmus Villemoes
  2024-12-18 15:32                               ` Steven Rostedt
  2024-12-23 20:11                               ` Linus Torvalds
  2 siblings, 0 replies; 11+ messages in thread
From: Rasmus Villemoes @ 2024-12-18  9:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alexei Starovoitov, Daniel Borkmann, Florent Revest,
	Steven Rostedt, LKML, bpf

On Tue, Dec 17 2024, Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Instead of dealing with all the different special types (size_t,
> unsigned char, ptrdiff_t..) just deal with the size of the integer type
> and the sign.
>
> This avoids a lot of unnecessary case statements, and the games we play
> with the value of the 'SIGN' flags value
>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> ---
>
> NOTE! Only very lightly tested.  Also meant to be purely preparatory. 
> It might be broken. 
>
> I started doing this in the hope that the vsnprintf() core code could
> maybe be further cleaned up enough that it would actually be something
> we could use more generally and export to other users that want to
> basically do the printk format handling. 
>
> The code is *not* there yet, though. Small steps.
>

> +/* Turn a 1/2/4-byte value into a 64-bit one with sign handling */
> +static unsigned long long get_num(unsigned int val, struct printf_spec spec)
> +{
> +	unsigned int shift = 32 - spec.type*8;
> +
> +	val <<= shift;
> +	if (!(spec.flags & SIGN))
> +		return val >> shift;
> +	return (int)val >> shift;
> +}
> +

I think this needs to be a little more explicit that it's not just about
handling sign extension, but also truncation to the desired
width. Otherwise somebody will wonder why this isn't

  if (!(spec.flags & SIGN))
    return val;
  return (int)(val << shift) >> shift;

[with the latter line duplicating our sign_extend32 helper].

The rest looks good.

Rasmus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vsprintf: simplify number handling
  2024-12-18  1:32                             ` [PATCH] vsprintf: simplify number handling Linus Torvalds
  2024-12-18  9:19                               ` Rasmus Villemoes
@ 2024-12-18 15:32                               ` Steven Rostedt
  2024-12-18 17:32                                 ` Linus Torvalds
  2024-12-23 20:11                               ` Linus Torvalds
  2 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2024-12-18 15:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alexei Starovoitov, Daniel Borkmann, Florent Revest, LKML, bpf

On Tue, 17 Dec 2024 17:32:09 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> @@ -2747,6 +2737,17 @@ set_precision(struct printf_spec *spec, int prec)
>  	}
>  }
>  
> +/* Turn a 1/2/4-byte value into a 64-bit one with sign handling */
> +static unsigned long long get_num(unsigned int val, struct printf_spec spec)
> +{
> +	unsigned int shift = 32 - spec.type*8;
> +
> +	val <<= shift;
> +	if (!(spec.flags & SIGN))
> +		return val >> shift;
> +	return (int)val >> shift;
> +}

I was about to make the same comment that Rasmus said about not bothering
with the shift if it is not signed, but I won't repeat that.

> +
>  /**
>   * vsnprintf - Format a string and place it in a buffer
>   * @buf: The buffer to place the result into
> @@ -2873,43 +2874,10 @@ int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
>  			goto out;
>  
>  		default:
> -			switch (spec.type) {
> -			case FORMAT_TYPE_LONG_LONG:
> +			if (spec.type == FORMAT_TYPE_8BYTE)
>  				num = va_arg(args, long long);
> -				break;
> -			case FORMAT_TYPE_ULONG:
> -				num = va_arg(args, unsigned long);
> -				break;
> -			case FORMAT_TYPE_LONG:
> -				num = va_arg(args, long);
> -				break;
> -			case FORMAT_TYPE_SIZE_T:
> -				if (spec.flags & SIGN)
> -					num = va_arg(args, ssize_t);
> -				else
> -					num = va_arg(args, size_t);
> -				break;
> -			case FORMAT_TYPE_PTRDIFF:
> -				num = va_arg(args, ptrdiff_t);
> -				break;
> -			case FORMAT_TYPE_UBYTE:
> -				num = (unsigned char) va_arg(args, int);
> -				break;
> -			case FORMAT_TYPE_BYTE:
> -				num = (signed char) va_arg(args, int);
> -				break;
> -			case FORMAT_TYPE_USHORT:
> -				num = (unsigned short) va_arg(args, int);
> -				break;
> -			case FORMAT_TYPE_SHORT:
> -				num = (short) va_arg(args, int);
> -				break;
> -			case FORMAT_TYPE_INT:
> -				num = (int) va_arg(args, int);
> -				break;
> -			default:
> -				num = va_arg(args, unsigned int);
> -			}
> +			else
> +				num = get_num(va_arg(args, int), spec);

The function name should likely be called:

    make_num_long_long(va_arg(args, int), spec);

Because after even reading the get_num() function, when I came down here I
had to go back up to make sure that was the function I was looking at, as
"get_num()" doesn't mean anything to me.


>  
>  			str = number(str, end, num, spec);
>  		}
> @@ -3183,26 +3151,13 @@ int vbin_printf(u32 *bin_buf, size_t size, const char *fmt, va_list args)
>  
>  		default:
>  			switch (spec.type) {
> -
> -			case FORMAT_TYPE_LONG_LONG:
> +			case FORMAT_TYPE_8BYTE:
>  				save_arg(long long);
>  				break;
> -			case FORMAT_TYPE_ULONG:
> -			case FORMAT_TYPE_LONG:
> -				save_arg(unsigned long);
> -				break;
> -			case FORMAT_TYPE_SIZE_T:
> -				save_arg(size_t);
> -				break;
> -			case FORMAT_TYPE_PTRDIFF:
> -				save_arg(ptrdiff_t);
> -				break;
> -			case FORMAT_TYPE_UBYTE:
> -			case FORMAT_TYPE_BYTE:
> +			case FORMAT_TYPE_1BYTE:
>  				save_arg(char);
>  				break;
> -			case FORMAT_TYPE_USHORT:
> -			case FORMAT_TYPE_SHORT:
> +			case FORMAT_TYPE_2BYTE:
>  				save_arg(short);
>  				break;
>  			default:
> @@ -3375,37 +3330,17 @@ int bstr_printf(char *buf, size_t size, const char *fmt, const u32 *bin_buf)
>  			unsigned long long num;
>  
>  			switch (spec.type) {
> -
> -			case FORMAT_TYPE_LONG_LONG:
> +			case FORMAT_TYPE_8BYTE:
>  				num = get_arg(long long);
>  				break;
> -			case FORMAT_TYPE_ULONG:
> -			case FORMAT_TYPE_LONG:
> -				num = get_arg(unsigned long);
> +			case FORMAT_TYPE_2BYTE:
> +				num = get_num(get_arg(short), spec);
>  				break;
> -			case FORMAT_TYPE_SIZE_T:
> -				num = get_arg(size_t);
> -				break;
> -			case FORMAT_TYPE_PTRDIFF:
> -				num = get_arg(ptrdiff_t);
> -				break;
> -			case FORMAT_TYPE_UBYTE:
> -				num = get_arg(unsigned char);
> -				break;
> -			case FORMAT_TYPE_BYTE:
> -				num = get_arg(signed char);
> -				break;
> -			case FORMAT_TYPE_USHORT:
> -				num = get_arg(unsigned short);
> -				break;
> -			case FORMAT_TYPE_SHORT:
> -				num = get_arg(short);
> -				break;
> -			case FORMAT_TYPE_UINT:
> -				num = get_arg(unsigned int);
> +			case FORMAT_TYPE_1BYTE:
> +				num = get_num(get_arg(char), spec);
>  				break;
>  			default:
> -				num = get_arg(int);
> +				num = get_num(get_arg(int), spec);
>  			}
>  
>  			str = number(str, end, num, spec);
> -- 

I went to test this by adding the following:

diff --git a/samples/trace_printk/trace-printk.c b/samples/trace_printk/trace-printk.c
index cfc159580263..1ff688637404 100644
--- a/samples/trace_printk/trace-printk.c
+++ b/samples/trace_printk/trace-printk.c
@@ -43,6 +43,17 @@ static int __init trace_printk_init(void)
 
 	trace_printk(trace_printk_test_global_str_fmt, "", "dynamic string");
 
+	trace_printk("Print unsigned long long %llu\n", -1LL);
+	trace_printk("Print long long %lld\n", -1LL);
+	trace_printk("Print unsigned long %llu\n", -1L);
+	trace_printk("Print long  %ld\n", -1L);
+	trace_printk("Print unsigned int %u\n", -1);
+	trace_printk("Print int %d\n", -1);
+	trace_printk("Print unsigned short %hu\n", (short)-1);
+	trace_printk("Print short %hd\n", (short)-1);
+	trace_printk("Print unsigned char %hhu\n", (char)-1);
+	trace_printk("Print char %hhd\n", (char)-1);
+
 	return 0;
 }
 
And the trace file looks fine:

 # modprobe trace-printk
 # cat /sys/kernel/tracing/trace

        modprobe-905     [003] .....   113.624838: init_module: Print unsigned long long 18446744073709551615
        modprobe-905     [003] .....   113.624838: init_module: Print long long -1
        modprobe-905     [003] .....   113.624839: init_module: Print unsigned long 18446744073709551615
        modprobe-905     [003] .....   113.624839: init_module: Print long  -1
        modprobe-905     [003] .....   113.624840: init_module: Print unsigned int 4294967295
        modprobe-905     [003] .....   113.624841: init_module: Print int -1
        modprobe-905     [003] .....   113.624841: init_module: Print unsigned short 65535
        modprobe-905     [003] .....   113.624842: init_module: Print short -1
        modprobe-905     [003] .....   113.624843: init_module: Print unsigned char 255
        modprobe-905     [003] .....   113.624843: init_module: Print char -1

But when I did the following:

 # trace-cmd extract
 # trace-cmd report

It showed:

        modprobe-905   [003] .....   113.624838: bprint:               init_module: Print unsigned long long 18446744073709551615
        modprobe-905   [003] .....   113.624838: bprint:               init_module: Print long long -1
        modprobe-905   [003] .....   113.624839: bprint:               init_module: Print unsigned long 18446744073709551615
        modprobe-905   [003] .....   113.624839: bprint:               init_module: Print long  -1
        modprobe-905   [003] .....   113.624840: bprint:               init_module: Print unsigned int 4294967295
        modprobe-905   [003] .....   113.624841: bprint:               init_module: Print int -1
        modprobe-905   [003] .....   113.624841: bprint:               init_module: Print unsigned short u
        modprobe-905   [003] .....   113.624842: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338e760 buf=ARRAY[]
        modprobe-905   [003] .....   113.624843: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338ec40 buf=ARRAY[]
        modprobe-905   [003] .....   113.624843: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338e280 buf=ARRAY[]

Those "[FAILED TO PARSE]" messages have nothing to do with your code, but
it means that it doesn't handle 'h' at all. Even the "unsigned short"
printed but still failed to parse properly.

This is because libtraceevent appears to not support "%h" in print formats.
That at least means there would be no breakage if they are modified in any
way.

-- Steve

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] vsprintf: simplify number handling
  2024-12-18 15:32                               ` Steven Rostedt
@ 2024-12-18 17:32                                 ` Linus Torvalds
  2024-12-18 18:04                                   ` Steven Rostedt
  0 siblings, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2024-12-18 17:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Alexei Starovoitov, Daniel Borkmann, Florent Revest, LKML, bpf

On Wed, 18 Dec 2024 at 07:31, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I went to test this by adding the following:
>
> diff --git a/samples/trace_printk/trace-printk.c b/samples/trace_printk/trace-printk.c
> index cfc159580263..1ff688637404 100644
> --- a/samples/trace_printk/trace-printk.c
> +++ b/samples/trace_printk/trace-printk.c
> @@ -43,6 +43,17 @@ static int __init trace_printk_init(void)
>
>         trace_printk(trace_printk_test_global_str_fmt, "", "dynamic string");
>
> +       trace_printk("Print unsigned long long %llu\n", -1LL);
> +       trace_printk("Print long long %lld\n", -1LL);
> +       trace_printk("Print unsigned long %llu\n", -1L);
> +       trace_printk("Print long  %ld\n", -1L);
> +       trace_printk("Print unsigned int %u\n", -1);
> +       trace_printk("Print int %d\n", -1);
> +       trace_printk("Print unsigned short %hu\n", (short)-1);
> +       trace_printk("Print short %hd\n", (short)-1);
> +       trace_printk("Print unsigned char %hhu\n", (char)-1);
> +       trace_printk("Print char %hhd\n", (char)-1);

For testing the real corner cases, you should probably check the real
truncation handling.

IOW, things like '%hh{d,u}' together with a value like 0x3456789ab.

Because yes, the truncation is very much part of the number handling,
and is actually a very important part of printk, and really the only
actual reason '%hhd' and friends even exist (without truncation, you'd
just use %d and %u).

The fact that both you and Rasmus felt that part needed more of a
comment clearly just means that I may be more aware of it than most
people are.

Because I considered the "handle sign" issue to be also making sure we
handle the size and the *lack* of sign corrrectly.

So I'll extend on the comment.

As you also point out with your tracing test:

>         modprobe-905   [003] .....   113.624842: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338e760 buf=ARRAY[]
>         modprobe-905   [003] .....   113.624843: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338ec40 buf=ARRAY[]
>         modprobe-905   [003] .....   113.624843: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338e280 buf=ARRAY[]
>
> Those "[FAILED TO PARSE]" messages have nothing to do with your code, but
> it means that it doesn't handle 'h' at all. Even the "unsigned short"
> printed but still failed to parse properly.

Yeah, %h{d,u} and %hh{d,u} are not hugely common, and apparently it's
not just your tracing tools that don't understand them: Alexei
reported that the bpf binary printk code also refused them.

That said, they *do* exist in the kernel, including in tracing:

    git grep 'TP_printk.*".*%hh*[ud].*"'

doesn't return lots of hits, but does report a handful.

> This is because libtraceevent appears to not support "%h" in print formats.
> That at least means there would be no breakage if they are modified in any
> way.

Oh, %hd is not getting modified (and if I did, that would be a major bug).

It's very much a part of the standard printf format, and is very much
inherent to the whole varargs and C integer promotion rules (ie you
literally *cannot* pass an actual 'char' value to a varargs function -
the normal C integer type extension rules apply).

So this is not really some odd kernel extension, and while there are
only a handful of users in tracing (that apparently trace-cmd cannot
deal with), it's not even _that_ uncommon in general:

    git grep '".*%hh*[ud].*"' | wc -l

reports that we have 501 of them in the kernel sources.

           Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vsprintf: simplify number handling
  2024-12-18 17:32                                 ` Linus Torvalds
@ 2024-12-18 18:04                                   ` Steven Rostedt
  2024-12-18 18:31                                     ` Steven Rostedt
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2024-12-18 18:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alexei Starovoitov, Daniel Borkmann, Florent Revest, LKML, bpf

On Wed, 18 Dec 2024 09:32:45 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> As you also point out with your tracing test:
> 
> >         modprobe-905   [003] .....   113.624842: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338e760 buf=ARRAY[]
> >         modprobe-905   [003] .....   113.624843: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338ec40 buf=ARRAY[]
> >         modprobe-905   [003] .....   113.624843: bprint:               [FAILED TO PARSE] ip=0xffffffffc060e045 fmt=0xffff8c05c338e280 buf=ARRAY[]
> >
> > Those "[FAILED TO PARSE]" messages have nothing to do with your code, but
> > it means that it doesn't handle 'h' at all. Even the "unsigned short"
> > printed but still failed to parse properly.  
> 
> Yeah, %h{d,u} and %hh{d,u} are not hugely common, and apparently it's
> not just your tracing tools that don't understand them: Alexei
> reported that the bpf binary printk code also refused them.
> 
> That said, they *do* exist in the kernel, including in tracing:
> 
>     git grep 'TP_printk.*".*%hh*[ud].*"'
> 
> doesn't return lots of hits, but does report a handful.

Those are not processed by vbin_printf() or bstr_printf() the TP_printk()
of the event is a simple vsnprintf() and is executed on the read side.

The TP_printk() macro is basically translated into:

	trace_event_printf(iter, print);

Where that "print" is the TP_printk() passed to TRACE_EVENT(). And that's
the function that I was fixing:

void trace_event_printf(struct trace_iterator *iter, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	trace_check_vprintf(iter, trace_event_format(iter, fmt), ap);
	va_end(ap);
}

So, if vsnprintf() handles anything, so does TP_printk(). Nothing to do
with the binary formatting.

> 
> > This is because libtraceevent appears to not support "%h" in print formats.
> > That at least means there would be no breakage if they are modified in any
> > way.  
> 
> Oh, %hd is not getting modified (and if I did, that would be a major bug).
> 
> It's very much a part of the standard printf format, and is very much
> inherent to the whole varargs and C integer promotion rules (ie you
> literally *cannot* pass an actual 'char' value to a varargs function -
> the normal C integer type extension rules apply).
> 
> So this is not really some odd kernel extension, and while there are
> only a handful of users in tracing (that apparently trace-cmd cannot
> deal with), it's not even _that_ uncommon in general:

trace-cmd (and libtraceevent for that matter) does handle "%h" and "%hh"
as well.

But the vbin_printf() which trace_printk() uses is a different beast, and
requires rebuilding the arguments so that it can be parsed, and there "%h"
isn't supported.

-- Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vsprintf: simplify number handling
  2024-12-18 18:04                                   ` Steven Rostedt
@ 2024-12-18 18:31                                     ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2024-12-18 18:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alexei Starovoitov, Daniel Borkmann, Florent Revest, LKML, bpf

On Wed, 18 Dec 2024 13:04:27 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> > It's very much a part of the standard printf format, and is very much
> > inherent to the whole varargs and C integer promotion rules (ie you
> > literally *cannot* pass an actual 'char' value to a varargs function -
> > the normal C integer type extension rules apply).
> > 
> > So this is not really some odd kernel extension, and while there are
> > only a handful of users in tracing (that apparently trace-cmd cannot
> > deal with), it's not even _that_ uncommon in general:  
> 
> trace-cmd (and libtraceevent for that matter) does handle "%h" and "%hh"
> as well.
> 
> But the vbin_printf() which trace_printk() uses is a different beast, and
> requires rebuilding the arguments so that it can be parsed, and there "%h"
> isn't supported.

Just to state the difference between TP_printk() and trace_printk() is that
with trace events only the data is saved to the ring buffer. For example,
for the sched_waking trace event:

TRACE_EVENT(sched_waking,

	TP_PROTO(struct task_struct *p),

	TP_ARGS(__perf_task(p)),

	TP_STRUCT__entry(
		__array(	char,	comm,	TASK_COMM_LEN	)
		__field(	pid_t,	pid			)
		__field(	int,	prio			)
		__field(	int,	target_cpu		)
	),

	TP_fast_assign(
		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
		__entry->pid		= p->pid;
		__entry->prio		= p->prio; /* XXX SCHED_DEADLINE */
		__entry->target_cpu	= task_cpu(p);
	),

	TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d",
		  __entry->comm, __entry->pid, __entry->prio,
		  __entry->target_cpu)
);

[ Note, I made this into a TRACE_EVENT() but in reality it's multiple
  events that uses DECLARE_EVENT_CLASS() and DEFINE_EVENT(), but the idea is
  still the same. ]

	TP_STRUCT__entry(
		__array(	char,	comm,	TASK_COMM_LEN	)
		__field(	pid_t,	pid			)
		__field(	int,	prio			)
		__field(	int,	target_cpu		)
	),

That turns into:

	struct trace_event_raw_sched_waking {
		struct trace_entry		ent;
		char				comm[TASK_COMM_LEN];
		pid_t				pid;
		int				prio;
		int				target_cpu;
		char				__data[];
	}

Then we have how to load that structure:

	TP_fast_assign(
		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
		__entry->pid		= p->pid;
		__entry->prio		= p->prio; /* XXX SCHED_DEADLINE */
		__entry->target_cpu	= task_cpu(p);
	),
 
Where the "__entry" is of type struct trace_event_raw_sched_waking and
points into the reserved location in the ring buffer. This has the above
assignments write directly into the ring buffer and avoids any copying.

Now the "trace" file needs to know how to print it, that's where the
TP_printk() is. It is basically a vsprintf(TP_printk()) with the __entry
again pointing to the content of the ring buffer.

But trace_printk() is not a trace event and requires writing the format as
well as the data into the ring buffer when it is called. It use to simply
just use vsnprintf() but it was considered much faster to not do the
formatting during the record and to push it back to when it is read. As
trace_printk() is used specifically to find hard to hit bugs, to keep it
from causing "heisenbugs", using vbin_printf() proved to be much faster and
made trace_printk() less intrusive to debugging.

For historical analysis, here's where it was first introduced:

  https://lore.kernel.org/lkml/49aa0c73.1c07d00a.4fc6.ffffb4d7@mx.google.com/

-- Steve

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] vsprintf: simplify number handling
  2024-12-18  1:32                             ` [PATCH] vsprintf: simplify number handling Linus Torvalds
  2024-12-18  9:19                               ` Rasmus Villemoes
  2024-12-18 15:32                               ` Steven Rostedt
@ 2024-12-23 20:11                               ` Linus Torvalds
  2 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-12-23 20:11 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Florent Revest,
	Steven Rostedt, Rasmus Villemoes
  Cc: LKML, bpf

So I pushed out the whole series of vsprintf cleanups into the
'vsnprintf' branch.

It's rebased on top of 6.13-rc4, and if it works for people, I'll keep
it stable.

But note that last commit, which makes the 'binary' version pack
numerical arguments (but not '%c') with the traditional and simpler C
type expansion (ie 'at least int sized').

Everything else is *supposed* to be a complete no-op, and only clean
up code (and make it generate a lot better code, btw).

The main objective having been to make it possibly easier to do fairly
straightforward printf code with

        while (*fmt.str) {
                fmt = format_decode(fmt, &spec);

                switch (fmt.state) {
                case FORMAT_STATE_NONE:
                        ...
                }
        }

with a lot fewer case statements (particularly cutting down the number
handling to the point where you *really* don't need to worry about all
the special cases).

It's still obviously not usable from outside the vsprintf.c code
itself (none of these interfaces are exported to modules, and are in
fact still 'static'). But it's at the point where maybe it's clean
enough that some day it *could* be exposed.

Anyway, comments? Particularly any of the strange vbin_printf /
bstr_printf users, because that last commit does change the encoding
of %hd / %hhd in the binary buffer. It does look like bpf can't care,
and tracing hopefully always pairs up vbin_printf with bstr_printf,
but maybe some horrid thing exposes the raw buffer to user space.

From Steven's email and the "[FAILED TO PARSE]" output from 'trace-cmd
report', I suspect that the raw buffer is indeed exposed to user
space, but was already handled wrong by the tracing tools. I might
even have unintentionally fixed it, because the new binary buffer
format is not only simplerr, but a hell of a lot more logical too.

But hey - while the other commits aren't *supposed* to have any
semantic differences and only simplify the code and make for better
code generation, mistakes happen.

It WorksForMe(tm), and I've been running most of this for about a week
now (ore-rebase), so it's presumably not *completely* broken.

            Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-12-23 20:11 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20241217173237.836878448@goodmis.org>
     [not found] ` <20241217173520.314190793@goodmis.org>
     [not found]   ` <CAHk-=wg5Kcr=sBuZcWs90CSGbJuKy0QsLaCC5oD15gS+Hk8j1A@mail.gmail.com>
     [not found]     ` <20241217130454.5bb593e8@gandalf.local.home>
     [not found]       ` <CAHk-=whLJW1SWvJTHYmdVAL2yL=dh4RzMuxgT7rnksSpkfUVaA@mail.gmail.com>
     [not found]         ` <20241217133318.06f849c9@gandalf.local.home>
     [not found]           ` <CAHk-=wgi1z85Cs4VmxTqFiG75qzoS_h_nszg6qP1ennEpdokkw@mail.gmail.com>
     [not found]             ` <20241217140153.22ac28b0@gandalf.local.home>
     [not found]               ` <CAHk-=wgpjLhSv9_rnAGS1adekEHMHbjVFvmZEuEmVftuo2sJBw@mail.gmail.com>
     [not found]                 ` <20241217144411.2165f73b@gandalf.local.home>
     [not found]                   ` <CAHk-=whWfmZbwRmySSpOyYEZJgcKG3d-qheYidnwu+b+rk6THg@mail.gmail.com>
     [not found]                     ` <20241217175301.03d25799@gandalf.local.home>
     [not found]                       ` <CAHk-=wg9x1Xt2cmiBbCz5XTppDQ=RNkjkmegwaF6=QghG6kBtA@mail.gmail.com>
2024-12-18  0:47                         ` [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer Alexei Starovoitov
2024-12-18  1:26                           ` Linus Torvalds
2024-12-18  1:32                             ` [PATCH] vsprintf: simplify number handling Linus Torvalds
2024-12-18  9:19                               ` Rasmus Villemoes
2024-12-18 15:32                               ` Steven Rostedt
2024-12-18 17:32                                 ` Linus Torvalds
2024-12-18 18:04                                   ` Steven Rostedt
2024-12-18 18:31                                     ` Steven Rostedt
2024-12-23 20:11                               ` Linus Torvalds
2024-12-18  1:39                             ` [PATCH 1/3] ring-buffer: Add uname to match criteria for persistent ring buffer Linus Torvalds
2024-12-18  1:53                               ` Alexei Starovoitov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox