Re: [PATCH next 1/1] lib: Optimise hex_dump_to_buffer()

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: David Laight <david.laight.linux@gmail.com>
To: Nick Child <nnac123@linux.ibm.com>
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Randy Dunlap <rdunlap@infradead.org>
Subject: Re: [PATCH next 1/1] lib: Optimise hex_dump_to_buffer()
Date: Wed, 19 Feb 2025 13:13:47 +0000	[thread overview]
Message-ID: <20250219131347.3ec72325@pumpkin> (raw)
In-Reply-To: <Z7UPNhW_bXSOzACk@li-4c4c4544-0047-5210-804b-b8c04f323634.ibm.com>

On Tue, 18 Feb 2025 16:52:38 -0600
Nick Child <nnac123@linux.ibm.com> wrote:

> On Sun, Feb 16, 2025 at 08:19:01PM +0000, David Laight wrote:
> > Fastpath the normal case of single byte output that fits in the buffer.
> > Output byte groups (byteswapped on little-endian) without calling snprintf().
> > Remove the restriction that rowsize must be 16 or 32.
> > Remove the restriction that groupsize must be 8 or less.
> > If groupsize isn't a power of 2 or doesn't divide into both len and
> >   rowsize it is set to 1 (otherwise byteswapping is hard).
> > Change the types of the rowsize and groupsize parameters to be unsigned types.
> >  
> 
> Thank you!
> 
> > Tested in a userspace harness, code size (x86-64) halved to 723 bytes.
> > 
> > Signed-off-by: David Laight <david.laight.linux@gmail.com>
> > ---
> >  include/linux/printk.h |   6 +-
> >  lib/hexdump.c          | 165 ++++++++++++++++++++---------------------
> >  2 files changed, 85 insertions(+), 86 deletions(-)
> > 
> > diff --git a/include/linux/printk.h b/include/linux/printk.h
> > index 4217a9f412b2..49e67f63277e 100644
> > --- a/include/linux/printk.h
> > +++ b/include/linux/printk.h
> > @@ -752,9 +752,9 @@ enum {
> >  	DUMP_PREFIX_ADDRESS,
> >  	DUMP_PREFIX_OFFSET
> >  };  
> ...
> > - * hex_dump_to_buffer() works on one "line" of output at a time, i.e.,
> > + * If @groupsize isn't a power of 2 that divides into both @len and @rowsize
> > + * the it is set to 1.  
> 
> s/the/then/
> 
> > + *
> > + * hex_dump_to_buffer() works on one "line" of output at a time, e.g.,
> >   * 16 or 32 bytes of input data converted to hex + ASCII output.  
> ...
> > -		linebuf[lx++] = ' ';
> > +	if (!ascii) {
> > +		*dst = 0;
> > +		return out_len;
> >  	}
> > + 
> > +	pad_len += 2;  
> 
> So at a minimum there is 2 spaces before the ascii translation?

That is the way it always was.
Does make sense that way.

> when people allocate linebuf, what should they use to calculate the len?

Enough :-)
Unchanged from before, 'rowsize * 4 + 2' is just enough.

> 
> Also side nit, this existed before this patch, the endian switch may
> occur on the hex dump but it doesn't on the ascii conversion:
> [   20.172006][  T150] 706f6e6d6c6b6a696867666564636261  abcdefghijklmnop

Correct, that is what you want.
Consider { u32 x; char y[4]; } - you don't want the ascii byte reversed.

Indeed, if the data might not be aligned it is easier to process the
hex bytes in address order than a little-endian value that is split
between two 'words'. 

> 
> 
> > +	out_len += pad_len + len;
> > +	if (dst + pad_len >= dst_end)
> > +		pad_len = dst_end - dst - 1;  
> 
> Why not jump to hex_truncate here? This feels like an error case and
> if I am understanding correctly, this will pad the rest of the buffer
> leaving no room for ascii.

I missed that, can't remember what the old code did.

> 
> > +	while (pad_len--)
> > +		*dst++ = ' ';
> > +
> > +	if (dst + len >= dst_end)  
> ...
> > -- 
> > 2.39.5  
> 
> I will like to also support a wrapper to a bitmap argument as Andy
> mentioned. Mostly for selfish reasons though: I would like an argument
> to be added to skip endian conversion, and just observe the bytes as they
> appear in memory (without having to use groupsize 1).

More useful would be an option to keep address order with a space
between the bytes, but add an extra space every 'n' bytes.
So you get: "00 11 22 33  44 55 66 77  88..."

But that is an enhancement rather than a rewrite.
Although it is all easier to do with my new 'slow path' loop.

> I had fun tracking some of these bitwise operations on power of 2
> integers, I think I must've missed that day in school. Cool stuff :)

It came the day after the rule for inverting boolean expressions.

	David

next prev parent reply	other threads:[~2025-02-19 13:13 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-16 20:19 [PATCH next 1/1] lib: Optimise hex_dump_to_buffer() David Laight
2025-02-16 20:47 ` Andy Shevchenko
2025-02-16 22:37   ` David Laight
2025-02-18 11:38     ` Andy Shevchenko
2025-02-18 22:52 ` Nick Child
2025-02-19 13:13   ` David Laight [this message]
2025-02-19 19:29   ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250219131347.3ec72325@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nnac123@linux.ibm.com \
    --cc=rdunlap@infradead.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox