public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: David Laight <david.laight.linux@gmail.com>
To: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Cc: Caleb Sander Mateos <csander@purestorage.com>,
	akpm@linux-foundation.org, axboe@kernel.dk,
	ceph-devel@vger.kernel.org, ebiggers@kernel.org, hch@lst.de,
	home7438072@gmail.com, idryomov@gmail.com, jaegeuk@kernel.org,
	kbusch@kernel.org, linux-fscrypt@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	sagi@grimberg.me, tytso@mit.edu, visitorckw@gmail.com,
	xiubli@redhat.com
Subject: Re: [PATCH v3 3/6] lib/base64: rework encode/decode for speed and stricter validation
Date: Mon, 6 Oct 2025 21:52:12 +0100	[thread overview]
Message-ID: <20251006215212.2920d571@pumpkin> (raw)
In-Reply-To: <aNz21InCM4Pa93TL@wu-Pro-E500-G6-WS720T>

On Wed, 1 Oct 2025 17:39:32 +0800
Guan-Chun Wu <409411716@gms.tku.edu.tw> wrote:

> On Tue, Sep 30, 2025 at 05:11:12PM -0700, Caleb Sander Mateos wrote:
> > On Fri, Sep 26, 2025 at 12:01 AM Guan-Chun Wu <409411716@gms.tku.edu.tw> wrote:  
> > >
> > > The old base64 implementation relied on a bit-accumulator loop, which was
> > > slow for larger inputs and too permissive in validation. It would accept
> > > extra '=', missing '=', or even '=' appearing in the middle of the input,
> > > allowing malformed strings to pass. This patch reworks the internals to
> > > improve performance and enforce stricter validation.
> > >
> > > Changes:
> > >  - Encoder:
> > >    * Process input in 3-byte blocks, mapping 24 bits into four 6-bit
> > >      symbols, avoiding bit-by-bit shifting and reducing loop iterations.
> > >    * Handle the final 1-2 leftover bytes explicitly and emit '=' only when
> > >      requested.
> > >  - Decoder:
> > >    * Based on the reverse lookup tables from the previous patch, decode
> > >      input in 4-character groups.
> > >    * Each group is looked up directly, converted into numeric values, and
> > >      combined into 3 output bytes.
> > >    * Explicitly handle padded and unpadded forms:
> > >       - With padding: input length must be a multiple of 4, and '=' is
> > >         allowed only in the last two positions. Reject stray or early '='.
> > >       - Without padding: validate tail lengths (2 or 3 chars) and require
> > >         unused low bits to be zero.
> > >    * Removed the bit-accumulator style loop to reduce loop iterations.
> > >
> > > Performance (x86_64, Intel Core i7-10700 @ 2.90GHz, avg over 1000 runs,
> > > KUnit):
> > >
> > > Encode:
> > >   64B   ~90ns   -> ~32ns   (~2.8x)
> > >   1KB  ~1332ns  -> ~510ns  (~2.6x)
> > >
> > > Decode:
> > >   64B  ~1530ns  -> ~64ns   (~23.9x)
> > >   1KB ~27726ns  -> ~982ns  (~28.3x)
> > >
> > > Co-developed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> > > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> > > Co-developed-by: Yu-Sheng Huang <home7438072@gmail.com>
> > > Signed-off-by: Yu-Sheng Huang <home7438072@gmail.com>
> > > Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
> > > ---
> > >  lib/base64.c | 150 +++++++++++++++++++++++++++++++++++++--------------
> > >  1 file changed, 110 insertions(+), 40 deletions(-)
> > >
> > > diff --git a/lib/base64.c b/lib/base64.c
> > > index b20fdf168..fd1db4611 100644
> > > --- a/lib/base64.c
> > > +++ b/lib/base64.c
> > > @@ -93,26 +93,43 @@ static const s8 base64_rev_tables[][256] = {
> > >  int base64_encode(const u8 *src, int srclen, char *dst, bool padding, enum base64_variant variant)
> > >  {
> > >         u32 ac = 0;
> > > -       int bits = 0;
> > > -       int i;
> > >         char *cp = dst;
> > >         const char *base64_table = base64_tables[variant];
> > >
> > > -       for (i = 0; i < srclen; i++) {
> > > -               ac = (ac << 8) | src[i];
> > > -               bits += 8;
> > > -               do {
> > > -                       bits -= 6;
> > > -                       *cp++ = base64_table[(ac >> bits) & 0x3f];
> > > -               } while (bits >= 6);
> > > -       }
> > > -       if (bits) {
> > > -               *cp++ = base64_table[(ac << (6 - bits)) & 0x3f];
> > > -               bits -= 6;
> > > +       while (srclen >= 3) {
> > > +               ac = ((u32)src[0] << 16) |
> > > +                        ((u32)src[1] << 8) |
> > > +                        (u32)src[2];
> > > +
> > > +               *cp++ = base64_table[ac >> 18];
> > > +               *cp++ = base64_table[(ac >> 12) & 0x3f];
> > > +               *cp++ = base64_table[(ac >> 6) & 0x3f];
> > > +               *cp++ = base64_table[ac & 0x3f];
> > > +
> > > +               src += 3;
> > > +               srclen -= 3;
> > >         }
> > > -       while (bits < 0) {
> > > -               *cp++ = '=';
> > > -               bits += 2;
> > > +
> > > +       switch (srclen) {
> > > +       case 2:
> > > +               ac = ((u32)src[0] << 16) |
> > > +                    ((u32)src[1] << 8);
> > > +
> > > +               *cp++ = base64_table[ac >> 18];
> > > +               *cp++ = base64_table[(ac >> 12) & 0x3f];
> > > +               *cp++ = base64_table[(ac >> 6) & 0x3f];
> > > +               if (padding)
> > > +                       *cp++ = '=';
> > > +               break;
> > > +       case 1:
> > > +               ac = ((u32)src[0] << 16);
> > > +               *cp++ = base64_table[ac >> 18];
> > > +               *cp++ = base64_table[(ac >> 12) & 0x3f];
> > > +               if (padding) {
> > > +                       *cp++ = '=';
> > > +                       *cp++ = '=';
> > > +               }
> > > +               break;
> > >         }
> > >         return cp - dst;
> > >  }
> > > @@ -128,39 +145,92 @@ EXPORT_SYMBOL_GPL(base64_encode);
> > >   *
> > >   * Decodes a string using the selected Base64 variant.
> > >   *
> > > - * This implementation hasn't been optimized for performance.
> > > - *
> > >   * Return: the length of the resulting decoded binary data in bytes,
> > >   *        or -1 if the string isn't a valid Base64 string.
> > >   */
> > >  int base64_decode(const char *src, int srclen, u8 *dst, bool padding, enum base64_variant variant)
> > >  {
> > > -       u32 ac = 0;
> > > -       int bits = 0;
> > > -       int i;
> > >         u8 *bp = dst;
> > > -       s8 ch;
> > > -
> > > -       for (i = 0; i < srclen; i++) {
> > > -               if (src[i] == '=') {
> > > -                       ac = (ac << 6);
> > > -                       bits += 6;
> > > -                       if (bits >= 8)
> > > -                               bits -= 8;
> > > -                       continue;
> > > -               }
> > > -               ch = base64_rev_tables[variant][(u8)src[i]];
> > > -               if (ch == -1)
> > > +       s8 input1, input2, input3, input4;
> > > +       u32 val;
> > > +
> > > +       if (srclen == 0)
> > > +               return 0;  
> > 
> > Doesn't look like this special case is necessary; all the if and while
> > conditions below are false if srclen == 0, so the function will just
> > end up returning 0 in that case anyways. It would be nice to avoid
> > this branch, especially as it seems like an uncommon case.
> >  
> 
> You're right. I'll remove it. Thanks.
> 
> > > +
> > > +       /* Validate the input length for padding */
> > > +       if (unlikely(padding && (srclen & 0x03) != 0))
> > > +               return -1;
> > > +
> > > +       while (srclen >= 4) {
> > > +               /* Decode the next 4 characters */
> > > +               input1 = base64_rev_tables[variant][(u8)src[0]];
> > > +               input2 = base64_rev_tables[variant][(u8)src[1]];
> > > +               input3 = base64_rev_tables[variant][(u8)src[2]];
> > > +               input4 = base64_rev_tables[variant][(u8)src[3]];
> > > +
> > > +               /* Return error if any Base64 character is invalid */
> > > +               if (unlikely(input1 < 0 || input2 < 0 || (!padding && (input3 < 0 || input4 < 0))))
> > > +                       return -1;
> > > +
> > > +               /* Handle padding */
> > > +               if (unlikely(padding && ((input3 < 0 && input4 >= 0) ||
> > > +                                        (input3 < 0 && src[2] != '=') ||
> > > +                                        (input4 < 0 && src[3] != '=') ||
> > > +                                        (srclen > 4 && (input3 < 0 || input4 < 0)))))  
> > 
> > Would be preferable to check and strip the padding (i.e. decrease
> > srclen) before this main loop. That way we could avoid several
> > branches in this hot loop that are only necessary to handle the
> > padding chars.
> >   
> 
> You're right. As long as we check and strip the padding first, the
> behavior with or without padding can be the same, and it could also
> reduce some unnecessary branches. I'll make the change.

As I said earlier.
Calculate 'val' first using signed arithmetic.
If it is non-negative there are three bytes to write.
If negative then check for src[2] and src[3] being '=' (etc) before erroring out.

That way there is only one check in the normal path.

	David

> 
> Best regards,
> Guan-Chun
> 
> > > +                       return -1;
> > > +               val = ((u32)input1 << 18) |
> > > +                     ((u32)input2 << 12) |
> > > +                     ((u32)((input3 < 0) ? 0 : input3) << 6) |
> > > +                     (u32)((input4 < 0) ? 0 : input4);
> > > +
> > > +               *bp++ = (u8)(val >> 16);
> > > +
> > > +               if (input3 >= 0)
> > > +                       *bp++ = (u8)(val >> 8);
> > > +               if (input4 >= 0)
> > > +                       *bp++ = (u8)val;
> > > +
> > > +               src += 4;
> > > +               srclen -= 4;
> > > +       }
> > > +
> > > +       /* Handle leftover characters when padding is not used */
> > > +       if (!padding && srclen > 0) {
> > > +               switch (srclen) {
> > > +               case 2:
> > > +                       input1 = base64_rev_tables[variant][(u8)src[0]];
> > > +                       input2 = base64_rev_tables[variant][(u8)src[1]];
> > > +                       if (unlikely(input1 < 0 || input2 < 0))
> > > +                               return -1;
> > > +
> > > +                       val = ((u32)input1 << 6) | (u32)input2; /* 12 bits */
> > > +                       if (unlikely(val & 0x0F))
> > > +                               return -1; /* low 4 bits must be zero */
> > > +
> > > +                       *bp++ = (u8)(val >> 4);
> > > +                       break;
> > > +               case 3:
> > > +                       input1 = base64_rev_tables[variant][(u8)src[0]];
> > > +                       input2 = base64_rev_tables[variant][(u8)src[1]];
> > > +                       input3 = base64_rev_tables[variant][(u8)src[2]];
> > > +                       if (unlikely(input1 < 0 || input2 < 0 || input3 < 0))
> > > +                               return -1;
> > > +
> > > +                       val = ((u32)input1 << 12) |
> > > +                             ((u32)input2 << 6) |
> > > +                             (u32)input3; /* 18 bits */
> > > +
> > > +                       if (unlikely(val & 0x03))
> > > +                               return -1; /* low 2 bits must be zero */
> > > +
> > > +                       *bp++ = (u8)(val >> 10);
> > > +                       *bp++ = (u8)((val >> 2) & 0xFF);  
> > 
> > "& 0xFF" is redundant with the cast to u8.
> > 
> > Best,
> > Caleb
> >   
> > > +                       break;
> > > +               default:
> > >                         return -1;
> > > -               ac = (ac << 6) | ch;
> > > -               bits += 6;
> > > -               if (bits >= 8) {
> > > -                       bits -= 8;
> > > -                       *bp++ = (u8)(ac >> bits);
> > >                 }
> > >         }
> > > -       if (ac & ((1 << bits) - 1))
> > > -               return -1;
> > > +
> > >         return bp - dst;
> > >  }
> > >  EXPORT_SYMBOL_GPL(base64_decode);
> > > --
> > > 2.34.1
> > >
> > >  
> 


  reply	other threads:[~2025-10-06 20:52 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-26  6:52 [PATCH v3 0/6] lib/base64: add generic encoder/decoder, migrate users Guan-Chun Wu
2025-09-26  6:55 ` [PATCH v3 1/6] lib/base64: Add support for multiple variants Guan-Chun Wu
2025-09-30 23:56   ` Caleb Sander Mateos
2025-10-01 14:09     ` Guan-Chun Wu
2025-09-26  6:55 ` [PATCH v3 2/6] lib/base64: Optimize base64_decode() with reverse lookup tables Guan-Chun Wu
2025-09-26 23:33   ` Caleb Sander Mateos
2025-09-28  6:37     ` Kuan-Wei Chiu
2025-10-01 10:18     ` Guan-Chun Wu
2025-10-01 16:20       ` Caleb Sander Mateos
2025-10-05 17:18         ` David Laight
2025-10-07  8:28           ` Guan-Chun Wu
2025-10-07 14:57             ` Caleb Sander Mateos
2025-10-07 17:11               ` Eric Biggers
2025-10-07 18:23               ` David Laight
2025-10-09 12:25                 ` Guan-Chun Wu
2025-10-10  9:51                   ` David Laight
2025-10-13  9:49                     ` Guan-Chun Wu
2025-10-14  8:14                       ` David Laight
2025-10-16 10:07                         ` Guan-Chun Wu
2025-10-27 13:12                         ` Guan-Chun Wu
2025-10-27 14:18                           ` David Laight
2025-10-28  6:58                             ` Guan-Chun Wu
2025-09-28 18:57   ` David Laight
2025-09-26  6:56 ` [PATCH v3 3/6] lib/base64: rework encode/decode for speed and stricter validation Guan-Chun Wu
2025-10-01  0:11   ` Caleb Sander Mateos
2025-10-01  9:39     ` Guan-Chun Wu
2025-10-06 20:52       ` David Laight [this message]
2025-10-07  8:34         ` Guan-Chun Wu
2025-09-26  6:56 ` [PATCH v3 4/6] lib: add KUnit tests for base64 encoding/decoding Guan-Chun Wu
2025-09-26  6:56 ` [PATCH v3 5/6] fscrypt: replace local base64url helpers with lib/base64 Guan-Chun Wu
2025-09-26  6:57 ` [PATCH v3 6/6] ceph: replace local base64 " Guan-Chun Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251006215212.2920d571@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=409411716@gms.tku.edu.tw \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=ceph-devel@vger.kernel.org \
    --cc=csander@purestorage.com \
    --cc=ebiggers@kernel.org \
    --cc=hch@lst.de \
    --cc=home7438072@gmail.com \
    --cc=idryomov@gmail.com \
    --cc=jaegeuk@kernel.org \
    --cc=kbusch@kernel.org \
    --cc=linux-fscrypt@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    --cc=tytso@mit.edu \
    --cc=visitorckw@gmail.com \
    --cc=xiubli@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox