public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Cc: akpm@linux-foundation.org, axboe@kernel.dk,
	ceph-devel@vger.kernel.org, hch@lst.de, home7438072@gmail.com,
	idryomov@gmail.com, jaegeuk@kernel.org, kbusch@kernel.org,
	linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, sagi@grimberg.me, tytso@mit.edu,
	visitorckw@gmail.com, xiubli@redhat.com
Subject: Re: [PATCH v2 2/5] lib/base64: rework encoder/decoder with customizable support and update nvme-auth
Date: Thu, 11 Sep 2025 11:27:42 -0700	[thread overview]
Message-ID: <20250911182742.GC1376@sol> (raw)
In-Reply-To: <20250911074159.656943-1-409411716@gms.tku.edu.tw>

On Thu, Sep 11, 2025 at 03:41:59PM +0800, Guan-Chun Wu wrote:
> Rework base64_encode() and base64_decode() with extended interfaces
> that support custom 64-character tables and optional '=' padding.
> This makes them flexible enough to cover both standard RFC4648 Base64
> and non-standard variants such as base64url.

RFC4648 specifies both base64 and base64url.

> The encoder is redesigned to process input in 3-byte blocks, each
> mapped directly into 4 output symbols. Base64 naturally encodes
> 24 bits of input as four 6-bit values, so operating on aligned
> 3-byte chunks matches the algorithm's structure. This block-based
> approach eliminates the need for bit-by-bit streaming, reduces shifts,
> masks, and loop iterations, and removes data-dependent branches from
> the main loop.

There already weren't any data-dependent branches in the encoder.

> The decoder replaces strchr()-based lookups with direct table-indexed
> mapping. It processes input in 4-character groups and supports both
> padded and non-padded forms. Validation has been strengthened: illegal
> characters and misplaced '=' padding now cause errors, preventing
> silent data corruption.

The decoder already detected invalid inputs.

> While this is a mechanical update following the lib/base64 rework,
> nvme-auth also benefits from the performance improvements in the new
> encoder/decoder, achieving faster encode/decode without altering the
> output format.
> 
> The reworked encoder and decoder unify Base64 handling across the kernel
> with higher performance, stricter correctness, and flexibility to support
> subsystem-specific variants.

Which part is more strictly correct?

> diff --git a/lib/base64.c b/lib/base64.c
> index 9416bded2..b2bd5dab5 100644
> --- a/lib/base64.c
> +++ b/lib/base64.c
> @@ -15,104 +15,236 @@
>  #include <linux/string.h>
>  #include <linux/base64.h>
>  
> -static const char base64_table[65] =
> -	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
> +#define BASE64_6BIT_MASK      0x3f /* Mask to extract lowest 6 bits */
> +#define BASE64_BITS_PER_BYTE  8
> +#define BASE64_CHUNK_BITS     6
> +
> +/* Output-char-indexed shifts: for output chars 0,1,2,3 respectively */
> +#define BASE64_SHIFT_OUT0	(BASE64_CHUNK_BITS * 3) /* 18 */
> +#define BASE64_SHIFT_OUT1	(BASE64_CHUNK_BITS * 2) /* 12 */
> +#define BASE64_SHIFT_OUT2	(BASE64_CHUNK_BITS * 1) /* 6  */
> +/* OUT3 uses 0 shift and just masks with BASE64_6BIT_MASK */
> +
> +/* For extracting bytes from the 24-bit value (decode main loop) */
> +#define BASE64_SHIFT_BYTE0        (BASE64_BITS_PER_BYTE * 2) /* 16 */
> +#define BASE64_SHIFT_BYTE1        (BASE64_BITS_PER_BYTE * 1) /* 8  */
> +
> +/* Tail (no padding) shifts to extract bytes */
> +#define BASE64_TAIL2_BYTE0_SHIFT  ((BASE64_CHUNK_BITS * 2) - BASE64_BITS_PER_BYTE)       /* 4  */
> +#define BASE64_TAIL3_BYTE0_SHIFT  ((BASE64_CHUNK_BITS * 3) - BASE64_BITS_PER_BYTE)       /* 10 */
> +#define BASE64_TAIL3_BYTE1_SHIFT  ((BASE64_CHUNK_BITS * 3) - (BASE64_BITS_PER_BYTE * 2)) /* 2  */
> +
> +/* Extra: masks for leftover validation (no padding) */
> +#define BASE64_MASK(n) ({        \
> +	unsigned int __n = (n);  \
> +	__n ? ((1U << __n) - 1U) : 0U; \
> +})
> +#define BASE64_TAIL2_UNUSED_BITS  (BASE64_CHUNK_BITS * 2 - BASE64_BITS_PER_BYTE)     /* 4 */
> +#define BASE64_TAIL3_UNUSED_BITS  (BASE64_CHUNK_BITS * 3 - BASE64_BITS_PER_BYTE * 2) /* 2 */

These #defines make the code unnecessarily hard to read.  Most of them
should just be replaced with the integer literals.

>   * This implementation hasn't been optimized for performance.

But the commit message claims performance improvements.

>   *
>   * Return: the length of the resulting decoded binary data in bytes,
>   *	   or -1 if the string isn't a valid base64 string.

base64 => Base64, since multiple variants are supported now.  Refer to
the terminology used by RFC4686.  Base64 is the general term, and
"base64" and "base64url" specific variants of Base64.

- Eric

  parent reply	other threads:[~2025-09-11 18:29 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-11  7:29 [PATCH v2 0/5] lib/base64: add generic encoder/decoder, migrate users Guan-Chun Wu
2025-09-11  7:32 ` [PATCH v2 1/5] lib/base64: Replace strchr() for better performance Guan-Chun Wu
2025-09-11 15:50   ` Caleb Sander Mateos
2025-09-11 16:02     ` Caleb Sander Mateos
2025-09-11 16:25     ` Kuan-Wei Chiu
2025-09-11 16:38       ` Kuan-Wei Chiu
2025-09-14 20:12         ` David Laight
2025-09-15  7:50           ` Kuan-Wei Chiu
2025-09-15 11:02             ` David Laight
2025-09-16  7:22               ` Kuan-Wei Chiu
2025-09-11 18:14   ` Eric Biggers
2025-09-11 18:44     ` Kuan-Wei Chiu
2025-09-11 18:49       ` Eric Biggers
2025-09-11 19:00         ` Kuan-Wei Chiu
2025-09-13 21:27       ` David Laight
2025-09-12 22:54   ` David Laight
2025-09-11  7:41 ` [PATCH v2 2/5] lib/base64: rework encoder/decoder with customizable support and update nvme-auth Guan-Chun Wu
2025-09-11 15:59   ` Caleb Sander Mateos
2025-09-12  7:21     ` Guan-Chun Wu
2025-09-11 18:27   ` Eric Biggers [this message]
2025-09-12  6:37     ` FIRST_NAME LAST_NAME
2025-09-12  6:52       ` Guan-Chun Wu
2025-09-12  7:15     ` Guan-Chun Wu
2025-09-11  7:45 ` [PATCH v2 3/5] lib: add KUnit tests for base64 encoding/decoding Guan-Chun Wu
2025-09-11  7:45 ` [PATCH v2 4/5] fscrypt: replace local base64url helpers with generic lib/base64 helpers Guan-Chun Wu
2025-09-11 18:47   ` Eric Biggers
2025-09-12  7:51     ` Guan-Chun Wu
2025-09-11  7:46 ` [PATCH v2 5/5] ceph: replace local base64 encode/decode " Guan-Chun Wu
  -- strict thread matches above, loose matches on Subject: below --
2025-09-11  7:32 [PATCH v2 2/5] lib/base64: rework encoder/decoder with customizable support and update nvme-auth Guan-Chun Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250911182742.GC1376@sol \
    --to=ebiggers@kernel.org \
    --cc=409411716@gms.tku.edu.tw \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=ceph-devel@vger.kernel.org \
    --cc=hch@lst.de \
    --cc=home7438072@gmail.com \
    --cc=idryomov@gmail.com \
    --cc=jaegeuk@kernel.org \
    --cc=kbusch@kernel.org \
    --cc=linux-fscrypt@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    --cc=tytso@mit.edu \
    --cc=visitorckw@gmail.com \
    --cc=xiubli@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox