From: Eric Biggers <ebiggers@kernel.org>
To: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Cc: akpm@linux-foundation.org, axboe@kernel.dk,
ceph-devel@vger.kernel.org, hch@lst.de, home7438072@gmail.com,
idryomov@gmail.com, jaegeuk@kernel.org, kbusch@kernel.org,
linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvme@lists.infradead.org, sagi@grimberg.me, tytso@mit.edu,
visitorckw@gmail.com, xiubli@redhat.com
Subject: Re: [PATCH v2 2/5] lib/base64: rework encoder/decoder with customizable support and update nvme-auth
Date: Thu, 11 Sep 2025 11:27:42 -0700 [thread overview]
Message-ID: <20250911182742.GC1376@sol> (raw)
In-Reply-To: <20250911074159.656943-1-409411716@gms.tku.edu.tw>
On Thu, Sep 11, 2025 at 03:41:59PM +0800, Guan-Chun Wu wrote:
> Rework base64_encode() and base64_decode() with extended interfaces
> that support custom 64-character tables and optional '=' padding.
> This makes them flexible enough to cover both standard RFC4648 Base64
> and non-standard variants such as base64url.
RFC4648 specifies both base64 and base64url.
> The encoder is redesigned to process input in 3-byte blocks, each
> mapped directly into 4 output symbols. Base64 naturally encodes
> 24 bits of input as four 6-bit values, so operating on aligned
> 3-byte chunks matches the algorithm's structure. This block-based
> approach eliminates the need for bit-by-bit streaming, reduces shifts,
> masks, and loop iterations, and removes data-dependent branches from
> the main loop.
There already weren't any data-dependent branches in the encoder.
> The decoder replaces strchr()-based lookups with direct table-indexed
> mapping. It processes input in 4-character groups and supports both
> padded and non-padded forms. Validation has been strengthened: illegal
> characters and misplaced '=' padding now cause errors, preventing
> silent data corruption.
The decoder already detected invalid inputs.
> While this is a mechanical update following the lib/base64 rework,
> nvme-auth also benefits from the performance improvements in the new
> encoder/decoder, achieving faster encode/decode without altering the
> output format.
>
> The reworked encoder and decoder unify Base64 handling across the kernel
> with higher performance, stricter correctness, and flexibility to support
> subsystem-specific variants.
Which part is more strictly correct?
> diff --git a/lib/base64.c b/lib/base64.c
> index 9416bded2..b2bd5dab5 100644
> --- a/lib/base64.c
> +++ b/lib/base64.c
> @@ -15,104 +15,236 @@
> #include <linux/string.h>
> #include <linux/base64.h>
>
> -static const char base64_table[65] =
> - "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
> +#define BASE64_6BIT_MASK 0x3f /* Mask to extract lowest 6 bits */
> +#define BASE64_BITS_PER_BYTE 8
> +#define BASE64_CHUNK_BITS 6
> +
> +/* Output-char-indexed shifts: for output chars 0,1,2,3 respectively */
> +#define BASE64_SHIFT_OUT0 (BASE64_CHUNK_BITS * 3) /* 18 */
> +#define BASE64_SHIFT_OUT1 (BASE64_CHUNK_BITS * 2) /* 12 */
> +#define BASE64_SHIFT_OUT2 (BASE64_CHUNK_BITS * 1) /* 6 */
> +/* OUT3 uses 0 shift and just masks with BASE64_6BIT_MASK */
> +
> +/* For extracting bytes from the 24-bit value (decode main loop) */
> +#define BASE64_SHIFT_BYTE0 (BASE64_BITS_PER_BYTE * 2) /* 16 */
> +#define BASE64_SHIFT_BYTE1 (BASE64_BITS_PER_BYTE * 1) /* 8 */
> +
> +/* Tail (no padding) shifts to extract bytes */
> +#define BASE64_TAIL2_BYTE0_SHIFT ((BASE64_CHUNK_BITS * 2) - BASE64_BITS_PER_BYTE) /* 4 */
> +#define BASE64_TAIL3_BYTE0_SHIFT ((BASE64_CHUNK_BITS * 3) - BASE64_BITS_PER_BYTE) /* 10 */
> +#define BASE64_TAIL3_BYTE1_SHIFT ((BASE64_CHUNK_BITS * 3) - (BASE64_BITS_PER_BYTE * 2)) /* 2 */
> +
> +/* Extra: masks for leftover validation (no padding) */
> +#define BASE64_MASK(n) ({ \
> + unsigned int __n = (n); \
> + __n ? ((1U << __n) - 1U) : 0U; \
> +})
> +#define BASE64_TAIL2_UNUSED_BITS (BASE64_CHUNK_BITS * 2 - BASE64_BITS_PER_BYTE) /* 4 */
> +#define BASE64_TAIL3_UNUSED_BITS (BASE64_CHUNK_BITS * 3 - BASE64_BITS_PER_BYTE * 2) /* 2 */
These #defines make the code unnecessarily hard to read. Most of them
should just be replaced with the integer literals.
> * This implementation hasn't been optimized for performance.
But the commit message claims performance improvements.
> *
> * Return: the length of the resulting decoded binary data in bytes,
> * or -1 if the string isn't a valid base64 string.
base64 => Base64, since multiple variants are supported now. Refer to
the terminology used by RFC4686. Base64 is the general term, and
"base64" and "base64url" specific variants of Base64.
- Eric
next prev parent reply other threads:[~2025-09-11 18:29 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-11 7:29 [PATCH v2 0/5] lib/base64: add generic encoder/decoder, migrate users Guan-Chun Wu
2025-09-11 7:32 ` [PATCH v2 1/5] lib/base64: Replace strchr() for better performance Guan-Chun Wu
2025-09-11 15:50 ` Caleb Sander Mateos
2025-09-11 16:02 ` Caleb Sander Mateos
2025-09-11 16:25 ` Kuan-Wei Chiu
2025-09-11 16:38 ` Kuan-Wei Chiu
2025-09-14 20:12 ` David Laight
2025-09-15 7:50 ` Kuan-Wei Chiu
2025-09-15 11:02 ` David Laight
2025-09-16 7:22 ` Kuan-Wei Chiu
2025-09-11 18:14 ` Eric Biggers
2025-09-11 18:44 ` Kuan-Wei Chiu
2025-09-11 18:49 ` Eric Biggers
2025-09-11 19:00 ` Kuan-Wei Chiu
2025-09-13 21:27 ` David Laight
2025-09-12 22:54 ` David Laight
2025-09-11 7:41 ` [PATCH v2 2/5] lib/base64: rework encoder/decoder with customizable support and update nvme-auth Guan-Chun Wu
2025-09-11 15:59 ` Caleb Sander Mateos
2025-09-12 7:21 ` Guan-Chun Wu
2025-09-11 18:27 ` Eric Biggers [this message]
2025-09-12 6:37 ` FIRST_NAME LAST_NAME
2025-09-12 6:52 ` Guan-Chun Wu
2025-09-12 7:15 ` Guan-Chun Wu
2025-09-11 7:45 ` [PATCH v2 3/5] lib: add KUnit tests for base64 encoding/decoding Guan-Chun Wu
2025-09-11 7:45 ` [PATCH v2 4/5] fscrypt: replace local base64url helpers with generic lib/base64 helpers Guan-Chun Wu
2025-09-11 18:47 ` Eric Biggers
2025-09-12 7:51 ` Guan-Chun Wu
2025-09-11 7:46 ` [PATCH v2 5/5] ceph: replace local base64 encode/decode " Guan-Chun Wu
-- strict thread matches above, loose matches on Subject: below --
2025-09-11 7:32 [PATCH v2 2/5] lib/base64: rework encoder/decoder with customizable support and update nvme-auth Guan-Chun Wu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250911182742.GC1376@sol \
--to=ebiggers@kernel.org \
--cc=409411716@gms.tku.edu.tw \
--cc=akpm@linux-foundation.org \
--cc=axboe@kernel.dk \
--cc=ceph-devel@vger.kernel.org \
--cc=hch@lst.de \
--cc=home7438072@gmail.com \
--cc=idryomov@gmail.com \
--cc=jaegeuk@kernel.org \
--cc=kbusch@kernel.org \
--cc=linux-fscrypt@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
--cc=tytso@mit.edu \
--cc=visitorckw@gmail.com \
--cc=xiubli@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.