From: David Laight <david.laight.linux@gmail.com>
To: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Cc: Caleb Sander Mateos <csander@purestorage.com>,
akpm@linux-foundation.org, axboe@kernel.dk,
ceph-devel@vger.kernel.org, ebiggers@kernel.org, hch@lst.de,
home7438072@gmail.com, idryomov@gmail.com, jaegeuk@kernel.org,
kbusch@kernel.org, linux-fscrypt@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
sagi@grimberg.me, tytso@mit.edu, visitorckw@gmail.com,
xiubli@redhat.com
Subject: Re: [PATCH v3 2/6] lib/base64: Optimize base64_decode() with reverse lookup tables
Date: Fri, 10 Oct 2025 10:51:38 +0100 [thread overview]
Message-ID: <20251010105138.0356ad75@pumpkin> (raw)
In-Reply-To: <aOeprat4/97oSWE0@wu-Pro-E500-G6-WS720T>
On Thu, 9 Oct 2025 20:25:17 +0800
Guan-Chun Wu <409411716@gms.tku.edu.tw> wrote:
...
> As Eric mentioned, the decoder in fs/crypto/ needs to reject invalid input.
(to avoid two different input buffers giving the same output)
Which is annoyingly reasonable.
> One possible solution I came up with is to first create a shared
> base64_rev_common lookup table as the base for all Base64 variants.
> Then, depending on the variant (e.g., BASE64_STD, BASE64_URLSAFE, etc.), we
> can dynamically adjust the character mappings for position 62 and position 63
> at runtime, based on the variant.
>
> Here are the changes to the code:
>
> static const s8 base64_rev_common[256] = {
> [0 ... 255] = -1,
> ['A'] = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
> ['a'] = 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
> 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
> ['0'] = 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
> };
>
> static const struct {
> char char62, char63;
> } base64_symbols[] = {
> [BASE64_STD] = { '+', '/' },
> [BASE64_URLSAFE] = { '-', '_' },
> [BASE64_IMAP] = { '+', ',' },
> };
>
> int base64_decode(const char *src, int srclen, u8 *dst, bool padding, enum base64_variant variant)
> {
> u8 *bp = dst;
> u8 pad_cnt = 0;
> s8 input1, input2, input3, input4;
> u32 val;
> s8 base64_rev_tables[256];
>
> /* Validate the input length for padding */
> if (unlikely(padding && (srclen & 0x03) != 0))
> return -1;
There is no need for an early check.
Pick it up after the loop when 'srclen != 0'.
>
> memcpy(base64_rev_tables, base64_rev_common, sizeof(base64_rev_common));
Ugg - having a memcpy() here is not a good idea.
It really is better to have 3 arrays, but use a 'mostly common' initialiser.
Perhaps:
#define BASE64_REV_INIT(ch_62, ch_63) = { \
[0 ... 255] = -1, \
['A'] = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, \
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, \
['a'] = 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, \
39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, \
['0'] = 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, \
[ch_62] = 62, [ch_63] = 63, \
}
static const s8 base64_rev_maps[][256] = {
[BASE64_STD] = BASE64_REV_INIT('+', '/'),
[BASE64_URLSAFE] = BASE64_REV_INIT('-', '_'),
[BASE64_IMAP] = BASE64_REV_INIT('+', ',')
};
Then (after validating variant):
const s8 *map = base64_rev_maps[variant];
>
> if (variant < BASE64_STD || variant > BASE64_IMAP)
> return -1;
>
> base64_rev_tables[base64_symbols[variant].char62] = 62;
> base64_rev_tables[base64_symbols[variant].char63] = 63;
>
> while (padding && srclen > 0 && src[srclen - 1] == '=') {
> pad_cnt++;
> srclen--;
> if (pad_cnt > 2)
> return -1;
> }
I'm not sure I'd to that there.
You are (in some sense) optimising for padding.
From what I remember, "abcd" gives 24 bits, "abc=" 16 and "ab==" 8.
>
> while (srclen >= 4) {
> /* Decode the next 4 characters */
> input1 = base64_rev_tables[(u8)src[0]];
> input2 = base64_rev_tables[(u8)src[1]];
> input3 = base64_rev_tables[(u8)src[2]];
> input4 = base64_rev_tables[(u8)src[3]];
I'd be tempted to make src[] unsigned - probably be assigning the parameter
to a local at the top of the function.
Also you have input3 = ... src[2]...
Perhaps they should be input[0..3] instead.
>
> val = (input1 << 18) |
> (input2 << 12) |
> (input3 << 6) |
> input4;
Four lines is excessive, C doesn't require the () and I'm not sure the
compilers complain about << and |.
>
> if (unlikely((s32)val < 0))
> return -1;
Make 'val' signed - then you don't need the cast.
You can pick up the padding check here, something like:
val = input1 << 18 | input2 << 12;
if (!padding || val < 0 || src[3] != '=')
return -1;
*bp++ = val >> 16;
if (src[2] == '=')
return bp - dst;
if (input3 < 0)
return -1;
val |= input3 << 6;
*bp++ = val >> 8;
return bp - dst;
Or, if you really want to use the code below the loop:
if (!padding || src[3] != '=')
return -1;
padding = 0;
srclen -= 1 + (src[2] == '=');
break;
>
> *bp++ = (u8)(val >> 16);
> *bp++ = (u8)(val >> 8);
> *bp++ = (u8)val;
You don't need those casts.
>
> src += 4;
> srclen -= 4;
> }
>
> /* Handle leftover characters when padding is not used */
You are coming here with padding.
I'm not sure what should happen without padding.
For a multi-line file decode I suspect the characters need adding to
the start of the next line (ie lines aren't required to contain
multiples of 4 characters - even though they almost always will).
> if (srclen > 0) {
> switch (srclen) {
You don't need an 'if' and a 'switch'.
srclen is likely to be zero, but perhaps write as:
if (likely(!srclen))
return bp - dst;
if (padding || srclen == 1)
return -1;
val = base64_rev_tables[(u8)src[0]] << 12 | base64_rev_tables[(u8)src[1]] << 6;
*bp++ = val >> 10;
if (srclen == 1) {
if (val & 0x800003ff)
return -1;
} else {
val |= base64_rev_tables[(u8)src[2]];
if (val & 0x80000003)
return -1;
*bp++ = val >> 2;
}
return bp - dst;
}
David
> case 2:
> input1 = base64_rev_tables[(u8)src[0]];
> input2 = base64_rev_tables[(u8)src[1]];
> val = (input1 << 6) | input2; /* 12 bits */
> if (unlikely((s32)val < 0 || val & 0x0F))
> return -1;
>
> *bp++ = (u8)(val >> 4);
> break;
> case 3:
> input1 = base64_rev_tables[(u8)src[0]];
> input2 = base64_rev_tables[(u8)src[1]];
> input3 = base64_rev_tables[(u8)src[2]];
>
> val = (input1 << 12) |
> (input2 << 6) |
> input3; /* 18 bits */
> if (unlikely((s32)val < 0 || val & 0x03))
> return -1;
>
> *bp++ = (u8)(val >> 10);
> *bp++ = (u8)(val >> 2);
> break;
> default:
> return -1;
> }
> }
>
> return bp - dst;
> }
> Based on KUnit testing, the performance results are as follows:
> base64_performance_tests: [64B] decode run : 40ns
> base64_performance_tests: [1KB] decode run : 463ns
>
> However, this approach introduces an issue. It uses 256 bytes of memory
> on the stack for base64_rev_tables, which might not be ideal. Does anyone
> have any thoughts or alternative suggestions to solve this issue, or is it
> not really a concern?
>
> Best regards,
> Guan-Chun
>
> > >
> > > Best,
> > > Caleb
> >
next prev parent reply other threads:[~2025-10-10 9:51 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-26 6:52 [PATCH v3 0/6] lib/base64: add generic encoder/decoder, migrate users Guan-Chun Wu
2025-09-26 6:55 ` [PATCH v3 1/6] lib/base64: Add support for multiple variants Guan-Chun Wu
2025-09-30 23:56 ` Caleb Sander Mateos
2025-10-01 14:09 ` Guan-Chun Wu
2025-09-26 6:55 ` [PATCH v3 2/6] lib/base64: Optimize base64_decode() with reverse lookup tables Guan-Chun Wu
2025-09-26 23:33 ` Caleb Sander Mateos
2025-09-28 6:37 ` Kuan-Wei Chiu
2025-10-01 10:18 ` Guan-Chun Wu
2025-10-01 16:20 ` Caleb Sander Mateos
2025-10-05 17:18 ` David Laight
2025-10-07 8:28 ` Guan-Chun Wu
2025-10-07 14:57 ` Caleb Sander Mateos
2025-10-07 17:11 ` Eric Biggers
2025-10-07 18:23 ` David Laight
2025-10-09 12:25 ` Guan-Chun Wu
2025-10-10 9:51 ` David Laight [this message]
2025-10-13 9:49 ` Guan-Chun Wu
2025-10-14 8:14 ` David Laight
2025-10-16 10:07 ` Guan-Chun Wu
2025-10-27 13:12 ` Guan-Chun Wu
2025-10-27 14:18 ` David Laight
2025-10-28 6:58 ` Guan-Chun Wu
2025-09-28 18:57 ` David Laight
2025-09-26 6:56 ` [PATCH v3 3/6] lib/base64: rework encode/decode for speed and stricter validation Guan-Chun Wu
2025-10-01 0:11 ` Caleb Sander Mateos
2025-10-01 9:39 ` Guan-Chun Wu
2025-10-06 20:52 ` David Laight
2025-10-07 8:34 ` Guan-Chun Wu
2025-09-26 6:56 ` [PATCH v3 4/6] lib: add KUnit tests for base64 encoding/decoding Guan-Chun Wu
2025-09-26 6:56 ` [PATCH v3 5/6] fscrypt: replace local base64url helpers with lib/base64 Guan-Chun Wu
2025-09-26 6:57 ` [PATCH v3 6/6] ceph: replace local base64 " Guan-Chun Wu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251010105138.0356ad75@pumpkin \
--to=david.laight.linux@gmail.com \
--cc=409411716@gms.tku.edu.tw \
--cc=akpm@linux-foundation.org \
--cc=axboe@kernel.dk \
--cc=ceph-devel@vger.kernel.org \
--cc=csander@purestorage.com \
--cc=ebiggers@kernel.org \
--cc=hch@lst.de \
--cc=home7438072@gmail.com \
--cc=idryomov@gmail.com \
--cc=jaegeuk@kernel.org \
--cc=kbusch@kernel.org \
--cc=linux-fscrypt@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
--cc=tytso@mit.edu \
--cc=visitorckw@gmail.com \
--cc=xiubli@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.