Re: [PATCH v3 2/6] lib/base64: Optimize base64_decode() with reverse lookup tables

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: David Laight <david.laight.linux@gmail.com>
To: Guan-Chun Wu <409411716@gms.tku.edu.tw>
Cc: Caleb Sander Mateos <csander@purestorage.com>,
	akpm@linux-foundation.org, axboe@kernel.dk,
	ceph-devel@vger.kernel.org, ebiggers@kernel.org, hch@lst.de,
	home7438072@gmail.com, idryomov@gmail.com, jaegeuk@kernel.org,
	kbusch@kernel.org, linux-fscrypt@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	sagi@grimberg.me, tytso@mit.edu, visitorckw@gmail.com,
	xiubli@redhat.com
Subject: Re: [PATCH v3 2/6] lib/base64: Optimize base64_decode() with reverse lookup tables
Date: Fri, 10 Oct 2025 10:51:38 +0100	[thread overview]
Message-ID: <20251010105138.0356ad75@pumpkin> (raw)
In-Reply-To: <aOeprat4/97oSWE0@wu-Pro-E500-G6-WS720T>

On Thu, 9 Oct 2025 20:25:17 +0800
Guan-Chun Wu <409411716@gms.tku.edu.tw> wrote:

...
> As Eric mentioned, the decoder in fs/crypto/ needs to reject invalid input.

(to avoid two different input buffers giving the same output)

Which is annoyingly reasonable.

> One possible solution I came up with is to first create a shared
> base64_rev_common lookup table as the base for all Base64 variants.
> Then, depending on the variant (e.g., BASE64_STD, BASE64_URLSAFE, etc.), we
> can dynamically adjust the character mappings for position 62 and position 63
> at runtime, based on the variant.
> 
> Here are the changes to the code:
> 
> static const s8 base64_rev_common[256] = {
> 	[0 ... 255] = -1,
> 	['A'] =  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12,
> 		13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
> 	['a'] = 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
> 		39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
> 	['0'] = 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
> };
> 
> static const struct {
> 	char char62, char63;
> } base64_symbols[] = {
> 	[BASE64_STD] = { '+', '/' },
> 	[BASE64_URLSAFE] = { '-', '_' },
> 	[BASE64_IMAP] = { '+', ',' },
> };
> 
> int base64_decode(const char *src, int srclen, u8 *dst, bool padding, enum base64_variant variant)
> {
> 	u8 *bp = dst;
> 	u8 pad_cnt = 0;
> 	s8 input1, input2, input3, input4;
> 	u32 val;
> 	s8 base64_rev_tables[256];
> 
> 	/* Validate the input length for padding */
> 	if (unlikely(padding && (srclen & 0x03) != 0))
> 		return -1;

There is no need for an early check.
Pick it up after the loop when 'srclen != 0'.

> 
> 	memcpy(base64_rev_tables, base64_rev_common, sizeof(base64_rev_common));

Ugg - having a memcpy() here is not a good idea.
It really is better to have 3 arrays, but use a 'mostly common' initialiser.
Perhaps:
#define BASE64_REV_INIT(ch_62, ch_63) = { \
	[0 ... 255] = -1, \
	['A'] =  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, \
		13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, \
	['a'] = 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, \
		39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, \
	['0'] = 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, \
	[ch_62] = 62, [ch_63] = 63, \
}

static const s8 base64_rev_maps[][256] = {
	[BASE64_STD] = BASE64_REV_INIT('+', '/'),
	[BASE64_URLSAFE] = BASE64_REV_INIT('-', '_'),
	[BASE64_IMAP] = BASE64_REV_INIT('+', ',')
};

Then (after validating variant):
	const s8 *map = base64_rev_maps[variant];

> 
> 	if (variant < BASE64_STD || variant > BASE64_IMAP)
> 		return -1;
> 
> 	base64_rev_tables[base64_symbols[variant].char62] = 62;
> 	base64_rev_tables[base64_symbols[variant].char63] = 63;
> 
> 	while (padding && srclen > 0 && src[srclen - 1] == '=') {
> 		pad_cnt++;
> 		srclen--;
> 		if (pad_cnt > 2)
> 			return -1;
> 	}

I'm not sure I'd to that there.
You are (in some sense) optimising for padding.
From what I remember, "abcd" gives 24 bits, "abc=" 16 and "ab==" 8.

> 
> 	while (srclen >= 4) {
> 		/* Decode the next 4 characters */
> 		input1 = base64_rev_tables[(u8)src[0]];
> 		input2 = base64_rev_tables[(u8)src[1]];
> 		input3 = base64_rev_tables[(u8)src[2]];
> 		input4 = base64_rev_tables[(u8)src[3]];

I'd be tempted to make src[] unsigned - probably be assigning the parameter
to a local at the top of the function.

Also you have input3 = ... src[2]...
Perhaps they should be input[0..3] instead.

> 
> 		val = (input1 << 18) |
> 		      (input2 << 12) |
> 		      (input3 << 6) |
> 		      input4;

Four lines is excessive, C doesn't require the () and I'm not sure the
compilers complain about << and |.

> 
> 		if (unlikely((s32)val < 0))
> 			return -1;

Make 'val' signed - then you don't need the cast.
You can pick up the padding check here, something like:
			val = input1 << 18 | input2 << 12;
			if (!padding || val < 0 || src[3] != '=')
				return -1;
			*bp++ = val >> 16;
			if (src[2] == '=')
				return bp - dst;
			if (input3 < 0)
				return -1;
			val |= input3 << 6;
			*bp++ = val >> 8;
			return bp - dst;

Or, if you really want to use the code below the loop:
			if (!padding || src[3] != '=')
				return -1;
			padding = 0;
			srclen -= 1 + (src[2] == '=');
			break;


> 
> 		*bp++ = (u8)(val >> 16);
> 		*bp++ = (u8)(val >> 8);
> 		*bp++ = (u8)val;

You don't need those casts.

> 
> 		src += 4;
> 		srclen -= 4;
> 	}
> 
> 	/* Handle leftover characters when padding is not used */

You are coming here with padding.
I'm not sure what should happen without padding.
For a multi-line file decode I suspect the characters need adding to
the start of the next line (ie lines aren't required to contain
multiples of 4 characters - even though they almost always will).

> 	if (srclen > 0) {
> 		switch (srclen) {

You don't need an 'if' and a 'switch'.
srclen is likely to be zero, but perhaps write as:
	if (likely(!srclen))
		return bp - dst;
	if (padding || srclen == 1)
		return -1;

	val = base64_rev_tables[(u8)src[0]] << 12 | base64_rev_tables[(u8)src[1]] << 6;
	*bp++ = val >> 10;
	if (srclen == 1) {
		if (val & 0x800003ff)
			return -1;
	} else {
		val |= base64_rev_tables[(u8)src[2]];
		if (val & 0x80000003)
			return -1;
		*bp++ = val >> 2;
	}
	return bp - dst;
}

	David

> 		case 2:
> 			input1 = base64_rev_tables[(u8)src[0]];
> 			input2 = base64_rev_tables[(u8)src[1]];
> 			val = (input1 << 6) | input2; /* 12 bits */
> 			if (unlikely((s32)val < 0 || val & 0x0F))
> 				return -1;
> 
> 			*bp++ = (u8)(val >> 4);
> 			break;
> 		case 3:
> 			input1 = base64_rev_tables[(u8)src[0]];
> 			input2 = base64_rev_tables[(u8)src[1]];
> 			input3 = base64_rev_tables[(u8)src[2]];
> 
> 			val = (input1 << 12) |
> 			      (input2 << 6) |
> 			      input3; /* 18 bits */
> 			if (unlikely((s32)val < 0 || val & 0x03))
> 				return -1;
> 
> 			*bp++ = (u8)(val >> 10);
> 			*bp++ = (u8)(val >> 2);
> 			break;
> 		default:
> 			return -1;
> 		}
> 	}
> 
> 	return bp - dst;
> }
> Based on KUnit testing, the performance results are as follows:
> 	base64_performance_tests: [64B] decode run : 40ns
> 	base64_performance_tests: [1KB] decode run : 463ns
> 
> However, this approach introduces an issue. It uses 256 bytes of memory
> on the stack for base64_rev_tables, which might not be ideal. Does anyone
> have any thoughts or alternative suggestions to solve this issue, or is it
> not really a concern?
> 
> Best regards,
> Guan-Chun
> 
> > > 
> > > Best,
> > > Caleb  
> >

next prev parent reply	other threads:[~2025-10-10  9:51 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-26  6:52 [PATCH v3 0/6] lib/base64: add generic encoder/decoder, migrate users Guan-Chun Wu
2025-09-26  6:55 ` [PATCH v3 1/6] lib/base64: Add support for multiple variants Guan-Chun Wu
2025-09-30 23:56   ` Caleb Sander Mateos
2025-10-01 14:09     ` Guan-Chun Wu
2025-09-26  6:55 ` [PATCH v3 2/6] lib/base64: Optimize base64_decode() with reverse lookup tables Guan-Chun Wu
2025-09-26 23:33   ` Caleb Sander Mateos
2025-09-28  6:37     ` Kuan-Wei Chiu
2025-10-01 10:18     ` Guan-Chun Wu
2025-10-01 16:20       ` Caleb Sander Mateos
2025-10-05 17:18         ` David Laight
2025-10-07  8:28           ` Guan-Chun Wu
2025-10-07 14:57             ` Caleb Sander Mateos
2025-10-07 17:11               ` Eric Biggers
2025-10-07 18:23               ` David Laight
2025-10-09 12:25                 ` Guan-Chun Wu
2025-10-10  9:51                   ` David Laight [this message]
2025-10-13  9:49                     ` Guan-Chun Wu
2025-10-14  8:14                       ` David Laight
2025-10-16 10:07                         ` Guan-Chun Wu
2025-10-27 13:12                         ` Guan-Chun Wu
2025-10-27 14:18                           ` David Laight
2025-10-28  6:58                             ` Guan-Chun Wu
2025-09-28 18:57   ` David Laight
2025-09-26  6:56 ` [PATCH v3 3/6] lib/base64: rework encode/decode for speed and stricter validation Guan-Chun Wu
2025-10-01  0:11   ` Caleb Sander Mateos
2025-10-01  9:39     ` Guan-Chun Wu
2025-10-06 20:52       ` David Laight
2025-10-07  8:34         ` Guan-Chun Wu
2025-09-26  6:56 ` [PATCH v3 4/6] lib: add KUnit tests for base64 encoding/decoding Guan-Chun Wu
2025-09-26  6:56 ` [PATCH v3 5/6] fscrypt: replace local base64url helpers with lib/base64 Guan-Chun Wu
2025-09-26  6:57 ` [PATCH v3 6/6] ceph: replace local base64 " Guan-Chun Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251010105138.0356ad75@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=409411716@gms.tku.edu.tw \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=ceph-devel@vger.kernel.org \
    --cc=csander@purestorage.com \
    --cc=ebiggers@kernel.org \
    --cc=hch@lst.de \
    --cc=home7438072@gmail.com \
    --cc=idryomov@gmail.com \
    --cc=jaegeuk@kernel.org \
    --cc=kbusch@kernel.org \
    --cc=linux-fscrypt@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    --cc=tytso@mit.edu \
    --cc=visitorckw@gmail.com \
    --cc=xiubli@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox