From: David Laight <david.laight.linux@gmail.com>
To: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>,
Guan-Chun Wu <409411716@gms.tku.edu.tw>,
Andrew Morton <akpm@linux-foundation.org>,
ebiggers@kernel.org, tytso@mit.edu, jaegeuk@kernel.org,
xiubli@redhat.com, idryomov@gmail.com, kbusch@kernel.org,
axboe@kernel.dk, hch@lst.de, sagi@grimberg.me,
home7438072@gmail.com, linux-nvme@lists.infradead.org,
linux-fscrypt@vger.kernel.org, ceph-devel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/6] lib/base64: add generic encoder/decoder, migrate users
Date: Mon, 3 Nov 2025 13:22:13 +0000
Message-ID: <20251103132213.5feb4586@pumpkin>
In-Reply-To: <aQiM7OWWM0dXTT0J@google.com>
On Mon, 3 Nov 2025 19:07:24 +0800
Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
> +Cc David
>
> Hi Guan-Chun,
>
> If we need to respin this series, please Cc David when sending the next
> version.
>
> On Mon, Nov 03, 2025 at 11:24:35AM +0100, Andy Shevchenko wrote:
> > On Fri, Oct 31, 2025 at 09:09:47PM -0700, Andrew Morton wrote:
> > > On Wed, 29 Oct 2025 18:17:25 +0800 Guan-Chun Wu <409411716@gms.tku.edu.tw> wrote:
> > >
> > > > This series introduces a generic Base64 encoder/decoder to the kernel
> > > > library, eliminating duplicated implementations and delivering significant
> > > > performance improvements.
> > > >
> > > > The Base64 API has been extended to support multiple variants (Standard,
> > > > URL-safe, and IMAP) as defined in RFC 4648 and RFC 3501. The API now takes
> > > > a variant parameter and an option to control padding. As part of this
> > > > series, users are migrated to the new interface while preserving their
> > > > specific formats: fscrypt now uses BASE64_URLSAFE, Ceph uses BASE64_IMAP,
> > > > and NVMe is updated to BASE64_STD.
> > > >
> > > > On the encoder side, the implementation processes input in 3-byte blocks,
> > > > mapping 24 bits directly to 4 output symbols. This avoids bit-by-bit
> > > > streaming and reduces loop overhead, achieving about a 2.7x speedup compared
> > > > to previous implementations.
> > > >
> > > > On the decoder side, replace strchr() lookups with per-variant reverse tables
> > > > and process input in 4-character groups. Each group is mapped to numeric values
> > > > and combined into 3 bytes. Padded and unpadded forms are validated explicitly,
> > > > rejecting invalid '=' usage and enforcing tail rules.
> > >
> > > Looks like wonderful work, thanks. And it's good to gain a selftest
> > > for this code.
> > >
> > > > This improves throughput by ~43-52x.
> > >
> > > Well that isn't a thing we see every day.
> >
> > I agree with the judgement, the problem is that this broke drastically a build:
> >
> > lib/base64.c:35:17: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
> > 35 | [BASE64_STD] = BASE64_REV_INIT('+', '/'),
> > | ^~~~~~~~~~~~~~~~~~~~~~~~~
> > lib/base64.c:26:11: note: expanded from macro 'BASE64_REV_INIT'
> > 26 | ['A'] = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, \
> > | ^
> > lib/base64.c:35:17: note: previous initialization is here
> > 35 | [BASE64_STD] = BASE64_REV_INIT('+', '/'),
> > | ^~~~~~~~~~~~~~~~~~~~~~~~~
> > lib/base64.c:25:16: note: expanded from macro 'BASE64_REV_INIT'
> > 25 | [0 ... 255] = -1, \
> > | ^~
> > ...
> > fatal error: too many errors emitted, stopping now [-ferror-limit=]
> > 20 errors generated.
> >
> Since I didn't notice this build failure, I guess this happens during a
> W=1 build? Sorry for that. Maybe I should add W=1 compilation testing
> to my checklist before sending patches in the future. I also got an
> email from the kernel test robot with a duplicate initialization
> warning from the sparse tool [1], pointing to the same code.
>
> This implementation was based on David's previous suggestion [2] to
> first default all entries to -1 and then set the values for the 64
> character entries. This was to avoid expanding the large 256 * 3 table
> and improve code readability.
>
> Hi David,
>
> Since I believe many people test and care about W=1 builds,
Last time I tried a W=1 build it failed horribly because of 'type-limits'.
The kernel trips that warning all the time - usually from its own error
checks inside #defines and inline functions.
Certainly some of the changes I've seen to silence W=1 warnings are really
a bad idea - but that is a bit of a digression.

Warnings can be temporarily disabled using #pragma.
That might be the best thing to do here with this over-zealous warning.
This compiles with both gcc and clang (even though the warnings have different names):

	#pragma GCC diagnostic push
	#pragma GCC diagnostic ignored "-Woverride-init"
	int x[16] = { [0 ... 15] = -1, [5] = 5 };
	#pragma GCC diagnostic pop
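
As a rough sketch of how that could wrap a reverse table while keeping the
"default everything to -1, then override" style: REV_INIT, rev_std and
rev_url below are made-up names for illustration, not the macro or tables
from the patch, and ASCII is assumed.

```c
#include <stdint.h>

/*
 * Illustrative only: REV_INIT, rev_std and rev_url are hypothetical
 * names, not the code from the series.  The override-init warning is
 * disabled just around the macro and the tables that use it.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Woverride-init"

/* Default every slot to -1, then override the 64 valid symbols. */
#define REV_INIT(c62, c63) {						\
	[0 ... 255] = -1,						\
	['A'] =  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12,	\
		13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,	\
	['a'] = 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,	\
		39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,	\
	['0'] = 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,			\
	[c62] = 62,							\
	[c63] = 63,							\
}

static const int8_t rev_std[256] = REV_INIT('+', '/');
static const int8_t rev_url[256] = REV_INIT('-', '_');

#pragma GCC diagnostic pop
```

A decoder would index the table with an unsigned char and treat a negative
result as an invalid symbol. The macro definition sits inside the push/pop
region here only to be safe about where each compiler attributes the warning.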
> I think we need to find another way to avoid this warning?
> Perhaps we could consider what you suggested:
>
> #define BASE64_REV_INIT(val_plus, val_comma, val_minus, val_slash, val_under) { \
> [ 0 ... '+'-1 ] = -1, \
> [ '+' ] = val_plus, val_comma, val_minus, -1, val_slash, \
> [ '0' ] = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, \
> [ '9'+1 ... 'A'-1 ] = -1, \
> [ 'A' ] = 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, \
> 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, \
> [ 'Z'+1 ... '_'-1 ] = -1, \
> [ '_' ] = val_under, \
> [ '_'+1 ... 'a'-1 ] = -1, \
> [ 'a' ] = 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, \
> 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, \
> [ 'z'+1 ... 255 ] = -1 \
> }
I just checked, neither gcc nor clang allows empty ranges (e.g. [ 6 ... 5 ] = -1).
That means whoever writes the table has to know which characters are
adjacent as well as getting the order right.
Basically avoiding the warning sucks.
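
To make the adjacency point concrete, here is a small stand-in written in the
non-overlapping style. hex_rev is a hypothetical hex-digit table, not anything
from the series, and ASCII is assumed.

```c
#include <stdint.h>

/*
 * Hypothetical hex-digit reverse table (not from the series), written
 * so that no slot is initialised twice.  Every gap between runs of
 * valid characters needs its own non-empty range.
 */
static const int8_t hex_rev[256] = {
	[0 ... '0' - 1]		= -1,
	['0']			= 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
	['9' + 1 ... 'A' - 1]	= -1,
	['A']			= 10, 11, 12, 13, 14, 15,
	['F' + 1 ... 'a' - 1]	= -1,
	['a']			= 10, 11, 12, 13, 14, 15,
	['f' + 1 ... 255]	= -1,
	/*
	 * If two valid characters were adjacent, say 'F' and a valid 'G',
	 * the gap would be ['F' + 1 ... 'G' - 1], an empty range, so the
	 * two would have to be listed as one run instead.
	 */
};
```

That is manageable for hex because the three runs are well separated. The
base64 variants are worse: '+', ',', '-' and '/' sit next to each other and
'/' is immediately followed by '0'.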
> Or should we just expand the 256 * 3 table as it was before?
That has much the same issue - IIRC it relies on three big sequential lists.
The #pragma may be best - but it doesn't fix the sparse warning (unless
sparse honours the pragmas as well).
David
>
> [1]: https://lore.kernel.org/oe-kbuild-all/202511021343.107utehN-lkp@intel.com/
> [2]: https://lore.kernel.org/lkml/20250928195736.71bec9ae@pumpkin/
>
> Regards,
> Kuan-Wei