AF_ALG hardening

public inbox for linux-kernel@vger.kernel.org

* AF_ALG hardening
       [not found]                   ` <20260502033556.GA3872267@google.com>
@ 2026-05-02  4:52                     ` Demi Marie Obenour
  2026-05-02  8:19                       ` Simon Richter
  2026-05-02 19:16                       ` Eric Biggers
       [not found]                     ` <20260502035402.GB3872267@google.com>
  1 sibling, 2 replies; 10+ messages in thread
From: Demi Marie Obenour @ 2026-05-02  4:52 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Jan Schaumann, iwd, Linux kernel mailing list, linux-crypto,
	Herbert Xu

On 5/1/26 23:35, Eric Biggers wrote:
> On Fri, May 01, 2026 at 08:21:27PM -0400, Demi Marie Obenour wrote:
>> I think the single biggest hardening win for AF_ALG would be to move
>> to the crypto library.  The recent CVEs you mentioned mostly seem
>> to relate to the crypto API, and with a hard-coded list of allowed
>> algorithms there's no need to use the crypto API anymore.  I'm not
>> familiar enough with kernel code to do this easily, but for anyone
>> with basic knowledge of the existing code it should (hopefully) be
>> straightforward.
>>
>> In the meantime, only using synchronous algorithms and not using
>> hardware drivers would also be a useful simplification.  The latter
>> would make it especially clear that AF_ALG is deprecated, because
>> its one potential advantage (being able to use hardware acceleration)
>> would no longer be present.
> 
> The kernel's crypto library
> (https://docs.kernel.org/crypto/libcrypto.html) does greatly simplify a
> lot of kernel code that needs to use crypto algorithms.  Yes, AF_ALG
> doesn't use it directly yet.  Currently AF_ALG puts all the data in
> (zero-copy) scatterlists, then invokes the "traditional crypto API"
> which is very complex and has full scatterlist support, asynchronous
> execution support, an algorithm template system, etc.  In some cases the
> crypto library is then used internally, but it's not called directly.
> 
> So the idea would be something along the lines of:
> 
> - Add an algorithm allowlist to AF_ALG.  It would include only what the
>   small set of userspace programs that uses it actually needs.  Bizarre
>   stuff like "authencesn" wouldn't be included.
> 
> - Change AF_ALG to make it copy any data written to an AF_ALG file
>   descriptor into an internal kernel buffer.  Put the output in another
>   internal kernel buffer, then copy it to userspace.  No zero-copy, and
>   no scatterlists.  Both restrictions would greatly reduce the chance of
>   bugs: the actual crypto algorithms would operate only on these
>   internal buffers, not on pagecache data (e.g. the contents of 'su') or
>   buffers that userspace can concurrently modify.  The use of simple
>   virtual addresses would eliminate all the scatterlist complexity.
> 
> - AF_ALG would implement each algorithm by invoking the corresponding
>   crypto library functions
>   (https://docs.kernel.org/crypto/libcrypto.html#api-documentation).  No
>   asynchronous execution, no buggy hardware crypto drivers, etc.

Yup!  That's exactly the idea.
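
A rough userspace model of the first two bullets, in Python rather than
kernel C.  The allowlist entries and the names `ALLOWED`, `bind_allowed`
and `handle_hash_request` are hypothetical, chosen only for illustration;
this is not a proposed list or an existing kernel interface.

```python
import hashlib

# Hypothetical allowlist keyed by (type, name), the two strings userspace
# supplies when binding an AF_ALG socket.  Entries are illustrative only.
ALLOWED = frozenset({
    ("hash", "sha256"),
    ("hash", "sha512"),
    ("skcipher", "cbc(aes)"),
})


def bind_allowed(alg_type, alg_name):
    """Exact-match lookup: no template parsing, no fallback."""
    return (alg_type, alg_name) in ALLOWED


def handle_hash_request(alg_name, user_buf):
    """Hash a private copy of the caller's buffer and return a fresh result."""
    if not bind_allowed("hash", alg_name):
        raise PermissionError(alg_name)
    private = bytes(user_buf)  # copy-in: later writes to user_buf are not seen
    return hashlib.new(alg_name, private).digest()  # copy-out: new bytes object
```

The point of the copy is that the algorithm only ever sees memory the
caller can no longer change, which is the property the second bullet is
after.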

> It sounds good to me.  For people who feel like the su binary on their
> system is a bit too restrictive and would like to fix that, these
> changes might not be all that great for them.  But for the rest of us,
> they should work rather well.
> 
> Of course, it'll also be a fair bit of work, and unfortunately I also
> expect pushback from people who (incorrectly IMO) think that AF_ALG
> performance is important, even more so than security.

If one cares about crypto offload performance, they would be better
served by creating a better interface to it than AF_ALG.  AF_ALG is
a horrible API with (presumably) tons of overhead.  I know the QAT
driver and an Nvidia BlueField DPU accelerator driver both bypass it.

Furthermore, AF_ALG only supports symmetric algorithms.  These
algorithms are inexpensive in software, so the cost of going to an
accelerator and back is enormous compared to the cost of a single
operation.  For offload to even a very fast accelerator to make sense,
one must be able to deeply pipeline requests.  However, this creates
a huge amount of additional complexity for software.

On the other hand, asymmetric cryptography performs far more work per
operation.  This might (no benchmarks!) mean that offloading asymmetric
algorithms makes more sense than offloading symmetric ones.  The cost
of sending the work to the accelerator and waiting for completion
is less than the time needed to perform the operation, so even a
synchronous interface could still be faster than running the algorithm
on the CPU.  Furthermore, long-term keys are very often asymmetric,
so DPA and EMA protections are much more likely to be relevant here.
Asymmetric accelerators also don't have a better alternative in the
form of inline encryption hardware.
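
The break-even argument above can be written down as a one-line model.
This is only a sketch: the function name is hypothetical and the numbers
are made up to show the shape of the comparison, not measurements.

```python
def offload_wins(cpu_time_us, accel_time_us, round_trip_us):
    """True if a synchronous offload finishes sooner than the CPU would.

    round_trip_us is the fixed cost of submitting the request and waiting
    for completion; accel_time_us is the time the device spends computing.
    """
    return round_trip_us + accel_time_us < cpu_time_us


# Illustrative (invented) figures, in microseconds.
small_symmetric = offload_wins(cpu_time_us=1.0, accel_time_us=0.5, round_trip_us=10.0)
asymmetric = offload_wins(cpu_time_us=500.0, accel_time_us=50.0, round_trip_us=10.0)
```

With these invented figures the small symmetric operation loses to the
CPU and the asymmetric one wins, which is the qualitative claim; real
numbers would have to come from benchmarks.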

I think a high performance interface to hardware cryptography (and,
more importantly, compression) would look much more like RDMA.
There would be a kernel driver that did the bare minimum to provide
isolation between userspace programs, and a userspace driver that
was responsible for abstracting over the hardware.

> Either way, the first step will be to create the algorithm allowlist,
> which should happen anyway, regardless of the other changes.

The simplest changes I can see are:

1. Get rid of zero-copy support (splice()).
2. Get rid of AIO support.
3. Only allow software implementations.

All of these are really simple.  I can send patches, but be warned
that they would only be compile-tested, as I don't know how to test
the code.

I removed oss-security from CC as this is now a Linux kernel development
discussion.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)


* Re: AF_ALG hardening
  2026-05-02  4:52                     ` AF_ALG hardening Demi Marie Obenour
@ 2026-05-02  8:19                       ` Simon Richter
  2026-05-02 20:42                         ` Demi Marie Obenour
  2026-05-02 19:16                       ` Eric Biggers
  1 sibling, 1 reply; 10+ messages in thread
From: Simon Richter @ 2026-05-02  8:19 UTC (permalink / raw)
  To: Demi Marie Obenour, Eric Biggers
  Cc: Jan Schaumann, iwd, Linux kernel mailing list, linux-crypto,
	Herbert Xu

Hi,

On 5/2/26 13:52, Demi Marie Obenour wrote:

>> Of course, it'll also be a fair bit of work, and unfortunately I also
>> expect pushback from people who (incorrectly IMO) think that AF_ALG
>> performance is important, even more so than security.

AF_ALG performance (time/power) is important in the way that it's 
literally the only point to its existence. If all it provides is extra 
overhead over a software implementation, then it makes no sense to keep it.

> If one cares about crypto offload performance, they would be better
> served by creating a better interface to it than AF_ALG.  AF_ALG is
> a horrible API with (presumably) tons of overhead.  I know the QAT
> driver and an Nvidia BlueField DPU accelerator driver both bypass it.

The API is designed to be zerocopy, that's why it's this horrible 
combination of socket API and splice(). The general assumption here is 
that it does not make sense to offload small requests in the first 
place, and application programmers are aware of that.

The use case is "I have a file or pipe full of data and a device with a 
kernel driver that should process it, can we somehow avoid copying the 
data to userspace only to immediately copy it back to kernelspace?"

This copying is even more silly if the actual question I have in 
userspace is "what is the SHA256 checksum of this file?" or "what is the 
SHA256 checksum of the string 'blob 8794311528\0' followed by this 
file?" (where you can see why anyone would ask such a silly question and 
prefer to use the dedicated hardware that processes 24 GB/s over the CPU 
at 100 MB/s)
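
For concreteness, the second question (a short header followed by the
file contents) looks like this when done purely in userspace software.
A minimal sketch using Python's hashlib; the function name, the stream
argument and the chunk size are assumptions made for the example.

```python
import hashlib
import io


def blob_sha256(stream, size, chunk_size=1 << 20):
    """SHA-256 of b'blob <size>\\0' followed by the stream's contents."""
    h = hashlib.sha256()
    h.update(b"blob %d\0" % size)  # header is hashed before any file data
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)  # data is fed in pieces, never held in full
    return h.hexdigest()


# Usage, with an in-memory stream standing in for a large file:
example = blob_sha256(io.BytesIO(b"hello"), 5)
```

Every chunk here is copied from the kernel into userspace before being
hashed, which is exactly the copy the zero-copy design tries to avoid.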

> Furthermore, AF_ALG only supports symmetric algorithms.  These
> algorithms are inexpensive in software, so the cost of going to an
> accelerator and back is enormous compared to the cost of a single
> operation.

Yes, initial setup cost is high, so this only makes sense for large 
requests or batches (submitting individual requests is generally cheap, 
the difficulty is ensuring the data is accessible to the hardware).

That's also why there are no asymmetric algorithms: these aren't 
generally used on large amounts of data, so it's never worth it to 
offload these.

It would make sense to offload asymmetric algorithms if there was a 
secure key storage inside the device, but AFAIK the API does not support 
that, or even the notion of on-device contexts.

It is not a good API, and it sits on top of the ahash/acomp/acrypt 
interfaces which are also unfriendly to accelerator hardware.

> For offload to even a very fast accelerator to make sense,
> one must be able to deeply pipeline requests.  However, this creates
> a huge amount of additional complexity for software.

Software that has requirements like that is already complex -- if I have 
a few thousand workload packets, I need a worker pool.

If I don't have these requirements, then indeed I am better off with a 
software-only solution in userspace, because it is not relevant from a 
performance standpoint.

> Asymmetric accelerators also don't have a better alternative in the
> form of inline encryption hardware.

Quite a number of architectures do not have inline encryption support, 
and these are more likely to use offload hardware even for smaller 
requests (e.g. for power saving).

> I think a high performance interface to hardware cryptography (and,
> more importantly, compression) would look much more like RDMA.
> There would be a kernel driver that did the bare minimum to provide
> isolation between userspace programs, and a userspace driver that
> was responsible for abstracting over the hardware.

Offload hardware comes in two flavours: the high-throughput kind, built 
into devices where no one cares about power, and the 
lower-power-than-the-CPU-doing-it kind.

The former can easily provide user contexts even in virtualized 
environments, but the latter is generally found in systems that do not 
even have an IOMMU. Either we have two distinct interfaces for these, or 
we need one that can handle either.

My feeling is that no one is happy with either AF_ALG or the 
asynchronous interfaces in general, so I think they should be removed 
completely, and there should be a separate "offload" SIG that creates 
new interfaces that are actually usable with current hardware.

 > 1. Get rid of zero-copy support (splice()).
 > 2. Get rid of AIO support.
 > 3. Only allow software implementations.

That makes sense if we're forced to keep the interface for now, but it 
means that offload support through the crypto subsystem is completely 
dead, and anyone wanting to support offload hardware needs to go 
elsewhere. Can we get a definitive statement that this is intended?

    Simon


* Re: AF_ALG hardening
  2026-05-02  8:19                       ` Simon Richter
@ 2026-05-02 20:42                         ` Demi Marie Obenour
  0 siblings, 0 replies; 10+ messages in thread
From: Demi Marie Obenour @ 2026-05-02 20:42 UTC (permalink / raw)
  To: Simon Richter, Eric Biggers
  Cc: Jan Schaumann, iwd, Linux kernel mailing list, linux-crypto,
	Herbert Xu

On 5/2/26 04:19, Simon Richter wrote:
> Hi,
> 
> On 5/2/26 13:52, Demi Marie Obenour wrote:
> 
>>> Of course, it'll also be a fair bit of work, and unfortunately I also
>>> expect pushback from people who (incorrectly IMO) think that AF_ALG
>>> performance is important, even more so than security.
> 
> AF_ALG performance (time/power) is important in the way that it's 
> literally the only point to its existence. If all it provides is extra 
> overhead over a software implementation, then it makes no sense to keep it.

The only reason for keeping it is for compatibility with existing
userspace.

>> If one cares about crypto offload performance, they would be better
>> served by creating a better interface to it than AF_ALG.  AF_ALG is
>> a horrible API with (presumably) tons of overhead.  I know the QAT
>> driver and an Nvidia BlueField DPU accelerator driver both bypass it.
> 
> The API is designed to be zerocopy, that's why it's this horrible 
> combination of socket API and splice(). The general assumption here is 
> that it does not make sense to offload small requests in the first 
> place, and application programmers are aware of that.
> 
> The use case is "I have a file or pipe full of data and a device with a 
> kernel driver that should process it, can we somehow avoid copying the 
> data to userspace only to immediately copy it back to kernelspace?"
> 
> This copying is even more silly if the actual question I have in 
> userspace is "what is the SHA256 checksum of this file?" or "what is the 
> SHA256 checksum of the string 'blob 8794311528\0' followed by this 
> file?" (where you can see why anyone would ask such a silly question and 
> prefer to use the dedicated hardware that processes 24 GB/s over the CPU 
> at 100 MB/s)

Do you have a specific device that has such hardware and can use an
upstream kernel?  I have yet to see any concrete examples.

>> Furthermore, AF_ALG only supports symmetric algorithms.  These
>> algorithms are inexpensive in software, so the cost of going to an
>> accelerator and back is enormous compared to the cost of a single
>> operation.
> 
> Yes, initial setup cost is high, so this only makes sense for large 
> requests or batches (submitting individual requests is generally cheap, 
> the difficulty is ensuring the data is accessible to the hardware).
> 
> That's also why there are no asymmetric algorithms: these aren't 
> generally used on large amounts of data, so it's never worth it to 
> offload these.

Asymmetric cryptography is far more expensive than symmetric
cryptography.  Tens of microseconds or more on a high-end CPU
I believe.

That is more than enough time to justify going to an accelerator and
back, if the accelerator can do the job significantly faster.

> It would make sense to offload asymmetric algorithms if there was a 
> secure key storage inside the device, but AFAIK the API does not support 
> that, or even the notion of on-device contexts.
> 
> It is not a good API, and it sits on top of the ahash/acomp/acrypt 
> interfaces which are also unfriendly to accelerator hardware.

Not surprised.

>> For offload to even a very fast accelerator to make sense,
>> one must be able to deeply pipeline requests.  However, this creates
>> a huge amount of additional complexity for software.
> 
> Software that has requirements like that is already complex -- if I have 
> a few thousand workload packets, I need a worker pool.

Or a thread-per-core architecture.

> If I don't have these requirements, then indeed I am better off with a 
> software-only solution in userspace, because it is not relevant from a 
> performance standpoint.
> 
>> Asymmetric accelerators also don't have a better alternative in the
>> form of inline encryption hardware.
> 
> Quite a number of architectures do not have inline encryption support, 
> and these are more likely to use offload hardware even for smaller 
> requests (e.g. for power saving).

Please provide a real-world example where using the accelerator
really does save power compared to running the cryptography on the CPU.

>> I think a high performance interface to hardware cryptography (and,
>> more importantly, compression) would look much more like RDMA.
>> There would be a kernel driver that did the bare minimum to provide
>> isolation between userspace programs, and a userspace driver that
>> was responsible for abstracting over the hardware.
> 
> Offload hardware comes in two flavours: the high-throughput kind, built 
> into devices where no one cares about power, and the 
> lower-power-than-the-CPU-doing-it kind.

Again, please provide benchmarks.  I have yet to see a real-world
example where the accelerator is faster for short (read: realistic)
message sizes.  Eric Biggers has provided many where it is far slower,
and didn't find any situation where it saved power.

For very long messages, yes, it can be faster.  But I have yet to
see a situation where (a) performance for large files matters and (b)
there is an accelerator worth using.

Network and storage encryption is obviously performance-critical,
but it uses small messages.  Furthermore, both of them are well-suited
to inline cryptographic engines, which are much more efficient.

I mostly associate large file encryption and hashing with things
like verifying software updates.  On large systems, this matters
because a human is waiting.  However, these systems are also ones
for which software cryptography is very fast.  On small systems,
I expect update validation to be much less performance-critical.

> The former can easily provide user contexts even in virtualized 
> environments, but the latter is generally found in systems that do not 
> even have an IOMMU. Either we have two distinct interfaces for these, or 
> we need one that can handle either.
> 
> My feeling is that no one is happy with either AF_ALG or the 
> asynchronous interfaces in general, so I think they should be removed 
> completely, and there should be a separate "offload" SIG that creates 
> new interfaces that are actually usable with current hardware.
> 
>  > 1. Get rid of zero-copy support (splice()).
>  > 2. Get rid of AIO support.
>  > 3. Only allow software implementations.
> 
> That makes sense if we're forced to keep the interface for now, but it 
> means that offload support through the crypto subsystem is completely 
> dead, and anyone wanting to support offload hardware needs to go 
> elsewhere. Can we get a definitive statement that this is intended?

AF_ALG is dead.  Much of the rest of the kernel is moving from the
crypto API to the software-only crypto library.

Offload is far more complex than software cryptography, so there needs
to be a substantial benefit to justify using it.  Have you seen any
real-world cases of this?  Inline encryption hardware (both in storage
controllers and in NICs) is definitely a win, but it doesn't use the
crypto API at all.

It's easy to provide synthetic benchmarks where offload is a win, but
synthetic benchmarks don't justify a giant CVE magnet.  If offload is a
win in the real world, then it should be possible to demonstrate this.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)


* Re: AF_ALG hardening
  2026-05-02  4:52                     ` AF_ALG hardening Demi Marie Obenour
  2026-05-02  8:19                       ` Simon Richter
@ 2026-05-02 19:16                       ` Eric Biggers
  2026-05-04 19:01                         ` Simon Richter
  1 sibling, 1 reply; 10+ messages in thread
From: Eric Biggers @ 2026-05-02 19:16 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Jan Schaumann, iwd, Linux kernel mailing list, linux-crypto,
	Herbert Xu

On Sat, May 02, 2026 at 12:52:57AM -0400, Demi Marie Obenour wrote:
> > Either way, the first step will be to create the algorithm allowlist,
> > which should happen anyway, regardless of the other changes.
> 
> The simplest changes I can see are:
> 
> 1. Get rid of zero-copy support (splice()).
> 2. Get rid of AIO support.
> 3. Only allow software implementations.
> 
> All of these are really simple.  I can send patches, but be warned
> that they would only be compile-tested, as I don't know how to test
> the code.

If you're interested, please send patches, and we'll see where things go
from there.  We need to get more people helping with this stuff.

For (1), it probably should work like the way the zero-copy support was
disabled in the 6.1 LTS kernel last year, where (I think) the splice()
syscall still succeeds but it just copies the data.

For (2) and (3), you can find examples of disabling asynchronous crypto
API stuff at
https://lore.kernel.org/linux-fscrypt/20250704070322.20692-1-ebiggers@kernel.org/
and
https://lore.kernel.org/linux-fscrypt/20250708181313.66961-1-ebiggers@kernel.org/.
Note that to request a synchronous algorithm you have to pass
CRYPTO_ALG_ASYNC (yes, really).
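
The reason this looks backwards is the type/mask convention: the mask
says which flag bits matter, and the type says what value they must
have.  A simplified Python model of that test, assuming the lookup
compares flags as `(cra_flags ^ type) & mask` and that CRYPTO_ALG_ASYNC
is 0x80 as in include/linux/crypto.h; `alg_matches` is a hypothetical
name for the sketch, not a kernel function.

```python
CRYPTO_ALG_ASYNC = 0x00000080  # flag bit carried by asynchronous implementations


def alg_matches(cra_flags, type_, mask):
    """Bits selected by mask must be equal in cra_flags and type_."""
    return ((cra_flags ^ type_) & mask) == 0


sync_impl = 0                   # flags of a synchronous implementation
async_impl = CRYPTO_ALG_ASYNC   # flags of an asynchronous one

# type=0, mask=CRYPTO_ALG_ASYNC: the ASYNC bit is checked and must be clear.
sync_only_accepts_sync = alg_matches(sync_impl, 0, CRYPTO_ALG_ASYNC)
sync_only_accepts_async = alg_matches(async_impl, 0, CRYPTO_ALG_ASYNC)

# type=0, mask=0: the ASYNC bit is ignored, so either kind can be returned.
any_accepts_async = alg_matches(async_impl, 0, 0)
```

So passing CRYPTO_ALG_ASYNC in the mask with a type of 0 reads as "the
ASYNC bit must be zero", i.e. synchronous only.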

I think there are a few test scripts for AF_ALG in libkcapi.  Besides
that AF_ALG is barely tested.  So you're in good company.

- Eric

* Re: AF_ALG hardening
  2026-05-02 19:16                       ` Eric Biggers
@ 2026-05-04 19:01                         ` Simon Richter
  2026-05-04 19:54                           ` Eric Biggers
  0 siblings, 1 reply; 10+ messages in thread
From: Simon Richter @ 2026-05-04 19:01 UTC (permalink / raw)
  To: Eric Biggers, Demi Marie Obenour
  Cc: Jan Schaumann, iwd, Linux kernel mailing list, linux-crypto,
	Herbert Xu

Hi,

On 5/3/26 04:16, Eric Biggers wrote:

> On Sat, May 02, 2026 at 12:52:57AM -0400, Demi Marie Obenour wrote:

>> The simplest changes I can see are:

>> 1. Get rid of zero-copy support (splice()).
>> 2. Get rid of AIO support.
>> 3. Only allow software implementations.

> For (2) and (3), you can find examples of disabling asynchronous crypto

I think we need to make up our minds here.

This thread is about removing asynchronous implementations and 
accelerator support from AF_ALG, so it can support legacy applications 
with known-good implementations, while the other thread[1] is about 
removing everything *but* accelerator support from AF_ALG -- and as 
accelerators are typically asynchronous, this aspect has to stay as well.

At least with the opposite proposals, it would be good to know which one 
is official policy.

At the same time, the third thread[2] deprecates AF_ALG because of its 
wonky security posture, while newer accelerators are implementing their 
own userspace interfaces because AF_ALG is too limited, so we're already 
replacing one CVE magnet with several independent ones, and deprecating 
AF_ALG means that future drivers will add even more of those because 
there is no longer a common framework to attach to.

Also, if AF_ALG is deprecated and the kernel no longer uses 
ahash/acrypt/acomp internally, there is no point in accelerator cards 
even registering with the crypto subsystem. Should that be an explicit 
policy "accelerator cards are outside the scope of the crypto subsystem, 
even if they implement a cryptographic algorithm"?

    Simon

[1] 
https://lore.kernel.org/linux-crypto/112bf0af-1551-4d3e-ab15-e5dea3fc2435@app.fastmail.com/

[2] 
https://lore.kernel.org/linux-crypto/20260430011544.31823-1-ebiggers@kernel.org/


* Re: AF_ALG hardening
  2026-05-04 19:01                         ` Simon Richter
@ 2026-05-04 19:54                           ` Eric Biggers
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Biggers @ 2026-05-04 19:54 UTC (permalink / raw)
  To: Simon Richter
  Cc: Demi Marie Obenour, Jan Schaumann, iwd, Linux kernel mailing list,
	linux-crypto, Herbert Xu

On Tue, May 05, 2026 at 04:01:47AM +0900, Simon Richter wrote:
> Hi,
> 
> On 5/3/26 04:16, Eric Biggers wrote:
> 
> > On Sat, May 02, 2026 at 12:52:57AM -0400, Demi Marie Obenour wrote:
> 
> > > The simplest changes I can see are:
> 
> > > 1. Get rid of zero-copy support (splice()).
> > > 2. Get rid of AIO support.
> > > 3. Only allow software implementations.
> 
> > For (2) and (3), you can find examples of disabling asynchronous crypto
> 
> I think we need to make up our minds here.
> 
> This thread is about removing asynchronous implementations and accelerator
> support from AF_ALG, so it can support legacy applications with known-good
> implementations, while the other thread[1] is about removing everything
> *but* accelerator support from AF_ALG -- and as accelerators are typically
> asynchronous, this aspect has to stay as well.
> 

Thread [1] is a patch that removes the kernel's last
architecture-optimized implementation of MD5, which is a broken and
deprecated algorithm anyway.  So it's not just AF_ALG that's motivating
that particular patch, but also a desire to focus effort on modern
algorithms and keep the different Linux architectures consistent.  So I
think the scope of that thread is more narrow than what you're claiming.

Also, it's already been established that for now AF_ALG will have to
keep the software code used by a small set of userspace programs such as
iwd.  So no, it cannot be completely removed yet (except on systems that
don't use any of these programs, where it can be already).  However,
that doesn't mean that we shouldn't be nudging people towards better
solutions, with an eye towards future attack surface reductions.

> At least with the opposite proposals, it would be good to know which one is
> official policy.
> 
> At the same time, the third thread[2] deprecates AF_ALG because of its wonky
> security posture, while newer accelerators are implementing their own
> userspace interfaces because AF_ALG is too limited, so we're already
> replacing one CVE magnet with several independent ones, and deprecating
> AF_ALG means that future drivers will add even more of those because there
> is no longer a common framework to attach to.

It's long been clear that by far the best way to accelerate symmetric
crypto is to just put it in the CPU, or in-line in the storage or
network controller.  Indeed, that's what almost everyone does now.

So I would expect the demand for this kind of interface to symmetric
crypto to continue to decline, as it already has been for a long time.
And as you pointed out, AF_ALG doesn't work well for it anyway, which
makes AF_ALG increasingly kind of beside the point.

> Also, if AF_ALG is deprecated and the kernel no longer uses
> ahash/acrypt/acomp internally, there is no point in accelerator cards even
> registering with the crypto subsystem. Should that be an explicit policy
> "accelerator cards are outside the scope of the crypto subsystem, even if
> they implement a cryptographic algorithm"?

There are still some in-kernel users of the asynchronous crypto APIs,
for example IPsec and dm-crypt.  So I think your prediction of the
demise of these APIs is a bit premature as well.  But yes, at least for
the symmetric crypto, kernel subsystems have been repeatedly seeing
that the async support just isn't worth it.  We'll see more kernel
subsystems switching to sync-only.  But in practice this is a gradual
transition.

Anyway, I don't think I'm proposing conflicting things.  We can and
should document a general deprecation of AF_ALG, while also helping
update userspace programs to no longer use it, while also applying
various hardening measures to reduce AF_ALG's attack surface as best we
can in the meantime.  There are multiple independent hardening measures
that could be applied, and they will be up for discussion on the
individual patches that implement them.

- Eric

[parent not found: <20260502035402.GB3872267@google.com>]

* Re: [oss-security] CVE-2026-31431: CopyFail: linux local privilege scalation
       [not found]                     ` <20260502035402.GB3872267@google.com>
@ 2026-05-02  6:39                       ` Demi Marie Obenour
       [not found]                         ` <CAM=PXV4q2i13W8Z_AZGDfdxbqWANJ=U4Sw3FTcv5mH_QUrrSfA@mail.gmail.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Demi Marie Obenour @ 2026-05-02  6:39 UTC (permalink / raw)
  To: Eric Biggers
  Cc: oss-security, Jan Schaumann, iwd, Linux kernel mailing list,
	Linux kernel mailing list

On 5/1/26 23:54, Eric Biggers wrote:
> On Sat, May 02, 2026 at 03:35:58AM +0000, Eric Biggers wrote:
>> So the idea would be something along the lines of:
> 
> And just to make sure no one gets the wrong impression: just because
> there seem to be ways in which the attack surface of AF_ALG could/should
> be reduced doesn't mean that userspace should keep using it (or even
> worse, start to use it).  Fixing programs like iwd needs to proceed
> concurrently, so that eventually (some years down the line) the problem
> can finally be fully solved by removing AF_ALG from the kernel source.
> 
> - Eric

Can AF_ALG be emulated using LD_PRELOAD?  That would allow it to be
eliminated from the kernel much more quickly, as one would not need
to get rid of all of its existing users.  It would even work for those
who need AF_ALG because of closed source binaries, who otherwise will
have no alternative other than running an old kernel in a VM.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)


[parent not found: <CAM=PXV4q2i13W8Z_AZGDfdxbqWANJ=U4Sw3FTcv5mH_QUrrSfA@mail.gmail.com>]

[parent not found: <afcqxCv58YrhbtVr@definition.pseudorandom.co.uk>]

* Re: [oss-security] CVE-2026-31431: CopyFail: linux local privilege scalation
       [not found]                           ` <afcqxCv58YrhbtVr@definition.pseudorandom.co.uk>
@ 2026-05-03 19:20                             ` Greg Dahlman
  0 siblings, 0 replies; 10+ messages in thread
From: Greg Dahlman @ 2026-05-03 19:20 UTC (permalink / raw)
  To: oss-security, linux-crypto, Linux kernel mailing list

Note: re-adding the other lists so that people have an opportunity to
correct my errors.

"CAP_FOO in the init namespace" doesn't matter if "CAP_FOO" is the
gate in the default namespace.  Namespaces are a facade pattern, not an
isolation mechanism: Unix abstract sockets, af_inet, vsock, af_alg, etc.
do not currently use credentials at all, IIRC.

LD_PRELOAD, as a way to (transparently) replace this functionality
without user intervention, involves putting in an interposer to
directly intercept all socket() calls at system scale in this case,
when it is typically a thread-scope concept.

I think socket() hasn't been interposable for at least a decade (in
glibc), so you will weaken overall security by reintroducing the PLT or...
Note that many people want to avoid adding an `/etc/ld.so.preload`,
because fighting dynamic linker hijacking is not easy: Unix-like
systems have zero security boundary between the parent and child
process.

The bigger problem is that embedded users are not where most of
the friction is going to come from.  While the motivations are similar,
FIPS 140-3 validation, and downstream vendors that used distros'
validations and incorporated them into regulatory, compliance, and
governance obligations, make up a large unidentified user base.

Searching for "Kernel Crypto API" in the Module name on this site will
show some of the upstream validations.

   https://csrc.nist.gov/projects/cryptographic-module-validation-program/validated-modules/search

In the case of non-path-backed sockets, userns provides zero
protection and only adds to the attack surface.  The only credential
use for non-path-backed sockets currently is the restriction of ports
below 1024 on af_inet.

Remember that namespace support is not implicit: all af_family calls
outside of the specific families that have namespace support stay in
the default namespace.

If you dig through the $distro openssl security documents from the
vendors at the NIST link above, you will see why people liked the
contract that af_alg offered: they were depending on the kernel
team's stable API and reputation, and they could simplify their
compliance because it is easier to ensure no openssl installs exist at
all on a system than to try and maintain compliance, governance, and
regulatory obligations.

There are some use cases, like firmware images on some embedded
systems, where having the DMA pipe into a crypto engine avoided von
Neumann bottleneck issues, CPU usage, etc.  But no matter how flawed it
is to use af_alg, it provided a simple zero dependency interface that
their tools already supported (socket) and reduced their lifecycle
costs.  The reduced performance of the hw crypto engines for smaller
data sizes was acceptable as a trade-off, not as the primary driver in
many cases.
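
To illustrate the "zero dependency" point, here is a minimal sketch of
hashing through af_alg using nothing but socket calls.  The helper name
afalg_sha256 is hypothetical, and the sketch assumes a kernel with the
af_alg hash interface and a "sha256" implementation available; it
reports failure rather than falling back to anything else.

```c
#include <linux/if_alg.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef AF_ALG
#define AF_ALG 38   /* value used by Linux; only needed with old headers */
#endif

/* Hash one small buffer with SHA-256 via AF_ALG.
   Returns 0 on success, -1 if AF_ALG or the algorithm is unavailable
   or the I/O fails.  afalg_sha256 is a hypothetical helper name. */
int afalg_sha256(const void *data, size_t len, unsigned char digest[32])
{
    struct sockaddr_alg sa;
    int tfm, op, rc = -1;

    memset(&sa, 0, sizeof(sa));
    sa.salg_family = AF_ALG;
    strcpy((char *)sa.salg_type, "hash");     /* algorithm class */
    strcpy((char *)sa.salg_name, "sha256");   /* algorithm name  */

    /* The "transform" socket selects the algorithm via bind(). */
    tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfm < 0)
        return -1;

    if (bind(tfm, (struct sockaddr *)&sa, sizeof(sa)) == 0) {
        /* accept() yields the "operation" socket that carries data. */
        op = accept(tfm, NULL, NULL);
        if (op >= 0) {
            /* Send the message, then read back the 32-byte digest. */
            if (send(op, data, len, 0) == (ssize_t)len &&
                read(op, digest, 32) == 32)
                rc = 0;
            close(op);
        }
    }
    close(tfm);
    return rc;
}
```

A caller would check the return value, e.g.
`if (afalg_sha256(buf, n, out) != 0) { /* af_alg not usable here */ }`.
Large inputs would need a loop with MSG_MORE, which this sketch omits.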

I should be 100% clear, namespaces _are not_ a security feature, but
they can be leveraged to lower privileges and improve a security
posture.  But when you have interfaces like sockets (non unix like)
the main advantage of network namespaces is they allow you to
constrain something that due to historical reasons has almost zero
controls (except tcp ports < 1024).

But the default is for any subsystem, new, legacy, or otherwise, to
live only in the default namespace.  The friction arises when ~4 of
the 40+ af_families are namespaced but the rest are not.

There is a very real problem with people overestimating the isolation
capabilities of namespaces in general, but paying attention to the
official documentation may help here:

https://www.kernel.org/doc/html/latest/admin-guide/namespaces/compatibility-list.html

     The same is true for the IPC namespaces being shared - two users
     from different user namespaces should not access the same IPC
     objects even having equal UIDs.  But currently this is not so.

The "should not access" is a very different contract than most people expect.

The FIPS/ISO compliance issue mostly invalidates what I hoped was an
easy fix, and putting in a kernel call interposer via LD_PRELOAD will
still add friction that is likely to block the aspirations of
removing af_alg from the kernel.  I think that there is a path to do
so, and I think it would be best in the long run.  But the friction
here is not just from code changes, which are far easier to accomplish
than resolving the regulatory issues.

The compliance based user base is one that is often far more challenging.

I do still think that both userland and kernel would benefit from some
mechanism that would make it easier for security teams, admins, and
users to run with lower privileges.  IMHO thinking about enabling that
control will also be critical to the kernel team's ability to remain
effective.  Different use cases will always conflict, and
non-namespace users would also benefit from ways to restrict access to
af_families.
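
One mechanism that exists today for this kind of restriction is a
seccomp-BPF filter, although it has to be installed per process tree
rather than set as a system policy.  As a rough sketch only: the helper
name deny_af_alg is hypothetical, the filter assumes a little-endian
machine, and it deliberately omits the architecture check and the
socketcall() handling that a production filter would need.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/syscall.h>

#ifndef AF_ALG
#define AF_ALG 38   /* value used by Linux; only needed with old headers */
#endif

/* Make socket(AF_ALG, ...) fail with EAFNOSUPPORT for this process and
   its future children.  Returns 0 on success, -1 if the filter could
   not be installed.  deny_af_alg is a hypothetical helper name.
   Assumptions: little-endian (low 32 bits of args[0] at offset 0), no
   seccomp_data.arch check, no multiplexed socketcall() handling. */
int deny_af_alg(void)
{
    struct sock_filter filter[] = {
        /* [0] A = syscall number */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr)),
        /* [1] not socket()? skip 3 instructions to ALLOW at [5] */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_socket, 0, 3),
        /* [2] A = first argument (address family) */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, args[0])),
        /* [3] family != AF_ALG? skip 1 instruction to ALLOW at [5] */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AF_ALG, 0, 1),
        /* [4] fail the call with EAFNOSUPPORT */
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (EAFNOSUPPORT & SECCOMP_RET_DATA)),
        /* [5] allow everything else */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
        return -1;
    return 0;
}
```

A service manager or container runtime could apply an equivalent filter
before exec; the point is only that the control is opt-in per process,
which is part of why a credential-based default would still be useful.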

IMHO if the team thinks af_alg is unfixable, it is maybe one of the
rare cases where breaking changes are necessary.  It may be more
productive to help compliance based users migrate than provide a
brittle shim that still invalidates all their authorizations anyway.

I am not an expert on FIPS/ISO compliance, but I do know that
providing guidance that helps users migrate would go a long way.  You
could, say, have a userland process that provides a socket-like
interface, with guidance on how to wrap or create a their_socket() to
migrate.

I still think that for non af_inet/unix (file backed) socket af
families, there needs to be a credentials mechanism.  People are
building systems on top of vsock and other non unix/if based systems
that are just as vulnerable.  Like af_alg, vsock is known to have
serious issues and was designed for a trusted environment.  Without an
effective way to limit exposure from either userland or the kernel,
there is enough that is simply unexplored that it will be expensive.

On Sun, May 3, 2026 at 5:00 AM Simon McVittie <smcv@debian.org> wrote:
>
> On Sat, 02 May 2026 at 14:21:57 -0600, Greg Dahlman wrote:
> >LD_PRELOAD and capabilities
>
> These seem orthogonal, rather than being part of the same idea.
>
> LD_PRELOAD is discretionary (cooperative) so it would only be useful if
> used in a design something like this:
>
> - at the kernel level, AF_ALG just doesn't work (fails with a
>    permission-related error), at least for unprivileged processes
> - but in user-space, an opt-in LD_PRELOAD module intercepts the socket(),
>    etc. calls for AF_ALG, and emulates the behaviour of current kernels
>    by calling into a user-space crypto library
>
> It can't be a security boundary, but it can be a mitigation for the
> regressions that a new security boundary (or complete feature removal)
> would otherwise cause, similar to the way LD_PRELOADs like aoss and
> padsp mitigated the regressions for older binaries when distro kernels
> disabled OSS audio.
>
> Meanwhile capabilities are a way to let trusted, privileged processes
> have access to things that unprivileged processes do not, for example
> making AF_ALG available to a few system services that need it but not
> available to all of user-space.
>
> >You should expect any UID (even nobody) to be able to gain the
> >privileges in their bounding set
>
> The kernel can distinguish between "CAP_FOO in the init namespace" and
> "CAP_FOO in any other userns" if it wants to, and some kernel features
> are already gated by having a capability in the init namespace
> specifically. For example CAP_SYS_ADMIN in the init namespace allows
> mounting block-device-backed filesystems like ext4, but CAP_SYS_ADMIN in
> a different userns only allows a few "safe" mount operations
> (bind-mounts, overlayfs, FUSE).
>
>      smcv


[parent not found: <cfe5a1f5-f7fe-44a5-8af9-8e4c8d68b3d7@terraraq.uk>]

* Re: [oss-security] CVE-2026-31431: CopyFail: linux local privilege scalation
       [not found]           ` <cfe5a1f5-f7fe-44a5-8af9-8e4c8d68b3d7@terraraq.uk>
@ 2026-05-02 22:32             ` Demi Marie Obenour
  2026-05-03  6:30               ` Peter Gutmann
  0 siblings, 1 reply; 10+ messages in thread
From: Demi Marie Obenour @ 2026-05-02 22:32 UTC (permalink / raw)
  To: oss-security, Richard Kettlewell, Eric Biggers,
	Linux kernel mailing list, linux-crypto


On 5/2/26 15:13, Richard Kettlewell wrote:
> On 01/05/2026 16:30, Demi Marie Obenour wrote:
>> On 4/30/26 03:19, Eric Biggers wrote:
>>> But I also hope this finally provides some more impetus for AF_ALG to be
>>> deprecated and removed.  It's a massive, largely pointless attack
>>> surface which has been causing problems, including regular CVEs, ever
>>> since it was added to the kernel in 2010.  And of course it's gotten
>>> even worse lately, with LLMs now being able to find the bugs.
>>>
>>> Userspace crypto libraries exist.  There's no need to escalate to kernel
>>> mode just to do some math.
>>
>> The only reason I can think of to keep it is for embedded systems
>> with weak CPUs and crypto accelerators that are actually worth using.
>> However, those seem to be very rare outside of things like routers,
>> which run specialized distros like OpenWRT.  Even when the accelerator
>> exists and is worth using, AF_ALG is certainly not an efficient way
>> to access it.
> 
> I have that use case, although fortunately it's in a context where 
> splice() is disabled. But the requirement is for access to the SoC's 
> accelerator - the interface doesn't need to be via AF_ALG in particular, 
> it doesn't have to offer software crypto (and it might be better if it 
> didn't), and it needn't be independent of the specific hardware 
> (although in the bigger picture it'd be a shame if it wasn't).
> 
> ttfn/rjk

Can you provide benchmarks showing that the accelerator is faster
than the CPU on realistic workloads?
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)


* Re: [oss-security] CVE-2026-31431: CopyFail: linux local privilege scalation
  2026-05-02 22:32             ` Demi Marie Obenour
@ 2026-05-03  6:30               ` Peter Gutmann
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Gutmann @ 2026-05-03  6:30 UTC (permalink / raw)
  To: oss-security@lists.openwall.com, Richard Kettlewell, Eric Biggers,
	Linux kernel mailing list, linux-crypto@vger.kernel.org

Demi Marie Obenour writes:

>Can you provide benchmarks showing that the accelerator is faster than the
>CPU on realistic workloads?

That could be tricky.  The accelerator uses more hardware crypto than the CPU
on realistic workloads, would that do?

The following is from playing around on a few bits of hardware that were to
hand some years ago, so don't take it as gospel, but:

/* Check for the presence of crypto hardware support.  This is something of
   an exercise in futility because the crypto hardware is anything from
   slightly slower (large data blocks) to much, much slower (more standard
   small data blocks) than software due to the overhead of getting the data
   through the API to and from the cryptologic, the cryptologic startup/
   shutdown overhead, and in the case of /dev/crypto, in and out of the
   kernel.  The only place where it does matter is things like Cortex M3-
   level SoCs, so a combination of lower-power CPUs, no instruction-level
   assist for crypto, and direct hardware access from the RTOS with no
   overhead where you just point the cryptologic at a block of memory and
   say "process this".

   However, people really want to see the fancy crypto hardware used even if
   it yields a net loss in performance so we try and enable it if possible
   unless it really is pointless, just a software emulation (many
   /dev/crypto instances) where, assuming the crypto is provided by OpenSSL,
   you can end up in a situation where OpenSSL is calling into a kernel
   interface that then provides access to another, older and possibly
   unpatched, copy of OpenSSL code that's doing the crypto.

   [...]

So one solution would be to get the fingers-of-one-hand applications still
using the interface off it onto user-mode software-only and then make it
kernel-only, closing the door on the entire attack surface from user space.

Peter.


end of thread, other threads:[~2026-05-04 19:56 UTC | newest]

Thread overview: 10+ messages
     [not found] <afJorKIje4O6dXbH@netmeister.org>
     [not found] ` <d6111caa-db61-498a-92cb-ea7a0aa0a5e2@ehuk.net>
     [not found]   ` <87se8dgicq.fsf@gentoo.org>
     [not found]     ` <afL-QhLfEKqHZqka@eldamar.lan>
     [not found]       ` <20260430071917.GB54208@sol>
     [not found]         ` <177abb5d-8ba9-4bb9-8b23-9fbc868ed3cd@gmail.com>
     [not found]           ` <20260501180028.GA2260@sol>
     [not found]             ` <19837ef5-e5b6-45f4-8336-3ce07423dfb1@gmail.com>
     [not found]               ` <20260501201841.GA2540@quark>
     [not found]                 ` <c13dd3c5-ddc1-431e-bc7d-2de39c551f8e@gmail.com>
     [not found]                   ` <20260502033556.GA3872267@google.com>
2026-05-02  4:52                     ` AF_ALG hardening Demi Marie Obenour
2026-05-02  8:19                       ` Simon Richter
2026-05-02 20:42                         ` Demi Marie Obenour
2026-05-02 19:16                       ` Eric Biggers
2026-05-04 19:01                         ` Simon Richter
2026-05-04 19:54                           ` Eric Biggers
     [not found]                     ` <20260502035402.GB3872267@google.com>
2026-05-02  6:39                       ` [oss-security] CVE-2026-31431: CopyFail: linux local privilege scalation Demi Marie Obenour
     [not found]                         ` <CAM=PXV4q2i13W8Z_AZGDfdxbqWANJ=U4Sw3FTcv5mH_QUrrSfA@mail.gmail.com>
     [not found]                           ` <afcqxCv58YrhbtVr@definition.pseudorandom.co.uk>
2026-05-03 19:20                             ` Greg Dahlman
     [not found]           ` <cfe5a1f5-f7fe-44a5-8af9-8e4c8d68b3d7@terraraq.uk>
2026-05-02 22:32             ` Demi Marie Obenour
2026-05-03  6:30               ` Peter Gutmann
