* [PATCH net 0/3] rxrpc: Better fix for DATA/RESPONSE decrypt vs splice()
@ 2026-05-11 16:07 David Howells
2026-05-11 16:07 ` [PATCH net 1/3] rxrpc: Also unshare DATA/RESPONSE packets when paged frags are present David Howells
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: David Howells @ 2026-05-11 16:07 UTC (permalink / raw)
To: netdev
Cc: David Howells, Hyunwoo Kim, Marc Dionne, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
linux-afs, linux-kernel
Here are two patches containing better fixes for the in-place decryption of
DATA and RESPONSE packets, which can corrupt pagecache pages that were
spliced into UDP packets and sent to an AF_RXRPC server [CVE-2026-43500].
[!] Note that Hyunwoo Kim's fix is included as that is a prerequisite for
the main patches to build. This is in Linus's tree, but not yet net/main.
One patch fixes DATA decryption by having recvmsg unconditionally extract
the data into a flat bounce buffer and, if need be, decrypt it there. It
doesn't seem to cause a performance problem to do this even on unencrypted
packets; for encrypted packets, it makes sure the content is correctly
aligned for the crypto algorithms, which seems to yield a small performance
gain.
Further, it means that DATA packets are no longer copied in the I/O thread,
avoiding a slowdown of the protocol engine that runs there.
The other patch fixes RESPONSE decryption by having the connection event
handler worker copy the data to a flat buffer and, again, decrypt it there.
This simplifies RESPONSE handling.
With these two fixes, the data content of the received sk_buff no longer
gets altered.
David
The patches can also be found here:
http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-fixes
David Howells (2):
rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in
recvmsg
rxrpc: Fix RESPONSE packet verification to extract skb to a linear
buffer
Hyunwoo Kim (1):
rxrpc: Also unshare DATA/RESPONSE packets when paged frags are present
net/rxrpc/ar-internal.h | 77 ++++++++++++++++--
net/rxrpc/call_event.c | 20 +----
net/rxrpc/call_object.c | 2 +
net/rxrpc/conn_event.c | 32 ++++----
net/rxrpc/insecure.c | 8 +-
net/rxrpc/protocol.h | 1 -
net/rxrpc/recvmsg.c | 72 +++++++++++++----
net/rxrpc/rxgk.c | 175 ++++++++++++++--------------------------
net/rxrpc/rxgk_app.c | 91 +++++++++------------
net/rxrpc/rxgk_common.h | 76 ++++++++---------
net/rxrpc/rxkad.c | 175 +++++++++++++++-------------------------
11 files changed, 349 insertions(+), 380 deletions(-)
* [PATCH net 1/3] rxrpc: Also unshare DATA/RESPONSE packets when paged frags are present
2026-05-11 16:07 [PATCH net 0/3] rxrpc: Better fix for DATA/RESPONSE decrypt vs splice() David Howells
@ 2026-05-11 16:07 ` David Howells
2026-05-11 16:07 ` [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg David Howells
2026-05-11 16:07 ` [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer David Howells
2 siblings, 0 replies; 15+ messages in thread
From: David Howells @ 2026-05-11 16:07 UTC (permalink / raw)
To: netdev
Cc: David Howells, Hyunwoo Kim, Marc Dionne, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
linux-afs, linux-kernel, stable
From: Hyunwoo Kim <imv4bel@gmail.com>
The DATA-packet handler in rxrpc_input_call_event() and the RESPONSE
handler in rxrpc_verify_response() copy the skb to a linear one before
calling into the security ops only when skb_cloned() is true. An skb
that is not cloned but still carries externally-owned paged fragments
(e.g. SKBFL_SHARED_FRAG set by splice() into a UDP socket via
__ip_append_data, or a chained skb_has_frag_list()) falls through to
the in-place decryption path, which binds the frag pages directly into
the AEAD/skcipher SGL via skb_to_sgvec().
Extend the gate to also unshare when skb_has_frag_list() or
skb_has_shared_frag() is true. This catches the splice-loopback vector
and other externally-shared frag sources while preserving the
zero-copy fast path for skbs whose frags are kernel-private (e.g. NIC
page_pool RX, GRO). The OOM/trace handling already in place is reused.
Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
Cc: stable@vger.kernel.org
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
net/rxrpc/call_event.c | 4 +++-
net/rxrpc/conn_event.c | 3 ++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
index fdd683261226..2b19b252225e 100644
--- a/net/rxrpc/call_event.c
+++ b/net/rxrpc/call_event.c
@@ -334,7 +334,9 @@ bool rxrpc_input_call_event(struct rxrpc_call *call)
if (sp->hdr.type == RXRPC_PACKET_TYPE_DATA &&
sp->hdr.securityIndex != 0 &&
- skb_cloned(skb)) {
+ (skb_cloned(skb) ||
+ skb_has_frag_list(skb) ||
+ skb_has_shared_frag(skb))) {
/* Unshare the packet so that it can be
* modified by in-place decryption.
*/
diff --git a/net/rxrpc/conn_event.c b/net/rxrpc/conn_event.c
index a2130d25aaa9..442414d90ba1 100644
--- a/net/rxrpc/conn_event.c
+++ b/net/rxrpc/conn_event.c
@@ -245,7 +245,8 @@ static int rxrpc_verify_response(struct rxrpc_connection *conn,
{
int ret;
- if (skb_cloned(skb)) {
+ if (skb_cloned(skb) || skb_has_frag_list(skb) ||
+ skb_has_shared_frag(skb)) {
/* Copy the packet if shared so that we can do in-place
* decryption.
*/
* [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-11 16:07 [PATCH net 0/3] rxrpc: Better fix for DATA/RESPONSE decrypt vs splice() David Howells
2026-05-11 16:07 ` [PATCH net 1/3] rxrpc: Also unshare DATA/RESPONSE packets when paged frags are present David Howells
@ 2026-05-11 16:07 ` David Howells
2026-05-12 7:58 ` Jeffrey Altman
2026-05-12 13:38 ` David Laight
2026-05-11 16:07 ` [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer David Howells
2 siblings, 2 replies; 15+ messages in thread
From: David Howells @ 2026-05-11 16:07 UTC (permalink / raw)
To: netdev
Cc: David Howells, Hyunwoo Kim, Marc Dionne, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
linux-afs, linux-kernel, Jeffrey Altman, Jiayuan Chen, stable
This improves the fix for CVE-2026-43500.
Fix the pagecache corruption from in-place decryption of a DATA packet
transmitted locally by splice() by getting rid of the packet sharing in the
I/O thread and unconditionally extracting the packet content into a bounce
buffer, in which the data is then decrypted. recvmsg() (or the kernel
equivalent) then copies the data from the bounce buffer to the destination
buffer. The sk_buff then remains unmodified.
This has an additional advantage in that the packet data is then arranged
in the buffer with the correct alignment required for the crypto algorithms
to process it directly.  Crypto performance does seem to be a little faster
and, surprisingly, unencrypted performance doesn't seem to change much,
possibly because complexity is removed from the I/O thread.
Yet another advantage is that the I/O thread doesn't have to copy packets,
which would slow down packet distribution, ACK generation, etc.
The buffer belongs to the call and is allocated initially at 2K,
sufficiently large to hold a whole jumbo subpacket, but the buffer will be
increased in size if needed.  There is one downside: if a MSG_PEEK of more
than one byte occurs, it may move on to the next packet, replacing the
content of the buffer.  In such a case, it has to go back and re-decrypt
the current packet on the next recvmsg() call.
Note that rx_pkt_offset may legitimately see 0 as a valid offset now, so
switch to using USHRT_MAX to indicate an invalid offset.
Note also that I would generally prefer to replace the buffers of the
current sk_buff with a new kmalloc'd buffer of the right size, ditching the
old data and frags, as this would make the handling of MSG_PEEK easier and
remove the double-decryption issue, but that looks like quite a complicated
thing to achieve.  skb_morph() looks halfway to what I want, but I don't
want to have to allocate a new sk_buff.
Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Closes: https://lore.kernel.org/r/afKV2zGR6rrelPC7@v4bel/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: Jiayuan Chen <jiayuan.chen@linux.dev>
cc: netdev@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: stable@vger.kernel.org
---
net/rxrpc/ar-internal.h | 7 +++-
net/rxrpc/call_event.c | 22 +----------
net/rxrpc/call_object.c | 2 +
net/rxrpc/insecure.c | 3 --
net/rxrpc/recvmsg.c | 72 +++++++++++++++++++++++++++-------
net/rxrpc/rxgk.c | 49 +++++++++++------------
net/rxrpc/rxgk_common.h | 79 +++++++++++++++++++++++++++++++++++++
net/rxrpc/rxkad.c | 86 +++++++++++++++--------------------------
8 files changed, 200 insertions(+), 120 deletions(-)
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 27c2aa2dd023..783367eea798 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -213,8 +213,6 @@ struct rxrpc_skb_priv {
struct {
u16 offset; /* Offset of data */
u16 len; /* Length of data */
- u8 flags;
-#define RXRPC_RX_VERIFIED 0x01
};
struct {
rxrpc_seq_t first_ack; /* First packet in acks table */
@@ -774,6 +772,11 @@ struct rxrpc_call {
struct sk_buff_head recvmsg_queue; /* Queue of packets ready for recvmsg() */
struct sk_buff_head rx_queue; /* Queue of packets for this call to receive */
struct sk_buff_head rx_oos_queue; /* Queue of out of sequence packets */
+ void *rx_dec_buffer; /* Decryption buffer */
+ unsigned short rx_dec_bsize; /* rx_dec_buffer size */
+ unsigned short rx_dec_offset; /* Decrypted packet data offset */
+ unsigned short rx_dec_len; /* Decrypted packet data len */
+ rxrpc_seq_t rx_dec_seq; /* Packet in decryption buffer */
rxrpc_seq_t rx_highest_seq; /* Highest sequence number received */
rxrpc_seq_t rx_consumed; /* Highest packet consumed */
diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
index 2b19b252225e..fec59d9338b9 100644
--- a/net/rxrpc/call_event.c
+++ b/net/rxrpc/call_event.c
@@ -332,27 +332,7 @@ bool rxrpc_input_call_event(struct rxrpc_call *call)
saw_ack |= sp->hdr.type == RXRPC_PACKET_TYPE_ACK;
- if (sp->hdr.type == RXRPC_PACKET_TYPE_DATA &&
- sp->hdr.securityIndex != 0 &&
- (skb_cloned(skb) ||
- skb_has_frag_list(skb) ||
- skb_has_shared_frag(skb))) {
- /* Unshare the packet so that it can be
- * modified by in-place decryption.
- */
- struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC);
-
- if (nskb) {
- rxrpc_new_skb(nskb, rxrpc_skb_new_unshared);
- rxrpc_input_call_packet(call, nskb);
- rxrpc_free_skb(nskb, rxrpc_skb_put_call_rx);
- } else {
- /* OOM - Drop the packet. */
- rxrpc_see_skb(skb, rxrpc_skb_see_unshare_nomem);
- }
- } else {
- rxrpc_input_call_packet(call, skb);
- }
+ rxrpc_input_call_packet(call, skb);
rxrpc_free_skb(skb, rxrpc_skb_put_call_rx);
did_receive = true;
}
diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index f035f486c139..fcb9d38bb521 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -152,6 +152,7 @@ struct rxrpc_call *rxrpc_alloc_call(struct rxrpc_sock *rx, gfp_t gfp,
spin_lock_init(&call->notify_lock);
refcount_set(&call->ref, 1);
call->debug_id = debug_id;
+ call->rx_pkt_offset = USHRT_MAX;
call->tx_total_len = -1;
call->tx_jumbo_max = 1;
call->next_rx_timo = 20 * HZ;
@@ -553,6 +554,7 @@ static void rxrpc_cleanup_rx_buffers(struct rxrpc_call *call)
rxrpc_purge_queue(&call->recvmsg_queue);
rxrpc_purge_queue(&call->rx_queue);
rxrpc_purge_queue(&call->rx_oos_queue);
+ kfree(call->rx_dec_buffer);
}
/*
diff --git a/net/rxrpc/insecure.c b/net/rxrpc/insecure.c
index 0a260df45d25..7a26c6097d03 100644
--- a/net/rxrpc/insecure.c
+++ b/net/rxrpc/insecure.c
@@ -32,9 +32,6 @@ static int none_secure_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
static int none_verify_packet(struct rxrpc_call *call, struct sk_buff *skb)
{
- struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
-
- sp->flags |= RXRPC_RX_VERIFIED;
return 0;
}
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index e1f7513a46db..865e368381d5 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -147,15 +147,55 @@ static void rxrpc_rotate_rx_window(struct rxrpc_call *call)
}
/*
- * Decrypt and verify a DATA packet.
+ * Decrypt and verify a DATA packet. The content of the packet is pulled out
+ * into a flat buffer rather than decrypting in place in the skbuff. This also
+ * has the advantage of aligning the buffer correctly for the crypto routines.
+ *
+ * We keep track of the sequence number of the packet currently decrypted into
+ * the buffer in ->rx_dec_seq. Unfortunately, this means that a MSG_PEEK of
+ * more than one byte may cause a later packet to be decrypted into the buffer,
+ * requiring the original to be re-decrypted when recvmsg() is called again.
*/
static int rxrpc_verify_data(struct rxrpc_call *call, struct sk_buff *skb)
{
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
+ int ret;
- if (sp->flags & RXRPC_RX_VERIFIED)
+ if (call->rx_dec_seq == sp->hdr.seq && call->rx_dec_buffer)
return 0;
- return call->security->verify_packet(call, skb);
+
+ if (call->rx_dec_bsize < sp->len) {
+ /* Make sure we can hold a 1412-byte jumbo subpacket and make
+ * sure that the buffer size is aligned to a crypto blocksize.
+ */
+ size_t size = umax(round_up(sp->len, 32), 2048);
+ void *buffer = krealloc(call->rx_dec_buffer, size, GFP_NOFS);
+
+ if (!buffer)
+ return -ENOMEM;
+ call->rx_dec_buffer = buffer;
+ call->rx_dec_bsize = size;
+ }
+
+ ret = -EFAULT;
+ if (skb_copy_bits(skb, sp->offset, call->rx_dec_buffer, sp->len) < 0)
+ goto err;
+
+ call->rx_dec_offset = 0;
+ call->rx_dec_len = sp->len;
+ call->rx_dec_seq = sp->hdr.seq;
+ ret = call->security->verify_packet(call, skb);
+ if (ret < 0)
+ goto err;
+ return 0;
+
+err:
+ kfree(call->rx_dec_buffer);
+ call->rx_dec_buffer = NULL;
+ call->rx_dec_bsize = 0;
+ call->rx_dec_offset = 0;
+ call->rx_dec_len = 0;
+ return ret;
}
/*
@@ -283,16 +323,19 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call,
if (msg)
sock_recv_timestamp(msg, sock->sk, skb);
- if (rx_pkt_offset == 0) {
+ if (rx_pkt_offset == USHRT_MAX) {
ret2 = rxrpc_verify_data(call, skb);
trace_rxrpc_recvdata(call, rxrpc_recvmsg_next, seq,
- sp->offset, sp->len, ret2);
+ call->rx_dec_offset,
+ call->rx_dec_len, ret2);
if (ret2 < 0) {
ret = ret2;
goto out;
}
- rx_pkt_offset = sp->offset;
- rx_pkt_len = sp->len;
+ sp = rxrpc_skb(skb);
+ seq = sp->hdr.seq;
+ rx_pkt_offset = call->rx_dec_offset;
+ rx_pkt_len = call->rx_dec_len;
} else {
trace_rxrpc_recvdata(call, rxrpc_recvmsg_cont, seq,
rx_pkt_offset, rx_pkt_len, 0);
@@ -304,10 +347,10 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call,
if (copy > remain)
copy = remain;
if (copy > 0) {
- ret2 = skb_copy_datagram_iter(skb, rx_pkt_offset, iter,
- copy);
- if (ret2 < 0) {
- ret = ret2;
+ ret2 = copy_to_iter(call->rx_dec_buffer + rx_pkt_offset,
+ copy, iter);
+ if (ret2 != copy) {
+ ret = -EFAULT;
goto out;
}
@@ -328,13 +371,14 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call,
/* The whole packet has been transferred. */
if (sp->hdr.flags & RXRPC_LAST_PACKET)
ret = 1;
- rx_pkt_offset = 0;
+ rx_pkt_offset = USHRT_MAX;
rx_pkt_len = 0;
+ if (unlikely(flags & MSG_PEEK))
+ break;
skb = skb_peek_next(skb, &call->recvmsg_queue);
- if (!(flags & MSG_PEEK))
- rxrpc_rotate_rx_window(call);
+ rxrpc_rotate_rx_window(call);
if (!rx->app_ops &&
!skb_queue_empty_lockless(&rx->recvmsg_oobq)) {
diff --git a/net/rxrpc/rxgk.c b/net/rxrpc/rxgk.c
index 0d5e654da918..88e651dd0e90 100644
--- a/net/rxrpc/rxgk.c
+++ b/net/rxrpc/rxgk.c
@@ -473,8 +473,9 @@ static int rxgk_verify_packet_integrity(struct rxrpc_call *call,
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
struct rxgk_header *hdr;
struct krb5_buffer metadata;
- unsigned int offset = sp->offset, len = sp->len;
+ unsigned int offset = 0, len = call->rx_dec_len;
size_t data_offset = 0, data_len = len;
+ void *data = call->rx_dec_buffer;
u32 ac = 0;
int ret = -ENOMEM;
@@ -496,16 +497,16 @@ static int rxgk_verify_packet_integrity(struct rxrpc_call *call,
metadata.len = sizeof(*hdr);
metadata.data = hdr;
- ret = rxgk_verify_mic_skb(gk->krb5, gk->rx_Kc, &metadata,
- skb, &offset, &len, &ac);
+ ret = rxgk_verify_mic(gk->krb5, gk->rx_Kc, &metadata,
+ data, &offset, &len, &ac);
kfree(hdr);
if (ret < 0) {
if (ret != -ENOMEM)
rxrpc_abort_eproto(call, skb, ac,
rxgk_abort_1_verify_mic_eproto);
} else {
- sp->offset = offset;
- sp->len = len;
+ call->rx_dec_offset = offset;
+ call->rx_dec_len = len;
}
put_gk:
@@ -522,49 +523,45 @@ static int rxgk_verify_packet_encrypted(struct rxrpc_call *call,
struct sk_buff *skb)
{
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
- struct rxgk_header hdr;
- unsigned int offset = sp->offset, len = sp->len;
+ struct rxgk_header *hdr;
+ unsigned int offset = 0, len = call->rx_dec_len;
+ void *data = call->rx_dec_buffer;
int ret;
u32 ac = 0;
_enter("");
- ret = rxgk_decrypt_skb(gk->krb5, gk->rx_enc, skb, &offset, &len, &ac);
+ ret = rxgk_decrypt(gk->krb5, gk->rx_enc, data, &offset, &len, &ac);
if (ret < 0) {
if (ret != -ENOMEM)
rxrpc_abort_eproto(call, skb, ac, rxgk_abort_2_decrypt_eproto);
goto error;
}
- if (len < sizeof(hdr)) {
+ if (len < sizeof(*hdr)) {
ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT,
rxgk_abort_2_short_header);
goto error;
}
/* Extract the header from the skb */
- ret = skb_copy_bits(skb, offset, &hdr, sizeof(hdr));
- if (ret < 0) {
- ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT,
- rxgk_abort_2_short_encdata);
- goto error;
- }
- offset += sizeof(hdr);
- len -= sizeof(hdr);
-
- if (ntohl(hdr.epoch) != call->conn->proto.epoch ||
- ntohl(hdr.cid) != call->cid ||
- ntohl(hdr.call_number) != call->call_id ||
- ntohl(hdr.seq) != sp->hdr.seq ||
- ntohl(hdr.sec_index) != call->security_ix ||
- ntohl(hdr.data_len) > len) {
+ hdr = data + offset;
+ offset += sizeof(*hdr);
+ len -= sizeof(*hdr);
+
+ if (ntohl(hdr->epoch) != call->conn->proto.epoch ||
+ ntohl(hdr->cid) != call->cid ||
+ ntohl(hdr->call_number) != call->call_id ||
+ ntohl(hdr->seq) != sp->hdr.seq ||
+ ntohl(hdr->sec_index) != call->security_ix ||
+ ntohl(hdr->data_len) > len) {
ret = rxrpc_abort_eproto(call, skb, RXGK_SEALEDINCON,
rxgk_abort_2_short_data);
goto error;
}
- sp->offset = offset;
- sp->len = ntohl(hdr.data_len);
+ call->rx_dec_offset = offset;
+ call->rx_dec_len = ntohl(hdr->data_len);
ret = 0;
error:
rxgk_put(gk);
diff --git a/net/rxrpc/rxgk_common.h b/net/rxrpc/rxgk_common.h
index 1e257d7ab8ec..dc8b0f106104 100644
--- a/net/rxrpc/rxgk_common.h
+++ b/net/rxrpc/rxgk_common.h
@@ -105,6 +105,45 @@ int rxgk_decrypt_skb(const struct krb5_enctype *krb5,
return ret;
}
+/*
+ * Apply decryption and checksumming functions to a flat data buffer. The offset
+ * and length are updated to reflect the actual content of the encrypted
+ * region.
+ */
+static inline int rxgk_decrypt(const struct krb5_enctype *krb5,
+ struct crypto_aead *aead,
+ void *data,
+ unsigned int *_offset, unsigned int *_len,
+ u32 *_error_code)
+{
+ struct scatterlist sg[1];
+ size_t offset = 0, len = *_len;
+ int ret;
+
+ sg_init_one(sg, data, len);
+
+ ret = crypto_krb5_decrypt(krb5, aead, sg, 1, &offset, &len);
+ switch (ret) {
+ case 0:
+ *_offset += offset;
+ *_len = len;
+ break;
+ case -EBADMSG: /* Checksum mismatch. */
+ case -EPROTO:
+ *_error_code = RXGK_SEALEDINCON;
+ break;
+ case -EMSGSIZE:
+ *_error_code = RXGK_PACKETSHORT;
+ break;
+ case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */
+ default:
+ *_error_code = RXGK_INCONSISTENCY;
+ break;
+ }
+
+ return ret;
+}
+
/*
* Check the MIC on a region of an skbuff. The offset and length are updated
* to reflect the actual content of the secure region.
@@ -148,3 +187,43 @@ int rxgk_verify_mic_skb(const struct krb5_enctype *krb5,
return ret;
}
+
+/*
+ * Check the MIC on a flat buffer. The offset and length are updated to
+ * reflect the actual content of the secure region.
+ */
+static inline
+int rxgk_verify_mic(const struct krb5_enctype *krb5,
+ struct crypto_shash *shash,
+ const struct krb5_buffer *metadata,
+ void *data,
+ unsigned int *_offset, unsigned int *_len,
+ u32 *_error_code)
+{
+ struct scatterlist sg[1];
+ size_t offset = 0, len = *_len;
+ int ret;
+
+ sg_init_one(sg, data, len);
+
+ ret = crypto_krb5_verify_mic(krb5, shash, metadata, sg, 1, &offset, &len);
+ switch (ret) {
+ case 0:
+ *_offset += offset;
+ *_len = len;
+ break;
+ case -EBADMSG: /* Checksum mismatch */
+ case -EPROTO:
+ *_error_code = RXGK_SEALEDINCON;
+ break;
+ case -EMSGSIZE:
+ *_error_code = RXGK_PACKETSHORT;
+ break;
+ case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */
+ default:
+ *_error_code = RXGK_INCONSISTENCY;
+ break;
+ }
+
+ return ret;
+}
diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index cba7935977f0..075936337836 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -430,27 +430,25 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb,
rxrpc_seq_t seq,
struct skcipher_request *req)
{
- struct rxkad_level1_hdr sechdr;
+ struct rxkad_level1_hdr *sechdr;
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
struct rxrpc_crypt iv;
- struct scatterlist sg[16];
- u32 data_size, buf;
+ struct scatterlist sg[1];
+ void *data = call->rx_dec_buffer;
+ u32 len = sp->len, data_size, buf;
u16 check;
int ret;
_enter("");
- if (sp->len < 8)
+ if (len < 8)
return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
rxkad_abort_1_short_header);
- /* Decrypt the skbuff in-place. TODO: We really want to decrypt
- * directly into the target buffer.
+ /* Decrypt in place in the call's decryption buffer. TODO: We really
+ * want to decrypt directly into the target buffer.
*/
- sg_init_table(sg, ARRAY_SIZE(sg));
- ret = skb_to_sgvec(skb, sg, sp->offset, 8);
- if (unlikely(ret < 0))
- return ret;
+ sg_init_one(sg, data, len);
/* start the decryption afresh */
memset(&iv, 0, sizeof(iv));
@@ -464,13 +462,11 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb,
return ret;
/* Extract the decrypted packet length */
- if (skb_copy_bits(skb, sp->offset, &sechdr, sizeof(sechdr)) < 0)
- return rxrpc_abort_eproto(call, skb, RXKADDATALEN,
- rxkad_abort_1_short_encdata);
- sp->offset += sizeof(sechdr);
- sp->len -= sizeof(sechdr);
+ sechdr = data;
+ call->rx_dec_offset = sizeof(*sechdr);
+ len -= sizeof(*sechdr);
- buf = ntohl(sechdr.data_size);
+ buf = ntohl(sechdr->data_size);
data_size = buf & 0xffff;
check = buf >> 16;
@@ -479,10 +475,10 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb,
if (check != 0)
return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
rxkad_abort_1_short_check);
- if (data_size > sp->len)
+ if (data_size > len)
return rxrpc_abort_eproto(call, skb, RXKADDATALEN,
rxkad_abort_1_short_data);
- sp->len = data_size;
+ call->rx_dec_len = data_size;
_leave(" = 0 [dlen=%x]", data_size);
return 0;
@@ -496,43 +492,28 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
struct skcipher_request *req)
{
const struct rxrpc_key_token *token;
- struct rxkad_level2_hdr sechdr;
+ struct rxkad_level2_hdr *sechdr;
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
struct rxrpc_crypt iv;
- struct scatterlist _sg[4], *sg;
- u32 data_size, buf;
+ struct scatterlist sg[1];
+ void *data = call->rx_dec_buffer;
+ u32 len = sp->len, data_size, buf;
u16 check;
- int nsg, ret;
+ int ret;
- _enter(",{%d}", sp->len);
+ _enter(",{%d}", len);
- if (sp->len < 8)
+ if (len < 8)
return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
rxkad_abort_2_short_header);
/* Don't let the crypto algo see a misaligned length. */
- sp->len = round_down(sp->len, 8);
+ len = round_down(len, 8);
- /* Decrypt the skbuff in-place. TODO: We really want to decrypt
- * directly into the target buffer.
+ /* Decrypt in place in the call's decryption buffer. TODO: We really
+ * want to decrypt directly into the target buffer.
*/
- sg = _sg;
- nsg = skb_shinfo(skb)->nr_frags + 1;
- if (nsg <= 4) {
- nsg = 4;
- } else {
- sg = kmalloc_objs(*sg, nsg, GFP_NOIO);
- if (!sg)
- return -ENOMEM;
- }
-
- sg_init_table(sg, nsg);
- ret = skb_to_sgvec(skb, sg, sp->offset, sp->len);
- if (unlikely(ret < 0)) {
- if (sg != _sg)
- kfree(sg);
- return ret;
- }
+ sg_init_one(sg, data, len);
/* decrypt from the session key */
token = call->conn->key->payload.data[0];
@@ -540,11 +521,9 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
skcipher_request_set_sync_tfm(req, call->conn->rxkad.cipher);
skcipher_request_set_callback(req, 0, NULL, NULL);
- skcipher_request_set_crypt(req, sg, sg, sp->len, iv.x);
+ skcipher_request_set_crypt(req, sg, sg, len, iv.x);
ret = crypto_skcipher_decrypt(req);
skcipher_request_zero(req);
- if (sg != _sg)
- kfree(sg);
if (ret < 0) {
if (ret == -ENOMEM)
return ret;
@@ -553,13 +532,11 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
}
/* Extract the decrypted packet length */
- if (skb_copy_bits(skb, sp->offset, &sechdr, sizeof(sechdr)) < 0)
- return rxrpc_abort_eproto(call, skb, RXKADDATALEN,
- rxkad_abort_2_short_len);
- sp->offset += sizeof(sechdr);
- sp->len -= sizeof(sechdr);
+ sechdr = data;
+ call->rx_dec_offset = sizeof(*sechdr);
+ len -= sizeof(*sechdr);
- buf = ntohl(sechdr.data_size);
+ buf = ntohl(sechdr->data_size);
data_size = buf & 0xffff;
check = buf >> 16;
@@ -569,17 +546,18 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
rxkad_abort_2_short_check);
- if (data_size > sp->len)
+ if (data_size > len)
return rxrpc_abort_eproto(call, skb, RXKADDATALEN,
rxkad_abort_2_short_data);
- sp->len = data_size;
+ call->rx_dec_len = data_size;
_leave(" = 0 [dlen=%x]", data_size);
return 0;
}
/*
- * Verify the security on a received packet and the subpackets therein.
+ * Verify the security on a received (sub)packet. If the packet needs
+ * modifying (e.g. decrypting), it must be copied.
*/
static int rxkad_verify_packet(struct rxrpc_call *call, struct sk_buff *skb)
{
* [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer
2026-05-11 16:07 [PATCH net 0/3] rxrpc: Better fix for DATA/RESPONSE decrypt vs splice() David Howells
2026-05-11 16:07 ` [PATCH net 1/3] rxrpc: Also unshare DATA/RESPONSE packets when paged frags are present David Howells
2026-05-11 16:07 ` [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg David Howells
@ 2026-05-11 16:07 ` David Howells
2026-05-12 8:22 ` Jeffrey Altman
2026-05-13 0:06 ` Jakub Kicinski
2 siblings, 2 replies; 15+ messages in thread
From: David Howells @ 2026-05-11 16:07 UTC (permalink / raw)
To: netdev
Cc: David Howells, Hyunwoo Kim, Marc Dionne, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
linux-afs, linux-kernel, Jeffrey Altman, Jiayuan Chen, stable
This improves the fix for CVE-2026-43500.
Fix the verification of RESPONSE packets so that a RESPONSE packet spliced
to a local address is no longer overwritten: extract the contents of the
UDP packet into a kmalloc'd linear buffer rather than decrypting the data
in place in the sk_buff (which may corrupt the original pages).
Further, since the way the RESPONSE data is handled is being overhauled,
add an XDR decode abstraction that hides the details of buffer advancement
and length checking. At some point it may be worth seeing if NFS's XDR
stuff can be used, but that involves linking against the sunrpc module
which is undesirable.
Fixes: 24481a7f5733 ("rxrpc: Fix conn-level packet handling to unshare RESPONSE packets")
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Closes: https://lore.kernel.org/r/afKV2zGR6rrelPC7@v4bel/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Simon Horman <horms@kernel.org>
cc: Jiayuan Chen <jiayuan.chen@linux.dev>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
cc: stable@kernel.org
---
net/rxrpc/ar-internal.h | 70 +++++++++++++++++--
net/rxrpc/conn_event.c | 35 +++++-----
net/rxrpc/insecure.c | 5 +-
net/rxrpc/protocol.h | 1 -
net/rxrpc/rxgk.c | 146 +++++++++++++---------------------------
net/rxrpc/rxgk_app.c | 91 +++++++++++--------------
net/rxrpc/rxgk_common.h | 115 ++++---------------------------
net/rxrpc/rxkad.c | 89 ++++++++++--------------
8 files changed, 219 insertions(+), 333 deletions(-)
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 783367eea798..610fa208157b 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -33,6 +33,13 @@ struct rxrpc_txbuf;
struct rxrpc_txqueue;
struct rxgk_context;
+typedef __be32 xdr_t;
+
+struct xdr_buffer {
+ xdr_t *p; /* Current decode point */
+ unsigned int len; /* Amount left in buffer */
+};
+
/*
* Mark applied to socket buffers in skb->mark. skb->priority is used
* to pass supplementary information.
@@ -307,16 +314,16 @@ struct rxrpc_security {
struct sk_buff *challenge);
/* verify a response */
- int (*verify_response)(struct rxrpc_connection *,
- struct sk_buff *);
+ int (*verify_response)(struct rxrpc_connection *conn,
+ struct sk_buff *response_skb,
+ struct xdr_buffer *response);
/* clear connection security */
void (*clear)(struct rxrpc_connection *);
/* Default ticket -> key decoder */
int (*default_decode_ticket)(struct rxrpc_connection *conn, struct sk_buff *skb,
- unsigned int ticket_offset, unsigned int ticket_len,
- struct key **_key);
+ struct xdr_buffer *ticket, struct key **_key);
};
/*
@@ -1591,6 +1598,61 @@ static inline unsigned int rxrpc_tx_in_flight(const struct rxrpc_call *call)
return call->tx_nr_sent - rxrpc_left_out(call) + call->tx_nr_resent;
}
+#define xdr_round_up(x) (round_up((x), sizeof(__be32)))
+#define xdr_round_down(x) (round_down((x), sizeof(__be32)))
+#define xdr_object_len(x) (4 + xdr_round_up(x))
+
+/*
+ * Check the amount of data remaining in an XDR buffer.
+ */
+static inline bool xdr_check(struct xdr_buffer *buf, unsigned int bytes)
+{
+ if (bytes > buf->len ||
+ xdr_round_up(bytes) > buf->len)
+ return false;
+ return true;
+}
+
+/*
+ * Grab a region in an XDR buffer and advance the buffer position, returning a
+ * pointer to the start of the region or NULL if there's insufficient data.
+ */
+static inline void *xdr_extract_region(struct xdr_buffer *buf, unsigned int bytes)
+{
+ xdr_t *p = buf->p;
+
+ if (!xdr_check(buf, bytes))
+ return NULL;
+
+ bytes = xdr_round_up(bytes);
+ buf->p += bytes / sizeof(*buf->p);
+ buf->len -= bytes;
+ return p;
+}
+
+/*
+ * Decode an XDR word.
+ */
+static inline u32 xdr_dec(xdr_t x)
+{
+ return ntohl(x);
+}
+
+/*
+ * Grab a blob with size from an XDR buffer and advance the buffer position.
+ */
+static inline bool xdr_extract_blob(struct xdr_buffer *buf, struct xdr_buffer *blob)
+{
+ xdr_t *xsize;
+
+ xsize = xdr_extract_region(buf, sizeof(*xsize));
+ if (!xsize)
+ return false;
+ blob->len = xdr_dec(*xsize);
+ blob->p = xdr_extract_region(buf, blob->len);
+ return blob->p != NULL;
+}
+
/*
* debug tracing
*/
diff --git a/net/rxrpc/conn_event.c b/net/rxrpc/conn_event.c
index 442414d90ba1..26cab3d2075e 100644
--- a/net/rxrpc/conn_event.c
+++ b/net/rxrpc/conn_event.c
@@ -243,28 +243,25 @@ static void rxrpc_call_is_secure(struct rxrpc_call *call)
static int rxrpc_verify_response(struct rxrpc_connection *conn,
struct sk_buff *skb)
{
+ struct xdr_buffer response = {
+ .len = skb->len - sizeof(struct rxrpc_wire_header),
+ };
+ void *buffer;
int ret;
- if (skb_cloned(skb) || skb_has_frag_list(skb) ||
- skb_has_shared_frag(skb)) {
- /* Copy the packet if shared so that we can do in-place
- * decryption.
- */
- struct sk_buff *nskb = skb_copy(skb, GFP_NOFS);
-
- if (nskb) {
- rxrpc_new_skb(nskb, rxrpc_skb_new_unshared);
- ret = conn->security->verify_response(conn, nskb);
- rxrpc_free_skb(nskb, rxrpc_skb_put_response_copy);
- } else {
- /* OOM - Drop the packet. */
- rxrpc_see_skb(skb, rxrpc_skb_see_unshare_nomem);
- ret = -ENOMEM;
- }
- } else {
- ret = conn->security->verify_response(conn, skb);
- }
+ buffer = kmalloc(response.len, GFP_NOFS);
+ if (!buffer)
+ return -ENOMEM;
+
+ ret = skb_copy_bits(skb, sizeof(struct rxrpc_wire_header), buffer, response.len);
+ if (ret < 0)
+ goto out;
+
+ response.p = buffer;
+ ret = conn->security->verify_response(conn, skb, &response);
+out:
+ kfree(buffer);
return ret;
}
diff --git a/net/rxrpc/insecure.c b/net/rxrpc/insecure.c
index 7a26c6097d03..dd1827f683bc 100644
--- a/net/rxrpc/insecure.c
+++ b/net/rxrpc/insecure.c
@@ -54,9 +54,10 @@ static int none_sendmsg_respond_to_challenge(struct sk_buff *challenge,
}
static int none_verify_response(struct rxrpc_connection *conn,
- struct sk_buff *skb)
+ struct sk_buff *response_skb,
+ struct xdr_buffer *response)
{
- return rxrpc_abort_conn(conn, skb, RX_PROTOCOL_ERROR, -EPROTO,
+ return rxrpc_abort_conn(conn, response_skb, RX_PROTOCOL_ERROR, -EPROTO,
rxrpc_eproto_rxnull_response);
}
diff --git a/net/rxrpc/protocol.h b/net/rxrpc/protocol.h
index f8bfec12bc7e..6e02a84fd370 100644
--- a/net/rxrpc/protocol.h
+++ b/net/rxrpc/protocol.h
@@ -198,7 +198,6 @@ struct rxgk_header {
*/
struct rxgk_response {
__be64 start_time;
- __be32 token_len;
} __packed;
#endif /* _LINUX_RXRPC_PACKET_H */
diff --git a/net/rxrpc/rxgk.c b/net/rxrpc/rxgk.c
index 88e651dd0e90..fd73b1ff3b97 100644
--- a/net/rxrpc/rxgk.c
+++ b/net/rxrpc/rxgk.c
@@ -524,43 +524,42 @@ static int rxgk_verify_packet_encrypted(struct rxrpc_call *call,
{
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
struct rxgk_header *hdr;
- unsigned int offset = 0, len = call->rx_dec_len;
- void *data = call->rx_dec_buffer;
+ struct xdr_buffer xdr = {
+ .p = call->rx_dec_buffer,
+ .len = call->rx_dec_len,
+ };
int ret;
u32 ac = 0;
_enter("");
- ret = rxgk_decrypt(gk->krb5, gk->rx_enc, data, &offset, &len, &ac);
+ ret = rxgk_decrypt(gk->krb5, gk->rx_enc, &xdr, &ac);
if (ret < 0) {
if (ret != -ENOMEM)
rxrpc_abort_eproto(call, skb, ac, rxgk_abort_2_decrypt_eproto);
goto error;
}
- if (len < sizeof(*hdr)) {
+ /* Extract the header from the skb */
+ hdr = xdr_extract_region(&xdr, sizeof(*hdr));
+ if (!hdr) {
ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT,
rxgk_abort_2_short_header);
goto error;
}
- /* Extract the header from the skb */
- hdr = data + offset;
- offset += sizeof(*hdr);
- len -= sizeof(*hdr);
-
if (ntohl(hdr->epoch) != call->conn->proto.epoch ||
ntohl(hdr->cid) != call->cid ||
ntohl(hdr->call_number) != call->call_id ||
ntohl(hdr->seq) != sp->hdr.seq ||
ntohl(hdr->sec_index) != call->security_ix ||
- ntohl(hdr->data_len) > len) {
+ ntohl(hdr->data_len) > xdr.len) {
ret = rxrpc_abort_eproto(call, skb, RXGK_SEALEDINCON,
rxgk_abort_2_short_data);
goto error;
}
- call->rx_dec_offset = offset;
+ call->rx_dec_offset = (void *)xdr.p - call->rx_dec_buffer;
call->rx_dec_len = ntohl(hdr->data_len);
ret = 0;
error:
@@ -1073,37 +1072,37 @@ static int rxgk_sendmsg_respond_to_challenge(struct sk_buff *challenge,
* unsigned int call_numbers<>;
* };
*/
-static int rxgk_do_verify_authenticator(struct rxrpc_connection *conn,
- const struct krb5_enctype *krb5,
- struct sk_buff *skb,
- __be32 *p, __be32 *end)
+static int rxgk_verify_authenticator(struct rxrpc_connection *conn,
+ const struct krb5_enctype *krb5,
+ struct sk_buff *skb,
+ struct xdr_buffer *auth)
{
- u32 app_len, call_count, level, epoch, cid, i;
+ struct xdr_buffer appdata;
+ xdr_t *nonce, *x;
+ u32 call_count, level, epoch, cid, i;
_enter("");
- if ((end - p) * sizeof(__be32) < 24)
+ nonce = xdr_extract_region(auth, 20);
+ if (!nonce)
return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
rxgk_abort_resp_short_auth);
- if (memcmp(p, conn->rxgk.nonce, 20) != 0)
+ if (memcmp(nonce, conn->rxgk.nonce, 20) != 0)
return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
rxgk_abort_resp_bad_nonce);
- p += 20 / sizeof(__be32);
- app_len = ntohl(*p++);
- if (app_len > (end - p) * sizeof(__be32))
+ if (!xdr_extract_blob(auth, &appdata))
return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
rxgk_abort_resp_short_applen);
- p += xdr_round_up(app_len) / sizeof(__be32);
- if (end - p < 4)
+ x = xdr_extract_region(auth, 4 * sizeof(xdr_t));
+ if (!x)
return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
rxgk_abort_resp_short_auth);
-
- level = ntohl(*p++);
- epoch = ntohl(*p++);
- cid = ntohl(*p++);
- call_count = ntohl(*p++);
+ level = xdr_dec(x[0]);
+ epoch = xdr_dec(x[1]);
+ cid = xdr_dec(x[2]);
+ call_count = xdr_dec(x[3]);
if (level != conn->security_level ||
epoch != conn->proto.epoch ||
@@ -1112,12 +1111,13 @@ static int rxgk_do_verify_authenticator(struct rxrpc_connection *conn,
return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
rxgk_abort_resp_bad_param);
- if (end - p < call_count)
+ x = xdr_extract_region(auth, call_count * sizeof(xdr_t));
+ if (!x)
return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
rxgk_abort_resp_short_call_list);
for (i = 0; i < call_count; i++) {
- u32 call_id = ntohl(*p++);
+ u32 call_id = xdr_dec(x[i]);
if (call_id > INT_MAX)
return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
@@ -1140,37 +1140,6 @@ static int rxgk_do_verify_authenticator(struct rxrpc_connection *conn,
return 0;
}
-/*
- * Extract the authenticator and verify it.
- */
-static int rxgk_verify_authenticator(struct rxrpc_connection *conn,
- const struct krb5_enctype *krb5,
- struct sk_buff *skb,
- unsigned int auth_offset, unsigned int auth_len)
-{
- void *auth;
- __be32 *p;
- int ret;
-
- auth = kmalloc(auth_len, GFP_NOFS);
- if (!auth)
- return -ENOMEM;
-
- ret = skb_copy_bits(skb, auth_offset, auth, auth_len);
- if (ret < 0) {
- ret = rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
- rxgk_abort_resp_short_auth);
- goto error;
- }
-
- p = auth;
- ret = rxgk_do_verify_authenticator(conn, krb5, skb, p,
- p + auth_len / sizeof(*p));
-error:
- kfree(auth);
- return ret;
-}
-
/*
* Verify a response.
*
@@ -1181,55 +1150,34 @@ static int rxgk_verify_authenticator(struct rxrpc_connection *conn,
* };
*/
static int rxgk_verify_response(struct rxrpc_connection *conn,
- struct sk_buff *skb)
+ struct sk_buff *skb,
+ struct xdr_buffer *response)
{
const struct krb5_enctype *krb5;
struct rxrpc_key_token *token;
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
- struct rxgk_response rhdr;
+ struct rxgk_response *rhdr;
struct rxgk_context *gk;
+ struct xdr_buffer resp_token, auth;
struct key *key = NULL;
- unsigned int offset = sizeof(struct rxrpc_wire_header);
- unsigned int len = skb->len - sizeof(struct rxrpc_wire_header);
- unsigned int token_offset, token_len;
- unsigned int auth_offset, auth_len;
- __be32 xauth_len;
int ret, ec;
_enter("{%d}", conn->debug_id);
/* Parse the RXGK_Response object */
- if (sizeof(rhdr) + sizeof(__be32) > len)
+ rhdr = xdr_extract_region(response, sizeof(*rhdr));
+ if (!rhdr)
goto short_packet;
-
- if (skb_copy_bits(skb, offset, &rhdr, sizeof(rhdr)) < 0)
+ if (!xdr_extract_blob(response, &resp_token))
goto short_packet;
- offset += sizeof(rhdr);
- len -= sizeof(rhdr);
-
- token_offset = offset;
- token_len = ntohl(rhdr.token_len);
- if (token_len > len ||
- xdr_round_up(token_len) + sizeof(__be32) > len)
+ if (!xdr_extract_blob(response, &auth))
goto short_packet;
- trace_rxrpc_rx_response(conn, sp->hdr.serial, 0, sp->hdr.cksum, token_len);
+ trace_rxrpc_rx_response(conn, sp->hdr.serial, 0, sp->hdr.cksum, resp_token.len);
- offset += xdr_round_up(token_len);
- len -= xdr_round_up(token_len);
-
- if (skb_copy_bits(skb, offset, &xauth_len, sizeof(xauth_len)) < 0)
- goto short_packet;
- offset += sizeof(xauth_len);
- len -= sizeof(xauth_len);
-
- auth_offset = offset;
- auth_len = ntohl(xauth_len);
- if (auth_len > len)
- goto short_packet;
- if (auth_len & 3)
+ if (auth.len & 3)
goto inconsistent;
- if (auth_len < 20 + 9 * 4)
+ if (auth.len < 20 + 9 * 4)
goto auth_too_short;
/* We need to extract and decrypt the token and instantiate a session
@@ -1238,7 +1186,7 @@ static int rxgk_verify_response(struct rxrpc_connection *conn,
* to the app to deal with - which might mean a round trip to
* userspace.
*/
- ret = rxgk_extract_token(conn, skb, token_offset, token_len, &key);
+ ret = rxgk_extract_token(conn, skb, &resp_token, &key);
if (ret < 0)
goto out;
@@ -1252,7 +1200,7 @@ static int rxgk_verify_response(struct rxrpc_connection *conn,
*/
token = key->payload.data[0];
conn->security_level = token->rxgk->level;
- conn->rxgk.start_time = __be64_to_cpu(rhdr.start_time);
+ conn->rxgk.start_time = __be64_to_cpu(rhdr->start_time);
gk = rxgk_generate_transport_key(conn, token->rxgk, sp->hdr.cksum, GFP_NOFS);
if (IS_ERR(gk)) {
@@ -1262,18 +1210,18 @@ static int rxgk_verify_response(struct rxrpc_connection *conn,
krb5 = gk->krb5;
- trace_rxrpc_rx_response(conn, sp->hdr.serial, krb5->etype, sp->hdr.cksum, token_len);
+ trace_rxrpc_rx_response(conn, sp->hdr.serial, krb5->etype, sp->hdr.cksum,
+ resp_token.len);
/* Decrypt, parse and verify the authenticator. */
- ret = rxgk_decrypt_skb(krb5, gk->resp_enc, skb,
- &auth_offset, &auth_len, &ec);
+ ret = rxgk_decrypt(krb5, gk->resp_enc, &auth, &ec);
if (ret < 0) {
rxrpc_abort_conn(conn, skb, RXGK_SEALEDINCON, ret,
rxgk_abort_resp_auth_dec);
goto out_gk;
}
- ret = rxgk_verify_authenticator(conn, krb5, skb, auth_offset, auth_len);
+ ret = rxgk_verify_authenticator(conn, krb5, skb, &auth);
if (ret < 0)
goto out_gk;
diff --git a/net/rxrpc/rxgk_app.c b/net/rxrpc/rxgk_app.c
index 0ef2a29eb695..5ced22f17c5b 100644
--- a/net/rxrpc/rxgk_app.c
+++ b/net/rxrpc/rxgk_app.c
@@ -40,40 +40,43 @@
* };
*/
int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
- unsigned int ticket_offset, unsigned int ticket_len,
- struct key **_key)
+ struct xdr_buffer *xticket, struct key **_key)
{
struct rxrpc_key_token *token;
const struct cred *cred = current_cred(); // TODO - use socket creds
+ struct xdr_buffer xdr = *xticket, K0;
struct key *key;
size_t pre_ticket_len, payload_len;
- unsigned int klen, enctype;
+ unsigned int enctype;
void *payload, *ticket;
- __be32 *t, *p, *q, tmp[2];
+ __be32 *t, *p, *q;
+ xdr_t *x;
int ret;
_enter("");
- if (ticket_len < 10 * sizeof(__be32))
+ /* Extract K0 */
+ x = xdr_extract_region(&xdr, sizeof(xdr_t) * 1);
+ if (!x)
return rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
rxgk_abort_resp_short_yfs_tkt);
- /* Get the session key length */
- ret = skb_copy_bits(skb, ticket_offset, tmp, sizeof(tmp));
- if (ret < 0)
- return rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
- rxgk_abort_resp_short_yfs_klen);
- enctype = ntohl(tmp[0]);
- klen = ntohl(tmp[1]);
+ enctype = xdr_dec(*x);
- if (klen > ticket_len - 10 * sizeof(__be32))
+ if (!xdr_extract_blob(&xdr, &K0))
return rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
rxgk_abort_resp_short_yfs_key);
+ /* Extract level to expirationtime. */
+ t = xdr_extract_region(&xdr, sizeof(xdr_t) * 7);
+ if (!t)
+ return rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
+ rxgk_abort_resp_short_yfs_tkt);
+
pre_ticket_len = ((5 + 14) * sizeof(__be32) +
- xdr_round_up(klen) +
+ xdr_round_up(K0.len) +
sizeof(__be32));
- payload_len = pre_ticket_len + xdr_round_up(ticket_len);
+ payload_len = pre_ticket_len + xdr_round_up(xticket->len);
payload = kzalloc(payload_len, GFP_NOFS);
if (!payload)
@@ -84,12 +87,7 @@ int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
* it.
*/
ticket = payload + pre_ticket_len;
- ret = skb_copy_bits(skb, ticket_offset, ticket, ticket_len);
- if (ret < 0) {
- ret = rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
- rxgk_abort_resp_short_yfs_tkt);
- goto error;
- }
+ memcpy(ticket, xticket->p, xticket->len);
/* Fill out the form header. */
p = payload;
@@ -97,13 +95,12 @@ int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
p[1] = htonl(1); /* len(cellname) */
p[2] = htonl(0x20000000); /* Cellname " " */
p[3] = htonl(1); /* #tokens */
- p[4] = htonl(15 * sizeof(__be32) + xdr_round_up(klen) +
- xdr_round_up(ticket_len)); /* Token len */
+ p[4] = htonl(15 * sizeof(__be32) + xdr_round_up(K0.len) +
+ xdr_round_up(xticket->len)); /* Token len */
/* Now fill in the body. Most of this we can just scrape directly from
* the ticket.
*/
- t = ticket + sizeof(__be32) * 2 + xdr_round_up(klen);
q = payload + 5 * sizeof(__be32);
q[0] = htonl(RXRPC_SECURITY_YFS_RXGK);
q[1] = t[1]; /* begintime - msw */
@@ -118,21 +115,21 @@ int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
q[10] = t[4]; /* - lsw */
q[11] = 0; /* enctype - msw */
q[12] = htonl(enctype); /* - lsw */
- q[13] = htonl(klen); /* Key length */
+ q[13] = htonl(K0.len); /* Key length */
q += 14;
- memcpy(q, ticket + sizeof(__be32) * 2, klen);
- q += xdr_round_up(klen) / 4;
- q[0] = htonl(ticket_len);
+ memcpy(q, K0.p, K0.len);
+ q += xdr_round_up(K0.len) / 4;
+ q[0] = htonl(xticket->len);
q++;
if (WARN_ON((unsigned long)q != (unsigned long)ticket)) {
ret = -EIO;
goto error;
}
- /* Ticket read in with skb_copy_bits above */
- q += xdr_round_up(ticket_len) / 4;
+ /* Ticket appended above. */
+ q += xdr_round_up(xticket->len) / 4;
if (WARN_ON((unsigned long)q - (unsigned long)payload != payload_len)) {
ret = -EIO;
goto error;
@@ -182,44 +179,34 @@ int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
* [tools.ietf.org/html/draft-wilkinson-afs3-rxgk-afs-08 sec 6.1]
*/
int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
- unsigned int token_offset, unsigned int token_len,
- struct key **_key)
+ struct xdr_buffer *token, struct key **_key)
{
const struct krb5_enctype *krb5;
const struct krb5_buffer *server_secret;
struct crypto_aead *token_enc = NULL;
+ struct xdr_buffer ticket;
struct key *server_key;
- unsigned int ticket_offset, ticket_len;
u32 kvno, enctype;
int ret, ec = 0;
struct {
__be32 kvno;
__be32 enctype;
- __be32 token_len;
- } container;
+ } *container;
- if (token_len < sizeof(container))
+ container = xdr_extract_region(token, sizeof(*container));
+ if (!container)
goto short_packet;
/* Decode the RXGK_TokenContainer object. This tells us which server
* key we should be using. We can then fetch the key, get the secret
* and set up the crypto to extract the token.
*/
- if (skb_copy_bits(skb, token_offset, &container, sizeof(container)) < 0)
- goto short_packet;
-
- kvno = ntohl(container.kvno);
- enctype = ntohl(container.enctype);
- ticket_len = ntohl(container.token_len);
- ticket_offset = token_offset + sizeof(container);
-
- if (ticket_len > xdr_round_down(token_len - sizeof(container)))
- goto short_packet;
+ kvno = ntohl(container->kvno);
+ enctype = ntohl(container->enctype);
_debug("KVNO %u", kvno);
_debug("ENC %u", enctype);
- _debug("TLEN %u", ticket_len);
server_key = rxrpc_look_up_server_security(conn, skb, kvno, enctype);
if (IS_ERR(server_key))
@@ -237,8 +224,11 @@ int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
* gain access to K0, from which we can derive the transport key and
* thence decode the authenticator.
*/
- ret = rxgk_decrypt_skb(krb5, token_enc, skb,
- &ticket_offset, &ticket_len, &ec);
+ if (!xdr_extract_blob(token, &ticket))
+ goto short_packet;
+ _debug("TLEN %u", ticket.len);
+
+ ret = rxgk_decrypt(krb5, token_enc, &ticket, &ec);
crypto_free_aead(token_enc);
token_enc = NULL;
if (ret < 0) {
@@ -248,8 +238,7 @@ int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
return ret;
}
- ret = conn->security->default_decode_ticket(conn, skb, ticket_offset,
- ticket_len, _key);
+ ret = conn->security->default_decode_ticket(conn, skb, &ticket, _key);
if (ret < 0)
goto cant_get_token;
diff --git a/net/rxrpc/rxgk_common.h b/net/rxrpc/rxgk_common.h
index dc8b0f106104..f0b724c44036 100644
--- a/net/rxrpc/rxgk_common.h
+++ b/net/rxrpc/rxgk_common.h
@@ -33,19 +33,13 @@ struct rxgk_context {
struct crypto_aead *resp_enc; /* Response packet enc key */
};
-#define xdr_round_up(x) (round_up((x), sizeof(__be32)))
-#define xdr_round_down(x) (round_down((x), sizeof(__be32)))
-#define xdr_object_len(x) (4 + xdr_round_up(x))
-
/*
* rxgk_app.c
*/
int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
- unsigned int ticket_offset, unsigned int ticket_len,
- struct key **_key);
+ struct xdr_buffer *xticket, struct key **_key);
int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
- unsigned int token_offset, unsigned int token_len,
- struct key **_key);
+ struct xdr_buffer *token, struct key **_key);
/*
* rxgk_kdf.c
@@ -61,50 +55,6 @@ int rxgk_set_up_token_cipher(const struct krb5_buffer *server_key,
const struct krb5_enctype **_krb5,
gfp_t gfp);
-/*
- * Apply decryption and checksumming functions to part of an skbuff. The
- * offset and length are updated to reflect the actual content of the encrypted
- * region.
- */
-static inline
-int rxgk_decrypt_skb(const struct krb5_enctype *krb5,
- struct crypto_aead *aead,
- struct sk_buff *skb,
- unsigned int *_offset, unsigned int *_len,
- int *_error_code)
-{
- struct scatterlist sg[16];
- size_t offset = 0, len = *_len;
- int nr_sg, ret;
-
- sg_init_table(sg, ARRAY_SIZE(sg));
- nr_sg = skb_to_sgvec(skb, sg, *_offset, len);
- if (unlikely(nr_sg < 0))
- return nr_sg;
-
- ret = crypto_krb5_decrypt(krb5, aead, sg, nr_sg,
- &offset, &len);
- switch (ret) {
- case 0:
- *_offset += offset;
- *_len = len;
- break;
- case -EBADMSG: /* Checksum mismatch. */
- case -EPROTO:
- *_error_code = RXGK_SEALEDINCON;
- break;
- case -EMSGSIZE:
- *_error_code = RXGK_PACKETSHORT;
- break;
- case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */
- default:
- *_error_code = RXGK_INCONSISTENCY;
- break;
- }
-
- return ret;
-}
-
/*
* Apply decryption and checksumming functions to a flat data buffer. The offset
* and length are updated to reflect the actual content of the encrypted
@@ -112,21 +62,24 @@ int rxgk_decrypt_skb(const struct krb5_enctype *krb5,
*/
static inline int rxgk_decrypt(const struct krb5_enctype *krb5,
struct crypto_aead *aead,
- void *data,
- unsigned int *_offset, unsigned int *_len,
+ struct xdr_buffer *buf,
int *_error_code)
{
struct scatterlist sg[1];
- size_t offset = 0, len = *_len;
+ size_t offset = 0, len = buf->len;
int ret;
- sg_init_one(sg, data, len);
-
+ sg_init_one(sg, buf->p, len);
ret = crypto_krb5_decrypt(krb5, aead, sg, 1, &offset, &len);
switch (ret) {
case 0:
- *_offset += offset;
- *_len = len;
+ if (offset & 3) {
+ *_error_code = RXGK_INCONSISTENCY;
+ ret = -EPROTO;
+ break;
+ }
+ buf->p = (void *)buf->p + offset;
+ buf->len = len;
break;
case -EBADMSG: /* Checksum mismatch. */
case -EPROTO:
@@ -144,50 +97,6 @@ static inline int rxgk_decrypt(const struct krb5_enctype *krb5,
return ret;
}
-/*
- * Check the MIC on a region of an skbuff. The offset and length are updated
- * to reflect the actual content of the secure region.
- */
-static inline
-int rxgk_verify_mic_skb(const struct krb5_enctype *krb5,
- struct crypto_shash *shash,
- const struct krb5_buffer *metadata,
- struct sk_buff *skb,
- unsigned int *_offset, unsigned int *_len,
- u32 *_error_code)
-{
- struct scatterlist sg[16];
- size_t offset = 0, len = *_len;
- int nr_sg, ret;
-
- sg_init_table(sg, ARRAY_SIZE(sg));
- nr_sg = skb_to_sgvec(skb, sg, *_offset, len);
- if (unlikely(nr_sg < 0))
- return nr_sg;
-
- ret = crypto_krb5_verify_mic(krb5, shash, metadata, sg, nr_sg,
- &offset, &len);
- switch (ret) {
- case 0:
- *_offset += offset;
- *_len = len;
- break;
- case -EBADMSG: /* Checksum mismatch */
- case -EPROTO:
- *_error_code = RXGK_SEALEDINCON;
- break;
- case -EMSGSIZE:
- *_error_code = RXGK_PACKETSHORT;
- break;
- case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */
- default:
- *_error_code = RXGK_INCONSISTENCY;
- break;
- }
-
- return ret;
-}
-
/*
* Check the MIC on a flat buffer. The offset and length are updated to
* reflect the actual content of the secure region.
diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
index 075936337836..e0f3db451b05 100644
--- a/net/rxrpc/rxkad.c
+++ b/net/rxrpc/rxkad.c
@@ -963,7 +963,6 @@ static int rxkad_decrypt_ticket(struct rxrpc_connection *conn,
*_expiry = 0;
ASSERT(server_key->payload.data[0] != NULL);
- ASSERTCMP((unsigned long) ticket & 7UL, ==, 0);
memcpy(&iv, &server_key->payload.data[2], sizeof(iv));
@@ -1112,20 +1111,49 @@ static int rxkad_decrypt_response(struct rxrpc_connection *conn,
* verify a response
*/
static int rxkad_verify_response(struct rxrpc_connection *conn,
- struct sk_buff *skb)
+ struct sk_buff *skb,
+ struct xdr_buffer *response_xdr)
{
struct rxkad_response *response;
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
struct rxrpc_crypt session_key;
struct key *server_key;
time64_t expiry;
- void *ticket = NULL;
+ void *ticket;
u32 version, kvno, ticket_len, level;
__be32 csum;
int ret, i;
_enter("{%d}", conn->debug_id);
+ response = xdr_extract_region(response_xdr, sizeof(*response));
+ if (!response)
+ return rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
+ rxkad_abort_resp_short);
+
+ version = ntohl(response->version);
+ ticket_len = ntohl(response->ticket_len);
+ kvno = ntohl(response->kvno);
+
+ trace_rxrpc_rx_response(conn, sp->hdr.serial, version, kvno, ticket_len);
+
+ if (version != RXKAD_VERSION)
+ return rxrpc_abort_conn(conn, skb, RXKADINCONSISTENCY, -EPROTO,
+ rxkad_abort_resp_version);
+
+ if (kvno >= RXKAD_TKT_TYPE_KERBEROS_V5)
+ return rxrpc_abort_conn(conn, skb, RXKADUNKNOWNKEY, -EPROTO,
+ rxkad_abort_resp_unknown_tkt);
+
+ if (ticket_len < 4 || ticket_len > MAXKRB5TICKETLEN)
+ return rxrpc_abort_conn(conn, skb, RXKADTICKETLEN, -EPROTO,
+ rxkad_abort_resp_tkt_len);
+
+ ticket = xdr_extract_region(response_xdr, ticket_len);
+ if (!ticket)
+ return rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
+ rxkad_abort_resp_short_tkt);
+
+
server_key = rxrpc_look_up_server_security(conn, skb, 0, 0);
if (IS_ERR(server_key)) {
ret = PTR_ERR(server_key);
@@ -1142,55 +1170,10 @@ static int rxkad_verify_response(struct rxrpc_connection *conn,
}
}
- ret = -ENOMEM;
- response = kzalloc_obj(struct rxkad_response, GFP_NOFS);
- if (!response)
- goto error;
-
- if (skb_copy_bits(skb, sizeof(struct rxrpc_wire_header),
- response, sizeof(*response)) < 0) {
- ret = rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
- rxkad_abort_resp_short);
- goto error;
- }
-
- version = ntohl(response->version);
- ticket_len = ntohl(response->ticket_len);
- kvno = ntohl(response->kvno);
-
- trace_rxrpc_rx_response(conn, sp->hdr.serial, version, kvno, ticket_len);
-
- if (version != RXKAD_VERSION) {
- ret = rxrpc_abort_conn(conn, skb, RXKADINCONSISTENCY, -EPROTO,
- rxkad_abort_resp_version);
- goto error;
- }
-
- if (ticket_len < 4 || ticket_len > MAXKRB5TICKETLEN) {
- ret = rxrpc_abort_conn(conn, skb, RXKADTICKETLEN, -EPROTO,
- rxkad_abort_resp_tkt_len);
- goto error;
- }
-
- if (kvno >= RXKAD_TKT_TYPE_KERBEROS_V5) {
- ret = rxrpc_abort_conn(conn, skb, RXKADUNKNOWNKEY, -EPROTO,
- rxkad_abort_resp_unknown_tkt);
- goto error;
- }
-
- /* extract the kerberos ticket and decrypt and decode it */
- ret = -ENOMEM;
- ticket = kmalloc(ticket_len, GFP_NOFS);
- if (!ticket)
- goto error;
-
- if (skb_copy_bits(skb, sizeof(struct rxrpc_wire_header) + sizeof(*response),
- ticket, ticket_len) < 0) {
- ret = rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
- rxkad_abort_resp_short_tkt);
- goto error;
- }
-
+ /* Extract the kerberos ticket and decrypt and decode it. We copy it
+ * before decryption which ensures it's correctly aligned for the
+ * decryption.
+ */
ret = rxkad_decrypt_ticket(conn, server_key, skb, ticket, ticket_len,
&session_key, &expiry);
if (ret < 0)
@@ -1265,8 +1248,6 @@ static int rxkad_verify_response(struct rxrpc_connection *conn,
ret = rxrpc_get_server_data_key(conn, &session_key, expiry, kvno);
error:
- kfree(ticket);
- kfree(response);
key_put(server_key);
_leave(" = %d", ret);
return ret;
* Re: [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-11 16:07 ` [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg David Howells
@ 2026-05-12 7:58 ` Jeffrey Altman
2026-05-13 8:01 ` David Howells
2026-05-12 13:38 ` David Laight
1 sibling, 1 reply; 15+ messages in thread
From: Jeffrey Altman @ 2026-05-12 7:58 UTC (permalink / raw)
To: David Howells
Cc: netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski, David S. Miller,
Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
Jiayuan Chen, stable
> On May 11, 2026, at 12:07 PM, David Howells <dhowells@redhat.com> wrote:
>
> This improves the fix for CVE-2026-43500.
>
> Fix the pagecache corruption from in-place decryption of a DATA packet
> transmitted locally by splice() by getting rid of the packet sharing in the
> I/O thread and unconditionally extracting the packet content into a bounce
> buffer in which the buffer is decrypted. recvmsg() (or the kernel
> equivalent) then copies the data from the bounce buffer to the destination
> buffer. The sk_buff then remains unmodified.
>
> This has an additional advantage in that the packet is then arranged in the
> buffer with the correct alignment required for the crypto algorithms to
> process directly. The performance of the crypto does seem to be a little
> faster and, surprisingly, the unencrypted performance doesn't seem to
> change much - possibly due to removing complexity from the I/O thread.
>
> Yet another advantage is that the I/O thread doesn't have to copy packets
> which would slow down packet distribution, ACK generation, etc..
>
> The buffer belongs to the call and is allocated initially at 2K,
> sufficiently large to hold a whole jumbo subpacket, but the buffer will be
> increased in size if needed. There is one downside here, and that's if a
> MSG_PEEK of more than one byte occurs, it may move on to the next packet,
> replacing the content of the buffer. In such a case, it has to go back and
> re-decrypt the current packet.
>
> Note that rx_pkt_offset may legitimately see 0 as a valid offset now, so
> switch to using USHRT_MAX to indicate an invalid offset.
>
> Note also that I would generally prefer to replace the buffers of the
> current sk_buff with a new kmalloc'd buffer of the right size, ditching the
> old data and frags as this makes the handling of MSG_PEEK easier and
> removes the double-decryption issue, but this looks like quite a
> complicated thing to achieve. skb_morph() looks half way to what I want,
> but I don't want to have to allocate a new sk_buff.
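The MSG_PEEK interaction described above comes down to a reuse check: the call remembers which packet currently occupies the bounce buffer, and any access to a different sequence number evicts it, forcing re-decryption if the earlier packet is read again. A minimal standalone sketch of that check (type and field names here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t rxrpc_seq_t;

struct toy_call {
	rxrpc_seq_t rx_dec_seq;    /* seq of packet currently in the bounce buffer */
	bool have_buffer;          /* buffer holds a valid decrypted packet */
	unsigned int decrypt_count; /* how many decrypt passes were needed */
};

/* Return true if the packet must be (re-)decrypted into the bounce buffer,
 * false if the buffer already holds this packet and can be reused.
 */
static bool need_decrypt(struct toy_call *call, rxrpc_seq_t seq)
{
	if (call->have_buffer && call->rx_dec_seq == seq)
		return false;	/* already decrypted; reuse */

	call->rx_dec_seq = seq;
	call->have_buffer = true;
	call->decrypt_count++;	/* stand-in for the actual decryption */
	return true;
}
```

A MSG_PEEK that advances to packet 2 and a subsequent recvmsg() of packet 1 therefore costs one extra decryption of packet 1, exactly the downside noted in the description.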
It might be useful to add a back-porting note indicating that the rxgk
changes can be safely dropped when targeting kernel versions that do not
contain rxgk support.
>
> Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
> Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
> Closes: https://lore.kernel.org/r/afKV2zGR6rrelPC7@v4bel/
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: Jeffrey Altman <jaltman@auristor.com>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Simon Horman <horms@kernel.org>
> cc: Jiayuan Chen <jiayuan.chen@linux.dev>
> cc: netdev@vger.kernel.org
> cc: linux-afs@lists.infradead.org
> cc: stable@vger.kernel.org
> ---
> net/rxrpc/ar-internal.h | 7 +++-
> net/rxrpc/call_event.c | 22 +----------
> net/rxrpc/call_object.c | 2 +
> net/rxrpc/insecure.c | 3 --
> net/rxrpc/recvmsg.c | 72 +++++++++++++++++++++++++++-------
> net/rxrpc/rxgk.c | 49 +++++++++++------------
> net/rxrpc/rxgk_common.h | 79 +++++++++++++++++++++++++++++++++++++
> net/rxrpc/rxkad.c | 86 +++++++++++++++--------------------------
> 8 files changed, 200 insertions(+), 120 deletions(-)
>
> diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
> index 27c2aa2dd023..783367eea798 100644
> --- a/net/rxrpc/ar-internal.h
> +++ b/net/rxrpc/ar-internal.h
> @@ -213,8 +213,6 @@ struct rxrpc_skb_priv {
> struct {
> u16 offset; /* Offset of data */
> u16 len; /* Length of data */
> - u8 flags;
> -#define RXRPC_RX_VERIFIED 0x01
> };
> struct {
> rxrpc_seq_t first_ack; /* First packet in acks table */
> @@ -774,6 +772,11 @@ struct rxrpc_call {
> struct sk_buff_head recvmsg_queue; /* Queue of packets ready for recvmsg() */
> struct sk_buff_head rx_queue; /* Queue of packets for this call to receive */
> struct sk_buff_head rx_oos_queue; /* Queue of out of sequence packets */
> + void *rx_dec_buffer; /* Decryption buffer */
> + unsigned short rx_dec_bsize; /* rx_dec_buffer size */
> + unsigned short rx_dec_offset; /* Decrypted packet data offset */
> + unsigned short rx_dec_len; /* Decrypted packet data len */
> + rxrpc_seq_t rx_dec_seq; /* Packet in decryption buffer */
>
> rxrpc_seq_t rx_highest_seq; /* Higest sequence number received */
> rxrpc_seq_t rx_consumed; /* Highest packet consumed */
Instead of allocating the storage within struct rxrpc_call, perhaps it
would be better to add it to struct rxrpc_channel. Doing so would
reduce the allocation/deallocation churn. The majority of calls are
short-lived (perhaps a single packet in each direction), but there will
be many calls in rapid succession.
> diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
> index 2b19b252225e..fec59d9338b9 100644
> --- a/net/rxrpc/call_event.c
> +++ b/net/rxrpc/call_event.c
> @@ -332,27 +332,7 @@ bool rxrpc_input_call_event(struct rxrpc_call *call)
>
> saw_ack |= sp->hdr.type == RXRPC_PACKET_TYPE_ACK;
>
> - if (sp->hdr.type == RXRPC_PACKET_TYPE_DATA &&
> - sp->hdr.securityIndex != 0 &&
> - (skb_cloned(skb) ||
> - skb_has_frag_list(skb) ||
> - skb_has_shared_frag(skb))) {
> - /* Unshare the packet so that it can be
> - * modified by in-place decryption.
> - */
> - struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC);
> -
> - if (nskb) {
> - rxrpc_new_skb(nskb, rxrpc_skb_new_unshared);
> - rxrpc_input_call_packet(call, nskb);
> - rxrpc_free_skb(nskb, rxrpc_skb_put_call_rx);
> - } else {
> - /* OOM - Drop the packet. */
> - rxrpc_see_skb(skb, rxrpc_skb_see_unshare_nomem);
> - }
> - } else {
> - rxrpc_input_call_packet(call, skb);
> - }
> + rxrpc_input_call_packet(call, skb);
> rxrpc_free_skb(skb, rxrpc_skb_put_call_rx);
> did_receive = true;
> }
> diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
> index f035f486c139..fcb9d38bb521 100644
> --- a/net/rxrpc/call_object.c
> +++ b/net/rxrpc/call_object.c
> @@ -152,6 +152,7 @@ struct rxrpc_call *rxrpc_alloc_call(struct rxrpc_sock *rx, gfp_t gfp,
> spin_lock_init(&call->notify_lock);
> refcount_set(&call->ref, 1);
> call->debug_id = debug_id;
> + call->rx_pkt_offset = USHRT_MAX;
> call->tx_total_len = -1;
> call->tx_jumbo_max = 1;
> call->next_rx_timo = 20 * HZ;
> @@ -553,6 +554,7 @@ static void rxrpc_cleanup_rx_buffers(struct rxrpc_call *call)
> rxrpc_purge_queue(&call->recvmsg_queue);
> rxrpc_purge_queue(&call->rx_queue);
> rxrpc_purge_queue(&call->rx_oos_queue);
> + kfree(call->rx_dec_buffer);
> }
>
> /*
> diff --git a/net/rxrpc/insecure.c b/net/rxrpc/insecure.c
> index 0a260df45d25..7a26c6097d03 100644
> --- a/net/rxrpc/insecure.c
> +++ b/net/rxrpc/insecure.c
> @@ -32,9 +32,6 @@ static int none_secure_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
>
> static int none_verify_packet(struct rxrpc_call *call, struct sk_buff *skb)
> {
> - struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> -
> - sp->flags |= RXRPC_RX_VERIFIED;
> return 0;
> }
>
> diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
> index e1f7513a46db..865e368381d5 100644
> --- a/net/rxrpc/recvmsg.c
> +++ b/net/rxrpc/recvmsg.c
> @@ -147,15 +147,55 @@ static void rxrpc_rotate_rx_window(struct rxrpc_call *call)
> }
>
> /*
> - * Decrypt and verify a DATA packet.
> + * Decrypt and verify a DATA packet. The content of the packet is pulled out
> + * into a flat buffer rather than decrypting in place in the skbuff. This also
> + * has the advantage of aligning the buffer correctly for the crypto routines.
> + *
> + * We keep track of the sequence number of the packet currently decrypted into
> + * the buffer in ->rx_dec_seq. Unfortunately, this means that a MSG_PEEK of
> + * more than one byte may cause a later packet to be decrypted into the buffer,
> + * requiring the original to be re-decrypted when recvmsg() is called again.
> */
> static int rxrpc_verify_data(struct rxrpc_call *call, struct sk_buff *skb)
> {
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> + int ret;
>
> - if (sp->flags & RXRPC_RX_VERIFIED)
> + if (call->rx_dec_seq == sp->hdr.seq && call->rx_dec_buffer)
> return 0;
> - return call->security->verify_packet(call, skb);
> +
> + if (call->rx_dec_bsize < sp->len) {
> + /* Make sure we can hold a 1412-byte jumbo subpacket and make
> + * sure that the buffer size is aligned to a crypto blocksize.
> + */
> + size_t size = umin(round_up(sp->len, 32), 2048);
I think you meant to use max() here so that a minimum of 2048 bytes
is allocated.
I think applying a cap on the allocation size would also be
beneficial. IBM/Transarc-derived Rx implementations have a hard
upper bound of 21180 (15 x 1412) bytes plus one 28-byte Rx header.
Applying a cap of 32KiB seems prudent.
It is also worth noting that there are no current implementations
of Rx RPC that will send individual Rx DATA packets larger than
1444 bytes including the Rx header. Rx RESPONSE packets can be sent
as large as 16384 bytes (including the Rx header). However, it is
extremely unlikely that this buffer, once allocated, would ever
need to be grown.
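For what it's worth, the sizing policy I'm suggesting looks something
like this (a userspace model, not the kernel code: the helper name and
the `RXRPC_DEC_BUF_*` bounds are made up here, standing in for the
2048-byte floor and 32KiB cap argued for above):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical floor and cap for the per-call decrypt buffer:
 * 2048 covers a 1412-byte jumbo subpacket rounded to a crypto block;
 * 32768 comfortably exceeds the 21208-byte Transarc hard limit.
 */
#define RXRPC_DEC_BUF_MIN 2048u
#define RXRPC_DEC_BUF_MAX 32768u

/* Round the packet length up to a 32-byte crypto block, then clamp
 * the result into [RXRPC_DEC_BUF_MIN, RXRPC_DEC_BUF_MAX].
 */
static size_t rxrpc_dec_buf_size(size_t len)
{
	size_t size = (len + 31) & ~(size_t)31;

	if (size < RXRPC_DEC_BUF_MIN)
		size = RXRPC_DEC_BUF_MIN;
	if (size > RXRPC_DEC_BUF_MAX)
		size = RXRPC_DEC_BUF_MAX;
	return size;
}
```

With this, the buffer never shrinks below one jumbo subpacket and a
malicious length field cannot force an oversized allocation.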
> + void *buffer = krealloc(call->rx_dec_buffer, size, GFP_NOFS);
> +
> + if (!buffer)
> + return -ENOMEM;
> + call->rx_dec_buffer = buffer;
> + call->rx_dec_bsize = size;
> + }
> +
> + ret = -EFAULT;
> + if (skb_copy_bits(skb, sp->offset, call->rx_dec_buffer, sp->len) < 0)
> + goto err;
> +
> + call->rx_dec_offset = 0;
> + call->rx_dec_len = sp->len;
> + call->rx_dec_seq = sp->hdr.seq;
> + ret = call->security->verify_packet(call, skb);
> + if (ret < 0)
> + goto err;
> + return 0;
> +
> +err:
> + kfree(call->rx_dec_buffer);
It might be better to avoid deallocating the buffer on the error
path and permit it to be freed during normal call (or call channel)
deallocation.
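A sketch of that alternative error path, again as a userspace model
(the struct and helper are illustrative; the field names just mirror
the patch):

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model of the relevant rxrpc_call fields. */
struct call_model {
	void *rx_dec_buffer;
	size_t rx_dec_bsize;
	size_t rx_dec_len;
	unsigned int rx_dec_seq;
};

/* On a decrypt/copy error, invalidate the cached contents but keep
 * the allocation; the buffer is then freed with the call rather than
 * on every failure.
 */
static void rx_dec_invalidate(struct call_model *call)
{
	call->rx_dec_len = 0;
	call->rx_dec_seq = 0;
	/* ->rx_dec_buffer and ->rx_dec_bsize deliberately untouched. */
}

/* Demonstrate that invalidation preserves the allocation. */
static int demo_invalidate(void)
{
	struct call_model call = { malloc(64), 64, 1412, 3 };
	int ok;

	rx_dec_invalidate(&call);
	ok = call.rx_dec_buffer != NULL && call.rx_dec_bsize == 64 &&
	     call.rx_dec_len == 0 && call.rx_dec_seq == 0;
	free(call.rx_dec_buffer);
	return ok;
}
```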
> + call->rx_dec_buffer = NULL;
> + call->rx_dec_bsize = 0;
> + call->rx_dec_offset = 0;
> + call->rx_dec_len = 0;
> + return ret;
> }
>
> /*
> @@ -283,16 +323,19 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call,
> if (msg)
> sock_recv_timestamp(msg, sock->sk, skb);
>
> - if (rx_pkt_offset == 0) {
> + if (rx_pkt_offset == USHRT_MAX) {
> ret2 = rxrpc_verify_data(call, skb);
> trace_rxrpc_recvdata(call, rxrpc_recvmsg_next, seq,
> - sp->offset, sp->len, ret2);
> + call->rx_dec_offset,
> + call->rx_dec_len, ret2);
> if (ret2 < 0) {
> ret = ret2;
> goto out;
> }
> - rx_pkt_offset = sp->offset;
> - rx_pkt_len = sp->len;
> + sp = rxrpc_skb(skb);
> + seq = sp->hdr.seq;
> + rx_pkt_offset = call->rx_dec_offset;
> + rx_pkt_len = call->rx_dec_len;
> } else {
> trace_rxrpc_recvdata(call, rxrpc_recvmsg_cont, seq,
> rx_pkt_offset, rx_pkt_len, 0);
> @@ -304,10 +347,10 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call,
> if (copy > remain)
> copy = remain;
> if (copy > 0) {
> - ret2 = skb_copy_datagram_iter(skb, rx_pkt_offset, iter,
> - copy);
> - if (ret2 < 0) {
> - ret = ret2;
> + ret2 = copy_to_iter(call->rx_dec_buffer + rx_pkt_offset,
> + copy, iter);
> + if (ret2 != copy) {
> + ret = -EFAULT;
> goto out;
> }
>
> @@ -328,13 +371,14 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call,
> /* The whole packet has been transferred. */
> if (sp->hdr.flags & RXRPC_LAST_PACKET)
> ret = 1;
> - rx_pkt_offset = 0;
> + rx_pkt_offset = USHRT_MAX;
> rx_pkt_len = 0;
> + if (unlikely(flags & MSG_PEEK))
> + break;
>
> skb = skb_peek_next(skb, &call->recvmsg_queue);
>
> - if (!(flags & MSG_PEEK))
> - rxrpc_rotate_rx_window(call);
> + rxrpc_rotate_rx_window(call);
>
> if (!rx->app_ops &&
> !skb_queue_empty_lockless(&rx->recvmsg_oobq)) {
> diff --git a/net/rxrpc/rxgk.c b/net/rxrpc/rxgk.c
> index 0d5e654da918..88e651dd0e90 100644
> --- a/net/rxrpc/rxgk.c
> +++ b/net/rxrpc/rxgk.c
> @@ -473,8 +473,9 @@ static int rxgk_verify_packet_integrity(struct rxrpc_call *call,
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> struct rxgk_header *hdr;
> struct krb5_buffer metadata;
> - unsigned int offset = sp->offset, len = sp->len;
> + unsigned int offset = 0, len = call->rx_dec_len;
> size_t data_offset = 0, data_len = len;
> + void *data = call->rx_dec_buffer;
> u32 ac = 0;
> int ret = -ENOMEM;
>
> @@ -496,16 +497,16 @@ static int rxgk_verify_packet_integrity(struct rxrpc_call *call,
>
> metadata.len = sizeof(*hdr);
> metadata.data = hdr;
> - ret = rxgk_verify_mic_skb(gk->krb5, gk->rx_Kc, &metadata,
> - skb, &offset, &len, &ac);
> + ret = rxgk_verify_mic(gk->krb5, gk->rx_Kc, &metadata,
> + data, &offset, &len, &ac);
> kfree(hdr);
> if (ret < 0) {
> if (ret != -ENOMEM)
> rxrpc_abort_eproto(call, skb, ac,
> rxgk_abort_1_verify_mic_eproto);
> } else {
> - sp->offset = offset;
> - sp->len = len;
> + call->rx_dec_offset = offset;
> + call->rx_dec_len = len;
> }
>
> put_gk:
> @@ -522,49 +523,45 @@ static int rxgk_verify_packet_encrypted(struct rxrpc_call *call,
> struct sk_buff *skb)
> {
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> - struct rxgk_header hdr;
> - unsigned int offset = sp->offset, len = sp->len;
> + struct rxgk_header *hdr;
> + unsigned int offset = 0, len = call->rx_dec_len;
> + void *data = call->rx_dec_buffer;
> int ret;
> u32 ac = 0;
>
> _enter("");
>
> - ret = rxgk_decrypt_skb(gk->krb5, gk->rx_enc, skb, &offset, &len, &ac);
> + ret = rxgk_decrypt(gk->krb5, gk->rx_enc, data, &offset, &len, &ac);
> if (ret < 0) {
> if (ret != -ENOMEM)
> rxrpc_abort_eproto(call, skb, ac, rxgk_abort_2_decrypt_eproto);
> goto error;
> }
>
> - if (len < sizeof(hdr)) {
> + if (len < sizeof(*hdr)) {
> ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT,
> rxgk_abort_2_short_header);
> goto error;
> }
>
> /* Extract the header from the skb */
> - ret = skb_copy_bits(skb, offset, &hdr, sizeof(hdr));
> - if (ret < 0) {
> - ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT,
> - rxgk_abort_2_short_encdata);
> - goto error;
> - }
> - offset += sizeof(hdr);
> - len -= sizeof(hdr);
> -
> - if (ntohl(hdr.epoch) != call->conn->proto.epoch ||
> - ntohl(hdr.cid) != call->cid ||
> - ntohl(hdr.call_number) != call->call_id ||
> - ntohl(hdr.seq) != sp->hdr.seq ||
> - ntohl(hdr.sec_index) != call->security_ix ||
> - ntohl(hdr.data_len) > len) {
> + hdr = data + offset;
> + offset += sizeof(*hdr);
> + len -= sizeof(*hdr);
> +
> + if (ntohl(hdr->epoch) != call->conn->proto.epoch ||
> + ntohl(hdr->cid) != call->cid ||
> + ntohl(hdr->call_number) != call->call_id ||
> + ntohl(hdr->seq) != sp->hdr.seq ||
> + ntohl(hdr->sec_index) != call->security_ix ||
> + ntohl(hdr->data_len) > len) {
> ret = rxrpc_abort_eproto(call, skb, RXGK_SEALEDINCON,
> rxgk_abort_2_short_data);
> goto error;
> }
>
> - sp->offset = offset;
> - sp->len = ntohl(hdr.data_len);
> + call->rx_dec_offset = offset;
> + call->rx_dec_len = ntohl(hdr->data_len);
> ret = 0;
> error:
> rxgk_put(gk);
> diff --git a/net/rxrpc/rxgk_common.h b/net/rxrpc/rxgk_common.h
> index 1e257d7ab8ec..dc8b0f106104 100644
> --- a/net/rxrpc/rxgk_common.h
> +++ b/net/rxrpc/rxgk_common.h
> @@ -105,6 +105,45 @@ int rxgk_decrypt_skb(const struct krb5_enctype *krb5,
> return ret;
> }
>
> +/*
> + * Apply decryption and checksumming functions to a flat data buffer. The offset
> + * and length are updated to reflect the actual content of the encrypted
> + * region.
> + */
> +static inline int rxgk_decrypt(const struct krb5_enctype *krb5,
> + struct crypto_aead *aead,
> + void *data,
> + unsigned int *_offset, unsigned int *_len,
> + int *_error_code)
> +{
> + struct scatterlist sg[1];
> + size_t offset = 0, len = *_len;
> + int ret;
> +
> + sg_init_one(sg, data, len);
> +
> + ret = crypto_krb5_decrypt(krb5, aead, sg, 1, &offset, &len);
> + switch (ret) {
> + case 0:
> + *_offset += offset;
> + *_len = len;
> + break;
> + case -EBADMSG: /* Checksum mismatch. */
> + case -EPROTO:
> + *_error_code = RXGK_SEALEDINCON;
> + break;
> + case -EMSGSIZE:
> + *_error_code = RXGK_PACKETSHORT;
> + break;
> + case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */
> + default:
> + *_error_code = RXGK_INCONSISTENCY;
> + break;
> + }
> +
> + return ret;
> +}
> +
> /*
> * Check the MIC on a region of an skbuff. The offset and length are updated
> * to reflect the actual content of the secure region.
> @@ -148,3 +187,43 @@ int rxgk_verify_mic_skb(const struct krb5_enctype *krb5,
>
> return ret;
> }
> +
> +/*
> + * Check the MIC on a flat buffer. The offset and length are updated to
> + * reflect the actual content of the secure region.
> + */
> +static inline
> +int rxgk_verify_mic(const struct krb5_enctype *krb5,
> + struct crypto_shash *shash,
> + const struct krb5_buffer *metadata,
> + void *data,
> + unsigned int *_offset, unsigned int *_len,
> + u32 *_error_code)
> +{
> + struct scatterlist sg[1];
> + size_t offset = 0, len = *_len;
> + int ret;
> +
> + sg_init_one(sg, data, len);
> +
> + ret = crypto_krb5_verify_mic(krb5, shash, metadata, sg, 1, &offset, &len);
> + switch (ret) {
> + case 0:
> + *_offset += offset;
> + *_len = len;
> + break;
> + case -EBADMSG: /* Checksum mismatch */
> + case -EPROTO:
> + *_error_code = RXGK_SEALEDINCON;
> + break;
> + case -EMSGSIZE:
> + *_error_code = RXGK_PACKETSHORT;
> + break;
> + case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */
> + default:
> + *_error_code = RXGK_INCONSISTENCY;
> + break;
> + }
> +
> + return ret;
> +}
> diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
> index cba7935977f0..075936337836 100644
> --- a/net/rxrpc/rxkad.c
> +++ b/net/rxrpc/rxkad.c
> @@ -430,27 +430,25 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb,
> rxrpc_seq_t seq,
> struct skcipher_request *req)
> {
> - struct rxkad_level1_hdr sechdr;
> + struct rxkad_level1_hdr *sechdr;
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> struct rxrpc_crypt iv;
> - struct scatterlist sg[16];
> - u32 data_size, buf;
> + struct scatterlist sg[1];
> + void *data = call->rx_dec_buffer;
> + u32 len = sp->len, data_size, buf;
> u16 check;
> int ret;
>
> _enter("");
>
> - if (sp->len < 8)
> + if (len < 8)
> return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
> rxkad_abort_1_short_header);
>
> /* Decrypt the skbuff in-place. TODO: We really want to decrypt
> * directly into the target buffer.
> */
> - sg_init_table(sg, ARRAY_SIZE(sg));
> - ret = skb_to_sgvec(skb, sg, sp->offset, 8);
> - if (unlikely(ret < 0))
> - return ret;
> + sg_init_one(sg, data, len);
>
> /* start the decryption afresh */
> memset(&iv, 0, sizeof(iv));
> @@ -464,13 +462,11 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb,
> return ret;
>
> /* Extract the decrypted packet length */
> - if (skb_copy_bits(skb, sp->offset, &sechdr, sizeof(sechdr)) < 0)
> - return rxrpc_abort_eproto(call, skb, RXKADDATALEN,
> - rxkad_abort_1_short_encdata);
> - sp->offset += sizeof(sechdr);
> - sp->len -= sizeof(sechdr);
> + sechdr = data;
> + call->rx_dec_offset = sizeof(*sechdr);
> + len -= sizeof(*sechdr);
>
> - buf = ntohl(sechdr.data_size);
> + buf = ntohl(sechdr->data_size);
> data_size = buf & 0xffff;
>
> check = buf >> 16;
> @@ -479,10 +475,10 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb,
> if (check != 0)
> return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
> rxkad_abort_1_short_check);
> - if (data_size > sp->len)
> + if (data_size > len)
> return rxrpc_abort_eproto(call, skb, RXKADDATALEN,
> rxkad_abort_1_short_data);
> - sp->len = data_size;
> + call->rx_dec_len = data_size;
>
> _leave(" = 0 [dlen=%x]", data_size);
> return 0;
> @@ -496,43 +492,28 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
> struct skcipher_request *req)
> {
> const struct rxrpc_key_token *token;
> - struct rxkad_level2_hdr sechdr;
> + struct rxkad_level2_hdr *sechdr;
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> struct rxrpc_crypt iv;
> - struct scatterlist _sg[4], *sg;
> - u32 data_size, buf;
> + struct scatterlist sg[1];
> + void *data = call->rx_dec_buffer;
> + u32 len = sp->len, data_size, buf;
> u16 check;
> - int nsg, ret;
> + int ret;
>
> - _enter(",{%d}", sp->len);
> + _enter(",{%d}", len);
>
> - if (sp->len < 8)
> + if (len < 8)
> return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
> rxkad_abort_2_short_header);
>
> /* Don't let the crypto algo see a misaligned length. */
> - sp->len = round_down(sp->len, 8);
> + len = round_down(len, 8);
>
> - /* Decrypt the skbuff in-place. TODO: We really want to decrypt
> - * directly into the target buffer.
> + /* Decrypt in place in the call's decryption buffer. TODO: We really
> + * want to decrypt directly into the target buffer.
> */
> - sg = _sg;
> - nsg = skb_shinfo(skb)->nr_frags + 1;
> - if (nsg <= 4) {
> - nsg = 4;
> - } else {
> - sg = kmalloc_objs(*sg, nsg, GFP_NOIO);
> - if (!sg)
> - return -ENOMEM;
> - }
> -
> - sg_init_table(sg, nsg);
> - ret = skb_to_sgvec(skb, sg, sp->offset, sp->len);
> - if (unlikely(ret < 0)) {
> - if (sg != _sg)
> - kfree(sg);
> - return ret;
> - }
> + sg_init_one(sg, data, len);
>
> /* decrypt from the session key */
> token = call->conn->key->payload.data[0];
> @@ -540,11 +521,9 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
>
> skcipher_request_set_sync_tfm(req, call->conn->rxkad.cipher);
> skcipher_request_set_callback(req, 0, NULL, NULL);
> - skcipher_request_set_crypt(req, sg, sg, sp->len, iv.x);
> + skcipher_request_set_crypt(req, sg, sg, len, iv.x);
> ret = crypto_skcipher_decrypt(req);
> skcipher_request_zero(req);
> - if (sg != _sg)
> - kfree(sg);
> if (ret < 0) {
> if (ret == -ENOMEM)
> return ret;
> @@ -553,13 +532,11 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
> }
>
> /* Extract the decrypted packet length */
> - if (skb_copy_bits(skb, sp->offset, &sechdr, sizeof(sechdr)) < 0)
> - return rxrpc_abort_eproto(call, skb, RXKADDATALEN,
> - rxkad_abort_2_short_len);
> - sp->offset += sizeof(sechdr);
> - sp->len -= sizeof(sechdr);
> + sechdr = data;
> + call->rx_dec_offset = sizeof(*sechdr);
> + len -= sizeof(*sechdr);
>
> - buf = ntohl(sechdr.data_size);
> + buf = ntohl(sechdr->data_size);
> data_size = buf & 0xffff;
>
> check = buf >> 16;
> @@ -569,17 +546,18 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb,
> return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON,
> rxkad_abort_2_short_check);
>
> - if (data_size > sp->len)
> + if (data_size > len)
> return rxrpc_abort_eproto(call, skb, RXKADDATALEN,
> rxkad_abort_2_short_data);
>
> - sp->len = data_size;
> + call->rx_dec_len = data_size;
> _leave(" = 0 [dlen=%x]", data_size);
> return 0;
> }
>
> /*
> - * Verify the security on a received packet and the subpackets therein.
> + * Verify the security on a received (sub)packet. If the packet needs
> + * modifying (e.g. decrypting), it must be copied.
> */
> static int rxkad_verify_packet(struct rxrpc_call *call, struct sk_buff *skb)
> {
>
With the exception of the provided feedback, this change looks good.
Thank you.
Jeffrey Altman
* Re: [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer
2026-05-11 16:07 ` [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer David Howells
@ 2026-05-12 8:22 ` Jeffrey Altman
2026-05-13 0:06 ` Jakub Kicinski
1 sibling, 0 replies; 15+ messages in thread
From: Jeffrey Altman @ 2026-05-12 8:22 UTC (permalink / raw)
To: David Howells
Cc: netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski, David S. Miller,
Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
Jiayuan Chen, stable
> On May 11, 2026, at 12:07 PM, David Howells <dhowells@redhat.com> wrote:
>
> This improves the fix for CVE-2026-43500.
>
> Fix the verification of RESPONSE packets to avoid the problem of
> overwriting a RESPONSE packet sent via splice to a local address by
> extracting the contents of the UDP packet into a kmalloc'd linear buffer
> rather than decrypting the data in place in the sk_buff (which may corrupt
> the original buffer).
>
> Further, since the way the RESPONSE data is handled is being overhauled,
> add an XDR decode abstraction that hides the details of buffer advancement
> and length checking. At some point it may be worth seeing if NFS's XDR
> stuff can be used, but that involves linking against the sunrpc module
> which is undesirable.
>
Might be worth adding a back-porting note that the rxgk changes can be
safely dropped if the kernel does not support rxgk.
> Fixes: 24481a7f5733 ("rxrpc: Fix conn-level packet handling to unshare RESPONSE packets")
> Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
> Closes: https://lore.kernel.org/r/afKV2zGR6rrelPC7@v4bel/
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: Jeffrey Altman <jaltman@auristor.com>
> cc: Eric Dumazet <edumazet@google.com>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Simon Horman <horms@kernel.org>
> cc: Jiayuan Chen <jiayuan.chen@linux.dev>
> cc: linux-afs@lists.infradead.org
> cc: netdev@vger.kernel.org
> cc: stable@kernel.org
> ---
> net/rxrpc/ar-internal.h | 70 +++++++++++++++++--
> net/rxrpc/conn_event.c | 35 +++++-----
> net/rxrpc/insecure.c | 5 +-
> net/rxrpc/protocol.h | 1 -
> net/rxrpc/rxgk.c | 146 +++++++++++++---------------------------
> net/rxrpc/rxgk_app.c | 91 +++++++++++--------------
> net/rxrpc/rxgk_common.h | 115 ++++---------------------------
> net/rxrpc/rxkad.c | 89 ++++++++++--------------
> 8 files changed, 219 insertions(+), 333 deletions(-)
>
> diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
> index 783367eea798..610fa208157b 100644
> --- a/net/rxrpc/ar-internal.h
> +++ b/net/rxrpc/ar-internal.h
> @@ -33,6 +33,13 @@ struct rxrpc_txbuf;
> struct rxrpc_txqueue;
> struct rxgk_context;
>
> +typedef __be32 xdr_t;
> +
> +struct xdr_buffer {
> + xdr_t *p; /* Current decode point */
> + unsigned int len; /* Amount left in buffer */
> +};
> +
> /*
> * Mark applied to socket buffers in skb->mark. skb->priority is used
> * to pass supplementary information.
> @@ -307,16 +314,16 @@ struct rxrpc_security {
> struct sk_buff *challenge);
>
> /* verify a response */
> - int (*verify_response)(struct rxrpc_connection *,
> - struct sk_buff *);
> + int (*verify_response)(struct rxrpc_connection *conn,
> + struct sk_buff *response_skb,
> + struct xdr_buffer *response);
>
> /* clear connection security */
> void (*clear)(struct rxrpc_connection *);
>
> /* Default ticket -> key decoder */
> int (*default_decode_ticket)(struct rxrpc_connection *conn, struct sk_buff *skb,
> - unsigned int ticket_offset, unsigned int ticket_len,
> - struct key **_key);
> + struct xdr_buffer *ticket, struct key **_key);
> };
>
> /*
> @@ -1591,6 +1598,61 @@ static inline unsigned int rxrpc_tx_in_flight(const struct rxrpc_call *call)
> return call->tx_nr_sent - rxrpc_left_out(call) + call->tx_nr_resent;
> }
>
> +#define xdr_round_up(x) (round_up((x), sizeof(__be32)))
> +#define xdr_round_down(x) (round_down((x), sizeof(__be32)))
> +#define xdr_object_len(x) (4 + xdr_round_up(x))
> +
> +/*
> + * Check the amount of data remaining in an XDR buffer.
> + */
> +static inline bool xdr_check(struct xdr_buffer *buf, unsigned int bytes)
> +{
> + if (bytes > buf->len ||
> + xdr_round_up(bytes) > buf->len)
> + return false;
> + return true;
> +}
> +
> +/*
> + * Grab a region in an XDR buffer and advance the buffer position, returning a
> + * pointer to the start of the region or NULL if there's insufficient data.
> + */
> +static inline void *xdr_extract_region(struct xdr_buffer *buf, unsigned int bytes)
> +{
> + xdr_t *p = buf->p;
> +
> + if (!xdr_check(buf, bytes))
> + return NULL;
> +
> + bytes = xdr_round_up(bytes);
> + buf->p += bytes / sizeof(*buf->p);
> + buf->len -= bytes;
> + return p;
> +}
> +
> +/*
> + * Decode an XDR word.
> + */
> +static inline u32 xdr_dec(xdr_t x)
> +{
> + return ntohl(x);
> +}
> +
> +/*
> + * Grab a blob with size from an XDR buffer and advance the buffer position.
> + */
> +static inline bool xdr_extract_blob(struct xdr_buffer *buf, struct xdr_buffer *blob)
> +{
> + xdr_t *xsize;
> +
> + xsize = xdr_extract_region(buf, sizeof(*xsize));
> + if (!xsize)
> + return false;
> + blob->len = xdr_dec(*xsize);
> + blob->p = xdr_extract_region(buf, blob->len);
> + return blob->p != NULL;
> +}
> +
> /*
> * debug tracing
> */
> diff --git a/net/rxrpc/conn_event.c b/net/rxrpc/conn_event.c
> index 442414d90ba1..26cab3d2075e 100644
> --- a/net/rxrpc/conn_event.c
> +++ b/net/rxrpc/conn_event.c
> @@ -243,28 +243,25 @@ static void rxrpc_call_is_secure(struct rxrpc_call *call)
> static int rxrpc_verify_response(struct rxrpc_connection *conn,
> struct sk_buff *skb)
> {
> + struct xdr_buffer response = {
> + .len = skb->len - sizeof(struct rxrpc_wire_header),
> + };
> + void *buffer;
> int ret;
>
> - if (skb_cloned(skb) || skb_has_frag_list(skb) ||
> - skb_has_shared_frag(skb)) {
> - /* Copy the packet if shared so that we can do in-place
> - * decryption.
> - */
> - struct sk_buff *nskb = skb_copy(skb, GFP_NOFS);
> -
> - if (nskb) {
> - rxrpc_new_skb(nskb, rxrpc_skb_new_unshared);
> - ret = conn->security->verify_response(conn, nskb);
> - rxrpc_free_skb(nskb, rxrpc_skb_put_response_copy);
> - } else {
> - /* OOM - Drop the packet. */
> - rxrpc_see_skb(skb, rxrpc_skb_see_unshare_nomem);
> - ret = -ENOMEM;
> - }
> - } else {
> - ret = conn->security->verify_response(conn, skb);
> - }
> + buffer = kmalloc(response.len, GFP_NOFS);
> + if (!buffer)
> + return -ENOMEM;
> +
> + ret = skb_copy_bits(skb, sizeof(struct rxrpc_wire_header), buffer, response.len);
> + if (ret < 0)
> + goto out;
> +
> + response.p = buffer;
> + ret = conn->security->verify_response(conn, skb, &response);
>
> +out:
> + kfree(buffer);
> return ret;
> }
>
> diff --git a/net/rxrpc/insecure.c b/net/rxrpc/insecure.c
> index 7a26c6097d03..dd1827f683bc 100644
> --- a/net/rxrpc/insecure.c
> +++ b/net/rxrpc/insecure.c
> @@ -54,9 +54,10 @@ static int none_sendmsg_respond_to_challenge(struct sk_buff *challenge,
> }
>
> static int none_verify_response(struct rxrpc_connection *conn,
> - struct sk_buff *skb)
> + struct sk_buff *response_skb,
> + struct xdr_buffer *response)
> {
> - return rxrpc_abort_conn(conn, skb, RX_PROTOCOL_ERROR, -EPROTO,
> + return rxrpc_abort_conn(conn, response_skb, RX_PROTOCOL_ERROR, -EPROTO,
> rxrpc_eproto_rxnull_response);
> }
>
> diff --git a/net/rxrpc/protocol.h b/net/rxrpc/protocol.h
> index f8bfec12bc7e..6e02a84fd370 100644
> --- a/net/rxrpc/protocol.h
> +++ b/net/rxrpc/protocol.h
> @@ -198,7 +198,6 @@ struct rxgk_header {
> */
> struct rxgk_response {
> __be64 start_time;
> - __be32 token_len;
> } __packed;
>
> #endif /* _LINUX_RXRPC_PACKET_H */
> diff --git a/net/rxrpc/rxgk.c b/net/rxrpc/rxgk.c
> index 88e651dd0e90..fd73b1ff3b97 100644
> --- a/net/rxrpc/rxgk.c
> +++ b/net/rxrpc/rxgk.c
> @@ -524,43 +524,42 @@ static int rxgk_verify_packet_encrypted(struct rxrpc_call *call,
> {
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> struct rxgk_header *hdr;
> - unsigned int offset = 0, len = call->rx_dec_len;
> - void *data = call->rx_dec_buffer;
> + struct xdr_buffer xdr = {
> + .p = call->rx_dec_buffer,
> + .len = call->rx_dec_len,
> + };
> int ret;
> u32 ac = 0;
>
> _enter("");
>
> - ret = rxgk_decrypt(gk->krb5, gk->rx_enc, data, &offset, &len, &ac);
> + ret = rxgk_decrypt(gk->krb5, gk->rx_enc, &xdr, &ac);
> if (ret < 0) {
> if (ret != -ENOMEM)
> rxrpc_abort_eproto(call, skb, ac, rxgk_abort_2_decrypt_eproto);
> goto error;
> }
>
> - if (len < sizeof(*hdr)) {
> + /* Extract the header from the skb */
> + hdr = xdr_extract_region(&xdr, sizeof(*hdr));
> + if (!hdr) {
> ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT,
> rxgk_abort_2_short_header);
> goto error;
> }
>
> - /* Extract the header from the skb */
> - hdr = data + offset;
> - offset += sizeof(*hdr);
> - len -= sizeof(*hdr);
> -
> if (ntohl(hdr->epoch) != call->conn->proto.epoch ||
> ntohl(hdr->cid) != call->cid ||
> ntohl(hdr->call_number) != call->call_id ||
> ntohl(hdr->seq) != sp->hdr.seq ||
> ntohl(hdr->sec_index) != call->security_ix ||
> - ntohl(hdr->data_len) > len) {
> + ntohl(hdr->data_len) > xdr.len) {
> ret = rxrpc_abort_eproto(call, skb, RXGK_SEALEDINCON,
> rxgk_abort_2_short_data);
> goto error;
> }
>
> - call->rx_dec_offset = offset;
> + call->rx_dec_offset = (void *)xdr.p - call->rx_dec_buffer;
> call->rx_dec_len = ntohl(hdr->data_len);
> ret = 0;
> error:
> @@ -1073,37 +1072,37 @@ static int rxgk_sendmsg_respond_to_challenge(struct sk_buff *challenge,
> * unsigned int call_numbers<>;
> * };
> */
> -static int rxgk_do_verify_authenticator(struct rxrpc_connection *conn,
> - const struct krb5_enctype *krb5,
> - struct sk_buff *skb,
> - __be32 *p, __be32 *end)
> +static int rxgk_verify_authenticator(struct rxrpc_connection *conn,
> + const struct krb5_enctype *krb5,
> + struct sk_buff *skb,
> + struct xdr_buffer *auth)
> {
> - u32 app_len, call_count, level, epoch, cid, i;
> + struct xdr_buffer appdata;
> + xdr_t *nonce, *x;
> + u32 call_count, level, epoch, cid, i;
>
> _enter("");
>
> - if ((end - p) * sizeof(__be32) < 24)
> + nonce = xdr_extract_region(auth, 20);
> + if (!nonce)
> return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
> rxgk_abort_resp_short_auth);
> - if (memcmp(p, conn->rxgk.nonce, 20) != 0)
> + if (memcmp(nonce, conn->rxgk.nonce, 20) != 0)
> return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
> rxgk_abort_resp_bad_nonce);
> - p += 20 / sizeof(__be32);
>
> - app_len = ntohl(*p++);
> - if (app_len > (end - p) * sizeof(__be32))
> + if (!xdr_extract_blob(auth, &appdata))
> return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
> rxgk_abort_resp_short_applen);
>
> - p += xdr_round_up(app_len) / sizeof(__be32);
> - if (end - p < 4)
> + x = xdr_extract_region(auth, 4 * sizeof(xdr_t));
> + if (!x)
> return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
> rxgk_abort_resp_short_auth);
> -
> - level = ntohl(*p++);
> - epoch = ntohl(*p++);
> - cid = ntohl(*p++);
> - call_count = ntohl(*p++);
> + level = xdr_dec(x[0]);
> + epoch = xdr_dec(x[1]);
> + cid = xdr_dec(x[2]);
> + call_count = xdr_dec(x[3]);
>
> if (level != conn->security_level ||
> epoch != conn->proto.epoch ||
> @@ -1112,12 +1111,13 @@ static int rxgk_do_verify_authenticator(struct rxrpc_connection *conn,
> return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
> rxgk_abort_resp_bad_param);
>
> - if (end - p < call_count)
> + x = xdr_extract_region(auth, call_count * sizeof(xdr_t));
> + if (!x)
> return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
> rxgk_abort_resp_short_call_list);
>
> for (i = 0; i < call_count; i++) {
> - u32 call_id = ntohl(*p++);
> + u32 call_id = xdr_dec(x[i]);
>
> if (call_id > INT_MAX)
> return rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
> @@ -1140,37 +1140,6 @@ static int rxgk_do_verify_authenticator(struct rxrpc_connection *conn,
> return 0;
> }
>
> -/*
> - * Extract the authenticator and verify it.
> - */
> -static int rxgk_verify_authenticator(struct rxrpc_connection *conn,
> - const struct krb5_enctype *krb5,
> - struct sk_buff *skb,
> - unsigned int auth_offset, unsigned int auth_len)
> -{
> - void *auth;
> - __be32 *p;
> - int ret;
> -
> - auth = kmalloc(auth_len, GFP_NOFS);
> - if (!auth)
> - return -ENOMEM;
> -
> - ret = skb_copy_bits(skb, auth_offset, auth, auth_len);
> - if (ret < 0) {
> - ret = rxrpc_abort_conn(conn, skb, RXGK_NOTAUTH, -EPROTO,
> - rxgk_abort_resp_short_auth);
> - goto error;
> - }
> -
> - p = auth;
> - ret = rxgk_do_verify_authenticator(conn, krb5, skb, p,
> - p + auth_len / sizeof(*p));
> -error:
> - kfree(auth);
> - return ret;
> -}
> -
> /*
> * Verify a response.
> *
> @@ -1181,55 +1150,34 @@ static int rxgk_verify_authenticator(struct rxrpc_connection *conn,
> * };
> */
> static int rxgk_verify_response(struct rxrpc_connection *conn,
> - struct sk_buff *skb)
> + struct sk_buff *skb,
> + struct xdr_buffer *response)
> {
> const struct krb5_enctype *krb5;
> struct rxrpc_key_token *token;
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> - struct rxgk_response rhdr;
> + struct rxgk_response *rhdr;
> struct rxgk_context *gk;
> + struct xdr_buffer resp_token, auth;
> struct key *key = NULL;
> - unsigned int offset = sizeof(struct rxrpc_wire_header);
> - unsigned int len = skb->len - sizeof(struct rxrpc_wire_header);
> - unsigned int token_offset, token_len;
> - unsigned int auth_offset, auth_len;
> - __be32 xauth_len;
> int ret, ec;
>
> _enter("{%d}", conn->debug_id);
>
> /* Parse the RXGK_Response object */
> - if (sizeof(rhdr) + sizeof(__be32) > len)
> + rhdr = xdr_extract_region(response, sizeof(*rhdr));
> + if (!rhdr)
> goto short_packet;
> -
> - if (skb_copy_bits(skb, offset, &rhdr, sizeof(rhdr)) < 0)
> + if (!xdr_extract_blob(response, &resp_token))
> goto short_packet;
> - offset += sizeof(rhdr);
> - len -= sizeof(rhdr);
> -
> - token_offset = offset;
> - token_len = ntohl(rhdr.token_len);
> - if (token_len > len ||
> - xdr_round_up(token_len) + sizeof(__be32) > len)
> + if (!xdr_extract_blob(response, &auth))
> goto short_packet;
>
> - trace_rxrpc_rx_response(conn, sp->hdr.serial, 0, sp->hdr.cksum, token_len);
> + trace_rxrpc_rx_response(conn, sp->hdr.serial, 0, sp->hdr.cksum, resp_token.len);
>
> - offset += xdr_round_up(token_len);
> - len -= xdr_round_up(token_len);
> -
> - if (skb_copy_bits(skb, offset, &xauth_len, sizeof(xauth_len)) < 0)
> - goto short_packet;
> - offset += sizeof(xauth_len);
> - len -= sizeof(xauth_len);
> -
> - auth_offset = offset;
> - auth_len = ntohl(xauth_len);
> - if (auth_len > len)
> - goto short_packet;
> - if (auth_len & 3)
> + if (auth.len & 3)
> goto inconsistent;
> - if (auth_len < 20 + 9 * 4)
> + if (auth.len < 20 + 9 * 4)
> goto auth_too_short;
>
> /* We need to extract and decrypt the token and instantiate a session
> @@ -1238,7 +1186,7 @@ static int rxgk_verify_response(struct rxrpc_connection *conn,
> * to the app to deal with - which might mean a round trip to
> * userspace.
> */
> - ret = rxgk_extract_token(conn, skb, token_offset, token_len, &key);
> + ret = rxgk_extract_token(conn, skb, &resp_token, &key);
> if (ret < 0)
> goto out;
>
> @@ -1252,7 +1200,7 @@ static int rxgk_verify_response(struct rxrpc_connection *conn,
> */
> token = key->payload.data[0];
> conn->security_level = token->rxgk->level;
> - conn->rxgk.start_time = __be64_to_cpu(rhdr.start_time);
> + conn->rxgk.start_time = __be64_to_cpu(rhdr->start_time);
>
> gk = rxgk_generate_transport_key(conn, token->rxgk, sp->hdr.cksum, GFP_NOFS);
> if (IS_ERR(gk)) {
> @@ -1262,18 +1210,18 @@ static int rxgk_verify_response(struct rxrpc_connection *conn,
>
> krb5 = gk->krb5;
>
> - trace_rxrpc_rx_response(conn, sp->hdr.serial, krb5->etype, sp->hdr.cksum, token_len);
> + trace_rxrpc_rx_response(conn, sp->hdr.serial, krb5->etype, sp->hdr.cksum,
> + resp_token.len);
>
> /* Decrypt, parse and verify the authenticator. */
> - ret = rxgk_decrypt_skb(krb5, gk->resp_enc, skb,
> - &auth_offset, &auth_len, &ec);
> + ret = rxgk_decrypt(krb5, gk->resp_enc, &auth, &ec);
> if (ret < 0) {
> rxrpc_abort_conn(conn, skb, RXGK_SEALEDINCON, ret,
> rxgk_abort_resp_auth_dec);
> goto out_gk;
> }
>
> - ret = rxgk_verify_authenticator(conn, krb5, skb, auth_offset, auth_len);
> + ret = rxgk_verify_authenticator(conn, krb5, skb, &auth);
> if (ret < 0)
> goto out_gk;
>
> diff --git a/net/rxrpc/rxgk_app.c b/net/rxrpc/rxgk_app.c
> index 0ef2a29eb695..5ced22f17c5b 100644
> --- a/net/rxrpc/rxgk_app.c
> +++ b/net/rxrpc/rxgk_app.c
> @@ -40,40 +40,43 @@
> * };
> */
> int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
> - unsigned int ticket_offset, unsigned int ticket_len,
> - struct key **_key)
> + struct xdr_buffer *xticket, struct key **_key)
> {
> struct rxrpc_key_token *token;
> const struct cred *cred = current_cred(); // TODO - use socket creds
> + struct xdr_buffer xdr = *xticket, K0;
> struct key *key;
> size_t pre_ticket_len, payload_len;
> - unsigned int klen, enctype;
> + unsigned int enctype;
> void *payload, *ticket;
> - __be32 *t, *p, *q, tmp[2];
> + __be32 *t, *p, *q;
> + xdr_t *x;
> int ret;
>
> _enter("");
>
> - if (ticket_len < 10 * sizeof(__be32))
> + /* Extract K0 */
> + x = xdr_extract_region(&xdr, sizeof(xdr_t) * 1);
> + if (!x)
> return rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
> rxgk_abort_resp_short_yfs_tkt);
>
> - /* Get the session key length */
> - ret = skb_copy_bits(skb, ticket_offset, tmp, sizeof(tmp));
> - if (ret < 0)
> - return rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
> - rxgk_abort_resp_short_yfs_klen);
> - enctype = ntohl(tmp[0]);
> - klen = ntohl(tmp[1]);
> + enctype = xdr_dec(*x);
>
> - if (klen > ticket_len - 10 * sizeof(__be32))
> + if (!xdr_extract_blob(&xdr, &K0))
> return rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
> rxgk_abort_resp_short_yfs_key);
>
> + /* Extract level to expirationtime. */
> + t = xdr_extract_region(&xdr, sizeof(xdr_t) * 7);
> + if (!t)
> + return rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
> + rxgk_abort_resp_short_yfs_tkt);
> +
> pre_ticket_len = ((5 + 14) * sizeof(__be32) +
> - xdr_round_up(klen) +
> + xdr_round_up(K0.len) +
> sizeof(__be32));
> - payload_len = pre_ticket_len + xdr_round_up(ticket_len);
> + payload_len = pre_ticket_len + xdr_round_up(xticket->len);
>
> payload = kzalloc(payload_len, GFP_NOFS);
> if (!payload)
> @@ -84,12 +87,7 @@ int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
> * it.
> */
> ticket = payload + pre_ticket_len;
> - ret = skb_copy_bits(skb, ticket_offset, ticket, ticket_len);
> - if (ret < 0) {
> - ret = rxrpc_abort_conn(conn, skb, RXGK_INCONSISTENCY, -EPROTO,
> - rxgk_abort_resp_short_yfs_tkt);
> - goto error;
> - }
> + memcpy(ticket, xticket->p, xticket->len);
>
> /* Fill out the form header. */
> p = payload;
> @@ -97,13 +95,12 @@ int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
> p[1] = htonl(1); /* len(cellname) */
> p[2] = htonl(0x20000000); /* Cellname " " */
> p[3] = htonl(1); /* #tokens */
> - p[4] = htonl(15 * sizeof(__be32) + xdr_round_up(klen) +
> - xdr_round_up(ticket_len)); /* Token len */
> + p[4] = htonl(15 * sizeof(__be32) + xdr_round_up(K0.len) +
> + xdr_round_up(xticket->len)); /* Token len */
>
> /* Now fill in the body. Most of this we can just scrape directly from
> * the ticket.
> */
> - t = ticket + sizeof(__be32) * 2 + xdr_round_up(klen);
> q = payload + 5 * sizeof(__be32);
> q[0] = htonl(RXRPC_SECURITY_YFS_RXGK);
> q[1] = t[1]; /* begintime - msw */
> @@ -118,21 +115,21 @@ int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
> q[10] = t[4]; /* - lsw */
> q[11] = 0; /* enctype - msw */
> q[12] = htonl(enctype); /* - lsw */
> - q[13] = htonl(klen); /* Key length */
> + q[13] = htonl(K0.len); /* Key length */
>
> q += 14;
>
> - memcpy(q, ticket + sizeof(__be32) * 2, klen);
> - q += xdr_round_up(klen) / 4;
> - q[0] = htonl(ticket_len);
> + memcpy(q, K0.p, K0.len);
> + q += xdr_round_up(K0.len) / 4;
> + q[0] = htonl(xticket->len);
> q++;
> if (WARN_ON((unsigned long)q != (unsigned long)ticket)) {
> ret = -EIO;
> goto error;
> }
>
> - /* Ticket read in with skb_copy_bits above */
> - q += xdr_round_up(ticket_len) / 4;
> + /* Ticket appended above. */
> + q += xdr_round_up(xticket->len) / 4;
> if (WARN_ON((unsigned long)q - (unsigned long)payload != payload_len)) {
> ret = -EIO;
> goto error;
> @@ -182,44 +179,34 @@ int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
> * [tools.ietf.org/html/draft-wilkinson-afs3-rxgk-afs-08 sec 6.1]
> */
> int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
> - unsigned int token_offset, unsigned int token_len,
> - struct key **_key)
> + struct xdr_buffer *token, struct key **_key)
> {
> const struct krb5_enctype *krb5;
> const struct krb5_buffer *server_secret;
> struct crypto_aead *token_enc = NULL;
> + struct xdr_buffer ticket;
> struct key *server_key;
> - unsigned int ticket_offset, ticket_len;
> u32 kvno, enctype;
> int ret, ec = 0;
>
> struct {
> __be32 kvno;
> __be32 enctype;
> - __be32 token_len;
> - } container;
> + } *container;
>
> - if (token_len < sizeof(container))
> + container = xdr_extract_region(token, sizeof(container));
> + if (!container)
> goto short_packet;
>
> /* Decode the RXGK_TokenContainer object. This tells us which server
> * key we should be using. We can then fetch the key, get the secret
> * and set up the crypto to extract the token.
> */
> - if (skb_copy_bits(skb, token_offset, &container, sizeof(container)) < 0)
> - goto short_packet;
> -
> - kvno = ntohl(container.kvno);
> - enctype = ntohl(container.enctype);
> - ticket_len = ntohl(container.token_len);
> - ticket_offset = token_offset + sizeof(container);
> -
> - if (ticket_len > xdr_round_down(token_len - sizeof(container)))
> - goto short_packet;
> + kvno = ntohl(container->kvno);
> + enctype = ntohl(container->enctype);
>
> _debug("KVNO %u", kvno);
> _debug("ENC %u", enctype);
> - _debug("TLEN %u", ticket_len);
>
> server_key = rxrpc_look_up_server_security(conn, skb, kvno, enctype);
> if (IS_ERR(server_key))
> @@ -237,8 +224,11 @@ int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
> * gain access to K0, from which we can derive the transport key and
> * thence decode the authenticator.
> */
> - ret = rxgk_decrypt_skb(krb5, token_enc, skb,
> - &ticket_offset, &ticket_len, &ec);
> + if (!xdr_extract_blob(token, &ticket))
> + goto short_packet;
> + _debug("TLEN %u", ticket.len);
> +
> + ret = rxgk_decrypt(krb5, token_enc, &ticket, &ec);
> crypto_free_aead(token_enc);
> token_enc = NULL;
> if (ret < 0) {
> @@ -248,8 +238,7 @@ int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
> return ret;
> }
>
> - ret = conn->security->default_decode_ticket(conn, skb, ticket_offset,
> - ticket_len, _key);
> + ret = conn->security->default_decode_ticket(conn, skb, &ticket, _key);
> if (ret < 0)
> goto cant_get_token;
>
> diff --git a/net/rxrpc/rxgk_common.h b/net/rxrpc/rxgk_common.h
> index dc8b0f106104..f0b724c44036 100644
> --- a/net/rxrpc/rxgk_common.h
> +++ b/net/rxrpc/rxgk_common.h
> @@ -33,19 +33,13 @@ struct rxgk_context {
> struct crypto_aead *resp_enc; /* Response packet enc key */
> };
>
> -#define xdr_round_up(x) (round_up((x), sizeof(__be32)))
> -#define xdr_round_down(x) (round_down((x), sizeof(__be32)))
> -#define xdr_object_len(x) (4 + xdr_round_up(x))
> -
> /*
> * rxgk_app.c
> */
> int rxgk_yfs_decode_ticket(struct rxrpc_connection *conn, struct sk_buff *skb,
> - unsigned int ticket_offset, unsigned int ticket_len,
> - struct key **_key);
> + struct xdr_buffer *xticket, struct key **_key);
> int rxgk_extract_token(struct rxrpc_connection *conn, struct sk_buff *skb,
> - unsigned int token_offset, unsigned int token_len,
> - struct key **_key);
> + struct xdr_buffer *token, struct key **_key);
>
> /*
> * rxgk_kdf.c
> @@ -61,50 +55,6 @@ int rxgk_set_up_token_cipher(const struct krb5_buffer *server_key,
> const struct krb5_enctype **_krb5,
> gfp_t gfp);
>
> -/*
> - * Apply decryption and checksumming functions to part of an skbuff. The
> - * offset and length are updated to reflect the actual content of the encrypted
> - * region.
> - */
> -static inline
> -int rxgk_decrypt_skb(const struct krb5_enctype *krb5,
> - struct crypto_aead *aead,
> - struct sk_buff *skb,
> - unsigned int *_offset, unsigned int *_len,
> - int *_error_code)
> -{
> - struct scatterlist sg[16];
> - size_t offset = 0, len = *_len;
> - int nr_sg, ret;
> -
> - sg_init_table(sg, ARRAY_SIZE(sg));
> - nr_sg = skb_to_sgvec(skb, sg, *_offset, len);
> - if (unlikely(nr_sg < 0))
> - return nr_sg;
> -
> - ret = crypto_krb5_decrypt(krb5, aead, sg, nr_sg,
> - &offset, &len);
> - switch (ret) {
> - case 0:
> - *_offset += offset;
> - *_len = len;
> - break;
> - case -EBADMSG: /* Checksum mismatch. */
> - case -EPROTO:
> - *_error_code = RXGK_SEALEDINCON;
> - break;
> - case -EMSGSIZE:
> - *_error_code = RXGK_PACKETSHORT;
> - break;
> - case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */
> - default:
> - *_error_code = RXGK_INCONSISTENCY;
> - break;
> - }
> -
> - return ret;
> -}
> -
> /*
> * Apply decryption and checksumming functions a flat data buffer. The offset
> * and length are updated to reflect the actual content of the encrypted
> @@ -112,21 +62,24 @@ int rxgk_decrypt_skb(const struct krb5_enctype *krb5,
> */
> static inline int rxgk_decrypt(const struct krb5_enctype *krb5,
> struct crypto_aead *aead,
> - void *data,
> - unsigned int *_offset, unsigned int *_len,
> + struct xdr_buffer *buf,
> int *_error_code)
> {
> struct scatterlist sg[1];
> - size_t offset = 0, len = *_len;
> + size_t offset = 0, len = buf->len;
> int ret;
>
> - sg_init_one(sg, data, len);
> -
> + sg_init_one(sg, buf->p, len);
> ret = crypto_krb5_decrypt(krb5, aead, sg, 1, &offset, &len);
> switch (ret) {
> case 0:
> - *_offset += offset;
> - *_len = len;
> + if (offset & 3) {
> + *_error_code = RXGK_INCONSISTENCY;
> + ret = -EPROTO;
> + break;
> + }
> + buf->p = (void *)buf->p + offset;
> + buf->len = len;
> break;
> case -EBADMSG: /* Checksum mismatch. */
> case -EPROTO:
> @@ -144,50 +97,6 @@ static inline int rxgk_decrypt(const struct krb5_enctype *krb5,
> return ret;
> }
>
> -/*
> - * Check the MIC on a region of an skbuff. The offset and length are updated
> - * to reflect the actual content of the secure region.
> - */
> -static inline
> -int rxgk_verify_mic_skb(const struct krb5_enctype *krb5,
> - struct crypto_shash *shash,
> - const struct krb5_buffer *metadata,
> - struct sk_buff *skb,
> - unsigned int *_offset, unsigned int *_len,
> - u32 *_error_code)
> -{
> - struct scatterlist sg[16];
> - size_t offset = 0, len = *_len;
> - int nr_sg, ret;
> -
> - sg_init_table(sg, ARRAY_SIZE(sg));
> - nr_sg = skb_to_sgvec(skb, sg, *_offset, len);
> - if (unlikely(nr_sg < 0))
> - return nr_sg;
> -
> - ret = crypto_krb5_verify_mic(krb5, shash, metadata, sg, nr_sg,
> - &offset, &len);
> - switch (ret) {
> - case 0:
> - *_offset += offset;
> - *_len = len;
> - break;
> - case -EBADMSG: /* Checksum mismatch */
> - case -EPROTO:
> - *_error_code = RXGK_SEALEDINCON;
> - break;
> - case -EMSGSIZE:
> - *_error_code = RXGK_PACKETSHORT;
> - break;
> - case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */
> - default:
> - *_error_code = RXGK_INCONSISTENCY;
> - break;
> - }
> -
> - return ret;
> -}
> -
> /*
> * Check the MIC on a flat buffer. The offset and length are updated to
> * reflect the actual content of the secure region.
> diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c
> index 075936337836..e0f3db451b05 100644
> --- a/net/rxrpc/rxkad.c
> +++ b/net/rxrpc/rxkad.c
> @@ -963,7 +963,6 @@ static int rxkad_decrypt_ticket(struct rxrpc_connection *conn,
> *_expiry = 0;
>
> ASSERT(server_key->payload.data[0] != NULL);
> - ASSERTCMP((unsigned long) ticket & 7UL, ==, 0);
>
> memcpy(&iv, &server_key->payload.data[2], sizeof(iv));
>
> @@ -1112,20 +1111,49 @@ static int rxkad_decrypt_response(struct rxrpc_connection *conn,
> * verify a response
> */
> static int rxkad_verify_response(struct rxrpc_connection *conn,
> - struct sk_buff *skb)
> + struct sk_buff *skb,
> + struct xdr_buffer *response_xdr)
> {
> struct rxkad_response *response;
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> struct rxrpc_crypt session_key;
> struct key *server_key;
> time64_t expiry;
> - void *ticket = NULL;
> + void *ticket;
> u32 version, kvno, ticket_len, level;
> __be32 csum;
> int ret, i;
>
> _enter("{%d}", conn->debug_id);
>
> + response = xdr_extract_region(response_xdr, sizeof(*response));
> + if (!response)
> + return rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
> + rxkad_abort_resp_short);
> +
> + version = ntohl(response->version);
> + ticket_len = ntohl(response->ticket_len);
> + kvno = ntohl(response->kvno);
> +
> + trace_rxrpc_rx_response(conn, sp->hdr.serial, version, kvno, ticket_len);
> +
> + if (version != RXKAD_VERSION)
> + return rxrpc_abort_conn(conn, skb, RXKADINCONSISTENCY, -EPROTO,
> + rxkad_abort_resp_version);
> +
> + if (kvno >= RXKAD_TKT_TYPE_KERBEROS_V5)
> + return rxrpc_abort_conn(conn, skb, RXKADUNKNOWNKEY, -EPROTO,
> + rxkad_abort_resp_unknown_tkt);
> +
> + ticket = xdr_extract_region(response_xdr, ticket_len);
> + if (!ticket)
> + return rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
> + rxkad_abort_resp_short_tkt);
> +
> + if (ticket_len < 4 || ticket_len > MAXKRB5TICKETLEN)
> + return rxrpc_abort_conn(conn, skb, RXKADTICKETLEN, -EPROTO,
> + rxkad_abort_resp_tkt_len);
> +
> server_key = rxrpc_look_up_server_security(conn, skb, 0, 0);
> if (IS_ERR(server_key)) {
> ret = PTR_ERR(server_key);
> @@ -1142,55 +1170,10 @@ static int rxkad_verify_response(struct rxrpc_connection *conn,
> }
> }
>
> - ret = -ENOMEM;
> - response = kzalloc_obj(struct rxkad_response, GFP_NOFS);
> - if (!response)
> - goto error;
> -
> - if (skb_copy_bits(skb, sizeof(struct rxrpc_wire_header),
> - response, sizeof(*response)) < 0) {
> - ret = rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
> - rxkad_abort_resp_short);
> - goto error;
> - }
> -
> - version = ntohl(response->version);
> - ticket_len = ntohl(response->ticket_len);
> - kvno = ntohl(response->kvno);
> -
> - trace_rxrpc_rx_response(conn, sp->hdr.serial, version, kvno, ticket_len);
> -
> - if (version != RXKAD_VERSION) {
> - ret = rxrpc_abort_conn(conn, skb, RXKADINCONSISTENCY, -EPROTO,
> - rxkad_abort_resp_version);
> - goto error;
> - }
> -
> - if (ticket_len < 4 || ticket_len > MAXKRB5TICKETLEN) {
> - ret = rxrpc_abort_conn(conn, skb, RXKADTICKETLEN, -EPROTO,
> - rxkad_abort_resp_tkt_len);
> - goto error;
> - }
> -
> - if (kvno >= RXKAD_TKT_TYPE_KERBEROS_V5) {
> - ret = rxrpc_abort_conn(conn, skb, RXKADUNKNOWNKEY, -EPROTO,
> - rxkad_abort_resp_unknown_tkt);
> - goto error;
> - }
> -
> - /* extract the kerberos ticket and decrypt and decode it */
> - ret = -ENOMEM;
> - ticket = kmalloc(ticket_len, GFP_NOFS);
> - if (!ticket)
> - goto error;
> -
> - if (skb_copy_bits(skb, sizeof(struct rxrpc_wire_header) + sizeof(*response),
> - ticket, ticket_len) < 0) {
> - ret = rxrpc_abort_conn(conn, skb, RXKADPACKETSHORT, -EPROTO,
> - rxkad_abort_resp_short_tkt);
> - goto error;
> - }
> -
> + /* Extract the kerberos ticket and decrypt and decode it. We copy it
> + * before decryption which ensures it's correctly aligned for the
> + * decryption.
> + */
> ret = rxkad_decrypt_ticket(conn, server_key, skb, ticket, ticket_len,
> &session_key, &expiry);
> if (ret < 0)
> @@ -1265,8 +1248,6 @@ static int rxkad_verify_response(struct rxrpc_connection *conn,
> ret = rxrpc_get_server_data_key(conn, &session_key, expiry, kvno);
>
> error:
> - kfree(ticket);
> - kfree(response);
> key_put(server_key);
> _leave(" = %d", ret);
> return ret;
>
It would be easier to review this change if the use of a temporary buffer were
separated from the XDR abstraction changes. Would it be possible to do so?
Thank you.
Jeffrey Altman
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-11 16:07 ` [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg David Howells
2026-05-12 7:58 ` Jeffrey Altman
@ 2026-05-12 13:38 ` David Laight
2026-05-12 16:52 ` David Howells
1 sibling, 1 reply; 15+ messages in thread
From: David Laight @ 2026-05-12 13:38 UTC (permalink / raw)
To: David Howells
Cc: netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski, David S. Miller,
Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
Jeffrey Altman, Jiayuan Chen, stable
On Mon, 11 May 2026 17:07:48 +0100
David Howells <dhowells@redhat.com> wrote:
> This improves the fix for CVE-2026-43500.
>
> Fix the pagecache corruption from in-place decryption of a DATA packet
> transmitted locally by splice() by getting rid of the packet sharing in the
> I/O thread and unconditionally extracting the packet content into a bounce
> buffer in which the buffer is decrypted. recvmsg() (or the kernel
> equivalent) then copies the data from the bounce buffer to the destination
> buffer. The sk_buff then remains unmodified.
>
> This has an additional advantage in that the packet is then arranged in the
> buffer with the correct alignment required for the crypto algorithms to
> process directly. The performance of the crypto does seem to be a little
> faster and, surprisingly, the unencrypted performance doesn't seem to
> change much - possibly due to removing complexity from the I/O thread.
Yep, avoiding data copies is overrated :-)
> Yet another advantage is that the I/O thread doesn't have to copy packets
> which would slow down packet distribution, ACK generation, etc..
>
> The buffer belongs to the call and is allocated initially at 2K,
> sufficiently large to hold a whole jumbo subpacket, but the buffer will be
> increased in size if needed. There is one downside here, and that's if a
> MSG_PEEK of more than one byte occurs, it may move on to the next packet,
> replacing the content of the buffer. In such a case, it has to go back and
> re-decrypt the current packet.
>
> Note that rx_pkt_offset may legitimately see 0 as a valid offset now, so
> switch to using USHRT_MAX to indicate an invalid offset.
>
> Note also that I would generally prefer to replace the buffers of the
> current sk_buff with a new kmalloc'd buffer of the right size, ditching the
> old data and frags as this makes the handling of MSG_PEEK easier and
> removes the double-decryption issue, but this looks like quite a
> complicated thing to achieve. skb_morph() looks half way to what I want,
> but I don't want to have to allocate a new sk_buff.
Wouldn't you need to do that anyway when the skb is shared - or can't
that happen?
>
> Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
> Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
> Closes: https://lore.kernel.org/r/afKV2zGR6rrelPC7@v4bel/
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: Jeffrey Altman <jaltman@auristor.com>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Simon Horman <horms@kernel.org>
> cc: Jiayuan Chen <jiayuan.chen@linux.dev>
> cc: netdev@vger.kernel.org
> cc: linux-afs@lists.infradead.org
> cc: stable@vger.kernel.org
> ---
> net/rxrpc/ar-internal.h | 7 +++-
> net/rxrpc/call_event.c | 22 +----------
> net/rxrpc/call_object.c | 2 +
> net/rxrpc/insecure.c | 3 --
> net/rxrpc/recvmsg.c | 72 +++++++++++++++++++++++++++-------
> net/rxrpc/rxgk.c | 49 +++++++++++------------
> net/rxrpc/rxgk_common.h | 79 +++++++++++++++++++++++++++++++++++++
> net/rxrpc/rxkad.c | 86 +++++++++++++++--------------------------
> 8 files changed, 200 insertions(+), 120 deletions(-)
>
> diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
> index 27c2aa2dd023..783367eea798 100644
> --- a/net/rxrpc/ar-internal.h
> +++ b/net/rxrpc/ar-internal.h
> @@ -213,8 +213,6 @@ struct rxrpc_skb_priv {
> struct {
> u16 offset; /* Offset of data */
> u16 len; /* Length of data */
> - u8 flags;
> -#define RXRPC_RX_VERIFIED 0x01
> };
> struct {
> rxrpc_seq_t first_ack; /* First packet in acks table */
> @@ -774,6 +772,11 @@ struct rxrpc_call {
> struct sk_buff_head recvmsg_queue; /* Queue of packets ready for recvmsg() */
> struct sk_buff_head rx_queue; /* Queue of packets for this call to receive */
> struct sk_buff_head rx_oos_queue; /* Queue of out of sequence packets */
> + void *rx_dec_buffer; /* Decryption buffer */
> + unsigned short rx_dec_bsize; /* rx_dec_buffer size */
> + unsigned short rx_dec_offset; /* Decrypted packet data offset */
> + unsigned short rx_dec_len; /* Decrypted packet data len */
Is it actually worth making those short rather than int?
I doubt the extra 4 bytes will matter and the generated code might be better.
(IIRC 32bit arm has a limited offset from 16 bit load/store, dunno about 64bit)
> + rxrpc_seq_t rx_dec_seq; /* Packet in decryption buffer */
>
> rxrpc_seq_t rx_highest_seq; /* Higest sequence number received */
> rxrpc_seq_t rx_consumed; /* Highest packet consumed */
> diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
> index 2b19b252225e..fec59d9338b9 100644
> --- a/net/rxrpc/call_event.c
> +++ b/net/rxrpc/call_event.c
> @@ -332,27 +332,7 @@ bool rxrpc_input_call_event(struct rxrpc_call *call)
>
> saw_ack |= sp->hdr.type == RXRPC_PACKET_TYPE_ACK;
>
> - if (sp->hdr.type == RXRPC_PACKET_TYPE_DATA &&
> - sp->hdr.securityIndex != 0 &&
> - (skb_cloned(skb) ||
> - skb_has_frag_list(skb) ||
> - skb_has_shared_frag(skb))) {
> - /* Unshare the packet so that it can be
> - * modified by in-place decryption.
> - */
> - struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC);
> -
> - if (nskb) {
> - rxrpc_new_skb(nskb, rxrpc_skb_new_unshared);
> - rxrpc_input_call_packet(call, nskb);
> - rxrpc_free_skb(nskb, rxrpc_skb_put_call_rx);
> - } else {
> - /* OOM - Drop the packet. */
> - rxrpc_see_skb(skb, rxrpc_skb_see_unshare_nomem);
> - }
> - } else {
> - rxrpc_input_call_packet(call, skb);
> - }
> + rxrpc_input_call_packet(call, skb);
> rxrpc_free_skb(skb, rxrpc_skb_put_call_rx);
> did_receive = true;
> }
> diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
> index f035f486c139..fcb9d38bb521 100644
> --- a/net/rxrpc/call_object.c
> +++ b/net/rxrpc/call_object.c
> @@ -152,6 +152,7 @@ struct rxrpc_call *rxrpc_alloc_call(struct rxrpc_sock *rx, gfp_t gfp,
> spin_lock_init(&call->notify_lock);
> refcount_set(&call->ref, 1);
> call->debug_id = debug_id;
> + call->rx_pkt_offset = USHRT_MAX;
> call->tx_total_len = -1;
> call->tx_jumbo_max = 1;
> call->next_rx_timo = 20 * HZ;
> @@ -553,6 +554,7 @@ static void rxrpc_cleanup_rx_buffers(struct rxrpc_call *call)
> rxrpc_purge_queue(&call->recvmsg_queue);
> rxrpc_purge_queue(&call->rx_queue);
> rxrpc_purge_queue(&call->rx_oos_queue);
> + kfree(call->rx_dec_buffer);
> }
>
> /*
> diff --git a/net/rxrpc/insecure.c b/net/rxrpc/insecure.c
> index 0a260df45d25..7a26c6097d03 100644
> --- a/net/rxrpc/insecure.c
> +++ b/net/rxrpc/insecure.c
> @@ -32,9 +32,6 @@ static int none_secure_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
>
> static int none_verify_packet(struct rxrpc_call *call, struct sk_buff *skb)
> {
> - struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> -
> - sp->flags |= RXRPC_RX_VERIFIED;
> return 0;
> }
>
> diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
> index e1f7513a46db..865e368381d5 100644
> --- a/net/rxrpc/recvmsg.c
> +++ b/net/rxrpc/recvmsg.c
> @@ -147,15 +147,55 @@ static void rxrpc_rotate_rx_window(struct rxrpc_call *call)
> }
>
> /*
> - * Decrypt and verify a DATA packet.
> + * Decrypt and verify a DATA packet. The content of the packet is pulled out
> + * into a flat buffer rather than decrypting in place in the skbuff. This also
> + * has the advantage of aligning the buffer correctly for the crypto routines.
> + *
> + * We keep track of the sequence number of the packet currently decrypted into
> + * the buffer in ->rx_dec_seq. Unfortunately, this means that a MSG_PEEK of
> + * more than one byte may cause a later packet to be decrypted into the buffer,
> + * requiring the original to be re-decrypted when recvmsg() is called again.
> */
> static int rxrpc_verify_data(struct rxrpc_call *call, struct sk_buff *skb)
> {
> struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
> + int ret;
>
> - if (sp->flags & RXRPC_RX_VERIFIED)
> + if (call->rx_dec_seq == sp->hdr.seq && call->rx_dec_buffer)
> return 0;
> - return call->security->verify_packet(call, skb);
> +
> + if (call->rx_dec_bsize < sp->len) {
IMHO that test is backwards; the 'more constant' value should be on the right.
> + /* Make sure we can hold a 1412-byte jumbo subpacket and make
> + * sure that the buffer size is aligned to a crypto blocksize.
> + */
> + size_t size = umin(round_up(sp->len, 32), 2048);
Doesn't min() work?
> + void *buffer = krealloc(call->rx_dec_buffer, size, GFP_NOFS);
> +
> + if (!buffer)
> + return -ENOMEM;
> + call->rx_dec_buffer = buffer;
> + call->rx_dec_bsize = size;
> + }
That doesn't look right.
If sp->len is bigger than 2048 then you keep allocating a new buffer
and the call below overruns the allocated buffer.
> +
> + ret = -EFAULT;
> + if (skb_copy_bits(skb, sp->offset, call->rx_dec_buffer, sp->len) < 0)
> + goto err;
> +
...
-- David
* Re: [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-12 13:38 ` David Laight
@ 2026-05-12 16:52 ` David Howells
2026-05-12 21:36 ` David Laight
0 siblings, 1 reply; 15+ messages in thread
From: David Howells @ 2026-05-12 16:52 UTC (permalink / raw)
To: David Laight
Cc: dhowells, netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
linux-afs, linux-kernel, Jeffrey Altman, Jiayuan Chen, stable
David Laight <david.laight.linux@gmail.com> wrote:
> > Note also that I would generally prefer to replace the buffers of the
> > current sk_buff with a new kmalloc'd buffer of the right size, ditching the
> > old data and frags as this makes the handling of MSG_PEEK easier and
> > removes the double-decryption issue, but this looks like quite a
> > complicated thing to achieve. skb_morph() looks half way to what I want,
> > but I don't want to have to allocate a new sk_buff.
>
> Wouldn't you need to do that anyway when the kkb is shared - or can't
> that happen?
Hmmm... That may well be the case - but if it's shared, do I own the
->next/prev pointers and the ->cb area?
> > + unsigned short rx_dec_bsize; /* rx_dec_buffer size */
> > + unsigned short rx_dec_offset; /* Decrypted packet data offset */
> > + unsigned short rx_dec_len; /* Decrypted packet data len */
>
> Is it actually worth making those short rather than int?
> I doubt the extra 4 bytes will matter and the generated code might be better.
> (IIRC 32bit arm has a limited offset from 16 bit load/store, dunno about 64bit)
Well, the capacity of a UDP packet less the rxrpc header can't reach 65535, so
on that basis this should be fine. I'm a little worried about the rxrpc_call
struct's size - it's already ~1.3K. It's already got a lot of 8- and 16-bit
fields in it. Of course, it's nowhere near as bit-for-bit optimised as
sk_buff, but I guess there are a lot more of those in a system.
> > + if (call->rx_dec_bsize < sp->len) {
>
> IMHO That test is backwards; the 'more constant' value should be on the right.
Actually, the thing you're testing should be on the left and the thing you're
testing against on the right - but, yes, I should switch them around.
> > + size_t size = umin(round_up(sp->len, 32), 2048);
>
> Doesn't min() work?
Actually, it should be umax() as I want the largest of the values (as Jeff
pointed out). I prefer using umin/umax for values that are known to be
unsigned as you don't get casting errors (see the number of places we end up
using min/max_t(<unsigned-type>, ...) when we should use umin/umax() instead)
and the compiler may generate better code as we've told it that it doesn't
have to worry about negatives.
> That doesn't look right.
> If sp->len is bigger than 2048 then you keep allocating a new buffer
> and the call below overruns the allocated buffer.
Yep - see the aforementioned umax comment.
David
* Re: [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-12 16:52 ` David Howells
@ 2026-05-12 21:36 ` David Laight
0 siblings, 0 replies; 15+ messages in thread
From: David Laight @ 2026-05-12 21:36 UTC (permalink / raw)
To: David Howells
Cc: netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski, David S. Miller,
Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
Jeffrey Altman, Jiayuan Chen, stable
On Tue, 12 May 2026 17:52:03 +0100
David Howells <dhowells@redhat.com> wrote:
> David Laight <david.laight.linux@gmail.com> wrote:
...
> > > + size_t size = umin(round_up(sp->len, 32), 2048);
> >
> > Doesn't min() work?
>
> Actually, it should be umax() as I want the largest of the values (as Jeff
> pointed out). I prefer using umin/umax for values that are known to be
> unsigned as you don't get casting errors (see the number of places we end up
> using min/max_t(<unsigned-type>, ...) when we should use umin/umax() instead)
> and the compiler may generate better code as we've told it that it doesn't
> have to worry about negatives.
umin() and umax() are better than min_t() and max_t() (which is why I added
them); but you lose the compile-time check in min() and max() that rejects
comparisons where one side is unsigned and the compiler doesn't know that the
other is always non-negative.
Basically if you compare a signed 32bit value and an unsigned 64bit one
with umin() the 32bit one is zero-extended to 64 bits.
OTOH min_t(u64) will sign-extend the 32bit value and then treat it as unsigned.
In both cases the onus is on the programmer to ensure the 32bit value isn't
negative.
For valid non-negative values the result is the same.
Zero-extending is usually free, sign-extending is particularly horrid on 32bit.
But it is better to use min() or max().
The compile-time tests will reject any cases where the integer promotion
rules could convert a negative value to a large positive one.
Note that the types no longer have to match.
Code like this is (usually) ok:
unsigned int blk_len = ...;
int rval = fun(...);
while (rval > 0) {
u32 len = min(rval, blk_len);
// process len bytes;
rval -= len;
}
even though the types passed to min() differ in signedness the compiler's
value tracking means it knows that rval can never become a large unsigned
value - and min() uses that to allow it all through.
-- David
>
> > That doesn't look right.
> > If sp->len is bigger than 2048 the you keep allocating a new buffer
> > and the call below overruns the allocated buffer.
>
> Yep - see the aforementioned umax comment.
>
> David
>
* Re: [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer
2026-05-11 16:07 ` [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer David Howells
2026-05-12 8:22 ` Jeffrey Altman
@ 2026-05-13 0:06 ` Jakub Kicinski
2026-05-13 7:35 ` David Howells
1 sibling, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2026-05-13 0:06 UTC (permalink / raw)
To: David Howells
Cc: netdev, Hyunwoo Kim, Marc Dionne, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
Jeffrey Altman, Jiayuan Chen, stable
On Mon, 11 May 2026 17:07:49 +0100 David Howells wrote:
> This improves the fix for CVE-2026-43500.
>
> Fix the verification of RESPONSE packets to avoid the problem of
> overwriting a RESPONSE packet sent via splice to a local address by
> extracting the contents of the UDP packet into a kmalloc'd linear buffer
> rather than decrypting the data in place in the sk_buff (which may corrupt
> the original buffer).
net/rxrpc/conn_event.c:254:10: warning: variable 'ret' is uninitialized when used here [-Wuninitialized]
254 | return ret;
| ^~~
net/rxrpc/conn_event.c:250:9: note: initialize the variable 'ret' to silence this warning
250 | int ret;
| ^
| = 0
* Re: [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer
2026-05-13 0:06 ` Jakub Kicinski
@ 2026-05-13 7:35 ` David Howells
0 siblings, 0 replies; 15+ messages in thread
From: David Howells @ 2026-05-13 7:35 UTC (permalink / raw)
To: Jakub Kicinski
Cc: dhowells, netdev, Hyunwoo Kim, Marc Dionne, David S. Miller,
Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
Jeffrey Altman, Jiayuan Chen, stable
Jakub Kicinski <kuba@kernel.org> wrote:
> net/rxrpc/conn_event.c:254:10: warning: variable 'ret' is uninitialized when used here [-Wuninitialized]
> 254 | return ret;
> | ^~~
> net/rxrpc/conn_event.c:250:9: note: initialize the variable 'ret' to silence this warning
> 250 | int ret;
> | ^
> | = 0
I'm puzzled why gcc doesn't report a lot of these, even compiling with W=1.
This one is pretty obvious.
David
* Re: [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-12 7:58 ` Jeffrey Altman
@ 2026-05-13 8:01 ` David Howells
2026-05-13 8:13 ` David Howells
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: David Howells @ 2026-05-13 8:01 UTC (permalink / raw)
To: Jeffrey Altman
Cc: dhowells, netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
linux-afs, linux-kernel, Jiayuan Chen, stable
Jeffrey Altman <jaltman@auristor.com> wrote:
> > + void *rx_dec_buffer; /* Decryption buffer */
> > + unsigned short rx_dec_bsize; /* rx_dec_buffer size */
> > + unsigned short rx_dec_offset; /* Decrypted packet data offset */
> > + unsigned short rx_dec_len; /* Decrypted packet data len */
> > + rxrpc_seq_t rx_dec_seq; /* Packet in decryption buffer */
> >
> > rxrpc_seq_t rx_highest_seq; /* Highest sequence number received */
> > rxrpc_seq_t rx_consumed; /* Highest packet consumed */
>
>
> Instead of allocating the storage within struct rxrpc_call perhaps
> it would be better to add them to struct rxrpc_channel. Doing so
> would reduce the allocation/deallocation churn. The majority of
> calls are short lived (perhaps a single packet in each direction)
> but there will be many calls in rapid succession.
I'm trying to keep the I/O side separate from the application side. I don't
particularly want recvmsg (on the app side) reaching into the rxrpc_connection
struct (on the I/O side).
Further, by only looking at the rxrpc_call struct, I don't have to deal with
locking required for the possibility that the next call on that channel will
start before I've finished with this one (say an incoming call is aborted and
immediately followed up by the first packet of the next call).
> > + size_t size = umin(round_up(sp->len, 32), 2048);
>
> I think you meant to use max() here so that a minimum of 2048 bytes
> is allocated.
Yeah.
> I think applying a cap on the allocation size would also be
> beneficial. IBM/Transarc derived Rx implementations have a hard
> upper-bound of 21180 (15 x 1412) bytes plus one 28 byte rx header.
> Applying a cap of 32KiB seems prudent.
This would need checking earlier in the input path. A DATA packet that's too
large would need to be rejected as it comes off of the UDP socket if we're not
going to be able to unpack it later.
> It is also worth noting that there are no current implementations
> of Rx RPC which will send individual Rx DATA packets larger than
> 1444 bytes including the Rx header. Rx RESPONSE packets can be sent
> as large as 16384 bytes (including the Rx header). However, it is
> extremely unlikely that this buffer once allocated would ever need
> to be grown.
For Rx RESPONSE packets, I'm fine with allocating a buffer on the spur of the
moment and freeing it immediately. Ideally, there would only be one RESPONSE
per connection anyway. I could do a static buffer with a lock, I suppose, to
make sure I can process the things under memory pressure-based writeback.
> > + kfree(call->rx_dec_buffer);
>
> It might be better to avoid deallocating the buffer on the error
> path and permit it to be freed during normal call (or call channel)
> deallocation.
Hmmm. But I then need some other way to note that the buffer is no longer
occupied by valid data. I suppose I could set ->rx_dec_offset to USHRT_MAX.
David
* Re: [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-13 8:01 ` David Howells
@ 2026-05-13 8:13 ` David Howells
2026-05-13 8:38 ` David Laight
2026-05-13 9:48 ` Jeffrey Altman
2 siblings, 0 replies; 15+ messages in thread
From: David Howells @ 2026-05-13 8:13 UTC (permalink / raw)
To: Jeffrey Altman
Cc: dhowells, netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
linux-afs, linux-kernel, Jiayuan Chen, stable
David Howells <dhowells@redhat.com> wrote:
> > > + kfree(call->rx_dec_buffer);
> >
> > It might be better to avoid deallocating the buffer on the error
> > path and permit it to be freed during normal call (or call channel)
> > deallocation.
>
> Hmmm. But I then need some other way to note that the buffer is no longer
> occupied by valid data. I suppose I could set ->rx_dec_offset to USHRT_MAX.
Actually, I'm not sure that just freeing the buffer is all that bad.
If skb_copy_bits() fails (ie. EFAULT), then the sk_buff is unrecoverably
broken somehow and the app may have to abandon the call. Possibly the
call should be aborted directly here. The case really shouldn't happen and
probably merits a pr_warn().
If ->verify_packet() fails with ENOMEM, then it's retryable. Releasing the
buffer temporarily might help the system.
If ->verify_packet() fails with anything else, then the call should have been
aborted.
David
* Re: [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-13 8:01 ` David Howells
2026-05-13 8:13 ` David Howells
@ 2026-05-13 8:38 ` David Laight
2026-05-13 9:48 ` Jeffrey Altman
2 siblings, 0 replies; 15+ messages in thread
From: David Laight @ 2026-05-13 8:38 UTC (permalink / raw)
To: David Howells
Cc: Jeffrey Altman, netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
linux-afs, linux-kernel, Jiayuan Chen, stable
On Wed, 13 May 2026 09:01:14 +0100
David Howells <dhowells@redhat.com> wrote:
> Jeffrey Altman <jaltman@auristor.com> wrote:
>
> > > + void *rx_dec_buffer; /* Decryption buffer */
> > > + unsigned short rx_dec_bsize; /* rx_dec_buffer size */
> > > + unsigned short rx_dec_offset; /* Decrypted packet data offset */
> > > + unsigned short rx_dec_len; /* Decrypted packet data len */
> > > + rxrpc_seq_t rx_dec_seq; /* Packet in decryption buffer */
> > >
> > > rxrpc_seq_t rx_highest_seq; /* Highest sequence number received */
> > > rxrpc_seq_t rx_consumed; /* Highest packet consumed */
> >
> >
> > Instead of allocating the storage within struct rxrpc_call perhaps
> > it would be better to add them to struct rxrpc_channel. Doing so
> > would reduce the allocation/deallocation churn. The majority of
> > calls are short lived (perhaps a single packet in each direction)
> > but there will be many calls in rapid succession.
>
> I'm trying to keep the I/O side separate from the application side. I don't
> particularly want recvmsg (on the app side) reaching into the rxrpc_connection
> struct (on the I/O side).
>
> Further, by only looking at the rxrpc_call struct, I don't have to deal with
> locking required for the possibility that the next call on that channel will
> start before I've finished with this one (say an incoming call is aborted and
> immediately followed up by the first packet of the next call).
There are also loads of other allocations and frees (eg the skb itself).
One more isn't really going to be significant.
Especially for sub-page sizes, which just come off a per-cpu list.
>
> > > + size_t size = umin(round_up(sp->len, 32), 2048);
> >
> > I think you meant to use max() here so that a minimum of 2048 bytes
> > is allocated.
>
> Yeah.
>
> > I think applying a cap on the allocation size would also be
> > beneficial. IBM/Transarc derived Rx implementations have a hard
> > upper-bound of 21180 (15 x 1412) bytes plus one 28 byte rx header.
> > Applying a cap of 32KiB seems prudent.
>
> This would need checking earlier in the input path. A DATA packet that's too
> large would need to be rejected as it comes off of the UDP socket if we're not
> going to be able to unpack it later.
>
> > It is also worth noting that there are no current implementations
> > of Rx RPC which will send individual Rx DATA packets larger than
> > 1444 bytes including the Rx header. Rx RESPONSE packets can be sent
> > as large as 16384 bytes (including the Rx header). However, it is
> > extremely unlikely that this buffer once allocated would ever need
> > to be grown.
>
> For Rx RESPONSE packets, I'm fine with allocating a buffer on the spur of the
> moment and freeing it immediately. Ideally, there would only be one RESPONSE
> per connection anyway. I could do a static buffer with a lock, I suppose, to
> make sure I can process the things under memory pressure-based writeback.
A 16K block of static data is rather a waste.
Under that much memory pressure something has to give.
Dropping a packet and forcing the remote to resend on timeout
may actually be the best thing to do.
-- David L
>
> > > + kfree(call->rx_dec_buffer);
> >
> > It might be better to avoid deallocating the buffer on the error
> > path and permit it to be freed during normal call (or call channel)
> > deallocation.
>
> Hmmm. But I then need some other way to note that the buffer is no longer
> occupied by valid data. I suppose I could set ->rx_dec_offset to USHRT_MAX.
>
> David
>
>
* Re: [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg
2026-05-13 8:01 ` David Howells
2026-05-13 8:13 ` David Howells
2026-05-13 8:38 ` David Laight
@ 2026-05-13 9:48 ` Jeffrey Altman
2 siblings, 0 replies; 15+ messages in thread
From: Jeffrey Altman @ 2026-05-13 9:48 UTC (permalink / raw)
To: David Howells
Cc: netdev, Hyunwoo Kim, Marc Dionne, Jakub Kicinski, David S. Miller,
Eric Dumazet, Paolo Abeni, Simon Horman, linux-afs, linux-kernel,
Jiayuan Chen, stable
> On May 13, 2026, at 4:01 AM, David Howells <dhowells@redhat.com> wrote:
>
> Jeffrey Altman <jaltman@auristor.com> wrote:
>
>>> + void *rx_dec_buffer; /* Decryption buffer */
>>> + unsigned short rx_dec_bsize; /* rx_dec_buffer size */
>>> + unsigned short rx_dec_offset; /* Decrypted packet data offset */
>>> + unsigned short rx_dec_len; /* Decrypted packet data len */
>>> + rxrpc_seq_t rx_dec_seq; /* Packet in decryption buffer */
>>>
>>> rxrpc_seq_t rx_highest_seq; /* Highest sequence number received */
>>> rxrpc_seq_t rx_consumed; /* Highest packet consumed */
>>
>>
>> Instead of allocating the storage within struct rxrpc_call perhaps
>> it would be better to add them to struct rxrpc_channel. Doing so
>> would reduce the allocation/deallocation churn. The majority of
>> calls are short lived (perhaps a single packet in each direction)
>> but there will be many calls in rapid succession.
>
> I'm trying to keep the I/O side separate from the application side. I don't
> particularly want recvmsg (on the app side) reaching into the rxrpc_connection
> struct (on the I/O side).
>
> Further, by only looking at the rxrpc_call struct, I don't have to deal with
> locking required for the possibility that the next call on that channel will
> start before I've finished with this one (say an incoming call is aborted and
> immediately followed up by the first packet of the next call).
There could only be one rxrpc_call structure at a time referring to the
rxrpc_channel. If rxrpc_input_packet_on_conn() receives a DATA packet for a
later call number and there is a prior rxrpc_call assigned to rxrpc_channel.call
then an RX_PACKET_TYPE_BUSY should be sent to the peer if the rx_call cannot be
terminated without blocking the processing of incoming packets. The Rx BUSY
informs the initiating peer that the DATA packet has been dropped and that it
should be retransmitted (which will occur anyway due to the lack of an Rx ACK
packet).
The busy call channel logic is out of scope for this change and avoiding
per-call allocation churn is an optimization that can be implemented another
day if desired.
>
>>> + size_t size = umin(round_up(sp->len, 32), 2048);
>>
>> I think you meant to use max() here so that a minimum of 2048 bytes
>> is allocated.
>
> Yeah.
>
>> I think applying a cap on the allocation size would also be
>> beneficial. IBM/Transarc derived Rx implementations have a hard
>> upper-bound of 21180 (15 x 1412) bytes plus one 28 byte rx header.
>> Applying a cap of 32KiB seems prudent.
>
> This would need checking earlier in the input path. A DATA packet that's too
> large would need to be rejected as it comes off of the UDP socket if we're not
> going to be able to unpack it later.
ok.
>
>> It is also worth noting that there are no current implementations
>> of Rx RPC which will send individual Rx DATA packets larger than
>> 1444 bytes including the Rx header. Rx RESPONSE packets can be sent
>> as large as 16384 bytes (including the Rx header). However, it is
>> extremely unlikely that this buffer once allocated would ever need
>> to be grown.
>
> For Rx RESPONSE packets, I'm fine with allocating a buffer on the spur of the
> moment and freeing it immediately. Ideally, there would only be one RESPONSE
> per connection anyway. I could do a static buffer with a lock, I suppose, to
> make sure I can process the things under memory pressure-based writeback.
I agree. The Rx RESPONSE packets belong to the rxrpc_connection and not to
any particular call. It is possible for there to be more than one Rx RESPONSE
packet received due to Rx CHALLENGE retransmission or because this peer decided
it can no longer trust the prior authentication and sends a new Rx CHALLENGE.
However, RESPONSE packets are few and far between. There is no benefit to
adding another 16KB to the rxrpc_connection allocation.
>
>>> + kfree(call->rx_dec_buffer);
>>
>> It might be better to avoid deallocating the buffer on the error
>> path and permit it to be freed during normal call (or call channel)
>> deallocation.
>
> Hmmm. But I then need some other way to note that the buffer is no longer
> occupied by valid data. I suppose I could set ->rx_dec_offset to USHRT_MAX.
That meaning was unclear and would warrant a code comment explaining that
this is why the allocation is being freed.
> David
>
Thanks.
Jeffrey Altman
end of thread, other threads:[~2026-05-13 9:49 UTC | newest]
Thread overview: 15+ messages
2026-05-11 16:07 [PATCH net 0/3] rxrpc: Better fix for DATA/RESPONSE decrypt vs splice() David Howells
2026-05-11 16:07 ` [PATCH net 1/3] rxrpc: Also unshare DATA/RESPONSE packets when paged frags are present David Howells
2026-05-11 16:07 ` [PATCH net 2/3] rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg David Howells
2026-05-12 7:58 ` Jeffrey Altman
2026-05-13 8:01 ` David Howells
2026-05-13 8:13 ` David Howells
2026-05-13 8:38 ` David Laight
2026-05-13 9:48 ` Jeffrey Altman
2026-05-12 13:38 ` David Laight
2026-05-12 16:52 ` David Howells
2026-05-12 21:36 ` David Laight
2026-05-11 16:07 ` [PATCH net 3/3] rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer David Howells
2026-05-12 8:22 ` Jeffrey Altman
2026-05-13 0:06 ` Jakub Kicinski
2026-05-13 7:35 ` David Howells