public inbox for linux-kernel@vger.kernel.org
* [PATCH] RDMA/rxe: reject non-8-byte ATOMIC_WRITE payloads
@ 2026-04-18 16:21 Michael Bommarito
  2026-04-18 22:49 ` Zhu Yanjun
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Bommarito @ 2026-04-18 16:21 UTC
  To: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky
  Cc: Xiao Yang, linux-rdma, linux-kernel

atomic_write_reply() at drivers/infiniband/sw/rxe/rxe_resp.c
unconditionally dereferences 8 bytes at payload_addr(pkt):

    value = *(u64 *)payload_addr(pkt);

check_rkey() previously accepted an ATOMIC_WRITE request with
pktlen == resid == 0 because the length validation only compared
pktlen against resid. A remote initiator that sets the RETH
length to 0 therefore reaches atomic_write_reply() with a
zero-byte logical payload, and the responder reads sizeof(u64)
bytes from past the logical end of the packet into skb->head
tailroom, then writes those 8 bytes into the attacker's MR via
rxe_mr_do_atomic_write(). That is a remote disclosure of 4 bytes
of kernel tailroom per probe (the other 4 bytes are the packet's
own trailing ICRC).

IBA oA19-28 defines ATOMIC_WRITE as exactly 8 bytes. Anything
else is protocol-invalid. Hoist a strict length check into
check_rkey() so the responder never reaches the unchecked
dereference, and keep the existing WRITE-family length logic for
the normal RDMA WRITE path.

Reproduced on mainline with an unmodified rxe driver: a
sustained zero-length ATOMIC_WRITE probe repeatedly leaks
adjacent skb head-buffer bytes into the attacker's MR,
including recognisable kernel strings and partial
kernel-direct-map pointer words.  With this patch applied the
responder rejects the PDU and the MR stays all-zero.

Fixes: 034e285f8b99 ("RDMA/rxe: Make responder support atomic write on RC service")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
Previously reported to security@ (2026-04-18); reposting
publicly at the maintainer's request.

Per-probe evidence from a 100K-packet run on the clean
unpatched tree at 9ca18fc915c6 (single attacker QP against a
hairpin target QP over a veth pair; each probe one crafted
zero-length ATOMIC_WRITE PDU):

    transmitted packets:          100,000
    observed MR writes:            48,575
    non-zero leaked tails:         33,297  (68.55% of observed writes)
    mostly-printable tails:         3,796  (7.81%)
    fully-printable tails:          2,241  (4.61%)
    unique non-zero tails:         22,220
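
As a sanity check, the percentages in the table above can be recomputed
from the raw counts. The sketch below is illustrative only and is not
part of the harness; the helper names are hypothetical. It also shows
that the gap between the mostly-printable and fully-printable counts
(3,796 - 2,241 = 1,555 tails) comes to about 3.20% of observed writes,
which appears to be how the two printable figures in the content
distribution relate to the table.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: share of `part` in `whole`, as a percentage. */
static double pct(unsigned long part, unsigned long whole)
{
	assert(whole != 0);
	return 100.0 * (double)part / (double)whole;
}

/* Recompute the table's ratios, all relative to observed MR writes. */
static void print_tail_ratios(void)
{
	const unsigned long observed = 48575;  /* observed MR writes */
	const unsigned long nonzero = 33297;   /* non-zero leaked tails */
	const unsigned long mostly = 3796;     /* mostly-printable tails */
	const unsigned long fully = 2241;      /* fully-printable tails */

	printf("non-zero:          %.2f%%\n", pct(nonzero, observed));
	printf("mostly printable:  %.2f%%\n", pct(mostly, observed));
	printf("fully printable:   %.2f%%\n", pct(fully, observed));
	/* Mostly but not fully printable: 3796 - 2241 = 1555 tails. */
	printf("mostly, not fully: %.2f%%\n", pct(mostly - fully, observed));
}
```

Calling print_tail_ratios() from a small main() prints the four ratios
to two decimal places for comparison with the figures quoted here.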

Each probe is a fresh skb head-buffer allocation, so the 4
attacker-visible bytes after the ICRC are an independent
sample of slab-adjacent memory.  Content distribution across
the 48,575 observed writes: 31.45% zero, 4.61% fully
printable, 3.20% mostly printable, 12.06% header/sentinel-
looking (08004500, 08004508, ffffffff, ...), 48.68% other
binary.  80.9% of unique non-zero tails were singletons, so
the leak is not dominated by one repeated value.
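
The "4 attacker-visible bytes after the ICRC" can be modelled in user
space. The sketch below is not driver code: it assumes, as described in
the commit message, that a zero-length payload is followed directly by
the 4-byte ICRC and then by whatever happens to sit in the buffer. The
names bytes_past_packet() and leaked_tail() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ICRC_LEN 4u   /* the ICRC trailer is 4 bytes */

/*
 * Number of bytes that an unconditional 8-byte load at the payload
 * address takes from beyond the packet (payload + ICRC).
 */
static size_t bytes_past_packet(size_t payload_len)
{
	size_t in_packet = payload_len + ICRC_LEN;

	return in_packet >= sizeof(uint64_t) ? 0 : sizeof(uint64_t) - in_packet;
}

/*
 * Model the zero-length case: `payload` points at the empty payload, so
 * bytes 0..3 of the load are the ICRC and bytes 4..7 are what follows.
 * memcpy() stands in for the *(u64 *) dereference to avoid alignment
 * and aliasing problems in user space.
 */
static void leaked_tail(const uint8_t *payload, uint8_t tail[4])
{
	uint64_t value;
	uint8_t raw[sizeof(value)];

	assert(payload != NULL && tail != NULL);
	memcpy(&value, payload, sizeof(value));  /* the 8-byte load */
	memcpy(raw, &value, sizeof(raw));        /* what would land in the MR */
	memcpy(tail, raw + ICRC_LEN, 4);         /* the bytes after the ICRC */
}
```

With an exact 8-byte payload the same load stays inside the payload, so
bytes_past_packet(8) is 0; with a zero-length payload it is 4.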

Representative printable fragments observed on the attacker
side:

    74 6f 70 2e   "top."
    66 72 65 65   "free"
    45 78 65 63   "Exec"
    2f 73 79 73   "/sys"
    72 6f 6f 74   "root"
    45 56 50 41   "EVPA"
    43 4f 44 45   "CODE"

Partial pointer-like recoveries (4-byte words ending in the
kernel-direct-map prefix 0xffff....):

    3,361 observations ending in ffff
    1,364 unique ....ffff tails
    most common:
      81 88 ff ff   LE 0xffff8881   1.68% of observed writes
      80 88 ff ff   LE 0xffff8880   0.22%
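
For readers checking the byte order: the tails above are listed in
memory order and read as little-endian 32-bit words. A minimal sketch of
that decoding follows; the helper names are hypothetical and the code is
not taken from the harness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit word from 4 bytes stored least-significant byte first. */
static uint32_t le32_from_bytes(const uint8_t b[4])
{
	assert(b != NULL);
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* True if the top 16 bits are 0xffff, i.e. the tail ends in "ff ff". */
static int ends_in_ffff(uint32_t word)
{
	return (word >> 16) == 0xffffu;
}
```

Feeding it the bytes 81 88 ff ff gives the word 0xffff8881, matching the
"LE" column above, while a printable tail such as "root" does not pass
the ends_in_ffff() check.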

The run did not recover full 64-bit kernel virtual addresses
(only 4 bytes per probe are attacker-observable), but the
partial pointer material is consistent with a KASLR-weakening
primitive under sustained probing.  With the fix applied, the
same harness leaves the attacker MR all-zero.
---
 drivers/infiniband/sw/rxe/rxe_resp.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 711f73e0bbb1..09ba21d0f3c4 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -526,7 +526,19 @@ static enum resp_states check_rkey(struct rxe_qp *qp,
 	}
 
 skip_check_range:
-	if (pkt->mask & (RXE_WRITE_MASK | RXE_ATOMIC_WRITE_MASK)) {
+	if (pkt->mask & RXE_ATOMIC_WRITE_MASK) {
+		/* IBA oA19-28: ATOMIC_WRITE payload is exactly 8 bytes.
+		 * Reject any other length before the responder reads
+		 * sizeof(u64) bytes from payload_addr(pkt); a shorter
+		 * payload would read past the logical end of the packet
+		 * into skb->head tailroom.
+		 */
+		if (resid != sizeof(u64) || pktlen != sizeof(u64) ||
+		    bth_pad(pkt)) {
+			state = RESPST_ERR_LENGTH;
+			goto err;
+		}
+	} else if (pkt->mask & RXE_WRITE_MASK) {
 		if (resid > mtu) {
 			if (pktlen != mtu || bth_pad(pkt)) {
 				state = RESPST_ERR_LENGTH;
-- 
2.53.0



* Re: [PATCH] RDMA/rxe: reject non-8-byte ATOMIC_WRITE payloads
  2026-04-18 16:21 [PATCH] RDMA/rxe: reject non-8-byte ATOMIC_WRITE payloads Michael Bommarito
@ 2026-04-18 22:49 ` Zhu Yanjun
  2026-04-18 23:11   ` Michael Bommarito
  0 siblings, 1 reply; 5+ messages in thread
From: Zhu Yanjun @ 2026-04-18 22:49 UTC
  To: Michael Bommarito, Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky,
	yanjun.zhu@linux.dev
  Cc: Xiao Yang, linux-rdma, linux-kernel

On 2026/4/18 9:21, Michael Bommarito wrote:
> atomic_write_reply() at drivers/infiniband/sw/rxe/rxe_resp.c
> unconditionally dereferences 8 bytes at payload_addr(pkt):
> 
>      value = *(u64 *)payload_addr(pkt);
> 
> check_rkey() previously accepted an ATOMIC_WRITE request with
> pktlen == resid == 0 because the length validation only compared
> pktlen against resid. A remote initiator that sets the RETH
> length to 0 therefore reaches atomic_write_reply() with a
> zero-byte logical payload, and the responder reads sizeof(u64)
> bytes from past the logical end of the packet into skb->head
> tailroom, then writes those 8 bytes into the attacker's MR via
> rxe_mr_do_atomic_write(). That is a remote disclosure of 4 bytes
> of kernel tailroom per probe (the other 4 bytes are the packet's
> own trailing ICRC).
> 
> IBA oA19-28 defines ATOMIC_WRITE as exactly 8 bytes. Anything
> else is protocol-invalid. Hoist a strict length check into
> check_rkey() so the responder never reaches the unchecked
> dereference, and keep the existing WRITE-family length logic for
> the normal RDMA WRITE path.
> 
> Reproduced on mainline with an unmodified rxe driver: a
> sustained zero-length ATOMIC_WRITE probe repeatedly leaks
> adjacent skb head-buffer bytes into the attacker's MR,
> including recognisable kernel strings and partial
> kernel-direct-map pointer words.  With this patch applied the
> responder rejects the PDU and the MR stays all-zero.
> 
> Fixes: 034e285f8b99 ("RDMA/rxe: Make responder support atomic write on RC service")
> Cc: stable@vger.kernel.org
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
> ---
> Previously reported to security@ (2026-04-18); reposting
> publicly at the maintainer's request.
> 
> Per-probe evidence from a 100K-packet run on the clean
> unpatched tree at 9ca18fc915c6 (single attacker QP against a
> hairpin target QP over a veth pair; each probe one crafted
> zero-length ATOMIC_WRITE PDU):
> 
>      transmitted packets:          100,000
>      observed MR writes:            48,575
>      non-zero leaked tails:         33,297  (68.55% of observed writes)
>      mostly-printable tails:         3,796  (7.81%)
>      fully-printable tails:          2,241  (4.61%)
>      unique non-zero tails:         22,220
> 
> Each probe is a fresh skb head-buffer allocation, so the 4
> attacker-visible bytes after the ICRC are an independent
> sample of slab-adjacent memory.  Content distribution across
> the 48,575 observed writes: 31.45% zero, 4.61% fully
> printable, 3.20% mostly printable, 12.06% header/sentinel-
> looking (08004500, 08004508, ffffffff, ...), 48.68% other
> binary.  80.9% of unique non-zero tails were singletons, so
> the leak is not dominated by one repeated value.
> 
> Representative printable fragments observed on the attacker
> side:
> 
>      74 6f 70 2e   "top."
>      66 72 65 65   "free"
>      45 78 65 63   "Exec"
>      2f 73 79 73   "/sys"
>      72 6f 6f 74   "root"
>      45 56 50 41   "EVPA"
>      43 4f 44 45   "CODE"
> 
> Partial pointer-like recoveries (4-byte words ending in the
> kernel-direct-map prefix 0xffff....):
> 
>      3,361 observations ending in ffff
>      1,364 unique ....ffff tails
>      most common:
>        81 88 ff ff   LE 0xffff8881   1.68% of observed writes
>        80 88 ff ff   LE 0xffff8880   0.22%
> 
> The run did not recover full 64-bit kernel virtual addresses
> (only 4 bytes per probe are attacker-observable), but the
> partial pointer material is consistent with a KASLR-weakening
> primitive under sustained probing.  With the fix applied, the
> same harness leaves the attacker MR all-zero.

Thanks a lot. It would be great to have a corresponding negative test in 
tools/testing/selftests/rdma that sends malformed ATOMIC_WRITE requests 
(e.g., zero-length) and verifies that they are rejected and do not 
modify the target MR.
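
The cases such a negative test might cover can be sketched in user
space. This is only a model of the length rule from the patch, not a
kernel selftest; the function name atomic_write_len_ok() and its
parameter types are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ATOMIC_WRITE_LEN 8u   /* sizeof(u64) in the patch */

/*
 * Hypothetical user-space mirror of the condition added to check_rkey():
 * accept only when the RETH length (resid) and the payload length
 * (pktlen) are both 8 and the BTH pad count is 0.
 */
static bool atomic_write_len_ok(uint32_t resid, uint32_t pktlen, uint8_t pad)
{
	assert(pad <= 3);   /* the BTH pad count is a 2-bit field */
	return resid == ATOMIC_WRITE_LEN && pktlen == ATOMIC_WRITE_LEN &&
	       pad == 0;
}
```

Under this rule the zero-length request (resid = 0, pktlen = 0) is
rejected, as are a padded 8-byte payload and any mismatch between resid
and pktlen; only (8, 8, 0) is accepted.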

It is up to you. I am fine with this commit as it is.
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun




* Re: [PATCH] RDMA/rxe: reject non-8-byte ATOMIC_WRITE payloads
  2026-04-18 22:49 ` Zhu Yanjun
@ 2026-04-18 23:11   ` Michael Bommarito
       [not found]     ` <1bd36ce7-e3dd-4ff5-867a-b8b9ade90a1e@linux.dev>
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Bommarito @ 2026-04-18 23:11 UTC
  To: Zhu Yanjun
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, Xiao Yang,
	linux-rdma, linux-kernel

On Sat, Apr 18, 2026 at 6:49 PM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
> Thanks a lot. It would be great to have a corresponding negative test in
> tools/testing/selftests/rdma that sends malformed ATOMIC_WRITE requests
> (e.g., zero-length) and verifies that they are rejected and do not
> modify the target MR.

Good idea.  Do you want a v2 with this fix + a separate test case in
rdma or should I submit it separately?

Thanks,
Mike Bommarito


* Re: [PATCH] RDMA/rxe: reject non-8-byte ATOMIC_WRITE payloads
       [not found]     ` <1bd36ce7-e3dd-4ff5-867a-b8b9ade90a1e@linux.dev>
@ 2026-04-19  1:57       ` Michael Bommarito
  2026-04-19  3:34         ` Zhu Yanjun
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Bommarito @ 2026-04-19  1:57 UTC
  To: Zhu Yanjun
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, Xiao Yang,
	linux-rdma, linux-kernel

On Sat, Apr 18, 2026 at 7:18 PM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
> The fix should stand on its own, and the test case can be added as a
> follow-up patch under tools/testing/selftests/rdma (or rdma-core if more appropriate).
> This keeps each patch focused on a single logical change and makes review easier.

It seems like adding RDMA tests to the mainline tree might not be a good
idea.  The last attempt I could find was:
Kamal Heib, 23 Oct 2019 — [PATCH for-next] selftests: rdma: Add rdma tests

I put the test here and will open a PR against linux-rdma/rdma-core tomorrow:
https://github.com/mjbommar/rdma-core/blob/4104d991a764de4aad9f645a2f0ec723f6076209/tests/test_rxe_atomic_write_oob.py

Thanks,
Mike Bommarito


* Re: [PATCH] RDMA/rxe: reject non-8-byte ATOMIC_WRITE payloads
  2026-04-19  1:57       ` Michael Bommarito
@ 2026-04-19  3:34         ` Zhu Yanjun
  0 siblings, 0 replies; 5+ messages in thread
From: Zhu Yanjun @ 2026-04-19  3:34 UTC
  To: Michael Bommarito, yanjun.zhu@linux.dev
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, Xiao Yang,
	linux-rdma, linux-kernel


On 2026/4/18 18:57, Michael Bommarito wrote:
> On Sat, Apr 18, 2026 at 7:18 PM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>> The fix should stand on its own, and the test case can be added as a
>> follow-up patch under tools/testing/selftests/rdma (or rdma-core if more appropriate).
>> This keeps each patch focused on a single logical change and makes review easier.
> It seems like adding RDMA tests to the mainline tree might not be a good
> idea.  The last attempt I could find was:
> Kamal Heib, 23 Oct 2019 — [PATCH for-next] selftests: rdma: Add rdma tests
>
> I put the test here and will open a PR against linux-rdma/rdma-core tomorrow:
> https://github.com/mjbommar/rdma-core/blob/4104d991a764de4aad9f645a2f0ec723f6076209/tests/test_rxe_atomic_write_oob.py

Thanks a lot. I am fine with the above link.

Zhu Yanjun

>
> Thanks,
> Mike Bommarito

-- 
Best Regards,
Yanjun.Zhu


