* [PATCH rdma-next] RDMA/core: Fix resolve_prepare_src error cleanup
@ 2022-12-11 9:08 Leon Romanovsky
2022-12-12 13:27 ` Jason Gunthorpe
0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2022-12-11 9:08 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Patrisious Haddad, linux-rdma, Maor Gottlieb, Wei Chen
From: Patrisious Haddad <phaddad@nvidia.com>
resolve_prepare_src() changes the destination address of the id
regardless of success, and on failure zeroes it out.
Instead, on function failure keep the original destination address
of the id.
Since the id could already have been added to the cm id tree, zeroing
its destination address could result in a key mismatch, or in multiple
ids having the same (zero) key in the tree, which could lead to:
BUG: KASAN: slab-out-of-bounds in compare_netdev_and_ip.isra.0+0x25/0x120 drivers/infiniband/core/cma.c:473
Read of size 4 at addr ffff88800920d9e4 by task syz-executor.4/14865
CPU: 2 PID: 14865 Comm: syz-executor.4 Not tainted 5.19.0-rc2 #21
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:313 [inline]
print_report.cold+0xff/0x68e mm/kasan/report.c:429
kasan_report+0xa8/0x130 mm/kasan/report.c:491
compare_netdev_and_ip.isra.0+0x25/0x120 drivers/infiniband/core/cma.c:473
cma_add_id_to_tree drivers/infiniband/core/cma.c:506 [inline]
rdma_resolve_route+0xbc4/0x1560 drivers/infiniband/core/cma.c:3303
ucma_resolve_route+0xe4/0x150 drivers/infiniband/core/ucma.c:746
ucma_write+0x19f/0x280 drivers/infiniband/core/ucma.c:1744
vfs_write fs/read_write.c:589 [inline]
vfs_write+0x181/0x580 fs/read_write.c:571
ksys_write+0x1a1/0x1e0 fs/read_write.c:644
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f2afbe8c89d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f2afcf1ec28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f2afbfabf60 RCX: 00007f2afbe8c89d
RDX: 0000000000000010 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00007f2afbef910d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffcdc77c66f R14: 00007f2afbfabf60 R15: 00007f2afcf1edc0
</TASK>
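[Editor's illustration, not part of the original patch mail.] The corruption described above, a tree node whose key is zeroed in place while the node is still linked, can be reproduced with a small userspace sketch. The types and the `demo_key_corruption()` name below are hypothetical stand-ins; POSIX `tsearch()`/`tfind()` stand in for the kernel rb-tree and `cmp_dst()` for `compare_netdev_and_ip()`:

```c
#include <search.h>	/* POSIX tsearch()/tfind(): an ordered binary tree */
#include <string.h>

/* Hypothetical stand-in for a cm id: the tree is keyed by the
 * destination address, like the cm id tree keyed via cma_dst_addr(). */
struct fake_id {
	unsigned char dst[4];	/* pretend IPv4 destination address */
};

static int cmp_dst(const void *a, const void *b)
{
	return memcmp(((const struct fake_id *)a)->dst,
		      ((const struct fake_id *)b)->dst, 4);
}

/* Returns 1 if zeroing the key of an inserted node makes it
 * unreachable by its original key, i.e. the corruption occurred. */
int demo_key_corruption(void)
{
	void *root = NULL;
	struct fake_id id = { .dst = { 10, 0, 0, 1 } };
	struct fake_id probe = { .dst = { 10, 0, 0, 1 } };

	tsearch(&id, &root, cmp_dst);		/* insert keyed on 10.0.0.1 */
	if (!tfind(&probe, &root, cmp_dst))	/* lookup works while intact */
		return 0;

	/* The buggy error path zeroes the key while the node is still
	 * linked into the tree... */
	memset(id.dst, 0, sizeof(id.dst));

	/* ...so the ordered search descends the wrong way and the node
	 * can no longer be found under its old key. */
	return tfind(&probe, &root, cmp_dst) == NULL;
}
```

The same invariant violation is what KASAN catches above: searches keyed on a mutated field walk into nodes they were never meant to compare against.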
Fixes: 19b752a19dce ("IB/cma: Allow port reuse for rdma_id")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Reported-by: Wei Chen <harperchen1110@gmail.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/infiniband/core/cma.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 1fca0a24f30f..2d4c391e36a9 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -3584,8 +3584,11 @@ static int resolve_prepare_src(struct rdma_id_private *id_priv,
 			       struct sockaddr *src_addr,
 			       const struct sockaddr *dst_addr)
 {
+	struct sockaddr_storage org_addr = {};
 	int ret;
 
+	memcpy(&org_addr, cma_dst_addr(id_priv),
+	       rdma_addr_size(cma_dst_addr(id_priv)));
 	memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr));
 	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY)) {
 		/* For a well behaved ULP state will be RDMA_CM_IDLE */
@@ -3608,7 +3611,7 @@ static int resolve_prepare_src(struct rdma_id_private *id_priv,
 err_state:
 	cma_comp_exch(id_priv, RDMA_CM_ADDR_QUERY, RDMA_CM_ADDR_BOUND);
 err_dst:
-	memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr));
+	memcpy(cma_dst_addr(id_priv), &org_addr, rdma_addr_size((struct sockaddr *)&org_addr));
 	return ret;
 }
--
2.38.1
* Re: [PATCH rdma-next] RDMA/core: Fix resolve_prepare_src error cleanup
From: Jason Gunthorpe @ 2022-12-12 13:27 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Patrisious Haddad, linux-rdma, Maor Gottlieb, Wei Chen

On Sun, Dec 11, 2022 at 11:08:30AM +0200, Leon Romanovsky wrote:
> From: Patrisious Haddad <phaddad@nvidia.com>
>
> resolve_prepare_src() changes the destination address of the id
> regardless of success, and on failure zeroes it out.
>
> Instead, on function failure keep the original destination address
> of the id.
>
> Since the id could already have been added to the cm id tree, zeroing
> its destination address could result in a key mismatch, or in multiple
> ids having the same (zero) key in the tree, which could lead to:

Oh, this can't be right.

The destination address is variable and it is changed by resolve even
in good cases. So this part of the rb search is nonsense:

	result = compare_netdev_and_ip(
		node_id_priv->id.route.addr.dev_addr.bound_dev_if,
		cma_dst_addr(node_id_priv), this);

The only way to fix it is to freeze the dst_addr before inserting
things into the rb tree, ie completely block resolve_prepare_src().

Most probably this suggests that the id is being inserted into the
rbtree at the wrong time, before the dst_addr becomes unchangeable.

Jason
* Re: [PATCH rdma-next] RDMA/core: Fix resolve_prepare_src error cleanup
From: Patrisious Haddad @ 2022-12-12 13:42 UTC (permalink / raw)
To: Jason Gunthorpe, Leon Romanovsky; +Cc: linux-rdma, Maor Gottlieb, Wei Chen

I think we have the same realization but a different understanding of
the code; please correct what I'm missing, rest inline:

On 12/12/2022 15:27, Jason Gunthorpe wrote:
> On Sun, Dec 11, 2022 at 11:08:30AM +0200, Leon Romanovsky wrote:
>> From: Patrisious Haddad <phaddad@nvidia.com>
>>
>> resolve_prepare_src() changes the destination address of the id
>> regardless of success, and on failure zeroes it out.
>>
>> Instead, on function failure keep the original destination address
>> of the id.
>>
>> Since the id could already have been added to the cm id tree, zeroing
>> its destination address could result in a key mismatch, or in multiple
>> ids having the same (zero) key in the tree, which could lead to:
>
> Oh, this can't be right.
>
> The destination address is variable and it is changed by resolve even
> in good cases.

This is what I don't think can happen: once an address is resolved
(bound), it can't be bound again, so any further attempt to resolve
would fail and enter the error flow which I just fixed.

> So this part of the rb search is nonsense:
>
> 	result = compare_netdev_and_ip(
> 		node_id_priv->id.route.addr.dev_addr.bound_dev_if,
> 		cma_dst_addr(node_id_priv), this);
>
> The only way to fix it is to freeze the dst_addr before inserting
> things into the rb tree.

I completely agree, and this was my assumption: after resolve address
and resolve route (where I add to the tree), the dst_addr is frozen.
The only scenario where it isn't was the resolve_prepare_src() failure,
which for some reason nullified the value instead of keeping the
original. And what I'm trying to say is that once the cm id is added to
the tree (i.e. it has passed resolve addr once, plus resolve route),
there can't be a good (success) case for resolve_prepare_src() again,
since it is already bound, so every consecutive call should fail,
meaning cma_dst_addr() is effectively frozen.

> ie completely block resolve_prepare_src()
>
> Most probably this suggests that the id is being inserted into the
> rbtree at the wrong time, before the dst_addr becomes unchangeable.
>
> Jason
* Re: [PATCH rdma-next] RDMA/core: Fix resolve_prepare_src error cleanup
From: Jason Gunthorpe @ 2022-12-12 13:43 UTC (permalink / raw)
To: Patrisious Haddad; +Cc: Leon Romanovsky, linux-rdma, Maor Gottlieb, Wei Chen

On Mon, Dec 12, 2022 at 03:42:07PM +0200, Patrisious Haddad wrote:
> I think we have the same realization but a different understanding of
> the code; please correct what I'm missing, rest inline:
>
> On 12/12/2022 15:27, Jason Gunthorpe wrote:
>> On Sun, Dec 11, 2022 at 11:08:30AM +0200, Leon Romanovsky wrote:
>>> From: Patrisious Haddad <phaddad@nvidia.com>
>>>
>>> resolve_prepare_src() changes the destination address of the id
>>> regardless of success, and on failure zeroes it out.
>>>
>>> Instead, on function failure keep the original destination address
>>> of the id.
>>>
>>> Since the id could already have been added to the cm id tree, zeroing
>>> its destination address could result in a key mismatch, or in multiple
>>> ids having the same (zero) key in the tree, which could lead to:
>>
>> Oh, this can't be right.
>>
>> The destination address is variable and it is changed by resolve even
>> in good cases.
>
> This is what I don't think can happen: once an address is resolved
> (bound), it can't be bound again, so any further attempt to resolve
> would fail and enter the error flow which I just fixed.
>
>> So this part of the rb search is nonsense:
>>
>> 	result = compare_netdev_and_ip(
>> 		node_id_priv->id.route.addr.dev_addr.bound_dev_if,
>> 		cma_dst_addr(node_id_priv), this);
>>
>> The only way to fix it is to freeze the dst_addr before inserting
>> things into the rb tree.
>
> I completely agree, and this was my assumption: after resolve address
> and resolve route (where I add to the tree), the dst_addr is frozen.
> The only scenario where it isn't was the resolve_prepare_src() failure,
> which for some reason nullified the value instead of keeping the
> original.

Then fix the control flow so it doesn't do the nullification if it
didn't change the value.

You can't just change it while it is in the rb tree, that is racy.

Jason
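[Editor's illustration.] The save-and-restore cleanup shape being debated here can be sketched in isolation. The types and `prepare_dst()` below are simplified hypothetical stand-ins, not the cma.c functions; the comment notes why Jason still calls the approach racy:

```c
#include <string.h>

struct addr {
	unsigned char bytes[16];
};

/* Hypothetical simplification of the patch's idea: snapshot the
 * destination before overwriting it, and on failure roll back to the
 * snapshot rather than memset()ing it to zero. Note the field is still
 * written twice, which is the race: a concurrent tree search can
 * observe the transient value between the two writes. */
int prepare_dst(struct addr *dst, const struct addr *new_dst,
		int simulate_failure)
{
	struct addr saved = *dst;	/* snapshot the original */

	*dst = *new_dst;		/* speculative overwrite */
	if (simulate_failure) {
		*dst = saved;		/* restore, don't zero */
		return -1;
	}
	return 0;
}
```

This preserves the caller-visible value across a failed call, but only fixing the control flow (never touching the field unless the call will succeed) removes the race entirely.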
* Re: [PATCH rdma-next] RDMA/core: Fix resolve_prepare_src error cleanup
From: Patrisious Haddad @ 2022-12-12 13:55 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Maor Gottlieb, Wei Chen

Agree, but changing the control flow of this function is really
problematic; it was even tried before, if you remember commit
e4103312d7b7a, and it had something to do with port allocation. I'll
take another look over the code to see what other options we have,
though, since in short you are right: it is racy now.

On 12/12/2022 15:43, Jason Gunthorpe wrote:
> On Mon, Dec 12, 2022 at 03:42:07PM +0200, Patrisious Haddad wrote:
>> I think we have the same realization but a different understanding of
>> the code; please correct what I'm missing, rest inline:
>>
>> On 12/12/2022 15:27, Jason Gunthorpe wrote:
>>> On Sun, Dec 11, 2022 at 11:08:30AM +0200, Leon Romanovsky wrote:
>>>> From: Patrisious Haddad <phaddad@nvidia.com>
>>>>
>>>> resolve_prepare_src() changes the destination address of the id
>>>> regardless of success, and on failure zeroes it out.
>>>>
>>>> Instead, on function failure keep the original destination address
>>>> of the id.
>>>>
>>>> Since the id could already have been added to the cm id tree, zeroing
>>>> its destination address could result in a key mismatch, or in multiple
>>>> ids having the same (zero) key in the tree, which could lead to:
>>>
>>> Oh, this can't be right.
>>>
>>> The destination address is variable and it is changed by resolve even
>>> in good cases.
>>
>> This is what I don't think can happen: once an address is resolved
>> (bound), it can't be bound again, so any further attempt to resolve
>> would fail and enter the error flow which I just fixed.
>>
>>> So this part of the rb search is nonsense:
>>>
>>> 	result = compare_netdev_and_ip(
>>> 		node_id_priv->id.route.addr.dev_addr.bound_dev_if,
>>> 		cma_dst_addr(node_id_priv), this);
>>>
>>> The only way to fix it is to freeze the dst_addr before inserting
>>> things into the rb tree.
>>
>> I completely agree, and this was my assumption: after resolve address
>> and resolve route (where I add to the tree), the dst_addr is frozen.
>> The only scenario where it isn't was the resolve_prepare_src() failure,
>> which for some reason nullified the value instead of keeping the
>> original.
>
> Then fix the control flow so it doesn't do the nullification if it
> didn't change the value.
>
> You can't just change it while it is in the rb tree, that is racy.
>
> Jason
* Re: [PATCH rdma-next] RDMA/core: Fix resolve_prepare_src error cleanup
From: Jason Gunthorpe @ 2022-12-12 14:00 UTC (permalink / raw)
To: Patrisious Haddad; +Cc: Leon Romanovsky, linux-rdma, Maor Gottlieb, Wei Chen

On Mon, Dec 12, 2022 at 03:55:37PM +0200, Patrisious Haddad wrote:
> Agree, but changing the control flow of this function is really
> problematic; it was even tried before, if you remember commit
> e4103312d7b7a, and it had something to do with port allocation. I'll
> take another look over the code to see what other options we have,
> though.

Yes, we've changed this function many times because it is badly
mis-designed.

Jason
* Re: [PATCH rdma-next] RDMA/core: Fix resolve_prepare_src error cleanup
From: Patrisious Haddad @ 2022-12-12 14:06 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Maor Gottlieb, Wei Chen

Btw, there is the obvious easy-but-ugly fix, which would be this patch
plus locking this function with the tree spinlock (to avoid any race).
I'll check, however, whether there is hope for a better design for this
function.

On 12/12/2022 16:00, Jason Gunthorpe wrote:
> On Mon, Dec 12, 2022 at 03:55:37PM +0200, Patrisious Haddad wrote:
>> Agree, but changing the control flow of this function is really
>> problematic; it was even tried before, if you remember commit
>> e4103312d7b7a, and it had something to do with port allocation. I'll
>> take another look over the code to see what other options we have,
>> though.
>
> Yes, we've changed this function many times because it is badly
> mis-designed.
>
> Jason
* Re: [PATCH rdma-next] RDMA/core: Fix resolve_prepare_src error cleanup
From: Jason Gunthorpe @ 2022-12-12 20:15 UTC (permalink / raw)
To: Patrisious Haddad; +Cc: Leon Romanovsky, linux-rdma, Maor Gottlieb, Wei Chen

On Mon, Dec 12, 2022 at 04:06:03PM +0200, Patrisious Haddad wrote:
> Btw, there is the obvious easy-but-ugly fix, which would be this patch
> plus locking this function with the tree spinlock (to avoid any race).
>
> I'll check, however, whether there is hope for a better design for this
> function.

The usual way I've fixed this is to avoid touching, in this case,
cma_dst_addr() in the call chain, eg we already pass in the correct
dst_addr.

What you've done is made it so that in RDMA_CM_ROUTE_QUERY and beyond
the CM id's dst cannot change.

The trick with this nasty code is that it is trying to trigger
auto-bind, and it has to do it blind because of bad code structure.

So, try something like this:

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 13e0ab785baa24..1d1f9cd01dd38f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -3547,7 +3547,7 @@ static int cma_bind_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
 	struct sockaddr_storage zero_sock = {};
 
 	if (src_addr && src_addr->sa_family)
-		return rdma_bind_addr(id, src_addr);
+		return rdma_bind_addr_dst(id, src_addr, dst_addr);
 
 	/*
 	 * When the src_addr is not specified, automatically supply an any addr
@@ -3567,7 +3567,7 @@ static int cma_bind_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
 		((struct sockaddr_ib *)&zero_sock)->sib_pkey =
 			((struct sockaddr_ib *)dst_addr)->sib_pkey;
 	}
-	return rdma_bind_addr(id, (struct sockaddr *)&zero_sock);
+	return rdma_bind_addr_dst(id, (struct sockaddr *)&zero_sock, dst_addr);
 }
 
 /*
@@ -3582,17 +3582,14 @@ static int resolve_prepare_src(struct rdma_id_private *id_priv,
 {
 	int ret;
 
-	memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr));
 	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY)) {
 		/* For a well behaved ULP state will be RDMA_CM_IDLE */
 		ret = cma_bind_addr(&id_priv->id, src_addr, dst_addr);
 		if (ret)
-			goto err_dst;
+			return ret;
 		if (WARN_ON(!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND,
-					   RDMA_CM_ADDR_QUERY))) {
-			ret = -EINVAL;
-			goto err_dst;
-		}
+					   RDMA_CM_ADDR_QUERY)))
+			return -EINVAL;
 	}
 
 	if (cma_family(id_priv) != dst_addr->sa_family) {
@@ -3603,8 +3600,6 @@ static int resolve_prepare_src(struct rdma_id_private *id_priv,
 
 err_state:
 	cma_comp_exch(id_priv, RDMA_CM_ADDR_QUERY, RDMA_CM_ADDR_BOUND);
-err_dst:
-	memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr));
 	return ret;
 }
 
@@ -4058,27 +4053,25 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
 }
 EXPORT_SYMBOL(rdma_listen);
 
-int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
+static int rdma_bind_addr_dst(struct rdma_id_private *id_priv,
+			      struct sockaddr *addr, struct sockaddr *daddr)
 {
-	struct rdma_id_private *id_priv;
 	int ret;
-	struct sockaddr *daddr;
 
 	if (addr->sa_family != AF_INET && addr->sa_family != AF_INET6 &&
 	    addr->sa_family != AF_IB)
 		return -EAFNOSUPPORT;
 
-	id_priv = container_of(id, struct rdma_id_private, id);
 	if (!cma_comp_exch(id_priv, RDMA_CM_IDLE, RDMA_CM_ADDR_BOUND))
 		return -EINVAL;
 
-	ret = cma_check_linklocal(&id->route.addr.dev_addr, addr);
+	ret = cma_check_linklocal(&id_priv->id.route.addr.dev_addr, addr);
 	if (ret)
 		goto err1;
 
 	memcpy(cma_src_addr(id_priv), addr, rdma_addr_size(addr));
 	if (!cma_any_addr(addr)) {
-		ret = cma_translate_addr(addr, &id->route.addr.dev_addr);
+		ret = cma_translate_addr(addr, &id_priv->id.route.addr.dev_addr);
 		if (ret)
 			goto err1;
 
@@ -4098,8 +4091,14 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 		}
 #endif
 	}
-	daddr = cma_dst_addr(id_priv);
+
+	/*
+	 * FIXME: This seems wrong, we can't just blindly replace the sa_family
+	 * unless we know the daddr is zero. It will corrupt it.
+	 */
 	daddr->sa_family = addr->sa_family;
+	if (daddr != cma_dst_addr(id_priv))
+		memcpy(cma_dst_addr(id_priv), daddr, rdma_addr_size(addr));
 
 	ret = cma_get_port(id_priv);
 	if (ret)
@@ -4115,6 +4114,14 @@ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 	cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_IDLE);
 	return ret;
 }
+
+int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
+{
+	struct rdma_id_private *id_priv =
+		container_of(id, struct rdma_id_private, id);
+
+	return rdma_bind_addr_dst(id_priv, addr, cma_dst_addr(id_priv));
+}
 EXPORT_SYMBOL(rdma_bind_addr);
 
 static int cma_format_hdr(void *hdr, struct rdma_id_private *id_priv)

Jason
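[Editor's illustration.] Boiled down, the refactor above passes the destination through the call chain and commits it to the shared object only after every fallible step succeeds. A minimal hypothetical sketch of that shape (all names invented, not the kernel API):

```c
#include <string.h>

struct caddr {
	unsigned char bytes[16];
};

struct cm_id {
	struct caddr dst;	/* shared state a tree may be keyed on */
};

/* Hypothetical fallible step; fails when told to. */
static int do_bind(struct cm_id *id, int fail)
{
	(void)id;
	return fail ? -1 : 0;
}

/* The suggested shape: id->dst is written exactly once, after all
 * fallible work, so an error path never has to undo or zero it and
 * concurrent readers never see a transient value. */
int bind_with_dst(struct cm_id *id, const struct caddr *dst, int fail)
{
	int ret = do_bind(id, fail);

	if (ret)
		return ret;	/* id->dst was never touched */
	id->dst = *dst;		/* commit only on success */
	return 0;
}
```

With this structure the error-path memset that started the thread simply has nothing to clean up.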