From: Jason Gunthorpe <jgg@nvidia.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	Chen Zhao <chezhao@nvidia.com>, Parav Pandit <parav@nvidia.com>
Subject: Re: [PATCH rdma-next] IB/core: Fix zero dmac race in neighbor resolution
Date: Thu, 9 Apr 2026 12:28:44 -0300
Message-ID: <20260409152844.GA1995590@nvidia.com>
In-Reply-To: <20260405-fix-dmac-race-v1-1-cfa1ec2ce54a@nvidia.com>

On Sun, Apr 05, 2026 at 06:44:55PM +0300, Leon Romanovsky wrote:
> From: Chen Zhao <chezhao@nvidia.com>
> 
> dst_fetch_ha() checks nud_state without holding the neighbor lock, then
> copies ha under the seqlock. A race in __neigh_update() where nud_state
> is set to NUD_REACHABLE before ha is written allows dst_fetch_ha() to
> read a zero MAC address while the seqlock reports no concurrent writer.
> 
> netevent_callback amplifies this by waking ALL pending addr_req workers
> when ANY neighbor becomes NUD_VALID. At scale (N peers resolving ARP
> concurrently), the hit probability scales as N^2, making it near-certain
> for large RDMA workloads.
> 
> N(A): neigh_update(A)                   W(A): addr_resolve(A)
>  |                                       [sleep]
>  | write_lock_bh(&A->lock)               |
>  | A->nud_state = NUD_REACHABLE          |
>  | // A->ha is still 0                   |
>  |                                       [woken by netevent_cb() of
>  |                                         another neighbour]
>  |                                       | dst_fetch_ha(A)
>  |                                       |   A->nud_state & NUD_VALID
>  |                                       |   read_seqbegin(&A->ha_lock)
>  |                                       |   snapshot = A->ha  /* 0 */
>  |                                       |   read_seqretry(&A->ha_lock)
>  |                                       |   return snapshot
>  | seqlock(&A->ha_lock)
>  | A->ha = mac_A     /* too late */
>  | sequnlock(&A->ha_lock)
>  | write_unlock_bh(&A->lock)
> 
> The zero MAC is read and programmed into the device QP before the
> neighbor entry has been updated. This causes silent packet loss and
> eventual RETRY_EXC_ERR.
> 
> Fix by holding the neighbor read lock across the nud_state check and
> ha copy in dst_fetch_ha(), ensuring it synchronizes with
> __neigh_update() which is updating while holding the write lock.
> 
> Fixes: 92ebb6a0a13a ("IB/cm: Remove now useless rcu_lock in dst_fetch_ha")
> Signed-off-by: Chen Zhao <chezhao@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> Strictly speaking, the commit in Fixes doesn't look like the one that
> caused the race, but it is the most relevant one to cite.
> ---
>  drivers/infiniband/core/addr.c | 3 +++
>  1 file changed, 3 insertions(+)

Applied to for-next

Thanks,
Jason
