From: Leon Romanovsky <leon@kernel.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: syzbot+53cf317e7803e4ef2f33@syzkaller.appspotmail.com,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	jgg@ziepe.ca, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org
Subject: Re: [syzbot] [rdma?] kernel BUG in ib_device_get_by_index
Date: Wed, 4 Mar 2026 16:22:39 +0200
Message-ID: <20260304142239.GA12611@unreal>
In-Reply-To: <6014ee8b-382a-4fe8-81de-74a67595f585@I-love.SAKURA.ne.jp>

On Tue, Mar 03, 2026 at 10:38:17PM +0900, Tetsuo Handa wrote:
> On 2026/03/03 4:17, Leon Romanovsky wrote:
> > On Sat, Feb 28, 2026 at 02:07:46PM +0900, Tetsuo Handa wrote:
> >> Hmm, this assertion was wrong because ib_device_get_by_index()
> >> might be called before enable_device_and_get() is called.
> >>
> >> #syz invalid
> > 
> > I think this is a valid syzkaller report. As you correctly noted, the device
> > was inserted into the xarray database in assign_name(), but its refcount was
> > only set later in enable_device_and_get().
> 
> I was wondering why enable_device_and_get() is using not refcount_add()
> but refcount_set(), and I tried
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit?id=510cd4b7d46753b4bf0f57004aa7b53b91b2b25a
> in case commit 9af0feae8016 ("RDMA/core: Fix stale RoCE GIDs during netdev
> events at registration") unexpectedly triggered modification of ->refcount
> before refcount_set(&device->refcount, 2) is called.
> 
> But I concluded from this syzbot report that the reason enable_device_and_get() is
> using refcount_set() is that we cannot use refcount_add() because ->refcount == 0.
> 
> Therefore, it is safe to call ib_device_try_get() before enable_device_and_get()
> calls refcount_set().
> 
> > 
> > The proper fix can be something like that:
> > 
>           down_read(&devices_rwsem);
>           device = xa_load(&devices, index);
>   -       if (device) {
>   +       if (device && xa_get_mark(&devices, index, DEVICE_REGISTERED)) {
>                   if (!rdma_dev_access_netns(device, net)) {
>                           device = NULL;
>                           goto out;
>                   }
>    
>                   if (!ib_device_try_get(device))
>                           device = NULL;
>           }
> 
> Why do you want to make this change? Unless it is unsafe to call
> rdma_dev_access_netns() when DEVICE_REGISTERED is not set,
> refcount_inc_not_zero() from ib_device_try_get() makes the final
> result same (i.e. device == NULL).
> 
> Since enable_device_and_get() sets ->refcount immediately before
> xa_set_mark() is called, adding xa_get_mark() check does not change
> effective behavior.

xa_set_mark() is performed under down_write(&devices_rwsem), which
ensures that xa_load(...) returns a fully initialized device.

But yes, you are right: ib_device_try_get() should return 0 if the
device isn't set up yet.
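
To make the refcount argument concrete, here is a minimal userspace
sketch, not the kernel code. It models refcount_inc_not_zero() with C11
atomics, assuming a device that is already findable by lookup but whose
refcount stays 0 until the refcount_set(&device->refcount, 2) step.
All model_* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ib_device; only the refcount is modeled. */
struct model_device {
	atomic_uint refcount;	/* stays 0 until the device is enabled */
};

/* Models the state after assign_name(): findable, but refcount still 0. */
static void model_init(struct model_device *dev)
{
	atomic_init(&dev->refcount, 0);
}

/* Models refcount_inc_not_zero(): take a reference only if one exists. */
static bool model_try_get(struct model_device *dev)
{
	unsigned int old = atomic_load(&dev->refcount);

	while (old != 0) {
		/* On failure 'old' is refreshed with the current value; retry. */
		if (atomic_compare_exchange_weak(&dev->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Models the refcount_set(&device->refcount, 2) step of enabling. */
static void model_enable(struct model_device *dev)
{
	atomic_store(&dev->refcount, 2);
}

/* Models the tail of the lookup: a found device is dropped if try_get fails. */
static struct model_device *model_get_by_index(struct model_device *found)
{
	if (found && !model_try_get(found))
		found = NULL;
	return found;
}
```

In this model a lookup before model_enable() yields NULL and a lookup
after it yields the device, which is the same end result an extra mark
check would give. The sketch does not model devices_rwsem, the xarray
marks, or rdma_dev_access_netns().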

Thanks

> 
> What I rather worry is that refcount_set() is called too early if
> there is an ib_device_try_get() user who expects that
> device->ops.enable_driver()/add_client_context()/add_compat_devs()
> have already completed when ib_device_try_get() succeeded.
> 
