public inbox for linux-rdma@vger.kernel.org
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Leon Romanovsky <leon@kernel.org>
Cc: Zhu Yanjun <yanjun.zhu@linux.dev>,
	linux-rdma@vger.kernel.org,
	syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com
Subject: Re: [PATCH rdma-next v2 1/1] RDMA/core: Fix WARNING in gid_table_release_one
Date: Wed, 5 Nov 2025 09:45:24 -0400
Message-ID: <20251105134524.GL1204670@ziepe.ca>
In-Reply-To: <20251105130958.GE16832@unreal>

On Wed, Nov 05, 2025 at 03:09:58PM +0200, Leon Romanovsky wrote:
> On Tue, Nov 04, 2025 at 03:36:01PM -0800, Zhu Yanjun wrote:
> > GID entry ref leak for dev syz1 index 2 ref=615
> > ...
> > Call Trace:
> >  <TASK>
> >  ib_device_release+0xd2/0x1c0 drivers/infiniband/core/device.c:509
> >  device_release+0x99/0x1c0 drivers/base/core.c:-1
> >  kobject_cleanup lib/kobject.c:689 [inline]
> >  kobject_release lib/kobject.c:720 [inline]
> >  kref_put include/linux/kref.h:65 [inline]
> >  kobject_put+0x228/0x480 lib/kobject.c:737
> >  process_one_work kernel/workqueue.c:3263 [inline]
> >  process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
> >  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
> >  kthread+0x711/0x8a0 kernel/kthread.c:463
> >  ret_from_fork+0x47c/0x820 arch/x86/kernel/process.c:158
> >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> >  </TASK>
> > 
> > When the state of a GID is GID_TABLE_ENTRY_PENDING_DEL, it indicates
> > that the GID is about to be released soon. Therefore, it does not
> > appear to be a leak.
> > 
> > Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts")
> > Reported-by: syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=b0da83a6c0e2e2bddbd4
> > Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> > ---
> > V1->V2: Use flush_workqueue instead of while loop
> > ---
> >  drivers/infiniband/core/cache.c | 16 +++++++++++++---
> >  1 file changed, 13 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index 81cf3c902e81..74211fb37020 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -799,16 +799,26 @@ static void release_gid_table(struct ib_device *device,
> >  	if (!table)
> >  		return;
> >  
> > +	mutex_lock(&table->lock);
> >  	for (i = 0; i < table->sz; i++) {
> >  		if (is_gid_entry_free(table->data_vec[i]))
> >  			continue;
> >  
> > -		WARN_ONCE(true,
> > -			  "GID entry ref leak for dev %s index %d ref=%u\n",
> > +		WARN_ONCE(table->data_vec[i]->state != GID_TABLE_ENTRY_PENDING_DEL,
> > +			  "GID entry ref leak for dev %s index %d ref=%u, state: %d\n",
> >  			  dev_name(&device->dev), i,
> > -			  kref_read(&table->data_vec[i]->kref));
> > +			  kref_read(&table->data_vec[i]->kref), table->data_vec[i]->state);
> > +		/*
> > +		 * The entry may be sitting in the WQ waiting for
> > +		 * free_gid_work(), flush it to try to clean it.
> > +		 */
> > +		mutex_unlock(&table->lock);
> > +		flush_workqueue(ib_wq);
> > +		mutex_lock(&table->lock);
> 
> I can't agree with idea that flush_workqueue() is called in the loop.

Since we almost never see these WARN_ONs, it isn't really called in a
loop, but sure, you could put a conditional around it so it runs only
once.

The WARN_ON is in the wrong order: it is not a kernel bug if the free
work is still pending on the workqueue. Flush the queue, check again,
and only then warn.

@@ -791,22 +791,31 @@ static struct ib_gid_table *alloc_gid_table(int sz)
        return NULL;
 }
 
-static void release_gid_table(struct ib_device *device,
-                             struct ib_gid_table *table)
+static bool is_gid_table_clean(struct ib_gid_table *table)
 {
        int i;
 
+       guard(mutex)(&table->lock);
+       for (i = 0; i < table->sz; i++)
+               if (!is_gid_entry_free(table->data_vec[i]))
+                       return false;
+       return true;
+}
+
+static void release_gid_table(struct ib_device *device,
+                             struct ib_gid_table *table)
+{
        if (!table)
                return;
 
-       for (i = 0; i < table->sz; i++) {
-               if (is_gid_entry_free(table->data_vec[i]))
-                       continue;
-
-               WARN_ONCE(true,
-                         "GID entry ref leak for dev %s index %d ref=%u\n",
-                         dev_name(&device->dev), i,
-                         kref_read(&table->data_vec[i]->kref));
+       if (!is_gid_table_clean(table)) {
+               /*
+                * The entry may be sitting in the WQ waiting for
+                * free_gid_work(), flush it to try to clean it.
+                */
+               flush_workqueue(ib_wq);
+               if (!is_gid_table_clean(table))
+                       WARN_ONCE(true, "GID entry has leaked");
        }
 
        mutex_destroy(&table->lock);
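
For anyone who wants to see the ordering in isolation, here is a rough
userspace sketch of "check, flush, check again, then warn". It is not
kernel code: struct table, table_is_clean(), flush_pending() and
release_table() are hypothetical names made up for illustration,
flush_pending() only stands in for flush_workqueue(ib_wq), and locking
is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TABLE_SZ 4

/* Hypothetical stand-in for a GID table; not the kernel's struct. */
struct table {
	bool in_use[TABLE_SZ];       /* slot still holds an entry */
	bool free_pending[TABLE_SZ]; /* deferred free queued, not yet run */
};

static bool table_is_clean(const struct table *t)
{
	for (int i = 0; i < TABLE_SZ; i++)
		if (t->in_use[i])
			return false;
	return true;
}

/* Models flush_workqueue(): run every queued deferred free now. */
static void flush_pending(struct table *t)
{
	for (int i = 0; i < TABLE_SZ; i++) {
		if (!t->free_pending[i])
			continue;
		t->in_use[i] = false;
		t->free_pending[i] = false;
	}
}

/* Returns true only if an entry is still in use after the flush. */
static bool release_table(struct table *t)
{
	if (table_is_clean(t))
		return false;

	flush_pending(t);
	if (table_is_clean(t))
		return false;

	fprintf(stderr, "entry has leaked\n");
	return true;
}

/* Usage: a pending free is not a leak, a never-queued entry is. */
static void self_check(void)
{
	struct table pending = { .in_use = { [2] = true },
				 .free_pending = { [2] = true } };
	struct table leaked = { .in_use = { [2] = true } };

	assert(!release_table(&pending));
	assert(release_table(&leaked));
}
```

The flush runs at most once per release, and only when the first check
finds a busy entry, so the common clean case never touches the queue.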

Jason
