public inbox for linux-rdma@vger.kernel.org
* [PATCH rdma-next v2 1/1] RDMA/core: Fix WARNING in gid_table_release_one
From: Zhu Yanjun @ 2025-11-04 23:36 UTC
  To: jgg, leon, linux-rdma; +Cc: Zhu Yanjun, syzbot+b0da83a6c0e2e2bddbd4

GID entry ref leak for dev syz1 index 2 ref=615
...
Call Trace:
 <TASK>
 ib_device_release+0xd2/0x1c0 drivers/infiniband/core/device.c:509
 device_release+0x99/0x1c0 drivers/base/core.c:-1
 kobject_cleanup lib/kobject.c:689 [inline]
 kobject_release lib/kobject.c:720 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x228/0x480 lib/kobject.c:737
 process_one_work kernel/workqueue.c:3263 [inline]
 process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
 worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
 kthread+0x711/0x8a0 kernel/kthread.c:463
 ret_from_fork+0x47c/0x820 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>

When the state of a GID is GID_TABLE_ENTRY_PENDING_DEL, it indicates
that the GID is about to be released soon. Therefore, it does not
appear to be a leak.

Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts")
Reported-by: syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b0da83a6c0e2e2bddbd4
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
V1->V2: Use flush_workqueue instead of while loop
---
 drivers/infiniband/core/cache.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 81cf3c902e81..74211fb37020 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -799,16 +799,26 @@ static void release_gid_table(struct ib_device *device,
 	if (!table)
 		return;
 
+	mutex_lock(&table->lock);
 	for (i = 0; i < table->sz; i++) {
 		if (is_gid_entry_free(table->data_vec[i]))
 			continue;
 
-		WARN_ONCE(true,
-			  "GID entry ref leak for dev %s index %d ref=%u\n",
+		WARN_ONCE(table->data_vec[i]->state != GID_TABLE_ENTRY_PENDING_DEL,
+			  "GID entry ref leak for dev %s index %d ref=%u, state: %d\n",
 			  dev_name(&device->dev), i,
-			  kref_read(&table->data_vec[i]->kref));
+			  kref_read(&table->data_vec[i]->kref), table->data_vec[i]->state);
+		/*
+		 * The entry may be sitting in the WQ waiting for
+		 * free_gid_work(), flush it to try to clean it.
+		 */
+		mutex_unlock(&table->lock);
+		flush_workqueue(ib_wq);
+		mutex_lock(&table->lock);
 	}
 
+	mutex_unlock(&table->lock);
+
 	mutex_destroy(&table->lock);
 	kfree(table->data_vec);
 	kfree(table);
-- 
2.51.2



* Re: [PATCH rdma-next v2 1/1] RDMA/core: Fix WARNING in gid_table_release_one
From: Leon Romanovsky @ 2025-11-05 13:09 UTC
  To: Zhu Yanjun; +Cc: jgg, linux-rdma, syzbot+b0da83a6c0e2e2bddbd4

On Tue, Nov 04, 2025 at 03:36:01PM -0800, Zhu Yanjun wrote:
> GID entry ref leak for dev syz1 index 2 ref=615
> ...
> Call Trace:
>  <TASK>
>  ib_device_release+0xd2/0x1c0 drivers/infiniband/core/device.c:509
>  device_release+0x99/0x1c0 drivers/base/core.c:-1
>  kobject_cleanup lib/kobject.c:689 [inline]
>  kobject_release lib/kobject.c:720 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  kobject_put+0x228/0x480 lib/kobject.c:737
>  process_one_work kernel/workqueue.c:3263 [inline]
>  process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
>  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
>  kthread+0x711/0x8a0 kernel/kthread.c:463
>  ret_from_fork+0x47c/0x820 arch/x86/kernel/process.c:158
>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>  </TASK>
> 
> When the state of a GID is GID_TABLE_ENTRY_PENDING_DEL, it indicates
> that the GID is about to be released soon. Therefore, it does not
> appear to be a leak.
> 
> Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts")
> Reported-by: syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b0da83a6c0e2e2bddbd4
> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> ---
> V1->V2: Use flush_workqueue instead of while loop
> ---
>  drivers/infiniband/core/cache.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index 81cf3c902e81..74211fb37020 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -799,16 +799,26 @@ static void release_gid_table(struct ib_device *device,
>  	if (!table)
>  		return;
>  
> +	mutex_lock(&table->lock);
>  	for (i = 0; i < table->sz; i++) {
>  		if (is_gid_entry_free(table->data_vec[i]))
>  			continue;
>  
> -		WARN_ONCE(true,
> -			  "GID entry ref leak for dev %s index %d ref=%u\n",
> +		WARN_ONCE(table->data_vec[i]->state != GID_TABLE_ENTRY_PENDING_DEL,
> +			  "GID entry ref leak for dev %s index %d ref=%u, state: %d\n",
>  			  dev_name(&device->dev), i,
> -			  kref_read(&table->data_vec[i]->kref));
> +			  kref_read(&table->data_vec[i]->kref), table->data_vec[i]->state);
> +		/*
> +		 * The entry may be sitting in the WQ waiting for
> +		 * free_gid_work(), flush it to try to clean it.
> +		 */
> +		mutex_unlock(&table->lock);
> +		flush_workqueue(ib_wq);
> +		mutex_lock(&table->lock);

I can't agree with the idea of calling flush_workqueue() in the loop.

Thanks

>  	}
>  
> +	mutex_unlock(&table->lock);
> +
>  	mutex_destroy(&table->lock);
>  	kfree(table->data_vec);
>  	kfree(table);
> -- 
> 2.51.2
> 


* Re: [PATCH rdma-next v2 1/1] RDMA/core: Fix WARNING in gid_table_release_one
From: Jason Gunthorpe @ 2025-11-05 13:45 UTC
  To: Leon Romanovsky; +Cc: Zhu Yanjun, linux-rdma, syzbot+b0da83a6c0e2e2bddbd4

On Wed, Nov 05, 2025 at 03:09:58PM +0200, Leon Romanovsky wrote:
> On Tue, Nov 04, 2025 at 03:36:01PM -0800, Zhu Yanjun wrote:
> > GID entry ref leak for dev syz1 index 2 ref=615
> > ...
> > Call Trace:
> >  <TASK>
> >  ib_device_release+0xd2/0x1c0 drivers/infiniband/core/device.c:509
> >  device_release+0x99/0x1c0 drivers/base/core.c:-1
> >  kobject_cleanup lib/kobject.c:689 [inline]
> >  kobject_release lib/kobject.c:720 [inline]
> >  kref_put include/linux/kref.h:65 [inline]
> >  kobject_put+0x228/0x480 lib/kobject.c:737
> >  process_one_work kernel/workqueue.c:3263 [inline]
> >  process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
> >  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
> >  kthread+0x711/0x8a0 kernel/kthread.c:463
> >  ret_from_fork+0x47c/0x820 arch/x86/kernel/process.c:158
> >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> >  </TASK>
> > 
> > When the state of a GID is GID_TABLE_ENTRY_PENDING_DEL, it indicates
> > that the GID is about to be released soon. Therefore, it does not
> > appear to be a leak.
> > 
> > Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts")
> > Reported-by: syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=b0da83a6c0e2e2bddbd4
> > Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> > ---
> > V1->V2: Use flush_workqueue instead of while loop
> > ---
> >  drivers/infiniband/core/cache.c | 16 +++++++++++++---
> >  1 file changed, 13 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index 81cf3c902e81..74211fb37020 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -799,16 +799,26 @@ static void release_gid_table(struct ib_device *device,
> >  	if (!table)
> >  		return;
> >  
> > +	mutex_lock(&table->lock);
> >  	for (i = 0; i < table->sz; i++) {
> >  		if (is_gid_entry_free(table->data_vec[i]))
> >  			continue;
> >  
> > -		WARN_ONCE(true,
> > -			  "GID entry ref leak for dev %s index %d ref=%u\n",
> > +		WARN_ONCE(table->data_vec[i]->state != GID_TABLE_ENTRY_PENDING_DEL,
> > +			  "GID entry ref leak for dev %s index %d ref=%u, state: %d\n",
> >  			  dev_name(&device->dev), i,
> > -			  kref_read(&table->data_vec[i]->kref));
> > +			  kref_read(&table->data_vec[i]->kref), table->data_vec[i]->state);
> > +		/*
> > +		 * The entry may be sitting in the WQ waiting for
> > +		 * free_gid_work(), flush it to try to clean it.
> > +		 */
> > +		mutex_unlock(&table->lock);
> > +		flush_workqueue(ib_wq);
> > +		mutex_lock(&table->lock);
> 
> I can't agree with the idea of calling flush_workqueue() in the loop.

Since we almost never see these WARN_ON's it isn't really called in a
loop, but sure you could put a conditional around it to do it only
once.

The WARN is in the wrong order; it is not a kernel bug if the
workqueue is still pending. Flush the queue, then check again, and
only then do the warn.

@@ -791,22 +791,31 @@ static struct ib_gid_table *alloc_gid_table(int sz)
        return NULL;
 }
 
-static void release_gid_table(struct ib_device *device,
-                             struct ib_gid_table *table)
+static bool is_gid_table_clean(struct ib_gid_table *table)
 {
        int i;
 
+       guard(mutex)(&table->lock);
+       for (i = 0; i < table->sz; i++)
+               if (!is_gid_entry_free(table->data_vec[i]))
+                       return false;
+       return true;
+}
+
+static void release_gid_table(struct ib_device *device,
+                             struct ib_gid_table *table)
+{
        if (!table)
                return;
 
-       for (i = 0; i < table->sz; i++) {
-               if (is_gid_entry_free(table->data_vec[i]))
-                       continue;
-
-               WARN_ONCE(true,
-                         "GID entry ref leak for dev %s index %d ref=%u\n",
-                         dev_name(&device->dev), i,
-                         kref_read(&table->data_vec[i]->kref));
+       if (!is_gid_table_clean(table)) {
+               /*
+                * The entry may be sitting in the WQ waiting for
+                * free_gid_work(), flush it to try to clean it.
+                */
+               flush_workqueue(ib_wq);
+               if (!is_gid_table_clean(table))
+                       WARN_ONCE(true, "GID entry has leaked");
        }
 
        mutex_destroy(&table->lock);

Jason
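
The check, flush, re-check ordering proposed in this message can be modeled in userspace. This is only a sketch under stated assumptions: the three slot states, struct model_table, and every model_* function are names invented for illustration, and model_flush() merely stands in for flush_workqueue(ib_wq) by completing any queued free. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot states for this model only (not the kernel enum). */
enum model_slot {
	MODEL_FREE,	/* slot is empty */
	MODEL_QUEUED,	/* last reference dropped, free work is queued */
	MODEL_HELD,	/* somebody still holds a reference */
};

struct model_table {
	enum model_slot *slots;
	size_t sz;
};

/* True when every slot is free. */
static bool model_table_clean(const struct model_table *t)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->sz; i++)
		if (t->slots[i] != MODEL_FREE)
			return false;
	return true;
}

/* Stand-in for flushing the workqueue: runs every queued free. */
static void model_flush(struct model_table *t)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->sz; i++)
		if (t->slots[i] == MODEL_QUEUED)
			t->slots[i] = MODEL_FREE;
}

/* Returns true only if entries remain busy after the flush. */
static bool model_release_finds_leak(struct model_table *t)
{
	if (model_table_clean(t))
		return false;
	model_flush(t);
	return !model_table_clean(t);
}
```

In this model a table holding only free and queued slots is not reported, while a slot that is still held after the flush is. The flush runs at most once per release, which also addresses the concern about flushing inside the loop.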


* Re: [PATCH rdma-next v2 1/1] RDMA/core: Fix WARNING in gid_table_release_one
From: Leon Romanovsky @ 2025-11-05 14:54 UTC
  To: Jason Gunthorpe; +Cc: Zhu Yanjun, linux-rdma, syzbot+b0da83a6c0e2e2bddbd4

On Wed, Nov 05, 2025 at 09:45:24AM -0400, Jason Gunthorpe wrote:
> On Wed, Nov 05, 2025 at 03:09:58PM +0200, Leon Romanovsky wrote:
> > On Tue, Nov 04, 2025 at 03:36:01PM -0800, Zhu Yanjun wrote:
> > > GID entry ref leak for dev syz1 index 2 ref=615
> > > ...
> > > Call Trace:
> > >  <TASK>
> > >  ib_device_release+0xd2/0x1c0 drivers/infiniband/core/device.c:509
> > >  device_release+0x99/0x1c0 drivers/base/core.c:-1
> > >  kobject_cleanup lib/kobject.c:689 [inline]
> > >  kobject_release lib/kobject.c:720 [inline]
> > >  kref_put include/linux/kref.h:65 [inline]
> > >  kobject_put+0x228/0x480 lib/kobject.c:737
> > >  process_one_work kernel/workqueue.c:3263 [inline]
> > >  process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
> > >  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
> > >  kthread+0x711/0x8a0 kernel/kthread.c:463
> > >  ret_from_fork+0x47c/0x820 arch/x86/kernel/process.c:158
> > >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> > >  </TASK>
> > > 
> > > When the state of a GID is GID_TABLE_ENTRY_PENDING_DEL, it indicates
> > > that the GID is about to be released soon. Therefore, it does not
> > > appear to be a leak.
> > > 
> > > Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts")
> > > Reported-by: syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com
> > > Closes: https://syzkaller.appspot.com/bug?extid=b0da83a6c0e2e2bddbd4
> > > Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> > > ---
> > > V1->V2: Use flush_workqueue instead of while loop
> > > ---
> > >  drivers/infiniband/core/cache.c | 16 +++++++++++++---
> > >  1 file changed, 13 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > > index 81cf3c902e81..74211fb37020 100644
> > > --- a/drivers/infiniband/core/cache.c
> > > +++ b/drivers/infiniband/core/cache.c
> > > @@ -799,16 +799,26 @@ static void release_gid_table(struct ib_device *device,
> > >  	if (!table)
> > >  		return;
> > >  
> > > +	mutex_lock(&table->lock);
> > >  	for (i = 0; i < table->sz; i++) {
> > >  		if (is_gid_entry_free(table->data_vec[i]))
> > >  			continue;
> > >  
> > > -		WARN_ONCE(true,
> > > -			  "GID entry ref leak for dev %s index %d ref=%u\n",
> > > +		WARN_ONCE(table->data_vec[i]->state != GID_TABLE_ENTRY_PENDING_DEL,
> > > +			  "GID entry ref leak for dev %s index %d ref=%u, state: %d\n",
> > >  			  dev_name(&device->dev), i,
> > > -			  kref_read(&table->data_vec[i]->kref));
> > > +			  kref_read(&table->data_vec[i]->kref), table->data_vec[i]->state);
> > > +		/*
> > > +		 * The entry may be sitting in the WQ waiting for
> > > +		 * free_gid_work(), flush it to try to clean it.
> > > +		 */
> > > +		mutex_unlock(&table->lock);
> > > +		flush_workqueue(ib_wq);
> > > +		mutex_lock(&table->lock);
> > 
> > I can't agree with the idea of calling flush_workqueue() in the loop.
> 
> Since we almost never see these WARN_ON's it isn't really called in a
> loop, but sure you could put a conditional around it to do it only
> once.

We have WARN_ONCE(); that is why you don't see many WARN_ONs.
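
The once-only behaviour referred to here can be illustrated with a small userspace sketch. It is not the kernel macro: MODEL_WARN_ONCE, model_scan() and model_warn_count are names invented for this example, and the only assumption carried over is that a given call site reports at most once no matter how many times it is reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many times the model actually printed a warning. */
static int model_warn_count;

/*
 * Hypothetical userspace stand-in for a warn-once macro: each expansion
 * owns a static flag, so a given call site reports at most once.
 */
#define MODEL_WARN_ONCE(cond, msg)                               \
	do {                                                     \
		static bool warned_;                             \
		if ((cond) && !warned_) {                        \
			warned_ = true;                          \
			model_warn_count++;                      \
			fprintf(stderr, "WARNING: %s\n", (msg)); \
		}                                                \
	} while (0)

/* Scans n slots and returns how many are still busy. */
static int model_scan(const bool *busy, int n)
{
	int found = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		if (!busy[i])
			continue;
		found++;
		MODEL_WARN_ONCE(true, "entry still busy");
	}
	return found;
}
```

Several busy slots, or several calls to model_scan(), still produce a single warning, so the number of warnings seen says little about how often the loop body actually ran.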

> 
> The WARN is in the wrong order; it is not a kernel bug if the
> workqueue is still pending. Flush the queue, then check again, and
> only then do the warn.
> 
> @@ -791,22 +791,31 @@ static struct ib_gid_table *alloc_gid_table(int sz)
>         return NULL;
>  }
>  
> -static void release_gid_table(struct ib_device *device,
> -                             struct ib_gid_table *table)
> +static bool is_gid_table_clean(struct ib_gid_table *table)
>  {
>         int i;
>  
> +       guard(mutex)(&table->lock);
> +       for (i = 0; i < table->sz; i++)
> +               if (!is_gid_entry_free(table->data_vec[i]))
> +                       return false;
> +       return true;
> +}
> +
> +static void release_gid_table(struct ib_device *device,
> +                             struct ib_gid_table *table)
> +{
>         if (!table)
>                 return;
>  
> -       for (i = 0; i < table->sz; i++) {
> -               if (is_gid_entry_free(table->data_vec[i]))
> -                       continue;
> -
> -               WARN_ONCE(true,
> -                         "GID entry ref leak for dev %s index %d ref=%u\n",
> -                         dev_name(&device->dev), i,
> -                         kref_read(&table->data_vec[i]->kref));
> +       if (!is_gid_table_clean(table)) {
> +               /*
> +                * The entry may be sitting in the WQ waiting for
> +                * free_gid_work(), flush it to try to clean it.
> +                */
> +               flush_workqueue(ib_wq);
> +               if (!is_gid_table_clean(table))
> +                       WARN_ONCE(true, "GID entry has leaked");
>         }
>  
>         mutex_destroy(&table->lock);
> 
> Jason


* Re: [PATCH rdma-next v2 1/1] RDMA/core: Fix WARNING in gid_table_release_one
From: Zhu Yanjun @ 2025-11-05 15:46 UTC
  To: Jason Gunthorpe, Leon Romanovsky; +Cc: linux-rdma, syzbot+b0da83a6c0e2e2bddbd4


On 2025/11/5 5:45, Jason Gunthorpe wrote:
> On Wed, Nov 05, 2025 at 03:09:58PM +0200, Leon Romanovsky wrote:
>> On Tue, Nov 04, 2025 at 03:36:01PM -0800, Zhu Yanjun wrote:
>>> GID entry ref leak for dev syz1 index 2 ref=615
>>> ...
>>> Call Trace:
>>>   <TASK>
>>>   ib_device_release+0xd2/0x1c0 drivers/infiniband/core/device.c:509
>>>   device_release+0x99/0x1c0 drivers/base/core.c:-1
>>>   kobject_cleanup lib/kobject.c:689 [inline]
>>>   kobject_release lib/kobject.c:720 [inline]
>>>   kref_put include/linux/kref.h:65 [inline]
>>>   kobject_put+0x228/0x480 lib/kobject.c:737
>>>   process_one_work kernel/workqueue.c:3263 [inline]
>>>   process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
>>>   worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
>>>   kthread+0x711/0x8a0 kernel/kthread.c:463
>>>   ret_from_fork+0x47c/0x820 arch/x86/kernel/process.c:158
>>>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>>>   </TASK>
>>>
>>> When the state of a GID is GID_TABLE_ENTRY_PENDING_DEL, it indicates
>>> that the GID is about to be released soon. Therefore, it does not
>>> appear to be a leak.
>>>
>>> Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts")
>>> Reported-by: syzbot+b0da83a6c0e2e2bddbd4@syzkaller.appspotmail.com
>>> Closes: https://syzkaller.appspot.com/bug?extid=b0da83a6c0e2e2bddbd4
>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
>>> ---
>>> V1->V2: Use flush_workqueue instead of while loop
>>> ---
>>>   drivers/infiniband/core/cache.c | 16 +++++++++++++---
>>>   1 file changed, 13 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
>>> index 81cf3c902e81..74211fb37020 100644
>>> --- a/drivers/infiniband/core/cache.c
>>> +++ b/drivers/infiniband/core/cache.c
>>> @@ -799,16 +799,26 @@ static void release_gid_table(struct ib_device *device,
>>>   	if (!table)
>>>   		return;
>>>   
>>> +	mutex_lock(&table->lock);
>>>   	for (i = 0; i < table->sz; i++) {
>>>   		if (is_gid_entry_free(table->data_vec[i]))
>>>   			continue;
>>>   
>>> -		WARN_ONCE(true,
>>> -			  "GID entry ref leak for dev %s index %d ref=%u\n",
>>> +		WARN_ONCE(table->data_vec[i]->state != GID_TABLE_ENTRY_PENDING_DEL,
>>> +			  "GID entry ref leak for dev %s index %d ref=%u, state: %d\n",
>>>   			  dev_name(&device->dev), i,
>>> -			  kref_read(&table->data_vec[i]->kref));
>>> +			  kref_read(&table->data_vec[i]->kref), table->data_vec[i]->state);
>>> +		/*
>>> +		 * The entry may be sitting in the WQ waiting for
>>> +		 * free_gid_work(), flush it to try to clean it.
>>> +		 */
>>> +		mutex_unlock(&table->lock);
>>> +		flush_workqueue(ib_wq);
>>> +		mutex_lock(&table->lock);
>> I can't agree with the idea of calling flush_workqueue() in the loop.
> Since we almost never see these WARN_ON's it isn't really called in a
> loop, but sure you could put a conditional around it to do it only
> once.
>
> The WARN is in the wrong order; it is not a kernel bug if the
> workqueue is still pending. Flush the queue, then check again, and
> only then do the warn.
>
> @@ -791,22 +791,31 @@ static struct ib_gid_table *alloc_gid_table(int sz)
>          return NULL;
>   }
>   
> -static void release_gid_table(struct ib_device *device,
> -                             struct ib_gid_table *table)
> +static bool is_gid_table_clean(struct ib_gid_table *table)
>   {
>          int i;
>   
> +       guard(mutex)(&table->lock);
> +       for (i = 0; i < table->sz; i++)
> +               if (!is_gid_entry_free(table->data_vec[i]))
> +                       return false;
> +       return true;
> +}
> +
> +static void release_gid_table(struct ib_device *device,
> +                             struct ib_gid_table *table)
> +{
>          if (!table)
>                  return;
>   
> -       for (i = 0; i < table->sz; i++) {
> -               if (is_gid_entry_free(table->data_vec[i]))
> -                       continue;
> -
> -               WARN_ONCE(true,
> -                         "GID entry ref leak for dev %s index %d ref=%u\n",
> -                         dev_name(&device->dev), i,
> -                         kref_read(&table->data_vec[i]->kref));
> +       if (!is_gid_table_clean(table)) {
> +               /*
> +                * The entry may be sitting in the WQ waiting for
> +                * free_gid_work(), flush it to try to clean it.
> +                */
> +               flush_workqueue(ib_wq);
> +               if (!is_gid_table_clean(table))
> +                       WARN_ONCE(true, "GID entry has leaked");

Thanks a lot. IMO, if a leak occurs, more information would be helpful. Would the following be better?

	WARN_ONCE(table->data_vec[index]->state != GID_TABLE_ENTRY_PENDING_DEL,
		  "GID entry ref leak for dev %s index %d ref=%u, state: %d\n",
		  dev_name(&device->dev), index,
		  kref_read(&table->data_vec[index]->kref), table->data_vec[index]->state);

I will test with syzbot and send an updated patch very soon.
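
One way to make an index available for such a message is to have the scan return the first busy slot instead of a bool. The following is a userspace sketch only: struct model_entry, model_first_busy_index() and model_format_leak() are hypothetical names, the state value is an arbitrary integer, and the message is written to a buffer rather than emitted through WARN_ONCE().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a GID table entry. */
struct model_entry {
	unsigned int ref;	/* outstanding references */
	int state;		/* arbitrary illustrative value */
};

/* Returns the index of the first non-free (non-NULL) slot, or -1. */
static int model_first_busy_index(struct model_entry *const *vec, int sz)
{
	for (int i = 0; i < sz; i++)
		if (vec[i] != NULL)
			return i;
	return -1;
}

/*
 * Formats a leak report for the first busy slot into buf.
 * Returns that slot's index, or -1 if the table is clean.
 */
static int model_format_leak(char *buf, size_t len, const char *dev,
			     struct model_entry *const *vec, int sz)
{
	int index = model_first_busy_index(vec, sz);

	if (index < 0)
		return -1;
	snprintf(buf, len,
		 "GID entry ref leak for dev %s index %d ref=%u, state: %d",
		 dev, index, vec[index]->ref, vec[index]->state);
	return index;
}
```

With a helper of this shape the caller can still flush once and re-check, and the warning after the second check keeps the device name, index, reference count and state.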

Zhu Yanjun

>          }
>   
>          mutex_destroy(&table->lock);
>
> Jason

-- 
Best Regards,
Yanjun.Zhu


