* [bug report] RDMA/rxe: Failure of ibv_query_device() and ibv_query_device_ex() tests in rdma-core
@ 2025-02-26 10:32 Daisuke Matsuda (Fujitsu)
  2025-03-01 20:14 ` Zhu Yanjun
From: Daisuke Matsuda (Fujitsu) @ 2025-02-26 10:32 UTC (permalink / raw)
  To: 'yanjun.zhu@linux.dev', 'zyjzyj2000@gmail.com'
  Cc: 'linux-rdma@vger.kernel.org', 'jgg@ziepe.ca',
	'leon@kernel.org'

Currently, two test cases in rdma-core fail with the latest kernel, producing the console log below.
=====
$ ./build/bin/run_tests.py -k device
ssssssss....FF........s
======================================================================
FAIL: test_query_device (tests.test_device.DeviceTest.test_query_device)
Test ibv_query_device()
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/rdma-core/tests/test_device.py", line 63, in test_query_device
    self.verify_device_attr(attr, dev)
  File "/home/ubuntu/rdma-core/tests/test_device.py", line 200, in verify_device_attr
    assert attr.sys_image_guid != 0
           ^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

======================================================================
FAIL: test_query_device_ex (tests.test_device.DeviceTest.test_query_device_ex)
Test ibv_query_device_ex()
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/rdma-core/tests/test_device.py", line 222, in test_query_device_ex
    self.verify_device_attr(attr_ex.orig_attr, dev)
  File "/home/ubuntu/rdma-core/tests/test_device.py", line 200, in verify_device_attr
    assert attr.sys_image_guid != 0
           ^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

----------------------------------------------------------------------
Ran 23 tests in 0.007s

FAILED (failures=2, skipped=9)
=====

It seems sys_image_guid is set here:
https://github.com/torvalds/linux/blob/2ac5415022d16d63d912a39a06f32f1f51140261/drivers/infiniband/sw/rxe/rxe.c#L82
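
For reference, a zero GUID can also be spotted without running the whole test suite. The following is only a minimal sketch, assuming the kernel exposes the value at /sys/class/infiniband/<device>/sys_image_guid as colon-separated hex groups; parse_guid() and zero_guid_devices() are hypothetical helper names and are not part of rdma-core:

```python
from pathlib import Path


def parse_guid(text):
    """Parse a GUID such as '0011:22ff:fe33:4455' into an integer."""
    return int(text.strip().replace(":", ""), 16)


def zero_guid_devices(sysfs_root="/sys/class/infiniband"):
    """Return the names of RDMA devices whose sys_image_guid reads as zero.

    The sysfs location is an assumption; pass another root to test offline.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    bad = []
    for dev in sorted(root.iterdir()):
        guid_file = dev / "sys_image_guid"
        # Skip entries that do not expose the attribute at all.
        if guid_file.is_file() and parse_guid(guid_file.read_text()) == 0:
            bad.append(dev.name)
    return bad


if __name__ == "__main__":
    for name in zero_guid_devices():
        print(f"{name}: sys_image_guid is zero")
```

A device listed by this script would be expected to fail the same `attr.sys_image_guid != 0` assertion shown in the log above.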

I tried rolling back to commit 57a7138d0627, just before the following patch was applied, and found that the error no longer occurs.
[PATCH 1/1] RDMA/rxe: Remove the direct link to net_device
https://lore.kernel.org/all/20241220222325.2487767-1-yanjun.zhu@linux.dev/

I think the root cause lies in the ndev patches applied in the past two months,
but I am not sure whether reverting them is a good idea. I would like to hear
opinions from Zhu and other developers.

Thanks,
Daisuke Matsuda



* Re: [bug report] RDMA/rxe: Failure of ibv_query_device() and ibv_query_device_ex() tests in rdma-core
  2025-02-26 10:32 [bug report] RDMA/rxe: Failure of ibv_query_device() and ibv_query_device_ex() tests in rdma-core Daisuke Matsuda (Fujitsu)
@ 2025-03-01 20:14 ` Zhu Yanjun
  2025-03-01 23:22   ` Zhu Yanjun
From: Zhu Yanjun @ 2025-03-01 20:14 UTC (permalink / raw)
  To: Daisuke Matsuda (Fujitsu), 'zyjzyj2000@gmail.com'
  Cc: 'linux-rdma@vger.kernel.org', 'jgg@ziepe.ca',
	'leon@kernel.org'

On 2025/2/26 11:32, Daisuke Matsuda (Fujitsu) wrote:
> Currently, two testcases in rdma-core fail with the latest kernel, leaving the console log below.
> =====
> $ ./build/bin/run_tests.py -k device
> ssssssss....FF........s
> ======================================================================
> FAIL: test_query_device (tests.test_device.DeviceTest.test_query_device)
> Test ibv_query_device()
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>    File "/home/ubuntu/rdma-core/tests/test_device.py", line 63, in test_query_device
>      self.verify_device_attr(attr, dev)
>    File "/home/ubuntu/rdma-core/tests/test_device.py", line 200, in verify_device_attr
>      assert attr.sys_image_guid != 0
>             ^^^^^^^^^^^^^^^^^^^^^^^^
> AssertionError
> 
> ======================================================================
> FAIL: test_query_device_ex (tests.test_device.DeviceTest.test_query_device_ex)
> Test ibv_query_device_ex()
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>    File "/home/ubuntu/rdma-core/tests/test_device.py", line 222, in test_query_device_ex
>      self.verify_device_attr(attr_ex.orig_attr, dev)
>    File "/home/ubuntu/rdma-core/tests/test_device.py", line 200, in verify_device_attr
>      assert attr.sys_image_guid != 0
>             ^^^^^^^^^^^^^^^^^^^^^^^^
> AssertionError
> 
> ----------------------------------------------------------------------
> Ran 23 tests in 0.007s
> 
> FAILED (failures=2, skipped=9)
> =====
> 
> It seems sys_image_guid is set here:
> https://github.com/torvalds/linux/blob/2ac5415022d16d63d912a39a06f32f1f51140261/drivers/infiniband/sw/rxe/rxe.c#L82
> 
> I tried rolling back to commit 57a7138d0627, just before this patch was applied, and found the error resolved.
> [PATCH 1/1] RDMA/rxe: Remove the direct link to net_device
> https://lore.kernel.org/all/20241220222325.2487767-1-yanjun.zhu@linux.dev/

Thanks. The following patches fix this problem in upstream and in for-next.

Two separate patches are needed because the patchset
https://patchwork.kernel.org/project/linux-rdma/cover/20250119172831.3123110-1-yanjun.zhu@linux.dev/
is in for-next but not in upstream.

Thus,
https://patchwork.kernel.org/project/linux-rdma/patch/20250301193530.904720-1-yanjun.zhu@linux.dev/
is for for-next, and

https://patchwork.kernel.org/project/linux-rdma/patch/20250301193351.901749-1-yanjun.zhu@linux.dev/
is for upstream.

Thanks,
Zhu Yanjun

> 
> I think the root cause lies in ndev patches applied in the past two months,
> but I am not very sure if it is good idea to revert them. I would like opinions
> from Zhu and other developers.
> 
> Thanks,
> Daisuke Matsuda
> 



* Re: [bug report] RDMA/rxe: Failure of ibv_query_device() and ibv_query_device_ex() tests in rdma-core
  2025-03-01 20:14 ` Zhu Yanjun
@ 2025-03-01 23:22   ` Zhu Yanjun
From: Zhu Yanjun @ 2025-03-01 23:22 UTC (permalink / raw)
  To: Daisuke Matsuda (Fujitsu), 'zyjzyj2000@gmail.com'
  Cc: 'linux-rdma@vger.kernel.org', 'jgg@ziepe.ca',
	'leon@kernel.org'



On 2025/3/1 21:14, Zhu Yanjun wrote:
> On 2025/2/26 11:32, Daisuke Matsuda (Fujitsu) wrote:
>> Currently, two testcases in rdma-core fail with the latest kernel, leaving the console log below.
>> =====
>> $ ./build/bin/run_tests.py -k device
>> ssssssss....FF........s
>> ======================================================================
>> FAIL: test_query_device (tests.test_device.DeviceTest.test_query_device)
>> Test ibv_query_device()
>> ----------------------------------------------------------------------
>> Traceback (most recent call last):
>>    File "/home/ubuntu/rdma-core/tests/test_device.py", line 63, in test_query_device
>>      self.verify_device_attr(attr, dev)
>>    File "/home/ubuntu/rdma-core/tests/test_device.py", line 200, in verify_device_attr
>>      assert attr.sys_image_guid != 0
>>             ^^^^^^^^^^^^^^^^^^^^^^^^
>> AssertionError
>>
>> ======================================================================
>> FAIL: test_query_device_ex (tests.test_device.DeviceTest.test_query_device_ex)
>> Test ibv_query_device_ex()
>> ----------------------------------------------------------------------
>> Traceback (most recent call last):
>>    File "/home/ubuntu/rdma-core/tests/test_device.py", line 222, in test_query_device_ex
>>      self.verify_device_attr(attr_ex.orig_attr, dev)
>>    File "/home/ubuntu/rdma-core/tests/test_device.py", line 200, in verify_device_attr
>>      assert attr.sys_image_guid != 0
>>             ^^^^^^^^^^^^^^^^^^^^^^^^
>> AssertionError
>>
>> ----------------------------------------------------------------------
>> Ran 23 tests in 0.007s
>>
>> FAILED (failures=2, skipped=9)
>> =====
>>
>> It seems sys_image_guid is set here:
>> https://github.com/torvalds/linux/blob/2ac5415022d16d63d912a39a06f32f1f51140261/drivers/infiniband/sw/rxe/rxe.c#L82
>>
>> I tried rolling back to commit 57a7138d0627, just before this patch was applied, and found the error resolved.
>> [PATCH 1/1] RDMA/rxe: Remove the direct link to net_device
>> https://lore.kernel.org/all/20241220222325.2487767-1-yanjun.zhu@linux.dev/
> 
> Thanks. The following commits are to fix this problem in upstream and 
> for-next.
> 
> Because the patchset
> https://patchwork.kernel.org/project/linux-rdma/cover/20250119172831.3123110-1-yanjun.zhu@linux.dev/
> exists in for-next, but this patchset does not exist in upstream.
> 
> Thus,
> https://patchwork.kernel.org/project/linux-rdma/patch/20250301193530.904720-1-yanjun.zhu@linux.dev/
> is for for-next.

V2 of the for-next patch is at this link:
https://patchwork.kernel.org/project/linux-rdma/patch/20250301231639.1304156-1-yanjun.zhu@linux.dev/

Zhu Yanjun

> 
> https://patchwork.kernel.org/project/linux-rdma/patch/20250301193351.901749-1-yanjun.zhu@linux.dev/
> is for upstream.
> 
> Thanks,
> Zhu Yanjun
> 
>>
>> I think the root cause lies in ndev patches applied in the past two months,
>> but I am not very sure if it is good idea to revert them. I would like opinions
>> from Zhu and other developers.
>>
>> Thanks,
>> Daisuke Matsuda
>>
> 

-- 
Best Regards,
Yanjun.Zhu


