* [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table
@ 2025-03-04 12:43 Guangguan Wang
2025-03-11 8:59 ` Paolo Abeni
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Guangguan Wang @ 2025-03-04 12:43 UTC (permalink / raw)
To: wenjia, pasic, jaka, alibuda, tonylu, guwen
Cc: davem, edumazet, kuba, pabeni, horms, linux-rdma, linux-s390,
netdev, linux-kernel
When using smc_pnet in SMC, it will only search the pnetid in the
base_ndev of the netdev hierarchy (both HW PNETID and user-defined
sw pnetid). This may not work for some scenarios when using SMC in
a container in a cloud environment.
In a container, there are different choices of container network,
such as directly using the host network, or the virtual networks IPVLAN,
veth, etc. Different container networks have different netdev
hierarchies. Examples of netdev hierarchies are shown below. (eth0 and
eth1 in the host below are the netdevs directly related to the physical
device.)
 _______________________________
|   _________________           |
|  |POD              |          |
|  |                 |          |
|  | eth0_________   |          |
|  |____|         |__|          |
|       |         |             |
|       |         |             |
|   eth1|base_ndev| eth0_______ |
|       |         |    | RDMA  ||
| host  |_________|    |_______||
---------------------------------
 netdev hierarchy if directly using host network
 _______________________________
|   _________________           |
|  |POD  __________  |          |
|  |    |upper_ndev| |          |
|  |eth0|__________| |          |
|  |_______|_________|          |
|          |lower netdev        |
|        __|______              |
|   eth1|         | eth0_______ |
|       |base_ndev|    | RDMA  ||
| host  |_________|    |_______||
---------------------------------
 netdev hierarchy if using IPVLAN
 _______________________________
|   _____________________       |
|  |POD        _________ |      |
|  |          |base_ndev||      |
|  |eth0(veth)|_________||      |
|  |____________|________|      |
|               |pairs          |
|        _______|_              |
|       |         | eth0_______ |
|   veth|base_ndev|    | RDMA  ||
|       |_________|    |_______||
|        _________              |
|   eth1|base_ndev|             |
| host  |_________|             |
---------------------------------
 netdev hierarchy if using veth
For some reason, eth1 in the host is not an RDMA-attached netdevice, so
a pnetid is needed to map eth1 (in host) to the RDMA device so that the
POD can do SMC-R. eth1 (in host) is managed by a CNI plugin (such as
Terway, a network management plugin for container environments). In a
cloud environment, the CNI can dynamically insert the eth (in host) when
a POD is created and dynamically remove it when the POD is destroyed and
no POD is related to the eth (in host) anymore. This makes it hard to
configure the pnetid on eth1 (in host), but it is easy to configure the
pnetid on the netdevice that can be seen in the POD. When doing SMC-R,
both the container directly using the host network and the container
using the veth network can successfully match the RDMA device, because
the netdev with the configured pnetid is a base_ndev. But the container
using IPVLAN cannot match the RDMA device and a 0x03030000 fallback
happens, because the netdev with the configured pnetid is not a
base_ndev. Additionally, configuring the pnetid on eth1 (in host) does
not work either for matching the RDMA device when using the veth network
and doing SMC-R in the POD.
To resolve the problems listed above, this patch extends the search for
the user-defined sw pnetid to the clc handshake ndev when no pnetid can
be found in the base_ndev. The base_ndev takes precedence over the ndev
for backward compatibility. This patch also unifies the pnetid setup for
the different network choices listed above in containers (configure the
user-defined sw pnetid on the netdevice that can be seen in the POD).
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
---
net/smc/smc_pnet.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index 716808f374a8..b391c2ef463f 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -1079,14 +1079,16 @@ static void smc_pnet_find_roce_by_pnetid(struct net_device *ndev,
 					 struct smc_init_info *ini)
 {
 	u8 ndev_pnetid[SMC_MAX_PNETID_LEN];
+	struct net_device *base_ndev;
 	struct net *net;
 
-	ndev = pnet_find_base_ndev(ndev);
+	base_ndev = pnet_find_base_ndev(ndev);
 	net = dev_net(ndev);
-	if (smc_pnetid_by_dev_port(ndev->dev.parent, ndev->dev_port,
+	if (smc_pnetid_by_dev_port(base_ndev->dev.parent, base_ndev->dev_port,
 				   ndev_pnetid) &&
+	    smc_pnet_find_ndev_pnetid_by_table(base_ndev, ndev_pnetid) &&
 	    smc_pnet_find_ndev_pnetid_by_table(ndev, ndev_pnetid)) {
-		smc_pnet_find_rdma_dev(ndev, ini);
+		smc_pnet_find_rdma_dev(base_ndev, ini);
 		return; /* pnetid could not be determined */
 	}
 	_smc_pnet_find_roce_by_pnetid(ndev_pnetid, ini, NULL, net);
--
2.24.3 (Apple Git-128)
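
The lookup order introduced by the patch can be illustrated with a small
userspace sketch. This is not kernel code: struct mock_ndev, mock_find_base(),
mock_copy_id() and mock_resolve_pnetid() are hypothetical stand-ins, and the
only assumption carried over from the diff is that each lookup returns 0 on
success and non-zero on failure, which is what the && chain relies on.

```c
#include <stddef.h>
#include <stdio.h>

#define MOCK_PNETID_LEN 17 /* hypothetical buffer size for this sketch */

/* Hypothetical stand-in for struct net_device: only what the sketch needs. */
struct mock_ndev {
	const char *name;
	const char *hw_pnetid;   /* NULL if the hardware reports no pnetid */
	const char *sw_pnetid;   /* NULL if there is no user-defined table entry */
	struct mock_ndev *lower; /* NULL for a base device */
};

/* Walk down the hierarchy to the lowest device (the "base_ndev"). */
static struct mock_ndev *mock_find_base(struct mock_ndev *ndev)
{
	while (ndev->lower)
		ndev = ndev->lower;
	return ndev;
}

/* Return 0 and copy the id on success, non-zero if no id is present. */
static int mock_copy_id(const char *src, char *dst)
{
	if (!src)
		return -1;
	snprintf(dst, MOCK_PNETID_LEN, "%s", src);
	return 0;
}

/*
 * Precedence mirrored from the diff:
 *   1. HW pnetid of the base device
 *   2. user-defined pnetid of the base device
 *   3. user-defined pnetid of the handshake device itself
 * Each later lookup runs only if every earlier one failed.
 */
static int mock_resolve_pnetid(struct mock_ndev *ndev, char *out)
{
	struct mock_ndev *base = mock_find_base(ndev);

	if (mock_copy_id(base->hw_pnetid, out) &&
	    mock_copy_id(base->sw_pnetid, out) &&
	    mock_copy_id(ndev->sw_pnetid, out))
		return -1; /* pnetid could not be determined */
	return 0;
}
```

In the IPVLAN case, a pod device stacked on a host device with only the pod
device carrying a user-defined pnetid now resolves through step 3, while a
pnetid found on the base device still wins when both are set.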
* Re: [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table
2025-03-04 12:43 [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table Guangguan Wang
@ 2025-03-11 8:59 ` Paolo Abeni
2025-03-11 14:36 ` Wenjia Zhang
2025-03-13 7:46 ` Wenjia Zhang
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Paolo Abeni @ 2025-03-11 8:59 UTC (permalink / raw)
To: Guangguan Wang, wenjia, pasic, jaka, alibuda, tonylu, guwen
Cc: davem, edumazet, kuba, horms, linux-rdma, linux-s390, netdev,
linux-kernel
On 3/4/25 1:43 PM, Guangguan Wang wrote:
> [...]
I understand Wenjia was opposed to this solution, as it may create
invalid topologies?
https://lore.kernel.org/netdev/08cd6e15-3f8c-47a0-8490-103d59abf910@linux.ibm.com/#t
Wenjia, could you please confirm?
Thanks,
Paolo
* Re: [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table
2025-03-11 8:59 ` Paolo Abeni
@ 2025-03-11 14:36 ` Wenjia Zhang
0 siblings, 0 replies; 7+ messages in thread
From: Wenjia Zhang @ 2025-03-11 14:36 UTC (permalink / raw)
To: Paolo Abeni, Guangguan Wang, pasic, jaka, alibuda, tonylu, guwen,
mjambigi, sidraya
Cc: davem, edumazet, kuba, horms, linux-rdma, linux-s390, netdev,
linux-kernel
On 11.03.25 09:59, Paolo Abeni wrote:
> On 3/4/25 1:43 PM, Guangguan Wang wrote:
>> [...]
>
> I understand Wenjia was opposed to this solution, as it may create
> invalid topologies?
>
> https://lore.kernel.org/netdev/08cd6e15-3f8c-47a0-8490-103d59abf910@linux.ibm.com/#t
>
> Wenjia, could you please confirm?
>
> Thanks,
>
> Paolo
>
Hi Paolo,
Thanks for asking! I really appreciate it.
I was initially opposed, but after discussing with Halil, I agreed that
my concerns might not be necessary. Halil and I agreed that he would
respond to the v1 emails to ask for this version, as he already did, and
that we would double-check version 2 to ensure it works correctly.
In any case, I still need to review it carefully and will provide my
answer as soon as possible.
Thanks,
Wenjia
* Re: [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table
2025-03-04 12:43 [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table Guangguan Wang
2025-03-11 8:59 ` Paolo Abeni
@ 2025-03-13 7:46 ` Wenjia Zhang
2025-03-13 8:09 ` Guangguan Wang
2025-03-13 15:42 ` Halil Pasic
2025-03-14 13:00 ` patchwork-bot+netdevbpf
3 siblings, 1 reply; 7+ messages in thread
From: Wenjia Zhang @ 2025-03-13 7:46 UTC (permalink / raw)
To: Guangguan Wang, pasic, jaka, alibuda, tonylu, guwen, mjambigi,
sidraya
Cc: davem, edumazet, kuba, pabeni, horms, linux-rdma, linux-s390,
netdev, linux-kernel
On 04.03.25 13:43, Guangguan Wang wrote:
> [...]
Hi Guangguan,
Sorry for the late answer! It looks good to me. Here is my R-b:
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Btw. could you give Halil some time for the review? He also wants to
have a look.
Thanks,
Wenjia
* Re: [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table
2025-03-13 7:46 ` Wenjia Zhang
@ 2025-03-13 8:09 ` Guangguan Wang
0 siblings, 0 replies; 7+ messages in thread
From: Guangguan Wang @ 2025-03-13 8:09 UTC (permalink / raw)
To: Wenjia Zhang, pasic, jaka, alibuda, tonylu, guwen, mjambigi,
sidraya
Cc: davem, edumazet, kuba, pabeni, horms, linux-rdma, linux-s390,
netdev, linux-kernel
On 2025/3/13 15:46, Wenjia Zhang wrote:
>
>
> On 04.03.25 13:43, Guangguan Wang wrote:
>> [...]
>
> Hi Guangguan,
>
> sorry for the late answer! It looks good to me. Here is my R-b:
>
> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
>
Thanks, Wenjia.
> Btw. could you give Halil some time for the review? He also wants to have a look.
It is OK.
Regards,
Guangguan Wang
>
> Thanks,
> Wenjia
>
* Re: [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table
2025-03-04 12:43 [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table Guangguan Wang
2025-03-11 8:59 ` Paolo Abeni
2025-03-13 7:46 ` Wenjia Zhang
@ 2025-03-13 15:42 ` Halil Pasic
2025-03-14 13:00 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 7+ messages in thread
From: Halil Pasic @ 2025-03-13 15:42 UTC (permalink / raw)
To: Guangguan Wang
Cc: wenjia, jaka, alibuda, tonylu, guwen, davem, edumazet, kuba,
pabeni, horms, linux-rdma, linux-s390, netdev, linux-kernel,
Halil Pasic
On Tue, 4 Mar 2025 20:43:04 +0800
Guangguan Wang <guangguan.wang@linux.alibaba.com> wrote:
> To resolve the problems list above, this patch extends to search user
> -defined sw pnetid in the clc handshake ndev when no pnetid can be found
> in the base_ndev, and the base_ndev take precedence over ndev for backward
> compatibility. This patch also can unify the pnetid setup of different
> network choices list above in container(Config user-defined sw pnetid in
> the netdevice can be seen in POD).
>
> Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
* Re: [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table
2025-03-04 12:43 [PATCH net-next v2] net/smc: use the correct ndev to find pnetid by pnetid table Guangguan Wang
` (2 preceding siblings ...)
2025-03-13 15:42 ` Halil Pasic
@ 2025-03-14 13:00 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-03-14 13:00 UTC (permalink / raw)
To: Guangguan Wang
Cc: wenjia, pasic, jaka, alibuda, tonylu, guwen, davem, edumazet,
kuba, pabeni, horms, linux-rdma, linux-s390, netdev, linux-kernel
Hello:
This patch was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:
On Tue, 4 Mar 2025 20:43:04 +0800 you wrote:
> When using smc_pnet in SMC, it will only search the pnetid in the
> base_ndev of the netdev hierarchy(both HW PNETID and User-defined
> sw pnetid). This may not work for some scenarios when using SMC in
> container on cloud environment.
> In container, there have choices of different container network,
> such as directly using host network, virtual network IPVLAN, veth,
> etc. Different choices of container network have different netdev
> hierarchy. Examples of netdev hierarchy show below. (eth0 and eth1
> in host below is the netdev directly related to the physical device).
> _______________________________
> | _________________ |
> | |POD | |
> | | | |
> | | eth0_________ | |
> | |____| |__| |
> | | | |
> | | | |
> | eth1|base_ndev| eth0_______ |
> | | | | RDMA ||
> | host |_________| |_______||
>
> [...]
Here is the summary with links:
- [net-next,v2] net/smc: use the correct ndev to find pnetid by pnetid table
https://git.kernel.org/netdev/net-next/c/bfc6c67ec2d6
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html