From: Guangguan Wang <guangguan.wang@linux.alibaba.com>
To: Alexandra Winter <wintera@linux.ibm.com>,
Halil Pasic <pasic@linux.ibm.com>,
Paolo Abeni <pabeni@redhat.com>
Cc: wenjia@linux.ibm.com, jaka@linux.ibm.com, PASIC@de.ibm.com,
alibuda@linux.alibaba.com, tonylu@linux.alibaba.com,
guwen@linux.alibaba.com, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, horms@kernel.org,
linux-rdma@vger.kernel.org, linux-s390@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net] net/smc: use the correct ndev to find pnetid by pnetid table
Date: Fri, 10 Jan 2025 14:39:01 +0800 [thread overview]
Message-ID: <68059840-3aed-44e5-ac6e-9b1cabca392d@linux.alibaba.com> (raw)
In-Reply-To: <3ff078e0-150d-41ba-b705-a8e0365f0370@linux.ibm.com>
On 2025/1/9 00:00, Alexandra Winter wrote:
>
>
> On 07.01.25 20:32, Halil Pasic wrote:
>> On Tue, 7 Jan 2025 09:44:30 +0100
>> Paolo Abeni <pabeni@redhat.com> wrote:
>>
>>> On 12/27/24 5:04 AM, Guangguan Wang wrote:
> ...
>>>> The command 'smc_pnet -a -I <ethx> <pnetid>' will add <pnetid>
>>>> to the pnetid table and will attach the <pnetid> to net device
>>>> whose name is <ethx>. But When do SMCR by <ethx>, in function
>>>> smc_pnet_find_roce_by_pnetid, it will use <ethx>'s base ndev's
>>>> pnetid to match rdma device, not <ethx>'s pnetid. The asymmetric
>>>> use of the pnetid seems weird. Sometimes it is difficult to know
>>>> the hierarchy of net device what may make it difficult to configure
>>>> the pnetid and to use the pnetid. Looking into the history of
>>>> commit, it was the commit 890a2cb4a966 ("net/smc: rework pnet table")
>>>> that changes the ndev from the <ethx> to the <ethx>'s base ndev
>>>> when finding pnetid by pnetid table. It seems a mistake.
>>>>
>>>> This patch changes the ndev back to the <ethx> when finding pnetid
>>>> by pnetid table.
>>>>
>>>> Fixes: 890a2cb4a966 ("net/smc: rework pnet table")
>>>> Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
>>>
>>> If I read correctly, this will break existing applications using the
>>> lookup schema introduced by the blamed commit - which is not very
>>> recent.
>
>
> I agree that this patch may break existing applications or existing
> configuration automation scripts.
>
> ...
>> PNETID stands for "Physical Network Identifier" and the idea is that iff
>> two ports are connected to the same physical network then they should
>> have the same PNETID. And on s390 PNETID can come and often is comming
>> "from the hardware".
> ...
>
>
> HW pnetids (smc_pnetid_by_dev_port()) are only visible at the base netdevice.
> It seems that the pnetid table, managed by the smc_pnet tool, tries to mimick
> that behaviour, and the concept (recommendation?) would be to set the
> user-defined pnetid also for the base netdevice and then use the upper
> level netdevices for the tcp connection. Which makes some sense,
> all upper level devices have the same connectivity as the base device.
>
> So this patch would break a setup that follows this concept and only sets the
> pnetid at the base netdevice.
>
Hi Alexandra,
See the examples of using smc-r in container on cloud environment here
https://lore.kernel.org/linux-s390/3ff078e0-150d-41ba-b705-a8e0365f0370@linux.ibm.com/T/#t.
Especially the example of using veth in container, it can not successfully match the expected
RDMA device when do SMC-R in POD, if follow the concept of the HW pnetid and only sets the
pnetid at the base netdevice.
Maybe it is time to extend the concept and usage of pnetid table?
>
> Optionally you can set a user-defined pnetid on upper level devices (maybe for
> usability??), but as Guangguan noticed, that has no practical impact.
> In the documentation I see examples where the same pnetid is set for upper
> and base device.
> You cannot set a user-defined pnetid on a upper level device, if the base
> device has a HW pnetid (smc_pnet_add_eth()) which makes some sense,
> not even the same pnetid (makes less sense IMO).
> However you can set different user-defined pnetids on the upper netdevices
> and the base device, which makes no sense to me.
>
>>>
>>> Perhaps for a net patch would be better to support both lookup schemas
>>> i.e.
>>>
>>> (smc_pnet_find_ndev_pnetid_by_table(ndev, ndev_pnetid) ||
>>> smc_pnet_find_ndev_pnetid_by_table(base_ndev, ndev_pnetid))
>>>
>>> ?
>
> This may yield undesirable results, if base pnetid and upper pnetid differ..
> But then maybe GIGO?
>
>
> ...
>> BTW to implement the logic proposed by you Paolo, as understood by me,
>> we would have to use "&&" instead of "||".
> +1
>
>
> Another idea may be to change the setting of a user-defined pnetid
> on an upper level netdevice to
> - fail if the base netdevice has a different pnetid
> - set the pnetid of the base device , if it is currently unset.
>
In the example of veth, the base ndev found through the upper level ndev in POD
is not the real "base device" describe here.
And I wonder whether it is confused if set a user-defined pnetid to one netdevice,
but list another netdevice when smc_pnet show. For example set pnetid ABC to eth1,
whose base netdev is eth0, but when smc_pnet show, only can find eth0 with pnetid
ABC.
Thanks,
Guangguan Wang
prev parent reply other threads:[~2025-01-10 6:39 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-27 4:04 [PATCH net] net/smc: use the correct ndev to find pnetid by pnetid table Guangguan Wang
2025-01-04 16:40 ` Jakub Kicinski
2025-01-07 2:17 ` Wen Gu
2025-01-07 8:44 ` Paolo Abeni
2025-01-07 19:32 ` Halil Pasic
2025-01-08 4:57 ` Guangguan Wang
2025-01-09 3:04 ` Halil Pasic
2025-01-10 5:43 ` Guangguan Wang
2025-01-14 12:07 ` Halil Pasic
2025-01-15 11:53 ` Guangguan Wang
2025-02-10 11:16 ` Guangguan Wang
2025-02-10 13:13 ` Wenjia Zhang
2025-02-10 14:20 ` Halil Pasic
2025-02-10 14:19 ` Halil Pasic
2025-02-10 13:52 ` Halil Pasic
2025-02-11 3:44 ` Guangguan Wang
2025-03-03 14:24 ` Halil Pasic
2025-03-04 2:39 ` Guangguan Wang
2025-01-08 16:00 ` Alexandra Winter
2025-01-10 6:39 ` Guangguan Wang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=68059840-3aed-44e5-ac6e-9b1cabca392d@linux.alibaba.com \
--to=guangguan.wang@linux.alibaba.com \
--cc=PASIC@de.ibm.com \
--cc=alibuda@linux.alibaba.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=guwen@linux.alibaba.com \
--cc=horms@kernel.org \
--cc=jaka@linux.ibm.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pasic@linux.ibm.com \
--cc=tonylu@linux.alibaba.com \
--cc=wenjia@linux.ibm.com \
--cc=wintera@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox