public inbox for linux-s390@vger.kernel.org
 help / color / mirror / Atom feed
From: Guangguan Wang <guangguan.wang@linux.alibaba.com>
To: Halil Pasic <pasic@linux.ibm.com>, Paolo Abeni <pabeni@redhat.com>
Cc: wenjia@linux.ibm.com, jaka@linux.ibm.com,
	alibuda@linux.alibaba.com, tonylu@linux.alibaba.com,
	guwen@linux.alibaba.com, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, horms@kernel.org,
	linux-rdma@vger.kernel.org, linux-s390@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Alexandra Winter <wintera@linux.ibm.com>
Subject: Re: [PATCH net] net/smc: use the correct ndev to find pnetid by pnetid table
Date: Wed, 8 Jan 2025 12:57:00 +0800	[thread overview]
Message-ID: <908be351-b4f8-4c25-9171-4f033e11ffc4@linux.alibaba.com> (raw)
In-Reply-To: <20250107203218.5787acb4.pasic@linux.ibm.com>



On 2025/1/8 03:32, Halil Pasic wrote:
> On Tue, 7 Jan 2025 09:44:30 +0100
> Paolo Abeni <pabeni@redhat.com> wrote:
> 
>> On 12/27/24 5:04 AM, Guangguan Wang wrote:
> 
> @Guangguan Wang: please use my linux.ibm.com address
> in the future.
Get it.

> 
>>> The command 'smc_pnet -a -I <ethx> <pnetid>' will add <pnetid>
>>> to the pnetid table and will attach the <pnetid> to net device
>>> whose name is <ethx>. But When do SMCR by <ethx>, in function
>>> smc_pnet_find_roce_by_pnetid, it will use <ethx>'s base ndev's
>>> pnetid to match rdma device, not <ethx>'s pnetid. The asymmetric
>>> use of the pnetid seems weird. Sometimes it is difficult to know
>>> the hierarchy of net device what may make it difficult to configure
>>> the pnetid and to use the pnetid. Looking into the history of
>>> commit, it was the commit 890a2cb4a966 ("net/smc: rework pnet table")
>>> that changes the ndev from the <ethx> to the <ethx>'s base ndev
>>> when finding pnetid by pnetid table. It seems a mistake.
>>>
>>> This patch changes the ndev back to the <ethx> when finding pnetid
>>> by pnetid table.
>>>
>>> Fixes: 890a2cb4a966 ("net/smc: rework pnet table")
>>> Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>  
>>
>> If I read correctly, this will break existing applications using the
>> lookup schema introduced by the blamed commit - which is not very
>> recent.
> 
> Hi Paolo,
> 
> sorry for chiming in late. Wenjia is on vacation and Jan is out sick!
> After some reading and thinking I could not figure out how 890a2cb4a966
> ("net/smc: rework pnet table") is broken.

Before commit 890a2cb4a966:
smc_pnet_find_roce_resource
    smc_pnet_find_roce_by_pnetid(ndev, ...) /* lookup via hardware-defined pnetid */
        smc_pnetid_by_dev_port(base_ndev, ...)
    smc_pnet_find_roce_by_table(ndev, ...) /* lookup via SMC PNET table */
    {
        ...
        list_for_each_entry(pnetelem, &smc_pnettable.pnetlist, list) {
                if (ndev == pnetelem->ndev) { /* notice here, it was ndev to matching pnetid element in pnet table */
        ...
    }

After commit 890a2cb4a966:
smc_pnet_find_roce_resource
    smc_pnet_find_roce_by_pnetid
    {
        ...
        base_ndev = pnet_find_base_ndev(ndev); /* rename the variable name to base_ndev for better understanding */
        smc_pnetid_by_dev_port(base_ndev, ...)
        smc_pnet_find_ndev_pnetid_by_table(base_ndev, ...)
        {
                ...
                list_for_each_entry(pnetelem, &smc_pnettable.pnetlist, list) {
                if (base_ndev == pnetelem->ndev) { /* notice here, it is base_ndev to matching pnetid element in pnet table */
                ...
        }

    }

The commit 890a2cb4a966 has changed ndev to base_ndev when matching pnetid element in pnet table.
But in the function smc_pnet_add_eth, the pnetid is attached to the ndev itself, not the base_ndev.
smc_pnet_add_eth(...)
{
    ...
    ndev = dev_get_by_name(net, eth_name);
    ...
        if (new_netdev) {
            if (ndev) {
                new_pe->ndev = ndev;
                netdev_tracker_alloc(ndev, &new_pe->dev_tracker,
                    GFP_ATOMIC);
            }
            list_add_tail(&new_pe->list, &pnettable->pnetlist);
            mutex_unlock(&pnettable->lock);
        } else {
    ...
}

> 
> Admittedly I'm not really a net guy,and I'm mostly guessing what that
> lower and upper device stuff is, so please bear with me. All that said, I
> do think that going to the lowest netdev in the hierarchy is a sane
> thing to do here.  I assume  that lower and upper devices are applicable
> to stuff like bonding. 
> 
> PNETID stands for "Physical Network Identifier" and the idea is that iff
> two ports are connected to the same physical network then they should
> have the same PNETID. And on s390 PNETID can come and often is comming
> "from the hardware". Now for something like a bond of two OSA
> interfaces, I would expect the two legs of the bond to probably have a
> "HW PNETID", but the netdev representing the bond itself won't have one
> unless the Linux admin defines a software PNETID, which is work, and
> can't have a HW PNETID because it is a software construct within Linux.
> Breaking for example an active-backup bond setup where the legs have
> HW PNETIDs and the admin did not bother to specify a PNETID for the bond
> is not acceptable.
> 
> Let me also note that if ndev is a leaf (i.e. there is no lower device to
> it) then ndev == base_ndev, and the whole discussion does not matter for
> that case.
> 
> Again I have to emphasize that my domain knowledge is very limited, but
> I really don't feel comfortable going forward with this without Jan or
> Wenjia weighing in on the matter.
> 
> Paolo thanks for bringing this up!
> 
>>
>> Perhaps for a net patch would be better to support both lookup schemas
>> i.e.
>>
>> 	(smc_pnet_find_ndev_pnetid_by_table(ndev, ndev_pnetid) ||
>> 	 smc_pnet_find_ndev_pnetid_by_table(base_ndev, ndev_pnetid))
>>
>> ?
>>
> 
> Hm, I guess the idea here is that if ndev has a PNETID then it should
> take precedence, but if not we should try to obtain the PNETID of its
> "base_ndev". I'm not sure this would make things better compared to the
> original idea of caring about the leaf. Which makes me question my
> understanding of the problem statement from the commit message.
> 
> BTW to implement the logic proposed by you Paolo, as understood by me,
> we would have to use "&&" instead of "||". The whole expression is
> supposed
> to evaluate to false if a pnetid is found and to true if no pnet_id is
> found. smc_pnet_find_ndev_pnetid_by_table(ndev) returns false if a pnetid
> is found. I.e. if not found we would just short circuit to true and not
> call smc_pnet_find_ndev_pnetid_by_table(base_ndev), which is not what I
> believe you wanted to propose.

Yes, it should be
        (smc_pnet_find_ndev_pnetid_by_table(ndev, ndev_pnetid) &&
 	 smc_pnet_find_ndev_pnetid_by_table(base_ndev, ndev_pnetid))
if for the consideration of application's compatible usage of smc_pnet.

Thanks,
Guangguan Wang

> 
> To sum it up, please let us wait until Wenjia or Jan chime in. Copying
> Alexandra as well: she is more of a net person than I am, and maybe she
> has a more informed opinion.
> 
> Regards,
> Halil
> [...]
> 

  reply	other threads:[~2025-01-08  4:57 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-27  4:04 [PATCH net] net/smc: use the correct ndev to find pnetid by pnetid table Guangguan Wang
2025-01-04 16:40 ` Jakub Kicinski
2025-01-07  2:17 ` Wen Gu
2025-01-07  8:44 ` Paolo Abeni
2025-01-07 19:32   ` Halil Pasic
2025-01-08  4:57     ` Guangguan Wang [this message]
2025-01-09  3:04       ` Halil Pasic
2025-01-10  5:43         ` Guangguan Wang
2025-01-14 12:07           ` Halil Pasic
2025-01-15 11:53             ` Guangguan Wang
2025-02-10 11:16               ` Guangguan Wang
2025-02-10 13:13                 ` Wenjia Zhang
2025-02-10 14:20                 ` Halil Pasic
2025-02-10 14:19               ` Halil Pasic
2025-02-10 13:52           ` Halil Pasic
2025-02-11  3:44             ` Guangguan Wang
2025-03-03 14:24               ` Halil Pasic
2025-03-04  2:39                 ` Guangguan Wang
2025-01-08 16:00     ` Alexandra Winter
2025-01-10  6:39       ` Guangguan Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=908be351-b4f8-4c25-9171-4f033e11ffc4@linux.alibaba.com \
    --to=guangguan.wang@linux.alibaba.com \
    --cc=alibuda@linux.alibaba.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=guwen@linux.alibaba.com \
    --cc=horms@kernel.org \
    --cc=jaka@linux.ibm.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=tonylu@linux.alibaba.com \
    --cc=wenjia@linux.ibm.com \
    --cc=wintera@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox