netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Omer Shpigelman <oshpigelman@habana.ai>
To: Leon Romanovsky <leon@kernel.org>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"ogabbay@kernel.org" <ogabbay@kernel.org>,
	Zvika Yehudai <zyehudai@habana.ai>
Subject: Re: [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver
Date: Mon, 1 Jul 2024 10:46:48 +0000	[thread overview]
Message-ID: <a19d80d4-e452-461c-a060-2c94030301a7@habana.ai> (raw)
In-Reply-To: <20240630132911.GB176465@unreal>

On 6/30/24 16:29, Leon Romanovsky wrote:
> On Fri, Jun 28, 2024 at 10:24:32AM +0000, Omer Shpigelman wrote:
>> On 6/19/24 13:52, Leon Romanovsky wrote:
>>> On Wed, Jun 19, 2024 at 09:27:54AM +0000, Omer Shpigelman wrote:
>>>> On 6/18/24 15:58, Leon Romanovsky wrote:
>>>>> On Tue, Jun 18, 2024 at 11:08:34AM +0000, Omer Shpigelman wrote:
>>>>>> On 6/17/24 22:04, Leon Romanovsky wrote:
>>>>>>> [Some people who received this message don't often get email from leon@kernel.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
>>>>>>>
>>>>>>> On Mon, Jun 17, 2024 at 05:43:49PM +0000, Omer Shpigelman wrote:
>>>>>>>> On 6/13/24 22:18, Leon Romanovsky wrote:
>>>>>>>>> [Some people who received this message don't often get email from leon@kernel.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
>>>>>>>>>
>>>>>>>>> On Thu, Jun 13, 2024 at 11:22:04AM +0300, Omer Shpigelman wrote:
>>>>>>>>>> Add an RDMA driver of Gaudi ASICs family for AI scaling.
>>>>>>>>>> The driver itself is agnostic to the ASIC in action, it operates according
>>>>>>>>>> to the capabilities that were passed on device initialization.
>>>>>>>>>> The device is initialized by the hbl_cn driver via auxiliary bus.
>>>>>>>>>> The driver also supports QP resource tracking and port/device HW counters.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
>>>>>>>>>> Co-developed-by: Abhilash K V <kvabhilash@habana.ai>
>>>>>>>>>> Signed-off-by: Abhilash K V <kvabhilash@habana.ai>
>>>>>>>>>> Co-developed-by: Andrey Agranovich <aagranovich@habana.ai>
>>>>>>>>>> Signed-off-by: Andrey Agranovich <aagranovich@habana.ai>
>>>>>>>>>> Co-developed-by: Bharat Jauhari <bjauhari@habana.ai>
>>>>>>>>>> Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
>>>>>>>>>> Co-developed-by: David Meriin <dmeriin@habana.ai>
>>>>>>>>>> Signed-off-by: David Meriin <dmeriin@habana.ai>
>>>>>>>>>> Co-developed-by: Sagiv Ozeri <sozeri@habana.ai>
>>>>>>>>>> Signed-off-by: Sagiv Ozeri <sozeri@habana.ai>
>>>>>>>>>> Co-developed-by: Zvika Yehudai <zyehudai@habana.ai>
>>>>>>>>>> Signed-off-by: Zvika Yehudai <zyehudai@habana.ai>
>>>>>>>>>
>>
>> <...>
>>
>>>> mlx5 IB driver doesn't export any symbol that is used by the core driver,
>>>> that's why the core driver can be loaded without the IB driver (althought
>>>> you'll get circular dependency if you would export).
>>>
>>> Yes, IB and ETH drivers are "users" of core driver. As RDMA maintainer,
>>> I'm reluctant to accept code that exports symbols from IB drivers to
>>> other subsystems. We have drivers/infiniband/core/ for that.
>>>
>>
>> We need the core driver to access the IB driver (and to the ETH driver as
>> well). As you wrote, we can't use exported symbols from our IB driver nor
>> rely on function pointers, but what about providing the core driver an ops
>> structure? meaning exporting a register function from the core driver that
>> should be called by the IB driver during auxiliary device probe.
>> Something like:
>>
>> int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev,
>> 			       struct hbl_ib_ops *ops)
>> {
>> ...
>> }
>> EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev);
>>
>> That's how only the parent driver exports symbols to the son driver so the
>> IB driver is a "user" of the core driver and so we count on the internal
>> module reference counter. But we also get the ability to access the IB
>> driver from the core driver (to report a HW error for example).
> 
> Before you are talking about solutions, please explain in technical
> terms why you absolutely need to access IB from core driver and any
> other possible way is not possible.
> 
> Thanks

First of all, as a general assumption, everything we do today can also be
done with unidirectional drivers communication only. If the parent driver
cannot access the son driver directly, then we can have a blocking command
queue on the parent side that the parent driver will push to it and the
son driver will fetch from it, execute the command and unblock the parent.
That will work but it adds complexity which I'm not sure that is needed.
The second point is not necessarily about the direction of the
communication but more about generally using function pointers rather than
exported symbols - we have 2 flavors of functions for inter driver
communications: common functions and ASIC specific functions. The ASIC
specific functions are exposed and initialized per ASIC. If we convert
them to EXPORT_SYMBOLs then we expose ASIC specific functions regardless
of the ASIC in action.
Again, that will work but seems unnecessary. We can check the ASIC type
that was passed in each exported function and fail if a wrong ASIC type
was used, but it seems to me like an incorrect approach to use exported
symbols for ASIC specific communication. EXPORT_SYMBOLs were meant to be
used for driver level communication, not for utilizing device specific
capabilities. For that, an ops struct seems more appropriate.
That's why I'm suggesting to combine both exported symbols and function
pointers.

  reply	other threads:[~2024-07-01 10:47 UTC|newest]

Thread overview: 108+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-13  8:21 [PATCH 00/15] Introduce HabanaLabs network drivers Omer Shpigelman
2024-06-13  8:21 ` [PATCH 01/15] net: hbl_cn: add habanalabs Core Network driver Omer Shpigelman
2024-06-13 13:01   ` Przemek Kitszel
2024-06-13 14:16     ` Przemek Kitszel
2024-06-17  8:08     ` Omer Shpigelman
2024-06-17 11:48       ` Leon Romanovsky
2024-06-18  7:28         ` Omer Shpigelman
2024-06-15  0:05   ` Stephen Hemminger
2024-06-17  8:14     ` Omer Shpigelman
2024-06-17 14:05   ` Markus Elfring
2024-06-17 15:02     ` Andrew Lunn
2024-06-18  7:51       ` Omer Shpigelman
2024-06-13  8:21 ` [PATCH 02/15] net: hbl_cn: memory manager component Omer Shpigelman
2024-06-13  8:21 ` [PATCH 03/15] net: hbl_cn: physical layer support Omer Shpigelman
2024-06-13  8:21 ` [PATCH 04/15] net: hbl_cn: QP state machine Omer Shpigelman
2024-06-17 13:18   ` Leon Romanovsky
2024-06-18  5:50     ` Omer Shpigelman
2024-06-18  7:08       ` Leon Romanovsky
2024-06-18  7:58         ` Omer Shpigelman
2024-06-18  9:00           ` Leon Romanovsky
2024-06-24  7:24             ` Omer Shpigelman
2024-06-13  8:21 ` [PATCH 05/15] net: hbl_cn: memory trace events Omer Shpigelman
2024-06-13  8:21 ` [PATCH 06/15] net: hbl_cn: debugfs support Omer Shpigelman
2024-06-19 18:35   ` Sunil Kovvuri Goutham
2024-06-21 10:17     ` Omer Shpigelman
2024-06-21 10:30       ` Sunil Kovvuri Goutham
2024-06-23  7:25         ` Omer Shpigelman
2024-06-21 15:33       ` Andrew Lunn
2024-06-23  6:57         ` Omer Shpigelman
2024-06-23 15:02           ` Andrew Lunn
2024-06-24  7:21             ` Omer Shpigelman
2024-06-24  9:22             ` Leon Romanovsky
2024-12-17 10:00             ` Avri Kehat
2024-12-17 10:52               ` Andrew Lunn
2024-06-13  8:22 ` [PATCH 07/15] net: hbl_cn: gaudi2: ASIC register header files Omer Shpigelman
2024-06-13  8:22 ` [PATCH 08/15] net: hbl_cn: gaudi2: ASIC specific support Omer Shpigelman
2024-06-13  8:22 ` [PATCH 09/15] net: hbl_en: add habanalabs Ethernet driver Omer Shpigelman
2024-06-13 21:49   ` Andrew Lunn
2024-06-18  6:58     ` Omer Shpigelman
2024-06-18 14:19       ` Andrew Lunn
2024-06-19  7:16         ` Omer Shpigelman
2024-06-19  8:01           ` Przemek Kitszel
2024-06-19 12:15             ` Omer Shpigelman
2024-06-19 15:21           ` Jakub Kicinski
2024-06-20  8:43             ` Omer Shpigelman
2024-06-20 13:51               ` Jakub Kicinski
2024-06-20 19:14                 ` Andrew Lunn
2024-06-23 14:48                   ` Omer Shpigelman
2024-06-19 16:13           ` Andrew Lunn
2024-06-23  6:22             ` Omer Shpigelman
2024-06-23 14:46               ` Andrew Lunn
2024-06-26 10:13                 ` Omer Shpigelman
2024-06-26 14:13                   ` Andrew Lunn
2024-06-30  7:11                     ` Omer Shpigelman
2024-06-14 22:48   ` Joe Damato
2024-06-16  1:04     ` Andrew Lunn
2024-06-18 19:37     ` Omer Shpigelman
2024-06-18 21:19       ` Stephen Hemminger
2024-06-19 12:13         ` Omer Shpigelman
2024-06-15  0:10   ` Stephen Hemminger
2024-06-19 12:07     ` Omer Shpigelman
2024-06-15  0:16   ` Stephen Hemminger
2024-06-18 19:39     ` Omer Shpigelman
2024-06-19 15:40       ` Andrew Lunn
2024-06-20  8:36         ` Omer Shpigelman
2024-06-15 10:55   ` Zhu Yanjun
2024-06-18 11:16     ` Omer Shpigelman
2024-06-15 17:13   ` Zhu Yanjun
2024-06-16  1:08     ` Andrew Lunn
2024-06-13  8:22 ` [PATCH 10/15] net: hbl_en: gaudi2: ASIC specific support Omer Shpigelman
2024-06-13  8:22 ` [PATCH 11/15] RDMA/hbl: add habanalabs RDMA driver Omer Shpigelman
2024-06-13 19:18   ` Leon Romanovsky
2024-06-17 17:43     ` Omer Shpigelman
2024-06-17 19:04       ` Leon Romanovsky
2024-06-18 11:08         ` Omer Shpigelman
2024-06-18 12:58           ` Leon Romanovsky
2024-06-19  9:27             ` Omer Shpigelman
2024-06-19 10:52               ` Leon Romanovsky
2024-06-24  8:47                 ` Omer Shpigelman
2024-06-24  9:10                   ` Leon Romanovsky
2024-06-28 10:24                 ` Omer Shpigelman
2024-06-30 13:29                   ` Leon Romanovsky
2024-07-01 10:46                     ` Omer Shpigelman [this message]
2024-07-01 12:46                       ` Leon Romanovsky
2024-07-12 13:08                   ` Jason Gunthorpe
2024-07-14 10:18                     ` Omer Shpigelman
2024-07-16 13:40                       ` Jason Gunthorpe
2024-07-17  7:08                         ` Omer Shpigelman
2024-07-17  7:36                           ` Leon Romanovsky
2024-07-17 10:51                             ` Omer Shpigelman
2024-07-17 11:56                               ` Jason Gunthorpe
2024-07-17 12:33                               ` Leon Romanovsky
2024-07-18  6:54                                 ` Omer Shpigelman
2024-07-18  8:31                                   ` Leon Romanovsky
2024-06-18 16:01           ` Przemek Kitszel
2024-06-19  9:34             ` Omer Shpigelman
2024-06-17 14:17   ` Jason Gunthorpe
2024-06-19  9:39     ` Omer Shpigelman
2024-06-13  8:22 ` [PATCH 12/15] RDMA/hbl: direct verbs support Omer Shpigelman
2024-06-13  8:22 ` [PATCH 13/15] accel/habanalabs: network scaling support Omer Shpigelman
2024-06-19 18:41   ` Sunil Kovvuri Goutham
2024-06-21 10:21     ` Omer Shpigelman
2024-06-13  8:22 ` [PATCH 14/15] accel/habanalabs/gaudi2: CN registers header files Omer Shpigelman
2024-06-13  8:22 ` [PATCH 15/15] accel/habanalabs/gaudi2: network scaling support Omer Shpigelman
2024-06-17 12:34 ` [PATCH 00/15] Introduce HabanaLabs network drivers Alexander Lobakin
2024-06-19 11:40   ` Omer Shpigelman
2024-06-19 16:33 ` Jiri Pirko
2024-06-20  5:37   ` Omer Shpigelman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a19d80d4-e452-461c-a060-2c94030301a7@habana.ai \
    --to=oshpigelman@habana.ai \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=ogabbay@kernel.org \
    --cc=zyehudai@habana.ai \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).