netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: "Ertman, David M" <david.m.ertman@intel.com>,
	Jakub Kicinski <kuba@kernel.org>,
	"Nikolova, Tatyana E" <tatyana.e.nikolova@intel.com>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"intel-wired-lan@lists.osuosl.org"
	<intel-wired-lan@lists.osuosl.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [Intel-wired-lan] [iwl-next v4 1/1] iidc/ice/irdma: Update IDC to support multiple consumers
Date: Fri, 14 Mar 2025 18:18:00 -0700	[thread overview]
Message-ID: <8b4868dd-f615-4049-a885-f2f95a3e7a54@intel.com> (raw)
In-Reply-To: <20250314181230.GP1322339@unreal>



On 3/14/2025 11:12 AM, Leon Romanovsky wrote:
> On Thu, Mar 13, 2025 at 04:38:39PM -0700, Samudrala, Sridhar wrote:
>>
>>
>> On 3/2/2025 12:26 AM, Leon Romanovsky wrote:
>>> On Wed, Feb 26, 2025 at 11:01:52PM +0000, Ertman, David M wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Leon Romanovsky <leon@kernel.org>
>>>>> Sent: Wednesday, February 26, 2025 10:50 AM
>>>>> To: Ertman, David M <david.m.ertman@intel.com>
>>>>> Cc: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>; jgg@nvidia.com;
>>>>> intel-wired-lan@lists.osuosl.org; linux-rdma@vger.kernel.org;
>>>>> netdev@vger.kernel.org
>>>>> Subject: Re: [iwl-next v4 1/1] iidc/ice/irdma: Update IDC to support multiple
>>>>> consumers
>>>>>
>>>>> On Wed, Feb 26, 2025 at 05:36:44PM +0000, Ertman, David M wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Leon Romanovsky <leon@kernel.org>
>>>>>>> Sent: Monday, February 24, 2025 11:56 PM
>>>>>>> To: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>
>>>>>>> Cc: jgg@nvidia.com; intel-wired-lan@lists.osuosl.org; linux-
>>>>>>> rdma@vger.kernel.org; netdev@vger.kernel.org; Ertman, David M
>>>>>>> <david.m.ertman@intel.com>
>>>>>>> Subject: Re: [iwl-next v4 1/1] iidc/ice/irdma: Update IDC to support
>>>>> multiple
>>>>>>> consumers
>>>>>>>
>>>>>>> On Mon, Feb 24, 2025 at 11:04:28PM -0600, Tatyana Nikolova wrote:
>>>>>>>> From: Dave Ertman <david.m.ertman@intel.com>
>>>>>>>>
>>>>>>>> To support RDMA for E2000 product, the idpf driver will use the IDC
>>>>>>>> interface with the irdma auxiliary driver, thus becoming a second
>>>>>>>> consumer of it. This requires the IDC be updated to support multiple
>>>>>>>> consumers. The use of exported symbols no longer makes sense
>>>>> because it
>>>>>>>> will require all core drivers (ice/idpf) that can interface with irdma
>>>>>>>> auxiliary driver to be loaded even if hardware is not present for those
>>>>>>>> drivers.
>>>>>>>
>>>>>>> In auxiliary bus world, the code drivers (ice/idpf) need to created
>>>>>>> auxiliary devices only if specific device present. That auxiliary device
>>>>>>> will trigger the load of specific module (irdma in our case).
>>>>>>>
>>>>>>> EXPORT_SYMBOL won't trigger load of irdma driver, but the opposite is
>>>>>>> true, load of irdma will trigger load of ice/idpf drivers (depends on
>>>>>>> their exported symbol).
>>>>>>>
>>>>>>>>
>>>>>>>> To address this, implement an ops struct that will be universal set of
>>>>>>>> naked function pointers that will be populated by each core driver for
>>>>>>>> the irdma auxiliary driver to call.
>>>>>>>
>>>>>>> No, we worked very hard to make proper HW discovery and driver
>>>>> autoload,
>>>>>>> let's not return back. For now, it is no-go.
>>>>>>
>>>>>> Hi Leon,
>>>>>>
>>>>>> I am a little confused about what the problem here is.  The main issue I pull
>>>>>> from your response is: Removing exported symbols will stop ice/idpf from
>>>>>> autoloading when irdma loads.  Is this correct or did I miss your point?
>>>>>
>>>>> It is one of the main points.
>>>>>
>>>>>>
>>>>>> But, if there is an ice or idpf supported device present in the system, the
>>>>>> appropriate driver will have already been loaded anyway (and gone
>>>>> through its
>>>>>> probe flow to create auxiliary devices).  If it is not loaded, then the system
>>>>> owner
>>>>>> has either unloaded it manually or blacklisted it.  This would not cause an
>>>>> issue
>>>>>> anyway, since irdma and ice/idpf can load in any order.
>>>>>
>>>>> There are two assumptions above, which both not true.
>>>>> 1. Users never issue "modprobe irdma" command alone and always will call
>>>>> to whole chain "modprobe ice ..." before.
>>>>> 2. You open-code module subsystem properly with reference counters,
>>>>> ownership and locks to protect from function pointers to be set/clear
>>>>> dynamically.
>>>>
>>>> Ah, I see your reasoning now.  Our goal was to make the two modules independent,
>>>> with no prescribed load order mandated, and utilize the auxiliary bus and device subsystem
>>>> to handle load order and unload of one or the other module.  The auxiliary device only has
>>>> the lifespan of the core PCI driver, so if the core driver unloads, then the auxiliary device gets
>>>> destroyed, and the associated link based off it will be gone.  We wanted to be able to unload
>>>> and reload either of the modules (core or irdma) and have the interaction be able to restart with a
>>>> new probe.  All our inter-driver function calls are protected by device lock on the auxiliary
>>>> device for the duration of the call.
>>>
>>> Yes, you are trying to return to pre-aux era.
>>
>>
>> The main motivation to go with callbacks to the parent driver instead of
>> using exported symbols is to allow loading only the parent driver required
>> for a particular deployment. irdma is a common rdma auxilary driver that
>> supports multiple parent pci drivers(ice, idpf, i40e, iavf). If we use
>> exported symbols, all these modules will get loaded even on a system with
>> only idpf device.
> 
> It is not how kernel works. IRDMA doesn't call to all exported symbols
> of all these modules. It will call to only one exported symbol and that
> module will be loaded.

If we are using plain exported symbols from all the parent pci drivers 
and make direct calls from irdma, i would expect that all the drivers 
need to load based on module dependencies.

Are you referring to the usage of request_module() based dynamic loading 
of the modules?

> 
>>
>> The documentation for auxiliary bus
>> 	https://docs.kernel.org/driver-api/auxiliary_bus.html
>> shows an example on how shared data/callbacks can be used to establish
>> connection with the parent.
> 
> I'm aware of this documentation, it is incorrect. You can find the
> explanation why this documentation exists in habanalabs discussion.

Will look into that discussion.

> 
>>
>> Auxiliary devices are created and registered by a subsystem-level core
>> device that needs to break up its functionality into smaller fragments. One
>> way to extend the scope of an auxiliary_device is to encapsulate it within a
>> domain-specific structure defined by the parent device. This structure
>> contains the auxiliary_device and any associated shared data/callbacks
>> needed to establish the connection with the parent.
>>
>> An example is:
>>
>>   struct foo {
>>        struct auxiliary_device auxdev;
>>        void (*connect)(struct auxiliary_device *auxdev);
>>        void (*disconnect)(struct auxiliary_device *auxdev);
>>        void *data;
>> };
>>
>> This example clearly shows that it is OK to use callbacks from the aux
>> driver. The aux driver is dependent on the parent driver and the parent
>> driver will guarantee that it will not get unloaded until all the auxiliary
>> devices are destroyed.
>>
>> Hopefully you will understand our motivation of going with this design and
>> not force us to go with a solution that is not optimal.
> 
> Feel free to fix documentation.
> 
> Thanks
> 
>>
>> Thanks
>> Sridhar
>>


  reply	other threads:[~2025-03-15  1:18 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-25  5:04 [iwl-next v4 0/1] Add RDMA support for Intel IPU E2000 (GEN3) Tatyana Nikolova
2025-02-25  5:04 ` [iwl-next v4 1/1] iidc/ice/irdma: Update IDC to support multiple consumers Tatyana Nikolova
2025-02-25  7:55   ` Leon Romanovsky
2025-02-26 17:36     ` Ertman, David M
2025-02-26 18:50       ` Leon Romanovsky
2025-02-26 23:01         ` Ertman, David M
2025-03-02  8:26           ` Leon Romanovsky
2025-03-13 23:38             ` [Intel-wired-lan] " Samudrala, Sridhar
2025-03-14 18:12               ` Leon Romanovsky
2025-03-15  1:18                 ` Samudrala, Sridhar [this message]
2025-03-17 11:57                   ` Leon Romanovsky
2025-03-18 17:01                     ` Samudrala, Sridhar
2025-03-18 17:20                       ` Jason Gunthorpe
2025-03-18 19:45                         ` Samudrala, Sridhar
2025-03-19  8:04                           ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8b4868dd-f615-4049-a885-f2f95a3e7a54@intel.com \
    --to=sridhar.samudrala@intel.com \
    --cc=david.m.ertman@intel.com \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tatyana.e.nikolova@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).