From: Przemek Kitszel <przemyslaw.kitszel@intel.com>
To: Shay Drori <shayd@nvidia.com>, Greg KH <gregkh@linuxfoundation.org>
Cc: <netdev@vger.kernel.org>, <pabeni@redhat.com>,
	<davem@davemloft.net>, <kuba@kernel.org>, <edumazet@google.com>,
	<david.m.ertman@intel.com>, <rafael@kernel.org>,
	<ira.weiny@intel.com>, <linux-rdma@vger.kernel.org>,
	<leon@kernel.org>, <tariqt@nvidia.com>,
	Parav Pandit <parav@nvidia.com>
Subject: Re: [PATCH net-next v6 1/2] driver core: auxiliary bus: show auxiliary device IRQs
Date: Mon, 17 Jun 2024 11:52:28 +0200
Message-ID: <d51d74da-9a0e-4602-bf6b-fa314a3a7e8b@intel.com>
In-Reply-To: <42af42b6-ccdd-4d0d-8e5a-306c74f330f4@nvidia.com>

On 6/17/24 08:38, Shay Drori wrote:
> 
> 
> On 13/06/2024 19:33, Greg KH wrote:
>>
>> On Thu, Jun 13, 2024 at 07:19:11PM +0300, Shay Drory wrote:
>>> PCI subfunctions (SF) are anchored on the auxiliary bus. PCI physical
>>> and virtual functions are anchored on the PCI bus. The irq information
>>> of each such function is visible to users via sysfs directory "msi_irqs"
>>> containing files for each irq entry. However, for PCI SFs such
>>> information is unavailable, so users have no visibility into the IRQs
>>> used by the SFs.
>>> Secondly, an SF can be a multi-function device supporting rdma, netdevice
>>> and more. Without irq information at the bus level, the user is unable
>>> to view or use the affinity of the SF IRQs.
>>>
>>> Hence, to match the equivalent PCI PFs and VFs, add an "irqs" directory
>>> for supporting auxiliary devices, containing a file for each irq entry.
>>>
>>> For example:
>>> $ ls /sys/bus/auxiliary/devices/mlx5_core.sf.1/irqs/
>>> 50  51  52  53  54  55  56  57  58
>>>
>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>> Signed-off-by: Shay Drory <shayd@nvidia.com>
>>>
>>> ---
>>> v5->v6:
>>> - removed concept of shared and exclusive and hence global xarray (Greg)
>>> v4->v5:
>>> - restore global mutex and replace refcount_t with simple integer (Greg)
>>> v3->v4:
>>> - remove global mutex (Przemek)
>>> v2->v3:
>>> - fix function declaration in case SYSFS isn't defined
>>> v1->v2:
>>> - move #ifdefs from drivers/base/auxiliary.c to
>>>    include/linux/auxiliary_bus.h (Greg)
>>> - use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL (Greg)
>>> - Fix kzalloc(ref) to kzalloc(*ref) (Simon)
>>> - Add return description in auxiliary_device_sysfs_irq_add() kdoc (Simon)
>>> - Fix auxiliary_irq_mode_show doc (kernel test robot)
>>> ---
>>>   Documentation/ABI/testing/sysfs-bus-auxiliary |  7 ++
>>>   drivers/base/auxiliary.c                      | 96 ++++++++++++++++++-
>>>   include/linux/auxiliary_bus.h                 | 24 ++++-
>>>   3 files changed, 124 insertions(+), 3 deletions(-)
>>>   create mode 100644 Documentation/ABI/testing/sysfs-bus-auxiliary
>>>
>>> diff --git a/Documentation/ABI/testing/sysfs-bus-auxiliary b/Documentation/ABI/testing/sysfs-bus-auxiliary
>>> new file mode 100644
>>> index 000000000000..e8752c2354bc
>>> --- /dev/null
>>> +++ b/Documentation/ABI/testing/sysfs-bus-auxiliary
>>> @@ -0,0 +1,7 @@
>>> +What:                /sys/bus/auxiliary/devices/.../irqs/
>>> +Date:                April, 2024
>>> +Contact:     Shay Drory <shayd@nvidia.com>
>>> +Description:
>>> +             The /sys/devices/.../irqs directory contains a variable set of
>>> +             files, with each file named after an irq number, similar to the
>>> +             PCI PF or VF irq numbers located in the msi_irqs directory.
>>> diff --git a/drivers/base/auxiliary.c b/drivers/base/auxiliary.c
>>> index d3a2c40c2f12..fcd7dbf20f88 100644
>>> --- a/drivers/base/auxiliary.c
>>> +++ b/drivers/base/auxiliary.c
>>> @@ -158,6 +158,94 @@
>>>    *   };
>>>    */
>>>
>>> +#ifdef CONFIG_SYSFS
>>
>> People really build boxes without sysfs?  Ok :(
>>
>> But if so, why not move this to a whole new file?  That would make it
>> simpler to maintain.
> 
> sounds good. Will move them to new sysfs.c

Your proposed name, combined with the directory, would suggest that this
is the base sysfs code for all drivers (drivers/base/sysfs.c).
Please add an aux_ prefix, or similar.

> 
>>
>>> +struct auxiliary_irq_info {
>>> +     struct device_attribute sysfs_attr;
>>> +};
>>> +
>>> +static struct attribute *auxiliary_irq_attrs[] = {
>>> +     NULL
>>> +};
>>> +
>>> +static const struct attribute_group auxiliary_irqs_group = {
>>> +     .name = "irqs",
>>> +     .attrs = auxiliary_irq_attrs,
>>> +};
>>> +
>>> +static const struct attribute_group *auxiliary_irqs_groups[] = {
>>> +     &auxiliary_irqs_group,
>>> +     NULL
>>> +};
>>> +
>>> +/**
>>> + * auxiliary_device_sysfs_irq_add - add a sysfs entry for the given IRQ
>>> + * @auxdev: auxiliary bus device to add the sysfs entry.
>>> + * @irq: The associated interrupt number.
>>> + *
>>> + * This function should be called after the auxiliary device has successfully
>>> + * received the irq.
>>> + *
>>> + * Return: zero on success or an error code on failure.
>>> + */
>>> +int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
>>> +{
>>> +     struct device *dev = &auxdev->dev;
>>> +     struct auxiliary_irq_info *info;
>>> +     int ret;
>>> +
>>> +     info = kzalloc(sizeof(*info), GFP_KERNEL);
>>> +     if (!info)
>>> +             return -ENOMEM;
>>> +
>>> +     sysfs_attr_init(&info->sysfs_attr.attr);
>>> +     info->sysfs_attr.attr.name = kasprintf(GFP_KERNEL, "%d", irq);
>>> +     if (!info->sysfs_attr.attr.name) {
>>> +             ret = -ENOMEM;
>>> +             goto name_err;
>>> +     }
>>> +
>>> +     ret = xa_insert(&auxdev->irqs, irq, info, GFP_KERNEL);
>>> +     if (ret)
>>> +             goto auxdev_xa_err;
>>> +
>>> +     ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
>>> +                                   auxiliary_irqs_group.name);
>>
>> Dynamic attributes are rough, because:
> 
> Your response after "because" is missing.
> Can you please elaborate?

With dynamic attributes you have a "complicated" unwinding/error path
(compared to "nothing" for static attrs).
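
To illustrate the unwinding cost, here is a minimal userspace sketch of the
same shape as the add path quoted above: every step that can fail needs its
own label, and the labels must run in reverse order of acquisition. All names
here (irq_add, irq_table, publish, live_allocs) are hypothetical stand-ins,
not kernel API; publish() only models sysfs_add_file_to_group() failing.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_IRQS 64

struct irq_info {
	char *name;
};

/* Stand-in for the per-device xarray: slot i holds the entry for irq i. */
static struct irq_info *irq_table[MAX_IRQS];

/* Number of live allocations, so a caller can check that nothing leaked. */
static int live_allocs;

/* Models the final "publish to sysfs" step; fails when 'fail' is non-zero. */
static int publish(const struct irq_info *info, int fail)
{
	(void)info;
	return fail ? -EIO : 0;
}

static int irq_add(int irq, int fail_publish)
{
	struct irq_info *info;
	int ret;

	if (irq < 0 || irq >= MAX_IRQS)
		return -EINVAL;

	info = calloc(1, sizeof(*info));
	if (!info)
		return -ENOMEM;
	live_allocs++;

	info->name = malloc(16);
	if (!info->name) {
		ret = -ENOMEM;
		goto name_err;
	}
	live_allocs++;
	snprintf(info->name, 16, "%d", irq);

	/* Refuse to overwrite an existing entry, as xa_insert() does. */
	if (irq_table[irq]) {
		ret = -EBUSY;
		goto table_err;
	}
	irq_table[irq] = info;

	ret = publish(info, fail_publish);
	if (ret)
		goto publish_err;

	return 0;

	/* Unwind in reverse order of acquisition. */
publish_err:
	irq_table[irq] = NULL;
table_err:
	free(info->name);
	live_allocs--;
name_err:
	free(info);
	live_allocs--;
	return ret;
}
```

A static attribute group needs none of this: the group is created and torn
down together with the device, so there is nothing to unwind per entry.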

> 
>>
>>
>>> +     if (ret)
>>> +             goto sysfs_add_err;
>>> +
>>> +     return 0;
>>> +
>>> +sysfs_add_err:
>>> +     xa_erase(&auxdev->irqs, irq);
>>> +auxdev_xa_err:
>>> +     kfree(info->sysfs_attr.attr.name);
>>> +name_err:
>>> +     kfree(info);
>>> +     return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_add);
>>> +
>>> +/**
>>> + * auxiliary_device_sysfs_irq_remove - remove a sysfs entry for the given IRQ
>>> + * @auxdev: auxiliary bus device to remove the sysfs entry from.
>>> + * @irq: the IRQ to remove.
>>> + *
>>> + * This function should be called to remove an IRQ sysfs entry.
>>> + */
>>> +void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq)
>>> +{
>>> +     struct auxiliary_irq_info *info = xa_load(&auxdev->irqs, irq);
>>> +     struct device *dev = &auxdev->dev;
>>> +
>>> +     sysfs_remove_file_from_group(&dev->kobj, &info->sysfs_attr.attr,
>>> +                                  auxiliary_irqs_group.name);
>>> +     xa_erase(&auxdev->irqs, irq);
>>> +     kfree(info->sysfs_attr.attr.name);
>>> +     kfree(info);
>>> +}
>>> +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_remove);
>>
>> What is forcing you to remove the irqs after a device is removed from
>> the system?
> 
> We are removing the irqs _before_ removing the device.
> IRQ removal exactly mirrors the add flow.
> 
>>
>> Why not just remove them all automatically?  Why would you ever want to
>> remove them after they were added, will they ever actually change over
>> the lifespan of a device?
> 
> IRQs of the SFs are allocated and removed when the resources are
> created.
> For example, the devlink reload flow re-initializes the whole device by
> releasing the IRQs and re-allocating a new set.
> Certain driver-internal health recovery flows can also trigger a similar
> re-initialization.

I read it as "removing all is what we use 'remove-one' for".
Am I correct?
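
If that reading is right, the usage pattern would look roughly like the
sketch below: the only caller of remove-one is a loop that drops every entry
before the device is re-initialized. This is a userspace model under that
assumption; irq_present, irq_add, irq_remove_one and irq_remove_all are
hypothetical names, not the functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_IRQS 64

/* Hypothetical stand-in for the per-device IRQ bookkeeping. */
static bool irq_present[MAX_IRQS];

/* Record an IRQ as exposed; out-of-range numbers are ignored. */
static void irq_add(int irq)
{
	if (irq >= 0 && irq < MAX_IRQS)
		irq_present[irq] = true;
}

/* Remove a single entry; returns true if something was actually removed. */
static bool irq_remove_one(int irq)
{
	if (irq < 0 || irq >= MAX_IRQS || !irq_present[irq])
		return false;
	irq_present[irq] = false;
	return true;
}

/* "Remove all" expressed purely in terms of remove-one. */
static size_t irq_remove_all(void)
{
	size_t removed = 0;

	for (int irq = 0; irq < MAX_IRQS; irq++)
		if (irq_remove_one(irq))
			removed++;
	return removed;
}
```

If the loop is the only user, a single remove-all entry point (or automatic
removal with the device) might be enough.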

> 
>>
>>>   int auxiliary_device_init(struct auxiliary_device *auxdev);
>>> -int __auxiliary_device_add(struct auxiliary_device *auxdev, const char *modname);
>>> -#define auxiliary_device_add(auxdev) __auxiliary_device_add(auxdev, KBUILD_MODNAME)
>>> +int __auxiliary_device_add(struct auxiliary_device *auxdev, const char *modname,
>>> +                        bool irqs_sysfs_enable);
>>> +#define auxiliary_device_add(auxdev) __auxiliary_device_add(auxdev, KBUILD_MODNAME, false)
>>> +#define auxiliary_device_add_with_irqs(auxdev) \
>>> +     __auxiliary_device_add(auxdev, KBUILD_MODNAME, true)
>>
>> Ick, no, that way lies madness.
>>
>> Just keep the original function:
>>          auxiliary_device_add()
>> as is.
>>
>> Then, if someone DOES call auxiliary_device_sysfs_irq_add() then add the
>> irq directory and file as needed then.
>>
>> That way no "normal" paths are messed up and over time, we don't keep
>> having an explosion of combinations of function calls to create an aux
>> device (as we all know, this is NOT going to be the last feature ever
>> added to them...)
> 
> Thanks for the suggestion, will change in next version.
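
For reference, the lazy approach suggested above could be modelled as below:
the add path for the device stays untouched, and the "irqs" directory is
created on the first irq_add call only. This is a userspace sketch, not the
planned implementation; struct aux_dev_model and the model_* helpers are
hypothetical, and locking is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state an auxiliary device would carry. */
struct aux_dev_model {
	bool registered;       /* device has been added to the bus */
	bool irqs_dir_exists;  /* "irqs" directory already created */
	int dir_creations;     /* how many times the directory was created */
	int irq_files;         /* number of per-IRQ files currently exposed */
};

/* Plain device add: knows nothing about IRQs, so no extra parameter. */
static int model_device_add(struct aux_dev_model *d)
{
	d->registered = true;
	return 0;
}

/* Create the "irqs" directory once; later calls are no-ops. */
static int model_ensure_irqs_dir(struct aux_dev_model *d)
{
	if (d->irqs_dir_exists)
		return 0;
	d->irqs_dir_exists = true;
	d->dir_creations++;
	return 0;
}

/* Adding an IRQ file creates the directory on demand. */
static int model_irq_add(struct aux_dev_model *d, int irq)
{
	int ret;

	(void)irq; /* the model only counts files, it does not name them */

	if (!d->registered)
		return -ENODEV;

	ret = model_ensure_irqs_dir(d);
	if (ret)
		return ret;

	d->irq_files++;
	return 0;
}
```

Devices that never expose IRQs never get the directory, and no second
variant of auxiliary_device_add() is needed.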
> 
>>
>> thanks,
>>
>> greg k-h
> 

