* [PATCH net-next v9 0/2] Introduce auxiliary bus IRQs sysfs
@ 2024-07-03 7:38 Shay Drory
2024-07-03 7:38 ` [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs Shay Drory
2024-07-03 7:38 ` [PATCH net-next v9 2/2] net/mlx5: Expose SFs IRQs Shay Drory
0 siblings, 2 replies; 7+ messages in thread
From: Shay Drory @ 2024-07-03 7:38 UTC
To: netdev, pabeni, davem, kuba, edumazet, gregkh, david.m.ertman
Cc: rafael, ira.weiny, linux-rdma, leon, tariqt, Shay Drory
Today, PCI PFs and VFs, which are anchored on the PCI bus, display their
IRQ information in the <pci_device>/msi_irqs/<irq_num> sysfs files. PCI
subfunctions (SFs) are similar to PFs and VFs, but they are anchored
on the auxiliary bus, which exposes no such IRQ information. Users
therefore have no visibility into which IRQs the SFs use and cannot
tell which interrupts belong to which SF when debugging or tuning
performance.
Additionally, the SFs are multifunctional devices supporting RDMA,
network devices, clocks, and more, similar to their peer PCI PFs and
VFs. Therefore, it is desirable to have SFs' IRQ information available
at the bus/device level.
To overcome the above limitations, this short series extends the
auxiliary bus to display IRQ information in sysfs, similar to that of
PFs and VFs.
It adds an 'irqs' directory under the auxiliary device and includes an
<irq_num> sysfs file within it.
For example:
$ ls /sys/bus/auxiliary/devices/mlx5_core.sf.1/irqs/
50 51 52 53 54 55 56 57 58
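A userspace tool could consume this directory by treating each file name
as an IRQ number. The following is a minimal sketch, assuming only the
directory layout shown above; the function name aux_list_irqs() is
hypothetical and is not part of this series:

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Collect the IRQ numbers exposed as file names under irqs_dir,
 * e.g. /sys/bus/auxiliary/devices/mlx5_core.sf.1/irqs.
 * Stores at most max entries in out and returns how many were stored,
 * or -1 if the directory cannot be opened.
 */
int aux_list_irqs(const char *irqs_dir, int *out, int max)
{
	struct dirent *ent;
	DIR *dir;
	int n = 0;

	assert(irqs_dir && out && max >= 0);

	dir = opendir(irqs_dir);
	if (!dir)
		return -1;

	while (n < max && (ent = readdir(dir)) != NULL) {
		char *end;
		long irq;

		errno = 0;
		irq = strtol(ent->d_name, &end, 10);
		/* Skip ".", ".." and anything that is not a plain decimal number. */
		if (errno || end == ent->d_name || *end != '\0' || irq < 0)
			continue;
		out[n++] = (int)irq;
	}
	closedir(dir);
	return n;
}
```

readdir() returns entries in no particular order, so a caller that wants
the sorted listing printed by ls would need to sort the result itself.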
Patch summary:
==============
patch-1 extends the auxiliary bus to expose IRQs used by auxiliary devices
patch-2 makes the mlx5 driver expose the IRQs of PCI SF devices via the
auxiliary bus
---
v8-v9:
- add Przemek RB
- use guard() in auxiliary_irq_dir_prepare (Paolo)
v7-v8:
- use cleanup.h for info and name fields (Greg)
- correct error flow in auxiliary_irq_dir_prepare (Przemek)
- add documentation for new fields of auxiliary_device (Simon)
v6->v7:
- dynamically creating irqs directory when first irq file created, patch #1 (Greg).
- removed irqs flag and simplified the dev_add() API, patch #1 (Greg).
- move sysfs related new code to a new auxiliary_sysfs.c file, patch #1 (Greg).
v5->v6:
- fix error flow in patch #2 (Przemek and Parav).
- remove concept of shared and exclusive and hence global xarray in patch #1 (Greg).
v4->v5:
- addressed comments from Greg in patch #1.
v3->v4:
- addressed comments from Przemek in patch #1.
v2->v3:
- addressed comments from Parav and Przemek in patch #1.
- fixed a bug in patch #2.
v1->v2:
- addressed comments from Greg, Simon H and kernel test robot in patch #1.
Shay Drory (2):
driver core: auxiliary bus: show auxiliary device IRQs
net/mlx5: Expose SFs IRQs
Documentation/ABI/testing/sysfs-bus-auxiliary | 9 ++
drivers/base/Makefile | 1 +
drivers/base/auxiliary.c | 1 +
drivers/base/auxiliary_sysfs.c | 111 ++++++++++++++++++
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 6 +-
.../mellanox/mlx5/core/irq_affinity.c | 18 ++-
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 6 +
.../ethernet/mellanox/mlx5/core/mlx5_irq.h | 12 +-
.../net/ethernet/mellanox/mlx5/core/pci_irq.c | 12 +-
include/linux/auxiliary_bus.h | 22 ++++
10 files changed, 187 insertions(+), 11 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-auxiliary
create mode 100644 drivers/base/auxiliary_sysfs.c
--
2.38.1
* [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs
2024-07-03 7:38 [PATCH net-next v9 0/2] Introduce auxiliary bus IRQs sysfs Shay Drory
@ 2024-07-03 7:38 ` Shay Drory
2024-07-04 10:41 ` Greg KH
2024-07-03 7:38 ` [PATCH net-next v9 2/2] net/mlx5: Expose SFs IRQs Shay Drory
1 sibling, 1 reply; 7+ messages in thread
From: Shay Drory @ 2024-07-03 7:38 UTC
To: netdev, pabeni, davem, kuba, edumazet, gregkh, david.m.ertman
Cc: rafael, ira.weiny, linux-rdma, leon, tariqt, Shay Drory,
Simon Horman, Przemek Kitszel, Parav Pandit
PCI subfunctions (SFs) are anchored on the auxiliary bus. PCI physical
and virtual functions are anchored on the PCI bus. The IRQ information
of each such function is visible to users via the sysfs directory
"msi_irqs", which contains a file for each IRQ entry. However, for PCI
SFs such information is unavailable, so users have no visibility into
the IRQs used by the SFs.
Secondly, an SF can be a multifunction device supporting RDMA, netdevice
and more. Without IRQ information at the bus level, the user is unable
to view or use the affinity of the SF IRQs.
Hence, to match the equivalent PCI PFs and VFs, add an "irqs" directory
for supporting auxiliary devices, containing a file for each IRQ entry.
For example:
$ ls /sys/bus/auxiliary/devices/mlx5_core.sf.1/irqs/
50 51 52 53 54 55 56 57 58
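Once an IRQ number is known from this directory, its affinity can be read
through the existing /proc/irq/<irq_num>/smp_affinity_list file. The
sketch below illustrates that lookup from userspace; the helper names
irq_affinity_path() and irq_read_affinity() are hypothetical and are not
part of this patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the procfs affinity path for irq; returns -1 if buf is too small. */
int irq_affinity_path(int irq, char *buf, size_t len)
{
	int n;

	assert(buf && len > 0);
	n = snprintf(buf, len, "/proc/irq/%d/smp_affinity_list", irq);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the CPU list for irq into buf, without the trailing newline. */
int irq_read_affinity(int irq, char *buf, size_t len)
{
	char path[64];
	FILE *f;

	if (irq_affinity_path(irq, path, sizeof(path)))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

Calling irq_read_affinity() for each number listed under irqs/ would give
the CPU list per SF interrupt, provided the caller is allowed to read the
procfs file.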
Cc: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
---
v8-v9:
- add Przemek RB
- use guard() in auxiliary_irq_dir_prepare (Paolo)
v7-v8:
- use cleanup.h for info and name fields (Greg)
- correct error flow in auxiliary_irq_dir_prepare (Przemek)
- add documentation for new fields of auxiliary_device (Simon)
v6-v7:
- dynamically creating irqs directory when first irq file created (Greg)
- removed irqs flag and simplified the dev_add() API (Greg)
- move sysfs related new code to a new auxiliary_sysfs.c file (Greg)
v5-v6:
- removed concept of shared and exclusive and hence global xarray (Greg)
v4-v5:
- restore global mutex and replace refcount_t with simple integer (Greg)
v3->v4:
- remove global mutex (Przemek)
v2->v3:
- fix function declaration in case SYSFS isn't defined
v1->v2:
- move #ifdefs from drivers/base/auxiliary.c to
include/linux/auxiliary_bus.h (Greg)
- use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL (Greg)
- Fix kzalloc(ref) to kzalloc(*ref) (Simon)
- Add return description in auxiliary_device_sysfs_irq_add() kdoc (Simon)
- Fix auxiliary_irq_mode_show doc (kernel test robot)
---
Documentation/ABI/testing/sysfs-bus-auxiliary | 9 ++
drivers/base/Makefile | 1 +
drivers/base/auxiliary.c | 1 +
drivers/base/auxiliary_sysfs.c | 111 ++++++++++++++++++
include/linux/auxiliary_bus.h | 22 ++++
5 files changed, 144 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-bus-auxiliary
create mode 100644 drivers/base/auxiliary_sysfs.c
diff --git a/Documentation/ABI/testing/sysfs-bus-auxiliary b/Documentation/ABI/testing/sysfs-bus-auxiliary
new file mode 100644
index 000000000000..cc856079690f
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-auxiliary
@@ -0,0 +1,9 @@
+What: /sys/bus/auxiliary/devices/.../irqs/
+Date: April, 2024
+Contact: Shay Drory <shayd@nvidia.com>
+Description:
+ The /sys/devices/.../irqs directory contains a variable set of
+ files, each named after an IRQ number, similar to the entries for
+ a PCI PF or VF located in the msi_irqs directory.
+ These irq files are added and removed dynamically when an IRQ
+ is requested and freed respectively for the PCI SF.
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 3079bfe53d04..7fb21768ca36 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_NUMA) += node.o
obj-$(CONFIG_MEMORY_HOTPLUG) += memory.o
ifeq ($(CONFIG_SYSFS),y)
obj-$(CONFIG_MODULES) += module.o
+obj-$(CONFIG_AUXILIARY_BUS) += auxiliary_sysfs.o
endif
obj-$(CONFIG_SYS_HYPERVISOR) += hypervisor.o
obj-$(CONFIG_REGMAP) += regmap/
diff --git a/drivers/base/auxiliary.c b/drivers/base/auxiliary.c
index d3a2c40c2f12..55bde375150f 100644
--- a/drivers/base/auxiliary.c
+++ b/drivers/base/auxiliary.c
@@ -287,6 +287,7 @@ int auxiliary_device_init(struct auxiliary_device *auxdev)
dev->bus = &auxiliary_bus_type;
device_initialize(&auxdev->dev);
+ mutex_init(&auxdev->lock);
return 0;
}
EXPORT_SYMBOL_GPL(auxiliary_device_init);
diff --git a/drivers/base/auxiliary_sysfs.c b/drivers/base/auxiliary_sysfs.c
new file mode 100644
index 000000000000..f4e267971d70
--- /dev/null
+++ b/drivers/base/auxiliary_sysfs.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES
+ */
+
+#include <linux/auxiliary_bus.h>
+#include <linux/slab.h>
+
+struct auxiliary_irq_info {
+ struct device_attribute sysfs_attr;
+};
+
+static struct attribute *auxiliary_irq_attrs[] = {
+ NULL
+};
+
+static const struct attribute_group auxiliary_irqs_group = {
+ .name = "irqs",
+ .attrs = auxiliary_irq_attrs,
+};
+
+static int auxiliary_irq_dir_prepare(struct auxiliary_device *auxdev)
+{
+ int ret = 0;
+
+ guard(mutex)(&auxdev->lock);
+ if (auxdev->irq_dir_exists)
+ return 0;
+
+ ret = devm_device_add_group(&auxdev->dev, &auxiliary_irqs_group);
+ if (ret)
+ return ret;
+
+ auxdev->irq_dir_exists = true;
+ xa_init(&auxdev->irqs);
+ return 0;
+}
+
+/**
+ * auxiliary_device_sysfs_irq_add - add a sysfs entry for the given IRQ
+ * @auxdev: auxiliary bus device to add the sysfs entry.
+ * @irq: The associated interrupt number.
+ *
+ * This function should be called after auxiliary device have successfully
+ * received the irq.
+ * The driver is responsible to add a unique irq for the auxiliary device. The
+ * driver can invoke this function from multiple thread context safely for
+ * unique irqs of the auxiliary devices. The driver must not invoke this API
+ * multiple times if the irq is already added previously.
+ *
+ * Return: zero on success or an error code on failure.
+ */
+int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
+{
+ struct auxiliary_irq_info *info __free(kfree) = NULL;
+ struct device *dev = &auxdev->dev;
+ char *name __free(kfree) = NULL;
+ int ret;
+
+ ret = auxiliary_irq_dir_prepare(auxdev);
+ if (ret)
+ return ret;
+
+ info = kzalloc(sizeof(*info), GFP_KERNEL);
+ if (!info)
+ return -ENOMEM;
+
+ sysfs_attr_init(&info->sysfs_attr.attr);
+ name = kasprintf(GFP_KERNEL, "%d", irq);
+ if (!name)
+ return -ENOMEM;
+
+ ret = xa_insert(&auxdev->irqs, irq, info, GFP_KERNEL);
+ if (ret)
+ return ret;
+
+ info->sysfs_attr.attr.name = name;
+ ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
+ auxiliary_irqs_group.name);
+ if (ret)
+ goto sysfs_add_err;
+
+ info->sysfs_attr.attr.name = no_free_ptr(name);
+ xa_store(&auxdev->irqs, irq, no_free_ptr(info), GFP_KERNEL);
+ return 0;
+
+sysfs_add_err:
+ xa_erase(&auxdev->irqs, irq);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_add);
+
+/**
+ * auxiliary_device_sysfs_irq_remove - remove a sysfs entry for the given IRQ
+ * @auxdev: auxiliary bus device to add the sysfs entry.
+ * @irq: the IRQ to remove.
+ *
+ * This function should be called to remove an IRQ sysfs entry.
+ * The driver must invoke this API when IRQ is released by the device.
+ */
+void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq)
+{
+ struct auxiliary_irq_info *info __free(kfree) = xa_load(&auxdev->irqs, irq);
+ const char *name __free(kfree) = info->sysfs_attr.attr.name;
+ struct device *dev = &auxdev->dev;
+
+ sysfs_remove_file_from_group(&dev->kobj, &info->sysfs_attr.attr,
+ auxiliary_irqs_group.name);
+ xa_erase(&auxdev->irqs, irq);
+}
+EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_remove);
diff --git a/include/linux/auxiliary_bus.h b/include/linux/auxiliary_bus.h
index de21d9d24a95..ee738379d5f2 100644
--- a/include/linux/auxiliary_bus.h
+++ b/include/linux/auxiliary_bus.h
@@ -58,6 +58,9 @@
* in
* @name: Match name found by the auxiliary device driver,
* @id: unique identitier if multiple devices of the same name are exported,
+ * @irqs: irqs xarray contains irq indices which are used by the device,
+ * @lock: Synchronize irq sysfs creation,
+ * @irq_dir_exists: whether "irqs" directory exists,
*
* An auxiliary_device represents a part of its parent device's functionality.
* It is given a name that, combined with the registering drivers
@@ -138,7 +141,10 @@
struct auxiliary_device {
struct device dev;
const char *name;
+ struct xarray irqs;
+ struct mutex lock; /* Synchronize irq sysfs creation */
u32 id;
+ bool irq_dir_exists;
};
/**
@@ -212,8 +218,24 @@ int auxiliary_device_init(struct auxiliary_device *auxdev);
int __auxiliary_device_add(struct auxiliary_device *auxdev, const char *modname);
#define auxiliary_device_add(auxdev) __auxiliary_device_add(auxdev, KBUILD_MODNAME)
+#ifdef CONFIG_SYSFS
+int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq);
+void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev,
+ int irq);
+#else /* CONFIG_SYSFS */
+static inline int
+auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
+{
+ return 0;
+}
+
+static inline void
+auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq) {}
+#endif
+
static inline void auxiliary_device_uninit(struct auxiliary_device *auxdev)
{
+ mutex_destroy(&auxdev->lock);
put_device(&auxdev->dev);
}
--
2.38.1
* [PATCH net-next v9 2/2] net/mlx5: Expose SFs IRQs
2024-07-03 7:38 [PATCH net-next v9 0/2] Introduce auxiliary bus IRQs sysfs Shay Drory
2024-07-03 7:38 ` [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs Shay Drory
@ 2024-07-03 7:38 ` Shay Drory
1 sibling, 0 replies; 7+ messages in thread
From: Shay Drory @ 2024-07-03 7:38 UTC
To: netdev, pabeni, davem, kuba, edumazet, gregkh, david.m.ertman
Cc: rafael, ira.weiny, linux-rdma, leon, tariqt, Shay Drory,
Przemek Kitszel, Parav Pandit
Expose the sysfs files for the IRQs that the mlx5 PCI SFs are using.
These entries are similar to PCI PFs and VFs in 'msi_irqs' directory.
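The driver-side flow is "take the IRQ, then publish it, and give the IRQ
back if publishing fails". The following is a simplified userspace model
of that ordering only; every name in it (demo_irq, demo_sysfs_irq_add(),
demo_irq_put(), demo_irq_request()) is a hypothetical stand-in and not
the mlx5 code:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a reference-counted driver IRQ object. */
struct demo_irq {
	int virq;
	int refcount;
};

int demo_sysfs_fail;	/* knob: make the publish step fail */
int demo_sysfs_entries;	/* number of entries published so far */

/* Hypothetical stand-in for the sysfs publish step. */
int demo_sysfs_irq_add(int virq)
{
	(void)virq;
	if (demo_sysfs_fail)
		return -ENOMEM;
	demo_sysfs_entries++;
	return 0;
}

void demo_irq_put(struct demo_irq *irq)
{
	irq->refcount--;
}

/* Take a reference, publish the IRQ, and undo the reference on failure. */
int demo_irq_request(struct demo_irq *irq)
{
	int ret;

	irq->refcount++;
	ret = demo_sysfs_irq_add(irq->virq);
	if (ret) {
		demo_irq_put(irq);
		return ret;
	}
	return 0;
}
```

The point of the ordering is that a caller never ends up holding an IRQ
that has no matching sysfs entry.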
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
---
v8-v9:
- add Przemek RB
v6->v7:
- remove not needed changes to mlx5 sfnum SF sysfs
v5->v6:
- fail IRQ creation in case auxiliary_device_sysfs_irq_add() failed
(Parav and Przemek)
v2->v3:
- fix mlx5 sfnum SF sysfs
---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 6 +++---
.../ethernet/mellanox/mlx5/core/irq_affinity.c | 18 +++++++++++++++++-
.../ethernet/mellanox/mlx5/core/mlx5_core.h | 6 ++++++
.../net/ethernet/mellanox/mlx5/core/mlx5_irq.h | 12 ++++++++----
.../net/ethernet/mellanox/mlx5/core/pci_irq.c | 12 +++++++++---
5 files changed, 43 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 5693986ae656..5661f047702e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -714,7 +714,7 @@ static int create_async_eqs(struct mlx5_core_dev *dev)
err1:
mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
- mlx5_ctrl_irq_release(table->ctrl_irq);
+ mlx5_ctrl_irq_release(dev, table->ctrl_irq);
return err;
}
@@ -730,7 +730,7 @@ static void destroy_async_eqs(struct mlx5_core_dev *dev)
cleanup_async_eq(dev, &table->cmd_eq, "cmd");
mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
- mlx5_ctrl_irq_release(table->ctrl_irq);
+ mlx5_ctrl_irq_release(dev, table->ctrl_irq);
}
struct mlx5_eq *mlx5_get_async_eq(struct mlx5_core_dev *dev)
@@ -918,7 +918,7 @@ static int comp_irq_request_sf(struct mlx5_core_dev *dev, u16 vecidx)
af_desc.is_managed = 1;
cpumask_copy(&af_desc.mask, cpu_online_mask);
cpumask_andnot(&af_desc.mask, &af_desc.mask, &table->used_cpus);
- irq = mlx5_irq_affinity_request(pool, &af_desc);
+ irq = mlx5_irq_affinity_request(dev, pool, &af_desc);
if (IS_ERR(irq))
return PTR_ERR(irq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
index 612e666ec263..f7b01b3f0cba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
@@ -112,15 +112,18 @@ irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, const struct cpumask *req
/**
* mlx5_irq_affinity_request - request an IRQ according to the given mask.
+ * @dev: mlx5 core device which is requesting the IRQ.
* @pool: IRQ pool to request from.
* @af_desc: affinity descriptor for this IRQ.
*
* This function returns a pointer to IRQ, or ERR_PTR in case of error.
*/
struct mlx5_irq *
-mlx5_irq_affinity_request(struct mlx5_irq_pool *pool, struct irq_affinity_desc *af_desc)
+mlx5_irq_affinity_request(struct mlx5_core_dev *dev, struct mlx5_irq_pool *pool,
+ struct irq_affinity_desc *af_desc)
{
struct mlx5_irq *least_loaded_irq, *new_irq;
+ int ret;
mutex_lock(&pool->lock);
least_loaded_irq = irq_pool_find_least_loaded(pool, &af_desc->mask);
@@ -153,6 +156,16 @@ mlx5_irq_affinity_request(struct mlx5_irq_pool *pool, struct irq_affinity_desc *
mlx5_irq_read_locked(least_loaded_irq) / MLX5_EQ_REFS_PER_IRQ);
unlock:
mutex_unlock(&pool->lock);
+ if (mlx5_irq_pool_is_sf_pool(pool)) {
+ ret = auxiliary_device_sysfs_irq_add(mlx5_sf_coredev_to_adev(dev),
+ mlx5_irq_get_irq(least_loaded_irq));
+ if (ret) {
+ mlx5_core_err(dev, "Failed to create sysfs entry for irq %d, ret = %d\n",
+ mlx5_irq_get_irq(least_loaded_irq), ret);
+ mlx5_irq_put(least_loaded_irq);
+ least_loaded_irq = ERR_PTR(ret);
+ }
+ }
return least_loaded_irq;
}
@@ -164,6 +177,9 @@ void mlx5_irq_affinity_irq_release(struct mlx5_core_dev *dev, struct mlx5_irq *i
cpu = cpumask_first(mlx5_irq_get_affinity_mask(irq));
synchronize_irq(pci_irq_vector(pool->dev->pdev,
mlx5_irq_get_index(irq)));
+ if (mlx5_irq_pool_is_sf_pool(pool))
+ auxiliary_device_sysfs_irq_remove(mlx5_sf_coredev_to_adev(dev),
+ mlx5_irq_get_irq(irq));
if (mlx5_irq_put(irq))
if (pool->irqs_per_cpu)
cpu_put(pool, cpu);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index c38342b9f320..e764b720d9b2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -320,6 +320,12 @@ static inline bool mlx5_core_is_sf(const struct mlx5_core_dev *dev)
return dev->coredev_type == MLX5_COREDEV_SF;
}
+static inline struct auxiliary_device *
+mlx5_sf_coredev_to_adev(struct mlx5_core_dev *mdev)
+{
+ return container_of(mdev->device, struct auxiliary_device, dev);
+}
+
int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx);
void mlx5_mdev_uninit(struct mlx5_core_dev *dev);
int mlx5_init_one(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
index 1088114e905d..0881e961d8b1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
@@ -25,7 +25,7 @@ int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int devfn,
int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs);
struct mlx5_irq *mlx5_ctrl_irq_request(struct mlx5_core_dev *dev);
-void mlx5_ctrl_irq_release(struct mlx5_irq *ctrl_irq);
+void mlx5_ctrl_irq_release(struct mlx5_core_dev *dev, struct mlx5_irq *ctrl_irq);
struct mlx5_irq *mlx5_irq_request(struct mlx5_core_dev *dev, u16 vecidx,
struct irq_affinity_desc *af_desc,
struct cpu_rmap **rmap);
@@ -36,13 +36,15 @@ int mlx5_irq_attach_nb(struct mlx5_irq *irq, struct notifier_block *nb);
int mlx5_irq_detach_nb(struct mlx5_irq *irq, struct notifier_block *nb);
struct cpumask *mlx5_irq_get_affinity_mask(struct mlx5_irq *irq);
int mlx5_irq_get_index(struct mlx5_irq *irq);
+int mlx5_irq_get_irq(const struct mlx5_irq *irq);
struct mlx5_irq_pool;
#ifdef CONFIG_MLX5_SF
struct mlx5_irq *mlx5_irq_affinity_irq_request_auto(struct mlx5_core_dev *dev,
struct cpumask *used_cpus, u16 vecidx);
-struct mlx5_irq *mlx5_irq_affinity_request(struct mlx5_irq_pool *pool,
- struct irq_affinity_desc *af_desc);
+struct mlx5_irq *
+mlx5_irq_affinity_request(struct mlx5_core_dev *dev, struct mlx5_irq_pool *pool,
+ struct irq_affinity_desc *af_desc);
void mlx5_irq_affinity_irq_release(struct mlx5_core_dev *dev, struct mlx5_irq *irq);
#else
static inline
@@ -53,7 +55,8 @@ struct mlx5_irq *mlx5_irq_affinity_irq_request_auto(struct mlx5_core_dev *dev,
}
static inline struct mlx5_irq *
-mlx5_irq_affinity_request(struct mlx5_irq_pool *pool, struct irq_affinity_desc *af_desc)
+mlx5_irq_affinity_request(struct mlx5_core_dev *dev, struct mlx5_irq_pool *pool,
+ struct irq_affinity_desc *af_desc)
{
return ERR_PTR(-EOPNOTSUPP);
}
@@ -61,6 +64,7 @@ mlx5_irq_affinity_request(struct mlx5_irq_pool *pool, struct irq_affinity_desc *
static inline
void mlx5_irq_affinity_irq_release(struct mlx5_core_dev *dev, struct mlx5_irq *irq)
{
+ mlx5_irq_release_vector(irq);
}
#endif
#endif /* __MLX5_IRQ_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index fb8787e30d3f..ac7c3a76b4cf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -367,6 +367,11 @@ struct cpumask *mlx5_irq_get_affinity_mask(struct mlx5_irq *irq)
return irq->mask;
}
+int mlx5_irq_get_irq(const struct mlx5_irq *irq)
+{
+ return irq->map.virq;
+}
+
int mlx5_irq_get_index(struct mlx5_irq *irq)
{
return irq->map.index;
@@ -440,11 +445,12 @@ static void _mlx5_irq_release(struct mlx5_irq *irq)
/**
* mlx5_ctrl_irq_release - release a ctrl IRQ back to the system.
+ * @dev: mlx5 device that is releasing the IRQ.
* @ctrl_irq: ctrl IRQ to be released.
*/
-void mlx5_ctrl_irq_release(struct mlx5_irq *ctrl_irq)
+void mlx5_ctrl_irq_release(struct mlx5_core_dev *dev, struct mlx5_irq *ctrl_irq)
{
- _mlx5_irq_release(ctrl_irq);
+ mlx5_irq_affinity_irq_release(dev, ctrl_irq);
}
/**
@@ -473,7 +479,7 @@ struct mlx5_irq *mlx5_ctrl_irq_request(struct mlx5_core_dev *dev)
/* Allocate the IRQ in index 0. The vector was already allocated */
irq = irq_pool_request_vector(pool, 0, &af_desc, NULL);
} else {
- irq = mlx5_irq_affinity_request(pool, &af_desc);
+ irq = mlx5_irq_affinity_request(dev, pool, &af_desc);
}
return irq;
--
2.38.1
* Re: [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs
2024-07-03 7:38 ` [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs Shay Drory
@ 2024-07-04 10:41 ` Greg KH
2024-07-05 5:35 ` Shay Drori
0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2024-07-04 10:41 UTC
To: Shay Drory
Cc: netdev, pabeni, davem, kuba, edumazet, david.m.ertman, rafael,
ira.weiny, linux-rdma, leon, tariqt, Simon Horman,
Przemek Kitszel, Parav Pandit
On Wed, Jul 03, 2024 at 10:38:57AM +0300, Shay Drory wrote:
> +/**
> + * auxiliary_device_sysfs_irq_add - add a sysfs entry for the given IRQ
> + * @auxdev: auxiliary bus device to add the sysfs entry.
> + * @irq: The associated interrupt number.
> + *
> + * This function should be called after auxiliary device have successfully
> + * received the irq.
> + * The driver is responsible to add a unique irq for the auxiliary device. The
> + * driver can invoke this function from multiple thread context safely for
> + * unique irqs of the auxiliary devices. The driver must not invoke this API
> + * multiple times if the irq is already added previously.
> + *
> + * Return: zero on success or an error code on failure.
> + */
> +int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
> +{
> + struct auxiliary_irq_info *info __free(kfree) = NULL;
> + struct device *dev = &auxdev->dev;
> + char *name __free(kfree) = NULL;
> + int ret;
> +
> + ret = auxiliary_irq_dir_prepare(auxdev);
> + if (ret)
> + return ret;
> +
> + info = kzalloc(sizeof(*info), GFP_KERNEL);
> + if (!info)
> + return -ENOMEM;
> +
> + sysfs_attr_init(&info->sysfs_attr.attr);
> + name = kasprintf(GFP_KERNEL, "%d", irq);
> + if (!name)
> + return -ENOMEM;
> +
> + ret = xa_insert(&auxdev->irqs, irq, info, GFP_KERNEL);
> + if (ret)
> + return ret;
> +
> + info->sysfs_attr.attr.name = name;
> + ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
> + auxiliary_irqs_group.name);
> + if (ret)
> + goto sysfs_add_err;
> +
> + info->sysfs_attr.attr.name = no_free_ptr(name);
This assignment of a name AFTER it has been created is odd. I think I
know why you are doing this, but please make it obvious and perhaps
solve it in a cleaner way. Assigning this "deep" in a sysfs structure
is not ok.
> + xa_store(&auxdev->irqs, irq, no_free_ptr(info), GFP_KERNEL);
> + return 0;
> +
> +sysfs_add_err:
> + xa_erase(&auxdev->irqs, irq);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_add);
> +
> +/**
> + * auxiliary_device_sysfs_irq_remove - remove a sysfs entry for the given IRQ
> + * @auxdev: auxiliary bus device to add the sysfs entry.
> + * @irq: the IRQ to remove.
> + *
> + * This function should be called to remove an IRQ sysfs entry.
> + * The driver must invoke this API when IRQ is released by the device.
> + */
> +void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq)
> +{
> + struct auxiliary_irq_info *info __free(kfree) = xa_load(&auxdev->irqs, irq);
No verification that this is an actual entry before you dereferenced it?
Bold move...
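One way to address this comment is to check the lookup result before
touching it. The following is a userspace sketch of that guard, assuming a
small fixed table as a stand-in for the xarray; all demo_* names are
hypothetical. In the kernel the analogous check would be on xa_load()
returning NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_IRQS 64

struct demo_irq_info {
	char *name;
};

/* Stand-in for the per-device xarray: index is the IRQ number. */
static struct demo_irq_info *demo_table[DEMO_MAX_IRQS];

static struct demo_irq_info *demo_load(int irq)
{
	if (irq < 0 || irq >= DEMO_MAX_IRQS)
		return NULL;
	return demo_table[irq];
}

/* Register an entry; fails if irq is out of range or already present. */
int demo_irq_add(int irq)
{
	struct demo_irq_info *info;

	if (irq < 0 || irq >= DEMO_MAX_IRQS || demo_table[irq])
		return -1;

	info = calloc(1, sizeof(*info));
	if (!info)
		return -1;

	demo_table[irq] = info;
	return 0;
}

/* Remove an entry; returns -1 instead of dereferencing a missing one. */
int demo_irq_remove(int irq)
{
	struct demo_irq_info *info = demo_load(irq);

	if (!info)
		return -1;

	demo_table[irq] = NULL;
	free(info->name);
	free(info);
	return 0;
}
```

With the guard in place, a double remove or a remove of an IRQ that was
never added becomes a reported error rather than a NULL dereference.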
greg k-h
* Re: [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs
2024-07-04 10:41 ` Greg KH
@ 2024-07-05 5:35 ` Shay Drori
2024-07-05 5:53 ` Greg KH
2024-07-05 6:27 ` Przemek Kitszel
0 siblings, 2 replies; 7+ messages in thread
From: Shay Drori @ 2024-07-05 5:35 UTC
To: Greg KH
Cc: netdev, pabeni, davem, kuba, edumazet, david.m.ertman, rafael,
ira.weiny, linux-rdma, leon, tariqt, Simon Horman,
Przemek Kitszel, Parav Pandit
On 04/07/2024 13:41, Greg KH wrote:
> On Wed, Jul 03, 2024 at 10:38:57AM +0300, Shay Drory wrote:
>> +/**
>> + * auxiliary_device_sysfs_irq_add - add a sysfs entry for the given IRQ
>> + * @auxdev: auxiliary bus device to add the sysfs entry.
>> + * @irq: The associated interrupt number.
>> + *
>> + * This function should be called after auxiliary device have successfully
>> + * received the irq.
>> + * The driver is responsible to add a unique irq for the auxiliary device. The
>> + * driver can invoke this function from multiple thread context safely for
>> + * unique irqs of the auxiliary devices. The driver must not invoke this API
>> + * multiple times if the irq is already added previously.
>> + *
>> + * Return: zero on success or an error code on failure.
>> + */
>> +int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
>> +{
>> + struct auxiliary_irq_info *info __free(kfree) = NULL;
>> + struct device *dev = &auxdev->dev;
>> + char *name __free(kfree) = NULL;
>> + int ret;
>> +
>> + ret = auxiliary_irq_dir_prepare(auxdev);
>> + if (ret)
>> + return ret;
>> +
>> + info = kzalloc(sizeof(*info), GFP_KERNEL);
>> + if (!info)
>> + return -ENOMEM;
>> +
>> + sysfs_attr_init(&info->sysfs_attr.attr);
>> + name = kasprintf(GFP_KERNEL, "%d", irq);
>> + if (!name)
>> + return -ENOMEM;
>> +
>> + ret = xa_insert(&auxdev->irqs, irq, info, GFP_KERNEL);
>> + if (ret)
>> + return ret;
>> +
>> + info->sysfs_attr.attr.name = name;
>> + ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
>> + auxiliary_irqs_group.name);
>> + if (ret)
>> + goto sysfs_add_err;
>> +
>> + info->sysfs_attr.attr.name = no_free_ptr(name);
>
> This assignment of a name AFTER it has been created is odd. I think I
> know why you are doing this, but please make it obvious and perhaps
> solve it in a cleaner way.
I am doing it since I want the name memory to be freed in case
sysfs_add_file_to_group() fails.
I don’t see a cleaner way available with cleanup.h.
> Assigning this "deep" in a sysfs structure is not ok.
When creating sysfs files dynamically, there isn't a cleaner way to
assign the name memory.
The closest use case, which is essentially the same one for PCI IRQ sysfs
and also uses dynamic sysfs, is msi_sysfs_populate_desc().
It does not use cleanup.h but still has to assign the name.
I don’t have any other ideas on how to implement it more elegantly
with cleanup.h.
Do you prefer that I assign it before sysfs_add_file_to_group(), similar
to msi_sysfs_populate_desc(), and avoid cleanup.h for now?
>
>
>> + xa_store(&auxdev->irqs, irq, no_free_ptr(info), GFP_KERNEL);
>> + return 0;
>> +
>> +sysfs_add_err:
>> + xa_erase(&auxdev->irqs, irq);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_add);
>> +
>> +/**
>> + * auxiliary_device_sysfs_irq_remove - remove a sysfs entry for the given IRQ
>> + * @auxdev: auxiliary bus device to add the sysfs entry.
>> + * @irq: the IRQ to remove.
>> + *
>> + * This function should be called to remove an IRQ sysfs entry.
>> + * The driver must invoke this API when IRQ is released by the device.
>> + */
>> +void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq)
>> +{
>> + struct auxiliary_irq_info *info __free(kfree) = xa_load(&auxdev->irqs, irq);
>
> No verification that this is an actual entry before you dereferenced it?
> Bold move...
The driver must call this only for an allocated IRQ, so xa_load() cannot
fail.
In previous versions we had a WARN_ON to catch driver bugs, but you didn’t
like it.
I think this is fine the way it is in v9.
>
> greg k-h
* Re: [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs
2024-07-05 5:35 ` Shay Drori
@ 2024-07-05 5:53 ` Greg KH
2024-07-05 6:27 ` Przemek Kitszel
1 sibling, 0 replies; 7+ messages in thread
From: Greg KH @ 2024-07-05 5:53 UTC
To: Shay Drori
Cc: netdev, pabeni, davem, kuba, edumazet, david.m.ertman, rafael,
ira.weiny, linux-rdma, leon, tariqt, Simon Horman,
Przemek Kitszel, Parav Pandit
On Fri, Jul 05, 2024 at 08:35:33AM +0300, Shay Drori wrote:
>
>
> On 04/07/2024 13:41, Greg KH wrote:
> > On Wed, Jul 03, 2024 at 10:38:57AM +0300, Shay Drory wrote:
> > > +/**
> > > + * auxiliary_device_sysfs_irq_add - add a sysfs entry for the given IRQ
> > > + * @auxdev: auxiliary bus device to add the sysfs entry.
> > > + * @irq: The associated interrupt number.
> > > + *
> > > + * This function should be called after auxiliary device have successfully
> > > + * received the irq.
> > > + * The driver is responsible to add a unique irq for the auxiliary device. The
> > > + * driver can invoke this function from multiple thread context safely for
> > > + * unique irqs of the auxiliary devices. The driver must not invoke this API
> > > + * multiple times if the irq is already added previously.
> > > + *
> > > + * Return: zero on success or an error code on failure.
> > > + */
> > > +int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
> > > +{
> > > + struct auxiliary_irq_info *info __free(kfree) = NULL;
> > > + struct device *dev = &auxdev->dev;
> > > + char *name __free(kfree) = NULL;
> > > + int ret;
> > > +
> > > + ret = auxiliary_irq_dir_prepare(auxdev);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + info = kzalloc(sizeof(*info), GFP_KERNEL);
> > > + if (!info)
> > > + return -ENOMEM;
> > > +
> > > + sysfs_attr_init(&info->sysfs_attr.attr);
> > > + name = kasprintf(GFP_KERNEL, "%d", irq);
> > > + if (!name)
> > > + return -ENOMEM;
> > > +
> > > + ret = xa_insert(&auxdev->irqs, irq, info, GFP_KERNEL);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + info->sysfs_attr.attr.name = name;
> > > + ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
> > > + auxiliary_irqs_group.name);
> > > + if (ret)
> > > + goto sysfs_add_err;
> > > +
> > > + info->sysfs_attr.attr.name = no_free_ptr(name);
> >
> > This assignment of a name AFTER it has been created is odd. I think I
> > know why you are doing this, but please make it obvious and perhaps
> > solve it in a cleaner way.
>
> I am doing it since I want the name memory to be freed in case of
> sysfs_add_file_to_group() fails.
> I don’t see a cleaner way available with cleanup.h.
>
> > Assigning this "deep" in a sysfs structure is not ok.
>
> when creating sysfs dynamically, there isn't a cleaner way to assign the
> name memory.
> The closest and exact same use case for pci irq sysfs which uses dynamic
> sysfs is msi_sysfs_populate_desc().
> It does not use cleanup.h but still has to assign.
> I Don’t have any other ideas on how to implement it any more elegantly
> with cleanup.h.
> Do you prefer to assign it before sysfs_add_file_to_group() similar to
> msi_sysfs_populate_desc() and avoid cleanup.h for now?

No, what msi_sysfs_populate_desc() does is not good, the only objection
here is the assignment after-the-fact you are doing just to work around
cleanup.h. Surely there's a better way to tell it not to free the
pointer at this point in time other than this.

> > > + xa_store(&auxdev->irqs, irq, no_free_ptr(info), GFP_KERNEL);
> > > + return 0;
> > > +
> > > +sysfs_add_err:
> > > + xa_erase(&auxdev->irqs, irq);
> > > + return ret;
> > > +}
> > > +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_add);
> > > +
> > > +/**
> > > + * auxiliary_device_sysfs_irq_remove - remove a sysfs entry for the given IRQ
> > > + * @auxdev: auxiliary bus device to add the sysfs entry.
> > > + * @irq: the IRQ to remove.
> > > + *
> > > + * This function should be called to remove an IRQ sysfs entry.
> > > + * The driver must invoke this API when IRQ is released by the device.
> > > + */
> > > +void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq)
> > > +{
> > > + struct auxiliary_irq_info *info __free(kfree) = xa_load(&auxdev->irqs, irq);
> >
> > No verification that this is an actual entry before you dereferenced it?
> > Bold move...
>
> Driver must do this for allocated irq. So xa_load cannot fail.
> In previous versions we had WARN_ON to catch driver bugs, but you didn’t
> like it.

Yes, because if something can happen, you handle the error properly, you
don't reboot a machine.

> I think this is fine the way it is in v9.

No, you are now causing a NULL dereference (or close to it) if something
went wrong. Properly check this and handle it correctly.

thanks,
greg k-h
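
To illustrate the kind of check being asked for, here is a minimal userspace sketch, not the patch's code. It assumes a small fixed-size table in place of the xarray, and the names irq_table, table_load(), irq_entry_add() and irq_entry_remove() are hypothetical stand-ins for xa_load() and the two auxiliary_device_sysfs_irq_*() helpers. Unlike the void function in the patch, the remove helper here returns a status so the behaviour can be observed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins only: a fixed table plays the role of the xarray. */
#define MAX_IRQS 64

struct irq_info {
	char name[16];
};

static struct irq_info *irq_table[MAX_IRQS];

/* Hypothetical analogue of xa_load(): NULL means "no entry stored". */
static struct irq_info *table_load(int irq)
{
	if (irq < 0 || irq >= MAX_IRQS)
		return NULL;
	return irq_table[irq];
}

/* Returns 0 on success, -1 if the irq is out of range, already added, or OOM. */
static int irq_entry_add(int irq)
{
	struct irq_info *info;

	if (irq < 0 || irq >= MAX_IRQS || irq_table[irq])
		return -1;

	info = calloc(1, sizeof(*info));
	if (!info)
		return -1;

	snprintf(info->name, sizeof(info->name), "%d", irq);
	irq_table[irq] = info;
	return 0;
}

/* Returns 0 if an entry was removed, -1 if there was nothing to remove. */
static int irq_entry_remove(int irq)
{
	struct irq_info *info = table_load(irq);

	/* Caller bug or double remove: bail out instead of dereferencing NULL. */
	if (!info)
		return -1;

	irq_table[irq] = NULL;
	free(info);
	return 0;
}
```

With this shape, removing an IRQ that was never added (or removing it twice) becomes a handled error path rather than a crash or a WARN_ON.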
* Re: [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs
2024-07-05 5:35 ` Shay Drori
2024-07-05 5:53 ` Greg KH
@ 2024-07-05 6:27 ` Przemek Kitszel
1 sibling, 0 replies; 7+ messages in thread
From: Przemek Kitszel @ 2024-07-05 6:27 UTC (permalink / raw)
To: Shay Drori, Greg KH
Cc: netdev, pabeni, davem, kuba, edumazet, david.m.ertman, rafael,
ira.weiny, linux-rdma, leon, tariqt, Simon Horman, Parav Pandit
On 7/5/24 07:35, Shay Drori wrote:
>
>
> On 04/07/2024 13:41, Greg KH wrote:
>>
>> On Wed, Jul 03, 2024 at 10:38:57AM +0300, Shay Drory wrote:
>>> +/**
>>> + * auxiliary_device_sysfs_irq_add - add a sysfs entry for the given IRQ
>>> + * @auxdev: auxiliary bus device to add the sysfs entry.
>>> + * @irq: The associated interrupt number.
>>> + *
>>> + * This function should be called after auxiliary device have successfully
>>> + * received the irq.
>>> + * The driver is responsible to add a unique irq for the auxiliary device. The
>>> + * driver can invoke this function from multiple thread context safely for
>>> + * unique irqs of the auxiliary devices. The driver must not invoke this API
>>> + * multiple times if the irq is already added previously.
>>> + *
>>> + * Return: zero on success or an error code on failure.
>>> + */
>>> +int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
>>> +{
>>> + struct auxiliary_irq_info *info __free(kfree) = NULL;
>>> + struct device *dev = &auxdev->dev;
>>> + char *name __free(kfree) = NULL;
>>> + int ret;
>>> +
>>> + ret = auxiliary_irq_dir_prepare(auxdev);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + info = kzalloc(sizeof(*info), GFP_KERNEL);
>>> + if (!info)
>>> + return -ENOMEM;
>>> +
>>> + sysfs_attr_init(&info->sysfs_attr.attr);
>>> + name = kasprintf(GFP_KERNEL, "%d", irq);
>>> + if (!name)
>>> + return -ENOMEM;
>>> +
>>> + ret = xa_insert(&auxdev->irqs, irq, info, GFP_KERNEL);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + info->sysfs_attr.attr.name = name;
>>> + ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
>>> + auxiliary_irqs_group.name);
>>> + if (ret)
>>> + goto sysfs_add_err;
>>> +
>>> + info->sysfs_attr.attr.name = no_free_ptr(name);
>>
>> This assignment of a name AFTER it has been created is odd. I think I
>> know why you are doing this, but please make it obvious and perhaps
>> solve it in a cleaner way.
>
> I am doing it since I want the name memory to be freed in case of
> sysfs_add_file_to_group() fails.
> I don’t see a cleaner way available with cleanup.h.
>
>> Assigning this "deep" in a sysfs structure is not ok.
>
> when creating sysfs dynamically, there isn't a cleaner way to assign the
> name memory.
> The closest and exact same use case for pci irq sysfs which uses dynamic
> sysfs is msi_sysfs_populate_desc().
> It does not use cleanup.h but still has to assign.
> I Don’t have any other ideas on how to implement it any more elegantly
> with cleanup.h.
> Do you prefer to assign it before sysfs_add_file_to_group() similar to
> msi_sysfs_populate_desc() and avoid cleanup.h for now?

I've overlooked it earlier, sorry.

The easiest solution for the "general" case would be:

	info->sysfs_attr.attr.name = no_free_ptr(name);
	ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
				      auxiliary_irqs_group.name);
	if (ret) {
		/* freeing manually since auto cleanup was
		 * disabled by no_free_ptr() */
		kfree(info->sysfs_attr.attr.name);
		goto sysfs_add_err;
	}

But in your case it will be cleaner to allocate the space for the name
together with struct auxiliary_irq_info, by placing a char array
there, either with a static size or a flex one (if such a case would
be generic).

Going one step further would be to reorder the fields of struct
device_attribute and struct attribute to have @name as the last one and
make it a flex array - but that is perhaps for another series ;)

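
A rough userspace sketch of the embedded-name idea, for illustration only. It assumes GCC/Clang's cleanup attribute as a stand-in for __free(kfree), and every name in it (struct irq_info, IRQ_NAME_LEN, free_info(), fake_add_file(), irq_info_create()) is hypothetical rather than taken from the patch. The point is that the name pointer is assigned once, before registration, and a single free covers both the struct and the name on the error path.

```c
#include <stdio.h>
#include <stdlib.h>

/* "-2147483648" is 11 characters, plus the terminating NUL. */
#define IRQ_NAME_LEN 12

struct irq_info {
	const char *attr_name;		/* stand-in for sysfs_attr.attr.name */
	char name[IRQ_NAME_LEN];	/* storage lives inside the same allocation */
};

/* Cleanup callback: receives a pointer to the annotated variable. */
static void free_info(struct irq_info **p)
{
	free(*p);
}

/* Stand-in for sysfs_add_file_to_group(); fails when asked to. */
static int fake_add_file(const struct irq_info *info, int should_fail)
{
	(void)info;
	return should_fail ? -1 : 0;
}

/* Returns the new entry (owned by the caller) or NULL on failure. */
static struct irq_info *irq_info_create(int irq, int should_fail)
{
	/* Userspace analogue of "__free(kfree)". */
	struct irq_info *info __attribute__((cleanup(free_info))) = NULL;
	struct irq_info *ret;

	info = calloc(1, sizeof(*info));
	if (!info)
		return NULL;

	snprintf(info->name, sizeof(info->name), "%d", irq);
	info->attr_name = info->name;	/* set once, before registration */

	if (fake_add_file(info, should_fail))
		return NULL;		/* cleanup frees struct and name together */

	/* Hand ownership to the caller; analogue of no_free_ptr(info). */
	ret = info;
	info = NULL;
	return ret;
}
```

Because there is only one allocation, no second no_free_ptr() call is needed for the name, which is the after-the-fact assignment that was objected to.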
>>> +void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq)
>>> +{
>>> + struct auxiliary_irq_info *info __free(kfree) = xa_load(&auxdev->irqs, irq);
>>
>> No verification that this is an actual entry before you dereferenced it?
>> Bold move...
>
> Driver must do this for allocated irq. So xa_load cannot fail.
> In previous versions we had WARN_ON to catch driver bugs, but you didn’t
> like it.
> I think this is fine the way it is in v9.
>
>>
>> greg k-h

Perhaps this is more about trust boundaries?
I would like to learn something from this case :)
Thread overview: 7+ messages
2024-07-03 7:38 [PATCH net-next v9 0/2] Introduce auxiliary bus IRQs sysfs Shay Drory
2024-07-03 7:38 ` [PATCH net-next v9 1/2] driver core: auxiliary bus: show auxiliary device IRQs Shay Drory
2024-07-04 10:41 ` Greg KH
2024-07-05 5:35 ` Shay Drori
2024-07-05 5:53 ` Greg KH
2024-07-05 6:27 ` Przemek Kitszel
2024-07-03 7:38 ` [PATCH net-next v9 2/2] net/mlx5: Expose SFs IRQs Shay Drory