* [PATCH 0/2] dmaengine: idxd: Add basic DSA 3.0 capability and SGL support
@ 2025-06-13 16:18 Yi Sun
2025-06-13 16:18 ` [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs Yi Sun
2025-06-13 16:18 ` [PATCH 2/2] dmaengine: idxd: Add Max SGL Size Support for DSA3.0 Yi Sun
0 siblings, 2 replies; 13+ messages in thread
From: Yi Sun @ 2025-06-13 16:18 UTC (permalink / raw)
To: dave.jiang, vinicius.gomes, dmaengine, linux-kernel
Cc: yi.sun, gordon.jin, fenghuay, anil.s.keshavamurthy, philip.lantz
This patch series introduces foundational support for DSA 3.0 features,
exposing hardware capability registers to userspace in the IDXD driver.
DSA 3.0 introduces several new features that require awareness and
configuration from both kernel and userspace. Userspace tools (e.g.,
idxd-config, libraries, and applications) need to know the hardware's
capabilities, such as supported features, memory layouts, and opcode
compatibility, to use these features properly.
Patch 1/2 exposes the three new capability registers (dsacap0-2)
introduced in the DSA 3.0 specification through a new sysfs entry.
This allows tools and users to query hardware capabilities such as
supported SGL formats, floating-point options, and maximum supported
sizes.
Patch 2/2 enables configuration of the maximum SGL size for DSA 3.0
devices. Some DSA 3.0 opcodes (e.g., Gather Copy, Gather Reduce) require
that the workqueue's SGL size is explicitly configured. This patch sets
that value based on hardware capabilities at initialization time,
allowing these opcodes to function without additional user configuration.
Yi Sun (2):
dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
dmaengine: idxd: Add Max SGL Size Support for DSA3.0
.../ABI/stable/sysfs-driver-dma-idxd | 15 ++++++++++
drivers/dma/idxd/device.c | 5 ++++
drivers/dma/idxd/idxd.h | 19 +++++++++++++
drivers/dma/idxd/init.c | 9 ++++++
drivers/dma/idxd/registers.h | 28 ++++++++++++++++++-
drivers/dma/idxd/sysfs.c | 27 ++++++++++++++++++
6 files changed, 102 insertions(+), 1 deletion(-)
--
2.43.0
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
2025-06-13 16:18 [PATCH 0/2] dmaengine: idxd: Add basic DSA 3.0 capability and SGL support Yi Sun
@ 2025-06-13 16:18 ` Yi Sun
2025-06-13 20:59 ` Dave Jiang
` (2 more replies)
2025-06-13 16:18 ` [PATCH 2/2] dmaengine: idxd: Add Max SGL Size Support for DSA3.0 Yi Sun
1 sibling, 3 replies; 13+ messages in thread
From: Yi Sun @ 2025-06-13 16:18 UTC (permalink / raw)
To: dave.jiang, vinicius.gomes, dmaengine, linux-kernel
Cc: yi.sun, gordon.jin, fenghuay, anil.s.keshavamurthy, philip.lantz
Introduce a sysfs interface for three new Data Streaming Accelerator (DSA)
capability registers (dsacap0-2) to enable userspace awareness of hardware
features in DSA version 3 and later devices.
Userspace components (e.g., configuration libraries and workload
applications) require this information to:
1. Select optimal data transfer strategies based on SGL capabilities
2. Enable hardware-specific optimizations for floating-point operations
3. Configure memory operations with proper numerical handling
4. Verify compute operation compatibility before submitting jobs
The output consists of values from the three dsacap registers, concatenated
in order and separated by commas.
Example:
cat /sys/bus/dsa/devices/dsa0/dsacap
0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
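For illustration, the example output above could be parsed in userspace
roughly as follows. This is only a sketch: parse_dsacap() is a hypothetical
helper that is not part of this patch, and it assumes three comma-separated
64-bit hex values in register order (dsacap0 first), as described above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse "<hex64>,<hex64>,<hex64>" as read from the
 * dsacap attribute. Assumes caps[0] is dsacap0, caps[1] is dsacap1 and
 * caps[2] is dsacap2. Returns 0 on success, -1 on malformed input.
 */
static int parse_dsacap(const char *line, uint64_t caps[3])
{
	int n = sscanf(line, "%" SCNx64 ",%" SCNx64 ",%" SCNx64,
		       &caps[0], &caps[1], &caps[2]);

	return n == 3 ? 0 : -1;
}
```

A caller would read one line from /sys/bus/dsa/devices/dsa0/dsacap and pass
it to parse_dsacap(); a trailing newline does not affect the result.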
Signed-off-by: Yi Sun <yi.sun@intel.com>
Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd b/Documentation/ABI/stable/sysfs-driver-dma-idxd
index 4a355e6747ae..f9568ea52b2f 100644
--- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
+++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
@@ -136,6 +136,21 @@ Description: The last executed device administrative command's status/error.
Also last configuration error overloaded.
Writing to it will clear the status.
+What: /sys/bus/dsa/devices/dsa<m>/dsacap
+Date: June 1, 2025
+KernelVersion: 6.17.0
+Contact: dmaengine@vger.kernel.org
+Description: The DSA3 specification introduces three new capability
+ registers: dsacap[0-2]. User components (e.g., configuration
+ libraries and workload applications) require this information
+ to properly utilize the DSA3 features.
+ This includes SGL capability support, Enabling hardware-specific
+ optimizations, Configuring memory, etc.
+ The output consists of values from the three dsacap registers,
+ concatenated in order and separated by commas.
+ This attribute should only be visible on DSA devices of version
+ 3 or later.
+
What: /sys/bus/dsa/devices/dsa<m>/iaa_cap
Date: Sept 14, 2022
KernelVersion: 6.0.0
diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
index 74e6695881e6..cc0a3fe1c957 100644
--- a/drivers/dma/idxd/idxd.h
+++ b/drivers/dma/idxd/idxd.h
@@ -252,6 +252,9 @@ struct idxd_hw {
struct opcap opcap;
u32 cmd_cap;
union iaa_cap_reg iaa_cap;
+ union dsacap0_reg dsacap0;
+ union dsacap1_reg dsacap1;
+ union dsacap2_reg dsacap2;
};
enum idxd_device_state {
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 80355d03004d..cc8203320d40 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -582,6 +582,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
}
multi_u64_to_bmap(idxd->opcap_bmap, &idxd->hw.opcap.bits[0], 4);
+ idxd->hw.dsacap0.bits = ioread64(idxd->reg_base + IDXD_DSACAP0_OFFSET);
+ idxd->hw.dsacap1.bits = ioread64(idxd->reg_base + IDXD_DSACAP1_OFFSET);
+ idxd->hw.dsacap2.bits = ioread64(idxd->reg_base + IDXD_DSACAP2_OFFSET);
+
/* read iaa cap */
if (idxd->data->type == IDXD_TYPE_IAX && idxd->hw.version >= DEVICE_VERSION_2)
idxd->hw.iaa_cap.bits = ioread64(idxd->reg_base + IDXD_IAACAP_OFFSET);
diff --git a/drivers/dma/idxd/registers.h b/drivers/dma/idxd/registers.h
index 006ba206ab1b..45485ecd7bb6 100644
--- a/drivers/dma/idxd/registers.h
+++ b/drivers/dma/idxd/registers.h
@@ -13,6 +13,7 @@
#define DEVICE_VERSION_1 0x100
#define DEVICE_VERSION_2 0x200
+#define DEVICE_VERSION_3 0x300
#define IDXD_MMIO_BAR 0
#define IDXD_WQ_BAR 2
@@ -582,6 +583,21 @@ union evl_status_reg {
u64 bits;
} __packed;
+#define IDXD_DSACAP0_OFFSET 0x180
+union dsacap0_reg {
+ u64 bits;
+};
+
+#define IDXD_DSACAP1_OFFSET 0x188
+union dsacap1_reg {
+ u64 bits;
+};
+
+#define IDXD_DSACAP2_OFFSET 0x190
+union dsacap2_reg {
+ u64 bits;
+};
+
#define IDXD_MAX_BATCH_IDENT 256
struct __evl_entry {
diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
index 9f0701021af0..624b7d1b193f 100644
--- a/drivers/dma/idxd/sysfs.c
+++ b/drivers/dma/idxd/sysfs.c
@@ -1713,6 +1713,21 @@ static ssize_t event_log_size_store(struct device *dev,
}
static DEVICE_ATTR_RW(event_log_size);
+static ssize_t dsacap_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct idxd_device *idxd = confdev_to_idxd(dev);
+
+ return sysfs_emit(buf, "%08x,%08x,%08x,%08x,%08x,%08x\n",
+ upper_32_bits(idxd->hw.dsacap0.bits),
+ lower_32_bits(idxd->hw.dsacap0.bits),
+ upper_32_bits(idxd->hw.dsacap1.bits),
+ lower_32_bits(idxd->hw.dsacap1.bits),
+ upper_32_bits(idxd->hw.dsacap2.bits),
+ lower_32_bits(idxd->hw.dsacap2.bits));
+}
+static DEVICE_ATTR_RO(dsacap);
+
static bool idxd_device_attr_max_batch_size_invisible(struct attribute *attr,
struct idxd_device *idxd)
{
@@ -1750,6 +1765,14 @@ static bool idxd_device_attr_event_log_size_invisible(struct attribute *attr,
!idxd->hw.gen_cap.evl_support);
}
+static bool idxd_device_attr_dsacap_invisible(struct attribute *attr,
+ struct idxd_device *idxd)
+{
+ return attr == &dev_attr_dsacap.attr &&
+ (idxd->data->type != IDXD_TYPE_DSA ||
+ idxd->hw.version < DEVICE_VERSION_3);
+}
+
static umode_t idxd_device_attr_visible(struct kobject *kobj,
struct attribute *attr, int n)
{
@@ -1768,6 +1791,9 @@ static umode_t idxd_device_attr_visible(struct kobject *kobj,
if (idxd_device_attr_event_log_size_invisible(attr, idxd))
return 0;
+ if (idxd_device_attr_dsacap_invisible(attr, idxd))
+ return 0;
+
return attr->mode;
}
@@ -1795,6 +1821,7 @@ static struct attribute *idxd_device_attributes[] = {
&dev_attr_cmd_status.attr,
&dev_attr_iaa_cap.attr,
&dev_attr_event_log_size.attr,
+ &dev_attr_dsacap.attr,
NULL,
};
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/2] dmaengine: idxd: Add Max SGL Size Support for DSA3.0
2025-06-13 16:18 [PATCH 0/2] dmaengine: idxd: Add basic DSA 3.0 capability and SGL support Yi Sun
2025-06-13 16:18 ` [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs Yi Sun
@ 2025-06-13 16:18 ` Yi Sun
2025-06-13 21:00 ` Dave Jiang
2025-06-13 22:03 ` Fenghua Yu
1 sibling, 2 replies; 13+ messages in thread
From: Yi Sun @ 2025-06-13 16:18 UTC (permalink / raw)
To: dave.jiang, vinicius.gomes, dmaengine, linux-kernel
Cc: yi.sun, gordon.jin, fenghuay, anil.s.keshavamurthy, philip.lantz
Certain DSA 3.0 opcodes, such as Gather Copy and Gather Reduce, require the
maximum SGL size to be configured for a workqueue before they can be used.
Configure the maximum scatter-gather list (SGL) size for workqueues during
setup on supported hardware. Applications can then use these opcodes
without explicitly setting the SGL size.
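As a rough userspace illustration of how the size is derived, the sketch
below decodes a raw dsacap0 value with explicit shifts and masks. The field
positions are assumed from the dsacap0_reg bitfield added by this patch
(first field in the least-significant bits); the struct and function names
are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Field positions assumed from union dsacap0_reg in this patch. */
struct dsacap0_fields {
	unsigned int max_sgl_shift;      /* bits 3:0 */
	unsigned int max_gr_block_shift; /* bits 7:4 */
	unsigned int ops_inter_domain;   /* bits 14:8 */
	unsigned int sgl_formats;        /* bits 47:32 */
	unsigned int max_sg_process;     /* bits 55:48 */
};

/* Hypothetical decoder: extract each field from the raw 64-bit value. */
static struct dsacap0_fields decode_dsacap0(uint64_t bits)
{
	struct dsacap0_fields f = {
		.max_sgl_shift      = (unsigned int)(bits & 0xf),
		.max_gr_block_shift = (unsigned int)((bits >> 4) & 0xf),
		.ops_inter_domain   = (unsigned int)((bits >> 8) & 0x7f),
		.sgl_formats        = (unsigned int)((bits >> 32) & 0xffff),
		.max_sg_process     = (unsigned int)((bits >> 48) & 0xff),
	};

	return f;
}

/* Mirrors the driver's computation: max SGL size = 1 << max_sgl_shift. */
static uint32_t max_sgl_size(uint64_t dsacap0)
{
	return UINT32_C(1) << (dsacap0 & 0xf);
}
```

With the first value from the patch 1/2 example (0014000e000007aa) taken as
dsacap0, max_sgl_shift decodes to 10 and the resulting size is 1024.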
Signed-off-by: Yi Sun <yi.sun@intel.com>
Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
index 5cf419fe6b46..1c10b030bea7 100644
--- a/drivers/dma/idxd/device.c
+++ b/drivers/dma/idxd/device.c
@@ -375,6 +375,7 @@ static void idxd_wq_disable_cleanup(struct idxd_wq *wq)
memset(wq->name, 0, WQ_NAME_SIZE);
wq->max_xfer_bytes = WQ_DEFAULT_MAX_XFER;
idxd_wq_set_max_batch_size(idxd->data->type, wq, WQ_DEFAULT_MAX_BATCH);
+ idxd_wq_set_init_max_sgl_size(idxd, wq);
if (wq->opcap_bmap)
bitmap_copy(wq->opcap_bmap, idxd->opcap_bmap, IDXD_MAX_OPCAP_BITS);
}
@@ -974,6 +975,8 @@ static int idxd_wq_config_write(struct idxd_wq *wq)
/* bytes 12-15 */
wq->wqcfg->max_xfer_shift = ilog2(wq->max_xfer_bytes);
idxd_wqcfg_set_max_batch_shift(idxd->data->type, wq->wqcfg, ilog2(wq->max_batch_size));
+ if (idxd_sgl_supported(idxd))
+ wq->wqcfg->max_sgl_shift = ilog2(wq->max_sgl_size);
/* bytes 32-63 */
if (idxd->hw.wq_cap.op_config && wq->opcap_bmap) {
@@ -1152,6 +1155,8 @@ static int idxd_wq_load_config(struct idxd_wq *wq)
wq->max_xfer_bytes = 1ULL << wq->wqcfg->max_xfer_shift;
idxd_wq_set_max_batch_size(idxd->data->type, wq, 1U << wq->wqcfg->max_batch_shift);
+ if (idxd_sgl_supported(idxd))
+ wq->max_sgl_size = 1U << wq->wqcfg->max_sgl_shift;
for (i = 0; i < WQCFG_STRIDES(idxd); i++) {
wqcfg_offset = WQCFG_OFFSET(idxd, wq->id, i);
diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
index cc0a3fe1c957..fe5af50b58a4 100644
--- a/drivers/dma/idxd/idxd.h
+++ b/drivers/dma/idxd/idxd.h
@@ -227,6 +227,7 @@ struct idxd_wq {
char name[WQ_NAME_SIZE + 1];
u64 max_xfer_bytes;
u32 max_batch_size;
+ u32 max_sgl_size;
/* Lock to protect upasid_xa access. */
struct mutex uc_lock;
@@ -348,6 +349,7 @@ struct idxd_device {
u64 max_xfer_bytes;
u32 max_batch_size;
+ u32 max_sgl_size;
int max_groups;
int max_engines;
int max_rdbufs;
@@ -692,6 +694,20 @@ static inline void idxd_wq_set_max_batch_size(int idxd_type, struct idxd_wq *wq,
wq->max_batch_size = max_batch_size;
}
+static bool idxd_sgl_supported(struct idxd_device *idxd)
+{
+ return idxd->hw.dsacap0.sgl_formats &&
+ idxd->data->type == IDXD_TYPE_DSA &&
+ idxd->hw.version >= DEVICE_VERSION_3;
+}
+
+static inline void idxd_wq_set_init_max_sgl_size(struct idxd_device *idxd,
+ struct idxd_wq *wq)
+{
+ if (idxd_sgl_supported(idxd))
+ wq->max_sgl_size = 1U << idxd->hw.dsacap0.max_sgl_shift;
+}
+
static inline void idxd_wqcfg_set_max_batch_shift(int idxd_type, union wqcfg *wqcfg,
u32 max_batch_shift)
{
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index cc8203320d40..f37a7d7b537a 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -217,6 +217,7 @@ static int idxd_setup_wqs(struct idxd_device *idxd)
init_completion(&wq->wq_resurrect);
wq->max_xfer_bytes = WQ_DEFAULT_MAX_XFER;
idxd_wq_set_max_batch_size(idxd->data->type, wq, WQ_DEFAULT_MAX_BATCH);
+ idxd_wq_set_init_max_sgl_size(idxd, wq);
wq->enqcmds_retries = IDXD_ENQCMDS_RETRIES;
wq->wqcfg = kzalloc_node(idxd->wqcfg_size, GFP_KERNEL, dev_to_node(dev));
if (!wq->wqcfg) {
@@ -585,6 +586,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
idxd->hw.dsacap0.bits = ioread64(idxd->reg_base + IDXD_DSACAP0_OFFSET);
idxd->hw.dsacap1.bits = ioread64(idxd->reg_base + IDXD_DSACAP1_OFFSET);
idxd->hw.dsacap2.bits = ioread64(idxd->reg_base + IDXD_DSACAP2_OFFSET);
+ if (idxd_sgl_supported(idxd)) {
+ idxd->max_sgl_size = 1U << idxd->hw.dsacap0.max_sgl_shift;
+ dev_dbg(dev, "max sgl size: %u\n", idxd->max_sgl_size);
+ }
/* read iaa cap */
if (idxd->data->type == IDXD_TYPE_IAX && idxd->hw.version >= DEVICE_VERSION_2)
diff --git a/drivers/dma/idxd/registers.h b/drivers/dma/idxd/registers.h
index 45485ecd7bb6..0401cfc95f27 100644
--- a/drivers/dma/idxd/registers.h
+++ b/drivers/dma/idxd/registers.h
@@ -385,7 +385,8 @@ union wqcfg {
/* bytes 12-15 */
u32 max_xfer_shift:5;
u32 max_batch_shift:4;
- u32 rsvd4:23;
+ u32 max_sgl_shift:4;
+ u32 rsvd4:19;
/* bytes 16-19 */
u16 occupancy_inth;
@@ -585,6 +586,15 @@ union evl_status_reg {
#define IDXD_DSACAP0_OFFSET 0x180
union dsacap0_reg {
+ struct {
+ u64 max_sgl_shift:4;
+ u64 max_gr_block_shift:4;
+ u64 ops_inter_domain:7;
+ u64 rsvd1:17;
+ u64 sgl_formats:16;
+ u64 max_sg_process:8;
+ u64 rsvd2:8;
+ };
u64 bits;
};
--
2.43.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
2025-06-13 16:18 ` [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs Yi Sun
@ 2025-06-13 20:59 ` Dave Jiang
2025-06-14 10:01 ` Yi Sun
2025-06-13 21:43 ` Fenghua Yu
2025-06-13 22:07 ` Fenghua Yu
2 siblings, 1 reply; 13+ messages in thread
From: Dave Jiang @ 2025-06-13 20:59 UTC (permalink / raw)
To: Yi Sun, vinicius.gomes, dmaengine, linux-kernel
Cc: gordon.jin, fenghuay, anil.s.keshavamurthy, philip.lantz
On 6/13/25 9:18 AM, Yi Sun wrote:
> Introduce a sysfs interface for three new Data Streaming Accelerator (DSA)
> capability registers (dsacap0-2) to enable userspace awareness of hardware
> features in DSA version 3 and later devices.
>
> Userspace components (e.g., configuration libraries and workload
> applications) require this information to:
> 1. Select optimal data transfer strategies based on SGL capabilities
> 2. Enable hardware-specific optimizations for floating-point operations
> 3. Configure memory operations with proper numerical handling
> 4. Verify compute operation compatibility before submitting jobs
>
> The output consists of values from the three dsacap registers, concatenated
> in order and separated by commas.
>
> Example:
> cat /sys/bus/dsa/devices/dsa0/dsacap
> 0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
>
> Signed-off-by: Yi Sun <yi.sun@intel.com>
> Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
It would be good to provide a link to the DSA 3.0 spec. Otherwise:
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
>
> diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd b/Documentation/ABI/stable/sysfs-driver-dma-idxd
> index 4a355e6747ae..f9568ea52b2f 100644
> --- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
> +++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
> @@ -136,6 +136,21 @@ Description: The last executed device administrative command's status/error.
> Also last configuration error overloaded.
> Writing to it will clear the status.
>
> +What: /sys/bus/dsa/devices/dsa<m>/dsacap
> +Date: June 1, 2025
> +KernelVersion: 6.17.0
> +Contact: dmaengine@vger.kernel.org
> +Description: The DSA3 specification introduces three new capability
> + registers: dsacap[0-2]. User components (e.g., configuration
> + libraries and workload applications) require this information
> + to properly utilize the DSA3 features.
> + This includes SGL capability support, Enabling hardware-specific
> + optimizations, Configuring memory, etc.
> + The output consists of values from the three dsacap registers,
> + concatenated in order and separated by commas.
> + This attribute should only be visible on DSA devices of version
> + 3 or later.
> +
> What: /sys/bus/dsa/devices/dsa<m>/iaa_cap
> Date: Sept 14, 2022
> KernelVersion: 6.0.0
> diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
> index 74e6695881e6..cc0a3fe1c957 100644
> --- a/drivers/dma/idxd/idxd.h
> +++ b/drivers/dma/idxd/idxd.h
> @@ -252,6 +252,9 @@ struct idxd_hw {
> struct opcap opcap;
> u32 cmd_cap;
> union iaa_cap_reg iaa_cap;
> + union dsacap0_reg dsacap0;
> + union dsacap1_reg dsacap1;
> + union dsacap2_reg dsacap2;
> };
>
> enum idxd_device_state {
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 80355d03004d..cc8203320d40 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -582,6 +582,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
> }
> multi_u64_to_bmap(idxd->opcap_bmap, &idxd->hw.opcap.bits[0], 4);
>
> + idxd->hw.dsacap0.bits = ioread64(idxd->reg_base + IDXD_DSACAP0_OFFSET);
> + idxd->hw.dsacap1.bits = ioread64(idxd->reg_base + IDXD_DSACAP1_OFFSET);
> + idxd->hw.dsacap2.bits = ioread64(idxd->reg_base + IDXD_DSACAP2_OFFSET);
> +
> /* read iaa cap */
> if (idxd->data->type == IDXD_TYPE_IAX && idxd->hw.version >= DEVICE_VERSION_2)
> idxd->hw.iaa_cap.bits = ioread64(idxd->reg_base + IDXD_IAACAP_OFFSET);
> diff --git a/drivers/dma/idxd/registers.h b/drivers/dma/idxd/registers.h
> index 006ba206ab1b..45485ecd7bb6 100644
> --- a/drivers/dma/idxd/registers.h
> +++ b/drivers/dma/idxd/registers.h
> @@ -13,6 +13,7 @@
>
> #define DEVICE_VERSION_1 0x100
> #define DEVICE_VERSION_2 0x200
> +#define DEVICE_VERSION_3 0x300
>
> #define IDXD_MMIO_BAR 0
> #define IDXD_WQ_BAR 2
> @@ -582,6 +583,21 @@ union evl_status_reg {
> u64 bits;
> } __packed;
>
> +#define IDXD_DSACAP0_OFFSET 0x180
> +union dsacap0_reg {
> + u64 bits;
> +};
> +
> +#define IDXD_DSACAP1_OFFSET 0x188
> +union dsacap1_reg {
> + u64 bits;
> +};
> +
> +#define IDXD_DSACAP2_OFFSET 0x190
> +union dsacap2_reg {
> + u64 bits;
> +};
> +
> #define IDXD_MAX_BATCH_IDENT 256
>
> struct __evl_entry {
> diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
> index 9f0701021af0..624b7d1b193f 100644
> --- a/drivers/dma/idxd/sysfs.c
> +++ b/drivers/dma/idxd/sysfs.c
> @@ -1713,6 +1713,21 @@ static ssize_t event_log_size_store(struct device *dev,
> }
> static DEVICE_ATTR_RW(event_log_size);
>
> +static ssize_t dsacap_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct idxd_device *idxd = confdev_to_idxd(dev);
> +
> + return sysfs_emit(buf, "%08x,%08x,%08x,%08x,%08x,%08x\n",
> + upper_32_bits(idxd->hw.dsacap0.bits),
> + lower_32_bits(idxd->hw.dsacap0.bits),
> + upper_32_bits(idxd->hw.dsacap1.bits),
> + lower_32_bits(idxd->hw.dsacap1.bits),
> + upper_32_bits(idxd->hw.dsacap2.bits),
> + lower_32_bits(idxd->hw.dsacap2.bits));
> +}
> +static DEVICE_ATTR_RO(dsacap);
> +
> static bool idxd_device_attr_max_batch_size_invisible(struct attribute *attr,
> struct idxd_device *idxd)
> {
> @@ -1750,6 +1765,14 @@ static bool idxd_device_attr_event_log_size_invisible(struct attribute *attr,
> !idxd->hw.gen_cap.evl_support);
> }
>
> +static bool idxd_device_attr_dsacap_invisible(struct attribute *attr,
> + struct idxd_device *idxd)
> +{
> + return attr == &dev_attr_dsacap.attr &&
> + (idxd->data->type != IDXD_TYPE_DSA ||
> + idxd->hw.version < DEVICE_VERSION_3);
> +}
> +
> static umode_t idxd_device_attr_visible(struct kobject *kobj,
> struct attribute *attr, int n)
> {
> @@ -1768,6 +1791,9 @@ static umode_t idxd_device_attr_visible(struct kobject *kobj,
> if (idxd_device_attr_event_log_size_invisible(attr, idxd))
> return 0;
>
> + if (idxd_device_attr_dsacap_invisible(attr, idxd))
> + return 0;
> +
> return attr->mode;
> }
>
> @@ -1795,6 +1821,7 @@ static struct attribute *idxd_device_attributes[] = {
> &dev_attr_cmd_status.attr,
> &dev_attr_iaa_cap.attr,
> &dev_attr_event_log_size.attr,
> + &dev_attr_dsacap.attr,
> NULL,
> };
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2] dmaengine: idxd: Add Max SGL Size Support for DSA3.0
2025-06-13 16:18 ` [PATCH 2/2] dmaengine: idxd: Add Max SGL Size Support for DSA3.0 Yi Sun
@ 2025-06-13 21:00 ` Dave Jiang
2025-06-13 22:03 ` Fenghua Yu
1 sibling, 0 replies; 13+ messages in thread
From: Dave Jiang @ 2025-06-13 21:00 UTC (permalink / raw)
To: Yi Sun, vinicius.gomes, dmaengine, linux-kernel
Cc: gordon.jin, fenghuay, anil.s.keshavamurthy, philip.lantz
On 6/13/25 9:18 AM, Yi Sun wrote:
> Certain DSA 3.0 opcodes, such as Gather Copy and Gather Reduce, require the
> maximum SGL size to be configured for a workqueue before they can be used.
>
> Configure the maximum scatter-gather list (SGL) size for workqueues during
> setup on supported hardware. Applications can then use these opcodes
> without explicitly setting the SGL size.
>
> Signed-off-by: Yi Sun <yi.sun@intel.com>
> Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
>
> diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
> index 5cf419fe6b46..1c10b030bea7 100644
> --- a/drivers/dma/idxd/device.c
> +++ b/drivers/dma/idxd/device.c
> @@ -375,6 +375,7 @@ static void idxd_wq_disable_cleanup(struct idxd_wq *wq)
> memset(wq->name, 0, WQ_NAME_SIZE);
> wq->max_xfer_bytes = WQ_DEFAULT_MAX_XFER;
> idxd_wq_set_max_batch_size(idxd->data->type, wq, WQ_DEFAULT_MAX_BATCH);
> + idxd_wq_set_init_max_sgl_size(idxd, wq);
> if (wq->opcap_bmap)
> bitmap_copy(wq->opcap_bmap, idxd->opcap_bmap, IDXD_MAX_OPCAP_BITS);
> }
> @@ -974,6 +975,8 @@ static int idxd_wq_config_write(struct idxd_wq *wq)
> /* bytes 12-15 */
> wq->wqcfg->max_xfer_shift = ilog2(wq->max_xfer_bytes);
> idxd_wqcfg_set_max_batch_shift(idxd->data->type, wq->wqcfg, ilog2(wq->max_batch_size));
> + if (idxd_sgl_supported(idxd))
> + wq->wqcfg->max_sgl_shift = ilog2(wq->max_sgl_size);
>
> /* bytes 32-63 */
> if (idxd->hw.wq_cap.op_config && wq->opcap_bmap) {
> @@ -1152,6 +1155,8 @@ static int idxd_wq_load_config(struct idxd_wq *wq)
>
> wq->max_xfer_bytes = 1ULL << wq->wqcfg->max_xfer_shift;
> idxd_wq_set_max_batch_size(idxd->data->type, wq, 1U << wq->wqcfg->max_batch_shift);
> + if (idxd_sgl_supported(idxd))
> + wq->max_sgl_size = 1U << wq->wqcfg->max_sgl_shift;
>
> for (i = 0; i < WQCFG_STRIDES(idxd); i++) {
> wqcfg_offset = WQCFG_OFFSET(idxd, wq->id, i);
> diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
> index cc0a3fe1c957..fe5af50b58a4 100644
> --- a/drivers/dma/idxd/idxd.h
> +++ b/drivers/dma/idxd/idxd.h
> @@ -227,6 +227,7 @@ struct idxd_wq {
> char name[WQ_NAME_SIZE + 1];
> u64 max_xfer_bytes;
> u32 max_batch_size;
> + u32 max_sgl_size;
>
> /* Lock to protect upasid_xa access. */
> struct mutex uc_lock;
> @@ -348,6 +349,7 @@ struct idxd_device {
>
> u64 max_xfer_bytes;
> u32 max_batch_size;
> + u32 max_sgl_size;
> int max_groups;
> int max_engines;
> int max_rdbufs;
> @@ -692,6 +694,20 @@ static inline void idxd_wq_set_max_batch_size(int idxd_type, struct idxd_wq *wq,
> wq->max_batch_size = max_batch_size;
> }
>
> +static bool idxd_sgl_supported(struct idxd_device *idxd)
> +{
> + return idxd->hw.dsacap0.sgl_formats &&
> + idxd->data->type == IDXD_TYPE_DSA &&
> + idxd->hw.version >= DEVICE_VERSION_3;
> +}
> +
> +static inline void idxd_wq_set_init_max_sgl_size(struct idxd_device *idxd,
> + struct idxd_wq *wq)
> +{
> + if (idxd_sgl_supported(idxd))
> + wq->max_sgl_size = 1U << idxd->hw.dsacap0.max_sgl_shift;
> +}
> +
> static inline void idxd_wqcfg_set_max_batch_shift(int idxd_type, union wqcfg *wqcfg,
> u32 max_batch_shift)
> {
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index cc8203320d40..f37a7d7b537a 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -217,6 +217,7 @@ static int idxd_setup_wqs(struct idxd_device *idxd)
> init_completion(&wq->wq_resurrect);
> wq->max_xfer_bytes = WQ_DEFAULT_MAX_XFER;
> idxd_wq_set_max_batch_size(idxd->data->type, wq, WQ_DEFAULT_MAX_BATCH);
> + idxd_wq_set_init_max_sgl_size(idxd, wq);
> wq->enqcmds_retries = IDXD_ENQCMDS_RETRIES;
> wq->wqcfg = kzalloc_node(idxd->wqcfg_size, GFP_KERNEL, dev_to_node(dev));
> if (!wq->wqcfg) {
> @@ -585,6 +586,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
> idxd->hw.dsacap0.bits = ioread64(idxd->reg_base + IDXD_DSACAP0_OFFSET);
> idxd->hw.dsacap1.bits = ioread64(idxd->reg_base + IDXD_DSACAP1_OFFSET);
> idxd->hw.dsacap2.bits = ioread64(idxd->reg_base + IDXD_DSACAP2_OFFSET);
> + if (idxd_sgl_supported(idxd)) {
> + idxd->max_sgl_size = 1U << idxd->hw.dsacap0.max_sgl_shift;
> + dev_dbg(dev, "max sgl size: %u\n", idxd->max_sgl_size);
> + }
>
> /* read iaa cap */
> if (idxd->data->type == IDXD_TYPE_IAX && idxd->hw.version >= DEVICE_VERSION_2)
> diff --git a/drivers/dma/idxd/registers.h b/drivers/dma/idxd/registers.h
> index 45485ecd7bb6..0401cfc95f27 100644
> --- a/drivers/dma/idxd/registers.h
> +++ b/drivers/dma/idxd/registers.h
> @@ -385,7 +385,8 @@ union wqcfg {
> /* bytes 12-15 */
> u32 max_xfer_shift:5;
> u32 max_batch_shift:4;
> - u32 rsvd4:23;
> + u32 max_sgl_shift:4;
> + u32 rsvd4:19;
>
> /* bytes 16-19 */
> u16 occupancy_inth;
> @@ -585,6 +586,15 @@ union evl_status_reg {
>
> #define IDXD_DSACAP0_OFFSET 0x180
> union dsacap0_reg {
> + struct {
> + u64 max_sgl_shift:4;
> + u64 max_gr_block_shift:4;
> + u64 ops_inter_domain:7;
> + u64 rsvd1:17;
> + u64 sgl_formats:16;
> + u64 max_sg_process:8;
> + u64 rsvd2:8;
> + };
> u64 bits;
> };
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
2025-06-13 16:18 ` [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs Yi Sun
2025-06-13 20:59 ` Dave Jiang
@ 2025-06-13 21:43 ` Fenghua Yu
2025-06-13 22:07 ` Fenghua Yu
2 siblings, 0 replies; 13+ messages in thread
From: Fenghua Yu @ 2025-06-13 21:43 UTC (permalink / raw)
To: Yi Sun, dave.jiang, vinicius.gomes, dmaengine, linux-kernel
Cc: gordon.jin, anil.s.keshavamurthy, philip.lantz
Hi, Yi,
On 6/13/25 09:18, Yi Sun wrote:
> Introduce a sysfs interface for three new Data Streaming Accelerator (DSA)
> capability registers (dsacap0-2) to enable userspace awareness of hardware
> features in DSA version 3 and later devices.
>
> Userspace components (e.g., configuration libraries and workload
> applications) require this information to:
> 1. Select optimal data transfer strategies based on SGL capabilities
> 2. Enable hardware-specific optimizations for floating-point operations
> 3. Configure memory operations with proper numerical handling
> 4. Verify compute operation compatibility before submitting jobs
>
> The output consists of values from the three dsacap registers, concatenated
> in order and separated by commas.
>
> Example:
> cat /sys/bus/dsa/devices/dsa0/dsacap
> 0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
>
> Signed-off-by: Yi Sun <yi.sun@intel.com>
> Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>
> diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd b/Documentation/ABI/stable/sysfs-driver-dma-idxd
> index 4a355e6747ae..f9568ea52b2f 100644
> --- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
> +++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
> @@ -136,6 +136,21 @@ Description: The last executed device administrative command's status/error.
> Also last configuration error overloaded.
> Writing to it will clear the status.
>
> +What: /sys/bus/dsa/devices/dsa<m>/dsacap
Since three DSA caps are shown together, would it be better to rename
this ABI to "dsacaps"?
> +Date: June 1, 2025
> +KernelVersion: 6.17.0
> +Contact: dmaengine@vger.kernel.org
> +Description: The DSA3 specification introduces three new capability
> + registers: dsacap[0-2]. User components (e.g., configuration
> + libraries and workload applications) require this information
> + to properly utilize the DSA3 features.
> + This includes SGL capability support, Enabling hardware-specific
> + optimizations, Configuring memory, etc.
> + The output consists of values from the three dsacap registers,
> + concatenated in order and separated by commas.
> + This attribute should only be visible on DSA devices of version
> + 3 or later.
> +
It would be better to document the order of the caps in the output so that
apps can parse them. Something like:
"The output format is <dsacap2>,<dsacap1>,<dsacap0> where each DSA cap
value is a 64-bit hex value."
> What: /sys/bus/dsa/devices/dsa<m>/iaa_cap
> Date: Sept 14, 2022
> KernelVersion: 6.0.0
> diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
> index 74e6695881e6..cc0a3fe1c957 100644
> --- a/drivers/dma/idxd/idxd.h
> +++ b/drivers/dma/idxd/idxd.h
> @@ -252,6 +252,9 @@ struct idxd_hw {
> struct opcap opcap;
> u32 cmd_cap;
> union iaa_cap_reg iaa_cap;
> + union dsacap0_reg dsacap0;
> + union dsacap1_reg dsacap1;
> + union dsacap2_reg dsacap2;
> };
>
> enum idxd_device_state {
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 80355d03004d..cc8203320d40 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -582,6 +582,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
> }
> multi_u64_to_bmap(idxd->opcap_bmap, &idxd->hw.opcap.bits[0], 4);
>
> + idxd->hw.dsacap0.bits = ioread64(idxd->reg_base + IDXD_DSACAP0_OFFSET);
> + idxd->hw.dsacap1.bits = ioread64(idxd->reg_base + IDXD_DSACAP1_OFFSET);
> + idxd->hw.dsacap2.bits = ioread64(idxd->reg_base + IDXD_DSACAP2_OFFSET);
> +
> /* read iaa cap */
> if (idxd->data->type == IDXD_TYPE_IAX && idxd->hw.version >= DEVICE_VERSION_2)
> idxd->hw.iaa_cap.bits = ioread64(idxd->reg_base + IDXD_IAACAP_OFFSET);
> diff --git a/drivers/dma/idxd/registers.h b/drivers/dma/idxd/registers.h
> index 006ba206ab1b..45485ecd7bb6 100644
> --- a/drivers/dma/idxd/registers.h
> +++ b/drivers/dma/idxd/registers.h
> @@ -13,6 +13,7 @@
>
> #define DEVICE_VERSION_1 0x100
> #define DEVICE_VERSION_2 0x200
> +#define DEVICE_VERSION_3 0x300
>
> #define IDXD_MMIO_BAR 0
> #define IDXD_WQ_BAR 2
> @@ -582,6 +583,21 @@ union evl_status_reg {
> u64 bits;
> } __packed;
>
> +#define IDXD_DSACAP0_OFFSET 0x180
> +union dsacap0_reg {
> + u64 bits;
> +};
I forgot the format of dsacap. Are there any fields in each DSA cap
register? If so, it would be better to add a structure inside the union to
describe the fields.
> +
> +#define IDXD_DSACAP1_OFFSET 0x188
> +union dsacap1_reg {
> + u64 bits;
> +};
> +
> +#define IDXD_DSACAP2_OFFSET 0x190
> +union dsacap2_reg {
> + u64 bits;
> +};
> +
> #define IDXD_MAX_BATCH_IDENT 256
>
> struct __evl_entry {
> diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
> index 9f0701021af0..624b7d1b193f 100644
> --- a/drivers/dma/idxd/sysfs.c
> +++ b/drivers/dma/idxd/sysfs.c
> @@ -1713,6 +1713,21 @@ static ssize_t event_log_size_store(struct device *dev,
> }
> static DEVICE_ATTR_RW(event_log_size);
>
> +static ssize_t dsacap_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct idxd_device *idxd = confdev_to_idxd(dev);
> +
> + return sysfs_emit(buf, "%08x,%08x,%08x,%08x,%08x,%08x\n",
> + upper_32_bits(idxd->hw.dsacap0.bits),
> + lower_32_bits(idxd->hw.dsacap0.bits),
> + upper_32_bits(idxd->hw.dsacap1.bits),
> + lower_32_bits(idxd->hw.dsacap1.bits),
> + upper_32_bits(idxd->hw.dsacap2.bits),
> + lower_32_bits(idxd->hw.dsacap2.bits));
The output format of this sysfs_emit() doesn't match the format in your
earlier example:
cat /sys/bus/dsa/devices/dsa0/dsacap
0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
This sysfs_emit() is also more complex than it needs to be.
Could it be changed to this?
+ return sysfs_emit(buf, "%016llx,%016llx,%016llx\n",
+ (u64)idxd->hw.dsacap0.bits,
+ (u64)idxd->hw.dsacap1.bits,
+ (u64)idxd->hw.dsacap2.bits);
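To make the mismatch concrete, here is a small userspace sketch (not driver
code) that formats the first example value both ways. fmt_split() and
fmt_whole() are hypothetical helpers that only mirror the two format strings
discussed above; the input value is the first field of the example output.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: two 32-bit halves joined by a comma, as in the patch. */
static int fmt_split(char *buf, size_t len, uint64_t v)
{
	return snprintf(buf, len, "%08" PRIx32 ",%08" PRIx32,
			(uint32_t)(v >> 32), (uint32_t)v);
}

/* Hypothetical helper: one 16-digit value, as in the commit message example. */
static int fmt_whole(char *buf, size_t len, uint64_t v)
{
	return snprintf(buf, len, "%016" PRIx64, v);
}
```

With v = 0x0014000e000007aa, fmt_split() gives "0014000e,000007aa" while
fmt_whole() gives "0014000e000007aa", so the patch as posted would emit six
comma-separated fields per line rather than the three shown in the example.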
> +}
> +static DEVICE_ATTR_RO(dsacap);
Since 3 dsa caps are shown together, do you need to change the ABI name
to "dsacaps" instead of "dsacap"?
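Whatever the attribute ends up being called, a userspace consumer of the
three-value format could look roughly like the sketch below. This is only an
illustration: parse_dsacap() is a hypothetical helper, not an existing
idxd-config or library function, and it assumes the "hex64,hex64,hex64" layout
from the commit message example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse one line of the form "<hex64>,<hex64>,<hex64>"
 * into regs[0..2] (dsacap0, dsacap1, dsacap2 in that order).
 * Returns 0 on success, -1 if fewer than three values were found.
 */
static int parse_dsacap(const char *line, uint64_t regs[3])
{
	int n = sscanf(line, "%" SCNx64 ",%" SCNx64 ",%" SCNx64,
		       &regs[0], &regs[1], &regs[2]);

	return n == 3 ? 0 : -1;
}
```

A caller would read the sysfs file into a buffer and pass the line to
parse_dsacap(); the trailing newline does not need to be stripped first.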
> +
> static bool idxd_device_attr_max_batch_size_invisible(struct attribute *attr,
> struct idxd_device *idxd)
> {
> @@ -1750,6 +1765,14 @@ static bool idxd_device_attr_event_log_size_invisible(struct attribute *attr,
> !idxd->hw.gen_cap.evl_support);
> }
>
> +static bool idxd_device_attr_dsacap_invisible(struct attribute *attr,
> + struct idxd_device *idxd)
> +{
> + return attr == &dev_attr_dsacap.attr &&
> + (idxd->data->type != IDXD_TYPE_DSA ||
> + idxd->hw.version < DEVICE_VERSION_3);
> +}
> +
> static umode_t idxd_device_attr_visible(struct kobject *kobj,
> struct attribute *attr, int n)
> {
> @@ -1768,6 +1791,9 @@ static umode_t idxd_device_attr_visible(struct kobject *kobj,
> if (idxd_device_attr_event_log_size_invisible(attr, idxd))
> return 0;
>
> + if (idxd_device_attr_dsacap_invisible(attr, idxd))
> + return 0;
> +
> return attr->mode;
> }
>
> @@ -1795,6 +1821,7 @@ static struct attribute *idxd_device_attributes[] = {
> &dev_attr_cmd_status.attr,
> &dev_attr_iaa_cap.attr,
> &dev_attr_event_log_size.attr,
> + &dev_attr_dsacap.attr,
> NULL,
> };
>
Thanks.
-Fenghua
* Re: [PATCH 2/2] dmaengine: idxd: Add Max SGL Size Support for DSA3.0
2025-06-13 16:18 ` [PATCH 2/2] dmaengine: idxd: Add Max SGL Size Support for DSA3.0 Yi Sun
2025-06-13 21:00 ` Dave Jiang
@ 2025-06-13 22:03 ` Fenghua Yu
2025-06-14 7:56 ` Yi Sun
1 sibling, 1 reply; 13+ messages in thread
From: Fenghua Yu @ 2025-06-13 22:03 UTC (permalink / raw)
To: Yi Sun, dave.jiang, vinicius.gomes, dmaengine, linux-kernel
Cc: gordon.jin, anil.s.keshavamurthy, philip.lantz
Hi, Yi,
On 6/13/25 09:18, Yi Sun wrote:
> Certain DSA 3.0 opcodes, such as Gather copy and Gather reduce requires max
s/reduce requires/reduce, require/
> SGL configured for workqueues prior to support these opcodes.
s/prior to support/prior to supporting/
>
> Configure the maximum scatter-gather list (SGL) size for workqueues during
> setup on the supported HW. Application can then properly handle the SGL
> size without explicitly setting it.
>
> Signed-off-by: Yi Sun <yi.sun@intel.com>
> Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>
> diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
> index 5cf419fe6b46..1c10b030bea7 100644
> --- a/drivers/dma/idxd/device.c
> +++ b/drivers/dma/idxd/device.c
> @@ -375,6 +375,7 @@ static void idxd_wq_disable_cleanup(struct idxd_wq *wq)
> memset(wq->name, 0, WQ_NAME_SIZE);
> wq->max_xfer_bytes = WQ_DEFAULT_MAX_XFER;
> idxd_wq_set_max_batch_size(idxd->data->type, wq, WQ_DEFAULT_MAX_BATCH);
> + idxd_wq_set_init_max_sgl_size(idxd, wq);
> if (wq->opcap_bmap)
> bitmap_copy(wq->opcap_bmap, idxd->opcap_bmap, IDXD_MAX_OPCAP_BITS);
> }
> @@ -974,6 +975,8 @@ static int idxd_wq_config_write(struct idxd_wq *wq)
> /* bytes 12-15 */
> wq->wqcfg->max_xfer_shift = ilog2(wq->max_xfer_bytes);
> idxd_wqcfg_set_max_batch_shift(idxd->data->type, wq->wqcfg, ilog2(wq->max_batch_size));
> + if (idxd_sgl_supported(idxd))
> + wq->wqcfg->max_sgl_shift = ilog2(wq->max_sgl_size);
>
> /* bytes 32-63 */
> if (idxd->hw.wq_cap.op_config && wq->opcap_bmap) {
> @@ -1152,6 +1155,8 @@ static int idxd_wq_load_config(struct idxd_wq *wq)
>
> wq->max_xfer_bytes = 1ULL << wq->wqcfg->max_xfer_shift;
> idxd_wq_set_max_batch_size(idxd->data->type, wq, 1U << wq->wqcfg->max_batch_shift);
> + if (idxd_sgl_supported(idxd))
> + wq->max_sgl_size = 1U << wq->wqcfg->max_sgl_shift;
>
> for (i = 0; i < WQCFG_STRIDES(idxd); i++) {
> wqcfg_offset = WQCFG_OFFSET(idxd, wq->id, i);
> diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
> index cc0a3fe1c957..fe5af50b58a4 100644
> --- a/drivers/dma/idxd/idxd.h
> +++ b/drivers/dma/idxd/idxd.h
> @@ -227,6 +227,7 @@ struct idxd_wq {
> char name[WQ_NAME_SIZE + 1];
> u64 max_xfer_bytes;
> u32 max_batch_size;
> + u32 max_sgl_size;
>
> /* Lock to protect upasid_xa access. */
> struct mutex uc_lock;
> @@ -348,6 +349,7 @@ struct idxd_device {
>
> u64 max_xfer_bytes;
> u32 max_batch_size;
> + u32 max_sgl_size;
> int max_groups;
> int max_engines;
> int max_rdbufs;
> @@ -692,6 +694,20 @@ static inline void idxd_wq_set_max_batch_size(int idxd_type, struct idxd_wq *wq,
> wq->max_batch_size = max_batch_size;
> }
>
> +static bool idxd_sgl_supported(struct idxd_device *idxd)
> +{
> + return idxd->hw.dsacap0.sgl_formats &&
> + idxd->data->type == IDXD_TYPE_DSA &&
> + idxd->hw.version >= DEVICE_VERSION_3;
> +}
This is not safe on DSA 1 or 2 because the first check,
idxd->hw.dsacap0.sgl_formats, uses a value that is not valid on DSA 1
and 2.
Please change the order to this for safety:
+ return idxd->data->type == IDXD_TYPE_DSA &&
+ idxd->hw.version >= DEVICE_VERSION_3 &&
+ idxd->hw.dsacap0.sgl_formats;
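The reordering relies on && evaluating left to right and stopping at the
first false operand. A minimal userspace sketch of that behaviour follows;
struct fake_dev and the constants are simplified stand-ins made up for this
illustration, not the real idxd structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the driver's device type and version constants. */
enum { TYPE_DSA, TYPE_IAX };
#define VERSION_3 0x300

struct fake_dev {
	int type;
	uint32_t version;
	uint16_t sgl_formats;	/* only meaningful when version >= VERSION_3 */
};

static bool sgl_supported(const struct fake_dev *d)
{
	/*
	 * && short-circuits: sgl_formats is only consulted once the type
	 * and version checks have both passed.
	 */
	return d->type == TYPE_DSA &&
	       d->version >= VERSION_3 &&
	       d->sgl_formats;
}
```

With this order, a version 2 device reports "not supported" regardless of
what happens to be stored in sgl_formats.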
> +
> +static inline void idxd_wq_set_init_max_sgl_size(struct idxd_device *idxd,
> + struct idxd_wq *wq)
> +{
> + if (idxd_sgl_supported(idxd))
> + wq->max_sgl_size = 1U << idxd->hw.dsacap0.max_sgl_shift;
> +}
> +
> static inline void idxd_wqcfg_set_max_batch_shift(int idxd_type, union wqcfg *wqcfg,
> u32 max_batch_shift)
> {
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index cc8203320d40..f37a7d7b537a 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -217,6 +217,7 @@ static int idxd_setup_wqs(struct idxd_device *idxd)
> init_completion(&wq->wq_resurrect);
> wq->max_xfer_bytes = WQ_DEFAULT_MAX_XFER;
> idxd_wq_set_max_batch_size(idxd->data->type, wq, WQ_DEFAULT_MAX_BATCH);
> + idxd_wq_set_init_max_sgl_size(idxd, wq);
> wq->enqcmds_retries = IDXD_ENQCMDS_RETRIES;
> wq->wqcfg = kzalloc_node(idxd->wqcfg_size, GFP_KERNEL, dev_to_node(dev));
> if (!wq->wqcfg) {
> @@ -585,6 +586,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
> idxd->hw.dsacap0.bits = ioread64(idxd->reg_base + IDXD_DSACAP0_OFFSET);
> idxd->hw.dsacap1.bits = ioread64(idxd->reg_base + IDXD_DSACAP1_OFFSET);
> idxd->hw.dsacap2.bits = ioread64(idxd->reg_base + IDXD_DSACAP2_OFFSET);
> + if (idxd_sgl_supported(idxd)) {
> + idxd->max_sgl_size = 1U << idxd->hw.dsacap0.max_sgl_shift;
> + dev_dbg(dev, "max sgl size: %u\n", idxd->max_sgl_size);
> + }
>
> /* read iaa cap */
> if (idxd->data->type == IDXD_TYPE_IAX && idxd->hw.version >= DEVICE_VERSION_2)
> diff --git a/drivers/dma/idxd/registers.h b/drivers/dma/idxd/registers.h
> index 45485ecd7bb6..0401cfc95f27 100644
> --- a/drivers/dma/idxd/registers.h
> +++ b/drivers/dma/idxd/registers.h
> @@ -385,7 +385,8 @@ union wqcfg {
> /* bytes 12-15 */
> u32 max_xfer_shift:5;
> u32 max_batch_shift:4;
> - u32 rsvd4:23;
> + u32 max_sgl_shift:4;
> + u32 rsvd4:19;
>
> /* bytes 16-19 */
> u16 occupancy_inth;
> @@ -585,6 +586,15 @@ union evl_status_reg {
>
> #define IDXD_DSACAP0_OFFSET 0x180
> union dsacap0_reg {
> + struct {
> + u64 max_sgl_shift:4;
> + u64 max_gr_block_shift:4;
> + u64 ops_inter_domain:7;
> + u64 rsvd1:17;
> + u64 sgl_formats:16;
> + u64 max_sg_process:8;
> + u64 rsvd2:8;
> + };
Ah. The fields are defined here. I would suggest defining the fields
in patch 1 because:
1. A reviewer (like me) may get confused when reviewing patch 1, where
dsacap0 has no fields but is still defined as a union.
2. There are fields other than max_sgl_shift. Those fields are
irrelevant to this patch and are better defined in patch 1.
> u64 bits;
> };
>
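As a cross-check of the layout, the sample dsacap0 value from the patch 1
commit message (0014000e000007aa) can be decoded with explicit shifts and
masks. This is a userspace sketch only. It assumes the bit-fields above are
allocated from bit 0 upward, as they are on little-endian x86, and that the
first value in the sample output is dsacap0. struct dsacap0_fields, bits()
and decode_dsacap0() are names made up for this illustration.

```c
#include <stdint.h>

/* Decoded view of dsacap0; reserved fields are omitted. */
struct dsacap0_fields {
	unsigned int max_sgl_shift;		/* bits 3:0 */
	unsigned int max_gr_block_shift;	/* bits 7:4 */
	unsigned int ops_inter_domain;		/* bits 14:8 */
	unsigned int sgl_formats;		/* bits 47:32 */
	unsigned int max_sg_process;		/* bits 55:48 */
};

/* Extract 'width' bits of 'v' starting at bit 'lo'. */
static uint64_t bits(uint64_t v, unsigned int lo, unsigned int width)
{
	return (v >> lo) & ((1ULL << width) - 1);
}

/* Assumes LSB-first field allocation, matching the union on x86. */
static struct dsacap0_fields decode_dsacap0(uint64_t raw)
{
	struct dsacap0_fields f = {
		.max_sgl_shift      = (unsigned int)bits(raw, 0, 4),
		.max_gr_block_shift = (unsigned int)bits(raw, 4, 4),
		.ops_inter_domain   = (unsigned int)bits(raw, 8, 7),
		.sgl_formats        = (unsigned int)bits(raw, 32, 16),
		.max_sg_process     = (unsigned int)bits(raw, 48, 8),
	};

	return f;
}
```

Under those assumptions the sample decodes to max_sgl_shift = 10,
max_gr_block_shift = 10, ops_inter_domain = 0x7, sgl_formats = 0xe and
max_sg_process = 20, so 1U << max_sgl_shift would give a max SGL size of
1024. Since max_sgl_shift is 4 bits wide, the largest value it can encode
is a shift of 15.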
Thanks.
-Fenghua
* Re: [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
2025-06-13 16:18 ` [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs Yi Sun
2025-06-13 20:59 ` Dave Jiang
2025-06-13 21:43 ` Fenghua Yu
@ 2025-06-13 22:07 ` Fenghua Yu
2025-06-13 22:26 ` Lantz, Philip
2 siblings, 1 reply; 13+ messages in thread
From: Fenghua Yu @ 2025-06-13 22:07 UTC (permalink / raw)
To: Yi Sun, dave.jiang, vinicius.gomes, dmaengine, linux-kernel
Cc: gordon.jin, anil.s.keshavamurthy, philip.lantz
Hi, Yi,
On 6/13/25 09:18, Yi Sun wrote:
> Introduce sysfs interfaces for 3 new Data Streaming Accelerator (DSA)
> capability registers (dsacap0-2) to enable userspace awareness of hardware
> features in DSA version 3 and later devices.
>
> Userspace components (e.g. configure libraries, workload Apps) require this
> information to:
> 1. Select optimal data transfer strategies based on SGL capabilities
> 2. Enable hardware-specific optimizations for floating-point operations
> 3. Configure memory operations with proper numerical handling
> 4. Verify compute operation compatibility before submitting jobs
>
> The output consists of values from the three dsacap registers, concatenated
> in order and separated by commas.
>
> Example:
> cat /sys/bus/dsa/devices/dsa0/dsacap
> 0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
>
> Signed-off-by: Yi Sun <yi.sun@intel.com>
> Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>
> diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd b/Documentation/ABI/stable/sysfs-driver-dma-idxd
> index 4a355e6747ae..f9568ea52b2f 100644
> --- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
> +++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
> @@ -136,6 +136,21 @@ Description: The last executed device administrative command's status/error.
> Also last configuration error overloaded.
> Writing to it will clear the status.
>
> +What: /sys/bus/dsa/devices/dsa<m>/dsacap
> +Date: June 1, 2025
> +KernelVersion: 6.17.0
> +Contact: dmaengine@vger.kernel.org
> +Description: The DSA3 specification introduces three new capability
> + registers: dsacap[0-2]. User components (e.g., configuration
> + libraries and workload applications) require this information
> + to properly utilize the DSA3 features.
> + This includes SGL capability support, Enabling hardware-specific
> + optimizations, Configuring memory, etc.
> + The output consists of values from the three dsacap registers,
> + concatenated in order and separated by commas.
> + This attribute should only be visible on DSA devices of version
> + 3 or later.
> +
> What: /sys/bus/dsa/devices/dsa<m>/iaa_cap
> Date: Sept 14, 2022
> KernelVersion: 6.0.0
> diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
> index 74e6695881e6..cc0a3fe1c957 100644
> --- a/drivers/dma/idxd/idxd.h
> +++ b/drivers/dma/idxd/idxd.h
> @@ -252,6 +252,9 @@ struct idxd_hw {
> struct opcap opcap;
> u32 cmd_cap;
> union iaa_cap_reg iaa_cap;
> + union dsacap0_reg dsacap0;
> + union dsacap1_reg dsacap1;
> + union dsacap2_reg dsacap2;
> };
>
> enum idxd_device_state {
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 80355d03004d..cc8203320d40 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -582,6 +582,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
> }
> multi_u64_to_bmap(idxd->opcap_bmap, &idxd->hw.opcap.bits[0], 4);
>
> + idxd->hw.dsacap0.bits = ioread64(idxd->reg_base + IDXD_DSACAP0_OFFSET);
> + idxd->hw.dsacap1.bits = ioread64(idxd->reg_base + IDXD_DSACAP1_OFFSET);
> + idxd->hw.dsacap2.bits = ioread64(idxd->reg_base + IDXD_DSACAP2_OFFSET);
> +
The dsacaps are invalid for DSA 1 and 2, so it is not safe to read and
assign the bits on those versions.
Better to assign the dsacap bits only when
idxd->hw.version >= DEVICE_VERSION_3.
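Something along these lines is what I have in mind, shown here as a userspace
sketch rather than real driver code. struct fake_idxd and read_dsacaps() are
hypothetical stand-ins, and a plain array takes the place of the MMIO
registers that the driver reads with ioread64().

```c
#include <stdint.h>

#define DEVICE_VERSION_3 0x300

/* Hypothetical stand-in for the device state; not the real struct idxd_device. */
struct fake_idxd {
	uint32_t version;
	const uint64_t *regs;	/* stand-in for the three capability registers */
	uint64_t dsacap[3];
};

static void read_dsacaps(struct fake_idxd *d)
{
	int i;

	/* Keep the cached values at zero on devices older than version 3. */
	for (i = 0; i < 3; i++)
		d->dsacap[i] = 0;

	if (d->version < DEVICE_VERSION_3)
		return;

	/* Only version 3 and later devices get their registers read. */
	for (i = 0; i < 3; i++)
		d->dsacap[i] = d->regs[i];
}
```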
[SNIP]
Thanks.
-Fenghua
* RE: [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
2025-06-13 22:07 ` Fenghua Yu
@ 2025-06-13 22:26 ` Lantz, Philip
2025-06-16 18:08 ` Fenghua Yu
0 siblings, 1 reply; 13+ messages in thread
From: Lantz, Philip @ 2025-06-13 22:26 UTC (permalink / raw)
To: Fenghua Yu, Sun, Yi
Cc: Jin, Gordon, Keshavamurthy, Anil S, Jiang, Dave, Gomes, Vinicius,
dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org
Fenghua wrote:
> Hi, Yi,
>
> On 6/13/25 09:18, Yi Sun wrote:
> > Introduce sysfs interfaces for 3 new Data Streaming Accelerator (DSA)
> > capability registers (dsacap0-2) to enable userspace awareness of hardware
> > features in DSA version 3 and later devices.
> >
> > Userspace components (e.g. configure libraries, workload Apps) require this
> > information to:
> > 1. Select optimal data transfer strategies based on SGL capabilities
> > 2. Enable hardware-specific optimizations for floating-point operations
> > 3. Configure memory operations with proper numerical handling
> > 4. Verify compute operation compatibility before submitting jobs
> >
> > The output consists of values from the three dsacap registers, concatenated
> > in order and separated by commas.
> >
> > Example:
> > cat /sys/bus/dsa/devices/dsa0/dsacap
> > 0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
> >
> > Signed-off-by: Yi Sun <yi.sun@intel.com>
> > Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> > Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> >
> > diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd
> b/Documentation/ABI/stable/sysfs-driver-dma-idxd
> > index 4a355e6747ae..f9568ea52b2f 100644
> > --- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
> > +++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
> > @@ -136,6 +136,21 @@ Description: The last executed device administrative
> command's status/error.
> > Also last configuration error overloaded.
> > Writing to it will clear the status.
> >
> > +What: /sys/bus/dsa/devices/dsa<m>/dsacap
> > +Date: June 1, 2025
> > +KernelVersion: 6.17.0
> > +Contact: dmaengine@vger.kernel.org
> > +Description: The DSA3 specification introduces three new capability
> > + registers: dsacap[0-2]. User components (e.g., configuration
> > + libraries and workload applications) require this information
> > + to properly utilize the DSA3 features.
> > + This includes SGL capability support, Enabling hardware-specific
> > + optimizations, Configuring memory, etc.
> > + The output consists of values from the three dsacap registers,
> > + concatenated in order and separated by commas.
> > + This attribute should only be visible on DSA devices of version
> > + 3 or later.
> > +
> > What: /sys/bus/dsa/devices/dsa<m>/iaa_cap
> > Date: Sept 14, 2022
> > KernelVersion: 6.0.0
> > diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
> > index 74e6695881e6..cc0a3fe1c957 100644
> > --- a/drivers/dma/idxd/idxd.h
> > +++ b/drivers/dma/idxd/idxd.h
> > @@ -252,6 +252,9 @@ struct idxd_hw {
> > struct opcap opcap;
> > u32 cmd_cap;
> > union iaa_cap_reg iaa_cap;
> > + union dsacap0_reg dsacap0;
> > + union dsacap1_reg dsacap1;
> > + union dsacap2_reg dsacap2;
> > };
> >
> > enum idxd_device_state {
> > diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> > index 80355d03004d..cc8203320d40 100644
> > --- a/drivers/dma/idxd/init.c
> > +++ b/drivers/dma/idxd/init.c
> > @@ -582,6 +582,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
> > }
> > multi_u64_to_bmap(idxd->opcap_bmap, &idxd->hw.opcap.bits[0], 4);
> >
> > + idxd->hw.dsacap0.bits = ioread64(idxd->reg_base +
> IDXD_DSACAP0_OFFSET);
> > + idxd->hw.dsacap1.bits = ioread64(idxd->reg_base +
> IDXD_DSACAP1_OFFSET);
> > + idxd->hw.dsacap2.bits = ioread64(idxd->reg_base +
> IDXD_DSACAP2_OFFSET);
> > +
>
> The dsacaps are invalid for DSA 1 and 2. Not safe to read and assign the
> bits on DSA 1 and 2.
>
> Better to assign the dsacap bits only when
> idxd->hw.version >= DEVICE_VERSION_3.
The registers are architecturally guaranteed to return 0 on prior versions, so it is
safe to read them on DSA 1 and 2 and there is no need for an additional check.
> [SNIP]
* Re: [PATCH 2/2] dmaengine: idxd: Add Max SGL Size Support for DSA3.0
2025-06-13 22:03 ` Fenghua Yu
@ 2025-06-14 7:56 ` Yi Sun
0 siblings, 0 replies; 13+ messages in thread
From: Yi Sun @ 2025-06-14 7:56 UTC (permalink / raw)
To: Fenghua Yu
Cc: dave.jiang, vinicius.gomes, dmaengine, linux-kernel, gordon.jin,
anil.s.keshavamurthy, philip.lantz
On 13.06.2025 15:03, Fenghua Yu wrote:
>Hi, Yi,
>
>On 6/13/25 09:18, Yi Sun wrote:
>>Certain DSA 3.0 opcodes, such as Gather copy and Gather reduce requires max
>s/reduce requires/reduce, require/
>>SGL configured for workqueues prior to support these opcodes.
>s/prior to support/prior to supporting/
>>
Got it.
... ...
>> #define IDXD_DSACAP0_OFFSET 0x180
>> union dsacap0_reg {
>>+ struct {
>>+ u64 max_sgl_shift:4;
>>+ u64 max_gr_block_shift:4;
>>+ u64 ops_inter_domain:7;
>>+ u64 rsvd1:17;
>>+ u64 sgl_formats:16;
>>+ u64 max_sg_process:8;
>>+ u64 rsvd2:8;
>>+ };
>
>Ah. The fields are defined here. I would suggest the fields are
>defined in patch 1 because:
>
>1. Reviewer (like me) may get confused when reviewing patch 1 where
>dsacap0 doesn't have any field but is defined a union.
>
>2. There are fields that not max_sgl_shift. So those fields are
>irrelevant to this patch and had better to be define in patch 1.
>
>> u64 bits;
>> };
>
OK, I see. I'll move this definition to patch 1.
Thanks
--Sun, Yi
* Re: [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
2025-06-13 20:59 ` Dave Jiang
@ 2025-06-14 10:01 ` Yi Sun
0 siblings, 0 replies; 13+ messages in thread
From: Yi Sun @ 2025-06-14 10:01 UTC (permalink / raw)
To: Dave Jiang
Cc: vinicius.gomes, dmaengine, linux-kernel, gordon.jin, fenghuay,
anil.s.keshavamurthy, philip.lantz
On 13.06.2025 13:59, Dave Jiang wrote:
>
>
>On 6/13/25 9:18 AM, Yi Sun wrote:
>> Introduce sysfs interfaces for 3 new Data Streaming Accelerator (DSA)
>> capability registers (dsacap0-2) to enable userspace awareness of hardware
>> features in DSA version 3 and later devices.
>>
>> Userspace components (e.g. configure libraries, workload Apps) require this
>> information to:
>> 1. Select optimal data transfer strategies based on SGL capabilities
>> 2. Enable hardware-specific optimizations for floating-point operations
>> 3. Configure memory operations with proper numerical handling
>> 4. Verify compute operation compatibility before submitting jobs
>>
>> The output consists of values from the three dsacap registers, concatenated
>> in order and separated by commas.
>>
>> Example:
>> cat /sys/bus/dsa/devices/dsa0/dsacap
>> 0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
>>
>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>> Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>
>Would be good to provide a link to the 3.0 spec. Otherwise
>Reviewed-by: Dave Jiang <dave.jiang@intel.com>
>
Sure, will add this link:
https://cdrdv2-public.intel.com/857060/341204-006-intel-data-streaming-accelerator-spec.pdf
Thanks
--Sun, Yi
* Re: [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
2025-06-13 22:26 ` Lantz, Philip
@ 2025-06-16 18:08 ` Fenghua Yu
2025-06-19 2:51 ` Sun, Yi
0 siblings, 1 reply; 13+ messages in thread
From: Fenghua Yu @ 2025-06-16 18:08 UTC (permalink / raw)
To: Lantz, Philip, Sun, Yi
Cc: Jin, Gordon, Keshavamurthy, Anil S, Jiang, Dave, Gomes, Vinicius,
dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org
Hi, Philip,
On 6/13/25 15:26, Lantz, Philip wrote:
>
> Fenghua wrote:
>
>> Hi, Yi,
>>
>> On 6/13/25 09:18, Yi Sun wrote:
>>> Introduce sysfs interfaces for 3 new Data Streaming Accelerator (DSA)
>>> capability registers (dsacap0-2) to enable userspace awareness of hardware
>>> features in DSA version 3 and later devices.
>>>
>>> Userspace components (e.g. configure libraries, workload Apps) require this
>>> information to:
>>> 1. Select optimal data transfer strategies based on SGL capabilities
>>> 2. Enable hardware-specific optimizations for floating-point operations
>>> 3. Configure memory operations with proper numerical handling
>>> 4. Verify compute operation compatibility before submitting jobs
>>>
>>> The output consists of values from the three dsacap registers, concatenated
>>> in order and separated by commas.
>>>
>>> Example:
>>> cat /sys/bus/dsa/devices/dsa0/dsacap
>>> 0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
>>>
>>> Signed-off-by: Yi Sun <yi.sun@intel.com>
>>> Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>>> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>>>
>>> diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd
>> b/Documentation/ABI/stable/sysfs-driver-dma-idxd
>>> index 4a355e6747ae..f9568ea52b2f 100644
>>> --- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
>>> +++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
>>> @@ -136,6 +136,21 @@ Description: The last executed device administrative
>> command's status/error.
>>> Also last configuration error overloaded.
>>> Writing to it will clear the status.
>>>
>>> +What: /sys/bus/dsa/devices/dsa<m>/dsacap
>>> +Date: June 1, 2025
>>> +KernelVersion: 6.17.0
>>> +Contact: dmaengine@vger.kernel.org
>>> +Description: The DSA3 specification introduces three new capability
>>> + registers: dsacap[0-2]. User components (e.g., configuration
>>> + libraries and workload applications) require this information
>>> + to properly utilize the DSA3 features.
>>> + This includes SGL capability support, Enabling hardware-specific
>>> + optimizations, Configuring memory, etc.
>>> + The output consists of values from the three dsacap registers,
>>> + concatenated in order and separated by commas.
>>> + This attribute should only be visible on DSA devices of version
>>> + 3 or later.
>>> +
>>> What: /sys/bus/dsa/devices/dsa<m>/iaa_cap
>>> Date: Sept 14, 2022
>>> KernelVersion: 6.0.0
>>> diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
>>> index 74e6695881e6..cc0a3fe1c957 100644
>>> --- a/drivers/dma/idxd/idxd.h
>>> +++ b/drivers/dma/idxd/idxd.h
>>> @@ -252,6 +252,9 @@ struct idxd_hw {
>>> struct opcap opcap;
>>> u32 cmd_cap;
>>> union iaa_cap_reg iaa_cap;
>>> + union dsacap0_reg dsacap0;
>>> + union dsacap1_reg dsacap1;
>>> + union dsacap2_reg dsacap2;
>>> };
>>>
>>> enum idxd_device_state {
>>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>> index 80355d03004d..cc8203320d40 100644
>>> --- a/drivers/dma/idxd/init.c
>>> +++ b/drivers/dma/idxd/init.c
>>> @@ -582,6 +582,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
>>> }
>>> multi_u64_to_bmap(idxd->opcap_bmap, &idxd->hw.opcap.bits[0], 4);
>>>
>>> + idxd->hw.dsacap0.bits = ioread64(idxd->reg_base +
>> IDXD_DSACAP0_OFFSET);
>>> + idxd->hw.dsacap1.bits = ioread64(idxd->reg_base +
>> IDXD_DSACAP1_OFFSET);
>>> + idxd->hw.dsacap2.bits = ioread64(idxd->reg_base +
>> IDXD_DSACAP2_OFFSET);
>>> +
>> The dsacaps are invalid for DSA 1 and 2. Not safe to read and assign the
>> bits on DSA 1 and 2.
>>
>> Better to assign the dsacap bits only when
>> idxd->hw.version >= DEVICE_VERSION_3.
> The registers are architecturally guaranteed to return 0 on prior versions, so it is
> safe to read them on DSA 1 and 2 and there is no need for an additional check.
Although it's safe to read them here on DSA 1 and 2, reading a reserved
value is generally not good coding practice in the kernel. I would still
suggest avoiding reads of the reserved values on DSA 1 and 2.
Thanks.
-Fenghua
* Re: [PATCH 1/2] dmaengine: idxd: Expose DSA3.0 capabilities through sysfs
2025-06-16 18:08 ` Fenghua Yu
@ 2025-06-19 2:51 ` Sun, Yi
0 siblings, 0 replies; 13+ messages in thread
From: Sun, Yi @ 2025-06-19 2:51 UTC (permalink / raw)
To: Fenghua Yu, Lantz, Philip
Cc: Jin, Gordon, Keshavamurthy, Anil S, Jiang, Dave, Gomes, Vinicius,
dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org
On 16.06.2025 11:08, Fenghua Yu wrote:
>Hi, Philip,
>
>On 6/13/25 15:26, Lantz, Philip wrote:
>>
>>Fenghua wrote:
>>
>>>Hi, Yi,
>>>
>>>On 6/13/25 09:18, Yi Sun wrote:
>>>>Introduce sysfs interfaces for 3 new Data Streaming Accelerator (DSA)
>>>>capability registers (dsacap0-2) to enable userspace awareness of hardware
>>>>features in DSA version 3 and later devices.
>>>>
>>>>Userspace components (e.g. configure libraries, workload Apps) require this
>>>>information to:
>>>>1. Select optimal data transfer strategies based on SGL capabilities
>>>>2. Enable hardware-specific optimizations for floating-point operations
>>>>3. Configure memory operations with proper numerical handling
>>>>4. Verify compute operation compatibility before submitting jobs
>>>>
>>>>The output consists of values from the three dsacap registers, concatenated
>>>>in order and separated by commas.
>>>>
>>>>Example:
>>>>cat /sys/bus/dsa/devices/dsa0/dsacap
>>>> 0014000e000007aa,00fa01ff01ff03ff,000000000000f18d
>>>>
>>>>Signed-off-by: Yi Sun <yi.sun@intel.com>
>>>>Co-developed-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>>>>Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
>>>>
>>>>diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd
>>>b/Documentation/ABI/stable/sysfs-driver-dma-idxd
>>>>index 4a355e6747ae..f9568ea52b2f 100644
>>>>--- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
>>>>+++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
>>>>@@ -136,6 +136,21 @@ Description: The last executed device administrative
>>>command's status/error.
>>>> Also last configuration error overloaded.
>>>> Writing to it will clear the status.
>>>>
>>>>+What: /sys/bus/dsa/devices/dsa<m>/dsacap
>>>>+Date: June 1, 2025
>>>>+KernelVersion: 6.17.0
>>>>+Contact: dmaengine@vger.kernel.org
>>>>+Description: The DSA3 specification introduces three new capability
>>>>+ registers: dsacap[0-2]. User components (e.g., configuration
>>>>+ libraries and workload applications) require this information
>>>>+ to properly utilize the DSA3 features.
>>>>+ This includes SGL capability support, Enabling hardware-specific
>>>>+ optimizations, Configuring memory, etc.
>>>>+ The output consists of values from the three dsacap registers,
>>>>+ concatenated in order and separated by commas.
>>>>+ This attribute should only be visible on DSA devices of version
>>>>+ 3 or later.
>>>>+
>>>> What: /sys/bus/dsa/devices/dsa<m>/iaa_cap
>>>> Date: Sept 14, 2022
>>>> KernelVersion: 6.0.0
>>>>diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
>>>>index 74e6695881e6..cc0a3fe1c957 100644
>>>>--- a/drivers/dma/idxd/idxd.h
>>>>+++ b/drivers/dma/idxd/idxd.h
>>>>@@ -252,6 +252,9 @@ struct idxd_hw {
>>>> struct opcap opcap;
>>>> u32 cmd_cap;
>>>> union iaa_cap_reg iaa_cap;
>>>>+ union dsacap0_reg dsacap0;
>>>>+ union dsacap1_reg dsacap1;
>>>>+ union dsacap2_reg dsacap2;
>>>> };
>>>>
>>>> enum idxd_device_state {
>>>>diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>>>index 80355d03004d..cc8203320d40 100644
>>>>--- a/drivers/dma/idxd/init.c
>>>>+++ b/drivers/dma/idxd/init.c
>>>>@@ -582,6 +582,10 @@ static void idxd_read_caps(struct idxd_device *idxd)
>>>> }
>>>> multi_u64_to_bmap(idxd->opcap_bmap, &idxd->hw.opcap.bits[0], 4);
>>>>
>>>>+ idxd->hw.dsacap0.bits = ioread64(idxd->reg_base +
>>>IDXD_DSACAP0_OFFSET);
>>>>+ idxd->hw.dsacap1.bits = ioread64(idxd->reg_base +
>>>IDXD_DSACAP1_OFFSET);
>>>>+ idxd->hw.dsacap2.bits = ioread64(idxd->reg_base +
>>>IDXD_DSACAP2_OFFSET);
>>>>+
>>>The dsacaps are invalid for DSA 1 and 2. Not safe to read and assign the
>>>bits on DSA 1 and 2.
>>>
>>>Better to assign the dsacap bits only when
>>>idxd->hw.version >= DEVICE_VERSION_3.
>>The registers are architecturally guaranteed to return 0 on prior versions, so it is
>>safe to read them on DSA 1 and 2 and there is no need for an additional check.
>
>Although it's safe to read them here on DSA 1 and 2, reading a
>reserved value generally is not a good code practice in the kernel. I
>would still suggest to avoid to read the reserved values on DSA 1 and
>2.
My previous understanding was that ioread64() would behave safely on
DSA 1.0 and DSA 2.0. However, I'm fine with Fenghua's suggestion of adding
the condition idxd->hw.version >= DEVICE_VERSION_3. It provides some
future-proofing in case the behavior of ioread64() changes.
Thanks
--Sun, Yi