* [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: Parav Pandit @ 2024-04-10 11:58 UTC
To: netdev, dsahern, stephen; +Cc: jiri, shayd, Parav Pandit

Devices send event notifications for the IO queues,
such as tx and rx queues, through event queues.
Enable a privileged owner, such as a hypervisor PF, to set the number
of IO event queues for the VF and SF during the provisioning stage.
example:
Get maximum IO event queues of the VF device::
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
  function:
    hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10
Set maximum IO event queues of the VF device::
$ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32
$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
  function:
    hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32
patch summary:
patch-1 updates devlink uapi
patch-2 adds print, get and set routines for max_io_eqs field
changelog:
v1->v2:
- addressed comments from Jiri
- updated man page for the new parameter
- corrected print to not have EQs value as optional
- replaced 'value' with 'EQs'
Parav Pandit (2):
uapi: Update devlink kernel headers
devlink: Support setting max_io_eqs
devlink/devlink.c | 29 ++++++++++++++++++++++++++++-
include/uapi/linux/devlink.h | 1 +
man/man8/devlink-port.8 | 12 ++++++++++++
3 files changed, 41 insertions(+), 1 deletion(-)
--
2.26.2

* [PATCH v2 1/2] uapi: Update devlink kernel headers

From: Parav Pandit @ 2024-04-10 11:58 UTC
To: netdev, dsahern, stephen; +Cc: jiri, shayd, Parav Pandit

Update devlink kernel header to commit:
92de776d2090 ("Merge tag 'mlx5-updates-2023-12-20' of
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux")

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
note: This patch can be dropped by first syncing all the uapi files.
---
 include/uapi/linux/devlink.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index aaac2438..80051b8c 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -686,6 +686,7 @@ enum devlink_port_function_attr {
 	DEVLINK_PORT_FN_ATTR_OPSTATE,	/* u8 */
 	DEVLINK_PORT_FN_ATTR_CAPS,	/* bitfield32 */
 	DEVLINK_PORT_FN_ATTR_DEVLINK,	/* nested */
+	DEVLINK_PORT_FN_ATTR_MAX_IO_EQS,	/* u32 */
 
 	__DEVLINK_PORT_FUNCTION_ATTR_MAX,
 	DEVLINK_PORT_FUNCTION_ATTR_MAX = __DEVLINK_PORT_FUNCTION_ATTR_MAX - 1
--
2.26.2
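
For context, the kernel consumes this new attribute through per-port
driver callbacks. A rough sketch of what such a callback pair could look
like follows; the names and signatures here are assumptions modeled on
the existing port-function attribute ops, not taken verbatim from the
kernel series:

/*
 * Hypothetical driver-side ops backing DEVLINK_PORT_FN_ATTR_MAX_IO_EQS.
 * Names and signatures are assumptions following the pattern of the
 * other port-function attribute callbacks.
 */
static int example_port_fn_max_io_eqs_get(struct devlink_port *port,
					  u32 *max_io_eqs,
					  struct netlink_ext_ack *extack)
{
	*max_io_eqs = 10;	/* report the device's current limit */
	return 0;
}

static int example_port_fn_max_io_eqs_set(struct devlink_port *port,
					  u32 max_io_eqs,
					  struct netlink_ext_ack *extack)
{
	/* validate against device capabilities, then program the function */
	return 0;
}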

* [PATCH v2 2/2] devlink: Support setting max_io_eqs

From: Parav Pandit @ 2024-04-10 11:58 UTC
To: netdev, dsahern, stephen; +Cc: jiri, shayd, Parav Pandit

Devices send event notifications for the IO queues,
such as tx and rx queues, through event queues.

Enable a privileged owner, such as a hypervisor PF, to set the number
of IO event queues for the VF and SF during the provisioning stage.

example:
Get maximum IO event queues of the VF device::

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
  function:
    hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10

Set maximum IO event queues of the VF device::

$ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
  function:
    hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v1->v2:
- addressed comments from Jiri
- updated man page for the new parameter
- corrected print to not have EQs value as optional
- replaced 'value' with 'EQs'
---
 devlink/devlink.c       | 29 ++++++++++++++++++++++++++++-
 man/man8/devlink-port.8 | 12 ++++++++++++
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/devlink/devlink.c b/devlink/devlink.c
index dbeb6e39..03d27202 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -309,6 +309,7 @@ static int ifname_map_update(struct ifname_map *ifname_map, const char *ifname)
 #define DL_OPT_PORT_FN_RATE_TX_PRIORITY	BIT(55)
 #define DL_OPT_PORT_FN_RATE_TX_WEIGHT	BIT(56)
 #define DL_OPT_PORT_FN_CAPS		BIT(57)
+#define DL_OPT_PORT_FN_MAX_IO_EQS	BIT(58)
 
 struct dl_opts {
 	uint64_t present; /* flags of present items */
@@ -375,6 +376,7 @@ struct dl_opts {
 	const char *linecard_type;
 	bool selftests_opt[DEVLINK_ATTR_SELFTEST_ID_MAX + 1];
 	struct nla_bitfield32 port_fn_caps;
+	uint32_t port_fn_max_io_eqs;
 };
 
 struct dl {
@@ -773,6 +775,7 @@ devlink_function_policy[DEVLINK_PORT_FUNCTION_ATTR_MAX + 1] = {
 	[DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR ] = MNL_TYPE_BINARY,
 	[DEVLINK_PORT_FN_ATTR_STATE] = MNL_TYPE_U8,
 	[DEVLINK_PORT_FN_ATTR_DEVLINK] = MNL_TYPE_NESTED,
+	[DEVLINK_PORT_FN_ATTR_MAX_IO_EQS] = MNL_TYPE_U32,
 };
 
 static int function_attr_cb(const struct nlattr *attr, void *data)
@@ -2298,6 +2301,17 @@ static int dl_argv_parse(struct dl *dl, uint64_t o_required,
 			if (ipsec_packet)
 				opts->port_fn_caps.value |= DEVLINK_PORT_FN_CAP_IPSEC_PACKET;
 			o_found |= DL_OPT_PORT_FN_CAPS;
+		} else if (dl_argv_match(dl, "max_io_eqs") &&
+			   (o_all & DL_OPT_PORT_FN_MAX_IO_EQS)) {
+			uint32_t max_io_eqs;
+
+			dl_arg_inc(dl);
+			err = dl_argv_uint32_t(dl, &max_io_eqs);
+			if (err)
+				return err;
+			opts->port_fn_max_io_eqs = max_io_eqs;
+			o_found |= DL_OPT_PORT_FN_MAX_IO_EQS;
+
 		} else {
 			pr_err("Unknown option \"%s\"\n", dl_argv(dl));
 			return -EINVAL;
@@ -2428,6 +2442,9 @@ dl_function_attr_put(struct nlmsghdr *nlh, const struct dl_opts *opts)
 	if (opts->present & DL_OPT_PORT_FN_CAPS)
 		mnl_attr_put(nlh, DEVLINK_PORT_FN_ATTR_CAPS,
 			     sizeof(opts->port_fn_caps), &opts->port_fn_caps);
+	if (opts->present & DL_OPT_PORT_FN_MAX_IO_EQS)
+		mnl_attr_put_u32(nlh, DEVLINK_PORT_FN_ATTR_MAX_IO_EQS,
+				 opts->port_fn_max_io_eqs);
 
 	mnl_attr_nest_end(nlh, nest);
 }
@@ -4744,6 +4761,7 @@ static void cmd_port_help(void)
 	pr_err("       devlink port function set DEV/PORT_INDEX [ hw_addr ADDR ] [ state { active | inactive } ]\n");
 	pr_err("                      [ roce { enable | disable } ] [ migratable { enable | disable } ]\n");
 	pr_err("                      [ ipsec_crypto { enable | disable } ] [ ipsec_packet { enable | disable } ]\n");
+	pr_err("                      [ max_io_eqs EQS ]\n");
 	pr_err("       devlink port function rate { help | show | add | del | set }\n");
 	pr_err("       devlink port param set DEV/PORT_INDEX name PARAMETER value VALUE cmode { permanent | driverinit | runtime }\n");
 	pr_err("       devlink port param show [DEV/PORT_INDEX name PARAMETER]\n");
@@ -4878,6 +4896,15 @@ static void pr_out_port_function(struct dl *dl, struct nlattr **tb_port)
 			       port_fn_caps->value & DEVLINK_PORT_FN_CAP_IPSEC_PACKET ?
 			       "enable" : "disable");
 	}
+	if (tb[DEVLINK_PORT_FN_ATTR_MAX_IO_EQS]) {
+		uint32_t max_io_eqs;
+
+		max_io_eqs = mnl_attr_get_u32(tb[DEVLINK_PORT_FN_ATTR_MAX_IO_EQS]);
+
+		print_uint(PRINT_ANY, "max_io_eqs", " max_io_eqs %u",
+			   max_io_eqs);
+	}
+
 	if (tb[DEVLINK_PORT_FN_ATTR_DEVLINK])
 		pr_out_nested_handle_obj(dl, tb[DEVLINK_PORT_FN_ATTR_DEVLINK],
 					 true, true);
@@ -5086,7 +5113,7 @@ static int cmd_port_function_set(struct dl *dl)
 	}
 	err = dl_argv_parse(dl, DL_OPT_HANDLEP,
 			    DL_OPT_PORT_FUNCTION_HW_ADDR | DL_OPT_PORT_FUNCTION_STATE |
-			    DL_OPT_PORT_FN_CAPS);
+			    DL_OPT_PORT_FN_CAPS | DL_OPT_PORT_FN_MAX_IO_EQS);
 	if (err)
 		return err;
 
diff --git a/man/man8/devlink-port.8 b/man/man8/devlink-port.8
index 70d8837e..6f582260 100644
--- a/man/man8/devlink-port.8
+++ b/man/man8/devlink-port.8
@@ -83,6 +83,9 @@ devlink-port \- devlink port configuration
 .RI "[ "
 .BR ipsec_packet " { " enable " | " disable " }"
 .RI "]"
+.RI "[ "
+.BR max_io_eqs " EQS"
+.RI "]"
 
 .ti -8
 .BR "devlink port function rate "
@@ -238,6 +241,10 @@ crypto operation (Encrypt/Decrypt) offload.
 Set the IPsec packet offload capability of the function. Controls XFRM state
 and policy offload (Encrypt/Decrypt operation and IPsec encapsulation).
 
+.TP
+.BR max_io_eqs " EQS"
+Set the maximum number of IO event queues of the function.
+
 .ti -8
 .SS devlink port del - delete a devlink port
 .PP
@@ -377,6 +384,11 @@ devlink port function set pci/0000:01:00.0/1 ipsec_packet enable
 This will enable the IPsec packet offload functionality of the function.
 .RE
 .PP
+devlink port function set pci/0000:01:00.0/1 max_io_eqs 4
+.RS 4
+This will set the maximum number of IO event queues of the function to 4.
+.RE
+.PP
 devlink port function set pci/0000:01:00.0/1 hw_addr 00:00:00:11:22:33 state active
 .RS 4
 Configure hardware address and also active the function. When a function is
--
2.26.2
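
On the wire, the new option travels as a u32 nested inside the port
function attribute of the port-set request, mirroring
dl_function_attr_put() above. A minimal hedged sketch of building that
nest with libmnl as a standalone helper (the nest type
DEVLINK_ATTR_PORT_FUNCTION is an assumption based on how devlink.c nests
the other function attributes; error handling omitted):

#include <stdint.h>
#include <libmnl/libmnl.h>
#include <linux/devlink.h>

/* Append the port-function nest carrying max_io_eqs to an already
 * prepared DEVLINK_CMD_PORT_SET message.
 */
static void put_max_io_eqs(struct nlmsghdr *nlh, uint32_t max_io_eqs)
{
	struct nlattr *nest;

	/* open the port-function nest, add the u32, close the nest */
	nest = mnl_attr_nest_start(nlh, DEVLINK_ATTR_PORT_FUNCTION);
	mnl_attr_put_u32(nlh, DEVLINK_PORT_FN_ATTR_MAX_IO_EQS, max_io_eqs);
	mnl_attr_nest_end(nlh, nest);
}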

* Re: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: Samudrala, Sridhar @ 2024-04-10 23:22 UTC
To: Parav Pandit, netdev, dsahern, stephen; +Cc: jiri, shayd

On 4/10/2024 6:58 AM, Parav Pandit wrote:
> Devices send event notifications for the IO queues,
> such as tx and rx queues, through event queues.
>
> Enable a privileged owner, such as a hypervisor PF, to set the number
> of IO event queues for the VF and SF during the provisioning stage.

How do you provision tx/rx queues for VFs and SFs?
Don't you need a similar mechanism to set up the max tx/rx queues too?

> [...]

* RE: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: Parav Pandit @ 2024-04-11 2:32 UTC
To: Samudrala, Sridhar, netdev@vger.kernel.org, dsahern@kernel.org, stephen@networkplumber.org
Cc: Jiri Pirko, Shay Drori

Hi Sridhar,

> From: Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Sent: Thursday, April 11, 2024 4:53 AM
>
> On 4/10/2024 6:58 AM, Parav Pandit wrote:
> > Devices send event notifications for the IO queues, such as tx and rx
> > queues, through event queues.
> >
> > Enable a privileged owner, such as a hypervisor PF, to set the number
> > of IO event queues for the VF and SF during the provisioning stage.
>
> How do you provision tx/rx queues for VFs and SFs?
> Don't you need a similar mechanism to set up the max tx/rx queues too?

Currently we don't; they are derived from the IO event queues.
As you know, sometimes more txqs than IO event queues are needed, for
XDP, timestamping, or multiple TCs.
If needed, additional knobs for txq and rxq can be added to restrict
device resources.

> [...]

* Re: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: Samudrala, Sridhar @ 2024-04-11 23:03 UTC
To: Parav Pandit, netdev@vger.kernel.org, dsahern@kernel.org, stephen@networkplumber.org
Cc: Jiri Pirko, Shay Drori

On 4/10/2024 9:32 PM, Parav Pandit wrote:
>> How do you provision tx/rx queues for VFs and SFs?
>> Don't you need a similar mechanism to set up the max tx/rx queues too?
>
> Currently we don't; they are derived from the IO event queues.
> As you know, sometimes more txqs than IO event queues are needed, for
> XDP, timestamping, or multiple TCs.
> If needed, additional knobs for txq and rxq can be added to restrict
> device resources.

Rather than deriving tx and rx queues from IO event queues, isn't it
more user-friendly to do it the other way around? Let the host admin set
the maximum number of tx and rx queues allowed, and let the driver
derive the number of IO event queues from those values. That would be
consistent with what ethtool reports as preset maximum values for the
corresponding VF/SF.

> [...]

* Re: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: David Ahern @ 2024-04-12 2:06 UTC
To: Samudrala, Sridhar, Parav Pandit, netdev@vger.kernel.org, stephen@networkplumber.org
Cc: Jiri Pirko, Shay Drori

On 4/11/24 5:03 PM, Samudrala, Sridhar wrote:
> Rather than deriving tx and rx queues from IO event queues, isn't it
> more user-friendly to do it the other way around? Let the host admin set
> the maximum number of tx and rx queues allowed, and let the driver
> derive the number of IO event queues from those values. That would be
> consistent with what ethtool reports as preset maximum values for the
> corresponding VF/SF.

I agree with this point: IO EQs seem to be an mlx5 thing (or maybe I
have not reviewed enough of the other drivers). Rx and Tx queues are
already part of the ethtool API. This devlink feature allows resource
limits to be configured, and a consistent API across tools would be
better for users.

* RE: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: Parav Pandit @ 2024-04-12 3:31 UTC
To: David Ahern, Samudrala, Sridhar, netdev@vger.kernel.org, stephen@networkplumber.org
Cc: Jiri Pirko, Shay Drori

Hi David, Sridhar,

> From: David Ahern <dsahern@kernel.org>
> Sent: Friday, April 12, 2024 7:36 AM
>
> I agree with this point: IO EQs seem to be an mlx5 thing (or maybe I
> have not reviewed enough of the other drivers).

IO EQs are used by hns3, mana, mlx5, mlxsw, and be2net. Those drivers
might simply not yet have the need to provision them.

> Rx and Tx queues are already part of the ethtool API. This devlink
> feature allows resource limits to be configured, and a consistent API
> across tools would be better for users.

The IO EQs of a function are also used outside the netdev stack; on a
multi-functionality function they back, for example, the rdma completion
vectors. Txqs and rxqs are yet another separate resource, so such knobs
would not be mutually exclusive with IO EQs.

I can additionally add txq and rxq provisioning knobs too; would that be
useful?

Sridhar,
I haven't checked lately how usable this would be for other drivers;
will you also implement the txq/rxq callbacks?
Please let me know; I can start the work on those additional knobs later
next week.

* RE: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: Parav Pandit @ 2024-04-12 5:22 UTC
To: David Ahern, Samudrala, Sridhar, netdev@vger.kernel.org, stephen@networkplumber.org
Cc: Jiri Pirko, Shay Drori

> From: Parav Pandit <parav@nvidia.com>
> Sent: Friday, April 12, 2024 9:02 AM
>
> The IO EQs of a function are also used outside the netdev stack; on a
> multi-functionality function they back, for example, the rdma
> completion vectors. Txqs and rxqs are yet another separate resource, so
> such knobs would not be mutually exclusive with IO EQs.

I also forgot to mention in the reply above that some drivers, such as
mlx5, create internal txqs and rxqs that are not directly visible as
channels (for XDP, timestamping, traffic classes, dropping certain
packets on rx, and so on). So an exact derivation of IO EQs from the
queue counts is hard there as well.

Regardless, to me both knobs are useful, and the driver would create the
min() of the resources derived from both device limits.
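
A minimal sketch of that min() combination (every name below is
hypothetical, since the exact derivation is device- and driver-specific):

#include <stdint.h>

/*
 * Hypothetical illustration only: cap a function's effective txq budget
 * by both the EQ-derived limit and a possible future explicit max_txqs
 * knob.
 */
static uint32_t example_effective_txqs(uint32_t max_io_eqs,
				       uint32_t max_txqs)
{
	/* assumption: each IO EQ can serve at least one txq */
	uint32_t eq_derived = max_io_eqs;

	return eq_derived < max_txqs ? eq_derived : max_txqs;
}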

* Re: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: Samudrala, Sridhar @ 2024-04-12 22:03 UTC
To: Parav Pandit, David Ahern, netdev@vger.kernel.org, stephen@networkplumber.org
Cc: Jiri Pirko, Shay Drori, Michal Swiatkowski

On 4/12/2024 12:22 AM, Parav Pandit wrote:
>> I can additionally add txq and rxq provisioning knobs too; would that
>> be useful?

Yes, we need knobs for txq and rxq too. An IO EQ looks like a completion
queue. We don't need them for the ice driver at this time, but for our
idpf-based control/switchdev driver we need a way to set the maximum
number of tx queues, rx queues, rx buffer queues, and tx completion
queues.

>> Sridhar,
>> I haven't checked lately how usable this would be for other drivers;
>> will you also implement the txq/rxq callbacks?
>> Please let me know; I can start the work on those additional knobs
>> later next week.

Sure. Our subfunction support for ice is currently under review and we
are defaulting to 1 rx/tx queue for now. These knobs would be required
and useful once we enable more than one queue per SF.

> [...]

* RE: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: Parav Pandit @ 2024-04-13 2:01 UTC
To: Samudrala, Sridhar, David Ahern, netdev@vger.kernel.org, stephen@networkplumber.org
Cc: Jiri Pirko, Shay Drori, Michal Swiatkowski

> From: Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Sent: Saturday, April 13, 2024 3:33 AM
>
> Yes, we need knobs for txq and rxq too. An IO EQ looks like a
> completion queue. We don't need them for the ice driver at this time,
> but for our idpf-based control/switchdev driver we need a way to set
> the maximum number of tx queues, rx queues, rx buffer queues, and tx
> completion queues.

Understood. Makes sense.

> Sure. Our subfunction support for ice is currently under review and we
> are defaulting to 1 rx/tx queue for now. These knobs would be required
> and useful once we enable more than one queue per SF.

Got it. I will start on the kernel-side patches and CC you for review
after completing this iproute2 patch. It would be good if you could help
verify them on your device.

* Re: [PATCH v2 0/2] devlink: Support setting max_io_eqs

From: patchwork-bot+netdevbpf @ 2024-04-13 16:40 UTC
To: Parav Pandit; +Cc: netdev, dsahern, stephen, jiri, shayd

Hello:

This series was applied to iproute2/iproute2-next.git (main)
by David Ahern <dsahern@kernel.org>:

On Wed, 10 Apr 2024 14:58:06 +0300 you wrote:
> Devices send event notifications for the IO queues,
> such as tx and rx queues, through event queues.
>
> Enable a privileged owner, such as a hypervisor PF, to set the number
> of IO event queues for the VF and SF during the provisioning stage.
>
> example:
> Get maximum IO event queues of the VF device::
>
> [...]

Here is the summary with links:
  - [v2,1/2] uapi: Update devlink kernel headers
    (no matching commit)
  - [v2,2/2] devlink: Support setting max_io_eqs
    https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=e8add23c59b7

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html