netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Heng Qi <hengqi@linux.alibaba.com>
Cc: netdev@vger.kernel.org, virtualization@lists.linux.dev,
	"David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Jason Wang <jasowang@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Brett Creeley <bcreeley@amd.com>,
	Ratheesh Kannoth <rkannoth@marvell.com>,
	Alexander Lobakin <aleksander.lobakin@intel.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Tal Gilboa <talgi@nvidia.com>, Jonathan Corbet <corbet@lwn.net>,
	linux-doc@vger.kernel.org,
	Maxime Chevallier <maxime.chevallier@bootlin.com>,
	Jiri Pirko <jiri@resnulli.us>,
	Paul Greenwalt <paul.greenwalt@intel.com>,
	Ahmed Zaki <ahmed.zaki@intel.com>,
	Vladimir Oltean <vladimir.oltean@nxp.com>,
	Kory Maincent <kory.maincent@bootlin.com>,
	Andrew Lunn <andrew@lunn.ch>,
	"justinstitt@google.com" <justinstitt@google.com>
Subject: Re: [PATCH net-next v9 2/4] ethtool: provide customized dim profile management
Date: Mon, 22 Apr 2024 11:39:43 -0700	[thread overview]
Message-ID: <20240422113943.736861fc@kernel.org> (raw)
In-Reply-To: <96b59800-85e6-4a9e-ad9b-7ad3fa56fff4@linux.alibaba.com>

On Mon, 22 Apr 2024 17:00:25 +0800 Heng Qi wrote:
> 在 2024/4/19 上午8:48, Jakub Kicinski 写道:
> > On Wed, 17 Apr 2024 23:55:44 +0800 Heng Qi wrote:  
> >> $ ethtool -c ethx
> >> ...
> >> rx-eqe-profile:
> >> {.usec =   1, .pkts = 256, .comps =   0,},
> >> {.usec =   8, .pkts = 256, .comps =   0,},
> >> {.usec =  64, .pkts = 256, .comps =   0,},
> >> {.usec = 128, .pkts = 256, .comps =   0,},
> >> {.usec = 256, .pkts = 256, .comps =   0,}
> >> rx-cqe-profile:   n/a
> >> tx-eqe-profile:   n/a
> >> tx-cqe-profile:   n/a  
> > I don't think that exposing CQE vs EQE mode makes much sense here.
> > We enable the right mode via:
> >
> > struct kernel_ethtool_coalesce {
> > 	u8 use_cqe_mode_tx;
> > 	u8 use_cqe_mode_rx;  
> 
> I think it is reasonable to use 4 types:
> 
> dim lib itself is designed with 4 independent arrays, which are queried 
> by dim->mode and
> dim->profile_ix. This way allows users to manually modify the cqe mode 
> of tx or rx, and the
> dim interface can automatically match the profile of the corresponding 
> mode when obtaining
> the results.

I'm guessing that DIM has 4 profiles because it was easier to implement
for profiles hardcoded by DIM itself(!) Even for driver-declared
profiles the distinction is a hack.

> Merely exposing rx_profile and tx_profile does not seem to make the 
> interface and code clearer.
> 
> > the user needs to set the packets and usecs in an educated way
> > (matching the mode).  
> 
> I think this is acceptable. In addition to the above reasons, users can 
> already set the cqe
> mode of tx and rx in user mode, which means that they have been educated.
> 
> > The kernel never changes the mode.  
> 
> Sorry, I don't seem to understand what this means.

Kernel never changes the mode for its own reasons.
Only user can change the mode.
So we don't have to always have both CQE and EQE mode tables ready.
If the kernel changed the mode for some reason we'd have to get both
from the user, in case kernel wanted to switch.

> >>   /**
> >>    * register_netdevice() - register a network device
> >>    * @dev: device to register
> >> @@ -10258,6 +10310,10 @@ int register_netdevice(struct net_device *dev)
> >>   	if (ret)
> >>   		return ret;
> >>   
> >> +	ret = dev_dim_profile_init(dev);
> >> +	if (ret)
> >> +		return ret;  
> > This is fine but the driver still has to manually do bunch of init:
> >
> > 		INIT_WORK(&vi->rq[i].dim.work, virtnet_rx_dim_work);
> > 		vi->rq[i].dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_EQE;
> >
> > It'd be better to collect all this static info (flags, mode, work func)
> > in one place / struct, attached to netdev or netdev_ops. Then the
> > driver can call a helper like which only needs to take netdev and dim
> > as arguments.  
> 
> If mode is put into netdev, then use_cqe_mode_rx and use_cqe_mode_tx 
> need to be moved into netdev too.
> Then dim->mode is no longer required when dim obtain its results.
> We need to transform the way all drivers with dim currently behave.

Hopefully that won't be too hard.

> But why should work be put into netdev?
> The driver still requires a for loop to initialize dim work for a 
> driver-specific queues.

It's easier to refactor and extend the API when there's an explicit
call done to the core by the driver, rather than driver poking values
into random fields of the core structure.

> >> + * coalesce_put_profile - fill reply with a nla nest with four child nla nests.
> >> + * @skb: socket buffer the message is stored in
> >> + * @attr_type: nest attr type ETHTOOL_A_COALESCE_*X_*QE_PROFILE
> >> + * @profile: data passed to userspace
> >> + * @supported_params: modifiable parameters supported by the driver
> >> + *
> >> + * Put a dim profile nest attribute. Refer to ETHTOOL_A_MODERATIONS_MODERATION.
> >> + *
> >> + * Return: false to indicate successful placement or no placement, and
> >> + * true to pass the -EMSGSIZE error to the wrapper.  
> > Why the bool? Doesn't most of the similar code return the error?  
> 
> Its wrapper function ethnl_default_doit only recognizes the EMSGSIZE code.

Not sure what you mean.

> >> @@ -227,7 +315,19 @@ const struct nla_policy ethnl_coalesce_set_policy[] = {
> >>   	[ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES] = { .type = NLA_U32 },
> >>   	[ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES] = { .type = NLA_U32 },
> >>   	[ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS] = { .type = NLA_U32 },
> >> +	[ETHTOOL_A_COALESCE_RX_EQE_PROFILE]     = { .type = NLA_NESTED },
> >> +	[ETHTOOL_A_COALESCE_RX_CQE_PROFILE]     = { .type = NLA_NESTED },
> >> +	[ETHTOOL_A_COALESCE_TX_EQE_PROFILE]     = { .type = NLA_NESTED },
> >> +	[ETHTOOL_A_COALESCE_TX_CQE_PROFILE]     = { .type = NLA_NESTED },  
> > NLA_POLICY_NESTED(coalesce_set_profile_policy)  
> 
> This doesn't work because the first level of sub-nesting of
> ETHTOOL_A_COALESCE_RX_CQE_PROFILE is ETHTOOL_A_PROFILE_IRQ_MODERATION.

So declare a policy for IRQ_MODERATION which has one entry -> nested
profile policy?

  reply	other threads:[~2024-04-22 18:39 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-17 15:55 [PATCH net-next v9 0/4] ethtool: provide the dim profile fine-tuning channel Heng Qi
2024-04-17 15:55 ` [PATCH net-next v9 1/4] linux/dim: move useful macros to .h file Heng Qi
2024-04-17 15:55 ` [PATCH net-next v9 2/4] ethtool: provide customized dim profile management Heng Qi
2024-04-19  0:48   ` Jakub Kicinski
2024-04-22  9:00     ` Heng Qi
2024-04-22 18:39       ` Jakub Kicinski [this message]
2024-04-24 13:10         ` Heng Qi
2024-04-24 15:52           ` Jakub Kicinski
2024-04-24 13:41       ` Heng Qi
2024-04-24 16:18         ` Jakub Kicinski
2024-04-24 16:49           ` Heng Qi
2024-04-24 23:59             ` Jakub Kicinski
2024-04-21 15:53   ` Brett Creeley
2024-04-22  9:04     ` Heng Qi
2024-04-17 15:55 ` [PATCH net-next v9 3/4] virtio-net: refactor dim initialization/destruction Heng Qi
2024-04-17 15:55 ` [PATCH net-next v9 4/4] virtio-net: support dim profile fine-tuning Heng Qi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240422113943.736861fc@kernel.org \
    --to=kuba@kernel.org \
    --cc=ahmed.zaki@intel.com \
    --cc=aleksander.lobakin@intel.com \
    --cc=andrew@lunn.ch \
    --cc=bcreeley@amd.com \
    --cc=corbet@lwn.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hengqi@linux.alibaba.com \
    --cc=jasowang@redhat.com \
    --cc=jiri@resnulli.us \
    --cc=justinstitt@google.com \
    --cc=kory.maincent@bootlin.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=maxime.chevallier@bootlin.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=paul.greenwalt@intel.com \
    --cc=rkannoth@marvell.com \
    --cc=talgi@nvidia.com \
    --cc=virtualization@lists.linux.dev \
    --cc=vladimir.oltean@nxp.com \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).