From: Paolo Abeni <pabeni@redhat.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Pirko <jiri@resnulli.us>,
netdev@vger.kernel.org, John Fastabend <john.fastabend@gmail.com>,
Jamal Hadi Salim <jhs@mojatatu.com>,
edumazet@google.com, Madhu Chittim <madhu.chittim@intel.com>,
anthony.l.nguyen@intel.com, Simon Horman <horms@kernel.org>,
Sridhar Samudrala <sridhar.samudrala@intel.com>,
Donald Hunter <donald.hunter@gmail.com>,
intel-wired-lan@lists.osuosl.org, przemyslaw.kitszel@intel.com,
Sunil Kovvuri Goutham <sgoutham@marvell.com>
Subject: Re: [Intel-wired-lan] [PATCH v6 net-next 07/15] net-shapers: implement shaper cleanup on queue deletion
Date: Fri, 6 Sep 2024 16:25:25 +0200
Message-ID: <8ba551da-3626-4505-bdf2-fa617d4ad66b@redhat.com>
In-Reply-To: <20240905182521.2f9f4c1c@kernel.org>
On 9/6/24 03:25, Jakub Kicinski wrote:
> For the driver -- let me flip the question around -- what do you expect
> the locking scheme to be in case of channel count change? Alternatively
> we could just expect the driver to take netdev->lock around the
> appropriate section of code and we'd do:
>
> void net_shaper_set_real_num_tx_queues(struct net_device *dev, ...)
> {
> 	...
> 	if (!READ_ONCE(dev->net_shaper_hierarchy))
> 		return;
>
> 	lockdep_assert_held(&dev->lock);
> 	...
> }
In the IAVF case that will be problematic, as AFAICS the channel reconf
is done by 2 consecutive async tasks: the first one - iavf_reset_task -
changes the actual number of channels, freeing/allocating the driver
resources, and the 2nd one - iavf_finish_config - notifies the stack by
issuing netif_set_real_num_tx_queues(). iavf_reset_task can't easily
wait for iavf_finish_config due to locking order.
> I had a look at iavf, and there is no relevant locking around the queue
> count check at all, so that doesn't help.
Yep, that is racy.
>> Acquiring dev->lock around set_channel() will not be enough: some drivers
>> change the number of channels elsewhere, e.g. when enabling XDP.
>
> Indeed, trying to lock before calling the driver would be both a huge
> job and destined to fail.
>
>> I think/fear we need to replace the dev->lock with the rtnl lock to
>> solve the race for good.
>
> Maybe :( I think we need *an* answer for:
> - how we expect the driver to protect itself (assuming that the racy
> check in iavf_verify_handle() actually serves some purpose, which
> may not be true);
> - how we ensure consistency of core state (no shapers for queues which
> don't exist, assuming we agree having shapers for queues which
> don't exist is counter productive).
I agree we must delete shapers on removed/deleted queues. The
driver/firmware could reuse the same H/W resources for a different VF,
and such a queue must start in the new VF with a default (no shaping)
config.
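
To illustrate the cleanup requirement above, here is a minimal userspace
model. It is not the net-shapers core code: every identifier
(model_dev, model_queue_shaper, model_shaper_set,
model_set_real_num_tx_queues, MODEL_MAX_QUEUES) is hypothetical, and the
real hierarchy is not a flat array. It only shows the invariant that no
shaper survives on a queue that no longer exists, so a re-created queue
starts unshaped.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_QUEUES 8

/* Hypothetical per-queue shaper state; !in_use means "no shaping". */
struct model_queue_shaper {
	bool in_use;
	uint64_t bw_max;	/* illustrative rate limit, arbitrary unit */
};

struct model_dev {
	unsigned int real_num_tx_queues;
	struct model_queue_shaper shapers[MODEL_MAX_QUEUES];
};

/* Attach a shaper to a queue; refuse queues that do not currently exist. */
static int model_shaper_set(struct model_dev *dev, unsigned int qid,
			    uint64_t bw_max)
{
	if (qid >= dev->real_num_tx_queues)
		return -1;

	dev->shapers[qid].in_use = true;
	dev->shapers[qid].bw_max = bw_max;
	return 0;
}

/*
 * Change the queue count and drop the shapers attached to the queues
 * that go away. Returns how many shapers were dropped.
 */
static unsigned int model_set_real_num_tx_queues(struct model_dev *dev,
						 unsigned int txq)
{
	unsigned int dropped = 0;

	if (txq > MODEL_MAX_QUEUES)
		txq = MODEL_MAX_QUEUES;

	/* The loop body only runs when the queue count shrinks. */
	for (unsigned int i = txq; i < dev->real_num_tx_queues; i++) {
		if (dev->shapers[i].in_use)
			dropped++;
		dev->shapers[i] = (struct model_queue_shaper){ 0 };
	}

	dev->real_num_tx_queues = txq;
	return dropped;
}
```

In this model, shrinking from 4 to 2 queues drops any shaper on queues 2
and 3, and growing back to 4 queues finds them with the default config.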
> Reverting back to rtnl_lock for all would be sad, the scheme of
> expecting the driver to take netdev->lock could work?
> It's the model we effectively settled on in devlink.
> Core->driver callbacks are always locked by the core,
> for driver->core calls driver should explicitly take the lock
> (some wrappers for lock+op+unlock are provided).
I think/guess/hope the following could work:

- the driver wraps the h/w resources reconfiguration and
  netif_set_real_num_tx_queues() with dev->lock. In the iavf case, that
  means 2 separate critical sections: in iavf_reset_task() and in
  iavf_finish_config().

- the core, under dev->lock, checks the queue id vs real_num_tx_queues
  and calls the shaper ops

- the iavf shaper callbacks would still need to check the queue id vs
  the currently allocated hw resource number, as the shaper ops could
  run in-between the 2 mentioned critical sections. The iavf driver
  could still act consistently with the core:

  - if real_num_tx_queues < qid < current_allocated_hw_resources
    set the shaper,
  - if current_allocated_hw_resources < qid < real_num_tx_queues do
    nothing and return success

  In both the above scenarios, real_num_tx_queues will be set to
  current_allocated_hw_resources soon by the upcoming
  iavf_finish_config(), the core will update the hierarchy accordingly,
  and the status will be consistent.
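
A rough userspace model of the scheme above, under stated assumptions.
This is neither kernel nor iavf code: all names (model_adapter,
model_reset_task, model_finish_config, model_driver_shaper_set,
model_core_shaper_set) are made up, dev->lock is reduced to a boolean
flag, and the error values are arbitrary. It is one possible reading of
the two critical sections plus the queue id check in the driver
callback.

```c
#include <stdbool.h>

#define MODEL_MAX_QUEUES 16

struct model_adapter {
	bool lock_held;				/* stands in for dev->lock */
	unsigned int real_num_tx_queues;	/* what the core believes */
	unsigned int hw_queues;			/* what the driver allocated */
	unsigned long hw_rate[MODEL_MAX_QUEUES];	/* 0 == no shaping */
};

static void model_lock(struct model_adapter *a)   { a->lock_held = true; }
static void model_unlock(struct model_adapter *a) { a->lock_held = false; }

/* 1st critical section: reconfigure the h/w resources. */
static void model_reset_task(struct model_adapter *a, unsigned int new_hw)
{
	if (new_hw > MODEL_MAX_QUEUES)
		new_hw = MODEL_MAX_QUEUES;

	model_lock(a);
	/* Queues that disappear lose their h/w shaper config. */
	for (unsigned int i = new_hw; i < a->hw_queues; i++)
		a->hw_rate[i] = 0;
	a->hw_queues = new_hw;
	model_unlock(a);
}

/* 2nd critical section: tell the core about the new queue count. */
static void model_finish_config(struct model_adapter *a)
{
	model_lock(a);
	a->real_num_tx_queues = a->hw_queues;
	model_unlock(a);
}

/* Driver callback; the caller (the core) is expected to hold the lock. */
static int model_driver_shaper_set(struct model_adapter *a, unsigned int qid,
				   unsigned long rate)
{
	if (!a->lock_held)
		return -2;	/* models a failed lock assertion */

	if (qid < a->hw_queues)
		a->hw_rate[qid] = rate;
	/* else: the queue is going away, do nothing and report success */
	return 0;
}

/* Core entry point: validate against real_num_tx_queues under the lock. */
static int model_core_shaper_set(struct model_adapter *a, unsigned int qid,
				 unsigned long rate)
{
	int ret;

	model_lock(a);
	if (qid >= a->real_num_tx_queues)
		ret = -1;
	else
		ret = model_driver_shaper_set(a, qid, rate);
	model_unlock(a);
	return ret;
}
```

If model_core_shaper_set() runs between model_reset_task() shrinking
hw_queues and model_finish_config(), a request for a queue that is
going away succeeds without touching the h/w state. Once
model_finish_config() has run, the same request is rejected by the core
check.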
I think the code should be clearer than this description, let me try to
share it ASAP (or please block me soon ;)
Thanks,
Paolo