From: Simon Horman <horms@kernel.org>
To: Brian Vazquez <brianvv@google.com>
Cc: Brian Vazquez <brianvv.kernel@gmail.com>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
Przemek Kitszel <przemyslaw.kitszel@intel.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
intel-wired-lan@lists.osuosl.org,
David Decotigny <decot@google.com>,
Anjali Singhai <anjali.singhai@intel.com>,
Sridhar Samudrala <sridhar.samudrala@intel.com>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
emil.s.tantilov@intel.com, Josh Hay <joshua.a.hay@intel.com>,
Luigi Rizzo <lrizzo@google.com>
Subject: Re: [Intel-wired-lan] [iwl-net PATCH v2] idpf: fix a race in txq wakeup
Date: Thu, 1 May 2025 16:16:16 +0100
Message-ID: <20250501151616.GA3339421@horms.kernel.org>
In-Reply-To: <20250428195532.1590892-1-brianvv@google.com>

On Mon, Apr 28, 2025 at 07:55:32PM +0000, Brian Vazquez wrote:
> Add a helper function to correctly handle the lockless
> synchronization when the sender needs to block. The paradigm is
>
>   if (no_resources()) {
>           stop_queue();
>           barrier();
>           if (!no_resources())
>                   restart_queue();
>   }
>
> netif_subqueue_maybe_stop already handles the paradigm correctly, but
> the code split the check for resources in three parts, the first one
> (descriptors) followed the protocol, but the other two (completions and
> tx_buf) were only doing the first part and so race prone.
>
> Luckily netif_subqueue_maybe_stop macro already allows you to use a
> function to evaluate the start/stop conditions so the fix only requires
> the right helper function to evaluate all the conditions at once.
>
> The patch removes idpf_tx_maybe_stop_common since it's no longer needed
> and instead adjusts separately the conditions for singleq and splitq.
>
> Note that idpf_rx_buf_hw_update doesn't need to check for resources
> since that will be covered in idpf_tx_splitq_frame.

Should the above read idpf_tx_buf_hw_update() rather than
idpf_rx_buf_hw_update()?

If so, I see that this is true when idpf_tx_buf_hw_update() is called from
idpf_tx_singleq_frame(). But is a check required in the case where
idpf_tx_buf_hw_update() is called by idpf_tx_singleq_map()?

>
> To reproduce:
>
> Reduce the threshold for pending completions to increase the chances of
> hitting this pause by changing your kernel:
>
> drivers/net/ethernet/intel/idpf/idpf_txrx.h
>
> -#define IDPF_TX_COMPLQ_OVERFLOW_THRESH(txcq) ((txcq)->desc_count >> 1)
> +#define IDPF_TX_COMPLQ_OVERFLOW_THRESH(txcq) ((txcq)->desc_count >> 4)
>
> Use pktgen to force the host to push small pkts very aggressively:
>
> ./pktgen_sample02_multiqueue.sh -i eth1 -s 100 -6 -d $IP -m $MAC \
> -p 10000-10000 -t 16 -n 0 -v -x -c 64
>
> Fixes: 6818c4d5b3c2 ("idpf: add splitq start_xmit")
> Signed-off-by: Josh Hay <joshua.a.hay@intel.com>
> Signed-off-by: Brian Vazquez <brianvv@google.com>
> Signed-off-by: Luigi Rizzo <lrizzo@google.com>
...
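
As a side note on the reproduction step quoted above, the effect of changing
the shift from 1 to 4 can be worked out with a small sketch. The helper name
demo_complq_thresh() and the queue size of 512 descriptors are assumptions
chosen for illustration only; the real value of desc_count depends on the
device configuration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper with the same shape as the macro: desc_count >> shift. */
static unsigned int demo_complq_thresh(unsigned int desc_count,
				       unsigned int shift)
{
	/* Shifting by the full width of the type or more is undefined in C. */
	assert(shift < sizeof(desc_count) * CHAR_BIT);
	return desc_count >> shift;
}
```

With the assumed 512 descriptors, a shift of 1 gives a threshold of 256
pending completions, while a shift of 4 gives 32, so the stop path would be
reached with one eighth as many outstanding completions.
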