All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simon Horman <horms@kernel.org>
To: "Hay, Joshua A" <joshua.a.hay@intel.com>
Cc: "intel-wired-lan@lists.osuosl.org"
	<intel-wired-lan@lists.osuosl.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH net v2 0/6] idpf: replace Tx flow scheduling buffer ring with buffer pool
Date: Thu, 24 Jul 2025 10:14:02 +0100	[thread overview]
Message-ID: <20250724091402.GH1150792@horms.kernel.org> (raw)
In-Reply-To: <DM4PR11MB6502F4622507B81F56DFFC65D45EA@DM4PR11MB6502.namprd11.prod.outlook.com>

On Thu, Jul 24, 2025 at 12:03:33AM +0000, Hay, Joshua A wrote:
> > On Thu, Jul 17, 2025 at 05:21:44PM -0700, Joshua Hay wrote:
> > > This series fixes a stability issue in the flow scheduling Tx send/clean
> > > path that results in a Tx timeout.
> > >
> > > The existing guardrails in the Tx path were not sufficient to prevent
> > > the driver from reusing completion tags that were still in flight (held
> > > by the HW).  This collision would cause the driver to erroneously clean
> > > the wrong packet thus leaving the descriptor ring in a bad state.
> > >
> > > The main point of this refactor is to replace the flow scheduling buffer
> > > ring with a large pool/array of buffers.  The completion tag then simply
> > > is the index into this array.  The driver tracks the free tags and pulls
> > > the next free one from a refillq.  The cleaning routines simply use the
> > > completion tag from the completion descriptor to index into the array to
> > > quickly find the buffers to clean.
> > >
> > > All of the code to support the refactor is added first to ensure traffic
> > > still passes with each patch.  The final patch then removes all of the
> > > obsolete stashing code.
> > >
> > > ---
> > > v2:
> > > - Add a new patch "idpf: simplify and fix splitq Tx packet rollback
> > >   error path" that fixes a bug in the error path. It also sets up
> > >   changes in patch 4 that are necessary to prevent a crash when a packet
> > >   rollback occurs using the buffer pool.
> > >
> > > v1:
> > > https://lore.kernel.org/intel-wired-lan/c6444d15-bc20-41a8-9230-
> > 9bb266cb2ac6@molgen.mpg.de/T/#maf9f464c598951ee860e5dd24ef8a45
> > 1a488c5a0
> > >
> > > Joshua Hay (6):
> > >   idpf: add support for Tx refillqs in flow scheduling mode
> > >   idpf: improve when to set RE bit logic
> > >   idpf: simplify and fix splitq Tx packet rollback error path
> > >   idpf: replace flow scheduling buffer ring with buffer pool
> > >   idpf: stop Tx if there are insufficient buffer resources
> > >   idpf: remove obsolete stashing code
> > >
> > >  .../ethernet/intel/idpf/idpf_singleq_txrx.c   |  61 +-
> > >  drivers/net/ethernet/intel/idpf/idpf_txrx.c   | 723 +++++++-----------
> > >  drivers/net/ethernet/intel/idpf/idpf_txrx.h   |  87 +--
> > >  3 files changed, 356 insertions(+), 515 deletions(-)
> > 
> > Hi Joshua, all,
> > 
> > Perhaps it is not followed much anymore, but at least according to [1]
> > patches for stable should not be more than 100 lines, with context.
> > 
> > This patch-set is an order of magnitude larger.
> > Can something be done to create a more minimal fix?
> > 
> > [1] https://docs.kernel.org/process/stable-kernel-rules.html#stable-kernel-
> > rules
> 
> Hi Simon,
> 
> It will be quite difficult to make this series smaller without any negative side effects. Here's a summary of why certain patches are necessary or where deferrals are possible (and the side effects):
> 
> "idpf: add support for Tx refillqs in flow scheduling mode" is required to keep the Tx path lockless. Otherwise, we would have to use locking in hot path to pass the tags freed in the cleaning thread back to whatever data struct the sending thread pulls the tags from.
> 
> Without "idpf: improve when to set RE bit logic", the driver will be much more susceptible to Tx timeouts when sending large packets.
> 
> "idpf: simplify and fix splitq Tx packet rollback error path" is a must to setup for the new buffer chaining. The previous implementation rolled back based on ring indexing, which will crash the system if a dma_mapping_error occurs or we run out of tags (more on that below). The buffer tags (indexes) pulled for a packet are not guaranteed to be contiguous, especially as out-of-order completions are processed. 
> 
> "idpf: stop Tx if there are insufficient buffer resources" could possibly be deferred, but that makes the rollback change above even more critical as we lose an extra layer of protection from running out of tags. It's also one of the smaller patches and won't make a significant difference in terms of size.
> 
> "idpf: remove obsolete stashing code" could also potentially be deferred but is 95% removing obsolete code, which leaves a lot of dead code.

Thanks Joshua,

if there is justification, which is indeed the case,
then I withdraw my suggestion to create a smaller fix.

      reply	other threads:[~2025-07-24  9:14 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-18  0:21 [Intel-wired-lan] [PATCH net v2 0/6] idpf: replace Tx flow scheduling buffer ring with buffer pool Joshua Hay
2025-07-18  0:21 ` [Intel-wired-lan] [PATCH net v2 1/6] idpf: add support for Tx refillqs in flow scheduling mode Joshua Hay
2025-07-21 15:05   ` Loktionov, Aleksandr
2025-07-21 15:05     ` Loktionov, Aleksandr
2025-07-18  0:21 ` [Intel-wired-lan] [PATCH net v2 2/6] idpf: improve when to set RE bit logic Joshua Hay
2025-07-18  0:21 ` [Intel-wired-lan] [PATCH net v2 3/6] idpf: simplify and fix splitq Tx packet rollback error path Joshua Hay
2025-07-18  0:21 ` [Intel-wired-lan] [PATCH net v2 4/6] idpf: replace flow scheduling buffer ring with buffer pool Joshua Hay
2025-07-24  9:20   ` Simon Horman
2025-07-25 17:15     ` Hay, Joshua A
2025-07-25 17:15       ` Hay, Joshua A
2025-07-18  0:21 ` [Intel-wired-lan] [PATCH net v2 5/6] idpf: stop Tx if there are insufficient buffer resources Joshua Hay
2025-07-18  0:21 ` [Intel-wired-lan] [PATCH net v2 6/6] idpf: remove obsolete stashing code Joshua Hay
2025-07-23 15:08 ` [Intel-wired-lan] [PATCH net v2 0/6] idpf: replace Tx flow scheduling buffer ring with buffer pool Simon Horman
2025-07-24  0:03   ` Hay, Joshua A
2025-07-24  0:03     ` Hay, Joshua A
2025-07-24  9:14     ` Simon Horman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250724091402.GH1150792@horms.kernel.org \
    --to=horms@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=joshua.a.hay@intel.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.