netdev.vger.kernel.org archive mirror
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: "Maurice Baijens (Ellips B.V.)" <maurice.baijens@ellips.com>
Cc: "intel-wired-lan@lists.osuosl.org"
	<intel-wired-lan@lists.osuosl.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [External] ixgbe driver link down causes 100% load in ksoftirqd/x
Date: Mon, 31 Jan 2022 13:54:31 +0100
Message-ID: <YffcB2YZ1h5SRyEP@boxer>
In-Reply-To: <VI1PR02MB41424341E3E7BA3166E043BD88229@VI1PR02MB4142.eurprd02.prod.outlook.com>

On Fri, Jan 28, 2022 at 03:53:25PM +0000, Maurice Baijens (Ellips B.V.) wrote:
> Hello,
> 
> 
> > -----Original Message-----
> > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com> 
> > Sent: Friday, January 28, 2022 4:31 PM
> > To: Maurice Baijens (Ellips B.V.) <maurice.baijens@ellips.com>
> > Cc: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org
> > Subject: Re: [External] ixgbe driver link down causes 100% load in ksoftirqd/x
> >
> > On Thu, Jan 20, 2022 at 09:23:06AM +0000, Maurice Baijens (Ellips B.V.) wrote:
> > > Hello,
> > > 
> > > 
> > > I have an issue with the ixgbe driver and X550Tx network adapter.
> > > When I disconnect the network cable I end up with 100% load in ksoftirqd/x. I am running the adapter in
> > > xdp mode (XDP_FLAGS_DRV_MODE). Problem seen in linux kernel 5.15.x and also 5.16.0+ (head).
> >
> > Hello,
> >
> > a stupid question - why do you disconnect the cable when running traffic? :)
> 
> The answer is even more stupid. Due to supply problems we sometimes have to use
> dual adapters instead of single ones, and if one accidentally enables the wrong port,
> the bug is triggered.
> 
> > If you plug this back in then what happens?
> 
> Then everything works normally again.
> 
> >
> > > 
> > > I traced the problem down to function ixgbe_xmit_zc in ixgbe_xsk.c:
> > > 
> > > if (unlikely(!ixgbe_desc_unused(xdp_ring)) ||
> > >     !netif_carrier_ok(xdp_ring->netdev)) {
> > >             work_done = false;
> > >             break;
> > > }
> >
> > This was done in commit c685c69fba71 ("ixgbe: don't do any AF_XDP
> > zero-copy transmit if netif is not OK") - it was addressing the transient
> > state when configuring the xsk pool on particular queue pair.
> >
> > > 
> > > This function is called from ixgbe_poll() function via ixgbe_clean_xdp_tx_irq(). It sets
> > > work_done to false if netif_carrier_ok() returns false (so if link is down). Because work_done
> > > is always false, ixgbe_poll keeps on polling forever.
> > > 
> > > I made a fix by checking link in ixgbe_poll() function and if no link exiting polling mode:
> > > 
> > > /* If all work not completed, return budget and keep polling */
> > > if ((!clean_complete) && netif_carrier_ok(adapter->netdev))
> > >             return budget;
> >
> > Not sure about the correctness of this. Question is how should we act for
> > link down - should we say that we are done with processing or should we
> > wait until the link gets back?
> >
> > Instead of setting the work_done to false immediately for
> > > !netif_carrier_ok(), I'd rather break out the checks that are currently
> > combined into the single statement, something like this:
> >
> > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> > index b3fd8e5cd85b..6a5e9cf6b5da 100644
> > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> > @@ -390,12 +390,14 @@ static bool ixgbe_xmit_zc(struct ixgbe_ring *xdp_ring, unsigned int budget)
> >  	u32 cmd_type;
> >  
> >  	while (budget-- > 0) {
> > -		if (unlikely(!ixgbe_desc_unused(xdp_ring)) ||
> > -		    !netif_carrier_ok(xdp_ring->netdev)) {
> > +		if (unlikely(!ixgbe_desc_unused(xdp_ring))) {
> >  			work_done = false;
> >  			break;
> >  		}
> >  
> > +		if (!netif_carrier_ok(xdp_ring->netdev))
> > +			break;
> > +
> >  		if (!xsk_tx_peek_desc(pool, &desc))
> >  			break;
> >
> >
> > > 
> > > This is probably fine for our application as we only run in xdpdrv mode, however I am not sure this
> >
> > By xdpdrv I would understand that you're running XDP in standard native
> > mode, however you refer to the AF_XDP Zero Copy implementation in the
> > driver. But I don't think it changes anything in this thread.
> >
> > > In the end I see some outstanding issues with ixgbe_xmit_zc(), so this
> > > probably needs some attention.
> >
> > Thanks!
> > Maciej
> 
> Your suggestion for a fix sounds ok. (I have not tested it). Is someone going to fix it in the next version of the kernel,
> so we don't have to apply a patch here forever? Or how should we proceed to get it fixed in the kernel?

Could you test it then? If it's fine then I'll send it as a fix. I just
don't currently have ixgbe HW around me.

> 
> Thank you,
> Maurice
> 
> 
> >
> > > is the correct way to fix this issue and the behaviour of the normal skb mode operation is 
> > > also affected by my fix.
> > > 
> > > So hopefully my observations are correct and someone here can fix the issue and push it upstream.
> > > 
> > > 
> > > Best regards,
> > > 	Maurice Baijens
> 
> 
> 
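
The busy-poll behaviour discussed in this thread can be illustrated with a
minimal userspace sketch in C. This is not kernel code. It assumes a
simplified NAPI-like contract in which the poller calls the transmit routine
again whenever that routine reports unfinished work. The names struct
fake_ring, xmit_zc_combined(), xmit_zc_split() and count_polls() are
hypothetical and exist only for this illustration; the real ixgbe_xmit_zc()
does considerably more than this model.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the transmit loop inspects. */
struct fake_ring {
	unsigned int unused_desc;	/* free TX descriptors */
	unsigned int pending;		/* frames queued for transmit */
	bool carrier_ok;		/* link state */
};

/* Combined check: link down is reported as "work not done". */
static bool xmit_zc_combined(struct fake_ring *r, unsigned int budget)
{
	bool work_done = true;

	while (budget-- > 0) {
		if (!r->unused_desc || !r->carrier_ok) {
			work_done = false;
			break;
		}
		if (!r->pending)
			break;
		r->pending--;
		r->unused_desc--;
	}
	return work_done;
}

/* Split check: only a full ring asks for another poll. */
static bool xmit_zc_split(struct fake_ring *r, unsigned int budget)
{
	bool work_done = true;

	while (budget-- > 0) {
		if (!r->unused_desc) {
			work_done = false;
			break;
		}
		if (!r->carrier_ok)
			break;	/* link down: stop, but do not request a re-poll */
		if (!r->pending)
			break;
		r->pending--;
		r->unused_desc--;
	}
	return work_done;
}

/*
 * Simplified poller: keep calling xmit() until it reports completion,
 * giving up after max_polls calls. Returns the number of calls made.
 */
static unsigned int count_polls(bool (*xmit)(struct fake_ring *, unsigned int),
				struct fake_ring *r, unsigned int budget,
				unsigned int max_polls)
{
	unsigned int polls = 0;

	while (polls < max_polls) {
		polls++;
		if (xmit(r, budget))
			break;	/* complete: a real poller would re-arm interrupts */
	}
	return polls;
}
```

With carrier_ok set to false, count_polls() with xmit_zc_combined() runs until
it hits max_polls, which corresponds to the 100% ksoftirqd load reported above,
while xmit_zc_split() stops after a single call. The sketch only models the
control flow; it says nothing about whether the proposed patch behaves
correctly on real hardware, which the thread notes was still untested.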

Thread overview: 5+ messages
2022-01-20  9:23 ixgbe driver link down causes 100% load in ksoftirqd/x Maurice Baijens (Ellips B.V.)
2022-01-28 15:31 ` Maciej Fijalkowski
2022-01-28 15:53   ` [External] " Maurice Baijens (Ellips B.V.)
2022-01-31 12:54     ` Maciej Fijalkowski [this message]
2022-01-31 16:46       ` Maurice Baijens (Ellips B.V.)
