From mboxrd@z Thu Jan 1 00:00:00 1970
From: Florian Fainelli
Subject: Re: pegged softirq and NAPI race (?)
Date: Tue, 18 Sep 2018 14:25:38 -0700
Message-ID: <09e7e3f8-dee8-71ea-7e57-4d0c92dcf13b@gmail.com>
References: <0FD562CC-CDE9-43C8-9623-B42AC7A208C8@fb.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
 michael.chan@broadcom.com, kernel-team
To: Eric Dumazet, songliubraving@fb.com
Return-path:
Received: from mail-wm1-f67.google.com ([209.85.128.67]:53720 "EHLO
 mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1726821AbeISDAS (ORCPT ); Tue, 18 Sep 2018 23:00:18 -0400
Received: by mail-wm1-f67.google.com with SMTP id b19-v6so4019692wme.3
 for ; Tue, 18 Sep 2018 14:25:48 -0700 (PDT)
In-Reply-To:
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 09/18/2018 02:13 PM, Eric Dumazet wrote:
> On Tue, Sep 18, 2018 at 1:37 PM Song Liu wrote:
>>
>
>> Looks like a patch like the following fixes the issue for ixgbe. But I
>> cannot explain it yet.
>>
>> Does this ring a bell?
>
> I dunno, it looks like the NIC is generating an interrupt while it should not,
> and constantly sets NAPI_STATE_MISSED.
>
> Or maybe we need to properly check napi_complete_done() return value.
>
> diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> index d3e72d0f66ef428b08e4bd88508e05b734bc43a4..c4c565c982a98a5891603cedcdcb72dc1c401813
> 100644
> --- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> +++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> @@ -1773,8 +1773,8 @@ ixgb_clean(struct napi_struct *napi, int budget)
>         ixgb_clean_rx_irq(adapter, &work_done, budget);
>
>         /* If budget not fully consumed, exit the polling mode */
> -       if (work_done < budget) {
> -               napi_complete_done(napi, work_done);
> +       if (work_done < budget &&
> +           napi_complete_done(napi, work_done)) {
>                 if (!test_bit(__IXGB_DOWN, &adapter->flags))
>                         ixgb_irq_enable(adapter);
>         }

This would not be the only driver doing this unfortunately... should we
add a __must_check annotation to help catch those (mis)uses? Though that
could cause false positives for drivers using NAPI to clean their TX ring.
-- 
Florian
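
To make the __must_check idea above concrete, here is a minimal userspace
sketch, not kernel code. Everything prefixed mock_ is hypothetical and only
imitates the contract discussed in this thread: the completion helper returns
false when an event was missed and polling has to continue, and true when the
driver may re-enable its interrupt. The attribute is assumed to be the
GCC/Clang warn_unused_result attribute, which is what makes the compiler warn
when a caller drops the return value.

```c
/* Userspace mock, NOT kernel code: every mock_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Assumption: a __must_check-style annotation built on warn_unused_result. */
#define mock_must_check __attribute__((warn_unused_result))

struct mock_napi {
	bool missed;      /* stands in for NAPI_STATE_MISSED */
	int irq_enables;  /* counts how often the "driver" re-armed its IRQ */
};

/*
 * Imitates the contract of napi_complete_done(): returns false when an
 * event was missed and polling must continue, true when the caller may
 * re-enable interrupts.
 */
static mock_must_check bool
mock_napi_complete_done(struct mock_napi *n, int work_done)
{
	(void)work_done;
	if (n->missed) {
		n->missed = false; /* the real core would reschedule the poll */
		return false;
	}
	return true;
}

/* Poll routine following the pattern from the quoted patch. */
static int mock_poll(struct mock_napi *n, int work_done, int budget)
{
	assert(work_done <= budget);

	/* Re-arm the interrupt only if completion really succeeded. */
	if (work_done < budget && mock_napi_complete_done(n, work_done))
		n->irq_enables++; /* stands in for ixgb_irq_enable() */

	return work_done;
}
```

With the annotation in place, a caller written like the pre-patch code, which
calls the helper as a bare statement and then enables the interrupt
unconditionally, would draw an unused-result warning at build time.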