netdev.vger.kernel.org archive mirror
From: David Woodhouse <dwmw2@infradead.org>
To: Francois Romieu <romieu@fr.zoreil.com>
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH 1/7] 8139cp: Improve accuracy of cp_interrupt() return, to survive IRQ storms
Date: Mon, 21 Sep 2015 21:52:30 +0100
Message-ID: <1442868750.7367.70.camel@infradead.org>
In-Reply-To: <20150921202541.GA4128@electric-eye.fr.zoreil.com>

On Mon, 2015-09-21 at 22:25 +0200, Francois Romieu wrote:
> David Woodhouse <dwmw2@infradead.org> :
> > From: David Woodhouse <David.Woodhouse@intel.com>
> > 
> > The TX timeout handling has been observed to trigger RX IRQ storms. And
> > since cp_interrupt() just keeps saying that it handled the interrupt,
> > the machine then dies. Fix the return value from cp_interrupt(), and
> > the offending IRQ gets disabled and the machine survives.
> 
> I am not fond of the way it dissociates the hardware status word and the
> software "handled" variable.

Oh, I like that part very much :)

Returning a 'handled' status only when you've actually *done* something
that you expect to mitigate the interrupt is a useful way of protecting
against both hardware misbehaviour and software bugs.
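
To make that concrete, here is a minimal userspace sketch of the idea,
not the actual 8139cp code. The names fake_nic, fake_interrupt and the
EV_* bits are hypothetical, and the register layout is invented for
illustration. The handler reports IRQ_HANDLED only when it took an
action that should quiet the interrupt source:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits; NOT the real 8139cp status register layout. */
enum { EV_RX = 1u << 0, EV_TX = 1u << 1, EV_LINK = 1u << 2 };

/* Userspace stand-in for the kernel's irqreturn_t. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

struct fake_nic {
	uint16_t status;      /* pending event bits, as the hardware reports them */
	bool poll_scheduled;  /* true while the poll routine owns RX processing */
	unsigned tx_done;     /* number of TX completion passes performed */
};

static irqreturn_t fake_interrupt(struct fake_nic *nic)
{
	uint16_t status = nic->status;
	bool handled = false;

	if (!status)
		return IRQ_NONE;  /* nothing pending: not our interrupt */

	if (status & EV_RX) {
		if (!nic->poll_scheduled) {
			/* Hand RX over to the poll routine (RX would be masked here). */
			nic->poll_scheduled = true;
			handled = true;
		}
		/*
		 * Otherwise RX fired although polling already owns it. We did
		 * nothing to mitigate it, so we do not claim to have handled it.
		 */
	}

	if (status & EV_TX) {
		nic->tx_done++;   /* stand-in for reclaiming TX descriptors */
		handled = true;
	}

	if (status & EV_LINK)
		handled = true;   /* stand-in for a link state check */

	return handled ? IRQ_HANDLED : IRQ_NONE;
}
```

With this shape, an RX event that keeps firing while polling is already
scheduled yields IRQ_NONE on every call, which is what lets the core IRQ
code notice that something is wrong.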

> What you are describing - RX IRQ storms - looks like a problem between
> the irq and poll handlers. That's where I expect the problem to be solved.

I already fixed that, in the next patch in the series. But the failure
mode *should* have been 'IRQ disabled' and the device continuing to
work via polling. Not a complete death of the machine. That's the
difference this patch makes.
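
A rough model of why the return value matters, loosely based on how the
kernel accounts for unhandled interrupts. This is a simplified sketch:
irq_line, note_result and storm_gets_disabled are hypothetical names,
and the window and threshold values are assumptions for this model
rather than something to rely on:

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's irqreturn_t. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

/* Assumed values for this model only. */
#define WINDOW         100000u  /* interrupts per accounting window */
#define MAX_UNHANDLED   99900u  /* more unhandled than this: disable the line */

struct irq_line {
	unsigned count;      /* interrupts seen in the current window */
	unsigned unhandled;  /* of those, how many returned IRQ_NONE */
	bool disabled;       /* set once the line is judged to be stuck */
};

/* Record one handler result and disable the line if it looks stuck. */
static void note_result(struct irq_line *line, irqreturn_t ret)
{
	if (line->disabled)
		return;

	if (ret == IRQ_NONE)
		line->unhandled++;

	if (++line->count < WINDOW)
		return;

	if (line->unhandled > MAX_UNHANDLED)
		line->disabled = true;

	/* Start a fresh accounting window. */
	line->count = 0;
	line->unhandled = 0;
}

/* Simulate a storm in which the handler always returns the same value. */
static bool storm_gets_disabled(irqreturn_t always)
{
	struct irq_line line = {0};

	for (unsigned i = 0; i < 2 * WINDOW && !line.disabled; i++)
		note_result(&line, always);

	return line.disabled;
}
```

In this model a handler that always claims IRQ_HANDLED never trips the
check, so the storm continues; one that honestly returns IRQ_NONE gets
the line disabled after the first window.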

> Sprinkling "handled" operations does not make me terribly comfortable,
> especially as I'd happily trade the old-style part irq, part napi
> processing for a plain napi processing (I can get over it though :o) ).

The existing cp_rx_poll() function mostly runs without taking cp->lock.
But cp_tx() *does* need cp->lock for the tx_head/tx_tail manipulations.
I'm guessing that's why it's still being called directly from the hard
IRQ handler? I suppose we could probably find a way to move that out.

But I was mostly just trying to fix stuff that was actually broken...
:)

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation



Thread overview: 52+ messages
2015-09-17 23:19 [PATCH 1/2] 8139cp: Use dev_kfree_skb_any() instead of dev_kfree_skb() in cp_clean_rings() David Woodhouse
2015-09-17 23:21 ` [PATCH 2/2] 8139cp: Call __cp_set_rx_mode() from cp_tx_timeout() David Woodhouse
2015-09-18 11:37   ` [PATCH 3/2] 8139cp: Improve accuracy of cp_interrupt() return, to survive IRQ storms David Woodhouse
2015-09-18 12:17   ` [PATCH 4/2] 8139cp: Do not re-enable RX interrupts in cp_tx_timeout() David Woodhouse
2015-09-21  5:24   ` [PATCH 2/2] 8139cp: Call __cp_set_rx_mode() from cp_tx_timeout() David Miller
2015-09-21 13:59     ` David Woodhouse
2015-09-21 14:01       ` [PATCH 1/7] 8139cp: Improve accuracy of cp_interrupt() return, to survive IRQ storms David Woodhouse
2015-09-21 20:25         ` Francois Romieu
2015-09-21 20:52           ` David Woodhouse [this message]
2015-09-22 23:45         ` David Miller
2015-09-23  8:14           ` David Woodhouse
2015-09-23  8:43             ` [PATCH 1/7] 8139cp: Do not re-enable RX interrupts in cp_tx_timeout() David Woodhouse
2015-09-23  8:44             ` [PATCH 2/7] 8139cp: Fix tx_queued debug message to print correct slot numbers David Woodhouse
2015-09-23  8:44             ` [PATCH 3/7] 8139cp: Fix TSO/scatter-gather descriptor setup David Woodhouse
2015-09-23  8:44             ` [PATCH 4/7] 8139cp: Reduce duplicate csum/tso code in cp_start_xmit() David Woodhouse
2015-09-23  8:45             ` [PATCH 5/7] 8139cp: Fix DMA unmapping of transmitted buffers David Woodhouse
2015-09-23  8:45             ` [PATCH 6/7] 8139cp: Dump contents of descriptor ring on TX timeout David Woodhouse
2015-09-23  8:46             ` [PATCH 7/7] 8139cp: Enable offload features by default David Woodhouse
2015-09-23 17:58             ` [PATCH 1/7] 8139cp: Improve accuracy of cp_interrupt() return, to survive IRQ storms David Miller
2015-09-23 19:45               ` David Woodhouse
2015-09-23 21:48                 ` David Miller
2015-09-23 22:00                   ` David Woodhouse
2015-09-23 23:29                     ` Francois Romieu
2015-09-24  8:58                       ` [PATCH WTF] 8139cp: Fix GSO MSS handling David Woodhouse
2015-09-24 10:38                         ` [PATCH v2 RFC] " David Woodhouse
2015-09-24 12:05                           ` Eric Dumazet
2015-09-24 12:31                             ` David Woodhouse
2015-09-28  5:37                               ` Tom Herbert
2015-09-28  7:21                                 ` David Woodhouse
2015-09-27  5:38                           ` David Miller
2015-09-23 22:02                   ` [PATCH] 8139cp: Set GSO max size and enforce it David Woodhouse
2015-09-23 22:44                 ` [PATCH 1/7] 8139cp: Improve accuracy of cp_interrupt() return, to survive IRQ storms Francois Romieu
2015-09-23 23:09                   ` David Woodhouse
2015-10-28  8:47               ` David Woodhouse
2015-09-23 22:44             ` Francois Romieu
2015-09-23 23:18               ` David Woodhouse
2015-09-21 14:02       ` [PATCH 2/7] 8139cp: Do not re-enable RX interrupts in cp_tx_timeout() David Woodhouse
2015-09-22 23:46         ` David Miller
2015-09-21 14:02       ` [PATCH 3/7] 8139cp: Fix tx_queued debug message to print correct slot numbers David Woodhouse
2015-09-21 14:02       ` [PATCH 4/7] 8139cp: Fix TSO/scatter-gather descriptor setup David Woodhouse
2015-09-21 21:01         ` Francois Romieu
2015-09-21 21:06           ` David Woodhouse
2015-09-21 21:47           ` David Woodhouse
2015-09-22 21:59             ` Francois Romieu
2015-09-21 14:03       ` [PATCH 5/7] 8139cp: Fix DMA unmapping of transmitted buffers David Woodhouse
2015-09-21 14:03       ` [PATCH 6/7] 8139cp: Dump contents of descriptor ring on TX timeout David Woodhouse
2015-09-21 14:05       ` [PATCH 7/7] 8139cp: Avoid gratuitous writes to TxPoll register when already running David Woodhouse
2015-09-21 20:54         ` Francois Romieu
2015-09-21 21:10           ` David Woodhouse
2015-09-21 14:11       ` [PATCH 2/2] 8139cp: Call __cp_set_rx_mode() from cp_tx_timeout() David Woodhouse
2015-09-21  5:24 ` [PATCH 1/2] 8139cp: Use dev_kfree_skb_any() instead of dev_kfree_skb() in cp_clean_rings() David Miller
     [not found] <1443078810.74600.45.camel@infradead.org>
2015-09-24  7:25 ` [PATCH 1/7] 8139cp: Improve accuracy of cp_interrupt() return, to survive IRQ storms Francois Romieu
