From: David Woodhouse <dwmw2@infradead.org>
To: Francois Romieu <romieu@fr.zoreil.com>
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH 1/7] 8139cp: Improve accuracy of cp_interrupt() return, to survive IRQ storms
Date: Mon, 21 Sep 2015 21:52:30 +0100
Message-ID: <1442868750.7367.70.camel@infradead.org>
In-Reply-To: <20150921202541.GA4128@electric-eye.fr.zoreil.com>
On Mon, 2015-09-21 at 22:25 +0200, Francois Romieu wrote:
> David Woodhouse <dwmw2@infradead.org> :
> > From: David Woodhouse <David.Woodhouse@intel.com>
> >
> > The TX timeout handling has been observed to trigger RX IRQ storms. And
> > since cp_interrupt() just keeps saying that it handled the interrupt,
> > the machine then dies. Fix the return value from cp_interrupt(), and
> > the offending IRQ gets disabled and the machine survives.
>
> I am not fond of the way it dissociates the hardware status word and the
> software "handled" variable.
Oh, I like that part very much :)

The practice of returning a 'handled' status only when you've actually
*done* something you expect to mitigate the interrupt is a useful way
of protecting against both hardware misbehaviour and software bugs.
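
To make that concrete, here is a rough userspace sketch of the pattern. It
is not the code from the patch: the status bits, the demo_dev structure and
the demo_* names are all made up for illustration, and the real handler
reads the status from hardware and takes cp->lock. The only point is that
'handled' is set where an action is actually taken, not derived from the
status word alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits; NOT the real 8139C+ register layout. */
enum {
	DEMO_RX_OK = 1u << 0,
	DEMO_TX_OK = 1u << 1,
	DEMO_LINK  = 1u << 2,
};

/* Stand-in for the kernel's irqreturn_t (IRQ_NONE / IRQ_HANDLED). */
typedef enum { DEMO_IRQ_NONE = 0, DEMO_IRQ_HANDLED = 1 } demo_irqreturn_t;

/* Hypothetical device state, just enough to observe what was done. */
struct demo_dev {
	bool napi_can_schedule;	/* false once polling is already scheduled */
	unsigned rx_scheduled;
	unsigned tx_cleaned;
	unsigned link_checked;
};

demo_irqreturn_t demo_interrupt(struct demo_dev *dev, uint16_t status)
{
	bool handled = false;

	/* Nothing pending, or the device has gone away: not ours. */
	if (status == 0 || status == 0xFFFF)
		return DEMO_IRQ_NONE;

	if (status & DEMO_RX_OK) {
		/* Only claim the interrupt if we really scheduled the poll. */
		if (dev->napi_can_schedule) {
			dev->napi_can_schedule = false;
			dev->rx_scheduled++;
			handled = true;
		}
	}
	if (status & DEMO_TX_OK) {
		dev->tx_cleaned++;
		handled = true;
	}
	if (status & DEMO_LINK) {
		dev->link_checked++;
		handled = true;
	}

	return handled ? DEMO_IRQ_HANDLED : DEMO_IRQ_NONE;
}
```

With this shape, an RX status bit that keeps firing while the poll is
already scheduled is reported as not handled, which is what lets the core
notice that something is wrong.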
> What you are describing - RX IRQ storms - looks like a problem between
> the irq and poll handlers. That's where I expect the problem to be solved.
I already fixed that, in the next patch in the series. But the failure
mode *should* have been 'IRQ disabled' and the device continuing to
work via polling. Not a complete death of the machine. That's the
difference this patch makes.
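
For illustration, a simplified userspace model of why the return value
matters during a storm. The thresholds and the demo_* names are assumptions,
loosely modelled on the kernel's spurious-interrupt accounting; the real
logic has more conditions than this. If the handler keeps claiming
'handled', the line is never disabled and the storm continues; if it
honestly reports 'not handled', the line gets shut off after one window.

```c
#include <stdbool.h>

/* Assumed thresholds for this sketch. */
#define DEMO_WINDOW        100000u	/* interrupts per accounting window */
#define DEMO_UNHANDLED_MAX  99900u	/* unhandled count that trips the disable */

struct demo_irq_line {
	unsigned count;		/* interrupts seen in the current window */
	unsigned unhandled;	/* of those, how many nobody claimed */
	bool disabled;
};

/* Account for one interrupt. Returns true while the line stays enabled. */
bool demo_note_interrupt(struct demo_irq_line *line, bool handled)
{
	if (line->disabled)
		return false;

	if (!handled)
		line->unhandled++;

	if (++line->count < DEMO_WINDOW)
		return true;

	/* End of window: disable the line if almost nothing was handled. */
	if (line->unhandled > DEMO_UNHANDLED_MAX)
		line->disabled = true;

	line->count = 0;
	line->unhandled = 0;
	return !line->disabled;
}

/*
 * Simulate a storm of up to 'limit' interrupts. Returns how many were
 * delivered before the line was disabled, or 'limit' if it never was.
 */
unsigned demo_storm(bool handler_claims_handled, unsigned limit)
{
	struct demo_irq_line line = { 0 };
	unsigned delivered = 0;

	while (delivered < limit) {
		delivered++;
		if (!demo_note_interrupt(&line, handler_claims_handled))
			break;
	}
	return delivered;
}
```

In this model demo_storm(false, ...) stops after one window, while
demo_storm(true, ...) runs all the way to the limit, which is the
"machine then dies" case.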
> Sprinkling "handled" operations does not make me terribly comfortable,
> especially as I'd happily trade the old-style part irq, part napi
> processing for a plain napi processing (I can get over it though :o) ).
The existing cp_rx_poll() function mostly runs without taking cp->lock.
But cp_tx() *does* need cp->lock for the tx_head/tx_tail manipulations.
I'm guessing that's why it's still being called directly from the hard
IRQ handler? I suppose we could probably find a way to move that out.
But I was mostly just trying to fix stuff that was actually broken...
:)
-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation