public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: Pete Swain <swine@google.com>, linux-kernel@vger.kernel.org
Cc: Pete Swain <swine@google.com>, stable@vger.kernel.org
Subject: Re: [PATCH] FIXUP: genirq: defuse spurious-irq timebomb
Date: Sun, 07 Jul 2024 20:39:31 +0200
Message-ID: <87jzhxvyfw.ffs@tglx>
In-Reply-To: <20240615044307.359980-1-swine@google.com>

Pete!

On Fri, Jun 14 2024 at 21:42, Pete Swain wrote:
> The flapping-irq detector still has a timebomb.
>
> A pathological workload, or test script,
> can arm the spurious-irq timebomb described in
>   4f27c00bf80f ("Improve behaviour of spurious IRQ detect")
>
> This leads to irqs being moved to the much slower polled mode,
> despite the actual unhandled-irq rate being well under the
> 99.9k/100k threshold that the code appears to check.
>
> How?
>   - Queued completion handler, like nvme, servicing events
>     as they appear in the queue, even if the irq corresponding
>     to the event has not yet been seen.
>
>   - queues frequently empty, so seeing "spurious" irqs
>     whenever the last events of a threaded handler's
>       while (events_queued()) process_them();
>     ends with those events' irqs posted while thread was scanning.
>     In this case the while() has consumed last event(s),
>     so next handler says IRQ_NONE.
>
>   - In each run of "unhandled" irqs, exactly one IRQ_NONE response
>     is promoted from IRQ_NONE to IRQ_HANDLED, by note_interrupt()'s
>     SPURIOUS_DEFERRED logic.
>
>   - Any 2+ unhandled-irq runs will increment irqs_unhandled.
>     The time_after() check in note_interrupt() resets irqs_unhandled
>     to 1 after an idle period, but if irqs are never spaced more
>     than HZ/10 apart, irqs_unhandled keeps growing.
>
>   - During processing of long completion queues, the non-threaded
>     handlers will return IRQ_WAKE_THREAD, for potentially thousands
>     of per-event irqs. These bypass note_interrupt()'s irq_count++ logic,
>     so do not count as handled, and do not invoke the flapping-irq
>     logic.
>
>   - When the _counted_ irq_count reaches the 100k threshold,
>     it's possible for irqs_unhandled > 99.9k to force a move
>     to polling mode, even though many millions of _WAKE_THREAD
>     irqs have been handled without being counted.
>
> Solution: include IRQ_WAKE_THREAD events in irq_count.
> Only when IRQ_NONE responses outweigh (IRQ_HANDLED + IRQ_WAKE_THREAD)
> by the old 99:1 ratio will an irq be moved to polling mode.
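
For readers following along, the accounting described in the quoted text
can be sketched as a toy model. This is not the code in
kernel/irq/spurious.c: the toy_* names, the count_wake_thread switch and
the burst workload are made up for illustration, and the model assumes
irqs are never spaced more than HZ/10 apart, so it leaves out the
time_after() reset as well as the SPURIOUS_DEFERRED promotion. It also
does not track whether the thread makes progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy return codes; the names mirror the kernel's, the code does not. */
enum toy_ret { TOY_NONE, TOY_HANDLED, TOY_WAKE_THREAD };

struct toy_desc {
    unsigned int irq_count;      /* counted interrupts in the current window */
    unsigned int irqs_unhandled; /* TOY_NONE results in the current window */
    bool polled;                 /* set once the detector gives up on the irq */
};

#define TOY_WINDOW    100000u /* window size taken from the quoted text */
#define TOY_THRESHOLD  99900u /* unhandled limit taken from the quoted text */

/* Simplified stand-in for the counting that note_interrupt() is described as doing. */
static void toy_note_interrupt(struct toy_desc *d, enum toy_ret ret,
                               bool count_wake_thread)
{
    assert(d);

    /* Behaviour described in the quoted text: WAKE_THREAD is not counted. */
    if (ret == TOY_WAKE_THREAD && !count_wake_thread)
        return;

    if (ret == TOY_NONE)
        d->irqs_unhandled++;

    if (++d->irq_count < TOY_WINDOW)
        return;

    /* End of window: judge the ratio, then start a new window. */
    if (d->irqs_unhandled > TOY_THRESHOLD)
        d->polled = true;
    d->irq_count = 0;
    d->irqs_unhandled = 0;
}

/*
 * Hypothetical workload: `bursts` rounds, each consisting of
 * `wake_per_none` WAKE_THREAD results followed by one NONE.
 * Returns true if the toy detector would have switched to polling.
 */
static bool toy_run(unsigned int bursts, unsigned int wake_per_none,
                    bool count_wake_thread)
{
    struct toy_desc d = {0};

    for (unsigned int b = 0; b < bursts; b++) {
        for (unsigned int w = 0; w < wake_per_none; w++)
            toy_note_interrupt(&d, TOY_WAKE_THREAD, count_wake_thread);
        toy_note_interrupt(&d, TOY_NONE, count_wake_thread);
    }
    return d.polled;
}
```

With 100 WAKE_THREAD results per IRQ_NONE, toy_run(100000, 100, false)
trips the detector because only the IRQ_NONE results are ever counted,
while toy_run(100000, 100, true) does not, since each 100k window then
holds fewer than a thousand unhandled results.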

Nice detective work. Though I'm not entirely sure whether that's the
correct approach as it might misjudge the situation where
IRQ_WAKE_THREAD is issued but the thread does not make progress at all.

Let me think about it some more.

Thanks,

        tglx
