From: "Péter Ujfalusi" <peter.ujfalusi@linux.intel.com>
To: Lino Sanfilippo <l.sanfilippo@kunbus.com>,
Jarkko Sakkinen <jarkko@kernel.org>,
Lino Sanfilippo <LinoSanfilippo@gmx.de>,
peterhuewe@gmx.de, jgg@ziepe.ca
Cc: jsnitsel@redhat.com, hdegoede@redhat.com, oe-lkp@lists.linux.dev,
lkp@intel.com, peterz@infradead.org, linux@mniewoehner.de,
linux-integrity@vger.kernel.org, linux-kernel@vger.kernel.org,
lukas@wunner.de, p.rosenberger@kunbus.com
Subject: Re: [PATCH 1/2] tpm, tpm_tis: Handle interrupt storm
Date: Mon, 29 May 2023 09:46:08 +0300
Message-ID: <a84c447f-cdfb-d33c-62cb-bb5d9aa8510b@linux.intel.com>
In-Reply-To: <da435e0d-5f22-fac7-bc10-96a0fd4c6d54@kunbus.com>

Hi Lino,

On 23/05/2023 23:46, Lino Sanfilippo wrote:
>> On the other hand any new functionality is objectively a maintenance
>> burden of some measure (applies to any functionality). So how do we know
>> that taking this change is less of a maintenance burden than just adding
>> new table entries, as they come up?
>>
>
> Initially this set was created as a response to this 0-day bug report which you asked me
> to have a look at:
>
> https://lore.kernel.org/linux-integrity/d80b180a569a9f068d3a2614f062cfa3a78af5a6.camel@kernel.org/
>
> My hope was that it could also avoid some of (existing or future) DMI entries. But even if it does not
> (e.g. the problem Péter Ujfalusi reported with the UPX-i11 cannot be fixed by this patch set and thus
> needs the DMI quirk) we may at least avoid more bug reports due to interrupt storms once
> 6.4 is released.

I'm surprised that there is a need for storm detection in the first
place... Do we have something else on the same IRQ line on the affected
devices which might have a bug or no driver at all?
It is hard to believe that a TPM (Trusted Platform Module) is integrated
so poorly ;)

But put that aside: I think the storm detection is good given that there
is no other way to know which machines have sloppy TPM integration.
There are machines where this happens, so it is a known integration
issue, right?

My only 'nitpick' is with the printk level to be used.
The ERR level is not correct as we know the issue and we handle it, so
all is under control.

If we want to add these machines to the quirk list then WARN is a good
level to gain attention, but I'm not sure a user will know how to get
the machine into the quirk list (where to file a bug).

If we only want the quirk to be used for machines like the UPX-i11, which
simply have a broken (likely floating) IRQ line, then WARN is too
high a level; INFO or even DBG would be appropriate as you are not going
to update the quirk list, it is just handled under the hood (which is a
great thing, but on the other hand you will have the storm nevertheless
and that is not a nice thing).

It is a matter of how this is going to be handled in the long term: add
quirks for all the known machines with either a stormy or a plain broken
IRQ line, or handle the stormy ones and quirk only the broken ones.

>>> Detect an interrupt storm by counting the number of unhandled interrupts
>>> within a 10 ms time interval. In case that more than 1000 were unhandled
>>> deactivate interrupts, deregister the handler and fall back to polling.
>>
>> I know it can be sometimes hard to evaluate but can you try to explain
>> how you came up with the 10 ms sampling period and 1000 interrupt
>> threshold? I just don't like arbitrary numbers.
>
> At least the 100 ms is not plucked out of thin air but it's the same time period
> that the generic code in note_interrupt() uses - I assume for a good reason.
> Not only this number but the whole irq storm detection logic is taken from
> there:
>
>>
>>> This equals the implementation that handles interrupt storms in
>>> note_interrupt() by means of timestamps and counters in struct irq_desc.
>
> The number of 1000 unhandled interrupts is still far below the 99900 used in
> note_interrupt() but IMHO enough to indicate that there is something seriously
> wrong with interrupt processing and it is probably safer to fall back to polling.

Except that if the line got the spurious designation in core, the
interrupt line will be disabled while the TPM driver will think that it
is still using IRQ mode and will not switch to polling.
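
To make the numbers in the quoted text concrete, here is a minimal userspace sketch of the kind of counting scheme being discussed. It is not the code from the patch and not the core's note_interrupt(). All names (struct storm_detector, storm_note_unhandled, the polling flag) are made up for illustration, time is passed in by the caller in milliseconds instead of using jiffies, and the 1000 / 100 ms values are simply the ones mentioned in this thread. The count restarts whenever the gap since the previous unhandled interrupt exceeds the window, and crossing the threshold flips the detector to polling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping; these are not the driver's real fields. */
struct storm_detector {
	uint64_t last_unhandled_ms; /* time of the previous unhandled IRQ */
	unsigned int unhandled;     /* unhandled IRQs in the current burst */
	unsigned int threshold;     /* e.g. 1000, as discussed in the thread */
	uint64_t window_ms;         /* e.g. 100, max gap that keeps a burst alive */
	bool polling;               /* set once we have given up on IRQ mode */
};

/*
 * Record one unhandled interrupt seen at time now_ms.
 * Returns true once the detector has decided to fall back to polling.
 */
static bool storm_note_unhandled(struct storm_detector *d, uint64_t now_ms)
{
	if (d->polling)
		return true;

	/* A long enough quiet period starts a new burst. */
	if (now_ms - d->last_unhandled_ms > d->window_ms)
		d->unhandled = 1;
	else
		d->unhandled++;
	d->last_unhandled_ms = now_ms;

	/*
	 * Too many unhandled IRQs in one burst: treat it as a storm.
	 * A real driver would mask the interrupt and free the handler
	 * here; this sketch only records the decision.
	 */
	if (d->unhandled > d->threshold)
		d->polling = true;

	return d->polling;
}
```

With threshold = 1000 and window_ms = 100, a burst of 1001 unhandled interrupts arriving back to back trips the detector, while unhandled interrupts spaced 200 ms apart never do. The point I am making above is about who owns this decision: when the driver makes it, it also knows that it has to switch to polling.
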
A storm of 1000 is better than a storm of 99900 for sure, but quirking
these would be the desired final solution, IMHO.
There are many buts around this ;)
--
Péter