From: Mimi Zohar <zohar@linux.ibm.com>
To: Calvin Owens <calvinowens@fb.com>
Cc: Peter Huewe <peterhuewe@gmx.de>,
Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>,
Jason Gunthorpe <jgg@ziepe.ca>, Arnd Bergmann <arnd@arndb.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"linux-integrity@vger.kernel.org"
<linux-integrity@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH] tpm: Make timeout logic simpler and more robust
Date: Tue, 12 Mar 2019 16:56:15 -0400
Message-ID: <1552424175.24794.105.camel@linux.ibm.com>
In-Reply-To: <20190312200820.GB5058@Haydn>
On Tue, 2019-03-12 at 20:08 +0000, Calvin Owens wrote:
> On Tuesday 03/12 at 13:04 -0400, Mimi Zohar wrote:
> > On Mon, 2019-03-11 at 16:54 -0700, Calvin Owens wrote:
> > > We're having lots of problems with TPM commands timing out, and we're
> > > seeing these problems across lots of different hardware (both v1/v2).
> > >
> > > I instrumented the driver to collect latency data, but I wasn't able to
> > > find any specific timeout to fix: it seems like many of them are too
> > > aggressive. So I tried replacing all the timeout logic with a single
> > > universal long timeout, and found that makes our TPMs 100% reliable.
> > >
> > > Given that this timeout logic is very complex, problematic, and appears
> > > to serve no real purpose, I propose simply deleting all of it.
> >
> > Normally before sending such a massive change like this, included in
> > the bug report or patch description, there would be some indication as
> > to which kernel introduced a regression. Has this always been a
> > problem? Is this something new? How new?
>
> Honestly we've always had problems with flakiness from these devices,
> but it seems to have regressed sometime between 4.11 and 4.16.
Well, that's a start. Around 4.10 is when we started noticing TPM
performance issues due to the change in the kernel timer scheduling.
This resulted in commit a233a0289cf9 ("tpm: msleep() delays - replace
with usleep_range() in i2c nuvoton driver"), which was upstreamed in
4.12.
At the other end, James was referring to commit 424eaf910c32 ("tpm:
reduce polling time to usecs for even finer granularity"), which was
introduced in 4.18.
>
> I wish I had a better answer for you: we need on the order of a hundred
> machines to see the difference, and setting up these 100+ machine tests
> is unfortunately involved enough that e.g. bisecting it just isn't
> feasible :/
> What I can say for sure is that this patch makes everything much better
> for us. If there's anything in particular you'd like me to test, I have
> an army of machines I'm happy to put to use, let me know :)
I would assume not all of your machines are the same nor have the same
TPM. Could you verify that this problem is across the board and not
limited to a particular TPM?
BTW, are you seeing this problem with both TPM 1.2 and 2.0?
thanks!
Mimi