public inbox for linux-kernel@vger.kernel.org
From: "Jarkko Sakkinen" <jarkko@kernel.org>
To: "James Bottomley" <James.Bottomley@HansenPartnership.com>,
	"Roberto Sassu" <roberto.sassu@huaweicloud.com>,
	"Linux regressions mailing list" <regressions@lists.linux.dev>
Cc: <keyrings@vger.kernel.org>,
	"linux-integrity@vger.kernel.org"
	<linux-integrity@vger.kernel.org>,
	"LKML" <linux-kernel@vger.kernel.org>,
	"Pengyu Ma" <mapengyu@gmail.com>
Subject: Re: [regression] significant delays when secureboot is enabled since 6.10
Date: Sat, 14 Sep 2024 13:42:18 +0300
Message-ID: <D45Y0H3JRIJE.3LIRI1PEDTJE3@kernel.org>
In-Reply-To: <10ae7b8592af7bacef87e493e6d628a027641b8d.camel@HansenPartnership.com>

On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote:
> On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
> > On Tue, 2024-09-10 at 16:28 +0300, Jarkko Sakkinen wrote:
> > > On Tue Sep 10, 2024 at 3:57 PM EEST, James Bottomley wrote:
> > > > On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote:
> > > > > On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> > > > > > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > > > > > > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking
> > > > > > > > (Thorsten Leemhuis) wrote:
> > > > > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > > > > > 
> > > > > > > > > James, Jarkko, I noticed a report about a regression in
> > > > > > > > > bugzilla.kernel.org that appears to be caused by this change of
> > > > > > > > > yours:
> > > > > > > > >
> > > > > > > > > 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()")
> > > > > > > > > [v6.10-rc1]
> > > > > > > > 
> > > > > > > > > As many (most?) kernel developers don't keep an eye on the bug
> > > > > > > > > tracker, I decided to forward it by mail. To quote from
> > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > > > > > > > 
> > > > > > > > > When secureboot is enabled,
> > > > > > > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > > > > > > it's ~7 seconds on 6.8 kernel version.
> > > > > > > > > 
> > > > > > > > > When secureboot is disabled,
> > > > > > > > > the boot time is ~7 seconds too.
> > > > > > > > > 
> > > > > > > > > > Reproduced on both AMD and Intel platform on ThinkPad X1 and
> > > > > > > > > > T14.
> > > > > > > > > >
> > > > > > > > > > It probably caused autologin failure and micmute led not
> > > > > > > > > > loaded on AMD platform.
> > > > > > > > 
> > > > > > > > > It was later bisected to the change mentioned above. See the
> > > > > > > > > ticket for more details.
> > > > > > > 
> > > > > > > Hi
> > > > > > > 
> > > > > > > I suspect I encountered the same problem:
> > > > > > > 
> > > > > > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
> > > > > > > 
> > > > > > > Going to provide more info there.
> > > > > > 
> > > > > > > I suppose you are going to try to acquire the tracing data I
> > > > > > > asked for? That would be awesome, thanks for taking the trouble.
> > > > > > > Let's look at the data and draw conclusions based on that.
> > > > > > >
> > > > > > > The workaround is pretty simple: adding CONFIG_TCG_TPM2_HMAC=n to
> > > > > > > the kernel configuration disables the feature.
> > > > > > 
> > > > > > For making decisions what to do with the  we are talking
> > > > > > about ~2
> > > > > > week window estimated, given the Vienna conference slows
> > > > > > things
> > > > > > down, so I hope my workaround is good enough before that.
> > > > > 
> > > > > I can enumerate three most likely ways to address the issue:
> > > > > 
> > > > > 1. Strongest: drop from defconfig.
> > > > > > 2. Medium: leave in defconfig but add an opt-in kernel command-line
> > > > > >    parameter.
> > > > > > 3. Lightest: if we can nail the regression based on tracing data on a
> > > > > >    sustainable schedule, fix it.
> > > > 
> > > > > Actually, there's a fourth: not use sessions for the PCR extend (if
> > > > > we'd got the timings when I asked, this was going to be my suggestion
> > > > > if they came back problematic).  This seems only to be a problem for
> > > > > IMA measured boot (because it does lots of extends).  If necessary this
> > > > > could even be wrapped in a separate config or boot option that only
> > > > > disables HMAC on extend if IMA (so we still get security for things
> > > > > like sd-boot)
> > > 
> > > > I can buy that, but with a twist: make it an opt-in kernel command-line
> > > > option. We don't want to take already existing functionality away
> > > > from those who might want to use it (given e.g. hardening requirements),
> > > > and on that basis opt-in (disabled by default) would be a more balanced
> > > > way to address the issue.
> > > >
> > > > Please do send a patch!
> > 
> > > I made a few measurements. I have a Fedora 38 VM with TPM passthrough.
> > 
> > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
> > 
> > QEMU:
> > 
> > > rc  qemu-kvm                                          1:4.2-3ubuntu6.27
> > > ii  qemu-system-x86                                   1:6.2+dfsg-2ubuntu6.22
> > 
> > 
> > TPM2_PT_MANUFACTURER:
> >   raw: 0x49465800
> >   value: "IFX"
> > TPM2_PT_VENDOR_STRING_1:
> >   raw: 0x534C4239
> >   value: "SLB9"
> > TPM2_PT_VENDOR_STRING_2:
> >   raw: 0x36373000
> >   value: "670"
> > 
> > 
> > No HMAC:
> > 
> > # tracer: function_graph
> > #
> > # CPU  DURATION                  FUNCTION CALLS
> > # |     |   |                     |   |   |   |
> >  0)               |  tpm2_pcr_extend() {
> >  0)   1.112 us    |    tpm_buf_append_hmac_session();
> >  0) # 6360.029 us |    tpm_transmit_cmd();
> >  0) # 6415.012 us |  }
> > 
> > 
> > HMAC:
> > 
> > # tracer: function_graph
> > #
> > # CPU  DURATION                  FUNCTION CALLS
> > # |     |   |                     |   |   |   |
> >  1)               |  tpm2_pcr_extend() {
> >  1)               |    tpm2_start_auth_session() {
> >  1) * 36976.99 us |      tpm_transmit_cmd();
> >  1) * 84746.51 us |      tpm_transmit_cmd();
> >  1) # 3195.083 us |      tpm_transmit_cmd();
> >  1) @ 126795.1 us |    }
> >  1)   2.254 us    |    tpm_buf_append_hmac_session();
> >  1)   3.546 us    |    tpm_buf_fill_hmac_session();
> >  1) * 24356.46 us |    tpm_transmit_cmd();
> >  1)   3.496 us    |    tpm_buf_check_hmac_response();
> >  1) @ 151171.0 us |  }
>
> Well, unfortunately, that tells us that it's the TPM itself that's
> taking the time processing the security overhead.  The ordering of the
> commands in tpm2_start_auth_session() shows
>
>  37ms for context restore of null key
>  85ms for start session with encrypted salt
>   3ms to flush null key
> -----
> 125ms
>
> If we context save the session, we'd likely only bear a single 37ms
> cost to restore it (replacing the total 125ms).  However, there's
> nothing we can do about the extend execution going from 6ms to 24ms, so
> I could halve your current boot time with security enabled (it's
> currently 149ms, it would go to 61ms, but it's still 10x slower than
> the unsecured extend at 6ms)
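
For reference, the arithmetic in the estimate quoted above can be re-derived
from the trace numbers. This is a minimal sketch, assuming the per-command
attribution given in the quote (restore, start session, flush). The variable
names are illustrative, and the "projected" figure is only the quoted estimate
that a context-saved session would cost a single ~37ms restore; it is not a
measurement.

```python
# Durations in microseconds, copied from the function_graph traces quoted above.
NO_HMAC_EXTEND_US = 6415.012  # whole tpm2_pcr_extend() without HMAC

# The three tpm_transmit_cmd() calls inside tpm2_start_auth_session(),
# labelled as in the quoted breakdown.
SESSION_CMDS_US = {
    "context restore of null key": 36976.99,
    "start session with encrypted salt": 84746.51,
    "flush null key": 3195.083,
}
HMAC_EXTEND_CMD_US = 24356.46  # tpm_transmit_cmd() for the extend itself


def to_ms(us: float) -> float:
    """Convert microseconds to milliseconds."""
    return us / 1000.0


# Cost of setting up the session on every extend (~125ms in the quote).
session_ms = to_ms(sum(SESSION_CMDS_US.values()))

# Current cost per extend with HMAC: session setup plus the extend command.
current_ms = session_ms + to_ms(HMAC_EXTEND_CMD_US)

# ASSUMPTION (from the quote, not measured): with a context-saved session,
# only one restore of roughly the same cost as the null key restore remains.
projected_ms = (
    to_ms(SESSION_CMDS_US["context restore of null key"])
    + to_ms(HMAC_EXTEND_CMD_US)
)

# How much slower the projected HMAC extend still is than the plain one.
slowdown = projected_ms / to_ms(NO_HMAC_EXTEND_US)

print(f"session setup: {session_ms:.1f} ms")
print(f"current:       {current_ms:.1f} ms")
print(f"projected:     {projected_ms:.1f} ms")
print(f"slowdown:      {slowdown:.1f}x vs. no HMAC")
```

The sums ignore the few microseconds spent in the tpm_buf_*() helpers, which
is why they come out slightly below the totals the tracer reports for the
enclosing functions.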

Please address how this discussion is related to https://bugzilla.kernel.org/show_bug.cgi?id=219229

I just read the bug report, and it says nothing about IMA or PCR extend.

There is now tons of spam about a performance issue in a patch set that is
not in the mainline, and barely anything about the original issue:

"
When secureboot is enabled,
the kernel boot time is ~20 seconds after 6.10 kernel.
it's ~7 seconds on 6.8 kernel version.

When secureboot is disabled,
the boot time is ~7 seconds too.

Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.

It probably caused autologin failure and micmute led not loaded on AMD platform.

6.9 kernel version is not tested since not signed kernel found.
6.8, 6.10, 6.11 are tested, the first bad version is 6.10.
"

How is this going to help to fix this one?

I will say this only once: I have zero interest in fixing code that is not
in the mainline.

BR, Jarkko
