* [regression] significant delays when secureboot is enabled since 6.10
@ 2024-09-10 9:01 Linux regression tracking (Thorsten Leemhuis)
2024-09-10 9:05 ` Roberto Sassu
2024-09-10 12:22 ` James Bottomley
0 siblings, 2 replies; 34+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-09-10 9:01 UTC (permalink / raw)
To: James Bottomley, Jarkko Sakkinen
Cc: keyrings, linux-integrity@vger.kernel.org, LKML,
Linux kernel regressions list, Pengyu Ma
Hi, Thorsten here, the Linux kernel's regression tracker.
James, Jarkko, I noticed a report about a regression in
bugzilla.kernel.org that appears to be caused by this change of yours:
6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()") [v6.10-rc1]
As many (most?) kernel developers don't keep an eye on the bug tracker,
I decided to forward it by mail. To quote from
https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> When secureboot is enabled,
> the kernel boot time is ~20 seconds after 6.10 kernel.
> it's ~7 seconds on 6.8 kernel version.
>
> When secureboot is disabled,
> the boot time is ~7 seconds too.
>
> Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
>
> It probably caused autologin failure and micmute led not loaded on AMD platform.
It was later bisected to the change mentioned above. See the ticket for
more details.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
[1] because bugzilla.kernel.org tells users upon registration their
"email address will never be displayed to logged out users"
P.S.: let me use this mail to also add the report to the list of tracked
regressions to ensure it doesn't fall through the cracks:
#regzbot introduced: 6519fea6fd372b
#regzbot from: Pengyu Ma <mapengyu@gmail.com>
#regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=219229
#regzbot title: tpm: significant delays when secureboot is enabled
#regzbot ignore-activity
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 9:01 [regression] significant delays when secureboot is enabled since 6.10 Linux regression tracking (Thorsten Leemhuis)
@ 2024-09-10 9:05 ` Roberto Sassu
2024-09-10 12:39 ` Jarkko Sakkinen
2024-09-10 12:22 ` James Bottomley
1 sibling, 1 reply; 34+ messages in thread
From: Roberto Sassu @ 2024-09-10 9:05 UTC (permalink / raw)
To: Linux regressions mailing list, James Bottomley, Jarkko Sakkinen
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking (Thorsten
Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker.
>
> James, Jarkko, I noticed a report about a regression in
> bugzilla.kernel.org that appears to be caused by this change of yours:
>
> 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()") [v6.10-rc1]
>
> As many (most?) kernel developers don't keep an eye on the bug tracker,
> I decided to forward it by mail. To quote from
> https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
>
> > When secureboot is enabled,
> > the kernel boot time is ~20 seconds after 6.10 kernel.
> > it's ~7 seconds on 6.8 kernel version.
> >
> > When secureboot is disabled,
> > the boot time is ~7 seconds too.
> >
> > Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
> >
> > It probably caused autologin failure and micmute led not loaded on AMD platform.
>
> It was later bisected to the change mentioned above. See the ticket for
> more details.
Hi
I suspect I encountered the same problem:
https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
Going to provide more info there.
Roberto
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> [1] because bugzilla.kernel.org tells users upon registration their
> "email address will never be displayed to logged out users"
>
> P.S.: let me use this mail to also add the report to the list of tracked
> regressions to ensure it doesn't fall through the cracks:
>
> #regzbot introduced: 6519fea6fd372b
> #regzbot from: Pengyu Ma <mapengyu@gmail.com>
> #regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=219229
> #regzbot title: tpm: significant delays when secureboot is enabled
> #regzbot ignore-activity
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 9:01 [regression] significant delays when secureboot is enabled since 6.10 Linux regression tracking (Thorsten Leemhuis)
2024-09-10 9:05 ` Roberto Sassu
@ 2024-09-10 12:22 ` James Bottomley
2024-09-10 12:41 ` Linux regression tracking (Thorsten Leemhuis)
1 sibling, 1 reply; 34+ messages in thread
From: James Bottomley @ 2024-09-10 12:22 UTC (permalink / raw)
To: Linux regressions mailing list, Jarkko Sakkinen
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking (Thorsten
Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker.
>
> James, Jarkko, I noticed a report about a regression in
> bugzilla.kernel.org that appears to be caused by this change of
> yours:
>
> 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()") [v6.10-
> rc1]
>
> As many (most?) kernel developers don't keep an eye on the bug
> tracker, I decided to forward it by mail. To quote from
> https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
>
> > When secureboot is enabled,
> > the kernel boot time is ~20 seconds after 6.10 kernel.
> > it's ~7 seconds on 6.8 kernel version.
> >
> > When secureboot is disabled,
> > the boot time is ~7 seconds too.
> >
> > Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
> >
> > It probably caused autologin failure and micmute led not loaded on
> > AMD platform.
>
> It was later bisected to the change mentioned above. See the ticket
> for more details.
We always suspected encryption and hmac would add overheads which is
why it's gated by a config option. The way to fix this is to set
CONFIG_TCG_TPM_HMAC to N
of course, TPM transactions are then insecure, but it's the same state
as you were in before.
James
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 9:05 ` Roberto Sassu
@ 2024-09-10 12:39 ` Jarkko Sakkinen
2024-09-10 12:48 ` Jarkko Sakkinen
0 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-10 12:39 UTC (permalink / raw)
To: Roberto Sassu, Linux regressions mailing list, James Bottomley
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking (Thorsten
> Leemhuis) wrote:
> > Hi, Thorsten here, the Linux kernel's regression tracker.
> >
> > James, Jarkko, I noticed a report about a regression in
> > bugzilla.kernel.org that appears to be caused by this change of yours:
> >
> > 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()") [v6.10-rc1]
> >
> > As many (most?) kernel developers don't keep an eye on the bug tracker,
> > I decided to forward it by mail. To quote from
> > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> >
> > > When secureboot is enabled,
> > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > it's ~7 seconds on 6.8 kernel version.
> > >
> > > When secureboot is disabled,
> > > the boot time is ~7 seconds too.
> > >
> > > Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
> > >
> > > It probably caused autologin failure and micmute led not loaded on AMD platform.
> >
> > It was later bisected to the change mentioned above. See the ticket for
> > more details.
>
> Hi
>
> I suspect I encountered the same problem:
>
> https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
>
> Going to provide more info there.
I suppose you are going to try to acquire the tracing data I asked for?
That would be awesome, thanks for taking the trouble. Let's look
at the data and draw conclusions based on that.
The workaround is pretty simple: setting CONFIG_TCG_TPM2_HMAC=n in the
kernel configuration disables the feature.
As for deciding what to do about this, we are talking about an estimated
~2 week window, given that the Vienna conference slows things down, so I
hope my workaround is good enough until then.
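To see whether a given kernel build has the feature enabled, one could inspect its configuration file. The following is a minimal sketch, not part of the original mail: it assumes the usual kernel `.config` syntax (`CONFIG_FOO=y`, or `# CONFIG_FOO is not set` for a disabled option), and the helper name `option_state` is hypothetical. Where the configuration text comes from (for example a file under /boot) is distribution-dependent and left to the caller.

```python
import re

OPTION = "CONFIG_TCG_TPM2_HMAC"


def option_state(config_text: str, option: str = OPTION) -> str:
    """Return the assigned value (e.g. 'y'), 'not set', or 'absent'."""
    assigned = re.compile(rf"^{re.escape(option)}=(.*)$")
    disabled = f"# {option} is not set"
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == disabled:
            # Kconfig writes disabled bool options as a comment line.
            return "not set"
        match = assigned.match(line)
        if match:
            return match.group(1)
    # The option does not appear at all (e.g. a kernel older than 6.10).
    return "absent"


# Illustrative configuration text, not taken from a real system.
sample = """\
CONFIG_TCG_TPM=y
CONFIG_TCG_TPM2_HMAC=y
"""
print(option_state(sample))
```

A result of `y` would mean the HMAC/encryption sessions are compiled in; `not set` corresponds to the workaround described above.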
> Roberto
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 12:22 ` James Bottomley
@ 2024-09-10 12:41 ` Linux regression tracking (Thorsten Leemhuis)
2024-09-10 22:40 ` Jarkko Sakkinen
0 siblings, 1 reply; 34+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-09-10 12:41 UTC (permalink / raw)
To: James Bottomley, Linux regressions mailing list, Jarkko Sakkinen
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma,
Roberto Sassu
On 10.09.24 14:22, James Bottomley wrote:
> On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking (Thorsten
> Leemhuis) wrote:
>>
>> 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()") [v6.10-
>> rc1]
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
>>
>>> When secureboot is enabled,
>>> the kernel boot time is ~20 seconds after 6.10 kernel.
>>> it's ~7 seconds on 6.8 kernel version.
>>>
>>> When secureboot is disabled,
>>> the boot time is ~7 seconds too.
>>>
>>> Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
>
> We always suspected encryption and hmac would add overheads which is
> why it's gated by a config option. The way to fix this is to set
>
> CONFIG_TCG_TPM_HMAC to N
FWIW (mainly for others that later find this thread on lore), I'm pretty
sure James meant CONFIG_TCG_TPM2_HMAC.
> of course, TPM transactions are then insecure, but it's the same state
> as you were in before.
Hmmm. But it's on by default on X86_64.
Hmmm. If this would cause serious trouble, I'd say this is a regression
that must be fixed, as we can't expect people to know that they need to
turn this off. But delays during boot? Hmmm. Makes me wonder what Linus'
stance would be here. I suspect it might be "why was this enabled by
default for x86_64 anyway, new features almost always should be off by
default", but I might be wrong there. And given that this was introduced
in 6.10, I assume a lot of users have CONFIG_TCG_TPM2_HMAC=Y in
their .config files already anyway. :-/
Hmmm. :-|
Ciao, Thorsten
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 12:39 ` Jarkko Sakkinen
@ 2024-09-10 12:48 ` Jarkko Sakkinen
2024-09-10 12:57 ` James Bottomley
0 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-10 12:48 UTC (permalink / raw)
To: Jarkko Sakkinen, Roberto Sassu, Linux regressions mailing list,
James Bottomley
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking (Thorsten
> > Leemhuis) wrote:
> > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > >
> > > James, Jarkko, I noticed a report about a regression in
> > > bugzilla.kernel.org that appears to be caused by this change of yours:
> > >
> > > 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()") [v6.10-rc1]
> > >
> > > As many (most?) kernel developers don't keep an eye on the bug tracker,
> > > I decided to forward it by mail. To quote from
> > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > >
> > > > When secureboot is enabled,
> > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > it's ~7 seconds on 6.8 kernel version.
> > > >
> > > > When secureboot is disabled,
> > > > the boot time is ~7 seconds too.
> > > >
> > > > Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
> > > >
> > > > It probably caused autologin failure and micmute led not loaded on AMD platform.
> > >
> > > It was later bisected to the change mentioned above. See the ticket for
> > > more details.
> >
> > Hi
> >
> > I suspect I encountered the same problem:
> >
> > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
> >
> > Going to provide more info there.
>
> I suppose you are going try to acquire the tracing data I asked?
> That would be awesome, thanks for taking the trouble. Let's look
> at the data and draw conclusions based on that.
>
> Workaround is pretty simple: CONFIG_TCG_TPM2_HMAC=n to the kernel
> configuration disables the feature.
>
> For making decisions what to do with the we are talking about ~2
> week window estimated, given the Vienna conference slows things
> down, so I hope my workaround is good enough before that.
I can enumerate the three most likely ways to address the issue:
1. Strongest: drop from defconfig.
2. Medium: leave in defconfig but add an opt-in kernel command-line
parameter.
3. Lightest: if we can, based on tracing data, nail down the regression
on a sustainable schedule, fix it.
Without data it is impossible to point out the right choice (or
some unknown alternative that has not crossed my mind yet).
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 12:48 ` Jarkko Sakkinen
@ 2024-09-10 12:57 ` James Bottomley
2024-09-10 13:28 ` Jarkko Sakkinen
0 siblings, 1 reply; 34+ messages in thread
From: James Bottomley @ 2024-09-10 12:57 UTC (permalink / raw)
To: Jarkko Sakkinen, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote:
> On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking
> > > (Thorsten
> > > Leemhuis) wrote:
> > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > >
> > > > James, Jarkko, I noticed a report about a regression in
> > > > bugzilla.kernel.org that appears to be caused by this change of
> > > > yours:
> > > >
> > > > 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()")
> > > > [v6.10-rc1]
> > > >
> > > > As many (most?) kernel developers don't keep an eye on the bug
> > > > tracker,
> > > > I decided to forward it by mail. To quote from
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > > >
> > > > > When secureboot is enabled,
> > > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > > it's ~7 seconds on 6.8 kernel version.
> > > > >
> > > > > When secureboot is disabled,
> > > > > the boot time is ~7 seconds too.
> > > > >
> > > > > Reproduced on both AMD and Intel platform on ThinkPad X1 and
> > > > > T14.
> > > > >
> > > > > It probably caused autologin failure and micmute led not
> > > > > loaded on AMD platform.
> > > >
> > > > It was later bisected to the change mentioned above. See the
> > > > ticket for
> > > > more details.
> > >
> > > Hi
> > >
> > > I suspect I encountered the same problem:
> > >
> > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
> > >
> > > Going to provide more info there.
> >
> > I suppose you are going try to acquire the tracing data I asked?
> > That would be awesome, thanks for taking the trouble. Let's look
> > at the data and draw conclusions based on that.
> >
> > Workaround is pretty simple: CONFIG_TCG_TPM2_HMAC=n to the kernel
> > configuration disables the feature.
> >
> > For making decisions what to do with the we are talking about ~2
> > week window estimated, given the Vienna conference slows things
> > down, so I hope my workaround is good enough before that.
>
> I can enumerate three most likely ways to address the issue:
>
> 1. Strongest: drop from defconfig.
> 2. Medium: leave to defconfig but add an opt-in kernel command-line
> parameter.
> 3. Lightest: if we can based on tracing data nail the regression in
> sustainable schedule, fix it.
Actually, there's a fourth: don't use sessions for the PCR extend (if
we'd got the timings when I asked, this was going to be my suggestion
if they came back problematic). This seems only to be a problem for
IMA measured boot (because it does lots of extends). If necessary, this
could even be wrapped in a separate config or boot option that only
disables HMAC on extend for IMA (so we still get security for things
like sd-boot).
The downside of doing this is that an interposer can drop any extend
it wants without being immediately detected, but as long as they don't
have control of the kernel they can't change the log entry, so the
mismatch would be detected on check (which has to be done by the remote
verifier). The unavoidable increased threat is that if you get tricked
into booting a malicious kernel (so the attacker has control of the
log) and the interposer substitutes the boot measurements, it can
fake out a remote verification system into thinking you're
actually a good node.
James
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 12:57 ` James Bottomley
@ 2024-09-10 13:28 ` Jarkko Sakkinen
2024-09-11 8:53 ` Roberto Sassu
0 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-10 13:28 UTC (permalink / raw)
To: James Bottomley, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Tue Sep 10, 2024 at 3:57 PM EEST, James Bottomley wrote:
> On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote:
> > On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> > > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking
> > > > (Thorsten
> > > > Leemhuis) wrote:
> > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > >
> > > > > James, Jarkko, I noticed a report about a regression in
> > > > > bugzilla.kernel.org that appears to be caused by this change of
> > > > > yours:
> > > > >
> > > > > 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()")
> > > > > [v6.10-rc1]
> > > > >
> > > > > As many (most?) kernel developers don't keep an eye on the bug
> > > > > tracker,
> > > > > I decided to forward it by mail. To quote from
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > > > >
> > > > > > When secureboot is enabled,
> > > > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > > > it's ~7 seconds on 6.8 kernel version.
> > > > > >
> > > > > > When secureboot is disabled,
> > > > > > the boot time is ~7 seconds too.
> > > > > >
> > > > > > Reproduced on both AMD and Intel platform on ThinkPad X1 and
> > > > > > T14.
> > > > > >
> > > > > > It probably caused autologin failure and micmute led not
> > > > > > loaded on AMD platform.
> > > > >
> > > > > It was later bisected to the change mentioned above. See the
> > > > > ticket for
> > > > > more details.
> > > >
> > > > Hi
> > > >
> > > > I suspect I encountered the same problem:
> > > >
> > > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
> > > >
> > > > Going to provide more info there.
> > >
> > > I suppose you are going try to acquire the tracing data I asked?
> > > That would be awesome, thanks for taking the trouble. Let's look
> > > at the data and draw conclusions based on that.
> > >
> > > Workaround is pretty simple: CONFIG_TCG_TPM2_HMAC=n to the kernel
> > > configuration disables the feature.
> > >
> > > For making decisions what to do with the we are talking about ~2
> > > week window estimated, given the Vienna conference slows things
> > > down, so I hope my workaround is good enough before that.
> >
> > I can enumerate three most likely ways to address the issue:
> >
> > 1. Strongest: drop from defconfig.
> > 2. Medium: leave to defconfig but add an opt-in kernel command-line
> > parameter.
> > 3. Lightest: if we can based on tracing data nail the regression in
> > sustainable schedule, fix it.
>
> Actually, there's a fourth: not use sessions for the PCR extend (if
> we'd got the timings when I asked, this was going to be my suggestion
> if they came back problematic). This seems only to be a problem for
> IMA measured boot (because it does lots of extends). If necessary this
> could even be wrapped in a separate config or boot option that only
> disables HMAC on extend if IMA (so we still get security for things
> like sd-boot)
I can buy that, but with the twist that we make it an opt-in kernel
command-line option. We don't want to take already existing functionality
away from those who might want to use it (given e.g. hardening
requirements), and on that basis opt-in (disabled by default) would be a
more balanced way to address the issue.
Please do send a patch!
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 12:41 ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-09-10 22:40 ` Jarkko Sakkinen
0 siblings, 0 replies; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-10 22:40 UTC (permalink / raw)
To: Linux regressions mailing list, James Bottomley
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma,
Roberto Sassu
On Tue Sep 10, 2024 at 3:41 PM EEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> FWIW (mainly for others that later find this thread on lore), I'm pretty
> sure James meant CONFIG_TCG_TPM2_HMAC.
Yeah, exactly.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-10 13:28 ` Jarkko Sakkinen
@ 2024-09-11 8:53 ` Roberto Sassu
2024-09-11 12:21 ` James Bottomley
2024-09-11 15:14 ` Jarkko Sakkinen
0 siblings, 2 replies; 34+ messages in thread
From: Roberto Sassu @ 2024-09-11 8:53 UTC (permalink / raw)
To: Jarkko Sakkinen, James Bottomley, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Tue, 2024-09-10 at 16:28 +0300, Jarkko Sakkinen wrote:
> On Tue Sep 10, 2024 at 3:57 PM EEST, James Bottomley wrote:
> > On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote:
> > > On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> > > > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > > > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking
> > > > > (Thorsten
> > > > > Leemhuis) wrote:
> > > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > > >
> > > > > > James, Jarkko, I noticed a report about a regression in
> > > > > > bugzilla.kernel.org that appears to be caused by this change of
> > > > > > yours:
> > > > > >
> > > > > > 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()")
> > > > > > [v6.10-rc1]
> > > > > >
> > > > > > As many (most?) kernel developers don't keep an eye on the bug
> > > > > > tracker,
> > > > > > I decided to forward it by mail. To quote from
> > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > > > > >
> > > > > > > When secureboot is enabled,
> > > > > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > > > > it's ~7 seconds on 6.8 kernel version.
> > > > > > >
> > > > > > > When secureboot is disabled,
> > > > > > > the boot time is ~7 seconds too.
> > > > > > >
> > > > > > > Reproduced on both AMD and Intel platform on ThinkPad X1 and
> > > > > > > T14.
> > > > > > >
> > > > > > > It probably caused autologin failure and micmute led not
> > > > > > > loaded on AMD platform.
> > > > > >
> > > > > > It was later bisected to the change mentioned above. See the
> > > > > > ticket for
> > > > > > more details.
> > > > >
> > > > > Hi
> > > > >
> > > > > I suspect I encountered the same problem:
> > > > >
> > > > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
> > > > >
> > > > > Going to provide more info there.
> > > >
> > > > I suppose you are going try to acquire the tracing data I asked?
> > > > That would be awesome, thanks for taking the trouble. Let's look
> > > > at the data and draw conclusions based on that.
> > > >
> > > > Workaround is pretty simple: CONFIG_TCG_TPM2_HMAC=n to the kernel
> > > > configuration disables the feature.
> > > >
> > > > For making decisions what to do with the we are talking about ~2
> > > > week window estimated, given the Vienna conference slows things
> > > > down, so I hope my workaround is good enough before that.
> > >
> > > I can enumerate three most likely ways to address the issue:
> > >
> > > 1. Strongest: drop from defconfig.
> > > 2. Medium: leave to defconfig but add an opt-in kernel command-line
> > > parameter.
> > > 3. Lightest: if we can based on tracing data nail the regression in
> > > sustainable schedule, fix it.
> >
> > Actually, there's a fourth: not use sessions for the PCR extend (if
> > we'd got the timings when I asked, this was going to be my suggestion
> > if they came back problematic). This seems only to be a problem for
> > IMA measured boot (because it does lots of extends). If necessary this
> > could even be wrapped in a separate config or boot option that only
> > disables HMAC on extend if IMA (so we still get security for things
> > like sd-boot)
>
> I can buy that but with a twist that make it an opt-in kernel command
> line option. We don't want to take already existing functionality away
> from those who might want to use it (given e.g. hardening requirements),
> and with that basis opt-in (by default disabled) would be more balanced
> way to address the issue.
>
> Please do a send a patch!
I made a few measurements. I have a Fedora 38 VM with TPM passthrough.
Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
QEMU:
rc qemu-kvm 1:4.2-3ubuntu6.27
ii qemu-system-x86 1:6.2+dfsg-2ubuntu6.22
TPM2_PT_MANUFACTURER:
raw: 0x49465800
value: "IFX"
TPM2_PT_VENDOR_STRING_1:
raw: 0x534C4239
value: "SLB9"
TPM2_PT_VENDOR_STRING_2:
raw: 0x36373000
value: "670"
No HMAC:
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
0) | tpm2_pcr_extend() {
0) 1.112 us | tpm_buf_append_hmac_session();
0) # 6360.029 us | tpm_transmit_cmd();
0) # 6415.012 us | }
HMAC:
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
1) | tpm2_pcr_extend() {
1) | tpm2_start_auth_session() {
1) * 36976.99 us | tpm_transmit_cmd();
1) * 84746.51 us | tpm_transmit_cmd();
1) # 3195.083 us | tpm_transmit_cmd();
1) @ 126795.1 us | }
1) 2.254 us | tpm_buf_append_hmac_session();
1) 3.546 us | tpm_buf_fill_hmac_session();
1) * 24356.46 us | tpm_transmit_cmd();
1) 3.496 us | tpm_buf_check_hmac_response();
1) @ 151171.0 us | }
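The two traces above can be compared mechanically. The sketch below is an illustration rather than anything posted in the thread: it assumes each timed function_graph line has the shape `<cpu>) [marker] <duration> us | <call>`, where the marker is one of the tracer's overhead flags (such as `#`, `*` or `@`), and the helper names `durations_us` and `transmit_total_us` are hypothetical.

```python
import re

# Matches timed lines only; entry lines such as "1) | foo() {" carry no
# duration and are skipped, as are the "#"-prefixed header lines.
LINE = re.compile(
    r"^\s*(\d+)\)\s+(?:[+!#*@$]\s+)?([0-9.]+)\s+us\s+\|\s+(.+?)\s*$"
)


def durations_us(trace: str) -> list[tuple[str, float]]:
    """Return (call text, duration in microseconds) for each timed line."""
    out = []
    for line in trace.splitlines():
        m = LINE.match(line)
        if m:
            out.append((m.group(3), float(m.group(2))))
    return out


def transmit_total_us(trace: str) -> float:
    """Sum the time spent in tpm_transmit_cmd() calls."""
    return sum(d for name, d in durations_us(trace)
               if name.startswith("tpm_transmit_cmd"))


NO_HMAC = """\
0) | tpm2_pcr_extend() {
0) 1.112 us | tpm_buf_append_hmac_session();
0) # 6360.029 us | tpm_transmit_cmd();
0) # 6415.012 us | }
"""

HMAC = """\
1) | tpm2_pcr_extend() {
1) | tpm2_start_auth_session() {
1) * 36976.99 us | tpm_transmit_cmd();
1) * 84746.51 us | tpm_transmit_cmd();
1) # 3195.083 us | tpm_transmit_cmd();
1) @ 126795.1 us | }
1) 2.254 us | tpm_buf_append_hmac_session();
1) 3.546 us | tpm_buf_fill_hmac_session();
1) * 24356.46 us | tpm_transmit_cmd();
1) 3.496 us | tpm_buf_check_hmac_response();
1) @ 151171.0 us | }
"""

print(f"no HMAC: {transmit_total_us(NO_HMAC) / 1000:.1f} ms in tpm_transmit_cmd()")
print(f"HMAC:    {transmit_total_us(HMAC) / 1000:.1f} ms in tpm_transmit_cmd()")
```

Summing only the `tpm_transmit_cmd()` lines separates time spent waiting on the TPM from the kernel-side HMAC bookkeeping, which is in the microsecond range in both traces.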
Roberto
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-11 8:53 ` Roberto Sassu
@ 2024-09-11 12:21 ` James Bottomley
2024-09-12 13:16 ` Jarkko Sakkinen
2024-09-14 10:42 ` Jarkko Sakkinen
2024-09-11 15:14 ` Jarkko Sakkinen
1 sibling, 2 replies; 34+ messages in thread
From: James Bottomley @ 2024-09-11 12:21 UTC (permalink / raw)
To: Roberto Sassu, Jarkko Sakkinen, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
> On Tue, 2024-09-10 at 16:28 +0300, Jarkko Sakkinen wrote:
> > On Tue Sep 10, 2024 at 3:57 PM EEST, James Bottomley wrote:
> > > On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote:
> > > > On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> > > > > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > > > > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression
> > > > > > tracking
> > > > > > (Thorsten
> > > > > > Leemhuis) wrote:
> > > > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > > > >
> > > > > > > James, Jarkko, I noticed a report about a regression in
> > > > > > > bugzilla.kernel.org that appears to be caused by this
> > > > > > > change of
> > > > > > > yours:
> > > > > > >
> > > > > > > 6519fea6fd372b ("tpm: add hmac checks to
> > > > > > > tpm2_pcr_extend()")
> > > > > > > [v6.10-rc1]
> > > > > > >
> > > > > > > As many (most?) kernel developers don't keep an eye on
> > > > > > > the bug
> > > > > > > tracker,
> > > > > > > I decided to forward it by mail. To quote from
> > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > > > > > >
> > > > > > > > When secureboot is enabled,
> > > > > > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > > > > > it's ~7 seconds on 6.8 kernel version.
> > > > > > > >
> > > > > > > > When secureboot is disabled,
> > > > > > > > the boot time is ~7 seconds too.
> > > > > > > >
> > > > > > > > Reproduced on both AMD and Intel platform on ThinkPad
> > > > > > > > X1 and
> > > > > > > > T14.
> > > > > > > >
> > > > > > > > It probably caused autologin failure and micmute led
> > > > > > > > not
> > > > > > > > loaded on AMD platform.
> > > > > > >
> > > > > > > It was later bisected to the change mentioned above. See
> > > > > > > the
> > > > > > > ticket for
> > > > > > > more details.
> > > > > >
> > > > > > Hi
> > > > > >
> > > > > > I suspect I encountered the same problem:
> > > > > >
> > > > > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
> > > > > >
> > > > > > Going to provide more info there.
> > > > >
> > > > > I suppose you are going try to acquire the tracing data I
> > > > > asked?
> > > > > > That would be awesome, thanks for taking the trouble. Let's
> > > > > look
> > > > > at the data and draw conclusions based on that.
> > > > >
> > > > > Workaround is pretty simple: CONFIG_TCG_TPM2_HMAC=n to the
> > > > > kernel
> > > > > configuration disables the feature.
> > > > >
> > > > > For making decisions what to do with the we are talking
> > > > > about ~2
> > > > > week window estimated, given the Vienna conference slows
> > > > > things
> > > > > down, so I hope my workaround is good enough before that.
> > > >
> > > > I can enumerate three most likely ways to address the issue:
> > > >
> > > > 1. Strongest: drop from defconfig.
> > > > 2. Medium: leave to defconfig but add an opt-in kernel command-
> > > > line
> > > > parameter.
> > > > 3. Lightest: if we can based on tracing data nail the
> > > > regression in
> > > > sustainable schedule, fix it.
> > >
> > > Actually, there's a fourth: not use sessions for the PCR extend
> > > (if
> > > we'd got the timings when I asked, this was going to be my
> > > suggestion
> > > if they came back problematic). This seems only to be a problem
> > > for
> > > IMA measured boot (because it does lots of extends). If
> > > necessary this
> > > could even be wrapped in a separate config or boot option that
> > > only
> > > disables HMAC on extend if IMA (so we still get security for
> > > things
> > > like sd-boot)
> >
> > I can buy that but with a twist that make it an opt-in kernel
> > command
> > line option. We don't want to take already existing functionality
> > away
> > from those who might want to use it (given e.g. hardening
> > requirements),
> > and with that basis opt-in (by default disabled) would be more
> > balanced
> > way to address the issue.
> >
> > Please do a send a patch!
>
> I made few measurements. I have a Fedora 38 VM with TPM passthrough.
>
> Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
>
> QEMU:
>
> rc qemu-kvm 1:4.2-
> 3ubuntu6.27
> ii qemu-system-x86 1:6.2+dfsg-
> 2ubuntu6.22
>
>
> TPM2_PT_MANUFACTURER:
> raw: 0x49465800
> value: "IFX"
> TPM2_PT_VENDOR_STRING_1:
> raw: 0x534C4239
> value: "SLB9"
> TPM2_PT_VENDOR_STRING_2:
> raw: 0x36373000
> value: "670"
>
>
> No HMAC:
>
> # tracer: function_graph
> #
> # CPU DURATION FUNCTION CALLS
> # | | | | | | |
> 0) | tpm2_pcr_extend() {
> 0) 1.112 us | tpm_buf_append_hmac_session();
> 0) # 6360.029 us | tpm_transmit_cmd();
> 0) # 6415.012 us | }
>
>
> HMAC:
>
> # tracer: function_graph
> #
> # CPU DURATION FUNCTION CALLS
> # | | | | | | |
> 1) | tpm2_pcr_extend() {
> 1) | tpm2_start_auth_session() {
> 1) * 36976.99 us | tpm_transmit_cmd();
> 1) * 84746.51 us | tpm_transmit_cmd();
> 1) # 3195.083 us | tpm_transmit_cmd();
> 1) @ 126795.1 us | }
> 1) 2.254 us | tpm_buf_append_hmac_session();
> 1) 3.546 us | tpm_buf_fill_hmac_session();
> 1) * 24356.46 us | tpm_transmit_cmd();
> 1) 3.496 us | tpm_buf_check_hmac_response();
> 1) @ 151171.0 us | }

Well, unfortunately, that tells us that it's the TPM itself that's
taking the time processing the security overhead. The ordering of the
commands in tpm2_start_auth_session() shows

 37ms for context restore of null key
 85ms for start session with encrypted salt
  3ms to flush null key
-----
125ms

If we context save the session, we'd likely only bear a single 37ms
cost to restore it (replacing the total 125ms). However, there's
nothing we can do about the extend execution going from 6ms to 24ms, so
I could halve your current boot time with security enabled (it's
currently 149ms, it would go to 61ms, but it's still 10x slower than
the unsecured extend at 6ms)

James
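
The arithmetic in the reply above can be re-derived from the raw trace values. The following is only a sketch: the constant names are made up for illustration, the mapping of each tpm_transmit_cmd() call to a TPM operation follows the reading given in the mail, and the projected figure rests on the mail's own assumption that restoring a saved session would cost about as much as the null key context restore.

```python
# Durations in microseconds, copied from the function_graph trace quoted above.
# The names are illustrative; the trace only shows tpm_transmit_cmd() calls.
CONTEXT_RESTORE_NULL_KEY_US = 36976.99   # 1st command in tpm2_start_auth_session()
START_SESSION_US = 84746.51              # 2nd command (start session, encrypted salt)
FLUSH_NULL_KEY_US = 3195.083             # 3rd command
EXTEND_WITH_HMAC_US = 24356.46           # the extend itself, HMAC enabled
EXTEND_WITHOUT_HMAC_US = 6360.029        # the extend itself, HMAC disabled


def to_ms(us: float) -> float:
    """Convert microseconds to milliseconds."""
    return us / 1000.0


# Cost of setting up the session on every extend (the "125ms" in the mail).
session_setup_us = (
    CONTEXT_RESTORE_NULL_KEY_US + START_SESSION_US + FLUSH_NULL_KEY_US
)

# Current per-extend TPM time with HMAC (the "149ms" in the mail).
current_us = session_setup_us + EXTEND_WITH_HMAC_US

# Assumption taken from the mail: with a context-saved session, only one
# restore of roughly the same cost as the null key restore remains.
projected_us = CONTEXT_RESTORE_NULL_KEY_US + EXTEND_WITH_HMAC_US

# How much slower the projected path still is than the plain extend.
slowdown = projected_us / EXTEND_WITHOUT_HMAC_US

print(f"session setup: {to_ms(session_setup_us):.1f} ms")
print(f"current:       {to_ms(current_us):.1f} ms")
print(f"projected:     {to_ms(projected_us):.1f} ms")
print(f"slowdown vs. plain extend: {slowdown:.1f}x")
```

The sums only cover time spent in tpm_transmit_cmd(), which is why they come out slightly below the 151171.0 us total that the trace reports for the whole tpm2_pcr_extend() call.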
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-11 8:53 ` Roberto Sassu
2024-09-11 12:21 ` James Bottomley
@ 2024-09-11 15:14 ` Jarkko Sakkinen
2024-09-12 8:13 ` Roberto Sassu
1 sibling, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-11 15:14 UTC (permalink / raw)
To: Roberto Sassu, James Bottomley, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Wed Sep 11, 2024 at 11:53 AM EEST, Roberto Sassu wrote:
> I made few measurements. I have a Fedora 38 VM with TPM passthrough.

I was thinking more like

sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }'

For example, when running "tpm2_createprimary --hierarchy o -G rsa2048 -c owner.txt", I get:

Attaching 3 probes...
^C
@[
tpm_transmit_cmd+46
tpm2_flush_context+120
tpm2_commit_space+197
tpm_dev_transmit.constprop.0+137
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 2860677
@[
tpm_dev_transmit.constprop.0+111
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/16:1]: 3890693
@[
tpm_transmit_cmd+46
tpm2_load_context+195
tpm2_prepare_space+410
tpm_dev_transmit.constprop.0+54
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 9058524
@[
tpm_transmit_cmd+46
tpm2_save_context+179
tpm2_commit_space+314
tpm_dev_transmit.constprop.0+137
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 11426260
@[
tpm_transmit_cmd+46
tpm2_load_context+195
tpm2_prepare_space+318
tpm_dev_transmit.constprop.0+54
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 14182972
@[
tpm_transmit_cmd+46
tpm2_save_context+179
tpm2_commit_space+155
tpm_dev_transmit.constprop.0+137
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 22597059
@[
tpm_dev_transmit.constprop.0+111
tpm_dev_async_work+102
process_one_work+374
worker_thread+614
kthread+207
ret_from_fork+49
ret_from_fork_asm+26
, , kworker/4:2]: 1958500581

This gives stacks to compare, each with the total "real" time spent in
that stack (in nsecs). CPU time is a relevant measure for the problem
we're dealing with.

BR, Jarkko
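
Output in this shape is easier to compare once it is parsed and sorted. The following is a minimal sketch, assuming the map is printed exactly as in the sample above ("@[", one frame per line, then a closing ", <ustack>, comm]: value" line); the function name, the regular expression and the record layout are illustrative and not part of bpftrace.

```python
import re

# Closing line of one map entry, e.g. ", , kworker/4:2]: 2860677" (empty
# user stack) or ", cat]: 5273648" (user stack printed on earlier lines).
TAIL = re.compile(r"^,\s*(?:,\s*)?(?P<comm>[^\],]+)\]:\s*(?P<ns>\d+)$")


def parse_bpftrace_map(text):
    """Parse '@[kstack, ustack, comm]: ns' entries into a list of dicts."""
    records = []
    kstack = None        # kernel frames of the entry being read, if any
    in_kstack = False    # False once the lone "," separator was seen
    for raw in text.splitlines():
        line = raw.strip()
        if line == "@[":
            kstack, in_kstack = [], True
        elif kstack is None or not line:
            continue
        elif (m := TAIL.match(line)):
            records.append(
                {"kstack": kstack, "comm": m["comm"].strip(), "ns": int(m["ns"])}
            )
            kstack = None
        elif line == ",":
            in_kstack = False    # what follows are user-space frames
        elif in_kstack:
            kstack.append(line)
    return records


# Two entries copied from the sample output above.
SAMPLE = """
@[
    tpm_transmit_cmd+46
    tpm2_flush_context+120
    tpm2_commit_space+197
    tpm_dev_transmit.constprop.0+137
    tpm_dev_async_work+102
    process_one_work+374
    worker_thread+614
    kthread+207
    ret_from_fork+49
    ret_from_fork_asm+26
, , kworker/4:2]: 2860677
@[
    tpm_dev_transmit.constprop.0+111
    tpm_dev_async_work+102
    process_one_work+374
    worker_thread+614
    kthread+207
    ret_from_fork+49
    ret_from_fork_asm+26
, , kworker/4:2]: 1958500581
"""

records = parse_bpftrace_map(SAMPLE)
# Print the most expensive stacks first, with the time in milliseconds.
for rec in sorted(records, key=lambda r: r["ns"], reverse=True):
    print(f'{rec["ns"] / 1e6:10.3f} ms  {rec["comm"]:<12} {rec["kstack"][0]}')
```

Only the innermost kernel frame is printed per entry here; the full stack stays available in each record for anyone who wants to group by caller instead.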
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-11 15:14 ` Jarkko Sakkinen
@ 2024-09-12 8:13 ` Roberto Sassu
2024-09-12 14:23 ` Jarkko Sakkinen
` (2 more replies)
0 siblings, 3 replies; 34+ messages in thread
From: Roberto Sassu @ 2024-09-12 8:13 UTC (permalink / raw)
To: Jarkko Sakkinen, James Bottomley, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Wed, 2024-09-11 at 18:14 +0300, Jarkko Sakkinen wrote:
> On Wed Sep 11, 2024 at 11:53 AM EEST, Roberto Sassu wrote:
> > I made few measurements. I have a Fedora 38 VM with TPM passthrough.
>
> I was thinking more like
>
> sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }'
>
> For example when running "tpm2_createprimary --hierarchy o -G rsa2048 -c owner.txt", I get:
Sure:
Without HMAC:
@[
tpm_transmit_cmd+50
tpm2_pcr_extend+295
tpm_pcr_extend+221
ima_add_template_entry+437
ima_store_template+114
ima_store_measurement+209
process_measurement+2473
ima_file_check+82
security_file_post_open+92
path_openat+550
do_filp_open+171
do_sys_openat2+186
do_sys_open+76
__x64_sys_openat+35
x64_sys_call+9589
do_syscall_64+96
entry_SYSCALL_64_after_hwframe+118
,
0x7f338ee7be55
0x55bf24459ac2
0x7f338eda2b8a
0x7f338eda2c4b
0x55bf2445a9b5
, cat]: 5273648
With HMAC:
@[
tpm_transmit_cmd+50
tpm2_flush_context+95
tpm2_start_auth_session+676
tpm2_pcr_extend+39
tpm_pcr_extend+221
ima_add_template_entry+437
ima_store_template+114
ima_store_measurement+209
process_measurement+2473
ima_file_check+82
security_file_post_open+92
path_openat+550
do_filp_open+171
do_sys_openat2+186
do_sys_open+76
__x64_sys_openat+35
x64_sys_call+9589
do_syscall_64+96
entry_SYSCALL_64_after_hwframe+118
,
0x7f03ea0ade55
0x55f929b7dac2
0x7f03e9fd4b8a
0x7f03e9fd4c4b
0x55f929b7e9b5
, cat]: 3128177
@[
tpm_transmit_cmd+50
tpm2_pcr_extend+338
tpm_pcr_extend+221
ima_add_template_entry+437
ima_store_template+114
ima_store_measurement+209
process_measurement+2473
ima_file_check+82
security_file_post_open+92
path_openat+550
do_filp_open+171
do_sys_openat2+186
do_sys_open+76
__x64_sys_openat+35
x64_sys_call+9589
do_syscall_64+96
entry_SYSCALL_64_after_hwframe+118
,
0x7f03ea0ade55
0x55f929b7dac2
0x7f03e9fd4b8a
0x7f03e9fd4c4b
0x55f929b7e9b5
, cat]: 25851638
@[
tpm_transmit_cmd+50
tpm2_load_context+161
tpm2_start_auth_session+98
tpm2_pcr_extend+39
tpm_pcr_extend+221
ima_add_template_entry+437
ima_store_template+114
ima_store_measurement+209
process_measurement+2473
ima_file_check+82
security_file_post_open+92
path_openat+550
do_filp_open+171
do_sys_openat2+186
do_sys_open+76
__x64_sys_openat+35
x64_sys_call+9589
do_syscall_64+96
entry_SYSCALL_64_after_hwframe+118
,
0x7f03ea0ade55
0x55f929b7dac2
0x7f03e9fd4b8a
0x7f03e9fd4c4b
0x55f929b7e9b5
, cat]: 35928108
@[
tpm_transmit_cmd+50
tpm2_start_auth_session+650
tpm2_pcr_extend+39
tpm_pcr_extend+221
ima_add_template_entry+437
ima_store_template+114
ima_store_measurement+209
process_measurement+2473
ima_file_check+82
security_file_post_open+92
path_openat+550
do_filp_open+171
do_sys_openat2+186
do_sys_open+76
__x64_sys_openat+35
x64_sys_call+9589
do_syscall_64+96
entry_SYSCALL_64_after_hwframe+118
,
0x7f03ea0ade55
0x55f929b7dac2
0x7f03e9fd4b8a
0x7f03e9fd4c4b
0x55f929b7e9b5
, cat]: 84616611
Roberto
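
The two runs above can be put side by side by summing the per-stack totals. This is a rough sketch: the dictionary keys are just the function directly above tpm_transmit_cmd in each stack, and it assumes both runs cover the same workload (the values look like a single measured file open by cat, but the output does not state the call count).

```python
NS_PER_MS = 1_000_000

# Per-stack totals in nanoseconds, copied from the bpftrace output above.
# Key: the caller of tpm_transmit_cmd() in that stack.
without_hmac = {
    "tpm2_pcr_extend": 5_273_648,
}
with_hmac = {
    "tpm2_flush_context": 3_128_177,
    "tpm2_pcr_extend": 25_851_638,
    "tpm2_load_context": 35_928_108,
    "tpm2_start_auth_session": 84_616_611,
}

total_without = sum(without_hmac.values())
total_with = sum(with_hmac.values())

# Share of the HMAC run spent on session setup rather than on the extend.
setup_ns = total_with - with_hmac["tpm2_pcr_extend"]
setup_share = setup_ns / total_with

print(f"without HMAC: {total_without / NS_PER_MS:.1f} ms")
print(f"with HMAC:    {total_with / NS_PER_MS:.1f} ms")
print(f"ratio:        {total_with / total_without:.1f}x")
print(f"session setup share: {setup_share:.0%}")
```

The per-stack values line up with the function_graph trace quoted earlier in the thread (roughly 36 ms, 85 ms and 3 ms for session setup, plus about 25 ms for the extend).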
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-11 12:21 ` James Bottomley
@ 2024-09-12 13:16 ` Jarkko Sakkinen
2024-09-12 13:26 ` James Bottomley
2024-09-14 10:42 ` Jarkko Sakkinen
1 sibling, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-12 13:16 UTC (permalink / raw)
To: James Bottomley, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote:
> On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
> > On Tue, 2024-09-10 at 16:28 +0300, Jarkko Sakkinen wrote:
> > > On Tue Sep 10, 2024 at 3:57 PM EEST, James Bottomley wrote:
> > > > On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote:
> > > > > On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> > > > > > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > > > > > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression
> > > > > > > tracking
> > > > > > > (Thorsten
> > > > > > > Leemhuis) wrote:
> > > > > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > > > > >
> > > > > > > > James, Jarkoo, I noticed a report about a regression in
> > > > > > > > bugzilla.kernel.org that appears to be caused by this
> > > > > > > > change of
> > > > > > > > yours:
> > > > > > > >
> > > > > > > > 6519fea6fd372b ("tpm: add hmac checks to
> > > > > > > > tpm2_pcr_extend()")
> > > > > > > > [v6.10-rc1]
> > > > > > > >
> > > > > > > > As many (most?) kernel developers don't keep an eye on
> > > > > > > > the bug
> > > > > > > > tracker,
> > > > > > > > I decided to forward it by mail. To quote from
> > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > > > > > > >
> > > > > > > > > When secureboot is enabled,
> > > > > > > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > > > > > > it's ~7 seconds on 6.8 kernel version.
> > > > > > > > >
> > > > > > > > > When secureboot is disabled,
> > > > > > > > > the boot time is ~7 seconds too.
> > > > > > > > >
> > > > > > > > > Reproduced on both AMD and Intel platform on ThinkPad
> > > > > > > > > X1 and
> > > > > > > > > T14.
> > > > > > > > >
> > > > > > > > > It probably caused autologin failure and micmute led
> > > > > > > > > not
> > > > > > > > > loaded on AMD platform.
> > > > > > > >
> > > > > > > > It was later bisected to the change mentioned above. See
> > > > > > > > the
> > > > > > > > ticket for
> > > > > > > > more details.
> > > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > I suspect I encountered the same problem:
> > > > > > >
> > > > > > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
> > > > > > >
> > > > > > > Going to provide more info there.
> > > > > >
> > > > > > I suppose you are going try to acquire the tracing data I
> > > > > > asked?
> > > > > > That would be awesome, thanks for taking the troube. Let's
> > > > > > look
> > > > > > at the data and draw conclusions based on that.
> > > > > >
> > > > > > Workaround is pretty simple: CONFIG_TCG_TPM2_HMAC=n to the
> > > > > > kernel
> > > > > > configuration disables the feature.
> > > > > >
> > > > > > For making decisions what to do with the we are talking
> > > > > > about ~2
> > > > > > week window estimated, given the Vienna conference slows
> > > > > > things
> > > > > > down, so I hope my workaround is good enough before that.
> > > > >
> > > > > I can enumerate three most likely ways to address the issue:
> > > > >
> > > > > 1. Strongest: drop from defconfig.
> > > > > 2. Medium: leave to defconfig but add an opt-in kernel command-
> > > > > line
> > > > > parameter.
> > > > > 3. Lightest: if we can based on tracing data nail the
> > > > > regression in
> > > > > sustainable schedule, fix it.
> > > >
> > > > Actually, there's a fourth: not use sessions for the PCR extend
> > > > (if
> > > > we'd got the timings when I asked, this was going to be my
> > > > suggestion
> > > > if they came back problematic). This seems only to be a problem
> > > > for
> > > > IMA measured boot (because it does lots of extends). If
> > > > necessary this
> > > > could even be wrapped in a separate config or boot option that
> > > > only
> > > > disables HMAC on extend if IMA (so we still get security for
> > > > things
> > > > like sd-boot)
> > >
> > > I can buy that but with a twist that make it an opt-in kernel
> > > command
> > > line option. We don't want to take already existing functionality
> > > away
> > > from those who might want to use it (given e.g. hardening
> > > requirements),
> > > and with that basis opt-in (by default disabled) would be more
> > > balanced
> > > way to address the issue.
> > >
> > > Please do a send a patch!
> >
> > I made few measurements. I have a Fedora 38 VM with TPM passthrough.
> >
> > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
> >
> > QEMU:
> >
> > rc qemu-kvm 1:4.2-
> > 3ubuntu6.27
> > ii qemu-system-x86 1:6.2+dfsg-
> > 2ubuntu6.22
> >
> >
> > TPM2_PT_MANUFACTURER:
> > raw: 0x49465800
> > value: "IFX"
> > TPM2_PT_VENDOR_STRING_1:
> > raw: 0x534C4239
> > value: "SLB9"
> > TPM2_PT_VENDOR_STRING_2:
> > raw: 0x36373000
> > value: "670"
> >
> >
> > No HMAC:
> >
> > # tracer: function_graph
> > #
> > # CPU DURATION FUNCTION CALLS
> > # | | | | | | |
> > 0) | tpm2_pcr_extend() {
> > 0) 1.112 us | tpm_buf_append_hmac_session();
> > 0) # 6360.029 us | tpm_transmit_cmd();
> > 0) # 6415.012 us | }
> >
> >
> > HMAC:
> >
> > # tracer: function_graph
> > #
> > # CPU DURATION FUNCTION CALLS
> > # | | | | | | |
> > 1) | tpm2_pcr_extend() {
> > 1) | tpm2_start_auth_session() {
> > 1) * 36976.99 us | tpm_transmit_cmd();
> > 1) * 84746.51 us | tpm_transmit_cmd();
> > 1) # 3195.083 us | tpm_transmit_cmd();
> > 1) @ 126795.1 us | }
> > 1) 2.254 us | tpm_buf_append_hmac_session();
> > 1) 3.546 us | tpm_buf_fill_hmac_session();
> > 1) * 24356.46 us | tpm_transmit_cmd();
> > 1) 3.496 us | tpm_buf_check_hmac_response();
> > 1) @ 151171.0 us | }
>
> Well, unfortunately, that tells us that it's the TPM itself that's
> taking the time processing the security overhead. The ordering of the
> commands in tpm2_start_auth_session() shows
>
> 37ms for context restore of null key
> 85ms for start session with encrypted salt
> 3ms to flush null key
> -----
> 125ms
>
> If we context save the session, we'd likely only bear a single 37ms
> cost to restore it (replacing the total 125ms). However, there's
> nothing we can do about the extend execution going from 6ms to 24ms, so
> I could halve your current boot time with security enabled (it's
> currently 149ms, it would go to 61ms, but it's still 10x slower than
> the unsecured extend at 6ms)
>
> James
I'll hold for better benchmarks.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-12 13:16 ` Jarkko Sakkinen
@ 2024-09-12 13:26 ` James Bottomley
2024-09-12 13:36 ` Roberto Sassu
2024-09-12 14:26 ` Jarkko Sakkinen
0 siblings, 2 replies; 34+ messages in thread
From: James Bottomley @ 2024-09-12 13:26 UTC (permalink / raw)
To: Jarkko Sakkinen, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Thu, 2024-09-12 at 16:16 +0300, Jarkko Sakkinen wrote:
> On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote:
> > On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
[...]
> > > I made few measurements. I have a Fedora 38 VM with TPM
> > > passthrough.
> > >
> > > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
> > >
> > > QEMU:
> > >
> > > rc qemu-kvm 1:4.2-
> > > 3ubuntu6.27
> > > ii qemu-system-x86 1:6.2+dfsg-
> > > 2ubuntu6.22
> > >
> > >
> > > TPM2_PT_MANUFACTURER:
> > > raw: 0x49465800
> > > value: "IFX"
> > > TPM2_PT_VENDOR_STRING_1:
> > > raw: 0x534C4239
> > > value: "SLB9"
> > > TPM2_PT_VENDOR_STRING_2:
> > > raw: 0x36373000
> > > value: "670"
> > >
> > >
> > > No HMAC:
> > >
> > > # tracer: function_graph
> > > #
> > > # CPU DURATION FUNCTION CALLS
> > > # | | | | | | |
> > > 0) | tpm2_pcr_extend() {
> > > 0) 1.112 us | tpm_buf_append_hmac_session();
> > > 0) # 6360.029 us | tpm_transmit_cmd();
> > > 0) # 6415.012 us | }
> > >
> > >
> > > HMAC:
> > >
> > > # tracer: function_graph
> > > #
> > > # CPU DURATION FUNCTION CALLS
> > > # | | | | | | |
> > > 1) | tpm2_pcr_extend() {
> > > 1) | tpm2_start_auth_session() {
> > > 1) * 36976.99 us | tpm_transmit_cmd();
> > > 1) * 84746.51 us | tpm_transmit_cmd();
> > > 1) # 3195.083 us | tpm_transmit_cmd();
> > > 1) @ 126795.1 us | }
> > > 1) 2.254 us | tpm_buf_append_hmac_session();
> > > 1) 3.546 us | tpm_buf_fill_hmac_session();
> > > 1) * 24356.46 us | tpm_transmit_cmd();
> > > 1) 3.496 us | tpm_buf_check_hmac_response();
> > > 1) @ 151171.0 us | }
> >
> > Well, unfortunately, that tells us that it's the TPM itself that's
> > taking the time processing the security overhead. The ordering of
> > the commands in tpm2_start_auth_session() shows
> >
> > 37ms for context restore of null key
> > 85ms for start session with encrypted salt
> > 3ms to flush null key
> > -----
> > 125ms
> >
> > If we context save the session, we'd likely only bear a single 37ms
> > cost to restore it (replacing the total 125ms). However, there's
> > nothing we can do about the extend execution going from 6ms to
> > 24ms, so I could halve your current boot time with security enabled
> > (it's currently 149ms, it would go to 61ms, but it's still 10x
> > slower than the unsecured extend at 6ms)
> >
> > James
>
> I'll hold for better benchmarks.

Well, yes, I'd like to see this for a variety of TPMs.

This one clearly shows it's the real time wait for the TPM (since it
dwarfs the CPU time calculation, there's not much optimization we can
do on the kernel end). The one thing that's missing in all of this is
which TPM it was. But even if it's an outlier that's really bad at
crypto, what should we do? We could have a blacklist that turns off the
extend hmac (or a whitelist that turns it on), but we can't simply say
"too bad, you need a better TPM".

James
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-12 13:26 ` James Bottomley
@ 2024-09-12 13:36 ` Roberto Sassu
2024-09-12 14:13 ` James Bottomley
2024-09-12 14:26 ` Jarkko Sakkinen
1 sibling, 1 reply; 34+ messages in thread
From: Roberto Sassu @ 2024-09-12 13:36 UTC (permalink / raw)
To: James Bottomley, Jarkko Sakkinen, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Thu, 2024-09-12 at 09:26 -0400, James Bottomley wrote:
> On Thu, 2024-09-12 at 16:16 +0300, Jarkko Sakkinen wrote:
> > On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote:
> > > On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
> [...]
> > > > I made few measurements. I have a Fedora 38 VM with TPM
> > > > passthrough.
> > > >
> > > > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
> > > >
> > > > QEMU:
> > > >
> > > > rc qemu-kvm 1:4.2-
> > > > 3ubuntu6.27
> > > > ii qemu-system-x86 1:6.2+dfsg-
> > > > 2ubuntu6.22
> > > >
> > > >
> > > > TPM2_PT_MANUFACTURER:
> > > > raw: 0x49465800
> > > > value: "IFX"
> > > > TPM2_PT_VENDOR_STRING_1:
> > > > raw: 0x534C4239
> > > > value: "SLB9"
> > > > TPM2_PT_VENDOR_STRING_2:
> > > > raw: 0x36373000
> > > > value: "670"
> > > >
> > > >
> > > > No HMAC:
> > > >
> > > > # tracer: function_graph
> > > > #
> > > > # CPU DURATION FUNCTION CALLS
> > > > # | | | | | | |
> > > > 0) | tpm2_pcr_extend() {
> > > > 0) 1.112 us | tpm_buf_append_hmac_session();
> > > > 0) # 6360.029 us | tpm_transmit_cmd();
> > > > 0) # 6415.012 us | }
> > > >
> > > >
> > > > HMAC:
> > > >
> > > > # tracer: function_graph
> > > > #
> > > > # CPU DURATION FUNCTION CALLS
> > > > # | | | | | | |
> > > > 1) | tpm2_pcr_extend() {
> > > > 1) | tpm2_start_auth_session() {
> > > > 1) * 36976.99 us | tpm_transmit_cmd();
> > > > 1) * 84746.51 us | tpm_transmit_cmd();
> > > > 1) # 3195.083 us | tpm_transmit_cmd();
> > > > 1) @ 126795.1 us | }
> > > > 1) 2.254 us | tpm_buf_append_hmac_session();
> > > > 1) 3.546 us | tpm_buf_fill_hmac_session();
> > > > 1) * 24356.46 us | tpm_transmit_cmd();
> > > > 1) 3.496 us | tpm_buf_check_hmac_response();
> > > > 1) @ 151171.0 us | }
> > >
> > > Well, unfortunately, that tells us that it's the TPM itself that's
> > > taking the time processing the security overhead. The ordering of
> > > the commands in tpm2_start_auth_session() shows
> > >
> > > 37ms for context restore of null key
> > > 85ms for start session with encrypted salt
> > > 3ms to flush null key
> > > -----
> > > 125ms
> > >
> > > If we context save the session, we'd likely only bear a single 37ms
> > > cost to restore it (replacing the total 125ms). However, there's
> > > nothing we can do about the extend execution going from 6ms to
> > > 24ms, so I could halve your current boot time with security enabled
> > > (it's currently 149ms, it would go to 61ms, but it's still 10x
> > > slower than the unsecured extend at 6ms)
> > >
> > > James
> >
> > I'll hold for better benchmarks.
>
> Well, yes, I'd like to see this for a variety of TPMs.
>
> This one clearly shows it's the real time wait for the TPM (since it
> dwarfs the CPU time calculation there's not much optimization we can do
> on the kernel end). The one thing that's missing in all of this is
> what was the TPM? but even if it's an outlier that's really bad at
> crypto what should we do? We could have a blacklist that turns off the
> extend hmac (or a whitelist that turns it on), but we can't simply say
> too bad you need a better TPM.

Oops, sorry. I pasted the TPM properties, but that was not very clear:

Infineon Optiga SLB9670 (interpreting the properties).

Roberto
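
For reference, the "interpreting the properties" step amounts to reading each raw 32-bit value as up to four ASCII characters, most significant byte first, with NUL padding. A small sketch follows; the function name is made up for illustration.

```python
def decode_tpm_property(raw: int) -> str:
    """Read a 32-bit TPM property value as big-endian, NUL-padded ASCII."""
    return raw.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


# Raw values copied from the property dump quoted above.
manufacturer = decode_tpm_property(0x49465800)   # TPM2_PT_MANUFACTURER
vendor_1 = decode_tpm_property(0x534C4239)       # TPM2_PT_VENDOR_STRING_1
vendor_2 = decode_tpm_property(0x36373000)       # TPM2_PT_VENDOR_STRING_2

# The vendor strings are consecutive pieces of one name.
model = vendor_1 + vendor_2

print(manufacturer, model)
```

The decoded strings match the "value:" lines in the dump: "IFX" for the manufacturer, and "SLB9" plus "670" for the model.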
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-12 13:36 ` Roberto Sassu
@ 2024-09-12 14:13 ` James Bottomley
2024-09-12 14:52 ` Roberto Sassu
0 siblings, 1 reply; 34+ messages in thread
From: James Bottomley @ 2024-09-12 14:13 UTC (permalink / raw)
To: Roberto Sassu, Jarkko Sakkinen, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Thu, 2024-09-12 at 15:36 +0200, Roberto Sassu wrote:
> On Thu, 2024-09-12 at 09:26 -0400, James Bottomley wrote:
> > On Thu, 2024-09-12 at 16:16 +0300, Jarkko Sakkinen wrote:
> > > On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote:
> > > > On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
> > [...]
> > > > > I made few measurements. I have a Fedora 38 VM with TPM
> > > > > passthrough.
> > > > >
> > > > > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
> > > > >
> > > > > QEMU:
> > > > >
> > > > > rc qemu-kvm 1:4.2-
> > > > > 3ubuntu6.27
> > > > > ii qemu-system-x86
> > > > > 1:6.2+dfsg-
> > > > > 2ubuntu6.22
> > > > >
> > > > >
> > > > > TPM2_PT_MANUFACTURER:
> > > > > raw: 0x49465800
> > > > > value: "IFX"
> > > > > TPM2_PT_VENDOR_STRING_1:
> > > > > raw: 0x534C4239
> > > > > value: "SLB9"
> > > > > TPM2_PT_VENDOR_STRING_2:
> > > > > raw: 0x36373000
> > > > > value: "670"
> > > > >
> > > > >
> > > > > No HMAC:
> > > > >
> > > > > # tracer: function_graph
> > > > > #
> > > > > # CPU DURATION FUNCTION CALLS
> > > > > # | | | | | | |
> > > > > 0) | tpm2_pcr_extend() {
> > > > > 0) 1.112 us | tpm_buf_append_hmac_session();
> > > > > 0) # 6360.029 us | tpm_transmit_cmd();
> > > > > 0) # 6415.012 us | }
> > > > >
> > > > >
> > > > > HMAC:
> > > > >
> > > > > # tracer: function_graph
> > > > > #
> > > > > # CPU DURATION FUNCTION CALLS
> > > > > # | | | | | | |
> > > > > 1) | tpm2_pcr_extend() {
> > > > > 1) | tpm2_start_auth_session() {
> > > > > 1) * 36976.99 us | tpm_transmit_cmd();
> > > > > 1) * 84746.51 us | tpm_transmit_cmd();
> > > > > 1) # 3195.083 us | tpm_transmit_cmd();
> > > > > 1) @ 126795.1 us | }
> > > > > 1) 2.254 us | tpm_buf_append_hmac_session();
> > > > > 1) 3.546 us | tpm_buf_fill_hmac_session();
> > > > > 1) * 24356.46 us | tpm_transmit_cmd();
> > > > > 1) 3.496 us | tpm_buf_check_hmac_response();
> > > > > 1) @ 151171.0 us | }
> > > >
> > > > Well, unfortunately, that tells us that it's the TPM itself
> > > > that's
> > > > taking the time processing the security overhead. The ordering
> > > > of
> > > > the commands in tpm2_start_auth_session() shows
> > > >
> > > > 37ms for context restore of null key
> > > > 85ms for start session with encrypted salt
> > > > 3ms to flush null key
> > > > -----
> > > > 125ms
> > > >
> > > > If we context save the session, we'd likely only bear a single
> > > > 37ms
> > > > cost to restore it (replacing the total 125ms). However,
> > > > there's
> > > > nothing we can do about the extend execution going from 6ms to
> > > > 24ms, so I could halve your current boot time with security
> > > > enabled
> > > > (it's currently 149ms, it would go to 61ms, but it's still 10x
> > > > slower than the unsecured extend at 6ms)
> > > >
> > > > James
> > >
> > > I'll hold for better benchmarks.
> >
> > Well, yes, I'd like to see this for a variety of TPMs.
> >
> > This one clearly shows it's the real time wait for the TPM (since
> > it dwarfs the CPU time calculation there's not much optimization we
> > can do on the kernel end). The one thing that's missing in all of
> > this is what was the TPM? but even if it's an outlier that's
> > really bad at crypto what should we do? We could have a blacklist
> > that turns off the extend hmac (or a whitelist that turns it on),
> > but we can't simply say too bad you need a better TPM.
>
> Ops, sorry. I pasted the TPM properties. Was not that clear:
>
> Infineon Optiga SLB9670 (interpreting the properties).

OK, that's reasonably modern and common:

https://www.infineon.com/cms/en/product/security-smart-card-solutions/optiga-embedded-security-solutions/optiga-tpm/

I assume it's one of the Q20 parts (otherwise it would be a TPM 1.2),
but what is the firmware version? Could it be upgraded and the tests
re-run to see if that makes a difference?

I also need the IMA community to start thinking about what they're
willing to accept in terms of performance for the added security that
HMAC brings to TPM extends.

James
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-12 8:13 ` Roberto Sassu
@ 2024-09-12 14:23 ` Jarkko Sakkinen
2024-09-13 20:50 ` Jarkko Sakkinen
2024-09-15 9:43 ` Jarkko Sakkinen
2 siblings, 0 replies; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-12 14:23 UTC (permalink / raw)
To: Roberto Sassu, James Bottomley, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Thu Sep 12, 2024 at 11:13 AM EEST, Roberto Sassu wrote:
> On Wed, 2024-09-11 at 18:14 +0300, Jarkko Sakkinen wrote:
> > On Wed Sep 11, 2024 at 11:53 AM EEST, Roberto Sassu wrote:
> > > I made few measurements. I have a Fedora 38 VM with TPM passthrough.
> >
> > I was thinking more like
> >
> > sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }'
> >
> > For example when running "tpm2_createprimary --hierarchy o -G rsa2048 -c owner.txt", I get:
>
> Sure:
>
> Without HMAC:
>
> @[
> tpm_transmit_cmd+50
> tpm2_pcr_extend+295
> tpm_pcr_extend+221
> ima_add_template_entry+437
> ima_store_template+114
> ima_store_measurement+209
> process_measurement+2473
> ima_file_check+82
> security_file_post_open+92
> path_openat+550
> do_filp_open+171
> do_sys_openat2+186
> do_sys_open+76
> __x64_sys_openat+35
> x64_sys_call+9589
> do_syscall_64+96
> entry_SYSCALL_64_after_hwframe+118
> ,
> 0x7f338ee7be55
> 0x55bf24459ac2
> 0x7f338eda2b8a
> 0x7f338eda2c4b
> 0x55bf2445a9b5
> , cat]: 5273648
>
>
> With HMAC:
>
> @[
> tpm_transmit_cmd+50
> tpm2_flush_context+95
> tpm2_start_auth_session+676
> tpm2_pcr_extend+39
> tpm_pcr_extend+221
> ima_add_template_entry+437
> ima_store_template+114
> ima_store_measurement+209
> process_measurement+2473
> ima_file_check+82
> security_file_post_open+92
> path_openat+550
> do_filp_open+171
> do_sys_openat2+186
> do_sys_open+76
> __x64_sys_openat+35
> x64_sys_call+9589
> do_syscall_64+96
> entry_SYSCALL_64_after_hwframe+118
> ,
> 0x7f03ea0ade55
> 0x55f929b7dac2
> 0x7f03e9fd4b8a
> 0x7f03e9fd4c4b
> 0x55f929b7e9b5
> , cat]: 3128177
> @[
> tpm_transmit_cmd+50
> tpm2_pcr_extend+338
> tpm_pcr_extend+221
> ima_add_template_entry+437
> ima_store_template+114
> ima_store_measurement+209
> process_measurement+2473
> ima_file_check+82
> security_file_post_open+92
> path_openat+550
> do_filp_open+171
> do_sys_openat2+186
> do_sys_open+76
> __x64_sys_openat+35
> x64_sys_call+9589
> do_syscall_64+96
> entry_SYSCALL_64_after_hwframe+118
> ,
> 0x7f03ea0ade55
> 0x55f929b7dac2
> 0x7f03e9fd4b8a
> 0x7f03e9fd4c4b
> 0x55f929b7e9b5
> , cat]: 25851638
> @[
> tpm_transmit_cmd+50
> tpm2_load_context+161
> tpm2_start_auth_session+98
> tpm2_pcr_extend+39
> tpm_pcr_extend+221
> ima_add_template_entry+437
> ima_store_template+114
> ima_store_measurement+209
> process_measurement+2473
> ima_file_check+82
> security_file_post_open+92
> path_openat+550
> do_filp_open+171
> do_sys_openat2+186
> do_sys_open+76
> __x64_sys_openat+35
> x64_sys_call+9589
> do_syscall_64+96
> entry_SYSCALL_64_after_hwframe+118
> ,
> 0x7f03ea0ade55
> 0x55f929b7dac2
> 0x7f03e9fd4b8a
> 0x7f03e9fd4c4b
> 0x55f929b7e9b5
> , cat]: 35928108
> @[
> tpm_transmit_cmd+50
> tpm2_start_auth_session+650
> tpm2_pcr_extend+39
> tpm_pcr_extend+221
> ima_add_template_entry+437
> ima_store_template+114
> ima_store_measurement+209
> process_measurement+2473
> ima_file_check+82
> security_file_post_open+92
> path_openat+550
> do_filp_open+171
> do_sys_openat2+186
> do_sys_open+76
> __x64_sys_openat+35
> x64_sys_call+9589
> do_syscall_64+96
> entry_SYSCALL_64_after_hwframe+118
> ,
> 0x7f03ea0ade55
> 0x55f929b7dac2
> 0x7f03e9fd4b8a
> 0x7f03e9fd4c4b
> 0x55f929b7e9b5
> , cat]: 84616611
>
> Roberto

I'll look into this tomorrow, thank you.

BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-12 13:26 ` James Bottomley
2024-09-12 13:36 ` Roberto Sassu
@ 2024-09-12 14:26 ` Jarkko Sakkinen
1 sibling, 0 replies; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-12 14:26 UTC (permalink / raw)
To: James Bottomley, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Thu Sep 12, 2024 at 4:26 PM EEST, James Bottomley wrote:
> On Thu, 2024-09-12 at 16:16 +0300, Jarkko Sakkinen wrote:
> > On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote:
> > > On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
> [...]
> > > > I made few measurements. I have a Fedora 38 VM with TPM
> > > > passthrough.
> > > >
> > > > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
> > > >
> > > > QEMU:
> > > >
> > > > rc qemu-kvm 1:4.2-
> > > > 3ubuntu6.27
> > > > ii qemu-system-x86 1:6.2+dfsg-
> > > > 2ubuntu6.22
> > > >
> > > >
> > > > TPM2_PT_MANUFACTURER:
> > > > raw: 0x49465800
> > > > value: "IFX"
> > > > TPM2_PT_VENDOR_STRING_1:
> > > > raw: 0x534C4239
> > > > value: "SLB9"
> > > > TPM2_PT_VENDOR_STRING_2:
> > > > raw: 0x36373000
> > > > value: "670"
> > > >
> > > >
> > > > No HMAC:
> > > >
> > > > # tracer: function_graph
> > > > #
> > > > # CPU DURATION FUNCTION CALLS
> > > > # | | | | | | |
> > > > 0) | tpm2_pcr_extend() {
> > > > 0) 1.112 us | tpm_buf_append_hmac_session();
> > > > 0) # 6360.029 us | tpm_transmit_cmd();
> > > > 0) # 6415.012 us | }
> > > >
> > > >
> > > > HMAC:
> > > >
> > > > # tracer: function_graph
> > > > #
> > > > # CPU DURATION FUNCTION CALLS
> > > > # | | | | | | |
> > > > 1) | tpm2_pcr_extend() {
> > > > 1) | tpm2_start_auth_session() {
> > > > 1) * 36976.99 us | tpm_transmit_cmd();
> > > > 1) * 84746.51 us | tpm_transmit_cmd();
> > > > 1) # 3195.083 us | tpm_transmit_cmd();
> > > > 1) @ 126795.1 us | }
> > > > 1) 2.254 us | tpm_buf_append_hmac_session();
> > > > 1) 3.546 us | tpm_buf_fill_hmac_session();
> > > > 1) * 24356.46 us | tpm_transmit_cmd();
> > > > 1) 3.496 us | tpm_buf_check_hmac_response();
> > > > 1) @ 151171.0 us | }
> > >
> > > Well, unfortunately, that tells us that it's the TPM itself that's
> > > taking the time processing the security overhead. The ordering of
> > > the commands in tpm2_start_auth_session() shows
> > >
> > > 37ms for context restore of null key
> > > 85ms for start session with encrypted salt
> > > 3ms to flush null key
> > > -----
> > > 125ms
> > >
> > > If we context save the session, we'd likely only bear a single 37ms
> > > cost to restore it (replacing the total 125ms). However, there's
> > > nothing we can do about the extend execution going from 6ms to
> > > 24ms, so I could halve your current boot time with security enabled
> > > (it's currently 149ms, it would go to 61ms, but it's still 10x
> > > slower than the unsecured extend at 6ms)
> > >
> > > James
> >
> > I'll hold for better benchmarks.
>
> Well, yes, I'd like to see this for a variety of TPMs.
>
> This one clearly shows it's the real time wait for the TPM (since it
> dwarfs the CPU time calculation there's not much optimization we can do
> on the kernel end). The one thing that's missing in all of this is
> what was the TPM? but even if it's an outlier that's really bad at
> crypto what should we do? We could have a blacklist that turns off the
> extend hmac (or a whitelist that turns it on), but we can't simply say
> too bad you need a better TPM.
>
> James
Here is the one-liner I posted yesterday ;-)
sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }'
If you have a fix candidate, a snippet of the output before and after
would work as rationale too.
I'll look into the data Roberto sent me tomorrow.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-12 14:13 ` James Bottomley
@ 2024-09-12 14:52 ` Roberto Sassu
0 siblings, 0 replies; 34+ messages in thread
From: Roberto Sassu @ 2024-09-12 14:52 UTC (permalink / raw)
To: James Bottomley, Jarkko Sakkinen, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Thu, 2024-09-12 at 10:13 -0400, James Bottomley wrote:
> On Thu, 2024-09-12 at 15:36 +0200, Roberto Sassu wrote:
> > On Thu, 2024-09-12 at 09:26 -0400, James Bottomley wrote:
> > > On Thu, 2024-09-12 at 16:16 +0300, Jarkko Sakkinen wrote:
> > > > On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote:
> > > > > On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
> > > [...]
> > > > > > I made few measurements. I have a Fedora 38 VM with TPM
> > > > > > passthrough.
> > > > > >
> > > > > > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
> > > > > >
> > > > > > QEMU:
> > > > > >
> > > > > > rc qemu-kvm 1:4.2-3ubuntu6.27
> > > > > > ii qemu-system-x86 1:6.2+dfsg-2ubuntu6.22
> > > > > >
> > > > > >
> > > > > > TPM2_PT_MANUFACTURER:
> > > > > > raw: 0x49465800
> > > > > > value: "IFX"
> > > > > > TPM2_PT_VENDOR_STRING_1:
> > > > > > raw: 0x534C4239
> > > > > > value: "SLB9"
> > > > > > TPM2_PT_VENDOR_STRING_2:
> > > > > > raw: 0x36373000
> > > > > > value: "670"
> > > > > >
> > > > > >
> > > > > > No HMAC:
> > > > > >
> > > > > > # tracer: function_graph
> > > > > > #
> > > > > > # CPU DURATION FUNCTION CALLS
> > > > > > # | | | | | | |
> > > > > > 0) | tpm2_pcr_extend() {
> > > > > > 0) 1.112 us | tpm_buf_append_hmac_session();
> > > > > > 0) # 6360.029 us | tpm_transmit_cmd();
> > > > > > 0) # 6415.012 us | }
> > > > > >
> > > > > >
> > > > > > HMAC:
> > > > > >
> > > > > > # tracer: function_graph
> > > > > > #
> > > > > > # CPU DURATION FUNCTION CALLS
> > > > > > # | | | | | | |
> > > > > > 1) | tpm2_pcr_extend() {
> > > > > > 1) | tpm2_start_auth_session() {
> > > > > > 1) * 36976.99 us | tpm_transmit_cmd();
> > > > > > 1) * 84746.51 us | tpm_transmit_cmd();
> > > > > > 1) # 3195.083 us | tpm_transmit_cmd();
> > > > > > 1) @ 126795.1 us | }
> > > > > > 1) 2.254 us | tpm_buf_append_hmac_session();
> > > > > > 1) 3.546 us | tpm_buf_fill_hmac_session();
> > > > > > 1) * 24356.46 us | tpm_transmit_cmd();
> > > > > > 1) 3.496 us | tpm_buf_check_hmac_response();
> > > > > > 1) @ 151171.0 us | }
> > > > >
> > > > > Well, unfortunately, that tells us that it's the TPM itself that's
> > > > > taking the time processing the security overhead. The ordering of
> > > > > the commands in tpm2_start_auth_session() shows
> > > > >
> > > > > 37ms for context restore of null key
> > > > > 85ms for start session with encrypted salt
> > > > > 3ms to flush null key
> > > > > -----
> > > > > 125ms
> > > > >
> > > > > If we context save the session, we'd likely only bear a single 37ms
> > > > > cost to restore it (replacing the total 125ms). However, there's
> > > > > nothing we can do about the extend execution going from 6ms to
> > > > > 24ms, so I could halve your current boot time with security enabled
> > > > > (it's currently 149ms, it would go to 61ms, but it's still 10x
> > > > > slower than the unsecured extend at 6ms)
> > > > >
> > > > > James
> > > >
> > > > I'll hold for better benchmarks.
> > >
> > > Well, yes, I'd like to see this for a variety of TPMs.
> > >
> > > This one clearly shows it's the real time wait for the TPM (since
> > > it dwarfs the CPU time calculation there's not much optimization we
> > > can do on the kernel end). The one thing that's missing in all of
> > > this is what was the TPM? but even if it's an outlier that's
> > > really bad at crypto what should we do? We could have a blacklist
> > > that turns off the extend hmac (or a whitelist that turns it on),
> > > but we can't simply say too bad you need a better TPM.
> >
> > Ops, sorry. I pasted the TPM properties. Was not that clear:
> >
> > Infineon Optiga SLB9670 (interpreting the properties).
>
> OK, that's reasonably modern and common:
>
> https://www.infineon.com/cms/en/product/security-smart-card-solutions/optiga-embedded-security-solutions/optiga-tpm/
>
> I assume it's one of the Q20 (otherwise it would be a TPM 1.2) but what
> firmware version (as in could it be upgraded and the tests re-run to
> see if that makes a difference).
>
> I also need the IMA community to start thinking about what they're
> willing to accept in terms of performance for the added security hmac
> brings to TPM extends.
Just out of curiosity, I compared the boot time of Fedora 38 (minimal
installation) with and without HMAC enabled, and with and without the
Integrity Digest Cache [1], which I originally designed for exactly this
purpose (one measurement per package):
Without HMAC:
Without Integrity Digest Cache:
[root@fedora ~]# systemd-analyze
Startup finished in 2.486s (kernel) + 3.594s (initrd) + 11.613s (userspace) = 17.694s
multi-user.target reached after 11.559s in userspace.
[root@fedora ~]# cat /sys/kernel/security/ima/ascii_runtime_measurements|wc -l
444
With Integrity Digest Cache:
[root@fedora ~]# systemd-analyze
Startup finished in 2.381s (kernel) + 3.469s (initrd) + 11.794s (userspace) = 17.644s
multi-user.target reached after 11.750s in userspace.
[root@fedora ~]# cat /sys/kernel/security/ima/ascii_runtime_measurements|wc -l
218
With HMAC:
Without Integrity Digest Cache:
[root@fedora ~]# systemd-analyze
Startup finished in 2.911s (kernel) + 3.453s (initrd) + 1min 5.754s (userspace) = 1min 12.119s
multi-user.target reached after 1min 5.707s in userspace.
[root@fedora ~]# cat /sys/kernel/security/ima/ascii_runtime_measurements|wc -l
444
With Integrity Digest Cache:
[root@fedora ~]# systemd-analyze
Startup finished in 2.990s (kernel) + 3.462s (initrd) + 37.038s (userspace) = 43.491s
multi-user.target reached after 36.997s in userspace.
[root@fedora ~]# cat /sys/kernel/security/ima/ascii_runtime_measurements|wc -l
218
[1]: https://lore.kernel.org/linux-integrity/20240905150543.3766895-1-roberto.sassu@huaweicloud.com/
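As a rough cross-check of the numbers above, here is a minimal Python sketch. It assumes that the whole boot-time difference comes from PCR extends and that each IMA measurement triggers exactly one extend; both are simplifications, and the function name is made up for illustration.

```python
# Figures copied from the systemd-analyze and "wc -l" output above.
# label: (boot time without HMAC [s], boot time with HMAC [s], IMA measurements)
BOOTS = {
    "without Integrity Digest Cache": (17.694, 72.119, 444),
    "with Integrity Digest Cache": (17.644, 43.491, 218),
}


def extra_ms_per_measurement(no_hmac_s, hmac_s, measurements):
    """Boot time added by HMAC per IMA measurement, in milliseconds."""
    return (hmac_s - no_hmac_s) * 1000.0 / measurements


if __name__ == "__main__":
    for label, (no_hmac_s, hmac_s, count) in BOOTS.items():
        extra = extra_ms_per_measurement(no_hmac_s, hmac_s, count)
        print(f"{label}: {extra:.1f} ms extra per measurement")
```

Both configurations come out at roughly 120 ms of added time per measurement, which is the same order of magnitude as the per-extend overhead in the function_graph trace quoted earlier.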
Roberto
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-12 8:13 ` Roberto Sassu
2024-09-12 14:23 ` Jarkko Sakkinen
@ 2024-09-13 20:50 ` Jarkko Sakkinen
2024-09-13 22:06 ` Jarkko Sakkinen
2024-09-15 9:43 ` Jarkko Sakkinen
2 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-13 20:50 UTC (permalink / raw)
To: Roberto Sassu, Jarkko Sakkinen, James Bottomley,
Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Thu Sep 12, 2024 at 11:13 AM EEST, Roberto Sassu wrote:
> On Wed, 2024-09-11 at 18:14 +0300, Jarkko Sakkinen wrote:
> > On Wed Sep 11, 2024 at 11:53 AM EEST, Roberto Sassu wrote:
> > > I made few measurements. I have a Fedora 38 VM with TPM passthrough.
> >
> > I was thinking more like
> >
> > sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }'
> >
> > For example when running "tpm2_createprimary --hierarchy o -G rsa2048 -c owner.txt", I get:
>
> Sure:
It took a couple of days to upgrade my Buildroot environment to include
bcc and bpftrace [1], but I finally got similar figures (not the same
test, but one doing extends).
Summarizing your results by looking at the call preceding tpm_transmit:
- HMAC management: 124 ms
- extend with HMAC: 25 ms
- extend without HMAC: 5.2 ms
As I see it, the only possible way to fix this would be to refactor the
HMAC implementation so that the caller is always the orchestrator, which
would allow the continueSession flag to be used with the session started
by TPM2_StartAuthSession.
For example, if you do multiple extends, there should be no good reason
to set up and tear down a session for each call separately, right?
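To put numbers on the question above, here is a minimal Python sketch of the cost model. It uses only the three figures from the summary; the shared-session case is hypothetical and assumes that a reused session costs nothing beyond the one-time setup, which a real TPM may not honour.

```python
# Per-call costs in milliseconds, taken from the summary above.
SESSION_SETUP_MS = 124.0  # HMAC management
EXTEND_HMAC_MS = 25.0     # extend with HMAC
EXTEND_PLAIN_MS = 5.2     # extend without HMAC


def per_call_session_ms(n):
    """Every extend sets up and tears down its own session."""
    return n * (SESSION_SETUP_MS + EXTEND_HMAC_MS)


def shared_session_ms(n):
    """Hypothetical: the session is set up once and reused for all n extends."""
    return SESSION_SETUP_MS + n * EXTEND_HMAC_MS if n > 0 else 0.0


def plain_ms(n):
    """No HMAC session at all."""
    return n * EXTEND_PLAIN_MS


if __name__ == "__main__":
    n = 100  # arbitrary example count, not a measured value
    print(f"per-call session: {per_call_session_ms(n):.0f} ms")
    print(f"shared session:   {shared_session_ms(n):.0f} ms")
    print(f"no HMAC:          {plain_ms(n):.0f} ms")
```

Under these assumptions the per-call setup dominates: for 100 extends the model gives 14900 ms with a session per call against 2624 ms with one shared session.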
[1] https://codeberg.org/jarkko/linux-tpmdd-test
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-13 20:50 ` Jarkko Sakkinen
@ 2024-09-13 22:06 ` Jarkko Sakkinen
0 siblings, 0 replies; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-13 22:06 UTC (permalink / raw)
To: Jarkko Sakkinen, Roberto Sassu, James Bottomley,
Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Fri Sep 13, 2024 at 11:50 PM EEST, Jarkko Sakkinen wrote:
> On Thu Sep 12, 2024 at 11:13 AM EEST, Roberto Sassu wrote:
> > On Wed, 2024-09-11 at 18:14 +0300, Jarkko Sakkinen wrote:
> > > On Wed Sep 11, 2024 at 11:53 AM EEST, Roberto Sassu wrote:
> > > > I made few measurements. I have a Fedora 38 VM with TPM passthrough.
> > >
> > > I was thinking more like
> > >
> > > sudo bpftrace -e 'k:tpm_transmit { @start[tid] = nsecs; } kr:tpm_transmit { @[kstack, ustack, comm] = sum(nsecs - @start[tid]); delete(@start[tid]); } END { clear(@start); }'
> > >
> > > For example when running "tpm2_createprimary --hierarchy o -G rsa2048 -c owner.txt", I get:
> >
> > Sure:
>
> Took couple of days to upgrade my BuildRoot environment to have bcc and
> bpftrace [1] but finally got similar figures (not the same test but doing
> extends).
>
> Summarizing your results looking at call before tpm_transmit:
>
> - HMAC management: 124 ms
> - extend with HMAC: 25 ms
> - extend without HMAC: 5.2 ms
>
> I'd see the only possible way to fix this would be refactor the HMAC
> implementation by making the caller always the orchestrator and thus
> allowing to use continueSession flag for TPM2_StartAuthSession to be
> used.
>
> For example if you do multiple extends there should not be good reason
> to setup and rollback session for each call separately right?
>
> [1] https://codeberg.org/jarkko/linux-tpmdd-test
Note that the timings are accumulated (summed per stack by the
one-liner), not averaged. It would be easy to fix this, though.
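For illustration, here is a minimal Python sketch of the difference between an accumulated and an averaged figure. The sample durations are made up; only the aggregation logic matters. In the bpftrace one-liner itself, swapping sum() for avg() or stats() should have the same effect, although that has not been tried here.

```python
from collections import defaultdict


def aggregate(samples):
    """Map each key to (accumulated ns, number of calls, average ns)."""
    total = defaultdict(int)
    count = defaultdict(int)
    for key, duration_ns in samples:
        total[key] += duration_ns
        count[key] += 1
    return {key: (total[key], count[key], total[key] / count[key]) for key in total}


if __name__ == "__main__":
    # Made-up durations in nanoseconds, keyed by a hypothetical caller name.
    samples = [
        ("tpm2_start_auth_session", 80_000_000),
        ("tpm2_start_auth_session", 90_000_000),
        ("tpm2_pcr_extend", 24_000_000),
    ]
    for key, (total_ns, calls, mean_ns) in aggregate(samples).items():
        print(f"{key}: sum={total_ns} ns over {calls} calls, avg={mean_ns:.0f} ns")
```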
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-11 12:21 ` James Bottomley
2024-09-12 13:16 ` Jarkko Sakkinen
@ 2024-09-14 10:42 ` Jarkko Sakkinen
2024-09-14 10:51 ` Jarkko Sakkinen
1 sibling, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-14 10:42 UTC (permalink / raw)
To: James Bottomley, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Wed Sep 11, 2024 at 3:21 PM EEST, James Bottomley wrote:
> On Wed, 2024-09-11 at 10:53 +0200, Roberto Sassu wrote:
> > On Tue, 2024-09-10 at 16:28 +0300, Jarkko Sakkinen wrote:
> > > On Tue Sep 10, 2024 at 3:57 PM EEST, James Bottomley wrote:
> > > > On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote:
> > > > > On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> > > > > > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > > > > > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking
> > > > > > > (Thorsten Leemhuis) wrote:
> > > > > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > > > > >
> > > > > > > > James, Jarkoo, I noticed a report about a regression in
> > > > > > > > bugzilla.kernel.org that appears to be caused by this change of
> > > > > > > > yours:
> > > > > > > >
> > > > > > > > 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()")
> > > > > > > > [v6.10-rc1]
> > > > > > > >
> > > > > > > > As many (most?) kernel developers don't keep an eye on the bug
> > > > > > > > tracker, I decided to forward it by mail. To quote from
> > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > > > > > > >
> > > > > > > > > When secureboot is enabled,
> > > > > > > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > > > > > > it's ~7 seconds on 6.8 kernel version.
> > > > > > > > >
> > > > > > > > > When secureboot is disabled,
> > > > > > > > > the boot time is ~7 seconds too.
> > > > > > > > >
> > > > > > > > > Reproduced on both AMD and Intel platform on ThinkPad X1 and
> > > > > > > > > T14.
> > > > > > > > >
> > > > > > > > > It probably caused autologin failure and micmute led not
> > > > > > > > > loaded on AMD platform.
> > > > > > > >
> > > > > > > > It was later bisected to the change mentioned above. See the
> > > > > > > > ticket for more details.
> > > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > I suspect I encountered the same problem:
> > > > > > >
> > > > > > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@huaweicloud.com/
> > > > > > >
> > > > > > > Going to provide more info there.
> > > > > >
> > > > > > I suppose you are going to try to acquire the tracing data I asked?
> > > > > > That would be awesome, thanks for taking the trouble. Let's look
> > > > > > at the data and draw conclusions based on that.
> > > > > >
> > > > > > Workaround is pretty simple: CONFIG_TCG_TPM2_HMAC=n to the kernel
> > > > > > configuration disables the feature.
> > > > > >
> > > > > > For making decisions what to do with the we are talking about ~2
> > > > > > week window estimated, given the Vienna conference slows things
> > > > > > down, so I hope my workaround is good enough before that.
> > > > >
> > > > > I can enumerate three most likely ways to address the issue:
> > > > >
> > > > > 1. Strongest: drop from defconfig.
> > > > > 2. Medium: leave to defconfig but add an opt-in kernel
> > > > > command-line parameter.
> > > > > 3. Lightest: if we can based on tracing data nail the regression
> > > > > in sustainable schedule, fix it.
> > > >
> > > > Actually, there's a fourth: not use sessions for the PCR extend (if
> > > > we'd got the timings when I asked, this was going to be my suggestion
> > > > if they came back problematic). This seems only to be a problem for
> > > > IMA measured boot (because it does lots of extends). If necessary this
> > > > could even be wrapped in a separate config or boot option that only
> > > > disables HMAC on extend if IMA (so we still get security for things
> > > > like sd-boot)
> > >
> > > I can buy that but with a twist that make it an opt-in kernel command
> > > line option. We don't want to take already existing functionality away
> > > from those who might want to use it (given e.g. hardening requirements),
> > > and with that basis opt-in (by default disabled) would be more balanced
> > > way to address the issue.
> > >
> > > Please do a send a patch!
> >
> > I made few measurements. I have a Fedora 38 VM with TPM passthrough.
> >
> > Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)
> >
> > QEMU:
> >
> > rc qemu-kvm 1:4.2-3ubuntu6.27
> > ii qemu-system-x86 1:6.2+dfsg-2ubuntu6.22
> >
> >
> > TPM2_PT_MANUFACTURER:
> > raw: 0x49465800
> > value: "IFX"
> > TPM2_PT_VENDOR_STRING_1:
> > raw: 0x534C4239
> > value: "SLB9"
> > TPM2_PT_VENDOR_STRING_2:
> > raw: 0x36373000
> > value: "670"
> >
> >
> > No HMAC:
> >
> > # tracer: function_graph
> > #
> > # CPU DURATION FUNCTION CALLS
> > # | | | | | | |
> > 0) | tpm2_pcr_extend() {
> > 0) 1.112 us | tpm_buf_append_hmac_session();
> > 0) # 6360.029 us | tpm_transmit_cmd();
> > 0) # 6415.012 us | }
> >
> >
> > HMAC:
> >
> > # tracer: function_graph
> > #
> > # CPU DURATION FUNCTION CALLS
> > # | | | | | | |
> > 1) | tpm2_pcr_extend() {
> > 1) | tpm2_start_auth_session() {
> > 1) * 36976.99 us | tpm_transmit_cmd();
> > 1) * 84746.51 us | tpm_transmit_cmd();
> > 1) # 3195.083 us | tpm_transmit_cmd();
> > 1) @ 126795.1 us | }
> > 1) 2.254 us | tpm_buf_append_hmac_session();
> > 1) 3.546 us | tpm_buf_fill_hmac_session();
> > 1) * 24356.46 us | tpm_transmit_cmd();
> > 1) 3.496 us | tpm_buf_check_hmac_response();
> > 1) @ 151171.0 us | }
>
> Well, unfortunately, that tells us that it's the TPM itself that's
> taking the time processing the security overhead. The ordering of the
> commands in tpm2_start_auth_session() shows
>
> 37ms for context restore of null key
> 85ms for start session with encrypted salt
> 3ms to flush null key
> -----
> 125ms
>
> If we context save the session, we'd likely only bear a single 37ms
> cost to restore it (replacing the total 125ms). However, there's
> nothing we can do about the extend execution going from 6ms to 24ms, so
> I could halve your current boot time with security enabled (it's
> currently 149ms, it would go to 61ms, but it's still 10x slower than
> the unsecured extend at 6ms)
Please explain how this discussion is related to https://bugzilla.kernel.org/show_bug.cgi?id=219229
I just read the bug report and it says nothing about IMA or PCR extend.
There's now tons of spam about a performance issue in a patch set that is
not in the mainline, and barely anything about the original issue:
"
When secureboot is enabled,
the kernel boot time is ~20 seconds after 6.10 kernel.
it's ~7 seconds on 6.8 kernel version.
When secureboot is disabled,
the boot time is ~7 seconds too.
Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
It probably caused autologin failure and micmute led not loaded on AMD platform.
6.9 kernel version is not tested since not signed kernel found.
6.8, 6.10, 6.11 are tested, the first bad version is 6.10.
"
How is this going to help to fix this one?
I say this once and once only: I care zero about fixing code that is in
the mainline.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-14 10:42 ` Jarkko Sakkinen
@ 2024-09-14 10:51 ` Jarkko Sakkinen
2024-09-14 10:58 ` Jarkko Sakkinen
0 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-14 10:51 UTC (permalink / raw)
To: Jarkko Sakkinen, James Bottomley, Roberto Sassu,
Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sat Sep 14, 2024 at 1:42 PM EEST, Jarkko Sakkinen wrote:
> Please address how this discussion is related to https://bugzilla.kernel.org/show_bug.cgi?id=219229
>
> I just read the bug report nothing about IMA or PCR extend.
>
> There's now tons of spam about performance issue in a patch set that is
> not in the mainline and barely nothing about the original issue:
>
> "
> When secureboot is enabled,
> the kernel boot time is ~20 seconds after 6.10 kernel.
> it's ~7 seconds on 6.8 kernel version.
>
> When secureboot is disabled,
> the boot time is ~7 seconds too.
>
> Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
>
> It probably caused autologin failure and micmute led not loaded on AMD platform.
>
> 6.9 kernel version is not tested since not signed kernel found.
> 6.8, 6.10, 6.11 are tested, the first bad version is 6.10.
> "
>
> How is this going to help to fix this one?
>
> I say this once and one: I zero care fixing code that is in the
> mainline.
How do we know that the bug has anything to do with IMA? I'm off for the
weekend now, but on Monday I'll ask the reporter for the kconfig. I think
the important thing is then to check how many times the session is set up
during boot and draw conclusions from that.
It is plain wrong and immoral to conflate a regression with marketing
a new kernel feature. These topics should be brought up in their own
thread (i.e. patch set comments), not here. It misleads everyone.
Please explain to me how this is going to help the reporter in any
possible way.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-14 10:51 ` Jarkko Sakkinen
@ 2024-09-14 10:58 ` Jarkko Sakkinen
0 siblings, 0 replies; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-14 10:58 UTC (permalink / raw)
To: Jarkko Sakkinen, Jarkko Sakkinen, James Bottomley, Roberto Sassu,
Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sat Sep 14, 2024 at 1:51 PM EEST, Jarkko Sakkinen wrote:
> On Sat Sep 14, 2024 at 1:42 PM EEST, Jarkko Sakkinen wrote:
> > Please address how this discussion is related to https://bugzilla.kernel.org/show_bug.cgi?id=219229
> >
> > I just read the bug report nothing about IMA or PCR extend.
> >
> > There's now tons of spam about performance issue in a patch set that is
> > not in the mainline and barely nothing about the original issue:
> >
> > "
> > When secureboot is enabled,
> > the kernel boot time is ~20 seconds after 6.10 kernel.
> > it's ~7 seconds on 6.8 kernel version.
> >
> > When secureboot is disabled,
> > the boot time is ~7 seconds too.
> >
> > Reproduced on both AMD and Intel platform on ThinkPad X1 and T14.
> >
> > It probably caused autologin failure and micmute led not loaded on AMD platform.
> >
> > 6.9 kernel version is not tested since not signed kernel found.
> > 6.8, 6.10, 6.11 are tested, the first bad version is 6.10.
> > "
> >
> > How is this going to help to fix this one?
> >
> > I say this once and one: I zero care fixing code that is in the
> > mainline.
"not in the mainline" (oops)
>
> How do we now that bug is anything to do with IMA? I'm having a weekend
> now but on Monday I'll ask the kconfig from the reporter. I think
> important thing is to then revisit how many times the session is setup
> during boot and make conclusions from that.
>
> It is plain wrong and immoral to convolute a regression with marketing
> a new kernel feature. These topics should be brought up in the topic
> (i.e. patch set comments), not here. It misleads everyone.
>
> Please explain me how this is going to help the reporter in any
> possible?
I will check the original reporter's kconfig once I get it. Based on
that I can reverse-engineer the TPM call sequences, and based on those I
will check whether anything can be orchestrated.
If this leads to no results, I will just send a patch that makes the
whole feature an opt-in kernel command-line option and call it a day.
I think we can take all of next week for this, but I'm not going to
hold it longer than that.
Please take any comments related to Roberto's unfinished patch set
elsewhere.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-12 8:13 ` Roberto Sassu
2024-09-12 14:23 ` Jarkko Sakkinen
2024-09-13 20:50 ` Jarkko Sakkinen
@ 2024-09-15 9:43 ` Jarkko Sakkinen
2024-09-15 10:07 ` Jarkko Sakkinen
2 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-15 9:43 UTC (permalink / raw)
To: Roberto Sassu, James Bottomley, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Thu Sep 12, 2024 at 11:13 AM EEST, Roberto Sassu wrote:
> @[
> tpm_transmit_cmd+50
> tpm2_load_context+161
> tpm2_start_auth_session+98
> tpm2_pcr_extend+39
> tpm_pcr_extend+221
> ima_add_template_entry+437
> ima_store_template+114
> ima_store_measurement+209
> process_measurement+2473
> ima_file_check+82
> security_file_post_open+92
> path_openat+550
> do_filp_open+171
> do_sys_openat2+186
> do_sys_open+76
> __x64_sys_openat+35
> x64_sys_call+9589
> do_syscall_64+96
> entry_SYSCALL_64_after_hwframe+118
> ,
> 0x7f03ea0ade55
> 0x55f929b7dac2
> 0x7f03e9fd4b8a
> 0x7f03e9fd4c4b
> 0x55f929b7e9b5
> , cat]: 35928108
> @[
> tpm_transmit_cmd+50
> tpm2_start_auth_session+650
> tpm2_pcr_extend+39
> tpm_pcr_extend+221
> ima_add_template_entry+437
> ima_store_template+114
> ima_store_measurement+209
> process_measurement+2473
> ima_file_check+82
> security_file_post_open+92
> path_openat+550
> do_filp_open+171
> do_sys_openat2+186
> do_sys_open+76
> __x64_sys_openat+35
> x64_sys_call+9589
> do_syscall_64+96
> entry_SYSCALL_64_after_hwframe+118
> ,
> 0x7f03ea0ade55
> 0x55f929b7dac2
> 0x7f03e9fd4b8a
> 0x7f03e9fd4c4b
> 0x55f929b7e9b5
> , cat]: 84616611
These commands, together with TPM2_CreatePrimary, are the ones that add
overhead to the AMD boot-up:
1. TPM2_ContextLoad (35 ms)
2. TPM2_StartAuthSession (85 ms)
We can conclude that the implementation is too slow and that making it
faster requires a whole set of small improvements. On this basis, the only
right fix is to make it an opt-in kernel command-line option.
That will give room to make small performance improvements over time,
without rushing. The way the session is orchestrated is not production
quality, and the bug gives direct evidence of that.
High-level improvements that could be done over time:
- Do not call start_auth_session() in extend and get_random().
Orchestrate outside.
- Find places where the session does not need to be closed and reopened
back to back, e.g. with the help of SA_CONTINUE_SESSION.
When it comes to boot we should aim for one single start_auth_session
during boot, i.e. different phases would leave that session open so
that we don't have to load the context every single time. I think it
should be doable.
Making all this happen is not a "performance regression fix". It is a
set of gradual improvements to code that is not there yet.
On the plus side, the kernel command-line option allows the feature to be
enabled by default at compile time for all architectures.
I've made my decision on this and will submit a fix for it.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-15 9:43 ` Jarkko Sakkinen
@ 2024-09-15 10:07 ` Jarkko Sakkinen
2024-09-15 13:59 ` James Bottomley
0 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-15 10:07 UTC (permalink / raw)
To: Jarkko Sakkinen, Roberto Sassu, James Bottomley,
Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sun Sep 15, 2024 at 12:43 PM EEST, Jarkko Sakkinen wrote:
> When it comes to boot we should aim for one single start_auth_session
> during boot, i.e. different phases would leave that session open so
> that we don't have to load the context every single time. I think it
> should be doable.
The best idea for improving performance here would be to trade space
for time. This can be achieved by keeping the null key permanently in
TPM memory for the duration of a power cycle.
Given Roberto's benchmark, it would give about an 80% improvement to all
in-kernel callers. There is really no other solution that would yield
any major improvement. So after the opt-in kernel command-line option,
I might look into this.
This is already done locally in tpm2_get_random(), which uses
continueSession to keep the session open for all calls.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-15 10:07 ` Jarkko Sakkinen
@ 2024-09-15 13:59 ` James Bottomley
2024-09-15 14:50 ` Jarkko Sakkinen
0 siblings, 1 reply; 34+ messages in thread
From: James Bottomley @ 2024-09-15 13:59 UTC (permalink / raw)
To: Jarkko Sakkinen, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sun, 2024-09-15 at 13:07 +0300, Jarkko Sakkinen wrote:
> On Sun Sep 15, 2024 at 12:43 PM EEST, Jarkko Sakkinen wrote:
> > When it comes to boot we should aim for one single
> > start_auth_session during boot, i.e. different phases would leave
> > that session open so that we don't have to load the context every
> > single time. I think it should be doable.
>
> The best possible idea how to improve performance here would be to
> transfer the cost from time to space. This can be achieved by keeping
> null key permanently in the TPM memory during power cycle.
No, it's not at all. If you look at it, the NULL key is only used to
encrypt the salt for the start session and that's the operation taking
a lot of time. That's why the cleanest mitigation would be to save and
restore the session. Unfortunately the timings you already complain
about still show this would be about 10x longer than a no-hmac extend
so I'm still waiting to see if IMA people consider that an acceptable
tradeoff.
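The arithmetic behind this can be sketched in a few lines of Python. The durations are the rounded values from Roberto's function_graph trace; the strategy names are made up for illustration, and the saved-session case assumes, as estimated earlier in the thread, that restoring a session costs about as much as restoring the null key.

```python
# Rounded durations in milliseconds from the function_graph trace.
CONTEXT_LOAD_MS = 37   # context restore of the null key
START_SESSION_MS = 85  # start session with encrypted salt
FLUSH_MS = 3           # flush the null key
EXTEND_HMAC_MS = 24    # extend inside an HMAC session
EXTEND_PLAIN_MS = 6    # extend without HMAC

PER_EXTEND_MS = {
    # What the trace shows today.
    "current": CONTEXT_LOAD_MS + START_SESSION_MS + FLUSH_MS + EXTEND_HMAC_MS,
    # Null key kept resident: no load or flush, but the session still starts.
    "null_key_resident": START_SESSION_MS + EXTEND_HMAC_MS,
    # Session context saved: one restore instead of the full setup.
    "session_saved": CONTEXT_LOAD_MS + EXTEND_HMAC_MS,
}

if __name__ == "__main__":
    for name, cost in PER_EXTEND_MS.items():
        ratio = cost / EXTEND_PLAIN_MS
        print(f"{name}: {cost} ms per extend, {ratio:.1f}x the plain extend")
```

With these figures, a saved session brings an extend from 149 ms down to 61 ms, which is still about 10x the 6 ms of a plain extend.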
> It would give about 80% increase given Roberto's benchmark to all
> in-kernel callers. There's no really other possible solution for this
> to make any major improvements. So after opt-in kernel command line
> option I might look into this.
>
> This is already done locally in tpm2_get_random(), which uses
> continueSession to keep session open for all calls.
The other problem if the session is context saved, as I already said,
is that it becomes long lived and requires degapping the session
manager.
James
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-15 13:59 ` James Bottomley
@ 2024-09-15 14:50 ` Jarkko Sakkinen
2024-09-15 14:55 ` Jarkko Sakkinen
2024-09-15 15:00 ` James Bottomley
0 siblings, 2 replies; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-15 14:50 UTC (permalink / raw)
To: James Bottomley, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sun Sep 15, 2024 at 4:59 PM EEST, James Bottomley wrote:
> On Sun, 2024-09-15 at 13:07 +0300, Jarkko Sakkinen wrote:
> > On Sun Sep 15, 2024 at 12:43 PM EEST, Jarkko Sakkinen wrote:
> > > When it comes to boot we should aim for one single
> > > start_auth_session during boot, i.e. different phases would leave
> > > that session open so that we don't have to load the context every
> > > single time. I think it should be doable.
> >
> > The best possible idea how to improve performance here would be to
> > transfer the cost from time to space. This can be achieved by keeping
> > null key permanently in the TPM memory during power cycle.
>
> No it's not at all. If you look at it, the NULL key is only used to
> encrypt the salt for the start session and that's the operation taking
> a lot of time. That's why the cleanest mitigation would be to save and
> restore the session. Unfortunately the timings you already complain
> about still show this would be about 10x longer than a no-hmac extend
> so I'm still waiting to see if IMA people consider that an acceptable
> tradeoff.
The bug report does not say anything about IMA issues. Please read the
bug reports before commenting ;-) I will ignore your comment because
it is plain misleading information.
https://bugzilla.kernel.org/show_bug.cgi?id=219229
>
> > It would give about 80% increase given Roberto's benchmark to all
> > in-kernel callers. There's no really other possible solution for this
> > to make any major improvements. So after opt-in kernel command line
> > option I might look into this.
> >
> > This is already done locally in tpm2_get_random(), which uses
> > continueSession to keep session open for all calls.
>
> The other problem if the session is context saved, as I already said,
> is that it becomes long lived and requires degapping the session
> manager.
I don't really care what you claim; at most I care about what you code.
Especially when the topic is shifted to IMA, as it was now, which feels
to me like misguided communication tbh.
I don't think a round trip in the kernel would qualify as that, but
there is more low-hanging fruit too.
One low-hanging fruit improvement in the startup code is the handling
of the null key: it could be flushed only on need, which in practice
means on access to /dev/tpm0 or /dev/tpmrm0.
I'm already working on a patch set which adds chip->null_key, which will
be flushed on an on-need basis only. I can measure with QEMU how it
affects boot time.
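To make the "flush only on need" idea concrete, here is a minimal toy
model in Python. It is only a sketch under stated assumptions, not the
kernel patch: ModelChip, start_auth_session(), userspace_open() and the
load/flush counters are hypothetical names invented for illustration,
and the handle value is a placeholder. The only thing it shows is that
in-kernel callers reuse a resident null key, and that the key is dropped
only when user space opens the device.

```python
class ModelChip:
    """Toy stand-in for a TPM chip; NOT the kernel's struct tpm_chip."""

    # Placeholder handle value used only by this model (assumption).
    PLACEHOLDER_HANDLE = 0x80000000

    def __init__(self):
        self.null_key = None  # handle of the resident null key, if loaded
        self.loads = 0        # number of times the key had to be loaded
        self.flushes = 0      # number of times the key was flushed

    def _load_null_key(self):
        # Models the expensive step: loading the key context into the TPM.
        self.loads += 1
        self.null_key = self.PLACEHOLDER_HANDLE

    def start_auth_session(self):
        # In-kernel caller: reuse the resident key, load only if absent.
        if self.null_key is None:
            self._load_null_key()
        return self.null_key

    def userspace_open(self):
        # Models access via /dev/tpm0 or /dev/tpmrm0: flush on need only.
        if self.null_key is not None:
            self.flushes += 1
            self.null_key = None


chip = ModelChip()
for _ in range(5):
    chip.start_auth_session()   # five in-kernel sessions, one load
chip.userspace_open()           # user space access forces a flush
chip.start_auth_session()       # next in-kernel session reloads the key
```

In this model five back-to-back in-kernel sessions cost a single load,
and a reload happens only after a user space access. Whether the real
saving is significant is exactly what has to be measured.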
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-15 14:50 ` Jarkko Sakkinen
@ 2024-09-15 14:55 ` Jarkko Sakkinen
2024-09-15 15:00 ` James Bottomley
1 sibling, 0 replies; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-15 14:55 UTC (permalink / raw)
To: Jarkko Sakkinen, James Bottomley, Roberto Sassu,
Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sun Sep 15, 2024 at 5:50 PM EEST, Jarkko Sakkinen wrote:
> One low-hanging fruit improvement in the startup code is the handling
> of the null key: it could be flushed only on need, which in practice
> means on access to /dev/tpm0 or /dev/tpmrm0.
>
> I'm already working on a patch set which adds chip->null_key, which will
> be flushed on an on-need basis only. I can measure with QEMU how it
> affects boot time.
I can agree that playing with continueSession is not the first thing
to try out, but keeping the null key in memory for as long as possible
does not affect the context gap, so I'll start experimenting with that.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-15 14:50 ` Jarkko Sakkinen
2024-09-15 14:55 ` Jarkko Sakkinen
@ 2024-09-15 15:00 ` James Bottomley
2024-09-15 16:22 ` Jarkko Sakkinen
1 sibling, 1 reply; 34+ messages in thread
From: James Bottomley @ 2024-09-15 15:00 UTC (permalink / raw)
To: Jarkko Sakkinen, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sun, 2024-09-15 at 17:50 +0300, Jarkko Sakkinen wrote:
> On Sun Sep 15, 2024 at 4:59 PM EEST, James Bottomley wrote:
> > On Sun, 2024-09-15 at 13:07 +0300, Jarkko Sakkinen wrote:
> > > On Sun Sep 15, 2024 at 12:43 PM EEST, Jarkko Sakkinen wrote:
> > > > When it comes to boot we should aim for one single
> > > > start_auth_session during boot, i.e. different phases would
> > > > leave that session open so that we don't have to load the
> > > > context every single time. I think it should be doable.
> > >
> > > The best possible idea how to improve performance here would be
> > > to transfer the cost from time to space. This can be achieved by
> > > keeping null key permanently in the TPM memory during power
> > > cycle.
> >
> > No it's not at all. If you look at it, the NULL key is only used
> > > to encrypt the salt for the start session and that's the operation
> > taking a lot of time. That's why the cleanest mitigation would be
> > to save and restore the session. Unfortunately the timings you
> > already complain about still show this would be about 10x longer
> > than a no-hmac extend so I'm still waiting to see if IMA people
> > consider that an acceptable tradeoff.
>
> The bug report does not say anything about IMA issues. Please read
> the bug reports before commenting ;-) I will ignore your comment
> because it is plain misleading information.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=219229
Well, given that the kernel does no measured boot extends after the EFI
boot stub (which isn't session protected) finishes, what's your theory
for the root cause?
James
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-15 15:00 ` James Bottomley
@ 2024-09-15 16:22 ` Jarkko Sakkinen
2024-09-21 15:40 ` Jarkko Sakkinen
0 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-15 16:22 UTC (permalink / raw)
To: James Bottomley, Roberto Sassu, Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sun Sep 15, 2024 at 6:00 PM EEST, James Bottomley wrote:
> On Sun, 2024-09-15 at 17:50 +0300, Jarkko Sakkinen wrote:
> > On Sun Sep 15, 2024 at 4:59 PM EEST, James Bottomley wrote:
> > > On Sun, 2024-09-15 at 13:07 +0300, Jarkko Sakkinen wrote:
> > > > On Sun Sep 15, 2024 at 12:43 PM EEST, Jarkko Sakkinen wrote:
> > > > > When it comes to boot we should aim for one single
> > > > > start_auth_session during boot, i.e. different phases would
> > > > > leave that session open so that we don't have to load the
> > > > > context every single time. I think it should be doable.
> > > >
> > > > The best possible idea how to improve performance here would be
> > > > to transfer the cost from time to space. This can be achieved by
> > > > keeping null key permanently in the TPM memory during power
> > > > cycle.
> > >
> > > No it's not at all. If you look at it, the NULL key is only used
> > > > to encrypt the salt for the start session and that's the operation
> > > taking a lot of time. That's why the cleanest mitigation would be
> > > to save and restore the session. Unfortunately the timings you
> > > already complain about still show this would be about 10x longer
> > > than a no-hmac extend so I'm still waiting to see if IMA people
> > > consider that an acceptable tradeoff.
> >
> > The bug report does not say anything about IMA issues. Please read
> > the bug reports before commenting ;-) I will ignore your comment
> > because it is plain misleading information.
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=219229
>
> Well, given that the kernel does no measured boot extends after the EFI
> boot stub (which isn't session protected) finishes, what's your theory
> for the root cause?
I don't think there is a silver bullet. Based on the benchmark, which
showed 80% overhead from throttling the context, reducing the number of
loads and saves will cut a slice of the fat.
Since it is the low-hanging fruit, I'll start with that. In other words,
I'm not going to touch session loading and saving. I'll start with null
key loading and saving.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-15 16:22 ` Jarkko Sakkinen
@ 2024-09-21 15:40 ` Jarkko Sakkinen
2024-09-22 14:11 ` Jarkko Sakkinen
0 siblings, 1 reply; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-21 15:40 UTC (permalink / raw)
To: Jarkko Sakkinen, James Bottomley, Roberto Sassu,
Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sun Sep 15, 2024 at 7:22 PM EEST, Jarkko Sakkinen wrote:
> On Sun Sep 15, 2024 at 6:00 PM EEST, James Bottomley wrote:
> > On Sun, 2024-09-15 at 17:50 +0300, Jarkko Sakkinen wrote:
> > > On Sun Sep 15, 2024 at 4:59 PM EEST, James Bottomley wrote:
> > > > On Sun, 2024-09-15 at 13:07 +0300, Jarkko Sakkinen wrote:
> > > > > On Sun Sep 15, 2024 at 12:43 PM EEST, Jarkko Sakkinen wrote:
> > > > > > When it comes to boot we should aim for one single
> > > > > > start_auth_session during boot, i.e. different phases would
> > > > > > leave that session open so that we don't have to load the
> > > > > > context every single time. I think it should be doable.
> > > > >
> > > > > The best possible idea how to improve performance here would be
> > > > > to transfer the cost from time to space. This can be achieved by
> > > > > keeping null key permanently in the TPM memory during power
> > > > > cycle.
> > > >
> > > > No it's not at all. If you look at it, the NULL key is only used
> > > > > to encrypt the salt for the start session and that's the operation
> > > > taking a lot of time. That's why the cleanest mitigation would be
> > > > to save and restore the session. Unfortunately the timings you
> > > > already complain about still show this would be about 10x longer
> > > > than a no-hmac extend so I'm still waiting to see if IMA people
> > > > consider that an acceptable tradeoff.
> > >
> > > The bug report does not say anything about IMA issues. Please read
> > > the bug reports before commenting ;-) I will ignore your comment
> > > because it is plain misleading information.
> > >
> > > https://bugzilla.kernel.org/show_bug.cgi?id=219229
> >
> > Well, given that the kernel does no measured boot extends after the EFI
> > boot stub (which isn't session protected) finishes, what's your theory
> > for the root cause?
>
> I don't think there is a silver bullet. Based on the benchmark, which
> showed 80% overhead from throttling the context, reducing the number of
> loads and saves will cut a slice of the fat.
>
> Since it is the low-hanging fruit, I'll start with that. In other words,
> I'm not going to touch session loading and saving. I'll start with null
> key loading and saving.
"my theory" worked pretty well. It brought the boot time back to 8.7s,
which can be explained with encryption overhead pretty well.
I'd suggest reading the bug report next time before solving a problem
that did not exist. We care about users, not unfinished patch sets.
BR, Jarkko
* Re: [regression] significant delays when secureboot is enabled since 6.10
2024-09-21 15:40 ` Jarkko Sakkinen
@ 2024-09-22 14:11 ` Jarkko Sakkinen
0 siblings, 0 replies; 34+ messages in thread
From: Jarkko Sakkinen @ 2024-09-22 14:11 UTC (permalink / raw)
To: Jarkko Sakkinen, James Bottomley, Roberto Sassu,
Linux regressions mailing list
Cc: keyrings, linux-integrity@vger.kernel.org, LKML, Pengyu Ma
On Sat Sep 21, 2024 at 6:40 PM EEST, Jarkko Sakkinen wrote:
> On Sun Sep 15, 2024 at 7:22 PM EEST, Jarkko Sakkinen wrote:
> > On Sun Sep 15, 2024 at 6:00 PM EEST, James Bottomley wrote:
> > > On Sun, 2024-09-15 at 17:50 +0300, Jarkko Sakkinen wrote:
> > > > On Sun Sep 15, 2024 at 4:59 PM EEST, James Bottomley wrote:
> > > > > On Sun, 2024-09-15 at 13:07 +0300, Jarkko Sakkinen wrote:
> > > > > > On Sun Sep 15, 2024 at 12:43 PM EEST, Jarkko Sakkinen wrote:
> > > > > > > When it comes to boot we should aim for one single
> > > > > > > start_auth_session during boot, i.e. different phases would
> > > > > > > leave that session open so that we don't have to load the
> > > > > > > context every single time. I think it should be doable.
> > > > > >
> > > > > > The best possible idea how to improve performance here would be
> > > > > > to transfer the cost from time to space. This can be achieved by
> > > > > > keeping null key permanently in the TPM memory during power
> > > > > > cycle.
> > > > >
> > > > > No it's not at all. If you look at it, the NULL key is only used
> > > > > > to encrypt the salt for the start session and that's the operation
> > > > > taking a lot of time. That's why the cleanest mitigation would be
> > > > > to save and restore the session. Unfortunately the timings you
> > > > > already complain about still show this would be about 10x longer
> > > > > than a no-hmac extend so I'm still waiting to see if IMA people
> > > > > consider that an acceptable tradeoff.
> > > >
> > > > The bug report does not say anything about IMA issues. Please read
> > > > the bug reports before commenting ;-) I will ignore your comment
> > > > because it is plain misleading information.
> > > >
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229
> > >
> > > Well, given that the kernel does no measured boot extends after the EFI
> > > boot stub (which isn't session protected) finishes, what's your theory
> > > for the root cause?
> >
> > I don't think there is a silver bullet. Based on the benchmark, which
> > showed 80% overhead from throttling the context, reducing the number of
> > loads and saves will cut a slice of the fat.
> >
> > Since it is the low-hanging fruit, I'll start with that. In other words,
> > I'm not going to touch session loading and saving. I'll start with null
> > key loading and saving.
>
> "my theory" worked pretty well. It brought the boot time back to 8.7s,
> which can be explained with encryption overhead pretty well.
>
> I'd suggest reading the bug report next time before solving a problem
> that did not exist. We care about users, not unfinished patch sets.
I'd also expect to review a patch set that fixes a performance issue
caused by a feature that you implemented, in less than a week. One that
doubles the boot time on AMD CPUs.
This is ridiculous tbh.
BR, Jarkko