public inbox for linux-kernel@vger.kernel.org
* [RFC][PATCH 00/17] Fix up the recent SRSO patches
@ 2023-08-09  7:12 Peter Zijlstra
  2023-08-09  9:04 ` Nikolay Borisov
  2023-08-09 10:04 ` Andrew.Cooper3
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

Since I wasn't invited to the party (even though I did retbleed), I get to
clean things up afterwards :/

Anyway, this here overhauls the SRSO patches in a big way.

I claim that AMD retbleed (also called Speculative-Type-Confusion -- not to be
confused with Intel retbleed, which is an entirely different bug) is
fundamentally the same as this SRSO -- which is also caused by STC. And the
mitigations are so similar they should all be controlled from a single spot and
not conflated like they are now.

As such, at the end of the ride the new kernel command line and srso sysfs
files are no more and all we're left with is a few extra retbleed options.

Aside of that; this deals with a few implementation issues -- but not all known
issues. Josh and Andrew are telling me there's a problem when running inside
virt due to how this checks the microcode. I'm hoping either of those two gents
will add a patch to address this.





* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2023-08-09  7:12 Peter Zijlstra
@ 2023-08-09  9:04 ` Nikolay Borisov
  2023-08-09 10:04 ` Andrew.Cooper3
  1 sibling, 0 replies; 17+ messages in thread
From: Nikolay Borisov @ 2023-08-09  9:04 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh



On 9.08.23 at 10:12, Peter Zijlstra wrote:
> Since I wasn't invited to the party (even though I did retbleed), I get to
> clean things up afterwards :/
> 
> Anyway, this here overhauls the SRSO patches in a big way.
> 
> I claim that AMD retbleed (also called Speculative-Type-Confusion -- not to be
> confused with Intel retbleed, which is an entirely different bug) is
> fundamentally the same as this SRSO -- which is also caused by STC. And the
> mitigations are so similar they should all be controlled from a single spot and
> not conflated like they are now.
> 
> As such, at the end of the ride the new kernel command line and srso sysfs
> files are no more and all we're left with is a few extra retbleed options.
> 
> Aside of that; this deals with a few implementation issues -- but not all known
> issues. Josh and Andrew are telling me there's a problem when running inside
> virt due to how this checks the microcode. I'm hoping either of those two gents
> will add a patch to address this.

The microcode issue should already be fixed, as Boris added a safe_wrmsr
call which checks for the presence of the SBPB bit on Zen3/4.



* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2023-08-09  7:12 Peter Zijlstra
  2023-08-09  9:04 ` Nikolay Borisov
@ 2023-08-09 10:04 ` Andrew.Cooper3
  2023-08-09 11:58   ` Peter Zijlstra
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew.Cooper3 @ 2023-08-09 10:04 UTC (permalink / raw)
  To: Peter Zijlstra, x86; +Cc: linux-kernel, David.Kaplan, jpoimboe, gregkh

On 09/08/2023 8:12 am, Peter Zijlstra wrote:
> Since I wasn't invited to the party (even though I did retbleed), I get to
> clean things up afterwards :/
>
> Anyway, this here overhauls the SRSO patches in a big way.
>
> I claim that AMD retbleed (also called Speculative-Type-Confusion

Branch Type Confusion.

Speculative Type Confusion is something else; generally Spectre v1 or v2
around a logical type check, usually ending up confusing pointers and
integers.

It appears that you might be suffering from Type-of-Speculative-Bug
Confusion, an affliction brought on by the chronic lack of documentation
and consistency, the fact that almost everything has at least 2 names,
and that, 6 years into this horror show, it's not showing any sign of
slowing down.

>  -- not to be
> confused with Intel retbleed, which is an entirely different bug) is
> fundamentally the same as this SRSO -- which is also caused by STC. And the
> mitigations are so similar they should all be controlled from a single spot and
> not conflated like they are now.

BTC and SRSO are certainly related, but they're not the same.

With BTC, an attacker poisons a branch type prediction to say "that
thing (which isn't actually a ret) is a ret".

With SRSO, an attacker leaves a poisoned infinite-call-loop prediction. 
Later, a real function (that is architecturally correct execution and
will retire) trips over the predicted infinite loop, which overflows the
RSB/RAS/RAP, replacing the correct prediction on the top with the
attacker's choice of value.

So while branch type confusion is used to poison the top-of-RSB value,
the ret that actually goes wrong needs a correct type=ret prediction for
the SRSO attack to succeed.


Both issues can be mitigated with IBPB-on-entry (given up-to-date
microcode in some cases).

Both issues have a software sequence that tries to make the contents of
a __x86_return_thunk sequence safe to use.  For BTC, it's simply a case
of ensuring the type prediction of the one ret is good.  For SRSO, it's
something more complicated and I don't know the uarch details fully.

~Andrew


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2023-08-09 10:04 ` Andrew.Cooper3
@ 2023-08-09 11:58   ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2023-08-09 11:58 UTC (permalink / raw)
  To: Andrew.Cooper3; +Cc: x86, linux-kernel, David.Kaplan, jpoimboe, gregkh

On Wed, Aug 09, 2023 at 11:04:15AM +0100, Andrew.Cooper3@citrix.com wrote:
> On 09/08/2023 8:12 am, Peter Zijlstra wrote:
> > Since I wasn't invited to the party (even though I did retbleed), I get to
> > clean things up afterwards :/
> >
> > Anyway, this here overhauls the SRSO patches in a big way.
> >
> > I claim that AMD retbleed (also called Speculative-Type-Confusion
> 
> Branch Type Confusion.

Durr, I should've double checked, and yes, too damn many different things
and not enough sleep.

> >  -- not to be
> > confused with Intel retbleed, which is an entirely different bug) is
> > fundamentally the same as this SRSO -- which is also caused by STC. And the
> > mitigations are so similar they should all be controlled from a single spot and
> > not conflated like they are now.
> 
> BTC and SRSO are certainly related, but they're not the same.
> 
> With BTC, an attacker poisons a branch type prediction to say "that
> thing (which isn't actually a ret) is a ret".
> 
> With SRSO, an attacker leaves a poisoned infinite-call-loop prediction. 
> Later, a real function (that is architecturally correct execution and
> will retire) trips over the predicted infinite loop, which overflows the
> RSB/RAS/RAP, replacing the correct prediction on the top with the
> attacker's choice of value.
> 
> So while branch type confusion is used to poison the top-of-RSB value,
> the ret that actually goes wrong needs a correct type=ret prediction for
> the SRSO attack to succeed.

Yes, this is what I meant, and I clearly failed to express myself
well. The point was that branch-type-confusion is involved with both,
just in different ways.


* [RFC][PATCH 00/17] Fix up the recent SRSO patches
@ 2024-01-27 18:58 a-development
  2024-01-27 19:19 ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: a-development @ 2024-01-27 18:58 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

Putting srso=off on the kernel cmdline fixed my FUSE-related issues.
Basically, I could not suspend anymore.
This is on kernel 6.7.1.

This is the behavior with srso enabled... 
https://paste.cachyos.org/p/bae7257


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-01-27 18:58 [RFC][PATCH 00/17] Fix up the recent SRSO patches a-development
@ 2024-01-27 19:19 ` Borislav Petkov
  2024-01-27 19:27   ` a-development
  2024-01-27 19:28   ` a-development
  0 siblings, 2 replies; 17+ messages in thread
From: Borislav Petkov @ 2024-01-27 19:19 UTC (permalink / raw)
  To: a-development
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

On Sat, Jan 27, 2024 at 06:58:37PM +0000, a-development@posteo.de wrote:
> putting srso=off in the cmdline fixed up my FUSE related issues.
> Basically, I could not suspend anymore.
> kernel 6.7.1.
> 
> This is the behavior with srso enabled...
> https://paste.cachyos.org/p/bae7257

Can you disable, if possible, whatever's doing FUSE and try suspending
then?

Also, can you share full dmesg, .config and /proc/cpuinfo from the
machine?

Thx.


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-01-27 19:19 ` Borislav Petkov
@ 2024-01-27 19:27   ` a-development
  2024-01-27 19:41     ` Borislav Petkov
  2024-01-27 19:28   ` a-development
  1 sibling, 1 reply; 17+ messages in thread
From: a-development @ 2024-01-27 19:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

Oh that was quick :)

I can umount the FUSE mounts and it will work fine.
Previously I didn't even suspend.
Also, in the log I had provided, I was on a cachyos kernel, but it 
didn't matter, even the most recent arch kernel had the same issues.

The full dmesg is no problem; I can provide it tomorrow, when I start up
the server again.
I don't want to share my full ~/.config folder.
Here is /proc/cpuinfo: https://paste.cachyos.org/p/158b767

Thanks

On 27.01.2024 20:19, Borislav Petkov wrote:
> On Sat, Jan 27, 2024 at 06:58:37PM +0000, a-development@posteo.de 
> wrote:
>> putting srso=off in the cmdline fixed up my FUSE related issues.
>> Basically, I could not suspend anymore.
>> kernel 6.7.1.
>> 
>> This is the behavior with srso enabled...
>> https://paste.cachyos.org/p/bae7257
> 
> Can you disable, if possible, whatever's doing FUSE and try suspending
> then?
> 
> Also, can you share full dmesg, .config and /proc/cpuinfo from the
> machine?
> 
> Thx.



* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-01-27 19:27   ` a-development
@ 2024-01-27 19:41     ` Borislav Petkov
  2024-01-29 18:18       ` a-development
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2024-01-27 19:41 UTC (permalink / raw)
  To: a-development
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

On Sat, Jan 27, 2024 at 07:27:45PM +0000, a-development@posteo.de wrote:
> I can umount the FUSE mounts and it will work fine.

Aha, so it is FUSE-related.

How do I trigger it here? What are the steps to reproduce? Suspend while
I have a FUSE mount? How do I set it up so that it is as close to yours
as possible?

> Previously I didn't even suspend.  Also, in the log I had provided,
> I was on a cachyos kernel, but it didn't matter, even the most recent
> arch kernel had the same issues.

You should try an upstream kernel to confirm it reproduces there - no
distro kernels.

> full dmesg is no problem - I can do that the next day, when I startup
> the server again full ~/.config folder I don't want to share

Not the full .config folder - just the kernel .config of the kernel
you're triggering this with so that I can try to do it here too.

> here is /proc/cpuinfo https://paste.cachyos.org/p/158b767

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-01-27 19:41     ` Borislav Petkov
@ 2024-01-29 18:18       ` a-development
  2024-03-26 22:21         ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: a-development @ 2024-01-29 18:18 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

Hello.

I have the feeling that something else is amiss.
Currently under 6.7.2-2-cachyos with srso=off.
https://0x0.st/HDqP.txt

I now feel that further communication is rather selfish, as a clean
environment is hard to provide.
In any case, my FUSE arguments are:

sshfs -o kernel_cache -o auto_cache -o reconnect \
      -o compression=yes -o cache_timeout=600 -o ServerAliveInterval=30 \
      "$source" "$target" -o idmap=user

With this line, I somehow managed to keep the FUSE mount mounted
indefinitely, even if the device was offline for a couple of days.
A subsequent suspend would fail to freeze.
With srso=off, suspend would reproducibly work.
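
One way to see whether such a lingering FUSE mount is still present before a suspend attempt is to look at the mount table. The following is a minimal sketch in Python, assuming the usual whitespace-separated layout of /proc/mounts (source, mount point, file system type, ...). The function name and the sample line are hypothetical and are not taken from this thread.

```python
def fuse_mounts(mounts_text: str) -> list:
    """Return (source, mountpoint, fstype) for every FUSE mount listed."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        source, mountpoint, fstype = fields[:3]
        # FUSE file systems appear as "fuse", "fuseblk" or "fuse.<subtype>",
        # for example "fuse.sshfs" for an sshfs mount.
        if fstype in ("fuse", "fuseblk") or fstype.startswith("fuse."):
            found.append((source, mountpoint, fstype))
    return found


# Hypothetical sample in /proc/mounts format (not from the reporter's machine).
sample = (
    "/dev/nvme0n1p2 / ext4 rw,relatime 0 0\n"
    "user@host:/data /mnt/remote fuse.sshfs rw,nosuid,nodev,relatime 0 0\n"
)
print(fuse_mounts(sample))
```

On a running system the text could be read from /proc/mounts and passed to the function; an empty result would suggest that no FUSE mount is active at that moment.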

Please provide me a specific version of a kernel I should try in my 
configuration to try and reproduce.
I'd prefer a pre-compiled one; if not tell me...
I use archlinux.

Please give me a reason to not feel bad about myself.

All the best







On 27.01.2024 20:41, Borislav Petkov wrote:
> On Sat, Jan 27, 2024 at 07:27:45PM +0000, a-development@posteo.de 
> wrote:
>> I can umount the FUSE mounts and it will work fine.
> 
> Aha, so it is FUSE-related.
> 
> How do I trigger it here? What are the steps to reproduce? Suspend 
> while
> I have a FUSE mount? How do I set it up so that it is as close to yours
> as possible?
> 
>> Previously I didn't even suspend.  Also, in the log I had provided,
>> I was on a cachyos kernel, but it didn't matter, even the most recent
>> arch kernel had the same issues.
> 
> You should try an upstream kernel to confirm it reproduces there - no
> distro kernels.
> 
>> full dmesg is no problem - I can do that the next day, when I startup
>> the server again full ~/.config folder I don't want to share
> 
> Not the full .config folder - just the kernel .config of the kernel
> you're triggering this with so that I can try to do it here too.
> 
>> here is /proc/cpuinfo https://paste.cachyos.org/p/158b767
> 
> Thx.


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-01-29 18:18       ` a-development
@ 2024-03-26 22:21         ` Borislav Petkov
  2024-04-16  6:48           ` a-development
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2024-03-26 22:21 UTC (permalink / raw)
  To: a-development
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

Whoops, 

this fell through the cracks. Sorry about that.

On Mon, Jan 29, 2024 at 06:18:00PM +0000, a-development@posteo.de wrote:
> I have the feeling that something else is amiss.
> Currently under 6.7.2-2-cachyos with srso=off.
> https://0x0.st/HDqP.txt

Yah, your tasks refuse to freeze on suspend and they have this fuse
stuff in the stacktrace:

[ 6346.492593] task:btop            state:D stack:0     pid:279617 tgid:1548  ppid:1531   flags:0x00004006
[ 6346.492600] Call Trace:
[ 6346.492602]  <TASK>
[ 6346.492607]  __schedule+0xd44/0x1af0
[ 6346.492614]  ? srso_alias_return_thunk+0x5/0xfbef5
[ 6346.492617]  ? __wake_up+0x9d/0xc0
[ 6346.492622]  schedule+0x32/0xd0
[ 6346.492627]  request_wait_answer+0xd0/0x2a0 [fuse db37c699d94393e946cf93306449ea0f307959a1]
[ 6346.492638]  ? __pfx_autoremove_wake_function+0x10/0x10
[ 6346.492643]  fuse_simple_request+0x21c/0x390 [fuse db37c699d94393e946cf93306449ea0f307959a1]
[ 6346.492653]  fuse_statfs+0xf2/0x160 [fuse db37c699d94393e946cf93306449ea0f307959a1]
[ 6346.492667]  statfs_by_dentry+0x67/0x90
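
For longer logs, the same check (which blocked tasks have fuse frames in their call trace) can be automated. The sketch below is a rough Python illustration that assumes the log uses the "task:NAME state:S ... pid:N" header format and the "[fuse ...]" module annotation seen in the excerpt above. The function name and the returned dictionary keys are made up for this example.

```python
import re

# Header line of a task dump, e.g. "task:btop  state:D stack:0  pid:279617 ..."
TASK_RE = re.compile(r"task:(?P<name>\S+)\s+state:(?P<state>\S+).*?\bpid:(?P<pid>\d+)")
# Symbol name in a call-trace frame, i.e. the word right before "+0x".
SYMBOL_RE = re.compile(r"(\w+)\+0x")


def blocked_fuse_tasks(log_text: str) -> list:
    """Collect tasks in state D together with their frames from the fuse module."""
    tasks = []
    current = None
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            current = {
                "name": header["name"],
                "state": header["state"],
                "pid": int(header["pid"]),
                "fuse_frames": [],
            }
            tasks.append(current)
        elif current is not None and "[fuse" in line:
            symbol = SYMBOL_RE.search(line)
            if symbol:
                current["fuse_frames"].append(symbol.group(1))
    # Keep only uninterruptible (D state) tasks.
    return [t for t in tasks if t["state"] == "D"]


# Lines copied from the excerpt above.
excerpt = """\
[ 6346.492593] task:btop            state:D stack:0     pid:279617 tgid:1548  ppid:1531   flags:0x00004006
[ 6346.492607]  __schedule+0xd44/0x1af0
[ 6346.492627]  request_wait_answer+0xd0/0x2a0 [fuse db37c699d94393e946cf93306449ea0f307959a1]
[ 6346.492643]  fuse_simple_request+0x21c/0x390 [fuse db37c699d94393e946cf93306449ea0f307959a1]
[ 6346.492653]  fuse_statfs+0xf2/0x160 [fuse db37c699d94393e946cf93306449ea0f307959a1]
"""
for task in blocked_fuse_tasks(excerpt):
    print(task["name"], task["pid"], task["fuse_frames"])
```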

> 
> Now I feel, further communication is rather selfish, as a clean environment
> is hard to provide.
> In any case, my FUSE arguments are sshfs -o kernel_cache -o auto_cache -o
> reconnect \
>                -o compression=yes -o cache_timeout=600 -o
> ServerAliveInterval=30 \
>                "$source" "$target" -o idmap=user
> 
> With this line, I somehow managed to have the FUSE mount infinitely mounted,
> even if the device was offline for couple of days.
> A followed suspend would fail to freeze.
> srso=off would reproducibly work.

Not in your example above. It would fail after a couple of suspend
cycles.

And looking at your splats

[ 6366.524953]  ? switch_fpu_return+0x50/0xe0
[ 6366.524956]  ? srso_alias_return_thunk+0x5/0xfbef5
		  ^^^^^^^^^^^^^^^^^^^^^^^^
[ 6366.524958]  ? exit_to_user_mode_prepare+0x132/0x1f

the right cmdline option to disable it is:

spec_rstack_overflow=off

not

srso=off

:-)
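
To check which of the two spellings a machine actually booted with, a small script can inspect the kernel command line. This is a minimal sketch in Python that only splits a command-line string on whitespace (quoted values containing spaces are not handled). The function name and the returned keys are assumptions made for illustration and are not part of any kernel tooling.

```python
def check_srso_cmdline(cmdline: str) -> dict:
    """Report SRSO-related parameters found on a kernel command line."""
    params = {}
    for token in cmdline.split():
        # Parameters look like "key=value" or a bare "key".
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return {
        # Value given to the option named above, or None if it is absent.
        "spec_rstack_overflow": params.get("spec_rstack_overflow"),
        # True if the "srso=" spelling was used instead.
        "used_srso_spelling": "srso" in params,
    }


# Example with a made-up command line (hypothetical values).
example = "root=/dev/sda2 rw quiet srso=off"
print(check_srso_cmdline(example))
```

On a running system the string could be read from /proc/cmdline and passed to the function.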

> Please provide me a specific version of a kernel I should try in my
> configuration to try and reproduce.
> I'd prefer a pre-compiled one; if not tell me...
> I use archlinux.

Just build the latest released kernel, which is 6.8 now. Take your
.config and use it to build it. The net is full of tutorials on how to do
so.

And then try with spec_rstack_overflow=off and let's see what that does.

> Please give me a reason to not feel bad about myself.

Don't worry - it's just a machine. :)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-03-26 22:21         ` Borislav Petkov
@ 2024-04-16  6:48           ` a-development
  2024-04-16  8:45             ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: a-development @ 2024-04-16  6:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

It worked, it worked!

https://up.tail.ws/txt/working-suspend.txt

I've now tested it for quite some time.
However, I also had to start using 6.6.26-1-lts because my Magewell capture
card wouldn't work otherwise.

Thanks again!

On 26.03.2024 23:21, Borislav Petkov wrote:
> Whoops,
> 
> this fell through the cracks. Sorry about that.
> 
> On Mon, Jan 29, 2024 at 06:18:00PM +0000, a-development@posteo.de 
> wrote:
>> I have the feeling that something else is amiss.
>> Currently under 6.7.2-2-cachyos with srso=off.
>> https://0x0.st/HDqP.txt
> 
> Yah, your tasks refuse to freeze on suspend and they have this fuse
> stuff in the stacktrace:
> 
> [ 6346.492593] task:btop            state:D stack:0     pid:279617
> tgid:1548  ppid:1531   flags:0x00004006
> [ 6346.492600] Call Trace:
> [ 6346.492602]  <TASK>
> [ 6346.492607]  __schedule+0xd44/0x1af0
> [ 6346.492614]  ? srso_alias_return_thunk+0x5/0xfbef5
> [ 6346.492617]  ? __wake_up+0x9d/0xc0
> [ 6346.492622]  schedule+0x32/0xd0
> [ 6346.492627]  request_wait_answer+0xd0/0x2a0 [fuse
> db37c699d94393e946cf93306449ea0f307959a1]
> [ 6346.492638]  ? __pfx_autoremove_wake_function+0x10/0x10
> [ 6346.492643]  fuse_simple_request+0x21c/0x390 [fuse
> db37c699d94393e946cf93306449ea0f307959a1]
> [ 6346.492653]  fuse_statfs+0xf2/0x160 [fuse
> db37c699d94393e946cf93306449ea0f307959a1]
> [ 6346.492667]  statfs_by_dentry+0x67/0x90
> 
>> 
>> Now I feel, further communication is rather selfish, as a clean 
>> environment
>> is hard to provide.
>> In any case, my FUSE arguments are sshfs -o kernel_cache -o auto_cache 
>> -o
>> reconnect \
>>                -o compression=yes -o cache_timeout=600 -o
>> ServerAliveInterval=30 \
>>                "$source" "$target" -o idmap=user
>> 
>> With this line, I somehow managed to have the FUSE mount infinitely 
>> mounted,
>> even if the device was offline for couple of days.
>> A followed suspend would fail to freeze.
>> srso=off would reproducibly work.
> 
> Not in your example above. It would fail after a couple of suspend
> cycles.
> 
> And looking at your splats
> 
> [ 6366.524953]  ? switch_fpu_return+0x50/0xe0
> [ 6366.524956]  ? srso_alias_return_thunk+0x5/0xfbef5
> 		  ^^^^^^^^^^^^^^^^^^^^^^^^
> [ 6366.524958]  ? exit_to_user_mode_prepare+0x132/0x1f
> 
> the right cmdline option to disable it is:
> 
> spec_rstack_overflow=off
> 
> not
> 
> srso=off
> 
> :-)
> 
>> Please provide me a specific version of a kernel I should try in my
>> configuration to try and reproduce.
>> I'd prefer a pre-compiled one; if not tell me...
>> I use archlinux.
> 
> Just build the latest released kernel, which is 6.8 now. Take your
> .config and use it to build it. The net is full of tutorials how to do
> so.
> 
> And then try with spec_rstack_overflow=off and let's see what that 
> does.
> 
>> Please give me a reason to not feel bad about myself.
> 
> Don't worry - it's just a machine. :)


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-04-16  6:48           ` a-development
@ 2024-04-16  8:45             ` Borislav Petkov
  2024-04-16 20:14               ` a-development
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2024-04-16  8:45 UTC (permalink / raw)
  To: a-development
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

On Tue, Apr 16, 2024 at 06:48:54AM +0000, a-development@posteo.de wrote:
> It worked, it worked!
> 
> https://up.tail.ws/txt/working-suspend.txt
> 
> I've tested it now quite some time.
> But, I also had to start using 6.6.26-1-lts because my magewell capture card
> wouldn't without.

Right, that thing I guess:

[Mon Apr 15 18:37:58 2024] ProCapture: loading out-of-tree module taints kernel.
[Mon Apr 15 18:37:58 2024] ProCapture: module verification failed: signature and/or required key missing - tainting kernel

So, machines do suspend even with SRSO enabled and since your machine is
affected, you probably should try without spec_rstack_overflow=off to
see if it works with the new kernel.

Then, the other thing you could try is whether suspend works without
that proprietary crap.

And then we can see.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-04-16  8:45             ` Borislav Petkov
@ 2024-04-16 20:14               ` a-development
  2024-04-17  8:08                 ` a-development
  0 siblings, 1 reply; 17+ messages in thread
From: a-development @ 2024-04-16 20:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

Now that it is deactivated, the machine no longer suspends!

https://up.tail.ws/txt/non-working-suspend.txt

> Then, the other thing you could try is whether suspend works without
> that proprietary crap.

I refuse, and I can explain. I tried lots of capture cards that claimed to
support uvcvideo and Linux.
This problem existed before, and I need it for work on this machine.
But none of them worked reliably; some would straight up glitch out.
That's because they do not implement it properly.

It had to be a product from Magewell, who manage an array of bash
scripts, and the AUR maintainer gets updates if something breaks, too.
Why do I use a PCIe HDMI capture card?
I need to use cameras and displays.

As for USB cameras, unless it's a product from e.g. Logitech, they kept
giving me similar headaches.
And that included an older setup that ran an Intel i7 8700K as well.

Thx.



On 16.04.2024 10:45, Borislav Petkov wrote:
> On Tue, Apr 16, 2024 at 06:48:54AM +0000, a-development@posteo.de 
> wrote:
>> It worked, it worked!
>> 
>> https://up.tail.ws/txt/working-suspend.txt
>> 
>> I've tested it now quite some time.
>> But, I also had to start using 6.6.26-1-lts because my magewell 
>> capture card
>> wouldn't without.
> 
> Right, that thing I guess:
> 
> [Mon Apr 15 18:37:58 2024] ProCapture: loading out-of-tree module 
> taints kernel.
> [Mon Apr 15 18:37:58 2024] ProCapture: module verification failed:
> signature and/or required key missing - tainting kernel
> 
> So, machines do suspend even with SRSO enabled and since your machine 
> is
> affected, you probably should try without spec_rstack_overflow=off to
> see if it works with the new kernel.
> 
> Then, the other thing you could try is whether suspend works without
> that proprietary crap.
> 
> And then we can see.
> 
> Thx.


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-04-16 20:14               ` a-development
@ 2024-04-17  8:08                 ` a-development
  2024-04-17  9:12                   ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: a-development @ 2024-04-17  8:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

Today I failed to suspend, and the spec_rstack thing was off.

https://up.tail.ws/txt/non-working-suspend-2.txt




On 16.04.2024 22:14, a-development@posteo.de wrote:
> Now that it is deactivated, the machine no longer suspends!
> 
> https://up.tail.ws/txt/non-working-suspend.txt
> 
>> Then, the other thing you could try is whether suspend works without
>> that proprietary crap.
> 
> I refuse. I can explain. I tried lots of capture cards that stated
> they support uvcvideo and linux.
> This problem existed prior and I need it for work on this machine.
> But none of them worked reliably or would straight up glitch out.
> Thats because they do not implement it properly.
> 
> It had to be a product from Magewell, who manage an array of bash
> scripts and the AUR maintainer gets updates if something breaks, too.
> Why do I use a PCIe HDMI Capture Card?
> I need to use Cameras and Displays.
> 
> As for USB Cameras, unless its a product from e.g Logitech, they kept
> giving me similar headaches.
> And that included an older setup that ran a Intel i7 8700K as well.
> 
> Thx.
> 
> 
> 
> On 16.04.2024 10:45, Borislav Petkov wrote:
>> On Tue, Apr 16, 2024 at 06:48:54AM +0000, a-development@posteo.de 
>> wrote:
>>> It worked, it worked!
>>> 
>>> https://up.tail.ws/txt/working-suspend.txt
>>> 
>>> I've tested it now quite some time.
>>> But, I also had to start using 6.6.26-1-lts because my magewell 
>>> capture card
>>> wouldn't without.
>> 
>> Right, that thing I guess:
>> 
>> [Mon Apr 15 18:37:58 2024] ProCapture: loading out-of-tree module 
>> taints kernel.
>> [Mon Apr 15 18:37:58 2024] ProCapture: module verification failed:
>> signature and/or required key missing - tainting kernel
>> 
>> So, machines do suspend even with SRSO enabled and since your machine 
>> is
>> affected, you probably should try without spec_rstack_overflow=off to
>> see if it works with the new kernel.
>> 
>> Then, the other thing you could try is whether suspend works without
>> that proprietary crap.
>> 
>> And then we can see.
>> 
>> Thx.


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-04-17  8:08                 ` a-development
@ 2024-04-17  9:12                   ` Borislav Petkov
  2024-04-22  7:01                     ` a-development
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2024-04-17  9:12 UTC (permalink / raw)
  To: a-development
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

On Wed, Apr 17, 2024 at 08:08:53AM +0000, a-development@posteo.de wrote:
> Today I failed to suspend, and the spec_rstack thing was off.
> 
> https://up.tail.ws/txt/non-working-suspend-2.txt

Ok, but please do not top-post. Put your reply underneath the text
you're replying to and remove the rest of the quoted text like I just
did.

So this could be caused by the proprietary module or something else.

If you want this debugged, you'd have to try to reproduce it with the
latest upstream kernel from here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

after having removed the proprietary module.

HTH.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2024-04-17  9:12                   ` Borislav Petkov
@ 2024-04-22  7:01                     ` a-development
  0 siblings, 0 replies; 17+ messages in thread
From: a-development @ 2024-04-22  7:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh

Hello. I have installed the kernel through 
https://aur.archlinux.org/packages/linux-mainline and noticed that SRSO 
is disabled. "Speculative Return Stack Overflow: IBPB-extending 
microcode not applied!"

cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Vulnerable: Safe RET, no microcode
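
The same status can be read programmatically. The sketch below, in Python, assumes that the file holds a single line and treats the text before the first colon as the overall state and the rest as detail. That split is an assumption made for illustration, not a documented format guarantee, and the function names are hypothetical.

```python
from pathlib import Path

# Directory holding the per-vulnerability status files, as in the command above.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def parse_vuln_status(text: str) -> tuple:
    """Split a status line into (state, detail); detail is empty without a colon."""
    state, _, detail = text.strip().partition(":")
    return state.strip(), detail.strip()


def read_vuln_status(name: str = "spec_rstack_overflow") -> tuple:
    """Read and parse one status file; raises OSError if the file is missing."""
    return parse_vuln_status((VULN_DIR / name).read_text())


# Parsing the line reported above:
state, detail = parse_vuln_status("Vulnerable: Safe RET, no microcode")
print(state)
print(detail)
```

For the line quoted in this message, the state is "Vulnerable" and the detail mentions Safe RET together with missing microcode, which matches the boot message about the IBPB-extending microcode not being applied.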

So far, suspend has worked successfully on the one night I used it.

Assuming this is the default, I've installed the kernel module for my
PCIe capture card and am testing it.

Any new instructions?

Thanks

On 17.04.2024 11:12, Borislav Petkov wrote:
> On Wed, Apr 17, 2024 at 08:08:53AM +0000, a-development@posteo.de 
> wrote:
>> Today I failed to suspend, and the spec_rstack thing was off.
>> 
>> https://up.tail.ws/txt/non-working-suspend-2.txt
> 
> Ok, but please do not top-post. Put your reply underneath the text
> you're replying to and remove the rest of the quoted text like I just
> did.
> 
> So this could be caused by the proprietary module or something else.
> 
> If you want this debugged, you'd have to try to reproduce it with the
> latest upstream kernel from here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> 
> after having removed the proprietary module.
> 
> HTH.
