linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	"bsingharora@gmail.com" <bsingharora@gmail.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"Syromiatnikov, Eugene" <esyr@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"rdunlap@infradead.org" <rdunlap@infradead.org>,
	"keescook@chromium.org" <keescook@chromium.org>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"Eranian, Stephane" <eranian@google.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"fweimer@redhat.com" <fweimer@redhat.com>,
	"nadav.amit@gmail.com" <nadav.amit@gmail.com>,
	"jannh@google.com" <jannh@google.com>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"kcc@google.com" <kcc@google.com>, "bp@alien8.de" <bp@alien8.de>,
	"oleg@redhat.com" <oleg@redhat.com>,
	"hjl.tools@gmail.com" <hjl.tools@gmail.com>,
	"Yang, Weijiang" <weijiang.yang@intel.com>,
	"Lutomirski, Andy" <luto@kernel.org>,
	"pavel@ucw.cz" <pavel@ucw.cz>, "arnd@arndb.de" <arnd@arndb.de>,
	"Moreira, Joao" <joao.moreira@intel.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mike.kravetz@oracle.com" <mike.kravetz@oracle.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"Dave.Martin@arm.com" <Dave.Martin@arm.com>,
	"john.allen@amd.com" <john.allen@amd.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"gorcunov@gmail.com" <gorcunov@gmail.com>
Subject: Re: [PATCH 23/35] x86/fpu: Add helpers for modifying supervisor xstate
Date: Sat, 12 Feb 2022 02:31:13 +0000	[thread overview]
Message-ID: <7b2be3259a8f062f17048b1b6496a5303e49db46.camel@intel.com> (raw)
In-Reply-To: <d22f9dfc-cb7b-94f6-8585-bb39ed04536b@intel.com>

On Fri, 2022-02-11 at 16:27 -0800, Dave Hansen wrote:
> On 1/30/22 13:18, Rick Edgecombe wrote:
> > Add helpers that can be used to modify supervisor xstate safely for
> > the
> > current task.
> 
> This should be at the end of the changelog.

Hmm, ok.

> 
> > State for supervisors xstate based features can be live and
> > accesses via MSR's, or saved in memory in an xsave buffer. When the
> > kernel needs to modify this state it needs to be sure to operate on
> > it
> > in the right place, so the modifications don't get clobbered.
> 
> We tend to call these "supervisor xfeatures".  The "state is in the
> registers" we call "active".  Maybe:
> 
> 	Just like user xfeatures, supervisor xfeatures can be either
> 	active in the registers or inactive and present in the task FPU
> 	buffer.  If the registers are active, the registers can be
> 	modified directly.  If the registers are not active, the
> 	modification must be performed on the task FPU buffer.

Ok, thanks.

> 
> 
> > In the past supervisor xstate features have used get_xsave_addr()
> > directly, and performed open coded logic handle operating on the
> > saved
> > state correctly. This has posed two problems:
> >  1. It has logic that has been gotten wrong more than once.
> >  2. To reduce code, less common path's are not optimized.
> > Determination
> 
> 			   "paths" ^
> 

Arg, thanks.

> 
> > xstate = start_update_xsave_msrs(XFEATURE_FOO);
> > r = xsave_rdmsrl(state, MSR_IA32_FOO_1, &val)
> > if (r)
> > 	xsave_wrmsrl(state, MSR_IA32_FOO_2, FOO_ENABLE);
> > end_update_xsave_msrs();
> 
> This looks OK.  I'm not thrilled about it.  The
> start_update_xsave_msrs() can probably drop the "_msrs".  Maybe:
> 
> 	start_xfeature_update(...);

Hmm, this whole thing pretends to be updating MSRs, which is often not
true. Maybe the xsave_rdmsrl/xsave_wrmsrl should be renamed too.
xsave_readl()/xsave_writel() or something.

> 
> Also, if you have to do the address lookup in xsave_rdmsrl() anyway,
> I
> wonder if the 'xstate' should just be a full fledged 'struct
> xregs_state'.
> 
> The other option would be to make a little on-stack structure like:
> 
> 	struct xsave_update {
> 		int feature;
> 		struct xregs_state *xregs;
> 	};
> 
> Then you do:
> 
> 	struct xsave_update xsu;
> 	...
> 	start_update_xsave_msrs(&xsu, XFEATURE_FOO);
> 
> and then pass it along to each of the other operations:
> 
> 	r = xsave_rdmsrl(xsu, MSR_IA32_FOO_1, &val)
> 
> It's slightly less likely to get type confused as a 'void *';

The 'void *' is actually a pointer to the specific xfeature in the
buffer. So the read/writes don't have to re-compute the offset every
time. It's not too much work though. I'm really surprised by the desire
to obfuscate the pointer, but I guess if we really want to, I'd rather
do that and keep the regular read/write operations.

If we don't care about the extra lookups this can totally drop the
caller side state. The feature nr can be looked up from the MSR along
with the struct offset. Then it doesn't expose the pointer to the
buffer, since it's all recomputed every operation.

So like:
start_xfeature_update();
r = xsave_readl(MSR_IA32_FOO_1, &val)
if (r)
        xsave_writel(MSR_IA32_FOO_2, FOO_ENABLE);
end_xfeature_update();

The WARNs then happen in the read/writes. An early iteration looked
like that. I liked this version with caller side state, but thought it
might be worth revisiting if there really is a strong desire to hide
the pointer.

> 
> > +static u64 *__get_xsave_member(void *xstate, u32 msr)
> > +{
> > +	switch (msr) {
> > +	/* Currently there are no MSR's supported */
> > +	default:
> > +		WARN_ONCE(1, "x86/fpu: unsupported xstate msr (%u)\n",
> > msr);
> > +		return NULL;
> > +	}
> > +}
> 
> Just to get an idea what this is doing, it's OK to include the shadow
> stack MSRs in here.

Ok.

> 
> Are you sure this should return a u64*?  We have lots of <=64-bit
> XSAVE
> fields.

I thought it should only be used with 64 bit msrs. Maybe it needs a
better name?

> 
> > +/*
> > + * Return a pointer to the xstate for the feature if it should be
> > used, or NULL
> > + * if the MSRs should be written to directly. To do this safely,
> > using the
> > + * associated read/write helpers is required.
> > + */
> > +void *start_update_xsave_msrs(int xfeature_nr)
> > +{
> > +	void *xstate;
> > +
> > +	/*
> > +	 * fpregs_lock() only disables preemption (mostly). So modifing
> > state
> 
> 							 modifying ^
> 	
> > +	 * in an interrupt could screw up some in progress fpregs
> > operation,
> 
> 						^ in-progress

I swear I ran checkpatch...

> 
> > +	 * but appear to work. Warn about it.
> > +	 */
> > +	WARN_ON_ONCE(!in_task());
> > +	WARN_ON_ONCE(current->flags & PF_KTHREAD);
> 
> This might also be a good spot to check that xfeature_nr is in
> fpstate.xfeatures.

Hmm, good idea.

> 
> > +	fpregs_lock();
> > +
> > +	fpregs_assert_state_consistent();
> > +
> > +	/*
> > +	 * If the registers don't need to be reloaded. Go ahead and
> > operate on the
> > +	 * registers.
> > +	 */
> > +	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
> > +		return NULL;
> > +
> > +	xstate = get_xsave_addr(&current->thread.fpu.fpstate-
> > >regs.xsave, xfeature_nr);
> > +
> > +	/*
> > +	 * If regs are in the init state, they can't be retrieved from
> > +	 * init_fpstate due to the init optimization, but are not
> > nessarily
> 
> 							necessarily ^

Oof, thanks.

> 
> Spell checker time.  ":set spell" in vim works for me nicely.
> 
> > +	 * zero. The only option is to restore to make everything live
> > and
> > +	 * operate on registers. This will clear TIF_NEED_FPU_LOAD.
> > +	 *
> > +	 * Otherwise, if not in the init state but TIF_NEED_FPU_LOAD is
> > set,
> > +	 * operate on the buffer. The registers will be restored before
> > going
> > +	 * to userspace in any case, but the task might get preempted
> > before
> > +	 * then, so this possibly saves an xsave.
> > +	 */
> > +	if (!xstate)
> > +		fpregs_restore_userregs();
> 
> Won't fpregs_restore_userregs() end up setting TIF_NEED_FPU_LOAD=0?
> Isn't that a case where a "return NULL" is needed?

This is for the case when the feature is in the init state. For CET's 
case this could just zero the buffer and return the pointer to it, but
for other features the init state wasn't always zero. So this just
makes all the features "active" and TIF_NEED_FPU_LOAD is cleared. It
then returns NULL and the read/writes go to the MSRs. It still looks
correct to me, am I missing something?

> 
> In any case, this makes me think this code should start out stupid
> and
> slow.  Keep the API as-is, but make the first patch unconditionally
> do
> the WRMSR.  Leave the "fast" buffer modifications for a follow-on
> patch.

Ok. Should I drop the optimized versions from the series or just split
them out? The optimizations were trying to address Boris' comments:
https://lore.kernel.org/lkml/YS95VzrNhDhFpsop@zn.tnic/


  reply	other threads:[~2022-02-12  2:31 UTC|newest]

Thread overview: 152+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-30 21:18 [PATCH 00/35] Shadow stacks for userspace Rick Edgecombe
2022-01-30 21:18 ` [PATCH 01/35] Documentation/x86: Add CET description Rick Edgecombe
2022-01-30 21:18 ` [PATCH 02/35] x86/cet/shstk: Add Kconfig option for Shadow Stack Rick Edgecombe
2022-02-07 22:39   ` Dave Hansen
2022-02-08  8:41     ` Thomas Gleixner
2022-02-08 20:20       ` Edgecombe, Rick P
2022-02-08  8:39   ` Thomas Gleixner
2022-01-30 21:18 ` [PATCH 03/35] x86/cpufeatures: Add CET CPU feature flags for Control-flow Enforcement Technology (CET) Rick Edgecombe
2022-02-07 22:45   ` Dave Hansen
2022-02-08 20:23     ` Edgecombe, Rick P
2022-02-09  1:10   ` Kees Cook
2022-01-30 21:18 ` [PATCH 04/35] x86/cpufeatures: Introduce CPU setup and option parsing for CET Rick Edgecombe
2022-02-07 22:49   ` Dave Hansen
2022-02-08 20:29     ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 05/35] x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states Rick Edgecombe
2022-02-07 23:28   ` Dave Hansen
2022-02-08 21:36     ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 06/35] x86/cet: Add control-protection fault handler Rick Edgecombe
2022-02-07 23:56   ` Dave Hansen
2022-02-08 22:23     ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 07/35] x86/mm: Remove _PAGE_DIRTY from kernel RO pages Rick Edgecombe
2022-02-08  0:13   ` Dave Hansen
2022-02-08 22:52     ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 08/35] x86/mm: Move pmd_write(), pud_write() up in the file Rick Edgecombe
2022-01-30 21:18 ` [PATCH 09/35] x86/mm: Introduce _PAGE_COW Rick Edgecombe
2022-02-08  1:05   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 10/35] drm/i915/gvt: Change _PAGE_DIRTY to _PAGE_DIRTY_BITS Rick Edgecombe
2022-02-09 16:58   ` Dave Hansen
2022-02-11  1:39     ` Edgecombe, Rick P
2022-02-11  7:13       ` Wang, Zhi A
2022-02-12  1:45         ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 11/35] x86/mm: Update pte_modify for _PAGE_COW Rick Edgecombe
2022-02-09 18:00   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 12/35] x86/mm: Update ptep_set_wrprotect() and pmdp_set_wrprotect() for transition from _PAGE_DIRTY to _PAGE_COW Rick Edgecombe
2022-02-09 18:30   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 13/35] mm: Move VM_UFFD_MINOR_BIT from 37 to 38 Rick Edgecombe
2022-01-30 21:18 ` [PATCH 14/35] mm: Introduce VM_SHADOW_STACK for shadow stack memory Rick Edgecombe
2022-02-09 21:55   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 15/35] x86/mm: Check Shadow Stack page fault errors Rick Edgecombe
2022-02-09 19:06   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 16/35] x86/mm: Update maybe_mkwrite() for shadow stack Rick Edgecombe
2022-02-09 21:16   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 17/35] mm: Fixup places that call pte_mkwrite() directly Rick Edgecombe
2022-02-09 21:51   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 18/35] mm: Add guard pages around a shadow stack Rick Edgecombe
2022-02-09 22:23   ` Dave Hansen
2022-02-10 22:38     ` David Laight
2022-02-10 23:42       ` Edgecombe, Rick P
2022-02-11  9:08         ` David Laight
2022-02-10 22:43   ` Dave Hansen
2022-02-10 23:07     ` Andy Lutomirski
2022-02-10 23:40       ` Edgecombe, Rick P
2022-02-11 17:54         ` Andy Lutomirski
2022-02-12  0:10           ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 19/35] mm/mmap: Add shadow stack pages to memory accounting Rick Edgecombe
2022-02-09 22:27   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 20/35] mm: Update can_follow_write_pte() for shadow stack Rick Edgecombe
2022-02-09 22:50   ` Dave Hansen
2022-02-09 22:52   ` Dave Hansen
2022-02-10 22:45     ` David Laight
2022-01-30 21:18 ` [PATCH 21/35] mm/mprotect: Exclude shadow stack from preserve_write Rick Edgecombe
2022-02-10 19:27   ` Dave Hansen
2022-01-30 21:18 ` [PATCH 22/35] x86/mm: Prevent VM_WRITE shadow stacks Rick Edgecombe
2022-02-11 22:19   ` Dave Hansen
2022-02-12  1:44     ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 23/35] x86/fpu: Add helpers for modifying supervisor xstate Rick Edgecombe
2022-02-08  8:51   ` Thomas Gleixner
2022-02-09 19:55     ` Edgecombe, Rick P
2022-02-12  0:27   ` Dave Hansen
2022-02-12  2:31     ` Edgecombe, Rick P [this message]
2022-01-30 21:18 ` [PATCH 24/35] mm: Re-introduce vm_flags to do_mmap() Rick Edgecombe
2022-01-30 21:18 ` [PATCH 25/35] x86/cet/shstk: Add user-mode shadow stack support Rick Edgecombe
2022-02-11 23:37   ` Dave Hansen
2022-02-12  0:07     ` Andy Lutomirski
2022-02-12  0:11       ` Dave Hansen
2022-02-12  0:12     ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 26/35] x86/process: Change copy_thread() argument 'arg' to 'stack_size' Rick Edgecombe
2022-02-08  8:38   ` Thomas Gleixner
2022-02-11  2:09     ` Edgecombe, Rick P
2022-02-14 12:33   ` Jann Horn
2022-02-15  1:22     ` Edgecombe, Rick P
2022-02-15  8:49       ` Christian Brauner
2022-01-30 21:18 ` [PATCH 27/35] x86/fpu: Add unsafe xsave buffer helpers Rick Edgecombe
2022-01-30 21:18 ` [PATCH 28/35] x86/cet/shstk: Handle thread shadow stack Rick Edgecombe
2022-01-30 21:18 ` [PATCH 29/35] x86/cet/shstk: Introduce shadow stack token setup/verify routines Rick Edgecombe
2022-01-30 21:18 ` [PATCH 30/35] x86/cet/shstk: Handle signals for shadow stack Rick Edgecombe
2022-01-30 21:18 ` [PATCH 31/35] x86/cet/shstk: Add arch_prctl elf feature functions Rick Edgecombe
2022-01-30 21:18 ` [PATCH 32/35] x86/cet/shstk: Introduce map_shadow_stack syscall Rick Edgecombe
2022-01-30 21:18 ` [PATCH 33/35] selftests/x86: Add map_shadow_stack syscall test Rick Edgecombe
2022-02-03 22:42   ` Dave Hansen
2022-02-04  1:22     ` Edgecombe, Rick P
2022-01-30 21:18 ` [PATCH 34/35] x86/cet/shstk: Support wrss for userspace Rick Edgecombe
2022-01-31  7:56   ` Florian Weimer
2022-01-31 18:26     ` H.J. Lu
2022-01-31 18:45       ` Florian Weimer
2022-01-30 21:18 ` [PATCH 35/35] x86/cpufeatures: Limit shadow stack to Intel CPUs Rick Edgecombe
2022-02-03 21:58   ` John Allen
2022-02-03 22:23     ` H.J. Lu
2022-02-04 22:21       ` John Allen
2022-02-03 21:07 ` [PATCH 00/35] Shadow stacks for userspace Thomas Gleixner
2022-02-04  1:08   ` Edgecombe, Rick P
2022-02-04  5:20     ` Andy Lutomirski
2022-02-04 20:23       ` Edgecombe, Rick P
2022-02-05 13:26     ` David Laight
2022-02-05 13:29       ` H.J. Lu
2022-02-05 20:15         ` Edgecombe, Rick P
2022-02-05 20:21           ` H.J. Lu
2022-02-06 13:19             ` Peter Zijlstra
2022-02-06 13:42           ` David Laight
2022-02-06 13:55             ` H.J. Lu
2022-02-07 10:22             ` Florian Weimer
2022-02-08  1:46             ` Edgecombe, Rick P
2022-02-08  1:31           ` Andy Lutomirski
2022-02-08  9:31             ` Thomas Gleixner
2022-02-08 16:15               ` Andy Lutomirski
2022-02-06 13:06     ` Peter Zijlstra
2022-02-06 18:42 ` Mike Rapoport
2022-02-07  7:20   ` Adrian Reber
2022-02-07 16:30     ` Dave Hansen
2022-02-08  9:16       ` Mike Rapoport
2022-02-08  9:29         ` Cyrill Gorcunov
2022-02-08 16:21           ` Andy Lutomirski
2022-02-08 17:02             ` Cyrill Gorcunov
2022-02-09  2:18               ` Edgecombe, Rick P
2022-02-09  6:43                 ` Cyrill Gorcunov
2022-02-09 10:53                 ` Mike Rapoport
2022-02-10  2:37                 ` Andy Lutomirski
2022-02-10  2:53                   ` H.J. Lu
2022-02-10 13:52                     ` Willgerodt, Felix
2022-02-11  7:41                   ` avagin
2022-02-11  8:04                     ` Mike Rapoport
2022-02-28 20:27                   ` Mike Rapoport
2022-02-28 20:30                     ` Andy Lutomirski
2022-02-28 21:30                       ` Mike Rapoport
2022-02-28 22:55                         ` Andy Lutomirski
2022-03-03 19:40                           ` Mike Rapoport
2022-03-03 23:00                             ` Andy Lutomirski
2022-03-04  1:30                               ` Edgecombe, Rick P
2022-03-04 19:13                                 ` Andy Lutomirski
2022-03-07 18:56                                   ` Mike Rapoport
2022-03-07 19:07                                     ` H.J. Lu
2022-05-31 11:59                                       ` Mike Rapoport
2022-05-31 16:25                                         ` Edgecombe, Rick P
2022-05-31 16:36                                           ` Mike Rapoport
2022-05-31 17:34                                             ` Edgecombe, Rick P
2022-05-31 18:00                                               ` H.J. Lu
2022-06-01 17:27                                                 ` Edgecombe, Rick P
2022-06-01 19:27                                                   ` H.J. Lu
2022-06-01  8:06                                               ` Mike Rapoport
2022-06-01 17:24                                                 ` Edgecombe, Rick P
2022-06-09 18:04                                                   ` Mike Rapoport
2022-03-07 22:21                                     ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7b2be3259a8f062f17048b1b6496a5303e49db46.camel@intel.com \
    --to=rick.p.edgecombe@intel.com \
    --cc=Dave.Martin@arm.com \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=bsingharora@gmail.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=eranian@google.com \
    --cc=esyr@redhat.com \
    --cc=fweimer@redhat.com \
    --cc=gorcunov@gmail.com \
    --cc=hjl.tools@gmail.com \
    --cc=hpa@zytor.com \
    --cc=jannh@google.com \
    --cc=joao.moreira@intel.com \
    --cc=john.allen@amd.com \
    --cc=kcc@google.com \
    --cc=keescook@chromium.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=oleg@redhat.com \
    --cc=pavel@ucw.cz \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=rdunlap@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=weijiang.yang@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).