public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: "Chang S. Bae" <chang.seok.bae@intel.com>,
	bp@suse.de, luto@kernel.org, mingo@kernel.org, x86@kernel.org
Cc: len.brown@intel.com, lenb@kernel.org, dave.hansen@intel.com,
	thiago.macieira@intel.com, jing2.liu@intel.com,
	ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org,
	chang.seok.bae@intel.com
Subject: Re: [PATCH v10 09/28] x86/fpu/xstate: Introduce helpers to manage the XSTATE buffer dynamically
Date: Fri, 01 Oct 2021 16:20:16 +0200	[thread overview]
Message-ID: <874ka06d3z.ffs@tglx> (raw)
In-Reply-To: <20210825155413.19673-10-chang.seok.bae@intel.com>

On Wed, Aug 25 2021 at 08:53, Chang S. Bae wrote:
> +/**
> + * calculate_xstate_buf_size_from_mask - Calculate an xstate buffer size
> + * @mask:	A bitmap to tell which components to be saved in the buffer.
> + *
> + * Available once those arrays for the offset, size, and alignment info are
> + * set up, by setup_xstate_features().
> + *
> + * Returns:	The buffer size
> + */
> +unsigned int calculate_xstate_buf_size_from_mask(u64 mask)
> +{
> +	unsigned int size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> +	int i, nr;
> +
> +	if (!mask)
> +		return 0;
> +
> +	/*
> +	 * The minimum buffer size excludes the dynamic user state. When a
> +	 * task uses the state, the buffer can grow up to the max size.
> +	 */
> +	if (mask == (xfeatures_mask_all & ~xfeatures_mask_user_dynamic))
> +		return fpu_buf_cfg.min_size;
> +	else if (mask == xfeatures_mask_all)
> +		return fpu_buf_cfg.max_size;
> +
> +	nr = fls64(mask) - 1;
> +	if (nr < FIRST_EXTENDED_XFEATURE)
> +		return size;
> +
> +	/*
> +	 * Each state offset in the non-compacted format is fixed. Take the
> +	 * size from the last feature 'nr'.
> +	 */
> +	if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
> +		return xstate_offsets[nr] + xstate_sizes[nr];
> +
> +	/*
> +	 * With the given mask, no relevant size is found so far. So,
> +	 * calculate it by summing up each state size.
> +	 */
> +	for (i = FIRST_EXTENDED_XFEATURE; i <= nr; i++) {
> +		if (!(mask & BIT_ULL(i)))
> +			continue;
> +
> +		if (xstate_64byte_aligned[i])
> +			size = ALIGN(size, 64);
> +		size += xstate_sizes[i];
> +	}
> +	return size;
> +}

So we have yet another slightly different function to calculate the
buffer size. Why do we still need calculate_xstate_size()?

> +void free_xstate_buffer(struct fpu *fpu)
> +{

Can you please put the check:

> +	if (fpu->state != &fpu->__default_state)

into this function? If it is ever called without checking it and state
points at fpu->__default_state then the explosions are going to be
interesting.

> +	vfree(fpu->state);
> +}
> +
> +/**
> + * realloc_xstate_buffer - Re-alloc a buffer with the size calculated from
> + *			   @mask.
> + *
> + * @fpu:	A struct fpu * pointer
> + * @mask:	The bitmap tells which components to be reserved in the new
> + *		buffer.
> + *
> + * It deals with enlarging the xstate buffer with dynamic states.
> + *
> + * Use vzalloc() simply here. If the task with a vzalloc()-allocated buffer
> + * tends to terminate quickly, vfree()-induced IPIs may be a concern.
> + * Caching may be helpful for this. But the task with large state is likely
> + * to live longer.
> + *
> + * Also, this method does not shrink or reclaim the buffer.
> + *
> + * Returns 0 on success, -ENOMEM on allocation error.
> + */
> +int realloc_xstate_buffer(struct fpu *fpu, u64 mask)
> +{
> +	union fpregs_state *state;
> +	u64 state_mask;
> +
> +	state_mask = fpu->state_mask | mask;
> +	if ((state_mask & fpu->state_mask) == state_mask)
> +		return 0;
> +
> +	state = vzalloc(calculate_xstate_buf_size_from_mask(state_mask));
> +	if (!state)
> +		return -ENOMEM;
> +
> +	/*
> +	 * As long as the register state is intact, save the xstate in the
> +	 * new buffer at the next context switch or ptrace's context
> +	 * injection.

What exactly guarantees that current's xstate is valid in the hardware
registers? This has to be fully preemptible context, otherwise you could
not invoke vzalloc() which can sleep.

Which in turn means that the placement of the exception fixup in a later
patch is broken:

> DEFINE_IDTENTRY(exc_device_not_available)
> {
> 	unsigned long cr0 = read_cr0();
> 
> +	if (handle_xfd_event(&current->thread.fpu, regs))
> +		return;

And no, we are not going to use an atomic allocation for this.

For the other call site from xstateregs_set() the FPU register state is
definitely not live because the task is stopped and the FPU registers
if live belong to the ptracer.

So you really want something like this:

static struct fpregs_state *swap_fpstate(fpu, newstate, mask, size)
{
        old_state = fpu->state;
        fpu->state = newstate;
        fpu->state_mask = mask;
        fpu->state_size = size;
        return old_state != fpu->__default_state ? old_state : NULL;
}

int realloc_xstate_buffer(struct task_struct *tsk, u64 mask)
{
	old_state;
        fpu = tsk->fpu;
        
        size = calc_size(state_mask);
        state = vzalloc(size);
        if (!state)
        	return -ENOMEM;

	if (cpu_feature_enabled(X86_FEATURE_XSAVES))
		fpstate_init_xstate(&state->xsave, state_mask);

        if (tsk != current) {
                // PTRACE ....
        	old_state = swap_fpstate(fpu, state, state_mask, size);
        } else {
                fpregs_lock();
                if (!registers_valid())
                      copy_state(state, fpu->state);
                old_state = swap_fpstate(fpu, state, state_mask, size);
                fpregs_unlock();
        }

        vfree(old_state);
        return 0;
}

And the exception fixup has to move into the irq enabled region. I come
to that patch later.

> @@ -1147,6 +1268,8 @@ static int copy_uabi_to_xstate(struct fpu *fpu, const void *kbuf,
>  	if (validate_user_xstate_header(&hdr))
>  		return -EINVAL;
>  
> +	hdr.xfeatures &= fpu->state_mask;

This is really the wrong patch doing this. This belongs into the one
which introduces fpu->state_mask.

Thanks,

        tglx

  reply	other threads:[~2021-10-01 14:20 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-25 15:53 [PATCH v10 00/28] x86: Support Intel Advanced Matrix Extensions Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 01/28] x86/fpu/xstate: Fix the state copy function to the XSTATE buffer Chang S. Bae
2021-10-01 12:44   ` Thomas Gleixner
2021-10-03 22:34     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 02/28] x86/fpu/xstate: Modify the initialization helper to handle both static and dynamic buffers Chang S. Bae
2021-10-01 12:45   ` Thomas Gleixner
2021-10-03 22:35     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 03/28] x86/fpu/xstate: Modify state copy helpers " Chang S. Bae
2021-10-01 12:47   ` Thomas Gleixner
2021-10-03 22:42     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 04/28] x86/fpu/xstate: Modify address finders " Chang S. Bae
2021-10-01 13:15   ` Thomas Gleixner
2021-10-03 22:35     ` Bae, Chang Seok
2021-10-04 12:54       ` Thomas Gleixner
2021-08-25 15:53 ` [PATCH v10 05/28] x86/fpu/xstate: Add a new variable to indicate dynamic user states Chang S. Bae
2021-10-01 13:16   ` Thomas Gleixner
2021-10-03 22:35     ` Bae, Chang Seok
2021-10-04 12:57       ` Thomas Gleixner
2021-08-25 15:53 ` [PATCH v10 06/28] x86/fpu/xstate: Add new variables to indicate dynamic XSTATE buffer size Chang S. Bae
2021-10-01 13:32   ` Thomas Gleixner
2021-10-03 22:36     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 07/28] x86/fpu/xstate: Calculate and remember dynamic XSTATE buffer sizes Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 08/28] x86/fpu/xstate: Convert the struct fpu 'state' field to a pointer Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 09/28] x86/fpu/xstate: Introduce helpers to manage the XSTATE buffer dynamically Chang S. Bae
2021-10-01 14:20   ` Thomas Gleixner [this message]
2021-10-03 22:36     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 10/28] x86/fpu/xstate: Update the XSTATE save function to support dynamic states Chang S. Bae
2021-10-01 15:41   ` Thomas Gleixner
2021-10-02 21:31     ` Thomas Gleixner
2021-10-02 22:54       ` Bae, Chang Seok
2021-10-05  8:16         ` Paolo Bonzini
2021-10-05  7:50       ` Paolo Bonzini
2021-10-05  9:55         ` Thomas Gleixner
2021-08-25 15:53 ` [PATCH v10 11/28] x86/fpu/xstate: Update the XSTATE buffer address finder " Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 12/28] x86/fpu/xstate: Update the XSTATE context copy function " Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 13/28] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state Chang S. Bae
2021-10-01 15:02   ` Thomas Gleixner
2021-10-01 15:10     ` Thomas Gleixner
2021-10-03 22:38       ` Bae, Chang Seok
2021-10-04 12:35         ` Thomas Gleixner
2021-10-01 20:20     ` Thomas Gleixner
2021-10-03 22:39       ` Bae, Chang Seok
2021-10-04 19:03         ` Thomas Gleixner
2021-10-03 22:41     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 14/28] x86/fpu/xstate: Support ptracer-induced XSTATE buffer expansion Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 15/28] x86/arch_prctl: Create ARCH_SET_STATE_ENABLE/ARCH_GET_STATE_ENABLE Chang S. Bae
2021-08-25 16:36   ` Bae, Chang Seok
2021-08-25 15:54 ` [PATCH v10 16/28] x86/fpu/xstate: Support both legacy and expanded signal XSTATE size Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 17/28] x86/fpu/xstate: Adjust the XSAVE feature table to address gaps in state component numbers Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 18/28] x86/fpu/xstate: Disable XSTATE support if an inconsistent state is detected Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 19/28] x86/cpufeatures/amx: Enumerate Advanced Matrix Extension (AMX) feature bits Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 20/28] x86/fpu/amx: Define AMX state components and have it used for boot-time checks Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 21/28] x86/fpu/amx: Initialize child's AMX state Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 22/28] x86/fpu/amx: Enable the AMX feature in 64-bit mode Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 23/28] x86/fpu/xstate: Skip writing zeros to signal frame for dynamic user states if in INIT-state Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 24/28] selftest/x86/amx: Test cases for the AMX state management Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 25/28] x86/insn/amx: Add TILERELEASE instruction to the opcode map Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 26/28] intel_idle/amx: Add SPR support with XTILEDATA capability Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 27/28] x86/fpu/xstate: Add a sanity check for XFD state when saving XSTATE Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 28/28] x86/arch_prctl: ARCH_GET_FEATURES_WITH_KERNEL_ASSISTANCE Chang S. Bae
2021-09-30 21:12 ` [PATCH v10 00/28] x86: Support Intel Advanced Matrix Extensions Len Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874ka06d3z.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=bp@suse.de \
    --cc=chang.seok.bae@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=len.brown@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=thiago.macieira@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox