Re: [RFC PATCH] x86/arch_prctl: Add ARCH_SET_XCR0 to mask XCR0 per-thread


From: Dave Hansen <dave.hansen@linux.intel.com>
To: Keno Fischer <keno@juliacomputing.com>, linux-kernel@vger.kernel.org
Cc: "Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	"Borislav Petkov" <bp@suse.de>,
	"Andi Kleen" <andi@firstfloor.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Kyle Huey" <khuey@kylehuey.com>,
	"Robert O'Callahan" <robert@ocallahan.org>
Subject: Re: [RFC PATCH] x86/arch_prctl: Add ARCH_SET_XCR0 to mask XCR0 per-thread
Date: Mon, 18 Jun 2018 05:47:22 -0700
Message-ID: <d8503ac3-b6a4-d6cc-3db7-4e092724b79d@linux.intel.com>
In-Reply-To: <1529195582-64207-1-git-send-email-keno@alumni.harvard.edu>

On 06/16/2018 05:33 PM, Keno Fischer wrote:
> For my use case, it would be sufficient to simply disallow
> any value of XCR0 with "holes" in it,

But what if the hardware you are migrating to/from *has* holes?

There's no way this is even close to viable until it has been made to
cope with holes.
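
To make "holes" concrete, here is a minimal userspace sketch of the check being discussed. The helper name xcr0_has_holes() is made up for illustration and is not part of the patch. It assumes a "hole-free" XCR0 means the enabled bits form one contiguous run starting at bit 0. As an example, 0xE7 is what you get with the AVX-512 components (bits 5-7) enabled but the MPX components (bits 3-4) absent.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch: returns true if the set bits
 * of 'mask' do not form a single contiguous run starting at bit 0.
 */
static bool xcr0_has_holes(uint64_t mask)
{
	/*
	 * A hole-free mask has the form 2^n - 1, so adding 1 yields a
	 * value that shares no bits with it.  Any shared bit means there
	 * is a clear bit below some set bit.
	 */
	return (mask & (mask + 1)) != 0;
}
```

With this definition, 0x7 (x87, SSE, AVX) has no holes, while 0xE7 does. A policy of rejecting every mask with holes would reject the latter outright.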

FWIW, I just don't think this is going to be viable.  I have the feeling
that there's way too much stuff that hard-codes assumptions about XCR0
inside the kernel and out. This is just going to make it much more fragile.

Folks that want this level of container migration are probably better
off running one of the hardware-based containers and migrating _those_.
Or just ensuring that the machines they want to migrate to and from have
a homogeneous XCR0 mix.

> @@ -252,6 +301,8 @@ void arch_setup_new_exec(void)
>  	/* If cpuid was previously disabled for this task, re-enable it. */
>  	if (test_thread_flag(TIF_NOCPUID))
>  		enable_cpuid();
> +	if (test_thread_flag(TIF_MASKXCR0))
> +		reset_xcr0_mask();
>  }

So the mask is cleared on exec().  Does that mean that *every*
individual process using this interface has to set up its own mask
before anything in the C library establishes its cached value of XCR0?
I'd want to see how that's being accomplished.

> +static int xstate_is_initial(unsigned long mask)
> +{
> +	int i, j;
> +	unsigned long max_bit = __ffs(mask);
> +
> +	for (i = 0; i < max_bit; ++i) {
> +		if (mask & (1 << i)) {
> +			char *xfeature_addr = (char *)get_xsave_addr(
> +					&current->thread.fpu.state.xsave,
> +					1 << i);
> +			unsigned long feature_size = xfeature_size(i);
> +
> +			for (j = 0; j < feature_size; ++j) {
> +				if (xfeature_addr[j] != 0)
> +					return 0;
> +			}
> +		}
> +	}
> +	return 1;
> +}

There is nothing architectural saying that the init state has to be 0.
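
MXCSR is the obvious example: its default value is 0x1F80, not 0. A rough sketch of the alternative, comparing a component against a reference image of its init values instead of scanning for zero bytes, could look like the following. The helper name and the idea of passing in a reference image are assumptions for illustration, not something the patch does. In the kernel the natural source for such an image would presumably be init_fpstate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the 'size' bytes of a state
 * component match a reference copy of that component's init values.
 * Unlike a scan for non-zero bytes, this stays correct when the init
 * value of a field is not 0.
 */
static bool component_matches_init(const void *state,
				   const void *init_image,
				   size_t size)
{
	return memcmp(state, init_image, size) == 0;
}
```

A zero-scan would report a field holding 0x1F80 as "not initial", and a zeroed field as "initial", which is backwards for anything whose init value is non-zero.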

> +	case ARCH_SET_XCR0: {

The interface is a bit murky.  The SET_XCR0 operation masks out the
"set" value from the current value?  That's a bit counterintuitive.

> +		unsigned long mask = xfeatures_mask & ~arg2;
> +
> +		if (!use_xsave())
> +			return -ENODEV;
> +
> +		if (arg2 & ~xfeatures_mask)
> +			return -ENODEV;

This is rather unfortunately comment-free.  "Are you trying to clear a
bit that was not set in the first place?"

Also, shouldn't this be dealing with the new task->xcr0, *not* the
global xfeatures_mask?  What if someone calls this more than once?
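
To illustrate the repeated-call problem, here is a small userspace model. Everything in it is hypothetical: struct task_model stands in for whatever per-task XCR0 field the patch ends up with, and refusing to turn on bits that are not currently enabled is an assumption about the intended semantics, not something the patch states.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-task XCR0 value. */
struct task_model {
	uint64_t xcr0;
};

/* Features that moving this task to 'new_xcr0' would newly disable. */
static uint64_t newly_disabled(const struct task_model *t, uint64_t new_xcr0)
{
	return t->xcr0 & ~new_xcr0;
}

/*
 * Apply a new value relative to the task's *current* XCR0.
 * Returns 0 on success, -1 if the caller asks for a bit that is not
 * currently enabled for this task (assumed policy).
 */
static int model_set_xcr0(struct task_model *t, uint64_t new_xcr0)
{
	if (new_xcr0 & ~t->xcr0)
		return -1;
	t->xcr0 = new_xcr0;
	return 0;
}
```

Starting from 0xE7, a first call with 0x7 disables 0xE0.  A second call with 0x3 then newly disables only 0x4.  Computed against a global mask of 0xE7, the second call would instead see 0xE4 as being disabled and would re-check components that were already turned off by the first call.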

> +		if (!xcr0_is_legal(arg2))
> +			return -EINVAL;

FWIW, I don't really get the point of disallowing some of the values
made illegal in there.  Sure, you shoot yourself in the foot, but the
worst you'll probably see is a general-protection-fault from the XSETBV,
or from the first XRSTOR*.  We can cope with those, and I'd rather not
be trying to keep a list of things you're not allowed to do with XSAVE.

I also don't see any sign of checking for supervisor features anywhere.

> +		/*
> +		 * We require that any state components being disabled by
> +		 * this prctl be currently in their initial state.
> +		 */
> +		if (!xstate_is_initial(mask))
> +			return -EPERM;

Aside: I would *not* refer to the "initial state", for fear that we
could confuse it with the hardware-defined "init state".  From software,
we really have zero control over when the hardware is in its "init state".

But, in any case, how is this supposed to work?

	// get features we are disabling into values matching the
	// hardware "init state".
	__asm__("XRSTOR %reg1,%reg2", ...);
	arch_prctl(ARCH_SET_XCR0, something);

?

That would be *really* fragile code from userspace.  Adding a printk()
between those two lines would probably break it, for instance.

I'd probably just not have these checks.

