xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle
       [not found] <52967C45.3030506@prgmr.com>
@ 2013-11-27 23:35 ` Sarah Newman
  2013-11-29  9:42   ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Sarah Newman @ 2013-11-27 23:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Crawford, Luke


> Note the comment close to the beginning - the fact that CR0.TS
> is clear at exception handler entry is actually part of the PV ABI,
> i.e. by altering hypervisor behavior here you break all forward
> ported kernels.
> 
> Nevertheless I agree that there is an issue, but this needs to be
> fixed on the Linux side (hence adding the Linux maintainers to Cc);
> this issue was introduced way back in 2.6.26 (before that there
> was no allocation on that path). It's not clear though whether
> using GFP_ATOMIC for the allocation would be preferable over
> stts() before calling the allocation function (and clts() if it
> succeeded), or whether perhaps to defer the stts() until we
> actually know the task is being switched out. It's going to be an
> ugly, Xen-specific hack in any event.
> 
> Jan

This is a serious bug.  Unfortunately not all of us have the option of updating our guests if/when
such a fix is made to Linux.

How about a per domU + xen command line configuration option to implement Zhu's changes that is
disabled by default?  I'm willing to try to implement it.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle
  2013-11-27 23:35 ` [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle Sarah Newman
@ 2013-11-29  9:42   ` Jan Beulich
  2013-11-30  0:16     ` Sarah Newman
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2013-11-29  9:42 UTC (permalink / raw)
  To: Sarah Newman; +Cc: xen-devel, Luke Crawford

>>> On 28.11.13 at 00:35, Sarah Newman <srn@prgmr.com> wrote:

>> Note the comment close to the beginning - the fact that CR0.TS
>> is clear at exception handler entry is actually part of the PV ABI,
>> i.e. by altering hypervisor behavior here you break all forward
>> ported kernels.
>> 
>> Nevertheless I agree that there is an issue, but this needs to be
>> fixed on the Linux side (hence adding the Linux maintainers to Cc);
>> this issue was introduced way back in 2.6.26 (before that there
>> was no allocation on that path). It's not clear though whether
>> using GFP_ATOMIC for the allocation would be preferable over
>> stts() before calling the allocation function (and clts() if it
>> succeeded), or whether perhaps to defer the stts() until we
>> actually know the task is being switched out. It's going to be an
>> ugly, Xen-specific hack in any event.
> 
> This is a serious bug.  Unfortunately not all of us have the option of 
> updating our guests if/when
> such a fix is made to Linux.
> 
> How about a per domU + xen command line configuration option to implement 
> Zhu's changes that is
> disabled by default?  I'm willing to try to implement it.

There's nothing keeping you from doing so - whether we'd then
accept it as a workaround, or whether instead you'd have to
carry it as a private patch would need to be seen. Iirc there
hasn't been a precedent like this (a command line / config file
override to alter the ABI), so I can't really make predictions on
how acceptable such a change might be.

I'm personally not eager to see something like this go in, most
importantly because I think bugs should be fixed where they
got introduced, not worked around elsewhere, and also
because I think that a generally available workaround would
lower the chances of the bug getting fixed properly.

Apart from that this workaround of yours would have very
ugly behavior: If a guest later got updated to a fixed kernel,
you'd have to alter its configuration along with booting into
the new kernel (i.e. a simple reboot of the VM won't do), or
else your new kernel would crash due to not clearing CR0.TS
before trying to restore FPU context.

Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle
  2013-11-29  9:42   ` Jan Beulich
@ 2013-11-30  0:16     ` Sarah Newman
  2013-12-02  8:34       ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Sarah Newman @ 2013-11-30  0:16 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Luke Crawford

On 11/29/2013 01:42 AM, Jan Beulich wrote:
> I'm personally not eager to see something like this go in, most
> importantly because I think bugs should be fixed where they
> got introduced, not worked around elsewhere, and also
> because I think that a generally available workaround would
> lower the chances of the bug getting fixed properly.
If the proper bug fix was released first would that help?

> 
> Apart from that this workaround of yours would have very
> ugly behavior: If a guest later got updated to a fixed kernel,
> you'd have to alter its configuration along with booting into
> the new kernel (i.e. a simple reboot of the VM won't do), or
> else your new kernel would crash due to not clearing CR0.TS
> before trying to restore FPU context.
Assuming it predictably crashed, that would be easier to deal with than occasional silent memory
corruption.

I think something could be done with the on_reboot action, maybe a reload-restart?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle
  2013-11-30  0:16     ` Sarah Newman
@ 2013-12-02  8:34       ` Jan Beulich
  2013-12-02  9:58         ` Luke S. Crawford
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2013-12-02  8:34 UTC (permalink / raw)
  To: Sarah Newman; +Cc: xen-devel, Luke Crawford

>>> On 30.11.13 at 01:16, Sarah Newman <srn@prgmr.com> wrote:
> On 11/29/2013 01:42 AM, Jan Beulich wrote:
>> I'm personally not eager to see something like this go in, most
>> importantly because I think bugs should be fixed where they
>> got introduced, not worked around elsewhere, and also
>> because I think that a generally available workaround would
>> lower the chances of the bug getting fixed properly.
> If the proper bug fix was released first would that help?

Perhaps.

>> Apart from that this workaround of yours would have very
>> ugly behavior: If a guest later got updated to a fixed kernel,
>> you'd have to alter its configuration along with booting into
>> the new kernel (i.e. a simple reboot of the VM won't do), or
>> else your new kernel would crash due to not clearing CR0.TS
>> before trying to restore FPU context.
> Assuming it predictably crashed, that would be easier to deal with than 
> occasional silent memory
> corruption.
> 
> I think something could be done with the on_reboot action, maybe a 
> reload-restart?

The thing is that a guest admin may not have access to the VM's
config file, i.e. a satisfactory solution ought to involve only guest
side actions.

Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle
  2013-12-02  8:34       ` Jan Beulich
@ 2013-12-02  9:58         ` Luke S. Crawford
  2013-12-02 10:04           ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Luke S. Crawford @ 2013-12-02  9:58 UTC (permalink / raw)
  To: Jan Beulich, Sarah Newman; +Cc: xen-devel

On 12/02/2013 12:34 AM, Jan Beulich wrote:
> The thing is that a guest admin may not have access to the VM's
> config file, i.e. a satisfactory solution ought to involve only guest
> side actions.

A service provider who allows customers to run their own kernel would 
see it the other way around.

That is the situation I find myself in.

I don't have access to my guest's config files.   Something I can do as 
the manager of the dom0 to ameliorate the problem without requiting the 
customers to do anything, and without requiring me to break into the 
guest and figure out how to patch whatever random kernel they might be 
using would help me a lot.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle
  2013-12-02  9:58         ` Luke S. Crawford
@ 2013-12-02 10:04           ` Jan Beulich
  2013-12-02 10:25             ` Luke S. Crawford
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2013-12-02 10:04 UTC (permalink / raw)
  To: Luke S. Crawford, Sarah Newman; +Cc: xen-devel

>>> On 02.12.13 at 10:58, "Luke S. Crawford" <lsc@prgmr.com> wrote:
> On 12/02/2013 12:34 AM, Jan Beulich wrote:
>> The thing is that a guest admin may not have access to the VM's
>> config file, i.e. a satisfactory solution ought to involve only guest
>> side actions.
> 
> A service provider who allows customers to run their own kernel would 
> see it the other way around.

And I didn't say that might not also need taking care of, but I'm in
no way convinced that's necessary:

> That is the situation I find myself in.
> 
> I don't have access to my guest's config files.   Something I can do as 
> the manager of the dom0 to ameliorate the problem without requiting the 
> customers to do anything, and without requiring me to break into the 
> guest and figure out how to patch whatever random kernel they might be 
> using would help me a lot.

If you're the manager of the Dom0, how can you not have access
to your guests' config files?

And it surely isn't the responsibility of the Dom0 admin to take care
of broken guest kernels. That's solely the guest admin's job.

Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle
  2013-12-02 10:04           ` Jan Beulich
@ 2013-12-02 10:25             ` Luke S. Crawford
  0 siblings, 0 replies; 7+ messages in thread
From: Luke S. Crawford @ 2013-12-02 10:25 UTC (permalink / raw)
  To: Jan Beulich, Sarah Newman; +Cc: xen-devel

On 12/02/2013 02:04 AM, Jan Beulich wrote:


>> I don't have access to my guest's config files.   Something I can do as
>> the manager of the dom0 to ameliorate the problem without requiting the
>> customers to do anything, and without requiring me to break into the
>> guest and figure out how to patch whatever random kernel they might be
>> using would help me a lot.
>
> If you're the manager of the Dom0, how can you not have access
> to your guests' config files?

I misspoke.  I mean that I don't have access to the guest's kernel, and 
the guest's linux config files, without breaking into the guests.   Of 
course, I have access to the config files that boot the guests.

> And it surely isn't the responsibility of the Dom0 admin to take care
> of broken guest kernels. That's solely the guest admin's job.

That's what I keep saying, but that's not the way the customers see it. 
   In fact, I think this is why most providers keep a tight hold over 
what kernels they allow their customers to run.

Like I said, it's just my $0.02, and as I recall, I haven't even given 
you that much, so my business needs, obviously, aren't always going to 
map to your priorities.

I'm just saying that from the point of view of a provider, a solution 
that can be implemented from the dom0 is almost always better than a 
solution that requires a change within the DomU.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-12-02 10:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <52967C45.3030506@prgmr.com>
2013-11-27 23:35 ` [PATCH] x86/fpu: CR0.TS should be set before trap into PV guest's #NM exception handle Sarah Newman
2013-11-29  9:42   ` Jan Beulich
2013-11-30  0:16     ` Sarah Newman
2013-12-02  8:34       ` Jan Beulich
2013-12-02  9:58         ` Luke S. Crawford
2013-12-02 10:04           ` Jan Beulich
2013-12-02 10:25             ` Luke S. Crawford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).