virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* proposed interface change for setting the ldt
@ 2006-08-18 10:42 Jeremy Fitzhardinge
  2006-08-18 12:46 ` Jeremy Fitzhardinge
  2006-08-18 20:23 ` Zachary Amsden
  0 siblings, 2 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-08-18 10:42 UTC (permalink / raw)
  To: Zachary Amsden, Rusty Russell, Chris Wright; +Cc: Virtualization Mailing List

At the moment, all the places where the LDT are set in the kernel are of 
the form:

    set_ldt_desc(cpu, segments, count);
    load_LDT_desc();

(two instances in asm-i386/desc.h).  set_ldt_desc() sets an LDT 
descriptor in the GDT, and load_LDT_desc() is basically just lldt. These 
map to the write_gdt_entry and load_ldt_desc paravirt ops.

This doesn't work well for Xen, because you set the ldt directly by 
passing the base+size into the hypervisor.  In fact, it doesn't allow 
you to set an LDT-type descriptor into the GDT, so this current 
interface requires the Xen backend to decode the descriptor passed to 
write_gdt_entry, look to see if its an LDT; if so, store the base+size 
somewhere, and then when load_ldt_desc() is called, do the appropriate 
Xen hypercall.

A better interface for us would be simply:

	set_ldt(const struct desc_struct *ldt, int num_entries);

since this maps directly to the appropriate Xen hypercall.  If you still 
want to implement it by plugging the LDT descriptor into the GDT and 
then lldt, then there's no reason you can't implement it that way.

Thoughts?

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-18 10:42 proposed interface change for setting the ldt Jeremy Fitzhardinge
@ 2006-08-18 12:46 ` Jeremy Fitzhardinge
  2006-08-18 13:03   ` Jeremy Fitzhardinge
  2006-08-18 20:23 ` Zachary Amsden
  1 sibling, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-08-18 12:46 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Chris Wright, Virtualization Mailing List

Jeremy Fitzhardinge wrote:
> 	set_ldt(const struct desc_struct *ldt, int num_entries);
>   
Make that

	set_ldt(unsigned cpu, const struct desc_struct *ldt, int num_entries);

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-18 12:46 ` Jeremy Fitzhardinge
@ 2006-08-18 13:03   ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-08-18 13:03 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Chris Wright, Virtualization Mailing List

Jeremy Fitzhardinge wrote:
> Jeremy Fitzhardinge wrote:
>>     set_ldt(const struct desc_struct *ldt, int num_entries);
>>   
> Make that
>
>     set_ldt(unsigned cpu, const struct desc_struct *ldt, int 
> num_entries);

Or maybe not.  The current code assumes that cpu is the current cpu 
anyway;  the implementation can use smp_processor_id() if it needs it 
anyway (Xen doesn't).

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-18 10:42 proposed interface change for setting the ldt Jeremy Fitzhardinge
  2006-08-18 12:46 ` Jeremy Fitzhardinge
@ 2006-08-18 20:23 ` Zachary Amsden
  2006-08-19  2:39   ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 12+ messages in thread
From: Zachary Amsden @ 2006-08-18 20:23 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Chris Wright, Virtualization Mailing List

Jeremy Fitzhardinge wrote:
> A better interface for us would be simply:
>
>     set_ldt(const struct desc_struct *ldt, int num_entries);
>
> since this maps directly to the appropriate Xen hypercall.  If you 
> still want to implement it by plugging the LDT descriptor into the GDT 
> and then lldt, then there's no reason you can't implement it that way.
>
> Thoughts?

This interface doesn't work for anything other than Xen.  It is 
impossible to implement it without specific knowledge of kernel 
internals, since it doesn't provide the GDT selector for the LDT.  Now 
everything that looks like real hardware needs to move the knowledge of 
the LDT structure into paravirt-ops, and it has no clear calling 
convention, so you've now got to reason about SMP preemption correctness 
inside the paravirt-op, instead of at the higher level where it should 
be done.

This is an example of one place where Xen has broken the x86 
architecture in favor of a simpler implementation of the hypervisor, but 
a radical change to the kernel.  It is one of the reasons why XenoLinux 
as implemented is unable to boot on native hardware.  I would strongly 
prefer if we didn't introduce this ugliness into paravirt-ops.

Zach

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-18 20:23 ` Zachary Amsden
@ 2006-08-19  2:39   ` Jeremy Fitzhardinge
  2006-08-19  3:12     ` Zachary Amsden
  0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-08-19  2:39 UTC (permalink / raw)
  To: Zachary Amsden; +Cc: Chris Wright, Virtualization Mailing List

Zachary Amsden wrote:
> This interface doesn't work for anything other than Xen.

It works OK for native.  It's a very simple rolling together of two 
operations which always happen together anyway.

> It is impossible to implement it without specific knowledge of kernel 
> internals, since it doesn't provide the GDT selector for the LDT.

Neither does the previous interface.  load_ldt_desc needs to have the 
specific LDT entry hardcoded into it.

>   Now everything that looks like real hardware needs to move the 
> knowledge of the LDT structure into paravirt-ops, 

Do you mean the GDT structure?

> and it has no clear calling convention, so you've now got to reason 
> about SMP preemption correctness inside the paravirt-op, instead of at 
> the higher level where it should be done.

The previous interface already required that preempt be disabled around 
those functions.  In the previous interface, set_ldt_desc takes a cpu 
number, but it is required to equal the current cpu; load_ldt_desc 
always operates on the current CPU.  It therefore requires that those 
two ops be paired with preempt disabled.  The new interface is simpler, 
but still requires preempt disabled around it.

In general, the set_ldt interface is cleaner for the base kernel, and 
much cleaner for Xen, while being trivial to implement for native 
hardware or something which looks like it.

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-19  2:39   ` Jeremy Fitzhardinge
@ 2006-08-19  3:12     ` Zachary Amsden
  2006-08-19  3:18       ` Jeremy Fitzhardinge
  2006-08-19  3:22       ` Chris Wright
  0 siblings, 2 replies; 12+ messages in thread
From: Zachary Amsden @ 2006-08-19  3:12 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Chris Wright, Virtualization Mailing List

Jeremy Fitzhardinge wrote:
> Zachary Amsden wrote:
>> This interface doesn't work for anything other than Xen.
>
> It works OK for native.  It's a very simple rolling together of two 
> operations which always happen together anyway.
>
>> It is impossible to implement it without specific knowledge of kernel 
>> internals, since it doesn't provide the GDT selector for the LDT.
>
> Neither does the previous interface.  load_ldt_desc needs to have the 
> specific LDT entry hardcoded into it.
>
>>   Now everything that looks like real hardware needs to move the 
>> knowledge of the LDT structure into paravirt-ops, 
>
> Do you mean the GDT structure?

Yes.

>
>> and it has no clear calling convention, so you've now got to reason 
>> about SMP preemption correctness inside the paravirt-op, instead of 
>> at the higher level where it should be done.
>
> The previous interface already required that preempt be disabled 
> around those functions.  In the previous interface, set_ldt_desc takes 
> a cpu number, but it is required to equal the current cpu; 
> load_ldt_desc always operates on the current CPU.  It therefore 
> requires that those two ops be paired with preempt disabled.  The new 
> interface is simpler, but still requires preempt disabled around it.

The paravirt-op just got a lot harder to implement, so there is a cost 
to the simpler interface.

>
> In general, the set_ldt interface is cleaner for the base kernel, and 
> much cleaner for Xen, while being trivial to implement for native 
> hardware or something which looks like it.

I just think it's really weird to have LDT not described in the GDT, but 
LDT is weird anyways.

Zach

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-19  3:12     ` Zachary Amsden
@ 2006-08-19  3:18       ` Jeremy Fitzhardinge
  2006-08-19  3:22       ` Chris Wright
  1 sibling, 0 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-08-19  3:18 UTC (permalink / raw)
  To: Zachary Amsden; +Cc: Chris Wright, Virtualization Mailing List

Zachary Amsden wrote:
> The paravirt-op just got a lot harder to implement, so there is a cost 
> to the simpler interface.

I don't see why it is a "lot" harder.  It might take 5 lines of code 
rather than jumping straight into the ROM, but it doesn't seem like huge 
complexity.

> I just think it's really weird to have LDT not described in the GDT, 
> but LDT is weird anyways.

It gets excluded as part of a broader test which also prevents any 
table, but specifically the LDT, from referring to an LDT (on the 
grounds that its hard to reason about, and may be undefined).

    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-19  3:12     ` Zachary Amsden
  2006-08-19  3:18       ` Jeremy Fitzhardinge
@ 2006-08-19  3:22       ` Chris Wright
  2006-08-19  3:41         ` Zachary Amsden
  1 sibling, 1 reply; 12+ messages in thread
From: Chris Wright @ 2006-08-19  3:22 UTC (permalink / raw)
  To: Zachary Amsden; +Cc: Chris Wright, Virtualization Mailing List

* Zachary Amsden (zach@vmware.com) wrote:
> The paravirt-op just got a lot harder to implement, so there is a cost 
> to the simpler interface.

I'm missing why it's a lot harder.  Seems reasonably straight forward.
puzzled...

-chris

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-19  3:22       ` Chris Wright
@ 2006-08-19  3:41         ` Zachary Amsden
  2006-08-19  4:32           ` Rusty Russell
  0 siblings, 1 reply; 12+ messages in thread
From: Zachary Amsden @ 2006-08-19  3:41 UTC (permalink / raw)
  To: Chris Wright; +Cc: Virtualization Mailing List

Chris Wright wrote:
> * Zachary Amsden (zach@vmware.com) wrote:
>   
>> The paravirt-op just got a lot harder to implement, so there is a cost 
>> to the simpler interface.
>>     
>
> I'm missing why it's a lot harder.  Seems reasonably straight forward.
> puzzled...
>   

Before it could be a direct call for us.  Now I am forced to write a 
wrapper function around it which does exactly the same work as the 
native code, them calls a ROM function.  It is straight forward, but 
obviously undesirable.

Zach

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-19  3:41         ` Zachary Amsden
@ 2006-08-19  4:32           ` Rusty Russell
  2006-08-19 12:18             ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 12+ messages in thread
From: Rusty Russell @ 2006-08-19  4:32 UTC (permalink / raw)
  To: Zachary Amsden; +Cc: Chris Wright, Virtualization Mailing List

On Fri, 2006-08-18 at 20:41 -0700, Zachary Amsden wrote:
> Chris Wright wrote:
> > * Zachary Amsden (zach@vmware.com) wrote:
> >   
> >> The paravirt-op just got a lot harder to implement, so there is a cost 
> >> to the simpler interface.
> >>     
> >
> > I'm missing why it's a lot harder.  Seems reasonably straight forward.
> > puzzled...
> >   
> 
> Before it could be a direct call for us.  Now I am forced to write a 
> wrapper function around it which does exactly the same work as the 
> native code, them calls a ROM function.  It is straight forward, but 
> obviously undesirable.

It sounds fine to me, although I'd like to see the patch.  I don't have
anything against higher-level abstractions, if it helps any hypervisor,
as long as it doesn't warp the kernel code.  And if most hypervisors and
native break it down the same way, well, we can always create helpers.

Cheers,
Rusty.
-- 
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-19  4:32           ` Rusty Russell
@ 2006-08-19 12:18             ` Jeremy Fitzhardinge
  2006-08-21  5:01               ` Zachary Amsden
  0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2006-08-19 12:18 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Chris Wright, Virtualization Mailing List

Rusty Russell wrote:
> It sounds fine to me, although I'd like to see the patch.  I don't have
> anything against higher-level abstractions, if it helps any hypervisor,
> as long as it doesn't warp the kernel code.  And if most hypervisors and
> native break it down the same way, well, we can always create helpers.
>   

Doing it this way does save a lot of complexity in the Xen case, which 
would otherwise have to:

   1. unpack the descriptor
   2. if it's an LDT, save it off in a per-cpu structure
   3. when load_ldt_desc is called, look up the structure, extract the
      base+size of the ldt, and make a hypercall to set the ldt

vs

   1. set the ldt


    J

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: proposed interface change for setting the ldt
  2006-08-19 12:18             ` Jeremy Fitzhardinge
@ 2006-08-21  5:01               ` Zachary Amsden
  0 siblings, 0 replies; 12+ messages in thread
From: Zachary Amsden @ 2006-08-21  5:01 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Chris Wright, Virtualization Mailing List

Jeremy Fitzhardinge wrote:
> Rusty Russell wrote:
>> It sounds fine to me, although I'd like to see the patch.  I don't have
>> anything against higher-level abstractions, if it helps any hypervisor,
>> as long as it doesn't warp the kernel code.  And if most hypervisors and
>> native break it down the same way, well, we can always create helpers.
>>   
>
> Doing it this way does save a lot of complexity in the Xen case, which 
> would otherwise have to:
>
>   1. unpack the descriptor
>   2. if it's an LDT, save it off in a per-cpu structure
>   3. when load_ldt_desc is called, look up the structure, extract the
>      base+size of the ldt, and make a hypercall to set the ldt
>
> vs
>
>   1. set the ldt

This way works for us as well, but my opinion is - Xen deliberately 
broke the x86 interface here, for no strong reason.  In an ideal world, 
we shouldn't have to change Linux to work around that, nor should other 
hypervisors have to live with the consequences of that decision.  So I 
believe the onus is on Xen to fix the problem in their layer, not the 
other way around.  But I have no problem adapting to this way of doing 
things for now, since this is not performance critical, and LDT is 
barely used in Linux anyway, even then, mostly by unsupported 
applications in a paravirtualized world.

Zach

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2006-08-21  5:01 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-08-18 10:42 proposed interface change for setting the ldt Jeremy Fitzhardinge
2006-08-18 12:46 ` Jeremy Fitzhardinge
2006-08-18 13:03   ` Jeremy Fitzhardinge
2006-08-18 20:23 ` Zachary Amsden
2006-08-19  2:39   ` Jeremy Fitzhardinge
2006-08-19  3:12     ` Zachary Amsden
2006-08-19  3:18       ` Jeremy Fitzhardinge
2006-08-19  3:22       ` Chris Wright
2006-08-19  3:41         ` Zachary Amsden
2006-08-19  4:32           ` Rusty Russell
2006-08-19 12:18             ` Jeremy Fitzhardinge
2006-08-21  5:01               ` Zachary Amsden

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).