public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
@ 2007-05-04 11:56 Eric W. Biederman
  2007-05-04 12:12 ` Rusty Russell
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-04 11:56 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Rusty Russell, Chris Wright, Jeremy Fitzhardinge, Zachary Amsden,
	Andrew Morton, Linus Torvalds, H. Peter Anvin, linux-kernel

This reverts commit c9ccf30d77f04064fe5436027ab9d2230c7cdd94.

Entering the kernel at startup_32 without passing our real mode data
in %esi, and without guaranteeing that physical and virtual addresses
are identity mapped makes head.S impossible to maintain.

The only user of this infrastructure is lguest which is not merged so
nothing we currently support will break by removing this over designed
nightmare, and only the pending lguest patches will be affected. The
pending Xen patches have a different entry point that they use.

We are currently discussing what Xen and lguest need to do to boot the
kernel in a more normal fashion so using startup_32 in this weird
manner is clearly not their long term direction.

So let's remove this code in head.S before it causes brain damage to
people trying to maintain head.S

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andrew Morton <akpm@osdl.org>
CC: H. Peter Anvin <hpa@zytor.com> 
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 arch/i386/kernel/head.S        |   38 --------------------------------------
 arch/i386/kernel/paravirt.c    |    1 -
 arch/i386/kernel/vmlinux.lds.S |    6 ------
 include/asm-i386/paravirt.h    |    5 -----
 4 files changed, 0 insertions(+), 50 deletions(-)

diff --git a/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
index 3fa7f93..9a8c7f0 100644
--- a/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -56,12 +56,6 @@
 .section .text.head,"ax",@progbits
 ENTRY(startup_32)
 
-#ifdef CONFIG_PARAVIRT
-        movl %cs, %eax
-        testl $0x3, %eax
-        jnz startup_paravirt
-#endif
-
 /*
  * Set segments to known values.
  */
@@ -503,38 +497,6 @@ ignore_int:
 	iret
 
 .section .text
-#ifdef CONFIG_PARAVIRT
-startup_paravirt:
-	cld
- 	movl $(init_thread_union+THREAD_SIZE),%esp
-
-	/* We take pains to preserve all the regs. */
-	pushl	%edx
-	pushl	%ecx
-	pushl	%eax
-
-	pushl	$__start_paravirtprobe
-1:
-	movl	0(%esp), %eax
-	cmpl	$__stop_paravirtprobe, %eax
-	je	unhandled_paravirt
-	pushl	(%eax)
-	movl	8(%esp), %eax
-	call	*(%esp)
-	popl	%eax
-
-	movl	4(%esp), %eax
-	movl	8(%esp), %ecx
-	movl	12(%esp), %edx
-
-	addl	$4, (%esp)
-	jmp	1b
-
-unhandled_paravirt:
-	/* Nothing wanted us: we're screwed. */
-	ud2
-#endif
-
 /*
  * Real beginning of normal "text" segment
  */
diff --git a/arch/i386/kernel/paravirt.c b/arch/i386/kernel/paravirt.c
index 2ec331e..fe9e4dc 100644
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -19,7 +19,6 @@
 #include <linux/module.h>
 #include <linux/efi.h>
 #include <linux/bcd.h>
-#include <linux/start_kernel.h>
 
 #include <asm/bug.h>
 #include <asm/paravirt.h>
diff --git a/arch/i386/kernel/vmlinux.lds.S b/arch/i386/kernel/vmlinux.lds.S
index 6f38f81..28c5477 100644
--- a/arch/i386/kernel/vmlinux.lds.S
+++ b/arch/i386/kernel/vmlinux.lds.S
@@ -79,12 +79,6 @@ SECTIONS
 	CONSTRUCTORS
 	} :data
 
-  .paravirtprobe : AT(ADDR(.paravirtprobe) - LOAD_OFFSET) {
-  	__start_paravirtprobe = .;
-	*(.paravirtprobe)
-  	__stop_paravirtprobe = .;
-  }
-
   . = ALIGN(4096);
   .data_nosave : AT(ADDR(.data_nosave) - LOAD_OFFSET) {
   	__nosave_begin = .;
diff --git a/include/asm-i386/paravirt.h b/include/asm-i386/paravirt.h
index e63f1e4..ce61030 100644
--- a/include/asm-i386/paravirt.h
+++ b/include/asm-i386/paravirt.h
@@ -160,11 +160,6 @@ struct paravirt_ops
 	void (*startup_ipi_hook)(int phys_apicid, unsigned long start_eip, unsigned long start_esp);
 };
 
-/* Mark a paravirt probe function. */
-#define paravirt_probe(fn)						\
- static asmlinkage void (*__paravirtprobe_##fn)(void) __attribute_used__ \
-		__attribute__((__section__(".paravirtprobe"))) = fn
-
 extern struct paravirt_ops paravirt_ops;
 
 #define paravirt_enabled() (paravirt_ops.paravirt_enabled)
-- 
1.5.1.1.181.g2de0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 11:56 [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" Eric W. Biederman
@ 2007-05-04 12:12 ` Rusty Russell
  2007-05-04 14:13   ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Rusty Russell @ 2007-05-04 12:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andi Kleen, Chris Wright, Jeremy Fitzhardinge, Zachary Amsden,
	Andrew Morton, Linus Torvalds, H. Peter Anvin, linux-kernel

On Fri, 2007-05-04 at 05:56 -0600, Eric W. Biederman wrote:
> This reverts commit c9ccf30d77f04064fe5436027ab9d2230c7cdd94.
> 
> Entering the kernel at startup_32 without passing our real mode data
> in %esi, and without guaranteeing that physical and virtual addresses
> are identity mapped makes head.S impossible to maintain.

Your assertion that these #ifdef-contained lines of code are "impossible
to maintain" is theatrical.  I'd prefer to wait until we actually have a
replacement before we go ripping out working code.  

Fortunately, I'm working on such code right now.  But as lguest
currently boots with V = P + PAGE_OFFSET, it's not entirely a trivial
matter to change.

Thanks,
Rusty.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 12:12 ` Rusty Russell
@ 2007-05-04 14:13   ` Eric W. Biederman
  2007-05-04 14:37     ` Rusty Russell
  2007-05-04 14:59     ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-04 14:13 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andi Kleen, Chris Wright, Jeremy Fitzhardinge, Zachary Amsden,
	Andrew Morton, Linus Torvalds, H. Peter Anvin, linux-kernel

Rusty Russell <rusty@rustcorp.com.au> writes:

> On Fri, 2007-05-04 at 05:56 -0600, Eric W. Biederman wrote:
>> This reverts commit c9ccf30d77f04064fe5436027ab9d2230c7cdd94.
>> 
>> Entering the kernel at startup_32 without passing our real mode data
>> in %esi, and without guaranteeing that physical and virtual addresses
>> are identity mapped makes head.S impossible to maintain.
>
> Your assertion that these #ifdef-contained lines of code are "impossible
> to maintain" is theatrical.  I'd prefer to wait until we actually have a
> replacement before we go ripping out working code.  

It's not theatrical.  It makes this code path extremely brittle and
very hard to change, which over the long term means that it is
impossible to maintain.  Quickly resulting in a state where any little
change will break something.  See Jeremy's impossible attempt to
detect if we start with physical addresses identity mapped or physical
address mapped at PAGE_OFFSET, just so he can clear the bss before we
test to see if we are running in para-virtualized environment.

The first step in making things better is removing this code (even in
your patches) so this patch doesn't get in the way.

We don't have any working code, there are no in tree users.

Therefore the code should go.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 14:13   ` Eric W. Biederman
@ 2007-05-04 14:37     ` Rusty Russell
  2007-05-04 15:07       ` Eric W. Biederman
  2007-05-04 15:58       ` [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" Andrew Morton
  2007-05-04 14:59     ` Jeremy Fitzhardinge
  1 sibling, 2 replies; 17+ messages in thread
From: Rusty Russell @ 2007-05-04 14:37 UTC (permalink / raw)
  To: Eric W. Biederman, Andrew Morton
  Cc: Andi Kleen, Chris Wright, Jeremy Fitzhardinge, Zachary Amsden,
	Andrew Morton, Linus Torvalds, H. Peter Anvin, linux-kernel

On Fri, 2007-05-04 at 08:13 -0600, Eric W. Biederman wrote:
> We don't have any working code, there are no in tree users.

Hi Eric,

	Lack of in-tree code is definitely not due to me.  The code which uses
it has been sitting in -mm for three months.  Suddenly ripping this out
and breaking all that work without replacing it is rude.

> Therefore the code should go.

Then put the patch is -mm, behind lguest.  Or better, help me refine my
patches and I'll happily do it for you.

Of course, I expect that Andrew is about to push my patches to Linus....
any day now... right Andrew?  Then we don't need this argument.

Rusty.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 14:13   ` Eric W. Biederman
  2007-05-04 14:37     ` Rusty Russell
@ 2007-05-04 14:59     ` Jeremy Fitzhardinge
  2007-05-04 15:20       ` Eric W. Biederman
  1 sibling, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2007-05-04 14:59 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Rusty Russell, Andi Kleen, Chris Wright, Zachary Amsden,
	Andrew Morton, Linus Torvalds, H. Peter Anvin, linux-kernel

Eric W. Biederman wrote:
> It's not theatrical.  It makes this code path extremely brittle and
> very hard to change, which over the long term means that it is
> impossible to maintain.  Quickly resulting in a state where any little
> change will break something.

Removing the code now is premature.  I'm presuming you're proposing this
as a .23 change, since we're planning on changing the paravirt boot to
make use of bzImage and the boot protocol then anyway.

The world isn't going to fall apart if its in there for a couple of months.

    J

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 14:37     ` Rusty Russell
@ 2007-05-04 15:07       ` Eric W. Biederman
  2007-05-05  1:22         ` Rusty Russell
  2007-05-04 15:58       ` [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" Andrew Morton
  1 sibling, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-04 15:07 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Andi Kleen, Chris Wright, Jeremy Fitzhardinge,
	Zachary Amsden, Linus Torvalds, H. Peter Anvin, linux-kernel

Rusty Russell <rusty@rustcorp.com.au> writes:

> On Fri, 2007-05-04 at 08:13 -0600, Eric W. Biederman wrote:
>> We don't have any working code, there are no in tree users.
>
> Hi Eric,
>
> 	Lack of in-tree code is definitely not due to me.  The code which uses
> it has been sitting in -mm for three months.  Suddenly ripping this out
> and breaking all that work without replacing it is rude.

My memory is very fuzzy now, but I know it at least came up early on
that everyone should be using %esi to point to real mode data and
that didn't happen.  None of the usual bootloader folks (myself
or HPA) appear to have been on the CC when this code was merged into
the kernel.  Ignoring review comments, skipping the people who know
the code and the bootloader requirements and putting in code that is
so brittle no one can change it is another kind of rude.

I'm wondering how much of the paravirt code is getting in through
reviewer exhaustion?

The interface for bootloaders is as hard to change as the user space
syscall interface.  I'm doing my best to nip this horror in the bud
now that I have looked into it enough to see what is going on.

The interface to bootloaders is quite different from the internal
kernel interfaces that we have all of the users captive so can change
it just by updating all of the users.  Except when we are first adding
the interface anyway.

The only reason this is even changeable is that we don't have any
users, and the code is dead weight.

So since I didn't get the opportunity to NAK the code.  I sent
a patch to revert once I understood it.

>> Therefore the code should go.
>
> Then put the patch is -mm, behind lguest.  Or better, help me refine my
> patches and I'll happily do it for you.

Before lguest.  Thank you very much.  This code should never ever
have been in a stable kernel.  It is a very ill conceived interface.

I have no problems helping you refine your patches, and am working
in that direction now.

I just think it is irresponsible to let something as potentially
problematic as this piece of code is to sit.

And frankly I don't think lguest should be merged until we are as
close to certain as human beings can get that have the ABI reviewed
and sorted out.  ABIs unfortunately are very very hard to change.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 14:59     ` Jeremy Fitzhardinge
@ 2007-05-04 15:20       ` Eric W. Biederman
  0 siblings, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-04 15:20 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Rusty Russell, Andi Kleen, Chris Wright, Zachary Amsden,
	Andrew Morton, Linus Torvalds, H. Peter Anvin, linux-kernel

Jeremy Fitzhardinge <jeremy@goop.org> writes:

> Eric W. Biederman wrote:
>> It's not theatrical.  It makes this code path extremely brittle and
>> very hard to change, which over the long term means that it is
>> impossible to maintain.  Quickly resulting in a state where any little
>> change will break something.
>
> Removing the code now is premature.  I'm presuming you're proposing this
> as a .23 change, since we're planning on changing the paravirt boot to
> make use of bzImage and the boot protocol then anyway.

.22 and possibly -stable.  -stable doesn't really matter because it is dead
code.

> The world isn't going to fall apart if its in there for a couple of months.

As long as the code remains dead and unused I don't really care when we purge it.

             That code should NEVER EVER be used.

               That code is UNMAINTAINABLE.

The code encourages random ABIs so we don't even know what we are supporting.
Once used in a production environment (i.e. where we have to support it)
it makes further changes to head.S essentially impossible.

Despite the lack of Documentation we not counting the paravirt mess in
head.S we have a well defined ABI for starting a linux-kernel.  Given the
generally difficulty of changing bootloaders I have no interesting is
messing up a good thing.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 14:37     ` Rusty Russell
  2007-05-04 15:07       ` Eric W. Biederman
@ 2007-05-04 15:58       ` Andrew Morton
  2007-05-05  1:55         ` Rusty Russell
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2007-05-04 15:58 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Eric W. Biederman, Andi Kleen, Chris Wright, Jeremy Fitzhardinge,
	Zachary Amsden, Linus Torvalds, H. Peter Anvin, linux-kernel

On Sat, 05 May 2007 00:37:30 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:

> Of course, I expect that Andrew is about to push my patches to Linus....
> any day now... right Andrew?  Then we don't need this argument.

It would be about a week off.  I'll resend the paches out for rereview.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 15:07       ` Eric W. Biederman
@ 2007-05-05  1:22         ` Rusty Russell
  2007-05-05  2:45           ` David Miller
  2007-05-05  2:53           ` Eric W. Biederman
  0 siblings, 2 replies; 17+ messages in thread
From: Rusty Russell @ 2007-05-05  1:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Andi Kleen, Chris Wright, Jeremy Fitzhardinge,
	Zachary Amsden, Linus Torvalds, H. Peter Anvin, linux-kernel

On Fri, 2007-05-04 at 09:07 -0600, Eric W. Biederman wrote:
> Rusty Russell <rusty@rustcorp.com.au> writes:
> 
> > On Fri, 2007-05-04 at 08:13 -0600, Eric W. Biederman wrote:
> >> We don't have any working code, there are no in tree users.
> >
> > Hi Eric,
> >
> > 	Lack of in-tree code is definitely not due to me.  The code which uses
> > it has been sitting in -mm for three months.  Suddenly ripping this out
> > and breaking all that work without replacing it is rude.
> 
> My memory is very fuzzy now, but I know it at least came up early on
> that everyone should be using %esi to point to real mode data and
> that didn't happen.

Hi Eric,

	Well, I certainly don't recall that (that's not to say that someone
didn't say it).  Trying to meet the requirements of Xen, VMI and other
future hypervisors lead to an awkward result; this is the main reason I
started on lguest, so we'd have a simple example in front of us to say
"do it this way".

	(It's not certain that anyone else will ever use this code, but we
should *try* IMHO).

> Before lguest.  Thank you very much.  This code should never ever
> have been in a stable kernel.  It is a very ill conceived interface.

I disagree.  It was *not* obvious how paravirt kernels should boot.
Lguest, for example, copied Xen's "set up kernel pagetables already"
design decision, which now seems wrong.  But it was the example we had.

> And frankly I don't think lguest should be merged until we are as
> close to certain as human beings can get that have the ABI reviewed
> and sorted out.  ABIs unfortunately are very very hard to change.

I think you misunderstand lguest.  I agree with this sentiment
completely: this is *why* lguest doesn't have an ABI.  It's all in-tree,
so it can simply be changed.  There's no guarantee that running
different kernels as guest and host will work.

Maybe later the ABI will nail down, but the last year of hacking on
various hypervisors has shown it's folly to try to get it right now.  We
need to play a lot first.

Hope that clarifies!
Rusty.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-04 15:58       ` [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" Andrew Morton
@ 2007-05-05  1:55         ` Rusty Russell
  0 siblings, 0 replies; 17+ messages in thread
From: Rusty Russell @ 2007-05-05  1:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Andi Kleen, Chris Wright, Jeremy Fitzhardinge,
	Zachary Amsden, Linus Torvalds, H. Peter Anvin, linux-kernel

On Fri, 2007-05-04 at 08:58 -0700, Andrew Morton wrote:
> On Sat, 05 May 2007 00:37:30 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
> 
> > Of course, I expect that Andrew is about to push my patches to Linus....
> > any day now... right Andrew?  Then we don't need this argument.
> 
> It would be about a week off.  I'll resend the paches out for rereview.

Unfortunately, it's becoming clear that we need to know what's going to
happen with the boot code first.  The details are both nasty and
important.

Hopefully this will all occur in the next few days so we can put
something small in w/o blocking lguest.  But if not, so be it: at least
we're making progress.

Sorry,
Rusty.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-05  1:22         ` Rusty Russell
@ 2007-05-05  2:45           ` David Miller
  2007-05-05  3:14             ` Eric W. Biederman
  2007-05-05  2:53           ` Eric W. Biederman
  1 sibling, 1 reply; 17+ messages in thread
From: David Miller @ 2007-05-05  2:45 UTC (permalink / raw)
  To: rusty; +Cc: ebiederm, akpm, ak, chrisw, jeremy, zach, torvalds, hpa,
	linux-kernel

From: Rusty Russell <rusty@rustcorp.com.au>
Date: Sat, 05 May 2007 11:22:48 +1000

> Hi Eric,

To all of those who don't speak "rusty", "Hi Soandso" means
"NAK".

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-05  1:22         ` Rusty Russell
  2007-05-05  2:45           ` David Miller
@ 2007-05-05  2:53           ` Eric W. Biederman
  2007-05-05  3:22             ` Rusty Russell
  1 sibling, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-05  2:53 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Andi Kleen, Chris Wright, Jeremy Fitzhardinge,
	Zachary Amsden, Linus Torvalds, H. Peter Anvin, linux-kernel

Rusty Russell <rusty@rustcorp.com.au> writes:

> Hi Eric,
>
> 	Well, I certainly don't recall that (that's not to say that someone
> didn't say it).  Trying to meet the requirements of Xen, VMI and other
> future hypervisors lead to an awkward result; this is the main reason I
> started on lguest, so we'd have a simple example in front of us to say
> "do it this way".
>
> 	(It's not certain that anyone else will ever use this code, but we
> should *try* IMHO).

Ok.  So the purpose of lguest is to be an experimental platform and
as much as possible be a good example.

>> Before lguest.  Thank you very much.  This code should never ever
>> have been in a stable kernel.  It is a very ill conceived interface.
>
> I disagree.  It was *not* obvious how paravirt kernels should boot.
> Lguest, for example, copied Xen's "set up kernel pagetables already"
> design decision, which now seems wrong.  But it was the example we had.

Makes some sense.  There was a bit of experience with booting kernels
without doing 16bit BIOS calls in nearly this way outside the paravirt
world though. 

>> And frankly I don't think lguest should be merged until we are as
>> close to certain as human beings can get that have the ABI reviewed
>> and sorted out.  ABIs unfortunately are very very hard to change.
>
> I think you misunderstand lguest.  I agree with this sentiment
> completely: this is *why* lguest doesn't have an ABI.  It's all in-tree,
> so it can simply be changed.  There's no guarantee that running
> different kernels as guest and host will work.

Reasonable.  Just so long as it is hidden under CONFIG_EXPERIMENTAL
And that people will know that mixing and matching is a problem.

Someday I might have to tell the story of the challenge of getting
the kexec on panic code path to all mixing and matching of kernels.

> Maybe later the ABI will nail down, but the last year of hacking on
> various hypervisors has shown it's folly to try to get it right now.  We
> need to play a lot first.
>
> Hope that clarifies!

For lguest at least.

The delicate part right now is that lguest is attempting to use the
standard kernel entry point which does have a fixed ABI.

If lguest uses that entry point in a hard to maintain way it provides
a bad example, and it potentially leads to other problems.  So I
really don't want to see the bad example happen, especially if the
code in the bad example is as general as it is today.

That said we have two practical short term solutions.
-  A separate entry point made advertised with an ELF note like
   Xen is currently doing.

-  We reserve a type field in the first 2K of the boot param area
   and if that type is lguest we immediately branch to your code.

Reserving the type field is a subset of what we are looking at for
the next rev of the boot protocol that will handle this clearly.

We have enough in flight work for cleaning up the booting of arch/i386
that I don't believe we can have a complete solution before the merge
window closes.  At best we can get a minor rev and reserve the fields
it looks like we will need in head.S

Frankly I think the least risk of problems comes from just doing a
separate entry point for lguest for now.  It means we don't even have
to touch the common code path and later dropping will be trivially 
lguest specific, and certain to not break anything else.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-05  2:45           ` David Miller
@ 2007-05-05  3:14             ` Eric W. Biederman
  2007-05-05  3:50               ` David Miller
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-05  3:14 UTC (permalink / raw)
  To: David Miller
  Cc: rusty, akpm, ak, chrisw, jeremy, zach, torvalds, hpa,
	linux-kernel

David Miller <davem@davemloft.net> writes:

> From: Rusty Russell <rusty@rustcorp.com.au>
> Date: Sat, 05 May 2007 11:22:48 +1000
>
>> Hi Eric,
>
> To all of those who don't speak "rusty", "Hi Soandso" means
> "NAK".

The question between Rusty and myself is not if we should remove
that code but when. 

If we are going to break an ABI I figure the sooner the better,
and the conversation between Rusty and myself has not been totally
unproductive.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-05  2:53           ` Eric W. Biederman
@ 2007-05-05  3:22             ` Rusty Russell
  2007-05-05 11:18               ` [PATCH] lguest: don't use paravirt_probe, it's dying Rusty Russell
  0 siblings, 1 reply; 17+ messages in thread
From: Rusty Russell @ 2007-05-05  3:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Andi Kleen, Chris Wright, Jeremy Fitzhardinge,
	Zachary Amsden, Linus Torvalds, H. Peter Anvin, linux-kernel

On Fri, 2007-05-04 at 20:53 -0600, Eric W. Biederman wrote:
> Rusty Russell <rusty@rustcorp.com.au> writes:
> The delicate part right now is that lguest is attempting to use the
> standard kernel entry point which does have a fixed ABI.
> 
> If lguest uses that entry point in a hard to maintain way it provides
> a bad example, and it potentially leads to other problems.  So I
> really don't want to see the bad example happen, especially if the
> code in the bad example is as general as it is today.

I completely agree, a bad example is worse than no example.  Plus, an
opportunity to have you and hpa hacking on lguest is not to be missed.

> Frankly I think the least risk of problems comes from just doing a
> separate entry point for lguest for now.  It means we don't even have
> to touch the common code path and later dropping will be trivially 
> lguest specific, and certain to not break anything else.

Hmm, I railed for so long against Xen doing that, it feels... wrong...
to do that now 8)

I think I'll need to hack in a magic signature before the lguest start:
it's the only way it'll work with unpacking bzImages as well.  And it'll
be trivial to rip out later when we have the Right Way.

I'll spin a patch this afternoon (got to go to puppy training now).

Thanks!
Rusty.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization"
  2007-05-05  3:14             ` Eric W. Biederman
@ 2007-05-05  3:50               ` David Miller
  0 siblings, 0 replies; 17+ messages in thread
From: David Miller @ 2007-05-05  3:50 UTC (permalink / raw)
  To: ebiederm; +Cc: rusty, akpm, ak, chrisw, jeremy, zach, torvalds, hpa,
	linux-kernel

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Fri, 04 May 2007 21:14:42 -0600

> David Miller <davem@davemloft.net> writes:
> 
> > From: Rusty Russell <rusty@rustcorp.com.au>
> > Date: Sat, 05 May 2007 11:22:48 +1000
> >
> >> Hi Eric,
> >
> > To all of those who don't speak "rusty", "Hi Soandso" means
> > "NAK".
> 
> The question between Rusty and myself is not if we should remove
> that code but when. 

I should have added a smiley face, sorry about that.

I was merely making a joke wrt. a recent blog entry of Rusty's:

	http://ozlabs.org/~rusty/index.cgi/tech/2007-05-04.html

:-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] lguest: don't use paravirt_probe, it's dying
  2007-05-05  3:22             ` Rusty Russell
@ 2007-05-05 11:18               ` Rusty Russell
  2007-05-07  1:53                 ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Rusty Russell @ 2007-05-05 11:18 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Andi Kleen, Chris Wright, Jeremy Fitzhardinge,
	Zachary Amsden, Linus Torvalds, H. Peter Anvin, linux-kernel

On Sat, 2007-05-05 at 13:22 +1000, Rusty Russell wrote:
> On Fri, 2007-05-04 at 20:53 -0600, Eric W. Biederman wrote:
> > Frankly I think the least risk of problems comes from just doing a
> > separate entry point for lguest for now.  It means we don't even have
> > to touch the common code path and later dropping will be trivially 
> > lguest specific, and certain to not break anything else.
> 
> Hmm, I railed for so long against Xen doing that, it feels... wrong...
> to do that now 8)

And here's the patch.  It's pretty trivial actually.  It even worked
first time.  It applies against -mm (ie. before the "boot with P=V"
patches I showed before).

Thanks!
Rusty.
==
paravirt_probe() and its infrastructure proved as popular as it was
pretty.  In anticipation of its imminent demise, this switches lguest to
use a magic string to identify a different entry point.

This is not the long-term solution: that will take a new bootloader rev
and will hopefully be done in the next few months.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r 9cfbf629088e Documentation/lguest/lguest.c
--- a/Documentation/lguest/lguest.c	Sat May 05 13:06:53 2007 +1000
+++ b/Documentation/lguest/lguest.c	Sat May 05 17:33:25 2007 +1000
@@ -96,6 +96,19 @@ static void *map_zeroed_pages(unsigned l
 	return (void *)addr;
 }
 
+/* Find magic string marking entry point, return entry point. */
+static unsigned long entry_point(void *start, void *end,
+				 unsigned long page_offset)
+{
+	void *p;
+
+	for (p = start; p < end; p++)
+		if (memcmp(p, "GenuineLguest", strlen("GenuineLguest")) == 0)
+			return (long)p + strlen("GenuineLguest") + page_offset;
+
+	err(1, "Is this image a genuine lguest?");
+}
+
 /* Returns the entry point */
 static unsigned long map_elf(int elf_fd, const Elf32_Ehdr *ehdr,
 			     unsigned long *page_offset)
@@ -103,6 +116,7 @@ static unsigned long map_elf(int elf_fd,
 	void *addr;
 	Elf32_Phdr phdr[ehdr->e_phnum];
 	unsigned int i;
+	unsigned long start = -1UL, end = 0;
 
 	/* Sanity checks. */
 	if (ehdr->e_type != ET_EXEC
@@ -131,6 +145,11 @@ static unsigned long map_elf(int elf_fd,
 			*page_offset = phdr[i].p_vaddr - phdr[i].p_paddr;
 		else if (*page_offset != phdr[i].p_vaddr - phdr[i].p_paddr)
 			errx(1, "Page offset of section %i different", i);
+
+		if (phdr[i].p_paddr < start)
+			start = phdr[i].p_paddr;
+		if (phdr[i].p_paddr + phdr[i].p_filesz > end)
+			end = phdr[i].p_paddr + phdr[i].p_filesz;
 
 		/* We map everything private, writable. */
 		addr = mmap((void *)phdr[i].p_paddr,
@@ -143,8 +162,7 @@ static unsigned long map_elf(int elf_fd,
 			    i, addr, (void *)phdr[i].p_paddr);
 	}
 
-	/* Entry is physical address: convert to virtual */
-	return ehdr->e_entry + *page_offset;
+	return entry_point((void *)start, (void *)end, *page_offset);
 }
 
 /* This is amazingly reliable. */
@@ -175,8 +193,7 @@ static unsigned long unpack_bzimage(int 
 	verbose("Unpacked size %i addr %p\n", len, img);
 	*page_offset = intuit_page_offset(img, len);
 
-	/* Entry is physical address: convert to virtual */
-	return (unsigned long)img + *page_offset;
+	return entry_point(img, img + len, *page_offset);
 }
 
 static unsigned long load_bzimage(int fd, unsigned long *page_offset)
diff -r 9cfbf629088e drivers/lguest/lguest.c
--- a/drivers/lguest/lguest.c	Sat May 05 13:06:53 2007 +1000
+++ b/drivers/lguest/lguest.c	Sat May 05 17:21:44 2007 +1000
@@ -43,7 +43,6 @@ extern const char lgstart_popf[], lgend_
 extern const char lgstart_popf[], lgend_popf[];
 extern const char lgstart_pushf[], lgend_pushf[];
 extern const char lgstart_iret[], lgend_iret[];
-extern asmlinkage void lguest_maybe_init(void);
 extern void lguest_iret(void);
 
 struct lguest_data lguest_data = {
@@ -497,4 +496,3 @@ __init void lguest_init(void)
 	pm_power_off = lguest_power_off;
 	start_kernel();
 }
-paravirt_probe(lguest_maybe_init);
diff -r 9cfbf629088e drivers/lguest/lguest_asm.S
--- a/drivers/lguest/lguest_asm.S	Sat May 05 13:06:53 2007 +1000
+++ b/drivers/lguest/lguest_asm.S	Sat May 05 17:21:38 2007 +1000
@@ -1,29 +1,24 @@
 #include <linux/linkage.h>
 #include <linux/lguest.h>
 #include <asm/asm-offsets.h>
+#include <asm/thread_info.h>
 
 /* FIXME: Once asm/processor-flags.h goes in, include that */
 #define X86_EFLAGS_IF 0x00000200
 
 /*
- * This is where we begin: head.S notes that paging is already enabled (which
- * doesn't happen in native boot) and calls the registered paravirt_probe
- * functions one at a time.  Ours is a simple assembler test for magic
- * registers.  If they're correct we jump to lguest_init.
+ * This is where we begin: we have a magic signature which the launcher looks
+ * for.  The plan is that the Linux boot protocol will be extended with a
+ * "platform type" field which will guide us here from the normal entry point,
+ * but for the moment this suffices.
  *
  * We put it in .init.text will be discarded after boot.
  */
 .section .init.text, "ax", @progbits
-ENTRY(lguest_maybe_init)
-	cmpl $LGUEST_MAGIC_EBP, %ebp
-	jne out
-	cmpl $LGUEST_MAGIC_EDI, %edi
-	jne out
-	cmpl $LGUEST_MAGIC_ESI, %esi
-	jne out
-	je lguest_init
-out:
-	ret
+.ascii "GenuineLguest"
+	/* Set up initial stack. */
+ 	movl $(init_thread_union+THREAD_SIZE),%esp
+	jmp lguest_init
 
 /* The templates for inline patching. */
 #define LGUEST_PATCH(name, insns...)			\





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] lguest: don't use paravirt_probe, it's dying
  2007-05-05 11:18               ` [PATCH] lguest: don't use paravirt_probe, it's dying Rusty Russell
@ 2007-05-07  1:53                 ` Eric W. Biederman
  0 siblings, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-05-07  1:53 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Andrew Morton, Andi Kleen, Chris Wright, Jeremy Fitzhardinge,
	Zachary Amsden, Linus Torvalds, H. Peter Anvin, linux-kernel

Rusty Russell <rusty@rustcorp.com.au> writes:

> On Sat, 2007-05-05 at 13:22 +1000, Rusty Russell wrote:
>> On Fri, 2007-05-04 at 20:53 -0600, Eric W. Biederman wrote:
>> > Frankly I think the least risk of problems comes from just doing a
>> > separate entry point for lguest for now.  It means we don't even have
>> > to touch the common code path and later dropping will be trivially 
>> > lguest specific, and certain to not break anything else.
>> 
>> Hmm, I railed for so long against Xen doing that, it feels... wrong...
>> to do that now 8)
>
> And here's the patch.  It's pretty trivial actually.  It even worked
> first time.  It applies against -mm (ie. before the "boot with P=V"
> patches I showed before).
>
> Thanks!
> Rusty.
> ==

Looks fine to me.  I'm not thrilled about grepping for a know ascii
string in the kernel binary, to find the lguest entry point.  But it
is simple, and it's temporary so it looks fine.

It clearly keeps us from having problems on the traditional x86 entry
point.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-05-07  1:54 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-04 11:56 [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" Eric W. Biederman
2007-05-04 12:12 ` Rusty Russell
2007-05-04 14:13   ` Eric W. Biederman
2007-05-04 14:37     ` Rusty Russell
2007-05-04 15:07       ` Eric W. Biederman
2007-05-05  1:22         ` Rusty Russell
2007-05-05  2:45           ` David Miller
2007-05-05  3:14             ` Eric W. Biederman
2007-05-05  3:50               ` David Miller
2007-05-05  2:53           ` Eric W. Biederman
2007-05-05  3:22             ` Rusty Russell
2007-05-05 11:18               ` [PATCH] lguest: don't use paravirt_probe, it's dying Rusty Russell
2007-05-07  1:53                 ` Eric W. Biederman
2007-05-04 15:58       ` [PATCH] Revert "[PATCH] paravirt: Add startup infrastructure for paravirtualization" Andrew Morton
2007-05-05  1:55         ` Rusty Russell
2007-05-04 14:59     ` Jeremy Fitzhardinge
2007-05-04 15:20       ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox