public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Regression: boot failure on AMD Elan TS-5500
@ 2008-06-16 12:11 Sean Young
  2008-06-16 13:27 ` Rusty Russell
  0 siblings, 1 reply; 32+ messages in thread
From: Sean Young @ 2008-06-16 12:11 UTC (permalink / raw)
  To: linux-kernel; +Cc: Rusty Russell

The symptons are either a crash or reboot on booting the kernel. No printk's
have occurred yet -- even with early printk on.

2.6.15 worked on this board however current does not. I've bisected it to:

	commit a24e785111a32ccb7cebafd24b1b1cb474ea8e5d
	Author: Rusty Russell <rusty@rustcorp.com.au>
	Date:   Sun Oct 21 16:41:35 2007 -0700

	    i386: paravirt boot sequence
	    
	    This patch uses the updated boot protocol to do paravirtualized boot.
	    If the boot version is >= 2.07, then it will do two things:
	    
	     1. Check the bootparams loadflags to see if we should reload the
		segment registers and clear interrupts.  This is appropriate
		for normal native boot and some paravirtualized environments, but
		inapproprate for others.
	    
	     2. Check the hardware architecture, and dispatch to the appropriate
		kernel entrypoint.  If the bootloader doesn't set this, then we
		simply do the normal boot sequence.

Unfortunately this patch can't be reverse-applied onto mainline, so I've
tried it on 2.6.24-rc1. 2.6.24-rc1 as-is fails, but with this patch
reverted it does work.

This is the board: 

http://www.embeddedarm.com/products/board-detail.php?product=TS-5500

Regards
Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 12:11 Regression: boot failure on AMD Elan TS-5500 Sean Young
@ 2008-06-16 13:27 ` Rusty Russell
  2008-06-16 13:37   ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 32+ messages in thread
From: Rusty Russell @ 2008-06-16 13:27 UTC (permalink / raw)
  To: Sean Young; +Cc: linux-kernel, Jeremy Fitzhardinge, H. Anvin

On Monday 16 June 2008 22:11:39 Sean Young wrote:
> The symptons are either a crash or reboot on booting the kernel. No
> printk's have occurred yet -- even with early printk on.
>
> 2.6.15 worked on this board however current does not. I've bisected it to:
>
> 	commit a24e785111a32ccb7cebafd24b1b1cb474ea8e5d
> 	Author: Rusty Russell <rusty@rustcorp.com.au>
> 	Date:   Sun Oct 21 16:41:35 2007 -0700
>
> 	    i386: paravirt boot sequence

Hi Sean,

   Thanks for tracking this down.  Can we try reverting this in pieces to see 
exactly what the cause was?

1) Revert arch/x86/boot/compressed/head_32.S
2) If that doesn't fix it, Try removing the 8 lines which were added to 
arch/x86/kernel/head_32.S

I assume this is CONFIG_PARAVIRT=n?

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 13:27 ` Rusty Russell
@ 2008-06-16 13:37   ` Jeremy Fitzhardinge
  2008-06-16 16:19     ` Sean Young
  0 siblings, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-16 13:37 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Sean Young, linux-kernel, H. Anvin

Rusty Russell wrote:
> On Monday 16 June 2008 22:11:39 Sean Young wrote:
>   
>> The symptons are either a crash or reboot on booting the kernel. No
>> printk's have occurred yet -- even with early printk on.
>>
>> 2.6.15 worked on this board however current does not. I've bisected it to:
>>
>> 	commit a24e785111a32ccb7cebafd24b1b1cb474ea8e5d
>> 	Author: Rusty Russell <rusty@rustcorp.com.au>
>> 	Date:   Sun Oct 21 16:41:35 2007 -0700
>>
>> 	    i386: paravirt boot sequence
>>     
>
> Hi Sean,
>
>    Thanks for tracking this down.  Can we try reverting this in pieces to see 
> exactly what the cause was?
>
> 1) Revert arch/x86/boot/compressed/head_32.S
> 2) If that doesn't fix it, Try removing the 8 lines which were added to 
> arch/x86/kernel/head_32.S
>   
And the arch/x86/boot/compressed/misc_32.c change too, just in case...

> I assume this is CONFIG_PARAVIRT=n?

What bootloader does this platform use?  I wonder if it's setting things 
up in an unexpected way?

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 13:37   ` Jeremy Fitzhardinge
@ 2008-06-16 16:19     ` Sean Young
  2008-06-16 17:06       ` H. Peter Anvin
                         ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Sean Young @ 2008-06-16 16:19 UTC (permalink / raw)
  To: Jeremy Fitzhardinge, Rusty Russell; +Cc: linux-kernel, H. Anvin

On Mon, Jun 16, 2008 at 06:37:12AM -0700, Jeremy Fitzhardinge wrote:
> Rusty Russell wrote:
> >On Monday 16 June 2008 22:11:39 Sean Young wrote:
> >  
> >>The symptons are either a crash or reboot on booting the kernel. No
> >>printk's have occurred yet -- even with early printk on.
> >>
> >>2.6.15 worked on this board however current does not. I've bisected it to:
> >>
> >>	commit a24e785111a32ccb7cebafd24b1b1cb474ea8e5d
> >>	Author: Rusty Russell <rusty@rustcorp.com.au>
> >>	Date:   Sun Oct 21 16:41:35 2007 -0700
> >>
> >>	    i386: paravirt boot sequence
> >>    
> >
> >Hi Sean,
> >
> >   Thanks for tracking this down.  Can we try reverting this in pieces to 
> >   see exactly what the cause was?
> >
> >1) Revert arch/x86/boot/compressed/head_32.S
> >2) If that doesn't fix it, Try removing the 8 lines which were added to 
> >arch/x86/kernel/head_32.S
> >  
> And the arch/x86/boot/compressed/misc_32.c change too, just in case...

Reverting only head_32.S makes the problem go away. I didn't try any other
changes.

> >I assume this is CONFIG_PARAVIRT=n?

Indeed.

> What bootloader does this platform use?  I wonder if it's setting things 
> up in an unexpected way?

It's syslinux. That must be the culprit, considering the change to head_32.S.

I'm digging in syslinux right now. 

Thanks
Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 16:19     ` Sean Young
@ 2008-06-16 17:06       ` H. Peter Anvin
  2008-06-16 17:58         ` Sean Young
  2008-06-16 17:06       ` H. Peter Anvin
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2008-06-16 17:06 UTC (permalink / raw)
  To: Sean Young; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

Sean Young wrote:
> 
>> What bootloader does this platform use?  I wonder if it's setting things 
>> up in an unexpected way?
> 
> It's syslinux. That must be the culprit, considering the change to head_32.S.
> 
> I'm digging in syslinux right now. 
> 

What makes you draw that conclusion?

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 16:19     ` Sean Young
  2008-06-16 17:06       ` H. Peter Anvin
@ 2008-06-16 17:06       ` H. Peter Anvin
  2008-06-16 17:07       ` Jeremy Fitzhardinge
  2008-06-30 21:43       ` Sean Young
  3 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2008-06-16 17:06 UTC (permalink / raw)
  To: Sean Young; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

Sean Young wrote:
> 
>> What bootloader does this platform use?  I wonder if it's setting things 
>> up in an unexpected way?
> 
> It's syslinux. That must be the culprit, considering the change to head_32.S.
> 
> I'm digging in syslinux right now. 
> 

Also, what version of syslinux?

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 16:19     ` Sean Young
  2008-06-16 17:06       ` H. Peter Anvin
  2008-06-16 17:06       ` H. Peter Anvin
@ 2008-06-16 17:07       ` Jeremy Fitzhardinge
  2008-06-30 21:43       ` Sean Young
  3 siblings, 0 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-16 17:07 UTC (permalink / raw)
  To: Sean Young; +Cc: Rusty Russell, linux-kernel, H. Anvin

Sean Young wrote:
> On Mon, Jun 16, 2008 at 06:37:12AM -0700, Jeremy Fitzhardinge wrote:
>   
>> Rusty Russell wrote:
>>     
>>> On Monday 16 June 2008 22:11:39 Sean Young wrote:
>>>  
>>>       
>>>> The symptons are either a crash or reboot on booting the kernel. No
>>>> printk's have occurred yet -- even with early printk on.
>>>>
>>>> 2.6.15 worked on this board however current does not. I've bisected it to:
>>>>
>>>> 	commit a24e785111a32ccb7cebafd24b1b1cb474ea8e5d
>>>> 	Author: Rusty Russell <rusty@rustcorp.com.au>
>>>> 	Date:   Sun Oct 21 16:41:35 2007 -0700
>>>>
>>>> 	    i386: paravirt boot sequence
>>>>    
>>>>         
>>> Hi Sean,
>>>
>>>   Thanks for tracking this down.  Can we try reverting this in pieces to 
>>>   see exactly what the cause was?
>>>
>>> 1) Revert arch/x86/boot/compressed/head_32.S
>>> 2) If that doesn't fix it, Try removing the 8 lines which were added to 
>>> arch/x86/kernel/head_32.S
>>>  
>>>       
>> And the arch/x86/boot/compressed/misc_32.c change too, just in case...
>>     
>
> Reverting only head_32.S makes the problem go away. I didn't try any other
> changes.
>
>   
>>> I assume this is CONFIG_PARAVIRT=n?
>>>       
>
> Indeed.
>
>   
>> What bootloader does this platform use?  I wonder if it's setting things 
>> up in an unexpected way?
>>     
>
> It's syslinux. That must be the culprit, considering the change to head_32.S.
>
> I'm digging in syslinux right now. 

Weird. That change has been in there a fairly long time now, and lots of 
people have successfully used syslinux to boot it in that time.  Must be 
some specific combination of factors on your platform.

I would guess the KEEP_SEGMENTS flag is being set for some reason,  but 
it's hard to imagine how, unless there's some memory corruption going on.

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 17:06       ` H. Peter Anvin
@ 2008-06-16 17:58         ` Sean Young
  2008-06-16 18:16           ` H. Peter Anvin
  0 siblings, 1 reply; 32+ messages in thread
From: Sean Young @ 2008-06-16 17:58 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

On Mon, Jun 16, 2008 at 10:06:30AM -0700, H. Peter Anvin wrote:
> Sean Young wrote:
> >
> >>What bootloader does this platform use?  I wonder if it's setting things 
> >>up in an unexpected way?
> >
> >It's syslinux. That must be the culprit, considering the change to 
> >head_32.S.
> >
> >I'm digging in syslinux right now. 
> >
> 
> What makes you draw that conclusion?

The thinking was that syslinux sets the boot parameters which are being read 
in head_32.S. I've used syslinux 3.53 but have just moved to 3.63.

That was the theory at least. I'm not sure what's going on now. I'll spend
some more time on this.

Thanks
Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 17:58         ` Sean Young
@ 2008-06-16 18:16           ` H. Peter Anvin
  0 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2008-06-16 18:16 UTC (permalink / raw)
  To: Sean Young; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

Sean Young wrote:
> 
> The thinking was that syslinux sets the boot parameters which are being read 
> in head_32.S. I've used syslinux 3.53 but have just moved to 3.63.
> 
> That was the theory at least. I'm not sure what's going on now. I'll spend
> some more time on this.
> 

Most of the boot parameters actually come from the real-mode boot code 
(arch/x86/boot).  A few of them come from the bootloader.

With a version of Syslinux that recent there shouldn't be any issues, 
especially since Syslinux is, for somewhat obvious reasons, my reference 
bootloader.  You may want to experiment with using the optional 
linux.c32 module just to see if it changes the behaviour:

boot: linux.c32 kernel options...

Or, in the configuration file:

label blah
	kernel linux.c32
	append kernel options...

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-16 16:19     ` Sean Young
                         ` (2 preceding siblings ...)
  2008-06-16 17:07       ` Jeremy Fitzhardinge
@ 2008-06-30 21:43       ` Sean Young
  2008-06-30 21:52         ` Jeremy Fitzhardinge
  3 siblings, 1 reply; 32+ messages in thread
From: Sean Young @ 2008-06-30 21:43 UTC (permalink / raw)
  To: Jeremy Fitzhardinge, Rusty Russell, linux-kernel, H. Anvin

On Mon, Jun 16, 2008 at 04:19:44PM +0000, Sean Young wrote:
> On Mon, Jun 16, 2008 at 06:37:12AM -0700, Jeremy Fitzhardinge wrote:
> > Rusty Russell wrote:
> > >On Monday 16 June 2008 22:11:39 Sean Young wrote:
> > >  
> > >>The symptons are either a crash or reboot on booting the kernel. No
> > >>printk's have occurred yet -- even with early printk on.
> > >>
> > >>2.6.15 worked on this board however current does not. I've bisected it to:
> > >>
> > >>	commit a24e785111a32ccb7cebafd24b1b1cb474ea8e5d
> > >>	Author: Rusty Russell <rusty@rustcorp.com.au>
> > >>	Date:   Sun Oct 21 16:41:35 2007 -0700
> > >>
> > >>	    i386: paravirt boot sequence
> > >>    
> > >
> > >Hi Sean,
> > >
> > >   Thanks for tracking this down.  Can we try reverting this in pieces to 
> > >   see exactly what the cause was?
> > >
> > >1) Revert arch/x86/boot/compressed/head_32.S
> > >2) If that doesn't fix it, Try removing the 8 lines which were added to 
> > >arch/x86/kernel/head_32.S
> > >  
> > And the arch/x86/boot/compressed/misc_32.c change too, just in case...
> 
> Reverting only head_32.S makes the problem go away. I didn't try any other
> changes.

Looking at the beginning of startup_32, it seems ds is used before it is set:

startup_32:
        cld
        /* test KEEP_SEGMENTS flag to see if the bootloader is asking
         * us to not reload segments */
        testb $(1<<6), BP_loadflags(%esi)
        jnz 1f

        cli
        movl $(__BOOT_DS),%eax
        movl %eax,%ds
        movl %eax,%es
        movl %eax,%fs
        movl %eax,%gs
        movl %eax,%ss
1:

Since the testb instruction is a dereference, ds is implicitly used. If
I move the testb to after "movl %eax,%ds" it seems to work (not that it
would make any sense there, but just to prove the point).

1) Am I barking up the wrong tree?

2) If I'm right I have no idea what the correct solution is; it seems that
   a chicken & egg issue is introduced.

Please advise. I am very new to all of this.


Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 21:43       ` Sean Young
@ 2008-06-30 21:52         ` Jeremy Fitzhardinge
  2008-06-30 22:04           ` H. Peter Anvin
  2008-06-30 22:09           ` Sean Young
  0 siblings, 2 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-30 21:52 UTC (permalink / raw)
  To: Sean Young; +Cc: Rusty Russell, linux-kernel, H. Anvin

Sean Young wrote:
> On Mon, Jun 16, 2008 at 04:19:44PM +0000, Sean Young wrote:
>   
>> On Mon, Jun 16, 2008 at 06:37:12AM -0700, Jeremy Fitzhardinge wrote:
>>     
>>> Rusty Russell wrote:
>>>       
>>>> On Monday 16 June 2008 22:11:39 Sean Young wrote:
>>>>  
>>>>         
>>>>> The symptons are either a crash or reboot on booting the kernel. No
>>>>> printk's have occurred yet -- even with early printk on.
>>>>>
>>>>> 2.6.15 worked on this board however current does not. I've bisected it to:
>>>>>
>>>>> 	commit a24e785111a32ccb7cebafd24b1b1cb474ea8e5d
>>>>> 	Author: Rusty Russell <rusty@rustcorp.com.au>
>>>>> 	Date:   Sun Oct 21 16:41:35 2007 -0700
>>>>>
>>>>> 	    i386: paravirt boot sequence
>>>>>    
>>>>>           
>>>> Hi Sean,
>>>>
>>>>   Thanks for tracking this down.  Can we try reverting this in pieces to 
>>>>   see exactly what the cause was?
>>>>
>>>> 1) Revert arch/x86/boot/compressed/head_32.S
>>>> 2) If that doesn't fix it, Try removing the 8 lines which were added to 
>>>> arch/x86/kernel/head_32.S
>>>>  
>>>>         
>>> And the arch/x86/boot/compressed/misc_32.c change too, just in case...
>>>       
>> Reverting only head_32.S makes the problem go away. I didn't try any other
>> changes.
>>     
>
> Looking at the beginning of startup_32, it seems ds is used before it is set:
>
> startup_32:
>         cld
>         /* test KEEP_SEGMENTS flag to see if the bootloader is asking
>          * us to not reload segments */
>         testb $(1<<6), BP_loadflags(%esi)
>         jnz 1f
>
>         cli
>         movl $(__BOOT_DS),%eax
>         movl %eax,%ds
>         movl %eax,%es
>         movl %eax,%fs
>         movl %eax,%gs
>         movl %eax,%ss
> 1:
>
> Since the testb instruction is a dereference, ds is implicitly used. If
> I move the testb to after "movl %eax,%ds" it seems to work (not that it
> would make any sense there, but just to prove the point).
>
> 1) Am I barking up the wrong tree?
>
> 2) If I'm right I have no idea what the correct solution is; it seems that
>    a chicken & egg issue is introduced.
>
> Please advise. I am very new to all of this.

It's a bit odd that the boot loader neglected to set up ds properly, but 
changing the testb line to

	testb $(1<<6), %cs:BP_loadflags(%esi)

should work.  (Or perhaps a %ss: override would be better?)

I'm assuming that the GDT setup isn't completely mad and that the 
segments have the same base at least.

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 21:52         ` Jeremy Fitzhardinge
@ 2008-06-30 22:04           ` H. Peter Anvin
  2008-06-30 22:14             ` H. Peter Anvin
                               ` (2 more replies)
  2008-06-30 22:09           ` Sean Young
  1 sibling, 3 replies; 32+ messages in thread
From: H. Peter Anvin @ 2008-06-30 22:04 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Sean Young, Rusty Russell, linux-kernel

Jeremy Fitzhardinge wrote:
>>
>> Looking at the beginning of startup_32, it seems ds is used before it 
>> is set:
>>
>> startup_32:
>>         cld
>>         /* test KEEP_SEGMENTS flag to see if the bootloader is asking
>>          * us to not reload segments */
>>         testb $(1<<6), BP_loadflags(%esi)
>>         jnz 1f
>>
>>         cli
>>         movl $(__BOOT_DS),%eax
>>         movl %eax,%ds
>>         movl %eax,%es
>>         movl %eax,%fs
>>         movl %eax,%gs
>>         movl %eax,%ss
>> 1:
>>
>> Since the testb instruction is a dereference, ds is implicitly used. If
>> I move the testb to after "movl %eax,%ds" it seems to work (not that it
>> would make any sense there, but just to prove the point).
>>
>> 1) Am I barking up the wrong tree?
>>
>> 2) If I'm right I have no idea what the correct solution is; it seems 
>> that
>>    a chicken & egg issue is introduced.
>>
>> Please advise. I am very new to all of this.
> 
> It's a bit odd that the boot loader neglected to set up ds properly, but 
> changing the testb line to
> 
>     testb $(1<<6), %cs:BP_loadflags(%esi)
> 
> should work.  (Or perhaps a %ss: override would be better?)
> 
> I'm assuming that the GDT setup isn't completely mad and that the 
> segments have the same base at least.
> 

This should have been set up by the *boot code* (specifically lines 
57-61 of arch/x86/boot/pmjump.S) since he's using a conventional boot 
loader (syslinux) so something is utterly fuggled up.  Using %cs: here 
should be safe, though (and *is* more conservative, after all, why 
otherwise bother reloading these segments at all?), but it still 
concerns me a great deal if this is broken in this way.  It's definitely 
better than %ss:.

In particular, I'm wondering if the Elan CPU has any strange ordering 
requirements with regards to the protected mode transition that we're 
not obeying.

It would be interesting to put in a heavyweight "brutally synchronizing" 
instruction like WBINVD at various places in arch/x86/boot/pmjump.S and 
see if it helps.

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 21:52         ` Jeremy Fitzhardinge
  2008-06-30 22:04           ` H. Peter Anvin
@ 2008-06-30 22:09           ` Sean Young
  2008-07-06  2:41             ` Eric W. Biederman
  1 sibling, 1 reply; 32+ messages in thread
From: Sean Young @ 2008-06-30 22:09 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Rusty Russell, linux-kernel, H. Anvin

On Mon, Jun 30, 2008 at 02:52:10PM -0700, Jeremy Fitzhardinge wrote:
> It's a bit odd that the boot loader neglected to set up ds properly, but 
> changing the testb line to
> 
> 	testb $(1<<6), %cs:BP_loadflags(%esi)
> 
> should work.  (Or perhaps a %ss: override would be better?)

Yup, it does (%cs: prefix).

> I'm assuming that the GDT setup isn't completely mad and that the 
> segments have the same base at least.

Where do I go from here?

Thanks
Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 22:04           ` H. Peter Anvin
@ 2008-06-30 22:14             ` H. Peter Anvin
  2008-06-30 22:36               ` Jeremy Fitzhardinge
  2008-06-30 22:31             ` Sean Young
  2008-06-30 22:34             ` Jeremy Fitzhardinge
  2 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2008-06-30 22:14 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Sean Young, Rusty Russell, linux-kernel

H. Peter Anvin wrote:
>>>
>>> startup_32:
>>>         cld
>>>         /* test KEEP_SEGMENTS flag to see if the bootloader is asking
>>>          * us to not reload segments */
>>>         testb $(1<<6), BP_loadflags(%esi)
>>>         jnz 1f
>>>
>>>         cli
>>>         movl $(__BOOT_DS),%eax
>>>         movl %eax,%ds
>>>         movl %eax,%es
>>>         movl %eax,%fs
>>>         movl %eax,%gs
>>>         movl %eax,%ss
>>> 1:

On this general subject... I keep thinking that it would be better to 
have this as:

	movl	%cs, %eax
	addl	$8, %eax
	movl	%eax, %cs

... instead of a hard-coded constant.  That actually removes all 
hard-coded uses of BOOT_CS/BOOT_DS until we eventually load the kernel's 
own boot GDT at head_32.S:94.

Does anyone see any problem with that?  As far as I can tell, we're 
requiring %cs == BOOT_CS for the current code anyway (unless 
KEEP_SEGMENTS), but %ds == %cs + 8 seems like a more sensible requirement.

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 22:04           ` H. Peter Anvin
  2008-06-30 22:14             ` H. Peter Anvin
@ 2008-06-30 22:31             ` Sean Young
  2008-06-30 22:34             ` Jeremy Fitzhardinge
  2 siblings, 0 replies; 32+ messages in thread
From: Sean Young @ 2008-06-30 22:31 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

On Mon, Jun 30, 2008 at 03:04:33PM -0700, H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
> >It's a bit odd that the boot loader neglected to set up ds properly, but 
> >changing the testb line to
> >
> >    testb $(1<<6), %cs:BP_loadflags(%esi)
> >
> >should work.  (Or perhaps a %ss: override would be better?)
> >
> >I'm assuming that the GDT setup isn't completely mad and that the 
> >segments have the same base at least.
> >
> 
> This should have been set up by the *boot code* (specifically lines 
> 57-61 of arch/x86/boot/pmjump.S) since he's using a conventional boot 
> loader (syslinux) so something is utterly fuggled up.  Using %cs: here 
> should be safe, though (and *is* more conservative, after all, why 
> otherwise bother reloading these segments at all?), but it still 
> concerns me a great deal if this is broken in this way.  It's definitely 
> better than %ss:.
> 
> In particular, I'm wondering if the Elan CPU has any strange ordering 
> requirements with regards to the protected mode transition that we're 
> not obeying.
> 
> It would be interesting to put in a heavyweight "brutally synchronizing" 
> instruction like WBINVD at various places in arch/x86/boot/pmjump.S and 
> see if it helps.

You are right. With wbinvd before %ds is changed in that file the kernel
boots. 


Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 22:04           ` H. Peter Anvin
  2008-06-30 22:14             ` H. Peter Anvin
  2008-06-30 22:31             ` Sean Young
@ 2008-06-30 22:34             ` Jeremy Fitzhardinge
  2008-06-30 22:42               ` H. Peter Anvin
  2 siblings, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-30 22:34 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Sean Young, Rusty Russell, linux-kernel

H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>>>
>>> Looking at the beginning of startup_32, it seems ds is used before 
>>> it is set:
>>>
>>> startup_32:
>>>         cld
>>>         /* test KEEP_SEGMENTS flag to see if the bootloader is asking
>>>          * us to not reload segments */
>>>         testb $(1<<6), BP_loadflags(%esi)
>>>         jnz 1f
>>>
>>>         cli
>>>         movl $(__BOOT_DS),%eax
>>>         movl %eax,%ds
>>>         movl %eax,%es
>>>         movl %eax,%fs
>>>         movl %eax,%gs
>>>         movl %eax,%ss
>>> 1:
>>>
>>> Since the testb instruction is a dereference, ds is implicitly used. If
>>> I move the testb to after "movl %eax,%ds" it seems to work (not that it
>>> would make any sense there, but just to prove the point).
>>>
>>> 1) Am I barking up the wrong tree?
>>>
>>> 2) If I'm right I have no idea what the correct solution is; it 
>>> seems that
>>>    a chicken & egg issue is introduced.
>>>
>>> Please advise. I am very new to all of this.
>>
>> It's a bit odd that the boot loader neglected to set up ds properly, 
>> but changing the testb line to
>>
>>     testb $(1<<6), %cs:BP_loadflags(%esi)
>>
>> should work.  (Or perhaps a %ss: override would be better?)
>>
>> I'm assuming that the GDT setup isn't completely mad and that the 
>> segments have the same base at least.
>>
>
> This should have been set up by the *boot code* (specifically lines 
> 57-61 of arch/x86/boot/pmjump.S) since he's using a conventional boot 
> loader (syslinux) so something is utterly fuggled up.

Hm, yeah.

> Using %cs: here should be safe, though (and *is* more conservative, 
> after all, why otherwise bother reloading these segments at all?), but 
> it still concerns me a great deal if this is broken in this way.  It's 
> definitely better than %ss:.
>
> In particular, I'm wondering if the Elan CPU has any strange ordering 
> requirements with regards to the protected mode transition that we're 
> not obeying.

Maybe it really does require the far jump immediately after setting PE 
in cr0...

Hm, I don't remember this paragraph being in vol 3a, section 8.9.1 
before.  Is it a recent addition?

    Random failures can occur if other instructions exist between steps
    3 and 4 above.  Failures will be readily seen in some situations,
    such as when instructions that reference memory are inserted between
    steps 3 and 4 while in system management mode.



    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 22:14             ` H. Peter Anvin
@ 2008-06-30 22:36               ` Jeremy Fitzhardinge
  2008-06-30 22:45                 ` H. Peter Anvin
  0 siblings, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2008-06-30 22:36 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Sean Young, Rusty Russell, linux-kernel

H. Peter Anvin wrote:
> H. Peter Anvin wrote:
>>>>
>>>> startup_32:
>>>>         cld
>>>>         /* test KEEP_SEGMENTS flag to see if the bootloader is asking
>>>>          * us to not reload segments */
>>>>         testb $(1<<6), BP_loadflags(%esi)
>>>>         jnz 1f
>>>>
>>>>         cli
>>>>         movl $(__BOOT_DS),%eax
>>>>         movl %eax,%ds
>>>>         movl %eax,%es
>>>>         movl %eax,%fs
>>>>         movl %eax,%gs
>>>>         movl %eax,%ss
>>>> 1:
>
> On this general subject... I keep thinking that it would be better to 
> have this as:
>
>     movl    %cs, %eax
>     addl    $8, %eax
>     movl    %eax, %cs

What's your intent?  You can't directly load %cs; do you mean the other 
segment registers?

>
> ... instead of a hard-coded constant.  That actually removes all 
> hard-coded uses of BOOT_CS/BOOT_DS until we eventually load the 
> kernel's own boot GDT at head_32.S:94.
>
> Does anyone see any problem with that?  As far as I can tell, we're 
> requiring %cs == BOOT_CS for the current code anyway (unless 
> KEEP_SEGMENTS), but %ds == %cs + 8 seems like a more sensible 
> requirement.
>

Oh, you mean

	mov %cs, %eax
	add $8, %eax
	mov %eax, %ds (etc)
?

Seems reasonable.

	J


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 22:34             ` Jeremy Fitzhardinge
@ 2008-06-30 22:42               ` H. Peter Anvin
  2008-07-01 19:59                 ` Sean Young
  0 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2008-06-30 22:42 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Sean Young, Rusty Russell, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 582 bytes --]

Jeremy Fitzhardinge wrote:
> 
> Maybe it really does require the far jump immediately after setting PE 
> in cr0...
> 
> Hm, I don't remember this paragraph being in vol 3a, section 8.9.1 
> before.  Is it a recent addition?
> 
>    Random failures can occur if other instructions exist between steps
>    3 and 4 above.  Failures will be readily seen in some situations,
>    such as when instructions that reference memory are inserted between
>    steps 3 and 4 while in system management mode.
> 

I don't remember that, either.

Sean: could you try the following patch?

	-hpa

[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 571 bytes --]

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index ab049d4..141b6e2 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump.S
@@ -33,6 +33,8 @@ protected_mode_jump:
 	movw	%cs, %bx
 	shll	$4, %ebx
 	addl	%ebx, 2f
+	jmp	1f			# Short jump to serialize on 386/486
+1:
 
 	movw	$__BOOT_DS, %cx
 	movw	$__BOOT_TSS, %di
@@ -40,8 +42,6 @@ protected_mode_jump:
 	movl	%cr0, %edx
 	orb	$X86_CR0_PE, %dl	# Protected mode
 	movl	%edx, %cr0
-	jmp	1f			# Short jump to serialize on 386/486
-1:
 
 	# Transition to 32-bit mode
 	.byte	0x66, 0xea		# ljmpl opcode

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 22:36               ` Jeremy Fitzhardinge
@ 2008-06-30 22:45                 ` H. Peter Anvin
  0 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2008-06-30 22:45 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Sean Young, Rusty Russell, linux-kernel

Jeremy Fitzhardinge wrote:
>>
>> On this general subject... I keep thinking that it would be better to 
>> have this as:
>>
>>     movl    %cs, %eax
>>     addl    $8, %eax
>>     movl    %eax, %cs
> 
> What's your intent?  You can't directly load %cs; do you mean the other 
> segment registers?
> 
> 
> Oh, you mean
> 
>     mov %cs, %eax
>     add $8, %eax
>     mov %eax, %ds (etc)
> ?
> 
> Seems reasonable.
> 

Uhm, yeah ;)

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 22:42               ` H. Peter Anvin
@ 2008-07-01 19:59                 ` Sean Young
  2008-07-01 20:20                   ` Sean Young
                                     ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Sean Young @ 2008-07-01 19:59 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

On Mon, Jun 30, 2008 at 03:42:47PM -0700, H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
> >
> >Maybe it really does require the far jump immediately after setting PE 
> >in cr0...
> >
> >Hm, I don't remember this paragraph being in vol 3a, section 8.9.1 
> >before.  Is it a recent addition?
> >
> >   Random failures can occur if other instructions exist between steps
> >   3 and 4 above.  Failures will be readily seen in some situations,
> >   such as when instructions that reference memory are inserted between
> >   steps 3 and 4 while in system management mode.
> >
> 
> I don't remember that, either.

Which document are we talking about? 

> Sean: could you try the following patch?
> 
> 	-hpa

> diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
> index ab049d4..141b6e2 100644
> --- a/arch/x86/boot/pmjump.S
> +++ b/arch/x86/boot/pmjump.S
> @@ -33,6 +33,8 @@ protected_mode_jump:
>  	movw	%cs, %bx
>  	shll	$4, %ebx
>  	addl	%ebx, 2f
> +	jmp	1f			# Short jump to serialize on 386/486
> +1:
>  
>  	movw	$__BOOT_DS, %cx
>  	movw	$__BOOT_TSS, %di
> @@ -40,8 +42,6 @@ protected_mode_jump:
>  	movl	%cr0, %edx
>  	orb	$X86_CR0_PE, %dl	# Protected mode
>  	movl	%edx, %cr0
> -	jmp	1f			# Short jump to serialize on 386/486
> -1:
>  
>  	# Transition to 32-bit mode
>  	.byte	0x66, 0xea		# ljmpl opcode

I'm afraid it doesn't work. Maybe I can find something in the AMD Elan
documentation. Would a fence make sense?

Thanks
Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 19:59                 ` Sean Young
@ 2008-07-01 20:20                   ` Sean Young
  2008-07-01 20:24                     ` Jeremy Fitzhardinge
                                       ` (2 more replies)
  2008-07-01 20:22                   ` Jeremy Fitzhardinge
  2008-07-01 20:23                   ` H. Peter Anvin
  2 siblings, 3 replies; 32+ messages in thread
From: Sean Young @ 2008-07-01 20:20 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

On Tue, Jul 01, 2008 at 07:59:42PM +0000, Sean Young wrote:
> On Mon, Jun 30, 2008 at 03:42:47PM -0700, H. Peter Anvin wrote:
> > diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
> > index ab049d4..141b6e2 100644
> > --- a/arch/x86/boot/pmjump.S
> > +++ b/arch/x86/boot/pmjump.S
> > @@ -33,6 +33,8 @@ protected_mode_jump:
> >  	movw	%cs, %bx
> >  	shll	$4, %ebx
> >  	addl	%ebx, 2f
> > +	jmp	1f			# Short jump to serialize on 386/486
> > +1:
> >  
> >  	movw	$__BOOT_DS, %cx
> >  	movw	$__BOOT_TSS, %di
> > @@ -40,8 +42,6 @@ protected_mode_jump:
> >  	movl	%cr0, %edx
> >  	orb	$X86_CR0_PE, %dl	# Protected mode
> >  	movl	%edx, %cr0
> > -	jmp	1f			# Short jump to serialize on 386/486
> > -1:
> >  
> >  	# Transition to 32-bit mode
> >  	.byte	0x66, 0xea		# ljmpl opcode
> 
> I'm afraid it doesn't work. Maybe I can find something in the AMD Elan
> documentation. Would a fence make sense?

I was just trying to confirm this with postive and negative tests which 
is failing. I'm probably being an idiot, please let systematically test
this.

This does not seem like the heisenbug category, or is it?


Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 19:59                 ` Sean Young
  2008-07-01 20:20                   ` Sean Young
@ 2008-07-01 20:22                   ` Jeremy Fitzhardinge
  2008-07-01 20:23                   ` H. Peter Anvin
  2 siblings, 0 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-01 20:22 UTC (permalink / raw)
  To: Sean Young; +Cc: H. Peter Anvin, Rusty Russell, linux-kernel

Sean Young wrote:
> On Mon, Jun 30, 2008 at 03:42:47PM -0700, H. Peter Anvin wrote:
>   
>> Jeremy Fitzhardinge wrote:
>>     
>>> Maybe it really does require the far jump immediately after setting PE 
>>> in cr0...
>>>
>>> Hm, I don't remember this paragraph being in vol 3a, section 8.9.1 
>>> before.  Is it a recent addition?
>>>
>>>   Random failures can occur if other instructions exist between steps
>>>   3 and 4 above.  Failures will be readily seen in some situations,
>>>   such as when instructions that reference memory are inserted between
>>>   steps 3 and 4 while in system management mode.
>>>
>>>       
>> I don't remember that, either.
>>     
>
> Which document are we talking about? 
>   

*The* document ;)
http://www.intel.com/products/processor/manuals/ specifically, Volume 
3a: System Programming Guide, Part 1. 

Section 8.9.1 describes the steps needed to turn on protected mode 
correctly.  It says that you need to do a far jump or call immediately 
after turning on protected mode.  Linux has not done it immediately, and 
there has been a school of thought that this advice is a workaround for 
some obsolete CPU, and is not something we have to worry about now.

However, the paragraph I quoted was added since the previous release of 
the manual, and so presumably documents a current concern.  
Specifically, the mention of SMM is interesting, because I gather that 
embedded-class processors like the Elan are very SMM-dependent.

>> Sean: could you try the following patch?
>>
>> 	-hpa
>>     
>
>   
>> diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
>> index ab049d4..141b6e2 100644
>> --- a/arch/x86/boot/pmjump.S
>> +++ b/arch/x86/boot/pmjump.S
>> @@ -33,6 +33,8 @@ protected_mode_jump:
>>  	movw	%cs, %bx
>>  	shll	$4, %ebx
>>  	addl	%ebx, 2f
>> +	jmp	1f			# Short jump to serialize on 386/486
>> +1:
>>  
>>  	movw	$__BOOT_DS, %cx
>>  	movw	$__BOOT_TSS, %di
>> @@ -40,8 +42,6 @@ protected_mode_jump:
>>  	movl	%cr0, %edx
>>  	orb	$X86_CR0_PE, %dl	# Protected mode
>>  	movl	%edx, %cr0
>> -	jmp	1f			# Short jump to serialize on 386/486
>> -1:
>>  
>>  	# Transition to 32-bit mode
>>  	.byte	0x66, 0xea		# ljmpl opcode
>>     
>
> I'm afraid it doesn't work. Maybe I can find something in the AMD Elan
> documentation. Would a fence make sense?
>   

Not really, but if it fixes the bug it won't hurt anyone else (unless 
older processors treat it as an illegal instruction).

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 19:59                 ` Sean Young
  2008-07-01 20:20                   ` Sean Young
  2008-07-01 20:22                   ` Jeremy Fitzhardinge
@ 2008-07-01 20:23                   ` H. Peter Anvin
  2008-07-01 20:29                     ` Jeremy Fitzhardinge
  2 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2008-07-01 20:23 UTC (permalink / raw)
  To: Sean Young; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

Sean Young wrote:
> On Mon, Jun 30, 2008 at 03:42:47PM -0700, H. Peter Anvin wrote:
>> Jeremy Fitzhardinge wrote:
>>> Maybe it really does require the far jump immediately after setting PE 
>>> in cr0...
>>>
>>> Hm, I don't remember this paragraph being in vol 3a, section 8.9.1 
>>> before.  Is it a recent addition?
>>>
>>>   Random failures can occur if other instructions exist between steps
>>>   3 and 4 above.  Failures will be readily seen in some situations,
>>>   such as when instructions that reference memory are inserted between
>>>   steps 3 and 4 while in system management mode.
>>>
>> I don't remember that, either.
> 
> Which document are we talking about? 
> 

Intel's Software Developer's Manual, volume 3a.

> I'm afraid it doesn't work. Maybe I can find something in the AMD Elan
> documentation. Would a fence make sense?

Not really.  In particular, I really don't want to add something that is 
even remotely likely to break on other platforms, and a fence 
instruction is relatively recent.

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 20:20                   ` Sean Young
@ 2008-07-01 20:24                     ` Jeremy Fitzhardinge
  2008-07-01 20:25                     ` H. Peter Anvin
  2008-07-15 22:23                     ` Sean Young
  2 siblings, 0 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-01 20:24 UTC (permalink / raw)
  To: Sean Young; +Cc: H. Peter Anvin, Rusty Russell, linux-kernel

Sean Young wrote:
> I was just trying to confirm this with postive and negative tests which 
> is failing. I'm probably being an idiot, please let systematically test
> this.
>
> This does not seem like the heisenbug category, or is it?
>   

Possibly.  Does the platform use system-management mode?  If it does, it 
may depend on whether you get an SMM interrupt in a critical window.

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 20:20                   ` Sean Young
  2008-07-01 20:24                     ` Jeremy Fitzhardinge
@ 2008-07-01 20:25                     ` H. Peter Anvin
  2008-07-15 22:23                     ` Sean Young
  2 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2008-07-01 20:25 UTC (permalink / raw)
  To: Sean Young; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel

Sean Young wrote:
> 
> I was just trying to confirm this with postive and negative tests which 
> is failing. I'm probably being an idiot, please let systematically test
> this.
> 
> This does not seem like the heisenbug category, or is it?
> 

It could easily be something timing-dependent, which might affect only X 
boots out of Y.  Please keep us posted on your testing progress.

It would also be very interesting if you put, say, a WBINVD at the point 
  of label 1:

	-hpa


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 20:23                   ` H. Peter Anvin
@ 2008-07-01 20:29                     ` Jeremy Fitzhardinge
  2008-07-01 20:39                       ` H. Peter Anvin
  2008-07-06 13:21                       ` Pavel Machek
  0 siblings, 2 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2008-07-01 20:29 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Sean Young, Rusty Russell, linux-kernel

H. Peter Anvin wrote:
> Not really.  In particular, I really don't want to add something that 
> is even remotely likely to break on other platforms, and a fence 
> instruction is relatively recent.

I wonder if a cpuid would do the trick?  Do we still support cpuid-less 
processors?

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 20:29                     ` Jeremy Fitzhardinge
@ 2008-07-01 20:39                       ` H. Peter Anvin
  2008-07-06 13:21                       ` Pavel Machek
  1 sibling, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2008-07-01 20:39 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Sean Young, Rusty Russell, linux-kernel

Jeremy Fitzhardinge wrote:
> H. Peter Anvin wrote:
>> Not really.  In particular, I really don't want to add something that 
>> is even remotely likely to break on other platforms, and a fence 
>> instruction is relatively recent.
> 
> I wonder if a cpuid would do the trick?  Do we still support cpuid-less 
> processors?

Yes, we do.  Worst case, we could bypass it, but I'd like to understand 
what the real constraint is, here.

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-06-30 22:09           ` Sean Young
@ 2008-07-06  2:41             ` Eric W. Biederman
  2008-07-06  3:07               ` H. Peter Anvin
  0 siblings, 1 reply; 32+ messages in thread
From: Eric W. Biederman @ 2008-07-06  2:41 UTC (permalink / raw)
  To: Sean Young; +Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel, H. Anvin

Sean Young <sean@mess.org> writes:

> On Mon, Jun 30, 2008 at 02:52:10PM -0700, Jeremy Fitzhardinge wrote:
>> It's a bit odd that the boot loader neglected to set up ds properly, but 
>> changing the testb line to
>> 
>> 	testb $(1<<6), %cs:BP_loadflags(%esi)
>> 
>> should work.  (Or perhaps a %ss: override would be better?)
>
> Yup, it does (%cs: prefix).

Guys I have a fuzzy memory of reads and definitely writes through %cs not working,
on some processor models.  i386?

If that hadn't been the case I would have used that form more to remove the
dependencies before we reached lgdt in haed.S when I was in there years ago.

Eric

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-06  2:41             ` Eric W. Biederman
@ 2008-07-06  3:07               ` H. Peter Anvin
  0 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2008-07-06  3:07 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Sean Young, Jeremy Fitzhardinge, Rusty Russell, linux-kernel

Eric W. Biederman wrote:
> Sean Young <sean@mess.org> writes:
> 
>> On Mon, Jun 30, 2008 at 02:52:10PM -0700, Jeremy Fitzhardinge wrote:
>>> It's a bit odd that the boot loader neglected to set up ds properly, but 
>>> changing the testb line to
>>>
>>> 	testb $(1<<6), %cs:BP_loadflags(%esi)
>>>
>>> should work.  (Or perhaps a %ss: override would be better?)
>> Yup, it does (%cs: prefix).
> 
> Guys I have a fuzzy memory of reads and definitely writes through %cs not working,
> on some processor models.  i386?
> 

You can never write through %cs in protected mode.

You can read through %cs: in protected mode unless %cs contains an 
execute-only descriptor.

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 20:29                     ` Jeremy Fitzhardinge
  2008-07-01 20:39                       ` H. Peter Anvin
@ 2008-07-06 13:21                       ` Pavel Machek
  1 sibling, 0 replies; 32+ messages in thread
From: Pavel Machek @ 2008-07-06 13:21 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: H. Peter Anvin, Sean Young, Rusty Russell, linux-kernel

On Tue 2008-07-01 13:29:01, Jeremy Fitzhardinge wrote:
> H. Peter Anvin wrote:
>> Not really.  In particular, I really don't want to add something that is 
>> even remotely likely to break on other platforms, and a fence instruction 
>> is relatively recent.
>
> I wonder if a cpuid would do the trick?  Do we still support cpuid-less 
> processors?

We still support 386~~, AFAICT.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-01 20:20                   ` Sean Young
  2008-07-01 20:24                     ` Jeremy Fitzhardinge
  2008-07-01 20:25                     ` H. Peter Anvin
@ 2008-07-15 22:23                     ` Sean Young
  2008-07-15 22:30                       ` H. Peter Anvin
  2 siblings, 1 reply; 32+ messages in thread
From: Sean Young @ 2008-07-15 22:23 UTC (permalink / raw)
  To: H. Peter Anvin, Jeremy Fitzhardinge, Rusty Russell, linux-kernel

On Tue, Jul 01, 2008 at 08:20:14PM +0000, Sean Young wrote:
> On Tue, Jul 01, 2008 at 07:59:42PM +0000, Sean Young wrote:
> > On Mon, Jun 30, 2008 at 03:42:47PM -0700, H. Peter Anvin wrote:
> > > diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
> > > index ab049d4..141b6e2 100644
> > > --- a/arch/x86/boot/pmjump.S
> > > +++ b/arch/x86/boot/pmjump.S
> > > @@ -33,6 +33,8 @@ protected_mode_jump:
> > >  	movw	%cs, %bx
> > >  	shll	$4, %ebx
> > >  	addl	%ebx, 2f
> > > +	jmp	1f			# Short jump to serialize on 386/486
> > > +1:
> > >  
> > >  	movw	$__BOOT_DS, %cx
> > >  	movw	$__BOOT_TSS, %di
> > > @@ -40,8 +42,6 @@ protected_mode_jump:
> > >  	movl	%cr0, %edx
> > >  	orb	$X86_CR0_PE, %dl	# Protected mode
> > >  	movl	%edx, %cr0
> > > -	jmp	1f			# Short jump to serialize on 386/486
> > > -1:
> > >  
> > >  	# Transition to 32-bit mode
> > >  	.byte	0x66, 0xea		# ljmpl opcode
> > 
> > I'm afraid it doesn't work. Maybe I can find something in the AMD Elan
> > documentation. Would a fence make sense?
> 
> I was just trying to confirm this with postive and negative tests which 
> is failing. I'm probably being an idiot, please let systematically test
> this.

When I bisected this issue between latest and 2.6.23 the issue seemed to be
here (round about 2.6.24-rc1). Trying to repeat the same things on latest 
just led to confusion.

So I've started from scratch with contemporary trees and found something 
very different.

2.6.25: works
2.6.25.5: works
2.6.25.6: broken
2.6.25.10: broken
2.6.26: works

Now the issue which was introduced in 2.6.25.6 is that a rdtsc will be done
even though this CPU does not support it (before console_init(), leaving
me with only one led for output!). This issue has been fixed in 2.6.26 but 
not in 2.6.25-stable.

The protected mode jump was just a red herring (or at least whatever was 
broken back then is fixed now); 2.6.26 is working fine and that's good 
enough for me.

Thanks for all the help,

Sean

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Regression: boot failure on AMD Elan TS-5500
  2008-07-15 22:23                     ` Sean Young
@ 2008-07-15 22:30                       ` H. Peter Anvin
  0 siblings, 0 replies; 32+ messages in thread
From: H. Peter Anvin @ 2008-07-15 22:30 UTC (permalink / raw)
  To: Sean Young
  Cc: Jeremy Fitzhardinge, Rusty Russell, linux-kernel,
	the arch/x86 maintainers

Sean Young wrote:
> 
> When I bisected this issue between latest and 2.6.23 the issue seemed to be
> here (round about 2.6.24-rc1). Trying to repeat the same things on latest 
> just led to confusion.
> 
> So I've started from scratch with contemporary trees and found something 
> very different.
> 
> 2.6.25: works
> 2.6.25.5: works
> 2.6.25.6: broken
> 2.6.25.10: broken
> 2.6.26: works
> 
> Now the issue which was introduced in 2.6.25.6 is that a rdtsc will be done
> even though this CPU does not support it (before console_init(), leaving
> me with only one led for output!). This issue has been fixed in 2.6.26 but 
> not in 2.6.25-stable.
> 
> The protected mode jump was just a red herring (or at least whatever was 
> broken back then is fixed now); 2.6.26 is working fine and that's good 
> enough for me.
> 

Okay, that is good to know.  We did move the jmp in pmjump.S for 2.6.26; 
even if it ended up being redundant I have to say I'm happier following 
the documented script, especially given the apparent recent 
documentation change on Intel's part about this.

	-hpa

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2008-07-15 22:36 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-06-16 12:11 Regression: boot failure on AMD Elan TS-5500 Sean Young
2008-06-16 13:27 ` Rusty Russell
2008-06-16 13:37   ` Jeremy Fitzhardinge
2008-06-16 16:19     ` Sean Young
2008-06-16 17:06       ` H. Peter Anvin
2008-06-16 17:58         ` Sean Young
2008-06-16 18:16           ` H. Peter Anvin
2008-06-16 17:06       ` H. Peter Anvin
2008-06-16 17:07       ` Jeremy Fitzhardinge
2008-06-30 21:43       ` Sean Young
2008-06-30 21:52         ` Jeremy Fitzhardinge
2008-06-30 22:04           ` H. Peter Anvin
2008-06-30 22:14             ` H. Peter Anvin
2008-06-30 22:36               ` Jeremy Fitzhardinge
2008-06-30 22:45                 ` H. Peter Anvin
2008-06-30 22:31             ` Sean Young
2008-06-30 22:34             ` Jeremy Fitzhardinge
2008-06-30 22:42               ` H. Peter Anvin
2008-07-01 19:59                 ` Sean Young
2008-07-01 20:20                   ` Sean Young
2008-07-01 20:24                     ` Jeremy Fitzhardinge
2008-07-01 20:25                     ` H. Peter Anvin
2008-07-15 22:23                     ` Sean Young
2008-07-15 22:30                       ` H. Peter Anvin
2008-07-01 20:22                   ` Jeremy Fitzhardinge
2008-07-01 20:23                   ` H. Peter Anvin
2008-07-01 20:29                     ` Jeremy Fitzhardinge
2008-07-01 20:39                       ` H. Peter Anvin
2008-07-06 13:21                       ` Pavel Machek
2008-06-30 22:09           ` Sean Young
2008-07-06  2:41             ` Eric W. Biederman
2008-07-06  3:07               ` H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox