* [parisc-linux] SMP kernel problems on a D350
@ 2002-09-19  7:38 Istvan Gyenes
  0 siblings, 0 replies; 29+ messages in thread
From: Istvan Gyenes @ 2002-09-19  7:38 UTC (permalink / raw)
  To: parisc-linux

Hello List,

I'm trying to compile an SMP kernel for my D350 (2 CPUs) server, without
success.
The kernel source is 2.4.19-pa18, and the non-SMP configuration works
well. The SMP kernel compiles fine, but when I try to boot it, it
stops at the "If this is the last message you see, you may need to switch
your console" line. I've switched the console but got no further output.
(It was the same with 2.4.19-pa14.)
The system was installed from a Debian 3.0 CD.
Can somebody send me a working .config, or tell me what the problem might be?

Thanks in advance,

__
Steve


* Re: [parisc-linux] SMP kernel problems on a D350
       [not found] <200209190805.KAA0000032531@simba.sch.bme.hu>
@ 2002-09-19  9:17 ` Istvan Gyenes
  2002-09-19 12:21   ` J.Steindlberger
  2002-09-19 22:46   ` Grant Grundler
  0 siblings, 2 replies; 29+ messages in thread
From: Istvan Gyenes @ 2002-09-19  9:17 UTC (permalink / raw)
  To: J.Steindlberger; +Cc: parisc-linux

Hello Joerg,

The only difference between the non-SMP and SMP kernel config files is
CONFIG_SMP=y, AFAIK.
I ran "make menuconfig" and the only thing I changed is SMP support.
The strange thing is that the precompiled SMP kernel from the install CD
boots fine (2.4.18-smp).

BTW, where can I select 32-bit/64-bit support?
__
Steve

On Thu, 19 Sep 2002, J.Steindlberger wrote:

> Hi Steve,
>
> I'm no developer, but I know that there are a few restrictions with D-Class
> machines. Did you select 64-bit support? This machine is a 32-bit architecture.
>
> Regards
> Joerg
>
> On Thursday 19 September 2002 09:38, you wrote:
> > I'm trying to compile an SMP kernel for my D350 (2 CPUs) server, without
> > success.
> > The kernel source is 2.4.19-pa18, and the non-SMP configuration works
> > well. The SMP kernel compiles fine, but when I try to boot it, it
> > stops at the "If this is the last message you see, you may need to switch
> > your console" line. I've switched the console but got no further output.
> > (It was the same with 2.4.19-pa14.)
> > The system was installed from a Debian 3.0 CD.
> > Can somebody send me a working .config, or tell me what the problem might be?
>


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-19  9:17 ` [parisc-linux] SMP kernel problems on a D350 Istvan Gyenes
@ 2002-09-19 12:21   ` J.Steindlberger
  2002-09-19 12:29     ` Ryan Bradetich
  2002-09-19 22:46   ` Grant Grundler
  1 sibling, 1 reply; 29+ messages in thread
From: J.Steindlberger @ 2002-09-19 12:21 UTC (permalink / raw)
  To: Istvan Gyenes; +Cc: parisc-linux

Hi,

You can choose 32-bit/64-bit in the "Processor type" section, but only if you
choose a processor that supports more than 32 bits.

Did you try the config file included in the 2.4.18-smp kernel image package?
Perhaps it differs in more than the SMP section. Those are all my ideas. If
you still have problems, something needs to be fixed, and answering that is
up to an expert -- not me ;-) , sorry. I hope I could help you anyway.

Joerg


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-19 12:21   ` J.Steindlberger
@ 2002-09-19 12:29     ` Ryan Bradetich
  0 siblings, 0 replies; 29+ messages in thread
From: Ryan Bradetich @ 2002-09-19 12:29 UTC (permalink / raw)
  To: J.Steindlberger; +Cc: Istvan Gyenes, parisc-linux

On Thu, 2002-09-19 at 06:21, J.Steindlberger wrote:
> Hi,
> 
> You can choose 32-bit/64-bit in the "Processor type" section, but only if you
> choose a processor that supports more than 32 bits.

i.e. a PA8x00 CPU.  According to the hwdb, the D350 has the following processor:
UL Proc 1-way T'100 (821/D250,D350) (Processor)  (PA7200 (PCX-T'))

so 64-bit would not be supported on this system.

- Ryan


> Did you try the config file included in the 2.4.18-smp kernel image package?
> Perhaps it differs in more than the SMP section. Those are all my ideas. If
> you still have problems, something needs to be fixed, and answering that is
> up to an expert -- not me ;-) , sorry. I hope I could help you anyway.
> 
> Joerg
> _______________________________________________
> parisc-linux mailing list
> parisc-linux@lists.parisc-linux.org
> http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
> 


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-19  9:17 ` [parisc-linux] SMP kernel problems on a D350 Istvan Gyenes
  2002-09-19 12:21   ` J.Steindlberger
@ 2002-09-19 22:46   ` Grant Grundler
  2002-09-20  8:28     ` Istvan Gyenes
  1 sibling, 1 reply; 29+ messages in thread
From: Grant Grundler @ 2002-09-19 22:46 UTC (permalink / raw)
  To: Istvan Gyenes; +Cc: J.Steindlberger, parisc-linux

Istvan Gyenes wrote:
> Hello Joerg,
> 
> The only difference between the non-SMP and SMP kernel config files is
> CONFIG_SMP=y, AFAIK.
> I ran "make menuconfig" and the only thing I changed is SMP support.

I'm paranoid. I do "make distclean" when doing anything other than
adding/removing drivers. Save/restore the .config if you need to before
running "make distclean".  I don't trust the Makefiles to rebuild
everything correctly for "global" CONFIG_ changes like "SMP".
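
A minimal sketch of the save/restore step Grant describes, written in Python
only for illustration. The function name, the backup file name, and the idea
of keeping the backup next to the tree are assumptions, not anything the
parisc-linux tree provides; the only real command involved is "make distclean".

```python
import shutil
import subprocess
from pathlib import Path


def distclean_keep_config(tree, run=subprocess.run):
    """Run "make distclean" in a kernel tree without losing .config.

    tree: path to the kernel source directory (assumed to contain .config).
    run:  callable with the subprocess.run signature; injectable for testing.
    Returns the path of the backup copy (hypothetical naming scheme).
    """
    tree = Path(tree)
    config = tree / ".config"
    # Keep the backup outside the tree so distclean cannot remove it.
    backup = tree.parent / (tree.name + ".config.saved")
    shutil.copy2(config, backup)
    run(["make", "distclean"], cwd=tree, check=True)
    # Restore the saved configuration into the now-clean tree.
    shutil.copy2(backup, config)
    return backup
```

After the restore, the tree still needs its usual configure and build steps
before the new kernel can be installed.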

> The strange thing is that the precompiled smp kernel from the install cd
> boots fine. (2.4.18-smp)

SMP on 2.4.19 isn't as stable yet.  So that's no surprise.

If you want to debug this further, define "EARLY_BOOTUP_DEBUG"
in arch/parisc/kernel/pdc_cons.c and you should get more output
about how far the kernel gets before it crashes/hangs.

grant


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-19 22:46   ` Grant Grundler
@ 2002-09-20  8:28     ` Istvan Gyenes
  2002-09-20 19:48       ` Carlos O'Donell
  2002-09-21  4:31       ` Grant Grundler
  0 siblings, 2 replies; 29+ messages in thread
From: Istvan Gyenes @ 2002-09-20  8:28 UTC (permalink / raw)
  To: Grant Grundler; +Cc: parisc-linux

Thanks, I'll try that!

Another question: if 2.4.19 SMP is not stable enough, where can I find the
latest stable SMP kernel source?

Thanks,

__
Steve

On Thu, 19 Sep 2002, Grant Grundler wrote:

> Istvan Gyenes wrote:
> > Hello Joerg,
> >
> > The only difference between the non-SMP and SMP kernel config files is
> > CONFIG_SMP=y, AFAIK.
> > I ran "make menuconfig" and the only thing I changed is SMP support.
>
> I'm paranoid. I do "make distclean" when doing anything other than
> adding/removing drivers. Save/restore the .config if you need to before
> running "make distclean".  I don't trust the Makefiles to rebuild
> everything correctly for "global" CONFIG_ changes like "SMP".
>
> > The strange thing is that the precompiled smp kernel from the install cd
> > boots fine. (2.4.18-smp)
>
> SMP on 2.4.19 isn't as stable yet.  So that's no surprise.
>
> If you want to debug this further, define "EARLY_BOOTUP_DEBUG"
> in arch/parisc/kernel/pdc_cons.c and you should get more output
> about how far the kernel gets before it crashes/hangs.
>
> grant

* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20  8:28     ` Istvan Gyenes
@ 2002-09-20 19:48       ` Carlos O'Donell
  2002-09-20 20:02         ` Jeremy Drake
  2002-09-21  4:31       ` Grant Grundler
  1 sibling, 1 reply; 29+ messages in thread
From: Carlos O'Donell @ 2002-09-20 19:48 UTC (permalink / raw)
  To: Istvan Gyenes; +Cc: Grant Grundler, parisc-linux

> Thanks, I'll try that!
> 
> Another question: if 2.4.19 SMP is not stable enough, where can I find the
> latest stable SMP kernel source?
> 
> Thanks,
> 
> Steve

Nothing up this sleeve, nothing up this sleeve...
<carlos pulls a stable smp kernel out of his hat>

Tada! ;)

I'm not quite certain that we ever had a stable
SMP kernel. While an older kernel might seem to 
give you SMP stability, it does so at the cost of 
speed and the introduction of old bugs.

If you can find some test cases for Non-SMP vs.
SMP stability, then we'll be a step in the right 
direction.

c.


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 19:48       ` Carlos O'Donell
@ 2002-09-20 20:02         ` Jeremy Drake
  2002-09-20 20:37           ` Carlos O'Donell
  2002-09-20 20:37           ` [parisc-linux] SMP kernel problems on a D350 Bdale Garbee
  0 siblings, 2 replies; 29+ messages in thread
From: Jeremy Drake @ 2002-09-20 20:02 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: Istvan Gyenes, Grant Grundler, parisc-linux

On Fri, 20 Sep 2002, Carlos O'Donell wrote:

> > Thanks, I'll try that!
> > 
> > Another question: if 2.4.19 SMP is not stable enough, where can I find the
> > latest stable SMP kernel source?
> > 
> > Thanks,
> > 
> > Steve
> 
> Nothing up this sleeve, nothing up this sleeve...
> <carlos pulls a stable smp kernel out of his hat>
> 
> Tada! ;)
> 
> I'm not quite certain that we ever had a stable
> SMP kernel. While an older kernel might seem to 
> give you SMP stability, it does so at the cost of 
> speed and the introduction of old bugs.

For me, the last kernel that didn't crash on my J5000 in smp mode while
doing apt-get update was kernel-image-2.4.17-32-smp_23.1_hppa.deb

I wouldn't recommend using it, however...
> 
> If you can find some test cases for Non-SMP vs.
> SMP stability, then we'll be a step in the right 
> direction.
> 
> c.
> 

-- 
Mason's First Law of Synergism:
	The one day you'd sell your soul for something, souls are a glut.


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:02         ` Jeremy Drake
@ 2002-09-20 20:37           ` Carlos O'Donell
  2002-09-20 20:46             ` John David Anglin
  2002-09-21  3:38             ` [parisc-linux] malloc limits John David Anglin
  2002-09-20 20:37           ` [parisc-linux] SMP kernel problems on a D350 Bdale Garbee
  1 sibling, 2 replies; 29+ messages in thread
From: Carlos O'Donell @ 2002-09-20 20:37 UTC (permalink / raw)
  To: Jeremy Drake; +Cc: Istvan Gyenes, Grant Grundler, parisc-linux

> 
> For me, the last kernel that didn't crash on my J5000 in smp mode while
> doing apt-get update was kernel-image-2.4.17-32-smp_23.1_hppa.deb
> 
> I wouldn't recommend using it, however...

If I _wasn't_ using my A500 for continual binutils/glibc/gcc
builds, I'd be testing out the SMP problems :)

When running an SMP kernel and doing multiple compiles, the
box was rather unusable, e.g. random process death.

As Randolph noted to me on IRC, it looks like fixing the mmap
issues we have would be a step in the right direction. When I
get my current projects completed (glibc fixing)... I want to
look at this :) then again, maybe I'll be doing glibc work for
the rest of my useful days ;)

We also fail many of the LTP tests having to do with
signals, mmap'ing, and direct IO.

c.


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:02         ` Jeremy Drake
  2002-09-20 20:37           ` Carlos O'Donell
@ 2002-09-20 20:37           ` Bdale Garbee
  2002-09-20 20:52             ` Carlos O'Donell
  2002-09-20 23:11             ` Jeremy Drake
  1 sibling, 2 replies; 29+ messages in thread
From: Bdale Garbee @ 2002-09-20 20:37 UTC (permalink / raw)
  To: parisc-linux

jeremyd@apptechsys.com (Jeremy Drake) writes:

> For me, the last kernel that didn't crash on my J5000 in smp mode while
> doing apt-get update was kernel-image-2.4.17-32-smp_23.1_hppa.deb

I've had good luck with the 2.4.19 kernel images I've uploaded to unstable,
which are built and running on my J5000 in 64-bit SMP mode.  I don't promise
they're "stable", but the apt-get update problem is gone.

Bdale


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:37           ` Carlos O'Donell
@ 2002-09-20 20:46             ` John David Anglin
  2002-09-20 20:50               ` Randolph Chung
  2002-09-21  3:38             ` [parisc-linux] malloc limits John David Anglin
  1 sibling, 1 reply; 29+ messages in thread
From: John David Anglin @ 2002-09-20 20:46 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: jeremyd, frts, grundler, parisc-linux

> When running an SMP kernel and doing multiple compiles, the
> box was rather unusable, e.g. random process death.

Is there a way to turn off the unaligned handler?  It may be hiding
bad stuff going on in userland.  There are still cases where expect
causes a continuous sequence of unaligned faults.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6605)


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:46             ` John David Anglin
@ 2002-09-20 20:50               ` Randolph Chung
  2002-09-20 20:55                 ` Carlos O'Donell
  2002-09-20 20:55                 ` John David Anglin
  0 siblings, 2 replies; 29+ messages in thread
From: Randolph Chung @ 2002-09-20 20:50 UTC (permalink / raw)
  To: John David Anglin
  Cc: Carlos O'Donell, jeremyd, frts, grundler, parisc-linux

> Is there a way to turn off the unaligned handler?  It may be hiding
> bad stuff going on in userland.  There are still cases where expect
> causes a continuous sequence of unaligned faults.

not at runtime, but i can build a kernel with this turned off and let
you test it.

randolph
--  
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:37           ` [parisc-linux] SMP kernel problems on a D350 Bdale Garbee
@ 2002-09-20 20:52             ` Carlos O'Donell
  2002-09-20 23:11             ` Jeremy Drake
  1 sibling, 0 replies; 29+ messages in thread
From: Carlos O'Donell @ 2002-09-20 20:52 UTC (permalink / raw)
  To: Bdale Garbee; +Cc: parisc-linux

> 
> > For me, the last kernel that didn't crash on my J5000 in smp mode while
> > doing apt-get update was kernel-image-2.4.17-32-smp_23.1_hppa.deb
> 
> I've had good luck with the 2.4.19 kernel images I've uploaded to unstable,
> which are built and running on my J5000 in 64-bit SMP mode.  I don't promise
> they're "stable", but the apt-get update problem is gone.

What type of workloads do you have that machine doing?

c.
 


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:50               ` Randolph Chung
@ 2002-09-20 20:55                 ` Carlos O'Donell
  2002-09-21 23:20                   ` Randolph Chung
  2002-09-20 20:55                 ` John David Anglin
  1 sibling, 1 reply; 29+ messages in thread
From: Carlos O'Donell @ 2002-09-20 20:55 UTC (permalink / raw)
  To: Randolph Chung; +Cc: John David Anglin, parisc-linux

> > Is there a way to turn off the unaligned handler?  It may be hiding
> > bad stuff going on in userland.  There are still cases where expect
> > causes a continuous sequence of unaligned faults.
> 
> not at runtime, but i can build a kernel with this turned off and let
> you test it.
> 
> randolph

Would be nice to have a proc interface for this.
I would like to do consecutive testing with it 
enabled and disabled.

c.


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:50               ` Randolph Chung
  2002-09-20 20:55                 ` Carlos O'Donell
@ 2002-09-20 20:55                 ` John David Anglin
  2002-09-20 21:51                   ` Randolph Chung
  1 sibling, 1 reply; 29+ messages in thread
From: John David Anglin @ 2002-09-20 20:55 UTC (permalink / raw)
  To: randolph; +Cc: carlos, jeremyd, frts, grundler, parisc-linux

> > Is there a way to turn off the unaligned handler?  It may be hiding
> > bad stuff going on in userland.  There are still cases where expect
> > causes a continuous sequence of unaligned faults.
> 
> not at runtime, but i can build a kernel with this turned off and let
> you test it.

Is there any software that actually needs the unaligned handler?
I think it would be useful for test purposes to have a kernel with
it off.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6605)


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:55                 ` John David Anglin
@ 2002-09-20 21:51                   ` Randolph Chung
  0 siblings, 0 replies; 29+ messages in thread
From: Randolph Chung @ 2002-09-20 21:51 UTC (permalink / raw)
  To: John David Anglin; +Cc: carlos, jeremyd, frts, grundler, parisc-linux

> Is there any software that actually needs the unaligned handler?
> I think it would be useful for test purposes to have a kernel with
> it off.

off the top of my head....

in the kernel, the usb driver has some unaligned accesses.

in userspace, several of the network utilities (tcpdump, nmap, etc) make
unaligned accesses

randolph
--  
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:37           ` [parisc-linux] SMP kernel problems on a D350 Bdale Garbee
  2002-09-20 20:52             ` Carlos O'Donell
@ 2002-09-20 23:11             ` Jeremy Drake
  2002-09-20 23:46               ` Jeremy Drake
  2002-09-20 23:55               ` Robert Stanford
  1 sibling, 2 replies; 29+ messages in thread
From: Jeremy Drake @ 2002-09-20 23:11 UTC (permalink / raw)
  To: Bdale Garbee; +Cc: parisc-linux

On 20 Sep 2002, Bdale Garbee wrote:

> jeremyd@apptechsys.com (Jeremy Drake) writes:
> 
> > For me, the last kernel that didn't crash on my J5000 in smp mode while
> > doing apt-get update was kernel-image-2.4.17-32-smp_23.1_hppa.deb
> 
> I've had good luck with the 2.4.19 kernel images I've uploaded to unstable,
> which are built and running on my J5000 in 64-bit SMP mode.  I don't promise
> they're "stable", but the apt-get update problem is gone.

I have installed 2.4.19-32-smp and 2.4.19-64-smp from unstable.  The
32-bit one still seems to have the apt-get update problem, and will not
run X (no surprise there).  However, 64-bit seems to be fairly stable (it will
do apt-get update and will run X).  And here I thought 64-bit was LESS
stable than 32 :)


> Bdale

-- 
The rose of yore is but a name, mere names are left to us.


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 23:11             ` Jeremy Drake
@ 2002-09-20 23:46               ` Jeremy Drake
  2002-09-20 23:55               ` Robert Stanford
  1 sibling, 0 replies; 29+ messages in thread
From: Jeremy Drake @ 2002-09-20 23:46 UTC (permalink / raw)
  To: Bdale Garbee; +Cc: parisc-linux

Sorry for replying to myself, but I forgot to mention the one problem I
have had with 2.4.19-64-smp so far: setserial locked it up
(no HPMC or error message, it just locked up).  After disabling
setserial, everything was fine.


On Fri, 20 Sep 2002, Jeremy Drake wrote:

> On 20 Sep 2002, Bdale Garbee wrote:
> 
> > jeremyd@apptechsys.com (Jeremy Drake) writes:
> > 
> > > For me, the last kernel that didn't crash on my J5000 in smp mode while
> > > doing apt-get update was kernel-image-2.4.17-32-smp_23.1_hppa.deb
> > 
> > I've had good luck with the 2.4.19 kernel images I've uploaded to unstable,
> > which are built and running on my J5000 in 64-bit SMP mode.  I don't promise
> > they're "stable", but the apt-get update problem is gone.
> I have installed 2.4.19-32-smp and 2.4.19-64-smp from unstable.  The 
> 32-bit one seems to still have the apt-get update problem, and will not 
> run X (no surprise there).  However, 64bit seems to be fairly stable (will 
> do apt-get update and will run X).  And here I thought 64-bit was LESS 
> stable than 32 :)  
> 
> 
> > Bdale

-- 
"You need tender loving care once a week - so that I can slap you into shape."
- Ellyn Mustard


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 23:11             ` Jeremy Drake
  2002-09-20 23:46               ` Jeremy Drake
@ 2002-09-20 23:55               ` Robert Stanford
  1 sibling, 0 replies; 29+ messages in thread
From: Robert Stanford @ 2002-09-20 23:55 UTC (permalink / raw)
  To: Parisc

Well, things seem to be getting better on my 580:

k580:~# uname -a                            
Linux k580 2.4.19-pa18 #26 SMP Sat Sep 21 09:09:22 EST 2002 parisc unknown

k580:~# apt-get update  
Get:1 http://ftp.au.debian.org unstable/main Packages [1962kB]
Get:2 http://ftp.au.debian.org unstable/main Release [82B]
Get:3 http://ftp.au.debian.org unstable/non-free Packages [50.0kB]
Get:4 http://ftp.au.debian.org unstable/non-free Release [86B]
Get:5 http://ftp.au.debian.org unstable/contrib Packages [47.8kB]
Get:6 http://ftp.au.debian.org unstable/contrib Release [85B]
Fetched 2060kB in 11m39s (2944B/s)
apt-get(181): unaligned access to 0x403ce08c at ip=0x4005e4f7
apt-get(181): unaligned access to 0xef20c024 at ip=0x4005e4fb
isr verification failed (isr: 00000000, sr7: 000000ac 
Unaligned handler failed, ret = 1

k580:~# /etc/init.d/samba start                                
Starting Samba daemons: nmbd smbdsmbd(175): unaligned access to 0x4001a2b8 at if

Robert Stanford


* [parisc-linux] malloc limits
  2002-09-20 20:37           ` Carlos O'Donell
  2002-09-20 20:46             ` John David Anglin
@ 2002-09-21  3:38             ` John David Anglin
  2002-09-21  4:14               ` Matthew Wilcox
  1 sibling, 1 reply; 29+ messages in thread
From: John David Anglin @ 2002-09-21  3:38 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: parisc-linux

In looking at the failure of the gcc v3 pthread2 test, I see it dies
with a segv in chunk_free when next is larger than 0x80000000:

#0  chunk_free (ar_ptr=0x400ad16c, p=0x4010c744) at malloc.c:3179
3179      nextsz = chunksize(next);
(gdb) p next
$2 = (struct malloc_chunk *) 0x802191cc

I thought there was a flat memory model.  If so, shouldn't it be possible
for the data section to expand past 0x80000000?

The test will pass if I cut max_loop_count to 30000.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6605)


* Re: [parisc-linux] malloc limits
  2002-09-21  3:38             ` [parisc-linux] malloc limits John David Anglin
@ 2002-09-21  4:14               ` Matthew Wilcox
  2002-09-21  4:46                 ` Grant Grundler
  0 siblings, 1 reply; 29+ messages in thread
From: Matthew Wilcox @ 2002-09-21  4:14 UTC (permalink / raw)
  To: John David Anglin; +Cc: Carlos O'Donell, parisc-linux

On Fri, Sep 20, 2002 at 11:38:37PM -0400, John David Anglin wrote:
> I thought there was a flat memory model.  If so, shouldn't it be possible
> for the data section to expand past 0x80000000?

There is a flat memory model... libs are mapped at 0x4000'0000 so that's
not it.  worth looking at /proc/$pid/maps for that process, maybe?
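
A minimal sketch of that check, assuming the usual /proc/<pid>/maps line
layout ("start-end perms offset dev inode [path]"). The function name and the
sample text are made up for illustration; the sample is not taken from the
failing pthread2 process.

```python
# Invented sample in /proc/<pid>/maps layout, for illustration only.
SAMPLE_MAPS = """\
00010000-00012000 r-xp 00000000 08:01 1234 /tmp/pthread2
00021000-3ff00000 rwxp 00000000 00:00 0
40000000-40100000 r-xp 00000000 08:01 5678 /lib/libc.so.6
7ff00000-80300000 rwxp 00000000 00:00 0
"""


def mappings_above(maps_text, limit=0x80000000):
    """Return (start, end, perms, name) for mappings that extend past limit."""
    hits = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        # Anonymous mappings have no path column.
        name = fields[5] if len(fields) > 5 else ""
        if end > limit:
            hits.append((start, end, fields[1], name))
    return hits


if __name__ == "__main__":
    for start, end, perms, name in mappings_above(SAMPLE_MAPS):
        print(f"{start:08x}-{end:08x} {perms} {name or '[anon]'}")
```

For a live process the text would come from reading /proc/<pid>/maps instead
of the sample string.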

-- 
Revolutions do not require corporate support.


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20  8:28     ` Istvan Gyenes
  2002-09-20 19:48       ` Carlos O'Donell
@ 2002-09-21  4:31       ` Grant Grundler
  1 sibling, 0 replies; 29+ messages in thread
From: Grant Grundler @ 2002-09-21  4:31 UTC (permalink / raw)
  To: Istvan Gyenes; +Cc: parisc-linux

Istvan Gyenes wrote:
> Thanks, I'll try that!
> 
> Another question: if 2.4.19 SMP is not stable enough, where can I find the
> latest stable SMP kernel source?

I'd advise using the 2.4.19 images uploaded by Bdale to debian.org.
Mostly because it's "fall-out-of-bed" easy to get matching source in
case you need to change something or want to try something out.

If that doesn't work for you, for A500, one of the better ones is:
	ftp://ftp.parisc-linux.org/kernels/a500/2.4.18-pa54.tgz

Look in kernels/32 or kernel/64 for other revs using default configs.

I'm convinced SMP instability is because of timing (race conditions)
and/or D-cache problems. My gut feeling is that the 4-way associative
cache isn't getting flushed properly in all the places it needs to be.
I'm hoping someone who has more clue about VM and virtually indexed
caches could dig into this.

hth,
grant


* Re: [parisc-linux] malloc limits
  2002-09-21  4:14               ` Matthew Wilcox
@ 2002-09-21  4:46                 ` Grant Grundler
  2002-09-21  5:24                   ` John David Anglin
  0 siblings, 1 reply; 29+ messages in thread
From: Grant Grundler @ 2002-09-21  4:46 UTC (permalink / raw)
  To: John David Anglin; +Cc: Matthew Wilcox, Carlos O'Donell, parisc-linux

Matthew Wilcox wrote:
> On Fri, Sep 20, 2002 at 11:38:37PM -0400, John David Anglin wrote:
> > I thought there was a flat memory model.  If so, shouldn't it be possible
> > for the data section to expand past 0x80000000?
> 
> There is a flat memory model... libs are mapped at 0x4000'0000 so that's
> not it.  worth looking at /proc/$pid/maps for that process, maybe?

Is 0x80000000 the address or the size?
If it's the size, then you get up into 0xc0000000 (which is ok).
Getting up into the 0xf0000000 - 0xffffffff address range is not.

grant


* Re: [parisc-linux] malloc limits
  2002-09-21  4:46                 ` Grant Grundler
@ 2002-09-21  5:24                   ` John David Anglin
  2002-09-21 22:33                     ` Grant Grundler
  0 siblings, 1 reply; 29+ messages in thread
From: John David Anglin @ 2002-09-21  5:24 UTC (permalink / raw)
  To: Grant Grundler; +Cc: willy, carlos, parisc-linux

> Is 0x80000000 the address or the size?

It's the address of the next contiguous chunk.  This is roughly the sum
of the address plus the size of the chunk to be freed.  The segv occurs
loading the size of the next chunk using the address.
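
The chunk size implied by this description can be worked out from the gdb
output in the original report (p=0x4010c744, next=0x802191cc). A minimal
sketch of the arithmetic, assuming next is exactly p plus the chunk size in
32-bit arithmetic; the function name is made up and Python is used only as a
calculator:

```python
P = 0x4010C744     # chunk being freed ("p" in the gdb backtrace)
NEXT = 0x802191CC  # next chunk pointer printed by gdb


def implied_chunk_size(p, next_chunk):
    """Size that makes p + size == next_chunk in 32-bit arithmetic."""
    return (next_chunk - p) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(implied_chunk_size(P, NEXT)))  # prints 0x4010ca88
```

Under that assumption the chunk being freed would be a little over 1 GiB.
Whether that size is plausible for the test is not something the posted data
settles.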

I haven't been successful debugging the code directly.  I can get the
code to seg fault by setting SIG37 to nostop noprint, but the debugger
seems to think the fault occurs following the INLINE_SYSCALL in
__sigsuspend.  However, the address points to an ldi instruction
which can't seg fault, so I don't know what's up.  The data that
I posted were from a core dump.

> If it's the size, then you get up into 0xc0000000 (which is ok).
> Getting up into the 0xf0000000 - 0xffffffff address range is not.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6605)


* Re: [parisc-linux] malloc limits
  2002-09-21  5:24                   ` John David Anglin
@ 2002-09-21 22:33                     ` Grant Grundler
  2002-09-22  5:43                       ` John David Anglin
  0 siblings, 1 reply; 29+ messages in thread
From: Grant Grundler @ 2002-09-21 22:33 UTC (permalink / raw)
  To: John David Anglin; +Cc: willy, carlos, parisc-linux

"John David Anglin" wrote:
> It's the address of the next contiguous chunk.  This is roughly the sum
> of the address plus the size of the chunk to be freed.  The segv occurs
> loading the size of the next chunk using the address.

I'll assume this is happening on the A500 (PA2.0) and wonder if it's
a signed/unsigned bug. Look closely at how PA2.0 extends register
values and make sure code is treating addresses and sizes as unsigned.


> I haven't been successful debugging the code directly.  I can get the
> code to seg fault by setting SIG37 to nostop noprint, but the debugger
> seems to think the fault occurs following the INLINE_SYSCALL in
> __sigsuspend.  However, the address points to an ldi instruction
> which can't seg fault, so I don't know what's up.

Not all instructions trap precisely. FP ops definitely do not and
I thought a few others didn't either.

I'm wondering what happens when an unaligned access should segfault.
Does the unaligned handler code check for that?
I'll take a quick look at that code path.


thanks,
grant


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-20 20:55                 ` Carlos O'Donell
@ 2002-09-21 23:20                   ` Randolph Chung
  2002-09-22  0:57                     ` Grant Grundler
  0 siblings, 1 reply; 29+ messages in thread
From: Randolph Chung @ 2002-09-21 23:20 UTC (permalink / raw)
  To: Carlos O'Donell, John David Anglin, parisc-linux

> Would be nice to have a proc interface for this.
> I would like to do consecutive testing with it 
> enabled and disabled.

ftp://ftp.parisc-linux.org/patches/unaligned-procfs.diff

legolas:/home/randolph# cat /proc/sys/kernel/unaligned 
Unaligned trap handler is enabled
legolas:/home/randolph# ./t; echo $?
0
legolas:/home/randolph# echo 0 >> /proc/sys/kernel/unaligned 
legolas:/home/randolph# cat /proc/sys/kernel/unaligned 
Unaligned trap handler is not enabled
legolas:/home/randolph# ./t; echo $?
Bus error
138
legolas:/home/randolph# echo 1 >> /proc/sys/kernel/unaligned 
legolas:/home/randolph# cat /proc/sys/kernel/unaligned 
Unaligned trap handler is enabled
legolas:/home/randolph# ./t; echo $?
0
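
The exit status 138 in the transcript can be decoded as 128 + 10. A minimal
sketch of that decoding, assuming the convention used by bash and most
Bourne-style shells (a command killed by signal N is reported as 128 + N);
the function name is made up for illustration:

```python
def decode_exit_status(status):
    """Split a shell "$?" value into (kind, number).

    Assumes the 128 + N convention for a command killed by signal N.
    """
    if status > 128:
        return ("signal", status - 128)
    return ("exit", status)


if __name__ == "__main__":
    print(decode_exit_status(138))  # prints ('signal', 10)
    print(decode_exit_status(0))    # prints ('exit', 0)
```

Signal 10 matches the "Bus error" message printed in the transcript. Signal
numbers differ between platforms, so the mapping of 10 to SIGBUS should be
read from this transcript rather than assumed to hold elsewhere.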

if someone can review this real quick before i commit it to cvs, i'd
appreciate it. in particular, the point where it decides that the
unaligned trap is not enabled and forces the SIGBUS is not exactly at
the beginning of the trap handler -- it still prints the unaligned
message.... 

randolph
--  
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/


* Re: [parisc-linux] SMP kernel problems on a D350
  2002-09-21 23:20                   ` Randolph Chung
@ 2002-09-22  0:57                     ` Grant Grundler
  0 siblings, 0 replies; 29+ messages in thread
From: Grant Grundler @ 2002-09-22  0:57 UTC (permalink / raw)
  To: Randolph Chung; +Cc: parisc-linux

Randolph Chung wrote:
> legolas:/home/randolph# cat /proc/sys/kernel/unaligned 
> Unaligned trap handler is enabled
> legolas:/home/randolph# ./t; echo $?
> 0
> legolas:/home/randolph# echo 0 >> /proc/sys/kernel/unaligned 

Cool!
After reviewing the diff (on ftp.p-l.o/patches), only two nits
that have nothing to do with the code:

o cat output should relate to what I have to "echo" into the /proc file,
  i.e. only display '0' or '1' when catting.
  Or is the "blah is enabled" wording a convention?

o SYSCTL_FILENAME should be "sys/kernel/unaligned_trap"
  and then I think 0 or 1 should be clear enough to anyone
  daring to mess with it.
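
To make the first nit concrete, here is a userspace model of the
suggested convention, where reading gives back exactly what one would
write. It is only a sketch: the class and method names are made up, and
it says nothing about how the real sysctl handler in the diff is written.

```python
class UnalignedTunable:
    """Userspace model of a 0/1 tunable; not the kernel patch itself."""

    def __init__(self, enabled=True):
        self.enabled = enabled

    def read(self):
        # What "cat" would print under the suggested convention.
        return "1\n" if self.enabled else "0\n"

    def write(self, text):
        # What "echo 0" or "echo 1" would store; anything else is rejected.
        value = text.strip()
        if value not in ("0", "1"):
            raise ValueError("expected 0 or 1, got %r" % text)
        self.enabled = (value == "1")


tunable = UnalignedTunable()
tunable.write("0\n")
print(tunable.read(), end="")
```

With this shape, whatever cat prints can be echoed straight back, which
is the property the nit asks for.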


> if someone can review this real quick before i commit it to cvs,

If you don't like my suggestions, I'm ok with committing
what you've already got.

> i'd
> appreciate it. in particular, the point where it decides that the
> unaligned trap is not enabled and forces the SIGBUS is not exactly at
> the beginning of the trap handler -- it still prints the unaligned
> message.... 

Hmmm... if running under a debugger, one gets that info anyway.
But that's not always easy to do. I think it's OK since we don't
like to see unaligned traps happen anyway.

Maybe a "unaligned_trap_msg" tunable?
/me runs...

thanks
grant


* Re: [parisc-linux] malloc limits
  2002-09-21 22:33                     ` Grant Grundler
@ 2002-09-22  5:43                       ` John David Anglin
  0 siblings, 0 replies; 29+ messages in thread
From: John David Anglin @ 2002-09-22  5:43 UTC (permalink / raw)
  To: Grant Grundler; +Cc: willy, carlos, parisc-linux

> I'll assume this is happening on the A500 (PA2.0) and wonder if it's
> a signed/unsigned bug. Look closely at how PA2.0 extends register
> values and make sure code is treating addresses and sizes as unsigned.

This is the code that adds the chunk pointer and the chunk size and
then tries to load the size of the next chunk:

0x402611d4 <chunk_free+32>:     add,l r25,ret1,r31
0x402611d8 <chunk_free+36>:     ldw 4(sr0,r31),r20

The add is a 64-bit add on a PA2.0 machine, so the result won't be
sign extended.  My understanding is that the upper 32 bits are
truncated when the PSW W bit is zero.  So, it isn't obvious to
me how this can be a signed/unsigned bug unless it is in the
kernel.
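
A small sketch of that reasoning, under the assumption stated above that
only the low 32 bits of the offset matter when the PSW W bit is zero.
The pointer and size values are hypothetical, and this models the
arithmetic only, not the hardware.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend32(value):
    """Sign-extend a 32-bit value into a 64-bit register image."""
    value &= MASK32
    if value & 0x80000000:
        return value | 0xFFFFFFFF00000000
    return value


def add64(a, b):
    """64-bit add with wraparound, modelling the add,l above."""
    return (a + b) & MASK64


def narrow_offset(reg):
    """Offset used if the upper 32 bits are dropped (assumed W=0 behaviour)."""
    return reg & MASK32


chunk = 0x40300000  # hypothetical chunk pointer (r25)
size = 0xFFFFFFF0   # hypothetical size with the top bit set (ret1)

zero_ext = add64(chunk, size)                 # size treated as unsigned
sign_ext = add64(chunk, sign_extend32(size))  # size treated as signed

print(hex(zero_ext), hex(sign_ext))
print(hex(narrow_offset(zero_ext)), hex(narrow_offset(sign_ext)))
```

The two 64-bit sums differ, but their low 32 bits agree, which is why a
signed/unsigned mix-up in user space would not change the address under
that assumption.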

> > I haven't been successful debugging the code directly.  I can get the
> > code to seg fault by setting SIG37 to nostop noprint, but the debugger
> > seems to think the fault occurs following the INLINE_SYSCALL in
> > __sigsuspend.  However, the address points to an ldi instruction
> > which can't seg fault, so I don't know what's up.
> 
> Not all instructions trap precisely. FP ops definitely do not and
> I thought a few others didn't either.
> 
> I'm wondering what happens when unaligned access should segfault.
> Does the unaligned handler code check for that?
> I'll take a quick look at that code path.

There is definitely something strange with this program.  It doesn't
seg fault 100% of the time.  This suggests either a timing/lock problem
or something that isn't being properly initialized.  I don't know
how to debug it under gdb because it seems to change the way traps
are handled.  When I set a break, it appears that the code under test
catches the trap instead of gdb.  The system also dumps core.

I've tried setting breaks in chunk_free and __pthread_mutex_lock
where the unaligned faults occur with a condition matching the
unaligned pointer value which I see in /var/log/debug.  However,
I get the following:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x4021e114 in __sigsuspend (set=0x25)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
45        return INLINE_SYSCALL (rt_sigsuspend, 2, CHECK_SIGSET (set), _NSIG / 8);
(gdb) info proc
process 20194
cmdline = '/home/dave/pthread2.x0g'
warning: unable to read link '/proc/20194/cwd'
warning: unable to read link '/proc/20194/exe'

dave     20193 20041  0 21:41 pts/2    00:00:02 gdb pthread2.x0g
dave     20194 20193  0 21:43 pts/2    00:00:00 /home/dave/pthread2.x0g
dave     20199 20194  0 21:46 pts/2    00:00:00 [pthread2.x0g <defunct>]

I tried setting follow-fork-mode to child but it doesn't seem to follow
the child.

I don't think fp exceptions are involved.

I can see in debug that two traps occur associated with each run.  They
are both type 15 (Data TLB Miss Fault) and they seem to both occur at
the same location.

The program pthread2.x0g is in my home directory on gsyprf11.  If you
want to try it, it's probably best to set

  LD_LIBRARY_PATH=/home/dave/opt/gnu/lib

It may take several tries to get it to seg fault.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6605)


* Re: [parisc-linux] SMP kernel problems on a D350
       [not found] <20020921044102.3DD104829@dsl2.external.hp.com>
@ 2002-09-23 17:23 ` Jeremy Drake
  0 siblings, 0 replies; 29+ messages in thread
From: Jeremy Drake @ 2002-09-23 17:23 UTC (permalink / raw)
  To: Grant Grundler; +Cc: parisc-linux

On Fri, 20 Sep 2002, Grant Grundler wrote:

> If you have time, could you reproduce the "lockup" and then hit "TOC" button?
> If you cancel autoboot and get a PDC prompt (eg "BOOT_ADMIN>"), "ser pim"
> output will contain machine state when it was TOCed. Capture and Post that 
> output to the mailing list and someone might see what the problem is.
> 

OK.  Here it is....  The kernel is 2.4.19-64-smp from unstable, version 
18.1, so the System.map can be found there...

krakatoa:~# /bin/sh /etc/init.d/setserial start
Loading the saved-state of the serial devices...
Cannot set serial info: Device or resource busy
/dev/ttyS0 at 0x03f8 (irq = 195) is a 16550A

Firmware Version 5.0

Duplex Console IO Dependent Code (IODC) revision 1

------------------------------------------------------------------------------
   (c) Copyright 1995-2000, Hewlett-Packard Company, All rights reserved
------------------------------------------------------------------------------

  Processor   Speed            State           Coprocessor State  I/D Cache 
  ---------  --------   ---------------------  -----------------  -------------
      0      440 MHz    Active                 Functional         512 kB/1 MB
      1      440 MHz    Idle                   Functional         512 kB/1 MB

  Central Bus Speed:                   120 MHz

  Available memory:              536870912 bytes
  Good memory required:           46678016 bytes

  Primary boot path:    FWSCSI.5.0
  Alternate boot path:  FWSCSI.6.0
  Console path:         SERIAL_1.9600.8.none
  Keyboard path:        PCI8.0.0

Processor is booting from first available device.

To discontinue, press any key within 10 seconds.

Boot terminated.


----- Main Menu -------------------------------------------------------------

      Command                           Description
      -------                           -----------
      BOot [PRI|ALT|<path>]             Boot from specified path
      PAth [PRI|ALT|CON|KEY [<path>]]   Display or modify a path
      SEArch [DIsplay|[[IPL] [<path>]]] Search for boot devices

      COnfiguration [<command>]         Access Configuration menu/commands
      INformation [<command>]           Access Information menu/commands
      SERvice [<command>]               Access Service menu/commands

      DIsplay                           Redisplay the current menu
      HElp [<menu>|<command>]           Display help for menu or command
      RESET                             Restart the system
-----
Main Menu: Enter command > ser pim toc

PROCESSOR PIM INFORMATION

-----------------  Processor 0 TOC Information -------------------

General Registers 0 - 31
00-03   0000000000000000  0000000010478480  000000001010ded4  0000000010435ac8
04-07   0000000000000041  0000000000000001  0000000000000000  0000000000000000
08-11   000000001054b5a0  0000000000000003  00000000105b8b40  00000000105646cc
12-15   0000000000000000  00000000ffffffff  0000000000000001  00000000f0400004
16-19   00000000105b8b40  00000000f000017c  00000000f0000174  0000000000000000
20-23   0000000000000000  0000000000000000  00000000104456a0  00000000105555a0
24-27   00000000105b8b40  0000000000000041  0000000010435ac8  000000001054b5a0
28-31   0000000000000000  00000000105b8ef0  00000000105b9000  000000001055e5a0

<Press any key to continue (q to quit)> 

Control Registers 0 - 31
00-03   0000000000000000  0000000000000000  0000000000000000  0000000000000000
04-07   0000000000000000  0000000000000000  0000000000000000  0000000000000000
08-11   000000000000295a  0000000000000000  00000000000000c0  000000000000003f
12-15   0000000000000000  0000000000000000  0000000000107000  0000000000000000
16-19   00005f379490dc48  0000000000000000  000000001010de1c  0000000000000000
20-23   0000000000000000  0000000000000000  0000007f082cff0e  a000000000000000
24-27   0000000000487000  0000000015bc1000  0000000000044021  00000000f0412000
28-31   0000000055555555  0000000055555555  00000000105b8000  00000000105c0000

Space Registers 0 - 7
00-03   000a5680          000a5680          00000000          000a5680
04-07   00000000          00000000          00000000          00000000

IIA Space                    = 0x0000000000000000
IIA Offset                   = 0x000000001010de0c
CPU State                    = 0x9e000001

Main Menu: Enter command > 

Main Menu: Enter command > reset

Resetting...
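
For anyone matching the dump against System.map: the usual approach is
to find the nearest symbol at or below an address such as the IIA Offset
(0x1010de0c) or gr02 (0x1010ded4). A rough sketch follows, assuming the
common "address type name" line format; the sample entries and symbol
names are made up and are not taken from the real 2.4.19-64-smp map.

```python
import bisect


def parse_system_map(text):
    """Parse "address type name" lines into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def lookup(symbols, address):
    """Return (name, offset) of the nearest symbol at or below address."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None
    base, name = symbols[index]
    return name, address - base


# Hypothetical entries, only to show the lookup.
SAMPLE_MAP = """\
000000001010dd00 T example_func_a
000000001010de00 T example_func_b
000000001010df00 T example_func_c
"""

symbols = parse_system_map(SAMPLE_MAP)
print(lookup(symbols, 0x1010DE0C))
print(lookup(symbols, 0x1010DED4))
```

Reading the real System.map into parse_system_map() in place of
SAMPLE_MAP would give the function names for the addresses in the dump.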


-- 
If a child annoys you, quiet him by brushing his hair.  If this doesn't
work, use the other side of the brush on the other end of the child.

