* [parisc-linux] Unaligned access failures with apt-get on SMP K460.
@ 2002-06-17 0:22 Ryan Bradetich
2002-06-17 0:59 ` John David Anglin
2002-06-17 3:19 ` [parisc-linux] Unaligned access failures with apt-get on SMP K460 Jeremy Drake
0 siblings, 2 replies; 10+ messages in thread
From: Ryan Bradetich @ 2002-06-17 0:22 UTC (permalink / raw)
To: parisc-linux; +Cc: richard_hirst
Hello parisc-linux hackers,
I (with a lot of help from Richard) am looking into a problem with
apt-get .... on a SMP kernel for the K460.
The problem is that when I run apt-get <command> I get the following
error message:
apt-get(<PID>): unaligned access to 0x403ce094 at ip=0x4005e47f
This kernel has the DEBUG_UNALIGNED defined in
arch/parisc/kernel/unaligned.c to provide additional debug information
for unaligned accesses.
Here is the trace from the apt-get install sudo:
rebel:~# gdb /usr/bin/apt-get
GNU gdb 2002-04-01-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "hppa-linux"...(no debugging symbols found)...
(gdb) break main
Breakpoint 1 at 0x24554
(gdb) run install sudo
Starting program: /usr/bin/apt-get install sudo
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...
Breakpoint 1, 0x00024554 in main ()
(gdb) continue
Continuing.
apt-get(165): unaligned access to 0x403ce094 at ip=0x4005e47f
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000000000000000000000000 Not tainted
r00-03 00000000 00044a20 40098df3 403ce08c
r04-07 00000038 40111868 faf00c18 00049da4
r08-11 00049da0 faf00f18 faf01368 faf00e4c
r12-15 000129e7 faf00bf0 faf00a88 faf00bcc
r16-19 faf006c8 0000000a faf005b8 40111868
r20-23 000022c8 00000166 00000000 403ce044
r24-27 faf01368 00000038 000022c8 00040220
r28-31 0004a900 400c65a7 faf01500 000282a3
sr0-3 00000097 0000008f 00000000 00000097
sr4-7 00000097 00000097 00000097 00000097
IASQ: 00000097 00000097 IAOQ: 4005e47f 4005e483
IIR: 0c751290 ISR: 00000097 IOR: 403ce094
CPU: 2 CR30: ee294000 CR31: 11111111
ORIG_R28: 00000001
unaligned.c:183:emulate_store <7>store r21 (0x00000166) to 00000097:403ce094 for 4 bytes
unaligned.c:365:handle_unaligned <7>ret = 0
apt-get (pid 165): Illegal instruction (code 8)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000000000000000000000000 Not tainted
r00-03 00000000 ee58c000 40098df3 403ce08c
r04-07 00000038 40111868 faf00c18 00049da4
r08-11 10356810 10356810 faf01368 faf00e4c
r12-15 000129e7 faf00bf0 faf00a88 faf00bcc
r16-19 ee294380 0000000a faf005b8 40111868
r20-23 000022c8 00000166 00000000 403ce044
r24-27 ee81c03c eeb3db00 000022c8 00040220
r28-31 0004a900 400c65a7 faf01500 000282a3
sr0-3 00000097 0000008f 00000000 00000097
sr4-7 00000097 00000097 00000097 00000097
IASQ: 00000097 00000097 IAOQ: 4005e483 4005e487
IIR: 48340048 ISR: 00000000 IOR: ee58c024
CPU: 2 CR30: ee294000 CR31: 11111111
ORIG_R28: 00000001
Program received signal SIGILL, Illegal instruction.
0x4005e480 in DynamicMMap::Allocate(unsigned long) ()
from /usr/lib/libapt-pkg-libc6.2-3.so.3.2
(gdb)
Here is the instruction dump:
(gdb) x/10i 0x4005e470
0x4005e470 <_ZN11DynamicMMap8AllocateEm+100>: copy r4,r25
0x4005e474 <_ZN11DynamicMMap8AllocateEm+104>: ldo -1(r21),r21
0x4005e478 <_ZN11DynamicMMap8AllocateEm+108>: copy r20,r26
0x4005e47c <_ZN11DynamicMMap8AllocateEm+112>: stw r21,8(sr0,r3)
0x4005e480 <_ZN11DynamicMMap8AllocateEm+116>: b,l 0x4005d76c <_init+232>,r31
0x4005e484 <_ZN11DynamicMMap8AllocateEm+120>: add,l r20,r4,r20
0x4005e488 <_ZN11DynamicMMap8AllocateEm+124>: stw r20,4(sr0,r3)
0x4005e48c <_ZN11DynamicMMap8AllocateEm+128>: copy ret1,ret0
0x4005e490 <_ZN11DynamicMMap8AllocateEm+132>: ldw -54(sr0,sp),rp
0x4005e494 <_ZN11DynamicMMap8AllocateEm+136>: ldw -3c(sr0,sp),r4
(gdb)
The instruction causing the unaligned trap is:
0x4005e47c <_ZN11DynamicMMap8AllocateEm+112>: stw r21,8(sr0,r3)
As you can see, r3 (403ce08c) in the register dump is aligned on a
4-byte boundary. So the question is: why is this trap being taken?
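The numbers in the dump can be checked mechanically. The following is a minimal sketch, not kernel code: the helper names are made up for illustration, and the constants are copied from the register dump above. It relies on one architectural fact, namely that the low two bits of a PA-RISC instruction address offset carry the privilege level, which is why the reported ip ends in 0x...7f while the disassembly shows the store at 0x4005e47c.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers (hypothetical names, not taken from unaligned.c).

// True if addr is a multiple of size; size must be a power of two.
constexpr bool is_aligned(std::uint32_t addr, std::uint32_t size) {
    return (addr & (size - 1)) == 0;
}

// Effective address of a base+displacement access such as stw r21,8(sr0,r3).
constexpr std::uint32_t effective_address(std::uint32_t base, std::int32_t disp) {
    return base + static_cast<std::uint32_t>(disp);
}

// The low two bits of an instruction address offset hold the privilege level.
constexpr std::uint32_t insn_address(std::uint32_t iaoq) {
    return iaoq & ~std::uint32_t{3};
}
constexpr std::uint32_t privilege_level(std::uint32_t iaoq) {
    return iaoq & std::uint32_t{3};
}

// Values copied from the first register dump.
constexpr std::uint32_t kR3 = 0x403ce08c;        // base register of the store
constexpr std::uint32_t kIor = 0x403ce094;       // IOR: address reported by the trap
constexpr std::uint32_t kIaoqFront = 0x4005e47f; // IAOQ front: reported ip

static_assert(effective_address(kR3, 8) == kIor, "r3 + 8 matches the IOR");
static_assert(is_aligned(kIor, 4), "the store target is 4-byte aligned");
static_assert(insn_address(kIaoqFront) == 0x4005e47c, "ip points at the stw");
static_assert(privilege_level(kIaoqFront) == 3, "running at user privilege");
```

All four checks hold at compile time, so by this arithmetic the store itself looks properly aligned and the reported address is exactly r3 + 8.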
Richard also thought the following two things looked weird in the
register dump:
PSW is all 0's.
r30 is faf01500 ... isn't userspace stack usually 0xbf??????
Note: The apt-get commands work fine on a UP kernel, and I am running
against the unstable distribution. The same apt-get seems to work fine
on the J200 with an SMP kernel too.
Any thoughts or insight is appreciated :)
Thanks,
- Ryan
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP K460.
2002-06-17 0:22 [parisc-linux] Unaligned access failures with apt-get on SMP K460 Ryan Bradetich
@ 2002-06-17 0:59 ` John David Anglin
2002-06-17 2:31 ` John David Anglin
2002-06-17 3:19 ` [parisc-linux] Unaligned access failures with apt-get on SMP K460 Jeremy Drake
1 sibling, 1 reply; 10+ messages in thread
From: John David Anglin @ 2002-06-17 0:59 UTC (permalink / raw)
To: Ryan Bradetich; +Cc: parisc-linux, richard_hirst
> The instruction causing the unaligned trap is:
> 0x4005e47c <_ZN11DynamicMMap8AllocateEm+112>: stw r21,8(sr0,r3)
>
> As you can see from r3 (403ce08c) in the register dump is aligned on a
> 4-byte boundary. So the question is why is this trap being executed?
Maybe the message is misleading. It looks as if the insn may be trying
to write to read-only memory, based on the value of r3.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP K460.
2002-06-17 0:59 ` John David Anglin
@ 2002-06-17 2:31 ` John David Anglin
2002-06-17 3:04 ` Ryan Bradetich
0 siblings, 1 reply; 10+ messages in thread
From: John David Anglin @ 2002-06-17 2:31 UTC (permalink / raw)
To: John David Anglin; +Cc: rbradetich, parisc-linux, richard_hirst
> > The instruction causing the unaligned trap is:
> > 0x4005e47c <_ZN11DynamicMMap8AllocateEm+112>: stw r21,8(sr0,r3)
> >
> > As you can see from r3 (403ce08c) in the register dump is aligned on a
> > 4-byte boundary. So the question is why is this trap being executed?
>
> Maybe the message is misleading. It looks as if the insn may be trying
> to write to read-only memory, based on the value of r3.
Sorry, this is wrong. Was _ZN11DynamicMMap8AllocateEm compiled with
gcc-3.2? It looks as if C++ exceptions may be involved. This only
has a chance of working with 3.2. I suspect that an exception handler
is involved because r20 is not valid across calls and r20/r21 are
used in C++ exceptions. The call looks to be a millicode call which
might be part of the problem.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP K460.
2002-06-17 2:31 ` John David Anglin
@ 2002-06-17 3:04 ` Ryan Bradetich
2002-06-17 4:12 ` [parisc-linux] Unaligned access failures with apt-get on SMP John David Anglin
0 siblings, 1 reply; 10+ messages in thread
From: Ryan Bradetich @ 2002-06-17 3:04 UTC (permalink / raw)
To: John David Anglin; +Cc: parisc-linux, richard_hirst
Dave,
This is the apt-get from debian unstable ... so if I had to guess
I would assume that it is:
||/ Name           Version        Description
+++-==============-==============-============================================
ii  gcc            3.0.4-6        The GNU C compiler.
Is there any way I can tell from the binary?
Thanks,
- Ryan
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP K460.
2002-06-17 0:22 [parisc-linux] Unaligned access failures with apt-get on SMP K460 Ryan Bradetich
2002-06-17 0:59 ` John David Anglin
@ 2002-06-17 3:19 ` Jeremy Drake
2002-06-17 3:37 ` Ryan Bradetich
1 sibling, 1 reply; 10+ messages in thread
From: Jeremy Drake @ 2002-06-17 3:19 UTC (permalink / raw)
To: Ryan Bradetich; +Cc: parisc-linux, richard_hirst
On 16 Jun 2002, Ryan Bradetich wrote:
> Hello parisc-linux hackers,
>
> I (with a lot of help from Richard) am looking into a problem with
> apt-get .... on a SMP kernel for the K460.
>
> The problem is that when I run apt-get <command> I get the following
> error message:
> apt-get(<PID>): unaligned access to 0x403ce094 at ip=0x4005e47f
>
I get stuff like this on my J5000, only it locks up and/or HPMC's when it
does it. (Sometimes prints HPMC info to the LCD, sometimes not, but
always locks up).
I try to keep up-to-date with the cvs kernels, and I try it on SMP every
now and then, but I generally run this box under UP and it is very stable.
I started having problems with this box when the latest 2.4.17-32-smp was
released. With the previous one, this box was stable under SMP. (this
was before I got the courage to build my own kernels on parisc). Not sure
what versions in cvs this corresponds to, but maybe some change between
caused the problem.
Not sure if this is the same problem you are encountering, because Samba
also tends to lock this box up, and if I leave it alone, it just crashes
itself later...
--
Troubled day for virgins over 16 who are beautiful and wealthy and live
in eucalyptus trees.
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP K460.
2002-06-17 3:19 ` [parisc-linux] Unaligned access failures with apt-get on SMP K460 Jeremy Drake
@ 2002-06-17 3:37 ` Ryan Bradetich
2002-06-17 4:45 ` Jeremy Drake
0 siblings, 1 reply; 10+ messages in thread
From: Ryan Bradetich @ 2002-06-17 3:37 UTC (permalink / raw)
To: Jeremy Drake; +Cc: parisc-linux, richard_hirst
Hello Jeremy,
The SMP problem is what I am trying to debug. I am running SMP cvs head
on the J200 and the K460 and also an A500 recently thanks to the ESIEE
team. The K460 has 4 processors which I am hoping will help to identify
the SMP problems faster, but it is also exposing new SMP problems for me
to look into :(
Hopefully we will be able to identify and fix these SMP problems, but
they do not appear to be easy and will probably take a while to fix.
Thanks for the report! and if you notice anything else odd (or
repeatable patterns) let me and/or the list know so maybe it will
provide additional insights into the problems.
Thanks,
- Ryan
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP
2002-06-17 3:04 ` Ryan Bradetich
@ 2002-06-17 4:12 ` John David Anglin
2002-06-17 20:43 ` Ryan Bradetich
0 siblings, 1 reply; 10+ messages in thread
From: John David Anglin @ 2002-06-17 4:12 UTC (permalink / raw)
To: Ryan Bradetich; +Cc: parisc-linux, richard_hirst
> any way I can tell from the binary?
Not that I am aware of. On further thought, I think the user code is ok.
Studying your original message further, I see that the printout from
unaligned.c is fully consistent with the register dump and user code.
Thus, I have to think that the problem is actually in the kernel.
If the failure occurs all the time, I would put a break at 0x4005e47c
and then set a large ignore count. Run the program and see how many
times the break is hit before the fault occurs. Then, set the ignore
count to 1 less than the number of hits and rerun. If the fault is
deterministic, you should be able to determine the exact conditions
which cause the "trap".
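The off-by-one in that two-pass procedure is easy to get wrong, so here is a toy model of it. This is only a sketch: `Breakpoint`, `run` and `locate_faulting_call` are hypothetical names, the "program" is a loop that crosses the breakpoint once per call and faults during one particular call, and the hit counter is assumed to count ignored crossings too.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of a debugger breakpoint with an ignore count (hypothetical type).
struct Breakpoint {
    std::size_t ignore = 0;  // crossings still to be skipped
    std::size_t hits = 0;    // every crossing seen, ignored or not

    // Returns true when the debugger would stop at this crossing.
    bool cross() {
        ++hits;
        if (ignore > 0) {
            --ignore;
            return false;
        }
        return true;
    }
};

// Simulates a program that crosses the breakpoint once per call and faults
// during call number fault_call (1-based). Returns the call at which the
// debugger stopped, or 0 if the fault arrived first.
std::size_t run(Breakpoint& bp, std::size_t fault_call) {
    for (std::size_t call = 1;; ++call) {
        if (bp.cross()) return call;       // stopped before the call runs
        if (call == fault_call) return 0;  // the call ran and faulted
    }
}

// Pass 1: large ignore count, just count the hits before the fault.
// Pass 2: ignore one less than that count, so the stop lands on the
// crossing that precedes the fault.
// Assumes fault_call is smaller than the large ignore count used in pass 1.
std::size_t locate_faulting_call(std::size_t fault_call) {
    Breakpoint first;
    first.ignore = 1000000;
    run(first, fault_call);

    Breakpoint second;
    second.ignore = first.hits - 1;
    return run(second, fault_call);
}
```

If the fault is deterministic, the second pass stops on exactly the call that is about to fault, and the registers can be inspected at that point.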
Oh, I remember that gdb may not print r3 correctly with info reg.
It's better to use p $r3 or printf "0x%x\n", $r3.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP K460.
2002-06-17 3:37 ` Ryan Bradetich
@ 2002-06-17 4:45 ` Jeremy Drake
0 siblings, 0 replies; 10+ messages in thread
From: Jeremy Drake @ 2002-06-17 4:45 UTC (permalink / raw)
To: Ryan Bradetich; +Cc: parisc-linux, richard_hirst
On 16 Jun 2002, Ryan Bradetich wrote:
> Hello Jeremy,
>
> The SMP problem is what I am trying to debug. I am running SMP cvs head
> on the J200 and the K460 and also an A500 recently thanks to the ESIEE
> team. The K460 has 4 processors which I am hoping will help to identify
> the SMP problems faster, but it is also exposing new SMP problems for me
> to look into :(
I can try things out for you on my J5k if it will help. This box was not
being used at my work, so they let me install linux on it and play with it
until it is needed. So, it is okay for me to crash it :P
> Thanks for the report! and if you notice anything else odd (or
> repeatable patterns) let me and/or the list know so maybe it will
> provide additional insights into the problems.
This time I got a new message.
WARNING! Stack pointer and cr30 do not correspond!
Dumping virtual address stack instead
Dumping Stack from 0x203c8000 to 0x203cc700:
8000 c377fe59 d5f07332 86effcb3 abe0e665 0ddff967 57c1cccb 1bc9479d aff52cc4
8020 37928f3b 5fea5989 6f53ab25 bfa20640 dea7564b 7f440c81 bd3819c5 fefeac50
8040 7a70338b fdfd58a1 f4e06717 fbfab143 e9c0ce2f f7f56287 d3819c5f efeac50f
<tons of hex deleted>
The next time, it just locks up, no message.
It looks horribly unpredictable, although I have found 2 programs that
consistently do it: samba and apt-get.
If I could be of any help, testing or coding, let me know. I don't know a
lot about the parisc arch, but I would be willing to learn... I would
like to make linux a viable alternative to HP-UX for our clients, but
since they are still stuck in 10.2x land, I don't think they will be very
interested in change... Still, these boxes would make pretty good
development workstations around here, despite their size, power
consumption, and heat output...
--
Freedom's just another word for nothing left to lose.
-- Kris Kristofferson, "Me and Bobby McGee"
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP
2002-06-17 4:12 ` [parisc-linux] Unaligned access failures with apt-get on SMP John David Anglin
@ 2002-06-17 20:43 ` Ryan Bradetich
2002-06-17 22:11 ` John David Anglin
0 siblings, 1 reply; 10+ messages in thread
From: Ryan Bradetich @ 2002-06-17 20:43 UTC (permalink / raw)
To: John David Anglin; +Cc: parisc-linux, richard_hirst
John et al.,
I recompiled the debian apt-get package this time leaving the debug
symbols intact.
Here is the function that is causing the failure:
// DynamicMMap::Allocate - Pooled aligned allocation                    /*{{{*/
// ---------------------------------------------------------------------
/* This allocates an Item of size ItemSize so that it is aligned to its
   size in the file. */
unsigned long DynamicMMap::Allocate(unsigned long ItemSize)
{
   // Look for a matching pool entry
   Pool *I;
   Pool *Empty = 0;
   for (I = Pools; I != Pools + PoolCount; I++)
   {
      if (I->ItemSize == 0)
         Empty = I;
      if (I->ItemSize == ItemSize)
         break;
   }
   // No pool is allocated, use an unallocated one
   if (I == Pools + PoolCount)
   {
      // Woops, we ran out, the calling code should allocate more.
      if (Empty == 0)
      {
         _error->Error("Ran out of allocation pools");
         return 0;
      }
      I = Empty;
      I->ItemSize = ItemSize;
      I->Count = 0;
   }
   // Out of space, allocate some more
   if (I->Count == 0)
   {
      I->Count = 20*1024/ItemSize;
      I->Start = RawAllocate(I->Count*ItemSize,ItemSize);
   }
   I->Count--;
   unsigned long Result = I->Start;
   I->Start += ItemSize;
   return Result/ItemSize;
}
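For anyone who wants to experiment with the logic outside of apt, here is a stripped-down, self-contained sketch of the same pooling scheme. Several things in it are assumptions and not taken from the apt sources: the class name `ToyMMap`, the field order of `Pool` (ItemSize, Start, Count), the use of 32-bit fields to mimic `unsigned long` on 32-bit hppa, the pool count of 4, the 8-byte pretend header, and the bump allocator standing in for `RawAllocate`. Under that assumed layout, the stores to 8(r3) and 4(r3) in the disassembly would be consistent with `I->Count--` and `I->Start += ItemSize`, and r4 = 0x38 matches ItemSize = 56, but that is only an inference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout: the field order is a guess, the excerpt does not show it.
struct Pool {
    std::uint32_t ItemSize;
    std::uint32_t Start;
    std::uint32_t Count;
};
static_assert(offsetof(Pool, Start) == 4 && offsetof(Pool, Count) == 8,
              "offsets that would match the stw displacements");

constexpr std::size_t kPoolCount = 4;  // hypothetical pool count

// Hypothetical stand-in for DynamicMMap; it only tracks offsets, no memory.
class ToyMMap {
public:
    // Bump allocator standing in for RawAllocate: rounds the current
    // offset up to a multiple of align, then reserves size bytes.
    std::uint32_t RawAllocate(std::uint32_t size, std::uint32_t align) {
        std::uint32_t result = (used_ + align - 1) / align * align;
        used_ = result + size;
        return result;
    }

    // Same control flow as the excerpt; returns an item index, 0 on failure.
    std::uint32_t Allocate(std::uint32_t item_size) {
        Pool* I = pools_;
        Pool* empty = nullptr;
        for (; I != pools_ + kPoolCount; ++I) {
            if (I->ItemSize == 0) empty = I;
            if (I->ItemSize == item_size) break;
        }
        if (I == pools_ + kPoolCount) {
            if (empty == nullptr) return 0;  // ran out of pools
            I = empty;
            I->ItemSize = item_size;
            I->Count = 0;
        }
        if (I->Count == 0) {  // pool exhausted, grab a fresh 20 KB chunk
            I->Count = 20 * 1024 / item_size;
            I->Start = RawAllocate(I->Count * item_size, item_size);
        }
        I->Count--;
        std::uint32_t result = I->Start;
        I->Start += item_size;
        return result / item_size;
    }

private:
    Pool pools_[kPoolCount] = {};
    std::uint32_t used_ = 8;  // pretend an 8-byte header is already in use
};
```

With `ToyMMap m;`, two successive calls to `m.Allocate(56)` return the indices 1 and 2, since each item starts at a multiple of its own size.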
Here is my gdb output while tracing the failure:
root@rebel:~# gdb /usr/bin/apt-get
GNU gdb 2002-04-01-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "hppa-linux"...
(gdb) b main
Breakpoint 1 at 0x27ea4: file apt-get.cc, line 2134.
(gdb) run install less
Starting program: /usr/bin/apt-get install less
Breakpoint 1, main (argc=3, argv=0x46e66) at apt-get.cc:2134
2134 CommandLine CmdL(Args,_config);
(gdb) b DynamicMMap::Allocate
Breakpoint 2 at 0x40050358: file contrib/mmap.cc, line 229.
(gdb) continue
Continuing.
Reading Package Lists... 0%
Breakpoint 2, DynamicMMap::Allocate(unsigned long) (this=0x4c900,
ItemSize=275112) at contrib/mmap.cc:229
229 Pool *Empty = 0;
(gdb) bt
#0 DynamicMMap::Allocate(unsigned long) (this=0x4c900, ItemSize=275112)
at contrib/mmap.cc:229
#1 0x400ba64c in pkgCacheGenerator::SelectFile(std::string,
std::string, pkgIndexFile const&, unsigned long) (this=0xbff01020, File=
{static npos = 4294967295, _M_dataplus = {<allocator<char>> =
{<No data fields>}, _M_p = 0x489f4 "/var/lib/dpkg/status"}, static
_S_empty_rep_storage = {0, 0, 1, 18, 1, 0}}, Site={static npos =
4294967295, _M_dataplus = {<allocator<char>> = {<No data fields>}, _M_p
= 0x432b4 ""}, static _S_empty_rep_storage = {0, 0, 1, 18, 1, 0}},
Index=@0x4bda0,
Flags=1) at pkgcachegen.cc:404
#2 0x400e5a14 in debStatusIndex::Merge(pkgCacheGenerator&, OpProgress&)
const (this=0x4bda0, Gen=@0xbff01020, Prog=@0xbff00d90) at
/usr/include/g++-v3/bits/basic_string.h:863
#3 0x400bbf8c in BuildCache(pkgCacheGenerator&, OpProgress&, unsigned
long&, unsigned long, std::__normal_iterator<pkgIndexFile**,
std::vector<pkgIndexFile*, std::allocator<pkgIndexFile*> > >,
std::__normal_iterator<pkgIndexFile**, std::vector<pkgIndexFile*,
std::allocator<pkgIndexFile*> > >) (Gen=@0xbff01020,
Progress=@0xbff00d90, CurrentSize=@0xbff01190,
TotalSize=107592,
Start={<iterator<std::random_access_iterator_tag,pkgIndexFile*,int,pkgIndexFile**,pkgIndexFile*&>> = {<No data fields>}, _M_current = 0x4c578}, End=
{<iterator<std::random_access_iterator_tag,pkgIndexFile*,int,pkgIndexFile**,pkgIndexFile*&>> = {<No data fields>}, _M_current = 0x4c57c})
at /usr/include/g++-v3/bits/stl_iterator.h:478
#4 0x400bd280 in pkgMakeStatusCache(pkgSourceList&, OpProgress&,
MMap**, bool) (List=@0xbff01020, Progress=@0xbff00d90,
OutMap=0xbff00990, AllowMem=224)
at /usr/include/g++-v3/bits/stl_vector.h:187
#5 0x400ad8d4 in pkgCacheFile::Open(OpProgress&, bool)
(this=0xbff00990, Progress=@0xbff00d90, WithLock=true) at
cachefile.cc:70
#6 0x0002b794 in CacheFile::Open(bool) (this=0xbff00990, WithLock=56)
at apt-get.cc:85
(gdb) n
DynamicMMap::Allocate(unsigned long) (this=0x4c900, ItemSize=275112) at
contrib/mmap.cc:226
226 {
(gdb) n
DynamicMMap::Allocate(unsigned long) (this=0x4c900, ItemSize=56) at
contrib/mmap.cc:230
230 for (I = Pools; I != Pools + PoolCount; I++)
(gdb) n
232 if (I->ItemSize == 0)
(gdb) n
234 if (I->ItemSize == ItemSize)
(gdb) n
230 for (I = Pools; I != Pools + PoolCount; I++)
(gdb) n
234 if (I->ItemSize == ItemSize)
(gdb) n
230 for (I = Pools; I != Pools + PoolCount; I++)
(gdb) n
234 if (I->ItemSize == ItemSize)
(gdb) n
230 for (I = Pools; I != Pools + PoolCount; I++)
(gdb) n
234 if (I->ItemSize == ItemSize)
(gdb) n
230 for (I = Pools; I != Pools + PoolCount; I++)
(gdb) n
234 if (I->ItemSize == ItemSize)
(gdb) n
230 for (I = Pools; I != Pools + PoolCount; I++)
(gdb) n
234 if (I->ItemSize == ItemSize)
(gdb) n
230 for (I = Pools; I != Pools + PoolCount; I++)
(gdb) n
234 if (I->ItemSize == ItemSize)
(gdb) n
239 if (I == Pools + PoolCount)
(gdb) n
254 if (I->Count == 0)
========> Things get interesting here <=======
(gdb) n
261 unsigned long Result = I->Start;
(gdb) n
263 return Result/ItemSize;
(gdb) n
260 I->Count--;
(gdb) n
263 return Result/ItemSize;
(gdb) n
260 I->Count--;
(gdb) n
Program received signal SIGBUS, Bus error.
DynamicMMap::Allocate(unsigned long) (this=0x4c900, ItemSize=56) at
contrib/mmap.cc:263
263 return Result/ItemSize;
It looks like the function gets exited twice... but I do not see
any recursion in the function, and the function is not listed twice
in the original backtrace I posted. Do we have a corrupt stack?
Or can you think of anything else? I would be glad to provide any
additional debugging output to anyone interested. I can also give
remote access to this system if someone is interested in looking
into this further.
Thanks,
- Ryan
* Re: [parisc-linux] Unaligned access failures with apt-get on SMP
2002-06-17 20:43 ` Ryan Bradetich
@ 2002-06-17 22:11 ` John David Anglin
0 siblings, 0 replies; 10+ messages in thread
From: John David Anglin @ 2002-06-17 22:11 UTC (permalink / raw)
To: Ryan Bradetich; +Cc: parisc-linux, richard_hirst
> (gdb) n
> 263 return Result/ItemSize;
>
> (gdb) n
>
> 260 I->Count--;
> (gdb) n
>
> Program received signal SIGBUS, Bus error.
> DynamicMMap::Allocate(unsigned long) (this=0x4c900, ItemSize=56) at
> contrib/mmap.cc:263
> 263 return Result/ItemSize;
>
>
> It looks like the function gets exited twice.... but I do not see
> any recursion in the function, and the function is not listed twice
> in the original back trace I posted. Do we have a corrupt stack?
> or can you think of anything else? I would be glad to provide any
I don't think the function exits twice. The duplication in lines as
you step through the function is caused by optimisation. A similar
effect is observable in the value for ItemSize printed at the
initial break. If you print r25 at the break, you should see the
value 56. The initial printout is wrong because the break is before
the point where the argument register is copied to the register or
stack slot for ItemSize. It's also possible that the register or
stack slot used for ItemSize may get reused later in the function.
The fun and games of debugging!
If you put a break on line 263 and single step from that point you
should find the exact assembly insn causing the bus error and be able
to determine what's causing the bus error.
If the problem looks like a compilation error as opposed to a coding
problem, send me offline the preprocessed source, the assembly output for
DynamicMMap::Allocate(unsigned long) from your compiler, and the
compilation command.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)